Rust Workshop 🦀

Part 1: From Python to Rust - Fundamentals

Gabriel Nützi, gabriel.nuetzi@sdsc.ethz.ch

August 14, 2024 (updated May 27, 25), Part 2

Preface

The Rust programming language absolutely positively sucks (Reddit)

  • Python: Runtime Mess 🐞 💣

  • Rust: Compile-Time Mess 🔧 (deps. on your level of experience)

How I Learned? - RsFluid

Help

How to use these slides:

  • S: See the speaker notes.
  • Esc: See all slides and jump around.
  • Space: Go forward.
  • Shift + Space: Go backward.

Acknowledgment

Thanks to the following contributors who fixed typos and mistakes:

  • The ORDES Team (SDSC) and Jusong Yu (PSI), who helped me fix typos and bugs.

  • Gerry Bräunlich, Michael Kefeder & Stefan Tüx, who allowed me to attend the Rust Fest and pointed me to interesting teaching material.

External:

Rust Workshop

Rust References

Why Rust

What the Heck is Rust 🦀

A Multi-Paradigm Language

  • Procedural like Python, i.e. functions, loops, …

  • Functional aspects, i.e. iterators 🏃, lambdas, …

  • Object-oriented aspects, but unlike Python (it's better ❤️) …

What the Heck is Rust 🦀

A Compiled Language Unlike Python

  • The Rust compiler rustc 🦀 converts your code to machine code ⚙️.
    Python is interpreted.

  • It has a strong type system (algebraic types: sum types, product types).

  • It started in 2006 as a personal project of Graydon Hoare; Mozilla (Firefox) began sponsoring it in 2009. The Rust Foundation is the driver today.

Note: About 10% of Firefox is written in Rust, for good reasons that will become clear in the following slides.

Benefits You Get on the 🦀 Journey

A few selling points for Python programmers.

  • pydantic-core is written entirely in Rust.

  • Modern Python toolchains are written in Rust (uv, ruff, etc.).

  • Rust can power up your Python project seamlessly with PyO3.

Come on 🐨 show me syntax!

The syntax* is similar and as easy to read as in Python

@dataclass
class Apple:
  name: str


def grow() -> List[Apple]:
  apples = [Apple("a"),
            Apple("b")]

  for b in apples:
    print(f"Apple: {b.name}")


  return apples

#[derive(Debug)]
struct Apple {
  name: String
}

fn grow() -> Vec<Apple> {
  let apples = vec![Apple{name: "a".to_string()},
                    Apple{name: "b".to_string()} ];

  for b in &apples {
    println!("Apple: {b:?}");
  }

  apples
}

*: About 80% of the Rust code you will encounter is very readable (except macros etc.).

But Why?

More reasons why you should learn a compiled, statically typed language…

What Rust Promises 🤚

  1. Pedal to the Metal
  2. Comes with a Warranty
  3. Beautiful Code
  4. Rust is Practical

Pedal to the Metal

  • Compiled language, not interpreted.

  • State-of-the-art machine-code generation using LLVM.

  • No garbage collector (GC) getting in the way of execution.

    def run():
      d = { "a":1, "b":2 } # Memory is allocated on the heap.
    
    run()

    Question: Does the memory of d still exist after run()?

     We don’t know 🤷

  • Usable in embedded devices, operating systems and demanding websites.

Rust Comes with a Warranty

  • Strong type system helps prevent silly bugs 🐞:

    def concat(numbers: List[str]) -> str:
      return "-".join(numbers)
    
    concat(["1", "2", "30", 4, "5", "7", "10"])
  • Explicit errors instead of exceptions ❗(later):

    def main():
      file_count = get_number_of_files()
      if file_count is None:
        print("Could not determine file count.")

    Question: Is this error handling correct if:
    get_number_of_files = lambda: int(sys.argv[0])

Rust Comes with a Warranty

  • Ownership Model: Type system tracks lifetime of objects.

    • No more exceptions about accessing None.
    • You know who owns an object (variable/memory).
  • Programs don’t trash your system accidentally

    • Warranty can be voided (unsafe).

Rust Comes with a Warranty

Experience: “♥️ If it compiles, it is more often correct. ♥️”

  • Enables compiler driven development.

  • 100% code coverage:

    def get_float(num: str | float) -> float:
      match (num):
          case str(num):
              return float(num)

    You trust mypy which is not enforced at runtime.

    enum StrOrFloat {
      MyStr(String),
      MyFloat(f64),
    }
    
    fn get_float(n: StrOrFloat) -> f64 {
        match n {
            StrOrFloat::MyFloat(x) => x,
        }
    }

Rust Comes with a Warranty

Experience: “♥️ If it compiles, it is more often correct. ♥️”

  • No invalid syntax.
  • Guaranteed thread safety.
  • Model your business logic with structs and enums.

Performance

  • Rust is fast 🚀: comparable to C/C++ and often faster than Go.

    • Python is slow, that’s why most libraries outsource to C or Rust.
  • Rust is concurrent ⇉ by design (safe-guarded by the ownership model).

Why Should 🫵 Learn Rust?

  • Learning a new language teaches you new tricks:

    • You will also write better code (also in Python)!
  • Rust is a young but quickly growing platform:

    • You can help shape its future.
    • Demand for Rust programmers will increase!
  • It’s not easy — but it’s worth it:

    • Exercise is tough, but it makes you stronger.
    • Junk food is easy, but it slows you down.
    • Learning Rust is a challenge — but soon, you’ll feel like you’re bending the Matrix.

Your First Project

Create a Project

cargo new hello-world

Cargo is the Rust package manager. Cargo downloads your Rust package’s dependencies, compiles your packages, makes distributable packages, and uploads them to crates.io, the Rust community’s package registry.

cd hello-world
cargo run
Compiling hello-world v0.1.0
Finished dev [unoptimized + debuginfo] target(s) in 0.74s
Running `target/debug/hello-world`
Hello, world!

Computing a Simple Sum

fn main() {
    println!("sum(4) = 4 + 3 + 2 + 1 = {}", sum(4));
}

fn sum(n: u64) -> u64 {
    if n != 0 {
        n + sum(n-1)
    } else {
        n
    }
}
// Note: avoid recursion whenever you can :)

Output:

sum(4) = 4 + 3 + 2 + 1 = 10

Basic Syntax

Variables (1)

fn main() {
    let some_x = 5;
    println!("some_x = {}", some_x);
    some_x = 6;
    println!("some_x = {some_x}");
}
Compiling hello-world v0.1.0
error[E0384]: cannot assign twice to immutable variable `some_x`
--> src/main.rs:4:5
2 |     let some_x = 5;
  |         ------
  |         |
  |         first assignment to `some_x`
  |         help: consider making this binding mutable: `mut some_x`
3 |     println!("some_x = {}", some_x);
4 |     some_x = 6;
  |     ^^^^^^^^^^ cannot assign twice to immutable variable
  • Rust uses snake case (e.g. some_x) for variable names.
  • The immutable (read-only) variable cannot be mutated in any way.

Variables (2)

fn main() {
    let mut some_x = 5;
    println!("some_x = {}", some_x);
    some_x = 6;
    println!("some_x = {}", some_x);
}
Compiling hello-world v0.1.0 (/home/teach-rs/Projects/hello-world)
Finished dev [unoptimized + debuginfo] target(s) in 0.26s
Running `target/debug/hello-world`
some_x = 5
some_x = 6
  • Declare a variable with mut to be able to update it later.

Declaring a Type of Variable

fn main() {
    let x: i32 = 20;
    //   ^^^^^---------- Type annotation. (as in Python)
}
  • Rust is strongly and statically typed.
  • Variables use type inference, so there is often no need to specify a type (Hindley-Milner type system), (Understanding It).
  • We can be explicit in our types (and sometimes have to be).

Primitives: Integers

Length        Signed  Unsigned
8 bits        i8      u8
16 bits       i16     u16
32 bits       i32     u32
64 bits       i64     u64
128 bits      i128    u128
pointer-sized isize   usize


Literals

let x = 42;
let y = 42u64; // decimal as u64
let z = 42_000; // underscore separator

let u = 0xff; // hexadecimal
let v = 0o77; // octal
let q = b'A'; // byte syntax (is u8)
let w = 0b0100_1101; // binary
  • Rust prefers explicit integer sizes.
  • Use isize and usize sparingly.

Primitives: Floating Points Numbers

fn main() {
    let x = 2.0;    // f64
    let y = 1.0f32; // f32
}
  • f32: single precision (32-bit) floating point number.
  • f64: double precision (64-bit) floating point number.
  • f128: 128-bit floating point number (unstable, nightly-only at the time of writing).

Numerical Operations

fn main() {
    let sum = 5 + 10;
    let difference = 10 - 3;
    let mult = 2 * 8;
    let div = 2.4 / 3.5;
    let int_div = 10 / 3; // 3
    let remainder = 20 % 3;
}
  • Overflow/underflow checking in debug:

    let a: u8 = 0b1111_1111;
    println!("{}", a + 10); // compiler error:
                   ^^^^^^ attempt to compute `u8::MAX + 10_u8`,
                          which would overflow
  • In release builds these expressions are wrapping, for efficiency.
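
When wrapping or failing on overflow is actually intended, the standard integer methods make that choice explicit and behave the same in debug and release builds. A minimal sketch (the helper name `add_checked` is made up for illustration):

```rust
/// Hypothetical helper: adds two `u8` values and reports overflow as `None`.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    let a: u8 = u8::MAX; // 255

    // `checked_add` returns `None` on overflow, independent of the build profile.
    println!("{:?}", add_checked(a, 10)); // None
    println!("{:?}", add_checked(200, 10)); // Some(210)

    // `wrapping_add` wraps around explicitly: (255 + 10) mod 256 = 9.
    println!("{}", a.wrapping_add(10)); // 9

    // `saturating_add` clamps at the maximum value of the type.
    println!("{}", a.saturating_add(10)); // 255
}
```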

Numerical Operations

  • You cannot mix and match types, i.e.:
fn main() {
    let invalid_div = 2.4 / 5;          // Error!
    let invalid_add = 20u32 + 40u64;    // Error!
}
  • Rust has your typical operations, just as Python and other languages do.
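
Mixed-type arithmetic works once the conversion is spelled out. The following sketch shows one possible way using `From` for lossless conversions and `as` for casts; the helper names `divide` and `add` are illustrative only:

```rust
/// Hypothetical helper: divides a float by an integer via a lossless conversion.
fn divide(x: f64, n: i32) -> f64 {
    x / f64::from(n) // every i32 is exactly representable as an f64
}

/// Hypothetical helper: widens the smaller type before adding.
fn add(a: u32, b: u64) -> u64 {
    u64::from(a) + b
}

fn main() {
    println!("{}", divide(10.0, 4)); // 2.5
    println!("{}", add(20, 40)); // 60

    // `as` also converts, but may silently truncate: 300 mod 256 = 44.
    let truncated = 300u32 as u8;
    println!("{}", truncated); // 44
}
```

Preferring `From`/`into()` over `as` keeps lossy conversions visible, since `From` is only implemented where no information can be lost.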

Primitives: Booleans and Operations

fn main() {
    let yes: bool = true;
    let no: bool = false;
    let not = !no;
    let and = yes && no;
    let or = yes || no;
    let xor = yes ^ no;
}

Comparison Operators

fn main() {
    let x = 10;
    let y = 20;
    x < y;  // true
    x > y;  // false
    x <= y; // true
    x >= y; // false
    x == y; // false
    x != y; // true
}
fn main() {
    3.0 < 20;      // invalid
    30u64 > 20i32; // invalid
}
  • Boolean operators short-circuit: in a && b, if a is already false, the code for b is not executed.

Primitives: Characters

fn main() {
    let c: char = 'z'; // Note: the single-quotes ''.
    let z = 'ℤ';
    let heart_eyed_cat = '😻';
}
  • A char is a 32-bit Unicode scalar value (like a single character in Python).

Strings

    let s1 = String::from("Hello, 🌍!");
    //       ^^^^^^ Owned, heap-allocated string
  • Rust Strings are UTF-8-encoded.
  • Cannot be indexed like Python str.
  • String is heap-allocated.
  • There are actually many string types in Rust:
    • CString
    • PathBuf
    • OsString
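
Because a String is UTF-8 encoded, a character can span several bytes, which is why integer indexing is not offered. A small sketch of the usual alternatives (the helper `nth_char` is hypothetical; the `&` borrow syntax is explained in the borrowing section):

```rust
/// Hypothetical helper: returns the n-th character (not byte) of a string.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n) // walks the string from the start, so this is O(n)
}

fn main() {
    let s = String::from("Hello, 🌍!");

    // let c = s[7]; // does not compile: `String` cannot be indexed by an integer

    println!("{}", s.len()); // 12 bytes: 7 ASCII + 4 for 🌍 + 1 for '!'
    println!("{}", s.chars().count()); // 9 characters
    println!("{:?}", nth_char(&s, 7)); // Some('🌍')

    // Byte ranges work as long as they fall on character boundaries.
    println!("{}", &s[0..5]); // Hello
}
```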

Primitives: Tuples

fn main() {
  let tup: (i32, f32, char) = (1, 2.0, 'a');
}
  • Group multiple values into a single compound type.
  • Fixed size.
  • Different types per element.
fn main() {
  let tup = (1, 2.0, 'Z');
  let (a, b, c) = tup;
  println!("({}, {}, {})", a, b, c);

  let another_tuple = (true, 42);
  println!("{}", another_tuple.1);
}
  • Tuples can be destructured to get to their individual values
  • Access an element with . followed by a zero based index.

Primitives: Arrays

fn main() {
    let arr: [i32; 3] = [1, 2, 3];
    println!("{}", arr[0]);

    let [a, b, c] = arr;
    println!("[{}, {}, {}]", a, b, c);
}
  • A collection of multiple values, but same type.
  • Always fixed length at compile time (similar to tuples).
  • Use [i] to access an individual i-th value.
  • Destructuring as with tuples.
  • Rust always checks array bounds when accessing a value in an array.
  • This is not Python's list type! (Vec later).
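
An out-of-bounds `arr[i]` panics at run time. If a missing element should be handled instead, `get` returns an Option. A minimal sketch with a made-up helper `element_at`:

```rust
/// Hypothetical helper: bounds-checked access that returns `None` instead of panicking.
fn element_at(arr: &[i32; 3], i: usize) -> Option<i32> {
    arr.get(i).copied()
}

fn main() {
    let arr: [i32; 3] = [1, 2, 3];

    println!("{:?}", element_at(&arr, 1)); // Some(2)
    println!("{:?}", element_at(&arr, 7)); // None
}
```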

Control Flow

fn main() {
    let mut x = 0;
    loop {
        if x < 5 {
            println!("x: {}", x);
            x += 1;
        } else {
            break;
        }
    }

    let mut y = 5;
    while y > 0 {
        y -= 1;
        println!("y: {}", y);
    }

    for i in [1, 2, 3, 4, 5] {
        println!("i: {}", i);
    }
}
  • A loop or if condition must always evaluate to a boolean type, so no if 1.

  • Use break to exit a loop (this also works with for and while) and continue to skip to the next iteration.

Functions

fn add(a: i32, b: i32) -> i32 {
    a + b // or: `return a+b;`
}

fn returns_nothing() -> () {
    println!("Nothing to report");
}

fn also_returns_nothing() {
    println!("Nothing to report");
}
  • The function signature must be annotated with types.
  • Type inference may be used in function body.
  • A function that returns nothing has the return type unit ().
  • Return a value either by ending the body with an expression without a semicolon, or by writing return expr;.

Statements

  • Statements are instructions that perform some action and do not return a value.
  • A definition of any kind (function definition etc.)
  • The let var = expr; statement.
  • Almost everything else is an expression.


Examples

fn my_fun() {
    println!("{}", 5);
}
let x = 10;
return 42;
let x = (let y = 10); // invalid

Expressions

  • Expressions evaluate to a resulting value.
  • Expressions make up most of the Rust code you write.
  • Includes all control flow such as if and loop.
  • Includes scoping braces ({ and }).
  • Semicolon (;) turns expression into statement.
fn main() {
    let y = {
        let x = 3;
        x + 1
    };
    println!("{}", y); // 4
}

Scope

  • We just mentioned the scope braces ({ and }).
  • Variable scopes are actually very important for how Rust works.
fn main() {
    println!("Hello, {}", name);  // invalid: name is not yet defined

    {
        let name = "world";  // from this point name is in scope
        println!("Hello, {}", name);
    } // name goes out of scope

    println!("Hello, {}", name);  // invalid: name is no longer defined
}

Expressions - Control Flow

  • Remember: A block/function can end with an expression, but it needs to have the correct type

  • Control flow expressions as a statement do not
    need to end with a semicolon if they return unit (()).

fn main() {
    let y = 11;
    // if as an expression
    let x = if y < 10 {
        42    // missing ;
    } else {
        24    // missing ;
    };

    // if (control-flow expr.) as a statement
    if x == 42 {
        println!("Foo");
    } else {
        println!("Bar");
    } // no ; necessary
}

Expression - Control Flow

Quiz: Does this compile?

fn main() {
    if 2 < 10 {
        42
    } else {
        24
    }
}
fn main() {
    if 2 < 10 {
        42
    } else {
        24
    };
}

Answer: No. The first version needs a ; on line 6 (as in the second version, which ends with };) because the if expression evaluates to a value, which must be turned into a statement.

Expression - Control Flow

Quiz: Does this compile?

fn main() {
    let a = if if 1 != 2 { 3 } else { 4 } == 4 {
        2
    } else {
        1
    };

    println!("{}", a)
}

Answer: Yes - a == 1.

Scope (more)

When a scope ends, all variables for that scope become “extinct” (deallocated/removed from the stack).

fn main() { // nothing in scope here
  let i = 10; // i is now in scope

  if i > 5 {
      let j = 20; // j is now in scope
      println!("i = {}, j = {}", i, j);
  } // j is no longer in scope

  println!("i = {}", i);
} // i is no longer in scope

def main():
  i = 10

  if i > 5:
      j = 20
      print(f"i = {i}, j = {j}")


  print(i, j) # 💩: j is STILL in scope

Note: This is very different from Python.

Printing & Formatting Values

With the format! or println! macros (later) you can format or print variables to stdout of your application:

fn main() {
  let x = 130;
  let y = 50;

  println!("{} + {}", x, y);
  println!("{x} + {y}");

  let s: String = format!("{x:04} + {0:>10}", y);
  println!("{s}")
}

Output:

130 + 50
130 + 50
0130 +         50
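
A few more format specifiers that come up often, as a hedged sketch (the helper `report` and its column widths are invented for illustration):

```rust
/// Hypothetical helper: left-aligns a name in 6 columns and right-aligns
/// a value in 8 columns with two decimal places.
fn report(name: &str, value: f64) -> String {
    format!("{name:<6}|{value:>8.2}|")
}

fn main() {
    println!("{}", report("x", 1.23456)); // x     |    1.23|

    println!("{:08.3}", 2.5);     // 0002.500   (width 8, zero-padded, 3 decimals)
    println!("{:#x}", 255);       // 0xff       (hexadecimal with prefix)
    println!("{:#b}", 5);         // 0b101      (binary with prefix)
    println!("{:?}", (1, "two")); // (1, "two") (Debug formatting)
}
```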

Exercise Time (1)

Approx. Time: 20-45 min.

Do the following exercises:

  • basic-syntax: all, 09 (optional: macros, read here)

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Did You Fight The Compiler?

You will get better at this! 🦀 But it needs practice!

Move Semantics

Memory

  • A computer program consists of a set of instructions.
  • Those instructions manipulate some memory.
  • How does a program know what memory can be used?

Program Execution

An executable binary (a file on your disk) will be loaded first into the memory.

  • The machine instructions are loaded into the memory.

  • The static data in the binary (i.e. strings etc) is also loaded into memory.

  • The CPU starts fetching the instructions from the RAM and will start to execute the machine instructions.

  • Two memory mechanisms are at play when executing: the stack and the heap

Further Technical Videos: A program is not a process, Why is the heap so slow, Why is the stack so fast.

Memory Layout

Stack

Contiguous area of memory for the local variables of functions.

  • It is fixed in size (set at program start and OS-dependent).

  • The stack grows and shrinks as it follows function calls. Each function call gets its own stack frame for all its local variables.

  • Variables must have fixed sizes known at compile time. (Otherwise the compiler cannot compute the stack frame size.)

  • Access is extremely fast: offset the stack pointer.

  • Great memory locality → friendly to CPU caches.

Memory Layout

Stack - Example

fn foo() { // Enter 2. stack frame.
    let a: u32 = 10; // `a` lives on the stack and holds 10.
    println!("a address: {:p}", &a);

    let b: u32 = a;  // Copy `a` to `b`.
    println!("b address: {:p}", &b);
} // `a,b` out of scope, we leave the stack frame.

fn main() { // Enter 1. stack frame.
  foo()
}

Stack frame for foo needs at least \(2 \cdot 32\) bits = \(2 \cdot 4\) bytes = \(8\) bytes.

a address: 0x7ffdb6f09c08
b address: 0x7ffdb6f09c0c  // 08 + 4bytes = 0c

The Heap (1)

The heap is just one big pile of memory for dynamic memory allocation.

Usage

  • Memory which outlives the stack (when you leave the function).

  • Storing big objects in memory is done using the heap.
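
As a rough illustration of both points, a Vec keeps its elements on the heap, so the data can outlive the stack frame of the function that created it, and Box places a single value on the heap. The helper `make_squares` is a made-up example:

```rust
/// Hypothetical helper: builds a vector whose elements live on the heap.
fn make_squares(n: u64) -> Vec<u64> {
    let mut squares = Vec::new();
    for i in 1..=n {
        squares.push(i * i);
    }
    squares // the heap buffer outlives this stack frame; only the small handle moves out
}

fn main() {
    let squares = make_squares(5);
    println!("{:?}", squares); // [1, 4, 9, 16, 25]

    // `Box` puts a single (possibly large) value on the heap.
    let boxed: Box<[u8; 1024]> = Box::new([0u8; 1024]);
    println!("{}", boxed.len()); // 1024
}
```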

The Heap (2)

The memory management on the heap depends on the language you write.

Mechanics

  • Allocation/deallocation on the heap is done by the operating system.

    • Linux: Programs call into glibc (malloc, etc.), which interacts with the kernel.
  • Depends on the language:

    • Full Control: C, C++, Pascal, …:
      • Programmer decides when to allocate and deallocate memory.
      • Programmer must ensure that pointers still point to valid memory → 🚀 vs. 💣🐞
    • Full Safety: Java, Python, Go, Haskell, …:
      • A runtime system (garbage collector) ensures memory is deallocated at the right time → 🐌 vs. 🦺

Mechanics 🦀

  • Full Control and Safety: Rust - Via compile time enforcement of correct memory management.

    • It does this with an explicit ownership concept.
    • It tracks lifetimes (of references).

Variable Scoping (recap)

fn main() {
    let i: u32 = 10; // `i` in scope.

    if i > 5 {
        let j = i;
    }  // `j` no longer in scope.

    println!("i = {}", i);
} // i is no longer in scope
  • The types of i and j are examples of Copy types.
  • What if copying is too expensive?

Ownership (1)

// Create a variable on the stack.
let a = 5;

Local integer a is allocated on the stack.

// Create an owned, heap allocated string
let a = String::from("hello");

Strings (a) store data on the heap because they can grow.

Ownership (2)

fn foo() {
  let a = 5;
  let b = a;
}

Assignment of a to b copies a to b.

fn foo() {
  let a = String::from("hello");
  let b = a;
}

Assigning a to b transfers ownership (move).

  • When a goes out of scope: nothing happens.
  • When b goes out of scope: the string data is deallocated.

Ownership (3)

fn foo() {
  let a = String::from("hello");
  let b = a;
  println!("{}, world!", a);
  //      Nope!! ❌ -----^
}
error[E0382]: borrow of moved value: `a`
--> src/main.rs:4:28
  |
2 |     let a = String::from("hello");
  |         - move occurs because `a`
  |           has type `String`, which
  |           does not implement the `Copy`
  |           trait
  |
3 |     let b = a;
  |             - value moved here
4 |     println!("{}, world!", a);
  |                            ^
  |            value borrowed here
  |            after move

Ownership - The Rules

  • There is only ever one owner of a stack value.

  • Once the owner goes out of scope (and is removed from the stack), any associated values on the heap will be deallocated.

  • Rust transfers ownership for non-copy types: move semantics.
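
The three rules can be seen together in a short sketch (the function `consume` is a made-up example, not part of the workshop code):

```rust
/// Hypothetical helper: takes ownership of the vector.
fn consume(v: Vec<i32>) -> usize {
    v.len()
} // `v` goes out of scope here: its heap buffer is deallocated

fn main() {
    let a = vec![1, 2, 3];
    let b = a; // move: `b` is now the only owner, `a` is unusable
    let n = consume(b); // move into the function: `b` is unusable afterwards

    // println!("{:?}", a); // error[E0382]: borrow of moved value: `a`

    let x = 5;
    let y = x; // `i32` is a Copy type: both `x` and `y` stay usable
    println!("{x} {y} {n}"); // 5 5 3
}
```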

Ownership - Move into Function

fn main() {
  let a = String::from("hello");

  let len = calc_length(a);
  println!("Length of '{}' is {}.",
           a, len);
}

fn calc_length(s: String) -> usize {
  s.len()
}

What will happen when we print on line 5?

error[E0382]: borrow of moved value: `a`
--> src/main.rs:4:43
  |
2 | let a = String::from("hello");
  |     - move occurs because `a`
  |       has type `String`,
  |       which does not implement the
  |       `Copy` trait
  |
3 | let len = calc_length(a);
  |      value moved here -
  |
4 | println!("Length of '{}' is {}.",
  |          a, len);
  |          ^
  |          value borrowed here after move

Ownership - Moving Out of Function

We can return a value to move it out of the function

fn main() {
    let a = String::from("hello");
    let (len, a) = calc_length(a);

    println!("Length of '{}' is {}.", a, len);
}

fn calc_length(s: String) -> (usize, String) {
    (s.len(), s)
}
Compiling playground v0.0.1
Finished dev ...
Running `target/debug/playground`

Length of 'hello' is 5.

Clone

  • Many types in Rust are Clone-able.
  • Use clone() to create an explicit clone.
    • In contrast to Copy which creates an implicit copy.
  • ⏱️ Clones can be expensive and could take a long time, so be careful.
  • 🐌 Not very efficient if a clone is short-lived like in this example.

fn main() {
    let x = String::from("hellothisisaverylongstring...");
    let len = get_length(x.clone());
    println!("{}: {}", x, len);
}

fn get_length(s: String) -> usize {
    s.len()
}

Clone Explicitness vs. Python

In contrast to Rust, Python hides when stuff gets copied or referenced:

# Python
a = 1
b = a
b += 1
print(a) # `a` is unchanged
         #  -> `int` is immutable: `b += 1` rebinds `b`.

# Python
a = {'a': 1}
b = a    # `b` is a reference to `a`.
b['a'] += 1
print(a) # `a` is now {'a': 2}
         #  -> `a` and `b` refer to the same dict object.
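
For comparison, a possible Rust counterpart of the dict example, where both the copy and the shared mutation have to be written out. This is only a sketch: the helper `increment` is invented, and `&mut` references are introduced in the next section.

```rust
use std::collections::HashMap;

/// Hypothetical helper: increments the value stored under `key` in place.
fn increment(map: &mut HashMap<String, i32>, key: &str) {
    *map.entry(key.to_string()).or_insert(0) += 1;
}

fn main() {
    let mut a = HashMap::from([("a".to_string(), 1)]);

    let mut b = a.clone(); // explicit deep copy: `a` and `b` are independent
    increment(&mut b, "a");
    println!("{:?} {:?}", a, b); // {"a": 1} {"a": 2}

    increment(&mut a, "a"); // explicit mutable borrow: `a` itself changes
    println!("{:?}", a); // {"a": 2}
}
```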

Exercise Time (2)

Approx. Time: 20-30 min.

Do the following exercises:

  • move-semantics: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Ownership and Borrowing

Ownership

We previously talked about ownership:

  • There is always a single owner for each stack value.
  • If the owner goes out of scope, any associated values are cleaned up.
  • Copy types (Copy trait) create copies, all other types are moved.

Moving Out of a Function

We have previously seen this example:

fn main() {
    let a = String::from("hello");
    let len = calc_length(a);

    println!("Length of '{}' is {}.", a, len);
}

fn calc_length(s: String) -> usize {
    s.len()
}
  • Does not compile ⇒ ownership of a is moved into calc_length ⇒ no longer available in main.
  • We can use Clone to create an explicit copy.
  • We can give ownership back by returning the value.

Question: Are there other options?

Moving Out of a Function (🐍)

In Python we have this:

def main():
    a = "hello"
    l = calc_length(a)

    print(f"Length of '{a}' is {l}.")


def calc_length(s: str) -> int:
    return len(s)

Question: What memory does s refer to? Is it a copy?

Borrowing

  • Analogy: if somebody owns something you can borrow it from them, but eventually you have to give it back.

  • If a value is borrowed, it is not moved and the ownership stays with the original owner.

  • To borrow in Rust, we create a reference with &:

fn main() {
    let x = String::from("hello");
    let len = get_length(&x); // borrow with &

    println!("{}", x);
}

fn get_length(s: &String) -> usize {
    s.len()
}

Shared References

Create a shared (read-only or immutable) reference &:

fn main() {
    let s = String::from("hello");
    change(&s);

    println!("{}", s);
}

fn change(s: &String) {
    s.push_str(", world");
}
error[E0596]:
    cannot borrow `*s` as mutable,
    as it is behind a `&` reference
 --> src/main.rs:8:5
8 | fn change(s: &String) {
  |              -------
  |     help: consider changing this to
  |            be a mutable reference:
  |           `&mut String`
9 |     s.push_str(", world");
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |     `s` is a `&` reference, so the data
  |     it refers to cannot be borrowed
  |     as mutable

For more information about this error,
try `rustc --explain E0596`.

Exclusive References

Create an exclusive (writable or mutable) reference &mut:

fn main() {
    let mut s = String::from("hello");
    change(&mut s);

    println!("{}", s);
}

fn change(s: &mut String) {
    s.push_str(", world");
}
Compiling playground v0.0.1 (/playground)
Finished dev target(s) in 2.55s
Running `target/debug/playground`
hello, world
  • A write reference can even fully replace the original value.

  • Use the dereference operator (*) to modify the value:

    *s = String::from("Goodbye");

Rules for Borrowing and References

At any given time, a value can have either:

References

  • A single write reference &mut T 🖊️

OR

  • Many read references &T 📑 📑 📑

Lifetime

  • References cannot live longer than their owners.
  • A reference will always at all times point to a valid value.

These rules are enforced by the borrow checker.

Borrowing and Memory Safety

  • The ownership model guarantees the absence of:
    null pointer dereferences, data races, dangling pointers, use after free.

  • 🦺 Rust is memory safe without any runtime garbage collector.

  • 🚀 Performance of a language that would normally let you manage memory manually.

Borrow Checker’s Scope

  • There are different facilities in Rust to work around some limitations of the borrow checker (Interior Mutability (later)).

Reference Example

fn main() {
    let mut s = String::from("hello");
    let a = &s;
    let b = &s;
    let c = &mut s;
    println!("{} - {} - {}", a, b, c);
}
error[E0502]: cannot borrow `s` as mutable
              because it is also
              borrowed as immutable
 --> src/main.rs:5:14
  |
3 |     let a = &s;
  |              - immutable borrow occurs
  |                here
4 |     let b = &s;
5 |     let c = &mut s;
  |              ^^^^^ mutable borrow occurs here
  |
6 |     println!("{} - {} - {}", a, b, c);
  |                              ^
  |  immutable borrow later used here

For more information about this error,
try `rustc --explain E0502`.
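
One way to make this example compile is to finish using the shared references before creating the exclusive one. A borrow ends at its last use, not at the end of the scope. A sketch, with the helper `append_world` invented for illustration:

```rust
/// Hypothetical helper: mutates the string through an exclusive reference.
fn append_world(s: &mut String) {
    s.push_str(", world");
}

fn main() {
    let mut s = String::from("hello");

    let a = &s;
    let b = &s;
    println!("{} - {}", a, b); // last use of the shared borrows `a` and `b`

    let c = &mut s; // fine: no shared borrow is used after this point
    append_world(c);

    println!("{}", s); // hello, world
}
```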

Returning References - Quiz

Question: Does the following work?

fn give_me_a_ref() -> &String {
  let s = String::from("Ups");
  &s
}
error[E0106]: missing lifetime specifier
1 | fn give_me_a_ref() -> &String {
  |                       ^ expected named
  |                         lifetime parameter
  = help: this function's return type
        contains a borrowed value,
          but there is no value for it to be borrowed from
help: consider using the `'static` lifetime,
      but this is uncommon unless you're returning a
      borrowed value from a `const` or a `static`
1 | fn give_me_a_ref() -> &'static String {
  |                        +++++++
help: instead, you are more likely
      to want to return an owned value
1 | fn give_me_a_ref() -> String {

❗Note: Returning a reference to a stack value (e.g. s) is not possible.

Returning References - Quiz

The following is the correct signature:

fn give_me_a_value() -> String {
    let s = String::from("Hello, world!");
    s
}

Returning References

You can however pass a reference through the function:

fn give_me_a_ref(input: &(String, i32)) -> &String {
    &input.0
}
  • Rust annotates each reference with a lifetime.
  • How to use lifetimes? → later!
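
A short usage sketch of the function above: the returned reference borrows from `pair`, so it can be used as long as `pair` is alive. No lifetime annotation is needed here because the compiler infers it from the single reference parameter.

```rust
fn give_me_a_ref(input: &(String, i32)) -> &String {
    &input.0
}

fn main() {
    let pair = (String::from("hello"), 42);

    let name = give_me_a_ref(&pair); // `name` borrows from `pair`
    println!("{} has {} characters", name, name.len()); // hello has 5 characters

    // `pair` is still the owner and remains usable.
    println!("{}", pair.1); // 42
}
```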

Exercise Time (3)

Approx. Time: 20-30 min.

Do the following exercises:

  • borrowing: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Composite Types

Types Redux

We have seen so far:

  • Primitives (integers, floats, booleans, characters)
  • Compounds (tuples, arrays)
  • Most of the types we looked at were Copy

Borrowing will make more sense with some more complex data types.

Structuring data

Rust has two important ways to structure data, plus a rarely used third one:

  • struct: product type
  • enum: sum type
  • union: rarely needed (mainly for C interop)

Structs (tuple structs)

A tuple struct is similar to a tuple but it has its own name:

struct ControlPoint(f64, f64, bool);

Access to members the same as with tuples:

fn main() {
  let cp = ControlPoint(10.5, 12.3, true);
  println!("{}", cp.0); // prints 10.5
}

Structs

More common (and preferred) - structs with named fields:

struct ControlPoint {
  x: f64,
  y: f64,
  enabled: bool,
}
  • Each member has a proper identifier.
fn main() {
  let cp = ControlPoint {
    x: 10.5,
    y: 12.3,
    enabled: true,
  };

  println!("{}", cp.x); // prints 10.5
}

Enumerations

One other powerful type is the enum. It is a sum type:

enum Fruit {
  Banana,
  Apple,
}
fn main() {
  let fruit = Fruit::Banana;
}
  • An enumeration has different variants, Python analogy:

    variant: str | int | List[str]
  • Each variant is an alternative value of the enum, pick a value on creation.

Enumeration - Mechanics

enum Fruit {
  Banana, // discriminant: 0
  Apple, // discriminant: 1
}
  • Each enum has a discriminant (hidden by default):

    • A numeric value (isize by default, can be changed by using #[repr(numeric_type)]) to determine the current variant.

    • One cannot rely on the discriminant being isize, the compiler may decide to optimize it.

Enumerations - Data (1)

Enums are very powerful: each variant can have associated data

enum Fruit {
  Banana(u16, u16), // discriminant: 0
  Apple(f32, f32), // discriminant: 1
}
fn main() {
  let banana = Fruit::Banana(3, 2);   // 🍌 (emoji are not valid identifiers)
  let apple = Fruit::Apple(3.0, 4.0); // 🍎
}
  • The associated data and the variant are bound together.
  • Impossible to create Apple only giving u16 integers.
  • An enum is as large as the largest variant + size of the discriminant (plus possible padding).
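
The sizes can be inspected with `std::mem::size_of`. The exact layout of a plain enum is up to the compiler, so the sketch below only prints the enum size without assuming a value; the helper names are invented for illustration.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Fruit {
    Banana(u16, u16),
    Apple(f32, f32),
}

/// Hypothetical helper: size of the whole enum in bytes.
fn fruit_size() -> usize {
    size_of::<Fruit>()
}

/// Hypothetical helper: bytes needed for the data of the largest variant (`Apple`).
fn largest_payload() -> usize {
    2 * size_of::<f32>()
}

fn main() {
    println!("largest payload: {} bytes", largest_payload()); // 8
    println!("Fruit: {} bytes", fruit_size()); // payload + discriminant (+ padding)

    // The compiler may optimize the discriminant away entirely:
    // `Option<Box<T>>` uses the null pointer to encode `None`.
    println!("{}", size_of::<Option<Box<u8>>>() == size_of::<Box<u8>>()); // true
}
```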

Mix Sum and Product Types

Combining sum-type with a product-type:

struct Color {
  rgb: (bool, bool, bool)
}

enum Fruit {
  Banana(Color),
  Apple(bool, bool)
}

fn main() {
  let banana = Fruit::Banana(Color{rgb: (false,true,false)}); // 🍌
  let apple = Fruit::Apple(false, true);                      // 🍎
}

The type Fruit has \((2\cdot 2 \cdot 2) + (2\cdot 2) = 12\) possible states.

Enumerations - Discriminant

You can control the discriminant like:

#[repr(u32)]
enum Bar {
    A, // 0
    B = 10000,
    C, // 10001
}

fn main() {
    println!("A: {}", Bar::A as u32);
    println!("B: {}", Bar::B as u32);
    println!("C: {}", Bar::C as u32);
}

Exercise Time (4)

Approx. Time: 20-30 min.

Do the following exercises:

  • composite-types: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Pattern Matching

Extracting Data from enum

  • We must ensure we interpret enum data correctly.
  • Use pattern matching to do so.

Pattern Matching

Using the if let <pattern> = <expr> statement:

fn accept_banana(fruit: Fruit) {
  if let Fruit::Banana(a, _) = fruit {
    println!("Got a banana: {}", a);
  } else {
    println!("not handled")
  }
}
  • a is a local variable within the if body.

  • The underscore (_) can be used to accept any value.

  • Note: An alternative for the above is let else. The else block must diverge (e.g. return, break or panic!):

    let Fruit::Banana(a, _) = fruit else { println!("not handled."); return; };
  • There is also while let.
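
A small sketch of let else and while let in action. It assumes the two-field Fruit enum from the composite-types section; the function names `banana_length` and `drain_sum` are made up.

```rust
#[allow(dead_code)]
enum Fruit {
    Banana(u16, u16),
    Apple(f32, f32),
}

/// Hypothetical helper: `let else` binds `a` for the rest of the function
/// or leaves the function early in the `else` block.
fn banana_length(fruit: Fruit) -> Option<u16> {
    let Fruit::Banana(a, _) = fruit else {
        return None; // the else block must diverge
    };
    Some(a)
}

/// Hypothetical helper: `while let` loops as long as the pattern matches.
fn drain_sum(mut stack: Vec<u32>) -> u32 {
    let mut sum = 0;
    while let Some(top) = stack.pop() {
        sum += top;
    }
    sum
}

fn main() {
    println!("{:?}", banana_length(Fruit::Banana(3, 2))); // Some(3)
    println!("{:?}", banana_length(Fruit::Apple(3.0, 4.0))); // None
    println!("{}", drain_sum(vec![1, 2, 3])); // 6
}
```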

Match Statement

Pattern matching is very powerful if combined with the match statement:

fn accept_fruit(fruit: Fruit) {
  match fruit {
    Fruit::Banana(3) => {
      println!("Banana is 3 months old.");
    },
    Fruit::Banana(v) => {
      println!("Banana: age {:?}.", v)
    }
    Fruit::Apple(true, _) => {
      println!("Ripe apple.");
    },
    _ => {
      println!("Wrong fruit...");
    },
  }
}
enum Fruit {
  Banana(u8),
  Apple(bool, bool)
}
  • Every part of the match is called an arm. First match from top to bottom wins!

  • A match is exhaustive, meaning all possible values must be handled.

  • Use a catch-all _ arm for all remaining cases. Use deliberately!

Match Expression

The match statement can even be used as an expression:

fn get_age(fruit: Fruit) {
  let age = match fruit {
    Fruit::Banana(a) => a,
    Fruit::Apple(..) => 1, // `..` ignores all fields of the variant.
  };

  println!("The age is: {}", age);
}
  • All match arms must return the same type.
  • Without a catch-all (_ =>) arm, every case must be handled explicitly.

Complex Match Statements

fn main() {
    let input = 'x';
    match input {
        'q'                       => println!("Quitting"),
        'a' | 's' | 'w' | 'd'     => println!("Moving around"),
        '0'..='9'                 => println!("Number input"),
        key if key.is_lowercase() => println!("Lowercase: {key}"),
        _                         => println!("Something else"),
    }
}
  • | means or.
  • '0'..='9' is an inclusive range (later!).

Quiz: Why not if key.is_lowercase() after => ?

Answer: That would never print Something else.
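
The quiz can be checked with a small sketch that returns a label instead of printing. The function name classify and the labels are assumptions for this illustration. Because the guard is part of the pattern, a non-lowercase key falls through to the catch-all arm.

```rust
fn classify(input: char) -> &'static str {
    match input {
        'q' => "quit",
        'a' | 's' | 'w' | 'd' => "move",
        '0'..='9' => "number",
        // The guard belongs to the arm: if it is false, matching continues below.
        key if key.is_lowercase() => "lowercase",
        _ => "other",
    }
}

fn main() {
    for c in ['q', 'w', '7', 'x', 'X'] {
        println!("{c}: {}", classify(c));
    }
}
```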

Implementation Blocks impl

Implementing Member Functions

To associate functions to structs and enums, we use impl blocks

fn main() {
  let x = "Hello";
  x.len();
}
  • Syntax x.len() similar to field access in structs.

The impl Block (2)

struct Banana
{
  size: f64,
}

impl Banana {
  fn get_volume(&self) -> f64 {
    return self.size * self.size * 1.5;
  }
}

fn main() {
  let b = Banana{size: 4.0};
  let v = b.get_volume();
}
  • Functions can be defined on our types using impl blocks.

  • Implementation blocks possible on any type, not just structs (with exceptions).

The self & Self: Implementation

  • self parameter: the receiver on which a function is defined.
  • Self type: shorthand for the type of current implementation.
struct Banana { size: f64 }

impl Banana {
    fn new(i: f64) -> Self { Self { size: i } }

    fn consume(self) -> Self {
        Self::new(self.size - 5.0)
    }
    // Take read reference of `Banana` instance.
    fn borrow(&self) -> &f64 { &self.size }
    // Take write reference of `Banana` instance.
    fn borrow_mut(&mut self) -> &mut f64 {
        &mut self.size
    }
}
  • Absence of a self parameter means it's an associated function on that type (e.g. new).
  • self is always the first argument and it always has the type on which impl is defined (type annotation not needed).
  • Prepend & or &mut to self to indicate that we take a value by reference.

The self & Self: Application

struct Banana { size: f64 }

impl Banana {
    fn new(i: f64) -> Self { Self { size: i } }

    fn consume(self) -> Self {
        Self::new(self.size - 5.0)
    }
    // Take read reference of `Banana` instance.
    fn borrow(&self) -> &f64 { &self.size }
    // Take write reference of `Banana` instance.
    fn borrow_mut(&mut self) -> &mut f64 {
        &mut self.size
    }
}
fn main () {
  let mut f = Banana::new(20.0);
  println!("{}", f.borrow());

  *f.borrow_mut() = 10.0;

  let g = f.consume();
  println!("{}", g.borrow());
}

Optionals and Error Handling

Generics

structs become more powerful with generics:

struct PointFloat(f64, f64);
struct PointInt(i64, i64);

This is repeating data types 🤨. Is there something better?

struct Point<T>(T, T);

fn main() {
  let float_point: Point<f64> = Point(10.0, 10.0);
  let int_point: Point<i64> = Point(10, 10);
}

Generics are much more powerful (later more!)

The Option Type

A quick look into the standard library of Rust:

  • Rust does not have null (for good reasons: 🤬 💣 🐞).
  • For values that might be absent: use Option<T>.
enum Option<T> {
  Some(T),
  None,
}

fn main() {
  let some_int = Option::Some(42);
  let no_string: Option<String> = Option::None; // You need the type here!
}

Error Handling

What would we do when there is an error?

fn divide(x: i64, y: i64) -> i64 {
  if y == 0 {
    // what to do now?
  } else {
    x / y
  }
}

Error Handling

What would we do when there is an error?

fn divide(x: i64, y: i64) -> i64 {
  if y == 0 {
    panic!("Cannot divide by zero");
  } else {
    x / y
  }
}
  • A panic! in Rust is the most basic way to handle errors.

  • A panic! will immediately stop running the current thread/program using one of two methods:

    • Unwinding: Going up through the stack and making sure that each value is cleaned up.
    • Aborting: Ignore everything and immediately exit the thread/program (OS will clean up).

Error Handling with panic!

  • Only use panic! in small programs if normal error handling would also exit the program.

  • ❗Avoid using panic! in library code or other reusable components.

Error Handling with Option<T>

We could use an Option<T> to handle the error:

fn divide(x: i64, y: i64) -> Option<i64> {
  if y == 0 {
    None
  } else {
    Some(x / y)
  }
}
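
A short sketch of how a caller could consume the Option<i64> returned by divide. The helper describe is a name made up for this illustration.

```rust
fn divide(x: i64, y: i64) -> Option<i64> {
    if y == 0 {
        None
    } else {
        Some(x / y)
    }
}

// Hypothetical caller: both cases have to be handled explicitly.
fn describe(x: i64, y: i64) -> String {
    match divide(x, y) {
        Some(q) => format!("{x} / {y} = {q}"),
        None => format!("{x} / {y} is undefined"),
    }
}

fn main() {
    println!("{}", describe(10, 2));
    println!("{}", describe(1, 0));
}
```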

Error Handling with Result<T,E>

The Result<T,E> is a powerful enum for error handling:

enum Result<T, E> {
  Ok(T),
  Err(E),
}

#[derive(Debug)] // Needed to `unwrap`/`expect` a `Result<_, DivideError>`.
enum DivideError {
  DivisionByZero,
  CannotDivideOne,
}
fn divide(x: i64, y: i64) -> Result<i64, DivideError> {
  if x == 1 {
    Err(DivideError::CannotDivideOne)
  } else if y == 0 {
    Err(DivideError::DivisionByZero)
  } else {
    Ok(x / y)
  }
}

Handling Results

Handle the error at the call site:

fn div_zero_fails() {
  match divide(10, 0) {
    Ok(div) => println!("{}", div),
    Err(e) => panic!("Could not divide by zero"),
  }
}
  • Signature of divide function is explicit in how it can fail.

  • The user (call site) of it decides what to do, even if it decides to panic 🌻.

  • Note: just as with Option's Some and None, Result's Ok and Err are available globally (no Result:: prefix needed).

Option vs. Result

         Option<T>                   Result<T, E>
Usage    Represent an empty state.   Error handling.
         Some(...)                   Ok(...)
🛑       None                        Err(...)

Handling Results

When prototyping you can use unwrap or expect on Option and Result:

fn div_zero_fails() {
  let div = divide(10, 0).unwrap();
  println!("{}", div);

  let div = divide(10, 0).expect("should work!");
  println!("{}", div);
}
  • unwrap: returns x for Ok(x) or Some(x), or panics for Err(e) or None.

  • expect: the same, but with a custom panic message.

  • Too many unwraps are generally bad practice.

  • If you can guarantee that an error cannot occur, using unwrap is a good solution.

Handling Results

Rust has lots of helper functions on Option and Result:

fn div_zero_fails() {
  let div = divide(10, 0).unwrap_or(-1);
  println!("{}", div);
}

Besides unwrap, there are some other useful utility functions

  • unwrap_or(val): If there is an error, use the value given to unwrap_or instead.
  • unwrap_or_default(): Use the default value for that type if there is an error (Default).
  • unwrap_or_else(fn): Same as unwrap_or, but instead call a function fn that generates a value in case of an error.
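
The three helpers can be compared in one sketch. It assumes a reduced DivideError with a single variant; the fallback values are arbitrary.

```rust
#[derive(Debug)]
enum DivideError {
    DivisionByZero,
}

fn divide(x: i64, y: i64) -> Result<i64, DivideError> {
    if y == 0 {
        Err(DivideError::DivisionByZero)
    } else {
        Ok(x / y)
    }
}

fn main() {
    // Fixed fallback value.
    let a = divide(10, 0).unwrap_or(-1);
    // Fallback is `i64::default()`, i.e. 0.
    let b = divide(10, 0).unwrap_or_default();
    // Fallback is computed lazily and receives the error value.
    let c = divide(10, 0).unwrap_or_else(|e| {
        eprintln!("falling back because of {e:?}");
        i64::MAX
    });
    // No error: the fallback is ignored.
    let d = divide(10, 2).unwrap_or(-1);

    println!("{a} {b} {c} {d}");
}
```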

The Magic ? Operator

There is a special operator associated with Result, the ? operator

See how this function changes if we use the ? operator:

fn can_fail() -> Result<i64, DivideError> {
  let num: i64 = match divide(10, 1) {
    Ok(v) => v,
    Err(e) => return Err(e),
  };

  match divide(num, 2) {
    Ok(v) => Ok(v * 2),
    Err(e) => Err(e),
  }
}

The Magic ? Operator

fn can_fail() -> Result<i64, DivideError> {
  let num: i64 = match divide(10, 1) {
    Ok(v) => v,
    Err(e) => return Err(e),
  };

  match divide(num, 2) {
    Ok(v) => Ok(v * 2),
    Err(e) => Err(e),
  }
}
fn can_fail() -> Result<i64, DivideError> {
  let num = divide(10, 1)?;

  Ok(divide(num, 2)? * 2)
}
  • The ? operator does an implicit match:

    • on Err(e): e is immediately returned (early return).

    • on Ok(v): the value v is extracted and execution continues.
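
A runnable sketch of ? using only the standard library. The function names parse_and_double and first_char_upper are assumptions for this illustration. It also shows that ? works on Option, where None is returned early.

```rust
use std::num::ParseIntError;

// On `Err`, `?` returns the parse error to the caller right away.
fn parse_and_double(text: &str) -> Result<i64, ParseIntError> {
    let n: i64 = text.trim().parse()?;
    Ok(n * 2)
}

// On `None`, `?` returns `None` to the caller right away.
fn first_char_upper(text: &str) -> Option<char> {
    let c = text.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", parse_and_double(" 21 "));
    println!("{:?}", parse_and_double("abc"));
    println!("{:?}", first_char_upper("rust"));
    println!("{:?}", first_char_upper(""));
}
```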

Exercise Time (5)

Approx. Time: 20 min. - 1.5 h.

Do the following exercises:

  • options (short)
  • error-handling (longer)
  • error-propagation (longer)

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

The Type Vec<T>

Vec<T>: Storage for Same Type T

Vec<T> is a growable array of elements of type T.

  • Compare this to the array, which has a fixed size:

    fn main() {
      let arr = [1, 2];
      println!("{:?}", arr);
    
      let mut nums = Vec::new();
      nums.push(1);
      nums.push(2);
    
      println!("{:?}", nums);
    }

Vec: Constructor Macro

Vec is a common type. Use the macro vec! to initialize it with values:

fn main() {
  let mut nums = vec![1, 2];
  nums.push(3);

  println!("{:?}", nums);
}

Vec: Memory Layout

How can a vector grow? Things on the stack need to be of a fixed size.

Vec<T> Memory Layout
  • A Vec allocates its contents on the heap (a [i64; 4] is on the stack).

  • Quiz: What happens if the capacity is full and we add another element?

    • The Vec reallocates its memory with more capacity at another memory location: lots of copies 🐌 ⏱️.
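
To make the reallocation visible, the sketch below counts how often the capacity changes while pushing. The helper count_capacity_changes is a name invented here; the exact growth strategy is an implementation detail of the standard library, so only the pre-allocated case has a guaranteed count.

```rust
// Push `pushes` elements and count how often the capacity changed.
fn count_capacity_changes(mut v: Vec<u32>, pushes: u32) -> usize {
    let mut changes = 0;
    let mut cap = v.capacity();
    for i in 0..pushes {
        v.push(i);
        if v.capacity() != cap {
            cap = v.capacity();
            changes += 1;
        }
    }
    changes
}

fn main() {
    // Starts with no allocation: the capacity has to grow several times.
    println!("grown: {}", count_capacity_changes(Vec::new(), 100));
    // Pre-allocated: no growth needed for the first 100 pushes.
    println!("grown: {}", count_capacity_changes(Vec::with_capacity(100), 100));
}
```

If the final size is roughly known in advance, Vec::with_capacity avoids the intermediate copies.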

Slices

Vectors and Arrays

Let's write a sum function for arrays [i64; 10]:

fn sum(data: &[i64; 10]) -> i64 {
  let mut total = 0;

  for val in data {
    total += val;
  }

  total
}

Vectors and Arrays

Or one for just vectors:

fn sum(data: &Vec<i64>) -> i64 {
  let mut total = 0;

  for val in data {
    total += val;
  }

  total
}

Slices

There is a better way.

Slices are typed as [T] : T is the type of the elements in the slice.

Properties

A slice is a dynamically sized view into a contiguous sequence.

  • Contiguous: elements in memory are evenly spaced.

  • Dynamically Sized: the size of the slice is not stored in the type. It is determined at runtime.

  • 👀 View: a slice is never an owned data structure.

Slices

The catch with size known at compile time:

fn sum(data: [i64]) -> i64 {
  let mut total = 0;

  for val in data {
    total += val;
  }

  total
}

fn main() {
  let data = vec![10, 11, 12];
  println!("{}", sum(data));
}
error[E0277]: the size for values of type `[i64]`
              cannot be known at
              compilation time
 --> src/main.rs:1:8
  |
1 | fn sum(data: [i64]) -> i64 {
  |        ^^^^ doesn't have a size known
                at compile-time
  |
  = help: the trait `Sized` is not
          implemented for `[i64]`
help: function arguments must have a
      statically known size, borrowed types
      always have a known size

Slices

fn sum(data: &[i64]) -> i64 {
  let mut total = 0;

  for val in data {
    total += val;
  }

  total
}

fn main() {
  let data = vec![10, 11, 12];
  println!("{}", sum(&data));
}
Compiling playground v0.0.1 (/playground)
Finished dev [unoptimized + debuginfo] target(s) in 0.89s
 Running `target/debug/playground`

Slices - Memory Layout

  • [T] is an incomplete type: we need to know how many Ts there are.

  • Types with known compile-time size implement the Sized trait, raw slices do not.

  • Slices must always be behind a pointer type, e.g. the references &[T] and &mut [T] (but also Box<[T]> etc.).

  • Length of the slice is always stored together with the reference.
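
That the length travels with the reference can be observed through the size of the reference itself. A small sketch using std::mem::size_of:

```rust
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // A reference to a sized type is a single pointer.
    assert_eq!(size_of::<&i64>(), word);
    assert_eq!(size_of::<&[i64; 4]>(), word);

    // A slice reference is a pointer plus a length ("fat pointer").
    assert_eq!(size_of::<&[i64]>(), 2 * word);

    let v = vec![1_i64, 2, 3];
    let tail: &[i64] = &v[1..];
    println!("tail has {} elements", tail.len());
}
```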

Creating Slices

One cannot create slices out of thin air; the data they view has to be located somewhere. Three possibilities:

  • Using a borrow:

    • We can borrow from arrays and vectors to create a slice of their entire contents.
  • Using ranges:

    • We can use ranges to create a slice from parts of a vector or array.
  • Using a literal (for immutable slices only):

    • We can have memory statically available from our compiled binary.

Creating Slices - Borrowing

Using a borrow:

fn sum(data: &[i32]) -> i32 { /* ... */ }

fn main() {
  let v = vec![1, 2, 3, 4, 5, 6];
  let total = sum(&v);

  println!("{}", total);
}

Creating Slices - Ranges

Using ranges:

fn sum(data: &[i32]) -> i32 { /* ... */ }

fn main() {
  let v = vec![0, 1, 2, 3, 4, 5, 6];
  let all = sum(&v[..]);
  let except_first = sum(&v[1..]);
  let except_last = sum(&v[..6]);
  let except_ends = sum(&v[1..6]);
}
  • The range start..end is half-open, i.e.
    x in s..e fulfills s <= x < e.
  • A range start..end has the type std::ops::Range<T>.

    use std::ops::Range;
    
    fn main() {
      let my_range: Range<u64> = 0..20;
    
      for i in 0..10 {
        println!("{}", i);
      }
    }

Creating Slices

From a literal:

fn sum(data: &[i32]) -> i32 { todo!("Sum all items in `data`") }

fn get_array() -> &'static [i32] {
    &[0, 1, 2, 3, 4, 5, 6]
}

fn main() {
  let all = sum(get_array());
}
  • Interestingly get_array works, but looks like it would only exist temporarily.

  • Literals actually exist during the entire lifetime of the program.

  • 'static indicates: this slice/reference exists for the entire lifetime of the program (later more).

Strings

We have already seen the String type being used before, but let’s dive a little deeper:

  • Strings are used to represent text.

  • In Rust they are always valid UTF-8.

  • Their data is stored on the heap.

  • A String is similar to Vec<u8> with extra checks to prevent creating invalid text.

Strings

Let’s take a look at some strings

fn main() {
  let s = String::from("Hello world 🌏");

  println!("{:?}", s.split_once(" "));

  println!("{}", s.len());

  println!("{:?}", s.starts_with("Hello"));

  println!("{}", s.to_uppercase());

  for line in s.lines() {
    println!("{}", line);
  }
}

String Literals

Constructing some strings

fn main() {
  let s1 = "Hello world 🌏";
  let s2 = String::from("Hello world");
}
  • s1 is a slice of type &str: a string slice.

String Literals

Constructing some strings

fn main() {
  let s1: &str = "Hello world";
  let s2: String = String::from("Hello world");
}

The String Slice - &str

It's possible to get only a part of a string. But what is it?

  • Not [u8]: not every sequence of bytes is valid UTF-8

  • Not [char]: a char always takes 4 bytes, whereas a String is stored as UTF-8 encoded bytes (one Unicode character takes 1 to 4 bytes), so we could not create a [char] slice from a string.

It needs a new type: str.

  • For string slices we do not use brackets!
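
A small sketch of how byte ranges and UTF-8 interact when slicing a string. The helper first_word is a name made up for this illustration; get returns None instead of panicking when a range does not fall on character boundaries.

```rust
// Returns a `&str` view into `text`; nothing is copied.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let s = String::from("Hello 🌏");

    println!("{}", first_word(&s));     // view on the first 5 bytes
    println!("{}", s.len());            // length in bytes: 6 + 4
    println!("{}", s.chars().count());  // number of Unicode scalar values

    // Ranges are byte offsets and must fall on character boundaries.
    println!("{:?}", s.get(6..10)); // the whole globe
    println!("{:?}", s.get(6..7));  // cuts into the globe: `None`
}
```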

Types str, String, [T; N], Vec

Static    Dynamic    Borrowed
[T; N]    Vec<T>     &[T]
-         String     &str
  • There is no static variant of String.

  • Only useful if we wanted strings of an exact length.

  • But just like we had the static slice literals, we can use &'static str literals for that instead!

String or str ?

When to use String and when str?

fn string_len1(data: &String) -> usize {
  data.len()
}
fn string_len2(data: &str) -> usize {
  data.len()
}
  • Prefer &str over &String whenever possible.
    Reason: &str gives more freedom to the caller  🚀

  • To mutate a string use: &mut str, but you cannot change a slice’s length.

  • Use String or &mut String if you need to fully mutate the string.
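
The freedom for the caller can be sketched as follows: a &str parameter accepts owned strings, literals and sub-slices alike. The names string_len and shout are assumptions for this illustration.

```rust
// Read-only access: `&str` accepts every kind of string data.
fn string_len(data: &str) -> usize {
    data.len()
}

// Growing the text needs the owning type.
fn shout(data: &mut String) {
    data.push('!');
}

fn main() {
    let mut owned = String::from("Hello world");

    println!("{}", string_len(&owned));       // `&String` coerces to `&str`
    println!("{}", string_len("literal"));    // `&'static str`
    println!("{}", string_len(&owned[0..5])); // sub-slice

    shout(&mut owned);
    println!("{owned}");
}
```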

Exercise Time (6)

Approx. Time: 30-50 min

Do the following exercises:

  • slices:

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Recap 1

Recap 1 - References

Recap 1 - Option & Result & ?

enum Option<T> {
  Some(T),
  None,
}

enum Result<T, E> {
  Ok(T),
  Err(E),
}

Recap - Borrow Checker

Quiz - Does that Compile?

#[derive(Debug)]
enum Color {None, Blue}
struct Egg { color: Color }

fn colorize(egg: &mut Egg) -> &Color {
  egg.color = Color::Blue;
  return &egg.color
}

fn main() {
  let mut egg = Egg {color: Color::None};
  let color: &Color;
  {
    let egg_ref = &mut egg;
    color = colorize(egg_ref)
  }
  println!("color: {color:?}")
}

Question: Does that compile?

Answer: Yes, colorize returns a shared reference to a field of a &mut Egg, which works because we do not access egg_ref after the call to colorize (line 15).

Quiz - Addendum

The colorize method is basically a method on Egg which takes an exclusive reference &mut self only for the duration of that function.

#[derive(Debug)]
enum Color {None, Blue}
struct Egg { color: Color }

impl Egg {
  fn colorize(&mut self) -> &Color {
    self.color = Color::Blue;
    return &self.color
  }
}

fn main() {
  let mut egg = Egg {color: Color::None};
  let color: &Color;
  {
    color = egg.colorize()
  }
  println!("color: {color:?}")
}

Smart Pointers

What is a Smart Pointer?

  • A wrapper type which manages a type T
  • T is allocated on the heap.

Single Ownership - Box<T>

Box<T> will allocate a type T on the heap and wrap the pointer underneath:

fn main() {
  // Put an integer on the heap
  let boxed_int: Box<i64> = Box::new(10);
}
Smart Pointer Box<T>
  • 🧰 Boxing: Store a type T on the heap.
  • 👑 Box uniquely owns that value. Nobody else does.
  • 🧺 A Box variable will deallocate the memory when out-of-scope.
  • 🚂 Move semantics apply to a Box. Even if the type inside the box is Copy.

When to use Box<T>?

Reasons to box a type T on the heap:

  • When something is too large to move around ⏱️.

  • Need something dynamically sized (dyn Trait later).

  • You need single ownership.

  • For writing recursive data structures:

    struct Node {
      data: Vec<u8>,
      parent: Option<Box<Node>>, // `None` for the root node.
    }
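
A recursive type needs the Box for a known size and an Option to terminate the recursion. A minimal sketch of a singly linked list, assuming a hypothetical next field and a helper sum:

```rust
struct Node {
    data: u8,
    // Without `Box` the size of `Node` would be infinite;
    // without `Option` no finite value could ever be built.
    next: Option<Box<Node>>,
}

// Walk the list and add up all `data` fields.
fn sum(head: &Node) -> u32 {
    let mut total = u32::from(head.data);
    let mut current = head;
    while let Some(next) = current.next.as_deref() {
        total += u32::from(next.data);
        current = next;
    }
    total
}

fn main() {
    let tail = Node { data: 3, next: None };
    let mid = Node { data: 2, next: Some(Box::new(tail)) };
    let head = Node { data: 1, next: Some(Box::new(mid)) };

    println!("{}", sum(&head));
}
```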

Shared Ownership - Arc<T>

An Arc<T> (Atomic-Reference-Counted)

  • allows shared ownership of a value of type T.
  • inner value T allocated on the heap.
  • disallows mutation of the inner value T (more in the docs).
Smart Pointer Arc<T>

Shared Ownership - Arc<T> (2)

use std::sync::Arc;

fn main() {
    let data = Arc::new(vec![1, 2, 3]);

    {
      let data_other = data.clone();
      // Is always a cheap pointer copy
      // and does not copy the Vec!

      println!("Owners: {}", Arc::strong_count(&data));
      println!("Data: {:?}", data_other)
    }

} // data is last owner -> deallocated heap memory.
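
Shared ownership pays off once several threads need the same read-only data. The sketch below goes beyond the slides and assumes std::thread and closures (both covered later); parallel_sum is a name invented here.

```rust
use std::sync::Arc;
use std::thread;

// Every worker gets its own `Arc` handle to the same `Vec`.
fn parallel_sum(data: Arc<Vec<i64>>, workers: usize) -> i64 {
    let handles: Vec<thread::JoinHandle<i64>> = (0..workers)
        .map(|_| {
            let local = Arc::clone(&data); // cheap pointer copy
            thread::spawn(move || local.iter().sum::<i64>())
        })
        .collect();

    // Wait for all workers and add up their results.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .sum()
}

fn main() {
    let data = Arc::new(vec![1, 2, 3]);
    println!("{}", parallel_sum(Arc::clone(&data), 4));
    println!("Owners left: {}", Arc::strong_count(&data));
}
```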

Exercise Time (7)

Approx. Time: 40-50 min.

Do the following exercises:

  • exercise: boxed-data: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Traits and Generics

The Problem

fn add_u32(l: u32, r: u32) -> u32 { /*...*/}

fn add_i32(l: i32, r: i32) -> i32 { /*...*/ }

fn add_f32(l: f32, r: f32) -> f32 { /*...*/ }

No-one likes repeating themselves. We need generic code!

Generic code

An example

fn add<T>(a: T, b: T) -> T { /*...*/}

Or, in plain English:

  • <T> : “let T be a type”.
  • a: T : “let a be of type T”.
  • -> T : “let T be the return type of this function”.

Some open points:

  • What can we do with a T?
  • What should the body be?

Bounds on Generic Code

We need to provide information to the compiler:

  • Tell Rust what T can do.
  • Tell Rust which types for T are accepted.
  • Tell Rust how T implements functionality.

The trait Keyword

Describe what a type can do without specifying what data it has:

trait Add {
    fn add(&self, other: &Self) -> Self;
}

This is similar in other languages:

Python (not as strict 🙁):

from abc import ABC, abstractmethod
from typing import Self  # Python >= 3.11

class Add(ABC):
    @abstractmethod
    def add(self, other: Self) -> Self:
        pass

Go:

type Add interface {
    add(other Add) Add
}

Note: Traits are not types!

Implementing a trait

Describe how the type does it

impl Add for u32 {
    fn add(&self, other: &Self) -> Self {
      *self + *other
    }
}

Using a trait

// Import the trait
use my_mod::Add;

fn main() {
  let a: u32 = 6;
  let b: u32 = 8;
  // Call trait method
  let result = a.add(&b);
  // Explicit call
  let result = Add::add(&a, &b);
}
  • Trait needs to be in scope.
  • Call just like a method.
  • Or by using the explicit associated function syntax.

Trait Bounds

fn add_values<T: Add>(this: &T,
                      other: &T) -> T {
  this.add(other)
}
// or, equivalently
fn add_values<T>(this: &T,
                 other: &T) -> T
  where T: Add
{
  this.add(other)
}
// `impl Trait` shorthand: every `impl Add` is its own anonymous
// type parameter, so it only fits a single generic argument.
fn double(this: &impl Add) -> impl Add
{
  this.add(this)
}
  • We’ve got a useful generic function!

  • English: “For all types T that implement the Add trait, we define…”
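
Put together, the custom Add trait, two implementations and the bounded function could look like the sketch below. The String implementation is an assumption added to show that add_values works for any type implementing the trait.

```rust
trait Add {
    fn add(&self, other: &Self) -> Self;
}

impl Add for u32 {
    fn add(&self, other: &Self) -> Self {
        *self + *other
    }
}

// Hypothetical second implementation: "adding" strings concatenates them.
impl Add for String {
    fn add(&self, other: &Self) -> Self {
        format!("{}{}", self, other)
    }
}

// Accepts every `T` that implements our `Add` trait.
fn add_values<T: Add>(this: &T, other: &T) -> T {
    this.add(other)
}

fn main() {
    println!("{}", add_values(&6u32, &8u32));
    println!("{}", add_values(&String::from("fer"), &String::from("ris")));
}
```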

Limitations of Add

What happens if…

  • We want to add two values of different types?
  • Addition yields a different type?

Making Add Generic

Generalize on the input type O:

trait Add<O> {
    fn add(&self, other: &O) -> Self;
}

impl Add<u16> for u32 {
    fn add(&self, other: &u16) -> Self {
      *self + (*other as u32)
    }
}

We can now add a u16 to a u32.

Defining Output of Add

  • Addition of two given types always yields one specific type of output.
  • Add associated type for addition output.

Declaration

trait Add<O> {
  type Out;
  fn add(&self, other: &O) -> Self::Out;
}

Implementation

impl Add<u16> for u32 {
  type Out = u64;

  fn add(&self, other: &u16) -> Self::Out {
    *self as u64 + (*other as u64)
  }
}

Trait std::ops::Add

The way std does it

pub trait Add<Rhs = Self> {
    type Output;

    fn add(self, rhs: Rhs) -> Self::Output;
}
  • Default type of Self for Rhs

Implementation of std::ops::Add

use std::ops::Add;
pub struct BigNumber(u64);

impl Add for BigNumber {
  type Output = Self;

  fn add(self, rhs: Self) -> Self::Output {
      BigNumber(self.0 + rhs.0)
  }
}

fn main() {
  // Call `Add::add`
  let res = BigNumber(1).add(BigNumber(2));
}

Quiz: What’s the type of res? BigNumber

Implementation std::ops::Add (2)

pub struct BigNumber(u64);

impl std::ops::Add<u32> for BigNumber {
  type Output = u128;

  fn add(self, rhs: u32) -> Self::Output {
      (self.0 as u128) + (rhs as u128)
  }
}

fn main() {
  let res = BigNumber(1) + 3u32;
}

Quiz: What’s the type of res? u128

Type Parameter vs. Associated Type

Use Type Parameter

if trait can be implemented for many combinations of types

impl Add<u32> for u32 {/* */}
impl Add<i64> for u32 {/* */}

Use Associated Type

to define a type internal to the trait, which the implementer chooses:

impl Add<u32> for u32 {
  // Addition of two u32's is always u32
  type Out = u32;
}

Example - Associated Types

trait Distance {
    type Scalar;

    fn distance(&self, a: &Self::Scalar, b: &Self::Scalar) -> bool;
    fn first(&self) -> &Self::Scalar;
    fn last(&self) -> &Self::Scalar;
}

// Implement Distance for some types here ....
impl Distance for ...

// The caller specifying a bound does not need to
// specify the `Scalar`.
fn distance<T: Distance>(container: &T) -> bool {
    container.distance(container.first(), container.last())
}

If we had chosen a type parameter (trait Distance<Scalar>) instead, the type Scalar would need to be provided at every usage.
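
A possible implementation of the Distance trait, as a sketch. The container Samples, its tolerance field and the meaning "first and last lie within the tolerance" are assumptions made up for this illustration.

```rust
trait Distance {
    type Scalar;

    fn distance(&self, a: &Self::Scalar, b: &Self::Scalar) -> bool;
    fn first(&self) -> &Self::Scalar;
    fn last(&self) -> &Self::Scalar;
}

// Hypothetical container of measurements.
struct Samples {
    values: Vec<f64>,
    tolerance: f64,
}

impl Distance for Samples {
    // The implementer picks the associated type exactly once.
    type Scalar = f64;

    fn distance(&self, a: &f64, b: &f64) -> bool {
        (a - b).abs() <= self.tolerance
    }
    // Both accessors panic on an empty container (kept simple for the sketch).
    fn first(&self) -> &f64 {
        &self.values[0]
    }
    fn last(&self) -> &f64 {
        &self.values[self.values.len() - 1]
    }
}

// The bound does not mention `Scalar` at all.
fn distance<T: Distance>(container: &T) -> bool {
    container.distance(container.first(), container.last())
}

fn main() {
    let s = Samples { values: vec![1.0, 5.0, 1.5], tolerance: 1.0 };
    println!("{}", distance(&s));
}
```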

Derive a Trait

#[derive(Clone, Debug)]
struct Dolly {
  num_legs: u32,
}

fn main() {
  let dolly = Dolly { num_legs: 4 };
  let second_dolly = dolly.clone();

  println!("Dolly: {:?}", second_dolly)
}
  • Some traits are trivial to implement.
  • Use #[derive(...)] to quickly implement a trait.
  • For Clone: derived impl calls clone on each field.
  • Debug: provide a debug implementation for string formatting.

Culprits with #[derive(Clone)]

use std::sync::Arc;

struct NonClone{}

#[derive(Clone)]
struct A<T> { a: Arc<T> }

fn main() {
  let a = A { a: Arc::new(NonClone{})};
  let b = a.clone();
}

Question: Does that compile?

Answer: No but it should.

  • #[derive(Clone)] de-sugars into impl<T> Clone for A<T> where T: Clone which adds a wrong and unnecessary bound T: Clone (maybe that changes in the future).
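
One way around the extra bound is to write the Clone implementation by hand, as sketched here. Cloning A<T> then only clones the Arc, so T itself does not need to be Clone.

```rust
use std::sync::Arc;

struct NonClone;

struct A<T> {
    a: Arc<T>,
}

// Manual impl without the `T: Clone` bound that the derive would add.
impl<T> Clone for A<T> {
    fn clone(&self) -> Self {
        A { a: Arc::clone(&self.a) }
    }
}

fn main() {
    let a = A { a: Arc::new(NonClone) };
    let b = a.clone();

    // Both handles point to the same `NonClone` value.
    println!("Owners: {}", Arc::strong_count(&a.a));
    println!("Same allocation: {}", Arc::ptr_eq(&a.a, &b.a));
}
```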

Orphan Rule

Coherence: There must be at most one implementation of a trait for any given type

Rule

A trait can be implemented for a type iff:

  • Either your crate (library) defines the trait
  • or your crate (library) defines the type
  • or both.


  • You cannot implement a foreign trait for a foreign type.
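
A common way to live with the orphan rule is the newtype pattern: wrap the foreign type in a local tuple struct and implement the foreign trait for the wrapper. A sketch, where the wrapper name Names is an assumption:

```rust
use std::fmt;

// `Vec<String>` and `Display` are both foreign, so
// `impl fmt::Display for Vec<String>` would be rejected.
// The local wrapper type makes the impl legal.
struct Names(Vec<String>);

impl fmt::Display for Names {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let names = Names(vec![String::from("Ferris"), String::from("Corro")]);
    println!("{names}");
}
```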

Compiling Generic Functions

impl Add for i32 {/* ... */}
impl Add for f32 {/* ... */}

fn add_values<T: Add>(a: &T, b: &T) -> T
{
  a.add(b)
}

fn main() {
  let sum_one = add_values(&6, &8);
  let sum_two = add_values(&6.5, &7.5);
}

Code is monomorphized:

  • Two versions of add_values end up in binary.
  • Optimized separately and very fast to run (static dispatch).
  • Slow to compile and larger binary.

Exercise Time (8)

Approx. Time: 40-50 min.

Do the following exercises:

  • generics: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Common Traits from std

Operator Overloading

std::ops::Add<T> et al.

  • Shared behavior
use std::ops::Add;
pub struct BigNumber(u64);

impl Add for BigNumber {
  type Output = Self;
  fn add(self, rhs: Self) -> Self::Output {
      BigNumber(self.0 + rhs.0)
  }
}
fn main() {
  // Now we can use `+` to add `BigNumber`s!
  let res: BigNumber = BigNumber(1) + BigNumber(2);
}
  • Others: Mul, Div, Sub, ..

Markers

std::marker::Sized

  • Marker traits
/// Types with a constant size known at compile time.
/// [...]
pub trait Sized { }
  • u32 is Sized
  • Slices [T] and str are not Sized
  • Slice references &[T] and &str are Sized

Others:

  • Sync: Types of which references can be shared between threads.
  • Send: Types that can be transferred across thread boundaries.

Default Values

std::default::Default

pub trait Default: Sized {
    fn default() -> Self;
}

// Option 1: derive the trait (`count` then defaults to 0):
// #[derive(Default)]
struct MyCounter { count: u32 }

// Option 2: implement it manually (if you really need to)
impl Default for MyCounter {
  fn default() -> Self {
    MyCounter { count: 1 }
  }
}

fn main() {
  let d = MyCounter::default();
}

Duplication

std::clone::Clone & std::marker::Copy

pub trait Clone: Sized {
    fn clone(&self) -> Self;

    fn clone_from(&mut self, source: &Self) {
      *self = source.clone()
    }
}

pub trait Copy: Clone { } // That's it!
  • Both Copy and Clone can be #[derive]d.
  • Copy is a marker trait.
  • trait A: B == “Implementor of A must also implement B”
  • clone_from has default implementation, can be overridden.

Conversions

Into<T> & From<T>

pub trait From<T>: Sized {
    fn from(value: T) -> Self;
}

pub trait Into<T>: Sized {
    fn into(self) -> T;
}

impl <T, U> Into<U> for T
  where U: From<T>
{
    fn into(self) -> U {
      U::from(self)
    }
}
  • Blanket implementation.
  • Prefer implementing From over Into if the orphan rule allows it.
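
The blanket implementation means that one From impl provides both directions of the call syntax. A sketch with two made-up temperature types:

```rust
// Hypothetical unit types for this illustration.
struct Celsius(f64);
struct Fahrenheit(f64);

impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

fn main() {
    // Explicit call of the `From` impl.
    let boiling = Fahrenheit::from(Celsius(100.0));

    // `Into<Fahrenheit> for Celsius` comes for free via the blanket impl.
    let freezing: Fahrenheit = Celsius(0.0).into();

    println!("{} {}", boiling.0, freezing.0);
}
```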

Reference Conversion

AsRef<T> & AsMut<T>

pub trait AsRef<T: ?Sized>
{
    fn as_ref(&self) -> &T;
}

pub trait AsMut<T: ?Sized>
{
    fn as_mut(&mut self) -> &mut T;
}
  • Provide flexibility to API users.
  • T need not be Sized, e.g. a type can implement AsRef<[u8]> or AsRef<str>, whose targets are unsized.

Reference Conversion (2)

AsRef<T> & AsMut<T>

fn move_into_and_print<T: AsRef<[u8]>>(slice: T) {
  let bytes: &[u8] = slice.as_ref();
  for byte in bytes {
    print!("{:02X}", byte);
  }
}

fn main() {
  let owned_bytes: Vec<u8> = vec![0xDE, 0xAD, 0xBE, 0xEF];
  move_into_and_print(owned_bytes);

  let byte_slice: [u8; 4] = [0xFE, 0xED, 0xC0, 0xDE];
  move_into_and_print(byte_slice);
}

This lets the user of move_into_and_print choose between a stack-local [u8; N] and a heap-allocated Vec<u8>.

Destruction: std::ops::Drop

pub trait Drop {
    fn drop(&mut self);
}
  • Called when owner goes out of scope.

Destruction: std::ops::Drop

struct Inner;
struct Outer { inner: Inner }

impl Drop for Inner {
  fn drop(&mut self) {
    println!("Dropped inner");
  }
}
impl Drop for Outer {
  fn drop(&mut self) {
    println!("Dropped outer");
  }
}

fn main() {
  {
    let a = Outer { inner: Inner };
  } // a.drop() called here.

  // Explicitly calling drop.
  std::mem::drop(Outer { inner: Inner });
}

Output:

Dropped outer
Dropped inner
Dropped outer
Dropped inner
  • The compiler inserts calls to Drop::drop at the end of scope.
  • The drop of the outer value runs before its fields are dropped.
  • The signature &mut self prevents moving self or its fields out in the destructor.
// Implementation of `std::mem::drop`
fn drop<T>(_x: T) {}

Question: Why does std::mem::drop work?
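
The question can be explored with a sketch that counts destructor calls. Guard and my_drop are names invented here; my_drop has the same shape as std::mem::drop. It takes ownership of its argument, and the value goes out of scope when the function returns.

```rust
use std::cell::Cell;

// Increments a shared counter when it is dropped.
struct Guard<'a> {
    dropped: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.dropped.set(self.dropped.get() + 1);
    }
}

// Same shape as `std::mem::drop`: take ownership, do nothing.
fn my_drop<T>(_x: T) {}

fn main() {
    let dropped = Cell::new(0);

    {
        let _g = Guard { dropped: &dropped };
        assert_eq!(dropped.get(), 0); // still alive
    } // `_g` goes out of scope here.
    assert_eq!(dropped.get(), 1);

    let g = Guard { dropped: &dropped };
    my_drop(g); // `g` is moved in and dropped at the end of `my_drop`.
    assert_eq!(dropped.get(), 2);

    println!("drops: {}", dropped.get());
}
```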

More Std-Traits

There is more:

Std-Traits and the Orphan Rule

When you provide a type, always implement (or derive) the basic traits from the standard library if they are appropriate, e.g. (> implies priority):

  • Default : Initialize the type with default values.
  • Debug > Display : Format traits for the debug specifier {:?} or the non-debug specifier {}.
  • Clone: Cloning the type or bitwise copy.
  • PartialEq > PartialOrd (and maybe Eq > Ord).
  • Hash: To store the type in HashMap etc.
  • Copy: Only implement marker-trait Copy if you really need to.

Other traits for later:

  • Send : Auto trait: A value T can safely be sent across thread boundaries.
  • Sync : Auto trait: A value T can safely be shared between threads.
  • Sized : Marker trait to denote that the size of type T is known at compile time.

Exercise Time (9)

Approx. Time: 40-50 min.

Do the following exercises:

  • traits
  • std-traits
  • blanket-implementation
  • drop-with-errors
  • extension-traits: a very good one (moderate to hard)!
  • local-storage-vec: ⚠️ Hardcore exercise (spare for later)

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Lifetime Annotations

What Lifetime?

  • References refer to variables (stack-allocated memory).

  • A variable has a lifetime:

    • Starts at declaration.
    • Ends at drop.
  • The borrow checker prevents dangling references (pointing to deallocated/invalid memory 💣).

Example - Lifetime Scopes

fn main() {
    let r;

    {
        let x = 5;
        r = &x;
    }

    println!("r: {r}");
}

Question: Will this compile?

Example - Lifetime Scopes (2)

Variable r lives for lifetime 'a and x for 'b.

fn main() {
    let r;                // ---------+- 'a
                          //          |
    {                     //          |
        let x = 5;        // -+-- 'b  |
        r = &x;           //  |       |
    }                     // -+       |
                          //          |
    println!("r: {r}");   //          |
}                         // ---------+

Answer: No, r points to x, which is dropped at the end of the inner block (line 7).

Example - Lifetime in Function (1)

Question: Will this compile?

/// Return reference to longest of `a` or `b`.
fn longer(a: &str, b: &str) -> &str {
    if a.len() > b.len() {
        a
    } else {
        b
    }
}

Example - Lifetime in Function (2)

Answer: No. rustc needs to know more about a and b.

/// Return reference to longest
/// of `a` and `b`.
fn longer(a: &str, b: &str) -> &str {
  if a.len() > b.len() {
      a
  } else {
      b
  }
}
error[E0106]: missing lifetime specifier
 --> src/lib.rs:2:32
  |
2 | fn longer(a: &str, b: &str) -> &str {
  |              ----     ----     ^
  | expected named lifetime parameter
  |
  = help: this function's return type contains
    a borrowed value, but the signature does
    not say whether it is borrowed from `a` or `b`
help: consider introducing a named lifetime parameter
2 | fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
  |          ++++     ++          ++          ++

For more information about this error,
try `rustc --explain E0106`.

Lifetime Annotations

Solution: Provide a constraint with a lifetime parameter 'l:

fn longer<'l>(a: &'l str, b: &'l str) -> &'l str {
    if a.len() > b.len() {
        a
    } else {
        b
    }
}

English:

  • Given a lifetime called 'l,
  • longer takes two references a and b
  • that live for >= 'l
  • and returns a reference that lives for 'l.

Annotations do NOT change the lifetime of variables. Their scopes do.
Annotations are constraints to provide information to the borrow checker.
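
How the constraint plays out at a call site, as a sketch. The String values are arbitrary; the commented-out variant is the usual way to run into the error.

```rust
fn longer<'l>(a: &'l str, b: &'l str) -> &'l str {
    if a.len() > b.len() {
        a
    } else {
        b
    }
}

fn main() {
    let a = String::from("a long string");
    let len;
    {
        let b = String::from("short");
        // `'l` can be at most as long as `b` lives.
        let r = longer(&a, &b);
        // Fine: `r` is only used while `b` is still alive.
        len = r.len();
    }
    // Keeping `r` itself past the inner block would be rejected
    // (E0597: `b` does not live long enough), even though `a` is returned.
    println!("{len}");
}
```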

Validating Boundaries

  • Lifetime validation is done within function boundaries (and scopes e.g. {...}).
  • No information of calling context is used.

Question: Why no calling context?

Answer: Because it's only important to know the lifetime relation between input & output - the constraint.

Example - Validating Boundaries

fn main() {
  let x = "frickadel";            // ------------+- 'a
  {                               //                 |
    let y = "short";              // ------+--- 'b   |
    let r: &str = longer(&x, &y); // --+- 'c     |   |  'l := min('a,'b) => 'l := 'b
    println!("longer: {r}")       //   |         |   |
  }                               // --+---------+   |
}                                 // ----------------+

The borrow checker checks that r’s lifetime 'c fulfills 'c <= 'b  ✅.

Lifetime Annotations in Types

If references are used in structs, the struct needs a lifetime annotation:

/// A struct that contains a reference.
pub struct ContainsRef<'r> {
  reference: &'r i64 // `ref` is a keyword and cannot be a field name.
}

English:

  • Given an instance let x: ContainsRef = ...,
    then the constraint lifetime(x.reference) >= lifetime(x) must hold.

Lifetime Elision

Question: “Why haven’t I come across this before?”

Answer: “Because of lifetime elision!”

Lifetime Elision

The Rust compiler has rules for eliding lifetime annotations:

  • Each elided lifetime in input position becomes a distinct lifetime parameter.

    fn print(a: &str, b: &str)
    fn print<'l1, 'l2>(a: &'l1 str, b: &'l2 str)
  • If there is exactly one input lifetime position (elided or annotated), that lifetime is assigned to all elided output lifetimes.

    fn print(a: &str) -> (&str, &str)
    fn print<'l1>(a: &'l1 str) -> (&'l1 str, &'l1 str)
  • If there are multiple input lifetime positions, but one of them is &self or &mut self, the lifetime of self is assigned to all elided output lifetimes.

    fn print(&self, a: &str) -> &str
    fn print<'l1, 'l2>(&'l1 self, a: &'l2 str) -> &'l1 str
  • Otherwise, annotations are needed to satisfy the compiler.

Lifetime Elision Examples

fn print(s: &str);                                      // elided
fn print<'a>(s: &'a str);                               // expanded

fn debug(lvl: usize, s: &str);                          // elided
fn debug<'a>(lvl: usize, s: &'a str);                   // expanded

fn substr(s: &str, until: usize) -> &str;               // elided
fn substr<'a>(s: &'a str, until: usize) -> &'a str;     // expanded

fn get_str() -> &str;                                   // ILLEGAL (why?)

fn frob(s: &str, t: &str) -> &str;                      // ILLEGAL (why?)

fn get_mut(&mut self) -> &mut T;                        // elided
fn get_mut<'a>(&'a mut self) -> &'a mut T;              // expanded
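
The signatures above are declarations only. A runnable sketch with bodies could look as follows; the implementations and the helper `first_of` are assumptions made for illustration, not code from the slides:

```rust
// Elided: the single input lifetime is assigned to the output.
fn substr(s: &str, until: usize) -> &str {
    // `get` returns None if `until` is out of range or not on a char boundary.
    s.get(..until).unwrap_or(s)
}

// The same function with the lifetime written out.
fn substr_explicit<'a>(s: &'a str, until: usize) -> &'a str {
    s.get(..until).unwrap_or(s)
}

// Two reference inputs: the output lifetime cannot be elided,
// so it is tied to `s` explicitly.
fn first_of<'a>(s: &'a str, _t: &str) -> &'a str {
    s
}

fn main() {
    println!("{}", substr("hello world", 5));
    println!("{}", substr_explicit("hello world", 5));
    println!("{}", first_of("left", "right"));
}
```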

Exercise Time

Approx. Time: 15-30 min.

Do the following exercises:

  • lifetimes: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Closures

Closures

fn main() {
    let z = 42;
    let compute = move |x, y| x + y + z; // What's the type of this?

    let res = compute(1, 2);
}
  • Closures or lambda expressions are anonymous (unnamed) functions.
  • They can capture (“close over”) values in their scope.
  • They are first-class values.
  • They implement special traits: Fn, FnMut and FnOnce.
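
To illustrate "first-class values", here is a small sketch that passes closures to a function and returns one from a function. The names `apply_twice` and `make_adder` are hypothetical:

```rust
/// Takes any closure (or function) from i32 to i32 and applies it twice.
fn apply_twice(f: impl Fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}

/// Returns a closure that captures `offset` by value (`move`).
fn make_adder(offset: i32) -> impl Fn(i32) -> i32 {
    move |x| x + offset
}

fn main() {
    let z = 42;
    let add_z = |x| x + z; // captures `z` from the enclosing scope

    println!("{}", apply_twice(add_z, 0));
    println!("{}", apply_twice(make_adder(1), 0));
}
```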

What is a Closure?

The closure |x: i32| x * x can mechanistically be mapped to the following struct:

struct SquareFunc {}

impl SquareFunc {
  fn call(&self, x: i32) -> i32 {
    x * x
  }
}

❗The “struct with fields” is a mental model of how |x: i32| x * x is mechanistically implemented by the compiler.

What is a Fn Closure?

For a closure which captures (“closes-over”) variables from the environment:

let z = 43;
let square_it = |x| x * x + z;  // => Fn(i32) -> i32     (LSP)
                                // compiler opaque type: approx `SquareIt`.
square_it(10);

approx. maps to:

struct SquareIt<'a> {
  z: &'a i32,
}

impl SquareIt<'_> {
  fn call(&self, x: i32) -> i32 {
    x * x + self.z
  }
}
let z = 43;
let square_it = SquareIt{z: &z};
square_it.call(10);

❗The closure by default captures by reference.

What is a FnMut Closure?

A closure with some mutable state:

fn main() {
  let mut total: i32 = 0;

  let mut square_it = |x| {
      // => FnMut(i32) -> i32  (LSP)
      total += x * x;
      x * x
  };

  square_it(10);
  assert_eq!(100, total);
}

approx. maps to:

struct SquareIt<'a> {
  total: &'a mut i32,
}

impl SquareIt<'_> {
  fn call(&mut self, x: i32) -> i32 {
    *self.total += x * x;
    x * x
  }
}
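
The mental model can be checked by running the hand-written struct next to the closure. This is a sketch of the analogy, not what the compiler literally generates:

```rust
/// Hand-written counterpart of the closure: it stores a mutable borrow of `total`.
struct SquareIt<'a> {
    total: &'a mut i32,
}

impl SquareIt<'_> {
    fn call(&mut self, x: i32) -> i32 {
        *self.total += x * x; // write through the captured `&mut i32`
        x * x
    }
}

fn main() {
    let mut total = 0;
    {
        let mut square_it = SquareIt { total: &mut total };
        square_it.call(10);
        square_it.call(2);
    } // the mutable borrow of `total` ends here
    assert_eq!(total, 104);

    // The closure version behaves the same way.
    let mut total2 = 0;
    let mut square_it = |x: i32| {
        total2 += x * x;
        x * x
    };
    square_it(10);
    square_it(2);
    assert_eq!(total2, 104);
}
```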

Capture by Value

Capture by value with move:

fn main() {
  let mut total: i32 = 0; // Why `mut` here?

  let mut square_it = move |x| {
      // => FnMut(i32) -> i32  (LSP)
      total += x * x;
      x * x
  };

  square_it(10);
  assert_eq!(0, total)
}

approx. maps to:

struct SquareIt {
  total: i32
}

impl SquareIt {
  fn call(&mut self, x: i32) -> i32 {
    self.total += x * x;
    x * x
  }
}

Example - Quiz

Does it compile? Does it run without panic?

fn main() {
    let mut total: i32 = 0;

    let mut square_it = |x| { // => FnMut(i32) -> i32
        total += x * x;
        x * x
    };
    total = -1;

    square_it(10);
    assert_eq!(-1, total)
}

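Answer: It does not compile: total is mutably borrowed by square_it, and that borrow is still alive at the assignment total = -1 because square_it is called afterwards.
Move the call square_it(10) before total = -1.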

Closure Traits

Fn, FnMut and FnOnce are traits which describe how a closure can be called. The compiler implements the appropriate ones!

  • Fn: closures that can be
    • called multiple times, even concurrently,
    • called through a shared borrow (&self).
  • FnMut: closures that can be
    • called multiple times, but not concurrently,
    • called through a mutable borrow (&mut self).
  • FnOnce: closures that can be
    • called once, as the call takes ownership of self.

All closures implement at least FnOnce.
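
A short sketch of how the three traits show up as bounds on function parameters. The function names `call_fn`, `call_fn_mut` and `call_fn_once` are made up for this example:

```rust
/// Needs `Fn`: calls the closure twice through a shared borrow.
fn call_fn(f: impl Fn() -> i32) -> i32 {
    f() + f()
}

/// Needs `FnMut`: the closure may mutate its captured state on every call.
fn call_fn_mut(mut f: impl FnMut() -> i32) -> i32 {
    f();
    f()
}

/// Needs only `FnOnce`: the closure is consumed by the single call.
fn call_fn_once(f: impl FnOnce() -> String) -> String {
    f()
}

fn main() {
    let base = 20;
    println!("{}", call_fn(|| base + 1));

    let mut counter = 0;
    println!("{}", call_fn_mut(|| {
        counter += 1;
        counter
    }));

    let s = String::from("foo");
    println!("{}", call_fn_once(move || s)); // `s` is moved out: FnOnce only
}
```

Because every Fn closure is also FnMut and FnOnce, `|| base + 1` could be passed to all three functions, whereas `move || s` is only accepted by `call_fn_once`.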

Quiz - What is it?

let mut s = String::from("foo");
let t     = String::from("bar");

let func = || {
    s += &t;
    s
};

Question: What's the type of func?

Answer: It's only FnOnce() -> String; the compiler deduced that from the closure body and return type: returning s moves it out of the closure.

The mental model is approx. this:

struct Closure<'a> {
    s : String,
    t : &'a String,
}

impl<'a> FnOnce<()> for Closure<'a> {
    type Output = String;
    fn call_once(mut self) -> String {
        self.s += &*self.t;
        self.s
    }
}

Closure and Functional Programming

Useful when working with iterators, Option and Result:

let numbers = vec![1, 3, 4, 10, 29];

let evens: Vec<_> = numbers.into_iter()
                           .filter(|x| x % 2 == 0)
                           .collect();
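
The iterator case is shown above. For Option and Result, a sketch with the standard combinators `map`, `map_err` and `and_then` might look like this (the functions `parse_even` and `double` are hypothetical):

```rust
/// Parses an even integer; closures transform the error and chain a check.
fn parse_even(input: &str) -> Result<i32, String> {
    input
        .trim()
        .parse::<i32>()
        .map_err(|e| format!("not a number: {e}"))
        .and_then(|n| {
            if n % 2 == 0 {
                Ok(n)
            } else {
                Err(format!("{n} is odd"))
            }
        })
}

/// Doubles the value inside an `Option`, if there is one.
fn double(value: Option<i32>) -> Option<i32> {
    value.map(|x| x * 2)
}

fn main() {
    println!("{:?}", parse_even(" 4 "));
    println!("{:?}", parse_even("3"));
    println!("{:?}", double(Some(21)));
    println!("{:?}", double(None));
}
```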

Closures and Functional Programming (2)

Useful when generalizing interfaces, e.g. visitor pattern

struct Graph { nodes: Vec<i32> }

impl Graph {

  fn visit(&self, mut visitor: impl FnMut(i32)) {
    // `FnMut` (not `FnOnce`): the visitor is called once per node.
    for n in &self.nodes {
      visitor(*n) // Call visitor function for each node.
    }
  }

}
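
A runnable usage sketch of the visitor idea. It assumes the visitor is taken as FnMut, since it is called once per node and may want to update captured state:

```rust
struct Graph {
    nodes: Vec<i32>,
}

impl Graph {
    /// Calls `visitor` once per node; `FnMut` lets the visitor update captured state.
    fn visit(&self, mut visitor: impl FnMut(i32)) {
        for n in &self.nodes {
            visitor(*n);
        }
    }
}

fn main() {
    let graph = Graph { nodes: vec![1, 2, 3] };

    let mut sum = 0;
    graph.visit(|n| sum += n); // mutates the captured `sum`
    println!("sum = {sum}");

    graph.visit(|n| println!("node {n}")); // a plain `Fn` closure works too
}
```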

Exercise Time (10)

Approx. Time: 20-60 min.

Do the following exercises:

  • closures: all

Build/Run/Test:

just build <exercise> --bin 01
just run <exercise> --bin 01
just test <exercise> --bin 01
just watch [build|run|test|watch] <exercise> --bin 01

Recap 2

Recap 2 - Closures

fn main() {
    let numbers = vec![1, 2, 5, 9];
    let smaller_than_5 = |x: i32| -> bool { x < 5 };

    let res = filter(&numbers, smaller_than_5);

    print!("Result: {res:?}")
}
  • Question: What is the type of smaller_than_5?
  • Question: How to write filter to be most generic?

Trait Objects & Dynamic Dispatch

Trait… Object?

  • We learned about traits.
  • We learned about generics and monomorphization.

There’s more to this story though…

Question: What was monomorphization again?

Static Dispatch: Monomorphization (recap)

The Add trait.

use std::ops::Add;

// Provided by the standard library:
// impl Add for i32 {/* ... */}
// impl Add for f64 {/* ... */}

fn add_values<T: Add<Output = T> + Copy>(l: &T, r: &T) -> T
{
  *l + *r
}

fn main() {
  let sum_one = add_values(&6, &8);     // T := i32
  let sum_two = add_values(&6.5, &7.5); // T := f64
}

Code is monomorphized:

  • Two versions of add_values end up in binary.
  • Optimized separately and very fast to run (static dispatch).
  • Slow to compile and larger binary.
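
One way to make the two instantiations visible is `std::any::type_name`. The helper `instantiated_with` below is hypothetical, and the exact strings returned by `type_name` are not guaranteed by the standard library, so they are only printed here:

```rust
use std::any::type_name;
use std::ops::Add;

/// Generic over every `T` that can be added to itself and copied.
fn add_values<T: Add<Output = T> + Copy>(l: &T, r: &T) -> T {
    *l + *r
}

/// Hypothetical helper: reports the concrete type the generic was instantiated with.
fn instantiated_with<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let sum_one = add_values(&6, &8); // instantiates add_values::<i32>
    let sum_two = add_values(&6.5, &7.5); // instantiates add_values::<f64>

    println!("{sum_one}: {}", instantiated_with(&sum_one));
    println!("{sum_two}: {}", instantiated_with(&sum_two));
}
```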

Dynamic Dispatch

What if we don’t know the concrete type implementing the trait at compile time?

use std::io::Write;
use std::path::PathBuf;

struct FileLogger { log_file: PathBuf }
impl Write for FileLogger { /* ... */}

struct StdOutLogger;
impl Write for StdOutLogger { /* ... */}

fn log<L: Write>(logger: &mut L, msg: &str) {
  write!(logger, "{}", msg);
}
fn main() {
  let log_file: Option<PathBuf> = // args parsing...

  let mut logger = match log_file {
      Some(log_file) => FileLogger { log_file },
      None => StdOutLogger,
  };

  log(&mut logger, "Hello, world!🦀");
}

Error!

error[E0308]: `match` arms have incompatible types
  --> src/main.rs:19:17
   |
17 |       let mut logger = match log_file {
   |  ______________________-
18 | |         Some(log_file) => FileLogger { log_file },
   | |                           -----------------------
   | |                           this is found to be of
   | |                           type `FileLogger`
19 | |         None => StdOutLogger,
   | |                 ^^^^^^^^^^^^ expected struct `FileLogger`,
   | |                              found struct `StdOutLogger`
20 | |     };
   | |_____- `match` arms have incompatible types

What’s the type of logger?

Heterogeneous Collections

What if we want to create collections of different types implementing the same trait?

trait Render {
  fn paint(&self);
}

struct Circle;
impl Render for Circle {
  fn paint(&self) { /* ... */ }
}

struct Rectangle;
impl Render for Rectangle {
  fn paint(&self) { /* ... */ }
}
fn main() {
  let circle = Circle{};
  let rect = Rectangle{};

  let mut shapes = Vec::new();
  shapes.push(circle);
  shapes.push(rect);

  shapes.iter()
        .for_each(|s| s.paint());
}

Error Again!

   Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
  --> src/main.rs:20:17
   |
20 |     shapes.push(rect);
   |            ---- ^^^^ expected struct `Circle`,
   |                      found struct `Rectangle`
   |            |
   |            arguments to this method are incorrect
   |
note: associated function defined here
  --> /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/alloc/src/vec/mod.rs:1836:12

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` due to previous error

What is the type of shapes?

Dynamically Sized Types (DST)

Rust supports Dynamically Sized Types (DSTs): types without a statically known size or alignment.

On the surface, this is a bit nonsensical: rustc always needs to know the size and alignment to compile code!

  • Sized is a marker trait for types with a known size at compile time.

  • Types in Rust can be Sized or !Sized (unsized → DSTs).

Examples of Sized vs. !Sized

  • Most types are Sized, and automatically marked as such

    • i64
    • String
    • Vec<String>
    • etc.
  • Two major DSTs (!Sized) exposed by the language (note the absence of a reference!):

    • Trait Objects: dyn MyTrait (covered in the next section)
    • Slices: [T], str, and others.
  • DSTs can only be used (e.g. as a local variable) through a reference: &[T], &str, &dyn MyTrait (references are Sized).
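
The difference between references to Sized types and references to DSTs can be observed with `std::mem::size_of`. This sketch relies on how current rustc lays out wide pointers (two machine words); the helper `is_wide_ref` is hypothetical:

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: true if a reference to `T` is a wide pointer
/// (two machine words: address + length or vtable pointer).
fn is_wide_ref<T: ?Sized>() -> bool {
    size_of::<&T>() == 2 * size_of::<usize>()
}

fn main() {
    // References to Sized types are a single machine word.
    println!("&i64 wide? {}", is_wide_ref::<i64>());

    // References to DSTs carry extra metadata.
    println!("&str wide? {}", is_wide_ref::<str>());
    println!("&[u8] wide? {}", is_wide_ref::<[u8]>());
    println!("&dyn Debug wide? {}", is_wide_ref::<dyn std::fmt::Debug>());

    // The size of a DST value is only known at runtime.
    println!("bytes in \"hello\": {}", size_of_val("hello"));
}
```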

Trait Objects dyn Trait

  • Opaque type that implements a set of traits.

    Type Description: dyn MyTrait: !Sized

  • Like slices, trait objects always live behind pointers (&dyn MyTrait, &mut dyn MyTrait, Box<dyn MyTrait>, ...).

  • Concrete underlying types are erased from trait object.

fn main() {
  let log_file: Option<PathBuf> = // ...

  // Create a trait object that implements `Write`
  let logger: &mut dyn Write = match log_file {
    Some(log_file) => &mut FileLogger { log_file },
    None => &mut StdOutLogger,
  };
}

Quiz - Instantiate a Trait?

struct A{}
trait MyTrait { fn show(&self) {} }
impl MyTrait for A {}

fn main() {
  let a: MyTrait = A{};
  let b: dyn MyTrait = A{};
}

Question: Does that compile?

Answer: No! - It’s invalid code.

  • You can’t declare a local variable a of type MyTrait: MyTrait is a trait, not a type.
  • You can’t declare b as dyn MyTrait, because for the type system it’s !Sized → it can’t compute the size of the memory for b on the stack.
  • Also: You can’t pass the value of an unsized type into a function as an argument or return it from a function.

Generics and Sized : How?

  • Given a concrete type you can always say if it’s Sized or !Sized (DST).

  • What about generics?

fn generic_fn<T: Eq>(x: T) -> T { /*..*/ }
  • If T is Sized, all is OK!

  • If T is !Sized, then the definition of generic_fn is incorrect! (why?)

Generics and Sized

  • All generic type parameters are implicitly Sized by default (everywhere: structs, fns, etc.):

    For example:

    fn generic_fn<T: Eq + Sized>(x: T) -> T { // Sized is redundant here.
      //...
    }

    or

    fn generic_fn<T>(x: &T) -> u32
    where
        T: Eq + Sized // Sized is redundant here.
    {
      // ...
    }

Generics and ?Sized

Sometimes we want to opt-out of Sized: use ?Sized:

fn generic_fn<T: Eq + ?Sized>(x: &T) -> u32 { ... }
  • In English: ?Sized means T also allows for dyn. sized types (DST) → e.g. T := str or T := [i32].

  • So x: &str is a reference to a DST which implements Eq. (T := dyn Eq itself is not possible: Eq is not dyn-compatible.)

Generics and ?Sized - Quiz

Does that compile? Why?/Why not?

fn generic_fn<T: Eq + ?Sized>(x: &T) -> u32 { 42 }

fn main() {
  generic_fn("hello world");
}

Answer: generic_fn is called with a &str:

  • match &T <-> &str
  • T := str
  • x: &str which is Sized
  • ✅ Yes it compiles.
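
A runnable variation of this quiz. It uses the bound Debug instead of Eq (an assumption made so that a trait object can be passed as well), and the function name `describe` is hypothetical:

```rust
use std::fmt::Debug;

/// Accepts references to Sized types and to DSTs alike.
fn describe<T: Debug + ?Sized>(x: &T) -> String {
    format!("{x:?}")
}

fn main() {
    let number = 5;
    let slice: &[i32] = &[1, 2];
    let object: &dyn Debug = &1.5;

    println!("{}", describe(&number)); // T := i32 (Sized)
    println!("{}", describe("hi")); // T := str (DST)
    println!("{}", describe(slice)); // T := [i32] (DST)
    println!("{}", describe(object)); // T := dyn Debug (DST)
}
```

Without `?Sized`, only the first call would be accepted.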

Generics and ?Sized - Quiz

Does that compile? Why?/Why not?

// removed the reference ------- v
fn generic_fn<T: Eq + ?Sized>(x: T) -> u32 { 42 }

fn main() {
  generic_fn("hello world");
}

Answer: ❌ No - the declaration of generic_fn is invalid (line 5 is not the problem!):

  • T can potentially be a DST, e.g. str → leads to x: str which is not Sized → compile error.
  • Remember: function parameters go onto the stack!

Generics and ?Sized - Quiz (Tip)

How to print the type T?

fn generic_fn<T: Eq>(x: T) -> u32 {
    42
}

fn main() {
    generic_fn("hello world");
}
fn generic_fn<T: Eq + std::fmt::Display>(x: T) -> u32 {
    println!(
      "x: {} = '{x}'",
      std::any::type_name::<T>());

    42
}

fn main() {
    generic_fn("hello world");
}
x: &str = 'hello world'

Dynamic Dispatch on the Heap (idiomatic)

/// Same code as last slide
fn main() {
  let log_file: Option<PathBuf> = //...

  // Create a trait object on heap that impl. `Write`
  let mut logger: Box<dyn Write> = match log_file {
    Some(log_file) => Box::new(FileLogger{log_file}),
    None => Box::new(StdOutLogger),
  };

  log("Hello, world!🦀", &mut logger);
}

  • 💸 Cost: pointer indirection via vtable (dynamic dispatch) → less performant.
  • 💰 Benefit: no monomorphization (no generics) → smaller binary & shorter compile time!
  • 💻 Memory: logger is a smart-pointer where the data is on the heap and the vtable is static data (dyn. mem. allocation → 🐌, this is fine 99% of the time)

Dynamic Dispatch on the Stack (esoteric)

/// Same code as last slide
fn main() {
  let log_file: Option<PathBuf> = //...

  // Create a trait object that implements `Write`
  let logger: &mut dyn Write = match log_file {
    Some(log_file) => &mut FileLogger{log_file},
    None => &mut StdOutLogger,
  };

  log("Hello World!", logger);
}

  • 💸 Cost: same as before.
  • 💰 Benefit: same as before.
  • 💻 Memory: logger is a wide-pointer which lives only on the stack → 🚀.

Fixing Dynamic Logger

  • Trait objects &dyn Trait, Box<dyn Trait>, … implement Trait!
// L no longer must be `Sized`, so it also accepts trait objects.
fn log<L: Write + ?Sized>(entry: &str, logger: &mut L) {
    write!(logger, "{}", entry);
}

fn main() {
    let log_file: Option<PathBuf> = // ...

    // Create a trait object that implements `Write`
    let logger: &mut dyn Write = match log_file {
        Some(log_file) => &mut FileLogger { log_file },
        None => &mut StdOutLogger,
    };
    log("Hello, world!🦀", logger);
}

And all is well! Live Stack Dyn. Dispatch, Live Heap Dyn. Dispatch.
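
A self-contained sketch of the same idea that runs without the elided argument parsing. It swaps FileLogger and StdOutLogger for two standard library writers (`Vec<u8>` and `Stdout`), and the `to_memory` switch is a hypothetical stand-in for the real configuration:

```rust
use std::io::{self, Write};

// `?Sized` lets `L` be the trait object type `dyn Write`.
fn log<L: Write + ?Sized>(entry: &str, logger: &mut L) -> io::Result<()> {
    writeln!(logger, "{entry}")
}

fn main() {
    // Hypothetical switch standing in for the argument parsing.
    let to_memory = std::env::args().count() > 1;

    let mut buffer: Vec<u8> = Vec::new(); // in-memory writer
    let mut stdout = io::stdout();

    // Both branches coerce to the same trait object type.
    let logger: &mut dyn Write = if to_memory { &mut buffer } else { &mut stdout };

    log("Hello, world!🦀", logger).expect("logging failed");
}
```

Returning the `io::Result` instead of discarding it is a design choice of this sketch; it avoids the unused-result warning that `write!` would otherwise trigger.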

Forcing Dynamic Dispatch

To force API users to use dynamic dispatch, use &mut dyn Write on log:

fn log(entry: &str, logger: &mut dyn Write) {
    write!(logger, "{}", entry);
}

fn main() {
    let log_file: Option<PathBuf> = // ...

    // Create a trait object that implements `Write`
    let logger: &mut dyn Write = match log_file {
        Some(log_file) => &mut FileLogger { log_file },
        None => &mut StdOutLogger,
    };

    log("Hello, world!🦀", logger);
}

Heterogeneous Collection on the Heap

fn main() {
  let mut shapes = Vec::new();


  let circle = Circle;
  shapes.push(circle);

  let rect = Rectangle;
  shapes.push(rect);

  shapes.iter()
        .for_each(|s| s.paint());
}
fn main() {
  let mut shapes: Vec<Box<dyn Render>>
    = Vec::new();

  let circle = Box::new(Circle);
  shapes.push(circle);

  let rect = Box::new(Rectangle);
  shapes.push(rect);

  shapes.iter()
        .for_each(|s| s.paint());
}
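
A runnable sketch of the boxed collection. As an assumption for testability, `paint` returns a String here instead of drawing, and `paint_all` is a hypothetical helper:

```rust
trait Render {
    fn paint(&self) -> String;
}

struct Circle;
impl Render for Circle {
    fn paint(&self) -> String {
        "circle".to_string()
    }
}

struct Rectangle;
impl Render for Rectangle {
    fn paint(&self) -> String {
        "rectangle".to_string()
    }
}

/// Paints every shape via dynamic dispatch and collects the results.
fn paint_all(shapes: &[Box<dyn Render>]) -> Vec<String> {
    shapes.iter().map(|s| s.paint()).collect()
}

fn main() {
    // The type annotation makes each `Box<Circle>` / `Box<Rectangle>`
    // coerce to `Box<dyn Render>`.
    let shapes: Vec<Box<dyn Render>> = vec![Box::new(Circle), Box::new(Rectangle)];

    for line in paint_all(&shapes) {
        println!("{line}");
    }
}
```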

All set!

Heterogeneous Collection on the Stack 🍭

fn main() {
    let shapes: [&dyn Render; 2] = [&Circle {}, &Rectangle {}];
    shapes.iter().for_each(|shape| shape.paint());
}

All set!

Trait Object Limitations

  • Pointer indirection cost.

  • Harder to debug.

  • Type erasure (you need a trait).

  • Not all traits work:

    Traits need to be dyn-compatible

Static Dispatch or Dynamic Dispatch?

When to use what is rarely clear-cut, but broadly:

  • In libraries: use static dispatch for the user to decide if they want to pass

    • a trait object d: &dyn MyTrait for a signature fn lib_func(s: &(impl MyTrait + ?Sized)),
    • or a reference to a concrete type A which implements MyTrait.
  • For binaries: you are writing final code → use dynamic dispatch (no generics) → cleaner and faster-compiling code with only marginal performance cost.

Dyn-Compatible Trait

A trait is dyn-compatible (formerly object safe) when it fulfills:

  • Trait T must not be Sized: Why?
  • If trait T: Y, then Y must be dyn-compatible.
  • No associated constants allowed.
  • No associated types with generics allowed.
  • All associated functions must either be dispatchable from a trait object, or explicitly non-dispatchable:
    • e.g. function must have a receiver with a reference to Self

Details in The Rust Reference. Read them!

These seem to be compiler limitations.

Non Dyn-Compatible Trait (😱)

trait Fruit {
  fn create(&self) -> Self;
  fn show(&self) -> String;
}

struct Banana { color: i32 }

impl Fruit for Banana {
  fn create(&self) -> Self { Banana { color: self.color } }

  fn show(&self) -> String {
      return format!("banana: color {}", self.color);
  }
}

fn main() {
    let obj: Box<dyn Fruit> = Box::new(Banana { color: 10 });
    println!("type: {}", obj.show())
}

Non Dyn-Compatible Trait (💩)

error[E0038]: the trait `Fruit` cannot be made into an object

18 |     println!("type: {}", obj.show())
   |                          ^^^^^^^^^^ `Fruit` cannot be made into an object

note: for a trait to be "dyn-compatible" it needs to
      allow building a vtable to allow the call to be
      resolvable dynamically; for more information
      visit <https://doc.rust-lang.org/beta/reference/items/traits.html#dyn-compatibility>

1  | trait Fruit {
   |       ----- this trait cannot be made into an object...
2  |   fn create(&self) -> Self;
   |                       ^^^^ ...because method `create` references
                                the `Self` type in its return type
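
One possible way to make this trait dyn-compatible, sketched under the assumption that `create` is never needed through a trait object: mark it `where Self: Sized`, which makes the method explicitly non-dispatchable:

```rust
trait Fruit {
    // Explicitly non-dispatchable: excluded from the vtable,
    // only callable on concrete (Sized) types.
    fn create(&self) -> Self
    where
        Self: Sized;

    fn show(&self) -> String;
}

struct Banana {
    color: i32,
}

impl Fruit for Banana {
    fn create(&self) -> Self {
        Banana { color: self.color }
    }

    fn show(&self) -> String {
        format!("banana: color {}", self.color)
    }
}

fn main() {
    let obj: Box<dyn Fruit> = Box::new(Banana { color: 10 });
    println!("type: {}", obj.show());
    // obj.create() would not compile: `create` requires `Self: Sized`.
}
```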

Trait Object Summary

  • Trait objects allow for dynamic dispatch and heterogeneous containers.
  • Trait objects introduce pointer indirection.
  • Traits need to be dyn-compatible to make trait objects out of them.