1. Intro

1.1 Installation

  • Use rustc <rust-source-file> to compile a Rust program.

  • Install RLS components with rustup update && rustup component add rls rust-analysis rust-src.

  • We can build a project with cargo build, or check it with cargo check (which verifies that the program compiles but does not produce an executable).

  • We can build and run a project in one step using cargo run.

  • Use cargo doc --open to build documentation for the project and all its dependencies locally and open it in a browser.

3. Common Programming Concepts

3.1 Variables and Mutability

Variables are immutable by default in Rust; add the mut keyword to make one mutable. A new variable declared with let can shadow an earlier variable of the same name, even a mutable one. Constants (const) must have values that can be computed at compile time, whereas an immutable variable is merely immutable: it can be initialized from the result of a runtime function call.

const DEFAULT_VALUE: u32 = 100_000;


fn main() {
    let mut x = DEFAULT_VALUE;
    println!("x = {}", x);  // x = 100000
    x = 5;
    println!("x = {}", x);  // x = 5
    let x = 2;
    println!("x = {}", x);  // x = 2
}

3.2 Data Types

In the case of ambiguity, type annotation is required:

// Won't compile:
let guess = "42".parse().expect("Not a number!");

// error[E0282]: type annotations needed
//  --> src/main.rs:2:9
//   |
// 2 |     let guess = "42".parse().expect("Not a number!");
//   |         ^^^^^ consider giving `guess` a type

// Correction:
let guess: u32 = "42".parse().expect("Not a number!");

Rust has a built-in tuple type, which groups a fixed number of values of possibly different types:

let x: (i32, f64, u8) = (500, 6.4, 1);
let (val1, val2, val3) = x;
println!("x = ({}, {}, {})", val1, val2, val3);  // x = (500, 6.4, 1)
println!("x.0 = {}", x.0);  // x.0 = 500

Rust arrays have a fixed length. An array stored in a local variable is allocated on the stack rather than on the heap.

let a: [i32; 5] = [1, 2, 3, 4, 5];
let months = ["January", "February", "March", "April", "May", "June", "July",
              "August", "September", "October", "November", "December"];

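An out-of-bounds array index can be caught at compile time or at run time, depending on how the array is indexed. In the output below, a literal index is rejected at compile time, while an index held in a variable panics at run time.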

let a = [1, 2, 3, 4, 5];
let ele = a[10];
println!("Value of ele is {}.", ele);
// $ cargo build
// error: index out of bounds: the len is 5 but the index is 10
//  --> src/main.rs:3:15
//   |
// 3 |     let ele = a[10];
//   |               ^^^^^
//   |
//   = note: `#[deny(const_err)]` on by default

let a = [1, 2, 3, 4, 5];
let idx = 10;
let ele = a[idx];
println!("Value of ele is {}.", ele);
// $ cargo build
// Compiling variables v0.1.0 (/Users/shuyangsun/Developer/rust_notes/projects/variables)
//  Finished dev [unoptimized + debuginfo] target(s) in 0.24s
// $ cargo run
// thread 'main' panicked at 'index out of bounds: the len is 5 but the index is 10', src/main.rs:4:15
// note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
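
When an index is only known at run time, the panic shown above can be avoided with the slice method get, which returns an Option instead of panicking. The following is a minimal sketch; the helper name element_or is hypothetical and not part of the original notes.

```rust
/// Returns the element at `idx`, or `fallback` when `idx` is out of bounds.
/// `element_or` is a hypothetical helper used only for illustration.
fn element_or(values: &[i32], idx: usize, fallback: i32) -> i32 {
    // `get` returns `Option<&i32>` and never panics.
    values.get(idx).copied().unwrap_or(fallback)
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    println!("{}", element_or(&a, 2, -1));  // 3
    println!("{}", element_or(&a, 10, -1)); // -1
}
```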

3.3 Functions

You must declare the type of each parameter in a function’s signature.

Statements are instructions that perform some action and do not return a value. Expressions evaluate to a resulting value. Let’s look at some examples. Calling a function is an expression. Calling a macro is an expression. The block that we use to create new scopes, {}, is an expression.

let x = (let y = 6);  // Won't compile, because let y = 6 is a statement not an expression.

let y = {
    let x = 3;
    x + 1  // Note there is no semicolon here, adding semicolon would change this expression to a statement.
};
println!("The value of y is {}.", y);  // The value of y is 4.
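
The same rule applies to function bodies: the final expression, written without a semicolon, becomes the return value. A small sketch, where plus_one is an illustrative name:

```rust
// The final expression of the body (no semicolon) is the return value.
fn plus_one(x: i32) -> i32 {
    x + 1
}

fn main() {
    let y = plus_one(5);
    println!("The value of y is {}.", y); // The value of y is 6.
}
```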

3.5 Control Flow

if is an expression in Rust, so we can write code like this:

let condition = true;
let number = if condition {
  5
} else {
  6
};

loop executes its body until a break occurs or the program terminates. A loop can also produce a value, which is placed after the break keyword:

let mut counter = 0;

let result = loop {
    counter += 1;

    if counter == 10 {
        break counter * 2;
    }
};

println!("The result is {}", result);

while loops are a good fit for repeating until a condition becomes false, and for loops are a good fit for iterating over a collection or range:

let a = [10, 20, 30, 40, 50];

for element in a.iter() {
    println!("the value is: {}", element);
}

for number in (1..4).rev() {
    println!("{}!", number);
}
println!("LIFTOFF!!!");
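
The examples above only cover for. As a rough sketch of the while case, the hypothetical function halvings below repeats a step until its condition becomes false:

```rust
/// Counts how many times `n` can be halved before it reaches zero.
/// `halvings` is a hypothetical name used only for illustration.
fn halvings(mut n: u32) -> u32 {
    let mut steps = 0;
    // `while` re-checks the condition before every iteration.
    while n != 0 {
        n /= 2;
        steps += 1;
    }
    steps
}

fn main() {
    println!("{}", halvings(10)); // 4  (10 -> 5 -> 2 -> 1 -> 0)
}
```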

4. Understanding Ownership

4.1 What is Ownership?

Ownership manages heap data. Below are the rules of ownership:

  • Each value in Rust has a variable that’s called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.

When a variable goes out of scope, Rust calls a special function for us. This function is called drop.

Rust has a special annotation called the Copy trait that we can place on types like integers that are stored on the stack. If a type has the Copy trait, an older variable is still usable after assignment. Rust won’t let us annotate a type with the Copy trait if the type, or any of its parts, has implemented the Drop trait.

The semantics for passing a value to a function are similar to those for assigning a value to a variable. Passing a variable to a function will move or copy, just as assignment does. Returning values can also transfer ownership.
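
A short sketch of the move and copy behavior described above. The function names take_and_measure, append_world, and double are made up for this illustration.

```rust
// Takes ownership of the String; the caller can no longer use it afterwards.
fn take_and_measure(s: String) -> usize {
    s.len()
} // `s` is dropped here.

// Takes ownership and hands it back through the return value.
fn append_world(mut s: String) -> String {
    s.push_str(", world");
    s
}

// i32 is Copy, so the caller's variable stays usable after the call.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let s1 = String::from("hello");
    let s2 = append_world(s1); // s1 is moved in; ownership comes back as s2.
    // println!("{}", s1);     // Would not compile: s1 was moved.
    println!("{}", s2);        // hello, world

    let n = 21;
    println!("{} {}", double(n), n);      // 42 21
    println!("{}", take_and_measure(s2)); // 12
}
```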

fn main() {
    let mut s = String::from("hello");

    change(&mut s);  // Use mutable reference to change the content of "s".
    read(&s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

fn read(some_string: &String) {
    println!("Content of string is \"{}\".", some_string);
}

Mutable references have one big restriction: you can have only one mutable reference to a particular piece of data in a particular scope.

let mut s = String::from("hello");

let r1 = &mut s;
let r2 = &mut s;

println!("{}, {}", r1, r2);

// error[E0499]: cannot borrow `s` as mutable more than once at a time
//  --> src/main.rs:5:14
//   |
// 4 |     let r1 = &mut s;
//   |              ------ first mutable borrow occurs here
// 5 |     let r2 = &mut s;
//   |              ^^^^^^ second mutable borrow occurs here
// 6 |
// 7 |     println!("{}, {}", r1, r2);
//   |                        -- first borrow later used here

A similar rule exists for combining mutable and immutable references.

let mut s = String::from("hello");

let r1 = &s; // no problem
let r2 = &s; // no problem
let r3 = &mut s; // BIG PROBLEM

println!("{}, {}, and {}", r1, r2, r3);

// error[E0502]: cannot borrow `s` as mutable because it is also borrowed as immutable
//  --> src/main.rs:6:10
//   |
// 4 | let r1 = &s; // no problem
//   |          -- immutable borrow occurs here
// 5 | let r2 = &s; // no problem
// 6 | let r3 = &mut s; // BIG PROBLEM
//   |          ^^^^^^ mutable borrow occurs here
// 7 |
// 8 | println!("{}, {}, and {}", r1, r2, r3);
//   |                            -- immutable borrow later used here

Note that a reference’s scope starts from where it is introduced and continues through the last time that reference is used.

let mut s = String::from("hello");

let r1 = &s; // no problem
let r2 = &s; // no problem
println!("{} and {}", r1, r2);
// r1 and r2 are no longer used after this point

let r3 = &mut s; // no problem
println!("{}", r3);

String literals are string slices: their type is &str.
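
Because a &String coerces to &str, a function that accepts &str works with both literals and owned strings. A minimal sketch, assuming a hypothetical helper first_word:

```rust
/// Returns the first whitespace-delimited word as a slice of the input.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello world");
    let literal: &str = "hello world"; // A literal already has type &str.

    // &String coerces to &str, so one function serves both.
    println!("{}", first_word(&owned));  // hello
    println!("{}", first_word(literal)); // hello
    println!("{}", &owned[6..]);         // world
}
```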

5.1 Defining and Instantiating Structs

Using the field init shorthand when variables and fields have the same name:

struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}

fn build_user(email: String, username: String) -> User {
    User {
        email,
        username,
        active: true,
        sign_in_count: 1,
    }
}

Creating instances from other instances with struct update syntax (the remaining fields are taken from user1):

let user1 = User {
    email: String::from("someone@example.com"),
    username: String::from("someusername123"),
    active: true,
    sign_in_count: 1,
};

let user2 = User {
    email: String::from("another@example.com"),
    username: String::from("anotherusername567"),
    ..user1
};

5.2 An Example Program Using Structs

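The println! macro can do many kinds of formatting. By default, the curly brackets tell println! to use the formatting known as Display: output intended for direct end-user consumption. Rectangle does not implement Display, so {} fails to compile; deriving Debug makes {:?} and {:#?} (pretty-print) available instead.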

#[derive(Debug)]
struct Rectangle {
    width: f64,
    height: f64,
}

let rect = Rectangle{ width: 30.0, height: 50.0 };

println!("{}", rect);
// --> src/main.rs:27:53
//    |
// 27 |     println!("The area of the rectangle {} is {}.", rect1, rect1.area());
//    |                                                     ^^^^^ `Rectangle` cannot be formatted with the default formatter
//    |
//    = help: the trait `std::fmt::Display` is not implemented for `Rectangle`
//    = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
//    = note: required by `std::fmt::Display::fmt`

println!("{:?}", rect);
// Rectangle { width: 30.0, height: 50.0 }

println!("{:#?}", rect);
// Rectangle {
//     width: 30.0,
//     height: 50.0,
// }
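
If {} output is wanted for a custom type, Display can be implemented by hand. The sketch below assumes an arbitrary "width x height" text format, which is not taken from the original notes.

```rust
use std::fmt;

struct Rectangle {
    width: f64,
    height: f64,
}

// Implementing Display by hand lets `{}` work for Rectangle.
// The "W x H" layout is an arbitrary choice for this sketch.
impl fmt::Display for Rectangle {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} x {}", self.width, self.height)
    }
}

fn main() {
    let rect = Rectangle { width: 30.0, height: 50.0 };
    println!("{}", rect); // 30 x 50
}
```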

6. Enums and Pattern Matching

6.1 Defining an Enum

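Enums in Rust can carry data in their variants, and they can also have methods. Not every variant has to carry data: Quit below has none, while a variant declared with data, such as Write(String), must always be constructed with that data.

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

impl Message {
    fn call(&self) {
        // Method body would be defined here.
    }
}

let m = Message::Write(String::from("hello"));
m.call();

let quit = Message::Quit;  // A variant that carries no data.
quit.call();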
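
Data stored in a variant is usually read back with match, which must cover every variant. A minimal sketch; the describe method is hypothetical and only illustrates the pattern syntax for each kind of variant.

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

impl Message {
    /// Builds a description of the message; `describe` is a hypothetical method.
    fn describe(&self) -> String {
        // `match` must cover every variant, and each arm binds the variant's data.
        match self {
            Message::Quit => String::from("quit"),
            Message::Move { x, y } => format!("move to ({}, {})", x, y),
            Message::Write(text) => format!("write \"{}\"", text),
            Message::ChangeColor(r, g, b) => format!("color ({}, {}, {})", r, g, b),
        }
    }
}

fn main() {
    let messages = [
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hello")),
        Message::ChangeColor(0, 128, 255),
    ];
    for m in messages.iter() {
        println!("{}", m.describe());
    }
}
```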

7. Managing Growing Projects with Packages, Crates, and Modules

The module system includes:

  • Packages: A Cargo feature that lets you build, test, and share crates
  • Crates: A tree of modules that produces a library or executable
  • Modules and use: Let you control the organization, scope, and privacy of paths
  • Paths: A way of naming an item, such as a struct, function, or module

7.1 Packages and Crates

  • Crate: binary or library.
  • Crate Root: a source file that the Rust compiler starts from and makes up the root module of your crate.
  • Package: one or more crates that provide a set of functionality. A package contains a Cargo.toml file that describes how to build those crates.

A package can contain at most one library crate. It can contain as many binary crates as you’d like, but it must contain at least one crate (either library or binary).

By default, Cargo knows src/main.rs is the crate root of a binary crate, and src/lib.rs is the crate root of a library crate, both of which have the same name as the package. Cargo passes the crate root files to rustc to build the library or binary.

7.4 Bringing Paths Into Scope with the use Keyword

Modules can be re-exported with pub use:

mod module1 {
    pub mod module2 {
        pub fn greeting() { println!("Hello!"); }
    }
}

pub use module1::module2;  // re-exporting module2

// Inside main.rs:
use crate_name::module2;

fn main() {
    module2::greeting();
}

Use nested paths to clean up use lists:

use std::io;
use std::io::Write;

// Equivalent:
use std::io::{self, Write};

8. Common Collections

8.1 Storing Lists of Values with Vectors

Use enums to store multiple types in a vector:

enum CSVCell {
    Int32(i32),
    Float64(f64),
    Text(String)
}

type CSVCellVec = Vec<CSVCell>;

let v: CSVCellVec = vec![
    CSVCell::Int32(12),
    CSVCell::Float64(2020.0301),
    CSVCell::Text(String::from("Hello, world!")),
];
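
Reading the values back requires matching on the variant. The sketch below assumes a hypothetical helper render_row that joins the cells of one row with commas.

```rust
enum CSVCell {
    Int32(i32),
    Float64(f64),
    Text(String),
}

/// Renders one row as comma-separated text; `render_row` is a hypothetical helper.
fn render_row(row: &[CSVCell]) -> String {
    let cells: Vec<String> = row
        .iter()
        .map(|cell| match cell {
            // Each arm extracts the payload of one variant.
            CSVCell::Int32(i) => i.to_string(),
            CSVCell::Float64(f) => f.to_string(),
            CSVCell::Text(s) => s.clone(),
        })
        .collect();
    cells.join(",")
}

fn main() {
    let row = vec![
        CSVCell::Int32(12),
        CSVCell::Float64(2.5),
        CSVCell::Text(String::from("Hello")),
    ];
    println!("{}", render_row(&row)); // 12,2.5,Hello
}
```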

8.2 Storing UTF-8 Encoded Text with Strings

A Rust String is UTF-8 encoded, and a char is a single Unicode scalar value (4 bytes in memory). Strings cannot be indexed directly (see https://doc.rust-lang.org/book/ch08-02-strings.html#bytes-and-scalar-values-and-grapheme-clusters-oh-my); instead, use string slices or the chars iterator:

let chinese = String::from("这个字符串是UTF-8编码。");
println!("{}", chinese.len());  // Output: 32

let first_char = &chinese[0];
// error[E0277]: the type `std::string::String` cannot be indexed by `{integer}`
//   --> src/main.rs:35:23
//    |
// 35 |     let first_char = &chinese[0];
//    |                       ^^^^^^^^^^ `std::string::String` cannot be indexed by `{integer}`
//    |
//    = help: the trait `std::ops::Index<{integer}>` is not implemented for `std::string::String`

let first_char = &chinese[0..1];
// Runtime error: thread 'main' panicked at 'byte index 1 is not a char boundary;
// it is inside '这' (bytes 0..3) of `这个字符串是UTF-8编码。`'

let first_char = &chinese[0..3];
println!("First character is \"{}\"", first_char);  // Output: First character is "这"

for ch in chinese.chars() {
    print!("{} ", ch.len_utf8());
}
// Output: 3 3 3 3 3 3 1 1 1 1 1 3 3 3
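
To slice without guessing byte offsets, char_indices reports the byte position at which each character starts. A minimal sketch, where first_chars is a hypothetical helper:

```rust
/// Returns a slice holding the first `n` characters of `s`
/// (or all of `s` when it has fewer). `first_chars` is a hypothetical helper.
fn first_chars(s: &str, n: usize) -> &str {
    // `char_indices` yields the byte offset at which each char starts,
    // so the offset of char number `n` is always a valid slice boundary.
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}

fn main() {
    let chinese = String::from("这个字符串是UTF-8编码。");
    println!("{}", first_chars(&chinese, 1)); // 这
    println!("{}", chinese.chars().count());  // 14
}
```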

8.3 Storing Keys with Associated Values in Hash Maps

For types that implement the Copy trait, the values are copied into the hash map. For owned values, the values will be moved and the hash map will be the owner of those values. If we insert references to values into the hash map, the values won’t be moved into the hash map. The values that the references point to must be valid for at least as long as the hash map is valid.

use std::collections::HashMap;

let mut scores = HashMap::new();

// insert sets the value whether or not the key is already present (overwriting any old value):
scores.insert(String::from("Blue"), 10);

// entry + or_insert sets the value only if the key is not present yet:
scores.entry(String::from("Yellow")).or_insert(50);  // Inserts Yellow => 50.
scores.entry(String::from("Blue")).or_insert(50);    // Blue stays at 10.
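
Since or_insert returns a mutable reference to the value, it can also be used to update a value based on the old one. A small sketch with a hypothetical word_counts function:

```rust
use std::collections::HashMap;

/// Counts how often each whitespace-separated word occurs.
/// `word_counts` is a hypothetical helper used only for illustration.
fn word_counts(text: &str) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `or_insert` returns `&mut u32`: the existing value, or the freshly inserted 0.
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("hello world wonderful world");
    println!("{:?}", counts.get("world")); // Some(2)
}
```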

9. Error Handling

9.2 Recoverable Errors with Result

Use a match expression to handle the Result, then a nested match on the error kind:

use std::fs::File;
use std::io::ErrorKind;

let f = File::open("hello.txt");

let f = match f {
    Ok(file) => file,
    Err(error) => match error.kind() {
        ErrorKind::NotFound => match File::create("hello.txt") {
            Ok(fc) => fc,
            Err(e) => panic!("Problem creating the file: {:?}", e),
        },
        other_error => panic!("Problem opening the file: {:?}", other_error),
    },
};

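Use unwrap_or_else to handle errors with a closure:

use std::fs::OpenOptions;

// file_path is assumed to be defined elsewhere.
let file = OpenOptions::new()
        .read(true)
        .write(true)  // create(true) only takes effect together with write or append access.
        .create(true)
        .open(&file_path)
        .unwrap_or_else(|error| match error.kind() {
            // Arms for specific ErrorKind values can be added here; everything else panics.
            _ => panic!("Cannot open or create file: {:?}", error),
        });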

// Shortcuts for panic on error:
let file: File = File::open("hello.txt").unwrap();
let file: File = File::open("hello.txt").expect("Cannot open hello.txt.");

A Shortcut for Propagating Errors: the ? Operator

use std::io;
use std::io::Read;
use std::fs::File;

fn read_username_from_file() -> Result<String, io::Error> {
    let mut f = File::open("hello.txt")?;
    let mut s = String::new();
    f.read_to_string(&mut s)?;
    Ok(s)
}

10. Generic Types, Traits, and Lifetimes

10.1 Generic Data Types

use std::mem;

struct Point<T> {
    x: T,
    y: T
}

impl<T> Point<T> {
    fn transpose(&mut self) {
        mem::swap(&mut self.x, &mut self.y);
    }
}

// Only Point<f64> has method "norm":
impl Point<f64> {
    fn norm(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
}

// Only Point using types that implement trait "Clone" has method "get_x_y":
impl<T: Clone> Point<T> {
    fn get_x_y(&self) -> (T, T) {
        (self.x.clone(), self.y.clone())
    }
}

10.2 Traits: Defining Shared Behavior

use std::ops;

pub trait Coordinate: Clone + ops::Neg<Output = Self> {}
pub trait Distance<T>: Clone + ops::Add<T, Output = T> + ops::Sub<T, Output = T> {
    fn half(&self) -> Self;
}

// Every type that implements Clone and Neg automatically implements Coordinate.
impl<T: Clone + ops::Neg<Output = T>> Coordinate for T {}

// Implement "half" for every type that satisfies Distance's bounds and can be divided by an f64.
impl<
        T,
        U: Clone + ops::Add<T, Output = T> + ops::Sub<T, Output = T> + ops::Div<f64, Output = U>,
    > Distance<T> for U
{
    fn half(&self) -> Self {
        self.clone() / 2.0
    }
}

pub struct Point<T: Coordinate> {
    x: T,
    y: T,
}

pub struct Rectangle<T: Coordinate, U: Distance<T>> {
    bottom_left: Point<T>,
    width: U,
    height: U,
}

We can implement a trait on a type only if either the trait or the type is local to our crate. It isn’t possible to call the default implementation from an overriding implementation of that same method.

trait MyTrait {}

// The following three implementations of "convert" are equivalent.
fn convert(val: &impl MyTrait) -> f64 {
    // -- snip --
}

fn convert<T: MyTrait>(val: &T) -> f64 {
    // -- snip --
}

fn convert<T>(val: &T) -> f64 where T: MyTrait {
    // -- snip --
}

A function can return impl Trait, a value of some type that implements a specific trait, but every return path has to produce the same concrete type. Trait objects can solve this problem.

trait MyTrait {}

struct A {}
struct B {}

impl MyTrait for A {}
impl MyTrait for B {}

fn get_a() -> impl MyTrait {
    A {}
}

// Won't compile:
fn one_of(switch: bool) -> impl MyTrait {
    if switch {
        A {}
    } else {
        B {}
    }
}
// error[E0308]: if and else have incompatible types
//   --> src/main.rs:19:9
//    |
// 16 | /     if switch {
// 17 | |         A {}
//    | |         ---- expected because of this
// 18 | |     } else {
// 19 | |         B {}
//    | |         ^^^^ expected struct `A`, found struct `B`
// 20 | |     }
//    | |_____- if and else have incompatible types
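
One way around the error above is to return a trait object. The sketch below boxes both values as Box<dyn MyTrait>; the name method is added here only so the two types can be told apart and is not part of the original trait.

```rust
trait MyTrait {
    // Added for illustration so the returned values can be distinguished.
    fn name(&self) -> &'static str;
}

struct A {}
struct B {}

impl MyTrait for A {
    fn name(&self) -> &'static str {
        "A"
    }
}

impl MyTrait for B {
    fn name(&self) -> &'static str {
        "B"
    }
}

// Both branches coerce to the same type, Box<dyn MyTrait>,
// at the cost of a heap allocation and dynamic dispatch.
fn one_of(switch: bool) -> Box<dyn MyTrait> {
    if switch {
        Box::new(A {})
    } else {
        Box::new(B {})
    }
}

fn main() {
    println!("{}", one_of(true).name());  // A
    println!("{}", one_of(false).name()); // B
}
```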

10.3 Validating References with Lifetimes

Lifetimes syntax:

&i32        // a reference
&'a i32     // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime

Most of the time, lifetimes are implicit and inferred, just like most of the time, types are inferred. We must annotate lifetimes when the lifetimes of references could be related in a few different ways. Rust requires us to annotate the relationships using generic lifetime parameters to ensure the actual references used at runtime will definitely be valid.

The main aim of lifetimes is to prevent dangling references, which cause a program to reference data other than the data it’s intended to reference.

fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
// error[E0106]: missing lifetime specifier
//  --> src/main.rs:4:33
//   |
// 4 | fn longest(x: &str, y: &str) -> &str {
//   |                                 ^ expected lifetime parameter
//   |
//   = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`

You can think of every function in Rust that takes references as arguments and returns references as results as a generic function: it takes lifetimes as generic parameters, and each reference argument or return value carries one of those lifetime parameters. Most of the time, these lifetime generics are simply inferred by the compiler.

Lifetime annotations don’t change how long any of the references live. Just as functions can accept any type when the signature specifies a generic type parameter, functions can accept references with any lifetime by specifying a generic lifetime parameter. Lifetime annotations describe the relationships of the lifetimes of multiple references to each other without affecting the lifetimes.

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The code below compiles and runs fine:

// Block 1
fn main() {
    let str1 = String::from("Hello, world!");           // ---------+-- 'a
    let str2 = String::from("This string is longer.");  // --+-- 'b |
    let res = longest(&str1[..], &str2[..]);            //   |      |
    println!("{}", res);                                //   |      |
                                                        // --+      |
}                                                       // ---------+
// Block 2
fn main() {
    let str1 = "Hello, world!";           // --------------+-- 'static
    let str2 = "This string is longer.";  // --+-- 'static |
    let res = longest(str1, str2);        //   |           |
    println!("{}", res);                  //   |           |
}                                         // --+-----------+
// Block 3
fn main() {
    let str1 = "Hello, world!";               // --------------+-- 'static
    let res;                                  //               |
    {                                         //               |
        let str2 = "This string is longer.";  // --+-- 'static |
        res = longest(str1, str2);            //   |           |
    }                                         //   |           |
    println!("{}", res);                      //   |           |
}                                             // --+-----------+

Note that in block 3 the string that str2 refers to lives long enough even though the variable str2 itself goes out of scope. From the official documentation: string literals have the type &'static str because the reference is always alive; they are baked into the data segment of the final binary.

The code below will give compile errors:

// Block 4
fn main() {
    let y;              // ---------+-- 'a
    {                   //          |
        let x = 5;      //          |
        y = &x;         // --+-- 'b |
    }                   // --+      |
    println!("{}", y);  //          |
}                       // ---------+

// error[E0597]: `x` does not live long enough
//   --> src/main.rs:9:13
//    |
// 9  |         y = &x;
//    |             ^^ borrowed value does not live long enough
// 10 |     }
//    |     - `x` dropped here while still borrowed
// 11 |     println!("{}", y);
//    |                    - borrow later used here

// error: aborting due to previous error
// Block 5
fn main() {
    let str1 = String::from("Hello, world!");               // ---------+-- 'a
    let res;                                                //          |
    {                                                       //          |
        let str2 = String::from("This string is longer.");  // --+-- 'b |
        res = longest(&str1[..], &str2[..]);                //   |      |
    }                                                       // --+      |
    println!("{}", res);                                    //          |
}                                                           // ---------+

// error[E0597]: `str2` does not live long enough
//   --> src/main.rs:10:35
//    |
// 10 |         res = longest(&str1[..], &str2[..]);
//    |                                   ^^^^ borrowed value does not live long enough
// 11 |     }
//    |     - `str2` dropped here while still borrowed
// 12 |     println!("{}", res);
//    |                    --- borrow later used here

// error: aborting due to previous error

“Ultimately, lifetime syntax is about connecting the lifetimes of various parameters and return values of functions. Once they’re connected, Rust has enough information to allow memory-safe operations and disallow operations that would create dangling pointers or otherwise violate memory safety.”

Lifetime Elision

The Rust compiler can sometimes infer the lifetimes of references based on a few common patterns. If the compiler has applied all the rules and there is still ambiguity, it reports an error asking the programmer to specify the lifetimes explicitly.

“Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.”

Three rules of lifetime elision:

  1. Each parameter that is a reference gets its own lifetime parameter.
  2. If there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters.
  3. If there are multiple input lifetime parameters and one of them is &self or &mut self, the lifetime of self is assigned to all output lifetime parameters.
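
As a rough illustration of the rules above, the sketch below pairs an elided signature with the explicit form the compiler infers for it, and shows a method where the third rule applies. The names first_word, first_word_explicit, and Excerpt are hypothetical.

```rust
// Elided form: rule 1 gives `s` its own lifetime, rule 2 assigns it to the output.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

// The explicit signature the compiler infers for the function above.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split(' ').next().unwrap_or("")
}

// A hypothetical struct holding a reference.
struct Excerpt<'a> {
    part: &'a str,
}

impl<'a> Excerpt<'a> {
    // Two input lifetimes (&self and announcement), so rule 2 does not apply;
    // rule 3 gives the returned reference the lifetime of &self.
    fn announce(&self, announcement: &str) -> &str {
        println!("Attention please: {}", announcement);
        self.part
    }
}

fn main() {
    let text = String::from("Call me Ishmael.");
    println!("{}", first_word(&text));          // Call
    println!("{}", first_word_explicit(&text)); // Call

    let excerpt = Excerpt { part: first_word(&text) };
    println!("{}", excerpt.announce("new excerpt")); // Call
}
```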

A reference with the 'static lifetime can live for the entire duration of the program.

11. Writing Automated Tests

“Attributes are metadata about pieces of Rust code”.

// Modules marked with the #[cfg(test)] attribute inside source files contain unit tests.

// src/lib.rs
#[cfg(test)]
mod tests {
    #[test]
    fn test_name() {
        // ...
    }
}

// Test functions in source files that live inside the "tests" directory are integration tests.

// tests/integration_test.rs
#[test]
fn test_name() {
    // ...
}
  • cargo test -- --show-output also shows the stdout of passing tests.
  • cargo test [NAME] runs only the tests whose names contain the string NAME.
  • cargo test -- --ignored runs only the tests marked with the #[ignore] attribute (plain cargo test skips them).
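
A small sketch of how these flags interact with test attributes. The function add_two and the test names are hypothetical; the tests module only runs under cargo test.

```rust
/// Hypothetical function under test.
pub fn add_two(a: i32) -> i32 {
    a + 2
}

#[cfg(test)]
mod tests {
    use super::*;

    // Selected by `cargo test adds`, because the test name contains "adds".
    #[test]
    fn adds_two() {
        assert_eq!(add_two(2), 4);
    }

    // Skipped by plain `cargo test`; runs with `cargo test -- --ignored`.
    #[test]
    #[ignore]
    fn adds_two_near_max() {
        assert_eq!(add_two(i32::MAX - 2), i32::MAX);
    }
}

fn main() {
    println!("{}", add_two(2)); // 4
}
```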

At the time of this writing, support for LLVM code coverage instrumentation is still WIP. Tarpaulin is a good third-party tool for test coverage.

13. Functional Language Features: Iterators and Closures

13.1 Closures: Anonymous Functions that Can Capture Their Environment

Closures don’t require you to annotate the types of the parameters or the return value like fn functions do; these types are usually inferred.

fn  add_one_v1   (x: u32) -> u32 { x + 1 }
let add_one_v2 = |x: u32| -> u32 { x + 1 };
let add_one_v3 = |x|             { x + 1 };
let add_one_v4 = |x|               x + 1  ;

Each closure instance has its own unique anonymous type: that is, even if two closures have the same signature, their types are still considered different.

All closures implement at least one of the traits: Fn, FnMut, or FnOnce. Functions can implement all three of the Fn traits too.
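
Which trait a closure implements depends on how it uses the values it captures. The sketch below shows one closure per trait, plus a plain function passed where an Fn is expected; call_twice, repeat, and consume are hypothetical helpers.

```rust
/// Calls `f` twice; Fn closures only read their captures.
fn call_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

/// Calls `f` `n` times; FnMut closures may mutate their captures.
fn repeat<F: FnMut()>(mut f: F, n: usize) {
    for _ in 0..n {
        f();
    }
}

/// Calls `f` once; FnOnce closures may move captured values out.
fn consume<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let offset = 10;
    let add_offset = |x: i32| x + offset;      // Borrows `offset` immutably: Fn.
    println!("{}", call_twice(add_offset, 1)); // 21

    let mut counter = 0;
    repeat(|| counter += 1, 3);                // Borrows `counter` mutably: FnMut.
    println!("{}", counter);                   // 3

    let greeting = String::from("hello");
    let give = move || greeting;               // Moves `greeting` out: FnOnce.
    println!("{}", consume(give));             // hello

    // A plain function also satisfies the Fn bound.
    fn double(x: i32) -> i32 {
        x * 2
    }
    println!("{}", call_twice(double, 3));     // 12
}
```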

Important Enums

Option

Option<T> is Rust’s replacement for null values: it is an enum that is either None or Some(T).

let optional_string: Option<&str> = Some("Hi");
let none: Option<()> = None;

match statements are a common way to handle enums like Option.

match optional_string {
    Some(str) => println!("{}", str),
    None => eprintln!("No string found!"),
}

Instead of using match statements, Rust provides many convenient methods on Option, such as expect or unwrap_or_else. Note that unwrap and expect panic when the value is None.

let just_unwrap_it: &str = optional_string.unwrap();

let unwrap_or_static_err: &str = optional_string.expect("No string found!");

let unwrap_then_something: &str = optional_string.unwrap_or_else(|| {
    eprintln!("Complicated behaviors can be defined within this block.");
    let pi: f64 = 3.1415926535897;
    eprintln!("PI = {}", pi);
    ""
});
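
When a None should not panic, combinators such as map and unwrap_or, or the if let form, are common alternatives. A minimal sketch with a hypothetical helper length_or_zero:

```rust
/// Returns the length of the string if present, otherwise 0.
/// `length_or_zero` is a hypothetical helper used only for illustration.
fn length_or_zero(value: Option<&str>) -> usize {
    // `map` transforms the inner value; `unwrap_or` supplies a default for None.
    value.map(|s| s.len()).unwrap_or(0)
}

fn main() {
    let optional_string: Option<&str> = Some("Hi");

    // `if let` handles one variant and ignores the other.
    if let Some(s) = optional_string {
        println!("Found \"{}\"", s); // Found "Hi"
    }

    println!("{}", length_or_zero(optional_string)); // 2
    println!("{}", length_or_zero(None));            // 0
}
```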

Result

Result<T, E> is Rust’s way of handling errors. Instead of throwing exceptions, Rust wants to encourage developers to handle errors properly. The value of a Result can either be Ok(T) or Err(E).

let result_ok: Result<f64, &'static str> = Ok(2.71828);
let result_err: Result<f64, &'static str> = Err("Not a number!");

Results are rarely constructed directly like this; they usually come back from another function call. A match statement is then a common way to decide what to do with each outcome.

let number: f64 = match "1.2.3".parse() {
    Ok(val) => val,
    Err(err) => {
        eprintln!("{}", err);
        0.0
    }
};

If the calling function does not want to handle the error or cannot handle the error yet, the function needs to propagate the original error. The ? operator is a shortcut for propagating errors.

fn add_one(number_str: &str) -> Result<f64, std::num::ParseFloatError> {
    let parsed_number = number_str.parse::<f64>()?;
    Ok(parsed_number + 1.0f64)
}
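
A caller can then decide how to handle the propagated error. The sketch below repeats add_one so that it is self-contained and adds a hypothetical helper, add_one_to_all, which relies on the fact that collecting an iterator of Results into a Result stops at the first Err.

```rust
use std::num::ParseFloatError;

fn add_one(number_str: &str) -> Result<f64, ParseFloatError> {
    let parsed_number = number_str.parse::<f64>()?;
    Ok(parsed_number + 1.0)
}

/// Adds one to each input, stopping at the first parse error.
/// `add_one_to_all` is a hypothetical helper used only for illustration.
fn add_one_to_all(inputs: &[&str]) -> Result<Vec<f64>, ParseFloatError> {
    // Collecting into a Result short-circuits on the first Err.
    inputs.iter().map(|s| add_one(s)).collect()
}

fn main() {
    match add_one("1.5") {
        Ok(val) => println!("{}", val), // 2.5
        Err(err) => eprintln!("{}", err),
    }
    println!("{}", add_one("1.2.3").is_err());       // true
    println!("{:?}", add_one_to_all(&["1", "2.5"])); // Ok([2.0, 3.5])
}
```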