Rust

The Rust Programming Language Notes

Installation

Using a distribution independent method (recommended):

$ : "${RUST_PROFILE:=default}"  # Other possibilities: minimal, complete
$ curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh -s -- -y --profile ${RUST_PROFILE}
  • this installs Rust under user's home directory, isolated from distribution installation
    • PATH environment variable is updated to include ~/.cargo/bin
    • to pick up the change, restart the shell or run source ~/.cargo/env
  • some components are installed via rustup:
    rustup component add ...
  • to use binaries installed using rustup component add ..., an additional location needs to be added to PATH:
    $ X=$(rustc --print target-libdir)
    $ PATH=${X%/*}/bin:${PATH} ...

GitHub Actions

Tools

Cargo

Cargo (source code) is a package manager for Rust. A package in the Rust world is called a crate. Using Cargo is the recommended way to create and maintain Rust projects.

Tips, tricks, and hacks:

  • Cargo --offline

  • How do I pin indirect dependencies of a crate?

  • Nine Rust Cargo.toml Wats and Wat Nots

  • To create a new project in Rust, type:

    $ cargo new --name project_name project_slug

    This creates the project_slug directory with Cargo.toml project configuration file and src/ directory containing project sources. The content of Cargo.toml has the following structure:

    [package]
    # The name of the package, e.g. "foo":
    name = "<name>"
    # The version of the package, e.g. "0.1.0":
    version = "<version>"
    # The Rust language edition used to compile this project, e.g. "2021":
    edition = "<edition>"
    
    [dependencies]
    # A dependency is of the format `<crate name> = "<version>"`, where <version>
    # follows the semantic versioning scheme. An example of a dependency is
    # `rand = "0.8.5"`, where "0.8.5" is a shorthand for "^0.8.5" which means any
    # version that is at least "0.8.5" but below "0.9.0". Cargo considers that
    # any of these versions are compatible with "0.8.5".

    When --name is omitted, the name of the project is derived from the name of the project directory (project_slug).

  • To build the project, type:

    $ cargo build

    This builds your project and

    1. puts the resulting executable inside the ./target/debug directory;
    2. creates Cargo.lock, which keeps track of the exact versions of the project dependencies. When you run cargo build for the first time, Cargo resolves the project dependencies and writes the exact versions of the crates to Cargo.lock. On subsequent builds, Cargo reuses the information from Cargo.lock instead of resolving the dependencies again. If a dependency needs to be updated, run cargo update.
  • To build a release of your project, type:

    $ cargo build --release

    This will build your project with enabled optimizations and put the resultant executable inside ./target/release directory.

  • To pre-build your project's dependencies as a container layer:

  • To build and run the project, type:

    $ cargo run
  • To check whether the project compiles, type:

    $ cargo check

    This will produce no executable.

  • To check the project with various options:

  • To build and test with all feature flag combinations:

  • To update the project dependencies, type:

    $ cargo update

    This looks for newer versions of the dependencies that still satisfy the requirements in Cargo.toml and records them in Cargo.lock.

  • To launch the documentation of your project dependencies, type:

    $ cargo doc --open
  • To generate a dependency graph:

  • To generate a flame graph:

  • To visualize/analyze crate's internal structure:

  • To find unused dependencies:

    • using cargo machete
    • using cargo +nightly udeps
    • using RUSTFLAGS=-Wunused-crate-dependencies:
      $ export RUSTFLAGS=-Wunused-crate-dependencies
      $ cargo build
      $ cargo check --all --all-targets
    • see more here
  • To DRY up Cargo.toml manifests:

  • To run your custom command/task, see:

  • To install a crate from its source, type:

    $ cargo install
    • to download and install binary build of crate use cargo-binstall extension:
      $ cargo binstall
  • To clean up unused build files:

  • To prune crate dependencies in target folder:

  • To cleanup ${CARGO_HOME} cache:

  • To watch over your project's source for changes:

Issues

Clippy

clippy (repo) is a tool for linting source code written in Rust.

Tips, tricks, and hacks:

Rust Analyzer

rust-analyzer (home) is an implementation of Language Server Protocol for the Rust programming language.

Rust Code Format Checker

rustfmt formats Rust code using the Rust code style conventions.

  • To check the code style of one file, type:
    $ rustfmt --check file.rs

Note

rustfmt itself works on individual files. To format a whole Cargo project, use cargo fmt, which runs rustfmt on the sources of the crate.

Rust Compiler

rustc compiles a Rust project into its binary representation.

  • To compile a Rust project, type:
    $ rustc main.rs
    All you need is to pass your project's root file to rustc (here main.rs); rustc automatically gathers all the necessary source files, compiles them, and links them together.

Caching

CI & Code Coverage

Examples:

References:

Tools:

Dynamic Linking

GitHub Actions

Miri

Miri is a mid-level intermediate representation interpreter. It can run Rust programs and detect certain classes of undefined behavior.

Nextest

Nextest (documentation) is a next generation test runner for Rust.

Rudra

Rudra is a static analyzer to detect common undefined behaviors in Rust programs.

Yuga

Yuga is an automatic detector of lifetime annotation bugs (article).

Lexical Elements

Grammar:

token:
    keyword
    weak_keyword
    identifier
    char_literal
    byte_literal
    string_literal
    raw_string_literal
    byte_string_literal
    raw_byte_string_literal
    integer_literal
    float_literal
    lifetime_token
    lifetime_or_label
    punctuation
    delimiters
    reserved_token_double_quote
    reserved_token_single_quote
    reserved_token_pound

lifetime_token:
    "'" identifier_or_keyword
    "'" "_"
lifetime_or_label:
    "'" non_keyword_identifier

punctuation:
    "+" | "-" | "*" | "/" | "%" | "^" | "!" | "&" | "|" | "&&" | "||" | "<<"
    ">>" | "+=" | "-=" | "*=" | "/=" | "%=" | "^=" | "&=" | "|=" | "<<="
    ">>=" | "=" | "==" | "!=" | ">" | "<" | ">=" | "<=" | "@" | "_" | "."
    ".." | "..." | "..=" | "," | ";" | ":" | "::" | "->" | "=>" | "#" | "$"
    "?" | "~"
delimiters:
    "(" | ")" | "[" | "]" | "{" | "}"

reserved_token_double_quote:
    ((identifier_or_keyword - ("b" | "r" | "br")) | "_") '"'
reserved_token_single_quote:
    ((identifier_or_keyword - "b") | "_") "'"
reserved_token_pound:
    ((identifier_or_keyword - ("r" | "br")) | "_") "#"

reserved_number:
    bin_literal ("2" | "3" | "4" | "5" | "6" | "7" | "8" | "9")
    oct_literal ("8" | "9")
    (bin_literal | oct_literal | hex_literal)
        "."{not-followed-by ("." | "_" | <XID start Unicode character>)}
    (bin_literal | oct_literal) ("e" | "E")
    "0b" "_"* (<end of input> | !bin_digit)
    "0o" "_"* (<end of input> | !oct_digit)
    "0x" "_"* (<end of input> | !hex_digit)
    dec_literal ("." dec_literal)? ("e" | "E") ("+" | "-")?
        (<end of input> | !dec_digit)

suffix:
    identifier_or_keyword
suffix_no_e:
    suffix - (("e" | "E").*)
isolated_cr:
    <a U+000D not followed by a U+000A>

utf8bom:
    U+FEFF
shebang:
    "#!" (!U+000A)+
  • Rust input is viewed as a sequence of Unicode characters encoded in UTF-8
  • a reserved_number is rejected by the tokenizer instead of being split into separate tokens
  • see Tokens for greater detail

Whitespace

Grammar:

whitespace:
    U+0009  # horizontal tab
    U+000A  # line feed
    U+000B  # vertical tab
    U+000C  # form feed
    U+000D  # carriage return
    U+0020  # space
    U+0085  # next line
    U+200E  # left-to-right mark
    U+200F  # right-to-left mark
    U+2028  # line separator
    U+2029  # paragraph separator
    <any Unicode character that has the Pattern_White_Space property>
  • whitespace characters are ignored
  • see Whitespace for greater detail

Comments

Grammar:

line_comment:
    "//" (!("/" | "!" | U+000A) | "//") (!U+000A)*
    "//"
block_comment:
    "/*" (!("*" | "!") | "**" | block_comment_or_doc)
        (block_comment_or_doc | !"*/")*
        "*/"
    "/**/"
    "/***/"
inner_line_doc:
    "//!" (!(U+000A | isolated_cr))*
inner_block_doc:
    "/*!" (block_comment_or_doc | !("*/" | isolated_cr))* "*/"
outer_line_doc:
    "///" (!"/" (!(U+000A | isolated_cr))*)?
outer_block_doc:
    "/**" (!"*" | block_comment_or_doc)
        (block_comment_or_doc | !("*/" | isolated_cr))*
        "*/"

block_comment_or_doc:
    block_comment
    outer_block_doc
    inner_block_doc
  • comments are ignored by rustc but not by particular tools (cargo doc etc.)

Examples:

// This is a single line comment.

/*
 * This is a block comment.
 */

//! This is inner doc line comment.

/*!
 * This is inner doc block comment.
 */

/// This is outer doc line comment.

/**
 * This is outer doc block comment.
 */

See Comments for greater detail.

Keywords

Grammar:

keyword:
    "as" | "break" | "const" | "continue" | "crate" | "else" | "enum"
    "extern" | "false" | "fn" | "for" | "if" | "impl" | "in" | "let"
    "loop" | "match" | "mod" | "move" | "mut" | "pub" | "ref" | "return"
    "self" | "Self" | "static" | "struct" | "super" | "trait" | "true"
    "type" | "unsafe" | "use" | "where" | "while"
    # 2018+:
    "async" | "await" | "dyn"
    # Reserved:
    "abstract" | "become" | "box" | "do" | "final" | "macro" | "override"
    "priv" | "typeof" | "unsized" | "virtual" | "yield"
    # 2018+:
    "try"
weak_keyword:
    "macro_rules" | "union" | "'static"
    # 2015:
    "dyn"

See Appendix A: Keywords from the book or Keywords for greater detail.

Identifiers

Grammar:

identifier_or_keyword:
    <XID start Unicode character> <XID continue Unicode character>*
    "_" <XID continue Unicode character>+
raw_identifier:
    "r#" (identifier_or_keyword - ("crate" | "self" | "super" | "Self"))
non_keyword_identifier:
    identifier_or_keyword - keyword

identifier:
    non_keyword_identifier
    raw_identifier

See Identifiers for greater detail.
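
As a small illustration of raw_identifier, the sketch below uses the keywords match and in as ordinary names by prefixing them with r#. The function and parameter names are chosen only for this example.

```rust
/// `match` and `in` are keywords, so they need the `r#` prefix
/// to be usable as a function name and a parameter name.
fn r#match(r#in: &str) -> usize {
    r#in.len()
}

fn main() {
    // A raw identifier is referenced with the same `r#` prefix.
    println!("{}", r#match("abc"));
}
```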

Literals

  • any literal may end with suffix which is an identifier or keyword
  • a suffix can annotate a literal with type or it can serve as syntactical sugar in token stream processed during macro expansion

Character Literals

Grammar:

char_literal:
    "'" (
        !("'" | r"\" | U+000A | U+000D | U+0009) |
        quote_escape | ascii_escape | unicode_escape
    ) "'" suffix?

quote_escape:
    r"\'" | r'\"'
ascii_escape:
    r"\x" oct_digit hex_digit
    r"\n" | r"\r" | r"\t" | r"\\" | r"\0"
unicode_escape:
    r"\u{" (hex_digit "_"*){1,6} "}"
  • a character between quotes is any Unicode Scalar Value (U+0000 to U+D7FF and U+E000 to U+10FFFF inclusive) except single quote (U+0027), backslash (U+005C), new line (U+000A), carriage return (U+000D), and tab character (U+0009)
  • the type of character literal is char
  • see Character literals for greater detail

String Literals

Grammar:

string_literal:
    '"' (
        !('"' | r"\" | isolated_cr) |
        quote_escape | ascii_escape | unicode_escape | string_continue
    )* '"' suffix?
string_continue:
    <r"\" followed by U+000A>

raw_string_literal:
    "r" raw_string_content suffix?
raw_string_content:
    '"' (!isolated_cr){non-greedy *} '"'
    "#" raw_string_content "#"

byte_string_literal:
    'b"' (
        ascii_for_string | byte_escape | string_continue
    )* '"' suffix?
ascii_for_string:
    <any ASCII (i.e. '\0' to '\x7f'), except '"', '\\' and isolated_cr>

raw_byte_string_literal:
    "br" raw_byte_string_content suffix?
raw_byte_string_content:
    '"' ascii{non-greedy *} '"'
    "#" raw_byte_string_content "#"
ascii:
    <any ASCII (i.e. '\0' to '\x7f')>
  • a character in a string_literal is any Unicode Scalar Value (U+0000 to U+D7FF and U+E000 to U+10FFFF inclusive) except double quote (U+0022), backslash (U+005C), and sole carriage return (U+000D); U+000D U+000A is translated to U+000A
  • in a string_literal, if (U+000D? U+000A) immediately follows a backslash character, then the backslash character, the (U+000D? U+000A), and any following run of U+0020, U+000A, U+000D, and U+0009 characters are removed from the string_literal
  • a character in a raw_string_literal is any Unicode Scalar Value (U+0000 to U+D7FF and U+E000 to U+10FFFF inclusive) except sole carriage return (U+000D)
  • raw_string_literal and raw_byte_string_literal do not process any escape sequence
  • the type of string literal is &'static str
  • the type of byte string literal of the length n is &'static [u8; n]
  • see String literals, Raw string literals, Byte string literals, and Raw byte string literals for greater detail
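
The rules above can be tried out with a short sketch. The helper function names are made up for this example; the literals themselves follow the grammar in this section.

```rust
/// A string_continue: the backslash, the line break, and the leading
/// whitespace of the next line are all removed from the literal.
fn continued() -> &'static str {
    "foo\
       bar"
}

/// A raw string literal: quotes and backslashes are kept verbatim.
fn raw() -> &'static str {
    r#"a "quoted" \n"#
}

/// A byte string literal of length 3 has the type `&'static [u8; 3]`.
fn bytes() -> &'static [u8; 3] {
    b"a\x62c"
}

fn main() {
    println!("{}", continued());
    println!("{}", raw());
    println!("{:?}", bytes());
}
```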

Integer Literals

Grammar:

byte_literal:
    "b'" (ascii_for_char | byte_escape) "'" suffix?
ascii_for_char:
    <any ASCII (i.e. '\0' to '\x7f') except '\'', '\\', '\n', '\r' or '\t'>
byte_escape:
    r"\x" hex_digit hex_digit
    r"\n" | r"\r" | r"\t" | r"\\" | r"\0" | r"\'" | r'\"'

integer_literal:
    dec_literal suffix_no_e?
    bin_literal suffix_no_e?
    oct_literal suffix_no_e?
    hex_literal suffix_no_e?

dec_literal:
    dec_digit (dec_digit | "_")*
bin_literal:
    "0b" (bin_digit | "_")* bin_digit (bin_digit | "_")*
oct_literal:
    "0o" (oct_digit | "_")* oct_digit (oct_digit | "_")*
hex_literal:
    "0x" (hex_digit | "_")* hex_digit (hex_digit | "_")*

bin_digit:
    "0" | "1"
oct_digit:
    bin_digit | "2" | "3" | "4" | "5" | "6" | "7"
dec_digit:
    oct_digit | "8" | "9"
hex_digit:
    dec_digit
    "a" | "b" | "c" | "d" | "e" | "f"
    "A" | "B" | "C" | "D" | "E" | "F"
  • _ works as a digit separator and is ignored (increases number readability)
  • after macro expansion, suffix should be one of u8, i8, u16, i16, u32, i32, u64, i64, u128, i128, usize or isize
    • byte_literal suffix should be u8
  • if there is no type suffix and the type cannot be inferred from context, i32 is used
  • see Byte literals, Integer literals and Integer literal expressions for greater detail
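
A minimal sketch of the points above: digit separators, type suffixes, a byte literal, and the i32 fallback. The function names are illustrative only.

```rust
/// Literals with `_` separators and explicit type suffixes.
fn literals() -> (i64, u8, u8, u32) {
    let million = 1_000_000i64; // decimal literal with an `i64` suffix
    let mask = 0b1111_0000u8;   // binary literal with a `u8` suffix
    let byte = b'A';            // byte literal, always of type `u8`
    let hex = 0xFF_u32;         // hexadecimal literal with a `u32` suffix
    (million, mask, byte, hex)
}

/// Size in bytes of the type chosen for an unsuffixed literal whose
/// type is not constrained by anything else.
fn default_size() -> usize {
    let x = 42;
    std::mem::size_of_val(&x)
}

fn main() {
    println!("{:?}", literals());
    println!("{}", default_size());
}
```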

Floating Point Literals

Grammar:

float_literal:
    dec_literal "."{not-followed-by ("." | "_" | <XID start Unicode character>)}
    dec_literal "." dec_literal suffix_no_e?
    dec_literal ("." dec_literal)? float_exponent suffix?

float_exponent:
    ("e" | "E") ("+" | "-")? (dec_digit | "_")* dec_digit (dec_digit | "_")*

Data Types

Grammar:

type:
    type_no_bounds
    impl_trait_type
    trait_object_type

type_no_bounds:
    parenthesized_type
    impl_trait_type_one_bound
    trait_object_type_one_bound
    type_path
    tuple_type
    never_type
    raw_pointer_type
    reference_type
    array_type
    slice_type
    inferred_type
    qualified_path_in_type
    bare_function_type
    macro_invocation

parenthesized_type:
    "(" type ")"
  • Rust can infer a variable's data type
    • if more than one data type is possible, the variable must be annotated with a data type

Never Type

Grammar:

never_type:
    "!"
  • !
  • has no values
  • represents the result of computations that never complete
  • can be coerced into any other type
  • can only appear in function return types
  • see Never type for greater detail
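
The coercion rule can be seen in a match expression. The sketch below uses two hypothetical functions, fail and unwrap_or_fail; the None arm has the type ! and is coerced to u32.

```rust
/// A diverging function: it never returns, so its return type is `!`.
fn fail(msg: &str) -> ! {
    panic!("{msg}")
}

/// The `None` arm has the type `!`, which is coerced to `u32` so that
/// both arms of the `match` agree.
fn unwrap_or_fail(value: Option<u32>) -> u32 {
    match value {
        Some(v) => v,
        None => fail("no value"),
    }
}

fn main() {
    println!("{}", unwrap_or_fail(Some(7)));
}
```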

Scalar Types

Boolean Type

  • bool
  • one byte in size
    • the single byte of a bool is guaranteed to be initialized
  • alignment on one byte boundary
  • two possible values: true (bit pattern 0x01) and false (bit pattern 0x00)
    • other values (bit patterns) may result in undefined behavior
  • implements Clone, Copy, Sized, Send, and Sync traits
  • operations table:
    a      b      !b     a | b  a & b  a ^ b  a == b  a > b
    false  false  true   false  false  false  true    false
    false  true   false  true   false  true   false   false
    true   false  true   true   false  true   false   true
    true   true   false  true   true   false  true    false
  • other operations:
    • a != b is defined as !(a == b)
    • a >= b is defined as a == b | a > b
    • a < b is defined as !(a >= b)
    • a <= b is defined as a == b | a < b
  • see Boolean type for greater detail
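
The derived comparisons listed under "other operations" can be checked exhaustively, since there are only four input pairs. This is a sketch with illustrative function names.

```rust
/// Checks the derived comparison identities for one pair of booleans.
fn identities_hold(a: bool, b: bool) -> bool {
    (a != b) == !(a == b)
        && (a >= b) == ((a == b) | (a > b))
        && (a < b) == !(a >= b)
        && (a <= b) == ((a == b) | (a < b))
}

/// Runs the check over all four combinations of `a` and `b`.
fn all_pairs_hold() -> bool {
    [false, true]
        .iter()
        .all(|&a| [false, true].iter().all(|&b| identities_hold(a, b)))
}

fn main() {
    println!("{}", all_pairs_hold());
    // `true` and `false` cast to the bit patterns 0x01 and 0x00.
    println!("{} {}", true as u8, false as u8);
}
```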

Character Type

  • char
  • a value of type char is a Unicode scalar value
    • 32-bit unsigned word
      • has the same size and alignment as u32
    • range 0x0000 to 0xD7FF or 0xE000 to 0x10FFFF
      • creating a char outside of this range is undefined behavior
  • every byte of a char is guaranteed to be initialized
  • a [char] is effectively a UCS-4/UTF-32 string
  • implements Clone, Copy, Sized, Send, and Sync traits
  • see Textual types for greater detail
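
In safe code, the valid range is enforced by checked conversions such as char::from_u32. The sketch below wraps it in an illustrative helper and also compares the layout of char with u32.

```rust
use std::mem::{align_of, size_of};

/// `char` has the same size and alignment as `u32`.
fn char_layout_matches_u32() -> bool {
    size_of::<char>() == size_of::<u32>() && align_of::<char>() == align_of::<u32>()
}

/// Checked conversion: returns `None` for surrogates (0xD800 to 0xDFFF)
/// and for values above 0x10FFFF.
fn to_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

fn main() {
    println!("{}", char_layout_matches_u32());
    println!("{:?} {:?}", to_char(0x41), to_char(0xD800));
}
```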

Integer Types

Signed  Unsigned  Size
i8      u8        8 bits
i16     u16       16 bits
i32     u32       32 bits
i64     u64       64 bits
i128    u128      128 bits
isize   usize     machine specific
  • usize has the same number of bits as the platform's pointer type
    • at least 16 bits wide
  • isize has the same number of bits as the platform's pointer type
    • at least 16 bits wide
    • maximum isize value is the theoretical upper bound on object and array size
  • range (2**x means 1 << x):
    • iN (N bits, signed):
      • minimum: -(2**(N-1))
      • maximum: 2**(N-1) - 1
    • uN (N bits, unsigned):
      • minimum: 0
      • maximum: 2**N - 1
  • for every integer type T
    • the bit validity of T is equivalent to the bit validity of [u8; size_of::<T>()]
  • an uninitialized byte is not a valid u8
  • every integer type implements Clone, Copy, Sized, Send, and Sync traits

See Integer types and Machine-dependent integer types for greater detail.
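
The range formulas above can be compared with the MIN and MAX constants of the standard integer types. The helper functions below are illustrative; they compute in 128-bit arithmetic and are only meant for N up to 64.

```rust
/// Bounds of an N-bit signed integer: -(2**(N-1)) and 2**(N-1) - 1.
/// Computed in `i128`, so this sketch assumes 1 <= n <= 64.
fn signed_bounds(n: u32) -> (i128, i128) {
    (-(1i128 << (n - 1)), (1i128 << (n - 1)) - 1)
}

/// Maximum of an N-bit unsigned integer: 2**N - 1 (assumes n <= 64).
fn unsigned_max(n: u32) -> u128 {
    (1u128 << n) - 1
}

fn main() {
    println!("{:?}", signed_bounds(8));
    println!("{}", unsigned_max(16));
}
```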

Floating Point Types

  • f32 and f64 with 32 bits and 64 bits in size, respectively
  • IEEE 754-2008
  • default is f64
  • for every floating point type T
    • the bit validity of T is equivalent to the bit validity of [u8; size_of::<T>()]
  • every floating point type implements Clone, Copy, Sized, Send, and Sync traits
  • see Floating-point types for greater detail

Compound Types

String Type

  • str
  • dynamically sized type
    • it can only be instantiated through a pointer type, such as &str
  • a value of type str has the same representation as [u8]
  • methods working on str ensure and assume that the data in there is valid UTF-8
    • calling a str method with a non-UTF-8 buffer can cause undefined behavior
  • &str cannot be indexed by an integer (s[0] does not compile)
  • slicing a &str outside of character boundaries makes the program panic
    • a slice operation is performed over bytes, not characters
    • a character boundary is where the last byte of the character ends and the first byte of the next character begins (recall that characters are UTF-8 encoded)
  • implements Send and Sync traits; being dynamically sized, str itself does not implement Sized, Clone, or Copy (a &str reference does)
  • see Textual types and Dynamically Sized Types for greater detail
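
To avoid the panic, a range can be sliced with str::get, which returns None when the range does not fall on character boundaries. A small sketch, with an illustrative helper name:

```rust
/// Non-panicking slicing: `None` if `start..end` is out of bounds or
/// does not lie on UTF-8 character boundaries.
fn slice_checked(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    // 'é' takes two bytes in UTF-8, so "héllo" is 6 bytes long.
    let s = "héllo";
    println!("{}", s.len());
    println!("{:?}", slice_checked(s, 0, 1)); // ends on a boundary
    println!("{:?}", slice_checked(s, 0, 2)); // ends inside 'é'
    println!("{}", s.is_char_boundary(2));
}
```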

Tuple Types

Grammar:

tuple_type:
    "(" ")"
    "(" (type ",")+ type? ")"
  • tuples are finite sequences of values, where two values may have distinct types
  • an item can be either type or expression or identifier
  • an example of assigning a tuple to the variable:
    let tup = ("foo", 1, 0.5);
  • with type annotations:
    let tup: (&str, i32, f64) = ("foo", 1, 0.5);
  • elements can be extracted from a tuple using either dot expression or destructuring assignment
  • dot expression has a form tuple "." index, where index is a non-negative integer literal in decimal notation not exceeding the tuple size minus one (tuple indices are zero-based); example:
    let point = (3, 5);
    
    let x = point.0;
    let y = point.1;
  • destructuring assignment:
    let point = (3, 5);
    
    let (x, y) = point;
  • a tuple type implements Clone, Copy, Sized, Send, and Sync traits if its underlying types implement these
Unit
  • empty tuple
  • both unit type and unit value are written as ()
  • represents an empty value or an empty return type
  • the unit value is produced implicitly by a block or function body that has no final expression
  • implements Clone, Copy, Sized, Send, and Sync traits

See Tuple types and Tuple and tuple indexing expressions for greater detail.

Array Types

Grammar:

array_type:
    "[" type ";" expression "]"
  • arrays are finite sequences of values of the same type
  • unlike in some other programming languages, arrays have a fixed length
    • the size of an array is specified by expression, which must be a constant expression that evaluates to usize
  • an array value is stored inline, without heap allocation (on the stack when it is held in a local variable)
  • all elements of an array are always initialized
  • an example of an array assigned to the variable:
    let arr = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"];
  • with type annotation specified:
    let arr: [u8; 4] = [1, 2, 4, 8];
  • arrays can be initialized by "[" value ";" count "]" expression:
    let arr = [2; 4];  // same as `let arr = [2, 2, 2, 2];`
  • to access the element of an array, use array "[" index "]" expression:
    let arr = ["a", "b"];
    
    let a = arr[0];  // a == "a"
    let b = arr[1];  // b == "b"
  • like tuples, array indices are zero-based
  • indexing an array out of its bounds makes the program panic
  • see Array types and Array and array index expressions for greater detail
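
When an index is not known to be in bounds, the get method gives a non-panicking alternative to the index expression. A minimal sketch, with an illustrative helper name:

```rust
/// Non-panicking element access: `None` when `index` is out of bounds.
fn element(arr: &[u8; 4], index: usize) -> Option<u8> {
    arr.get(index).copied()
}

fn main() {
    let arr: [u8; 4] = [1, 2, 4, 8];
    println!("{:?}", element(&arr, 2));
    println!("{:?}", element(&arr, 4));
    // The length is part of the type, so the size is known statically.
    println!("{}", std::mem::size_of::<[u8; 4]>());
}
```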

Slice Types

Grammar:

slice_type:
    "[" type "]"
  • dynamically sized type
  • represents a view into a sequence of elements of some type
  • generally used through pointer types
  • all elements of a slice are always initialized
  • access to a slice is always bounds-checked in safe methods and operators
  • see Slice types for greater detail

Struct Types

  • a struct type is a heterogeneous product of other types
  • structs have no specified memory layout
    • to specify one, use repr attribute
  • visibility of a struct's fields can be specified
  • a tuple struct type has anonymous fields
  • a unit-like struct type has no fields
  • a struct type can implement Clone, Copy, Sized, Send, and Sync traits if types of its fields implement these
    • a unit-like struct type can implement all of these
  • see Struct types and Visibility and Privacy for greater detail

Enumerated Types

  • nominal, heterogeneous disjoint union types
  • any enum value consumes as much memory as the largest variant of its enum type, plus the size needed to store a discriminant
  • must be denoted by named reference to an enum item
  • an enum type can implement Clone, Copy, Sized, Send, and Sync traits if types of its variants implement these
    • an enum type with no variants or only with unit-like variants can implement all of these
  • see Enumerated types for greater detail
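
The memory rule can be observed with size_of. The Shape enum below is a made-up example. Because the exact layout of an enum with the default representation is unspecified, the sketch only checks a lower bound. It also shows one well-known case, Option<&T>, where the compiler stores the discriminant in an otherwise invalid bit pattern, so no extra space is needed.

```rust
use std::mem::size_of;

// Hypothetical enum used only for this illustration.
#[allow(dead_code)]
enum Shape {
    Point,
    Circle(f64),
    Rect { w: f64, h: f64 },
}

/// Size of the whole enum.
fn shape_size() -> usize {
    size_of::<Shape>()
}

/// Space needed by the fields of the largest variant (`Rect`).
fn largest_payload_size() -> usize {
    2 * size_of::<f64>()
}

/// `Option<&T>` uses the null pointer as the `None` value.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

fn main() {
    println!("{} >= {}", shape_size(), largest_payload_size());
    println!("{}", option_ref_is_pointer_sized());
}
```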

Union Types

  • nominal, heterogeneous C-like union
  • no notion of an active field
    • reading from a union field requires unsafe
      • since data type transmutation between read/write may result in unexpected or undefined behavior
  • only types that never need to be dropped can be used for union fields
  • by default the memory layout of a union is undefined
    • the memory layout can be specified by #[repr(...)]
  • see Union types, Unions, and Representations for greater detail
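
A short sketch of reading a union field that is different from the one written. IntOrFloat and float_bits are names invented for this example. #[repr(C)] is used so that both fields start at offset 0.

```rust
// Hypothetical union used only for this illustration.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

/// Reinterprets the bits of an `f32` as a `u32` through the union.
fn float_bits(x: f32) -> u32 {
    let u = IntOrFloat { f: x };
    // SAFETY: both fields are 4 bytes wide, start at offset 0 because of
    // `repr(C)`, and every bit pattern is a valid `u32`.
    unsafe { u.i }
}

fn main() {
    println!("{:#010x}", float_bits(1.0));
}
```

Writing a field is safe; only the read needs unsafe, because the compiler cannot know which field was written last.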

Function Item Types

  • zero-sized
  • yielded by:
    • a function item
    • a tuple-like struct constructor
    • an enum variant constructor
  • explicitly identifies the function
    • its name
    • its type arguments
    • its early-bound lifetime arguments
  • a function item can be coerced to the function pointer with the same signature
  • implements Fn, FnMut, FnOnce, Copy, Clone, Send, and Sync traits
  • see Function item types, Function pointer types, and Type coercions for greater detail
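
The following sketch shows the zero size of a function item type, the coercion to a function pointer, and a tuple-like struct constructor used as a function. All names (double, Meters, and so on) are illustrative.

```rust
fn double(x: i32) -> i32 {
    x * 2
}

#[derive(Debug, PartialEq)]
struct Meters(u32);

/// Size of a value whose type is the function item type of `double`.
fn item_size() -> usize {
    let f = double;
    std::mem::size_of_val(&f)
}

/// The function item is coerced to a function pointer on return.
fn as_pointer() -> fn(i32) -> i32 {
    double
}

/// The constructor `Meters` is itself a function item.
fn wrap_all(values: [u32; 2]) -> [Meters; 2] {
    values.map(Meters)
}

fn main() {
    println!("{}", item_size());
    println!("{}", as_pointer()(21));
    println!("{:?}", wrap_all([1, 2]));
}
```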

Closure Types

  • a closure type is unique for every closure value produced by a closure expression
    • a closure type is anonymous and cannot be written out
  • how a compiler defines a new closure type:
    • parse and analyze a closure expression
      • record which and how (mutably/immutably) are the closed-over variables used
        • do not take surrounding code into account, that is
          • ignore the lifetimes of involved variables
          • ignore the lifetime of the closure itself
        • are there no closed-over variables?
          • the closure is non-capturing
            • can be coerced to the function pointer type with the same signature
      • does it have the move keyword?
    • no move keyword present
      • try to capture a closed-over variable by immutable borrow first
      • try to capture a closed-over variable by unique immutable borrow, if the previous fails
        • special case which occurs when modifying the referent of a mutable reference
        • cannot be used anywhere else in the language
        • cannot be written out explicitly
      • try to capture a closed-over variable by mutable borrow, if the previous fails
      • finally, try to capture a closed-over variable by move, if all of the previous capturing attempts had failed
        • compiler usually complains about missing move keyword
      • note that the decision on which capture mode has to be chosen is made on how the captured variable is used inside the closure body
    • the move keyword is present
      • all closed-over variables are captured by move or copy
        • a copy capture is preferred if a type implements the Copy trait
      • this allows the closure to outlive the captured values
    • note that in editions before 2021, composite types such as structs, tuples, and enums are always captured entirely; since the 2021 edition, a closure may capture individual fields
    • make a new anonymous struct-like type
      • fields of this new struct-like type are captured variables
      • implement Sized trait for this type
      • implement FnOnce trait for this type
        • indicates that the closure can be called once by consuming ownership of the closure
      • does the closure of this type not move out of any captured variables?
        • implement FnMut trait for this type
          • indicates that the closure can be called by mutable reference
      • does the closure of this type not mutate or move out of any captured variables?
        • implement Fn trait for this type
          • indicates that the closure can be called by shared reference
      • no variable/value captured by unique immutable or mutable reference?
        • all variables/values captured by copy or move implement Copy trait?
          • implement Copy trait for this type
        • all variables/values captured by copy or move implement Clone trait?
          • implement Clone trait for this type
            • the order of cloning of the captured variables is left unspecified
      • all captured variables implement Sync trait?
        • implement Sync trait for this type
      • all variables captured by non-unique immutable reference implement Sync trait?
        • all variables/values captured by unique immutable or mutable reference, copy, or move implement Send trait?
          • implement Send trait for this type
  • examples:
    // Each pair below is meant to be read in isolation; `v` cannot be used
    // again once a closure has moved or dropped it.
    let mut v = vec![1, 2, 3];
    
    // Implement `Fn` (capture by reference):
    let f_a = || { println!("{v:?}"); };
    // Also implement `Fn` (capture by move, but only an immutable reference is
    // used inside the closure's body):
    let f_b = move || { println!("{v:?}"); };
    
    // Implement `FnMut` (capture by mutable reference):
    let g_a = || { v.push(4); };
    // Also implement `FnMut` (capture by move, but only a mutable reference is
    // used inside the closure's body):
    let g_b = move || { v.push(4); };
    
    // Implement `FnOnce` (the closure uses an operation that takes ownership
    // of the captured value):
    let h_a = || { drop(v); };
    // Also implement `FnOnce` (capture by move, but what matters is what the
    // closure does with the captured value in its body):
    let h_b = move || { drop(v); };
    
    // Non-capturing closures can be coerced to function pointers:
    let inc: fn(i32) -> i32 = |x| x + 1;
    
    let mut b = false;
    let mrb = &mut b;
    
    // Unique immutable capture:
    let mut f = || { *mrb = true; };
  • see Closure types, Closure expressions, FnOnce, FnMut, and Fn for greater detail
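
The three closure traits can also be seen from the caller's side. The sketch below is a compilable variant of the ideas above; the helper functions call_fn, call_fn_mut, and call_fn_once are invented for this example and accept a closure through the weakest trait each of them needs.

```rust
/// Calls the closure twice through a shared reference.
fn call_fn(f: impl Fn() -> usize) -> usize {
    f() + f()
}

/// Calls the closure twice through a mutable reference.
fn call_fn_mut(mut f: impl FnMut()) {
    f();
    f();
}

/// Calls the closure once, consuming it.
fn call_fn_once(f: impl FnOnce() -> Vec<i32>) -> Vec<i32> {
    f()
}

fn demo() -> (usize, Vec<i32>, Vec<i32>) {
    let v = vec![1, 2, 3];
    // Captures `v` by shared reference, so the closure implements `Fn`.
    let lens = call_fn(|| v.len());

    let mut w = vec![1];
    // Captures `w` by mutable reference, so the closure implements `FnMut`.
    call_fn_mut(|| w.push(4));

    // Moves `v` out of the closure, so it implements only `FnOnce`.
    let moved = call_fn_once(move || v);
    (lens, w, moved)
}

fn main() {
    println!("{:?}", demo());
}
```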

Reference Types

Grammar:

reference_type:
    "&" lifetime? "mut"? type_no_bounds
  • if mut is not present, a reference is called a shared reference
    • it points to a memory location owned by some other value
    • prevents direct mutation of the value (exception to this rule is interior mutability)
    • there can be any number of shared references to a value
    • a reference type implements Copy trait
    • referencing a temporary value keeps it alive during the lifetime of the reference itself
  • if mut is present, a reference is called a mutable reference
    • like shared reference, it also points to a memory location owned by some other value
    • allows direct mutation of the value
      • the value must not be borrowed yet
    • only one mutable reference per value can exist at a time
    • a mutable reference type does not implement Copy trait
  • transmutation of a reference type, R, to a [u8; size_of::<R>()] is not valid
  • see References (& and &mut), Interior Mutability, and Temporaries for greater detail

Raw Pointer Types

Grammar:

raw_pointer_type:
    "*" ("mut" | "const") type_no_bounds
  • a raw pointer has no safety or liveness guarantees
  • *const T is an immutable raw pointer to T
  • *mut T is a mutable raw pointer to T
  • a raw pointer can be copied or dropped
    • these operations have no effect on the life cycle of any other value
  • dereferencing a raw pointer is an unsafe operation
    • reborrowing: &*, &mut *
  • a raw pointer, P, where P = *const T or P = *mut T, is said to be
    • thin if T: Sized
    • fat otherwise
      • this is the case for dynamically sized types, where the raw pointer carries additional data, such as a slice length or a pointer to a virtual method table
  • raw pointers are compared by their address
    • additional data are included into comparison in case of fat raw pointers
  • *const raw pointers can be created directly by core::ptr::addr_of
  • *mut raw pointers can be created directly by core::ptr::addr_of_mut
  • transmutation of a raw pointer type, P, to a [u8; size_of::<P>()] is not valid
  • transmutation of an integer or array of integers to a thin raw pointer is always valid
    • the pointer produced by this transmutation may not be dereferenced
  • see Raw pointers (*const and *mut), Unsafety, Dynamically Sized Types, core::ptr::addr_of, and core::ptr::addr_of_mut for greater detail
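
A minimal sketch of creating and dereferencing a raw pointer, and of the size difference between thin and fat raw pointers. The helper names are illustrative.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

/// Creates a `*const i32` directly from a place and reads through it.
fn read_through_raw(value: &i32) -> i32 {
    let p: *const i32 = addr_of!(*value);
    // SAFETY: `p` was derived from a live reference, so it is valid,
    // aligned, and points to an initialized `i32`.
    unsafe { *p }
}

/// Sizes of a thin pointer (`T: Sized`) and of a fat pointer to a slice,
/// which carries the slice length next to the address.
fn thin_and_fat_sizes() -> (usize, usize) {
    (size_of::<*const u8>(), size_of::<*const [u8]>())
}

fn main() {
    println!("{}", read_through_raw(&5));
    println!("{:?}", thin_and_fat_sizes());
}
```

Creating and copying the pointer is safe; only the dereference has to be placed in an unsafe block.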

Function Pointer Types

Grammar:

bare_function_type:
    for_lifetimes? function_type_qualifiers
        "fn" "(" function_parameters_maybe_named_variadic? ")"
        bare_function_return_type?

function_type_qualifiers:
    "unsafe"? ("extern" abi?)?

bare_function_return_type:
    "->" type_no_bounds

function_parameters_maybe_named_variadic:
    maybe_named_function_parameters
    maybe_named_function_parameters_variadic

maybe_named_function_parameters:
    maybe_named_param ("," maybe_named_param)* ","?

maybe_named_function_parameters_variadic:
    (maybe_named_param ",")* maybe_named_param "," outer_attribute* "..."

maybe_named_param:
    outer_attribute* ((identifier | "_") ":")? type

Subtyping

T_a is a subtype of T_b if and only if:

  • without lifetimes, T_a is equal to T_b
  • T_a is not a higher-ranked function pointer and T_a is not a trait object and
    • the lifetime of T_a outlives the lifetime of T_b
  • T_a is a higher-ranked function pointer or T_a is a trait object, and
    • T_b is given by a substitution of the higher-ranked lifetimes in T_a

Some examples:

fn bar<'a>() {
    // `&'static str` is a subtype of `&'a str` because `'static` outlives `'a`
    let s: &'static str = "hi";
    let t: &'a str = s;
}

// `for<'a> fn(&'a i32) -> &'a i32` is a subtype of `fn(&'static i32) -> &'static i32`
// because:
//   * `for<'a> fn(&'a i32) -> &'a i32` is a higher-ranked function pointer
//   * `fn(&'static i32) -> &'static i32` is derived from `for<'a> fn(&'a i32) -> &'a i32`
//     by substituting `'a` with `'static`
let subtype: &(for<'a> fn(&'a i32) -> &'a i32) = &((|x| x) as fn(&_) -> &_);
let supertype: &(fn(&'static i32) -> &'static i32) = subtype;

// `dyn for<'a> Fn(&'a i32) -> &'a i32` is a subtype of `dyn Fn(&'static i32) -> &'static i32`
// because:
//   * `dyn for<'a> Fn(&'a i32) -> &'a i32` is a trait object
//   * `dyn Fn(&'static i32) -> &'static i32` is derived from `dyn for<'a> Fn(&'a i32) -> &'a i32`
//     via substituting of `'a` with `'static`
let subtype: &(dyn for<'a> Fn(&'a i32) -> &'a i32) = &|x| x;
let supertype: &(dyn Fn(&'static i32) -> &'static i32) = subtype;

// `for<'a, 'b> fn(&'a i32, &'b i32)` is a subtype of `for<'c> fn(&'c i32, &'c i32)`
// because:
//   * `for<'a, 'b> fn(&'a i32, &'b i32)` is a higher-ranked function pointer
//   * `for<'c> fn(&'c i32, &'c i32)` is derived from `for<'a, 'b> fn(&'a i32, &'b i32)`
//     by substituting `'c` for both `'a` and `'b`
let subtype: &(for<'a, 'b> fn(&'a i32, &'b i32)) = &((|x, y| {}) as fn(&_, &_));
let supertype: &for<'c> fn(&'c i32, &'c i32) = subtype;

Variance:

  • property that generic types have with respect to their arguments
  • a generic type's variance in a parameter is how the subtyping of the parameter affects the subtyping of the type:
    • F<T> is covariant over T if T being a subtype of U implies that F<T> is a subtype of F<U>
    • F<T> is contravariant over T if T being a subtype of U implies that F<U> is a subtype of F<T>
    • F<T> is invariant over T otherwise
  • determining variance of types:
    • let F<'a> be &'a T and let 'a be a subtype of 'b (that is 'a outlives 'b)
      • then also &'a T is a subtype of &'b T, meaning that F<'a> is covariant over 'a
    • let F<T> be &'a T and let T be a subtype of U
      • then also &'a T is a subtype of &'a U since &'a T outlives &'a U, meaning that F<T> is covariant over T
    • let F<'a> be &'a mut T and let 'a be a subtype of 'b
      • then also &'a mut T is a subtype of &'b mut T, meaning that F<'a> is covariant over 'a
    • let F<T> be &'a mut T and let T be a subtype of U
      • &'a mut T is not a subtype of &'a mut U
        • &'a mut T can be promoted to &'a mut U since T is a subtype of U
        • now the value of type &'a mut T can be changed to some value of type &'a mut U
        • that is, &'a mut T can refer to a value with shorter lifetime than the original &'a mut T value
        • hence, there is no guarantee that &'a mut T outlives &'a mut U
        • hence, &'a mut T is not a subtype of &'a mut U
      • conversely, &'a mut U is not a subtype of &'a mut T
      • thus, F<T> is invariant over T
    • let F<T> be *const T and let T be a subtype of U
      • then also *const T is a subtype of *const U, meaning that F<T> is covariant over T
    • let F<T> be *mut T and let T be a subtype of U
      • *mut T can be promoted to *mut U
      • then a value of type *mut U can be changed due to mutability
      • thus, there is no guarantee that a value of type *mut T outlives a value of type *mut U
      • hence, *mut T is not a subtype of *mut U, meaning that F<T> is invariant over T
    • let F<T> be [T] and let T be a subtype of U
      • then also [T] is a subtype of [U], meaning that F<T> is covariant over T
    • let F<T> be [T; n] and let T be a subtype of U
      • then also [T; n] is a subtype of [U; n], meaning that F<T> is covariant over T
    • let F<T> be fn() -> T and let T be a subtype of U
      • then also fn() -> T is a subtype of fn() -> U, meaning that F<T> is covariant over T
    • let F<T> be fn(T) -> () and let T be a subtype of U
      • f: fn(T) -> () cannot be used in a place where fn(U) -> () is used since f expects an argument that outlives U
      • on the other hand, g: fn(U) -> () can be used in a place where fn(T) -> () is used since a provided argument of type T lives longer than an expected argument of type U
      • thus, fn(U) -> () is a subtype of fn(T) -> (), meaning that F<T> is contravariant over T
    • let F<T> be std::cell::UnsafeCell<T> and let T be a subtype of U
      • then F<T> is invariant over T due to interior mutability of std::cell::UnsafeCell<T>
    • let F<T> be std::marker::PhantomData<T> and let T be a subtype of U
      • then also std::marker::PhantomData<T> is a subtype of std::marker::PhantomData<U>, meaning that F<T> is covariant over T
    • let F<'a> be dyn Trait<T> + 'a and let 'a be a subtype of 'b
      • then also dyn Trait<T> + 'a is a subtype of dyn Trait<T> + 'b, meaning that F<'a> is covariant over 'a
    • let F<T> be dyn Trait<T> + 'a and let T be a subtype of U
      • like in the mutability case, dyn Trait<T> + 'a cannot be a subtype of dyn Trait<U> + 'a or vice versa since, due to dynamic manner, there is no guarantee that dyn Trait<T> + 'a data outlives dyn Trait<U> + 'a data (or vice versa)
      • thus, F<T> is invariant over T
    • let F<T> be struct, enum, or union
      • F<T> is covariant over T if and only if all its fields involving T are also covariant over T
      • F<T> is contravariant over T if and only if all its fields involving T are also contravariant over T
      • F<T> is invariant over T if and only if any of these cases holds:
        • a F<T> field is invariant over T
        • T is used in positions with different variances
    • outside of a struct, enum, or union, the variance for parameters is checked at each location separately

Some examples of variance:

use std::cell::UnsafeCell;
struct Variance<'a, 'b, 'c, 'd, T, U: 'a> {
    x: &'a U,
    y: *const T,
    z: UnsafeCell<&'b f64>,
    v: &'d T,
    w: *mut U,
    f: fn(&'c ()) -> &'c (),
    g: fn(&'d ()) -> (),
}
  • struct Variance is covariant in 'a because
    • x: &'a U is covariant in 'a
  • struct Variance is invariant in 'b because
    • z: UnsafeCell<&'b f64> is invariant over &'b f64
  • struct Variance is invariant in 'c because
    • f: fn(&'c ()) -> &'c () is contravariant in 'c at the argument position
    • f: fn(&'c ()) -> &'c () is covariant in 'c at the return position
  • struct Variance is invariant in 'd because
    • v: &'d T is covariant in 'd
    • g: fn(&'d ()) -> () is contravariant over &'d ()
  • struct Variance is covariant in T because
    • y: *const T is covariant in T
    • v: &'d T is covariant in T
  • struct Variance is invariant in U because
    • w: *mut U is invariant in U
use std::cell::UnsafeCell;
fn generic_tuple<'short, 'long: 'short>(
    x: (&'long u32, UnsafeCell<&'long u32>),
) {
    let _: (&'short u32, UnsafeCell<&'long u32>) = x;
}
  • x's type is a tuple and hence the variance for parameters is checked at each location inside of the tuple separately
    • 'long outlives 'short and hence &'long u32 is a subtype of &'short u32, meaning that &'long u32 is covariant in 'long
    • UnsafeCell<&'long u32> is invariant in 'long
    • thus, (&'long u32, UnsafeCell<&'long u32>) is a subtype of (&'short u32, UnsafeCell<&'long u32>)
fn takes_fn_ptr<'short, 'middle: 'short>(
    f: fn(&'middle ()) -> &'middle (),
) {
    let _: fn(&'static ()) -> &'short () = f;
}
  • f's type is a function pointer and hence the variance for parameters is checked at both argument and return type location separately
    • at the argument type location, fn(&'middle ()) -> &'middle () is contravariant in 'middle
    • at the return type location, fn(&'middle ()) -> &'middle () is covariant in 'middle
    • thus, fn(&'middle ()) -> &'middle () is a subtype of fn(&'static ()) -> &'short ()
      • 'static outlives 'middle
      • 'middle outlives 'short
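
The covariance and contravariance rules above can be checked with a small sketch. The function names shorten, narrow, and str_len are hypothetical and exist only for this illustration; the point is that both function bodies compile solely because of the subtyping relations described above.

```rust
// Covariance of `&'a T` in `'a`: a reference with a longer lifetime can be
// used where a reference with a shorter lifetime is expected.
fn shorten<'short, 'long: 'short>(s: &'long str) -> &'short str {
    s
}

// Contravariance of `fn(T) -> R` in `T`: a function that accepts short-lived
// arguments can be used where a function accepting long-lived arguments is
// expected.
fn narrow<'short, 'long: 'short>(
    f: fn(&'short str) -> usize,
) -> fn(&'long str) -> usize {
    f
}

// Hypothetical helper passed to `narrow` as a function pointer.
fn str_len(s: &str) -> usize {
    s.len()
}
```

Swapping the lifetimes in either signature (for example returning &'long str from a &'short str argument) should be rejected by the compiler, since the required subtyping relation does not hold in that direction.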

See Subtyping and Variance, Lifetime bounds, Higher-ranked trait bounds, Function pointer types, Trait objects, std::cell::UnsafeCell, and std::marker::PhantomData for greater detail.

Type Coercions

Type coercions are implicit type casts.

  • are done automatically at specific locations
  • any conversion allowed by coercion can also be performed explicitly via as (type cast) operator

Coercions can occur at these locations, called coercion sites:

  • let statements
  • static and const item declarations
  • arguments for function calls
  • instantiations of struct, union, or enum variant fields
  • function results

Recursive propagation of coercion sites:

  • if the expression on a coercion site is an array literal, where the array has type [U; n], then
    • each sub-expression in the array literal is a coercion site for coercion to type U
  • if the expression on a coercion site is an array literal with repeating syntax, where the array has type [U; n], then
    • the repeated sub-expression is a coercion site for coercion to type U
  • if the expression on a coercion site is a tuple that is a coercion site to type (U_0, U_1, ..., U_n), then
    • each sub-expression is a coercion site to the respective type, e.g. the 0th sub-expression is a coercion site to type U_0
  • if the expression on a coercion site is a parenthesized sub-expression (e), then
    • if (e) has type U, then e is a coercion site to U
  • if the expression on a coercion site is a block that has type U, then
    • the last expression in the block, if it is not semicolon-terminated, is a coercion site to U (this includes blocks which are part of control flow statements, if the block has a known type)

Coercion is allowed between the following types:

  • T can be coerced to U if
    • T is a subtype of U
  • T can be coerced to V if
    1. T can be coerced to U and
    2. U can be coerced to V
  • &mut T can be coerced to &T
  • *mut T can be coerced to *const T
  • &T can be coerced to *const T
  • &mut T can be coerced to *mut T
  • &T or &mut T can be coerced to &U if
    • T implements Deref<Target = U>
  • &mut T can be coerced to &mut U if
    • T implements DerefMut<Target = U>
  • type_constructor(T) can be coerced to type_constructor(U) if
    1. type_constructor(T) is one of
      • &T
      • &mut T
      • *const T
      • *mut T
      • Box<T>
    2. U can be obtained from T by unsized coercion
  • function item types can be coerced to fn pointers
  • non-capturing closures can be coerced to fn pointers
  • ! can be coerced to any T
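
A few of the coercions listed above, sketched as small functions. All function names here are hypothetical; each body or call relies on one implicit coercion at a coercion site (a function result, a let statement with a type annotation, or a call argument).

```rust
// `&mut T` is coerced to `&T` at the function result.
fn read_only(x: &mut i32) -> &i32 {
    x
}

// `&T` is coerced to `*const T` in the annotated `let` and in the call
// argument; only the addresses are compared, nothing is dereferenced.
fn same_address(x: &i32) -> bool {
    let p: *const i32 = x;
    std::ptr::eq(p, x)
}

fn str_len(s: &str) -> usize {
    s.len()
}

// `&String` is coerced to `&str` because `String` implements
// `Deref<Target = str>`.
fn owned_len(s: &String) -> usize {
    str_len(s)
}

// A non-capturing closure is coerced to a `fn` pointer.
fn make_doubler() -> fn(i32) -> i32 {
    |x| x * 2
}
```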

Unsized coercions:

  • conversions of sized types to unsized ones
  • if T can be coerced to U by unsized coercion, then
    • an implementation of Unsize<U> for T will be provided
  • [T; n] can be coerced to [T]
  • T can be coerced to dyn U if
    1. T implements U + Sized
    2. U is object safe
  • Foo<..., T, ...> can be coerced to Foo<..., U, ...> if
    1. Foo is a struct
    2. T implements Unsize<U>
    3. the last field of Foo has a type involving T
      • if this field has type Bar<T>, then
        • Bar<T> implements Unsize<Bar<U>>
    4. T is not part of the type of any other fields
  • a type Foo<T> can implement CoerceUnsized<Foo<U>> if
    • T implements Unsize<U> or CoerceUnsized<Foo<U>>
      • this allows Foo<T> to be coerced to Foo<U>
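
A minimal sketch of the unsized coercions above in action. The function names sum, show, and boxed are hypothetical; the coercions happen at the call arguments and at the function result.

```rust
use std::fmt::Display;

// Callers can pass `&[i32; n]` for any `n`: `[T; n]` coerces to `[T]`
// behind the reference.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Callers can pass `&T` for any sized `T` that implements `Display`:
// `T` coerces to `dyn Display` behind the reference.
fn show(x: &dyn Display) -> String {
    x.to_string()
}

// `Box<i32>` is coerced to `Box<dyn Display>` at the function result.
fn boxed(n: i32) -> Box<dyn Display> {
    Box::new(n)
}
```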

Least upper bound coercions:

  • only used in the following situations:
    • to find the common type for a series of if branches
    • to find the common type for a series of match arms
    • to find the common type for array elements
    • to find the type for the return type of a closure with multiple return statements
    • to check the type for the return type of a function with multiple return statements
  • algorithm:
    • input: types T_0, T_1, ..., T_n
    • output: type T_t
    • method:
      • set T_t to T_0
      • for i in 1..=n:
        • if T_i can be coerced to T_t
          • no change is made
        • otherwise, if T_t can be coerced to T_i
          • set T_t to T_i
        • otherwise
          • set T_t to mutual supertype of T_t and T_i
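
As a hedged illustration of a least upper bound coercion: the two branches below have distinct function item types, and since f carries no type annotation, the common type is found by the algorithm above (here it should be the function pointer type fn(i32) -> i32). The names inc, dbl, and apply are hypothetical.

```rust
fn inc(x: i32) -> i32 {
    x + 1
}

fn dbl(x: i32) -> i32 {
    x * 2
}

fn apply(use_dbl: bool, x: i32) -> i32 {
    // `dbl` and `inc` are values of two different function item types;
    // the `if` expression unifies them to a common `fn` pointer type.
    let f = if use_dbl { dbl } else { inc };
    f(x)
}
```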

See Type coercions, Type cast expressions, Subtyping and Variance, Object Safety, RFC 255, RFC 546, std::marker::Unsize, std::ops::CoerceUnsized, and std::marker::Sized for greater detail.

Declarations

Grammar:

declaration:
    let_statement
    declaration_item

declaration_item:
    use_declaration
    constant_item
    static_item
    type_alias
    struct
    union
    enumeration
    function
    extern_block

Use Declarations

Grammar:

use_declaration:
    "use" use_tree ";"

use_tree:
    (simple_path? "::")? ("*" | "{" (use_tree ("," use_tree)* ","?)? "}")
    simple_path ("as" (identifier | "_"))?
  • a use declaration creates one or more local name bindings synonymous with some other path
  • use declarations may appear in modules or blocks
  • use declarations are resolved after macro expansion, e.g. this will not produce an error:
    macro_rules! m {
        ($x: item) => { $x $x }
    }
    
    m!(use std as _;);

Forms of a use declaration:

  • use a::b::c; makes c an alias for a::b::c in the current scope
  • use a::b::c as foo; makes foo an alias for a::b::c in the current scope
  • use a::b::*; brings all public items defined under a::b into the current scope
  • use a::{self, b::c, d, e::*, f::g as h}; is the equivalent of
    // `self` refers to the common parent module, hence `use a;`
    use a;
    use a::b::c;
    use a::d;
    use a::e::*;
    use a::f::g as h;
  • nesting is also supported, e.g. use a::b::{self as c, d::{self, *}}; is the equivalent of
    use a::b as c;
    use a::b::d;
    use a::b::d::*;
  • use path as _; imports path without binding it to a name
    • use path::to::trait as _; imports trait's methods but not the trait symbol itself (recall that a trait must be in scope before its methods can be used)
  • use path::*; also imports the items that path itself imported with as _, in their unnameable form
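
The forms above can be combined in one small sketch. The function summary is hypothetical; the imported paths are from the standard library. The second import brings the methods of std::fmt::Write into scope (needed by write! on a String) without binding the name Write.

```rust
// Nested import: binds `collections` (via `self`), `HashMap`, and `Set`.
use std::collections::{self, hash_map::HashMap, BTreeSet as Set};
// Trait imported only for its methods; the name `Write` stays unbound.
use std::fmt::Write as _;

fn summary(words: &[&str]) -> String {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &w in words {
        *counts.entry(w).or_insert(0) += 1;
    }
    // `Set` is an alias for `BTreeSet`, so iteration is in sorted order.
    let unique: Set<&str> = words.iter().copied().collect();
    // `collections` itself is nameable thanks to `self` in the import.
    let ordered: collections::VecDeque<&str> = unique.into_iter().collect();

    let mut out = String::new();
    for w in ordered {
        write!(out, "{w}={} ", counts[w]).expect("writing to a String does not fail");
    }
    out
}
```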

See Use declarations, Paths, Modules, and Block expressions for greater detail.

Variables

Grammar:

let_statement:
    outer_attribute* "let" pattern_no_top_alt (":" type)?
        ("=" expression ("else" block_expression)?)? ";"
  • introduces a new set of variables given by a pattern_no_top_alt
  • pattern_no_top_alt can be annotated with type
  • variables in pattern_no_top_alt can be initialized by expression
  • if else is not present, pattern_no_top_alt must be irrefutable
  • if else is present
    • pattern_no_top_alt can be refutable
    • expression must not be a lazy_boolean_expression or end with a }
    • block_expression must evaluate to never type
  • the semantics of else part is that if pattern_no_top_alt fails to match then the block_expression is executed
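
A short sketch of let with an else part. The function parse_pair is hypothetical; both patterns are refutable and each else block diverges via return, which has the never type.

```rust
fn parse_pair(s: &str) -> Option<(i32, i32)> {
    // If the pattern does not match, the `else` block runs and must diverge.
    let Some((left, right)) = s.split_once(',') else {
        return None;
    };
    // The initializer is a tuple expression, so it does not end with `}`.
    let (Ok(a), Ok(b)) = (left.trim().parse::<i32>(), right.trim().parse::<i32>()) else {
        return None;
    };
    Some((a, b))
}
```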

Variables:

  • are allocated on the stack frame, i.e. a variable can be

    • a named local variable
    • a named function parameter
    • an anonymous temporary (e.g. created during an expression evaluation)
  • are defined as immutable by default

    • to define a mutable variable, use the mut keyword
    let x = 5;      // immutable variable
    let mut y = 7;  // mutable variable
  • are scoped

  • are not initialized

    • all variables must be initialized before their first use
  • can be shadowed:

    let x = "foo";    // x is immutable and &str
    let x = x.len();  // x is shadowed - still immutable but integer

See Identifiers, let statements, Variables and Temporaries for greater detail.

Constants

Grammar:

constant_item:
    "const" (identifier | "_") ":" type ("=" expression)? ";"

Constants are scoped and always immutable.

Example:

const THREE: u32 = 1 + 2;
  • convention: use upper case and underscores for constant names

See Constant items and Constant evaluation for greater detail.

Statics

Grammar:

static_item:
    "static" "mut"? identifier ":" type ("=" expression)? ";"

See Static items for greater detail.

Type Aliases

Grammar:

type_alias:
    "type" identifier generic_params? (":" type_param_bounds)? where_clause?
        ("=" type where_clause?)? ";"

See Type aliases for greater detail.

Structs

Grammar:

struct:
    "struct" identifier generic_params? where_clause?
        ("{" struct_fields? "}" | ";")
    "struct" identifier generic_params? "(" tuple_fields? ")" where_clause? ";"

struct_fields:
    struct_field ("," struct_field)* ","?
struct_field:
    outer_attribute* visibility? identifier ":" type

tuple_fields:
    tuple_field ("," tuple_field)* ","?
tuple_field:
    outer_attribute* visibility? type

Structs with Named Fields

Declaration:

struct Point3D {
    x: f64,
    y: f64,
    z: f64,
}
  • declare a new type Point3D as a struct containing three f64 items named x, y and z, respectively

Making an instance:

let p1 = Point3D {x: 0.5, y: -1.2, z: 1.0};      // (1)
let mut p2 = Point3D {z: 1.0, y: -1.2, x: 0.5};  // (2)
  • at (1) an instance of Point3D is created and assigned to p1
  • at (2) happens the same but this time the instance is assigned to p2
  • the order of initializers does not matter in Rust, thus p1 and p2 are equal

Accessing an element:

p2.x += p1.x;
  • an element of a struct is accessed by its name using dot operator (.)

Field init shorthand:

fn x_axis_point(x: f64) -> Point3D {
    Point3D {y: 0.0, z: 0.0, x}
}
  • if the name of a variable coincides with the name of a field then initializer name: name can be shortened to just name

Struct update syntax:

let origin = Point3D {x: 0.0, y: 0.0, z: 0.0};
let z1 = Point3D {z: 1.0, ..origin};
  • missing initializers are taken from origin, so the z1 is equal to Point3D {x: 0.0, y: 0.0, z: 1.0}
  • note that in case Point3D contains a field that can be only moved (e.g. a field of String type), then origin cannot be used after the assignment to z1 is finished
  • z1 must be of the same type as origin
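
The note about move-only fields can be illustrated with a hypothetical Config struct (not part of the original notes). Struct update syntax moves every non-Copy field that is not given explicitly, so the sketch either clones the base first or supplies the non-Copy field itself.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Config {
    name: String, // not `Copy`: struct update syntax would move it
    retries: u32,
    verbose: bool,
}

fn with_retries(base: &Config, retries: u32) -> Config {
    // The remaining fields are moved out of a temporary clone, so the
    // caller's `base` stays usable.
    Config { retries, ..base.clone() }
}

fn rename(base: Config, name: &str) -> Config {
    // `name` is given explicitly, so only the `Copy` fields `retries` and
    // `verbose` are taken from `base`.
    Config { name: name.to_string(), ..base }
}
```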

Tuple-like Structs

Declaration:

struct Color(u8, u8, u8);
  • declare a new type Color as a struct containing three u8 elements

Making an instance:

let red = Color(255, 0, 0);
let ctor = Color;
let blue = ctor(0, 0, 255);
  • create an instance of Color and assign it to red
  • Color behaves like function/constructor

Accessing an element:

let green = red.1;
  • elements are accessed like in tuples

Struct update:

let black = Color(0, 0, 0);
let red = Color {0: 255, ..black};
  • decimal integer literal as a field name specifies which field is updated
  • ..black sets the rest of the fields from black using the copy/borrow strategy

Unit-like Structs

Declaration:

struct Ground;
struct Sink;
  • declare two new distinct types, Ground and Sink, with no elements
  • unit-like structs become useful when used together with traits

Making an instance:

let ground = Ground;
let ground2 = Ground {};
  • create an instance of Ground and assign it to ground
  • Ground can be optionally followed by {} to explicitly denote there are no fields

See Structs and Struct expressions for greater detail.

Unions

Grammar:

union:
    "union" identifier generic_params? where_clause? "{" struct_fields "}"

See Unions for greater detail.

Enumerations

Grammar:

enumeration:
    "enum" identifier generic_params? where_clause? "{" enum_items? "}"

enum_items:
    enum_item ("," enum_item)* ","?
enum_item:
    outer_attribute* visibility?
        identifier (enum_item_tuple | enum_item_struct)?
        enum_item_discriminant?

enum_item_tuple:
    "(" tuple_fields? ")"
enum_item_struct:
    "{" struct_fields? "}"
enum_item_discriminant:
    "=" expression

Enumerations represent a sum of enumeration types distinguished by constructors.

Definition and use of enumerations:

enum Animal {
    // Tuple-like enum variant:
    Dog(String, f64),
    // Struct-like enum variant:
    Cat { name: String, weight: f64 },
    // Unit variant:
    Mouse,
}

let mut a: Animal = Animal::Dog("Sunny".to_string(), 13.5);
a = Animal::Cat { name: "Ginger".to_string(), weight: 4.7 };
a = Animal::Mouse;

// Values are extracted using pattern matching:
if let Animal::Cat { name, .. } = a {
    println!("Cat's name is {name}");
}

Like structs, also enumerations support defining methods on them:

enum FileError {
    NotFound,
    Read,
    Write,
}

impl FileError {
    fn detail(&self) -> String {
        match self {
            FileError::NotFound => String::from("File not found"),
            FileError::Read => String::from("Error while reading"),
            FileError::Write => String::from("Error while writing"),
        }
    }
}

Syntactically, enumerations allow a visibility annotation on their variants, but this is rejected during validation:

// Syntactical macros can use the enum definition to generate a code and throw
// out the old enum definition so it will not be analyzed by semantic analysis
// (if so, it will be rejected)
#[some_macro("foo")]
enum Enum {
    pub A,
    pub(crate) B(),
}

A field-less enum is an enum where no constructor contains fields:

enum FieldLessEnum {
    CtorA(),
    CtorB{},
    CtorC,
}

A unit-only enum only contains unit variants:

enum UnitOnly {
    UnitA,
    UnitB,
    UnitC,
}

A zero-variant enum is an enum with no variants and thus it cannot be instantiated:

enum ZeroVariants {}
  • zero-variant enums are equivalent to never type:
    let x: ZeroVariants = panic!();
  • coercion into other types is not allowed:
    let y: u32 = x;  // type mismatch
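
Although a zero-variant enum value cannot be coerced to another type, an empty match over it is exhaustive and can produce any type. A minimal sketch, assuming the hypothetical helpers absurd and unwrap_infallible:

```rust
enum ZeroVariants {}

// There are no variants, so a `match` with no arms covers every case.
fn absurd(x: ZeroVariants) -> u32 {
    match x {}
}

// The error type has no values, so the `Err` arm can never be taken.
fn unwrap_infallible(r: Result<u32, ZeroVariants>) -> u32 {
    match r {
        Ok(v) => v,
        Err(e) => absurd(e),
    }
}
```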

Discriminants

A discriminant is a number associated with a constructor used to distinguish between variants of one enum instance.

  • its type is isize under the default representation
  • however, compiler is allowed to use a smaller type in its actual memory layout
Discriminant Values

A discriminant value can be set in two ways:

  1. Implicitly, if the value of the discriminant is not specified explicitly:

    • the value of the discriminant is the value of the discriminant of the previous variant plus one
    • if the value of the discriminant of the first variant is not specified explicitly it is set to zero

    Example:

    // Unit only enumeration => setting discriminants explicitly is allowed
    enum Example {
        VarA,        // Implicitly set to 0
        VarB = 123,  // Explicitly set to 123
        VarC,        // Implicitly set to 124
    }
  2. Explicitly, using = followed by a constant expression, under these circumstances:

    • the enumeration is unit-only
    • a primitive representation is used

    Examples:

    #[repr(u8)]      // A primitive (u8) representation (discriminant values range from 0 to 255)
    enum Enum {
        Unit = 3,    // Unit = 3 (set explicitly)
        Tuple(u16),  // Tuple = 4 (set implicitly)
        Struct {     // Struct = 1 (set explicitly)
            a: u8,
            b: u16,
        } = 1,
    }
    
    enum Bad1 {
        A = 1,
        B = 1,   // ERROR: 1 is already used
    }
    
    enum Bad2 {
        A,      // Implicitly set to 0
        B,      // Implicitly set to 1
        C = 1,  // ERROR: 1 is already used
    }
    
    #[repr(u8)]
    enum Bad3 {
        A = 255,  // Explicitly set to 255
        B,        // ERROR: Implicitly set to 256 which does not fit into u8 (overflow)
    }
How to Get the Discriminant Value
  • using std::mem::discriminant (can be used only for == and != comparison)
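    use std::mem;

    enum Enum {
        VarA(&'static str),
        VarB(i32),
        VarC(i32),
    }

    // Same variant, different payload: the discriminants are equal
    assert_eq!(mem::discriminant(&Enum::VarA("abc")), mem::discriminant(&Enum::VarA("def")));
    // Different variants: the discriminants differ
    assert_ne!(mem::discriminant(&Enum::VarB(2)), mem::discriminant(&Enum::VarC(3)));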
  • via typecasting (can be used only for enums having only unit variants or for field-less enums where only unit variants are explicit)
    enum Enum {
        A,  // 0
        B,  // 1
        C,  // 2
    }
    
    assert_eq!(Enum::B as isize, 1);
    
    #[repr(u8)]
    enum FieldLess {
        Tuple(),            // 0
        Struct{},           // 1
        Unit,               // 2
        ExplicitUnit = 42,  // 42
    }
    
    assert_eq!(FieldLess::Tuple() as u8, 0);
    assert_eq!(FieldLess::Struct{} as u8, 1);
    assert_eq!(FieldLess::ExplicitUnit as u8, 42);
    
    #[repr(u8)]
    enum FieldLess2 {
        Tuple() = 2,
        Unit,
    }
    
    // ERROR: Typecast cannot be used as non-unit variant's discriminant has been
    //        set explicitly
    // assert_eq!(FieldLess2::Unit as u8, 3);
  • via (unsafe) pointer casting (can be used only for enums using a primitive representation)
    #[repr(u8)]
    enum Foo {
        A,                         // 0
        B { a: i16, b: i16 } = 3,  // 3
        C(i32) = 5,                // 5
    }
    
    impl Foo {
        fn discriminant(&self) -> u8 {
            unsafe { *(self as *const Self as *const u8) }
        }
    }
    
    let a = Foo::A;
    let b = Foo::B{a: -1, b: 4};
    let c = Foo::C(3);
    
    assert_eq!(a.discriminant(), 0);
    assert_eq!(b.discriminant(), 3);
    assert_eq!(c.discriminant(), 5);

See Enumerations, Enumerated types, Struct expressions, The Rust Representation, and Primitive representations for greater detail.

Functions

Grammar:

function:
    function_qualifiers "fn" identifier generic_params?
        "(" function_parameters? ")"
        function_return_type? where_clause?
        (block_expression | ";")

function_qualifiers:
    "const"? "async"? "unsafe"? ("extern" abi?)?
abi:
    string_literal
    raw_string_literal

function_parameters:
    self_param ","?
    (self_param ",")? function_param ("," function_param)* ","?
self_param:
    outer_attribute* (shorthand_self | typed_self)
shorthand_self:
    ("&" lifetime?)? "mut"? "self"
typed_self:
    "mut"? "self" ":" type
function_param:
    outer_attribute* (function_param_pattern | "..." | type)
function_param_pattern:
    pattern_no_top_alt ":" (type | "...")

function_return_type:
    "->" type

Simple function definition and simple call example:

fn main() {
    simple_fun();
}

fn simple_fun() {
    println!("Hello!");
}

Function with parameters:

fn fun_with_params(x: i32, y: i32) {
    println!("x: {x}, y: {y}");
}

fn main() {
    fun_with_params(5, 3);
}

Function returning value:

fn max(a: i32, b: i32) -> i32 {
    if a > b {
        return a;
    }
    b
}

fn main() {
    let x = max(1, 2);

    println!("max(1, 2): {x}");
}

See Functions for greater detail.

External Blocks

Grammar:

extern_block:
    "unsafe"? "extern" abi? "{" inner_attribute* external_item* "}"

external_item:
    outer_attribute* (
        macro_invocation_semi |
        (visibility? (static_item | function))
    )

See External blocks for greater detail.

Ownership

  • ownership is a strategy for keeping track of used memory
    • whether the memory is used or not is decided at compile time
  • ownership rules
    1. every value in Rust has an owner
    2. there can be only one owner at a time
    3. when the owner goes out of scope, the value will be dropped (Rust calls drop on it)
  • see Pointer types and Slice types for greater detail
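
The third ownership rule can be observed with a type that records when it is dropped. This is a sketch only; the Tracked type and the drop_order function are hypothetical. Values are dropped when their owner goes out of scope, and owners within one scope are dropped in reverse order of declaration.

```rust
use std::cell::RefCell;

// Pushes its name to a shared log when dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Tracked { name: "a", log: &log };
        {
            let _b = Tracked { name: "b", log: &log };
        } // `_b` goes out of scope and is dropped here
        let _c = Tracked { name: "c", log: &log };
    } // `_c` is dropped first, then `_a`
    log.into_inner()
}
```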

Moves, Clones, and Copies

When a value is moved from one variable to another, the value of the former variable is considered invalid and that variable cannot be used:

let a = String::from("Hey!");
let b = a;  // `a` cannot be used from now since its value is invalid

This can be bypassed using the clone method:

let a = String::from("Hello!");
let b = a.clone();  // both `a` and `b` stay valid

However, if a type implements the Copy trait, a value is not moved but copied and it stays valid:

let x = 5;
let y = x;  // `x` is valid since `i32` implements the `Copy` trait

Note

If a type or any of its parts implements the Drop trait, it cannot be annotated with (it cannot implement) the Copy trait.

Types that can implement the Copy trait are in general:

  • any group of simple scalar values
  • nothing that requires allocation or is some form of resource

This includes:

  • all integer types
  • bool type
  • all floating-point types
  • char type
  • tuples containing only types implementing Copy

Moving and copying concepts hold also for functions and other assignment-like operations:

  • function(x) moves/copies x to its parameter
  • return x moves/copies x outside of function as its return value
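
A small sketch of both cases at a function boundary. The Meters type and the functions double, shout, and demo are hypothetical: Meters implements Copy, so passing it copies it, while String does not, so passing it moves it.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

// Takes `Meters` by value; the caller's value is copied.
fn double(m: Meters) -> Meters {
    Meters(m.0 * 2.0)
}

// Takes `String` by value; the argument is moved into `s` and the result is
// moved out to the caller.
fn shout(s: String) -> String {
    s.to_uppercase()
}

fn demo() -> (Meters, Meters, String, String) {
    let m = Meters(1.5);
    let d = double(m); // `m` is copied, so it stays valid
    let s = String::from("hey");
    let t = shout(s.clone()); // the clone is moved, so `s` stays valid
    (m, d, s, t)
}
```

Calling shout(s) without the clone would move s, and the later use of s in the returned tuple would be rejected by the compiler.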

References and Borrowing

  • a reference is holding an address to some kind of data/value
  • a reference lifetime starts with its definition and ends after its last use in the current scope
  • a reference is guaranteed to point to a valid data/value during the reference lifetime
  • a reference points to a value but does not own it
    • a referenced value cannot be changed through a shared reference, and thus there can be many shared references to the same memory location at the same time
  • a reference is created using & unary operator
    • e.g., &a is a reference to the value owned by a
  • & can be also used to declare a reference type
    • e.g., &i32 is a reference type of i32 type
  • creating a reference is called borrowing
  • using mut together with & creates/declares a mutable reference
    • &mut a is a mutable reference to the value of a
    • &mut i32 is a mutable reference type of i32 type
  • a mutable reference allows changing the value it is referring to
    • no other references to that value are allowed to exist during the mutable reference lifetime
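
The borrowing rules above, sketched with hypothetical functions total, push_total, and demo. Several shared references coexist, and the mutable borrow is accepted only because the shared references are no longer used at that point.

```rust
fn total(v: &[u32]) -> u32 {
    v.iter().sum()
}

fn push_total(v: &mut Vec<u32>) {
    // The shared reborrow of `v` ends once `total` returns.
    let t = total(v);
    v.push(t);
}

fn demo() -> Vec<u32> {
    let mut v = vec![1, 2, 3];
    let r1 = &v;
    let r2 = &v; // many shared references may exist at the same time
    let both = total(r1) + total(r2);
    // `r1` and `r2` are not used past this point, so their lifetimes have
    // ended and a mutable borrow is allowed.
    push_total(&mut v);
    v.push(both);
    v
}
```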

Dangling Pointers

Rust will not allow dangling pointers:

fn dangle() -> &String {  // rejected by the compiler
    let s = String::from("Hey!");

    &s  // `s` goes out of scope so it is dropped; `&s` would point to freed memory
}

Slices

  • a slice is a reference to a contiguous sequence of elements in a collection
  • a slice is made using index (&x[r]) expression, where &x is a reference to some collection type and r has a range type; the value of &x[r] is the reference to the portion of x; some examples:
    let s = String::from("abcdefgh");
    
    let x = &s[1..4];  // `x` points to `s[1]` ("bcd") and has type `&str`
    let y = &s[..3];   // same as `&s[0..3]`
    let z = &s[2..];   // same as `&s[2..(s.len())]`
    let w = &s[..];    // same as `&s[0..(s.len())]`
    
    // let t = &s[-1..0];  // error: `a` is not of type `usize`
    // let t = &s[1..0];   // panic: `a` is not less than or equal to `b`
    let t = &s[1..=0];     // ok: `t == ""`, since `1..=0` is equivalent to `1..1`
    // let t = &s[2..=0];  // panic: `2..=0` is equivalent to `2..1` and `2 > 1`
    
    let a = [1, 2, 3, 4, 5];
    
    let x = &a[1..3];  // `assert_eq!(x, &[2, 3]);`, `x` has type `&[i32]`

Note

  1. A range a..b used to make a slice must have a <= b, where both a and b are of the usize type.
  2. String (str) slice range indices must occur at valid UTF-8 character boundaries. Otherwise the program panics.
  3. String literals have a type &str since they are slices of a binary data stored in the data section of a program.
  4. String implements the Deref trait that converts &String to &str by calling a deref method (the code is generated by Rust during compile time). As a consequence, if a function fun has a signature fn fun(s: &str) and x has a type &String, we can call fun(x) without any worries.
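
Notes 2 and 4 can be sketched as follows. The functions first_word and prefix are hypothetical; str::get is the non-panicking counterpart of indexing and returns None when the range is out of bounds or does not fall on a UTF-8 character boundary.

```rust
// Accepts `&str`; a `&String` argument is converted via deref coercion.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Returns the first `n` bytes of `s` as a string slice, or `None` where
// `&s[..n]` would panic.
fn prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}
```

In "héllo" the character é occupies two bytes, so a prefix of 2 bytes would end in the middle of a character.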

Expressions

Grammar:

expression:
    expression_without_block
    expression_with_block

expression_without_block:
    outer_attribute* (
        break_expression |
        continue_expression |
        return_expression |
        closure_expression |
        operator_expression |
        range_expression |
        index_expression |
        call_expression |
        field_expression |
        tuple_indexing_expression |
        method_call_expression |
        await_expression |
        path_expression |
        async_block_expression |
        atomic_expression |
        grouped_expression |
        macro_invocation
    )
expression_with_block:
    outer_attribute* (
        loop_expression |
        if_expression |
        if_let_expression |
        match_expression |
        block_expression |
        unsafe_block_expression
    )

operator_expression:
    assignment_expression
    compound_assignment_expression
    lazy_boolean_expression
    comparison_expression
    arithmetic_or_logical_expression
    type_cast_expression
    negation_expression
    dereference_expression
    borrow_expression
    error_propagation_expression

References:

Loop Expressions

Grammar:

loop_expression:
    loop_label? (
        "loop" block_expression
        "while" (expression - struct_expression) block_expression
        "while" "let" pattern "=" (scrutinee - lazy_boolean_expression)
            block_expression
        "for" pattern "in" (expression - struct_expression) block_expression
        block_expression
    )
loop_label:
    lifetime_or_label ":"

break_expression:
    "break" lifetime_or_label? expression?
continue_expression:
    "continue" lifetime_or_label?

loop { body }:

  • executes body repeatedly until a break expression (or another diverging construct, such as return or a panic) leaves the loop
  • if body does not contain break, the type of the expression is !; otherwise, the type of the expression is the type of the value carried by the break expressions
  • every break expression inside body that exits this loop must carry a value of a type compatible with the type of the expression
  • the value of the expression is the value passed to the break expression that exits the loop
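A loop that produces a value through break could look like the sketch below. The function name is made up for the illustration.

```rust
// Hypothetical helper: the smallest power of two that is >= `limit`.
fn first_power_of_two_at_least(limit: u32) -> u32 {
    let mut n = 1;
    // The type of the whole `loop` expression is `u32`, the type of the
    // value carried by `break`.
    loop {
        if n >= limit {
            break n;
        }
        n *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_at_least(100));
}
```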

while condition { body }:

  • condition is evaluated before each iteration: if it is true, body is executed and the next iteration starts; otherwise, the loop ends
  • condition must not be struct_expression
  • the type of the expression is always () and body must have the type (); a break expression inside body cannot carry a value

while let pattern = scrutinee { body }:

  • scrutinee is evaluated before each iteration: if its value matches pattern, body is executed and the next iteration starts; otherwise, the loop ends
  • scrutinee must not be lazy_boolean_expression
  • the type of the expression is always () and body must have the type (); a break expression inside body cannot carry a value
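The two while forms can be compared in a short sketch. Both helper functions are hypothetical and exist only for this illustration.

```rust
// `while let`: keeps looping as long as `pop` yields `Some(..)`.
fn drain_sum(mut stack: Vec<u32>) -> u32 {
    let mut sum = 0;
    while let Some(top) = stack.pop() {
        sum += top;
    }
    sum
}

// `while`: keeps looping as long as the condition is true.
fn count_halvings(mut n: u32) -> u32 {
    let mut steps = 0;
    while n > 1 {
        n /= 2;
        steps += 1;
    }
    steps
}

fn main() {
    println!("{}", drain_sum(vec![1, 2, 3]));
    println!("{}", count_halvings(8));
}
```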

for pattern in expression { body }:

  • expression must not be struct_expression
  • the value of expression must implement std::iter::IntoIterator
  • pattern must be irrefutable
  • if the iterator yields a value, the value is matched against pattern and body is executed, after which control returns to the loop for the next iteration; the loop ends when the iterator yields None
  • a 'label: for pattern in expression { body } is equivalent to
    {
        let result = match IntoIterator::into_iter(expression) {
            // Don't drop temporaries from `expression` before loop is finished
            mut iter => 'label: loop {
                let mut next;
                match Iterator::next(&mut iter) {
                    Option::Some(val) => next = val,
                    Option::None => break,
                };
                let pattern = next;
                let () = { body };
            },
        };
        result
    }
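To see the equivalence in practice, the sketch below writes the same summation once with for and once with a hand-written loop in the spirit of the expansion above. It is a simplified illustration (no label, no result binding), and both function names are hypothetical.

```rust
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    // `&[i32]` implements `IntoIterator`, yielding `&i32` items.
    let mut iter = IntoIterator::into_iter(values);
    loop {
        let v = match Iterator::next(&mut iter) {
            Some(val) => val,
            None => break,
        };
        total += *v;
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_with_for(&data), sum_desugared(&data));
}
```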

Loop labels:

  • a loop expression can be optionally labeled
  • labels can be shadowed
    'a: loop {         // (1)
        'a: loop {     // (2)
            break 'a;  // exit (2) loop
        }
        break 'a;      // exit (1) loop
    }

break expressions:

  • allowed only inside the body of a loop or of a labeled block expression
  • without a label, break immediately exits the innermost enclosing loop
    • if a label is specified, break immediately exits the loop or block labeled with this label
    • to exit a labeled block expression, the label part is mandatory
  • if the expression part is present, break returns the value of the expression to the associated loop or labeled block expression
    • a bare break returns ()
    • only loop and labeled block expressions accept a break with a value

continue expressions:

  • allowed only inside of the body of a loop
  • associated with the innermost loop expression
    • if the label part is present, the continue expression is associated with the loop expression labeled with this label
  • continue immediately stops the current iteration and returns the control back to the associated loop so the next iteration can be started
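Labeled break and continue, together with a labeled block, can be sketched as follows. The functions are hypothetical examples, and the search range 1..=9 is an arbitrary choice.

```rust
// Returns the first pair (i, j) from 1..=9 with i * j == target.
fn find_factors(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for i in 1..=9 {
        for j in 1..=9 {
            if i * j > target {
                // Abandon the inner loop and start the next `i`.
                continue 'outer;
            }
            if i * j == target {
                found = Some((i, j));
                // Exit both loops at once.
                break 'outer;
            }
        }
    }
    found
}

// A labeled block: `break 'check value` makes `value` the block's result.
fn classify(n: i32) -> &'static str {
    'check: {
        if n < 0 {
            break 'check "negative";
        }
        if n == 0 {
            break 'check "zero";
        }
        "positive"
    }
}

fn main() {
    println!("{:?} {}", find_factors(12), classify(-3));
}
```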

See Loops and other breakable expressions for greater detail.

if Expressions

Grammar:

if_expression:
    "if" (expression - struct_expression) block_expression
        ("else" (block_expression | if_expression | if_let_expression))?

if_let_expression:
    "if" "let" pattern "=" (scrutinee - lazy_boolean_expression) block_expression
        ("else" (block_expression | if_expression | if_let_expression))?
  • the conditional expression must not be struct_expression and must be of a type bool
  • the scrutinee must not be lazy_boolean_expression
  • all block expressions must have the same type
  • the value of the if expression is the value of its then branch if the conditional expression evaluates to true; otherwise, the value of the if expression is the value of its else branch
  • if the conditional expression evaluates to false and there is no else branch, the value of the if expression is () (in that case the then branch must have the type () as well)
  • in an if let expression, if the value of scrutinee matches pattern the whole expression has the value of its then branch; otherwise, it has the value of its else branch or () if the else branch is missing
    • if let expression
      if let PATTERN = EXPRESSION {
          then_branch
      } else {
          else_branch
      }
      is equivalent to
      match EXPRESSION {
          PATTERN => { then_branch },
          _ => { else_branch },
      }

Example:

fn decide(x: u8, y: &Option<u8>, z: u8) -> u8 {
    if x > 127 {
        x
    } else if let Some(x) = *y {
        x
    } else {
        z
    }
}

fn main() {
    let a = Some(8);
    let b = None;

    println!("{}", decide(200, &a, 42));  // prints "200"
    println!("{}", decide(20, &a, 42));   // prints "8"
    println!("{}", decide(20, &b, 42));   // prints "42"
}

See if and if let expressions for greater detail.

return Expressions

Grammar:

return_expression:
    "return" expression?

See return expressions for greater detail.

Closure Expressions

Grammar:

closure_expression:
    "move"? ("||" | "|" closure_parameters? "|") (
        expression | "->" type_no_bounds block_expression
    )

closure_parameters:
    closure_param ("," closure_param)* ","?

closure_param:
    outer_attribute* pattern_no_top_alt (":" type)?

See Closure expressions for greater detail.

Assignment Expressions

Grammar:

assignment_expression:
    expression "=" expression

compound_assignment_expression:
    expression "+=" expression
    expression "-=" expression
    expression "*=" expression
    expression "/=" expression
    expression "%=" expression
    expression "&=" expression
    expression "|=" expression
    expression "^=" expression
    expression "<<=" expression
    expression ">>=" expression

Operators Overloading

  • a += b is syntactic sugar for AddAssign::add_assign(&mut a, b) (see std::ops::AddAssign trait)
    pub trait AddAssign<Rhs = Self> {
        // Required method
        fn add_assign(&mut self, rhs: Rhs);
    }
  • a -= b is syntactic sugar for SubAssign::sub_assign(&mut a, b) (see std::ops::SubAssign trait)
    pub trait SubAssign<Rhs = Self> {
        // Required method
        fn sub_assign(&mut self, rhs: Rhs);
    }
  • a *= b is syntactic sugar for MulAssign::mul_assign(&mut a, b) (see std::ops::MulAssign trait)
    pub trait MulAssign<Rhs = Self> {
        // Required method
        fn mul_assign(&mut self, rhs: Rhs);
    }
  • a /= b is syntactic sugar for DivAssign::div_assign(&mut a, b) (see std::ops::DivAssign trait)
    pub trait DivAssign<Rhs = Self> {
        // Required method
        fn div_assign(&mut self, rhs: Rhs);
    }
  • a %= b is syntactic sugar for RemAssign::rem_assign(&mut a, b) (see std::ops::RemAssign trait)
    pub trait RemAssign<Rhs = Self> {
        // Required method
        fn rem_assign(&mut self, rhs: Rhs);
    }
  • a &= b is syntactic sugar for BitAndAssign::bitand_assign(&mut a, b) (see std::ops::BitAndAssign trait)
    pub trait BitAndAssign<Rhs = Self> {
        // Required method
        fn bitand_assign(&mut self, rhs: Rhs);
    }
  • a |= b is syntactic sugar for BitOrAssign::bitor_assign(&mut a, b) (see std::ops::BitOrAssign trait)
    pub trait BitOrAssign<Rhs = Self> {
        // Required method
        fn bitor_assign(&mut self, rhs: Rhs);
    }
  • a ^= b is syntactic sugar for BitXorAssign::bitxor_assign(&mut a, b) (see std::ops::BitXorAssign trait)
    pub trait BitXorAssign<Rhs = Self> {
        // Required method
        fn bitxor_assign(&mut self, rhs: Rhs);
    }
  • a <<= b is syntactic sugar for ShlAssign::shl_assign(&mut a, b) (see std::ops::ShlAssign trait)
    pub trait ShlAssign<Rhs = Self> {
        // Required method
        fn shl_assign(&mut self, rhs: Rhs);
    }
  • a >>= b is syntactic sugar for ShrAssign::shr_assign(&mut a, b) (see std::ops::ShrAssign trait)
    pub trait ShrAssign<Rhs = Self> {
        // Required method
        fn shr_assign(&mut self, rhs: Rhs);
    }
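As an illustration of overloading a compound assignment operator, the sketch below implements AddAssign for a hypothetical newtype Meters. The other *Assign traits follow the same pattern.

```rust
use std::ops::AddAssign;

// `Meters` is a made-up type used only for this example.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(u32);

impl AddAssign for Meters {
    fn add_assign(&mut self, rhs: Meters) {
        self.0 += rhs.0;
    }
}

fn main() {
    let mut a = Meters(1);
    a += Meters(2);
    // The operator form above is equivalent to the explicit call:
    AddAssign::add_assign(&mut a, Meters(3));
    assert_eq!(a, Meters(6));
}
```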

See Assignment expressions and Compound assignment expressions for greater detail.

Range Expressions

Grammar:

range_expression:
    expression ".." expression
    expression ".."
    ".." expression
    ".."
    expression "..=" expression
    "..=" expression

See Range expressions for greater detail.

Lazy Boolean Expressions

Grammar:

lazy_boolean_expression:
    expression "||" expression
    expression "&&" expression

The meaning and associativity of each operator (operators lower in the table have higher precedence):

Operator  Meaning      Associativity
||        logical or   left-to-right
&&        logical and  left-to-right

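|| and && differ from | and & in that the right-hand side expression is evaluated only when its value is needed to determine the result: || skips it when the left-hand side is true, and && skips it when the left-hand side is false.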
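The difference between lazy and eager evaluation can be observed with a short sketch. Both functions are hypothetical and exist only for this illustration.

```rust
// With `&&`, `num / den` is evaluated only when `den != 0` holds, so the
// division can never panic here.
fn ratio_exceeds(num: u32, den: u32, limit: u32) -> bool {
    den != 0 && num / den > limit
}

// Counts how many times the right-hand side is evaluated.
fn count_rhs_evaluations(lhs: bool) -> u32 {
    let mut calls = 0;
    let mut rhs = || {
        calls += 1;
        true
    };
    let _lazy = lhs || rhs(); // `rhs()` runs only if `lhs` is false
    let _eager = lhs | rhs(); // `rhs()` always runs
    calls
}

fn main() {
    println!("{}", ratio_exceeds(10, 0, 1));
    println!("{}", count_rhs_evaluations(true));
}
```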

See Lazy boolean operators for greater detail.

Comparison Expressions

Grammar:

comparison_expression:
    expression "==" expression
    expression "!=" expression
    expression ">" expression
    expression "<" expression
    expression ">=" expression
    expression "<=" expression

Operators Overloading

  • a == b is equivalent to ::std::cmp::PartialEq::eq(&a, &b)
  • a != b is equivalent to ::std::cmp::PartialEq::ne(&a, &b)
    • see std::cmp::PartialEq trait
      pub trait PartialEq<Rhs = Self>
      where
          Rhs: ?Sized,
      {
          // Required method
          fn eq(&self, other: &Rhs) -> bool;
      
          // Provided method
          fn ne(&self, other: &Rhs) -> bool { ... }
      }
  • a > b is equivalent to ::std::cmp::PartialOrd::gt(&a, &b)
  • a < b is equivalent to ::std::cmp::PartialOrd::lt(&a, &b)
  • a >= b is equivalent to ::std::cmp::PartialOrd::ge(&a, &b)
  • a <= b is equivalent to ::std::cmp::PartialOrd::le(&a, &b)
    • see std::cmp::PartialOrd trait
      pub trait PartialOrd<Rhs = Self>: PartialEq<Rhs>
      where
          Rhs: ?Sized,
      {
          // Required method
          fn partial_cmp(&self, other: &Rhs) -> Option<Ordering>;
      
          // Provided methods
          fn lt(&self, other: &Rhs) -> bool { ... }
          fn le(&self, other: &Rhs) -> bool { ... }
          fn gt(&self, other: &Rhs) -> bool { ... }
          fn ge(&self, other: &Rhs) -> bool { ... }
      }
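A hand-written implementation of both traits might look like the sketch below. Version is a hypothetical type; in real code, #[derive(PartialEq, PartialOrd)] would usually be enough for such a simple case.

```rust
use std::cmp::Ordering;

// `Version` is a made-up (major, minor) pair used only for this example.
#[derive(Debug)]
struct Version(u32, u32);

impl PartialEq for Version {
    fn eq(&self, other: &Version) -> bool {
        self.0 == other.0 && self.1 == other.1
    }
}

impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Version) -> Option<Ordering> {
        // Compare lexicographically: major first, then minor.
        Some((self.0, self.1).cmp(&(other.0, other.1)))
    }
}

fn main() {
    let old = Version(1, 2);
    let new = Version(1, 10);

    assert!(old < new);
    assert!(PartialOrd::lt(&old, &new)); // what `old < new` stands for
    assert!(old != new);
    assert!(PartialEq::ne(&old, &new)); // what `old != new` stands for
}
```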

See Comparison Operators for greater detail.

Arithmetic and Logical Expressions

Grammar:

arithmetic_or_logical_expression:
    expression "|" expression
    expression "^" expression
    expression "&" expression
    expression "<<" expression
    expression ">>" expression
    expression "+" expression
    expression "-" expression
    expression "*" expression
    expression "/" expression
    expression "%" expression

The meaning and associativity of each operator (operators lower in the table have higher precedence):

Operator  Meaning                              Associativity
|         bitwise/logical or                   left-to-right
^         bitwise/logical xor                  left-to-right
&         bitwise/logical and                  left-to-right
<<, >>    left shift, right shift              left-to-right
+, -      addition, subtraction                left-to-right
*, /, %   multiplication, division, remainder  left-to-right

Operators Overloading

  • a | b is syntactic sugar for BitOr::bitor(a, b) (see std::ops::BitOr trait)
    pub trait BitOr<Rhs = Self> {
        type Output;
    
        // Required method
        fn bitor(self, rhs: Rhs) -> Self::Output;
    }
  • a ^ b is syntactic sugar for BitXor::bitxor(a, b) (see std::ops::BitXor trait)
    pub trait BitXor<Rhs = Self> {
        type Output;
    
        // Required method
        fn bitxor(self, rhs: Rhs) -> Self::Output;
    }
  • a & b is syntactic sugar for BitAnd::bitand(a, b) (see std::ops::BitAnd trait)
    pub trait BitAnd<Rhs = Self> {
        type Output;
    
        // Required method
        fn bitand(self, rhs: Rhs) -> Self::Output;
    }
  • a << b is syntactic sugar for Shl::shl(a, b) (see std::ops::Shl trait)
    pub trait Shl<Rhs = Self> {
        type Output;
    
        // Required method
        fn shl(self, rhs: Rhs) -> Self::Output;
    }
  • a >> b is syntactic sugar for Shr::shr(a, b) (see std::ops::Shr trait)
    pub trait Shr<Rhs = Self> {
        type Output;
    
        // Required method
        fn shr(self, rhs: Rhs) -> Self::Output;
    }
  • a + b is syntactic sugar for Add::add(a, b) (see std::ops::Add trait)
    pub trait Add<Rhs = Self> {
        type Output;
    
        // Required method
        fn add(self, rhs: Rhs) -> Self::Output;
    }
  • a - b is syntactic sugar for Sub::sub(a, b) (see std::ops::Sub trait)
    pub trait Sub<Rhs = Self> {
        type Output;
    
        // Required method
        fn sub(self, rhs: Rhs) -> Self::Output;
    }
  • a * b is syntactic sugar for Mul::mul(a, b) (see std::ops::Mul trait)
    pub trait Mul<Rhs = Self> {
        type Output;
    
        // Required method
        fn mul(self, rhs: Rhs) -> Self::Output;
    }
  • a / b is syntactic sugar for Div::div(a, b) (see std::ops::Div trait)
    pub trait Div<Rhs = Self> {
        type Output;
    
        // Required method
        fn div(self, rhs: Rhs) -> Self::Output;
    }
  • a % b is syntactic sugar for Rem::rem(a, b) (see std::ops::Rem trait)
    pub trait Rem<Rhs = Self> {
        type Output;
    
        // Required method
        fn rem(self, rhs: Rhs) -> Self::Output;
    }
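The sketch below overloads + and * for a hypothetical Vec2 type. Note that the Rhs type parameter lets the right-hand operand have a different type, here i32 for scaling.

```rust
use std::ops::{Add, Mul};

// `Vec2` is a made-up type used only for this example.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: i32,
    y: i32,
}

impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

// `Rhs` differs from `Self`: a vector is scaled by an integer.
impl Mul<i32> for Vec2 {
    type Output = Vec2;

    fn mul(self, factor: i32) -> Vec2 {
        Vec2 { x: self.x * factor, y: self.y * factor }
    }
}

fn main() {
    let a = Vec2 { x: 1, y: 2 };
    let b = Vec2 { x: 10, y: 20 };

    assert_eq!(a + b, Vec2 { x: 11, y: 22 });
    assert_eq!(Add::add(a, b), a + b);
    // `*` binds tighter than `+`, as for the built-in types.
    assert_eq!(a + b * 2, Vec2 { x: 21, y: 42 });
}
```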

See Arithmetic and Logical Binary Operators for greater detail.

Handling Integer Overflow

  • arithmetic operations with integers may overflow
  • in debug builds, overflow causes a panic
  • in release builds (e.g. cargo build --release), overflow checks are disabled by default and the result silently wraps around (modular arithmetic, a.k.a. two's complement wrapping)
  • the Rust standard library provides methods on the primitive integer types that make the overflow behavior explicit:
    // wrapping_* methods do modular arithmetic
    assert_eq!(200u8.wrapping_add(100), 44);
    
    // checked_* methods return None in case of overflow
    assert_eq!(200u8.checked_add(100), None);
    
    // overflowing_* methods return the value and boolean indicating overflow
    assert_eq!(200u8.overflowing_add(100), (44, true));
    
    // saturating_* methods do a saturation
    assert_eq!(200u8.saturating_add(100), u8::MAX);

Type Cast Expressions

Grammar:

type_cast_expression:
    expression "as" type_no_bounds

See Type cast expressions for greater detail.

Negation Expressions

Grammar:

negation_expression:
    "-" expression
    "!" expression

Operators Overloading

  • -a is syntactic sugar for Neg::neg(a) (see std::ops::Neg trait)
    pub trait Neg {
        type Output;
    
        // Required method
        fn neg(self) -> Self::Output;
    }
  • !a is syntactic sugar for Not::not(a) (see std::ops::Not trait)
    pub trait Not {
        type Output;
    
        // Required method
        fn not(self) -> Self::Output;
    }

See Negation operators for greater detail.

Dereference Expressions

Grammar:

dereference_expression:
    "*" expression

Deref Coercion

  • If T implements Deref<Target = U>, and v is a value of type T, then:
    • In immutable contexts, *v (where T is neither a reference nor a raw pointer) is equivalent to *Deref::deref(&v).
    • Values of type &T are coerced to values of type &U.
    • T implicitly implements all the methods of the type U which take the &self receiver.
    • See std::ops::Deref trait:
      pub trait Deref {
          type Target: ?Sized;
      
          // Required method
          fn deref(&self) -> &Self::Target;
      }
  • If T implements DerefMut<Target = U>, and v is a value of type T, then:
    • In mutable contexts, *v (where T is neither a reference nor a raw pointer) is equivalent to *DerefMut::deref_mut(&mut v).
    • Values of type &mut T are coerced to values of type &mut U.
    • T implicitly implements all the (mutable) methods of the type U.
    • See std::ops::DerefMut trait:
      pub trait DerefMut: Deref {
          // Required method
          fn deref_mut(&mut self) -> &mut Self::Target;
      }
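The rules above can be tried out on a small wrapper type. Wrapper and total are hypothetical names introduced only for this sketch.

```rust
use std::ops::{Deref, DerefMut};

// `Wrapper` is a made-up newtype around `Vec<i32>`.
struct Wrapper(Vec<i32>);

impl Deref for Wrapper {
    type Target = Vec<i32>;

    fn deref(&self) -> &Vec<i32> {
        &self.0
    }
}

impl DerefMut for Wrapper {
    fn deref_mut(&mut self) -> &mut Vec<i32> {
        &mut self.0
    }
}

fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let mut w = Wrapper(vec![1, 2]);

    w.push(3); // `Vec::push(&mut self, ..)` is reached through `DerefMut`
    assert_eq!(w.len(), 3); // `Vec::len(&self)` is reached through `Deref`

    // `&Wrapper` is coerced to `&Vec<i32>` and then further to `&[i32]`.
    assert_eq!(total(&w), 6);
}
```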

See The dereference operator for greater detail.

Borrow Expressions

Grammar:

borrow_expression:
    ("&" | "&&") "mut"? expression
  • &a produces a shared reference, and &mut a a mutable reference, to a if a is an expression with an associated memory location (a place expression)
  • the memory location associated with a is switched to a borrowed state for as long as the reference is alive
    • for &a this means that a cannot be mutated, but it can be read or shared again
    • for &mut a this means that a cannot be accessed in any other way until the mutable reference expires (i.e. having two live mutable references to the same place is invalid)
  • if a is a value expression (i.e. it has no associated memory location, like 3 + 5), then &a or &mut a creates a temporary memory location which is then referenced
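A brief sketch of these borrowing rules follows. The function names are hypothetical.

```rust
fn increment(counter: &mut i32) {
    *counter += 1;
}

fn borrow_demo() -> (i32, i32, i32) {
    let mut a = 5;

    // Several shared borrows of `a` may be alive at the same time.
    let r1 = &a;
    let r2 = &a;
    let doubled = *r1 + *r2;

    // `r1` and `r2` are not used past this point, so `a` can now be
    // borrowed mutably.
    increment(&mut a);

    // `3 + 5` is a value expression: a temporary is created and referenced.
    let temp = &(3 + 5);

    (doubled, a, *temp)
}

fn main() {
    println!("{:?}", borrow_demo());
}
```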

See Borrow operators for greater detail.

Error Propagation Expressions

Grammar:

error_propagation_expression:
    expression "?"

See The question mark operator for greater detail.

Array Index Expressions

Grammar:

index_expression:
    expression "[" expression "]"

See Array and slice indexing expressions for greater detail.

Call Expressions

Grammar:

call_expression:
    expression "(" call_params? ")"

call_params:
    expression ("," expression)* ","?

See Call expressions for greater detail.

Field Access Expressions

Grammar:

field_expression:
    expression "." identifier

tuple_indexing_expression:
    expression "." integer_literal

Method-Call Expressions

Grammar:

method_call_expression:
    expression "." path_expr_segment "(" call_params? ")"

The expression in the grammar above is called the receiver.

Here is how the receiver and the method are resolved:

  1. Build a list, L, of candidate receiver types.
    1. Repeatedly dereference receiver's expression type, add each encountered type to L.
    2. Let T be the last type in L. Apply unsized coercion to T and add the result, if any, to L.
  2. For each T in L, add &T and &mut T to L immediately after T.
  3. For every T in L, search for a visible method with a receiver of type T in these places:
    1. Methods implemented directly on T.
    2. Any of the methods provided by a visible trait implemented by T.
      • If T is a type parameter, methods provided by trait bounds on T are looked up first.
      • Then all remaining methods in scope are looked up.
  4. If the lookup fails or is ambiguous, an error is issued.
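The resolution steps above can be followed on a concrete case. In this sketch Counter is a hypothetical type, and the comments list the candidate receiver types in the order in which they would be tried.

```rust
struct Counter {
    n: u32,
}

impl Counter {
    fn get(&self) -> u32 {
        self.n
    }

    fn bump(&mut self) {
        self.n += 1;
    }
}

fn resolution_demo() -> (u32, usize) {
    let mut boxed = Box::new(Counter { n: 0 });
    // Candidates: Box<Counter>, &Box<Counter>, &mut Box<Counter>,
    //             Counter, &Counter, &mut Counter  <- `bump` is found here
    boxed.bump();

    let shared = &boxed;
    // Dereferencing `&Box<Counter>` eventually reaches `Counter`;
    // `get` is found with the `&Counter` receiver.
    let n = shared.get();

    let array = [3, 1, 2];
    // `len` is defined on slices: `[i32; 3]` is unsized to `[i32]`,
    // and the method is found with the `&[i32]` receiver.
    (n, array.len())
}

fn main() {
    println!("{:?}", resolution_demo());
}
```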

See Method-call expressions for greater detail.

Await Expressions

Grammar:

await_expression:
    expression "." "await"

See Await expressions for greater detail.

Path Expressions

Grammar:

path_expression:
    path_in_expression
    qualified_path_in_expression

See Path expressions for greater detail.

match Expressions

Grammar:

match_expression:
    "match" scrutinee "{" inner_attribute* match_arms? "}"

scrutinee:
    expression - struct_expression

match_arms:
    (match_arm "=>" (expression_without_block "," | expression_with_block ","?))*
        match_arm "=>" expression ","?
match_arm:
    outer_attribute* pattern match_arm_guard?
match_arm_guard:
    "if" expression

A match expression branches on a pattern.

  • the scrutinee expression and the patterns must have the same type
  • all match arms must have the same type, and this type is the type of the whole match expression

If a scrutinee expression is a value expression:

  1. it is first evaluated into a temporary location
  2. the resulting value is sequentially (left-to-right, top-to-bottom) compared to the patterns until a match is found
    • if the pattern has a match guard associated with it and the pattern matches, the match guard is evaluated
      • if it is true, we have a match
      • otherwise, we have no match and the match-finding process continues (note that in p1 | p2 if g, match guard if g is applied to both p1 and p2)
      • if the match guard refers to the variables bound within the pattern
        1. a shared reference is taken to the part of the scrutinee the variable matches on (this prevents mutation inside guards)
        2. this shared reference is then used when accessing the variable during the match guard evaluation
        3. if guard evaluates to true, the value is moved or copied from the scrutinee into the variable
    • every binding in each | separated pattern must appear in all of the patterns in the arm
  3. any variables bound by the first matching pattern are assigned to local variables in the arm's block
    • variables are scoped to the match guard and the arm's expression
    • the binding mode (copy, move or reference) depends on the pattern
    • every binding of the same name must have the same type and have the same binding mode
  4. control enters the block
  5. the value returned by the block is the value of the match expression

If a scrutinee expression is a place expression the same logic as before is applied with these differences:

  • a temporary location is not allocated
  • a by-value binding may copy or move from the memory location
  • lifetime of a match inherits the lifetime of the place expression

Example:

#[derive(Clone, Debug)]
enum Foo {
    A,
    B(u8),
    C(u8, u8),
}

use crate::Foo::{A, B, C};

fn test_match(obj: &Foo) -> (u8, u8) {
    match *obj {
        A => (0, 0),
        B(x @ 1) | B(x @ 2) => (1, x),
        B(x) | C(_, x) if x >= 3 => (2, x),
        B(x) => (3, x),
        C(x, _) => (4, x),
    }
}

fn main() {
    let objs = vec![
        A,        // prints "(0, 0)"
        B(0),     // prints "(3, 0)"
        B(1),     // prints "(1, 1)"
        B(2),     // prints "(1, 2)"
        B(3),     // prints "(2, 3)"
        B(4),     // prints "(2, 4)"
        C(0, 0),  // prints "(4, 0)"
        C(1, 3),  // prints "(2, 3)"
        C(3, 1),  // prints "(4, 3)"
    ];

    for x in objs {
        println!("{:?}", test_match(&x));
    }
}

See match expressions, Patterns, Place Expressions and Value Expressions, and Binding modes for greater detail.

Block Expressions

Grammar:

block_expression:
    "{" inner_attribute* statements? "}"
statements:
    statement+ expression_without_block?
    expression_without_block

async_block_expression:
    "async" "move"? block_expression

unsafe_block_expression:
    "unsafe" block_expression

The value and type of block_expression is the value and type of its final expression (the trailing expression_without_block) if it is present. Otherwise, the value and type of block_expression is ().
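Both cases can be seen in a minimal sketch (block_demo is a hypothetical name):

```rust
fn block_demo() -> (i32, ()) {
    // The block ends with an expression, so it has that expression's
    // value and type.
    let with_tail = {
        let a = 2;
        let b = 3;
        a * b
    };

    // The block ends with a statement, so its value and type are `()`.
    let without_tail = {
        let _unused = 1;
    };

    (with_tail, without_tail)
}

fn main() {
    println!("{:?}", block_demo());
}
```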

See Block expressions for greater detail.

Atomic Expressions

Grammar:

atomic_expression:
    underscore_expression
    literal_expression
    tuple_expression
    array_expression
    struct_expression

underscore_expression:
    "_"

literal_expression:
    char_literal
    string_literal
    raw_string_literal
    byte_literal
    byte_string_literal
    raw_byte_string_literal
    integer_literal
    float_literal
    "true" | "false"

tuple_expression:
    "(" tuple_elements? ")"
tuple_elements:
    (expression ",")+ expression?

array_expression:
    "[" array_elements? "]"
array_elements:
    expression ("," expression)* ","?
    expression ";" expression

struct_expression:
    path_in_expression "{" (struct_expr_fields | struct_base)? "}"
    path_in_expression "(" (expression ("," expression)* ","?)? ")"
    path_in_expression

struct_expr_fields:
    struct_expr_field ("," struct_expr_field)* ("," struct_base | ","?)
struct_expr_field:
    outer_attribute* (
        identifier |
        (identifier | integer_literal) ":" expression
    )
struct_base:
    ".." expression
  • the same restrictions hold for integer_literal in struct_expr_field as for integer_literal in tuple_indexing_expression

See _ expressions, Literal expressions, Tuple expressions, Array expressions, and Struct expressions for greater detail.

Grouped Expressions

Grammar:

grouped_expression:
    "(" expression ")"

See Grouped expressions for greater detail.

Statements

Grammar:

statement:
    ";"
    item
    let_statement
    expression_statement
    macro_invocation_semi

item:
    outer_attribute* vis_item
    macro_item

vis_item:
    visibility? (
        declaration_item |
        trait |
        implementation |
        module |
        extern_crate
    )
macro_item:
    macro_invocation_semi
    macro_rules_definition

expression_statement:
    expression_without_block ";"
    expression_with_block ";"?

See Statements, Item declarations and Items for greater detail.

Patterns

Grammar:

pattern:
    "|"? pattern_no_top_alt ("|" pattern_no_top_alt)*

pattern_no_top_alt:
    pattern_without_range
    range_pattern

pattern_without_range:
    literal_pattern
    identifier_pattern
    wildcard_pattern
    rest_pattern
    reference_pattern
    struct_pattern
    tuple_struct_pattern
    tuple_pattern
    grouped_pattern
    slice_pattern
    path_pattern
    macro_invocation

literal_pattern:
    "true" | "false"
    char_literal
    byte_literal
    string_literal
    raw_string_literal
    byte_string_literal
    raw_byte_string_literal
    "-"? integer_literal
    "-"? float_literal

identifier_pattern:
    "ref"? "mut"? identifier ("@" pattern_no_top_alt)?

wildcard_pattern:
    "_"

rest_pattern:
    ".."

range_pattern:
    range_inclusive_pattern
    range_from_pattern
    range_to_inclusive_pattern
    obsolete_range_pattern

range_inclusive_pattern:
    range_pattern_bound "..=" range_pattern_bound
range_from_pattern:
    range_pattern_bound ".."
range_to_inclusive_pattern:
    "..=" range_pattern_bound
obsolete_range_pattern:
    range_pattern_bound "..." range_pattern_bound

range_pattern_bound:
    char_literal
    byte_literal
    "-"? integer_literal
    "-"? float_literal
    path_expression

reference_pattern:
    ("&" | "&&") "mut"? pattern_without_range

struct_pattern:
    path_in_expression "{" struct_pattern_elements? "}"

struct_pattern_elements:
    struct_pattern_fields ("," struct_pattern_et_cetera?)?
    struct_pattern_et_cetera

struct_pattern_fields:
    struct_pattern_field ("," struct_pattern_field)*
struct_pattern_field:
    outer_attribute* (
        integer_literal ":" pattern |
        identifier ":" pattern |
        "ref"? "mut"? identifier
    )

struct_pattern_et_cetera:
    outer_attribute* ".."

tuple_struct_pattern:
    path_in_expression "(" tuple_struct_items? ")"
tuple_struct_items:
    pattern ("," pattern)* ","?

tuple_pattern:
    "(" tuple_pattern_items? ")"
tuple_pattern_items:
    pattern ","
    rest_pattern
    pattern ("," pattern)+ ","?

grouped_pattern:
    "(" pattern ")"

slice_pattern:
    "[" slice_pattern_items? "]"
slice_pattern_items:
    pattern ("," pattern)* ","?

path_pattern:
    path_expression

See Patterns for greater detail.

Traits

Grammar:

trait:
    "unsafe"? "trait" identifier generic_params? (":" type_param_bounds?)?
        where_clause? "{" inner_attribute* associated_item* "}"

type_param_bounds:
    type_param_bound ("+" type_param_bound)* "+"?
type_param_bound:
    lifetime
    trait_bound
trait_bound:
    "?"? ("for" generic_params)? type_path
    "(" "?"? ("for" generic_params)? type_path ")"

lifetime_bounds:
    (lifetime "+")* lifetime?
lifetime:
    lifetime_or_label
    "'static"
    "'_"

See Traits, Trait and lifetime bounds, and Associated Items for greater detail.

Implementations

Grammar:

implementation:
    inherent_impl
    trait_impl

inherent_impl:
    "impl" generic_params? type where_clause?
        "{" inner_attribute* associated_item* "}"
trait_impl:
    "unsafe"? "impl" generic_params? "!"? type_path "for" type where_clause?
        "{" inner_attribute* associated_item* "}"

associated_item:
    outer_attribute* (
        macro_invocation_semi |
        (visibility? (type_alias | constant_item | function))
    )

An implementation is an item that associates items with an implementing type.

  • this happens inside of an impl block
  • multiple impl blocks per implementing type are possible

Inherent implementations:

  • can contain associated functions, including methods, and associated constants
  • a type can have multiple inherent implementations
  • an inherent implementation must be defined in the same crate as the implementing type

See Implementations for greater detail.

Associated Functions and Methods

Associated functions are functions associated with a type.

Methods are associated functions with self as the first parameter. The type of self, S, can be specified explicitly, but it is subject to the following restrictions:

  • Let T be the implementing type and 'a be an arbitrary lifetime.
  • Then S is one of Self or P, where
    • Self refers to a type resolving to T, such as an alias of T, Self, or an associated type projection resolving to T;
    • P is one of &'a S, &'a mut S, Box<S>, Rc<S>, Arc<S>, or Pin<P>.

When the type of self is not specified:

  • self is equivalent to self: Self
  • &'a self is equivalent to self: &'a Self
  • &'a mut self is equivalent to self: &'a mut Self
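Several of the permitted self types can be sketched on one type. Node and its methods are hypothetical and serve only as an illustration.

```rust
use std::rc::Rc;

// `Node` is a made-up type used only for this example.
struct Node {
    id: u32,
}

impl Node {
    // Shorthand for `self: &Self`.
    fn id(&self) -> u32 {
        self.id
    }

    // Shorthand for `self: &mut Self`.
    fn set_id(&mut self, id: u32) {
        self.id = id;
    }

    // Shorthand for `self: Self`; the method consumes the object.
    fn into_id(self) -> u32 {
        self.id
    }

    // Explicitly typed receivers.
    fn boxed_id(self: Box<Self>) -> u32 {
        self.id
    }

    fn shared_id(self: Rc<Self>) -> u32 {
        self.id
    }
}

fn main() {
    let mut node = Node { id: 1 };
    node.set_id(7);
    assert_eq!(node.id(), 7);
    assert_eq!(node.into_id(), 7); // `node` is moved here

    assert_eq!(Box::new(Node { id: 2 }).boxed_id(), 2);
    assert_eq!(Rc::new(Node { id: 3 }).shared_id(), 3);
}
```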

Explanation by example:

#[derive(Debug)]
struct FsItem {
    name: String,
    size: usize,
}

impl FsItem {
    fn new() -> Self {
        Self {
            name: String::from(""),
            size: 0usize,
        }
    }

    fn create(name: String, size: usize) -> FsItem {
        FsItem { name, size }
    }

    fn name(&self) -> String {
        String::from(self.name.as_str())
    }

    fn size(&self) -> usize {
        self.size
    }

    fn rename(&mut self, name: String) {
        self.name = name;
    }
}

fn main() {
    let mut fsitem1 = FsItem::new();
    let fsitem2 = FsItem::create(String::from("/etc/fsitem2"), 16);

    println!("{fsitem1:#?}");
    println!("{fsitem2:#?}");

    println!("fsitem1 = {{ {}, {} }}", fsitem1.name(), fsitem1.size());

    fsitem1.rename(String::from("/etc/fsitem1"));

    println!("fsitem1 = {{ {}, {} }}", fsitem1.name(), fsitem1.size());
}
  • the impl FsItem block encloses the definitions of functions and methods associated with struct FsItem
  • new() and create() are associated functions of struct FsItem that do not take self, so they must be called through a path: FsItem::new(), FsItem::create()
  • name(), size(), and rename() are methods
  • Self refers to the implementing type, here struct FsItem
  • self refers to the object of the implementing type; at the call site it is passed implicitly as the receiver of the method
    • e.g. x.f(a, b) translates to T::f(x, a, b) where T is the type of the receiver or trait and x matches with self
  • mut in &mut self denotes that the method may modify the object referred to by self
  • & denotes that the method borrows the object referred to by self instead of taking ownership of it, so the object can still be used after the call
    • e.g. fsitem1 is borrowed by name() and by size(), and remains usable afterwards

See Associated Items for greater detail.

Generics

Grammar:

generic_params:
    "<" ">"
    "<" (generic_param ",")* generic_param ","? ">"
generic_param:
    outer_attribute* (lifetime_param | type_param | const_param)
lifetime_param:
    lifetime_or_label (":" lifetime_bounds)?
type_param:
    identifier (":" type_param_bounds?)? ("=" type)?
const_param:
    "const" identifier ":" type
        ("=" (block_expression | identifier | "-"? literal_expression))?

where_clause:
    "where" (where_clause_item ",")* where_clause_item?
where_clause_item:
    lifetime ":" lifetime_bounds
    ("for" generic_params)? type ":" type_param_bounds?

See Generic parameters for greater detail.

Macros

Grammar:

macro_invocation:
    simple_path "!" delim_token_tree
macro_invocation_semi:
    simple_path "!" "(" token_tree* ")" ";"
    simple_path "!" "[" token_tree* "]" ";"
    simple_path "!" "{" token_tree* "}"

delim_token_tree:
    "(" token_tree* ")"
    "[" token_tree* "]"
    "{" token_tree* "}"
token_tree:
    token - delimiters
    delim_token_tree

Examples:
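A minimal sketch of macro invocations in the forms described by the grammar above. square is a hypothetical macro defined only for this illustration; vec is the standard library macro.

```rust
// `macro_rules!` is itself invoked as an item (`macro_invocation_semi`);
// with braces as delimiters no trailing semicolon is needed.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn macro_demo() -> (i32, Vec<i32>, Vec<i32>) {
    // Invocations in expression position (`macro_invocation`).
    let a = square!(3);
    // Any of the three delimiter pairs may be used.
    let b = vec![1, 2, 3];
    let c = vec!(4, 5);
    (a, b, c)
}

fn main() {
    // Invocation in statement position: the parenthesized form is
    // followed by a semicolon.
    println!("{:?}", macro_demo());
}
```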

See Macros for greater detail.

Selected Macros from the Standard Library

format

Creates a String by interpolating a format string literal. Example:

let (x, y) = (1, 2);

format!("x = {x}, y = {y}");     // "x = 1, y = 2"
format!("z = {z}", z = 3);       // "z = 3"
format!("Hello, {}!", "World");  // "Hello, World!"

Note

The format! macro takes its arguments by reference, i.e. format!("{x}") borrows x instead of moving it. The other macros that use a format string borrow their arguments as well.
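Because the arguments are only borrowed, a non-Copy value stays usable after formatting. A minimal sketch (format_demo is a hypothetical name):

```rust
fn format_demo() -> (String, String) {
    let owner = String::from("Ferris");

    // `owner` is borrowed, not moved, by the macro ...
    let first = format!("Hello, {owner}!");
    // ... so it can be used again afterwards.
    let second = format!("{} has {} letters", owner, owner.len());

    (first, second)
}

fn main() {
    println!("{:?}", format_demo());
}
```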

See std::format for greater detail.

Format String

A format string is a string containing markers with a format specification. A marker starts with { and ends with }. During interpolation, each marker is replaced by the string representation of the corresponding value. Additional characters between { and } specify how the value is formatted.

Selected markers:

Marker  Meaning
{}      the value's type must implement std::fmt::Display trait
{:?}    the value's type must implement std::fmt::Debug trait
{:#?}   {:?} with pretty print flag set
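
The three markers can be compared side by side on one value. The sketch below assumes a hypothetical Point type with a hand-written Display implementation and a derived Debug implementation.

```rust
use std::fmt;

#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// `{}` uses this implementation.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

// Renders the same value with `{}`, `{:?}`, and `{:#?}`.
fn render(p: &Point) -> [String; 3] {
    [format!("{p}"), format!("{p:?}"), format!("{p:#?}")]
}
```

For Point { x: 1, y: 2 }, the first string is "(1, 2)", the second is "Point { x: 1, y: 2 }", and the third spreads the fields over several indented lines.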

See std::fmt for greater detail.

Additional references:

format_args

Creates a std::fmt::Arguments object containing a precompiled format string and its arguments. Example:

let args = format_args!("{} + {} = {}", 1, 2, 3);

println!("{:?}", args);

See std::format_args and std::fmt::Arguments for greater detail.

println

Prints the interpolated format string, followed by a newline character, to the standard output. Examples:

println!("Hello, World!");
println!("Hello, {}!", "World");

See std::println for greater detail.

eprintln

Like println, but prints to the standard error output.

See std::eprintln for greater detail.

dbg

Prints to the standard error output and returns the value of the given expression. The value is moved. The type of the value must implement the std::fmt::Debug trait. Examples:

let a = dbg!(2 + 5);  // Prints: [src/main.rs:2] 2 + 5 = 7

#[derive(Debug)]
struct NoCopy(u32);

let a = NoCopy(8);
let _ = dbg!(a);
let _ = dbg!(a);  // Error! (`a` was moved)
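
When the value is still needed afterwards, one option is to pass a reference, or to rebind the value that dbg! returns. A sketch, with the function name inspect_twice chosen only for this example:

```rust
#[derive(Debug)]
struct NoCopy(u32);

fn inspect_twice() -> u32 {
    let a = NoCopy(8);
    dbg!(&a);         // borrows `a`; `a` stays usable
    let b = dbg!(a);  // moves `a` and hands it back as `b`
    b.0
}
```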

See std::dbg for greater detail.

Attributes

Grammar:

inner_attribute:
    "#" "!" "[" attr "]"

outer_attribute:
    "#" "[" attr "]"

attr:
    simple_path attr_input?
attr_input:
    delim_token_tree
    "=" expression

See Attributes for greater detail.

Selected Attributes Supported by Rust

derive

Allows new items to be automatically generated for data structures. Example:

// Implement `std::fmt::Debug` trait for `Point`
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

let p = Point {x: 1, y: 2};

println!("{p:#?}");  // Pretty print `p`

See Derive for greater detail.

repr

Specifies the layout for user-defined composite types (structs, enums, unions). Possible representations are:

  • Rust (default)
  • C
  • primitive
  • transparent

Example:

// Uses default (Rust) representation
struct Foo {
    bar: isize,
    baz: u8,
}

// C representation
#[repr(C)]
struct Pixel {
    x: u32,
    y: u32,
}
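
The primitive and transparent representations can be sketched in the same way. The types Color and Meters are hypothetical; the sizes in the comments follow from the chosen representations.

```rust
use std::mem::size_of;

// Primitive representation: the discriminant is stored as a `u8`.
#[repr(u8)]
#[derive(Clone, Copy)]
enum Color {
    Red = 1,
    Green = 2,
    Blue = 3,
}

// Transparent representation: same layout as the single field.
#[repr(transparent)]
struct Meters(f64);

// Returns (1, 8): one byte for `Color`, eight bytes for `Meters`.
fn sizes() -> (usize, usize) {
    (size_of::<Color>(), size_of::<Meters>())
}

// A field-less enum can be cast to its primitive representation.
fn discriminant(c: Color) -> u8 {
    c as u8
}
```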

See Representations for greater detail.

Rust's Module System

Grammar:

crate:
    utf8bom? shebang? inner_attribute* item*

module:
    "unsafe"? "mod" identifier (";" | "{" inner_attribute* item* "}")

extern_crate:
    "extern" "crate" crate_ref as_clause? ";"

crate_ref:
    identifier
    "self"
as_clause:
    "as" (identifier | "_")

Rust's module system consists of these features: packages, crates, modules and use, and paths.

Packages

A package is a feature of Cargo that allows you to build, test, and share crates. It is a bundle of one or more crates, specifically:

  • a package must contain at least one crate
  • a package can contain as many binary crates as needed, but at most one library crate

A new package can be created using the cargo new command. The layout of a package directory:

foopkg/
    Cargo.toml
    /src
        lib.rs
        main.rs
        /bin
            prog1.rs
            prog2.rs
            /prog3
                main.rs
                foomod.rs
    /benches
        bench1.rs
        /bench2
            main.rs
            benchmod.rs
    /examples
        example1.rs
        /example2
            main.rs
            barmod.rs
    /tests
        test1.rs
        /test2
            main.rs
            testmod.rs
  • Cargo.toml is a manifest that tells cargo how to build crates in a package
    • defaults can be overridden here
  • the src directory contains source code of package crates
    • the default library crate is lib.rs (cargo produces a library with the same name as the package)
    • the default executable is main.rs (cargo produces an executable with the same name as the package)
    • the bin directory is a place for other executables
  • the benches directory contains benchmarks
  • the examples directory contains examples
  • the tests directory contains integration tests
  • if a binary executable, bench, example, or test consists of multiple source files (in our case src/bin/prog3, benches/bench2, examples/example2 and tests/test2), put them into a directory:
    • the main.rs file determines the crate root
    • other files are considered as modules
    • the name of a directory becomes the name of the executable

Crates

A crate is the smallest compilation unit, a tree of modules. There are two types of crates:

  1. A binary crate, which, when compiled, produces an executable. A binary crate must contain the main function serving as its entry point. The main function must fulfill these requirements:
    • must take no arguments
    • must not declare any trait or lifetime bounds
    • must not have any where clauses
    • its return type must implement the Termination trait (types from the standard library implementing the Termination trait are (), !, Infallible, ExitCode, Result<T, E> where T: Termination, E: Debug)
  2. A library crate is a crate without a main function. When compiled, it produces a library (by default statically linked) that exposes publicly visible items as a part of its API.

Attributes that can be applied at the crate level:

  • #![no_main] – emit no main for an executable binary
  • #![crate_name = "my_crate"] – set the name of a crate to my_crate
    • hyphens are disallowed in crate names
      • if the crate name is resolved from the package name, Cargo will replace all hyphens with underscores
  • #![no_std]
    • prevents std from being added to the extern prelude
      • note that this does not prevent std from being linked in using extern crate std;
    • affects which module is used to make up the standard library prelude (see Preludes)
    • injects the core crate into the crate root instead of std
    • pulls in all macros exported from core in the macro_use prelude

The source file from which the Rust compiler starts, and which makes up the root module of the crate, is called the crate root.

  • the contents of the source file form a module named crate, which sits at the root of the crate's module structure, known as the module tree

The root module of a crate is the top-level of the crate's module tree.

  • the root module is anonymous (from the point of view of paths within the module) and it is accessible via crate path

Any item within a crate has a canonical module path denoting its location within the crate's module tree.

Handling the standard library:

  • automatically included in the crate root module
  • the std crate is added to the root
  • implicit macro_use attribute pulls in all macros exported in std into the macro_use prelude
  • core and std are added to the extern prelude

Extern Crates

An extern crate declaration specifies a dependency on an external crate:

  • the external crate is bound into the declaring scope as the identifier provided in the declaration
  • if the declaration appears in the crate root, the crate name is also added to the extern prelude so it will be automatically available in scope in all modules
  • the as clause can be used to bind the imported crate to a different name
  • the self crate may be imported
    • a binding to the current crate is created
    • as clause is mandatory in this case
  • extern crate foo as _; specifies a dependency on an external crate without binding its name in scope
    • useful for crates that only need to be linked
  • the no_link attribute prevents a crate from being linked
    • useful to load a crate to only access its macros

Linkage at compile time and at runtime:

  1. If the extern crate declaration has a crate_name attribute, let NAME be the value of this attribute. Otherwise, let NAME be from extern crate NAME.
  2. At compile time, resolve SONAME from the compiler's library path and NAME.
  3. Pass a runtime linkage requirement for SONAME to the linker, so that the crate is loaded at runtime.

Example:

extern crate pcre;
extern crate std;             // Same as `extern crate std as std;`
extern crate std as ruststd;  // Linking to `std` under `ruststd`

Modules

A module is a container of zero or more items.

  • mod foo { ... } introduces a new named module, foo, into the tree of modules making up a crate.
  • Modules can nest arbitrarily.
  • Modules and types share the same name space.
  • In unsafe mod foo { ... }, unsafe is rejected at a semantic level (that is, it is useful only when fed into a macro).
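
A short sketch of nested inline modules; the module and function names are invented for this example.

```rust
mod geometry {
    // Nested inline module, reachable as `geometry::shapes`.
    pub mod shapes {
        pub fn unit_square_area() -> f64 {
            1.0
        }
    }

    // Items of a child module are reached through a relative path.
    pub fn doubled_area() -> f64 {
        shapes::unit_square_area() * 2.0
    }
}

// Declaring `struct geometry;` next to `mod geometry` would be rejected,
// because modules and types share the same name space.

fn total() -> f64 {
    geometry::shapes::unit_square_area() + geometry::doubled_area()
}
```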

Every source file is a module, but not every module needs its own source file.

A module without body, i.e. mod foo; statement, is loaded from an external file.

  • The path to the file mirrors the logical module path, unless the module has a path attribute.
  • Ancestor module path components are directories.
  • The module's contents are in a file with the name of the module plus the .rs extension.
    • If the file name containing the module's contents is mod.rs, the module file name is the name of the directory containing this mod.rs file.
    • For crate::foo, it is not allowed to have both foo.rs and foo/mod.rs.
    • Since rustc 1.30, using mod.rs files is considered obsolete.
  • If a path attribute is given, it specifies the location of an external file with the module's content to be loaded. There are three ways of handling path attributes:
    1. A path attribute is not inside inline module block, e.g.
      // File `src/a/b.rs` (1) or `src/a/mod.rs` (2):
      #[path = "foo.rs"]
      mod c;
      // The path to the `c` module is:
      //   - `crate::a::b::c` in case of (1)
      //   - `crate::a::c` in case of (2)
      • the file path is relative to the directory the source file is located, e.g. the external file with the c module's content is located at src/a/foo.rs
    2. A path attribute is inside inline module block and the source file where the path attribute is located is the root module (like lib.rs or main.rs) or mod.rs, e.g.
      // File `src/lib.rs` (1) or `src/a/mod.rs` (2):
      mod inline {
          #[path = "other.rs"]
          mod inner;
      }
      // The `other.rs` file will be searched at:
      //   - `src/inline/other.rs` in case of (1)
      //       * the path to the `inner` module is `crate::inline::inner`
      //   - `src/a/inline/other.rs` in case of (2)
      //       * the path to the `inner` module is `crate::a::inline::inner`
      • the file path is relative to the directory of root module file or mod.rs file including the inline module components as directories
    3. A path attribute is inside inline module block and the source file where the path attribute is located is neither the root module nor mod.rs, e.g.
      // File `src/a/b.rs`:
      mod inline {
          #[path = "other.rs"]
          mod inner;
      }
      // The `other.rs` file will be searched at `src/a/b/inline/other.rs`.
      // The path to the `inner` module is `crate::a::b::inline::inner`.
      • the file path is relative to the directory of the source file with the path attribute, including the directory with the same name as the module that corresponds to the source file and the inline module components as directories
      • rules can be combined:
        // File `src/main.rs`:
        mod a;
        
        fn main() {
            crate::a::b::thread::thread_local::join();
        }
        
        // File `src/a.rs`:
        pub mod b;
        
        // File `src/a/b.rs`:
        #[path = "threads"]
        pub mod thread {
            #[path = "tls.rs"]
            pub mod thread_local;
        }
        // The `tls.rs` file will be searched at `src/a/threads/tls.rs`.
        // The path to the `thread_local` module is `crate::a::b::thread::thread_local`.
        
        // File `src/a/threads/tls.rs`:
        pub fn join() {}

A module can have attributes.

  • attributes in a source file apply to the containing module
  • attributes applied to the anonymous root module apply also to the crate as a whole
  • the built-in attributes that have meaning on modules: cfg, deprecated, doc, allow, warn, deny, forbid, path, and no_implicit_prelude

Example:

// File `src/lib.rs`:
//   - this is the crate root
//   - the path accessing this module is `crate` (this module is the root
//     module)
// This module is loaded from the external file, `src/util.rs`. The effect is
// the same as `mod util { /* contents of src/util.rs */ }`:
//   - the path to the `util` module is `crate::util`
mod util;
// Like `util` module. The path to the `lexer` module is `crate::lexer`:
mod lexer;

// File `src/util.rs`:
//   - the path accessing this module is `crate::util`
//   - the path to the just loaded `config` module is `crate::util::config`
mod config;

// File `src/util/config.rs`:
//   - the path accessing this module is `crate::util::config`
struct Config {
    pub (crate) vmem_limit: usize,
    pub (crate) stack_limit: usize,
    pub (crate) nfiles_limit: usize,
}

// File `src/lexer/mod.rs`:
//   - the path accessing this module is `crate::lexer`
//   - the path to the `token` module is `crate::lexer::token`
mod token;

//   - the path to the `scan` function is `crate::lexer::scan`
pub fn scan() -> token::Token {
    token::Token(42)
}

// File `src/lexer/token.rs`:
//   - the path accessing this module is `crate::lexer::token`
//   - the path to the `Token` struct is `crate::lexer::token::Token`
pub struct Token(pub u8);  // `pub` is needed so that `crate::lexer::scan` can name and construct it

See Package Layout, The Manifest Format, Cargo Targets, Crates and source files, Trait and lifetime bounds, Where clauses, Trait std::process::Termination, Never type, Enum std::convert::Infallible, Struct std::process::ExitCode, Modules, Conditional compilation, Diagnostic attributes, The #[doc] attribute, Preludes, Paths, Extern crate declarations, Macros By Example, and Attributes for greater detail.

Preludes

A prelude is a collection of names that are automatically brought into scope of every module in a crate.

  • these prelude names are not part of the module itself
    • they are implicitly queried during name resolution
    • e.g. Box is valid but self::Box is not because Box is not a member of the current module
  • the no_implicit_prelude attribute
    • applicable at the crate level (root module) or on a module
    • prevents the standard library prelude, extern prelude, macro_use prelude, and tool prelude from being automatically brought into scope for that module or any of its descendants
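
A sketch of both points, assuming the 2018 or a newer edition. The names boxed, bare, and empty are made up for the example.

```rust
// `Box` comes from the standard library prelude, not from this module,
// so `Box::new` resolves here while `self::Box::new` would not.
fn boxed() -> Box<i32> {
    Box::new(1)
}

#[no_implicit_prelude]
mod bare {
    // `Vec` is not in scope in this module, so it is named by a global path.
    // Primitive types such as `u8` come from the language prelude and
    // remain available.
    pub fn empty() -> ::std::vec::Vec<u8> {
        ::std::vec::Vec::new()
    }
}
```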

There are several kinds of preludes:

  • Standard library prelude
    • each crate has it
    • consists of the names from a single standard library module
      • if no_std attribute is applied, this module is core::prelude::rust_EDITION
      • otherwise, this module is std::prelude::rust_EDITION
  • Extern prelude
    • consists of these crates:
      • external crates imported with extern crate in the root module
        • if extern crate foo as bar is used, the external crate foo is added to the extern prelude under the name bar
      • external crates provided to the compiler (rustc --extern)
      • the core crate
      • the std crate as long as the no_std attribute is not specified in the crate root
    • Cargo brings proc_macro into the extern prelude for proc-macro crates only
  • Language prelude
    • is always in scope
    • includes names of types and attributes that are built-in to the language
  • macro_use prelude
    • macros from external crates, imported by the macro_use attribute applied to an extern crate, are included into the macro_use prelude
  • Tool prelude
    • includes tool names for external tools in the type name space
    • in the tool prelude, each tool resides in its own name space
    • when a tool attribute is recognized by the compiler (i.e. when a tool is found in the tool prelude), the compiler accepts it without any warning
      • rustc currently recognizes the tools rustfmt and clippy
    • when a tool attribute is recognized by a tool, the tool is responsible for further processing and interpretation of the attribute
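
Tool attributes look like ordinary attributes with a tool name as the first path segment. A sketch using the two tools named above; the function names are invented, and clippy::needless_return is a Clippy lint that rustc itself does not interpret.

```rust
// `rustfmt` is a tool name; the attribute asks rustfmt to keep this layout.
#[rustfmt::skip]
fn identity() -> [[u8; 2]; 2] {
    [[1, 0],
     [0, 1]]
}

// `clippy` is a tool name; rustc accepts the lint path and leaves its
// interpretation to Clippy.
#[allow(clippy::needless_return)]
fn answer() -> u32 {
    return 42;
}
```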

See Preludes, Extern crate declarations, Use declarations, Macros By Example, Namespaces, and Attributes for greater detail.

Visibility

Grammar:

visibility:
    "pub"
    "pub" "(" "crate" ")"
    "pub" "(" "self" ")"
    "pub" "(" "super" ")"
    "pub" "(" "in" simple_path ")"
  • everything in Rust is private by default except
    • associated items in a pub trait
    • enum variants in a pub enum
    • items marked with pub
  • an item is accessible in two cases:
    1. a public (pub) item can be accessed externally from some module m only if all of the item's ancestor modules can also be accessed from m
    2. a private item can be accessed by the current module and its descendants
  • availability of public items can be further restricted:
    • pub(crate) makes an item visible within the current crate
    • pub(self) is equivalent to pub(in self) or not using pub at all
    • pub(super) is equivalent to pub(in super)
    • pub(in path) makes an item visible within the provided path
      • path must be an ancestor module of the item whose visibility is being declared
      • pub(in self) makes an item visible to the current module
      • pub(in super) makes an item visible to the parent module
  • pub together with use can be used for re-exporting items
    • use brings items to the current scope as usual
    • pub makes them public

Examples:

  • accessibility of items demonstration:
    // `foo` is private, so no external crate can access it (rule #1).
    // Any module in this crate can access `foo`'s public interface since `foo`
    // lies in the root module of this crate (rule #2).
    mod foo {
        // Can be used by anything in the current crate (rule #1).
        pub fn f() {}
    
        // Can be accessed only by this module and its descendants (rule #2).
        fn g() {}
    }
    
    // Public to the root module, available to external crates (rule #1).
    pub fn h() {}
    
    // Public to the root module, available to external crates (rule #1).
    pub mod bar {
        // This is legal since `bar` is a descendant of the root module (rule #2).
        use crate::foo;
    
        // Available to external crates (rule #1).
        pub fn x() {
            // This is legal (rule #2).
            foo::f();
        }
    
        // Available only to this module and its descendants (rule #2).
        fn y() {}
    
        #[cfg(test)]
        mod test {
            #[test]
            fn test_y() {
                // This is legal since this module is a descendant of the module
                // `bar` (rule #2).
                super::y();
            }
        }
    }
  • accessibility with restrictions:
    pub mod a {
        pub mod b {
            // Visible within `a`.
            pub (in crate::a) fn fa() {}
    
            // Visible to entire crate.
            pub (crate) fn fb() {}
    
            // Visible within `a`.
            pub (super) fn fc() {
                // Visible since we are in the same module.
                fd();
            }
    
            // Visible only within `b` (same as leaving `fd` private).
            pub (self) fn fd() {}
        }
    
        pub fn ga() {
            b::fa();
            b::fb();
            b::fc();
            //b::fd();  // Error
        }
    }
    
    fn ha() {
        // Still visible (we are in the same crate).
        a::b::fb();
        //a::b::fc();  // Error (we are outside of `a`)
        //a::b::fa();  // Error (we are outside of `a`)
        a::ga();
    }
    
    fn main() { ha() }
  • re-importing/re-exporting names into the current scope:
    // From the standard library: brings `Option`, `None`, and `Some` to the
    // current scope and makes them publicly visible
    pub use crate::option::Option;
    pub use crate::option::Option::None;
    pub use crate::option::Option::Some;
  • another example of re-exporting:
    pub use self::a::b;
    
    mod a {
        pub mod b {
            pub fn f() {}
        }
    }
    
    b::f();       // Valid (`b` is public in this scope, `f` is public in `b`)
    //a::b::f();  // Invalid (`a` is private)

See Visibility and Privacy and Use declarations for greater detail.

Paths

A path refers to:

  • item or variable if it has only one component
  • item if it has more than one component

Path qualifiers:

  • ::
    • paths starting with :: are considered to be global paths
    • segments of the path start being resolved from a place which differs based on edition
      • in the 2015 Edition, identifiers resolve from the crate root, i.e. ::foo in the 2015 Edition is the same as crate::foo in the 2018 and newer Edition
      • in the 2018 and newer Edition, identifiers resolve from crates in the extern prelude, i.e. they must be followed by the name of a crate, e.g. ::core resolves to the core crate which is always added to the extern prelude
        • :: is necessary if a name from the root module collides with a name from the extern prelude, e.g.
          //use std::fs;                // Error, ambiguity
          use ::std::fs;                // Imports `fs` from the standard `std` crate
          use self::std::fs as std_fs;  // Imports `fs` from the `std` module below
          
          mod std {
              pub mod fs {}
          }
  • self
    • resolves to the current module
    • a single self in a method body resolves to the method's self parameter
  • Self
    • refers to the implementing type within traits and implementations
  • super
    • resolves to the parent module
  • crate
    • refers to the root module of the current crate
  • $crate
    • allowed only inside macro transcribers
    • refers to the root module of the crate where the macro is defined
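
The self, super, and Self qualifiers can be sketched as follows; all module, type, and function names are hypothetical.

```rust
mod outer {
    pub fn name() -> &'static str {
        "outer"
    }

    pub mod inner {
        pub fn name() -> &'static str {
            "inner"
        }

        pub fn both() -> String {
            // `super` is the parent module, `self` the current one.
            format!("{}/{}", super::name(), self::name())
        }
    }
}

struct Counter(u32);

impl Counter {
    // `Self` is the implementing type, here `Counter`.
    fn new() -> Self {
        Self(0)
    }

    // A lone `self` in a method body is the method's receiver.
    fn bump(self) -> Self {
        Self(self.0 + 1)
    }
}
```

Here outer::inner::both() evaluates to "outer/inner".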

Canonical paths:

  • only items defined in a module or implementation have a canonical path
  • a canonical path reflects the location of the item relative to the root module of its crate, e.g.
    // File `src/lib.rs`:
    mod foo {
        // Canonical path to this struct is `crate::foo::Foo`
        pub struct Foo;
    }
    • it is meaningful only within a given crate (there is no global namespace across crates)
  • a canonical path consists of a path prefix and an item appended to it
    • for modules, the path prefix is the canonical path to that module, e.g.
      // File `src/lib.rs`:
      mod foo {        // `crate::foo` (the canonical path to the module `foo`)
          struct Bar;  // `crate::foo::Bar` (the canonical path to the `struct Bar`)
      }
    • for bare implementations, the path prefix has the form <P>, where P is the canonical path of the item being implemented, e.g.
      // File `src/lib.rs`:
      mod foo {
          struct Bar;
      
          impl Bar {
              // The canonical path to the method `f` implemented here is
              // `<crate::foo::Bar>::f`.
              fn f(&self) {}
          }
      }
    • for trait implementations, the path prefix has the form <P as T>, where P is the canonical path of the item being implemented and T is the canonical path to the trait, e.g.
      // File `src/lib.rs`:
      mod a {
          struct S;
      
          trait T {
              fn f(&self);
          }
      
          impl T for S {
              // The canonical path to the method `f` implemented here is
              // `<crate::a::S as crate::a::T>::f`.
              fn f(&self) {}
          }
      
          impl S {
              fn g(&self) {}
          }
      }
  • items which do not have canonical paths:
    • implementations
    • use declarations
    • items defined in block expressions
    • items defined in a module that does not have a canonical path
    • associated items defined in an implementation that refers to an item without a canonical path

Simple Paths

Grammar:

simple_path:
    "::"? simple_path_segment ("::" simple_path_segment)*
simple_path_segment:
    identifier | "super" | "self" | "crate" | "$crate"
  • used in visibility markers, attributes, macros, and use items

Paths in Expressions

Grammar:

path_in_expression:
    "::"? path_expr_segment ("::" path_expr_segment)*
path_expr_segment:
    path_ident_segment ("::" generic_args)?
path_ident_segment:
    identifier | "super" | "self" | "Self" | "crate" | "$crate"
generic_args:
    "<" ((generic_arg ",")* generic_arg ","?)? ">"
generic_arg:
    lifetime
    type
    block_expression
    "-"? literal_expression
    simple_path_segment
    identifier "=" type
  • allow for paths with generic arguments to be specified
  • used in expressions and patterns
  • turbofish (::<) syntax:
    • Vec<u8>::with_capacity(1024) – wrong (ambiguity with <)
    • Vec::<u8>::with_capacity(1024) – correct
  • the order of generic arguments is restricted to:
    1. lifetime arguments, then
    2. type arguments, then
    3. const arguments, then
    4. equality constraints
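
A few turbofish uses in expression position, including a type argument followed by a const argument. The helper filled is made up for this sketch.

```rust
// Type parameter first, const parameter second.
fn filled<T: Copy, const N: usize>(value: T) -> [T; N] {
    [value; N]
}

fn turbofish() -> (usize, u32, Vec<i32>, [u8; 3]) {
    let buf = Vec::<u8>::with_capacity(1024);  // on a type in a path
    let n = "42".parse::<u32>().unwrap();      // on a method call
    let v = (1..4).collect::<Vec<i32>>();      // on an iterator adaptor
    let a = filled::<u8, 3>(7);                // type, then const argument
    (buf.len(), n, v, a)
}
```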

Qualified Paths

Grammar:

qualified_path_in_expression:
    qualified_path_type ("::" path_expr_segment)+

qualified_path_type:
    "<" type ("as" type_path)? ">"

qualified_path_in_type:
    qualified_path_type ("::" type_path_segment)+
  • allow paths for trait implementations to be unambiguous
  • allow canonical paths to be specified
  • example:
    struct S;
    
    // Implementation of `struct S`:
    impl S {
        fn f() {}
    }
    
    trait T1 {
        fn f() {}
    }
    
    // Implementation of `trait T1` for `struct S`:
    impl T1 for S {}
    
    trait T2 {
        fn f() {}
    }
    
    // Implementation of `trait T2` for `struct S`:
    impl T2 for S {}
    
    // Which `f` is called? `S` has an inherent `f` and implements both `T1` and
    // `T2`, so qualified paths are used to resolve the ambiguity:
    S::f();          // `S::f` is called
    <S as T1>::f();  // `T1::f` is called
    <S as T2>::f();  // `T2::f` is called

Type Paths

Grammar:

type_path:
    "::"? type_path_segment ("::" type_path_segment)*
type_path_segment:
    path_ident_segment ("::"? (generic_args | type_path_fn))?
type_path_fn:
    "(" (type ("," type)* ","?)? ")" ("->" type)?
  • used within type definitions, trait bounds, type parameter bounds, and qualified paths
  • turbofish notation (::<) is not required here as there is no danger of ambiguity like in expressions
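
A sketch of type paths in type position, including the parenthesized type_path_fn form. The alias Index and the functions apply and build are invented for the example.

```rust
use std::collections::HashMap;

// In type position, generic arguments follow the path directly (no `::<`).
type Index = HashMap<String, Vec<usize>>;

// `Fn(u32) -> u32` uses the parenthesized `type_path_fn` form.
fn apply(f: impl Fn(u32) -> u32, x: u32) -> u32 {
    f(x)
}

fn build() -> Index {
    let mut index = Index::new();
    let position = apply(|n| n + 1, 1) as usize;
    index.entry(String::from("rust")).or_default().push(position);
    index
}
```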

See Paths, Visibility and Privacy, Attributes, Macros By Example, Use declarations, Expressions, Patterns, Traits, Implementations, and Preludes for greater detail.

Libraries (Crates) and Tools

Pinned: [Lib.rs]