Rust by Example
Rust is a modern systems programming language focusing on safety, speed, and
concurrency. It accomplishes these goals by being memory safe without using garbage
collection.
Rust by Example (RBE) is a collection of runnable examples that illustrate various Rust
concepts and standard libraries. To get even more out of these examples, don't forget to
install Rust locally and check out the official docs. Additionally for the curious, you can also
check out the source code for this site.
Primitives - Learn about signed integers, unsigned integers and other primitives.
Conversion
Expressions
Cargo - Go through some basic features of the official Rust package management tool.
Generics - Learn about writing a function or data type which can work for multiple
types of arguments.
Scoping rules - Scopes play an important part in ownership, borrowing, and lifetimes.
Macros
Std library types - Learn about some custom types provided by the std library.
Unsafe Operations
Compatibility
Hello World
This is the source code of the traditional Hello World program.
$ rustc hello.rs
$ ./hello
Hello World!
Activity
Click 'Run' above to see the expected output. Next, add a new line with a second println!
macro so that the output shows:
Hello World!
I'm a Rustacean!
Comments
Any program requires comments, and Rust supports a few different varieties:
See also:
Library documentation
Formatted print
Printing is handled by a series of macros defined in std::fmt , including format! , print! ,
println! , eprint! and eprintln! .
All parse text in the same fashion. As a plus, Rust checks formatting correctness at compile
time.
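As a small, non-authoritative sketch of how these macros behave: format! builds a String , print! and println! write to standard output, and eprintln! writes to standard error. The describe helper is a hypothetical name used only for this illustration.

```rust
// Hypothetical helper: `format!` returns the formatted text as a `String`.
fn describe(name: &str, count: u32) -> String {
    // `{0}` is a positional argument and `{n}` a named one; both may repeat.
    format!("{0} has {n} items; {0} is done", name, n = count)
}

fn main() {
    // `print!` writes to stdout without a trailing newline; `println!` adds one.
    print!("{} days", 31);
    println!();
    println!("{}", describe("Alice", 3));
    // Right-align in a field of width 5, then pad a number with zeros.
    println!("[{:>5}] [{:05}]", 1, 1);
    // `eprintln!` writes to stderr instead of stdout.
    eprintln!("this line goes to standard error");
}
```

Because the format string is checked at compile time, a call such as println!("{} {}", 1) is rejected by the compiler rather than failing at run time.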
std::fmt contains many traits which govern the display of text. The base forms of two
important ones are listed below:
fmt::Debug : Uses the {:?} marker. Format text for debugging purposes.
fmt::Display : Uses the {} marker. Format text in a more elegant, user friendly
fashion.
Here, we used fmt::Display because the std library provides implementations for these
types. To print text for custom types, more steps are required.
Implementing the fmt::Display trait automatically implements the ToString trait which
allows us to convert the type to String .
Activities
Fix the two issues in the above code (see FIXME) so that it runs without error.
Add a println! macro that prints: Pi is roughly 3.142 by controlling the number
of decimal places shown. For the purposes of this exercise, use let pi = 3.141592 as
an estimate for pi. (Hint: you may need to check the std::fmt documentation for
setting the number of decimals to display)
See also:
Debug
All types which want to use std::fmt formatting traits require an implementation to be
printable. Automatic implementations are only provided for types such as in the std library.
All others must be manually implemented somehow.
The fmt::Debug trait makes this very straightforward. All types can derive
(automatically create) the fmt::Debug implementation. This is not true for fmt::Display
which must be manually implemented.
All std library types are automatically printable with {:?} too:
So fmt::Debug definitely makes this printable but sacrifices some elegance. Rust also
provides "pretty printing" with {:#?} .
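A minimal sketch of deriving fmt::Debug and of the difference between {:?} and {:#?} might look like this. The Structure and Person types are illustrative names, not part of the std library.

```rust
// `derive` asks the compiler to generate the `fmt::Debug` implementation.
#[derive(Debug)]
#[allow(dead_code)] // the field is only read through `Debug`
struct Structure(i32);

#[derive(Debug)]
#[allow(dead_code)] // the fields are only read through `Debug`
struct Person {
    name: &'static str,
    age: u8,
}

fn main() {
    // Compact, single-line debug output.
    println!("{:?}", Structure(3));

    let peter = Person { name: "Peter", age: 27 };
    println!("{:?}", peter);
    // "Pretty" debug output: one field per line, indented.
    println!("{:#?}", peter);
}
```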
See also:
Display
fmt::Debug hardly looks compact and clean, so it is often advantageous to customize the
output appearance. This is done by manually implementing fmt::Display , which uses the
{} print marker. Implementing it looks like this:
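The following is a sketch of a manual fmt::Display implementation, using a Point2D struct like the one mentioned in the activity further down. The field names and the output format are assumptions made for the illustration.

```rust
use std::fmt;

struct Point2D {
    x: f64,
    y: f64,
}

impl fmt::Display for Point2D {
    // `fmt` must have exactly this signature.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // `write!` works like `format!`, but writes into the formatter `f`
        // and returns a `fmt::Result`.
        write!(f, "x: {}, y: {}", self.x, self.y)
    }
}

fn main() {
    let point = Point2D { x: 3.3, y: 7.2 };
    println!("Display: {}", point);
    // Implementing `Display` also provides `to_string` via `ToString`.
    let text: String = point.to_string();
    println!("As a String: {}", text);
}
```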
fmt::Display may be cleaner than fmt::Debug but this presents a problem for the std
library. How should ambiguous types be displayed? For example, if the std library
implemented a single style for all Vec<T> , what style should it be? Would it be either of
these two?
No, because there is no ideal style for all types and the std library doesn't presume to
dictate one. fmt::Display is not implemented for Vec<T> or for any other generic
containers. fmt::Debug must then be used for these generic cases.
This is not a problem though because for any new container type which is not
generic, fmt::Display can be implemented.
So, fmt::Display has been implemented but fmt::Binary has not, and therefore cannot
be used. std::fmt has many such traits and each requires its own implementation. This
is detailed further in std::fmt .
Activity
After checking the output of the above example, use the Point2D struct as a guide to add a
Complex struct to the example. When printed in the same way, the output should be:
See also:
Testcase: List
Implementing fmt::Display for a structure where the elements must each be handled
sequentially is tricky. The problem is that each write! generates a fmt::Result . Proper
handling of this requires dealing with all the results. Rust provides the ? operator for
exactly this purpose.
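A sketch of the idea, assuming a List wrapper around Vec<i32> : every write! is followed by ? , so the first error is returned immediately and later writes are skipped.

```rust
use std::fmt;

// A wrapper type, so that `Display` can be implemented for it.
struct List(Vec<i32>);

impl fmt::Display for List {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let vec = &self.0;

        // `?` returns early with the error if this write fails.
        write!(f, "[")?;

        for (count, v) in vec.iter().enumerate() {
            // Separate every element except the first with a comma.
            if count != 0 {
                write!(f, ", ")?;
            }
            write!(f, "{}", v)?;
        }

        // The last write is returned directly as the `fmt::Result`.
        write!(f, "]")
    }
}

fn main() {
    let v = List(vec![1, 2, 3]);
    println!("{}", v);
}
```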
Activity
Try changing the program so that the index of each element in the vector is also printed. The
new output should look like this:
[0: 1, 1: 2, 2: 3]
See also:
Formatting
11 de 254 14/03/2021 15:52
Rust By Example https://doc.rust-lang.org/stable/rust-by-example/print.html
The same variable ( foo ) can be formatted differently depending on which argument type is
used: X vs o vs unspecified.
This formatting functionality is implemented via traits, and there is one trait for each
argument type. The most common formatting trait is Display , which handles cases where
the argument type is left unspecified: {} for instance.
You can view a full list of formatting traits and their argument types in the std::fmt
documentation.
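To make the argument types concrete, here is a small sketch that formats one value three ways. The render helper is a hypothetical name, and the value of foo is chosen only because it gives a recognizable hexadecimal result.

```rust
// Hypothetical helper: formats one value with three different traits.
fn render(foo: u32) -> (String, String, String) {
    (
        format!("{}", foo),     // `Display`: argument type left unspecified
        format!("0x{:X}", foo), // `UpperHex`: argument type `X`
        format!("0o{:o}", foo), // `Octal`: argument type `o`
    )
}

fn main() {
    let (dec, hex, oct) = render(3735928559);
    println!("{}\n{}\n{}", dec, hex, oct);
}
```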
Activity
Add an implementation of the fmt::Display trait for the Color struct above so that the
output displays as:
See also:
std::fmt
Primitives
Rust provides access to a wide variety of primitives . A sample includes:
Scalar Types
signed integers: i8 , i16 , i32 , i64 , i128 and isize (pointer size)
unsigned integers: u8 , u16 , u32 , u64 , u128 and usize (pointer size)
floating point: f32 , f64
char Unicode scalar values like 'a' , 'α' and '∞' (4 bytes each)
bool either true or false
and the unit type () , whose only possible value is an empty tuple: ()
Despite the value of a unit type being a tuple, it is not considered a compound type because
it does not contain multiple values.
Compound Types
Variables can always be type annotated. Numbers may additionally be annotated via a suffix
or by default. Integers default to i32 and floats to f64 . Note that Rust can also infer types
from context.
See also:
Integers can, alternatively, be expressed using hexadecimal, octal or binary notation using
these prefixes respectively: 0x , 0o or 0b .
Underscores can be inserted in numeric literals to improve readability, e.g. 1_000 is the
same as 1000 , and 0.000_001 is the same as 0.000001 .
We need to tell the compiler the type of the literals we use. For now, we'll use the u32 suffix
to indicate that the literal is an unsigned 32-bit integer, and the i32 suffix to indicate that
it's a signed 32-bit integer.
The operators available and their precedence in Rust are similar to other C-like languages.
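The prefixes, underscores and suffixes described above can be sketched as follows. The literal_values helper is a hypothetical name introduced for the example.

```rust
// Hypothetical helper: the same kinds of literals the text describes.
fn literal_values() -> [u32; 4] {
    [
        0xff,         // hexadecimal
        0o77,         // octal
        0b1111_0000,  // binary, with an underscore for readability
        1_000_000u32, // decimal with underscores and a type suffix
    ]
}

fn main() {
    println!("{:?}", literal_values());
    // Suffixes pick the operand type: unsigned vs. signed arithmetic.
    println!("1 + 2 = {}", 1u32 + 2);
    println!("1 - 2 = {}", 1i32 - 2);
}
```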
Tuples
A tuple is a collection of values of different types. Tuples are constructed using parentheses
() , and each tuple itself is a value with type signature (T1, T2, ...) , where T1 , T2 are
the types of its members. Functions can use tuples to return multiple values, as tuples can
hold any number of values.
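As a brief sketch, a function can take a tuple and return another one, and members can be read either by destructuring or by index:

```rust
// Tuples can be used as function arguments and as return values.
fn reverse(pair: (i32, bool)) -> (bool, i32) {
    // `let` can destructure a tuple into separate bindings.
    let (int_param, bool_param) = pair;
    (bool_param, int_param)
}

fn main() {
    let long_tuple = (1u8, -2i16, 0.3f32, 'a', true);
    // Members are read with tuple indexing.
    println!("first: {}, second: {}", long_tuple.0, long_tuple.1);
    println!("reversed: {:?}", reverse((1, true)));
}
```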
Activity
1. Recap: Add the fmt::Display trait to the Matrix struct in the above example, so that
if you switch from printing the debug format {:?} to the display format {} , you see
the following output:
( 1.1 1.2 )
( 2.1 2.2 )
You may want to refer back to the example for print display.
2. Add a transpose function using the reverse function as a template, which accepts a
matrix as an argument, and returns a matrix in which two elements have been
swapped. For example:
println!("Matrix:\n{}", matrix);
println!("Transpose:\n{}", transpose(matrix));
Matrix:
( 1.1 1.2 )
( 2.1 2.2 )
Transpose:
( 1.1 2.1 )
( 1.2 2.2 )
Slices are similar to arrays, but their length is not known at compile time. Instead, a slice is a
two-word object: the first word is a pointer to the data, and the second word is the length of
the slice. The word size is the same as usize , determined by the processor architecture, e.g. 64
bits on x86-64. Slices can be used to borrow a section of an array, and have the type
signature &[T] .
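A short sketch of borrowing a whole array and a section of it as slices. The sum helper is a hypothetical name; the size check reflects the two-word layout described above.

```rust
use std::mem;

// Borrows a slice of any length and adds up its elements.
fn sum(slice: &[i32]) -> i32 {
    slice.iter().sum()
}

fn main() {
    let xs: [i32; 5] = [1, 2, 3, 4, 5];

    // An array reference is automatically borrowed as a slice.
    println!("whole array: {}", sum(&xs));
    // `[start..end]` borrows the elements at indices 1, 2 and 3.
    println!("middle section: {}", sum(&xs[1..4]));
    // A slice reference is two words: pointer plus length.
    println!("&[i32] occupies {} bytes", mem::size_of::<&[i32]>());
}
```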
Custom Types
Rust custom data types are formed mainly through the two keywords:
Constants can also be created via the const and static keywords.
Structures
There are three types of structures ("structs") that can be created using the struct
keyword:
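The three kinds are unit structs, tuple structs and classic C-style structs with named fields. A minimal sketch of each, with illustrative type names:

```rust
// A unit struct has no fields.
struct Unit;

// A tuple struct is essentially a named tuple.
struct Pair(i32, f32);

// A classic C-style struct has named fields.
struct Point {
    x: f32,
    y: f32,
}

fn main() {
    let _unit = Unit;
    let pair = Pair(1, 0.1);
    let point = Point { x: 10.3, y: 0.4 };

    // Struct update syntax: take the remaining fields from `point`.
    let bottom_right = Point { x: 5.2, ..point };

    // Both kinds of struct can be destructured with `let`.
    let Pair(integer, decimal) = pair;
    let Point { x: left, y: top } = bottom_right;
    println!("{} {} {} {}", integer, decimal, left, top);
}
```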
Activity
1. Add a function rect_area which calculates the area of a rectangle (try using nested
destructuring).
2. Add a function square which takes a Point and a f32 as arguments, and returns a
Rectangle with its lower left corner on the point, and a width and height
corresponding to the f32 .
See also
Enums
The enum keyword allows the creation of a type which may be one of a few different
variants. Any variant which is valid as a struct is also valid in an enum .
Type aliases
If you use a type alias, you can refer to each enum variant via its alias. This might be useful if
the enum's name is too long or too generic, and you want to rename it.
The most common place you'll see this is in impl blocks using the Self alias.
To learn more about enums and type aliases, you can read the stabilization report from
when this feature was stabilized into Rust.
See also:
use
The use declaration can be used so manual scoping isn't needed:
See also:
C-like
enum can also be used as C-like enums.
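A sketch of C-like enums, with implicit and explicit discriminants that can be cast to integers. The Number and Color names are illustrative.

```rust
// With implicit discriminants, numbering starts at 0.
#[allow(dead_code)]
enum Number {
    Zero,
    One,
    Two,
}

// Discriminants can also be given explicitly.
#[allow(dead_code)]
enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
}

fn main() {
    // C-like enum variants can be cast to integers with `as`.
    println!("zero is {}", Number::Zero as i32);
    println!("roses are #{:06x}", Color::Red as i32);
    println!("violets are #{:06x}", Color::Blue as i32);
}
```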
See also:
casting
Testcase: linked-list
A common use for enums is to create a linked-list:
See also:
constants
Rust has two different types of constants which can be declared in any scope including
global. Both require explicit type annotation:
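The two kinds are declared with const and static . A short sketch, with hypothetical names, showing that both can be used from any scope:

```rust
// `const`: an unchangeable value; the type annotation is mandatory.
const THRESHOLD: i32 = 10;

// `static`: a value with a fixed location and the 'static lifetime.
static LANGUAGE: &str = "Rust";

// Hypothetical helper that reads the global constant.
fn is_big(n: i32) -> bool {
    n > THRESHOLD
}

fn main() {
    println!("This is {}", LANGUAGE);
    println!("The threshold is {}", THRESHOLD);
    println!("16 is {}", if is_big(16) { "big" } else { "small" });
    // `THRESHOLD = 5;` would not compile: constants cannot be assigned to.
}
```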
See also:
Variable Bindings
Rust provides type safety via static typing. Variable bindings can be type annotated when
declared. However, in most cases, the compiler will be able to infer the type of the variable
from the context, heavily reducing the annotation burden.
Values (like literals) can be bound to variables, using the let binding.
Mutability
Variable bindings are immutable by default, but this can be overridden using the mut
modifier.
Variable bindings have a scope, and are constrained to live in a block. A block is a collection
of statements enclosed by braces {} .
Declare first
It's possible to declare variable bindings first, and initialize them later. However, this form is
seldom used, as it may lead to the use of uninitialized variables.
The compiler forbids use of uninitialized variables, as this would lead to undefined behavior.
Freezing
When data is bound by the same name immutably, it also freezes. Frozen data can't be
modified until the immutable binding goes out of scope:
Types
Rust provides several mechanisms to change or define the type of primitive and user
defined types. The following sections cover:
Casting
Rust provides no implicit type conversion (coercion) between primitive types. But, explicit
type conversion (casting) can be performed using the as keyword.
Rules for converting between integral types follow C conventions generally, except in cases
where C has undefined behavior. The behavior of all casts between integral types is well
defined in Rust.
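A few casts, sketched with hypothetical helper functions so that the results are easy to check. Float-to-integer casts are shown as saturating, which is the behavior of Rust 1.45 and later.

```rust
// Hypothetical helper: narrowing integer casts keep only the low bits.
fn int_to_u8(n: i32) -> u8 {
    n as u8
}

// Hypothetical helper: float-to-integer casts truncate toward zero and
// saturate at the bounds of the target type (Rust 1.45 and later).
fn float_to_u8(x: f32) -> u8 {
    x as u8
}

fn main() {
    println!("1000 as u8 -> {}", int_to_u8(1000));
    println!("  -1 as u8 -> {}", int_to_u8(-1));
    println!("65.4321 as u8 -> {}", float_to_u8(65.4321));
    println!("300.0 as u8 -> {}", float_to_u8(300.0));
    // Only `u8` can be cast to `char`.
    println!("65 as char -> {}", 65u8 as char);
}
```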
Literals
Numeric literals can be type annotated by adding the type as a suffix. As an example, to
specify that the literal 42 should have the type i32 , write 42i32 .
The type of unsuffixed numeric literals will depend on how they are used. If no constraint
exists, the compiler will use i32 for integers, and f64 for floating-point numbers.
There are some concepts used in the previous code that haven't been explained yet. Here's a
brief explanation:
std::mem::size_of_val is a function, but called with its full path. Code can be split in
logical units called modules. In this case, the size_of_val function is defined in the
mem module, and the mem module is defined in the std crate. For more details, see
modules and crates.
Inference
The type inference engine is pretty smart. It does more than looking at the type of the value
expression during an initialization. It also looks at how the variable is used afterwards to
infer its type. Here's an advanced example of type inference:
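One way to sketch this is with a vector whose element type is only fixed by a later push . The inferred helper is a hypothetical name used so the result can be inspected.

```rust
fn inferred() -> usize {
    // Because of the suffix, the compiler knows `elem` is a `u8`.
    let elem = 5u8;
    // At this point the element type of the vector is unknown (`Vec<_>`).
    let mut vec = Vec::new();
    // After this line the compiler knows `vec` is a `Vec<u8>`.
    vec.push(elem);
    // Each element occupies one byte, confirming the inferred `u8`.
    std::mem::size_of_val(&vec[0])
}

fn main() {
    println!("each element takes {} byte(s)", inferred());
}
```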
No type annotation of variables was needed, the compiler is happy and so is the
programmer!
Aliasing
The type statement can be used to give a new name to an existing type. Types must have
UpperCamelCase names, or the compiler will raise a warning. The exceptions to this rule are
the primitive types: usize , f32 , etc.
The main use of aliases is to reduce boilerplate; for example the io::Result<T> type is an
alias for the Result<T, io::Error> type.
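A sketch of both plain and generic aliases follows. NanoSecond , Inch , ParseResult and parse_nanos are hypothetical names chosen for the example.

```rust
// `NanoSecond` and `Inch` are new names for `u64`, not new types.
type NanoSecond = u64;
type Inch = u64;

// A generic alias in the spirit of `io::Result<T>`; `ParseResult` is a
// hypothetical name chosen for this sketch.
type ParseResult<T> = Result<T, std::num::ParseIntError>;

fn parse_nanos(text: &str) -> ParseResult<NanoSecond> {
    text.trim().parse::<NanoSecond>()
}

fn main() {
    let nanoseconds: NanoSecond = 5;
    let inches: Inch = 2;
    // Aliases give no extra type safety: both operands are plain `u64`.
    println!("{} + {} = {}", nanoseconds, inches, nanoseconds + inches);
    println!("{:?}", parse_nanos(" 42 "));
}
```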
See also:
Attributes
Conversion
Primitive types can be converted to each other through casting.
Rust addresses conversion between custom types (i.e., struct and enum ) by the use of
traits. The generic conversions will use the From and Into traits. However, there are more
specific ones for the more common cases, in particular when converting to and from
String s.
From
The From trait allows for a type to define how to create itself from another type, hence
providing a very simple mechanism for converting between several types. There are
numerous implementations of this trait within the standard library for conversion of
primitive and common types.
Into
The Into trait is simply the reciprocal of the From trait. That is, if you have implemented
the From trait for your type, Into will call it when necessary.
Using the Into trait will typically require specification of the type to convert into as the
compiler is unable to determine this most of the time. However this is a small trade-off
considering we get the functionality for free.
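As a sketch, implementing From<i32> for a hypothetical Number type makes both Number::from and into available:

```rust
#[derive(Debug, PartialEq)]
struct Number {
    value: i32,
}

// Defining `From<i32>` is all that is needed; `Into<Number>` for `i32`
// is then provided automatically.
impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let a = Number::from(30);
    // `into` needs the target type, given here by the annotation.
    let b: Number = 5.into();
    println!("{:?} and {:?}", a, b);
    println!("sum of values: {}", a.value + b.value);

    // The standard library offers many `From` implementations too.
    let s = String::from("hello");
    println!("{}", s);
}
```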
Converting to String
To convert any type to a String is as simple as implementing the ToString trait for the
type. Rather than doing so directly, you should implement the fmt::Display trait which
automagically provides ToString and also allows printing the type as discussed in the
section on print! .
Parsing a String
One of the more common types to convert a string into is a number. The idiomatic approach
to this is to use the parse function and either to arrange for type inference or to specify the
type to parse using the 'turbofish' syntax. Both alternatives are shown in the following
example.
This will convert the string into the type specified so long as the FromStr trait is
implemented for that type. This is implemented for numerous types within the standard
library. To obtain this functionality on a user defined type simply implement the FromStr
trait for that type.
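Both ways of choosing the target type can be sketched in one function. The sum_str helper is a hypothetical name.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses two strings and adds the results.
fn sum_str(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // The type annotation lets inference pick `i32` for `parse`.
    let x: i32 = a.parse()?;
    // The "turbofish" names the target type on the call itself.
    let y = b.parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", sum_str("5", "10"));
    // A string that is not a number produces an `Err` instead of panicking.
    println!("{:?}", sum_str("5", "ten"));
}
```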
Expressions
A Rust program is (mostly) made up of a series of statements:
fn main() {
// statement
// statement
// statement
}
There are a few kinds of statements in Rust. The most common two are declaring a variable
binding, and using a ; with an expression:
fn main() {
// variable binding
let x = 5;
// expression;
x;
x + 1;
15;
}
Blocks are expressions too, so they can be used as values in assignments. The last
expression in the block will be assigned to the place expression such as a local variable.
However, if the last expression of the block ends with a semicolon, the return value will be
() .
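A short sketch of both cases, with a hypothetical polynomial helper: a block whose last expression has no semicolon yields that value, and a block ending in a semicolon yields () .

```rust
fn polynomial(x: u32) -> u32 {
    // The block's last expression has no semicolon, so it becomes the
    // value of the block and is bound to `y`.
    let y = {
        let x_squared = x * x;
        let x_cube = x_squared * x;
        x_cube + x_squared + x
    };
    y
}

fn main() {
    let x = 5u32;
    // The trailing semicolon discards the value, so `z` is `()`.
    let z = {
        let _ = 2 * x;
    };
    println!("y is {}", polynomial(x));
    println!("z is {:?}", z);
}
```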
Flow of Control
An essential part of any programming language is the ability to modify control flow: if / else ,
for , and others. Let's talk about them in Rust.
if/else
Branching with if - else is similar to other languages. Unlike many of them, the boolean
condition doesn't need to be surrounded by parentheses, and each condition is followed by
a block. if - else conditionals are expressions, and all branches must return the same
type.
loop
Rust provides a loop keyword to indicate an infinite loop.
The break statement can be used to exit a loop at any time, whereas the continue
statement can be used to skip the rest of the iteration and start a new one.
while
The while keyword can be used to run a loop while a condition is true.
for loops
Alternatively, a..=b can be used for a range that is inclusive on both ends. The above can
be written as:
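Assuming the loop in question is a FizzBuzz over the numbers 1 to 100 (an assumption, since the loop itself is not shown here), a sketch with an inclusive range could be:

```rust
// Hypothetical helper: the FizzBuzz rule for a single number.
fn fizzbuzz(n: u32) -> String {
    if n % 15 == 0 {
        "fizzbuzz".to_string()
    } else if n % 3 == 0 {
        "fizz".to_string()
    } else if n % 5 == 0 {
        "buzz".to_string()
    } else {
        n.to_string()
    }
}

fn main() {
    // `1..=100` yields 1, 2, ..., 100; `1..101` is the half-open equivalent.
    for n in 1..=100 {
        println!("{}", fizzbuzz(n));
    }
}
```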
into_iter , iter and iter_mut all handle the conversion of a collection into an iterator in
different ways, by providing different views on the data within.
iter - This borrows each element of the collection through each iteration. Thus
leaving the collection untouched and available for reuse after the loop.
into_iter - This consumes the collection so that on each iteration the exact data is
provided. Once the collection has been consumed it is no longer available for reuse as
it has been 'moved' within the loop.
iter_mut - This mutably borrows each element of the collection, allowing for the
collection to be modified in place.
In the above snippets note the type of match branch, that is the key difference in the types
of iteration. The difference in type then of course implies differing actions that are able to be
performed.
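The three methods can be compared side by side in a sketch like the one below. The function names are hypothetical; the comments note the type of the loop variable in each case.

```rust
// `iter` borrows each element; the collection is untouched afterwards.
fn total_len(names: &[&str]) -> usize {
    let mut total = 0;
    for name in names.iter() {
        // `name` is a `&&str` here.
        total += name.len();
    }
    total
}

// `iter_mut` mutably borrows each element so it can be changed in place.
fn greet_in_place(names: &mut Vec<String>) {
    for name in names.iter_mut() {
        // `name` is a `&mut String` here.
        name.insert_str(0, "Hello ");
    }
}

// `into_iter` consumes the collection and yields owned elements.
fn join_owned(names: Vec<String>) -> String {
    let mut out = String::new();
    for name in names.into_iter() {
        // `name` is a `String` here; `names` cannot be used after the loop.
        out.push_str(&name);
        out.push(';');
    }
    out
}

fn main() {
    println!("{}", total_len(&["Bob", "Frank", "Ferris"]));
    let mut names = vec!["Bob".to_string(), "Ferris".to_string()];
    greet_in_place(&mut names);
    println!("{}", join_owned(names));
}
```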
See also:
Iterator
match
Rust provides pattern matching via the match keyword, which can be used like a C switch .
The first matching arm is evaluated and all possible values must be covered.
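A sketch of a match with single values, alternatives, a range and a catch-all arm. The classify helper is a hypothetical name.

```rust
fn classify(number: i32) -> &'static str {
    match number {
        // Match a single value.
        1 => "one",
        // Match several values.
        2 | 3 | 5 | 7 | 11 => "a prime",
        // Match an inclusive range.
        13..=19 => "a teen",
        // `_` covers every remaining value, making the match exhaustive.
        _ => "nothing special",
    }
}

fn main() {
    for n in [1, 7, 15, 20] {
        println!("{} is {}", n, classify(n));
    }
}
```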
Destructuring
A match block can destructure items in a variety of ways.
Destructuring Tuples
Destructuring Enums
Destructuring Pointers
Destructuring Structures
tuples
Tuples can be destructured in a match as follows:
See also:
Tuples
enums
An enum is destructured similarly:
See also:
pointers/ref
Dereferencing uses *
Destructuring uses & , ref , and ref mut
See also:
structs
Similarly, a struct can be destructured as shown:
See also:
Structs
Guards
A match guard can be added to filter the arm.
See also:
Tuples
Binding
Indirectly accessing a variable makes it impossible to branch and use that variable without
re-binding. match provides the @ sigil for binding values to names:
You can also use binding to "destructure" enum variants, such as Option :
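Both uses of @ can be sketched as follows, with hypothetical helper names: binding a value matched by a range, and binding a value inside an Option .

```rust
fn age_group(age: u32) -> String {
    match age {
        0 => "not born yet".to_string(),
        // `n @ 1..=12` tests the range and binds the matched value to `n`.
        n @ 1..=12 => format!("a child of age {}", n),
        n @ 13..=19 => format!("a teen of age {}", n),
        // A plain name binds whatever is left.
        n => format!("an adult of age {}", n),
    }
}

fn describe(value: Option<u32>) -> String {
    match value {
        // Binding also works inside an enum variant.
        Some(n @ 42) => format!("the answer: {}", n),
        Some(n) => format!("not interesting: {}", n),
        None => "nothing".to_string(),
    }
}

fn main() {
    println!("{}", age_group(15));
    println!("{}", describe(Some(42)));
}
```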
See also:
if let
For some use cases, when matching enums, match is awkward. For example:
match optional {
Some(i) => {
println!("This is a really long string and `{:?}`", i);
// ^ Needed 2 indentations just so we could destructure
// `i` from the option.
},
_ => {},
// ^ Required because `match` is exhaustive. Doesn't it seem
// like wasted space?
};
if let is cleaner for this use case and in addition allows various failure options to be
specified:
In the same way, if let can be used to match any enum value:
Another benefit is that if let allows us to match non-parameterized enum variants. This is
true even in cases where the enum doesn't implement or derive PartialEq . In such cases
if Foo::Bar == a would fail to compile, because instances of the enum cannot be
equated, however if let will continue to work.
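A sketch of these uses of if let , with hypothetical helper names. Note that Foo deliberately does not derive PartialEq .

```rust
// This enum neither implements nor derives `PartialEq`.
#[allow(dead_code)]
enum Foo {
    Bar,
    Baz,
    Qux(u32),
}

fn describe(value: Option<i32>) -> String {
    // Destructure `value`; the `else` branch handles the failure case.
    if let Some(i) = value {
        format!("matched {}", i)
    } else {
        "no match".to_string()
    }
}

// Works even though `Foo::Bar == *a` would not compile.
fn is_bar(a: &Foo) -> bool {
    if let Foo::Bar = a {
        true
    } else {
        false
    }
}

// `if let` can also bind the value carried by a variant.
fn qux_value(a: &Foo) -> Option<u32> {
    if let Foo::Qux(value) = a {
        Some(*value)
    } else {
        None
    }
}

fn main() {
    println!("{}", describe(Some(7)));
    println!("{} {:?}", is_bar(&Foo::Bar), qux_value(&Foo::Qux(100)));
}
```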
Would you like a challenge? Fix the following example to use if let :
See also:
while let
Similar to if let , while let can make awkward match sequences more tolerable.
Consider the following sequence that increments i :
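A sketch of such a loop written with while let : it keeps running for as long as the pattern matches and stops once the Option becomes None . The count_up helper is a hypothetical name.

```rust
fn count_up(start: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    let mut optional = Some(start);

    // Loop while `optional` destructures into `Some(i)`.
    while let Some(i) = optional {
        if i > 9 {
            // Setting `None` makes the pattern fail, which ends the loop.
            optional = None;
        } else {
            seen.push(i);
            optional = Some(i + 1);
        }
    }
    seen
}

fn main() {
    println!("{:?}", count_up(7));
}
```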
See also:
Functions
Functions are declared using the fn keyword. Their arguments are type annotated, just like
variables, and, if the function returns a value, the return type must be specified after an
arrow -> .
The final expression in the function will be used as return value. Alternatively, the return
statement can be used to return a value earlier from within the function, even from inside
loops or if statements.
Methods
Methods are functions attached to objects. These methods have access to the data of the
object and its other methods via the self keyword. Methods are defined under an impl
block.
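A minimal sketch with a hypothetical Rectangle type, showing an associated function without self , a method that reads through &self , and one that modifies through &mut self :

```rust
struct Rectangle {
    width: f64,
    height: f64,
}

impl Rectangle {
    // No `self` parameter: called as `Rectangle::square(..)`.
    fn square(side: f64) -> Rectangle {
        Rectangle { width: side, height: side }
    }

    // `&self` gives read-only access to the object's data.
    fn area(&self) -> f64 {
        self.width * self.height
    }

    // `&mut self` allows the method to modify the object.
    fn scale(&mut self, factor: f64) {
        self.width *= factor;
        self.height *= factor;
    }
}

fn main() {
    let mut rect = Rectangle::square(2.0);
    println!("area: {}", rect.area());
    rect.scale(3.0);
    println!("area after scaling: {}", rect.area());
}
```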
Closures
Closures are functions that can capture the enclosing environment. For example, a closure
that captures the x variable:
|val| val + x
The syntax and capabilities of closures make them very convenient for on the fly usage.
Calling a closure is exactly like calling a function. However, both input and return types can
be inferred and input variable names must be specified.
Capturing
Closures are inherently flexible and will do what the functionality requires to make the
closure work without annotation. This allows capturing to flexibly adapt to the use case,
sometimes moving and sometimes borrowing. Closures can capture variables:
by reference: &T
by mutable reference: &mut T
by value: T
They preferentially capture variables by reference and only go lower when required.
Using move before vertical pipes forces closure to take ownership of captured variables:
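The capture modes can be sketched as follows. The helper functions are hypothetical names, and the explicit &i32 annotation on the closure parameter is a choice made for this sketch.

```rust
// The closure mutates `count`, so it captures it by mutable reference.
fn count_calls() -> i32 {
    let mut count = 0;
    let mut inc = || count += 1;
    inc();
    inc();
    // The mutable borrow ends after the last call to `inc`.
    count
}

// `move` forces the closure to take ownership of `haystack`.
fn moved_search() -> (bool, bool) {
    let haystack = vec![1, 2, 3];
    let contains = move |needle: &i32| haystack.contains(needle);
    // `haystack` can no longer be used here; it lives inside `contains`.
    (contains(&1), contains(&4))
}

fn main() {
    let color = String::from("green");
    // Printing only needs a shared reference, so `color` is borrowed.
    let print = || println!("color: {}", color);
    print();
    // `color` is still usable because it was only borrowed.
    println!("still have {}", color);
    println!("{} {:?}", count_calls(), moved_search());
}
```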
See also:
As input parameters
While Rust chooses how to capture variables on the fly mostly without type annotation, this
ambiguity is not allowed when writing functions. When taking a closure as an input
parameter, the closure's complete type must be annotated using one of a few traits . In
order of decreasing restriction, they are:
On a variable-by-variable basis, the compiler will capture variables in the least restrictive
manner possible.
For instance, consider a parameter annotated as FnOnce . This specifies that the closure may
capture by &T , &mut T , or T , but the compiler will ultimately choose based on how the
captured variables are used in the closure.
This is because if a move is possible, then any type of borrow should also be possible. Note
that the reverse is not true. If the parameter is annotated as Fn , then capturing variables by
&mut T or T is not allowed.
In the following example, try swapping the usage of Fn , FnMut , and FnOnce to see what
happens:
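A sketch with one function per trait, using hypothetical function names. Each closure passed in main is the most demanding kind its function accepts.

```rust
// `FnOnce`: the closure may consume what it captured; callable once.
fn consume<F: FnOnce() -> String>(f: F) -> String {
    f()
}

// `FnMut`: the closure may mutate what it captured; note `mut f`.
fn call_twice<F: FnMut()>(mut f: F) {
    f();
    f();
}

// `Fn`: the closure only reads what it captured.
fn apply_to_3<F: Fn(i32) -> i32>(f: F) -> i32 {
    f(3)
}

fn main() {
    let greeting = String::from("hello");
    // Returning `greeting` moves it out, so this closure is only `FnOnce`.
    println!("{}", consume(move || greeting));

    let mut counter = 0;
    call_twice(|| counter += 1);
    println!("counter = {}", counter);

    let factor = 2;
    println!("{}", apply_to_3(|x| x * factor));
}
```

Changing the bound on call_twice from FnMut to Fn makes the counter closure fail to compile, because it mutates a captured variable.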
See also:
Type anonymity
Closures succinctly capture variables from enclosing scopes. Does this have any
consequences? It surely does. Observe how using a closure as a function parameter requires
generics, which is necessary because of how they are defined:
When a closure is defined, the compiler implicitly creates a new anonymous structure to
store the captured variables inside, meanwhile implementing the functionality via one of the
traits : Fn , FnMut , or FnOnce for this unknown type. This type is assigned to the variable
which is stored until calling.
Since this new type is of unknown type, any usage in a function will require generics.
However, an unbounded type parameter <T> would still be ambiguous and not be allowed.
Thus, bounding by one of the traits : Fn , FnMut , or FnOnce (which it implements) is
sufficient to specify its type.
See also:
Input functions
Since closures may be used as arguments, you might wonder if the same can be said about
functions. And indeed they can! If you declare a function that takes a closure as parameter,
then any function that satisfies the trait bound of that closure can be passed as a parameter.
As an additional note, the Fn , FnMut , and FnOnce traits dictate how a closure captures
variables from the enclosing scope.
See also:
As output parameters
Closures as input parameters are possible, so returning closures as output parameters
should also be possible. However, anonymous closure types are, by definition, unknown, so
we have to use impl Trait to return them.
Fn
FnMut
FnOnce
Beyond this, the move keyword must be used, which signals that all captures occur by value.
This is required because any captures by reference would be dropped as soon as the
function exited, leaving invalid references in the closure.
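A sketch of returning each kind of closure with impl Trait and move . The function names are hypothetical.

```rust
fn create_fn() -> impl Fn() -> String {
    let text = "Fn".to_owned();
    // `move` transfers `text` into the closure, so it outlives this call.
    move || format!("This is a: {}", text)
}

fn create_fnmut() -> impl FnMut() -> u32 {
    let mut count = 0;
    // The closure owns `count` and updates it on every call.
    move || {
        count += 1;
        count
    }
}

fn create_fnonce() -> impl FnOnce() -> String {
    let text = "FnOnce".to_owned();
    // Returning `text` gives it away, so the closure can run only once.
    move || text
}

fn main() {
    let fn_plain = create_fn();
    let mut fn_mut = create_fnmut();
    let fn_once = create_fnonce();

    println!("{}", fn_plain());
    println!("{} {}", fn_mut(), fn_mut());
    println!("{}", fn_once());
}
```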
See also:
Examples in std
This section contains a few examples of using closures from the std library.
Iterator::any
Iterator::any is a function which, when called on an iterator, returns true if any element
satisfies the predicate and false otherwise. Its signature:
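In the standard library, any takes a closure bounded by FnMut(Self::Item) -> bool . A usage sketch, with a hypothetical has_even helper:

```rust
fn has_even(values: &[i32]) -> bool {
    // `iter()` yields `&i32`, so the closure destructures with `&x`.
    values.iter().any(|&x| x % 2 == 0)
}

fn main() {
    println!("{}", has_even(&[1, 3, 4]));
    println!("{}", has_even(&[1, 3, 5]));
}
```

Since any stops at the first element that satisfies the predicate, it does not necessarily visit the whole iterator.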
See also:
std::iter::Iterator::any
Iterator::find is a function which iterates over an iterator and searches for the first value
which satisfies some condition. If none of the values satisfy the condition, it returns None .
Its signature:
Iterator::find gives you a reference to the item. But if you want the index of the item, use
Iterator::position .
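A usage sketch of both methods with hypothetical helper names. The predicate of find receives a reference to the item, which is why the closure pattern has one more & than the one for position .

```rust
fn first_even(values: &[i32]) -> Option<&i32> {
    // Items are `&i32`; `find` passes `&&i32` to the predicate.
    values.iter().find(|&&x| x % 2 == 0)
}

fn index_of_first_even(values: &[i32]) -> Option<usize> {
    // `position` passes the item itself (`&i32`) and returns its index.
    values.iter().position(|&x| x % 2 == 0)
}

fn main() {
    let data = [1, 4, 6];
    println!("{:?}", first_even(&data));
    println!("{:?}", index_of_first_even(&data));
}
```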
See also:
std::iter::Iterator::find
std::iter::Iterator::find_map
std::iter::Iterator::position
std::iter::Iterator::rposition
Diverging functions
Diverging functions never return. They are marked using ! , which is an empty type.
fn foo() -> ! {
panic!("This call never returns.");
}
As opposed to all the other types, this one cannot be instantiated, because the set of all
possible values this type can have is empty. Note that it is different from the () type, which
has exactly one possible value.
For example, this function returns as usual, although there is no information in the return
value.
fn some_fn() {
()
}
fn main() {
let a: () = some_fn();
println!("This function returns and you can see this line.")
}
As opposed to this function, which will never return the control back to the caller.
#![feature(never_type)]
fn main() {
let x: ! = panic!("This call never returns.");
println!("You will never see this line!");
}
Although this might seem like an abstract concept, it is in fact very useful and often handy.
The main advantage of this type is that it can be coerced to any other type and can therefore
be used in places where an exact type is required, for instance in match branches. This allows
us to write code like this:
fn main() {
fn sum_odd_numbers(up_to: u32) -> u32 {
let mut acc = 0;
for i in 0..up_to {
// Notice that the return type of this match expression must be u32
// because of the type of the "addition" variable.
let addition: u32 = match i%2 == 1 {
// The "i" variable is of type u32, which is perfectly fine.
true => i,
// On the other hand, the "continue" expression does not return
// u32, but it is still fine, because it never returns and therefore
// does not violate the type requirements of the match expression.
false => continue,
};
acc += addition;
}
acc
}
println!("Sum of odd numbers up to 9 (excluding): {}", sum_odd_numbers(9));
}
It is also the return type of functions that loop forever (e.g. loop {} ) like network servers or
functions that terminate the process (e.g. exit() ).
Modules
Rust provides a powerful module system that can be used to hierarchically split code in
logical units (modules), and manage visibility (public/private) between them.
A module is a collection of items: functions, structs, traits, impl blocks, and even other
modules.
Visibility
By default, the items in a module have private visibility, but this can be overridden with the
pub modifier. Only the public items of a module can be accessed from outside the module
scope.
Struct visibility
Structs have an extra level of visibility with their fields. The visibility defaults to private, and
can be overridden with the pub modifier. This visibility only matters when a struct is
accessed from outside the module where it is defined, and has the goal of hiding
information (encapsulation).
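A sketch of field visibility across a module boundary. The module and type names are illustrative, and the peek accessor is a hypothetical addition for this example.

```rust
mod my {
    // A public struct with a public field.
    pub struct OpenBox<T> {
        pub contents: T,
    }

    // A public struct with a private field.
    pub struct ClosedBox<T> {
        contents: T,
    }

    impl<T> ClosedBox<T> {
        // A public constructor is the only way to build a `ClosedBox`
        // from outside the module.
        pub fn new(contents: T) -> ClosedBox<T> {
            ClosedBox { contents }
        }

        // Hypothetical read-only accessor, added for this sketch.
        pub fn peek(&self) -> &T {
            &self.contents
        }
    }
}

fn main() {
    let open_box = my::OpenBox { contents: "public information" };
    println!("The open box contains: {}", open_box.contents);

    // `my::ClosedBox { contents: 5 }` would not compile here, because
    // `contents` is private; the constructor must be used instead.
    let closed_box = my::ClosedBox::new(5);
    println!("The closed box contains: {}", closed_box.peek());
}
```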
See also:
File hierarchy
Modules can be mapped to a file/directory hierarchy. Let's break down the visibility example
in files:
$ tree .
.
|-- my
| |-- inaccessible.rs
| |-- mod.rs
| `-- nested.rs
`-- split.rs
In split.rs :
// This declaration will look for a file named `my.rs` or `my/mod.rs` and will
// insert its contents inside a module named `my` under this scope
mod my;
fn function() {
println!("called `function()`");
}
fn main() {
my::function();
function();
my::indirect_access();
my::nested::function();
}
In my/mod.rs :
// Similarly `mod inaccessible` and `mod nested` will locate the `nested.rs`
// and `inaccessible.rs` files and insert them here under their respective
// modules
mod inaccessible;
pub mod nested;
pub fn function() {
println!("called `my::function()`");
}
fn private_function() {
println!("called `my::private_function()`");
}
pub fn indirect_access() {
print!("called `my::indirect_access()`, that\n> ");
private_function();
}
In my/nested.rs :
pub fn function() {
println!("called `my::nested::function()`");
}
#[allow(dead_code)]
fn private_function() {
println!("called `my::nested::private_function()`");
}
In my/inaccessible.rs :
#[allow(dead_code)]
pub fn public_function() {
println!("called `my::inaccessible::public_function()`");
}
Crates
A crate can be compiled into a binary or into a library. By default, rustc will produce a
binary from a crate. This behavior can be overridden by passing --crate-type=lib to
rustc .
Creating a Library
Let's create a library, and then see how to link it to another crate.
pub fn public_function() {
println!("called rary's `public_function()`");
}
fn private_function() {
println!("called rary's `private_function()`");
}
pub fn indirect_access() {
print!("called rary's `indirect_access()`, that\n> ");
private_function();
}
Libraries get prefixed with "lib", and by default they get named after their crate file, but this
default name can be overridden by passing the --crate-name option to rustc or by using
the crate_name attribute.
Using a Library
To link a crate to this new library you may use rustc 's --extern flag. All of its items will
then be imported under a module named the same as the library. This module generally
behaves the same way as any other module.
// extern crate rary; // May be required for Rust 2015 edition or earlier
fn main() {
rary::public_function();
rary::indirect_access();
}
# Where library.rlib is the path to the compiled library, assumed that it's
# in the same directory here:
$ rustc executable.rs --extern rary=library.rlib --edition=2018 && ./executable
called rary's `public_function()`
called rary's `indirect_access()`, that
> called rary's `private_function()`
Cargo
cargo is the official Rust package management tool. It has lots of really useful features to
improve code quality and developer velocity! These include:
Dependency management and integration with crates.io (the official Rust package
registry)
Awareness of unit tests
Awareness of benchmarks
This chapter will go through some quick basics, but you can find the comprehensive docs in
The Cargo Book.
Dependencies
Most programs have dependencies on some libraries. If you have ever managed
dependencies by hand, you know how much of a pain this can be. Luckily, the Rust
ecosystem comes standard with cargo ! cargo can manage dependencies for a project.
# A binary
cargo new foo
# OR A library
cargo new --lib foo
For the rest of this chapter, let's assume we are making a binary, rather than a library, but all
of the concepts are the same.
After the above commands, you should see a file hierarchy like this:
foo
├── Cargo.toml
└── src
└── main.rs
The main.rs is the root source file for your new project -- nothing new there. The
Cargo.toml is the config file for cargo for this project ( foo ). If you look inside it, you
should see something like this:
[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]
[dependencies]
The name field under [package] determines the name of the project. This is used by
crates.io if you publish the crate (more later). It is also the name of the output binary
when you compile.
The authors field is a list of authors used when publishing the crate.
The [dependencies] section lets you add dependencies for your project.
For example, suppose that we want our program to have a great CLI. You can find lots of
great packages on crates.io (the official Rust package registry). One popular choice is clap. As
of this writing, the most recent published version of clap is 2.27.1 . To add a dependency
to our program, we can simply add the following to our Cargo.toml under
[dependencies] : clap = "2.27.1" . And that's it! You can start using clap in your
program.
cargo also supports other types of dependencies. Here is just a small sampling:
[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]
[dependencies]
clap = "2.27.1" # from crates.io
rand = { git = "https://github.com/rust-lang-nursery/rand" } # from online repo
bar = { path = "../bar" } # from a path in the local filesystem
cargo is more than a dependency manager. All of the available configuration options are
listed in the format specification of Cargo.toml .
To build our project we can execute cargo build anywhere in the project directory
(including subdirectories!). We can also do cargo run to build and run. Notice that these
commands will resolve all dependencies, download crates if needed, and build everything,
including your crate. (Note that it only rebuilds what it has not already built, similar to
make ).
Conventions
In the previous chapter, we saw the following directory hierarchy:
foo
├── Cargo.toml
└── src
└── main.rs
Suppose that we wanted to have two binaries in the same project, though. What then?
It turns out that cargo supports this. The default binary is built from src/main.rs and takes
the package name, as we saw before, but you can add additional binaries by placing them in a bin/ directory:
foo
├── Cargo.toml
└── src
├── main.rs
└── bin
└── my_other_bin.rs
To tell cargo to compile or run this binary as opposed to the default or other binaries, we
just pass cargo the --bin my_other_bin flag, where my_other_bin is the name of the
binary we want to work with.
In addition to extra binaries, cargo supports more features such as benchmarks, tests, and
examples.
Testing
As we know, testing is integral to any piece of software! Rust has first-class support for unit
and integration testing (see this chapter in TRPL).
From the testing chapters linked above, we see how to write unit tests and integration tests.
Organizationally, we can place unit tests in the modules they test and integration tests in
their own tests/ directory:
foo
├── Cargo.toml
├── src
│ └── main.rs
└── tests
├── my_test.rs
└── my_other_test.rs
$ cargo test
$ cargo test
Compiling blah v0.1.0 (file:///nobackup/blah)
Finished dev [unoptimized + debuginfo] target(s) in 0.89 secs
Running target/debug/deps/blah-d3b32b97275ec472
running 4 tests
test test_bar ... ok
test test_baz ... ok
test test_foo_bar ... ok
test test_foo ... ok
running 2 tests
test test_foo ... ok
test test_foo_bar ... ok
One word of caution: Cargo may run multiple tests concurrently, so make sure that they
don't race with each other. For example, if they all output to a file, you should make them
write to different files.
Build Scripts
Sometimes a normal build from cargo is not enough. Perhaps your crate needs some
prerequisites before cargo will successfully compile, such as code generation or some
native code that needs to be compiled. To solve this problem we have build scripts that
Cargo can run.
To add a build script to your package, you can specify it in the Cargo.toml as follows:
[package]
...
build = "build.rs"
Otherwise Cargo will look for a build.rs file in the project directory by default.
Cargo provides the script with inputs via environment variables that the script can read.
The script provides output via stdout. All lines printed are written to
target/debug/build/<pkg>/output . Further, lines prefixed with cargo: will be interpreted by
Cargo directly and hence can be used to define parameters for the package's compilation.
For further specification and examples have a read of the Cargo specification.
Attributes
An attribute is metadata applied to some module, crate or item. This metadata can be used
to/for:
When attributes apply to a whole crate, their syntax is #![crate_attribute] , and when
they apply to a module or item, the syntax is #[item_attribute] (notice the missing bang
! ).
#[attribute = "value"]
#[attribute(key = "value")]
#[attribute(value)]
Attributes can have multiple values and can be separated over multiple lines, too:
#[attribute(value, value2)]
dead_code
The compiler provides a dead_code lint that will warn about unused functions. An attribute
can be used to disable the lint.
Note that in real programs, you should eliminate dead code. In these examples we'll allow
dead code in some places because of the interactive nature of the examples.
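A minimal sketch of disabling the lint for a single item. The function names are illustrative only:

```rust
fn used_function() -> &'static str {
    "used"
}

// `#[allow(dead_code)]` silences the `dead_code` lint for this item only.
#[allow(dead_code)]
fn unused_function() {}

fn main() {
    // Only `used_function` is ever called; without the attribute above,
    // the compiler would warn that `unused_function` is never used.
    println!("{}", used_function());
}
```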
Crates
The crate_type attribute can be used to tell the compiler whether a crate is a binary or a
library (and even which type of library), and the crate_name attribute can be used to set the
name of the crate.
However, it is important to note that both the crate_type and crate_name attributes have
no effect whatsoever when using Cargo, the Rust package manager. Since Cargo is used for
the majority of Rust projects, this means real-world uses of crate_type and crate_name
are relatively limited.
When the crate_type attribute is used, we no longer need to pass the --crate-type flag
to rustc .
$ rustc lib.rs
$ ls lib*
library.rlib
cfg
Configuration conditional checks are possible through two different operators:
While the former enables conditional compilation, the latter conditionally evaluates to true
or false literals allowing for checks at run-time. Both utilize identical argument syntax.
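As a sketch, the two forms are the `#[cfg(...)]` attribute and the `cfg!(...)` macro. The function name `os_name` is made up for this example:

```rust
// Compiled only when the target OS is Linux.
#[cfg(target_os = "linux")]
fn os_name() -> &'static str {
    "linux"
}

// Compiled only when the target OS is *not* Linux.
#[cfg(not(target_os = "linux"))]
fn os_name() -> &'static str {
    "not linux"
}

fn main() {
    println!("compiled for: {}", os_name());

    // `cfg!` evaluates to a boolean literal, so both branches are
    // compiled and the choice is made by an ordinary `if`.
    if cfg!(target_os = "linux") {
        println!("Yes. It's definitely linux!");
    } else {
        println!("Yes. It's definitely *not* linux!");
    }
}
```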
See also:
Custom
Some conditionals like target_os are implicitly provided by rustc , but custom conditionals
must be passed to rustc using the --cfg flag.
Try to run this to see what happens without the custom cfg flag.
Generics
Generics is the topic of generalizing types and functionalities to broader cases. This is
extremely useful for reducing code duplication in many ways, but can call for rather
involved syntax. Namely, being generic requires taking great care to specify over which
types a generic type is actually considered valid. The simplest and most common use of
generics is for type parameters.
A type parameter is specified as generic by the use of angle brackets and upper camel case:
<Aaa, Bbb, ...> . "Generic type parameters" are typically represented as <T> . In Rust,
"generic" also describes anything that accepts one or more generic type parameters <T> .
Any type specified as a generic type parameter is generic, and everything else is concrete
(non-generic).
For example, defining a generic function named foo that takes an argument T of any type:
fn foo<T>(arg: T) { ... }
Because T has been specified as a generic type parameter using <T> , it is considered
generic when used here as (arg: T) . This is the case even if T has previously been defined
as a struct .
See also:
structs
Functions
The same set of rules can be applied to functions: a type T becomes generic when
preceded by <T> .
Using generic functions sometimes requires explicitly specifying type parameters. This may
be the case if the function is called where the return type is generic, or if the compiler
doesn't have enough information to infer the necessary type parameters.
A function call with explicitly specified type parameters looks like: fun::<A, B, ...>() .
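A small sketch of both situations. `default_pair` is a hypothetical helper whose return type is generic, so the caller has to supply `T` either explicitly or through an annotation:

```rust
// `T` is generic here because it is introduced by `<T>`.
fn default_pair<T: Default>() -> (T, T) {
    (T::default(), T::default())
}

fn main() {
    // Nothing in the call pins down `T`, so it is specified explicitly.
    let numbers = default_pair::<i32>();
    println!("{:?}", numbers);

    // Alternatively, an annotation gives the compiler enough information.
    let strings: (String, String) = default_pair();
    println!("{:?}", strings);
}
```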
See also:
Implementation
See also:
Traits
Of course trait s can also be generic. Here we define one which reimplements the Drop
trait as a generic method to drop itself and an input.
See also:
Bounds
When working with generics, the type parameters often must use traits as bounds to
stipulate what functionality a type implements. For example, the following example uses the
trait Display to print and so it requires T to be bound by Display ; that is, T must
implement Display .
Bounding restricts the generic to types that conform to the bounds. That is:
Another effect of bounding is that generic instances are allowed to access the methods of
traits specified in the bounds. For example:
As an additional note, where clauses can also be used to apply bounds in some cases to be
more expressive.
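A sketch of both effects of bounding. The `HasArea` trait and `Rectangle` type are assumptions made for this illustration:

```rust
use std::fmt::Display;

// `T` must implement `Display`, so `{}` can be used to print it.
fn describe<T: Display>(value: T) -> String {
    format!("value: {}", value)
}

trait HasArea {
    fn area(&self) -> f64;
}

struct Rectangle {
    width: f64,
    height: f64,
}

impl HasArea for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

// The bound `T: HasArea` is what makes `shape.area()` callable here.
fn area_of<T: HasArea>(shape: &T) -> f64 {
    shape.area()
}

fn main() {
    println!("{}", describe(3));
    // struct NotDisplay;
    // describe(NotDisplay); // error: `NotDisplay` doesn't implement `Display`

    let rectangle = Rectangle { width: 3.0, height: 4.0 };
    println!("area: {}", area_of(&rectangle));
}
```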
See also:
Multiple bounds
Multiple bounds can be applied with a + . Like normal, different types are separated with , .
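For instance, a sketch with two illustrative functions, one using `+` on a single type and one separating two type parameters with `,`:

```rust
use std::fmt::{Debug, Display};

// `T` must implement both `Debug` and `Display`.
fn compare_prints<T: Debug + Display>(t: &T) -> String {
    format!("Debug: `{:?}`, Display: `{}`", t, t)
}

// Two type parameters, each with its own bound.
fn compare_types<T: Debug, U: Debug>(t: &T, u: &U) -> String {
    format!("t: `{:?}`, u: `{:?}`", t, u)
}

fn main() {
    println!("{}", compare_prints(&"words"));
    println!("{}", compare_types(&[1, 2], &vec![3]));
}
```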
See also:
Where clauses
A bound can also be expressed using a where clause immediately before the opening { ,
rather than at the type's first mention. Additionally, where clauses can apply bounds to
arbitrary types, rather than just to type parameters.
impl <A: TraitB + TraitC, D: TraitE + TraitF> MyTrait<A, D> for YourType {}
A where clause is also useful when it is more expressive than the normal syntax. The impl in
this example cannot be directly expressed without a where clause:
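One way such an impl might look, as a sketch. The trait name `PrintInOption` is chosen for illustration, and the bound is placed on `Option<T>` rather than on `T` itself:

```rust
use std::fmt::Debug;

trait PrintInOption {
    fn print_in_option(self) -> String;
}

// The bound applies to `Option<T>`, which is not a type parameter,
// so it can only be written in a `where` clause.
impl<T> PrintInOption for T
where
    Option<T>: Debug,
{
    fn print_in_option(self) -> String {
        // `Option<T>: Debug` is exactly what is formatted here.
        format!("{:?}", Some(self))
    }
}

fn main() {
    println!("{}", vec![1, 2, 3].print_in_option());
}
```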
See also:
For example, an age verification function that checks age in years, must be given a value of
type Years .
Uncomment the last print statement to observe that the type supplied must be Years .
To obtain the newtype 's value as the base type, you may use tuple syntax like so:
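A sketch of the age check described above, assuming a second newtype `Days` to show what the type system rejects:

```rust
struct Years(i64);
struct Days(i64);

impl Years {
    fn to_days(&self) -> Days {
        Days(self.0 * 365)
    }
}

// Only a `Years` value is accepted; a bare `i64` or a `Days` is not.
fn is_adult(age: &Years) -> bool {
    age.0 >= 18
}

fn main() {
    let age = Years(25);
    let age_days = age.to_days();
    println!("Old enough: {}", is_adult(&age));
    // println!("Old enough: {}", is_adult(&age_days)); // error: expected `&Years`

    // Tuple syntax yields the base value...
    let days_as_primitive: i64 = age_days.0;
    // ...and so does destructuring.
    let Years(years_as_primitive) = age;
    println!("{} years is about {} days", years_as_primitive, days_as_primitive);
}
```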
See also:
structs
Associated items
"Associated Items" refers to a set of rules pertaining to item s of various types. It is an
extension to trait generics, and allows trait s to internally define new items.
One such item is called an associated type, providing simpler usage patterns when the trait
is generic over its container type.
See also:
RFC
The Problem
A trait that is generic over its container type has type specification requirements - users of
the trait must specify all of its generic types.
In the example below, the Contains trait allows the use of the generic types A and B .
The trait is then implemented for the Container type, specifying i32 for A and B so that it
can be used with fn difference() .
Because Contains is generic, we are forced to explicitly state all of the generic types for fn
difference() . In practice, we want a way to express that A and B are determined by the
input C . As you will see in the next section, associated types provide exactly that capability.
See also:
Associated types
The use of "Associated types" improves the overall readability of code by moving inner types
locally into a trait as output types. Syntax for the trait definition is as follows:
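// `A` and `B` are defined in the trait via the `type` keyword.
// (Note: `type` in this context is different from `type` when used for
// aliases).
trait Contains {
    type A;
    type B;

    // The associated types are referred to as `Self::A` and `Self::B`.
    fn contains(&self, _: &Self::A, _: &Self::B) -> bool;
}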
Note that functions that use the trait Contains are no longer required to express A or B
at all:
Let's rewrite the example from the previous section using associated types:
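A sketch of what that rewrite might look like, assuming a two-field `Container` and the helper methods `first` and `last`:

```rust
struct Container(i32, i32);

trait Contains {
    // The implementor decides what `A` and `B` are.
    type A;
    type B;

    fn contains(&self, a: &Self::A, b: &Self::B) -> bool;
    fn first(&self) -> i32;
    fn last(&self) -> i32;
}

impl Contains for Container {
    type A = i32;
    type B = i32;

    fn contains(&self, a: &i32, b: &i32) -> bool {
        (&self.0 == a) && (&self.1 == b)
    }
    fn first(&self) -> i32 {
        self.0
    }
    fn last(&self) -> i32 {
        self.1
    }
}

// No mention of `A` or `B` is needed: `C` alone determines them.
fn difference<C: Contains>(container: &C) -> i32 {
    container.last() - container.first()
}

fn main() {
    let container = Container(3, 10);
    println!("contains 3 and 10: {}", container.contains(&3, &10));
    println!("difference: {}", difference(&container));
}
```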
Data types can use extra generic type parameters to act as markers or to perform type
checking at compile time. These extra parameters hold no storage values, and have no
runtime behavior.
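A sketch using `std::marker::PhantomData` as the marker. The `Length`, `Meters`, and `Feet` types are hypothetical:

```rust
use std::marker::PhantomData;

// Marker types: they are never constructed, only used as type parameters.
#[allow(dead_code)]
struct Meters;
#[allow(dead_code)]
struct Feet;

// `Unit` takes no storage; it exists only for compile-time checking.
struct Length<Unit> {
    value: f64,
    _unit: PhantomData<Unit>,
}

impl<Unit> Length<Unit> {
    fn new(value: f64) -> Self {
        Length { value, _unit: PhantomData }
    }

    // Only lengths carrying the same unit marker can be added.
    fn add(self, other: Length<Unit>) -> Length<Unit> {
        Length::new(self.value + other.value)
    }
}

fn main() {
    let a: Length<Meters> = Length::new(1.5);
    let b: Length<Meters> = Length::new(2.0);
    println!("total: {} m", a.add(b).value);

    // let c: Length<Meters> = Length::new(1.0);
    // let f: Length<Feet> = Length::new(3.0);
    // c.add(f); // error: mismatched types
}
```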
See also:
Borrowing ( & ), Bounds ( X: Y ), enum, impl & self, Overloading, ref, Traits ( X for Y ), and
TupleStructs.
Scoping rules
Scopes play an important part in ownership, borrowing, and lifetimes. That is, they indicate
to the compiler when borrows are valid, when resources can be freed, and when variables
are created or destroyed.
RAII
Variables in Rust do more than just hold data in the stack: they also own resources, e.g.
Box<T> owns memory in the heap. Rust enforces RAII (Resource Acquisition Is Initialization),
so whenever an object goes out of scope, its destructor is called and its owned resources are
freed.
This behavior shields against resource leak bugs, so you rarely have to free memory manually
or worry about memory leaks. Here's a quick showcase:
No leaks here!
Destructor
The notion of a destructor in Rust is provided through the Drop trait. The destructor is
called when the resource goes out of scope. This trait is not required to be implemented for
every type; implement it for your type only if it needs its own destructor logic.
Run the below example to see how the Drop trait works. When the variable in the main
function goes out of scope the custom destructor will be invoked.
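A minimal sketch. The `Guard` type and its drop counter are assumptions, added so that the destructor call is observable:

```rust
use std::cell::Cell;

struct Guard<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // Custom destructor logic: count the drop and announce it.
        self.drops.set(self.drops.get() + 1);
        println!("Guard is being dropped");
    }
}

fn main() {
    let drops = Cell::new(0);
    {
        let _guard = Guard { drops: &drops };
        println!("Made a Guard, drops so far: {}", drops.get());
    } // `_guard` goes out of scope here, so `drop` runs.
    println!("drops so far: {}", drops.get());
}
```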
See also:
Box
When doing assignments ( let x = y ) or passing function arguments by value ( foo(x) ), the
ownership of the resources is transferred. In Rust-speak, this is known as a move.
After moving resources, the previous owner can no longer be used. This avoids creating
dangling pointers.
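A short sketch of a move, with an illustrative function `destroy_box` that takes ownership of its argument:

```rust
// Takes ownership of the box; the heap memory is freed when `c` is dropped.
fn destroy_box(c: Box<i32>) -> i32 {
    *c
}

fn main() {
    // Stack-only `Copy` types are copied, not moved.
    let x = 5u32;
    let y = x;
    println!("x is {}, and y is {}", x, y);

    let a = Box::new(5i32);
    // Ownership of the heap allocation moves from `a` to `b`.
    let b = a;
    // println!("a contains: {}", a); // error: borrow of moved value: `a`

    // Passing by value moves `b` into the function.
    let value = destroy_box(b);
    // println!("b contains: {}", b); // error: borrow of moved value: `b`
    println!("the box contained {}", value);
}
```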
Mutability
Mutability of data can be changed when ownership is transferred.
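For example (a sketch; `increment` is a made-up helper):

```rust
fn increment(immutable_box: Box<u32>) -> Box<u32> {
    // Moving the box into a `mut` binding changes its mutability.
    let mut mutable_box = immutable_box;
    *mutable_box += 1;
    mutable_box
}

fn main() {
    let immutable_box = Box::new(5u32);
    // *immutable_box = 4; // error: cannot assign, binding is not `mut`
    let result = increment(immutable_box);
    println!("the box now contains {}", result);
}
```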
Partial moves
Pattern bindings can have by-move and by-reference bindings at the same time, which is
used in destructuring. Using such a pattern results in a partial move of the variable, which
means that parts of the variable are moved while other parts stay. In this case, the parent
variable cannot be used afterwards as a whole. However, the parts that are only referenced
and not moved can still be used.
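A sketch of a partial move, assuming a `Person` struct with one field that is moved and one that is only referenced:

```rust
struct Person {
    name: String,
    age: Box<u8>,
}

fn split_person(person: Person) -> (String, u8) {
    // `name` is moved out of `person`, while `age` is only referenced.
    let Person { name, ref age } = person;
    let age_value = **age;

    // let whole = person; // error: use of partially moved value: `person`

    // The field that was not moved is still accessible through `person`.
    let same_age = *person.age;
    println!("age via binding: {}, via person.age: {}", age_value, same_age);

    (name, age_value)
}

fn main() {
    let person = Person {
        name: String::from("Alice"),
        age: Box::new(20),
    };
    let (name, age) = split_person(person);
    println!("{} is {}", name, age);
}
```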
See also:
destructuring
Borrowing
Most of the time, we'd like to access data without taking ownership over it. To accomplish
this, Rust uses a borrowing mechanism. Instead of passing objects by value ( T ), objects can
be passed by reference ( &T ).
The compiler statically guarantees (via its borrow checker) that references always point to
valid objects. That is, while references to an object exist, the object cannot be destroyed.
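A sketch contrasting a function that borrows with one that takes ownership. Both function names are illustrative:

```rust
// Takes ownership of the box and destroys it.
fn eat_box(boxed: Box<i32>) -> i32 {
    *boxed
}

// Only borrows an `i32`.
fn borrow_value(value: &i32) -> i32 {
    *value
}

fn main() {
    let boxed = Box::new(5);
    let stacked = 6;

    // Borrowing does not take ownership, so both values stay usable.
    println!("{} {}", borrow_value(&boxed), borrow_value(&stacked));

    {
        let reference: &i32 = &boxed;
        // eat_box(boxed); // error: cannot move out of `boxed` while it is borrowed
        println!("{}", borrow_value(reference));
    } // `reference` is no longer used past this point.

    // Now the box can be moved and destroyed.
    println!("{}", eat_box(boxed));
}
```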
Mutability
Mutable data can be mutably borrowed using &mut T . This is called a mutable reference and
gives read/write access to the borrower. In contrast, &T borrows the data via an immutable
reference, and the borrower can read the data but not modify it:
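A sketch with a hypothetical `Book` type, one reader taking `&Book` and one writer taking `&mut Book`:

```rust
struct Book {
    title: &'static str,
    year: u32,
}

// An immutable reference allows reading only.
fn read_year(book: &Book) -> u32 {
    book.year
}

// A mutable reference allows modification.
fn new_edition(book: &mut Book) {
    book.year = 2014;
}

fn main() {
    let immutable_book = Book { title: "Gödel, Escher, Bach", year: 1979 };
    println!("{} ({})", immutable_book.title, read_year(&immutable_book));
    // new_edition(&mut immutable_book); // error: cannot borrow as mutable

    let mut mutable_book = immutable_book;
    new_edition(&mut mutable_book);
    println!("{} ({})", mutable_book.title, read_year(&mutable_book));
}
```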
See also:
static
Aliasing
Data can be immutably borrowed any number of times, but while immutably borrowed, the
original data can't be mutably borrowed. On the other hand, only one mutable borrow is
allowed at a time. The original data can be borrowed again only after the mutable reference
has been used for the last time.
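The rules above can be sketched as follows. `Point` and `aliasing_demo` are names invented for this example:

```rust
struct Point {
    x: i32,
    y: i32,
}

fn aliasing_demo() -> (i32, i32) {
    let mut point = Point { x: 0, y: 0 };

    // Any number of immutable borrows may coexist.
    let first = &point;
    let second = &point;
    println!("shared borrows see ({}, {})", first.x, second.y);

    // `first` and `second` are not used after this line, so a mutable
    // borrow is allowed. Using either of them below would be an error.
    let exclusive = &mut point;
    exclusive.x = 5;

    // `exclusive` has been used for the last time, so the data can be
    // borrowed immutably again.
    let third = &point;
    (third.x, third.y)
}

fn main() {
    println!("{:?}", aliasing_demo());
}
```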
Lifetimes
A lifetime is a construct the compiler (or more specifically, its borrow checker) uses to ensure
all borrows are valid. Specifically, a variable's lifetime begins when it is created and ends
when it is destroyed. While lifetimes and scopes are often referred to together, they are not
the same.
Take, for example, the case where we borrow a variable via & . The borrow has a lifetime
that is determined by where it is declared. As a result, the borrow is valid as long as it ends
before the lender is destroyed. However, the scope of the borrow is determined by where
the reference is used.
In the following example and in the rest of this section, we will see how lifetimes relate to
scopes, as well as how the two differ.
Note that no names or types are assigned to label lifetimes. This restricts how lifetimes will
be able to be used as we will see.
Explicit annotation
The borrow checker uses explicit lifetime annotations to determine how long references
should be valid. In cases where lifetimes are not elided, Rust requires explicit annotations
to determine what the lifetime of a reference should be. The syntax for explicitly annotating
a lifetime uses an apostrophe character as follows:
foo<'a>
// `foo` has a lifetime parameter `'a`
Similar to closures, using lifetimes requires generics. Additionally, this lifetime syntax
indicates that the lifetime of foo may not exceed that of 'a . Explicit annotation of a type
has the form &'a T where 'a has already been introduced.
foo<'a, 'b>
// `foo` has lifetime parameters `'a` and `'b`
In this case, the lifetime of foo cannot exceed that of either 'a or 'b .
See also:
Functions
Ignoring elision, function signatures with lifetimes have a few constraints:
Additionally, note that returning references without input is banned if it would result in
returning references to invalid data. The following example shows off some valid forms of
functions with lifetimes:
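A sketch of a few valid forms, with one invalid form left commented out. The function names are illustrative:

```rust
// One input reference with lifetime `'a`; the output borrows from it.
fn first_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

// Two inputs with different lifetimes; the output is tied to `'a` only.
fn pick_first<'a, 'b>(x: &'a i32, _y: &'b i32) -> &'a i32 {
    x
}

// A mutable reference can carry a lifetime as well.
fn add_one<'a>(x: &'a mut i32) {
    *x += 1;
}

// Invalid: the `String` is dropped at the end of the function, so the
// returned reference would point to invalid data.
// fn invalid_output<'a>() -> &'a String { &String::from("foo") }

fn main() {
    println!("{}", first_word("hello world"));

    let (x, y) = (7, 9);
    println!("{}", pick_first(&x, &y));

    let mut t = 3;
    add_one(&mut t);
    println!("{}", t);
}
```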
See also:
functions
Methods
Methods are annotated similarly to functions:
See also:
methods
Structs
Annotation of lifetimes in structures is also similar to functions:
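A sketch with a hypothetical `Borrowed` struct that holds a reference, plus a method on it:

```rust
// The reference stored in `Borrowed` must outlive the struct itself.
#[derive(Debug)]
struct Borrowed<'a> {
    value: &'a i32,
}

impl<'a> Borrowed<'a> {
    // The returned reference carries the struct's lifetime parameter.
    fn get(&self) -> &'a i32 {
        self.value
    }
}

fn main() {
    let x = 18;
    let single = Borrowed { value: &x };
    println!("{:?} holds {}", single, single.get());
}
```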
See also:
struct s
Traits
Annotation of lifetimes in trait methods is basically similar to functions. Note that impl
may have lifetime annotations too.
See also:
trait s
Bounds
Just like generic types can be bounded, lifetimes (themselves generic) use bounds as well.
The : character has a slightly different meaning here, but + is the same. Note how the
following read:
The example below shows this syntax in action, used after the keyword where :
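A sketch, reading `T: 'a` as "all references in `T` must outlive `'a`" and `T: Debug + 'a` as "`T` implements `Debug` and all references in `T` outlive `'a`". The names `Ref` and `debug_string` are illustrative:

```rust
use std::fmt::Debug;

// `Ref` holds a reference to a `T` whose own references outlive `'a`.
#[derive(Debug)]
struct Ref<'a, T: 'a>(&'a T);

// The bounds are written after the `where` keyword.
fn debug_string<'a, T>(t: &'a T) -> String
where
    T: Debug + 'a,
{
    format!("{:?}", t)
}

fn main() {
    let x = 7;
    let ref_x = Ref(&x);
    println!("{}", debug_string(&ref_x));
}
```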
See also:
Coercion
A longer lifetime can be coerced into a shorter one so that it works inside a scope it normally
wouldn't work in. This comes in the form of inferred coercion by the Rust compiler, and also
in the form of declaring a lifetime difference:
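Both forms, sketched with two illustrative functions:

```rust
// Inferred coercion: both arguments are coerced to one common lifetime,
// as short as the shorter of the two.
fn multiply<'a>(first: &'a i32, second: &'a i32) -> i32 {
    first * second
}

// Declared difference: `<'a: 'b, 'b>` reads as "`'a` is at least as long
// as `'b`", so `&'a i32` can be returned where `&'b i32` is expected.
fn choose_first<'a: 'b, 'b>(first: &'a i32, _: &'b i32) -> &'b i32 {
    first
}

fn main() {
    let first = 2; // Longer lifetime
    {
        let second = 3; // Shorter lifetime
        println!("The product is {}", multiply(&first, &second));
        println!("{} is the first", choose_first(&first, &second));
    }
}
```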
Static
Rust has a few reserved lifetime names. One of those is 'static . You might encounter it in
two situations:
Both are related but subtly different and this is a common source for confusion when
learning Rust. Here are some examples for each situation:
Reference lifetime
As a reference lifetime 'static indicates that the data pointed to by the reference lives for
the entire lifetime of the running program. It can still be coerced to a shorter lifetime.
There are two ways to make a variable with 'static lifetime, and both are stored in the
read-only memory of the binary:
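A sketch showing a `static` item and a string literal, along with coercion to a shorter lifetime. `NUM` and `coerce_static` are names chosen for this example:

```rust
// A constant made with the `static` declaration.
static NUM: i32 = 18;

// Returns a reference to `NUM` whose `'static` lifetime is coerced
// to that of the input argument.
fn coerce_static<'a>(_: &'a i32) -> &'a i32 {
    &NUM
}

fn main() {
    {
        // A string literal has type `&'static str`.
        let static_string: &'static str = "hello world";
        println!("static_string: {}", static_string);
        // The binding goes away here, but the data stays in the binary.
    }

    {
        let lifetime_num = 9;
        let coerced_static = coerce_static(&lifetime_num);
        println!("coerced_static: {}", coerced_static);
    }

    println!("NUM: {} stays accessible!", NUM);
}
```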
Trait bound
As a trait bound, it means the type does not contain any non-static references. E.g., the
receiver can hold on to the type for as long as they want and it will never become invalid
until they drop it.
It's important to understand this means that any owned data always passes a 'static
lifetime bound, but a reference to that owned data generally does not:
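A sketch of that difference. `debug_static` is a hypothetical function with a `'static` bound:

```rust
use std::fmt::Debug;

fn debug_static(input: impl Debug + 'static) -> String {
    format!("{:?}", input)
}

fn main() {
    // `i` is owned and contains no references, so it satisfies `'static`.
    let i = 5;
    println!("{}", debug_static(i));

    // A reference to `i` only lives as long as this scope, so it does not:
    // debug_static(&i); // error: `i` does not live long enough
}
```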
See also:
'static constants
Elision
Some lifetime patterns are overwhelmingly common and so the borrow checker will allow
you to omit them to save typing and to improve readability. This is known as elision. Elision
exists in Rust solely because these patterns are common.
The following code shows a few examples of elision. For a more comprehensive description
of elision, see lifetime elision in the book.
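A sketch pairing each elided signature with its fully annotated equivalent. The function names are illustrative:

```rust
// These two functions have identical signatures once elision is applied.
fn elided_input(x: &i32) -> i32 {
    *x + 1
}

fn annotated_input<'a>(x: &'a i32) -> i32 {
    *x + 1
}

// Likewise here: the output lifetime is inferred from the single input.
fn elided_pass(x: &i32) -> &i32 {
    x
}

fn annotated_pass<'a>(x: &'a i32) -> &'a i32 {
    x
}

fn main() {
    let x = 3;
    println!("{} {}", elided_input(&x), annotated_input(&x));
    println!("{} {}", elided_pass(&x), annotated_pass(&x));
}
```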
See also:
elision
Traits
A trait is a collection of methods defined for an unknown type: Self . They can access
other methods declared in the same trait.
Traits can be implemented for any data type. In the example below, we define Animal , a
group of methods. The Animal trait is then implemented for the Sheep data type,
allowing the use of methods from Animal with a Sheep .
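A sketch of such an `Animal` trait and `Sheep` type. The particular methods and strings are assumptions for illustration:

```rust
struct Sheep {
    naked: bool,
    name: &'static str,
}

trait Animal {
    // `Self` refers to the implementor type.
    fn new(name: &'static str) -> Self;

    fn name(&self) -> &'static str;
    fn noise(&self) -> &'static str;

    // A default method can call other methods of the same trait.
    fn talk(&self) -> String {
        format!("{} says {}", self.name(), self.noise())
    }
}

impl Sheep {
    fn shear(&mut self) {
        self.naked = true;
    }
}

impl Animal for Sheep {
    fn new(name: &'static str) -> Sheep {
        Sheep { name, naked: false }
    }

    fn name(&self) -> &'static str {
        self.name
    }

    fn noise(&self) -> &'static str {
        if self.naked { "baaaaah?" } else { "baaaaah!" }
    }
}

fn main() {
    // The annotation tells the compiler which implementor `new` builds.
    let mut dolly: Sheep = Animal::new("Dolly");
    println!("{}", dolly.talk());
    dolly.shear();
    println!("{}", dolly.talk());
}
```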
Derive
The compiler is capable of providing basic implementations for some traits via the
#[derive] attribute. These traits can still be manually implemented if a more complex
behavior is required.
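For instance, a sketch with a hypothetical `Centimeters` tuple struct deriving several standard traits:

```rust
// `Debug` enables `{:?}`, `Clone` enables `.clone()`, and `PartialEq` and
// `PartialOrd` enable comparisons.
#[derive(Debug, Clone, PartialEq, PartialOrd)]
struct Centimeters(f64);

fn main() {
    let short = Centimeters(1.0);
    let long = Centimeters(2.5);

    println!("{:?} < {:?}: {}", short, long, short < long);
    println!("clone equals original: {}", short.clone() == short);
}
```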
See also:
derive
However, there's an easy workaround. Instead of returning a trait object directly, our
functions return a Box which contains some Animal . A box is just a reference to some
memory in the heap. Because a reference has a statically-known size, and the compiler can
guarantee it points to a heap-allocated Animal , we can return a trait from our function!
Rust tries to be as explicit as possible whenever it allocates memory on the heap. So if your
function returns a pointer-to-trait-on-heap in this way, you need to write the return type
with the dyn keyword, e.g. Box<dyn Animal> .
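A sketch of this workaround, assuming two implementors `Sheep` and `Cow` and a function that chooses between them at run time:

```rust
struct Sheep;
struct Cow;

trait Animal {
    fn noise(&self) -> &'static str;
}

impl Animal for Sheep {
    fn noise(&self) -> &'static str {
        "baaaaah!"
    }
}

impl Animal for Cow {
    fn noise(&self) -> &'static str {
        "moooooo!"
    }
}

// The concrete type is not known at compile time, so the function
// returns a boxed trait object instead.
fn random_animal(random_number: f64) -> Box<dyn Animal> {
    if random_number < 0.5 {
        Box::new(Sheep)
    } else {
        Box::new(Cow)
    }
}

fn main() {
    let animal = random_animal(0.234);
    println!("You've randomly chosen an animal, and it says {}", animal.noise());
}
```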
Operator Overloading
In Rust, many of the operators can be overloaded via traits. That is, some operators can be
used to accomplish different tasks based on their input arguments. This is possible because
operators are syntactic sugar for method calls. For example, the + operator in a + b calls
the add method (as in a.add(b) ). This add method is part of the Add trait. Hence, the +
operator can be used by any implementor of the Add trait.
A list of the traits, such as Add , that overload operators can be found in core::ops .
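A sketch of overloading `+` for a hypothetical `Point` type by implementing `Add`:

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    // `a + b` is sugar for `a.add(b)`.
    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = Point { x: 3, y: 4 };
    println!("{:?} + {:?} = {:?}", a, b, a + b);
    println!("a.add(b) = {:?}", a.add(b));
}
```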
See Also
Drop
The Drop trait only has one method: drop , which is called automatically when an object
goes out of scope. The main use of the Drop trait is to free the resources that the
implementor instance owns.
Box , Vec , String , File , and Process are some examples of types that implement the
Drop trait to free resources. The Drop trait can also be manually implemented for any
custom data type.
The following example adds a print to console to the drop function to announce when it is
called.
Iterators
The Iterator trait is used to implement iterators over collections such as arrays.
The trait requires only a method to be defined for the next element, which may be
manually defined in an impl block or automatically defined (as in arrays and ranges).
As a point of convenience for common situations, the for construct turns some collections
into iterators using the .into_iter() method.
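A sketch of a manually defined iterator. The `Fibonacci` struct and the `fibonacci` constructor are illustrative names:

```rust
struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Iterator for Fibonacci {
    type Item = u32;

    // Only `next` has to be defined; adapters such as `take` and `skip`
    // are provided by the trait.
    fn next(&mut self) -> Option<u32> {
        let current = self.curr;
        self.curr = self.next;
        self.next = current + self.next;
        // This sequence never ends, so `None` is never returned.
        Some(current)
    }
}

fn fibonacci() -> Fibonacci {
    Fibonacci { curr: 0, next: 1 }
}

fn main() {
    // `for` works with anything that can be turned into an iterator.
    for i in fibonacci().skip(4).take(4) {
        println!("> {}", i);
    }

    let array = [1u32, 3, 3, 7];
    for i in array.iter() {
        println!("> {}", i);
    }
}
```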
impl Trait
If your function returns a type that implements MyTrait , you can write its return type as ->
impl MyTrait . This can help simplify your type signatures quite a lot!
More importantly, some Rust types can't be written out. For example, every closure has its
own unnamed concrete type. Before impl Trait syntax, you had to allocate on the heap in
order to return a closure. But now you can do it all statically, like this:
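A sketch of returning a closure without boxing. `make_adder` is a made-up name:

```rust
// The closure's concrete type has no name, but `impl Fn(i32) -> i32`
// describes it well enough for callers.
fn make_adder(y: i32) -> impl Fn(i32) -> i32 {
    move |x: i32| x + y
}

fn main() {
    let plus_one = make_adder(1);
    println!("{}", plus_one(2));
}
```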
You can also use impl Trait to return an iterator that uses map or filter closures! This
makes using map and filter easier. Because closure types don't have names, you can't
write out an explicit return type if your function returns iterators with closures. But with
impl Trait you can do this easily:
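A sketch of such a function, with the hypothetical name `double_positives`:

```rust
// The return type hides the `Map<Filter<...>>` type, including the
// unnameable closure types inside it.
fn double_positives<'a>(numbers: &'a [i32]) -> impl Iterator<Item = i32> + 'a {
    numbers.iter().filter(|x| **x > 0).map(|x| x * 2)
}

fn main() {
    let numbers = vec![3, -1, 2];
    for n in double_positives(&numbers) {
        println!("{}", n);
    }
}
```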
Clone
When dealing with resources, the default behavior is to transfer them during assignments or
function calls. However, sometimes we need to make a copy of the resource as well.
The Clone trait helps us do exactly this. Most commonly, we can use the .clone() method
defined by the Clone trait.
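A sketch with a hypothetical `Pair` type that owns heap resources and derives `Clone`:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Pair(Box<i32>, Box<i32>);

fn main() {
    let pair = Pair(Box::new(1), Box::new(2));

    // Assignment moves the resources to `moved_pair`.
    let moved_pair = pair;
    // println!("{:?}", pair); // error: borrow of moved value: `pair`

    // `.clone()` makes an independent copy, including the heap data.
    let cloned_pair = moved_pair.clone();
    drop(moved_pair);

    // The clone is still usable after the original is dropped.
    println!("clone: {:?}", cloned_pair);
}
```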
Supertraits
Rust doesn't have "inheritance", but you can define a trait as being a superset of another
trait. For example:
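A sketch with hypothetical `Person` and `Student` traits, where implementing `Student` requires implementing `Person` as well:

```rust
trait Person {
    fn name(&self) -> String;
}

// `Person` is a supertrait of `Student`.
trait Student: Person {
    fn university(&self) -> String;
}

struct Undergrad;

impl Person for Undergrad {
    fn name(&self) -> String {
        String::from("Alice")
    }
}

impl Student for Undergrad {
    fn university(&self) -> String {
        String::from("Example University")
    }
}

// Methods of the supertrait are available through `dyn Student`.
fn greeting(student: &dyn Student) -> String {
    format!("My name is {} and I attend {}.", student.name(), student.university())
}

fn main() {
    println!("{}", greeting(&Undergrad));
}
```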
See also:
Good news: because each trait implementation gets its own impl block, it's clear which
trait's get method you're implementing.
What about when it comes time to call those methods? To disambiguate between them, we
have to use Fully Qualified Syntax.
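A sketch, assuming two traits that both declare a `get` method and a `Form` type that implements both:

```rust
trait UsernameWidget {
    fn get(&self) -> String;
}

trait AgeWidget {
    fn get(&self) -> u8;
}

struct Form {
    username: String,
    age: u8,
}

impl UsernameWidget for Form {
    fn get(&self) -> String {
        self.username.clone()
    }
}

impl AgeWidget for Form {
    fn get(&self) -> u8 {
        self.age
    }
}

fn main() {
    let form = Form { username: String::from("rustacean"), age: 28 };

    // let value = form.get(); // error: multiple applicable items in scope

    // Fully qualified syntax names the trait whose method is meant.
    let username = <Form as UsernameWidget>::get(&form);
    let age = <Form as AgeWidget>::get(&form);
    println!("{} is {}", username, age);
}
```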
See also:
macro_rules!
Rust provides a powerful macro system that allows metaprogramming. As you've seen in
previous chapters, macros look like functions, except that their name ends with a bang ! ,
but instead of generating a function call, macros are expanded into source code that gets
compiled with the rest of the program. However, unlike macros in C and other languages,
Rust macros are expanded into abstract syntax trees, rather than string preprocessing, so
you don't get unexpected precedence bugs.
1. Don't repeat yourself. There are many cases where you may need similar functionality
in multiple places but with different types. Often, writing a macro is a useful way to
avoid repeating code. (More on this later)
2. Domain-specific languages. Macros allow you to define special syntax for a specific
purpose. (More on this later)
3. Variadic interfaces. Sometimes you want to define an interface that takes a variable
number of arguments. An example is println! , which can take any number of
arguments, depending on the format string. (More on this later)
Syntax
In the following subsections, we will show how to define macros in Rust. There are three basic
ideas:
Designators
The arguments of a macro are prefixed by a dollar sign $ and type annotated with a
designator:
block
expr is used for expressions
ident is used for variable/function names
item
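A sketch using the `ident` and `expr` designators. Both macro names are illustrative:

```rust
// `$func_name:ident` takes an identifier and creates a function with that name.
macro_rules! create_function {
    ($func_name:ident) => {
        fn $func_name() -> &'static str {
            // `stringify!` converts the identifier into a string.
            stringify!($func_name)
        }
    };
}

create_function!(foo);
create_function!(bar);

// `$expression:expr` takes an expression; it is both shown as text and evaluated.
macro_rules! describe_expression {
    ($expression:expr) => {
        format!("{:?} = {:?}", stringify!($expression), $expression)
    };
}

fn main() {
    println!("You called {}()", foo());
    println!("You called {}()", bar());
    println!("{}", describe_expression!(1u32 + 1));
}
```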
Overload
Macros can be overloaded to accept different combinations of arguments. In that regard,
macro_rules! can work similarly to a match block:
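A sketch of a macro with two arms, selected by the `and` / `or` token in the invocation. The macro name `check` is made up:

```rust
macro_rules! check {
    // Arguments don't need to be separated by a comma;
    // any template can be used.
    ($left:expr; and $right:expr) => {
        format!("{} and {} is {}", stringify!($left), stringify!($right), $left && $right)
    };
    // Each arm must end with a semicolon, except the last one.
    ($left:expr; or $right:expr) => {
        format!("{} or {} is {}", stringify!($left), stringify!($right), $left || $right)
    };
}

fn main() {
    println!("{}", check!(1i32 + 1 == 2i32; and 2i32 * 2 == 4i32));
    println!("{}", check!(true; or false));
}
```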
Repeat
Macros can use + in the argument list to indicate that an argument may repeat at least
once, or * , to indicate that the argument may repeat zero or more times.
In the following example, surrounding the matcher with $(...),+ will match one or more
expressions, separated by commas. Also note that the semicolon is optional on the last case.
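A sketch of such a macro, here called `find_min`, which computes the minimum of one or more expressions:

```rust
macro_rules! find_min {
    // Base case: a single expression.
    ($x:expr) => ($x);
    // `$x` followed by at least one more expression: recurse on the tail.
    ($x:expr, $($y:expr),+) => (
        std::cmp::min($x, find_min!($($y),+))
    )
}

fn main() {
    println!("{}", find_min!(1));
    println!("{}", find_min!(1 + 2, 2));
    println!("{}", find_min!(5, 2 * 3, 4));
}
```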
Suppose that I want to define a little calculator API. I would like to supply an expression and
have the output printed to console.
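One possible sketch of such a macro. Here `calculate!` builds the line of text and `main` prints it; `eval` is just a token chosen for the syntax, not a Rust keyword:

```rust
macro_rules! calculate {
    (eval $e:expr) => {{
        // Force the expression to be evaluated as an integer.
        let val: usize = $e;
        format!("{} = {}", stringify!($e), val)
    }};
}

fn main() {
    println!("{}", calculate!(eval 1 + 2));
    println!("{}", calculate!(eval (1 + 2) * (3 / 4)));
}
```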
Output:
1 + 2 = 3
(1 + 2) * (3 / 4) = 0
This was a very simple example, but much more complex interfaces have been developed,
such as lazy_static or clap .
Also, note the two pairs of braces in the macro. The outer ones are part of the syntax of
macro_rules! , in addition to () or [] .
Variadic Interfaces
A variadic interface takes an arbitrary number of arguments. For example, println! can
take an arbitrary number of arguments, as determined by the format string.
We can extend our calculate! macro from the previous section to be variadic:
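One way to do this, sketched with a repetition in the matcher (a recursive macro would work too). This variant collects the result lines into a `Vec<String>` so that `main` can print them:

```rust
macro_rules! calculate {
    // One or more `eval <expr>` items, separated by commas,
    // with an optional trailing comma.
    ($(eval $e:expr),+ $(,)?) => {
        vec![$({
            // Force each expression to be evaluated as an integer.
            let val: usize = $e;
            format!("{} = {}", stringify!($e), val)
        }),+]
    };
}

fn main() {
    let lines = calculate!(
        eval 1 + 2,
        eval 3 + 4,
        eval (2 * 3) + 1
    );
    for line in lines {
        println!("{}", line);
    }
}
```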
Output:
1 + 2 = 3
3 + 4 = 7
(2 * 3) + 1 = 7
Error handling
Error handling is the process of handling the possibility of failure. For example, failing to
read a file and then continuing to use that bad input would clearly be problematic. Noticing
and explicitly managing those errors saves the rest of the program from various pitfalls.
There are various ways to deal with errors in Rust, which are described in the following
subchapters. They all have more or less subtle differences and different use cases. As a rule
of thumb:
An explicit panic is mainly useful for tests and dealing with unrecoverable errors. For
prototyping it can be useful, for example when dealing with functions that haven't been
implemented yet, but in those cases the more descriptive unimplemented is better. In tests
panic is a reasonable way to explicitly fail.
The Option type is for when a value is optional or when the lack of a value is not an error
condition. For example the parent of a directory - / and C: don't have one. When dealing
with Option s, unwrap is fine for prototyping and for cases where a value is guaranteed to be
present. However, expect is more useful since it lets you specify
an error message in case something goes wrong anyway.
When there is a chance that things do go wrong and the caller has to deal with the problem,
use Result . You can unwrap and expect them as well (please don't do that unless it's a
test or quick prototype).
For a more rigorous discussion of error handling, refer to the error handling section in the
official book.
panic
The simplest error handling mechanism we will see is panic . It prints an error message,
starts unwinding the stack, and usually exits the program. Here, we explicitly call panic on
our error condition:
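A sketch, assuming a function `accept_gift` that refuses one particular gift:

```rust
fn accept_gift(gift: &str) -> String {
    // The error condition: we really don't want a snake.
    if gift == "snake" {
        panic!("AAAaaaaa!!!!");
    }
    format!("I love {}s!!!!!", gift)
}

fn main() {
    println!("{}", accept_gift("teddy bear"));
    // The next call would panic and, by default, exit the program:
    // accept_gift("snake");
}
```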
We could test this against the null string ( "" ) as we do with a snake. Since we're using Rust,
let's instead have the compiler point out cases where there's no gift.
An enum called Option<T> in the std library is used when absence is a possibility. It
manifests itself as one of two "options":
These cases can either be explicitly handled via match or implicitly with unwrap . Implicit
handling will either return the inner element or panic .
Note that it's possible to manually customize panic with expect, but unwrap otherwise
leaves us with a less meaningful output than explicit handling. In the following example,
explicit handling yields a more controlled result while retaining the option to panic if
desired.
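A sketch of both styles, with `Some(T)` holding a value and `None` signalling absence. The function names and messages are illustrative:

```rust
// Explicit handling with `match`: every case gets its own response.
fn describe_gift(gift: Option<&str>) -> String {
    match gift {
        Some("snake") => String::from("Yuck! I'm putting this snake back in the forest."),
        Some(inner) => format!("{}? How nice.", inner),
        None => String::from("No gift? Oh well."),
    }
}

// Implicit handling: `expect` returns the inner value or panics
// with the given message when it receives `None`.
fn open_gift(gift: Option<&str>) -> &str {
    gift.expect("expected a gift")
}

fn main() {
    println!("{}", describe_gift(Some("food")));
    println!("{}", describe_gift(Some("snake")));
    println!("{}", describe_gift(None));

    println!("{}", open_gift(Some("bird")));
    // open_gift(None); // would panic with "expected a gift"
}
```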
You can chain many ? s together to make your code much more readable.
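A sketch of chaining `?` on `Option` values, using hypothetical nested structs where each level may be missing:

```rust
struct Person {
    job: Option<Job>,
}

#[derive(Clone, Copy)]
struct Job {
    phone_number: Option<PhoneNumber>,
}

#[derive(Clone, Copy)]
#[allow(dead_code)]
struct PhoneNumber {
    area_code: Option<u8>,
    number: u32,
}

impl Person {
    // Each `?` returns `None` early if the value before it is `None`.
    fn work_phone_area_code(&self) -> Option<u8> {
        self.job?.phone_number?.area_code
    }
}

fn main() {
    let employed = Person {
        job: Some(Job {
            phone_number: Some(PhoneNumber { area_code: Some(61), number: 439_222_222 }),
        }),
    };
    let unemployed = Person { job: None };

    println!("{:?}", employed.work_phone_area_code());
    println!("{:?}", unemployed.work_phone_area_code());
}
```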
Combinators: map
match is a valid method for handling Option s. However, you may eventually find heavy
usage tedious, especially with operations only valid with an input. In these cases,
combinators can be used to manage control flow in a modular fashion.
Option has a built in method called map() , a combinator for the simple mapping of Some
-> Some and None -> None . Multiple map() calls can be chained together for even more
flexibility.
In the following example, process() replaces all functions previous to it while staying
compact.
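As a smaller, self-contained sketch of chained `map()` calls (not the `process()` pipeline itself), with a made-up function name:

```rust
// `to_digit` returns `Some(digit)` or `None`; each `map` only runs
// on the `Some` case and passes `None` through untouched.
fn double_plus_one(c: char) -> Option<u32> {
    c.to_digit(10).map(|d| d * 2).map(|d| d + 1)
}

fn main() {
    println!("{:?}", double_plus_one('4'));
    println!("{:?}", double_plus_one('x'));
}
```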
See also:
Combinators: and_then
map() was described as a chainable way to simplify match statements. However, using
map() on a function that returns an Option<T> results in the nested Option<Option<T>> .
Chaining multiple calls together can then become confusing. That's where another
combinator called and_then() , known in some languages as flatmap, comes in.
and_then() calls its function input with the wrapped value and returns the result. If the
Option is None , then it returns None instead.
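A sketch contrasting `map()` and `and_then()` on two fallible steps. The function names are illustrative:

```rust
fn checked_recip(x: f64) -> Option<f64> {
    if x != 0.0 { Some(1.0 / x) } else { None }
}

fn checked_sqrt(x: f64) -> Option<f64> {
    if x >= 0.0 { Some(x.sqrt()) } else { None }
}

// `map` wraps the second `Option` inside the first one.
fn nested(x: f64) -> Option<Option<f64>> {
    checked_recip(x).map(checked_sqrt)
}

// `and_then` flattens the result into a single `Option`.
fn recip_then_sqrt(x: f64) -> Option<f64> {
    checked_recip(x).and_then(checked_sqrt)
}

fn main() {
    println!("{:?}", nested(0.25));
    println!("{:?}", recip_then_sqrt(0.25));
    println!("{:?}", recip_then_sqrt(0.0));
    println!("{:?}", recip_then_sqrt(-4.0));
}
```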
See also:
Result
Result is a richer version of the Option type that describes possible error instead of
possible absence.
Like Option , Result has many methods associated with it. unwrap() , for example, either
yields the element T or panic s. For case handling, there are many combinators between
Result and Option that overlap.
In working with Rust, you will likely encounter methods that return the Result type, such as
the parse() method. It might not always be possible to parse a string into the other type,
so parse() returns a Result indicating possible failure.
Let's see what happens when we successfully and unsuccessfully parse() a string:
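A sketch with a hypothetical `multiply` function that parses two strings and unwraps the results:

```rust
fn multiply(first_number_str: &str, second_number_str: &str) -> i32 {
    // `unwrap` yields the parsed number or panics on a parse error.
    let first_number = first_number_str.parse::<i32>().unwrap();
    let second_number = second_number_str.parse::<i32>().unwrap();
    first_number * second_number
}

fn main() {
    let twenty = multiply("10", "2");
    println!("double is {}", twenty);

    // "t" is not a number, so `parse` returns an `Err`...
    println!("{:?}", "t".parse::<i32>());
    // ...and unwrapping it would panic:
    // let tt = multiply("t", "2");
}
```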
In the unsuccessful case, parse() leaves us with an error for unwrap() to panic on.
Additionally, the panic exits our program and provides an unpleasant error message.
To improve the quality of our error message, we should be more specific about the return
type and consider explicitly handling the error.
fn main() {
println!("Hello World!");
}
However main is also able to have a return type of Result . If an error occurs within the
main function it will return an error code and print a debug representation of the error
(using the Debug trait). The following example shows such a scenario and touches on
aspects covered in the following section.
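A sketch of a `main` that returns a `Result`. The helper `parse_number` is an assumption added for illustration:

```rust
use std::num::ParseIntError;

fn parse_number(text: &str) -> Result<i32, ParseIntError> {
    text.parse::<i32>()
}

fn main() -> Result<(), ParseIntError> {
    let number = match parse_number("10") {
        Ok(number) => number,
        // Returning the error from `main` prints its `Debug`
        // representation and exits with a non-zero code.
        Err(e) => return Err(e),
    };
    println!("{}", number);
    Ok(())
}
```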
We first need to know what kind of error type we are dealing with. To determine the Err
type, we look to parse() , which is implemented with the FromStr trait for i32 . As a result,
the Err type is specified as ParseIntError .
In the example below, the straightforward match statement leads to code that is overall
more cumbersome.
Luckily, Option 's map , and_then , and many other combinators are also implemented for
Result . Result contains a complete listing.
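A sketch of the same multiplication written with combinators instead of `match`:

```rust
use std::num::ParseIntError;

// If both values parse, multiply them; otherwise pass the first error on.
fn multiply(first: &str, second: &str) -> Result<i32, ParseIntError> {
    first
        .parse::<i32>()
        .and_then(|a| second.parse::<i32>().map(|b| a * b))
}

fn main() {
    println!("{:?}", multiply("10", "2"));
    println!("{:?}", multiply("t", "2"));
}
```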
At a module level, creating aliases can be particularly helpful. Errors found in a specific
module often have the same Err type, so a single alias can succinctly define all associated
Results . This is so useful that the std library even supplies one: io::Result !
See also:
io::Result
Early returns
In the previous example, we explicitly handled the errors using combinators. Another way to
deal with this case analysis is to use a combination of match statements and early returns.
That is, we can simply stop executing the function and return the error if one occurs. For
some, this form of code can be easier to both read and write. Consider this version of the
previous example, rewritten using early returns:
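A sketch of that version, again using a `multiply` function over two strings:

```rust
use std::num::ParseIntError;

fn multiply(first: &str, second: &str) -> Result<i32, ParseIntError> {
    // On error, stop executing and return the error immediately.
    let a = match first.parse::<i32>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    let b = match second.parse::<i32>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    Ok(a * b)
}

fn main() {
    println!("{:?}", multiply("10", "2"));
    println!("{:?}", multiply("t", "2"));
}
```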
At this point, we've learned to explicitly handle errors using combinators and early returns.
While we generally want to avoid panicking, explicitly handling all of our errors is
cumbersome.
In the next section, we'll introduce ? for the cases where we simply need to unwrap without
possibly inducing panic .
Introducing ?
Sometimes we just want the simplicity of unwrap without the possibility of a panic . Until
now, unwrap has forced us to nest deeper and deeper when what we really wanted was to
get the variable out. This is exactly the purpose of ? .
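A sketch of the same `multiply` function written with `?`:

```rust
use std::num::ParseIntError;

fn multiply(first: &str, second: &str) -> Result<i32, ParseIntError> {
    // `?` yields the value on `Ok` and returns early on `Err`.
    let a = first.parse::<i32>()?;
    let b = second.parse::<i32>()?;
    Ok(a * b)
}

fn main() {
    println!("{:?}", multiply("10", "2"));
    println!("{:?}", multiply("t", "2"));
}
```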
In the following code, two instances of unwrap generate different error types. Vec::first
returns an Option , while parse::<i32> returns a Result<i32, ParseIntError> :
Over the next sections, we'll see several strategies for handling these kinds of problems.
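A sketch of the problem, with a hypothetical `double_first` function that can fail in two different ways:

```rust
fn double_first(vec: &[&str]) -> i32 {
    // `first` returns an `Option`: `None` if the slice is empty.
    let first = vec.first().unwrap();
    // `parse` returns a `Result`: `Err` if the text is not a number.
    2 * first.parse::<i32>().unwrap()
}

fn main() {
    println!("The first doubled is {}", double_first(&["42", "93", "18"]));
    // Both of these would panic, each for a different reason:
    // double_first(&[]);
    // double_first(&["tofu", "93", "18"]);
}
```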
There are times when we'll want to stop processing on errors (like with ? ) but keep going
when the Option is None . A couple of combinators come in handy to swap the Result and
Option .
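One way to do the swap, sketched here with `Option::transpose`, which turns an `Option<Result<T, E>>` into a `Result<Option<T>, E>`:

```rust
use std::num::ParseIntError;

fn double_first(vec: &[&str]) -> Result<Option<i32>, ParseIntError> {
    // `opt` is an `Option<Result<i32, ParseIntError>>`.
    let opt = vec.first().map(|first| first.parse::<i32>().map(|n| 2 * n));
    // An empty slice becomes `Ok(None)`; a parse failure becomes `Err`.
    opt.transpose()
}

fn main() {
    println!("{:?}", double_first(&["42", "93", "18"]));
    println!("{:?}", double_first(&[]));
    println!("{:?}", double_first(&["tofu", "93", "18"]));
}
```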
Rust allows us to define our own error types. In general, a "good" error type:
The stdlib helps in boxing our errors by having Box implement conversion from any type
that implements the Error trait into the trait object Box<dyn Error> , via From .
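A sketch of boxing two different error kinds. The `EmptyVec` error type and `double_first` function are assumptions for illustration:

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct EmptyVec;

impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

impl Error for EmptyVec {}

fn double_first(vec: &[&str]) -> Result<i32, Box<dyn Error>> {
    vec.first()
        .ok_or_else(|| EmptyVec.into()) // Converts to a `Box<dyn Error>`.
        .and_then(|s| {
            s.parse::<i32>()
                .map_err(|e| e.into()) // Converts to a `Box<dyn Error>`.
                .map(|i| 2 * i)
        })
}

fn main() {
    for input in [vec!["42", "93"], vec![], vec!["tofu"]] {
        match double_first(&input) {
            Ok(n) => println!("The first doubled is {}", n),
            Err(e) => println!("Error: {}", e),
        }
    }
}
```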
See also:
Other uses of ?
Notice in the previous example that our immediate reaction to calling parse is to map the
error from a library error into a boxed error:
.and_then(|s| s.parse::<i32>()
    .map_err(|e| e.into()))
Since this is a simple and common operation, it would be convenient if it could be elided.
Alas, because and_then is not sufficiently flexible, it cannot. However, we can instead use ? .
? was previously explained as either unwrap or return Err(err) . This is only mostly true.
It actually means unwrap or return Err(From::from(err)) . Since From::from is a
conversion utility between different types, this means that if you ? where the error is
convertible to the return type, it will convert automatically.
Here, we rewrite the previous example using ? . As a result, the map_err will go away when
From::from is implemented for our error type:
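A sketch of that rewrite, assuming the same hypothetical `EmptyVec` error type boxed into `Box<dyn Error>`:

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct EmptyVec;

impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

impl Error for EmptyVec {}

fn double_first(vec: &[&str]) -> Result<i32, Box<dyn Error>> {
    // `?` applies `From::from`, so both error types are boxed automatically.
    let first = vec.first().ok_or(EmptyVec)?;
    let parsed = first.parse::<i32>()?;
    Ok(2 * parsed)
}

fn main() {
    for input in [vec!["42", "93"], vec![], vec!["tofu"]] {
        match double_first(&input) {
            Ok(n) => println!("The first doubled is {}", n),
            Err(e) => println!("Error: {}", e),
        }
    }
}
```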
This is actually fairly clean now. Compared with the original panic , it is very similar to
replacing the unwrap calls with ? except that the return types are Result . As a result, they
must be destructured at the top level.
See also:
From::from and ?
Wrapping errors
An alternative to boxing errors is to wrap them in your own error type.
This adds a bit more boilerplate for handling errors and might not be needed in all
applications. There are some libraries that can take care of the boilerplate for you.
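The following sketch shows what such a wrapper might look like. The DoubleError type and double_first helper are hypothetical names chosen for illustration:

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type that wraps the library error in one variant.
#[derive(Debug)]
enum DoubleError {
    EmptyVec,
    Parse(ParseIntError),
}

impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DoubleError::EmptyVec => write!(f, "please use a vector with at least one element"),
            DoubleError::Parse(_) => write!(f, "the provided string could not be parsed as an integer"),
        }
    }
}

impl Error for DoubleError {
    // Expose the wrapped error as the underlying cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            DoubleError::EmptyVec => None,
            DoubleError::Parse(e) => Some(e),
        }
    }
}

// This conversion is what lets `?` turn a `ParseIntError` into a `DoubleError`.
impl From<ParseIntError> for DoubleError {
    fn from(err: ParseIntError) -> DoubleError {
        DoubleError::Parse(err)
    }
}

fn double_first(vec: &[&str]) -> Result<i32, DoubleError> {
    let first = vec.first().ok_or(DoubleError::EmptyVec)?;
    let parsed = first.parse::<i32>()?;
    Ok(2 * parsed)
}

fn main() {
    println!("{:?}", double_first(&["42"]));
    match double_first(&["tofu"]) {
        Ok(n) => println!("{}", n),
        Err(e) => println!("error: {} (caused by: {:?})", e, e.source()),
    }
}
```

Callers can match on the variants, which is not possible once an error has been boxed.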
When you look at the results, you'll note that everything is still wrapped in Result . A little
more boilerplate is needed for this.
Boxed values can be dereferenced using the * operator; this removes one layer of
indirection.
Vectors
Vectors are re-sizable arrays. Like slices, their size is not known at compile time, but they can
grow or shrink at any time. A vector is represented using 3 parameters:
pointer to the data
length
capacity
The capacity indicates how much memory is reserved for the vector. The vector can grow as
long as the length is smaller than the capacity. When this threshold needs to be surpassed,
the vector is reallocated with a larger capacity.
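A small sketch of the relationship between length and capacity. The exact capacity chosen after a reallocation is an implementation detail, so the code only checks lower bounds:

```rust
fn main() {
    // Reserve room for at least 4 elements up front.
    let mut v: Vec<i32> = Vec::with_capacity(4);
    println!("len = {}, capacity = {}", v.len(), v.capacity());

    // Pushing beyond the reserved capacity forces a reallocation.
    for i in 0..10 {
        v.push(i);
    }
    println!("len = {}, capacity = {}", v.len(), v.capacity());

    // The capacity is always at least the length.
    assert!(v.capacity() >= v.len());
}
```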
Strings
There are two types of strings in Rust: String and &str .
A String is stored as a vector of bytes ( Vec<u8> ), but guaranteed to always be a valid UTF-8
sequence. String is heap allocated, growable and not null terminated.
&str is a slice ( &[u8] ) that always points to a valid UTF-8 sequence, and can be used to
view into a String , just like &[T] is a view into Vec<T> .
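More str / String methods can be found under the std::str and std::string modules.
There are multiple ways to write string literals with special characters in them, and similarly
there are multiple ways to write byte string literals, which all result in &[u8; N] .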
Generally special characters are escaped with a backslash character: \ . This way you can
add any character to your string, even unprintable ones and ones that you don't know how
to type. If you want a literal backslash, escape it with another one: \\
String or character literal delimiters occurring within a literal must be escaped: "\"" , '\'' .
Sometimes there are just too many characters that need to be escaped or it's just much
more convenient to write a string out as-is. This is where raw string literals come into play.
Want a string that's not UTF-8? (Remember, str and String must be valid UTF-8). Or
maybe you want an array of bytes that's mostly text? Byte strings to the rescue!
For conversions between character encodings check out the encoding crate.
A more detailed listing of the ways to write string literals and escape characters is given in
the 'Tokens' chapter of the Rust Reference.
Option
Sometimes it's desirable to catch the failure of some parts of a program instead of calling
panic! ; this can be accomplished using the Option enum.
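For instance, a division that might fail can report the failure through None instead of panicking. The checked_division helper below is a hypothetical illustration:

```rust
// Hypothetical helper: integer division that reports failure with `None`
// instead of panicking.
fn checked_division(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        None
    } else {
        Some(dividend / divisor)
    }
}

fn main() {
    for &(a, b) in &[(4, 2), (1, 0)] {
        // `Option` values can be pattern matched, just like other enums.
        match checked_division(a, b) {
            None => println!("{} / {} failed!", a, b),
            Some(quotient) => println!("{} / {} = {}", a, b, quotient),
        }
    }
}
```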
Result
We've seen that the Option enum can be used as a return value from functions that may
fail, where None can be returned to indicate failure. However, sometimes it is important to
express why an operation failed. The Result<T, E> enum serves this purpose and has two variants:
Ok(value) which indicates that the operation succeeded, and wraps the value
returned by the operation. ( value has type T )
Err(why) , which indicates that the operation failed, and wraps why , which (hopefully)
explains the cause of the failure. ( why has type E )
?
Chaining results using match can get pretty untidy; luckily, the ? operator can be used to
make things pretty again. ? is used at the end of an expression returning a Result , and is
equivalent to a match expression, where the Err(err) branch expands to an early
Err(From::from(err)) , and the Ok(ok) branch expands to an ok expression.
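A short sketch of ? in use, with a hypothetical multiply helper:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses both strings and multiplies the results.
fn multiply(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // Each `?` either yields the parsed value or returns the error early.
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x * y)
}

fn main() {
    println!("{:?}", multiply("10", "2"));
    println!("{:?}", multiply("t", "2"));
}
```

Written with match, each of the two parse calls would need its own Ok and Err arms.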
Be sure to check the documentation, as there are many methods to map/compose Result .
panic!
The panic! macro can be used to generate a panic and start unwinding its stack. While
unwinding, the runtime will take care of freeing all the resources owned by the thread by
calling the destructor of all its objects.
Since we are dealing with programs with only one thread, panic! will cause the program to
report the panic message and exit.
HashMap
Where vectors store values by an integer index, HashMap s store values by key. HashMap
keys can be booleans, integers, strings, or any other type that implements the Eq and Hash
traits. More on this in the next section.
Like vectors, HashMap s are growable, but HashMaps can also shrink themselves when they
have excess space. You can create a HashMap with a certain starting capacity using
HashMap::with_capacity(usize) , or use HashMap::new() to get a HashMap with a default
initial capacity (recommended).
For more information on how hashing and hash maps (sometimes called hash tables) work,
have a look at the Hash Table article on Wikipedia.
Any type that implements the Eq and Hash traits can be a key in HashMap . This includes:
bool (though not very useful since there are only two possible keys)
the integer types, such as i32 and usize , and all variations thereof
String and &str (protip: you can have a HashMap keyed by String and call .get()
with an &str )
Note that f32 and f64 do not implement Hash , likely because floating-point precision
errors would make using them as hashmap keys horribly error-prone.
All collection classes implement Eq and Hash if their contained type also respectively
implements Eq and Hash . For example, Vec<T> will implement Hash if T implements
Hash .
You can easily implement Eq and Hash for a custom type with just one line:
#[derive(PartialEq, Eq, Hash)]
The compiler will do the rest. If you want more control over the details, you can implement
Eq and/or Hash yourself. This guide will not cover the specifics of implementing Hash .
To play around with using a struct in HashMap , let's try making a very simple user logon
system:
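A minimal sketch of such a system, assuming hypothetical Account and try_logon names and sample data invented for illustration:

```rust
use std::collections::HashMap;

// Hypothetical key type; deriving `Eq` and `Hash` makes it usable as a key.
#[derive(PartialEq, Eq, Hash)]
struct Account {
    username: String,
    password: String,
}

// Looks up the e-mail address stored for a username/password pair.
fn try_logon<'a>(
    accounts: &'a HashMap<Account, String>,
    username: &str,
    password: &str,
) -> Option<&'a str> {
    let key = Account {
        username: username.to_string(),
        password: password.to_string(),
    };
    accounts.get(&key).map(|email| email.as_str())
}

fn main() {
    let mut accounts = HashMap::new();
    accounts.insert(
        Account {
            username: "j.everyman".to_string(),
            password: "password123".to_string(),
        },
        "j.everyman@email.com".to_string(),
    );

    // A wrong password hashes to a different key, so the lookup fails.
    println!("{:?}", try_logon(&accounts, "j.everyman", "psasword123"));
    println!("{:?}", try_logon(&accounts, "j.everyman", "password123"));
}
```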
HashSet
Consider a HashSet as a HashMap where we just care about the keys ( HashSet<T> is, in
actuality, just a wrapper around HashMap<T, ()> ).
"What's the point of that?" you ask. "I could just store the keys in a Vec ."
A HashSet 's unique feature is that it is guaranteed to not have duplicate elements. That's
the contract that any set collection fulfills. HashSet is just one implementation. (see also:
BTreeSet )
If you insert a value that is already present in the HashSet (i.e. the new value is equal to the
existing one and they both have the same hash), then the set is left unchanged: insert returns
false and the existing value is kept.
This is great for when you never want more than one of something, or when you want to
know if you've already got something.
Sets have 4 primary operations (all of the following calls return an iterator):
union : get all the unique elements in both sets.
difference : get all the elements that are in the first set but not the second.
intersection : get all the elements that are in both sets.
symmetric_difference : get all the elements that are in one set or the other, but not
both.
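The set operations, including union, can be sketched as follows. Iteration order of a HashSet is unspecified, so a hypothetical sorted helper collects each result into a sorted Vec before printing:

```rust
use std::collections::HashSet;

// Hypothetical helper: collects an iterator of references into a sorted `Vec`.
fn sorted<'a>(items: impl Iterator<Item = &'a i32>) -> Vec<i32> {
    let mut v: Vec<i32> = items.cloned().collect();
    v.sort();
    v
}

fn main() {
    let mut a: HashSet<i32> = [1, 2, 3].iter().cloned().collect();
    let b: HashSet<i32> = [2, 3, 4].iter().cloned().collect();

    // `insert` returns `false` if the value was already present.
    println!("inserted 1 again: {}", a.insert(1));

    println!("union: {:?}", sorted(a.union(&b)));
    println!("difference: {:?}", sorted(a.difference(&b)));
    println!("intersection: {:?}", sorted(a.intersection(&b)));
    println!("symmetric difference: {:?}", sorted(a.symmetric_difference(&b)));
}
```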
Rc
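When multiple ownership is needed, Rc (Reference Counting) can be used. Rc keeps track
of the number of references, that is, the number of owners of the value wrapped
inside an Rc .
The reference count of an Rc increases by 1 whenever the Rc is cloned, and decreases by 1
whenever a cloned Rc goes out of scope. When the reference count reaches zero, meaning
no owners remain, both the Rc and the value are dropped.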
Cloning an Rc never performs a deep copy. Cloning creates just another pointer to the
wrapped value, and increments the count.
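A brief sketch that observes the count with Rc::strong_count:

```rust
use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("Rc examples"));
    println!("count after creation = {}", Rc::strong_count(&a));

    {
        // Cloning copies the pointer and bumps the count; the `String`
        // itself is not duplicated.
        let b = Rc::clone(&a);
        println!("count after clone = {}", Rc::strong_count(&a));
        // Both handles point at the same allocation.
        assert!(Rc::ptr_eq(&a, &b));
    }

    // `b` went out of scope, so the count dropped again.
    println!("count after drop = {}", Rc::strong_count(&a));
}
```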
Arc
When shared ownership between threads is needed, Arc (Atomically Reference Counted) can
be used. Through its Clone implementation, this struct creates another reference pointer to
the location of a value on the heap while increasing the reference counter. Because ownership
is shared between threads, the value is dropped only when the last reference pointer to it
goes out of scope.
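A minimal sketch, with an arbitrary string standing in for the shared value:

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // This value lives on the heap and is shared by every clone of the `Arc`.
    let apple = Arc::new("the same apple");
    let mut handles = Vec::new();

    for id in 0..3 {
        // Each thread gets its own pointer to the same value.
        let apple = Arc::clone(&apple);
        handles.push(thread::spawn(move || {
            println!("thread {} sees {:?}", id, apple);
        }));
    }

    // Wait for every thread to finish; each one drops its clone when it ends.
    for handle in handles {
        handle.join().unwrap();
    }
    println!("owners left = {}", Arc::strong_count(&apple));
}
```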
Std misc
Many other types are provided by the std library to support things such as:
Threads
Channels
File I/O
Threads
Rust provides a mechanism for spawning native OS threads via the spawn function. The
argument of this function is a moving closure.
Testcase: map-reduce
Rust makes it very easy to parallelise data processing, without many of the headaches
traditionally associated with such an attempt.
The standard library provides great threading primitives out of the box. These, combined
with Rust's concept of Ownership and aliasing rules, automatically prevent data races.
The aliasing rules (one writable reference XOR many readable references) automatically
prevent you from manipulating state that is visible to other threads. (Where synchronisation
is needed, there are synchronisation primitives like Mutex es or Channel s.)
In this example, we will calculate the sum of all digits in a block of numbers. We will do this
by parcelling out chunks of the block into different threads. Each thread will sum its tiny
block of digits, and subsequently we will sum the intermediate sums produced by each
thread.
Note that, although we're passing references across thread boundaries, Rust understands
that we're only passing read-only references, and that thus no unsafety or data races can
occur. Because we're move -ing the data segments into the thread, Rust will also ensure the
data is kept alive until the threads exit, so no dangling pointers occur.
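The approach described above can be sketched as follows. The sum_digits helper and the short input string are assumptions made for illustration; the input is split on whitespace, one thread per chunk:

```rust
use std::thread;

// Hypothetical helper: sums the digits of `data`, using one thread per
// whitespace-separated chunk.
fn sum_digits(data: &'static str) -> u32 {
    let mut children = Vec::new();

    // "Map" phase: each thread sums the digits of its own chunk.
    for segment in data.split_whitespace() {
        children.push(thread::spawn(move || -> u32 {
            segment
                .chars()
                .map(|c| c.to_digit(10).expect("should be a digit"))
                .sum()
        }));
    }

    // "Reduce" phase: join the threads and add up the intermediate sums.
    children.into_iter().map(|c| c.join().unwrap()).sum()
}

fn main() {
    let data = "8696 7897 3416";
    println!("Final sum result: {}", sum_digits(data));
}
```

The data has a 'static lifetime here, which is what allows the read-only string slices to be moved into the threads.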
Assignments
It is not wise to let our number of threads depend on user inputted data. What if the user
decides to insert a lot of spaces? Do we really want to spawn 2,000 threads? Modify the
program so that the data is always chunked into a limited number of chunks, defined by a
static constant at the beginning of the program.
See also:
Threads
vectors and iterators
closures, move semantics and move closures
destructuring assignments
turbofish notation to help type inference
unwrap vs. expect
enumerate
Channels
Rust provides asynchronous channels for communication between threads. Channels allow
a unidirectional flow of information between two end-points: the Sender and the Receiver .
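A small sketch using std::sync::mpsc, in which three worker threads each send their id to the main thread:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // `tx` is the `Sender` end-point, `rx` the `Receiver`.
    let (tx, rx) = mpsc::channel();
    let mut children = Vec::new();

    for id in 0..3 {
        // The sender can be cloned; each thread owns its own copy.
        let thread_tx = tx.clone();
        children.push(thread::spawn(move || {
            thread_tx.send(id).unwrap();
        }));
    }

    // Drop the original sender so the channel closes once all clones are gone.
    drop(tx);

    // `iter` blocks until a message arrives and ends when the channel closes.
    // Arrival order is not guaranteed, so sort before printing.
    let mut ids: Vec<i32> = rx.iter().collect();
    ids.sort();

    for child in children {
        child.join().unwrap();
    }
    println!("{:?}", ids);
}
```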
Path
The Path struct represents file paths in the underlying filesystem. There are two flavors of
Path : posix::Path , for UNIX-like systems, and windows::Path , for Windows. The prelude
exports the appropriate platform-specific Path variant.
A Path can be created from an OsStr , and provides several methods to get information
from the file/directory the path points to.
Note that a Path is not internally represented as a UTF-8 string, but instead is stored as an
OsString . Therefore, converting a Path to a &str is not free and may fail
(an Option is returned).
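A short sketch of building a path and converting it back to a string slice:

```rust
use std::path::Path;

fn main() {
    // Create a `Path` from a `&str`.
    let path = Path::new(".");

    // `join` merges a path with another component and returns a `PathBuf`.
    let new_path = path.join("a").join("b");

    // `to_str` returns `None` if the path is not valid UTF-8.
    match new_path.to_str() {
        None => panic!("new path is not a valid UTF-8 sequence"),
        Some(s) => println!("new path is {}", s),
    }

    println!("file name: {:?}", new_path.file_name());
}
```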
File I/O
The File struct represents a file that has been opened (it wraps a file descriptor), and gives
read and/or write access to the underlying file.
Since many things can go wrong when doing file I/O, all the File methods return the
io::Result<T> type, which is an alias for Result<T, io::Error> .
This makes the failure of all I/O operations explicit. Thanks to this, the programmer can see
all the failure paths, and is encouraged to handle them in a proactive manner.
open
The open static method can be used to open a file in read-only mode.
A File owns a resource, the file descriptor, and takes care of closing the file when it is
drop ed.
(You are encouraged to test the previous example under different failure conditions:
create
The create static method opens a file in write-only mode. If the file already existed, the old
content is destroyed. Otherwise, a new file is created.
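use std::fs::File;
use std::io::prelude::*;
use std::path::Path;
fn main() {
    let path = Path::new("lorem_ipsum.txt");
    let display = path.display();
    // Open the file in write-only mode; any existing content is truncated
    let mut file = File::create(path)
        .unwrap_or_else(|why| panic!("couldn't create {}: {}", display, why));
    // Placeholder text, used here only so that the example writes something
    file.write_all(b"lorem ipsum\n")
        .unwrap_or_else(|why| panic!("couldn't write to {}: {}", display, why));
}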
(As in the previous example, you are encouraged to test this example under failure
conditions.)
There is also an OpenOptions struct that can be used to configure how a file is opened.
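As a sketch of how these pieces fit together, the hypothetical write_then_append helper below creates a file, reopens it in append mode with OpenOptions, and reads the result back. The file name in the temporary directory is an arbitrary choice:

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, prelude::*};
use std::path::Path;

// Hypothetical helper: writes `first`, appends `second`, and returns the
// final contents of the file.
fn write_then_append(path: &Path, first: &str, second: &str) -> io::Result<String> {
    // `File::create` opens in write-only mode and truncates existing content.
    let mut file = File::create(path)?;
    file.write_all(first.as_bytes())?;
    drop(file);

    // `OpenOptions` lets us ask for append mode instead.
    let mut file = OpenOptions::new().append(true).open(path)?;
    file.write_all(second.as_bytes())?;

    // `File::open` opens in read-only mode.
    let mut contents = String::new();
    File::open(path)?.read_to_string(&mut contents)?;
    Ok(contents)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("rbe_open_options_sketch.txt");
    let contents = write_then_append(&path, "hello\n", "world\n")?;
    print!("{}", contents);
    fs::remove_file(&path)
}
```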
read_lines
The method lines() returns an iterator over the lines of a file.
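use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

fn main() {
    // File hosts must exist in current path before this produces output
    if let Ok(lines) = read_lines("./hosts") {
        // Consumes the iterator, returns an (Optional) String
        for line in lines {
            if let Ok(ip) = line {
                println!("{}", ip);
            }
        }
    }
}

// The output is wrapped in a Result to allow matching on errors.
// Returns an iterator over the lines of the file, read through a buffered reader.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}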
This process is more efficient than reading the whole file into a String in memory, especially
when working with larger files.
Child processes
The process::Output struct represents the output of a finished child process, and the
process::Command struct is a process builder.
(You are encouraged to try the previous example with an incorrect flag passed to rustc )
Pipes
The std::process::Child struct represents a running child process, and exposes the stdin , stdout
and stderr handles for interaction with the underlying process via pipes.
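use std::io::prelude::*;
use std::process::{Command, Stdio};

fn main() {
    // Spawn the `wc` command
    let process = match Command::new("wc")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn() {
        Err(why) => panic!("couldn't spawn wc: {}", why),
        Ok(process) => process,
    };

    // Write to the `stdin` of `wc`. The input text is only a placeholder.
    // `stdin` is an `Option<ChildStdin>`; it is `Some` because a pipe was requested.
    if let Err(why) = process.stdin.unwrap().write_all(b"some placeholder input\n") {
        panic!("couldn't write to wc stdin: {}", why);
    }

    // Because `stdin` does not live after the above calls, it is `drop`ed,
    // and the pipe is closed.
    //
    // This is very important, otherwise `wc` wouldn't start processing the
    // input we just sent.
    // `stdout` is also an `Option`, so it must be unwrapped before reading.
    let mut s = String::new();
    if let Err(why) = process.stdout.unwrap().read_to_string(&mut s) {
        panic!("couldn't read wc stdout: {}", why);
    }
    print!("wc responded with:\n{}", s);
}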
Wait
If you'd like to wait for a process::Child to finish, you must call Child::wait , which will
return a process::ExitStatus .
use std::process::Command;
fn main() {
    let mut child = Command::new("sleep").arg("5").spawn().unwrap();
    // Block until the child process has exited
    let _result = child.wait().unwrap();
}
Filesystem Operations
The std::fs module contains several functions that deal with the filesystem.
use std::fs;
use std::fs::{File, OpenOptions};
use std::io;
use std::io::prelude::*;
use std::os::unix;
use std::path::Path;
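// A simple implementation of `% cat path`
fn cat(path: &Path) -> io::Result<String> {
    let mut f = File::open(path)?;
    let mut s = String::new();
    match f.read_to_string(&mut s) {
        Ok(_) => Ok(s),
        Err(e) => Err(e),
    }
}
// A simple implementation of `% echo s > path`
#[allow(dead_code)]
fn echo(s: &str, path: &Path) -> io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(s.as_bytes())
}
// A simple implementation of `% touch path` (ignores existing files)
fn touch(path: &Path) -> io::Result<()> {
    match OpenOptions::new().create(true).write(true).open(path) {
        Ok(_) => Ok(()),
        Err(e) => Err(e),
    }
}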
fn main() {
println!("`mkdir a`");
// Create a directory, returns `io::Result<()>`
match fs::create_dir("a") {
Err(why) => println!("! {:?}", why.kind()),
Ok(_) => {},
}
println!("`mkdir -p a/c/d`");
// Recursively create a directory, returns `io::Result<()>`
fs::create_dir_all("a/c/d").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`touch a/c/e.txt`");
touch(&Path::new("a/c/e.txt")).unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`cat a/c/b.txt`");
match cat(&Path::new("a/c/b.txt")) {
Err(why) => println!("! {:?}", why.kind()),
Ok(s) => println!("> {}", s),
}
println!("`ls a`");
// Read the contents of a directory, returns `io::Result<fs::ReadDir>`
match fs::read_dir("a") {
Err(why) => println!("! {:?}", why.kind()),
Ok(paths) => for path in paths {
println!("> {:?}", path.unwrap().path());
},
}
println!("`rm a/c/e.txt`");
// Remove a file, returns `io::Result<()>`
fs::remove_file("a/c/e.txt").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`rmdir a/c/d`");
// Remove an empty directory, returns `io::Result<()>`
fs::remove_dir("a/c/d").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
}
$ tree a
a
|-- b.txt
`-- c
`-- b.txt -> ../b.txt
1 directory, 2 files
See also:
cfg!
Program arguments
Standard Library
The command line arguments can be accessed using std::env::args , which returns an
iterator that yields a String for each argument:
$ ./args 1 2 3
My path is ./args.
I got 3 arguments: ["1", "2", "3"].
Crates
Alternatively, there are numerous crates that can provide extra functionality when creating
command-line applications. The Rust Cookbook exhibits best practices on how to use one of
the more popular command line argument crates, clap .
Argument parsing
Matching can be used to parse simple arguments:
$ ./match_args Rust
This is not the answer.
$ ./match_args 42
This is the answer!
$ ./match_args do something
error: second argument not an integer
usage:
match_args <string>
Check whether given string is the answer.
match_args {increase|decrease} <integer>
Increase or decrease given integer by one.
$ ./match_args do 42
error: invalid command
usage:
match_args <string>
Check whether given string is the answer.
match_args {increase|decrease} <integer>
Increase or decrease given integer by one.
$ ./match_args increase 42
43
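A simplified sketch of such a parser, matching on the argument slice. The run helper and its condensed usage message are assumptions made for illustration, not the original listing:

```rust
use std::env;

// Hypothetical, simplified parser: returns the text to print, or an error message.
fn run(args: &[String]) -> Result<String, String> {
    match args {
        // One argument: check whether it is the answer.
        [_program, arg] => match arg.parse::<i32>() {
            Ok(42) => Ok("This is the answer!".to_string()),
            _ => Ok("This is not the answer.".to_string()),
        },
        // Two arguments: a command followed by an integer.
        [_program, cmd, num] => {
            let number: i32 = num
                .parse()
                .map_err(|_| "error: second argument not an integer".to_string())?;
            match cmd.as_str() {
                "increase" => Ok((number + 1).to_string()),
                "decrease" => Ok((number - 1).to_string()),
                _ => Err("error: invalid command".to_string()),
            }
        }
        // Anything else: show a condensed usage message.
        _ => Err("usage: match_args <string> | match_args {increase|decrease} <integer>".to_string()),
    }
}

fn main() {
    let args: Vec<String> = env::args().collect();
    match run(&args) {
        Ok(output) => println!("{}", output),
        Err(message) => eprintln!("{}", message),
    }
}
```

Keeping the parsing in a function that takes a slice makes it easy to exercise without launching the program.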
use std::fmt;
fn main() {
// z = -1 + 0i
let z = Complex { re: -1., im: 0. };
Testing
Rust is a programming language that cares a lot about correctness and it includes support
for writing software tests within the language itself.
Unit testing.
Doc testing.
Integration testing.
Also Rust has support for specifying additional dependencies for tests:
Dev-dependencies
See Also
The Book chapter on testing
API Guidelines on doc-testing
Unit testing
Tests are Rust functions that verify that the non-test code is functioning in the expected
manner. The bodies of test functions typically perform some setup, run the code we want to
test, then assert whether the results are what we expect.
Most unit tests go into a tests mod with the #[cfg(test)] attribute. Test functions are
marked with the #[test] attribute.
Tests fail when something in the test function panics. There are some helper macros:
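pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
// This is a deliberately wrong adding function; its purpose is to fail in
// this example.
#[allow(dead_code)]
fn bad_add(a: i32, b: i32) -> i32 {
    a - b
}
#[cfg(test)]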
mod tests {
// Note this useful idiom: importing names from outer (for mod tests) scope.
use super::*;
#[test]
fn test_add() {
assert_eq!(add(1, 2), 3);
}
#[test]
fn test_bad_add() {
// This assert would fire and test will fail.
// Please note, that private functions can be tested too!
assert_eq!(bad_add(1, 2), 3);
}
}
$ cargo test
running 2 tests
test tests::test_bad_add ... FAILED
test tests::test_add ... ok
failures:
failures:
tests::test_bad_add
Tests and ?
None of the previous unit test examples had a return type. But in Rust 2018, your unit tests
can return Result<()> , which lets you use ? in them! This can make them much more
concise.
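A minimal sketch, assuming a hypothetical sqrt function under test and String as the error type:

```rust
// Hypothetical function under test.
fn sqrt(number: f64) -> Result<f64, String> {
    if number >= 0.0 {
        Ok(number.powf(0.5))
    } else {
        Err("negative floats don't have square roots".to_owned())
    }
}

fn main() {
    println!("{:?}", sqrt(4.0));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_sqrt() -> Result<(), String> {
        let x = 4.0;
        // `?` propagates an `Err`, which makes the test fail.
        assert_eq!(sqrt(x)?.powf(2.0), x);
        Ok(())
    }
}
```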
Testing panics
To check functions that should panic under certain circumstances, use the
#[should_panic] attribute. This attribute accepts an optional parameter expected = with the
text of the panic message. If your function can panic in multiple ways, it helps make sure your
test is testing the correct panic.
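pub fn divide_non_zero_result(a: u32, b: u32) -> u32 {
    if b == 0 {
        panic!("Divide-by-zero error");
    } else if a < b {
        panic!("Divide result is zero");
    }
    a / b
}
#[cfg(test)]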
mod tests {
use super::*;
#[test]
fn test_divide() {
assert_eq!(divide_non_zero_result(10, 2), 5);
}
#[test]
#[should_panic]
fn test_any_panic() {
divide_non_zero_result(1, 0);
}
#[test]
#[should_panic(expected = "Divide result is zero")]
fn test_specific_panic() {
divide_non_zero_result(1, 10);
}
}
$ cargo test
running 3 tests
test tests::test_any_panic ... ok
test tests::test_divide ... ok
test tests::test_specific_panic ... ok
Doc-tests tmp-test-should-panic
running 0 tests
To run multiple tests one may specify part of a test name that matches all the tests that
should be run.
Doc-tests tmp-test-should-panic
running 0 tests
Ignoring tests
Tests can be marked with the #[ignore] attribute to exclude them from a normal test run.
Ignored tests can still be run with the command cargo test -- --ignored
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_add() {
assert_eq!(add(2, 2), 4);
}
#[test]
fn test_add_hundred() {
assert_eq!(add(100, 2), 102);
assert_eq!(add(2, 100), 102);
}
#[test]
#[ignore]
fn ignored_test() {
assert_eq!(add(0, 0), 0);
}
}
$ cargo test
running 3 tests
test tests::ignored_test ... ignored
test tests::test_add ... ok
test tests::test_add_hundred ... ok
Doc-tests tmp-ignore
running 0 tests
Documentation testing
The primary way of documenting a Rust project is through annotating the source code.
Documentation comments are written in markdown and support code blocks in them. Rust
cares about correctness, so these code blocks are compiled and used as tests.
/// Usually doc comments may include sections "Examples", "Panics" and "Failures".
///
/// The next function divides two numbers.
///
/// # Examples
///
/// ```
/// let result = doccomments::div(10, 2);
/// assert_eq!(result, 5);
/// ```
///
/// # Panics
///
/// The function panics if the second argument is zero.
///
/// ```rust,should_panic
/// // panics on division by zero
/// doccomments::div(10, 0);
/// ```
pub fn div(a: i32, b: i32) -> i32 {
if b == 0 {
panic!("Divide-by-zero error");
}
a / b
}
$ cargo test
running 0 tests
Doc-tests doccomments
running 3 tests
test src/lib.rs - add (line 7) ... ok
test src/lib.rs - div (line 21) ... ok
test src/lib.rs - div (line 31) ... ok
See Also
RFC505 on documentation style
API Guidelines on documentation guidelines
Integration testing
Unit tests are testing one module in isolation at a time: they're small and can test private
code. Integration tests are external to your crate and use only its public interface in the
same way any other code would. Their purpose is to test that many parts of your library
work correctly together.
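File src/lib.rs :
// Define this in a crate called `adder`.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
File with test: tests/integration_test.rs :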
#[test]
fn test_add() {
assert_eq!(adder::add(3, 2), 5);
}
$ cargo test
running 0 tests
Running target/debug/deps/integration_test-bcd60824f5fbfe19
running 1 test
test test_add ... ok
Doc-tests adder
running 0 tests
Each Rust source file in tests directory is compiled as a separate crate. One way of sharing
some code between integration tests is making module with public functions, importing and
using it within tests.
File tests/common.rs :
pub fn setup() {
// some setup code, like creating required files/directories, starting
// servers, etc.
}
File with test: tests/integration_test.rs :
// importing common module.
mod common;
#[test]
fn test_add() {
// using common code.
common::setup();
assert_eq!(adder::add(3, 2), 5);
}
Modules with common code follow the ordinary modules rules, so it's ok to create common
module as tests/common/mod.rs .
Development dependencies
Sometimes there is a need to have dependencies for tests (or examples, or benchmarks)
only. Such dependencies are added to Cargo.toml in the [dev-dependencies] section.
One such example is using a crate that extends the standard assert! macros.
File Cargo.toml :
File src/lib.rs :
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_add() {
assert_eq!(add(2, 3), 5);
}
}
See Also
Cargo docs on specifying dependencies.
Unsafe Operations
As an introduction to this section, to borrow from the official docs, "one should try to
minimize the amount of unsafe code in a code base." With that in mind, let's get started!
Unsafe annotations in Rust are used to bypass protections put in place by the compiler;
specifically, there are four primary things that unsafe is used for:
dereferencing raw pointers
calling functions or methods which are unsafe (including calling a function over FFI,
see a previous chapter of the book)
accessing or modifying static mutable variables
implementing unsafe traits
Raw Pointers
Raw pointers * and references &T function similarly, but references are always safe
because they are guaranteed to point to valid data due to the borrow checker.
Dereferencing a raw pointer can only be done through an unsafe block.
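A minimal sketch of creating and dereferencing a raw pointer:

```rust
fn main() {
    let value: u32 = 10;

    // Creating a raw pointer from a reference is safe ...
    let raw_p: *const u32 = &value;

    // ... but dereferencing it requires an `unsafe` block.
    unsafe {
        assert_eq!(*raw_p, 10);
    }
}
```

The dereference is sound here only because value is still alive while the pointer is used.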
For slice::from_raw_parts , one of the assumptions which must be upheld is that the
pointer passed in points to valid memory and that the memory pointed to is of the correct
type. If these invariants aren't upheld then the program's behaviour is undefined and there
is no knowing what will happen.
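A sketch of a call that upholds these assumptions by taking the pointer and the length from a live vector:

```rust
use std::slice;

fn main() {
    let some_vector = vec![1, 2, 3, 4];

    let pointer = some_vector.as_ptr();
    let length = some_vector.len();

    // Sound only because `pointer` and `length` describe memory that
    // `some_vector` still owns, and the element type matches.
    unsafe {
        let my_slice: &[u32] = slice::from_raw_parts(pointer, length);
        assert_eq!(some_vector.as_slice(), my_slice);
    }
}
```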
Compatibility
The Rust language is rapidly evolving, and because of this certain compatibility issues can
arise, despite efforts to ensure forwards-compatibility wherever possible.
Raw identifiers
Raw identifiers
Rust, like many programming languages, has the concept of "keywords". These identifiers
mean something to the language, and so you cannot use them in places like variable names,
function names, and other places. Raw identifiers let you use keywords where they would
not normally be allowed. This is particularly useful when Rust introduces new keywords, and
a library using an older edition of Rust has a variable or function with the same name as a
keyword introduced in a newer edition.
For example, consider a crate foo compiled with the 2015 edition of Rust that exports a
function named try . This keyword is reserved for a new feature in the 2018 edition, so
without raw identifiers, we would have no way to name the function.
fn main() {
foo::try();
}
fn main() {
foo::r#try();
}
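The same idea in a self-contained sketch, with a local module standing in for the hypothetical crate foo:

```rust
// Hypothetical module standing in for the 2015-edition crate `foo`.
mod foo {
    // `r#try` declares a function whose name is the keyword `try`.
    pub fn r#try() -> &'static str {
        "called foo::try"
    }
}

fn main() {
    // The raw identifier is also needed at the call site.
    println!("{}", foo::r#try());
}
```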
Meta
Some topics aren't exactly relevant to how you program but provide you tooling or
infrastructure support which just makes things better for everyone. These topics include:
Documentation: Generate library documentation for users via the included rustdoc .
Playpen: Integrate the Rust Playpen (also known as the Rust Playground) in your
documentation.
Documentation
Use cargo doc to build documentation in target/doc .
Use cargo test to run all tests (including documentation tests), and cargo test --doc to
only run documentation tests.
Doc comments
Doc comments are very useful for big projects that require documentation. When running
rustdoc , these are the comments that get compiled into documentation. They are denoted
by a /// , and support Markdown.
To run the tests, first build the code as a library, then tell rustdoc where to find the library
so it can link it into each doctest program:
Doc attributes
Below are a few examples of the most common #[doc] attributes used with rustdoc .
inline
#[doc(inline)]
pub use bar::Bar;
no_inline
hidden
For documentation, rustdoc is widely used by the community. It's what is used to generate
the std library docs.
Playpen
The Rust Playpen is a way to experiment with Rust code through a web interface. This
project is now commonly referred to as Rust Playground.
This allows the reader to both run your code sample and modify and tweak it. The key
here is adding the word editable to your code fence block, separated by a comma.
```rust,editable
//...place your code here
```
Additionally, you can add ignore if you want mdbook to skip your code when it builds and
tests.
```rust,editable,ignore
//...place your code here
```
opens the code sample up in a new tab in Rust Playground. This feature is enabled if you use
the #[doc] attribute called html_playground_url .