
2015-06-08 12:00:00

My initial experience with Rust

First, a digression about superhero movies

I am apparently incapable of hating any movie about a comic book superhero.

I can usually distinguish the extremes. Yes, I can tell that "The Dark Knight" was much better than "Elektra". My problem is that I tend to think that the worst movies in this genre are still pretty good.

And I have the same sort of unreasonable affection toward programming languages. I have always been fascinated by languages, compilers, and interpreters. My opinions about such things skew toward the positive simply because I find them so interesting.

I do still have preferences. For example, I tend to like strongly typed languages more. In fact, I think it is roughly true that the stricter a compiler is, the more I like it. But I can easily find things to like in languages that I mostly dislike.

I've spent more of my career writing C than any other language. But in use cases where I need something like C, I am increasingly eager for something more modern.

I started learning Rust with two questions:

The context

My exploration of Rust has taken place in one of my side projects: https://github.com/ericsink/LSM

LSM is a key-value database with a log-structured merge tree design. It is conceptually similar to Google LevelDB. I first wrote it in C#. Then I rewrote/ported it to F#. Now I have ported it to Rust. (The Rust port is not yet mentioned in the README for that repo, but it's in the top-level directory called 'rs'.)

For the purpose of learning F# and Rust, my initial experience was the same. The first thing I did in each of these languages was to port LSM. In other words, the F# and Rust ports of LSM are on equal footing. Both of them were written by someone who was a newbie in the language.

Anyway, although Rust and F# are very different languages, I have used F# as a reference point for my learning of Rust, so this blog entry walks that path as well.

This is not to say that I think Rust and F# would typically be used for the same kinds of things. I can give you directions from Denver to Chicago without asserting they are similar. Nonetheless, given that Rust is mostly intended to be a modern replacement for C, it has a surprising number of things in common with F#.

The big comparison table

                                   F#                                 Rust
Machine model                      Managed, .NET CLR                  Native, LLVM
Runtime                            CLR                                None
Style                              Multi-paradigm, functional-first   Multi-paradigm, imperative-first
Syntax family                      ML-ish                             C-ish
Blocks                             Significant whitespace             Curly braces
Exception handling                 Yes                                No
Strings                            .NET (UTF-16)                      UTF-8
Free allocated memory              Automatic, garbage collector       Automatic, static analysis
Type inference                     Yes, but not from method calls     Yes, but only within functions
Functional immutable collections   Yes                                No
Currying                           Yes                                No
Partial application                Yes                                No
Compiler strictness                Extremely strict                   Even stricter
Tuples                             Yes                                Yes
Discriminated unions               type Blob =                        enum Blob {
                                       | Stream of Stream                 Stream(Box<Read>),
                                       | Array of byte[]                  Array(Box<[u8]>),
                                       | Tombstone                        Tombstone,
                                                                      }
Mutability                         To be avoided                      Safe to use
Lambda expressions                 let f =                            let f =
                                     (fun acc item -> acc + item)       |acc, &item| acc + item;
Higher-order functions             List.fold f 0 a                    a.iter().fold(0, f)
Integer overflow checking          No (unless open Checked)           Yes
Let bindings                       let x = 1                          let x = 1;
                                   let mutable y = 2                  let mut y = 2;
if statements are expressions      Yes                                Yes
Unit type                          ()                                 ()
Pattern matching                   match cur with                     match cur {
                                   | Some csr -> csr.IsValid()            Some(csr) => csr.IsValid(),
                                   | None -> false                        None => false
                                                                      }
Primary collection type            Linked list                        Vector
Naming types                       CamelCase                          CamelCase
Naming functions, etc              camelCase                          snake_case
Warnings about naming conventions  No                                 Yes
Type for integer literals          Suffix (0uy)                       Inference (0) or suffix (0u8)
Project file                       foo.fsproj (msbuild)               Cargo.toml
Testing framework                  xUnit, NUnit, etc.                 Built into Cargo
Debug prints                       printf "%A" foo                    println!("{:?}", foo);

Memory safety

I have written a lot of C code over the years. More than once while in the middle of a project, I have stopped to explore ways of getting the compiler to catch my memory leaks. I tried the Clang static analyzer and Frama-C and Splint and others. It just seemed like there should be a way, even if I had to annotate function signatures with information about who owns a pointer.

So perhaps you can imagine my joy when I first read about Rust.

Even more cool, Rust has taken this set of ideas so much further than the simple feature I tried to envision. Rust doesn't just detect leaks, it also:

That last bullet is worth repeating: With Rust, you never stare at your code trying to figure out if it's thread safe or not. If it compiles, then it is free of data races.
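To make the threading point concrete, here is a minimal sketch of sharing a counter between threads. It is an illustration only, not code from LSM, and the name parallel_count is made up for the example. Sharing a plain usize by mutable reference across the spawned threads is rejected at compile time, so the shared state has to be wrapped in something like Arc and Mutex.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawn `n` threads that each add 1 to a shared counter.
fn parallel_count(n: usize) -> usize {
    // Arc gives shared ownership across threads; Mutex guards the mutation.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..n {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            // lock() returns a guard; the lock is released when the guard drops.
            *counter.lock().unwrap() += 1;
        }));
    }

    // Wait for every thread to finish before reading the result.
    for h in handles {
        h.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

Calling parallel_count(8) returns 8 on every run, because each increment happens while the lock is held.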

Safety is Rust's killer feature, and it is very compelling.

Mutability

If you come to Rust hoping to find a great functional language, you will be disappointed. Rust does have a bunch of functional elements, but it is not really a functional language. It's not even a functional-first hybrid. Nonetheless, Rust has enough cool functional stuff available that it has been described as "ML in C++ clothing".

I did my Rust port of LSM as a line-by-line translation from the F# version. This was not a particularly good approach.

So if you're porting code from a more functional language, you can end up with code that isn't very Rusty.

If you are a functional programming fan, you might be skeptical of Rust and its claims. Try to think of it like this: Rust agrees that mutability is a problem -- it is simply offering a different solution to that problem.
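A small sketch of what that different solution looks like in practice. The function names here are invented for illustration. Mutation is allowed, but only through an exclusive (&mut) borrow, and the compiler refuses to let a mutation overlap with any other live borrow of the same value.

```rust
// Hypothetical helper: mutation goes through an exclusive (&mut) borrow,
// so nothing else can observe the Vec while it changes.
fn append_doubled(v: &mut Vec<i32>) {
    let n = v.len();
    for i in 0..n {
        let x = v[i];
        v.push(x * 2);
    }
}

// Reading only needs a shared (&) borrow; many of these may coexist.
fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn demo() -> i32 {
    let mut v = vec![1, 2, 3];
    append_doubled(&mut v); // the exclusive borrow ends when the call returns
    let first = &v[0];      // shared borrow, still in use on the last line
    // v.push(4);           // would not compile here: `v` is borrowed by `first`
    sum(&v) + *first
}
```

After append_doubled the vector holds 1, 2, 3, 2, 4, 6, so demo() returns 19.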

Learning curve

I don't know if Rust is the most difficult-to-learn programming language I have seen, but it is running in that race.

Anybody remember back when Joel Spolsky used to talk about how difficult it is for some programmers to understand pointers? Rust is a whole new level above that. Compared to Rust, regular pointers are simplistic.

With Rust, we don't just have pointers. We also have ownership, borrows, and lifetimes.

As you learn Rust, you will reach a point where you think you are starting to understand things. And then you try to return a reference from a function, or store a reference in a struct. Suddenly you have lifetime<'a> annotations<'a> all<'a> over<'a> the<'a> place<'a>.

And why did you put them there? Because you understood something? Heck no. You started sprinkling explicit lifetimes throughout your code because the compiler error messages told you to.
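For readers who have not hit this yet, here is a minimal sketch of the two situations just mentioned: returning a reference from a function and storing a reference in a struct. The names longer and KeyView are hypothetical and do not come from LSM.

```rust
// The returned reference borrows from one of the inputs, so the signature
// has to say how long it stays valid: no longer than both `a` and `b`.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// A struct that stores a reference needs a lifetime parameter as well.
struct KeyView<'a> {
    bytes: &'a [u8],
}

impl<'a> KeyView<'a> {
    // The result borrows from the underlying bytes, not from the KeyView.
    fn first(&self) -> Option<&'a u8> {
        self.bytes.first()
    }
}
```

The annotations add no runtime behavior. They only tell the compiler which borrow the result depends on, so it can reject any caller that lets the borrowed data go away too early.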

I'm not saying that Rust isn't worth the pain. I personally think Rust is rather brilliant.

But a little expectation setting is appropriate here. Some programming languages are built for the purpose of making programming easier. (It is a valid goal to want to make software development accessible to a wider group of people.) Rust is not one of those languages.

That said, the Rust team has invested significant effort in excellent documentation (see The Book). And those compiler error messages really are good.

Finally, let me observe that while some things are hard to learn because they are poorly designed, Rust is not one of those things. The deeper I get into this, the more impressed I am. And so far, every single time I thought the compiler was wrong, I was mistaken.

I have found it helpful to try to make every battle with the borrow checker into a learning experience. I do not merely want to end up with the compiler accepting my code. I want to understand more than I did when I started.

Error handling

Rust does not have exceptions for error handling. Instead, error handling is done through the return values of functions.

But Rust actually makes this far less tedious than it might sound. By convention (and throughout the Rust standard library), error handling is done by returning a generic enum type called Result<T,E>. This type can encapsulate either the successful result of the function or an error condition.
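As a minimal sketch of that convention (the functions parse_page and describe are made up for this example), a fallible function returns a Result and the caller matches on it:

```rust
use std::num::ParseIntError;

// Hypothetical example: parse a page number, reporting failure through the
// return value instead of throwing.
fn parse_page(s: &str) -> Result<u32, ParseIntError> {
    s.trim().parse::<u32>()
}

// The caller has to decide what to do with each case.
fn describe(s: &str) -> String {
    match parse_page(s) {
        Ok(n) => format!("page {}", n),
        Err(e) => format!("bad page number: {}", e),
    }
}
```

Because the error is part of the return type, the compiler warns when a caller ignores a Result entirely.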

On top of this, Rust has a clever macro called try!. Because of this macro, if you read some Rust code, you might think it has exception handling.

use std::io::{self, Seek, SeekFrom};

// This code was ported from F# which assumes that any Stream
// that supports Seek also can give you its Length.  That method
// isn't part of the Seek trait, but this implementation should
// suffice.
fn seek_len<R>(fs: &mut R) -> io::Result<u64> where R : Seek {
    // remember where we started (like Tell)
    let pos = try!(fs.seek(SeekFrom::Current(0)));

    // seek to the end
    let len = try!(fs.seek(SeekFrom::End(0)));

    // restore to where we were
    let _ = try!(fs.seek(SeekFrom::Start(pos)));

    Ok(len)
}

This function returns std::io::Result<u64>. Each time it calls the seek() method of the Seek implementation it is given, it wraps the call in the try! macro, which makes the function return early with the error if the call fails.
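To show there is no magic involved, here is a sketch of the same logic written two other ways. The first spells out by hand roughly what each try! does. The second uses the ? operator, which later Rust releases added as built-in syntax for the same early return. The function names are invented for this comparison.

```rust
use std::io::{self, Seek, SeekFrom};

// Spelled out by hand: roughly what each try!(...) does.
fn seek_len_explicit<R>(fs: &mut R) -> io::Result<u64> where R: Seek {
    let pos = match fs.seek(SeekFrom::Current(0)) {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    let len = match fs.seek(SeekFrom::End(0)) {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    match fs.seek(SeekFrom::Start(pos)) {
        Ok(_) => {}
        Err(e) => return Err(From::from(e)),
    }
    Ok(len)
}

// The same logic with the `?` operator.
fn seek_len_question<R>(fs: &mut R) -> io::Result<u64> where R: Seek {
    let pos = fs.seek(SeekFrom::Current(0))?;
    let len = fs.seek(SeekFrom::End(0))?;
    fs.seek(SeekFrom::Start(pos))?;
    Ok(len)
}
```

Both versions leave the stream position where they found it and return the total length. Either can be exercised with an in-memory std::io::Cursor.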

In practice, I like Rust's Result type very much.

Nonetheless, when doing a line-by-line port of F# to Rust, this was probably the most tedious issue. Lots of functions that returned () in F# changed to return Result in Rust.

Type inference

Rust does type inference within functions, but it cannot or will not infer the types of function arguments or function return values.
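A short sketch of where the line falls, using two made-up functions. Every type in the signatures is written out, while the locals inside the bodies carry no annotations at all.

```rust
// Signatures are written out in full: argument and return types are
// never inferred.
fn average(values: &[f64]) -> f64 {
    let mut total = 0.0;      // inferred as f64 from the `+=` below
    for v in values {
        total += *v;
    }
    let count = values.len(); // inferred as usize
    if count == 0 { 0.0 } else { total / count as f64 }
}

fn squares_up_to(n: u32) -> Vec<u64> {
    // The element type of `out` is inferred from the push below
    // and from the declared return type.
    let mut out = Vec::new();
    for i in 1..=n {
        out.push(u64::from(i) * u64::from(i));
    }
    out
}
```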

Very often I miss having the more complete form of type inference one gets in F#. But I do remind myself of certain things:

Iterators

Rust iterators are basically like F# seq (which is an alias for .NET IEnumerable). They are really powerful and provide support for functional idioms like List.map. For example:

fn to_hex_string(ba: &[u8]) -> String {
    let strs: Vec<String> = ba.iter()
        .map(|b| format!("{:02X}", b))
        .collect();
    strs.join("")
}

However, there are a few caveats.

In Rust, you have a lot more flexibility about whether you are dealing with "a Foo" or "a reference to a Foo", and most of the time, it's the latter. Overall, this is just more work than it is in F#, and using iterators feels like it magnifies that effect.
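Here is a small sketch of that extra bookkeeping, with hypothetical function names. Calling iter() on a slice yields references, so closures end up receiving a reference or even a reference to a reference, while into_iter() on an owned Vec hands over the values themselves.

```rust
// iter() on a slice yields `&u32`, and filter passes its closure a
// reference to each item, so that closure sees `&&u32`.
fn even_squares(xs: &[u32]) -> Vec<u32> {
    xs.iter()
        .filter(|x| **x % 2 == 0) // x: &&u32
        .map(|x| x * x)           // x: &u32, and `&u32 * &u32` yields u32
        .collect()
}

// into_iter() on an owned Vec yields owned values instead.
fn lengths(words: Vec<String>) -> Vec<usize> {
    words.into_iter().map(|w| w.len()).collect()
}
```

In F# the equivalent pipelines would not need any of the dereferencing, which is the extra work described above.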

Performance

I haven't done the sort of careful benchmarking that is necessary to say a lot about performance, so I will say only a little.

Integer overflow

Integer overflow checking is one of my favorite features of Rust.

In languages or environments without overflow checking, unsigned types are very difficult to use safely, so people generally use signed integers everywhere, even in cases where a negative value makes no sense. Rust doesn't suffer from this silliness.

For example, the following code will panic:

let x: u8 = 255;
let y = x + 2;
println!("{}", y);

That said, I haven't quite figured out how to get overflow checking to happen on casts. I want the following code (or something very much like it) to panic:

let x: u64 = 257;
let y = x as u8;
println!("{}", y);

Note that, by default, Rust turns off integer overflow checking in release builds, for performance reasons.
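When the check has to hold in every build, the integer types have methods that make the choice explicit, and later versions of the standard library added the TryFrom trait, which gives a checked alternative to a narrowing as cast. The wrapper functions below are a sketch with made-up names. The methods they call (checked_add, wrapping_add, u8::try_from) are standard.

```rust
use std::convert::TryFrom;

// checked_add reports overflow as None in debug and release builds alike.
fn add_checked(x: u8, y: u8) -> Option<u8> {
    x.checked_add(y)
}

// wrapping_add asks for modular arithmetic explicitly.
fn add_wrapping(x: u8, y: u8) -> u8 {
    x.wrapping_add(y)
}

// A checked narrowing conversion: None instead of silent truncation.
fn narrow(x: u64) -> Option<u8> {
    u8::try_from(x).ok()
}
```

With these, add_checked(255, 2) is None, add_wrapping(255, 2) is 1, and narrow(257) is None, whereas 257u64 as u8 silently truncates to 1. Replacing .ok() with .unwrap() would turn the failed conversion into a panic.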

Miscellany

Bottom line

I am seriously impressed with Rust. Then again, I thought that Eric Bana's Hulk movie was pretty good, so you might want to just ignore everything I say.

In terms of maturity and ubiquity, C has no equal. Still, I believe Rust has the potential to become a compelling replacement for C in many situations.

I look forward to using Rust more.