Rust from a Gopher - Lessons 7, 8 & 9

2020-11-17 learning-rust

Update: There has been discussion on this post on Hacker News - feel free to see the comments there

Hello and welcome to the fifth post in my series about learning Rust. In case you want to hit it from the start, here’s a link to the first one! This entire series covers my journey from being a completely land-locked Gopher to becoming (hopefully) a hardened Rustacean, able to skitter the hazardous seabed of any application safely.

I can’t believe we’re at post number five! It’s been a wonderful ride, and I’ve been blown away by the helpful feedback from the Rust community at large. Thank you to everyone for all of the warm help.

Whilst I’m still in my infancy with Rust, it’s getting harder and harder to hold the itch to learn more. Often I want nothing more than to open the Rust Book and truck on with learning, blog posts be darned! However, I’ve made my commitment to myself, you and the Book, so every night; write, I do.

I’ve decided to make this post a triple, so let’s begin where we left off!

7. Packages and Crates and Stuff

Right off the bat, I think this chapter is very well-placed. Even though it took me the entire chapter to grok crates, modules and packages, I don’t think it came a line too early or too late. Bravo! The chapter introduces many new keywords, such as use, pub, mod and super, but you walk away feeling like you can actually use them properly.

Modules

Modules are introduced leaning heavily on the analogy of a file system… Now given that modules, at the most granular level, can be defined as single files with the module name, I’m actually wondering how much of this is an analogy and how much of it is really what’s happening at the compiler level. I would grin so hard if the bit about use being like a symlink was actually just a thing that was happening somewhere in the build chain. I’m sure it’s not though… Right.

Privacy

I enjoyed how the authors talked about privacy through scope. It helped me link project structure ideas with programming rules. For some reason I always thought of privacy as logical rules applied to code after the fact, rather than just having or not having something within a certain scope. It’s a lot easier to think about it the latter way. All this being said, there was also an overarching “Restaurant” example which they tried to tie in with privacy - check this gem:

The way privacy works in Rust is that all items (functions, methods, structs, enums, modules, and constants) are private by default. […] To continue with the restaurant metaphor, think of the privacy rules as being like the back office of a restaurant: what goes on in there is private to restaurant customers, but office managers can see and do everything in the restaurant in which they operate.

… I think that metaphor died somewhere in the rafters above the restaurant, and the smell could be starting to affect the food. Sorry.

For those who don’t know, privacy in Go is defined by whether you capitalize the first letter of your func/struct/interface name. Whilst lean and consistent - I much prefer the Rust approach. Rust is private by default unless you plop a pub in front of it. The main irk I have with the Go way is that Go’s naming convention also tells you acronyms should be entirely uppercase. So what happens when you want a private struct called jsonData? Well, you have to think of something else, because calling it JSONData will make it publicly exported and jsonData is not idiomatic. It’s definitely only a nit, but I’ve become (unreasonably?) irked by it on more than one occasion.
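To make the “private unless you plop a pub in front of it” rule concrete, here is a minimal sketch. The module and function names are hypothetical, loosely borrowed from the Book’s restaurant theme, and are not taken from the Book’s own listings.

```rust
mod restaurant {
    // No `pub`: only code inside `restaurant` (and its child modules) can call this.
    fn count_stock() -> u32 {
        42
    }

    // `pub` makes the item reachable from outside the module.
    pub fn stock_report() -> String {
        format!("{} items in stock", count_stock())
    }

    pub mod kitchen {
        // A child module can reach its parent's private items through `super`.
        pub fn needs_restock() -> bool {
            super::count_stock() < 10
        }
    }
}

fn main() {
    println!("{}", restaurant::stock_report());
    println!("restock? {}", restaurant::kitchen::needs_restock());
    // restaurant::count_stock(); // would not compile: `count_stock` is private
}
```

The visibility is spelled out by a keyword rather than by the name, so a private item can be called whatever reads best.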

Clean your room!

The lesson proved once again that the Rust compiler is helpful. Gentle warnings about my variables being unused really make me feel cared for. Go isn’t like this. Go grabs you by your collar and pierces your eardrums with a shrewd screech. It won’t let you go either. Like a harpy, the tiny chipmunk won’t let up. Thank Crabby for small mercies is all I can say.

On a more serious note though, I like this choice by Rust. My preference for most things I’d code would be to enforce compiler errors for unused variables, but it should be a compiler option, which, as far as I know, it is not in Go. I’m going to guess you should be able to upgrade the Rust warnings to errors through compiler flags somehow.
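For what it’s worth, one way to do the upgrade is with lint attributes in the source itself. The sketch below uses the `deny` attribute on a single (hypothetical) function; placing `#![deny(unused_variables)]` at the top of the crate root applies it everywhere, and passing `-D warnings` to rustc turns every warning into an error.

```rust
// Promote the `unused_variables` warning to a hard error for this one function.
#[deny(unused_variables)]
fn area(width: u32, height: u32) -> u32 {
    // let scratch = 0; // uncommenting this line would now fail the build
    width * height
}

fn main() {
    println!("{}", area(3, 4));
}
```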

Re-exporting Imports

This feature of Rust gives me mixed feelings. As I understand it, when you use a package, that package can then bring more packages into your scope? I mean it does make sense from a library user-experience perspective, but I can’t help but feel like there’s a bit of trust going on here, if not only taste-trust. It’s definitely nice to just use a crate once and have everything you need to interact with it.

There’s only one Rust project I’m really interested in right now - a Game engine and editor called rg3d. I checked that project’s source for re-exports and found some in the top level lib.rs. Basically it just imports its own submodules publicly. I have no idea if this is idiomatic but to my lay-eyes this use-case looks correct and just.
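As a rough illustration of that pattern, the sketch below re-exports a type from a nested module with `pub use`. The `engine`, `scene` and `Node` names are made up for the example and are not copied from rg3d’s source.

```rust
mod engine {
    // The type lives in a nested module...
    pub mod scene {
        pub struct Node {
            pub name: String,
        }

        impl Node {
            pub fn new(name: &str) -> Self {
                Node { name: name.to_string() }
            }
        }
    }

    // ...and is re-exported one level up, so callers can use the shorter path.
    pub use self::scene::Node;
}

// Both `engine::Node` and `engine::scene::Node` now name the same type.
use engine::Node;

fn main() {
    let root = Node::new("root");
    println!("{}", root.name);
}
```

The re-export does not copy anything; it only adds a second public path to the same item.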

Rust has been giving me a strong Java vibe from its imports. The nested paths feature seems good - ugly but succinct. Maybe my editor or rustfmt will automatically manage this for me? I enjoy not needing to worry about imports 99% of the time in Go, thanks to goimports, so hopefully that trend will continue in Rust.
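For reference, this is roughly what nested paths look like in practice. The `WordStats` type and `word_stats` function are hypothetical and exist only to give the imports something to do.

```rust
// One line instead of two separate `use std::collections::...` lines.
use std::collections::{HashMap, HashSet};
// `self` imports the `fmt` module itself alongside the `Display` trait.
use std::fmt::{self, Display};

struct WordStats {
    counts: HashMap<String, usize>,
    initials: HashSet<char>,
}

impl Display for WordStats {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{} distinct words, {} distinct initials",
            self.counts.len(),
            self.initials.len()
        )
    }
}

// Count each whitespace-separated word and collect the set of first letters.
fn word_stats(text: &str) -> WordStats {
    let mut counts = HashMap::new();
    let mut initials = HashSet::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
        if let Some(c) = word.chars().next() {
            initials.insert(c);
        }
    }
    WordStats { counts, initials }
}

fn main() {
    println!("{}", word_stats("go rust go gopher"));
}
```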

Prelude to Nothing

In previous posts I mentioned being wooed by wistful mentions of a prelude by the Book. Well this chapter mentioned it again, and more than wistfully. It drowned my unquenchable thirst with this link, which to my utmost sorrow, was not a chapter in the Book. Alas, I read it in spite of the fact. Oh, but my heart was flattened like my toddler’s playdough on a Sunday morning, for I was to find the prelude means nothing more than the default set of imports for a program. Through the sobriety of hindsight, I do question what else I was expecting. Given Rust doesn’t have much of a runtime, it doesn’t need anything fancy. Oh well, at least the great prelude mystery is solved.

8. Common Collections

I just want to take a moment to swoon over Rust’s type inference again.

*swoons*

It’s fantastic. This is all…

Now for the actual chapter 8; it’s fairly dense. It covers a brief introduction to the standard library’s collections, Vectors, HashMaps and finally a deep dive into Rust Strings. Let’s begin the blow-by-blow starting with this gem:

fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let first = &v[0];
    v.push(6);
    println!("The first element is: {}", first);
}

Guess what won’t compile? Correct - the code above! What’s happening here is that first holds an immutable borrow of v, while push needs a mutable borrow of v, and Rust won’t allow both to be live at the same time. If v needs to grow because of something like a push, this may end up moving the entire heap-allocated buffer to another part of the heap. Were this to happen, first would be a “dangling” ref to what used to be the first element of v.

Now what really tickles me is wondering about how that enforcement is encoded - what is first a reference to exactly? The Vec? No, my IDE tells me it’s an &i32 ref… But somehow the compiler knows that THAT i32 ref is a child of, or related to, the Vec in the same scope. I’d love to know more about how this works.
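As far as I can tell, the link is carried by lifetimes in function signatures rather than by any parent/child bookkeeping: `&v[0]` goes through the `Index` trait, whose `index` method takes `&self` and returns a reference that may not outlive that borrow. The sketch below spells the same relationship out with an explicit lifetime on a hypothetical helper, `first_of`, and shows one way to make the original snippet compile by copying the value out before pushing.

```rust
// The shared lifetime 'a says: the returned reference is only valid for as
// long as the borrow of `v` that produced it.
fn first_of<'a>(v: &'a Vec<i32>) -> &'a i32 {
    &v[0]
}

fn push_then_read() -> (i32, usize) {
    let mut v = vec![1, 2, 3, 4, 5];
    // Dereference to copy the i32 out, so no borrow of `v` stays alive...
    let first = *first_of(&v);
    // ...which leaves `v` free to be mutably borrowed by `push`.
    v.push(6);
    (first, v.len())
}

fn main() {
    let (first, len) = push_then_read();
    println!("The first element is: {} (len is now {})", first, len);
}
```

Copying works here because i32 is a small Copy type; for larger elements, reordering the code so the push happens before the borrow is another option.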

Perhaps not by major coincidence, the Book offers up possibly the most perfect lead for such questions only paragraphs later… The Book introduces the Rustonomicon, telling us that we can peer into it to see details about how Vec is made… At first, I was a little tense. Why does this reference-style-looking book have the Lovecraftian name? Only when I glimpsed the first line of its opening did the ink droplets align:

`The Dark Arts of Unsafe Rust`

…followed by

THE KNOWLEDGE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT SHATTER YOUR PSYCHE AND SET YOUR MIND ADRIFT IN THE UNKNOWABLY INFINITE COSMOS.

OKAY THIS LOOKS AWESOME.

Flipping back to the vec chapter of the Rustonomicon, I realised it’s not at all a boring appendix style libdoc for vectors - the god damn tome is teaching you to write your OWN std::vec::Vec from scratch!

Good God, this made me bite my lip in awkward nuclear attraction. I had some plans in mind for what I wanted to write about after this series. Right now, though, the thought of doing a mini-series through the nomicon is sounding mighty fine. More to come on this later, as I managed to pry myself away and back to the lesson at hand.

Looking through the method list (by god the doc pages are ugly; hint - only use the sidebar to navigate, do not bother free-scrolling), Vec REALLY has some cool funcs. dedup, drain and retain all look to do what you’d expect. I jaunted quickly through std::iter::Iterator too and found SO many nice to haves, such as gt, is_sorted, map, fold, filter…

Seriously, I know there’s a lot you wouldn’t need here - but I really do wish we had simple collections/traits outside of slices and maps in Go. And yes, before I get screamed at - I know you can write your own in a few lines, but it’s not the same - especially when you start involving complex data types. Being able to just conform to standard interfaces for elegance is something I adore and miss.
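Here is a small taste of a few of those, as a sketch. The `tidy` and `sum_of_even_squares` helpers are invented for the example; the methods they call (`dedup`, `retain`, `drain`, `filter`, `map`, `fold`) are the standard library ones.

```rust
// Returns (the drained front elements, what is left in the vector).
fn tidy(mut v: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    v.dedup(); // removes *consecutive* duplicates only
    v.retain(|n| n % 2 == 0); // keeps only the even numbers
    let take = v.len().min(2); // avoid draining past the end
    let front: Vec<i32> = v.drain(..take).collect(); // removes and yields the range
    (front, v)
}

// A classic filter/map/fold chain over a borrowed slice.
fn sum_of_even_squares(nums: &[i32]) -> i32 {
    nums.iter()
        .filter(|&&n| n % 2 == 0)
        .map(|&n| n * n)
        .fold(0, |acc, n| acc + n)
}

fn main() {
    println!("{:?}", tidy(vec![1, 1, 2, 2, 3, 4, 4, 5, 6]));
    println!("{}", sum_of_even_squares(&[1, 2, 3, 4]));
}
```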

String Theory

The chapter devotes a large swath to Strings and UTF-8. At first, I thought “how boring”, but then I realised I am the boring one! This is a much more interesting problem than I gave credit for and it made me go back to Go to better understand the “rune” system it uses for strings.

First, strings; there’s actually shit-loads of them in Rust… OsString, CString, CStr, OsStr… How wrong I was to assume a single String type in Rust indeed.

The stdlib provides many helpful functions over regular String. In Go, string is a part of builtin, but you use the stdlib’s strings package for more helpers.

Random question: Is there a difference between String::from("") and String::new()? According to this test; no:

    assert_eq!(true, String::from("") == String::new());

I’ve seen both methods being used so far and am not sure which one is “better”/idiomatic.
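A slightly fuller version of that check, as a sketch: both constructors give equal, empty strings, and the standard library documents that `String::new()` does not allocate. The `empty_strings` helper is just a name made up for the example.

```rust
// Build an empty String both ways so they can be compared side by side.
fn empty_strings() -> (String, String) {
    (String::new(), String::from(""))
}

fn main() {
    let (a, b) = empty_strings();
    println!("equal: {}", a == b);
    println!("both empty: {}", a.is_empty() && b.is_empty());
    println!("capacity of String::new(): {}", a.capacity());
}
```

My reading is that `String::new()` states the intent ("start empty") most directly, while `String::from` earns its keep when there is actual content to copy in.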

Indexing into Strings

A final reason Rust doesn’t allow us to index into a String to get a character is that indexing operations are expected to always take constant time (O(1)). But it isn’t possible to guarantee that performance with a String, because Rust would have to walk through the contents from the beginning to the index to determine how many valid characters there were.

What’s interesting is that in Go direct byte indexing is permitted on strings, though if you iterate over a string via the builtin range, you get runes, not bytes. A rune is just an alias for int32, and it holds the unicode code point for a single UTF-8 character in your string.

Having just learned many of these details, it actually gives me the chills, because I don’t know whether I’ve previously written code like myString = myString[1:], which, in hindsight, would break the first unicode character of a string whose code point is represented by at least 2 bytes. This is possibly some unchecked privilege of coding in English!!
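For comparison, here is a sketch of how the same “drop the first character” job looks in Rust. The `drop_first_char` helper is hypothetical; it leans on `chars()` to step over one whole code point. Slicing by bytes with `&s[1..]` would panic if byte 1 falls inside a multi-byte character, and the checked `get(1..)` returns None in that case.

```rust
// Skip exactly one Unicode scalar value, however many bytes it occupies.
fn drop_first_char(s: &str) -> &str {
    let mut chars = s.chars();
    chars.next(); // consume the first char (if any)
    chars.as_str() // the remainder of the original string
}

fn main() {
    let s = "\u{e9}t\u{e9}"; // "été": 3 chars, 5 bytes in UTF-8
    println!("{} bytes, {} chars", s.len(), s.chars().count());
    // Byte offset 1 is in the middle of the first 'é', so this is None.
    println!("{:?}", s.get(1..));
    println!("{}", drop_first_char(s));
}
```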

Okay so now here’s a dumb question:

How does the compiler know how much space is needed for an Enum Vec?

Does it just allocate N * MAX_SIZEOF(types)? I guess the allocations could be done dynamically instead? This would go against the grain of the constant time indexing quoted above though as either indexing or writing would need to be dynamic too.
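A quick way to poke at this is `std::mem::size_of`. Every value of an enum type has the same size no matter which variant it holds, and that size has to fit the largest variant, so a Vec of enums can stay contiguous and index in constant time. The `Cell` enum below is a hypothetical example, and I deliberately don’t print exact byte counts since those depend on the target and on layout optimisations.

```rust
use std::mem::size_of;

// A made-up enum mixing small and large payloads.
#[derive(Debug)]
enum Cell {
    Int(i32),
    Float(f64),
    Text(String),
}

// One fixed size for all variants, known at compile time.
fn cell_size() -> usize {
    size_of::<Cell>()
}

// A Vec<Cell> stores its elements back to back, each `cell_size()` bytes wide.
fn sample_row() -> Vec<Cell> {
    vec![Cell::Int(3), Cell::Float(1.5), Cell::Text(String::from("hi"))]
}

fn main() {
    println!("size_of::<Cell>()   = {} bytes", cell_size());
    println!("size_of::<String>() = {} bytes", size_of::<String>());
    println!("{:?}", sample_row());
}
```

Note that the String variant only stores its pointer, length and capacity inline; the text itself lives elsewhere on the heap, which is why variable-length data doesn’t break the fixed element size.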

Come to think of it, I have no idea how Rust decides on how much memory to allocate for a new Vec either. In Go, slices grow their underlying array by roughly a factor of two while they are small (and by a smaller factor once they get large) when they outgrow their current capacity. I also have no idea how Go makes allowances for slices of interface types, nor how it keeps (or doesn’t?) memory contiguous for said slices…
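One thing that can be checked without reading the allocator code is how capacity changes as you push. This sketch records each new capacity it observes; the helper name is made up, and the exact numbers are intentionally not shown because, as I understand the docs, the growth strategy is left unspecified.

```rust
// Push `n` bytes one at a time and record every distinct capacity seen.
fn capacities_while_pushing(n: usize) -> Vec<usize> {
    let mut v: Vec<u8> = Vec::new(); // a new Vec starts with capacity 0
    let mut seen = vec![v.capacity()];
    for i in 0..n {
        v.push(i as u8);
        if v.capacity() != *seen.last().unwrap() {
            seen.push(v.capacity()); // a reallocation happened
        }
    }
    seen
}

fn main() {
    println!("{:?}", capacities_while_pushing(100));
    // If the size is known up front, ask for it and skip the regrowth.
    let roomy: Vec<u8> = Vec::with_capacity(100);
    println!("with_capacity(100) -> capacity {}", roomy.capacity());
}
```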

These are fascinating questions to me, which I’ve never really needed to ask with Go. Go doesn’t expect you to try to optimize L1. Only when learning Rust am I even asking this, because it makes me feel closer to the hardware.

Hash Maps

Chapter 8 finishes with the standard lib’s HashMap. From first glance, the API seems solid. In Golang, maps are quite plain, though they can accept any comparable type as keys (even interface{}), so they have versatility. The Rust Book didn’t make it clear to me if that’s possible with a HashMap (it might get clearer when I learn about Traits, in ch 10).

It’s pretty cool that you can provide your own hashers though… This is something I’ve not heard of in Go. Basically you get what you’re given - if you’re interested in Go’s maps, Dave Cheney has a great writeup on them that’s well worth your time.
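Peeking slightly ahead, here is a sketch of both ideas: a user-defined key type (it needs `Eq` and `Hash`, which can be derived) and a map built with an explicitly chosen hasher via `with_hasher`. The `Coord` struct and `build_grid` function are hypothetical, and I’ve plugged in the standard library’s own `DefaultHasher` just to show where a custom one would go.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Any type can be a key as long as it implements Eq and Hash.
#[derive(Debug, PartialEq, Eq, Hash)]
struct Coord {
    x: i32,
    y: i32,
}

// The third type parameter of HashMap names the hasher factory in use.
type Grid = HashMap<Coord, &'static str, BuildHasherDefault<DefaultHasher>>;

fn build_grid() -> Grid {
    // Swap `DefaultHasher` for another `Hasher + Default` type to change hashing.
    let mut grid: Grid = HashMap::with_hasher(BuildHasherDefault::default());
    grid.insert(Coord { x: 0, y: 0 }, "origin");
    grid.insert(Coord { x: 1, y: 2 }, "treasure");
    grid
}

fn main() {
    let grid = build_grid();
    println!("{:?}", grid.get(&Coord { x: 1, y: 2 }));
}
```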

9. Error Handling

The final hole of the half, here we go! Error handling has always been a hot topic in Go development as Go too does not have exceptions, and follows similar “bubble up” principles for returning errors.

This chapter talks a lot about how you can panic or gracefully “recover” from errors. It’s a similar story in Go, with the caveat that in Go you can recover from panics. Basically, Go allows you to hook into a panic’s stack unwinding via a deferred call to recover. It’s not super commonly used, but even the stdlib’s default HTTP server uses it to keep things ticking nicely when a goroutine handling a request gets blown up via panic.

So what’s cool about Rust’s errors then? Well, unwrap and expect look great for fast prototyping. What looks even two hundred thousand times more amazing though is the ‘?’ operator. A pet peeve of many Gophers (particularly new ones) is how verbose your code can get when trying to do simple things that can return errors; you may end up with many, many segments that look like:

	f, err := os.Open(filename)
	if err != nil {
		return "", err
	}

Being able to wrap up three lines of boilerplate in a single character - that is some beautiful thinking right there. I am so happy I was wrong about Options and boilerplate in my previous post.
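For the Rust side of that comparison, here is a minimal sketch. The `sum_pair` function is invented for the example; each `?` either unwraps the Ok value or returns the error to the caller straight away, which is the same early-return the Go snippet writes out by hand.

```rust
use std::num::ParseIntError;

// Parse two strings and add them, bailing out on the first parse failure.
fn sum_pair(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?; // on Err, return it from sum_pair
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", sum_pair("2", " 40 "));
    println!("{:?}", sum_pair("2", "forty"));
}
```

The catch is that `?` only works inside a function whose return type can hold the error, so the signature still tells the reader that failure is possible.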

Don’t Panic!

The Book brought up a good use-case for when you wouldn’t mind a panic. It’s when you have “more information than the compiler”. The following example from the Book shows us:

    use std::net::IpAddr;
    let home: IpAddr = "127.0.0.1".parse().unwrap();

We know 100% that the string containing “127.0.0.1” is a valid IPv4 address. In this case most people would be okay to let this code sit in production somewhere (unless there’s a better IpAddr constructor).
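On the “better constructor” question: the standard library does let you build the address without parsing at all, which removes the unwrap entirely. The three helper functions below are hypothetical names for the three routes.

```rust
use std::net::{IpAddr, Ipv4Addr};

// The Book's version: parse a string we know is valid, then unwrap.
fn home_parsed() -> IpAddr {
    "127.0.0.1".parse().unwrap()
}

// Build it from octets instead; nothing here can fail at runtime.
fn home_built() -> IpAddr {
    IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1))
}

// Or use the associated constant for the loopback address.
fn home_const() -> IpAddr {
    IpAddr::V4(Ipv4Addr::LOCALHOST)
}

fn main() {
    println!("{} {} {}", home_parsed(), home_built(), home_const());
}
```

The unwrap version is still a fair illustration of the Book’s point; it just happens that this particular case has an infallible alternative.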

In Go, even if you wanted to do this, you probably couldn’t, because the parse() call would return a tuple of (IpAddr, error), making chaining impossible. Also, there’s no unwrap in Go, but hey… I just love having the ability to write concise yet readable code.

One final thing that crossed my panicky mind was about configuring your application to abort on panic. What would this look like for rust-embedded, where you don’t have an operating system to care about? I’d imagine it depends on the architecture you’re building for and what’s available for your particular target, i.e. is there a HLT instruction, or some infinite sleep workaround (e.g. on AVR).

Conclusions

The more I read about Rust, the more I wish I was writing it in my day job. I’m really looking forward to finishing the Book, so I can focus my time on getting my hands dirty with some ideas I have. It’s one thing reading and writing about tech, but another thing altogether when you’re desperately trying to make it breathe life into your ideas.

I hope you enjoyed the read, leave a comment below to further my Rustucation, and stay tuned for more!

Levi Lovelock

Dev, Dad, Dabb(l)er