One day I will write a blog post called “The Rust borrow checker is overrated, kinda”.
The borrow checker is certainly Rust’s claim to fame. And a critical reason why the language got popular and grew. But it’s probably not in my Top 10 favorite things about using Rust. And if Rust as it exists today existed without the borrow checker it’d be a great programming experience. Arguably even better than with the borrow checker.
Rust’s ergonomics, standardized cargo build system, crates.io ecosystem, and community commitment to good API design are probably my favorite things about Rust.
The borrow checker is usually fine. But it does require a staunch commitment to RAII, which is not fine. Rust is absolute garbage at arenas. No, bumpalo doesn’t count. So Rust w/ borrow checker is not strictly better than C. A Rust without a borrow checker would probably be strictly better than C and almost C++. Rust generics are mostly good, and C++ templates are mostly bad, but I do badly wish at times that Rust just had some damn template notation.
This is something I've been thinking about lately. I do think memory safety is an important trait that Rust has over C and other languages with manual memory management. However, I think Rust also has other attractive features that those older languages don't have:
* It has a very nice package manager.
* Libraries written in it tend to be more modular and composable.
* You can more confidently compile projects without worrying too much about system differences or dependencies.
I think this is because:
* It came out during the Internet era.
* It's partially to do with how cargo by default encourages more use of existing libraries rather than reinventing the wheel or using custom/vendored forks of them.
* It doesn't have dynamic linking unless you use FFI. So Rust can still run into issues here, but only when depending on non-Rust libraries.
Every time I try to use bumpalo I get frustrated, give up, and fall back to RAII allocation bullshit.
My last attempt is I had a text file with a custom DSL. Pretend it’s JSON. I was parsing this into a collection of nodes. I wanted to dump the file into an arena. And then have all the nodes have &str living in and tied to the arena. I wanted zero unnecessary copies. This is trivially safe code.
I’m sure it’s possible. But it required an ungodly amount of ugly 'a lifetime markers and I eventually hit a wall where I simply could not get it to compile. It’s been a while so I forget the details.
I love Rust. But you really really have to embrace the RAII or your life is hell.
To make sure I understand correctly: did you want to read a `String` and have lots of references to slices within the same string without having to deal with lifetimes? If so, would another variant of `Rc<str>` which supports substrings that also update the same reference count have worked for you? Looking through crates.io, I see multiple libraries that seem to offer this functionality:
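For illustration only, and not the API of any particular crate: the idea of a substring that shares one reference count with its parent can be sketched with the standard library alone. `SharedStr` and its methods are hypothetical names made up for this sketch. Note that building the `Rc<str>` copies the text once up front; after that, slices only clone the pointer.

```rust
use std::ops::Range;
use std::rc::Rc;

/// Hypothetical shared substring: one reference-counted buffer plus a byte range.
#[derive(Clone, Debug)]
struct SharedStr {
    buf: Rc<str>,
    range: Range<usize>,
}

impl SharedStr {
    /// Copies `text` into a reference-counted buffer (the only copy made).
    fn new(text: &str) -> Self {
        SharedStr { buf: Rc::from(text), range: 0..text.len() }
    }

    /// Sub-slice relative to this slice. Only the `Rc` pointer is cloned.
    /// Panics if the range is out of bounds or splits a UTF-8 character.
    fn slice(&self, r: Range<usize>) -> Self {
        let start = self.range.start + r.start;
        let end = self.range.start + r.end;
        assert!(start <= end && end <= self.range.end);
        assert!(self.buf.is_char_boundary(start) && self.buf.is_char_boundary(end));
        SharedStr { buf: Rc::clone(&self.buf), range: start..end }
    }

    fn as_str(&self) -> &str {
        &self.buf[self.range.clone()]
    }
}

fn main() {
    let whole = SharedStr::new("{\"key\": \"value\"}");
    let key = whole.slice(2..5);
    let value = whole.slice(9..14);
    println!("{} = {}", key.as_str(), value.as_str()); // prints "key = value"
    // Three handles (whole, key, value) share a single allocation.
    println!("handles = {}", Rc::strong_count(&whole.buf)); // prints "handles = 3"
}
```

The trade-off is that every slice carries a pointer plus two offsets and bumps a counter, so there are no lifetime parameters on the node types, but the buffer is only freed once the last slice is dropped.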
Let’s pretend I was in C. I would allocate one big flat segment of memory. I’d read the “JSON” text file into this block. Then I’d build an AST of nodes. Each node would be appended into the arena. Object nodes would contain a list of pointers to child nodes.
Once I built the AST of nested nodes of varying type I would treat it as constant. I’d use it for a few purposes. And then at some point I would free the chunk of memory in one go.
In C this is trivial. No string copies. No duplicated data. Just a bunch of dirty unsafe pointers. Writing this “safely” is very easy.
In Rust this is… maybe possible. But brutally difficult. I’m pretty good at Rust. I gave up. I don’t recall exactly what wall I hit.
I’m not saying it can’t be done. But I am saying it’s really hard and really gross. It’s radically easier to allocate lots of little Strings and Vecs and Box each nested value. And then free them all one-by-one.
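For what it's worth, here is a minimal sketch of one way to get close to that layout in safe Rust, assuming indices are acceptable in place of pointers. It is not a bumpalo example and not necessarily what was attempted above. All names (`NodeId`, `Node`, `Ast`, `Parser`) and the toy bracket grammar are made up for illustration. Nodes live in one flat `Vec`, child lists in a second flat `Vec`, and atoms are `&str` slices of the source buffer, so a single `'src` lifetime is the only one involved.

```rust
/// Index of a node in `Ast::nodes`, used in place of a pointer.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NodeId(u32);

#[derive(Debug)]
enum Node<'src> {
    /// A word borrowed straight from the source buffer; no copy.
    Atom(&'src str),
    /// Children are `Ast::children[first..first + len]`.
    List { first: u32, len: u32 },
}

struct Ast<'src> {
    nodes: Vec<Node<'src>>,
    children: Vec<NodeId>,
}

impl<'src> Ast<'src> {
    fn children(&self, id: NodeId) -> &[NodeId] {
        match self.nodes[id.0 as usize] {
            Node::List { first, len } => &self.children[first as usize..(first + len) as usize],
            Node::Atom(_) => &[],
        }
    }

    fn atom(&self, id: NodeId) -> Option<&'src str> {
        match self.nodes[id.0 as usize] {
            Node::Atom(text) => Some(text),
            Node::List { .. } => None,
        }
    }
}

struct Parser<'src> {
    src: &'src str,
    pos: usize,
    ast: Ast<'src>,
    /// Child ids of lists that are still open.
    scratch: Vec<NodeId>,
}

impl<'src> Parser<'src> {
    fn push(&mut self, node: Node<'src>) -> NodeId {
        self.ast.nodes.push(node);
        NodeId((self.ast.nodes.len() - 1) as u32)
    }

    fn skip_spaces(&mut self) {
        while self.src[self.pos..].starts_with(' ') {
            self.pos += 1;
        }
    }

    /// Parses one atom or one `[ ... ]` list. Returns `None` on malformed input.
    fn parse_node(&mut self) -> Option<NodeId> {
        self.skip_spaces();
        let src = self.src; // copy the shared reference so slices keep the 'src lifetime
        let rest = &src[self.pos..];
        if rest.starts_with('[') {
            self.pos += 1;
            let mark = self.scratch.len();
            loop {
                self.skip_spaces();
                if self.src[self.pos..].starts_with(']') {
                    self.pos += 1;
                    break;
                }
                let child = self.parse_node()?;
                self.scratch.push(child);
            }
            // Move this list's child ids into one contiguous run.
            let first = self.ast.children.len() as u32;
            let len = (self.scratch.len() - mark) as u32;
            self.ast.children.extend(self.scratch.drain(mark..));
            Some(self.push(Node::List { first, len }))
        } else {
            let end = rest
                .find(|c: char| c == ' ' || c == '[' || c == ']')
                .unwrap_or(rest.len());
            if end == 0 {
                return None; // end of input or a stray ']'
            }
            self.pos += end;
            Some(self.push(Node::Atom(&rest[..end])))
        }
    }
}

/// Parses the first node in `src`; trailing input is ignored in this sketch.
fn parse(src: &str) -> Option<(Ast<'_>, NodeId)> {
    let ast = Ast { nodes: Vec::new(), children: Vec::new() };
    let mut parser = Parser { src, pos: 0, ast, scratch: Vec::new() };
    let root = parser.parse_node()?;
    Some((parser.ast, root))
}

fn main() {
    let source = String::from("[name [a b] value]");
    let (ast, root) = parse(&source).expect("valid input");
    for &child in ast.children(root) {
        match ast.atom(child) {
            Some(text) => println!("atom {}", text),
            None => println!("list with {} children", ast.children(child).len()),
        }
    }
}
```

This is two growable buffers plus the source string rather than one raw block, so it is not a byte-for-byte match for the C version, but there are no per-node allocations and no string copies, and everything is released when `Ast` and the source go out of scope.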
oxc_parser uses bumpalo (IIRC) to build an AST in an arena from a string. I think the String is outside the arena though, but their lifetimes are "mixed together" into a single 'a, so lifetime-wise it's the same horror to manage. But manage they did.
I've just built a parser in Rust that seems to function in exactly the way you say. It reads the text into a single buffer, assembles the AST nodes in an arena (indexes are NonZeroU8's to save space), and all token nodes refer to the original rather than duplicating the text. The things I'm parsing are smallish (much less than 64k, 256 tokens), but there are around 100M of them, so it was worth the effort to build it in a way that used 0 allocations (the arena is re-used).
It was a bit of effort, but it can't be that difficult given it's my first non-toy Rust program and it wasn't that hard to get going. I'd written maybe 50 training exercises in Rust before that, over the years. Yes, the same thing in C would have been faster and easier to write (I've written many hundreds of thousands of lines of C over the years), but I'm not sure it would have worked the day after it compiled.
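A rough guess at what that shape could look like, since the actual code isn't shown: `Token`, `Node`, `Arena` and `tokenize` are hypothetical names, and the whitespace "grammar" is a stand-in. The points it tries to illustrate are the 1-based `NonZeroU8` index (so `Option<NonZeroU8>` still fits in one byte), tokens stored as offsets into the original text, and an arena that is cleared and reused between inputs.

```rust
use std::num::NonZeroU8;

/// A token is a span of the original text, never a copy of it.
#[derive(Clone, Copy, Debug)]
struct Token {
    start: u16,
    len: u16,
}

impl Token {
    fn text<'a>(&self, src: &'a str) -> &'a str {
        let start = usize::from(self.start);
        &src[start..start + usize::from(self.len)]
    }
}

#[derive(Clone, Copy, Debug)]
struct Node {
    token: Token,
    /// 1-based index of the next node; `None` uses the zero niche, so it is still one byte.
    next: Option<NonZeroU8>,
}

struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: Vec::with_capacity(255) }
    }

    /// Forgets the previous contents but keeps the allocation.
    fn reset(&mut self) {
        self.nodes.clear();
    }

    /// Returns the 1-based index of the new node, or `None` once 255 nodes are in use.
    fn push(&mut self, node: Node) -> Option<NonZeroU8> {
        if self.nodes.len() >= 255 {
            return None;
        }
        self.nodes.push(node);
        NonZeroU8::new(self.nodes.len() as u8)
    }

    fn get(&self, id: NonZeroU8) -> &Node {
        &self.nodes[usize::from(id.get()) - 1]
    }
}

/// Links whitespace-separated tokens into a list and returns its head.
/// Returns `None` for empty input, input over 64k, or more than 255 tokens.
fn tokenize(text: &str, arena: &mut Arena) -> Option<NonZeroU8> {
    arena.reset();
    if text.len() > usize::from(u16::MAX) {
        return None;
    }
    let mut head = None;
    // Walk backwards so each node can point at the one that follows it.
    for word in text.split_whitespace().rev() {
        let start = word.as_ptr() as usize - text.as_ptr() as usize;
        let token = Token { start: start as u16, len: word.len() as u16 };
        head = Some(arena.push(Node { token, next: head })?);
    }
    head
}

fn main() {
    let mut arena = Arena::new(); // allocated once, reused for every input
    for text in ["let x = 1", "a b"] {
        let mut cursor = tokenize(text, &mut arena);
        while let Some(id) = cursor {
            let node = arena.get(id);
            print!("{} ", node.token.text(text));
            cursor = node.next;
        }
        println!();
    }
    assert_eq!(std::mem::size_of::<Option<NonZeroU8>>(), 1);
}
```

Because the nodes hold offsets rather than references, the arena has no lifetime parameter at all, which is what makes reusing it across inputs painless.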
I also had to build a 2nd parser that looked at maybe 100 lines. It was clone() all the way down for that one because I could afford the cost. I think that was easier to write in Rust than C, mostly because of the very good standard library Rust comes with.