One of the things I found interesting about this paper is that they find that th...

cyber_kinetist · on Nov 6, 2023

> the borrow checker is not turned off inside of unsafe blocks, but instead you are given a handful of extra APIs to use. Those extra APIs can be used to do things that the borrow checker would not normally allow (e.g. by converting checked pointers into raw pointers, and then inside the unsafe block, dereferencing those raw pointers), but the borrow checker is still active, and still catches all the same errors as before.

Yup, I know that the borrow checker also works inside unsafe blocks. But what you primarily do as a systems programmer is establish invariants in your system so that you can exploit (or circumvent) the UB-ness of the underlying system (OS/driver/hardware/etc.), and the borrow checker doesn't really help you verify those invariants...

The Rustonomicon states that whenever you open up an unsafe block, it pollutes not just that scope but the whole module (https://doc.rust-lang.org/nomicon/working-with-unsafe.html). So there can be a correctness issue outside the unsafe block, because code there disobeys the invariants the developer implicitly set up for the unsafe block. And you can't do anything about this other than carefully reason about all the ways your implicit invariants can break anywhere in the module.

So unsafe is ultimately more of a convention (or a promise) that you have designed and verified your invariants correctly, so that the module will not produce undefined behavior no matter how an outside user uses it. If you want to verify your invariants any further than that, you need to check for UB at runtime using the Miri interpreter (which is really slow and still incomplete), or just use Ada SPARK.
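To make the module-wide point concrete, here is a minimal sketch in the spirit of the Rustonomicon page linked above. `FixedBuf` and its methods are hypothetical names made up for illustration. The only `unsafe` block lives in `last`, but its soundness depends on `len` never exceeding the array length, and any safe function in the same module can touch `len`.

```rust
mod fixed_buf {
    /// Fixed-capacity buffer (hypothetical example type).
    /// Invariant the compiler does not check: `len <= data.len()`.
    pub struct FixedBuf {
        data: [u32; 4],
        len: usize,
    }

    impl FixedBuf {
        pub fn new() -> Self {
            FixedBuf { data: [0; 4], len: 0 }
        }

        /// Appends `v`; returns `false` when the buffer is full.
        pub fn push(&mut self, v: u32) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = v;
            self.len += 1;
            true
        }

        /// Returns the most recently pushed value, if any.
        pub fn last(&self) -> Option<u32> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: relies on `len <= data.len()`, which every function
            // in this module (safe or not) must preserve.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }

        // A function like this contains no `unsafe` and would compile,
        // yet it could push `len` past the array and make `last` UB:
        //
        // pub fn grow(&mut self) { self.len += 1; }
    }
}
```

From outside `fixed_buf` the `len` field is private, so callers can only go through `new`, `push` and `last`. That privacy boundary is what confines the manual reasoning to the module rather than the whole program.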

nicoburns · on Nov 6, 2023

It does pollute the whole module, but this is much less of a big deal than it might seem because Rust modules are so lightweight - you don't even need a new file to create a new module, you can just do:

    mod name_of_module {
        // code goes here
    }

And it's often possible to wrap the tricky unsafe bits in a safe interface (one that, e.g., uses a mutex to enforce safety), so that anyone who is contributing to higher-level code doesn't need to worry about it. This is much better than C or C++ code, where it's trivially easy to introduce memory unsafety or even Undefined Behaviour in even the boring "glue" parts of the codebase.
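As a rough sketch of that pattern (all names here are hypothetical, and the raw pointer merely stands in for something like a handle from a C library): the unsafe code is confined to one small module, and a `Mutex` makes sure only one thread touches the pointer at a time, so the public API is entirely safe.

```rust
mod counter {
    use std::sync::Mutex;

    /// Raw pointer to a heap-allocated counter (stand-in for a C handle).
    struct RawSlot(*mut u64);

    // SAFETY: the pointer is uniquely owned by `RawSlot` and is only
    // dereferenced while the surrounding mutex is held (or in `drop`).
    unsafe impl Send for RawSlot {}

    /// Safe, thread-safe wrapper; callers never see the pointer.
    pub struct Counter {
        slot: Mutex<RawSlot>,
    }

    impl Counter {
        pub fn new() -> Self {
            let ptr = Box::into_raw(Box::new(0u64));
            Counter { slot: Mutex::new(RawSlot(ptr)) }
        }

        /// Increments the counter and returns the new value.
        pub fn increment(&self) -> u64 {
            let guard = self.slot.lock().unwrap();
            let ptr = guard.0;
            // SAFETY: `ptr` came from `Box::into_raw`, is freed only in
            // `drop`, and the held lock gives us exclusive access.
            unsafe {
                *ptr += 1;
                *ptr
            }
        }
    }

    impl Drop for Counter {
        fn drop(&mut self) {
            let slot = self
                .slot
                .get_mut()
                .unwrap_or_else(|poisoned| poisoned.into_inner());
            // SAFETY: `&mut self` means no other user exists, and the
            // pointer has not been freed before.
            unsafe { drop(Box::from_raw(slot.0)) };
        }
    }
}
```

Code outside `counter` can share a `Counter` across threads (for instance inside an `Arc`) and call `increment` without writing any `unsafe` itself; the review burden stays with the three `SAFETY` comments inside the module.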

This leads to a really nice gentle onboarding flow where inexperienced users can start out contributing to the safe parts of the project, and (optionally) move on to gnarly unsafe bits later when they are already familiar with the project's codebase. It also dramatically reduces review workload for maintainers as they can rely on the compiler enforcing invariants outside of unsafe modules.

This works less well for really low-level code like embedded code or kernel code. But it's still a lot better than nothing.

cyber_kinetist · on Nov 6, 2023

Though my usual experience with writing performance-sensitive code is: if you just write naive, inefficient code on the first try, there's a high probability that you'll need to rearchitect the whole system to get a more performant design; it's not something you can do incrementally. Maybe in these kinds of projects it's not wise to let random contributors handle your code though... (I'm geared more towards graphics and numerical computing, so my experience might be different from others'.)

Maybe there's a reason why game developers have primarily used scripting languages: hand out a safe, managed, GC-backed runtime to the majority of developers, and let only the select few who understand the system develop the core C++ engine. Maybe Safe Rust can be used this way (as a "fast" scripting language), to separate these two worlds... but the problem is that even Safe Rust is just really difficult for newcomers to grok, and the hoops they jump through to circumvent the borrow checker come down to using Copy/Clone all over the place (slow), smart pointers (slow), or array indices with bounds checking (maybe less slow but more cumbersome, and also prone to logical invalidation errors if you're not careful).
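The last workaround, indices instead of references, and the "logical invalidation" it is prone to can be sketched as follows. `Entity`, `target_name` and `stale_index_demo` are hypothetical names chosen for illustration, not taken from any real engine.

```rust
/// An entity that refers to another entity by its index in a `Vec`,
/// instead of holding a reference to it.
pub struct Entity {
    pub name: &'static str,
    pub target: Option<usize>,
}

/// Name of the entity that `world[id]` points at, if any.
/// Both lookups are bounds-checked, so a bad index yields `None`, not UB.
pub fn target_name(world: &[Entity], id: usize) -> Option<&'static str> {
    let target = world.get(id)?.target?;
    world.get(target).map(|e| e.name)
}

/// Returns the target of entity 0 before and after entity 1 is removed.
pub fn stale_index_demo() -> (Option<&'static str>, Option<&'static str>) {
    let mut world = vec![
        Entity { name: "player", target: Some(1) },
        Entity { name: "goblin", target: None },
        Entity { name: "door", target: None },
    ];
    let before = target_name(&world, 0);
    // `swap_remove` moves the last element ("door") into slot 1.
    let _removed = world.swap_remove(1);
    // Index 1 is still in bounds, so the lookup succeeds, but it now
    // silently refers to a different entity.
    let after = target_name(&world, 0);
    (before, after)
}
```

Here the player's target quietly changes from the goblin to the door. Nothing is memory-unsafe and the borrow checker has nothing to complain about, but the program logic is wrong. A common mitigation is to pair each index with a generation counter so that stale handles can be detected.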

estebank · on Nov 6, 2023

> Maybe Safe Rust can be used this way (as a "fast" scripting language), to separate these two worlds... but the problem is that even Safe Rust is just really difficult for newcomers to grok, and the hoops they jump through to circumvent the borrow checker come down to using Copy/Clone all over the place (slow), smart pointers (slow), or array indices with bounds checking (maybe less slow but more cumbersome, and also prone to logical invalidation errors if you're not careful).

Are smart pointers like Box, Rc and even Arc any slower than any scripting system you'd "hand out" to most developers from your tightly written C++ core engine?

One thing that I see is that 90% of code is simple enough that you don't need any ceremony around ownership beyond writing a & in front of a value or type, 5% is harder than that but doable, and 5% requires either extensive expertise to avoid allocations or falling back on Arc. I'd wager that the share of code that a GC can optimize at runtime is comparable, if not worse, and comes at higher memory consumption.
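For the "90%" case, a minimal sketch of what "writing a & in front of a value or type" amounts to (the function names are hypothetical):

```rust
/// Takes the words by shared reference: the `&` in the parameter type
/// is the only ownership annotation needed.
pub fn total_len(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

/// Returns the total length of the words and the number of words.
pub fn borrow_demo() -> (usize, usize) {
    let words = vec![String::from("safe"), String::from("rust")];
    // `&words` lends the vector out; no clone, no smart pointer.
    let n = total_len(&words);
    // `words` is still owned here and can be used again.
    (n, words.len())
}
```

No allocation or reference counting happens at the call site; the borrow is checked entirely at compile time.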

I've compared some simple networked applications written in Java and Rust for that purpose, and performance ended up being comparable, but with the Java versions using 100x the memory, even when using GraalVM.

jvanderbot · on Nov 6, 2023

Rust is new. Low-hanging features and bugs are going to be everywhere in new crates. Contributing to, say, libcurl or OpenSSL sure is more difficult than contributing to yet another rewrite of a mostly mature tool.