
> How can my unsafe block make sure an allocation stays alive past the unsafe block itself?

Put it in another data structure or something?

Listen, if you can't make the invariant work, then you need to change the function. An unfixable unsafe is not an excuse to allow errors.

> I'm saying this is something to be aware of on projects with multiple people who may not catch something like this.

It's good to be aware but put the blame in the right place. It's the unsafe code that's actually at fault. If you are seeing corruption, look at the unsafe code first with an adversarial mindset.

> The bug was not in unsafe, or in the drop, it was in println.

N. O.

Any unsafe code that outsources invariant enforcement that affects memory safety to safe code has bugs. There's wiggle room on "outsources" but that's the only wiggle room.





> It's good to be aware but put the blame in the right place. It's the unsafe code that's actually at fault. If you are seeing corruption, look at the unsafe code first with an adversarial mindset.

Yeah, that's why my first comment in this thread was "Just having unsafe in your codebase means changing code outside the unsafe block could cause UB". You can caveat that with "that's bad code", but that's the reality.

> An unfixable unsafe is not an excuse to allow errors

Nobody is saying that; you're just moving the goalposts now. Nobody is saying you should allow errors. I'm telling you that everyone, including you, will inevitably have to change the code around the unsafe block, because that's what has to enforce memory safety at the end of the day.

You said you don't need to look at the safe code; I'm asking how you would fix the unsafe block, and you haven't. That's fine, and I'm not faulting the language for it.

>> The bug was not in unsafe, or in the drop, it was in println.

> N. O.

Of course it is. That line becomes instructions that access memory freed by the previous line, drop(vec). That's called a dangling pointer. Remove the println and the slice is dropped without ever being read; remove the drop and the println works. The vector is not "corrupted" by unsafe. That's not how computers work. We just lose Rust's guarantees when using unsafe, including in safe code. That doesn't mean there's a bug there. The whole point of unsafe is to be trusted by the compiler.

> Any unsafe code that outsources invariant enforcement that affects memory safety to safe code has bugs.

All useful unsafe code outsources invariants to safe code. If we could verify the integrity of the memory, we wouldn't be using unsafe.

You have been using "unsafe" interchangeably to mean the literal blocks and the surrounding code. But here is my point: if "unsafe code" means "just the unsafe blocks," then yes, unsafe fundamentally relies on safe code to do the right thing.

But if "unsafe code" means "everything that must uphold the invariant," then unsafe can span your entire codebase. Which is true: just the presence of unsafe in your codebase means you're looking at UB anywhere in the call stack if people don't pay attention. That has been my whole point in this thread. The presence of unsafe means everyone now has to pay attention not just to the unsafe block, but to all safe code interacting with that data, especially in multi-threaded scenarios.


As I said here[0], although I can't speak for what @Dylan16807 intends, invariants required by unsafe code are required exactly to the extent that some code can alter them (the module boundary). In this sense, Rust's unsafe is just a particular instance of encapsulation; all notions of invariants in programming share this same essence.

[0] https://news.ycombinator.com/item?id=46030407


> Yeah, that's why my first comment in this thread was "Just having unsafe in your codebase means changing code outside the unsafe block could cause UB". You can caveat that with "that's bad code", but that's the reality.

I agreed with you that changing safe code could trigger the bug, but the safe code is not where the bug is.

> Nobody is saying that, you're just moving goalposts now. Nobody is saying you should allow errors, I'm telling you that everyone, including you, will inevitably have to change the code around the unsafe block, because that's what has to enforce memory at the end of the day.

When fixing a vulnerable unsafe block, you might have to redesign the unsafe API, and that might require changing some safe code.

Once you decide on an API, you will not have to change safe code. The unsafe code handles all of the enforcement. If safe code is enforcing anything, you broke the rules of unsafe.

> You said you don't need to look at the safe code, I'm asking how would you fix the unsafe, and you haven't. That's fine, and I'm not faulting the language for it.

I'm not an expert at Rust. I gave a couple of prose suggestions, but they require redesigning the way the safe and unsafe code talk to each other, because your original design is inherently flawed. The unsafe code cannot protect itself, so it must not be used this way. You're saying we should make the safe code protect the unsafe code, and that is not right. Unsafe code needs to protect itself.

> Of course it is. That line becomes instructions that access memory freed by the previous line, drop(vec). That's called a dangling pointer. Remove the println and the slice is dropped without ever being read; remove the drop and the println works. The vector is not "corrupted" by unsafe. That's not how computers work. We just lose Rust's guarantees when using unsafe, including in safe code. That doesn't mean there's a bug there. The whole point of unsafe is to be trusted by the compiler.

"unsafe" means "trust me compiler, I verified this myself"

Losing the guarantee is a bug. Via the unsafe block, you told the compiler it didn't need to prevent a dangling pointer because you would prevent one yourself, and then you didn't prevent it.

If you didn't tell the compiler to trust you, the part that wouldn't have compiled is the unsafe block. You tricked it into compiling that block, so that block is where the bug is.

> All useful unsafe code outsources invariants to safe code.

That's extremely untrue. Lots of data structures protect all their invariants in their unsafe code.

> If we could verify the integrity of the memory, we wouldn't be using unsafe.

"we" are smarter than the compiler. Unsafe is for things "we" can verify but the compiler cannot. You're not supposed to use it for unverified stuff.

> Now you have been interchangeably using unsafe to mean the literal blocks and the surrounding code. But here is my point: If you are saying that "unsafe code" means "just the unsafe blocks," then yes, unsafe fundamentally relies on safe code to do the right thing.

If you're doing things 100% properly, you will expand your unsafe blocks to include everything that verifies and upholds invariants. But even after that expansion, it's still going to be a tiny fraction of your codebase.

> But if "unsafe code" means "everything that must uphold the invariant," then unsafe can span your entire codebase.

Not if your design is competent.

> Just the presence of unsafe in your code base means you're looking at UB anywhere in the call stack if people don't pay attention. And that's been my whole point of this thread.

"if people don't pay attention" is a huge factor here. If your unsafe code is wrong then that makes it hard to write safe code. But if you go fix the unsafe code then you stop needing to worry about safe code triggering a memory error.

> The presence of unsafe means everyone now has to pay attention not just to the safe block, but all safe code interacting with that data, especially in multi-threaded scenarios.

If you did things correctly, any safe function can be ignored for memory safety. Unsafe blocks are supposed to assume that the safe code calling them is actively malicious, and make themselves impossible to misuse.


See my comment to @Capricorn2481 here[0]. You seem to be saying that unsafe code can be made correct so as to ensure soundness. This can be done by verifying invariants in the unsafe code itself, but it is generally discouraged, as it tempts you into large unsafe blocks that could be partially safe code. Neither you nor @Capricorn2481 seems to distinguish between "arbitrary safe code" and "my safe code within the module containing the unsafe," which is a crucial part of the idea of writing encapsulated unsafe. Technically, I believe unsafe code can still be made sound in the presence of arbitrary safe code, but it would have to tiptoe around violating memory safety, lowering its utility and performance substantially, if not completely.

[0] https://news.ycombinator.com/item?id=46057382


I accept that criticism. It's just that it makes things more complicated and harder to argue about (and I couldn't remember the exact boundaries). I was making the distinction earlier but dropped it for the sake of explaining the point of safe/unsafe.

So there are three categories of code: the unsafe block, the friend code that shares responsibility for invariants, and the outside-world code.

Capricorn2481's mistake is putting the entire program in the second category, when only code in the same module is supposed to be there.

If a memory violation happens, you know you have a bug in a module containing unsafe blocks, which helps a lot. Those modules should be relatively rare and as small as possible.



