The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all. Rust drops (runs destructor and deallocates) values as soon as they go out of scope; C++ too. In Haskell you depend on the whims of the GC, which makes RAII unusable. (The Haskell approach of not guaranteeing destructors being called does have its merits; when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)).
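For instance, here's a minimal sketch using addFinalizer from base's System.Mem.Weak (GHC makes no promise about when, or whether, the finalizer fires):

  import System.Mem (performGC)
  import System.Mem.Weak (addFinalizer)

  main :: IO ()
  main = do
    let xs = [1 .. 1000] :: [Int]
    addFinalizer xs (putStrLn "finalizer ran")
    print (sum xs)
    performGC
    -- Even after an explicit GC, GHC only promises the finalizer
    -- runs "at some point after" xs becomes unreachable; if the
    -- program exits first, it may never run at all.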
Therefore the RAII style wouldn't really work in Haskell; the current bracket approach is still the better choice there.
That said, the ST-style trick of a phantom type variable is pretty well known. Unfortunately, not many people know that the same trick can be used outside of ST as well. I feel like, as a community, we should be encouraging this style more often.
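Here's a minimal sketch of what I mean, with made-up names (the rank-2 callback plays the same role as ST's state token):

  {-# LANGUAGE RankNTypes #-}
  import Control.Exception (bracket)

  -- a handle branded with a phantom region parameter 's'
  newtype Handle' s = Handle' FilePath

  -- the callback must be polymorphic in 's', so a Handle' cannot
  -- escape the bracket that created it: the runST trick applied
  -- to ordinary IO resource management
  withHandle :: FilePath -> (forall s. Handle' s -> IO a) -> IO a
  withHandle path use =
    bracket (putStrLn "open" >> pure (Handle' path))
            (\_ -> putStrLn "close")
            use

  main :: IO ()
  main = withHandle "data.txt" (\_h -> putStrLn "using the handle")
  -- 'withHandle "data.txt" pure' is rejected: the handle would escape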
UPDATE: I wrote the original comment with the incorrect assumption that drop functions will always be called in Rust. This is wrong. Please see child comments.
I don't know if it's fair to call that "the Haskell approach", per se. That destructors are not guaranteed to run, or to run predictably, is generally a property of all fast garbage collectors. If you want a GC that runs quickly (and in a language like Haskell, where you get lots of small allocations in contexts where it would be difficult to impossible for either the compiler or the programmer to efficiently determine the exact moment a scope dies, you absolutely do), then one of the costs is that you can't afford to run code for every destroyed object.
The linked post is interesting, because I didn't realise "RAII is a much better way of managing resources than destructors" was controversial. It absolutely is: RAII is fast, predictable, and flexible. Giving it up is also one of the tradeoffs some languages make for more flexibility in their design, by enabling performant automatic garbage collection that doesn't require perfect escape analysis.
The TL;DR version: async/await, the UWP AOT compiler, improved handling of value types, spans (aka slices), and improved GC control (TryStartNoGCRegion()) have their roots in the System C# used in Midori.

There are also some influences of Singularity, namely Bartok and MDIL, on the WP 8.x AOT compiler, but that is no longer relevant.
Ah, so you meant the standard innovations that came out of Midori. I thought you were talking about some new abstraction I hadn't heard of called "outcomes". ¯\_(ツ)_/¯
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)
This isn't useless, because memory allocation can happen during destruction/exit, e.g. when a destructor writes some data to the filesystem.
Suppose you have a container with a billion objects. The container's destructor iterates over each object, doing some housekeeping that requires making a copy and then deleting the original before moving on to the next object.
That requires memory equivalent to only one additional object, because each original is destroyed after its copy is made. Stop deallocating memory during destruction/exit and the total memory required doubles, because you have all the copies but still all the originals.
There are also some helpful things that happen during deallocation. For example, glibc has double-free detection, which strongly implies a potential use-after-free, but it's only detected if the second free() actually gets called.
> The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all.
However, this is different from the bracket pattern that the article is talking about. No one in the Haskell community advocates cleaning up resources (file descriptors and the like) using only GC finalizers.
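For reference, the bracket version, using only functions from base:

  import Control.Exception (bracket)
  import System.IO (IOMode (ReadMode), hClose, hGetLine, openFile)

  -- acquisition and release are paired explicitly; hClose runs
  -- even if the body throws, with no reliance on GC finalizers
  firstLine :: FilePath -> IO String
  firstLine path = bracket (openFile path ReadMode) hClose hGetLine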
You misunderstood me. I'm explaining why simply adopting RAII is inappropriate in Haskell, even though the author thinks it's a better approach. I've edited my comment to make this clearer.
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)
Thank god they do this. How many times have I had to manually force Linux to release sockets because badly coded C programs opened sockets, forgot to release them, and left them hanging around for ~5 minutes after the process ended. With proper RAII classes this does not happen.
Surely what those objects are meant to do is call the shutdown(2) syscall (or the shutdown(3) C library function) on the socket in their destructor, or whatever, to prevent that. But I don't think the same applies to memory: once the process is destroyed, the kernel should reclaim all memory in the process page tables automatically. Otherwise you'd end up with a pretty trivial way of disabling the system by exhausting all the memory...
> Surely what objects are are meant to do is call shutdown(2) syscall - or shutdown(3) C library function
Well, the problem with non-RAII solutions is that you depend on the whims and talent of the programmer to call shutdown at some point. With a RAII solution like in C++ or Rust, you know that if your socket opened successfully, a call to close will necessarily be issued.
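(On the Haskell side of this thread, the same guarantee is spelled as a bracket rather than a destructor. A minimal sketch, assuming the network package:)

  import Control.Exception (bracket)
  import Network.Socket (Family (AF_INET), Socket,
                         SocketType (Stream), close,
                         defaultProtocol, socket)

  -- if 'socket' succeeds, 'close' is guaranteed to run when the
  -- callback returns or throws; the bracket-shaped counterpart
  -- of an RAII socket class
  withTcpSocket :: (Socket -> IO a) -> IO a
  withTcpSocket = bracket (socket AF_INET Stream defaultProtocol) close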
Maybe I'm being dumb here, but with RAII in C++ at least, doesn't shutdown() and then close() have to be called on the socket by the programmer explicitly in the destructor for the class?
> doesn't shutdown() and then close() have to be called on the socket by the programmer explicitly in the destructor for the class?
Yes, but the class only has to be written once, and I have personally never had to write it except in one group project in school, since I use libraries that handle it, e.g. Boost.Asio or Qt Network.

In C, even if you use an abstraction layer, you have to remember to call a _free/_destroy-like function every time you write some code that uses sockets.
Point being, the same is not true for virtual memory.
You could leave the memory deallocation out of the destructors, and it is all going to be returned to the OS instantly on _exit().
I don't know how Linux manages its memory pages. FreeBSD would put all of the anonymous pages onto what is essentially a free-but-not-zeroed queue, with an optional background job to zero pages and move them to a zeroed queue. When a new page is needed, the zeroed queue is checked first; otherwise, non-zeroed pages are zeroed on demand during allocation. (Zeroing can theoretically be skipped in cases where the kernel knows the full page will be written before any reads.)
Zeroing on exit would be more secure, but significantly slower -- you want to exit quickly, so you can potentially start a replacement program, which would be expected to, at least sometimes, take time to allocate the same amount of memory. If it does allocate the whole amount immediately, the total time isn't necessarily any different between zeroing at exit and zeroing on mapping; but if there's enough time for the pages to get zeroed in the background, that reduces the time spent waiting on the kernel.
I'm not sure what a randomized zeroing would get you from a security perspective. You shouldn't need to be concerned about other programs observing the memory, kernels are expected to give programs only zeroed pages. If you're concerned about kernel level memory dumping, randomized zeroing isn't good enough -- it may or may not have zeroed your secrets, so that's not very helpful. Background zeroing doesn't help much here either -- FreeBSD sets a target of zeroing half the free pages, so your secrets may not be zeroed for a long time.
It seems the jury is out on the benefits from a performance perspective (DragonFly BSD took out background zeroing, saying they were unable to observe a performance difference, so simpler code is better).
Why? Taking into account the CPU cache, branch mispredictions, etc., I bet it would be slower than just zeroing. Besides, it wouldn't be secure at all: imagine a process that stores a secret key and then releases the memory. If another process can trigger the first one to generate and release the key memory multiple times, it would be able to read it.
Your example, combined with the parent's observation, shows that C++ puts under the same construct concepts that should be separated: memory allocation should be handled differently from construction, destruction, and the management of other resources.
Memory allocation and deallocation on the heap basically mean calling the `operator new` and `operator delete` functions in C++. The language provides a default implementation but you can override it.
Constructors are orthogonal. The job of a constructor is to construct your object given that the space for the object is already allocated. This could be on the stack, where allocation means bumping the stack pointer, or in-place in preallocated storage (like std::vector), or the result of calling `operator new`. Simply using the `new` syntax does both as a shorthand.
Similarly the job of a destructor is to destruct your object without deallocating it. One can in-place destruct without deallocating, or destruct and then deallocate implicitly when the stack pointer is adjusted, or not at all. The `delete` syntax does both destruction and deallocation as a convenience.
Memory allocation/deallocation in C++ is handled separately from construction and destruction. The new and delete syntax is shorthand for combining the two.
> The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all.
There's also no guarantee that Rust/C++ destructors will be called. It's certainly less of an issue than depending on the GC, but if you need absolute correctness, then you shouldn't rely on destructors.
If a variable has block scope in C++ (i.e. it is a local variable in a function) then its destructor is guaranteed to be called when the block is finished, regardless of whether that is due to a `return` statement or an exception being thrown (or a `break` or `continue`). In what sense do you disagree?
If you allocate an object on the heap with `new` then its destructor isn't called automatically unless you arrange for it through some other mechanism, but the GP comment clearly wasn't claiming that.
There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
If you are consuming an API that provides an object with a destructor, you are correct: you can determine when destructors will be called.
The issue is when you produce an API that contains objects with destructors. Since you are handing these entities off to unknown code, you cannot ensure that they will be dropped. This was a problem in scoped threads in Rust.
I was a little unclear, but that is of course what I meant: I'm talking about the underlying shared data, because the pointers themselves don't have particularly interesting destruction behaviour. (Although the sibling is also correct that not all Rc/Arc/shared_ptr handles to the shared data will have their Drop called.)
I think that falls into the category I mentioned in the third paragraph of my comment: a serious pre-existing bug with other consequences will potentially cause the guarantee to be violated. A similar effect would happen if you had a double free that sometimes caused a crash, which is a similar level of programming mistake to creating a cyclic reference. To me it sits outside of a reasonable definition of "guaranteed".
No, typically, a reference cycle is fine. It results in valid memory that never gets read again, which is unfortunate but not dangerous, whereas double-frees can result in memory corruption. http://huonw.github.io/blog/2016/04/memory-leaks-are-memory-...
Not the parent, but it is trivial to write C++ and Rust examples in which the destructors of variables with block scope are not called. The standard libraries of both languages even come with utilities to do this:
C++ structs:

  #include <iostream>  // std::cout
  #include <new>       // placement new
  #include <type_traits>

  struct Foo {
      Foo() { std::cout << "Foo()" << std::endl; }
      ~Foo() { std::cout << "~Foo()" << std::endl; }
  };

  {
      // note aligned_storage_t, i.e. the actual storage type,
      // not the std::aligned_storage trait itself
      std::aligned_storage_t<sizeof(Foo), alignof(Foo)> foo;
      new (&foo) Foo;
      /* destructor never called, even though a Foo
         lives in block scope and its storage is
         freed when the block exits */
  }
C++ unions:

  struct Noisy {
      Noisy() { std::cout << "Noisy()" << std::endl; }
      ~Noisy() { std::cout << "~Noisy()" << std::endl; }
  };

  union Foo {
      Noisy n;
      Foo() : n() {}
      ~Foo() {}  /* members' destructors are never run implicitly */
  };

  {
      Foo foo;
      /* Noisy's destructor never called */
  }
Rust:

  struct Foo;

  impl Drop for Foo {
      fn drop(&mut self) {
          println!("drop!");
      }
  }

  {
      let _foo = std::mem::ManuallyDrop::new(Foo);
      /* destructor never called */
  }
etc.
> There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
This is pretty much why it is impossible for a programming language to guarantee that destructors will be called.
It might seem trivial, but even when you have automatic storage, any of the things you mention can happen, such that the destructors are never reached.
In general, C++, Rust, etc. cannot guarantee that destructors will be called, because it is also trivial to make that impossible once you start using the heap (e.g. a `shared_ptr` cycle will never be freed).
I'm not sure it's possible to force any code to be run (e.g. a process can be terminated at any time) although a closure might offer slightly stronger guarantees in some situations.