The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all. Rust drops (runs destructor and deallocates) values as soon as they go out of scope; C++ too. In Haskell you depend on the whims of the GC, which makes RAII unusable. (The Haskell approach of not guaranteeing destructors being called does have its merits; when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)).
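For instance, here's a minimal sketch using addFinalizer from base's System.Mem.Weak (GHC makes no promise about when, or whether, the finalizer fires):

  import System.Mem (performGC)
  import System.Mem.Weak (addFinalizer)

  main :: IO ()
  main = do
    let xs = [1 .. 1000] :: [Int]
    addFinalizer xs (putStrLn "finalizer ran")
    print (sum xs)
    performGC
    -- Even after an explicit GC, GHC only promises the finalizer
    -- runs "at some point after" xs becomes unreachable; if the
    -- program exits first, it may never run at all.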
Therefore the RAII style wouldn't really work in Haskell; the current bracket approach is still the better choice there.
That said, the ST-style trick of a phantom type variable is pretty well known. Unfortunately, not many people know that the same trick can be used outside of ST as well. I feel like, as a community, we should be encouraging this style more often.
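Here's a minimal sketch of what I mean, with made-up names (the rank-2 callback plays the same role as ST's state token):

  {-# LANGUAGE RankNTypes #-}
  import Control.Exception (bracket)

  -- a handle branded with a phantom region parameter 's'
  newtype Handle' s = Handle' FilePath

  -- the callback must be polymorphic in 's', so a Handle' cannot
  -- escape the bracket that created it: the runST trick applied
  -- to ordinary IO resource management
  withHandle :: FilePath -> (forall s. Handle' s -> IO a) -> IO a
  withHandle path use =
    bracket (putStrLn "open" >> pure (Handle' path))
            (\_ -> putStrLn "close")
            use

  main :: IO ()
  main = withHandle "data.txt" (\_h -> putStrLn "using the handle")
  -- 'withHandle "data.txt" pure' is rejected: the handle would escape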
UPDATE: I wrote the original comment with the incorrect assumption that drop functions will always be called in Rust. This is wrong. Please see child comments.
I don't know if it's fair to call that "the Haskell approach", per se. That destructors are not guaranteed to run, or to run predictably, is generally a property of all fast garbage collectors. If you want a GC that runs quickly (and in a language like Haskell, where you get lots of small allocations in contexts where it would be difficult to impossible for either the compiler or the programmer to efficiently determine the exact moment a scope dies, you absolutely do), then one of the costs is that you can't afford to run code for every destroyed object.
The linked post is interesting, because I didn't realise "RAII is a much better way of managing resources than destructors" was controversial. It absolutely is: RAII is fast, predictable, and flexible. Giving it up is also one of the tradeoffs some languages make for more flexibility in their design, by enabling performant automatic garbage collection that doesn't require perfect escape analysis.
The TL;DR version: async/await, the UWP AOT compiler, improved handling of value types, spans (aka slices), and improved GC control (TryStartNoGCRegion()) have their roots in the System C# used in Midori.

There are also some influences of Singularity, namely Bartok and MDIL, on the WP 8.x AOT compiler, but that is no longer relevant.
Ah, so you meant the standard innovations that came out of Midori. I thought you were talking about some new abstraction I hadn't heard of called "outcomes". ¯\_(ツ)_/¯
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)
This isn't useless, because memory allocation can happen during destruction/exit, e.g. when a destructor writes some data to the filesystem.
Suppose you have a container with a billion objects. The container's destructor iterates over each object, doing some housekeeping that requires making a copy and then deleting the original before moving on to the next object.
That requires memory equivalent to only one additional object, because each original is destroyed after its copy is made. Stop deallocating memory during destruction/exit and the total memory required doubles, because you have all the copies but still all the originals.
There are also some helpful things that happen during deallocation. For example, glibc has double-free detection, which strongly implies a potential use-after-free, but it's only detected if the second free() actually gets called.
> The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all.
However, this is different from the bracket pattern that the article is talking about. No one in the Haskell community advocates cleaning up resources (file descriptors and the like) using only GC finalizers.
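For reference, the bracket version, using only functions from base:

  import Control.Exception (bracket)
  import System.IO (IOMode (ReadMode), hClose, hGetLine, openFile)

  -- acquisition and release are paired explicitly; hClose runs
  -- even if the body throws, with no reliance on GC finalizers
  firstLine :: FilePath -> IO String
  firstLine path = bracket (openFile path ReadMode) hClose hGetLine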
You misunderstood me. I'm explaining why simply adopting RAII is inappropriate in Haskell, even though the author thinks it's a better approach. I've edited my comment to make this clearer.
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)
Thank god they do this. How many times have I had to manually force Linux to release sockets because badly coded C programs opened sockets, forgot to release them, and left them hanging around for ~5 minutes after the process ended. With proper RAII classes this does not happen.
Surely what those objects are meant to do is call the shutdown(2) syscall (or the shutdown(3) C library function) on the socket in their destructor, or whatever, to prevent that. But I don't think the same applies to memory: once the process is destroyed, the kernel should reclaim all memory in the process page tables automatically. Otherwise you'd end up with a pretty trivial way of disabling the system by exhausting all the memory...
> Surely what objects are are meant to do is call shutdown(2) syscall - or shutdown(3) C library function
Well, the problem with non-RAII solutions is that you depend on the whims and talent of the programmer to call shutdown at some point. With a RAII solution like in C++ or Rust, you know that if your socket opened successfully, a call to close will necessarily be issued.
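(On the Haskell side of this thread, the same guarantee is spelled as a bracket rather than a destructor. A minimal sketch, assuming the network package:)

  import Control.Exception (bracket)
  import Network.Socket (Family (AF_INET), Socket,
                         SocketType (Stream), close,
                         defaultProtocol, socket)

  -- if 'socket' succeeds, 'close' is guaranteed to run when the
  -- callback returns or throws; the bracket-shaped counterpart
  -- of an RAII socket class
  withTcpSocket :: (Socket -> IO a) -> IO a
  withTcpSocket = bracket (socket AF_INET Stream defaultProtocol) close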
Maybe I'm being dumb here, but with RAII in C++ at least, doesn't shutdown() and then close() have to be called on the socket by the programmer explicitly in the destructor for the class?
> doesn't shutdown() and then close() have to be called on the socket by the programmer explicitly in the destructor for the class?
Yes, but the class only has to be written once, and I have personally never had to write it except in one group project in school, since I use libraries that handle it, e.g. Boost.Asio or Qt Network.

In C, even if you use an abstraction layer, you have to remember to call a _free/_destroy-like function every time you write some code that uses sockets.
Point being, the same is not true for virtual memory.
You could leave the memory deallocation out of the destructors, and it is all going to be returned to the OS instantly on _exit().
I don't know how Linux manages its memory pages. FreeBSD would put all of the anonymous pages onto what is essentially a free-but-not-zeroed queue, with an optional background job to zero pages and move them to a zeroed queue. When a new page is needed, the zeroed queue is checked first; otherwise, non-zeroed pages are zeroed on demand during allocation. (Zeroing can theoretically be skipped in cases where the kernel knows the full page will be written before any reads.)
Zeroing on exit would be more secure, but significantly slower -- you want to exit quickly, so you can potentially start a replacement program, which would be expected to, at least sometimes, take time to allocate the same amount of memory. If it does allocate the whole amount immediately, the total time isn't necessarily any different between zeroing at exit and zeroing on mapping; but if there's enough time for the pages to get zeroed in the background, that reduces the time spent waiting on the kernel.
I'm not sure what a randomized zeroing would get you from a security perspective. You shouldn't need to be concerned about other programs observing the memory, kernels are expected to give programs only zeroed pages. If you're concerned about kernel level memory dumping, randomized zeroing isn't good enough -- it may or may not have zeroed your secrets, so that's not very helpful. Background zeroing doesn't help much here either -- FreeBSD sets a target of zeroing half the free pages, so your secrets may not be zeroed for a long time.
It seems the jury is out on the benefits from a performance perspective (DragonFly BSD took out background zeroing, saying they were unable to observe a performance difference, so simpler code is better).
Why? Taking into account the CPU cache, branch mispredictions, etc., I bet it would be slower than just zeroing. Besides, it wouldn't be secure at all: imagine a process that stores a secret key and then releases the memory. If another process can trigger the first one to generate and release the key memory multiple times, it would be able to read it.
Your example, combined with the parent's observation, shows that C++ puts under the same construct concepts that should be separated: memory allocation should be handled differently from construction, destruction, and the management of other resources.
Memory allocation and deallocation on the heap basically mean calling the `operator new` and `operator delete` functions in C++. The language provides a default implementation but you can override it.
Constructors are orthogonal. The job of a constructor is to construct your object given that the space for the object is already allocated. This could be on the stack, where allocation means bumping the stack pointer, or in-place in preallocated storage (like std::vector), or the result of calling `operator new`. Simply using the `new` syntax does both as a shorthand.
Similarly the job of a destructor is to destruct your object without deallocating it. One can in-place destruct without deallocating, or destruct and then deallocate implicitly when the stack pointer is adjusted, or not at all. The `delete` syntax does both destruction and deallocation as a convenience.
Memory allocation/deallocation in C++ is handled separately from construction and destruction. The new and delete syntax is shorthand for combining the two.
> The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all.
There's also no guarantee that Rust/C++ destructors will be called. It's certainly less of an issue than depending on the GC, but if you need absolute correctness, then you shouldn't rely on destructors.
If a variable has block scope in C++ (i.e. it is a local variable in a function) then its destructor is guaranteed to be called when the block is finished, regardless of whether that is due to a `return` statement or an exception being thrown (or a `break` or `continue`). In what sense do you disagree?
If you allocate an object on the heap with `new` then its destructor isn't called automatically unless you arrange for it through some other mechanism, but the GP comment clearly wasn't claiming that.
There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
If you are consuming an API that provides an object with a destructor, you are correct: you can determine when destructors will be called.
The issue is when you produce an API that contains objects with destructors. Since you are handing these entities off to unknown code, you cannot ensure that they will be dropped. This was a problem in scoped threads in Rust.
I was a little unclear, but that is of course what I meant: I'm talking about the underlying shared data, because the pointers themselves don't have particularly interesting destruction behaviour. (Although the sibling is also correct that not all Rc/Arc/shared_ptr handles to the shared data will have their Drop called.)
I think that falls into the category I mentioned in the third paragraph of my comment: a serious pre-existing bug with other consequences will potentially cause the guarantee to be violated. A similar effect would happen if you had a double free that sometimes caused a crash, which is a similar level of programming mistake to creating a cyclic reference. To me it sits outside of a reasonable definition of "guaranteed".
No, typically, a reference cycle is fine. It results in valid memory that never gets read again, which is unfortunate but not dangerous, whereas double-frees can result in memory corruption. http://huonw.github.io/blog/2016/04/memory-leaks-are-memory-...
Not the parent, but it is trivial to write C++ and Rust examples in which the destructors of variables with block scope are not called. The standard libraries of both languages even come with utilities to do this:
C++ structs:

  #include <iostream>  // std::cout
  #include <new>       // placement new
  #include <type_traits>

  struct Foo {
      Foo() { std::cout << "Foo()" << std::endl; }
      ~Foo() { std::cout << "~Foo()" << std::endl; }
  };

  {
      // note aligned_storage_t, i.e. the actual storage type,
      // not the std::aligned_storage trait itself
      std::aligned_storage_t<sizeof(Foo), alignof(Foo)> foo;
      new (&foo) Foo;
      /* destructor never called, even though a Foo
         lives in block scope and its storage is
         freed when the block exits */
  }
C++ unions:

  struct Noisy {
      Noisy() { std::cout << "Noisy()" << std::endl; }
      ~Noisy() { std::cout << "~Noisy()" << std::endl; }
  };

  union Foo {
      Noisy n;
      Foo() : n() {}
      ~Foo() {}  /* members' destructors are never run implicitly */
  };

  {
      Foo foo;
      /* Noisy's destructor never called */
  }
Rust:

  struct Foo;

  impl Drop for Foo {
      fn drop(&mut self) {
          println!("drop!");
      }
  }

  {
      let _foo = std::mem::ManuallyDrop::new(Foo);
      /* destructor never called */
  }
etc.
> There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
This is pretty much why it is impossible for a programming language to guarantee that destructors will be called.
It might seem trivial, but even when you have automatic storage, any of the things you mention can happen, such that the destructors are never reached.
In general, C++, Rust, etc. cannot guarantee that destructors will be called, because it is also trivial to make that impossible once you start using the heap (e.g. a `shared_ptr` cycle will never be freed).
I'm not sure it's possible to force any code to be run (e.g. a process can be terminated at any time) although a closure might offer slightly stronger guarantees in some situations.