That is nice, although I think Heartbleed was due to a missing bounds check enab...

NobodyNada · on March 13, 2021

If my memory is correct: yes, the root cause was a missing bounds check, but the vulnerability was much worse than it could have been because OpenSSL tended to allocate small blocks of memory and aggressively reuse them, meaning the exploited buffer was very likely to sit close to sensitive information.
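To make the "missing bounds check" part concrete, here is a minimal sketch. It is not OpenSSL's code: the `arena` layout, the offsets, and the function names are all hypothetical, and the "heap" is a single array so the over-read stays inside defined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy "heap": the request payload sits at offset 0 and an unrelated secret
   sits at SECRET_OFF, mimicking neighbouring allocations. Hypothetical layout. */
enum { ARENA_SIZE = 64, SECRET_OFF = 16 };

struct arena { unsigned char bytes[ARENA_SIZE]; };

void arena_init(struct arena *a, const char *payload, size_t payload_len,
                const char *secret, size_t secret_len) {
    assert(payload_len <= SECRET_OFF);
    assert(secret_len <= ARENA_SIZE - SECRET_OFF);
    memset(a->bytes, 0, ARENA_SIZE);
    memcpy(a->bytes, payload, payload_len);
    memcpy(a->bytes + SECRET_OFF, secret, secret_len);
}

/* Vulnerable echo: copies as many bytes as the peer *claims* it sent.
   The clamps exist only to keep this toy inside its own arena and the
   output buffer; the real bug had no such limit. */
size_t echo_unchecked(const struct arena *a, size_t claimed_len,
                      unsigned char *out, size_t out_cap) {
    if (claimed_len > ARENA_SIZE) claimed_len = ARENA_SIZE;
    if (claimed_len > out_cap) claimed_len = out_cap;
    memcpy(out, a->bytes, claimed_len);
    return claimed_len;
}

/* Fixed echo: drops the request when the claimed length exceeds the
   number of payload bytes actually received. */
size_t echo_checked(const struct arena *a, size_t actual_len, size_t claimed_len,
                    unsigned char *out, size_t out_cap) {
    if (claimed_len > actual_len || claimed_len > out_cap) return 0;
    memcpy(out, a->bytes, claimed_len);
    return claimed_len;
}
```

With a 2-byte payload and a claimed length of 32, `echo_unchecked` hands back the neighbouring secret as well, while `echo_checked` returns 0 bytes.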

I don’t have time right now to research the full details, but the Wikipedia article gives a clue:

> Theo de Raadt, founder and leader of the OpenBSD and OpenSSH projects, has criticized the OpenSSL developers for writing their own memory management routines and thereby, he claims, circumventing OpenBSD C standard library exploit countermeasures, saying "OpenSSL is not developed by a responsible team." Following Heartbleed's disclosure, members of the OpenBSD project forked OpenSSL into LibreSSL.

saagarjha · on March 13, 2021

Until very recently, memory allocators were more than happy to return you the thing you just deallocated if you asked for another allocation of the same size. It makes sense, too: if you're calling malloc/free in a loop, which is pretty common, this is pretty much the best thing you can do for performance. Countless heap exploits later (mostly attacking heap metadata rather than stale data, to be honest) allocators have begun to realize that predictable allocation patterns might not be the best idea, so they're starting to move away from this.
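A rough sketch of that "give back the thing you just freed" behavior, using a toy fixed-size pool with a LIFO free stack. The `pool` type and its functions are made up for illustration and do not describe any particular real allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BLOCK_SIZE = 32, NUM_BLOCKS = 4 };

struct pool {
    unsigned char storage[NUM_BLOCKS][BLOCK_SIZE];
    int free_stack[NUM_BLOCKS];  /* indices of currently free blocks */
    int free_top;                /* number of valid entries in free_stack */
};

void pool_init(struct pool *p) {
    memset(p->storage, 0, sizeof p->storage);
    for (int i = 0; i < NUM_BLOCKS; i++)
        p->free_stack[i] = NUM_BLOCKS - 1 - i;  /* block 0 ends up on top */
    p->free_top = NUM_BLOCKS;
}

/* Pops the most recently freed block; its old contents are left untouched. */
void *pool_alloc(struct pool *p) {
    if (p->free_top == 0) return NULL;
    return p->storage[p->free_stack[--p->free_top]];
}

/* Pushes the block back on top of the stack without clearing it. */
void pool_free(struct pool *p, void *ptr) {
    ptrdiff_t off = (unsigned char *)ptr - &p->storage[0][0];
    assert(off >= 0 && off % BLOCK_SIZE == 0 && off / BLOCK_SIZE < NUM_BLOCKS);
    assert(p->free_top < NUM_BLOCKS);
    p->free_stack[p->free_top++] = (int)(off / BLOCK_SIZE);
}
```

Allocate, write a secret, free, and allocate again: the second allocation returns the same address with the secret still in it. That is great for cache locality and exactly what makes stale data predictable.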

rcxdude · on March 13, 2021

True of the more common ones, but it should be acknowledged that OpenBSD was doing this kind of thing (and many other hardening techniques) before Heartbleed. That was the main reason Theo de Raadt was so upset that OpenSSL decided to circumvent it: OpenBSD's allocator could otherwise have mitigated the impact.

loeg · on March 13, 2021

Even higher-performance mallocs like jemalloc had heap debugging features (poisoning freed memory) before Heartbleed, which -- if enabled -- would catch use-after-frees, so long as libraries and applications didn't circumvent malloc like OpenSSL did (and Python still does AFAIK).
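A small sketch of why poisoning on free only helps when frees actually reach the allocator. The `slot` type, the function names, and the poison byte are assumptions for this example, not jemalloc's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SLOT_SIZE = 16 };
#define POISON_BYTE 0x5A  /* arbitrary junk value chosen for this sketch */

struct slot {
    unsigned char data[SLOT_SIZE];
    int in_use;
};

/* What a debugging/hardened allocator does on free: overwrite, then release. */
void slot_free_poisoned(struct slot *s) {
    assert(s->in_use);
    memset(s->data, POISON_BYTE, SLOT_SIZE);
    s->in_use = 0;
}

/* What an application-level cache does: keep the block around for reuse.
   The allocator never sees a free, so nothing gets poisoned. */
void slot_free_cached(struct slot *s) {
    assert(s->in_use);
    s->in_use = 0;
}

/* Returns 1 if every byte of the slot carries the poison pattern. */
int slot_is_poisoned(const struct slot *s) {
    for (size_t i = 0; i < SLOT_SIZE; i++)
        if (s->data[i] != POISON_BYTE) return 0;
    return 1;
}
```

A stale read after `slot_free_poisoned` sees only junk, which tends to crash or stand out quickly. After `slot_free_cached` the old bytes are still there to be leaked.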

supergarfield · on March 13, 2021

> and Python still does AFAIK

Don't you sort of have to do that if you're writing your own garbage collector, though? I guess for a simple collector you could maintain lists of allocated objects separately, but precisely controlling where the memory is allocated is important for any kind of performant implementation.

loeg · on March 13, 2021

Python does refcount-based memory management. It's not a GC design. You don't have to retain objects in an internal linked list when the refcount drops to zero, but CPython does, purely as a performance optimization.

Type-specific free lists (just a few examples; there are more):

* https://github.com/python/cpython/blob/master/Objects/floato...

* https://github.com/python/cpython/blob/master/Objects/tupleo...

And also just wrapping malloc in general. There's no refcounting reason for this, they just assume system malloc is slow (which might be true, for glibc) and wrap it in the default build configuration:

https://github.com/python/cpython/blob/master/Objects/obmall...

So many layers of wrapping malloc, just because system allocators were slow in 2000. Defeats free() poisoning and ASAN. obmalloc can be disabled by turning off PYMALLOC, but that doesn't disable the per-type freelists IIRC. And PYMALLOC is enabled by default.
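For readers who have not seen the pattern, a per-type free list on top of refcounting looks roughly like this. It is a simplified sketch, not CPython's code: `struct obj`, `obj_new`, `obj_incref`, and `obj_decref` are hypothetical names, and the list here is unbounded.

```c
#include <assert.h>
#include <stdlib.h>

struct obj {
    long refcount;
    double value;
    struct obj *next_free;  /* only meaningful while the object is on the free list */
};

static struct obj *free_list = NULL;

/* Reuses a dead object from the free list if there is one; otherwise mallocs. */
struct obj *obj_new(double v) {
    struct obj *o = free_list;
    if (o != NULL) {
        free_list = o->next_free;
    } else {
        o = malloc(sizeof *o);
        if (o == NULL) return NULL;
    }
    o->refcount = 1;
    o->value = v;
    o->next_free = NULL;
    return o;
}

void obj_incref(struct obj *o) {
    assert(o->refcount > 0);
    o->refcount++;
}

/* At refcount zero the object is parked on the free list instead of being
   passed to free(), so allocator-level poisoning and ASan never see it die. */
void obj_decref(struct obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        o->next_free = free_list;
        free_list = o;
    }
}
```

Dropping the last reference and then calling `obj_new` again returns the same address, which is the speedup and also the reason a use-after-free on such an object goes unnoticed by the tools.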

supergarfield · on March 15, 2021

Thanks for the links! I wasn't aware of the PyMem_ layer above, the justification for that does sound bad.

But Python runs a generational GC in addition to refcounting to catch cycles (https://docs.python.org/3/library/gc.html): isn't fine control over allocation necessary for that? E.g. to efficiently clear the nursery?

waterhouse · on March 13, 2021

Ah, good point; at the very least things like zeroing out buffers upon deallocation would have helped. Yes, I was a fan of the commits showing up at opensslrampage.org. One of the highlights was when they found it would use private keys as an entropy source: https://opensslrampage.org/post/83007010531/well-even-if-tim...
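A minimal sketch of zeroing on deallocation. The helper names `secure_clear` and `free_zeroed` are made up here. The write goes through a `volatile` pointer because a plain `memset` right before `free` is a dead store that an optimizer may remove; where available, platform helpers such as `explicit_bzero` or C11 Annex K `memset_s` serve the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Overwrites len bytes with zeros through a volatile pointer so the
   stores are not optimized away as dead writes. */
void secure_clear(void *ptr, size_t len) {
    assert(ptr != NULL || len == 0);
    volatile unsigned char *p = ptr;
    while (len--) *p++ = 0;
}

/* Clears a heap buffer before handing it back to the allocator.
   The caller has to supply the size, since free() does not expose it. */
void free_zeroed(void *ptr, size_t len) {
    if (ptr == NULL) return;
    secure_clear(ptr, len);
    free(ptr);
}
```

This only helps for buffers the program remembers to release this way, which is why doing it inside the allocator is the more robust place.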

fulafel · on March 13, 2021

That's what happens with normal malloc/free anyway, no? Implementations of malloc have a strong performance incentive to allocate from the cache-hot, most recently freed blocks.

ric129 · on March 13, 2021

Yes, all allocators (except perhaps OpenBSD's, from what I see in this thread) do this. It is also why `calloc` exists: zero-initializing every single allocation is really, really expensive, so zeroing is opt-in.
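Roughly what `calloc` adds on top of `malloc`, written out as a sketch. `zeroed_alloc` is a hypothetical stand-in; real implementations differ and can sometimes skip the explicit clear when the memory comes fresh from the OS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* malloc plus an overflow check plus an explicit clear. The memset is the
   extra cost that plain malloc avoids by returning whatever bytes the
   block held before. */
void *zeroed_alloc(size_t nmemb, size_t size) {
    if (size != 0 && nmemb > SIZE_MAX / size) return NULL;  /* nmemb * size would overflow */
    size_t total = nmemb * size;
    assert(size == 0 || total / size == nmemb);  /* holds because of the guard above */
    void *p = malloc(total);
    if (p != NULL) memset(p, 0, total);
    return p;
}
```

Callers that need a clean buffer pay for the clear; everyone else gets the faster path and, with it, whatever the previous owner left behind.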

gameswithgo · on March 13, 2021

IIRC both issues caused the problem. The buffer overflow (strictly, an over-read) let the memory get read, and reuse meant there was important data in that memory.