The "Optimized Tarball Extraction" confuses me a bit. It begins by illustrating how other package managers have to repeatedly copy the received, compressed data into larger and larger buffers (not mentioning anything about the buffer where the decompressed data goes), and then says that:
> Bun takes a different approach by buffering the entire tarball before decompressing.
But it seems to sidestep _how_ it does this any differently from the "bad" snippet the section opened with (presumably it checks the Content-Length header when fetching the tarball, and can assume the size it reports is correct). All it says about this is:
> Once Bun has the complete tarball in memory it can read the last 4 bytes of the gzip format.
Then it explains how it can pre-allocate a buffer for the decompressed data, but we never saw how this buffer allocation happens in the "bad" example!
> These bytes are special since they store the uncompressed size of the file! Instead of having to guess how large the uncompressed file will be, Bun can pre-allocate memory to eliminate buffer resizing entirely
Presumably the saving is in the slow package managers having to expand _both_ of the buffers involved, while Bun preallocates at least one of them?
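If it helps, here's my reading of the trick as a sketch in Node-flavoured TypeScript (the function name and the use of zlib are mine; Bun's actual implementation is in Zig, this is just the idea as I understand it):

```ts
import { createGunzip } from "node:zlib";

// Hypothetical sketch: buffer the whole .tgz first, then use the gzip
// trailer to size the output buffer before decompressing.
function gunzipPreallocated(tarballGz: Buffer): Promise<Buffer> {
  // The last 4 bytes of a gzip stream (ISIZE) hold the uncompressed
  // size modulo 2^32, little-endian.
  const isize = tarballGz.readUInt32LE(tarballGz.length - 4);
  const out = Buffer.allocUnsafe(isize); // allocated once, never resized
  let offset = 0;

  return new Promise((resolve, reject) => {
    const gunzip = createGunzip();
    gunzip.on("data", (chunk: Buffer) => {
      chunk.copy(out, offset); // copy into the pre-sized buffer
      offset += chunk.length;
    });
    gunzip.on("end", () => resolve(out.subarray(0, offset)));
    gunzip.on("error", reject);
    gunzip.end(tarballGz); // we already hold the complete tarball
  });
}
```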
I think my actual issue is that the "most package managers do something like this" example code snippet at the start of [1] doesn't seem to quite make sense - or doesn't match what I guess would actually happen in the decompress-in-a-loop scenario?
As in, it appears to illustrate building up a buffer holding the compressed data as it's received (the "// ... decompress from buffer ..." comment at the end suggests what arrives in `chunk` is compressed), but I'd guess the real problem with the decompress-as-the-data-arrives approach is having to re-allocate the buffer for the *decompressed* data?
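To make that guess concrete, here's a hypothetical sketch (my names, not any package manager's actual code) of what I imagine the streaming path looks like, with the repeated re-allocation happening on the decompressed side:

```ts
import { createGunzip } from "node:zlib";
import { get } from "node:https";

// Guess at the decompress-as-it-arrives approach: the *output* buffer,
// not the compressed one, is what keeps growing.
function fetchAndGunzip(url: string): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const gunzip = createGunzip();
    let out = Buffer.alloc(0);
    gunzip.on("data", (chunk: Buffer) => {
      // Each concat allocates a bigger buffer and copies everything
      // decompressed so far -- the repeated-resize cost.
      out = Buffer.concat([out, chunk]);
    });
    gunzip.on("end", () => resolve(out));
    gunzip.on("error", reject);
    get(url, (res) => res.pipe(gunzip)).on("error", reject);
  });
}
```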