Hacker News

Very cool. Typo: "I barely understand how Huffman Coding and DTC works" I guess you mean DCT? (discrete cosine transform)

One other thought: To make it so that compression and decompression can be multithreaded, you might want to 'reset' the stream every N rows. (i.e. break any RLE runs, start from a colour literal). This would allow a bunch of threads to start at different places in the image, in parallel. There would be some small cost to compression ratio but might be worth it.
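A minimal sketch of the reset idea, using plain RLE as a stand-in for the real stream format (function names and the band size are made up for illustration): every band of N rows is encoded as a self-contained stream, so no run crosses a band boundary and a decoder can hand each band to a different thread.

```python
from concurrent.futures import ThreadPoolExecutor

BAND_ROWS = 64  # reset the stream every 64 rows (tunable trade-off)

def rle_encode(pixels):
    """Plain RLE as (count, value) pairs; stands in for the real stream."""
    out = []
    i = 0
    while i < len(pixels):
        j = i
        while j < len(pixels) and pixels[j] == pixels[i] and j - i < 255:
            j += 1
        out.append((j - i, pixels[i]))
        i = j
    return out

def rle_decode(runs):
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out

def encode_bands(rows):
    """Encode each band of BAND_ROWS rows independently: runs are
    broken at band boundaries, so every band starts from a literal."""
    bands = []
    for start in range(0, len(rows), BAND_ROWS):
        flat = [p for row in rows[start:start + BAND_ROWS] for p in row]
        bands.append(rle_encode(flat))
    return bands

def decode_bands(bands):
    # Each band is self-contained, so bands decode concurrently.
    with ThreadPoolExecutor() as pool:
        return [p for band in pool.map(rle_decode, bands) for p in band]
```

The compression-ratio cost is just the runs that get broken at each boundary, which is why it stays small for reasonable band sizes.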



Thanks, fixed!

I'll probably investigate "resetting" the stream to allow for multithreaded en-/decode when I try to roll this into a video codec.


Before you start rolling this into a video codec, please consult the sources of UT Codec, which has been state of the art in this space for many years. Or, at least, use UT Codec as a direct basis for comparison, to see if you are beating it in terms of compression/decompression speed.

Modern lossless video codecs don't care that much about saving space, since you're burning gigabytes per minute anyway; the key is that the compression and decompression are as transparent as possible, to reduce the bandwidth to/from the storage medium. A good lossless codec can be used to scrub through and edit on a video editor's timeline, so decomp performance is what's most important.

Also, any such lossless video codec is going to need more than 8 bits per component; most real-world footage coming off of phones and cameras is 10-bit Rec.2020. If it is too difficult to position QOI for real-world sources, you can certainly keep it 8-bit and market it as an animation codec or similar; just keep it in mind.


Even better might be breaking it into tiles, so you get some vertical locality happening as well as horizontal.

It would be interesting to know which strategy is used for each pixel. Have you tried making maps of that with four different colours? Particularly interesting would be how much the cache is used and also how often it replaces a colour that it could have used later. Maybe there's a better colour hash. BTW you might want to change "(really just r^g^b^a)" to "(really just (r^g^b^a)%64)".
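One way such a map could be generated, sketched with hypothetical QOI-like strategy names (RUN/INDEX/DIFF/LITERAL) and the (r^g^b^a)%64 cache index discussed above; the DIFF threshold here is illustrative, not the codec's actual rule.

```python
def hash_rgba(r, g, b, a):
    # the cache index discussed above: (r ^ g ^ b ^ a) % 64
    return (r ^ g ^ b ^ a) % 64

def strategy_map(pixels):
    """Label each pixel with the (hypothetical, QOI-like) strategy an
    encoder might pick. Rendering these labels as four colours gives
    the per-pixel map suggested above."""
    cache = [None] * 64
    labels = []
    prev = (0, 0, 0, 255)
    for px in pixels:
        if px == prev:
            labels.append("RUN")
        elif cache[hash_rgba(*px)] == px:
            labels.append("INDEX")
        elif all(abs(c - p) <= 2 for c, p in zip(px[:3], prev[:3])) and px[3] == prev[3]:
            labels.append("DIFF")
        else:
            labels.append("LITERAL")
        cache[hash_rgba(*px)] = px  # a later colour may evict this entry
        prev = px
    return labels
```

Tracking which cache writes evict a colour that gets looked up again later would answer the "replaced a colour it could have used" question directly.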


I really like the trick of caching previously-seen pixels by value rather than position. Very nifty!

I was thinking this would combine well with progressive rendering (i.e. store pixels in mipmap-style order) so the cache is warmed up with a spread of pixels across the image rather than just scanline order. That doesn’t make it parallelizable in the way tiling does, though, hmm.
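One way to approximate that ordering, purely as a sketch (nothing the format defines): visit rows and columns in bit-reversed order, so the first pixels decoded are spread across the whole image before fine detail arrives.

```python
def bit_reverse(n, bits):
    """Reverse the low `bits` bits of n."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (n & 1)
        n >>= 1
    return out

def progressive_order(width_bits, height_bits):
    """Yield (row, col) pairs in bit-reversed order, mipmap-style:
    early pixels sample the whole image, warming a value cache with
    a representative spread of colours."""
    rows = sorted(range(1 << height_bits),
                  key=lambda r: bit_reverse(r, height_bits))
    cols = sorted(range(1 << width_bits),
                  key=lambda c: bit_reverse(c, width_bits))
    for r in rows:
        for c in cols:
            yield r, c
```

As noted, this trades away the band-level independence that makes tiled or row-reset streams easy to parallelize.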

Another tweak that would probably help is having the file specify a rotation (or even a full mapping table) for each component of the hash, so a clever encoder can pick values that minimise collisions. (A fast encoder can just use all-zeroes to get the same behaviour as before.)
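A sketch of what that tweak could look like, assuming the file header carried four small per-component rotation amounts (a hypothetical extension, not part of the format):

```python
def rotl8(x, k):
    # rotate an 8-bit value left by k bits
    k &= 7
    return ((x << k) | (x >> (8 - k))) & 0xFF if k else x

def param_hash(px, rots=(0, 0, 0, 0)):
    """Cache index with a per-component rotation read from the file
    header (hypothetical). All-zero rotations reduce to the original
    (r ^ g ^ b ^ a) % 64, so a fast encoder can just write zeroes."""
    r, g, b, a = px
    return (rotl8(r, rots[0]) ^ rotl8(g, rots[1]) ^
            rotl8(b, rots[2]) ^ rotl8(a, rots[3])) % 64
```

A clever encoder could grid-search the 8^4 rotation combinations on a sample of the image and keep whichever minimises cache collisions.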



