

rushingcreek 5 months ago | on: Open models by OpenAI

The native FP4 is one of the most interesting architectural aspects here IMO, as going below FP8 is known to come with accuracy tradeoffs. I'm curious how they navigated this and how the FP8 weights (if they exist) would perform.

buildbot 5 months ago

One thing to note is that MXFP4 is a block-scaled format, with 4.25 bits per weight. This lets it represent a lot more numbers than raw FP4 alone would with, say, 1 mantissa bit and 2 exponent bits.
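
To make the 4.25 figure concrete, here is a rough Python sketch of how a block-scaled FP4 format can be decoded. It assumes the layout I understand the OCP Microscaling spec to use for MXFP4: 32 elements per block, each a 4-bit E2M1 value (1 sign, 2 exponent, 1 mantissa bit), sharing one 8-bit power-of-two scale with bias 127. The function names are made up for illustration and are not from any library.

```python
# Magnitudes a 4-bit E2M1 element can take (2 exponent bits, 1 mantissa bit),
# indexed by the low 3 bits of the code. Assumed from the MX spec's FP4 layout.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

BLOCK_SIZE = 32    # assumed: elements sharing one scale
ELEMENT_BITS = 4   # bits per FP4 element
SCALE_BITS = 8     # assumed: one shared 8-bit exponent per block


def bits_per_weight(block_size=BLOCK_SIZE, element_bits=ELEMENT_BITS,
                    scale_bits=SCALE_BITS):
    """Average storage per weight: element bits plus the amortized scale."""
    return element_bits + scale_bits / block_size


def decode_element(code):
    """Decode one 4-bit code: top bit is the sign, low 3 bits pick a magnitude."""
    sign = -1.0 if code & 0b1000 else 1.0
    return sign * E2M1_MAGNITUDES[code & 0b0111]


def decode_block(scale_exponent, codes):
    """Apply the shared power-of-two scale (bias 127 assumed) to every element."""
    scale = 2.0 ** (scale_exponent - 127)
    return [scale * decode_element(c) for c in codes]


print(bits_per_weight())                    # 4 + 8/32
print(decode_block(128, [0b0010, 0b0111]))  # same codes, scale of 2
print(decode_block(126, [0b0010, 0b0111]))  # same codes, scale of 0.5
```

The same sixteen 4-bit codes map to different real values in each block depending on the shared exponent, which is where the extra range over unscaled FP4 comes from.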
