
What is also lost: efficient 64-bit FP arithmetic, because your boxing format can't hold 64-bit floats as immediate values. You either stick with 32-bit floats or pay an extra dereference for every 64-bit float.

This is why I prefer the NaN-boxing approach, where a 64-bit register can hold 64-bit floats, 48-bit pointers, and 48-bit integers as immediate values. 48 bits are sufficient for pointers because few CPUs support more than 48-bit addressing.
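To make the layout concrete, here is a minimal Python sketch of one possible NaN-boxing scheme. The prefix, the 3-bit tag field, and the tag values (`TAG_POINTER`, `TAG_INT48`) are assumptions picked for illustration; real runtimes choose their own bit assignments.

```python
import math
import struct

# Assumed layout (one of several possible NaN-boxing schemes):
#   bits 63..51  all ones -> the word is a boxed, non-float value
#   bits 50..48  3-bit type tag (tag values here are hypothetical)
#   bits 47..0   48-bit payload (pointer or integer)
BOX_PREFIX = 0xFFF8_0000_0000_0000
PAYLOAD_MASK = (1 << 48) - 1
CANONICAL_NAN = 0x7FF8_0000_0000_0000  # the one NaN kept for real floats
TAG_POINTER = 1
TAG_INT48 = 2


def box_float(x):
    """Store a double as its own bit pattern; every NaN collapses to one."""
    if math.isnan(x):
        return CANONICAL_NAN
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def unbox_float(word):
    """Reinterpret the 64-bit word as a double."""
    return struct.unpack("<d", struct.pack("<Q", word))[0]


def box_tagged(tag, payload):
    """Hide a 48-bit payload inside the otherwise unused NaN space."""
    if not (1 <= tag <= 7 and 0 <= payload <= PAYLOAD_MASK):
        raise ValueError("tag or payload out of range")
    return BOX_PREFIX | (tag << 48) | payload


def is_float(word):
    """A word is a float unless its top 13 bits are all ones."""
    return (word & BOX_PREFIX) != BOX_PREFIX


def tag_of(word):
    return (word >> 48) & 0x7


def payload_of(word):
    return word & PAYLOAD_MASK
```

Every ordinary double, including both infinities, passes through unchanged, so float values never need a heap allocation in this scheme.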

When you need full 64-bit integers, chances are you need more than one of them. Have one of your NaN-box tags indicate that a memory address is a vector of 64-bit integers and load them into a vector register.



OCaml's floats are unboxed 64-bit double-precision IEEE-754 floating point numbers. So there.


According to https://v2.ocaml.org/manual/intfc.html#s%3Ac-value, a 64-bit word is either an integer or a pointer. Doubles are heap-allocated blocks reached through a pointer, with the block tagged Double_tag (or Double_array_tag for arrays of doubles).
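For illustration, the integer-or-pointer split can be emulated in Python. The function names mirror the `Val_long`, `Long_val`, and `Is_long` macros from OCaml's `caml/mlvalues.h`; the Python itself is only a sketch of the low-bit tagging idea, not runtime code, and the example address is made up.

```python
MASK64 = (1 << 64) - 1


def val_long(n):
    """Encode a 63-bit integer: shift left by one and set the low bit."""
    if not -(1 << 62) <= n < (1 << 62):
        raise OverflowError("does not fit in 63 bits")
    return ((n << 1) | 1) & MASK64


def is_long(word):
    """Low bit set -> immediate integer; clear -> pointer to a heap block."""
    return (word & 1) == 1


def long_val(word):
    """Decode: reinterpret as signed 64-bit, then arithmetic shift right."""
    signed = word - (1 << 64) if word >= (1 << 63) else word
    return signed >> 1


# Hypothetical word-aligned address: aligned pointers always have a clear low bit.
example_pointer = 0x7F00_0000_1000
```

Odd words are integers and even words are pointers, so no bit patterns remain for an immediate double.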

It should be obvious. A 64-bit word can't represent 63-bit integers, 63-bit pointers and 64-bit floats without overlapping/ambiguous representation.
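The counting argument can be checked with plain arithmetic. This is only a back-of-the-envelope sketch; the variable names are mine, and the NaN count follows from the IEEE-754 binary64 encoding (exponent all ones, non-zero mantissa, either sign).

```python
WORD_PATTERNS = 2 ** 64            # distinct values of a 64-bit word

# 63-bit ints + 63-bit pointers already use every pattern ...
int63_patterns = 2 ** 63
ptr63_patterns = 2 ** 63
# ... so adding all 64-bit floats would need twice what a word offers.
needed = int63_patterns + ptr63_patterns + 2 ** 64
shortfall = needed - WORD_PATTERNS

# NaN-boxing instead reuses the redundant NaN encodings of a double:
# a 52-bit non-zero mantissa, times two for the sign bit.
nan_patterns = 2 * (2 ** 52 - 1)
# Keep one pattern for a real NaN; the rest can carry 48-bit payloads.
spare_48bit_spaces = (nan_patterns - 1) // 2 ** 48
```

Under these assumptions `shortfall` is a full 2^64 patterns, while the NaN space has room for 31 distinct 48-bit payload spaces.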

You can represent 64-bit floats, 48-bit pointers and 48-bit integers (signed and unsigned) in a 64-bit word without any overlap though, and you can avoid the pointer dereference for FP operations, instead only needing to perform a check for NaN. Other types need unboxing which can be done in ~2 cycles, without any dereferencing.
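A rough sketch of that fast path, in Python for readability: the float case needs only the NaN-prefix check, and a 48-bit integer is recovered with two shifts. The prefix, the `TAG_INT48` value, and the `add` dispatch are hypothetical choices, and Python's unbounded integers are masked to imitate 64-bit registers.

```python
import math
import struct

BOX_PREFIX = 0xFFF8_0000_0000_0000  # assumed: top 13 bits set -> boxed value
CANONICAL_NAN = 0x7FF8_0000_0000_0000
TAG_INT48 = 2                        # hypothetical tag value
MASK64 = (1 << 64) - 1


def bits_of(x):
    """Double -> 64-bit word, collapsing every NaN to one canonical NaN."""
    if math.isnan(x):
        return CANONICAL_NAN
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def float_of(word):
    return struct.unpack("<d", struct.pack("<Q", word))[0]


def is_float(word):
    return (word & BOX_PREFIX) != BOX_PREFIX


def box_int48(n):
    if not -(1 << 47) <= n < (1 << 47):
        raise OverflowError("does not fit in 48 bits")
    return BOX_PREFIX | (TAG_INT48 << 48) | (n & ((1 << 48) - 1))


def unbox_int48(word):
    """Shift left 16 to drop prefix and tag, then arithmetic shift right 16."""
    shifted = (word << 16) & MASK64
    if shifted >= (1 << 63):         # reinterpret as a signed 64-bit value
        shifted -= 1 << 64
    return shifted >> 16             # sign-extends the 48-bit payload


def add(a, b):
    """Float fast path: one prefix check per operand, no dereference."""
    if is_float(a) and is_float(b):
        return bits_of(float_of(a) + float_of(b))
    if (a >> 48) == (b >> 48) == ((BOX_PREFIX >> 48) | TAG_INT48):
        return box_int48(unbox_int48(a) + unbox_int48(b))
    raise TypeError("unsupported operand tags")
```

On real hardware the two shifts in `unbox_int48` are single instructions, which is where the roughly two-cycle estimate comes from.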


    As an optimization, records whose fields all have static type float are represented as arrays of floating-point numbers, with tag Double_array_tag. (See the section below on arrays.)

    [...]

    Arrays of floating-point numbers (type float array) have a special, unboxed, more efficient representation. These arrays are represented by pointers to blocks with tag Double_array_tag.


So they're unboxed in certain situations, but otherwise boxed.

With NaN-boxing you would do this for Int64s. They would need boxing when you need a single value, but you can have one or more tags for vectors of them, in which case the elements can be stored unboxed.

My argument is that you would probably use FP64 more frequently than Int64. For most common integer operations, Int48 would suffice. In the cases where you need full Int64 (cryptography, serialization, etc.), you commonly need vectors of Int64 anyway.
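As a sketch of the vector idea: one tag marks the payload as the address of an unboxed Int64 vector. The `heap` dictionary, the toy allocator, and `TAG_INT64_VEC` are all stand-ins invented for illustration; a real runtime would use actual memory and its own tag assignment.

```python
from array import array

BOX_PREFIX = 0xFFF8_0000_0000_0000  # assumed: top 13 bits set -> boxed value
TAG_INT64_VEC = 3                    # hypothetical tag value
PAYLOAD_MASK = (1 << 48) - 1

# Stand-in for memory: fake 48-bit address -> unboxed int64 storage.
heap = {}


def alloc_int64_vec(values):
    """Store the elements unboxed and return a tagged NaN-box word."""
    addr = 0x1000 + 16 * len(heap)   # toy allocator, not a real address
    heap[addr] = array("q", values)  # 'q' = signed 64-bit elements
    return BOX_PREFIX | (TAG_INT64_VEC << 48) | addr


def int64_vec_of(word):
    """Check the tag once, then hand back the whole unboxed vector."""
    if (word >> 48) != ((BOX_PREFIX >> 48) | TAG_INT64_VEC):
        raise TypeError("not an int64 vector")
    return heap[word & PAYLOAD_MASK]
```

The tag check happens once per vector rather than once per element, so the per-value boxing cost is amortized across all the Int64s.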



