You encounter the same problem with zero argument functions. A function on a zer...

Joker_vD · on July 22, 2024

No, you just invoke the function and pass it zero arguments, that's it.

Sure, you can build your whole theoretical framework of computation with only the functions of exactly one argument, and then deal with tuples to fake multi-valued arguments/multiple return values — but you don't have to do that. You may as well start from the functions with arbitrary (natural) number of arguments/return values, it's not that hard.

shiandow · on July 22, 2024

Sure, and passing it zero arguments is exactly what it means to evaluate it on the single value of the unit set.

I mean surely we can agree that a pure function of 0 arguments picks out exactly 1 value, and that a function that accepts n different values (values not arguments) as input returns at most n different results? Why make an exception for n=0?

Your definition of a function of 0 arguments and that of a function over the unit set are identical. Or at least equivalent.

Joker_vD · on July 22, 2024

They're equivalent, but only up to whatever computational substrate one is actually using. You can build functions out of small-step operational semantics of, say, a simplistic imperative register machine with a stack. In this case, a function of 0 arguments and a function of 1 trivial unit argument are visibly different even though their total effect on the state is the same. After all, we're talking about theory of computation and so it better be able to handle computations as they are actually performed at the low level, too.

It's yet another example of "in theory, the theory and the practice are the same; in practice, they're different": I have written a toy functional language that compiles down to C, and unit-removal (e.g. transforming int*()*int into int*int, lowering "fun f() -> whatever = ..." into a "whatever f(void) {...}" etc.) is a genuine optimization. The same, I imagine, would apply to generating raw assembly: you want to special-case handling of unit so that passing it as an argument would not touch %rdi, and assigning a unit to a value should not write any registers, and "case unit_producing_function(...) of -> ... end" actually has no data dependency on the unit_producing_function etc.

lmm · on July 23, 2024

> In this case, a function of 0 arguments and a function of 1 trivial unit argument are visibly different even though their total effect on the state is the same. After all, we're talking about theory of computation and so it better be able to handle computations as they are actually performed at the low level, too.

You don't have to push anything for the unit values on the stack though, their representation doesn't take up any space. Just like if you're passing a big argument (like a large integer, or a struct passed by value) it might take multiple registers or a lot of space in your stack frame, there's no 1:1 correspondence between arguments and stack space.

> you want to special-case handling of unit so that passing it as an argument would not touch %rdi, and assigning a unit to a value should not write any registers

This shouldn't need to be a special case though. For each datatype you have a way of mapping it to/from some number of registers, and for the unit type that number of registers is 0 and that mapping is a no-op.

Joker_vD · on July 23, 2024

> their representation doesn't take up any space.

One of the way to represent them doesn't take up any space. But if you want to e.g. have a generic function of e.g. (T,U,V) -> V type (that is, with no monomorphization in your compiler) then either your units have to take space (otherwise the layout of (unit,int,bool) is different from (int,int,bool) and the same), or you'll have to pass around type descriptors when invoking generic functions. Which many, many implementors of static languages rather wouldn't do unless absolutely necessary.

lmm · on July 23, 2024

> if you want to e.g. have a generic function of e.g. (T,U,V) -> V type (that is, with no monomorphization in your compiler) then either your units have to take space (otherwise the layout of (unit,int,bool) is different from (int,int,bool) and the same), or you'll have to pass around type descriptors when invoking generic functions.

You have that problem already though surely, as T might be bigger than an int, or want to be passed in a floating-point register, or both.

shiandow · on July 22, 2024

How would one write a 0 bit value into registers?

Joker_vD · on July 22, 2024

Yes, that's the problem you face when you're dealing strictly with functions of a single argument. Still, there are two options: first, it's arguably is already written into the register so you don't need to do anything.

Alternatively, you may instead represent () as a full, 64-bit wide machine word and then map every 64-bit value to mean () so, again, you don't actually need to write anything: all registers contain a valid representation of () at all times. This is similar to how we usually represent booleans: 0 is mapped to mean False, and everything else is mapped to mean True, although in this case we sometimes do need to rematerialize some definite value into the register of choice.

In any case, it's mostly just a matter of correctly writing the constant materializer; but if you adopt multi-argument/multi-valued functions you simply never encounter this problem:

    for arg, place in zip(arg_list, arg_places):
        load(arg, place)
    invoke(fun, kont)

    for val, place in zip(ret_values, ret_places):
        load(val, place)
    kontinue(kont)

Notice how degenerate loops simply disappear with no additional handling.

lmm · on July 23, 2024

> if you adopt multi-argument/multi-valued functions you simply never encounter this problem

Sure. If you have multiple return then unit values become a lot less important because you can just have a function that returns 0 values. But most languages, especially C-like languages, do not have multiple return.