A RISC CPU in Excel [video] (youtube.com)
4 points by linkdd 10 months ago | 2 comments


He designs a stack-based ISA with ten instructions.

I've previously written [1] about reducing RISC-V RV32I down to 10 instructions: `addi`, `add`, `nand`, `sll`, `sra`, `jal`, `jalr`, `blt`, `lw`, `sw` [2]. Except for subword stores, each of the missing instructions can be emulated with at most 4 instructions.
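As a rough illustration of the "at most 4 instructions" bound for the bitwise operations, here is a small C model. It is only a sketch: `nand32` and the `*_via_nand` helpers are hypothetical names standing in for the `nand` instruction, and these are the textbook NAND constructions, not necessarily the exact sequences from the linked post.

```c
#include <stdint.h>

/* Models the one bitwise instruction kept in the reduced subset. */
static uint32_t nand32(uint32_t a, uint32_t b) { return ~(a & b); }

/* not: 1 nand */
static uint32_t not_via_nand(uint32_t a) { return nand32(a, a); }

/* and: 2 nands (nand, then invert) */
static uint32_t and_via_nand(uint32_t a, uint32_t b) {
    uint32_t t = nand32(a, b);
    return nand32(t, t);
}

/* or: 3 nands (De Morgan: a | b == ~(~a & ~b)) */
static uint32_t or_via_nand(uint32_t a, uint32_t b) {
    uint32_t na = nand32(a, a);
    uint32_t nb = nand32(b, b);
    return nand32(na, nb);
}

/* xor: 4 nands, sharing the intermediate t = ~(a & b) */
static uint32_t xor_via_nand(uint32_t a, uint32_t b) {
    uint32_t t = nand32(a, b);
    uint32_t u = nand32(a, t);  /* ~(a & ~b) */
    uint32_t v = nand32(b, t);  /* ~(b & ~a) */
    return nand32(u, v);        /* (a & ~b) | (b & ~a) */
}
```

In this model `xor` is the worst case at 4 operations, which is consistent with the bound quoted above.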

I implemented the `countPrimes` function in my own benchmark [3] using just this subset [4] and found a 28% code size expansion but only a 3% speed penalty.

[1] https://new.reddit.com/r/RISCV/comments/w0iufg/how_much_coul...

[2] or 11 if you keep a strict subset by including `and` and `xor` instead of `nand`

[3] https://hoult.org/primes.txt

[4] https://hoult.org/primes.S


Note: emulating "store byte" needs 17 instructions in the general case. It can be shortened if you know in advance which byte in the word will be replaced.

    // Implement store byte using only addi, add, nand, sll, sra, jal, jalr, blt, lw, sw
    // void store_byte(void *p, char val);
    
            .macro nand dst,src1,src2
            and \dst,\src1,\src2
            xori \dst,\dst,-1
            .endm
    
            .globl store_byte
    store_byte:
            // preparation, can be shortened if e.g. byte offset is known
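            addi t0,x0,3  // t0 = 3: low-bits mask, also reused as shift amount
            addi t1,x0,-1 // t1 = all ones, so "nand x,x,t1" computes ~x
            nand t2,a0,t0 // ~(p & 3)
            nand t2,t2,t1 // byte offset
            sll  t2,t2,t0 // bit offset
            nand t0,t0,t1 // ~3
            nand a0,a0,t0 // ~(p & ~3)
            nand a0,a0,t1 // word ptr
            addi t0,x0,255
            sll  t0,t0,t2 // mask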
    
            // the actual work
            sll  a1,a1,t2 // shifted val
            nand a1,a1,t0 // ~val
            nand t0,t0,t1 // ~mask
            lw   a3,(a0)
            nand a3,a3,t0 // ~word, dst field all 1s
            nand a3,a3,a1 // insert val
            sw   a3,(a0)
    
            jalr x0,(x1)

Having access to the full RV32I instruction set, minus `sb`, shortens the preparation from 10 instructions to 5, but the actual work stays at 7 instructions.
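For anyone who wants to check the 17-instruction sequence without a RISC-V toolchain, here is a C model that mirrors it one statement per instruction. It is a sketch under assumptions: `store_byte_model`, `nand32`, and the `mem` array (standing in for little-endian word-addressed memory, indexed by byte address / 4) are made-up names for illustration, and the local variables just borrow the register names.

```c
#include <stdint.h>

/* Models the nand instruction. */
static uint32_t nand32(uint32_t a, uint32_t b) { return ~(a & b); }

/* addr is a byte address into mem; val is the byte to store
   (upper bits of val may hold anything, e.g. sign extension). */
static void store_byte_model(uint32_t *mem, uint32_t addr, uint32_t val) {
    uint32_t a0 = addr, a1 = val, a3, t0, t1, t2;

    /* preparation (10 instructions) */
    t0 = 3;
    t1 = 0xFFFFFFFFu;          /* addi t1,x0,-1 */
    t2 = nand32(a0, t0);
    t2 = nand32(t2, t1);       /* byte offset */
    t2 = t2 << t0;             /* bit offset: 0, 8, 16 or 24 */
    t0 = nand32(t0, t1);       /* ~3 */
    a0 = nand32(a0, t0);
    a0 = nand32(a0, t1);       /* word ptr */
    t0 = 255;
    t0 = t0 << t2;             /* mask */

    /* the actual work (7 instructions) */
    a1 = a1 << t2;             /* shifted val */
    a1 = nand32(a1, t0);       /* ~val in the field, 1s elsewhere */
    t0 = nand32(t0, t1);       /* ~mask */
    a3 = mem[a0 / 4];          /* lw */
    a3 = nand32(a3, t0);       /* ~word, dst field all 1s */
    a3 = nand32(a3, a1);       /* insert val */
    mem[a0 / 4] = a3;          /* sw */
}
```

For example, with `mem[0] = 0x11223344`, calling `store_byte_model(mem, 1, 0x5A)` leaves `mem[0] == 0x11225A44`, the same result a little-endian `sb` to byte address 1 would give.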



