Small ones anyway. W80 [1] W54 [2]. Most of them are over 500 lbs.
High-altitude weather balloons, on the other hand, can carry up to 8000 lbs, like the one from China that the US let meander coast to coast over several military installations. That one could carry a B53 nuke after small modifications. Detonated at high altitude, it could shut down the power grid coast to coast.
Under uClinux, executables can be position independent or not. They can run from flash or RAM. They can be compressed (if they run in RAM). Shared libraries are supported on some platforms. All in all it's a really good environment and the vfork() limitation generally isn't too bad.
I spent close to ten years working closely with uClinux (a long time ago). I implemented the shared library support for the m68k. Last I looked, gcc still included my additions for this. This allowed execute-in-place for both executables and shared libraries -- a real space saver. Another guy on the team managed to squeeze the Linux kernel, a reasonable user space and a full IPsec implementation into a unit with 1 MB of flash and 4 MB of RAM, which was pretty amazing at the time (we didn't think it was even possible). Better still, from power on to login prompt was well under two seconds.
Microchip is always more expensive than the Chinese stuff, but Microchip's contributions to Linux are mainline (!!!!), and that is often worth the extra few $$$$.
Fully open hardware, with mainline Linux open source drivers. It's hard to beat the SAM9x60 in openness, documentation and overall usability. Its specs are weaker, but keeping up with mainline Linux is very, very relevant, especially in this discussion.
You are correct in that the decNumber library doesn't support trig operations. It covers arithmetic, log, exp and square root, plus rounding and some conversion functions.
My experience is that decNumber is generally slow. Logarithms are especially slow. Intel's decimal library is much faster and, as you noted, it uses binary operations to start its algorithms.
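To make that operation set concrete, here is a rough sketch using Python's `decimal` module. This is not decNumber itself, only a stand-in: Python's module follows the same General Decimal Arithmetic specification and has a similar shape (arithmetic, log, exp, square root, rounding, but no trig). The precision and the sample values are my own choices for illustration.

```python
from decimal import Decimal, getcontext, ROUND_HALF_EVEN

# Assumption: 34 digits, chosen to match a decimal128-sized coefficient.
getcontext().prec = 34

x = Decimal("2")

root = x.sqrt()    # square root
log_x = x.ln()     # natural logarithm
exp_x = x.exp()    # e ** x

# Rounding to a fixed number of decimal places.
third = (x / 3).quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

# There are no trig methods on Decimal; those have to be built
# from the arithmetic primitives (for example with a series).
has_sin = hasattr(x, "sin")

print(root, log_x, exp_x, third, has_sin)
```

Anything trigonometric has to be layered on top, which is the same situation as with decNumber.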
That's a very clever approach — I hadn't even thought of factoring the target value like that.
Decomposing `0x08C0C166` into `2 × 3 × 5² × (2³ × 5³ × 11 × 89 + 1)` and reusing parts like `11 = 10 + 1` and `89 = 10² - 11` is genuinely interesting.
Still, as you said, packing all the necessary manipulations into just 17 instructions is the real challenge — especially when you try to avoid any immediate constants, memory access, or stack usage.
If you do find a shorter sequence that matches the constraints exactly, please share! I’d love to see how far this can be optimized.
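For anyone who wants to sanity-check the factoring, the constant can be rebuilt from the pieces in a few lines of Python. This only checks the arithmetic, not the instruction sequence, and the variable names are mine. The grouping that works is `150 × (1000 × 11 × 89 + 1)`; the flat product `2⁴ × 3 × 5⁵ × 11 × 89` is 146850000, which falls 150 short of the target, so the `+ 1` has to sit inside the parentheses.

```python
TARGET = 0x08C0C166  # 146850150 in decimal

# Building blocks, all derived from 10 as described in the thread.
ten = 10
eleven = ten + 1                 # 11 = 10 + 1
eighty_nine = ten**2 - eleven    # 89 = 10^2 - 11

# Inner term: 2^3 * 5^3 * 11 * 89 + 1 = 1000 * 979 + 1
inner = ten**3 * eleven * eighty_nine + 1

# Outer factor: 2 * 3 * 5^2 = 150
value = 150 * inner

# Flat product without the "+ 1", for comparison.
flat = 2**4 * 3 * 5**5 * 11 * 89

print(hex(value), value == TARGET, TARGET - flat)
```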
Yes, you're absolutely right — the initial `xor cl, cl` is technically redundant if we assume all registers are zeroed at start, as stated in the problem.
I kept it in the solution mostly out of habit and to make the logic more explicit, but you're correct that it could be removed, bringing the count down to 16.
That said, for consistency (and because some AI models needed it to understand the logic flow), I still include it when comparing instruction count across different versions.
But you're totally right: under the problem's assumptions, `xor cl, cl` is free.