
HAOS is absolutely terrible when it faces filesystem corruption. I faced this a few times when my HAOS image crashed. You can't even get it to fsck on boot when it happens. I think that's one big reason people on Raspberry Pis have issues: the FS is on an SD card, SD cards are flaky, and there's no good rescue path when the FS has issues.

I guess USB power has also historically not been great on the Raspberry Pi. I haven't played with one in a few years, but I remember needing powered hubs. That might explain issues with Zigbee and Z-Wave dongles. Note also that the '700 series' Z-Wave dongles have a lot of issues of their own. A firmware update fixes some of them, but mine has still been flaky, and I'm on the latest firmware from ~2 weeks ago, which is supposed to fix all of that.

I was running HAOS on VirtualBox, with the disk image on ZFS. I switched to running Docker out of the same ZFS filesystem, and it's much faster and more reliable. Notably, I no longer get random filesystem corruption. This is an anecdote; YMMV.



> HAOS is absolutely terrible when it faces filesystem corruption. I faced this a few times when my HAOS image crashed. You can't even get it to fsck on boot when it happens.

I don't see this as being unique to HAOS, if a filesystem is corrupted in a certain way then _no OS_ is going to be able to boot.

There are distros specifically optimized for the Pi (with, e.g., specific logging choices [1]) that try to avoid these problems, at the cost of some trade-offs.

I've used something before (I can't remember or find it now, but I think it may have been built with Yocto) that had three partitions: one read-write partition for persisting user data, and two for the OS and apps. One of those two would be live and mounted read-only, while the other received the next system update, using a blue-green deploy strategy. I think it also kept log and temp files in RAM.
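
To make the two-slot idea concrete, here is a toy Python model of that kind of blue-green layout. It is only a sketch: the class names, the "A"/"B" slot labels, and the fallback rule are my own assumptions for illustration, not how Yocto or any particular distro implements it.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """One of the two OS/app partitions (hypothetical model)."""
    version: str
    healthy: bool = True


class ABUpdater:
    """Toy model of a blue-green (A/B) partition layout."""

    def __init__(self, version: str) -> None:
        self.slots = {"A": Slot(version), "B": Slot(version)}
        self.active = "A"  # the live slot, mounted read-only

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage_update(self, version: str) -> str:
        """Write the new OS into the inactive slot; the live slot is untouched."""
        name = self.inactive()
        self.slots[name] = Slot(version)
        return name

    def switch(self) -> None:
        """Select the freshly written slot for the next boot."""
        self.active = self.inactive()

    def report_boot(self, ok: bool) -> None:
        """If the new slot fails to boot, fall back to the previous one."""
        if not ok:
            self.slots[self.active].healthy = False
            self.active = self.inactive()


updater = ABUpdater("1.0")
staged = updater.stage_update("2.0")
updater.switch()
updater.report_boot(ok=False)  # simulate an update that will not boot
print(staged, updater.active, updater.slots[updater.active].version)
```

The point of the design is that a bad write can only ever damage the slot that is not running, so there is always a known-good system to fall back to.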

[1] https://dietpi.com/docs/software/log_system/


> I don't see this as being unique to HAOS, if a filesystem is corrupted in a certain way then _no OS_ is going to be able to boot.

Theoretically true if you are very unlucky. However, it's not true at all for the vast majority of filesystem failures.

I am talking about failures where, once I inspect the disk image outside the VM and fsck it, everything is fine.
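
For anyone wanting to try the same thing, here is a rough Python sketch that only builds the commands (it does not run them). It assumes a raw disk image (a VirtualBox VDI would need converting first), an ext2/3/4 partition, and a free loop device; the `haos.img` name, the partition number, and the `fsck_commands` helper are placeholders of mine.

```python
import shlex


def fsck_commands(image_path, partition, loop_device="/dev/loop0", repair=False):
    """Build the commands to check one partition of a raw disk image.

    Assumes the partition holds an ext2/3/4 filesystem and that
    loop_device is free; both are assumptions of this sketch.
    """
    if partition < 1:
        raise ValueError("partition numbers start at 1")
    part_dev = f"{loop_device}p{partition}"
    # -n: open read-only and answer "no" to every prompt
    # -p: automatically repair what is safe to repair
    mode = "-p" if repair else "-n"
    return [
        ["losetup", "--partscan", loop_device, image_path],  # attach image
        ["e2fsck", "-f", mode, part_dev],                    # forced check
        ["losetup", "--detach", loop_device],                # release device
    ]


# Print the commands for a dry run; "haos.img" is a placeholder name.
for cmd in fsck_commands("haos.img", partition=1):
    print(shlex.join(cmd))
```

Running with the default read-only mode first, and only then with `repair=True`, keeps the check from making things worse on an image you have not backed up.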

Also, given that HA leans so heavily on Docker, it would be a reasonable feature for it to rebuild Docker images if disk corruption damages them.

Traditional hard disk failures are common, and SD card failures are really common too. If you've worked on a sufficiently popular mobile app, you've probably seen tons of them. It's reasonable to plan for this at the application layer, to say nothing of the OS layer. There's a very good reason why Unix traditionally ships with fsck tools and sometimes runs them at boot. Whoever designed HAOS not to do this made a big mistake, and I would hit this a few times a year in my old setup.
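
As one example of what planning for it at the application layer can look like, here is a minimal Python sketch: write state to a temp file, fsync it, rename it into place, and store a checksum so a damaged file is detected on load instead of being trusted. The function names and file format are made up for illustration; this is not what HA or any specific app does.

```python
import hashlib
import json
import os
import tempfile


def save_state(path, state):
    """Write state atomically, with a checksum, so a crash mid-write
    leaves either the old file or the new one, never a torn mix."""
    payload = json.dumps(state, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    record = json.dumps({"sha256": digest, "payload": payload})
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)  # same filesystem as path
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(record.encode("utf-8"))
            f.flush()
            os.fsync(f.fileno())  # push the data to the device before renaming
        os.replace(tmp_path, path)  # atomic swap on the same filesystem
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def load_state(path, default=None):
    """Return the saved state, or default if the file is missing or damaged."""
    try:
        with open(path, "rb") as f:
            record = json.loads(f.read().decode("utf-8"))
        payload = record["payload"]
        if hashlib.sha256(payload.encode("utf-8")).hexdigest() != record["sha256"]:
            return default  # checksum mismatch: treat the file as corrupt
        return json.loads(payload)
    except (OSError, ValueError, KeyError, TypeError, AttributeError):
        return default


with tempfile.TemporaryDirectory() as tmp:
    state_file = os.path.join(tmp, "state.json")
    save_state(state_file, {"lights": "on"})
    print(load_state(state_file))
    with open(state_file, "r+b") as f:  # simulate a damaged first byte
        f.write(b"X")
    print(load_state(state_file, default={}))
```

This does not prevent the card from failing, but it turns silent corruption into a detectable condition the app can recover from, for instance by falling back to defaults or a backup copy.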


I’m not familiar with VirtualBox. Do you have any idea why HAOS on VirtualBox+ZFS was getting corrupted when Docker+ZFS wasn't?


I'm not sure. I had some kernel panics that I didn't dig into deeply, and when HAOS subsequently failed to recover, I would sometimes fsck the image outside the VM.



