I have been using Linux since 1999. I have seen lots of kernel panics. But recently far fewer, unfortunately replaced by more problems in the platform and in userspace.
I know Linux works more reliably for some people and less reliably for others. It probably has a lot to do with what you do with it: what kind of hardware you are running it on, and whether you just install it and use it as is, or are the kind of person, like me, who likes to change everything to their liking.
I also tend not to reinstall my machines. For about 15 years my daily driver was a single Debian unstable installation, continuously updated, until I faced too many problems and had to replace it completely. I would have fixed it all, but I just did not have the time and I needed it working.
My experience is that Linux is rock solid as long as you're not running it on super duper expensive hardware and doing crazy-big things on it.
In no particular order, the notable kernel panic causes in my career so far were:
- When a Spark job finished and deallocated close to a TB of memory: kernel panic. Jobs using below 750GB typically didn't hit this, so it was something in that range. It just kind of stopped happening after we updated the kernel and Spark in a semi-unrelated push, so we never really got a root cause.
- bad hardware
- A Spark job doing simply insane amounts of shuffle output (which goes to disk) was hitting kernel panics. That turned out to be a kernel bug that only affected applications with ridiculously high disk I/O, with some additional spin that made me think "ah, so this basically only affects Spark jobs".
- bad hardware
Did I mention bad hardware? I've spent way too much time hunting down "bugs" that ended up just being a bad mobo, and Linux was kind enough to inform me of it. But "this is the only program that causes the kernel panics!", and yet when we moved it to a temp server for a few days the program mysteriously stopped crashing. Another reason I do like "the cloud": I can just cycle out an EC2 box I suspect is bad instead of fighting with the IT guy about whether the two-year-old expensive server is already busted or not.
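For what it's worth, when I suspect hardware these days I do a quick sanity check along these lines before blaming the program (standard tools; the exact log strings vary by kernel and vendor):

    # Scan kernel messages for machine-check / ECC / I/O errors
    dmesg --level=err,crit,alert,emerg | grep -iE 'mce|machine check|ecc|i/o error'

    # Same idea for the previous boot (needs a persistent journal)
    journalctl -k -b -1 -p err

    # SMART health summary for a suspect disk (from smartmontools)
    sudo smartctl -H /dev/sda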
In my experience from my IT consulting days, a corrupt registry hive, a corrupt or missing OS file, or bad drivers are the most common causes of a Windows BSOD. Bad hardware is actually rarer.
I've seen too many systems which started to work fine after replacing a PSU.
As someone who worked L1 and L2 support: the major reason for BSODs is faulty hardware.
My favourite story on this topic: after ~4 human-hours of diagnostics by an L1 tech, I came to the client site, confirmed the BSOD, opened the case, straightened the SATA cable, and the OS installed successfully.
EDIT: another one was a cheap PSU cutting the power too fast on shutdown, so the HDD never wrote the "clean shutdown" flag to disk, triggering scandisk on startup. Fixed with a good PSU, BTW.
I thought the same, till I bought a Surface (first version)... BOY, that thing was unstable. It was the last chance I gave Microsoft. After that I switched to Mac. Not coming back anytime soon.
I've worked and developed on Linux, for Linux, for 10+ years, and I've seen my fair share of panics, especially on bleeding-edge releases. Most (not all!) of them were of my own making though. :>
Yeah I've been using Linux exclusively for maybe 9 years and the only time I've ever seen a kernel panic was when I was messing around with Gentoo on a cheap machine I have just for that sort of screwing around, and I accidentally told it to literally overwrite the kernel. It got pretty far giving itself a lobotomy before it died, too.
Meanwhile, the last time I used Windows (in order to install Linux on a new laptop, lol), it blue screened four times just trying to mount a simple USB flash drive.
> it blue screened four times just trying to mount a simple USB flash drive
With Linux these problems can typically be solved by googling on your phone, then appending some nonsense string you found on a 10-year-old forum post to a text file. I'm still holding my breath for Windows to catch up with that level of UX.
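To be fair, the ritual usually looks something like this (the option shown is the classic fix for audio popping with the snd_hda_intel driver; your nonsense string, and the config file name, will vary):

    # Append the incantation from the forum post to a modprobe config
    echo 'options snd_hda_intel power_save=0' | sudo tee -a /etc/modprobe.d/local-fixes.conf

    # Reload the module so the option takes effect (or just reboot)
    sudo modprobe -r snd_hda_intel && sudo modprobe snd_hda_intel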
The basic difference, and the reason I have stuck with Linux for so long, is that when there is a problem with Linux, it is all about how long I am willing to persevere in fixing it. All the code is there, and I have the skills to fix it.
With Windows, if it doesn't work, there is a chance there is simply nothing you can do about it. There is no source code. The support people are completely useless. If you can't fiddle with it until it somehow works, or find a person on the Internet who fiddled with it until it worked and was gracious enough to share the solution, your only option tends to be reinstalling the entire thing and hoping for the best.
So true lmao. One of the things I noticed after moving from macOS to Linux is that, yes, sometimes things don't work perfectly on Linux, and sometimes the Linux user experience is more awkward or arcane, but you can always figure out how to fix it or get it to work the way you want with an hour of Googling tops, plus a little simple modification of configuration files or a few terminal commands. There's no point at which something is so far gone that you can't just fix it yourself if you want to, so the choice is always there; it's just a matter of what's worth the effort for you. Meanwhile, the last time I used macOS for a couple of years, I couldn't get it to consistently connect to external monitors correctly, and it was an endless pain in the ass with nothing I could do about it.
I've been using GNU/Linux exclusively since 2012, and while I've seen fewer kernel panics than blue screens, I have seen some, usually due to the Intel graphics driver (of all things). I don't recall ever having one caused by AMDGPU, but that may just be a lucky coincidence. Every OS has problems, though; my favourite was OS X Yosemite hard rebooting whenever I ran a Xubuntu VM in VirtualBox.