Run updates without me having to worry that “whoops, an update was fucked, and the system is now unbootable. Enjoy the next 6 hours of begging on forums for someone to help you figure out what happened, before being told that the easiest solution is to just wipe your drive and do a fresh install, while you get berated by strangers for not having the entirety of the Linux kernel source code committed to memory.”
Just to provide another data point: I’ve had bad Windows updates render my machine unbootable too.
And then you’re left searching for bullshit error messages and potentially unable to fix the problem regardless of your level of expertise.
I’m sorry, something went wrong. Here is all the information we can give you about it: “:(”
… No, you just use Windows’ built-in rollback feature. Which I think even auto-recovers these days if it detects a failure to boot after an update.
Hah! Can someone here chime in and tell me when the slow AF (as in, it can take hours) rollback feature actually worked‽
Who TF is that patient‽ You can reinstall Windows and all your apps in half the time required.
As someone who has hundreds of installed programs with tweaks on top of tweaks and hundreds of thousands of files, I always find the suggestion to “just reinstall” beyond laughable.
I think it recovered my PC for me twice, and it took about ~10 minutes each time at most. Good luck reinstalling everything in that time lol.
Windows recovery fails in plenty of circumstances; it’s not a magic bullet. Snapshots like the ones you can take with btrfs come closer, but that’s not exactly how Windows recovery works.
Of course not, but it works 9/10 times for most people. Enough so that most people never have to deal with a faulty Windows update.
Sure, if it fails completely it will, but it doesn’t catch everything. Here’s a related story I have:
At work we had a bunch of Lenovo X1 Carbons running Windows whose USB-C ports would die seemingly at random, which was a big problem since that’s also the charging port. There never seemed to be any common root cause connecting the incidents, and Lenovo’s support wasn’t any help. Our entire company is remote, but luckily we had onsite support, so for a while they would just come by and replace the whole motherboard each time.
Finally one day while scheduling a repair the support guy I was talking to just said, “Oh I’ve seen this before. It’s just a bad update and resetting the CMOS battery by putting a paper clip in this hidden hole fixes it.” We had the user try it out and the ports worked fine again. Apparently they had run some Windows updates that failed silently and were causing the hardware issues.
From then on, any time a user has had a hardware issue we can’t figure out, we just have them try the reset, and it has worked every time. This only happens probably 3-4 times a year, but we have fewer than 40 of these machines, so it’s not an insignificant amount.
sfc /scannow didn’t work? Well too bad, cuz now you gotta reinstall your OS
And Microsoft support that’s in fact clueless fanboys.
Spoken like someone who doesn’t do stable releases
I had to literally give up on a Windows install that worked itself into an update hole: run the update, can’t log in, undo the update, it tries to update again at night. Endless cycle, no possible fix.
I don’t want to berate you, but just know with enough practice, you’ll be able to fix that Linux install. Windows won’t let you fix it.
Last time that happened to me was 20 years ago. Am I lucky, or is this still common?
Happened to me last year. I never fully found the root cause, but suspect NVIDIA drivers may have been an issue. I actually re-partitioned the HDD and put another Ubuntu on it to try to fix things. That one booted, but I couldn’t un-fuck my old install.
I’ve had this happen to me at least once on every distro I’ve tried to use long term (longer than let’s say a month or two). Most recently was about this time last year. Luckily it was on my second computer, and I was still maintaining a full Windows install on my primary gaming system, so I didn’t really lose anything. Just reinstalled Windows on the second computer and tossed it in the closet until I decide what to do with it, and switched back to using the other system for all tasks instead of just gaming.
Conversely, all of the non-desktop systems that run some form of Linux (my NAS running TrueNAS, my other NAS running Unraid, multiple mini file/web servers, similar systems) are all rock solid. The only one that gets borked regularly is the little system I use for testing out random shit (mostly Docker stuff) before installing on one of the other systems.
It’s about that time of the year where I take a trip around all of the major distros that I’ve run over the years and see what they look like, and if they have any new features that will compel me to try them out again. Probably start with Garuda, since I really did like their distro last time I tried it out. Maybe I’ll intentionally break the system and see how much of a pain in the ass, or not, the default btrfs/snapshot setup they use is.
Only if you aren’t on a stable branch.
Even in the most stable distros I’ve had this issue. We had a RHEL 9 server acting as a Grafana kiosk and it failed after an update. Something dbus related. I’d love to know why, as it’s been the only failure we ever had, but nonetheless it shakes confidence. Windows 11 updates trashed three servers, one to the point we had to fly an engineer out. My hope is that immutable distros fix this.
You might be suffering from the opposite of survivorship bias: When you work in IT you end up having to fix the strangest shit that reoccurs on certain categories of hardware.
I know for a fact that RHEL 7 just did not like certain appliances by vendors that used it (back in the day). They would regularly break themselves until the vendor put out an update that switched it to a Debian-based custom thing.
Also, all the (thousands of) appliances that use Windows are utter shit so it’s not really a high bar. The vendor just needs to hire people that actually know what they’re doing and if they do they won’t use Windows on an appliance!
Timeshift
My thoughts exactly
That’s why I make a btrfs snapshot of my system before every upgrade. Rolling back from a rescue image takes only a minute.
Edit: automatically via the upgrade script
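For anyone curious what "automatically via the upgrade script" could look like, here is a rough sketch in Python, not the actual script. It assumes the root filesystem is a btrfs subvolume, that a `/.snapshots` directory exists on the same filesystem, and that it runs as root. The function names, the `pre-upgrade-` naming scheme, and the default paths are all made up for illustration; the only real CLI call it relies on is `btrfs subvolume snapshot -r`, and the upgrade command is whatever your distro uses.

```python
import subprocess
from datetime import datetime
from pathlib import PurePosixPath


def snapshot_command(source: str, snapshot_dir: str, now: datetime) -> list[str]:
    """Build the command for a read-only btrfs snapshot named after the timestamp."""
    # Hypothetical naming scheme: pre-upgrade-YYYYMMDD-HHMMSS
    name = now.strftime("pre-upgrade-%Y%m%d-%H%M%S")
    dest = str(PurePosixPath(snapshot_dir) / name)
    return ["btrfs", "subvolume", "snapshot", "-r", source, dest]


def upgrade_with_snapshot(upgrade_cmd, source="/", snapshot_dir="/.snapshots",
                          run=subprocess.run):
    """Take the snapshot first; start the upgrade only if the snapshot succeeded."""
    # check=True makes a failed snapshot raise, so the upgrade never starts
    # without a rollback point.
    run(snapshot_command(source, snapshot_dir, datetime.now()), check=True)
    run(upgrade_cmd, check=True)
```

You would call it with your distro's own upgrade command as a list, e.g. `upgrade_with_snapshot(["your-package-manager", "upgrade"])` with the placeholder swapped for the real thing. The `run` parameter is only there so the logic can be dry-run without touching the system.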
What a great idea! They should automate something like that! Maybe they could call it System Restore?
I never claimed to have invented the technique.
They’re just pointing out that Windows does this too.
Moving to ublue/Silverblue has really been a treat for avoiding this. Oh, an update borked my system? Time to boot into the last update and wait on that one. I personally really want to get CI/CD running next for my updates to make sure my specific build and collection of software just works the way I want it to.
Amen Brother, my experience the last 20+ years
after 20 years, but usually much sooner, people usually learn to either:
maybe it’s just not your cup of tea?
I have an uncle who will assume anything that takes over 20 minutes has crashed so managed to break his Windows box by continually hard resetting as it was trying to apply a large upgrade.
I’ve actually had more issues with Windows doing that. My wifi drivers have stopped working on more than one occasion, and once it just decided to stop recognizing my wife’s hard drive.