Religion and Ethics Forum
General Category => Politics & Current Affairs => Topic started by: Nearly Sane on July 19, 2024, 08:14:25 AM
-
Ooh err...
Good luck everyone
https://www.bbc.co.uk/news/live/cnk4jdwp49et
-
I have just seen that on the BBC news channel. :o
-
Looks like Crowdstrike is the issue
-
Looks like Crowdstrike is the issue
Yes, a company that really doesn't want to be in the news.
https://mashable.com/article/crowdstrike-microsoft-outage-windows-blue-screen-explained
-
Hope they have a robust blackout/recovery process!
-
Hope they have a robust blackout/recovery process!
They are part of the infrastructure in many large companies; the recovery process is also about those processes.
-
By "they" I meant all of those as well!
-
When it was mentioned earlier this morning that the govt was working on the support for this, I have to note that I felt relieved it was this one and not the last one.
-
When it was mentioned earlier this morning that the govt was working on the support for this, I have to note that I felt relieved it was this one and not the last one.
Not sure what support the government could have offered here. It'll be down to Crowdstrike to push an update and individual IT departments etc. to recover servers and PCs. Small businesses would use whatever support Crowdstrike offers, or Google/YouTube, to resolve it.
-
Not sure what support the government could have offered here. It'll be down to Crowdstrike to push an update and individual IT departments etc. to recover servers and PCs. Small businesses would use whatever support Crowdstrike offers, or Google/YouTube, to resolve it.
There's the impact on govt depts and agencies, e.g. NHS, and communication about the problems, and areas like banking payments where impacts could involve govt decisions.
-
I was actually a little surprised at how many relatively simple devices seem to be using Windows at all. I mean, it's a massive overkill for a point of sale terminal, for example, and unnecessary complexity leads to unreliability.
About a decade ago, I know that one company (probably shouldn't mention which) was busy porting a prototype self-driving car system (obviously far more complicated than a POS terminal) from Windows to Linux, so it could eventually be cut down further.
I suppose standard Windows is cheaper to develop on, and computing power is relatively cheap too (compared to then), but we've now seen the consequences...
-
Have any of you been to the supermarket since this problem began, and were you able to pay by card?
I do my weekly Tesco supermarket shopping at about 7am on a Monday. When I go tomorrow I hope I can use my debit card.
-
I do my weekly Tesco supermarket shopping at about 7am on a Monday. When I go tomorrow I hope I can use my debit card.
According to the BBC (https://www.bbc.co.uk/news/live/cnk4jdwp49et?post=asset%3A1e50357a-c2e3-430b-9075-c04b54388d29#post), Tesco was operating normally on Friday, so clearly weren't affected much, if at all. Some other supermarkets had problems, but it looks like all the bigger ones have recovered.
-
According to the BBC (https://www.bbc.co.uk/news/live/cnk4jdwp49et?post=asset%3A1e50357a-c2e3-430b-9075-c04b54388d29#post), Tesco was operating normally on Friday, so clearly weren't affected much, if at all. Some other supermarkets had problems, but it looks like all the bigger ones have recovered.
Thanks, Stranger.
-
I was actually a little surprised at how many relatively simple devices seem to be using Windows at all. I mean, it's a massive overkill for a point of sale terminal, for example, and unnecessary complexity leads to unreliability.
About a decade ago, I know that one company (probably shouldn't mention which) was busy porting a prototype self-driving car system (obviously far more complicated than a POS terminal) from Windows to Linux, so it could eventually be cut down further.
I suppose standard Windows is cheaper to develop on, and computing power is relatively cheap too (compared to then), but we've now seen the consequences...
Windows is actually pretty reliable these days. In fact, a lot of its poor reputation stems from the old Windows 95 line, which wasn't very stable, or from buggy third-party drivers. The question is (as long as you have powerful enough hardware): do you target an operating system used by billions with the resources of Microsoft behind it, or the offering of some smaller company that can't respond to bugs as quickly?
-
Windows is actually pretty reliable these days. In fact, a lot of its poor reputation stems from the old Windows 95 line, which wasn't very stable, or from buggy third-party drivers. The question is (as long as you have powerful enough hardware): do you target an operating system used by billions with the resources of Microsoft behind it, or the offering of some smaller company that can't respond to bugs as quickly?
Yeah, I'm well aware of how much better it is now than it was. Hell, I've been using PCs since they ran MS-DOS. Before that, the first time I had a computer on my desk at work, it ran CP/M.
I don't really disagree, it's just that it's still primarily designed as a desktop OS, and even my brand new Windows 11 laptop had a 'Blue Screen of Death' while I was setting it up, and I wasn't even doing anything non-standard on it (apart from tweaking the disk encryption in a way documented by Microsoft themselves, and it wasn't at that stage that it happened). It recovered with a single reboot, but nevertheless...
I'm also aware of how you can make software very much more reliable using the methodologies and practices used in the nuclear, aerospace, and (to an extent) automotive industries. That is, where software is safety critical and a bug can literally, directly kill people.
I'm not suggesting that it would be practical to develop every POS terminal to those standards, but we probably need to think more carefully about how things are developed and tested, depending on how critical the relevant systems are.
Apparently there is something that used to be called 'Embedded Windows' and is now 'Windows for IoT' that's designed for single dedicated devices, not sure if that was affected by this problem though. There is also a MS version called 'Azure Sphere', which is based on the Linux kernel. Again, not sure if this would be vulnerable.
It's also the case that Microsoft themselves sometimes completely fuck things up and don't offer proper solutions (as long as it's only a few customers who are affected). They did one update not so long ago (KB5034441) that changed the way the recovery partition was used, which meant that some machines didn't have enough space in their WinRE partition to install it (including my desktop), with only the helpful message "Error 0x80070643". All they offered was a PowerShell script (with hardly any instructions) or manual instructions to fix it using the command prompt. How many users even know these command line interfaces exist?
Using a big corporation isn't a guarantee of reliability.
Anyway, more of a few musings and a bit of a rant. As you were... :)
-
The point here is surely that given the spread of Crowdstrike's impact then coexistence testing is enormously important?
-
The point here is surely that given the spread of Crowdstrike's impact then coexistence testing is enormously important?
Yes, of course, but I imagine that is already part of their normal procedures. This has all the hallmarks of a massive cock-up, meaning that either the wrong build was released, one that hadn't gone through the normal testing process, or some last-minute 'minor' change was made that had, err... unintended consequences.
-
Yes, of course, but I imagine that is already part of their normal procedures. This has all the hallmarks of a massive cock-up, meaning that either the wrong build was released, one that hadn't gone through the normal testing process, or some last-minute 'minor' change was made that had, err... unintended consequences.
It was a bit more nuanced than that. They wrote their code as a Kernal driver that was fully certified and tested, which it has to be to be allowed in the Windows Kernal (Ring 0) as you can imagine. What Crowdstrike did was circumvent this certification process for code updates by including code changes in a config file that the unchanged driver picked up. Apparently a config file with the wrong or no data was deployed, which caused a kernal addressing issue in the driver.
Crowdstrike was using this loophole to circumvent a process intended to protect Ring 0 (kernal).
I imagine there will be some comeback and rolling back of this method of deploying code.
Deleting this config file in Safe Mode fixes the problem but obviously needs to be physically done and not remotely.
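To illustrate the "wrong or no data" point, here is a minimal sketch of the kind of bounds checking a loader can do before trusting offsets read from a file. The file layout below (an entry count followed by offset/length pairs) and the names `parse_config` and `ConfigError` are entirely made up for the example; this is not Crowdstrike's actual format or code.

```python
import struct


class ConfigError(ValueError):
    """Raised when a config blob fails validation (hypothetical name)."""


def parse_config(blob: bytes) -> list[bytes]:
    """Parse a made-up layout: a 4-byte little-endian entry count,
    then that many (offset, length) pairs of 4-byte integers, then data."""
    if len(blob) < 4:
        raise ConfigError("file too short to contain a header")
    (count,) = struct.unpack_from("<I", blob, 0)
    table_end = 4 + count * 8
    # An all-zero file gives count == 0; a garbage count runs off the end.
    if count == 0 or table_end > len(blob):
        raise ConfigError("entry table is empty or runs past end of file")
    entries = []
    for i in range(count):
        offset, length = struct.unpack_from("<II", blob, 4 + i * 8)
        # Reject any entry that points into the table or beyond the file.
        if offset < table_end or offset + length > len(blob):
            raise ConfigError(f"entry {i} points outside the file")
        entries.append(blob[offset:offset + length])
    return entries
```

The idea is simply that a file full of zeros or garbage gets rejected with an error rather than being followed to a bad address.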
-
It was a bit more nuanced than that. They wrote their code as a Kernal driver that was fully certified and tested, which it has to be to be allowed in the Windows Kernal (Ring 0) as you can imagine. What Crowdstrike did was circumvent this certification process for code updates by including code changes in a config file that the unchanged driver picked up. Apparently a config file with the wrong or no data was deployed, which caused a kernal addressing issue in the driver.
Crowdstrike was using this loophole to circumvent a process intended to protect Ring 0 (kernal).
I imagine there will be some comeback and rolling back of this method of deploying code.
Deleting this config file in Safe Mode fixes the problem but obviously needs to be physically done and not remotely.
On a slight tangent, your mis-spelling of "kernel" reminds me that Commodore once did the same thing with the Commodore 64 and it stuck. Fortunately, I don't think Crowdstrike is targeted at the C64.
-
On a slight tangent, your mis-spelling of "kernel" reminds me that Commodore once did the same thing with the Commodore 64 and it stuck. Fortunately, I don't think Crowdstrike is targeted at the C64.
I had a C16. Was my first 'PC'
-
It was a bit more nuanced than that. They wrote their code as a Kernal driver that was fully certified and tested, which it has to be to be allowed in the Windows Kernal (Ring 0) as you can imagine. What Crowdstrike did was circumvent this certification process for code updates by including code changes in a config file that the unchanged driver picked up. Apparently a config file with the wrong or no data was deployed, which caused a kernal addressing issue in the driver.
Crowdstrike was using this loophole to circumvent a process intended to protect Ring 0 (kernal).
I imagine there will be some comeback and rolling back of this method of deploying code.
Deleting this config file in Safe Mode fixes the problem but obviously needs to be physically done and not remotely.
Thanks for the extra detail, but it doesn't really change the point that the updated config file should have been tested with the existing code. I would be amazed if that wasn't supposed to have been done according to their normal release procedure.
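As a rough sketch of what "tested with the existing code" could look like in its simplest form: feed the candidate file to the same loader that is already shipped, and refuse to release if it is rejected. Both `safe_to_release` and `toy_loader` are hypothetical names invented here, and the toy loader is only a stand-in; I have no knowledge of what Crowdstrike's real pipeline does.

```python
from typing import Callable


def safe_to_release(candidate: bytes, loader: Callable[[bytes], object]) -> bool:
    """Return True only if the existing loader accepts the candidate file."""
    try:
        loader(candidate)
    except Exception:
        # Any failure in the shipped code blocks the release.
        return False
    return True


def toy_loader(blob: bytes) -> bytes:
    """Hypothetical stand-in for the shipped code: rejects empty or all-zero files."""
    if not blob or not any(blob):
        raise ValueError("config contains no data")
    return blob
```

For example, `safe_to_release(bytes(16), toy_loader)` is false because the file is all zeros, so it would never leave the building. A real gate would obviously run the real driver on real test machines, not a toy function.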
-
It was a bit more nuanced than that. They wrote their code as a Kernal driver that was fully certified and tested, which it has to be to be allowed in the Windows Kernal (Ring 0) as you can imagine. What Crowdstrike did was circumvent this certification process for code updates by including code changes in a config file that the unchanged driver picked up. Apparently a config file with the wrong or no data was deployed, which caused a kernal addressing issue in the driver.
Crowdstrike was using this loophole to circumvent a process intended to protect Ring 0 (kernal).
I imagine there will be some comeback and rolling back of this method of deploying code.
Deleting this config file in Safe Mode fixes the problem but obviously needs to be physically done and not remotely.
This is a slight mischaracterisation of what happened. See this technical note from Crowdstrike:
https://www.crowdstrike.com/blog/falcon-update-for-windows-hosts-technical-details/
The config file is really analogous to the virus definition files that AV software uses. Their use is not a "loophole" but a necessary feature to enable the vendor to keep pace with all the new methods of attack that are being discovered daily.
-
Not great at PR
https://www.bbc.co.uk/news/articles/ce58p0048r0o
-
Not great at PR
https://www.bbc.co.uk/news/articles/ce58p0048r0o
Apparently the reason why many people found it didn't work is that Uber Eats detected the flurry of redemptions as unusual activity and ironically assumed it was a cyber attack.
And yes, if you've worked a 24 hour shift rebooting all your Windows servers, a $10 voucher as recompense doesn't really cut it.
-
Apparently the reason why many people found it didn't work is that Uber Eats detected the flurry of redemptions as unusual activity and ironically assumed it was a cyber attack.
And yes, if you've worked a 24 hour shift rebooting all your Windows servers, a $10 voucher as recompense doesn't really cut it.
It feels like satire
-
I was directly affected as resources assigned to me in my work were rightly redirected to fixing servers/PCs. The consequences for my projects were real and costly.
I'm guessing these 3rd party impacts were repeated everywhere.
A $10 voucher seems pretty ironic.