Microsoft is advocating for significant changes to Windows security architecture in the wake of a major outage caused by a faulty CrowdStrike update last week. The incident, which affected 8.5 million PCs, has prompted Microsoft to reassess the resilience of its operating system and consider limiting kernel-level access for third-party security vendors.
John Cable, Microsoft’s vice president of program management for Windows servicing and delivery, outlined the company’s stance in a blog post titled “Windows resiliency: Best practices and the path forward.” Cable emphasized the need for “end-to-end resilience” and hinted at potential changes that could restrict kernel access for security software.
The CrowdStrike update bug, which resulted in widespread system crashes, has highlighted the risks of allowing third-party software to operate at the kernel level. This privileged access is valuable for threat detection, but a fault in kernel-mode code can bring down the entire operating system rather than a single application.
Microsoft is now exploring alternatives that don’t rely on kernel access, such as virtualization-based security (VBS) enclaves and the Azure Attestation service. These technologies, which follow Zero Trust approaches, could provide enhanced security without the risks associated with kernel-level operations.
The tech giant’s push for change may face resistance from cybersecurity vendors and regulators. A similar attempt to restrict kernel access in Windows Vista in 2006 was met with opposition. In contrast, Apple successfully locked down kernel access in macOS in 2020.
Microsoft’s approach involves collaboration with partners and the broader security community. The company aims to balance improved system resilience with the needs of security vendors who have historically relied on kernel-level access for their products.
Microsoft’s response to the CrowdStrike incident also included deploying over 5,000 support engineers to assist affected organizations and providing ongoing updates through the Windows release health dashboard. The company has emphasized the importance of business continuity planning, secure data backups, and utilizing cloud-native approaches for managing Windows devices to enhance resilience against future incidents.