WinSvr 2012 R2 hanging due to event 129 from vsmraid: solved (for me)
I’ve been plagued by a problem where after running for 3-4 days (sometimes a longer interval, sometimes shorter) the performance of my Windows Server 2012 R2 system would just tank until rebooted. The event log (System) would fill with event 129 from driver vsmraid, reading:
Reset to device, \Device\RaidPort0
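To see how often the reset is being logged (and, later, whether it has stopped), the System log can be queried from a script instead of scrolling through Event Viewer. This is only a minimal sketch: it assumes the standard `wevtutil` tool is available and that the event's provider name matches the driver named in the log (`vsmraid` here, `storahci` on other systems). The helper names are made up for illustration.

```python
import subprocess
import sys


def event_query(provider, event_id=129):
    """Build an XPath filter matching one provider/event ID pair."""
    return f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"


def wevtutil_args(provider, count=20):
    """Argument list for `wevtutil qe`: newest events first, plain-text output."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{event_query(provider)}",
        f"/c:{count}",
        "/rd:true",
        "/f:text",
    ]


if __name__ == "__main__":
    args = wevtutil_args("vsmraid")  # assumption: provider name equals the driver name
    print(args)
    if sys.platform == "win32":
        # Only attempt the query on Windows; elsewhere just show the arguments.
        subprocess.run(args, check=False)
```

An empty result after a few days of uptime would suggest the resets have stopped.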
Lots of ineffective ideas and proposed solutions are on the web so I’ll point you to what worked for me: Set AHCI Link Power Management – HIPM/DIPM to “Active”, which disables AHCI link power management.
The problem is apparently that some devices, e.g., certain SSDs, don’t respond properly (or at all?) to Link Power Management commands, yet the Intel RAID drivers (or firmware?) insist on sending them anyway.
To solve this, you first change the registry so that the Power Settings applet shows the AHCI Link Power Management options, then you set the option to “Active”, which disables it (it means: let the device/link stay active and don’t send link power commands to it). If that works, you win. If not, more drastic surgery is required: you set the registry to totally disable Link Power Management (aka “LPM”) for all devices. I needed to do that.
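For anyone who prefers the command line to the applet, the “Active” setting can presumably also be applied with `powercfg`. The sketch below only builds and prints the commands. The two GUIDs are the ones from the registry paths in Sebastian Foss’ post quoted further down; the value index 0 for “Active” is an assumption worth checking with `powercfg /query` first, and the function name is hypothetical.

```python
# GUIDs taken from the registry paths in the quoted post.
SUB_DISK = "0012ee47-9041-4b5d-9b77-535fba8b1442"  # hard disk settings subgroup
AHCI_LPM = "0b2d69d7-a2a1-449c-9680-f91c70521c60"  # AHCI Link Power Management - HIPM/DIPM

ACTIVE = 0  # assumption: index 0 is the "Active" choice (no link power management)


def lpm_commands(value_index=ACTIVE, scheme="SCHEME_CURRENT"):
    """Return powercfg argument lists that set the option for AC and DC power."""
    index = str(value_index)
    return [
        ["powercfg", "/setacvalueindex", scheme, SUB_DISK, AHCI_LPM, index],
        ["powercfg", "/setdcvalueindex", scheme, SUB_DISK, AHCI_LPM, index],
        # Re-activate the scheme so the changed values take effect.
        ["powercfg", "/setactive", scheme],
    ]


if __name__ == "__main__":
    for cmd in lpm_commands():
        print(" ".join(cmd))
```

The printed lines can be pasted into an elevated command prompt. Nothing is changed by running the script itself.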
Go to this excellent post by Sebastian Foss (dead link deleted, but see below where I copied the post in) and follow steps 1 and 2. Reboot and await results. If that doesn’t solve your problem then follow step 3, which did it for me. (I didn’t do step 4.)
Here’s some more information: a question with discussion on TechNet, a tutorial with screenshots on how to enable the AHCI LPM power options in the Power Applet, and a SuperUser (StackExchange) discussion of it. There is also an excellent post from the NT Debugging blog [1] explaining storage timeouts and event 129. It’s off on only one key point: when summing up, the author says, “I have never seen software cause an Event ID 129 error.” Obviously, that post from 2011 predates this Intel LPM problem.
Hasn’t happened for two weeks now, so I’m declaring success.
P.S., here’s the information from Sebastian Foss’ post (linked above) just in case that post disappears:
I had several system freezes in Windows 10 Technical Preview (build 9926 – but I also had those freezes on earlier builds) on my Macbook Air 2013.
The System event log shows a warning for ID 129 from storahci: “Reset to device, \Device\RaidPort0, was issued.” This seems to be some problem related to the SATA controller and the SSD (in my case an Apple/Samsung SM0128F).
I was able to fix the problem by editing several registry entries:
1. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Power\PowerSettings\0012ee47-9041-4b5d-9b77-535fba8b1442\0b2d69d7-a2a1-449c-9680-f91c70521c60 and change the “Attributes” value from 1 (default; hidden) to 2 (exposed). [This will expose “AHCI Link Power Management – HIPM/DIPM” under Hard Disk power settings]
2. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Power\PowerSettings\0012ee47-9041-4b5d-9b77-535fba8b1442\dab60367-53fe-4fbc-825e-521d069d2456 and change the “Attributes” value from 1 (default; hidden) to 2 (exposed). [This will expose “AHCI Link Power Management – Adaptive” under Hard Disk power settings]
Now you can edit the AHCI Link Power Management options in your power profiles. You can either set them to “Active” or, as in my case, to HIPM (host-initiated). DIPM would be a device-initiated SATA bus power-down.
Those settings control the power state of the SATA bus; they do not power down the device.
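As an illustration of the two “Attributes” edits above, the following sketch generates the text of a .reg file rather than touching the registry directly, so the result can be inspected before it is imported. It assumes “Attributes” is a DWORD value. The function and file names are hypothetical.

```python
POWER_SETTINGS = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Power\PowerSettings"
SUB_DISK = "0012ee47-9041-4b5d-9b77-535fba8b1442"  # hard disk settings subgroup

# Setting GUIDs and the labels they expose, as given in the post.
LPM_SETTINGS = {
    "0b2d69d7-a2a1-449c-9680-f91c70521c60": "AHCI Link Power Management - HIPM/DIPM",
    "dab60367-53fe-4fbc-825e-521d069d2456": "AHCI Link Power Management - Adaptive",
}


def expose_reg_text(attributes=2):
    """Return .reg file text that sets Attributes (2 = exposed, 1 = hidden)."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for guid, label in LPM_SETTINGS.items():
        lines.append(f"; {label}")
        lines.append(f"[{POWER_SETTINGS}\\{SUB_DISK}\\{guid}]")
        lines.append(f'"Attributes"=dword:{attributes:08x}')  # assumption: DWORD value
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Write the file next to the script; import it manually after reviewing it.
    with open("expose-ahci-lpm.reg", "w", newline="") as fh:
        fh.write(expose_reg_text())
    print(expose_reg_text())
```

Calling `expose_reg_text(1)` produces the file that hides the options again.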
3. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\storahci\Parameters\Device
Set NOLPM to *. The values under this key contain several hardware IDs (vendor and device) for storage devices. Setting NOLPM to * disables LPM control messages to any storage device. I also set SingleIO to *. I never had any freezes or storahci warnings again.
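The NOLPM change could be scripted with the standard `reg` tool. This is a cautious sketch, not a tested recipe: it assumes the value is a multi-string (REG_MULTI_SZ), which should be confirmed in regedit before running anything, and it exports the key first so the original device lists can be restored. The function and file names are hypothetical, and the script only prints the commands.

```python
DEVICE_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\storahci\Parameters\Device"


def backup_args(out_file="storahci-device-backup.reg"):
    """Arguments for `reg export`, saving the key before it is changed."""
    return ["reg", "export", DEVICE_KEY, out_file, "/y"]


def set_wildcard_args(value_name, value_type="REG_MULTI_SZ"):
    """Arguments for `reg add` that overwrite value_name with a single '*' entry.

    Assumption: the value is a multi-string; check its type in regedit first.
    """
    return ["reg", "add", DEVICE_KEY, "/v", value_name, "/t", value_type, "/d", "*", "/f"]


if __name__ == "__main__":
    for cmd in (backup_args(), set_wildcard_args("NOLPM")):
        print(cmd)
```

The same helper would cover SingleIO via `set_wildcard_args("SingleIO")`. Overwriting either value discards the device IDs it held before, which is why the export comes first.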
I hope this helps those who have also been looking for a solution for a long time.
BR – Sebastian Foss
[1] Apparently moved from the NT Debugging blog to the author’s personal blog.