I am having an intermittent problem with my computer where programs start freezing up. What generally happens is that certain programs stop responding entirely, and are impossible to kill via the task manager (almost always iTunes and my backup program). Eventually, I’m forced to reboot. Inevitably, when I reboot, my RAID 1 array goes into a verification scan, finding and repairing errors along the way.
Because the programs that lock up are the ones that read large parts of the disk, and because the RAID array needs repair after every reboot, I’m inclined to think that one of the drives in the array is developing errors slowly over time.
Any ideas as to how I might diagnose which drive and whether I need to just replace the drive entirely? Could it be the RAID card instead? Has anyone seen similar problems with a RAID array and iTunes locking up?
EDIT: The RAID controller is an Intel ICH8R/ICH9R/ICH10R/DO SATA RAID Controller. I don’t think that’s the product name, but it’s all the info I can glean from the device manager.
Update: Since I asked this question, I have stopped using the RAID 1 array and upgraded to a new, single drive. I still see the same sort of degradation after a couple of weeks of uptime, but now when I reboot, instead of rebuilding the array, the OS forces a check-disk, where it often finds a couple of errors, fixes them with no problem, and then continues booting.
Any help would be greatly appreciated.
The errors you are experiencing are likely filesystem-level corruption and not physical issues with the drive. If there are CRC errors or other such things with the drives, you’ll get disk errors in the Event Viewer.
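As a rough sketch of how to check for those disk errors without clicking through the Event Viewer, the snippet below builds a query for the built-in `wevtutil` tool and runs it. The provider name `disk` and the choice of levels (critical, error, warning) are assumptions about how your system logs drive problems; the function names are made up for illustration.

```python
import subprocess


def build_disk_event_query(max_events=20):
    """Build a wevtutil command listing recent events from the 'disk' provider.

    Assumption: drive-level problems are logged to the System log under a
    provider named 'disk'. Levels 1, 2, 3 are critical, error, warning.
    """
    xpath = "*[System[Provider[@Name='disk'] and (Level=1 or Level=2 or Level=3)]]"
    return [
        "wevtutil", "qe", "System",
        "/q:" + xpath,             # XPath filter on the event XML
        "/c:" + str(max_events),   # maximum number of events to return
        "/rd:true",                # newest events first
        "/f:text",                 # human-readable output
    ]


def recent_disk_events(max_events=20):
    """Run the query (Windows only) and return whatever wevtutil printed."""
    result = subprocess.run(
        build_disk_event_query(max_events),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `print(recent_disk_events())` from a Windows command prompt should show the matching events, if any. An empty result would be consistent with the corruption being at the filesystem level rather than the drive itself.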
Programs that are impossible to kill via the task manager are likely stuck somewhere in the kernel. Usually this means a device driver is at fault. I don’t know what types of drivers iTunes installs, but one of them could be the problem. Try updating iTunes to the latest version if possible. I could imagine that some types of software that monitor your disk for changes could cause an issue as well.
Also, try to see if you have the latest drivers for your chipset and try updating your BIOS to the most recent version.
EDIT: Windows also supports things like “filter” device drivers that intercept reads and writes going to physical device drivers. If there is an issue with a filter driver, e.g. it’s stuck waiting for something else, then it might cause the system to freeze. Nero’s PxHelper.sys (or something like that) is an example of such a driver commonly attached to the CD-ROM device driver.
Possible software that would do this for a hard drive would include antivirus software, encryption software, possibly some types of backup software, Windows AIK, and malware.
I heard about someone having a similar issue before. It turned out that their motherboard only supported SATA version 1 and the hard drive was trying to run at SATA version 2 speeds. They ended up fixing the issue by throttling the hard drive down to 150 MB/s using the OPT1 jumper pins (works on Western Digital drives). If this were the case, you’d notice some abnormal-looking graphs in the HD Tune benchmark (such as the transfer rate reaching a peak and repeatedly dropping down to 0). The benchmark gives better results when the hard drive is not being used for anything else. The average transfer rate should be around 100 MB/s for relatively new desktop computers.
On a healthy hard drive, the transfer rate in the HD Tune benchmark should gradually decrease as the test progresses to 100%. If it is a straight line all the way across at 200 MB/s, you probably have one of those awesome new solid state drives. If it is straight all the way across at around 10 MB/s, your drive could be stuck in PIO mode and going super-slow (which can make large applications appear to hang).
Windows knows when it has not been shut down properly, so a forced shutdown could be what triggers the chkdsk on start-up. I would imagine that forcing a shutdown in the middle of a write operation could cause file system errors.
Screenshots of a completed benchmark, the info tab, and the health tab from HD Tune might help narrow down the issue (you can use the free version).
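To make the rules of thumb above a bit more concrete, here is a small sketch that applies them to a list of transfer-rate samples read off the benchmark graph. HD Tune does not provide this function; the name `classify_transfer_curve`, the labels, and every threshold (5% of peak for a "drop to 0", 10% spread for "flat", 15 and 150 MB/s cut-offs) are my own assumptions for illustration.

```python
from statistics import mean


def classify_transfer_curve(rates_mb_s):
    """Classify transfer-rate samples (MB/s), ordered from 0% to 100% of the disk.

    All thresholds below are illustrative guesses, not HD Tune behaviour.
    """
    if not rates_mb_s:
        raise ValueError("need at least one sample")

    peak = max(rates_mb_s)
    low = min(rates_mb_s)
    if peak <= 0:
        return "inconclusive"

    # Repeated collapses to (almost) zero: the pattern described for a
    # drive negotiating a faster SATA speed than the motherboard supports.
    drops = sum(1 for rate in rates_mb_s if rate <= 0.05 * peak)
    if drops >= 2:
        return "speed_mismatch"

    average = mean(rates_mb_s)
    is_flat = (peak - low) <= 0.10 * peak

    if is_flat and average <= 15:
        return "pio_mode"   # flat and very slow
    if is_flat and average >= 150:
        return "ssd"        # flat and fast

    # A spinning disk is faster on the outer tracks, so the start of the
    # graph should sit above the end.
    third = max(1, len(rates_mb_s) // 3)
    if mean(rates_mb_s[:third]) > mean(rates_mb_s[-third:]):
        return "normal"

    return "inconclusive"
```

For example, `classify_transfer_curve([120, 110, 100, 90, 80, 70, 60])` falls into the "normal" case, while a curve like `[100, 0, 100, 0, 100]` is flagged as a possible speed mismatch. Treat the result as a hint for what to look at next, not a diagnosis.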
If your RAID controller is one of those lower-cost models that doesn’t do its own processing but relies on the driver to do all the hard work (many lower-end motherboards with integrated RAID controllers have these types of controllers), and Windows is crashing while the disk is being updated (or all the updates haven’t been written to the disks prior to the reboot operation), then this could be the cause of your problem.
One big clue here is that your computer is slowing down a lot, especially when dealing with large amounts of data. Does your RAID array seem to start rebuilding before the OS boots, or after? If after, then it’s very likely that you have one of those software RAID controllers I mentioned in the first paragraph.
I had a client with similar problems. We all thought it was the RAID card, so we replaced it with another. Two more cards later, we came to the conclusion that it was the computer.
When he replaced the computer and reused one of the RAID cards we had thought was faulty, the problem never resurfaced.
In the end, we never really narrowed down exactly where the problem was. It could have been the RAM, motherboard, or CPU; who knows. But since he replaced the whole box and was able to reuse the RAID controller and hard drive, we isolated enough to know it wasn’t software, the hard disk, or the RAID controller.
If you have multiple RAM sticks, try removing some of them and moving them around to see if you get the same problem. For example, with 4 GB of RAM as 2x 2 GB sticks: remove the top one and run the machine for a while on 2 GB to see if the problem is still there. If it is, swap in the other stick and leave the first one out. Same problem? Then it might not be the RAM. Problem goes away? Hmm, interesting… it could be a RAM issue.