[Ham-Computers] RE: SATA HD failures on Promise RAID 1 controller

Hsu, Aaron (NBC Universal) aaron.hsu at nbcuni.com
Mon Nov 13 20:04:55 EST 2006


Mike,

I would seriously look into a possible power supply problem.  You didn't mention what the specific drive failures were.  A single drive failing in a RAID 1 set is not uncommon.  Both failing at the same time is suspicious.  Now that one of the replacement drives has also failed (with no other equipment changes), start looking for a root cause of the failures.  The power supply is a likely culprit.  It's extremely rare for a drive controller to cause a physical drive failure, but if the signaling voltages are wrong, that could be a cause.

How old is the system?  Don't forget that a few years ago, industrial espionage "gone wrong" caused a *LOT* of electrolytic capacitor failures (an electrolyte formula was stolen, but it was missing a stabilizing ingredient...in short, the caps would vent hydrogen and blow).  Many vendors unknowingly purchased caps with the unstable electrolyte, which caused equipment failures within two years.  The electrolyte was used primarily in larger capacitors (>1000uF) typically found in power supply circuits.

Ironically, this past weekend, I worked on a system that had a power supply failure 1 year ago; but this time, the hard drive failed.  I replaced the HD and everything seemed to work fine until I dropped the system off with the client...the system refused to boot.  After 15 minutes of "tinkering", I took the system back to my workbench and spent 30 minutes checking all the cable connections and bench testing the power supply (including checking ripple on a 'scope) - still no go.  So, out comes the motherboard from the case - BINGO!  Of 19 "larger" electrolytic caps, 9 were bulging and 4 had burst and were leaking from the top.  These caps were part of the voltage regulator circuitry for the CPU.  A brief search on the net and I discovered that these 1500uF and 3300uF caps can drop to as low as 75uF after they burst and leak.  This would cause power instability that could blow the main power supply (which previously happened).  For reference, the motherboard was from a high-tier vendor (Gigabyte).  Asus, Abit, and other motherboard manufacturers (as well as many cell phone manufacturers) were also affected by this industrial espionage gone wrong.
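
To put the capacitance numbers above in perspective, here is a rough back-of-the-envelope sketch in Python.  It assumes the simple textbook relation that ripple voltage scales as I*dt/C, with the load current and switching period held fixed, so ripple grows in proportion to the lost capacitance.  It ignores ESR, which typically also gets worse in a failed cap, so treat the result as an optimistic estimate.  The function name is just for illustration.

```python
def ripple_scale_factor(nominal_uf: float, degraded_uf: float) -> float:
    """Factor by which ripple grows when capacitance drops.

    Assumes ripple ~ I * dt / C with load current and switching period
    unchanged; ESR effects are ignored (an assumption, not a measurement).
    """
    if nominal_uf <= 0 or degraded_uf <= 0:
        raise ValueError("capacitance values must be positive")
    return nominal_uf / degraded_uf


if __name__ == "__main__":
    # 1500uF and 3300uF caps dropping to 75uF, as described above
    for nominal in (1500.0, 3300.0):
        factor = ripple_scale_factor(nominal, 75.0)
        print(f"{nominal:.0f}uF -> 75uF: ripple roughly {factor:.0f}x higher")
```

Under those assumptions, a 1500uF cap that has dropped to 75uF gives about 20 times the ripple, and a 3300uF cap about 44 times.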

So, before you replace the drive(s) again, do a more thorough check into why the drives are failing.  The caps I mentioned are also found (in larger sizes) in power supplies and if the power supply caps have vented (or burst), you could be seeing abnormal voltages or high-voltage spikes on the power leads which could be causing your drive failures.
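
If you put a meter on the power leads, a quick sanity check of the readings might look like the Python sketch below.  It assumes the usual +/-5% tolerance on the +3.3V, +5V and +12V rails (my recollection of the ATX design guide, so verify against your supply's documentation), and the sample readings are made up for illustration.

```python
# Assumed tolerance: +/-5% on each positive rail (check your PSU's spec).
RAIL_TOLERANCE = {3.3: 0.05, 5.0: 0.05, 12.0: 0.05}


def rail_in_spec(nominal: float, measured: float) -> bool:
    """Return True if the measured voltage is within the assumed tolerance."""
    allowed = nominal * RAIL_TOLERANCE[nominal]
    return abs(measured - nominal) <= allowed


def out_of_spec_rails(readings: dict) -> list:
    """Return the nominal voltages of any rails that read out of tolerance."""
    return [nominal for nominal, measured in readings.items()
            if not rail_in_spec(nominal, measured)]


if __name__ == "__main__":
    # Hypothetical meter readings, keyed by nominal rail voltage
    readings = {3.3: 3.31, 5.0: 5.02, 12.0: 11.2}
    print("Out of spec:", out_of_spec_rails(readings))
```

Keep in mind that a meter only shows the average voltage.  Ripple and short spikes will not show up this way, so a 'scope is still the better tool if you have access to one.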

As for re-installing the OS, it depends on the "format" the drive is in.  I remember the older Promise controllers used different drive parameters to access the drive (as opposed to the parameters set by the BIOS).  This difference prevented you from moving the drive from the Promise controller to the on-board IDE controller.  One way around this is to use a drive imaging utility (such as GHOST) to do a drive to drive transfer.  Install the new drive on the on-board IDE controller and keep the "working" RAID drive on the Promise.  Then use the utility to "image" the drives.  Another option is to continue to use the Promise controller in a single-drive configuration - I believe you just turn "off" the RAID 1 function and let it run the drive as a single drive.


73 and good luck!

  - Aaron Hsu, NN6O


-----Original Message-----
Sent: Monday, November 13, 2006 4:29 PM
Subject: [Ham-Computers] SATA HD failures on Promise RAID 1 controller

ASUS A7V Deluxe MB
Windows XP Pro

About a year ago I had two Maxtor 250 GB SATA drives fail in RAID 1 on the Promise PDC20378 controller of an ASUS A7V Deluxe MB. I replaced both failed drives with Seagate 250 GB SATA drives. Now, it appears one of the Seagate drives failed--upon reboot I'm getting the disk failure error message (one drive failed). I don't understand why hard drives on this motherboard are experiencing such a high failure rate. However, I suspect the Promise RAID controller is to blame. I installed the most recent driver when I replaced both drives.

Should I abandon RAID and just use one hard drive? My system is working okay otherwise, even with the failed hard drive message. If I decide to remove the failed drive and just use the one good drive, how do I go about doing this? I assume I can do this without having to reload the operating system, applications and data?

If I purchase a new Seagate 250 GB SATA drive, can I replace the failed drive without having to reload the operating system, applications and data?
The instructions don't go into detail on this, but indicate it's possible.

Thank you. 

73 de Mike, N9BOR
A-1, FISTS, JARL A-1, SMC
http://www.n9bor.us
http://www.k9ya.org

di dah dit - The only Roger Beep you'll ever need.
Let your fingers do the talking - Morse code.
My designated driver is a 12BY7A.


