Recovering From A Seagate HDD Firmware Bug

Hard drive firmware is about the last place you want to find a bug. But that turned out to be the problem with [BBfoto’s] Seagate HDD which he was using in a RAID array. It stopped working completely, and he later found out the firmware has a bug that makes the drive think it’s permanently in a busy state. There’s a firmware upgrade available, but you have to apply it before the problem shows its face, otherwise you’re out of luck. Some searching led him to a hardware fix for the problem.

[Brad Garcia] put together the tutorial which illustrates the steps needed to unbrick the 7200.11 hard drive with the busy state bug. The image in the lower right shows the drive with a piece of cardboard between the PCB and the connectors which control the head. This is necessary to boot the drive without it hanging due to the bug. From there he issues serial commands to put it into Access Level 2, then removes the cardboard for the rest of the fix.

In the tutorial [Brad] uses a serial-TTL converter. [BBfoto] grabbed an Arduino instead, using it as a USB-TTL bridge.
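For readers who would rather script the first contact with the drive than type into a terminal, here is a minimal sketch in Python. It is not taken from [Brad]'s tutorial: the serial settings (38400 baud, 8N1), the Ctrl+Z wake-up and the `F3 T>` prompt are as reported by a commenter below, the third-party pyserial package is assumed to be installed, and the names `has_prompt` and `wake_drive` are made up for illustration.

```python
# Assumed values, taken from a reader comment on this post rather than
# from the tutorial itself. Verify them against your own drive.
BAUD_RATE = 38400      # 38400 baud, 8 data bits, no parity, 1 stop bit
CTRL_Z = b"\x1a"       # Ctrl+Z asks the drive for its diagnostic prompt
PROMPT = b"F3 T>"      # prompt the drive is expected to print back


def has_prompt(reply: bytes) -> bool:
    """Return True if the drive's reply contains the diagnostic prompt."""
    return PROMPT in reply


def wake_drive(port_name: str, timeout_s: float = 2.0) -> bool:
    """Send Ctrl+Z over the serial adapter and report whether the prompt came back.

    port_name is whatever your USB-TTL adapter shows up as,
    for example "/dev/ttyUSB0" on Linux.
    """
    import serial  # third-party pyserial package, imported only when needed

    with serial.Serial(
        port_name,
        BAUD_RATE,
        bytesize=serial.EIGHTBITS,
        parity=serial.PARITY_NONE,
        stopbits=serial.STOPBITS_ONE,
        timeout=timeout_s,
    ) as port:
        port.write(CTRL_Z)
        reply = port.read(64)  # returns early with fewer bytes on timeout
    return has_prompt(reply)
```

Calling `wake_drive("/dev/ttyUSB0")` only checks that the drive answers. It deliberately sends none of the repair commands, which should come from the tutorial itself.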

41 thoughts on “Recovering From A Seagate HDD Firmware Bug”

  1. and that ladies and gentlemen is why one should have run the firmware updater tool *Before* something happened.
    I have 4 7200.11’s I flashed in a RAID-Z array. Have had zero problems before or after.

    1. Everyone knows you don’t fix what isn’t broken. This should be a voluntary recall, or a fix from the manufacturer for price of shipping one way. But at the price of HDDs now, price of shipping would almost be useless, much like what happened to me shipping a $40 shredder for $60 when it failed less than 4 months of use.

      Some good a 1 year warranty is.

    2. I think the problem is, a lot of people don’t know they have a hard drive with that bug until the problem happens. Sure, you may buy a hard drive knowing it has that bug, but many people may have bought it without knowing of the bug, or realizing that their drive is affected.

    3. Yeah, that’s great that you knew that the hard drive firmware was hooped but I’ve never had a hard drive fail since my first XT computer with a 40MB drive (1989). Well, until now that is… I do now have an unresponsive 7200.12 hard drive that undoubtedly has a firmware bug if not the same firmware bug as the 7200.11. It showed up with absolutely no warning. One day the computer shut down for a reboot and never came back. Sad but true.

    1. Won’t work for me – mine is a 7200.10, not a .11 with the BSY problem. I’ve got TONS of old photos, videos and music on it that I need off. It’s sitting around till I can get some disposable income and send it out to get professionally recovered

      1. Try using ddrescue (linux), it will try to recover the whole drive, you can either recover it to another drive or to an image file. If you do image file you can open it with linux reader from diskinternals for windows.

  2. And this is why I only use Western Digital hard drives now and avoid Seagate at all costs… Since they bought out Maxtor, Seagate’s quality control and overall reliability has gone down the shitter.

    Out of 12 Seagate drives I used in my server in the last 6 years, 9 of them have failed in some way, and of the 9 replacements I’ve gotten back from Seagate, 2 were DOA and 5 others have since failed… After that, I gave up and replaced them all with Western Digital Black drives, I haven’t had a single issue since.

    1. Ditto. I had a 7200.9 that went bad after a few months. I could’ve gotten it replaced under warranty but that doesn’t do anything for the thousands of family photos I had on there. Googled around and found out the 7200.9 drives were notoriously failure-prone. I won’t touch Seagate now.

  3. This is pretty old and several youtube howto’s were put up back when this bug hit the scene.

    It’s one of the reasons why I am very hesitant about using seagate drives, this and some other stupid things they did which should not have happened.

  4. does this fix the bug itself?

    or just reset the busy-state and put the drive back into a “ready to fail” state ???

    IE: does this fix it forever or does it just put it back into a state of “1/million chance of re-glitching” ???

  5. Once it does this and you fix it, never trust the drive again no matter what you do to the firmware. Get your stuff off and maybe use it for something you can afford to lose. I’ve fixed them, upped the firmware, and it goes bad in 6 mo’s anyway.

    Also, sometimes I’ve had to vary from the standard procedure to get one fixed. Like change what I’m disconnecting with the cardboard.

  6. I have a bunch of those drives in a series of RAID 5s (9TB raw capacity), 2 have died so far and I can’t send them back because they were OEM disks. I’m pretty sure you can’t update the firmware when they are in a RAID either. Anyone know how I could update it, without pulling a drive from the array and having to spend god knows how many weeks rebuilding my array over and over again, all the while hoping that another drive doesn’t die during the rebuild? The controller is a Highpoint 3520 if that matters.

  7. i have an ocz vertex barefoot ssd that bricked due to the controller firmware taking a shit. anyone know of a fix for these? I would love to retrieve my data from it…

  8. I had a 36 gig Maxtor drive in the millennium family that managed to lose the part of its firmware stored on the platters.

    I searched for a long time for a firmware file to use with the procedures I’d found to fix Maxtor drives. The millennium drives were the only ones I could not find any firmware for.

    For quite a while I’d look for some file, couldn’t find it on any of my drives… “Oh. It was on that POS Maxtor!”

    Western Digital in the pre WWW era had the best customer service. They had a dialup BBS with all their software and drivers. It also had a documentation lookup feature where you could find the numbers. The next step was to dial their toll free FAX-back service where (IIRC) up to four documents at a time could be requested. In a few minutes it’d FAX you the documents. Even more amazing was WD had software and documents for product lines they’d sold off to other companies. Remember the “Orchid” video cards? I got drivers and docs for them from WD after the company WD sold Orchid to had discontinued the line. Other companies (like Chaintech) would destroy/delete information on discontinued products and some went so far as to deny ever making some items, even with their current company name and address printed directly on the item.

    In later years, when a hard drive had to go back under warranty, WD had a minimum turnaround time. If they couldn’t repair the drive you sent in, they’d send a refurbished identical model. If they had no refurbished ones of that model in stock, they’d ship you a new drive. If they had none of that model at all, you’d get back a drive of the next higher capacity with at least the same spindle RPM.

    In contrast, the way Seagate would do warranty service was they would fix the returned drive, no matter how long it took. If the returned drive wasn’t repairable you’d get back a different drive of the same model. If they had none in stock you got to wait as long as it took for Seagate to get in an identical model. You would not ever get a new drive as a warranty replacement.

    IBM was even worse. They’d send out “certified” used drives. Once one of their “certified” drives didn’t work at all and on the day the second replacement arrived the customer’s original drive died completely, taking all her files to the digital graveyard with it.

    IIRC Maxtor and Quantum would replace warranty returned drives with new ones of the same model and refurbished ones if the drive was out of production. You’d never get a higher capacity or any different “equivalent” like Western Digital would do.

  9. In my shop I see about 75-100 hard drives a week and track the brand of bad drives. Frequency of failure from best to worst: Seagate, Western Digital, Samsung, Fujitsu, Toshiba, Hitachi. I no longer track IBM or Maxtor. I swore by WD until last year, then I ordered 9 drives, every one failed within 3 months. The frequency of bad drives has also increased with time, hard drives are not made anywhere near as well as they used to be, backup, backup, backup. Also worth noting WD and Seagate both chopped their warranty period 2 years regular, 3 for enterprise.

  10. You’re absolutely right about HD production quality. I have many computers and hard drives. In my 25 computer teaching lab back in the 80’s I left everything on 24/7 and had not a single failure for over 10 years. I also have a special computer I built using the most stable motherboard ever made (up to that time). I think it was an Intel 440BX. It was contracted for special government use and they required unmatched reliability. Intel tested it hundreds of hours to obtain the MTBF required. Anyway, it’s still running its original installation of Windows 2000 along with its original two hard drives, has never failed (or the hard drives) and has only been turned off and on 5 or 6 times for brief periods! I think I built it about 15 years ago! I still leave nearly all my home machines on 24/7 but now there are many more failures. There is a fairly inexpensive little program (fits on a floppy) written by a genius ex-hard disk repair and design engineer named Steve Gibson that is worth its weight in gold! It is THE ONLY ONE OF ITS KIND that exists as far as I know. It is COMPLETELY NON-DESTRUCTIVE (he says you can even unplug the computer while it’s running and no harm is done but I’m not trying that). It takes hours or days to run because it calculates the very most difficult patterns of 1’s and 0’s for each sector to write and read, storing your data temporarily elsewhere, and makes sure it is able to read back each pattern without problems. I’ve used it for over 7 years and it has never failed to diagnose a drive that had unknown problems or was destined to fail even months into the future. It has also repaired many drives for me that would not boot or that I could not even read! Enough from me. I taught computer science for the good part of 20 years and ran a small computer business back when you could make decent money building and fixing computers on the side. I have a master’s degree in computer science and a BA in pure math from U. C. Berkeley.

  11. I am trying to help a cousin fix his old mybook drive that is a wd5000aacs drive. I am assuming it is the firmware issue as it just disappeared but still spins and no noises from it. I bought a usb to ttl adapter, power supply for sata etc from Amazon and have everything connected. I have RX and TX to same on drive serial pins (no gnd because if I use ground I get no keyboard after connecting to ttyUSB0). I block the pins on the drive with card stock per the instructions, power up the drive, plug in the cp2102 usb to ttl, fire up gtkterm and use the correct configuration 38400, 8,1,n,n and connect to ttyUSB0. I am able to type in letters and see them in the terminal window so I assume the loopback is working. However, when I use Ctrl+z I do not get the F3 T> line. Nothing happens. Any idea what the issue might be?

    1. The issue I had in communicating with the hard drive was the fact that Putty wanted the Caps Lock key on for this to function properly. Once I did this I was able to get started with the process. I ran through the procedures and it came up with this error message after I removed my isolation strip.

      ERROR 1009 DETSEC 00006008
      SPIN ERROR
      ELAPSED TIME 26.099 SECS
      R/W STATUS 2 R/W ERROR 84150180

      So after a little Googling I was able to determine the error message was related to the PC board not making complete contact with the drive body because some of the board’s screws were loose enough to allow the removal of the isolation strip.

      I tightened down the screws and issued the ‘U’ command and was rewarded with the opportunity to continue with the recovery. I finished the final steps of formatting the needed partitions.
      I mounted the drive back in the computer and all was good – I have my data back that was sitting there all the time.

      I found this web site very helpful. http://forum.hddguru.com/viewtopic.php?f=1&t=11040&start=160
      It helped to fill in the holes around the error message.

      Hope this helps someone
      Mike

    2. Western digital isn’t seagate, and so your issue although not mechanical might be something quite different than a firmware error.
      I’m not sure why you would think a completely different manufacturer would have an error from the first manufacturer. Unless you are talking about SSD’s who had some issues once across brands due to them using the same controller from a third party, but regular HD all make use of their own stuff.

      Plus seagate has a history of firmware bugs whereas WD does not AFAIK.

  12. It’s a kettle of fish for me. My first drive failed like that, and I just had to wear it. The 2nd one I tanked after it fell off the nightstand, that was in 2012, and I am just organising to transfer the platters to a new secondhand one, then will transfer the data (if I am lucky) to a western digital. Yes photos and music I wanted to save.
