Microsoft Sorta Explains E74 Errors


Last month we speculated on the recent rise in Xbox 360 E74 errors. We assumed that this was because of an increase in the number of HDMI consoles and that the associated scaler chip was failing. Unfortunately, since these weren’t red ring failures, they didn’t fall under the extended three-year warranty for Xbox 360s. That changed this week, when Microsoft admitted that some E74 errors are the same types of failures that cause the RRoD and said it would repair E74 consoles under the same three-year warranty. Kotaku attempted to get a better explanation out of Microsoft, but only got a little more info. Microsoft did confirm that E74 is not a reclassification of RRoD, but that there is some overlap between the two.

[via xbox-scene]

29 thoughts on “Microsoft Sorta Explains E74 Errors”

  1. Last I heard, both rrod and e74 are caused by solder issues, but on different chips.

    If that’s the case – they need to re-examine their design. Maybe poach some Sony or Nintendo engineers!

  2. @bikehelmet Solder issues per se; basically, the amount of heat generated by the chips is too much for the solder to withstand, causing it to literally desolder. You know, I don’t know why on a commercially produced PCB they can’t just weld (using steel or such) the circuit to increase its maximum heat tolerance.

  3. lol.. I’ve bought several 3RROD xbox’s now… had 100% success in reviving all of them with simple methods.. furthermore I’ve seen a few 2-ring boxes, called microsoft, was denied warranty, bashed the console (literally, hit it, drop it) until it gave 3 rings, and then was successful in getting warranty coverage under the new 3year plan…

    for those who aren’t familiar, 3RROD consoles can be fixed relatively easily (it requires voiding the warranty, but I’ve yet to have a failure). Take it apart, remove the heat sinks, and re-paste if the paste is ruined (solidified). Remove the fan housing and place it on top of the dvd drive so it’s only cooling the CPU, then turn the xbox on and leave it on for about 10-15 minutes. Shut it off, put the fan housing back together, turn it on, enjoy.

    I’ve heard that this method supposedly re-flows broken solder points.. can’t verify this as I haven’t taken the temperature of the gpu during the process. All I can say is 4 revived xbox’s is proof of concept to me..

    this fix works for 3RROD 360’s as well as 360’s that output sound but no video.

    another issue with the sound but no video 360’s is they sometimes output video, but some aspects of said output are flawed… grainy video quality, missing layers of the video signal resulting in missing bloom effects etc.. I’ve found that this issue in particular is caused by the motherboard warping. Apply some downwards pressure to the CPU sink (might trigger a 2 ring, but won’t be fatal, and if it is, go back to the overheat method and start again, always works) to find the sweet spot… might have to put spacers in some of the mobo mount screws to prevent the warping for a long term fix. I also found that literally placing weights (I used 30 pounds of weights) directly over the location of the CPU sink on the exterior of the case solved this problem.

    I have 1 console that has exhibited 2 and 3 RROD’s. This particular console has been revived several times and remains in working condition to this day… best of all I got it for $20 ;)

    Fix it yourself, you’ll feel smarter!

  4. @coreyw – nice workaround for the warranty
    @blizzarddemon – how do you propose welding ICs without either heating them so much internal connections disintegrate or without causing electrical damage?

  5. Several of the above posters need to carefully reconsider their fanciful ideas about the melting point of solder (leaded or RoHS) and the operating temperatures within the xbox. I do a lot of temperature soak testing and yes, heat can lift pins away from the board, but only where the solder joints were not properly flowed in the first place. This problem is commonplace, and hard to spot until failure, as ICs make contact through mechanical pressure (until thermal gradients cause the pin to lift).

  6. @liam

    I’ll admit i was skeptical on the topic of solder flow occurring. Things get hot, but not soldering iron hot…

    I’m curious, what do you reckon is taking place to cause the 360’s to come back to life using my method described above? I guess some sort of favorable warpage in some way/shape/form…

  7. is it possible that the RROD is just some sort of error flag set by the xbox, which gets cleared by a GPU overheat error, which in turn is cleared when you reboot with everything working?

    it’s basically impossible that removing the fan will reflow the solder joints without damaging the GPU. RoHS solder melting points are on the order of 210-220°C, while the maximum operating temperature is at most 125°C (this is typical for military spec ICs; commercial spec is usually more like 85°C). Storage temperature is usually 125°C, too. I’d like to believe that some mechanical deformation is happening, but honestly any well designed IC these days has thermal sensors which will basically shut it down if it gets too hot, so it’s unlikely it’ll even be able to overheat itself to any significant extent.

  8. @threepointone

    i think you hit the nail on the head with the error flag & reset upon gpu overheat…

    I’ve overheated a unit, turned it off after 15 minutes and immediately turned it back on (minus the time to reassemble the fan assembly & airflow shroud; 5-10 seconds) with it then working completely fine. The console then had no problem playing for extended periods of time and (as far as i know) hasn’t failed thus far. curious problem..

    furthermore i’m curious if you can prevent one of the fans from spinning and still have the device boot.. this might give rise to an overheat/fix that doesn’t require the case to be opened, thus presenting a surefire method to get around the warranty process without voiding/waiting…

    my body physically cannot tolerate waiting 3 weeks for somebody to fix something that might be possible to fix in 15 minutes you see ;)

  9. RE: the “overheating” method: I’ve heard entirely different methods of achieving the same thing. More than one friend has told me that they ‘cured’ their RROD failures by leaving the console turned on, wrapped in a towel until the unit overheats, then allowing it to cool. In every case, the RROD returned within a couple of months. I have always been entirely sceptical of such methods, but having read what has been said here in the comments thread, there may actually be some mileage in this otherwise insane practice…

  10. i had a rrod xbox that i did the reheat trick on, which worked for a few months. i cut holes on top of the case and added an 80mm fan and a 70mm fan, which kept the reheat trick working for a while. then i replaced the xclamps with bolts, and that worked for a few months. then finally i had to try the heatgun reflow and that finally killed my xbox.

    the design is flawed and any fixes you do are probably going to be temporary. i got probably 6 months out of mine and i feel pretty lucky that i did.

  11. The towel trick, in my opinion, is a joke. It “works” but it’s very temporary. I tried it with an RRoD console that I got for free, lasted a few minutes, enough for me to say OH LOOK IT WORKS!!! and then it was back to the red lights. I did the Team Hybrid’s X-clamp replacement and that does work, been running fine for 3 weeks now. The hybrid fix helps to level the board surface under the GPU and CPU by using a foam pad under the board, pushing up, and the heatsink on top of the chip, pushing down with equal force. This means the board will bend less due to heat, preventing the board from bending away from broken contacts.

    As for the E74, I think that it is the same as RRoD in most cases. RRoD is caused by a bad solder joint under the GPU. The GPU is connected to the HANA/ANA chip by…you guessed it…solder joints (under the GPU, traces, and then under the HANA/ANA chip). If the failed joint under the GPU happens to go to the ANA/HANA, it probably says E74 instead of RRoD.

  12. Personally, I think adding extra cooling is like sticking a turbo on your car to ‘fix’ binding brakes. If thermal gradients are causing problems, then adding more cooling could be just as likely to make things worse.

    The xbox has well documented hardware problems, true. But everyone I know who has encountered such a problem has had a repair or replacement outside of warranty without question. I’ve never had to deal with xbox support, but I’ve had similar good service for other MS hardware (most recently a bluetooth mouse). I wish I could have said the same for my PSX, which was only a month old when the video output croaked, not to mention the DRE-plagued PS2. I’m no lover of MS, but when it comes to support I have found them hard to fault. I think that the extension of the scope of their out-of-warranty replacement policy serves to underline this point.

  13. In other hardware failures, my macbook power cable almost caught fire after it started melting, but it’s a full 3 years old. It is, however, a problem that is not exactly rare to have.
    The new power adapter cable is a bit stronger, thankfully, so I think they knew.

    Anyway… burning hardware… woo!

  14. I wonder how many people are discouraged from buying a 360 with this constant stream of failure news, for years.
    If I was a console user I’d certainly feel uneasy about the whole thing, it’s all well and good to have a warranty but that’s such hassle and it’s a wait often, and in some cases you wait weeks, then get a device back only to see it fail again a week later, or even arrive in a still broken state.

  15. E-74 is an “EDRAM failure” (see http://www.free60.org/wiki/error_codes, which has a quote from a microsoft engineer – see the famous http://pictures.xbox-scene.com/xbox360/bsod/xbox360_bsod_02.jpg picture, which is from E3 2005(!), if I’m not mistaken.) EDRAM is the RAM chip on the GPU package, which is connected to the GPU with a very high speed bus. This bus requires a complex training/setup phase, in which a known texture (in GDDR3) gets rendered (to EDRAM) and then resolved back into (GDDR3) memory. Then a “checksum” on both ends is monitored. If the checksums don’t match, certain values are tweaked (possibly some delays or phase adjustments – the usual “brute force” approach), until the checksums do match. If this process fails for a number of retries, you get E74.

    Now, how does that relate to RROD? The usual RROD case is that the memory sizing fails. This happens quite early in the boot process, when a “virtual machine” is invoked which initializes memory (and a number of other things). The most likely reason for the memory initialization to fail is that one of the address or data lines doesn’t work reliably anymore. There has been much speculation about the cause of this (see http://www.bunniestudios.com/blog/?p=223 for example, and also see how people have been able to get the console working again by applying stress to the pins using XClamps etc.), but let’s just say that they don’t work reliably and that it’s probably also affected by temperature in a very complex way. The memory initialization happens about 1s after you press the power button, and takes approx. 3 seconds – newer consoles take a bit longer here.

    The EDRAM-initialization happens a bit later – usually right before the bootup animation starts. So, why do we see an increase in those failures now? Possibly not because the EDRAM gets loose, or the connection fails. The more likely reason is that the actual texture memory (GDDR3) is damaged in a way that the test texture is read incorrectly, and thus the checksums don’t match – even if the GPU and EDRAM are working properly. So basically both E-74 and the traditional RROD mean the same thing – the memory doesn’t work.

    I can only speculate why the initial memory init doesn’t catch this case – one of the reasons could be that the memory init runs at a completely different temperature level. The CPU is still running at quarter speed, the GPU clocks JUST got initialized into full speed. Temperatures (of CPU, GPU, EDRAM) between this point and a few seconds later are likely to change by 10 Kelvin or more – that’s a lot. My guess is that the thermal expansion causes the memory to ultimately fail, but not as early in the boot process as a RROD.

    Why is E-74 a new phenomenon? I don’t know. To be honest, I haven’t checked the exact implementation in a more recent kernel, so things might have changed here. E-74 itself isn’t a new thing, but the fact that so many users are facing it is.

  16. Unfortunately, E74 decided to show up at my door. Just another Microsoft failure…. And you would think the king of technology would actually try for once, instead of just going straight for the money rather than trying to put in quality work and make customers happy.

  17. Here we go again. My HDMI xbox 360 got the E74 error. I tried replacing the thermal compound on both the CPU and GPU. But, it was a no go.

    Microsoft uses pathetic heatsink technology on these boxes. The x-clamps are crap.

    I’d like to try and reflow my gpu and scaler solder joints but frankly I’m more likely to burn down my house than to achieve such a result.

    I think it’s time to consider getting a PS3. Although, admittedly I have no idea if PS3s have adequate cooling either.

  18. Okay. I got some balls. I pulled the mainboard out of my E74 360. I used a heat gun to heat up the solder points between the HANA chip and the GPU. Re-assembled the unit and … It actually worked.

    The bad news is that the fix only worked for ~ 1 month. I now have another E74 error. Obviously the solder points have disconnected again.

    Fuck Microsoft and their 99 cent X-Clamps. Piece of shit heat sinks are fucking useless.

    The board overheats, flexes, and cracks solder points.

    I’m not spending another dime or minute on these sorry fucking junk boxes anymore.

    !!!! ADRIAN !!!!
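The EDRAM link training that comment 15 describes (render a known texture, compare checksums on both ends, tweak a setting, retry, and give up with E74 after a number of attempts) can be illustrated with a rough Python sketch. This is only a toy model under stated assumptions, not the console's actual boot code: the function names, the retry count of 8, the use of CRC32 as the "checksum", and the simulated link are all hypothetical stand-ins for details the comment does not give.

```python
import zlib
from typing import Callable, Optional

# Hypothetical retry budget; the comment only says "a number of retries".
MAX_RETRIES = 8


def train_edram_link(
    reference_texture: bytes,
    render_and_resolve: Callable[[bytes, int], bytes],
    max_retries: int = MAX_RETRIES,
) -> Optional[int]:
    """Try delay settings 0..max_retries-1 and return the first one whose
    round trip (GDDR3 -> EDRAM -> GDDR3) preserves the checksum.
    Returns None if no setting works."""
    expected = zlib.crc32(reference_texture)
    for delay in range(max_retries):
        readback = render_and_resolve(reference_texture, delay)
        if zlib.crc32(readback) == expected:
            return delay  # checksums match, training succeeded
    return None  # every attempt failed


def boot_status(trained_delay: Optional[int]) -> str:
    """Map the training result to what the user would see."""
    return "ok" if trained_delay is not None else "E74"


def simulated_link(good_delays: set) -> Callable[[bytes, int], bytes]:
    """Hypothetical stand-in for the hardware: the texture survives the
    round trip only when the delay setting is in good_delays."""
    def round_trip(texture: bytes, delay: int) -> bytes:
        if delay in good_delays:
            return texture
        # Flip the first byte to mimic a corrupted transfer.
        return bytes([texture[0] ^ 0xFF]) + texture[1:]
    return round_trip


if __name__ == "__main__":
    texture = bytes(range(16))
    print(boot_status(train_edram_link(texture, simulated_link({3}))))
    print(boot_status(train_edram_link(texture, simulated_link(set()))))
```

In this model a link with no working delay setting and a source texture that is read back wrong from damaged GDDR3 look identical to the training loop, since both just produce a checksum mismatch. That is the point comment 15 makes about E74 not necessarily implicating the EDRAM itself.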
