Saving A Samsung TV From The Dreaded Boot Loop

[eigma] had a difficult problem. After pulling a TV out of the trash and bringing it home, it turned out it was suffering from a troubling boot loop issue that basically made it useless. As so many of us do, they decided to fix it…which ended up being a far bigger task than initially expected.

The TV in question was a Samsung UN40H5003AF. Powering it up would net a red standby light which would stay on for about eight seconds. Then it would flicker off, come back on, and repeat the cycle. So far, so bad. Investigation began with the usual—checking the power supplies and investigating the basics. No easy wins were found. A debug UART provided precious little information, and schematics proved hard to come by.

Eventually, though, investigation dialed in on a 4 MB SPI flash chip on the board. Dumping the chip revealed that the onboard firmware was corrupt. Upon further tinkering, [eigma] figured that most of the dump looked valid. On a hunch, suspecting that maybe just a single bit was wrong, they came up with a crazy plan: use a script to brute-force flipping every single bit until the firmware’s CRC check came back valid. It took eighteen hours, but the script found a valid solution. Lo and behold, burning the fixed firmware to the TV brought it back to life.
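To give a feel for the idea, here is a minimal sketch of that kind of brute-force search. It is not [eigma]'s actual script: it assumes the checksum is a standard CRC-32 (as computed by Python's `zlib.crc32`) and that the expected CRC value is already known. The function name and parameters are made up for illustration, and the real firmware layout may differ.

```python
import zlib


def find_single_bit_fix(data: bytes, expected_crc: int):
    """Flip each bit in turn until the CRC-32 of the image matches.

    Returns (byte_index, bit_index, repaired_bytes), or None if no
    single-bit flip produces the expected CRC.
    Assumes a plain zlib-style CRC-32 over the whole buffer.
    """
    buf = bytearray(data)
    for byte_index in range(len(buf)):
        for bit_index in range(8):
            mask = 1 << bit_index
            buf[byte_index] ^= mask          # try flipping this bit
            if zlib.crc32(buf) == expected_crc:
                return byte_index, bit_index, bytes(buf)
            buf[byte_index] ^= mask          # no match, undo the flip
    return None


if __name__ == "__main__":
    good = bytes(range(256)) * 4             # stand-in for a firmware image
    expected = zlib.crc32(good)
    damaged = bytearray(good)
    damaged[100] ^= 0x10                     # simulate one flipped bit
    print(find_single_bit_fix(bytes(damaged), expected)[:2])
```

Note that this recomputes the CRC over the whole image for every candidate bit, so the work grows with the square of the image size. That is fine for a toy buffer, and it may be part of why a search like this can run for hours on a multi-megabyte dump.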

It feels weird for a single bit flip to kill an entire TV, but this kind of failure isn’t unheard of. We’ve seen other dedicated hackers perform similar restorations previously. If you’re out there valiantly rescuing e-waste with these techniques, do tell us your story, won’t you?

39 thoughts on “Saving A Samsung TV From The Dreaded Boot Loop”

  1. “It feels weird for a single bit flip to kill an entire TV…”

    As components get smaller, the scale of things that represent fundamental hazards does as well. It’s not my specialty (by a lot) but with SPI flash relying on “the presence or absence of trapped electrons on the floating gate” (yeah, I googled it…) for memory storage, things like charged particles/cosmic rays come into play in creating random failures. You can fill in all the vacuum tube/555 timer jokes here, but I think we’ll see more and more of this as the trend continues.

    The manufacturers will likely just warranty out the customer complaints until the perceived service span of the devices begins to be questioned (eg. “All Samsung TVs die after about 14 months”) and it begins to look like a market-share problem that can’t be solved by monopolization.

    1. Samsung warranty the customers? Hell no! 5 TVs and 3 monitors. Within 2 years, 3 TVs are dead or about dead with one doing the boot loop, 2 TVs with lines missing on the screens, 1 with dark spots extremely visible on a uniform white background. One monitor once in a while decides to show noise on the screen. I have to turn the screen off and on to restore it. Another one has the bezel separating itself from the monitor and needs to be snapped back into place.
      Answer from Samsung: Sorry, buy a new one.

      1. I had a SwampScum monitor on my PC. When it started to act up, I ended up breaking the case and then the screen trying to open it up. I guess the plastic case was welded together.
        Yeah, and no Service Manual available.

    2. Or alternatively arm their service ‘techs’ with a stanley knife (box cutter for those across the ocean) who subtly stab the TV whilst the owner is away then claim the TV was damaged and warranty void. Friends don’t let friends buy Samsung TVs.

    3. Intel had a batch of (Atom?) CPUs which died after ~12-18m. Unfortunately they went into NASs.
      NAS manufacturer sent me a replacement under warranty for the first one, but when that died was out of warranty. Intel refused to accept responsibility. Thankfully their later model had a different CPU so I just replaced with that.
      You don’t know who has the issues until they happen. And it’s often not the manufacturers’ fault, it’s a part they bought and thought they could trust.

      1. Don’t simply accept the manufacturer’s warranty terms – there’s a good chance that you have statutory warranty rights that extend longer than that, potentially up to the expected lifetime of the device. The manufacturer will generally do everything they can to pretend they don’t exist, or insist that they have already expired, but invoking your local consumer affairs entity is often enough to get them to suddenly decide they can “do you a favour”.

        Capitalism!

    4. With SPI (NOR flash) it’s not radiation, they’re stupidly more radiation resistant than NAND/DRAM/etc. It’s more likely either static or temperature plus age: the floating gate structure breaks down and the barrier becomes smaller. Happens with NAND too but because it happens so often the NAND controller refreshes it.

  2. I’m going with a different tack on this than “not building to last”.

    I think this is why we should move to separating the panel and controller in displays. I think it’d be much more interesting if we could swap out the controller for a good display to something with better features for less cost than an entire new TV when a new HDMI standard comes out.

    This could also make for more interest in open source display controllers that could do more crazy or interesting and advanced tricks to get more out of panels.

      1. I think it’d be interesting to go even further than a dumb TV.

        I mean something where there is only an input for some universal display controller standard and the controller handles inputs and everything.

        I think a TV that is literally just a panel with a PSU would be very interesting.

          1. Almost, but a monitor still has a display controller and takes HDMI or DP, then converts, scales, etc. and then writes to the panel.

            I mean a panel nearly by itself.

      2. I wonder if the TV would have worked fine if Samsung just had a CRC warning, while letting the boot continue? There is a decent chance the corruption would be in something nonessential. One could make products last longer by just being less finicky.

    1. Wholeheartedly agree. A display should be just that. But then consumers want more than a display, they want the smartest of smart TVs which now the manufacturers are adding in all sorts of privacy/ad ‘features’ so if you could buy a display it would likely cost a hell of a lot more than the equivalent ‘smart’ TV.

      Just as a use case, apparently my flagship Sony TV supports Alexa (yes I have Alexa spying on me). However in my European region the app isn’t allowed, because. So I splashed out on a Fire TV bought in the UK which allows my old Echo to control the TV.

    2. If they made a system where a single bitflip kills it, they did it intentionally.

      I don’t mean “mustache twisting evil”.

      I mean “Hey, isn’t this going to be bad for people? We should do it in a way that fails gracefully. Management do it the cheaper way or you are fired.”

      That is willful.

      We CHOOSE to make disposable products.
      We CHOOSE to value saving 1% on a BOM and then throw it away while 99% is perfectly serviceable.

      This is what you get when you over optimize a system towards making “line go up” instead of making “good thing”.

    1. You just took out the tubes and ran them to the local store and plugged them into the tube tester and found the bad one and bought a replacement.

      TVs these days are basically free. If a 50″ TV is $250, accounting for inflation it’s less than half the cost of the 13″ Sony my Dad bought in the mid-80s. And a 42″ will probably do fine as well. If you are buying the latest Quantum Dot Organic LED model, then get an extended warranty?

      The TVs 50 years ago would get an error in the vertical sync and the picture would bounce up and down. I remember my eyes ‘bouncing’ on their own for a little while after playing video games at a friend’s house X’D

  3. My TV repair story is different, but lacking any documentation of the process, there’s no story to submit. So I guess a comment will have to do.

    My parent’s aging TV developed a fault where the picture would go out after a short period of time, five or ten seconds I think it was. After a bit of troubleshooting, I determined that the panel was still displaying the picture, however the backlight was going out. Great, probably just a power supply problem, right?

    Nope! Near as I could tell, it was fine. Alas, the testing methodology eludes me so I can’t elaborate. I just know I somehow came to the conclusion that it was working properly.

    So I kept poking around the innards of the TV, and eventually found a line that used a 0-5v signal to control the brightness of the backlight. Except that whatever was driving this was the source of my problem. I’d get 5v for a short bit on power-on, and then it’d drop out. Since the TV was otherwise trashed, and had in fact already been replaced, I chanced tying it to the 5v rail of the PSU, and tada! Permanent, 100% brightness backlight.

    Sure, it’s not possible to adjust the brightness anymore, and for a while I was concerned something might burn out because at the time it didn’t occur to me to slap a resistor in there, but it’s been running that way for years now.

    And best of all, it’s not a smart TV >_>

  4. i love that bit flip approach! dumb but effective

    speaking of flipped bits…i read somewhere that RAM had a certain number of flipped bits per month per gigabyte or whatever and figured my PC with 16GB of RAM must have bit flips in RAM??? i couldn’t believe it. so i made a memtest program that tests 4GB of RAM at a time (using mlock()), periodically re-allocating the memory to get a different set of pages. and lo, it revealed a flip! i noticed after a few months of running it that it was one of three specific bits each time…so i blacklisted those addresses in the kernel commandline and that ‘solved’ the problem. i’ve since bought new RAM and i haven’t been able to detect a fault in it.

    no punchline just an anecdote :)

  5. Not using any particularly sophisticated techniques but I’ve diverted 8 or so cheap white box dumb TVs from their dumpster destiny.

    15 or so were purchased and installed all in one go. A couple of years later they started failing in quick succession. I was the guy installing the replacements so drug the dead ones home to investigate.

    Bad LEDs turned out to be the culprit. As I had broken the LCD on a couple while trying to disassemble them, I had no guilt about cannibalizing parts.

    There were 9 strips of five 6-volt LEDs in series. One dead LED took out a bank of 3 strips but the remaining backlight segments would still light. A second dead LED in another bank shut down the backlight.

    Once I figured out the two-dead-all-dead pattern, it was just a matter of probing each LED with a 6 volt wall wart to find the bad ones and replacing the strips where the bad ones were.

    I now have the issue of what to do with the pile of 55″ TVs. Talk about your first world problems….

  6. Good effort! Lucky if indeed one bit was actually flipped (could have found a different solution as well now). I wonder how these flash chips typically fail. Only experience I have is after overheating a pcb for a night. Comparing the flash afterwards, looked like most bytes changed. If I even compared with the correct binaries, not sure..

    If he was able to verify the CRC outside of the tv, I wonder if recomputing it with the single bit flip still in would have worked too. Good chance it will ?

  7. Something interesting I found while developing a project at work:

    If there is a CRC with known parameters, you can correct any 1-bit error if you have the failed CRC value. Many 2-bit errors can be corrected but it works a little like even-numbered-roots – there can be two possible ‘corrections’ to the error and only one is the original data.

    That being said, trying every bit was probably faster than writing code for CRC error correction.
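For the curious, the single-bit case from this comment can be sketched in a few lines. This assumes a standard CRC-32 as computed by `zlib.crc32`, and that both the stored (expected) CRC and the corrupted data are available; the function names are invented for illustration. Because a CRC is linear over XOR, the difference between the computed and expected CRC depends only on where the flipped bit sits, so one pass over the bit positions is enough to find it.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial, as used by zlib


def _step(state: int) -> int:
    """Advance the CRC shift register by one zero input bit."""
    return (state >> 1) ^ (POLY if state & 1 else 0)


def locate_single_bit_error(data: bytes, expected_crc: int):
    """Return (byte_index, bit_index) of a single flipped bit, or None.

    None means either the CRC already matches or the damage is not a
    single-bit error. Assumes zlib-style CRC-32 over the whole buffer.
    """
    syndrome = zlib.crc32(data) ^ expected_crc
    if syndrome == 0:
        return None
    # Signature of an error in the most significant bit of the last byte:
    # the bit enters the register as 0x80, then that byte's 8 shifts follow.
    state = 0x80
    for _ in range(8):
        state = _step(state)
    # Walk backwards through the data; each earlier bit position
    # corresponds to exactly one more shift of the register.
    for byte_index in range(len(data) - 1, -1, -1):
        for bit_index in range(7, -1, -1):
            if state == syndrome:
                return byte_index, bit_index
            state = _step(state)
    return None


def repair(data: bytes, expected_crc: int) -> bytes:
    """Flip the located bit back; returns data unchanged if none is found."""
    hit = locate_single_bit_error(data, expected_crc)
    if hit is None:
        return data
    fixed = bytearray(data)
    fixed[hit[0]] ^= 1 << hit[1]
    return bytes(fixed)
```

Compared with recomputing the CRC for every candidate flip, this needs one CRC computation plus one register step per bit, which is the kind of speedup the comment hints at.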
