On the 3rd of June 2019, a 1U CubeSat developed by students of the AGH University of Science and Technology in Kraków was released from the International Space Station. Within a few hours it was clear something was wrong, and by July 30th, the satellite was barely functional. A number of problems contributed to the gradual degradation of the KRAKsat spacecraft, which the team has thoroughly documented in a recently released paper.
We all know, at least in a general sense, that building and operating a spacecraft is an exceptionally difficult task on a technical level. But reading through the 20 pages of “KRAKsat Lessons Learned” gives you practical examples of just how many things can go wrong.
It all started with a steadily decreasing battery voltage. The voltage was dropping slowly enough that the team knew the solar panels were doing something, but unfortunately the KRAKsat didn’t have a way of reporting their output. This made it difficult to diagnose the energy deficit, but the team believes the issue may have been that the tumbling of the spacecraft meant the panels weren’t exposed to the amount of direct sunlight they had anticipated.
This slow energy drain continued until the voltage dropped to the point that the power supply shut down, and that’s where things really started going south. Once the satellite shut down, the batteries were able to start charging back up, which normally would have been a good thing. But unfortunately the KRAKsat had no mechanism to remain powered down once the voltage climbed back above the shutoff threshold. This caused the satellite to enter a loop where it would reboot itself as many as 150 times per orbit (approximately 90 minutes).
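One common way to avoid this kind of boot loop is hysteresis: the load is cut off below one voltage but only switched back on above a noticeably higher one. What follows is a minimal sketch of that idea, not anything taken from the KRAKsat paper. The class name, the thresholds, and the voltage samples are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class UndervoltageLockout:
    """Load switch with separate cut-off and resume thresholds (hypothetical values)."""
    cutoff_v: float   # assumed: switch the load off below this voltage
    resume_v: float   # assumed: switch the load back on only at or above this voltage
    powered: bool = True

    def update(self, battery_v: float) -> bool:
        """Feed in one battery voltage sample and return the new power state."""
        if self.powered and battery_v < self.cutoff_v:
            self.powered = False
        elif not self.powered and battery_v >= self.resume_v:
            self.powered = True
        return self.powered


def count_boots(lockout: UndervoltageLockout, voltages: list[float]) -> int:
    """Count off-to-on transitions (boots) over a sequence of voltage samples."""
    boots = 0
    was_on = lockout.powered
    for v in voltages:
        now_on = lockout.update(v)
        if now_on and not was_on:
            boots += 1
        was_on = now_on
    return boots


# Made-up samples of a battery hovering around the cut-off, then recovering.
samples = [3.31, 3.29, 3.31, 3.29, 3.31, 3.29, 3.45, 3.61]

single_threshold = UndervoltageLockout(cutoff_v=3.30, resume_v=3.30)
with_hysteresis = UndervoltageLockout(cutoff_v=3.30, resume_v=3.60)

print("boots without hysteresis:", count_boots(single_threshold, samples))
print("boots with hysteresis:", count_boots(with_hysteresis, samples))
```

With these samples the single-threshold version boots three times, while the version with hysteresis stays off until the battery has genuinely recovered and boots only once.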
The paper then goes into a laundry list of other problems that contributed to KRAKsat’s failure. For example, the satellite had redundant radios onboard, but the software on them wasn’t identical. When they needed to switch over to the secondary radio, they found that a glitch in its software meant it was unable to access some portions of the onboard flash storage. The team also identified the lack of a filesystem on the flash storage as another stumbling block; having to pull things out using a pointer and the specific memory address was a cumbersome and time-consuming task made all the more difficult by the spacecraft’s deteriorating condition.
Fair warning: any homeowners who have thermostats similar to the one that nearly burned down [Kerry Wong]’s house might be in store for a sleepless night or two, at least until they inspect and perhaps replace any units that are even remotely as sketchy as what he found when he did the postmortem analysis in the brief video below.
The story begins back in the 1980s, when the Southern New England area where [Kerry] lives enjoyed a housing boom. Contractors rushed to turn rural farmland into subdivisions, and new suburbs crawled across the landscape. Corners were inevitably cut during construction, and one common place to save money was the home’s heating system. Rather than engage an HVAC subcontractor to install a complicated heating system, many builders opted instead to have the electricians install electric baseboards. They were already on the job anyway, and at the time, both copper and electricity were cheap.
Fast forward 40 years or so, and [Kerry] finds himself living in one such house. The other night, upon catching the acrid scent of burning insulation, he followed his nose to the source: a wall-mounted thermostat for his electric baseboard. His teardown revealed burned insulation, bare conductors, and scorched plastic on the not-so-old unit; bearing a 2008 date code, the thermostat must have replaced one of the originals. [Kerry] poked at the nearly combusted unit and found the root cause: the spot welds holding the wires to the thermostat terminal had become loose, increasing the resistance of the connection. As [Kerry] points out, even a tenth-of-an-ohm increase in resistance in a 15-amp circuit would dissipate more than 20 watts of heat, and from the toasty look of the thermostat it had been a lot more than that.
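The arithmetic behind that figure is just Joule heating, P = I²R. Here is a quick sketch using the numbers quoted above; the function name is ours, and it assumes a steady current through a purely resistive joint.

```python
def joule_heating_watts(current_a: float, resistance_ohm: float) -> float:
    """Power dissipated in a resistance carrying a steady current: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


# A 0.1 ohm bad joint in a 15 A baseboard heater circuit.
extra_heat_w = joule_heating_watts(15.0, 0.1)
print(f"{extra_heat_w:.1f} W dissipated in the joint")
```

That works out to 22.5 W concentrated in a joint inside a small plastic enclosure, and because the power grows linearly with resistance, a joint that degrades further only gets hotter.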
The corner-cutting of the 1980s was nothing new, of course – remember the aluminum wiring debacle? Electrical fires are no joke, and we’re glad [Kerry] was quick to locate the problem and prevent it from spreading.
We’ve become sadly accustomed to consumer devices that seem to give up the ghost right after the warranty period expires. And even when we get “lucky” and the device fails while it’s still covered, chances are that there will be no attempt to repair it; the unit will be replaced with a new one, and the failed one will get pitched in the e-waste bin.
Not every manufacturer takes this approach, however. When premium quality is the keystone of your brand, you need to take field failures seriously. [Dalibor Farný], maker of high-end Nixie tubes and the sleek, sophisticated clocks they plug into, realizes this, and a new video goes into depth about the process he uses to diagnose issues and prevent them in the future.
One clock with a digit stuck off was traced to via failure by barrel fatigue, or the board material cracking inside the via hole and breaking the plated-through copper. This prompted a board redesign to increase the diameter of all the vias, eliminating that failure mode. Another clock had a digit stuck on, which ended up being a short to ground caused by pin misalignment; when the tube was plugged in, the pins slipped and scraped some solder off the socket and onto the ground plane of the board. That resulted in another redesign that not only fixed the problem by eliminating the ground plane on the upper side of the board, but also improved the aesthetics of the board dramatically.
As with all things [Dalibor], the video is a feast for the eyes with the warm orange glow in the polished glass and chrome tubes contrasting with the bead-blasted aluminum chassis. If you haven’t watched the “making of” video yet, you’ve got to check that out too.
[Mastro Gippo] recently purchased a wall mounted charger for his electric car that looked great and had all the bells and whistles he wanted. There was only one problem: the thing burned up on him. Looking to find out how this seemingly high-end piece of hardware gave up the ghost so easily, he took it apart and tried to figure out where things went wrong. While he’s not looking to sling any mud and actually name the company who produced the charger, he certainly has some choice words for whoever green-lit this particular design.
With the charger open, there’s little doubt that something became very toasty inside. A large swath of the PCB has a black char mark on it, and in some places it looks like the board burned right through. After a close examination, [Mastro] is of the opinion that the board heated up to the point that the solder actually liquified on some connections. This conductive flow then shorted out components below it, and things went from bad to worse.
But where did all the heat come from? [Mastro] was stunned to see that a number of the components inside the charger were only rated for 30 amps, despite the label for the product clearly stating it’s good for up to 32A. With components pushed past their limits, something had to give. He wonders how such a device could have made it through the certification process; an excellent question we’d love to know the answer to.
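A 2 amp shortfall might not sound like much, but resistive heating scales with the square of the current. As a rough sketch, assuming the losses in the underrated parts are purely resistive and their resistance stays constant (the write-up doesn't say which components were involved), the extra heat can be estimated like this:

```python
def heating_ratio(actual_a: float, rated_a: float) -> float:
    """Resistive heating at actual_a relative to heating at rated_a (scales with I^2)."""
    return (actual_a / rated_a) ** 2


# Label current of 32 A against components rated for 30 A.
ratio = heating_ratio(32.0, 30.0)
print(f"{(ratio - 1) * 100:.0f}% more heat than at the rated current")
```

That is roughly 14% more heat than the parts would see at their rated limit, before accounting for any safety margin a designer would normally leave below that limit.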
The worst part is, it looks like the designers might have even known there was an overheating issue. [Mastro] notes that there are heatsinks bolted not to a component as you might assume, but directly to the PCB itself. We’ve seen what happens when designers take a cavalier attitude towards overheating components, and the fact that something like an electric vehicle charger was designed so poorly is quite concerning.
The popular press was recently abuzz with sad news from the planet Mars: Opportunity, the little rover that could, could do no more. It took an astonishing 15 years for it to give up the ghost, and it took a planet-wide dust storm that blotted out the sun and plunged the rover into apocalyptically dark and cold conditions to finally kill the machine. It outlived its 90-sol design life more than 50 times over, producing mountains of data that will take another 15 years or more to fully digest.
Entire careers were unexpectedly built around Opportunity – officially but bloodlessly dubbed “Mars Exploration Rover-B”, or MER-B – as it stubbornly extended its mission and overcame obstacles both figurative and literal. But “Oppy” is far from the only long-duration success that NASA can boast about. Now that Opportunity has sent its last data, it seems only fitting to celebrate the achievement with a look at exactly how machines and missions can survive and thrive so long in the harshest possible conditions.
There’s a reason we often use the phrase “It ain’t Rocket Science”. Because real rocket science IS difficult. It is dangerous and complicated, and a lot of things can and do go wrong, often with disastrous consequences. It is imperative that the lessons learned from past failures be documented and disseminated to prevent future mishaps. This is much easier said than done. There’s a large number of agencies and laboratories working on multiple projects over long periods of time. Which is why NASA has set up NASA Lessons Learned, a central, online database of issues documented by contributors from within NASA as well as other organizations.
Unfortunately, even this body of past knowledge is sometimes not enough to prevent problems. Case in point is a recently discovered issue on the ISS: a completely avoidable power supply mistake. Science payloads attach to the ISS via holders called the ExPRESS logistics carriers. These provide mechanical anchoring, electrical power and data links. Inside the carriers, the power supply meant to supply 28V to the payloads was found to have a few capacitors mounted the other way around. This has forced the payloads to use the 120V supply instead, which means retrofitting each one with an additional 120V to 28V converter. That in turn means modifying the existing hardware and factoring in additional weight, volume, heat, cost and other issues. If you’d like to dig into the details, check out this article about NASA’s power supply fail.
One dark and stormy morning, Dr. Richard Noirimetla, private failure investigator, was sitting at his desk nursing his morning cup of joe. It was an addiction, but life, and engineering, was hard. Intense eyes sat in a round dark-skinned face. An engineering degree from the prestigious Indian Institute of Technology hung on the wall of his sparse office. Lightning flashed outside, and the rain began to beat even harder against his corner office windows.
His phone rang.
“Hello, Dr. Noirimetla, Private Failure Investigator here,” he said in a deep, polite voice. “How may I help you?”
“Ah, I’m Chief of Manufacturing for Galileo Concrete Pillars Inc. We have a bit of a problem here. We used to see a failure rate above 33% for our concrete pillar operation. As part of our lean manufacturing efforts we tried to reduce that number through various improvements. However, we see a failure rate of almost 50% now. We suspect foul play… from one of our suppliers. Can you come right away?” a worried man’s voice sounded over the phone.
“I see, that’s very troubling,” Noirimetla rumbled. “I’ll send over the contract detail. There will be an increased fee, but I’m on my way.”