When all else fails, blame it on the cloud? It seems like that’s the script for just about every outage that makes the news lately, like the Wyze camera outage this week that kept people from seeing feeds from their cameras for several hours. The outage was extensive enough that some users’ cameras weren’t even showing up in the Wyze app, and there were even reports of people seeing thumbnails for cameras they don’t own. That’s troubling, of course, and Wyze seems to have acted on it quickly by disabling a tab in the app that could potentially have let people tap into camera feeds they had no business seeing. Still, it looks like curiosity got the better of some users, with 1,500 of them tapping through motion-event notifications and seeing other people walking around inside unfamiliar houses. The problem was resolved quickly, with blame laid on an “AWS partner” even though there were no known AWS issues at the time of the outage. We’ve said it before and we’ll say it again: security cameras, especially mission-critical ones, have no business being connected with anything but Ethernet or coax, and exposing them to the cloud is a really, really bad idea.
Hackaday Links: October 23, 2022
There were strange doings this week as Dallas-Fort Worth Airport in Texas experienced two consecutive days of GPS outages. The problem first cropped up on the 17th, as the Federal Aviation Administration sent out an automated notice that GPS reception was “unreliable” within 40 nautical miles of DFW, an area that includes at least ten other airports. One runway at DFW, runway 35R, was actually closed for a while because of the anomaly. According to GPSjam.org — because of course someone built a global mapping app to track GPS coverage — the outage only got worse the next day, both spreading geographically and worsening in some areas. Some have noted that the area of the outage abuts Fort Hood, one of the largest military installations in the country, but there doesn’t appear to be any connection to military operations. The outage ended abruptly at around 11:00 PM local time on the 19th, and there’s still no word about what caused it. Loss of GPS isn’t exactly a “game over” problem for modern aviation, but it certainly is a problem, and at the very least it points out how easy the system is to break, either accidentally or intentionally.
In other air travel news, almost as quickly as Lufthansa appeared to ban the use of Apple AirTags in checked baggage, the airline reversed course on the decision. The original decision was supposedly made out of “an abundance of caution” over the tags’ low-power transmitters and the chance that a stowed AirTag’s CR2032 battery could explode. But as it turns out, the Luftfahrt-Bundesamt, the German civil aviation authority, agreed with the company’s further assessment that the tags pose little risk, green-lighting their return to the cargo compartment. What luck! The original ban totally didn’t have anything to do with the fact that passengers were shaming Lufthansa online by tracking their bags with AirTags while the company claimed it couldn’t locate them, and the sudden reversal is surely unrelated to the bad taste this left in passengers’ mouths. Of course, the reversal only opened the door to more adventures in AirTag luggage tracking, so that’s fun.
Energy prices are much on everyone’s mind these days, but the scale of the problem is somewhat a matter of perspective. Take, for instance, the European Organization for Nuclear Research (CERN), which runs a little thing known as the Large Hadron Collider, a 27-kilometer-long machine that smashes atoms together to delve into the mysteries of physics. In an average year, CERN uses 1.3 terawatt-hours of electricity to run the LHC and its associated equipment. Technically, this is what’s known as a hell of a lot of electricity, and given the current energy issues in Europe, CERN has agreed to shut down the LHC a bit early this year, shutting down in late November instead of the usual mid-December halt. What’s more, CERN has agreed to reduce usage by 20% next year, which will increase scientific competition for beamtime on the LHC. There’s only so much CERN can do to reduce the LHC’s usage, though — the cryogenic plant to cool the superconducting magnets draws a whopping 27 megawatts, and has to be kept going to prevent the magnets from quenching.
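For a rough sense of scale, here’s a quick back-of-the-envelope calculation using the figures above. The 27 MW cryogenic draw and the 1.3 TWh annual total come from CERN; treating the cryo plant as running around the clock all year is our own simplifying assumption.

```python
# Back-of-the-envelope: how much of CERN's quoted 1.3 TWh/year could the
# 27 MW cryogenic plant account for on its own?
cryo_power_w = 27e6            # cryo plant draw from the figures above, in watts
hours_per_year = 365 * 24      # assumption: it runs around the clock
annual_lhc_twh = 1.3           # average annual consumption quoted above

cryo_energy_twh = cryo_power_w * hours_per_year / 1e12   # Wh -> TWh
share = 100 * cryo_energy_twh / annual_lhc_twh

print(f"Cryo plant alone: ~{cryo_energy_twh:.2f} TWh/year ({share:.0f}% of the total)")
```

That works out to roughly 0.24 TWh, or a bit under a fifth of the annual budget, which is why the cryogenics put a hard floor under any savings plan.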
And finally, as if the COVID-19 pandemic hasn’t been weird enough, the fact that it has left in its wake survivors whose sense of smell is compromised is alarming. Our daily ritual during the height of the pandemic was to open up a jar of peanut butter and take a whiff, figuring that even the slightest attenuation of the smell would serve as an early warning system for symptom onset. Thankfully, the alarm hasn’t been tripped, but we know more than a few people who now suffer from what appears to be permanent anosmia. It’s no joke — losing one’s sense of smell can be downright dangerous; think “gas leak” or “spoiled food.” So it was with interest that we spied an article about a neuroprosthetic nose that might one day let the nasally challenged smell again. The idea is to use an array of chemical sensors to stimulate an array of electrodes implanted near the olfactory bulb. It’s an interesting idea, and the article provides a lot of fascinating details on how the olfactory sense actually works.
Cascade Failures, Computer Problems, And Ohm’s Law: Understanding The Northeast Blackout Of 2003
We’ve all experienced power outages of some kind, be it a breaker tripping at an inconvenient time or a storm causing a lack of separation between a tree and a power line. The impact is generally localized, and rarely is there a loss of life, though it can happen. But in the video below the break, [Grady] of Practical Engineering breaks down the Northeast Blackout of 2003, the largest power failure ever experienced in North America. Power was out for days in some cases, and almost 100 deaths were attributed to the loss of electricity.
[Grady] goes into a good amount of detail regarding the monitoring systems, software simulation, and contingency planning that go into operating a large-scale power grid. The video explains how inductive loads introduce reactance and how that effect exacerbated an already complex problem. Don’t know what inductive loads and reactance are? That’s okay, the video explains them quite well, and it gives an excellent basis for understanding AC electronics and even the RF theory surrounding inductance, capacitance, and reactance.
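If you want to play with the numbers before (or after) watching, here’s a minimal sketch of the textbook relationships the video leans on. The component values are made up purely for illustration.

```python
import math

# Textbook AC relationships: X_L = 2*pi*f*L, Z = R + jX_L, PF = cos(phase angle)
f = 60.0     # grid frequency in Hz
L = 0.10     # hypothetical load inductance, henries
R = 20.0     # hypothetical load resistance, ohms

X_L = 2 * math.pi * f * L     # inductive reactance, ohms
Z = complex(R, X_L)           # series impedance of the R-L load
power_factor = R / abs(Z)     # cos(theta) for a series R-L load

print(f"X_L = {X_L:.1f} ohm, |Z| = {abs(Z):.1f} ohm, power factor = {power_factor:.2f}")
```

A lower power factor means more current sloshing around for the same useful power, which is the sort of effect the video ties into the 2003 cascade.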
So, what caused the actual outage? The complex cascade failure is explained step by step, and the video is certainly worth the watch, even if you’re already familiar with the event.
It would be irresponsible to bring up the 2003 outage without also mentioning the Texas ERCOT outages from just one year ago, covered in an article whose comments section nearly caused a blackout at the Hackaday Data Center!
Adventures In Power Outage Hacking
The best type of power outage is no power outage, but they will inevitably happen. When they do, a hacker with a house full of stuff and a head full of ideas is often the person of the hour. Or the day, or perhaps the week, should the outage last long past the fun little adventure phase and become a nuisance or even an outright emergency.
Such was the position that [FFcossag] found himself in at the beginning of January, when a freak storm knocked out power to his community on a remote island in the middle of the Baltic Sea. [FFcossag] documented his attempts to survive the eight-day outage in vlog form, and although each entry is fairly long, there’s a lot to be learned from his ordeal. His main asset was a wood cook stove in the basement of the house, which served as his heat source. He used a car radiator and a small water pump to get some heat upstairs – a battery bank provided the power for that, at least for a while. The system evolved over the outage and became surprisingly good at keeping the upstairs warm.
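As a rough illustration of why a battery bank can only carry a circulation pump “for a while,” here’s a back-of-the-envelope runtime estimate. The pump wattage and battery capacity below are hypothetical stand-ins, not figures from the vlog.

```python
# Rough runtime estimate for running a small circulation pump from a battery bank.
battery_wh = 12 * 100        # hypothetical 12 V, 100 Ah battery
usable_fraction = 0.5        # don't run lead-acid much below ~50% charge
pump_w = 30                  # hypothetical small circulation pump

runtime_hours = battery_wh * usable_fraction / pump_w
print(f"Roughly {runtime_hours:.0f} hours before the bank needs a recharge")
```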
The power eventually came back on, but to add insult to injury, almost as soon as it did, the ground-source heat pump in the house went on the fritz. A little sleuthing revealed an open power resistor in the heat pump control panel, but without a replacement on hand, [FFcossag] improvised. Parts from a 30-year-old TV transmitter were close at hand, including a nice handful of power resistors. A small parallel network gave the correct value and the heat pump came back online.
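The math behind the fix is the familiar parallel-resistance formula, which lets you hit an oddball value with whatever standard parts are on hand. Here’s a quick sketch; the target and the salvaged values are made up for illustration, not taken from the actual repair.

```python
def parallel(*resistors):
    """Equivalent resistance of parallel resistors: 1/R_eq = sum(1/R_i)."""
    return 1.0 / sum(1.0 / r for r in resistors)

# Hypothetical example: approximating a 30-ohm power resistor from salvaged parts.
print(round(parallel(100, 47), 1))       # 100 || 47            -> 32.0 ohms
print(round(parallel(100, 47, 470), 1))  # add a third to trim  -> 29.9 ohms
```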
All in all, it was a long, cold week for [FFcossag], but he probably fared better than his neighbors. Want to be as prepared for your next outage? Check out [Jenny]’s comprehensive guide.
Amazon S3: Out Like A Light; On Like A Bathtub
You no doubt heard about the Amazon S3 outage that happened earlier this week. It was reported far and wide by media outlets that normally don’t delve into the details of the technology supporting our connected world. It’s interesting to consider that most people have heard of The Cloud, but never of AWS, and certainly not of S3.
We didn’t report on the outage, but we ate up the details of the aftermath, which make for an excellent look under the hood. We say kudos to Amazon for adding to the growing trend of companies sharing the gory details surrounding events like this, so that we can all understand what went wrong and how they plan to avoid a repeat in the future.
It turns out the S3 team was working on a problem with part of the billing system, and needed to take a few servers down to do so. An incorrect command used when taking those machines down ended up affecting a much larger block of servers than intended. So they went out like a light switch, but turning that switch back on wasn’t nearly as easy.
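One general defense against this class of mistake is a sanity cap on how much capacity a single command can remove. Here’s that idea in miniature; it’s our own illustration of the pattern, not Amazon’s actual tooling, and the host names and threshold are invented.

```python
# Illustrative sketch only -- not Amazon's actual tooling. The idea: refuse to
# pull more than a small slice of a fleet out of service in one go, no matter
# what the operator typed.
MAX_REMOVAL_FRACTION = 0.05   # hypothetical cap: at most 5% of the fleet per command

def remove_capacity(fleet, requested):
    """Take hosts out of service, but never more than the safety cap allows."""
    cap = max(1, int(len(fleet) * MAX_REMOVAL_FRACTION))
    if len(requested) > cap:
        raise ValueError(f"Refusing to remove {len(requested)} hosts at once; "
                         f"the cap is {cap} of {len(fleet)}. Use smaller batches.")
    removed = set(requested)
    return [host for host in fleet if host not in removed]

fleet = [f"index-host-{n}" for n in range(100)]
fleet = remove_capacity(fleet, fleet[:4])     # the intended handful of hosts
# remove_capacity(fleet, fleet[:40])          # a fat-fingered request gets stopped here
```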
The servers that went down run various commands in the S3 API. With the explosive growth of the Simple Storage Service, this “reboot” hadn’t been tried in several years and took far longer than expected. Compounding this was a backlog of tasks that built up while they were bringing the API servers back online. Working through that backlog took time as well. The process was like waiting for a bathtub to fill up with water. It must have been an agonizing process for those involved, but certainly not as bad as the folks who had to restore GitLab service a few weeks back.
[via /r/programming]
Fanboys Want To Take AT&T Down
A post about Operation Chokehold popped up on (fake) Steve Jobs’ blog this morning. It seems some folks are just plain tired of AT&T making excuses about its network. The straw that broke the camel’s back came when AT&T floated the idea of instituting bandwidth limits on data accounts. Now someone has hatched the idea of organizing enough users to bring the whole network down by maxing out their bandwidth at the same time.
We’re not quite sure what to think about this. Our friend Google told us that there’s plenty of press already out there regarding Operation Chokehold so it’s not beyond comprehension that this could have an effect on the network. On the other hand, AT&T already knows about it and we’d wager they’re working on a plan to mitigate any outages that might occur.
As for the effectiveness of the message? We’d have more sympathy for AT&T if they didn’t have exclusivity contracts for their smartphones (most notably the iPhone). And if you’re selling an “Unlimited Plan,” it should be just that. What do you think?
[Thanks Bobbers]
Gmail Without The Cloud: Tips For Next Time
Yesterday’s Gmail service outage is a hot topic on just about every news site right now. For the many of us who have always taken Gmail’s reliability for granted, it was a real shock to lose all of the functionality of the web-based system. Now that we’ve learned our lesson, here are a couple of tips to help you out the next time there’s an outage.
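One of the simplest ways to be ready next time is to keep a local copy of your mail over IMAP, so an outage only takes out the web interface, not your archive. Here’s a minimal sketch using Python’s standard imaplib; the credentials are placeholders, and you’ll need IMAP enabled on the account (plus an app password) for it to work.

```python
import email
import imaplib

# Placeholders -- substitute your own address and a Gmail app password.
USER = "you@gmail.com"
PASSWORD = "app-password-here"

imap = imaplib.IMAP4_SSL("imap.gmail.com")
imap.login(USER, PASSWORD)
imap.select("INBOX", readonly=True)

# Save the five most recent messages locally as .eml files.
_, data = imap.search(None, "ALL")
for num in data[0].split()[-5:]:
    _, msg_data = imap.fetch(num, "(RFC822)")
    raw = msg_data[0][1]
    with open(f"backup_{num.decode()}.eml", "wb") as f:
        f.write(raw)
    print("Saved:", email.message_from_bytes(raw).get("Subject"))

imap.logout()
```

Run it on a schedule and the next cloud hiccup costs you nothing but the pretty interface.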