For readers that might not spend their free time watching spools of PLA slowly unwind, The Spaghetti Detective (TSD) is an open source project that aims to use computer vision and machine learning to identify when a 3D print has failed and resulted in a pile of plastic “spaghetti” on the build plate. Once users have installed the OctoPrint plugin, they need to point it to either a self-hosted server that’s running on a relatively powerful machine, or TSD’s paid cloud service that handles all the AI heavy lifting for a monthly fee.
Unfortunately, 73 of those cloud customers ended up getting a bit more than they bargained for when a configuration flub allowed strangers to take control of their printers. In a frank blog post, TSD founder Kenneth Jiang owns up to the August 19th mistake and explains exactly what happened, who was impacted, and how changes to the server-side code should prevent similar issues going forward.
For the record, it appears no permanent damage was done, and everyone who was potentially impacted by this issue has been notified. There was a fairly narrow window of opportunity for anyone to stumble upon the issue in the first place, meaning any bad actors would have had to be particularly quick on their keyboards to come up with some nefarious plot to sabotage any printers connected to TSD. That said, one user took to Reddit to show off the physical warning their printer spit out, apparently the handiwork of a fellow customer who discovered the glitch on their own.
According to Jiang, the issue stemmed from how TSD associates printers and users. When the server sees multiple connections coming from the same public IP, it’s assumed they’re physically connected to the same local network. This allows the server to link the OctoPrint plugin running on a Raspberry Pi to the user’s phone or computer. But on the night in question, an incorrectly configured load-balancing system stopped passing the source IP addresses to the server. This made TSD believe all of the printers and users who connected during this time period were on the same LAN, allowing anyone to connect with whatever machine they wished.
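To make the failure mode concrete, here is a minimal sketch of IP-based grouping. It is not TSD's actual code: the function name, the client names, and the addresses are all made up for illustration, and the balancer address `10.0.0.5` is an assumption.

```python
from collections import defaultdict


def group_by_source_ip(connections):
    """Group client names by the source IP the server observed."""
    groups = defaultdict(list)
    for name, source_ip in connections:
        groups[source_ip].append(name)
    return dict(groups)


# Normal operation: each household shows up under its own public IP.
normal = [
    ("alice-printer", "203.0.113.10"),
    ("alice-phone", "203.0.113.10"),
    ("bob-printer", "198.51.100.7"),
    ("bob-phone", "198.51.100.7"),
]

# Misconfigured load balancer (hypothetical address): the real source IP is
# lost, so every connection appears to come from the balancer itself.
LOAD_BALANCER_IP = "10.0.0.5"
broken = [(name, LOAD_BALANCER_IP) for name, _ in normal]

normal_groups = group_by_source_ip(normal)
broken_groups = group_by_source_ip(broken)

print(len(normal_groups), "groups when source IPs are passed through")
print(len(broken_groups), "group when the balancer hides them")
```

With the source addresses intact, Alice and Bob land in separate groups. Once every connection carries the same address, Bob's phone and Alice's printer end up in one group, which is the situation the article describes.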
The mix-up only lasted about six hours, and so far, only the one user has actually reported their printer being remotely controlled by an outside party. After fixing the load-balancing configuration, the team also pushed an update to the TSD code which puts a cap on how many printers the server will associate with a given IP address. This seems like a reasonable enough precaution, though it’s not immediately obvious how this change would impact users who wish to add multiple printers to their account at the same time, such as in the case of a print farm.
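The blog post does not spell out how the cap behaves, so the following is only a sketch of one plausible reading: if too many printers share an address, the server declines to offer any of them. The limit of 3 and every name here are assumptions, not values from the TSD repository.

```python
MAX_PRINTERS_PER_IP = 3  # hypothetical limit, not taken from TSD


def discoverable_printers(printers_by_ip, client_ip, limit=MAX_PRINTERS_PER_IP):
    """Return the printers a client may link to, or nothing if the IP looks crowded."""
    candidates = printers_by_ip.get(client_ip, [])
    if len(candidates) > limit:
        # Too many printers behind one address: treat the IP as untrustworthy.
        return []
    return list(candidates)


home = {"203.0.113.10": ["ender3", "prusa"]}
farm = {"198.51.100.7": ["p1", "p2", "p3", "p4", "p5"]}

print(discoverable_printers(home, "203.0.113.10"))
print(discoverable_printers(farm, "198.51.100.7"))
```

Under this reading, the five-printer farm gets an empty list, which illustrates the print farm concern raised above: a legitimate user with many machines looks the same as a pooled group of strangers.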
While no doubt an embarrassing misstep for the team at The Spaghetti Detective, we can at least appreciate how swiftly they dealt with the issue and their transparency in bringing the flaw to light. This is also an excellent example of how open source allows the community to independently evaluate the fixes applied by the developer in response to a discovered flaw. Jiang says the team will be launching a full security audit of their own as well, so expect more changes getting pushed to the repository in the near future.
We were impressed with TSD when we first covered it back in 2019, and are glad to see the project has flourished since we last checked in. Trust is difficult to gain and easy to lose, but we hope the team’s handling of this issue shows they’re on top of things and willing to do right by their community even if it means getting some egg on their face from time to time.
This is a very interesting glitch – but what baffles me is that the adjustment they made still doesn’t quite tackle the security vulnerability.
I’m reasonably sure that faking your public IP is somewhat doable, or you can do things like get into the WiFi network with the printer with a bit of effort and then connect to the printer…
What I’d love to know is why they went for this really weird solution in the first place, rather than using a system like QR-Code scanning, i.e. OctoPrint’s TSD generates a token and shows the user a QR Code, which can be scanned – and the server then uses this token to connect the two services.
That’d be much less likely to create issues I’d say, since you’d need to get into the OctoPrint settings page first.
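A rough sketch of the token pairing idea described in this comment, assuming the plugin generates a random token, announces it to the server, and displays it as a QR code for the phone to scan. The class and method names are hypothetical, and the QR rendering itself is left out; only the token handling is shown.

```python
import secrets


def make_pairing_token():
    """Plugin side: random token that would be encoded into the QR code."""
    return secrets.token_urlsafe(32)


class PairingServer:
    """Server side: links a printer to whoever presents its token."""

    def __init__(self):
        self._pending = {}  # token -> printer_id
        self.links = {}     # printer_id -> user_id

    def announce(self, printer_id, token):
        # Called by the plugin after it generates the token.
        self._pending[token] = printer_id

    def claim(self, user_id, token):
        # Called by the phone app after scanning; each token works once.
        printer_id = self._pending.pop(token, None)
        if printer_id is None:
            return False
        self.links[printer_id] = user_id
        return True


server = PairingServer()
token = make_pairing_token()
server.announce("printer-1", token)
print(server.claim("alice", token))    # scanned the real code
print(server.claim("mallory", token))  # token already used
```

The design point is that linking depends on a secret only visible in the OctoPrint settings page, not on anything the network reports about the client.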
The program seems like a great idea but the requirement to be connected to the cloud is an instant turn off. It would be nice if you could run it on a local server…
You can, as mentioned in the first paragraph…
Burn!
I built a similar system, only it had all the tensorflow stuff local to check how it’s printing. It would continuously learn from the printer it was attached to & theoretically get better over time. The pi is on my home network for Octoprint, but I took out the gateway. I stopped working on it since I only saw a little interest, and I realized it’s just another thing to go wrong when you’re trying to get something out of the printer and the easiest thing is to have it on your desk. I honestly saw this coming though.
Carrier Grade NAT (CGNAT) is a thing. Be careful running this software if your ISP uses CGNAT (most LTE service does), or your “household” might include every other customer of your ISP in your city.
Not just ISPs, but student-housing, apartment condos, various kinds of gated communities and so on may all use NAT and thus present a single, shared public IP!
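One rough way to check for the CGNAT case is to look at the WAN address your router reports: the shared address space reserved for carrier-grade NAT is 100.64.0.0/10 (RFC 6598). The sketch below assumes you already know that WAN address; the function name is made up, and an ISP can also run CGNAT with other address ranges, so a negative result proves nothing.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")


def looks_like_cgnat(wan_ip):
    """True if the router's WAN address falls in the reserved CGNAT range."""
    return ipaddress.ip_address(wan_ip) in CGNAT_RANGE


print(looks_like_cgnat("100.72.14.3"))
print(looks_like_cgnat("8.8.8.8"))
```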
Just lumping everyone together based on their public IP and giving unfettered access to them all is mind-bogglingly stupid! Like, Jesus Effing Christ, anyone with even minimal understanding of how IP-networks work would understand how bad this kind of a scheme is, but not these guys? No, nuh-uh, their system should be burned to the ground and buried six feet deep for a good measure.
Wow rather a strong reaction – A great many programs make similar ‘bad’ assumptions, often for much the same reason: it seems reasonable at the time and will work great in the test environment (if any)… Not like every program is written by a vast team who are masters of every element of computing involved, perhaps they are opencv or AI masters etc, so when it comes to areas they are not, it’s using a library or other off the shelf code without fully mastering the configuration and security nuances…
I would suggest that this project’s code started as a home network printer monitor, with all the extra features they wanted. Then they realised over the web and potentially cloud based would be so much more useful. On the home network in nearly all cases IP would be perfectly adequate – even if the addresses are dynamically leased the default endurance on that lease is so long, and at refresh it’s just going to be given the one it already has, so it’s nearly as good as a fixed address…
Not a particularly challenging element to fix either, now they have become aware of it.
“A great many programs make similar ‘bad’ assumptions” — Other people doing a stupid thing isn’t an excuse for doing the stupid thing yourself as well.
If you launch a service on the Internet, you should consult someone who knows what they’re doing as to whether you have obvious, glaring security-issues that could put your users at risk. There’s also the potential issue of unwittingly making yourself and your users a part of a botnet, thereby causing even wider damage.
Yeah it should be done that way, but even massive companies make these sort of cockups frequently, and they will have experts in house! They just were not consulted, or not given time to do it right apparently…
The real thing is not about having issues, as software will always have issues: some of them stupid overlooked flaws nobody in their right mind would expect to find, others honest oversights, and some created by legacy code not being interfaced with correctly by the new. The real thing is how you deal with those issues when discovered, and so far this mob seem to be doing it correctly.
Yes it would be nice if it wasn’t a steaming pile of garbage, but just look at the number of massive CVE cockups from a company like Apple, the ones that should never, ever, for any reason have made it to a product, probably shouldn’t have even been that way in the lab, yet some do make it to the real world! And they are one of the better tech giants…
Sorry, nope. This is a very basic failure to understand how IP addresses are used.
I think it’s pretty clear at this point that their method of associating users with printers is deeply flawed, it just took an event like this to bring it to light. The code push that limits printers per IP is a pretty nasty hack, and to me seems more like a stopgap until they can refactor the whole authentication system. Hopefully.
Seems like this is a small team that’s taking a hobby project and trying to turn it into a service. So to a point, this kind of janky code making it to a production application is to be expected. Not excusing it, but it would be idealistic to think these sort of problems aren’t hiding in plenty of projects we use on a daily basis.
We have been trying to reach you about your printer warranty
The post on Reddit has been censored: Sorry, this post has been removed by the moderators of r/3Dprinting.
Fortunately with the correct link, as the one in this article, the post on Reddit is still readable.
This is /also/ an interesting example of a more subtle problem: expecting tech to do magic.
They look similar (cf. Clarke), but they ain’t the same.
There is an arms race between “vendors” (in a broad sense) and users: vendors sell some convenience (in this case: the system figuring out which smartphone belongs to which printer) and users expect things to “Just Work” without them having to use their brains (in this case: e.g. executing some “initial pairing” protocol).
To do magic, you’ll have to cut some corners, like (in this case) assume that the source IP has any significance.
At the same time (alas, this is the downside), you gently nudge the users out of the control loop. That’s not the kind of tech I have been dreaming of, to be honest.
I am baffled how failing to receive the correct IP lets them use all IPs and does not default to refusing the service at all. I would think losing control is less of a problem than giving control to strangers?
The effective source IP address was being rewritten by the load balancer. Code behind the load balancer used this source IP address as a user’s group identifier. So for the duration of the defect, all users were pooled into a single user group. I’m guessing that the project coders’ strengths lie in the AI side of things, rather than user auth. & validation.
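The "refuse rather than pool" behaviour asked about in this thread could look something like the sketch below. It is an illustration, not TSD's fix: the function name and the `proxy_ips` parameter are assumptions. The idea is that an address that is missing, malformed, not publicly routable, or known to belong to your own load balancer is treated as "no usable IP", and auto-discovery is skipped.

```python
import ipaddress


def usable_client_ip(raw_ip, proxy_ips=frozenset()):
    """Return a normalized client IP, or None if it cannot be trusted for grouping."""
    if not raw_ip:
        return None
    try:
        addr = ipaddress.ip_address(raw_ip)
    except ValueError:
        return None  # malformed header or garbage value
    if str(addr) in proxy_ips:
        return None  # this is our own load balancer, not a client
    if not addr.is_global:
        return None  # private, loopback, CGNAT shared space, and so on
    return str(addr)


for candidate in ["8.8.8.8", "10.0.0.5", "", "not-an-ip"]:
    ip = usable_client_ip(candidate)
    print(repr(candidate), "->", ip if ip else "refuse auto-discovery")
```

Failing closed costs some convenience when the network setup is unusual, but it means a misconfigured balancer produces "nobody is discovered" instead of "everybody is discovered".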
Well that’s an embarrassing goof. Really surprising they’re using ip for user identification, hopefully they fix it properly with actual user identities soon. The danger of doing things outside your “core competencies” – I’ve heard only good things about the actual monitoring features of tsd.
Hopefully they go to some federated/OAuth/OpenID type login, this is exactly why I prefer that for services: it is a huge responsibility to handle user authentication and identity, and I’d rather leave that part to places big enough to have a department for it, instead of leaving it to one jack-of-all-trades DevOps person who (again) doesn’t have user id as a core competency. Heck, when I used WordPress for my own stuff I even used GitHub login.
This is one of many reasons I actually built a local instance for TSD when I was playing around with it for my own place and our hackerspace.
We ultimately decided after way too many prints were thrown under the bus due to false positives to let the software develop a bit. To gain similar functionality as we still had people ‘print and split’ we implemented a function through our slack where our printers would post images in a printer centric channel, and we gave that function the ability to take commands from the registered users of our slack to pause and stop prints, but not do anything else (so no uploading of files, sending gcode to the printer, all of that was disabled and cut out of the software). This way we could crowdsource our spaghetti detection and have it be magnitudes of accuracy greater since salty bags of water were making the decisions instead of an algorithm that didn’t know the difference between spaghetti and video cables hanging off the monitors in our coworking area that the webcams would also see sometimes. I figured if we ran into any bad actors, we could just kick ’em off the slack. This has not been an issue yet.
Network n00b here, but isn’t spoofing your public IP a thing you can do with many packet/communication types?
Hi Tom, I’m the Kenneth from the article. Thank you for this fair and detailed technical write-up. I love your content, as always.
Here is a bit more info about what I’m doing to address this vulnerability:
1. I am prototyping a solution proposed by a reddit user: https://www.reddit.com/r/3Dprinting/comments/p7skc1/the_spaghetti_detective_security_incident_last/h9murov/?utm_source=reddit&utm_medium=web2x&context=3 . I have consulted a few friends who are security experts. They can’t poke holes in this solution.
2. The code that limits the number of discovered devices is just a temporary fix so that the TSD private server owners can still do it as they wish. The chance for private servers to suffer from this vulnerability is very low and some of them are willing to take it. Auto-discovery is still disabled in TSD cloud.
Again this has been a humbling experience that taught me how much I still need to improve, and how supportive the maker community is. I feel lucky and privileged to be part of it!
Oh one more thing to add after reading all the comments above:
TSD didn’t use public IP to authenticate. We have been using a proper token for authentication since day one. The public IP was used in auto-discovery to obtain the token.
The longer story is:
– For the first 1.5 years of TSD, we asked the user to copy/paste the token from TSD to octoprint. Obviously secure enough.
– Then about 0.5 years ago, we published the mobile app. So we started to use a time-limited 6-digit code to “exchange for token”. It is also secure enough.
– Then some users who bought the “kits” like ezpi with TSD pre-bundled told us they don’t even want to open OctoPrint. So I had to do auto-discovery and pushed it too far. :(
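For readers curious about the second step in that history, a time-limited 6-digit code exchanged for a token might be sketched like this. None of it comes from the TSD codebase: the class name, the 120-second lifetime, and the injectable clock are assumptions made so the example is self-contained.

```python
import secrets
import time

CODE_TTL_SECONDS = 120  # hypothetical lifetime, not TSD's actual value


class CodeExchange:
    """Hands out short-lived 6-digit codes that can be redeemed once for a token."""

    def __init__(self, ttl=CODE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._codes = {}  # code -> (token, expiry time)

    def issue_code(self, token):
        code = f"{secrets.randbelow(10**6):06d}"
        self._codes[code] = (token, self._clock() + self._ttl)
        return code

    def redeem(self, code):
        # pop() makes the code single-use even if it has not expired yet.
        entry = self._codes.pop(code, None)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() > expires_at:
            return None
        return token


exchange = CodeExchange()
code = exchange.issue_code("long-random-printer-token")
print("enter this code in OctoPrint:", code)
print("first redeem succeeded:", exchange.redeem(code) is not None)
print("second redeem succeeded:", exchange.redeem(code) is not None)
```

A real service would also need to rate-limit guesses, since a 6-digit code has only a million possible values; the short lifetime and single use are what keep that space hard to search in practice.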
Ah, very interesting, thanks for filling us in!