Developing free and open source software can be a thankless experience. Most folks do it because it’s something they’re passionate about, with the only personal benefit being the knowledge that there are individuals out there who found their work useful enough to download and install. So imagine how you’d feel if it turned out somebody was playing around with the figures, and the steady growth in the number of installs you thought your software had was fake.
That’s what happened just a few days ago to OctoPrint developer [Gina Häußge]. Although there’s no question that her software for remotely controlling and monitoring 3D printers is immensely popular within the community, the fact remains that the numbers she’s been using to help quantify that popularity have been tampered with by an outside party. She’s pissed, and has every right to be.
[Gina] discovered this manipulation on June 26th after taking a look at the publicly available usage stats on data.octoprint.org. She noticed that an unusually high number of instances appeared to be running an old OctoPrint release, and upon closer inspection, realized what she was actually seeing was a stream of bogus data that was designed to trick the stat counter. Rolling back the data, she was able to find out this spam campaign had been going on since late 2022. Tens of thousands of the users she thought she’d gained over the last two years were in fact nothing more than garbage spit out by some bot. But why?
Here’s where it gets interesting. Looking at the data being reported by these fake OctoPrint instances, [Gina] could tell the vast majority of them claimed to be running a specific plugin: OctoEverywhere. The perpetrators were clever enough to sprinkle in a random collection of other popular plugins along with it, but this specific plugin was the one most of them had in common. Sure enough, this pushed OctoEverywhere to the top of the charts, making it seem like it was the most popular plugin in the community repository.
So what do the developers of OctoEverywhere have to say for themselves? In a statement that [Gina] posted on the OctoPrint blog, they claim they were able to determine a member of the community had performed the stat manipulation of their own accord, but as of this writing are unwilling to release this individual’s identity. A similar statement now appears on the OctoEverywhere website.
On June 27th, Gina Häußge, the developer behind OctoPrint, informed us of an incident involving the OctoPrint usage stats. Gina had observed that the stats were being manipulated to boost OctoEverywhere’s rankings.
We took the report very seriously and quickly started an investigation. Using private community channels, we determined a community member was responsible for manipulating the OctoPrint stats. We had a private conversation with the individual, who didn’t realize the impact they were having but apologized and promised never to do it again.
From a journalistic perspective, it would be inappropriate for us to leap to any conclusions based on the currently available information. But we will say this…we’ve heard more convincing stories on a kindergarten playground. Even if we take the statement at face value, the fact that they were able to figure out who was doing this within 48 hours of being notified would seem to indicate this person wasn’t exactly a stranger to the team.
In any event, the bogus data has now been purged from the system, and the plugin popularity charts are once again showing accurate numbers. [Gina] also says some safeguards have been put into place to help prevent this sort of tampering from happening again. As for OctoEverywhere, it slid back to its rightful place as the 6th most popular plugin, a fact that frankly makes the whole thing even more infuriating — you’d think legitimately being in the top 10 would have been enough.
On Mastodon, [Gina] expressed her disappointment in being fooled into thinking OctoPrint was growing faster than it really was, which we certainly get. But even so, OctoPrint is a wildly popular piece of software that has become the cornerstone of a vibrant community. There’s no question that her work has had an incredible impact on the world of desktop 3D printing, and while this turn of events is frustrating, it will ultimately be little more than a footnote in what is sure to be a lasting legacy.
CI/CD test updating the live database, perhaps?
Based on the data that was being submitted, and the fact it has to keep submitting posts to remain active instances with unique IDs, I highly doubt it was any form of testing or distribution automation.
CI/CD testing would have created regular tracking traffic. This was not regular, this was highly irregular and nothing an actual OctoPrint instance could ever have generated.
I don’t want to go into details in public on the exact nature of these irregularities, as I don’t want to tell people how I figured this out so they can work around that in the future, but I can assure you that these requests didn’t come from real OctoPrint instances with 100% certainty.
Thank you for making such a great tool. Been using it for years.
>you’d think legitimately being in the top 10 would have been enough.
You think they would be there if they hadn’t manipulated the charts?
Remember music billboards in the 90’s – every top chart was fake, because music labels used them as advertisement. They went so far as to pay record shop owners to scan the same record in the cash register multiple times, and then remove the fraudulent sales after the data was collected.
Advertisement basically has one purpose: informing consumers about the availability of products or services.
What advertisement is actually used for is eclipsing your competition: hiding information from consumers by spamming the consumer through all media. Even if it’s just nonsense, at least it’s not your competitor’s logo on the screen. The richest corporations can buy all the spots and decide whose products are visible to the public, therefore whose products are more likely to sell – after all, you can’t buy what you don’t know exists.
That’s why advertisement in general has lost its point and become counter-productive to the public. It’s just a gigantic waste of money. For the consumers, finding what you need requires ignoring all the advertisements and seeking information from listings and catalogues, and the search engine isn’t your friend because Google distorts the results.
Yea.. it was.. just the 90s. /s
I’ve been working in music promotion for over 15 years. I can tell you that the charts are 100% manipulated to this day, and will remain so because the ones promoting the charts have an economic eco-system that pays for them to do just that.
I wouldn’t expect anything else. What plays on the radio, or in the movies, or on television, is just pushing whatever the big labels want to sell.
In the 90’s the public media discovered the discrepancy between the charts and the actual sales, because digital point of sales systems were beginning to reveal the actual sales numbers vs. what was counted, but they simply chose to keep ignoring the fact since the record labels were paying. It’s one of those things that everyone knows is happening, but everyone keeps forgetting or ignoring because their Spotify is playing the songs they want to hear, because they’ve grown up with the same songs since the 90’s.
As I recently bought a 3D printer, what does Octoprint do?
It’s akin to what a network print server is to a basic printer. You can install it on a cheap SBC connected to your 3D printer by USB, and access it from your computers over your LAN. It’s a godsend for USB/SD-card only printers.
The mentioned plugins provide tons of advanced features, like a built-in slicer, using a camera to watch the print in your browser or record a time-lapse coordinated with the print head starting a new layer, and metering and tracking your filament usage and time costs. They also support adding hardware features your printer controller might not support, such as sensors for auto bed leveling, additional thermal monitoring and shutdown, and so on.
I’m not sure I support the concept of making it Internet accessible with the ‘everywhere’ plugin mentioned here, having my printer in the next room over is about as ‘remote’ as I am comfortable with, but there it is I guess…
OctoEverywhere is (for me) useful for monitoring long prints: the printer is at my job’s office and I don’t want to camp out there if I’m doing a 10+ hour print. Being able to check in from home means I don’t potentially waste half a spool of filament if something goes wrong when I’m not there. Plus, it’s cool that it’s possible. Their whole pitch was that it allowed you to make your printer internet-accessible securely without needing full-on network admin training first, an honestly great service. Gina even personally endorsed them (in preference to just opening a port for your Pi). They really didn’t need the extra attention, and she’s right to be angry.
For a 3D printer without an internet/web connection, Octoprint runs on an attached computer to allow you to monitor and control it from a web browser. Explicitly not intended to be exposed to the open Internet, more so you can sit on your couch and be able to monitor your print, pause it, stop it. If you attach a camera to the computer, you can see the print and catch a spaghetti monster early, possibly allowing you to stop the print, and start a second print sliced at that point to the end. Plugin(s) allow making timelapses of the print in progress. Very handy. Runs on a Raspberry Pi too (look for the distribution OctoPi).
And D got there first, what they said…
Thanks for your responses!
I felt that I would get more “hands-on” information this way instead of doing a web search and then picking through what it returned.
It allows you to remotely control 3D prints and view them with a camera attached. It’s an absolute must for 3D printing; you just need a Raspberry Pi and a USB cable to attach it.
Octoprint is an amazing tool. Don’t be discouraged by the shit state of the internet in 2024; the project has immense value to open 3D printing, a flagship OSHW case study. Stats will always be fucked with. Take a page from game cheating detection: put a little checksum into your telemetry so you know what’s what, let the spammers think they won for a bit, and purge them all in one go every 6 months.
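For readers wondering what a "checksum in your telemetry" might look like, here is a minimal Python sketch using an HMAC tag. It is not how OctoPrint's tracking works or what safeguards were actually added; the field names, the key, and the function names are all hypothetical. Since the client is open source, any key shipped with it can be read by anyone, so a scheme like this would only raise the effort needed rather than stop a determined spammer. The server side accepts every report and quietly sets aside the ones that fail the check, so they can be purged later in one go.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only. In an open source client this
# value is readable by anyone, so it deters only low-effort scripts.
SECRET_KEY = b"example-key-not-a-real-secret"


def _canonical_bytes(payload: dict) -> bytes:
    # Serialize deterministically so client and server hash the same bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_payload(payload: dict, key: bytes = SECRET_KEY) -> dict:
    """Client side: return a copy of the payload with a 'checksum' field added."""
    tag = hmac.new(key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {**payload, "checksum": tag}


def is_authentic(report: dict, key: bytes = SECRET_KEY) -> bool:
    """Server side: recompute the tag and compare in constant time."""
    claimed = report.get("checksum")
    if not isinstance(claimed, str):
        return False
    payload = {k: v for k, v in report.items() if k != "checksum"}
    expected = hmac.new(key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed.encode("utf-8"), expected.encode("utf-8"))


def partition_reports(reports: list) -> tuple:
    """Accept every report, but set aside the ones that fail the check
    so they can be purged later without tipping off the sender."""
    trusted, suspect = [], []
    for report in reports:
        (trusted if is_authentic(report) else suspect).append(report)
    return trusted, suspect
```

Not rejecting bad reports at submission time is the design choice the comment hints at: a sender who gets no error back has less feedback to tune a script against.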
Sounds to me like somebody found an exploit in an old version, and wanted to see how fertile they could make the field.
There’s no exploit. Octoprint is open source. The telemetry code is open source. If someone wants to send fake data to the telemetry API, there’s no magic to it.
Someone simply thought there was a low-effort way of boosting their plugin’s ranking, made a script that generated plausible output, then set it running.
It’s interesting to me you think they should release the individual’s name; depending on jurisdiction that would be illegal without a court order.
Using the lack of release to imply anything else is entirely in appropriate.
If someone has demonstrated a habit of adding malicious code, this should be known, similarly to how futzing the numbers in academia should be disclosed. Nothing this person contributed to anything is trustworthy now, and everything they wrote needs to be reviewed again.
Though I do get your point; it’s very easy for a malicious project to throw an innocent person under the bus and invite a witch hunt against that person in order to save face.
If an academic commits career-ending fraud, they can be publicly exposed by name, but not doxxed at their home addresses. And if there are ensuing legal proceedings, journals likely can’t talk about that. But the publication side is public, so there are no legal or data protection issues there. A fraudulent author could sue a journal for defamation, but they wouldn’t get far if the published reporting stuck to public facts.
Committing to an open-source project is pretty much the same thing. The actions you take on behalf of a project, and the name you’re credited under, aren’t private information.
The big difference is that OSS authors can be anonymous, and do repeated bad acts under different aliases. I don’t think doxxing would be the answer (and that would violate privacy laws), but there’s maybe a case for some kind of institutional credential. That’s what stops disgraced academics from just changing their names.
Inappropriate.
Why is it that no one is concerned about the fact that there’s telemetry reporting this usage data in the first place? If the software didn’t ‘phone home’, then this wouldn’t be a problem.
Because said telemetry is completely opt-IN, so only running if a user actively agreed to that. There’s one question about this during setup, and no tracking will be done unless you agree.
See also tracking.octoprint.org for a full rundown of what will even get tracked, how that will happen, and why.
And yes, there are valid reasons for telemetry – some design and development decisions are better made with some hard data on existing runtime environments and deployment stats. I was flying blind for ages and it sucked.
I appreciate your work.
Same
likewise. Thanks Gina.
Hey! I’m the developer behind OctoEverywhere. I purposefully omitted many details about my interaction with the user to keep them private, as they requested. I’m not going to violate their privacy request, but I’m happy to fill in any details or answer any outstanding questions you have. You can contact me directly via the support system on OctoEverywhere.com:
https://octoeverywhere.com/support
As I said in the blog post, I’m working with Gina by contributing whatever time/resources I can to ensure this can’t happen again. I’m trying my best to make this right for the OctoPrint and 3D printing community!
Kick him/her square in the crotch, once for each fake user.
With live video from the software involved.
Then forgiveness can start!
Perhaps I missed it, but… what purpose would stat manipulation serve for anything OTHER than modifying rankings?
If there’s no plausible explanation for doing it other than nefarious reasons, well…
Ugh. It’s a cloud solution too. Of course it would be.
The Internet was designed to be an all-connected cloud where we are all nodes. Not a media service where only businesses have full access. Port forwarding, Dynamic DNS, SSH tunneling and OpenVPN… Anyone attracted to this site ought to be learning those things.