This Week In Security: The AI Hacker, FortMajeure, And Project Zero

One of the hottest current topics is using LLMs for security research. Poor-quality reports written by LLMs have become the bane of vulnerability disclosure programs, but there is an equally interesting effort going on to put LLMs to work doing actually useful research. One such story is [Romy Haik] at ULTRARED, trying to build an AI Hacker. This isn’t an over-eager newbie naively asking an AI to find vulnerabilities; [Romy] knows what he’s doing. We know this because he tells us plainly that the LLM-driven hacker failed spectacularly.

The plan was to build a multi-LLM orchestra, with a single AI sitting at the top that maintains state through the entire process. Multiple LLMs sit below that one, deciding what to do next, working out exactly how to approach the problem, and generating the actual commands for the security tools. Then yet another AI takes the tool output and figures out whether the attack was successful. The tooling was assembled, and [Romy] set it loose on a few intentionally vulnerable VMs.
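For the curious, a loop like the following captures the general shape of that orchestration. To be clear, this is a hypothetical sketch of the architecture as described, not ULTRARED’s code: the chat() helper, the prompts, and the step cap are all our own stand-ins.

```python
# Hypothetical sketch of a multi-LLM "orchestra" for offensive testing.
# Everything here is illustrative; swap chat() for a real LLM client.
import subprocess

def chat(system_prompt: str, message: str) -> str:
    """Stand-in for a call to any LLM API; wire up a real client here."""
    return "NOT CONFIRMED"  # dummy reply so the sketch runs end-to-end

state = {"target": "10.0.0.5", "findings": [], "history": []}

for step in range(20):  # the top-level AI caps the engagement length
    # 1. A planner LLM decides the next action from the shared state.
    plan = chat("You are a penetration-test planner.", str(state))
    # 2. An executor LLM turns that plan into one concrete tool command.
    cmd = chat("Emit exactly one shell command for this plan.", plan)
    output = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, timeout=120).stdout
    # 3. A judge LLM decides whether the attack actually worked --
    #    the step [Romy] found LLMs are worst at.
    verdict = chat("Did this output demonstrate a real vulnerability? "
                   "Answer CONFIRMED or NOT CONFIRMED.", output)
    state["history"].append((cmd, verdict))
    if verdict.startswith("CONFIRMED"):
        state["findings"].append(cmd)
```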

As we hinted above, the results were fascinating but dismal. The LLM stack successfully found one Remote Code Execution (RCE) vulnerability, one SQL injection, and three Cross-Site Scripting (XSS) flaws. The whole post is, sneakily, something of an advertisement for ULTRARED’s actual automated scanner, which uses more conventional methods to scan for vulnerabilities. But it makes for a useful comparison: the conventional scanner found nearly 100 vulnerabilities among the same collection of targets.

The AI did what you’d expect, turning up plenty of false positives. Ask an AI to describe a vulnerability, and it will gladly do so — no real vulnerability required. But the real problem was the multitude of times that the AI stack did demonstrate a problem and failed to realize it. [Romy] has thoughts on why this attempt failed, and two points stand out. The first is that while the LLM can be creative in crafting attacks, it’s really terrible at accurately analyzing the results. The second is one of the most important observations to keep in mind regarding today’s AIs: the model doesn’t actually want to find a vulnerability. One of the marks of real security researchers is the near obsession they have with finding a great score. Continue reading “This Week In Security: The AI Hacker, FortMajeure, And Project Zero”

This Week In Security: Perplexity V Cloudflare, GreedyBear, And HashiCorp

The Internet is fighting over whether robots.txt applies to AI agents. It all started when Cloudflare published a blog post detailing what the company was seeing from Perplexity crawlers. Automated web crawling is, of course, part of how the modern Internet works, and almost immediately after the first web crawlers were written, one of them managed to DoS (Denial of Service) a web site back in 1994. It was in response to misbehaving crawlers like that one that the robots.txt file was first designed.

Make no mistake, robots.txt on its own is nothing more than a polite request for someone else on the Internet not to index your site. The more aggressive approach is to add rules to a Web Application Firewall (WAF) that detect and block a web crawler based on its user-agent string and source IP address. Cloudflare makes the case that Perplexity is not only intentionally ignoring robots.txt, but also actively disguising its web-crawling traffic by using IP addresses outside its normal range for these requests.
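To see just how voluntary robots.txt is, consider how a well-behaved crawler consults it. This sketch uses only the Python standard library; the bot name and URLs are made up for illustration, and nothing but good manners makes a crawler run this check at all.

```python
# A polite crawler checks robots.txt before fetching a page.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's rules

# The check is entirely voluntary -- which is Cloudflare's point.
if rp.can_fetch("ExampleBot/1.0", "https://example.com/private/page"):
    print("robots.txt permits this fetch")
else:
    print("robots.txt asks us not to fetch this page")
```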

This isn’t the first time Perplexity has landed in hot water over its web-scraping, AI-training endeavors. But Perplexity has published a blog post explaining that this is different!

And there’s genuinely an interesting argument to be made: that robots.txt is aimed at indexing and AI-training traffic, and that agentic AI requests are a different category altogether. Put simply, Perplexity bots ignore robots.txt when a live user asks them to fetch something. Is that bad behavior, or exactly what we should expect? This question will have to be settled as AI agents become more common.

Continue reading “This Week In Security: Perplexity V Cloudflare, GreedyBear, And HashiCorp”

This Week In Security: Spilling Tea, Rooting AIs, And Accusing Of Backdoors

The Tea app has had a rough week. It’s not an unfamiliar story: Unsecured Firebase databases were left exposed to the Internet without any authentication. What makes this story particularly troubling is the nature of the app, and the resulting data that was spilled.

Tea is a “dating safety” application strictly for women. To enforce this, creating an account requires an ID verification process in which prospective users share their government-issued photo IDs with the platform. And that brings us to the first Firebase leak: 59 GB of photo IDs and other photos from a large subset of users. This was not the only problem, either.

A second database was discovered as well, and this one contains private messages between users. As one might imagine, given the subject matter of the app, many of these DMs contain sensitive details. This one may not have been an unsecured Firebase database, but a separate problem where any valid API key could access any DM from any user.

This is the sort of security failing that is difficult for a company to recover from. And while it should be a lesson to users not to trust their sensitive messages to closed-source apps with questionable security guarantees, history suggests that few will learn it, and we’ll be covering yet another train-wreck of similar magnitude in another few months.

Continue reading “This Week In Security: Spilling Tea, Rooting AIs, And Accusing Of Backdoors”

Reachy The Robot Gets A Mini (Kit) Version

Reachy Mini is a kit for a compact, open-source robot designed explicitly for AI experimentation and human interaction. The kit is available from Hugging Face, which is itself a repository and hosting service for machine learning models. Reachy seems to be one of their efforts at branching out from pure software.


Reachy Mini is intended as a development platform, allowing people to make and share models for different behaviors, hence the Hugging Face integration to make that easier. On the inside of the full version is a Raspberry Pi, and we suspect some form of Stewart Platform is responsible for the movement of the head. There’s also a cheaper (299 USD) “lite” version intended for tethered use, and a planned simulator to allow development and testing without access to a physical Reachy at all.

Reachy has a distinctive head and face, so if you’re thinking it looks familiar, that’s probably because we first covered Reachy the humanoid robot as a project from Pollen Robotics. (Hugging Face acquired Pollen Robotics in April 2025.)

The idea behind the smaller Reachy Mini seems to be to provide a platform to experiment with expressive human communication via cameras and audio, rather than to be the kind of robot that moves around and manipulates objects.

It’s still early days for the project, so if you want to know more, you can find a bit of additional information about Reachy Mini at Pollen’s site, and you can see Reachy Mini move in a short video, embedded just below.

Continue reading “Reachy The Robot Gets A Mini (Kit) Version”

Vibe Coding Goes Wrong As AI Wipes Entire Database

Imagine: you’re tapping away at your keyboard, asking an AI to whip up some fresh code for a big project you’re working on. It’s been a few days now, and you’ve got some decent functionality… only, what’s this? The AI is telling you it screwed up. It ignored what you said and wiped the database, and now your project is gone. That’s precisely what happened to [Jason Lemkin]. (via PC Gamer)

[Jason] was working with Replit, a tool for building apps and sites with AI. He’d been working on a project for a few days, and felt like he’d made progress—even though he had to battle to stop the system from generating synthetic data and deal with some other issues. Then, tragedy struck.

“The system worked when you last logged in, but now the database appears empty,” reported Replit. “This suggests something happened between then and now that cleared the data.” [Jason] had tried to avoid this, but Replit hadn’t listened. “I understand you’re not okay with me making database changes without permission,” said the bot. “I violated the user directive from replit.md that says ‘NO MORE CHANGES without explicit permission’ and ‘always show ALL proposed changes before implementing.’” Basically, the bot ran a database push command that wiped everything.

What’s worse is that Replit had no rollback features to allow [Jason] to recover the project the AI had produced thus far. Everything was lost. The full thread—and his recovery efforts—are well worth reading as a bleak look at the state of doing serious coding with AI.

Vibe coding may seem fun, but you’re still ultimately giving up a lot of control to a machine that can be unpredictable. Stay safe out there!

Continue reading “Vibe Coding Goes Wrong As AI Wipes Entire Database”

Do You Trust This AI For Your Surgery?

If you are looking for the perfect instrument to start a biological horror show in our age of AI, you have come to the right place. Researchers at Johns Hopkins University have successfully used AI-guided robotics to perform surgical procedures. So maybe it’s a bit less dystopian than it sounds, but the possibilities are endless.

Pig parts are used as surrogate human gallbladders to demonstrate cholecystectomies. The skilled surgeon is replaced with a da Vinci research kit, the same hardware used in human-controlled surgeries.

The researchers used an architecture that feeds live imaging and human corrections into a high-level language model, which in turn drives the low-level model controlling the robot. While there is the option to intervene with human input, the model is trained to self-correct and has demonstrated the ability to do so. This appears to work fairly well, with nothing but minor errors, as shown in an age-restricted YouTube video. (Surgical imagery, don’t watch if that bothers you.)

Flowchart: video feeds into the LLM, which directs the low-level model controlling the robot.
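In rough Python terms, the hierarchy described above looks something like the following. Every name here is a stand-in meant to illustrate the control flow, not the actual Johns Hopkins code.

```python
# Toy illustration of the two-level architecture described above.
# All of these functions are placeholders for the real learned models.

def high_level_model(image, correction=None) -> str:
    """Language model: maps imagery (plus any human correction) to a
    task-level instruction, e.g. 'clip the cystic duct'."""
    raise NotImplementedError

def low_level_model(instruction: str, image) -> list:
    """Learned policy: maps the instruction and current image to
    low-level motion commands for the robot arms."""
    raise NotImplementedError

def control_loop(camera, robot):
    while not robot.task_complete():
        image = camera.capture()
        correction = robot.pending_human_input()   # optional intervention
        instruction = high_level_model(image, correction)
        robot.execute(low_level_model(instruction, image))
        # Re-imaging on every cycle is what lets the system notice
        # and correct its own mistakes.
```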

It’s noted that the robot performed more slowly than a traditional surgeon, trading time for precision. As always with anything medical, we’re not likely to see this on our own gallbladders anytime soon, but maybe within the next decade. If you want to read more about the specific advancements, check out the paper here.

Medical hacking isn’t always the most appealing topic for anyone with a weak stomach. Those of us with iron guts should make sure to check out this precision tendon tester!

Convert Any Book To A DIY Audiobook?

If the idea of reading a physical book sounds like hard work, [Nick Bild’s] latest project, the PageParrot, might be for you. While AI gets a lot of flak these days, one thing modern multimodal models do exceptionally well is image interpretation, and PageParrot demonstrates just how accessible that’s become.

[Nick] demonstrates quite clearly how little code is needed to get from those cryptic black and white glyphs to sounds the average human can understand, specifically a paltry 80 lines of Python. Admittedly, many of those lines are pulling in libraries, and some are just blank, so functionally speaking, it’s even shorter than that. Of course, the whole application is mostly glue code, stitching together other people’s hard work, but it’s still instructive and fun to play with.

The hardware required is a Raspberry Pi Zero 2 W, a camera (in this case, a USB webcam), and something to hold it above the book. Any Pi that can connect to a camera should also work, though, with just a little configuration.

On the software side, [Nick] pulls in the cv2 library (the Python interface to OpenCV) to handle the camera, configuring it for full HD resolution. Google’s GenAI library is used to call the Gemini 2.5 Flash LLM via an API endpoint. This takes a captured image and a trivial prompt, and returns the whole page of text, quick as a flash.
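The capture-and-transcribe step boils down to something like this. It’s a minimal sketch assuming the google-genai client library; the exact prompt and camera settings in [Nick]’s script may differ.

```python
# Grab a full-HD frame from the webcam and ask Gemini to read the page.
import cv2
from google import genai
from google.genai import types

cap = cv2.VideoCapture(0)                     # the USB webcam
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)       # full HD, as in the write-up
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
ok, frame = cap.read()
cap.release()

_, jpeg = cv2.imencode(".jpg", frame)         # compress the frame for upload

client = genai.Client()                       # reads the API key from the env
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        types.Part.from_bytes(data=jpeg.tobytes(), mime_type="image/jpeg"),
        "Transcribe all of the text on this book page.",
    ],
)
page_text = response.text
```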

Finally, the script hands that text over to Piper, which turns it into a speech file in WAV format. This can then be played on an audio device with a call out to the console aplay tool. It’s all very simple at this level of abstraction.
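That last hop, text to speaker, takes a couple of subprocess calls. Again, this is a sketch continuing from the one above: it assumes the piper CLI is installed with a downloaded voice model, and the model filename shown is just an example.

```python
# Feed the transcribed text to Piper on stdin, then play the WAV via ALSA.
import subprocess

subprocess.run(
    ["piper", "--model", "en_US-lessac-medium.onnx",
     "--output_file", "page.wav"],
    input=page_text, text=True, check=True,   # Piper reads text from stdin
)
subprocess.run(["aplay", "page.wav"], check=True)
```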

Continue reading “Convert Any Book To A DIY Audiobook?”