Do You Trust This AI For Your Surgery?

If you are looking for the perfect instrument with which to start a biological horror show in our age of AI, you have come to the right place: researchers at Johns Hopkins University have successfully used AI-guided robotics to perform surgical procedures. Okay, so it's a bit less dystopian than it sounds, but the possibilities are endless.

Pig parts served as surrogate human gallbladders for demonstrating cholecystectomies. The skilled surgeon is replaced with a da Vinci research kit, the same platform used in human-controlled surgeries.

The researchers' architecture feeds live imaging and human corrections into a high-level language model, which in turn drives the low-level model controlling the robot. While there is always the option for a human to intervene, the model is trained to self-correct and has demonstrated the ability to do so. It appears to work fairly well, with nothing but minor errors, as shown in an age-restricted YouTube video. (Surgical imagery; don't watch if that bothers you.)

Flowchart showing the path from video to LLM to low-level model to robot control
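That loop is simple enough to sketch in a few lines of Python. Everything below is an illustrative stand-in rather than the paper's actual API; it only shows the shape of the hierarchy:

def capture_frame():
    """Stand-in for the live endoscope/camera feed."""
    return "latest frame"

def high_level_plan(frame, correction=None):
    """Stand-in for the language model: emit the next task step."""
    return correction or "clip the cystic duct"

def low_level_control(frame, instruction):
    """Stand-in for the learned controller: emit motion commands."""
    return f"trajectory for: {instruction}"

for step in range(3):
    frame = capture_frame()
    correction = None  # a human can optionally inject a fix here
    instruction = high_level_plan(frame, correction)
    commands = low_level_control(frame, instruction)
    print(step, commands)  # in reality, these go to the robot arms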

It's noted that the robot performed more slowly than a traditional surgeon would, trading time for precision. As always with anything medical, we're not likely to see it working on our own gallbladders anytime soon, but maybe within the next decade. If you want to read more on the specific advancements, check out the paper here.

Medical hacking isn't always the most appealing subject for anyone with a weak stomach. If you're among those of us with iron guts, make sure to check out this precision tendon tester!

Convert Any Book To A DIY Audiobook?

If the idea of reading a physical book sounds like hard work, [Nick Bild’s] latest project, the PageParrot, might be for you. While AI gets a lot of flak these days, one thing modern multimodal models do exceptionally well is image interpretation, and PageParrot demonstrates just how accessible that’s become.

[Nick] demonstrates quite clearly how little code is needed to get from those cryptic black and white glyphs to sounds the average human can understand, specifically a paltry 80 lines of Python. Admittedly, many of those lines are pulling in libraries, and some are just blank, so functionally speaking, it’s even shorter than that. Of course, the whole application is mostly glue code, stitching together other people’s hard work, but it’s still instructive and fun to play with.

The hardware required is a Raspberry Pi Zero 2 W, a camera (in this case, a USB webcam), and something to hold it above the book. Any Pi that can connect to a camera should work just as well, though, with just a little configuration.

On the software side, [Nick] pulls in the cv2 library (the Python interface to OpenCV) to handle the camera interfacing, configuring it for full HD resolution. Google's GenAI library is used to talk to the Gemini 2.5 Flash LLM via an API endpoint. This takes a captured image and a trivial prompt, and returns the whole page of text, quick as a flash.

Finally, the script hands that text over to Piper, which turns it into a speech file in WAV format. This can then be played through an audio device with a call out to the console aplay tool. It's all very simple at this level of abstraction.
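The whole pipeline can be sketched in a screenful of Python. To be clear, this is a minimal approximation of the approach rather than [Nick]'s actual code, and details like the Piper voice model name and the prompt wording are assumptions:

import subprocess
import cv2
from PIL import Image
from google import genai  # Google's GenAI SDK

# Camera interfacing via OpenCV, configured for full HD
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("camera capture failed")

# One captured image plus a trivial prompt in, a page of text out
client = genai.Client()  # assumes a Gemini API key in the environment
page = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
result = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[page, "Transcribe all the text on this book page."],
)

# Piper turns the text into a WAV; aplay sends it to the audio device
# (the voice model name here is a placeholder)
subprocess.run(
    ["piper", "--model", "en_US-lessac-medium", "--output_file", "page.wav"],
    input=result.text, text=True, check=True,
)
subprocess.run(["aplay", "page.wav"], check=True)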


AI Is Only Coming For Fun Jobs

In the past few years, what marketers and venture capital firms term “artificial intelligence” (more often some sort of advanced predictive text model) has started taking people's jobs and threatening others. But these are not the tedious jobs that society might like to have automated away in the first place. These AI tools have generally been taking rewarding or enjoyable jobs like artist, author, filmmaker, programmer, and composer. This project from a research team might soon add astronaut to that list.

The team was working within the confines of the Kerbal Space Program Differential Game Challenge, an open-source plugin from MIT that lets developers test various algorithms and artificial intelligences in simulated spacecraft scenarios. Generally, purpose-built models are used here, with many rounds of refinement and testing, but since that process can be time-consuming and costly, the researchers on this team decided to hand control over to ChatGPT with only limited instructions. A translation layer built by the researchers converts the generated text into spacecraft controls, along the lines of the sketch below.
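A toy version of such a translation layer might look like this. The team's actual prompts and plumbing surely differ; this just shows the idea of coercing generated text into JSON and then into clamped control values:

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def llm_to_controls(state: dict) -> dict:
    """Turn the model's generated text into spacecraft control values."""
    prompt = (
        "You are piloting a pursuit spacecraft. Reply ONLY with JSON "
        'shaped like {"throttle": 0.5, "pitch": -10.0, "yaw": 0.0}. '
        f"Current state: {json.dumps(state)}"
    )
    reply = client.chat.completions.create(
        model="gpt-4",  # model choice here is illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    controls = json.loads(reply.choices[0].message.content)
    # Clamp before handing anything to the simulated thrusters.
    controls["throttle"] = max(0.0, min(1.0, float(controls["throttle"])))
    return controls

print(llm_to_controls({"range_m": 1200, "closing_rate_ms": 4.2}))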

We'll note that, at least as of right now, large language models haven't taken the jobs of any actual astronauts. The game challenge is generally aimed at uncrewed spacecraft like orbital satellites, which often need to make their own decisions to maintain orbits and avoid obstacles. This particular model even managed to place second in a recent competition, although we'll keep rooting for the humans in contests like these.

AI Might Kill Us All (With Carbon Emissions)

So-called artificial intelligence (AI) is all the rage right now, whether it's your grandma asking ChatGPT how to code in Python or influencers making videos without having to hire extras, but one growing concern is where the power for all those data centers is going to come from. The MIT Technology Review team did a deep dive on the current situation and on whether AI is going to kill us all (with carbon emissions).

Probably of most interest to you, dear hacker, is how they came up with their numbers. With no agreed-upon methods and different companies doing different types of processing, a number of assumptions were baked into their estimates. Given the lack of information on closed-source models, open-source models were used as the benchmark for energy usage and extrapolated to the industry as a whole. Unsurprisingly, larger models have a larger energy footprint.

While data center power usage remained roughly flat from 2005 to 2017, with increases in efficiency offsetting the growth in online services, by 2023 data centers had doubled their energy consumption from those earlier numbers. The power running into those data centers is already 48% more carbon-intensive than the US average, and that figure is expected to rise as new data centers push for increased fossil fuel usage, like Meta's in Louisiana or the X data center found to be using methane generators in violation of the Clean Air Act.

Technology Review did find that “researchers estimate that if data centers cut their electricity use by roughly half for just a few hours during the year, it will allow utilities to handle some additional 76 gigawatts of new demand.” That would mean either reallocating requests to servers in other geographic regions or simply slowing down responses during the 80-90 hours a year when the grid is under its highest loads.
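To put those hours in perspective, a quick back-of-envelope calculation on the article's own numbers shows how small the ask really is:

peak_hours = 85                # midpoint of the 80-90 hours a year cited
year_hours = 24 * 365          # 8760 hours in a year
print(f"{peak_hours / year_hours:.1%} of the year")  # prints: 1.0% of the year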

If you’re interested in just where a lot of the US-based data centers are, check out this map from NREL. Still not sure how these LLMs even work? Here’s an explainer for you.


Hackaday Links: June 29, 2025

In today's episode of "AI Is Why We Can't Have Nice Things," we feature the Hertz Corporation and its new AI-powered rental car damage scanners. Gone are the days when an overworked human in a snappy windbreaker would give your rental return a once-over with the old Mark Ones to make sure you hadn't messed the car up too badly. Instead, Hertz is fielding up to 100 of these "MRI scanners for cars." The "damage discovery tool" uses cameras to capture images of the car and compares them to a model that's apparently been trained on nothing but showroom cars. Redditors who've had the displeasure of being subjected to this thing report being charged egregiously high damage fees for non-existent damage. To add insult to injury, renters who want to appeal those charges have to argue with a chatbot first, one that offers no path to speaking with a human. While this is likely to be quite a tidy profit center for Hertz, their customers still have a vote here, and backlash will likely lead the company to make the model a bit more lenient, if not scrap the system outright.


Flopped Humane “AI Pin” Gets An Experimental SDK

The Humane AI Pin was ambitious and expensive, and it failed to captivate people in the short window between its launch and its shutdown. While the units do contain some interesting elements, like the embedded projector, it's all locked down tight, and the cloud services that tied it all together no longer exist. The devices technically still work; they just can't do much of anything.

The Humane AI Pin had some bold ideas, like an embedded projector. (Image credit: Humane)

Since then, developers like [Adam Gastineau] have been hard at work turning the device into an experimental development platform: PenumbraOS, which provides a way for “untrusted” applications to perform privileged operations.

As announced earlier this month on social media, the experimental SDK lets developers treat the pin as a mostly normal Android device, with the addition of a modular, user-facing assistant app called MABL. [Adam] stresses that this is all highly experimental and has a way to go before it's truly useful, but there is absolutely a workable architecture here.

When the Humane AI Pin launched, it aimed to compete with smartphones but failed to impress much of anyone, and the company folded in record time. Humane's founders took jobs at HP, and buyers were left with expensive paperweights thanks to the highly restrictive design.

Thankfully, a load of reverse engineering has laid the path to getting some new life out of these ambitious devices. The project could sure use help from anyone willing to pitch in, so if that's up your alley, be sure to join the project; you'll be in good company.

This Week In Security: Roundcube, Unified Threat Naming, And AI Chat Logs

Up first: if you're running a Roundcube install older than 1.5.10 or 1.6.11, it's time to update. We have an authenticated Remote Code Execution (RCE) vulnerability in the Roundcube Webmail client. And while that's not quite the level of chaos an unauthenticated RCE would cause, it's still to be taken seriously, mainly because for the majority of the 53 million Roundcube installs out there, the users aren't entirely trusted.

The magic at play in this vulnerability is the Roundcube user session code, and specifically the session deserialization scheme. There’s a weird code snippet in the unserialize function:
// In the session unserialize function: a '!' marks the next
// variable as having no value, so the parser steps past it.
if ($str[$p] == '!') {
    $p++;
    $has_value = false;
}

The exclamation mark makes the code skip a character and then assume that what comes next has no value. But if it does actually have a value, well, then you've got a slightly corrupted deserialization, resulting in a slightly corrupted session. This really comes into force when combined with the file upload function, as the uploaded filename serves as a payload delivery mechanism. Use the errant exclamation mark handling to throw off deserialization, and the filename can contain arbitrary session key/value pairs. A GPG class from the PEAR library allows running an arbitrary command, and that is what gets hijacked via the session manipulation.
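To see why one skipped byte matters, here's a toy re-creation of the desync in Python. This is only an illustration of the mechanism, not Roundcube's actual session format, parser, or exploit:

def parse_session(blob):
    """Parse 'key|value;' entries; a leading '!' marks a value-less key."""
    session, p = {}, 0
    while p < len(blob):
        if blob[p] == "!":
            # the quirk: skip the '!', record the key with no value,
            # and resume right after the '|'
            p += 1
            key_end = blob.index("|", p)
            session[blob[p:key_end]] = None
            p = key_end + 1
        else:
            key_end = blob.index("|", p)
            val_end = blob.index(";", key_end)
            session[blob[p:key_end]] = blob[key_end + 1:val_end]
            p = val_end + 1
    return session

# If '!photo|' is actually followed by data (say, smuggled in via an
# uploaded filename), the parser reads it as a brand-new entry:
print(parse_session("user|alice;!photo|evil_key|evil_val;"))
# {'user': 'alice', 'photo': None, 'evil_key': 'evil_val'}

Once the parser is off by even one field, whoever controls the smuggled bytes controls the session.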