Crunching The News For Fun And Little Profit

Do you ever look at the news, and wonder about the process behind the news cycle? I did, and for the last couple of decades it’s been the subject of one of my projects. The Raspberry Pi on my shelf runs my word trend analysis tool for news content, and since my journey from curious geek to having my own large corpus analysis system has taken twenty years it’s worth a second look.

How Career Turmoil Led To A Two Decade Project

A hanging sign surrounded by ornate metalwork, with the legend "Cyder house".
This is very much a minority spelling. Colin Smith, CC BY-SA 2.0.

In the middle of the 2000s I had come out of the dotcom crash mostly intact, and was working for a small web shop. When they went bust I was casting around as one does, and spent a while as a Google quality rater while I looked for a new permie job. These teams are employed by the search giant through temporary employment agencies, and in loose terms their job is to be the trained monkeys against whom the algorithm is tested. The algorithm chose X, and if the humans also chose X, the algorithm is probably getting it right. Being a quality rater is not in any way a high-profile job, but with the big shiny G on my CV I soon found myself in demand from web companies seeking some white-hat search engine marketing expertise. What I learned mirrored my lesson from a decade earlier in the CD-ROM business, that on the web as in any other electronic publishing medium, good content well presented has priority over any black-hat tricks.

But what makes good content? Forget an obsession with stuffing bogus keywords in the text, and instead talk about the right things, and do it authoritatively. What are the right things in this context? If you are covering a subject, you need to do so using the right language; that which the majority uses rather than language only you use. I can think of a bunch of examples which I probably shouldn’t talk about, but an example close to home for me comes in cider. In the UK, cider is a fermented alcoholic drink made from apples, and as a craft cidermaker of many years standing I have a good grasp of its vocabulary. The accepted spelling is “Cider”, but there’s an alternate spelling of “Cyder” used by some commercial producers of the drink. It doesn’t take long to realise that online, hardly anyone uses cyder with a Y, and thus pages concentrating on that word will do less well than those talking about cider.

A graph of the word football versus the word soccer in British news.
We Brits rarely use the word “soccer” unless there’s a story about the Club World Cup in America.

I started to build software to analyse language around a given topic, with the aim of discerning the metaphorical cider from the cyder. It was a great surprise a few years later to discover that I had invented for myself the already-existing field of computational linguistics, something that would have saved me a lot of time had I known about it when I began. I was taking a corpus of text and computing the frequencies and collocates (words that appear alongside each other) of the words within it, and from that I could quickly see which wording mattered around a subject, and which didn’t. This led seamlessly to an interest in what the same process would look like for news data with a time axis added, so I created a version which harvested its corpus from RSS feeds. Thus began my decades-long project.

Continue reading “Crunching The News For Fun And Little Profit”

IR Point And Shoot Has A Raspberry Heart In A 35mm Body

Photography is great, but sometimes it can get boring just reusing the same wavelengths over and over again. There are other options, though and when [Malcolm Wilson] decided he wanted to explore them, he decided to build a (near) IR camera. 

The IR images are almost ethereal.
Image : Malcom Wilson.

The housing is an old Yashica Electro 35 — apparently this model was prone to electrical issues, and there are a lot of broken camera bodies floating around– which hides a Pi NoIR Camera v3. That camera module, paired with an IR pass filter, makes for infrared photography like the old Yashica used to do with special film. The camera module is plugged into a Pi Zero 2 W, and it’s powered by a PiSugar battery. There’s a tiny (0.91″) OLED display, but it’s only for status messages. The viewfinder is 100% optical, as the designers of this camera intended. Point, shoot, shoot again.

There’s something pure in that experience; we sometimes find stopping to look at previews pulls one out of the creative zone of actually taking pictures. This camera won’t let you do that, though of course you do get to skip on developing photos. [Malcom] has the Pi set up to connect to his Wifi when he gets home, and he grabs the RAW (he is a photographer, after all) image files via SSH.  Follow the link above to [Malcom]’s substack, and you’ll get some design details and his python code.

The Raspberry Pi Foundation’s NoIR camera shows up on these pages from time to time, though rarely so artistically. We’re more likely to see it spying on reptiles, or make magic wands work.  So we are quite grateful to [Malcom] for the tip, via Petapixel. Yes, photographers and artists of all stripes are welcome to use the tips line to tell us about their work.

Follow the links in this article for more images like this.
Image: Malcom Wilson

This Week In Security: Anthropic, Coinbase, And Oops Hunting

Anthropic has had an eventful couple weeks, and we have two separate write-ups to cover. The first is a vulnerability in the Antropic MCP Inspector, CVE-2025-49596. We’ve talked a bit about the Module Context Protocol (MCP), the framework that provides a structure for AI agents to discover and make use of software tools. MCP Inspector is an Open Source tool that proxies MCP connections, and provides debugging information for developers.

MCP Inspector is one of those tools that is intended to be run only on secure networks, and doesn’t implement any security or authentication controls. If you can make a network connection to the tool, you can control it. and MCP Inspector has the /sse endpoint, which allows running shell commands as a feature. This would all be fine, so long as everyone using the tool understands that it is not to be exposed to the open Internet. Except there’s another security quirk that intersects with this one. The 0.0.0.0 localhost bypass.

The “0.0.0.0 day exploit” is a bypass in essentially all the modern browsers, where localhost can be accessed on MacOS and Linux machines by making requests to 0.0.0.0. Browsers and security programs already block access to localhost itself, and 127.0.0.1, but this bypass means that websites can either request 0.0.0.0 directly, or rebind a domain name to 0.0.0.0, and then make requests.

Continue reading “This Week In Security: Anthropic, Coinbase, And Oops Hunting”

Last Chance: 2025 Hackaday Supercon Still Wants You!

Good news, procrastinators! Today was going to be the last day to throw your hat in the ring for a slot to talk at Supercon in November, but we’re extending the deadline one more week, until July 10th. We have an almost full schedule, but we’re still missing your talk.

So if the thought of having missed the deadline fills you with regret, here’s your second chance. We have spots for both 40-minute and 20-minute talks still open. We love to have a mix of newcomers as well as longtime Hackaday friends, so don’t be shy.

Supercon is a super fun time, and the crowd is full of energy and excitement for projects of all kinds. There is no better audience to present your feats of hardware derring-do, stories of reverse engineering, or other plans for world domination. Where else will you find such a density of like-minded hackers?

Don’t delay, get your talk proposal in today.

Finally, An Extension To Copyright Law We Can Get Behind

Normally when a government extends a piece of copyright law we expect it to be in the favour of commercial interests with deep pockets and little care for their consumers. But in Denmark they do things differently it seems, which is why they are giving Danes the copyright over their own features such as their faces or voices. Why? To combat deepfakes, meaning that if you deepfake a Dane, they can come after you for big bucks, or indeed kronor. It’s a major win, in privacy terms.

You might of course ask, whether it’s now risky to photograph a Dane. We are not of course lawyers here but like any journalists we have to possess a knowledge of how copyright works, and we are guessing that the idea in play here is that of passing off. If you take a photograph of a Volkswagen you will have captured the VW logo on its front, but the car company will not sue you because you are not passing off something that’s not a Volkswagen as the real thing. So it will be with Danes; if you take a picture of their now-copyrighted face in a crowd you are not passing it off as anything but a real picture of them, so we think you should be safe.

We welcome this move, and wish other countries would follow suit.


Pope Francis, Midjourney, Public domain, (Which is a copyright story all of its own!)

This Week In Security: MegaOWNed, Store Danger, And FileFix

Earlier this year, I was required to move my server to a different datacenter. The tech that helped handle the logistics suggested I assign one of my public IPs to the server’s Baseboard Management Controller (BMC) port, so I could access the controls there if something went sideways. I passed on the offer, and not only because IPv4 addresses are a scarce commodity these days. No, I’ve never trusted a server’s built-in BMC. For reasons like this MegaOWN of MegaRAC, courtesy of a CVSS 10.0 CVE, under active exploitation in the wild.

This vulnerability was discovered by Eclypsium back in March and it’s a pretty simple authentication bypass, exploited by setting an X-Server-Addr header to the device IP address and adding an extra colon symbol to that string. Send this along inside an HTTP request, and it’s automatically allowed without authentication. This was assigned CVE-2024-54085, and for servers with the BMC accessible from the Internet, it scores that scorching 10.0 CVSS.

We’re talking about this now, because CISA has added this CVE to the official list of vulnerabilities known to be exploited in the wild. And it’s hardly surprising, as this is a near-trivial vulnerability to exploit, and it’s not particularly challenging to find web interfaces for the MegaRAC devices using tools like Shodan and others.

There’s a particularly ugly scenario that’s likely to play out here: Embedded malware. This vulnerability could be chained with others, and the OS running on the BMC itself could be permanently modified. It would be very difficult to disinfect and then verify the integrity of one of these embedded systems, short of physically removing and replacing the flash chip. And malware running from this very advantageous position very nearly have the keys to the kingdom, particularly if the architecture connects the BMC controller over the PCIe bus, which includes Direct Memory Access.

This brings us to the really bad news. These devices are everywhere. The list of hardware that ships with the MegaRAC Redfish UI includes select units from “AMD, Ampere Computing, ASRock, ARM, Fujitsu, Gigabyte, Huawei, Nvidia, Supermicro, and Qualcomm”. Some of these vendors have released patches. But at this point, any of the vulnerable devices on the Internet, still unpatched, should probably be considered compromised. Continue reading “This Week In Security: MegaOWNed, Store Danger, And FileFix”

Linear Solar Chargers For Lithium Capacitors

For as versatile and inexpensive as switch-mode power supplies are at all kinds of different tasks, they’re not always the ideal choice for every DC-DC circuit. Although they can do almost any job in this arena, they tend to have high parts counts, higher complexity, and higher cost than some alternatives. [Jasper] set out to test some alternative linear chargers called low dropout regulators (LDOs) for small-scale charging of lithium ion capacitors against those more traditional switch-mode options.

The application here is specifically very small solar cells in outdoor applications, which are charging lithium ion capacitors instead of batteries. These capacitors have a number of benefits over batteries including a higher number of discharge-recharge cycles and a greater tolerance of temperature extremes, so they can be better off in outdoor installations like these. [Jasper]’s findings with using these generally hold that it’s a better value to install a slightly larger solar cell and use the LDO regulator rather than using a smaller cell and a more expensive switch-mode regulator. The key, though, is to size the LDO so that the voltage of the input is very close to the voltage of the output, which will minimize losses.

With unlimited time or money, good design can become less of an issue. In this case, however, saving a few percentage points in efficiency may not be worth the added cost and complexity of a slightly more efficient circuit, especially if the application will be scaled up for mass production. If switched mode really is required for some specific application, though, be sure to design one that’s not terribly noisy.