Crunching The News For Fun And Little Profit

Do you ever look at the news, and wonder about the process behind the news cycle? I did, and for the last couple of decades it’s been the subject of one of my projects. The Raspberry Pi on my shelf runs my word trend analysis tool for news content, and since my journey from curious geek to having my own large corpus analysis system has taken twenty years it’s worth a second look.

How Career Turmoil Led To A Two Decade Project

A hanging sign surrounded by ornate metalwork, with the legend "Cyder house".
This is very much a minority spelling. Colin Smith, CC BY-SA 2.0.

In the middle of the 2000s I had come out of the dotcom crash mostly intact, and was working for a small web shop. When they went bust I was casting around as one does, and spent a while as a Google quality rater while I looked for a new permie job. These teams are employed by the search giant through temporary employment agencies, and in loose terms their job is to be the trained monkeys against whom the algorithm is tested. The algorithm chose X, and if the humans also chose X, the algorithm is probably getting it right. Being a quality rater is not in any way a high-profile job, but with the big shiny G on my CV I soon found myself in demand from web companies seeking some white-hat search engine marketing expertise. What I learned mirrored my lesson from a decade earlier in the CD-ROM business, that on the web as in any other electronic publishing medium, good content well presented has priority over any black-hat tricks.

But what makes good content? Forget an obsession with stuffing bogus keywords in the text, and instead talk about the right things, and do it authoritatively. What are the right things in this context? If you are covering a subject, you need to do so using the right language; that which the majority uses rather than language only you use. I can think of a bunch of examples which I probably shouldn’t talk about, but an example close to home for me comes in cider. In the UK, cider is a fermented alcoholic drink made from apples, and as a craft cidermaker of many years standing I have a good grasp of its vocabulary. The accepted spelling is “Cider”, but there’s an alternate spelling of “Cyder” used by some commercial producers of the drink. It doesn’t take long to realise that online, hardly anyone uses cyder with a Y, and thus pages concentrating on that word will do less well than those talking about cider.

A graph of the word football versus the word soccer in British news.
We Brits rarely use the word “soccer” unless there’s a story about the Club World Cup in America.

I started to build software to analyse language around a given topic, with the aim of discerning the metaphorical cider from the cyder. It was a great surprise a few years later to discover that I had invented for myself the already-existing field of computational linguistics, something that would have saved me a lot of time had I known about it when I began. I was taking a corpus of text and computing the frequencies and collocates (words that appear alongside each other) of the words within it, and from that I could quickly see which wording mattered around a subject, and which didn’t. This led seamlessly to an interest in what the same process would look like for news data with a time axis added, so I created a version which harvested its corpus from RSS feeds. Thus began my decades-long project.

Continue reading “Crunching The News For Fun And Little Profit”

The End Of The Hackintosh Is Upon Us

From the very dawn of the personal computing era, the PC and Apple platforms have gone very different ways. IBM compatibles surged in popularity, while Apple was able to more closely guard the Macintosh from imitators wanting to duplicate its hardware and run its software.

Things changed when Apple announced it would hop aboard the x86 bandwagon in 2005. Soon enough was born the Hackintosh. It was difficult, yet possible, to run MacOS on your own computer built with the PC parts your heart desired.

Only, the Hackintosh era is now coming to the end. With the transition to Apple Silicon all but complete, MacOS will abandon the Intel world once more.

Continue reading “The End Of The Hackintosh Is Upon Us”

The Hackaday Summer Reading List: No AI Involvement, Guaranteed

If you have any empathy at all for those of us in the journalistic profession, have some pity for the poor editor at the Chicago Sun-Times, who let through an AI-generated summer reading list made up of novels which didn’t exist.  The fake works all had real authors and thus looked plausible, thus we expect that librarians and booksellers throughout the paper’s distribution area were left scratching their heads as to why they’re not in the catalogue.

Here at Hackaday we’re refreshingly meat-based, so with a guarantee of no machine involvement, we’d like to present our own summer reading list. They’re none of them new works but we think you’ll find them as entertaining, informative, or downright useful as we did when we read them. What are you reading this summer? Continue reading “The Hackaday Summer Reading List: No AI Involvement, Guaranteed”

Back To The Future, 40 Years Old, Looks Like The Past

Great Scott! If my calculations are correct, when this baby hits 88 miles per hour, you’re gonna see some serious shit. — Doc Brown

On this day, forty years ago, July 3rd, 1985 the movie Back to the Future was released. While not as fundamental as Hackers or realistic as Sneakers, this movie worked its way into our pantheon. We thought it would be appropriate to commemorate this element of hacker culture on this day, its forty year anniversary.

If you just never got around to watching it, or if it has been a few decades since you did, then you might not recall that the movie is set in two periods. It opens in 1985 and then goes back to 1955. Most of the movie is set in 1955 with Marty trying to get back to 1985 — “back to the future”. The movie celebrates the advanced technology and fashions of 1985 and is all about how silly the technology and fashions of 1955 are as compared with the advancements of 1985. But now it’s the far future, the year 2025, and we thought we might take a look at some of the technology that was enchanting in 1985 but that turned out to be obsolete in “the future”, forty years on. Continue reading “Back To The Future, 40 Years Old, Looks Like The Past”

Why The Latest Linux Kernel Won’t Run On Your 486 And 586 Anymore

Some time ago, Linus Torvalds made a throwaway comment that sent ripples through the Linux world. Was it perhaps time to abandon support for the now-ancient Intel 486? Developers had already abandoned the 386 in 2012, and Torvalds openly mused if the time was right to make further cuts for the benefit of modernity.

It would take three long years, but that eventuality finally came to pass. As of version 6.15, the Linux kernel will no longer support chips running the 80486 architecture, along with a gaggle of early “586” chips as well. It’s all down to some housekeeping and precise technical changes that will make the new code inoperable with the machines of the past.

Continue reading “Why The Latest Linux Kernel Won’t Run On Your 486 And 586 Anymore”

One Laptop Manufacturer Had To Stop Janet Jackson Crashing Laptops

There are all manner of musical myths, covering tones and melodies that have effects ranging from the profound to the supernatural. The Pied Piper, for example, or the infamous “brown note.”

But what about a song that could crash your laptop just by playing it? Even better, a song that could crash nearby laptops in the vicinity, too? It’s not magic, and it’s not a trick—it was just a punchy pop song that Janet Jackson wrote back in 1989.

Continue reading “One Laptop Manufacturer Had To Stop Janet Jackson Crashing Laptops”

The 2025 Iberian Peninsula Blackout: From Solar Wobbles To Cascade Failures

Some Mondays are worse than others, but April 28 2025 was particularly bad for millions of people in Spain and Portugal. Starting just after noon, a number of significant grid oscillations occurred which would worsen over the course of minutes until both countries were plunged into a blackout. After a first substation tripped, in the span of only a few tens of seconds the effects cascaded across the Iberian peninsula as generators, substations, and transmission lines tripped and went offline. Only after the HVDC and AC transmission lines at the Spain-France border tripped did the cascade stop, but it had left practically the entirety of the peninsula without a functioning power grid. The event is estimated to have been the biggest blackout in Europe ever.

Following the blackout, grid operators in the affected regions scrambled to restore power, while the populace tried to make the best of being plummeted suddenly into a pre-electricity era. Yet even as power gradually came back online over the course of about ten hours, the question of what could cause such a complete grid collapse and whether it might happen again remained.

With recently a number of official investigation reports having been published, we have now finally some insight in how a big chunk of the European electrical grid suddenly tipped over.

Continue reading “The 2025 Iberian Peninsula Blackout: From Solar Wobbles To Cascade Failures”