The up-and-coming Wonder of the World in software and information circles, and particularly among those who talk about them, is AI. Give a magic machine a lot of stuff, ask it a question, and it will give you a meaningful and useful answer. It will create art, write books, compose music, and generally Change The World As We Know It. All this is genuinely impressive stuff, as anyone who has played with DALL-E will tell you. But it’s important to think about what the technology can and can’t do that’s new, so as not to become caught up in the hype, and in doing that I’m immediately drawn to a previous career of mine.
I Knew I Should Have Taken That 8051 Firmware Job Instead
I’m an electronic engineer by training, but on graduation back in the 1990s I was seduced by the Commodore CDTV into the world of electronic publishing. CD-ROMs were the thing, then suddenly they weren’t, so I tumbled through games and web companies, and unexpectedly ended up working for Google. Was I at Larry and Sergey’s side? Hardly. The company I had worked for folded, so I found some agency temp work as a search engine quality rater.
This is a fascinating job that teaches you a lot about how search engines work, but as one of the trained monkeys against whom the algorithm is tested you are at the bottom of the Google heap. This led me into the weird world of search engine marketing companies at the white hat end, where my job morphed into discovering for myself the field of computational linguistics without realising it was already a thing, and using it to lead the customers into creating better content for their websites.
At this point, it’s probably time to talk about how the search engine marketing business works. If you own a website, you’ll almost certainly at some time have been bombarded by search engine optimisation, or SEO, companies offering you the chance to be Number One on Google. As we used to say: if anyone says that to you, ask their name. If it’s Larry Page or Sergey Brin, hire them. Otherwise don’t.
What the majority of these companies did was find chinks in the search giant’s armour, ways to exploit the algorithm to deliver a good result on some carefully chosen search term. The result is a constant battle between the SEOs and the algorithm developers, something we saw first-hand as quality raters. If you’re unwise enough to hire a black-hat SEO company, any success you gain will inevitably be taken away by an algorithm update, and you’ll probably be thrown into search engine hell as a result.
On the white hat end of the scale the job is a different one. You have a customer with a website they believe is good, but because it has little interesting content beyond whatever it is they sell, the search engine doesn’t agree with them. Your job is to help them turn it into an amazing website full of interesting, authoritative, and constantly updated content, and in that there are no shortcuts. The computational linguistic analysis of pages of competitor search results and websites would deliver a healthy pile of things to talk about, but making it happen was impossible without somebody putting in a lot of hard graft and creating the content. If you think about Hackaday for a moment, my colleagues have an amazing breadth of experience and are really good writers, so this site has very good content, but behind all that is a lot of work as we bash away at our keyboards creating it.
Does A Thing Have To Be Clever To Tell You Things You Didn’t Know?
If there’s one awesome thing corpus text analysis can do for you, it’s to tell you something you didn’t know about something you thought you knew. There were many times we had customers who gained a completely new insight into their industry by looking at a corpus of the rest of the industry’s information. They might know everything there was to know about the widgets they manufactured, but it turned out they often knew very little about how the world talked about those widgets.
But at this point it’s super-important to understand that a corpus analysis system isn’t clever, and it’s not trying to be. Compared with AI, it’s a big cauldron full of sentences in which the idea is to make the stuff you want float to the top when you stir it, while the AI is an attempt to make a magic clever box that knows all that information and says the good stuff from its mind when you prompt it. For simplicity’s sake, I’ll refer to the two as simply “dim” and “bright”.
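To make the cauldron a little more concrete, here is a minimal Python sketch of the “dim” approach. It is an illustration only, not the system described in this article, and everything in it is an assumption made for the example: the function names (`tokenize`, `count_words`, `stir`), the toy sentences, and the scoring, which is a simple smoothed ratio of how often a word appears in an industry corpus compared with a general reference corpus.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_words(sentences):
    """Count every token across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence))
    return counts


def stir(focus, reference, top_n=5):
    """Rank words by how over-represented they are in `focus` vs `reference`.

    The score is the word's relative frequency in the focus corpus divided
    by its relative frequency in the reference corpus. Add-one smoothing
    avoids dividing by zero for words the reference corpus never uses.
    """
    focus_counts = count_words(focus)
    ref_counts = count_words(reference)
    vocab = set(focus_counts) | set(ref_counts)
    focus_total = sum(focus_counts.values()) + len(vocab)
    ref_total = sum(ref_counts.values()) + len(vocab)

    scores = {}
    for word in focus_counts:
        focus_freq = (focus_counts[word] + 1) / focus_total
        ref_freq = (ref_counts[word] + 1) / ref_total
        scores[word] = focus_freq / ref_freq

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Hypothetical toy corpora: what an industry says vs. everyday text.
industry = [
    "Our widgets ship with a torque rating",
    "Widget torque matters for heavy loads",
    "Compare widget torque before you buy",
]
general = [
    "The weather is nice today",
    "Buy a ticket before you travel",
    "Heavy rain matters for the harvest",
]

print(stir(industry, general, top_n=2))
```

With these toy sentences, `torque` and `widget` float to the top because they turn up far more often in the industry text than in the general text. No understanding is involved, only counting.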
I’m very happy to be writing for Hackaday and not tweaking the web for a living any more, but I still follow the world of content analysis because it interests me. I’ve noticed a tendency in that world to discover AI and have a mind-blown moment. This technology is amazing, they say, it can do all these things! And it can, but here I have a moment of puzzlement. I’m watching people who presumably have access to and experience of those “dim” tools that do the job by statistical analysis of a pile of data, reacting in amazement when a “bright” tool does the same job using an AI model trained on the same data. And I guess here is my point. AI is very cool tech, but it’s cool because it can do new things, not because it can do things other tools already do. I’ve even read search engine marketeers gushing about how an AI could tell you how to be a search engine marketeer, when all I’m seeing is an AI that presumably has a few search engine marketing guides in its training data, simply repeating something it knows from them.
Please Don’t Place AI On A Pedestal Just Because It’s New To You
A friend of mine is somewhere close to the bleeding edge of text-based AI, and I have taken the opportunity to enhance my knowledge by asking him to show me what’s under the hood. It’s a technology that can sometimes amaze you by seeming clever and human — one of the things he demonstrated was a model that does a very passable D&D DM for example, and being a DM is something that requires a bit of ability to do well — but I despair at its being placed on a hype pedestal. It’s clear that AI tools will find their place and become an indispensable part of our tech future, but let’s have a little common sense as we enthuse about them, please!
My cauldron of sentences eventually evolved into a full-blown corpus analysis system that got me a job with a well-known academic publisher. When fed with news data it could sometimes predict election results, but even with that party trick I never found a freelance customer for it. Perhaps its time has passed and an AI could do a better job.
Meanwhile I worry about how the black hats in my former industry will use the new tools, and that an avalanche of AI-generated content that seems higher quality than it is will pollute search results with garbage that can’t be filtered out. Who knows, maybe an AI will be employed to spot it. One thing you can rely on, though: Hackaday content will remain written by real people with demonstrable knowledge of the subject!