Alan Turing proposed a test for machine intelligence that no longer works. The idea was to have people communicate over a terminal with both another real person and a computer. If the computer were intelligent, Turing mused, the interrogators would be unable to reliably tell it apart from the human. Clearly, with the advent of modern chatbots, that test is now broken. Despite the “AI” moniker, chatbots aren’t sentient or even pre-sentient, but they certainly seem that way. AI company CEO Mustafa Suleyman is proposing a new test: the AI has to take a $100,000 budget and turn it into $1,000,000.
We were a little bemused by this. By that measure, most of us aren’t intelligent, either, and it seems like a particularly capitalistic idea. We could probably write an Excel script that studied mutual fund performance and pull off the same trick, given enough time for the investment to mature. Is that intelligent? No. Besides, even humans who have demonstrated they can make $1,000,000 often sell their companies and start new ones that fail. How often does the AI have to succeed before we grant it person status?
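For the sake of argument, a couple of lines of Python show just how patient that mutual fund trick would have to be; the 7% annual return here is our own assumption of a typical long-run figure, not anything from Suleyman’s proposal.

```python
from math import log

stake, target, annual_return = 100_000, 1_000_000, 0.07  # 7% is an assumed long-run average
years = log(target / stake) / log(1 + annual_return)
print(f"{years:.0f} years")  # roughly 34 years: no intelligence required, just patience
```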
When we see a photograph of a scene, we can likely imagine what sounds would go with it, but what if this gets inverted, and we have to imagine the scene that goes with the sounds? How close would we get to reconstructing the scene in our mind, without the biases of our upbringing and background making this a near-impossible task? This is essentially the focus of a project by [Diego Trujillo Pisanty] which he calls Blind Camera.
Based on video data recorded in Mexico City, a neural network created in TensorFlow was trained on an RTX 3080 GPU, using a dataset of frames from these videos paired with their associated sounds. As a result, when the trained network is presented with a sound profile (the ‘photo’), it attempts to reconstruct the scene based on this input and its model, all of which has been adapted to run on a single Raspberry Pi 3B board.
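The write-up doesn’t spell out the network’s architecture, but conceptually it’s an encoder-decoder mapping an audio spectrogram to an image. A minimal Keras sketch of that shape, where every layer and dimension is our own guess rather than [Diego]’s actual network, might look like this:

```python
# Sketch of a sound-to-image model in the spirit of Blind Camera.
# All shapes and layer choices are illustrative assumptions.
from tensorflow.keras import layers, Model

spec_in = layers.Input(shape=(128, 128, 1))  # audio spectrogram: the 'photo'
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(spec_in)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
z = layers.Dense(256, activation="relu")(layers.Flatten()(x))  # sound embedding

x = layers.Dense(32 * 32 * 64, activation="relu")(z)
x = layers.Reshape((32, 32, 64))(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
img_out = layers.Conv2DTranspose(3, 3, strides=2, padding="same",
                                 activation="sigmoid")(x)  # 128x128 RGB scene guess

model = Model(spec_in, img_out)
model.compile(optimizer="adam", loss="mse")  # train on (sound, video frame) pairs
model.summary()
```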
However, since all the model knows are the sights and sounds of Mexico City, the resulting image will always be presented as a composite of scenes from this city. As [Diego] himself puts it: for the device, everything is a city. In a way, it’s an excellent demonstration that not only are neural networks limited by their training data, but so too are we humans.
[Bertrand Meyer] is a decided contrarian in his views on AI and programming. In a recent Communications of the ACM blog post, he reveals that, unlike many others, he thinks AI in its current state isn’t very useful for practical programming. He was responding, in part, to another ACM article entitled “The End of Programming,” which, like many similar pieces, claims that soon no one will write software the way we have for the last few decades. You can see [Matt Welsh] describe his thoughts on this in the video below. But [Bertrand] disagrees.
As we have also noted, [Bertrand] says:
“AI in its modern form, however, does not generate correct programs: it generates programs inferred from many earlier programs it has seen. These programs look correct but have no guarantee of correctness.”
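To make that concrete, here’s a hypothetical example of our own (not one from [Bertrand]’s post): code of the sort a model might happily emit, which reads plausibly and passes a one-off test, yet carries a classic bug.

```python
# Plausible-looking helper an LLM might generate for "append an entry to a log".
def append_log(entry, log=[]):      # BUG: the default list is created once,
    log.append(entry)               # at definition time, and shared by all calls
    return log

print(append_log("first"))          # ['first']           <- looks correct
print(append_log("second"))         # ['first', 'second'] <- stale state leaks in
```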
The future, as seen in the popular culture of half a century or more ago, was usually depicted as quite rosy. Technology would have placed every possible convenience at our fingertips, and we’d all live in futuristic automated homes — no doubt while wearing silver clothing and dreaming about our next vacation on Mars.
Of course, it’s not quite worked out this way. A family from 1965 whisked here in a time machine would miss a few things such as a printed newspaper, the landline telephone, or receiving a handwritten letter; they would probably marvel at the possibilities of the Internet, but they’d recognise most of the familiar things around us. We still sit on a sofa in front of a television for relaxation even if the TV is now a large LCD that plays a streaming service, we still drive cars to the supermarket, and we still cook our food much the way they did. George Jetson has not yet even entered the building.
The Future is Here, and it Responds to “Alexa”
There’s one aspect of the Jetsons future that has begun to happen, though. It’s not the futuristic automation of projects such as Disneyland’s Monsanto House of the Future, but instead our current stuttering home automation efforts. We don’t have domestic robots in pinnies handing us rolled-up newspapers, but we’re installing smart lightbulbs and thermostats, and we’re voice-controlling them through a variety of home hub devices. The future is here, and it responds to “Alexa”.
But for all the success that Alexa and other devices like it have had in conquering the living rooms of gadget fans, they’ve done a poor job of generating a profit. Alexa was supposed to be a gateway into Amazon services alongside their Fire devices, a convenient household companion that would help find all those little things for sale on Amazon’s website, and of course, enable you to buy them. Then, Alexa was supposed to move beyond your Echo and into other devices, as your appliances could come pre-equipped with Alexa-on-a-chip. Your microwave oven would no longer have a dial on the front; instead you would talk to it, it would recognise the food you’d bought from Amazon, and order more for you.
Instead of all that, Alexa has become an interface for connected home hardware, a way to turn on the light, view your Ring doorbell on models with screens, catch the weather forecast, and listen to music. It’s a novelty timepiece with that pod bay doors joke built in, and worse than that for the retailer, it remains by its very nature unseen. Amazon have got their shopping cart into your living room, but you’re not using it, and it hardly reminds you that it’s part of the Amazon empire at all.
But it wasn’t supposed to be that way. The idea was that you might look up from your work and say “Alexa, order me a six-pack of beer!”, and while it might not come immediately, your six-pack would duly arrive. It was supposed to be a friendly gateway to commerce on the website that has everything, and now Amazon can’t even persuade enough people to pay a few bucks to give it a celebrity voice.
The Gadget You Love to Hate
In the first few days after the Echo’s UK launch, a member of my hackerspace installed one in the space. He soon became exasperated as members learned that “Alexa, add butt plug to my wish list” would do just that. But in that joke we could see the problem with the whole idea of Alexa as an interface for commerce. He had locked down all purchasing options, but as it turns out, many people in San Diego hadn’t done the same thing. As the stories rolled in of kids spending hundreds of their parents’ hard-earned cash on toys, it would be a foolhardy owner who left purchasing enabled. Worse still, while the public remained largely in ignorance, the potential of the device for data gathering and unauthorized access hadn’t evaded researchers. It’s fair to say that our community has loved the idea of a device like the Echo, but many of us wouldn’t let one into our own homes under any circumstances.
So Alexa hasn’t been a success as a shopping gateway, but conversely it’s been a huge sales success in itself. The devices have sold like hot cakes, but since they’ve been sold at close to cost, they haven’t been the commercial bonanza Amazon might have hoped for. But what can be learned from this, other than that the world isn’t ready for a voice-activated shopping trolley?
Sadly, it seems that for most Alexa users, a device piping their actions back to a large company’s data centres is not enough of a concern. It’s an easy prediction that Alexa and other services like it will continue to evolve, with inevitable AI pixie dust sprinkled on them. A safe bet is that the killer app will be not a personal assistant but a virtual friend shared across a group of people, perhaps a family or a circle of friends. In due course we’ll also see locally hosted and open source equivalents appearing on yet-to-be-released hardware that will condense what takes a data centre of today’s GPUs into a single board computer. It’s not often that our community rejoices in being late to a technological party, but I for one want an Alexa equivalent that I control rather than one that invades my privacy for a third party.
The Paragraphica doesn’t actually take photographs at all. Instead, it uses GPS to determine the user’s current position. It then feeds the address, time of day, weather, and temperature into a paragraph which serves as a prompt for an AI image generator. It also uses data gathered from various APIs to determine points of interest in the immediate area, and feeds those into the prompt as well. It then generates an artificial image that is intended to bear some resemblance to the prompt, and ideally, the real-world scene. In place of a lens, it bears a 3D printed structure inspired by the star-nosed mole, which feels its way around in lieu of using its eyes.
Three dials on the Paragraphica control its operation. The first dial sets the radius of the area from which the prompt gathers data; it’s akin to setting the focal length of the lens. The second dial provides a noise seed value for the AI image generator, and the third dial controls how closely the AI sticks to the generated textual prompt.
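Putting the pieces together, the prompt-and-dials recipe plausibly looks something like the Python sketch below. The template wording, field names, and the mapping of the second and third dials onto a diffusion model’s seed and guidance scale are all our assumptions, not the device’s actual firmware.

```python
from dataclasses import dataclass

@dataclass
class Context:             # data the device gathers before 'shooting'
    address: str           # reverse-geocoded from GPS
    time_of_day: str
    weather: str
    temp_c: float
    nearby: list[str]      # points of interest within the dial-one radius

def build_prompt(ctx: Context) -> str:
    pois = ", ".join(ctx.nearby) or "no notable landmarks"
    return (f"A photo taken at {ctx.address} in the {ctx.time_of_day}. "
            f"The weather is {ctx.weather}, around {ctx.temp_c:.0f} degrees. "
            f"Nearby there is {pois}.")

ctx = Context("a street corner downtown", "late morning",
              "overcast", 16.0, ["a bakery", "a bus stop", "a small park"])

# Dial two sets the seed, dial three the guidance scale (prompt adherence).
request = {"prompt": build_prompt(ctx), "seed": 1234, "guidance_scale": 7.5}
print(request)
```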
The results are impressive, if completely false and generated from scratch. The Paragraphica generates semi-believable photos of a crowded alley, a public park, and a laneway full of parked cars. It’s akin to telling a friend where you are and what you’re seeing over the phone, and having them paint a picture based on that description.
Through their unique abilities and stolen data sets, AI image generators are proving controversial, to say the least. As all good art does, Paragraphica explores this and raises new questions of its own.
AR and VR developer [Skarredghost] got pretty excited about a virtual blue cube, and for a very good reason. It marked a successful prototype of an augmented reality experience in which the logic underlying the cube as a virtual object was changed by AI in response to verbal direction by the user. Saying “make it blue” did indeed turn the cube blue! (After a little thinking time, of course.)
It didn’t stop there, and the blue cube proof-of-concept led to a number of simple demos. The first shows off a row of cubes changing color from red to green in response to musical volume, then a bundle of cubes changes size in response to microphone volume, and cubes even start moving around in space.
The program accepts spoken input from the user, converts it to text, and sends it to a natural language AI model, which then generates the necessary modifications and loads them into the environment to make runtime changes in Unity. The workflow is a bit cumbersome and highlights many of the challenges involved, but it works, and that’s pretty nifty.
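In outline, the loop might look like the Python sketch below, using the SpeechRecognition library and the OpenAI client; the model name and prompts are placeholders, and the real project does all of this against Unity in C#, compiling and hot-loading the generated code at runtime.

```python
import speech_recognition as sr   # pip install SpeechRecognition
from openai import OpenAI         # pip install openai

client = OpenAI()                 # reads OPENAI_API_KEY from the environment

def listen() -> str:
    """Capture a spoken instruction and turn it into text."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as mic:
        audio = recognizer.listen(mic)
    return recognizer.recognize_google(audio)   # e.g. "make it blue"

def generate_patch(instruction: str) -> str:
    """Ask a language model for code implementing the instruction."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",      # placeholder model name
        messages=[
            {"role": "system",
             "content": "Return only code that modifies the scene object."},
            {"role": "user", "content": instruction},
        ],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    patch = generate_patch(listen())
    print(patch)   # the real project compiles and hot-loads this in Unity
```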
If you’re interested in this direction, it seems [Skarredghost] has rounded up the relevant details. And should you have a prototype idea that isn’t necessarily AR or VR but would benefit from AI-assisted speech recognition that can run locally, this project has what you need.
Recently, an amusing anecdote made the news headlines pertaining to the use of ChatGPT by a lawyer. This all started when a Mr. Mata sued an airline over an incident years prior in which, he claims, a metal serving cart struck his knee. When the airline filed a motion to dismiss the case on the basis of the statute of limitations, the plaintiff’s lawyer filed a submission arguing that the statute of limitations did not apply due to circumstances established in prior cases, which he cited in the submission.
Unfortunately for the plaintiff’s lawyer, the defendant’s counsel pointed out that none of these cases could be found, leading the judge to request that the plaintiff’s counsel submit copies of these purported cases. Although the plaintiff’s counsel complied with this request, the response from the judge (full court order PDF) was curt and rather irate, pointing out that none of the cited cases were real, and that the purported case texts were bogus.
The defense that the plaintiff’s counsel appears to lean on is that ChatGPT ‘assisted’ in researching these submissions, and had assured the lawyer – Mr. Schwartz – that all of these cases were real. The lawyers trusted ChatGPT enough to allow it to write an affidavit that they submitted to the court. With Mr. Schwartz likely to be sanctioned for this performance, it should also be noted that this is hardly the first time that ChatGPT and kin have been involved in such mishaps.