What If The Matrix Was Made In The 1950s?

We’ve noticed a recent YouTube trend of making trailers for shows and movies as if they had been produced in the 1950s, even when they weren’t. The results are impressive and, as you might expect, leverage AI generation tools. While we enjoy watching them, we were especially interested in [Patrick Gibney’s] peek behind the curtain of how he makes them, as you can see below. If you want to see an example of the result first, check out the second video, showing a 1950s-era The Matrix.

Of course, you could do some of it yourself, but if you want the full AI experience, [Patrick] suggests using ChatGPT to produce a script, though he admits that if he did that, he would tweak the results. Other AI tools create the pictures used and the announcer-style narration. Another tool produces cinematographic shots that include the motion of the “actors” and other things in the scene. More tools create the background music.

Continue reading “What If The Matrix Was Made In The 1950s?”

Can You Hear Me Now? Try These Headphones

When you are young, you take it for granted that you can pick out a voice in a crowded room or on a factory floor. But as you get older, your hearing often gets to the point where a noisy room merges into a mishmash of sounds. University of Washington researchers have developed what they call Target Speech Hearing. In plain English, it is an AI-powered headphone system that lets you look at someone and pull their voice out of the chatter. For best results, however, you have to enroll the target speaker’s voice first, so it wouldn’t make a great eavesdropping device.

If you want to dive into the technical details, their paper goes into how it works. The prototype uses a Sony noise-cancelling headset. However, the system requires binaural microphones, so additional microphones are attached to the outside of the headphones.

Continue reading “Can You Hear Me Now? Try These Headphones”

Feast Your Eyes On These AI-Generated Sounds

The radio hackers in the audience will be familiar with a spectrogram display, but for the uninitiated, it’s basically a visual representation of how the energy across a range of frequencies changes over time. Usually such a display is used to pick a clear transmission out of a sea of noise, but with the right software, it’s possible to generate a signal that shows up as text or an image when viewed as a spectrogram. Musicians even occasionally use the technique to hide images in their songs. Unfortunately, the audio side of such a trick generally sounds like gibberish to human ears.
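If you want to play with spectrograms yourself before getting to the AI part, a few lines of Python will do it. Here’s a minimal sketch using NumPy, SciPy, and Matplotlib (our choice of tools, not anything the researchers mention) that generates a frequency sweep and plots its spectrogram; the sweep shows up as a rising diagonal line.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import chirp, spectrogram

fs = 8000                                    # sample rate in Hz
t = np.arange(0, 2.0, 1 / fs)                # two seconds of samples
# A tone sweeping from 500 Hz to 3 kHz -- it draws a rising line on the spectrogram
audio = chirp(t, f0=500, t1=2.0, f1=3000, method="linear")

# Short-time FFT: frequency bins, time bins, and power for each (frequency, time) cell
f, times, Sxx = spectrogram(audio, fs=fs, nperseg=256, noverlap=192)

plt.pcolormesh(times, f, 10 * np.log10(Sxx + 1e-12), shading="gouraud")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram of a frequency sweep")
plt.show()
```

Embedding text or a picture is then “just” a matter of synthesizing audio whose short-time spectrum traces out the target image, which is the part the tools discussed here automate.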

Or at least, it used to. Students from the University of Michigan have found a way to use diffusion models to not only create a spectrogram image for a given prompt, but to do it with audio that actually makes sense given what the image shows. So for example if you asked for a spectrogram of a race car, you might get an audio track that sounds like a revving engine.

Continue reading “Feast Your Eyes On These AI-Generated Sounds”

Try Image Classification Running In Your Browser, Thanks To WebGPU

When something does zero-shot image classification, that means it’s able to make judgments about the contents of an image without the user needing to train the system beforehand on what to look for. Watch it in action with this online demo, which uses WebGPU to implement CLIP (Contrastive Language–Image Pre-training) running in one’s browser, using the input from an attached camera.

By giving the program some natural-language visual concept labels (such as ‘person’ or ‘cat’) that fit a hypothetical template for the image content, the system will output, in real time, its judgment of how well each label fits what the camera sees. Again, all of this runs locally.

It may be a little unintuitive, but what’s happening in the demo is that the system is deciding which of the user-provided labels (“a photo of a cat” vs. “a photo of a bald man”, for example) best describes what the camera sees. The better a particular label is judged to fit the image, the higher the number beside it.
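The demo does all of this in the browser on top of WebGPU, but if you’d rather poke at the underlying zero-shot trick offline, here’s a rough sketch using one of the public CLIP checkpoints via the Hugging Face transformers library. The checkpoint name, the frame.jpg file, and the labels below are our own stand-ins, not necessarily what the demo uses.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# One of the public CLIP checkpoints; the browser demo may well use a different one
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("frame.jpg")              # stand-in for a grabbed camera frame
labels = ["a photo of a cat", "a photo of a bald man"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# CLIP scores each label against the image; softmax turns the scores into probabilities
probs = outputs.logits_per_image.softmax(dim=1)[0]
for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.2f}")
```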

This kind of process benefits greatly from shoveling the hard parts of the computation onto compatible graphics cards, which is exactly what WebGPU provides by allowing the browser access to a local GPU. WebGPU is relatively recent, but we’ve already seen it used to run LLMs (Large Language Models) directly in the browser.

Wondering what makes GPUs so very useful for AI-type applications? It’s all about their ability to churn through enormous amounts of data in parallel, very quickly.

NetBSD Bans AI-Generated Code From Commits

A recent change to the NetBSD commit guidelines states that code generated by Large Language Models (LLMs) or similar technologies, such as ChatGPT, Microsoft’s Copilot, or Meta’s Code Llama, is presumed to be tainted code. This amendment extends the existing section on tainted code, which originally covered any code not written directly by the person committing it and exists because of licensing concerns. The obvious reason is that, otherwise, code could end up in the NetBSD codebase under an incompatible (or proprietary) license.

In the case of LLM-based code generators like those mentioned above, the problem stems from the fact that they are trained on millions of lines of code from all over the internet, released under a wide variety of licenses. Invariably, some of that code will be covered by a license that is not acceptable for the NetBSD codebase. The guideline does note that such auto-generated commits may still be admissible, but they require written permission from core developers, and presumably an in-depth audit of the code’s heritage. This should leave non-trivial commits churned out by ChatGPT and kin out in the cold.

The debate about the validity of works produced by current-gen “artificial intelligence” software is only just beginning, but there’s little question that NetBSD has made the right call here. From a legal and software engineering perspective this policy makes perfect sense, as LLM-generated code simply doesn’t meet the project’s standards. That said, code produced by humans brings with it a whole different set of potential problems.

How AI Large Language Models Work, Explained Without Math

Large Language Models (LLMs) are everywhere, but how exactly do they work under the hood? [Miguel Grinberg] provides a great explanation of the inner workings of LLMs in simple (but not simplistic) terms that eschews the low-level mathematics of how they work in favor of laying bare what it is they do.

At their heart, LLMs are prediction machines that work on tokens (small groups of letters and punctuation) and are, as a result, capable of great feats of human-seeming communication. Most technically minded people understand that LLMs have no idea what they are saying, and this peek at their inner workings will make that abundantly clear.
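If you want to see the “prediction machine” idea with your own eyes, here’s a small sketch (using the freely available GPT-2 model through the Hugging Face transformers library, which isn’t something [Miguel]’s article requires) that prints the five tokens the model thinks are most likely to come next after a short prompt.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The quick brown fox"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits         # a score for every vocabulary token at every position

next_token_scores = logits[0, -1]            # scores for whatever token would come next
top = torch.topk(next_token_scores, k=5)

for token_id, score in zip(top.indices, top.values):
    print(repr(tokenizer.decode(int(token_id))), f"{float(score):.2f}")
```

Pick one of those tokens, append it to the prompt, and repeat; at its core, that loop is all that text generation is.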

Be sure to also review an illustrated guide to how image-generating AIs work. And if a peek under the hood of LLMs left you hungry for more low-level details, check out our coverage of training a GPT-2 LLM using pure C code.

[Image: Kaffa Roastery founder Svante Hampf shows a bag of their AI-conic coffee blend.]

AI-Created Coffee Blend Isn’t Terrible

Weren’t we just talking about coffee-based sacrilege the other day? Here’s something to make the single-origin bean snobs chew their espresso cups: an artisan roastery in Helsinki is offering a coffee blend created by artificial intelligence called AI-conic. The idea, of course, is that technology will lighten the workload needed to produce coffee.

This is an interesting development because Finland consumes more coffee per capita than any other country, according to the International Coffee Organization. Coffee roasting is a highly valued traditional artisan profession there, so it stands to reason that they might turn to technology for help.

Just as with scotch whisky, there’s nothing inherently wrong with coffee blends. Bean blends are good for consistency, when you want every cup to taste pretty much exactly the same. Single-origin beans, though, are traceable to one location, and as a result, they usually have a distinct flavor based on the climate they’re grown in.

If you’re new to coffee, blends are a nice, safe way to start out. Interestingly, although the AI was tasked with creating a blend that would suit the palates of coffee enthusiasts, it chose to mix four different types of beans instead of the usual two or three. Even so, the coffee experts agreed that the AI blend was “perfect” and needed no human intervention. We probably won’t be getting to Finland anytime soon, so if you try it, let us know how it tastes!

Do you like cold brew? How would you like to be able to brew some in just three minutes?