AI-Generated Sleep Podcast Urges You To Imagine Pleasant Nonsense

[Stavros Korokithakis] finds falling asleep to fairy tales soothing, and that inspired a fascinating project that indulges the habit: using machine learning to generate mildly incoherent fairy tales and read them aloud. The result is a fantastic sort of automated, machine-generated audible sleep aid. Even the logo is machine-generated!

The Deep Dreams Podcast is entirely machine-generated, including the logo.

The project leverages the natural language generation abilities of OpenAI’s GPT-3 to create fairytale-style content that is just coherent enough to sound natural, but not quite coherent enough to make a sensible plotline. The quasi-lucid, dreamlike result is perfect for urging listeners to imagine pleasant nonsense (thanks to Nathan W Pyle for that term) as they drift off to sleep.
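For the curious, generating that sort of quasi-lucid prose takes only a few lines against OpenAI’s classic completions interface. The sketch below is just an illustration of the general shape; the prompt, model, and sampling parameters are placeholders of our own, not the project’s actual settings (those live in the repo).

```python
# A minimal sketch of GPT-3 story generation, NOT the project's actual
# prompt or parameters (see the Deep Dreams GitLab repo for those).
import openai

openai.api_key = "sk-..."  # your OpenAI API key here

prompt = ("Tell a long, gentle fairy tale about a sleepy kingdom by the "
          "sea, in the style of the Brothers Grimm.\n\nOnce upon a time")

response = openai.Completion.create(
    engine="davinci",   # the original GPT-3 base model
    prompt=prompt,
    max_tokens=1024,    # a few minutes' worth of narration
    temperature=0.9,    # crank it up for pleasantly dreamlike incoherence
)

print("Once upon a time" + response.choices[0].text)
```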

We especially loved reading about the methods and challenges [Stavros] encountered while creating this project. For example, he explains that there is more to a good-sounding narration than just pointing a text-to-speech engine at a wall of text and mashing “GO”. A good episode has things like strategic pauses, background music, and audio fades, and that’s where pydub — a Python library for manipulating audio — came in handy. As for the speech itself, text-to-speech quality is far beyond what it was even just a few years ago (and certainly leaps beyond the machine-generated speech of the 80s), but it still took some work to settle on a voice that best suited the content, and the narration improved steadily as the project matured.
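As a taste of what pydub makes easy, here is a minimal sketch of that kind of episode assembly. The file names, timings, and levels are placeholders of our own, not the project’s actual assets; the real pipeline is in the repo.

```python
# Sketch of episode assembly with pydub: narration chunks joined with
# strategic pauses, laid over quiet music, with fades at both ends.
from pydub import AudioSegment

narration_files = ["chapter1.wav", "chapter2.wav", "chapter3.wav"]
pause = AudioSegment.silent(duration=1500)  # 1.5 s between sections

# Stitch the narration together, inserting a pause after each chunk
episode = AudioSegment.empty()
for f in narration_files:
    episode += AudioSegment.from_file(f) + pause

# Background music, attenuated by 18 dB and looped under the narration
music = AudioSegment.from_file("music.mp3") - 18
episode = episode.overlay(music, loop=True)

# Gentle fades so nothing jolts the listener awake
episode = episode.fade_in(3000).fade_out(5000)
episode.export("episode.mp3", format="mp3", bitrate="128k")
```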

Deep Dreams Podcast has a GitLab repository if you want to see the code that drives it all, and you can go to the podcast itself to give it a listen.


Hackaday Links: March 13, 2022

As Russia’s war on Ukraine drags on, its knock-on effects are being felt far beyond the Eastern European theater. Perhaps nowhere is this more acute than in the space launch industry, seeing that until recently, Russia was pretty much everyone’s go-to ride to orbit. All that has changed now, at least temporarily, and the fallout has expanded to include halting sales of rocket engines used in other nations’ launch vehicles. Specifically, Roscosmos has put an end to exports of the RD-180 engine used in the US Atlas V launch vehicle, along with the RD-181 thrusters found in the Antares rocket. The loss of these engines may be more symbolic than practical, at least for the RD-180 — United Launch Alliance stopped selling Atlas V launches last year and has already secured the engines it needs for the 29 flights it has booked. Still, there’s some irony in the fact that the Atlas V, which started life as an ICBM aimed at the USSR in the 1950s, has lost its Russian-made engines.

Bad news for Jan Mrázek’s popular open-source parametric search utility, which made JLCPCB’s component library easier to use. We wrote about it back in 2020, and things seemed to be going fine up until this week, when Jan received a takedown request for his service. When we first heard about it, the application’s web page bore a big red banner quoting some of the unpleasant accusations Jan had apparently received, including the words “reptile” and “parasitic.” The banner is still there, but the text has changed to a more hopeful tone, noting that LCSC, the component supplier for JLC’s assembly service, objected to the way Jan was pulling component data, and that the two are now working together on something everyone can be happy with. Here’s hoping the service is back in action again soon.

Good news, everyone: Epson is getting into the 3D printer business. Eager to add a dimension to the planar printing world they’ve mostly worked in, they’ve announced that they’ll be launching a direct-extrusion printer sometime soon. Aimed at the industrial market, the printer will use a “flat screw extruder,” which is supposed to be similar to what the company uses on its injection molding machines. We sure didn’t know Epson was in the injection molding market, so it’ll be interesting to see if expertise there results in innovation in 3D printing, especially if it trickles down to the consumer printing market. Just as long as they don’t try to DRM the pellets, of course.

You can’t judge a book by its cover, but it turns out there’s a lot you can tell about a person’s genetics just by looking at their face. At least that’s according to an AI startup called FDNA, which makes an app called “Face2Gene” that the company claims can identify 300 genetic disorders by analyzing photos of someone’s face. Some genetic disorders, like Down syndrome, produce easily recognizable facial features, but other changes are far more subtle and hard to recognize. We had heard of cases where photos of toddlers posted on social media were used to diagnose retinoblastoma, a rare cancer of the retina. But this is on another level entirely.

And finally, working in an Amazon warehouse has got to be a tough gig, and if some of the stories are to be believed, it borders on being a horror show. But one Amazonian recently shared a video that showed what it’s like to get trapped by his robotic coworkers. The warehouse employee somehow managed to get stuck in a maze created by Amazon’s pods, which are stacks of shelves that hold merchandise and are moved around the warehouse floor by what amounts to robotic pallet jacks. Apparently, the robots know enough to not collide with their meat-based colleagues, but not enough to not box them in. To be fair, the human eventually found a way out, but it was a long search and it seems like another pod could have moved into position to block the exit at any time. You could see it as a scary example of human-robot interaction gone awry, but we prefer to look at it as the robots giving their friend a little unscheduled break away from the prying eyes of his supervisor.

AI Maybe Revives Dead Languages

While Star Trek’s transporter is hard to imagine — perfect matter movement across vast distances with no equipment on one end — it may not be the most far-fetched piece of tech on the Enterprise. While there are several contenders, I strongly suspect the universal translator is the most unlikely MacGuffin. After all, how would you decipher a totally unknown language in real time? Of course, no one wants to watch 30 episodes of TV about how we finally figured out what Klingons call clouds, so pretty much every science fiction movie has some hand-waving explanation for speaking the viewer’s language. Farscape had microbes, some aliens have telepathy that works with alien brains of any kind, and still others study English from afar for decades off camera. Babel fish, anyone?

I was thinking about this because of an article I read by [Alizeh Kohari] about [Jiaming Luo’s] work using AI to decode dead languages. While this might seem to be similar to Spock’s translator, it really isn’t. Human languages change over time and distance. You only have to watch the BBC or read something written by Thomas Jefferson to see that. But there is still a lot in common, at least within certain domains.

Continue reading “AI Maybe Revives Dead Languages”

Display Your Speech In Realtime To Help Lipreaders In The Mask Era

Masks are all well and good when it comes to reducing the spread of deadly pathogens, but they can make it harder to understand people when they speak. They also make lipreading impossible. [Kevin Lewis] set about building something to help.

The system consists of a small screen that can be worn on the chest or elsewhere on the body, and a lapel microphone to record the wearer’s speech. Using the Deepgram AI speech recognition API from a Raspberry Pi Zero W, the system transcribes the speech and displays it on the Hyperpixel screen.

The API is quite capable: it can be set to respond only to the wearer’s voice, or, in a group mode, to transcribe speech from multiple people in the area, showing each voice in a different colour. There’s also a translation feature that uses the iTranslateApp API.
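To give a sense of the transcription step, here’s a minimal sketch against Deepgram’s prerecorded /v1/listen REST endpoint. This is our illustration rather than [Kevin]’s actual code (the real build streams live microphone audio), and the key and file name are placeholders.

```python
# Minimal sketch: send a WAV clip to Deepgram's prerecorded endpoint
# and print the transcript. The key and file name are placeholders.
import requests

DEEPGRAM_KEY = "your-api-key"

with open("speech.wav", "rb") as f:
    resp = requests.post(
        "https://api.deepgram.com/v1/listen?punctuate=true",
        headers={
            "Authorization": f"Token {DEEPGRAM_KEY}",
            "Content-Type": "audio/wav",
        },
        data=f,
    )

result = resp.json()
transcript = result["results"]["channels"][0]["alternatives"][0]["transcript"]
print(transcript)  # on the real build, this goes to the Hyperpixel display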

It’s a neat tool that could be of great use at conferences, or in any situation where a quick, simple machine translation could greatly ease communication. Video after the break.
Continue reading “Display Your Speech In Realtime To Help Lipreaders In The Mask Era”


Spatial AI And CV Hack Chat

Join us on Wednesday, December 1 at noon Pacific for the Spatial AI and CV Hack Chat with Erik Kokalj!

A lot of what we take for granted these days existed only in the realm of science fiction not all that long ago. And perhaps nowhere is this more true than in the field of machine vision. The little bounding box that pops up around everyone’s face when you go to take a picture with your cell phone is a perfect example; it seems so trivial now, but just think about what’s involved in putting that little yellow box on the screen, and how it would not have been plausible just 20 years ago.
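To appreciate how far things have come, here is that little yellow box condensed to a handful of lines, using a classical Haar-cascade detector from OpenCV. This is a sketch of the general technique, not what any particular phone actually runs.

```python
# Classical Haar-cascade face detection: find faces, draw yellow boxes.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    # (0, 255, 255) is yellow in OpenCV's BGR color order
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 255), 2)

cv2.imwrite("boxed.jpg", img)
```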


Perhaps even more exciting than the development of computer vision systems is their accessibility to anyone, as well as their move into the third dimension. No longer confined to flat images, spatial AI and CV systems seek to extract information from the position of objects relative to others in the scene. It’s a huge leap forward in making machines see like we see and make decisions based on that information.

To help us along the road to incorporating spatial AI into our projects, Erik Kokalj will stop by the Hack Chat. Erik does technical documentation and support at Luxonis, a company working on the edge of spatial AI and computer vision. Join us as we explore the depths of spatial AI.

Our Hack Chats are live community events in the Hackaday.io Hack Chat group messaging. This week we’ll be sitting down on Wednesday, December 1st at 12:00 PM Pacific time. If time zones have you tied up, we have a handy time zone converter.


Hackaday Links: October 10, 2021

We have to admit, it was hard not to be insufferably smug this week when Facebook temporarily went dark around the globe. Sick of being stalked by crazy aunts and cousins, I opted out of that little slice of cyber-hell at least a decade ago, so Monday’s outage was no skin off my nose. But it was nice to see that the world didn’t stop turning. More interesting are the technical postmortems on the outage, particularly this great analysis by the good folks at the University of Nottingham. Dr. Steve Bagley does a great job explaining how Facebook likely pushed a Border Gateway Protocol (BGP) configuration change that propagated through the Internet and eventually withdrew every route to Facebook’s servers, taking their DNS servers offline along with everything else. He uses a graphical map of routes to show peer-to-peer connections to Facebook dropping one at a time, until their machines were totally isolated, and offers some speculation on why Facebook engineers were denied access, sometimes physically, to their own systems.
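From the outside, the failure mode was easy to observe: with the routes withdrawn, Facebook’s authoritative DNS servers were unreachable, and every lookup simply failed. Something like this quick check (our illustration, using dnspython, not anything from Dr. Bagley’s analysis) would have shown it:

```python
# When no route to the authoritative servers exists, name resolution
# fails outright. dnspython is our choice here purely for illustration.
import dns.exception
import dns.resolver

try:
    answers = dns.resolver.resolve("facebook.com", "A")
    print([a.to_text() for a in answers])
except dns.exception.Timeout:
    print("Timed out: no path to the authoritative DNS servers")
except dns.exception.DNSException as err:
    print(f"Lookup failed: {err}")
```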

It may be a couple of decades overdue, but the US Federal Communications Commission finally decided to allow FM voice transmissions on Citizens Band radios. It seems odd to be messing around with a radio service whose heyday was in the 1970s, but Cobra, the CB radio manufacturer, petitioned for a rule change to allow frequency modulation in addition to the standard amplitude modulation that’s currently mandatory. It’s hard to say how this will improve the CB user experience, which last time we checked was a horrifying mix of shouting and screaming voices, often with a weird echo effect, all put through powerful — and illegal — linear amps that distort the signal beyond intelligibility. We can’t see how a little less static is going to improve that.

Can you steal a car with a Game Boy? Probably not, but car thieves in the UK are using some sort of device hidden in a Game Boy case to boost expensive cars. A group of three men in Yorkshire used the device, which supposedly cost £20,000 ($27,000), to wirelessly defeat the security systems on cars in seconds. They stole cars from garages and driveways to the tune of £180,000 — not a bad return on their investment. It’s not clear how the device works, but we’d love to find out — for science, of course.

There have been tons of stories lately about all the things AI is good for, and all the magical promises it will deliver on given enough time. And it may well, but we’re still early enough in the AI hype curve to take everything we see with a grain of salt. However, one area that bears watching is the ability of AI to help fill in the gaps left when an artist is struck down before completing their work. And perhaps no artist left so much on the table as Ludwig van Beethoven, with his famous unfinished 10th Symphony. When the German composer died, he had left only a few notes on what he wanted to do with the four-movement symphony. But those notes, along with a rich body of other works and deep knowledge of the composer’s creative process, have allowed a team of musicologists and AI experts to complete the 10th Symphony. The article contains a lot of technical detail, both on the musical and the informatics sides. How will it sound? Here’s a preview:

And finally, Captain Kirk is at last getting to space. William Shatner, who played captain — and later admiral — James Tiberius Kirk from the 1960s to the 1990s, will head to space aboard Blue Origin’s New Shepard rocket on Tuesday. At 90 years old, Shatner will edge out Wally Funk, who recently set the oldest-astronaut record on her own Blue Origin flight at the age of 82. It’s interesting that Shatner agreed to go, since he is said to have previously refused the offer of a ride upstairs with Virgin Galactic. Whatever the reason for the change of heart, here’s hoping the flight goes well.


AI Generating Paintings Off To A Flying Art

The philosophical question of “What is art?” has an ethereal, transient quality to it. A definition seems to slip away as you get close to an answer. Embracing that quality, [Max Fischer] has created an AI-powered painting that paints a new piece of art at the push of a button. When the button below the screen is pushed, a new image is generated and the old one is forever lost, which, in a way, makes the frame a piece of art itself.

What really makes this project stand out is the sheer quality of the documentation in the GitHub repo. The instructions are incredibly detailed: everything from setting up the Jetson to building the control box out of half-inch MDF (12 mm for the sane part of the world) is laid out with copious pictures. Despite the ease of generating images ahead of time, [Max] took the hard, very Hackaday route and did all inference locally and in real time. To handle the processing requirements, an NVIDIA Jetson Xavier NX single-board computer was used. He trained StyleGAN on high-resolution abstract art, and a fresh image is generated whenever the button below the screen is pushed. To prevent screen burn-in, a PIR sensor was added to turn the screen off when no one is around.
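For flavor, here is a rough sketch of what a button-to-new-art loop can look like. The pin numbers are hypothetical wiring of our own, and loading the generator as a TorchScript export is purely our assumption; [Max]’s repo has the real control code (minus the model).

```python
# Rough sketch of a button-triggered StyleGAN loop on a Jetson.
# Pin numbers and the TorchScript generator are our assumptions.
import torch
import Jetson.GPIO as GPIO

BUTTON_PIN = 18   # hypothetical wiring
PIR_PIN = 23

GPIO.setmode(GPIO.BCM)
GPIO.setup(BUTTON_PIN, GPIO.IN)
GPIO.setup(PIR_PIN, GPIO.IN)

generator = torch.jit.load("stylegan_abstract.pt").cuda().eval()

def new_artwork():
    # StyleGAN maps a 512-D latent vector to an image; a fresh random
    # vector means a fresh, never-to-be-repeated painting.
    z = torch.randn(1, 512, device="cuda")
    with torch.no_grad():
        return generator(z)  # (1, 3, H, W) tensor, values in [-1, 1]

while True:
    # Block until the button is pressed, timing out to service the PIR
    if GPIO.wait_for_edge(BUTTON_PIN, GPIO.FALLING, timeout=500) is not None:
        art = new_artwork()
        # ...convert the tensor to an image and push it to the display...
    if not GPIO.input(PIR_PIN):
        pass  # nobody around: this is where the screen would be blanked
```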

Here at Hackaday, we’ve seen several projects putting old laptop screens or monitors into a nice wooden case and mounting them to the wall. Since 32″ laptops are rather hard to find, [Max] opted to take a different approach and instead got a 32″ Samsung Frame for relatively cheap.

For all its detail, the readme does leave one thing out: the model that generates the art. [Max] hints that he wants others to create their own picture frames, but with their own art generation. So what are you waiting for? Go make some art.