GETMusic Uses Machine Learning To Generate Music, Understands Tracks

Music generation guided by machine learning can make great projects, but there’s not usually much apparent control over the results. The system makes what it makes, and it’s an achievement if the results are not obvious cacophony. But that’s all different with GETMusic, which allows for a much more involved approach because it understands and is able to create music by tracks. Among other things, this means one can generate a basic rhythm and melody first, then layer additional elements on top while leaving the existing tracks unchanged.

GETMusic can make music from scratch or be guided by examples, and under the hood it uses a diffusion-based approach similar to the method behind AI image generators like Stable Diffusion. We’ve previously covered how Stable Diffusion works, but here the same basic principles are used to guide the model from random noise to useful tracks of music instead of images.
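For a rough feel of what “guiding the model from random noise” means, here’s a deliberately simplified Python sketch of a diffusion sampling loop. It’s purely illustrative (GETMusic actually works on a discrete grid of note tokens, and the noise schedule and stand-in model below are our own assumptions), but the shape of the process is the same: start from noise, then repeatedly predict and remove it.

```python
import numpy as np

# Toy linear noise schedule for T steps; real models tune this carefully.
T = 50
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(x, t):
    """Stand-in for a trained network that predicts the noise present in x at step t."""
    return np.zeros_like(x)  # a real model returns its learned noise estimate

def sample(shape):
    """DDPM-style reverse process: start from pure noise and step back toward data."""
    x = np.random.randn(*shape)
    for t in reversed(range(T)):
        eps = predict_noise(x, t)
        # Remove the predicted noise component for this step.
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * np.random.randn(*shape)  # re-inject a little noise
    return x

# Imagine rows as tracks (drums, bass, melody, ...) and columns as time steps.
piano_roll = sample((4, 256))
```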

Just a few years ago we saw a neural network trained to generate Bach, and while it was capable of moments of brilliance, it didn’t produce uniformly listenable output. GETMusic is on an entirely different level. The model and code are available online, and there is a research paper to accompany it.

You can watch a video putting it through its paces just below the page break, and there are more videos on the project summary page.

Continue reading “GETMusic Uses Machine Learning To Generate Music, Understands Tracks”

Beautifully Rebuilding A VR Headset To Add AR Features

[PyottDesign] recently wrapped up a personal project to create himself a custom AR/VR headset that could serve as an AR (augmented reality) development platform and do everything he needed in a single device. He succeeded wonderfully, and published a video showcase of the finished project.

No off-the-shelf headset had the features he wanted, so he accomplished his goals with a skillful custom repackaging of a Quest 2 VR headset, integrating a Stereolabs Zed Mini stereo camera (aimed at mixed reality applications) and an Ultraleap IR 170 hand tracking module. These hardware modules have tons of software support and are not very big, but when sticking something onto a human face, every millimeter and gram counts.

Continue reading “Beautifully Rebuilding A VR Headset To Add AR Features”

Text-to-Speech Model Can Do Music, Background Noises, And Sound Effects

Bark is a universal text-to-audio model that can not only create realistic speech but also incorporate music, background noises, and sound effects. It can even include non-speech sounds like laughter, sighs, throat clearings, and similar elements. But despite the fact that it can deliver such complex results, it’s important to understand some of its peculiarities.

The model takes a prompt and generates the resulting sound from scratch. Results might sometimes be unexpected.

Bark is not a conventional text-to-speech program, and how it works has a lot more in common with large language model AI chatbots. This means that results can deviate from expectations, and outputs aren’t necessarily going to be studio-quality speech. As the project’s README points out, “(generated outputs can) be anything from perfect speech to multiple people arguing at a baseball game recorded with bad microphones.” That being said, there is some support for voice presets as a way to help guide the model with some consistency.

Bark was designed by a company called Suno for research purposes and is available under the MIT License. It can be installed and run locally, and has some demos available as well as an online implementation.
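If you’d like to poke at it from Python, usage looks roughly like the snippet below, based on the project’s README. Treat it as a sketch rather than gospel (the voice preset name is just one example), and check the repository for the current API.

```python
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

# Download and cache the model weights (large; only happens once).
preload_models()

# Non-speech cues like [laughs] or [sighs] can be embedded directly in the prompt.
text_prompt = "Hello! [laughs] Bark can also hum a little tune for you."
audio_array = generate_audio(text_prompt, history_prompt="v2/en_speaker_6")

# Bark returns a numpy array of samples at SAMPLE_RATE.
write_wav("bark_output.wav", SAMPLE_RATE, audio_array)
```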

The ability to install and run Bark locally is promising territory for incorporating it into projects. And should you be more interested in speech-to-text instead, don’t forget about this plain C/C++ implementation of AI-powered speech recognition.

Weather In Wartime: The Importance Of British Meteorology In WWII

Weather can have a significant impact on transport and operations of all kinds, especially those at sea or in the air. This makes it a deeply important field of study, particularly in wartime. If you’re at all curious about how this kind of information was gathered and handled in the days before satellites and computer models, this write-up on WWII meteorology is sure to pique your interest.

Weather conditions were valuable data, and weather forecasts even more so. Both depended on human operators to read the instruments and transmit their readings.

The main method of learning weather conditions over the oceans is to persuade merchant ships to report their observations regularly. This is true even today, but these days we also have the benefit of things like satellite technology. Back in the mid-1900s there was no such thing, and the outbreak of WWII (including the classification of weather data as secret information due to its value) meant that new solutions were needed.

The aircraft of the Royal Air Force (RAF) were particularly in need of accurate data, yet there was little to no understanding of the upper atmosphere at the time. Eventually, aircraft flew regular 10-hour sorties, logging detailed readings that provided data on weather conditions across the Atlantic. Readings were logged, encoded with one-time pad (OTP) encryption, then radioed back to base, where charts would be created and updated every few hours.
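As a quick aside on the cipher side of things, the one-time pad idea itself fits in a few lines of Python. This is only an illustration of the principle, not the RAF’s actual encoding tables: each digit of a reading is shifted by a digit from a random pad that is used exactly once, which makes the ciphertext unbreakable without a copy of the pad.

```python
import secrets

def otp_encode(digits: str, pad: str) -> str:
    """Add each message digit to the matching pad digit, modulo 10."""
    return "".join(str((int(m) + int(k)) % 10) for m, k in zip(digits, pad))

def otp_decode(cipher: str, pad: str) -> str:
    """Subtract the pad digits to recover the original reading."""
    return "".join(str((int(c) - int(k)) % 10) for c, k in zip(cipher, pad))

reading = "10231015"  # hypothetical pressure/temperature digit groups
pad = "".join(secrets.choice("0123456789") for _ in reading)
cipher = otp_encode(reading, pad)
assert otp_decode(cipher, pad) == reading
```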

The value of accurate data and a precise understanding of conditions and how they could change was grimly illustrated in a disaster called the Night of the Big Wind (March 24-25, 1944). Forecasts predicted winds no stronger than 45 mph, but the Allied bomber stream sent to Berlin was torn apart when it encountered winds in excess of 120 mph, leading to the loss of 72 aircraft.

The types of data recorded to monitor and model weather are nearly identical to those in modern weather stations. The main difference is that instruments used to be read and monitored by human beings, whereas today we can rely more on electronic readings and transmission that need no human intervention.

Closing In On A PC-Enabled PSVR2

When the PlayStation VR2 headset was released, people wondered whether it would be possible to use it as a PC VR headset; that is, to plug it into a PC and use it for VR there, instead of it working only with a PS5 as Sony intended.

Enthusiasts were initially skeptical and at times despondent about the prospects, but developer [iVRy]’s efforts recently had a breakthrough. A PC-compatible VR2 is looking more likely to happen.

So far, [iVRy] claims to have 6 DOF SLAM (simultaneous localization and mapping), proximity sensor, and stereo camera data working.

Most of the juicy bits are paywalled behind [iVRy]’s Patreon. We’re hoping the jailbreak process will eventually be open-sourced.

The PS VR2 headset is quite unlike a PC VR headset in a number of ways, and Sony’s products have not historically been easy to work with from a reverse-engineering perspective, whether the goal is improving the user experience of an annoying headset or understanding the not-even-remotely-sanely-designed protocols behind the Sony Memory Stick. Getting the PS VR2 headset to work in a way it wasn’t intended was always expected to be an uphill battle.

It’s not a finished job, but judging by the progress regularly shared on [iVRy]’s Twitter account, it might only be a matter of time.

RFID Emulator + E-paper Badge Can Be Programmed With Sound

In a way, an e-paper display makes an excellent foundation for a reprogrammable RFID card. The display only needs power during a refresh, and 125 kHz RFID tags are passive in the sense that the power for the RFID transaction comes from the reader itself. [Georgi Gerganov] has put those together in the GGtag, an open-source project for a 3.52″ e-paper badge with a trick or two up its sleeve.

One clever function is that it is programmable with sound, a feature built off another project of [Georgi]’s called ggwave, a data-to-sound (and vice-versa) framework that has been ported to just about every hardware platform one cares to imagine — including mobile phones — and can reliably send data through the air.
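To get a feel for how little code the data-to-sound side takes, ggwave ships Python bindings, and transmitting a string looks roughly like the snippet below, adapted from the project’s own example. Check the ggwave repository for the exact current API, and note that GGtag’s badge payload format is its own layer on top of this.

```python
import ggwave    # pip install ggwave
import pyaudio   # used here only to play the generated waveform

# Encode a short payload into an audio waveform (float32 samples).
waveform = ggwave.encode("hello, GGtag!", protocolId=1, volume=20)

# Play it out of the sound card; a nearby microphone + decoder picks it up.
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=48000,
                output=True, frames_per_buffer=4096)
stream.write(waveform, len(waveform) // 4)  # 4 bytes per float32 sample
stream.stop_stream()
stream.close()
p.terminate()
```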

Transmitting data over sound is limited in throughput but has a number of advantages, not least of which is the huge range of compatible devices. There’s a web-based tool for programming the GGtag with sound available at ggtag.io that will give you a preview and let you hear how it works. The data encoding method gives transmissions a charming beep-boop quality that’s a bit reminiscent of an analog modem handshake. GGtag can also be programmed over USB serial, a faster (but somewhat less exciting) option.

The project’s GitHub repository contains GGtag’s code and technical details, and the CrowdSupply project is in the works for anyone who would prefer to buy one once they become available.

The Moment A Bullet Turns Into A Flashlight, Caught On Film

[The Slow Mo Guys] caught something fascinating while filming some firearms at 82,000 frames per second: a visible emission of light immediately preceding a bullet impact. The moment it occurs is pictured above, but if you’d like to jump directly to the point in the video where it happens, it all starts at [8:18].

The ability to capture ultra-slow motion allows us to see things that would otherwise happen far too quickly to perceive, and there are quite a few visual spectacles in the whole video. We’ll talk a bit about what is involved, and what could be happening.

Spotting something unusual on video replay is what extreme slo-mo filming is all about.

First of all, the clear blocks being shot are ballistic gel. These dense blocks are tough, elastic, and a common sight in firearms testing because they provide a reliable, consistent way to evaluate things like bullet deformation, fragmentation, and impact. It’s possible to make homemade ballistic gel with sufficient quantities of gelatin and water, but blocks like the ones seen here are oil-based, which makes them visually clear and more stable (they do not shrink due to evaporation).

We’ve seen the diesel effect occur in ballistic gelatin before: it is most likely the result of the bullet impact vaporizing small amounts of the (oil-based) gel as the channel forms, with that vaporized material igniting due to a sudden increase in pressure as the cavity collapses.
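That is the same mechanism that ignites fuel in a diesel engine: compress a gas quickly enough and its temperature climbs steeply. As a back-of-the-envelope illustration (with made-up compression ratios, not measurements from the video), the adiabatic relation T2 = T1 * (V1/V2)^(gamma-1) puts even modest compression well into autoignition territory for oil vapor:

```python
# Rough adiabatic-compression estimate: illustrative numbers, not measured values.
T1 = 300.0     # starting temperature in kelvin (~room temperature)
gamma = 1.4    # heat capacity ratio for an air-like gas
for ratio in (5, 10, 20):           # assumed compression ratio V1/V2
    T2 = T1 * ratio ** (gamma - 1)
    print(f"compression {ratio:>2}x -> ~{T2:.0f} K")
# A ~10x compression already lands near or above the autoignition
# temperature of typical oil vapors (roughly 500-700 K).
```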

In the video linked above (and embedded below), there is probably a bit more in the mix. The rifles being tested are large-bore rifles firing big cartridges, with a large amount of gunpowder igniting behind each bullet. The burning powder causes a rapid expansion of hot, pressurized gases that push the bullet down the barrel at tremendous speed. As the bullet exits, so does a jet of hot gas. Sometimes, the last bits of burning powder are visible as a brief muzzle flash that accompanies the bullet leaving the barrel.

A large projectile traveling at supersonic velocity creates a large channel and expansion when it hits ballistic gel, but when fired at close range, hot gases from the muzzle and any remaining burning gunpowder are in the mix as well, all of which helps generate the kind of visual spectacle we see here.

We suspect that the single frame of a flashlight-like emission of light as the flat-nosed bullet strikes the face of the gel is also the result of the diesel effect, but it’s an absolutely remarkable visual and a fascinating thing to capture on film. You can watch the whole thing just below the page break.

Continue reading “The Moment A Bullet Turns Into A Flashlight, Caught On Film”