Dual RGB Cameras Get Depth Sensing Powerup

June 19, 2025 by Donald Papp 17 Comments

It’s sometimes useful for a system to not just have a flat 2D camera view of things, but to have an understanding of the depth of a scene. Dual RGB cameras can be used to sense depth by contrasting the two slightly different views, in much the same way that our own eyes work. It’s considered an economical but limited method of depth sensing, or at least it was before FoundationStereo came along and blew previous results out of the water. The project page has a load of interactive comparisons to play with, so check it out and see for yourself.
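For a feel of how two slightly different views turn into depth, here is a minimal sketch of the textbook triangulation relation for a pair of parallel, rectified cameras: depth = focal length × baseline / disparity. This is not FoundationStereo's code. The function name, the 700-pixel focal length, and the 0.12 m baseline are all made up for illustration, and the hard part (working out which pixel in one image matches which pixel in the other) is exactly what a stereo matcher is for.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen by two parallel, rectified cameras.

    disparity_px is how far (in pixels) the point shifts between the left
    and right images. Nearby points shift a lot, distant points barely move.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, cameras mounted 0.12 m apart.
near = depth_from_disparity(42.0, focal_length_px=700.0, baseline_m=0.12)
far = depth_from_disparity(4.2, focal_length_px=700.0, baseline_m=0.12)

print(f"large shift -> {near:.2f} m, small shift -> {far:.2f} m")
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects than for close ones, which is one reason plain stereo has a reputation for being limited.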

A box of disordered tools at close range is understood very well, and these results are typical for the system.

The FoundationStereo paper explains how researchers leveraged machine learning to create a system that can outperform not only existing dual RGB camera setups, but even active depth-sensing cameras such as the Intel RealSense.

FoundationStereo is specifically designed for strong zero-shot performance, meaning it delivers useful general results with no additional training needed to handle any particular scene or environment. The framework and models are available from the project’s GitHub repository.

While products like Microsoft’s Kinect have struggled to keep the consumer’s attention, depth sensing remains an enabling technology that opens possibilities and gives rise to interesting projects, like a headset that allows one to see the world through the eyes of a depth sensor.

The ability to easily and quickly gain an understanding of the physical layout of a space is a powerful tool, and if a system like this one can deliver such fantastic results with nothing more than two RGB cameras, that’s a great sign. Watch it in action in the video below.

Understanding Linear Regression

May 8, 2025 by Al Williams 7 Comments

Although [Vitor Fróis] is explaining linear regression because it relates to machine learning, the post and, indeed, the topic have wide applications in many things that we do with electronics and computers. It is one way to use independent variables to predict dependent variables, and, in its simplest form, it is based on nothing more than a straight line.

You might remember from school that a straight line can be described by: y=mx+b. Here, m is the slope of the line and b is the y-intercept. Another way to think about it is that m is how fast the line goes up (or down, if m is negative), and b is where the line “starts” at x=0.

[Vitor] starts out with a great example: home prices (the dependent variable) and area (the independent variable). As you would guess, bigger houses tend to sell for more than smaller houses. But it isn’t an exact formula, because there are a lot of reasons a house might sell for more or less. If you plot it, you don’t get a nice line; you get a cloud of points that sort of group around some imaginary line.
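One common way to pin down that imaginary line is ordinary least squares, which picks the m and b that minimize the squared vertical distance from each point to the line. The sketch below is an illustration only, not necessarily how [Vitor]'s post does it, and the areas and prices are invented numbers.

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x varies with itself, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Made-up data: floor area in square metres, sale price in thousands.
areas = [50, 70, 90, 110, 130]
prices = [152, 205, 262, 318, 363]

m, b = fit_line(areas, prices)
predicted = m * 100 + b  # estimated price of a hypothetical 100 m^2 house

print(f"m = {m:.3f}, b = {b:.2f}, predicted = {predicted:.2f}")
```

None of the points has to sit exactly on the fitted line. The fit simply splits the difference across the whole cloud, and the prediction for a new house is read straight off y=mx+b.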

Digital Squid’s Behavior Shaped By Neural Network

April 26, 2025 by Bryan Cockfield 2 Comments

In the 90s, a video game craze took over the youth of the world — but unlike today’s games that rely on powerful PCs or consoles, these were simple, standalone devices with monochrome screens, each home to a digital pet. Often clipped to a keychain, they could travel everywhere with their owner, which was ideal from the pet’s perspective since, like real animals, they needed attention around the clock. [ViciousSquid] is updating this 90s idea for the 20s with a digital pet squid that uses a neural network to shape its behavior.

The neural network that controls the squid’s behavior takes a large number of variables into account, including whether or not it’s hungry or sleepy, or if it sees food. The neural network adapts as different conditions are encountered, allowing the squid to make decisions and refine its behavior over time. [ViciousSquid] is using a Hebbian learning algorithm, which strengthens connections between neurons that often activate together. Additionally, the squid can form both short- and long-term memories, and the neural network can even form new neurons on its own as needed.
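The core Hebbian idea fits in a few lines: each connection grows in proportion to the product of the activity of the two neurons it joins. The sketch below is a generic illustration of that rule and is not taken from [ViciousSquid]'s code. The neuron names, the learning rate, and the activation values are all assumptions.

```python
def hebbian_update(weights, activations, learning_rate=0.1):
    """Return a new weight matrix after one Hebbian step.

    Each connection changes by learning_rate * a_i * a_j, so it only
    grows when both of its neurons are active at the same time.
    """
    n = len(activations)
    updated = [row[:] for row in weights]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-connections in this sketch
                updated[i][j] += learning_rate * activations[i] * activations[j]
    return updated


# Hypothetical neurons, indexed 0..2.
neurons = ["hunger", "sees_food", "sleepiness"]
weights = [[0.0] * 3 for _ in range(3)]

# Five moments where the pet is hungry and sees food, but is not sleepy.
for _ in range(5):
    weights = hebbian_update(weights, [1.0, 1.0, 0.0])

print("hunger <-> sees_food:", weights[0][1])
print("hunger <-> sleepiness:", weights[0][2])
```

Left unchecked, a plain Hebbian rule lets weights grow without limit, so practical versions usually add some form of decay or normalization.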

[ViciousSquid] is still working on this project, and hopes to eventually implement a management system that tracks the various behavior variables over time and lets the squid act in a way more familiar from the 90s digital pets it’s modeled after. It’s an interesting and fun take on those games, though, and much of the code is available on GitHub for others to experiment with as well. For those looking for the original 90s games, head over to this project where an emulator for Tamagotchis was created using modern microcontroller platforms.

Software Project Pieces Broken Bits Back Together

April 13, 2025 by Donald Papp 18 Comments

With all the attention on LLMs (Large Language Models) and image generators lately, it’s nice to see some of the more niche and unusual applications of machine learning. GARF (Generalizable 3D reAssembly for Real-world Fractures) is one such project.

GARF may play fast and loose with acronym formation, but it certainly knows how to be picky when it counts. Its whole job is to look at the pieces of a broken object and accurately figure out how to fit the pieces back together, even if there are some missing bits or the edges aren’t clean.

Re-assembling an object from imperfect fragments is a nontrivial undertaking.

Efficiently and accurately figuring out how to re-assemble different pieces into a whole is not a trivial task. One may think it can in theory be brute-forced, but the complexity of such a job rapidly becomes immense. That’s where machine learning methods come in, as researchers created a system that can do exactly that. It addresses the challenge of generalizing from a synthetic data set (in which computer-generated objects are broken and analyzed for training) and successfully applying it to the kinds of highly complex breakage patterns that are seen in real-world objects like bones, recovered archaeological artifacts, and more.

The system is essentially a highly adept 3D puzzle solver, but an entirely different beast from something like this jigsaw puzzle solving pick-and-place robot. Instead of working on flat pieces with clean, predictable edges, it handles 3D scanned fragments with complex break patterns, even if the edges are imperfect or there are missing pieces.

GARF is exactly the kind of software framework that is worth keeping in the back of one’s mind just in case it comes in handy some day. The GitHub repository contains the code (although at this moment the custom dataset is not yet uploaded) but there is also a demo available for the curious.

Homebrew Traffic Monitor Keeps Eyes On The Streets

March 11, 2025 by Tom Nardi 63 Comments

How many cars go down your street each day? How fast are they going? What about folks out on a walk or people riding bikes? These aren’t easy questions to answer, as most of us have better things to do than watch the street all day and keep a tally. But at the same time, this is critically important data from an urban planning perspective.

Of course, you could just leave it to City Hall to figure out this sort of thing. But what if you want to get a speed bump or a traffic light added to your neighborhood? Being able to collect your own localized traffic data could certainly come in handy, which is where TrafficMonitor.ai from [glossyio] comes in.

New Camera Does Realtime Holographic Capture, No Coherent Light Required

February 26, 2025 by Donald Papp 31 Comments

Holography is about capturing 3D data from a scene, and being able to reconstruct that scene — preferably in high fidelity. Holography is not a new idea, but engaging in it is not exactly a point-and-shoot affair. One needs coherent light for a start, and it generally only gets touchier from there. But now researchers describe a new kind of holographic camera that can capture a scene better and faster than ever. How much better? The camera goes from scene capture to reconstructed output in under 30 milliseconds, and does it using plain old incoherent light.

The camera and liquid lens are tiny. Together with the computation back end, they can make a holographic capture of a scene in under 30 milliseconds.

The new camera is a two-part affair: acquisition, and calculation. Acquisition consists of a camera with a custom electrically-driven liquid lens design that captures a focal stack of a scene within 15 ms. The back end is a deep learning neural network system (FS-Net) which accepts the camera data and computes a high-fidelity RGB hologram of the scene in about 13 ms. How good are the results? They beat other methods, and reconstruction of the scene using the data looks really, really good.

One might wonder what makes this different from, say, a 3D scene captured by a stereoscopic camera, or with an RGB depth camera (like the now-discontinued Intel RealSense). Those methods capture 2D imagery from a single perspective, combined with depth data to give an understanding of a scene’s physical layout.

Holography by contrast captures a scene’s wavefront information, which is to say it captures not just where light is coming from, but how it bends and interferes. This information can be used to optically reconstruct a scene in a way data from other sources cannot; for example allowing one to shift perspective and focus.

Being able to capture holographic data in such a way significantly lowers the bar for development and experimentation in holography — something that’s traditionally been tricky to pull off for the home gamer.

Genetic Algorithm Runs On Atari 800 XL

February 21, 2025 by Bryan Cockfield 7 Comments

For the last few years, the accepted story in artificial intelligence was that all of the big names in the field needed more compute, more resources, more energy, and more money to build better models. But simply throwing money and GPUs at these companies without question led to them getting complacent, and ripe to be upset by an underdog with a fraction of the computing resources and funding. Perhaps that should have been more obvious from the start, since people have been building various machine learning algorithms on extremely limited computing platforms like this one built on the Atari 800 XL.

Unlike other models that use memory-intensive techniques like gradient descent to train their neural networks, [Jean Michel Sellier] is using a genetic algorithm to work within the confines of the platform. Genetic algorithms evaluate potential solutions by evolving them over many generations and keeping the ones which work best each time. The changes made to the survivors before they are put through the next round of evolution can be made in many ways, but for a limited system like this a quick approach is to make small random changes. [Jean]’s program, written in BASIC, performs 32 generations of evolution to predict the points that will lie on a simple mathematical function.
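The keep-the-best, mutate-the-rest loop is simple enough to sketch in modern Python. To be clear, this is not a port of [Jean]’s BASIC program: instead of a neural network it evolves just two numbers, a and b, so that a*x + b matches a made-up target function. The population size, mutation step, and seed are all arbitrary choices for illustration. Only the 32 generations come from the description above.

```python
import random


def target(x):
    """Hypothetical 'simple mathematical function' to be learned."""
    return 2.0 * x + 1.0


XS = [0.0, 1.0, 2.0, 3.0, 4.0]


def error(candidate):
    """Sum of squared differences between a candidate's guesses and the target."""
    a, b = candidate
    return sum((a * x + b - target(x)) ** 2 for x in XS)


def evolve(generations=32, population_size=20, survivors=5, step=0.2, seed=1):
    rng = random.Random(seed)
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5))
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        # Keep the candidates that work best this generation.
        population.sort(key=error)
        best = population[:survivors]
        history.append(error(best[0]))
        # Refill the population with slightly mutated copies of survivors.
        children = []
        while len(best) + len(children) < population_size:
            a, b = rng.choice(best)
            children.append((a + rng.gauss(0, step), b + rng.gauss(0, step)))
        population = best + children
    population.sort(key=error)
    history.append(error(population[0]))
    return population[0], history


best, history = evolve()
print("best (a, b):", best)
print("error before:", history[0], "after:", history[-1])
```

Because the survivors are carried over unchanged, the best error can never get worse from one generation to the next. Notice also that nothing here needs a gradient, only the ability to score a candidate, which is what makes the approach attractive on a machine with very little memory.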

While it is true that the BASIC program relies on stochastic methods to train, it does work, and it shows that certain machine learning models can be created effectively on limited hardware, in this case an 8-bit Atari running BASIC. In previous projects he’s also been able to show how similar computers can be used for other complex mathematical tasks as well. Of course it’s true that an 8-bit machine like this won’t challenge OpenAI or Anthropic anytime soon, but looking for more efficient ways of running complex computations is always a more challenging and rewarding problem to solve than buying more computing resources.
