More Details On Why DeepSeek Is A Big Deal

February 3, 2025 by Donald Papp 48 Comments

The DeepSeek large language models (LLM) have been making headlines lately, and for more than one reason. IEEE Spectrum has an article that sums everything up very nicely.

We shared the way DeepSeek made a splash when it came onto the AI scene not long ago, and this is a good opportunity to go into a few more details of why this has been such a big deal.

For one thing, DeepSeek (there’s actually two flavors, -V3 and -R1, more on them in a moment) punches well above its weight. DeepSeek is the product of an innovative development process, and freely available to use or modify. It is also indirectly highlighting the way companies in this space like to label their LLM offerings as “open” or “free”, but stop well short of actually making them open source.

The DeepSeek-V3 LLM was developed in China and reportedly cost less than 6 million USD to train. This was possible thanks to developing DualPipe, a highly optimized and scalable method of training the system despite limitations due to export restrictions on Nvidia hardware. Details are in the technical paper for DeepSeek-V3.

There’s also DeepSeek-R1, a chain-of-thought “reasoning” model which handily provides its thought process enclosed within easily-parsed <think> and </think> pseudo-tags that are included in its responses. A model like this takes an iterative step-by-step approach to formulating responses, and benefits from prompts that provide a clear goal the LLM can aim for. The way DeepSeek-R1 was created was itself novel. Its training started with supervised fine-tuning (SFT) which is a human-led, intensive process as a “cold start” which eventually handed off to a more automated reinforcement learning (RL) process with a rules-based reward system. The result avoided problems that come from relying too much on RL, while minimizing the human effort of SFT. Technical details on the process of training DeepSeek-R1 are here.

DeepSeek-V3 and -R1 are freely available in the sense that one can access the full-powered models online or via an app, or download distilled models for local use on more limited hardware. It is free and open as in accessible, but not open source because not everything needed to replicate the work is actually released. Like with most LLMs, the training data and actual training code used are not available.

What is released and making waves of its own are the technical details of how researchers produced what they did, and that means there are efforts to try to make an actually open source version. Keep an eye out for Open-R1!

close up of a TI-84 Plus CE running custom software

Going Digital: Teaching A TI-84 Handwriting Recognition

December 24, 2024 by Heidi Ulrich 4 Comments

You wouldn’t typically associate graphing calculators with artificial intelligence, but hacker [KermMartian] recently made it happen. The innovative project involved running a neural network directly on a TI-84 Plus CE to recognize handwritten digits. By using the MNIST dataset, a well-known collection of handwritten numbers, the calculator could identify digits in just 18 seconds. If you want to learn how, check out his full video on it here.

The project began with a proof of concept: running a convolutional neural network (CNN) on the calculator’s limited hardware, a TI-84 Plus CE with only 256 KB of memory and a 48 MHz processor. Despite these constraints, the neural network could train and make predictions. The key to success: optimizing the code, leveraging the calculator’s C programming tools, and offloading the heavy lifting to a computer for training. Once trained, the network could be transferred to the calculator for real-time inference. Not only did it run the digits from MNIST, but it also accepted input from a USB mouse, letting [KermMartian] draw digits directly on the screen.

While the calculator’s limited resources mean it can’t train the network in real-time, this project is a proof that, with enough ingenuity, even a small device can be used for something as complex as AI. It’s not just about power; it’s about resourcefulness. If you’re into unconventional projects, this is one for the books.

Continue reading “Going Digital: Teaching A TI-84 Handwriting Recognition” →

Training A Self-Driving Kart

December 21, 2024 by Bryan Cockfield 1 Comment

There are certain tasks that humans perform every day that are notoriously difficult for computers to figure out. Identifying objects in pictures, for example, was something that seems fairly straightforward but was only done by computers with any semblance of accuracy in the last few years. Even then, it can’t be done without huge amounts of computing resources. Similarly, driving a car is a surprisingly complex task that even companies promising full self-driving vehicles haven’t been able to deliver despite working on the problem for over a decade now. [Austin] demonstrates this difficulty in his latest project, which adds self-driving capabilities to a small go-kart.

[Austin] had been working on this project at the local park but grew tired of packing up all his gear when he wanted to work on his machine-learning algorithms. So he took all the self-driving equipment off of the first kart and incorporated it into a smaller kart with a very small turning radius so he could develop it in his shop.

He laid down some tape on the floor to create the track and then set up the vehicle to learn how to drive by watching and gathering data. The model is trained with a convolutional neural network and this data. The only inputs that the model gets are images from cameras at the front of the kart. At first, it could only change the steering angle, with [Austin] controlling the throttle to prevent crashes. Eventually, he gave it control of the throttle as well, which behaves well except at the fastest speeds.

There were plenty of challenges along the way, especially when compared to the models trained at the park; [Austin] correctly theorized that the cause of the hardship in the park was a lack of contrast at the boundary between the track and any out-of-bounds areas. With a few tweaks to the track, as well as adding some wide-angle lenses to his cameras, he was able to get a model that works fairly well. Getting started on a project like this doesn’t have as high of a barrier to entry as one might imagine, either. Take a look at this comprehensive open-source Python library for self-driving projects. If you want to start smaller, perhaps don’t start with a self-driving kart.

Continue reading “Training A Self-Driving Kart” →

Dog Poop Drone Cleans Up The Yard So You Don’t Have To

September 28, 2024 by Dan Maloney 21 Comments

Sometimes you instantly know who’s behind a project from the subject matter alone. So when we saw this “aerial dog poop removal system” show up in the tips line, we knew it had to be the work of [Caleb Olson].

If you’re unfamiliar with [Caleb]’s oeuvre, let us refresh your memory. [Caleb] has been on a bit of a dog poop journey, starting with a machine-learning system that analyzed security camera footage to detect when the adorable [Twinkie] dropped a deuce in the yard. Not content with just knowing when a poop event has occurred, he automated the task of locating the packages with a poop-pointing robot laser. Removal of the poop remained a manual task, one which [Caleb] was keen to outsource, hence the current work.

The video below, from a lightning talk at a conference, is pretty much all we have to go on, and the quality is a bit potato-esque. And while [Caleb]’s PoopCopter is clearly still a prototype, it’s easy to get the gist. Combining data from the previous poop-adjacent efforts, [Caleb] has built a quadcopter that can (or will, someday) be guided to the approximate location of the offending package, home in on it using a downward-looking camera, and autonomously whisk it away.

The retrieval mechanism is the high point for us; rather than a complicated, servo-laden “sky scoop” or something similar, the drone has a bell-shaped container on its belly with a series of geared leaves on the open end. The leaves are open when the drone descends onto the payload, and then close as the drone does a quick rotation around the yaw axis. And, as [Caleb] gleefully notes, the leaves can also open in midair with a high-torque yaw move in the opposite direction; the potential for neighborly hijinx is staggering.

All jokes and puns aside, this looks fantastic, and we can’t wait for more information and a better video. And lest you think [Caleb] only works on “Number Two” problems, never fear — he’s also put considerable work into automating his offspring and taking the awkwardness out of social interactions.

Continue reading “Dog Poop Drone Cleans Up The Yard So You Don’t Have To” →

Mothbox Watches Bugs, So You — Or Your Grad Students — Don’t Have To

September 19, 2024 by Dan Maloney 8 Comments

To the extent that one has strong feelings about insects, they tend toward the extremes of a spectrum that runs from a complete fascination with their diversity and the specializations they’ve evolved to exploit unique and ultra-narrow ecological niches, and “Eww, ick! Kill it!” It’s pretty clear that [Dr. Andy Quitmeyer] and his team tend toward the former, and while they love their bugs, spending all night watching them is a tough enough gig that they came up with Mothbox, the automated insect monitor.

Insect censuses are valuable tools for assessing the state of an ecosystem, especially insects’ vast numbers, short lifespan, and proximity to the base of the food chain. Mothbox is designed to be deployed in insect-rich environments and automatically recognize and tally the moths it sees. It uses an Arducam and Raspberry Pi for image capture, plus an array of UV and visible LEDs, all in a weatherproof enclosure. The moths are attracted to the light and fly between the camera and a plain white background, where an image is captured. YOLO v8 locates all the moths in the image, crops them out, and sends them to BioCLIP, a vision model for organismal biology that appears similar to something we’ve seen before. The model automatically sorts the moths by taxonomic features and keeps a running tally of which species it sees.

Mothbox is open source and the site has a ton of build information if you’re keen to start bug hunting, plus plenty of pictures of actual deployments, which should serve as nightmare fuel to the insectophobes out there.

Supercon 2023: Teaching Robots How To Learn

September 3, 2024 by Lewin Day 2 Comments

Once upon a time, machine learning was an arcane field, the preserve of a precious few researchers holed up in grand academic institutions. Progress was slow, and hard won. Today, however, just about anyone with a computer can dive into these topics and develop their own machine learning systems.

Shawn Hymel has been doing just that, in his work in developer relations and as a broader electronics educator. His current interest is reinforcement learning on a tiny scale. He came down to the 2023 Hackaday Supercon to tell us all about his work.

Continue reading “Supercon 2023: Teaching Robots How To Learn” →

Building AI Models To Diagnose HVAC Issues

August 8, 2024 by Bryan Cockfield 10 Comments

HVAC – heating, ventilation, and air conditioning – can account for a huge amount of energy usage of a building, whether it’s residential or industrial. Often it’s the majority energy consumer, especially in places with extreme climates or for things like data centers where cooling is a large design consideration. When problems arise with these complex systems, they can go undiagnosed for a time and additionally be difficult to fix, leading to even more energy losses until repairs are complete. With the growing availability of platforms that can run capable artificial intelligences, [kutluhan_aktar] is working towards a system that can automatically diagnose potential issues and help humans get a handle on repairs faster.

The prototype system is designed for hydronic (water-based) systems and uses two separate artificial intelligences, one to analyze thermal imagery of the system and look for problems like leaks, hot spots, or blockages, and the other to listen for anomalous sounds especially relating to the behavior of cooling fans. For the first, a CNC-like machine was built to move a thermal camera around a custom-built model HVAC system and report its images back to a central system where they can be analyzed for anomalies. The second system which analyses audio runs its artificial intelligence on a XIAO ESP32C6 and listens to the cooling fans running in the model.

One problem that had to be tackled before any of this could be completed was actually building an open-source dataset to train the AI on. That’s part of the reason for the HVAC model in this project; being able to create problems to train the computer to detect before rolling it out to a larger system. The project’s code and training models can be found on its GitHub page. It seems to be a fairly robust solution to this problem, though, and we’ll be looking forward to future versions running on larger systems. Not everyone has a hydronic HVAC system, though. As heat pumps become more and more popular and capable, you’ll need systems to control those as well.

Hackaday

Machine Learning

167 Articles

More Details On Why DeepSeek Is A Big Deal

Going Digital: Teaching A TI-84 Handwriting Recognition

Training A Self-Driving Kart

Dog Poop Drone Cleans Up The Yard So You Don’t Have To

Mothbox Watches Bugs, So You — Or Your Grad Students — Don’t Have To

Supercon 2023: Teaching Robots How To Learn

Building AI Models To Diagnose HVAC Issues

Search

Never miss a hack

If you missed it

Dearest C++, Let Me Count The Ways I Love/Hate Thee

Personal Reflections On Immutable Linux

Crunching The News For Fun And Little Profit

The End Of The Hackintosh Is Upon Us

The Hackaday Summer Reading List: No AI Involvement, Guaranteed

Our Columns

Trickle Down: When Doing Something Silly Actually Makes Sense

Hackaday Podcast Episode 328: Benchies, Beanies, And Back To The Future

This Week In Security: Bitchat, CitrixBleed Part 2, Opossum, And TSAs

Ask Hackaday: Are You Wearing 3D Printed Shoes?

FLOSS Weekly Episode 840: End-of-10; Not Just Some Guy In A Van

Search

Never miss a hack

Subscribe

If you missed it

Our Columns