We don’t fully understand the appeal of asking an AI for a picture of a gorilla eating a waffle while wearing headphones. However, [Micael Widell] shows something in a recent video that might be the best use we’ve seen yet of DALL-E 2. Instead of concocting new photos, you can apparently use the same technology for cleaning up your own rotten pictures. You can see his video, below. The part about DALL-E 2 editing is at about the 4:45 mark.
[Nicholas Sherlock] fed the AI a picture of a fuzzy ladybug and asked it to focus the subject. It did. He also fed in some other pictures and asked it to make subtle variations of them. It did a pretty good job of that, too.
[8BitsAndAByte] found herself obsessively labeling items around her house, and, like the rest of the world, wanted to see what simple, routine tasks could be made unnecessarily complicated by using AI. Instead of manually identifying objects using human intelligence, she thought it would be fun to offload that task to our AI overlords and the results are pretty amusing.
She constructed a cardboard enclosure that housed a Raspberry Pi 3B+, a Pi Camera Module V2, and a small thermal printer for making the labels. The enclosure included a hole for the camera and a button for taking the picture. The image taken by the Pi is analyzed by the DeepAI DenseCap API, which, in theory, should create a label for each object detected within the image. Unfortunately, it doesn’t seem to do that very well, and [8BitsAndAByte] is left with labels that don’t match any of the objects she took pictures of. In some cases it didn’t even come close; for example, the model thought an apple was a person’s head and a rotary dial phone was a cup. Go figure. It didn’t really seem to bother her though, and she got a pretty good laugh out of the whole thing.
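If you want to poke at the same pipeline, the rough shape of it is simple: snap a photo, send it to DenseCap, print whatever comes back. Here’s a minimal Python sketch, assuming DeepAI’s usual REST interface (an image POSTed to the densecap endpoint with an api-key header); the response field names are a guess based on DenseCap’s output, so check the API docs before trusting them.

```python
# Rough sketch of the capture -> caption -> print pipeline. The endpoint and
# api-key header follow DeepAI's standard REST interface; the response field
# names ("output", "captions", "confidence") are assumptions.
import requests

API_URL = "https://api.deepai.org/api/densecap"
API_KEY = "your-deepai-api-key"  # placeholder

def caption_image(path):
    with open(path, "rb") as img:
        resp = requests.post(API_URL,
                             files={"image": img},
                             headers={"api-key": API_KEY})
    resp.raise_for_status()
    captions = resp.json()["output"]["captions"]  # assumed structure
    # Keep only the caption the model is most confident about -- which seems
    # to be all that ends up on the thermal printer anyway.
    best = max(captions, key=lambda c: c.get("confidence", 0))
    return best["caption"]

if __name__ == "__main__":
    print(caption_image("snapshot.jpg"))
```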
It appears the model detects all objects in the image, but only prints the label for the object it was most certain about. So maybe part of her problem is that there were just too many objects in the background? If that were the case, you could probably improve the accuracy of the model by placing the object against a neutral background. That might confuse the AI a lot less and give you better results. Or maybe try a different classifier altogether? Or don’t. Then you could just use it as a fun gag project at your next get-together. That works too.
Robotics projects are always a favorite for hackers. Being able to almost literally bring your project to life evokes a special kind of joy that really drives our wildest imaginations. We imagine this is one of the inspirations for the boom in interactive technologies that are flooding the market these days. Well, [Technovation] had the same thought and decided to build a fully articulated robotic biped.
Each leg has pivot points at the foot, knee, and hip, mimicking the articulation of the human leg. To control the robot’s movements, [Technovation] uses inverse kinematics, a method of calculating joint movements rather than explicitly programming them. The user inputs the end coordinates of each foot, as opposed to each individual joint angle, and a special function outputs the joint angles necessary to reach each end coordinate. This part of the software is well commented and worth your time to dig into.
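For a flavor of what that function is doing, here is a minimal planar two-link sketch in Python. It is not [Technovation]’s firmware, just the standard law-of-cosines solution for a hip and knee joint, with made-up link lengths standing in for the thigh and shank.

```python
# Minimal planar inverse-kinematics sketch for one leg (hip + knee).
# Link lengths are stand-ins; the foot target (x, y) is measured from the hip.
import math

THIGH = 80.0   # mm, assumed
SHANK = 80.0   # mm, assumed

def leg_ik(x, y):
    """Return (hip_angle, knee_angle) in degrees for a foot target (x, y)."""
    d = math.hypot(x, y)
    if d > THIGH + SHANK:
        raise ValueError("target out of reach")
    # Law of cosines gives the interior knee angle between thigh and shank
    knee = math.acos((THIGH**2 + SHANK**2 - d**2) / (2 * THIGH * SHANK))
    # Hip angle is the angle to the target plus the offset from the knee bend
    hip = math.atan2(y, x) + math.acos((THIGH**2 + d**2 - SHANK**2) / (2 * THIGH * d))
    return math.degrees(hip), math.degrees(knee)

print(leg_ik(40.0, -120.0))  # e.g. foot 40 mm forward, 120 mm below the hip
```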
In case you want to change the height of the robot or its stride length, [Technovation] provides a few global constants in the firmware that will automatically adjust the calculations to fit the new dimensions. Of all the various aspects of this project, the detailed write-up impressed us the most. The robot was designed in Fusion 360 and the parts were 3D printed, allowing for maximum design flexibility for the next hacker.
Maybe [Technovation’s] biped will help resurrect the social robot craze. Until then, happy hacking.
The system uses an NVIDIA Jetson AGX Xavier, fitted with a USB camera running at 100 FPS. A Nerf tennis ball launcher is used to fire a ball towards the batter. Once triggered, the AI uses the camera to capture two successive images of the ball in flight. These images are fed into a convolutional neural network (CNN), and the software determines whether the ball is heading for the strike zone or moving off-target. It then lights a green or red LED, respectively, to alert the batter.
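To make the idea concrete, here’s a toy PyTorch sketch, not [Nick]’s actual network: the two successive grayscale frames are stacked as a two-channel input, and a small CNN spits out a single strike probability that decides which LED to light.

```python
# Not the real model -- just a minimal sketch of the two-frames-in, one
# strike/ball probability out idea.
import torch
import torch.nn as nn

class StrikeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 1)

    def forward(self, frames):                     # frames: (batch, 2, H, W)
        x = self.features(frames).flatten(1)
        return torch.sigmoid(self.classifier(x))   # P(strike)

# Two successive 96x96 grayscale captures of the ball (dummy data here)
frames = torch.rand(1, 2, 96, 96)
p_strike = StrikeNet()(frames).item()
print("green LED" if p_strike > 0.5 else "red LED")
```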
While such a system is unlikely to appear in professional baseball anytime soon, it shows the sheer capability of neural network systems to quickly and effectively analyse data in ways simply impossible for mere humans. [Nick]’s future goals involve running the system on faster hardware, and expanding it to determine effects like spin and more accurate positioning within the strike zone.
Finding a good apartment is a lot of work: searching websites for available places, then cross-referencing them against a list of must-have characteristics. This can take hours, days, or even months, but in a world where cars drive themselves, you can put machine learning to work on the hunt.
[veesot] lives in a city between Europe and Asia and was looking for a new home. The goal was a model that uses historical data not only to suggest whether an advertised price is fair, but also to recommend waiting by predicting how the price will drop in the future. The data-set includes parameters such as “area”, “district”, and “number of balconies”, and the model tries to single out an optimal property to view.
There is a lot that [veesot] describes in his post, which includes cleaning the data by removing flats that are too small or too large. This essentially creates a training data-set that will allow the machine learning system to generate usable output. [veesot] also added parameters such as the district, which relates to the geographical location, the age of the building, and even the materials used in the construction.
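The cleanup itself is the sort of thing a few lines of pandas handles. The sketch below is only illustrative, not [veesot]’s code, and the column names are guesses based on the parameters mentioned in the post.

```python
# Illustrative cleaning/feature-prep step for scraped listings.
# Column names ("area", "district", "material", "price") are assumptions.
import pandas as pd

flats = pd.read_csv("listings.csv")

# Drop flats that are too small or too large to be realistic listings
flats = flats[(flats["area"] >= 20) & (flats["area"] <= 200)]

# Categorical features like the district need encoding before modelling
flats = pd.get_dummies(flats, columns=["district", "material"], dtype=int)

print(flats.describe())
```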
There is also an interesting bit about analyzing the data variables and determining cross-correlation, which ultimately leads to the obvious conclusions that the central/older districts have older apartments and the newer ones are larger. It makes for a few cool graphs, but the code can certainly come in handy when dealing with similar data-sets. The last part of the write-up discusses applying linear regression and then testing its accuracy. Interpreting the coefficients of the trained model turns up some interesting results.
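The modelling end is similarly compact with scikit-learn. Again, this is just a sketch along the lines of the write-up, assuming a cleaned-up frame like the one above; it is not [veesot]’s actual code.

```python
# Correlation check, then a plain linear regression on the price, using a
# dummy-encoded frame like the one built in the cleaning sketch above.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

flats = pd.get_dummies(pd.read_csv("listings.csv"),
                       columns=["district", "material"], dtype=int)

print(flats.corr()["price"].sort_values())   # which variables track price?

X, y = flats.drop(columns=["price"]), flats["price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
print(dict(zip(X.columns, model.coef_)))     # coefficient per feature
```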
I’ve always been fascinated by AI and machine learning. Google TensorFlow offers tutorials and has been on my ‘to-learn’ list since it was first released, although I always seem to neglect it in favor of the shiniest new embedded platform.
Last July, I took note when Intel released the Neural Compute Stick. It looked like an oversized USB stick, and acted as an accelerator for local AI applications, especially machine vision. I thought it was a pretty neat idea: it allowed me to test out AI applications on embedded systems at a power cost of about 1W. It requires pre-trained models, but there are enough of them available now to do some interesting things.
I wasn’t convinced I would get great performance out of it, and forgot about it until last November when they released an improved version. Unambiguously named the ‘Neural Compute Stick 2’ (NCS2), it was reasonably priced and promised a 6-8x performance increase over the last model, so I decided to give it a try to see how well it worked.
I took a few days off work around Christmas to set up Intel’s OpenVino Toolkit on my laptop. The installation script provided by Intel wasn’t particularly user-friendly, but it worked well enough and included several example applications I could use to test performance. I found that face detection was possible with my webcam in near real-time (something like 19 FPS), and pose detection at about 3 FPS. So in accordance with the holiday spirit, it knows when I am sleeping, and knows when I’m awake.
That was promising, but the NCS2 was marketed as allowing AI processing on edge computing devices. I set about installing it on the Raspberry Pi 3 Model B+ and compiling the application samples to see if it worked better than previous methods. This turned out to be more difficult than I expected, and the main goal of this article is to share the process I followed and save some of you a little frustration.
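For reference, running one of Intel’s pre-trained face-detection models through the NCS2 boils down to a few lines of Python. Treat this as a sketch: the exact inference-engine calls have shifted a bit between OpenVINO releases, the model name is one of the toolkit’s example models, and the file paths are placeholders.

```python
# Sketch of single-frame inference on the NCS2 ("MYRIAD" device) using the
# OpenVINO inference-engine Python API. Model files are placeholders; the
# exact API varies slightly between OpenVINO releases.
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="face-detection-adas-0001.xml",
                      weights="face-detection-adas-0001.bin")
exec_net = ie.load_network(network=net, device_name="MYRIAD")  # the NCS2

input_blob = next(iter(net.input_info))
_, _, h, w = net.input_info[input_blob].input_data.shape

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)[np.newaxis, ...]  # HWC -> NCHW
result = exec_net.infer({input_blob: blob})
print(result)  # detection rows: [image_id, label, conf, xmin, ymin, xmax, ymax]
```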
One of the difficulties in learning about neural networks is finding a problem that is complex enough to be instructive but not so complex as to impede learning. [ThomasNield] had an idea: Create a neural network to learn if you should put a light or dark font on a particular colored background. He has a great video explaining it all (see below) and code in Kotlin.
[Thomas] is very interested in optimization, so his approach is very much based on mathematics and algorithms of optimization. One thing that’s handy is that there is already an algorithm for making this determination. He found it on Stack Exchange, but we’re sure it’s in a textbook or paper somewhere. The existing algorithm makes the neural network really impractical, but it makes training easy since you can algorithmically develop a training set of data.
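For the curious, the commonly cited rule is a simple perceived-luminance threshold. [Thomas]’s project is in Kotlin, but the idea fits in a few lines of Python, along with how a known rule lets you crank out a labeled training set for free.

```python
# The widely quoted perceived-luminance rule (weighted RGB threshold), plus
# algorithmic generation of a labeled training set from random backgrounds.
import random

def light_font(r, g, b):
    """True if a light font is more readable on this background colour."""
    luminance = (0.299 * r + 0.587 * g + 0.114 * b) / 255
    return luminance < 0.5          # dark background -> use a light font

# Because the rule is already known, training data is free to produce:
training_set = [((r, g, b), light_font(r, g, b))
                for r, g, b in ((random.randrange(256),
                                 random.randrange(256),
                                 random.randrange(256)) for _ in range(1000))]
print(training_set[:3])
```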
Once trained, the neural network works well. He wrote a small GUI and you can even select among various models.
Don’t let the Kotlin put you off. It runs on the same JVM as Java and the code is very similar, other than that it infers types and adds functional programming tools. However, the libraries and the principles employed will work with Java and, in many cases, the concepts will apply no matter what you are doing.
If you want to hardware accelerate your neural networks, there’s a stick for that. If you prefer C and you want something lean and mean, try TINN.