Open Source Marker Recognition for Augmented Reality


[Bharath] recently uploaded the source code for an OpenCV-based pattern recognition platform that can be used for Augmented Reality, or even robots. It was built with C++ and uses the OpenCV library to detect and decode marker patterns within a single frame.

The program starts out by focusing on one object at a time. This method was chosen to avoid creating additional arrays containing information about every blob in the image, which could cause problems down the line.
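To make the idea concrete, here's a minimal sketch of the largest-blob approach in Python with OpenCV's bindings (the project itself is C++, and its actual decoding logic will differ):

```python
# Sketch: isolate a single marker candidate per frame by keeping only the
# largest contour, instead of tracking arrays of every blob in the image.
# Uses the OpenCV 4.x findContours signature.
import cv2

frame = cv2.imread("frame.png")                      # one video frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
if contours:
    marker = max(contours, key=cv2.contourArea)      # largest blob only
    x, y, w, h = cv2.boundingRect(marker)
    roi = binary[y:y + h, x:x + w]                   # candidate marker region
    # ...decode the marker pattern inside `roi` here...
```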

Although this implementation doesn't track marker information across multiple frames, it provides a nice foundation for integrating pattern recognition into computer systems. The tutorial is straightforward and easy to read. The entire program and source code can be found on GitHub under a ZERO license, so anyone can use it. A video of the program is embedded after the break:

Continue reading “Open Source Marker Recognition for Augmented Reality”

Hyperlapse Makes Your HeadCam Videos Awesome

First-person video – between Google Glass, GoPro, and other sports cameras, it seems like everyone has a camera on their head these days. If you're a surfer or skydiver, that might make for some awesome footage. For the rest of us, though, it means hours of boring video. The obvious fix is time-lapse, which typically just throws frames away: keeping 1 of every 10 frames results in a 10x speed increase. Unfortunately, speeding up a head-mounted camera often leads to video so bouncy it can't be watched without an air sickness bag handy. [Johannes Kopf], [Michael Cohen], and [Richard Szeliski] at Microsoft Research have come up with a novel solution to this problem: Hyperlapse.
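For reference, the naive frame-dropping time-lapse fits in a few lines, but it inherits every bit of the camera shake, which is exactly the problem Hyperlapse attacks. A quick sketch with OpenCV (file names are placeholders):

```python
# Naive 10x time-lapse: keep 1 of every 10 frames, drop the rest.
import cv2

cap = cv2.VideoCapture("headcam.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
out = cv2.VideoWriter("timelapse.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

i = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if i % 10 == 0:          # every 10th frame -> 10x speedup
        out.write(frame)
    i += 1

cap.release()
out.release()
```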

Hyperlapse photography is not a new term. Typically, hyperlapse films require careful planning, camera rigs, and labor-intensive post-production to achieve a usable video. [Johannes] and team have thrown computer vision and graphics algorithms at the problem. The results are nothing short of amazing.

The full details are available in the team's report (35MB PDF warning). To get usable data, the fisheye lenses often used on these cameras must first be calibrated, which the team accomplished with the OCamCalib toolbox. Imported video is broken down frame by frame. Using structure-from-motion algorithms, Hyperlapse creates 3D models of the various scenes in the video. With the scenes reconstructed in this virtual world, the camera can be moved and aimed at will. The team's algorithms then pick a smooth path that follows the original camera's trajectory. Once the camera's position is known for each frame, it's simply a matter of rendering the final video.
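The real path optimization is far more involved than this, but a moving average over the recovered camera centers gives a feel for the "smooth path" step. A toy sketch, not the paper's method:

```python
# Toy illustration of smoothing a shaky 3D camera trajectory with a
# moving average. The paper's optimizer also constrains the virtual
# camera to viewpoints it can actually render from the reconstruction.
import numpy as np

def smooth_path(positions: np.ndarray, window: int = 31) -> np.ndarray:
    """positions: (N, 3) camera centers from structure from motion."""
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(positions, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([np.convolve(padded[:, k], kernel, mode="valid")
                     for k in range(3)], axis=1)

shaky = np.cumsum(np.random.randn(500, 3) * 0.05, axis=0)  # fake walk
smooth = smooth_path(shaky)                    # same shape, (500, 3)
```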

The results aren’t perfect. The mountain climbing scenes show some artifacts caused by the camera frame rate and exposure changing due to the varied lighting conditions. People appear and disappear in the bicycling portion of the video.

One thing the team doesn’t mention is how long the process takes. We’re sure this kind of rendering must require some serious time and processing power. Still, the output video is stunning.

Continue reading “Hyperlapse Makes Your HeadCam Videos Awesome”

Net Neutrality: FCC Hack is a Speed Bump on the Internet Fast Lane

Net neutrality is one of those topics we've been hearing more and more about in recent years. The basic idea of net neutrality is that all Internet traffic should be treated equally, no matter what. It shouldn't matter if it's email, web sites, or streaming video. It shouldn't matter if the traffic comes from Wikipedia, Netflix, YouTube, etc. It shouldn't matter which Internet Service Provider you choose. This is the way the Internet has worked since its inception. Of course, not everyone agrees that this is how things should stay. We didn't always have the technology to filter and classify traffic. Now that it's here, some believe we should be able to classify Internet traffic and treat it differently based on that classification.

It seems like much of the tech-savvy community argues that net neutrality is a "given right" of the Internet. They believe that it's the way the Internet has always been, and always should be. The other side of the argument is generally lobbied by Internet service providers. They argue that ISPs have the right to classify the Internet traffic that flows through their equipment and treat it differently if they so choose. As for everyone else, just about everybody these days relies on the Internet for business, banking, and entertainment, but many of those people have no idea how the Internet works, nor do they really care. It's like the electricity in their home or the engine in their car: as long as it's working properly, that's all that matters to them. If they can check Facebook on their phone while watching Breaking Bad on Netflix in full HD, why should they care how that stuff gets prioritized? It works, doesn't it?

Continue reading "Net Neutrality: FCC Hack is a Speed Bump on the Internet Fast Lane"

How to Upgrade Jasper’s Voice Recognition with AT&T’s Speech-to-Text API


Jasper is an open-source platform for developing always-on voice-controlled applications — you talk and your electronics listen! It’s designed to run on a Raspberry Pi. [Zach] has been playing around with it and wasn’t satisfied with Jasper’s built-in speech-to-text recognition system. He decided to take the advice of the Jasper development team and modify the system to use AT&T’s speech-to-text engine.

The built-in system works, but it has limitations. Mainly, you have to specify exactly which keywords you want Jasper to listen for. This is problematic if you aren't sure what the user is going to say, and also when there are many possibilities. For example, if the user is going to say a number between one and one hundred, you don't want to have to type out all one hundred numbers into the voice recognition system just to make it work.
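For context, here is roughly what a standard Jasper module looks like (based on Jasper's documented module API; treat the details as illustrative). Every word the recognizer should catch has to be enumerated up front in WORDS, which is exactly the limitation described above:

```python
# Sketch of a standard Jasper module. The STT engine is compiled against
# the WORDS list, so anything the user might say must be listed here.
import re

WORDS = ["DICE", "ROLL"]

def isValid(text):
    """Jasper calls this to ask if the module should handle the phrase."""
    return bool(re.search(r"\broll\b", text, re.IGNORECASE))

def handle(text, mic, profile):
    """Respond to a recognized phrase."""
    mic.say("Rolling the dice.")
```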

The Jasper FAQ does recommend using AT&T's speech-to-text engine in this situation, but that has its own downsides. You are limited to one request per second, and it's also slower to recognize speech. [Zach] was fine with these restrictions, but he couldn't find much information online about how to modify Jasper to make the AT&T engine work. Now that he's gotten it functional, he has shared his work to make it easier for others.

The modification first requires an AT&T developer account. Once that's set up, you need to make some changes to Jasper's mic.py module. That's the only part of Jasper's core that must be changed, and it's only a few lines of code. Outside of that, there are a couple of other Python scripts that need to be added. We won't go into the finer details here, since [Zach] goes into great detail on his own page and includes the complete scripts. If you are interested in using the AT&T module with your Jasper installation, be sure to check out [Zach]'s work. He will likely save you a lot of time.
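For a sense of what the modified mic.py has to do, here is a rough sketch of a transcription request. The endpoint path, headers, and response layout are assumptions based on AT&T's Speech API documentation of the era, so verify them against [Zach]'s actual scripts:

```python
# Hedged sketch of an AT&T speech-to-text request (endpoint and response
# fields are assumptions -- check [Zach]'s scripts for the real details).
import requests

ATT_STT_URL = "https://api.att.com/speech/v3/speechToText"  # assumed path

def transcribe(wav_path, access_token):
    with open(wav_path, "rb") as f:
        resp = requests.post(
            ATT_STT_URL,
            data=f,
            headers={
                "Authorization": "Bearer " + access_token,
                "Content-Type": "audio/wav",
                "Accept": "application/json",
            },
            timeout=10,
        )
    resp.raise_for_status()
    # Assumed response shape: Recognition -> NBest -> ResultText
    nbest = resp.json()["Recognition"]["NBest"]
    return nbest[0]["ResultText"] if nbest else ""
```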


The Most Random Electronic Dice Yet


If you’ve written a great library to generate random numbers with a microcontroller, what’s the first thing you would do? Build an electronic pair of dice, of course.

[Walter] created the Entropy library for AVRs as a reliable source of true random numbers. It works by exploiting the watchdog timer's natural jitter; that's not fast by any means, but most sources of true entropy aren't that fast anyway. After sampling a whole lot of AVR chips and running a few statistical tests, [Walter] found the library to be a pretty good source of randomness, at least as good as a pair of dice.

The circuit itself uses two 8×8 LED matrices from Adafruit, an Arduino, and a pair of buttons. The supported modes are 2d6, 2d4, 2d8, 2d10, 1d12, 1d20, a deck of cards, a single hex number, a single 8-bit binary number, or an eight-character alphanumeric password. That's more than enough for D&D, or for when you really need an unguessable password. Video demo below.
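Turning raw entropy into fair dice takes one more step than you might expect: a plain modulo skews the distribution whenever 256 isn't a multiple of the die size. A quick Python illustration of the rejection-sampling fix (the AVR library handles ranged values its own way on-chip):

```python
# Mapping uniform random bytes to unbiased die rolls. Naive `byte % sides`
# favors low faces when 256 % sides != 0; rejection sampling avoids that.
import os

def roll(sides: int) -> int:
    limit = 256 - (256 % sides)      # largest multiple of `sides` <= 256
    while True:
        b = os.urandom(1)[0]         # stand-in for a hardware entropy byte
        if b < limit:                # reject the biased tail
            return b % sides + 1

print(roll(6) + roll(6))             # one 2d6 result
```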

Continue reading “The Most Random Electronic Dice Yet”

Transcribing Piano Rolls with Python

[Image: piano roll]


Perforated rolls of paper, called piano rolls, are used to input songs into player pianos. The image above was taken from a YouTube video showing a player piano playing a Gershwin tune called Limehouse Nights. There’s no published sheet music for the song, so [Zulko] decided to use Python to transcribe it.

First off, the video was downloaded from YouTube and processed with the MoviePy library to create a single tall image plotting all of the notes. Using a Fourier transform, the horizontal spacing between note columns was found. This allowed the image to be downscaled so that one pixel column corresponds to one key.
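The spacing trick works because the column-density profile of a piano roll is nearly periodic. Here's a sketch of how the strongest FFT peak yields the pixel distance between key tracks (illustrative, not [Zulko]'s exact code):

```python
# Estimate the horizontal spacing between key columns from the strongest
# peak in the FFT of the per-column ink density.
import numpy as np

def key_spacing(image: np.ndarray) -> float:
    """image: 2D uint8 array, dark perforations on a light background."""
    profile = (255.0 - image).sum(axis=0)    # ink density per pixel column
    profile -= profile.mean()                # remove the DC component
    spectrum = np.abs(np.fft.rfft(profile))
    freqs = np.fft.rfftfreq(len(profile))    # cycles per pixel
    k = spectrum[1:].argmax() + 1            # strongest nonzero-frequency peak
    return 1.0 / freqs[k]                    # period, in pixels per key
```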

With that done, each column could be assigned to a specific note on the piano. That takes care of the pitches, but the note durations require more processing. The Fourier transform is applied again to determine the length of a quarter note. With this known, the notes can be quantized and a duration assigned to each.
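Quantization itself is simple once the quarter-note length is in hand: each raw duration gets snapped to the nearest multiple of some smallest unit. A sketch, assuming sixteenth-note resolution:

```python
# Snap raw note lengths (in pixels/frames) to the nearest sixteenth note.
import numpy as np

def quantize(durations, quarter_len):
    sixteenth = quarter_len / 4.0
    steps = np.maximum(1, np.round(np.asarray(durations) / sixteenth))
    return steps * sixteenth

print(quantize([9.7, 20.4, 41.0], quarter_len=40))   # -> [10. 20. 40.]
```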

Once the pitches and durations are known, it's time to export sheet music. For this, [Zulko] used LilyPond, an open-source music notation language that converts plain ASCII text into a sheet-music PDF. The final result is a playable score of the piece, which you can watch after the break.
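LilyPond input is plain text, so emitting a score is just string building. A minimal sketch (the note names and version header follow standard LilyPond syntax; the pitch list itself is made up):

```python
# Write a tiny LilyPond file; `lilypond score.ly` then renders the PDF.
notes = [("c'", 4), ("e'", 4), ("g'", 2)]   # (pitch, 4 = quarter, 2 = half)

body = " ".join(f"{pitch}{dur}" for pitch, dur in notes)
with open("score.ly", "w") as f:
    f.write('\\version "2.18.2"\n{ %s }\n' % body)
```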

Continue reading “Transcribing Piano Rolls with Python”

QtLedTest – Software to Evaluate OLED Displays

A few days ago we featured the USBPass, an offline password keeper made with very few components. At the end of our write-up we mentioned that [Josh] was already working on another version of his hardware, which involved adding an OLED screen to the platform. To help him pick one he created QtLedTest, a Qt-based tool that simulates different OLED displays and GUI layouts for them. Internally QtLedTest is composed of QLedMatrix (a widget that simulates LED matrices), an SSD1306 OLED controller simulator, a simple graphics drawing library and some functions to draw text on the simulated screen. [Josh] used Fontbuilder together with a program he made in order to convert fonts he had found on the internet to C files. All the source code [Josh] made can be found on Github and should be updated in coming weeks as the final program is a bit slow to render the simulated screens.