The Machine That Japed: Microsoft’s Humor-Emulating AI

Ten years ago, highbrow culture magazine The New Yorker started a contest. Each week, a cartoon with no caption is published in the back of the magazine. Readers are encouraged to submit an apt and hilarious caption that captures the magazine’s famed wit. Editors select the top three entries to vie for reader votes and the prestige of having captioned a New Yorker cartoon.

The magazine receives about 5,000 submissions each week, which are scrutinized by cartoon editor [Bob Mankoff] and a parade of assistants who burn out after a year or two. But soon, [Mankoff]’s assistants may have their own assistant thanks to Microsoft researcher [Dafna Shahaf].

[Dafna Shahaf] heard [Mankoff] give a speech about the New Yorker cartoon archive a year or so ago, and it got her thinking about what such a vast collection could offer artificial intelligence. The intricate nuances of humor and wordplay have long presented a special challenge to the creators of AI. [Shahaf] wondered: could computers begin to learn what makes a caption funny, given a big enough canon?

[Shahaf] threw ninety years’ worth of wry, one-panel humor at the system. Given this knowledge base, she trained it to choose funny captions for cartoons based on the jokes of similar cartoons. But in order to help [Mankoff] and his assistants choose among the entries, the AI must be able to rank the comedic value of jokes. And since computer vision software is made to decipher photos and not drawings, [Shahaf] and her team faced another task: assigning keywords to each cartoon. The team described each one in terms of its contextual anchors and then its situational anomalies. For example, in the image above, the context keywords could be car dealership, car, customer, and salesman. Anomalies might include claws, fangs, and zoomorphic automobile.
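To get a rough feel for how keyword-based ranking could work, here is a short illustrative sketch in Python. To be clear, this is our own toy example, not Microsoft’s actual model – the keywords, weights, and captions below are all invented. It simply rewards captions that play off a cartoon’s anomalies over ones that only mention the mundane context:

```python
import re

# Illustrative sketch only -- not Microsoft's actual ranking model.
# It scores each submission by how many of the cartoon's hand-assigned
# keywords it touches, weighting the "anomaly" keywords (the joke) over
# the ordinary "context" keywords.

def score_caption(caption, context_keywords, anomaly_keywords,
                  anomaly_weight=2.0, context_weight=1.0):
    """Crude relevance score: reward captions that play on the anomalies."""
    words = set(re.findall(r"[a-z]+", caption.lower()))
    return (anomaly_weight * len(words & anomaly_keywords)
            + context_weight * len(words & context_keywords))

def rank_captions(captions, context_keywords, anomaly_keywords):
    """Sort submissions so the most keyword-relevant captions float to the top."""
    return sorted(captions,
                  key=lambda c: score_caption(c, context_keywords, anomaly_keywords),
                  reverse=True)

# Hypothetical keywords for the car-dealership cartoon described above.
context = {"car", "dealership", "customer", "salesman"}
anomalies = {"claws", "fangs"}

submissions = [
    "Do you have this in blue?",
    "The fangs are standard; the claws cost extra.",
    "Ask the salesman about the extended warranty.",
]
print(rank_captions(submissions, context, anomalies))
```

The real system presumably does far more than count keyword hits, but the basic job is the same: turn 5,000 free-text entries into a ranked list a human editor can skim.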

The result is about the best that could realistically be hoped for. All of the cartoon editors’ chosen winners showed up in the AI’s top 55.8 percent, which means the AI could ultimately help [Mankoff and Co.] weed out the bottom 44 percent of entries without discarding a single eventual winner. While [Mankoff] sees the study’s results as a positive thing, he’ll continue to hire assistants for the foreseeable future.

Humor-enabled AI may still be in its infancy, but the implications of the advancement are already great. To give personal assistants like Siri and Cortana a funny bone is to make them that much more human. But is that necessarily a good thing?

[via /.]

image of driftwood on a beach

Ask Hackaday: Not Your Mother’s Feedback

Imagine you were walking down a beach, and you came across some driftwood resting against a pile of stones. You see it in the distance, and your brain has no trouble figuring out what you’re looking at. You see driftwood and rocks – you can clearly distinguish between the two objects without a second thought.

Think about the raw data entering the brain. The textures of the rocks and the driftwood are similar. The colors are similar. The irregular shapes are similar. Thus the raw data entering the brain’s V1 area for both objects must be similar as well. Now think about the borders that separate the pieces of driftwood from the edges of the rocks. From a raw data perspective, there is no border, and likewise no separation, because the two objects are so similar. Yet your brain can clearly see a rock and a piece of driftwood – two distinctly different objects. So how does the brain do this? How does it so easily differentiate between the two? If the raw data on either side of the border separating the wood and the rocks is the same, then there must be an outside influence determining where that border is. [Jeff Hawkins] believes this outside influence is a very special and most interesting type of feedback. Read on as we explain and attempt to implement this form of feedback in our hierarchical structure of invariant representations.
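To make the idea a little more concrete, here is a toy sketch – our own illustration, not [Hawkins]’ actual algorithm, and every number in it is invented. Ambiguous bottom-up guesses about each patch of the scene get multiplied by a top-down expectation fed back from a higher-level “driftwood resting against rocks” hypothesis, and the border appears where the fused labels flip:

```python
# Toy sketch of top-down feedback (invented numbers, not Hawkins' algorithm).
# Bottom-up evidence per patch is ambiguous: rock and driftwood textures
# look nearly identical at the lowest level.
bottom_up = [
    {"rock": 0.52, "wood": 0.48},   # patch 0
    {"rock": 0.51, "wood": 0.49},   # patch 1
    {"rock": 0.49, "wood": 0.51},   # patch 2
    {"rock": 0.48, "wood": 0.52},   # patch 3
]

# Top-down prior fed back from the scene hypothesis "driftwood against rocks":
# the higher level expects rock on the left and wood on the right.
top_down = [
    {"rock": 0.8, "wood": 0.2},
    {"rock": 0.8, "wood": 0.2},
    {"rock": 0.2, "wood": 0.8},
    {"rock": 0.2, "wood": 0.8},
]

def fuse(likelihood, prior):
    """Combine bottom-up evidence with top-down expectation (unnormalised Bayes)."""
    scores = {label: likelihood[label] * prior[label] for label in likelihood}
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

labels = []
for patch, (lik, pri) in enumerate(zip(bottom_up, top_down)):
    posterior = fuse(lik, pri)
    label = max(posterior, key=posterior.get)
    labels.append(label)
    print(f"patch {patch}: {label}")

# The border falls where the labels flip, even though the raw data barely changes.
print("segmentation:", labels)
```

The raw data alone can’t say where the wood ends and the rock begins; the expectation flowing back down the hierarchy is what draws the line.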

Continue reading “Ask Hackaday: Not Your Mother’s Feedback”

Echo, The First Useful Home Computer Intelligence?

We’re familiar with features like Siri or Microsoft’s Cortana which grope at a familiar concept from science fiction, yet leave us doing silly things like standing in public yowling at our phones. Amazon took a new approach to the idea of an artificial steward by cutting the AI free from our peripherals and making it an independent unit that acts in the household like any other appliance. Instead of steering your starship, however, it can integrate with your devices via Bluetooth to aid in tasks like writing shopping lists, or simply help you remember how many quarts are in a liter. Whatever you ask for, Echo will oblige.

The device is little more than the internet and a speaker stuffed into a minimal black cylinder the size of a vase – oh, and six far-field microphones aimed in each direction which listen to every word you say… always. As you’d expect, Echo only processes what you say after you call it to attention by speaking its given name. If you happen to be too far away for the directional microphones to hear, you can alternatively seek assistance from the Echo app on another device. Not bad for the freakishly low price Amazon’s asking, which is $100 for Prime subscribers. Even if you’re salivating over the idea of this chatting obelisk, or intrigued enough to buy one just to check it out (and pop its little seams), they’re only available to purchase through invite at the moment… the likes of which are said to go out in a few weeks.

The notion of the internet at large acting as an invisible, ever-present Swiss Army knife of knowledge for the home is admittedly pretty sweet. It pulls on our wishful heartstrings for futuristic technology. The success of Echo as a first of its kind, however, relies on how seamlessly (and quickly) the artificial intelligence within it performs. If it can hold up, or prove to hold up in further iterations, it’s exciting to think what larger systems the technology could be integrated with in the near future… We might have our command center consciousness sooner than we thought.

With that said, inviting a little WiFi probe into your intimate living space to listen in on everything you do will take some getting used to… your thoughts?

Continue reading “Echo, The First Useful Home Computer Intelligence?”

binary hierarchy

Ask Hackaday: Sequences Of Sequences

In a previous article, we talked about the idea of the invariant representation and theorized different ways of implementing such an idea in silicon. The hypothetical example of identifying a song without knowledge of pitch or form was used to help create a foundation to support the end goal – to identify real world objects and events without the need for predefined templates. Such a task is possible if one can separate the parts of real world data that change from those that do not. By only looking at the parts of the data that don’t change – the invariant parts – one can identify real world events with superior accuracy compared to a template based system.

Consider a friend’s face. Imagine she was sitting in front of you, and her face took up most of your visual space. Your brain identifies the face as your friend without trouble. Now imagine you were in a crowded nightclub, and you were looking for the same friend. You catch a glimpse of her from several yards away, and your brain IDs the face without trouble – almost as easily as it did when she was sitting in front of you.

I want you to think about the raw data coming off the eye and going into the brain during both scenarios. The two sets of data would be completely different. Yet your brain is able to find a commonality between the two events. How? It can do this because the data that makes up the memory of your friend’s face is stored in an invariant form. There is no template of your friend’s face in your brain. It only stores the parts that do not change – such as the distance between the eyes, the distance between the eye and the nose, or the ear and the mouth. The shape her hairline makes on her forehead. These types of data points do not change with distance, lighting conditions or other ‘noise’.
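As an aside, one simple way to see how a measurement can survive a change in viewing distance is to store ratios of distances rather than the distances themselves. The sketch below is only an illustration – the landmark names and coordinates are invented, and it’s not a claim about how the brain actually encodes faces:

```python
# Illustration only: ratios of landmark distances stay the same no matter
# how far away the face is, while the raw pixel distances do not.
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def face_signature(landmarks):
    """Ratios of inter-landmark distances; unchanged when the whole face is scaled."""
    eye_gap   = dist(landmarks["left_eye"], landmarks["right_eye"])
    eye_nose  = dist(landmarks["left_eye"], landmarks["nose"])
    ear_mouth = dist(landmarks["ear"], landmarks["mouth"])
    return (eye_nose / eye_gap, ear_mouth / eye_gap)

# Hypothetical landmark positions (pixels) for a close-up view...
close_up = {"left_eye": (100, 100), "right_eye": (160, 100),
            "nose": (130, 140), "ear": (60, 110), "mouth": (130, 180)}
# ...and the same face seen from ten times farther away.
far_away = {k: (x / 10, y / 10) for k, (x, y) in close_up.items()}

print(face_signature(close_up))   # roughly (0.833, 1.650)
print(face_signature(far_away))   # identical ratios
```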

One can argue over the specifics of how the brain does this. True or not, the idea of the invariant representation is a powerful one, and implementing such an idea in silicon is a worthy goal. Read on as we continue to explore this idea in ever deeper detail.

Continue reading “Ask Hackaday: Sequences Of Sequences”

Ask Hackaday: What Are Invariant Representations?

Your job is to make a circuit that will illuminate a light bulb when it hears the song “Mary Had a Little Lamb”. So you breadboard a mic, op amp, your favorite microcontroller (and an ADC if needed) and get to work. You will sample the incoming data and compare it to a known template. When you get a match, you light the light. The first step is to make the template. But what to make the template of?
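Before the boss complicates things, that naive plan might look something like the sketch below – record one reference clip, slide it across the incoming audio, and trip the light when the normalized cross-correlation clears a threshold. The function names and the 0.8 threshold are our own placeholders, not a finished design:

```python
# Minimal sketch of the naive template approach: slide one recorded
# reference clip across the incoming samples and flag a strong match.
import numpy as np

def matches_template(samples: np.ndarray, template: np.ndarray,
                     threshold: float = 0.8) -> bool:
    """Return True if any window of `samples` correlates strongly with `template`."""
    t = (template - template.mean()) / (template.std() + 1e-9)
    n = len(template)
    step = max(1, n // 4)                      # hop a quarter-template at a time
    for start in range(0, len(samples) - n + 1, step):
        w = samples[start:start + n]
        w = (w - w.mean()) / (w.std() + 1e-9)
        if np.dot(w, t) / n > threshold:       # normalized cross-correlation
            return True
    return False

# On real hardware, light_the_light() would drive a GPIO pin:
# if matches_template(adc_samples, mary_template):
#     light_the_light()
```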

“Hey boss, what style of the song do you want to trigger the light? Is it children singing, piano, what?”

Your boss responds:

“I want the light to shine whenever any version of the song occurs. It could be singing, keyboard, guitar, any musical instrument or voice in any key. And I want it to work even if there’s a lot of ambient noise in the background.”

Uh oh. Your job just got a lot harder. Is it even possible? How do you make templates of every possible version of the song? Stumped, you talk about your dilemma over lunch with your friend, who just so happens to be [Jeff Hawkins] – a guy who’s already put a great deal of thought into this very problem.

“Well, the brain solves your puzzle easily,” [Hawkins] says coolly. “Your brain can recall the memory of that song whether it’s vocal or instrumental, in any key or pitch. And it can pick it out from a lot of noise.”

“Yeah, but how does it do that, though?” you ask. “The patterns of electrical signals entering the brain have to be completely different for different versions of the song, just like the patterns from my ADC. How does the brain store the countless number of templates required to ID the song?”

“Well…” [Hawkins] chuckles. “The brain does not store templates like that. The brain only remembers the parts of the song that don’t change – the parts that are invariant. The brain forms what we call invariant representations of real world data.”

Eureka! Your riddle has been solved. You need to construct an algorithm that stores only the parts of the song that don’t change. These parts will be the same in all versions – vocal or instrumental, in any key. It will be these invariant, unchanging parts of the song that you will look for to trigger the light. But how do you implement this in silicon?
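One candidate – offered here only as an illustrative sketch, not the definitive answer – is to store the melody as the steps between successive notes rather than the notes themselves. The step sequence is identical in every key, so a single stored pattern covers the piano, guitar, and vocal versions alike:

```python
# Sketch of one invariant representation: relative pitch steps (semitones)
# between successive notes instead of absolute notes.

def intervals(midi_notes):
    """Relative pitch steps between successive notes -- key-invariant."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]

# "Mary Had a Little Lamb", opening phrase, as MIDI note numbers in C major...
in_c = [64, 62, 60, 62, 64, 64, 64]
# ...and the same phrase transposed up to G major.
in_g = [71, 69, 67, 69, 71, 71, 71]

print(intervals(in_c))   # [-2, -2, 2, 2, 0, 0]
print(intervals(in_g))   # the same list -- one stored pattern matches both

def is_mary(midi_notes, stored=tuple(intervals(in_c))):
    return tuple(intervals(midi_notes)) == stored

print(is_mary(in_g))     # True -> trigger the light
```

Tempo, form, and background noise still need handling, but the key-invariance problem disappears with the templates.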

Continue reading “Ask Hackaday: What Are Invariant Representations?”

comic of chinese room

Ask Hackaday: Program Passes Turing Test, But Is It Intelligent?

turing test program screenshot

A team based in Russia has developed a program that has passed the iconic Turing Test. The test was carried out at the Royal Society in London, where the program convinced 33 percent of the judges that it was a 13-year-old Ukrainian boy named Eugene Goostman.

The Turing Test was developed by [Alan Turing] in 1950 as an existence proof for intelligence: if a computer can fool a human operator into thinking it’s human, then by definition the computer must be intelligent. It should be noted that [Turing] did not address what intelligence was, but only tried to identify human-like behavior in a machine.

Thirty years later, a philosopher by the name of [John Searle] argued that even a machine that could pass the Turing Test would not necessarily be intelligent. He did this through a fascinating thought experiment called “The Chinese Room”.

Continue reading “Ask Hackaday: Program Passes Turing Test, But Is It Intelligent?”

Pokemon Artificial Intelligence

Pokemon Artificial Intelligence Is Smarter Than You

Who out there hasn’t angrily thrown a game controller across the room after continually getting killed by some stupid game-controlled villain? That is such a bummer! You probably wished there was some way to ‘just get past that point’. To take a step in that direction, [Ben] created an Artificial Intelligence program that will win at the Game Boy classic Pokemon Blue.

The game is run in a Game Boy Advance emulator known as Visual Boy Tracer, which itself is a modified version of the most common GBA emulator, Visual Boy Advance. What sets Visual Boy Tracer apart from the rest is that it has a memory dump feature which allows the user to dump both the RAM and the ROM out of the emulator. The RAM holds all the values the game currently needs – everything from text-arrow flash timers to details about the currently battling Pokemon to the player’s position in the currently loaded map. The memory dump feature is key to allowing the AI to understand what is happening in the game.
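To get a feel for why that matters, here’s a hypothetical sketch of the idea – the offsets below are invented placeholders, not the real Pokemon Blue RAM map. Once RAM can be dumped, the AI reads game state as plain bytes instead of trying to parse pixels:

```python
import struct

# Hypothetical offsets into a dumped work-RAM file -- NOT the real
# Pokemon Blue memory map, just placeholders to show the idea.
PLAYER_X      = 0x0010
PLAYER_Y      = 0x0011
MAP_ID        = 0x0012
ENEMY_HP      = 0x0020   # two bytes, big-endian in this made-up layout
PARTY_LEAD_HP = 0x0022

def read_state(ram_dump_path):
    """Pull a few gameplay values straight out of a raw RAM dump."""
    with open(ram_dump_path, "rb") as f:
        ram = f.read()
    return {
        "player_x": ram[PLAYER_X],
        "player_y": ram[PLAYER_Y],
        "map_id":   ram[MAP_ID],
        "enemy_hp": struct.unpack_from(">H", ram, ENEMY_HP)[0],
        "lead_hp":  struct.unpack_from(">H", ram, PARTY_LEAD_HP)[0],
    }

# state = read_state("wram.dump")
# if state["enemy_hp"] == 0:
#     advance_to_next_battle()   # hypothetical decision hook for the AI
```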

Continue reading “Pokemon Artificial Intelligence Is Smarter Than You”