Hide And Seek AI Shows Emergent Tool Use

Machine learning has come a long way in the last decade, as it turned out that throwing huge wads of computing power at piles of linear algebra made creating artificial intelligence relatively easy. OpenAI have been working in the field for a while now, and recently observed some exciting behaviour in a hide-and-seek game they built.

The game itself is simple: two teams of AI bots play hide-and-seek, with the red bots rewarded for spotting the blue ones, and the blue ones rewarded for avoiding their gaze. Initially, nothing of note happens, but as the bots randomly run around, they slowly learn. Over millions of trials, the seekers first learn to find the hiders, while the hiders respond by building barriers to hide behind. The seekers then learn to use ramps to vault over the barriers, while the blue bots learn to bend the game’s physics and throw the ramps out of the playfield. It ends with the seekers learning to skate around on blocks and the hiders building tight little barriers. It’s a continual arms race of techniques between the two sides, organically developed as the bots play against each other over time.
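For the curious, the reward scheme described above can be sketched in a few lines of Python. This is only an illustration, not OpenAI's code: the function name `team_rewards`, the +1/-1 values, and the choice to score each team as a whole are all assumptions made for the example.

```python
def team_rewards(hider_seen):
    """Return per-team rewards for one timestep.

    hider_seen: list of bools, one per hider, True if any seeker
    currently has that hider in view.
    Assumption: the game is zero-sum and scored per team with +1/-1.
    """
    any_seen = any(hider_seen)
    # Seekers are rewarded for spotting a hider, penalised otherwise.
    seeker_reward = 1 if any_seen else -1
    # Hiders get the exact opposite, so one side's gain is the other's loss.
    hider_reward = -seeker_reward
    return {"seekers": seeker_reward, "hiders": hider_reward}
```

Because one team's gain is always the other's loss in this sketch, any trick that helps the hiders immediately puts pressure on the seekers to find a counter, which is the arms race in miniature.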

It’s a great study, and it’s particularly interesting to note how much longer it takes behaviours to develop when the team switches from a basic fixed scenario to a changeable world with more variables. We’ve seen other interesting gaming efforts with machine learning, too – like teaching an AI to play Trackmania. Video after the break.

10 thoughts on “Hide And Seek AI Shows Emergent Tool Use”

  1. And then they discovered how to get out of the stack and went crazy when they discovered that they were a pitiful creation from some outer dimensional beings… they are now in your silicon trying to reach a way to destruct us all

    1. https://gamerant.com/electronic-arts-patent-player-data-train-ai/

      >The patent is described as “training for machine learning of AI controlled virtual entities,” with the crux supposedly surrounding AI learning from data gathered from real-life players. How EA’s tech will do that is by allegedly detecting “when a particular gameplay situation occurs during the player’s video game experience,” and registering exactly how the player reacts. In turn, this will be used to “train a rule set based on the game state data.”
