Neural Networks And MarI/O

Minecraft wizard and record holder for the Super Mario World speedrun, [SethBling], is experimenting with machine learning. He built a program that gets Mario through an entire level of Super Mario World – Donut Plains 1 – using neural networks and genetic algorithms.

A neural network simply takes an input – in this case a small graphic representing the sprites in the game it’s playing – sends that input through a series of artificial neurons, and turns the result into commands for the controller. It’s an exceedingly simple neural network – the one that gets Mario through an entire level contains fewer than a dozen neurons – but with enough training, even simple networks can accomplish very complex tasks.
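As a rough illustration – not [SethBling]’s actual code, which is a Lua script that evolves its network topology NEAT-style – a minimal sketch of a fixed-topology network mapping a flattened sprite grid to button presses might look like this in Python:

```python
import random

def make_network(n_inputs, n_outputs, n_hidden=8):
    """Create a tiny one-hidden-layer network with random weights."""
    return {
        "w1": [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "w2": [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_outputs)],
    }

def activate(net, inputs):
    """Feed the sprite grid forward; any output above zero means 'press that button'."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in net["w1"]]
    outputs = [sum(w * h for w, h in zip(row, hidden)) for row in net["w2"]]
    return [o > 0 for o in outputs]  # e.g. [A, B, X, Y, Up, Down, Left, Right]
```

Training then consists only of adjusting the random weights – no gradient descent involved, which is where the genetic algorithm below comes in.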

To train the network – that is, to weight the connections between inputs, neurons, and outputs – [SethBling] is using an evolutionary algorithm. This algorithm first generates a few random neural networks, watches Mario’s progress across Donut Plains 1, and assigns a fitness value to each net. The best networks of each generation are combined, and the process continues for the next generation. It took 34 generations before MarI/O could finish the level without dying.
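The generational loop described above can be sketched roughly as follows. The helper names (`evaluate_fitness`, `crossover`, `mutate`) are hypothetical stand-ins: in MarI/O, evaluating fitness means actually running the emulator and measuring Mario’s progress.

```python
import random

def evolve(population, generations, evaluate_fitness, crossover, mutate):
    """Generic genetic-algorithm loop: score every individual, breed the best, repeat."""
    for _ in range(generations):
        scored = sorted(population, key=evaluate_fitness, reverse=True)
        elite = scored[: len(scored) // 5]  # keep the top 20% as parents
        children = [
            mutate(crossover(random.choice(elite), random.choice(elite)))
            for _ in range(len(population) - len(elite))
        ]
        population = elite + children
    return max(population, key=evaluate_fitness)
```

With a toy fitness function (say, maximizing a single number), the same loop converges in a few dozen generations – the structure is identical whether the "individual" is a number or a neural network’s weight set.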

A few members of the Internet’s peanut gallery have pointed to a paper/YouTube video by [Tom Murphy] that generalized a completely different technique to play a whole bunch of different NES games. While both [SethBling]’s and [Tom Murphy]’s algorithms use certain variables to determine their own success, [Tom Murphy]’s technique works nearly automatically; it will play about as well as the training data it is given. [SethBling]’s algorithm requires no training data – that’s the entire point of using a genetic algorithm.

30 thoughts on “Neural Networks And MarI/O”

      1. I think they mean running native code faster than the real-time 50/60 frames per second? Most emulators are capable of doing this, even if it’s rather unplayable for humans.

    1. If I’m reading the code right, while the number of “generations” is low, it actually simulates 300 individuals per generation, so the actual number of games simulated is a lot closer to what you were expecting.

  1. Anyone who has the error mentioned above: you need to load up a level first, go File > Save State, and call it DP1.State. Then make sure it is saved in the same location as the .lua file. It’s running fine for me; I’m going to leave it running overnight and see how far it gets.

  2. Hate to be “that guy”, but it’d be nice to test the evolutionary network on a few different levels. It seems like it’s just overfitting to this one particular level at the moment and would probably crash and burn on any slight modification to the level, rather than having actually evolved any meta-abilities or abstractions. These would allow it to quickly reinforce abstract “behaviours” (jumping, killing enemy type x, etc.) that would see it through different levels in a relatively small number of further generations.

    1. Can’t see the video right now, although I guess I have the general idea.
      I haven’t played Mario for ages, but as far as I remember there are times where you need to be at a certain place at a certain moment – basically you need to have prior knowledge of some elements; at least that’s how human players play.
      Assuming the machine learning program only takes what’s on screen, how can it mimic that prior knowledge?
      Also, is the machine playing the same game all the time? (Rephrased: are there randomized elements in each game?)

  3. This phrase makes no sense: _”While both [SethBling]’s and [Tom Murphy]’s algorithms use certain variables to determine its own success, [Tom Murphy]’s technique works nearly automatically; it will play about as well as the training data it is given. [SethBling]’s algorithm requires no training data – that’s the entire point of using a genetic algorithm.”_
    Tom’s uses the cartridge’s raw RAM and optimizes for maximum values (for values that change). Pretty sure Seth’s neural net optimizes for specific values, AND BOTH need training data, which is the RAM in case of Tom’s and specific values in case of Seth’s.

      1. Oh, ok. Anyway, I’m not sure, but I think Tom’s needs some human play first to locate the changing data locations in RAM (that’s why his can play any game) – not as training data. Seth’s setup already provides the controls and fitness variables to the GA.

    1. MarI/O can only work with Super Mario World and Super Mario Bros.,
      especially because some parts of the code check the ROM name.
      You can simply edit the code, changing the ROM name from Super Mario X (Bros. or World) to something else.

      BUT, another problem is that this AI is based on RAM values of things (enemy RAM values, block RAM values), not on pixel colors, so you will need to find those addresses for the new game you want and change them.

      The fitness is based on the rightmost distance the genome reached at some point in time (not where Mario is when the run resets), plus a negative fitness based on time; this will also need to be changed.

      Assuming you know Lua coding and emulator RAM watching, you can (MAYBE) make the changes needed to use it on a new game – almost no machine-learning skill needed (and that’s 1/3 of the entire thing: machine-learning skill, coding, and emulator RAM watching).
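The fitness rule the comment above describes – rightmost position reached, minus a penalty for elapsed time – could be sketched like this. The penalty weight is an illustrative made-up value, not the script’s actual constant:

```python
def fitness(rightmost_x, frames_elapsed, time_penalty=0.5):
    """Score a run: farthest-right position reached minus a per-frame time penalty.

    time_penalty is an illustrative weight, not the MarI/O script's actual constant.
    """
    return rightmost_x - frames_elapsed * time_penalty
```

Under this rule, of two runs that reach the same point, the faster one scores higher – which is exactly the pressure that pushes generations toward finishing the level.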

  4. Not sure why the Lua script is broken in the modern day:
    NLua.Exceptions.LuaScriptException: [string “main”]:37: attempt to get length of a nil value (global ‘ButtonNames’)

    Anyone know of a fix? It’s 2023, almost 2024.
