Minecraft wizard and Super Mario World speedrun record holder [SethBling] is experimenting with machine learning. He built a program that gets Mario through an entire level of Super Mario World – Donut Plains 1 – using neural networks and genetic algorithms.
A neural network simply takes an input, in this case a small graphic representing the sprites in the game it’s playing, sends that input through a series of artificial neurons, and turns the result into commands for the controller. It’s an exceedingly simple neural network – the network that can get Mario through an entire level has fewer than a dozen neurons – but with enough training, even simple networks can accomplish very complex tasks.
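As a rough illustration of that input-to-controller pipeline, here is a minimal Python sketch. It is not [SethBling]’s script: the button list, the tile encoding (1 for a solid tile, -1 for an enemy, 0 for empty), the single hidden layer, and the `tanh` activation are all assumptions made for the example.

```python
import math

# Assumed controller layout for the example.
BUTTONS = ["A", "B", "X", "Y", "Up", "Down", "Left", "Right"]

def evaluate(inputs, hidden_weights, output_weights):
    """Map a flattened tile grid to a dict of button -> pressed.

    inputs: list of floats, one per tile (hypothetical encoding:
            1 = solid tile, -1 = enemy, 0 = empty).
    hidden_weights: one row of input weights per hidden neuron.
    output_weights: one row of hidden-neuron weights per button.
    """
    # Each hidden neuron sums its weighted inputs and squashes the result.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)))
        for row in hidden_weights
    ]
    # A button is held down when its weighted sum is positive.
    pressed = {}
    for button, row in zip(BUTTONS, output_weights):
        activation = sum(w * h for w, h in zip(row, hidden))
        pressed[button] = activation > 0
    return pressed
```

With two hidden neurons, one wired to a "ground ahead" tile and one to an "enemy ahead" tile, a handful of weights is already enough to express rules like "hold Right while there is ground".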
To train the network, that is, to weight the connections between inputs, neurons, and outputs, [SethBling] is using an evolutionary algorithm. This algorithm first generates a few random neural networks, watches Mario’s progress across Donut Plains 1, and assigns a fitness value to each net. The best networks of each generation are combined, and the process continues for the next generation. It took 34 generations before MarI/O could finish the level without dying.
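The generate, score, and recombine loop described above can be sketched in a few lines of Python. This is a generic generational loop under stated assumptions, not [SethBling]’s implementation: a genome is just a flat list of weights, and `toy_fitness` is a hypothetical stand-in for "how far did Mario get", since no emulator is involved here. The population size, mutation rate, and selection fraction are arbitrary choices.

```python
import random

def toy_fitness(genome):
    # Hypothetical stand-in for a game run: higher is better,
    # with the best score at an arbitrary target weight vector.
    target = [0.5, -0.25, 0.75]
    return -sum((w - t) ** 2 for w, t in zip(genome, target))

def evolve(fitness, genome_length, population_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    # Start from random genomes.
    population = [
        [rng.uniform(-1, 1) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Score every genome and keep the top quarter as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:max(2, population_size // 4)]
        # Carry the best genome over unchanged so progress is never lost.
        children = [ranked[0][:]]
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: take each weight from one parent at random.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: occasionally nudge a weight.
            child = [
                w + rng.gauss(0, 0.1) if rng.random() < 0.2 else w
                for w in child
            ]
            children.append(child)
        population = children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve(toy_fitness, genome_length=3)
    print(best, toy_fitness(best))
```

Swapping `toy_fitness` for a function that actually plays the level with the genome’s weights is where all the real work (and run time) would go.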
A few members of the Internet’s peanut gallery have pointed to a paper/YouTube video by [Tom Murphy] that generalized a completely different technique to play a whole bunch of different NES games. While both [SethBling]’s and [Tom Murphy]’s algorithms use certain variables to determine their own success, [Tom Murphy]’s technique works nearly automatically; it will play about as well as the training data it is given. [SethBling]’s algorithm requires no training data – that’s the entire point of using a genetic algorithm.
I want to point out that Seth is a prior World Record holder.
Ssssh! They’ll take away his title once they realize how he did it.
He uses the any% category which means you can glitch the game or beat it in any means necessary. He didn’t hack it so it’s valid.
now THAT is a hack! Epic work I love it and very interesting!
Science!
I get this error when I try to run the script:
LuaInterface.LuaScriptException: [string "main"]:337: attempt to index field 'neurons' (a nil value)
You guys got it to work?
Here is another article, with a fix for your problem: http://glenn-roberts.com/posts/tech/2015/07/08/neuroevolution-with-mario.html
Awesome! Nice job.
Is there a way to emulate this faster than real time? Lua might not be the right framework for that.
Faster than real time? Jump into your time machine maybe.
I think they mean running native code faster than the real-time 50/60 frames per second. Most emulators are capable of doing this, even if it is rather unplayable for humans.
I’m shocked that it finished the level after just 34 generations. Intuitively I would have guessed 100x that many were necessary.
If I’m reading the code right, while the number of “generations” is low, it actually simulates 300 individuals per generation, so the actual number of games simulated is a lot closer to what you were expecting.
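To put a number on that, here is the arithmetic as a tiny Python snippet. The 300 figure is the commenter’s reading of the script and is taken on trust here; the result is an upper bound, since it counts every individual in all 34 generations.

```python
generations = 34        # from the article
population_size = 300   # the commenter's reading of the script, not verified here

# Upper bound on the number of level attempts simulated.
max_runs = generations * population_size
print(max_runs)  # 10200
```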
Mine is currently on gen 14 and first beat it on gen 11. I’m still going strong though! (But my PC isn’t; framerates have dropped by an extremely noticeable amount.)
Anyone who has the error mentioned above: you need to load up a level first, go to File > Save State, and call it DP1.State. Then make sure it is saved in the same location as the .lua file. It’s running fine for me; I am going to leave it running overnight and see how far it gets.
I’m just thinking of Tic-Tac-Toe in War Games now :P
Pretty sure that was the reference the author was going for. :)
I read the title and thought it was about the super FX chip because it was called MARIO CHIP 1 (Mathematical, Argonaut, Rotation & I/O).
And now I am going to play Star Fox immediately.
Hate to be “that guy”, but it’d be nice to test the evolutionary network on a few different levels. It seems like it’s just overfitting to this one particular level at the moment and would probably crash and burn on any slight modification to the level at this stage rather than having actually evolved any meta-abilities or abstractions. These would allow it to quickly re-enforce abstract “behaviours” (jumping, killing enemy type x, etc.) that would see it through different levels in a relatively small number of further generations.
I am using it on different levels, working fine here.
Can’t see the video right now although I guess I have the general idea.
I’ve not played Mario for ages but as far as I remember, there are times where you need to be at a certain place at a certain moment, basically you need to have prior knowledge of some elements, at least that’s how human players play.
Assuming the machine learning program only takes what’s on screen, how can it mimic that prior knowledge?
Also, is the machine playing the same game all the time? (Rephrased: are there randomized elements in each game?)
This phrase makes no sense: “While both [SethBling]’s and [Tom Murphy]’s algorithms use certain variables to determine its own success, [Tom Murphy]’s technique works nearly automatically; it will play about as well as the training data it is given. [SethBling]’s algorithm requires no training data – that’s the entire point of using a genetic algorithm.”
Tom’s uses the cartridge’s raw RAM and optimizes for maximum values (for values that change). Pretty sure Seth’s neural net optimizes for specific values, AND BOTH need training data, which is the RAM in case of Tom’s and specific values in case of Seth’s.
The author is referring to the fact that Tom’s algorithm must first observe a human player play the level, something which Seth’s algorithm does not require.
Oh, ok. Anyway, I’m not sure, but I think Tom’s needs some human play first to locate the changing data locations in RAM (that’s why his can play any game), not as training data. Seth’s setup already provides the controls and fitness variables to the GA.
It would be nice if key presses were counted, with fewer key presses scoring better than more. This jumping around is fun to look at, but saving energy is more true to real life. Great sample though.
How do you apply MarI/O to a game?
http://glenn-roberts.com/posts/tech/2015/07/08/neuroevolution-with-mario.html
MarI/O only works with Super Mario World and Super Mario Bros.
That is mainly because some parts of the code check the ROM name.
You can simply edit the code, changing the ROM name from Super Mario X (Bros. or World) to something else.
BUT another problem is that this AI is based on RAM values (enemy RAM values, block RAM values), not on pixel colors, so you will need to find those values for the new game and change them.
The fitness is based on the rightmost distance the genome reached at any point during the run (not where Mario is when the run resets), plus a negative fitness based on time; this will also need to be changed.
Assuming you know Lua coding and emulator RAM watching, you can (MAYBE) make the changes needed to use it with a new game. Almost no AI knowledge is needed (that is one third of the entire thing: AI knowledge, coding, and emulator RAM watching).
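The fitness rule described in this comment (rightmost distance reached, minus a time-based penalty) can be sketched in Python as follows. The function name, the per-frame list of positions, and the penalty weight of 0.5 are assumptions for illustration, not values taken from the script.

```python
def fitness(x_positions, time_penalty_per_frame=0.5):
    """Score one run from Mario's horizontal position at each frame.

    x_positions: list of x coordinates, one per frame (hypothetical input;
                 a real port would read these from emulator RAM).
    time_penalty_per_frame: assumed weight for the time-based penalty.
    """
    if not x_positions:
        return 0.0
    # Reward the furthest point reached, not the position at reset.
    rightmost = max(x_positions)
    # Penalize slow runs so that dawdling genomes score lower.
    frames = len(x_positions)
    return rightmost - time_penalty_per_frame * frames
```

For a new game, both terms would need rethinking: "rightmost" only makes sense for a side-scroller, and the penalty weight has to be tuned against how fast positions change.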
Not sure why the Lua script is broken nowadays:
NLua.Exceptions.LuaScriptException: [string "main"]:37: attempt to get length of a nil value (global 'ButtonNames')
Anyone know of a fix? It’s 2023, almost 2024.