Genetic Algorithm Programmer Gets Functions

[Kory] has been writing genetic algorithms for a few months now. This in itself isn’t anything unique or exceptional, except for what he’s getting these genetic algorithms to do. [Kory] has been using genetic algorithms to write programs in Brainfuck. Yes, it’s a computer programming a computer. Be thankful Skynet is 18 years late.

When we first saw [Kory]’s work, he had programmed a computer to write and run its own programs in Brainfuck. Although the name of the language [Kory] chose could use some work, it’s actually the ideal language for computer-generated programs. With only eight commands, each consisting of a single character, it greatly reduces the overhead of what any genetic algorithm must produce and what a fitness function must evaluate.
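Those eight single-character commands are simple enough that a complete evaluator fits in a few dozen lines, which is exactly why the language suits computer-generated code. As a rough illustration (a sketch, not [Kory]'s actual interpreter), here is a minimal Brainfuck runner in Python, with a step cap added because randomly generated programs frequently loop forever:

```python
def run_bf(code, tape_len=30000, max_steps=100000):
    """Run a Brainfuck program and return its output as a string.

    max_steps caps execution so random GA-generated programs cannot
    hang the evaluator (a practical necessity, not part of the spec).
    Assumes the brackets in `code` are balanced; a real GA harness
    would need to reject or repair unbalanced programs first.
    """
    # Pre-match brackets so '[' and ']' jumps are O(1) during execution.
    jumps, stack = {}, []
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jumps[i], jumps[j] = j, i

    tape = [0] * tape_len
    ptr = pc = steps = 0
    out = []
    while pc < len(code) and steps < max_steps:
        c = code[pc]
        if c == '>':
            ptr += 1
        elif c == '<':
            ptr -= 1
        elif c == '+':
            tape[ptr] = (tape[ptr] + 1) % 256   # byte cells wrap at 256
        elif c == '-':
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '.':
            out.append(chr(tape[ptr]))
        elif c == '[' and tape[ptr] == 0:
            pc = jumps[pc]
        elif c == ']' and tape[ptr] != 0:
            pc = jumps[pc]
        # ',' (input) omitted for brevity
        pc += 1
        steps += 1
    return ''.join(out)

print(run_bf('++++++++[>++++++++<-]>+.'))  # prints 'A' (8 * 8 + 1 = 65)
```

A fitness function only has to compare the returned string against a target, which keeps evaluation cheap.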

There was one shortcoming to [Kory]’s initial efforts: functions. It’s relatively easy to get a program to say Hello World, but to do something complex, you’re going to need something like a macro or a function. Brainfuck, in its most simple form, doesn’t support functions. This throws a wrench in [Kory]’s plan to have his computer-programming computer grow smarter and get over local minima in its genetic algorithms.

The solution to this problem was the creation of a new dialect of Brainfuck [Kory] calls BrainPlus. This takes the best parts of Extended Brainfuck and adds a command that basically serves as a break statement.

With this, [Kory]’s self-programming computer can develop more complex programs. Already it has created a program to generate the first few numbers of the Fibonacci sequence. It only goes up to 233 because 255 is the maximum value for a byte, and the program itself took seven hours to generate. It does, however, work. Other programs generated with the new BrainPlus functions include reciting 99 Bottles of Beer on the Wall and a program that multiplies two values.

Even though [Kory]’s computer takes a long time to generate these programs, given enough time, there’s really not much it can’t do. Brainfuck, and [Kory]’s BrainPlus, are Turing complete, so given infinite memory and time they can compute anything. With the new addition of functions, they can compute it faster.

All the code for [Kory]’s GA is available on GitHub.


47 thoughts on “Genetic Algorithm Programmer Gets Functions”

    1. OK, I’ll play along because I freely admit I know nothing about this either. I think the proper response in cases such as these is to confidently assert that it all could have been done much more efficiently with a 555 timer.

  1. If you’re being picky, what about the “it its” that should be “in its”?
    It’s about the content, not the production. Made by people. In the spirit of the attitude, a little rough around the edges.

    1. If X=0 Then Z=Y/X

      Paraphrased from something I found in my code yesterday. Had been giving me trouble for weeks, as it was being error trapped by a higher function; triggering a weird sequence of events that caused an intermittent error not during the current program execution, but the *next* execution instead.

      Brainfarts occur in all languages, and no one language should claim ownership to the term.

    1. So that a piece of code that is used repeatedly can be mutated _once_ and have the change take effect every time it gets used. The computer might not care, but when dealing with non-infinite time, genetic algorithms like functions.

    2. The time needed to evolve a working program probably rises close to exponentially with the complexity of the program. By breaking down a program into subtasks and evolving each subtask at least somewhat separately, you dramatically speed up the process.

      I assume it still takes some human intervention to define the need for, and criteria of, each function; and that is the limiting factor which prevents evolution of Skynet or whatnot. Didn’t read the article, my brain’s fucked up enough without deliberately exposing it to things like Brainfuck.

    3. “Functions are a convenience for fallible humans to make a shorter, easier to understand program.”

      No they aren’t! Where did this idea come from? If this were true, why would you *ever* implement them in actual hardware? Why wouldn’t you just inline everything, and never have support for callable functions at all? You could do this, of course, but virtually no one does. It’s silly.

      Functions exist to make code smaller for heavily used ideas. If you find yourself multiplying by 2 a lot, you write a function which does that and call it, rather than duplicating that code everywhere. This happens in genetics, too: instead of coding for the same enzyme a bajillion times to increase its production, you have another sequence which creates something that turns it on or turns it off.

      In genetic programming, it’s also good because it makes the program easier and more robust to evolve: if a mutation breaks a heavily used function, the whole thing falls apart immediately, so those functions are ‘protected’. Those mutations die off right away – which means the ‘non-fatal’ mutations can focus on the part of the program that still needs work.

    1. >obvious reasons

      I think the vast majority of proper academics (people that would read such a paper) wouldn’t be as silly as to be put off by a word. Where did hackaday get all of these prudes that are afraid of words?

      1. Spoken like someone that knows fuck all about academia, or publishing in a journal, which is supposed to be as professional as it gets where science and engineering are concerned. I personally don’t care, but I doubt a decent journal would be interested in publishing it. And it’s a shame to see work that’s that impressive not get the recognition it deserves, or let anyone serious take it further.

  2. Well, I did the exact same thing! Another problem with Brainfuck is the loop instructions “[” and “]”. The problem is that inserting a “[” or “]” completely changes what the program does (because it changes which bracket matches which). This is a problem since it makes writing programs by trial and error very hard, and makes it nearly impossible for the fitness function to determine how fit the code actually is. Therefore I also started out creating a Brainfuck-inspired language, with loop elements like “decrease program pointer by 10”.

    1. Wow, I just discovered how short the generated programs are! Defining a good fitness function to achieve that is pretty much the hardest part, very nice work! (Mine were significantly longer for simpler operations.)

      1. I’m guessing that it might make sense to have the fitness function have a more complex length/valid output relation – right now, Kory’s is really simple: add a bonus to a program that got the output in fewer ticks. Fewer ticks = higher fitness, at all times (apparently). It might make more sense to use *two* fitness functions: one for ‘program is correct’ and one for ‘program is short’ and combine them differently over time – as in, when ‘program is correct’ maximizes, then maximize ‘program is short’ with the condition that the first fitness doesn’t decrease at all.
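That two-stage idea can be sketched in a few lines of Python. Everything here – the function names, the 600-character cap, the per-character ASCII penalty – is illustrative, not taken from Kory’s code:

```python
# Sketch of a two-stage fitness: score correctness first, and only
# reward shortness once the output is fully correct.

def correctness(output, target):
    """Similarity score: penalize per-character ASCII distance.

    Missing characters count as byte 0; a score of 0.0 is a perfect match.
    """
    score = 0.0
    for i, ch in enumerate(target):
        got = ord(output[i]) if i < len(output) else 0
        score -= abs(ord(ch) - got)
    return score

def fitness(program, output, target, max_len=600):
    base = correctness(output, target)
    if base < 0:
        return base                   # still wrong: ignore length entirely
    return max_len - len(program)     # correct: shorter programs score higher

# A correct program always beats an incorrect one, and a correct short
# program beats a correct long one.
print(fitness('+' * 100, 'Hi', 'Hi'))   # 500
print(fitness('+' * 10, 'Hx', 'Hi'))    # -15: wrong output, length ignored
```

Because every correct program scores above every incorrect one, the search can never trade correctness away for brevity, which is the condition the comment asks for.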

  3. Wow, this is actually pretty impressive. I know people have already ported Brainfuck to FPGAs, I’m guessing Kory would increase the performance of the fitness function significantly by doing the same (provided the embedded memblocks are large enough for the stacks).

    1. As the generated programs are small enough, they could fit into the small memory available for GPGPU computing. Generate x new programs, where x is at least the number of CUDA cores for instance, and evaluate them all at the same time.

  4. I wouldn’t call this artificial intelligence; it’s more like calling a set of monkeys with typewriters, then waiting until a Bible is typed, artificial intelligence. Kory didn’t invent BF, he invented and named BrainPlus, which is BF with an operator character added to define functions. See Wikipedia: BF is somewhat like a Turing machine, though it has 8 single-character operators. A compiler has been implemented in less than 100 bytes. The name BF comes from its unreadability.

    The so-called AI is generating random strings of the 8 operators, running them, then computing a cost function based on the sum of ASCII character differences. I would be amazed if this actually works for something like the 99bottlesofbeer…, done only by an optimization process on random strings. Has anyone replicated the results? Seems like an awful lot of monkeys would be required.

    1. That is genetic algorithms: it takes a base function, creates a random string (something that seems typed by a monkey), checks whether it is similar, and gives it a score, then does this a number of times. After it is done with the first series, it takes some of the best scorers (and maybe some bad ones), adds some new ones, checks again whether they match, gives scores, and starts over… Of course genetic algorithms are stupid; they don’t know what they are doing before checking. But with huge populations and 10,000 iterations or so, you can get the result. Of course this is not an AI; they are really different things.
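The loop described above – score, keep the best, mutate, repeat – looks roughly like this in Python. This toy evolves a fixed target string over the Brainfuck alphabet instead of evolving runnable programs, so its fitness landscape is far friendlier than the real thing; all names and parameters are illustrative:

```python
import random

ALPHABET = '><+-.,[]'
random.seed(1)  # deterministic for demonstration

def score(candidate, target):
    """Count matching characters; higher is fitter."""
    return sum(a == b for a, b in zip(candidate, target))

def mutate(parent, rate=0.05):
    """Replace each character with a random one with probability `rate`."""
    return ''.join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in parent)

def evolve(target, pop_size=200, generations=1000):
    # Start from pure monkey-typing: random strings of the right length.
    pop = [''.join(random.choice(ALPHABET) for _ in target)
           for _ in range(pop_size)]
    for gen in range(generations):
        pop.sort(key=lambda c: score(c, target), reverse=True)
        if pop[0] == target:
            return gen, pop[0]
        # Keep the fittest quarter; refill the rest with their mutants.
        elite = pop[:pop_size // 4]
        pop = elite + [mutate(random.choice(elite))
                       for _ in range(pop_size - len(elite))]
    return generations, pop[0]

gen, best = evolve('[->+<]')
print(gen, best)
```

The real problem is much harder because fitness must be measured on a program’s *output*, not its text – which is exactly why Kory’s runs take hours rather than seconds.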

      No one said he invented Brainfuck; he invented a new dialect with easier functions (breaks?), so that the GA could generate more complex programs.

  5. This is really cool, and I think it highlights a flaw in genetic algorithms’ capabilities: very quickly you hit an infeasible execution-time limit when generating sequential code.

    Genetic algorithms can be thought of as exploring a physical space. The size of the space is defined by the total number of combinations of the subject matter. A 4-bit SUBLEQ program (using all 16 possible addresses) gives a solution space of 16^16 (16 values per addressable memory space ^ 16 addressable memory spaces) – over 18 quintillion possible programs – the vast majority of which are likely to be indistinguishably bad.
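A quick sanity check on that figure:

```python
# 16 values per address, raised to 16 addresses: the quoted solution space.
space = 16 ** 16
print(space)  # 18446744073709551616 – just over 18 quintillion (it is 2**64)
```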

    A good genetic algorithm takes a random solution and finds the smallest mutation that will improve the score by the most. There is an assumption in genetic algorithms that there exists a path from bad solutions, through the OK solutions, to good solutions. So with the “exploring a space” analogy, the assumption is that if I’m in a good place, then somewhere around me, somewhere close, is an even better place. If I look at adjacent spaces and I don’t find anything better, I widen my search and repeat as necessary.

    When this assumption doesn’t hold true, or only holds weakly, the inefficiencies creep in. Consider a minimal JS script that computes the power of two numbers with a loop of multiplications. Now consider all the ways you could break that program with a single character change – i.e. pretty much any character change to any other value. Then consider the number of (being generous) 5-character changes you could make to the code that keep it working. Finally, imagine that your program is just 1 character away from that working solution, and then imagine you’re 100 character changes away. The result in both cases is probably a program that simply doesn’t work at all, and likely doesn’t even execute without error. With that in mind, how do you know, when you make a random change, whether you’re getting closer to the ideal solution? Normally the answer is “with great difficulty”.
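That fragility is easy to measure for Brainfuck, where balanced brackets are a necessary (though far from sufficient) condition for a program to run at all. A small self-contained check, using a hand-written “print A” program as the example:

```python
# Count how many single-character edits of a working Brainfuck program
# even keep the brackets balanced. Balance is only a necessary condition:
# most "balanced" neighbors still produce the wrong output.

ALPHABET = '><+-.,[]'

def balanced(code):
    """True if every ']' has a matching earlier '['."""
    depth = 0
    for c in code:
        if c == '[':
            depth += 1
        elif c == ']':
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

program = '++++++++[>++++++++<-]>+.'   # prints 'A'

# Every program reachable by changing exactly one character.
neighbors = [program[:i] + c + program[i + 1:]
             for i in range(len(program))
             for c in ALPHABET if c != program[i]]

ok = sum(balanced(n) for n in neighbors)
print(f'{ok}/{len(neighbors)} one-character neighbors keep brackets balanced')
```

So roughly a third of the nearest neighbors fail before they can even run, and the survivors still have to produce the right bytes – a concrete picture of the rugged landscape the comment describes.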

    Specifically for sequential programs, ways of trying to mitigate these effects that I know of include:
    – Make it impossible to get a “does not compile”. This reduces the number of “bad” solutions dramatically.
    – Make behaviourally equivalent solutions impossible (although this could actually hurt your performance).
    – Make the language specific to your solution domain.
    – Make the programs huge, but limit execution time (this gives the mutator a large stock of “sorta working” code to mangle for its next solution).
    – Select solutions that have promising code (e.g. more complex, more return statements, etc.).

    The only reasonably effective solution I have found for larger generated programs is having your genetic algorithm generate systems formed of aggregates of processing units, like neural networks. In that case there is a more linear progression from bad to good, in part due to the fuzzier nature of the system and the reduced sequential dependencies.
