Evolutionary Algorithms Computes The Best Blackjack Strategy


Don’t want to learn about evolutionary algorithms the usual way, by generating sentences from random letters, or randomly placing pixels to generate the Mona Lisa? Then make your own evolutionary algorithm! With blackjack!

[Brian] has been playing around with evolutionary algorithms, and wanted a task that’s well suited for optimization. He chose blackjack, because of the limited number of hands that can be dealt to the player (32) and low number of hands the dealer can have (10).

Even with the low number of initial conditions for the player and the dealer, there are still 4.562 x 10^192 possible combinations of hands, so brute forcing a blackjack strategy would require the computational power of the entire planet. An easier way to compute a good strategy is an evolutionary algorithm, implemented by [Brian] with the Watchmaker Java library.

For each generation in [Brian]’s program, a 32×10 grid was generated, one cell each for possible player’s hands against the dealer’s hand. In each cell, the computer put a ‘hit’, ‘stay’, or ‘double down’, and played thousands of hands with that strategy. The best strategies were bred and eventually [Brian] ended up with a good blackjack strategy.

The resulting best strategy is pretty good – using his strategy, he can walk out of an Atlantic City casino with 96% of the money he arrived with.

28 thoughts on “Evolutionary Algorithms Computes The Best Blackjack Strategy

  1. “The resulting best strategy is pretty good – using his strategy, he can walk out of an Atlantic City casino with 96% of the money he arrived with.”

    I like that — the best you can do is not lose TOO MUCH of your money, haha. A true take on gambling.

  2. Nice work. It would be interesting to see a repeat with the option for the player to split his hand. Splitting is a very valuable strategy and should raise the player’s ability to retain his stake to over 99%. The house edge on commonly dealt blackjack games is less than 1% when a player uses the accepted basic strategy. Could the algorithm replicate the very successful basic strategy, or improve upon it? Hmm…

  3. This…. is kind of a silly thing to use evolutionary algorithms for. There’s no reason to “breed” random grids together — the optimal policy for each cell is dependent only on the optimal policy for cells with higher sums. For each of the 320 cells, counting down, try a few thousand hands with hit, a few thousand hands with stay, and a few thousand with double downs. Then pick whichever one gave you the best reward.

    Additionally, the chained nature of the strategies means that evolutionary algorithms are not going to work great. A generation which hits on twenty, for instance, is really going to screw up the policy for ten: Given the potential for drawing a face card, and the poor results from that, the program will (wrongly) decide that standing on ten is the best policy. This is the sort of thing that policy iteration shakes out in an iteration or three, but with evolutionary algorithms each cell is at the mercy of its grid-mates, so these artifacts can stick around for awhile.

    Anyhoo, I don’t want to be totally down on Brian. That’s a surprisingly good result for learning a policy from evolutionary algorithms. As a next step, why not try building a grid with policy iteration, and see how it compares?

    1. The code that I wrote decouples the blackjack engine from the genetic algorithm, so perhaps I will try this and compare!

      The reason that your solution would work is that each cell’s strategy does not effect the strategy of other cells- so even though there are 4.562 x 10^192 different grids possible, the [Dealer Ace, Player Hard 12] cell does NOT effect the [Dealer 8, Player Soft 15] cell. Going one step further would mean not only changing one cell at a time, but also only evaluating blackjack hands effected by that cell (i.e. if you are evaluating Soft 17’s strategy, rig the simulator to only deal the player Soft 17). This would allow you to play less hands per round without effecting the accuracy of the results.

      This was just a continuation of some other experimenting I’ve done with Genetic Algorithms- I knew going in that it would not be the BEST solution, only a plausible one. A much better use of GAs that I have found was for solving PID loop coefficients, which was eons more efficient than even brute force. Source: http://www.microcontrollercentral.com/author.asp?section_id=2379&doc_id=254676&

      Thanks for the feedback!

  4. Not to rag on the evolutionary algorithm, which is kind of neat, but his estimate of the difficulty of doing a regular combinatorial or simpler statistical analysis is off by several orders of magnitude. People have been doing this kind of math since the 1970’s and re-doing it every time there is a new rules variation and doing it in combination with different end deck card frequencies to develop card counting strategies.

  5. Blackjack already as a known “basic strategy” which limits the house edge to 0.5-1%. That seems a lot better than Brian’s strategy, which seems to have a house edge of 4%.

    1. You’re right, and when I went to AC I wound up using my old tried-and-true strategy. When I ran this simulation, I didn’t allow the strategy to contain splits (mainly because of the huge amount of additional work it would take to implement them) and so there was no way it COULD have hit the effectiveness of the Basic Strategy. This was simply an experiment to see if genetic algorithms could evolve a viable solution in a short time, and they did.

      1. i’m not certain that a 400-800% greater than expected loss is necessarily “viable,” although mathematically if including the split would account for the difference then sure.

        facinating look at the issue none the less.

    1. Tough crowd. The source code is included, please feel free to contribute the splitting feature if you feel that this program is ruined without it.

      I posted this not because I wanted to pose a new and improved blackjack strategy to the world, but because I thought that others may benefit from learning how genetic algorithms work and what they are good at. Spending the extra hours coding spitting in would have probably increased the win percentage by a bit, but it would not change the concepts put forth in this article.

      I guess we just see different assignments :)

      1. I wouldn’t listen to negative comments too much – I thought this was a great post, and certainly don’t mind that there’s something left to improve, actually it’s better with DIY projects to motivate someone to pick up a similar one. Heads up for nice work, I even learned much new about blackjack strategies from the comments. :)

  6. “4.562 x 10^192 possible combinations of hands” – Dont be dumb. There aren’t that many hands, there are that many strategies to react to the 320 hands that are possible.

  7. Brian, I hate to point out this thing we call the Gambler’s fallacy..but all results for your numbers are with the expectation of linear results in a live casino game. The actual results are concurrent, among the millions of hands being played at this instant. Read up on the basic bckjack strategy, practice it for a couple hours, and you’ll probably do better than trying to adapt a great program to random numbers. Today I saw a random number generator show 1 ten times in a row, with a field of 6 to choose from.

  8. oh wow I cant believe I encountered this. @Brian I have the same thought as you ahaha. I always thought that maybe there are alterate strategies. I am doing a dissertation on Blackjack as well, but after hours of research only to realise that basic has the best odds so far ahaha….AND I haven’t implement split on my application that I suppose to develop zzzz.

  9. You guys! Sorry, not a mathematician or but have a question posed differently and thought an algorithm might answer this question? Where could I find the answer, if anyone wants to help. I developed a system by virtue of dumb playing for years. Units don’t matter. Outcome matters. Not sure how many bets you start with 10-15 Units. If you are satisfied with being three units ahead. How many times can you get out of casino with+3 versus loosing it all. If you put stops in after +3 and keep playing til you double or ? keep putting back stops in, and your Unit is large enough, can you make money, in small increments this way. I have been winning more than loosing when I have the discipline to quit. I am not rich but using a $1000, making $300 a day and leaving the casino, sometimes, two minutes after getting there….Can anyone put this question to the test?

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.