Evolutionary algorithms computes the best blackjack strategy

blackjack_banner

Don’t want to learn about evolutionary algorithms the usual way, by generating sentences from random letters, or randomly placing pixels to generate the Mona Lisa? Then make your own evolutionary algorithm! With blackjack!

[Brian] has been playing around with evolutionary algorithms, and wanted a task that’s well suited for optimization. He chose blackjack, because of the limited number of hands that can be dealt to the player (32) and low number of hands the dealer can have (10).

Even with the low number of initial conditions for the player and the dealer, there are still 4.562 x 10^192 possible combinations of hands, so brute forcing a blackjack strategy would require the computational power of the entire planet. An easier way to compute a good strategy is an evolutionary algorithm, implemented by [Brian] with the Watchmaker Java library.

For each generation in [Brian]‘s program, a 32×10 grid was generated, one cell each for possible player’s hands against the dealer’s hand. In each cell, the computer put a ‘hit’, ‘stay’, or ‘double down’, and played thousands of hands with that strategy. The best strategies were bred and eventually [Brian] ended up with a good blackjack strategy.

The resulting best strategy is pretty good – using his strategy, he can walk out of an Atlantic City casino with 96% of the money he arrived with.

Comments

  1. cirrus says:

    *takes notes for 6.S912*

  2. Jon says:

    “The resulting best strategy is pretty good – using his strategy, he can walk out of an Atlantic City casino with 96% of the money he arrived with.”

    I like that — the best you can do is not lose TOO MUCH of your money, haha. A true take on gambling.

  3. Tim says:

    Nice work. It would be interesting to see a repeat with the option for the player to split his hand. Splitting is a very valuable strategy and should raise the player’s ability to retain his stake to over 99%. The house edge on commonly dealt blackjack games is less than 1% when a player uses the accepted basic strategy. Could the algorithm replicate the very successful basic strategy, or improve upon it? Hmm…

  4. Ben says:

    This…. is kind of a silly thing to use evolutionary algorithms for. There’s no reason to “breed” random grids together — the optimal policy for each cell is dependent only on the optimal policy for cells with higher sums. For each of the 320 cells, counting down, try a few thousand hands with hit, a few thousand hands with stay, and a few thousand with double downs. Then pick whichever one gave you the best reward.

    Additionally, the chained nature of the strategies means that evolutionary algorithms are not going to work great. A generation which hits on twenty, for instance, is really going to screw up the policy for ten: Given the potential for drawing a face card, and the poor results from that, the program will (wrongly) decide that standing on ten is the best policy. This is the sort of thing that policy iteration shakes out in an iteration or three, but with evolutionary algorithms each cell is at the mercy of its grid-mates, so these artifacts can stick around for awhile.

    Anyhoo, I don’t want to be totally down on Brian. That’s a surprisingly good result for learning a policy from evolutionary algorithms. As a next step, why not try building a grid with policy iteration, and see how it compares?

    • The code that I wrote decouples the blackjack engine from the genetic algorithm, so perhaps I will try this and compare!

      The reason that your solution would work is that each cell’s strategy does not effect the strategy of other cells- so even though there are 4.562 x 10^192 different grids possible, the [Dealer Ace, Player Hard 12] cell does NOT effect the [Dealer 8, Player Soft 15] cell. Going one step further would mean not only changing one cell at a time, but also only evaluating blackjack hands effected by that cell (i.e. if you are evaluating Soft 17’s strategy, rig the simulator to only deal the player Soft 17). This would allow you to play less hands per round without effecting the accuracy of the results.

      This was just a continuation of some other experimenting I’ve done with Genetic Algorithms- I knew going in that it would not be the BEST solution, only a plausible one. A much better use of GAs that I have found was for solving PID loop coefficients, which was eons more efficient than even brute force. Source: http://www.microcontrollercentral.com/author.asp?section_id=2379&doc_id=254676&

      Thanks for the feedback!

  5. localroger says:

    Not to rag on the evolutionary algorithm, which is kind of neat, but his estimate of the difficulty of doing a regular combinatorial or simpler statistical analysis is off by several orders of magnitude. People have been doing this kind of math since the 1970’s and re-doing it every time there is a new rules variation and doing it in combination with different end deck card frequencies to develop card counting strategies.

  6. Blaise Pascal says:

    Blackjack already as a known “basic strategy” which limits the house edge to 0.5-1%. That seems a lot better than Brian’s strategy, which seems to have a house edge of 4%.

    • You’re right, and when I went to AC I wound up using my old tried-and-true strategy. When I ran this simulation, I didn’t allow the strategy to contain splits (mainly because of the huge amount of additional work it would take to implement them) and so there was no way it COULD have hit the effectiveness of the Basic Strategy. This was simply an experiment to see if genetic algorithms could evolve a viable solution in a short time, and they did.

      • metis says:

        i’m not certain that a 400-800% greater than expected loss is necessarily “viable,” although mathematically if including the split would account for the difference then sure.

        facinating look at the issue none the less.

  7. t-bone says:

    I’d grade this “Incomplete”. Put the splitting in, finish the assignment.

    • Tough crowd. The source code is included, please feel free to contribute the splitting feature if you feel that this program is ruined without it.

      I posted this not because I wanted to pose a new and improved blackjack strategy to the world, but because I thought that others may benefit from learning how genetic algorithms work and what they are good at. Spending the extra hours coding spitting in would have probably increased the win percentage by a bit, but it would not change the concepts put forth in this article.

      I guess we just see different assignments :)

      • Joonas says:

        I wouldn’t listen to negative comments too much – I thought this was a great post, and certainly don’t mind that there’s something left to improve, actually it’s better with DIY projects to motivate someone to pick up a similar one. Heads up for nice work, I even learned much new about blackjack strategies from the comments. :)

      • Don’t worry about the commentors here.

        I’ve seen *maybe* two dozen commentors send in a project.

        Surprisingly very few armchair builds, by the way.

  8. Alex says:

    Wouldn’t the amount of money he loses depend on how long he’s there? As in, he loses 4% of his money over an average 4 hour casino session?
    Suppose I could RTFA to find out

  9. d says:

    Neet application!

  10. zuul says:

    make your own evolutionary algorithm, with black jack, and hookers

    in fact, forget the black jack..

  11. mr_guy99493 says:

    “4.562 x 10^192 possible combinations of hands” – Dont be dumb. There aren’t that many hands, there are that many strategies to react to the 320 hands that are possible.

  12. Al says:

    there seems to be a connection b/w black jack, genetic algorithms and hookers I seem to be misunderstanding. I guess when you write split algorithm, I’ll understand?

  13. bossman2013 says:

    Brian, I hate to point out this thing we call the Gambler’s fallacy..but all results for your numbers are with the expectation of linear results in a live casino game. The actual results are concurrent, among the millions of hands being played at this instant. Read up on the basic bckjack strategy, practice it for a couple hours, and you’ll probably do better than trying to adapt a great program to random numbers. Today I saw a random number generator show 1 ten times in a row, with a field of 6 to choose from.

  14. Well here is a new evolutionary algorithm I have introduced:

    http://www.freebasic.net/forum/viewtopic.php?f=8&t=22090

    I posted here before about some great regenerative radio circuits I had devised too:

    http://theradioboard.com/rb/viewtopic.php?f=4&t=5257

    but ’tis few who are interested.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 94,439 other followers