Evolutionary Algorithms Compute The Best Blackjack Strategy

January 15, 2013

Don’t want to learn about evolutionary algorithms the usual way, by generating sentences from random letters, or randomly placing pixels to generate the Mona Lisa? Then make your own evolutionary algorithm! With blackjack!

[Brian] has been playing around with evolutionary algorithms, and wanted a task that’s well suited for optimization. He chose blackjack because of the limited number of hands that can be dealt to the player (32) and the low number of hands the dealer can have (10).

Even with the low number of initial conditions for the player and the dealer, there are still 4.562 x 10^192 possible combinations of hands, so brute forcing a blackjack strategy would require the computational power of the entire planet. An easier way to compute a good strategy is an evolutionary algorithm, implemented by [Brian] with the Watchmaker Java library.
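As a quick check on that figure: a 32×10 grid has 320 cells, and 4.562 x 10^192 is what 4^320 works out to, which is the number of ways to fill the grid when each cell has four choices. The post names only three actions, so the fourth choice is an assumption here, not something the write-up confirms. A small Python sketch (the function names are illustrative):

```python
def grid_count(actions_per_cell, player_hands=32, dealer_hands=10):
    """Number of distinct strategy grids: one independent choice per cell."""
    cells = player_hands * dealer_hands  # 320 cells
    return actions_per_cell ** cells


def sci(n, digits=3):
    """Format a large integer in scientific notation, e.g. '4.562e+192'."""
    return f"{n:.{digits}e}"


if __name__ == "__main__":
    # Compare three actions per cell (as listed in the post) with four.
    for k in (3, 4):
        print(k, "actions per cell:", sci(grid_count(k)))
```

Either way the count is of possible strategy grids, which is why trying every grid is out of the question.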

For each generation in [Brian]’s program, a set of 32×10 grids was generated, with one cell for each possible player hand against each possible dealer hand. In each cell, the computer put a ‘hit’, ‘stay’, or ‘double down’, and played thousands of hands with that strategy. The best strategies were bred together, and eventually [Brian] ended up with a good blackjack strategy.
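The breed-and-select loop described above can be sketched in a few lines of Python. This is not [Brian]’s code and does not use the Watchmaker API; every name below is illustrative. The fitness function is a deliberate stand-in that counts cells matching a hidden target grid, where a real version would play thousands of simulated hands and return the winnings.

```python
import random

ACTIONS = ("hit", "stay", "double")  # the three moves named in the article
ROWS, COLS = 32, 10                  # player hands x dealer hands


def random_grid(rng):
    return [[rng.choice(ACTIONS) for _ in range(COLS)] for _ in range(ROWS)]


def crossover(a, b, rng):
    # Uniform crossover: each cell comes from one of the two parents.
    return [[rng.choice((a[r][c], b[r][c])) for c in range(COLS)]
            for r in range(ROWS)]


def mutate(grid, rng, rate=0.01):
    # Occasionally overwrite a cell with a randomly chosen action.
    return [[rng.choice(ACTIONS) if rng.random() < rate else cell
             for cell in row] for row in grid]


def evolve(fitness, generations=50, pop_size=40, elite=4, seed=0):
    rng = random.Random(seed)
    population = [random_grid(rng) for _ in range(pop_size)]
    history = []  # best score in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # breed from the top half
        children = []
        while len(children) < pop_size - elite:
            mum, dad = rng.sample(parents, 2)
            children.append(mutate(crossover(mum, dad, rng), rng))
        population = population[:elite] + children  # elites survive unchanged
    return max(population, key=fitness), history


def stand_in_fitness(target):
    # Placeholder for "play thousands of hands": counts cells that match a
    # hidden target grid. A real fitness would return simulated winnings.
    def score(grid):
        return sum(cell == want
                   for row, trow in zip(grid, target)
                   for cell, want in zip(row, trow))
    return score
```

Calling `evolve(stand_in_fitness(some_grid))` returns the best grid found and the best score per generation. Because the top few grids are copied unchanged into the next generation, that score never goes down with a deterministic fitness; with a noisy, simulation-based fitness it can wobble.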

The resulting best strategy is pretty good – using his strategy, he can walk out of an Atlantic City casino with 96% of the money he arrived with.
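How much that costs in practice depends on how much is wagered, not just on the starting bankroll. Assuming the 96% figure means a return of about 96 cents per dollar bet (a 4% house edge, which is one reading of the result rather than something the post states), the expected loss is the edge times the total amount wagered. A rough Python sketch with an illustrative function name and made-up bet sizes; the 0.5% case is included for comparison, as a figure often quoted for basic strategy:

```python
def expected_loss(bet, hands, house_edge):
    """Expected loss = house edge x total amount wagered (flat betting)."""
    return house_edge * bet * hands


if __name__ == "__main__":
    # Hypothetical session: 100 hands at $10 each, so $1000 wagered in total.
    for label, edge in (("4% edge", 0.04), ("0.5% edge", 0.005)):
        print(label, "->", expected_loss(10, 100, edge))
```

The longer the session, the more is wagered, so the expected loss keeps growing even though the bankroll stays the same.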

28 thoughts on “Evolutionary Algorithms Compute The Best Blackjack Strategy”

cirrus says:

January 15, 2013 at 8:10 am

*takes notes for 6.S912*

1. POSEIDON says:
  
  June 1, 2013 at 6:56 pm
  
  lolllllllllllll
  
Jon says:

January 15, 2013 at 8:19 am

“The resulting best strategy is pretty good – using his strategy, he can walk out of an Atlantic City casino with 96% of the money he arrived with.”

I like that — the best you can do is not lose TOO MUCH of your money, haha. A true take on gambling.

1. dave says:
  
  January 15, 2013 at 8:41 am
  
  If he can walk out of the casino with 96% of his cash and a hooker, I’d call that a win.
  
2. Hirudinea says:
  
  January 15, 2013 at 10:44 am
  
  Hey, any time you can walk out of a casino with ANY money, it’s a win!
  
Tim says:

January 15, 2013 at 8:42 am

Nice work. It would be interesting to see a repeat with the option for the player to split his hand. Splitting is a very valuable strategy and should raise the player’s ability to retain his stake to over 99%. The house edge on commonly dealt blackjack games is less than 1% when a player uses the accepted basic strategy. Could the algorithm replicate the very successful basic strategy, or improve upon it? Hmm…

Ben says:

January 15, 2013 at 8:44 am

This… is kind of a silly thing to use evolutionary algorithms for. There’s no reason to “breed” random grids together: the optimal policy for each cell depends only on the optimal policy for cells with higher sums. For each of the 320 cells, counting down, try a few thousand hands with hit, a few thousand hands with stay, and a few thousand with double down. Then pick whichever one gave you the best reward.

Additionally, the chained nature of the strategies means that evolutionary algorithms are not going to work great. A generation which hits on twenty, for instance, is really going to screw up the policy for ten: given the potential for drawing a face card, and the poor results from that, the program will (wrongly) decide that standing on ten is the best policy. This is the sort of thing that policy iteration shakes out in an iteration or three, but with evolutionary algorithms each cell is at the mercy of its grid-mates, so these artifacts can stick around for a while.

Anyhoo, I don’t want to be totally down on Brian. That’s a surprisingly good result for learning a policy from evolutionary algorithms. As a next step, why not try building a grid with policy iteration, and see how it compares?
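A rough Python sketch of that counting-down idea, under heavy simplifying assumptions: an infinite deck, a dealer who stands on all 17s, only hit versus stay (no double down or split), and only hard player totals from 20 down to 11, where an ace drawn by the player can only count as 1. All names are illustrative and none of this comes from Brian’s code.

```python
import random

CARDS = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]  # 11 = ace, infinite deck


def dealer_final(upcard, rng):
    # Dealer draws to 17 or more and stands on all 17s (an assumed rule).
    total, soft_aces = upcard, int(upcard == 11)
    while total < 17:
        card = rng.choice(CARDS)
        total += card
        soft_aces += card == 11
        while total > 21 and soft_aces:
            total -= 10          # count an ace as 1 instead of 11
            soft_aces -= 1
    return total


def hit(total, rng):
    # From a hard total of 11 or more, a drawn ace can only count as 1.
    card = rng.choice(CARDS)
    return total + (1 if card == 11 else card)


def play_out(total, upcard, policy, rng):
    # Keep following the policy already chosen for higher totals.
    while total < 21 and policy.get(total) == "hit":
        total = hit(total, rng)
    if total > 21:
        return -1                # player busts
    dealer = dealer_final(upcard, rng)
    if dealer > 21 or total > dealer:
        return 1
    return 0 if total == dealer else -1


def estimate(total, action, upcard, policy, rng, hands):
    # Average result per hand when taking `action` first at this total.
    result = 0
    for _ in range(hands):
        start = hit(total, rng) if action == "hit" else total
        result += play_out(start, upcard, policy, rng)
    return result / hands


def solve_column(upcard, hands=2000, seed=0):
    rng = random.Random(seed)
    policy = {}
    for total in range(20, 10, -1):      # count down from hard 20 to hard 11
        ev = {a: estimate(total, a, upcard, policy, rng, hands)
              for a in ("stay", "hit")}
        policy[total] = max(ev, key=ev.get)
    return policy
```

`solve_column(10)` fills in one dealer column. Each total is decided once, using only the choices already fixed for higher totals, so a bad choice elsewhere in the grid cannot drag it down. Close calls in the middle of the column can still flip from run to run at a few thousand hands per cell.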

1. Brian Carrigan says:
  
  January 15, 2013 at 9:33 am
  
  The code that I wrote decouples the blackjack engine from the genetic algorithm, so perhaps I will try this and compare!
  
  The reason that your solution would work is that each cell’s strategy does not affect the strategy of other cells, so even though there are 4.562 x 10^192 different grids possible, the [Dealer Ace, Player Hard 12] cell does NOT affect the [Dealer 8, Player Soft 15] cell. Going one step further would mean not only changing one cell at a time, but also only evaluating blackjack hands affected by that cell (i.e. if you are evaluating Soft 17’s strategy, rig the simulator to only deal the player Soft 17). This would allow you to play fewer hands per round without affecting the accuracy of the results.
  
  This was just a continuation of some other experimenting I’ve done with genetic algorithms. I knew going in that it would not be the BEST solution, only a plausible one. A much better use of GAs that I have found was for solving PID loop coefficients, which was eons more efficient than even brute force. Source: http://www.microcontrollercentral.com/author.asp?section_id=2379&doc_id=254676&
  
  Thanks for the feedback!
  
localroger says:

January 15, 2013 at 9:01 am

Not to rag on the evolutionary algorithm, which is kind of neat, but his estimate of the difficulty of doing a regular combinatorial or simpler statistical analysis is off by several orders of magnitude. People have been doing this kind of math since the 1970s, redoing it every time there is a new rules variation, and combining it with different end-of-deck card frequencies to develop card counting strategies.

Blaise Pascal says:

January 15, 2013 at 9:03 am

Blackjack already has a known “basic strategy” which limits the house edge to 0.5-1%. That seems a lot better than Brian’s strategy, which seems to have a house edge of 4%.

1. Brian Carrigan says:
  
  January 15, 2013 at 9:20 am
  
  You’re right, and when I went to AC I wound up using my old tried-and-true strategy. When I ran this simulation, I didn’t allow the strategy to contain splits (mainly because of the huge amount of additional work it would take to implement them) and so there was no way it COULD have hit the effectiveness of the Basic Strategy. This was simply an experiment to see if genetic algorithms could evolve a viable solution in a short time, and they did.
  
  1. metis says:
    
    January 15, 2013 at 10:00 am
    
    I’m not certain that a 400-800% greater than expected loss is necessarily “viable,” although mathematically, if including the split would account for the difference, then sure.
    
    Fascinating look at the issue nonetheless.
    
t-bone says:

January 15, 2013 at 9:29 am

I’d grade this “Incomplete”. Put the splitting in, finish the assignment.

1. Brian Carrigan says:
  
  January 15, 2013 at 9:41 am
  
  Tough crowd. The source code is included, please feel free to contribute the splitting feature if you feel that this program is ruined without it.
  
  I posted this not because I wanted to pose a new and improved blackjack strategy to the world, but because I thought that others may benefit from learning how genetic algorithms work and what they are good at. Spending the extra hours coding splitting in would probably have increased the win percentage by a bit, but it would not change the concepts put forth in this article.
  
  I guess we just see different assignments :)
  
  1. Joonas says:
    
    January 16, 2013 at 6:34 am
    
    I wouldn’t listen to negative comments too much. I thought this was a great post, and I certainly don’t mind that there’s something left to improve; with DIY projects that is actually better, since it motivates someone to pick up a similar one. Thumbs up for the nice work. I even learned a lot about blackjack strategies from the comments. :)
    
  2. Brian Benchoff says:
    
    January 20, 2013 at 5:19 am
    
    Don’t worry about the commenters here.
    
    I’ve seen *maybe* two dozen commenters send in a project.
    
    Surprisingly very few armchair builds, by the way.
    
Alex says:

January 15, 2013 at 9:44 am

Wouldn’t the amount of money he loses depend on how long he’s there? As in, he loses 4% of his money over an average 4 hour casino session?
I suppose I could RTFA to find out.

1. Devin says:
  
  January 15, 2013 at 10:52 am
  
  It’s not losing 4% of the money he walks in with, but 4% of the cumulative total of all his bets.
  
d says:

January 15, 2013 at 11:46 am

Neat application!

zuul says:

January 15, 2013 at 4:04 pm

make your own evolutionary algorithm, with black jack, and hookers

in fact, forget the black jack..

1. Velli says:
  
  January 16, 2013 at 9:05 am
  
  ..and breed SuperChlamydia?
  
mr_guy99493 says:

January 15, 2013 at 6:25 pm

“4.562 x 10^192 possible combinations of hands” – Don’t be dumb. There aren’t that many hands; there are that many strategies to react to the 320 hands that are possible.

Al says:

January 15, 2013 at 11:30 pm

There seems to be a connection between blackjack, genetic algorithms and hookers that I seem to be misunderstanding. I guess when you write the split algorithm, I’ll understand?

1. jwrm22 says:
  
  January 16, 2013 at 3:42 am
  
  It’s a quote from Bender in Futurama… http://youtu.be/z5tZMDBXTRQ
  
bossman2013 says:

January 19, 2013 at 10:22 pm

Brian, I hate to point out this thing we call the Gambler’s fallacy, but all results for your numbers are with the expectation of linear results in a live casino game. The actual results are concurrent, among the millions of hands being played at this instant. Read up on the basic blackjack strategy, practice it for a couple of hours, and you’ll probably do better than trying to adapt a great program to random numbers. Today I saw a random number generator show 1 ten times in a row, with a field of 6 to choose from.

Sean O'Connor (@Sean_VN) says:

December 25, 2013 at 4:10 pm

Well here is a new evolutionary algorithm I have introduced:
http://www.freebasic.net/forum/viewtopic.php?f=8&t=22090
I posted here before about some great regenerative radio circuits I had devised too:
http://theradioboard.com/rb/viewtopic.php?f=4&t=5257
but ’tis few who are interested.

Michael Djingga says:

May 7, 2015 at 2:01 pm

Oh wow, I can’t believe I encountered this. @Brian, I had the same thought as you, ahaha. I always thought that maybe there are alternate strategies. I am doing a dissertation on blackjack as well, but after hours of research I only came to realise that basic strategy has the best odds so far, ahaha… AND I haven’t implemented split in the application that I am supposed to develop, zzzz.

Windynite says:

September 8, 2018 at 10:19 pm

You guys! Sorry, I’m not a mathematician, but I have a question posed differently and thought an algorithm might answer it. Where could I find the answer, if anyone wants to help? I developed a system by virtue of dumb playing for years. Units don’t matter. Outcome matters. Say you start with 10-15 units (I’m not sure how many) and you are satisfied with being three units ahead. How many times can you get out of the casino with +3 versus losing it all? If you put a stop in after +3 and keep playing until you double (or ?), moving the stop up as you go, and your unit is large enough, can you make money in small increments this way? I have been winning more than losing when I have the discipline to quit. I am not rich, but using $1000, I am making $300 a day and leaving the casino, sometimes two minutes after getting there… Can anyone put this question to the test?
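This question has a classical closed form, usually called gambler’s ruin, if each hand is simplified to an even-money one-unit bet with a fixed win probability and no pushes, doubles, or blackjack bonuses. Those are all simplifying assumptions; real blackjack is messier. A Python sketch with illustrative function names:

```python
def prob_reach_target(start, target, p_win):
    """Chance of reaching `target` units before 0, betting 1 unit per hand.

    Classic gambler's-ruin result for even-money bets with no pushes.
    """
    if not 0 < start < target:
        raise ValueError("need 0 < start < target")
    if p_win == 0.5:
        return start / target
    r = (1 - p_win) / p_win
    return (1 - r ** start) / (1 - r ** target)


def expected_change(start, target, p_win):
    """Average units won or lost per visit under the stop-at-target rule."""
    p = prob_reach_target(start, target, p_win)
    return p * (target - start) - (1 - p) * start
```

With 10 units, a stop at +3, and an assumed 49% chance of winning each hand, `prob_reach_target(10, 13, 0.49)` comes out at roughly 72%, so most visits end ahead. `expected_change(10, 13, 0.49)` is still negative, though: the frequent small wins are outweighed by the occasional loss of the whole stake.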
