Goyal, Kjeldergaard, Deshmukh, and Kim present a strategy for developing an intelligent agent that plays blackjack, using learning, utility theory, and decision-making to maximize its probability of winning the game.
Blackjack is an excellent application for decision trees. The popular table game is an example of an iterated game with a finite number of players (2), imperfect information, and a finite (albeit large) number of possible moves. Because of these factors, the house edge can be reduced to the point that, in certain situations, the player can press an advantage and tip the odds in their favor. Outside those situations, however, even the best strategies leave the player's expected value negative.
The study examines methods for computing player odds and dealer odds, and details how to feed these computations into a probabilistic decision tree. Human players can use card-counting techniques to track probable hands and find advantageous positions. An intelligent agent, however, can perfectly recall the actual history of the standard eight-deck shoe, and can therefore calculate both its own and the dealer's probable hands exactly at the beginning of each iteration.
The agent makes its probabilistic choice of strategy (hit, stay, double down, split, and amount to bet) from the ongoing record of discarded cards. Knowledge of the cards remaining in the shoe is used to determine the point on the decision tree at which to pick up the analysis.
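The bookkeeping described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (new_shoe, remove_seen, draw_probabilities) and the encoding of cards as values 2 to 11 (with 11 standing for an ace and 10 covering 10, J, Q, K) are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical encoding: 2-9 at face value, 10 covers 10/J/Q/K, 11 is an ace.
def new_shoe(decks=8):
    """Return a Counter mapping card value -> copies in a fresh shoe."""
    shoe = Counter({value: 4 * decks for value in range(2, 10)})
    shoe[10] = 16 * decks  # 10, J, Q, K
    shoe[11] = 4 * decks   # aces
    return shoe

def remove_seen(shoe, seen_cards):
    """Return a copy of the shoe with every card in seen_cards taken out."""
    remaining = shoe.copy()
    for card in seen_cards:
        if remaining[card] <= 0:
            raise ValueError(f"no card of value {card} left in the shoe")
        remaining[card] -= 1
    return remaining

def draw_probabilities(shoe):
    """Probability of each card value being the next card drawn."""
    total = sum(shoe.values())
    return {value: count / total for value, count in shoe.items() if count > 0}
```

For example, after the agent has seen two ten-valued cards, an ace, and a five leave a fresh eight-deck shoe, remove_seen(new_shoe(), [10, 10, 11, 5]) leaves 412 cards, 126 of them ten-valued, so the next card is ten-valued with probability 126/412.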
The intelligent agent should learn the optimal strategy over time by attempting to answer two questions:
1. Given the total of the agent’s hand and the cards in the dealer’s hand, including the 2 hidden cards, is there any set of cards in the remaining deck that would make the agent’s hand total greater than 21? If such sets exist, what is the probability of drawing one of them in addition to the hand in possession?
2. Subject to point 1, is there any set of cards in the remaining deck that would make the agent’s total equal to 21, or at least greater than the dealer’s stand point (which depends on casino rules and practices)?
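A simplified version of the two questions above, restricted to a single additional card, can be sketched in Python as follows. The names hand_total and next_card_outcomes are hypothetical, cards are encoded as values 2 to 11 (11 for an ace), and the dealer stand point of 17 is an assumption; the paper's questions concern whole sets of cards, so this covers only the one-card case.

```python
def hand_total(cards):
    """Best blackjack total: aces (11) are demoted to 1 while the hand is over 21."""
    total = sum(cards)
    soft_aces = cards.count(11)
    while total > 21 and soft_aces:
        total -= 10
        soft_aces -= 1
    return total

def next_card_outcomes(hand, shoe, dealer_stand=17):
    """Probabilities that one more card busts the hand, makes exactly 21,
    or lands strictly above the dealer's stand point without busting.

    hand is a list of card values; shoe maps card value -> copies remaining.
    """
    n = sum(shoe.values())
    result = {"bust": 0.0, "exactly_21": 0.0, "above_stand": 0.0}
    for card, count in shoe.items():
        if count <= 0:
            continue
        p = count / n
        total = hand_total(hand + [card])
        if total > 21:
            result["bust"] += p
        else:
            if total == 21:
                result["exactly_21"] += p
            if total > dealer_stand:
                result["above_stand"] += p
    return result
```

With a hand of [10, 6] drawn from a fresh eight-deck shoe (414 cards remaining), any 6, 7, 8, 9, or ten-valued card busts the hand, which gives a bust probability of 254/414, or roughly 0.61.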
Based on the answers to these questions, the agent can determine the optimal strategy for its moves and the flow of money in the game. However, because Blackjack is described as NP-complete, no efficient exact solution to this problem is known.
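One ingredient of such a determination, the value of standing on a given total, can be sketched by enumerating the dealer's possible draws from the remaining shoe. This is an illustrative Python sketch under stated assumptions, not the paper's method: the names dealer_outcomes and stand_ev are hypothetical, the dealer is assumed to stand on every 17 (including soft 17), a win pays 1 unit, a push pays 0, and naturals are ignored.

```python
def dealer_outcomes(total, soft_aces, shoe, stand_on=17):
    """Distribution of the dealer's final total, drawing without replacement.

    total counts aces as 11; soft_aces is how many aces are still counted as 11.
    shoe maps card value (2-11, 11 = ace) -> copies remaining.
    The key 22 stands for "dealer busts".
    """
    if total > 21 and soft_aces:
        # Demote one ace from 11 to 1 and re-evaluate.
        return dealer_outcomes(total - 10, soft_aces - 1, shoe, stand_on)
    if total > 21:
        return {22: 1.0}
    if total >= stand_on:
        return {total: 1.0}
    n = sum(shoe.values())
    dist = {}
    for card, count in list(shoe.items()):
        if count == 0:
            continue
        shoe[card] -= 1  # draw the card...
        sub = dealer_outcomes(total + card, soft_aces + (card == 11), shoe, stand_on)
        shoe[card] += 1  # ...and put it back before trying the next one
        for final, p in sub.items():
            dist[final] = dist.get(final, 0.0) + p * count / n
    return dist

def stand_ev(player_total, dealer_dist):
    """Expected payoff of standing: +1 for a win, -1 for a loss, 0 for a push."""
    win = sum(p for t, p in dealer_dist.items() if t == 22 or t < player_total)
    lose = sum(p for t, p in dealer_dist.items() if t != 22 and t > player_total)
    return win - lose
```

A call such as stand_ev(18, dealer_outcomes(6, 0, shoe)) would estimate the value of standing on 18 against a dealer showing a 6, given the current shoe. The enumeration visits every sequence of dealer draws, so its cost grows quickly as more decisions (hits, splits, bet sizing) are layered on top.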
Sanchit Goyal, E. K. (2010). Intelligent Agent for Playing Casino Card Games. Grand Forks, ND: University of North Dakota.