James P Houghton

# Playing the Lottery 2 - Maximizing Expected Value

04 Feb 2013

In the last post, we showed that the lottery is a losing game, because the expected value of a ticket has (at least for the last year) been consistently below its cost. While it never makes sense to buy a ticket for yourself, suppose that you were given a coupon for a ticket and had to choose which drawing to play. You'd want to choose the drawing with the highest expected value, but where is that maximum? As we saw, it depends on both the jackpot size and the number of players.

For the Powerball lottery, historical results suggest that, in general, higher jackpots lead to better expected values. For the Mega Millions lottery, expected value has historically risen with jackpot size up to a point and then declined as the number of tickets sold increases.

In each case, the number of tickets sold strongly influences the expected value. We can estimate the number of tickets sold at a given jackpot by fitting a curve to the historical sales data. Three factors likely influence sales: 1) the number of people who have the lottery on their mind, 2) the fraction of those who choose to play, and 3) the number of tickets each player buys. Possibly reflecting this (the product of three factors that each grow with the jackpot would be roughly cubic), a cubic polynomial gives a good fit for both sets of data. For the Powerball, with the jackpot value as x and the number of tickets sold as y, we get:

$$y = 4\times10^{-19}\,x^3 + 7\times10^{-10}\,x^2 - 0.0677\,x + 2\times10^{7}$$

and for the Mega Millions:

$$y = 1\times10^{-18}\,x^3 + 7\times10^{-10}\,x^2 + 0.0542\,x + 2\times10^{7}$$

These fits have R^2 values of 0.9918 and 0.9985, respectively. This would suggest playing the Powerball lottery when the jackpot is above about \$450M (assuming jackpots don't climb substantially above what has been observed in the past) and playing the Mega Millions lottery when the jackpot is between about \$375M and \$425M.
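
To make the fitted curves easier to experiment with, here is a minimal Python sketch that evaluates both cubics at a few jackpot sizes. The coefficients are copied from the equations above. The names `POWERBALL`, `MEGA_MILLIONS`, and `estimated_tickets` are made up for this illustration, and because the published coefficients are rounded, the outputs should be read as rough estimates only.

```python
# Coefficients copied from the fitted equations, highest power first.
# They are rounded (mostly to one significant figure), so results are rough.
POWERBALL = (4e-19, 7e-10, -0.0677, 2e7)
MEGA_MILLIONS = (1e-18, 7e-10, 0.0542, 2e7)


def estimated_tickets(jackpot, coefficients):
    """Evaluate a cubic fit at a jackpot (in dollars) using Horner's method."""
    total = 0.0
    for c in coefficients:
        total = total * jackpot + c
    return total


if __name__ == "__main__":
    # Illustrative jackpot sizes, chosen arbitrarily for this example.
    for jackpot in (100e6, 250e6, 400e6):
        pb = estimated_tickets(jackpot, POWERBALL)
        mm = estimated_tickets(jackpot, MEGA_MILLIONS)
        print(
            f"${jackpot / 1e6:.0f}M jackpot: "
            f"Powerball ~{pb / 1e6:.0f}M tickets, "
            f"Mega Millions ~{mm / 1e6:.0f}M tickets"
        )
```

Both curves share the same constant term, so they predict roughly 20 million tickets at very small jackpots. They diverge as the jackpot grows because of the differing linear and cubic coefficients.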

This analysis has a major weakness: there are relatively few historical data points in the region of the curve with the highest expected value. This means that our jackpot vs. ticket sales curves are less reliable here than our R^2 values would indicate. In the next post on this topic, we'll investigate an additional data source which can help us build confidence in our analysis.