Solving Project Euler's Poker Hands Problem

A seemingly simple problem on Project Euler, where you have to score poker hands, and determine how many times player 1 wins, left me quite puzzled. At a first glance it shouldn't be that hard to solve this problem, but each time I was nearing 100-200 lines of code, I felt that there must've been a more elegant solution.

Rules of Poker

If we quickly rehearse the rules for poker, we can create a list of the possible scores for a hand.

In decreasing order of how good the hand is, there are the following scores:

  1. Royal Flush: Ace, King, Queen, Jack, Ten, in same suit.
  2. Straight Flush: All cards are consecutive values of same suit.
  3. Four of a Kind: Four cards of the same value.
  4. Full House: Three of a kind and a pair.
  5. Flush: All cards of same suit.
  6. Straight: All cards are consecutive values.
  7. Three of a Kind: Three cards of the same value.
  8. Two Pairs: Two different pairs.
  9. One Pair: Two cards of the same value.
  10. High Card: Highest value card.

If both players have the same score, then it is the highest card for that score, and if those are equal, then it's the next highest card, repeating until a player wins, or both are equal, and the pot is split.

First attempt

In my first attempt, I created a Card class, and a Hand class, and in combination with operator overloading, it would've allowed me to easily compare hands and cards for both players. This method however, got me stuck with a lot of code, and still not a good method to determine the score of the hand. Another problem, even if I could determine the score, occurs when the ranks are equal, and the highest cards needs to be compared.

In essence, I want to create a score function for a hand, that could be easily compared with the other player's hand.

Counting ranks

To easily determine the score of a hand, I figured that it would help to create a frequency table of how many times the rank of a card occurs. To do this, we create a count_frequency function, that returns the frequency of each rank in a list.

def count_frequency(hand):
    frequencies = [0] * 13
    for (rank, _) in hand.split():
        frequencies["23456789TJQKA".index(rank)] += 1
    return frequencies

If we call the function for a given hand, e.g.: count_frequency('5H 5C 6S 7S KD') it results in [0, 0, 0, 2, 1, 1, 0, 0, 0, 0, 0, 1, 0]. Looking at the frequency table, we can deduce that if:

  • one 2 is present, we have one pair;
  • two 2's are present, we have two pairs;
  • a 3 is present, and no 2, we have three of a kind;
  • a 3 and a 2 is present, we have three of a kind and a pair;
  • a 4 is present, we have four of a kind,
which is quite a useful method to determine the score of a hand. However, this still leaves me without a simple way to determine the winner if the players have the same hand.

Researching existing solutions

To get some inspiration for ideas on how to calculate a score for a hand, I found the following two functions in this StackOverflow thread.

Histogram based scoring function

The first answer I found is the following code, from this answer.

def score(hand):
    ranks = '23456789TJQKA'
    rcounts = {ranks.find(r): ''.join(hand).count(r) 
        for r, _ in hand}.items()
    score, ranks = zip(*sorted((cnt, rank) 
        for rank, cnt in rcounts)[::-1])
    if len(score) == 5:
        if ranks[0:2] == (12, 3): #adjust if 5 high straight
            ranks = (3, 2, 1, 0, -1)
        straight = ranks[0] - ranks[4] == 4
        flush = len({suit for _, suit in hand}) == 1
        '''no pair, straight, flush, or straight flush'''
        score = ([1, (3,1,1,1)], [(3,1,1,2), (5,)])[flush][straight]
    return score, ranks

At a first glance, I understood very little of this. Except that it was using the idea of the frequency table.

Another really amazing answer is a 4-line JavaScript function that uses bit operations on a 64-bit float to determine the result. However, that solution was even more cryptic than the one above. Searching for more solutions on Google led to a lot of functions that only determine the score of the hand, and do not generate a score, that can be compared when a tie occurs. After reading some ideas, I called it a day.

Second attempt

The next day I remembered that Python also allows for comparing tuples in a convenient way. For example, if we compare $(3,2,1) \ge (2,3,1)$, Python will evaluate this to true.

Using this idea, I wanted to create a score function that returns a tuple, with this format: $(\text{score of hand}, \text{highest card in rank}, \text{highest card}, \text{next highest card}, ...)$. If we calculate this for '5H 5C 6S 7S KD', we have two fives, which should become the following tuple: $(1, 3, 11, 5, 4)$. Now this is easily comparable to other scores!

Using the idea of the frequency table, and creating a tuple with the score, I started writing the following score function.

def score(hand):
    cards = sorted([("23456789TJQKA".index(r) + 2, s) 
        for (r, s) in hand.split()])[::-1]
    ranks = [r for (r, _) in cards]
    rcount = sorted(ranks.count(r) for r in set(ranks))[::-1]
    is_straight = (ranks[0] - ranks[-1]) == 4
    is_flush = len({s for (_, s) in cards}) == 1

    # Royal Flush: Ten, Jack, Queen, King, Ace, in same suit.
    if is_straight and is_flush and ranks[0] == 14: 
        return (12, )
    # Straight Flush: All cards are consecutive values of same suit.
    elif is_straight and is_flush: 
        return (11, max(cards)[0])
    # Four of a Kind: Four cards of the same value.
    elif rcount[0] == 4: 
        return (10, [r for r in ranks if r != ranks[2]][0])
    # Full House: Three of a kind and a pair.
    elif rcount[0] == 3 and rcount[1] == 2: 
        return (9, ranks[2], [r for r in ranks if r != ranks[2]][0])
    # Flush: All cards of the same suit.
    elif is_flush: 
        return (8, *ranks)
    # Straight: All cards are consecutive values.
    elif is_straight: 
        return (7, ranks[0])
    # Three of a Kind: Three cards of the same value.
    elif rcount[0] == 3: 
        return (6, ranks[2], *[r for r in ranks if r != ranks[2]])
    # Two Pairs: Two different pairs.
    elif rcount[0] == 2 and rcount[0] == 2: 
        return (2, *ranks) # FIX: left over card must be last
    # One Pair: Two cards of the same value.
    elif rcount[0] == 2: 
        return (1, *ranks) # FIX: two pair must be in front
    # High Card: Highest value card.
    else: 
        return (0, *ranks)

This is still a bit messy in my opinion, and sorting the cards in a good order for two pair, and one pairs, is still not that easy. However, I noticed that this solution is coming quite close to the solution from StackOverflow.

Investigating the frequency table

If we take a closer look at the frequency table, we can see that there are only a few possibilities. In order, they are:

[4,1] four of same value Four of a Kind
[3,2] three of same value, and two of same value     Full House
[3,1,1] three of same value Three of a Kind
[2,2,1] two pairs of same value Two Pairs
[2,1,1,1] two of same value One Pair
[1,1,1,1,1] all different values Flush, Straight, Highest

Note that the frequencies are sorted in descending order. If we convert this to a tuple, we can compare score of a hand to another hand! However, we need still need to take care of a flush, straight, and highest card.

Now instead of storing the frequency table in a list, we are going to store it in a dictionary with the following score function.

def score(hand):
    cards = sorted([("23456789TJQKA".index(r), s) 
        for (r, s) in hand.split()])[::-1]
    ranks = [r for (r, _) in cards]
    rcount = {r:ranks.count(r) for r in ranks}
    return rcount

If we run this function for a pair of fives, we get the following.

score('5H 5C 6S 7S KD') = {11: 1, 5: 1, 4: 1, 3: 2}

Revisiting the score function

Now the key trick comes. This is also what the StackOverflow example is doing. We are going to swap the key and value, and convert it to a tuple, and sort it in descending order. To do this we modify the score function.

def score(hand):
    cards = sorted([("23456789TJQKA".index(r), s) 
        for (r, s) in hand.split()])[::-1]
    ranks = [r for (r, _) in cards]
    rcount = {r:ranks.count(r) for r in ranks}
    return sorted((v,k) for (k,v) in rcount.items())[::-1]

If we then run it again for the pair of fives, we get the following.

score('5H 5C 6S 7S KD') = [(2, 3), (1, 11), (1, 5), (1, 4)]

Notice that the first element of the tuple is the score of the hand, where the highest score is in front of the list. The second element of the tuple is the rank of the card, also sorted in descending order. However, because the score of the hand is sorted first, the cards are also in the correct order to compare them with a tuple. All that is left to do is to split the tuple into two tuples, the score of the hand, and the rank of the cards. To do this, we will again modify the score function.

def score(hand):
    cards = sorted([("23456789TJQKA".index(r), s) 
        for r, s in hand.split()])[::-1]
    ranks = [r for (r, _) in cards]
    rcounts = {r:ranks.count(r) for r in ranks}.items()
    return zip(*sorted((v, k) for k, v in rcounts)[::-1])

If we again run this score function for the pair of fives, we now get the following.

score('5H 5C 6S 7S KD') = ((2, 1, 1, 1), (3, 11, 5, 4))

Cool! The score function is almost identical to the StackOverflow answer. All that is left to do is to correctly score the Royal Flush, Straight Flush, Flush, and Straight. If we list them in a table, we can fill in the missing scores.

Royal Flush (5, ) (changed)
Straight Flush (5, ) (changed)
Four of a Kind (4, 1)
Full House (3, 2)
Flush (3, 1, 3) (changed)
Straight (3, 1, 2) (changed)
Three of a Kind (3, 1, 1)
Two Pairs (2, 2, 1)
One Pair (2, 1, 1, 1)
High Card (1, 1, 1, 1, 1)

To easily classify those, we need to determine if there is a flush or a straight.

Determine a flush

To determine if there is a flush is the easiest one, it simply means that all cards should be the same suit. We can easily do this by creating a dictionary of the suits, and check that the length of the dictionary is one. This results in the following check.

flush = len({suit for _, suit in hand}) == 1

Also note that for a straight the score is (1, 1, 1, 1, 1). This is important because the only case when we need to adjust the score, is when the length of the scores is 5.

Determine a straight

A straight can only happen if the score is (1, 1, 1, 1, 1), and in that case, the rank of the cards is in the following order $(n+4, n+3, n+2, n+1, n)$. If we subtract the last element from the first element we are left with $n+4-n=4$. This means that we can check for a straight with the following condition:

straight = (score[0] - score[4]) == 4

Do note that the condition where we need to have 5 scores also applies. This ensures that we do not have any cards of the same rank, otherwise $(n+4, n+2, n+2, n+2, n)$ is perfectly valid too!

Putting it all together

Finally we can complete the score function with the conditions for the straight and the flush. If implemented, we will have the final score function.

def score(hand):
    cards = sorted([("23456789TJQKA".index(r), s) 
        for r, s in hand.split()])[::-1]
    ranks = [r for (r, _) in cards]
    rcounts = {r:ranks.count(r) for r in ranks}.items()
    score, ranks = zip(*sorted((v, k) for k, v in rcounts)[::-1])
    if len(score) == 5:
        straight = (ranks[0] - ranks[-1]) == 4
        flush = len({s for (_, s) in cards}) == 1
        if straight and flush:       score = (5, )
        elif not straight and flush: score = (3, 1, 3)
        elif straight:               score = (3, 1, 2)
    return score, ranks

This is the score function that solved the Project Euler question.

Golfing

Ofcourse we also want a function that is as short as possible. After a bit of fiddling I got to 280 characters, which fits in a Tweet.

def f(H):
    z,s,L=zip,sorted,len
    R,U=z(*s([("23456789TJQKA".index(r),s)for r,s in H.split()])[::-1])
    S,R=z(*s((v,k)for k,v in{r:R.count(r)for r in R}.items())[::-1])
    c,s,f=L(S)==5,R[0]-R[-1]==4,L(set(U))==1
    return(((1,),(3,1,3)),((3,*S),(5,)))[s][f]if c else S,R