Why adieu is a terrible first guess in Wordle

Jan 23, 2022

#combinatorics #numpy

Adieu is a terrible first word in Wordle. It's not just a little bad, and not just in a few rare cases, like Matt Damon movies. It's really bad, basically all the time, like Adam Sandler movies. On average, "adieu" is about 2.4 times worse than the optimal word. What is the optimal word? Read on to find out.

A few folks have already reverse-engineered Wordle, so we know it uses exactly 12,972 unique 5-letter words as its dictionary. The goal in Wordle is to narrow down the list of possible words as quickly as possible, ideally down to 1. So a good guess should give you as much information as possible: you learn not just whether a letter is present or absent, but also something about its position.

For that reason, the guesses "arise" and "aesir", though they have the same letters, are not equally good; they will give you different information about the positions of their letters.

So our ideal word is one that narrows down the possibilities from nearly 13,000 as fast as possible. Now, it turns out there is not just one ideal first word; it comes down to whether you're a risk-averse person or not. If you want to make sure that you solve as many puzzles as possible under the 6-guess limit, your best bet is "serai". If on the other hand you want to, on average, solve puzzles in the fewest guesses, then your ideal first word is "tares".

If you choose "serai" as your first word, you will immediately narrow down the number of possible words from 12,972 to at most 697, and an average of 75. If you choose "tares" as your first word, you will narrow the number of possible words down to at most 858, but an average of 61.

In Wordle, you're basically looking through a big house for your driver's license. In this analogy, your mischievous partner has hidden it somewhere, and you get to ask them questions (they will destroy it after 6 incorrect guesses, because pandemic home life is monotonous and what spices up an afternoon like a high-stakes game of hide and seek?). If your first guess is "adieu", it's like asking your partner if the license is in their left-front pocket. You should instead be asking general questions, like whether the license is upstairs or downstairs, or out in the open or inside a container.

But there's another, simpler reason why "adieu" is a terrible choice of first word: "e" isn't even the most common letter among 5-letter words, "s" is! So a first guess that doesn't contain the letter "s" is unlikely to be good. But, more relevant to the game, the frequency of letters is not the same across positions: "s" is most common as a first and last letter, while the second and third letters are most likely to be an "a". You can find detailed per-position frequencies here. One insight: the most common first letters are all consonants; "a" isn't even in the top 5. And obviously, "u" is a very uncommon letter (only 18% of words contain it), but as a last letter it's extremely improbable: only 0.5% of words end in it.
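If you want to check numbers like these yourself, here is a minimal sketch of how per-position letter frequencies could be computed. It is not the code from the GitHub repo: the function names are made up for illustration, and the five-word list is just a stand-in for the real 12,972-word dictionary.

```python
from collections import Counter

def position_frequencies(words):
    """For each position, map letter -> fraction of words with that letter there."""
    n = len(words)
    freqs = []
    for pos in range(len(words[0])):
        counts = Counter(w[pos] for w in words)
        freqs.append({letter: c / n for letter, c in counts.items()})
    return freqs

def containing_fraction(words, letter):
    """Fraction of words that contain the letter anywhere."""
    return sum(letter in w for w in words) / len(words)

# Stand-in word list (assumption); swap in the full dictionary for real numbers.
words = ["serai", "tares", "adieu", "rates", "ratio"]
freqs = position_frequencies(words)
print(freqs[0])                         # first-letter frequencies
print(freqs[-1].get("u", 0.0))          # fraction of words ending in "u"
print(containing_fraction(words, "u"))  # fraction of words containing "u"
```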

If you like this kind of thing, drop me an email or DM me on Twitter (links in the sidebar). If you were curious, the absolute worst first words are "qajaq" and "gyppy".

Wordle Solver

Yeah, of course I wrote a solver. You can use it to help you solve a Wordle - it'll suggest words to try, and you can tell it how it did. Code is on GitHub (I would have implemented it as a web app but it requires a 90MB matrix and no one wants to blow their data cap on something like that).

I tried the solver with a few different starting words and show the results below. Note that the solver itself is very smart, so it can do pretty well even with a very bad initial starting word.

First Word	# Puzzles Solved (/219)	Average # of Guesses per Puzzle (when solved successfully)
serai	212 (96.8%)	4.20
tares	213 (97.3%)	4.25
adieu	204 (93.2%)	4.26
rates	210 (95.9%)	4.27
ratio	211 (96.3%)	4.30
gyppy (worst possible starting word)	209 (95.4%)	4.77

More results can be found on GitHub. You can try different words yourself using the solver or Tweet at me for a specific word.

Technical Details

I put all my code and data on GitHub, including the word list and the dataset of answers to all past puzzles. I hope this is helpful to others, as I found other projects only publicized their code, or only made the data available in language-specific formats, which made it hard to improve on their work. I wrote all my code in Python. It took me about an hour to write the computation of the optimal first word, and another half hour to write the solver. The main non-algorithmic trick was to make sure we don't run out of memory.

First, we statically compute a possibilities matrix: a table of all guess-answer pairs (a 12,972×12,972 matrix) where each entry is the evaluation Wordle would output for that guess against that answer. This is similar to dynamic programming, though we will not be changing the values in the table. Instead, we can save this table on disk for repeated use by our solver. I encoded Wordle's per-guess output as a small integer so the whole table could be represented as a numpy matrix, which is very memory-efficient. For each guess, I first encoded Wordle's output as a 5-element array, one element per letter, where each element is 0, 1, or 2: 0 means that letter is not present anywhere in the answer (grey), 1 means the letter is present elsewhere in the answer (yellow), and 2 means the letter is present at that exact position (green). I then read that 5-element array as the digits of a base-3 number (with a maximum value of 3⁵-1 = 242). An example is shown below.

Example Wordle output: we can encode this as the array [1, 2, 0, 0, 1] or as the integer 88 (treating the first element as the least significant base-3 digit: 1·1 + 2·3 + 1·81 = 88).
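A minimal sketch of this encoding follows. It is not the code from the GitHub repo, and the function names are made up for illustration. Two assumptions: the first letter is the least significant base-3 digit (which is what makes [1, 2, 0, 0, 1] come out as 88), and repeated letters are handled the usual Wordle way, where a letter only turns yellow as many times as it has unmatched copies in the answer.

```python
from collections import Counter

def score_guess(guess, answer):
    """Per-letter feedback: 0 = grey, 1 = yellow, 2 = green."""
    result = [0] * len(guess)
    # Answer letters that were not matched exactly can still earn yellows.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = 2
    for i, g in enumerate(guess):
        if result[i] == 0 and unmatched[g] > 0:
            result[i] = 1
            unmatched[g] -= 1  # assumption: each answer letter is used at most once
    return result

def encode(feedback):
    """Read the feedback as base-3 digits, first letter least significant."""
    return sum(d * 3**i for i, d in enumerate(feedback))

def decode(code, length=5):
    """Inverse of encode."""
    digits = []
    for _ in range(length):
        code, d = divmod(code, 3)
        digits.append(d)
    return digits

print(encode([1, 2, 0, 0, 1]))                 # 88
print(decode(88))                              # [1, 2, 0, 0, 1]
print(encode(score_guess("serai", "serai")))   # 242, i.e. all green
```

Since the largest code is 242, every response fits in a single unsigned byte.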

We can further ensure that the table occupies a small memory and disk footprint by setting the element type to uint8 (max value 255), so each of the roughly 168 million entries takes a single byte. This table therefore occupies a mere 160MB on disk and in memory (90MB compressed) - you can download it here from GitHub LFS. The trick to computing it quickly is using numpy-native methods of matrix and array manipulation. The matrix took about 8 minutes to compute on my old laptop. I loaded the dishwasher while I waited.
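To make the uint8 table concrete, here is a sketch of one way to build it with numpy, scoring a guess against every answer at once. This is an illustration under assumptions, not the repo's implementation: build_table is a made-up name, rows are guesses and columns are answers, the first letter is the least significant base-3 digit, and repeated letters earn yellows left to right only while unmatched copies remain in the answer.

```python
import numpy as np

def build_table(words):
    """table[g, a] = base-3 encoded feedback for guess g against answer a."""
    letters = np.array([[ord(c) for c in w] for w in words], dtype=np.uint8)
    n, k = letters.shape
    powers = 3 ** np.arange(k)              # first letter is least significant
    table = np.empty((n, n), dtype=np.uint8)
    for gi in range(n):
        g = letters[gi]
        green = letters == g                # (n, k): exact-position matches
        # match[a, i, p]: answer a has guess letter i at position p
        match = letters[:, None, :] == g[None, :, None]
        # Copies of guess letter i in the answer that are not already green.
        available = (match & ~green[:, None, :]).sum(axis=2)
        # Earlier non-green guess positions holding the same letter.
        earlier = np.tril(g[:, None] == g[None, :], k=-1)
        rank = (earlier[None, :, :] & ~green[:, None, :]).sum(axis=2)
        yellow = ~green & (rank < available)
        feedback = 2 * green + yellow       # 0 grey, 1 yellow, 2 green
        table[gi] = (feedback @ powers).astype(np.uint8)
    return table

# Tiny stand-in word list; the real table uses all 12,972 words.
table = build_table(["speed", "abide", "arise", "aesir"])
print(table.dtype, table.shape)
print(table[0, 1])   # "speed" against "abide": [0, 0, 1, 0, 1] -> 90

# Size of the full table at one byte per entry, in MiB (about 160).
print(12_972 ** 2 * np.dtype(np.uint8).itemsize / 2**20)
```

With the default int64 element type the same table would be eight times larger, which is why the dtype matters here.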

After we have the table, both determining the ideal starting word and coding a solver are simple. Based on this table, we can compute, for each guess word, how well it partitions the space - we simply count, for each Wordle response (matrix value), how many answers produce that value. The mean of those counts is the "mean partition size" of that word (average case performance), and the max is the "max partition size" of that word (worst case performance). I include some interesting values below:
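The partition counting described above can be sketched with np.bincount. The names here are hypothetical and the tiny table is made-up stand-in data, not real Wordle responses. One assumption worth flagging: the mean is taken over the non-empty response buckets only, since the post doesn't spell out whether empty buckets or any weighting are involved.

```python
import numpy as np

def partition_stats(row):
    """Mean and max partition size for one guess (one row of the table)."""
    counts = np.bincount(row, minlength=243)   # answers per response code
    nonempty = counts[counts > 0]              # assumption: ignore empty buckets
    return float(nonempty.mean()), int(nonempty.max())

def best_first_guess(table, words, worst_case=True):
    """Pick the guess with the smallest max (or mean) partition size."""
    stats = [partition_stats(row) for row in table]
    pick = 1 if worst_case else 0
    best = min(range(len(words)), key=lambda i: stats[i][pick])
    return words[best], stats[best]

# Made-up 4x4 table purely for illustration (242 on the diagonal = all green).
toy_table = np.array(
    [[242, 0, 0, 0],
     [1, 242, 3, 9],
     [0, 0, 242, 5],
     [7, 7, 3, 242]],
    dtype=np.uint8,
)
toy_words = ["w0", "w1", "w2", "w3"]   # placeholder names
print(partition_stats(toy_table[0]))            # (2.0, 3)
print(best_first_guess(toy_table, toy_words))   # ('w1', (1.0, 1))
```

Switching worst_case to False ranks guesses by mean partition size instead, which is the difference between the two strategies.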

To finish the solver, we simply select a strategy (worst case or average case) and repeat it for each of the up to 5 guesses that remain after the first. I will explain the worst-case strategy, but the average-case strategy looks basically the same. First, once we get our response to our guess, we remove any answers from the table that are not consistent with that result for our guess. We then also remove those answers as potential guesses. So, if we start with "serai" and get one of the worst cases, we pare our table down to 697×697. We then use the same procedure that we used when picking our initial word to pick the next guess.
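Here is a rough sketch of that worst-case loop. It is an illustration rather than the solver from the repo: the function names are invented, ties are broken by taking the first candidate, and the 9-word table at the bottom is a synthetic stand-in (not real Wordle feedback) so the example runs on its own.

```python
import numpy as np

WIN = 242  # all five letters green

def filter_candidates(table, candidates, guess, response):
    """Keep only answers that would have produced this response to this guess."""
    return candidates[table[guess, candidates] == response]

def next_guess(table, candidates):
    """Among remaining candidates, pick the one with the smallest max partition."""
    sub = table[np.ix_(candidates, candidates)]   # pared-down square table
    worst = [np.bincount(row, minlength=243).max() for row in sub]
    return int(candidates[int(np.argmin(worst))])

def solve(table, answer, first_guess, max_guesses=6):
    """Return the list of guesses made, or None if the guess limit runs out."""
    candidates = np.arange(table.shape[0])
    guess, history = first_guess, []
    for _ in range(max_guesses):
        history.append(guess)
        response = table[guess, answer]
        if response == WIN:
            return history
        candidates = filter_candidates(table, candidates, guess, response)
        guess = next_guess(table, candidates)
    return None

# Synthetic stand-in table: 9 "words", with an arbitrary response rule.
n = 9
idx = np.arange(n)
toy = ((idx[:, None] + idx[None, :]) % 3).astype(np.uint8)
toy[idx, idx] = WIN
print(solve(toy, answer=7, first_guess=0))   # [0, 1, 4, 7]
```

Because a wrong guess is never consistent with its own response, it drops out of the candidate set at each step, so the pared-down table shrinks every round.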

I tried the worst-case strategy on all the past answers, starting from the first puzzle on June 19, 2021. This strategy will solve all except for 7 puzzles within the 6 allowed guesses, and will take an average of 4.3 guesses. It took about 5 minutes to evaluate the solver on these puzzles. I loaded the laundry machine while I waited.

Notes and Acknowledgements

I didn't do any of the reverse-engineering of Wordle, so thanks to Bertrand Fan, Owen Yin and Spencer Churchill for letting me stand on their shoulders.

Owen found that Wordle actually has all the answers encoded into its source code, so you can build a solver that will just output the answer in one guess if you were so inclined. Alternatively, you can use the entire solutions list to test your solver, or to train a solver that does better on the solution set than the dictionary set (overtraining). I chose to only use the past solutions for evaluation, since I thought using future solutions was a little against the spirit of the challenge. Nevertheless, I include those solutions as part of the dataset I provide (link) for people who have different ideas.

Addendum

I am now using "serai" as my first guess. I built a program that could completely solve 100% of possible puzzles in normal mode with "serai" as the first word, with an average of 3.71 guesses per puzzle. Specifically, I built a decision tree. You can see that decision tree here. Reading it is pretty straightforward: start with the root guess, then find the response from Wordle and enter the next guess in the tree. The tree doesn't include any results of the form "GGGGG" because once you've achieved that result, you've won and there are no more decisions to be made.