![]() last update 12.17.01 |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
What in Sam Hell Possessed Me to Waste My Time With This
So that got me wondering if it were possible to achieve a 100% win rate. I set out to write a program to answer this very pressing question.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The Rules![]() As kids, we'd also draw hats, hair, ears, facial expressions, hands, fingers, shoes, genitals, etc. But enlightened adults can't subscribe to the naive idealism of such soft-on-crime policies. No, we cannot. There are rules now, and the rules are that the guesser has exactly six misses. No more. So: Head. Body. Two arms. Two legs. Six. Boom, you're finished. Or perhaps head, body, two arms, one leg and one genital unit. But THAT'S IT. DONE. If the second player misses six times, he is "hanged" and the first player wins. If the second player reveals all the letters in the word, he escapes execution and is declared the victor. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The AlgorithmThe program's algorithm is pretty straightforward. For a given wordlist, the program determines how many valid "patterns" each letter has. For example, if it were to start with the letter A, it would find patterns such as:(The last pattern means "The word does not have an A.")[A---, **] [-A--, **] [-A-A, **] [----, A] The program then filters the existing wordlist through that pattern to form a new list. For example, after filtering a wordlist through the pattern: ...the new wordlist would contain all entries from the old wordlist that had the letter A in second position, such as:[-A--, **] With this new wordlist, the program again determines all valid patterns of each letter, filters the wordlist into another wordlist, and so on. This continues until either the number of misses totals six (hanged), or until the length of the wordlist is filtered down to one (solved).EATS JAZZ PAST SAIL TAKE
Despite these features, all that looping and recursion takes far too long. So I instead analyzed only the seven most frequently hit letters at each step. (This does not mean I only analyze the seven most frequently found letters in the English alphabet. For example, when analyzing the pattern -UE--, the program would look at the filtered wordlist and discover that Q is one of the most common letters in the wordlist.) Will this shortcut miss a key strategy? Probably not. In all of the observations I've made, the best letter was one of the four most common letters.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The DictionaryFor this program, I used the Enable dictionary found at this URL: http://www.puzzlers.org/secure/wordlists/dictinfo.html |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Summary![]() I can't seem to discern any pattern one should follow after the first guess. I'd initially theorized that you should choose vowels until the word is filled out a bit, then switch to consonants, but this doesn't appear to be a hard and fast rule. It does seem that the first three guesses of longer words are usually in the subset {AEIORSTLN}. I wonder what the missed 9-letter and 11-letter words are.
Win Probability
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
For Poop's Sake, The Strategy Files Already![]() Or you can look at individual strategies here. And here's the program's source code. The current version of the code solves the 10-, 12-, 13-, 14-, and 15-letter games very quickly. The other games require more time; the 9-letter game took upwards of 15 hours on a P3/600 machine. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Feeble Explanations![]() The search will return the following entry:**, **
This entry means that the first letter you guess should be E, and by doing so, and by continuing to follow the correct strategy, you should be able to solve 13121 out of 15323 words.0. [**, **] [E] [13121/15323] Let's say you guess E, and it misses. You should now do a search for:
The search will return the following entry:**, E This means that you should now guess A, and by doing so you can expect to solve 5043 out of 5905 words. So you guess A, and an A is revealed in the fourth position. You should do a search for:1. [**, E] A [5043/5905] The search will return the following entry:---A--, E This means that you should now guess S, and by doing so you can expect to win 347 out of 355 times. And so forth.2. [---A--, E] S [347/355] Eventually, you will come to an entry followed by a wordlist. For example: 6. [-O-AR-, EIS] BOYARD COWARD DOTARD HOLARD NONART NOTARY TOWARD VOTARY ZONARY ![]() I set the program up to simply print the wordlist once it contains fewer than 12 words, leaving the user to his or her own devices. I won't go into the boring reasons for presenting strategy in this halfassed manner. Looks like in this case, D might be the best choice, then N (if D is a miss) or T (if D is a hit). If you follow the strategy but lose the game, a pattern search will yield a list of words that might have defeated you. For example, let's say you've lost a five-letter game, ending with the following pattern:
A search for that pattern will return the following entry:---AN-, CEGIRS
8. [--AN-, CEGIRS] BLAND BLANK FLANK LLANO PLANK PLANT QUANT THANK |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Frequently Volleyed Ridicule, Frequently Proffered ShruggingQ1a. What good is this strategy if my opponent knows I use it and so only chooses words that will cause the strategy to fail?A1. If you were to always go Rock in Roshambo, it wouldn't take long for people to figure out how to beat you. Similarly, if you blindly follow a hangman strategy that has less than a 100% success rate, you're vulnerable to exploitation. Take this example: 6. [A-O-, EINS] AGOG AHOY ALOW AMOK APOD ATOM ATOP AVOW AWOL ![]() A game-theoretically optimal strategy would dictate an exact guess frequency for each letter. For example, the following strategy would result in a win 40% of the time: The first guess should be W 40% of the time, P 30%, M 30%. If first guess is a hit, the game is solved. Otherwise: if W was the first guess, the second guess should be P 10% of the time, M 10%, G 40%, H 40%. If P was the first guess, the second guess should be M 20% of the time, G 40%, H 40%. If M was the first guess, the second guess should be P 20% of the time, G 40%, H 40%. However, if you only mix your strategy late in the game, as in the above example, your opponent will still have a huge edge over you. You may need mix it up at all stages, possibly even from the very first guess. For example, instead of guessing E, guess A, I, or O. But then where to go from there? The strategy files don't instruct how to proceed except for when the initial guess is E! True enough. For now, that information isn't available. I could fancy up the program to develop optimal strategies given a particular first guess, after which I could invite one of my poker guru friends to provide some instruction on the proper way to mix strategy. If I sense any interest, I'll consider furthering the analysis in this manner. To reiterate: mixing your strategy is necessary only if the strategy is imperfect. If, for example, your opponent chooses a 12-letter word, then adhering to the strategy is a sure win, so there's no reason to stray from it.
A2. I could run this program for any wordlist, be it truncated, elongated, or otherwise, and can also easily change the number of allowed misses. I've not yet done so, but could be persuaded to. I'd actually be very interested in a wordlist that doesn't have S-pluralized words; if you know where I can get one, send me e-mail.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() Send comments to: neoncap@sharkfeeder.com. Sign my guestbook! |