The Wheel of ¿Fortune?

Following up on my analysis of the Spanish language a few days ago, I have carried out a similar study of English. So we go back to letters and words. The English alphabet consists of 26 letters: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z. In this article we will uncover some tips for succeeding at “The Wheel of Fortune” and “Hangman”.
In both games we have to guess the letters that make up the words in order to finally guess the hidden word or phrase. We can guess, guess and guess again, or we can use data to improve our “luck”. For this study I downloaded a word list from infochimps.
The dictionary has 370,076 words and, if we group them by the number of characters each one has, we obtain this chart:
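The grouping by word length can be sketched in a few lines of Python. This is only an illustration, not the code used for the study: the function name `length_distribution` and the tiny `sample_words` list are assumptions standing in for the full downloaded word list.

```python
from collections import Counter


def length_distribution(words):
    """Count how many words have each length (in characters)."""
    return Counter(len(word) for word in words)


# Hypothetical stand-in sample; the study itself used the full word list.
sample_words = ["wheel", "fortune", "hangman", "letter", "word", "luck"]

distribution = length_distribution(sample_words)
for length in sorted(distribution):
    # One row per word length: the length, then the number of words.
    print(length, distribution[length])
```

With the real list, `sample_words` would be replaced by the words read from the downloaded file, one per line.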

Even without much analysis, we can already see that some letters are more likely than others:

Here we find the first big difference between Spanish and English. In the language of Cervantes, the “a” dominated, while in the language of Shakespeare it is the “e” that takes first place.
The “e” is a safe bet for “Hangman”. In “The Wheel of Fortune”, since vowels have to be bought, the safest bets would be the “n” or the “s”. In Spanish, this role was played by the “r”. The 5 vowels take 39% of the pie (in Spanish it was 46%), so vowels carry less weight in English than in Spanish.
We also see some changes at the bottom of the ranking. While in Spanish the least used letter was the “w”, in English that spot goes to the “j”.
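For readers who want to reproduce this kind of ranking, here is a minimal sketch of how the overall letter counts, the share taken by the vowels and the most frequent consonants could be computed. The helper names (`letter_counts`, `vowel_share`, `top_consonants`) and the small `sample_words` list are assumptions for illustration, so the figures it produces will not match the 39% obtained from the full dictionary.

```python
from collections import Counter

VOWELS = set("aeiou")


def letter_counts(words):
    """Count every a-z letter across all the words, ignoring case."""
    counts = Counter()
    for word in words:
        counts.update(ch for ch in word.lower() if "a" <= ch <= "z")
    return counts


def vowel_share(counts):
    """Fraction of all counted letters that are vowels."""
    total = sum(counts.values())
    vowels = sum(n for ch, n in counts.items() if ch in VOWELS)
    return vowels / total if total else 0.0


def top_consonants(counts, n=3):
    """The n most frequent consonants as (letter, count) pairs."""
    consonants = [(ch, k) for ch, k in counts.most_common() if ch not in VOWELS]
    return consonants[:n]


# Hypothetical stand-in sample; the study itself used the full word list.
sample_words = ["wheel", "fortune", "hangman", "letter", "word", "luck"]

counts = letter_counts(sample_words)
print("most common letter:", counts.most_common(1)[0][0])
print("vowel share:", round(vowel_share(counts), 2))
print("top consonants:", top_consonants(counts))
```

The most common letter is the natural first guess in “Hangman”, while the consonant ranking is the one that matters in “The Wheel of Fortune”, where vowels have to be bought.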
We can fine-tune this a little more. In both games there is a very important variable: the number of characters in the word(s) to guess. With a more detailed analysis, we can improve our chances of success.

In relative terms:

As you can see, if we know the length of the word in advance, we can increase our chances of success.
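The breakdown by word length can be sketched as follows: letters are counted separately for each length, then each count is divided by the total for that length to get the relative terms. The function name `relative_frequency_by_length` and the `sample_words` list are assumptions for illustration, not the code behind the charts.

```python
from collections import Counter, defaultdict


def relative_frequency_by_length(words):
    """Map each word length to {letter: share of all letters at that length}."""
    counts = defaultdict(Counter)
    for word in words:
        letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
        counts[len(word)].update(letters)

    table = {}
    for length, counter in counts.items():
        total = sum(counter.values())
        table[length] = {ch: n / total for ch, n in counter.items()}
    return table


# Hypothetical stand-in sample; the study itself used the full word list.
sample_words = ["wheel", "fortune", "hangman", "letter", "word", "luck"]

table = relative_frequency_by_length(sample_words)
for length in sorted(table):
    # Best single guess for a hidden word of this length.
    best = max(table[length], key=table[length].get)
    print(length, best, round(table[length][best], 3))
```

Knowing that the hidden word has, say, 7 letters, we would look up row 7 of the table instead of the overall ranking.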

Vowels take many of the podium places. If we set the vowels aside (very useful for “The Wheel of Fortune”), these are the consonants we should choose to maximize our winnings.
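A consonant-only podium per word length can be obtained with the same counting idea, simply skipping the vowels. Again this is a sketch under assumptions: `consonant_podium_by_length` is a hypothetical helper and `sample_words` is a made-up list chosen so that the ranking has no ties.

```python
from collections import Counter, defaultdict

VOWELS = set("aeiou")


def consonant_podium_by_length(words, places=3):
    """Map each word length to its most frequent consonants, best first."""
    counts = defaultdict(Counter)
    for word in words:
        counts[len(word)].update(
            ch for ch in word.lower() if "a" <= ch <= "z" and ch not in VOWELS
        )
    return {
        length: [ch for ch, _ in counter.most_common(places)]
        for length, counter in sorted(counts.items())
    }


# Hypothetical stand-in sample; the study itself used the full word list.
sample_words = ["banana", "cannon", "tennis", "sunset"]

podium = consonant_podium_by_length(sample_words)
print(podium)
```

In this toy sample all four words have 6 letters, so the result has a single row; with the full dictionary there would be one podium per word length.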

The Wheel of ¿Fortune? Obviously some luck is involved, but we can reduce it to a minimum by studying English vocabulary a little. That way, luck is limited to avoiding “bankrupt” and/or “lose a turn”.
The “s”, the “r” and the “n” are the safest bets. Could we fine-tune this a little more? I think so, but I’ll leave that for another article.
#data #datascience #datascientist #textmining #mineriadetextos #rstudio #bigdata #letras #palabras #words #ruletadelasuerte #thewheeloffortune #hangman #ahorcado #dataanalytics #letters #englishgrammar

