Abstract
Wordle, a popular daily puzzle published by the New York Times, has garnered significant attention with its unique challenge: players must decipher a five-letter word in six attempts or fewer, receiving feedback after each guess. Rooted in information theory and pragmatics, Wordle offers a valuable platform for exploration. Our research proposes an approach consisting of two components. The first applies K-means clustering to two tailored indices: the forward difficulty evaluation index (D_forward) and the reverse difficulty evaluation index (D_reverse). D_forward is calculated by weighting factors such as N_repeat (the normalized count of repeated letters), word frequency (F), and the number of vowels (N_vowel) in the word. D_reverse is obtained by normalizing the predicted number of successful guesses. The second component applies a CART decision tree model, trained on the resulting difficulty-level labels, to predict the difficulty of future solution words. By analyzing the linguistic features of five-letter words, our study constructs a model that accurately determines word difficulty using statistical knowledge and machine learning techniques; the model also facilitates related linguistic analyses.
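To make the two-component pipeline concrete, the sketch below implements a minimal version in Python with scikit-learn: a forward index built as a weighted combination of word features, a reverse index from normalized success counts, K-means to assign difficulty-level labels, and a CART decision tree trained to predict those labels for new words. All data, feature values, and weights are illustrative placeholders, not the values used in the paper.

```python
# Minimal sketch of the abstract's two-component pipeline.
# Data, weights, and parameter choices are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy per-word features (one row per five-letter word):
#   n_repeat : normalized count of repeated letters
#   freq     : word frequency, scaled to [0, 1]
#   n_vowel  : number of vowels, scaled to [0, 1]
#   success  : predicted number of successful guesses, scaled to [0, 1]
n_words = 200
n_repeat = rng.uniform(0, 1, n_words)
freq = rng.uniform(0, 1, n_words)
n_vowel = rng.integers(1, 4, n_words) / 5.0
success = rng.uniform(0, 1, n_words)

# Forward difficulty index D_forward: weighted combination of the
# linguistic features (placeholder weights; rarer words and fewer
# vowels/repeats are treated as harder here).
w_repeat, w_freq, w_vowel = 0.4, 0.4, 0.2
d_forward = w_repeat * n_repeat + w_freq * (1 - freq) + w_vowel * (1 - n_vowel)

# Reverse difficulty index D_reverse: normalized predicted success counts
# (fewer successful guesses -> harder word).
d_reverse = 1 - (success - success.min()) / (success.max() - success.min())

# Component 1: K-means on (D_forward, D_reverse) assigns difficulty levels.
X_cluster = np.column_stack([d_forward, d_reverse])
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_cluster)
difficulty_labels = kmeans.labels_

# Component 2: a CART tree predicts the difficulty level of a future word
# from its raw linguistic features alone.
X_features = np.column_stack([n_repeat, freq, n_vowel])
cart = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=0)
cart.fit(X_features, difficulty_labels)

new_word = np.array([[0.2, 0.05, 0.4]])  # hypothetical future solution word
print("Predicted difficulty level:", cart.predict(new_word)[0])
```

In this sketch the cluster assignments act as the difficulty labels that supervise the CART model, mirroring the paper's use of clustering output as training labels for forecasting.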
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xu, X., Huang, J. (2024). From Wordle to Insights: Using Tailored Clustering and CART to Forecast Difficulty Levels. In: Pei, Y., Ma, H.S., Chan, YW., Jeong, HY. (eds) Proceedings of Innovative Computing 2024 Vol. 1. IC 2024. Lecture Notes in Electrical Engineering, vol 1214. Springer, Singapore. https://doi.org/10.1007/978-981-97-4193-9_17
DOI: https://doi.org/10.1007/978-981-97-4193-9_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-4192-2
Online ISBN: 978-981-97-4193-9
eBook Packages: Computer Science, Computer Science (R0)