Abstract
For a long time, diabetes and obesity were considered a menace only in developed countries. However, the proliferation of unhealthy habits, such as fast-food consumption and sedentary lifestyles, has caused diabetes and obesity to spread worldwide, leading to many costly complications. Since citizens use the Internet to search, learn, and share their daily personal experiences, social networks have become popular data sources that facilitate a deeper understanding of public health concerns. Exploiting this data, however, requires labelled resources and examples and, to the best of our knowledge, no such resources exist for Spanish. Consequently, (1) we compile a balanced multi-class corpus of tweets about diabetes and obesity written in Spanish in Central America; and (2) we use this corpus to train and test a machine-learning classifier capable of determining whether texts related to diabetes or obesity are positive, negative, or neutral. The best result, an accuracy of 84.30%, was obtained with the Bag of Words model and the LIBLinear library. As a final contribution, the compiled corpus is released.
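To make the Bag of Words representation named in the abstract concrete, the following is a minimal sketch of how tweets can be turned into count vectors that a linear classifier could consume. It is an illustration under stated assumptions, not the authors' pipeline: the whitespace tokenizer, the function names, and the two toy tweets are all hypothetical, and the classification step itself is left out.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(documents):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical toy tweets, invented for illustration only
# (they are not taken from the released corpus).
tweets = [
    "sugar sugar everywhere",
    "no sugar today",
]

vocabulary = build_vocabulary(tweets)
vectors = [bag_of_words(t, vocabulary) for t in tweets]
```

Each position in a vector counts one vocabulary word, so word order is discarded and only frequencies remain. In a full experiment, these vectors, paired with positive, negative, or neutral labels, would be the input to the linear classifier.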
Acknowledgements
This work has been supported by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER/ERDF) through project KBS4FIA (TIN2016-76323-R).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Medina-Moreira, J., García-Díaz, J.A., Apolinardo-Arzube, O., Luna-Aveiga, H., Valencia-García, R. (2019). Mining Twitter for Measuring Social Perception Towards Diabetes and Obesity in Central America. In: Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2019. Communications in Computer and Information Science, vol 1124. Springer, Cham. https://doi.org/10.1007/978-3-030-34989-9_7
DOI: https://doi.org/10.1007/978-3-030-34989-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34988-2
Online ISBN: 978-3-030-34989-9
eBook Packages: Computer Science (R0)