A hybrid optimal weighting scheme and machine learning for rendering sentiments in tweets

Published: 01 January 2016

Abstract

In recent years, the volume of text shared on the web has grown explosively. Every day, a huge volume of opinions is generated in forms such as articles, reviews and tweets. In general, opinion mining refers to the task of extracting opinions, and sentiment analysis is the technique that extracts subjectivity and polarity; in other words, it determines whether a text is positive or negative (Taboada et al., 2011). This study conducts Arabic sentiment analysis on a publicly available data set written in both Modern Standard Arabic and the Jordanian dialect. A new mathematical approach is introduced to determine the polarity of a tweet by using four functions whose parameters are the solutions of a linear program. These functions are then classified using support vector machines and k-nearest neighbours. The results show that the proposed approach is considerably reliable in Arabic sentiment analysis.
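
The abstract does not define the four functions or the linear program, so the following is only a minimal sketch of the final classification step, assuming each tweet has already been reduced to a vector of four scores. The function name `knn_predict`, the score values, and the choice of k = 3 are hypothetical and are not taken from the paper; the sketch covers plain k-nearest neighbours only, not the support vector machine.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (feature_vector, label) pairs; distance is Euclidean.
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 4-dimensional score vectors, one value per polarity function.
# These numbers are invented for illustration and do not come from the paper.
train = [
    ((0.9, 0.1, 0.8, 0.2), "positive"),
    ((0.8, 0.2, 0.7, 0.1), "positive"),
    ((0.7, 0.3, 0.9, 0.3), "positive"),
    ((0.1, 0.9, 0.2, 0.8), "negative"),
    ((0.2, 0.8, 0.1, 0.9), "negative"),
    ((0.3, 0.7, 0.2, 0.7), "negative"),
]

print(knn_predict(train, (0.85, 0.15, 0.75, 0.2)))  # positive
print(knn_predict(train, (0.15, 0.85, 0.2, 0.8)))   # negative
```

With an odd k and two classes the vote cannot tie, which is why k = 3 is used in this toy setting.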

References

[1]
Abbasi, A., Hsinchun, C. and Arab, S. (2008) 'Sentiment analysis in multiple languages: feature selection for opinion classification in web forums', ACM Transactions on Information Systems (TOIS), Vol. 26, No. 3, pp. 1-34.
[2]
Abdulla, N., Ahmed, N., Shehab, M. and Al-Ayyoub, M. (2013) 'Arabic sentiment analysis: lexicon-based and corpus-based', IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT), pp. 1-6.
[3]
Abdulla, N.A., Al-Ayyoub, M. and Al-Kabi, M.N. (2014) 'An extended analytical study of Arabic sentiments', International Journal of Big Data Intelligence, Vol. 1, Nos. 1-2, pp. 103-113.
[4]
Al-Ayyoub, M., Essa, S.B. and Alsmadi, I. (2015) 'Lexicon-based sentiment analysis of Arabic tweets', International Journal of Social Network Mining, Vol. 2, No. 2, pp. 101-114.
[5]
Al-Kabi, M., Al-Qudah, N.M., Alsmadi, I., Dabour, M. and Wahsheh, H. (2013) 'Arabic / English sentiment analysis: an empirical study', Proceedings of the 4th International Conference on Information and Communication Systems (ICICS 2013), Irbid, Jordan, pp. 1-6.
[6]
Al-Kabi, M.N., Alsmadi, I.M., Gigieh, A.H., Wahsheh, H.A. and Haidar, M.M. (2014) 'Opinion mining and analysis for Arabic language', International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 5, No. 5, pp. 181-195.
[7]
Alsaleem, S. (2011) 'Automated Arabic text categorization using SVM and NB', Int. Arab J. e-Technol., Vol. 2, No. 2, pp. 124-128.
[8]
Chalothorn, T. and Ellman, J. (2013) 'Sentiment analysis of web forums: comparison between SentiWordNet and SentiStrength', Proceedings of the 4th International Conference on Computer Technology and Development, Bangkok, Thailand.
[9]
Cherif, W., Madani, A. and Kissi, M. (2014a) 'Integrating effective rules to improve Arabic text stemming', Proceedings of the 2014 International Conference on Multimedia Computing and Systems (ICMCS), pp. 1077-1081.
[10]
Cherif, W., Madani, A. and Kissi, M. (2014b) 'Building a syntactic rules-based stemmer to improve search effectiveness for Arabic language', Proceedings of the 9th International Conference on Intelligent Systems: Theories and Applications (SITA-14), pp. 1-6.
[11]
Cherif, W., Madani, A. and Kissi, M. (2015a) 'A new modeling approach for Arabic opinion mining recognition', Proceedings of the 1st Conference on Intelligent Systems and Computer Vision (ISCV), pp. 1-6.
[12]
Cherif, W., Madani, A. and Kissi, M. (2015b) 'New rules-based algorithm to improve Arabic stemming accuracy', International Journal of Knowledge Engineering and Data Mining, Vol. 3, Nos. 3-4, pp. 315-336.
[13]
Diab, M., Hacioglu, K. and Jurafsky, D. (2007) 'Automatic processing of modern standard Arabic text', Computational Morphology, Vol. 38, pp. 159-179, Springer, Netherlands.
[14]
Draper, N.R., Smith, H. and Pownell, E. (1998) Applied Regression Analysis, 3rd ed., Wiley-Interscience publication, New York.
[15]
Drucker, H., Burges, C.J.C., Kaufman, L., Smola, A. and Vapnik, V. (1997) 'Support vector regression machines', Advances in Neural Information Processing Systems, Vol. 9, pp. 155-161.
[16]
Duwairi, R., Al-refai, M. and Khasawneh, N. (2007) 'Stemming versus light stemming as feature selection techniques for Arabic text categorization', Proceedings of the 4th International Conference on Innovations in Information Technology (IIT '07), Vol. 18, No. 20, pp. 446-450.
[17]
Duwairi, R.M. (2015) 'Sentiment analysis for dialectical Arabic', 6th International Conference on Information and Communication Systems (ICICS), IEEE, pp. 166-170.
[18]
Duwairi, R.M., Marji, R., Sha'ban, N. and Rushaidat, S. (2014) 'Sentiment analysis in Arabic tweets', Proceedings of the 5th International Conference on Information and Communication Systems (ICICS), pp. 1-6.
[19]
Edwards, J.R. and Parry, M.E. (1993) 'On the use of polynomial regression equations as an alternative to difference scores in organizational research', Academy of Management Journal, Vol. 36, No. 6, pp. 1577-1613.
[20]
El-Halees, A. (2012) 'Opinion mining from Arabic comparative sentences', Proceedings of the 13th International Arab Conference on Information Technology ACIT'2012, pp. 265-271.
[21]
El-Makky, N., Nagi, K., El-Ebshihy, A., Apady, E., Hafez, O., Mostafa, S. and Ibrahim, S. (2014) 'Sentiment analysis of colloquial Arabic tweets', 2014 ASE BigData/SocialInformatics/PASSAT/BioMedCom 2014 Conference, Harvard University.
[22]
Farag, A. and Nürnberger, A. (2011) 'A web statistics based conflation approach to improve Arabic text retrieval', Proceedings of the Federated Conference on Computer Science and Information Systems, pp. 3-9.
[23]
Farra, N., Challita, E., Assi, R.A. and Hajj, H. (2010) 'Sentence-level and document-level sentiment mining for Arabic texts', Proceedings of the ICDM Workshop, 2010, pp. 1114-1119.
[24]
Fassi Fehri, A. (1998) 'Layers in the distribution of Arabic adverbs and adjectives and their licensing', Perspectives on Arabic Linguistics, Vol. 11, pp. 9-46, John Benjamins Publishing, Amsterdam/Philadelphia.
[25]
Ghwanmeh, S., Rabab'ah, S., Al-Shalabi, R. and Kanaan, G. (2009) 'Enhanced algorithm for extracting the root of Arabic words', Proceedings of 6th International Conference on Computer Graphics, Imaging and Visualization (CGIV '09), Tianjin, China, pp. 388-391.
[26]
Hawwari, A., Attia, M. and Diab, M. (2014) 'A framework for the classification and annotation of multiword expressions in dialectal Arabic', ANLP 2014, pp. 48-56.
[27]
Hearst, M.A. (1992) 'Direction-based text interpretation as an information access refinement', in Jacobs, P. (Ed.): Text-based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval, Lawrence Erlbaum Associates, Mahwah, NJ.
[28]
Hu, X., Tang, J., Gao, H. and Liu, H. (2013) 'Unsupervised sentiment analysis with emotional signals', Proceedings of the 22nd International Conference on World Wide Web, International World Wide Web Conferences Steering Committee, pp. 607-618.
[29]
Kanaan, R. and Kanaan, G. (2014) 'An improved algorithm for the extraction of triliteral Arabic roots', European Scientific Journal, Vol. 10, No. 3, pp. 346-355.
[30]
Khasawneh, R., Wahsheh, H., AL-Kabi, M. and Alsmadi, I. (2013) 'Sentiment analysis of Arabic social media content: a comparative study', The 8th International Conference for Internet Technology and Secured Transactions (ICITST-2013), London, UK, pp. 101-106.
[31]
Kumar, K.N. and Christopher, T. (2015) 'Opinion mining: a survey', International Journal of Computer Applications, Vol. 113, No. 2, pp. 15-17.
[32]
Larkey, L.S., Ballesteros, L. and Connell, M. (2002) 'Improving stemming for Arabic information retrieval: light stemming and cooccurrence analysis', Proceedings of the 25th Annual International Conference on Research and Development in Information Retrieval (SIGIR 2002), Tampere, Finland, pp. 275-282.
[33]
Li, D.H., Laurent, A., Roche, M. and Poncelet, P. (2008) 'Extraction of opposite sentiments in classified free format text reviews', Database and Expert Systems Applications, Springer, Heidelberg, Berlin, pp. 710-717.
[34]
Liu, H., Lieberman, H. and Selker, T. (2003) 'A model of textual affect sensing using real-world knowledge', Proceedings of the International Conference on Intelligent User Interfaces, pp. 125-132.
[35]
Luenberger, D.G. (1973) Introduction to Linear and Nonlinear Programming, Vol. 28, Addison-Wesley, Reading, MA.
[36]
Mohanchandra, K., Saha, S., Murthy, K.S. and Lingaraju, G.M. (2015) 'Distinct adoption of k-nearest neighbour and support vector machine in classifying EEG signals of mental tasks', International Journal of Intelligent Engineering Informatics, Vol. 3, No. 4, pp. 313-329.
[37]
Mourad, A. and Darwish, K. (2013) 'Subjectivity and sentiment analysis of modern standard Arabic microblogs', Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Atlanta, Georgia, USA, pp. 55-64.
[38]
Pak, A. and Paroubek, P. (2010) 'Twitter as a corpus for sentiment analysis and opinion mining', Proceedings of LREC, Vol. 10, pp. 1320-1326.
[39]
Polatidis, N. and Georgiadis, C.K. (2015) 'A ubiquitous recommender system based on collaborative filtering and social networking data', International Journal of Intelligent Engineering Informatics, Vol. 3, Nos. 2-3, pp. 186-204.
[40]
Rushdi-Saleh, M., Martin-Valdivia, M.T., Ureña-Lopez, L.A. and Perea-Ortega, J.M. (2011) 'OCA: opinion corpus for Arabic', Journal of the American Society for Information Science and Technology, Vol. 62, No. 10, pp. 2045-2054.
[41]
Singh, P.K. and Husain, M.S. (2013) 'Analytical study of feature extraction techniques in opinion mining', Computer Science & Information Technology, Vol. 3, No. 4, pp. 85-94.
[42]
Smola, A.J. and Schölkopf, B. (2004) 'A tutorial on support vector regression', Statistics and Computing, Vol. 14, No. 3, pp. 199-222.
[43]
Taboada, M., Brooke, J., Tofiloski, M., Voll, K. and Stede, M. (2011) 'Lexicon-based methods for sentiment analysis', Computational Linguistics, Vol. 37, No. 2, pp. 267-307.
[44]
Thelwall, M., Buckley, K. and Paltoglou, G. (2012) 'Sentiment strength detection for the social web', Journal of the American Society for Information Science and Technology, Vol. 63, No. 1, pp. 163-173.
[45]
Vapnik, V. (1995) The Nature of Statistical Learning Theory, Chapter 5, Springer Science & Business Media, New York.
[46]
Wiebe, J. (1994) 'Tracking point of view in narrative', Computational Linguistics, Vol. 20, No. 2, pp. 233-287.
[47]
Zhang, Y., Dai, M. and Ju, Z. (2015) 'Preliminary discussion regarding SVM kernel function selection in the twofold rock slope prediction model', Journal of Computing in Civil Engineering, Vol. 30, No. 3, pp. 1-10.

      Published In

      International Journal of Intelligent Engineering Informatics, Volume 4, Issue 3/4
      January 2016
      169 pages
      ISSN:1758-8715
      EISSN:1758-8723
      Publisher

      Inderscience Publishers

      Geneva 15, Switzerland

      Author Tags

      1. Arabic
      2. KNN
      3. SVM
      4. Twitter
      5. automatic language processing
      6. hybrid weighting
      7. k-nearest neighbour
      8. low level light stemming
      9. machine learning
      10. optimal weighting
      11. sentiment analysis
      12. sentiments
      13. support vector machines
      14. tweets

      Qualifiers

      • Article

      Cited By

      • (2020) Feature selection using an improved Chi-square for Arabic text classification, Journal of King Saud University - Computer and Information Sciences, 32:2 (225-231), doi:10.1016/j.jksuci.2018.05.010. Online publication date: 1-Feb-2020
      • (2019) Threshold-based empirical validation of object-oriented metrics on different severity levels, International Journal of Intelligent Engineering Informatics, 7:2-3 (231-262), doi:10.5555/3337636.3337642. Online publication date: 25-May-2019
      • (2019) A Survey of Opinion Mining in Arabic, ACM Transactions on Asian and Low-Resource Language Information Processing, 18:3 (1-52), doi:10.1145/3295662. Online publication date: 7-May-2019
      • (2018) A purely Bayesian approach for proportional visual data modelling, International Journal of Intelligent Engineering Informatics, 6:5 (491-508), doi:10.1504/IJIEI.2018.094513. Online publication date: 13-Dec-2018
