The power to predict outcomes based on Twitter data is greatly exaggerated, especially for political elections.
References
Asur, S. and Huberman, B.A. Predicting the future with social media. In Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (Toronto, Aug. 31--Sept. 3). IEEE Computer Society, Los Alamitos, CA, 2010, 492--499.
Boiy, E., Hens, P., Deschacht, K., and Moens, M.F. Automatic sentiment analysis in online text. In Proceedings of the 2007 Conference on Electronic Publishing (Vienna, June 13--15). ÖKK Editions, Vienna, 2007, 349--360.
Choi, H. and Varian, H. Predicting the Present with Google Trends. Tech. Rep. Google, Inc., Mountain View, CA, 2009; http://google.com/googleblogs/pdfs/google_predicting_the_present.pdf
Choi, Y. and Cardie, C. Adapting a polarity lexicon using integer linear programming for domain-specific sentiment classification. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (Singapore, Aug. 6--7). Association for Computational Linguistics, Stroudsburg, PA, 2009, 590--598.
Keeter, S., Kiley, J., Christian, L., and Dimock, M. Perils of Polling in Election '08. Pew Internet and American Life Project, Washington, D.C., 2009; http://pewresearch.org/pubs/1266/polling-challenges-election-08-success-in-dealing-with
Kwak, H., Lee, C., Park, H., and Moon, S. What is Twitter, a social network or a news media? In Proceedings of the 19th International World Wide Web Conference (Raleigh, NC, Apr. 26--30). ACM Press, New York, 2010, 591--600.
Hughes, A.L. and Palen, L. Twitter adoption and use in mass convergence and emergency events. In Proceedings of the Sixth International Community on Information Systems for Crisis Response and Management Conference (Gothenburg, Sweden, May 10--13, 2009).
Lenhart, A. and Fox, S. Twitter and Status Updating. Pew Internet and American Life Project, Washington, D.C. 2009; http://www.pewinternet.org/Reports/2009/Twitter-and-status-updating.aspx
O'Connor, B., Balasubramanyan, R., Routledge, B.R., and Smith, N.A. From tweets to polls: linking text sentiment to public opinion time series. In Proceedings of the Fourth International Association for the Advancement of Artificial Intelligence Conference on Weblogs and Social Media (Washington, D.C., May 23--26). Association for the Advancement of Artificial Intelligence, Menlo Park, CA, 2010, 122--129.
Smith, A. and Rainie, L. The Internet and the 2008 Election. Pew Internet and American Life Project, Washington, D.C., 2008; http://www.pewinternet.org/Reports/2008/The-Internet-and-the-2008-Election.aspx
Tumasjan, A., Sprenger, T.O., Sandner, P.G., and Welpe, I.M. Predicting elections with Twitter: What 140 characters reveal about political sentiment. In Proceedings of the Fourth International Association for the Advancement of Artificial Intelligence Conference on Weblogs and Social Media (Washington, D.C., May 23--26). Association for the Advancement of Artificial Intelligence, Menlo Park, CA, 2010, 178--185.
Turney, P.D. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Philadelphia, July 6--12). Association for Computational Linguistics, Stroudsburg, PA, 2002, 417--424.
Wilson, T., Wiebe, J., and Hoffmann, P. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Human Language Technology Conference, Conference on Empirical Methods in Natural Language Processing (Vancouver, Canada, Oct. 6--8). Association for Computational Linguistics, Stroudsburg, PA, 2005, 347--354.
Yu, B., Kaufmann, S., and Diermeier, D. Exploring the characteristics of opinion expressions for political opinion classification. In Proceedings of the 2008 International Conference on Digital Government Research (Montréal, May 18--21). Digital Government Society of North America, Marina del Rey, CA, 2008, 82--91.
Twitter, due to its growth and ever-expanding user base, has become a gold mine of information for analysts who mine tweet content as a data source for gauging public opinion. Even The New York Times is discussing this phenomenon [1]. But what does a Literary Digest poll, conducted in 1936, have to do with the current practice of mining data on Twitter to make projections? Gayo-Avello discusses the dangers that may result when negative results from data extracted from Twitter are ignored, stating that "current research risks turning social media analytics into the next Literary Digest poll."
For those unfamiliar with this reference, it is the classic demonstration of how data bias can seriously skew poll results. The poll was conducted by the Literary Digest for the 1936 US presidential election. The magazine's own readers were its query source, and readers were asked which candidate they preferred: New Deal candidate Franklin Roosevelt, or Republican Alf Landon. The readers were reached through telephone numbers listed in nationwide directories and through a list of registered car owners. The poll concluded that Landon would win in a landslide. However, Roosevelt ultimately won with 61 percent of the popular vote. This poll became the quintessential example of the need to mine unbiased data as the source for reliable projections.
In the 2008 US presidential election, projections based on tweet data were heavily in favor of Barack Obama, even in states where he eventually lost heavily to John McCain. Gayo-Avello's study looks for reasons for the faulty projections. He states:
My aim was not to compare Twitter data with pre-election polls or with the popular vote, as had been done previously, but to obtain predictions on a state-by-state basis. Additionally, unlike the other studies, my predictions were not to be derived from aggregating Twitter data but by detecting voting intention for every single user from their individual tweets.
Gayo-Avello "applied four different sentiment-analysis methods described in the most recent literature and carefully evaluated their performance." He then demonstrated that the results for the 2008 US presidential election "could not have been predicted from Twitter data alone through commonly applied methods." A substantial bibliography lists all of the references Gayo-Avello used in his research.
Relying heavily on statistical research and evaluation, Gayo-Avello, in the "Election Twitter Data Set" section, first describes an analysis in which simple Twitter data was used to overestimate President Obama's victory in the 2008 US presidential election. He then hypothesizes that Twitter users constitute a sample, and most likely a biased one. The article continues, focusing on whether data extracted from Twitter can be used to reliably predict outcomes, both current and future. Gayo-Avello provides a series of statistical analyses, described through detailed text and tables, and highlights both errors and corrections in his research.
The data used in his study was collected from users' unprotected tweets viewed in Twitter's public timeline. These tweets are easily accessed and collected through Twitter's own application programming interface (API). For Gayo-Avello's study, tweets were collected shortly after the 2008 election using Twitter's API search function, picking up 100 tweets per candidate, per county, per day. Only one query was used per candidate (Obama or Biden for the Democratic ticket, McCain or Palin for the Republican ticket), and the query was limited to tweets published by US residents within a specified time interval. This first view counted the number of appearances of a candidate in a user's tweets, assuming that the candidate mentioned more often would be the one the user would later vote for. This view would ultimately prove to be wrong.
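The mention-counting heuristic described above can be sketched in a few lines. This is a minimal illustration, not Gayo-Avello's actual code; the sample tweets and keyword lists are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical sample of collected tweets: (user, text) pairs.
tweets = [
    ("alice", "Obama gave a great speech tonight"),
    ("alice", "Watching Obama and Biden at the rally"),
    ("bob", "McCain has my vote and Palin too"),
]

# Map each ticket to the candidate keywords used in the queries.
tickets = {
    "Democrat": ("obama", "biden"),
    "Republican": ("mccain", "palin"),
}

def infer_vote_by_mentions(user_tweets):
    """Predict a user's vote as the ticket they mention most often."""
    counts = Counter()
    for text in user_tweets:
        words = text.lower().split()
        for ticket, names in tickets.items():
            counts[ticket] += sum(words.count(n) for n in names)
    return counts.most_common(1)[0][0] if counts else None

# Group tweets by user, then infer one vote per user.
by_user = defaultdict(list)
for user, text in tweets:
    by_user[user].append(text)

predictions = {u: infer_vote_by_mentions(ts) for u, ts in by_user.items()}
# alice -> "Democrat", bob -> "Republican"
```

As the review notes, the per-user framing (rather than aggregating all mentions) was what distinguished this study; the heuristic itself is what proved unreliable.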
In the next section, "Inferring Voter Intention," Gayo-Avello presents a second method using terms labeled either positive or negative. If a tweet contained more positive terms than negative, it was labeled positive; the opposite held for negative terms. Since each tweet in the collection applied to just one candidate, it was possible to count, for each user, the number of positive and negative tweets for each set of candidates. Three more elaborate procedures were also tested: vote and flip, semantic orientation, and polarity lexicon. The polarity lexicon was ultimately used to infer votes for all users in the dataset, since it was better at estimating McCain support and achieved better overall accuracy. However, polling results achieved by analyzing Twitter data were still far less accurate than the predictive results achieved through traditional polling methods. Selection bias had tainted the sample.
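The basic term-counting polarity rule can be sketched as follows. The toy lexicon here is invented for illustration; the study's actual lexicons were far larger and domain-adapted.

```python
# Toy polarity lexicon; a real study would use a large, adapted lexicon.
POSITIVE = {"great", "win", "hope", "support"}
NEGATIVE = {"fail", "lies", "worst", "against"}

def label_tweet(text):
    """Label a tweet positive/negative/neutral by counting lexicon hits."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

labels = [label_tweet(t) for t in [
    "Great speech and I support Obama",
    "That campaign is the worst",
    "Election day tomorrow",
]]
# ["positive", "negative", "neutral"]
```

Per-user vote inference then reduces to comparing, for each user, the counts of positive and negative tweets about each ticket.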
Gayo-Avello then tested for Twitter bias. The first test applied to the data checked the number of users per county, based on the premise that city dwellers and young adults are more likely to use Twitter and lean toward more liberal political opinions. This test looked for correlations between the percentage of users per county and population density, using the actual election results for each county. Results showed that, within Twitter, all of the states showed a positive correlation between population density and the Democratic vote in the 2008 US presidential election. Moreover, every state except Missouri and Texas showed a positive correlation between population density and Twitter use. (Gayo-Avello notes that, in his own 2008 study, younger people were clearly overrepresented on Twitter, which explains part of the faulty prediction.) Results also showed that Republican voters, or at least McCain supporters, tweeted much less than Democratic voters during the 2008 election, whether because they used Twitter less or because they were reluctant to express their political opinions publicly.
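The correlation test amounts to computing a Pearson coefficient between county-level population density and Democratic vote share (or Twitter use). A minimal sketch, with county figures invented for illustration:

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical county-level figures for one state:
# population density (people per square mile) and Democratic vote share.
density = [50, 120, 300, 800, 2500]
dem_share = [0.38, 0.42, 0.51, 0.58, 0.67]

r = pearson(density, dem_share)
# r close to +1 here: denser counties lean more Democratic in this toy data
```

A positive r across states, as the study found, is evidence that the Twitter sample skews toward dense, urban, more liberal counties.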
In conclusion, the outcome of the 2008 US presidential election could not have been predicted from user content published through Twitter by applying the most common sentiment-analysis methods. According to the article, "Due to the prevalence of younger users and their tilt toward Democrats and Obama, 'Democrats and Obama backers are more in evidence on the Internet than backers of other candidates or parties' [2]." The possible biases in the data are consistent with the conclusions drawn by Lenhart and Fox [3], and Rainie and Smith [2].
The article ends with "Lessons Learned." The problem with trying to predict the outcome of the 2008 US presidential election was not data collection per se, but two other things: the need to learn how to minimize the effect of bias in social media data, and the tendency to ignore how such data differs from the actual population.
Four lessons can be learned from this study. First, the fact that researchers can assemble very large datasets for mining does not make those data statistically representative of the overall population. Second, bias can be introduced through the relative youth of social networking users; researchers need to correct for this by knowing user ages within their samples. Third, a topic that appears frequently within a given sample, or is repeated often, can skew results within Twitter. And finally, a nonresponse within Twitter may play a more important role than is realized, especially if the lack of information mainly affects one group in particular.
Gayo-Avello concludes:
Until social media is used regularly by a broad segment of the voting population, its users cannot be considered a representative sample, and forecasts from the data will be of questionable value at best and incorrect in many cases. Until then, researchers using such data should identify the various strata of users (based on, say, age, income, gender, and race) to properly weigh their opinions according to the proportion of each of them in the population.
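The stratified weighting Gayo-Avello recommends is essentially post-stratification: each stratum's observed opinion is reweighted from its share of the sample to its share of the electorate. A minimal sketch, with all shares invented for illustration:

```python
# Hypothetical age strata: share of each stratum in the Twitter sample
# versus its share in the voting population.
sample_share = {"18-29": 0.55, "30-49": 0.30, "50+": 0.15}
population_share = {"18-29": 0.22, "30-49": 0.35, "50+": 0.43}

# Observed support for candidate A within each stratum of the sample.
support_a = {"18-29": 0.70, "30-49": 0.50, "50+": 0.40}

# Unweighted estimate: average over the (biased) sample as-is.
unweighted = sum(sample_share[s] * support_a[s] for s in sample_share)

# Post-stratified estimate: reweight each stratum to its population share.
weighted = sum(population_share[s] * support_a[s] for s in support_a)

# With young users overrepresented and tilting toward A, the unweighted
# estimate overstates A's support relative to the weighted one.
```

This corrects known demographic skew, but, as the article stresses, it cannot fix biases along dimensions the researcher has not measured.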
Statisticians, pollsters, media, and others interested in social behavior will find this paper enlightening. Follow-ups on the topic might investigate whether instinctive responses when tweeting, herd behavior, or the lack of critical evaluation by tweeters skew the results of a Twitter survey.
Gaur AYadav D(2025)A comprehensive analysis of forecasting elections using social media textMultimedia Tools and Applications10.1007/s11042-024-20528-wOnline publication date: 27-Jan-2025
de Almeida MVieira VAgnihotri Rde Freitas Souza R(2024)The Social Psychological Explanations for the Effects of Social Media Marketing and Traditional Media on Voting IntentionsJournal of Marketing Theory and Practice10.1080/10696679.2024.2415973(1-18)Online publication date: 20-Oct-2024
Brito KSilva Filho RAdeodato P(2024)Stop trying to predict elections only with twitter – There are other data sources and technical issues to be improvedGovernment Information Quarterly10.1016/j.giq.2023.10189941:1(101899)Online publication date: Mar-2024
Lima JSantana MCorrea ABrito K(2023)The use and impact of TikTok in the 2022 Brazilian presidential electionProceedings of the 24th Annual International Conference on Digital Government Research10.1145/3598469.3598485(144-152)Online publication date: 11-Jul-2023
Pokhriyal NValentino BVosoughi S(2023)Quantifying participation biases on social mediaEPJ Data Science10.1140/epjds/s13688-023-00405-612:1Online publication date: 28-Jul-2023
Dahish ZMaih S(2023)Crafting and Analyzing Advanced Social Monitoring Techniques for Digital Retail Platforms2023 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE)10.1109/CSDE59766.2023.10487650(1-5)Online publication date: 4-Dec-2023
Vicente P(2023)Sampling Twitter users for social science research: evidence from a systematic review of the literatureQuality & Quantity10.1007/s11135-023-01615-w57:6(5449-5489)Online publication date: 27-Jan-2023
Chauhan PSharma NSikka G(2023)On the importance of pre-processing in small-scale analyses of twitter: a case study of the 2019 Indian general electionMultimedia Tools and Applications10.1007/s11042-023-16158-383:7(19219-19258)Online publication date: 26-Jul-2023
Kumaran P Sridhar RNandy H(2023)Multi-layered perceptron based deep learning model for emotion extraction on monolingual text using intelligence feature engineering and filtering techniquesMultimedia Tools and Applications10.1007/s11042-023-15438-282:28(44037-44052)Online publication date: 27-Apr-2023