DOI: 10.1145/3132847.3132878
research-article

Words are Malleable: Computing Semantic Shifts in Political and Media Discourse

Published: 06 November 2017

Abstract

Researchers have recently begun to study the detection of temporal shifts in the meaning of words. However, most (if not all) of these approaches restrict their efforts to uncovering change over time, neglecting other valuable dimensions such as social or political variability. We propose an approach for detecting semantic shifts between different viewpoints, where a viewpoint is broadly defined as a set of texts that share a specific metadata feature, which can be a time period but also a social entity such as a political party. For each viewpoint, we learn a semantic space in which each word is represented as a low-dimensional neural embedding vector. The challenge is to compare the meaning of a word in one space to its meaning in another space and to measure the size of the semantic shift. We compare the effectiveness of a measure based on optimal transformations between the two spaces with that of a measure based on the similarity of the word's neighbors in the respective spaces. Our experiments demonstrate that the combination of these two measures performs best. We show that semantic shifts occur not only over time but also across different viewpoints within a short period of time. For evaluation, we demonstrate how this approach captures meaningful semantic shifts and can help improve other tasks, such as contrastive viewpoint summarization and ideology detection (measured as classification accuracy) in political texts. We also show that the two laws of semantic change that were empirically shown to hold for temporal shifts also hold for shifts across viewpoints. These laws state that frequent words are less likely to shift meaning, while words with many senses are more likely to do so.
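
The neighbor-based measure mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the formula, so the code below makes its own assumptions: neighbors are the k nearest words by cosine similarity within each space, and stability is the Jaccard overlap of the two neighbor sets. The function names and the toy 2-D vectors are hypothetical and serve only as illustration; this is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(space, word, k):
    """Return the set of the k words closest to `word` within one space."""
    target = space[word]
    scored = [(cosine(target, vec), other)
              for other, vec in space.items() if other != word]
    scored.sort(reverse=True)
    return {other for _, other in scored[:k]}


def neighbor_stability(space_a, space_b, word, k=2):
    """Jaccard overlap of the word's neighbor sets in the two spaces.

    Assumption: this overlap stands in for the paper's neighbor-based
    measure; 1.0 means identical neighbors, 0.0 means no shared neighbors.
    """
    na = nearest_neighbors(space_a, word, k)
    nb = nearest_neighbors(space_b, word, k)
    union = na | nb
    return len(na & nb) / len(union) if union else 1.0


# Hypothetical toy embeddings for two viewpoints (e.g. two parties).
viewpoint_a = {
    "freedom": (1.0, 0.0),
    "liberty": (0.9, 0.1),
    "rights": (0.8, 0.3),
    "market": (0.0, 1.0),
    "trade": (0.1, 0.9),
}
viewpoint_b = {
    "freedom": (0.1, 1.0),  # now used in an economic context
    "liberty": (0.9, 0.2),
    "rights": (0.8, 0.2),
    "market": (0.0, 1.0),
    "trade": (0.2, 0.9),
}

for w in ("freedom", "liberty"):
    shift = 1.0 - neighbor_stability(viewpoint_a, viewpoint_b, w)
    print(f"{w}: shift = {shift:.2f}")
```

Because the measure only compares neighbor sets inside each space, it needs no alignment between the two spaces. The transformation-based measure from the abstract would instead map one space onto the other before comparing vectors, and it is not shown here.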

Published In

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN: 9781450349185
DOI: 10.1145/3132847
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ideology detection
  2. semantic shifts
  3. word embeddings
  4. word stability

Qualifiers

  • Research-article

Funding Sources

  • European Community's Seventh Framework Program
  • Netherlands Organization for Scientific Research
  • Nederlab

Conference

CIKM '17

Acceptance Rates

CIKM '17 Paper Acceptance Rate: 171 of 855 submissions, 20%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

Cited By

  • (2024) Semantic analysis and construction of English discourse based on neural network. Applied Mathematics and Nonlinear Sciences 9:1. DOI: 10.2478/amns-2024-2546. Online publication date: 3-Sep-2024
  • (2024) Evolving linguistic divergence on polarizing social media. Humanities and Social Sciences Communications 11:1. DOI: 10.1057/s41599-024-02922-9. Online publication date: 15-Mar-2024
  • (2024) Analyzing differences between discursive communities using dialectograms. Scientific Reports 14:1. DOI: 10.1038/s41598-024-72144-1. Online publication date: 28-Sep-2024
  • (2024) SMEC: Scene Mining for E-Commerce. Journal of Computer Science and Technology 39:1 (192-210). DOI: 10.1007/s11390-021-1277-0. Online publication date: 1-Feb-2024
  • (2024) Studying word meaning evolution through incremental semantic shift detection. Language Resources and Evaluation. DOI: 10.1007/s10579-024-09769-1. Online publication date: 9-Sep-2024
  • (2023) Detection of Change in the Senses of AI in Popular Discourse. Computational Linguistics and Intelligent Text Processing (46-58). DOI: 10.1007/978-3-031-23804-8_4. Online publication date: 26-Feb-2023
  • (2022) A Greek parliament proceedings dataset for computational linguistics and political analysis. Proceedings of the 36th International Conference on Neural Information Processing Systems (28874-28888). DOI: 10.5555/3600270.3602363. Online publication date: 28-Nov-2022
  • (2022) Applying artificial intelligence methods for solving problems of searching for semantic associates: case of toponym Moskva. Vestnik of Astrakhan State Technical University. Series: Management, computer science and informatics 2022:2 (41-51). DOI: 10.24143/2072-9502-2022-2-41-51. Online publication date: 29-Apr-2022
  • (2022) Discursive construction of migrant otherness on Facebook: A distributional semantics approach. Discourse & Society 34:2 (236-254). DOI: 10.1177/09579265221117014. Online publication date: 21-Oct-2022
  • (2022) Capturing stance dynamics in social media: open challenges and research directions. International Journal of Digital Humanities 3:1-3 (115-135). DOI: 10.1007/s42803-022-00043-w. Online publication date: 8-Mar-2022
