research-article

QuoteInspector: Gaining Insight about Social Media Discussions

Published: 01 August 2024

Abstract

Social media is today's greatest source of insight into the real world. Here, a major statement or quote by a public figure (world leader, politician, celebrity, scientist) can have wide-ranging impact, igniting extensive discussions and triggering reactions. It would be helpful to have tools for monitoring, querying, and inspecting the "flow" of social discourse. We introduce QuoteInspector, a system uniquely designed for efficient tracking and analysis of social media discussions around quotes. QuoteInspector leverages modern text embeddings and employs a clustering-based methodology for extracting topics from posts; it further integrates various NLP techniques for in-depth cluster analysis. Additionally, the system enhances the user experience by combining keyword-based search with relationship-based (structured) search for efficient and precise quote retrieval.
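
As a rough illustration of what "clustering-based topic extraction over embeddings" can look like, the following minimal sketch groups posts by cosine similarity. It is not QuoteInspector's implementation: the abstract names neither the embedding model nor the clustering algorithm. The sketch assumes a toy bag-of-words vector in place of a modern text embedding and a greedy single-pass ("leader") clustering in place of the system's method; the names embed, cosine, cluster_posts, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a text embedding (assumption): an L2-normalized
    bag-of-words vector stored as a sparse dict. A real system would call
    a learned embedding model here."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {word: c / norm for word, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(x * v.get(word, 0.0) for word, x in u.items())


def cluster_posts(posts, threshold=0.3):
    """Greedy single-pass clustering (hypothetical choice of algorithm).

    Each post joins the cluster whose first member (the "leader") is most
    similar to it, provided the similarity reaches `threshold`; otherwise
    the post starts a new cluster. Returns lists of post indices."""
    clusters = []  # each entry: {"leader": vector, "members": [indices]}
    for i, post in enumerate(posts):
        vec = embed(post)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["leader"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"leader": vec, "members": [i]})
        else:
            best["members"].append(i)
    return [cluster["members"] for cluster in clusters]


# Made-up example posts reacting to two different quotes.
posts = [
    "the climate speech was inspiring",
    "inspiring climate speech today",
    "tax cuts announced by the minister",
    "minister defends tax cuts",
]
print(cluster_posts(posts))  # [[0, 1], [2, 3]]
```

The single-pass design keeps the example short and does not require choosing the number of clusters in advance; the threshold controls how readily a new topic is opened.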


Published In

Proceedings of the VLDB Endowment, Volume 17, Issue 12
August 2024
837 pages

Publisher

VLDB Endowment



