research-article

Implicit entity linking in tweets: An ad-hoc retrieval approach

Published: 01 January 2019

Abstract

Within the context of Twitter analytics, the notion of implicit entity linking has recently been introduced to refer to the identification of a named entity that is central to the topic of a tweet but whose surface form does not appear in the tweet itself. Traditional entity linking revolves around an identified surface form of a potential entity; implicit entity linking instead relies on contextual clues to determine whether an implicit entity is present in a given tweet and, if so, which entity is being referenced. This paper introduces and publicly shares a comprehensive gold standard dataset for implicit entity linking and addresses the task of implicit entity linking itself. The dataset consists of 7,870 tweets, each classified as containing implicit entities, explicit entities, both, or neither. The implicit entities are linked to Wikipedia at three levels: a coarse-grained type (e.g., Person), a fine-grained type (e.g., Comedian), and the actual entity (e.g., Seinfeld). The proposed model formulates implicit entity linking as an ad-hoc document retrieval process in which the input query is the tweet to be implicitly linked and the document space is the set of textual descriptions of entities in the knowledge base. The novel contributions of our work include: 1) designing and collecting a gold standard dataset for the task of implicit entity linking; 2) defining the implicit entity linking process as an ad-hoc document retrieval task; and 3) proposing a neural embedding-based feature function that is interpolated with prior term dependency and entity-based feature functions to enhance implicit entity linking. We systematically compare our method with existing work in this area and show that it improves on a number of retrieval measures.
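
To make the retrieval formulation concrete, the following is a minimal Python sketch in which a tweet is the query and entity descriptions are the documents, ranked by a linear interpolation of a lexical score and an embedding-based score. It is an illustration only and not the model proposed in the paper: the knowledge base, the two-dimensional word vectors, the term-overlap score (standing in for the term dependency and entity-based feature functions), the weight `lam`, and every function name are assumptions introduced here.

```python
import math
import re

# Hypothetical toy knowledge base: entity name -> textual description.
KNOWLEDGE_BASE = {
    "Seinfeld": "American sitcom, often called a show about nothing",
    "Breaking Bad": "American crime drama about a chemistry teacher",
}

# Hypothetical 2-d word vectors; a real system would load trained embeddings.
EMBEDDINGS = {
    "laugh": (1.0, 0.0),
    "sitcom": (1.0, 0.0),
    "crime": (0.0, 1.0),
    "drama": (0.0, 1.0),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query_tokens, doc_tokens):
    """Fraction of distinct query terms that occur in the document."""
    query_terms = set(query_tokens)
    if not query_terms:
        return 0.0
    return len(query_terms & set(doc_tokens)) / len(query_terms)


def mean_vector(tokens):
    """Average the vectors of the tokens that have an embedding."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dims))


def embedding_score(query_tokens, doc_tokens):
    """Cosine similarity between the mean query and document vectors."""
    q, d = mean_vector(query_tokens), mean_vector(doc_tokens)
    if q is None or d is None:
        return 0.0
    dot = sum(a * b for a, b in zip(q, d))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0


def rank_entities(tweet, lam=0.5):
    """Rank entities by lam * lexical + (1 - lam) * embedding score."""
    query_tokens = tokenize(tweet)
    scored = []
    for entity, description in KNOWLEDGE_BASE.items():
        doc_tokens = tokenize(description)
        score = (lam * lexical_score(query_tokens, doc_tokens)
                 + (1 - lam) * embedding_score(query_tokens, doc_tokens))
        scored.append((entity, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_entities("That show about nothing still makes me laugh")
print(ranking[0][0])
```

The example tweet never mentions the entity by name, yet the description of Seinfeld ranks first because it shares terms with the tweet and its toy vector lies close to that of "laugh". Setting `lam` to 1.0 reduces the ranking to the lexical score alone.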

      Published In

      Applied Ontology  Volume 14, Issue 4
      Meaning in Context: Ontologically and linguistically motivated representations of objects and events
      2019
      140 pages

      Publisher

      IOS Press

      Netherlands

      Author Tags

      1. Implicit entity linking
      2. Semantic retrieval
      3. DBpedia
      4. Knowledge graph

      Qualifiers

      • Research-article

      Cited By

      • (2024) LaQuE: Enabling Entity Search at Scale. Advances in Information Retrieval, pp. 270-285. DOI: 10.1007/978-3-031-56060-6_18. Online publication date: 24-Mar-2024
      • (2023) Few-shot entity linking of food names. Information Processing and Management: an International Journal, 60:5. DOI: 10.1016/j.ipm.2023.103463. Online publication date: 1-Sep-2023
      • (2022) A systemic functional linguistics approach to implicit entity recognition in tweets. Information Processing and Management: an International Journal, 59:4. DOI: 10.1016/j.ipm.2022.102957. Online publication date: 1-Jul-2022
      • (2021) Learning to rank implicit entities on Twitter. Information Processing and Management: an International Journal, 58:3. DOI: 10.1016/j.ipm.2021.102503. Online publication date: 1-May-2021
      • (2020) Towards Linking Camouflaged Descriptions to Implicit Products in E-commerce. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 901-910. DOI: 10.1145/3397271.3401067. Online publication date: 25-Jul-2020
      • (2019) Meaning in Context. Applied Ontology, 14:4, pp. 335-341. DOI: 10.3233/AO-190221. Online publication date: 1-Jan-2019
