Abstract
In recent years, search engines have started presenting semantically relevant entity information together with document search results. Entity ranking systems are used to compute recommendations for related entities that a user might also be interested in exploring. Typically, this is done by ranking relationships between entities in a semantic knowledge graph using signals found in a data source as well as type annotations on the nodes and links of the graph. However, the process of producing these rankings can take a substantial amount of time. As a result, entity ranking systems typically lag behind real-world events and present relevant entities with outdated relationships to the search term, or even outdated entities that should be replaced with more recent relations or entities.
This paper presents a study using a real-world, stream-processing-based implementation of an entity ranking system to understand the effect of data timeliness on entity rankings. We describe the system and the data it processes in detail. Using a longitudinal case study, we demonstrate (i) that low-latency, large-scale entity relationship ranking is feasible using moderate resources and (ii) that stream-based entity ranking improves the freshness of related entities while maintaining relevance.
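The idea of ranking entity relationships from a stream, rather than from a periodic batch job, can be illustrated with a minimal sketch. It is not the system studied in the paper: the class `StreamingRelatedEntities`, the use of raw co-occurrence counts as the only ranking signal, and the placeholder entity names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


class StreamingRelatedEntities:
    """Hypothetical ranker: co-occurrence counts updated one event at a time."""

    def __init__(self):
        # entity -> Counter of the entities seen together with it
        self.cooccurrence = defaultdict(Counter)

    def update(self, entities):
        """Fold one event (the entities it mentions) into the counts."""
        for a, b in combinations(sorted(set(entities)), 2):
            self.cooccurrence[a][b] += 1
            self.cooccurrence[b][a] += 1

    def related(self, entity, k=3):
        """Return up to k entities most often seen with `entity`, best first."""
        counts = self.cooccurrence.get(entity, Counter())
        return [other for other, _ in counts.most_common(k)]


ranker = StreamingRelatedEntities()

# Earlier events: entity "A" is mostly mentioned together with "B".
for event in [["A", "B"], ["A", "B"], ["A", "C"]]:
    ranker.update(event)
before = ranker.related("A")

# Newer events arrive: "A" is now mentioned together with "C".
for event in [["A", "C"], ["A", "C"]]:
    ranker.update(event)
after = ranker.related("A")

print(before, after)
```

Because each event updates the counts as it arrives, the ranking for "A" changes as soon as the newer events are processed, which is the freshness effect the abstract refers to. A production system would combine many more signals and would likely discount old evidence; neither is modelled here.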
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Fischer, L., Blanco, R., Mika, P., Bernstein, A. (2015). Timely Semantics: A Study of a Stream-Based Ranking System for Entity Relationships. In: Arenas, M., et al. The Semantic Web - ISWC 2015. ISWC 2015. Lecture Notes in Computer Science, vol 9367. Springer, Cham. https://doi.org/10.1007/978-3-319-25010-6_28
DOI: https://doi.org/10.1007/978-3-319-25010-6_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25009-0
Online ISBN: 978-3-319-25010-6
eBook Packages: Computer Science (R0)