research-article

Clustering and exploring search results using timeline constructions

Authors:

Ricardo Baeza-Yates

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management

Pages 97 - 106

https://doi.org/10.1145/1645953.1645968

Published: 02 November 2009

Abstract

Time is an important dimension of any information space and can be very useful in information retrieval, particularly for clustering and exploring search results. Search result clustering is a feature integrated into some of today's search engines that allows users to explore search results further. However, little work has been done on exploiting the temporal information embedded in documents for the presentation, clustering, and exploration of search results along well-defined timelines.

In this paper, we present an add-on to traditional information retrieval applications that exploits the temporal information associated with documents to present and cluster them along timelines. Temporal information, expressed for example as date and time tokens or temporal references, appears in documents as part of the textual content or metadata. Using temporal entity extraction techniques, we show how temporal expressions are made explicit and used in the construction of multiple-granularity timelines. We discuss how hit-list-based search results can be clustered according to temporal aspects, anchored in the constructed timelines, and how time-based document clusters can be used to explore search results that include temporal snippets. We also outline a prototypical implementation and an evaluation that demonstrate the feasibility and functionality of our framework.
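The idea of anchoring search results in timelines of different granularity can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that temporal expressions are already normalized to ISO-like strings (`YYYY`, `YYYY-MM`, or `YYYY-MM-DD`) and uses a simple regular expression in place of a real temporal tagger. The names `DATE_RE`, `extract_dates`, `timeline_key`, and `cluster_by_time` are hypothetical and exist only for this example.

```python
import re
from collections import defaultdict

# Assumption: dates appear as "YYYY", "YYYY-MM", or "YYYY-MM-DD".
# A real system would use a temporal tagger rather than a regex.
DATE_RE = re.compile(r"\b(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?\b")


def extract_dates(text):
    """Return (year, month, day) string tuples; missing parts are None."""
    return [m.groups() for m in DATE_RE.finditer(text)]


def timeline_key(date, granularity):
    """Map a date tuple to a point on the timeline of the given granularity.

    Returns None when the date is too coarse for that timeline,
    e.g. a bare year cannot be placed on a day-level timeline.
    """
    year, month, day = date
    if granularity == "year":
        return year
    if granularity == "month":
        return f"{year}-{month}" if month else None
    if granularity == "day":
        return f"{year}-{month}-{day}" if month and day else None
    raise ValueError(f"unknown granularity: {granularity}")


def cluster_by_time(docs, granularity="year"):
    """Group document ids by the timeline points their text mentions."""
    clusters = defaultdict(set)
    for doc_id, text in docs.items():
        for date in extract_dates(text):
            key = timeline_key(date, granularity)
            if key is not None:
                clusters[key].add(doc_id)
    # Sort keys chronologically (ISO strings sort lexicographically).
    return {key: sorted(ids) for key, ids in sorted(clusters.items())}


docs = {
    "d1": "The conference opened on 2009-11-02 in Hong Kong.",
    "d2": "Submissions closed in 2009-06.",
    "d3": "An earlier workshop took place in 2007.",
}
by_year = cluster_by_time(docs, "year")
by_month = cluster_by_time(docs, "month")
by_day = cluster_by_time(docs, "day")
```

In this sketch a document can fall into several clusters if it mentions several dates, and a coarse expression such as a bare year only appears on the year-level timeline. Whether the paper's framework handles these cases the same way is not stated in the abstract.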

Index Terms

Clustering and exploring search results using timeline constructions
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Published In

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management

November 2009

2162 pages

ISBN:9781605585123

DOI:10.1145/1645953

General Chairs: David Cheung (University of Hong Kong, Hong Kong), Il-Yeol Song (Drexel University, USA)

Program Chairs: Wesley Chu (UCLA, USA), Xiaohua Hu (Drexel University, USA), Jimmy Lin (University of Maryland, USA)

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Conference

CIKM '09

CIKM '09: Conference on Information and Knowledge Management

November 2 - 6, 2009

Hong Kong, China

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
