DOI: 10.1145/2623330.2623679
research-article

Identifying and labeling search tasks via query-based Hawkes processes

Published: 24 August 2014

Abstract

We define a search task as a set of queries that serve the same user information need. Analyzing search tasks in user query streams is important for building modern tools that improve search engine performance. In this paper, we propose a probabilistic method for identifying and labeling search tasks based on two intuitive observations. First, queries that users issue close together in time, across many query sequences, are likely to belong to the same search task. Second, different users with the same information need tend to submit topically coherent search queries. To capture these intuitions, we directly model query temporal patterns using a special class of point processes called Hawkes processes, and we combine topic models with Hawkes processes to identify and label search tasks simultaneously. Essentially, Hawkes processes use their self-exciting property to identify search tasks when influence exists among a sequence of queries from an individual user, while the topic model exploits query co-occurrence across different users to discover the latent information needed for labeling search tasks. More importantly, the Hawkes processes and the topic model mutually reinforce each other in the unified model, which enhances the performance of both. We evaluate our method on both synthetic data and real-world query log data, and we also apply our model to query clustering and search task identification. Comparisons with state-of-the-art methods show that the improvement from our proposed approach is consistent and promising.
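
To make the self-exciting idea concrete, the following is a minimal sketch of a univariate Hawkes intensity, not the model from the paper. The exponential kernel, the parameter names `mu`, `alpha`, and `beta`, their default values, and the helper `trigger_probabilities` are all assumptions chosen for illustration; the abstract does not specify the kernel or the parametrization. The sketch shows how a recent query raises the intensity of the next one, and how each earlier query's share of that intensity can be read as the probability that it triggered the later query.

```python
import math


def kernel(dt, alpha=0.5, beta=1.0):
    # Assumed exponential kernel: influence of an event that happened dt time units ago.
    return alpha * math.exp(-beta * dt)


def intensity(t, history, mu=0.1, alpha=0.5, beta=1.0):
    # lambda(t) = mu + sum of kernel(t - t_i) over past events t_i < t.
    # mu is the background rate at which unrelated queries arrive.
    return mu + sum(kernel(t - ti, alpha, beta) for ti in history if ti < t)


def trigger_probabilities(j, times, mu=0.1, alpha=0.5, beta=1.0):
    # For event j, return [p_background, p_from_event_0, ..., p_from_event_{j-1}].
    # Each entry is that source's share of the total intensity at times[j].
    contributions = [kernel(times[j] - ti, alpha, beta) for ti in times[:j]]
    total = mu + sum(contributions)
    return [mu / total] + [c / total for c in contributions]


# Hypothetical query timestamps in minutes: three queries in a burst, one much later.
query_times = [0.0, 0.5, 1.0, 30.0]

burst = trigger_probabilities(2, query_times)  # third query, inside the burst
late = trigger_probabilities(3, query_times)   # fourth query, long after the burst
print("background share inside burst:", burst[0])
print("background share for late query:", late[0])
```

Under these assumed parameters, the third query is attributed mostly to the two queries just before it, while the late query is attributed almost entirely to the background rate, which is the kind of signal that would suggest it starts a new task.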

Supplementary Material

MP4 File (p731-sidebyside.mp4)

Published In

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Hawkes process
  2. latent Dirichlet allocation
  3. search task
  4. variational inference

Conference

KDD '14

Acceptance Rates

KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
