DOI: 10.1145/3038912.3052710

Query Expansion Based on a Feedback Concept Model for Microblog Retrieval

Published: 03 April 2017

Abstract

We tackle the problem of improving microblog retrieval algorithms by proposing a Feedback Concept Model for query expansion. In particular, we expand the query using knowledge derived from Probase so that the expanded query better reflects users' search intent, which allows microblog retrieval at the concept level rather than the term level. The proposed feedback concept model works in three steps: (i) we mine the concept information implicit in short texts using external knowledge bases; (ii) from the relevant concepts associated with the short texts, we generate a mixture model to estimate a concept language model; (iii) we use the concept language model for query expansion. Moreover, we incorporate a temporal prior into the proposed query expansion method to satisfy real-time information needs. Finally, we test the generalization power of the feedback concept model on the TREC Microblog corpora. The experimental results demonstrate that the proposed model significantly outperforms previous methods for microblog retrieval.
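
The three steps in the abstract can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes a standard two-component mixture (feedback model plus background model) fitted with EM, an exponential recency weight standing in for the temporal prior, and linear interpolation for the expansion. The function names, the parameters `lam`, `decay`, `alpha`, and `top_k`, and the toy concept labels are all hypothetical. The knowledge-base lookup that maps a post to concepts is assumed to have happened already and is not shown.

```python
import math
from collections import Counter


def estimate_concept_model(feedback, background, lam=0.5, decay=0.1, iters=30):
    """Estimate P(concept | feedback) with a two-component mixture via EM.

    feedback:   non-empty list of (concepts, age_in_days) pairs, one per post
    background: dict mapping concept -> collection-wide probability
    lam:        assumed fixed weight of the background component
    decay:      assumed rate of the exponential recency weight
    """
    # Recency-weighted concept counts: newer posts contribute more.
    counts = Counter()
    for concepts, age in feedback:
        weight = math.exp(-decay * age)
        for c in concepts:
            counts[c] += weight
    total = sum(counts.values())
    theta = {c: n / total for c, n in counts.items()}  # weighted MLE start

    for _ in range(iters):
        # E-step: posterior that an occurrence of c came from the feedback model.
        post = {}
        for c in counts:
            fb = (1 - lam) * theta[c]
            bg = lam * background.get(c, 1e-9)
            post[c] = fb / (fb + bg)
        # M-step: renormalize the counts attributed to the feedback model.
        norm = sum(counts[c] * post[c] for c in counts)
        theta = {c: counts[c] * post[c] / norm for c in counts}
    return theta


def expand_query(query_model, concept_model, alpha=0.5, top_k=2):
    """Interpolate the query model with the top_k concepts of the feedback model."""
    ranked = sorted(concept_model.items(), key=lambda kv: kv[1], reverse=True)
    top = dict(ranked[:top_k])
    z = sum(top.values())
    vocab = set(query_model) | set(top)
    return {
        w: (1 - alpha) * query_model.get(w, 0.0) + alpha * top.get(w, 0.0) / z
        for w in vocab
    }


# Toy data: concepts per feedback post, with the post's age in days.
feedback = [
    (["company", "smartphone", "product"], 0.0),
    (["company", "smartphone"], 1.0),
    (["fruit", "product"], 10.0),
]
background = {"company": 0.1, "smartphone": 0.05, "product": 0.4, "fruit": 0.45}

concept_model = estimate_concept_model(feedback, background)
expanded = expand_query({"apple": 1.0}, concept_model)
print(sorted(expanded.items(), key=lambda kv: kv[1], reverse=True))
```

In this toy setup the background component absorbs concepts that are common across the whole collection, so the feedback model concentrates on concepts specific to recent feedback posts. Those concepts are then mixed into the original query.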

    Published In

    WWW '17: Proceedings of the 26th International Conference on World Wide Web
    April 2017
    1678 pages
    ISBN: 9781450349130

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland

    Author Tags

    1. microblog retrieval
    2. pseudo-relevance feedback
    3. query expansion
    4. short-text conceptualization

    Qualifiers

    • Research-article

    Funding Sources

    • State Key Program of Joint Funds of the National Natural Science Foundation of China
    • National Basic Research Program of China

    Conference

    WWW '17
    Sponsor:
    • IW3C2

    Acceptance Rates

    WWW '17 Paper Acceptance Rate 164 of 966 submissions, 17%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (last 12 months): 9
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 31 Jan 2025

    Cited By

    • (2024) A Flexible Simplicity Enhancement Model for Knowledge Graph Completion Task. Artificial Intelligence, pp. 298-309. DOI: 10.1007/978-981-99-9119-8_27. Online publication date: 3-Feb-2024
    • (2023) MEGA: Meta-Graph Augmented Pre-Training Model for Knowledge Graph Completion. ACM Transactions on Knowledge Discovery from Data 18:1, pp. 1-24. DOI: 10.1145/3617379. Online publication date: 16-Oct-2023
    • (2023) Microblog Retrieval Based on Concept-Enhanced Pre-Training Model. ACM Transactions on Knowledge Discovery from Data 17:3, pp. 1-32. DOI: 10.1145/3552311. Online publication date: 22-Feb-2023
    • (2023) Leveraging Concept-Driven Pre-Training Model for Shot-Text Conceptualization Task. 2023 8th International Conference on Data Science in Cyberspace (DSC), pp. 329-336. DOI: 10.1109/DSC59305.2023.00054. Online publication date: 18-Aug-2023
    • (2023) Short-Text Conceptualization Based on Hyper-Graph Learning and Multiple Prior Knowledge. Social Media Processing, pp. 104-117. DOI: 10.1007/978-981-99-7596-9_8. Online publication date: 15-Nov-2023
    • (2023) Leverage Heterogeneous Graph Neural Networks for Short-Text Conceptualization. Social Media Processing, pp. 90-103. DOI: 10.1007/978-981-99-7596-9_7. Online publication date: 15-Nov-2023
    • (2022) Concept Commons Enhanced Knowledge Graph Representation. Knowledge Science, Engineering and Management, pp. 413-424. DOI: 10.1007/978-3-031-10983-6_32. Online publication date: 6-Aug-2022
    • (2022) History-Aware Expansion and Fuzzy for Query Reformulation. Artificial Intelligence, pp. 227-238. DOI: 10.1007/978-3-030-93049-3_19. Online publication date: 1-Jan-2022
    • (2021) The hybridised indexing method for research-based information retrieval. Journal of Information Science 49:2, pp. 319-334. DOI: 10.1177/0165551521999800. Online publication date: 15-Mar-2021
    • (2021) Hierarchical Concept-Driven Language Model. ACM Transactions on Knowledge Discovery from Data 15:6, pp. 1-22. DOI: 10.1145/3451167. Online publication date: 19-May-2021
