research-article

Pairwise cross-domain factor model for heterogeneous transfer ranking

Authors:

Jianzhang He

WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining

Pages 113 - 122

https://doi.org/10.1145/2124295.2124311

Published: 08 February 2012

Abstract

Learning to rank arises in many information retrieval applications, ranging from Web search engines and online advertising to recommendation systems. Traditional ranking mainly focuses on a single type of data source, and effective modeling relies on a sufficiently large number of labeled examples, which require an expensive and time-consuming labeling process. However, in many real-world applications, ranking over multiple related heterogeneous domains is common: in some domains we may have a relatively large amount of training data, while in others we can collect only very little. Therefore, how to leverage labeled information from related heterogeneous domains to improve ranking in a target domain has become a problem of great interest. In this paper, we propose a novel probabilistic model, the pairwise cross-domain factor (PCDF) model, to address this problem. The proposed model learns latent factors (features) for multi-domain data in partially overlapped heterogeneous feature spaces. It is capable of learning homogeneous feature correlation, heterogeneous feature correlation, and pairwise preference correlation for cross-domain knowledge transfer. We also derive two PCDF variations to address two important special cases. Under the PCDF model, we derive a stochastic-gradient-based algorithm, which facilitates distributed optimization and can flexibly adopt different loss functions and regularization functions to accommodate different data distributions. Extensive experiments on real-world data sets demonstrate the effectiveness of the proposed model and algorithm.
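The abstract gives no equations, so the following is only an illustrative sketch of the general idea, not the paper's PCDF model. It assumes a linear latent-factor scorer in which features shared by both domains go through one projection `A`, domain-specific features go through a per-domain projection `B[domain]`, and a shared vector `w` scores the latent factors. It also assumes a logistic pairwise loss trained by plain SGD. All names (`init_model`, `sgd_step`, `A`, `B`, `w`) and the toy data are hypothetical.

```python
import math
import random

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def latent(model, domain, shared, specific):
    """Latent factors: shared projection A plus domain-specific projection B[domain]."""
    za = matvec(model["A"], shared)
    zb = matvec(model["B"][domain], specific)
    return [a + b for a, b in zip(za, zb)]

def score(model, domain, item):
    shared, specific = item
    z = latent(model, domain, shared, specific)
    return sum(w * v for w, v in zip(model["w"], z))

def init_model(n_shared, n_specific, k=3, seed=0):
    # Assumption for this toy: projections start at zero, only w is random.
    rng = random.Random(seed)
    return {
        "w": [rng.uniform(-1.0, 1.0) for _ in range(k)],
        "A": [[0.0] * n_shared for _ in range(k)],
        "B": {d: [[0.0] * n for _ in range(k)] for d, n in n_specific.items()},
    }

def sgd_step(model, domain, better, worse, lr=0.1, reg=1e-3):
    """One SGD step on log(1 + exp(-(score(better) - score(worse)))) with L2 decay."""
    d_sh = [a - b for a, b in zip(better[0], worse[0])]
    d_sp = [a - b for a, b in zip(better[1], worse[1])]
    w, A, B = model["w"], model["A"], model["B"][domain]
    z = latent(model, domain, d_sh, d_sp)   # latent difference (the map is linear)
    margin = sum(wi * zi for wi, zi in zip(w, z))
    g = -sigmoid(-margin)                   # derivative of the loss w.r.t. the margin
    w_old = list(w)                         # gradients use the pre-update w
    for i in range(len(w)):
        w[i] -= lr * (g * z[i] + reg * w[i])
        for j, v in enumerate(d_sh):
            A[i][j] -= lr * (g * w_old[i] * v + reg * A[i][j])
        for j, v in enumerate(d_sp):
            B[i][j] -= lr * (g * w_old[i] * v + reg * B[i][j])

def train(model, pairs, epochs=100, seed=0):
    rng = random.Random(seed)
    samples = list(pairs)
    for _ in range(epochs):
        rng.shuffle(samples)
        for domain, better, worse in samples:
            sgd_step(model, domain, better, worse)
    return model

# Toy preference pairs: (domain, preferred item, other item).
# Each item is (shared features, domain-specific features).
pairs = [
    ("source", ([1.0, 0.0], [1.0]), ([0.0, 1.0], [0.0])),
    ("source", ([1.0, 0.5], [0.0]), ([0.2, 0.5], [0.0])),
    ("source", ([0.5, 0.0], [1.0]), ([0.5, 0.6], [0.5])),
    ("target", ([0.5, 0.5], [1.0, 0.0]), ([0.5, 0.5], [0.0, 1.0])),
]
model = train(init_model(2, {"source": 1, "target": 2}), pairs)

# Held-out target pair that differs only in the shared features.
held_out = ("target", ([0.9, 0.1], [0.5, 0.5]), ([0.1, 0.9], [0.5, 0.5]))
```

In this toy setup the single target training pair carries no information about the shared features, so any preference the model expresses on `held_out` comes from the source pairs through the shared projection `A`. The zero initialization of the projections is a simplification that keeps the example easy to follow; a real implementation would more likely initialize them randomly.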

Index Terms

Information systems

Published In

WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining

February 2012, 792 pages

ISBN: 9781450307475

DOI: 10.1145/2124295

General Chairs: Eytan Adar (University of Michigan, USA), Jaime Teevan (Microsoft Research, USA)

Program Chairs: Eugene Agichtein (Emory University, USA), Yoelle Maarek (Yahoo! Research, Israel)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WSDM'12: Fifth ACM International Conference on Web Search and Data Mining

February 8 - 12, 2012

Seattle, Washington, USA

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%
