DOI: 10.1145/2740908.2742008
research-article

Scalable Preference Learning from Data Streams

Published: 18 May 2015

Abstract

We study the task of learning the preferences of online readers of news, based on their past choices. Previous work has shown that it is possible to model this situation as a competition between articles, where the most appealing articles of the day are those selected by the most users. The appeal of an article can be computed from its textual content, and the evaluation function can be learned from training data. In this paper, we show how this task can benefit from an efficient algorithm, based on hashing representations, which enables it to be deployed on high-intensity data streams. We demonstrate the effectiveness of this approach on four real-world news streams, compare it with standard approaches, and describe a new online demonstration based on this technology.
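
The combination described in the abstract, an appeal function computed from text, learned from preferences between articles, and made stream-friendly through hashing, can be illustrated with a minimal sketch. It assumes a signed hashing trick over whitespace tokens and a linear appeal score trained by stochastic gradient steps on a pairwise hinge loss. The function names, the CRC32 hash, the dimension `DIM`, the learning rate, and the toy headlines are all illustrative assumptions and are not taken from the paper.

```python
import zlib

DIM = 2 ** 18  # size of the hashed feature space (assumed value)


def hash_features(text, dim=DIM):
    """Map a text to a sparse {index: value} vector using a signed hashing trick."""
    vec = {}
    for token in text.lower().split():
        h = zlib.crc32(token.encode("utf-8"))
        index = (h >> 1) % dim                    # bucket index from the upper bits
        sign = 1.0 if (h & 1) == 0 else -1.0      # sign from the lowest bit
        vec[index] = vec.get(index, 0.0) + sign
    return vec


def score(weights, vec):
    """Linear appeal score: dot product of the weights with a sparse vector."""
    return sum(weights.get(i, 0.0) * v for i, v in vec.items())


def update(weights, preferred, other, lr=0.1, margin=1.0):
    """One SGD step on the pairwise hinge loss for (preferred, other).

    Returns True if the weights were changed.
    """
    diff = dict(preferred)
    for i, v in other.items():
        diff[i] = diff.get(i, 0.0) - v
    if score(weights, diff) < margin:
        for i, v in diff.items():
            weights[i] = weights.get(i, 0.0) + lr * v
        return True
    return False


# Toy stream of (popular, not popular) headline pairs, invented for illustration.
stream = [
    ("election results shock the nation", "council publishes meeting minutes"),
]

weights = {}  # never holds more than DIM entries, whatever the vocabulary size
for _ in range(100):
    changed = False
    for popular, unpopular in stream:
        if update(weights, hash_features(popular), hash_features(unpopular)):
            changed = True
    if not changed:
        break
```

Because tokens are hashed straight to indices, no vocabulary has to be stored or updated as new words arrive, which is the property that makes this kind of representation attractive for high-volume streams.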

Published In

WWW '15 Companion: Proceedings of the 24th International Conference on World Wide Web
May 2015
1602 pages
ISBN:9781450334730
DOI:10.1145/2740908

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hashing trick
  2. learning to rank
  3. news appeal
  4. news popularity
  5. online learning

Qualifiers

  • Research-article

Funding Sources

  • ThinkBig
  • Complacs

Conference

WWW '15
Sponsor:
  • IW3C2

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2024) Learning to Sketch: A Neural Approach to Item Frequency Estimation in Streaming Data. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:11 (7136-7153). DOI: 10.1109/TPAMI.2024.3388589. Online publication date: Nov-2024
  • (2023) Federated User Modeling from Hierarchical Information. ACM Transactions on Information Systems 41:2 (1-33). DOI: 10.1145/3560485. Online publication date: 9-Feb-2023
  • (2023) Differentially Private Distributed Frequency Estimation. IEEE Transactions on Dependable and Secure Computing 20:5 (3910-3926). DOI: 10.1109/TDSC.2022.3227654. Online publication date: 1-Sep-2023
  • (2023) SketchConf: A Framework for Automatic Sketch Configuration. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (2022-2035). DOI: 10.1109/ICDE55515.2023.00157. Online publication date: Apr-2023
  • (2021) Hierarchical Personalized Federated Learning for User Modeling. Proceedings of the Web Conference 2021 (957-968). DOI: 10.1145/3442381.3449926. Online publication date: 19-Apr-2021
