DOI: 10.1145/3357384.3358004

New Online Kernel Ridge Regression via Incremental Predictive Sampling

Published: 03 November 2019

Abstract

Existing sampling approaches for online kernel ridge regression aim at approximating the kernel matrix as accurately as possible; the sampling is thus independent of the learning process, and updating the hypothesis costs time cubic in the sampling size. In this paper, we propose a new online kernel ridge regression via an incremental predictive sampling approach, which attains nearly optimal accumulated loss and performs efficiently at each round. We use the estimated ridge leverage scores of the labeled matrix, which depend on the accumulated loss at each round, to construct the predictive sampling distribution, and use this sampling probability for the Nyström approximation. To avoid calculating the inverse of the approximated kernel matrix directly, we use the Woodbury formula to accelerate the computation and adopt a truncated incremental singular value decomposition to update the generalized inverse of the intersection matrix. Our online kernel ridge regression has a time complexity of $O(tmk + k^3)$ for updating the hypothesis at round $t$, where $m$ is the sampling size and $k$ is the truncated rank of the intersection matrix, and it enjoys a regret bound of order $O(\sqrt{T})$, where $T$ is the time horizon. Experimental results show that the proposed method performs more stably and efficiently than online kernel ridge regression via existing online sampling approaches that directly approximate the kernel matrix.
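The abstract combines three standard ingredients: sampling columns proportionally to ridge leverage scores, building a Nyström approximation from the sampled intersection matrix, and solving the ridge system through a small Woodbury-style solve instead of an n-by-n inverse. The sketch below illustrates how these pieces fit together in an offline, batch setting; it is not the paper's online algorithm (which estimates the leverage scores incrementally from the accumulated loss), all function names are our own, and the leverage scores here are computed exactly from the full kernel matrix purely for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def ridge_leverage_scores(K, lam):
    """lambda-ridge leverage scores: diag(K @ inv(K + lam * I))."""
    n = K.shape[0]
    return np.diag(np.linalg.solve(K + lam * np.eye(n), K))

def truncated_invsqrt(K_mm, k):
    """Rank-k truncated inverse square root of the intersection matrix:
    returns R with R @ R.T equal to pinv(K_mm) on the top-k eigenspace."""
    w, V = np.linalg.eigh(K_mm)
    top = np.argsort(w)[::-1][:k]
    w, V = w[top], V[:, top]
    keep = w > 1e-10                      # drop numerically zero directions
    return V[:, keep] / np.sqrt(w[keep])

def nystrom_krr(X, y, m, k, lam, gamma=1.0, seed=0):
    """Kernel ridge regression on Nystrom features whose landmarks are
    drawn from a ridge-leverage-score distribution. Only a small (at most
    k x k) system is solved, never the full n x n kernel matrix."""
    rng = np.random.default_rng(seed)
    K = rbf_kernel(X, X, gamma)           # exact scores, illustration only
    p = np.maximum(ridge_leverage_scores(K, lam), 0.0)
    p /= p.sum()
    idx = rng.choice(len(X), size=m, replace=False, p=p)
    landmarks = X[idx]
    R = truncated_invsqrt(K[np.ix_(idx, idx)], k)     # m x k' mapping
    Z = rbf_kernel(X, landmarks, gamma) @ R           # Nystrom features
    # Push-through/Woodbury identity: (Z Z^T + lam I)^{-1} y is recovered
    # from the small primal solve (Z^T Z + lam I)^{-1} Z^T y.
    w = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)
    return lambda X_new: rbf_kernel(X_new, landmarks, gamma) @ R @ w

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
predict = nystrom_krr(X, y, m=40, k=20, lam=0.1)
print(predict(X[:5]))
```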
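The per-round update also leans on a truncated incremental SVD to keep a low-rank factorization of the growing intersection matrix current without recomputing it from scratch. Below is a minimal Brand-style rank-one update for appending one column to a maintained rank-k SVD, shown as a generic sketch of the technique rather than the paper's exact routine (the symmetric intersection matrix actually grows by both a row and a column per round); the function name is hypothetical.

```python
import numpy as np

def svd_append_column(U, s, V, c, k):
    """Truncated incremental SVD: given A ~= U @ np.diag(s) @ V.T with
    rank k, return the rank-k factors of [A, c] (Brand-style update)."""
    p = U.T @ c                       # component of c inside span(U)
    r = c - U @ p                     # residual orthogonal to span(U)
    rho = np.linalg.norm(r)
    j = r / rho if rho > 1e-12 else np.zeros_like(c)
    # SVD of a small (k+1) x (k+1) core matrix refreshes all factors.
    core = np.block([[np.diag(s),            p[:, None]],
                     [np.zeros((1, len(s))), np.array([[rho]])]])
    Uc, sc, Vct = np.linalg.svd(core)
    U_new = np.hstack([U, j[:, None]]) @ Uc
    V_pad = np.block([[V,                         np.zeros((V.shape[0], 1))],
                      [np.zeros((1, V.shape[1])), np.ones((1, 1))]])
    V_new = V_pad @ Vct.T
    return U_new[:, :k], sc[:k], V_new[:, :k]   # truncate back to rank k

# Toy usage: maintain a rank-5 SVD while one column streams in.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U, s, V = U[:, :5], s[:5], Vt[:5].T
U, s, V = svd_append_column(U, s, V, rng.normal(size=50), k=5)
print(U.shape, s.shape, V.shape)      # (50, 5) (5,) (11, 5)
```

From factors maintained this way, a rank-k generalized inverse is available cheaply as `V @ np.diag(1.0 / s) @ U.T`, which is one plausible reading of where the $O(tmk + k^3)$ per-round cost quoted in the abstract comes from.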



Information

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN: 9781450369763
DOI: 10.1145/3357384

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. incremental matrix computation
  2. online kernel ridge regression
  3. predictive sampling
  4. singular value decomposition

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China

Conference

CIKM '19

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


