Universal Kernels

Authors:

Charles A. Micchelli,

Haizhang Zhang

The Journal of Machine Learning Research, Volume 7

Pages 2651 - 2667

Published: 01 December 2006

Abstract

In this paper we investigate conditions on the features of a continuous kernel so that it may approximate an arbitrary continuous target function uniformly on any compact subset of the input space. A number of concrete examples are given of kernels with this universal approximating property.
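As a rough numerical illustration of the approximation property described in the abstract, the sketch below fits linear combinations of Gaussian kernel sections K(x, c) = exp(-(x - c)^2 / (2 sigma^2)) to a continuous target on the compact interval [0, 1] and reports the largest error observed on a sample grid. The Gaussian kernel is used here only because it is widely known to be universal. The target function, the bandwidth `sigma`, the equally spaced centers, and the least-squares fit are all assumptions made for illustration and are not taken from the paper. The error is measured on a finite grid, so it only suggests, and does not prove, uniform convergence.

```python
import numpy as np


def gaussian_features(x, centers, sigma=0.1):
    """Evaluate the kernel sections K(x, c) = exp(-(x - c)^2 / (2 sigma^2)).

    Returns a matrix with one row per point in x and one column per center.
    sigma = 0.1 is an illustrative choice, not a value from the paper.
    """
    diff = x[:, None] - centers[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))


def sup_error(num_centers, target, grid):
    """Fit a combination of kernel sections and return the max error on grid."""
    # Hypothetical choice: equally spaced centers in the compact set [0, 1].
    centers = np.linspace(0.0, 1.0, num_centers)
    design = gaussian_features(grid, centers)
    values = target(grid)
    # Least squares stands in for "best approximation"; it minimizes the
    # squared error on the grid, not the sup norm itself.
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return float(np.max(np.abs(design @ coef - values)))


def target(x):
    """A continuous but non-smooth target, chosen only for illustration."""
    return np.abs(x - 0.5)


grid = np.linspace(0.0, 1.0, 401)
errors = {n: sup_error(n, target, grid) for n in (3, 9, 33)}

for n, err in errors.items():
    print(f"{n:3d} kernel sections: max error on grid = {err:.4f}")
```

With more kernel sections the observed error should shrink, which is the behavior the universal approximation property leads one to expect on a compact set. A non-universal kernel, such as a polynomial kernel of fixed low degree, would plateau instead.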

Publisher

JMLR.org
