Universal Kernels

Published: 01 December 2006

Abstract

In this paper we investigate conditions on the features of a continuous kernel so that it may approximate an arbitrary continuous target function uniformly on any compact subset of the input space. A number of concrete examples are given of kernels with this universal approximating property.
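
The universal approximating property described in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes a Gaussian kernel (a standard example of a universal kernel) with a hypothetical width `sigma = 0.05`, the compact set [0, 1], and the continuous but non-smooth target |x - 0.5|. It interpolates the target by a linear combination of kernel sections K(., t) at equally spaced centers and reports the largest error on a fine grid. The helper names (`gaussian_kernel`, `solve_linear_system`, `sup_error`) are introduced here for illustration only.

```python
import math

def gaussian_kernel(x, y, sigma=0.05):
    """Gaussian kernel K(x, y) = exp(-(x - y)^2 / (2 sigma^2)); sigma is an assumed width."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy of the system
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def sup_error(n_centers, target, n_grid=1001):
    """Largest grid error of the kernel interpolant of `target` on [0, 1]."""
    centers = [i / (n_centers - 1) for i in range(n_centers)]
    gram = [[gaussian_kernel(s, t) for t in centers] for s in centers]
    coef = solve_linear_system(gram, [target(s) for s in centers])
    worst = 0.0
    for j in range(n_grid):
        x = j / (n_grid - 1)
        approx = sum(c * gaussian_kernel(x, t) for c, t in zip(coef, centers))
        worst = max(worst, abs(approx - target(x)))
    return worst

def target(x):
    """Continuous target with a kink at x = 0.5 (an illustrative choice)."""
    return abs(x - 0.5)

# Nested sets of centers: each grid of centers contains the previous one.
errors = {n: sup_error(n, target) for n in (5, 9, 17, 33)}
for n, err in errors.items():
    print(f"{n:3d} centers: sup error on grid = {err:.4f}")
```

A shrinking error as centers are added is consistent with universality, but it is only a numerical illustration on one target and one grid. The property itself is a statement about every continuous target and every compact subset of the input space.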

Publisher

JMLR.org

Published in JMLR Volume 7
