
Label Propagation through Linear Neighborhoods

Published: 01 January 2008

Abstract

In many practical data mining applications, such as text classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning algorithms have attracted considerable interest from the data mining and machine learning communities. In recent years, graph-based semi-supervised learning has become one of the most active research areas in the semi-supervised learning community. In this paper, a novel graph-based semi-supervised learning approach is proposed based on a linear neighborhood model, which assumes that each data point can be linearly reconstructed from its neighborhood. Our algorithm, named Linear Neighborhood Propagation (LNP), can propagate the labels from the labeled points to the whole data set through these linear neighborhoods with sufficient smoothness. A theoretical analysis of the properties of LNP is presented in this paper. Furthermore, we also derive an easy way to extend LNP to out-of-sample data. Promising experimental results are presented for synthetic data, digit, and text classification tasks.
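The two-stage scheme the abstract describes — reconstruct each point linearly from its neighbors, then propagate labels over the resulting graph — can be sketched as follows. This is a minimal, hypothetical NumPy illustration, not the paper's implementation: it assumes Euclidean k-nearest neighborhoods and the common iterative update F ← αWF + (1 − α)Y, and it approximates the constrained reconstruction weights (the paper solves a quadratic program with non-negativity and sum-to-one constraints) by a regularized least-squares solve followed by clipping and renormalization.

```python
import numpy as np

def lnp(X, Y, k=5, alpha=0.99, n_iter=100):
    """Sketch of Linear Neighborhood Propagation.

    X: (n, d) data matrix; Y: (n, c) one-hot label matrix with
    all-zero rows for unlabeled points. Returns soft labels F (n, c).
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors of x_i (excluding x_i itself)
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dist)[1:k + 1]
        # reconstruction weights: minimize ||x_i - sum_j w_j x_j||^2
        # subject to sum_j w_j = 1 (the paper additionally enforces w_j >= 0
        # via a QP; here we clip negatives as a crude substitute)
        Z = X[nbrs] - X[i]                  # center neighbors on x_i
        G = Z @ Z.T + 1e-3 * np.eye(k)      # regularized local Gram matrix
        w = np.linalg.solve(G, np.ones(k))
        w = np.maximum(w, 0.0)
        W[i, nbrs] = w / w.sum()            # rows of W sum to one
    # label propagation: F <- alpha * W F + (1 - alpha) * Y
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * (W @ F) + (1 - alpha) * Y
    return F
```

On well-separated clusters with one labeled point per cluster, the neighborhood graph is block-diagonal, so label mass only flows within each cluster and the labeled points keep their own classes.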




Published In

IEEE Transactions on Knowledge and Data Engineering  Volume 20, Issue 1
January 2008
140 pages

Publisher

IEEE Educational Activities Department

United States


Author Tags

  1. Data mining
  2. Graph labeling
  3. Machine learning
  4. Mining methods and algorithms

Qualifiers

  • Research-article


Cited By

  • (2024) Transductive classification via patch alignment, AI Communications, vol. 37, no. 1, pp. 37-51, DOI: 10.3233/AIC-220179, online 21-Mar-2024
  • (2024) Online Preference Weight Estimation Algorithm with Vanishing Regret for Car-Hailing in Road Network, Proc. 30th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, pp. 863-871, DOI: 10.1145/3637528.3671664, online 25-Aug-2024
  • (2024) Online Learning for Data Streams With Incomplete Features and Labels, IEEE Trans. Knowledge and Data Engineering, vol. 36, no. 9, pp. 4820-4834, DOI: 10.1109/TKDE.2024.3374357, online 1-Sep-2024
  • (2024) GNN Cleaner: Label Cleaner for Graph Structured Data, IEEE Trans. Knowledge and Data Engineering, vol. 36, no. 2, pp. 640-651, DOI: 10.1109/TKDE.2023.3288002, online 1-Feb-2024
  • (2024) Robust embedding regression for semi-supervised learning, Pattern Recognition, vol. 145, DOI: 10.1016/j.patcog.2023.109894, online 1-Jan-2024
  • (2024) Online semi-supervised learning of composite event rules by combining structure and mass-based predicate similarity, Machine Learning, vol. 113, no. 3, pp. 1445-1481, DOI: 10.1007/s10994-023-06447-1, online 1-Mar-2024
  • (2024) Semi-supervised fuzzy broad learning system based on mean-teacher model, Pattern Analysis & Applications, vol. 27, no. 1, DOI: 10.1007/s10044-024-01217-8, online 28-Feb-2024
  • (2023) Optimal block-wise asymmetric graph construction for graph-based semi-supervised learning, Proc. 37th Int'l Conf. Neural Information Processing Systems, pp. 71135-71149, DOI: 10.5555/3666122.3669237, online 10-Dec-2023
  • (2023) FlatMatch, Proc. 37th Int'l Conf. Neural Information Processing Systems, pp. 18474-18494, DOI: 10.5555/3666122.3666934, online 10-Dec-2023
  • (2023) Temporal-frequency co-training for time series semi-supervised learning, Proc. 37th AAAI Conf. Artificial Intelligence, pp. 8923-8931, DOI: 10.1609/aaai.v37i7.26072, online 7-Feb-2023
