Pairwised Specific Distance Learning from Physical Linkages

Authors:

Zhi-Hua Zhou

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 9, Issue 3

Article No.: 20, Pages 1 - 27

https://doi.org/10.1145/2700405

Published: 01 April 2015

Abstract

In real tasks, good classification performance can usually be obtained only with a good distance metric; distance metric learning has therefore attracted significant attention in the past few years. Typical studies of distance metric learning examine how to construct a distance metric that separates training data points from different classes or satisfies a set of constraints (e.g., must-links and/or cannot-links). This task becomes challenging when only limited labeled training data points are available and no constraints are given explicitly. Moreover, most existing approaches aim to construct a global distance metric that applies to all data points, although different data points may have different properties and may require different distance metrics. We notice that data points in real tasks are often connected by physical links (e.g., people are linked with each other in social networks; personal webpages are often connected to other webpages, including nonpersonal webpages), but this linkage information has not been exploited in distance metric learning. In this article, we develop a pairwised specific distance (PSD) approach that exploits the structures of physical linkages and, in particular, captures the key observations that nonmetric and clique linkages imply the appearance of different or unique semantics, respectively. Rather than generating a global distance, PSD generates different distances for different pairs of data points; this property is desirable in applications involving complicated data semantics. We mainly present PSD for multi-class learning and further extend it to multi-label learning. Experimental results validate the effectiveness of PSD, especially in scenarios with very limited labeled training data points and no explicit constraints.
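
To make the contrast between a global distance and a pair-specific distance concrete, the following is a minimal sketch and not the PSD formulation itself, which the abstract does not give. It assumes a feature-weighted squared Euclidean distance and a hypothetical lookup table `pair_weights` that assigns some pairs their own weight vector; the function names, the weights, and the toy points are all illustrative assumptions.

```python
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]


def weighted_sq_distance(x: Vector, y: Vector, w: Vector) -> float:
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(wk * (xk - yk) ** 2 for xk, yk, wk in zip(x, y, w))


def pair_specific_distance(
    i: int,
    j: int,
    points: Sequence[Vector],
    pair_weights: Dict[Tuple[int, int], Vector],
    global_w: Vector,
) -> float:
    """Use the weights stored for the unordered pair (i, j) if present,
    otherwise fall back to the single global weight vector."""
    key = (min(i, j), max(i, j))
    w = pair_weights.get(key, global_w)
    return weighted_sq_distance(points[i], points[j], w)


# Toy data: three points in two dimensions (illustrative only).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
global_w = (1.0, 1.0)

# Hypothetical pair-specific weights: only the pair (0, 1) has its own vector,
# which stresses the first feature.
pair_weights = {(0, 1): (4.0, 1.0)}

d01 = pair_specific_distance(0, 1, points, pair_weights, global_w)
d10 = pair_specific_distance(1, 0, points, pair_weights, global_w)
d02 = pair_specific_distance(0, 2, points, pair_weights, global_w)
```

Under the global weights, points 1 and 2 are equally far from point 0; with the pair-specific entry, the pair (0, 1) is measured differently from the pair (0, 2). How such weights would be learned from linkage structure is the subject of the article and is not sketched here.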


Index Terms

Pairwised Specific Distance Learning from Physical Linkages
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information systems applications
    1. Data mining

Published In

ACM Transactions on Knowledge Discovery from Data Volume 9, Issue 3

TKDD Special Issue (SIGKDD'13)

April 2015

313 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/2737800

Editor:
Philip S. Yu
University of Illinois at Chicago, USA

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 2015

Accepted: 01 September 2014

Revised: 01 March 2014

Received: 01 March 2012

Published in TKDD Volume 9, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Natural Science Foundation of China
JiangsuSF (BK2011566)
National Science Foundation (CCF-1047621)
National Institutes of Health (1R01GM103309)
NSFC (61333014)
61105043
Collaborative Innovation Center of Novel Software Technology and Industrialization

Bibliometrics & Citations

Article Metrics

Total Citations: 8
Total Downloads: 264
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 1

Reflects downloads up to 12 Nov 2024

Citations

Cited By

Huai M, Zheng T, Miao C, Yao L, Zhang A (2022). On the Robustness of Metric Learning: An Adversarial Perspective. ACM Transactions on Knowledge Discovery from Data 16:5 (1-25). Online publication date: 5-Apr-2022.
https://dl.acm.org/doi/10.1145/3502726

Ye H, Zhan D, Li N, Jiang Y (2020). Learning Multiple Local Metrics: Global Consideration Helps. IEEE Transactions on Pattern Analysis and Machine Intelligence 42:7 (1698-1712). Online publication date: 1-Jul-2020.
https://doi.org/10.1109/TPAMI.2019.2901675

Gong F, Jiang L, Zhang H, Wang D, Guo X (2020). Gain ratio weighted inverted specific-class distance measure for nominal attributes. International Journal of Machine Learning and Cybernetics. Online publication date: 6-Mar-2020.
https://doi.org/10.1007/s13042-020-01112-8

Ye H, Zhan D, Jiang Y, Zhou Z (2019). What Makes Objects Similar: A Unified Multi-Metric Learning Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 41:5 (1257-1270). Online publication date: 1-May-2019.
https://doi.org/10.1109/TPAMI.2018.2829192

Jiang L, Li C (2019). Two improved attribute weighting schemes for value difference metric. Knowledge and Information Systems 60:2 (949-970). Online publication date: 1-Aug-2019.
https://dl.acm.org/doi/10.1007/s10115-018-1229-3

Wang R, Kwong S, Jia Y, Huang Z, Wu L (2018). Mutual Information Based K-Labelsets Ensemble for Multi-Label Classification. 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (1-7). Online publication date: Jul-2018.
https://doi.org/10.1109/FUZZ-IEEE.2018.8491677

Ye H, Zhan D, Si X, Jiang Y, Zhou Z (2016). What makes objects similar. Proceedings of the 30th International Conference on Neural Information Processing Systems (1243-1251). Online publication date: 5-Dec-2016.
https://dl.acm.org/doi/10.5555/3157096.3157235

Ye H, Zhan D, Jiang Y (2016). Instance specific metric subspace learning. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2272-2278). Online publication date: 12-Feb-2016.
https://dl.acm.org/doi/10.5555/3016100.3016216
