Sparse multiple instance learning as document classification

Published in Multimedia Tools and Applications


Abstract

This work focuses on multiple instance learning (MIL) with sparse positive bags, which we call sparse MIL. A structural representation is presented to encode both instances and bags. This representation leads to a non-i.i.d. MIL algorithm, miStruct, which uses a structural similarity to compare bags. Furthermore, MIL with this representation is shown to be equivalent to a document classification problem. Document classification suffers from the same difficulty: only a few paragraphs or words are useful in revealing the category of a document. Building on the TF-IDF representation, which has excellent empirical performance in document classification, the miDoc method is proposed. The proposed methods achieve significantly higher accuracies and AUC (area under the ROC curve) than the state of the art on a large number of sparse MIL problems, and the document classification analogy explains their efficacy.
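To make the analogy concrete, the following is a minimal sketch of the bag-to-document mapping described above. It is an illustration under stated assumptions, not the paper's implementation: instances are assumed to be real-valued feature vectors, a k-means codebook plays the role of the vocabulary, and an ordinary supervised classifier is trained on the resulting bag vectors. The names bags_to_tfidf and codebook are illustrative.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def bags_to_tfidf(bags, codebook):
    """Map each bag (an array of instances) to a TF-IDF vector.

    Each instance plays the role of a word: it is assigned to its nearest
    codeword, so a bag becomes a histogram of codeword counts (term
    frequencies). IDF then down-weights codewords that appear in most
    bags, just as common words carry little class information in documents.
    """
    k = codebook.n_clusters
    counts = np.zeros((len(bags), k))
    for i, bag in enumerate(bags):
        words = codebook.predict(bag)             # instance -> codeword index
        counts[i] = np.bincount(words, minlength=k)
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    df = (counts > 0).sum(axis=0)                 # bags containing each codeword
    idf = np.log(len(bags) / np.maximum(df, 1))
    return tf * idf

# Toy usage: 40 bags of 8-dimensional instances with random bag labels.
rng = np.random.default_rng(0)
bags = [rng.normal(size=(int(rng.integers(5, 20)), 8)) for _ in range(40)]
y = rng.integers(0, 2, size=40)
codebook = KMeans(n_clusters=16, n_init=10).fit(np.vstack(bags))
X = bags_to_tfidf(bags, codebook)
clf = LinearSVC(dual=False).fit(X, y)             # any standard classifier works on bag vectors

In this picture, a sparse positive bag is a long document in which only a few words are discriminative, and the IDF factor suppresses codewords that occur in almost every bag.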


Notes

  1. One instance can appear in more than one bag; e.g., \({x_{1}^{1}}\) and \({x_{2}^{1}}\) can have the same values.

  2. The moralization used here differs from the standard one for DAGs in two ways: first, cycles are permitted in G = (X, E); second, multiple marriage edges may exist between two instances. So we are slightly abusing this concept.

  3. This means that the component of every \(z_{i}\) corresponding to that instance is non-zero for most bags; see the IDF formula sketched after these notes.

  4. For convenience of presentation, we report AUC as a percentage.
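For reference, note 3 can be read through the standard inverse document frequency weight; this is the textbook definition, not a formula quoted from this paper. With \(N\) bags and \(\mathrm{df}(t)\) of them containing codeword \(t\),

\[ \mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}, \]

so an instance that is non-zero in most bags has \(\mathrm{df}(t) \approx N\) and receives a weight near zero, contributing little to the bag representation.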


Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant Nos. 61300163 and 61422203.

Author information

Corresponding author

Correspondence to Shengye Yan.

About this article

Cite this article

Yan, S., Zhu, X., Liu, G. et al. Sparse multiple instance learning as document classification. Multimed Tools Appl 76, 4553–4570 (2017). https://doi.org/10.1007/s11042-016-3567-z
