Structured Sparse Boosting for Graph Classification

Published: 25 August 2014

Abstract

Boosting is a highly effective algorithm that produces a linear combination of weak classifiers (a.k.a. base learners) to obtain high-quality classification models. In this article, we propose a generalized logit boost algorithm in which the base learners have structural relationships in the functional space. Although such relationships are generic, our work is particularly motivated by the emerging topic of pattern-based classification for semistructured data, including graphs. To incorporate the structure information efficiently, we design a general model in which an undirected graph captures the relationships among subgraph-based base learners. Our method applies both L1 and Laplacian-based L2 regularization to logit boosting to achieve model sparsity and smoothness in the functional space spanned by the base learners. We derive efficient coordinate-descent optimization algorithms for the new boosting formulation and prove theoretically that it exhibits a natural grouping effect for spatially nearby or overlapping base learners and that the resulting estimator is consistent. Additionally, motivated by the connection between logit boosting and logistic regression, we extend our structured sparse regularization framework to logistic regression for vectorial data whose features are structured. Through a comprehensive experimental study and comparison with the state of the art, we demonstrate the effectiveness of the proposed learning method.
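
As a rough sketch of the objective the abstract describes (the precise formulation and notation appear in the article body; the symbols below are illustrative assumptions, not the authors' own notation), the method can be read as minimizing the logit (logistic) loss of a linear combination of subgraph-based base learners h_j, penalized by an L1 term for sparsity and a Laplacian-based L2 term for smoothness over the base-learner graph:

    \min_{\beta}\; \sum_{i=1}^{n} \log\Big(1 + \exp\big(-y_i \sum_{j} \beta_j h_j(x_i)\big)\Big) \;+\; \lambda_1 \lVert \beta \rVert_1 \;+\; \lambda_2\, \beta^{\top} L\, \beta

Here L denotes the Laplacian of the undirected graph over base learners, \lambda_1 controls sparsity, and \lambda_2 encourages nearby or overlapping base learners to receive similar coefficients; the interplay of the two penalties is what yields the grouping effect mentioned above, in the same spirit as the Elastic Net.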

References

[1]
C. Baldassano, M. C. Iordan, D. M. Beck, and F.-F. Li. 2012. Voxel-level functional connectivity using spatial regularization. NeuroImage 63, 3 (2012), 1099--1106.
[2]
P. L. Bartlett and M. Traskin. 2006. AdaBoost is consistent. In Advances in Neural Information Processing Systems (NIPS’06). 105--112.
[3]
A. Bhaduri, R. Ravishankar, and R. Sowdhamini. 2004. Conserved spatially interacting motifs of protein superfamilies: Application to fold recognition and function annotation of genome data. Proteins 4, 54 (2004), 657--670.
[4]
C. Chang and C. Lin. 2001. LIBSVM: A Library for Support Vector Machines. Available at http://www.csie.ntu.edu.tw/∼cjlin/libsvm.
[5]
F. Chung. 1997. Spectral graph theory. CBMS Regional Conferences Series 92 (1997).
[6]
Wenyuan Dai, Qiang Yang, Gui rong Xue, and Yong Yu. 2007. Boosting for transfer learning. In Proceedings of the International Conference on Machine Learning. 193--200.
[7]
M. Deshpande, M. Kuramochi, and G. Karypis. 2005. Frequent sub-structure-based approaches for classifying chemical compounds. IEEE Transactions on Knowledge and Data Engineering (2005).
[8]
John Duchi and Yoram Singer. 2009. Boosting with structural sparsity. In Proceedings of the 26th Annual International Conference on Machine Learning. 297--304.
[9]
H. Fei and J. Huan. 2008. Structure feature selection for graph classification. In Proceedings of the ACM 17th Conference on Information and Knowledge Management.
[10]
H. Fei and J. Huan. 2009. L2 norm regularized feature kernel regression for graph data. In Proceeding of the 18th ACM Conference on Information and Knowledge Management. 593--600.
[11]
Y. Freund. 1995. Boosting a weak learning algorithm by majority. Information and Computation 121 (1995), 256--285.
[12]
Y. Freund and R. Shapire. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the 2nd European Conference on Computational Learning Theory. 23--37.
[13]
J. Friedman, T. Hastie, and R. Tibshirani. 2000. Additive logistic regression: A statistical view of boosting. Annals of Statistics 28, 2 (2000), 337--407.
[14]
J. Friedman, T. Hastie, and R. Tibshirani. 2009. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33 (2009).
[15]
J. J. Goeman, S. A. van de Geer, F. de Kort, and H. C. van Houwelingen. 2004. A global test for groups of genes: Testing association with a clinical outcome. Bioinformatics 20, 1 (2004), 93--99.
[16]
Y. Guo, S. Mahony, and D. K. Gifford. 2012. High resolution genome wide binding event finding and motif discovery reveals transcription factor spatial binding constraints. PLoS Computational Biology 8, 8 (Aug. 2012).
[17]
I. Guyon, J. Weston, S. Barnhill, and V. Vapnik. 2002 January. Gene selection for cancer classification using support vector machines. Machine Learning 46 (Jan. 2002), 389--422.
[18]
G. Haffari, Y. Wang, S. Wang, G. Mori, and F. Jiao. 2008. Boosting with incomplete information. In Proceedings of the International Conference on Machine Learning. 368--375.
[19]
T. Hastie, R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning. Springer-Verlag.
[20]
J. Huan, W. Wang, and J. Prins. 2003. Efficient mining of frequent subgraph in the presence of isomorphism. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM’03). 549--552.
[21]
L. Jacob, F. Bach, and J.-P. Vert. 2009. Clustered multi-task learning: A convex formulation. In Neural Information Processing Systems (NIPS).
[22]
N. Jin and W. Wang. 2011. LTS: Discriminative subgraph mining by learning from search history. In Proceedings of International Conference on Data Engineering (ICDE’11). 207--218.
[23]
N. Jin, C. Young, and W. Wang. 2009. Graph classification based on pattern co-occurrence. In Proceeding of the 18th ACM Conference on Information and Knowledge Management. 573--582.
[24]
R. Jorissen and M. Gilson. 2005. Virtual screening of molecular databases using a support vector machine. Journal of Chemical Information Modeling 45(3) (2005), 549--561.
[25]
M. Kanehisa, S. Goto, M. Hattori, K. F. Aoki-Kinoshita, M. Itoh, S. Kawashima, T. Katayama, M. Araki, and M. Hirakawa. 2006. From genomics to chemical genomics: New developments in KEGG. Nucleic Acids Research 34 (2006), D354--357.
[26]
H. Kashima, K. Tsuda, and A. Inokuchi. 2003. Marginalized kernels between labeled graphs. In Proceedings of the 20th International Conference on Machine Learning (ICML). 321--328.
[27]
K. Knight and W. Fu. 2000. Asymptotics for lasso-type estimators. Journal of the Royal Statisical Society 28, 5 (2000), 1356--1378.
[28]
R. I. Kondor, N. Shervashidze, and K. M. Borgwardt. 2009. The graphlet spectrum. In Proceedings of the International Conference of Machine Learning, Vol. 382. ACM, 67.
[29]
X. Kong, W. Fan, and P. S. Yu. 2011. Dual active feature and sample selection for graph classification. In Proceedings of ACM Knowledge Discovery and Data Mining (KDD’11). 654--662.
[30]
X. Kong and P. S. Yu. 2010. Semi-supervised feature selection for graph classification. In Proceedings of ACM Knowledge Discovery and Data Mining (KDD’10). 793--802.
[31]
Taku Kudo, Eisaku Maeda, and Yuji Matsumoto. 2004. An application of boosting to graph classification. In The Neural Information Processing Systems (NIPS’04).
[32]
C. Leslie, E. Eskin, and W. S. Noble. 2002. The spectrum kernel: A string kernel for SVM protein classification. In Proceedings of the Pacific Symposium on Biocomputing. 564--75.
[33]
C. Li and H. Li. 2008. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics 24, 9 (2008), 1175--1182.
[34]
P. Li. 2008. Adaptive base class boost for multi-class classification. In Proceedings of the International Conference on Machine Learning. 79--88.
[35]
L. Liang, V. Mandal, Y. Lu, and D. Kumar. 2008. Mcm-test: A fuzzy-set-theory-based approach to differential analysis of gene pathways. BMC Bioinformatics 9 (Suppl 6) (2008), S16.
[36]
C. Liu, X. Yan, H. Yu, J. Han, and P. S. Yu. 2005. Mining behavior graphs for “backtrace” of noncrashing bugs. In SDM.
[37]
P. M. Long and R. A. Servedio. 2008. Random classification noise defeats all convex potential boosters. In Proceedings of the International Conference on Machine Learning. 608--615.
[38]
T. M. Mitchell and McGraw Hill. 2010. Chapter 1 of Machine Learning: Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression. Preprint. 12--16 pages. Available at http://www.cs.cmu.edu/∼tom/mlbook/NBayesLogReg.pdf.
[39]
H. D. K. Moonesinghe, H. Valizadegan, S. Fodeh, and P.-N. Tan. 2007. A probabilistic substructure-based approach for graph classification. In Proceedings of the IEEE International Conference on Tools with Artificial Intelligence, Vol. 1. 346--349.
[40]
V. Mootha, C. Lindgren, K. Eriksson, A. Subramanian, S. Sihag, J. Lehar, P. Puigserver, E. Carlsson, M. Ridderstraale, E. Laurila et al. 2003. PGC-1: A-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nature Genetics 34, 3 (2003), 267C273.
[41]
A. G. Murzin, S. E. Brenner, T. Hubbard, and C. Chothia. 1995. SCOP: A structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247 (1995), 536--540.
[42]
S. Nowozin, K. Tsuda, T. Uno, T. Kudo, and G. Bakir. 2007. Weighted substructure mining for image analysis. In Proceedings of the IEEE Computer Vision and Pattern Recognition (CVPR’07). 1--8.
[43]
G. Pandey, S. Chawla, S. Poon, B. Arunasalam, and J. G. Davis. 2009. Association rules network: Definition and applications. Statistics Analysis Data Mining 1, 4 (2009), 260--279.
[44]
H. Saigo et al. 2007. gBoost: Graph Learning Toolbox for Matlab. Available at http://www.kyb.tuebingen.mpg.de/bs/people/nowozin/gboost/.
[45]
H. Saigo, N. Krämer, and K. Tsuda. 2008. Partial least squares regression for graph mining. In Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’08).
[46]
H. Saigo, S. Nowozin, T. Kadowaki, T. Kudo, and K. Tsuda. 2009. gBoost: A mathematical programming approach to graph classification and regression. Journal of Machine Learning 75, 1 (2009), 69--89.
[47]
T. Sandler, P. P. Talukdar, and L. H. Ungar. 2008. Regularized learning with networks of features. In The Neural Information Processing Systems (NIPS).
[48]
R. Schapire. 1990. The strength of weak learnability. Machine Learning 5 (1990), 197--227.
[49]
R. Schapire and Y. Singer. 1999. Improved boosting algorithms using confidence-rated predictions. Machine Learning 37 (1999), 297--336.
[50]
M. Thoma, H. Cheng, A. Gretton, J. Han, H.-P. Kriegel, A. J. Smola, L. Song, P. S. Yu, X. Yan, and K. M. Borgwardt. 2009. Near-optimal supervised feature selection among frequent subgraphs. In Proceedings of the 2009 SIAM Conference on Data Mining (SDM’09). Philadelphia, PA, 1076--1087.
[51]
R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight. 2005. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society 67, 1 (2005), 91--108.
[52]
K. Tsuda. 2007. Entire regularization paths for graph data. In Proceedings of the International Conference on Machine Learning. 919--926.
[53]
Z. Xiang, Y. Xi, U. Hasson, and P. Ramadge. 2009. Boosting with spatial regularization. In Proceedings of the Neural Information Processing Systems (NIPS’09). 2107--2115.
[54]
R. Yan, J. Tesic, and J. R. Smith. 2007. Model-shared subspace boosting for multi-label classification. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 834--843.
[55]
X. Yan, H. Cheng, J. Han, and P. Yu. 2008. Mining significant graph patterns by leap search. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 433--444.
[56]
M. Yuan and Y. Lin. 2006. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B 68 (2006), 49--67.
[57]
P. Zhao and B. Yu. 2006. Grouped and hierarchical model selection through composite absolute penalties. Annals of Statistics (2006), 3468--3497.
[58]
Y. Zhao, X. Kong, and P. S. Yu. 2011. Positive and unlabeled learning for graph classification. In Proceedings of the International Conference on Data Mining. 962--971.
[59]
L. Zheng, S. Wang, C. H. Lee, and Y. Liu. 2009. Information theoretic regularization for semi-supervised boosting. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1017--1026.
[60]
H. Zou and T. Hastie. 2005. Regularization and variable selection via the Elastic Net. Journal of the Royal Statistical Society B 67 (2005), 301--320.
[61]
H. Zou and M. Yuan. 2008. F∞ norm support vector machine. Statistica Sinica 18 (2008), 379--398.

    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 9, Issue 1
    October 2014, 209 pages
    ISSN: 1556-4681
    EISSN: 1556-472X
    DOI: 10.1145/2663598

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 25 August 2014
    Accepted: 01 February 2014
    Revised: 01 November 2013
    Received: 01 March 2013
    Published in TKDD Volume 9, Issue 1

    Author Tags

    1. Regularization
    2. boosting
    3. feature selection
    4. graph classification
    5. logistic regression
    6. semistructured data
    7. structural sparsity

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Cited By

    • (2018) An efficient heuristic approach for learning a set of composite graph classification rules. Intelligent Data Analysis 22, 3, 581-596. DOI: 10.3233/IDA-163343
    • (2018) Time-Variant Graph Classification. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 1-14. DOI: 10.1109/TSMC.2018.2830792
    • (2018) Representing Graphs as Bag of Vertices and Partitions for Graph Classification. Data Science and Engineering 3, 2, 150-165. DOI: 10.1007/s41019-018-0065-5
    • (2017) Incremental Subgraph Feature Selection for Graph Classification. IEEE Transactions on Knowledge and Data Engineering 29, 1, 128-142. DOI: 10.1109/TKDE.2016.2616305
    • (2016) Edge classification in networks. 2016 IEEE 32nd International Conference on Data Engineering (ICDE), 1038-1049. DOI: 10.1109/ICDE.2016.7498311
    • (2016) Fast training of a graph boosting for large-scale text classification. Proceedings of the 14th Pacific Rim International Conference on Trends in Artificial Intelligence, 638-650. DOI: 10.1007/978-3-319-42911-3_53
    • (2015) XEarth: A 3D GIS platform for managing massive city information. 2015 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), 1-6. DOI: 10.1109/CIVEMSA.2015.7158625
    • (2015) Virtual geographic environment based coach passenger flow forecasting. 2015 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), 1-6. DOI: 10.1109/CIVEMSA.2015.7158618
    • (2015) Traffic management and forecasting system based on 3D GIS. Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, 991-998. DOI: 10.1109/CCGrid.2015.62
