Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
article
Free access

Classification in Networked Data: A Toolkit and a Univariate Case Study

Published: 01 May 2007 Publication History

Abstract

This paper is about classifying entities that are interlinked with entities for which the class is known. After surveying prior work, we present NetKit, a modular toolkit for classification in networked data, and a case-study of its application to networked data used in prior machine learning research. NetKit is based on a node-centric framework in which classifiers comprise a local classifier, a relational classifier, and a collective inference procedure. Various existing node-centric relational learning algorithms can be instantiated with appropriate choices for these components, and new combinations of components realize new algorithms. The case study focuses on univariate network classification, for which the only information used is the structure of class linkage in the network (i.e., only links and some class labels). To our knowledge, no work previously has evaluated systematically the power of class-linkage alone for classification in machine learning benchmark data sets. The results demonstrate that very simple network-classification models perform quite well---well enough that they should be used regularly as baseline classifiers for studies of learning with networked data. The simplest method (which performs remarkably well) highlights the close correspondence between several existing methods introduced for different purposes---that is, Gaussian-field classifiers, Hopfield networks, and relational-neighbor classifiers. The case study also shows that there are two sets of techniques that are preferable in different situations, namely when few versus many labels are known initially. We also demonstrate that link selection plays an important role similar to traditional feature selection.

References

[1]
J. C. Almack. The influence of intelligence on the selection of associates. School and Society, 16: 529-530, 1922.
[2]
A. Bernstein, S. Clearwater, and F. Provost. The relational vector-space model and industry classification. In Proceedings of the Learning Statistical Models from Relational Data Workshop at the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI), 2003.
[3]
J. Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society, 36(2):192-236, 1974.
[4]
J. Besag. Statistical analysis of non-lattice data. The Statistician, 24(3):179-195, 1975.
[5]
J. Besag. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, 48(3): 259-302, 1986.
[6]
P. M. Blau. Inequality and Heterogeneity: A Primitive Theory of Social Structure. New York: Free Press, 1977.
[7]
A. Blum and S. Chawla. Learning from labeled and unlabeled data using graph mincuts. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML), pages 19-26, 2001.
[8]
A. Blum, J. Lafferty, R. Reddy, and M. R. Rwebangira. Semi-supervised learning using randomized mincuts. In Proceedings of the Twentyfirst International Conference on Machine Learning (ICML), pages 97-104, 2004.
[9]
H. Bott. Observation of play activities in a nursery school. Genetic Psychology Monographs, 4: 44-88, 1928.
[10]
Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 23(11):1222-1239, 2001.
[11]
S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pages 307-319, 1998.
[12]
C. Cortes, D. Pregibon, and C. T. Volinsky. Communities of interest. In Proceedings of the Fourth International Conference on Advances in Intelligent Data Analysis (IDA), pages 105-114, 2001.
[13]
R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter. Probabilistic networks and expert systems. Springer, 1999.
[14]
M. Craven, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, and C. Y. Quek. Learning to extract symbolic knowledge from the World Wide Web. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), pages 509-516, 1998.
[15]
L. De Raedt, H. Blockeel, L. Dehaspe, and W. Van Laer. Three companions for data mining in first order logic. In S. Dzeroski and N. Lavrac, editors, Relational Data Mining, pages 105-139. Berlin; New York: Springer, 2001.
[16]
I. S. Dhillon. Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 269-274, 2001.
[17]
R. L. Dobrushin. The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability and its Applications, 13(2):197-224, 1968.
[18]
P. Domingos and M. Richardson. Mining the network value of customers. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 57-66, 2001.
[19]
S. Dzeroski and N. Lavrac. Relational Data Mining. Berlin; New York: Springer, 2001.
[20]
T. Fawcett and F. Provost. Adaptive fraud detection. Data Mining and Knowledge Discovery, 3: 291-316, 1997.
[21]
T. Fawcett and F. Provost. Activity monitoring: Noticing interesting changes in behavior. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 53-62, 1999.
[22]
P. A. Flach and N. Lachiche. Naive Bayesian classification of structured data. Machine Learning, 57:233-269, 2004.
[23]
N. Friedman, L. Getoor, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1300-1309, 1999.
[24]
A. Galstyan and P. Cohen. Is guilt by association a bad thing? In Proceedings of the First International Conference on Intelligence Analysis (IA), 2005.
[25]
S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 6:721-741, 1984.
[26]
W. R. Gilks, S. Richardson, and D. J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC, 1995.
[27]
D. Greig, B. Porteous, and A. Seheult. Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society, 51(2):271-279, 1989.
[28]
D. D. Heckathorn and J. Jeffri. Jazz networks: Using respondent-driven sampling to study stratification in two jazz musician communities. Unpublished paper presented at American Sociological Association Annual Meeting, August 2003.
[29]
D. Heckerman, D. M. Chickering, C. Meek, R. Rounthwaite, and C. Kadie. Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research (JMLR), 1:49-75, 2000.
[30]
S. Hill, F. Provost, and C. Volinsky. Network-based marketing: Identifying likely adopters via consumer networks. Statistical Science, 22(2):256-276, 2006a.
[31]
S. Hill, D. K. Agarwal, R. Bell, and C. Volinsky. Building an effective representation for dynamic networks. Journal of Computational & Graphical Statistics, 15(3):584-608, 2006b.
[32]
G. E. Hinton and T. J. Sejnowski. Learning and relearning in Boltzmann machines. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 1: Foundations:282-317, 1986.
[33]
J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the United States of America, 79(8): 2554-2558, 1982.
[34]
Z. Huang, H. Chen, and D. Zeng. Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1): 116-142, 2004.
[35]
R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 5(3):267-287, 1983.
[36]
E. Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschrift f. Physik, 31:253-258, 1925. {German}.
[37]
D. Jensen and J. Neville. Data mining in social networks. In National Academy of Sciences Symposium on Dynamic Social Network Modeling and Analysis, 2002a.
[38]
D. Jensen and J. Neville. Linkage and autocorrelation cause feature selection bias in relational learning. In Proceedings of the Nineteenth International Conference on Machine Learning (ICML), pages 259-266, 2002b.
[39]
D. Jensen, J. Neville, and B. Gallagher. Why collective inference improves relational classification. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 593-598, 2004.
[40]
T. Joachims. Transductive learning via spectral graph partitioning. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 290-297, 2003.
[41]
J. Kleinberg and É. Tardos. Approximation algorithms for classification problems with pairwise relations: Metric labeling and Markov random fields. In Proceedings of the Fortieth Annual Symposium on Foundations of Computer Science (FOCS), pages 14-23, 1999.
[42]
R. Kohavi and G. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2): 273-324, 1997.
[43]
D. Koller and A. Pfeffer. Probabilistic frame-based systems. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), pages 580-587, 1998.
[44]
S. Kramer, N. Lavrac, and P. Flach. Propositionalization approaches to relational data mining. In S. Dzeroski and N. Lavrac, editors, Relational Data Mining, pages 262-291. Berlin; New York: Springer, 2001.
[45]
J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML), pages 282-289, 2001.
[46]
P. Langley. Crafting papers on machine learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 1207-1212, 2000.
[47]
P. Lazarsfeld and R. K. Merton. Friendship as a social process: A substantive and methodological analysis. In M. Berger, T. Abel, and C. H. Page, editors, Freedom and Control in Modern Society, pages 18-66. Van Nostrand, 1954.
[48]
C. P. Loomis. Political and occupational cleavages in a Hanoverian village. Sociometry, 9:316- 3333, 1946.
[49]
Q. Lu and L. Getoor. Link-based classification. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 496-503, 2003.
[50]
S. A. Macskassy and F. Provost. A simple relational classifier. In Proceedings of the Multi-Relational Data Mining Workshop (MRDM) at the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[51]
S. A. Macskassy and F. Provost. Suspicion scoring based on guilt-by-association, collective inference, and focused data access. In Proceedings of the First International Conference on Intelligence Analysis (IA), 2005.
[52]
A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. Information Retrieval, 3(2):127-163, 2000.
[53]
M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415-444, 2001.
[54]
K. Murphy, Y. Weiss, and M. I. Jordan. Loopy belief-propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 467-475, 1999.
[55]
J. Neville and D. Jensen. Iterative classification in relational data. In Proceedings of the Workshop on Learning Statistical Models from Relational Data at the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 13-20, 2000.
[56]
J. Neville and D. Jensen. Collective classification with relational dependency networks. In Proceedings of the Multi-Relational Data Mining Workshop (MRDM) at the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[57]
J. Neville and D. Jensen. Dependency networks for relational data. In Proceedings of the Fourth IEEE International Conference in Data Mining (ICDM), pages 170-177, 2004.
[58]
J. Neville and D. Jensen. Leveraging relational autocorrelation with latent group models. In Proceedings of the Fifth IEEE International Conference in Data Mining (ICDM), pages 322-329, 2005.
[59]
J. Neville and D. Jensen. Relational dependency networks. Journal of Machine Learning Research (JMLR), 8(Mar):653-692, 2007.
[60]
J. Neville, D. Jensen, L. Friedland, and M. Hay. Learning relational probability trees. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 625-630, 2003.
[61]
J. Neville, Ö. Simsek, D. Jensen, J. Komoroske, K. Palmer, and H. Goldberg. Using relational knowledge discovery to prevent securities fraud. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 449-458, 2005.
[62]
M. E. J. Newman. Mixing patterns in networks. Physical Review E, 67, 2003. 026126.
[63]
J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.
[64]
C. Perlich. Citation-based document classification. In Workshop on Information Technology and Systems (WITS), 2003.
[65]
C. Perlich and F. Provost. Aggregation-based feature invention and relational concept classes. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 167-176, 2003.
[66]
C. Perlich and F. Provost. Distribution-based aggregation for relational learning with identifier attributes. Machine Learning, 62(1/2):65-105, 2006.
[67]
C. Perlich, F. Provost, and J. Simonoff. Tree induction vs. logistic regression: A learning-curve analysis. Journal of Machine Learning Research (JMLR), 4:211-255, 2003.
[68]
A. Popescul and L. H. Ungar. Statistical relational learning for link prediction. In Proceedings of the Learning Statistical Models from Relational Data Workshop at the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI), 2003.
[69]
R. B. Potts. Some generalized order-disorder transformations. Cambridge Philosophic Society, 48: 106-109, 1952.
[70]
H. M. Richardson. Community of values as a factor in friendships of college and adult women. Journal of Social Psychology, 11:303-312, 1940.
[71]
M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1/2):107-136, 2006.
[72]
J. Rocchio. Relevance feedback in information retrieval. In G. Salton, editor, The SMART Retrieval System: Experiments in Automatic Document Processing, chapter 14, pages 313-323. Prentice-Hall, 1971.
[73]
A. Rosenfeld, R. Hummel, and S. Zucker. Scene labeling by relaxation operations. IEEE Transactions on Systems, Man and Cybernetics, 6:420-433, 1976.
[74]
L. J. Savage. The Foundations of Statistics. John Wiley and Sons, 1954.
[75]
E. Segal, H. Wang, and D. Koller. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics, 19:I264-I272, Jul 2003a.
[76]
E. Segal, R. Yelensky, and D. Koller. Genome-wide discovery of transcriptional modules from DNA sequence and gene expression. Bioinformatics, 19:I273-I282, Jul 2003b.
[77]
B. Taskar, P. Abbeel, and D. Koller. Discriminative probabilistic models for relational data. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 485-492, 2002.
[78]
B. Taskar, E. Segal, and D. Koller. Probabilistic classification and clustering in relational data. In Proceedings of the Seventeenthth International Joint Conference on Artificial Intelligence (IJCAI) , pages 870-878, 2001.
[79]
B. Taskar, V. Chatalbashev, and D. Koller. Learning associative Markov networks. In Proceedings of the Twentyfirst International Conference on Machine Learning (ICML), 2004.
[80]
K. Tumulty. Inside Bush's Secret Spy Net {electronic version}. Time, 167(21), 2006. Retrieved May 25, 2006, from http://www.time.com/time/archive/preview/0,10987,1194021, 00.html.
[81]
V. N. Vapnik. The support vector method of function estimation. In J. Suykens and J. Vandewalle, editors, Nonlinear Modeling: Advanced Black-Box Techniques, pages 55-86. Kluwer, Boston, 1998a.
[82]
V. N. Vapnik. Statistical Learning Theory. John Wiley, NY, 1998b.
[83]
M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Technical Report 649, University of California, Berkeley, 2003.
[84]
G. Winkler. Image Analysis, Random Fields and Markov Chain Monte Carlo Methods. Springer-Verlag, 2nd edition, 2003.
[85]
I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools with Java Implementations . Morgan Kaufmann, San Francisco, 2000.
[86]
X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 912-919, 2003.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image The Journal of Machine Learning Research
The Journal of Machine Learning Research  Volume 8, Issue
5/1/2007
1150 pages
ISSN:1532-4435
EISSN:1533-7928
Issue’s Table of Contents

Publisher

JMLR.org

Publication History

Published: 01 May 2007
Published in JMLR Volume 8

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)244
  • Downloads (Last 6 weeks)22
Reflects downloads up to 26 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Hub‐aware random walk graph embedding methods for classificationStatistical Analysis and Data Mining10.1002/sam.1167617:2Online publication date: 1-Apr-2024
  • (2022)Graph partitioning and visualization in graph mining: a surveyMultimedia Tools and Applications10.1007/s11042-022-13017-581:30(43315-43356)Online publication date: 1-Dec-2022
  • (2022)FLIP: Active Learning for Relational Network ClassificationMachine Learning and Knowledge Discovery in Databases10.1007/978-3-662-44845-8_1(1-18)Online publication date: 10-Mar-2022
  • (2021)Node classification over bipartite graphs through projectionMachine Language10.1007/s10994-020-05898-0110:1(37-87)Online publication date: 1-Jan-2021
  • (2020)An experimental study of graph-based semi-supervised classification with additional node informationKnowledge and Information Systems10.1007/s10115-020-01500-062:11(4337-4371)Online publication date: 1-Nov-2020
  • (2020)Relational Bayesian Model Averaging for Sentiment Analysis in Social NetworksMachine Learning, Optimization, and Data Science10.1007/978-3-030-64583-0_27(285-296)Online publication date: 19-Jul-2020
  • (2019)Removing Malicious Nodes from NetworksProceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems10.5555/3306127.3331708(314-322)Online publication date: 8-May-2019
  • (2019)Decoupled Smoothing on GraphsThe World Wide Web Conference10.1145/3308558.3313748(263-272)Online publication date: 13-May-2019
  • (2019)Leveraging collective intelligence for behavioral prediction in signed social networks through evolutionary approachInformation Systems Frontiers10.1007/s10796-017-9760-421:2(417-439)Online publication date: 1-Apr-2019
  • (2019)Learning edge weights in file co-occurrence graphs for malware detectionData Mining and Knowledge Discovery10.1007/s10618-018-0593-733:1(168-203)Online publication date: 1-Jan-2019
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media