Abstract
We propose an approach for learning dependence tree structure via copulas. A nonparametric algorithm for copula estimation is presented. A Chow-Liu-like method based on a copula-derived dependence measure is then proposed to estimate a maximum spanning tree of bivariate copulas associated with the bivariate dependence relations. The main advantage of the approach is that learning with the empirical copula focuses on the dependence relations among random variables: it requires neither knowledge of the properties of the individual variables nor the specification of a parametric family for the entire underlying distribution. Experiments on two real-world data sets show the effectiveness of the proposed method.
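The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact estimator or dependence measure, so everything specific here is an assumption: the empirical copula is approximated by rank-based pseudo-observations, pairwise dependence is scored by a histogram estimate of mutual information on the unit square, and the tree is built with Kruskal's algorithm run on descending weights. The function names (`pseudo_observations`, `copula_mutual_information`, `maximum_spanning_tree`, `dependence_tree`) and the `bins` parameter are hypothetical, not taken from the paper.

```python
import numpy as np


def pseudo_observations(x):
    """Rank-transform each column into (0, 1).

    Assumption: ranks scaled by n + 1 stand in for the empirical copula sample.
    """
    n = x.shape[0]
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1.0)


def copula_mutual_information(u, v, bins=8):
    """Histogram estimate of mutual information from a bivariate copula sample.

    Assumption: a plain 2-D histogram on the unit square, not the paper's estimator.
    """
    counts, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    p = counts / counts.sum()
    pu = p.sum(axis=1, keepdims=True)  # marginal over v, shape (bins, 1)
    pv = p.sum(axis=0, keepdims=True)  # marginal over u, shape (1, bins)
    mask = p > 0                       # skip empty cells to avoid log(0)
    return float(np.sum(p[mask] * np.log(p[mask] / (pu @ pv)[mask])))


def maximum_spanning_tree(weights):
    """Kruskal's algorithm on a dense symmetric matrix, heaviest edges first."""
    d = weights.shape[0]
    parent = list(range(d))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        ((weights[i, j], i, j) for i in range(d) for j in range(i + 1, d)),
        reverse=True,
    )
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it cannot form a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


def dependence_tree(x, bins=8):
    """Return tree edges (i, j), i < j, for the columns of the data matrix x."""
    u = pseudo_observations(np.asarray(x, dtype=float))
    d = u.shape[1]
    w = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            w[i, j] = w[j, i] = copula_mutual_information(u[:, i], u[:, j], bins)
    return maximum_spanning_tree(w)
```

Calling `dependence_tree(x)` on an `(n_samples, n_variables)` array returns a list of index pairs. Because only ranks enter the computation, applying a strictly increasing transform to any column leaves the result unchanged, which mirrors the abstract's point that the marginal properties of individual variables need not be known.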
Author information
Jian Ma received his B. Sc. and M. Sc. degrees in computer science from Hangzhou Dianzi University, Hangzhou, PRC in 2000 and 2003, respectively, and his Ph.D. degree in computer science and technology from Tsinghua University, Beijing, PRC in 2009. He is currently a post-doctoral researcher with the Department of Automation, Tsinghua University.
His research interests include machine learning, data analysis, and information theory.
Zeng-Qi Sun received his B. Sc. degree from the Department of Automatic Control, Tsinghua University, Beijing, PRC in 1966, and his Ph.D. degree in control engineering from Chalmers University of Technology, Gothenburg, Sweden in 1981. He is currently a full professor in the Department of Computer Science and Technology, Tsinghua University. He has authored or coauthored over 200 research papers and eight books on computer control theory, intelligent control, and robotics.
His research interests include intelligent control, robotics, fuzzy systems, neural networks, and evolutionary computing.
Sheng Chen received his B. Eng. degree from Huadong Petroleum Institute, Dongying, PRC in January 1982, and his Ph.D. degree from the City University, London, UK in September 1986, both in control engineering. He was awarded the Doctor of Sciences (D. Sc.) degree by the University of Southampton, Southampton, UK in 2004. From October 1986 to August 1999, he held research and academic appointments at the University of Sheffield, the University of Edinburgh, and the University of Portsmouth, all in the UK. Since September 1999, he has been with Electronics and Computer Science at the University of Southampton, UK, where he currently holds the post of professor of intelligent systems and signal processing. He is a Distinguished Adjunct Professor at King Abdulaziz University, Jeddah, Saudi Arabia. He has published over 450 research papers. He is a Chartered Engineer (CEng), a fellow of IET, and a fellow of IEEE. He is listed as a highly cited researcher in the engineering category in the database of the world's most highly cited researchers compiled by the Institute for Scientific Information (ISI) of the USA.
His research interests include wireless communications, adaptive signal processing for communications, machine learning, evolutionary computation methods, and intelligent control systems.
Hong-Hai Liu received his Ph.D. degree in robotics from King's College, University of London, UK in 2003. He joined the University of Portsmouth, UK in September 2005, where he currently holds the post of professor of intelligent systems. He previously held research appointments at the Universities of London and Aberdeen, UK, and project leader appointments in the industrial control and system integration industries. He has published over 200 research papers, three of which received Best Paper Awards. He is a senior member of IEEE.
His research interests include computational intelligence methods and applications with a focus on those approaches which could make contributions to the intelligent connection of perception to action.
Cite this article
Ma, J., Sun, ZQ., Chen, S. et al. Dependence tree structure estimation via copula. Int. J. Autom. Comput. 9, 113–121 (2012). https://doi.org/10.1007/s11633-012-0624-6
DOI: https://doi.org/10.1007/s11633-012-0624-6