A resampling approach for interval-valued data regression

Published: 01 August 2012

Abstract

We consider interval-valued data, which frequently arise from the advanced technologies used in current data collection processes. Interval-valued data are observed as ranges rather than single values. In the last decade, several approaches to regression analysis of interval-valued data have been introduced, but little work has been done on statistical inference for the regression model. In this paper, we propose a new approach that fits a linear regression model to interval-valued data using a resampling idea. A key advantage is that it enables inference on the model, such as the overall model significance test and tests on individual coefficients. We demonstrate the proposed approach on simulated and real data examples and compare its performance with that of existing methods. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2012
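
The abstract does not spell out the resampling scheme, so the following is only a minimal sketch under an assumed one: each interval is replaced by a point drawn uniformly inside it, an ordinary least-squares line is fitted to the resulting single-valued data, and the coefficients are averaged over many such draws. The function names `ols_line` and `resample_fit`, the single-predictor restriction, the uniform draw, and the toy intervals are all illustrative assumptions rather than details taken from the paper, and the significance tests mentioned in the abstract are not reproduced here.

```python
import random
from statistics import fmean


def ols_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def resample_fit(x_intervals, y_intervals, n_draws=200, seed=0):
    """Fit OLS to n_draws single-valued data sets drawn from the intervals.

    Assumption (not from the paper): every point is drawn uniformly and
    independently within its (lower, upper) interval.
    Returns the averaged (b0, b1) and the list of per-draw fits.
    """
    rng = random.Random(seed)
    fits = []
    for _ in range(n_draws):
        xs = [rng.uniform(lo, hi) for lo, hi in x_intervals]
        ys = [rng.uniform(lo, hi) for lo, hi in y_intervals]
        fits.append(ols_line(xs, ys))
    b0 = fmean([f[0] for f in fits])
    b1 = fmean([f[1] for f in fits])
    return (b0, b1), fits


# Made-up toy intervals centred on the line y = 1 + 2x.
centers = [i * 0.5 for i in range(21)]
x_intervals = [(c - 0.5, c + 0.5) for c in centers]
y_intervals = [(1 + 2 * c - 1.0, 1 + 2 * c + 1.0) for c in centers]

(b0, b1), fits = resample_fit(x_intervals, y_intervals)
print(f"averaged intercept: {b0:.3f}, averaged slope: {b1:.3f}")
```

Keeping the per-draw fits alongside the averaged coefficients makes it possible to inspect how much the fitted line varies across resamples, which is the kind of information an inference procedure would build on.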

Published In

Statistical Analysis and Data Mining, Volume 5, Issue 4
August 2012
88 pages
ISSN:1932-1864
EISSN:1932-1872

Publisher

John Wiley & Sons, Inc.

United States

Author Tags

  1. interval-valued data
  2. linear regression
  3. resampling
  4. statistical inference

Qualifiers

  • Article

Cited By

  • (2023) IV-GNN: interval valued data handling using graph neural network. Applied Intelligence 53:5 (5697-5713). DOI: 10.1007/s10489-022-03780-1. Online publication date: 1-Mar-2023
  • (2022) On the Numerical Simulation of Exponential Decay and Outbreak Data Sets Involving Uncertainties. Numerical Methods and Applications (85-99). DOI: 10.1007/978-3-031-32412-3_8. Online publication date: 22-Aug-2022
  • (2018) Lasso-constrained regression analysis for interval-valued data. Advances in Data Analysis and Classification 9:1 (5-19). DOI: 10.1007/s11634-014-0164-8. Online publication date: 14-Dec-2018
  • (2017) Constrained center and range joint model for interval-valued symbolic data regression. Computational Statistics & Data Analysis 116:C (106-138). DOI: 10.1016/j.csda.2017.06.005. Online publication date: 1-Dec-2017