DOI: 10.1145/3230905.3230944
research-article

A study of the application of statistical methods for Big data

Published: 02 May 2018

Abstract

Applying analysis and classification methods to big data is difficult. Several proposals randomly divide the population into b sub-samples and aggregate the results with an estimator based on the average of the parameters estimated on these sub-samples. This paper seeks a solution that reduces computation by selecting a smaller number b* of sub-samples while keeping the same precision. The approach can be applied to several methods to assess its relevance.
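The divide-and-aggregate idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say how the b* sub-samples are chosen, so the sketch assumes they are drawn uniformly at random, and it uses the sample mean as a stand-in for whatever parameter a given method estimates. The function name `divided_estimate` and its arguments are hypothetical.

```python
import random
import statistics


def divided_estimate(data, b, b_star, estimator=statistics.fmean, seed=0):
    """Average an estimator over b_star of b random, disjoint sub-samples."""
    if not 1 <= b_star <= b <= len(data):
        raise ValueError("need 1 <= b_star <= b <= len(data)")
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Deal the shuffled observations into b disjoint sub-samples.
    subsamples = [shuffled[i::b] for i in range(b)]
    # Assumption: the b_star sub-samples are picked uniformly at random;
    # the abstract does not describe the actual selection rule.
    chosen = rng.sample(subsamples, b_star)
    # Aggregate by averaging the per-sub-sample estimates.
    return statistics.fmean([estimator(s) for s in chosen])


# Synthetic population, used only for illustration.
_rng = random.Random(1)
population = [_rng.gauss(5.0, 2.0) for _ in range(100_000)]

full = divided_estimate(population, b=100, b_star=100)    # uses all sub-samples
reduced = divided_estimate(population, b=100, b_star=10)  # uses one tenth of them
```

With equal-sized sub-samples, `full` coincides with the overall sample mean, while `reduced` processes only a tenth of the data; how much precision is lost in practice depends on the estimator and on how b* is chosen.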


Cited By

  • (2023) Classical and fast parameters tuning in nearest neighbors with stop condition. OPSEARCH 60:3 (1063-1081). DOI: 10.1007/s12597-023-00650-3. Online publication date: 4-May-2023


Published In

LOPAL '18: Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications
May 2018
357 pages
ISBN: 9781450353045
DOI: 10.1145/3230905

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Latent class analysis
  2. classification method
  3. massive data

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

LOPAL '18
LOPAL '18: Theory and Applications
May 2 - 5, 2018
Rabat, Morocco

Acceptance Rates

LOPAL '18 Paper Acceptance Rate 61 of 141 submissions, 43%;
Overall Acceptance Rate 61 of 141 submissions, 43%
