Online and Distributed Robust Regressions with Extremely Noisy Labels

Authors:

Arnold P. Boedihardjo,

Chang-Tien Lu

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 16, Issue 3

Article No.: 41, Pages 1 - 24

https://doi.org/10.1145/3473038

Published: 22 October 2021

Abstract

In today’s era of big data, robust least-squares regression becomes more challenging when extremely corrupted labels coincide with explosive growth in dataset size. Traditional robust methods can handle the noise but face several challenges when applied to huge datasets, including (1) the computational infeasibility of handling an entire dataset at once, (2) the existence of heterogeneously distributed corruption, and (3) the difficulty of estimating corruption when the data cannot be loaded in full. This article proposes online and distributed robust regression approaches, both of which address all of the above challenges concurrently. Specifically, the distributed algorithm optimizes the regression coefficients of each data block via heuristic hard thresholding and combines the per-block estimates in a distributed robust consolidation. In addition, an online version of the distributed algorithm is proposed to incrementally update the existing estimates with newly arriving data. Furthermore, a novel online robust regression method is proposed for estimation under biased-batch corruption. We also prove that our algorithms benefit from strong robustness guarantees in terms of regression coefficient recovery, with a constant upper bound on the error of state-of-the-art batch methods. Extensive experiments on synthetic and real datasets demonstrate that our approaches outperform existing methods in effectiveness while remaining competitive in efficiency.
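
The block-wise idea described in the abstract can be illustrated with a minimal Python sketch. It is not the article's algorithm: the function names (`hard_threshold_regression`, `consolidate`, `distributed_robust_regression`) are hypothetical, the number of clean samples per block is assumed to be known in advance rather than estimated, and a coordinate-wise median stands in for the robust consolidation step, whose actual form the abstract does not specify.

```python
import numpy as np


def hard_threshold_regression(X, y, n_clean, n_iter=20):
    """Fit least squares on a trusted subset, then re-select the subset.

    Assumption: n_clean (how many samples to trust) is supplied by the
    caller; the article instead estimates the corruption level.
    """
    keep = np.arange(len(y))  # start by trusting every sample
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Ordinary least squares restricted to the currently trusted samples.
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        # Hard thresholding: keep the n_clean samples with smallest residuals.
        residuals = np.abs(y - X @ beta)
        new_keep = np.sort(np.argsort(residuals)[:n_clean])
        if np.array_equal(new_keep, keep):
            break  # trusted set is stable, so beta already matches it
        keep = new_keep
    return beta


def consolidate(estimates):
    """Combine per-block estimates with a coordinate-wise median.

    Assumption: the median is a simple robust stand-in, not the
    consolidation rule used in the article.
    """
    return np.median(np.stack(estimates), axis=0)


def distributed_robust_regression(X, y, n_blocks, clean_fraction):
    """Split the data into blocks, fit each block, and merge the results."""
    estimates = []
    for Xb, yb in zip(np.array_split(X, n_blocks), np.array_split(y, n_blocks)):
        n_clean = int(clean_fraction * len(yb))
        estimates.append(hard_threshold_regression(Xb, yb, n_clean))
    return consolidate(estimates)
```

Each block is fitted independently, so in a real deployment the loop body could run on separate workers, with only the coefficient vectors sent back for consolidation. Calling `distributed_robust_regression(X, y, n_blocks=4, clean_fraction=0.7)` on a feature matrix `X` and label vector `y` returns a single coefficient vector.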

Index Terms

Online and Distributed Robust Regressions with Extremely Noisy Labels
1. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms
2. Theory of computation
  1. Design and analysis of algorithms
    1. Online algorithms
      1. Online learning algorithms

Published In

ACM Transactions on Knowledge Discovery from Data Volume 16, Issue 3

June 2022

494 pages

ISSN: 1556-4681

EISSN: 1556-472X

DOI: 10.1145/3485152

Editor:
Charu Aggarwal
IBM T. J. Watson Research, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 October 2021

Accepted: 01 June 2021

Revised: 01 April 2021

Received: 01 October 2020

Published in TKDD Volume 16, Issue 3

Qualifiers

Research-article
Refereed
