Outliers in a dataset can substantially bias the results of statistical analyses. To correct for outliers, micro edits are performed manually on all records, typically aided by a set of constraints and decision rules. However, straightforward decision rules might overlook anomalies arising from the disruption of linear relationships. This package provides computationally efficient methods to identify historical, tail, and relational anomalies at the data-entry level (Sartore et al., 2024; <doi:10.6339/24-JDS1136>). A score statistic is developed for each anomaly type using a distribution-free approach motivated by the Bienaymé-Chebyshev inequality, and fuzzy logic is used to detect cellwise outliers resulting from different types of anomalies. Each data entry is scored individually, and the individual scores are combined into a final score that determines which entries are anomalous. As an alternative to fuzzy logic, a Bayesian bootstrap and a Bayesian test based on empirical likelihoods are also provided, as studied by Sartore et al. (2024; <doi:10.3390/stats7040073>). These algorithms allow for a more nuanced approach to outlier detection, as they can identify outliers at the data-entry level that are not obviously distinct from the rest of the data. This research was supported in part by the U.S. Department of Agriculture, National Agricultural Statistics Service. The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA or U.S. Government determination or policy.
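The scoring idea in the description can be illustrated with a minimal sketch. It is not the package's implementation and does not use its API: the function names `chebyshev_scores` and `fuzzy_or`, the use of the sample mean and population standard deviation, and the probabilistic-sum combination rule are all assumptions made for illustration. The sketch relies only on the Bienaymé-Chebyshev inequality, P(|X - mean| >= k * sd) <= 1 / k^2, which holds for any distribution with finite variance.

```python
from statistics import fmean, pstdev


def chebyshev_scores(values):
    """Score each entry in [0, 1) from the Chebyshev tail bound.

    Hypothetical helper: for k = |x - mean| / sd the bound is min(1, 1 / k**2),
    and the score is 1 - bound, so larger scores mean more unusual entries.
    """
    mu = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        # All entries are identical, so nothing stands out.
        return [0.0] * len(values)
    scores = []
    for x in values:
        k = abs(x - mu) / sd
        bound = 1.0 if k <= 1 else 1.0 / (k * k)
        scores.append(1.0 - bound)
    return scores


def fuzzy_or(*scores):
    """Combine scores with the probabilistic sum, one common fuzzy OR.

    Assumption: the package may combine its scores differently.
    """
    combined = 0.0
    for s in scores:
        combined = combined + s - combined * s
    return combined


# Toy data: one entry is far from the rest.
values = [10, 11, 9, 10, 12, 10, 50]
tail = chebyshev_scores(values)
```

In this toy example only the last entry lies more than one standard deviation from the mean, so it is the only entry with a nonzero score. Scores for several anomaly types (for example tail and historical) could then be merged per entry with `fuzzy_or` before applying a cutoff.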
Version: | 25.2.18 |
Depends: | R (≥ 4.0.0) |
Imports: | dplyr, purrr, tidyr |
Suggests: | knitr, rmarkdown, cellWise |
Published: | 2025-02-20 |
DOI: | 10.32614/CRAN.package.HRTnomaly |
Author: | Luca Sartore [aut, cre] (ORCID = "0000-0002-0446-1328"),
Lu Chen [aut] (ORCID = "0000-0003-3387-3484"),
Justin van Wart [aut],
Andrew Dau [aut] (ORCID = "0009-0008-9482-5316"),
Valbona Bejleri [aut] (ORCID = "0000-0001-9828-968X") |
Maintainer: | Luca Sartore <drwolf85 at gmail.com> |
License: | AGPL-3 |
NeedsCompilation: | yes |
Materials: | README ChangeLog |
CRAN checks: | HRTnomaly results [issues need fixing before 2025-03-07] |
Reference manual: | HRTnomaly.pdf |
Package source: | HRTnomaly_25.2.18.tar.gz |
Windows binaries: | r-devel: not available, r-release: not available, r-oldrel: HRTnomaly_25.2.18.zip |
macOS binaries: | r-devel (arm64): HRTnomaly_25.2.18.tgz, r-release (arm64): HRTnomaly_25.2.18.tgz, r-oldrel (arm64): HRTnomaly_25.2.18.tgz, r-devel (x86_64): HRTnomaly_25.2.18.tgz, r-release (x86_64): HRTnomaly_25.2.18.tgz, r-oldrel (x86_64): HRTnomaly_25.2.18.tgz |
Please use the canonical form https://CRAN.R-project.org/package=HRTnomaly to link to this page.