Paper title | Number of studies | Period | Dataset review | Evaluation metrics | Contents |
---|---|---|---|---|---|
[149] | 2 | \(\sim\) 2021 | no | Fidelity and privacy | Empirical study |
[45] | 8 | \(\sim\) 2020 | no | Fidelity \(^{*}\) and privacy | Empirical study |
[55] | 34 | \(\sim\) 2022 | no | Fidelity, utility and privacy | Comprehensive review paper |
[47] | 72 | \(\sim\) 2022 | yes \(^{**}\) | NaN | Review paper focusing on applications and use cases |
[95] | 70 | \(\sim\) 2022 | no | Fidelity, utility and privacy | Comprehensive review paper |
[54] | NA | \(\sim\) 2022 | no | Fidelity, utility and privacy | Review paper focusing on evaluation metrics |
[159] | NA | \(\sim\) 2022 | no | Fidelity, utility and privacy | Empirical study |
Ours | 82 | \(\sim\) 2023 | yes | Fidelity, utility, privacy and fairness | Comprehensive review paper |
Paper reference | Year | Distributions | Medical data applications |
---|---|---|---|
[87] | 2008 | Multinomial sampling with a Dirichlet prior | Demographics (Census data) |
DPCopula [81] | 2014 | Copula functions with differential privacy | |
DPSynthesizer [82] | 2014 | Copula functions with differential privacy | Demographics (Census data) |
COCOA [6] | 2016 | 11 common data distributions | NaN \(^{*}\) |
[60] | 2016 | Copula functions | Hospital emergency population |
SyntheticDataVault [109] | 2016 | Copula functions | NaN \(^{*}\) |
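
As an illustration of the first row of the table above, multinomial sampling with a Dirichlet prior can be sketched for a single categorical attribute as follows. This is a minimal sketch under stated assumptions, not the procedure of [87]: the function name `synthesize_categorical`, the symmetric prior `alpha`, and the toy blood-type values are all hypothetical.

```python
import random
from collections import Counter


def synthesize_categorical(records, categories, n_synthetic, alpha=1.0, seed=0):
    """Draw synthetic values from a multinomial whose probabilities are
    sampled from the Dirichlet posterior implied by the observed counts."""
    rng = random.Random(seed)
    counts = Counter(records)
    # Posterior Dirichlet parameters: prior pseudo-count plus observed count.
    posterior = [alpha + counts.get(c, 0) for c in categories]
    # A Dirichlet draw is a vector of independent Gamma draws, normalized.
    gammas = [rng.gammavariate(a, 1.0) for a in posterior]
    total = sum(gammas)
    probs = [g / total for g in gammas]
    # Multinomial sampling: each synthetic value is drawn from the sampled probabilities.
    return rng.choices(categories, weights=probs, k=n_synthetic)


# Toy data (hypothetical, not taken from any cited study).
observed = ["A", "A", "B", "O", "A", "B"]
synthetic = synthesize_categorical(observed, ["A", "B", "AB", "O"], n_synthetic=10)
```

Because the prior adds a pseudo-count to every category, values that never occur in the observed data (here `"AB"`) can still appear in the synthetic sample.
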
Paper reference | Year | Methods | Medical data applications |
---|---|---|---|
GADP [94] | 1999 | Defining mean and variances for the distributions of \(X_C\) conditioned on \(X_U\) | NaN |
IPSO [15] | 2003 | General linear models for \(X_C\) from \(X_U\) | NaN |
CART [16] | 2010 | Random forests for \(X_C\) on \(X_U\) (only applicable to discrete sensitive attributes) | Demographics (Census data) |
[18] | 2009 | Fuzzy c-means for \(X_C\) on \(X_U\) | Demographics (Census data) |
[33] | 2010 | Support vector machines for \(X_C\) on \(X_U\) | Health insurances data |
[46] | 2020 | MICE | Cancer registry data from the Surveillance Epidemiology and End Results program |
PeGS [107] | 2013 | General linear models with differential privacy for \(X_C\) from \(X_U\) (only applicable to discrete sensitive attributes) | Public Patient Discharge Data from California Office of Statewide Health Planning and Development |
PeGS applications [108] | 2013 | General linear models with differential privacy for \(X_C\) from \(X_U\) (only applicable to discrete sensitive attributes) | Public-use data files from Centers for Medicare and Medicaid Services |
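
The methods in the table above share one pattern: fit a model of \(X_C\) given \(X_U\), then replace the original \(X_C\) with draws from that model. The sketch below illustrates the pattern with a one-predictor linear model, reading \(X_C\) as the confidential attribute and \(X_U\) as the non-confidential one (an assumption about the notation). It is not the algorithm of IPSO [15] or PeGS [107], and it includes no differential privacy mechanism; the function names and toy values are hypothetical.

```python
import random
import statistics


def fit_simple_linear(x_u, x_c):
    """Ordinary least squares fit of x_c = intercept + slope * x_u + noise."""
    mean_u, mean_c = statistics.fmean(x_u), statistics.fmean(x_c)
    sxx = sum((u - mean_u) ** 2 for u in x_u)
    sxy = sum((u - mean_u) * (c - mean_c) for u, c in zip(x_u, x_c))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_u
    residuals = [c - (intercept + slope * u) for u, c in zip(x_u, x_c)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = (sum(r * r for r in residuals) / (len(x_u) - 2)) ** 0.5
    return intercept, slope, sigma


def synthesize_confidential(x_u, x_c, seed=0):
    """Replace each confidential value with a draw from the fitted
    conditional distribution of X_C given the record's X_U."""
    rng = random.Random(seed)
    intercept, slope, sigma = fit_simple_linear(x_u, x_c)
    return [rng.gauss(intercept + slope * u, sigma) for u in x_u]


# Toy data (hypothetical): age as X_U, a lab value as X_C.
ages = [30, 40, 50, 60, 70]
lab_values = [4.9, 5.4, 5.8, 6.5, 6.9]
synthetic_lab_values = synthesize_confidential(ages, lab_values)
```

The released records keep the original \(X_U\) values and carry only the sampled \(X_C\) values, so the relationship between the two is preserved on average while individual confidential values are not disclosed directly.
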
Paper reference | Year | Structural and parameter learning | Inference | Medical data applications |
---|---|---|---|---|
[134] | 2015 | Score-based (tabu search by Python Package bnlearn [137]) | Global sampling | Demographics |
PrivBayes [167] | 2017 | Constraint-based (Mutual Information and differential privacy) | Global sampling | NaN \(^{*}\) |
DataSynthesizer [111] | 2017 | PrivBayes | Global sampling | NaN \(^{*}\) |
[28] | 2020 | Score-based (AIC by Python Package pomegranate [122]) | Global sampling | Demographics |
[140] | 2020 | Constraint-based (FCI with EM for missing data) | Global sampling | CPRD Aurum data synthesis |
[70] | 2021 | Score-based (by Python Package bnlearn [137]) | | Heart Disease (UCI), Diabetes datasets (UCI), MIMIC-III |
[86] | 2021 | Constraint-based ( \(G^2\) -test) | Global sampling from the label attribute | Breast cancer (UCI), Diabetes (UCI) |
PrivSyn [168] | 2021 | Constraint-based (Independent Difference (InDif for short)) | Gradually Update Method | NaN \(^{*}\) |
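
Once a Bayesian network's structure and parameters are learned, synthetic records are drawn from it. The sketch below assumes that "global sampling" in the table above means sampling every attribute in topological order, each conditioned on its already-sampled parents (often called forward or ancestral sampling); that reading is an assumption. The two-node network, its probabilities, and all names are hypothetical.

```python
import random

# Hypothetical network: smoker -> disease, with made-up probabilities.
PARENTS = {"smoker": [], "disease": ["smoker"]}
ORDER = ["smoker", "disease"]  # a topological order of the nodes
CPT = {
    "smoker": {(): {"yes": 0.3, "no": 0.7}},
    "disease": {
        ("yes",): {"yes": 0.4, "no": 0.6},
        ("no",): {"yes": 0.1, "no": 0.9},
    },
}


def sample_record(rng):
    """Sample one record by visiting nodes in topological order."""
    record = {}
    for node in ORDER:
        # Look up the conditional distribution for the sampled parent values.
        key = tuple(record[p] for p in PARENTS[node])
        dist = CPT[node][key]
        values = list(dist)
        weights = [dist[v] for v in values]
        record[node] = rng.choices(values, weights=weights, k=1)[0]
    return record


rng = random.Random(0)
synthetic_records = [sample_record(rng) for _ in range(5)]
```
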
Dataset name | Patient number | Data type | Data information | Disease Category |
---|---|---|---|---|
MIMIC-I (or MIMIC) [93] | 100 | Medical signals and sequential EHR | Patient monitor data, patient-descriptive data (gender, age, record duration), symptoms, fluid balance, diagnoses, progress notes, medications, and laboratory results | Potential hemodynamic instability |
MIMIC-II [121] | 33,000 | Medical signals and sequential EHR | Patient monitor data, patient-descriptive data (demographics, admissions, transfers, discharge times, dates of death), diagnoses, notes, reports, procedure data, medications, fluid balances, and laboratory test data | Diseases of the circulatory system, trauma, diseases of the digestive system, pulmonary diseases, infectious diseases, and neoplasms |
MIMIC-III [65] | 46,520 | Medical signals and sequential EHR | Patient monitor data, patient-descriptive data, diagnoses, reports, notes, interventions, medications, and laboratory test data | Diseases of the circulatory system, pulmonary diseases, infectious and parasitic diseases, diseases of the digestive system, diseases of the genitourinary system, neoplasms, and trauma |
MIMIC-IV [64] | 383,220 | Medical signals and sequential EHR | The hosp module contains patient-descriptive data, basic health data (blood pressure, height, weight...), medication, procedure data, and diagnoses. The icu module contains timing information data, patient monitor data, fluid balance, and procedure data. | Diseases of the circulatory system, pulmonary diseases, infectious and parasitic diseases, diseases of the digestive system, diseases of the genitourinary system, neoplasms, and trauma |
eICU-CRD [114] | 139,367 | Sequential EHR | Vital signs, laboratory measurements, medications, APACHE components, care plan information, admission diagnosis, patient history, and time-stamped diagnoses | Pulmonary sepsis, acute myocardial infarction, cerebrovascular accident, congestive heart failure, renal sepsis, diabetic ketoacidosis, coronary artery bypass graft, atrial rhythm disturbance, cardiac arrest, and emphysema |
Amsterdam UMCdb [138] | 20,109 | Medical signals and sequential EHR | Patient monitor and life support device data, laboratory measurements, clinical observation and scores, medical procedures and tasks, medication, fluid balance, diagnosis groups and clinical patient outcomes | Not specified |
UT Physicians clinical database [141] | 5,501,776 | Sequential EHR | Demographic data, vital signs, immunization data (body site, dose), laboratory data, transaction data (evaluation and management, radiology, medicine, surgery, anesthesia), appointment data, medications, and invoices | Diabetes mellitus, hyperlipidemia, hypertension, and unspecified chest pain |
Breast Cancer Wisconsin dataset (UCI) [34] | 569 | Tabular data | Diagnoses, radiuses, texture data, perimeters, areas, smoothness data, compactness data, concavity data, concave points data, symmetry data, and fractal dimensions. | Breast cancer |
Heart Disease dataset (UCI) [34] | 303 | Tabular data | Demographic data, smoking status data, disease history data, exercise protocols, chart data (blood pressure, heart rate, ECG), pain status data, and diagnoses | Heart disease |
Diabetes dataset (UCI) [34] | 70 | Sequential data | Insulin dose, blood glucose measurement, hypoglycemic symptoms, meal ingestion, and exercise activity | Diabetes |
Association for Computing Machinery
New York, NY, United States