A generative, predictive model for menstrual cycle lengths that accounts for potential self-tracking artifacts in mobile health data

Li, Kathy; Urteaga, Iñigo; Shea, Amanda; Vitzthum, Virginia J.; Wiggins, Chris H.; Elhadad, Noémie

arXiv:2102.12439 (cs)

[Submitted on 24 Feb 2021 (v1), last revised 16 Mar 2021 (this version, v2)]

Abstract: Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for health-relevant research. However, such data streams have questionable reliability since they hinge on user adherence to the app. Therefore, it is crucial for researchers to separate true behavior from self-tracking artifacts. By taking a machine learning approach to modeling self-tracked cycle lengths, we can both make more informed predictions and learn the underlying structure of the observed data. In this work, we propose and evaluate a hierarchical, generative model for predicting the next cycle length based on previously tracked cycle lengths that accounts explicitly for the possibility that users skip tracking their period. Our model offers several advantages: 1) accounting explicitly for self-tracking artifacts yields better prediction accuracy as the likelihood of skipping increases; 2) because it is a generative model, predictions can be updated online as a given cycle evolves, and we can gain interpretable insight into how these predictions change over time; and 3) its hierarchical nature enables modeling of an individual's cycle length history while incorporating population-level information. Our experiments using mHealth cycle length data encompassing over 186,000 menstruators with over 2 million natural menstrual cycles show that our method yields state-of-the-art performance against neural network-based and summary statistic-based baselines, while providing insights on disentangling menstrual patterns from self-tracking artifacts. This work can benefit users, mHealth app developers, and researchers in better understanding cycle patterns and user adherence.
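To make the idea of separating true cycle lengths from skipped tracking concrete, the following is a minimal toy sketch and not the model from the paper. It assumes, purely for illustration, that the number of skipped period reports follows a truncated geometric distribution and that an observed cycle length is Poisson-distributed with a rate that grows with the number of skips. The function names, the parameters `mean_len` and `skip_prob`, and the distribution choices are all hypothetical.

```python
import math


def poisson_logpmf(k: int, rate: float) -> float:
    """Log-probability of observing count k under a Poisson with the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def skip_posterior(observed_len: int, mean_len: float, skip_prob: float,
                   max_skips: int = 5) -> list[float]:
    """Posterior over s, the number of skipped period reports in one observed cycle.

    Assumed toy model (illustrative only, not taken from the abstract):
      P(s) is proportional to (1 - skip_prob) * skip_prob**s for s = 0..max_skips
      observed_len given s is Poisson with rate mean_len * (s + 1)
    Requires 0 < skip_prob < 1 and mean_len > 0.
    """
    log_weights = [
        math.log(1 - skip_prob) + s * math.log(skip_prob)
        + poisson_logpmf(observed_len, mean_len * (s + 1))
        for s in range(max_skips + 1)
    ]
    # Normalize in log space to avoid underflow for unlikely skip counts.
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def most_likely_skips(posterior: list[float]) -> int:
    """Index (skip count) with the highest posterior probability."""
    return max(range(len(posterior)), key=posterior.__getitem__)


# Made-up values: a user whose typical cycle is 29 days and who rarely skips.
typical = skip_posterior(29, mean_len=29.0, skip_prob=0.1)
doubled = skip_posterior(58, mean_len=29.0, skip_prob=0.1)
print(most_likely_skips(typical), most_likely_skips(doubled))
```

With these made-up values, a 29-day observation is most consistent with no skipped reports, while a 58-day observation is most consistent with one skipped report. A hierarchical version would additionally place population-level priors on the per-user parameters, which this sketch holds fixed.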

Comments:	Extended version of the work presented at the NeurIPS 2020 Machine Learning for Mobile Health Workshop (see this https URL)
Subjects:	Machine Learning (cs.LG); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)
Cite as:	arXiv:2102.12439 [cs.LG]
	(or arXiv:2102.12439v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2102.12439

Submission history

From: Iñigo Urteaga
[v1] Wed, 24 Feb 2021 18:00:26 UTC (405 KB)
[v2] Tue, 16 Mar 2021 20:47:42 UTC (1,019 KB)
