short-paper

A Quantitative Comparison of Different Machine Learning Approaches for Human Spermatozoa Quality Prediction Using Multimodal Datasets

Author: Yin Wang

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

Pages 4659 - 4663

https://doi.org/10.1145/3394171.3416285

Published: 12 October 2020

Abstract

Despite remarkable advances in medical data analysis, progress is severely restricted by the limitations of relying on a single modality, usually medical imaging data. Other modalities, such as patient-related information, should also be taken into account in clinical decision-making, yet how to fully exploit multimodal datasets remains under-explored. In this paper, we quantitatively compare different machine learning approaches for the human spermatozoa quality prediction task using a multimodal dataset. To empirically investigate the advantages and disadvantages of these approaches, we perform extensive experiments. Leveraging different features, we achieve state-of-the-art performance on most of the tasks. The results show that simple models can provide better performance, which emphasizes the importance of avoiding overfitting. For the sake of reproducibility, we have released our code to the research community.
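The remark about simple models and overfitting can be illustrated with a small, self-contained sketch. It is not the paper's method, models, or data: the toy dataset, the two model choices (a one-feature least-squares line and a 1-nearest-neighbour regressor), and all function names are assumptions made purely for illustration. The sketch compares training error with leave-one-out cross-validation error for each model.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b


def fit_1nn(xs, ys):
    """1-nearest-neighbour regressor: it simply memorises the training set."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def loo_mse(fit, xs, ys):
    """Leave-one-out cross-validation: refit without point i, then test on it."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        errors.append((model(xs[i]) - ys[i]) ** 2)
    return mean(errors)


# Hypothetical toy data: a linear trend plus alternating +1/-1 "noise".
xs = [float(i) for i in range(20)]
ys = [2.0 * x + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(xs)]

# For each model, record (training MSE, leave-one-out MSE).
results = {
    name: (mse(fit(xs, ys), xs, ys), loo_mse(fit, xs, ys))
    for name, fit in [("linear", fit_line), ("1-NN", fit_1nn)]
}
for name, (train_err, cv_err) in results.items():
    print(f"{name:>6}: train MSE = {train_err:.3f}, leave-one-out MSE = {cv_err:.3f}")
```

On this toy data the memorising model fits the training points perfectly but does worse than the straight line on held-out points, which is the general pattern the abstract points to. Real multimodal data would of course call for proper feature extraction and a held-out evaluation protocol.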

Supplementary Material

MP4 File (3394171.3416285.mp4), 59.94 MB

Presentation video for paper No. 2605.

Cited By

Adinugroho S, Nakazawa A (2024). Deep learning-based sperm motility and morphology estimation on stacked color-coded MotionFlow. Informatics in Medicine Unlocked 45, 101459. Online publication date: 2024.
https://doi.org/10.1016/j.imu.2024.101459

Index Terms

A Quantitative Comparison of Different Machine Learning Approaches for Human Spermatozoa Quality Prediction Using Multimodal Datasets
1. Applied computing
  1. Life and medical sciences
    1. Health care information systems
    2. Health informatics

Published In

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

October 2020

4889 pages

ISBN:9781450379885

DOI:10.1145/3394171

General Chairs: Chang Wen Chen (Chinese University of Hong Kong, Shenzhen, China), Rita Cucchiara (UNIMORE, Italy), Xian-Sheng Hua (Alibaba Group, China)

Program Chairs: Guo-Jun Qi (Futurewei Technologies, USA), Elisa Ricci (UNITN & Fondazione Bruno Kessler, Italy), Zhengyou Zhang (Tencent, China), Roger Zimmermann (National University of Singapore, Singapore)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '20: The 28th ACM International Conference on Multimedia

Sponsor: SIGMM

October 12 - 16, 2020

Seattle, WA, USA

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%

Article Metrics

Total Citations: 1
Total Downloads: 90
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 0

Reflects downloads up to 04 Oct 2024
