Auto-encoder-based generative models have recently been used widely and successfully in image processing. However, few studies have examined their use for realizing continuous input–output mappings in regression problems. A shortage of training data is a notable problem in machine learning in general and in regression in particular, and it limits the application of machine learning in materials science. We address the small-data issue in regression problems by using variational auto-encoders (VAEs), a popular and powerful class of auto-encoder-based generative models, for data augmentation. Generative auto-encoder models such as VAEs use multilayer neural networks to generate sample data. In this study, we demonstrate the effectiveness of multi-task learning (auto-encoding and regression tasks) for regression problems. We conducted experiments on seven benchmark datasets and on one ionic conductivity dataset as an application in materials science. The experimental results show that multi-task learning for VAEs improved the generalization performance of a multivariable linear regression model trained with augmented data.
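The abstract does not state the training objective, so the following is only a minimal sketch of what a multi-task VAE loss combining the auto-encoding and regression tasks might look like. It assumes a Gaussian encoder with a standard normal prior, a squared-error reconstruction term, and a squared-error regression term. The function name `multitask_vae_loss` and the weight `reg_weight` are hypothetical and are not taken from the article.

```python
import numpy as np

def multitask_vae_loss(x, x_recon, y, y_pred, mu, log_var, reg_weight=1.0):
    """Hypothetical multi-task objective: standard VAE loss plus a regression term.

    x, x_recon  : (n, d) inputs and decoder reconstructions
    y, y_pred   : (n,) regression targets and predictions
    mu, log_var : (n, k) encoder mean and log-variance of the latent code
    reg_weight  : assumed weight balancing the regression task
    """
    # Auto-encoding task: squared error summed over features, averaged over the batch.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # KL divergence between N(mu, diag(exp(log_var))) and the N(0, I) prior.
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    # Regression task: mean squared error on the target variable.
    reg = np.mean((y - y_pred) ** 2)
    return recon + kl + reg_weight * reg
```

With a perfect reconstruction, a latent posterior equal to the prior (`mu = 0`, `log_var = 0`), and exact predictions, every term vanishes and the loss is zero. Raising `reg_weight` shifts the trade-off toward the regression task.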
The author would like to thank Dr. Nobuko Ohba for preparing the ionic conductivity data, and anonymous reviewers for their constructive comments on the manuscript.
The author declares that there is no conflict of interest regarding the publication of this article.
This article does not contain any studies with human participants or animals performed by the author.
Communicated by Mu-Yen Chen.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Ohno, H. Auto-encoder-based generative models for data augmentation on regression problems. Soft Comput 24, 7999–8009 (2020). https://doi.org/10.1007/s00500-019-04094-0