GAN-Based Data Augmentation for Prediction Improvement Using Gene Expression Data in Cancer

Moreno-Barea, Francisco J.; Jerez, José M.; Franco, Leonardo

doi:10.1007/978-3-031-08757-8_3

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13352)

Included in the following conference series: International Conference on Computational Science

Abstract

Within bioinformatics, Deep Learning (DL) models have shown exceptional results in applications that use histological images, scans and tomographies. However, when gene expression data are analysed, performance is often limited, and it is further hampered by the complexity of these models, which require many instances, on the order of thousands, to provide good results. Because collecting medical data is difficult and costly, applying Data Augmentation (DA) techniques to alleviate the lack of samples is a topic of great relevance. In this work, state-of-the-art models based on Conditional Generative Adversarial Networks (CGAN), and some modifications introduced to them, are used to investigate the effect of DA on predicting the vital status of patients from RNA-Seq gene expression data. Experimental results on several real-world data sets demonstrate the effectiveness and efficiency of the proposed models. The application of DA methods significantly increases prediction accuracy, with gains of 12% with respect to benchmark data sets and 3.15% with respect to data processed with feature selection. In most cases, results based on CGAN models outperform alternative methods such as SMOTE and noise injection.
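To make the two baseline augmentation methods named in the abstract concrete, the following is a minimal Python sketch of noise injection and a SMOTE-style interpolation. It is not the authors' implementation: the function names, the `sigma` value, the use of a single nearest neighbour (standard SMOTE samples among k neighbours) and the toy feature vectors are all assumptions made for illustration, and the CGAN models themselves are not shown.

```python
import random


def noise_injection(sample, sigma, rng):
    """Return a copy of `sample` with zero-mean Gaussian noise added to each feature."""
    return [x + rng.gauss(0.0, sigma) for x in sample]


def nearest_neighbor(sample, candidates):
    """Return the candidate with the smallest squared Euclidean distance to `sample`."""
    return min(candidates, key=lambda c: sum((a - b) ** 2 for a, b in zip(sample, c)))


def smote_like(sample, neighbor, rng):
    """Return a point drawn uniformly on the segment between two same-class samples."""
    t = rng.random()
    return [a + t * (b - a) for a, b in zip(sample, neighbor)]


def augment(class_samples, n_new, sigma=0.05, seed=0):
    """Generate `n_new` synthetic samples with each method for one class.

    Assumes `class_samples` holds at least two samples of the same class.
    """
    rng = random.Random(seed)
    noisy, interpolated = [], []
    for _ in range(n_new):
        base = rng.choice(class_samples)
        others = [s for s in class_samples if s is not base]
        noisy.append(noise_injection(base, sigma, rng))
        # Simplification: interpolate towards the single nearest neighbour only.
        interpolated.append(smote_like(base, nearest_neighbor(base, others), rng))
    return noisy, interpolated


# Toy, invented expression-like vectors for one class (illustration only).
minority = [[0.1, 0.9, 0.3], [0.2, 0.8, 0.4], [0.15, 0.85, 0.2], [0.3, 0.7, 0.35]]
noisy, interpolated = augment(minority, n_new=6)
print(len(noisy), len(interpolated))
```

Interpolated points always lie between two real samples of the same class, whereas noise injection can move a sample slightly outside the observed range; a conditional generative model instead learns to sample new points given the class label.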

Acknowledgements

The authors acknowledge the support from MICINN (Spain) through grants TIN2017-88728-C2-1-R and PID2020-116898RB-I00, from Universidad de Málaga and Junta de Andalucía through grant UMA20-FEDERJA-045, and from Instituto de Investigación Biomédica de Málaga - IBIMA (all including FEDER funds). The results published here are based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

Author information

Authors and Affiliations

Departamento de Lenguajes y Ciencias de la Computación, Escuela Técnica Superior de Ingeniería Informática, Universidad de Málaga, Málaga, Spain
Francisco J. Moreno-Barea, José M. Jerez & Leonardo Franco

Corresponding author

Correspondence to Francisco J. Moreno-Barea.

Editor information

Editors and Affiliations

Brunel University London, London, UK
Derek Groen
University of Amsterdam, Amsterdam, The Netherlands
Clélia de Mulatier
AGH University of Science and Technology, Krakow, Poland
Maciej Paszynski
University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Tennessee at Knoxville, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M. A. Sloot

Cite this paper

Moreno-Barea, F.J., Jerez, J.M., Franco, L. (2022). GAN-Based Data Augmentation for Prediction Improvement Using Gene Expression Data in Cancer. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13352. Springer, Cham. https://doi.org/10.1007/978-3-031-08757-8_3

DOI: https://doi.org/10.1007/978-3-031-08757-8_3
Published: 15 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08756-1
Online ISBN: 978-3-031-08757-8
eBook Packages: Computer Science, Computer Science (R0)
