Gene selection and disease prediction from gene expression data using a two-stage hetero-associative memory

Cleofas-Sánchez, Laura; Sánchez, J. Salvador; García, Vicente

doi:10.1007/s13748-018-0148-6

Regular Paper
Published: 31 March 2018

Volume 8, pages 63–71, (2019)
Progress in Artificial Intelligence

Laura Cleofas-Sánchez¹,
J. Salvador Sánchez ORCID: orcid.org/0000-0003-1053-4658² &
Vicente García³

Abstract

Gene expression microarrays typically comprise a vast number of genes and very few samples, which poses a critical challenge for disease prediction and diagnosis. This paper develops a two-stage algorithm that integrates feature selection and prediction by extending a type of hetero-associative neural network. In the first stage, the algorithm generates the associative memory; in the second, it picks the most relevant genes. To illustrate the applicability and efficiency of the proposed method, we use four gene expression microarray databases and compare its classification performance against that of other well-known classifiers built on the whole (original) gene space. The experimental results show that the two-stage hetero-associative memory is quite competitive with standard classification models in terms of overall accuracy, sensitivity and specificity. In addition, it yields a significant decrease in computational effort and an increase in the biological interpretability of microarrays, because worthless (irrelevant and/or redundant) genes are discarded.
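The abstract gives only the outline of the two stages, so the following is a minimal illustrative sketch rather than the authors' algorithm. It assumes a plain correlation-matrix (linear associator) memory for stage one and a simple weight-spread ranking for stage two. All names (`train_memory`, `select_genes`, `predict`, `binary_metrics`), the scoring rule, and the toy data are hypothetical. The last function computes the three measures named in the abstract: overall accuracy, sensitivity and specificity.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def train_memory(X: Sequence[Sequence[float]], y: Sequence[int],
                 n_classes: int) -> Tuple[Matrix, List[float]]:
    """Stage 1 (assumed form): build a class-by-gene associative memory by
    summing the mean-centred expression profile of every sample into the
    row of its class label."""
    n_genes = len(X[0])
    mean = [sum(row[g] for row in X) / len(X) for g in range(n_genes)]
    memory = [[0.0] * n_genes for _ in range(n_classes)]
    for row, label in zip(X, y):
        for g in range(n_genes):
            memory[label][g] += row[g] - mean[g]
    return memory, mean


def select_genes(memory: Matrix, k: int) -> List[int]:
    """Stage 2 (assumed rule): score each gene by the spread of its memory
    weights across classes and keep the k highest-scoring gene indices."""
    n_genes = len(memory[0])
    scores = []
    for g in range(n_genes):
        column = [row[g] for row in memory]
        scores.append(max(column) - min(column))
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return sorted(ranked[:k])


def predict(memory: Matrix, mean: Sequence[float], genes: Sequence[int],
            x: Sequence[float]) -> int:
    """Recall: return the class whose memory row responds most strongly to
    the mean-centred sample, using only the selected genes."""
    responses = [sum(row[g] * (x[g] - mean[g]) for g in genes)
                 for row in memory]
    return max(range(len(memory)), key=lambda c: responses[c])


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity


# Toy data (made up): 4 samples, 3 genes; only gene 0 separates the classes.
X = [[5.0, 1.0, 2.0], [6.0, 2.0, 2.0], [1.0, 2.0, 2.0], [0.0, 1.0, 2.0]]
y = [1, 1, 0, 0]

memory, mean = train_memory(X, y, n_classes=2)
genes = select_genes(memory, k=1)
predictions = [predict(memory, mean, genes, x) for x in X]
accuracy, sensitivity, specificity = binary_metrics(y, predictions)
```

On this toy set the sketch keeps only gene 0 and classifies the four training samples from that single gene. A real evaluation would score held-out samples, for example with cross-validation, rather than the training data.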

Acknowledgements

This study was partially supported by the Valencian Council of Education, Research, Culture and Sport [PROMETEOII/2014/062], the Mexican PRODEP [DSA/103.5/15/7004], and the Spanish Ministry of Economy, Industry and Competitiveness under Grant [TIN2013-46522-P].

Author information

Authors and Affiliations

National Institute of Genomic Medicine, 14610, Ciudad de México, Mexico
Laura Cleofas-Sánchez
Department of Computer Languages and Systems, Institute of New Imaging Technologies, Universitat Jaume I, 12071, Castelló de la Plana, Spain
J. Salvador Sánchez
Multidisciplinary University Division, Universidad Autónoma de Ciudad Juárez, 32310, Ciudad Juárez, Chihuahua, Mexico
Vicente García

Corresponding author

Correspondence to J. Salvador Sánchez.

About this article

Cite this article

Cleofas-Sánchez, L., Sánchez, J.S. & García, V. Gene selection and disease prediction from gene expression data using a two-stage hetero-associative memory. Prog Artif Intell 8, 63–71 (2019). https://doi.org/10.1007/s13748-018-0148-6

Received: 22 January 2018
Accepted: 22 March 2018
Published: 31 March 2018
Issue Date: 01 April 2019
DOI: https://doi.org/10.1007/s13748-018-0148-6
