DCID: Deep Canonical Information Decomposition

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Abstract

We consider the problem of identifying the signal shared between two one-dimensional target variables in the presence of additional multivariate observations. Canonical Correlation Analysis (CCA)-based methods have traditionally been used to identify shared variables; however, they were designed for multivariate targets and offer only trivial solutions in the univariate case. In the context of Multi-Task Learning (MTL), various models have been proposed to learn features that are sparse and shared across multiple tasks. However, these methods were typically evaluated by their predictive performance. To the best of our knowledge, no prior study has systematically evaluated models in terms of correctly recovering the shared signal. Here, we formalize the setting of univariate shared information retrieval and propose ICM, an evaluation metric that can be used in the presence of ground-truth labels and quantifies three aspects of the learned shared features. We further propose Deep Canonical Information Decomposition (DCID), a simple yet effective approach for learning the shared variables. We benchmark the models on a range of scenarios on synthetic data with known ground truth and observe DCID outperforming the baselines in a wide range of settings. Finally, we demonstrate a real-life application of DCID on brain Magnetic Resonance Imaging (MRI) data, where we are able to extract more accurate predictors of changes in brain regions and obesity. The code for our experiments and the supplementary materials are available at https://github.com/alexrakowski/dcid.
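
The remark that CCA offers only trivial solutions for univariate targets can be illustrated with a small sketch. This is not the paper's code: the function names and the synthetic data are assumptions made for this illustration. It uses the standard formulation in which the canonical correlations are the singular values of Sxx^(-1/2) Sxy Syy^(-1/2). With two one-dimensional variables this matrix is 1x1, so the only canonical correlation is the absolute Pearson correlation and the canonical variates are the rescaled inputs themselves.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1)
    Syy = Y.T @ Y / (n - 1)
    Sxy = X.T @ Y / (n - 1)
    # Singular values of the whitened cross-covariance matrix.
    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)


# Hypothetical data: two scalar targets driven by one shared signal plus noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=1000)
y1 = shared + rng.normal(size=1000)
y2 = -0.5 * shared + rng.normal(size=1000)

rho = canonical_correlations(y1[:, None], y2[:, None])
pearson = np.corrcoef(y1, y2)[0, 1]

# A single canonical correlation, equal to |Pearson r|: CCA reports how
# strongly the targets co-vary but does not isolate the shared signal.
print(rho, abs(pearson))
```

In this setting CCA returns the targets themselves as the "shared" directions, so it cannot separate the shared component from the target-specific noise. The abstract motivates bringing in the additional multivariate observations for exactly this reason.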


Acknowledgements

This research was funded by the HPI research school on Data Science and Engineering. Data used in the preparation of this article were obtained from the UK Biobank Resource under Application Number 40502.

Author information

Corresponding author

Correspondence to Alexander Rakowski.

Ethics declarations

Ethical Considerations

As mentioned in the main text (Sect. 5.2), we conducted the brain MRI experiments on the “white-British” subset of the UKB dataset. This was done to avoid unnecessary confounding, as the experiments were meant as a proof of concept rather than a strict medical study. When conducting the latter, measures should be taken to include all available ethnicities whenever possible, in order to avoid widening the existing disparities in the representation of ethnic minorities in medical studies.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rakowski, A., Lippert, C. (2023). DCID: Deep Canonical Information Decomposition. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14170. Springer, Cham. https://doi.org/10.1007/978-3-031-43415-0_2

  • DOI: https://doi.org/10.1007/978-3-031-43415-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43414-3

  • Online ISBN: 978-3-031-43415-0

  • eBook Packages: Computer Science (R0)
