DOI: 10.1145/3383455.3422555
research-article

Machine learning fund categorizations

Published: 07 October 2021

Abstract

Given the surge in popularity of mutual funds (including exchange-traded funds (ETFs)) as a diversified financial investment, a vast variety of mutual funds from various investment management firms and diversification strategies has become available in the market. Identifying similar mutual funds among such a wide landscape has become more important than ever because of many applications, ranging from sales and marketing to portfolio replication, portfolio diversification, and tax loss harvesting. The current best method is data-vendor-provided categorization, which usually relies on curation by human experts with the help of available data. In this work, we establish that an industry-wide, well-regarded categorization system is learnable using machine learning and largely reproducible, which in turn allows constructing a truly data-driven categorization. We discuss the intellectual challenges in learning this man-made system, our results, and their implications.
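
As a rough illustration of what "learnable" could mean here, the sketch below fits a simple classifier to labeled funds and checks how often it reproduces the labels on held-out funds. Everything in it is an assumption made for illustration: the data are synthetic, the two features (an equity weight and an average bond duration) and the three category names are invented, and the nearest-centroid classifier is a deliberately simple stand-in rather than the method used in the paper.

```python
import random


def make_synthetic_funds(n_per_category, seed=0):
    """Generate toy ((feature1, feature2), label) pairs. NOT real fund data."""
    rng = random.Random(seed)
    # Hypothetical category centers: (equity weight, avg. bond duration in years).
    centers = {
        "equity": (0.9, 0.5),
        "bond": (0.1, 6.0),
        "balanced": (0.5, 3.0),
    }
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_category):
            features = (rng.gauss(cx, 0.05), rng.gauss(cy, 0.2))
            data.append((features, label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Compute the mean feature vector of each category in the training set."""
    sums, counts = {}, {}
    for (x, y), label in train:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab]) for lab, (sx, sy) in sums.items()}


def predict(centroids, features):
    """Assign the category whose centroid is closest in squared Euclidean distance."""
    x, y = features
    return min(
        centroids,
        key=lambda lab: (x - centroids[lab][0]) ** 2 + (y - centroids[lab][1]) ** 2,
    )


def accuracy(centroids, test):
    """Fraction of held-out funds whose predicted category matches the given label."""
    hits = sum(1 for features, label in test if predict(centroids, features) == label)
    return hits / len(test)


if __name__ == "__main__":
    funds = make_synthetic_funds(50, seed=0)
    train, test = funds[:100], funds[100:]  # simple hold-out split
    model = fit_centroids(train)
    print("held-out agreement with the given labels:", accuracy(model, test))
```

High held-out agreement on a toy set like this only shows the mechanics of the evaluation. Reproducing a real vendor categorization would require real fund features and the vendor's own labels.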

Cited By

  • (2025) Comparison of sectorial and financial data for ESG scoring of mutual funds with machine learning. Financial Innovation 11:1. DOI: 10.1186/s40854-024-00719-y. Online publication date: 13-Feb-2025
  • (2024) Can an unsupervised clustering algorithm reproduce a categorization system? Proceedings of the 5th ACM International Conference on AI in Finance, 213-221. DOI: 10.1145/3677052.3698616. Online publication date: 14-Nov-2024
  • (2021) Fund2Vec. Proceedings of the Second ACM International Conference on AI in Finance, 1-8. DOI: 10.1145/3490354.3494381. Online publication date: 3-Nov-2021

    Published In

    ICAIF '20: Proceedings of the First ACM International Conference on AI in Finance
    October 2020
    422 pages
    ISBN:9781450375849
    DOI:10.1145/3383455

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. categorization
    2. machine learning
    3. mutual funds

    Qualifiers

    • Research-article

    Conference

    ICAIF '20
    ICAIF '20: ACM International Conference on AI in Finance
    October 15 - 16, 2020
    New York, New York
