Abstract
We consider the problem of determining the intrinsic dimensionality of data, which is important for optimizing the organization and processing of large data sets in classical machines, quantum decision theory, and observations of natural phenomena. We prove a theorem that determines the minimum dimensions associated with the data; this result is consistent with the result that base-e is optimal for number representation. The dimension value can be viewed as coding the structure in the most efficient representation of the data and has relevance for natural and engineered systems. Since the optimal intrinsic dimensionality is shown to be noninteger, this paper provides a rationale for fractals in natural data.
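The base-e statement in the abstract can be illustrated with a small numerical sketch. It assumes the standard radix-economy cost, in which representing values up to n in base b costs b symbols per position times log_b(n) positions; this measure is an assumption made for illustration and may differ from the criterion used in the paper's theorem. The names `radix_cost` and `best_base` are hypothetical.

```python
import math

def radix_cost(base: float, n: float = 1e6) -> float:
    """Assumed cost model: symbols per position times positions needed.

    cost(b) = b * log_b(n) = b * ln(n) / ln(b)
    """
    return base * math.log(n) / math.log(base)

def best_base(candidates):
    """Return the candidate base with the lowest assumed cost."""
    return min(candidates, key=radix_cost)

# Scan real-valued bases from 1.5 to about 4.5 in steps of 0.001.
candidates = [1.5 + 0.001 * i for i in range(3000)]
best = best_base(candidates)

print(f"lowest-cost base on the grid: {best:.3f} (e = {math.e:.5f})")
print(f"cost in base 2: {radix_cost(2):.2f}, base 3: {radix_cost(3):.2f}")
```

Under this cost model the minimum falls at a noninteger base close to e, and among integer bases 3 is cheaper than 2, while bases 2 and 4 cost the same.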
Data availability
There is no additional data associated with the paper.
About this article
Cite this article
Kak, S. The Intrinsic Dimensionality of Data. Circuits Syst Signal Process 40, 2599–2607 (2021). https://doi.org/10.1007/s00034-020-01583-8
DOI: https://doi.org/10.1007/s00034-020-01583-8