The article provides a summary of a statistical expert system, STATEX, developed by a group of statisticians and programmers in Almaty, Kazakhstan, in 1989-1993. STATEX was designed as a user-friendly tool for advanced data analysis that could be used by people who are not experts in statistics. The software was distributed broadly in several countries of the former Soviet Union. I describe the history of the project, outline the key features of STATEX, and compare the software with other statistical packages, both past and present. This comparison, besides its historical value, suggests that some of the ideas behind STATEX are not obsolete and could be re-applied now in a very different computational environment. The article provides a reference to the PDF file of the STATEX User’s Guide and reproduces its table of contents.
Model Assisted Statistics and Applications, Dec 29, 2009
... 17. Joaquin Sorolla y Bastida, Valencian Fisherman, 1895, Spain. ... [5] CP Bruter, ed., Mathematics and Art: Mathematical Visualization in Art and Education, Springer, Heidelberg, 2002. ... [7] S. Chandrasekhar, Truth and Beauty: Aesthetics and Motivations in Science, University of ...
The recent IPCC report on Climate Change calls for immediate action from governments, requiring tremendous efforts and expenses worldwide. However, the science behind this report is far from clear. It reinforces the false sense of scientific “consensus”, while in fact it does not answer the most fundamental questions, many of which were raised long ago. We reformulate those questions for the authors of the Report and add some others in the hope that they will be unequivocally and publicly answered. The stakes are too high to ignore obvious logical and factual faults of the Report. Statistical analysis with several techniques is also performed; it shows no evident support for the hypothesis of an anthropogenic impact on temperature change.
The notion of causality lies at the heart of science, but when statistics tries to address this issue some profound questions remain unanswered. How is statistical inference in probabilistic terms linked with causality? What do modern causality models offer that is substantially different from traditional dependency models like regression or decision trees, and do they deliver on these promises? How are causality models related to statistical and machine learning techniques? What is the relationship between causality modeling, statistical inference, and machine learning on one side, and operations research and optimization on the other? Or, more generally: if the causal picture of the world is a commonly accepted goal of any science, could non-causal statistical models be of any use? If yes, in what sense? If not, why are they so widely used? The insufficient level of detail in discussions of these and similar problems creates a lot of confusion, especially now, when lauded terms like Data Mining, Big Data, Deep Learning and others appear even in the non-professional media. This paper inspects the underlying logic of different approaches directly or indirectly related to causality. It shows that even established methods are vulnerable to small deviations from the ideal setting and that the leading approaches to statistical causality, Structural Equations Modeling (SEM), Directed Acyclic Graphs (DAG) and Potential Outcomes (PO), do not provide a coherent causality theory; it argues that such a theory is impossible on purely statistical grounds. It also discusses a new approach in which the concept of causality is replaced by the concept of dependent variable generation. A separation of the variables generating the outcome from others merely correlated with it (which often also separates causal from non-causal variables) is proposed.
This paper briefly lists events related to the wide and rapid spread of Critical Race Theory (CRT) in the USA; it contains no new scientific facts about CRT. It does, however, emphasize one very important aspect of CRT: the inevitable tsunami of double thinking in society, overwhelming current concerns with political correctness.
I share my personal thoughts about Boris Mirkin and, as a witness of his long-term development in data analysis (especially in the area of classification), pose several questions about the future of this area. They concern the mutual treatment of variables whose variation has very different practical importance; the relationship between internal classification criteria and external goals of data analysis; and the dubious role of distance in clustering in the light of recent results about metrics in high-dimensional space. The key question is the prospects of “complexity statistics,” by analogy with “complexity economics.”
Introduction to Sociosystemics. Science about the utilizing of social sciences ... Electronic copy available at: http://ssrn.com/abstract=1764582
International Journal of Information Technology and Decision Making, Mar 1, 2017
We consider estimating the dependence of one variable on another using a new measure called the coefficient of structural association (CSA). It is based on the distribution of one variable along the segments of another one, and yields a gauge similar to the correlation ratio in nonlinear regression modeling. This index can be constructed as a quotient of the observed and maximum possible variances. The relations of CSA to other measures of dependence are also described; in particular, for binary variables CSA reduces to Loevinger’s coefficient of association. Numerical simulations show that CSA is a powerful tool for data analysis where traditional measures fail. This method can enrich both theoretical and practical estimations for identifying hidden patterns in the data and help managers and researchers make appropriate decisions.
Model Assisted Statistics and Applications, Feb 18, 2013
Traditional statistical modeling in applied marketing is a huge area for both scientists and practitioners. Agent Based Models (ABM), on the other hand, are a comparatively new field that has gained weight very quickly in the last decade. The purpose of this article is to show some aspects of the interrelations between these two types of models and address their common challenges, based on the concept of self-organized ABM. In particular, the complicated problems of semantic analysis of marketing texts and the extraction of numerical values (a problem scarcely discussed in the literature) are considered; the importance of the latter is demonstrated with examples from marketing texts. A draft version of a Marketing Net Ontology is proposed. An ABM for a typical web-oriented business process is designed. Using this example, it is shown how problems of tuning optimal ABM parameters may be better understood with the proposed statistical techniques.
Papers by Igor Mandel