The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language have frustrated efforts at decipherment since the discovery of the remains of the Indus civilisation. Recently, some researchers have questioned the premise that the Indus script encodes spoken language. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyse the Indus script for syntax. Our main results are that the script has well-defined signs which begin and end texts, that there is directionality and strong correlations in the sign order, and that there are groups of signs which appear to have identical syntactic function. All these require no a priori suppositions regarding the syntactic or semantic content of the signs, but follow directly from the statistical analysis. Using information ...
In several ancient Indian texts a mention is made of the movement of the Saptarṣi constellation (Big Bear or Big Dipper) in the sky, visiting each nakṣatra for 100 years. Saptarṣi is said to visit a nakṣatra if the nakṣatra is in the middle of the stars in the first part of Saptarṣi. Since astronomical objects except planets are more or less stationary in the sky, this is generally considered a fanciful statement devoid of astronomical meaning. We show that this may not be so. We show that the visit of Saptarṣi to different nakṣatras may be a very significant astronomical observation. The transition time is not constant, since it depends on the proximity of Saptarṣi to the North Pole, which changes due to Earth’s precession, and on the relative sizes of the different nakṣatras. We show that since 8000 BC, Saptarṣi has visited 5 different nakṣatras and that for one of them the transition happened in the span of roughly 100 years. We show that this interpretation allows dating of this belief which is consistent...
Abstract: The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language have frustrated efforts at decipherment since the discovery of the remains of the Indus civilisation. Recently, some researchers have questioned the premise that the Indus script encodes spoken language. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyse the Indus script for syntax. Our main results are that the script has well-defined signs which begin and end texts, that there is directionality and strong correlations in the sign order, and that there are groups of signs which appear to have identical syntactic function. All these require no a priori suppositions regarding the syntactic or semantic content of the signs, but follow directly from the statistical analysis. Using inf...
One of the unexplained items of ancient Indian astronomical traditions is an apparent absence of records of supernovae, which are the last moments of dying stars when they become several orders of magnitude brighter than usual and may often be visible in the daytime sky. In the present paper, we make a list of about 12 supernovae that should have been visible during the periods of prehistory and history.
In a recent Last Words column (Sproat 2010), Richard Sproat laments the reviewing practices of “general science journals” after dismissing our work and that of Lee, Jonathan, and Ziman (2010) as “useless” and “trivially and demonstrably wrong.” Although we expect such categorical statements to have already raised some red flags in the minds
The following datasets were used for the comparative statistical analysis reported in the paper. Note that the datasets are of different sizes because they were obtained from different sources – a smoothing technique was used to counter the effects of different sample sizes in estimation (see
The script of the ancient Indus civilization remains undeciphered. The hypothesis that the script encodes language has recently been questioned. Here we present evidence for the linguistic hypothesis by showing that the script’s conditional entropy is closer to that of natural languages than to that of various types of nonlinguistic systems. The Indus civilization flourished ~2600 to 1900 before the common era in what is now eastern Pakistan and northwestern India (1). No historical information exists about the civilization, but archaeologists have uncovered samples of their writing on stamp seals, sealings, amulets, and small tablets. The script on these objects remains undeciphered, despite a number of attempts and claimed decipherments (2). A recent article (3) questioned the assumption that the script encoded language, suggesting instead that it might have been a nonlinguistic symbol system akin to the Vinča inscriptions of southeastern Europe and Near Eastern emblem systems. We compared ...
Proceedings of the National Academy of Sciences, 2009
Although no historical information exists about the Indus civilization (flourished ca. 2600–1900 B.C.), archaeologists have uncovered about 3,800 short samples of a script that was used throughout the civilization. The script remains undeciphered, despite a large number of attempts and claimed decipherments over the past 80 years. Here, we propose the use of probabilistic models to analyze the structure of the Indus script. The goal is to reveal, through probabilistic analysis, syntactic patterns that could point the way to eventual decipherment. We illustrate the approach using a simple Markov chain model to capture sequential dependencies between signs in the Indus script. The trained model allows new sample texts to be generated, revealing recurring patterns of signs that could potentially form functional subunits of a possible underlying language. The model also provides a quantitative way of testing whether a particular string belongs to the putative language as captured by th...
Abstract: One of the unexplained items of ancient Indian astronomical traditions is an apparent absence of records of supernovae, which are the last moments of dying stars when they become several orders of magnitude brighter than usual and may often be visible in the daytime sky. ...
When did humans begin astronomical observations? The oldest human observations are scattered through various Palaeolithic epochs. These observations are seen in the form of cave paintings at various sites in France and Spain and include the phases of the moon ...
1 Dept. of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA 2 Dept. of Astronomy & Astrophysics, Tata Institute of Fundamental Research, Mumbai 400005, India 3 Oracle, Hyderabad 500081, India 4 The Institute of Mathematical ...
We search for potentially grammatical patterns in the Indus writing based on the concordance of Mahadevan (1977). We make no assumptions about its structure or meaning. We only attempt to check if the Indus writing is meaningfully structured with specific rules to code useful ...
Abstract: We adopt a comprehensive approach to segment the Indus texts using statistically significant signs and their combinations in addition to all the texts of length 2, 3 and 4 signs. We find that we can segment 88% of Indus texts (of length 5 and above) by this method and hence it ...
Papers by Hrishikesh Joglekar