  • Alexander James O’Neill is a specially appointed assistant professor in Buddhist studies at Musashino University, in ...
This paper presents three experiments to identify the most effective and efficient ASR pipeline for the documentation and preservation of endangered languages, which are often extremely low-resourced. With data from two languages of Nepal, Dzardzongke and Newar, we show that model improvements differ for different amounts of data, and that transfer learning and a range of modifications (e.g. normalising amplitude and pitch) can be effective, but that a consistently standardised orthography as NLP input and post-training dictionary corrections improve results even more.
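As an illustration only, two of the steps named in the abstract above (amplitude normalisation and post-training dictionary correction) might be sketched as follows. This is a minimal sketch, not the authors' pipeline: the function names `peak_normalise` and `apply_dictionary`, the `target_peak` parameter, and the placeholder tokens are all assumptions introduced here, and peak scaling is only one of several possible ways to normalise amplitude.

```python
def peak_normalise(samples, target_peak=0.9):
    """Scale audio samples so the largest absolute value equals target_peak.

    Hypothetical helper: `samples` is assumed to be a sequence of floats
    in the range [-1.0, 1.0].
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty input: nothing to scale
    scale = target_peak / peak
    return [s * scale for s in samples]


def apply_dictionary(transcript, corrections):
    """Replace whole whitespace-separated tokens found in `corrections`.

    Hypothetical helper: `corrections` maps a misrecognised token to its
    standardised spelling.
    """
    return " ".join(corrections.get(tok, tok) for tok in transcript.split())


# Placeholder data, not drawn from the Dzardzongke or Newar corpora.
normalised = peak_normalise([0.1, -0.5, 0.25], target_peak=1.0)
corrected = apply_dictionary("a wrod here", {"wrod": "word"})
```

Token-level lookup is the simplest form of dictionary correction; it only helps when the orthography is consistent enough for misrecognised forms to map onto a single standard spelling.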
By the end of the century, over half of the 6500 languages spoken in the world will die out (Turin, 2007). Nepal's situation is particularly dire: of the 120+ distinct languages identified in the 2011 census, 60 are endangered due to globalisation, socio-political unrest, and environmental challenges. The loss of these languages also means the loss of unique cultural and religious identifiers. Given this, there is a need for methods and tools to preserve linguistic diversity. A major challenge in language preservation, however, is the transcription bottleneck (Shi et al., 2021): transcribing one minute of audio requires an average of 40+ minutes (Durantin et al., 2017). This becomes even more complicated for endangered languages with no (standardised) orthographies or documentation. While advanced automatic speech recognition (ASR) tools are available, they are often ineffective for these extremely low-resource languages (Foley et al., 2018). This poster presents preliminary results addressing these issues for Newar and Dzardzongke (languages spoken in Nepal that represent different branches of the Sino-Tibetan family), using Wav2Vec2 models fine-tuned for low-resource languages (Coto-Solano 2021, 2022). We show that endangered languages benefit from a specific set of optimisation procedures, through tests comparing Kaldi with Wav2Vec2, different types of data augmentation, and the development of a new orthography or the standardisation of an existing one.
This dataset is a model for handwritten text recognition (HTR) of Sanskrit and Newar Nepalese manuscripts in Pracalit script. This paper introduces the state of the field in Newar literature, Newar manuscripts, and HTR engines. It explains our methodology for developing the requisite ground truth, consisting of manuscript images and corresponding transcriptions; the training of our model with a PyLAia engine; and the model's limitations. The dataset, shared on Zenodo, can be used by anyone working with manuscripts in Pracalit script, which will benefit the fields of Indology and Newar studies, as well as historical and linguistic analysis.
This paper explores references to book worship in later Mahāyāna literature, broadly after the eighth century. It considers how textual prescriptions within Mahāyāna sūtra literature may have matured into actual practice. Such references demonstrate how Mahāyāna sūtras were gradually incorporated into tantric ritual practice, with the Prajñāpāramitā at the fore, and came to be included in maṇḍalas used in poṣadha rites.
This study concerns the worship and utilisation of Mahāyāna sūtra literature among Newar Buddhists of the Kathmandu Valley, Nepal. The study begins by considering the contents of the texts being worshipped and the historical development of this type of worship in Nepal. In elaborating the character of contemporary sūtra worship, the study considers the organisational structure of the worshippers of the sūtras, the sūtras' popular significance in Nepal, and the manner in which their power is conceived of as related to the presence of life in the manuscripts, after which the practices of display (darśan yāyegu) and recitation (pā thyākegu) are explained. This study concludes that sūtra worship among the Newars highlights the presence of the divine in texts, is an example of localisation, and serves as a pivot point in ongoing renewal and reform.
This study explores self-referential passages in Mahāyāna sūtra literature. It argues that these passages mediate a reader's or listener's approach to a text in much the same manner as paratexts do through external or adjacent devices such as commentaries; unlike paratexts, however, these passages sit within the body of the text itself. This study explicates the types of self-referential passages in Mahāyāna literature, including encouragement to practice and propagate the text; turning it into a book; preserving the text; statements regarding the text's benefits; identification of the text with other qualities or principles; the qualifications required for obtaining the text; and passages for the entrustment of the text. After noting the relative absence of such passages outside of Mahāyāna literature, it argues that such passages reveal that, for some adherents of the disparate early Mahāyāna, textuality was a medium of unprecedented value and utility in promoting novel texts and doctrines.
A review of Professor Jinah Kim's "Receptacle of the Sacred: Illustrated Manuscripts and the Buddhist Book Cult in South Asia" for "Himalaya."
A conference report on the Seventh Annual Kathmandu Conference on Nepal and the Himalaya organized by the Social Science Baha at Hotel Shanker in Kathmandu on 25-27 July 2018.