This short paper describes the necessity and advantages of using an alphabetic transliteration side by side with the original Ethiopic script when working on Ethiopic text corpora and when compiling word indices, concordances, and similar tools. Sample pages from a work in progress, the synoptic edition of the Imperial Songs in Old Amharic, are given.
This is a prescriptivist take on the pluralization of words that are already plural. Such double pluralization happens frequently when Amharic speakers who are unfamiliar with Ge'ez singular and plural forms use Ge'ez words in Amharic.
This paper presents a supervised machine learning approach to incrementally learn and segment affixes using generic background knowledge. We used a Prolog script to split affixes from Amharic words for further morphological analysis. Amharic, a Semitic language, has very complex inflectional and derivational verb morphology, with many possible prefixes and suffixes that mark various grammatical features. Further segmentation of the affixes into valid morphemes is the challenge addressed in this paper. The paper demonstrates how examples presented incrementally, from easy to complex, can be used to learn such language constructs. The experiment revealed that affixes could be further segmented into valid prefixes and suffixes using a generic and robust string manipulation script, with the help of an intelligent teacher who presents examples in increasing order of complexity so that the system can gradually build its knowledge. The system performs the segmentation with a precision of 0.94 and a recall of 0.97.
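The idea of a teacher presenting examples in increasing order of complexity, and of scoring the result with precision and recall, can be illustrated with a minimal Python sketch. This is not the Prolog script or the learning method used in the paper: the dictionary-based learner, the function names (`learn`, `segment`, `boundary_scores`), and the toy strings (which are not real Amharic morphemes) are all assumptions made for illustration.

```python
def learn(examples):
    """Collect known morphemes from (affix, segmentation) pairs.

    The examples are assumed to be ordered from simple to complex by a teacher;
    this toy learner simply accumulates every morpheme it is shown.
    """
    known = set()
    for _affix, morphemes in examples:
        known.update(morphemes)
    return known


def segment(affix, known):
    """Split an affix string into known morphemes, preferring fewer pieces.

    Returns a list of morphemes, or None if no full segmentation exists.
    """
    # best[i] holds the shortest segmentation found for affix[:i], or None.
    best = [None] * (len(affix) + 1)
    best[0] = []
    for i in range(1, len(affix) + 1):
        for j in range(i):
            if best[j] is not None and affix[j:i] in known:
                candidate = best[j] + [affix[j:i]]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[len(affix)]


def cut_points(morphemes):
    """Return the internal boundary offsets of a segmentation."""
    cuts, offset = set(), 0
    for m in morphemes[:-1]:
        offset += len(m)
        cuts.add(offset)
    return cuts


def boundary_scores(predicted, gold):
    """Precision and recall of predicted boundaries over paired segmentations."""
    true_pos = pred_total = gold_total = 0
    for p, g in zip(predicted, gold):
        p_cuts, g_cuts = cut_points(p), cut_points(g)
        true_pos += len(p_cuts & g_cuts)
        pred_total += len(p_cuts)
        gold_total += len(g_cuts)
    precision = true_pos / pred_total if pred_total else 0.0
    recall = true_pos / gold_total if gold_total else 0.0
    return precision, recall


# Toy teaching sequence: single morphemes first, then a combination.
toy_examples = [("ab", ["ab"]), ("cd", ["cd"]), ("abcd", ["ab", "cd"])]
toy_known = learn(toy_examples)
```

With these toy examples, `segment("cdab", toy_known)` splits an unseen combination into `["cd", "ab"]`, and `boundary_scores` compares predicted cut points with gold ones. Boundary-level scoring is one common choice; the abstract does not say how its precision and recall figures were computed.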
We introduce an approach to the grouping of morphemes into suffix slots in morphologically complex languages using a genetic algorithm. The method is applied to verbs in Amharic, a morphologically rich Semitic language. We start with a limited set of segmented verbs and the set of suffixes themselves, extracted on the basis of our previous work. Each member of the population for the genetic algorithm is an assignment of the morphemes to one of the set of possible slots. The fitness function combines scores for exact slot position and correct ordering of morphemes. We use mutation but no crossover operator and test various combinations of population size, mutation rate, and maximum number of generations; the populations evolve to yield promising morpheme classification results. We evaluate the fittest individuals on the basis of the known morpheme classes for Amharic.
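A mutation-only genetic algorithm of this general kind can be sketched in Python as follows. The exact fitness function is not given in the abstract, so the scoring here is an assumption: one point when the k-th suffix of a verb is assigned to slot k, plus one point for each adjacent pair of suffixes whose slots increase. The function names, the truncation-style selection, the default parameters, and the toy suffix labels are likewise hypothetical.

```python
import random


def fitness(assignment, verbs):
    """Score a morpheme-to-slot assignment against segmented suffix sequences."""
    score = 0
    for suffixes in verbs:
        slots = [assignment[m] for m in suffixes]
        # Position score (assumed form): the k-th suffix sits in slot k.
        score += sum(1 for k, s in enumerate(slots) if s == k)
        # Ordering score: each suffix is in a later slot than its predecessor.
        score += sum(1 for a, b in zip(slots, slots[1:]) if a < b)
    return score


def mutate(assignment, n_slots, rate, rng):
    """Return a copy in which each morpheme is reassigned with probability `rate`."""
    child = dict(assignment)
    for morpheme in child:
        if rng.random() < rate:
            child[morpheme] = rng.randrange(n_slots)
    return child


def evolve(verbs, n_slots, pop_size=30, rate=0.2, generations=50, seed=0):
    """Evolve slot assignments using mutation only (no crossover)."""
    rng = random.Random(seed)
    morphemes = sorted({m for suffixes in verbs for m in suffixes})
    population = [
        {m: rng.randrange(n_slots) for m in morphemes} for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Keep the fitter half unchanged, refill with mutated copies of survivors.
        population.sort(key=lambda a: fitness(a, verbs), reverse=True)
        survivors = population[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), n_slots, rate, rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=lambda a: fitness(a, verbs))


# Toy data: suffix labels are placeholders, not real Amharic suffixes.
toy_verbs = [["s1", "s2", "s3"], ["s1", "s3"], ["s2", "s3"]]
best = evolve(toy_verbs, n_slots=3)
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases. Varying `pop_size`, `rate`, and `generations` corresponds to the parameter combinations the abstract mentions.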
A terse demonstration, with little explanation, of how an historical figure can change the landscape of language, and how a few moments of critical thinking from a native speaker and reader can illuminate connections hitherto forgotten.