Karan Singla

IIIT Hyderabad, LTRC, Graduate Student

Papers by Karan Singla

Can Distributed Word Embeddings be an alternative to costly linguistic features: A Study on Parsing Hindi

by Aniruddha Tammewar, Karan Singla, and Bhasha Agrawal

Word embeddings have been shown to be useful in a wide range of NLP tasks. We explore methods of using the embeddings in dependency parsing of Hindi, a MoR-FWO (morphologically rich, relatively freer word order) language, and show that they not only help improve the quality of parsing, but can even act as a cheap alternative to the traditional features, which are costly to acquire. We demonstrate that if we use distributed representations of lexical items instead of features produced by costly tools such as a morphological analyzer, we get competitive results. This implies that a monolingual corpus alone will suffice to produce good accuracy for resource-poor languages for which these tools are unavailable. We also explore the importance of these representations for domain adaptation.

Predicting Post-Editor Profiles from the Translation Process

by David Orrego-Carmona, Michael Carl, and Karan Singla

Enhancing ASR by MT using Semantic information from Hindi WordNet

by Michael Carl and Karan Singla

In a CAT (Computer Assisted Translation) system, a human translator translates a source language string into a target language string using different input methods such as speech and typing. In this paper, we improve the performance of speech recognition for a translator speaking in the target language, taking advantage of the source language string and information from WordNet. We use machine translation (MT) to translate the source language string into the target language, and use this translation together with the semantic information that WordNet provides for the words in the translated string to bias the speech recogniser towards the gained knowledge. We perform different experiments, including varying the number of MT hypotheses and using different techniques for incorporating the semantic information. Overall, we outperform the baseline system, which has no semantic information, with an increase in word accuracy of 1.6% for the Hindi ASR in the English-Hindi system.

SEECAT: ASR & Eye-tracking Enabled Computer-Assisted Translation

by Aniruddha Tammewar, Michael Carl, Karan Singla, and Ankita Thakur

Typing has traditionally been the only input method used by human translators working with computer-assisted translation (CAT) tools. However, speech is a natural communication channel for humans and, in principle, it should be faster and easier than typing on a keyboard. This contribution investigates the integration of automatic speech recognition (ASR) in a CAT workbench, testing its real use by human translators while post-editing machine translation (MT) outputs. This paper also explores the use of MT combined with ASR in order to improve recognition accuracy in a workbench integrating eye-tracking functionalities to collect process-oriented information about translators' performance.

Two-stage Approach for Hindi Dependency Parsing Using MaltParser

by Naman Jain and Karan Singla

Proceedings of the Workshop on Machine Translation and Parsing in Indian Languages (MTPIL-2012), pages 163–170, COLING 2012, Mumbai, December 2012

In this paper, we present our approach to dependency parsing of Hindi as part of the Hindi Shared Task on Parsing, COLING 2012. We examine the effect of different settings available in MaltParser under a two-step parsing strategy, i.e. splitting the data into interChunks and intraChunks, to obtain the best possible LAS, UAS and LA accuracy. Our system achieved the best LAS of 90.99% on the Gold Standard track and the second-best LAS of 83.91% on automated data.

Predicting Post-Editor Profiles from Translation Process

Exploring system combination approaches for Indo-Aryan MT systems

Reducing data sparsity in Statistical Machine Translation

SEECAT: ASR & Eye-tracking Enabled Computer-Assisted Translation

Typing has traditionally been the only input method used by human translators working with computer-assisted translation (CAT) tools. However, speech is a natural communication channel for humans and, in principle, it should be faster and easier than typing on a keyboard. This contribution investigates the integration of automatic speech recognition (ASR) in a CAT workbench, testing its real use by human translators while post-editing machine translation (MT) outputs. This paper also explores the use of MT combined with ASR in order to improve recognition accuracy in a workbench integrating eye-tracking functionalities to collect process-oriented information about translators' performance.

Enhancing ASR by MT using Semantic information from Hindi WordNet

In a conventional CAT (Computer Assisted Translation) system, a human translator post-edits an automatically generated target language text using the keyboard. In this paper, we extend a CAT system with speech input by which the translator speaks the translation, a process referred to as sight translation. We report several experiments to improve the performance of an automatic speech recognition system, taking advantage of machine translation output and information from WordNet. Overall, we outperform a baseline system which has no semantic information, with a 1.6% increase in word accuracy for English to Hindi translation.

Two-stage Approach for Hindi Dependency Parsing Using MaltParser

by Karan Singla and Aniruddha Tammewar

In this paper, we present our approach to dependency parsing of Hindi as part of the Hindi Shared Task on Parsing, COLING 2012. We examine the effect of different settings available in MaltParser under a two-step parsing strategy, i.e. splitting the data into interChunks and intraChunks, to obtain the best possible LAS, UAS and LA accuracy. Our system achieved the best LAS of 90.99% on the Gold Standard track and the second-best LAS of 83.91% on automated data.
