Surprise! What's in a Cebuano or Hindi Name?

Published: 01 September 2003

Abstract

    Empirical results are presented for creating training data and training a statistical name learning algorithm on Cebuano and Hindi in roughly three weeks' time. The empirical study compares performance in a compressed time frame against performance of the same statistical language model in English (where there was no compressed time frame). Rapid development of several co-reference heuristics in Hindi is also described, and co-reference performance in Hindi is compared to previously developed English techniques.

    Published In

    ACM Transactions on Asian Language Information Processing  Volume 2, Issue 3
    September 2003
    132 pages
    ISSN:1530-0226
    EISSN:1558-3430
    DOI:10.1145/979872

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Cebuano
    2. Extraction
    3. Hindi

    Qualifiers

    • Article

    Cited By

    • (2022) BiLSTM-CRF Manipuri NER with Character-Level Word Representation. Arabian Journal for Science and Engineering. DOI: 10.1007/s13369-022-06933-z. Online publication date: 22-Jun-2022
    • (2021) Named Entity Recognizer for Konkani Text. ICT with Intelligent Applications (687-702). DOI: 10.1007/978-981-16-4177-0_69. Online publication date: 6-Dec-2021
    • (2011) Extreme extraction. Proceedings of the Conference on Empirical Methods in Natural Language Processing (1437-1446). DOI: 10.5555/2145432.2145585. Online publication date: 27-Jul-2011
    • (2011) Learning Recognition of Ambiguous Proper Names in Hindi. Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops - Volume 01 (178-182). DOI: 10.1109/ICMLA.2011.87. Online publication date: 18-Dec-2011
    • (2009) Indian Language Information Retrieval. Guide to OCR for Indic Scripts (301-314). DOI: 10.1007/978-1-84800-330-9_16. Online publication date: 28-Aug-2009
    • (2003) Rapid development of Hindi named entity recognition using conditional random fields and feature induction. ACM Transactions on Asian Language Information Processing 2:3 (290-294). DOI: 10.1145/979872.979879. Online publication date: 1-Sep-2003
