DOI: 10.1145/2740908.2742009
research-article

The Computable News project: Research in the Newsroom

Published: 18 May 2015

    Abstract

    We report on a four-year academic research project to build a natural language processing platform in support of a large media company. The Computable News platform processes news stories, producing a layer of structured data that can be used to build rich applications. We describe the underlying platform and the research tasks that we explored while building it. The platform supports a wide range of prototype applications, each designed to support a different newsroom function. We hope that this qualitative review provides some insight into the challenges involved in this type of project.

      Published In

      WWW '15 Companion: Proceedings of the 24th International Conference on World Wide Web
      May 2015
      1602 pages
      ISBN: 9781450334730
      DOI: 10.1145/2740908

      Sponsors

      • IW3C2: International World Wide Web Conference Committee

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. event linking
      2. named entity linking
      3. online news applications
      4. quotation extraction and attribution

      Qualifiers

      • Research-article

      Funding Sources

      • Australian Research Council Discovery
      • Capital Markets Cooperative Research Centre

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Cited By

      • (2024) Extraction and attribution of public figures statements for journalism in Indonesia using deep learning. Knowledge-Based Systems 289:C. DOI: 10.1016/j.knosys.2024.111558. Online publication date: 8-Apr-2024
      • (2022) PFSA-ID: an annotated Indonesian corpus and baseline model of public figures statements attributions. Global Knowledge, Memory and Communication. DOI: 10.1108/GKMC-04-2022-0091. Online publication date: 8-Nov-2022
      • (2020) Understanding quotation extraction and attribution: towards automatic extraction of public figure’s statements for journalism in Indonesia. Global Knowledge, Memory and Communication (ahead-of-print). DOI: 10.1108/GKMC-07-2020-0098. Online publication date: 2-Dec-2020
      • (2016) An Interface for Assisted Curation of Knowledge Bases from Unstructured Text. Proceedings of the 2016 49th Hawaii International Conference on System Sciences (HICSS), pages 4386-4393. DOI: 10.1109/HICSS.2016.545. Online publication date: 5-Jan-2016