An Overview of BioCreative II.5

Published: 01 July 2010

Abstract

    We present the results of the BioCreative II.5 evaluation, run in association with the FEBS Letters experiment, in which authors created Structured Digital Abstracts to capture information about protein-protein interactions. The BioCreative II.5 challenge evaluated automatic annotations from 15 text mining teams against a gold standard created by reconciling annotations from curators, authors, and automated systems. There were three tasks: to rank articles for curation based on curatable protein-protein interactions; to identify the interacting proteins (using UniProt identifiers) in the 61 positive articles; and to identify interacting protein pairs. The evaluation test set contained 595 full-text articles, both with and without curatable protein interactions. The principal evaluation metrics were the interpolated area under the precision/recall curve (AUC iP/R) and the balanced F-measure. For article classification, the best AUC iP/R was 0.70. For interacting proteins, the best system achieved a macroaveraged recall of 0.73 and an AUC iP/R of 0.58, after filtering incorrect species and mapping homonymous orthologs. For interacting protein pairs, the top (filtered, mapped) recall was 0.42 and AUC iP/R was 0.29. Ensemble systems improved performance for the interacting protein task.
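
The two principal metrics named in the abstract can be illustrated with a minimal sketch. It assumes one common definition of interpolated precision (the precision at recall r is the highest precision observed at any recall greater than or equal to r) and sums the area under the resulting step curve. The challenge's exact evaluation procedure is not given in this abstract, so the function names (`f_measure`, `pr_points`, `auc_ipr`) and the toy ranking are illustrative assumptions, not the official scoring script.

```python
from typing import List, Sequence, Tuple


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pr_points(ranked_hits: Sequence[bool],
              total_positives: int) -> List[Tuple[float, float]]:
    """Return (recall, precision) after each correct result in a ranked list."""
    points = []
    tp = 0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            tp += 1
            points.append((tp / total_positives, tp / rank))
    return points


def auc_ipr(ranked_hits: Sequence[bool], total_positives: int) -> float:
    """Area under the interpolated precision/recall curve (assumed definition).

    Interpolated precision at recall r is the maximum precision seen at any
    recall >= r; the area is accumulated as a step function over recall.
    """
    if total_positives <= 0:
        return 0.0
    points = pr_points(ranked_hits, total_positives)

    # Walk from the highest recall down, carrying the best precision so far.
    interpolated = []
    best_later = 0.0
    for recall, precision in reversed(points):
        best_later = max(best_later, precision)
        interpolated.append((recall, best_later))
    interpolated.reverse()

    # Sum rectangle areas between successive recall levels.
    area = 0.0
    prev_recall = 0.0
    for recall, precision in interpolated:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    # Toy ranking: True marks a correct result; 4 gold-standard positives,
    # one of which the hypothetical system never returns.
    ranking = [True, False, True, True, False]
    print(auc_ipr(ranking, total_positives=4))
    print(f_measure(tp=3, fp=1, fn=1))
```

Because a positive that is never returned contributes no area, this measure rewards both ranking correct results early and recovering as many of them as possible, which is why it suits ranked outputs, whereas the F-measure scores an unranked result set.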


          Published In

          IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 7, Issue 3
          July 2010
          192 pages

          Publisher

          IEEE Computer Society Press

          Washington, DC, United States

          Author Tags

          1. Text mining
          2. biological curation
          3. molecular biology
          4. natural language processing
          5. text analysis

          Cited By

          • (2015) Charaparser+EQ. Proceedings of the 78th ASIS&T Annual Meeting: Information Science with Impact: Research in and for the Community, pp. 1-10. DOI: 10.5555/2857070.2857090. Online publication date: 6-Nov-2015
          • (2015) Protein-protein interaction identification using a hybrid model. Artificial Intelligence in Medicine 64:3, pp. 185-193. DOI: 10.1016/j.artmed.2015.05.003. Online publication date: 1-Jul-2015
          • (2014) Gene name disambiguation using multi-scope species detection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 11:1, pp. 55-62. DOI: 10.1109/TCBB.2013.139. Online publication date: 1-Jan-2014
          • (2014) Lessons learnt from the DDIExtraction-2013 Shared Task. Journal of Biomedical Informatics 51:C, pp. 152-164. DOI: 10.1016/j.jbi.2014.05.007. Online publication date: 1-Oct-2014
          • (2013) Learning bayesian network using parse trees for extraction of protein-protein interaction. Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2, pp. 347-358. DOI: 10.1007/978-3-642-37256-8_29. Online publication date: 24-Mar-2013
          • (2012) Improving Protein-Protein Interaction Pair Ranking with an Integrated Global Association Score. IEEE/ACM Transactions on Computational Biology and Bioinformatics 9:6, pp. 1690-1695. DOI: 10.1109/TCBB.2012.99. Online publication date: 1-Nov-2012
          • (2012) k-Information Gain Scaled Nearest Neighbors. IEEE/ACM Transactions on Computational Biology and Bioinformatics 9:1, pp. 305-310. DOI: 10.1109/TCBB.2011.32. Online publication date: 1-Jan-2012
          • (2012) Relation mining experiments in the pharmacogenomics domain. Journal of Biomedical Informatics 45:5, pp. 851-861. DOI: 10.1016/j.jbi.2012.04.014. Online publication date: 1-Oct-2012
          • (2010) Efficient Extraction of Protein-Protein Interactions from Full-Text Articles. IEEE/ACM Transactions on Computational Biology and Bioinformatics 7:3, pp. 481-494. DOI: 10.1109/TCBB.2010.51. Online publication date: 1-Jul-2010
          • (2010) Extracting Protein Interactions from Text with the Unified AkaneRE Event Extraction System. IEEE/ACM Transactions on Computational Biology and Bioinformatics 7:3, pp. 442-453. DOI: 10.1109/TCBB.2010.46. Online publication date: 1-Jul-2010
