Bruno Sobral


    I am a systems thinker with significant scientific and executive experience building and managing research institutes...
    In light of the age of genomics, a compilation of small-subunit rRNA-encoding genes (16S rRNA genes) from Rickettsiales is provided, with a phylogeny estimated from a subset of the sequences that spans the diversity of Rickettsiales. Robust phylogeny estimation based on whole-genome sequences supports the close relationship of Bartonellaceae with another lineage of facultative intracellular bacteria, Brucellaceae. The first genome sequence from the Anaplasmataceae also reveals reductive genome evolution, with a genome size of 1.3 Mb and 1,195 predicted open reading frames (ORFs). Significant reclassification of the species within the Anaplasmataceae was proposed based on data. Subsequent reevaluation of the trace file archives determined that the Wolbachia sequences discovered in D. mojavensis were an artifact; however, 2,291 novel Wolbachia sequences were found in another fly genome, Drosophila willistoni. This work underscores the utility of eukaryotic genomes for discovering Rickettsiales endosymbionts and potentially assembling partial to complete bacterial genomes from the eukaryotic reads. A robust phylogeny based on whole-genome sequences supports the current taxonomic delineations within the Anaplasmataceae and Rickettsiaceae, with monophyly of each of the six genera strongly supported. The differences between RefSeq and Pathosystems Resource Integration Center (PATRIC) annotation undoubtedly result in discrepancies in protein family clustering between this work and previous studies. Two mitochondrial small-subunit rRNA gene sequences were included in the phylogeny estimation, and this lineage branched after the Holosporaceae, but basal to the derived Rickettsiales. The diversity of the Rickettsiales highlighted in this chapter presents an exciting challenge for rickettsiology.
    A new isolate of Metarhizium flavoviride Gams and Rozsypal (Hyphomycetes) (CG 423) found in Northeast Brazil infecting Schistocerca pallens (Thunberg) was identified using arbitrarily primed PCR. Cluster analysis of DNA markers revealed a high level of homogeneity (>83% similarity) among the Brazilian (CG 423) and two other M. flavoviride isolates from Nigeria (CG 366 = IMI 330189) and Australia (CG 291). However, M. flavoviride isolates were very distinct when compared with two isolates of Metarhizium anisopliae (Metschnikoff) Sorokin (6.4% similarity). Bioassays showed that strain CG 423 is as virulent as other isolates of M. flavoviride (CG 291, CG 366), M. anisopliae (CG 087), and Beauveria bassiana (Balsamo) Vuillemin (CG 425) against the grasshopper Rhammatocerus schistocercoides (Rehn) (Orthoptera: Acrididae), an important pest in Central Brazil. However, the Brazilian isolate of M. flavoviride (CG 423) is more virulent than the Brazilian isolate of B. bassiana (CG 250). B...
    The polymerase chain reaction (PCR) has given plant geneticists, ecologists, and evolutionary and population biologists a powerful new tool for studying their favorite organisms. In this chapter, we will use specific PCR to mean a standard, two-primer amplification that targets a specific genomic region or gene and therefore requires specific primers to be designed based on knowledge of the DNA sequence. We differentiate this from PCR that uses primers of arbitrary sequence to specifically amplify a set of arbitrary loci in any genome, without the requirement for prior sequence knowledge. This is usually referred to as arbitrarily primed PCR or random amplified polymorphic DNA (RAPD) markers; herein, we will use the term arbitrarily primed PCR.
    Reductionist approaches to biological questions have provided important insights into genetic mechanisms and genome structures. These approaches have also led to the engineering of technologies that generate data at rates that far exceed our ability to comprehend their meaning. If biologists are going to address the fundamental questions concerning the complexity that underlies growth, development, and phenotypic variability, then there is an unprecedented need to acquire, understand, manipulate, and exploit high-value genomic information. Further, for genomics to deliver on the promise of using biological processes to make agricultural systems more efficient, sustainable, and environmentally friendly, various types of biological data will need to be integrated to reveal the functional relationships between DNA, RNA, proteins, environment, and phenotypes. Such integrative approaches will rely heavily on the development of information systems (IS) and data analysis methods that will require the same rigor and resources that have been applied to the development of laboratory experimental protocols. In addition, software and process engineering will need to be applied to information systems development in much the same way as engineering principles have been applied to data production in biological laboratories, resulting in a shift from cottage industry to industrial-scale biological data factories.
    The basic theory of using genetic markers to manipulate the loci controlling a trait of interest to plant geneticists was introduced by Sax (1923) over 70 years ago. The application of this theory since that time has been limited by the lack of available segregating genetic markers. Recent advances in methods for assaying DNA polymorphisms have produced hundreds of segregating genetic markers in many species. These advances have allowed the application and further development of the theory of Sax. The genetic markers have been used: (1) as X variables in linear and nonlinear models to determine which markers are near (with reference to genetic recombination) loci controlling a trait of interest; (2) as indirect selection criteria; and (3) in traditional linkage analysis to arrange them into dense genetic linkage maps. Information from the map can be used further to map genetically the trait loci and to determine starting points for finding the trait loci using physical mapping.
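Use (1) above, markers as X variables in a linear model, can be illustrated with a minimal single-marker regression sketch. This is not the method of the cited work: the function names `marker_regression` and `rank_markers`, the 0/1 genotype coding, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def marker_regression(genotypes, trait):
    """Least-squares regression of a trait on one marker coded 0/1.

    Returns (slope, r_squared). Hypothetical helper for illustration only.
    """
    g_bar = mean(genotypes)
    t_bar = mean(trait)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((t - t_bar) ** 2 for t in trait)
    sxy = sum((g - g_bar) * (t - t_bar) for g, t in zip(genotypes, trait))
    if sxx == 0 or syy == 0:
        # A non-segregating marker or a constant trait carries no information.
        return 0.0, 0.0
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


def rank_markers(markers, trait):
    """Rank markers by the share of trait variance each one explains."""
    results = {name: marker_regression(g, trait) for name, g in markers.items()}
    return sorted(results.items(), key=lambda item: item[1][1], reverse=True)


# Toy data (assumed): six progeny, two markers, one quantitative trait.
markers = {
    "m1": [0, 0, 1, 1, 0, 1],
    "m2": [0, 1, 0, 1, 0, 1],
}
trait = [10, 11, 20, 21, 10, 20]

for name, (slope, r_squared) in rank_markers(markers, trait):
    print(f"{name}: slope={slope:.2f}, R^2={r_squared:.3f}")
```

For a 0/1 marker the slope equals the difference between the two genotype-class means, so a marker that explains a large share of trait variance is a candidate for being near (in recombination terms) a locus controlling the trait. A real analysis would also need significance testing and correction for multiple markers.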
    The Pathosystems Resource Integration Center (PATRIC) is a genomics-centric relational database and bioinformatics resource designed to assist scientists in infectious-disease research. This methods paper provides detailed instructions on using this resource to find data specific to genomes, save it in a personalized workspace, and analyze it with a variety of interactive tools. While PATRIC contains many diverse tools and functionalities to explore both genome-scale and gene expression data, the main focus of this chapter is on comparative analysis of bacterial genomes.
    The distinguishing feature of the 'new biology' is that it is information intensive. Not only does it demand access to and assimilation of vast data sets accumulated by engineered laboratory processes, but it also demands a previously unimaginable level of data integration across data types and sources. There are various information resources available for rice. In addition, there are various information resources that are not focused on rice but that contain rice data. The challenge for rice researchers and breeders is to access this wealth of data meaningfully. This challenge will grow significantly as international efforts aimed at sequencing the entire rice genome come into full swing. Only through concerted efforts in bioinformatics will the power of these public data be brought to bear on the needs of rice researchers and breeders worldwide. These efforts will need to focus on two large but distinct areas: (1) development of an effective bioinformatics infrastructure (hardware systems, software systems, and software engineers and support staff) and (2) computational biology research in visualization and analysis of very large, complex data sets, such as those that will be developed using high-throughput expression technologies, large-scale insertional mutagenesis, and biochemical profiling of various types. In the midst of the large flow of high-throughput data that the international rice genome sequencing efforts will produce, it is also imperative that integration of those data with unique germplasm data held in trust by the CGIAR be a part of the informatics infrastructure. This paper will focus on the state of rice information resources, the needs of the rice community, and some proposed bioinformatics activities to support these needs.
    ToolBus is an integrated environment in which data and tools can be interoperable in an open and flexible manner. Using this environment, biological researchers can access many kinds of Bioinformatics data sources and analysis tools. Its utilization of web services and its open API encourage and support the development of tools and visualization plugins by other development groups. As the
    Poly-3-hydroxybutyrate (PHB) and glycogen are major carbon storage compounds in Sinorhizobium meliloti. The roles of PHB and glycogen in rhizobia-legume symbiosis are not fully understood (1, 2, 3). Bacteroids within determinate nodules often accumulate high levels of PHB (up to 70% dry weight) (1). In contrast, bacteroids within indeterminate nodules do not accumulate PHB. In the symbiosis between S. meliloti and alfalfa, it has been reported that phbC mutants form bacteroids capable of fixing nitrogen as efficiently as the wild type (4, 5, 6), but are less competitive than the wild-type strain (5). There are two glycogen synthase-encoding genes in S. meliloti: glgA1, in a cluster of other glycogen synthesis pathway genes on the chromosome, and glgA2, on megaplasmid pSymb. Recently, it has been shown in Rhizobium tropici with Phaseolus vulgaris that there may be a link between glycogen synthase deficiency, decreased exopolysaccharide, and increased symbiotic performance (2). However, the reason for the increased symbiotic efficiency of the glgA mutant is uncertain (3). To determine the roles these compounds may play in the symbiotic process and in the overall physiology of the organism in the free-living and bacteroid states, mutants unable to synthesize PHB and/or glycogen were constructed. A glgA1 mutation was constructed by in-frame deletion, preserving the expression of the downstream pgm gene, while a glgA2 mutation was constructed by disruption of the gene with the Sp omega cassette (7). A pre-existing Tn5-generated mutation of the PHB synthase-encoding gene phbC (8) was combined with glgA1 and glgA2 mutations to make all combinations of double mutants, and the triple mutant (Table 1). PHB was not detectable in free-living cells of any mutant containing the phbC mutation; glycogen was not detectable in any of the mutants containing the glgA1 mutation. The production of PHB decreased significantly in the glgA1 mutant (Rm11479) and the glgA double mutant (Rm11482). The production of glycogen increased significantly in the phbC mutant (Rm11105) in high carbon ratio media. Exopolysaccharide (EPS) was not detected in any of the mutants containing the phbC mutation, while the glgA double mutant (Rm11482) produced much more EPS in MOPS medium compared to the wild type (Rm1021), glgA1
    Next-generation sequencing has revolutionized biology by exponentially increasing sequencing output while dramatically lowering costs. High-throughput sequence data with shorter reads has opened up new applications such as whole genome resequencing, indel and SNP detection, transcriptome sequencing, etc. Several tools are available for the analysis of high-throughput sequencing data. In this chapter, we describe the use of an ultrafast alignment program, Bowtie, to align short-read sequence (SRS) data against the Arabidopsis reference genome. The alignment files generated from Bowtie will be used to identify SNPs and indels using Maq.
    ... Jorge da Silva 2,3, Rhonda J. Honeycutt *, William Burnquist 3, Salah M. Al-Janabi ,,4, Mark E. Sorrells 2, Steven D. Tanksley 2 and Bruno W. S. Sobral *'* *California Institute of Biological Research, 11099 N. Torrey Pines Road, Suite 300, La Jolla, CA 92037, USA (* author for ...
    This article cites 113 articles, 57 of which can be accessed free
    Reviewer 2: The revised manuscript has addressed most of the comments; however, now that the content of figure 5 has been explained, it is difficult to understand why the 24 h post-infection and 48 h post-LPS-treatment time points have been included in the figure. In all other figures, results comparing the effects of infection vs infection with LPS pre-treatment have been included. Since the stated aim of the paper is to understand the LPS-induced blunting of the proinflammatory response, it would be appropriate to include data on the effects of LPS pre-treatment on gene expression in infected mice. Moreover, it would be logical to include data on both the 24 h and 48 h time points.
    The genomes of 11 Bradyrhizobium japonicum serocluster 123 field isolates were analyzed by using field inversion gel electrophoresis. Genomic fingerprints produced by digestion of intact genomic DNA in agarose plugs with the rare-cutting restriction enzymes AseI, DraI, SpeI, and XbaI showed that the isolates were genetically diverse. Few (30 to 50%) isolates exhibited the same fingerprint as the USDA serogroup strain with which they are antigenically related. Southern hybridization with a nifHD gene probe to the blotted field inversion electrophoresis gels provided further evidence of the relatedness between members of serogroups 123 and 127.
    Informatics-driven approaches change how research and development are conducted and who participates, and they enable systems-oriented views of science and research. Most life sciences researchers have a very strong desire for the full integration of data and analysis tools delivered through a single interface. Infectious disease (ID) research and development provides a uniquely challenging and high-impact opportunity. The biological complexity of infectious disease systems, which are composed of multiple scales of interactions between potential pathogens, hosts, vectors, and the environment, challenges information resources because of the breadth of organism-organism and organism-environment interactions. Applications of integrated data for ID serve a variety of constituencies, such as clinicians, diagnosticians, drug and vaccine developers, and epidemiologists. This complexity makes ID an opportune area in which to develop, deploy, and use cyberinfrastructure.
    Brucellosis affects millions of animals and humans worldwide; in humans, over 500,000 new cases are reported annually. Although some vaccines are available for its prevention in animals, none exist for humans. The causative agent is a facultative intracellular bacterial pathogen belonging to the genus Brucella and is transmitted from animals to humans; infected animals experience abortion, and humans experience undulant fever.
    This paper describes a generalized web-based framework for VBI pathosystems informatics & bioinformatics web services. The framework provides a universal mechanism for accessing web services through web browsers. A web service can be configured to the framework easily. The framework is designed using object-oriented methodology and implemented using Java 2 Enterprise Edition, relational databases, XML technology and various other open
    The VBIGenomeACS (VBI Genome Annotation and Comparison System) is a genome annotation and comparison system for prokaryotes and eukaryotes. It has been developed by the Virginia Bioinformatics Institute's (VBI) Cyberinfrastructure Group (CIG) as part of the Pathogen Portal (PathPort) project. Backed by a scalable genome relational database VBIGenomeDB and a Web service business layer, the system provides an extensible and
    Rhizobia are symbiotic nitrogen-fixing soil bacteria associated with host legumes. Signal exchanges between the partners under microaerobiosis are required to establish symbiosis. We have developed a macroarray of Mesorhizobium loti MAFF303099, a microsymbiont of the model legume Lotus japonicus, and monitored the transcriptional dynamics of the bacterium during symbiosis, microaerobiosis, and starvation. Global transcriptional profiling demonstrated that the clusters of genes
    The Cyberinfrastructure Group (CIG) develops and uses methods, infrastructure, and resources to enable scientific discoveries in infectious disease research by applying the principles of cyberinfrastructure to integrate data, computational infrastructure, and people (Atkins, 2003). CIG has developed many public resources for curated, diverse molecular and literature data from various infectious disease systems, and implemented the processes, systems, and databases required to support them. It also conducts research applying its methods, infrastructure and data to make new discoveries of its own. CIG participates in education and outreach activities, resulting in scientific discoveries and publications, and an outreach program involving development of project-centric cyberinfrastructure courses for the educators from high schools and undergraduate institutions as well as graduates and postgraduates. In the reporting period, key accomplishments include publication of the Brucella abor...
    In 2010, scientists at the Microbial Genomics and Advanced Technologies section of the Division of Microbiology and Infectious Diseases of the National Institutes of Health (NIH) proposed the concept of “Community Annotation of the Mycobacterium tuberculosis Genomes”, with the overall goal of combining the expertise and knowledge of the TB research community with the wealth of new data, technologies and algorithms to substantially improve and enhance the annotation of M. tuberculosis (Mtb) genomes. The specific goals of the TBCAP were to: i) improve the Mtb structural annotations, in particular to resolve the problems of missing genes, incorrect start sites, and poor or conflicting operon definitions in the current publicly available databases; ii) associate function with genes, in the realization that about one-quarter of the then-annotated genes were marked as “conserved hypothetical protein”; iii) annotate all sequenced Mtb genomes simultaneously, in the realization that there are minimal sequence differences between Mtb strains; iv) ensure that all emerging data are immediately and usefully made available through the NIAID TB database, TBDB (tbdb.org), and other venues; v) develop a systems-level annotation of regulons, operons, and metabolic and signaling pathways. Relevant to the mission of TBCAP, the NIAID had funded the TB Systems Biology Consortium (TBSysBio), which produced chromatin immunoprecipitation followed by sequencing (ChIP-Seq) and expression data corresponding to 50 Mtb transcription factors; these data and the regulatory network model derived from them have been deposited in TBDB together with transcriptional and biochemical (metabolomics, proteomics and lipidomics) profiling results of Mtb during adaptation to hypoxia. Relevant systems biology computational, analysis and visualization tools are provided as well in TBDB; these data and tools are publicly accessible and will be made available to the NIAID TB database.
A manuscript on other genomic resources available to the TB research community is included in this issue of Tuberculosis.
    Comparative genetic maps of Papuan Saccharum officinarum L. (2n = 80) and S. robustum (2n = 80) were constructed by using single-dose DNA markers (SDMs). SDM-framework maps of S. officinarum and S. robustum were compared with genetic maps of sorghum and maize by way of anchor restriction fragment length polymorphism probes. The resulting comparisons showed striking colinearity between the sorghum and Saccharum genomes. There were no differences in marker order between S. officinarum and sorghum. Furthermore, there were no alterations in SDM order between S. officinarum and S. robustum. The S. officinarum and S. robustum maps also were compared with the map of the polysomic octoploid S. spontaneum ‘SES 208’ (2n = 64, x = 8), thus permitting relations to homology groups (“chromosomes”) of S. spontaneum to be studied. Investigation of transmission genetics in S. officinarum and S. robustum confirmed preliminary results that showed incomplete polysomy in these species. Because of in...
    The emergent needs of the bioinformatics community challenge current information systems. The pace of biological data generation far outstrips Moore's Law. Therefore, a gap continues to widen between the capability to produce biological (molecular and cell) data sets and the capability to manage and analyze these data sets. As a result, Federal investments in large data set generation produce diminishing returns in terms of the community's ability to understand biology and leverage that understanding to make scientific and technological advances that improve society. We are building an open framework to address various data management issues including data and tool interoperability, nomenclature and data communication standardization, and database integration. PathPort, short for Pathogen Portal, employs a generic, web-services-based framework to deal with some of the problems identified by the bioinformatics community. The motivating research goal of a scalable system to provide data management and analysis for key pathosystems, especially relating to molecular data, has resulted in a generic framework with two major components. On the server side, we employ web services. On the client side, a Java application called ToolBus acts as a "bus" for contacting data and tools and viewing results through a single, consistent user interface.
    Clostridium difficile has undergone rapid evolution in recent years and has emerged as a serious human pathogen. Proteomic approaches can improve the understanding of the diversity of this important pathogen, especially in comparing the adaptive ability of different C. difficile strains. In this study, TMT labeling and nanoLC-MS/MS-driven proteomics were used to investigate the responses of four C. difficile strains to nutrient shift and osmotic shock. We detected 126 and 67 differentially expressed proteins in at least one strain under nutrient shift and osmotic shock, respectively. During nutrient shift, several components of the phosphotransferase system (PTS) were found to be differentially expressed, which indicated that carbon catabolite repression (CCR) was relieved to allow the expression of enzymes and transporters responsible for the utilization of alternate carbon sources. Some classical osmotic shock-associated proteins, such as GroEL, RecA, CspG, and CspF, and other stress proteins such as PurG and SerA were detected during osmotic shock. Furthermore, the recently emerged strains were found to contain a more robust gene network in response to both stress conditions. This work represents the first comparative proteomic analysis of historic and recently emerged hypervirulent C. difficile strains, complementing the previously published proteomics studies utilizing only one reference strain.
    Genetic analysis was performed in a population composed of 100 F1 individuals derived from a cross between a cultivated sugarcane (S. officinarum 'LA Purple') and its proposed progenitor species (S. robustum 'Mol 5829'). Various types (arbitrarily primed PCR, RFLPs, and AFLPs) of single-dose DNA markers (SDMs) were used to construct genetic linkage maps for both species. The LA Purple map was composed of 341 SDMs, spanning 74 linkage groups and 1,881 cM, while the Mol 5829 map contained 301 SDMs, spanning 65 linkage groups and 1,189 cM. Transmission genetics in these two species showed incomplete polysomy based on the detection of 15% of SDMs linked in repulsion in LA Purple and 13% of these in Mol 5829. Because of this incomplete polysomy, multiple-dose markers could not be mapped for lack of a genetic model for their segregation. Due to inclusion of RFLP anchor probes, conserved in related species, the resulting maps will serve as useful tools for breeding, ecology, evolut...
    We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST’09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general...
