Alvis Brazma

European Bioinformatics Institute, Functional Genomics Group, Faculty Member

Address: Cambridge, Cambridgeshire, United Kingdom

Papers by Alvis Brazma

Towards reconstruction of gene networks from expression data by supervised learning

Genome Biology, 2003

BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis

Bioinformatics/Computer Applications in the Biosciences, 2005

Submission of Microarray Data to Public Repositories

PLOS Biology, 2004

Gene expression data analysis

FEBS Letters, 2000

Microarrays are one of the latest breakthroughs in experimental molecular biology, which allow monitoring of gene expression for tens of thousands of genes in parallel and are already producing huge amounts of valuable data. Analysis and handling of such data is becoming one of the major bottlenecks in the utilization of the technology. The raw microarray data are images, which have to be transformed into gene expression matrices: tables where rows represent genes, columns represent various samples such as tissues or experimental conditions, and numbers in each cell characterize the expression level of the particular gene in the particular sample. These matrices have to be analyzed further if any knowledge about the underlying biological processes is to be extracted. In this paper we concentrate on discussing bioinformatics methods used for such analysis. We briefly discuss supervised and unsupervised data analysis and its applications, such as predicting gene function classes and cancer classification. Then we discuss how the gene expression matrix can be used to predict putative regulatory signals in the genome sequences. In conclusion we discuss some possible future directions.
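The gene expression matrix described in this abstract can be illustrated with a minimal sketch. The gene names, sample names, and values below are hypothetical toy data, and Pearson correlation is only one possible similarity measure for comparing expression profiles; the abstract does not prescribe it.

```python
from math import sqrt

# Hypothetical toy matrix: rows are genes, columns are samples.
samples = ["tissue_A", "tissue_B", "tissue_C", "tissue_D"]
expression = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],
    "gene3": [4.0, 3.0, 2.0, 1.0],
}


def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def most_similar(gene, matrix):
    """Return the other gene whose profile correlates best with `gene`."""
    others = (g for g in matrix if g != gene)
    return max(others, key=lambda g: pearson(matrix[gene], matrix[g]))


if __name__ == "__main__":
    print(most_similar("gene1", expression))
```

Pairwise similarities of this kind are a common starting point for unsupervised analysis such as clustering genes with related expression profiles.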

ArrayExpress - a public database of microarray experiments and gene expression profiles

Nucleic Acids Research, 2007

Expression Profiler

Expression Profiler (EP, http://ep.ebi.ac.uk/) is a set of tools for the analysis and interpretation of gene expression and other functional genomics data. These tools perform expression data clustering, visualization, and analysis, integration of expression data with protein interaction data and functional annotations, such as Gene Ontology, and the analysis of promoter sequences for predicting transcription factor binding sites. Several clustering analysis method implementations and tools for sequence pattern discovery provide a rich data mining environment for various types of biological data. All the tools are Web-based, with minimal browser requirements. Analysis results are cross-linked to other databases and tools available on the Internet. This enables further integration of the tools and databases, for instance with public microarray gene expression databases such as ArrayExpress.

Expression Profiler: next generation - an online platform for analysis of microarray data

Nucleic Acids Research, 2004

Global Transcriptional Responses of Fission Yeast to Environmental Stress

Molecular Biology of the Cell, 2003

Periodic gene expression program of the fission yeast cell cycle

Nature Genetics, 2004

Discovering Patterns and Subfamilies in Biosequences

The European Bioinformatics Institute's data resources

Nucleic Acids Research, 2003

Data Mining for Regulatory Elements in Yeast Genome

We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) in the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream of all genes were first isolated from the yeast genome database MIPS using the information in the annotation files of the database. The ones that do not overlap with coding regions were chosen for further studies. Next, all occurrences of the yeast transcription factor binding sites, as given in the IMD database, were located in the genome and in the selected regions in particular. Finally, by using general-purpose data mining software in combination with our own software, which parametrizes the search, we can find the combinations of binding sites that occur in the upstream regions more frequently than would be expected on the basis of the frequency of individual sites. The procedure also finds so-called association rules present in such combinations. The developed tool is available for use through the WWW.
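The idea of finding binding-site combinations that occur more often than the frequencies of the individual sites would predict can be sketched as follows. This is a simplified illustration, not the authors' tool: the region names, site names, independence-based expectation, and the `min_ratio` threshold are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical input: upstream region -> set of binding sites found in it.
regions = {
    "geneA": {"site1", "site2"},
    "geneB": {"site1", "site2"},
    "geneC": {"site1"},
    "geneD": {"site3"},
    "geneE": {"site3"},
}


def overrepresented_pairs(regions, min_ratio=1.5):
    """Return site pairs whose observed co-occurrence count exceeds the count
    expected if the two sites occurred independently, by at least `min_ratio`."""
    n = len(regions)

    # Count in how many regions each individual site occurs.
    site_counts = {}
    for sites in regions.values():
        for site in sites:
            site_counts[site] = site_counts.get(site, 0) + 1

    # Count in how many regions each pair of sites occurs together.
    pair_counts = {}
    for sites in regions.values():
        for pair in combinations(sorted(sites), 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1

    result = {}
    for (a, b), observed in pair_counts.items():
        expected = site_counts[a] * site_counts[b] / n  # independence assumption
        ratio = observed / expected
        if ratio >= min_ratio:
            result[(a, b)] = ratio
    return result


if __name__ == "__main__":
    for pair, ratio in overrepresented_pairs(regions).items():
        print(pair, round(ratio, 2))
```

The observed-to-expected ratio used here corresponds to what association rule mining usually calls lift; a real analysis would also need a significance test and would consider combinations of more than two sites.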

Protein Interaction Verification and Functional Annotation by Integrated Analysis of Genome-Scale Data

Molecular Cell, 2002

ArrayExpress - a public repository for microarray gene expression data at the EBI

Nucleic Acids Research, 2003

Mining for Putative Regulatory Elements in the Yeast Genome Using Gene Expression Data

We have developed a set of methods and tools for automatic discovery of putative regulatory signals in genome sequences. The analysis pipeline consists of gene expression data clustering, sequence pattern discovery from upstream sequences of genes, a control experiment for pattern significance threshold limit detection, selection of interesting patterns, grouping of these patterns, representing the pattern groups in a concise form and evaluating the discovered putative signals against existing databases of regulatory signals. The pattern discovery is computationally the most expensive and crucial step. Our tool performs a rapid exhaustive search for a priori unknown statistically significant sequence patterns of unrestricted length. The statistical significance is determined for a set of sequences in each cluster with respect to a set of background sequences, allowing the detection of subtle regulatory signals specific for each cluster. The potentially large number of significant patterns is reduced to a small number of groups by clustering them by mutual similarity. Automatically derived consensus patterns of these groups represent the results in a comprehensive way for a human investigator. We have performed a systematic analysis for the yeast Saccharomyces cerevisiae. We created a large number of independent clusterings of expression data, simultaneously assessing the "goodness" of each cluster. For each of the over 52,000 clusters acquired in this way we discovered significant patterns in the upstream sequences of respective genes. We selected nearly 1,500 significant patterns by formal criteria and matched them against the experimentally mapped transcription factor binding sites in the SCPD database. We clustered the 1,500 patterns into 62 groups, for which we automatically derived alignments and consensus patterns. Of these 62 groups, 48 had patterns that have matching sites in the SCPD database.
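Scoring a pattern in a cluster against a set of background sequences can be illustrated with a small sketch. The abstract does not state which statistical test is used, so the hypergeometric tail probability below is an assumption chosen for illustration; the sequences are hypothetical, and exact substring matching stands in for the richer pattern language a real tool would support.

```python
from math import comb


def count_with_pattern(sequences, pattern):
    """Number of sequences containing `pattern` as an exact substring."""
    return sum(1 for seq in sequences if pattern in seq)


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn without replacement from N
    background sequences, K of which contain the pattern."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical upstream sequences: the cluster is a subset of the background.
cluster = ["ACGTGA", "TTACGT", "ACGTCC"]
background = cluster + [
    "TTTTTT", "GGGGGG", "CCCCCC", "AAAAAA", "TGTGTG", "CACACA", "GAGAGA",
]

if __name__ == "__main__":
    pattern = "ACGT"
    k = count_with_pattern(cluster, pattern)
    K = count_with_pattern(background, pattern)
    p = hypergeom_tail(k, len(cluster), K, len(background))
    print(f"{pattern}: {k}/{len(cluster)} in cluster, {K}/{len(background)} in background, p = {p:.4f}")
```

A small tail probability suggests the pattern is enriched in the cluster relative to the background. With tens of thousands of clusters and many candidate patterns, thresholds would also have to account for multiple testing, which is the role the abstract assigns to its control experiment.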

Standards for microarray data

Science, 2002

One of the underlying principles of scientific publication in peer-reviewed journals has been the...

Standards for systems biology

Nature Reviews Genetics, 2006

ArrayExpress - a public repository for microarray gene expression data at the EBI

Nucleic Acids Research, 2005

Predicting Gene Regulatory Elements in Silico on a Genomic Scale

Microarray Data Representation, Annotation and Storage

Management and analysis of the huge amounts of data produced by microarray experiments is becoming one of the major bottlenecks in the utilization of this high-throughput technology. We describe the basic design of a microarray gene expression database to help microarray users and their informatics teams to set up their information services. We describe two data models, a simpler one called ArrayExpressB and the complete model ArrayExpressC, and discuss some implementation issues. For the latest developments see http://www.ebi.ac.uk/arrayexpress.