Genes and proteins are known to differ in their sensitivity to alterations. Despite numerous sequencing studies, the proportions of harmful and harmless substitutions are not known for proteins and groups of proteins. To address this question, we predicted the outcome for all possible single amino acid substitutions (AASs) in nine representative protein groups by using the PON-P2 method. The effects on 996 proteins were studied and vast differences were noticed. Proteins in the cancer group harbor the largest proportion of harmful variants (42.1%), whereas the non-disease group, consisting of proteins not known to have a disease association and not involved in housekeeping functions, had the smallest proportion of harmful variants (4.2%). The proportions of harmful and benign variants vary widely within each group, yet clear differences between the groups remain. Frequently appearing protein domains show a wide spectrum of variant frequencies, whereas no major protei...
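The exhaustive substitution scan described in this abstract can be illustrated with a minimal sketch. It enumerates every single amino acid substitution of a sequence (19 alternatives per position) and computes the share that a classifier labels harmful. The function names, the example sequence, and the toy classifier are assumptions made for illustration; in particular, `toy_classifier` is only a stand-in and does not reproduce PON-P2 or its interface.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_single_substitutions(sequence):
    """Yield every single amino acid substitution as (position, ref, alt).

    Positions are 1-based, following the usual protein numbering.
    """
    for position, ref in enumerate(sequence, start=1):
        for alt in AMINO_ACIDS:
            if alt != ref:
                yield position, ref, alt


def harmful_proportion(sequence, classify):
    """Return the fraction of all substitutions that `classify` calls harmful.

    `classify` is a hypothetical callable taking (position, ref, alt) and
    returning True for a harmful prediction.
    """
    substitutions = list(all_single_substitutions(sequence))
    if not substitutions:
        return 0.0
    harmful = sum(1 for s in substitutions if classify(*s))
    return harmful / len(substitutions)


def toy_classifier(position, ref, alt):
    """Illustrative placeholder only: flags every substitution to proline."""
    return alt == "P"


if __name__ == "__main__":
    example = "MKT"  # made-up three-residue sequence
    print(len(list(all_single_substitutions(example))))
    print(harmful_proportion(example, toy_classifier))
```

For a protein of length n this scan produces 19 × n substitutions, which is why per-group proportions such as those quoted above can be computed over complete substitution sets rather than over observed variants only.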
Numerous databases containing information about DNA, RNA, and protein variations are available. Gene-specific variant databases (locus-specific variation databases, LSDBs) are typically curated and maintained for single genes or for groups of genes associated with a particular disease or diseases. These databases are widely considered the most reliable information source for a particular gene, protein, or disease, but their contents, infrastructure, and quality can vary widely. Evaluating quality is important because these databases may affect health decision-making, research, and clinical practice. The Human Variome Project (HVP) established a Working Group for Variant Database Quality Assessment. The basic principle was to develop a simple system that nevertheless provides a good overview of the quality of a database. The resulting HVP quality evaluation criteria are divided into four main components: data quality, technical quality, accessibility, and timeliness. This report elaborates on the developed quality criteria and on how the quality scheme can be implemented. Examples are provided for the current status of the quality items in two different databases: BTKbase, an LSDB, and ClinVar, a central archive of submissions about variants and their clinical significance.
For development and evaluation of methods for predicting the effects of variations, benchmark datasets are needed. Some previously developed datasets are available for this purpose, but newer and larger benchmark sets for benign variants have largely been missing. VariSNP datasets are selected from dbSNP. These subsets were filtered against disease-related variants in the ClinVar, UniProtKB/Swiss-Prot, and PhenCode databases, to identify neutral or nonpathogenic cases. All variant descriptions include mapping to reference sequences on chromosomal, genomic, coding DNA, and protein levels. The datasets will be updated with automated scripts on a regular basis and are freely available at http://structure.bmc.lu.se/VariSNP.
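The filtering step described above amounts to removing from a candidate set every variant that also appears in a disease-related source. A minimal sketch follows, assuming variants are compared by a shared identifier; the function name and the identifiers in the example are made up for illustration and do not come from dbSNP, ClinVar, UniProtKB/Swiss-Prot, or PhenCode, and the actual VariSNP pipeline may match variants differently.

```python
def filter_neutral_candidates(candidates, *disease_sets):
    """Return candidates not present in any disease-related set.

    `candidates` is an ordered iterable of variant identifiers;
    each item of `disease_sets` is a collection of identifiers to exclude.
    Input order is preserved.
    """
    excluded = set().union(*disease_sets)
    return [variant for variant in candidates if variant not in excluded]


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    candidates = ["rs1", "rs2", "rs3", "rs4"]
    clinvar_like = {"rs2"}
    swissprot_like = {"rs4"}
    phencode_like = set()
    print(filter_neutral_candidates(
        candidates, clinvar_like, swissprot_like, phencode_like))
```

Building one combined exclusion set first keeps each membership check constant-time, which matters when the candidate list is large.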
Locus-Specific DataBases (LSDBs) store information on gene sequence variation associated with human phenotypes and are frequently used as a reference by researchers and clinicians. We developed the Leiden Open-source Variation Database (LOVD) as a platform-independent Web-based LSDB-in-a-Box package. LOVD was designed to be easy to set up and maintain and follows the Human Genome Variation Society (HGVS) recommendations. Here we describe LOVD v.2.0, which adds enhanced flexibility and functionality and has the capacity to store sequence variants in multiple genes per patient. To reduce redundancy, patient and sequence variant data are stored in separate tables. Tables are linked to generate connections between sequence variant data for each gene and every patient. The dynamic structure allows database managers to add custom columns. The database structure supports fast queries and allows storage of sequence variants from high-throughput sequence analysis, as demonstrated by the X-chromosomal Mental Retardation LOVD installation. LOVD contains measures to ensure database security from unauthorized access. Currently, the LOVD Website (http://www.LOVD.nl/) lists 71 public LOVD installations hosting 3,294 gene variant databases with 199,000 variants in 84,000 patients. To promote LSDB standardization and thereby database interoperability, we offer free server space and help to establish an LSDB on our Leiden server.
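The idea of keeping patient and variant data in separate, linked tables can be sketched as follows. This is an illustration under stated assumptions, not the LOVD schema: it uses Python's built-in sqlite3 module with an in-memory database, and every table name, column name, gene label, and record is hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with separate patient and variant tables.

    A linking table connects them, so a variant shared by several patients
    is stored only once.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (
            id INTEGER PRIMARY KEY,
            label TEXT NOT NULL
        );
        CREATE TABLE variant (
            id INTEGER PRIMARY KEY,
            gene TEXT NOT NULL,
            dna_change TEXT NOT NULL
        );
        CREATE TABLE patient_variant (
            patient_id INTEGER NOT NULL REFERENCES patient(id),
            variant_id INTEGER NOT NULL REFERENCES variant(id),
            PRIMARY KEY (patient_id, variant_id)
        );
        """
    )
    # Hypothetical records, for illustration only.
    conn.executemany("INSERT INTO patient VALUES (?, ?)",
                     [(1, "P1"), (2, "P2")])
    conn.executemany("INSERT INTO variant VALUES (?, ?, ?)",
                     [(1, "GENE_A", "c.100A>G"), (2, "GENE_B", "c.200C>T")])
    # P1 carries variants in two genes; P2 shares one of them.
    conn.executemany("INSERT INTO patient_variant VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 1)])
    return conn


def variants_for_patient(conn, label):
    """Return (gene, dna_change) rows for one patient, ordered by gene."""
    cursor = conn.execute(
        """
        SELECT v.gene, v.dna_change
        FROM patient AS p
        JOIN patient_variant AS pv ON pv.patient_id = p.id
        JOIN variant AS v ON v.id = pv.variant_id
        WHERE p.label = ?
        ORDER BY v.gene
        """,
        (label,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(variants_for_patient(db, "P1"))
    print(variants_for_patient(db, "P2"))
```

In this layout a patient with variants in multiple genes needs only extra rows in the linking table, and the variant record itself is never duplicated.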