
Bioinformatics Answers

Proteomics is the study of the proteome, which is the complete set of proteins expressed by an organism. REBASE is a comprehensive database of information about restriction enzymes and related proteins. BLOCKS databases primarily source their information from Prosite, which is an annotated collection of protein motif descriptors. GenBank has 18 subsets, examples being trace and wgs. Examples of BLAST programs are BLASTn and BLASTp. Gap penalties allow alignment algorithms to introduce gaps to match more terms in an alignment while minimizing gaps to create a useful alignment.

Uploaded by

Pratibha Patil

BIOINFORMATICS ANSWERS

1. What is proteomics?
A proteome is the complete set of proteins expressed by an
organism. The term can also be used to describe the assortment of
proteins produced at a specific time in a particular cell or tissue type.
Proteomics is the study of the proteome.
2. What is REBASE?
REBASE is a comprehensive database of information about
restriction enzymes, DNA methyltransferases and related proteins
involved in the biological process of restriction-modification (R-M).

3. What is the primary source of BLOCKS databases?


PROSITE, which is an annotated collection of motif descriptors dedicated
to the identification of protein families and domains.

4. How many subsets does GenBank have? Name any two of them.

18; for example, trace and wgs.
5. Name any two BLAST programmes.
BLASTn, BLASTp
6. Why gap-penalties should be used in sequence alignments?
When aligning sequences, introducing gaps in the sequences can
allow an alignment algorithm to match more terms than a gap-less
alignment can. However, minimizing gaps in an alignment is
important to create a useful alignment.
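The trade-off described above can be made concrete with a small scoring sketch. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for illustration; real tools use substitution matrices and often affine gap penalties.

```python
def alignment_score(aligned_a, aligned_b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned strings of equal length; '-' marks a gap.

    Uses a simple linear gap penalty: every gap position costs `gap`.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            score += gap          # penalise each gap position
        elif a == b:
            score += match        # reward identical residues
        else:
            score += mismatch     # penalise substitutions
    return score


# One well-placed gap recovers six matches; padding the end instead
# leaves most columns mismatched.
with_internal_gap = alignment_score("GAT-ACA", "GATTACA")
with_end_gap = alignment_score("GATACA-", "GATTACA")
```

With these assumed values the internally gapped alignment scores higher, but because each gap costs more than a mismatch the algorithm has no incentive to scatter gaps freely.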
7. What is BLOSUM?
In bioinformatics, BLOSUM (BLOcks SUbstitution Matrix) is a
substitution matrix used for sequence alignment of proteins. BLOSUM
matrices are used to score alignments between evolutionarily divergent
protein sequences. They are based on local alignments.
8. What kind of data is stored in the PDB database?
The Protein Data Bank (PDB) is a database for the three-dimensional
structural data of large biological molecules, such as proteins and
nucleic acids.
9. In which database are Hidden Markov Models present?
Pfam
10. What is EST?
Expressed sequence tags (ESTs) are fragments of mRNA sequences
derived through single sequencing reactions performed on randomly
selected clones from cDNA libraries.
11. Why do quantum chemical methods take longer than molecular
mechanical methods?
Quantum mechanical (QM) calculations describe the electronic behavior
of atoms and molecules explicitly. This makes them suitable where
molecular mechanical calculations fail, namely in calculating bond
formation and dissociation energies, but it also makes QM methods
computationally expensive.

12. Define molecular modeling.


Molecular modelling is based on the development of theoretical and
computational methodologies, to model and study the behaviour of
molecules, from small chemical systems to large biological molecules
and material assemblies.
13. Write any two molecular modeling software.
• SwissPDB Viewer.
• JME Molecular Editor-MolSoft.
• Viewer Applet Chemis 3D.
• YASARA.

14. What is meant by full-geometry optimization?


A method to predict the three-dimensional arrangement of the atoms in a
molecule by means of minimization of a model energy.
15. What is force field?
A force field is the functional form and parameter sets used to
calculate the potential energy of a system of atoms or coarse-grained
particles in molecular mechanics, molecular dynamics, or Monte Carlo
simulations.
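As an illustration of what "functional form" means here, many molecular mechanics force fields use a sum of bonded and non-bonded terms along the following lines. This particular form and its symbols are an assumption made for illustration; the exact terms and parameters differ between force fields.

```latex
E_{\text{total}} =
  \sum_{\text{bonds}} k_b \,(r - r_0)^2
+ \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
+ \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
+ \sum_{i<j} \left\{ 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
      - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right]
      + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}} \right\}
```

The constants (k_b, r_0, k_θ, θ_0, V_n, γ, ε_ij, σ_ij, q_i) are the "parameter sets"; the formula itself is the "functional form".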
16. State any two approaches to identify gene.
Genes are identified broadly via two methods: a) similarity-based
searches and b) ab initio prediction.
17. What is molecular dynamics simulation?
Molecular dynamics (MD) can be used to explore conformational space,
and is often the method of choice for large molecules such as
proteins. MD can be defined as a computer simulation technique
that predicts the time evolution of a system of interacting
particles. Atomic trajectories are generated by numerically
integrating Newton's equations of motion for a specific interatomic
potential, given initial conditions and boundary conditions.
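The numerical integration step can be sketched with the velocity Verlet scheme, one commonly used integrator. The one-dimensional harmonic "bond" used as the potential, and all names and values below, are assumptions chosen to keep the example small; a real MD code works with many atoms and a full force field.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate Newton's equation of motion for one particle in 1D.

    x, v   : initial position and velocity (the initial condition)
    force  : function returning the force at position x
    Returns the trajectory as a list of (position, velocity) pairs.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical example: a particle on a spring with force F = -k x, k = 1.
trajectory = velocity_verlet(x=1.0, v=0.0, force=lambda x: -x,
                             mass=1.0, dt=0.01, steps=1000)
final_x, final_v = trajectory[-1]
total_energy = 0.5 * final_v ** 2 + 0.5 * final_x ** 2
```

For a small enough time step the total energy stays close to its initial value, which is a common sanity check on an MD trajectory.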
18. State the long form of MALDI/TOF.
Matrix-assisted laser desorption/ionization time-of-flight. In
MALDI-TOF mass spectrometry, the ion source is matrix-assisted
laser desorption/ionization (MALDI), and the mass analyzer is a
time-of-flight (TOF) analyzer.
19. What is homology modelling?
Homology modeling is one of the computational structure prediction
methods that are used to predict protein 3D structure from its amino
acid sequence, using a homologous protein of known structure as a
template. It is considered to be the most accurate of the
computational structure prediction methods.
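20. Give any two methods of phylogenetic analysis.
Distance-based methods and character-based methods (such as maximum
parsimony and maximum likelihood). The resulting tree can be rooted
using a molecular clock, midpoint rooting, or outgroup rooting.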
21. Which physicochemical properties are calculated by
computational tools? State the server name.
Molecular weight, melting point, boiling point, vapor point, molecular
polarity, Henry's phase distribution, and the extrinsic properties of
pressure (P) and moles (n). One predictive tool is COSMO-RS.
22. What is the primary source of PROSITE database?
Swiss-Prot
23. Why gap-penalties should be used in sequence alignments?
A gap penalty is a method of scoring alignments of two or more
sequences. When aligning sequences, introducing gaps in the
sequences can allow an alignment algorithm to match more terms than a
gap-less alignment can.
24. What is genomics?
Genomics is the study of whole genomes of organisms, and incorporates
elements from genetics. Genomics uses a combination of recombinant
DNA, DNA sequencing methods, and bioinformatics to sequence,
assemble, and analyse the structure and function of genomes.
25. What is the basis of PAM matrix?

26. What kind of data is stored in the MMDB database?


The Molecular Modeling DataBase (MMDB) is a database of
experimentally determined three-dimensional biomolecular structures,
and is also referred to as the Entrez Structure database.
27. State the significance of the Pfam database.
The general purpose of the Pfam database is to provide a complete and
accurate classification of protein families and domains. Originally, the
rationale behind creating the database was to have a semi-automated
method of curating information on known protein families to improve the
efficiency of annotating genomes.
28. Which algorithms are used for local and global sequence
alignment?
A general global alignment technique is the Needleman–Wunsch
algorithm, which is based on dynamic programming. Its local
counterpart is the Smith–Waterman algorithm, which is also based on
dynamic programming.
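A minimal sketch of Needleman–Wunsch global alignment follows. The scoring scheme (match +1, mismatch -1, linear gap -1) is an assumption for illustration; practical tools score residues with substitution matrices such as BLOSUM or PAM.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: each cell is the best of diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, top, bottom = needleman_wunsch("GATTACA", "GATACA")
```

Smith–Waterman differs mainly in clamping every cell at zero and starting the traceback from the highest-scoring cell rather than the corner, which is what makes it local.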
29. List the software used in homology modelling process.
MODELLER. MODELLER is a computer program for comparative
protein structure modeling.

30. What is STS?


A sequence-tagged site (or STS) is a short (200 to 500 base pair) DNA sequence that has a
single occurrence in the genome and whose location and base sequence are known.

31. Name any two Primary protein sequence databases.


Swiss-Prot, PIR, MIPS
32. Give long forms of PDB and PIR.
Protein Data Bank and Protein Information Resource
33. State any two goals of Human Genome Project.
Determining the order, or "sequence," of all the bases in our genome's
DNA; making maps that show the locations of genes for major sections
of all our chromosomes; and producing what are called linkage maps,
through which inherited traits (such as those for genetic disease) can be
tracked over generations.
34. What is the purpose of energy minimization methods?
The goal of energy minimization is to find a set of coordinates
representing the minimum energy conformation for the given structure.
35. What is the primary source of PRINTS database?
OWL
36. What is meant by local sequence alignment?
Local alignments identify regions of similarity within long sequences
that are often widely divergent overall.
37. Define microarray.
A microarray is a laboratory tool used to detect the expression of
thousands of genes at the same time. DNA microarrays are microscope
slides that are printed with thousands of tiny spots in defined positions,
with each spot containing a known DNA sequence or gene.
38. Write the Schrödinger equation.
The Schrödinger equation predicts the future behavior of a dynamic
system. It is a wave equation in terms of the wavefunction, which
predicts analytically and precisely the probability of events or
outcomes.
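For reference, the standard textbook forms of the equation are written out here, since the answer above describes the equation without stating it. Ψ is the wavefunction, Ĥ the Hamiltonian operator, ħ the reduced Planck constant, m the particle mass, V the potential energy and E the energy eigenvalue.

```latex
% Time-dependent form
i\hbar \,\frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \hat{H}\,\Psi(\mathbf{r},t)

% Time-independent form
\hat{H}\,\psi(\mathbf{r}) = E\,\psi(\mathbf{r})

% Hamiltonian for a single particle in a potential V
\hat{H} = -\frac{\hbar^{2}}{2m}\nabla^{2} + V(\mathbf{r})
```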

39. What is meant by single point calculation?


40. What kind of data is stored in the DDBJ database?
DDBJ collects annotated nucleotide sequences as its traditional
database service.
41. What is RasMol?
RasMol is a computer program written for molecular graphics
visualization intended and used mainly to depict and explore biological
macromolecule structures, such as those found in the Protein Data
Bank.
42. Write any two functions of molecular modeling.
It helps in understanding the fundamentals of physical and chemical
interactions, which are difficult to calculate using experimental
procedures. It also helps in the development of new theories, models,
processes, and products.
43. What is Conjugate-Gradient?
The conjugate gradient method is an algorithm for the numerical solution
of particular systems of linear equations, namely those whose matrix is
symmetric and positive-definite.
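A minimal pure-Python sketch of the conjugate gradient iteration for solving A x = b is given below. The helper names and the 2×2 example system are assumptions for illustration; the method requires A to be symmetric and positive-definite.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x, with x = 0 initially
    p = list(r)              # first search direction is the residual
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break            # residual is small enough: converged
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                    # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction: residual made conjugate to the previous directions.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical example system.
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

In molecular modelling a nonlinear variant of the same idea is commonly used for energy minimization, where the gradient of the energy plays the role of the residual.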
44. What is Steepest-Descent?
Gradient descent is an optimization algorithm which is commonly used
to train machine learning models and neural networks. A steepest
descent algorithm follows the update rule x(k+1) = x(k) + t(k) ∆x(k),
where t(k) is the step size and, at each iteration, the direction ∆x(k)
is the steepest direction we can take. That is, the algorithm continues
its search in the direction which most reduces the value of the
function, given the current point.
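A minimal sketch of steepest descent with the negative gradient as the search direction and a fixed step size, i.e. x(k+1) = x(k) - t ∇f(x(k)). The test function, the step size and the names are assumptions for illustration; practical minimizers usually choose the step with a line search.

```python
def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10000):
    """Minimise a function given its gradient, starting from x0."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break                      # gradient is ~0: at a stationary point
        # Move against the gradient, the direction of steepest decrease.
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical example: f(x, y) = (x - 1)^2 + 2 (y + 2)^2, minimum at (1, -2).
def example_gradient(point):
    x, y = point
    return [2.0 * (x - 1.0), 4.0 * (y + 2.0)]


minimum = steepest_descent(example_gradient, [5.0, 5.0])
```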
45. What is ExPASy?
Expasy is an extensible and integrative portal including >160 databases
and software tools developed by SIB groups. It covers a wide range of
fields in life sciences and biomedical research, spanning genomics,
proteomics, structural biology, evolution, phylogeny, systems biology
and medicinal chemistry.

46. Name any two programmes used to design of primers.


Primer-BLAST, PerlPrimer, Primer3Plus, PrimerQuest, OligoPerfect
47. What is the use of BLASTp?
Standard protein-protein BLAST (blastp) is used for both identifying a
query amino acid sequence and for finding similar sequences in protein
databases
48. What is the use of BLASTn?
The “blastn” program is a general purpose nucleotide search and
alignment program that is sensitive and can be used to align tRNA or
rRNA sequences as well as mRNA or genomic DNA sequences
containing a mix of coding and noncoding regions.
49. What is the long form of NGS?
Next-generation sequencing
50. Write any two applications of NGS?
Typical applications of NGS methods in microbiology and virology,
besides high-throughput whole genome sequencing, are discovery of
new microorganisms and viruses by using metagenomic approaches,
investigation of microbial communities in the environment and in human
body niches in healthy and disease conditions
51. In which year did the Human Genome Project start?
1 October 1990
52. List the existing NGS platforms
• Illumina.
• Oxford Nanopore.
• Ion Torrent.
• Pacific Biosciences.
• Roche 454.

53. Write any two Distance Based Methods?


UPGMA and neighbor joining. Distance-based methods have two
components: the evolutionary distance matrix, typically derived from a
substitution model, and the tree-building algorithm that constructs a
tree from the distance matrix.
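The first of those two components, the distance matrix, can be sketched as follows. The p-distance (the fraction of differing sites) is used here as the simplest possible distance and is an assumption for illustration; in practice distances are usually corrected with a substitution model.

```python
def p_distance(a, b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where either sequence has a gap.
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped positions to compare")
    return sum(1 for x, y in compared if x != y) / len(compared)


def distance_matrix(sequences):
    """Pairwise p-distances for a list of aligned sequences."""
    n = len(sequences)
    return [[p_distance(sequences[i], sequences[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical aligned sequences.
matrix = distance_matrix(["ACGT", "ACGA", "TCGA"])
```

A tree-building algorithm then takes such a matrix as its only input.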

54. Write any two Character Based Methods?


Maximum parsimony (MP) and maximum likelihood (ML) methods.
55. Human Genome Project was completed in which year?
2003
56. Write an application of BioEdit?
Insertion of gaps, editing sequences, multiple sequence alignment
57. What is cDNA?
Complementary DNA (cDNA) is a DNA copy of a messenger RNA
(mRNA) molecule produced by reverse transcriptase, a DNA polymerase
that can use either DNA or RNA as a template.
58. What is the significance of UTR?
UTRs are known to play crucial roles in the post-transcriptional
regulation of gene expression, including modulation of the transport of
mRNAs out of the nucleus and of translation efficiency [3], subcellular
localization [4] and stability
59. Name four institutes that participated in the Human Genome Project.
United States DOE Joint Genome Institute, Walnut Creek, Calif., U.S.
Baylor College of Medicine Human Genome Sequencing Center,
Department of Molecular and Human Genetics, Houston, Tex., U.S.
RIKEN Genomic Sciences Center, Yokohama, Japan. Genoscope and
CNRS UMR-8030, Evry, France.
60. Where are EMBL, GenBank and DDBJ located?
The EMBL database is maintained at the European Bioinformatics
Institute (EBI) in Hinxton, UK, an outstation of the European Molecular
Biology Laboratory (EMBL), which is headquartered in Heidelberg,
Germany.
GenBank is built and distributed by the National Center for
Biotechnology Information (NCBI), a division of the National Library of
Medicine (NLM), located on the campus of the US National Institutes of
Health (NIH) in Bethesda, MD, USA.
The DDBJ Center operates at the Research Organization of Information
and Systems, National Institute of Genetics (NIG), in Mishima, Japan.

61. Define primer


A primer is a small segment of DNA that binds to a complementary
strand of DNA. Primers are necessary to start the functioning of DNA
polymerase enzyme and therefore are necessary in polymerase chain
reaction.
62. What is wavefunction?
A wave function (Ψ) is a mathematical function that
relates the location of an electron at a given point in space (identified by
x, y, and z coordinates) to the amplitude of its wave, which corresponds
to its energy.
63. Write any two special features of HGP.
• Our entire genome is made up of 3164.7 million base pairs.
• On average, a gene is made up of 3000 nucleotides.
• The function of more than 50 percent of the genes is yet to be
discovered.
• Proteins are coded by less than 2 percent of the genome.
• Most of the genome is made up of repetitive sequences which
have no specific coding purpose, but such redundant sequences
can help us better understand the genetic development of humanity
through the ages.

64. What is masking repetitive DNA?


The term 'masking' means transforming every nucleotide identified
as a repeat to an 'N', 'X' or to a lower case a, t, g, or c (the latter is
known as soft masking). The aim of repeat masking is to identify the
location of all repeated elements along a genome sequence.
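The two masking styles can be sketched as below. The function name and the convention for repeat coordinates (0-based start, exclusive end) are assumptions for illustration; in practice the coordinates come from a repeat-finding program.

```python
def mask_repeats(sequence, repeat_intervals, soft=True, hard_char="N"):
    """Mask the given intervals of a nucleotide sequence.

    repeat_intervals : list of (start, end) pairs, 0-based, end exclusive
    soft=True        : lower-case the repeat (soft masking)
    soft=False       : replace each repeat base with hard_char (hard masking)
    """
    bases = list(sequence)
    for start, end in repeat_intervals:
        for i in range(start, end):
            bases[i] = bases[i].lower() if soft else hard_char
    return "".join(bases)


soft_masked = mask_repeats("ACGTACGTAC", [(2, 6)])
hard_masked = mask_repeats("ACGTACGTAC", [(2, 6)], soft=False)
```

Soft masking keeps the underlying bases recoverable, whereas hard masking discards them.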
65. What are different types of Biological database?
Primary, secondary and composite
66. What are the tools available for gene finding?
Ab initio and Gene Prediction Tools

67. SWISS-PROT is related to?


Sequence data bank
68. What is the use of PRINTS?
PRINTS is a database of protein family 'fingerprints' offering a diagnostic
resource for newly-determined sequences.
69. Write types of alignments?
Global and local pairwise alignments

70. What is long form of BLAST?


Basic Local Alignment Search Tool 
71. What is the annealing temperature used in PCR?
The annealing step lasts 30 sec to 1 min, at temperatures of 45–60 °C.
72. What is the use of ClustalW?
ClustalW is a widely used system for aligning any number of
homologous nucleotide or protein sequences. For multi-sequence
alignments, ClustalW uses progressive alignment methods.
73. Write dawn energy equation which is used in molecular
mechanics

74. Define Hamiltonian.


A function that is used to describe a dynamic system (such as the motion
of a particle) in terms of components of momentum and coordinates of
space and time and that is equal to the total energy of the system when
time is not explicitly part of the function.
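In symbols, for a single particle of mass m with coordinate q and momentum p, the standard classical form reads as follows. The one-dimensional notation is an assumption made to keep the example short.

```latex
H(q, p) = T + V = \frac{p^{2}}{2m} + V(q)

% Hamilton's equations of motion
\dot{q} = \frac{\partial H}{\partial p}, \qquad
\dot{p} = -\frac{\partial H}{\partial q}
```

Here T is the kinetic energy and V the potential energy, so H equals the total energy when V does not depend explicitly on time.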
75. Define conformational search
Conformational energy searching is used to find all of the
energetically preferred conformations of a molecule (especially
rotamers), which is mathematically equivalent to locating all of the
minima of its energy function.

76. Define forcefield


The force field refers to the functional form and parameter sets used to calculate the
potential energy of a system of atoms or coarse-grained particles in molecular
mechanics, molecular dynamics, or Monte Carlo simulations.

77. Define endo sugar puckering.


Sugars with atoms puckered above the reference plane (on the same
side as the base) are in the endo form (a C2'-endo pucker has the
C2' carbon pointed up and towards the base).
78. State the importance of Ramachandran plot
The Ramachandran plot provides a way to view the distribution of
torsion angles in a protein structure and shows that the torsion
angles corresponding to the two major secondary structure elements
(α-helices and β-sheets) are clearly clustered within separate regions.
79. Write properties of Alpha helix.
• It completes one turn every 3.6 residues;
• It rises approximately 5.4 Å with each turn;
• It is a right-handed helix;
• It is held together by hydrogen bonds between the C=O of residue
i and the NH of residue i+4;
• It is typically slightly curved.

80. What is beta sheet?


A secondary structure motif of peptides and proteins, characterized by
two or more amino acid strands connected laterally by two or more
hydrogen bonds between a peptide bond N-H in one strand and a
peptide bond C=O in the adjacent strand.
81. What is the use of PubMed?
PubMed is a free resource supporting the search and retrieval of biomedical and life
sciences literature with the aim of improving health–both globally and personally.

82. What is molecular docking?


Molecular docking is a key tool in structural molecular biology and
computer-assisted drug design. The goal of ligand-protein docking is
to predict the predominant binding mode(s) of a ligand with a protein of
known three-dimensional structure.
83. List the software used in molecular docking
AADS, ADAM, AutoDock
84. What is the use of molecular docking?
Molecular docking is frequently used in the process of computer
aided drug design (CADD). It can be applied in different stages of the
drug design process in order to: (1) predict the binding mode of already
known ligands; (2) identify novel and potent ligands and (3) as a binding
affinity predictive tool
85. What is the use of molecular dynamics simulation?
Molecular dynamics simulation is used to explore the conformational
space of large molecules such as proteins and to follow how they move
over time. This supports one of the most practical applications of the
concept of molecular recognition: docking strategies, either small
molecule or protein docking. Understanding how a ligand, typically a
substrate or a regulator, binds to its macromolecular counterpart is a
key issue in the understanding of function itself, and it is the basis
of structurally driven drug design.
86. How hydrophobic amino acids contribute in protein folding?
The major driving force in protein folding is the
hydrophobic effect. This is the tendency for hydrophobic molecules to
isolate themselves from contact with water. As a consequence during
protein folding the hydrophobic side chains become buried in the
interior of the protein.
87. What is the role of hydrophilic amino acids in protein folding?
The hydrophilic amino acids interact more strongly with water (which is
polar) than do the hydrophobic amino acids. The interactions of the
amino acids within the aqueous environment result in a specific
protein shape.

88. State the significance of Rough Draft of HGP?


The Human Genome Project international consortium published a
first draft and initial analysis of the human genome sequence. The
draft sequence covered more than 90 percent of the human genome
89. What is NCBI?
NCBI (National Center for Biotechnology Information) is a national
resource for molecular biology information. NCBI's mission
is to develop new information technologies to aid in the understanding of
fundamental molecular and genetic processes that control health and
disease.
90. What is ENTREZ?
Entrez is a molecular biology database system that provides
integrated access to nucleotide and protein sequence data, gene-
centered and genomic mapping information, 3D structure data, PubMed
MEDLINE, and more.
