Lab Report 1 Bioinformatics
LAB REPORT 1
BIOINFORMATICS
(PBI20201P)
NAME:
RABIATUL ADAWIYAH BINTI HASBULLAH
STUDENT ID:
012020091691
PROGRAMME:
BACHELOR OF PHARMACEUTICAL
TECHNOLOGY (BPHT)
LECTURER :
AP DR SANTOSH FATTEPUR AND
DR ALICIA NG
DATE OF SUBMISSION:
2nd DECEMBER 2020
Practical 1: Biological Databases with Reference to NCBI
Biological data are highly complex and interrelated. Vast amounts of biological information need
to be stored, organized, and indexed so that the information can be retrieved and used. There
are five major types of biological databases: nucleotide databases, protein databases,
protein structure databases, metabolic pathway databases, and bibliographic databases.
Introduction:
A database is a collection of biological data stored on a computer that can be manipulated
to appear in varying arrangements and subsets. Biological information can be stored in
different databases. Each database has its own website with unique navigation tools.
Biological databases are, in general, publicly available. They can be broadly divided into
two categories: primary databases and
secondary databases. Primary databases are also called archival databases. They are
populated with experimentally derived data such as nucleotide sequences, protein
sequences, or macromolecular structures. Experimental results are submitted directly into
the database by researchers, and the data are essentially archival in nature. Once given a
database accession number, the data in primary databases are never changed: they
form part of the scientific record. Examples include ENA, GenBank and DDBJ (nucleotide
sequences), ArrayExpress Archive and GEO (functional genomics data), and the Protein Data
Bank (PDB; coordinates of three-dimensional macromolecular structures). Secondary
databases contain information derived from the results of analysing primary data.
Secondary databases often draw upon information from numerous sources, including other
databases (primary and secondary), controlled vocabularies, and the scientific literature. They are
highly curated, often using a complex combination of computational algorithms
and manual analysis and interpretation to derive new knowledge from the public record of
science. Examples include InterPro (protein families, motifs and domains), UniProt
Knowledgebase (sequence and functional information on proteins) and Ensembl (variation,
function, regulation and more layered onto whole genome sequences). The importance of
biological databases is threefold. First, databases are used to store and organize data in such
a way that information can be retrieved easily via a variety of search criteria. Second, they
allow knowledge discovery, which refers to the identification of connections between
pieces of information that were not known when the information was first entered. This
facilitates the discovery of new biological insights from raw data. Lastly, they
help to handle cases where multiple users need to access the same entries of
data. NCBI is now a leading source for public biomedical databases, software
tools for analyzing molecular and genomic data, and research in
computational biology. Today NCBI creates and maintains over 40 integrated
databases for the medical and scientific communities as well as the general public. NCBI
provides a wide variety of resources and data analysis tools, covering literature, health,
genomes, genes, proteins, and chemicals, that allow users to manipulate, align, visualize,
and evaluate biological data.
Aim:
To view and use the various biological databases available on the World Wide Web.
Procedure:
1. Open your web browser and type the web address of the required database.
2. Explore the database and analyze the various information available in the database.
3. Use the tools provided by the databases.
4. Take a screenshot of your output and paste it into MS Word.
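Steps 1 to 3 can also be approached programmatically. The following is a minimal sketch, assuming NCBI's E-utilities efetch service is used to request a protein record in FASTA format. The helper name build_efetch_url is hypothetical, and the accession shown is one of those used in this practical. The sketch only builds the web address; retrieving the record itself requires network access.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="protein", rettype="fasta"):
    """Return an efetch URL for one accession (hypothetical helper)."""
    params = {
        "db": db,            # Entrez database to query
        "id": accession,     # accession number of the record
        "rettype": rettype,  # requested record format
        "retmode": "text",   # plain-text output
    }
    return EFETCH_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the address for the cystatin C record used in this report.
    print(build_efetch_url("CAA29096"))
```

The printed address can be typed into a web browser as in step 1, which should display the record as plain text.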
You are required to explore the database and analyze the information available in the
database for the following SEVEN (7) proteins:
i. Cystatin C (Organism: Homo sapiens)
Accession number
ACCESSION CAA29096
Sequence in FASTA format
>CAA29096.1 cystatin C [Homo sapiens]
MAGPLRAPLLLLAILAVALAVSPAAGSSPGKPPRLV
GGPMDASVEEEGVRRALDFAVGEYNKASNDMYHSRALQVVRARKQIVAGVNYFLDVELG
RTTCTKTQPNLDNCPFHDQPHLKRKAFCSFQIYAVPWQGTMTLSKSTCQDA
Graphics
ii. G-protein (Organism: Daphnia magna)
Accession number
ACCESSION KZS12823
Graphics
iii. Keratin (Organism: Mus musculus)
Accession number
ACCESSION AAA39370
Sequence in FASTA format
>AAA39370.1 keratin, partial [Mus musculus]
EVVKKQCIGVQDSIADAEQHGEHAIKDARGKLT
DLEEALQQCREDLARLLRDYQELMNTKLSLDVEIATYRKLLEGEECRMSGDFSDNVSVSITS
STISSSMASKTGFGSGGQSSGGRGSYGGRGGGGGGGSSYGSGGRSSGSRGSGSGSGG
GGYSSGGGSRGGSGGGYGSGGGSRGGSGGGYGSGGGSGSGGGYSSGGGSRGGSGG
GGASSGGGSRGGSSSGGGSRGGSSSGGGGYSSGGGSRGGSSSGGQDLALKREVLGQG
KVVAQV
Graphics
iv. Albumin (Organism: Theobroma cacao)
Accession number
ACCESSION EOY34699
Graphics
v. Hemoglobin (Organism: Staphylococcus aureus)
Accession number
ACCESSION EFE26080
Graphics
vi. Collagen (Organism: Human)
Accession number
ACCESSION BAA04809
Sequence in FASTA format
Graphics
vii. Myosin (Organism: Zea mays)
Accession number
ACCESSION AHI45153
Sequence in FASTA format
Graphics
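A FASTA record such as those retrieved in this practical consists of a header line beginning with ">" followed by one or more lines of sequence. As a minimal sketch (the function name parse_fasta is hypothetical, and the record is the cystatin C entry from this report), the header and sequence could be separated as follows:

```python
def parse_fasta(text):
    """Split a single FASTA record into (header, sequence)."""
    # Keep only non-empty lines, stripped of surrounding whitespace.
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA record must start with '>'")
    header = lines[0][1:]          # drop the leading '>'
    sequence = "".join(lines[1:])  # join the wrapped sequence lines
    return header, sequence


# Cystatin C record as listed in this report.
RECORD = """>CAA29096.1 cystatin C [Homo sapiens]
MAGPLRAPLLLLAILAVALAVSPAAGSSPGKPPRLV
GGPMDASVEEEGVRRALDFAVGEYNKASNDMYHSRALQVVRARKQIVAGVNYFLDVELG
RTTCTKTQPNLDNCPFHDQPHLKRKAFCSFQIYAVPWQGTMTLSKSTCQDA
"""

if __name__ == "__main__":
    header, sequence = parse_fasta(RECORD)
    accession = header.split()[0]  # first word of the header
    print(accession)
    print(len(sequence), "residues")
```

The first word of the header is the accession with its version suffix, and the joined sequence can then be passed to other analysis tools.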
Conclusion:
Biological databases play a central role in bioinformatics. They offer researchers the
opportunity to access a wide variety of biologically relevant data, including the genomic
sequences of an increasingly broad range of organisms. This unit gives a brief overview of major
sequence databases and portals, such as GenBank, the UCSC Genome Browser, and
Ensembl. Model organism databases, including WormBase, The Arabidopsis Information
Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource,
are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man
(OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and
Genomes (KEGG), are also discussed.