CAB Unit 1 Notes
Translation
DNA contains the complete genetic information that defines the
structure and function of an organism.
Proteins are formed using the genetic code of the DNA.
Conversion of DNA encoded information to RNA is essential to
form proteins.
Thus, within most cells, the genetic information flows from DNA to
RNA to protein.
The flow of information is followed through three different processes
which are responsible for the inheritance of genetic information and
for its conversion from one form to another:
1. Replication: a double stranded nucleic acid is duplicated to give
identical copies. This process perpetuates the genetic information.
2. Transcription: a DNA segment that constitutes a gene is read and
transcribed into a single stranded sequence of RNA. In eukaryotes, the
RNA moves from the nucleus into the cytoplasm.
3. Translation: the RNA sequence is translated into a sequence of amino
acids as the protein is formed. During translation, the ribosome reads
three bases (a codon) at a time from the RNA and translates them into
one amino acid.
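Transcription and translation, as described above, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input is the coding strand of the DNA (so transcription reduces to replacing T with U), the standard genetic code is used, and translation starts at the first base and stops at the first stop codon. The names `CODON_TABLE`, `transcribe`, and `translate` are made up for this sketch.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a DNA coding strand (assumption: T is simply replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases (one codon) at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the chain ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA (methionine, alanine)
```

Each amino acid is written with its one-letter code. A real ribosome first locates a start codon, which this sketch does not model.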
This flow of information is unidirectional and irreversible.
This explanation is the simplest way in which the Central Dogma of
Molecular Biology is interpreted.
In the bigger picture, the central dogma of molecular biology is an
explanation of the flow of genetic information within a biological
system.
It was first stated by Francis Crick in 1958, as
“Once ‘information’ has passed into protein it cannot get out
again. In more detail, the transfer of information from nucleic
acid to nucleic acid or from nucleic acid to protein may be
possible, but transfer from protein to protein, or from protein to
nucleic acid is impossible.”
The Dogmas
The dogma is a framework for understanding the transfer
of sequence information between information-carrying biopolymers,
DNA and RNA (both nucleic acids), and protein.
There are 3×3=9 conceivable direct transfers of information that can
occur between these.
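The dogma classes these into 3 groups of 3:
General transfers (believed to occur normally in most cells): DNA to DNA,
DNA to RNA, RNA to protein
Special transfers (known to occur, but only under specific conditions): RNA
to RNA, RNA to DNA, DNA to protein
Unknown transfers (believed never to occur): protein to DNA, protein to
RNA, protein to protein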
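The 3×3=9 count and the three groups can be checked with a short Python sketch. It assumes the grouping usually attributed to Crick's 1970 restatement of the dogma (general, special, and unknown transfers); the names `BIOPOLYMERS`, `GENERAL`, `SPECIAL`, and `classify` are illustrative only.

```python
from itertools import product

BIOPOLYMERS = ("DNA", "RNA", "protein")

# Assumed grouping (Crick, 1970): every transfer not listed is "unknown".
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "protein")}
SPECIAL = {("RNA", "RNA"), ("RNA", "DNA"), ("DNA", "protein")}


def classify(source, target):
    """Return the group of a direct information transfer source -> target."""
    if (source, target) in GENERAL:
        return "general"
    if (source, target) in SPECIAL:
        return "special"
    return "unknown"


# All ordered (source, target) pairs: 3 x 3 = 9 conceivable transfers.
transfers = list(product(BIOPOLYMERS, repeat=2))

groups = {}
for source, target in transfers:
    groups.setdefault(classify(source, target), []).append(f"{source} -> {target}")

for name, members in groups.items():
    print(name, members)
```

Every transfer that starts from protein lands in the "unknown" group, which matches the quotation from Crick above.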
DNA Structure
RNA Structure
Like DNA, RNA is a long polymer consisting of nucleotides.
RNA is a single-stranded helix.
The strand has a 5′end (with a phosphate group) and a 3′end (with a
hydroxyl group).
It is composed of ribonucleotides.
The ribonucleotides are linked together by 3′→5′ phosphodiester
bonds.
The nitrogenous bases that compose the ribonucleotides include
adenine, cytosine, uracil, and guanine.
Thus, the differences between the structure of RNA and that of DNA include:
The bases in RNA are adenine (abbreviated A), guanine (G), uracil (U)
and cytosine (C).
Thus thymine in DNA is replaced by uracil in RNA, a different pyrimidine.
However, like thymine, uracil can form base pairs with adenine.
The sugar in RNA is ribose rather than deoxyribose as in DNA.
The corresponding ribonucleosides are adenosine, guanosine, cytidine
and uridine. The corresponding ribonucleotides are adenosine
5′-triphosphate (ATP), guanosine 5′-triphosphate (GTP), cytidine
5′-triphosphate (CTP) and uridine 5′-triphosphate (UTP).
RNA Secondary Structure
Most RNA molecules are single-stranded but an RNA molecule may
contain regions which can form complementary base pairing where
the RNA strand loops back on itself.
If so, the RNA will have some double-stranded regions.
Ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs) exhibit substantial
secondary structure, as do some messenger RNAs (mRNAs).
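The idea of a strand looping back on itself can be illustrated with a small Python sketch that searches an RNA sequence for two segments able to pair with each other. It is a toy search, not a structure-prediction method: only A-U and G-C pairs are considered, and the function names and the parameters `stem_len` and `min_loop` are assumptions made for this example.

```python
# Watson-Crick pairs in RNA (assumption: G-U wobble pairs are ignored).
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that would pair with `rna` in an anti-parallel duplex."""
    return "".join(PAIRS[base] for base in reversed(rna))


def find_hairpin_stems(rna, stem_len=4, min_loop=3):
    """Return (i, j) start positions of two segments of length `stem_len`
    that are reverse complements, separated by at least `min_loop` bases."""
    stems = []
    n = len(rna)
    for i in range(n - 2 * stem_len - min_loop + 1):
        target = reverse_complement(rna[i:i + stem_len])
        for j in range(i + stem_len + min_loop, n - stem_len + 1):
            if rna[j:j + stem_len] == target:
                stems.append((i, j))
    return stems


# GGGG (positions 0-3) can pair with CCCC (positions 8-11); AAAA forms the loop.
print(find_hairpin_stems("GGGGAAAACCCC"))  # [(0, 8)]
```

A sequence with no complementary segments returns an empty list, which corresponds to an RNA that stays fully single-stranded.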
Types of RNA
In both prokaryotes and eukaryotes, there are three main types of RNA –
rRNA (ribosomal)
tRNA (transfer)
mRNA (messenger)
Protein Structure
The linear sequence of amino acid residues in a polypeptide chain
determines the three-dimensional configuration of a protein, and the
structure of a protein determines its function.
All proteins contain the elements carbon, hydrogen, oxygen, and nitrogen,
and most contain sulfur. Some may also contain phosphorus, iodine, and
traces of metals like iron, copper, zinc, and manganese.
Proteins are built from 20 different kinds of amino acids. Each amino
acid has an amine group at one end and an acid group at the other
and a distinctive side chain.
The backbone is the same for all amino acids while the side chain differs
from one amino acid to the next.
The structure of proteins can be divided into four levels of organization:
1. Primary Structure
The primary structure of a protein consists of the amino acid sequence
along the polypeptide chain.
Amino acids are joined by peptide bonds.
Because there are no dissociable protons in peptide bonds, the charges
on a polypeptide chain are due only to the N-terminal amino group,
the C-terminal carboxyl group, and the side chains on amino acid
residues.
The primary structure determines the further levels of organization of
protein molecules.
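The statement about charges can be turned into a rough calculation. The sketch below estimates the net charge of a polypeptide at about neutral pH from its one-letter sequence. The assumptions are loud: the N-terminal amino group counts as +1, the C-terminal carboxyl group as -1, lysine (K) and arginine (R) side chains as +1, aspartate (D) and glutamate (E) side chains as -1, and all other side chains (including histidine) as uncharged. The function name is hypothetical.

```python
POSITIVE_SIDE_CHAINS = set("KR")  # lysine, arginine
NEGATIVE_SIDE_CHAINS = set("DE")  # aspartate, glutamate


def approximate_net_charge(sequence):
    """Rough net charge near pH 7; peptide bonds themselves contribute nothing."""
    seq = sequence.upper()
    charge = 0
    if seq:
        charge += 1  # N-terminal amino group
        charge -= 1  # C-terminal carboxyl group
    charge += sum(1 for aa in seq if aa in POSITIVE_SIDE_CHAINS)
    charge -= sum(1 for aa in seq if aa in NEGATIVE_SIDE_CHAINS)
    return charge


print(approximate_net_charge("GAG"))   # 0  (termini cancel, no charged side chains)
print(approximate_net_charge("KKK"))   # 3
print(approximate_net_charge("DDE"))   # -3
```

A more careful estimate would use the pKa of each group and the actual pH; this sketch only counts whole charges.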
2. Secondary Structure
The secondary structure includes various types of local conformations in
which the atoms of the side chains are not involved.
Secondary structures are formed by a regularly repeating pattern of
hydrogen bond formation between backbone atoms.
The secondary structure involves α-helices, β-sheets, and other types of
folding patterns that occur due to a regularly repeating pattern of
hydrogen bond formation.
The secondary structure of a protein can be:
1. Alpha-helix
2. Beta-sheet
The α-helix is a right-handed coiled strand.
The side-chain substituents of the amino acid groups in an α-helix
extend to the outside.
Hydrogen bonds form between the oxygen of the C=O of each peptide
bond in the strand and the hydrogen of the N-H group of the peptide
bond four amino acids below it in the helix.
The side-chain substituents of the amino acids fit in beside the N-H
groups.
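The hydrogen-bonding pattern of the α-helix (the C=O of one residue bonded to the N-H four residues further along) can be listed with a one-line Python function. Residues are numbered from 1 at the N-terminus, the helix is assumed to be ideal along its whole length, and the function name is made up for this sketch.

```python
def alpha_helix_hbonds(n_residues):
    """Pair the C=O of residue i with the N-H of residue i + 4 (1-based numbering)."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(alpha_helix_hbonds(10))
# [(1, 5), (2, 6), (3, 7), (4, 8), (5, 9), (6, 10)]
```

The last four residues of the helix have no partner four positions ahead, so a helix of n residues has n - 4 such bonds.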
The hydrogen bonding in a β-sheet is between strands (inter-strand)
rather than within strands (intra-strand).
The sheet conformation consists of pairs of strands lying side-by-side.
The carbonyl oxygens in one strand hydrogen bond with the amino
hydrogens of the adjacent strand.
The two strands can be either parallel or anti-parallel depending on
whether the strand directions (N-terminus to C-terminus) are the same
or opposite.
The anti-parallel β-sheet is more stable because its hydrogen bonds are
better aligned.
3. Tertiary Structure
The tertiary structure of a protein refers to its overall three-dimensional
conformation.
The types of interactions between amino acid residues that produce the
three-dimensional shape of a protein include hydrophobic
interactions, electrostatic interactions, and hydrogen bonds, all of
which are non-covalent.
Covalent disulfide bonds also occur.
The tertiary structure is produced by interactions between amino acid
residues that may be located at a considerable distance from each other
in the primary sequence of the polypeptide chain.
Hydrophobic amino acid residues tend to collect in the interior of
globular proteins, where they exclude water, whereas hydrophilic
residues are usually found on the surface, where they interact with
water.
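How hydrophobic or hydrophilic a sequence is can be estimated by averaging a per-residue hydropathy value. The sketch below uses the Kyte-Doolittle hydropathy scale, which is not part of these notes and is brought in here as an assumption; positive values indicate hydrophobic residues and negative values hydrophilic ones. The function name `mean_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a sequence; higher means more hydrophobic."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# A stretch of Leu/Ile/Val/Phe scores positive (likely buried in the interior),
# a stretch of Lys/Asp/Glu/Arg scores negative (likely on the surface).
print(mean_hydropathy("LIVF") > 0)  # True
print(mean_hydropathy("KDER") < 0)  # True
```

Applying the same average over a sliding window along a sequence is a common way to look for hydrophobic stretches.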
4. Quaternary Structure
Quaternary structure refers to the interaction of two or more subunits to
form a functional protein, using the same forces that stabilize the
tertiary structure.
It is the spatial arrangement of subunits in a protein that consists of
more than one polypeptide chain.
Classification of Proteins
Based on the chemical nature, structure, shape, and solubility, proteins are
classified as:
1. Simple proteins: They are composed only of amino acid residues. On
hydrolysis, these proteins yield only constituent amino acids. They are
further divided into:
Fibrous protein: Keratin, Elastin, Collagen
Globular protein: Albumin, Globulin, Glutelin, Histones
2. Conjugated proteins: They are combined with a non-protein moiety.
E.g., nucleoprotein, phosphoprotein, lipoprotein, metalloprotein, etc.
3. Derived proteins: They are derivatives or degraded products of simple
and conjugated proteins. They may be :
Primary derived protein: Proteans, Metaproteins, Coagulated
proteins
Secondary derived proteins: Proteoses or albumoses, peptones,
peptides.
Functions of Proteins
Proteins are vital for growth and repair. They have an enormous diversity of
biological functions and are the most important final products of the
information pathways.
Proteins, which are composed of amino acids, serve in many roles in the
body (e.g., as enzymes, structural components, hormones, and
antibodies).
They act as structural components such as keratin of hair and nail,
collagen of bone, etc.
Proteins are the molecular instruments through which genetic
information is expressed.
They transport oxygen and carbon dioxide, through hemoglobin and
special enzymes in the red cells.
They function in the homeostatic control of the volume of the
circulating blood and that of the interstitial fluids through the plasma
proteins.
They are involved in blood clotting through thrombin, fibrinogen, and
other protein factors.
They act as the defense against infections by means of protein
antibodies.
They take part in hereditary transmission through the nucleoproteins of
the cell nucleus.
Ovalbumin, glutelin, etc. are storage proteins.
Actin and myosin act as contractile proteins important for muscle
contraction.
Topic 5: OMICS technology
The suffix -ome, derived from Greek, means “whole,” “all,” or “complete,” and the
term “omics” is derived from it. Adding -ome to the name of a class of cellular
molecules, such as gene, transcript, protein, or metabolite, gives the terms
genome, transcriptome, proteome, and metabolome, respectively [3, 4].
Omics technologies and systems biology are emerging concepts in molecular
medicine (Figure 1). Omics refers to collective, high-throughput analyses,
including genomics, transcriptomics, proteomics, and metabolomics/lipidomics, that
are integrated through robust systems biology, bioinformatics, and computational
tools to study the mechanism, interaction, and function of cell populations, tissues,
organs, and the whole organism at the molecular level in a non-targeted and
non-biased manner [5].
Genomics is the systematic study of an organism’s entire genome [6]. The human
genome is made up of DNA (deoxyribonucleic acid) comprising approximately 3
billion base pairs built from four nucleotides, which carry the bases adenine,
guanine, cytosine, and thymine. DNA contains the genetic information required to
build and maintain cells. A gene denotes a specific unit of DNA that holds the
information to make a specific functional unit, a protein. It is estimated that the
entire human genome contains approximately 21,500 genes. The order of the
nucleotides carries the meaning of the information encoded in DNA. The emergence
of high-throughput sequencing technologies, such as next-generation sequencing,
enables analysis of variations between individuals at the genomic level.
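Comparing individuals at the sequence level can be pictured as a position-by-position comparison of a sample against a reference. The Python sketch below is a deliberately minimal illustration: it assumes the two sequences are already aligned and of equal length, and it reports only single-base substitutions. The function name `find_variants` and the example sequences are made up.

```python
def find_variants(reference, sample):
    """Return (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based. Assumes pre-aligned sequences of equal length,
    so insertions and deletions are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


print(find_variants("ACGTAC", "ACCTAT"))
# [(3, 'G', 'C'), (6, 'C', 'T')]
```

Real variant calling works on millions of short reads and has to deal with sequencing errors and alignment, which this sketch leaves out entirely.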
Transcriptomics is the study of the transcriptome, which comprises the entire
collection of RNA (ribonucleic acid) sequences, called transcripts, in a cell. It is
estimated that a human cell contains about 25,000 transcripts. RNAs are classified
into two groups: (1) mRNA, the coding RNA that is translated into protein
sequences, and (2) non-coding RNAs, which are involved in gene regulation and are
further divided into short non-coding RNAs such as microRNA (miRNA) and long
non-coding RNAs (lncRNA). Next-generation RNA sequencing technologies allow a
deep understanding of variation and gene expression across various types of RNA
molecules, including miRNA, mRNA, and lncRNA [2].
Proteomics is the study of the proteome, which is defined as the set of all expressed
proteins, interacting protein family networks, and biochemical pathways in a cell,
tissue, or organism. Although the exact number of proteins/peptides is still unclear, it
is estimated to be around a few hundred thousand.
What is sequencing?
You may have heard of genomes being sequenced. For instance, the
sequencing of the human genome was completed in 2003, after a
many-year, international effort. But what does it mean to sequence a
genome, or even a small fragment of DNA?
In this article, we’ll take a look at methods used for DNA sequencing.
We'll focus on one well-established method, Sanger sequencing, but
we'll also discuss new ("next-generation") methods that have reduced
the cost and accelerated the speed of large-scale sequencing.
The mixture is first heated to denature the template DNA (separate the
strands), then cooled so that the primer can bind to the single-
stranded template. Once the primer has bound, the temperature is
raised again, allowing DNA polymerase to synthesize new DNA
starting from the primer. DNA polymerase will continue adding
nucleotides to the chain until it happens to add a dideoxy nucleotide
instead of a normal one. At that point, no further nucleotides can be
added, so the strand will end with the dideoxy nucleotide.
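The chain-termination idea can be mimicked in Python. In the sketch below, every possible stopping point produces one fragment ending in a (notional) dideoxy nucleotide, and the sequence is read back by ordering the fragments by length and taking the last base of each. Several simplifications are assumed: the primer is ignored, the template is written 3′ to 5′ so the new strand reads 5′ to 3′, exactly one fragment of each length is produced, and sorting by length stands in for the size separation done in the laboratory. All names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def chain_terminated_fragments(template_3_to_5):
    """Return one fragment for each position where synthesis could stop.

    The fragment of length k is the first k bases of the new strand; its last
    base represents the dideoxy nucleotide that ended the chain.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3_to_5)
    return [new_strand[:k] for k in range(1, len(new_strand) + 1)]


def read_sequence(fragments):
    """Order fragments from shortest to longest and read each terminal base."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


fragments = chain_terminated_fragments("TACGGA")
print(fragments)                 # ['A', 'AT', 'ATG', 'ATGC', 'ATGCC', 'ATGCCT']
print(read_sequence(fragments))  # ATGCCT
```

The sequence that is read is that of the newly synthesized strand, which is complementary to the template.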
Next-generation sequencing
The name may sound like Star Trek, but that’s really what it’s called!
The most recent set of DNA sequencing technologies are collectively
referred to as next-generation sequencing.
Micro scale: reactions are tiny and many can be done at once on
a chip
Topic 10: Big data
Big data is a collection of data from many different sources and is often
described by five characteristics: volume, value, variety, velocity, and veracity.
Volume: the size and amounts of big data that companies manage and
analyze
Value: the most important “V” from the perspective of the business, the value
of big data usually comes from insight discovery and pattern recognition that
lead to more effective operations, stronger customer relationships and other
clear and quantifiable business benefits
Variety: the diversity and range of different data types, including unstructured
data, semi-structured data and raw data
Velocity: the speed at which companies receive, store and manage data –
e.g., the specific number of social media posts or search queries received
within a day, hour or other unit of time
Veracity: the “truth” or accuracy of data and information assets, which often
determines executive-level confidence
1. Genomics
The Human Genome Project took many years of worldwide research and
support to identify the more than 20,000 genes and to sequence all 3 billion
bases of the genome. The project cost billions of dollars globally, but
today’s biotechnology companies can use Big Data tools to decode
entire genomes for just thousands of dollars. The genomics market supports
various data companies that use frameworks and tools to carry out large and
complicated computing tasks in analyzing genetic, medical, and biological data.
These companies often work with computer hardware giants to improve their
application performance and their Big Data analysis results.
2. Agriculture
Big data also has applications in the field of agriculture. Data
gathered from GPS technology are stored in Big Data frameworks, and
multiple GPS-enabled tractors can help farmers cope with changing
environmental conditions through precision farming. Data analytics is
also changing the landscape of the biotech industry with its contribution to
genetic research in creating genetically modified organisms. Such engineered
crops can be modified using inputs from Big Data collection to
improve crop yield, survive changing conditions, and obtain disease-free
plants.
3. Pharma Automation
Almost every pharma company screens millions of compounds before
selecting the appropriate ones for pre-clinical trials. The journey to successful
drug discovery consumes an enormous amount of time and money, so there
are many software tools that help improve efficiency and shorten drug discovery.
Big Data based modeling draws on terabytes of stored data and
information about different compounds and their characteristics. It therefore acts
as a virtual library holding information on millions of compounds, used to identify
the compounds that are most likely to succeed. These predictive modeling
programs compare the trial criteria and desired outcomes against the target
disease and chemical structures. Pharma automation reduces risks, saves money,
and offers faster research-to-market cycles.
4. Healthcare
The healthcare sector of biotechnology has lagged behind
others in the use of Big Data databases. Healthcare stakeholders now have access
to promising new threads of knowledge. This information, in the form of Big
Data, is marked by its complexity, diversity, and timeliness. Pharmaceutical
industry experts analyze big data to obtain insights. With these technological
advances, the biotech industry has improved its ability to work with such data, even
though the files are enormous and organized in different database structures, which
improves the quality and the rate of development of pharmaceutical healthcare.
5. Crowdsourcing
According to Wikipedia, crowdsourcing is a sourcing model in which
individuals or organizations obtain goods and services, including
ideas and finances, from a large, relatively open, and often rapidly evolving
group of internet users. It divides work between participants to achieve a
cumulative result, and it is commonly used for outsourcing labor and
entrepreneurial projects. Some pharma companies have created online gaming
platforms that involve disease profiles, research challenges, and medical
puzzles. With crowdsourcing, patients drive research through online
surveys that empower consumers to conduct their own studies and research,
upload their own medical data, and contribute knowledge about their condition
and symptoms to benefit the whole medical community.
6. Business Development
Every day the body of information on scientific discoveries and pharma
progress grows from many different sources, presenting an enormous flood of
data for biopharma industries to sift through to find potential licensing
opportunities. Some Big Pharma and biotech companies have turned to analytics
and data-mining technologies to scour disparate Big Data sources and deliver the
exact information they seek. This Big Data grows and develops along with the
business. Used well, Big Data can increase the total revenue and profit of the
business and thus help develop the biotech business.
7. Sentiment Analysis
Among the tools of Big Data, sentiment analysis is one that helps to analyze
social networking posts and comments. Organizations primarily use it for
marketing, advertising, and public relations research. For example, many
companies use it to find the reaction of the consumer and get their feedback.
Social media platforms also contain millions of health-related comments,
because health care consumers share personal and public information
about diseases and medical conditions. Some companies are creating online
groups and communities to centralize and uncover new discoveries and
technologies. When used together with crowdsourcing, these tools provide
sources of free labor and abundant information.
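A very simple form of sentiment analysis scores a post by counting positive and negative words. The Python sketch below shows the idea only; the two word lists are tiny and invented for illustration, and real tools rely on much larger lexicons or trained models. The names `POSITIVE_WORDS`, `NEGATIVE_WORDS`, `sentiment_score`, and `label` are hypothetical.

```python
import re

# Illustrative word lists only (assumption: not taken from any real lexicon).
POSITIVE_WORDS = {"good", "great", "better", "helped", "relief", "effective"}
NEGATIVE_WORDS = {"bad", "worse", "pain", "nausea", "ineffective", "terrible"}


def sentiment_score(text):
    """Count positive words minus negative words in a post."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE_WORDS) - (w in NEGATIVE_WORDS) for w in words)


def label(text):
    """Turn the score into a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("This medicine really helped, great relief"))  # positive
print(label("Terrible nausea and pain after the dose"))     # negative
print(label("I took it today"))                             # neutral
```

Word counting of this kind misses negation and sarcasm, which is one reason production systems use more elaborate models.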
Sproxil aims to amass large amounts of transactional data with a system that
enables patients to text-message codes from medicine bottles to learn whether
the meds are authentic. With IBM’s visualization tech and other analytics,
drugmakers can tap a large amount of data on drug transactions in real-time,
according to Big Blue. Presumably, prescription drug frauds can be spotted.
In Conclusion
The leading regions for the global biotech industry are the US and Europe,
where there are over 700 companies and over 200 thousand employees that
generate around 140 billion U.S. dollars of revenue. In the 21st century, the
health care sector is growing as another promising sector of the economy, where
technology for collecting and processing large amounts of data and information
in Big Data database systems is applied. With the help of Big Data and its
applications in the field of biotechnology, growth and innovation in biotech
should rise even more rapidly, delivering on the promise of its
true potential.