DNA Digital Data Storage: P Priyanka 2451-17-735-001 M Rohith 2451-17-735-047

ABSTRACT
Digital data has changed how information is used and accessed. Cloud computing is an
efficient way of storing it, but it depends on an internet connection, retains data
for a limited time and is not very secure. Meanwhile, the volume of digital data keeps
increasing, so we need an alternative that stores data for longer, more securely and
with less susceptibility to technical failures.
Deoxyribonucleic acid (DNA) can store genetic information across many
generations. It can be used as a robust, high-density storage medium even
under unfavourable conditions. This article outlines how DNA can be used
for digital data storage.
INTRODUCTION
The idea of DNA digital data storage dates back to 1959, when the physicist Richard
P. Feynman, in "There's Plenty of Room at the Bottom: An Invitation to Enter a New
Field of Physics", outlined the general prospects for creating artificial objects
similar to those of the microcosm (including biological ones) with similar or even
more extensive capabilities. The demand for data storage is increasing as more and
more data is generated every day. The total amount of information in digital format
in 2012 was about 2.7 zettabytes. By 2020 an estimated 1.7 megabytes of
data will be created per second per person globally, which translates to about 418
zettabytes in a single year (418 billion one-terabyte hard drives' worth of information),
assuming a world population of 7.8 billion. Present-day storage technology is neither
fully reliable nor secure, and it is limited by time and environment. As data keeps
growing, current storage technology will not be enough to hold it in future. Even
potentially important information can get lost due to lack of storage space. Running
data centers also takes huge amounts of energy. In short, we are about to have a
serious data-storage problem that will only become more severe over time.
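The 418-zettabyte figure follows directly from the stated rate and population. The short check below reproduces it, assuming a 365-day year and decimal units (1 ZB = 10^21 bytes, 1 TB = 10^12 bytes); neither assumption is stated in the text above.

```python
# Back-of-the-envelope check of the yearly data volume quoted above.
# Assumptions (not in the original text): a 365-day year and decimal (SI) units.
BYTES_PER_SECOND_PER_PERSON = 1.7e6   # 1.7 megabytes
WORLD_POPULATION = 7.8e9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

bytes_per_year = BYTES_PER_SECOND_PER_PERSON * WORLD_POPULATION * SECONDS_PER_YEAR
zettabytes_per_year = bytes_per_year / 1e21
one_terabyte_drives = bytes_per_year / 1e12

print(f"{zettabytes_per_year:.0f} ZB per year")
print(f"{one_terabyte_drives / 1e9:.0f} billion one-terabyte drives")
```

Both printed values round to 418, matching the figures in the paragraph.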
There are many efficient ways to store and back up digital data. One of the most
commonly used methods is cloud computing. But accessing data stored in a remote
cloud requires an internet connection at all times. Moreover, although a cloud can
store huge volumes of data, it can keep them for only a couple of decades, and it is
less secure. Another way is to store data on an external drive, but external drives
are prone to data loss too.
For over a decade, scientists and researchers have been trying to develop a way of
storing data on a medium that is dense, robust and long-lasting, and they have arrived
at DNA-based data storage: a new, efficient, secure and reliable system for storing
digital data. DNA is very small and has a very high storage density. Just 1 gram of
dry DNA can store about 455 exabytes of data. The power required to work with it is
very small compared with conventional storage. With suitable redundancy, the error
rate of DNA storage can also be kept very low. Because DNA can retain information for
centuries, it is suited to long-term storage, and because of its high density it can
hold a large amount of data in a very small space.
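To put the density figure in perspective, the sketch below estimates how much DNA would hold the 418 zettabytes per year quoted in the Introduction. It uses only the two figures from this article and assumes decimal units; it ignores the extra DNA that a real system would need for redundancy, addressing and error correction.

```python
# Rough estimate of the DNA mass needed for one year of global data.
# Assumptions: the figures quoted in this article, decimal units
# (1 ZB = 1000 EB), and no overhead for redundancy or error correction.
EXABYTES_PER_GRAM = 455
YEARLY_DATA_ZETTABYTES = 418

yearly_data_exabytes = YEARLY_DATA_ZETTABYTES * 1000
grams_needed = yearly_data_exabytes / EXABYTES_PER_GRAM

print(f"about {grams_needed:.0f} g of DNA")
```

Under these assumptions the answer is a little under one kilogram.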
How Scientists Did It
The DNA in our cells contains the instructions for building all the proteins that keep
us running. DNA is made up of sequences of four nucleobases: adenine,
guanine, cytosine, and thymine (A, G, C, and T). Bases on opposite strands pair up
to form base pairs. Each sequence of three bases translates to an amino acid, and
amino acids are the building blocks of proteins. This is data storage just like what
we do with hard drives, but with much higher potential density.
The four-letter nucleobase alphabet of DNA (A, C, G and T) can be mapped
to binary code, for example 00 for A, 01 for C, 10 for G and 11 for T. Scientists
looked at the algorithms being used to encode and decode the data. They first
converted the files into binary strings of 1s and 0s, compressed them into one master
file, and then split the data into short strings of binary code. They devised an
algorithm called DNA Fountain, which randomly packaged the strings into droplets, to
which they added extra tags so that the file could be put back together.
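The two-bit mapping given above can be sketched in a few lines. This is only an illustration of that example mapping, not the DNA Fountain algorithm itself; the function names `encode_bytes` and `decode_bases` are hypothetical, and practical schemes usually add further constraints (such as avoiding long runs of the same base) that are left out here.

```python
# Two-bit mapping from the text: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
# Function names are illustrative, not taken from any published tool.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode_bytes(data: bytes) -> str:
    """Turn bytes into a base sequence, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode_bases(sequence: str) -> bytes:
    """Turn a base sequence back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

sequence = encode_bytes(b"DNA")
print(sequence)                      # CACACATGCAAC
print(decode_bases(sequence))        # b'DNA'
```

Each byte becomes exactly four bases, so the three-byte input yields a twelve-base sequence.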
They started with six files, including a full computer operating system and a computer
virus. In all, the researchers generated a digital list of 72,000 DNA strands, each 200
bases long. They sent these as text files for synthesis. Later, the DNA was sequenced
and the sequences were fed into a computer, which translated the genetic code back
into binary and used the tags to reassemble the six original files. The approach
worked so well that the new files contained no errors, and the researchers were also
able to make a virtually unlimited number of error-free copies of their files.
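The role of the tags can be illustrated with a much simpler scheme than DNA Fountain: cut the master file into short payloads, attach a position tag to each, and sort by tag when reading back. The sketch below is a simplified stand-in under that assumption; the names `split_with_tags` and `reassemble` and the payload size are hypothetical.

```python
import random

# Simplified stand-in for tagged strands: NOT the DNA Fountain algorithm.
# Each segment carries an index tag so that the order can be recovered later.

def split_with_tags(data: bytes, payload_size: int) -> list[tuple[int, bytes]]:
    """Cut data into fixed-size payloads, each tagged with its position."""
    return [
        (index, data[start:start + payload_size])
        for index, start in enumerate(range(0, len(data), payload_size))
    ]

def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Sort segments by tag and join their payloads."""
    return b"".join(payload for _, payload in sorted(segments))

original = b"six files, one master file, many short strands"
segments = split_with_tags(original, payload_size=8)
random.shuffle(segments)  # strands in a tube have no physical order
assert reassemble(segments) == original
```

Because every segment carries its own tag, the order in which strands are read does not matter.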
Storing of Data
Encoding Data into the DNA Sequence:

Computers work on a binary system of 0 and 1. In the very first step, the digital
data is mapped onto DNA. DNA has four nitrogenous bases: Adenine (A),
Cytosine (C), Guanine (G) and Thymine (T). To store data in DNA, the binary digits
0 and 1 are first converted into a sequence of the bases A, T, G and C.
Artificial DNA synthesis:

Single-stranded DNA with an arbitrary sequence can be synthesized chemically.
Following the encoded sequence, nucleotides are added one at a time to the growing
strand. The efficiency of artificial DNA synthesis is about 99%, but the remaining 1%
error can create a major problem for digital data storage. To overcome this problem,
synthesis is started at a large number of parallel sites so that multiple copies of
each sequence are produced. Thus, even if a single copy contains an error, many other
exact copies are available.
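A small calculation shows why a 1% error matters and why multiple copies help. The sketch below assumes that the 99% figure applies independently to each base and that errors are substitutions only (no insertions or deletions); both are simplifying assumptions, and the `consensus` function is a hypothetical majority vote rather than a method described in the source.

```python
from collections import Counter

PER_BASE_ACCURACY = 0.99   # figure quoted in the text, assumed to apply per base
STRAND_LENGTH = 200        # strand length from the experiment described earlier

# Chance that one synthesized strand contains no error at all.
error_free_probability = PER_BASE_ACCURACY ** STRAND_LENGTH
print(f"{error_free_probability:.1%} of strands are expected to be error-free")

def consensus(copies: list[str]) -> str:
    """Majority vote per position across equal-length copies."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))

# Five copies of the same strand; two of them carry one substitution each.
copies = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA", "ACGTAC"]
print(consensus(copies))   # ACGTAC
```

Under these assumptions only about one strand in seven is entirely correct, yet the per-position vote still recovers the intended sequence as long as most copies agree at each position.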
Storing the sample:
The data backup takes the form of a liquid drop containing several nanograms of DNA.
The DNA can be kept in a deep freeze, where it can last for about 100 years, or sent
to external storage systems (provided by some companies) that can preserve it for
more than a thousand years. Kept cool and dry, DNA can remain stable for hundreds of
thousands of years. Nonetheless, some sequences could be lost over time.
Decoding information:
Finally, the DNA is sequenced and the resulting sequence is passed to a decoder, which
converts it back into binary. After decoding, the original data can be retrieved.
DNA Cryptography
Multiple efforts are underway to explore the potential of DNA to store cryptographic
keys and other private information. One idea is to bury sensitive information in
DNA so that it is sufficiently well hidden that it need not be encrypted. This method
is known as ‘DNA steganography’. The start-up Carverr is pursuing one
implementation of this idea by attempting to store Bitcoin passwords (known as
private keys) in DNA. Researchers at the University of Washington and Microsoft
Research have developed a fully automated system for writing, storing and reading
data encoded in DNA. A number of companies, including Microsoft and Twist
Bioscience, are working to advance DNA-storage technology. Meanwhile, DNA is
already being used to manage data in a different way, by researchers who grapple
with making sense of tremendous volumes of data. Recent advancements in
next-generation sequencing techniques allow billions of DNA sequences to be
read easily and simultaneously. With this ability, investigators can employ bar coding
(the use of DNA sequences as molecular identification “tags”) to keep track of
experimental results. DNA bar coding is now being used to dramatically accelerate
the pace of research in fields such as chemical engineering, materials science and
nanotechnology.
Advantages
• It is ultra-compact.
• It can last hundreds of thousands of years if kept in a cool, dry place.
• It can maintain its integrity without any power supply.
• It is less susceptible to technical failures.
Disadvantages
• DNA synthesis is very expensive.
• Reading the data back is very slow.
• The rewriting process is very complex.
Conclusion
Thus, by using DNA for data storage, it is possible to store a huge amount of data in
a very small space. As DNA can retain data for a very long time, it is suited to
long-term storage. This technique stores data compactly and can help keep it secure.
However, the financial and engineering barriers to viable storage of non-genetic data
in DNA are formidable, and the technology is in its infancy. Overcoming these
barriers would bring about a revolution in data storage and security, allowing massive
amounts of data to be stored securely in just a gram of matter. It would also open up
futuristic new organic computing use cases, including brain-computer interfaces.
In case of data damage, a copy can be used to read the data. If an error occurs
while encoding the data, it is restricted to that particular file and no other file is
affected. This technique can be used for all kinds of files by making
minor changes to adapt to the type of file. In place of conventional storage
devices, which have limited capacity, DNA-based storage may be used in the
distant future to store data securely and for long periods, solving the
problem of limited space.