Informatics: Lecture 4 - Information Storage
Introduction
Within the developed world, data is being created at a phenomenal rate.
Photos and videos are being uploaded 24/7, and people expect them to be available on demand, forever.
The developing world is catching up very quickly: the "other 3 billion" (o3B).
The need to store data indefinitely is now being met by remote storage within the cloud.

Storage Devices
Typically, data was stored on hard discs (slow) or in RAM (fast), and a typical computer might have about 100 times as much hard disc space as RAM.
Of course, capacities have increased relentlessly, and solid state storage (Solid State Discs) is becoming more common.
In parallel with this, local storage is now less important.

Storage Devices
[Figure: data transfer speeds of storage devices, labelled 0.1 GB/s, 1.5 GB/s, 0.5 GB/s, 0.01 GB/s and 3 GB/s; network speed 0.1 GB/s?]

Storage interfaces
While most devices have impressive storage, the focus is moving towards fast data interfaces:
USB3
Thunderbolt
4G and 5G
Fast WiFi
And yet most corporate networks will be at 100 MB/s for some time to come.

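As a rough illustration of why interface speed matters, the sketch below estimates the ideal time to move a data set over links of different speeds. The 500 GB data set is an assumed example size, and the function name is hypothetical; the speeds are round figures of the kind quoted in this lecture, with 0.1 GB/s standing for the 100 MB/s network.

```python
# Rough transfer-time estimate: time = size / speed.
# The 500 GB data set is an assumed example; speeds are in GB/s.

def transfer_time_seconds(size_gb: float, speed_gb_per_s: float) -> float:
    """Return the ideal transfer time, ignoring protocol overhead and latency."""
    return size_gb / speed_gb_per_s

DATA_SET_GB = 500  # assumed size, for illustration only

# Illustrative speeds only; they are not tied to any particular device.
speeds = {
    "0.01 GB/s": 0.01,
    "0.1 GB/s (network)": 0.1,
    "3 GB/s": 3.0,
}

for name, speed in speeds.items():
    minutes = transfer_time_seconds(DATA_SET_GB, speed) / 60
    print(f"{name}: {minutes:.1f} minutes")
```

Real transfers will be slower than this estimate, since overheads and shared links are ignored.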
Data compression
When storage was limited, it was quite common to compress files that were only used infrequently; a number of schemes are available.
Compression of audio and video files is now common:
Audio - mp3
Photo - jpg
Video - mpeg2 (DVD) and m4v

Data compression
Note that the media compression schemes can be optimised for the data type; this is more difficult if the data format is unknown.
Needless to say, there is a time penalty in accessing compressed data.
Of course, data can be compressed and encrypted at the same time; this is covered later in this lecture.

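The effect of data type on compression can be seen in a minimal sketch using Python's standard zlib module. Note the assumption: zlib is a general-purpose lossless scheme, not one of the media formats named above, and the sample data is invented for illustration. Structured, repetitive data shrinks dramatically, while random bytes barely change, and decompression costs a small amount of time.

```python
import os
import time
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)

# Highly repetitive text compresses well ...
text = b"the quick brown fox jumps over the lazy dog " * 2000
# ... whereas random bytes have no structure for a general scheme to exploit.
noise = os.urandom(len(text))

print(f"repetitive text ratio: {compression_ratio(text):.3f}")
print(f"random bytes ratio:    {compression_ratio(noise):.3f}")

# The time penalty: data must be decompressed before it can be used.
packed = zlib.compress(text)
start = time.perf_counter()
restored = zlib.decompress(packed)
elapsed = time.perf_counter() - start
print(f"decompression took {elapsed * 1e6:.0f} microseconds")

# Lossless compression is fully reversible.
assert restored == text
```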
Databases
We accept that we will be generating huge amounts of data and have the technology to move it and store it.
However, there is no point in any of this if we can't find the data when we need it.
We now need to consider how to store the data in an optimum way for later retrieval and analysis.

What is a database?
A database is an organised collection of data.
The organisation has to be appropriate to the type of data to be stored and the processing to be carried out (Tesco data and Flickr?).
The interaction and interrogation is carried out with a Database Management System (DBMS); common examples are MySQL (free) and Oracle (expensive).

What is a database?
Whenever data has to be stored and interrogated, a DB is usually present; a lot of smartphones use SQLite.
Accessing a database is not the same as simply retrieving a file from a folder: think of the thousands of photos that you have.
The standardised Structured Query Language (SQL) allows different products to inter-operate, and to modify and find data in the database.

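The difference between querying a database and fetching a file can be sketched with Python's built-in sqlite3 module. This is an illustration only: SQLite stands in for whichever DBMS is in use, and the `photos` table, its columns and its rows are hypothetical.

```python
import sqlite3

# In-memory database; the "photos" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos (id INTEGER PRIMARY KEY, filename TEXT, taken TEXT, place TEXT)"
)

rows = [
    ("img_0001.jpg", "2013-07-14", "Paris"),
    ("img_0002.jpg", "2013-07-15", "Paris"),
    ("img_0003.jpg", "2014-01-02", "Leeds"),
]
conn.executemany("INSERT INTO photos (filename, taken, place) VALUES (?, ?, ?)", rows)

# Find data: ask a question about the contents, not about file names.
paris = conn.execute(
    "SELECT filename FROM photos WHERE place = ? ORDER BY taken", ("Paris",)
).fetchall()
print(paris)

# Modify data: correct a record in place.
conn.execute("UPDATE photos SET place = ? WHERE filename = ?", ("York", "img_0003.jpg"))
places = conn.execute("SELECT DISTINCT place FROM photos ORDER BY place").fetchall()
print(places)
conn.close()
```

Much the same SELECT and UPDATE statements would be accepted by other SQL products, which is the point of having a common query language.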
DBMS Functions
Database design
Obviously, a lot of care and thought has to go into the original database design so that it operates efficiently and returns results as quickly as possible.
This is becoming a sophisticated problem with data being increasingly dispersed within the cloud: what to store where, for example?

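One small, concrete design decision is whether to index a column that is searched often. The sketch below, again using SQLite through Python's sqlite3 module as a stand-in DBMS, asks the database how it plans to run the same lookup before and after an index is added. The table, the index name and the generated data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, filename TEXT, place TEXT)")
conn.executemany(
    "INSERT INTO photos (filename, place) VALUES (?, ?)",
    [(f"img_{i:05d}.jpg", f"place_{i % 100}") for i in range(10_000)],
)

def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's description of how it would run the lookup."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT filename FROM photos WHERE place = ?",
        ("place_7",),
    ).fetchall()
    # The last column of each row is the human-readable plan detail.
    return "; ".join(row[-1] for row in rows)

plan_before = query_plan(conn)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_photos_place ON photos (place)")
plan_after = query_plan(conn)    # the index lets SQLite search directly

print("before:", plan_before)
print("after: ", plan_after)
conn.close()
```

The exact wording of the plan varies between SQLite versions, but the second plan should mention the index while the first describes a scan.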
Cloud storage
Most of us are now familiar with using cloud storage through services such as Dropbox and Copy (as well as social media).
Data is now stored on remote servers, although for large companies some may still be local.
The process can be completely transparent to the user.

Cloud storage
Advantages
Pay only for what you need
Good for risk management - e.g. a fire at HQ
No upgrades or updates required
No physical maintenance or infrastructure

Cloud storage
Disadvantages
Attack surface area has increased
Trust the supplier - corporate takeover
Disputes - who owns the data?
Need a network to access - some on-site
Ethical - EU data must be stored in the EU

Data encryption
There are many situations when we wish to store confidential information (passwords) or transmit data securely across public networks.
The same ideas can be used as digital signatures or to ensure that data has not been tampered with.
Of course, codes and encryption go back thousands of years.

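Tamper detection can be sketched with a cryptographic hash from Python's standard hashlib module. The choice of SHA-256, the function name and the sample messages are assumptions made for illustration; the lecture does not name a particular algorithm. A hash on its own only helps if the stored digest is itself kept safe, and a full digital signature additionally involves a private key.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"Pay Bob 100 pounds"
tampered = b"Pay Bob 900 pounds"

# The digest is computed once and kept somewhere trusted.
stored_digest = fingerprint(original)

# An unchanged message reproduces the stored digest; a changed one does not.
print(fingerprint(original) == stored_digest)
print(fingerprint(tampered) == stored_digest)
```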
Data encryption
The basic idea is to apply some process to the message which can be reversed to recover the original message.
Text can be processed with ciphers or look-up tables (single use for better security).
Digital text can be processed mathematically using one or more keys, since the letters of the alphabet can be regarded as numbers.

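A minimal sketch of a reversible process on letters treated as numbers is the classic shift cipher. The key value and the message are assumed for illustration, and a cipher this simple is trivially broken; it only shows the encrypt-then-reverse idea.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' -> 0, 'B' -> 1, ..., 'Z' -> 25

def shift_cipher(message: str, key: int) -> str:
    """Shift each letter by `key` positions, wrapping round the alphabet."""
    out = []
    for ch in message.upper():
        if ch in ALPHABET:
            number = ALPHABET.index(ch)                 # letter -> number
            out.append(ALPHABET[(number + key) % 26])   # number -> letter
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)

KEY = 3  # assumed key, for illustration
secret = shift_cipher("MEET AT NOON", KEY)
print(secret)                      # PHHW DW QRRQ
print(shift_cipher(secret, -KEY))  # MEET AT NOON
```

Applying the same function with the negated key undoes the shift, which is what makes the process reversible for anyone who knows the key.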
[Figure; source: Wikipedia]
Process
Alice and Bob both have public and private keys.
Alice takes a message and applies Bob's public key and her private key.
The message is sent to Bob.
Bob uses his private key and Alice's public key to recover the message.
Interception is no use because a private key is still needed.

Process
This only works with certain mathematical processes and, of course, as mentioned, the numbers are huge to prevent brute force guessing.

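The exchange between Alice and Bob can be sketched with a toy RSA-style scheme. This is an assumption: the lecture does not name a specific scheme, and RSA is used here as one well-known example of such a mathematical process. The primes are deliberately tiny so the numbers can be followed by hand, which also makes these keys trivially breakable; the helper names are hypothetical, and `pow(e, -1, phi)` needs Python 3.8 or later.

```python
from math import gcd

def make_keys(p: int, q: int, e: int):
    """Build a toy RSA key pair from two small primes (insecure, demo only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    assert gcd(e, phi) == 1, "e must be coprime with (p-1)(q-1)"
    d = pow(e, -1, phi)          # private exponent: e*d = 1 (mod phi)
    return (e, n), (d, n)        # (public key, private key)

def apply_key(value: int, key) -> int:
    """Modular exponentiation: the single operation used for every step."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

# Tiny assumed primes; Bob's modulus is chosen larger than Alice's so that
# the value Alice produces with her private key fits inside Bob's key.
alice_public, alice_private = make_keys(61, 53, 17)   # n = 3233
bob_public, bob_private = make_keys(89, 97, 5)        # n = 8633

message = 1234  # a message already encoded as a number below 3233

# Alice: apply her private key, then Bob's public key.
sent = apply_key(apply_key(message, alice_private), bob_public)

# Bob: apply his private key, then Alice's public key.
received = apply_key(apply_key(sent, bob_private), alice_public)

print(message, sent, received)
assert received == message
```

With moduli this small, an eavesdropper could factor n by trial division in a fraction of a second. Real keys use numbers hundreds of digits long so that this kind of brute force guessing is impractical.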