Statistical Models For Secure Steganography Systems: Digital Rights Management Seminar SS 2006
SS 2006
Submitted by,
2. Steganography
3. Information Theory
4. Security Model – Proposed
5. One Time Pad Systems
6. Universal Data Compression
7. Conclusion and Future work
1. Introduction:
Until recently, information hiding techniques received far less attention from
the research community and from industry than cryptography. This situation is
changing rapidly, however; the first academic conference on the topic was organized
in 1996.
The main driving force is concern over protecting copyright; as audio, video and
other works become available in digital form, the ease with which perfect copies can be
made may lead to large-scale unauthorized copying, and this is of great concern to the
music, film, book and software publishing industries. At the same time, moves by various
governments to restrict the availability of encryption services have motivated people to
study methods by which private messages can be embedded in seemingly innocuous
cover messages. There has therefore been significant recent research into ‘watermarking’
(hidden copyright messages) and ‘fingerprinting’ (hidden serial numbers or a set of
characteristics that tend to distinguish an object from other similar objects); the idea is
that the latter can be used to detect copyright violators and the former to prosecute them.
There are, however, many other applications of increasing interest to both the academic
and business communities, including anonymous communications, covert channels in
computer systems, detection of hidden information, and steganography.
2. Steganography
Steganography is the art and science of writing hidden messages in such a way
that no one apart from the intended recipient knows of the existence of the message.
Steganography works by replacing bits of useless or unused data in regular computer files
(such as graphics, sound, text, HTML, or even floppy disks) with bits of different,
invisible information. This hidden information can be plain text, cipher text, or even
images.
3. Information Theory
Our ability to transmit signals at billions of bits per second is due to an inventive
and innovative Bell Labs mathematician, Claude Shannon, whose “A Mathematical Theory
of Communication,” published in 1948 in the Bell System Technical Journal, has
guided communications scientists and engineers in their quest for faster, more efficient,
and more robust communications systems. If we live in an “Information Age,” Shannon
is one of its founders.
Shannon’s ideas, which form the basis of the field of Information Theory, are
yardsticks for measuring the efficiency of communications systems. He identified
the problems that had to be solved to reach what he described as ideal communications
systems, a goal we have yet to reach even as today's commercial gigabit-per-second and
experimental terabit-per-second systems push the practical limits of communications.
Information Theory regards information as only those symbols that are uncertain
to the receiver. For years, people have sent telegraph messages, leaving out non-essential
words such as "a" and "the." In the same vein, predictable symbols can be left out, like in
the sentence, "only infrmatn esentil to understandn mst b tranmitd." Shannon made clear
that uncertainty is the very commodity of communication.
3.1 Entropy
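The entropy of a source S that emits symbol $S_i$ with probability $p_i$ is
$$H(S) = \sum_i p_i \log_2 \frac{1}{p_i}$$
• $\log_2 (1 / p_i)$ indicates the amount of information contained in $S_i$, i.e., the
number of bits needed to code $S_i$.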
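The entropy formula can be evaluated directly for a small distribution. The following is a minimal sketch, assuming base-2 logarithms (so that the result is in bits); the function names `entropy` and `empirical_distribution` are illustrative and do not come from the report.

```python
import math
from collections import Counter

def entropy(probabilities):
    """Shannon entropy in bits: H = sum_i p_i * log2(1 / p_i)."""
    # Symbols with p_i = 0 contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def empirical_distribution(symbols):
    """Relative frequency of each distinct symbol in a sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return [count / total for count in counts.values()]

fair_coin = entropy([0.5, 0.5])        # two equally likely symbols
biased_coin = entropy([0.9, 0.1])      # a more predictable source
text_entropy = entropy(empirical_distribution("aabb"))
```

A predictable source (the biased coin) has lower entropy than an unpredictable one, which matches the view that only uncertain symbols carry information.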
Two prisoners, Alice and Bob, wish to devise an escape plan. Eve, the warden
(the adversary), observes every message exchanged between them. As soon as she detects
any sign of a hidden message, she will take extreme measures and transfer them to a
high-security prison. Alice and Bob succeed if Alice can send information to Bob without
making Eve suspicious.
We now introduce some basic terminology of information hiding. The cover text is
the original, unaltered message or image. Digital images are preferred for information
hiding because they offer more capacity for hiding information securely. The sender,
Alice, hides an embedded message by transforming the cover text using a secret
key. The resulting message is called the stegotext and is sent to the receiver, Bob. As
in cryptography, it is assumed that the adversary Eve has complete information about the
system except for the secret key shared by Alice and Bob, on which the security rests.
There are two hypotheses, H0 and H1, for an observed measurement Q. The task of
hypothesis testing is to decide which of the two hypotheses holds for Q. A decision
rule is a binary partition of the set of possible measurements that assigns one of the
two hypotheses to each possible value q of Q. Two kinds of error are possible. A type
I error occurs when H1 is accepted although H0 is true, and a type II error occurs when
H0 is accepted although H1 is true. The probability of a type I error is denoted by α,
the probability of a type II error by β.
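The two error probabilities can be computed for any decision rule over a finite set of measurements. The sketch below is illustrative only: the distributions `p0` and `p1`, the measurement labels, and the function name `error_probabilities` are assumptions made for the example.

```python
def error_probabilities(p0, p1, accept_h1):
    """Return (alpha, beta) for a decision rule on a finite measurement set.

    p0, p1    : dicts mapping each measurement q to its probability under H0 / H1
    accept_h1 : set of measurements for which the rule decides H1
    """
    # Type I error: H0 is true, but the rule decides H1.
    alpha = sum(p for q, p in p0.items() if q in accept_h1)
    # Type II error: H1 is true, but the rule decides H0.
    beta = sum(p for q, p in p1.items() if q not in accept_h1)
    return alpha, beta

# Hypothetical distributions over three possible measurements.
p0 = {"a": 0.5, "b": 0.25, "c": 0.25}
p1 = {"a": 0.25, "b": 0.25, "c": 0.5}

# Decision rule: decide H1 only when "c" is observed.
alpha, beta = error_probabilities(p0, p1, accept_h1={"c"})
```

Enlarging the set `accept_h1` lowers β at the cost of a higher α, which is the trade-off a decision rule has to balance.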
Fig. I illustrates the stegosystem under discussion. At the sender’s end, Alice
either sends the unaltered cover text C, or uses the key K and a private random source
R to embed the message E and sends the resulting stegotext S. The passive adversary
Eve observes a message that is sent from Alice to Bob. She does not know whether Alice
sends legitimate cover text C or stegotext S containing hidden information for Bob. Alice
is assumed to operate strictly in one of two modes: either she is active (and her output
is S) or inactive (and her output is the cover text C).
Bob must be able to recover E from his knowledge of the stegotext S and from the
key K.
4.3.1 Observations
3. $H(E \mid SK) = 0$. Bob must be able to decode the embedded message uniquely given the
stegotext S and the key K.
4.4 Security Definition
Eve, upon observing the message sent by Alice, has to decide whether Alice is
active or inactive. Since this task is a hypothesis testing problem, the security of a
stegosystem is quantified in terms of the relative entropy between the cover text
distribution $P_C$ and the stegotext distribution $P_S$.
Definition 1: A stegosystem as defined above with cover text C and stegotext S is called
ε-secure against passive adversaries if
$$D(P_C \| P_S) \le \varepsilon.$$
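The security condition can be checked numerically when both distributions are known. The following is a minimal sketch, assuming finite distributions given as dictionaries and base-2 logarithms; the example distributions and the names `relative_entropy` and `is_epsilon_secure` are hypothetical.

```python
import math

def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits for two finite distributions (dicts)."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue  # terms with p(x) = 0 contribute nothing
        qx = q.get(x, 0.0)
        if qx == 0:
            return math.inf  # p puts mass where q has none
        total += px * math.log2(px / qx)
    return total

def is_epsilon_secure(p_cover, p_stego, epsilon):
    """True if the relative entropy between cover text and stegotext is at most epsilon."""
    return relative_entropy(p_cover, p_stego) <= epsilon

# Hypothetical 2-bit cover text and a slightly skewed stegotext distribution.
p_cover = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
p_stego = {"00": 0.30, "01": 0.20, "10": 0.25, "11": 0.25}
d = relative_entropy(p_cover, p_stego)
```

Identical distributions give a relative entropy of zero, which corresponds to the perfectly secure case.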
Eve's decision process for a particular decision rule is given by a binary partition (C0,
C1) of the set C of possible cover texts. Eve decides that Alice is active if and only if
the observed message c is contained in C1. If Eve fails to detect that she observed
stegotext S, she makes a type II error, whose probability is denoted by β.
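In this paper the author assumes that Eve never makes a type I error, that is, she
never accuses Alice of sending hidden information when Alice is inactive (α = 0). In an
ε-secure stegosystem, the error probabilities α and β satisfy
$$d(\alpha, \beta) \le \varepsilon,$$
where $d(\alpha, \beta) = \alpha \log_2 \frac{\alpha}{1-\beta} + (1-\alpha) \log_2 \frac{1-\alpha}{\beta}$
is the binary relative entropy. In particular, if α = 0, then $d(0, \beta) = \log_2 (1/\beta)$
and therefore
$$\beta \ge 2^{-\varepsilon}.$$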
In a perfectly secure system we have $D(P_C \| P_S) = 0$ and therefore $P_C = P_S$; thus,
Eve can obtain no information about whether Alice is active by observing the message.
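The relation between ε and Eve's type II error can be made concrete with a short calculation. The sketch below assumes that d denotes the binary relative entropy, $d(\alpha, \beta) = \alpha \log_2 \frac{\alpha}{1-\beta} + (1-\alpha) \log_2 \frac{1-\alpha}{\beta}$, and that logarithms are base 2; the function names are illustrative.

```python
import math

def binary_relative_entropy(alpha, beta):
    """d(alpha, beta) in bits, with the convention 0 * log(0 / x) = 0."""
    def term(a, b):
        if a == 0:
            return 0.0
        if b == 0:
            return math.inf
        return a * math.log2(a / b)
    return term(alpha, 1.0 - beta) + term(1.0 - alpha, beta)

def min_type_two_error(epsilon):
    """Lower bound on beta when alpha = 0: log2(1 / beta) <= epsilon."""
    return 2.0 ** (-epsilon)

# With alpha = 0, d reduces to log2(1 / beta).
d_half = binary_relative_entropy(0.0, 0.5)

# Smaller epsilon forces Eve's miss probability closer to 1.
bound_loose = min_type_two_error(1.0)
bound_tight = min_type_two_error(0.01)
```

For ε = 0 the bound gives β = 1, so an adversary who never raises a false alarm can never detect an active Alice.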
When external information such as a weather forecast is available, it might
influence the cover text distribution. The modified stegosystem with external
information Y, cover text C, and stegotext S is called ε-secure against passive
adversaries if
$$D(P_{C|Y} \| P_{S|Y}) \le \varepsilon.$$
5. One Time Pad Systems
According to the proposed model, a stegosystem is secure if the distributions of
cover text and stegotext are close for an observer without knowledge of the key. If the
distribution of the cover text is known, corresponding embedding functions can be
designed. Based on this idea, a one-time pad system is perfectly secure provided the
cover text consists of independent and uniformly random bits. Assume the
cover text C is a uniformly distributed n-bit string for some positive n. The key generator
chooses the n-bit key K with uniform distribution and sends it to Alice and Bob. Alice
obtains the stegotext S as the bitwise XOR of the n-bit message e and K, and Bob recovers
e by XORing S with K:
$$S = e \oplus K, \qquad e = S \oplus K.$$
The resulting stegotext S is uniformly distributed over the set of n-bit strings and
therefore $D(P_C \| P_S) = 0$. Thus, the one-time pad provides perfect steganographic
security if the cover text is uniformly random.
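The embedding and extraction steps of the one-time pad amount to a single XOR each. The following is a minimal sketch, assuming the message and key are byte strings of equal length and the key is drawn from Python's `secrets` module; the helper names and the sample message are illustrative.

```python
import secrets

def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("message and key must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def generate_key(n_bytes):
    """Uniformly random key, shared with Alice and Bob in advance."""
    return secrets.token_bytes(n_bytes)

message = b"escape at dawn"                 # embedded message e
key = generate_key(len(message))            # key K, same length as e
stegotext = xor_bytes(message, key)         # Alice: S = e XOR K
recovered = xor_bytes(stegotext, key)       # Bob:   e = S XOR K
```

Because the key is uniformly random and used only once, the stegotext is itself uniformly distributed, whatever the message is.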
5.1 Disadvantage
The output of the one-time pad system is a sequence of random bits. No warden
will allow the prisoners to exchange random-looking, and therefore suspicious,
messages. Hence the one-time pad system, though statistically secure, is practically
unusable in this scenario.
The parameters of the algorithm are the block length L and the delay D (also
regarded as the memory size). Consider a stationary binary source X producing {Xt} =
{x1, x2, …}. The source output is parsed into blocks {Yt} of length L. Each block is
encoded by means of its repetition time, the length of the interval since its last
occurrence. If the block last occurred within the delay (memory) D, that last occurrence
is what gets encoded. If the last occurrence lies outside the memory, the block is
transmitted plain. The encoding is done according to the following relation,
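The notion of repetition time can be illustrated with a short sketch. It does not reproduce the encoding relation of the algorithm; it only parses a bit string into blocks and reports, for each block, the distance to its last occurrence. Measuring the memory in blocks and marking plain blocks with `None` are assumptions made for the example, as are the function names.

```python
def parse_blocks(bits, block_length):
    """Split a bit string into consecutive blocks of length L (a trailing partial block is dropped)."""
    usable = len(bits) - len(bits) % block_length
    return [bits[i:i + block_length] for i in range(0, usable, block_length)]

def repetition_times(blocks, memory):
    """Distance to each block's last occurrence, or None if it lies outside the memory."""
    last_seen = {}
    result = []
    for t, block in enumerate(blocks):
        previous = last_seen.get(block)
        if previous is not None and t - previous <= memory:
            result.append(t - previous)   # block repeats within the memory
        else:
            result.append(None)           # no recent occurrence: transmitted plain
        last_seen[block] = t
    return result

blocks = parse_blocks("0110011110", 2)        # L = 2
times = repetition_times(blocks, memory=3)    # assumed memory of 3 blocks
```

Frequent blocks have short repetition times and rare blocks have long ones, which is the statistic the embedding step relies on.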
Information hiding takes place if the encoder or the decoder encounters a block y
such that $\Delta t_y \ge 1/\rho$. If this is the case, bit j of the message m is embedded
in y′ according to,
In this relation, r(y) is the rank of block y in an ordering based on average repetition
time. If the XOR operation yields 0, the same block y is coded (since $r^{-1}(r(y)) = y$).
If it yields 1, the block of the next rank is coded instead. The decoder computes the
average repetition times in the same way and can thus detect the symbols containing
hidden information and decode E accordingly. Compared to plain data compression, the
storage complexity of the encoding and decoding algorithms increases by a constant
factor, but their computational complexity grows by a factor of about L due to the
maintenance of the ranking.
7. Conclusion and Future work
In this report, we have investigated Cachin's proposal for a secure
stegosystem. As discussed in the sections above, the model is based on information
theory. The stegosystem is assumed to face a passive adversary whose task is to
perform hypothesis testing. Based on the statistical information of the cover text
and the stegotext, the adversary decides whether Alice is active or not. Hence, it is
Alice's job to make sure that these statistics are either undisturbed or deviate by no
more than a specified threshold (ε). Under this condition the stegosystem is ε-secure,
and perfectly secure when the threshold is zero.
In this paper, Cachin does not consider the influence of the embedding
distortion $D_{Emb}$. Later work by Joachim et al. addresses this issue
and tries to enhance the model so as to decrease $D_{Emb}$.
References