
Dps - Final


Message Authentication

• Message authentication is a mechanism or service used to verify the
integrity of a message.

• Message authentication assures that data received are exactly as sent
(i.e., contain no modification, insertion, deletion, or replay) and that the
purported identity of the sender is valid.

• Symmetric encryption provides authentication among those who share
the secret key.

• Encryption of a message by a sender's private key also provides a form of
authentication.

C:\1-SudhaCIT\Security\DPS
3

Message Authentication (Contd…)


• The two most common cryptographic techniques for message authentication are:

• Message Authentication Code (MAC)

• Secure Hash Function

• A MAC takes a variable-length message and a secret key as input and produces an
authentication code.

• A recipient in possession of the secret key can generate an authentication code to
verify the integrity of the message.

• A hash function maps a variable-length message into a fixed length hash value, or
message digest.

Message Authentication Requirements
In the context of communications across a network, the following attacks can
be identified.

1. Disclosure

2. Traffic analysis

3. Masquerade

4. Content modification

5. Sequence modification

6. Timing modification

7. Source repudiation

8. Destination repudiation


Authentication Function

• Any message authentication or digital signature mechanism has
two levels of functionality.

• At the lower level, there must be some sort of function that
produces an authenticator: a value to be used to authenticate a
message.

• This lower-level function is then used as a primitive in a higher-level
authentication protocol that enables a receiver to verify the
authenticity of a message.

Authentication Function
The types of functions that may be used to produce an authenticator may
be grouped into three classes.

1. Message encryption: The ciphertext of the entire message serves as
its authenticator.

2. Message authentication code (MAC): A function of the message and a
secret key that produces a fixed-length value that serves as the
authenticator.

3. Hash function: A function that maps a message of any length into a
fixed-length hash value, which serves as the authenticator.


Message Encryption
[Figure: Basic Uses of Message Encryption]

Message Authentication Code


• An authentication technique involves the use of a secret key to generate a small
fixed-size block of data, known as a cryptographic checksum or MAC, that is
appended to the message.
• This technique assumes that two communicating parties, say A and B, share a
common secret key.
• When A has a message to send to B, it calculates the MAC as a function of the
message and the key:
    MAC = C(K, M)
where M   = input message
      C   = MAC function
      K   = shared secret key
      MAC = message authentication code
• The message plus MAC are transmitted to the intended recipient.
• The recipient performs the same calculation on the received message, using the
same secret key, to generate a new MAC.
• The received MAC is compared to the calculated MAC.
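
The exchange above can be sketched in a few lines. This is only an illustration, assuming HMAC-SHA-256 as the MAC function C (the slides do not name a specific function); the key, the message, and the helper names `compute_mac` and `verify_mac` are hypothetical.

```python
import hashlib
import hmac


def compute_mac(key: bytes, message: bytes) -> bytes:
    """MAC = C(K, M); here C is assumed to be HMAC-SHA-256."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Recompute the MAC over the received message and compare it."""
    expected = compute_mac(key, message)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, received_mac)


shared_key = b"example shared key"        # hypothetical key known to A and B
message = b"transfer 100 to account 42"   # hypothetical message from A

mac = compute_mac(shared_key, message)    # A appends this to the message

print(verify_mac(shared_key, message, mac))                        # unmodified
print(verify_mac(shared_key, b"transfer 900 to account 42", mac))  # altered
```

The first check succeeds and the second fails, because an attacker who changes the message cannot recompute a matching MAC without the key.
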
Message Authentication Code

If we assume that only the receiver and the sender know the identity of the
secret key, and if the received MAC matches the calculated MAC, then

1. The receiver is assured that the message has not been altered.

2. If an attacker alters the message but does not alter the MAC, then the
receiver’s calculation of the MAC will differ from the received MAC.

3. Because the attacker is assumed not to know the secret key, the attacker
cannot alter the MAC to correspond to the alterations in the message.


Message Authentication Code (Contd…)


4. The receiver is assured that the message is from the alleged sender.

5. Because no one else knows the secret key, no one else could prepare a
message with a proper MAC.

6. If the message includes a sequence number, then the receiver can be assured
of the proper sequence because an attacker cannot successfully alter the
sequence number.


Hash Function
• A variation on the message authentication code is the one-way hash function.

• A hash function accepts a variable-size message M as input and produces a fixed-size
output, referred to as a hash code H(M).

• The hash code is also referred to as a message digest or hash value.

• A hash value h is generated by a function H of the form

h = H(M)

where M is a variable-length message

H(M) is the fixed-length hash value

• The hash value is appended to the message at the source at a time when the
message is assumed or known to be correct.

• The receiver authenticates that message by recomputing the hash value.

Hash Function (Contd…)
[Figure: ways in which a hash code can be used to provide message authentication, cases (a) to (f)]
Hash Function (Contd…)
• Fig (a) :

The message plus concatenated hash code is encrypted using symmetric
encryption.

Because only A and B share the secret key, the message must have come from A
and has not been altered.

The hash code provides the structure or redundancy required to achieve
authentication.

Because encryption is applied to the entire message plus hash code,
confidentiality is also provided.


Hash Function (Contd…)


• Fig (b):

Only the hash code is encrypted, using symmetric encryption.

This reduces the processing burden for those applications that do not
require confidentiality.

E(K, H(M)) is a function of a variable-length message M and a secret key
K, and it produces a fixed-size output that is secure against an opponent
who does not know the secret key.

Hash Function (Contd…)

• Fig (c) :

Only the hash code is encrypted, using public-key encryption and
using the sender's private key.

As with (b), this provides authentication. It also provides a digital
signature, because only the sender could have produced the
encrypted hash code.

In fact, this is the essence of the digital signature technique.


Hash Function (Contd…)

• Fig (d):

If confidentiality as well as a digital signature is desired, then the
message plus the private-key encrypted hash code can be encrypted
using a symmetric secret key. This is a common technique.

Hash Function (Contd…)
• Fig (e):

It is possible to use a hash function but no encryption for message
authentication.

The technique assumes that the two communicating parties share a
common secret value S.

A computes the hash value over the concatenation of M and S and appends
the resulting hash value to M.

Because B possesses S, it can recompute the hash value to verify.

Because the secret value itself is not sent, an opponent cannot modify an
intercepted message and cannot generate a false message.

• Fig (f):

Confidentiality can be added to the approach of (e) by encrypting the entire
message plus the hash code.
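
The scheme of Fig (e) can be sketched as follows. This is a minimal illustration, assuming SHA-256 as H and the concatenation order M || S; the secret, the message, and the function names `make_tag` and `check_tag` are made up for the example. In practice a standardized keyed construction such as HMAC is usually preferred over a bare H(M || S).

```python
import hashlib
import hmac


def make_tag(message: bytes, secret: bytes) -> bytes:
    """Hash value over the concatenation of M and S (assumed order: M || S)."""
    return hashlib.sha256(message + secret).digest()


def check_tag(message: bytes, secret: bytes, received_tag: bytes) -> bool:
    """B recomputes the hash value with its own copy of S and compares."""
    return hmac.compare_digest(make_tag(message, secret), received_tag)


S = b"hypothetical shared secret"   # known to A and B, never transmitted
M = b"meet at noon"

tag = make_tag(M, S)                # A sends M together with this tag

print(check_tag(M, S, tag))                    # genuine message
print(check_tag(b"meet at midnight", S, tag))  # modified in transit
```
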


Hash Function (Contd…)


Requirements for a Hash Function

• The purpose of a hash function is to produce a "fingerprint" of a file, message, or other block of
data.

• To be useful for message authentication, a hash function H must have the following properties

– H can be applied to a block of data of any size.

– H produces a fixed-length output.

– H(x) is relatively easy to compute for any given x, making both hardware and software
implementations practical.

– For any given value h, it is computationally infeasible to find x such that H(x) = h. This is
sometimes referred to in the literature as the one-way property.

– For any given block x, it is computationally infeasible to find y ≠ x such that H(y) = H(x). This is
sometimes referred to as weak collision resistance.

– It is computationally infeasible to find any pair (x, y) such that H(x) = H(y). This is sometimes
referred to as strong collision resistance.

Simple Hash Function
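Message: “Hello my name is Alice”

The letters are written in rows of five and coded as numbers (A = 00, ..., Z = 25);
the last row is padded with X, and each column is summed modulo 26.

H E L L O -> 07 04 11 11 14
M Y N A M -> 12 24 13 00 12
E I S A L -> 04 08 18 00 11
I C E X X -> 08 02 04 23 23
Column sums mod 26 -> 05 12 20 08 08
HASH CODE -> F M U I I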
• A sends Message and Hash code to B.

• B computes hash code using the received message and checks it with the received hash
code.

• If both hash codes are the same, integrity and authentication are established and
the message has not been altered; otherwise, the message has been altered.
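
The worked example above can be reproduced with a short sketch. It assumes the rule implied by the table: letters are coded A = 0 to Z = 25, written in rows of five, padded with X, and each column is summed modulo 26. The function name `simple_hash` and its parameters are hypothetical.

```python
def simple_hash(message: str, width: int = 5, pad: str = "X") -> str:
    """Toy column-sum hash: not secure, for illustration only."""
    # Keep only the letters A-Z, ignoring spaces and punctuation
    letters = [c for c in message.upper() if c.isascii() and c.isalpha()]

    # Pad the last row with the padding letter
    while len(letters) % width != 0:
        letters.append(pad)

    # Add each letter's code (A = 0 ... Z = 25) to its column total
    sums = [0] * width
    for i, ch in enumerate(letters):
        sums[i % width] += ord(ch) - ord("A")

    # Reduce each column modulo 26 and map back to a letter
    return "".join(chr(s % 26 + ord("A")) for s in sums)


print(simple_hash("Hello my name is Alice"))  # FMUII
```
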


General structure of Secure Hash Code


• The hash algorithm involves repeated use of a compression function, f, that takes two inputs (an
n-bit input from the previous step, called the chaining variable, and a b-bit block) and produces an
n-bit output.

• At the start of hashing, the chaining variable has an initial value that is specified as part of the
algorithm.

• The final value of the chaining variable is the hash value.
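
The iterated structure can be sketched as a simple loop. The compression function below is only a toy stand-in (a truncated SHA-256 call) chosen so that the example runs; the sizes `N_BYTES` and `B_BYTES`, the all-zero initial value, and the function names are assumptions, and padding of the message into whole blocks is omitted.

```python
import hashlib

N_BYTES = 8    # n = 64-bit chaining variable (toy size, assumed)
B_BYTES = 16   # b = 128-bit message block (toy size, assumed)


def compress(chaining: bytes, block: bytes) -> bytes:
    """Toy stand-in for f: (n-bit chaining value, b-bit block) -> n-bit output."""
    assert len(chaining) == N_BYTES and len(block) == B_BYTES
    return hashlib.sha256(chaining + block).digest()[:N_BYTES]


def iterated_hash(blocks, iv: bytes) -> bytes:
    """Feed each block through f; the final chaining value is the hash."""
    cv = iv
    for block in blocks:
        cv = compress(cv, block)
    return cv


iv = bytes(N_BYTES)                          # assumed initial value (all zeros)
blocks = [b"A" * B_BYTES, b"B" * B_BYTES]    # message already split into blocks
digest = iterated_hash(blocks, iv)
print(digest.hex())
```
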

Secure Hash Algorithm
• SHA was developed by the National Institute of Standards and Technology (NIST) and
published as a federal information processing standard (FIPS 180) in 1993.

• SHA is based on the hash function MD4, and its design closely models MD4.


SHA (Contd…)
• The SHA-512 algorithm takes as input a message with a maximum length of less than 2^128 bits and
produces as output a 512-bit message digest. The input is processed in 1024-bit blocks.
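
These sizes can be checked against an existing implementation. The sketch below uses `hashlib.sha512` from the Python standard library; the input `b"abc"` is an arbitrary example.

```python
import hashlib

h = hashlib.sha512(b"abc")

digest_bits = h.digest_size * 8   # size of the message digest in bits
block_bits = h.block_size * 8     # size of one input block in bits

print(digest_bits)    # 512
print(block_bits)     # 1024
print(h.hexdigest())  # the digest as 128 hexadecimal characters
```
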

SHA (Contd…)
The processing consists of the following steps.
• Step 1 - Append padding bits

• Step 2 - Append length

• Step 3 - Initialize hash buffer

• Step 4 - Process message in 1024-bit blocks (sixteen 64-bit words, i.e., 128 bytes)

• Step 5 – Output – hash code


SHA (Contd…)

Step 1: Append padding bits.

• The message is padded so that its length is congruent to 896 modulo
1024 [length ≡ 896 (mod 1024)].

• Padding is always added, even if the message is already of the desired
length. Thus, the number of padding bits is in the range of 1 to 1024.

• The padding consists of a single 1-bit followed by the necessary number of
0-bits.

SHA (Contd…)

Step 2: Append length. A block of 128 bits is appended to the message.

• This block is treated as an unsigned 128-bit integer (most significant byte first)
and contains the length of the original message (before the padding).

• The outcome of the first two steps yields a message that is an integer multiple of
1024 bits in length. The expanded message is represented as the sequence of
1024-bit blocks M1, M2,..., MN, so that the total length of the expanded message
is N x 1024 bits.


SHA (Contd…)
Step 3: Initialize hash buffer.

• A 512-bit buffer is used to hold intermediate and final results of the hash
function.

• The buffer can be represented as eight 64-bit registers (a, b, c, d, e, f, g, h).

• These registers are initialized to the following 64-bit integers (hexadecimal
values):

a = 6A09E667F3BCC908, b = BB67AE8584CAA73B, c = 3C6EF372FE94F82B

d = A54FF53A5F1D36F1, e = 510E527FADE682D1, f = 9B05688C2B3E6C1F

g = 1F83D9ABFB41BD6B, h = 5BE0CD19137E2179

SHA (Contd…)
Step 4: Process message in 1024-bit blocks (sixteen 64-bit words, i.e., 128 bytes).

• The heart of the algorithm is a module that consists of 80 rounds; this module
is labeled F.

• Each round takes as input the 512-bit buffer value abcdefgh, and updates the
contents of the buffer.

• At input to the first round, the buffer has the value of the intermediate hash
value, H_{i-1}.

• Each round t makes use of a 64-bit value W_t derived from the current 1024-bit
block being processed (M_i).


SHA (Contd…)
Step 5: Output.

• After all N 1024-bit blocks have been processed, the output from the Nth
stage is the 512-bit message digest.

SHA (Contd…)
Example :
Step 1 and 2: Pre-Processing Messages in SHA-512
• We denote the message by M.
• Suppose the length of this message, in bits, is l.
• The final message length after pre-processing should be a
multiple of 1024 bits.
• The Pre-Processing Steps are as follows:
– Append the bit "1" to the end of the message.
– Now, append k "0" bits to the end of the message, where k is
the smallest non-negative solution to l + 1 + k ≡ 896 (mod
1024).
– After this, express the message length l in binary in a 128-bit
block and append this block at the end.
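
The pre-processing steps can be sketched for byte-oriented messages as follows. The function name `sha512_pad` is hypothetical, and the sketch assumes the message length is a whole number of bytes, so the "1" bit and the first seven "0" bits together form the byte 0x80.

```python
def sha512_pad(message: bytes) -> bytes:
    """Return the message after SHA-512 pre-processing (padding plus length)."""
    l = len(message) * 8                 # message length in bits

    # Smallest non-negative k with l + 1 + k = 896 (mod 1024)
    k = (896 - (l + 1)) % 1024

    # One "1" bit followed by k "0" bits; 1 + k is a multiple of 8 here
    padding = b"\x80" + b"\x00" * ((k + 1) // 8 - 1)

    # Original length as an unsigned 128-bit integer, most significant byte first
    length_block = l.to_bytes(16, "big")

    return message + padding + length_block


padded = sha512_pad(b"abc")
print(len(padded) * 8)                     # 1024: one block
print(len(sha512_pad(b"a" * 112)) * 8)     # 2048: padding is always added
```

A 112-byte message is already 896 bits long, so a full 1024 bits of padding is added before the length block, which gives two blocks.
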


SHA (Contd…)
[Figure: Creation of 80-word Input Sequence for SHA-512 Processing of Single Block]

SHA (Contd…)
SHA-512 Round Function
[Figure: SHA-512 round function and the set of equations that define each round]
Pretty Good Privacy (PGP)
• Pretty Good Privacy or PGP encryption, is a data encryption program that
gives cryptographic privacy and authentication for online
communication.

• It is often used to encrypt and decrypt texts, emails, and files to increase
the security of emails.

• PGP encryption uses a mix of data compression, hashing, and public-key
cryptography.

• It also uses symmetric and asymmetric keys to encrypt data that is
transferred across networks.


Pretty Good Privacy (PGP)


• PGP stands for Pretty Good Privacy; it was invented by Phil Zimmermann.
• PGP was designed to provide all four aspects of security, i.e., privacy,
integrity, authentication, and non-repudiation in the sending of email.
• PGP uses a digital signature (a combination of hashing and public key
encryption) to provide integrity, authentication, and non-repudiation.
• PGP uses a combination of secret key encryption and public key
encryption to provide privacy.
• Overall, PGP uses one hash function, one secret key, and two
private-public key pairs.
• PGP is an open source and freely available software package for email
security.
• PGP provides authentication through the use of Digital Signature.
• It provides confidentiality through the use of symmetric block encryption.
• It provides compression by using the ZIP algorithm, and EMAIL
compatibility using the radix-64 encoding scheme.

PGP (Contd…)
The steps taken by PGP to create secure e-mail at the sender site:

• The e-mail message is hashed by using a hashing function to create a
digest.

• The digest is then encrypted to form a signed digest by using the
sender's private key, and then the signed digest is added to the original
e-mail message.

• The original message and signed digest are encrypted by using a one-time
secret key created by the sender.

• The secret key is encrypted by using the receiver's public key.

• Both the encrypted secret key and the encrypted combination of
message and digest are sent together.

PGP (Contd…)
The steps taken at the receiver site, showing how PGP uses hashing and a
combination of three keys to recover the original message:

• The receiver receives the combination of the encrypted secret key and the
encrypted message plus digest.

• The encrypted secret key is decrypted by using the receiver's private key
to get the one-time secret key.

• The secret key is then used to decrypt the combination of message and
digest.

• The digest is decrypted by using the sender's public key, and the original
message is hashed by using a hash function to create a digest.

• The two digests are compared; if they are equal, all the aspects of
security are preserved.
Pretty Good Privacy (PGP)
• It provides

– authentication through the use of digital signature;

– confidentiality through the use of symmetric block encryption;

– compression using the ZIP algorithm;

– e-mail compatibility using the radix-64 encoding scheme; and

– segmentation and reassembly to accommodate long e-mails.

• S/MIME (Secure/Multipurpose Internet Mail Extensions) is an Internet
standard approach to e-mail security that incorporates the same
functionality as PGP.


PGP (Contd…)
• Provides a confidentiality and authentication service that can be used for
electronic mail and file storage applications

• Developed by Phil Zimmermann

– Selected the best available cryptographic algorithms as building blocks

– Integrated these algorithms into a general-purpose application that is
independent of operating system and processor and that is based on a
small set of easy-to-use commands

– Made the package and its documentation, including the source code,
freely available via the Internet, bulletin boards, and commercial
networks

– Entered into an agreement with a company to provide a fully
compatible, low-cost commercial version of PGP
PGP (Contd…)
PGP has grown explosively and is now widely used.
1. It is available free worldwide in versions that run on a variety of
platforms, including Windows, UNIX, Macintosh, and many more.
2. It is based on algorithms that have survived extensive public review and
are considered extremely secure. Specifically, the package includes RSA,
DSS, and Diffie-Hellman for public-key encryption; CAST-128, IDEA, and
3DES for symmetric encryption; and SHA-1 for hash coding.
3. It has a wide range of applicability, from corporations that wish to select
and enforce a standardized scheme for encrypting files and messages to
individuals who wish to communicate securely with others worldwide
over the Internet and other networks.
4. It was not developed by, nor is it controlled by, any governmental or
standards organization.
5. PGP is now on an Internet standards track (RFC 3156).


PGP (Contd…)
• The actual operation of PGP consists of five services: authentication,
confidentiality, compression, e-mail compatibility, and segmentation

[Figure: PGP Cryptographic Functions]

[Figure: PGP Message]

PGP (Contd…)
Radix 64 conversion

• Many electronic mail systems can only transmit blocks of ASCII text.

• This can cause a problem when sending encrypted data since ciphertext blocks might
not correspond to ASCII characters which can be transmitted.

• PGP overcomes this problem by using radix-64 conversion.

PGP E-Mail Compatibility: Example

• Suppose the email message is: new

ASCII format: 01101110 01100101 01110111

After encryption: 10010001 10011010 10001000

The problem after encryption:

– the three bytes do not represent any keyboard ASCII characters.

– Most email systems cannot transmit and process such a piece of ciphertext.


PGP (Contd…)
Suppose the text to be encrypted has been converted into binary using ASCII coding
and encrypted to give a ciphertext stream of binary.
Radix-64 conversion maps arbitrary binary into printable characters as follows:
1. The binary input is split into blocks of 24 bits (3 bytes).
2. Each 24-bit block is then split into four sets, each of 6 bits.
3. Each 6-bit set will then have a value between 0 and 2^6 - 1 (= 63).
4. This value is encoded into a printable character.

Suppose the email message is: new


• ASCII format: 01101110 01100101 01110111
• After encryption: 10010001 10011010 10001000
• The Radix-64 conversion:
• The 24-bit block: 10010001 10011010 10001000
• Four 6-bit blocks: 100100 011001 101010 001000
• Integer version: 36 25 42 8
• Printable version: k Z q I
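
The conversion can be recomputed step by step and cross-checked against the standard library. The sketch below assumes the usual radix-64 (Base64) alphabet of A-Z, a-z, 0-9, "+", and "/"; the variable names are arbitrary.

```python
import base64
import string

# The three ciphertext bytes from the example
cipher = bytes([0b10010001, 0b10011010, 0b10001000])

# Step 1: one 24-bit block, written out as a bit string
bits = "".join(f"{byte:08b}" for byte in cipher)

# Step 2: split into four 6-bit groups
groups = [bits[i:i + 6] for i in range(0, 24, 6)]

# Step 3: each group is an integer between 0 and 63
values = [int(group, 2) for group in groups]

# Step 4: map each value to a printable character
alphabet = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
manual = "".join(alphabet[v] for v in values)

print(groups)   # ['100100', '011001', '101010', '001000']
print(values)   # [36, 25, 42, 8]
print(manual)   # kZqI
print(base64.b64encode(cipher).decode("ascii"))  # same result from the library
```
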

IP Security
• Internet Protocol Security (IPsec) is a secure network protocol
suite that authenticates and encrypts the packets of data to provide
secure encrypted communication between two computers over a
network.

• IPsec includes protocols for establishing mutual authentication between
agents at the beginning of a session and negotiation of cryptographic
keys to use during the session.

• IPsec can protect data flows between a pair of hosts (host-to-host),
between a pair of security gateways (network-to-network), or between a
security gateway and a host (network-to-host).

• It supports network-level peer authentication, data-origin authentication,
data integrity, data confidentiality (encryption), and replay protection.


IPSec(Contd…)
• IP security (IPSec) is a capability that can be added to either current version of
the Internet Protocol (IPv4 or IPv6), by means of additional headers.

• The Internet community has developed application-specific security
mechanisms in a number of application areas, including electronic mail
(S/MIME, PGP), client/server (Kerberos), Web access (Secure Sockets Layer),
and others.

• However, users have some security concerns that cut across protocol layers.

• For example, an enterprise can run a secure, private TCP/IP network by
disallowing links to untrusted sites, encrypting packets that leave the premises,
and authenticating packets that enter the premises.

• By implementing security at the IP level, an organization can ensure secure
networking not only for applications that have security mechanisms but also
for the many security-ignorant applications.
IPSec(Contd…)
• IP-level security encompasses three functional areas: authentication,
confidentiality, and key management.

• The authentication mechanism assures that a received packet was, in
fact, transmitted by the party identified as the source in the packet
header.

• In addition, this mechanism assures that the packet has not been
altered in transit.

• The confidentiality facility enables communicating nodes to encrypt
messages to prevent eavesdropping by third parties.

• The key management facility is concerned with the secure exchange of
keys.


IPSec(Contd…)
Applications of IPSec

• Secure branch office connectivity over the Internet

– A company can build a secure virtual private network over the Internet or over a
public WAN. This enables a business to rely heavily on the Internet and reduce
its need for private networks, saving costs and network management overhead.
• Secure remote access over the Internet
– An end user whose system is equipped with IP security protocols can make a
local call to an Internet service provider (ISP) and gain secure access to a
company network. This reduces the cost of toll charges for traveling employees
and telecommuters.

• Establishing extranet and intranet connectivity with partners

– IPSec can be used to secure communication with other organizations, ensuring
authentication and confidentiality and providing a key exchange mechanism.

• Enhancing electronic commerce security

– Even though some Web and electronic commerce applications have built-in
security protocols, the use of IPSec enhances that security.
IPSec(Contd…)
• The principal feature of IPSec that enables it to support these varied applications is that it can
encrypt and/or authenticate all traffic at the IP level.

• Thus, all distributed applications, including remote logon, client/server, e-mail, file transfer,
Web access, and so on, can be secured.

[Figure: An IP Security scenario]

IPSec(Contd…)
The above figure is a typical scenario of IPSec usage.

• An organization maintains LANs at dispersed locations.

• IPSec protocols are used, and these protocols operate in networking
devices, such as a router or firewall, that connect each LAN to the
outside world.

• The IPSec networking device will typically encrypt and compress all
traffic going into the WAN, and decrypt and decompress traffic coming
from the WAN.

IPSec(Contd…)
Benefits of IPSec
• When IPSec is implemented in a firewall or router, it provides strong
security that can be applied to all traffic crossing the perimeter.
• IPSec in a firewall is resistant to bypass if all traffic from the outside must
use IP, and the firewall is the only means of entrance from the Internet
into the organization.
• IPSec is below the transport layer (TCP, UDP) and so is transparent to
applications. There is no need to change software on a user or server
system when IPSec is implemented in the firewall or router.
• IPSec can be transparent to end users. There is no need to train users on
security mechanisms, issue keying material on a per-user basis, or revoke
keying material when users leave the organization.
• IPSec can provide security for individual users if needed.
• This is useful for offsite workers and for setting up a secure virtual
subnetwork within an organization for sensitive applications.


IPSec Architecture
IPSec Documents

• The IPSec specification consists of numerous documents. The most important of
these, issued in November of 1998, are Request for Comments: RFCs 2401, 2402,
2406, and 2408:

● RFC 2401: An overview of a security architecture

● RFC 2402: Description of a packet authentication extension to IPv4 and IPv6

● RFC 2406: Description of a packet encryption extension to IPv4 and IPv6

● RFC 2408: Specification of key management capabilities

• Support for these features is mandatory for IPv6 and optional for IPv4.

IPSec Architecture (Contd…)
• In addition to these four RFCs, a number of additional drafts have been
published by the IP Security Protocol Working Group set up by the Internet
Engineering Task Force (IETF).
• The documents are divided into seven groups, as depicted in the Figure.

[Figure: IPSec Document Overview]

IPSec Architecture (Contd…)


• Architecture: Covers the general concepts, security requirements,
definitions, and mechanisms defining IPSec technology.
• Encapsulating Security Payload (ESP): Covers the packet format and
general issues related to the use of the ESP for packet encryption and,
optionally, authentication.
• Authentication Header (AH): Covers the packet format and general issues
related to the use of AH for packet authentication.
• Encryption Algorithm: A set of documents that describe how various
encryption algorithms are used for ESP.
• Authentication Algorithm: A set of documents that describe how various
authentication algorithms are used for AH and for the authentication
option of ESP.
• Key Management: Documents that describe key management schemes.
• Domain of Interpretation (DOI): Contains values needed for the other
documents to relate to each other.
• These include identifiers for approved encryption and authentication
algorithms, as well as operational parameters such as key lifetime.
IPSec Architecture (Contd…)

IPSec Services

• Access control

• Integrity

• Data origin authentication

• Rejection of replayed packets

• Confidentiality


IPSec Architecture (Contd…)


Security Associations

• A key concept that appears in both the authentication and confidentiality
mechanisms for IP is the security association (SA).

• An association is a one-way relationship between a sender and a receiver
that affords security services to the traffic carried on it.

• If a peer relationship is needed, for two-way secure exchange, then two
security associations are required.

• Security services are afforded to an SA for the use of AH or ESP, but not
both.

IPSec Architecture (Contd…)
A security association is uniquely identified by three parameters:

• Security Parameters Index (SPI): A bit string assigned to this SA and having
local significance only. The SPI is carried in AH and ESP headers to enable the
receiving system to select the SA under which a received packet will be
processed.

• IP Destination Address: This is the address of the destination endpoint of
the SA, which may be an end user system or a network system such as a
firewall or router.

• Security Protocol Identifier: This indicates whether the association is an AH
or ESP security association.
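
The three identifying parameters map naturally onto a lookup key. The sketch below is only an illustration of that idea, not an IPSec implementation; the class names, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (SPI, IP destination address, security protocol identifier)
SAKey = Tuple[int, str, str]


@dataclass
class SecurityAssociation:
    spi: int            # Security Parameters Index carried in the AH/ESP header
    destination: str    # IP address of the SA's destination endpoint
    protocol: str       # "AH" or "ESP"


class SADatabase:
    """Toy table of security associations, keyed by the three parameters."""

    def __init__(self) -> None:
        self._table: Dict[SAKey, SecurityAssociation] = {}

    def add(self, sa: SecurityAssociation) -> None:
        self._table[(sa.spi, sa.destination, sa.protocol)] = sa

    def lookup(self, spi: int, destination: str,
               protocol: str) -> Optional[SecurityAssociation]:
        """Select the SA under which a received packet will be processed."""
        return self._table.get((spi, destination, protocol))


sad = SADatabase()
sad.add(SecurityAssociation(spi=0x1001, destination="192.0.2.10", protocol="ESP"))

print(sad.lookup(0x1001, "192.0.2.10", "ESP"))  # the matching SA
print(sad.lookup(0x1001, "192.0.2.10", "AH"))   # None: different protocol
```

Because an SA is one-way, a two-way exchange would need a second entry whose destination is the other endpoint.
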


IPSec Architecture (Contd…)


SA Parameters

The Security Association Database defines the parameters associated with each
SA.

• Sequence Number Counter: A 32-bit value used to generate the Sequence
Number field in AH or ESP headers.

• Sequence Counter Overflow: A flag indicating whether overflow of the
Sequence Number Counter should generate an auditable event and prevent
further transmission of packets on this SA.

• Anti-Replay Window: Used to determine whether an inbound AH or ESP
packet is a replay.

• AH Information: Authentication algorithm, keys, key lifetimes, and related
parameters being used with AH.

IPSec Architecture (Contd…)
SA Parameters

• ESP Information: Encryption and authentication algorithm, keys,
initialization values, key lifetimes, and related parameters being used with
ESP.

• Lifetime of This Security Association: A time interval or byte count after
which an SA must be replaced with a new SA (and new SPI) or terminated,
plus an indication of which of these actions should occur.

• IPSec Protocol Mode: Tunnel or Transport mode.

• Path MTU: Any observed path maximum transmission unit (maximum size
of a packet that can be transmitted without fragmentation) and aging
variables.


IPSec Architecture (Contd…)


SA Selectors

• Each Security Policy Database (SPD) entry is defined by a set of IP and
upper-layer protocol field values, called selectors.

• In effect, these selectors are used to filter outgoing traffic in order to map
it into a particular SA. Outbound processing obeys the following general
sequence for each IP packet:

– Compare the values of the appropriate fields in the packet (the
selector fields) against the SPD to find a matching SPD entry, which will
point to zero or more SAs.

– Determine the SA, if any, for this packet and its associated SPI.

– Do the required IPSec processing (i.e., AH or ESP processing).
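
The outbound sequence can be sketched as a first-match search over policy entries. This is a simplified illustration: each entry here uses only three selectors and points to at most one SA (identified by its SPI), and the class names, field names, addresses, and SPI values are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    src: str
    dst: str
    protocol: str      # transport layer protocol, e.g. "TCP" or "UDP"
    dst_port: int


@dataclass
class SPDEntry:
    dst_network: str           # destination selector in address/mask form
    protocol: Optional[str]    # None acts as a wildcard
    dst_port: Optional[int]    # None acts as a wildcard
    sa_spi: Optional[int]      # SPI of the SA to apply; None means no SA

    def matches(self, packet: Packet) -> bool:
        """Compare the packet's selector fields against this entry."""
        network = ipaddress.ip_network(self.dst_network)
        return (ipaddress.ip_address(packet.dst) in network
                and self.protocol in (None, packet.protocol)
                and self.dst_port in (None, packet.dst_port))


def select_sa(spd: List[SPDEntry], packet: Packet) -> Optional[int]:
    """Return the SPI of the SA for this packet, or None if there is none."""
    for entry in spd:
        if entry.matches(packet):
            return entry.sa_spi
    return None


spd = [
    SPDEntry("10.0.2.0/24", "TCP", 443, sa_spi=0x1001),
    SPDEntry("10.0.0.0/8", None, None, sa_spi=0x2002),
]

print(select_sa(spd, Packet("192.0.2.1", "10.0.2.7", "TCP", 443)))    # 4097
print(select_sa(spd, Packet("192.0.2.1", "10.9.9.9", "UDP", 53)))     # 8194
print(select_sa(spd, Packet("192.0.2.1", "198.51.100.5", "TCP", 80))) # None
```

Once an SPI is selected, the AH or ESP processing defined by that SA would be applied to the packet; that step is outside this sketch.
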

IPSec Architecture (Contd…)
The following selectors determine an SPD entry:
• Destination IP Address: This may be a single IP address, an enumerated list
or range of addresses, or a wildcard (mask) address. The latter two are
required to support more than one destination system sharing the same SA
(e.g., behind a firewall).
• Source IP Address: This may be a single IP address, an enumerated list or
range of addresses, or a wildcard (mask) address. The latter two are required
to support more than one source system sharing the same SA (e.g., behind a
firewall).
• UserID: A user identifier from the operating system. This is not a field in the
IP or upper-layer headers but is available if IPSec is running on the same
operating system as the user.
• Data Sensitivity Level: Used for systems providing information flow security
(e.g., Secret or Unclassified).
• Transport Layer Protocol: Obtained from the IPv4 Protocol or IPv6 Next
Header field. This may be an individual protocol number, a list of protocol
numbers, or a range of protocol numbers.
• Source and Destination Ports: These may be individual TCP or UDP port
values, an enumerated list of ports, or a wildcard port.
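
The outbound SPD lookup can be sketched in a few lines of Python. This is only an illustration, not an IPSec implementation: the names SpdEntry and find_spd_entry, the example addresses, and the SPI values are hypothetical, and only three selectors (destination address, transport protocol, destination port) are modeled.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpdEntry:
    """One hypothetical SPD entry: selectors plus the SPIs of the SAs it points to."""
    dest_network: str              # single address or range, e.g. "10.1.2.0/24"
    protocol: Optional[int]        # IP protocol number, or None as a wildcard
    dest_port: Optional[int]       # port number, or None as a wildcard
    sa_spis: List[int] = field(default_factory=list)  # zero or more SAs


def find_spd_entry(spd, dest_ip, protocol, dest_port):
    """Return the first SPD entry whose selectors match the packet fields, else None."""
    address = ipaddress.ip_address(dest_ip)
    for entry in spd:
        if address not in ipaddress.ip_network(entry.dest_network):
            continue
        if entry.protocol is not None and entry.protocol != protocol:
            continue
        if entry.dest_port is not None and entry.dest_port != dest_port:
            continue
        return entry
    return None


# Hypothetical policy: a specific HTTPS rule first, then a catch-all for the site.
spd = [
    SpdEntry("10.1.2.0/24", protocol=6, dest_port=443, sa_spis=[0x1001]),
    SpdEntry("10.1.0.0/16", protocol=None, dest_port=None, sa_spis=[0x2002]),
]
match = find_spd_entry(spd, "10.1.2.7", protocol=6, dest_port=443)
print(match.sa_spis if match else "no matching SPD entry")
```

Entries are tried in order, so the more specific rule is listed first; a real SPD also carries the action to take (protect, bypass, or discard), which is left out here.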

IPSec Architecture (Contd…)


Transport and Tunnel Modes
• Both AH and ESP support two modes of use: transport and tunnel mode.
Transport Mode
• Transport mode provides protection primarily for upper-layer protocols. That
is, transport mode protection extends to the payload of an IP packet.
• Examples include a TCP or UDP segment or an ICMP packet, all of which
operate directly above IP in a host protocol stack.
• Typically, transport mode is used for end-to-end communication between
two hosts (e.g., a client and a server, or two workstations).
• When a host runs AH or ESP over IPv4, the payload is the data that normally
follow the IP header.
• ESP in transport mode encrypts and optionally authenticates the IP payload
but not the IP header.
• AH in transport mode authenticates the IP payload and selected portions of
the IP header.
IPSec Architecture (Contd…)
Tunnel Mode

• Tunnel mode provides protection to the entire IP packet.

• To achieve this, after the AH or ESP fields are added to the IP packet, the
entire packet plus security fields is treated as the payload of a new "outer" IP
packet with a new outer IP header.

• The entire original, or inner, packet travels through a "tunnel" from one
point of an IP network to another; no routers along the way are able to
examine the inner IP header.

• Because the original packet is encapsulated, the new, larger packet may have
totally different source and destination addresses, adding to the security.

• Tunnel mode is used when one or both ends of an SA are a security gateway,
such as a firewall or router that implements IPSec.

IPSec Architecture (Contd…)


Tunnel Mode : Here is an example of how tunnel mode IPSec operates.
• Host A on a network generates an IP packet with the destination address of host B on
another network.
• This packet is routed from the originating host to a firewall or secure router at the
boundary of A's network.
• The firewall filters all outgoing packets to determine the need for IPSec processing.
• If this packet from A to B requires IPSec, the firewall performs IPSec processing and
encapsulates the packet with an outer IP header.
• The source IP address of this outer IP packet is this firewall, and the destination
address may be a firewall that forms the boundary to B's local network.
• This packet is now routed to B's firewall, with intermediate routers examining only
the outer IP header.
• At B's firewall, the outer IP header is stripped off, and the inner packet is delivered to
B.

• ESP in tunnel mode encrypts and optionally authenticates the entire inner IP packet,
including the inner IP header.
• AH in tunnel mode authenticates the entire inner IP packet and selected portions of
the outer IP header.
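
The encapsulation step in the tunnel mode example can be sketched as follows. The Packet class, the function names, and all addresses are hypothetical, and no real AH or ESP processing is performed; the sketch only shows that intermediate routers see the outer header while the inner packet travels as payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source: str
    destination: str
    payload: object   # upper-layer data, or an encapsulated Packet in tunnel mode


def tunnel_encapsulate(inner, gateway_src, gateway_dst):
    """Wrap the whole inner packet as the payload of a new outer packet.

    A real gateway would also add AH or ESP fields and encrypt or authenticate
    the inner packet; that step is omitted here.
    """
    return Packet(source=gateway_src, destination=gateway_dst, payload=inner)


def tunnel_decapsulate(outer):
    """Strip the outer header at the far gateway and recover the inner packet."""
    return outer.payload


inner = Packet("10.0.1.5", "10.0.2.9", "TCP segment from A to B")
outer = tunnel_encapsulate(inner, "198.51.100.1", "203.0.113.1")
print(outer.source, "->", outer.destination)      # what intermediate routers see
print(tunnel_decapsulate(outer).destination)      # where B's firewall delivers the packet
```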
3.1. Secure Programs
• Security implies some degree of trust that the program enforces
expected confidentiality, integrity, and availability.
• An assessment of security can also be influenced by someone's general
perspective on software quality.
• E.g., if your manager's idea of quality is conformance to
specifications, then she might consider the code secure if it meets
security requirements, whether or not the requirements are
complete or correct.

• IEEE Terminology for Quality
– A bug can be a mistake in interpreting a requirement, a syntax error
in a piece of code, or the (as-yet-unknown) cause of a system
crash.
– When a human makes a mistake, called an error, in performing
some software activity, the error may lead to a fault, or an incorrect
step, command, process, or data definition in a computer program.
– A failure is a departure from the system's required behavior.
– A fault is an inside view of the system, as seen by the eyes of the
developers, whereas a failure is an outside view: a problem that the
user sees.
• Fixing Faults
– A module in which 100 faults were discovered and fixed is better
than another in which only 20 faults were discovered and fixed,
suggesting that more rigorous analysis and testing had led to the
finding of the larger number of faults (?)
– Early work in computer security was based on the paradigm of
"penetrate and patch," in which analysts searched for and repaired
faults.

• Fixing Faults (Cont’d)
– However, the patch efforts were largely useless, making the system
less secure rather than more secure because they frequently
introduced new faults.
   – The pressure to repair a specific problem encouraged a narrow focus on the
   fault itself and not on its context.
   – The fault often had nonobvious side effects in places other than the
   immediate area of the fault.
   – Fixing one problem often caused a failure somewhere else.
   – The fault could not be fixed properly because system functionality or
   performance would suffer as a consequence.

• Unexpected Behavior
– To understand program security, we can examine programs to see
whether they behave as their designers intended or users expected.
– We call such unexpected behavior a program security flaw; it is
inappropriate program behavior caused by a program vulnerability.
– Program security flaws can derive from any kind of software fault.
– Divide program flaws into two separate logical categories:
inadvertent human errors versus malicious, intentionally
induced flaws.

• Regrettably, we do not have techniques to eliminate or address all
program security flaws.
• Security is fundamentally hard, security often conflicts with usefulness
and performance, there is no "silver bullet" to achieve security
effortlessly, and false security solutions impede real progress toward
more secure programming.
• There are two reasons for this distressing situation.
1. Program controls apply at the level of the individual program and
programmer.
2. Programming and software engineering techniques change and
evolve far more rapidly than do computer security techniques.

• Types of Flaws
– validation error (incomplete or inconsistent): permission checks
– domain error: controlled access to data
– serialization and aliasing: program flow order
– inadequate identification and authentication: basis for
authorization
– boundary condition violation: failure on first or last case
– other exploitable logic errors

3.2. Nonmalicious Program Errors
• Buffer Overflows
– A buffer (or array or string) is a space in which data can be held.
– A buffer's capacity is finite.

• Suppose a C language program contains the declaration:

char sample[10];

• Now we execute the statement:

sample[10] = 'B';

The subscript is out of bounds, because the valid subscripts of sample run
from 0 to 9.

• However, if the statement were

sample[i] = 'B';

we could not identify the problem until i was set during execution to a
too-big subscript.
• Suppose each of the ten elements of the array sample is filled with the
letter A and the erroneous reference uses the letter B, as follows:

for (i=0; i<=9; i++)
    sample[i] = 'A';
sample[10] = 'B';
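
C performs no run-time subscript check, so the write to sample[10] lands past the end of the array. The following Python sketch models the check that would have to be made before each write; the FixedBuffer class is a hypothetical illustration and not a C facility.

```python
class FixedBuffer:
    """A buffer of fixed capacity that refuses out-of-bounds writes."""

    def __init__(self, capacity):
        self._cells = [" "] * capacity

    def store(self, index, value):
        # Valid subscripts run from 0 to capacity - 1, exactly as in C.
        if not 0 <= index < len(self._cells):
            raise IndexError(f"subscript {index} outside 0..{len(self._cells) - 1}")
        self._cells[index] = value

    def contents(self):
        return "".join(self._cells)


sample = FixedBuffer(10)
for i in range(10):          # i = 0 .. 9, as in the C loop
    sample.store(i, "A")
try:
    sample.store(10, "B")    # the erroneous reference
except IndexError as error:
    print("rejected:", error)
print(sample.contents())
```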


• Security Implication
– Two buffer overflow attacks that are used frequently
1. The attacker may replace code in the system space. By replacing a few
instructions right after returning from his or her own procedure, the
attacker regains control from the operating system, possibly with raised
privileges.
2. On the other hand, the attacker may make use of the stack pointer or the
return register. Subprocedure calls are handled with a stack, a data
structure in which the most recent item inserted is the next one removed
(last arrived, first served).
• An alternative style of buffer overflow occurs when parameter values
are passed into a routine, especially when the parameters are passed
to a web server on the Internet. Parameters are passed in the URL line,
with a syntax similar to

http://www.somesite.com/subpage/userinput.asp?parm1=(808)555-1212&parm2=2009Jan17

The attacker might question what the server would do with a really
long telephone number, say, one with 500 or 1000 digits.

• Incomplete Mediation
– Consider the example

http://www.somesite.com/subpage/userinput.asp?parm1=(808)555-1212&parm2=2009Jan17

– What would happen if parm2 were submitted as 1800Jan01? Or
1800Feb30? Or 2048Min32? Or 1Aardvark2Many?
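
Complete mediation means the server checks every value itself before using it. The sketch below validates a parm2-style date on the server side; the function name parse_parm2 and the accepted date range are assumptions made for illustration, not part of the original example.

```python
import re
from datetime import date

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}
PATTERN = re.compile(r"(\d{4})([A-Za-z]{3})(\d{2})")   # e.g. 2009Jan17


def parse_parm2(value, earliest=date(1995, 1, 1), latest=date(2035, 12, 31)):
    """Return a date if value is well formed and plausible, otherwise None."""
    match = PATTERN.fullmatch(value)
    if match is None:
        return None                      # wrong shape, e.g. 1Aardvark2Many
    year, month_name, day = match.groups()
    if month_name not in MONTHS:
        return None                      # no such month, e.g. 2048Min32
    try:
        parsed = date(int(year), MONTHS[month_name], int(day))
    except ValueError:
        return None                      # no such day, e.g. 1800Feb30
    if not earliest <= parsed <= latest:
        return None                      # real date but implausible, e.g. 1800Jan01
    return parsed


for candidate in ("2009Jan17", "1800Jan01", "1800Feb30", "2048Min32", "1Aardvark2Many"):
    print(candidate, "->", parse_parm2(candidate))
```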

• Security Implication
– Consider this example, in which the unit price, the shipping cost, and the
total are all carried in the URL, where the user can edit them:

http://www.things.com/order.asp?custID=101&part=555A&qy=20&price=10&ship=boat&shipcost=5&total=205

– A malicious attacker may decide to exploit this peculiarity by
supplying instead the following URL, where the total has been
reduced from $205 to $25 (and the unit price from $10 to $1):

http://www.things.com/order.asp?custID=101&part=555A&qy=20&price=1&ship=boat&shipcost=5&total=25
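
One defense is for the server to ignore the client-supplied price and total and to recompute them from its own records. The sketch below assumes hypothetical server-side tables PRICE_LIST and SHIPPING_COST and a hypothetical quantity limit; only the URL format comes from the example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical server-side records; the client never supplies these values.
PRICE_LIST = {"555A": 10}
SHIPPING_COST = {"boat": 5}


def server_side_total(url):
    """Recompute the order total from trusted data, ignoring price and total in the URL."""
    query = parse_qs(urlsplit(url).query)
    part = query["part"][0]
    ship = query["ship"][0]
    quantity = int(query["qy"][0])
    if part not in PRICE_LIST or ship not in SHIPPING_COST:
        raise ValueError("unknown part or shipping method")
    if not 1 <= quantity <= 1000:        # assumed sanity limit on quantity
        raise ValueError("implausible quantity")
    return quantity * PRICE_LIST[part] + SHIPPING_COST[ship]


tampered = ("http://www.things.com/order.asp?custID=101&part=555A&qy=20"
            "&price=1&ship=boat&shipcost=5&total=25")
print(server_side_total(tampered))
```

The tampered price and total have no effect, because the server never reads them.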

• Time-of-Check to Time-of-Use Errors
– The time-of-check to time-of-use (TOCTTOU) flaw concerns
mediation that is performed with a "bait and switch" in the middle. It
is also known as a serialization or synchronization flaw.

• Time-of-Check to Time-of-Use Errors (Cont’d)
– Suppose a request to access a file were presented as a data structure, with the
name of the file and the mode of access presented in the structure.
– To carry out this authorization sequence, the access control mediator would
have to look up the file name in tables. The mediator could compare the
names in the table to the file name in the data structure to determine whether
access is appropriate. More likely, the mediator would copy the file name
into its own local storage area and compare from there.

• Time-of-Check to Time-of-Use Errors (Cont’d)
– While the mediator is checking access rights for the file my_file, the
user could change the file name descriptor to your_file.
– The problem is called a time-of-check to time-of-use flaw because
it exploits the delay between the two times. That is, between the
time the access was checked and the time the result of the check
was used, a change occurred, invalidating the result of the check.

• Security Implication
– Pretty clear
– Checking one action and performing another is an example of
ineffective access control.
– There are ways to prevent exploitation of the time lag.
   – One way is to ensure that critical parameters are not exposed during any
   loss of control.
   – Another way is to ensure serial integrity; that is, to allow no interruption
   (loss of control) during the validation.
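
The first prevention technique, keeping critical parameters out of the user's reach between check and use, can be sketched as follows. The AccessRequest structure, the access table, and the interruption callback (which stands in for the moment the mediator loses control) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    file_name: str
    mode: str


# Hypothetical access table: (user, file) -> permitted modes.
ACCESS_TABLE = {("user", "my_file"): {"read"}}


def is_permitted(user, file_name, mode):
    return mode in ACCESS_TABLE.get((user, file_name), set())


def flawed_access(user, request, interruption=lambda: None):
    """Checks the shared request, loses control, then uses the shared request."""
    permitted = is_permitted(user, request.file_name, request.mode)  # time of check
    interruption()                                                   # window for the switch
    if not permitted:
        raise PermissionError(request.file_name)
    return "accessing " + request.file_name                          # time of use


def mediated_access(user, request, interruption=lambda: None):
    """Copies the parameters first, so the check and the use see the same values."""
    file_name, mode = request.file_name, request.mode   # private copy
    permitted = is_permitted(user, file_name, mode)
    interruption()
    if not permitted:
        raise PermissionError(file_name)
    return "accessing " + file_name


request = AccessRequest("my_file", "read")
switch = lambda: setattr(request, "file_name", "your_file")
print(flawed_access("user", request, switch))
```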

3.3. Viruses and Other Malicious Code


• Malicious Code Can Do Much (Harm)
– Malicious code runs under the user's authority.
– Thus, malicious code can touch everything the user can touch, and
in the same ways.
– Users typically have complete control over their own program code
and data files; they can read, write, modify, append, and even
delete them.
– But malicious code can do the same, without the user's permission
or even knowledge.
[Figure: Number of malware signatures per year, 2002 to 2008 (vertical axis from 0 to 1,800,000). Source: Symantec report 2009]

[Figure: Almost 30 years of Malware]

• From Malware: Fighting Malicious Code

[Figure: Attack Sophistication vs. Intruder Technical Knowledge, 1980 to 2004 (vertical axis from Low to High; curves for attack sophistication and intruder knowledge). Labeled techniques include password guessing, self-replicating code, password cracking, exploiting known vulnerabilities, burglaries, disabling audits, back doors, hijacking sessions, sweepers, sniffers, packet spoofing, network mgmt. diagnostics, GUI, automated probes/scans, www attacks, denial of service, "stealth" / advanced scanning techniques, distributed attack tools, staged attacks, cross site scripting, and auto coordinated tools.]

• Kinds of Malicious Code
– Malicious code or rogue program is the general name for
unanticipated or undesired effects in programs or program parts,
caused by an agent intent on damage.
– A virus is a program that can replicate itself and pass on malicious
code to other nonmalicious programs by modifying them.
   – A transient virus has a life that depends on the life of its host.
   – A resident virus locates itself in memory.

• Kinds of Malicious Code (Cont’d)
– A Trojan horse is malicious code that, in addition to its primary
effect, has a second, nonobvious malicious effect.
– A logic bomb is a class of malicious code that "detonates" or goes
off when a specified condition occurs.
   – A time bomb is a logic bomb whose trigger is a time or date.
– A trapdoor or backdoor is a feature in a program by which
someone can access the program other than by the obvious, direct
call.

• Kinds of Malicious Code (Cont’d)
– A worm is a program that spreads copies of itself through a network.
– A rabbit is a virus or worm that self-replicates without bound, with
the intention of exhausting some computing resource.

• How Viruses Attach
– Appended Viruses

• How Viruses Attach (Cont’d)
– Viruses That Surround a Program

• How Viruses Attach (Cont’d)
– Integrated Viruses and Replacements

• How Viruses Attach (Cont’d)
– Document Viruses
   – Implemented within a formatted document, such as a written
   document, a database, a slide presentation, a picture, or a
   spreadsheet.

• How Viruses Gain Control

• Homes for Viruses
– One-Time Execution: the majority of viruses
– Boot Sector Viruses

• Homes for Viruses (Cont’d)
– Memory-Resident Viruses
– Other Homes for Viruses
   – Application programs
   – Libraries
   – Data files: need a startup program

• Virus Signatures
– A signature is a telltale pattern.
– E.g., the signature for the Code Red worm:

/default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN %u9090%u6858%ucbd3
%u7801%u9090%u6858%ucdb3%u7801%u9090%u6858 %ucbd3%u7801%u9090
%u9090%u8190%u00c3%u0003%ub00%u531b%u53ff %u0078%u0000%u00=a
HTTP/1.0
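
A scanner that relies on signatures simply searches for known patterns. The sketch below is a minimal illustration: the SIGNATURES table and the scan function are hypothetical, and the pattern is only a shortened prefix of the request shown above, not a complete signature.

```python
# Hypothetical signature database: name -> byte pattern to search for.
SIGNATURES = {
    "code-red-style request": b"/default.ida?NNNNNNNN",
}


def scan(data):
    """Return the names of all signatures whose pattern occurs in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


suspicious = b"GET /default.ida?" + b"N" * 224 + b"%u9090%u6858 HTTP/1.0"
harmless = b"GET /index.html HTTP/1.0"
print(scan(suspicious))
print(scan(harmless))
```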

• Virus Signatures
– Storage Patterns - Most viruses attach to programs that are stored
on media such as disks. The attached virus piece is invariant, so the
start of the virus code becomes a detectable signature.

• Virus Signatures
– Execution Patterns - A virus writer may want a virus to do several
things at the same time, namely, spread infection, avoid detection,
and cause harm.
– Transmission Patterns - A virus is effective only if it has some
means of transmission from one location to another.
• Polymorphic Viruses
– A virus that can change its appearance
– Encrypting viruses
   – Use encryption under various keys to make the stored form of the virus
   different.
   – Contain three distinct parts:
      – a decryption key,
      – the (encrypted) object code of the virus,
      – the (unencrypted) object code of the decryption routine.

• Prevention of Virus Infection
– The only way to prevent the infection of a virus is not to receive
executable code from an infected source.

• Prevention of Virus Infection (Cont’d)
– Several techniques for building a reasonably safe community for
electronic contact, including the following:
   – Use only commercial software acquired from reliable, well-established
   vendors.
   – Test all new software on an isolated computer.
   – Open attachments only when you know them to be safe.
   – Make a recoverable system image and store it safely.
   – Make and retain backup copies of executable system files.
   – Use virus detectors (often called virus scanners) regularly and update them
   daily.

3.4. Targeted Malicious Code
• Trapdoors - an undocumented entry point to a module
– Examples
   – A system is composed of modules or components.
   – Programmers first test each small component of the system separate
   from the other components, in a step called unit testing, to ensure that
   the component works correctly by itself.
   – Then, developers test components together during integration testing, to
   see how they function as they send messages and data from one to the
   other.

• Trapdoors - an undocumented entry point to a module
– Examples
   – Hardware processor design
   – The undefined opcodes sometimes implement peculiar instructions,
   either because of an intent to test the processor design or because of an
   oversight by the processor designer.
   – Undefined opcodes are the hardware counterpart of poor error checking
   for software.

• Causes of Trapdoors
– forget to remove them
– intentionally leave them in the program for testing
– intentionally leave them in the program for maintenance of the
finished program, or
– intentionally leave them in the program as a covert means of
access to the component after it becomes an accepted part of a
production system

• Salami Attack
– merges bits of seemingly inconsequential data to yield powerful
results.

• Why Salami Attacks Persist
– Computer computations are notoriously subject to small errors
involving rounding and truncation, especially when large numbers
are to be combined with small ones.
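
The arithmetic behind this is easy to reproduce. The sketch below, with a hypothetical interest rate and hypothetical balances, shows how much is left over when interest is truncated to whole cents; an auditor could use the same reconciliation to confirm that the leftover fractions are accounted for.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")


def truncated_interest(balance, rate):
    """Return (interest credited in whole cents, fraction of a cent left over)."""
    exact = balance * rate
    credited = exact.quantize(CENT, rounding=ROUND_DOWN)
    return credited, exact - credited


def total_left_over(balances, rate):
    """Sum the truncated fractions across all accounts."""
    return sum((truncated_interest(b, rate)[1] for b in balances), Decimal("0"))


rate = Decimal("0.0065")                   # assumed interest rate for one period
accounts = [Decimal("102.87")] * 100000    # assumed: many accounts, same balance
print(truncated_interest(accounts[0], rate))
print(total_left_over(accounts, rate))
```

Each account loses less than one cent, which no customer would notice, but the leftovers add up across many accounts.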
• Privilege Escalation - a means for malicious code to be launched by a
user with lower privileges but run with higher privileges.

• Interface Illusions - a spoofing attack in which all or part of a web
page is false.

• Keystroke Logging - retains a surreptitious copy of all keys pressed.

• Man-in-the-Middle Attacks - interjects itself between two other
programs

• Covert Channels: Programs That Leak Information

• Storage Channels - pass information by using the presence or absence
of objects in storage.

[Figure: File Existence Channel Used to Signal 100]
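
The file existence channel can be simulated without touching a real file system. In the sketch below a Python set stands in for a shared directory, and the names send_bit, receive_bit, transmit, and the marker name are hypothetical. It shows only the signaling principle: one bit per agreed time interval, encoded as the presence or absence of an object.

```python
def send_bit(storage, marker, bit):
    """Signal 1 by making the object exist, 0 by making it absent."""
    if bit:
        storage.add(marker)
    else:
        storage.discard(marker)


def receive_bit(storage, marker):
    """Read the signal by testing whether the object exists."""
    return 1 if marker in storage else 0


def transmit(bits, marker="tmp_file"):
    storage = set()            # stands in for a shared directory
    received = []
    for bit in bits:           # one bit per agreed time interval
        send_bit(storage, marker, bit)
        received.append(receive_bit(storage, marker))
    return received


print(transmit([1, 0, 0]))
```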

3.5. Controls Against Program Threats
• Three types of controls:
– Developmental
– Operating system
– Administrative

• Developmental Controls
• The Nature of Software Development
– Collaborative effort, involving people with different skill sets who
combine their expertise to produce a working product
– Development requires people who can specify, design,
implement, test, review, document, manage, and maintain the
system.

• Developmental Controls (Cont’d)
• Modularity, Encapsulation, and Information Hiding
– A key principle of software engineering is to create a design or code in small,
self-contained units, called components or modules.
– If a component is isolated from the effects of other components, then it is
easier to trace a problem to the fault that caused it and to limit the damage
the fault causes. This isolation is called encapsulation.
– Information hiding is another characteristic of modular software.

• Developmental Controls (Cont’d)
• Modularization is the process of dividing a task into subtasks.

• Developmental Controls (Cont’d)
• Modularization
– The goal is to have each component meet four conditions:
1. single-purpose: performs one function
2. small: consists of an amount of information for which a human can
readily grasp both structure and content
3. simple: is of a low degree of complexity so that a human can readily
understand the purpose and structure of the module
4. independent: performs a task isolated from other modules

• Developmental Controls (Cont’d)
• Modularization
– Several advantages to having small, independent components:
   – Maintenance. If a component implements a single function, it can be
   replaced easily with a revised one if necessary.
   – Understandability
   – Reuse
   – Correctness
   – Testing

• Developmental Controls (Cont’d)
• Modularization
– A modular component usually has high cohesion and low coupling.
   – By cohesion, we mean that all the elements of a component have a logical
   and functional reason for being there.
   – Coupling refers to the degree with which a component depends on other
   components in the system.
• Developmental Controls (Cont’d)
• Encapsulation
– Encapsulation hides a component's implementation details, but it does not
necessarily mean complete isolation.
– Berard [BER00] notes that encapsulation is the "technique for packaging the
information [inside a component] in such a way as to hide what should be
hidden and make visible what is intended to be visible."

• Developmental Controls (Cont’d)
• Information Hiding
– Think of a component as a kind of black box, with certain well-defined
inputs and outputs and a well-defined function.
– Other components' designers do not need to know how the module
completes its function; it is enough to be assured that the component
performs its task in some correct manner.
– This concealment is the information hiding.
– Information hiding is desirable because developers cannot easily and
maliciously alter the components of others if they do not know how the
components work.

• Developmental Controls (Cont’d)
• Information Hiding (Cont’d)

• Developmental Controls (Cont’d)
• Mutual Suspicion
– Mutually suspicious programs operate as if other routines in the system were
malicious or incorrect.
– A calling program cannot trust its called subprocedures to be correct, and a
called subprocedure cannot trust its calling program to be correct.
– Each protects its interface data so that the other has only limited access.
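
Mutual suspicion can be sketched with a caller and a called routine that each check the other. The functions safe_sqrt and hypotenuse are hypothetical examples chosen for illustration: the called routine validates its argument, and the caller verifies the answer it gets back.

```python
import math


def safe_sqrt(x):
    """Called routine: does not trust its caller to pass a sensible argument."""
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise TypeError("argument must be a number")
    if math.isnan(x) or math.isinf(x) or x < 0:
        raise ValueError("argument must be finite and non-negative")
    return math.sqrt(x)


def hypotenuse(a, b, sqrt_routine=safe_sqrt):
    """Calling routine: does not trust the called routine's answer either."""
    sum_of_squares = a * a + b * b
    result = sqrt_routine(sum_of_squares)
    plausible = isinstance(result, float) and math.isclose(
        result * result, sum_of_squares, rel_tol=1e-9)
    if not plausible:
        raise RuntimeError("called routine returned an implausible result")
    return result


print(hypotenuse(3, 4))
```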
• Developmental Controls (Cont’d)
• Confinement
– A confined program is strictly limited in what system resources it can access.
If a program is not trustworthy, the data it can access are strictly limited.
• Genetic Diversity
– Tight integration of products is a concern.
– A vulnerability in one of these can also affect the others.
– Fixing a vulnerability in one can have an impact on the others.

• Developmental Controls (Cont’d)
• Pfleeger et al. [PFL01] recommend several key techniques for
building what they call "solid software":
– peer reviews
– hazard analysis
– testing
– good design
– prediction
– static analysis
– configuration management
– analysis of mistakes

• Developmental Controls (Cont’d)
• Peer review
– Review: The artifact is presented informally to a team of reviewers; the goal
is consensus and buy-in before development proceeds further.
– Walk-through: The artifact is presented to the team by its creator, who
leads and controls the discussion. Here, education is the goal, and the focus
is on learning about a single document.
– Inspection: The artifact is checked against a prepared list of concerns. The
creator does not lead the discussion, and the fault identification and
correction are often controlled by statistical measurements.

• Developmental Controls (Cont’d)
• Peer review (Cont’d)
– A wise engineer who finds a fault can deal with it in at least three ways:
1. by learning how, when, and why errors occur
2. by taking action to prevent mistakes
3. by scrutinizing products to find the instances and effects of errors that
were missed

• Developmental Controls (Cont’d)
• Peer review (Cont’d)

[Figure: Fault Discovery Rate Reported at Hewlett-Packard]

• Developmental Controls (Cont’d)
• Peer review (Cont’d)

Discovery Activity      Faults Found (Per Thousand Lines of Code)
Requirements review     2.5
Design review           5.0
Code inspection         10.0
Integration test        3.0
Acceptance test         2.0

• Developmental Controls (Cont’d)
• Hazard analysis
– A set of systematic techniques intended to expose potentially hazardous
system states.
– Usually involves developing hazard lists, as well as procedures for exploring
"what if" scenarios to trigger consideration of nonobvious hazards.
– A variety of techniques support the identification and management of
potential hazards. Among the most effective are
   – Hazard and operability studies (HAZOP)
   – Failure modes and effects analysis (FMEA)
   – Fault tree analysis (FTA)

• Developmental Controls (Cont’d)
• Hazard analysis (Cont’d)

                 Known Cause                             Unknown Cause
Known effect     Description of system behavior          Deductive analysis, including
                                                         fault tree analysis
Unknown effect   Inductive analysis, including failure   Exploratory analysis, including
                 modes and effects analysis              hazard and operability studies
• Developmental Controls (Cont’d)
• Testing
– A process activity that homes in on product quality: making the product
failure free or failure tolerant.

• Developmental Controls (Cont’d)
• Testing (Cont’d)
– Usually involves several stages.
   – Unit testing is done in a controlled environment whenever possible so
   that the test team can feed a predetermined set of data to the
   component being tested and observe what output actions and data are
   produced.
   – Integration testing is the process of verifying that the system
   components work together as described in the system and program
   design specifications.

• Developmental Controls (Cont’d)
• Testing (Cont’d)
– Usually involves several stages. (Cont’d)
   – A function test evaluates the system to determine whether the functions
   described by the requirements specification are actually performed by
   the integrated system.
   – A performance test compares the system with the remainder of these
   software and hardware requirements.
   – An acceptance test checks the system against the
   customer's requirements description.

• Developmental Controls (Cont’d)
• Testing (Cont’d)
– Usually involves several stages. (Cont’d)
   – A final installation test is run to make sure that the system still functions
   as it should.
   – After a change is made to enhance the system or fix a
   problem, regression testing ensures that all remaining functions are still
   working and that performance has not been degraded by the change.
• Developmental Controls (Cont’d)
• Testing (Cont’d)
– Each of the types of tests listed here can be performed from two
perspectives:
   – Black-box testing treats a system or its components as black boxes;
   testers cannot "see inside" the system.
   – Clear-box testing (a.k.a. white-box testing): testers can examine the
   design and code directly, generating test cases based on the code's actual
   construction.

• Developmental Controls (Cont’d)
• Testing (Cont’d)
– Olsen [OLS93] describes the development at Contel IPC of a system
containing 184,000 lines of code. Faults discovered during various
activities were tracked, and differences were found:
   – 17.3 percent of the faults were found during inspections of the system design
   – 19.1 percent during component design inspection
   – 15.1 percent during code inspection
   – 29.4 percent during integration testing
   – 16.6 percent during system and regression testing
   – Only 0.1 percent of the faults were revealed after the system was placed in the
   field.

• Developmental Controls (Cont’d)
• Testing (Cont’d)
– From a security standpoint, independent testing is highly desirable.
   – Penetration testing is unique to computer security: the testers try to see if the
   software does what it is not supposed to do, which is to fail or fail to enforce
   security.

• Developmental Controls (Cont’d)
• Good Design
– Designers should try to anticipate faults and handle them in ways that
minimize disruption and maximize safety and security.
   – Passive fault detection - construct the system so that it reacts in an
   acceptable way to a failure's occurrence.
   – Active fault detection - adopting a philosophy of mutual suspicion.
   Instead of assuming that data passed from other systems or components
   are correct, we always check that the data are within bounds and of the
   right type or format.
   – We can also use redundancy, comparing the results of two or more
   processes to see that they agree, before we use their result in a task.
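
Active fault detection and redundancy can be sketched as follows. The function names, the bounds, and the two averaging routines are hypothetical: check_reading rejects data of the wrong type or out of bounds, and agreed_result uses a value only when two independently written processes agree on it.

```python
import math


def check_reading(value, low, high):
    """Active fault detection: reject data of the wrong type or out of bounds."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("reading must be a number")
    if math.isnan(value) or not low <= value <= high:
        raise ValueError("reading out of bounds")
    return float(value)


def agreed_result(process_a, process_b, data, tolerance=1e-9):
    """Redundancy: use a result only if two processes agree on it."""
    a, b = process_a(data), process_b(data)
    if abs(a - b) > tolerance:
        raise RuntimeError("redundant processes disagree")
    return a


def mean_by_sum(values):
    return sum(values) / len(values)


def mean_by_fsum(values):
    return math.fsum(values) / len(values)


readings = [check_reading(v, 0.0, 100.0) for v in (20, 21.5, 19.5)]
print(agreed_result(mean_by_sum, mean_by_fsum, readings))
```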
• Developmental Controls (Cont’d)
• Good Design
– Fault tolerance: isolating the damage caused by the fault and minimizing
disruption to users.
– Typically, failures include
   – failing to provide a service
   – providing the wrong service or data
   – corrupting data

• Developmental Controls (Cont’d)
• Good Design
– We can build into the design a particular way of handling each problem
1. Retrying: restoring the system to its previous state and performing the
service again, using a different strategy
2. Correcting: restoring the system to its previous state, correcting some
system characteristic, and performing the service again, using the same
strategy
3. Reporting: restoring the system to its previous state, reporting the
problem to an error-handling component, and not providing the
service again
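
The three handling strategies share one step, restoring the previous state, and differ in what happens next. The sketch below assumes the system state fits in a dictionary; the function handle_with and its parameters are hypothetical names chosen for illustration.

```python
import copy


def handle_with(strategy, service, state, alternative=None, fix=None, report=None):
    """Run service(state); on failure restore the state and apply one strategy."""
    saved = copy.deepcopy(state)
    try:
        return service(state)
    except Exception as problem:
        state.clear()
        state.update(saved)                 # restore the previous state
        if strategy == "retry":
            return alternative(state)       # perform the service with a different strategy
        if strategy == "correct":
            fix(state)                      # correct some system characteristic
            return service(state)           # perform the service with the same strategy
        report(problem)                     # hand the problem to an error handler
        return None                         # the service is not provided again


def average_cost(state):
    return state["total"] / state["count"]


state = {"total": 10, "count": 0}
print(handle_with("correct", average_cost, state, fix=lambda s: s.update(count=1)))
```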

• Developmental Controls (Cont’d)
• Static Analysis - examine a system's design and code to locate and repair
security flaws before the system is up and running
– Several aspects of the design and code can be examined:
   – control flow structure - the sequence in which instructions are executed,
   including iterations and loops.
   – data flow structure - follows the trail of a data item as it is accessed and
   modified by the system
   – data structure - the way in which the data are organized, independent of
   the system itself.

Developmental Controls (Cont'd)
• Configuration Management - the process by which we control changes during development and maintenance
  • It is important to know who is making which changes to what and when:
    • corrective changes: maintaining control of the system's day-to-day functions
    • adaptive changes: maintaining control over system modifications
    • perfective changes: perfecting existing acceptable functions
    • preventive changes: preventing system performance from degrading to unacceptable levels

Developmental Controls (Cont'd)
• Configuration Management (Cont'd)
  • Four activities are involved in configuration management:
    • configuration identification
    • configuration control and change management
    • configuration auditing
    • status accounting

Developmental Controls (Cont'd)
• Configuration Management (Cont'd)
  • Configuration identification
    • Sets up baselines to which all other code will be compared after changes are made; that is, it builds and documents an inventory of all components that comprise the system.
    • "Freeze" the baseline and carefully control what happens to it.
    • When a change is proposed and made, it is described in terms of how the baseline changes.

Developmental Controls (Cont'd)
• Configuration Management (Cont'd)
  • Configuration control and change management - ensure we can coordinate separate, related versions.
  • Three ways to control the changes:
    • Separate files - have different files for each release or version.
    • Delta - designate a particular version as the main version of a system and then define other versions in terms of what is different.
    • Conditional compilation - a single code component addresses all versions, relying on the compiler to determine which statements to apply to which versions.

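The delta approach can be illustrated with a minimal Python sketch. It assumes the standard `difflib` module as the differencing tool; the two-line component and the helper names `make_delta` and `rebuild` are hypothetical. An `ndiff` delta carries both versions, so this shows the idea of describing one version relative to another rather than a space-saving storage format.

```python
import difflib

# Hypothetical two-line component; version 2 changes only the second line.
MAIN_VERSION = ["def greet():", "    print('hello')"]
VERSION_2 = ["def greet():", "    print('hello, world')"]


def make_delta(main, other):
    """Describe `other` in terms of how it differs from `main`."""
    return list(difflib.ndiff(main, other))


def rebuild(delta):
    """Recover the second version from a delta made by make_delta."""
    return list(difflib.restore(delta, 2))


delta = make_delta(MAIN_VERSION, VERSION_2)
# Lines marked "- " exist only in the main version, "+ " only in version 2.
changed = [line for line in delta if line.startswith(("- ", "+ "))]
```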

Developmental Controls (Cont'd)
• Configuration Management (Cont'd)
  • A configuration audit confirms that the baseline is complete and accurate, that changes are recorded, that recorded changes are made, and that the actual software (that is, the software as used in the field) is reflected accurately in the documents.
  • Finally, status accounting records information about the components: where they came from (for instance, purchased, reused, or written from scratch), the current version, the change history, and pending change requests.

Developmental Controls (Cont'd)
• Configuration Management (Cont'd)
  • All activities are performed by a configuration and change control board, or CCB.
  • The CCB contains representatives from all organizations with a vested interest in the system.
  • The board reviews all proposed changes and approves changes based on need, design integrity, future plans for the software, cost, and more.

Developmental Controls (Cont'd)
• Proofs of Program Correctness
  • Program verification can demonstrate formally the "correctness" of certain specific programs.
  • It involves making initial assertions about the inputs and then checking to see if the desired output is generated.
  • Each program statement is translated into a logical description about its contribution to the logical flow of the program.
  • Finally, the terminal statement of the program is associated with the desired output.

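As a rough illustration of initial assertions, per-statement reasoning, and a final assertion about the output, here is a small Python sketch. The function `sum_to` is a hypothetical example, not taken from the slides. Runtime `assert` statements only check the conditions for the inputs actually run; a formal proof would establish them for all inputs.

```python
def sum_to(n):
    """Return 1 + 2 + ... + n, with the reasoning written as assertions."""
    # Initial assertion about the input.
    assert isinstance(n, int) and n >= 0
    total, i = 0, 0
    while i < n:
        # Loop invariant: total holds the sum of the integers 1..i.
        assert total == i * (i + 1) // 2
        i += 1
        total += i
    # Terminal assertion: the desired output has been produced.
    assert total == n * (n + 1) // 2
    return total
```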

Developmental Controls (Cont'd)
• Proofs of Program Correctness
  • Proving program correctness is hindered by several factors.
    • Correctness proofs depend on a programmer or logician to translate a program's statements into logical implications.
    • Deriving the correctness proof from the initial assertions and the implications of statements is difficult, and the logical engine to generate proofs runs slowly.
    • The current state of program verification is less well developed than code production.

6.1. Introduction to Databases
• Concept of a Database
  • A database is a collection of data and a set of rules that organize the data by specifying certain relationships among the data.
  • The user describes a logical format for the data.
  • The precise physical format of the file is of no concern to the user.
  • A database administrator is a person who defines the rules that organize the data and also controls who should have access to what parts of the data.
  • The user interacts with the database through a program called a database manager or a database management system (DBMS), informally known as a front end.

Components of Databases
• Record - contains one related group of data
• Each record contains fields or elements
• The logical structure of a database is called a schema
• A particular user may have access to only part of the database, called a subschema

Adams 212 Market St. Columbus OH 43210
Benchly 501 Union St. Chicago IL 60603
Carter 411 Elm St. Columbus OH 43210

Related Parts of a Database

Schema of Database
Name First Address City State Zip Airport
Adams Charles 212 Market St. Columbus OH 43210 CMH
Adams Edward 212 Market St. Columbus OH 43210 CMH
Benchly Zeke 501 Union St. Chicago IL 60603 ORD
Carter Marlene 411 Elm St. Columbus OH 43210 CMH
Carter Beth 411 Elm St. Columbus OH 43210 CMH
Carter Ben 411 Elm St. Columbus OH 43210 CMH
Carter Elisabeth 411 Elm St. Columbus OH 43210 CMH
Carter Mary 411 Elm St. Columbus OH 43210 CMH
• The name of each column is called an attribute of the database
• A relation is a set of columns

Table 6-3. Relation in a Database.


Name Zip
ADAMS 43210
BENCHLY 60603
CARTER 43210

Queries
• Users interact with database managers through commands to the DBMS that retrieve, modify, add, or delete fields and records of the database.
• A command is called a query.
• For example,

SELECT NAME = 'ADAMS'

Queries (Cont'd)
• The result of executing a query is a subschema.
• For example, we might select records in which ZIP=43210

Result of Select Query
Name First Address City State Zip Airport
ADAMS Charles 212 Market St. Columbus OH 43210 CMH
ADAMS Edward 212 Market St. Columbus OH 43210 CMH
CARTER Marlene 411 Elm St. Columbus OH 43210 CMH
CARTER Beth 411 Elm St. Columbus OH 43210 CMH
CARTER Ben 411 Elm St. Columbus OH 43210 CMH
CARTER Lisabeth 411 Elm St. Columbus OH 43210 CMH
CARTER Mary 411 Elm St. Columbus OH 43210 CMH

Queries (Cont'd)
• Other, more complex, selection criteria are possible, with logical operators such as and (∧) and or (∨), and comparisons such as less than (<).
• An example of a select query is

SELECT (ZIP='43210') ∧ (NAME='ADAMS')

Queries (Cont'd)
• After having selected records, we may project these records onto one or more attributes.
  • The select operation identifies certain rows from the database.
  • A project operation extracts the values from certain fields (columns) of those records.
• For example, we might
  • Select records meeting the condition ZIP=43210
  • Project the results onto the attributes NAME and FIRST

Results of Select-Project Query
ADAMS Charles
ADAMS Edward
CARTER Marlene
CARTER Beth
CARTER Ben
CARTER Lisabeth
CARTER Mary

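The select and project operations can be sketched in a few lines of Python. This is an illustration only: the helper names `select` and `project` are hypothetical, the table is a four-row excerpt of the schema shown earlier, and a real DBMS would evaluate such queries through its own query language.

```python
# Excerpt of the example table, reduced to three attributes.
ROWS = [
    ("ADAMS", "Charles", "43210"),
    ("ADAMS", "Edward", "43210"),
    ("BENCHLY", "Zeke", "60603"),
    ("CARTER", "Marlene", "43210"),
]
TABLE = [dict(zip(("NAME", "FIRST", "ZIP"), row)) for row in ROWS]


def select(table, predicate):
    """Keep the rows (records) that satisfy the predicate."""
    return [row for row in table if predicate(row)]


def project(table, attributes):
    """Keep only the named columns (attributes) of each row."""
    return [tuple(row[a] for a in attributes) for row in table]


picked = select(TABLE, lambda row: row["ZIP"] == "43210")
result = project(picked, ["NAME", "FIRST"])
```

Selection filters rows and projection filters columns, so the two can be combined in either role: the predicate may use attributes that the projection later drops.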

Queries (Cont'd)
• Notice that we do not have to project onto the same attribute(s) on which the selection is done. For example, we can build a query using ZIP and NAME but project the result onto FIRST:

SHOW FIRST WHERE (ZIP='43210') ∧ (NAME='ADAMS')

• The result would be a list of the first names of people whose last name is ADAMS and whose ZIP is 43210.

Queries (Cont'd)
• We can also merge two subschemas on a common element by using a join query.

Advantage of Using Databases
• A database is a single collection of data, stored and maintained at one central location, to which many people may have access as needed.
• The users are unaware of the physical arrangements; the unified logical arrangement is all they see.

Advantage of Using Databases
• Shared access - users use one common, centralized set of data
• Minimal redundancy - users do not have to collect and maintain their own sets of data
• Data consistency - a change to a data value affects all users of the data value
• Data integrity - data values are protected against accidental or malicious undesirable changes
• Controlled access - only authorized users are allowed to view or to modify data values

6.2. Security Requirements
• A list of requirements for database security.
  • Physical database integrity.
    • The data of a database are immune to physical problems, such as power failures, and someone can reconstruct the database if it is destroyed through a catastrophe.
  • Logical database integrity.
    • The structure of the database is preserved. With logical integrity of a database, a modification to the value of one field does not affect other fields, for example.

A list of requirements for database security. (Cont'd)
• Element integrity.
  • The data contained in each element are accurate.
• Auditability.
  • It is possible to track who or what has accessed (or modified) the elements in the database.
• Access control.
  • A user is allowed to access only authorized data, and different users can be restricted to different modes of access (such as read or write).

A list of requirements for database security. (Cont'd)
• User authentication.
  • Every user is positively identified, both for the audit trail and for permission to access certain data.
• Availability.
  • Users can access the database in general and all the data for which they are authorized.

Integrity of the Database
• Two situations can affect the integrity of a database:
  • when the whole database is damaged
  • when individual data items are unreadable
• Integrity of the database as a whole is the responsibility of
  • the DBMS
  • the operating system
  • the (human) computing system manager

Integrity of the Database (Cont'd)
• Sometimes it is important to be able to reconstruct the database at the point of a failure.
  • The DBMS must maintain a log of transactions.
  • The system can obtain accurate account balances by reverting to a backup copy of the database and reprocessing all later transactions from the log.

Element Integrity
• The integrity of database elements is their correctness or accuracy.
• Corrective action can be taken in three ways:
  • Field checks - activities that test for appropriate values in a position.
  • Access control
  • A change log - lists every change made to the database; it contains both original and modified values. Using this log, a database administrator can undo any changes that were made in error.

Auditability
• For some applications it may be desirable to generate an audit record of all access (read or write) to a database.
• Such a record can help to maintain the database's integrity, or at least to discover after the fact who had affected which values and when.

Access Control
• Databases are often separated logically by user access privileges.

User Authentication
• The DBMS can require rigorous user authentication.
• A DBMS might insist that a user pass both specific password and time-of-day checks.
• This authentication supplements the authentication performed by the operating system.

Availability

Integrity/Confidentiality/Availability - Computer Security
• Integrity is a major concern in the design of database management systems.
• Confidentiality is a key issue with databases because of the inference problem.
• Availability is important because of the shared access motivation underlying database development.

6.3. Reliability and Integrity
• Databases amalgamate data from many sources, and users expect a DBMS to provide access to the data in a reliable way.
• Reliability means that the software runs for very long periods of time without failing.

• Database concerns about reliability and integrity can be viewed from three dimensions:
  • Database integrity: concern that the database as a whole is protected against damage.
  • Element integrity: concern that the value of a specific data element is written or changed only by authorized users.
  • Element accuracy: concern that only correct values are written into the elements of a database.

Protection Features from the Operating System
• A responsible system administrator backs up the files of a database periodically along with other user files.
• The files are protected during normal execution against outside access by the operating system's standard access control facilities.
• Finally, the operating system performs certain integrity checks for all data as a part of normal read and write operations for I/O devices.

Two-Phase Update
• A serious problem for a database manager is the failure of the computing system in the middle of modifying data.
• If the data item to be modified was a long field, half of the field might show the new value, while the other half would contain the old.

Two-Phase Update (Cont'd)
• Update Technique
  • The intent phase - the DBMS gathers the resources it needs to perform the update.
  • The commit phase - involves writing a commit flag to the database. The commit flag means that the DBMS has passed the point of no return: after committing, the DBMS begins making permanent changes.

Two-Phase Update (Cont'd)
• Example
  1. The stockroom checks the database to determine that 50 boxes of paper clips are on hand. If not, the requisition is rejected and the transaction is finished.
  2. If enough paper clips are in stock, the stockroom deducts 50 from the inventory figure in the database (107 - 50 = 57).
  3. The stockroom charges accounting's supplies budget (also in the database) for 50 boxes of paper clips.

Two-Phase Update (Cont'd)
• Example (Cont'd)
  4. The stockroom checks its remaining quantity on hand (57) to determine whether the remaining quantity is below the reorder point. Because it is, a notice to order more paper clips is generated, and the item is flagged as "on order" in the database.
  5. A delivery order is prepared, enabling 50 boxes of paper clips to be sent to accounting.
• All five of these steps must be completed in the order listed for the database to be accurate and for the transaction to be processed correctly.

Two-Phase Update (Cont'd)
• Example (Cont'd)
  • When a two-phase commit is used, shadow values are maintained for key data points. A shadow data value is computed and stored locally during the intent phase, and it is copied to the actual database during the commit phase.

Two-Phase Update (Cont'd)
• Example (Cont'd)
  • Intent:
    • Check the value of COMMIT-FLAG in the database. If it is set, this phase cannot be performed. Halt or loop, checking COMMIT-FLAG until it is not set.
    • Compare number of boxes of paper clips on hand to number requisitioned; if more are requisitioned than are on hand, halt.
    • Compute TCLIPS = ONHAND - REQUISITION.
    • Obtain BUDGET, the current supplies budget remaining for accounting department. Compute TBUDGET = BUDGET - COST, where COST is the cost of 50 boxes of clips.
    • Check whether TCLIPS is below reorder point; if so, set TREORDER = TRUE; else set TREORDER = FALSE.

Two-Phase Update (Cont'd)
• Example (Cont'd)
  • Commit:
    • Set COMMIT-FLAG in database.
    • Copy TCLIPS to CLIPS in database.
    • Copy TBUDGET to BUDGET in database.
    • Copy TREORDER to REORDER in database.
    • Prepare notice to deliver paper clips to accounting department. Indicate transaction completed in log.
    • Unset COMMIT-FLAG.

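The intent and commit steps above can be sketched in Python as follows. This is a simplified, single-process illustration: the database is a plain dictionary, the function name `two_phase_update` is hypothetical, and the budget, cost, and reorder-point figures are assumed values (the slides say only that 57 is below the reorder point). A real DBMS would also need crash recovery that repeats the commit-phase copies whenever it finds the flag still set.

```python
def two_phase_update(db, requisition, cost, reorder_point, log):
    """Apply one requisition to `db` using an intent phase and a commit phase."""
    # Intent phase: compute shadow values locally; the database is only read.
    if db["COMMIT_FLAG"]:
        return False                  # a commit is in progress; caller may retry
    if requisition > db["CLIPS"]:
        return False                  # more requisitioned than on hand: halt
    t_clips = db["CLIPS"] - requisition
    t_budget = db["BUDGET"] - cost
    t_reorder = t_clips < reorder_point

    # Commit phase: once the flag is set, all copies below must complete.
    db["COMMIT_FLAG"] = True
    db["CLIPS"] = t_clips
    db["BUDGET"] = t_budget
    db["REORDER"] = t_reorder
    log.append("requisition completed")
    db["COMMIT_FLAG"] = False
    return True


# Assumed figures: budget 1000, cost 200, reorder point 60.
db = {"COMMIT_FLAG": False, "CLIPS": 107, "BUDGET": 1000, "REORDER": False}
log = []
ok = two_phase_update(db, requisition=50, cost=200, reorder_point=60, log=log)
```

Because the intent phase changes nothing in the database, it can be abandoned or repeated any number of times without leaving a half-finished update behind.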

Redundancy/Internal Consistency
• Error Detection and Correction Codes
• Shadow Fields
  • Entire attributes or entire records can be duplicated in a database. If the data are irreproducible, this second copy can provide an immediate replacement if an error is detected.

Recovery
• In addition to these error correction processes, a DBMS can maintain a log of user accesses, particularly changes. In the event of a failure, the database is reloaded from a backup copy and all later changes are then applied from the audit log.

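Recovery from a backup plus a change log can be sketched as below. The dictionary representation, the `(field, new value)` log format, and the function name `recover` are assumptions made for illustration; the example values reuse the paper-clip figures from the two-phase update example with an assumed budget.

```python
def recover(backup, change_log):
    """Rebuild the current database from a backup and the later changes."""
    database = dict(backup)               # start from the backup copy
    for field, new_value in change_log:   # replay changes, oldest first
        database[field] = new_value
    return database


backup = {"CLIPS": 107, "BUDGET": 1000}
change_log = [("CLIPS", 57), ("BUDGET", 800)]
restored = recover(backup, change_log)
```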

Concurrency/Consistency
• Database systems are often multiuser systems.
• If two users try to modify the same data items, we often assume that there is no conflict because each knows what to write; the value to be written does not depend on the previous value of the data item. However, this supposition is not quite accurate.

Concurrency/Consistency (Cont'd)
• E.g.,
  • Agent A submits the update command
    SELECT (SEAT-NO = '11D') ASSIGN 'MOCK, E' TO PASSENGER-NAME
  • while Agent B submits the update sequence
    SELECT (SEAT-NO = '11D') ASSIGN 'EHLERS, P' TO PASSENGER-NAME
• To resolve this problem, a DBMS treats the entire query-update cycle as a single atomic operation.

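One way to picture an atomic query-update cycle is a lock held across both the check and the write. The Python sketch below uses `threading.Lock` as a stand-in for whatever mechanism a DBMS actually uses; the class `SeatMap` and its method are hypothetical.

```python
import threading


class SeatMap:
    """Seat assignments where query and update form one atomic step."""

    def __init__(self):
        self._lock = threading.Lock()
        self._passenger = {}          # seat number -> passenger name

    def assign_if_free(self, seat, name):
        with self._lock:              # no other agent can run between the
            if seat in self._passenger:   # query ...
                return False
            self._passenger[seat] = name  # ... and the update
            return True

    def passenger(self, seat):
        with self._lock:
            return self._passenger.get(seat)
```

With this arrangement, whichever agent reaches seat 11D first gets it, and the other agent is told the seat is taken instead of silently overwriting the first assignment.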

Monitors
• The monitor is the unit of a DBMS responsible for the structural integrity of the database.
• Forms of monitors
  • Range Comparisons
    • A range comparison monitor tests each new value to ensure that the value is within an acceptable range.
  • Filters or patterns are more general types of data form checks.
  • State constraints describe the condition of the entire database.
  • Transition constraints describe conditions necessary before changes can be applied to a database.

6.4. Sensitive Data
• Sensitive data are data that should not be made public.
• In some cases, some but not all of the elements in the database are sensitive.
• There may be varying degrees of sensitivity.

Several factors can make data sensitive.
• Inherently sensitive.
  • The value itself may be so revealing that it is sensitive. Examples are the locations of defensive missiles.
• From a sensitive source.
  • The source of the data may indicate a need for confidentiality.
  • An example is information from an informer whose identity would be compromised if the information were disclosed.

Several factors can make data sensitive (Cont'd)
• Declared sensitive.
  • The database administrator or the owner of the data may have declared the data to be sensitive.
• Part of a sensitive attribute or a sensitive record.
  • In a database, an entire attribute or record may be classified as sensitive.
• Sensitive in relation to previously disclosed information.
  • Some data become sensitive in the presence of other data.
  • For example, the longitude coordinate of a secret gold mine reveals little, but the longitude coordinate in conjunction with the latitude coordinate pinpoints the mine.

Access Decisions
• The DBMS may consider several factors when deciding whether to permit an access.
  • Availability of the data
  • Acceptability of the access
  • Authenticity of the user

Access Decisions (Cont'd)
• Availability of Data
  • One or more required elements may be inaccessible.
  • For example, if a user is updating several fields, other users' accesses to those fields must be blocked temporarily. This blocking ensures that users do not receive inaccurate information.
• Acceptability of Access
  • One or more values of the record may be sensitive and not accessible by the general user. A DBMS should not release sensitive data to unauthorized individuals.

Access Decisions (Cont'd)
• Assurance of Authenticity
  • Certain characteristics of the user external to the database may also be considered when permitting access.
  • For example, to enhance security, the database administrator may permit someone to access the database only at certain times, such as during working hours.

Access Decisions (Cont'd)
• Types of Disclosures
  • Exact Data - The most serious disclosure is the exact value of a sensitive data item itself.
  • Bounds - Another exposure is disclosing bounds on a sensitive value; that is, indicating that a sensitive value, y, is between two values, L and H.
  • Negative Result - Sometimes we can word a query to determine a negative result. That is, we can learn that z is not the value of y.
  • Existence - The existence of data can itself be a sensitive piece of data.

Access Decisions (Cont'd)
• Types of Disclosures (Cont'd)
  • Probable Value - It may be possible to determine the probability that a certain element has a certain value.

Security versus Precision

6.5. Inference
• Inference is a way to infer or derive sensitive data from nonsensitive data.

Sample Database
Name     Sex  Race  Aid   Fines  Drugs  Dorm
Adams    M    C     5000  45.    1      Holmes
Bailey   M    B     0     0.     0      Grey
Chin     F    A     3000  20.    0      West
Dewitt   M    B     1000  35.    3      Grey
Earhart  F    C     2000  95.    1      Holmes
Fein     F    C     1000  15.    0      West
Groff    M    C     4000  0.     3      West
Hill     F    B     5000  10.    2      Holmes
Koch     F    C     0     0.     1      West
Liu      F    A     0     10.    2      Grey
Majors   M    C     2000  0.     2      Grey

Direct Attack
• A user tries to determine values of sensitive fields by seeking them directly with queries that yield few records.
• A sensitive query might be

List NAME
where SEX=M ∧ DRUGS=1

This query discloses that for record ADAMS, DRUGS=1. However, it is an obvious attack because it selects people for whom DRUGS=1, and the DBMS might reject the query because it selects records for a specific value of the sensitive attribute DRUGS.

Direct Attack (Cont'd)
• A less obvious query is

List NAME
where (SEX=M ∧ DRUGS=1) ∨
      (SEX≠M ∧ SEX≠F) ∨
      (DORM=AYRES)

This query still retrieves only one record, revealing a name that corresponds to the sensitive DRUGS value. The DBMS needs to know that SEX has only two possible values so that the second clause will select no records. Even if that were possible, the DBMS would also need to know that no records exist with DORM=AYRES, even though AYRES might in fact be an acceptable value for DORM.

Direct Attack (Cont'd)
• Do not reveal results when a small number of people make up a large proportion of a category.
• The rule of "n items over k percent" means that data should be withheld if n items represent over k percent of the result reported.

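A minimal sketch of the "n items over k percent" rule follows. It assumes one common reading of the rule, namely that the n largest contributions are compared against the reported total; the function name `withhold` and the thresholds in the example are hypothetical. The Aid figures come from the sample database above.

```python
def withhold(values, n, k):
    """True if the n largest items make up more than k percent of the total."""
    total = sum(values)
    if total == 0:
        return False
    top_n = sum(sorted(values, reverse=True)[:n])
    return 100 * top_n > k * total


# Aid values per dorm, copied from the sample database.
HOLMES_AID = [5000, 2000, 5000]      # Adams, Earhart, Hill
WEST_AID = [3000, 1000, 4000, 0]     # Chin, Fein, Groff, Koch

holmes_blocked = withhold(HOLMES_AID, n=1, k=40)   # 5000 of 12000 is about 41.7%
west_blocked = withhold(WEST_AID, n=1, k=60)       # 4000 of 8000 is 50%
```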

Indirect Attack
• Sum - An attack by sum tries to infer a value from a reported sum.
• Count - The count can be combined with the sum to produce some even more revealing results.
• Mean - The arithmetic mean (average) allows exact disclosure if the attacker can manipulate the subject population.
• Median

Tracker Attacks
• A tracker attack can fool the database manager into locating the desired data by using additional queries that produce small results.
• The tracker adds additional records to be retrieved for two different queries; the two sets of records cancel each other out, leaving only the statistic or data desired. The approach is to use intelligent padding of two queries.
• In other words, instead of trying to identify a unique value, we request n - 1 other values (where there are n values in the database). Given n and n - 1, we can easily compute the desired single element.

Tracker Attacks (Cont'd)
• For instance, suppose we wish to know how many female Caucasians live in Holmes Hall. A query posed might be

count ((SEX=F) ∧ (RACE=C) ∧ (DORM=Holmes))

The database management system might consult the database, find that the answer is 1, and refuse to answer that query because one record dominates the result of the query.

Tracker Attacks (Cont'd)
• The query
  q = count((SEX=F) ∧ (RACE=C) ∧ (DORM=Holmes))
  is of the form
  q = count(a ∧ b ∧ c)
• By using the rules of logic and algebra, we can transform this query to
  q = count(a ∧ b ∧ c) = count(a) - count(a ∧ ¬(b ∧ c))

Tracker Attacks (Cont'd)
• Thus, the original query is equivalent to
  count (SEX=F)
  minus
  count ((SEX=F) ∧ ((RACE ≠ C) ∨ (DORM ≠ Holmes)))

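The two padded queries can be checked against the sample database with a short Python sketch. The tuple layout and the helper `count` are illustrative assumptions; the rows are the Name, Sex, Race, and Dorm columns of the sample database shown earlier.

```python
# (NAME, SEX, RACE, DORM) columns copied from the sample database.
STUDENTS = [
    ("Adams", "M", "C", "Holmes"), ("Bailey", "M", "B", "Grey"),
    ("Chin", "F", "A", "West"), ("Dewitt", "M", "B", "Grey"),
    ("Earhart", "F", "C", "Holmes"), ("Fein", "F", "C", "West"),
    ("Groff", "M", "C", "West"), ("Hill", "F", "B", "Holmes"),
    ("Koch", "F", "C", "West"), ("Liu", "F", "A", "Grey"),
    ("Majors", "M", "C", "Grey"),
]


def count(predicate):
    """Number of records satisfying the predicate."""
    return sum(1 for row in STUDENTS if predicate(row))


# count(a): all female students.
female = count(lambda r: r[1] == "F")
# count(a AND NOT (b AND c)): females who are not both Caucasian and in Holmes.
padded = count(lambda r: r[1] == "F" and (r[2] != "C" or r[3] != "Holmes"))
# Neither query returns a suspiciously small count, yet their difference
# is the value the DBMS refused to release directly.
inferred = female - padded
```

Here the two permitted answers are 6 and 5, and their difference reproduces the suppressed count of 1.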

Controls for Statistical Inference Attacks
• Suppression - sensitive data values are not provided; the query is rejected without response.
• Concealing - the answer provided is close to but not exactly the actual value.

Controls for Statistical Inference Attacks (Cont'd)
• These two controls reflect the contrast between security and precision.
  • With suppression, any results provided are correct, yet many responses must be withheld to maintain security.
  • With concealing, more results can be provided, but the precision of the results is lower.
• The choice between suppression and concealing depends on the context of the database.

Random Sample
• With random sample control, a result is not derived from the whole database; instead the result is computed on a random sample of the database.
• The sample chosen is large enough to be valid.

Random Data Perturbation
• It is sometimes useful to perturb the values of the database by a small error.
• Generate a small random error term εi and add it to xi for statistical results.
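
Random data perturbation can be sketched as follows. The uniform error distribution, the bound `max_error`, and the function name `perturbed_mean` are assumptions made for illustration; the slides do not say how the error term εi is drawn. The Aid values come from the sample database.

```python
import random


def perturbed_mean(values, max_error, rng):
    """Mean of the values after adding a small random error to each one."""
    noisy = [x + rng.uniform(-max_error, max_error) for x in values]
    return sum(noisy) / len(noisy)


# Aid column of the sample database.
AID = [5000, 0, 3000, 1000, 2000, 1000, 4000, 5000, 0, 0, 2000]
rng = random.Random(1)                # seeded only to make the sketch repeatable
reported = perturbed_mean(AID, max_error=50, rng=rng)
exact = sum(AID) / len(AID)
```

Because each error is bounded by `max_error`, the reported mean stays within `max_error` of the exact mean while individual contributions are blurred.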

Query Analysis
• A more complex form of security uses query analysis.
• Here, a query and its implications are analyzed to determine whether a result should be provided.

Conclusion on the Inference Problem
• There are no perfect solutions to the inference problem.
• The approaches to controlling it:
  • Suppress obviously sensitive information.
  • Track what the user knows.
  • Disguise the data.

Aggregation
• Building sensitive results from less sensitive inputs.
• Data mining is the process of sifting through multiple databases and correlating multiple data elements to find useful information.

6.6. Multilevel Databases
• The Case for Differentiated Security

Name     Department      Salary  Phone   Performance
Rogers   training        43,800  4-5067  A2
Jenkins  research        62,900  6-4281  D4
Poling   training        38,200  4-4501  B1
Garland  user services   54,600  6-6600  A4
Hilten   user services   44,500  4-5351  B1
Davis    administration  51,400  4-9505  A3

The Case for Differentiated Security (Cont'd)
• Three characteristics of database security emerge.
  • The security of a single element may be different from the security of other elements of the same record or from other values of the same attribute. This situation implies that security should be implemented for each individual element.
  • Two levels, sensitive and nonsensitive, are inadequate to represent some security situations. Several grades of security may be needed.
  • The security of an aggregate (a sum, a count, or a group of values in a database) may differ from the security of the individual elements. The security of the aggregate may be higher or lower than that of the individual elements.

6.7. Proposals for Multilevel Security
• Separation
  • Partitioning
    • The database is divided into separate databases, each at its own level of sensitivity.
    • This control destroys a basic advantage of databases: elimination of redundancy and improved accuracy through having only one field to update.
    • It does not address the problem of a high-level user who needs access to some low-level data combined with high-level data.

Separation (Cont'd)
• Encryption
  • If sensitive data are encrypted, a user who accidentally receives them cannot interpret the data.

Cryptographic Separation: Different Encryption Keys.
Cryptographic Separation: Block Chaining.

Separation (Cont'd)
• Integrity Lock
  • First proposed at the U.S. Air Force Summer Study on Data Base Security [AFS83].
  • The lock is a way to provide both integrity and limited access for a database.
  • The operation was nicknamed "spray paint" because each element is figuratively painted with a color that denotes its sensitivity.

Separation (Cont'd)
• Integrity Lock (Cont'd)
  • The sensitivity label should be
    • unforgeable, so that a malicious subject cannot create a new sensitivity level for an element
    • unique, so that a malicious subject cannot copy a sensitivity level from another element
    • concealed, so that a malicious subject cannot even determine the sensitivity level of an arbitrary element

Separation (Cont'd)
• Integrity Lock (Cont'd)
  • The third piece of the integrity lock for a field is an error-detecting code, called a cryptographic checksum.

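A cryptographic checksum for one element can be sketched with a keyed hash. The choice of HMAC-SHA256, the field separator, and the function names `make_lock` and `verify_lock` are assumptions for illustration; the slides do not specify which algorithm the original integrity lock design used. Binding the record identifier and attribute name into the checksum keeps a label from being copied from one element to another.

```python
import hashlib
import hmac


def make_lock(key, record_id, attribute, value, sensitivity):
    """Checksum binding a data value to its position and sensitivity label."""
    # "\x1f" (unit separator) keeps the fields from running together.
    message = "\x1f".join([record_id, attribute, value, sensitivity]).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_lock(key, record_id, attribute, value, sensitivity, checksum):
    """True only if neither the value nor the label has been altered."""
    expected = make_lock(key, record_id, attribute, value, sensitivity)
    return hmac.compare_digest(expected, checksum)


KEY = b"key held only by the trusted procedure"   # hypothetical key
lock = make_lock(KEY, "record-1", "SALARY", "43,800", "secret")
```

Only a party holding the key can produce a valid checksum, which is what makes the label unforgeable in this sketch.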

Separation (Cont'd)
• Sensitivity Lock
  • A sensitivity lock is a combination of a unique identifier (such as the record number) and the sensitivity level.
  • Because the identifier is unique, each lock relates to one particular record.
  • Many different elements will have the same sensitivity level.
  • A malicious subject should not be able to identify two elements having identical sensitivity levels or identical data values just by looking at the sensitivity level portion of the lock.

Designs of Multilevel Secure Database
• Integrity Lock
  • A short-term solution to the security problem for multilevel databases.
  • The intention was to be able to use any (untrusted) database manager with a trusted procedure that handles access control.
  • The sensitive data were obliterated or concealed with encryption that protected both a data item and its sensitivity.

Trusted Database Manager

Designs of Multilevel Secure Database
• Integrity Lock (Cont'd)
  • The efficiency of integrity locks is a serious drawback.
    • The space needed for storing an element must be expanded to contain the sensitivity label.
    • Processing time is a second efficiency problem.
  • The untrusted database manager sees all data.

Trusted Front End
• A trusted front end is also known as a guard and operates much like the reference monitor.

Trusted Front End (Cont'd)
1. A user identifies himself or herself to the front end; the front end authenticates the user's identity.
2. The user issues a query to the front end.
3. The front end verifies the user's authorization to data.
4. The front end issues a query to the database manager.
5. The database manager performs I/O access, interacting with low-level access control to achieve access to actual data.
6. The database manager returns the result of the query to the trusted front end.

Trusted Front End (Cont'd)
7. The front end analyzes the sensitivity levels of the data items in the result and selects those items consistent with the user's security level.
8. The front end transmits selected data to the untrusted front end for formatting.
9. The untrusted front end transmits formatted data to the user.

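Step 7, in which the front end keeps only the items consistent with the user's security level, can be sketched as a simple filter. The level names, their ordering, and the function name `release` are hypothetical; the example values echo the salary table from Section 6.6 with made-up labels.

```python
# Hypothetical sensitivity labels, ordered from least to most sensitive.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2}


def release(result, user_level):
    """Return the values whose sensitivity does not exceed the user's level."""
    limit = LEVELS[user_level]
    return [value for value, level in result if LEVELS[level] <= limit]


# Result returned by the untrusted database manager: (value, label) pairs.
query_result = [
    ("Rogers", "unclassified"),
    ("43,800", "secret"),
    ("4-5067", "confidential"),
]
visible = release(query_result, "confidential")
```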

y Commutative Filters
y A process that
h forms
f an interface
i f betweenb the
h user andd a DBMS.
DBMS
y The filter reformats the query so that the database manager does as much
of the work as possible
possible, screening out many unacceptable records
records.
y The filter then provides a second screening to select only data to which the
user has access.
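
A simplified version of this two-stage screening, using Python's built-in sqlite3 module as the stand-in DBMS. The table, the numeric level column, and the function name are assumptions made for illustration. The where_clause argument is spliced into the SQL text, so it is assumed to come from a trusted query parser, never directly from a user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER, level INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Adams", 40000, 0), ("Baker", 90000, 2), ("Chen", 55000, 1)],
)

def filtered_query(where_clause, params, user_level):
    """Run a user query through a simplified commutative filter."""
    # First screening: reformat the query so the DBMS itself drops
    # records above the user's level.
    sql = (
        "SELECT name, salary, level FROM staff "
        f"WHERE ({where_clause}) AND level <= ? ORDER BY name"
    )
    rows = conn.execute(sql, (*params, user_level)).fetchall()
    # Second screening: re-check every returned row, since the DBMS
    # is not trusted to have applied the condition correctly.
    return [(name, salary) for name, salary, level in rows if level <= user_level]

print(filtered_query("salary > ?", (30000,), user_level=1))
# [('Adams', 40000), ('Chen', 55000)]
```

The first screening keeps the bulk of the work inside the DBMS; the second costs little because it only touches rows that already passed.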

Distributed Databases
• Distributed or federated database
• A trusted front end controls access to two unmodified commercial
DBMSs:
  • one for all low-sensitivity data and
  • one for all high-sensitivity data.
• The distributed database design is not popular because the front
end, which must be trusted, is complex, potentially including most
of the functionality of a full DBMS itself.

Window/View
• One of the advantages of using a DBMS for multiple users of
different interests (but not necessarily different sensitivity levels) is
the ability to create a different view for each user.
• Each user is restricted to a picture of the data reflecting only what
the user needs to see.
• A window (or a view) is a subset of a database, containing exactly
the information that a user is entitled to access.
• A view can represent a single user's subset database so that all of a
user's queries access only that database.
(a) Airline's View.

FLT#  ORIG  DEST  DEP   ARR   CAP  TYPE   PILOT    TAIL
362   JFK   BWI   0830  0950  114  PASS   Dosser   2463
397   JFK   ORD   0830  1020  114  PASS   Bottoms  3621
202   IAD   LGW   1530  0710  183  PASS   Jevins   2007
749   LGA   ATL   0947  1120  0    CARGO  Witt     3116
286   STA   SFO   1020  1150  117  PASS   Gross    4026

(b) Travel Agent's View.

FLT  ORIG  DEST  DEP   ARR   CAP
362  JFK   BWI   0830  0950  114
397  JFK   ORD   0830  1020  114
202  IAD   LGW   1530  0710  183
286  STA   SFO   1020  1150  117
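
The two tables can be reproduced with an SQL view; the sketch below uses Python's built-in sqlite3 module. The table and view names are made up for illustration, and the rule that the travel agent sees only passenger flights is an assumption inferred from flight 749 being absent from view (b).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (flt INTEGER, orig TEXT, dest TEXT, dep TEXT, arr TEXT,"
    " cap INTEGER, type TEXT, pilot TEXT, tail INTEGER)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (362, "JFK", "BWI", "0830", "0950", 114, "PASS", "Dosser", 2463),
        (397, "JFK", "ORD", "0830", "1020", 114, "PASS", "Bottoms", 3621),
        (202, "IAD", "LGW", "1530", "0710", 183, "PASS", "Jevins", 2007),
        (749, "LGA", "ATL", "0947", "1120", 0, "CARGO", "Witt", 3116),
        (286, "STA", "SFO", "1020", "1150", 117, "PASS", "Gross", 4026),
    ],
)

# The view exposes six of the nine columns and hides the cargo flight.
conn.execute(
    "CREATE VIEW travel_agent AS"
    " SELECT flt, orig, dest, dep, arr, cap FROM flights WHERE type = 'PASS'"
)

rows = conn.execute("SELECT * FROM travel_agent ORDER BY flt").fetchall()
for row in rows:
    print(row)
```

Queries issued against travel_agent cannot name the pilot or tail columns at all, which is the sense in which the view is a subset database.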


Secure Database Decomposition


6.8. Data Mining
• Databases are great repositories of data. More data are being collected
and saved.
• But to find needles of information in those vast fields of haystacks of
data requires intelligent analyzing and querying of the data.
• Indeed, a whole specialization, called data mining, has emerged.
• In a largely automated way, data mining applications sort and search
through data.

• Data mining uses statistics, machine learning, mathematical models,
pattern recognition, and other techniques to discover patterns and
relations on large datasets.
• Data mining tools use association (one event often goes with another),
sequences (one event often leads to another), classification (events
exhibit patterns, for example coincidence), clustering (some items have
similar characteristics), and forecasting (past events foretell future ones).
• Data mining presents probable relationships, but these are not
necessarily cause-and-effect relationships.
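
As a small illustration of association ("one event often goes with another"), the sketch below counts how often pairs of items occur together. The transactions are invented, and support and confidence are the usual association-rule measures, brought in here as an assumption; the slides do not name them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each is the set of events seen together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

item_counts = Counter()
pair_counts = Counter()
for t in transactions:
    item_counts.update(t)
    pair_counts.update(frozenset(p) for p in combinations(sorted(t), 2))

def support(a, b):
    """Fraction of all transactions that contain both a and b."""
    return pair_counts[frozenset((a, b))] / len(transactions)

def confidence(a, b):
    """Among transactions containing a, the fraction that also contain b."""
    return pair_counts[frozenset((a, b))] / item_counts[a]

print(support("bread", "milk"))     # 0.5
print(confidence("bread", "milk"))  # 2 of the 3 bread transactions
```

A high confidence value says only that the two events tend to appear together in this data; it says nothing about whether one causes the other.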

Privacy and Sensitivity
• Because the goal of data mining is summary results, not individual
data items, you would not expect a problem with sensitivity of
individual data items.
• Unfortunately, that is not true. Why? ☺

INFORMATION POLICIES
The principles describe the rights of individuals, not requirements on
collectors (i.e., the principles do not require protection of the data collected).
• Collection limitation - Data should be obtained lawfully and fairly.
• Data quality - Data should be relevant to their purposes, accurate,
complete, and up-to-date.
• Purpose specification - The purposes for which data will be used
should be identified and the data destroyed if no longer necessary to
serve that purpose.
• Use limitation - Use for purposes other than those specified is
authorized only with consent of the data subject or by authority of law.


• Security safeguards - Procedures to guard against loss, corruption,
destruction, or misuse of data should be established.
• Openness - It should be possible to acquire information about the
collection, storage, and use of personal data systems.
• Individual participation - The data subject normally has a right to
access and to challenge data relating to her.
• Accountability - A data controller should be designated and
accountable for complying with the measures to give effect to the
principles.
Ware [WAR73b] raises the problem of linking data
in multiple files and of overusing keys, such as
social security numbers, that were never intended
to be used to link records.

Turn and Ware [TUR75] consider protecting the
data themselves, recognizing that collections of
data will be attractive targets for unauthorized
access attacks.

FOUR WAYS TO PROTECT STORED DATA:

• Reduce exposure by limiting the amount of data maintained, asking
for only what is necessary and using random samples instead of
complete surveys.
• Reduce data sensitivity by interchanging data items or adding subtle
errors to the data (and warning recipients that the data have been
altered).
• Anonymize the data by removing or modifying identifying data items.
• Encrypt the data.
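
The first three measures can be sketched with the standard library alone. The records, field names, and noise range below are invented for illustration. Encryption is omitted because Python's standard library has no symmetric cipher; a real system would use a vetted cryptographic library.

```python
import random

# Hypothetical survey records; every field name here is illustrative.
records = [
    {"name": "A. Jones", "ssn": "000-00-0001", "city": "Denver", "income": 52000},
    {"name": "B. Smith", "ssn": "000-00-0002", "city": "Boston", "income": 61000},
    {"name": "C. Patel", "ssn": "000-00-0003", "city": "Austin", "income": 47000},
    {"name": "D. Nguyen", "ssn": "000-00-0004", "city": "Denver", "income": 75000},
]

def reduce_exposure(rows, k, rng):
    """Keep a random sample of k records instead of the complete survey."""
    return rng.sample(rows, k)

def reduce_sensitivity(rows, rng, noise=500):
    """Add a small random error to each income; recipients must be told."""
    return [{**r, "income": r["income"] + rng.randint(-noise, noise)} for r in rows]

def anonymize(rows, identifying=("name", "ssn")):
    """Remove the identifying data items from each record."""
    return [{k: v for k, v in r.items() if k not in identifying} for r in rows]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
released = anonymize(reduce_sensitivity(reduce_exposure(records, 2, rng), rng))
print(released)
```

Each function returns new records rather than changing the originals, so the data holder can keep the unaltered set under tighter control while releasing the reduced one.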
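U.S. PRIVACY LAWS

Compared with privacy statements posted before HIPAA, those posted
afterward showed the following:
• Statements on data transfer (to other organizations) were more explicit
than before HIPAA.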
• Consumers still had little control over the disclosure or dissemination
of their data.
• Statements were longer and more complex, making them harder for
consumers to understand.
• Even within the same industry branch (such as drug companies),
statements varied substantially, making it hard for consumers to
compare policies.
• Statements were unique to specific web pages, meaning they covered
more precisely the content and function of a particular page.

Controls on U.S. Government Web Sites


Government web sites address five factors:
• Notice - Data collectors must disclose their information practices
before collecting personal information from consumers.
• Choice - Consumers must be given a choice as to whether and how
personal information collected from them may be used.
• Access - Consumers should be able to view and contest the accuracy
and completeness of data collected about them.
• Security - Data collectors must take reasonable steps to ensure that
information collected from consumers is accurate and secure from
unauthorized use.
• Enforcement - A reliable mechanism must be in place to impose
sanctions for noncompliance with these fair information practices.
In 2002, the U.S. Congress enacted the e-Government Act of
2002 requiring that federal government agencies post privacy
policies on their web sites. Those policies must disclose
• the information that is to be collected
• the reason the information is being collected
• the intended use by the agency of the information
• the entities with whom the information will be shared
• the notice or opportunities for consent that would be provided
to individuals regarding what information is collected and
how that information is shared
• the way in which the information will be secured
• the rights of the individual under the Privacy Act and other
laws relevant to the protection of the privacy of an individual

Non-U.S. Privacy Principles

In 1981, the Council of Europe (an international body of 46 European
countries, founded in 1949) adopted Convention 108 for the protection
of individuals with regard to the automatic processing of personal data,
and in 1995, the European Union (E.U.) adopted Directive 95/46/EC on
the processing of personal data. Directive 95/46/EC, often called the
European Privacy Directive, requires that rights of privacy of
individuals be maintained and that data about them be
• processed fairly and lawfully
• collected for specified, explicit and legitimate purposes and not further
processed in a way incompatible with those purposes (unless
appropriate safeguards protect privacy)
• adequate, relevant, and not excessive in relation to the purposes for
which they are collected and/or further processed
• accurate and, where necessary, kept up to date; every reasonable step
must be taken to ensure that inaccurate or incomplete data, having
regard for the purposes for which they were collected or for which
they are further processed, are erased or rectified
• kept in a form that permits identification of data subjects for no longer
than is necessary for the purposes for which the data were collected or
for which they are further processed

Three more principles are added to the Fair Information Policies:

• Special protection for sensitive data.
• Data transfer.
• Independent oversight.
Special protection for sensitive data.
There should be greater restrictions on data collection and
processing that involves "sensitive data." Under the E.U. data protection
directive, information is sensitive if it involves "racial or ethnic origin,
political opinions, religious beliefs, philosophical or ethical persuasion . .
. [or] health or sexual life."

Data transfer
This principle explicitly restricts authorized users of
personal information from transferring that information to third parties
without the permission of the data subject.
Independent oversight.

• Entities that process personal data should not only be accountable but
should also be subject to independent oversight. In the case of the
government, this requires oversight by an office or department that is
separate and independent from the unit engaged in the data processing.
Under the data protection directive, the independent overseer must
have the authority to audit data processing systems, investigate
complaints brought by individuals, and enforce sanctions for
noncompliance.

Anonymity

• One way to preserve privacy is to guard our identity. Not every
context requires us to reveal our identity, so some people wear a form
of electronic mask.
• A person may want to do some things anonymously.
• Anonymity creates problems, too. How does an anonymous person
pay for something? A trusted third party (for example, a real estate
agent or a lawyer) can complete the sale and preserve anonymity. But
then you need a third party, and the third party knows who you
are. Chaum [CHA81, CHA82, CHA85] studied this problem and
devised a set of protocols by which such payments could occur
without revealing the buyer to the seller.
Government and policy

AUTHENTICATION:
• Government plays a complex role in personal authentication.
• Many government agencies use identifiers to perform their work.
• Authentication documents (such as passports and insurance cards)
often come from the government.
• The government may also regulate the businesses that use
identification and authentication keys.
• Sometimes the government obtains data based on those keys from
others.
• In these multiple roles, the government may misuse data
and violate privacy rights.

Data access risks:

• data errors: ranging from transcription errors to incorrect analysis
• inaccurate linking: two or more correct data items but incorrectly
linked on a presumed common element
• difference of form and content: precision, accuracy, format, and
semantic errors
• purposely wrong: collected from a source that intentionally gives
incorrect data, such as a forged identity card or a false address given
to mislead
• false positive: an incorrect or out-of-date conclusion that the
government does not have data to verify or reject, for example,
delinquency in paying state taxes
• mission creep: data acquired for one purpose leading to a broader use
because the data will support that mission.
• poorly protected: data of questionable integrity because of the way it
has been managed and handled

Steps to protect against privacy loss


• Data minimization - Obtain the least data necessary for the task. For
example, if the goal is to study the spread of a disease, only the
condition, date, and vague location (city or county) may suffice; the
name or contact information of the patient may be unnecessary.
• Data anonymization - Where possible, replace identifying information
with untraceable codes (such as a record number); but make sure those
codes cannot be linked to another database that reveals sensitive data.
• Audit trail - Record who has accessed data and when, both to help
identify responsible parties in the event of a breach and to document
the extent of damage.
• Security and controlled access - Adequately protect and control access
to sensitive data.
• Training - Ensure people accessing data understand what to protect and
how to do so.
• Quality - Take into account the purpose for which data were collected,
how they were stored, their age, and similar factors to determine the
usefulness of the data.
• Restricted usage - Different from controlling access, review all
proposed uses of the data to determine if those uses are consistent with
the purpose for which the data were collected and the manner in which
they were handled (validated, stored, controlled).



• Data left in place - If possible, leave data in place with the original
owner. This step helps guard against possible misuses of the data from
expanded mission just because the data are available.
• Policy - Establish a clear policy for data privacy. Do not encourage
violation of privacy policies.
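
Three of these steps (data minimization, data anonymization, and the audit trail) lend themselves to a short sketch. All names, the sample record, and the use of a keyed hash to derive record codes are assumptions made for illustration. The key is what keeps the codes from being recomputed, and so linked, by someone who holds only the patient identifiers.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical secret, stored apart from the released data.
CODE_KEY = b"kept-by-the-data-owner-only"

audit_trail = []  # (timestamp, user, record code) for every access

def record_code(patient_id):
    """Derive a record code that cannot be recomputed without CODE_KEY."""
    digest = hmac.new(CODE_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]

def minimize(record):
    """Keep only condition, date, and county; replace identity with a code."""
    return {
        "code": record_code(record["patient_id"]),
        "condition": record["condition"],
        "date": record["date"],
        "county": record["county"],
    }

def read_record(user, store, code):
    """Log who accessed which record and when, then return the record."""
    audit_trail.append((datetime.now(timezone.utc).isoformat(), user, code))
    return store.get(code)

raw = {"patient_id": "P-1001", "name": "Pat Doe", "phone": "555-0100",
       "condition": "influenza", "date": "2024-01-15", "county": "Lake"}
released = minimize(raw)
store = {released["code"]: released}
read_record("analyst1", store, released["code"])
print(released)
print(audit_trail)
```

The access is logged before the lookup, so even a request for a record that does not exist leaves a trace.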
Identity theft
• Identity theft is taking another person's identity.
• Use of another person’s credit card is fraud; taking out a new credit card in
that person's name is identity theft.
• Having relatively few unique keys facilitates identity theft: A thief who gets
one key can use that to get a second, and those two to get a third. Each key
gives access to more data and resources.
• Few companies or agencies are set up to ask truly discriminating
authentication questions (such as the grocery store at which you
frequently shop, the city to which you recently bought an airplane
ticket, or the third digit on line four of your last tax return).
• Because there are few authentication keys, we are often asked to give
the same key (such as mother's maiden name) out to many people, some of
whom might be part-time accomplices in identity theft.

Privacy Concepts
Aspects of Information Privacy
Information privacy has three aspects:
• sensitive data
• affected parties
• controlled disclosure
Controlled Disclosure
• Privacy is the right to control who knows certain aspects about you, your
communications, and your activities.
• Privacy is something over which you have considerable influence, and the key
point is that you decide.
• You do not have complete control, however. Anyone who has access to an object
can copy, transfer, or propagate that object or its content to others without
restriction.
Sensitive Data
• Some people find some data more sensitive than others.
• We know things people usually consider sensitive, such as financial status, certain
health data, unsavory events in their past, and the like, so if you learn something
you consider sensitive about someone, you will keep it quiet.

Here are examples of data many people consider private.

o identity, the ownership of private data and the ability to control its disclosure
o finances, credit, bank details
o legal matters
o medical conditions, drug use, DNA, genetic predisposition to illnesses
o voting, opinions, membership in advocacy organizations
o preferences: religion, sexuality
o biometrics, physical characteristics, polygraph results, fingerprints
o diaries, poems, correspondence, recorded thoughts
o privileged communications with professionals such as lawyers, accountants,
doctors, counselors, and clergy
o performance: school records, employment ratings
o activities: reading habits, web browsing, music, art, videos
o air travel data, general travel data, a person's location (present and past)
o communications: mail, e-mail, telephone calls, spam
o history: "youthful indiscretions," past events
o illegal activities, criminal records

• In general, a person's privacy expectations depend on context: who is affected
and what the prevailing norm of privacy is.
Affected Subject
• Privacy is an aspect of confidentiality.
• As we have learned, the three security goals of confidentiality,
integrity, and availability conflict, and confidentiality frequently conflicts with
availability.
• If you choose not to have your telephone number published in a directory, that
also means some people will not be able to reach you by telephone.

Summary
• Privacy is controlled disclosure: The subject chooses what personal data to give
out and to whom.
• After disclosing something, a subject relinquishes much control to the receiver.
• What data are sensitive is at the discretion of the subject; people consider
different things sensitive. Why a person considers something sensitive is less
important than that it is.
• Individuals, informal groups, and formal organizations all have things they
consider private.
• Privacy has a cost; choosing not to give out certain data may limit other benefits.
Computer-Related Privacy Problems

• The sensitivities and issues predate computers.


• Computers and networks have only affected the feasibility of some unwanted
disclosures.
• Public records offices have long been open for people to study the data held
there, but the storage capacity and speed of computers have given us the ability
to amass, search, and correlate.
• Search engines have given us the ability to find one data item out of billions, the
equivalent of finding one sheet of paper out of a warehouse full of boxes of
papers.
• Furthermore, the openness of networks and the portability of technology (such
as laptops, PDAs, cell phones, and memory devices) have greatly increased the
risk of disclosures affecting privacy.

Eight dimensions of privacy


• Information collection: Data are collected only with knowledge and explicit
consent.
• Information usage: Data are used only for certain specified purposes.
• Information retention: Data are retained for only a set period of time.
• Information disclosure: Data are disclosed to only an authorized set of people.
• Information security: Appropriate mechanisms are used to ensure the protection
of the data.
• Access control: All modes of access to all forms of collected data are controlled.
• Monitoring: Logs are maintained showing all accesses to data.
• Policy changes: Less restrictive policies are never applied after-the-fact to already
obtained data.
Privacy issues that have come about through use of computers:
Data Collection
• Disks on ordinary consumer PCs are measured in gigabytes (10^9 bytes), and
commercial storage capacities often measure in terabytes (10^12 bytes).
• In 2006, EMC Corporation announced a storage product whose capacity exceeds
one petabyte (10^15 bytes).
• Indiana University plans to acquire a supercomputer with one petabyte of
storage, and the San Diego Supercomputer Center has online storage of one
petabyte and offline archives of seven petabytes.
• Estimates of Google's stored data are also in the petabyte range. We have both
devices to store massive amounts of data and the data to fill those devices.
• Whereas physical space limited storing (and locating) massive amounts of printed
data, electronic data take relatively little space.
• We never throw away data; we just move it to slower secondary media or buy
more storage.

No Informed Consent
• Some data come from public and commercial sources (newspapers, web pages,
digital audio, and video recordings), others come from intentional data
transfers (tax returns, a statement to the police after an accident, readers'
survey forms, school papers), and still others are collected without
announcement.
• The user is not necessarily aware of this third category of data collection and thus
cannot be said to have given informed consent.
• Example: Telephone companies record the date, time, duration, source, and
destination of each telephone call. ISPs track sites visited. Some sites keep the IP
address of each visitor to the site (although an IP address is usually not unique to
a specific individual).
Loss of Control
• You have little control over the dissemination (or redissemination) of your data.
• We do not always appreciate the ramifications of lost control but once something
is out of your control on the web, it may never be deleted.
• The web is a great historical archive, but because of archives, caches, and mirror
sites, things posted on the web may never go away.
• Example: consider something written about you, by you, in a note or letter.
Suppose someone else has posted something on the web that is personal about
you and you want it removed. Even if the poster agrees, you may not be able to
remove all its traces.
• A second issue of loss of control concerns data exposure. Suppose a company
holds data about you and that company's records are exposed in a computer
attack. The company may not be responsible for preventing harm to you,
compensating you if you are harmed, or even informing you of the event.

Ownership of the Data


• Information about you is being sold and you have no control; nor do you get to
share in the profit.
• Even before computers, customer data were valuable. Mailing lists and customer
lists were company assets that were safeguarded against access by the
competition.
• Sometimes companies rented their mailing lists when there was not a conflict
with a competitor.
• But in those cases, the subject of the data, the name on the list, did not own the
right to be on the list or not.
• With computers the volume and sources of data have increased significantly, but
the subject still has no rights.
