Chapter 3 Data Storage
3.1.1 Data inside the computer
All data types are transformed into a uniform representation when they are stored in a computer and
transformed back to their original form when retrieved. This universal representation is called a bit
pattern.
Byte: 8 bits
Figure 3.3 Storage of different data types
3-2 STORING NUMBERS
A number is changed to the binary system before being stored in the computer memory, as
described in Chapter 2. However, there are still two issues that need to be handled:
3.2.1 Storing integers
Integers are whole numbers (numbers without a fractional part). For example, 134 and −125 are integers,
An integer can be thought of as a number in which the position of the decimal point is fixed: the decimal
For this reason, fixed-point representation is used to store an integer, as shown in Figure 3.4. In this
Figure 3.4 Fixed point representation of integers
fixed-point representation.
Unsigned representation
An unsigned integer is an integer that can never be negative and can take only 0 or positive values. Its
Example 3.1
Example 3.2
Solution
First change the integer to binary (100000010)2. Add seven 0s to
make a total of sixteen bits, (0000000100000010)2. The integer is
stored in the memory location.
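The padding step in this solution can be sketched in Python. This is a minimal illustration, not the book's procedure: the function names are hypothetical, and 258 is simply the decimal value of (100000010)2. Raising an error when the value does not fit is a choice made for this sketch.

```python
def store_unsigned(value: int, n_bits: int) -> str:
    """Return the n-bit unsigned pattern for value, padded with 0s on the left."""
    if value < 0 or value > 2 ** n_bits - 1:
        # The value cannot be represented in n_bits (overflow).
        raise OverflowError(f"{value} does not fit in {n_bits} unsigned bits")
    return format(value, f"0{n_bits}b")


def retrieve_unsigned(pattern: str) -> int:
    """Interpret a bit pattern as an unsigned integer."""
    return int(pattern, 2)


print(store_unsigned(258, 16))                # 0000000100000010
print(retrieve_unsigned("0000000100000010"))  # 258
```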
Example 3.3
Solution
Using the procedure shown in Chapter 2, the binary integer is
converted to the unsigned integer 43.
Figure 3.5 shows what happens if we try to store an integer that is larger than $2^4 - 1 = 15$ in a memory
Sign-and-magnitude representation
In this method, the available range for unsigned integers (0 to $2^n - 1$) is divided into two equal
sub-ranges. The first half represents positive integers, the second half, negative integers.
In sign-and-magnitude representation, the leftmost bit defines the sign of the integer. If it is
Example 3.4
Solution
The integer is changed to 7-bit binary. The leftmost bit is set to 0.
The 8-bit number is stored.
Example 3.5
Solution
The integer is changed to 7-bit binary. The leftmost bit is set to 1.
The 8-bit number is stored.
Example 3.6
Solution
Since the leftmost bit is 0, the sign is positive. The rest of the bits
(1001101) are changed to decimal as 77. After adding the sign,
the integer is +77.
Example 3.7
Solution
Since the leftmost bit is 1, the sign is negative. The rest of the
bits (0100001) are changed to decimal as 33. After adding the
sign, the integer is −33.
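The store and retrieve steps used in Examples 3.4 to 3.7 can be sketched as follows. The function names are hypothetical, and the 8-bit default follows the examples; the two patterns decoded at the end are built from the sign bit and the seven magnitude bits quoted in the solutions above.

```python
def store_sign_magnitude(value: int, n_bits: int = 8) -> str:
    """Leftmost bit holds the sign; the remaining n_bits - 1 bits hold the magnitude."""
    magnitude = abs(value)
    if magnitude > 2 ** (n_bits - 1) - 1:
        raise OverflowError(f"{value} does not fit in {n_bits} sign-and-magnitude bits")
    sign_bit = "1" if value < 0 else "0"
    return sign_bit + format(magnitude, f"0{n_bits - 1}b")


def retrieve_sign_magnitude(pattern: str) -> int:
    """Read the sign from the leftmost bit and the magnitude from the rest."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude


print(retrieve_sign_magnitude("01001101"))  # 77
print(retrieve_sign_magnitude("10100001"))  # -33
print(store_sign_magnitude(-33))            # 10100001
```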
Figure 3.7 shows both positive and negative overflow when storing an integer in sign-and-magnitude
Almost all computers use two’s complement representation to store a signed integer in an n-bit memory
location. In this method, the available range for an unsigned integer of (0 to $2^n - 1$) is divided into two
equal sub-ranges. The first sub-range is used to represent nonnegative integers, the second half to
represent negative integers. The bit patterns are then assigned to negative and nonnegative (zero and
Figure 3.8 Two’s complement representation
In two’s complement representation, the leftmost bit defines the sign of the integer. If it is 0,
One’s Complementing
Before we discuss this representation further, we need to introduce two operations. The first is called one’s complementing or
taking the one’s complement of an integer. The operation can be applied to any integer, positive or negative. This operation
simply reverses (flips) each bit: a 0-bit is changed to a 1-bit, and a 1-bit is changed to a 0-bit.
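A minimal sketch of the bit-flipping operation, working on a bit pattern held as a string. The function name and the sample pattern are hypothetical. Applying the operation twice gives back the original pattern.

```python
def ones_complement(pattern: str) -> str:
    """Flip every bit: each 0 becomes 1 and each 1 becomes 0."""
    return "".join("1" if bit == "0" else "0" for bit in pattern)


original = "00110110"
flipped = ones_complement(original)
print(flipped)                    # 11001001
print(ones_complement(flipped))   # 00110110
```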
Example 3.8
Example 3.9
Two’s Complementing
The second operation is called two’s complementing or taking the two’s complement of an integer in binary.
This operation is done in two steps. First, we copy bits from the right until a 1 is copied; then, we flip
Example 3.10
Example 3.11
An alternative way to take the two’s complement of an integer is to first take the one’s
Example 3.12
Solution
The integer is positive (no sign means positive), so after decimal
to binary transformation no more action is needed. Note that
three extra 0s are added to the left of the integer to make it eight
bits.
Example 3.13
Solution
The integer is negative, so after changing to binary, the computer
applies the two’s complement operation on the integer.
Example 3.14
Solution
The leftmost bit is 0, so the sign is positive. The integer is
changed to decimal and the sign is added.
Example 3.15
Solution
The leftmost bit is 1, so the integer is negative. The integer needs
to be two’s complemented before changing to decimal.
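The two's complement operation and the store and retrieve rules from Examples 3.12 to 3.15 can be sketched together. All function names are hypothetical, and 28 and −28 are assumed sample values (28 is consistent with a five-bit binary integer that needs three extra 0s to fill eight bits). The first function uses the copy-from-the-right method; the second uses the alternative of taking the one's complement and adding 1.

```python
def twos_complement(pattern: str) -> str:
    """Copy bits from the right up to and including the first 1, then flip the rest."""
    last_one = pattern.rfind("1")
    if last_one == -1:
        return pattern  # all zeros: there is no 1 to copy, nothing changes
    flipped = "".join("1" if b == "0" else "0" for b in pattern[:last_one])
    return flipped + pattern[last_one:]


def twos_complement_alt(pattern: str) -> str:
    """Alternative: take the one's complement, then add 1 (discarding any carry out)."""
    n = len(pattern)
    ones = int(pattern, 2) ^ (2 ** n - 1)
    return format((ones + 1) % 2 ** n, f"0{n}b")


def store_twos_complement(value: int, n_bits: int = 8) -> str:
    """Positive: plain binary. Negative: binary of the magnitude, then two's complement."""
    if not -(2 ** (n_bits - 1)) <= value <= 2 ** (n_bits - 1) - 1:
        raise OverflowError(f"{value} does not fit in {n_bits} two's complement bits")
    pattern = format(abs(value), f"0{n_bits}b")
    return twos_complement(pattern) if value < 0 else pattern


def retrieve_twos_complement(pattern: str) -> int:
    """Leftmost bit 0: read as is. Leftmost bit 1: two's complement first, then negate."""
    if pattern[0] == "0":
        return int(pattern, 2)
    return -int(twos_complement(pattern), 2)


print(store_twos_complement(28))             # 00011100
print(store_twos_complement(-28))            # 11100100
print(retrieve_twos_complement("11100100"))  # -28
```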
3-3 STORING TEXT
A section of text in any language is a sequence of symbols used to represent an idea in that
language. For example, the English language uses 26 symbols (A, B, C,…, Z) to represent
uppercase letters, 26 symbols (a, b, c, …, z) to represent lowercase letters, ten symbols (0, 1,
2, …, 9) to represent numeric characters, and symbols (., ?, :, ;, …, !) to represent
punctuation. Other symbols such as blank, newline, and tab are used for text alignment and
readability.
We can represent each symbol with a bit pattern. In other words,
text such as “CATS”, which is made up from four symbols, can
be represented as four n-bit patterns, each pattern defining a
single symbol (Figure 3.14).
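As an illustration, the sketch below turns “CATS” into four bit patterns. It assumes 8-bit patterns and uses the character codes that Python's built-in `ord` returns (which match ASCII for these letters); the function name is hypothetical, and the patterns in Figure 3.14 may use a different code.

```python
def encode_text(text: str, n_bits: int = 8) -> list[str]:
    """Map each symbol to an n-bit pattern based on its character code."""
    return [format(ord(symbol), f"0{n_bits}b") for symbol in text]


print(encode_text("CATS"))
# ['01000011', '01000001', '01010100', '01010011']
```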
Ex: How many bits (b) are needed to represent n symbols?
$b = \lceil \log_2 n \rceil$
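Because b must be a whole number of bits, the logarithm is rounded up to the next integer. A small sketch of the calculation follows; the function name is hypothetical, and integer arithmetic is used instead of a floating-point logarithm to avoid rounding surprises.

```python
def bits_needed(n_symbols: int) -> int:
    """Smallest b such that 2**b >= n_symbols, i.e. the ceiling of log2(n_symbols)."""
    if n_symbols < 1:
        raise ValueError("need at least one symbol")
    return (n_symbols - 1).bit_length()


print(bits_needed(26))   # 5  (26 uppercase letters)
print(bits_needed(128))  # 7
print(bits_needed(129))  # 8
```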
ASCII
Unicode
Other Codes
See Appendix A
3-4 STORING AUDIO
Audio is a representation of sound or music. Audio, by nature, is different than the numbers or
text we have discussed so far. Text is composed of countable entities (characters): we can
count the number of characters in text. Text is an example of digital data. In contrast, audio is
not countable. Audio is an example of analog data. Even if we are able to measure all its
values in a period of time, we cannot store these in the computer’s memory, as we would need
an infinite number of memory locations. Figure 3.15 shows the nature of an analog signal, such
as audio, that varies with time.
Figure 3.15 An audio signal
3.4.1 Sampling
If we cannot record all the values of an audio signal over an interval, we can record some of them.
Sampling means that we select only a finite number of points on the analog signal, measure their values,
The value measured for each sample is a real number. At a sampling rate of 40,000 samples per second, this means storing 40,000 real values
for each second of audio. However, it is simpler to use an unsigned integer (a bit pattern) for each
sample. Quantization refers to a process that rounds the value of a sample to the closest integer value.
For example, if the real value is 17.2, it can be rounded down to 17; if the value is 17.7, it can be
rounded up to 18.
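Sampling followed by quantization can be sketched as below. The function names and the test signal (a 440 Hz sine wave shifted so that all values are positive) are assumptions made for illustration only.

```python
import math


def sample_signal(signal, duration_s: float, samples_per_second: int) -> list[float]:
    """Measure the signal at evenly spaced points in time."""
    count = int(duration_s * samples_per_second)
    return [signal(i / samples_per_second) for i in range(count)]


def quantize(samples: list[float]) -> list[int]:
    """Round each sample to the closest integer (halves are rounded up)."""
    return [math.floor(value + 0.5) for value in samples]


# Hypothetical analog signal: a 440 Hz tone, shifted to stay positive.
def tone(t: float) -> float:
    return 20 + 5 * math.sin(2 * math.pi * 440 * t)


samples = sample_signal(tone, duration_s=1, samples_per_second=40_000)
print(len(samples))            # 40000
print(quantize([17.2, 17.7]))  # [17, 18]
```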
3.4.2 Encoding
The quantized sample values need to be encoded as bit patterns. Some systems assign positive and
negative values to samples, some just shift the curve to the positive part and assign only positive values.
If we call the bit depth or number of bits per sample B, and the number of samples per second S, we need to
store S × B bits for each second of audio. This product is sometimes referred to as the bit rate, R. For example,
if we use 40,000 samples per second and 16 bits per sample, the bit rate is
$R = S \times B = 40{,}000 \times 16 = 640{,}000$ bits per second.
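The bit rate calculation is a single multiplication; a short sketch with a hypothetical function name:

```python
def bit_rate(samples_per_second: int, bits_per_sample: int) -> int:
    """Bits stored for each second of audio: R = S x B."""
    return samples_per_second * bits_per_sample


print(bit_rate(40_000, 16))  # 640000
print(bit_rate(44_100, 16))  # 705600
```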
3.4.4 Standards for sound encoding
Today the dominant standard for storing audio is MP3 (short for MPEG Layer 3). This standard is a
modification of the MPEG (Moving Picture Experts Group) compression method used for video. It
uses 44,100 samples per second and 16 bits per sample. The result is a signal with a bit rate of 705,600
bits per second, which is compressed using a method that discards information that cannot be
detected by the human ear. This is called lossy compression, as opposed to lossless compression: see
Chapter 15.
3-5 STORING IMAGES
Images are stored in computers using two different techniques: raster graphics and vector
graphics.
Raster graphics (or bitmap graphics) is used when we need to store an analog image such as a
photograph. A photograph consists of analog data, similar to audio information; the difference is that
the intensity (color) of data varies in space instead of in time. This means that data must be sampled.
However, sampling in this case is normally called scanning. The samples are called pixels (picture
elements).
Resolution
Just like audio sampling, in image scanning we need to decide how many pixels we need to record for
each square or linear inch. The scanning rate in image processing is called resolution. If the resolution is
sufficiently high, the human eye cannot recognize the discontinuity in reproduced images.
Color depth
The number of bits used to represent a pixel, its color depth, depends on how the pixel’s color is handled
by different encoding techniques. The perception of color is how our eyes respond to a beam of light.
Our eyes have different types of photoreceptor cells: some respond to the three primary colors red, green,
and blue (often called RGB), while others merely respond to the intensity of light.
True-Color
One of the techniques used to encode a pixel is called True-Color, which uses 24 bits to encode a pixel.
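A sketch of how a 24-bit True-Color pixel can be built, assuming eight bits for each of the red, green, and blue intensities (0 to 255). The function names and the choice to place red in the leftmost eight bits are assumptions for illustration.

```python
def pack_true_color(red: int, green: int, blue: int) -> str:
    """Combine three 8-bit intensities into one 24-bit pattern (red leftmost)."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must be in the range 0 to 255")
    return format((red << 16) | (green << 8) | blue, "024b")


def unpack_true_color(pattern: str) -> tuple[int, int, int]:
    """Split a 24-bit pattern back into (red, green, blue)."""
    value = int(pattern, 2)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


print(pack_true_color(255, 0, 0))                     # 111111110000000000000000
print(unpack_true_color("111111110000000000000000"))  # (255, 0, 0)
```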
Indexed color
The indexed color (or palette color) scheme uses only a portion of these colors: each pixel is stored as an 8-bit index into a palette of up to 256 colors.
following shows the number of bits that need to be stored using each scheme:
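As a rough comparison of the two schemes, the sketch below computes the uncompressed number of bits for one image. The image size (1024 × 768 pixels) and the function name are hypothetical, and the small amount of storage needed for the palette itself is ignored.

```python
def raw_image_bits(pixel_count: int, bits_per_pixel: int) -> int:
    """Uncompressed size of an image: one bit pattern per pixel."""
    return pixel_count * bits_per_pixel


pixels = 1024 * 768  # hypothetical image size, chosen only for illustration

print(raw_image_bits(pixels, 24))  # True-Color:    18874368
print(raw_image_bits(pixels, 8))   # indexed color:  6291456
```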
Standards for image encoding
Several de facto standards for image encoding are in use. JPEG (Joint Photographic Experts Group)
uses the True-Color scheme, but compresses the image to reduce the number of bits (see Chapter 15).
GIF (Graphics Interchange Format), on the other hand, uses the indexed color scheme.
3.5.2 Vector graphics
Raster graphics has two disadvantages: the file size is large and rescaling is troublesome. To enlarge a
raster graphics image means enlarging the pixels, so the image looks ragged when it is enlarged. The
vector graphic image encoding method, however, does not store the bit patterns for each pixel. An
image is decomposed into a combination of geometrical shapes such as lines, squares, or circles.
For example, consider a circle of radius r. The main pieces of information a program needs to draw this
circle are:
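One way to picture a vector description of a circle is as a small record of numbers rather than a grid of pixels. The sketch below is only an assumption about what such a record might hold (a center position and a radius); the class and field names are hypothetical. Rescaling changes the stored numbers, so no pixels need to be enlarged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    """Hypothetical vector description of a circle: a few numbers, not pixels."""
    center_x: float
    center_y: float
    radius: float

    def scaled(self, factor: float) -> "Circle":
        """Return the same shape at a different size; the shape itself stays exact."""
        return Circle(self.center_x * factor, self.center_y * factor, self.radius * factor)


small = Circle(center_x=10.0, center_y=10.0, radius=4.0)
print(small.scaled(3))  # Circle(center_x=30.0, center_y=30.0, radius=12.0)
```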
3-6 STORING VIDEO
Video is a representation of images (called frames) over time. A movie consists of a series of
frames shown one after another. In other words, video is the representation of information that
changes in space and in time. So, if we know how to store an image inside a computer, we
also know how to store video: each image or frame is transformed into a set of bit patterns
and stored. The combination of the images then represents the video.
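Since each frame is stored as an image, the uncompressed size of a video follows from the size of one frame and the number of frames. The sketch below uses hypothetical values (640 × 480 pixels, 24 bits per pixel, 30 frames per second) and a hypothetical function name; real video formats compress this data heavily.

```python
def raw_video_bits(pixels_per_frame: int, bits_per_pixel: int,
                   frames_per_second: int, seconds: int) -> int:
    """Uncompressed size: bits per frame times the number of frames."""
    bits_per_frame = pixels_per_frame * bits_per_pixel
    return bits_per_frame * frames_per_second * seconds


# One second of hypothetical 640 x 480 True-Color video at 30 frames per second.
print(raw_video_bits(640 * 480, 24, 30, 1))  # 221184000
```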
https://drive.google.com/file/d/12kVSE1DQJm2S1b2FD3tpiEuD7La3_MPF/view?usp=sharing