CS NumberSystem

Uploaded by

groy41744
Copyright
© © All Rights Reserved

UNIT-2

Data Representation: Number Systems-Binary, Octal, Hexadecimal, Complements, Arithmetic Operations. Floating Point Representation

Register Transfer and Micro Operations: Register Transfer Language, Register Transfer, Bus and Memory Transfer, Arithmetic Micro-Operations, Logic
Micro-Operations, Shift Micro-Operations, Arithmetic Logic Shift Unit.
UNIT-3
Basic Computer Organization and Design: Instruction Codes, Computer Registers; Common Bus System. Computer Instructions; Instruction Formats,
Instruction Cycle, Fetch and Decode, Flowchart for Instruction Cycle, Register Reference Instructions, Addressing Modes.

CPU Design: Specifying A CPU, Design and Implementation of a simple CPU (Fetching Instructions from Memory Decoding and Executing Instructions,
Establishing Required Data Paths).

Number System

A number system is a system of writing used to express numbers. It is the mathematical notation for representing the numbers of a given set
consistently, using digits or other symbols. It provides a unique representation of every number and reflects the arithmetic and algebraic
structure of the figures. It also allows us to perform arithmetic operations such as addition, subtraction, multiplication and division.

The value of any digit in a number can be determined by:

• The digit
• Its position in the number
• The base of the number system

Before discussing the different types of number systems with examples, let us first discuss what a number is.

What is a Number?

A number is a mathematical value used for counting, measuring or labelling objects. Numbers are used to perform arithmetic
calculations. Examples of numbers are natural numbers, whole numbers, rational and irrational numbers, etc. 0 is also a number; it represents a null
value.

Numbers can be classified further, for example as even or odd, and as prime or composite. Even and odd describe whether a number is
divisible by 2 or not, whereas prime and composite distinguish numbers that have exactly two factors from those that have more than two factors.

In a number system, these numbers are used as digits. 0 and 1 are the most common digits; together they are used to represent binary
numbers. The digits 0 to 9 are also used in other number systems. Let us now learn the types of number systems.

Types of Number Systems

There are various types of number systems in mathematics. The four most common number system types are:

1. Decimal number system (Base-10)
2. Binary number system (Base-2)
3. Octal number system (Base-8)
4. Hexadecimal number system (Base-16)

Now, let us discuss the different types of number systems with examples.

Decimal Number System (Base 10 Number System)

The decimal number system has a base of 10 because it uses ten digits, 0 to 9. In the decimal number system, successive positions to the left
of the decimal point represent units, tens, hundreds, thousands and so on. Every position corresponds to a
particular power of the base (10).

Example of Decimal Number System:

The decimal number 1457 consists of the digit 7 in the units position, 5 in the tens position, 4 in the hundreds position and 1 in the thousands
position. Its value can be written as:

(1×10^3) + (4×10^2) + (5×10^1) + (7×10^0)

= (1×1000) + (4×100) + (5×10) + (7×1)

= 1000 + 400 + 50 + 7

= 1457

Binary Number System (Base 2 Number System)

The base-2 number system is also known as the binary number system. It uses only two digits, 0 and 1, so its radix is 2. Numbers written in this
system are called binary numbers and are combinations of 0s and 1s. For example, 110101 is a binary number.

We can convert a number from any other number system into binary and vice versa.

Example

Write (14)_{10} as a binary number.

Solution:

[Figure: Base 2 Number System Example]

∴ (14)_{10} = (1110)_2

Octal Number System (Base 8 Number System)

In the octal number system, the base is 8 and the digits 0 to 7 are used to represent numbers. Octal numbers are commonly used in computer
applications. An octal number is converted to decimal by the same positional expansion used in the decimal system, with powers of 8 in place of
powers of 10, as the example below shows.

Example: Convert (215)_8 into decimal.

Solution:

(215)_8 = 2 × 8^2 + 1 × 8^1 + 5 × 8^0

= 2 × 64 + 1 × 8 + 5 × 1

= 128 + 8 + 5

= (141)_{10}

Hexadecimal Number System (Base 16 Number System)

In the hexadecimal system, numbers are written with base 16. The first ten digits are the same as in the decimal system, i.e. 0 to 9. The remaining
six digits, with values 10 to 15, are represented by the letters A to F. The table below shows the
representation of numbers in the hexadecimal number system.

Hexadecimal 0 1 2 3 4 5 6 7 8 9 A B C D E F
Decimal 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Number System Chart

In the number system chart, the base values and the digits of different number systems can be found. Below is the chart of the numeral system.

Number System Conversion

Numbers can be represented in any of the number system categories like binary, decimal, hexadecimal, etc. Also, any number which is represented in
any of the number system types can be easily converted to another.

------

The general representations of number systems are:

Decimal Number – Base 10 – N_{10}

Binary Number – Base 2 – N_2

Octal Number – Base 8 – N_8

Hexadecimal Number – Base 16 – N_{16}

Number System Conversion Table

Binary    Octal    Decimal    Hexadecimal

0000      0        0          0
0001      1        1          1
0010      2        2          2
0011      3        3          3
0100      4        4          4
0101      5        5          5
0110      6        6          6
0111      7        7          7
1000      10       8          8
1001      11       9          9
1010      12       10         A
1011      13       11         B
1100      14       12         C
1101      15       13         D
1110      16       14         E
1111      17       15         F

Number System Conversion Methods

Number system conversion changes the base in which a number is written, for example from decimal (base 10) to binary (base 2). Arithmetic
operations such as addition, subtraction and multiplication can be performed in any of these number systems. Here, we
will learn the methods to convert a number from one base to another, starting with the decimal number system. The
general positional form of a number in any base b is:

(Number)_b = d_{n-1} d_{n-2} ... d_1 d_0 . d_{-1} d_{-2} ... d_{-m}

In the above expression, d_{n-1} d_{n-2} ... d_1 d_0 represents the integer part and d_{-1} d_{-2} ... d_{-m} represents the fractional part.

Also, d_{n-1} is the most significant bit (MSB) and d_{-m} is the least significant bit (LSB).

Decimal to Other Bases

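Converting a decimal number to another base is easy. We repeatedly divide the decimal number by the new base and record the remainder at each step; the remainders, read from the last one to the first, are the digits of the number in the new base.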

Decimal to Binary Number:

To convert a decimal number to binary, repeatedly divide it by 2.

Example 1. Convert (25)_{10} to binary number.

Solution: Let us create a table based on this question.

Operation   Output   Remainder

25 ÷ 2      12       1 (LSB)
12 ÷ 2      6        0
6 ÷ 2       3        0
3 ÷ 2       1        1
1 ÷ 2       0        1 (MSB)

Reading the remainders from the last one (MSB) to the first one (LSB), we can write,

(25)_{10} = (11001)_2

Decimal to Octal Number:

To convert a decimal number to octal, we repeatedly divide the given number by 8 so that base 10 changes to base 8. Let us understand with the
help of an example.

Example 2: Convert (128)_{10} to octal number.

Solution: Let us represent the conversion in tabular form.

Operation   Output   Remainder

128 ÷ 8     16       0 (LSB)
16 ÷ 8      2        0
2 ÷ 8       0        2 (MSB)

Therefore, the equivalent octal number = (200)_8

Decimal to Hexadecimal:

Again, in decimal to hex conversion, we repeatedly divide the given decimal number by 16.

Example 3: Convert (128)_{10} to hex.

Solution: As per the method, we can create a table:

Operation   Output   Remainder

128 ÷ 16    8        0 (LSB)
8 ÷ 16      0        8 (MSB)

Therefore, the equivalent hexadecimal number is (80)_{16}

Here MSB stands for the most significant bit (or digit) and LSB stands for the least significant bit (or digit).
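
The repeated-division method used in the three examples above can be sketched in Python. This is a minimal illustration, not part of the original notes; the function name to_base and the DIGITS string are assumptions introduced here, and only non-negative integers and bases 2 to 16 are handled.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16 (illustrative helper)

def to_base(number, base):
    """Convert a non-negative decimal integer to a digit string in the given base."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if number < 0:
        raise ValueError("only non-negative integers are handled")
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        # Each division yields the next digit; the first remainder is the LSB.
        number, remainder = divmod(number, base)
        remainders.append(DIGITS[remainder])
    # The last remainder is the MSB, so read the remainders in reverse order.
    return "".join(reversed(remainders))

print(to_base(25, 2))    # 11001
print(to_base(128, 8))   # 200
print(to_base(128, 16))  # 80
```

The reversal in the last step is what the tables express by reading the remainder column from bottom to top.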

Other Base System to Decimal Conversion

Binary to Decimal:

To convert a binary number to a decimal number, we use the multiplication method. In general, to convert a number with base n into a number
with base 10, each digit of the given number, taken from MSB to LSB, is multiplied by a decreasing power of the base, and the products are added.
Let us understand this conversion with the help of an example.

Example 1. Convert (1101)_2 into a decimal number.

Solution: Given a binary number (1101)_2.

Now, multiply each digit from MSB to LSB by a decreasing power of the base number 2.

1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 1 × 2^0

= 8 + 4 + 0 + 1

= 13

Therefore, (1101)_2 = (13)_{10}

Octal to Decimal:

To convert octal to decimal, we multiply the digits of octal number with decreasing power of the base number 8, starting from MSB to LSB and then
add them all together.

Example 2: Convert (22)_8 to decimal number.

Solution: Given, (22)_8

2 × 8^1 + 2 × 8^0

= 16 + 2

= 18

Therefore, (22)_8 = (18)_{10}


Hexadecimal to Decimal:

Example 3: Convert (121)_{16} to decimal number.

Solution: 1 × 16^2 + 2 × 16^1 + 1 × 16^0

= 1 × 256 + 2 × 16 + 1 × 1

= 256 + 32 + 1

= 289

Therefore, (121)_{16} = (289)_{10}
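
The multiplication method from the binary, octal and hexadecimal examples above can be written as one short Python sketch. The function name to_decimal is an assumption made for illustration; the code handles integer digit strings only (no fractional part) in bases 2 to 16.

```python
def to_decimal(digits, base):
    """Expand a digit string positionally: each digit times a decreasing power of the base."""
    symbols = "0123456789ABCDEF"
    value = 0
    power = len(digits) - 1          # the MSB carries the highest power
    for ch in digits.upper():
        digit = symbols.index(ch)    # raises ValueError for an unknown symbol
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value += digit * base ** power
        power -= 1
    return value

print(to_decimal("1101", 2))   # 13
print(to_decimal("22", 8))     # 18
print(to_decimal("121", 16))   # 289
```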

Hexadecimal to Binary Shortcut Method

Converting hexadecimal numbers to binary and vice versa is easy: each hexadecimal digit corresponds to exactly four binary digits, so you only have to memorize the table given below.

Hexadecimal Number Binary


0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
A 1010
B 1011
C 1100
D 1101
E 1110
F 1111

You can easily solve the problems based on hexadecimal and binary conversions with the help of this table. Let us take an example.

Example: Convert (89)_{16} into a binary number.

Solution: From the table, we can get the binary values of the hexadecimal digits 8 and 9.

8 = 1000 and 9 = 1001

Therefore, (89)_{16} = (10001001)_2

Octal to Binary Shortcut Method

To convert octal to binary number, we can simply use the table. Just like having a table for hexadecimal and its equivalent binary, in the same way, we
have a table for octal and its equivalent binary number.

Octal Number   Binary

0              000
1              001
2              010
3              011
4              100
5              101
6              110
7              111

Example: Convert (214)_8 into a binary number.

Solution: From the table, we know,

2 → 010

1 → 001

4 → 100

Therefore, (214)_8 = (010001100)_2
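
Both shortcut tables amount to replacing each digit with a fixed-width binary group (4 bits per hexadecimal digit, 3 bits per octal digit). A minimal Python sketch of that lookup follows; the function names are illustrative assumptions, not taken from the notes.

```python
def digits_to_binary(digits, bits_per_digit):
    """Replace every digit by its fixed-width binary group (3 bits for octal, 4 for hex)."""
    base = 1 << bits_per_digit                     # 8 for octal, 16 for hexadecimal
    return "".join(format(int(ch, base), f"0{bits_per_digit}b") for ch in digits)

def hex_to_binary(hex_digits):
    return digits_to_binary(hex_digits, 4)

def octal_to_binary(octal_digits):
    return digits_to_binary(octal_digits, 3)

print(hex_to_binary("89"))     # 10001001
print(octal_to_binary("214"))  # 010001100
```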

Practice Problems on Number System Conversion

1) Convert (146)_{10} into the binary number system
2) Convert (1A7)_{16} into the decimal number system
3) Convert (110010)_2 into the octal number system
4) Convert (DA2)_{16} into the binary number system
5) Convert (4652)_8 into the binary number system

------

Now let us briefly look at how one number is written in the different number systems.

Take the decimal number 349. In the different number systems, it is written as follows:

The number 349 in the binary number system is 101011101.

The number 349 in the decimal number system is 349.

The number 349 in the octal number system is 535.

The number 349 in the hexadecimal number system is 15D.

Number System Solved Examples

Example 1:

Convert (1056)_{16} to an octal number.

Solution:

Given, (1056)_{16} is a hex number.

First we need to convert the given hexadecimal number into a decimal number.

(1056)_{16}

= 1 × 16^3 + 0 × 16^2 + 5 × 16^1 + 6 × 16^0

= 4096 + 0 + 80 + 6

= (4182)_{10}

Now we will convert this decimal number to the required octal number by repeatedly dividing by 8.

Operation   Output   Remainder

4182 ÷ 8    522      6
522 ÷ 8     65       2
65 ÷ 8      8        1
8 ÷ 8       1        0
1 ÷ 8       0        1

Therefore, taking the remainders from bottom to top, we get:

(4182)_{10} = (10126)_8

Therefore,

(1056)_{16} = (10126)_8

Example 2:

Convert (1001001100)_2 to a decimal number.

Solution:

(1001001100)_2

= 1 × 2^9 + 0 × 2^8 + 0 × 2^7 + 1 × 2^6 + 0 × 2^5 + 0 × 2^4 + 1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 0 × 2^0

= 512 + 64 + 8 + 4

= (588)_{10}

Example 3:

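Convert (10101)_2 into an octal number.

Solution:

Given,

(10101)_2 is the binary number.

Grouping the bits in threes from the right and padding the leftmost group with a leading zero, we can write the given binary number as,

010 101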

Now as we know, in the octal number system,

010 → 2

101 → 5

Therefore, the required octal number is (25)_8
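
The grouping step in Example 3 can be sketched as follows. This is an illustrative Python version under the assumption that the input is an unsigned binary integer string; the name binary_to_octal is introduced here and does not come from the notes.

```python
def binary_to_octal(bits):
    """Group bits in threes from the right and map each group to one octal digit."""
    padding = (-len(bits)) % 3                 # leading zeros needed to fill the leftmost group
    padded = "0" * padding + bits
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    return "".join(str(int(group, 2)) for group in groups)

print(binary_to_octal("10101"))   # 25  (groups: 010 101)
```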

Example 4:

Convert hexadecimal 2C to decimal number.

Solution:

We need to convert (2C)_{16} into a binary number first.

2C → 00101100

Now convert (00101100)_2 into a decimal number. The two leading zeros do not contribute to the value, so:

101100 = 1 × 2^5 + 0 × 2^4 + 1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 0 × 2^0

= 32 + 8 + 4

= 44

Signed and Unsigned Binary Numbers

Computers work only with binary digits (0 and 1). Binary numbers can be represented in two ways: signed and unsigned. Positive numbers
can be represented in both ways, but negative numbers can only be represented in signed form. The difference between
unsigned and signed numbers is that unsigned numbers do not use a sign bit to distinguish positive from negative numbers, whereas signed
numbers do.

Signed numbers are represented in three ways. In the first two, sign-magnitude and 1's complement, zero has two possible representations
(positive zero with sign bit 0 and negative zero with sign bit 1), which makes them ambiguous. The third is 2's complement representation, in
which zero has only one representation, which makes it unambiguous. The types of representation of signed binary numbers are as follows:

1) Sign-Magnitude form
In this form, a binary number has a bit for the sign. If this bit is set to 1, the number is negative; if it is set to 0, the number is positive.
Apart from this sign bit, the remaining n-1 bits represent the magnitude of the number.
2) 1's Complement
By inverting each bit of a number, we obtain its 1's complement. Negative numbers can be represented in 1's complement form.
As in sign-magnitude form, the binary number has an extra bit for the sign.
3) 2's Complement
By inverting each bit of a number and adding 1 to its least significant bit, we obtain its 2's complement. Negative
numbers can also be represented in 2's complement form. As in sign-magnitude form, the binary number has an extra bit for the sign.

1's Complement

The binary number system is the most widely used number representation in digital electronics. Complements are
used for representing negative numbers in binary form. Several types of complement are possible for a binary number, but 1's and 2's
complements are the most commonly used. We can find the 1's complement of a binary number by simply inverting each bit of the given number. For
example, the 1's complement of the binary number 1011001 is 0100110. We can find the 2's complement of a binary number by inverting each bit (0 to 1
and 1 to 0) and adding 1 to the least significant bit. For example, the 2's complement of the binary number 1011001 is (0100110)+1=0100111.

For finding 1's complement of the binary number, we can implement the logic circuit also by using NOT gate. We use NOT gate for each bit of the
binary number. So, if we want to implement the logic circuit for 5-bit 1's complement, five NOT gates will be used.

Example 1: 11010.1101

To find the 1's complement of the given number, change all 0s to 1 and all 1s to 0. So the 1's complement of the number 11010.1101 is
00101.0010.
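
The bit inversion described above (one NOT gate per bit) can be sketched in Python. The function name ones_complement is an assumption for illustration; characters other than 0 and 1, such as the binary point or a space after the sign bit, are passed through unchanged.

```python
def ones_complement(bits):
    """Invert every bit; any other character (binary point, space) is kept as it is."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip.get(ch, ch) for ch in bits)

print(ones_complement("1011001"))      # 0100110
print(ones_complement("11010.1101"))   # 00101.0010
```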

Use of 1's complement

The main use of 1's complement is to represent signed binary numbers. Apart from this, it is also used to perform arithmetic operations such as
addition and subtraction.

Example 1: +6 and -6

The number +6 is represented by its ordinary binary form with a sign bit of 0. For representing both numbers, we will take a 5-bit register.

So +6 is represented in the 5-bit register as 0 0110.

-6 is represented in the 5-bit register in the following way:

+6 = 0 0110

Find the 1's complement of the number 0 0110, i.e., 1 1001. Here, the MSB of 1 denotes that the number is negative.

Here, MSB refers to the Most Significant Bit, and LSB denotes the Least Significant Bit.

Example 2: +120 and -120

The number +120 is represented by its ordinary binary form with a sign bit of 0. For representing both numbers, take an 8-bit register.

So +120 is represented in the 8-bit register as 0 1111000.

-120 is represented in the 8-bit register in the following way:

+120 = 0 1111000

Now, find the 1's complement of the number 0 1111000, i.e., 1 0000111. Here, the MSB of 1 denotes that the number is negative.

2's Complement

Just like 1's complement, 2's complement is also used to represent the signed binary numbers. For finding 2's complement of the binary number, we
will first find the 1's complement of the binary number and then add 1 to the least significant bit of it.

For example, if we want to calculate the 2's complement of the number 1011001, then firstly, we find the 1's complement of the number that is
0100110 and add 1 to the LSB. So, by adding 1 to the LSB, the number will be (0100110)+1=0100111.

Example 1: 110100

To find the 2's complement of the given number, first change all 0s to 1 and all 1s to 0. So the 1's complement of the number 110100 is 001011. Now add
1 to the LSB of this number, i.e., (001011)+1=001100.

Example 2: 100110

To find the 2's complement of the given number, first change all 0s to 1 and all 1s to 0. So, the 1's complement of the number 100110 is 011001. Now add
1 to the LSB of this number, i.e., (011001)+1=011010.

2's Complement Table

Binary Number 1's Complement 2's complement


0000 1111 0000
0001 1110 1111
0010 1101 1110
0011 1100 1101
0100 1011 1100
0101 1010 1011
0110 1001 1010
0111 1000 1001
1000 0111 1000
1001 0110 0111
1010 0101 0110
1011 0100 0101
1100 0011 0100
1101 0010 0011
1110 0001 0010
1111 0000 0001
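
The rows of the table above can be reproduced with a short Python sketch of the "invert, then add 1" rule. The function name twos_complement is an illustrative assumption; the result is kept to the same width as the input, so any carry out of the MSB is dropped (which is why 0000 maps to 0000).

```python
def twos_complement(bits):
    """Invert every bit, add 1, and keep the result at the original width."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)   # 1's complement
    value = (int(inverted, 2) + 1) % (1 << width)                # add 1, drop any carry out
    return format(value, f"0{width}b")

print(twos_complement("1011001"))   # 0100111
print(twos_complement("110100"))    # 001100
print(twos_complement("0000"))      # 0000
```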

Use of 2's complement

2's complement is used for representing signed numbers and performing arithmetic operations such as subtraction and addition. A positive number
is simply represented in magnitude form, so nothing more needs to be done to represent it. To represent a negative number,
we have to choose either the 1's complement or the 2's complement technique. 1's complement is ambiguous because it has two representations of
zero, whereas 2's complement is unambiguous. Let's see an example to understand how we can calculate the 2's complement in signed binary number representation.

Example 1: +6 and -6

The number +6 is represented by its ordinary binary form with a sign bit of 0. For representing both numbers, take a 5-bit register.

So +6 is represented in the 5-bit register as 0 0110.

-6 is represented in the 5-bit register in the following way:

1) +6 = 0 0110
2) Now, find the 1's complement of the number 0 0110, i.e. 1 1001.
3) Now, add 1 to its LSB. Adding 1 to the LSB of 1 1001 gives 1 1010. Here, the sign bit is 1, which means the number is negative.

Example 2: +120 and -120

The number +120 is represented by its ordinary binary form with a sign bit of 0. For representing both numbers, take an 8-bit register.

So +120 is represented in the 8-bit register as 0 1111000.

-120 is represented in the 8-bit register in the following way:

1) +120 = 0 1111000
2) Now, find the 1's complement of the number 0 1111000, i.e. 1 0000111. Here, the MSB of 1 denotes that the number is negative.
3) Now, add 1 to its LSB. Adding 1 to the LSB of 1 0000111 gives 1 0001000. Here, the sign bit is 1, which means the number is negative.
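
The two register examples above can be checked with a small Python sketch that stores a signed integer in an n-bit 2's complement register and reads it back. The function names and the range check are assumptions added for illustration; masking with 2^n - 1 is used here as a shortcut that gives the same bits as "invert and add 1".

```python
def encode_twos_complement(value, width):
    """Return the width-bit 2's complement pattern of a signed integer."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise OverflowError(f"{value} does not fit in a {width}-bit register")
    # Masking keeps the low `width` bits; for negative values this is 2**width + value.
    return format(value & ((1 << width) - 1), f"0{width}b")

def decode_twos_complement(bits):
    """Interpret a bit string as a signed 2's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":               # sign bit set: subtract 2**width
        value -= 1 << len(bits)
    return value

print(encode_twos_complement(6, 5))      # 00110
print(encode_twos_complement(-6, 5))     # 11010
print(encode_twos_complement(-120, 8))   # 10001000
print(decode_twos_complement("11010"))   # -6
```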

Addition and Subtraction using 1's Complement

In the previous section, we learned about different complements such as 1's complement, 2's complement, 9's complement and 10's complement.
In this section, we will learn to perform arithmetic operations such as addition and subtraction using 1's complement. Addition and
subtraction can be performed using 1's, 2's, 9's and 10's complements.

Addition using 1's complement

There are three different cases possible when we add two binary numbers, which are as follows:

Case 1: Addition of a positive number and a negative number when the positive number has the greater magnitude.

Initially, calculate the 1's complement of the given negative number. Sum up with the given positive number. If we get the end-around carry 1, it gets
added to the LSB.

Example: 1101 and -1001

1. First, find the 1's complement of the negative number 1001. So, for finding 1's complement, change all 0 to 1 and all 1 to 0. The 1's complement
of the number 1001 is 0110.
2. Now, add both the numbers, i.e., 1101 and 0110;
1101+0110=1 0011
3. By adding both numbers, we get the end-around carry 1. We add this end around carry to the LSB of 0011.
0011+1=0100

Case 2: Addition of a positive number and a negative number when the negative number has the greater magnitude.

Initially, calculate the 1's complement of the negative value and add it to the positive number. In this case, there is no end-around carry, so we
take the 1's complement of the sum to get the final result.

Note: The resultant is a negative value.

Example: 1101 and -1110

1. First find the 1's complement of the negative number 1110. So, for finding 1's complement, we change all 0 to 1, and all 1 to 0. 1's complement
of the number 1110 is 0001.
2. Now, add both the numbers, i.e., 1101 and 0001;
1101+0001= 1110
3. Now, find the 1's complement of the result 1110 that is the final result. So, the 1's complement of the result 1110 is 0001, and we add a
negative sign before the number so that we can identify that it is a negative number.

Case 3: Addition of two negative numbers

In this case, first find the 1's complement of both the negative numbers, and then add these two complements. The addition always
produces an end-around carry, which is added to the LSB; to get the final result, we take the 1's complement of the result.

Note: The resultant is a negative value.

Example: -1101 and -1110 in five-bit register

1. First, find the 1's complement of the negative numbers 01101 and 01110 by changing all 0s to 1 and all 1s to 0.
The 1's complement of 01110 is 10001, and that of 01101 is 10010.
2. Now, add both the complement numbers, i.e., 10001 and 10010:
10001+10010= 1 00011
3. The addition produces the end-around carry 1. We add this end-around carry to the LSB of 00011.
00011+1=00100
4. Now, find the 1's complement of the result 00100 to get the final answer. The 1's complement of 00100 is 11011, and we add a
negative sign before the number to show that it is negative: -11011.

Subtraction using 1's complement

The steps to subtract two binary numbers using 1's complement are as follows:

• In the first step, find the 1's complement of the subtrahend.
• Next, add the complement number to the minuend.
• If the addition produces a carry, add the carry to the LSB of the result. Otherwise, take the 1's complement of the result, which is negative.

Note: The subtrahend is always subtracted from the minuend.

Example 1: 10101 - 00111

We take 1's complement of subtrahend 00111, which comes out 11000. Now, sum them. So,

10101+11000 =1 01101.

In the above result, we get the carry bit 1, so add this to the LSB of a given result, i.e., 01101+1=01110, which is the answer.

Example 2: 10101 - 10111

We take 1's complement of subtrahend 10111, which comes out 01000. Now, add both of the numbers. So,

10101+01000 =11101.

In the above result, we didn't get the carry bit. So calculate the 1's complement of the result, i.e., 00010. The final answer is negative:
-00010.
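
The three subtraction steps, including the end-around carry, can be sketched in Python as below. This is an illustrative version under stated assumptions: both operands are unsigned magnitudes of equal width, as in the examples, and the function returns a sign and a magnitude rather than a register pattern. The names are introduced here and are not from the notes.

```python
def ones_complement(bits):
    return "".join("1" if b == "0" else "0" for b in bits)

def subtract_ones_complement(minuend, subtrahend):
    """Compute minuend - subtrahend and return (sign, magnitude bits)."""
    if len(minuend) != len(subtrahend):
        raise ValueError("operands must have the same width")
    width = len(minuend)
    total = int(minuend, 2) + int(ones_complement(subtrahend), 2)
    carry, low = divmod(total, 1 << width)       # carry is the bit beyond the register width
    if carry:
        # End-around carry: add the carry back into the LSB; the result is positive.
        return "+", format(low + 1, f"0{width}b")
    # No carry: the result is negative and its magnitude is the 1's complement of the sum.
    return "-", ones_complement(format(low, f"0{width}b"))

print(subtract_ones_complement("10101", "00111"))   # ('+', '01110')
print(subtract_ones_complement("10101", "10111"))   # ('-', '00010')
```

With equal operands the sketch returns a negative zero, which reflects the two representations of zero in 1's complement.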

Addition and Subtraction using 2's Complement

Addition using 2's complement

There are three different cases possible when we add two binary numbers using 2's complement, which are as follows:

Case 1: Addition of a positive number and a negative number when the positive number has the greater magnitude.

Initially, find the 2's complement of the given negative number and add it to the given positive number. If the addition produces a carry of 1
out of the MSB, the result is positive; the carry bit is discarded and the remaining bits are the final result.

Example: 1101 and -1001

1. First, find the 2's complement of the negative number 1001. To do so, find the 1's complement of 1001 by changing all 0s to 1 and all 1s to 0,
which gives 0110, and then add 1 to its LSB. So the 2's complement of 1001 is 0110+1=0111.
2. Add both the numbers, i.e., 1101 and 0111:
1101+0111=1 0100
3. The addition produces a carry of 1 out of the MSB. We discard this carry. So, the sum of the two numbers is 0100.

Case 2: Addition of a positive number and a negative number when the negative number has the greater magnitude.

Initially, add the positive value to the 2's complement of the negative number. Here, no carry is produced, so we take the 2's
complement of the sum to get the final result.

Note: The resultant is a negative value.

Example: 1101 and -1110

1. First, find the 2's complement of the negative number 1110. To do so, add 1 to the LSB of its 1's complement value 0001.
0001+1=0010
2. Add both the numbers, i.e., 1101 and 0010:
1101+0010= 1111
3. Find the 2's complement of the result 1111 to get the final result. The 2's complement of 1111 is 0001, and we add a negative sign
before the number to show that it is negative: -0001.

Case 3: Addition of two negative numbers

In this case, first find the 2's complement of both the negative numbers, and then add these two complements. The addition always
produces a carry out of the MSB, which is discarded; to get the final result, we take the 2's complement of the remaining bits.

Note: The resultant is a negative value.

Example: -1101 and -1110 in five-bit register

1. First, find the 2's complement of the negative numbers 01101 and 01110 by adding 1 to the LSB of the 1's complement of each.
The 2's complement of 01110 is 10010, and that of 01101 is 10011.
2. Add both the complement numbers, i.e., 10010 and 10011:
10010+10011= 1 00101
3. The addition produces a carry of 1, which is discarded. The final result is the 2's complement of the remaining bits 00101,
which is 11011. We add a negative sign before the number to show that it is negative: -11011.

Subtraction using 2's Complement

The steps to subtract two binary numbers using 2's complement are as follows:

• In the first step, find the 2's complement of the subtrahend.
• Add the complement number to the minuend.
• If the addition produces a carry, discard the carry; the result is positive. Otherwise, take the 2's complement of the result, which is
negative.

Example 1: 10101 - 00111


We take 2's complement of subtrahend 00111, which is 11001. Now, sum them. So,

10101+11001 =1 01110.

In the above result, we get the carry bit 1. We discard this carry bit; the remaining bits, 01110, are the final result, which is a positive number.

Example 2: 10101 - 10111

We take 2's complement of subtrahend 10111, which comes out 01001. Now, we add both of the numbers. So,

10101+01001 =11110.

In the above result, we didn't get the carry bit. So calculate the 2's complement of the result, i.e., 00010. The final answer is negative:
-00010.
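
The same procedure with 2's complement can be sketched in Python as below. As before, this is a sketch under stated assumptions: operands are unsigned magnitudes of equal width, the result is returned as a sign and a magnitude, and the function name is introduced here for illustration. The complement is computed as 2^n minus the subtrahend so that the carry is still produced when the subtrahend is zero.

```python
def subtract_twos_complement(minuend, subtrahend):
    """Compute minuend - subtrahend and return (sign, magnitude bits)."""
    if len(minuend) != len(subtrahend):
        raise ValueError("operands must have the same width")
    width = len(minuend)
    modulus = 1 << width
    complement = modulus - int(subtrahend, 2)    # 2's complement of the subtrahend
    total = int(minuend, 2) + complement
    carry, low = divmod(total, modulus)
    if carry:
        # Carry out of the MSB: discard it; the remaining bits are the positive result.
        return "+", format(low, f"0{width}b")
    # No carry: the result is negative; its magnitude is the 2's complement of the sum.
    return "-", format(modulus - low, f"0{width}b")

print(subtract_twos_complement("10101", "00111"))   # ('+', '01110')
print(subtract_twos_complement("10101", "10111"))   # ('-', '00010')
```

Unlike the 1's complement version, there is no end-around step: the carry is simply dropped.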

Fixed and Floating-Point Representation

Introduction

In computers, data are stored in memory registers as binary bits, 1s and 0s, because computers only understand binary language. When we enter
data in the computer, it is converted into binary and then processed and used by the CPU in different ways. The memory registers have a specific
range and a format to store data. Real numbers are stored in registers of 8, 16 or 32 bits using a defined representation method.

We have two major approaches for storing real numbers: Fixed and Floating-Point Representation. This section covers Fixed and
Floating-Point Representation in detail.

Fixed Point Representation

In computers, fixed-point representation is a real data type for numbers. The data is converted into binary form in this representation and then
processed, stored, and used by the computer. It has a fixed number of bits for the integral and fractional parts. For example, if the given fixed-point
format is IIIII.FFF (in decimal digits), the smallest non-zero value we can store is 00000.001 and the largest is 99999.999.

There are three parts of the fixed-point number representation: Sign bit, Integral part, and Fractional part.

[Figure: Parts of fixed-point representation]

Sign Bit: The fixed-point number representation in binary uses a sign bit. A negative number has a sign bit 1, while a positive number has a sign bit 0.

Integral Part: The length of the integral part depends on the register's size; for an 8-bit register,
the integral part is 4 bits.

Fractional Part: The length of the fractional part also depends on the register's size; for an 8-bit register, the fractional part is 3
bits.

Size of Sign Bit, Integer Part, and Fractional Part for different registers are displayed below:

Register Sign Bit Integer Part Fraction Part


8-bit register 1 bit 4 bits 3 bits
16-bit register 1 bit 9 bits 6 bits
32-bit register 1 bit 15 bits 9 bits

How to write numbers in Fixed-point notation?

Now that we have learned about fixed-point number representation, let's see how to represent it.

The number considered is 4.5

Step 1: We will convert the number 4.5 to binary form. (4.5)_{10} = (100.1)_2

Step 2: Represent the binary number in fixed-point notation with the following format.

[Figure: Fixed Point Notation of 4.5]
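
As a rough illustration of Step 2, the following Python sketch packs a value into the 8-bit layout from the table above (1 sign bit, 4 integer bits, 3 fraction bits). The function name, the sign-magnitude packing and the rounding of the fraction are assumptions made for this sketch; the original figure is not available, so the output shows one plausible layout rather than the notes' exact drawing.

```python
def to_fixed_point(value, int_bits=4, frac_bits=3):
    """Return 'sign integer fraction' bit fields for a sign-magnitude fixed-point value."""
    sign = "1" if value < 0 else "0"
    # Scaling by 2**frac_bits moves the binary point to the right of the fraction field.
    scaled = round(abs(value) * (1 << frac_bits))
    if scaled >= 1 << (int_bits + frac_bits):
        raise OverflowError(f"{value} does not fit in {int_bits}.{frac_bits} bits")
    bits = format(scaled, f"0{int_bits + frac_bits}b")
    return f"{sign} {bits[:int_bits]} {bits[int_bits:]}"

print(to_fixed_point(4.5))   # 0 0100 100   (4.5 = 100.1 in binary)
```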

Floating Point Representation

Floating-point representation doesn't reserve any specific number of bits for the integer or fractional parts. Instead, it reserves a certain number
of bits for the number itself (called the significand or mantissa) and a certain number of bits to say where the point lies within it (called the exponent).

The computer uses floating-point number representation to store the input data in binary form. The binary number is first written in
scientific notation, which is then converted into floating-point representation.

The floating-point representation has two types of notation:

1. Scientific Notation: Scientific notation is the method of representing numbers in the form a × b^e. It is further converted into floating-point
representation. For example,

Number = 32625

Number in Scientific Notation = 32.625 × 10^3

A binary number is written in the same way, for example 1101.101 × 2^{101}.

Here, Mantissa is 1101.101 and Base part is 2^{101}.

2. Normalization Notation: It is a special case of scientific notation. Normalized means that we have at least one non-zero digit after the decimal
point.

A floating-point representation has three parts: Sign bit, Exponent Part, and Mantissa. We can see the below diagram to understand these parts.

Parts of floating-point representation

Sign Bit: The floating-point numbers in binary uses a sign bit. A negative number has a sign bit 1, while a positive number has a sign bit 0. The sign of
any number depends on mantissa, not on exponent.

Mantissa Part: The length of the mantissa depends on the register size; for example, in a 16-bit register, the mantissa part is 8
bits.

Exponent Part: It is the power to which the base is raised. Its length also depends on the size of the register; for example, in the 16-bit register, the exponent part is 7 bits.

How to write numbers in Floating-point notation

Now that we have learned about floating-point number representation, let's see how to represent it.

The number considered is 53.5

Step 1: We will convert the number 53.5 to binary form. (53.5)_{10} = (110101.1)_2

Step 2: Normalize the number (base is 2) = (1.101011) × 2^5.

Step 3: Represent the binary number in floating-point notation with the following format.

[Figure: Floating Point Notation of 53.5]
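
Steps 1 and 2 can be reproduced with a short Python sketch. The helper names to_binary and normalize are assumptions introduced for illustration, and the code only handles positive values whose fraction terminates within the chosen number of bits. It uses math.frexp from the Python standard library, which returns a mantissa in the range [0.5, 1) and an exponent, and then shifts by one position to get the 1.xxx form used in Step 2.

```python
import math

def to_binary(value, max_frac_bits=16):
    """Write a positive number in binary, e.g. 53.5 -> '110101.1'."""
    whole = int(value)
    frac = value - whole
    digits = []
    while frac and len(digits) < max_frac_bits:
        frac *= 2                    # repeated multiplication by 2 yields the fraction bits
        bit = int(frac)
        digits.append(str(bit))
        frac -= bit
    text = format(whole, "b")
    return text + "." + "".join(digits) if digits else text

def normalize(value):
    """Return (mantissa bits, exponent) with the mantissa in the form 1.xxx."""
    mantissa, exponent = math.frexp(value)       # value == mantissa * 2**exponent
    return to_binary(mantissa * 2), exponent - 1

print(to_binary(53.5))    # 110101.1
print(normalize(53.5))    # ('1.101011', 5)
```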

De-Normalized Notation

De-normalized notation is the reverse of normalized notation. In normalized notation, the first digit after the point is '1',
but in de-normalized notation, the first digit after the point is '0'. For example, the largest de-normalized number with excess-64 can be represented
as:

Sign Bit Exponent Part Mantissa Part


0 1111111 01111111

Advantages of Fixed Point Representation

The advantages of Fixed Point Representation are as follows:

• Fixed-point calculations can be completed more quickly than floating-point calculations. This is because integer arithmetic, which is frequently
quicker than floating-point arithmetic, can be used to create fixed-point arithmetic.
• Fixed-point numbers can be represented more compactly than floating-point numbers by utilising fewer bits. In embedded systems or other
applications where memory is constrained, this could help preserve memory space.
• Effective memory management is possible with the help of fixed-point pointers. This is because registers can be accessed more quickly than
memory and can store fixed-point pointers.

Disadvantages of Fixed Point Representation

The disadvantages of Fixed Point Representation are as follows:

• Fixed-point numbers have the potential to overflow or underflow if improperly handled. Calculation errors may result from this.
• Compared to floating-point numbers, fixed-point numbers cover a narrower range of values for the same number of bits, and their precision is
fixed by the number of fractional bits.
• Fixed-point programming is sometimes more difficult than floating-point programming. This is because the limited range and precision make it
more crucial to take precautions to prevent overflow and underflow.

Advantages of Floating Point Representation

Floating-point representation offers several advantages, particularly for scientific and engineering applications that require a wide range of values and
precise calculations. Here are the key advantages:

• Wide Range of Values: Floating-point numbers can represent very large and very small numbers, making them suitable for scientific calculations
involving extreme values.
• Precision: They provide a high degree of precision for fractional values, which is essential for accurate mathematical computations.
• Support for Special Values: Floating-point representation includes special values such as zero, positive and negative infinity, and NaN (Not a
Number), which are useful for handling exceptional cases and errors in computations.

Disadvantages of Floating Point Representation

Despite its many advantages, floating-point representation has several disadvantages:

• Precision Issues: Floating-point numbers cannot represent all real numbers exactly, leading to rounding errors and limited precision, especially in
iterative calculations.
• Complexity: Floating-point arithmetic is more complex than integer arithmetic, both in terms of understanding the representation and the
hardware required to implement it.
• Performance: Floating-point operations are generally slower than integer operations due to the additional complexity in processing.
