
Addis Ababa Science and Technology University

College of Electrical and Mechanical Engineering

Department of Electrical and Computer Engineering

COMPUTER ARCHITECTURE AND ORGANIZATION


INDIVIDUAL ASSIGNMENT

NAME: KUMENIT DESTA


SECTION: C
ID: ETS0745/13
What is a multicore processor?

A multicore processor is an integrated circuit containing two or more processor cores, designed to enhance overall performance and reduce power consumption. These processors execute multiple tasks more efficiently through techniques such as parallel processing and multithreading. A dual-core configuration is akin to having two distinct processors installed in a computer, except that the two cores share the same socket, so the connection between them is faster. Employing multicore processors, or microprocessors, is one strategy for improving processing capability without exceeding the practical constraints of semiconductor design and fabrication. Using multiple cores also contributes to safe operation by limiting concerns such as heat generation.


How do multicore processors work?


At the core of every processor lies an execution engine, or core, responsible for
processing instructions and data as directed by the computer's memory-resident
software programs. Over time, designers encountered limitations in each new
processor design. To enhance performance, various technologies were developed,
including:

 Clock speed: One strategy for boosting performance was increasing the processor's clock speed, which serves as the "drumbeat" synchronizing instruction and data processing within the engine. Clock speeds have risen from megahertz to gigahertz. However, power consumption becomes a limiting factor, because transistors draw power with each clock tick. Current semiconductor fabrication and heat management techniques have nearly maxed out the potential for further increases in clock speed.
 Hyper-threading: Another approach involved handling multiple instruction
threads, known as hyper-threading, as implemented by Intel. With hyper-
threading, processor cores are designed to concurrently manage two
separate instruction threads. When supported by both the computer's
firmware and operating system, hyper-threading allows one physical core to
function as two logical cores. However, this logical abstraction doesn't
significantly enhance real performance; its primary benefit lies in
optimizing the behavior of multiple simultaneous applications running on
the computer.

 More chips: To further enhance processing power, additional processor chips or dies were added to the processor package, the physical device that plugs into the motherboard. Dual-core processors have two separate cores, quad-core processors have four, and modern multicore processors can include 12, 24, or even more cores. This multicore approach is analogous to using multiprocessor motherboards with two or four separate processor sockets, achieving a similar effect. Current processor performance results from a combination of fast clock speeds and multiple hyper-threaded cores; the sketch below shows one way to query how many logical cores the operating system exposes.
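As a concrete illustration of cores versus logical processors, here is a minimal C sketch that asks the operating system how many logical processors are currently online. It assumes a POSIX-like system (Linux, macOS); on a hyper-threaded CPU, the reported count is typically twice the number of physical cores.

```c
#include <stdio.h>
#include <unistd.h>   /* sysconf(); POSIX-like systems only */

int main(void) {
    /* Number of logical processors currently online; with
       hyper-threading this is usually 2x the physical core count. */
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1) {
        perror("sysconf");
        return 1;
    }
    printf("Logical processors online: %ld\n", n);
    return 0;
}
```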
What are multicore processors used for?

Multicore processors find applications across various modern computer hardware platforms, and their effectiveness is particularly pronounced when paired with software that emphasizes parallelism. Here are five major use cases for multicore processors:


1. Virtualization: Multicore processors play a crucial role in virtualization platforms like VMware, where they can abstract physical processor cores into virtual processors (vCPUs). These vCPUs are then assigned to virtual machines (VMs), enabling each VM to operate as a virtual server with its own operating system and applications. This setup allows for parallel processing within VMs, enhancing overall system efficiency.

2. Databases: Complex software platforms like databases, which often need to handle numerous simultaneous tasks such as queries, rely heavily on multicore processors. These processors help distribute and manage multiple task threads, contributing to the efficient functioning of databases. High-capacity memory, reaching terabytes on physical servers, is frequently coupled with multiple processors in database setups.

3. Analytics and High-Performance Computing (HPC): Multicore processors are essential for big data analytics, including machine learning, and HPC. Tasks in these domains are typically large and complex, requiring the division of computational effort into smaller pieces. Multicore processors facilitate the parallel processing of these pieces, allowing different processors to work concurrently and solve the overarching problem faster than a single processor could.

4. Cloud: Organizations building cloud infrastructure often adopt multicore processors to support the virtualization demands of highly scalable and transactional cloud software platforms like OpenStack. Multicore processors enable the creation and scaling of virtual machine instances on demand, contributing to the flexibility and responsiveness of cloud environments.

5. Visualization: Graphics applications, including games and data-rendering engines, benefit from multicore processors for parallel processing. While graphics processing units (GPUs) are usually preferred for this work, multicore processors remain valuable in scenarios where parallel processing is advantageous, applying the same many-core principles as GPUs.

Pros and cons of multicore processors

Multicore processor technology is well-established, presenting both advantages and disadvantages that should be carefully considered when acquiring and deploying new servers.
ADVANTAGES OF MULTICORE PROCESSORS:

1. Improved Application Performance: Multicore processors offer enhanced processing capability, with each core acting as a separate processor that operating systems (OSes) and applications can leverage. In virtualized servers, multiple virtual machines (VMs) can utilize virtualized processor cores simultaneously, enabling efficient coexistence and operation on a physical server. Applications designed for parallelism can leverage multiple cores for superior performance, a feat challenging to achieve with single-core systems.
2. Enhanced Hardware Performance: Integrating two or more processor cores
on the same device allows for more efficient utilization of shared components,
such as internal buses and processor caches. This shared architecture results in
superior performance compared to multiprocessor systems with separate
processor packages on the same motherboard.

DISADVANTAGES OF MULTICORE PROCESSORS:

1. Software dependent: The effectiveness of multicore processors relies on software applications. Many OSes and applications default to using the first processor core (core 0), leaving additional cores idle until software is written or configured to utilize them. Consideration of server usage and planned applications is essential to maximize the computing potential of a multicore system; the sketch after this list shows one way a program can explicitly place work on a chosen core.
2. Performance boosts are limited: As multiple cores within a processor
package share common system buses and processor caches, there are
diminishing returns to performance as more cores are added. While the overall
performance benefit outweighs the impact of sharing in most situations, it's
crucial to consider this factor during application performance testing.
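As an illustration of software-directed core usage, here is a minimal Linux-only sketch (sched_setaffinity is a Linux system call; its use here is an assumption beyond the original text) that pins the calling process to a specific core instead of leaving placement entirely to the scheduler:

```c
#define _GNU_SOURCE       /* needed for CPU_* macros and sched_setaffinity */
#include <sched.h>
#include <stdio.h>

/* Restrict the calling process to a single core (Linux only). */
static int pin_to_core(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);       /* start with an empty CPU set   */
    CPU_SET(core, &set);  /* allow only the requested core */
    return sched_setaffinity(0, sizeof(set), &set); /* 0 = this process */
}

int main(void)
{
    if (pin_to_core(1) != 0) {   /* try core 1 instead of the default core 0 */
        perror("sched_setaffinity");
        return 1;
    }
    printf("Now restricted to core 1\n");
    return 0;
}
```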

Architecture of multicore processors

CACHE ORGANIZATION IN MULTICORE PROCESSORS:


1. L1 Cache (Level 1): This is the smallest and fastest cache, unique to each core. It is directly integrated into the core and holds frequently accessed instructions and data. Its proximity to the core ensures rapid retrieval, contributing to low-latency access and improved processing speed (the sketch after this list demonstrates this effect).
2. L2 Cache (Level 2): A larger storage space shared among the cores. It serves
as a secondary cache to the L1 cache, storing additional frequently used
instructions and data. While larger in size, the L2 cache maintains faster access
compared to higher-level caches.
3. Shared Caches: Some multicore processor architectures incorporate shared
caches, where a portion of the cache is accessible by all cores. This facilitates
efficient data sharing among cores, enhancing parallel processing capabilities.
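To make cache behavior concrete, the hedged C sketch below sums the same matrix twice: once row by row (sequential addresses, cache-friendly) and once column by column (strided addresses, cache-hostile). On most machines the row-order pass runs noticeably faster; exact timings depend on cache sizes, and the matrix dimension N here is an arbitrary choice made large enough to exceed typical caches.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 4096   /* 4096 x 4096 ints = 64 MB, far larger than any cache */

static int *m;

static long sum_rows(void) {            /* walks memory sequentially */
    long s = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += m[i * N + j];
    return s;
}

static long sum_cols(void) {            /* jumps N ints per access */
    long s = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += m[i * N + j];
    return s;
}

int main(void) {
    m = calloc((size_t)N * N, sizeof(int));
    if (!m) return 1;

    clock_t t0 = clock();
    long a = sum_rows();
    clock_t t1 = clock();
    long b = sum_cols();
    clock_t t2 = clock();

    printf("row-major: %ld (%.2fs)  column-major: %ld (%.2fs)\n",
           a, (double)(t1 - t0) / CLOCKS_PER_SEC,
           b, (double)(t2 - t1) / CLOCKS_PER_SEC);
    free(m);
    return 0;
}
```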
MEMORY ORGANIZATION IN MULTICORE PROCESSORS:
1. Memory Channels: Multicore processors often feature multiple memory
channels, allowing concurrent access to system memory. This minimizes
memory access bottlenecks and enhances overall memory bandwidth, crucial
for handling the increased computational demands of multicore processing.
2. NUMA (Non-Uniform Memory Access): In some architectures, especially in
systems with a large number of cores, NUMA is employed. This design ensures
that each core has faster access to its local memory, reducing latency. However,
accessing memory from a remote NUMA node may incur higher latency.
REGISTERS ORGANIZATION IN MULTICORE PROCESSORS:
1. Register Files: Each core in a multicore processor has its own set of registers. These registers store data and instructions that are immediately accessible to the core during processing. The efficient organization of register files helps minimize latency in data retrieval and execution.
2. Register Renaming: Some multicore processors employ techniques like register renaming to enhance parallelism. This involves dynamically mapping architectural registers to physical registers, allowing more instructions to be handled efficiently in flight at once.

What Is Parallel Processing?

Parallel processing refers to a computing method where multiple streams of calculations or data processing tasks occur simultaneously through numerous central processing units (CPUs). This technique involves utilizing two or more processors or CPUs concurrently to manage various components of a single activity. By distributing the numerous parts of a task among several processors, systems can significantly reduce a program's execution time. Multi-core processors, commonly found in modern computers, and any system with more than one CPU have the capability to execute parallel processing, offering advantages such as improved speed, lower power consumption, and more effective handling of multiple activities.

Integrated circuit (IC) chips with two or more CPUs make up multi-core processors,
with most computers having two to four cores, and some featuring up to twelve.
Complex operations and computations are often carried out efficiently through
parallel processing. At a basic level, the distinction between parallel and serial
operations lies in how registers are used. Parallel processing involves registers with
parallel loading, simultaneously processing each bit of a word, while serial operations
process each bit one at a time using shift registers. The complexity of parallel
processing can be managed at a higher level by utilizing various functional units that
perform the same or different activities simultaneously.

The interest in parallel computing dates back to the late 1950s, with developments in
supercomputers emerging in the 1960s and 1970s. Early multiprocessors utilized
shared memory space and executed parallel operations on a single data set. The
introduction of massively parallel processors (MPPs) occurred in the mid-1980s with
the Caltech Concurrent Computation project, demonstrating high performance using
off-the-shelf microprocessors. Clusters, parallel computers comprised of linked
commercial computers, entered the scene in the late 1980s, gradually replacing MPPs
for various applications. Today, clusters, often based on multi-core processors,
dominate scientific computing and data centers.
The evolution of parallel processing has transformed regular desktop and laptop
computers into tools capable of solving problems that previously required powerful
supercomputers. Operating systems now efficiently manage how different processors
collaborate, making parallel processing more cost-effective than serial processing in
most cases. As the demand for real-time data increases with the proliferation of
Internet of Things (IoT) sensors and endpoints, parallel computing becomes crucial.
Cloud services providing easy access to processors and graphics processing units
(GPUs) have further elevated the significance of parallel processing in microservice
rollouts.

How Does Parallel Processing Work?

In essence, parallel processing involves the partitioning of a task among a minimum of two microprocessors. The concept is straightforward: a computer scientist employs specialized software tailored to the task to dissect a complex problem into its constituent elements. Subsequently, each part is assigned to a specific processor, and each processor independently tackles its designated portion. The software then orchestrates the reassembly of the processed data to arrive at a solution for the initial intricate challenge.

When employing parallel processing, a substantial task is fragmented into several smaller jobs, aligning with the number, size, and type of available processing units. Following the task division, each processor initiates its individual processing without direct communication with others. Instead, they utilize software for intercommunication, exchanging information on the progress of their tasks.

Once all of the segments are complete, the outcome is a fully processed program. This holds true whether the processors and tasks finished simultaneously or sequentially. Two primary types of parallel processes exist: fine-grained and coarse-grained. Fine-grained parallelism involves frequent inter-task communication to yield real-time or near-real-time results, while coarse-grained parallel processes communicate more slowly because of less frequent interaction.

A parallel processing system functions by simultaneously processing data to expedite task completion. For instance, the system may retrieve the next instruction from memory while the current instruction is processed by the CPU's arithmetic-logic unit (ALU). The primary objective of parallel processing is to augment a computer's processing power and enhance throughput, the volume of work achievable within a given timeframe. A parallel processing system can employ multiple functional units to concurrently execute similar or dissimilar activities.

In simpler terms, dividing the workload streamlines tasks. This division can occur
within different processors in the same computer or across distinct computers
interconnected by a computer network. A computer scientist typically employs
software tools to dissect a complex task, assigning each segment to a processor. Each
processor independently performs its designated portion, and a software tool
reconstructs the data for reading the answer or executing the operation.

Each CPU, while accessing data from the computer's memory, functions concurrently and executes parallel tasks as instructed. Communication and data-value tracking are handled through software. Once the tasks complete, the software reunites the fragmented data, assuming all processors remain synchronized. If computers are networked into a cluster, parallel computing can be achieved even when each individual machine has only one processor.
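Here is a minimal sketch of this divide-process-reassemble pattern, using POSIX threads (an assumption; the original text names no specific threading API). The array is split into equal chunks, each thread sums its chunk independently, and the main thread reassembles the partial sums. Compile with a flag such as -pthread.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define LEN      1000000

static int data[LEN];

struct chunk { int start, end; long sum; };

/* Each worker sums its own slice; no communication until the end. */
static void *worker(void *arg) {
    struct chunk *c = arg;
    c->sum = 0;
    for (int i = c->start; i < c->end; i++)
        c->sum += data[i];
    return NULL;
}

int main(void) {
    for (int i = 0; i < LEN; i++) data[i] = 1;   /* expected total: LEN */

    pthread_t tid[NTHREADS];
    struct chunk parts[NTHREADS];
    int step = LEN / NTHREADS;

    /* Divide: give each thread an equal slice of the array. */
    for (int t = 0; t < NTHREADS; t++) {
        parts[t].start = t * step;
        parts[t].end   = (t == NTHREADS - 1) ? LEN : (t + 1) * step;
        pthread_create(&tid[t], NULL, worker, &parts[t]);
    }

    /* Reassemble: wait for every thread and combine the partial sums. */
    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += parts[t].sum;
    }
    printf("total = %ld\n", total);   /* prints 1000000 */
    return 0;
}
```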

Types of Parallel Processing


Parallel processing encompasses several types, including SISD, SIMD, MISD, MIMD, SPMD, and Massively Parallel Processing (MPP), with SIMD standing out as one of the most prevalent. Let's explore these parallel processing types and their functionalities:

1. Single Instruction, Single Data (SISD)

 In SISD computing, a single processor executes a single instruction stream against a single data source.
 It mirrors the structure of a typical serial computer, with a control unit, processing unit, and memory unit.
 Instructions are executed sequentially, with the potential for parallel processing depending on the system's configuration.
 SISD systems may feature multiple functional units, allowing for pipeline processing or the use of several units to achieve a degree of parallelism.

2. Multiple Instruction, Single Data (MISD)

 MISD systems employ multiple processors sharing the same input data while executing different algorithms.
 Several operations can be performed simultaneously on the same batch of data, depending on the number of available processors.
 Each processor in MISD operates independently with its own instructions while working on the same data stream.
3. Single Instruction, Multiple Data (SIMD)

 SIMD architecture involves multiple processing elements executing the identical instruction, each on its own set of data.
 The same algorithm is applied to multiple data sets simultaneously.
 A single control unit supervises all processing components, ensuring synchronization while each processor works on a distinct data set; the sketch below shows a tiny example.
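As a concrete taste of SIMD, most x86 processors expose it through SSE instructions. The minimal C sketch below (assuming an x86 CPU and a compiler that provides <immintrin.h>) applies one add instruction to four pairs of floats at once:

```c
#include <immintrin.h>   /* x86 SSE intrinsics */
#include <stdio.h>

int main(void) {
    float x[4] = {1, 2, 3, 4};
    float y[4] = {10, 20, 30, 40};
    float out[4];

    __m128 a = _mm_loadu_ps(x);      /* load 4 floats into one register  */
    __m128 b = _mm_loadu_ps(y);
    __m128 c = _mm_add_ps(a, b);     /* ONE instruction, FOUR additions  */
    _mm_storeu_ps(out, c);

    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]); /* 11 22 33 44 */
    return 0;
}
```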

4. Multiple Instruction, Multiple Data (MIMD)

 MIMD computers feature multiple processors, each capable of independently accepting its own instruction stream.
 Each CPU processes data from a distinct data stream, enabling the simultaneous execution of multiple tasks.
 Developing sophisticated algorithms for MIMD computers is challenging, and interactions between processors are managed through shared data areas.

5. Single Program, Multiple Data (SPMD)

 SPMD systems, a subset of MIMD, involve each processor executing the same set of instructions on its own portion of the data.
 Common in distributed-memory computer systems, SPMD relies on message-passing programming.
 Individual nodes, forming a distributed-memory computer, use send/receive routines to communicate and synchronize; a minimal sketch follows this list.
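For a concrete taste of SPMD message passing, here is a hedged sketch using MPI (a common message-passing library; choosing it here is an assumption, since the original text names no specific one). Every process runs the same program; rank 0 collects a value from each of the others via receive calls.

```c
#include <mpi.h>
#include <stdio.h>

/* Compile with mpicc; run e.g.: mpirun -np 4 ./spmd */
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* who am I?       */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many of us? */

    int value = rank * rank;                /* each node computes its piece */

    if (rank == 0) {
        long total = value;
        for (int src = 1; src < size; src++) {
            int v;
            MPI_Recv(&v, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);    /* receive each partial result */
            total += v;
        }
        printf("sum of squares of ranks = %ld\n", total);
    } else {
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```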
6. Massively Parallel Processing (MPP)

 MPP involves a storage structure that coordinates program operations across numerous processors.
 Each CPU operates with its own operating system and memory, allowing coordinated processing of different program sections.
 MPP databases excel at handling vast amounts of data and delivering rapid analyses based on extensive datasets.

Parallel Processing Examples


Parallel processing, or parallel computing, plays a crucial role in various fields,
providing significant benefits in terms of computational power and efficiency. Here
are some notable examples.

1. Supercomputers for use in astronomy

 Supercomputers with parallel processing capabilities are instrumental in astrophysics.
 Computer simulations are used to study slow astronomical phenomena, such as star collisions and galaxy mergers, which unfold over millions of years.
 Recent advancements in understanding black holes were made possible by parallel supercomputers, helping to solve long-standing mysteries and deepen our comprehension of these phenomena.

2. Making predictions in agriculture

 Parallel processing is employed in agriculture to make accurate predictions related to essential crops.
 The U.S. Department of Agriculture uses parallel processing on the Blue Waters supercomputer at the University of Illinois to calculate supply and demand ratios for various crops.
 Advanced data, including crop growth estimates, seasonal climate data, and satellite data, enhances these predictions and outperforms industry-standard forecasts.

3. Risk calculations and cryptocurrencies in banking

 Parallel processing is integral to banking processes, including credit scoring, risk modeling, and fraud detection.
 JPMorgan Chase adopted hybrid GPU-CPU processing in 2011, resulting in a 40% improvement in accuracy and significant cost savings.
 Cryptocurrency operations, such as Bitcoin mining and blockchain transactions, rely heavily on parallel processing for efficient and secure execution.

4. Video post-production effects

 High-budget film releases, like "Ad Astra" and "John Wick," utilize parallel
processing for post-production special effects.
 Hollywood-standard post-production facilities, including Blackmagic Design’s
DaVinci Resolve Studio, leverage GPU-accelerated parallel processing for
advanced rendering, 3D animation, and color correction.

8085 Microprocessor:
The 8085 microprocessor, developed by Intel in 1976 using NMOS technology, is the precursor to the 8086 microprocessor. Featuring an 8-bit data bus and a 16-bit address bus, it operates from a single +5V supply at a clock frequency of about 3 MHz. Equipped with an internal clock generator, it runs on a clock with a 50% duty cycle. The 8085 provides 246 operation codes and accommodates 80 instructions within its architecture.
8086 Microprocessor:
Introduced by Intel in 1978, the 8086 microprocessor is an advanced successor to the 8085. Identified by the IC number 8086, this microprocessor is designed as a 16-bit system. It features a 16-bit data bus, allowing it to read or write either 16 bits or 8 bits of data at a time. With 20 address lines, the 8086 can access 2^20 address locations, i.e., 1 MB of memory. It operates in two modes, Maximum mode and Minimum mode, is limited to executing fixed-point arithmetic instructions, and does not support floating-point operations.

A side-by-side comparison of the 8085 and 8086 microprocessors:

 Data bus: 8 bits on the 8085; 16 bits on the 8086.
 Address bus: 16 bits on the 8085; 20 bits on the 8086.
 Memory capacity: 64 KB on the 8085 versus 1 MB on the 8086. The 8085 can operate directly on values up to 2^8 = 256; anything larger must be handled in multiple passes over its 8-bit data bus. The 8086 can operate on values up to 2^16 = 65,536.
 Input/output port addresses: 8 bits on the 8085; 16 bits on the 8086.
 Operating frequency: about 3 MHz on the 8085; 5 MHz, 8 MHz, or 10 MHz on the 8086.
 Modes of operation: the 8085 has a single mode; the 8086 has two, Minimum mode (single-processor systems) and Maximum mode (multiprocessor systems).
 Multiplication and division: the 8085 has no such instructions; the 8086 has both.
 Pipelining: unsupported on the 8085; supported on the 8086 thanks to its two independent units, the Execution Unit (EU) and the Bus Interface Unit (BIU).
 Instruction queue: absent on the 8085; present on the 8086.
 Memory segmentation: none on the 8085; the 8086's memory space is segmented.
 Flags: the 8085 has 5 (Sign, Zero, Auxiliary Carry, Parity, and Carry); the 8086 has 9 (those five plus Overflow, Direction, Interrupt, and Trap).
 Cost: the 8085 is a low-cost microprocessor; the 8086 is comparatively expensive.
 Addressing modes: 5 on the 8085; 11 on the 8086.
 Concurrency in fetching, decoding, and execution: none on the 8085; present on the 8086 because of the instruction queue.
 Transistor count: roughly 6,500 on the 8085; roughly 29,000 on the 8086.
 Register model: the 8085 is accumulator-based, with the accumulator central to ALU operations; the 8086 is general-purpose-register (GPR) based, with no dedicated accumulator attached to the ALU input and all GPRs connected to it via the bus.
 Arithmetic: the 8085 supports integer, decimal, and hexadecimal arithmetic; the 8086 additionally supports ASCII arithmetic.

Conclusion
In conclusion, the 8085 and 8086 microprocessors represent two important milestones in the development of modern computing. While both microprocessors were designed by Intel Corporation and used in a variety of applications, they differ in several key areas.

The 8085 microprocessor is an 8-bit microprocessor with a clock speed of about 3 MHz, while the 8086 microprocessor is a 16-bit microprocessor with a clock speed of 5 MHz. The 8086 has a larger address bus and can access up to 1 megabyte of memory, compared to the 64 kilobytes that the 8085 can address. The 8086 also introduced several new instructions and addressing modes, which made it more versatile and efficient than the 8085. While the 8086 was not binary compatible with the 8085, its instruction set was designed so that 8085 assembly programs could be mechanically translated to run on the 8086.

2. Detailed discussion on architectures

What is RISC?
Reduced Instruction Set Computing (RISC) is a computer architecture that
emphasizes a simple and efficient instruction set. RISC processors have a
smaller instruction set than CISC processors, with each instruction
performing a single operation. The goal of RISC architecture is to reduce the
amount of work the processor needs to do for each instruction, which leads to
faster and more efficient processing.
RISC processors often use pipelining to achieve greater performance. Pipelining breaks the execution of an instruction into smaller stages, so multiple instructions can be in flight simultaneously. This reduces the overall execution time of a program, as each stage of the pipeline can be devoted to a different instruction.

Example: RISC processors include the ARM, MIPS, and PowerPC architectures. The ARM architecture is used in many smartphones and tablets, while the MIPS architecture is commonly used in embedded systems such as routers and set-top boxes. The PowerPC architecture was used in Apple's Power Macintosh computers before the switch to Intel processors.
Advantages of RISC:

 Simplified instruction set leads to faster processing
 Pipelining can increase performance
 Lower power consumption
 Smaller chip size, which can lead to cost savings

Disadvantages of RISC:

 Programs may require more instructions to complete a task than with CISC
 Limited ability to perform complex instructions

What is CISC?
Complex Instruction Set Computing (CISC) is a computer architecture that emphasizes a large and complex instruction set. CISC processors have many instructions that can perform multiple operations in a single instruction. The goal of CISC architecture is to reduce the number of instructions a program needs to execute, which can lead to faster program execution.

CISC processors typically have more extensive hardware support for performing complex instructions. This allows more sophisticated operations to be performed in a single instruction, which can lead to faster program execution. However, the increased complexity can also lead to slower processing times.

Example: CISC processors include the x86 architecture used in most desktop and laptop computers today. The x86 architecture includes instructions that can perform complex tasks such as string manipulation, as well as instructions that can access and modify system memory directly.
Advantages of CISC:

 Ability to perform complex instructions

 Programs require fewer instructions to execute

 Greater hardware support for performing complex instructions

Disadvantages of CISC:

 Increased complexity can lead to slower processing times

 Larger chip size can lead to increased costs

RISC vs CISC: A Comparison

While both RISC and CISC have their advantages and disadvantages, the choice between them ultimately depends on the application. RISC is ideal for applications that require fast and efficient processing, such as mobile devices and embedded systems. CISC is better suited for applications that require complex operations, such as video and image processing.
Another factor to consider is the trend towards hybrid architectures, which
combine the benefits of RISC and CISC. These architectures use RISC-like
designs for the CPU core but incorporate CISC-like features to support
complex instructions. Examples of hybrid architectures include Intel's x86-64
architecture and ARM's Cortex-A series.
Conclusion:
In conclusion, RISC and CISC architectures are two major instruction set
architectures used in modern processors. RISC architectures have a simpler
instruction set and are ideal for mobile devices and other applications where
space is limited. CISC architectures have a more complex instruction set and
are more versatile, but can be more difficult to optimize for performance.
Understanding the differences between RISC and CISC architectures is
important for anyone interested in computer architecture.
