
Assignment No 3 Embedded

Circular buffers allow data to be stored and processed in a continuous loop. They use a fixed-size buffer that is connected end-to-end like a ring. The TMS320C54x DSP uses modified Harvard architecture with separate program and data memory and powerful parallel processing capabilities. It has a 40-bit ALU, 17x17 multiplier, barrel shifter, and other components. Memory is divided into program, data, and I/O spaces, and peripherals include timers, serial ports, and DMA. Circular buffers are configured using registers for start address, size, and mode. The block repeat register controls instruction repetition, while interrupt and processor mode status registers control interrupts and memory mapping.

Uploaded by

satinder singh

EC 306 EMBEDDED SYSTEMS NAME : SATINDER SINGH

Assignment No: 02 ROLL NUMBER: 2K17/EC/150

Assignment No: 03
1. Explain Circular Addressing in detail.
ANSWER:
DSP operations are typically computations involving an infinite stream of real-time data. The data is
accumulated into a buffer, and the oldest sample is overwritten by the newest sample. The block size
for this circular buffer should be specified and reserved in physical memory. The memory basically has
a round-robin FIFO instantiated within it. Access is provided by a base address for the memory, and a
system of pointers.
Circular buffers are used when data samples arrive continuously and are processed sequentially. In a
circular buffer the data samples are stored sequentially from the initial location until the buffer fills up. Once
the buffer is full, the next data samples are stored starting again from the initial location. This process
can continue indefinitely as long as the data samples are processed at a rate faster than the incoming data rate.
Circular addressing mode requires three registers, namely:
a) Pointer register to hold the current location (PNTR)
b) Start Address Register to hold the starting address of the buffer (SAR)
c) End Address Register to hold the ending address of the buffer (EAR)
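There are four special cases in this addressing mode:
a. SAR < EAR and updated PNTR > EAR
b. SAR < EAR and updated PNTR < SAR
c. SAR > EAR and updated PNTR > SAR
d. SAR > EAR and updated PNTR < EAR
The buffer length in the first two cases is (EAR - SAR + 1), whereas in the last two cases it is (SAR - EAR + 1).
In the pointer-updating algorithm for circular addressing, the pointer is first advanced by the increment. If the
updated value overshoots the buffer (cases a and c), the buffer length is subtracted from it; if it undershoots
(cases b and d), the buffer length is added, so the pointer always lands back inside the buffer.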
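The four cases can be sketched in C as follows. This is only an illustrative model, not the hardware implementation: the function names are hypothetical, and it assumes the step is smaller in magnitude than the buffer length.

```c
#include <stdint.h>

/* Buffer length for either ordering of SAR and EAR. */
static int32_t circ_length(int32_t sar, int32_t ear)
{
    return (sar < ear) ? (ear - sar + 1) : (sar - ear + 1);
}

/* Add step to PNTR and wrap the result back into the buffer.
 * Assumes |step| is smaller than the buffer length. */
static int32_t circ_update(int32_t sar, int32_t ear, int32_t pntr, int32_t step)
{
    int32_t lo  = (sar < ear) ? sar : ear;   /* lowest address in the buffer  */
    int32_t hi  = (sar < ear) ? ear : sar;   /* highest address in the buffer */
    int32_t len = circ_length(sar, ear);
    int32_t updated = pntr + step;

    if (updated > hi)        /* cases a and c: overshoot  */
        updated -= len;
    else if (updated < lo)   /* cases b and d: undershoot */
        updated += len;
    return updated;
}
```

For example, with SAR = 100 and EAR = 103, stepping PNTR = 102 forward by 2 wraps it to 100.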
2. Discuss Arithmetic Logical Unit of TMS320C54x in detail.
ANSWER:
• The ALU performs 2's complement arithmetic operations and bit-level Boolean operations on 16-,
32-, and 40-bit words.
• It can also function as two separate 16-bit ALUs and perform two 16-bit operations simultaneously.

• ACCUMULATORS
1. Accumulators A and B store the output from the ALU or the multiplier/adder block.
2. They also provide the second input to the ALU, and accumulator A can be an input to the multiplier
block.
3. Either accumulator can be used as temporary storage for the other.
4. Each accumulator is divided into:
4.1 Guard bits (bits 39-32)
4.2 High-order word (bits 31-16)
4.3 Low-order word (bits 15-0)
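The accumulator partition above can be illustrated with a small C sketch. It assumes the 40-bit accumulator is held in the low 40 bits of a 64-bit integer; the helper names are hypothetical and are not TI identifiers.

```c
#include <stdint.h>

#define ACC_MASK 0xFFFFFFFFFFULL   /* low 40 bits     */
#define ACC_SIGN (1ULL << 39)      /* sign bit, bit 39 */

/* Guard bits: bits 39-32. */
static uint8_t acc_guard(uint64_t acc) { return (uint8_t)((acc >> 32) & 0xFFu); }

/* High-order word: bits 31-16. */
static uint16_t acc_high(uint64_t acc) { return (uint16_t)((acc >> 16) & 0xFFFFu); }

/* Low-order word: bits 15-0. */
static uint16_t acc_low(uint64_t acc) { return (uint16_t)(acc & 0xFFFFu); }

/* Interpret the 40-bit pattern as a signed 2's complement value. */
static int64_t acc_to_int64(uint64_t acc)
{
    acc &= ACC_MASK;
    return (int64_t)(acc ^ ACC_SIGN) - (int64_t)ACC_SIGN;
}
```

The eight guard bits give headroom above 32 bits, so a long run of 32-bit products can be summed before the accumulator overflows.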
• BARREL SHIFTER
1. It provides the capability to scale the data during an operand read or write.
2. It has a 40-bit input connected to the accumulators or to data memory (using CB or DB)
and a 40-bit output connected to the ALU or to data memory (using EB).
3. It produces a left shift of 0 to 31 bits and a right shift of 0 to 16 bits on the input data.
4. The shift count is defined in the shift-count field of the instruction, the shift-count
field (ASM) of status register ST1, or the temporary register T.
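A rough C model of the shift range described above is given below. It assumes a signed shift count (negative for right shifts), sign-extending right shifts, and a result that simply wraps to 40 bits; the real device's sign-extension and saturation behaviour depends on status bits that this sketch ignores. The names are hypothetical.

```c
#include <stdint.h>

#define ACC_BITS_MASK 0xFFFFFFFFFFULL   /* low 40 bits */
#define ACC_SIGN_BIT  (1ULL << 39)

/* Shift counts the text allows: left 0..31, right 0..16 (negative here). */
static int shift_count_valid(int count)
{
    return count >= -16 && count <= 31;
}

/* Shift a sign-extended 40-bit value and wrap the result to 40 bits.
 * Assumes shift_count_valid(count); a negative count means a right shift. */
static int64_t barrel_shift(int64_t in, int count)
{
    uint64_t u = (uint64_t)in;

    if (count >= 0)
        u <<= count;
    else if (in < 0)
        u = ~(~u >> -count);     /* arithmetic right shift, done portably */
    else
        u >>= -count;

    u &= ACC_BITS_MASK;                                          /* keep 40 bits */
    return (int64_t)(u ^ ACC_SIGN_BIT) - (int64_t)ACC_SIGN_BIT;  /* sign-extend  */
}
```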
• The multiplier has two inputs:
– One selected from T, a data memory operand, or accumulator A
– One selected from program memory, data memory, accumulator A, or an immediate value
• The fast on-chip multiplier lets the DSP perform operations such as convolution, correlation and filtering efficiently.
• The multiplier and ALU together execute multiply/accumulate (MAC) computations and ALU operations in parallel
in a single instruction cycle.
• This function is used in determining the Euclidean distance and in implementing symmetrical
and LMS filters, which are required for complex DSP algorithms.
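The two uses mentioned above, a sum of products and a Euclidean distance, can be sketched in plain C. This shows the arithmetic only: a 64-bit integer stands in for the 40-bit accumulator, nothing here runs in parallel, and the function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum of products: one multiply/accumulate per loop iteration. */
static int64_t mac_sum(const int16_t *x, const int16_t *h, size_t n)
{
    int64_t acc = 0;                       /* stands in for a 40-bit accumulator */
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)h[i];
    return acc;
}

/* Squared Euclidean distance: subtract (ALU), then square and accumulate (MAC). */
static int64_t squared_distance(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        acc += (int64_t)d * d;
    }
    return acc;
}
```

On the DSP the subtraction and the multiply/accumulate of each iteration would be issued together in one cycle; the loop above only reproduces the result.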

3. Explain in detail the internal organization of TMS320C54x.


ANSWER:
• This DSP uses a modified Harvard architecture.
• It provides a high degree of parallelism because separate program and data spaces allow
simultaneous access to program instructions and data.
• It has one program memory bus and three data memory buses.
• The architecture of the TMS320C54x comprises:
• CPU, which contains:
1.1 40-bit ALU
1.2 Two 40-bit accumulators
1.3 Barrel shifter
1.4 17× 17-bit multiplier
1.5 40-bit adder
1.6 Compare, select and store unit(CSSU)
1.7 Data address generation unit(DAGEN)
• The DSP offers 7 basic data addressing modes:
– Immediate addressing
– Absolute addressing
– Accumulator addressing
– Direct addressing
– Indirect addressing
– Memory-mapped register addressing
– Stack addressing
• During the execution of direct, indirect or memory-mapped register addressing, the DAGEN
computes the addresses of data-memory operands.
1.8 Program address generation unit (PAGEN)
• Program memory is usually addressed with the program counter (PC).
• The PC is loaded by the PAGEN, which increments the PC as sequential instructions are fetched.
• The PAGEN may load the PC with a non-sequential value as a result of some instructions or other
operations (branches, calls, returns, conditional operations, single/multiple instruction repeats,
reset, and interrupts).
• For calls and interrupts:
– The current PC is saved onto the stack, which is referenced by the stack pointer (SP).
– When the interrupt service routine is finished, the PC value on the stack is restored by the return
instruction.
• Memory
1. Memory organized into 3 individually selectable spaces:
1.1. Program
1.2. Data
1.3. I/O space
2. DSP can contain RAM & ROM
3. ROM:
3.1. Is part of program memory space & sometimes data memory space.
3.2. Contains a bootloader that is useful for booting to faster on-chip or external RAM
4. RAM:
4.1. Dual-access type (DARAM)
4.2. Single-access type (SARAM)
4.3. Two-way shared RAM
• On-chip peripherals
1. All C54x devices have a common CPU but different on-chip peripherals.
2. On-chip peripheral options:
2.1. General-purpose I/O pins
2.2. Software-programmable wait-state generator
2.3. Programmable bank-switching logic
2.4. Clock generator
2.5. Timer
2.6. Direct memory access (DMA) controller
2.7. Standard serial port
2.8. Time-division multiplexed (TDM) serial port
2.9. Buffered serial port (BSP)
2.10. Multichannel buffered serial port (McBSP)
2.11. Host-port interface (8-bit standard (HPI), 8-bit enhanced (HPI8), 16-bit enhanced (HPI16))

4. Write short notes on:


i. Circular Buffer size register
ANSWER:
A circular buffer, circular queue, cyclic buffer or ring buffer is a data structure that uses a single, fixed-
size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data
streams.
Circular buffers are useful in DSP programming because most implementations include a loop of some
sort. In a filter, for example, all the coefficients are processed, and the coefficient pointer is then reset
when the loop finishes. With circular buffering, the coefficient pointer automatically wraps
around to the beginning when the end of the buffer is reached, so the time needed to
update the pointer is saved. Setting up a circular buffer usually involves writing to registers that tell
the DSP the buffer start address and the buffer size, and setting a bit that tells the DSP to use circular
addressing. On the C54x, the buffer size is held in the circular-buffer size register (BK).
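A size-register-based wrap can be sketched as below. The model is an assumption based on how C54x-style circular addressing is usually described: a buffer of size BK starts on a 2^N-word boundary, where N is the smallest integer with 2^N > BK, and the low N bits of the pointer act as the index. The function names are hypothetical, and the sketch assumes BK > 0 and a step no larger in magnitude than BK.

```c
#include <stdint.h>

/* Smallest N such that 2^N > bk. */
static unsigned circ_index_bits(uint16_t bk)
{
    unsigned n = 0;
    while ((1u << n) <= bk)
        n++;
    return n;
}

/* Advance pointer arx by step inside a circular buffer of size bk. */
static uint16_t circ_step(uint16_t arx, uint16_t bk, int step)
{
    unsigned n    = circ_index_bits(bk);
    uint16_t mask = (uint16_t)((1u << n) - 1u);
    uint16_t base = (uint16_t)(arx & (uint16_t)~mask);  /* low N bits zeroed */
    int index     = (int)(arx & mask) + step;           /* position in buffer */

    if (index >= (int)bk)
        index -= bk;         /* wrapped past the end   */
    else if (index < 0)
        index += bk;         /* wrapped past the start */
    return (uint16_t)(base | (uint16_t)index);
}
```

With a 5-word buffer based at 0x0100, stepping forward from 0x0104 returns to 0x0100 without any explicit pointer reset in the loop.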

ii. Block-repeat register


ANSWER:
The block-repeat registers are RPTC, BRCR, PASR and PAER; all are 16 bits wide.
• The repeat counter register (RPTC) holds the repeat count in a repeat single-instruction operation
and is loaded by the RPT and RPTZ instructions.
• The block repeat counter register (BRCR) holds the count value for the block repeat feature. This
value is loaded before a block repeat operation is initiated.
• The block repeat program address start register (PASR) holds the 16-bit address where the
repeated block of code starts.
• The block repeat program address end register (PAER) holds the 16-bit address where the
repeated block of code ends.
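How the three block-repeat registers interact can be sketched with a toy program-counter loop. It assumes every instruction occupies one word and that the block executes BRCR + 1 times, which is the usual convention but should be checked against the device manual. The function name is hypothetical.

```c
#include <stdint.h>

/* Count instruction fetches for a block [pasr, paer] repeated via brcr.
 * Returns 0 if the block is malformed (start after end). */
static unsigned long block_repeat_fetches(uint16_t pasr, uint16_t paer, uint16_t brcr)
{
    unsigned long fetched = 0;
    uint16_t pc = pasr;

    if (pasr > paer)
        return 0;

    for (;;) {
        fetched++;                 /* "execute" the one-word instruction at pc */
        if (pc == paer) {
            if (brcr == 0)
                break;             /* final pass done: fall out of the block */
            brcr--;                /* one more pass remaining */
            pc = pasr;             /* jump back without a branch instruction */
        } else {
            pc++;
        }
    }
    return fetched;
}
```

A four-instruction block with BRCR = 2 is fetched 12 times in this model, with no loop-control instructions inside the block.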

iii. Interrupt register


ANSWER:
The interrupt registers comprise the interrupt mask register and the interrupt flag register.
• The interrupt mask register provides individual control of each interrupt source to the CPU.
• The interrupt flag register records which interrupt sources have a request pending.
• The global interrupt mask (or enable) bit provides a master switch to turn all interrupts on and
off. This bit is usually enabled once by the programmer at the beginning of the program.
During interrupt processing, the bit is switched off by the interrupt-processing logic and switched back
on by the return-from-interrupt instruction that ends the ISR. This prevents an ISR
from being preempted. The user can override this by re-enabling global interrupts inside the ISR.
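The interplay of mask, flag and global enable can be sketched as follows. The struct and function names are hypothetical; the priority order (lowest bit first) and the clearing of the flag bit on entry are assumptions made for the illustration, not statements about a particular device.

```c
#include <stdint.h>

typedef struct {
    uint16_t imr;            /* interrupt mask register: 1 = source enabled   */
    uint16_t ifr;            /* interrupt flag register: 1 = request pending  */
    int      global_enable;  /* master switch: nonzero = interrupts allowed   */
} irq_regs;

/* Return the lowest-numbered pending and enabled source, or -1 if none. */
static int irq_select(uint16_t ifr, uint16_t imr, int global_enable)
{
    uint16_t active = (uint16_t)(ifr & imr);

    if (!global_enable)
        return -1;
    for (int bit = 0; bit < 16; bit++)
        if (active & (1u << bit))
            return bit;
    return -1;
}

/* Enter an ISR: clear the flag and switch global interrupts off. */
static int irq_enter(irq_regs *r)
{
    int n = irq_select(r->ifr, r->imr, r->global_enable);

    if (n < 0)
        return -1;
    r->ifr = (uint16_t)(r->ifr & ~(1u << n));
    r->global_enable = 0;    /* prevents the ISR from being preempted */
    return n;
}

/* Return from interrupt: switch global interrupts back on. */
static void irq_return(irq_regs *r)
{
    r->global_enable = 1;
}
```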
iv. Processor mode status register
ANSWER:

IPTR: Interrupt vector pointer; points to the 128-word program page where the interrupt vectors reside.
MP/MC: Microprocessor/microcomputer mode.
MP/MC = 0: the on-chip ROM is enabled (microcomputer mode).
MP/MC = 1: the on-chip ROM is not available (microprocessor mode).
OVLY: RAM overlay; enables on-chip dual-access data RAM blocks to be mapped into program
space.
AVIS: Address visibility; enables or disables visibility of the internal program address at the address pins.
DROM: Data ROM; enables on-chip ROM to be mapped into data space.
CLKOFF: CLKOUT off; when set, the CLKOUT output is disabled.
SMUL: Saturation on multiplication.
SST: Saturation on store.
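The fields listed above can be unpacked from a 16-bit register value as sketched below. The bit positions (IPTR in bits 15-7, then MP/MC, OVLY, AVIS, DROM, CLKOFF, SMUL, SST in bits 6 down to 0) are an assumption about the C54x layout and should be checked against the device data sheet; the type and function names are hypothetical. The vector-address helper assumes 32 vectors of four words each, which matches the 128-word page.

```c
#include <stdint.h>

typedef struct {
    unsigned iptr;    /* bits 15-7: interrupt vector pointer */
    unsigned mp_mc;   /* bit 6 */
    unsigned ovly;    /* bit 5 */
    unsigned avis;    /* bit 4 */
    unsigned drom;    /* bit 3 */
    unsigned clkoff;  /* bit 2 */
    unsigned smul;    /* bit 1 */
    unsigned sst;     /* bit 0 */
} pmst_fields;

/* Split a raw PMST value into its fields (assumed layout, see above). */
static pmst_fields pmst_decode(uint16_t pmst)
{
    pmst_fields f;
    f.iptr   = (pmst >> 7) & 0x1FFu;
    f.mp_mc  = (pmst >> 6) & 1u;
    f.ovly   = (pmst >> 5) & 1u;
    f.avis   = (pmst >> 4) & 1u;
    f.drom   = (pmst >> 3) & 1u;
    f.clkoff = (pmst >> 2) & 1u;
    f.smul   = (pmst >> 1) & 1u;
    f.sst    = pmst & 1u;
    return f;
}

/* Address of vector number vec (0..31) inside the page selected by IPTR. */
static uint16_t vector_address(uint16_t pmst, unsigned vec)
{
    return (uint16_t)((pmst & 0xFF80u) | ((vec & 0x1Fu) << 2));
}
```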

5. Explain these parts of TMS320C55x in detail:


i. CPU
ANSWER:
The C55x CPU is responsible for performing the digital signal processing tasks required by the
application. In addition, the CPU acts as the overall system controller, responsible for handling
many system functions such as system-level initialization, configuration, user interface, user
command execution, connectivity functions, and overall system control.
Tightly coupled to the CPU are the following components:
• DSP internal memories
– Dual-access RAM (DARAM)
– Single-access RAM (SARAM)
– Read-only memory (ROM)
• Ports and buses
The CPU also manages and controls all peripherals on the device. Refer to the
device-specific data manual for the full list of peripherals.
Figure 1-1 shows the functional block diagram of the DSP and how it connects to the rest of the
device. The DSP architecture uses the switched central resource (SCR) to transfer data within the
system.

The functions of internal data and address buses are as follows:

○ Data-Read Data Buses (BB, CB, DB): These three buses carry 16-bit data from data space or I/O
space to functional units of the CPU. BB only carries data from internal memory to the D unit (primarily
to the dual multiply-and-accumulate (MAC) unit).
○ Data-Read Address Buses (BAB, CAB, DAB): These three buses carry 23-bit word data addresses
to the memory interface unit, which then fetches the data from memory and transfers the requested
values to the data-read data buses.
○ Program-Read Data Bus (PB): PB carries 32 bits (4 bytes) of program code at a time to the I unit,
where instructions are decoded.
○ Program-Read Address Bus (PAB): PAB carries the 24-bit byte program address of the program
code that is carried to the CPU by PB.
○ Data-Write Data Buses (EB, FB): These two buses carry 16-bit data from functional units of the
CPU to data space or I/O space. EB and FB receive data from the P unit, the A unit, and the D unit.
○ Data-Write Address Buses (EAB, FAB): These two buses carry 23-bit addresses to the memory
interface unit, which then receives the values driven on the data-write data buses.

ii. Program flow unit


ANSWER:
• The PU receives instructions from the IU, generates all program-space addresses, and controls the
sequence of instructions by:
a. Interpreting conditions for conditional instructions
b. Determining branch (go-to) addresses
• The PU initiates interrupt servicing, manages single-repeat and block-repeat operations, and manages the
execution of parallel instructions.
● The program control logic accepts immediate values from the I unit and test results from the A unit or the
D unit and performs the following actions:
○ Tests whether a condition is true for a conditional instruction and communicates the result to the
program-address generation logic.
○ Initiates interrupt servicing when an interrupt is requested and properly enabled.
○ Controls the repetition of a single instruction preceded by a single-repeat instruction, or a block of
instructions preceded by a block-repeat instruction.
○ Manages instructions that are executed in parallel. Parallelism within the C55x DSP enables the
execution of program-control instructions at the same time as data processing instructions.

● The main registers of the Program Flow Unit are:


○ PC - Program counter
○ RETA - Return address register
○ CFCT - Control flow context register

iii. Address-Data flow unit


ANSWER:
• The AU contains the DAGEN and all the registers needed to generate addresses for reads from and writes to
data space.
• There are eight auxiliary registers used as address pointers, plus a coefficient data pointer register.
• The DAGEN supports both linear and circular addressing; the registers for circular addressing are also
managed by the AU.
• It uses a 16-bit ALU for addition, subtraction, comparison, arithmetic/logical shifts, etc.
• It includes four general-purpose temporary registers.
• Its ALU can perform simpler operations in parallel with the DU.

iv. Data Computation


ANSWER: It includes
• A 40-bit barrel shifter, a 40-bit ALU, two MAC units, and four 40-bit accumulators.
• The 40-bit ALU performs addition, subtraction, comparison, rounding, saturation, Boolean logic
operations and absolute-value calculations.
• The ALU can perform two arithmetic operations simultaneously when a dual 16-bit instruction is
executed.
• Each accumulator is partitioned into a low word, a high word and 8 guard bits.
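The idea of a dual 16-bit operation can be shown with a short C sketch: two independent 16-bit additions packed in one 32-bit word, with no carry passing from the low half into the high half. The function name is hypothetical and the sketch ignores saturation and status flags.

```c
#include <stdint.h>

/* Add the high halves and the low halves of a and b independently. */
static uint32_t dual_add16(uint32_t a, uint32_t b)
{
    uint16_t hi = (uint16_t)((a >> 16) + (b >> 16));         /* high-word add */
    uint16_t lo = (uint16_t)((a & 0xFFFFu) + (b & 0xFFFFu)); /* low-word add  */
    return ((uint32_t)hi << 16) | lo;
}
```

For instance, adding 0x0001FFFF and 0x00010001 gives 0x00020000: the low half wraps to zero and the high half is unaffected by that overflow.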

6. Explain the significance of Partitioned Registers in TI C6X family.


ANSWER:
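• Many register ports are required to supply enough operands per cycle to the parallel functional units,
and memories with many ports are expensive. Partitioning the register file into two smaller files reduces
the number of ports each file needs.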
• The C6x processors are closer to traditional very long instruction word (VLIW) processors because
they seek to exploit the high levels of instruction-level parallelism (ILP) in many signal processing
algorithms. Superscalar designs excel on the desktop partly because they preserve binary code
compatibility. In the embedded space, code compatibility is less of a problem, so new
applications can be either hand-tuned or recompiled for the newest generation of processor. The
other reason superscalar excels on the desktop is that the compiler cannot predict memory
latencies at compile time. In embedded systems, however, memory latencies are often much more
predictable. In fact, hard real-time constraints force memory latencies to be statically predictable.
A superscalar processor would also perform well under these constraints, but
the extra hardware needed to schedule instructions dynamically is wasteful in terms of both precious chip
area and power consumption. Thus, VLIW is a natural choice for high-performance
embedded systems.
• The C6x family employs different pipeline depths depending on the family member. For the C64x,
for example, the pipeline has 11 stages. The first four stages of the pipeline perform instruction
fetch, followed by two stages for instruction decode, and finally up to five stages for instruction
execution.
• The C6x family’s execution stage is divided into two parts, the left or “1” side and the right or “2”
side. The L units (L1 and L2) perform logical and arithmetic operations. The D units, in contrast, perform a
subset of logical and arithmetic operations but also perform memory accesses (loads and stores).
The two M units perform multiplication and related operations (e.g., shifts). Finally, the S units
perform comparisons, branches, and some SIMD operations. Each side has its own 32-entry, 32-bit
register file (the A file for the 1 side, the B file for the 2 side). A side may access the other side’s
registers, but with a 1-cycle penalty. Thus, an instruction executing on side 1 may access B5, for
example, but it will take one extra cycle to execute.
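Therefore, partitioning the registers into the A and B files lets the functional units on each side read their
operands in parallel from their own file, without requiring one large and expensive multi-ported register file.
This helps the TI C6x family sustain the high instruction-level parallelism that makes it fast and efficient.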
