
Arm Instruction Set


Outline

1) Future Evolution of Information Technology
2) System-on-a-Chip Design
3) Design and Application of Cores
4) Analog and Mixed Signal Design (Prof. Berroth)
5) Test of Systems-on-a-Chip

Design and Application of Cores


Microprocessor Cores (ARM)
On-chip buses
A dedicated core (multimedia?)

Major Microprocessor Core Vendors


Hard Cores: ARM (Cambridge, UK) (32 bit)
Firm Cores: MIPS (Mountain View, CA) (64 bit)
Soft Cores: ARC (London, UK)
            Tensilica (Santa Clara, CA)
            SUN Microsystems (Mountain View, CA): picoJava-Core as synthesizable RTL

ARM System Design


History of ARM
ARM Instruction Set
Thumb Instruction Set
ARM Cores
ARM Cache Modeling
ARM CPUs
ARM Coprocessors
System Development (optional)

Steve Furber. The ARM System Architecture. AW. 1996.

Acorn - a Computer Manufacturer


1983: Acorn Limited: dominant position in the UK personal computer market with the Rockwell 6502 (8-bit) CPU.
1983: the available 16-bit CISC CPUs were slower than standard memory parts and had long interrupt latencies.
1983-85: Acorn designed the first commercial RISC CPU: the Acorn RISC Machine (ARM).
1990: Advanced RISC Machines (ARM) was formed to broaden the market beyond Acorn's product range.

ARM - Advanced RISC Machine


1990: Startup with 12 engineers and 1 CEO
No patents, no customers, very little money

Mid-1990s: T.I. licensed ARM7


Incorporated into a chip for mobile phones

IPO Spring 1998


13 millionaires

More Than CPU Core Development


"Design a circuit, license it and make millions" does not work!
Support
Training
Marketing
Development Tools
Design Consulting


Architectural Inheritance from Berkeley RISC I


Used:
    Load-store architecture
    Fixed-length 32-bit instructions
    3-address format
Rejected:
    Register windows
    Delayed branches
    Single-cycle execution of all instructions
Result: RISC with a few CISC features

ARM Assembly Language Programming


Agenda:
the ARM programmer's model
the ARM instruction set
writing simple programs
examples
ARM software development tools
hands-on: writing simple ARM assembly programs

The ARM programmer's model


ARM is a Reduced Instruction Set Computer (RISC); it has:
a large, regular register file
    any register can be used for any purpose
a load-store architecture
    instructions which reference memory just move data; they do no processing
    processing uses values in registers only
fixed-length 32-bit instructions

ARM register organization


Registers usable in user mode:
r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 (PC) CPSR

Banked registers, system modes only:
fiq mode:        r8_fiq r9_fiq r10_fiq r11_fiq r12_fiq r13_fiq r14_fiq SPSR_fiq
svc mode:        r13_svc r14_svc SPSR_svc
abort mode:      r13_abt r14_abt SPSR_abt
irq mode:        r13_irq r14_irq SPSR_irq
undefined mode:  r13_und r14_und SPSR_und

ARM CPSR format


bits 31-28  N Z C V
bits 27-8   unused
bits 7-6    I F
bits 4-0    mode

In user programs only the top 4 bits of the CPSR are significant:
N - the result was negative
Z - the result was zero
C - the result produced a carry out
V - the result generated an arithmetic overflow

ARM memory organization


Memory is a linear array of 2^32 byte locations.
ARM can address:
individual bytes
32-bit words on 4-byte boundaries
some ARM chips can address 16-bit half-words on 2-byte boundaries

[Figure: byte addresses 0 to 23 drawn four to a row, showing bytes 0 to 3 and 6, half-words at addresses 4, 12 and 14, and words at addresses 8 and 16]

The ARM instruction set


data processing instructions
data transfer instructions
control flow instructions
conditional execution
special instructions
memory faults
operating modes and exceptions
ARM architecture variants

Data processing instructions


All operands are 32 bits wide and either:
    come from registers, or
    are literals (immediate values) specified in the instruction

The result, if any, is 32 bits wide and goes into a register
    except long multiplies, which generate 64-bit results

All operand and result registers are independently specified

Data processing instructions


Example:
ADD r0, r1, r2 ; r0 := r1 + r2

Note: everything after the ; is a comment
    it is there solely for the programmer's convenience
the result register (r0) is listed first

Data processing instructions


Arithmetic operations:
ADD r0, r1, r2 ; r0 := r1 + r2
ADC r0, r1, r2 ; r0 := r1 + r2 + C
SUB r0, r1, r2 ; r0 := r1 - r2
SBC r0, r1, r2 ; r0 := r1 - r2 + C - 1
RSB r0, r1, r2 ; r0 := r2 - r1
RSC r0, r1, r2 ; r0 := r2 - r1 + C - 1

C is the C bit in the CPSR
the operation may be viewed as unsigned or 2's complement signed
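The six arithmetic operations can be sketched as a small Python model. This is an illustration only: the function name `arith` is hypothetical, results are assumed to wrap to 32 bits, and flag updates are not modelled.

```python
MASK = 0xFFFFFFFF  # results are truncated to 32 bits


def arith(op, r1, r2, c=0):
    """Return the 32-bit result of `<op> r0, r1, r2` given the carry flag c (0 or 1)."""
    results = {
        "ADD": r1 + r2,
        "ADC": r1 + r2 + c,
        "SUB": r1 - r2,
        "SBC": r1 - r2 + c - 1,
        "RSB": r2 - r1,
        "RSC": r2 - r1 + c - 1,
    }
    return results[op] & MASK
```

With C set, SBC behaves like a plain SUB; with C clear it subtracts one more, which is what lets a multi-word subtraction propagate a borrow.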

Data processing instructions


Bit-wise logical operations:
AND r0, r1, r2 ; r0 := r1 and r2
ORR r0, r1, r2 ; r0 := r1 or r2
EOR r0, r1, r2 ; r0 := r1 xor r2
BIC r0, r1, r2 ; r0 := r1 and not r2

the specified Boolean logic operation is performed on each bit from 0 to 31
BIC stands for 'bit clear'
    each 1 in r2 clears the corresponding bit in r1

Data processing instructions


Register movement operations:
MOV r0, r2 ; r0 := r2
MVN r0, r2 ; r0 := not r2

MVN stands for 'move negated'
there is no first operand (r1) specified as these are unary operations

Data processing instructions


Comparison operations:
CMP r1, r2 ; set cc on r1 - r2
CMN r1, r2 ; set cc on r1 + r2
TST r1, r2 ; set cc on r1 and r2
TEQ r1, r2 ; set cc on r1 xor r2

these instructions just affect the condition codes (N, Z, C, V) in the CPSR
    there is no result register (r0)

Data processing instructions


Immediate operands
the 2nd source operand (r2) may be replaced by a constant:
ADD r3, r3, #1   ; r3 := r3 + 1
AND r8, r7, #&ff ; r8 := r7[7:0]

# indicates an immediate value
    & indicates hexadecimal notation
allowed immediate values are (in general): (0 -> 255) x 2^(2n)
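Whether a given constant fits this rule can be checked mechanically. The sketch below assumes the encoding described later in the deck (an 8-bit value rotated right by an even number of bit positions, held in a 4-bit rotate field); the function name `encode_immediate` is hypothetical.

```python
def encode_immediate(value):
    """Return (rot, imm8) with value == imm8 rotated right by 2*rot, or None if not encodable."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        # Rotating the value LEFT by 2*rot undoes a rotate RIGHT by 2*rot.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            return rot, imm8
    return None
```

For example, `encode_immediate(0xFF00)` returns `(12, 0xFF)`, while `encode_immediate(0x1FE)` returns `None` because the eight set bits would need an odd rotation.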

Data processing instructions


Shifted register operands
the 2nd source operand may be shifted
by a constant number of bit positions:
ADD r3, r2, r1, LSL #3 ; r3 := r2 + 8 x r1

or by a register-specified number of bits:
ADD r5, r5, r3, LSL r2 ; r5 := r5 + 2^r2 x r3

LSL, LSR mean 'logical shift left, right'
ASL, ASR mean 'arithmetic shift left, right'
ROR means 'rotate right'
RRX means 'rotate right extended'

ARM shift operations


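[Figure: effect of each shift on a 32-bit word (bits 31 to 0). LSL #5 and LSR #5 fill the five vacated bits with 00000; ASR #5 fills them with 00000 for a positive operand and 11111 for a negative operand; ROR #5 rotates the low five bits round to the top; RRX rotates right by one bit through the C flag]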
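The shift operations can be modelled in a few lines of Python. This is a sketch under stated assumptions: the function names are illustrative, shift amounts are taken to be in the range 0 to 31, and only RRX models the carry flag.

```python
MASK = 0xFFFFFFFF


def lsl(x, n):
    """Logical shift left: zeros enter at bit 0."""
    return (x << n) & MASK


def lsr(x, n):
    """Logical shift right: zeros enter at bit 31."""
    return (x & MASK) >> n


def asr(x, n):
    """Arithmetic shift right: copies of the sign bit enter at bit 31."""
    x &= MASK
    if x & 0x80000000:
        return ((x >> n) | (MASK << (32 - n))) & MASK
    return x >> n


def ror(x, n):
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    x &= MASK
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK


def rrx(x, c):
    """Rotate right by one bit through carry; returns (result, new carry)."""
    x &= MASK
    return (c << 31) | (x >> 1), x & 1
```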

Data processing instructions


Setting the condition codes
all data processing instructions may set the condition codes
    the comparison operations always do so

For example, here is code for a 64-bit add:
ADDS r2, r2, r0 ; 32-bit carry-out -> C
ADC  r3, r3, r1 ; added into top 32 bits

S means 'Set condition codes'
the primary use of the condition codes is in control flow (see later)
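The 64-bit add can be followed step by step in a short Python model. It is a sketch only: `add64` is a hypothetical helper, the register pairs r1:r0 and r3:r2 are assumed to hold the high and low words of each operand, and only the carry flag is modelled.

```python
MASK = 0xFFFFFFFF


def add64(r0, r1, r2, r3):
    """Add the 64-bit value r1:r0 to r3:r2 and return the new (r2, r3)."""
    low = r2 + r0               # ADDS r2, r2, r0
    carry = low >> 32           # carry-out of the low word goes to C
    high = r3 + r1 + carry      # ADC  r3, r3, r1
    return low & MASK, high & MASK
```

Calling `add64(0xFFFFFFFF, 0, 1, 0)` shows the carry crossing the word boundary: the low word wraps to 0 and the high word becomes 1.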

Data processing instructions


Multiplication
ARM has special multiply instructions:
MUL r4, r3, r2 ; r4 := (r3 x r2)[31:0]

only the bottom 32 bits are returned
immediate operands are not supported
    multiplication by a constant is usually best done with a short series of adds and subtracts with shifts

there is also a multiply-accumulate form:
MLA r4, r3, r2, r1 ; r4 := (r3 x r2 + r1)[31:0]

some ARMs support 64-bit result forms too

Data processing instructions


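Binary encoding:
bits 31-28  cond
bits 27-26  00
bit  25     #       (selects the form of operand 2)
bits 24-21  opcode  (arithmetic/logic function)
bit  20     S       (set condition codes)
bits 19-16  Rn      (first operand register)
bits 15-12  Rd      (destination register)
bits 11-0   operand 2

operand 2 when bit 25 = 1:
bits 11-8   #rot    (immediate alignment)
bits 7-0    8-bit immediate

operand 2 when bit 25 = 0, immediate shift length:
bits 11-7   #shift  (immediate shift length)
bits 6-5    Sh      (shift type)
bit  4      0
bits 3-0    Rm      (second operand register)

operand 2 when bit 25 = 0, register shift length:
bits 11-8   Rs      (register shift length)
bit  7      0
bits 6-5    Sh      (shift type)
bit  4      1
bits 3-0    Rm      (second operand register)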

Data processing instructions


Opcode [24:21]  Mnemonic  Meaning                        Effect
0000            AND       Logical bit-wise AND           Rd := Rn AND Op2
0001            EOR       Logical bit-wise exclusive OR  Rd := Rn EOR Op2
0010            SUB       Subtract                       Rd := Rn - Op2
0011            RSB       Reverse subtract               Rd := Op2 - Rn
0100            ADD       Add                            Rd := Rn + Op2
0101            ADC       Add with carry                 Rd := Rn + Op2 + C
0110            SBC       Subtract with carry            Rd := Rn - Op2 + C - 1
0111            RSC       Reverse subtract with carry    Rd := Op2 - Rn + C - 1
1000            TST       Test                           Scc on Rn AND Op2
1001            TEQ       Test equivalence               Scc on Rn EOR Op2
1010            CMP       Compare                        Scc on Rn - Op2
1011            CMN       Compare negated                Scc on Rn + Op2
1100            ORR       Logical bit-wise OR            Rd := Rn OR Op2
1101            MOV       Move                           Rd := Op2
1110            BIC       Bit clear                      Rd := Rn AND NOT Op2
1111            MVN       Move negated                   Rd := NOT Op2
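Putting the opcode table together with the binary encoding, the simplest case (an unshifted register as operand 2) can be assembled by hand. The Python sketch below is illustrative only: `encode_dp_reg` is a hypothetical helper, the default condition is assumed to be 'always' (1110), and it does not enforce rules such as the comparison operations always setting S.

```python
OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
           "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]


def encode_dp_reg(mnemonic, rd, rn, rm, s=0, cond=0b1110):
    """Encode `<op>{S} Rd, Rn, Rm` with an unshifted register as operand 2."""
    opcode = OPCODES.index(mnemonic)   # position in the table is the opcode
    return (cond << 28) | (opcode << 21) | (s << 20) | (rn << 16) | (rd << 12) | rm
```

Under these assumptions `ADD r0, r1, r2` encodes as `0xE0810002`, and `ADDS r2, r2, r0` as `0xE0922000`.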

Data processing instructions


Assembler format:
<op>{<cond>}{S} Rd, Rn, #<32-bit imm.>
<op>{<cond>}{S} Rd, Rn, Rm {, <shift>}

where <shift> = LSL, LSR, ASL, ASR or ROR followed by #<5-bit imm.> or Rs, or just RRX
monadic instructions omit Rn
comparison instructions omit Rd
32-bit immediates are rotated 8-bit values

Multiply instructions
Binary encoding:
bits 31-28  cond
bits 27-24  0000
bits 23-21  mul
bit  20     S
bits 19-16  Rd / RdHi
bits 15-12  Rn / RdLo
bits 11-8   Rs
bits 7-4    1001
bits 3-0    Rm

Assembler formats:
MUL{<cond>}{S} Rd, Rm, Rs
MLA{<cond>}{S} Rd, Rm, Rs, Rn
<mul>{<cond>}{S} RdHi, RdLo, Rm, Rs

Opcode [23:21]  Mnemonic  Meaning                              Effect
000             MUL       Multiply (32-bit result)             Rd := (Rm*Rs)[31:0]
001             MLA       Multiply-accumulate (32-bit result)  Rd := (Rm*Rs+Rn)[31:0]
100             UMULL     Unsigned multiply long               RdHi:RdLo := Rm*Rs
101             UMLAL     Unsigned multiply-accumulate long    RdHi:RdLo += Rm*Rs
110             SMULL     Signed multiply long                 RdHi:RdLo := Rm*Rs
111             SMLAL     Signed multiply-accumulate long      RdHi:RdLo += Rm*Rs


Data transfer instructions


The ARM has 3 types of data transfer instruction:
single register loads and stores
    flexible byte or word (or possibly half-word) transfers
multiple register loads and stores
    less flexible, multiple words, higher transfer rate
single register-memory swap
    mainly for system use, so ignore for now

Data transfer instructions


Addressing memory
all ARM data transfer instructions use register indirect addressing.
Example of load and store instructions:
LDR r0, [r1] ; r0 := mem[r1]
STR r0, [r1] ; mem[r1] := r0

therefore before any data transfer is possible:
    a register must be initialized with an address close to the target

Data transfer instructions


Initializing an address pointer
any register can be used for an address
any ARM instruction may be used to compute an address
the assembler also has special pseudo instructions to make this easier:
        ADR r1, TABLE1 ; r1 points to TABLE1
        ..
TABLE1                 ; LABEL

Data transfer instructions


Single register loads and stores
the simplest form is just register indirect:
LDR r0, [r1] ; r0 := mem[r1]

this is a special form of 'base plus offset':
LDR r0, [r1, #4] ; r0 := mem[r1+4]

the offset is within +/- 4 Kbytes

auto-indexing is also possible:
LDR r0, [r1, #4]! ; r0 := mem[r1+4]
                  ; r1 := r1 + 4

Data transfer instructions


Single register loads and stores (..ctd)
another form uses post-indexing:
LDR r0, [r1], #4 ; r0 := mem[r1]
                 ; r1 := r1 + 4

finally, any of these can load a byte rather than a word:
LDRB r0, [r1] ; r0 := mem8[r1]

stores (STR) have the same forms
some ARMs also support half-word and signed byte transfer
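The difference between base plus offset, auto-indexing and post-indexing is only in which address is used and whether the base register is updated. A minimal Python sketch, assuming memory is a dictionary from address to word and registers a dictionary from register number to value (the helper `ldr` and its parameters are hypothetical):

```python
def ldr(mem, regs, rd, rn, offset=0, pre=True, writeback=False):
    """Model LDR rd, [rn, #offset]{!} (pre=True) or LDR rd, [rn], #offset (pre=False)."""
    base = regs[rn]
    address = base + offset if pre else base   # post-indexing uses the unmodified base
    regs[rd] = mem[address]
    if writeback or not pre:                   # post-indexing always updates the base
        regs[rn] = base + offset
```

With `mem = {0x100: 11, 0x104: 22}` and r1 = 0x100, the pre-indexed form with offset 4 loads 22, the auto-indexed form loads 22 and leaves r1 = 0x104, and the post-indexed form loads 11 and leaves r1 = 0x104.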

Data transfer instructions


Multiple register loads and stores
ARM also supports instructions which transfer several registers:
LDMIA r1, {r0, r2, r5} ; r0 := mem[r1]
                       ; r2 := mem[r1+4]
                       ; r5 := mem[r1+8]

the {..} list may contain any or all of r0 - r15
    including r15 (the PC!) will cause a branch
the lowest register always uses the lowest address

Data transfer instructions


Multiple register loads and stores (..ctd)
stack addressing:
    stacks can Ascend or Descend memory
    stacks can be Full or Empty
    ARM multiple register transfers support all forms of stack
block copy addressing:
    addresses can Increment or Decrement
    Before or After each transfer

Multiple register transfer addressing modes


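[Figure: memory contents and base register after storing r0, r1 and r5 with each addressing mode, starting from r9 = 100c (hex); r9' is the base after write-back]

STMIA r9!, {r0, r1, r5}: r0 at 100c, r1 at 1010, r5 at 1014; r9' = 1018
STMIB r9!, {r0, r1, r5}: r0 at 1010, r1 at 1014, r5 at 1018; r9' = 1018
STMDA r9!, {r0, r1, r5}: r0 at 1004, r1 at 1008, r5 at 100c; r9' = 1000
STMDB r9!, {r0, r1, r5}: r0 at 1000, r1 at 1004, r5 at 1008; r9' = 1000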

Stack and block copy views of the load and store multiple instructions
                     Ascending        Descending
                     Full    Empty    Full    Empty
Increment  Before    STMIB                    LDMIB
                     STMFA                    LDMED
           After             STMIA    LDMIA
                             STMEA    LDMFD
Decrement  Before            LDMDB    STMDB
                             LDMEA    STMFD
           After     LDMDA                    STMDA
                     LDMFA                    STMED
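The four block copy modes differ only in where the first word goes and where the base ends up. The Python sketch below computes both for a store multiple with write-back; `stm_addresses` is a hypothetical helper, and the lowest-numbered register is assumed to use the lowest address in the returned list.

```python
def stm_addresses(mode, base, count):
    """Return (word addresses, lowest first, and the written-back base) for STM<mode> base!, {count registers}."""
    size = 4 * count
    if mode == "IA":      # increment after
        start, new_base = base, base + size
    elif mode == "IB":    # increment before
        start, new_base = base + 4, base + size
    elif mode == "DA":    # decrement after
        start, new_base = base - size + 4, base - size
    elif mode == "DB":    # decrement before
        start, new_base = base - size, base - size
    else:
        raise ValueError(mode)
    return [start + 4 * i for i in range(count)], new_base
```

For a base of 0x100c and three registers this gives a final base of 0x1018 for IA and IB and 0x1000 for DA and DB.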

Single word and unsigned byte data transfer instructions


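Binary encoding:
bits 31-28  cond
bits 27-26  01
bit  25     #       (selects the form of the offset)
bit  24     P       (pre-/post-index)
bit  23     U       (up/down)
bit  22     B       (unsigned byte/word)
bit  21     W       (write-back, auto-index)
bit  20     L       (load/store)
bits 19-16  Rn      (base register)
bits 15-12  Rd      (source/destination register)
bits 11-0   offset

offset when bit 25 = 0:
bits 11-0   12-bit immediate

offset when bit 25 = 1:
bits 11-7   #shift  (immediate shift length)
bits 6-5    Sh      (shift type)
bit  4      0
bits 3-0    Rm      (offset register)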

Half-word and signed byte data transfer instructions


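Binary encoding:
bits 31-28  cond
bits 27-25  000
bit  24     P        (pre-/post-index)
bit  23     U        (up/down)
bit  22     #        (selects the form of the offset)
bit  21     W        (write-back, auto-index)
bit  20     L        (load/store)
bits 19-16  Rn       (base register)
bits 15-12  Rd       (source/destination register)
bits 11-8   offsetH
bit  7      1
bit  6      S
bit  5      H
bit  4      1
bits 3-0    offsetL

offset when bit 22 = 1:
bits 11-8   Imm[7:4]
bits 3-0    Imm[3:0]

offset when bit 22 = 0:
bits 11-8   0000
bits 3-0    Rm       (offset register)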

Single data transfer instructions


Assembler format:
LDR|STR{<cond>}{B|SB|H|SH} Rd, [Rn, <off>]{!}
LDR|STR{<cond>}{B|SB|H|SH} Rd, [Rn], <off>
LDR|STR{<cond>}{B|SB|H|SH} Rd, LABEL

<off> is +/-Rm or +/- 12-bit (byte, word) or 8-bit (signed or half-word) immediate

Data type encoding:
S  H  Data type
1  0  Signed byte
0  1  Unsigned half-word
1  1  Signed half-word

Multiple register data transfers


Binary encoding:
bits 31-28  cond
bits 27-25  100
bit  24     P   (pre-/post-index)
bit  23     U   (up/down)
bit  22     S   (restore PSR and force user bit)
bit  21     W   (write-back, auto-index)
bit  20     L   (load/store)
bits 19-16  Rn  (base register)
bits 15-0   register list

Assembler format:
LDM|STM{<cond>}<add> Rn{!}, <regs>
<add> = IA etc., <regs> = {rn,..rm}

Swap memory and register instructions


Binary encoding:
bits 31-28  cond
bits 27-23  00010
bit  22     B   (unsigned byte/word)
bits 21-20  00
bits 19-16  Rn  (base register)
bits 15-12  Rd  (destination register)
bits 11-4   00001001
bits 3-0    Rm  (source register)

Assembler format:
SWP{<cond>}{B} Rd, Rm, [Rn]


Control flow instructions


Control flow instructions just switch execution around the program:
        B LABEL
        ..          ; these instructions are skipped
LABEL   ..

normal execution is sequential
branches are used to change this
    to move forwards or backwards

Control flow instructions


Conditional branches
sometimes whether or not a branch is taken depends on the condition codes:
        MOV r0, #0      ; initialize counter
LOOP    ..
        ADD r0, r0, #1  ; increment counter
        CMP r0, #10     ; compare with limit
        BNE LOOP        ; repeat if not equal
        ..              ; else continue

here the branch depends on how CMP sets Z

Branch conditions
Branch  Interpretation    Normal uses
B       Unconditional     Always take this branch
BAL     Always            Always take this branch
BEQ     Equal             Comparison equal or zero result
BNE     Not equal         Comparison not equal or non-zero result
BPL     Plus              Result positive or zero
BMI     Minus             Result minus or negative
BCC     Carry clear       Arithmetic operation did not give carry-out
BLO     Lower             Unsigned comparison gave lower
BCS     Carry set         Arithmetic operation gave carry-out
BHS     Higher or same    Unsigned comparison gave higher or same
BVC     Overflow clear    Signed integer operation; no overflow occurred
BVS     Overflow set      Signed integer operation; overflow occurred
BGT     Greater than      Signed integer comparison gave greater than
BGE     Greater or equal  Signed integer comparison gave greater or equal
BLT     Less than         Signed integer comparison gave less than
BLE     Less or equal     Signed integer comparison gave less than or equal
BHI     Higher            Unsigned comparison gave higher
BLS     Lower or same     Unsigned comparison gave lower or same

Control flow instructions


Conditional execution
an unusual ARM feature is that all instructions may be conditional:
CMP   r0, #5       ; if (r0 != 5) {
ADDNE r1, r1, r0   ;   r1 := r1 + r0 - r2
SUBNE r1, r1, r2   ; }

this removes the need for some short branches
    improving performance and code density

Control flow instructions


Branch and link
ARM's subroutine call mechanism saves the return address in r14:
        BL  SUBR        ; branch to SUBR
        ..              ; return to here
SUBR    ..              ; subroutine entry point
        MOV pc, r14     ; return

note the use of a data processing instruction for return

Control flow instructions


Nested subroutines
r14 must be saved before the next BL:
        BL    SUB1                 ; branch to SUB1
        ..
SUB1    STMFA r13!, {r0-r2, r14}   ; save regs
        BL    SUB2
        ..
        LDMFA r13!, {r0-r2, pc}    ; return
SUB2    ..
        MOV   pc, r14              ; return

Control flow instructions


Supervisor calls
these are calls to operating system functions such as input and output:
SWI SWI_WriteC ; output character in r0
SWI SWI_Exit   ; return to monitor

the range of available calls is system dependent

Branch and Branch with Link


Binary encoding:
bits 31-28  cond
bits 27-25  101
bit  24     L
bits 23-0   24-bit signed word offset

the L bit selects Branch with Link
    the address of the instruction after the branch is placed into r14
the offset is scaled to word
    giving a range of +/-32 Mbytes

Assembler format:
B{L}{<cond>} <target address>
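The 24-bit field can be computed from the branch and target addresses. The sketch below is not from the slides: `encode_branch` is a hypothetical helper, and it relies on the ARM convention that the PC reads as the address of the current instruction plus 8, which the deck does not state explicitly.

```python
def encode_branch(instr_addr, target, link=False, cond=0b1110):
    """Encode B or BL located at instr_addr and jumping to target."""
    offset = target - (instr_addr + 8)       # assumption: PC reads 8 bytes ahead
    if offset % 4:
        raise ValueError("target must be word aligned")
    words = offset >> 2                      # the offset is scaled to words
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target out of range (+/-32 Mbytes)")
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (words & 0xFFFFFF)
```

A branch to itself therefore has a word offset of -2, giving the encoding `0xEAFFFFFE`.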

Branch and exchange


Binary encoding:
bits 31-28  cond
bits 27-4   0001 0010 1111 1111 1111 0001
bits 3-0    Rm

only available on recent ARM chips
used to switch execution to the Thumb instruction set
    if Rm[0] = 1
causes a branch to the address in Rm

Assembler format:
BX{<cond>} Rm

SoftWare Interrupt
Binary encoding:
bits 31-28  cond
bits 27-24  1111
bits 23-0   24-bit (interpreted) immediate

this instruction is the normal way to access operating system facilities; it:
    puts the processor into supervisor mode
    saves the CPSR in SPSR_svc
    sets the PC to 0x8

Assembler format:
SWI{<cond>} <24-bit immediate>


The ARM condition code field


Binary encoding:
bits 31-28  cond
bits 27-0   rest of the instruction

every ARM instruction may have a condition added
    the instruction will only be executed if the condition is passed
    the conditions test the values of the N, Z, C and V flags in the CPSR
if no condition is specified, AL (always) is assumed

ARM condition codes


Opcode [31:28]  Mnemonic extension  Interpretation                       Status flag state for execution
0000            EQ                  Equal / equals zero                  Z set
0001            NE                  Not equal                            Z clear
0010            CS/HS               Carry set / unsigned higher or same  C set
0011            CC/LO               Carry clear / unsigned lower         C clear
0100            MI                  Minus / negative                     N set
0101            PL                  Plus / positive or zero              N clear
0110            VS                  Overflow                             V set
0111            VC                  No overflow                          V clear
1000            HI                  Unsigned higher                      C set and Z clear
1001            LS                  Unsigned lower or same               C clear or Z set
1010            GE                  Signed greater than or equal         N equals V
1011            LT                  Signed less than                     N is not equal to V
1100            GT                  Signed greater than                  Z clear and N equals V
1101            LE                  Signed less than or equal            Z set or N is not equal to V
1110            AL                  Always                               any
1111            NV                  Never (do not use!)                  none
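The flag tests in the table translate directly into code. The Python sketch below is illustrative: `cmp_flags` and `passes` are hypothetical helpers, and `cmp_flags` assumes the ARM convention that C is set after a subtraction when no borrow occurred, which the table implies (CS means unsigned higher or same) but the slides do not spell out.

```python
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "CS": lambda n, z, c, v: c,
    "CC": lambda n, z, c, v: not c,
    "MI": lambda n, z, c, v: n,
    "PL": lambda n, z, c, v: not n,
    "VS": lambda n, z, c, v: v,
    "VC": lambda n, z, c, v: not v,
    "HI": lambda n, z, c, v: c and not z,
    "LS": lambda n, z, c, v: (not c) or z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: (not z) and n == v,
    "LE": lambda n, z, c, v: z or n != v,
    "AL": lambda n, z, c, v: True,
}


def cmp_flags(a, b):
    """Return (N, Z, C, V) as set by CMP a, b on 32-bit values."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    result = (a - b) & 0xFFFFFFFF
    n = bool(result >> 31)
    z = result == 0
    c = a >= b                                   # assumption: C set means no borrow
    v = bool(((a ^ b) & (a ^ result)) >> 31)     # signed overflow of a - b
    return n, z, c, v


def passes(cond, flags):
    """True if an instruction with this condition would execute."""
    return bool(CONDITIONS[cond](*flags))
```

Comparing 0xFFFFFFFF with 1 shows why signed and unsigned conditions are separate: HI passes (the unsigned value is higher) while GT fails (as a signed value it is -1).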


Status register to general register transfers


Binary encoding:
bits 31-28  cond
bits 27-23  00010
bit  22     R   (SPSR/CPSR)
bits 21-16  001111
bits 15-12  Rd  (destination register)
bits 11-0   000000000000

Assembler format:
MRS{<cond>} Rd, CPSR|SPSR

and the reverse (see next slide):
MSR{<cond>} CPSR|SPSR, #<32-bit imm.>|Rm

(with a few details about fields omitted)

Transfer to status register


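Binary encoding:
bits 31-28  cond
bits 27-26  00
bit  25     #      (selects the form of the operand)
bits 24-23  10
bit  22     R      (SPSR/CPSR)
bits 21-20  10
bits 19-16  field  (field mask)
bits 15-12  1111
bits 11-0   operand

operand when bit 25 = 1:
bits 11-8   #rot   (immediate alignment)
bits 7-0    8-bit immediate

operand when bit 25 = 0:
bits 11-4   00000000
bits 3-0    Rm     (operand register)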

Coprocessor instructions
Coprocessor data processing instructions
bits 31-28  cond
bits 27-24  1110
bits 23-20  Cop1
bits 19-16  CRn
bits 15-12  CRd
bits 11-8   CP#
bits 7-5    Cop2
bit  4      0
bits 3-0    CRm

CP# specifies the coprocessor number:
    it performs the operation specified by Cop1 and Cop2 on data in CRn and CRm, putting the result in CRd
    other interpretations are possible!

Coprocessor instructions
Coprocessor data transfer instructions
bits 31-28  cond
bits 27-25  110
bit  24     P    (pre-/post-index)
bit  23     U    (up/down)
bit  22     N    (data size, coprocessor dependent)
bit  21     W    (write-back, auto-index)
bit  20     L    (load/store)
bits 19-16  Rn   (base register)
bits 15-12  CRd  (source/destination register)
bits 11-8   CP#
bits 7-0    8-bit offset

Coprocessor instructions
Coprocessor register transfer instructions
bits 31-28  cond
bits 27-24  1110
bits 23-21  Cop1
bit  20     L    (load from coprocessor/store to coprocessor)
bits 19-16  CRn
bits 15-12  Rd
bits 11-8   CP#
bits 7-5    Cop2
bit  4      1
bits 3-0    CRm

move a 32-bit value between the coprocessor and ARM (including CPSR)
    examples: floating-point FIX, FLOAT and compare


Memory faults
ARM has full support for memory faults.
Accesses may fail because of:
    virtual memory page faults
    memory protection violations
    soft memory errors

Prefetch aborts are faults on instruction fetch
Data aborts are faults on data transfers
    both are recoverable with a little work


Operating modes and exceptions


ARM has privileged operating modes:
SVC mode for software interrupts
IRQ mode for (normal) interrupts
FIQ mode for fast interrupts
Abort mode for handling memory faults
Undef mode for undefined instruction traps
System mode for privileged operating system tasks

Operating modes and exceptions


Each privileged mode has:
some private registers
    its own r14 for a return address
    its own r13, normally for a private stack pointer
    FIQ mode has additional private registers to speed its operation
its own Saved Program Status Register (SPSR)
    to preserve the CPSR so it can be restored upon return

Operating modes and exceptions


Registers usable in user mode:
r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 (PC) CPSR

Banked registers, system modes only:
fiq mode:        r8_fiq r9_fiq r10_fiq r11_fiq r12_fiq r13_fiq r14_fiq SPSR_fiq
svc mode:        r13_svc r14_svc SPSR_svc
abort mode:      r13_abt r14_abt SPSR_abt
irq mode:        r13_irq r14_irq SPSR_irq
undefined mode:  r13_und r14_und SPSR_und

Operating modes and exceptions


The CPSR and SPSR format:
bits 31-28  N Z C V
bits 27-8   unused
bits 7-6    I F
bit  5      T
bits 4-0    mode

bits 0 to 4 define the operating mode
bit 5 controls the instruction set
    ARM (T=0) or Thumb (T=1)
bit 6 disables FIQ when set
bit 7 disables IRQ when set

Operating modes and exceptions


Register use:
CPSR[4:0]  Mode    Use                                        Registers
10000      User    Normal user code                           user
10001      FIQ     Processing fast interrupts                 _fiq
10010      IRQ     Processing standard interrupts             _irq
10011      SVC     Processing software interrupts (SWIs)      _svc
10111      Abort   Processing memory faults                   _abt
11011      Undef   Handling undefined instruction traps       _und
11111      System  Running privileged operating system tasks  user
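The CPSR layout and the mode table can be combined into a small decoder. This Python sketch uses the bit positions and mode encodings given on the slides; the names `MODES` and `decode_cpsr` are hypothetical.

```python
MODES = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "SVC",
    0b10111: "Abort", 0b11011: "Undef", 0b11111: "System",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,    # IRQ disabled when set
        "F": (cpsr >> 6) & 1,    # FIQ disabled when set
        "T": (cpsr >> 5) & 1,    # 0 = ARM, 1 = Thumb
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }
```

For example, `decode_cpsr(0x600000D3)` reports Z and C set, both interrupts disabled, ARM state and SVC mode.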

Operating modes and exceptions


Exception entry sequence:
change to the appropriate operating mode
save the return address in r14_exc
save the old CPSR in SPSR_exc
disable IRQ
    on FIQ entry, disable FIQ
force the PC to the appropriate exception vector address

Operating modes and exceptions


Exception vector addresses:
Exception                                       Mode   Vector address
Reset                                           SVC    0x00000000
Undefined instruction                           UND    0x00000004
Software interrupt (SWI)                        SVC    0x00000008
Prefetch abort (instruction fetch memory fault) Abort  0x0000000C
Data abort (data access memory fault)           Abort  0x00000010
IRQ (normal interrupt)                          IRQ    0x00000018
FIQ (fast interrupt)                            FIQ    0x0000001C

Operating modes and exceptions


Exception handling
the vector address normally contains a branch to the exception handling code
    the FIQ handler can start at 0x1C
r13_exc usually points to a private stack
save work registers for use by the handler
    FIQ usually has enough private registers
process exception
restore work registers and return

Operating modes and exceptions


Return from exception
from a SWI or undefined instruction:
MOVS pc, r14

this is a special form with S and pc
    it restores the CPSR from SPSR_exc as well

from an IRQ, FIQ or prefetch abort:
SUBS pc, r14, #4

from a data abort to retry the data transfer:
SUBS pc, r14, #8


Writing simple programs


Even experienced programmers approach a new environment by first getting a simple program to run
    often a 'Hello World' program

This requires some basic tools:
    a text editor, to enter the program
    an assembler, to produce binary code
    a system or emulator, to test the code

Writing simple programs


Assembler details to note:
AREA - declaration of code area
EQU - initializing constants (1 word)
    used here to define SWI numbers
ENTRY - code entry point
= - a way to initialize memory (per byte)
END - the end of the program source
labels are aligned left
    opcodes are indented

Examples
'Hello World' assembly program:
           AREA    HelloW, CODE, READONLY ; declare area
SWI_WriteC EQU     &0                     ; output character in r0
SWI_Exit   EQU     &11                    ; finish program
           ENTRY                          ; code entry point
START      ADR     r1, TEXT               ; r1 -> "Hello World"
LOOP       LDRB    r0, [r1], #1           ; get the next byte
           CMP     r0, #0                 ; check for text end
           SWINE   SWI_WriteC             ; if not end print ..
           BNE     LOOP                   ; .. and loop back
           SWI     SWI_Exit               ; end of execution
TEXT       =       "Hello World", &0a, &0d, 0
           END                            ; end of program source

Examples
Subroutine to print r1 in hexadecimal:
HexOut  MOV     r2, #8              ; nibble count = 8
LOOP    MOV     r0, r1, LSR #28     ; get top nibble
        CMP     r0, #9              ; 0-9 or A-F?
        ADDGT   r0, r0, #"A"-10     ; ASCII alphabetic
        ADDLE   r0, r0, #"0"        ; ASCII numeric
        SWI     SWI_WriteC          ; print character
        MOV     r1, r1, LSL #4      ; shift left one nibble
        SUBS    r2, r2, #1          ; decrement nibble count
        BNE     LOOP                ; if more do next nibble
        MOV     pc, r14             ; ... else return
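The same nibble-by-nibble algorithm can be traced in Python, which may help when checking the assembly by hand. This is a line-for-line sketch rather than part of the original material: `hex_out` is a hypothetical name, and it returns the characters as a string where the subroutine prints each one with SWI_WriteC.

```python
def hex_out(r1):
    """Return the 8 hex digits of the 32-bit value r1, top nibble first."""
    out = []
    for _ in range(8):                   # nibble count = 8
        r0 = (r1 >> 28) & 0xF            # get top nibble
        if r0 > 9:                       # 0-9 or A-F?
            r0 += ord("A") - 10          # ASCII alphabetic
        else:
            r0 += ord("0")               # ASCII numeric
        out.append(chr(r0))              # "print" character
        r1 = (r1 << 4) & 0xFFFFFFFF      # shift left one nibble
    return "".join(out)
```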

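The structure of the ARM cross-development toolkit


[Figure: toolkit data flow. C source and C libraries pass through the C compiler, and asm source through the assembler, to .aof object files; the linker produces an .aif image; the ARMsd debugger runs it on a system model, either the ARMulator or a PIE card]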


The Thumb instruction set


Agenda:
the Thumb programmer's model
Thumb instructions
Thumb implementation
Thumb applications
hands-on: writing simple Thumb assembly programs


The Thumb programmer's model


What is Thumb?
a compressed, 16-bit representation of the ARM instruction set
    primarily to increase code density
    also increases performance in some cases

It is not a complete architecture
all Thumb-aware cores also support the ARM instruction set
    therefore the Thumb architecture need only support common functions

The Thumb programmer's model


bits 31-28  N Z C V
bits 27-8   unused
bits 7-6    I F
bit  5      T
bits 4-0    mode

The T bit in the CPSR controls the interpretation of the instruction stream
switch from ARM to Thumb (and back) by executing a BX instruction
exceptions also cause a switch to ARM code
    return symmetrically to ARM or Thumb code

The Thumb programmer's model


Lo registers: r0 r1 r2 r3 r4 r5 r6 r7
Hi registers: r8 r9 r10 r11 r12 SP (r13) LR (r14) PC (r15)
CPSR
[Figure legend: shaded registers have restricted access]

The Thumb programmer's model


Thumb register use:
r0-r7 are general purpose registers
r13 is used implicitly as a stack pointer
    in ARM code this is a software convention
r14 is used as the link register
    implicitly, as in the ARM instruction set
a few instructions can access r8-r15
the CPSR flags are set by data processing instructions and control conditional branches

The Thumb programmer's model


Thumb-ARM similarities:
load-store architecture
    with data processing, data transfer and control flow instructions
support for 8-bit byte, 16-bit half-word and 32-bit data types
    half-words are aligned on 2-byte boundaries
    words are aligned on 4-byte boundaries
32-bit unsegmented memory

The Thumb programmer's model


Thumb-ARM differences:
most Thumb instructions are unconditional
    all ARM instructions are conditional
many Thumb instructions use a 2-address format
    most ARM instructions use a 3-address format
Thumb instruction formats are less regular
    a result of the denser encoding


Thumb branch instructions


(1) B<cond> <LABEL>
bits 15-12  1101
bits 11-8   cond
bits 7-0    8-bit offset

(2) B <LABEL>
bits 15-11  11100
bits 10-0   11-bit offset

(3) BL <LABEL>
bits 15-12  1111
bit  11     H
bits 10-0   11-bit offset

(4) BX Rm
bits 15-7   010001110
bit  6      H
bits 5-3    Rm
bits 2-0    000

Thumb branch instructions


These are similar to the ARM instructions except:
offsets are scaled to half-word, not word
range is reduced to fit into 16 bits
BL works in two stages:
    H=0: LR := PC + (offset << 12)
    H=1: PC := LR + (offset << 1)
         LR := oldPC + 3
the assembler generates both halves
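The two-stage BL can be replayed in Python to see how the two 11-bit fields combine into one branch target. This is a sketch under stated assumptions: both helper names are hypothetical, the H=0 field is assumed to be sign-extended, `pc` stands for whatever PC value the first half sees, and the update of LR with the return address is not modelled.

```python
def split_bl_offset(offset):
    """Split an even byte offset into the (H=0, H=1) 11-bit fields."""
    if offset % 2 or not -(1 << 22) <= offset < (1 << 22):
        raise ValueError("offset not encodable")
    half = (offset >> 1) & 0x3FFFFF       # 22-bit half-word offset
    return half >> 11, half & 0x7FF


def bl_target(pc, hi, lo):
    """Replay both stages and return the branch target."""
    if hi & 0x400:                         # assumption: the high field is signed
        hi -= 0x800
    lr = pc + (hi << 12)                   # H=0: LR := PC + (offset << 12)
    return (lr + (lo << 1)) & 0xFFFFFFFF   # H=1: PC := LR + (offset << 1)
```

Splitting an offset and replaying it gives back `pc + offset`, for forward and backward branches alike.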

Thumb software interrupts


bits 15-8   11011111
bits 7-0    8-bit immediate

The Thumb SWI operates exactly like the ARM SWI
the (interpreted) immediate is restricted to 8 bits
the SWI handler is entered in ARM code
    the return automatically selects ARM or Thumb

Thumb data processing instructions


(1) ADD|SUB Rd, Rn, Rm
bits 15-10  000110
bit  9      A
bits 8-6    Rm
bits 5-3    Rn
bits 2-0    Rd

(2) ADD|SUB Rd, Rn, #imm3
bits 15-10  000111
bit  9      A
bits 8-6    #imm3
bits 5-3    Rn
bits 2-0    Rd

(3) <Op> Rd/Rn, #imm8
bits 15-13  001
bits 12-11  Op
bits 10-8   Rd/Rn
bits 7-0    #imm8

(4) LSL|LSR|ASR Rd, Rn, #shift
bits 15-13  000
bits 12-11  Op
bits 10-6   #sh
bits 5-3    Rn
bits 2-0    Rd

Thumb data processing instructions


(5) <Op> Rd/Rn, Rm/Rs
bits 15-10  010000
bits 9-6    Op
bits 5-3    Rm/Rs
bits 2-0    Rd/Rn

(6) ADD|CMP|MOV Rd/Rn, Rm
bits 15-10  010001
bits 9-8    Op
bit  7      D
bit  6      M
bits 5-3    Rm
bits 2-0    Rd/Rn

(7) ADD Rd, SP|PC, #imm8
bits 15-12  1010
bit  11     R
bits 10-8   Rd
bits 7-0    #imm8

(8) ADD|SUB SP, SP, #imm7
bits 15-8   10110000
bit  7      A
bits 6-0    #imm7

Thumb data processing instructions


Notes:
in Thumb code shift operations are separate from general ALU functions
    in ARM code a shift can be combined with an ALU function in a single instruction
all data processing operations on the Lo registers set the condition codes
    those on the Hi registers do not, apart from CMP which only changes the condition codes

Thumb single register data transfers


(1) LDR|STR{B} Rd, [Rn, #off5]
bits 15-13  011
bit  12     B
bit  11     L
bits 10-6   #off5
bits 5-3    Rn
bits 2-0    Rd

(2) LDRH|STRH Rd, [Rn, #off5]
bits 15-12  1000
bit  11     L
bits 10-6   #off5
bits 5-3    Rn
bits 2-0    Rd

(3) LDR|STR{S}{H|B} Rd, [Rn, Rm]
bits 15-12  0101
bits 11-9   Op
bits 8-6    Rm
bits 5-3    Rn
bits 2-0    Rd

(4) LDR Rd, [PC, #off8]
bits 15-11  01001
bits 10-8   Rd
bits 7-0    #off8

(5) LDR|STR Rd, [SP, #off8]
bits 15-12  1001
bit  11     L
bits 10-8   Rd
bits 7-0    #off8

Thumb multiple register data transfers


(1) LDMIA|STMIA Rn!, {<reg list>}
bits 15-12  1100
bit  11     L
bits 10-8   Rn
bits 7-0    reg list

(2) POP|PUSH {<reg list>{, R}}
bits 15-12  1011
bit  11     L
bits 10-9   10
bit  8      R
bits 7-0    reg list

These map directly onto the ARM forms:
POP:  LDMFD SP!, {<regs>{, pc}}
PUSH: STMFD SP!, {<regs>{, lr}}

note restrictions on available addressing modes compared with ARM code


Thumb instruction decompressor


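[Figure: Thumb instruction decompressor in the instruction pipeline. Data in from memory passes through a mux that selects the high or low half-word and then through the Thumb decompressor; a second mux selects the ARM or Thumb stream for the ARM instruction decoder; immediate fields are passed to the B operand bus]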

Thumb - ARM instruction mapping


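Thumb instruction (16 bits):
bits 15-13  001     major opcode, format 3: MOV/CMP/ADD/SUB with immediate
bits 12-11  10      minor opcode denoting ADD & set CC
bits 10-8   Rd      destination and source register
bits 7-0    #imm8   immediate value

Equivalent ARM instruction (32 bits):
bits 31-28  1110    'always' condition
bits 27-25  001     from the major opcode
bits 24-21  0100    ADD, from the minor opcode
bit  20     1       set CC, from the minor opcode
bits 19-16  0 Rd    source register
bits 15-12  0 Rd    destination register
bits 11-8   0000    zero shift
bits 7-0    #imm8   immediate value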
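The mapping for this one instruction can be written out as bit manipulation. The Python sketch below covers only the Thumb `ADD Rd, #imm8` case shown in the figure; `decompress_add_imm8` is a hypothetical name, and a real decompressor would handle every Thumb format.

```python
def decompress_add_imm8(thumb):
    """Map Thumb `ADD Rd, #imm8` (format 3) onto ARM `ADDS Rd, Rd, #imm8`."""
    if thumb >> 11 != 0b00110:             # major opcode 001, minor opcode 10
        raise ValueError("not a Thumb ADD Rd, #imm8 instruction")
    rd = (thumb >> 8) & 0x7                # 3-bit Lo register number
    imm8 = thumb & 0xFF
    # 0xE29 = 'always' condition, immediate operand, ADD opcode, S bit set
    return 0xE2900000 | (rd << 16) | (rd << 12) | imm8
```

For instance the Thumb half-word `0x3305` (ADD r3, #5) maps to the ARM word `0xE2933005`.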


Thumb applications
Thumb code properties:
70% of the size of ARM code
    30% less external memory power
    40% more instructions
With 32-bit memory:
    ARM code is 40% faster than Thumb code
With 16-bit memory:
    Thumb code is 45% faster than ARM code

Thumb applications
For the best performance:
    use 32-bit memory and ARM code
For best cost and power-efficiency:
    use 16-bit memory and Thumb code
In a typical embedded system:
    use ARM code in 32-bit on-chip memory for small speed-critical routines
    use Thumb code in 16-bit off-chip memory for large non-critical control routines

Hands-on: writing simple Thumb assembly programs


Explore further the ARM software development tools
Write simple Thumb assembly programs
Check that they work as expected
Follow the 'Hands-on' instructions
