Arm Instruction Set
1) Future Evolution of Information Technology
2) System-on-a-Chip Design
3) Design and Application of Cores
4) Analog and Mixed Signal Design (Prof. Berroth)
5) Test of Systems-on-a-Chip
a load-store architecture
instructions which reference memory just move data; they do no processing
processing uses values in registers only
[Figure: ARM register organization. Banked registers r13_svc, r14_svc, r13_abt, r14_abt, r13_irq, r14_irq, r13_und, r14_und; status registers CPSR, SPSR_fiq, SPSR_svc, SPSR_abt, SPSR_irq, SPSR_und; columns for user, fiq, svc, abort, irq and undefined modes]
[Figure: CPSR format, from bit 31 down: N Z C V, unused, I F, mode]
In user programs only the top 4 bits of the CPSR are significant:
N - the result was negative
Z - the result was zero
C - the result produced a carry out
V - the result generated an arithmetic overflow
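The four flags can be illustrated with a small model of a 32-bit addition. This is a minimal sketch, not a description of any particular core; the function name `add_with_flags` and the dictionary layout are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def add_with_flags(a, b, carry_in=0):
    """Model a 32-bit add and return (result, flags). Hypothetical helper."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in          # unbounded Python integer
    result = full & MASK32           # what fits in the register
    n = result >> 31                 # N: bit 31 of the result
    z = int(result == 0)             # Z: result is zero
    c = int(full > MASK32)           # C: carry out of bit 31
    # V: both operands differ in sign from the result (signed overflow)
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, {"N": n, "Z": z, "C": c, "V": v}
```

For example, adding 1 to 0x7FFFFFFF sets N and V (signed overflow) but not C, while adding 1 to 0xFFFFFFFF sets Z and C.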
[Figure: memory organization, showing half-word placement by byte address]
C is the C bit in the CPSR
the operation may be viewed as unsigned or 2's complement signed
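One common use of the C bit is to chain additions wider than 32 bits. The sketch below models an add that produces a carry followed by an add-with-carry that consumes it; the helper names `adds`, `adc` and `add64` are assumptions for illustration, not assembler syntax.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """32-bit add that also returns the carry out (models setting C)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def adc(a, b, c):
    """32-bit add that includes the incoming C bit."""
    total = (a & MASK32) + (b & MASK32) + c
    return total & MASK32, total >> 32


def add64(a_lo, a_hi, b_lo, b_hi):
    """64-bit add built from a low-word add and a high-word add with carry."""
    lo, c = adds(a_lo, b_lo)      # low words first, producing C
    hi, _ = adc(a_hi, b_hi, c)    # high words consume C
    return lo, hi
```

The same bit pattern is produced whether the operands are read as unsigned or as 2's complement signed values; only the interpretation of the flags differs.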
the specified Boolean logic operation is performed on each bit from 0 to 31
BIC stands for bit clear
each 1 in r2 clears the corresponding bit in r1
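A minimal sketch of the bit-wise operations on 32-bit values follows. The Python function names (`and_`, `orr`, `eor`, `bic`) are assumptions chosen to echo the ARM mnemonics AND, ORR, EOR and BIC.

```python
MASK32 = 0xFFFFFFFF


def and_(r1, r2):
    return r1 & r2 & MASK32


def orr(r1, r2):
    return (r1 | r2) & MASK32


def eor(r1, r2):
    return (r1 ^ r2) & MASK32


def bic(r1, r2):
    # each 1 in r2 clears the corresponding bit of r1
    return r1 & ~r2 & MASK32
```

For instance, `bic(0xFFFF00FF, 0x0000000F)` clears the bottom four bits and gives 0xFFFF00F0.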
MVN stands for move negated
there is no first operand (r1) specified as these are unary operations
these instructions just affect the condition codes (N, Z, C, V) in the CPSR
there is no result register (r0)
LSL, LSR mean logical shift left, right
ASL, ASR mean arithmetic shift left, right
ROR means rotate right
RRX means rotate right extended
[Figure: shift operations on bits 31 to 0, including ROR #5 and RRX through the carry flag C]
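The shift types can be modelled on 32-bit values as below. This is a sketch for illustration only; the function names are assumptions, and ASL is not listed separately because it behaves like LSL.

```python
MASK32 = 0xFFFFFFFF


def lsl(x, n):
    """Logical shift left: zeros enter at bit 0."""
    return (x << n) & MASK32


def lsr(x, n):
    """Logical shift right: zeros enter at bit 31."""
    return (x & MASK32) >> n


def asr(x, n):
    """Arithmetic shift right: bit 31 (the sign) is replicated."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32              # reinterpret as a negative integer
    return (x >> n) & MASK32


def ror(x, n):
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


def rrx(x, carry):
    """Rotate right extended: a 33-bit rotate by one place through C.

    Returns (result, carry_out).
    """
    x &= MASK32
    return (carry << 31) | (x >> 1), x & 1
```

`ror(0x1F, 5)` moves the five low bits to the top of the word, giving 0xF8000000.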
only the bottom 32 bits are returned
immediate operands are not supported
multiplication by a constant is usually best done with a short series of adds and subtracts with shifts
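As an illustration of multiplying by a constant with shifts, the sketch below computes x * 35 as (x * 5) * 7. The constant 35 and the function name `times35` are assumptions chosen for the example; the comments show one ARM instruction that each step could correspond to.

```python
MASK32 = 0xFFFFFFFF


def times35(x):
    """Multiply by 35 using two shift-and-add/subtract steps."""
    x &= MASK32
    x = (x + (x << 2)) & MASK32   # ADD r0, r0, r0, LSL #2  ; r0 := r0 * 5
    x = ((x << 3) - x) & MASK32   # RSB r0, r0, r0, LSL #3  ; r0 := r0 * 7
    return x
```

The result matches the bottom 32 bits of the true product, which is also what the multiply instruction returns.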
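[Figure: data processing instruction binary encoding. Fields: arithmetic/logic function, set condition codes, first operand register, destination register and operand 2. With bit 25 = 1, operand 2 is an 8-bit immediate (bits 7:0) with an immediate alignment field #rot (bits 11:8). With bit 25 = 0, operand 2 is the second operand register Rm (bits 3:0) with shift type Sh (bits 6:5) and either an immediate shift length #shift (bits 11:7) or a shift register Rs (bits 11:8)]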
where <shift> = LSL, LSR, ASL, ASR, ROR followed by #<5-bit imm.> or Rs, or just RRX.
monadic instructions omit Rn
comparison instructions omit Rd
32-bit immediates are rotated 8-bit values
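Because a 32-bit immediate is an 8-bit value rotated into position, not every constant can be encoded. The sketch below searches for an encoding, assuming the usual ARM rule that the 4-bit #rot field rotates the 8-bit value right by twice its value; the function name `encode_immediate` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def encode_immediate(value):
    """Return (rot, imm8) such that imm8 ROR (2 * rot) == value, or None."""
    value &= MASK32
    for rot in range(16):                 # 4-bit rotate field
        n = 2 * rot
        # rotating left by n undoes a rotate right by n
        imm8 = ((value << n) | (value >> (32 - n))) & MASK32
        if imm8 <= 0xFF:
            return rot, imm8
    return None                           # not expressible as an immediate
```

0xFF000000 encodes as 0xFF rotated right by 8 (rot = 4), whereas 0x101 has set bits too far apart to fit in 8 bits under any even rotation.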
Multiply instructions

[Figure: multiply instruction binary encoding: cond (31:28), 0000 (27:24), mul (23:21), S (20), Rd / RdHi (19:16), Rn / RdLo (15:12), Rs (11:8), 1001 (7:4), Rm (3:0)]

MUL{<cond>}{S} Rd, Rm, Rs
MLA{<cond>}{S} Rd, Rm, Rs, Rn
<mul>{<cond>}{S} RdHi, RdLo, Rm, Rs

Opcode [23:21]  Mnemonic  Meaning                              Effect
000             MUL       Multiply (32-bit result)             Rd := (Rm*Rs)[31:0]
001             MLA       Multiply-accumulate (32-bit result)  Rd := (Rm*Rs+Rn)[31:0]
100             UMULL     Unsigned multiply long               RdHi:RdLo := Rm*Rs
101             UMLAL     Unsigned multiply-accumulate long    RdHi:RdLo += Rm*Rs
110             SMULL     Signed multiply long                 RdHi:RdLo := Rm*Rs
111             SMLAL     Signed multiply-accumulate long      RdHi:RdLo += Rm*Rs
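The difference between the 32-bit and long forms, and between the unsigned and signed long forms, can be seen in a small model. This is a sketch; the function names mirror the mnemonics, and the (RdHi, RdLo) tuple order is an assumption made for readability.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a 2's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(rm, rs):
    """MUL: only the bottom 32 bits of the product are kept."""
    return (rm * rs) & MASK32


def umull(rm, rs):
    """UMULL: unsigned 64-bit product, returned as (RdHi, RdLo)."""
    p = (rm & MASK32) * (rs & MASK32)
    return p >> 32, p & MASK32


def smull(rm, rs):
    """SMULL: signed 64-bit product, returned as (RdHi, RdLo)."""
    p = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return p >> 32, p & MASK32
```

With Rm = 0xFFFFFFFF and Rs = 2 the low word is the same in all three cases, but the high word is 1 for the unsigned product and 0xFFFFFFFF for the signed product (-1 * 2 = -2).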
[Figure: multiple register transfer addressing modes, storing r0, r1 and r5 relative to base register r9, with addresses 0x1000 and 0x100c marked. Surviving panel labels:]
STMDA r9!, {r0, r1, r5}
STMDB r9!, {r0, r1, r5}
Stack and block copy views of the load and store multiple instructions

                          Ascending            Descending
                        Full     Empty       Full     Empty
Increment   Before      STMIB                         LDMIB
                        STMFA                         LDMED
            After                STMIA       LDMIA
                                 STMEA       LDMFD
Decrement   Before               LDMDB       STMDB
                                 LDMEA       STMFD
            After       LDMDA                         STMDA
                        LDMFA                         STMED
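The four block copy addressing modes differ only in which addresses are used and where the base register ends up after write-back. The sketch below computes both, assuming the usual ARM rule that the lowest-numbered register always goes to the lowest address; the function name `stm_addresses` is hypothetical.

```python
def stm_addresses(base, nregs, mode):
    """Return (addresses, new_base) for a store multiple with write-back.

    mode is one of "IA", "IB", "DA", "DB". Addresses are listed in
    ascending order, one 4-byte word per register.
    """
    size = 4 * nregs
    if mode == "IA":        # increment after
        start, new_base = base, base + size
    elif mode == "IB":      # increment before
        start, new_base = base + 4, base + size
    elif mode == "DA":      # decrement after
        start, new_base = base - size + 4, base - size
    elif mode == "DB":      # decrement before
        start, new_base = base - size, base - size
    else:
        raise ValueError("unknown addressing mode: " + mode)
    return [start + 4 * i for i in range(nregs)], new_base
```

With r9 = 0x1000 and three registers, the incrementing forms leave r9 at 0x100c and the decrementing forms leave it at 0xff4. In the stack view, STMDB is the same instruction as STMFD (push onto a full descending stack).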
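[Figure: single data transfer offset encoding: offset register Rm (bits 3:0) with shift type Sh (bits 6:5) and shift length #shift (bits 11:7)]
[Figure: half-word and signed byte data transfer offset encoding: with bit 22 = 1 the offset is an immediate split into Imm[7:4] (bits 11:8) and Imm[3:0] (bits 3:0); with bit 22 = 0 it is offset register Rm (bits 3:0) with bits 11:8 = 0000]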
<off> is +/-Rm or +/- 12-bit (byte, word) or 8-bit (signed or halfword) immediate

Data type encoding:
S  H  Data type
1  0  Signed byte
0  1  Unsigned half-word
1  1  Signed half-word
[Figure: multiple register transfer binary encoding. Fields: base register, load/store, write-back (auto-index), restore PSR and force user bit, up/down, pre-/post-index]
Assembler format:
LDM|STM{<cond>}<add> Rn{!}, <regs>
<add> = IA etc., <regs> = {rn,..rm}
Assembler format:
SWP{<cond>}{B} Rd, Rm, [Rn]
BNE LOOP ..
Branch conditions

Branch  Interpretation    Normal uses
B       Unconditional     Always take this branch
BAL     Always            Always take this branch
BEQ     Equal             Comparison equal or zero result
BNE     Not equal         Comparison not equal or non-zero result
BPL     Plus              Result positive or zero
BMI     Minus             Result minus or negative
BCC     Carry clear       Arithmetic operation did not give carry-out
BLO     Lower             Unsigned comparison gave lower
BCS     Carry set         Arithmetic operation gave carry-out
BHS     Higher or same    Unsigned comparison gave higher or same
BVC     Overflow clear    Signed integer operation; no overflow occurred
BVS     Overflow set      Signed integer operation; overflow occurred
BGT     Greater than      Signed integer comparison gave greater than
BGE     Greater or equal  Signed integer comparison gave greater or equal
BLT     Less than         Signed integer comparison gave less than
BLE     Less or equal     Signed integer comparison gave less than or equal
BHI     Higher            Unsigned comparison gave higher
BLS     Lower or same     Unsigned comparison gave lower or same
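Each branch condition is a test on the N, Z, C and V flags. The sketch below models a comparison (a subtraction that only sets flags) and evaluates a condition on the result. The flag tests are the standard ARM definitions, which are not spelled out in the text above; `cmp_flags` and `branch_taken` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Flags (N, Z, C, V) produced by comparing a with b, i.e. a - b."""
    a &= MASK32
    b &= MASK32
    full = a + ((~b) & MASK32) + 1        # a - b as a + NOT(b) + 1
    result = full & MASK32
    n = result >> 31
    z = int(result == 0)
    c = full >> 32                        # C set means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31    # signed overflow on subtract
    return n, z, c, v


CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "HS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "LO": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}


def branch_taken(cond, a, b):
    """Would B<cond> be taken after comparing a with b?"""
    return CONDITIONS[cond](*cmp_flags(a, b))
```

Comparing 0xFFFFFFFF with 1 shows why both signed and unsigned conditions exist: BHI is taken (unsigned 4294967295 is higher than 1) and so is BLT (signed -1 is less than 1).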
Assembler format:
B{L} {<cond>} <target address>
only available on recent ARM chips
used to switch execution to the Thumb instruction set if Rm[0] = 1
Software Interrupt
[Figure: SWI binary encoding: cond (31:28), 1111 (27:24), 24-bit (interpreted) immediate (23:0)]
this instruction is the normal way to access operating system facilities; it:
puts the processor into supervisor mode
saves the CPSR in SPSR_svc
sets the PC to 0x8
Assembler format:
SWI{<cond>} <24-bit immediate>
Assembler format:
MRS{<cond>} Rd, CPSR|SPSR
[Figure: operand encoding: with bit 25 = 1 the operand is an 8-bit immediate (bits 7:0) with immediate alignment #rot (bits 11:8); with bit 25 = 0 it is operand register Rm (bits 3:0) with bits 11:4 zero]
Coprocessor instructions

Coprocessor data processing instructions
[Figure: binary encoding: cond (31:28), 1110 (27:24), Cop1 (23:20), CRn (19:16), CRd (15:12), CP# (11:8), Cop2 (7:5), 0 (4), CRm (3:0)]
CP# specifies the coprocessor number
the coprocessor performs the operation specified by Cop1 and Cop2 on data in CRn and CRm, putting the result in CRd
other interpretations are possible!

Coprocessor data transfer instructions
[Figure: binary encoding: cond (31:28), 110 (27:25), P (24), U (23), N (22), W (21), L (20), Rn (19:16), CRd (15:12), CP# (11:8), 8-bit offset (7:0). Fields: source/destination register, base register, load/store, write-back (auto-index), data size (coprocessor dependent), up/down, pre-/post-index]

Coprocessor register transfer instructions
[Figure: binary encoding: cond (31:28), 1110 (27:24), Cop1 (23:21), L (20), CRn (19:16), Rd (15:12), CP# (11:8), Cop2 (7:5), 1 (4), CRm (3:0)]
move a 32-bit value between the coprocessor and ARM (including CPSR)
examples: floating-point FIX, FLOAT and compare
Memory faults
ARM has full support for memory faults. Accesses may fail because of:
virtual memory page faults
memory protection violations
soft memory errors
Prefetch aborts are faults on instruction fetch
Data aborts are faults on data transfers
both are recoverable with a little work
The CPSR and SPSR format:
bits 0 to 4 define the operating mode
bit 5 controls the instruction set: ARM (T=0) or Thumb (T=1)
bit 6 disables FIQ when set
bit 7 disables IRQ when set
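The control bits of a status register can be picked apart with masks and shifts. The sketch below follows the bit positions listed above; the mode encodings in `MODES` are the standard ARM values and are an addition not stated in the text, and `decode_psr` is a hypothetical helper name.

```python
# Standard ARM mode encodings for bits 4:0 (assumed, not given in the text)
MODES = {
    0b10000: "user",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abort",
    0b11011: "undefined",
    0b11111: "system",
}


def decode_psr(psr):
    """Decode the control bits (7:0) of a CPSR or SPSR value."""
    return {
        "mode": MODES.get(psr & 0x1F, "unknown"),   # bits 0 to 4
        "thumb": bool(psr & (1 << 5)),              # T bit
        "fiq_disabled": bool(psr & (1 << 6)),       # F bit
        "irq_disabled": bool(psr & (1 << 7)),       # I bit
    }
```

A value whose low byte is 0xD3 decodes as supervisor mode, ARM state, with both FIQ and IRQ disabled.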
this is a special form, with the S bit set and the pc as destination
it restores the CPSR from SPSR_exc as well
Examples
Hello World assembly program:
        AREA    HelloW, CODE, READONLY  ; declare area
SWI_WriteC EQU  &0                      ; output character in r0
SWI_Exit   EQU  &11                     ; finish program
        ENTRY                           ; code entry point
START   ADR     r1, TEXT                ; r1-> "Hello World"
LOOP    LDRB    r0, [r1], #1            ; get the next byte
        CMP     r0, #0                  ; check for text end
        SWINE   SWI_WriteC              ; if not end print ..
        BNE     LOOP                    ; .. and loop back
        SWI     SWI_Exit                ; end of execution
TEXT    =       "Hello World", 0        ; string data (assumed; zero-terminated)
        END                             ; end of program source
Subroutine to print r1 in hexadecimal:
HexOut  MOV     r2, #8                  ; nibble count = 8
LOOP    MOV     r0, r1, LSR #28         ; get top nibble
        CMP     r0, #9                  ; 0-9 or A-F?
        ADDGT   r0, r0, #"A"-10         ; ASCII alphabetic
        ADDLE   r0, r0, #"0"            ; ASCII numeric
        SWI     SWI_WriteC              ; print character
        MOV     r1, r1, LSL #4          ; shift left one nibble
        SUBS    r2, r2, #1              ; decrement nibble count
        BNE     LOOP                    ; if more do next nibble
        MOV     pc, r14                 ; ... else return
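The hexadecimal output subroutine can be traced step by step in a high-level model. This sketch collects the characters into a string instead of calling an operating system routine; the function name `hex_out` is an assumption, and the variable names follow the registers used by the assembly routine.

```python
MASK32 = 0xFFFFFFFF


def hex_out(r1):
    """Return the 8-digit hexadecimal text of a 32-bit value."""
    r1 &= MASK32
    out = []
    for _ in range(8):                    # nibble count = 8
        r0 = r1 >> 28                     # get top nibble
        if r0 > 9:                        # 0-9 or A-F?
            r0 += ord("A") - 10           # ASCII alphabetic
        else:
            r0 += ord("0")                # ASCII numeric
        out.append(chr(r0))               # "print" character
        r1 = (r1 << 4) & MASK32           # shift left one nibble
    return "".join(out)
```

Calling `hex_out(0x1234ABCD)` yields the string "1234ABCD", most significant digit first, because each iteration takes the top nibble before shifting the rest up.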
Thumb is not a complete architecture
all Thumb-aware cores also support the ARM instruction set
therefore the Thumb architecture need only support common functions
The T bit in the CPSR controls the interpretation of the instruction stream
switch from ARM to Thumb (and back) by executing a BX instruction
exceptions also cause a switch to ARM code
return symmetrically to ARM or Thumb code
[Figure: Thumb-accessible registers, including the Hi registers and the CPSR]
a few instructions can access r8-15
the CPSR flags are set by data processing instructions and control conditional branches
support for 8-bit byte, 16-bit half-word and 32-bit word data types
half-words are aligned on 2-byte boundaries
words are aligned on 4-byte boundaries
[Figure: Thumb branch instruction binary encodings. Format (4), BX Rm: 010001110 (15:7), H (6), Rm (5:3), 000 (2:0)]
The Thumb SWI operates exactly like the ARM SWI
the (interpreted) immediate is restricted to 8 bits
the SWI handler is entered in ARM code
the return automatically selects ARM or Thumb
[Figure: Thumb data processing instruction binary encodings. Field layouts recoverable from the original: 000111 A #imm3 Rn Rd; 001 Op Rd/Rn #imm8; 000 Op #sh Rn Rd; 010001 Op D M Rm Rd/Rn; 1010 R Rd #imm8, which is format (7) ADD Rd, SP|PC, #imm8; 10110000 A #imm7]
all data processing operations on the Lo registers set the condition codes
those on the Hi registers do not, apart from CMP, which only changes the condition codes
[Figure: Thumb data transfer instruction binary encodings. Field layouts recoverable from the original: 1000 L #off5 Rn Rd; 0101 Op Rm Rn Rd; 01001 Rd #off8; 1001 L Rd; and a multiple register format with L, R and reg list fields]
Instruction pipeline
[Figure: a Thumb instruction with Rd and #imm8 fields expanded to a 32-bit ARM encoding with an always condition, a zero shift and the immediate value]
Thumb applications
Thumb code properties:
70% of the size of ARM code
30% less external memory power
40% more instructions
With 32-bit memory: ARM code is 40% faster than Thumb code
With 16-bit memory: Thumb code is 45% faster than ARM code
For the best performance: use 32-bit memory and ARM code
For best cost and power-efficiency: use 16-bit memory and Thumb code
In a typical embedded system:
use ARM code in 32-bit on-chip memory for small speed-critical routines
use Thumb code in 16-bit off-chip memory for large non-critical control routines