ARM Instruction Set

by
arn, KITCOEK
Instruction Set Summary
The Condition Field

Obsolete, unpredictable in ARM7TDMI


The Condition Field
• All ARM processor instructions are
conditionally executed, depending on the
values of the N, Z, C and V flags in the CPSR.
• If the always (AL - 1110) condition is specified,
the instruction will be executed irrespective of
the flags.
• The assembler treats the absence of a
condition code as though “always” had been
specified.
CPSR: Flags
• The N flag is set if the result is negative, otherwise it is
cleared (that is, N equals bit 31 of the result).
• The Z flag is set if the result is zero, otherwise it is cleared.
• The C flag is set to the carry-out from the ALU when the
operation is arithmetic (ADD, ADC, SUB, SBC, RSB, RSC,
CMP, CMN), or to the carry-out from the shifter otherwise.
If no shift is required, C is preserved.
• The V flag is preserved in non-arithmetic operations. In an
arithmetic operation it is set if there is an overflow from bit 30 into
bit 31 and cleared if no overflow occurs. It has significance only when
the operands are viewed as 2's complement signed values, and indicates
a result that is out of range: V is set if the true result is >= 2^31
or < -2^31.
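The flag rules above can be checked with a small model. The following is a minimal sketch in Python, not ARM code; the function names `add_with_flags` and `sub_with_flags` are hypothetical and only cover 32-bit addition and subtraction.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b, carry_in=0):
    """Return (result, N, Z, C, V) for the 32-bit sum a + b + carry_in."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in
    result = full & MASK32
    n = result >> 31                      # N equals bit 31 of the result
    z = int(result == 0)                  # Z is set only for a zero result
    c = int(full > MASK32)                # C is the carry out of bit 31
    # V: both operands have the same sign and the result's sign differs.
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, n, z, c, v

def sub_with_flags(a, b):
    """a - b computed as a + NOT(b) + 1, so C = 1 means 'no borrow'."""
    return add_with_flags(a, ~b & MASK32, 1)
```

Adding 1 to 0x7FFFFFFF sets N and V (the signed result left the range), while adding 1 to 0xFFFFFFFF sets Z and C but not V.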
Branch: B and Branch with Link: BL

• Branch instructions contain a signed 2's complement 24-bit offset.
• This is shifted left two bits, sign extended to 32 bits, and added to the PC. The instruction
can therefore specify a branch of +/- 32 Mbytes.
• The branch offset must take account of the prefetch operation, which causes the PC to be
2 words (8 bytes) ahead of the current instruction.
• Branches beyond +/- 32 Mbytes must use an offset or absolute destination which has
been previously loaded into a register. (In this case the PC should be manually saved in R14
if a branch-with-link type operation is required.)
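The offset calculation described above can be sketched as follows. This is an illustrative Python model, assuming the standard layout of condition in bits 31-28, the pattern 101 in bits 27-25 and the link bit in bit 24; `encode_branch` and `branch_target` are hypothetical helper names.

```python
def encode_branch(instr_addr, target_addr, link=False, cond=0b1110):
    """Build the 32-bit word for B/BL placed at instr_addr."""
    # The PC reads as instr_addr + 8 because of prefetch.
    byte_offset = target_addr - (instr_addr + 8)
    if byte_offset % 4:
        raise ValueError("target must be word aligned")
    word_offset = byte_offset >> 2
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is outside the +/- 32 Mbyte range")
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (word_offset & 0xFFFFFF)

def branch_target(instr_addr, word):
    """Recover the destination address from an encoded B/BL word."""
    offset = word & 0xFFFFFF
    if offset & 0x800000:          # sign extend the 24-bit field
        offset -= 1 << 24
    return (instr_addr + 8 + (offset << 2)) & 0xFFFFFFFF
```

A branch to itself at any address encodes as 0xEAFFFFFE, because the stored offset is -2 words relative to the prefetched PC.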
The link bit
• Branch with Link (BL) writes the old PC into the link
register (R14) of the current bank.
• The PC value written into R14 is adjusted (Lr= Lr -4)
• R14 contains the address of the instruction following
the “branch with link” instruction.
• Note that the CPSR is not saved with the PC.
• To return from a routine called by Branch with Link use
MOV PC,R14 if the link register is still valid.
• Branch and Branch with Link instructions take 3 cycles
before the instruction at the branch target executes.
– (1 cycle to execute the branch and 2 more to refill the fetch and
decode stages of the pipeline)
BL example
Cycle                1      2       3        4         5
Address  Operation
0x8000   BL          Fetch  Decode  Execute  Linkret   Adjust
                                             (Lr<-PC)  (Lr=Lr-4)
0x8004   X                  Fetch   Decode
0x8008   XX                         Fetch
0x8FEC   ADD                                 Fetch     Decode   Execute
0x8FF0   SUB                                           Fetch    Decode   Execute
0x8FF4   MOV                                                    Fetch    Decode
Assembler syntax
• B{L}{cond} <expression>
• Items in {} are optional. Items in <> must be present.
• {L} requests the Branch with Link form of the
instruction.
• If {L} is absent, R14 will not be affected by the
instruction.
• {cond} is a two-char mnemonic eg. EQ, NE, VS etc.
• <expression> is the destination. The assembler
calculates the offset
Examples
here  BAL  here       ;assembles to 0xEAFFFFFE (note effect of PC offset);
                      ;bits 31-24 are 1110 101 0
      B    there      ;ALways condition used as default
      CMP  R1,#0      ;compare R1 with zero and branch to fred if R1 was zero,
      BEQ  fred       ;otherwise continue to next instruction
      BL   sub+ROM    ;call subroutine at computed address
      ADDS R1,R1,#1   ;add 1 to register 1, setting CPSR flags on the result,
      BLCC sub        ;then call subroutine if the C flag is clear, which will
                      ;be the case unless R1 held 0xFFFFFFFF
Data Processing instructions

ADD R1, R2, #70, LSL #3


Data Processing instructions

• The instruction produces a result by performing a specified
arithmetic or logical operation on one or two operands.
• First operand is always a register
(Rn).
• Second operand may be
– a shifted register (Rm) if I bit = 0,
or
– a rotated 8-bit immediate value
(Imm) if I bit= 1 in the instruction
Data Processing instructions
• The condition codes in the CPSR may be
preserved or updated as a result of this
instruction, according to the value of the S-bit
in the instruction.
• Certain operations (TST, TEQ, CMP, CMN) do
not write the result to Rd. They are used only
to perform tests and to set the condition
codes on the result and always have the S bit
set.
The logical operations
• The logical operations (AND, EOR, TST, TEQ, ORR, BIC
(Bit clear), MOV, MVN) perform the logical action on all
corresponding bits of the operand or operands to
produce the result.
• If the S bit is set (and Rd is not R15):
– the V flag in the CPSR will be unaffected
– the C flag will be set to the carry out from the barrel shifter
(or preserved when the shift operation is LSL #0)
– the Z flag will be set if and only if the result is all zeros
– the N flag will be set to the logical value of bit 31 of the
result
ARM data processing instructions
Why C= C-1 for SBC instruction?
Perform 0x0223- 0x0107 on 8 bit ALU
Similarly 0x0207- 0x0123
OR simply
Perform 0x23- 0x17 on 4 bit ALU
Similarly 0x27- 0x13

Comment on your result.
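One way to work through the exercise is to model the narrow ALU directly. The sketch below is a Python illustration under the assumption that subtraction is done as a + NOT(b) + carry, so that the carry flag means "no borrow"; `sub8` and `sub16_on_8bit_alu` are hypothetical names.

```python
def sub8(a, b, carry_in=1):
    """8-bit a - b - NOT(carry_in); returns (result, carry_out)."""
    full = (a & 0xFF) + (~b & 0xFF) + carry_in
    return full & 0xFF, full >> 8

def sub16_on_8bit_alu(x, y):
    """Subtract two 16-bit values using two passes through the 8-bit ALU."""
    low, carry = sub8(x & 0xFF, y & 0xFF)      # like SUBS on the low bytes
    high, _ = sub8(x >> 8, y >> 8, carry)      # like SBC: subtracts 1 more if C = 0
    return (high << 8) | low, carry
```

In the first case the low bytes need no borrow, so C = 1 and the high-byte SBC subtracts nothing extra. In the second case the low bytes borrow, C = 0, and the SBC term C - 1 removes the borrowed unit from the high byte.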


The arithmetic operations
• The arithmetic operations (SUB, RSB, ADD, ADC,
SBC, RSC, CMP, CMN) treat each operand as a 32-
bit integer.
• If the S bit is set (and Rd is not R15):
– the V flag in the CPSR will be set if an overflow occurs
into bit 31 of the result
– the C flag will be set to the carry out of bit 31 of the
ALU
– the Z flag will be set if and only if the result was zero
– the N flag will be set to the value of bit 31 of the result
Shifts: 2nd operand is register

Ex: ADD R1, R2, R3, LSL #3 Ex: ADD R1, R2, R3, LSL R4
Shifts: 2nd operand is 8bit #no.

Ex: ADD R1, R2, #0x70, LSR #3


•The immediate operand rotate field is a 4 bit unsigned integer which
specifies a shift operation on the 8 bit immediate value.
•This value is zero extended to 32 bits, and then subject to a rotate
right by twice the value in the rotate field.
•This enables many common constants to be generated, for example
all powers of 2.
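The rotated-immediate scheme can be explored with a short model. This is a sketch in Python, assuming the 4-bit rotate field and 8-bit immediate described above; `encode_immediate` and `decode_immediate` are hypothetical helpers, and a real assembler may pick a different but equivalent encoding.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def encode_immediate(constant):
    """Return (rotate_field, imm8), or None if the constant cannot be encoded."""
    constant &= 0xFFFFFFFF
    for rotate_field in range(16):
        # Rotating left by 2*rotate_field undoes the hardware's rotate right.
        imm8 = ror32(constant, (32 - 2 * rotate_field) % 32)
        if imm8 <= 0xFF:
            return rotate_field, imm8
    return None

def decode_immediate(rotate_field, imm8):
    """Zero extend imm8 and rotate right by twice the rotate field."""
    return ror32(imm8, 2 * rotate_field)
```

Every power of 2 is encodable, while a value such as 0x101 is not, because its set bits span more than 8 positions.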
Logical shift left: 2nd operand is register

• When the shift amount is specified in the instruction, it is contained in a 5-bit field which
may take any value from 0 to 31.
• A logical shift left (LSL) takes the contents of Rm and moves each bit by the specified
amount to a more significant position.
• The least significant bits of the result are filled with zeros, and the high bits of Rm which
do not map into the result are discarded, except that the least significant discarded bit
becomes the shifter carry output, which may be latched into the C bit of the CPSR when
the ALU operation is in the logical class (see above). (For arithmetic operations C is
updated according to the arithmetic result.)
• For example, the effect of LSL #5 is shown in the figure "Logical shift left" above.
Logical shift right
Arithmetic shift right

• An arithmetic shift right (ASR) is similar to a logical shift
right, except that the high bits are filled with bit 31 of
Rm instead of zeros.
• This preserves the sign in 2's complement notation.
• For example, ASR #5 is shown in the figure "Arithmetic shift right" above.
Rotate right
Rotate right extended, (RRX).

• This is a rotate right by one bit position of the
33-bit quantity formed by appending the CPSR C
flag to the most significant end of the contents
of Rm, as shown in the figure above.
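The shift types covered in this part of the deck can be compared side by side with a small model. This is a Python sketch, assuming immediate shift amounts in their usual ranges; each hypothetical helper returns the shifted value together with the shifter carry output.

```python
MASK32 = 0xFFFFFFFF

def lsl(rm, n, carry_in=0):
    """Logical shift left, n in 0..31. LSL #0 preserves the carry."""
    if n == 0:
        return rm & MASK32, carry_in
    return (rm << n) & MASK32, (rm >> (32 - n)) & 1

def lsr(rm, n):
    """Logical shift right, n in 1..32. High bits are filled with zeros."""
    rm &= MASK32
    return rm >> n, (rm >> (n - 1)) & 1

def asr(rm, n):
    """Arithmetic shift right, n in 1..32. High bits are filled with bit 31."""
    rm &= MASK32
    signed = rm - (1 << 32) if rm >> 31 else rm
    return (signed >> n) & MASK32, (signed >> (n - 1)) & 1

def ror(rm, n):
    """Rotate right, n in 1..31. The carry is the new bit 31."""
    rm &= MASK32
    result = ((rm >> n) | (rm << (32 - n))) & MASK32
    return result, result >> 31

def rrx(rm, carry_in):
    """Rotate right by one bit through the carry flag (33-bit rotate)."""
    rm &= MASK32
    return (carry_in << 31) | (rm >> 1), rm & 1
```

For the same input 0x80000010 and a shift of 5, LSR gives 0x04000000 while ASR gives 0xFC000000, which shows how ASR keeps the sign.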
Assembler syntax
• MOV,MVN - single operand instructions
• <opcode>{cond}{S} Rd,<Op2>
• CMP,CMN,TEQ,TST - instructions which do not
produce a result.
<opcode>{cond} Rn,<Op2>
• AND,EOR,SUB,RSB,ADD,ADC,SBC,RSC,ORR,BIC
<opcode>{cond}{S} Rd,Rn,<Op2>
• <Op2> is Rm{,<shift>} or,<#expression>
• {cond} two-character condition mnemonic
• {S} set condition codes if S present (implied for
CMP, CMN, TEQ, TST).
• Rd, Rn and Rm can be expressions evaluating
to a register number.
• <#expression> if used, the assembler will
attempt to generate a shifted immediate 8-bit
field to match the expression. If this is
impossible, it will give an error.
• <shift> is <shiftname> <register> or
<shiftname> #expression,or RRX (rotate right
one bit with extend).
• <shiftname> is: ASL, LSL, LSR, ASR, ROR.
– (ASL is a synonym for LSL; they assemble to the
same code.)
Examples
• ADDEQ R2,R4,R5       ;if the Z flag is set make R2:=R4+R5
• TEQS R4,#3           ;test R4 for equality with 3 (the S is in fact
                       ;redundant as the assembler inserts it automatically)
• SUB R4,R5,R7,LSR R2  ;logical right shift R7 by the number in the bottom
                       ;byte of R2, subtract result from R5, and put the
                       ;answer into R4
• MOV PC,R14           ;return from subroutine
• MOVS PC,R14          ;return from exception and restore CPSR from
                       ;SPSR_mode
Single Data Transfer (LDR, STR)
Offsets and auto-indexing

• The offset from the base register Rn may be either
– a 12-bit unsigned binary immediate value in the
instruction, or
– a second register (possibly shifted in some way).
• The offset may be added to (U=1) or subtracted from
(U=0) the base register Rn.
• The offset modification may be
performed either before (pre-indexed, P=1) or after
(post-indexed, P=0) the base is used as the transfer
address.
W bit

• The W bit gives optional auto-increment and
auto-decrement addressing modes.
• The modified base value may be written back into
the base (W=1), or the old base value may be
retained (W=0).
• STR R1,[R2,R4]! ;store R1 at R2+R4 (both of which
;are registers) and write back address to R2
• {!}: writes back the base register (sets the W bit) if
! is present in the instruction syntax.
ARM ADDRESSING MODES
Immediate Addressing
• The data is directly specified in the instruction.
• Useful for getting constants into registers.
• Immediate data must be preceded with a “#”
sign.
• Ex: ADD R1, R2, #0x70, LSL #3
• Ex: MOV R1, #03
Register Addressing Mode
• Direct access to registers – R0 through R15
• MOV PC,R14
• MOV R1, R2
Direct Addressing
• Direct addressing can access any memory
location in the system
• The complete address is specified in the
instruction
• ARM does not offer this mode
Indirect Addressing
• Any register may be used as pointer register
where the contents of register indicate an
address in memory where data is to be read
or written
• Ex: STR R1, [R2]
– Store contents of R1 at memory location pointed
by R2 (with zero offset)
Indexed Addressing
• Use a register for storing a Base address and
an immediate number or another register for
storing an offset.
• The effective address is the sum of the two:
• EA = Base address + Offset
Pre and Post indexing
• Pre-indexed= First find effective address and
then store the operand at calculated EA
• Post-indexed= First store operand at base
address and then find effective address and
mandatorily write back (store) the EA in base
register
– The effective address written back in base register
after operand storage in memory will be utilised
for next operand storage
A pre-indexed addressing specification
With base register and zero offset
• [Rn]: with offset value zero
– STR R0, [R1]; store R0 to address in R1
With base register and 12 bit offset
• [Rn,<#expression>]{!}
– STR R0, [R1, #12]; store R0 at an address pointed by [R1+0x0c]
– STR R0, [R1, #12]!; store R0 at an address pointed by [R1+0x0c] and
R1<- R1+0x0c
• {!}: writes back the base register (set the W bit) if ! is present.
With base register and offset in another register
• [Rn,{+/-}Rm{,<shift>}]{!}: offset of +/- contents of index register,
shifted by <shift>
– LDR R0, [R1, -R2, LSL #2]; Negative offset shifted by 2 bits
– LDR R6, [R0, R1, ROR #6]!; Preindexed + write back
ARM Post-indexed addressing
• In post-indexed data transfers, the write-back
is compulsory and modifies base register Rn.
• The write-back bit is redundant and is
always set to zero
– the old base value can be retained by setting the
offset to zero.
• STR R1,[R2],R4 ;store R1 at R2 and write back
;R2+R4 to R2
A post-indexed addressing
specification:
With base register and 12 bit offset
• [Rn],<#expression>
– STR R0, [R1], #12; first store R0 to memory
pointed by R1 and modify R1 to R1 + 0x0c
With base register and offset in another register
• [Rn],{+/-}Rm{,<shift>} :offset of +/- contents of
index register, shifted as by <shift>.
– LDR R2, [R0], R4, ASR #4; load R2 from the address in R0,
and set R0 = R0 + R4/16 after the load operation
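The pre-indexed and post-indexed forms differ only in which address is used for the transfer and whether the base is updated. The sketch below models that in Python; `ldr_str_address` is a hypothetical helper and the offset is assumed to be already shifted.

```python
def ldr_str_address(base, offset, pre_indexed, write_back=False, up=True):
    """Return (transfer_address, new_base) for a single LDR or STR."""
    delta = offset if up else -offset          # U bit: add or subtract
    modified = (base + delta) & 0xFFFFFFFF
    if pre_indexed:
        # P=1: the modified address is used; the base changes only with '!'.
        return modified, (modified if write_back else base)
    # P=0: the old base is used and the write-back always happens.
    return base, modified
```

With R1 = 0x1000, `STR R0,[R1,#12]` transfers at 0x100C and leaves R1 alone, `STR R0,[R1,#12]!` also updates R1 to 0x100C, and `STR R0,[R1],#12` transfers at 0x1000 and then updates R1.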
Assembler syntax
• <LDR|STR>{cond}{B/H} Rd,<Address>
• LDR load from memory into a register
• STR store from a register into memory
• {cond} two-character condition mnemonic
• {B/H} if B is present then byte transfer, H for
half word, otherwise word transfer
• Rd is an expression evaluating to a valid
register number.
Assembler syntax
• <Address> can be:
1. An expression which generates an address:
<expression>
2. A pre-indexed addressing specification.
3. A post-indexed addressing specification.
LDR/ STR Examples
• STR R1,[R2,R4]! ;store R1 at R2+R4 (both of which are
registers) and write back address to R2
• STR R1,[R2],R4 ;store R1 at R2 and write back R2+R4 to R2
• LDR R1,[R2,#16] ;load R1 from the memory location
pointed to by R2+16; don't write back the sum in R2
• LDR R1,[R2,R3,LSL#2] ;load R1 from contents of R2+R3*4
• LDREQB R1,[R6,#5] ;conditionally load byte at R6+5 into R1
;bits 0 to 7, filling bits 8 to 31 with zeros
• STR R1,PLACE ;generate PC relative offset to address PLACE

• PLACE
Most often used LDR/STR inst.
Loads   Stores   Size and type
LDR     STR      Word (32 bits)
LDRB    STRB     Byte (8 bits)
LDRH    STRH     Half word (16 bits)
LDRSB   --       Signed byte, sign extended to 32 bits
LDRSH   --       Signed half word, sign extended to 32 bits
LDM     STM      Multiple words
Examples: cont’d
LDRH r11, [r0] ; load half word into r11 from the address pointed to
;by r0 = 0x00008000

R11 before load: 0x12345678
R11 after load:  0x0000FFEE

Address   Memory
0x8000    0xEE
0x8001    0xFF
0x8002    0x90
0x8003    0xA7
Examples: cont’d
• LDRSB r11, [r0] ; load signed byte into r11
;r0 = 0x8000

R11 before load: 0x12345678
R11 after load:  0xFFFFFFEE

Address   Memory
0x8000    0xEE
0x8001    0x8C
0x8002    0x90
0x8003    0xA7
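The zero-extending and sign-extending loads in these examples can be reproduced with a small memory model. This is a Python sketch that assumes little-endian byte order; the dictionary `memory` and the helper `load` are hypothetical names.

```python
# Byte contents taken from the LDRH example.
memory = {0x8000: 0xEE, 0x8001: 0xFF, 0x8002: 0x90, 0x8003: 0xA7}

def load(mem, address, size, signed=False):
    """Read size bytes (little endian) and extend the value to 32 bits."""
    raw = bytes(mem[address + i] for i in range(size))
    value = int.from_bytes(raw, "little", signed=signed)
    return value & 0xFFFFFFFF      # negative values become 0xFFFF... patterns

ldrh = load(memory, 0x8000, 2)                  # like LDRH: zero extended
ldrsb = load(memory, 0x8000, 1, signed=True)    # like LDRSB: sign extended
```

Because bit 7 of the byte 0xEE is set, the signed byte load fills bits 8 to 31 with ones, while the half word load fills bits 16 to 31 with zeros.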
Example: Post index addressing
• R3 = 0xFEEDBABE
• R8 = 0x00008000
STR r3, [r8], #4

R8 before store: 0x00008000
R8 after store:  0x00008004

Address   Memory (little endian)   Memory (big endian)
          before > after           after
0x8000    0x16 > 0xBE              0xFE
0x8001    0xEF > 0xBA              0xED
0x8002    0x9A > 0xED              0xBA
0x8003    0xFC > 0xFE              0xBE
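The byte placement in the table can be reproduced with a few lines of Python. This is only a sketch; `store_word_post_indexed` is a hypothetical helper that writes a word in the chosen byte order and returns the written-back base.

```python
def store_word_post_indexed(mem, value, base, offset, byteorder):
    """Model STR Rd,[Rn],#offset: store at base, then return base + offset."""
    for i, byte in enumerate(value.to_bytes(4, byteorder)):
        mem[base + i] = byte
    return (base + offset) & 0xFFFFFFFF

little, big = {}, {}
r8_after = store_word_post_indexed(little, 0xFEEDBABE, 0x8000, 4, "little")
store_word_post_indexed(big, 0xFEEDBABE, 0x8000, 4, "big")
```

In little-endian order the least significant byte 0xBE lands at the lowest address, and in big-endian order the most significant byte 0xFE does.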
Block Data Transfer (LDM, STM)
• Block data transfer instructions are used to
load (LDM) or store (STM) any subset of the
currently visible registers.
• very efficient for saving or restoring context,
or for moving large blocks of data around
main memory.
• When is context switching required?
– Subroutine call, Mode switch, Process switch
Block data transfer instruction
The register list
• The instruction can cause the transfer of any
registers in the current bank
• The register list is a 16 bit field in the instruction,
with each bit corresponding to a register.
– A 1 in 0th bit position of the “register field” will cause
R0 to be transferred,
– a 0 will cause it not to be transferred
– similarly 1st bit position in the “register field” controls
the transfer of R1, and so on.
The register list
• Any subset of the registers, or all the registers,
may be specified.
• The only restriction is that the register list
should not be empty.
• Whenever R15 is stored to memory the stored
value of PC is the address of the STM
instruction plus 12
The register list
• The registers are transferred in the order
lowest to highest, so R15 (if in the list) will
always be transferred last.
• The lowest register gets transferred to the
lowest memory address
• The lowest memory address gets transferred
to lowest register
Consider the transfer of R1, R5 and R7 in the case where
Rn=0x1000 and write-back of the modified base is required (W=1).

Post-increment addressing: Empty stack, Ascending
STMEA R0!, {R1, R5, R7}
• First store, then increment.

Pre-increment addressing: Full stack, Ascending
STMFA R0!, {R1, R5, R7}
• First increment, then store.
• The base register points to filled memory.
• Stack like the 8051.

Post-decrement addressing: Empty stack, Descending
STMED R0!, {R1, R5, R7}
• First store, then decrement.
• The lower register is stored at the lower address.

Pre-decrement addressing: Full stack, Descending
STMFD R0!, {R1, R5, R7}
• First decrement, then store.
• Stack like the 8085.
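The four cases can be tabulated with a small model. The sketch below is in Python and assumes the usual correspondence between the stack names and the increment/decrement names for STM (EA = IA, FA = IB, ED = DA, FD = DB); `stm_addresses` is a hypothetical helper.

```python
STM_STACK_ALIASES = {"EA": "IA", "FA": "IB", "ED": "DA", "FD": "DB"}

def stm_addresses(base, registers, mode):
    """Return ({register: address}, new_base) for an STM with write-back."""
    mode = STM_STACK_ALIASES.get(mode, mode)
    regs = sorted(registers)            # lowest register goes to lowest address
    size = 4 * len(regs)
    if mode in ("IA", "IB"):
        start = base + (4 if mode == "IB" else 0)
        new_base = base + size
    else:
        start = base - size + (4 if mode == "DA" else 0)
        new_base = base - size
    return {r: start + 4 * i for i, r in enumerate(regs)}, new_base
```

For example, `stm_addresses(0x1000, [1, 5, 7], "FD")` places R1, R5 and R7 at 0x0FF4, 0x0FF8 and 0x0FFC and leaves the base at 0x0FF4.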
Use of the S bit
• When the S bit is set in a LDM/STM instruction
its meaning depends on whether or not
– R15 is in the transfer list
– and on the type of instruction (LDM/STM).
• The S bit should only be set if the instruction is
to execute in a privileged mode.
What is the difference between
MOV PC, LR and MOVS PC, LR?
STM with R15 in transfer list and S bit
set (Save User mode registers)
• The registers transferred are taken from the
User bank rather than the bank corresponding
to the current mode.
• This can be useful when you have switched to
Supervisor mode from User mode (due to
execution of an SWI instruction) and now
want to save the context of User mode
• This instruction is executed from a privileged
mode
LDM with R15 in transfer list and S bit set
Mode change: privileged to User
• If the instruction is an LDM then SPSR_<mode> is
transferred to CPSR at the same time as R15 is loaded.
Example: when you switch from Supervisor mode to
User mode, the context of User mode (the set of stored
registers and the CPSR) is loaded into the required registers
• PC is loaded with the address you want to jump to
• The saved SPSR_<mode> is restored into the CPSR
– so all the flags are restored when you move to User mode from
Supervisor mode
• This instruction is executed from a privileged mode
– This is a clear case of context switching from a privileged
mode to User mode
R15 not in list and S bit set (User bank
transfer)
• For both LDM and STM instructions, the User
bank registers are transferred rather than the
register bank corresponding to the current
mode.
• This is useful for saving the user state on
process switches.
• Base write-back shall not be used when this
mechanism is employed.
Use of R15 as the base register
• R15 must not be used as the base register in
any LDM or STM instruction
Assembler syntax
• <LDM|STM>{cond}<FD|ED|FA|EA|IA|IB|DA|DB>
Rn{!},<Rlist>{^}
• where:
– {cond} is a two-character condition mnemonic,
– Rn is an expression evaluating to a valid register number
– <Rlist> is a list of registers and register ranges enclosed in
{} (e.g. {R0,R2- R7,R10}).
– {!} (if present) requests write-back (W=1), otherwise W=0
– {^} (if present) set S bit to load the CPSR along with the PC,
or force transfer of user bank when in privileged mode
Addressing mode names
• There are different assembler mnemonics for
each of the addressing modes, depending on
whether the instruction is being used to
support stacks or for other purposes.
Addressing mode names
• FD, ED, FA, EA define pre/post indexing and the
up/down bit by reference to the form of stack
required.
– F: Full stack (a pre-index has to be done before storing
to the stack)
– E: Empty stack: Post- inc/decrement addressing
– A: The stack is ascending (an STM will go up and LDM
down)
– D: The stack is descending (an STM will go down and
LDM up)
Addressing mode names
• The following symbols allow
control when LDM/STM are
not being used for stacks:
– IA Increment After
– IB Increment Before(8051)
– DA Decrement After
– DB Decrement Before(8085)
9000
9004 00000009
9008 00000008
900c 0000007
Examples
• LDMFD SP!,{R0,R1,R2} ;unstack 3 registers (8085)
• LDMFD SP!,{R15} ;R15 <- [SP],CPSR unchanged
• STMFD R13,{R0-R14}^ ;R13 is SP; save user mode regs on stack
;(allowed only in privileged modes)
• LDMFD SP!,{R15}^ ;R15 <- [SP], ^ indicate set S bit
;CPSR <- SPSR_mode (allowed only in
;privileged modes)
Nested CALL
Main Program:
    .
    BL fun1
    ADD R1 …
    .
Function 1:
    STMFD SP! (regs, LR)
    .
    BL fun2
    SUB R1, R2,R3;
    .
    LDMFD SP! (regs, LR)
    MOV PC, LR
Function 2:
    .
    MOV PC, LR
Examples
• These instructions may be used to save state on
subroutine entry, and restore it efficiently on
return to the calling routine:
• STMED SP!,{R0-R3,R14}
;save R0 to R3 to use as workspace
;and R14 (LR) for returning
• BL somewhere
;this nested call will overwrite R14
• LDMED SP!,{R0-R3,R15}
;restore workspace and return
Single Data Swap: SWP
• The data swap instruction is used to swap a
byte or word quantity between a register and
external memory.
• This instruction is implemented as a memory
read followed by a memory write which are
“locked” together (the processor cannot be
interrupted until both operations have
completed, and the memory manager is
warned to treat them as inseparable).
Encoding

• This instruction class may be used to swap a byte (B=1) or a word
(B=0) between an ARM processor register and memory.
• The SWP instruction is implemented as an LDR followed by an STR
• Restriction 1: Do not use R15 as an operand (Rd, Rn or Rm) in a SWP
instruction.
• Restriction 2: Rm and Rn should not be the same
Assembler syntax
• <SWP>{cond}{B} Rd,Rm,[Rn]
• {cond} two-character condition mnemonic
• {B} if B is present then byte transfer, otherwise
word transfer
[Figure: SWP data flow between Rm, Rd, a temporary register and memory at [Rn], in three numbered steps]
• Rm and Rn can't be the same
• Rm and Rd may be the same
Examples
• SWP R0,R1,[R2] ;1st: load R0 with the word
addressed by R2; 2nd: store R1 at the memory
location pointed to by R2
• SWPB R2,R3,[R4] ;load R2 with the byte
addressed by R4, and store bits 0 to 7 of R3
at the location pointed to by R4
• SWPEQ R0,R0,[R1] ;conditionally swap the
contents pointed to by R1 with R0
Semaphores
• Semaphores are used to manage access to a
shared resource.
• Before accessing a resource, a client must read
the semaphore value and check, whether the
client can proceed, or it must wait.
• When the client is about to proceed it must
change the semaphore value to inform other
clients.
• A fundamental issue with semaphores is that
they themselves are shared resources, and must
be protected against corruption
Atomic access

• In order to implement a reliable semaphore,
we must guarantee atomic access (atomos,
Greek, uncuttable), i.e. reading the
semaphore value, checking it and writing the
modified value back must occur in an
uninterruptible sequence.
• Otherwise, a second client might see the
same semaphore value before the first one
has had a chance to write back the new value
Use of SWP to check semaphore
• The SWP instruction can be used to implement a
binary semaphore, also known as mutex.
• To implement other types of semaphores, a
mutex would have to protect the actual
semaphore, making the process a bit more
complex.
• SWP carries out a read from memory followed by
a write to memory. The instruction is not
interruptible and blocks the system bus for the
entire transaction so that no other master can be
granted access between read and write accesses
Situation
[Figure: Process 1 and Process 2 both connected to Memory]
Process 1 and Process 2 are trying to access the same array in a
shared database, each in its respective time slot.
Sample code
LOCKED EQU 0             ; define value indicating "locked"
        LDR r1, <addr>   ; load r1 with the semaphore address
        MOV r0, #LOCKED  ; preload "locked" value, i.e. r0 = 0
spin_lock
        SWP r0, r0, [r1] ; swap register r0 with the semaphore at [r1]
        CMP r0, #LOCKED  ; if the semaphore was locked already,
        BEQ spin_lock    ; retry; else lock and use the shared resource
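The swap-and-test idea behind the spin lock can be modelled in a high-level language. The following Python sketch is only an analogy: a `threading.Lock` stands in for the locked bus transaction, the value `UNLOCKED = 1` is an assumption (the sample only defines LOCKED as 0), and `Memory`, `try_lock` and `unlock` are hypothetical names.

```python
import threading

LOCKED = 0
UNLOCKED = 1   # assumed value for a free semaphore

class Memory:
    def __init__(self):
        self.words = {}
        self._bus = threading.Lock()   # stands in for the indivisible read+write

    def swp(self, rm_value, address):
        """Return the old word at address and store rm_value there, atomically."""
        with self._bus:
            old = self.words[address]
            self.words[address] = rm_value
            return old

def try_lock(mem, sem_addr):
    """One pass of the spin loop: True if this caller obtained the semaphore."""
    return mem.swp(LOCKED, sem_addr) != LOCKED

def unlock(mem, sem_addr):
    mem.words[sem_addr] = UNLOCKED
```

A caller that swaps in LOCKED and reads back LOCKED knows someone else holds the resource and must retry; reading back any other value means the lock has just been taken by this caller.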
ARM Block diagram revisited
MUL: Multiply -> 32-bit result

• The multiply form of the instruction gives
Rd:=Rm*Rs.
• Rn is ignored, and should be set to zero for
compatibility with possible future upgrades to
the instruction set.
MLA: Multiply & Accumulate: 32bit
result
• The multiply-accumulate form gives Rd:=(Rm*Rs)+Rn,
• which can save an explicit ADD instruction in some
circumstances.
• It can be used for both signed and unsigned
multiplications
– As these instructions produce only the lower 32 bits of a
multiplication out of 64 bit result
• The results of a signed multiply and that of an unsigned
multiply of 32-bit operands differ only in the upper 32
bits; the low 32 bits of the signed and unsigned results
are identical.
Example
• For example consider the multiplication of the operands:
Operand A Operand B Result
0xFFFFFFF6 0x00000014 0xFFFFFF38
• Case 1, A: -ve: If the operands are interpreted as signed,
operand A has the value -10, operand B has the value 20,
and the result is -200 which is correctly represented as
0xFFFFFF38
• Case 2, A: unsigned: If the operands are interpreted as
unsigned, operand A has the value 4294967286, operand B
has the value 20 and the result is 85899345720, which is
represented as 0x13FFFFFF38, so the least significant 32
bits are 0xFFFFFF38.
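The claim that signed and unsigned products agree in their low 32 bits can be checked numerically. This is a minimal Python sketch; `to_signed32`, `mul32` and `mla32` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a 2's complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x >> 31 else x

def mul32(rm, rs):
    """Model MUL: keep only the low 32 bits of the product."""
    return (rm * rs) & MASK32

def mla32(rm, rs, rn):
    """Model MLA: low 32 bits of (Rm*Rs)+Rn."""
    return (rm * rs + rn) & MASK32

a, b = 0xFFFFFFF6, 0x00000014
unsigned_low = mul32(a, b)
signed_low = mul32(to_signed32(a), to_signed32(b))
```

Both interpretations produce 0xFFFFFF38 in the low word; only the upper 32 bits of the full 64-bit product differ.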
Operand restrictions
• The destination register (Rd) should not be the
same as the operand register (Rm) in Rd:=Rm*Rs,
e.g. R1:=R1*R2 (MUL R1,R1,R2) is not allowed,
as Rd is used to hold intermediate values and Rm is
used repeatedly during the multiply.
• A MUL will give a zero result if Rm=Rd, and an
MLA will give a meaningless result.
• R15 must not be used as an operand or as the
destination register.
• All other register combinations will give correct
results.
CPSR flags
• Setting the CPSR flags is optional, and is
controlled by the S bit in the instruction.
• The N (Negative) and Z (Zero) flags are set
correctly on the result (N is made equal to
bit 31 of the result, and Z is set if and only if
the result is zero).
• The C (Carry) flag is set to a meaningless value
and the V (oVerflow) flag is unaffected
Assembler syntax
• MUL{cond}{S} Rd,Rm,Rs
• MLA{cond}{S} Rd,Rm,Rs,Rn
• where:
• {cond} two-character condition mnemonic
• {S} set condition codes if S present
• Rd, Rm, Rs, Rn are expressions evaluating to a register
number other than R15.
• Examples
– MUL R1,R2,R3 ;R1:=R2*R3
– MLAEQS R1,R2,R3,R4 ;conditionally R1:=(R2*R3)+R4,
;setting condition codes
SMLAL – Signed MuLtiply-Accumulate Long

• SMLAL{<cond>}{S} <Rd_LSW>, <Rd_MSW>, <Rm>, <Rs>
• RTL:
if(cond)
Rd_MSW:Rd_LSW(n) <- Rd_MSW:Rd_LSW(n-1) + (Rs • Rm)
Usage and Examples:
• SMLAL performs a signed 32x32 multiply
operation with a 64-bit accumulation.
• The product of Rm and Rs is added to the 64-
bit signed value contained in the register pair
Rd_MSW:Rd_LSW.
– All values are interpreted as 2’s-complement.
• The instruction below adds the product of R2
and R3 to the 64-bit number stored in R1:0.
– SMLAL R0, R1, R2, R3
SMULL – Signed MULtiply Long

• SMULL{<cond>}{S} <Rd_LSW>, <Rd_MSW>, <Rm>, <Rs>
• RTL:
if(cond)
Rd_MSW:Rd_LSW <- Rs • Rm
Usage and Examples:
• SMULL performs a signed 32x32 multiply
operation.
• The product of Rm and Rs is stored as a 64-bit
signed value in the register pair
Rd_MSW:Rd_LSW.
• All values are interpreted as 2’s-complement.
• The instruction below stores the product of R2
and R3 as a 64-bit number in R1:0.
– SMULL R0, R1, R2, R3; R1:R0 <- R2*R3
UMULL – Unsigned Multiply Long

Syntax:
• UMULL{<cond>}{S} <Rd_LSW>, <Rd_MSW>, <Rm>,
<Rs>
• RTL:
if(cond)
Rd_MSW:Rd_LSW <- Rs • Rm
• Flags updated if S used:N, Z (V, C are unpredictable)
UMULL: Usage and Examples:
UMULL performs an unsigned 32x32 multiply
operation. The product of Rm and Rs is
stored as a 64-bit unsigned value in the
register pair Rd_MSW:Rd_LSW.
• All values are interpreted as unsigned binary.
• The instruction below stores the product of R2
and R3 as a 64-bit number in R1:0.
– UMULL R0, R1, R2, R3
UMLAL – Unsigned Multiply-Accumulate Long

• Syntax:
UMLAL{<cond>}{S} <Rd_LSW>, <Rd_MSW>, <Rm>, <Rs>
• RTL:
if(cond)
Rd_MSW:Rd_LSW <- Rd_MSW:Rd_LSW + (Rs • Rm)
• Flags updated if S used: N, Z (V, C are unpredictable)
Usage and Examples: UMLAL

• UMLAL performs an unsigned 32x32 multiply
operation with a 64-bit accumulation.
• The product of Rm and Rs is added to the 64-bit
unsigned value contained in the register pair
Rd_MSW:Rd_LSW.
• All values are interpreted as unsigned binary.
• The instruction below adds the product of R2 and
R3 to the 64-bit number stored in R1:0.
• UMLAL R0, R1, R2, R3; R1:0 <- (R2 * R3)+ R1:0
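The four long multiplies differ only in signedness and in whether the old register pair is accumulated. The sketch below models them in Python; the function names mirror the mnemonics but are hypothetical helpers, and each returns the pair (Rd_LSW, Rd_MSW).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def _signed32(x):
    x &= MASK32
    return x - (1 << 32) if x >> 31 else x

def _split(value):
    """Split a 64-bit result into (Rd_LSW, Rd_MSW)."""
    value &= MASK64
    return value & MASK32, value >> 32

def umull(rm, rs):
    return _split((rm & MASK32) * (rs & MASK32))

def smull(rm, rs):
    return _split(_signed32(rm) * _signed32(rs))

def umlal(lsw, msw, rm, rs):
    return _split(((msw << 32) | lsw) + (rm & MASK32) * (rs & MASK32))

def smlal(lsw, msw, rm, rs):
    return _split(((msw << 32) | lsw) + _signed32(rm) * _signed32(rs))
```

For the operands 0xFFFFFFF6 and 0x14, the low word is the same in both cases but the high word is 0x13 for UMULL and 0xFFFFFFFF for SMULL.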
Multiply Instructions Summary
• MUL{S} Rd,Rm,Rs (Rd = Rm*Rs)
• MLA: Multiply and Accumulate
– MLA Rd,Rm,Rs,Rn (Rd=(Rm*Rs)+Rn)
• SMULL : Signed Multiply Long
[Rdhi,Rdlo]=(Rm*Rs)
• SMLAL: Signed Multiply accumulate long
[Rdhi,Rdlo]= [Rdhi,Rdlo]+(Rm*Rs)
• UMULL: Unsigned multiply long
• UMLAL: Unsigned multiply accumulate long
Multiplication by constant
• MOV r1, r0, LSL #2 ; r1 = r0*4
• RSB r0, r2, r2, LSL #3 ; r0 = r2*8 - r2 = r2*7
• ADD r0, r1, r1, LSL #1 ; r0 = r1 + r1*2 = r1*3
• SUB r0, r0, r1, LSL #4 ; r0 = r1*3 - r1*16 = r1*(-13)
• ADD r0, r0, r1, LSL #7 ; r0 = r1*(-13) + r1*128 = r1*115
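The shift-and-add sequences above can be verified by replaying them with 32-bit wraparound. This is a Python sketch; `times_7` and `times_115` are hypothetical names for the RSB example and the three-instruction sequence.

```python
MASK32 = 0xFFFFFFFF

def times_7(r2):
    # RSB r0, r2, r2, LSL #3   -> r0 = r2*8 - r2
    return ((r2 << 3) - r2) & MASK32

def times_115(r1):
    r0 = (r1 + (r1 << 1)) & MASK32     # ADD r0, r1, r1, LSL #1  -> r1*3
    r0 = (r0 - (r1 << 4)) & MASK32     # SUB r0, r0, r1, LSL #4  -> r1*(-13)
    r0 = (r0 + (r1 << 7)) & MASK32     # ADD r0, r0, r1, LSL #7  -> r1*115
    return r0
```

The intermediate value r1*(-13) wraps around modulo 2^32, but the final sum still equals r1*115 in the low 32 bits.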
Cycle time
– Basic MUL instruction
• 2-5 cycles on ARM7TDMI
– +1 cycle for accumulate
– +1 cycle for “long”
PSR Transfer
• These instructions allow access to the CPSR
and SPSR registers.
MRS: Move to Register from Status register
• The MRS instruction allows the contents of
the CPSR or the SPSR of the current mode to be
moved to a general register
The problem
• When an exception occurs and there is a
possibility of a nested exception of the same
type occurring, the SPSR of the exception
mode is in danger of being corrupted.
Example
• If ARM is in IRQ mode due to IRQ_DEVICE_1
then USER mode CPSR is saved in SPSR_IRQ
and return address in LR_IRQ.
• If nested IRQ_DEVICE_2 arrives CPSR of IRQ
mode should be saved in SPSR_IRQ and return
address in LR_IRQ
• This operation corrupts SPSR of IRQ_DEVICE_1
and return address of IRQ_DEVICE_1
Solution
• To deal with this, the SPSR value must be
saved before the nested exception can occur,
and later restored in preparation for the
exception return.
• The saving is normally done by using an MRS
instruction followed by a store instruction.
• Restoring the SPSR uses the reverse sequence
of a load instruction followed by an MSR
instruction
MSR: Move to Status Register

• MSR (Move to Status Register from ARM
Register) transfers the value of a general-purpose
register or an immediate constant to
the CPSR or the SPSR of the current mode
• SBO/SBZ = Should Be One/Zero
PSR Transfer: MSR (immediate/register)

4-bit field mask at bits 19-16:
0001 Load imm in LS byte position
0010 Load imm in subsequent pos
0100 Load imm in subsequent pos
1000 Load imm in MS byte
Syntax
• MSR{<cond>} CPSR_<fields>, #<immediate>
• MSR{<cond>} CPSR_<fields>, <Rm>
• MSR{<cond>} SPSR_<fields>, #<immediate>
• MSR{<cond>} SPSR_<fields>, <Rm>
• <fields> Is a sequence of one or more of the
following:
_c sets the control field mask bit (bit 16)
_x sets the extension field mask bit (bit 17)
_s sets the status field mask bit (bit 18)
_f sets the flags field mask bit (bit 19).
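The effect of the field mask can be modelled as a byte-wise write enable. The sketch below is in Python and assumes each mask bit (16 to 19) enables one byte of the PSR, from least to most significant; `field_mask` and `msr` are hypothetical helpers, and User mode protection of the control bits is not modelled.

```python
FIELD_BITS = {
    "c": 0x000000FF,   # control field, mask bit 16
    "x": 0x0000FF00,   # extension field, mask bit 17
    "s": 0x00FF0000,   # status field, mask bit 18
    "f": 0xFF000000,   # flags field, mask bit 19
}

def field_mask(fields):
    """Combine the byte masks for a string such as 'c' or 'cf'."""
    mask = 0
    for letter in fields:
        mask |= FIELD_BITS[letter]
    return mask

def msr(psr, fields, value):
    """Return the new PSR after writing value into the selected fields only."""
    mask = field_mask(fields)
    return (psr & ~mask & 0xFFFFFFFF) | (value & mask)
```

For example, writing 0xF0000000 with only the `f` field selected sets N, Z, C and V while leaving the mode and interrupt mask bits in the low byte untouched.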
<immediate>

• Is the immediate value to be transferred to
the CPSR or SPSR.
• Allowed immediate values are 8-bit (in the
range 0x00 to 0xFF) and values that can be
obtained by rotating them right by an even
amount in the range 0 to 30.
Usage
• Use MSR to update the value of the condition code
flags, interrupt enables, or the processor mode.
• A good way to switch the ARM to Supervisor mode
from another privileged mode is:
MRS R0,CPSR       ; Read CPSR
BIC R0,R0,#0x1F   ; Modify by removing current mode
ORR R0,R0,#0x13   ; and substituting Supervisor mode
MSR CPSR_c,R0     ; Write the result back to CPSR
Caution
• You can use the immediate form of MSR to set
any of the fields of a PSR, but you must take care
to use the read-modify-write technique described
above.
• The immediate form of the instruction is
equivalent to reading the PSR concerned,
replacing all the bits in the fields concerned by
the corresponding bits of the immediate constant
and writing the result back to the PSR.
Caution_contd
• The immediate form must therefore only be
used when the intention is to modify all the
bits in the specified fields and, in particular,
must not be used if the specified fields include
any PSR bits that ARM has not yet defined
• Failure to observe this rule might result in
code which has unanticipated side effects on
future versions of the ARM architecture.
• As an exception to the above rule, it is
legitimate to use the immediate form of the
instruction to modify the flags byte, despite
the fact that bits[26:25] of the PSRs have no
allocated function at present. For example,
you can use MSR to set all four flags (and clear
the Q flag if the processor implements the
Enhanced DSP extension):
• MSR CPSR_f,#0xF0000000
The T bit or J bit
• The MSR instruction must not be used to alter
the T bit or the J bit in the CPSR.
• If such an attempt is made, the results are
UNPREDICTABLE.
Operand restrictions
• In User mode, the control bits of the CPSR are
protected from change, so only the condition
code flags (NZCV) of the CPSR can be changed.
• In other (privileged) modes the entire CPSR can
be changed.
• The SPSR register which is accessed depends on
the processor mode at the time of execution.
– For example, only SPSR_fiq is accessible when the
processor is in FIQ mode.
Operand restrictions Cont.
• R15 must not be specified as the source or
destination register.
• A further restriction is that you must not
attempt to access an SPSR in User mode, since
no such register exists.
Example: Mode Change
• For example, the following sequence performs a
mode change:
• MRS R0,CPSR ;take a copy of the CPSR
• BIC R0,R0,#0x1F ;clear the mode bits
• ORR R0,R0,#new_mode ;select new mode
• MSR CPSR,R0 ;write back the modified CPSR
• Advantage: only the mode is changed; the rest of the
CPSR (flags, IRQ and FIQ mask bits) is unchanged.
Example: Condition Flag Change
• When the aim is simply to change the condition code
flags in a PSR, a value can be written directly to the flag
bits without disturbing the control bits (in user mode).
For example, the following instruction sets the N, Z, C and V flags:
• MSR CPSR_flg,#0xF0000000
– set all the flags regardless of their previous state (does not
affect any control bits in user mode)
Note: Do not attempt to write an 8 bit immediate
value into the whole PSR since such an operation
cannot preserve the reserved bits.
Assembler syntax
1) MRS - transfer PSR contents to a register
– MRS{cond} Rd,<psr>
– <psr> is CPSR, CPSR_all, SPSR or SPSR_all.

MRS Rd, CPSR ;Rd[31:0] <- CPSR[31:0]


2) MSR - transfer register contents to
PSR
MSR{cond} <psr>,Rm
In User mode the instructions behave as follows:
• MSR CPSR_all,Rm ; CPSR[31:28] <- Rm[31:28]
In privileged modes the instructions behave as
follows:
• MSR CPSR_all,Rm ;CPSR[31:0] <- Rm[31:0]
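The difference between the two cases can be sketched in Python. This models only the behaviour stated above; `msr_cpsr_all` is a hypothetical name and 0x10 is the standard User mode encoding.

```python
USER_MODE = 0x10
FLAGS_MASK = 0xF0000000      # N, Z, C, V in bits 31:28

def msr_cpsr_all(cpsr, rm):
    """Model MSR CPSR_all,Rm for User and privileged modes."""
    if (cpsr & 0x1F) == USER_MODE:
        # User mode: CPSR[31:28] <- Rm[31:28], everything else is protected.
        return (cpsr & ~FLAGS_MASK & 0xFFFFFFFF) | (rm & FLAGS_MASK)
    # Privileged modes: CPSR[31:0] <- Rm[31:0].
    return rm & 0xFFFFFFFF

print(hex(msr_cpsr_all(0x00000010, 0xF00000D3)))   # 0xf0000010
print(hex(msr_cpsr_all(0x000000D3, 0xF0000010)))   # 0xf0000010
```

In the first call the attempt to change the mode and mask bits from User mode has no effect; in the second, made from Supervisor mode, the whole register is replaced.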
3) MSR - transfer register contents to
PSR flag bits only
MSR{cond} <psrf>,Rm
• The most significant four bits of the register
contents are written to the N,Z,C & V flags
respectively.
<psrf> is CPSR_flg or SPSR_flg
• In User mode the instructions behave as
follows:
– MSR CPSR_flg,Rm ;CPSR[31:28] <- Rm[31:28]
4) MSR - transfer immediate value to
PSR flag bits only
MSR{cond} <psrf>,<#expression>
• The expression should evaluate to a 32-bit value, of
which the most significant four bits are written to the
N, Z, C and V flags respectively.
• Where <#expression> is used, the assembler will attempt
to generate a shifted immediate 8-bit field to match
the expression.
• MSR CPSR_flg,#0xA0000000
;CPSR[31:28] <- 0xA (i.e. set N,C; clear Z,V)
• MSR SPSR_flg,#0xC0000000
;SPSR_<mode>[31:28] <- 0xC (i.e. set N,Z; clear C,V)
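The two immediates above can be decoded with a small Python helper to confirm which flags they set. `flags_from_value` is a hypothetical name used only for this illustration.

```python
def flags_from_value(value):
    """Return the N, Z, C, V settings carried in bits 31:28 of a 32-bit value."""
    return {
        "N": bool(value & (1 << 31)),
        "Z": bool(value & (1 << 30)),
        "C": bool(value & (1 << 29)),
        "V": bool(value & (1 << 28)),
    }

print(flags_from_value(0xA0000000))
# {'N': True, 'Z': False, 'C': True, 'V': False}
print(flags_from_value(0xC0000000))
# {'N': True, 'Z': True, 'C': False, 'V': False}
```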
Software Interrupt (SWI)

• The software interrupt instruction is used to
enter Supervisor mode in a controlled manner.
• The instruction causes the software interrupt
trap to be taken, which effects the mode change.
• The PC is then forced to a fixed value (0x08) and
the CPSR is saved in SPSR_svc.
Return from the supervisor
• The PC is saved in R14_svc upon entering the
software interrupt trap, with the PC adjusted
to point to the word after the SWI instruction.
• MOVS PC,R14_svc will return to the calling
program and restore the CPSR.
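Entry to and return from the trap can be modelled as a pair of Python functions. This is a sketch assuming ARM state (4-byte instructions); the function names are hypothetical, and the handling of the I and T bits on entry is a detail of the ARM architecture that the text above does not spell out.

```python
def swi_entry(pc_of_swi, cpsr):
    """Return the relevant processor state after an SWI at pc_of_swi is taken."""
    return {
        "R14_svc": pc_of_swi + 4,          # word after the SWI instruction
        "SPSR_svc": cpsr,                  # CPSR at the time of the SWI
        # Clear T and the mode bits, then select Supervisor (0x13) and set I (0x80).
        "CPSR": (cpsr & ~0x3F & 0xFFFFFFFF) | 0x80 | 0x13,
        "PC": 0x08,                        # SWI vector
    }

def swi_return(state):
    """Model MOVS PC,R14_svc: restore the PC and the CPSR together."""
    return {"PC": state["R14_svc"], "CPSR": state["SPSR_svc"]}

state = swi_entry(0x8000, 0x60000010)      # SWI executed in User mode
print(hex(state["CPSR"]), hex(state["PC"]))        # 0x60000093 0x8
print({k: hex(v) for k, v in swi_return(state).items()})
# {'PC': '0x8004', 'CPSR': '0x60000010'}
```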
SWI: Comment field
• The bottom 24 bits of the instruction are
ignored by the processor, and may be used to
communicate information to the supervisor
code.
• For instance, the supervisor may look at this
field and use it to index into an array of entry
points for routines which perform the various
supervisor functions.
Assembler syntax
• SWI{cond} <expression>
• {cond} two-character condition mnemonic
• <expression> is evaluated and placed in the
comment field
• Examples
– SWI ReadC ;get next character from read stream
– SWI WriteI+"k" ;output a "k" to the write stream
– SWINE 0 ;conditionally call supervisor with 0 in comment field
– 0x08 B Supervisor ;SWI entry point
Supervisor

; SWI has type of S/W interrupt in bits 8-23 and data (if any) in bits 0-7.
; Assumes R13_svc points to a suitable stack
• STMFD R13!,{R0-R2,R14} ;save work registers and return address
• LDR R0,[R14,#-4] ;get SWI instruction
• BIC R0,R0,#0xFF000000 ;clear top 8 bits
• MOV R1,R0,LSR#8 ;get routine offset
• ADR R2,EntryTable ;get start address of entry table
• LDR R15,[R2,R1,LSL#2] ;branch to appropriate routine
• WriteIRtn ;enter with character in R0 bits 0-7
• ......
• LDMFD R13!,{R0-R2,R15}^ ;restore workspace and return, restoring processor mode and flags
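The decoding done by the supervisor code can be mirrored in Python. This is an illustrative sketch: the routine functions, the `ENTRY_TABLE` contents and the value 512 for WriteI are assumptions made for the example, chosen so that WriteI selects entry 2 of the table.

```python
def decode_swi(instruction):
    """Split an SWI instruction into (routine number, data byte)."""
    comment = instruction & 0x00FFFFFF     # BIC R0,R0,#0xFF000000
    routine = comment >> 8                 # MOV R1,R0,LSR#8
    data = comment & 0xFF                  # data in bits 0-7
    return routine, data

# Hypothetical supervisor routines, indexed like EntryTable.
def zero_rtn(data):
    return "zero"

def read_c_rtn(data):
    return "read"

def write_i_rtn(data):
    return "write " + chr(data)

ENTRY_TABLE = [zero_rtn, read_c_rtn, write_i_rtn]

def supervisor(instruction):
    """Dispatch through the entry table, as LDR R15,[R2,R1,LSL#2] does."""
    routine, data = decode_swi(instruction)
    return ENTRY_TABLE[routine](data)

WRITE_I = 512                              # assumed value of WriteI
swi_word = 0xEF000000 | (WRITE_I + ord("k"))   # SWI WriteI+"k", condition AL
print(decode_swi(swi_word))                # (2, 107)
print(supervisor(swi_word))                # write k
```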
