Keeping Coherency on Arm:
Reborn
Julien Grall <julien.grall@arm.com>
Xen Developer Summit 2019
© 2019 Arm Limited
Xen and Arm
Over the past years, a few architectural compliance issues were identified in Xen:
Boot code
Memory subsystem
Guest memory subsystem
Atomics helper
Xen runs fine for me.
What are you talking about?
Xen and Arm - 2
Some of the recent examples:
4.11 failing to boot on Thunder-X
https://lists.xenproject.org/archives/html/xen-devel/2019-06/msg00184.html
32-bit guest intermittently failing to boot
https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
Assumption of system registers layout
https://lists.xenproject.org/archives/html/xen-devel/2019-05/msg01210.html
XSA-295
https://xenbits.xen.org/xsa/advisory-295.txt
Why follow the ARM ARM?
Reliability
Upgrading to a newer processor without Xen modifications
Scope
This talk focuses on boot and memory handling.
It concentrates on architectural guarantees.
The Architecture Reference Manual is authoritative.
Page-table updates
Page table entry
An entry can be a block mapping, a pointer to a translation table, or a page descriptor.
Note that whilst Arm supports different granules (4KB, 16KB and 64KB), Xen only supports 4KB.
Page table entry - 2
Each block mapping and page descriptor contains information such as:
Access permission: read, write, execute never
Memory type: Device or Normal memory
Shareability: non-shareable, inner-shareable, outer-shareable
Cacheability
Access flag
Contiguous bit
Non-Global bit
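The fields listed above can be pulled out of a descriptor with a few shifts and masks. The following is a minimal sketch, assuming the stage-1 block/page descriptor layout for the 4KB granule; the bit positions should be checked against the ARM ARM, and the names `pte_attrs`, `pte_field` and `pte_decode` are hypothetical and not taken from Xen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of a stage-1 block/page descriptor (not Xen's types). */
struct pte_attrs {
    bool     valid;    /* bit 0: entry is valid */
    unsigned attr_idx; /* bits [4:2]: index into MAIR_ELx (memory type, cacheability) */
    unsigned ap;       /* bits [7:6]: access permissions AP[2:1] */
    unsigned sh;       /* bits [9:8]: shareability (0 = non, 2 = outer, 3 = inner) */
    bool     af;       /* bit 10: access flag */
    bool     ng;       /* bit 11: non-Global */
    bool     contig;   /* bit 52: contiguous bit */
    bool     xn;       /* bit 54: execute never */
};

/* Extract 'width' bits of 'pte' starting at bit 'lsb'. */
static inline uint64_t pte_field(uint64_t pte, unsigned lsb, unsigned width)
{
    return (pte >> lsb) & ((UINT64_C(1) << width) - 1);
}

/* Decode the attribute fields of a raw 64-bit descriptor. */
static inline struct pte_attrs pte_decode(uint64_t pte)
{
    struct pte_attrs a = {
        .valid    = pte_field(pte, 0, 1) != 0,
        .attr_idx = (unsigned)pte_field(pte, 2, 3),
        .ap       = (unsigned)pte_field(pte, 6, 2),
        .sh       = (unsigned)pte_field(pte, 8, 2),
        .af       = pte_field(pte, 10, 1) != 0,
        .ng       = pte_field(pte, 11, 1) != 0,
        .contig   = pte_field(pte, 52, 1) != 0,
        .xn       = pte_field(pte, 54, 1) != 0,
    };
    return a;
}
```

Memory type and cacheability are not stored directly in the descriptor: the `attr_idx` field selects one of the attribute encodings programmed in the MAIR register.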
TLB entries
The TLB is a structure which caches the results of translation table walks.
Each TLB entry
contains data from the page descriptor.
is tagged by ASID and VMID.
Address Space Identifier
The Address Space Identifier (ASID)
identifies pages associated with a single address space (e.g. a process).
applies to the EL1 (kernel) and EL0 (userspace) translation regime.
provides a mechanism for changing process-specific tables without requiring TLB invalidation.
is cached by the TLB entries.
Virtual Machine Identifier
The Virtual Machine Identifier (VMID)
identifies the current virtual machine (Non-Secure EL1 & EL0).
has its own independent ASID space.
is cached by the TLB entries so no TLB invalidation is required when switching between VMs.
Contiguous bit
It indicates whether the entry is one of a number of adjacent page table entries that point to a contiguous output address range.
The number of adjacent entries depends on the page size.
For instance with 4KB, it will be 16 entries.
The TLB is allowed to cache a contiguous region in a single entry.
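With the 4KB granule, 16 adjacent entries cover 16 x 4KB = 64KB. A minimal sketch of the corresponding arithmetic follows; the macro and function names are hypothetical, and the alignment check reflects the assumption that both the input and output ranges must be aligned to the 64KB span for the contiguous bit to be usable.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K   UINT64_C(4096)
#define CONTIG_ENTRIES UINT64_C(16)                      /* 4KB granule */
#define CONTIG_SPAN    (PAGE_SIZE_4K * CONTIG_ENTRIES)   /* 64KB */

/* Base address of the contiguous group that contains 'addr'. */
static inline uint64_t contig_base(uint64_t addr)
{
    return addr & ~(CONTIG_SPAN - 1);
}

/*
 * True if [va, va + size) -> [pa, pa + size) can be mapped using only
 * entries that have the contiguous bit set: both addresses must be
 * aligned to the span and the size must be a non-zero multiple of it.
 */
static inline bool contig_eligible(uint64_t va, uint64_t pa, uint64_t size)
{
    return size != 0 &&
           ((va | pa | size) & (CONTIG_SPAN - 1)) == 0;
}
```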
Non-Global (nG) bit
The non-Global bit applies only to Stage-1 page tables.
nG 0 means the region is available for all ASIDs.
nG 1 means the region is restricted to the current ASID.
A global mapping can be invalidated using any ASID.
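The way ASID, VMID and the nG bit interact on a TLB lookup can be sketched with a small software model. This is a simplified illustration for the EL1&0 regime, not a description of any real TLB; the structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a TLB entry. */
struct tlb_entry {
    uint64_t vpn;    /* virtual page number */
    uint16_t vmid;   /* VMID the entry was created under */
    uint16_t asid;   /* ASID the entry was created under */
    bool     global; /* true when nG == 0 */
    bool     valid;
};

/*
 * An entry hits when the page and the VMID match and either the entry is
 * global (any ASID may use it) or the ASID matches the current one.
 */
static inline bool tlb_match(const struct tlb_entry *e, uint64_t vpn,
                             uint16_t vmid, uint16_t asid)
{
    return e->valid &&
           e->vpn == vpn &&
           e->vmid == vmid &&
           (e->global || e->asid == asid);
}
```

Because the tags take part in the match, switching to another process or VM only changes the current ASID or VMID; stale entries of the previous context simply stop matching.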
Special considerations apply to translation table updates ...
ARM ARM
Page table updates
Page table updates may require one to use a break-before-make sequence.
It is a sequence required by the ARM ARM to update page table entries.
It is required only for certain updates of the tables.
It ensures that all observers use a coherent view of the page tables.
Why do we care?
Page tables are used by different observers
HW translation table walkers
Observers may be accessing the page-tables during updates
What could happen?
The TLB may hold two mappings for the same address, which might lead to CONSTRAINED UNPREDICTABLE behavior and break coherency
A processor may access erroneous data
The translation used may be either of the TLB entries or an amalgamation of the two
A TLB conflict abort may be raised
When to use it
Changing the size of the block
Replacing a block mapping with a translation table
Replacing a translation table with a block mapping
Setting/Unsetting the contiguous bit
Changing the output address if either the old or the new entry is writeable
Changing the memory type
Changing the cacheability attributes
Creating a global entry that might overlap non-global entries
When it is not necessary
Changing the permission of an entry
Changing the access flag
Steps
It is a four-step approach:
1. Replace the old entry with an invalid entry
2. Invalidate the cached old entries with a broadcast TLB invalidation instruction
3. Wait for the completion of the TLB invalidation instruction with a dsb followed by an isb
4. Write the new entry
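The four steps above can be sketched as follows. This is a portable illustration only: the TLB invalidation and the barriers are replaced by hypothetical stand-in functions that record the order of the steps, where real code would issue the appropriate `tlbi` instruction followed by `dsb` and `isb`. None of the names come from Xen.

```c
#include <stdint.h>

enum bbm_step { BBM_WRITE_INVALID, BBM_TLBI, BBM_BARRIER, BBM_WRITE_NEW };

/* Records the order in which the steps were executed. */
struct bbm_trace {
    enum bbm_step steps[8];
    unsigned count;
};

static inline void bbm_record(struct bbm_trace *t, enum bbm_step s)
{
    if (t->count < 8)
        t->steps[t->count++] = s;
}

/* Stand-in for a broadcast TLB invalidation (a tlbi ...is instruction). */
static inline void tlb_invalidate_broadcast(struct bbm_trace *t)
{
    bbm_record(t, BBM_TLBI);
}

/* Stand-in for "dsb; isb", which waits for the invalidation to complete. */
static inline void wait_for_tlbi_completion(struct bbm_trace *t)
{
    bbm_record(t, BBM_BARRIER);
}

/* Break-before-make update of a single page table entry. */
static inline void bbm_update(volatile uint64_t *entry, uint64_t new_val,
                              struct bbm_trace *t)
{
    /*
     * 1. Break: replace the old entry with an invalid one (bit 0 clear).
     * Assumption: real code also needs a dsb here so that the table
     * walkers observe the invalid entry before the TLB is invalidated.
     */
    *entry = 0;
    bbm_record(t, BBM_WRITE_INVALID);

    /* 2. Remove any cached copy of the old entry on all CPUs. */
    tlb_invalidate_broadcast(t);

    /* 3. Wait until the invalidation has completed. */
    wait_for_tlbi_completion(t);

    /* 4. Make: only now install the new entry. */
    *entry = new_val;
    bbm_record(t, BBM_WRITE_NEW);
}
```

Between steps 1 and 4 the address is unmapped, so the caller must make sure nothing accesses it during the update.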
Accessing Memory with MMU off
Cache architecture
(Modified) Harvard architecture
Multiple levels of caching (with snooping)
Separate I-cache and D-cache (no snooping between I and D)
Either PIPT or non-aliasing VIPT for D-cache
Meeting at the Point of Unification (PoU)
Controlled by attributes in the page tables
Memory type (normal, device)
Cacheability, Shareability
Two Enable bits (I and C)
Actually not really an Enable switch
More like a global "attribute override"
Generally invisible to normal software
With a few key exceptions
An example is writing with the MMU turned off
Accessing memory with MMU off/on
Memory can be written with the MMU off and read with the MMU on
An example is preparing the page-tables at boot
What could possibly go wrong?
Write accesses with the MMU off are non-cacheable
The state of the cache may be unknown
It may contain dirty/clean lines
The cache can speculatively load a line
Without proper cache maintenance:
Data may be overwritten
Old data may be read
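Both hazards can be reproduced with a toy model of one memory word and one cache line. This is a deliberately simplified sketch, assuming that a write with the MMU off goes straight to memory without touching the cache; all names are hypothetical and no real hardware behaves exactly like this model.

```c
#include <stdbool.h>
#include <stdint.h>

/* One cache line covering one word of memory. */
struct toy_line { bool valid; bool dirty; uint32_t data; };
struct toy_sys  { uint32_t mem; struct toy_line line; };

/* Write with the MMU off: non-cacheable, bypasses the cache. */
static inline void nc_write(struct toy_sys *s, uint32_t v)
{
    s->mem = v;
}

/* Read with the MMU on: a hit returns the cached data, a miss fills the line. */
static inline uint32_t cached_read(struct toy_sys *s)
{
    if (!s->line.valid)
        s->line = (struct toy_line){ true, false, s->mem };
    return s->line.data;
}

/* Eviction: a dirty line is written back to memory, then dropped. */
static inline void evict(struct toy_sys *s)
{
    if (s->line.valid && s->line.dirty)
        s->mem = s->line.data;
    s->line.valid = false;
    s->line.dirty = false;
}

/* Invalidate: the line is dropped without any write-back. */
static inline void invalidate(struct toy_sys *s)
{
    s->line.valid = false;
    s->line.dirty = false;
}
```

In this model, a clean line left in the cache makes `cached_read()` return the value from before `nc_write()` (old data is read), and a dirty line that is evicted after `nc_write()` replaces the new value in memory (data is overwritten). Calling `invalidate()` before and after the write avoids both.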
Arm64 Image boot protocol
The Arm64 Image specification describes the state of the processor at boot
The MMU will be turned off
The loaded image will be cleaned to the Point of Coherency (PoC)
The state of the cache for the rest of the RAM is unknown
This includes BSS
Writing in the loaded image
The loaded image is cleaned to PoC
The cache may contain clean lines for the modified region
Steps required
1. Write the data
2. Invalidate the modified region to PoC
Removes any clean line
Avoids reading the wrong data
Writing outside of the loaded image
The state of the cache is not known for any RAM but the loaded image
The cache may contain clean or dirty lines for the modified region
Steps required
1. Invalidate the modified region to PoC
Removes any line
Prevents a dirty cache line from being evicted
Avoids overwriting the data
2. Write the data
3. Invalidate the modified region to PoC
Removes any line that was speculatively loaded
Avoids reading the wrong data
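The three steps above can be sketched as a helper that walks the region one cache line at a time. This is an illustration under stated assumptions: the per-line operation is a hypothetical stand-in that only counts calls, where real code would issue `dc ivac` followed by a `dsb`, and the 64-byte line size is assumed rather than read from `CTR_EL0`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 64u /* assumed line size; real code derives it from CTR_EL0 */

static unsigned long lines_invalidated;

/* Stand-in for "dc ivac, <addr>": here it only counts the calls. */
static inline void invalidate_line(uintptr_t addr)
{
    (void)addr;
    lines_invalidated++;
}

/* Invalidate every cache line overlapping [start, start + len). */
static inline void invalidate_range_poc(const void *start, size_t len)
{
    uintptr_t addr = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end  = (uintptr_t)start + len;

    for (; addr < end; addr += CACHE_LINE)
        invalidate_line(addr);
    /* Real code would execute a dsb here to wait for completion. */
}

/* Write to memory whose cache state is unknown (outside the loaded image). */
static inline void write_outside_image(void *dst, const void *src, size_t len)
{
    invalidate_range_poc(dst, len); /* 1. drop clean and dirty lines */
    memcpy(dst, src, len);          /* 2. write the data */
    invalidate_range_poc(dst, len); /* 3. drop speculatively loaded lines */
}
```

Invalidation works on whole lines, so a region that is not aligned to the line size also discards cached data belonging to its neighbours; the sketch assumes the caller passes a suitably aligned region.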
Implications for Xen
Parts of the boot and memory code need to be rewritten.
Memory written with the MMU off
Setting up page-tables
Memory written with MMU off
At the moment, Xen modifies the following parts with the MMU off
Zeroing BSS
Required because boot page-tables are part of it
This can be moved later if the page-tables are moved out of BSS
Setting up initial page-tables
Initial page-tables are always written with the MMU off
Cache maintenance is required
Setting up page-tables
A 1:1 mapping is necessary to turn the MMU on
It may clash with part of the memory layout
The memory layout is static
At the moment, Xen avoids the clash by switching page-tables
This is not safe to do while the MMU is turned on
The 1:1 mapping needs to be kept for
CPU bring-up
Suspend/Resume
The memory layout needs to be dynamic to avoid the clash
Upstreaming
The work is split into multiple series as it is quite substantial.
Part of the page-tables update has been merged
The boot code rework is under review
Comments, reviews, testing and help are more than welcome
Questions?
The Arm trademarks featured in this presentation are registered trademarks or
trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights
reserved. All other marks featured may be trademarks of their respective owners.
www.arm.com/company/policies/trademarks
