CONTRIBUTE.
COLLABORATE.
COMMERCIALIZE.
December 8-10 | Virtual Event
Easily Emulating Full Systems on Amazon FPGAs
Jamey Hicks
Chief Engineer and President
#RISCVSUMMIT
Emulating SoCs in Amazon FPGAs
• How many of you are specifying, designing, or developing software for new SoCs?
• How many of you were doing so at the start of the pandemic?
• Wouldn’t it be nice to be able to do so in the cloud?
• Did you know that Amazon Web Services (AWS) rents servers with FPGAs?
• In this talk, I will tell you how we validated our RISC-V CPU by running Linux on it in an AWS F1 FPGA
Context: DARPA SSITH and FETT
• DARPA SSITH
• MIT CSAIL and Accelerated Tech, Inc
• Sanctum/MI6 security enclaves
• The goal is for software running in an enclave to be as isolated from other software as if it were running on a separate machine
• MIT developed many RISC-V variants in BSV
• Smallest microcontroller fabricated in carbon nanotubes
• In-order cores, used in teaching
• Speculative, out-of-order processor
• Enhanced with Sanctum hardware isolation
• MI6 prevents side channels due to speculation and shared resources such as caches
• For SSITH, we extended Bluespec, Inc’s RISC-V Flute processor
• To enable comparison against a common baseline shared with other teams (e.g., SRI/Cambridge)
SSITH FPGA Architecture
• Xilinx VCU118 on a lab bench
• Xilinx Vivado block design outside
• Contains I/O devices: UART, Ethernet
• Filesystem in ramdisk, transferred slowly via JTAG
• Verilog CPU core inside
• Fixed interface
• Verilog generated from Chisel or BSV
• Provided by each team in the program: MIT, SRI/Cambridge, …
• Programs loaded via gdb over JTAG
• Debug via gdb over JTAG
SSITH BSV Cores on AWS
• FETT Bounty Hunt
• Run the security-enhanced processors in the cloud
• Make them available on-demand to many experts at exploiting security vulnerabilities
• Requirements
• Runs on AWS F1 FPGA instances
• UART, Ethernet, and Block device support
• Debugger
• Constraints
• No hardware Ethernet connection
• No hardware block devices
• Goals
• Improve flexibility for BSV designs
• Speed up code loading
Get this working as fast as possible
MIT Sanctum on AWS
[Block diagram: the RV64GC/Enclave core (a BSV netlist in the FPGA Customer Logic, "CL") connects through a 512-bit DDR A controller and a 64-bit Connectal Portal in the FPGA Shell ("SH") to host software (C/C++). The host side provides VIRTIO console, network, and block device models plus Connectal Portal DMA over PCIS; DMI, IRQ, Tandem Verification, and I/O memory request/response channels link the host to the core's IMEM/DMEM.]
SRI/Cambridge CHERI on AWS
[Block diagram: same structure as the MIT Sanctum diagram, with an RV64GC CHERI core and its L1 cache in the FPGA CL in place of the Enclave core and the PCIS DMA path.]
SSITH RISC-V on AWS
• AWS “Shell” logic provides PCIe interface to host
• Use virtio device models for network and storage
• Reuse code from open source VM (tinyemu)
• Use popular virtio Linux device drivers
• DMA transfers from device models for high performance
• Load code via DMA instead of JTAG for faster startup
• CPUs running at 100 MHz
• All logic except AWS Shell and IP cores in BSV
How easy was it?
• Mid-March 2020
• 400 lines of code to build an AWS FPGA image containing the MIT RV64 Flute Enclaves processor
• Akin to an IP Integrator design, plus host software connection
• In April
• Added vanilla Bluespec RV64 Flute processor
• Added SRI/Cambridge RV64 Flute CHERI processor
• Faster ELF loader
• Added I/O models via TinyEmu
• with much help from Jessica Clarke at Cambridge University
• Debugged MIT security monitor, built root filesystems with buildroot and Debian, …
• By end of May, started deploying to testers, working with Galois and SRI/Cambridge
• In June, Jessica Clarke integrated Bluespec’s gdb stub
• Part-time effort from two people
Key Ingredients
• AWS F1 FPGA Shell
• Connectal
• Virtio
• And also
• BSV
• Vivado
AWS F1 Shell
F1 FPGA Shell
• Provides common functionality
• Hides details of PCIe interface
• Isolates the server from errors in FPGA logic
• “Customer Logic” loaded via partial reconfiguration
• Enables host to memory-map FPGA addresses
• Enables FPGA direct access to host memory
• 4 banks of 16 GB DRAM
[Figure: the AWS Shell surrounding the “Customer Logic” partial-reconfiguration region]
MIT Sanctum on AWS
[Block diagram repeated, annotated for this section: host software reaches FPGA DRAM via mmap(), and Connectal provides the portal connection between host software and the FPGA CL.]
Connectal
Connecting software to hardware
Connecting the hardware and software
• Wouldn’t it be nice to be able to connect the hardware and software without writing a device driver?
• Wouldn’t it be nice to have the compiler verify that the two sides agree on the data format?
• Connectal enables user-space software to connect to hardware
• Without an app-specific device driver
• Hardware and software stubs generated from interface spec
• Asynchronous remote method call interface
• Minimal Connectal app consists of 3 files:
• C++ software
• BSV hardware
• Makefile (see the sketch below)
• One command builds both hardware and software
• Portable across FPGA families and operating systems
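As an illustration of how small that Makefile is, here is a hedged sketch modeled on Connectal's example applications; the variable names follow those examples, but the file and interface names are placeholders for this design, so check the Connectal documentation for the exact syntax:

S2H_INTERFACES = AWSP2_Request:AWSP2.request
H2S_INTERFACES = AWSP2:AWSP2_Response
BSVFILES = AWSP2.bsv
CPPFILES = main.cpp
include $(CONNECTALDIR)/Makefile.connectal

The interface compiler reads these declarations to generate the proxies and wrappers on both sides, and a single make invocation then builds both the FPGA bitstream and the host executable.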
Debug Module Interface (DMI), Connectal-Style
interface AWSP2_Request; // software to hardware
method Action dmi_read(Bit#(7) req_addr);
method Action dmi_write(Bit#(7) req_addr, Bit#(32) req_data);
method Action dmi_status();
…
endinterface
interface AWSP2_Response; // hardware to software
method Action dmi_read_data(Bit#(32) rsp_data);
method Action dmi_status_data(Bit#(16) status);
…
endinterface
Connectal “Portals”
1. Define HW/SW interfaces using BSV
2. Invoke the interface compiler to generate Wrappers and Proxies
3. Connect user software and user hardware using the generated “glue”

User Software (C++):
void dmi_read_data(int loc){
…
}
Main(){
…
dmi_read(a);
dmi_read(b);
dmi_write(c,val);
…
}

[Diagram: request proxies and indication wrappers in user software, request wrappers and indication proxies in user hardware, joined by FIFOs over the H/W bus; the SW->HW portal carries requests (write / read&deq) and the HW->SW portal carries indications back via interrupt. Legend distinguishes user-generated from Connectal-generated components.]

User Hardware (BSV):
rule dmi_response;
let v <- pop(cpu.dmi.resp);
ind.dmi_read_data(v);
endrule
…
method Action dmi_read(a);
cpu.dmi.read.enq(a);
endmethod
…
Connectal Portal Details
www.connectal.org
Connectal Device Driver (Linux):
- Portal discovery
- Create /dev/portalXXX
- Register ISR (Interrupt Service Routine)
- MMAP

User Software:
void found(int loc){
…
}
Main(){
…
portalExec_start();
configure('a');
configure('b');
search(hptr,hlen);
…}

Poller (Connectal libraries, indication thread):
void* PortalPoller::portalExec(void* __x){
while (1) {
portalExec_poll(portalExec_timeout);
portalExec_event();
}}

[Diagram: the same portal structure as the previous slide, with the Connectal device driver and libraries sitting between user software and the H/W bus; the indication thread polls the HW->SW portal and the ISR handles interrupts. Legend: user-generated, Connectal-generated, Connectal libraries, Linux.]
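To make the driver's MMAP step concrete, this hedged C++ sketch shows the POSIX-level pattern: open the portal device node and map its register window into user space. The device name, window size, and offset are placeholders; the real Connectal library performs this internally.

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>

int main() {
    int fd = open("/dev/portal_0_0", O_RDWR);    // placeholder portal node name
    if (fd < 0) { perror("open portal"); return 1; }
    size_t len = 4096;                           // placeholder window size
    volatile uint32_t *regs = (volatile uint32_t *)
        mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (regs == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    // Proxies write request words through this window; wrappers read indications.
    munmap((void *)regs, len);
    close(fd);
    return 0;
}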
DMI interface: C++ stubs
class AWSP2_RequestProxy : public Portal {
public:
AWSP2_RequestProxy(int id…) :
int dmi_read ( const uint8_t req_addr );
int dmi_write ( const uint8_t req_addr, const uint32_t req_data );
int dmi_status ( );
…
}
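To show how these stubs are used, here is a hedged sketch following the usual Connectal pattern: software subclasses the generated response wrapper to receive indications and calls the request proxy asynchronously. The generated header, wrapper base class, and IfcNames enum values are assumptions based on Connectal conventions, not code shown in the slides.

#include "AWSP2.h"   // hypothetical generated header with proxy/wrapper classes
#include <cstdio>

// Receive hardware-to-software indications by overriding wrapper callbacks
class AWSP2_Response : public AWSP2_ResponseWrapper {
public:
    AWSP2_Response(unsigned int id) : AWSP2_ResponseWrapper(id) {}
    void dmi_read_data(uint32_t rsp_data) {
        printf("dmi_read_data: %08x\n", rsp_data);
    }
    void dmi_status_data(uint16_t status) {
        printf("dmi_status: %04x\n", status);
    }
};

int main() {
    AWSP2_RequestProxy request(IfcNames_AWSP2_RequestS2H);   // enum name assumed
    AWSP2_Response response(IfcNames_AWSP2_ResponseH2S);     // enum name assumed
    request.dmi_read(0x11);   // returns immediately; the reply arrives as a callback
    // ... dmi_read_data() fires on the indication thread when hardware responds
    return 0;
}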
DMI Interface: BSV methods
interface AWSP2_Request request;
method Action dmi_read(Bit#(7) addr);
//$display("dmi_read req addr %h", addr);
dmiReadFifo.enq(addr);
endmethod
method Action dmi_write(Bit#(7) addr, Bit#(32) data);
//$display("dmi_write req addr %h data %h", addr, data);
dmiWriteFifo.enq(tuple2(addr, data));
endmethod
…
endinterface
VIRTIO Device Models
Modeling I/O devices in host software
VIRTIO
• TinyEMU
• Small virtual machine by the same author as qemu
• Implements virtio device models
• Open source
• Reusable
• OS Drivers
• Standard Linux kernel drivers
• FreeBSD kernel drivers
• Needed some patches due to virtio spec evolution
Reduce device modeling to a software problem, and reuse existing device models and drivers
VIRTIO
• What is virtio?
• A common framework for I/O developed for hypervisors
• Used by kvm, qemu, etc.
• Drivers for Linux and FreeBSD
• Devices supported
• Block storage
• Network
• Random/Entropy
• Console
• Normally, memory-mapped regs and DRAM are shared between guest and host
• In our case, the RISC-V CPU in the core is like a virtual machine guest
• And host software runs the device models
• A file per block device
[Figure: memory-mapped registers and request queues in DRAM, serviced by the device implementations/models]
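For orientation, the control registers such a device model exposes follow the virtio-mmio layout from the virtio specification. A few of the key offsets, expressed as C++ constants (this is background from the spec, not code from the talk):

#include <cstdint>
// Offsets of key virtio-mmio registers, per the virtio specification
constexpr uint32_t VIRTIO_MMIO_MAGIC_VALUE      = 0x000; // reads as 0x74726976 ("virt")
constexpr uint32_t VIRTIO_MMIO_VERSION          = 0x004;
constexpr uint32_t VIRTIO_MMIO_DEVICE_ID        = 0x008; // 1=net, 2=block, 3=console, 4=entropy
constexpr uint32_t VIRTIO_MMIO_VENDOR_ID        = 0x00c;
constexpr uint32_t VIRTIO_MMIO_QUEUE_NOTIFY     = 0x050; // guest kicks the device here
constexpr uint32_t VIRTIO_MMIO_INTERRUPT_STATUS = 0x060;
constexpr uint32_t VIRTIO_MMIO_INTERRUPT_ACK    = 0x064;
constexpr uint32_t VIRTIO_MMIO_STATUS           = 0x070; // device status handshake

The guest driver negotiates features and rings the QUEUE_NOTIFY doorbell; the host-side model then services the request queues in DRAM.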
VIRTIO (2)
• Normally, memory-mapped regs and DRAM are shared between guest and host
• In our case, the RISC-V CPU in the core is like a virtual machine guest
• And host software runs the device models
• A file per block device
• The tun/tap device (/dev/net/tun) enables the device model to send and receive Ethernet packets
• UART connected to a pseudo-terminal
[Figure: same memory-mapped register / request queue diagram as the previous slide]
FPGA Core access to Virtio control registers
interface AWSP2_Response; // hardware to software
method Action io_awaddr(Bit#(32) awaddr, Bit#(16) awlen, Bit#(16) awid);
method Action io_araddr(Bit#(32) araddr, Bit#(16) arlen, Bit#(16) arid);
method Action io_wdata(Bit#(64) wdata, Bit#(8) wstrb);
endinterface
interface AWSP2_Request; // software to hardware
method Action io_rdata(Bit#(64) data, Bit#(16) rid, Bit#(8) rresp, Bool last);
method Action io_bdone(Bit#(16) bid, Bit#(8) bresp);
endinterface
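On the host side, these methods become callbacks on the generated response wrapper. The following hedged C++ sketch shows how a handler might route the core's AXI writes and reads into a software device model; the handler class, the saved-address logic, and the virtio_mmio_read/virtio_mmio_write helpers are illustrative assumptions, not the actual ssith-aws-fpga code.

// Route the core's memory-mapped I/O to software virtio device models.
// virtio_mmio_read/virtio_mmio_write are hypothetical hooks into TinyEMU.
class IoHandler : public AWSP2_ResponseWrapper {
    AWSP2_RequestProxy *request;   // used to send read data / write acks back
    uint32_t awaddr_pending = 0;
public:
    void io_awaddr(uint32_t awaddr, uint16_t awlen, uint16_t awid) {
        awaddr_pending = awaddr;                         // remember the write target
    }
    void io_wdata(uint64_t wdata, uint8_t wstrb) {
        virtio_mmio_write(awaddr_pending, wdata, wstrb); // update the device model
        request->io_bdone(/*bid=*/0, /*bresp=*/0);       // complete the AXI write
    }
    void io_araddr(uint32_t araddr, uint16_t arlen, uint16_t arid) {
        uint64_t data = virtio_mmio_read(araddr);        // query the device model
        request->io_rdata(data, arid, /*rresp=*/0, /*last=*/true);
    }
};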
MIT Sanctum on AWS
[Block diagram repeated, highlighting the virtio data paths: the core's AXI memory traffic reaches host software via Connectal, and the host device models access the virtio queues in DRAM via mmap().]
Building on AWS
What’s needed to build it
• Open Source
• aws-fpga (for AWS F1)
• BSV Compiler (bsc)
• Connectal
• FluteEnclavesTagging (our CPU)
• Flute (Bluespec, Inc CPU)
• ssith-aws-fpga
• Includes all of the above as submodules
• Closed Source
• Vivado
• AWS F1 Shell design checkpoint
• Build environment
• Centos z1d.2xlarge or other beefy build machine
• Centos can be made to work
• AWS FPGA Developer instances include a Vivado license
• S3 bucket: aws-fpga
• About 4 hours
Create AWS Account
• If you do not have an account and you would like to try out this flow
• https://www.aws.amazon.com
Create Key Pair
Creating a Key Pair
• Creates ssh key pair
• Copy to your .ssh directory
Creating AWS Keys
• Used by AWS CLI to access/control AWS resources
• Used for uploading the design checkpoint to S3 during the build process
• This shows a root access key
• You should instead create IAM accounts that have EC2 and S3 privileges
Create Access Key
• Save file
• Copy to ~/.aws/credentials
Launching a build machine
• Navigate to Services -> EC2 -> Instances
• Launch an instance
• Search for FPGA Developer
• Select “FPGA Developer AMI”
• On the next screen, select “z1d.2xlarge”, which is the default
Building the hardware
• Using FPGA Developer instance
• z1d.2xlarge recommended ($0.744 per hour)
• Copy .aws/credentials to the machine so the build script can upload to the S3 bucket
[centos@ip-172-31-85-110 ~]$ git clone --recursive -b riscv-summit-2020 https://github.com/acceleratedtech/ssith-aws-fpga
[centos@ip-172-31-85-110 ~]$ screen
[centos@ip-172-31-85-110 ~]$ cd ssith-aws-fpga/hw
[centos@ip-172-31-85-110 hw]$ sudo ./install-deps.sh
[centos@ip-172-31-85-110 hw]$ ./build.sh
Building the hardware
• First step generates the BSV and C++ stubs
• Second step compiles BSV to generate Verilog (in a docker container)
• Third step runs Vivado to synthesize, place, and route the Verilog inside the AWS Shell
• This takes several hours on this design
• Fourth step copies the final design checkpoint (.dcp) to S3 in the aws-fpga bucket
• Fifth step requests an Amazon FPGA Image to be created
• Somewhere in the cloud, AWS will run DRCs on the checkpoint and create a partial bitstream, identified by an AFI/AGFI (see the status check sketch below)
• Typically 30-60 minutes
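While the AFI is being created, its state can be polled with the AWS CLI, roughly like this (the image ID below is a placeholder, not one from the talk):

$ aws ec2 describe-fpga-images --fpga-image-ids afi-0123456789abcdef0

The returned JSON includes a State field that moves from "pending" to "available" (or "failed" with an error code); the AFI is usable once it reports available.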
Running on AWS
What’s needed to run it?
• Open Source
• ssith-aws-fpga
• Building the host software this time
• Linux kernel
• Root filesystem
• Busybox (small)
• Buildroot
• Debian (full featured)
• AWS F1 FPGA instance
• f1.2xlarge contains 1 FPGA
• Choose Ubuntu 20.04 or Ubuntu 18.04
Building the host software
• ssh to the instance
ubuntu@ip-172-31-57-143:~$ git clone --recursive https://github.com/acceleratedtech/ssith-aws-fpga
ubuntu@ip-172-31-57-143:~/ssith-aws-fpga$ sudo ./install-deps.sh
ubuntu@ip-172-31-57-143:~/ssith-aws-fpga$ ./build.sh
Load the FPGA
ubuntu@ip-172-31-57-143:~/ssith-aws-fpga$ fpga-load-local-image -S 0 -I agfi-021ac2c422864fa59
ubuntu@ip-172-31-57-143:~/ssith-aws-fpga$ sudo insmod hw/connectal/drivers/pcieportal/pcieportal.ko
Bitstream identified by global identifier (AGFI)
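Before starting the host software, it is worth confirming the image actually landed in slot 0; the aws-fpga management tools report the loaded AGFI and its status (output abbreviated and shown schematically):

ubuntu@ip-172-31-57-143:~/ssith-aws-fpga$ sudo fpga-describe-local-image -S 0 -H
Type  FpgaImageSlot  FpgaImageId             StatusName  StatusCode ...
AFI   0              agfi-021ac2c422864fa59  loaded      0 ...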
Run the host software and RISC-V CPU
ubuntu@ip-172-31-57-143:~/ssith-aws-fpga$ ./build/ssith_aws_fpga -L -G 2020 --uart-console=1 --block ~/rootfs.ext2 --tun tap0 --xdma=0 --dma=1 --dtb build/devicetree-mit.dtb ~/sanctum/linux_enclaves/build/test_linux.elf aeskey.elf
…
loadElf: ./linux_enclaves/build/test_linux.elf is a 64-bit ELF file
loadElf: entry point 80003000
…
GDB server listening on port 2020
[ 0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x82000000
[ 0.000000] Linux version 4.20.0-44637-g06a0649ab5ac (ubuntu@ip-172-31-57-143) (gcc version 9.2.0 (GCC)) #1 SMP Thu Jul 9 17:53:49 UTC 2020
[ 0.000000] printk: bootconsole [early0] enabled
[ 0.000000] Zone ranges:
[ 0.000000] DMA32 [mem 0x0000000082000000-0x00000000bfffffff]
[ 0.000000] Normal [mem 0x00000000c0000000-0x00000bffffffffff]
…
Linux boots
[ OK ] Reached target Multi-User System.
[ OK ] Stopped OpenBSD Secure Shell server.
Starting OpenBSD Secure Shell server...
Debian GNU/Linux bullseye/sid ip-172-31-20-121 ttyS0
ip-172-31-20-121 login: root
Linux ip-172-31-20-121 4.20.0-44637-g06a0649ab5ac #1 SMP Thu Jul 9 17:53:49 UTC 2020 riscv64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sun Jun 7 23:05:37 UTC 2020 on ttyS0
root@ip-172-31-20-121:~# uname -a
Linux ip-172-31-20-121 4.20.0-44637-g06a0649ab5ac #1 SMP Thu Jul 9 17:53:49 UTC 2020 riscv64 GNU/Linux
Summary
• Hardware designed in BSV
• Hardware/software connection via Connectal
• Devices modeled in software via Virtio / TinyEMU
• Hardware designers focused on CPU extensions
• Software developers developed/debugged in qemu and deployed on AWS F1
December 8-10 | Virtual Event
Thank you for joining us.
Contribute to the RISC-V conversation on social!
#RISCVSUMMIT @risc_v