
Design of An Efficient AXI-4 Protocol For High Speed SOC Applications On FPGA Platform




Archana H. R (1), C. R. Byrareddy (2), and Narendra C. P (2)
(1) Department of ECE, BMS College of Engineering, Bangalore, India
(2) Department of ECE, Bangalore Institute of Technology, Bangalore, India
archanahr.ece@bmsce.ac.in

Abstract—The system-on-chip (SoC) design process encounters various challenges in communication from one module to another. Thus, the bus interconnection plays a significant role in improving system performance on a single chip. Traditional bus interconnections can no longer meet the requirements of future-generation SoCs. This paper proposes an efficient design of the AXI-4 protocol to achieve high-speed data transfer in SoC applications. The proposed AXI-4 interface protocol includes the Master and Slave modules, which are designed using state flow and state diagrams. Both the Master and Slave module operations support burst-based transactions and perform the five channel transactions: "write address," "write data," "write response," "read address," and "read data" along with "read response." The simulation results of the AXI-4 interface and its FPGA realization on Artix-7 illustrate lower resource utilization, and the performance benchmarking of the proposed AXI-4 against the traditional AHB and Wishbone bus modules illustrates an average reduction in area of 40% and an increase in frequency of 41%.

Index Terms—AHB; AMBA; ASB; AXI-4; Burst; FPGA; Master; Slave; SOC; Transaction.

I. INTRODUCTION

In recent years, SoC and multicore processors have been gaining popularity because of their advantages. A suitably designed bus interconnection between the devices plays a vital role in improving system performance. High-throughput and low-latency bus architectures are an essential part of system communication and interconnection. Many bus interconnection designs, including HyperTransport, PCI, and Quick Path Interconnect, provide high performance in SoCs, but they still face congestion issues [1].

In the VLSI field, the SoC is the fastest growing design platform for achieving low cost, low power, and high speed on a single chip rather than with a conventional board-level design. In recent years, compared to ASIC design, the FPGA (field-programmable gate array) has been widely used because it is reconfigurable and makes it easy to realize and modify the design at the device level. One programmable SoC FPGA architecture is the Zynq-7000 series, which combines the ARM Cortex-A9 processing system and 28 nm Xilinx programmable logic on a single chip. The Zynq SoC supports secured data transmission with high performance and low latency for embedded systems, along with shared memory access [2][3].

Verifying the performance and functionality of the SoC is a challenging task within the constraints of the Master and Slave configuration, along with dynamic topology, the total number of transaction types, and different interface protocols. The verification challenges are addressed by using new tools and technologies that cover functional correctness, verification completeness, protocol and conversion compliance, stress verification, security, and power management [4]. SoC bus interconnections are designed based on the topologies and protocols deployed in the buses. The bus topologies include single-level, multi-level shared, and multi-bus interconnection structures. In the AMBA protocol family, AXI (Advanced Extensible Interface) provides communication for high-performance devices, and APB (Advanced Peripheral Bus) provides transmission among low-speed devices and peripherals [5].

The Bridge module is an essential component in an SoC for establishing communication between protocols [6]. For multiple Masters and multiple Slaves, the communication interface plays an essential role in improving system performance. Serial communication can transfer any data, but it cannot meet complex system requirements. Parallel systems transfer data from a particular source to a destination using the AXI interface protocol [7].

The proposed AXI-4 interface protocol offers high-speed interconnection for SoC systems. The proposed design follows the AMBA AXI protocol specification [8], which supports burst-based transactions suitable for low-latency and high-bandwidth designs. The proposed design overcomes the drawbacks of previous bus-based architectures such as the Wishbone and AHB interfaces. The AXI-4 design uses five channel transactions: write address, write data, write response, read address, and read data with read response. The AXI-4 Master and Slave interface modules are designed using state flow and state diagrams. The data width is 8 bits. Further, these interface protocols are used in image processing applications.

This paper is organized in six sections. After the introduction in Section I, Section II discusses existing work on interface protocols, AXI, and related technologies, and identifies the research gaps. The methodology adopted for the proposed work is discussed in Section III. Section IV explains the proposed system in detail. The results and analysis of the work are elaborated in Section V. Finally, Section VI concludes the overall system with improvements and future work.

ISSN: 2180 – 1843 e-ISSN: 2289-8131 Vol. 12 No. 3 July – September 2020 61
Journal of Telecommunication, Electronic and Computer Engineering

II. RELATED WORK

Existing work on the AXI protocol with different functional architectures and for different bus applications is reviewed below.

Sarojini et al. [9] presented a throughput-optimized memory interface design using the AXI protocol as a hardware approach with separate read/write channels. The data is passed through a FIFO and accessed by the AXI Master, while the AXI Core interconnect provides the interface between Master and Slave. The slave DDR memory controller receives the data and connects to the external world. The speedup of the design is reported in Mbps based on the clock frequency. The input-output speed of the design is limited by the FPGA hardware capabilities. Tidala [10] describes FPGA-based (hardware approach) high-speed on-chip communication using the AXI-4 protocol. The design overcomes many challenges concerning quality of service and resolves the complexity issues in the network routing interface, hence providing better data transmission. The network communication happens between the programmable logic device and memory via the AXI interconnect on the FPGA. For each burst data transaction, the bandwidth is observed and tabulated. Erfan et al. [11] reported an AXI-4 based interconnect as a software approach, used to improve performance in terms of low latency, high bandwidth, less traffic, and better execution timing for the Smart Memory Cube (SMC) module.

Makni et al. [12] described an AMBA-based AXI4 bus protocol as a hardware/software (H/S) approach for Wireless Sensor Networks (WSN). The SoC has three similar interconnect bus protocols, namely the stream, lite, and burst types. These interconnect bus protocols are configured and optimized by High-Level Synthesis (HLS) techniques. Compared with the benchmarked designs with improvements, the optimization techniques include loop pipelining, dataflow, and array partitioning. The functional verification of the digital system is done by Bus Functional Modelling (BFM) and Transaction-Level Modeling (TLM) techniques. The transaction-based SoC system is designed using the AXI-4 bus, interconnected with the help of VHDL to provide cost-effective solutions on hardware [13].

Sebastian et al. [14] presented a verification of the Serial Gigabit Media Independent Interface (SGMII) IP core using a Universal Verification Methodology verification component (UVM-VC) with an AXI to Wishbone Bus (WB) bridge as a software approach. By using the AXI bus, the coverage of SGMII is improved, along with the creation of a reusable, reliable verification environment.

The verification Intellectual Property (VIP) for AXI4 as a software approach is designed in Prasad et al. [15], which provides the verification and IP-core-based flow for SoC designs. The functionality of the five channel transactions and the out-of-order and multiple outstanding transaction scenarios are verified with the Questa tool. Sharma et al. [16] presented the design of the conventional AMBA AXI-3 bus protocol as a software approach, which includes write and read burst-based transactions with verification analysis and code coverage findings. Panjkov et al. [17] addressed the bridging of well-known protocols, AXI and OCP (Open Core Protocol), as a software approach that supports multiple transactions. The bridge mainly contains an AXI Master, AXI downsizer, OCP Slave, and OCP to AXI kernel. The AXI Master is designed using write and read FSMs to handle the handshake mechanism between the AXI interface and the bridge. The AXI interface module with the Master, Slave, and interconnect modules is designed as a hardware approach by Ramesh et al. [18]. The interconnect module includes a Master controller, a read-write arbiter and decoder, and a Slave controller. The AXI bus interface supports 2-master and 4-slave communication. Archana et al. [19] explained the imaging chip concept using the AXI protocol for medical applications on a hardware platform. Fabio et al. [20] presented a reliable communication analysis between embedded processing systems and programmable logic as a hybrid (hardware and software) approach, using AXI ports on the ZYNQ-7000 multicore ARM processor.

Research Gap: The review of existing approaches shows that the significant work carried out on interface-bus protocol designs is based on either software or hardware approaches. The research gaps identified from the existing approaches are as follows:
1) Most interface modules, such as bus-based architectures and shared bus connections, are used in many SoC applications for communication purposes, and they face scalability and reliability problems on the hardware platform.
2) Very few hardware-based designs target AMBA-based protocols like AHB, APB, and ASB. Even with the AXI protocol, very limited work has been carried out, and it has many constraint issues.
3) Many vendors have designed a soft-core AXI protocol as an IP core. Although it is used in most applications, including fast MPSoC communication, it is not customized to other hardware.
4) Most AXI-based interface protocol designs with hardware-based approaches face difficulties in providing cost-effective solutions on the SoC platform.
5) A complete AXI-4 interface protocol with a high-speed architecture and optimized constraints to improve the performance of SoC applications is yet to come.

Hence an efficient, cost-effective solution with better performance is required to fill the above research gaps. The overview of the research methodology for the proposed design to address the research gaps is described in the next section.

III. RESEARCH METHODOLOGY

An efficient AXI-4 protocol is a high-speed interface bus protocol that corrects the drawbacks of existing bus interface protocols with better performance. The schematic flow of the proposed high-speed AXI-4 interface is represented in Figure 1. The model includes the AXI-4 Master module, the interconnect module, and the AXI-4 Slave module. These modules are interconnected based on the AMBA AXI-4 input-output specifications in two steps: define the AXI-4 Master and Slave input-outputs (IO) as per the ARM AXI-4 specification [8], and assign the input-output connections as per the five channels. The Master and Slave modules are designed using an FSM (finite state machine) along with five transactions, which are: write address channel (WAC), write data channel (WDC), write response channel


(WRC), read address channel (RAC), and read data channel (RDC) with the response.

The proposed module provides a high-speed interface between the Master and Slave modules via the interconnection. The AXI-4 interconnect provides a common interface between Master and Slave using a state machine. The AXI-4 Slave module performs different transactions, which include write address ready signal generation, write address latching with different burst transactions, write ready and response signal generation, read address ready signal generation with burst transactions, memory-mapped register select, read logic signal generation, and the design of the Block RAM (BRAM) to access the master data.

[Figure 1: Schematic flow of proposed AXI-4 interface module. Blocks, top to bottom: Define AXI-4 Master-Slave Input-Outputs; AXI-4 Input-Output Connections for Five Channels; Master FSM Module (WAC, WDC, WRC, RAC, RDC); AXI-4 Interconnect; Slave Module (write address, write ready, write response, read address ready, and read logic generation); Design of Block RAM (BRAM).]

The study outcome offers high throughput, reduced resource consumption, and better performance on a single chip for low-cost SoC applications. It is also anticipated that the proposed scheme overcomes the drawbacks of traditional interface protocols with improved constraints and performance. The next section describes the proposed AXI-4 Master-Slave design with FSM models and the input-output functionality for high-speed SoC applications.

IV. PROPOSED SYSTEM

This section discusses the proposed AXI-4 Master-Slave modules using FSM and state flow with the AXI-4 input-output functionality. The AXI protocol offers key features for next-generation high-speed interconnection. The AXI-4 interface protocol suits low-latency and high-bandwidth designs and low-latency memory controller designs, offers flexibility in the interconnect architecture, and is backward compatible with the previous ASB, AHB, and APB interface protocols.

The AXI-4 protocol supports updated write response requirements, burst lengths of up to 256 beats, updated cache signal details, QoS signaling, and ordering requirements information. The AXI-4 protocol provides burst-based transactions. In the address channel, every transaction has control and address information that describes the type of data to be transferred. The data is transferred from the Master to the Slave module using the write data channel. The write response channel reports the completion of the transaction from the Slave to the Master module using response and acknowledge signals. Data is transferred from the Slave to the Master, along with the response, using the read data channel. The detailed design of the Master and Slave is explained below.

A. AXI-4 Master Module Operation
The AXI-4 Master module is designed using five signal transactions. First, define the AXI-4 Master input-output port signals, with the number of transactions and the data width, for the write address channel, write data channel, write response channel, read address channel, and read data channel with a response. Next, the AXI-4 internal temporary signals for the channel transactions are defined using the AXI-4 Master input-outputs. The local parameter for the targeted Slave base address is set to 8'h80, and the Master module waits for the start counter, which is set to 16 clock cycles, before initiating the write transaction. In the Master module, the burst size and burst length are allotted based on the total number of burst transfers. The write and read burst counters are used to find the number of burst transfers based on the burst length.

The AXI-4 Master input-output connections are set for the five channel transactions. The master AXI address (AWADDR) is the concatenation of the targeted slave base address and the active offset range (axi_awaddr). The write burst length (AWLEN) is defined as the number of transactions minus one. The burst size (AWSIZE) is set to 3, which is 2^3 and equal to 8-bit data. The 2-bit increment burst type (AWBURST) is selected and set to 2'b01. The master cache type (AWCACHE) is set to 4'b0011, which indicates a cacheable and bufferable type. The write address valid (AWVALID) is set to 1 if valid address and control information is present, otherwise 0.

The master AXI write data (WDATA) is 8-bit wide. The write strobe (WSTRB) indicates which byte is used to update the memory. The write last (WLAST) marks the last data transaction in a write burst. The write valid (WVALID) is set to 1 if the data is valid, otherwise 0. The write response ready (BREADY) provides the response information: 1 means the Master is ready, and 0 means the Master is not ready. The AXI-4 Master read address (ARADDR), the read burst length (ARLEN), the read burst size (ARSIZE), the read increment burst type (ARBURST), the read cache type (ARCACHE), the read address valid (ARVALID), and the read ready (ARREADY) are set the same as for the write transactions.

B. Write Address Channel (WAC)
The address and control information is requested for all the transactions and processed for the write operation as early as possible. Initially, the valid address signal (awvalid) is reset; the next transaction is started if the previous address is not valid. In the next clock cycle, awvalid = 1. Once valid is set, the Master waits for the AWREADY signal to accept the transaction. Once AWREADY indicates that the previous address is accepted, the next address is set based on the burst size and burst length.

C. Write Data Channel (WDC)
The write data is continuously forwarded to the Slave via the interface signals. If the valid (WVALID) and ready (WREADY) signals are high, the next transaction (wnext) is started. The write valid signal (wvalid) is reset initially; the next transaction is started if it was not valid previously. In the next clock cycle, wvalid = 1.
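As a rough illustration of the address-channel settings described in Section IV-A, the following Python sketch derives the control fields for one write burst. The function name `write_address_controls` and the dictionary layout are hypothetical and are not part of the authors' Verilog design. The encodings used (AWLEN equals beats minus one, INCR equals 2'b01, bytes per beat equals 2^AWSIZE) follow the AMBA AXI specification [8]. The sketch assumes AWSIZE = 0, that is, one byte per beat, which matches the 8-bit data width and the AWSIZE = 000 value reported in the simulation of Section V.

```python
AWBURST_INCR = 0b01      # incrementing burst, as selected in the text
AWCACHE_VALUE = 0b0011   # cache type value quoted in the text


def write_address_controls(beats, awsize=0):
    """Return write address control fields for one INCR burst.

    awsize=0 is an assumption: one byte (8 bits) per beat.
    """
    if not 1 <= beats <= 256:
        raise ValueError("AXI-4 INCR bursts carry 1 to 256 beats")
    return {
        "AWLEN": beats - 1,              # burst length is encoded as beats - 1
        "AWSIZE": awsize,
        "bytes_per_beat": 2 ** awsize,   # AXI encoding: bytes = 2^AWSIZE
        "AWBURST": AWBURST_INCR,
        "AWCACHE": AWCACHE_VALUE,
    }


single = write_address_controls(1)
print(single["AWLEN"], single["bytes_per_beat"])
```

A single-beat transfer therefore uses AWLEN = 0, which is the value shown for each transaction in the simulation results.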

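The write address handshake of Section IV-B can be sketched as a one-register update rule. This is a behavioural model under stated assumptions, not the authors' RTL: the `start` input is a hypothetical "begin next transaction" pulse, and the rule that awvalid is cleared in the cycle after AWREADY is seen is assumed from the description that the Master waits for AWREADY once valid is set.

```python
def next_awvalid(resetn, awvalid, start, awready):
    """Next-cycle value of awvalid (active-low reset)."""
    if not resetn:
        return 0          # reset clears the valid flag
    if not awvalid and start:
        return 1          # launch a new address phase
    if awvalid and awready:
        return 0          # the slave accepted the address
    return awvalid        # otherwise hold and keep waiting for AWREADY


def simulate(awready_trace):
    """Drive one address phase; start is pulsed in cycle 0 only."""
    awvalid, history = 0, []
    for cycle, awready in enumerate(awready_trace):
        awvalid = next_awvalid(1, awvalid, start=(cycle == 0), awready=awready)
        history.append(awvalid)
    return history


print(simulate([0, 0, 1, 0]))
```

The valid flag stays high for as long as the slave withholds AWREADY, which is the waiting behaviour the text describes.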

When many write transactions are in process (wnext) and until the last burst (wlast), the WVALID signal must wait to complete the write process. The write counter is used to synchronize the last write data when it is full. The write last (wlast) is active only when the write counter is equal to the burst length, along with the next transaction. A burst counter is used with an additional counter to perform the next transactions, which avoids decoding logic in the AXI-4 interconnection. The write data (wdata) is generated based on the wnext logic and incremented up to the burst count.

D. Write Response Channel (WRC)
The write response channel assures that all the write transactions are ready to be stored in the slave memory. BREADY is the response ready signal. If it is 1, the Master is ready to accept the Slave response information. If it is 0, the Master is not ready to receive the Slave response information. The write response ready (bready) is reset initially. When the write response valid (BVALID) is asserted by the Slave and the Master is not ready to respond, the Master becomes ready to respond in the next clock cycle (axi_bready = 1). If the Master is not ready, it retains its previous axi_bready value, as presented in Figure 2.

[Figure 2: AXI-4 master write response state flow. Inputs: RESETN, BVALID; output: axi_bready. If RESETN = 0, axi_bready = 0. Otherwise, if BVALID && ~axi_bready = 1, axi_bready = 1. Otherwise, if axi_bready = 1, axi_bready = 0. Otherwise stop.]

E. Read Address Channel (RAC) and Read Data Channel (RDC) with Response
The read address channel works similarly to the write address channel, except that the read address signals are used to complete all the read transactions. The read data is continuously forwarded to the Master via the interface signals. If the valid (RVALID) and ready (RREADY) signals are high, the next transaction (rnext) is started. The read last (rlast) is active only when the read counter is equal to the burst length, along with its next read transaction. When the read valid (RVALID) is asserted by the Slave and the Master is not yet ready, the Master becomes ready in the next clock cycle (axi_rready = 1). Otherwise, axi_rready retains its previous value. RVALID indicates that the required read data is available, and the read transaction completes according to the response and the Master ready signal.

Any data mismatch during the write and read transactions leads to read or write interface errors. If a read mismatch or an error in the write or read response occurs, the error data are stored in the error register. The write and read burst counters are used to track the total number of burst transactions initiated against the individual number of burst transactions for the Master or Slave to initiate.

F. Master Interface Finite State Machine
The Master interface finite state machine (FSM) is used to compare and validate the write and read transactions. The FSM has five states, IDLE, COUNTER, WRITE, READ, and FINAL, as presented in Figure 3.

[Figure 3: AXI-4 master interface finite state machine. States: IDLE (count = 0, c_done = 0, init_burst_write = 0, init_burst_read = 0); COUNTER (count = count + 1 until count = 16); WRITE (init_burst_write = 1 while write_done = 0; init_burst_write = 0 when write_done = 1); READ (init_burst_read = 1 while read_done = 0; leaves when read_done = 1); FINAL (c_done = 0 or c_done = 1).]

The IDLE state resets the signals to their initial values under the reset condition and assigns the next transition to the COUNTER state. The COUNTER state initializes the counter and waits for the start counter, which is set to 16 clock cycles, before initiating the write transaction; once counting is done, the next transition to the WRITE state is assigned. The WRITE state initiates the write transactions. It remains active while the burst write signal (init_burst_write) is asserted. If the burst write signal is zero, the write transactions stop, the write done signal goes high, and the next transition to the READ state is assigned. The READ state initiates the read transactions. It remains active while the burst read signal (init_burst_read) is asserted. If the burst read signal is zero, the read transactions stop, the read done signal goes high, and the next transition to the FINAL state is assigned.

The FINAL state provides the final comparison of the written data with the read data. If any data mismatch is found, the error flag is set, the error data is assigned to the error register, and the next transition to the IDLE or COUNTER state is assigned. If the comparison is carried out (c_done) with no error, the transaction is completed.

The write and read done signals are activated based on the last write and read transactions, which depend on the write and read response, ready, and valid signals.

G. AXI-4 Slave Module Operation
The AXI-4 Slave module supports burst-based transactions and is implemented with Block Random Access Memory (BRAM). The Slave module receives the Master output signals as input signals through the interconnect. The Slave module writes and reads the data based on the five signal transactions.
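For the write data channel of Section IV-C, the relationship between the write counter and the write last flag can be sketched as follows. The helper `write_beats` is a hypothetical name, and the model assumes that WLAST is raised exactly on the beat whose counter value equals the encoded burst length (AWLEN, the number of beats minus one).

```python
def write_beats(data, awlen):
    """Pair each 8-bit data value with its WLAST flag for one burst."""
    if len(data) != awlen + 1:
        raise ValueError("a burst carries AWLEN + 1 beats")
    # WLAST is 1 only when the beat counter reaches AWLEN
    return [(value & 0xFF, int(count == awlen))
            for count, value in enumerate(data)]


print(write_beats([10, 20, 30], awlen=2))
```

With AWLEN = 0, as in the simulation of Section V, every transfer is a single beat and WLAST accompanies each data value.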

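The write response state flow of Figure 2 (Section IV-D) maps directly onto a small next-state function. The sketch below transcribes the three decisions of the flow chart; the function name and the test trace are illustrative choices rather than part of the original design.

```python
def next_bready(resetn, bvalid, bready):
    """Next-cycle axi_bready, following the decisions of Figure 2."""
    if not resetn:
        return 0              # RESETN = 0: clear axi_bready
    if bvalid and not bready:
        return 1              # response pending and master not yet ready
    if bready:
        return 0              # ready was high for one cycle: drop it
    return bready             # otherwise retain the previous value


def run(bvalid_trace):
    bready, history = 0, []
    for bvalid in bvalid_trace:
        bready = next_bready(1, bvalid, bready)
        history.append(bready)
    return history


print(run([0, 1, 1, 0]))
```

In this model axi_bready pulses for a single cycle per response, so the Master accepts one write response at a time.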
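The Master interface FSM of Section IV-F can be modelled as a transition function over the five named states. This is a simplified sketch under two assumptions that the text leaves open: the FSM stays in FINAL until the comparison is done (c_done = 1), and it then returns to IDLE rather than to COUNTER. The `step` function and its argument names are hypothetical.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    COUNTER = 1
    WRITE = 2
    READ = 3
    FINAL = 4


START_COUNT = 16  # clock cycles to wait before the first write, per the text


def step(state, count, write_done, read_done, c_done):
    """Return (next_state, next_count) for one clock cycle."""
    if state is State.IDLE:
        return State.COUNTER, 0
    if state is State.COUNTER:
        count += 1
        return (State.WRITE if count == START_COUNT else State.COUNTER), count
    if state is State.WRITE:
        return (State.READ if write_done else State.WRITE), count
    if state is State.READ:
        return (State.FINAL if read_done else State.READ), count
    # FINAL (assumption): hold until the comparison finishes, then go to IDLE
    return (State.IDLE if c_done else State.FINAL), count


state, count = State.IDLE, 0
for _ in range(17):  # 1 cycle to leave IDLE plus 16 counting cycles
    state, count = step(state, count, False, False, False)
print(state.name, count)
```

Keeping the transition logic in a pure function makes each arc of Figure 3 easy to check in isolation before committing it to HDL.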

The slave address and data widths are fixed to 8, and the Slave module receives the Master write data and issues the read data. First, define the AXI-4 Slave input-output port signals of the write address, write data, write response, read address, and read data channels with the response. Also, define the AXI-4 internal temporary signals for the channel transactions using the AXI-4 Slave input-outputs.

The AXI-4 Slave input-output connections are set for the write-read transactions. The Master output signals are inputs to the Slave interface module. The Slave output signals include the write address ready (AWREADY), which receives the address and control information: when it is set to 1, the Slave is ready to accept the information, and when it is 0, the Slave is not ready to accept it. The write ready (WREADY) receives the data information: when it is set to 1, the Slave is ready to accept the write data, and when it is 0, the Slave is not ready to accept it. The 2-bit write response (BRESP) provides the write transaction status with a response such as OKAY or EXOKAY. The write response valid (BVALID) signal is set to 1 when a valid write response is available, and 0 otherwise. The write response ID (BID) is the write response identification tag; if it matches AWID, the Slave responds to the write transaction.

The read address ready (ARREADY) receives the read address and control information: when it is set to 1, the Slave is ready to accept the information, and when it is 0, the Slave is not ready to accept it. The 2-bit read response (RRESP) provides the read transaction status with the response. The read response valid (RVALID) signal is set to 1 when a valid read response is available, and 0 otherwise. The read last (RLAST) marks the last data transaction in a read burst. The read ID (RID) is the read identification tag; if it matches ARID, the Slave responds to the read transaction. The Slave AXI read data (RDATA) is 8-bit wide and carries the block memory data as the final data output.

The proper valid and response transactions are supported by the write flag (axi_flagw) and read flag (axi_flagr) registers. The write flags are reset to generate the write address ready (axi_awready). If AWVALID and the previous control signals, such as the write and read flags, are set, then AWREADY becomes active along with the write flag. If WLAST is set high, the Slave is ready to accept the next address transaction after completion of the present write transaction.

H. Write Address Generation
The write address offset is defined based on the write address size and length. If both the AWVALID and WVALID signals are valid, the process of write address latching is started. The write address (axi_awaddr) and burst length counter (axi_len_cnt) are reset. If AWVALID and the previous write flag (axi_flagw) are valid, the write address (axi_awaddr) is set using AWADDR[7:0] and the burst length counter is set using AWLEN. The 2-bit burst type selects the transaction performed: fixed burst (00), incremental burst (01) based on the burst size, or wrapping burst (10) based on the wrap boundary.

I. Write Ready Generation
The write address valid (AWVALID) and write valid (WVALID) are valid when the write ready (axi_wready) is ready to accept them for the number of burst transactions (wdata). The write ready (axi_wready) is generated when WVALID and the previous control signals, such as the write flags, are set; axi_wready then becomes active. If WLAST is set high, the Slave is ready to accept the next address transaction after the completion of the present write transaction.

J. Write Response Logic Generation
The Slave module declares the valid response and write response signals when WVALID and axi_wready are set, which results in the response type 00 (OKAY) and a valid response (bvalid). When the Master module's ready response (BREADY) and the Slave's valid response (bvalid) are both active, the write transaction is accepted from the Slave module.

K. Read Address Generation
The read address generation is similar to the write address generation, except that the read address signals are used rather than the write address signals. It also supports the read burst types.

L. Read Logic Generation
The previous read response (rresp) and valid response (rvalid) are reset. The read flag (axi_flagr) is set to access the Slave module read data (axi_rdata) when rvalid = 1 and the read response (rresp = 00) is OKAY. The Master accepts the valid response when RREADY = 1, as represented in Figure 4.

M. Block RAM Module
The block RAM supports up to 256 memory locations, each holding 8 bits. The memory location is allocated based on the write and read flags that are set.

[Figure 4: AXI-4 slave read response state flow. Inputs: RESETN, axi_rvalid, axi_flagr, RREADY; outputs: axi_rvalid, axi_rresp. If RESETN = 0, axi_rvalid = 0 and axi_rresp = 0. Otherwise, if ~axi_rvalid && axi_flagr = 1, axi_rvalid = 1 and axi_rresp = 2'b00. Otherwise, if axi_rvalid && RREADY = 1, axi_rvalid = 0. Then stop.]

V. RESULTS AND ANALYSIS

The results of the proposed AXI-4 Master-Slave protocol are described in detail in this section. The complete AXI-4 Master-Slave protocol is designed using Verilog HDL on the Xilinx ISE platform, simulated on the Modelsim simulator, and hardware prototyped on a low-cost Artix-7 FPGA.

The AXI-4 Master-Slave protocol simulation results are represented in Figure 5.
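The three burst types named in Section IV-H differ only in how the address advances between beats. The following sketch computes the per-beat addresses using the address rules of the AMBA AXI specification [8]; the function name, the 8-bit address mask (taken from the AWADDR[7:0] width in the text), and the default of one byte per beat are assumptions for illustration. Wrapping bursts are assumed to start at an address aligned to the transfer size.

```python
FIXED, INCR, WRAP = 0b00, 0b01, 0b10
ADDR_MASK = 0xFF  # assumption: 8-bit slave address space, as in AWADDR[7:0]


def burst_addresses(start, beats, burst, size_bytes=1):
    """Return the address of every beat in one burst."""
    if burst == FIXED:
        return [start & ADDR_MASK] * beats          # same address every beat
    if burst == INCR:
        return [(start + i * size_bytes) & ADDR_MASK for i in range(beats)]
    if burst == WRAP:
        total = beats * size_bytes                  # bytes covered by the burst
        lower = (start // total) * total            # wrap boundary
        addrs, addr = [], start
        for _ in range(beats):
            addrs.append(addr & ADDR_MASK)
            addr += size_bytes
            if addr >= lower + total:               # reached the upper limit
                addr = lower                        # wrap to the boundary
        return addrs
    raise ValueError("reserved burst type")


print(burst_addresses(0x06, 4, WRAP))
```

For example, a four-beat wrapping burst that starts at address 6 visits 6 and 7, then wraps to the boundary at 4 and finishes at 5.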

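The block RAM of Section IV-M can be pictured as a 256-entry array of bytes whose access is gated by the write and read flags. The class below is a purely behavioural sketch: the class name, the method signature, and the decision to return `None` when no read flag is set are hypothetical, and timing is not modelled.

```python
class BlockRam:
    """Behavioural model of a 256 x 8-bit memory."""

    DEPTH = 256

    def __init__(self):
        self.mem = [0] * self.DEPTH

    def access(self, addr, wdata=0, flag_w=0, flag_r=0):
        """Write when flag_w is set; return the stored byte when flag_r is set."""
        addr &= 0xFF                       # 8-bit address selects one location
        if flag_w:
            self.mem[addr] = wdata & 0xFF  # each location holds 8 bits
        if flag_r:
            return self.mem[addr]
        return None


ram = BlockRam()
ram.access(1, wdata=10, flag_w=1)
print(ram.access(1, flag_r=1))
```

Writing 10, 20, and 30 to addresses 1, 2, and 3 and reading them back reproduces the data pattern used in the simulation of Section V.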
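The slave read response state flow of Figure 4 (Section IV-L) can be written as a next-state function over axi_rvalid and axi_rresp. The sketch follows the three decisions in the flow chart; the function name and the example trace are illustrative, and holding the previous values when no decision applies is an assumption.

```python
RESP_OKAY = 0b00


def next_read_response(resetn, rvalid, rresp, flagr, rready):
    """Return the next (axi_rvalid, axi_rresp) pair, following Figure 4."""
    if not resetn:
        return 0, 0                 # RESETN = 0: clear valid and response
    if not rvalid and flagr:
        return 1, RESP_OKAY         # read data available: raise valid, OKAY
    if rvalid and rready:
        return 0, rresp             # master accepted the data: drop valid
    return rvalid, rresp            # assumption: otherwise hold the values


state = (0, 0)
trace = []
for flagr, rready in [(1, 0), (1, 0), (1, 1), (0, 0)]:
    state = next_read_response(1, state[0], state[1], flagr, rready)
    trace.append(state[0])
print(trace)
```

The valid flag stays asserted until the Master raises RREADY, so read data is never dropped while the Master is busy.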

The global clock (ACLK) is activated, toggling on the positive edge. The global asynchronous active-low reset (ARESETN) signal is initially set low and then kept high to start the AXI-4 Master-Slave operations. The AXI-4 Master module receives the incoming signals after the AXI-4 Slave response and, based on the Master module operation, generates the Master output signals, which are input signals to the AXI-4 Slave module through the AXI-4 interconnect signals. After the Slave process, the AXI-4 Slave output signals are generated after the response, and the Slave output signals are again inputs to the AXI-4 Master module. The same process repeats as long as data and address transactions happen.

The transactions are performed based on the AXI-4 Master-Slave specifications, which include write address (WA), write data (WD), write response (WR), read address (RA), and read data (RD). These transactions are explained below as per the simulation results.

[Figure 5: Simulation results of AXI-4 master-slave module]

A. Master-Slave Write Address Transaction
The write address valid output signal (AWVALID) goes high, followed by the write address ready signal (AWREADY = 1) from the Slave module, to initialize the AXI-4 process. When both AWVALID and AWREADY are high, the master write address (AWADDR) provides the address for each transaction and is set to 1, 2, and 3, followed by the allocated master output signals: the 8-bit burst length (AWLEN = 0), the 3-bit burst size (AWSIZE = 000), the 2-bit increment address burst type (AWBURST = 01), and the 4-bit cache type (AWCACHE = 0011), which is bufferable and cacheable.

B. Master-Slave Write Data Transaction
The write data transaction is performed in parallel with the write address transaction. The master output valid signal (WVALID = 1) is followed by the slave ready signal (WREADY = 1) when the write data (WDATA) carries the data for each transaction with an address. The 8-bit WDATA carries the 8-bit image data (IMAGE_IN), set to 10, 20, and 30 for the corresponding 1,

transactions. The 2-bit master write response (BRESP = 00) is an OKAY response for the transaction.

D. Master-Slave Read Address Transaction
When the write transactions are over, the read address transaction starts. The read address master valid output signal (ARVALID = 1) goes high, followed by the read address ready slave output (master input) signal (ARREADY = 1), to initialize the AXI-4 read process. When both ARVALID and ARREADY are high, the master read address (ARADDR) provides the address for each transaction and is set to 1, 2, and 3, followed by the allocated master output signals ARLEN = 0, ARSIZE = 000, the 2-bit ARBURST = 01, and ARCACHE = 0011.

E. Master-Slave Read Data Transaction
The read data transaction is performed in parallel with the read address transaction. The slave output valid (RVALID) signal, followed by the master ready (RREADY) signal, goes high when the 8-bit read data (RDATA) receives the data 10, 20, and 30 on completion of each read address transaction. The last read (RLAST) signal goes high for each read data transaction, and the 2-bit read response (RRESP = 00) indicates the OKAY response, confirming that the correct data is received.

The performance analysis of the proposed AXI-4 Master-Slave module and the resource utilization of its sub-modules in terms of area (slices) are presented in Table 1.

Table 1
Resource Utilization of AXI-4 Master-Slave Module on Artix-7

Logic Utilization    AXI-4 Master    AXI-4 Slave    AXI-4 Master-Slave
Slice Registers      49              24             63
Slice LUTs           61              28             77
LUT-FF pairs         48              17             56

The area utilization of the AXI-4 Master, Slave, and complete AXI-4 Master-Slave models after PAR (place and route) is obtained in the Artix-7 device environment. The AXI-4 Master-Slave utilizes 63 slice registers, 77 slice LUTs, and 56 LUT-FF pairs. The complete AXI-4 works at a high frequency of 561.07 MHz. The AXI-4 design fits the low-cost Artix-7 FPGA device at high speed. The total power consumption of the AXI-4 Master-Slave module from the X-Power Analyzer tool is 0.085 W at a clock frequency of 100 MHz. The full AXI-4 design utilizes less area and power and operates at high speed on any low-cost FPGA device.

The comparative analysis in terms of area utilization of the proposed AXI-4 Master-Slave module with the AHB protocol [21] on the same FPGA Spartan-3E device is tabulated in Table 2. The proposed AXI-4 improved in terms of slice registers by around 6.80%, slice LUTs by 40%, and LUT-FF pairs by 33.33% overhead over the AHB protocol. The AXI-4 works at 234.36, whereas AHB Protocol frequency is not mentioned
2, and 3 address. by [21].
In a similar passion, the proposed AXI-4 module is
C. Master-Slave Write Response Transaction compared with Wishbone Shared Bus Protocol [22] in terms
For each data transactions, the master input or slave output of Area and frequency on the same FPGA device. The
write response valid (BVALID=1) signal followed by proposed AXI-4 improves in terms of Slice Registers Slice
response ready (BREADY=1) signal will be high for one LUT, and LUT-FF Pairs is around 84 % over Wishbone
clock, both the signals will be low, till the new transaction Shared Bus Protocol [22]. The maximum frequency
happens and it will be followed till the last address and data utilization of AXI-4 was improved around 41 % over
Wishbone Shared Bus.
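The area percentages quoted above can be checked against the resource counts reported in Tables 2 and 3. The short Python sketch below is only an illustration: the helper name reduction_percent and the choice of reference value for the frequency comparison are assumptions, not taken from the paper. It computes the relative reduction (baseline - proposed) / baseline for each resource, and evaluates the frequency gain relative to both the AXI-4 and the Wishbone figures, since the text does not state which reference is used.

```python
def reduction_percent(baseline, proposed):
    """Percentage by which `proposed` is smaller than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# (baseline, proposed AXI-4) resource counts as reported in Tables 2 and 3.
TABLE_2_AHB = {
    "Slice Registers": (44, 41),
    "Slice LUTs": (95, 57),
    "LUT-FF pairs": (96, 64),
}
TABLE_3_WISHBONE = {
    "Slice Registers": (292, 41),
    "Slice LUTs": (416, 57),
    "LUT-FF pairs": (459, 64),
}

ahb_gain = {k: reduction_percent(b, p) for k, (b, p) in TABLE_2_AHB.items()}
wishbone_gain = {k: reduction_percent(b, p) for k, (b, p) in TABLE_3_WISHBONE.items()}

# Maximum frequencies (MHz) from Table 3.
f_wishbone, f_axi = 118.312, 201.572
# Assumption: two possible reference values for the frequency improvement.
freq_gain_vs_axi = 100.0 * (f_axi - f_wishbone) / f_axi
freq_gain_vs_wishbone = 100.0 * (f_axi - f_wishbone) / f_wishbone

for label, gains in (("AHB", ahb_gain), ("Wishbone", wishbone_gain)):
    for name, gain in gains.items():
        print(f"{label:8s} {name:16s} {gain:6.2f} %")
print(f"Frequency gain relative to AXI-4:    {freq_gain_vs_axi:.2f} %")
print(f"Frequency gain relative to Wishbone: {freq_gain_vs_wishbone:.2f} %")
```

With the tabulated values, the AHB comparison comes out at roughly 6.8%, 40%, and 33.3%, matching the text. The frequency difference is about 41% when taken relative to the AXI-4 frequency.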

66 ISSN: 2180 – 1843 e-ISSN: 2289-8131 Vol. 12 No. 3 July – September 2020
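The write and read transactions described in the simulation results all rest on the same rule: a channel transfers its payload only when VALID and READY are both high. The Python sketch below is a minimal, untimed model of that rule, not the authors' HDL. The names Channel, ToySlave, and write_then_read are hypothetical, the slave is assumed to be always ready, and every transfer is a single beat (AWLEN=ARLEN=0), using the addresses 1, 2, 3 and data 10, 20, 30 from the simulation.

```python
from dataclasses import dataclass, field

OKAY = 0b00  # AXI response encoding for a successful transfer


@dataclass
class Channel:
    """One AXI channel: the payload moves only when VALID and READY are both 1."""
    valid: int = 0
    ready: int = 0
    payload: object = None

    def fire(self) -> bool:
        return self.valid == 1 and self.ready == 1


@dataclass
class ToySlave:
    """Hypothetical slave: a small memory indexed by address."""
    mem: dict = field(default_factory=dict)


def write_then_read(addresses, data):
    slave = ToySlave()
    aw, w, b, ar, r = (Channel() for _ in range(5))
    responses, read_back = [], []

    for addr, value in zip(addresses, data):
        # Write address and write data are offered together; the slave is ready.
        aw.valid, aw.payload, aw.ready = 1, addr, 1
        w.valid, w.payload, w.ready = 1, value, 1
        if aw.fire() and w.fire():
            slave.mem[aw.payload] = w.payload
            aw.valid = w.valid = 0
            # Write response: slave raises BVALID, master answers with BREADY.
            b.valid, b.payload, b.ready = 1, OKAY, 1
        if b.fire():
            responses.append(b.payload)
            b.valid = b.ready = 0  # both go low until the next transaction

    for addr in addresses:
        ar.valid, ar.payload, ar.ready = 1, addr, 1
        if ar.fire():
            ar.valid = 0
            # Single-beat read (ARLEN=0), so RLAST is 1 on the only beat.
            r.valid, r.ready = 1, 1
            r.payload = (slave.mem[ar.payload], OKAY, 1)  # RDATA, RRESP, RLAST
        if r.fire():
            read_back.append(r.payload)
            r.valid = r.ready = 0

    return responses, read_back


responses, read_back = write_then_read([1, 2, 3], [10, 20, 30])
print(responses)   # BRESP for each write
print(read_back)   # (RDATA, RRESP, RLAST) for each read
```

The model ignores clock timing and back-pressure. A cycle-accurate version would hold VALID until READY arrives, which is where the state machines of the Master and Slave modules come in.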

Table 2
Area Comparison of Proposed AXI-4 with AHB-Bus [21]

Logic Utilization    AHB-Bus [21]       AXI-4 Proposed
FPGA Device          Spartan 3E-1600    Spartan 3E-1600
Slice Registers      44                 41
Slice LUTs           95                 57
LUT-FF pairs         96                 64
Frequency (MHz)      -                  234.36

Table 3
Resource Comparison of Proposed AXI-4 with Wishbone-Shared-Bus [22]

Logic Utilization    Wishbone-shared Bus [22]    AXI-4 Proposed
FPGA Device          Spartan 3E-500              Spartan 3E-500
Slice Registers      292                         41
Slice LUTs           416                         57
LUT-FF pairs         459                         64
Frequency (MHz)      118.312                     201.572

[Bar chart: Slice Registers, Slice LUTs, and LUT-FF pairs (scale 0 to 500) for Wishbone-shared Bus, AHB-Bus, and AXI-4 Proposed]
Figure 6: Comparative analysis of AXI-4 with other bus protocols

The AXI-4 protocol is compared with the traditional AHB and Wishbone bus
protocols in terms of hardware design constraints. Table 2 and Table 3 show
that the AXI-4 bus protocol is more efficient than the AHB and Wishbone bus
protocols.
For communication and interconnection among multiple devices on an SoC
platform, the AXI-4 is highly commendable because of its low cost,
flexibility, and low area and power utilization. It also works at a higher
speed. An overview of the comparative analysis is presented in Figure 6. The
proposed AXI-4 has less area overhead than the other two functionally
similar bus protocols, namely the AHB and Wishbone bus protocols.

VI. CONCLUSION

In this paper, an efficient AMBA-AXI-4 interface protocol, which offers
high-speed communication between the processing elements, has been designed.
The limitations of conventional bus-based communication are overcome with
the inclusion of the optimized AXI-4 interface protocol for high-speed SoC
usage. The proposed AMBA-AXI-4 protocol module contains a Master module, an
Interconnect model, and a Slave module. These modules are designed using
efficient state machines. The simulation results of the complete AXI-4
Master-Slave are presented with a detailed description. The proposed model
occupies very little area, and its utilization is tabulated after place and
route on a low-cost FPGA device. The model works at high speed, with an
operating frequency of 561.07 MHz, and consumes a small amount of power,
0.085 W. The proposed AXI-4 protocol provides a notable improvement of
around 40% in slice LUTs over the AHB bus protocol. Compared to the Wishbone
bus architecture, it shows an area improvement of around 84% and a frequency
improvement of around 41%. In the future, these models can be incorporated
into image processing applications to process and monitor images at high
speed as an imaging chip.

REFERENCES

[1] W. Su, J. Wang, H. Wang, and L. Wang, “An Optimized Solution for Cross-Domain System Bus Transaction Processing”, In Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 14th ACIS International Conference on IEEE, 2013, pp. 165-170
[2] Z. Li, J. Li, Y. Zhao, C. Rong, & J. Ma, “A SoC Design and Implementation of H. 264 Video Encoding System Based on FPGA”, In Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2014 Sixth International Conference on IEEE, Vol. 2, 2014, pp. 321-324
[3] S. Ramagond, S. Yellampalli and C. Kanagasabapathi, “A review and analysis of communication logic between PL and PS in ZYNQ AP SoC”, In 2017 International Conference on Smart Technologies for Smart Nation (Smart Tech-Con), 2017, pp. 946-951
[4] A. B. Mehta, “SoC Interconnect Verification”, In ASIC/SoC Functional Design Verification, Springer, Cham, 2018, pp. 273-284
[5] D.C. Liang, “Hard real-time bus architecture and arbitration algorithm based on AMBA”, 2015, pp. 1-7
[6] G. Mahesh and S.M. Sakthivel, “Functional verification of the Axi2OCP Bridge using system verilog and effective bus utilization calculation for AMBA AXI 3.0 protocol”, In Innovations in Information, Embedded and Communication Systems (ICIIECS), International Conference on IEEE, 2015, pp. 1-5
[7] P.R. Ronak and S. Jagtap, “Design and verification of flexible interface for multicore system using PCIe IO virtualization”, In Recent Trends in Electronics, Information & Communication Technology (RTEICT), IEEE International Conference on IEEE, 2016, pp. 623-627
[8] AMBA, “AXI-Protocol Specification V2.0”, ARM Holdings plc. Std, 2010
[9] C. Sarojini and J. Thangaraj, “Implementation and Optimization of Throughput in High Speed Memory Interface Using AXI Protocol”, In 2018 9th International Conference on Computing, Communication, and Networking Technologies (ICCCNT), 2018, pp. 1-5
[10] N. Tidala, “High-Performance Network on Chip using AXI4 protocol interface on an FPGA”, In 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), 2018, pp. 1647-1651
[11] E. Azarkhish, D. Rossi, I. Loi, and L. Benini, “High-performance AXI-4.0 based interconnect for extensible smart memory cubes”, In Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015, pp. 1317-1322
[12] M. Makni, M. Baklouti, S. Niar, and M. Abid, “Performance Exploration of AMBA AXI4 Bus Protocols for Wireless Sensor Networks”, In Computer Systems and Applications (AICCSA), 2017 IEEE/ACS 14th International Conference on IEEE, 2017, pp. 1163-1169
[13] D.C. Kho and K. Munusamy, “Transaction-based SoC design techniques for AMBA AXI4 bus interconnects using VHDL”, In Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), 2014 11th International Conference on IEEE, 2014, pp. 1-6
[14] M. Gayathri, R. Sebastian, S.R. Mary, and A. Thomas, “A SV-UVM framework for verification of SGMII IP core with reusable AXI to WB Bridge UVC”, In Advanced Computing and Communication Systems (ICACCS), 3rd International Conference on IEEE, 2016, pp. 1-4
[15] R.H. Prasad and C.S. Rani, “Development of VIP for AMBA AXI-4.0 Protocol”, Indian Journal of Science and Technology, Vol. 9, 2016, pp. 48
[16] S. Sharma, and S.M. Sakthivel, “Design and Verification of AMBA AXI3 Protocol”, In VLSI Design: Circuits, Systems, and Applications, Springer, Singapore, 2016, pp. 247-259
[17] Z. Panjkov, J. Haas, M. Aigner, H. Rosmanith, T. Liu, Poppenreiter & R. Hagelauer, “OCP2XI Bridge: An OCP to AXI Protocol Bridge”, In International Symposium on Applied Reconfigurable Computing, 2016, pp. 179-190

Journal of Telecommunication, Electronic and Computer Engineering

[18] R. Bhaktavatchalu, B.S. Rekha, G.A. Divya, and V.U.S Jyothi, “Design of AXI bus interface modules on FPGA”, In Advanced Communication Control and Computing Technologies (ICACCCT), International Conference on IEEE, 2016, pp. 141-146
[19] H.R. Archana, and K.V. Patel, “A Novel Design and Implementation of Imaging Chip Using AXI Protocol for MPSOC on FPGA”, In Proceedings of the Computational Methods in Systems and Software, Springer, Cham, 2018, pp. 44-57
[20] F. Benevenuti and F.L. Kastensmidt, “Reliability evaluation on interfacing with AXI and AXI-S on Xilinx Zynq-7000 AP-SoC”, In Test Symposium (LATS), IEEE 19th Latin-American, 2018, pp. 1-6
[21] A. Gaur, P. Sharma, and S.P. Pandey, “HDL and timing analysis of AMBA AHB on FPGA platform”, In Control, Automation & Power Engineering (RDCAPE), Recent Developments in IEEE, 2017, pp. 22-27
[22] A.K. Swain, and K. Mahapatra, “Design and verification of WISHBONE bus interface for System-on-Chip integration”, In India Conference (INDICON), Annual IEEE, 2010, pp. 1-4
