Design of DDR4 SDRAM Controller: December 2014
All content following this page was uploaded by Jahid Hasan on 23 July 2020.
I. INTRODUCTION
DDR4 SDRAM, an abbreviation for Double Data Rate type 4 Synchronous Dynamic Random Access Memory, is the latest member of the DDR family of technologies, aimed mostly at computing, server, biomedical and embedded applications. Currently, DDR3 SDRAM is widely used in PCs, smartphones, tablets, high-end servers and micro server systems. DDR4 focuses on data reliability. A large number of server systems are required to meet the growing bandwidth and capacity demands of multimedia content [1]. Main memory is therefore one of the vital components of a server system and must deliver high performance [2]. The increasing demand for faster operating systems and large applications has created the need for high-capacity memory. With high-capacity memory, power consumption also increases immensely, which is one of the major concerns for the memory market. DDR4 not only increases capacity but also reduces power consumption.

The main changes in DDR4 SDRAM compared with DDR3 SDRAM are a lower supply voltage, higher data rates, higher densities, more banks, faster burst access and higher system reliability.

Figure 1 shows the new features introduced in DDR4 SDRAM, such as CRC, C/A parity, data bus inversion (DBI), command address latency (CAL), per-DRAM addressability, multi-purpose registers (MPR), gear-down mode, fine granularity refresh (FGREF) and temperature controlled refresh. Among these, CRC, C/A parity and DBI address the data reliability issue. Per-DRAM addressability makes it possible to program only a selected DRAM within a single rank.

Fig. 1 New Features of DDR4 SDRAM

Table 1 compares DDR4 SDRAM with its predecessor, DDR3 SDRAM.

TABLE 1
COMPARISON TABLE OF DDR3 AND DDR4 SDRAM

Feature                 DDR3                               DDR4                                DDR4 Advantage
Voltage (core and I/O)  1.5 V                              1.2 V                               Reduces memory power demand
Prefetch                8n                                 8n                                  DDR4 also has 8n prefetch, but with parallel bank groups
Densities               512Mb–8Gb                          2Gb–16Gb                            Better enablement for large-capacity memory subsystems
Data rate (MT/s)        800, 1066, 1333, 1600, 1866, 2133  1600, 1866, 2133, 2400, 2667, 3200  Migration to higher data bandwidth
Internal banks          8                                  16                                  More banks
Bank groups (BG)        0                                  4                                   Faster burst accesses

The DDR4 SDRAM controller acts as a bridge between the SDRAM memory devices and the processor subsystem. It manages the bi-directional data flow of the memory. The main function of the controller is to read and write data. It also helps the DRAM retain its data through operations such as periodic refresh and precharge. Data transfer is accomplished with the bi-directional differential data strobe (DQS, DQS#). The DDR4 SDRAM memory device transmits the data strobe (DQS) for read operations, and the controller transmits DQS for write operations.
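To put the data rates of Table 1 in perspective, the following Python sketch converts a data rate in MT/s into a theoretical peak bandwidth. The 64-bit data bus width is an assumption (the paper does not state a bus width), and the function name `peak_bandwidth_mb_s` is hypothetical.

```python
# Data rates taken from Table 1 (MT/s).
DDR3_RATES_MT_S = [800, 1066, 1333, 1600, 1866, 2133]
DDR4_RATES_MT_S = [1600, 1866, 2133, 2400, 2667, 3200]


def peak_bandwidth_mb_s(rate_mt_s: int, bus_width_bits: int = 64) -> int:
    """Theoretical peak bandwidth in MB/s.

    Each transfer moves bus_width_bits / 8 bytes. The default 64-bit
    bus width is an assumption, not a figure from the paper.
    """
    return rate_mt_s * bus_width_bits // 8


for name, rates in (("DDR3", DDR3_RATES_MT_S), ("DDR4", DDR4_RATES_MT_S)):
    for rate in rates:
        print(f"{name}-{rate}: {peak_bandwidth_mb_s(rate)} MB/s")
```

Under this assumption, the fastest DDR4 rate in Table 1 (3200 MT/s) corresponds to 25,600 MB/s, against 17,064 MB/s for the fastest DDR3 rate (2133 MT/s). Real sustained bandwidth is lower because of refresh, bank conflicts and bus turnaround.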
II. FUNCTIONAL DESCRIPTION OF DDR4 SDRAM
DDR4 SDRAM uses an 8n prefetch with parallel bank groups for a higher data transfer rate. The memory is internally configured as 16 banks: 4 bank groups with 4 banks in each bank group [4]. A bank is an independent memory array. The row and column addresses are used to locate the requested address within a bank. A page is identified by a unique combination of bank group, bank and row address, and the page size is the total number of columns in a row. Rank architecture is very important for enhancing the data rate without increasing the burst length [3]. Figure 2 shows the organization of memory.

III. ARCHITECTURE OF DDR4 SDRAM CONTROLLER
Fig. 4 shows the proposed overall architecture of the DDR4 SDRAM controller.
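As an illustration of the memory organization described in Section II, the Python sketch below splits a flat address into row, bank group, bank and column fields and checks whether two addresses fall in the same page. Only the 4 bank groups and 4 banks per group come from the text. The field order and the column and row widths are assumptions chosen for the example, and the names `decode_address` and `same_page` are hypothetical.

```python
from typing import NamedTuple

# Field widths. BG_BITS and BA_BITS follow the text (4 bank groups,
# 4 banks per group). COL_BITS and ROW_BITS are assumed values.
COL_BITS = 10
BG_BITS = 2
BA_BITS = 2
ROW_BITS = 16


class DecodedAddress(NamedTuple):
    row: int
    bank_group: int
    bank: int
    column: int


def decode_address(addr: int) -> DecodedAddress:
    """Split a flat address, assuming the layout row | bank | bank group | column."""
    column = addr & ((1 << COL_BITS) - 1)
    addr >>= COL_BITS
    bank_group = addr & ((1 << BG_BITS) - 1)
    addr >>= BG_BITS
    bank = addr & ((1 << BA_BITS) - 1)
    addr >>= BA_BITS
    row = addr & ((1 << ROW_BITS) - 1)
    return DecodedAddress(row, bank_group, bank, column)


def same_page(a: DecodedAddress, b: DecodedAddress) -> bool:
    """A page is identified by bank group, bank and row address."""
    return (a.bank_group, a.bank, a.row) == (b.bank_group, b.bank, b.row)
```

With this layout, two addresses that differ only in their column bits hit the same open page, while a change in the row bits requires a precharge and a new activate in that bank.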
tRAS (active-to-precharge delay), etc. In the rank status machine, the status processor of the corresponding bank "x" starts counting these times (tRRD, tRCD, tRAS, etc.) after an activate command is issued. Initially, the status of bank "x" is bank active = 0, readable = 0, writeable = 0, prechargeable = 0. After tRCD has elapsed, the status of bank "x" is updated to bank active = 1, readable = 1, writeable = 1, prechargeable = 0.

DDR4 introduces new specifications for the read-to-read or write-to-write delay, called tCCD_S and tCCD_L, which apply to reads or writes to banks in different bank groups and in the same bank group, respectively.

B. CMD pipelining
Most of today's systems share a common RAM (for example, multiple processors, a GPU and GPIO share the DRAM). The controller therefore queues multiple commands from different peripherals using a command pipeline. The command pipeline is used for optimal bandwidth utilization. The Command Generation Unit also uses it to pre-process the target bank for a read or write operation, for example with activate and precharge commands.

C. Command address latency
Command-to-Address Latency (CAL) is a function offered by DDR4 memory that can be used to save power. CAL is the delay, in clock cycles, between asserting chip select and the command/address, and it is defined by a mode register. It gives the DRAM time to enable the CMD/ADDR receivers before a command is issued. The receivers can be disabled once the command and the address have been latched. For consecutive commands, the DRAM keeps the receivers enabled for the duration of the command sequence [4].

D. Calibration Unit
The Calibration Unit performs write leveling, read leveling and ZQ calibration. DDR4 memory uses a fly-by topology for better signal integrity, since this topology reduces the number of stubs and their length. Despite this benefit, it causes skew between clock and strobe at every DRAM on the DIMM. To solve the skew problem, the memory controller uses the 'write leveling' feature and feedback from the DDR4 SDRAM to adjust the data strobe to clock relationship [4]. This feature can compensate for unbalanced loading on the board for write and read operations. The calibration unit puts the DDR4 into the write leveling state by writing mode register MR1 of the DDR4. The write strobe DQS is repeatedly delayed in small increments by the Delay Locked Loop (DLL). The DDR4 samples CLK on the rising edge of DQS and provides feedback on DQ. During this protocol, each set of DQ is output as "0" until a rising edge of CLK is detected by DQS, at which time the DQ is output as "1". The calibration unit detects this "1" on the DQ bus and thereby knows the correct DQS compensation to align DQS and CLK on the write path. Once all DQS have been adjusted, the DLL compensation values are stored for each DQS for future use. The memory controller then sends another MRS command to exit write leveling mode.

Fig. 7 Write Leveling

Just as DDR4 write leveling manages DQS/DQ on write data, DDR4 read leveling manages DQS/DQ on read data. Read leveling compensates for imbalanced loading on the read path. First, the memory controller puts the DDR4 memory devices into MPR mode by writing to the MR3 register. This puts the devices into read leveling mode, in which they output a training pattern of consecutive "01" on each memory read command. Since the memory controller knows the expected data stream, it adjusts the internal DQS delay to capture DQ using DQS. The memory controller repeats these calibration reads until read data capture at the controller is optimized.

The ZQ Calibration command is used to calibrate the DRAM Ron and ODT values. ZQ calibration accounts for voltage and temperature variation on the DRAM output driver.

E. Command Generation Unit
The command generator accepts user commands from the CMD pipelining unit. When the unit receives a single request from the pipelining unit, the request is processed immediately. For a given read or write request, the Command Generation Unit looks up the rank status to ensure that the timing constraints of each previous memory transaction are met before a new transaction starts. It also checks the rank status for the availability of the target bank for the read or write operation. Bank availability means that the requested row in the corresponding bank is active. If the requested bank is not active, the unit generates the necessary precharge and activate commands to activate the bank in the DDR4. The Command Generation Unit holds back the next command until the timing requirements of the DDR4 are met. Once the bank is available, the requested read or write command is sent to the DDR4 and to the internal rank status machine, so that the rank status is updated each time a new command is issued to the DDR4. If there are outstanding requests from two or more queues, the Command Generation Unit processes the requests in a look-ahead way. While processing
the active request from the pipeline, if any unsatisfied timing constraint would force the controller to stall the transaction, the controller checks the next upcoming request in the queue. The Command Generation Unit checks the bank address of the next request to determine whether that bank is active and whether the activated row matches the requested row address. If this condition is not met, the Command Generation Unit performs the necessary precharge and activate sequence to activate that bank, so that the request incurs minimum stall time when it becomes the active request. Such deliberate bank activation for future requests makes use of the stall time of the active request. These opportunistic look-ahead activations improve overall throughput.

F. Data Transfer Unit
The Data Transfer Unit consists of a read FIFO and a write FIFO. The read FIFO receives data from the DDR4 and passes it to the user. The write FIFO receives write data from the user interface and passes it to the DDR4. Read data from the DDR4 arrives at the controller on both the positive and negative edges of the clock. The burst length of DDR4 is 8. The read FIFO captures this data in 4 clock cycles and provides the read data to the user in a single clock cycle on the next positive clock edge. Similarly, the write FIFO loads the user write data in a single clock cycle and sends it to the DDR4 over the next 4 clock cycles, on both positive and negative edges. During the write data transfer to the DDR4, it also appends the CRC code at the tail of each burst.

G. CRC Generator
As speed increases, data becomes more prone to error. To solve the data reliability issue during WRITE operations, DDR4 includes a Cyclic Redundancy Check (CRC) along with DBI. DDR4 uses an 8-bit CRC based on header error control (HEC). The CRC polynomial used by DDR4 is the ATM-8 HEC, X^8 + X^2 + X + 1.

Figure 8 shows the CRC generator. The DRAM generates a checksum for each write burst per data strobe lane. The CRC is computed over 72 bits of data, and all unallocated bits are set to 1. The DRAM compares its checksum with the one received from the controller. If the two checksums do not match, the DRAM flags an error.

latency, CAS latency, etc. After the mode registers are set, the Initialization and Refresh Unit switches on the calibration unit to perform read and write leveling. Upon completion of calibration, the DDR4 is ready for normal operation. The unit has a tREFI counter, which counts the DDR4 refresh interval. Each time the tREFI count expires, the controller issues a refresh command to the DDR4.

IV. CONCLUSIONS
The proposed architecture is designed and implemented according to the JEDEC DDR4 SDRAM specification. DDR4 has reduced power consumption, while its transfer rate is higher than that of its predecessors, i.e. DDR3, DDR2 and DDR. It is expected to be widely used in applications ranging from low-power mobile computing devices to high-density servers. To enhance throughput and reliability, the controller should be aware of both parallel bank operation and error detection. For that reason we proposed parallel bank group and bank status machines coordinated by the rank, and a separate data transfer unit with a CRC generator. For better performance, when the DDR4 reports an error status, the controller should be able to handle both the ongoing transaction and the suspended transaction traffic. This is the upcoming challenge for the DDR4 controller, which we will address in future work.

REFERENCES
[1] R. Ramakrishnan, “CAP and cloud data management,” Computer, vol. 45, no. 2, pp. 43–49, Feb. 2012.
[2] M. E. Tolentino, J. Turner, and K. W. Cameron, “Memory MISER: Improving main memory energy efficiency in servers,” IEEE Trans. Comput., vol. 58, no. 3, pp. 336–350, Mar. 2009.
[3] T.-Y. Oh et al., “A 7 Gb/s/pin GDDR5 SDRAM with 2.5 ns bank-to-bank active time and no bank-group restriction,” in IEEE ISSCC Dig. Tech. Papers, 2010, pp. 434–435.
[4] DDR4 SDRAM Specification (JESD79-4), JEDEC Standard, JEDEC Solid State Technology Association, September 2012.
[5] Kyomin Sohn, Taesik Na, Indal Song, Yong Shim, Wonil Bae, Sanghee Kang, Dongsu Lee, Hangyun Jung, Seokhun Hyun, Hanki Jeoung, Ki-Won Lee, Jun-Seok Park, Jongeun Lee, Byunghyun Lee, Inwoo Jun, Juseop Park, Junghwan Park, Hundai Choi, Sanghee Kim, Haeyoung Chung, Young Choi, Dae-Hee Jung, Byungchul Kim, Jung Hwan Choi, Seong-Jin Jang, Chi-Wook Kim, Jung-Bae Lee, and Joo Sun Choi, “A 1.2 V 30 nm 3.2 Gb/s/pin 4 Gb DDR4 SDRAM With Dual-Error Detection and PVT-Tolerant,” IEEE Journal of Solid-State Circuits, vol. 48, no. 1, January 2013.
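As an illustration of the CRC polynomial given in Section III-G, the following Python sketch computes an 8-bit CRC with the polynomial X^8 + X^2 + X + 1 and compares a locally computed checksum with a received one. It is a generic bitwise CRC-8 over whole bytes, processed most significant bit first with a zero initial value. Those conventions, the function names `crc8` and `write_burst_ok`, and the 9-byte example payload are assumptions; the exact bit mapping of the 72 input bits in JESD79-4 is not reproduced here.

```python
# Polynomial X^8 + X^2 + X + 1; the X^8 term is implicit in an 8-bit CRC.
CRC8_POLY = 0x07


def crc8(data: bytes, init: int = 0x00) -> int:
    """Bitwise CRC-8, MSB first, no reflection, no final XOR (assumed conventions)."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ CRC8_POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def write_burst_ok(data: bytes, received_crc: int) -> bool:
    """Receiver-side check: recompute the checksum and compare it."""
    return crc8(data) == received_crc


# Example: 72 bits (9 bytes) of input, with the last byte standing in for
# unallocated bits set to 1, as described in the text.
payload = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0, 0xFF])
checksum = crc8(payload)
print(write_burst_ok(payload, checksum))         # True
print(write_burst_ok(payload, checksum ^ 0x01))  # False
```

A single flipped bit in either the payload or the checksum changes the comparison result, which is the condition under which the DRAM would flag an error.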