Chisel Tutorial
0 Tutorial (Beta)
Jonathan Bachrach, Krste Asanović, John Wawrzynek
EECS Department, UC Berkeley
{jrb|krste|johnw}@eecs.berkeley.edu
2 Hardware expressible in Chisel

5.S(7.W) // signed decimal 7-bit lit of type SInt
5.U(8.W) // unsigned decimal 8-bit lit of type UInt
val sel = a | b
val out = (sel & in1) | (~sel & in0)

6 Functional Abstraction
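As a plain-Scala illustration (a software model, not Chisel hardware), the select expression above can be factored into a reusable function. The names BitMux and bitMux, and the use of Int as a stand-in for a bit vector, are assumptions made for this sketch.

```scala
// Software model of the bitwise select (sel & in1) | (~sel & in0).
// Assumption: Int stands in for a fixed-width bit vector; each bit of `sel`
// independently picks the corresponding bit of in1 (when 1) or in0 (when 0).
object BitMux {
  def bitMux(sel: Int, in1: Int, in0: Int): Int = (sel & in1) | (~sel & in0)
}
```

With sel = 0xF0, for instance, the upper nibble of the result comes from in1 and the lower nibble from in0.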
Example                                   Explanation
Bitwise operators. Valid on SInt, UInt, Bool.
val invertedX = ~x                        Bitwise NOT
val hiBits = x & "h_ffff_0000".U          Bitwise AND
val flagsOut = flagsIn | overflow         Bitwise OR
val flagsOut = flagsIn ^ toggle           Bitwise XOR
Bitwise reductions. Valid on SInt and UInt. Returns Bool.
val allSet = x.andR                       AND reduction
val anySet = x.orR                        OR reduction
val parity = x.xorR                       XOR reduction
Equality comparison. Valid on SInt, UInt, and Bool. Returns Bool.
val equ = x === y                         Equality
val neq = x =/= y                         Inequality
Shifts. Valid on SInt and UInt.
val twoToTheX = 1.S << x                  Logical left shift.
val hiBits = x >> 16.U                    Right shift (logical on UInt, arithmetic on SInt).
Bitfield manipulation. Valid on SInt, UInt, and Bool.
val xLSB = x(0)                           Extract single bit, LSB has index 0.
val xTopNibble = x(15,12)                 Extract bit field from end to start bit position.
val usDebt = Fill(3, "hA".U)              Replicate a bit string multiple times.
val float = Cat(sign,exponent,mantissa)   Concatenates bit fields, with first argument on left.
Logical operations. Valid on Bools.
val sleep = !busy                         Logical NOT
val hit = tagMatch && valid               Logical AND
val stall = src1busy || src2busy          Logical OR
val out = Mux(sel, inTrue, inFalse)       Two-input mux where sel is a Bool
Arithmetic operations. Valid on Nums: SInt and UInt.
val sum = a + b                           Addition
val diff = a - b                          Subtraction
val prod = a * b                          Multiplication
val div = a / b                           Division
val mod = a % b                           Modulus
Arithmetic comparisons. Valid on Nums: SInt and UInt. Returns Bool.
val gt = a > b                            Greater than
val gte = a >= b                          Greater than or equal
val lt = a < b                            Less than
val lte = a <= b                          Less than or equal
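
The reduction operators in the table can be checked against a small software model. The sketch below is plain Scala, not Chisel: it treats a non-negative Int as a bit vector of an explicitly given width, and the names BitReduce, andR, orR, and xorR are illustrative stand-ins for the hardware operators.

```scala
// Plain-Scala reference model of the bitwise reductions (not Chisel hardware).
// Assumption: `x` holds an unsigned value of `width` bits, 1 <= width <= 31.
object BitReduce {
  private def mask(width: Int): Int = (1 << width) - 1

  // AND reduction: true only if every one of the `width` bits is set.
  def andR(x: Int, width: Int): Boolean = (x & mask(width)) == mask(width)

  // OR reduction: true if at least one bit is set.
  def orR(x: Int, width: Int): Boolean = (x & mask(width)) != 0

  // XOR reduction: true if an odd number of bits are set (odd parity).
  def xorR(x: Int, width: Int): Boolean =
    Integer.bitCount(x & mask(width)) % 2 == 1
}
```

The width is passed explicitly because, unlike a Chisel UInt, a Scala Int does not carry its own bit width.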
(Note that we have to specify the type of the Vec elements inside the
trailing curly brackets, as we have to pass the bitwidth parameter into
the SInt constructor.)
The set of primitive classes (SInt, UInt, and Bool) plus the aggregate
classes (Bundles and Vecs) all inherit from a common superclass, Data.
Every object that ultimately inherits from Data can be represented as a
bit vector in a hardware design.
Bundles and Vecs can be arbitrarily nested to build complex data
structures:

class BigBundle extends Bundle {
  // Vector of 5 23-bit signed integers.
  val myVec = Vec(5, SInt(23.W))
  val flag = Bool()
  // Previously defined bundle.
  val f = new MyFloat()
}

Note that the builtin Chisel primitive and aggregate classes do not
require the new when creating an instance, whereas new user datatypes
will. A Scala apply constructor can be defined so that a user datatype
also does not require new, as described in Section 14.

8 Ports

By folding directions into the object declarations, Chisel is able to
provide powerful wiring constructs described later.

9 Modules

Chisel modules are very similar to Verilog modules in defining a
hierarchical structure in the generated circuit. The hierarchical module
namespace is accessible in downstream tools to aid in debugging and
physical layout. A user-defined module is defined as a class which:

• inherits from Module,
• contains an interface wrapped in an IO() function and stored in a port
  field named io, and
• wires together subcircuits in its constructor.

As an example, consider defining your own two-input multiplexer as a
module:

class Mux2 extends Module {
  val io = IO(new Bundle{
    val sel = Input(UInt(1.W))
    val in0 = Input(UInt(1.W))
    val in1 = Input(UInt(1.W))
    val out = Output(UInt(1.W))
  })
  })
  val m0 = Module(new Mux2())
  m0.io.sel := io.sel(0)
  m0.io.in0 := io.in0; m0.io.in1 := io.in1

  val m1 = Module(new Mux2())
  m1.io.sel := io.sel(0)
  m1.io.in0 := io.in2; m1.io.in1 := io.in3

  val m3 = Module(new Mux2())
  m3.io.sel := io.sel(1)
  m3.io.in0 := m0.io.out; m3.io.in1 := m1.io.out
  io.out := m3.io.out
}

We again define the module interface as io and wire up the inputs and
outputs. In this case, we create three Mux2 children modules, using the
Module constructor function and the Scala new keyword to create a new
object. We then wire them up to one another and to the ports of the Mux4
interface.

10 Running Examples

Now that we have defined modules, we will discuss how we actually run and
test a circuit.
Testing is a crucial part of circuit design, and thus in Chisel we provide
a mechanism for testing circuits by providing test vectors within Scala
using tester method calls which bind a tester to a module and allow users
to write tests using the given debug protocol. In particular, users
utilize:

• poke to set input port and state values,
• step to execute the circuit one time unit,

For example, in the following:

class Mux2Tests(c: Mux2, b: Option[TesterBackend] = None)
    extends PeekPokeTester(c, _backend=b) {
  val n = pow(2, 3).toInt
  for (s <- 0 until 2) {
    for (i0 <- 0 until 2) {
      for (i1 <- 0 until 2) {
        poke(c.io.sel, s)
        poke(c.io.in1, i1)
        poke(c.io.in0, i0)
        step(1)
        expect(c.io.out, (if (s == 1) i1 else i0))
      }
    }
  }
}

assignments for each input of Mux2 are set to the appropriate values using
poke. For this particular example, we are testing the Mux2 by hardcoding
the inputs to some known values and checking if the output corresponds to
the known one. To do this, on each iteration we generate appropriate
inputs to the module and tell the simulation to assign these values to the
inputs of the device we are testing, c, step the circuit 1 clock cycle,
and test the expected value. Steps are necessary to update registers and
the combinational logic driven by registers. For pure combinational paths,
poke alone is sufficient to update all combinational paths connected to
the poked input wire.
Finally, the following tester is invoked by calling runPeekPokeTester:

def main(args: Array[String]): Unit = {
  runPeekPokeTester(() => new GCD()){
    (c,b) => new GCDTests(c,b)}
}

This will run the tests defined in GCDTests with the GCD module being
simulated by the Firrtl interpreter. We can instead have the GCD module be
simulated by a C++ simulator generated by Verilator by calling the
following:

def main(args: Array[String]): Unit = {
  runPeekPokeTester(() => new GCD(), "verilator"){
    (c,b) => new GCDTests(c,b)}
}

This circuit has an output that is a copy of the input signal in delayed
by one clock cycle. Note that we do not have to specify the type of Reg as
it will be automatically inferred from its input when instantiated in this
way. In the current version of Chisel, clock and reset are global signals
that are implicitly included where needed.
Using registers, we can quickly define a number of useful circuit
constructs. For example, a rising-edge detector that takes a boolean
signal in and outputs
true when the current value is true and the previous value is false is
given by:

def risingedge(x: Bool) = x && !Reg(next = x)

Counters are an important sequential circuit. To construct an up-counter
that counts up to a maximum value, max, then wraps around back to zero
(i.e., modulo max+1), we write:

def counter(max: UInt) = {
  val x = Reg(init = 0.U(max.getWidth.W))
  x := Mux(x === max, 0.U, x + 1.U)
  x
}

has been defined. Because Scala evaluates program statements sequentially,
we allow data nodes to serve as a wire providing a declaration of a node
that can be used immediately, but whose input will be set later. For
example, in a simple CPU, we need to define the pcPlus4 and brTarget wires
so they can be referenced before defined:

val pcPlus4 = UInt()
val brTarget = UInt()
val pcNext = Mux(io.ctrl.pcSel, brTarget, pcPlus4)
val pcReg = Reg(next = pcNext, init = 0.U(32.W))
pcPlus4 := pcReg + 4.U
...
brTarget := addOut
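
Returning to the up-counter and the rising-edge detector defined above, their behaviour can be modelled cycle by cycle in plain Scala. This is a software sketch rather than Chisel code; the names SeqModel, counterTrace, and risingEdges are hypothetical, and registers are assumed to reset to 0 (false).

```scala
// Cycle-level software model of two register-based circuits (not Chisel).
object SeqModel {
  // Up-counter: the register starts at 0, counts to `max`, then wraps to 0,
  // mirroring x := Mux(x === max, 0.U, x + 1.U). Returns the value seen in
  // each of the first `cycles` clock cycles.
  def counterTrace(max: Int, cycles: Int): Seq[Int] =
    Iterator.iterate(0)(x => if (x == max) 0 else x + 1).take(cycles).toList

  // Rising-edge detector: output is true when the current sample is true and
  // the previous sample (held in a register, assumed reset to false) is false.
  def risingEdges(in: Seq[Boolean]): Seq[Boolean] = {
    val prev = false +: in.dropRight(1)
    in.zip(prev).map { case (cur, old) => cur && !old }
  }
}
```

With max = 2 the counter visits 0, 1, 2 and then wraps, which is the "modulo max+1" behaviour described in the text.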
[Figure residue: conditional register update, showing initial values 0,
condition c1, update values e1 and e2, and the statement
when (c1) { r := e1 }]

is the same as

when (!c) { body }

The update block can target multiple registers, and there can be different
overlapping subsets of registers present in different update blocks. Each
register is only affected by conditions in which it appears. The same is
possible for combinational circuits (update of a Wire). Note that all
combinational circuits need a default value. For example:

r := 3.S; s := 3.S
when (c1) { r := 1.S; s := 1.S }
when (c2) { r := 2.S }

leads to r and s being updated according to the following truth table:

c1  c2   r  s
 0   0   3  3
 0   1   2  3   // r updated in c2 block, s at top level.
 1   0   1  1
 1   1   2  1

is equivalent to:

when (idx === v1) { u1 }
.elsewhen (idx === v2) { u2 }

Chisel also allows a Wire, i.e., the output of some combinational logic,
to be the target of conditional update statements to allow complex
combinational logic expressions to be built incrementally. Chisel does not
allow a combinational output to be incompletely specified and will report
an error if an unconditional update is not encountered for a combinational
output.
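
The truth table above follows from the rule that the last enabled assignment wins. A minimal plain-Scala model (not Chisel; the names CondUpdate and update are hypothetical) applies the three statements in program order and reproduces each row:

```scala
// Software model of last-assignment-wins conditional updates (not Chisel).
object CondUpdate {
  // Returns the next values of (r, s) for the given conditions, mirroring:
  //   r := 3.S; s := 3.S
  //   when (c1) { r := 1.S; s := 1.S }
  //   when (c2) { r := 2.S }
  def update(c1: Boolean, c2: Boolean): (Int, Int) = {
    var r = 3; var s = 3            // unconditional defaults
    if (c1) { r = 1; s = 1 }        // overrides both when c1 holds
    if (c2) { r = 2 }               // later statement overrides r only
    (r, s)
  }
}
```

Because the defaults are assigned unconditionally first, every combination of c1 and c2 yields a defined value for both r and s.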
In Verilog, if a procedural specification of a combinational logic block
is incomplete, a latch will silently be inferred causing many frustrating
bugs.
It could be possible to add more analysis to the Chisel compiler, to
determine if a set of predicates covers all possibilities. But for now, we
require a single predicate that is always true in the chain of conditional
updates to a Wire.

11.3 Finite State Machines

A common type of sequential circuit used in digital design is a Finite
State Machine (FSM). An example of a simple FSM is a parity generator:

class Parity extends Module {
  val io = IO(new Bundle {
    val in = Input(Bool())
    val out = Output(Bool()) })
  val s_even :: s_odd :: Nil = Enum(2)
  val state = Reg(init = s_even)
  when (io.in) {
    when (state === s_even) { state := s_odd }
    when (state === s_odd) { state := s_even }
  }
  io.out := (state === s_odd)
}

where Enum(2) generates two UInt literals. States are updated when in is
true. It is worth noting that all of the mechanisms for FSMs are built
upon registers, wires, and conditional updates.
Below is a more complicated FSM example which is a circuit for accepting
money for a vending machine:

class VendingMachine extends Module {
  val io = IO(new Bundle {
    val nickel = Input(Bool())
    val dime = Input(Bool())
    val valid = Output(Bool()) })
  val s_idle :: s_5 :: s_10 :: s_15 :: s_ok :: Nil =
    Enum(5)
  val state = Reg(init = s_idle)
  when (state === s_idle) {
    when (io.nickel) { state := s_5 }
    when (io.dime) { state := s_10 }
  }
  when (state === s_5) {
    when (io.nickel) { state := s_10 }
    when (io.dime) { state := s_15 }
  }
  when (state === s_10) {
    when (io.nickel) { state := s_15 }
    when (io.dime) { state := s_ok }
  }
  when (state === s_15) {
    when (io.nickel) { state := s_ok }
    when (io.dime) { state := s_ok }
  }
  when (state === s_ok) {
    state := s_idle
  }
  io.valid := (state === s_ok)
}

Here is the vending machine FSM defined with a switch statement:

class VendingMachine extends Module {
  val io = IO(new Bundle {
    val nickel = Input(Bool())
    val dime = Input(Bool())
    val valid = Output(Bool())
  })
  val s_idle :: s_5 :: s_10 :: s_15 :: s_ok :: Nil =
    Enum(5)
  val state = Reg(init = s_idle)
  switch (state) {
    is (s_idle) {
      when (io.nickel) { state := s_5 }
      when (io.dime) { state := s_10 }
    }
    is (s_5) {
      when (io.nickel) { state := s_10 }
      when (io.dime) { state := s_15 }
    }
    is (s_10) {
      when (io.nickel) { state := s_15 }
      when (io.dime) { state := s_ok }
    }
    is (s_15) {
      when (io.nickel) { state := s_ok }
      when (io.dime) { state := s_ok }
    }
    is (s_ok) {
      state := s_idle
    }
  }
  io.valid := (state === s_ok)
}

12 Memories

Chisel provides facilities for creating both read only and read/write
memories.

12.1 ROM

Users can define read only memories with a Vec:

Vec(inits: Seq[T])
Vec(elt0: T, elts: T*)

where inits is a sequence of initial Data literals that initialize the
ROM. For example, users can create a small ROM initialized to 1, 2, 4, 8
and loop through all values using a counter as an address generator as
follows:

val m = Vec(Array(1.U, 2.U, 4.U, 8.U))
val r = m(counter(UInt(m.length.W)))

We can create an n value sine lookup table using a ROM initialized as
follows:

def sinTable (amp: Double, n: Int) = {
  val times =
    Range(0, n, 1).map(i => (i*2*Pi)/(n.toDouble-1) - Pi)
  val inits =
    times.map(t => SInt(round(amp * sin(t)), width = 32))
  Vec(inits)
}
def sinWave (amp: Double, n: Int) =
  sinTable(amp, n)(counter(UInt(n.W)))

where amp is used to scale the fixpoint values stored in the ROM.

12.2 Mem

Memories are given special treatment in Chisel since hardware
implementations of memory have many variations, e.g., FPGA memories are
instantiated quite differently from ASIC memories. Chisel defines a memory
abstraction that can map to either simple Verilog behavioral descriptions,
or to instances of memory modules that are available from external memory
generators provided by foundry or IP vendors.
Chisel supports random-access memories via the Mem construct. Writes to
Mems are positive-edge-triggered and reads are either combinational or
positive-edge-triggered.¹

¹ Current FPGA technology does not support combinational (asynchronous)
reads (anymore). The read address needs to be registered.

Ports into Mems are created by applying a UInt index. A 32-entry register
file with one write port and two combinational read ports might be
expressed as follows:

val rf = Mem(32, UInt(64.W))
when (wen) { rf(waddr) := wdata }
val dout1 = rf(waddr1)
val dout2 = rf(waddr2)

If the optional parameter seqRead is set, Chisel will attempt to infer
sequential read ports when the read address is a Reg. A one-read port,
one-write port SRAM might be described as follows:

val ram1r1w =
  Mem(1024, UInt(32.W))
val reg_raddr = Reg(UInt())
when (wen) { ram1r1w(waddr) := wdata }
when (ren) { reg_raddr := raddr }
val rdata = ram1r1w(reg_raddr)

Single-ported SRAMs can be inferred when the read and write conditions are
mutually exclusive in the same when chain:

val ram1p = Mem(1024, UInt(32.W))
val reg_raddr = Reg(UInt())
when (wen) { ram1p(waddr) := wdata }
.elsewhen (ren) { reg_raddr := raddr }
val rdata = ram1p(reg_raddr)

If the same Mem address is both written and sequentially read on the same
clock edge, or if a sequential read enable is cleared, then the read data
is undefined.
Mem also supports write masks for subword writes. A given bit is written
if the corresponding mask bit is set.

val ram = Mem(256, UInt(32.W))
when (wen) { ram.write(waddr, wdata, wmask) }

13 Interfaces & Bulk Connections

For more sophisticated modules it is often useful to define and
instantiate interface classes while defining the IO for a module. First
and foremost, interface classes promote reuse allowing users to capture
once and for all common interfaces in a useful form. Secondly, interfaces
allow users to dramatically reduce wiring by supporting bulk connections
between producer and consumer modules. Finally, users can make changes in
large interfaces in one place reducing the number of updates required when
adding or removing pieces of the interface.

13.1 Ports: Subclasses & Nesting

As we saw earlier, users can define their own interfaces by defining a
class that subclasses Bundle. For example, a user could define a simple
link for handshaking data as follows:

class SimpleLink extends Bundle {
  val data = Output(UInt(16.W))
  val valid = Output(Bool())
}

We can then extend SimpleLink by adding parity bits using bundle
inheritance:

class PLink extends SimpleLink {
  val parity = Output(UInt(5.W))
}

In general, users can organize their interfaces into hierarchies using
inheritance.
From there we can define a filter interface by nesting two PLinks into a
new FilterIO bundle:

class FilterIO extends Bundle {
  val x = new PLink().flip
  val y = new PLink()
}
where flip recursively changes the “gender” of a bundle, changing input to
output and output to input.
We can now define a filter by defining a filter class extending module:

class Filter extends Module {
  val io = IO(new FilterIO())
  ...
}

  f1.io.x <> io.x
  f1.io.y <> f2.io.x
  f2.io.y <> io.y
}

where <> bulk connects interfaces of opposite gender between sibling
modules or interfaces of same gender between parent/child modules. Bulk
connections

Now the control path can build an interface in terms of these interfaces:

class CpathIo extends Bundle {
  val imem = RomIo().flip()
  val dmem = RamIo().flip()
  ...
}
  val io = IO(new DpathIo())
  ...
  io.imem.raddr := ...
  io.dmem.raddr := ...
  io.dmem.wdata := ...
  ...
}

We can now wire up the CPU using bulk connects as we would with other
bundles:

class Cpu extends Module {
  val io = IO(new CpuIo())
  val c = Module(new CtlPath())
  val d = Module(new DatPath())
  c.io.ctl <> d.io.ctl
  c.io.dat <> d.io.dat
  c.io.imem <> io.imem
  d.io.imem <> io.imem
  c.io.dmem <> io.dmem
  d.io.dmem <> io.dmem
  d.io.host <> io.host
}

Repeated bulk connections of partially assigned control and data path
interfaces completely connect up the CPU interface.

14 Functional Module Creation

It is also useful to be able to make a functional interface for module
construction. For instance, we could build a constructor that takes
multiplexer inputs as parameters and returns the multiplexer output:

object Mux2 {
  def apply (sel: UInt, in0: UInt, in1: UInt) = {
    // A module instance must be wrapped in Module(...) before use.
    val m = Module(new Mux2())
    m.io.in0 := in0
    m.io.in1 := in1
    m.io.sel := sel
    m.io.out
  }
}

where object Mux2 creates a Scala singleton object on the Mux2 module
class, and apply defines a method for creation of a Mux2 instance. With
this Mux2 creation function, the specification of Mux4 now is
significantly simpler.

class Mux4 extends Module {
  val io = IO(new Bundle {
    val in0 = Input(UInt(1.W))
    val in1 = Input(UInt(1.W))
    val in2 = Input(UInt(1.W))
    val in3 = Input(UInt(1.W))
    val sel = Input(UInt(2.W))
    val out = Output(UInt(1.W))
  })
  io.out := Mux2(io.sel(1),
                 Mux2(io.sel(0), io.in0, io.in1),
                 Mux2(io.sel(0), io.in2, io.in3))
}

Selecting inputs is so useful that Chisel builds it in and calls it Mux.
However, unlike Mux2 defined above, the builtin version allows any
datatype on in0 and in1 as long as they are the same subclass of Data. In
Section 15 we will see how to define this ourselves.
Chisel provides MuxCase which is an n-way Mux:

MuxCase(default, Array(c1 -> a, c2 -> b, ...))

where each condition / value is represented as a tuple in a Scala array
and where MuxCase can be translated into the following Mux expression:

Mux(c1, a, Mux(c2, b, Mux(..., default)))

Chisel also provides MuxLookup which is an n-way indexed multiplexer:

MuxLookup(idx, default,
          Array(0.U -> a, 1.U -> b, ...))

which can be rewritten in terms of MuxCase as follows:

MuxCase(default,
        Array((idx === 0.U) -> a,
              (idx === 1.U) -> b, ...))

Note that the cases (e.g., c1, c2) must be in parentheses.

15 Polymorphism and Parameterization

Scala is a strongly typed language and uses parameterized types to specify
generic functions and classes. In this section, we show how Chisel users
can define their own reusable functions and classes using parameterized
classes.

This section is advanced and can be skipped at first reading.

15.1 Parameterized Functions

Earlier we defined Mux2 on Bool, but now we show how we can define a
generic multiplexer function. We define this function as taking a boolean
condition and con and alt arguments (corresponding to then and else
expressions) of type T:

def Mux[T <: Bits](c: Bool, con: T, alt: T): T = { ... }

where T is required to be a subclass of Bits. Scala ensures that in each
usage of Mux, it can find a common superclass of the actual con and alt
argument types, otherwise it causes a Scala compilation type error. For
example,
Mux(c, 10.U, 11.U)

yields a UInt wire because the con and alt arguments are each of type
UInt.
We now present a more advanced example of parameterized functions for
defining an inner product FIR digital filter generically over Chisel Nums.
The in-

A generic FIFO could be defined as shown in Figure 3 and used as follows:

class DataBundle extends Bundle {
  val A = UInt(32.W)
  val B = UInt(32.W)
}

object FifoDemo {
  def apply () = new Fifo(new DataBundle, 32)
}

The FIFO interface in Figure 3 can now be simplified as follows:
class Fifo[T <: Data] (data: T, n: Int)
    extends Module {
  val io = IO(new Bundle {
    val enq = new DecoupledIO( data ).flip()
    val deq = new DecoupledIO( data )
  })
  ...
}

17 Acknowledgements

Many people have helped out in the design of Chisel, and we thank them for
their patience, bravery, and belief in a better way. Many Berkeley EECS
students in the Isis group gave weekly feedback as the design evolved
including but not limited to Yunsup Lee, Andrew Waterman, Scott Beamer,
Chris Celio, etc. Yunsup Lee gave us feedback in response to the first
RISC-V implementation, called TrainWreck, translated from Verilog to
Chisel. Andrew Waterman and Yunsup Lee helped us get our Verilog backend
up and running and Chisel TrainWreck running on an FPGA. Brian Richards
was the first actual Chisel user, first translating (with Huy Vo) John
Hauser’s FPU Verilog code to Chisel, and later implementing generic memory
blocks. Brian gave many invaluable comments on the design and brought a
vast experience in hardware design and design tools. Chris Batten shared
his fast multiword C++ template library that inspired our fast emulation
library. Huy Vo became our undergraduate research assistant and was the
first to actually assist in the Chisel implementation. We appreciate all
the EECS students who participated in the Chisel bootcamp and proposed and
worked on hardware design projects all of which pushed the Chisel
envelope. We appreciate the work that James Martin and Alex Williams did
in writing and translating network and memory controllers and non-blocking
caches. Finally, Chisel’s functional programming and bit-width inference
ideas were inspired by earlier work on a hardware description language
called Gel [2] designed in collaboration with Dany Qumsiyeh and Mark
Tobenkin.

References

[1] Bachrach, J., Vo, H., Richards, B., Lee, Y., Waterman, A., Avižienis,
    R., Wawrzynek, J., Asanović, K. Chisel: Constructing Hardware in a
    Scala Embedded Language. In DAC ’12.
[2] Bachrach, J., Qumsiyeh, D., Tobenkin, M. Hardware Scripting in Gel. In
    Field-Programmable Custom Computing Machines, 2008. FCCM ’08. 16th.