RTL Coding Techniques
In Verilog there is a casez statement, a variation of the case statement that permits "z" and "?" values to be treated as "don't care" values during case comparison. "z" and "?" are treated as don't cares if they appear in the case expression and/or in the case item.
Guideline: Exercise caution when coding synthesizable models using the Verilog casez statement.
Coding Style Guideline: When coding a case statement with "don't cares," use a casez statement and use "?" characters instead of "z" characters in the case items to indicate "don't care" bits.
4/28/2012 1
CaseX
In Verilog there is a casex statement, a variation of the case statement that permits "z", "?" and "x" values to be treated as "don't care" values during case comparison. "x", "z" and "?" are treated as don't cares if they appear in the case expression and/or in the case item.
Guideline: Do not use casex for synthesizable code
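The risk behind this guideline can be seen in a small simulation-only sketch. The module and signal names below are hypothetical and not part of the original slides. When the case expression itself is unknown, casex treats every x bit as a wildcard and silently matches the first item, while casez falls through to the default and keeps the unknown visible.

```verilog
// Simulation-only sketch; module and signal names are illustrative.
module casex_demo;
  reg [1:0] sel;
  reg       y_casex, y_casez;

  // casex: x (and z) bits in sel are wildcards, so the first item matches.
  always @(sel)
    casex (sel)
      2'b00:   y_casex = 1'b0;
      2'b01:   y_casex = 1'b1;
      default: y_casex = 1'bx;
    endcase

  // casez: only z/? bits are wildcards, so an x in sel reaches the default.
  always @(sel)
    casez (sel)
      2'b00:   y_casez = 1'b0;
      2'b01:   y_casez = 1'b1;
      default: y_casez = 1'bx;
    endcase

  initial begin
    #1 sel = 2'b01;  // known value first, so the next change is an event
    #1 sel = 2'bxx;  // e.g. an uninitialized upstream signal
    #1 $display("sel=%b casex=%b casez=%b", sel, y_casex, y_casez);
  end
endmodule
```

With sel unknown, the casex branch drives a clean 0 and the casez branch drives x, so only the casez version lets the problem propagate to where it can be noticed in simulation.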
module mux3a (y, a, b, c, sel);
  output      y;
  input [1:0] sel;
  input       a, b, c;
  reg         y;

  // sel == 2'b11 is not covered, so a latch is inferred to hold y
  always @(a or b or c or sel)
    case (sel)
      2'b00: y = a;
      2'b01: y = b;
      2'b10: y = c;
    endcase
endmodule
synopsys full_case
module mux3b (y, a, b, c, sel);
  output      y;
  input [1:0] sel;
  input       a, b, c;
  reg         y;

  // The directive tells synthesis to treat the missing 2'b11 item as a don't care
  always @(a or b or c or sel)
    case (sel) // synopsys full_case
      2'b00: y = a;
      2'b01: y = b;
      2'b10: y = c;
    endcase
endmodule
Parallel case
A "parallel" case statement is a case statement in which it is only possible to match a case expression to one and only one case item. If it is possible to find a case expression that would match more than one case item, the matching case items are called "overlapping" case items and the case statement is not "parallel."
module intctl1a (int2, int1, int0, irq);
  output      int2, int1, int0;
  input [2:0] irq;
  reg         int2, int1, int0;

  // The case items overlap, so this casez is not parallel: the first match wins
  always @(irq) begin
    {int2, int1, int0} = 3'b0;
    casez (irq)
      3'b1??: int2 = 1'b1;
      3'b?1?: int1 = 1'b1;
      3'b??1: int0 = 1'b1;
    endcase
  end
endmodule
synopsys parallel_case
module intctl1b (int2, int1, int0, irq);
  output      int2, int1, int0;
  input [2:0] irq;
  reg         int2, int1, int0;

  // Same overlapping items, but the directive tells synthesis to assume they are parallel
  always @(irq) begin
    {int2, int1, int0} = 3'b0;
    casez (irq) // synopsys parallel_case
      3'b1??: int2 = 1'b1;
      3'b?1?: int1 = 1'b1;
      3'b??1: int0 = 1'b1;
    endcase
  end
endmodule
Do not use the // synopsys full_case directive. If it is used and not all cases are defined, it hides the fact that cases are missing, so it masks errors. In general, do not use Synopsys synthesis directives such as full_case and parallel_case in RTL code. These directives act as comments in the simulation tool but provide extra information to the synthesis tool, thereby creating a possibility of mismatch between pre-synthesis and post-synthesis simulation results. (full_case and parallel_case are Synopsys directives; other tools, including the Xilinx FPGA synthesis tools, may handle them differently, so check the tool documentation.)
Guideline: In general, do not use "full_case parallel_case" directives with any Verilog case statements.
Guideline: There are exceptions to the above guideline, but you had better know what you're doing if you plan to add "full_case parallel_case" directives to your Verilog code.
Guideline: Educate (or fire) any employee or consultant that routinely adds "full_case parallel_case" to all case statements in their Verilog code, especially if the project involves the design of medical diagnostic equipment, medical implants, or detonation logic for thermonuclear devices!
Guideline: Only use full_case parallel_case to optimize one-hot FSM designs.
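One way to follow these guidelines without a directive is to complete the case statement in the code itself. The sketch below is a hypothetical variant of the mux3a example (the name mux3c and the choice of an x default are assumptions): the default item makes the case statement full for both the simulator and the synthesis tool, so the two see the same logic.

```verilog
// Hypothetical variant of mux3a: a default item replaces the full_case directive.
module mux3c (y, a, b, c, sel);
  output      y;
  input [1:0] sel;
  input       a, b, c;
  reg         y;

  always @(a or b or c or sel)
    case (sel)
      2'b00:   y = a;
      2'b01:   y = b;
      2'b10:   y = c;
      default: y = 1'bx; // unused code: don't care in synthesis, visible x in simulation
    endcase
endmodule
```

Because y is assigned on every path, no latch is inferred, and an unexpected sel value of 2'b11 shows up as x in simulation instead of being hidden.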
Myth: "// synopsys full_case" removes all latches that would otherwise be inferred from a case statement.
Truth: The "full_case" directive only removes latches from a case statement for missing case items. One of the most common ways to infer a latch is to make assignments to multiple outputs from a single case statement but neglect to assign all outputs for each case item. Even adding the "full_case" directive to this type of case statement will not eliminate latches.
module addrDecode1a (mce0_n, mce1_n, rce_n, addr);
  output        mce0_n, mce1_n, rce_n;
  input [31:30] addr;
  reg           mce0_n, mce1_n, rce_n;

  // No case item assigns all three outputs, so latches are inferred despite full_case
  always @(addr)
    casez (addr) // synopsys full_case
      2'b10: {mce1_n, mce0_n} = 2'b10;
      2'b11: {mce1_n, mce0_n} = 2'b01;
      2'b0?: rce_n = 1'b0;
    endcase
endmodule
The easiest way to eliminate latches is to make initial default value assignments to all outputs immediately beneath the sensitivity list, before executing the case statement.
module addrDecode1d (mce0_n, mce1_n, rce_n, addr);
  output        mce0_n, mce1_n, rce_n;
  input [31:30] addr;
  reg           mce0_n, mce1_n, rce_n;

  always @(addr) begin
    // Default assignment to every output, so no latch is inferred
    {mce1_n, mce0_n, rce_n} = 3'b111;
    casez (addr)
      2'b10: {mce1_n, mce0_n} = 2'b10;
      2'b11: {mce1_n, mce0_n} = 2'b01;
      2'b0?: rce_n = 1'b0;
    endcase
  end
endmodule
Although good priority encoders can be inferred from case statements, following the above coding guidelines will help to prevent mistakes and mismatches between pre-synthesis and post-synthesis simulations.
- RTL code should be as simple as possible, with no fancy constructs.
- The RTL specification should be as close to the desired structure as possible without sacrificing the benefits of a high level of abstraction.
- Provide detailed documentation and keep the code readable (indentation and alignment).
- Signal and variable names should be meaningful to enhance readability.
- Do not use the initial construct in RTL code; there is no equivalent hardware for the initial construct in Verilog.
- All the flops in the design must be reset, especially in the control path.
- All assignments in a sequential procedural block must be nonblocking. Blocking assignments imply order, which may or may not be correctly duplicated in synthesized code.
- Use nonblocking assignments for sequential logic and latches.
- Do not mix blocking and nonblocking assignments within the same always block.
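The effect of nonblocking assignments can be sketched with a small pipeline. The module and signal names here are hypothetical. Because every right-hand side is sampled before any left-hand side is updated, the three statements can be written in any order and still describe three flip-flops in series.

```verilog
// Illustrative 3-stage pipeline; names are assumptions, not from the slides.
module pipe3 (q3, d, clk);
  output q3;
  input  d, clk;
  reg    q1, q2, q3;

  // Nonblocking assignments: statement order does not affect the result.
  always @(posedge clk) begin
    q3 <= q2;
    q2 <= q1;
    q1 <= d;
  end
endmodule
```

Had blocking assignments been used in the order q1, q2, q3, the value of d would reach q3 in a single clock edge in simulation, which is not a three-stage pipeline.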
- When modelling latches, use nonblocking assignments.
- When modelling combinational and sequential logic in the same always block, use nonblocking assignments.
- Do not make assignments to the same variable from more than one always block.
- Use $strobe to display values that have been assigned using nonblocking assignments.
- Do not make assignments using #0 delays (inactive events queue).
- The RTL specification should be as close to the desired structure as possible without sacrificing the benefits of a high level of abstraction.
- Names of signals and variables should be meaningful so that the code becomes self-documenting and readable.
- Mixing positive-edge and negative-edge triggered flip-flops may introduce inverters and buffers into the clock tree. This is often undesirable because it introduces clock skew into the circuit.
- Small blocks reduce the complexity of optimization for the logic synthesis tool.
- In general, any construct that is used to define a cycle-by-cycle RTL description is acceptable to the logic synthesis tool.
- While and forever loops must contain @(posedge clk) or @(negedge clk) for synthesis.
- Disabling of named blocks is allowed in synthesis.
- Delay information is ignored in synthesis.
- The === and !== operators, which compare X and Z values, are not supported by synthesis.
Coding tip: Use parentheses to group logic the way you want it to appear. Do not depend on operator precedence.
Operators supported for synthesis:
Operator                      Type
* / + - % +(unary) -(unary)   arithmetic
! && ||                       logical
> < >= <=                     relational
== !=                         equality
~ & | ^ ~^                    bitwise
& ~& | ~| ^ ~^                reduction
>> <<                         shift
{}                            concatenation
?: 	                          conditional
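A short sketch of why explicit grouping matters (the module and signal names are hypothetical). In Verilog the equality operator binds more tightly than bitwise AND, so the two expressions below describe different logic even though they look similar.

```verilog
// Illustrative only; names are assumptions, not from the slides.
module paren_demo (y_implicit, y_explicit, a, b, c);
  output [3:0] y_implicit, y_explicit;
  input  [3:0] a, b, c;

  // == has higher precedence than &, so this is a & (b == c).
  assign y_implicit = a & b == c;

  // Parentheses state the intent directly: compare (a & b) with c.
  assign y_explicit = (a & b) == c;
endmodule
```

For example, with a = 4'b0110, b = 4'b0011 and c = 4'b0010, the explicit form evaluates to 1 while the implicit form evaluates to 0.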
- The design is best kept synchronous and register based.
- Latches should only be used to implement memories or FIFOs.
- Aim for edge triggering in all register circuits.
- Edge triggering ensures that circuits change state only on clock events, which makes timing closure easier.
- Use as few clock domains as possible.
- If using numerous clock domains, document them fully.
- Keep the interconnection between clock domains simple and confined to one simple module.
- Bypass phase-locked loop (PLL) circuits for ease of testing.
Typically, a synchronous reset is preferred because it:
- is easy to synthesize
- avoids race conditions on reset
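A minimal sketch of a flip-flop with a synchronous reset, assuming an active-low reset named rst_n (the module and signal names are hypothetical). The reset is absent from the sensitivity list, so it is sampled only on the clock edge like any other data input.

```verilog
// Illustrative only; names and reset polarity are assumptions.
module dff_sync_rst (q, d, clk, rst_n);
  output q;
  input  d, clk, rst_n;
  reg    q;

  // rst_n is checked only at the rising clock edge.
  always @(posedge clk)
    if (!rst_n) q <= 1'b0;
    else        q <= d;
endmodule
```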
With asynchronous resets, the designer has to:
- worry about pulse width through the circuit
- synchronize the reset across the system to ensure that every part of the circuit resets properly in one clock cycle
Asynchronous resets also make static timing analysis more difficult.
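One commonly used way to address the synchronization concern is a reset synchronizer that asserts asynchronously and releases synchronously. The slides do not prescribe this circuit; the sketch below is an assumption, and the module and signal names are hypothetical.

```verilog
// Illustrative reset synchronizer; not taken from the original slides.
module rst_sync (rst_n_sync, clk, arst_n);
  output rst_n_sync;
  input  clk, arst_n;
  reg    meta, rst_n_sync;

  // Asserts as soon as arst_n goes low; releases two clock edges after
  // arst_n goes high, so the release is aligned to clk.
  always @(posedge clk or negedge arst_n)
    if (!arst_n) {rst_n_sync, meta} <= 2'b00;
    else         {rst_n_sync, meta} <= {meta, 1'b1};
endmodule
```

The output rst_n_sync would then drive the reset inputs of the flip-flops in that clock domain, so they all leave reset on the same clock edge.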
Tri-state is favored in PCB design as it reduces the number of wires.
On chip, you must ensure that:
- only one driver is active
- tri-state buses are not allowed to float
These issues can impact chip reliability. A MUX-based bus is preferred as it is safer and easy to implement.
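A MUX-based bus can be sketched as follows, assuming three 8-bit sources and a 2-bit select (the module name, widths and the all-zero default are assumptions). Exactly one source reaches the bus at a time, and the default item guarantees a defined value for the unused select code.

```verilog
// Illustrative MUX-based bus; names and widths are assumptions.
module bus_mux (bus, d0, d1, d2, sel);
  output [7:0] bus;
  input  [7:0] d0, d1, d2;
  input  [1:0] sel;
  reg    [7:0] bus;

  always @(d0 or d1 or d2 or sel)
    case (sel)
      2'b00:   bus = d0;
      2'b01:   bus = d1;
      2'b10:   bus = d2;
      default: bus = 8'h00; // the bus is always driven, never floating
    endcase
endmodule
```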
- Combinatorial procedural blocks should be fully specified; otherwise latches will be inferred.
- De-assert all control signals once their purpose is served (proper else conditions apart from reset).
Do not make any assignments in the RTL code using #delays, whether in the blocking assignment or in the non-blocking assignment. Do not even use #0 construct in the assignments.
Do not use any internally generated clocks in the design. These will cause a problem during the DFT stage in the ASIC flow.
Use the clock for synthesizing only sequential logic and not combinational logic, i.e. do not use the clock in always @ (clk or reset or state).
Do not use the `timescale directive in the RTL code. RTL code is meant to be technology independent, and the `timescale directive, which only affects #delays, has no meaning in RTL code.
- If an output does not switch/toggle, then it should not be an output; it can be hardwired into the logic.
- Do not put any logic in the top-level file except instantiations.
- Divide the bi-directional signals into input and output at the top level of the hierarchy.
- Code all intentional priority encoders using if-else-if-else constructs.
- The reset signal is not to be used in any combinational logic. (It does not make sense to use reset in combinational logic.)
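As a sketch of the first guideline, the interrupt controller from the earlier intctl1a example can be written with an if-else-if chain so that the priority is explicit in the code. The module name prio_enc is hypothetical.

```verilog
// Illustrative priority encoder; the module name is an assumption.
module prio_enc (int2, int1, int0, irq);
  output      int2, int1, int0;
  input [2:0] irq;
  reg         int2, int1, int0;

  always @(irq) begin
    {int2, int1, int0} = 3'b000;      // defaults avoid latch inference
    if      (irq[2]) int2 = 1'b1;     // highest priority
    else if (irq[1]) int1 = 1'b1;
    else if (irq[0]) int0 = 1'b1;     // lowest priority
  end
endmodule
```

The reader sees the priority order directly, and no synthesis directive is needed to describe it.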
Use only one clock source in every non-top module. Ideally, only the top module has multiple clock sources. This eases timing closure with most timing analysis tools.
- All state machines must be either initialized to a known state or must self-clear from every state.
- Future state determination will depend on registered state variables.
- State machines should be coded with case statements and parameters for state variables.
- For state machines, initialization and state transitions from unused states must be specified to prevent incorrect operation.
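These rules can be sketched with a small three-state machine. The states, ports and module name below are hypothetical. The state register is reset to a known state, the next state depends only on the registered state and inputs, the states are parameters used in a case statement, and the unused code 2'b11 returns to the idle state.

```verilog
// Illustrative FSM; states, ports and names are assumptions.
module fsm3 (done, go, clk, rst_n);
  output done;
  input  go, clk, rst_n;

  parameter S_IDLE = 2'b00, S_RUN = 2'b01, S_DONE = 2'b10;

  reg [1:0] state, next;
  reg       done;

  // State register, synchronously reset to a known state.
  always @(posedge clk)
    if (!rst_n) state <= S_IDLE;
    else        state <= next;

  // Next-state and output logic from the registered state and inputs.
  always @(state or go) begin
    next = S_IDLE;            // defaults keep the block fully specified
    done = 1'b0;
    case (state)
      S_IDLE:  if (go) next = S_RUN;
      S_RUN:   next = S_DONE;
      S_DONE:  begin done = 1'b1; next = S_IDLE; end
      default: next = S_IDLE; // unused code 2'b11 self-clears
    endcase
  end
endmodule
```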
- Code RTL with timing in mind (levels of combinatorial logic).
- Minimize ping-pong signals (signals that combinatorially bounce back to the same block).
- Register inputs and outputs to avoid loops.
All the elements within a combinatorial always block should be specified in the sensitivity list
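A minimal sketch of a complete sensitivity list (the module and signal names are hypothetical). Every signal read inside the block appears in the list, so the simulated behavior matches the combinational logic that synthesis builds.

```verilog
// Illustrative only; names are assumptions, not from the slides.
module and_or (y, a, b, c);
  output y;
  input  a, b, c;
  reg    y;

  // a, b and c are all read in the block, so all three are listed.
  // If c were omitted, simulation would miss changes on c while the
  // synthesized gates would still respond to them.
  always @(a or b or c)
    y = (a & b) | c;
endmodule
```

In Verilog-2001 and later, always @* builds the list from the signals read in the block, which removes the chance of leaving one out.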
- The design must be fully synchronous and must use only the rising edge of the clock.
- This rule results in insensitivity to the clock duty cycle and simplifies STA.
- All clock domain boundaries should be handled using 2-stage synchronizers.
- Never synchronize a bus through a 2-stage synchronizer.
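A 2-stage synchronizer for a single-bit signal can be sketched as two flip-flops in series in the receiving clock domain. The module and signal names and the synchronous active-low reset are assumptions. The first flop may go metastable when d_async changes near the clock edge; the second gives it a clock period to settle before the value is used.

```verilog
// Illustrative single-bit 2-stage synchronizer; names are assumptions.
module sync2 (q, d_async, clk, rst_n);
  output q;
  input  d_async, clk, rst_n;
  reg    meta, q;

  always @(posedge clk)
    if (!rst_n) begin
      meta <= 1'b0;
      q    <= 1'b0;
    end else begin
      meta <= d_async; // first stage: may go metastable
      q    <= meta;    // second stage: settled value for downstream logic
    end
endmodule
```

Each bit of a bus passed through separate copies of this circuit could resolve on different clock edges, which is why the second guideline forbids synchronizing a bus this way.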
- Major block modules should insert a number of spare gate modules on each clock domain.
- RTL code should be completely synthesizable by WHATEVERSYNTHESISTOOL (basic).
- Signals must be defined only in nonindependent processes: a signal cannot be defined and assigned in a process in which it is also in the sensitivity list.
- Do not ignore warnings in the synthesis report.
- Avoid using asynchronously resettable flip-flops; asynchronous lines are susceptible to glitches caused by crosstalk.
- Use asynchronous reset in the design only for power-on reset. Where it is used (central reset generation), such nets should undergo crosstalk analysis in the physical design phase.
- Combinational loops are not allowed; they must be broken by a flip-flop.
- Pulse generators are not allowed; they are susceptible to post-layout discrepancies and duty cycle variations.
- Instantiation of I/O buffers in the core logic is not allowed; internal logic must consist of core-logic elements only.
- Do not use partial decoding logic.
No latches should be used. Latches severely complicate STA and are more difficult to test.
3-state buses should not be used; they are susceptible to testing problems, delay inaccuracies and exceptionally high loads.
Use a clock naming convention to identify the clock source of every signal in a design.
Reason: A naming convention helps all team members to identify the clock domain for every signal in a design, and it also makes grouping of signals for timing analysis easier to do using regular expression wildcarding from within a synthesis script.
Only allow one clock per module.
Reason: STA and the creation of synthesis scripts are more easily accomplished on single-clock modules or groups of single-clock modules.
- Use $strobe instead of $display to display variables that have been assigned using the nonblocking assignment.
- The verification suite must demonstrate 100% code coverage.
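The difference between the two system tasks can be seen in a small simulation-only sketch (the module and signal names are hypothetical). $display runs immediately, before the nonblocking update has been applied, while $strobe runs at the end of the time step, after it.

```verilog
// Simulation-only sketch; names are assumptions, not from the slides.
module strobe_demo;
  reg clk, q;

  always @(posedge clk) begin
    q <= 1'b1;
    $display("display sees q=%b", q); // old value: update not yet applied
    $strobe ("strobe  sees q=%b", q); // new value: printed at end of time step
  end

  initial begin
    clk = 0;
    q   = 0;
    #1 clk = 1; // rising edge triggers the block above
  end
endmodule
```

At the rising edge, $display reports q as 0 and $strobe reports q as 1.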