10G Ethernet PCS/PMA v6.0: LogiCORE IP Product Guide
v6.0
Chapter 1: Overview
Applications
Unsupported Features
Licensing and Ordering Information
Appendix C: Debugging
Finding Help on Xilinx.com
Debug Tools
Simulation Debug
Hardware Debug
Ethernet systems and subsystems. Resources Performance and Resource Utilization web page
IEEE Standard 802.3-2012 clause 49, 72, 73, Test Bench Verilog and VHDL
Overview
10GBASE-R/KR is a 10 Gb/s serial interface. It provides the Physical Coding
Sublayer (PCS) and Physical Medium Attachment (PMA) functionality between the
10 Gigabit Media Independent Interface (XGMII) on a Ten Gigabit Ethernet Media
Access Controller (MAC) and a Ten Gigabit Ethernet network physical-side interface (PHY).
The 10GBASE-KR core is distinguished from the 10GBASE-R core by the addition of a Link
Training block as well as optional Auto-Negotiation (AN) and Forward Error Correction
(FEC) features, to support a 10 Gb/s data stream across a backplane.
10GBASE-R
For Zynq®-7000, UltraScale™, Virtex®-7, and Kintex®-7 devices, all of the PCS and
management blocks illustrated are implemented in logic, except for part of the Gearbox and
SERDES. Figure 1-1 shows the architecture.
[Figure 1-1: 10GBASE-R core block diagram. Recoverable labels: XGMII (SDR), 64b66b Encode, Scramble, Test Pattern Generate, Gearbox, txn,p]
• Receive path, including block synchronization, descrambler, decoder and BER (Bit Error
Rate) monitor
• Elastic buffer in the receive datapath.
The elastic buffer is 32 words deep (1 word = 64 bits of data + 8 control bits). (For 32-bit
10GBASE-R cores, the elastic buffer is twice the depth and half the width, but has the
same properties.) If the buffer empties, local fault codes are inserted instead of data.
This depth allows up to 64 clock correction (CC) sequences to be collected before the buffer
overflows (and words are dropped). The buffer normally fills to half full, then
deletes CC sequences when more than half full and inserts CC sequences when less than half
full. From a half-full state, you can therefore (conservatively) accept an extra 360 KB of data
(that is, receiving at +200 ppm) without dropping any. Likewise, from a half-full state you can
cope with another 360 KB of data without inserting local faults (for –200 ppm).
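As a rough cross-check of the 360 KB figure, the sketch below converts a frequency offset in ppm into the number of surplus (or missing) bytes accumulated while a given amount of data is received. The function name, the reading of "KB" as 1000 bytes, and the decision to count only the 64 data bits of each word are assumptions made for illustration; this guide does not state how the figure was derived.

```python
def drift_bytes(data_bytes: int, offset_ppm: int) -> int:
    """Bytes gained or lost relative to the local clock after receiving
    data_bytes at a frequency offset of offset_ppm (integer arithmetic)."""
    return data_bytes * offset_ppm // 1_000_000


WORD_DATA_BYTES = 8        # 1 word = 64 bits of data (control bits ignored here)
BUFFER_DEPTH_WORDS = 32    # depth quoted for the 64-bit core

# Headroom on either side of the half-full point, in data bytes.
HALF_FULL_BYTES = (BUFFER_DEPTH_WORDS // 2) * WORD_DATA_BYTES

# Drift accumulated over 360 KB of data at a 200 ppm offset.
slack_needed = drift_bytes(360_000, 200)

print(slack_needed, HALF_FULL_BYTES)
```

Under these assumptions, 360 KB at 200 ppm corresponds to 72 bytes (nine 64-bit words) of drift, which fits within the 16 words available on either side of the half-full point. That is consistent with the figure being described as conservative.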
10GBASE-KR
Figure 1-2 illustrates a block diagram of the 10GBASE-KR core implementation. The major
functional blocks include the following:
The elastic buffer is 32 words deep (1 word = 64 bits of data + 8 control bits). If the buffer
empties, local fault codes are inserted instead of data. This depth allows up to 64
clock correction (CC) sequences to be collected before the buffer overflows (and words are dropped).
The buffer normally fills to half full, then deletes CC sequences when more than half
full and inserts CC sequences when less than half full. From a half-full state, you can therefore
(conservatively) accept an extra 360 KB of data (that is, receiving at +200 ppm) without
dropping any. Likewise, from a half-full state you can cope with another 360 KB of data without
inserting local faults (for –200 ppm).
[Figure 1-2: 10GBASE-KR core block diagram. Recoverable labels: User Logic, XGMII (SDR), PCS, FEC, AN, TRAIN, Elastic Buffer, SERDES, Fabric, GTHE2/GTHE3/GTYE3, MDIO, Control + Status, PCS/PMA Registers, txn,p, rxn,p]
Applications
Figure 1-3 shows a typical Ethernet system architecture and the core within it. The MAC and
all the blocks to the right are defined in IEEE Std 802.3 [Ref 1].
The 10G Ethernet PCS/PMA core is designed to be attached to the Xilinx IP 10G Ethernet
MAC core over XGMII. More details are provided in Chapter 3, Designing with the Core.
[Figure 1-3: typical Ethernet system architecture. Recoverable labels: FPGA, User Logic, 10G Ethernet MAC, XGMII, 10G Ethernet PCS/PMA Core, Serial Interface, 10GBASE-R PHY (or 10GBASE-KR Backplane)]
Unsupported Features
The following features are not supported in this release of the core.
While the Training Protocol is supported natively by the core, no logic is provided that
controls the far-end transmitter adaptation based on analysis of the received signal quality.
This is because extensive testing has shown that to be unnecessary.
However, a training interface is provided on the core that allows access to all core registers
and to the DRP port on the transceiver. You can employ this interface to implement your
own Training Algorithm for 10GBASE-KR, if required.
• Vivado Synthesis
• Vivado Implementation
• write_bitstream (Tcl command)
IMPORTANT: IP license level is ignored at checkpoints. The test confirms a valid license exists. It does
not check IP license level.
License Type
10G Ethernet PCS/PMA (10GBASE-R)
This Xilinx LogiCORE™ IP module is provided at no additional cost with the Xilinx Vivado®
Design Suite under the terms of the Xilinx End User License. Information about this and
other Xilinx LogiCORE IP modules is available at the Xilinx Intellectual Property page. For
information about pricing and availability of other Xilinx LogiCORE IP modules and tools,
contact your local Xilinx sales representative.
For more information, visit the 10 Gigabit Ethernet PCS/PMA (10GBASE-R) product web
page.
For more information, visit the 10 Gigabit Ethernet PCS/PMA with FEC/Auto-Negotiation
(10GBASE-KR) product web page. The 10G/25GBASE-KR/CR license key is bundled with this
product. For more information, visit the 10G/25G Ethernet Subsystem product web page.
Information about this and other Xilinx LogiCORE IP modules is available at the Xilinx
Intellectual Property page. For information on pricing and availability of other Xilinx
LogiCORE IP modules and tools, contact your local Xilinx sales representative.
Product Specification
Standards
The 10GBASE-R/KR core is designed to the standard specified in clauses 45, 49, 72, 73 and
74 of the 10 Gigabit Ethernet specification IEEE Std 802.3 [Ref 1].
Performance
Transceiver Latency
See the 7 Series Transceivers User Guide (UG476) [Ref 3], the UltraScale Architecture GTH
Transceivers User Guide (UG576) [Ref 4], and the UltraScale Architecture GTY Transceivers
User Guide (UG578) [Ref 5] for information on the transceiver latency.
As measured from the input port xgmii_txd[63:0] of the transmitter side XGMII (until
that data appears on gt_txd[31:0] on the transceiver interface), the latency through the
7 series core for the XGMII interface configuration in the transmit direction is 20 periods of
txoutclk. When the optional FEC functionality is included in the core and enabled, this
increases to 26 periods of txoutclk.
Measuring in the same way for an UltraScale™ device, the transmit latency is six periods of
the 156.25 MHz transmit clock, which increases to 12 periods when FEC is included and
enabled.
Latency in the receive direction is variable and depends mainly on the fill level of the receive
elastic buffer.
Measured from the input into the core on gt_rxd[31:0] until the data appears on
xgmii_rxd[63:0] of the receiver side XGMII interface, the latency through the 7 series
core in the receive direction is nominally equal to 1831 UI, or 27.75 cycles of coreclk,
increasing to 2723 UI, or 41.26 cycles of coreclk when the elastic buffer is at its fullest
possible level. The exact latency depends on sync bit alignment position and data
positioning within the transceiver 4-byte interface. For UltraScale devices, excluding the
elastic buffer, the latency through the core in the receive direction is nominally equal to
seven cycles of the 156.25 MHz receive clock. The latency through the elastic buffer is the
same as calculated for 7 series devices (the number of cycles is for the 156.25 MHz receive
clock).
When the optional FEC functionality is included in the core, the UltraScale core has a single
extra cycle of the 156.25 MHz receive clock. For all devices, the latency increases by 70 cycles of
rxrecclk_out when FEC is enabled. If error reporting to the PCS layer is also enabled,
there is an extra 66 cycles of rxrecclk_out latency.
As measured from the input port xgmii_txd[31:0] of the transmitter side XGMII (until
that data appears on gt_txd[31:0] on the transceiver interface), the latency through the
core for the XGMII interface configuration in the transmit direction for 7 series devices is 14
periods of txoutclk. For UltraScale devices this is eight periods of the 312.5 MHz transmit
clock.
Latency in the Receive direction is variable and depends mainly on the fill level of the
receive elastic buffer.
Measured from the input into the core on gt_rxd[31:0] until the data appears on
xgmii_rxd[31:0] of the receiver side XGMII interface, the latency through the core in the
receive direction for 7 series devices is nominally equal to 1472 UI, or 44.6 cycles of
coreclk, increasing to ~72 cycles of coreclk when the elastic buffer is at its fullest
possible level. The exact latency depends on sync bit alignment position and data
positioning within the transceiver 4-byte interface. For UltraScale devices, excluding the
elastic buffer, the latency is eight cycles of the 312.5 MHz receive clock.
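The UI and coreclk cycle figures quoted in this section can be cross-checked with a short calculation. The sketch below assumes the standard 10GBASE-R serial line rate of 10.3125 Gb/s, which is not stated in this section, so that one unit interval (UI) is one bit time on the serial link. The function and constant names are illustrative only.

```python
LINE_RATE_BPS = 10_312_500_000  # assumed 10GBASE-R serial line rate (10.3125 Gb/s)
CORECLK_64_HZ = 156_250_000     # coreclk for the 64-bit datapath
CORECLK_32_HZ = 312_500_000     # coreclk for the 32-bit datapath


def ui_per_cycle(clock_hz: int) -> float:
    """Number of serial bit times (UI) in one period of the given clock."""
    return LINE_RATE_BPS / clock_hz


def ui_to_cycles(ui: int, clock_hz: int) -> float:
    """Convert a latency in UI to periods of the given clock."""
    return ui / ui_per_cycle(clock_hz)


def ui_to_ns(ui: int) -> float:
    """Convert a latency in UI to nanoseconds."""
    return ui / LINE_RATE_BPS * 1e9


for ui, clk in [(1831, CORECLK_64_HZ), (2723, CORECLK_64_HZ), (1472, CORECLK_32_HZ)]:
    print(f"{ui} UI = {ui_to_cycles(ui, clk):.2f} cycles = {ui_to_ns(ui):.1f} ns")
```

With this line rate there are 66 UI per 156.25 MHz cycle and 33 UI per 312.5 MHz cycle, and the computed cycle counts agree with the values quoted above to within rounding.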
Resource Utilization
For details about resource utilization, visit Performance and Resource Utilization.
Port Descriptions
This section provides information about the ports for the XGMII interface and for the serial
data interface. Additionally, information is provided about the ports for the management
interface (MDIO) and its alternative, the vector-based configuration and status signals.
Information is also provided about the clock and reset signals, the DRP training interface
ports, the transceiver debug ports and miscellaneous core signals.
64-Bit XGMII
When the 64-bit datapath is selected, the MAC (or client) side of the core has a 64-bit
datapath plus eight control bits implementing an XGMII interface. Table 2-1 defines the
signals, which are all synchronous to a 156.25 MHz clock source; the relevant clock port is
dependent upon the family and core permutation. It is designed to be connected to either
user logic within the FPGA or, by using SelectIO™ technology Double Data Rate (DDR)
registers in your own top-level design, to provide an external 32-bit DDR XGMII, defined in
clause 46 of IEEE Std 802.3. TX clock source and RX clock source are defined in Table 3-1.
32-bit XGMII
When the 32-bit datapath is selected, the MAC (or client) side of the core has a 32-bit
datapath plus four control bits implementing an XGMII interface. Table 2-2 defines the
signals, which are all synchronous to a 312.5 MHz clock source; the relevant clock port is
dependent upon the family and core permutation. It is designed to be connected to user
logic within the FPGA. TX clock source and RX clock source are defined in Table 3-1.
Notes:
1. When an optical module is present, the logical NOR of MODDEF0 and LOS (Loss of Signal) outputs should be used
to create the signal_detect input to the core.
2. This signal is not connected inside this version of the core. You should handle these inputs and reset your design
as required.
3. Connect to SFP+ tx_fault signal, or XFP MOD_NR signal, depending on which is present.
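Note 1 above can be illustrated with a small truth-table sketch. It is a behavioral model only, written in Python rather than HDL, and the function name is hypothetical. It assumes both module outputs are active-High indications, so that signal_detect is asserted only when both are Low.

```python
def signal_detect(moddef0: bool, los: bool) -> bool:
    """Logical NOR of MODDEF0 and LOS: True only when both inputs are Low."""
    return not (moddef0 or los)


# Print the full truth table: MODDEF0, LOS, signal_detect.
for moddef0 in (False, True):
    for los in (False, True):
        print(int(moddef0), int(los), int(signal_detect(moddef0, los)))
```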
In this core, the MDIO interface is an optional block. If implemented, the bidirectional data
signal MDIO is implemented as three unidirectional signals. These can be used to drive a
3-state buffer either in the FPGA IOB or in a separate device.
Table 2-6 shows the ports on the core that are associated with clocks and resets.
Notes:
1. For UltraScale devices the DCLK must be free-running and the frequency must be kept less than or equal to the
maximum DRPCLK frequency specified for the transceiver type or the TXUSRCLK2 frequency, which is 156.25 MHz
for 64-bit datapaths.
2. This reset also resets all management registers.
Table 2-7 shows the ports on the core that are associated with these clocks and resets,
which can be reused by other user logic or IP cores.
Notes:
1. For UltraScale devices the DCLK must be free-running and the frequency must be kept less than or equal to the
maximum DRPCLK frequency specified for the transceiver type or the TXUSRCLK2 frequency, which is 156.25 MHz
for 64-bit datapaths.
2. This reset also resets all management registers.
Notes:
1. This signal has no meaning or effect when the core is created without an MDIO interface because all registers are
exposed through the configuration and status vectors. This should be tied to 0 in that case. Access to transceiver
DRP registers through the training interface is unaffected.
Figure 2-1 and Figure 2-2 show the timing diagrams for using the training interface to
access internal core registers and transceiver registers through the DRP port. As shown,
training_drp_cs, training_ipif_cs, and training_enable should be brought Low
between read or write accesses. The clock for Figure 2-1 and Figure 2-2 can be determined
from Table 3-1.
Figure 2-1: Using the Training Interface to Access Internal Core Registers
[Timing diagram. Signals shown: TRAINING_WRDATA, TRAINING_WRACK, TRAINING_RDDATA, TRAINING_RDACK, TRAINING_ENABLE, TRAINING_IPIF_CS, TRAINING_DRP_CS, TRAINING_ADDR, TRAINING_RNW]
Figure 2-2: Using the Training Interface to Access Transceiver Registers through the DRP Port
[Timing diagram. Signals shown: TRAINING_WRDATA, TRAINING_WRACK, TRAINING_RDDATA, TRAINING_RDACK, TRAINING_ENABLE, TRAINING_IPIF_CS, TRAINING_DRP_CS, TRAINING_ADDR, TRAINING_RNW]
All signals in Table 2-9 are synchronous to the dclk input of the core.
UltraScale Architecture
To facilitate the connection of user logic to the DRP interface of the transceiver, the
interface between the core logic and the transceiver is brought out to an interface that can
be connected to an external Arbiter block. The interface directly to the transceiver DRP is
also provided.
All signals in Table 2-10 are synchronous to the dclk input of the core.
Miscellaneous Ports
The signals in Table 2-11 apply to all supported devices.
Notes:
1. This bit is equivalent to the FEC block lock if FEC is included in the core and FEC is enabled AND Training Done AND
signal_detect AND an_link_up.
If FEC is not included or is not enabled, this bit is equivalent to Training Done AND signal_detect AND
an_link_up.
2. This is equivalent to Training Done AND signal_detect.
3. The latter two signals are required in the core to enable switching of transceiver RX modes during
auto-negotiation. When the optional auto-negotiation block is not included with the core, or is included but
disabled by either the an_enable pin on the core (simulation-only) or by the management register 7.0.12,
an_link_up (bit 5) is fixed to a constant 1 and bits 3 and 4 are fixed to a constant 0.
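The combination described in Note 1 can be written out as a small boolean model. This is a sketch of the stated logic only, not the core's implementation, and the function and parameter names are hypothetical.

```python
def note1_status(training_done: bool, signal_detect: bool, an_link_up: bool,
                 fec_included: bool = False, fec_enabled: bool = False,
                 fec_block_lock: bool = False) -> bool:
    """Status bit per Note 1: Training Done AND signal_detect AND an_link_up,
    additionally gated by FEC block lock when FEC is included and enabled."""
    base = training_done and signal_detect and an_link_up
    if fec_included and fec_enabled:
        return bool(fec_block_lock and base)
    return bool(base)


# Per Note 3, an_link_up is a constant 1 when auto-negotiation is absent or disabled.
print(note1_status(True, True, an_link_up=True))
```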
Speeding up Simulation
Direct control of some timers in the core is provided for use before and after
implementation. To use the shorter timer values, drive sim_speedup_control Low until
after GSR has fallen (typically after 100 ns of simulation time) and then High and hold it
High.
To remove short-cut logic automatically, tie the port to either 0 or 1 before the final
implementation stage. This allows the optimization step to remove the logic.
While tying the port off for final implementation is recommended, you can leave it
connected to a pin on the device. As long as that pin is not driven Low and then High, the
speedup values for the timers will never be used.
The timer that is sped up with this control is the transceiver RX reset timer. This timer delays
the assertion of RXUSERRDY, and its value is reduced from 37 million UI to 50,000 UI. Also, for
BASE-KR cores, the auto-negotiation Break Link Timer value is reduced from around 67 ms
to just 6.4 μs.
IMPORTANT: The ports in the transceiver Control And Status Interface must be driven in accordance
with the appropriate GT user guide. Using the input signals listed in Table 2-12 might result in
unpredictable behavior of the core.
Notes:
1. This output is 8-bits wide for the GTXE2 transceiver and 15 bits for the GTHE2 transceiver.
Notes:
1. This input is 4 bits wide for the GTHE3 transceiver but 5 bits wide for the GTYE3/GTHE4/GTYE4 transceiver.
See the Register Space section for information about the registers emulated with these
configuration and status vectors.
Some IEEE registers are defined as set/clear-on-read, and because there is no read when
using the configuration and status vectors, special controls have been provided to imitate
that behavior. See Figure 2-3 and Figure 2-4.
BASE-R
Table 2-15 shows the breakdown of the 10GBASE-R-specific configuration vector and
Table 2-16 shows the breakdown of the status vector. Any bits not mentioned are assumed
to be 0s. The TX clock source is defined in Table 3-1.
Notes:
1. These reset signals should be asserted for a single clock tick only.
2. Reset controls for the given registers.
3. Typically constant.
Notes:
1. This signal should be asserted for at most 3 cycles of the associated clock.
2. This bit is a logical OR of two latching bits and so will exhibit latching behavior without actually being latching
itself.
3. This bit is a constant and clock domain is not applicable.
4. These bits are only valid for 10GBASE-R cores.
BASE-KR
Table 2-17 shows the additional signals in the configuration vector which are specific to
BASE-KR functionality. TX clock source is defined in Table 3-1.
Notes:
1. Only valid when the optional FEC block is included
2. If FEC is enabled during auto-negotiation then this register bit is overridden by the auto-negotiation control of
FEC. So even with this bit set to 0, if auto-negotiation results in FEC being enabled, FEC is enabled and cannot be
disabled except by changing the auto-negotiation Base Page Ability bits 46 and 47, and re-negotiating the link.
FEC can still be enabled explicitly by setting this bit to 1 which overrides whatever auto-negotiation decides.
3. Only valid when the optional AN block is included
4. Reset controls for the given registers
5. Toggle to load the AN Page data from the associated configuration vector bits
6. If FEC Error Passing is enabled while FEC is enabled, errors will be seen temporarily. To avoid this, only enable Error
Passing while FEC is disabled.
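The interaction described in Note 2 amounts to a logical OR between the FEC enable bit and the auto-negotiation result. The sketch below models that reading for a core that includes the FEC block. The function and parameter names are hypothetical.

```python
def fec_is_active(fec_enable_bit: bool, an_selected_fec: bool) -> bool:
    """FEC runs if the register bit requests it or auto-negotiation selected it.
    Assumes the optional FEC block is included in the core."""
    return fec_enable_bit or an_selected_fec


# Bit cleared, but auto-negotiation resolved to FEC: FEC stays enabled.
print(fec_is_active(False, True))
```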
Table 2-18 shows the additional signals in the status vector which are specific to BASE-KR
functionality.
Notes:
1. This bit is a constant and clock domain is not applicable.
2. Only valid when the optional FEC block is included
3. Only valid when the optional AN block is included
Bit 286 of the status vector is latching-High and is cleared Low by bit 518 of the
configuration_vector port. Figure 2-3 shows how the status bit is cleared.
[Figure 2-3: waveform. Signals shown: status_vector[286], configuration_vector[518]]
[Figure 2-4: waveform. Signals shown: status_vector[18], status_vector[226], or status_vector[287]; configuration_vector[512], configuration_vector[516], or configuration_vector[518], respectively]
• Status bits 285:272 are also reset using configuration vector bit 518
• Status bits 303:288 are reset using configuration vector bit 519
For BASE-KR cores, similar reset behaviors exist for the following status vector bits:
• status vector bits 159:128 – cleared with configuration vector bit 514
• status vector bits 191:160 – cleared with configuration vector bit 515
• status vector bit 322 – set with configuration vector bit 520
• status vector bit 324 – cleared with configuration vector bit 521
• status vector bit 326 – cleared with configuration vector bit 522
Finally, configuration vector bits 335:293 and 383:336 each implement three registers which
are normally latched into the core when the lower register is written, keeping the data
coherent. Because there is no need for this behavior when the entire vector is exposed,
these bits are latched into the core whenever configuration register bits 523 and 524
respectively are toggled High.
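The latching-High behavior described in this section, for example status_vector[286] being cleared by configuration_vector[518], can be modeled cycle by cycle as follows. This is a behavioral sketch only. The class name is hypothetical, and the choice that a still-active condition re-latches the bit in the same cycle as a clear is an assumption, not something this guide specifies.

```python
class LatchingHighBit:
    """A status bit that sticks High once set and is cleared by a control pulse."""

    def __init__(self) -> None:
        self.value = False

    def tick(self, condition: bool, clear: bool) -> bool:
        """Advance one clock cycle.

        condition: the live event that sets the bit.
        clear:     the single-cycle pulse on the matching configuration vector bit.
        """
        if clear:
            self.value = False
        if condition:
            # Assumption: an active condition wins over a simultaneous clear.
            self.value = True
        return self.value


bit = LatchingHighBit()
trace = [bit.tick(c, k) for c, k in [(True, False), (False, False), (False, True)]]
print(trace)
```

The bit stays High after the event has gone away and drops only when the clear pulse arrives, which is how the vectors imitate the clear-on-read behavior of the MDIO registers.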
Register Space
This core implements registers which are further described in 802.3 Clause 45. If the core is
generated without an MDIO interface, these registers are still implemented but accessed
using configuration or status pins on the core. For example, register 1.0, bit 15 (PMA Reset)
is implemented as bit 15 of the configuration vector and register 1.1, bit 7 (PMA/PMD Fault)
is implemented as status vector bit 23. These mappings are described in Configuration and
Status Signals.
If the core is configured as a 10GBASE-R PCS/PMA, it occupies MDIO Device Addresses 1 and
3 in the MDIO register address map, as shown in Table 2-19.
Notes:
1. For cores with optional FEC block
2. For cores with optional AN block
[Figure: Reg 1.0 bit layout. Fields: RESET, SPEED (three fields), POWER DOWN, LOOPBACK; remaining bits reserved]
[Figure: Reg 1.1 bit layout. Fields: LOCAL FAULT, RX LINK STATUS, POWERDOWN ABILITY; remaining bits reserved]
[Figure: Reg 1.4 bit layout. Fields: 10G CAPABLE; remaining bits reserved]
Table 2-23 shows the PMA/PMD Speed Ability register bit definitions.
[Figure: Reg 1.6 bit layout. Fields: VENDOR2 PRESENT, VENDOR1 PRESENT, CLAUSE 22 EXT.N PRESENT; remaining bits reserved]
[Figure: Reg 1.5 bit layout. Fields: AN PRESENT, TC PRESENT, DTE XS PRESENT, PHY XS PRESENT, PCS PRESENT, WIS PRESENT, PMD/PMA PRESENT, CLAUSE 22 PRESENT; remaining bits reserved]
[Figure: Reg 1.7 bit layout. No field labels recoverable other than RSVD]
Notes:
1. BASE-R: Set from pma_pmd_type port.
BASE-KR: returns 0xB
[Figure: Reg 1.8 bit layout. Fields: DEVICE PRESENT, TX FAULT ABILITY, RX FAULT ABILITY, TX FAULT, RX FAULT, PMD TX DISABLE ABILITY, 10GBASE-SR ABILITY, 10GBASE-LR ABILITY, 10GBASE-ER ABILITY, 10GBASE-LX4 ABILITY, 10GBASE-SW ABILITY, 10GBASE-LW ABILITY, 10GBASE-EW ABILITY, PMA LOOPBACK ABILITY; remaining bits reserved]
Notes:
1. Depends on pma_pmd_type port
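[Figure: Reg 1.9 bit layout. No field labels recoverable other than RSVD]
[Figure: Reg 1.10 bit layout. No field labels recoverable other than RSVD]
[Figure: Reg 1.150 bit layout. Fields: Enable Training, Restart Training; remaining bits reserved]
[Figure: Reg 1.151 bit layout. Fields: Training failure, Start-up protocol status, Frame lock, Receiver status; remaining bits reserved]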
Table 2-30 shows the 10GBASE-KR PMD Status register bit definitions.
[Figure: Reg 1.152 bit layout. Fields: Preset, Initialize; other field labels not recoverable]
Notes:
1. Writable only when register 1.150.1 = 0
[Figure: Reg 1.153 bit layout. Fields: Receiver Ready; other field labels not recoverable]
[Figure: Reg 1.154 bit layout. Fields: Preset, Initialize; other field labels not recoverable]
Notes:
1. These registers are programmed by writing to register 1.65520.
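[Figure: Reg 1.155 bit layout. Fields: Receiver Ready; other field labels not recoverable]
[Figure: Reg 1.170 bit layout. No field labels recoverable other than RSVD]
[Figure: Reg 1.171 bit layout. No field labels recoverable other than RSVD]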
Notes:
1. If FEC Error Passing is enabled while FEC is enabled, errors will be seen temporarily. To avoid this, only enable Error
Passing while FEC is disabled.
2. If FEC is enabled during auto-negotiation then this register bit is overridden by the auto-negotiation control of
FEC. So even with this bit set to 0, if auto-negotiation results in FEC being enabled, FEC is enabled and cannot be
disabled except by changing the auto-negotiation Base Page Ability bits 46 and 47, and re-negotiating the link.
FEC can still be enabled explicitly by setting this bit to 1 which overrides whatever auto-negotiation decides.
[Figure: Reg 1.172 bit layout. Single field, bits 15:0; label not recoverable]
Notes:
1. Cleared when read.
[Figure: Reg 1.173 bit layout. Field: FEC corrected blocks upper, bits 15:0]
Table 2-38 shows the 10GBASE-R FEC Corrected Blocks (upper) register bit definitions.
Notes:
1. Latched when 1.172 is read. Cleared when read.
[Figure: Reg 1.174 bit layout. Field: FEC uncorrected blocks lower, bits 15:0]
Notes:
1. Cleared when read.
[Figure: Reg 1.175 bit layout. Single field, bits 15:0; label not recoverable]
Notes:
1. Latched when 1.174 is read. Cleared when read.
[Figure: Reg 1.65520 bit layout. Fields: Training Done, Preset, Initialize; other field labels not recoverable]
Notes:
1. This register will be transferred automatically to register 1.155.15.
2. These registers will be transferred automatically to register 1.154.
[Figure: Reg 1.65535 bit layout. Fields: Core Version, Core Params, EVAL]
Table 2-42 shows the core version information register bit definitions.
Notes:
1. x'60' for version 6.0 of core
2. Depends on core generation parameters
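As an illustration of Note 1, the sketch below extracts a version string from a 16-bit read of the core version register. It assumes that the Core Version field occupies bits 15:8 and that the two hexadecimal digits of x'60' encode the major and minor version. Both points are inferred from the note and the figure residue rather than stated explicitly, and the function name is hypothetical.

```python
def decode_core_version(reg_value: int) -> str:
    """Decode the assumed Core Version field (bits 15:8) as 'major.minor'."""
    version_byte = (reg_value >> 8) & 0xFF
    major = version_byte >> 4      # upper hex digit, assumed to be the major version
    minor = version_byte & 0xF     # lower hex digit, assumed to be the minor version
    return f"{major}.{minor}"


# Lower byte (core parameters and EVAL bit) is ignored by this decode.
print(decode_core_version(0x6000))
```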
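[Figure: Reg 3.0 bit layout. Fields: RESET, LOOPBACK, SPEED (three fields), LOW POWER; remaining bits reserved]
[Figure: Reg 3.1 bit layout. Fields: LOCAL FAULT, RX LINK STATUS, POWERDOWN ABILITY; remaining bits reserved]
[Figure: bit layout labeled Reg 1.4 in the source. Fields: 10G CAPABLE; remaining bits reserved]
[Figure: Reg 3.6 bit layout. Fields: VENDOR2 PRESENT, VENDOR1 PRESENT, CLAUSE 22 EXT.N PRESENT; remaining bits reserved]
[Figure: Reg 3.5 bit layout. Fields: AN PRESENT, TC PRESENT, DTE XS PRESENT, PHY XS PRESENT, PCS PRESENT, WIS PRESENT, PMD/PMA PRESENT, CLAUSE 22 PRESENT; remaining bits reserved]
[Figure: Reg 3.7 bit layout. No field labels recoverable other than RSVD]
[Figure: Reg 3.8 bit layout. Fields: DEVICE PRESENT, TX FAULT, RX FAULT, 10GBASE-W ABILITY, 10GBASE-X ABILITY, 10GBASE-R ABILITY; remaining bits reserved]
[Figure: Reg 3.32 bit layout. Fields: LINK STATUS; other field labels not recoverable]
[Figure: Reg 3.33 bit layout. Fields: LATCHED BLOCK_LOCK, LATCHED HI_BER, BER; other field labels not recoverable]
[Figure: Reg 3.34 bit layout. Single field, bits 15:0; label not recoverable]
[Figure: Reg 3.36 bit layout. Field: TEST PATTERN A [47:32]]
[Figure: Reg 3.37 bit layout. Field: TEST PATTERN A [57:48]]
[Figure: Reg 3.38 bit layout. Single field, bits 15:0; label not recoverable]
[Figure: Reg 3.40 bit layout. Field: TEST PATTERN B [47:32]]
[Figure: Reg 3.41 bit layout. Field: TEST PATTERN B [57:48]]
[Figure: Reg 3.42 bit layout. No field labels recoverable other than RSVD]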
Notes:
1. PRBS31 test pattern generation and checking is implemented in the transceiver and the error count is read by the
10GBASE-R/KR core through the transceiver DRP interface. All other test pattern generation and checking where
applicable is implemented in the PCS logic in the core
enabled in register 3.42.5, no other MDIO commands are accepted until a different PCS
address is selected with an MDIO ADDRESS command.
For 7 series devices, the number of errors is equal to the number of 20-bit words received
that included errors, rather than the actual number of bit errors.
For UltraScale devices, the number of errors is equal to the number of single bit errors
received where there are fewer than 64K bit errors. UltraScale device transceivers use a
32-bit counter for this feature, but it is only possible to read back 16 bits from the error
counter register in the core. The lower 16 bits from the transceiver counter register are used
as the value for the core register. Each read operation clears the transceiver counter register
so, as long as there are fewer than 64K bit errors between each successive read, the values
returned are valid.
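The UltraScale readback behavior described above can be sketched as a simple truncation of the 32-bit transceiver counter to 16 bits. The function name is hypothetical, and the model ignores the DRP access itself.

```python
from typing import Tuple


def core_error_count(transceiver_count: int) -> Tuple[int, bool]:
    """Return the 16-bit value visible in the core register and a flag that is
    True when that value equals the true count (fewer than 64K errors)."""
    value = transceiver_count & 0xFFFF          # only the lower 16 bits are read back
    is_exact = transceiver_count < 0x1_0000     # upper bits would otherwise be lost
    return value, is_exact


print(core_error_count(5))        # small count: value is exact
print(core_error_count(0x10005))  # 64K or more errors since last read: value has wrapped
```

Reading the register often enough that fewer than 64K bit errors accumulate between reads keeps the returned value exact.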
Figure 2-38 shows the MDIO register 3.43: 10GBASE-R Test Pattern Error Counter.
[Figure 2-38: Reg 3.43 bit layout. Field: TEST PATTERN ERROR COUNT, bits 15:0]
[Figure: Reg 3.65535 bit layout. Single field, bits 15:0; label not recoverable]
[Figure: Reg 7.0 bit layout. Fields: Auto-Negotiation Reset, Extended next page control, Auto-Negotiation Enable, Restart Auto-Negotiation; remaining bits reserved]
Notes:
1. For simulation purposes only, to disable AN at start-up, the external core pin ‘an_enable’ should be tied Low.
[Figure: Reg 7.1 bit layout. No field labels recoverable other than RSVD]
[Figure: Reg 7.16 bit layout. Fields: Next Page, Acknowledge, Remote fault, D12:D5, Selector field]
[Figure: Reg 7.17 bit layout. Field: D31:D16]
[Figure: Reg 7.18 bit layout. Field: D47:D32]
[Figure: Reg 7.19 bit layout. Field: D15:D0]
Figure 2-46 shows the MDIO register 7.20: AN LP Base Page Ability.
[Figure 2-46: Reg 7.20 bit layout. Field: D31:D16]
Figure 2-47 shows the MDIO register 7.21: AN LP Base Page Ability.
[Figure 2-47: Reg 7.21 bit layout. Field: D47:D32]
Table 2-63 shows the AN LP Base Page Ability register bit definitions.
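[Figure: Reg 7.22 bit layout. Fields: Next Page, Message Page, Acknowledge 2, Toggle; other field labels not recoverable]
[Figure: Reg 7.23 bit layout. Single field, bits 15:0; label not recoverable]
[Figure: Reg 7.24 bit layout. Field: Unformatted Code Field 2]
[Figure: Reg 7.25 bit layout. Fields: Next Page, Acknowledge, Message Page, Acknowledge 2, Toggle; other field labels not recoverable]
[Figure: Reg 7.26 bit layout. Field: Unformatted Code Field 1]
[Figure: Reg 7.27 bit layout. Single field, bits 15:0; label not recoverable]
[Figure: Reg 7.48 bit layout. No field labels recoverable other than RSVD]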
Table 2-70 shows the Backplane Ethernet Status register bit definitions.
This chapter also describes the steps required to turn a 10GBASE-R/KR core into a
fully functioning design with user-application logic. Not all
implementations require all of the design steps listed in this chapter. Follow the logic
design guidelines in this manual carefully.
This design can be used as a starting point for your own design or can be used to
sanity-check your application in the event of difficulty.
See Chapter 5, Detailed Example Design, for information about using and customizing the
example designs for the 10GBASE-R/KR core.
Keep It Registered
To simplify timing and increase system performance in an FPGA design, keep all inputs and
outputs registered between your application and the core. This means that all inputs and
outputs from your application should come from, or connect to a flip-flop. While
registering signals is not possible for all paths, it simplifies timing analysis and makes it
easier for the Xilinx® tools to place and route the design.
Clocking
The clocking schemes in this section are illustrative only and might require customization
for a specific application.
Reference Clock
For Zynq-7000, Virtex-7, and Kintex-7 devices, the transceiver differential reference clock
(refclk_p/refclk_n ports) must run at 156.25 MHz, with the exception that, for 32-bit
10GBASE-R cores, the differential reference clock must run at 312.5 MHz.
For UltraScale devices, refclk can run at 156.25, 161.13, 312.5, or 322.26 MHz.
Transceiver Placement
For 7 series, a single IBUFDS_GTE2 block is used to feed the reference clocks for up to 12
GTXE2_CHANNEL and GTHE2_CHANNEL transceivers, through GTXE2_COMMON or
GTHE2_COMMON blocks. The COMMON blocks can each be shared by up to 4 CHANNEL
blocks in the same quad.
For details about Zynq-7000, Virtex-7, and Kintex-7 device transceiver clock distribution,
see the 7 Series Transceivers User Guide (UG476) [Ref 3].
The same scheme is also valid in UltraScale devices using IBUFDS_GTE3, GTHE3_CHANNEL,
GTYE3_CHANNEL, GTHE4_CHANNEL, GTYE4_CHANNEL, GTHE3_COMMON,
GTYE3_COMMON, GTHE4_COMMON, and GTYE4_COMMON blocks. On UltraScale devices,
it is possible to feed the reference clocks to up to 20 CHANNEL transceivers (2 QUADs
above and 2 QUADs below). For details, see the UltraScale Architecture GTH Transceivers
User Guide (UG576) [Ref 4] and the UltraScale Architecture GTY Transceivers User Guide
(UG578) [Ref 5].
For further notes on instantiating multiple cores, see Multiple Core Instances.
The GT*_CHANNEL primitives require a 156.25 MHz differential reference clock, as well as
322.26 MHz TX and RX user clocks. These user clocks must be created from the TXOUTCLK
and RXOUTCLK outputs respectively.
The 156.25 MHz core clock (coreclk) must be created from the transceiver differential
reference clock to keep the user logic and transceiver interface synchronous.
A management/configuration clock, dclk, is used by the core and the transceiver and can
be any rate that is supported for the transceiver DRPCLK.
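[Figure: clocking diagram for GT*E2 transceivers. Recoverable labels: example_design, core_support_layer, core, local_clocking_and_reset, gt_wiz_wrapper_GT, GT*E2, BUFH, IBUFDS_GTE2, RXOUTCLK, TXOUTCLK, txoutclk, txusrclk/txusrclk2, coreclk, refclk_p, refclk_n, rxrecclk_out, Encrypted RTL; two clock frequency labels in MHz whose values were lost]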
The transmitter signals of the XGMII interface are synchronous to txusrclk2. The receiver
signals of the XGMII are also synchronous to txusrclk2 unless the Exclude RX Elastic
Buffer option is selected, in which case they are synchronous to rxrecclk_out. Note that
these XGMII clock sources are a change in version 6.0 of the core. Prior to version 6.0, the
equivalent of coreclk was used instead. This change was made to reduce utilization and
latency.
[Figure 3-2: clocking diagram for GTHE3 transceivers. Recoverable labels: example_design, core_support_layer, core, local_clocking_and_reset, gt_wiz_wrapper_GT, GTHE3, BUFG, IBUFDS_GTE3, RXOUTCLK, TXOUTCLK, txoutclk, txusrclk/txusrclk2, coreclk, refclk_p, refclk_n, rxrecclk_out, Encrypted RTL]
The 312.5 MHz clock (coreclk) must be created from the transceiver reference clock to keep
the user logic and transceiver interface synchronous. A management/configuration clock,
dclk, is used by the core and the transceiver and can be any rate that is supported for the
transceiver DRPCLK.
The transmitter signals of the XGMII interface are synchronous to txusrclk2. The receiver
signals of the XGMII are also synchronous to txusrclk2 unless the Exclude RX Elastic
Buffer option is selected, in which case they are synchronous to rxrecclk_out.
Resets
All register resets within the 10GBASE-R/KR core netlist are synchronized to the relevant
clock port.
Shared Logic
In earlier versions of the core, the RTL hierarchy for the core was fixed. This resulted in some
difficulty because shareable clocking and reset logic needed to be extracted from the core
example design for use with a single instance or multiple instances of the core. Shared Logic
is a feature that provides a more flexible architecture that works both as a standalone core
and as a part of a larger design with one or more core instances. This minimizes the amount
of HDL modifications required, but at the same time retains the flexibility to address more
core configurations.
[Figures: core hierarchy diagrams showing <component_name>_example_design, <component_name>, <component_name>_support, <component_name>_block, and the shared logic.]
Figure 3-5 shows the core hierarchy when Shared Logic is included in the core. The
component is the shaded support layer.
[Figure 3-5: core hierarchy with Shared Logic included in the core. Blocks: example_design, core_support_layer, core (Encrypted RTL), local_clocking_and_reset, gt_wiz_wrapper_GT, IBUFDS_GTE3, BUFG. Signals: refclk_p, refclk_n, coreclk, txclkout, txusrclk/txusrclk2, RXOUTCLK, TXOUTCLK, rxrecclk_out.]
Figure 3-6 shows the core hierarchy when Shared Logic is included in the example design.
The component is the shaded core layer.
[Figure 3-6 diagram. Blocks: example_design, core_support_layer, core (Encrypted RTL), local_clocking_and_reset, gt_wiz_wrapper_GT (GTHE3), shared_clocking_and_reset, GT_COMMON (GTHE3_COMMON), IBUFDS_GTE3, BUFG. Signals: refclk_p, refclk_n, dclk, coreclk, txoutclk, txusrclk/txusrclk2, gt0_qpllclk_i, gt0_qpllrefclk_i, RXOUTCLK, TXOUTCLK, rxrecclk_out.]
Figure 3-6: Core Hierarchy with Shared Logic Included in Example Design
See Shared Logic and the Support Layer and Special Design Considerations for more
information on sharing logic between cores.
among others, have the control line for that lane set to 1 and have a specific data byte
value.
The 64-bit single-data rate (SDR) XGMII interface is based upon the industry-standard
32-bit XGMII interface. The bus is demultiplexed from 32 bits wide to 64 bits wide on a
single rising clock edge. This demultiplexing is done by extending the bus upwards so that
there are now eight lanes of data numbered 0–7; the lanes are organized such that data
appearing on lanes 4–7 is transmitted or received later in time than that in lanes 0–3.
The mapping of lanes to data bits is shown in Table 3-3. The lane number is also the index
of the control bit for that particular lane; for example, xgmii_txc[2] and
xgmii_txd[23:16] are the control and data bits respectively for lane 2.
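The lane mapping described above can be sketched in a few lines of Python. This is an illustration only: the helper names lane_slice and control_byte are hypothetical and are not part of the core deliverables. The sketch assumes the convention stated above, in which lane n uses data bits [8n+7:8n] and control bit n, and it treats every non-data character (I, S, T, E in the figure notation) as a control character.

```python
def lane_slice(lane: int) -> tuple[int, int]:
    """Return (msb, lsb) of the XGMII data bits that carry the given lane."""
    if not 0 <= lane <= 7:
        raise ValueError("the 64-bit XGMII interface has lanes 0 to 7")
    return 8 * lane + 7, 8 * lane


def control_byte(column: str) -> int:
    """Build the 8-bit control value for one clock cycle.

    `column` lists the character type on lanes 0..7 in figure notation:
    D is data; any other symbol (I, S, T, E) is a control character,
    so the control bit for that lane is set to 1.
    """
    if len(column) != 8:
        raise ValueError("expected one symbol per lane")
    value = 0
    for lane, symbol in enumerate(column):
        if symbol != "D":
            value |= 1 << lane
    return value


if __name__ == "__main__":
    print(lane_slice(2))                   # data bits for lane 2
    print(hex(control_byte("IIIISDDD")))   # Start character on lane 4
    print(hex(control_byte("DTIIIIII")))   # Terminate character on lane 1
```

With Start on lane 4 the sketch yields a control value of 1F, and with Terminate on lane 1 it yields FE, which is consistent with the xgmii_txc values shown for a normal frame on the 64-bit interface.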
xgmii_txd[7:0] I I D D D D I
xgmii_txd[15:8] I I D D D T I
xgmii_txd[23:16] I I D D D I
xgmii_txd[31:24] I I D D D I
xgmii_txd[39:32] I S D D D I
xgmii_txd[47:40] I D D D D I
xgmii_txd[55:48] I D D D D I
xgmii_txd[63:56] I D D D D I
xgmii_txc[7:0] FF 1F 00 00 FE FF
Figure 3-7: Normal Frame Transmission Across the 64-bit XGMII Interface
Figure 3-8 depicts a similar frame to that in Figure 3-7, with the exception that this frame is
propagating an error. The error code is denoted by the letter E, with the relevant control bits
set. The clock source for Figure 3-8 can be determined from Table 3-1.
xgmii_txd[7:0] I I D D E D I
xgmii_txd[15:8] I I D D E D I
xgmii_txd[23:16] I I D D E D I
xgmii_txd[31:24] I I D D E D I
xgmii_txd[39:32] I S D D D D I
xgmii_txd[47:40] I D D D D T I
xgmii_txd[55:48] I D D D D I
xgmii_txd[63:56] I D D D D I
xgmii_txc[7:0] FF 1F 00 0F E0 FF
Figure 3-8: Frame Transmission with Error Across the 64-Bit XGMII Interface
xgmii_rxd[7:0] I S D D D D I
xgmii_rxd[15:8] I D D D D D I
xgmii_rxd[23:16] I D D D D D I
xgmii_rxd[31:24] I D D D D D I
xgmii_rxd[39:32] I D D D D D I
xgmii_rxd[47:40] I D D D D T I
xgmii_rxd[55:48] I D D D D I
xgmii_rxd[63:56] I D D D D I
xgmii_rxc[7:0] FF 01 00 00 E0 FF
Figure 3-10 shows an inbound frame of data propagating an error. In this instance, the error
is propagated in lanes 4 to 7, shown by the letter E. The clock source for Figure 3-10 can be
determined from Table 3-1.
clk156
xgmii_rxd[7:0] I S D E D D I
xgmii_rxd[15:8] I D D E D D I
xgmii_rxd[23:16] I D D E D D I
xgmii_rxd[31:24] I D D E D T I
xgmii_rxd[39:32] I D D E D I
xgmii_rxd[47:40] I D D E D I
xgmii_rxd[55:48] I D D E D I
xgmii_rxd[63:56] I D D E D I
xgmii_rxc[7:0] FF 01 00 FF 00 F8 FF
Figure 3-10: Frame Reception with Error Across the 64-bit XGMII Interface
The 32-bit XGMII data interface is optionally available for 10GBASE-R core permutations.
This provides a 32-bit datapath which is synchronous to a 312.5 MHz clock source as shown
in Table 3-1.
The 32-bit single-data rate (SDR) XGMII interface is based upon the industry-standard
32-bit XGMII interface.
The mapping of lanes to data bits is shown in Table 3-4. The lane number is also the index
of the control bit for that particular lane; for example, xgmii_txc[2] and
xgmii_txd[23:16] are the control and data bits respectively for lane 2.
xgmii_txd[7:0] I S D D D D D D I I
xgmii_txd[15:8] I D D D D D D T I I
xgmii_txd[23:16] I D D D D D D I I I
xgmii_txd[31:24] I D D D D D D I I I
Figure 3-11: Normal Frame Transmission Across the 32-bit XGMII Interface
Figure 3-12 depicts a similar frame to that in Figure 3-11, with the exception that this frame
is propagating an error. The error code is denoted by the letter E, with the relevant control
bits set. The clock source for Figure 3-12 can be determined from Table 3-1.
xgmii_txd[7:0] I S D D D D E D I I
xgmii_txd[15:8] I D D D D D E T I I
xgmii_txd[23:16] I D D D D D E I I I
xgmii_txd[31:24] I D D D D D E I I I
Figure 3-12: Frame Transmission with Error Across 32-bit XGMII Interface
xgmii_rxd[7:0] I S D D D D D D I I
xgmii_rxd[15:8] I D D D D D D D I I
xgmii_rxd[23:16] I D D D D D D T I I
xgmii_rxd[31:24] I D D D D D D I I I
Figure 3-14 shows an inbound frame of data propagating an error. In this instance, the error
is propagated in lanes 0 to 3, shown by the letter E. The clock source for Figure 3-14 can be
determined from Table 3-1.
xgmii_rxd[7:0] I S D D D D E D I I
xgmii_rxd[15:8] I D D D D D E T I I
xgmii_rxd[23:16] I D D D D D E I I I
xgmii_rxd[31:24] I D D D D D E I I I
Figure 3-14: Frame Reception with Error Across the 32-bit XGMII Interface
MDIO Interface
The Management Data Input/Output (MDIO) interface is a simple, low-speed 2-wire
interface for management of the 10GBASE-R/KR core consisting of a clock signal and a
bidirectional data signal. It is defined in clause 45 of IEEE Std 802.3.
[Figure: MDIO-managed system. A STA drives mdc and mdio to the MMDs associated with MAC 1 and MAC 2.]
MDIO Ports
The core ports associated with MDIO are shown in Table 2-5, page 14. If implemented, the
MDIO interface is implemented as four unidirectional signals. These can be used to drive a
3-state buffer either in the FPGA SelectIO™ interface buffer or in a separate device.
The prtad[4:0] port sets the port address of the core instance. Multiple instances of the
same core can be supported on the same MDIO bus by setting the prtad[4:0] to specify
a unique value for each instance; the 10GBASE-R/KR core ignores transactions with the
PRTAD field set to a value other than that on its prtad[4:0] port.
MDIO Transactions
The MDIO interface should be driven from a STA master according to the protocol defined
in IEEE Std 802.3. An outline of each transaction type is described in the following sections.
In these sections, these abbreviations apply:
• PRE: preamble
• ST: start
• OP: operation code
• PRTAD: port address
• DEVAD: device address
• TA: turnaround
DEVAD
The device address in this case will be either 00001 for the PMA device or 00011 for the PCS
device. For BASE-KR cores that include the optional Auto-Negotiation block, a DEVAD of
00111 should be used to access the associated Auto-Negotiation registers.
Figure 3-16 shows a Set Address transaction defined by OP=00. Set Address is used to set
the internal 16-bit address register which is particular to the given DEVAD (called the
“current address” in the following sections), for subsequent data transactions. The core
contains two or three such address registers, one for PCS and one for PMA and possibly a
third for Auto-Negotiation.
[Figure 3-16: Set Address transaction, mdio field sequence clocked by mdc]
IDLE (Z) | PRE (32 bits, all 1) | ST = 00 | OP = 00 | PRTAD = P4..P0 | DEVAD = V4..V0 | TA = 10 | 16-bit ADDRESS = D15..D0 | IDLE (Z)
Figure 3-17 shows a Write transaction defined by OP=01. The 10GBASE-R/KR core takes the
16-bit word in the data field and writes it to the register at the current address.
[Figure 3-17: Write transaction, mdio field sequence clocked by mdc]
IDLE (Z) | PRE (32 bits, all 1) | ST = 00 | OP = 01 | PRTAD = P4..P0 | DEVAD = V4..V0 | TA = 10 | 16-bit WRITE DATA = D15..D0 | IDLE (Z)
Figure 3-18 shows a Read transaction defined by OP=11. The 10GBASE-R/KR core returns the
16-bit word from the register at the current address.
[Figure 3-18: Read transaction, mdio field sequence clocked by mdc]
IDLE (Z) | PRE (32 bits, all 1) | ST = 00 | OP = 11 | PRTAD = P4..P0 | DEVAD = V4..V0 | TA = Z0 | 16-bit READ DATA = D15..D0 | IDLE (Z)
Post-Read-increment-address Transaction
[Figure: Post-Read-increment-address transaction, mdio field sequence clocked by mdc]
IDLE (Z) | PRE (32 bits, all 1) | ST = 00 | OP = 10 | PRTAD = P4..P0 | DEVAD = V4..V0 | TA = Z0 | 16-bit READ DATA = D15..D0 | IDLE (Z)
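The four transaction types differ only in the OP field, the turnaround, and the meaning of the final 16 bits. The following Python sketch assembles the bit sequence for each type as a reading aid. It is not the core's implementation or a driver for any real STA: the function name mdio_frame and the constant names are hypothetical, and for read transactions the data field stands in for the bits that the addressed device drives back.

```python
# Field values taken from the transaction descriptions in this section.
OPCODES = {"address": 0b00, "write": 0b01, "read": 0b11, "read_inc": 0b10}
DEVAD_PMA, DEVAD_PCS, DEVAD_AN = 0b00001, 0b00011, 0b00111


def mdio_frame(op: str, prtad: int, devad: int, data: int = 0) -> str:
    """Return one MDIO frame as a bit string in transmission order.

    The IDLE (high-impedance) periods before and after the frame are omitted.
    'Z' marks a bit time in which the STA releases the bus.
    """
    if not 0 <= prtad < 32 or not 0 <= devad < 32 or not 0 <= data < 0x10000:
        raise ValueError("field out of range")
    is_read = op in ("read", "read_inc")
    fields = [
        "1" * 32,                     # PRE: 32-bit preamble
        "00",                         # ST: start
        format(OPCODES[op], "02b"),   # OP: operation code
        format(prtad, "05b"),         # PRTAD: port address
        format(devad, "05b"),         # DEVAD: device address
        "Z0" if is_read else "10",    # TA: turnaround
        format(data, "016b"),         # address, write data, or read data
    ]
    return "".join(fields)


if __name__ == "__main__":
    # Point the PCS address register at register 3.43, then read it.
    print(mdio_frame("address", prtad=0, devad=DEVAD_PCS, data=43))
    print(mdio_frame("read", prtad=0, devad=DEVAD_PCS))
```

A register access is therefore always a pair of frames: a Set Address frame that loads the current address for the chosen DEVAD, followed by a Write, Read, or Post-Read-increment-address frame that acts on that address.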
Due to the special implementation of the PCS Test Pattern Error Counter register (3.43) for
cores with the MDIO interface, whenever the MDIO PCS Address is set to point to that
register and PRBS31 RX error checking is enabled in register 3.42.5, no other MDIO
commands are accepted until a different PCS address is selected with an MDIO ADDRESS
command. PRBS31 RX error checking requires special handling for UltraScale devices (see
MDIO Register 3.43: 10GBASE-R Test Pattern Error Counter).
The MDIO bus is not available on UltraScale devices for up to 5 ms after initiating a PMA or
PCS reset.
DRP Interface
To access the DRP interface on the transceiver CHANNEL block used in the core, the
interface between the core and the CHANNEL DRP ports is brought out to the boundary of
the core.
If access to the DRP interface is not required, then connecting one interface to the other,
port-by-port, and connecting drp_req to drp_gnt allows the core to access the DRP
when required (see Figure 3-20).
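[Figure 3-20: DRP connections. No user DRP access required: core_to_gt_drp connects the Core to gt_drp, with drp_req connected to drp_gnt. User DRP access required: a User Arbiter sits between the Core (core_to_gt_drp, drp_req, drp_gnt), the user logic (user_drp, user_req, user_gnt), and gt_drp.]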
The arbiter should hold drp_gnt High until drp_req is brought Low, allowing the core to
stay in control of the transceiver DRP interface as long as it needs to. See Figure 3-21 for an
example of the User Arbiter.
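[Figure 3-21: example User Arbiter. Arbiter Logic receives Requests and issues Grants. WE/EN, ADDR, and DI from core_gt_drp and user_drp are selected onto user_gt_drp_interface or gt_drp; RDY and DO are returned to core_gt_drp and user_drp.]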
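The grant-holding rule can be illustrated with a small behavioral model. This Python sketch is not the arbiter shipped with the core or example design; the class name DrpArbiter is hypothetical, one call to step() stands for one clock cycle, and giving the core priority when both sides request at the same time is an assumption made for the example.

```python
class DrpArbiter:
    """Behavioral model of a two-requester DRP arbiter (one step per clock)."""

    def __init__(self) -> None:
        self.drp_gnt = False   # grant to the core
        self.user_gnt = False  # grant to user logic

    def step(self, drp_req: bool, user_req: bool) -> tuple[bool, bool]:
        # A grant is released only after its owner drops the request,
        # so the core keeps the DRP for as long as drp_req stays High.
        if self.drp_gnt and not drp_req:
            self.drp_gnt = False
        if self.user_gnt and not user_req:
            self.user_gnt = False
        # When the DRP is free, grant it to a requester.
        # Assumption: the core wins when both request together.
        if not self.drp_gnt and not self.user_gnt:
            if drp_req:
                self.drp_gnt = True
            elif user_req:
                self.user_gnt = True
        return self.drp_gnt, self.user_gnt


if __name__ == "__main__":
    arb = DrpArbiter()
    for reqs in [(True, True), (True, True), (False, True), (True, False)]:
        print(reqs, "->", arb.step(*reqs))
```

In hardware the grants would also select which set of DRP address, data, and enable signals is passed to the transceiver, and which side sees the returned ready and data signals.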
Receiver Termination
The receiver termination for Zynq-7000, Virtex-7, and Kintex-7 devices must be set
correctly. See the 7 Series Transceivers User Guide (UG476) [Ref 3].
For UltraScale architecture, see the UltraScale Architecture GTH Transceivers User Guide
(UG576) [Ref 4].
The GT Common block within the core support layer can be used to supply the reference
clock to up to four transceivers, if they are all placed into the same GT quad.
The shared clock and reset block can be similarly shared between multiple cores. Where
multiple cores are to share that block, only one txoutclk signal needs to be connected
from a single core instance to that shared clock and reset block. The txoutclk outputs
from the other cores can be left dangling.
IMPORTANT: The information about sharing txoutclk is applicable for 7 series 10GBASE-R/KR cores
and UltraScale 10GBASE-R cores. However, for an UltraScale 10GBASE-KR core, each core should use its
own txoutclk. In the UltraScale 10GBASE-KR core the frequency of txoutclk changes. The GT is initially
configured to have the asynchronous gearbox disabled and txoutclk set to 161 MHz for the 64-bit
interface and 322 MHz for the 32-bit interface. After link training and auto-negotiation are completed,
the frequency of txoutclk is set to 156 MHz for the 64-bit interface and 312 MHz for the 32-bit
interface. Therefore txoutclk cannot be shared for UltraScale 10GBASE-KR cores.
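The four txoutclk frequencies quoted above are related by simple arithmetic on the 10.3125 Gb/s serial line rate. The Python sketch below shows the calculation. The divisors used (64 and 32 before the change, 66 and 33 after it) are an interpretation that reproduces the quoted frequencies; they are not taken from the transceiver documentation, and the function name usrclk_mhz is hypothetical.

```python
LINE_RATE_MBPS = 10312.5  # 10GBASE-R serial line rate, 10.3125 Gb/s


def usrclk_mhz(bits_per_clock: int) -> float:
    """Clock frequency in MHz when `bits_per_clock` line bits pass per cycle."""
    return LINE_RATE_MBPS / bits_per_clock


if __name__ == "__main__":
    for label, bits in [
        ("initial, 64-bit interface", 64),
        ("initial, 32-bit interface", 32),
        ("after training and AN, 64-bit interface", 66),
        ("after training and AN, 32-bit interface", 33),
    ]:
        print(f"{label}: {usrclk_mhz(bits):.2f} MHz")
```

Because the frequency of each UltraScale 10GBASE-KR txoutclk depends on where that individual core is in its bring-up sequence, two such cores can be running at different rates at the same moment.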
When creating a design with multiple core instances, you need to take care to replicate the
correct items and not to replicate items which should be shared. There is logic in the core
support layer which combines the synchronized TX and RX resetdone signals to create a
single resetdone_out signal. When multiple instances of the core are required, the
synchronized TX and RX resetdone signals from each core should be included in this
combined signal.
You should be aware that the core PMA reset issues gttxreset and gtrxreset signals to
the transceivers. The gttxreset to the transceiver results in txresetdone going Low and
the txoutclk output from the transceiver being lost for a short time. This affects all cores
that have a shared txoutclk.
Where up to four 10GBASE-R/KR cores are required which are all in the same GT_QUAD on
the target device, you should generate one core with Include Shared Logic in core
selected and a second core with Include Shared Logic in example design selected. On
UltraScale devices, all other cores should be generated individually (that is, generate cores
B, C and D). This is shown in Figure 3-23. The former core, core A, can be used to provide
the clocks and control signals required by up to three instances of the latter, core B (or for
UltraScale devices, instances of cores B, C and D), with no further editing of core output
products required.
This simplifies the previous methodology where you would need to edit the core output
products to produce the same result. The architecture of the multi core design resembles
that in Figure 3-23.
For designs which require more than a single QUAD of transceivers, it is still possible that
they can share a single IBUFDS between multiple QUADs and multiple GT_COMMON blocks.
In this case, every core should be generated with Include Shared Logic in example design
selected. The shared logic should then be manually edited to create the correct structure of
the IBUFDS, GT_COMMON and GT_CHANNEL blocks.
[Figure 3-23 diagram: core A contains the encrypted RTL, local_clock_and_reset, gtwizard_10gbaser_gt, shared_clock_and_reset, and GT_COMMON, with refclk_p/refclk_n inputs and XGMII and MDIO connections to the MAC. coreclk, txusrclk/txusrclk2, and resets are distributed to all cores. Each core B instance contains its own local_clock_and_reset.]
Figure 3-23: Attaching Multiple Cores to a GT_QUAD Tile Using the Shared Logic Feature
When a BASE-KR core is created with the optional MDIO interface and with the optional
Auto-Negotiation block, there are extra steps that you must take when bringing the core up
in a link.
First, write to the MDIO registers to disable training (register bit 1.150.1), and then set the
training restart bit (register bit 1.150.0, which will self-clear).
Then, monitor either the core_status[5] output from the core, or register bit 7.1.2
(which latches Low and clears on read), to wait for the AN Link Up indication which is set in
the AN_GOOD_CHECK state (see IEEE Std 802.3 [Ref 2] Clause 73, Figure 11). Now enable
training (1.150.1) and then immediately restart training (1.150.0).
Training must complete within 500 ms in order for Auto-Negotiation to also complete and
set AN Complete. The Training block automatically disables itself if it does get to the
Training Done state.
RECOMMENDED: Currently Xilinx recommends setting the Training Done bit – register bit 1.65520.15.
This means that the core will not attempt to train the far-end device but can still be trained by the
far-end device.
If training does not complete within the time allowed by Auto-Negotiation, then you must
manually disable training (register 1.150.1) and restart training (1.150.0) to allow
auto-negotiation to restart the process.
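The bring-up steps above can be sketched as a short routine. This Python example is illustrative only: the mdio object with read(devad, reg) and write(devad, reg, value) methods is a hypothetical stand-in for whatever MDIO access mechanism the system provides, the FakeMdio class exists purely so the sketch can run, and the timeout value is arbitrary. Only the register bits named in the text (1.150.1, 1.150.0, and 7.1.2) are used.

```python
import time

TRAIN_CTRL = (1, 150)  # register 1.150: bit 1 = training enable, bit 0 = restart
AN_STATUS = (7, 1)     # register 7.1: bit 2 = AN Link Up (latches Low)


def bring_up(mdio, timeout_s: float = 1.0, poll_s: float = 0.001) -> bool:
    """Run the AN/training bring-up sequence; return False on AN timeout."""
    devad, reg = TRAIN_CTRL
    base = mdio.read(devad, reg) & ~0b11
    mdio.write(devad, reg, base)          # 1. disable training
    mdio.write(devad, reg, base | 0b01)   #    then restart (bit self-clears)
    deadline = time.monotonic() + timeout_s
    # 2. Wait for AN Link Up. The bit latches Low, so an early read can
    #    return a stale 0; polling simply reads again.
    while not (mdio.read(*AN_STATUS) & (1 << 2)):
        if time.monotonic() > deadline:
            return False
        time.sleep(poll_s)
    mdio.write(devad, reg, base | 0b10)   # 3. enable training
    mdio.write(devad, reg, base | 0b11)   #    and immediately restart it
    return True


class FakeMdio:
    """In-memory register file used only to exercise the sketch."""

    def __init__(self, an_link_up: bool = True) -> None:
        self.regs = {AN_STATUS: (1 << 2) if an_link_up else 0}
        self.writes = []

    def read(self, devad: int, reg: int) -> int:
        return self.regs.get((devad, reg), 0)

    def write(self, devad: int, reg: int, value: int) -> None:
        self.writes.append((devad, reg, value))
        self.regs[(devad, reg)] = value & ~0b01  # model the self-clearing bit


if __name__ == "__main__":
    bus = FakeMdio()
    print(bring_up(bus), bus.writes)
```

If the routine returns without AN Link Up, or training later fails to complete in the time allowed, the same disable-then-restart writes in the first step are the manual recovery described above.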
When a BASE-KR core is created with no MDIO interface, logic in the block level can be used
to control the interaction between auto-negotiation and training.
When auto-negotiation is not included with the core, or when it is present and it reaches
AN Link Up, the Training block is automatically enabled (if the Configuration vector bit 33
to enable training is also set) and restarted. If Auto-Negotiation needs to restart, training is
automatically disabled until Auto-Negotiation again reaches AN Link Up.
If training is disabled using the Configuration vector bit 33, training is never run. If
Auto-Negotiation is either not included with the core, or is disabled by Configuration
vector bit 284, training can still be used by programming the Configuration bits to drive the
process.
The FEC Request bit is in register bit 7.21.15, or on status vector bit 383 if there is no MDIO
interface.
Loopback
There are two possible loopback settings for the 10GBASE-R core and one for the
10GBASE-KR core.
While not officially supported for 10GBASE-KR in IEEE Std 802.3, the loopback path has
been implemented for 10GBASE-KR cores for convenience. However, only 10GBASE-R cores
transmit the expected 0x00FF pattern when in PCS loopback.
The core loops back the XGMII_TX ports directly to the XGMII_RX ports and for 10GBASE-R
cores also transmits the 0x00FF pattern on the TX serial ports.
• Vivado Design Suite User Guide: Designing IP Subsystems using IP Integrator (UG994)
[Ref 9]
• Vivado Design Suite User Guide: Designing with IP (UG896) [Ref 7]
• Vivado Design Suite User Guide: Getting Started (UG910) [Ref 8]
• Vivado Design Suite User Guide: Logic Simulation (UG900) [Ref 6]
If you are customizing and generating the core in the IP Integrator, see the Vivado Design
Suite User Guide: Designing IP Subsystems using IP Integrator (UG994) [Ref 9] for detailed
information. Vivado IDE might auto-compute certain configuration values when validating
or generating the design, as noted in this section. You can view the parameter value after
successful completion of the validate_bd_design command.
You can customize the IP for use in your design by specifying values for the various
parameters associated with the IP core using the following steps:
For details, see the Vivado Design Suite User Guide: Designing with IP (UG896) [Ref 7] and
the Vivado Design Suite User Guide: Getting Started (UG910) [Ref 8].
Note: Figures in this chapter are illustrations of the Vivado Integrated Design Environment (IDE).
This layout might vary from the current version.
Figure 4-1 displays the main screen for customizing the 10GBASE-R/KR core.
• When targeting devices containing only GTXE2 transceivers, the 10GBASE-KR options do
not appear.
• When targeting a 7 series device, the transceiver clocking and location selections do
not appear. These are still controllable through manually editing XDC LOC constraints.
• When targeting Kintex® UltraScale™ or Virtex® UltraScale devices that contain only a
single type of transceiver, the Transceiver Type selection does not appear.
• The Exclude RX Elastic Buffer option is only activated for UltraScale architecture
cores.
Component Name
The component name is used as the base name of the output files generated for the core.
Names must begin with a letter and must be composed from the following characters: a
through z, 0 through 9 and “_” (underscore).
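The naming rule can be checked before a name is passed to the tools. This Python sketch applies the rule exactly as written above, so it accepts lowercase letters only; whether the Vivado IDE also accepts uppercase letters is not stated here, and the function name is_valid_component_name is hypothetical.

```python
import re

# First character a letter, then any mix of a-z, 0-9 and underscore.
_NAME_RE = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_component_name(name: str) -> bool:
    """Return True if `name` satisfies the component naming rule."""
    return _NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["ten_gig_eth_pcs_pma_0", "0core", "my-core", ""]:
        print(repr(candidate), is_valid_component_name(candidate))
```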
BASE-R or BASE-KR
Select the BASE-KR option to get a 10GBASE-KR core and have access to the
Auto-Negotiation and FEC options. This is only available for devices containing GTHE2,
GTHE3, GTYE3, GTHE4, or GTYE4 transceivers.
Licensing Information
When BASE-R is selected, all licensing information displayed can be ignored. Use of the
10GBASE-R is free.
Optional Features
MDIO Management
Select this option to implement the MDIO interface for managing the core. Deselect the
option to remove the MDIO interface and expose a simple bit vector to manage the core.
Autonegotiation
Select this option to include the Auto-Negotiation (AN) block in the 10GBASE-KR core.
FEC
Select this option to include the FEC block in the 10GBASE-KR core.
Transceiver Options
Transceiver Type
Select the GTH or GTY transceiver to be used (for UltraScale devices that support both
transceiver types).
Transceiver Location
Use the Transceiver Location drop-down list to select the transceiver. The selection
available in the drop-down list changes depending on the selected refclk. For example, you
cannot select any of the lower two quads of transceivers if you have specified that you wish
to use a refclk from two QUADs below ('South' of), because there are no refclks from two
QUADs below the lowest two QUADs.
Transceiver RefClk
Use the Transceiver RefClk drop-down list to first select the relative location of the
IBUFDS_GTE3 that is used to clock the transceiver. For example, select refclk0+2 to use the
refclk0 signal which is provided from the IBUFDS_GTE3 block in the GT_QUAD which lies
two QUADs above ('North' of) the QUAD which contains the transceiver itself.
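The interaction between the refclk selection and the available transceiver locations can be sketched as follows. This Python example rests on simplifying assumptions that are not taken from the core: QUADs form a single contiguous column numbered from 0 at the bottom, every QUAD has usable reference clock inputs, and the offset is limited to two QUADs in either direction. The function name selectable_quads is hypothetical.

```python
def selectable_quads(num_quads: int, refclk_offset: int) -> list[int]:
    """Return the QUADs that can take their refclk from `refclk_offset` QUADs away.

    A positive offset means the refclk comes from above (North of) the
    transceiver QUAD; a negative offset means from below (South of) it.
    """
    if not -2 <= refclk_offset <= 2:
        raise ValueError("refclk source must be within two QUADs")
    return [
        quad
        for quad in range(num_quads)
        if 0 <= quad + refclk_offset < num_quads  # source QUAD must exist
    ]


if __name__ == "__main__":
    print(selectable_quads(5, -2))  # refclk from two QUADs below
    print(selectable_quads(5, +2))  # refclk from two QUADs above
```

With the refclk taken from two QUADs below, the lowest two QUADs drop out of the result, which mirrors the drop-down behavior described above.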
Use the Reference Clock Frequency drop-down list to select from the available reference
clock rates for the current core configuration. The frequencies available depend on the core
serial bit rate.
DRP Clocking
Use the DRP Clocking Frequency dialog box to define the DRP clock frequency which is
used for the DCLK input to the core. This can be any frequency which is valid for the
targeted transceiver.
Transceiver Debug
Select Additional Transceiver Control and Status Ports to expose additional transceiver
ports on the core interface. These are detailed in Transceiver Debug Ports in Chapter 2,
Table 2-12.
Otherwise the Shared Logic will be exposed in the Example Design. See Shared Logic in
Chapter 3 and Special Design Considerations for more information.
User Parameters
Table 4-1 shows the relationship between the GUI fields in the Vivado IDE and the user
parameters (which can be viewed in the Tcl console).
Output Generation
For details, see the Vivado Design Suite User Guide: Designing with IP (UG896) [Ref 7].
Required Constraints
This section defines the constraint requirements for the core. Constraints are provided in
several XDC files which are delivered with the core and the example design to give a
starting point for constraints for the user design.
There are four XDC constraint files associated with this core:
• <corename>_example_design.xdc
• <corename>_ooc.xdc
• <corename>.xdc
• <corename>_clocks.xdc
The first is used only by the example design; the second file is used for Out Of Context
support where this core can be synthesized without any wrappers; the third file is the main
XDC file for this core. The last file defines the constraints which depend on clock period
definition, either those defined by other XDC files or those generated automatically by the
Xilinx® tools, and this XDC file is marked for automatic late processing within the Vivado
design tools to ensure that the definitions exist.
The reference clock is used as the main clock for the core, coreclk.
This clock is automatically defined and can be shared between multiple cores, so only
the identifiers need be extracted in the XDC files:
The transceiver creates 322.26 MHz clocks that must be constrained in the XDC files. These
constraints are required for devices with GTXE2 transceivers.
The example design contains a DDR register that can be used to forward the XGMII_RX
clock off-chip.
Note: These delay values are halved for the 32-bit cores.
The transceiver creates more clocks but these clock frequencies are automatically derived
by the Vivado IDE.
Clock Management
No Clock Management tiles (MMCMs) are required in this design or in the accompanying
example design.
Clock Placement
There are no special restrictions.
Banking
All ports should be given Location constraints appropriate to your design within Banking
limits.
Transceiver Placement
Transceivers should be given location constraints appropriate to your design. An example of
these LOC constraints can be found in the example design or core XDC file. For UltraScale
devices, the placement is selectable at core configuration time.
These constraints are required if the optional MDIO interface is included and if the MDIO
interface is on the chip boundary.
Example Design
In the XDC file:
Simulation
For comprehensive information about Vivado® simulation components, as well as
information about using supported third-party tools, see the Vivado Design Suite User
Guide: Logic Simulation (UG900) [Ref 6].
IMPORTANT: For cores targeting 7 series or Zynq®-7000 devices, UNIFAST libraries are not supported.
Xilinx IP is tested and qualified with UNISIM libraries only.
Simulation of 10GBASE-R/KR at the core level is not supported without the addition of an
appropriate test bench (not supplied). Simulation of the example design is supported.
All synthesis sources required by the core are included. For this core there is a mix of both
encrypted and unencrypted sources. Only the unencrypted sources are visible and
optionally editable by using the Managed IP property option.
In Figure 5-1, the example HDL wrapper generated contains the following:
• The block-level core instance containing the encrypted RTL for the core itself, a local
clocking and reset block, and a transceiver wizard wrapper for the GT CHANNEL block
(gtwizard_10gbaser_gt). This is a per-core block and the logic cannot be shared
between multiple cores.
• A wrapper for the GT COMMON block, which can be shared between up to four cores if
the cores place their transceivers into the same GT quad.
• A shared clocking and reset block, which can be shared between up to 12 cores in
7 series devices and up to 20 cores in UltraScale devices.
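[Figure 5-1: example design hierarchy. The Example Design contains the Core Support Layer, which contains the Core together with the gt_common and shared_clocking_and_reset blocks.]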
The difference is subtle but selecting Include Shared Logic in the core produces a core
that includes all the shared logic and has outputs for clocks and control signals that can be
shared between multiple 10GBASE-R/KR IP cores. Selecting Include Shared Logic in the
Example Design allows you to access the shared logic.
Typically in a multi-core design, you can create one core, core A with Shared Logic included
in the core, and one core, core B with the opposite setting. A single instance of core A then
provides the clocks for up to three instances of core B. See Special Design Considerations
for more information.
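The sharing limits quoted in this chapter can be turned into a quick planning check. This Python sketch is only a back-of-the-envelope aid: it assumes the transceivers are packed four to a GT quad, it takes the 12-core and 20-core limits for the shared clocking and reset block from the text above, and it ignores the cases in which txoutclk cannot be shared. The function and dictionary names are hypothetical.

```python
import math

# Limits for sharing one shared clocking and reset block, as quoted above.
MAX_SHARED_CLOCK_CORES = {"7series": 12, "ultrascale": 20}
CORES_PER_QUAD = 4  # one GT COMMON block serves up to four cores in a quad


def sharing_plan(num_cores: int, family: str) -> dict:
    """Estimate the shared blocks needed for `num_cores` core instances."""
    limit = MAX_SHARED_CLOCK_CORES[family]
    if not 1 <= num_cores <= limit:
        raise ValueError(f"{family}: 1 to {limit} cores can share one clock block")
    return {
        # Assumes transceivers fill each quad before the next one is used.
        "gt_common_blocks": math.ceil(num_cores / CORES_PER_QUAD),
        "shared_clocking_and_reset_blocks": 1,
    }


if __name__ == "__main__":
    print(sharing_plan(4, "7series"))
    print(sharing_plan(5, "ultrascale"))
```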
Test Bench
This chapter contains information about the test bench provided in the Vivado® Design
Suite.
In Figure 6-1, the demonstration test bench is a simple VHDL or Verilog program to exercise
the example design and the core itself. This test bench consists of transactor procedures or
tasks that connect to the major ports of the example design and a control program that
pushes frames of varying length and content through the design and checks the values as
they exit the core. The test bench is supplied as part of the Example Simulation output
product group.
Note that the demonstration test bench that is used to simulate the example design uses a
slightly different clock period to that which the core requires in hardware. This allows the
demonstration test bench to run with simplified logic and timing. The correct refclk is
created with a period based on a bit period of 98 ps instead of the nominal 96.969696 ps,
to ease simulation complexity in the demonstration test bench. A similar method is used for
32-bit 10GBASE-R cores.
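The effect of the rounded bit period on the reference clock can be worked through numerically. The Python sketch below assumes a 156.25 MHz reference clock, which spans 66 bit periods of the 10.3125 Gb/s line rate; that the test bench derives its refclk period as exactly 66 times 98 ps is an assumption made for the example, and the function names are hypothetical.

```python
BITS_PER_REFCLK = 66  # 10.3125 Gb/s divided by 156.25 MHz


def refclk_period_ps(bit_period_ps: float) -> float:
    """Reference clock period, in ps, for a given serial bit period."""
    return BITS_PER_REFCLK * bit_period_ps


def mhz(period_ps: float) -> float:
    """Convert a clock period in ps to a frequency in MHz."""
    return 1e6 / period_ps


if __name__ == "__main__":
    nominal = refclk_period_ps(96.969696)
    simulated = refclk_period_ps(98)
    print(f"nominal:   {nominal:.3f} ps ({mhz(nominal):.3f} MHz)")
    print(f"simulated: {simulated} ps ({mhz(simulated):.3f} MHz)")
```

The whole-picosecond bit period gives a reference clock period that is also a whole number of picoseconds, which is the kind of simplification the note above describes.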
[Figure 6-1: demonstration test bench block diagram. The test bench wraps the DUT (64-bit 10GBASE-R interface) with XGMII-like and 10GBASE-R stimulus and monitor blocks.]
Simulation
A highly parameterizable transaction-based simulation test suite has been used to verify
the core. Tests included:
Hardware Verification
The 10GBASE-R and 10GBASE-KR cores have been validated on Kintex-7 and Virtex-7
devices. Hundreds of millions of Ethernet frames have been successfully transmitted and
received on each board and other features such as hot-plugging, Auto-Negotiation and FEC
error correction have been tested with the setup.
Testing
The 64-bit 10GBASE-R and 10GBASE-KR cores have successfully undergone validation on
7 series devices at the University of New Hampshire Interoperability Lab. Detailed test
reports are available from Xilinx. All tests were successful.
Migrating
For information on migrating to the Vivado Design Suite, see the ISE to Vivado Design Suite
Migration Guide (UG911) [Ref 11].
Device Migration
If you are migrating from a 7 series GTX or GTH device to an UltraScale device, the prefixes
of the optional transceiver debug ports for single-lane cores are changed from “gt0”, “gt1”
to “gt”, and the postfix “_in” and “_out” are dropped. For multi-lane cores, the prefixes of
the optional transceiver debug ports gt(n) are aggregated into a single port. For example:
gt0_gtrxreset and gt1_gtrxreset now become gt_gtrxreset[1:0]. This is true
for all ports, with the exception of the DRP buses which follow the convention of
gt(n)_drpxyz.
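For single-lane cores, the renaming rule can be applied mechanically when porting a wrapper. This Python sketch covers only the single-lane case described above; it does not model the aggregation of multi-lane ports into vectors. Leaving DRP port names completely unchanged is an assumption based on the stated gt(n)_drpxyz convention, the port names in the demonstration are examples only, and the function name ultrascale_debug_port is hypothetical.

```python
import re


def ultrascale_debug_port(name: str) -> str:
    """Map a 7 series single-lane debug port name to the UltraScale style."""
    # DRP buses keep the gt(n)_drpxyz convention, so leave them alone.
    if re.match(r"gt\d+_drp", name):
        return name
    # Replace the gt0/gt1 prefix with gt ...
    name = re.sub(r"^gt\d+_", "gt_", name)
    # ... and drop a trailing _in or _out.
    return re.sub(r"_(in|out)$", "", name)


if __name__ == "__main__":
    for port in ["gt0_gtrxreset", "gt0_eyescanreset_in", "gt0_drpaddr"]:
        print(port, "->", ultrascale_debug_port(port))
```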
For more information about migration to UltraScale devices, see the UltraScale Architecture
Migration Methodology Guide (UG1026).
In version 6.0 of the core, there were several changes that make the core pin-incompatible
with the previous version(s). These changes were required to enhance the overall customer
experience.
Shared Logic
As part of the hierarchical changes to the core, it is now possible to have the core itself
include all of the logic that can be shared between multiple cores, which was previously
exposed in the example design for the core.
If you are updating from a version 3.* core to the latest version with shared logic, there is no simple
upgrade path; it is recommended to consult the Shared Logic sections of this document for
more guidance.
The new output port areset_coreclk_out is added when Shared Logic is included in the
core, for UltraScale devices only.
The new output port qpll0reset is added when Shared Logic is included in the example
design, for UltraScale devices only.
The port rxrecclk_out is added for all core configurations when previously it was only
available when the RX Elastic Buffer was omitted from UltraScale device 32-bit datapath
cores. The port qpll0reset is added for all UltraScale devices, to connect to the shared
clock and reset block, to control QPLLRESET.
Ports Changed
Table B-2 shows the ports that were changed in name only in v6.0.
Debugging
This appendix includes details about resources available on the Xilinx® Support website
and debugging tools.
TIP: If the IP generation halts with an error, there might be a license issue. See License Checkers in
Chapter 1 for more details.
Documentation
This product guide is the main document associated with the 10G Ethernet PCS/PMA core.
This guide, along with documentation related to all products that aid in the design process,
can be found on the Xilinx Support web page or by using the Xilinx Documentation
Navigator.
Download the Xilinx Documentation Navigator from the Downloads page. For more
information about this tool and the features available, open the online help after
installation.
Solution Centers
See the Xilinx Solution Centers for support on devices, software tools, and intellectual
property at all stages of the design cycle. Topics include design assistance, advisories, and
troubleshooting tips. The Solution Center specific to the 10G Ethernet PCS/PMA core is the
Xilinx Ethernet IP Solution Center.
Answer Records
Answer Records include information about commonly encountered problems, helpful
information on how to resolve these problems, and any known issues with a Xilinx product.
Answer Records are created and maintained daily, ensuring that users have access to the
most accurate information available.
Answer Records can also be located by using the Search Support box on the main Xilinx
support web page. To maximize your search results, use proper keywords such as:
• Product name
• Tool message(s)
• Summary of the issue encountered
A filter search is available after results are returned to further target the results.
AR: 54669
Technical Support
Xilinx provides technical support at the Xilinx Support web page for this LogiCORE™ IP
product when used as described in the product documentation. Xilinx cannot guarantee
timing, functionality, or support if you do any of the following:
• Implement the solution in devices that are not defined in the documentation.
• Customize the solution beyond that allowed in the product documentation.
• Change any section of the design labeled DO NOT MODIFY.
Xilinx provides premier technical support for customers encountering issues that require
additional assistance.
To contact Xilinx Technical Support, navigate to the Xilinx Support web page.
Debug Tools
There are many tools available to address 10G Ethernet PCS/PMA core design issues. It is
important to know which tools are useful for debugging various situations.
The Vivado logic analyzer is used with the logic debug IP cores, including:
See the Vivado Design Suite User Guide: Programming and Debugging (UG908) [Ref 10].
Several transceiver ports have been marked for easy access by the debug feature, including
those input ports that are already driven by the core logic.
Reference Boards
Contact your Xilinx representative for information on development platforms for this IP
core.
Simulation Debug
The simulation debug flow for Mentor Graphics Questa Simulator (QuestaSim) is illustrated
in Figure C-1. A similar approach can be used with other simulators.
Figure C-1: QuestaSim simulation debug flow chart (figure not reproduced). The recoverable
decision step is:
Do you get errors referring to failing to access library? If yes, you need to compile and
map the proper libraries. See "Compiling Simulation Libraries Section."
If the libraries are not compiled and mapped correctly, it will cause errors such as:
# ** Error: (vopt-19) Failed to access library 'secureip' at "secureip".
# No such file or directory. (errno = ENOENT)
# ** Error: ../..example_design/ten_gig_eth_pcs_pma_core_block.v(820):
Library secureip not found.
Hardware Debug
Hardware issues can range from link bring-up to problems seen after hours of testing. This
section provides debug steps for common issues. The Vivado Design Suite debug feature is
a valuable resource to use in hardware debug. The signal names mentioned in the following
individual sections can be probed using the debug feature for debugging the specific
problems. Many of these common issues can also be applied to debugging design
simulations.
General Checks
Ensure that all the timing constraints for the core were properly incorporated and that all
constraints were met during implementation.
• Does it work in post-place and route timing simulation? If problems are seen in
hardware but not in timing simulation, this could indicate a PCB issue. Ensure that all
clock sources are active and clean.
• If using MMCMs in the design, ensure that all MMCMs have obtained lock by
monitoring the LOCKED port.
• If your outputs go to 0 after operating normally for several hours, check your licensing.
IMPORTANT: The latching Local Fault and Link Status bits in the status vector or MDIO registers must
be cleared by using the associated reset bits in the configuration vector, by reading the MDIO
registers, or by issuing a PMA or PCS reset.
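The effect of a latching bit can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `LatchedBit` and its methods are hypothetical names, the bit is modeled as latching high, and the clear operation stands in for any of the clearing methods listed in the note. A latching-low indication would behave the same way with the polarity inverted.

```python
class LatchedBit:
    """Toy model of a latching-high status bit (hypothetical, not the core's RTL)."""

    def __init__(self):
        self.live = False     # current, non-latching condition
        self.latched = False  # sticky copy reported to software

    def update(self, condition):
        """Sample the live condition; once set, the latched copy stays set."""
        self.live = condition
        if condition:
            self.latched = True

    def read_and_clear(self):
        """Return the latched value, then re-arm the latch.

        Assumption: the bit sets again immediately if the condition persists.
        """
        value = self.latched
        self.latched = self.live
        return value


fault = LatchedBit()
fault.update(True)   # a transient fault occurs
fault.update(False)  # the fault goes away
first_read = fault.read_and_clear()   # still reports the old fault
second_read = fault.read_and_clear()  # reports the current state
```

In this model the first read still returns the stale fault, which is why a status bit should be cleared before it is used to judge the current link state.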
Local Fault
The receiver outputs a local fault when the receiver is not up and operational. This RX local
fault is also indicated in the status and MDIO registers. The most likely causes for an RX local
fault are:
Remote Fault
Remote faults are only generated in the MAC reconciliation layer in response to a Local Fault
message. When the receiver receives a remote fault, this means that the link partner is in a
local fault condition.
When the MAC reconciliation layer receives a remote fault, it silently drops any data being
transmitted and instead transmits IDLEs to help the link partner resolve its local fault
condition. When the MAC reconciliation layer receives a local fault, it silently drops any data
being transmitted and instead transmits a remote fault to inform the link partner that it is in
a fault condition. Be aware that the Xilinx 10GEMAC core has an option to disable remote
fault transmission.
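The reconciliation layer responses described above can be summarized as a small decision function. This is a minimal sketch, not the MAC implementation: the names `RxSeq`, `TxAction`, and `rs_transmit_action` are hypothetical, and the 10GEMAC option to disable remote fault transmission is not modeled.

```python
from enum import Enum


class RxSeq(Enum):
    NONE = "none"
    LOCAL_FAULT = "local_fault"
    REMOTE_FAULT = "remote_fault"


class TxAction(Enum):
    DATA = "data"
    IDLE = "idle"
    REMOTE_FAULT = "remote_fault"


def rs_transmit_action(rx):
    """Choose what the reconciliation layer transmits for a received condition."""
    if rx is RxSeq.LOCAL_FAULT:
        # Own receive path is down: drop data and tell the link partner.
        return TxAction.REMOTE_FAULT
    if rx is RxSeq.REMOTE_FAULT:
        # Link partner is in local fault: drop data and send IDLEs.
        return TxAction.IDLE
    return TxAction.DATA


# Device B cannot receive, so it signals a remote fault; Device A answers with IDLEs.
device_b_tx = rs_transmit_action(RxSeq.LOCAL_FAULT)
device_a_tx = rs_transmit_action(RxSeq.REMOTE_FAULT)
```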
Figure (not reproduced): link bring-up between Device A and Device B, each shown as MAC, PCS,
and PMA. The four panels carry these labels, in order:
• TX Remote Fault; Link Down at both devices
• TX Remote Fault; Link Down at both devices
• TX IDLE; Link Down at both devices
• TX IDLE; Link Up at both devices
Then some low-data-rate Auto-Negotiation (AN) frames are exchanged and checked at
either end. The transceiver is placed into a different receive mode for this low-rate protocol.
When this initial exchange of AN frames is complete, the AN_GOOD_CHECK state is entered
and then transmission switches to a higher data rate training protocol frame exchange,
which must complete within 500 ms of AN starting/restarting. The transceiver is placed into
yet another different mode for this part of the bring-up.
Training involves detection and measurement of the received signal and transmission of
commands that alter the far-end transmitter characteristics to improve that received signal.
This happens in both directions until both ends of the link are receiving the best possible
signal.
At that point, training is flagged as COMPLETE and the AN protocol also completes and sets
the AN Link Good flag, which then enables normal Ethernet transmission and reception, with
the transceiver being placed into normal operating mode. Any time that AN is restarted or
reset, this entire process is repeated.
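The ordering of the bring-up phases and the 500 ms limit can be sketched as follows. This is an illustration only: the function name, the phase labels other than AN_GOOD_CHECK, and the durations passed in are hypothetical, and what the core does when the deadline is missed is not modeled.

```python
TRAINING_DEADLINE_MS = 500  # from the text: measured from AN starting/restarting


def bring_up_phases(an_exchange_ms, training_ms):
    """Return the ordered phases reached for the given (hypothetical) durations."""
    phases = ["AN_FRAME_EXCHANGE", "AN_GOOD_CHECK", "TRAINING"]
    if an_exchange_ms + training_ms > TRAINING_DEADLINE_MS:
        # Training did not finish within 500 ms of AN starting.
        phases.append("DEADLINE_MISSED")
        return phases
    phases += ["TRAINING_COMPLETE", "AN_LINK_GOOD", "NORMAL_OPERATION"]
    return phases


ok_run = bring_up_phases(an_exchange_ms=100, training_ms=300)
late_run = bring_up_phases(an_exchange_ms=100, training_ms=450)
```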
Note that if two identical 10GBASE-KR cores are powered up and reset at identical times, the
pseudorandom 'nonce' generation that forms part of Auto-Negotiation produces identical
sequences of 'nonce' values in both cores, which might stop the two cores from completing
Auto-Negotiation. Delaying the reset to one of the cores by at least two clock cycles
avoids this.
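The following sketch shows why a reset offset breaks the tie. It is not the core's nonce generator: the 16-bit LFSR, its taps, and the seed are assumptions chosen purely for illustration, and `nonce_stream` is a hypothetical helper.

```python
SEED = 0xACE1  # arbitrary non-zero seed (assumption)


def lfsr_step(state):
    """Advance a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) by one clock."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def nonce_stream(seed, reset_release_cycle, n_cycles):
    """Generator value seen on each clock; held at the seed while in reset."""
    state = seed
    values = []
    for cycle in range(n_cycles):
        if cycle >= reset_release_cycle:
            state = lfsr_step(state)
        values.append(state)
    return values


core_a = nonce_stream(SEED, reset_release_cycle=0, n_cycles=32)
core_b_same = nonce_stream(SEED, reset_release_cycle=0, n_cycles=32)
core_b_delayed = nonce_stream(SEED, reset_release_cycle=2, n_cycles=32)

# Count the clocks on which both cores would present the same value.
matches_same_reset = sum(a == b for a, b in zip(core_a, core_b_same))
matches_delayed_reset = sum(a == b for a, b in zip(core_a, core_b_delayed))
```

With a common reset the two streams match on every clock; with one reset released two clocks later they do not match on any clock in this model.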
• Monitor the state of the signal_detect input to the core. This should either be:
Transceiver-Specific:
• Ensure that the polarities of the txn/txp and rxn/rxp lines are not reversed. If they are,
these can be fixed by using the TXPOLARITY and RXPOLARITY ports of the transceiver.
• Check that the transceiver is not being held in reset or still being initialized. The
RESETDONE outputs from the transceiver indicate when the transceiver is ready.
received (which is defined in IEEE Std 802.3, Section 49.2.13.2.3 as a 66-bit code with a bad
sync header or a control word (C, S, or T) with no matching translation for the block type
field).
If the RX Elastic Buffer underflows, the core will insert an /E/ block, followed by /L/ blocks.
When FEC is enabled in register bit 1.170.0 and FEC Error Passing is enabled in register bit
1.170.1, any uncorrectable FEC errors cause the FEC block to set the two sync header bits to
the same value, creating a bad sync header, which the RX PCS Decoder decodes as an /E/
block.
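The sync header mechanism can be sketched in a few lines. In IEEE 802.3 Clause 49, a sync header of 01 marks a data block and 10 marks a control block, so any header with two equal bits is invalid. The function names are hypothetical, the 2-bit integer encoding is only a labeling, and which of the two equal-bit values the FEC block chooses is an assumption.

```python
DATA_SYNC = 0b01     # Clause 49 sync header for a data block
CONTROL_SYNC = 0b10  # Clause 49 sync header for a control block


def sync_header_valid(sync):
    """A valid 64b/66b sync header has two differing bits."""
    return sync in (DATA_SYNC, CONTROL_SYNC)


def fec_mark_uncorrectable(sync):
    """Force both sync bits to the same value (assumption: copy the upper bit)."""
    upper = (sync >> 1) & 1
    return (upper << 1) | upper


def decode_block_kind(sync):
    """Classify a block by sync header only; block type checks are not modeled."""
    if not sync_header_valid(sync):
        return "/E/"
    return "data" if sync == DATA_SYNC else "control"


marked_data = decode_block_kind(fec_mark_uncorrectable(DATA_SYNC))
marked_control = decode_block_kind(fec_mark_uncorrectable(CONTROL_SYNC))
```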
• Ensure that the MDIO is driven properly. Check that the mdc clock is running and that
the frequency is 2.5 MHz or less. Overclocking of the MDIO interface is possible with
this core, up to 12.5 MHz.
• Ensure that the 10 Gb Ethernet PCS/PMA core is not held in reset.
• Read from a configuration register that does not have all 0s as a default. If all 0s are
read back, the read was unsuccessful. Check that the PRTAD field placed into the MDIO
frame matches the value placed on the prtad[4:0] port of the core.
• Verify in simulation, in a capture from a Vivado Design Suite debug feature core, or both,
that the waveform is correct for accessing the host interface for an MDIO read/write.
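When checking a captured MDIO waveform against the points above, it can help to compute the expected frame header in software. The sketch below packs the 14 header bits of an IEEE 802.3 Clause 45 frame (ST, OP, PRTAD, DEVAD) that follow the preamble and checks the mdc frequency limits quoted above. All function names are hypothetical, and the PRTAD value used in the example is arbitrary.

```python
ST_CLAUSE45 = 0b00
OP_ADDRESS = 0b00
OP_WRITE = 0b01
OP_READ = 0b11


def build_mdio_header(op, prtad, devad):
    """Pack ST[13:12], OP[11:10], PRTAD[9:5], DEVAD[4:0] of a Clause 45 frame."""
    if not (0 <= prtad < 32 and 0 <= devad < 32):
        raise ValueError("prtad and devad are 5-bit fields")
    return (ST_CLAUSE45 << 12) | (op << 10) | (prtad << 5) | devad


def header_prtad(header):
    """Extract the PRTAD field from a packed header."""
    return (header >> 5) & 0x1F


def prtad_matches_port(header, prtad_port):
    """True when the frame addresses the value driven on the prtad[4:0] port."""
    return header_prtad(header) == (prtad_port & 0x1F)


def mdc_frequency_ok(freq_hz, allow_overclock=False):
    """2.5 MHz nominal limit; up to 12.5 MHz if overclocking is intended."""
    limit_hz = 12_500_000 if allow_overclock else 2_500_000
    return 0 < freq_hz <= limit_hz


# Example: read header for port address 3, device address 1 (PMA/PMD).
read_header = build_mdio_header(OP_READ, prtad=3, devad=1)
```

If `prtad_matches_port` is false for the value wired to the core, the core does not respond and the read returns no useful data.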
Link Training
There is currently no link training algorithm included with the core, so you must implement
whatever training algorithm your application requires.
It has been noted in hardware testing that the RX DFE logic in the Xilinx transceivers is
usually capable of adapting to almost any link, so far-end training might not be required at
all, and the Training Done register/configuration bit can be set by default.
When you implement your own training algorithm, include not only hardware to monitor
the received data signal integrity and provide the inc/dec/preset/initialize commands to
send to the far-end device, but also logic that follows the protocol of sending a
command until 'updated' is seen on return, and then sending 'hold' until 'not updated' is
seen on return.
IMPORTANT: Also, the priority of commands defined in IEEE Std 802.3 must be adhered to, such as
never transmitting 'Preset' with 'Initialize'.
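The command handshake can be sketched as a simple loop. This is an illustration of the send-until-'updated', hold-until-'not updated' rule only, not a training algorithm: the function names and string labels are hypothetical, and the other status values and command priorities defined in IEEE Std 802.3 are not modeled.

```python
def coefficient_update_handshake(command, status_reports):
    """Return the requests sent, one per status report received from the far end."""
    sent = []
    waiting_for_updated = True
    for status in status_reports:
        if waiting_for_updated:
            sent.append(command)          # keep repeating the command
            if status == "updated":
                waiting_for_updated = False
        else:
            sent.append("hold")           # far end has acted; release the request
            if status == "not_updated":
                break                     # handshake complete
    return sent


def request_is_legal(preset, initialize):
    """'Preset' must never be transmitted together with 'Initialize'."""
    return not (preset and initialize)


requests = coefficient_update_handshake(
    "increment",
    ["not_updated", "not_updated", "updated", "updated", "not_updated"],
)
```

In this example the command is sent three times before the far end reports 'updated', followed by two 'hold' requests until 'not updated' is seen again.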
The 10GBASE-KR core does include logic that allows it to be trained by a far-end device
without user-interaction.
For special considerations when using link training with auto-negotiations see:
Xilinx Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see Xilinx
Support.
References
These documents provide supplemental material useful with this product guide.
1. IEEE Standard 802.3-2012, Carrier Sense Multiple Access with Collision Detection
(CSMA/CD) Access Method and Physical Layer Specifications
(standards.ieee.org/findstds/standard/802.3-2012.html)
2. IEEE Standard 802.3-2012, Media Access Control (MAC) Parameters, Physical Layers, and
Management Parameters for 10 Gb/s Operation
(standards.ieee.org/findstds/standard/802.3-2012.html)
3. 7 Series Transceivers User Guide (UG476)
4. UltraScale Architecture GTH Transceivers User Guide (UG576)
5. UltraScale Architecture GTY Transceivers User Guide (UG578)
6. Vivado Design Suite User Guide: Logic Simulation (UG900)
7. Vivado Design Suite User Guide: Designing with IP (UG896)
8. Vivado Design Suite User Guide: Getting Started (UG910)
9. Vivado Design Suite User Guide: Designing IP Subsystems using IP Integrator (UG994)
10. Vivado Design Suite User Guide: Programming and Debugging (UG908)
11. ISE to Vivado Design Suite Migration Guide (UG911)
12. Vivado Design Suite User Guide: Implementation (UG904)
Revision History
The following table shows the revision history for this document.