Best Lab Practices for Xilinx FPGA Board-Level Debug

© Copyright 2017 Xilinx

Table of Contents

Introduction
Chapter 1: Power Supplies
  Power Supply Sequencing & Ramp Up
  Power Supply Ripple & Noise
  Power Supply Decoupling and Filtering
  Power Supply Debug Checklist
  Measuring Power Supply Ripple and Noise
Chapter 2: Signal Integrity on Critical Nets
  Scope probes
  PCB probing
  Anticipate what you expect to see
  Signal at probe point vs die
  Inter-Symbol Interference (ISI)
Chapter 3: Debugging SSO Noise and Cross-Talk Issues
  SSN Debug
  SSN Mitigation Methods
  Board-Level Cross-Talk
Chapter 4: Debugging Jitter Issues
  Period Jitter
    Period Jitter Application
  Cycle-to-Cycle Jitter
  Time Interval Error (TIE)
  Wander
  Duty Cycle Distortion (DCD)
Chapter 5: Debugging Memory Interface Issues, Additional Considerations
  Reproduce the Error
  Isolating the Error
  Power Supply Noise
  Address, Command, and Control Signals
  DQ, DQS, and DM Signals
Chapter 6: Debugging High Speed Serial Interfaces, Additional Considerations
  Divide and conquer
  Components of a SerDes Link
    Stackup and Layout design
    Power Supplies
    Reference Clock
    Transmitter
    Receiver
    RCAL
  Scope the Problem
  Debug sequence of the serial line
  Fine tuning of TX FIR and RX Equalizers
  Tools for accurate HW debug
Conclusion
Additional References

Introduction

This paper provides recommendations of “best practices” for FPGA board-level debug in the lab. Whenever you are debugging a complex technical problem, it is important to apply a sound, basic scientific approach. In essence:

1) Brainstorm possible root causes and form a list of hypotheses to test in the lab.
2) Change only one variable at a time between tests.
3) Document the lab equipment used, the board name or serial number, the device name/marking, as well as the test results for each test.
4) Archive your test designs so that you can refer back to them at a later time.
5) Regardless of the issue you are debugging, checking the power supplies should always be the first step. See Chapter 1.
6) The board design should adhere to the rules outlined in the PCB Design User Guide:
a. 7 Series: UG483
b. UltraScale and UltraScale+: UG583
7) It is always recommended to simulate the design. This helps identify issues up front and gives an idea of what to expect when measuring signals.
8) All signals should be measured with the Nyquist (sampling) theorem in mind:
a. For power supplies, measuring noise up to 3 GHz is recommended.
b. For signals, the bandwidth should be at least 2x the signal rate or the rise/fall time bandwidth (whichever is greater).
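The bandwidth guideline above can be turned into a quick calculation. The following is a minimal sketch, not a Xilinx-provided tool: it assumes the common 0.35 / rise-time rule of thumb for estimating the bandwidth of an edge (that rule is an assumption and does not appear in this paper), and the function names are hypothetical.

```python
def rise_time_bandwidth_hz(rise_time_s: float) -> float:
    """Estimate edge bandwidth from the 10-90% rise time.

    Uses the 0.35 / t_r rule of thumb, which is an assumption of this sketch.
    """
    return 0.35 / rise_time_s


def required_scope_bandwidth_hz(signal_rate_hz: float, rise_time_s: float) -> float:
    """Return 2x the larger of the signal rate and the rise-time bandwidth."""
    return 2.0 * max(signal_rate_hz, rise_time_bandwidth_hz(rise_time_s))


# Illustrative numbers only: an 800 MHz signal with 200 ps edges.
example_bw_hz = required_scope_bandwidth_hz(signal_rate_hz=800e6, rise_time_s=200e-12)
print(f"Suggested minimum bandwidth: {example_bw_hz / 1e9:.2f} GHz")
```

In this example the edge rate, not the signal rate, dominates the requirement, which is often the case for fast I/O standards.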

Chapter 1: Power Supplies

With today’s much larger FPGAs and SoC devices, gone are the days when only a few supply rails were needed to power a device. Large FPGAs, such as our Virtex-7 2000Ts or our larger UltraScale and UltraScale+ product offerings, have more than 20 different types of supply rails. This is a design choice mainly driven by industry demands for higher speed, lower power, lower noise, higher design flexibility, etc.

Debugging problematic FPGA or MPSoC boards can be quite difficult if unprepared. Generally, the first rule of debugging efficiently and successfully is understanding the full system or structure of the design that is being debugged. Power supplies play a major role in the functionality of any electronic circuit; FPGAs/MPSoCs are no exception. As such, verifying the device’s supply rail integrity is the best place to start.

Generally, most power supply related issues tend to be more complex than simply verifying that the DC level of each supply rail meets the Xilinx data sheet specifications. However, the best place to start is often to place a voltmeter on each of the board power rails as close as possible to the FPGA or MPSoC, and verify that the values are what they should be. Semiconductor devices, including Xilinx devices, rely on their supply rails being within a particular min/max margin during operation to perform reliably and at the performance specifications outlined by the device’s data sheet. Under- or over-voltage of Xilinx devices can have adverse effects and should not be underestimated or overlooked when debugging power supply issues. With under-voltage supplies, undefined functionality and compromised device performance are not uncommon. With over-voltage, depending on the degree and time of exposure, devices can become unreliable or, in the extreme case, overstressed to the point of damage. Electrical overstress is not discussed in this paper; however, it should be kept in mind when debugging any device with issues not attributable to general user error. It is not unheard of for a device to be overstressed to a degree of small internal damage that manifests itself as uncharacterized device behavior.

Power integrity engineering best practice is to model the power system using a power integrity CAD tool. Extracted PCB parameters and models of all components should be used. Modeling up to the FPGA device should be sufficient to resolve problems. Internal package models are available if needed later to resolve issues. These can be requested through Xilinx support (via distributor, forums, or service request).

Beyond this, debugging power supply issues becomes more involved. If you are dealing with a problematic device (especially if it is problematic at power-on only), a good next step is to check whether the ramp rates and sequencing of the power rails meet the power on/off and sequencing requirements specified in the device’s data sheet. If the DC levels are good and the supply ramp rates and sequencing meet all requirements, power supply rail noise (AC) might be the culprit of board or system-level issues. Power supply rail noise can be caused by devices external to the FPGA, such as switching regulators, or by the FPGA itself, such as from excessive simultaneously switching outputs (SSOs). The noise must not cause the voltage levels to exceed the recommended operating values (min and max).
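The first-pass DC check described above is easy to script once the voltmeter readings are logged. The sketch below is illustrative only: the rail names, limit values, and function names are hypothetical placeholders, and the real min/max limits must be taken from the data sheet for the device in question.

```python
from typing import Dict, NamedTuple


class RailLimits(NamedTuple):
    v_min: float
    v_max: float


# Hypothetical limits for illustration only; use the device data sheet values.
EXAMPLE_LIMITS: Dict[str, RailLimits] = {
    "VCCINT": RailLimits(0.97, 1.03),
    "VCCAUX": RailLimits(1.71, 1.89),
}


def check_rails(measured_v: Dict[str, float],
                limits: Dict[str, RailLimits]) -> Dict[str, str]:
    """Classify each rail as 'ok', 'under', 'over', or 'not measured'."""
    results = {}
    for rail, lim in limits.items():
        if rail not in measured_v:
            results[rail] = "not measured"
        elif measured_v[rail] < lim.v_min:
            results[rail] = "under"
        elif measured_v[rail] > lim.v_max:
            results[rail] = "over"
        else:
            results[rail] = "ok"
    return results


# Readings taken as close as possible to the device, with the design running.
print(check_rails({"VCCINT": 0.95, "VCCAUX": 1.80}, EXAMPLE_LIMITS))
```

A rail that passes this DC check can still fail on ramp rate, sequencing, or AC noise, which the following sections cover.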

Power Supply Sequencing & Ramp Up

Power supply sequences for Xilinx devices can differ between device families. However, most follow a sequence from lowest to highest voltage (VCCINT, then VCCAUX, then VCCO) for power on and the reverse for power off. Devices have a recommended sequence that minimizes current demand from system power supplies and guarantees the I/O behavior during power up/down. Failure to follow the recommended sequence can result in higher current demand and potentially improper I/O behavior during power-on. Improper initialization of I/Os can result in increased current spikes during power-up, and in some cases damage can occur.

Only sequences noted in the data sheet as prohibited must be avoided. All other sequences are allowed, subject to the current demand and I/O behavior described above. Figure 1-1 below shows an example of the recommended power sequence for a 7 Series FPGA.

Figure 1-1: Example Power up/down Sequence for 7 Series FPGA.
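A scope capture of the rails at power-on can be reduced to the time at which each rail reaches its valid level, and the resulting order compared with the recommended one. The following is a rough sketch under that assumption. The rail order is the generic one stated in the text and must be confirmed against the data sheet for the device family; the function names and timestamps are hypothetical.

```python
from typing import Dict, List

# Generic lowest-to-highest order from the text; confirm per device family.
RECOMMENDED_ORDER = ["VCCINT", "VCCAUX", "VCCO"]


def power_on_order(valid_times_s: Dict[str, float]) -> List[str]:
    """Return rail names sorted by the time each rail reached its valid level."""
    return sorted(valid_times_s, key=valid_times_s.get)


def follows_recommended_order(valid_times_s: Dict[str, float],
                              recommended: List[str] = RECOMMENDED_ORDER) -> bool:
    """Compare the observed power-on order with the recommended order."""
    expected = [rail for rail in recommended if rail in valid_times_s]
    observed = [rail for rail in power_on_order(valid_times_s) if rail in recommended]
    return observed == expected


# Illustrative timestamps (seconds after the input supply is applied).
print(follows_recommended_order({"VCCINT": 0.001, "VCCAUX": 0.003, "VCCO": 0.005}))
```

A mismatch here is not automatically a failure, since only sequences the data sheet lists as prohibited must be avoided, but it flags a case worth checking for current spikes and I/O behavior.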

Designing your supplies such that they are clean and monotonic is a must. This is also true of any other semiconductor device. Not doing so can introduce a host of issues, most of which are not characterized by Xilinx. The power supply ramp rate requirements are not just for a successful configuration. They are also crucial for the internal Power-On-Reset (POR) and other initialization circuitry that becomes active during power up and is triggered off certain supply rail voltage thresholds. For example, if the supply voltage were to rise above its POR threshold on power up, then dip below the POR’s hysteresis range and rise again, the VCO could change frequency, phase, or both, causing a PLL to fall out of phase and lose synchronization.
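A captured ramp can be screened for non-monotonic behavior before inspecting it by eye. This is a minimal sketch with a hypothetical function name; the tolerance is an assumed allowance for measurement noise and should be chosen from the noise floor of the actual capture.

```python
from typing import Optional, Sequence


def first_non_monotonic_index(samples_v: Sequence[float],
                              tolerance_v: float = 0.0) -> Optional[int]:
    """Return the index of the first sample that falls more than tolerance_v
    below the highest voltage seen so far, or None if the ramp is monotonic."""
    running_max = float("-inf")
    for i, v in enumerate(samples_v):
        if v < running_max - tolerance_v:
            return i
        running_max = max(running_max, v)
    return None


# Illustrative ramp with a dip partway up, similar in spirit to Figure 1-2.
ramp_v = [0.0, 0.4, 0.7, 0.5, 0.9, 1.0]
print(first_non_monotonic_index(ramp_v, tolerance_v=0.05))
```

The returned index points at the start of the dip, which can then be compared against the POR threshold and hysteresis range from the data sheet.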

Figure 1-2: Non-Monotonic Ramp Up of Vccint Example

Power Supply Ripple & Noise

Power supply ripple and noise are often lumped together when talking about power supply fidelity. In general, power supply ripple is the larger swing in the power supply caused by the charging and discharging of each storage element in the power circuitry. Thus, you will see ripple components from the voltage regulator module, larger decoupling/bulk capacitors, and any other power circuitry used in the power distribution network. Everything else can be listed under power supply noise. This generally includes the higher frequency noise content brought on by switching noise from the load device, crosstalk from neighboring signal traces or planes, and other parasitics in the power distribution network.

The power supply ripple component is to be controlled and designed effectively by the hardware engineer. Usually this comes down to proper component selection to meet the power demands of the design on the board. Power supply noise is mitigated by everything else in the board design process: proper component placement, trace routing/layout, pad size/orientation, proper board stackup, etc. Xilinx provides detailed PCB design recommendations in the device family PCB user guide.

It is essential to keep your power supply clean and minimize any large ripple and noise swings. This is especially the case for high-speed interfaces that run at gigahertz rates and above, where the timing margins are in the hundreds of picoseconds. Margins will only shrink as higher speed and lower voltage devices are fabricated in the future. Additionally, increases in data processing make this an even more crucial consideration. Re-spinning boards can be quite costly and time consuming compared with the smaller initial investment in better component selection, board design, layout, and placement.

Power Supply Decoupling and Filtering

Modern FPGAs and MPSoCs operate at lower voltages and higher current loads than their older counterparts. This increases the need for a robust power distribution system. A good power distribution network can mitigate power supply ripple and noise generated by the surrounding system, facilitating proper functionality of the FPGA/MPSoC device and board. In a good power distribution network with an optimal board layout, the proper selection of the power components, including their types, quantities, and values for each of the device’s supply rails, is paramount. Unlike most devices in the semiconductor industry, FPGAs/MPSoCs operate over a large range of frequencies and thus require a broad range of power solutions to supply the current demands at all corners and ensure voltage ripple and noise are minimized.

As stated previously, modern FPGAs/MPSoCs contain multiple supply rails, which power the different types of resources within the device. With so many supply rails, it is important to know which supply rail should be checked and verified against Xilinx data sheet specifications. Table 1-1 below lists the resources each rail supplies in the FPGA/MPSoC. If the issue has already been narrowed down to a particular on-chip resource or block, check these rails first. For example, in a DDR3/4 memory application, if the signal integrity of a 1.2V HSTL interface with DCI is marginal and causing distortion of the input data, the VCCO of the DCI bank should be measured.

Table 1-1: Power Supply Resources Table

Notes:
1. These resources are available only in certain device families. Refer to the appropriate data sheets and user guides for more information.
2. VCCO in bank 0 (VCCO_0 or VCCO_CONFIG) powers all I/Os in bank 0 as well as the configuration circuitry. See the applicable Configuration User Guide.
3. Xilinx 7 series Block RAM/FIFO only.
4. Xilinx 7 series High Performance (HP) I/O banks only.

Power Supply Debug Checklist

The following Power Supply Debug Checklist is a good starting point when debugging power rails on an FPGA board.

Power Supply Debug Check List:

1) Check the DC level of all power rails on the board (including Vref and Vtt) with a voltmeter or an oscilloscope with adequate resolution and accuracy. Are they within the recommended DC specifications of the device or devices that are having issues? Check as close as possible to the FPGA with the full design running, so as to include any IR drop due to high current loads and the impedance of the power supply delivery path on the board. Also check close to the regulator outputs. A large voltage difference between the two measurement points implies that a large IR drop is occurring, which may be a symptom of a poorly designed layout for the supply. Use the SYSMON or XADC to verify that Vccint and Vccaux are within limits. Note that the min and max values can capture excursions lasting as little as 1 nanosecond. These values can be viewed in the Vivado Hardware Manager over the JTAG connection.

2) If the board issue occurs directly after power-on or power-off, check that the power supply ramp-rates and sequences are within the requirements of all devices on the board. Xilinx devices can have requirements for both.

3) With a high-speed oscilloscope and low-impedance probes, measure the power supply noise for each power rail as close as possible to the device or devices that are having issues (see the following section, “Measuring Power Supply Ripple and Noise”). This should always be done with the FPGA design configured and running.

4) For the Xilinx FPGA/MPSoC device, ensure the recommended decoupling capacitor guidelines are met and the power delivery system is well designed. These guidelines can be found in the family-specific PCB Design and Pin Planning User Guide. This step is particularly important.
a. Check that the decoupling capacitor recommendations are met:
i. Number of each capacitor
ii. Capacitance values
iii. Voltage ratings
iv. L-series
v. R-series
vi. Mounting guidelines
b. Examine how the power delivery system was designed in the PCB layout, particularly the rails for devices that are problematic.
i. Make sure that each high-current rail is either a full power plane layer or a partial plane design that can handle the current requirements of the loads. It is not recommended to use traces to provide power to the FPGA power rails, with the exception of the Vref pins, as they have no real current load.
ii. Check that the power supply “sense” connection is directly under the FPGA for any power rails specific to the FPGA, particularly high-current rails such as Vccint.

5) For memory interfaces or other high-speed interfaces that are having issues, is there a noise pattern associated with the data errors?

a. If yes, what does the AC noise on the FPGA’s Vccint, Vcco, and Vccaux voltage rails look like at the time of the data error?

b. Additionally, check the memory device voltages as close as possible to the memories and during the time of the data error.

c. Check whether the FPGA reference voltages for the I/O Standards in use are stable, clean, and are within data sheet recommendations. These mainly include termination voltages for Vref and DCI based I/O Standards.
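The IR drop check from step 1 of the checklist amounts to simple arithmetic on two DC readings. The sketch below is illustrative: the function names and the example voltages and current are hypothetical, and what counts as an acceptable drop depends on the rail tolerance in the data sheet.

```python
def ir_drop_v(v_at_regulator: float, v_at_fpga: float) -> float:
    """Voltage lost across the delivery path between regulator and device."""
    return v_at_regulator - v_at_fpga


def effective_path_resistance_ohm(v_at_regulator: float,
                                  v_at_fpga: float,
                                  load_current_a: float) -> float:
    """Estimate the DC resistance of the delivery path from the IR drop."""
    if load_current_a <= 0:
        raise ValueError("load current must be positive")
    return ir_drop_v(v_at_regulator, v_at_fpga) / load_current_a


# Illustrative readings taken with the full design running.
drop_v = ir_drop_v(1.000, 0.970)
r_ohm = effective_path_resistance_ohm(1.000, 0.970, load_current_a=10.0)
print(f"IR drop: {drop_v * 1e3:.0f} mV, path resistance: {r_ohm * 1e3:.1f} mOhm")
```

A drop of this size on a low-voltage core rail can consume a large share of the allowed tolerance, which is why the sense connection location in step 4 matters.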

Measuring Power Supply Ripple and Noise

When measuring AC noise on a power rail, it is critical to take measurements with proper equipment and techniques. This section outlines these guidelines. How and where you take your measurement is critical!

1) USE EQUIPMENT THAT MEETS THE NEED! For scope probes, use <1pF high-performance probes with a vertical resolution of 1% of the nominal supply level or finer, and a scope with a minimum bandwidth of 3 GHz (faster is better). The bandwidth of the scope and probes should be at least twice the frequency of the fastest effects that you need to measure, such as edge rates or glitches. Alternatively, instead of high-performance probes, coaxial cables can be used, preferably micro-coaxial cable soldered to the point of measurement and terminated to 50 Ohms internal to the scope.

2) CALIBRATE YOUR PROBE! Check the probe accuracy/resolution. You need to be able to resolve down to 10-20mV, or even 5mV for the 0.9V or 1.0V supplies such as Vccint.

For example, the P7380A (Tektronix) only has accuracy to +/-50mV.

3) Measurements need to be taken at the fan-out or in-pad vias underneath the FPGA or problematic device, with the probe tip placed at or soldered to a Vcc pin via, and the probe ground on a neighboring ground pin via (the idea is to minimize the probe loop length).

4) Avoid measuring across a decoupling cap – if this is the only option, remove a cap from the board to take the measurement.

5) Avoid measuring across other active components as noise from these can be coupled onto the probe especially when using lower end probes with long ground leads.

6) Avoid using low-bandwidth probes with long ground leads. If you have to use such probes, try to minimize the inductive antenna they create by using a shorter ground wire/lead, or wrap the lead in a coil around the signal probe. Minimize the signal-to-ground impedance by reducing the length of the ground lead and the separation distance between the signal and ground leads as much as possible.

Figure 1-3: Improper power supply measurement

Figure 1-4: Improved power supply measurement

7) Use DC offset to center the signal vertically on the scope screen to increase the dynamic range. If the scope cannot provide the required offset, a DC block or a 10uF capacitor in series with a high-impedance probe can be used.

8) Set the scope to use infinite persistence.

9) Set the horizontal resolution to 10ns/division (you might need to adjust this to capture the switching noise).

10) Set the vertical resolution to 5mV or 10mV/division, or as low as possible.

11) Set the trigger level to just above the nominal Vcc level, and gradually increase the trigger level, which should catch larger and larger peaks.

12) Repeat in the negative direction (below the nominal Vcc level).

13) If the noise peaks violate the Data Sheet specifications for the power supply rail, then the next steps are to analyze the frequency components of the noise for the purpose of identifying the potential causes of the spikes.

It is possible to open a service request with WTS requesting Matlab analysis of the power supply noise. To save the capture for this type of analysis, adjust the vertical amplitude so that the signal spans approximately 80% of the window vertically, remove infinite persistence, and adjust the horizontal scale to capture the max memory depth the scope will allow while maintaining its highest sampling rate. Save this waveform in binary format. The following Oscilloscope file formats are preferred:

LeCroy = .trc

Tektronix = .wfm

Agilent = .bin

All = .csv with amplitude and time stamps (or .txt as a last resort)
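Before sending a capture for analysis, a saved .csv file with time stamps and amplitudes can be given a rough first look locally. The sketch below is not the analysis Xilinx performs. It assumes a two-column CSV (time in seconds, amplitude in volts) with uniform sampling, uses hypothetical function names, and applies a plain discrete Fourier transform that is only practical for short records; a long capture would need an FFT library.

```python
import cmath
import csv
import io
import math
from typing import List, Tuple


def load_capture(csv_text: str) -> Tuple[List[float], List[float]]:
    """Parse 'time,amplitude' rows, skipping any non-numeric header rows."""
    times, volts = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        try:
            t, v = float(row[0]), float(row[1])
        except (ValueError, IndexError):
            continue
        times.append(t)
        volts.append(v)
    return times, volts


def peak_to_peak_v(volts: List[float]) -> float:
    """Peak-to-peak amplitude of the captured rail."""
    return max(volts) - min(volts)


def dominant_frequency_hz(times: List[float], volts: List[float]) -> float:
    """Frequency of the largest non-DC DFT bin (uniform sampling assumed)."""
    n = len(volts)
    if n < 2:
        raise ValueError("need at least two samples")
    dt = (times[-1] - times[0]) / (n - 1)
    mean = sum(volts) / n
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        acc = sum((v - mean) * cmath.exp(-2j * math.pi * k * i / n)
                  for i, v in enumerate(volts))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k / (n * dt)
```

The dominant frequency can then be compared with known sources on the board, such as the switching frequency of a regulator or the clock rate of a heavily used interface.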

Measurements made at different points on the FPGA power delivery network:

Figure 1-5: Measurement with signal close to FPGA and distant ground

Figure 1-6: Measurement across pads outside FPGA periphery

Figure 1-7: Measurements across a decoupling capacitor outside FPGA

Example of Power Supply Noise measurement using infinite persistence:

Figure 1-8: Power supply noise with infinite persistence

Chapter 2: Signal Integrity on Critical Nets

Signal Integrity (SI) is loosely defined as a measure of the quality of an electrical signal. Measuring the SI of a signal on a board is really just another way of saying that you want to observe what the signal looks like with an oscilloscope. At the typical frequencies of Xilinx SelectIO interfaces (kHz to 1+ GHz), a real-time sampling oscilloscope is a good choice. It is important to ensure that the bandwidth of the scope can handle the frequencies associated with what you need to observe or measure. Typically, the recommendation is to have the sampling rate be at least twice the edge rate of the fastest signal or noise components that you are interested in.

All signals should be simulated in a Signal Integrity (SI) engineering CAD tool using Xilinx IBIS models and the extracted board parameters before board fabrication. When verifying the resulting board, the measured signals should agree with the simulated results.

Scope probes

For high-speed signals (>100MHz), it is important that good quality, low-capacitance (<1pF) scope probes are used. These can be either active or passive probes. It is also critical to use probe tips that are as short as possible, both on the main probe tip and the ground tip.

Any notable additional length on these tips adds impedance from the inductance and capacitance of the leads/wires, which can distort the signal compared to what it normally looks like at the probe point. Additionally, when wires are used, for example to attach scope probe grounds to a ground location on a board, they can even act as antennae that pick up EMI surrounding the area being probed.
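The effect of lead length can be estimated with the standard LC resonance formula, f = 1 / (2π√(LC)), where L is the inductance of the ground lead and C is the probe input capacitance. This is a rough sketch rather than a probe model: the function name is hypothetical and the inductance and capacitance values are illustrative assumptions, not figures from any probe data sheet.

```python
import math


def lead_resonance_hz(lead_inductance_h: float, probe_capacitance_f: float) -> float:
    """Resonant frequency of the ground lead inductance with the probe capacitance."""
    return 1.0 / (2.0 * math.pi * math.sqrt(lead_inductance_h * probe_capacitance_f))


# Illustrative comparison: a long ground lead on a higher-capacitance probe
# versus a short ground tip on a low-capacitance probe.
long_lead_hz = lead_resonance_hz(100e-9, 10e-12)
short_tip_hz = lead_resonance_hz(5e-9, 1e-12)
print(f"Long lead: {long_lead_hz / 1e6:.0f} MHz, short tip: {short_tip_hz / 1e9:.2f} GHz")
```

Signal content near the resonant frequency shows up as ringing that is created by the probe rather than by the board, so the resonance should sit well above the bandwidth of interest.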

It is good to start with a selection of gold-plated scope probe tip attachments, and to use the shortest option for connecting the scope probe to the board. Figure 2-1 is an example of a scope probe tip with a ground tip.

Figure 2-1

Whenever possible, it is even better to use a soldered-down probe tip. Figure 2-2 shows an example of a solder-down differential probe tip from Agilent. In addition to ensuring a short, good-quality connection to the probe point, this type of probe tip also provides a more rugged connection, so you no longer need to physically hold the probe down.

Figure 2-2: Solder-down differential probe

Figure 2-3: Example of solder-down differential probes on PCB

PCB probing

When you are trying to measure an input to a device, it is critical that you measure as close as possible to the pin. Typically, for a Xilinx FPGA or any high-speed device in a BGA package, this will be at a through-hole “fan-out” or “dogbone” via, commonly located diagonally adjacent to the BGA ball landing for the pin.

Modern board manufacturing techniques allow these vias to sometimes be located right in the center of the BGA pads, known as “via-in-pad”. However, there might be no vias in or around the BGA pin that you want to probe, as when the vias are “blind vias” (not traversing all the way through the PCB layers) as opposed to “through-hole vias”. In those cases, you will need to become more creative in locating a good probe point.

An example of “fan-out” or “dogbone” vias for a Xilinx FPGA is shown in Figure 2-4. The smaller circles are the ball landings, while the larger ones that are diagonal to the smaller ones are the through-hole vias:

Figure 2-4

An example of “via-in-pad” is the large BGA on the left in Figure 2-5. This approach is becoming more popular because the price premium for the board manufacturing effort has decreased, and because both signal quality and board routability are improved. Note that the smaller devices (SDRAM ICs) to the right still have fan-out vias.

Figure 2-5

Some signals are extremely sensitive to having test points or even through-hole vias on the signal lines,

as is the case with LPDDR4. Designing the PCB with the ability to use a memory interposer will allow for

clean memory signals. Figure 2-6 is an example LPDDR4 interposer from Keysight. An interposer

solution does require additional PCB space around the memory device.

Figure 2-6

Anticipate what you expect to see

It is very important to consider what you are trying to learn or capture. Typically for I/Os, you will want

to ensure that the important electrical specifications from the data sheet are being met. For most

CMOS devices this typically will include the minimum/maximum voltage levels for the signal to register a

transition on the input buffer: Vil/Vih (AC), as well as some maximum overshoot/undershoot

requirements. These are shown in the device data sheets, and are often a good first step when

analyzing any I/O signals.

It is also important to make sure that the rising/falling edges are monotonic (no glitches) within the

Vil/Vih or Vicm/Vid switching regions, particularly on any clock or strobe inputs. If a timing issue is

suspected as a problem, you might need to also check setup and hold time by observing both a data and

clock or strobe signal simultaneously in a scope window.
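The level and edge checks described above can also be applied to a waveform exported from the scope. The following is a minimal sketch, assuming the capture is available as a plain list of voltage samples and that `samples` for the monotonicity check contains a single edge. The function names and the threshold arguments are hypothetical; the actual Vil/Vih and overshoot/undershoot limits must come from the device data sheet.

```python
def check_input_levels(samples, vil, vih, v_over_max, v_under_min):
    """Return simple pass/fail checks for one captured waveform (volts)."""
    return {
        "reaches_high": max(samples) >= vih,         # signal gets above Vih
        "reaches_low": min(samples) <= vil,          # signal gets below Vil
        "overshoot_ok": max(samples) <= v_over_max,  # stays under overshoot limit
        "undershoot_ok": min(samples) >= v_under_min,
    }


def edge_is_monotonic(samples, vil, vih):
    """True if the samples inside the Vil..Vih region never reverse direction.

    Assumes `samples` holds exactly one rising or one falling edge.
    """
    region = [v for v in samples if vil < v < vih]
    rising = all(b >= a for a, b in zip(region, region[1:]))
    falling = all(b <= a for a, b in zip(region, region[1:]))
    return rising or falling
```

A capture that dips back down while still between Vil and Vih makes `edge_is_monotonic` return `False`, which is the kind of glitch to look for on clock or strobe inputs.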

Signal integrity resources are available here:

https://www.xilinx.com/products/technology/signal-integrity.html

Table 2-1, Example DC input and output voltage levels for Virtex UltraScale, HR I/O Banks (DS893)

Table 2-1

Table 2-2, Example Vin maximum AC voltage overshoot and undershoot for Virtex UltraScale, HR I/O

Banks (DS893)

Table 2-2

Signal at probe point vs die

For high-speed signals in general it is also important to consider that the signal will always look different

at the die pad inside the package compared to any probe point on the board, even measurements taken

directly underneath the FPGA or other device at the fanout vias. The parasitic impedances associated

with the package and die will always distort the signal at some level. Use of optional internal

termination features inside Xilinx FPGAs, such as Split-Termination DCI, IN_TERM, and DIFF_TERM, will

often create a much cleaner signal at the die pad compared to what is measured on the board. For this

reason, when a signal integrity issue is suspected on a board signal, in addition to the scope

measurements, you may also need to run SI simulations – either IBIS or HSPICE. The first step is to

attempt to correlate the measured signal on the bench to a simulation at the same probe point.

Once there is confidence in the board, package, and die models, through achieving good correlation at

the probe point, you can then use the simulation to show how different – better or worse – the SI is at

the actual die pad of the receiver. A short but good summary of this can be found in Xilinx Answer

Record 31922 – Figure 2-7 is an illustration from that AR showing the difference in signal quality

between the simulation results at the pin and at the die pad:

Figure 2-7

Inter-Symbol Interference (ISI)

ISI occurs when a receiver sees information from prior bits interfere with following bits. This is caused

by reflections. The severity of ISI depends on the routing topology of the transmission line, impedance

mismatches and termination.

Figure 2-8 is an extreme example of ISI. It is possible for single-cycle pulses to get “swallowed” at the

receiver.

Figure 2-8

You can minimize or eliminate the ISI effect in order to focus your examination on the other factors

contributing to the SI such as impedance mismatches, or drive strength and termination issues. One

simple way to accomplish this is to slow down the data rate so that the unit interval is long

relative to the rise time (Unit Interval >> Rise Time) – typically by slowing down the clock of the

interface. The rise time will remain the same, and the effects of the transmission line to the signal edges

will remain the same, but you can effectively remove the effect of ISI from the measurements. Note

that proper use of the SI engineering CAD tool would have identified any crosstalk issues and allowed

their correction prior to fabrication.
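To judge how far the data rate needs to be reduced, it can help to put numbers on the "Unit Interval >> Rise Time" condition. The sketch below is illustrative only: the function names are hypothetical, and how large the ratio must be before ISI is negligible is a judgment call that depends on the interface.

```python
def unit_interval_ns(data_rate_mbps):
    """Unit interval in nanoseconds for a per-pin data rate given in Mb/s."""
    return 1000.0 / data_rate_mbps


def ui_to_rise_ratio(data_rate_mbps, rise_time_ns):
    """How many rise times fit into one unit interval at this data rate."""
    return unit_interval_ns(data_rate_mbps) / rise_time_ns


# Example with made-up numbers: a 0.25 ns rise time at two data rates.
fast = ui_to_rise_ratio(1600, 0.25)  # UI is only a few rise times long
slow = ui_to_rise_ratio(200, 0.25)   # UI is many rise times long
```

Because the rise time does not change with the clock, only the unit interval grows as the interface is slowed down, so the ratio scales inversely with the data rate.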

Chapter 3: Debugging SSO Noise and Cross-Talk Issues

A book could be written on the subject of noise but we’ll focus on a few of the most prevalent types.

Coupled noise is noise that is introduced through inductive and capacitive coupling from an aggressor

signal to a victim signal. This can occur within a device’s package or on the traces of a PCB. It is

typically thought of as a voltage issue (vertical); however, it can also cause a disturbance of signal

timing (horizontal). This is a hazard that is typically guarded against in proper device package designs

and in using good PCB layout techniques. These layout techniques include keeping required distances

between signal traces on the same routing layer as well as in adjacent routing layers. These techniques

apply equally to package designs and PCB layout designs.

Two common forms of coupled noise issues that you may encounter when debugging an FPGA board are

internal Simultaneously Switching Output Noise (commonly referred to as either SSO or SSN), and

board-level cross-talk.

SSN Debug

SSO (simultaneous switching outputs) / SSN (simultaneous switching noise) is also referred to as GND or

VCC bounce. This occurs when multiple outputs switch simultaneously, generating voltage drops within

the die package power distribution network.

The magnitude of SSO/SSN noise depends on a few key factors, including:

• The proximity of the victim die pad and BGA ball location with respect to aggressor SSO die pads and ball locations

• The aggregate number of aggressor SSOs and their proximity to the victim

• The proximity of neighboring ground and Vcc die pads and BGA balls

• The package design for the FPGA/device, in particular, its PDN for the Vcco power rails. For example, on-die and on-package capacitors can help reduce the effects of SSOs.

• In an overall system sense, it is also important to minimize the inductance between the VCC and GND planes to further reduce the magnitude of the SSN noise. Proper PCB design can go a long way towards not having to respin a board later in the production cycle.

If you suspect an SSO/SSN related problem with a Xilinx FPGA device, first check what the SSN analyzer

in Vivado or PlanAhead shows for estimated noise on any of the I/O pins of the FPGA design. For more

information on running SSN analysis, please reference Xilinx I/O and Clock Planning user guide UG899.

Figure 3-1: Example Vivado SSN Report

Debugging an SSO issue in the lab can be difficult. It often helps to use a few different methods to home

in on the source. One method is to extract an FFT of a suspected victim pin and observe the frequency

and magnitude of all of the major frequency components. This can be particularly useful when the

aggressor frequency or frequencies are asynchronous to the victim pin’s switching frequency. On the

other hand, if the aggressors are suspected of being synchronous to a victim pin’s switching frequency,

you might also be able to use infinite persistence from a scope capture on the victim pin to look for

occasional noise that occurs at a particular point in the clock period.
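Most scopes provide an FFT function directly. If only the raw samples of the victim pin are available, the frequency-domain view described above can be approximated offline. The following is a minimal sketch using a plain discrete Fourier transform from the Python standard library; the function names are hypothetical, and the O(N²) loop is only practical for short records.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Single-sided DFT magnitudes of a real-valued capture (DC removed)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove DC so it does not dominate
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        mags.append(abs(acc) / n)
    return mags


def strongest_components(samples, sample_rate_hz, count=3):
    """Return (frequency_hz, magnitude) for the largest non-DC bins."""
    mags = dft_magnitudes(samples)
    n = len(samples)
    ranked = sorted(range(1, len(mags)), key=lambda k: mags[k], reverse=True)
    return [(k * sample_rate_hz / n, mags[k]) for k in ranked[:count]]
```

Frequency components that do not match the victim pin's own switching frequency or its harmonics are candidates for asynchronous aggressors.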

An example of this “infinite persistence” approach is shown below on the P-side and N-side of a DDR

SDRAM clock signal. The P-side shows a noise incursion occasionally occurring just after the rising edge,

whereas the N-side scope shot shows the noise incursion just after the falling edge – essentially at the

same point in time.

CK (P-side) with SSO noise:

Figure 3-2: CK (P-side) with SSO Noise

CK# (N-side) with SSO noise:

Figure 3-3: CK# (N-side) with SSO Noise

Consider creating FPGA test designs that can help determine if this is an issue. For example, for a

memory interface, it is possible to create a testbench that only does WRITE bursts and uses test modes

(perhaps controlled through ChipScope) that 3-state a group of signals at a time: upper/lower DQ bytes,

groups of Address signals, etc. Then use your scope to observe the effect on the victim pin with each

test design.

SSN Mitigation Methods

The most effective way of mitigating SSN is to follow the proper guidelines documented within the FPGA

vendor’s PCB/board user guides. For example, for 7 Series and UltraScale/+, UG483 and UG583 contain

the required guidelines for properly implementing a good power distribution network. Here are a few

recommendations on reducing design SSN:

• Identify potential SSOs and spread them around the package.

• Avoid placement of asynchronous pins (resets, enables, etc.) near SSOs.

• Place SSOs away from clock pins/traces.

• Properly decouple VCC/GND pairs to filter out noise.

• Use I/O Standards that have a lower SSN impact for the failing group. Changing to a lower drive strength, a parallel-terminated DCI I/O standard, or a lower class of driver can improve SSN. For example, changing SSTL Class II to SSTL Class I.

• Low-pass filters can be used to filter out the glitches at the PCB level.

• Whenever possible, create synchronous designs that are glitch tolerant.

• Because they have better noise margin, use 3.3V CMOS compliant inputs when possible.

• Increased capacitive load decreases the amplitude of the ground bounce by reducing the output slew rate.

• Staggering the switching outputs by inserting delays in the design, so that switching is not simultaneous, is another option that requires no change in the PCB layout. It can therefore be applied after the board has already been manufactured. The offsets can be created by inserting the proper I/O delay components into the output signal path.

Board-Level Cross-Talk

Board-level cross-talk typically occurs when signal traces on the PCB run in parallel for extended lengths

in close proximity. It is primarily caused by capacitive and inductive coupling of those parallel

segments. This can occur on the same PCB layer or even adjacent PCB layers.

It is usually worse on the outer layers where microstrip traces are used. Crosstalk can only be reduced or

minimized; you cannot completely eliminate it from the system, as it is a consequence of electromagnetic

physics. The magnitude of cross-talk will primarily depend on:

1) The trace spacing for traces on the same PCB layer or the dielectric thickness separating the

traces on adjacent PCB layers.

2) The length that the traces are parallel.

3) The driver edge rate (dV/dt). For a Xilinx FPGA, this can be a combination of both Slew and

Drive.

Figure 3-4: PCB Routing Example

If you suspect that the issue you are observing might be due to cross talk, request a board cross-talk

analysis report. Ideally, this should be a standard procedure for every Xilinx customer but in reality is

often skipped. As previously mentioned, Signal Integrity simulation software such as Mentor’s

HyperLynx SI or Cadence’s Allegro Sigrity SI has support for this and more.

A design rule check (DRC) limit of less than 150mV of cross-talk between signal traces is an older

“rule-of-thumb” and can be the default setting in a layout tool’s DRC if left unchanged. It is

important that the board designer sets these DRC limits according to the actual requirements of the I/O

standards and devices. With lower-voltage signaling, it is typically required that any noise coupling from

cross-talk be limited to much smaller voltages than 150mV. An integral part of this type of analysis is

also setting up the correct assumptions for the aggressor edge rates. These must reflect the edge rates

expected of the aggressing devices.

Many of the same techniques for debugging SSO noise issues can be applied to debugging board-level

cross-talk. Though most of the SSO and cross-talk issues can be avoided altogether in layout, there are a

few things you can do after the board has been built.

• When creating FPGA test designs, it can help to create multiple designs that 3-state various suspected cross-talk aggressors. You can use a scope on the victim pin and observe the effect from the suspected aggressors.

• Place possible aggressors away from critical nets. Space out possible aggressor I/Os between unused pins.

• Use signal traces that do not share the same plane and are not parallel in any way if adjacent. If adjacent, make sure the traces are orthogonal to each other so coupling regions are minimized.

• Place noise sensitive signal traces on the inner signal planes with close proximity to ground and power planes. This will also help reduce EMI.

• Higher voltage aggressors will induce more crosstalk. Use lower voltage standards whenever possible.

• Use the narrowest traces with the highest impedance if the option is present.

• Use slow-edge signals by selecting slow slew and lower drive strength, or use a series termination to slow signal edges.

Figure 3-5 shows the victim pin which should be settled at a logic 0, while a cross-talk aggressor is shown

in magenta. Putting the scope in infinite persistence mode provided a method to measure the cross-talk

noise on the victim occurring on the rising and falling edges of the aggressor:

Figure 3-5: Aggressor and Victim Cross-Talk

Chapter 4: Debugging Jitter Issues

Jitter is the timing variation of a signal’s actual edge from the signal’s ideal edge. Essentially, jitter is the

effect of a rising or falling edge happening earlier or later in time compared to when it is expected to

happen. In non-GHz FPGA signaling, there are four primary forms of jitter that we tend to deal with, and

occasionally need to debug in the lab:

1) Period Jitter

2) Cycle-to-Cycle Jitter

3) Time Interval Error (TIE)

4) Duty-Cycle Distortion (DCD)

Figure 4-1 illustrates the first three forms of jitter:

Figure 4-1

For measuring jitter in the lab, some universal recommendations are:

• Most importantly, the SI for the signal being measured needs to be reasonably good. Any significant SI issues due to over or undershoot, poor termination, transmission line “stubbing”, impedance discontinuities in the trace, cross-talk, SSO, etc., can cause false data on the jitter measurements due to the signal distortion.

• Try to always probe at the end of a terminated line, as close to the termination resistor as possible.

• If you need to measure a clock or other signal that is internal to the FPGA, route the signal out on an unused or borrowed I/O pin using a differential I/O standard such as LVDS if possible. Choose a test point on the board that is designed for high-speed signaling and is preferably terminated.

• Use a high-bandwidth scope – with a bandwidth that is at least twice the frequency of the signal edge rate if possible. This will reduce quantization noise due to a low sampling rate.

• Adjust the vertical amplitude of the scope channel so that the signal takes up approximately 80% of the window (vertically). This will help minimize the quantization noise from the scope ADC.

• Understand what jitter measurement features are available from the scope you are using and use them to capture the measurements. If the scope has limited jitter measurement features, at a minimum, you can use the technique below for estimating Period Jitter.

• To derive the number of cycles that should be included in a jitter measurement, it will be important to understand the application. For example, the DDR3 SDRAM memory specification requires the CK/CK# signal to have a maximum cycle-to-cycle jitter that is supposed to be measured over a 200 cycle rolling period. For other applications, 10,000 cycles is a rule-of-thumb that is often applied. If you are trying to assess the jitter during some error/event you are debugging, you will need to make sure that the jitter measurement is actually taken during the event. Generally, measure maximum jitter until it stops increasing.

• Use soldered-down differential probes.

Figure 4-2: LeCroy solder-down differential probe

Period Jitter

Period Jitter is the overall variation of a clock or strobe period over the course of some length of time. If

not available as a measurement from the scope you are using, it may be estimated by setting the scope

for infinite persistence and adjusting the horizontal display to show a little more than one complete

clock cycle. If the scope triggers on the first edge (rising or falling), period jitter can be seen on the

second (same) edge by measuring the width of that next edge using vertical cursors after a reasonably

large number of samples are taken. This will provide a reasonable approximation to the period jitter.

Period jitter measurements can be helpful for calculating the system timing margins in Single-Data Rate

(SDR) systems, such as setup time (tSU) and clock-to-out time (tCO).

Period jitter should be measured over a sample of 10,000 cycles.
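If the scope can export the timestamps of successive same-direction edges, period jitter can also be estimated offline. The sketch below is one possible approach, not a scope vendor's algorithm: it assumes `edge_times` is a list of rising-edge (or falling-edge) timestamps in a consistent unit such as picoseconds, and it reports the spread between the longest and shortest period. The function names are hypothetical.

```python
def periods(edge_times):
    """Clock periods between consecutive same-direction edges."""
    return [b - a for a, b in zip(edge_times, edge_times[1:])]


def period_jitter_pk_pk(edge_times):
    """Peak-to-peak period jitter: longest period minus shortest period."""
    p = periods(edge_times)
    return max(p) - min(p)


# Example with made-up timestamps in picoseconds (nominal 10,000 ps period).
edges_ps = [0, 10000, 20100, 29900, 40000]
jitter_ps = period_jitter_pk_pk(edges_ps)
```

With a real capture, the list would span the number of cycles appropriate for the application, for example on the order of 10,000 cycles.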

Figure 4-3: Period Jitter measurement

Period Jitter Application

Period jitter is useful when calculating timing margins in digital systems. If the clock jitter is large

enough, this can impact setup or hold times.

Here is an example of clock jitter causing a data bit to be sampled before the data is valid.

Figure 4-4: Example of clock jitter causing data error

Period jitter can have a similar impact on hold time.

Period jitter should be measured over a sample of 10,000 cycles, or until it stops increasing.

Cycle-to-Cycle Jitter

Cycle-to-cycle jitter is a measurement of how much a clock or strobe period changes from any one cycle

to the subsequent cycle. Cycle-to-cycle jitter only involves the difference between two consecutive cycles,

with no reference to an ideal cycle. It is typically reported as a peak value in ps. Cycle-to-cycle

jitter can be measured by measuring the cycle times of two adjacent clock cycles (T1 and T2).

Calculate the absolute value of T1 – T2. Wait a random number of clock cycles and repeat the

measurement. JEDEC recommends capturing 1,000 measurements, or until it stops increasing.

Note that half of the peak-to-peak jitter effectively shortens the clock period. This can result in timing violations (unless

there is sufficient slack in the timing report). All of this jitter can be entered into Vivado to be used in

the implementation phase by modifying the default system jitter value from 100 picoseconds (peak to

peak) to agree with what your design is experiencing.
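The |T1 – T2| calculation described above can be sketched as follows. This is a simplification under stated assumptions: instead of waiting a random number of cycles between measurements, it evaluates every pair of adjacent periods in one contiguous list of same-direction edge timestamps. The function name and the input format are hypothetical.

```python
def cycle_to_cycle_jitter(edge_times):
    """Peak cycle-to-cycle jitter: largest |T1 - T2| over adjacent periods."""
    p = [b - a for a, b in zip(edge_times, edge_times[1:])]
    deltas = [abs(t2 - t1) for t1, t2 in zip(p, p[1:])]
    return max(deltas)


# Example with made-up timestamps in picoseconds.
edges_ps = [0, 10000, 20100, 29900, 40000]
c2c_ps = cycle_to_cycle_jitter(edges_ps)
```

Because only adjacent periods are compared, a slow drift in the clock period contributes little to this number even though it may show up in period jitter or TIE.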

Below is a diagram showing cycle-to-cycle jitter between three adjacent periods:

Figure 4-5: Cycle-to-cycle jitter

Time Interval Error (TIE)

TIE is the distortion of a clock signal’s phase for any given period/cycle compared to a golden/reference

clock. It is essentially how far each active edge of the clock varies from its ideal position. However, to

measure this requires a reasonable approximation to an ideal clock. In practice, this is often simply the

average clock period for the same clock that the TIE is being measured against. TIE is typically used to

measure the quality of a PLL.
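The idea of using the average clock period as the reference can be sketched as follows. This is an illustrative approximation: the ideal clock is assumed to start at the first captured edge and to run at the average period of the capture, and the function name and input format are hypothetical.

```python
def time_interval_error(edge_times):
    """TIE of each edge against an ideal clock built from the average period.

    The ideal clock is anchored to the first edge, so the first TIE value
    is always zero.
    """
    n = len(edge_times)
    avg_period = (edge_times[-1] - edge_times[0]) / (n - 1)
    return [t - (edge_times[0] + i * avg_period)
            for i, t in enumerate(edge_times)]


# Example with made-up timestamps in picoseconds.
tie_ps = time_interval_error([0, 10000, 20100, 29900, 40000])
tie_pk_pk = max(tie_ps) - min(tie_ps)
```

The resulting list of TIE values is what a histogram of the measurement would be built from.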

Illustration of the TIE concept:

Figure 4-6: Time Interval Error

Histogram for a TIE measurement – a useful way to analyze the TIE measurement:

Figure 4-7: Histogram for a TIE measurement

Wander

Slow jitter is referred to as “wander.” It is generally defined as the jitter (phase noise) below a 10 Hertz

low pass filter cutoff frequency. For our purposes, we are concerned here with the total unlimited

bandwidth jitter frequency spectrum. Wander is primarily an issue in large synchronized clock networks

and will not be discussed any further here.

Duty Cycle Distortion (DCD)

Duty cycle is the percentage of time that a clock or strobe signal is high during its period. DCD is the change or

distortion of the duty cycle of a clock or strobe compared to what it is expected to be, typically 50%.

DCD can be seen as variance in timing away from the ideal duty cycle and/or in variance of the offset

voltage level of the signal. In a DDR application, DCD on the strobe or clock can cause direct timing

errors due to altering the setup/hold between each unit interval (UI). See Figure 4-8 below, where clock

DCD is shown. In SDR applications, DCD can cause timing errors, particularly if the duty cycle is below

30% or above 70%.
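Duty cycle and its deviation from the expected value can be computed from rising-edge and falling-edge timestamps. The sketch below assumes the capture starts with a rising edge, so that `rise_times[i] < fall_times[i] < rise_times[i + 1]`. The function names and the 50% default are illustrative assumptions.

```python
def duty_cycle_percent(rise_times, fall_times):
    """Per-cycle duty cycle in percent (high time divided by period)."""
    result = []
    for i in range(len(rise_times) - 1):
        high = fall_times[i] - rise_times[i]
        period = rise_times[i + 1] - rise_times[i]
        result.append(100.0 * high / period)
    return result


def dcd_percent(rise_times, fall_times, expected=50.0):
    """Worst-case deviation of the duty cycle from the expected value."""
    return max(abs(d - expected)
               for d in duty_cycle_percent(rise_times, fall_times))


# Example with made-up timestamps in picoseconds.
duty = duty_cycle_percent([0, 1000, 2000], [450, 1500])
worst = dcd_percent([0, 1000, 2000], [450, 1500])
```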

Figure 4-8: Duty Cycle Distortion

Figure 4-9: Duty Cycle Distortion, scope capture

Many newer or higher-performance oscilloscopes include options tailored to probing DDR interfaces.

For example, in newer Tektronix scopes, DDR signal probing is simplified by their DDRA Wizard tool,

which includes measurement de-rating.

Chapter 5: Debugging Memory Interface Issues, Additional Considerations

One of the most commonly used forms of Xilinx IP is the external memory interface. Due to the

complexity of memory interfaces and the high signal speeds associated with them, debugging memory

interface issues can be complex and difficult. This chapter focuses on common trends for memory

interface hardware failures and on how techniques in the previous chapters can be used to resolve

them. Important documents that should be referenced when debugging issues with memory interfaces

are the PCB Design User Guide and debug guide chapters contained within the target architecture’s

Product Guide. Additionally, the Memory Interface Design Checklist is a great resource to verify all

design requirements are followed. Failure to follow the critical design requirements for the memory

interface IPs can be detrimental - especially at top data rates.

Reproduce the Error

Debugging memory interface errors requires the ability to look at both the expected and actual data

read back during reads. This is not always feasible when running a full system. The Memory Interface IP

includes Advanced Traffic Generators (ATG) to aid in data error debug. The ATG can be set up to send a

variety of address, command, and data patterns to reproduce data errors seen in true systems. The ATG

includes all of the diagnostic signals and test features to root cause the issue. Reproducing data errors

with the ATG can be the fastest way to root cause and resolve data errors. Refer to the UltraScale

Architecture-Based FPGAs Memory IP PG150 for full details.

Isolating the Error

One of the first things to try to identify is where and how the error is occurring. The following list is

helpful in isolating the error:

• Are the errors bit or byte errors?

o Are errors seen on data bits belonging to certain DQS groups?

o Are errors seen on specific DQ bits?

• Is the data shifted, garbage, swapped, etc.?

• Are errors seen on accesses to certain addresses, banks, or ranks of memory?

o For designs that can support multiple varieties of DIMM modules, all possible address and bank bit combinations should be supported.

• Do the errors only occur for certain data patterns or sequences?

o This can indicate a shorted or open connection on the PCB. It can also indicate an SSO or crosstalk issue.

• Determine the frequency and reproducibility of the error.

o Does the error occur on every calibration/reset?

o Does the error occur at specific temperature or voltage conditions?

• Determine if the error is correctable.
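When expected and actual read data have been logged, the bit-versus-byte questions in the list above can be answered with a simple tally. The following is a minimal sketch, not part of the Xilinx tooling: the function names are hypothetical, and it assumes that DQ bit `n` belongs to byte lane `n // 8`, which must be adjusted to match the actual DQ-to-DQS grouping of the board.

```python
from collections import Counter


def failing_bits(expected, actual, width):
    """Bit positions that differ between an expected and an actual data word."""
    diff = expected ^ actual
    return [b for b in range(width) if (diff >> b) & 1]


def summarize_errors(pairs, width=64, bits_per_byte_lane=8):
    """Count failures per DQ bit and per byte lane over (expected, actual) pairs."""
    bit_hits = Counter()
    lane_hits = Counter()
    for expected, actual in pairs:
        bits = failing_bits(expected, actual, width)
        bit_hits.update(bits)
        # Count each lane at most once per failing word.
        lane_hits.update({b // bits_per_byte_lane for b in bits})
    return bit_hits, lane_hits
```

Failures concentrated on one bit position point toward a single DQ net, while failures spread across all bits of one lane point toward the associated strobe or byte group.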

Power Supply Noise

The above chapter on power supplies goes into great detail on the importance of power supply design.

Power supply noise is a common culprit in memory failures. Power supply noise can cause nearly any

type of memory failure including calibration failures, address/command/control errors, and data errors

on writes and reads. When memory interface errors occur, power supply measurements using the

techniques reviewed in the previous chapters should be among the first board-level analyses to be

completed.

Below is an example of a Vtt power supply with insufficient decoupling capacitors. As displayed in this

waveform, this resulted in more than 300mV of Vtt noise, which caused address/command/control errors.

Figure 5-1: Vtt Power Supply Noise

Address, Command, and Control Signals

There are common symptoms that are particularly indicative of a group of memory interface signals:

Address, Command, Control, Data, Data Strobe, Data Mask. Address, Command, or Control signal issues

can cause a wide variety of symptoms but tend to be broader in their effect compared to a repeating

data bit or data byte error.

For example, if the forwarded CK/CK# nets from the FPGA are filled with jitter or have bad SI, the entire

system level timing can suffer and cause random/intermittent errors. Additionally, errors on Address,

Command, and/or Control can potentially result in failures of both Read and Write operations. Errors tied to a

specific address or group of addresses can be indicative of SI issues on Address lines. Missed or wrong

commands can be indicative of SI issues on the Command lines. When analyzing Address, Command, or

Control issues, compare the expected versus actual data. Is the data missing (potential missed read

command), from a wrong address (potential address issue), or from a previous write to the address

(potential missed write command)?

When Address, Command, or Control signals are suspected of having signal integrity issues, consider the

following when taking SI measurements:

• Probe the suspect signals at termination resistors near the memory device, as these are output-only signals from the FPGA/Controller.

• If a specific memory component is suspect, probe as close to this suspect component as possible.

• Look for signal quality, noise, cross-talk, duty-cycle-distortion (on CK/CK#), etc.

• Double check for termination resistor issues such as cold-solder joints or incorrectly populated resistors.

• Double check the Vtt power and decoupling capacitors.

• When measuring at the DRAM, ensure VIL and VIH are met. For more information, see the JESD79-3F, DDR3 SDRAM Standard and JESD79-4, DDR4 SDRAM Standard, JEDEC Solid State Technology Association.

• Look for 50% duty cycle periods on CK.

• Ensure that the signals have low jitter/noise that can result from any power supply or board noise.

• Actual DRAM issues are possible but are very rare.

Crosstalk

Crosstalk on Address, Command, and Control signals is becoming a more common issue due to dense packing

of memory components without sufficient ground vias. UG583 goes into detail on ground via

requirements. Using a clamshell layout for memory components is a key example where space for

ground vias is limited. The lack of these vias results in, most typically, Z-direction crosstalk. Third party

tools are available to run a crosstalk analysis on PCB files and should be utilized prior to board

fabrication as these errors are uncorrectable after fabrication.

Below is a scope example of crosstalk on a DDR4 BG0 address signal. In this figure, Yellow is CK; Green is

BG0. This design showed data errors at 2000Mbps but worked reliably at 1600Mbps. The data errors

traced back to incorrect data when a high to low transition on BG0 occurred during writes/reads:

Figure 5-2: Crosstalk on BG0

Figure 5-2: Suspicious of Address Crosstalk

After examination of the PCB files, the culprit of this noise was found to be crosstalk between two

address signals as a result of improper ground stitching. The below figure shows this address crosstalk

between the aggressor and victim:

Figure 5-3: Crosstalk on Address Lines

Signal Timing

Signal timing is critical to successful memory interface designs. The PCB guidelines associated with the memory interfaces must be followed. For example, failing to meet trace length, trace length matching, and trace impedance requirements can result in poor signal timing. The image below shows misalignment between CAS_n and an address bit, which causes incorrect commands at the DRAM.

Figure 5-4: Address/Command Signal Timing

Duty Cycle Distortion (DCD)

An equal duty cycle is critical, particularly on the CK clocks forwarded from the FPGA to the memory. The CK clocks are used internally by the DRAM device to create the read DQS strobes sent back to the FPGA. In addition to potentially causing issues clocking the DRAM, any DCD is sent back to the FPGA on the read strobes. The Xilinx memory interface designs avoid tap delays in the generation of the CK clocks to minimize jitter and therefore generate clean CK clocks. Adherence to the PCB requirements ensures the CK clocks remain clean at the DRAM. The image below is an example of poor DCD on the CK/CK# nets:

Figure 5-5: DCD on CK
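When a scope capture of CK is exported as sample data, the duty cycle can also be estimated offline. The following is a minimal sketch, assuming evenly spaced samples of the differential CK waveform and a 0 V crossing threshold. The function name and the synthetic capture are hypothetical and are not taken from any Xilinx tool.

```python
def duty_cycle(samples, threshold=0.0):
    """Return the high-time fraction over whole periods of a sampled waveform.

    samples:   evenly spaced voltage samples (for example CK - CK#)
    threshold: crossing level in volts; 0.0 suits a differential signal
    """
    # Indices where the waveform crosses the threshold going upward.
    rising = [i for i in range(1, len(samples))
              if samples[i - 1] < threshold <= samples[i]]
    if len(rising) < 2:
        raise ValueError("need at least one full period")
    # Only count samples between the first and last rising edge so that
    # partial periods at the ends of the capture do not bias the result.
    window = samples[rising[0]:rising[-1]]
    high = sum(1 for v in window if v >= threshold)
    return high / len(window)


# Hypothetical capture: 20 samples per period, 12 high and 8 low (60/40).
period = [0.35] * 12 + [-0.35] * 8
capture = [-0.35] * 3 + period * 5 + [0.35] * 4
dcd = duty_cycle(capture)
print(f"duty cycle = {dcd:.1%}, distortion = {abs(dcd - 0.5):.1%}")
```

The resolution of this estimate is limited by the sample spacing, so the built-in duty cycle measurement of the scope remains the primary reference.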

DQ, DQS, and DM Signals

Issues on the DQ, DQS, and DM signals will result in read and/or write failures. Debugging these errors requires a methodical approach and proper scope techniques. As with address, command, and control failures, proper adherence to the PCB and design requirements of the FPGA and Memory Interface IP is critical for successful DQ, DQS, and DM signaling.

When analyzing Signal Integrity effects on DQ, DQS, or DM, it is important to look at where within the read or write bursts the data errors exist and to compare the actual versus the expected data pattern. Looking at the data errors, the following should be identified:

Are the errors bit or byte errors?

Are errors seen on data bits belonging to certain DQS groups?

Are errors seen on specific DQ bits?

Is the data shifted, garbage, swapped, etc.?

Next, analyze the error to look for clues as to what is causing the error:

Errors at the beginning or end of a read or write burst can indicate termination (ODT at the SDRAMs or DCI/IN_TERM at the FPGA), DQS, or DM issues.

Constant or intermittent errors at a particular bit tend to indicate poor SI at that particular DQ bit.

Marginal read and/or write windows can result in bit errors on the marginal bit/byte.

Byte or word issues can be DQS or DM SI issues, or related to the termination (ODT at the SDRAMs or DCI/IN_TERM at the FPGA).
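The questions above (bit or byte errors, which DQS group, where in the burst) can be answered mechanically once the expected and actual data for a failing burst have been captured. The following is a minimal sketch, assuming one integer data word per burst beat and eight DQ bits per DQS group. The function name and the data values are hypothetical.

```python
def classify_errors(expected, actual, bits_per_byte=8):
    """Summarize where data errors fall within one burst.

    expected, actual: lists of integers, one data word per burst beat.
    Returns the failing bit positions, byte lanes, and beat indices.
    """
    failing_bits, failing_beats = set(), []
    for beat, (exp, act) in enumerate(zip(expected, actual)):
        diff = exp ^ act                  # 1 wherever a bit mismatches
        if diff:
            failing_beats.append(beat)
        bit = 0
        while diff:
            if diff & 1:
                failing_bits.add(bit)
            diff >>= 1
            bit += 1
    lanes = sorted({b // bits_per_byte for b in failing_bits})
    return {"bits": sorted(failing_bits), "byte_lanes": lanes,
            "beats": failing_beats}


# Hypothetical 16-bit interface, burst of 8: DQ[9] reads low on the
# first beat only.
expected = [0xFFFF, 0x0000, 0xAAAA, 0x5555, 0xFFFF, 0x0000, 0xAAAA, 0x5555]
actual = [0xFDFF] + expected[1:]
report = classify_errors(expected, actual)
print(report)
```

A single failing bit on the first beat, as in this made-up data, would point toward the first clue in the list above, while the same bit failing on every beat would point toward poor SI on that DQ.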

General signal integrity checks for DQS, DQ, and DM:

When measuring at the FPGA, ensure VIL and VIH are met for the specific I/O Standard in use. For more information, see the target architecture’s DC and AC Switching Characteristics Datasheet.

When measuring at the DRAM, ensure VIL and VIH are met. For more information, see the JESD79-3F, DDR3 SDRAM Standard and JESD79-4, DDR4 SDRAM Standard, JEDEC Solid State Technology Association [Ref x].

Look for 50% duty cycle periods on DQS.

Ensure that the signals have low jitter/noise that can result from any power supply or board noise.

Probe the VREF level at the DRAM (for DDR3).

Probe the DM pin which should be de-asserted during the write burst (or tied off on the board with an appropriate value resistor).

Determining if the Error is a Read or Write Error

If the issue has been narrowed down to data errors, one of the first things to identify is whether the error is a read or a write error. This typically requires a test bench or test design that compares data on every read byte, such as the Advanced Traffic Generator (ATG) provided with the Memory Interface IP Example Design. It may also help to modify this type of test to re-read the same location more than once when a compare error occurs. If the error repeats each time, the issue may be a corruption of a previous write operation. If the second or a subsequent read does not have a compare error, the issue has likely been narrowed down to a read corruption. The ATG includes an automated read/write error checking feature. When this feature is enabled, traffic is stopped when the first error is detected. A read test is then performed to determine whether a read or write error occurred. The result of this test is reported on the provided ATG signals. Refer to the UltraScale Architecture-Based FPGAs Memory IP Product Guide (PG150) for full details.
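The re-read decision described above can be sketched as follows. This is an illustration of the reasoning only and is not the ATG implementation. The read_word callable is a hypothetical hook into a custom test design, and a dictionary stands in for the memory.

```python
def classify_failure(read_word, expected, address, rereads=3):
    """Re-read a failing address and guess whether the read or write failed.

    read_word: callable taking an address and returning the data read back
               (hypothetical hook into the test design)
    """
    results = [read_word(address) for _ in range(rereads)]
    if all(r != expected for r in results):
        # Every re-read is wrong: the stored data is probably corrupt.
        return "write"
    # At least one re-read returned good data, so the memory holds the
    # right value and the original miscompare was on the read path.
    return "read"


# Stand-in memory: address 0x200 holds a value that was written badly.
memory = {0x100: 0xDEADBEEF, 0x200: 0x00000000}
print(classify_failure(memory.get, 0xDEADBEEF, 0x100))
print(classify_failure(memory.get, 0xCAFEF00D, 0x200))
```

A very marginal read path could fail on every re-read and be misreported as a write error, so the result is a starting hypothesis rather than a verdict.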

Scope Techniques

Because reads and writes must be debugged separately, it is important to learn good techniques for triggering a scope capture on reads versus writes. This can be a challenge because most scopes cannot observe all of the Command signals and all of the DQ and DQS bits at once. For DDR2/3/4 SDRAM interfaces, you can use the read preamble on the DQS pins to help separate these captures. To trigger on a read, if the scope supports triggering on a glitch, set it to trigger on the DQS preamble by adjusting the glitch duration to capture the one-clock-cycle low preamble. If the scope cannot trigger on glitch duration, it may be possible to trigger on voltage level if there is a noticeable difference at the probe point between the amplitude of the DQS signals when the FPGA is driving (on a write) and when the DRAM is driving (on a read). This may also be the best way to trigger on a write. Below is an example of a DDR3 read (left-hand picture) and a DDR3 write (right-hand picture).

Figure 5-6: Read versus Write Scope Captures
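To set the glitch duration for the preamble trigger described above, the clock period must be known. The following is a minimal sketch that derives the period from the data rate and returns a pulse-width window around one clock cycle. The 20% tolerance is an assumption chosen for illustration and should be adjusted to the actual waveform.

```python
def preamble_trigger_window(data_rate_mbps, tolerance=0.2):
    """Return (t_ck, min_width, max_width) in nanoseconds.

    data_rate_mbps: per-pin data rate, e.g. 1600 for DDR3-1600
    tolerance:      fractional window around one clock period (assumed)
    """
    # DDR transfers two bits per clock, so the clock period is twice
    # the bit time.
    t_ck = 2.0 * 1000.0 / data_rate_mbps
    return t_ck, t_ck * (1 - tolerance), t_ck * (1 + tolerance)


t_ck, lo, hi = preamble_trigger_window(1600)
print(f"tCK = {t_ck:.3f} ns, trigger on low pulses of {lo:.2f} to {hi:.2f} ns")
```

During a burst, DQS low pulses last about half a clock period, so a window centered on one full period excludes normal strobe toggling.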

Write Operations

Write operations require good DM and ODT signaling, and require that the ODT value was correctly written to the DRAM during initialization. The ODT input pin on DDR SDRAMs is used for terminating DQ, DQS, and DM during writes. It is important to ensure that the ODT net is connected between the FPGA and the DRAM and is terminated just like the other control signals.

The image below shows a problem where ODT was not enabled properly before the write Command. In this case, the ODTLon, ODTLoff specification was not met at the DRAM:

Figure 5-7: Poor ODT Timing

Data Masks (DM) for DDR4/3 SDRAMs need to be pulled strongly to ground if they are not driven by the FPGA (DM disabled) in order to overcome ODT. Designers should consult their memory vendor datasheet for specific requirements on the value of the external pull-down resistor used for DM. If write data errors are occurring and DM is not driven from the FPGA, check the value of the DM pull-down. The red area in the image below shows DM rising to 0.75 V when it should remain at 0 V. This caused incorrect masking of data during the write, resulting in a write error:

Figure 5-8: Improper Pull-Down on DM
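The reason the DM pull-down must be strong can be illustrated with a simple DC divider. The following is a rough sketch for DDR3 only, assuming the ODT behaves as a center-tapped termination, modeled as its Thevenin equivalent (RTT to VDDQ/2) working against the external pull-down. The RTT and resistor values are hypothetical, and the memory vendor datasheet remains the authority for the actual requirement.

```python
def dm_level(vddq, rtt_ohms, pulldown_ohms):
    """DC voltage on an undriven DM pin while ODT is enabled.

    Models ODT as its Thevenin equivalent (rtt_ohms to VDDQ/2) working
    against an external pull-down resistor to ground.
    """
    v_term = vddq / 2.0
    return v_term * pulldown_ohms / (pulldown_ohms + rtt_ohms)


VDDQ = 1.5    # DDR3 I/O supply, volts
RTT = 60.0    # hypothetical ODT setting, ohms
for r_pd in (10_000.0, 1_000.0, 100.0, 30.0):   # hypothetical values, ohms
    print(f"R_pd = {r_pd:7.0f} ohm -> DM = {dm_level(VDDQ, RTT, r_pd):.3f} V")
```

In this model, a weak pull-down leaves DM close to VDDQ/2 whenever ODT turns on, and only a pull-down comparable to or smaller than RTT brings the level down substantially.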

Read Operations

Memory controller Read operations require that termination is present at the DQ and DQS pins of the FPGA. Use of internal termination is the preferred method rather than use of external termination resistors because it can reduce the BOM cost of the board, reduce the board-space requirements, and improve the signal-integrity of the signaling.

I/O standards that have Split-Termination DCI or the IN_TERM attribute are available in Xilinx FPGAs, but they have to be implemented correctly in the design. The Xilinx Memory Interface IP sets up the read termination properly.

The scope shot below shows very large overshoot due to the DCI not being “on” in an I/O bank with DQ/DQS pins in the Xilinx FPGA. This issue was due to the DCI Cascade attribute not being set correctly in the user design. This caused data issues on the very first Read burst:

Figure 5-9: DCI Termination Issue

The next image shows a termination issue causing a Read error because the internal split-termination was present but skewed incorrectly. The root cause of this issue was the pull-down resistor in the split-termination being too strong compared to the pull-up resistor.

Figure 5-10: Read Termination Issue

Note that DQ and DQS are phase aligned when the DQS magnitude is small. This indicates the issue is occurring during a Read operation. However, there is no noticeable preamble on DQS because the voltage level during 3-state is already at 0 V. This type of problem can be caused by board-level issues such as cold-solder joints on the VRP/VRN resistors or incorrectly populated resistor values.

Triggering on Data Errors

In order to analyze the failure event in hardware, it is critical to set up an appropriate trigger. For Calibration stage data failures, if possible, bring out a signal associated with the failing stage to an unused I/O pin to use as the scope trigger. Refer to the Debug Guide for the target Memory Interface IP for advice on the best scope signal(s).

For Read/Write errors, bring out a data compare or error signal to use as the scope trigger. When using the ATG, compare error flags are provided.

If the design includes a hardware debug core, route the ILA trigger to an unused I/O pin to obtain a dynamically controlled trigger that can be modified as needed using the Vivado debug tools.

Read and Write Data Margin

The Memory Interface IPs include advanced calibration algorithms to create robust memory systems across process, voltage, and temperature variations. A general recommendation for the UltraScale generation of Memory Interface IP is for calibration margin on both the write and read sides to be >30% of the ideal eye. The MIG hardware debug tools available for UltraScale memory interface IP can be used to quickly determine the calibration margin. Below is an example from the MIG Debug GUI. The byte with the smallest margin is always highlighted in orange, as shown below:

Figure 5-11: Calibration Margin in MIG Debug GUI
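The >30% guideline can be checked with simple arithmetic once per-byte margins have been read out. The following is a minimal sketch with two assumptions that are not stated in this paper: the ideal eye is taken to be one bit time (UI) at the interface data rate, and the margin is taken to be the sum of the left and right margins in picoseconds. The byte names and margin values are hypothetical.

```python
def margin_report(margins_ps, data_rate_mbps, min_fraction=0.30):
    """Flag bytes whose total margin is not above min_fraction of one UI.

    margins_ps: {byte_name: (left_ps, right_ps)} as read from the debug tool
    """
    ui_ps = 1e6 / data_rate_mbps          # one bit time in picoseconds
    report = {}
    for name, (left, right) in margins_ps.items():
        fraction = (left + right) / ui_ps
        report[name] = (fraction, fraction > min_fraction)
    return report


# Hypothetical read margins for a 2400 Mbps interface.
margins = {"byte0": (95.0, 88.0), "byte1": (60.0, 52.0)}
for name, (fraction, ok) in margin_report(margins, 2400).items():
    print(f"{name}: {fraction:.0%} of UI, {'OK' if ok else 'below guideline'}")
```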

Small write and read data eyes can be the result of a wide variety of PCB issues including lack of adherence to the trace requirements as well as signal and power supply noise.

The UltraScale Memory Interface IP also includes a post-calibration window checker with the ATG. This feature can be used to test the window sizes across environmental changes. Refer to PG150 for information on this feature. Clean PCB design according to the Xilinx guidelines ensures robust memory systems across environmental changes.

Chapter 6: Debugging High Speed Serial Interfaces, Additional Considerations

The HSSIO debug process is the perfect place to see the benefits of a System Level Debug process. The high speed serial transceiver is a complex set of circuits, all contributing to one operation: transmitting or receiving data. A problem in any one of these many parts can have the main effect of data corruption.

Old debug techniques based on a "try and fail" strategy can be very time consuming or, even worse, can lead to wrong assumptions and hide the real problem.

Divide and conquer

The most efficient strategy for quick GT debugging is probably the one invented by Philip II of Macedon (382-336 BC) and then adopted and translated into Latin by the Romans: "Divide et Impera" became the basis of Caesar's political power. (Figure 1)

Divide and conquer is a strategy for narrowing a complex problem with many variables down to its root cause by dividing the original problem into smaller problems that can be handled more easily.

Figure 1. Philip II, a 1st-century Roman copy of a Hellenistic Greek original (Wikipedia)

The first step, before "dividing", is to know the many variables and objects that populate the design.

Components of a SerDes Link

All components of a SerDes link should be understood and properly designed. They are represented in the simplified Figure 2.

Figure 2. Components of a SerDes link

Stackup and Layout design

The SerDes can drive and receive a signal at several tens of gigabits per second. At this speed, the quality of the transmission lines is a fundamental part of the design and influences the very early stages of the PCB design. Particular care is needed during PCB stackup design and PCB layout. Some common mistakes are listed below:

It is always a bad idea to save money at the expense of solid ground planes. The ground plane is the main reference for the transmission line and affects its local impedance. Usually a 2D EM field solver calculates the differential pair geometry with the assumption that the reference plane is solid and infinite. The final layout is usually far from this initial assumption: ideally we should guarantee solid ground/power reference planes and no discontinuities in the proximity of the transmission line. A simple rule of thumb is to keep all discontinuities away from the transmission line by at least twice the transmission line section. Moreover, because solid reference planes minimize the return current paths, removing ground planes from the stackup might create EMI-EMC problems that were not present originally.

Many layout tools can check horizontal (same layer) couplings by setting DRC rules for automatic control of horizontal clearances. Not all layout tools check vertical (adjacent layer) couplings. If there is no automatic check on vertical couplings, we must compare each layer with the adjacent one and identify all possible risks of vertical coupling.

Any interruption in the reference plane is a source of crosstalk, also between non-adjacent planes. (Figure 3)

Figure 3. Example of 3D EM field solver analysis highlighting the vertical coupling due to non-continuous reference plane

Coupling between vias should be considered (a 3D EM field solver analysis can suggest the best break-out strategy) (Figure 4)

Avoid stubs that are long compared with the wavelength. Consider back-drilling technology. Thick PCBs (i.e. backplanes) were known to be more prone to this problem, but now that data rates on the order of 30-50 Gbps are appearing, 1.6 mm thick PCBs should also be analyzed.

Figure 4. Example of 3D EM field solver analysis of coupling between vias to traces, with different backdrilling solutions. This kind of analysis can guide a good quality layout strategy.

Always run exhaustive IBIS-AMI simulations (Figure 5)
o Use the latest IBIS-MODEL release
o Ask if each model in the simulation has been correlated with hardware measurements
o Read carefully the model User Guide and start from a simple design.
o Represent the transmission line with accuracy
o Add estimated jitter to receiver and transmitter
o Simulate the Min, Max, Typ corner cases
o Compare the result with the device mask
o Starting from the UltraScale Family, the REFCLK receiver model is also available for IBIS simulations

Figure 5. Example of IBIS-AMI simulation, with mask validation.
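The stub guidance in the list above can be made quantitative with the standard quarter-wave estimate: an open via stub resonates at roughly the frequency at which it is a quarter wavelength long. The following is a rough sketch, assuming the worst case where the stub spans the full board thickness and an effective dielectric constant of 4.0. Both values are illustrative assumptions, and a 3D EM field solver remains the accurate method.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s


def stub_resonance_hz(stub_length_m, er_eff):
    """Quarter-wave resonant frequency of an open via stub."""
    velocity = C0 / math.sqrt(er_eff)
    return velocity / (4.0 * stub_length_m)


def nyquist_hz(data_rate_gbps):
    """Fundamental (Nyquist) frequency of an NRZ stream."""
    return data_rate_gbps * 1e9 / 2.0


# Worst case: the stub spans a 1.6 mm board; dielectric constant assumed.
f_res = stub_resonance_hz(1.6e-3, 4.0)
print(f"stub resonance ~ {f_res / 1e9:.1f} GHz")
for rate in (10, 30, 50):
    ratio = f_res / nyquist_hz(rate)
    print(f"{rate} Gbps NRZ: resonance is {ratio:.1f}x the Nyquist frequency")
```

With these assumed values the resonance falls within the 15 to 25 GHz Nyquist range of 30-50 Gbps NRZ signaling, which illustrates why a standard-thickness board can no longer be assumed safe at those rates.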

Some good habits for designing PCB traces to reduce crosstalk are:

Coupling becomes a more critical parameter with low-loss materials

Use differential traces to reject common noise

Route traces and vias symmetrically from the suspected aggressors

Distance is the best shield against crosstalk

Power Supplies

The power supplies of a SerDes link are typically dedicated to different elements of the transceiver: the internal PLLs, the termination and calibration circuits, the analog and digital sections, and the reference clock network. (Figure 6)

Figure 6. 7 Series: internal power supplies connections

Furthermore, power supplies are distributed by power areas. Each power area may include multiple quads.

The power supply quality requirement is specified in the Datasheet. Violations of the requirements (i.e. excessive power noise) can cause issues in one or several parts of the circuit. Note that some circuits might be connected to multiple power supplies and that the picture above is a simplification. A problem caused by a non-ideal power supply is always very difficult to debug. For this reason, power supply quality is one of the basic checks to be passed at the very beginning of each debug activity.

At the same time, power analysis is a tedious set of measurements that every engineer knows to be difficult and sensitive to environmental noise. A good quality power noise measurement requires a number of techniques and precautions in order to be meaningful and repeatable: for example, the intrinsic probe noise cannot be higher than the noise we want to measure; the environmental EM noise cannot affect the measurement; and the best tool should be selected for the measurement (oscilloscope or spectrum analyzer).

Reference Clock

The Reference Clock feeds the internal HSSIO PLLs; a PLL is needed on both the transmitter and the receiver side. Quality defects on the REFCLK source increase the TX output jitter and reduce the RX jitter tolerance.

The requirements for REFCLK quality are provided in the Datasheet (UltraScale and UltraScale+) or in Answer Records (mature FPGA families).

Understanding the Datasheet requires some background on phase noise characterization.
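As a small piece of that background, a phase noise plot can be converted to RMS jitter by integrating the single-sideband noise over an offset band. The following is a minimal sketch, assuming the noise follows straight lines on a log-log plot between the listed points. The oscillator frequency, the points, and the integration band are hypothetical and are not Datasheet values.

```python
import math


def rms_jitter_s(carrier_hz, points):
    """Integrate single-sideband phase noise into RMS jitter in seconds.

    points: [(offset_hz, dbc_per_hz), ...] sorted by offset; the noise is
            assumed to follow a straight line on a log-log plot between
            consecutive points.
    """
    total = 0.0
    for (f1, l1), (f2, l2) in zip(points, points[1:]):
        p1 = 10.0 ** (l1 / 10.0)          # linear power ratio per Hz
        slope = (l2 - l1) / (10.0 * math.log10(f2 / f1))
        if abs(slope + 1.0) < 1e-12:
            # A 1/f segment integrates to a logarithm.
            total += p1 * f1 * math.log(f2 / f1)
        else:
            total += p1 * f1 / (slope + 1.0) * ((f2 / f1) ** (slope + 1.0) - 1.0)
    phase_rms_rad = math.sqrt(2.0 * total)   # account for both sidebands
    return phase_rms_rad / (2.0 * math.pi * carrier_hz)


# Hypothetical 156.25 MHz oscillator, integrated from 10 kHz to 20 MHz.
profile = [(1e4, -130.0), (1e5, -140.0), (1e6, -150.0), (2e7, -155.0)]
jitter = rms_jitter_s(156.25e6, profile)
print(f"RMS jitter ~ {jitter * 1e15:.0f} fs")
```

The result depends strongly on the integration band, so a jitter figure should only be compared with a requirement that is stated over the same band.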

Xilinx invests a lot of resources in constraining the REFCLK phase noise, in order to allow the selection of a cost-effective local oscillator while preserving the link margin.

Signal quality by design: starting from the UltraScale Family, the REFCLK receiver model is also available for IBIS simulations.

Transmitter

The Transmitter building blocks are represented in the figure below. We can identify two main sections: the PMA and the PCS. These two regions are synchronous with two clock domains and a phase adjust FIFO is needed to compensate for the phase skew. Some of the most frequent reasons for data errors are:

A wrong clock tree plan

A suboptimal reset sequence

A wrong setup already in the GT Wizard GUI

A use mode that is not supported/characterized

Figure 7. Transmitter building blocks (US GTH)

Receiver

The Receiver building blocks are represented in the figure below. We can identify two main sections: the PMA and the PCS. These two regions are synchronous with two clock domains and an elastic buffer is needed to compensate for the phase and frequency differences. Some of the most frequent reasons for data errors are:

A wrong clock tree plan

A suboptimal reset sequence

A wrong setup already in the GT Wizard GUI

A use mode that is not supported/characterized

Figure 8. Receiver building blocks (US GTH)

Compared to the Transmitter, the Receiver is far more complex due to the presence of equalizers and the Clock and Data Recovery unit (CDR) in the PMA side.

RCAL

The RCAL circuit is responsible for tuning the termination resistors on both the TX and RX sides. Any problem with this circuit immediately results in signal integrity issues. The principal reasons for malfunction are incorrect PCB layout, a missing calibration resistor, a missing, wrong, or noisy power supply, and wrong impedance of the differential transmission nets.

Scope the Problem

We need to ask the basic questions, and answer them, at the beginning of the debug process.

What is the device?

What is the data rate?

What is the protocol?

Reduce the problem complexity
o Is this a protocol or physical layer problem?
o Is it a User application design?
o Is this a Xilinx (or other) IP code/design?
o What is the GT configuration?
o How are the ports driven?

What is the clock architecture?

Device, speed grade and package should match with design requirements.

Reference clock
o What is reference clock source? (LVDS/LVPECL)
o Is it passing the phase noise mask?
o What is the Reference clock transmission line?
o Have the DC block capacitors been used?
o Evaluate any oscilloscope measurement of the REFCLK signal
o Evaluate any existing IBIS simulation of the REFCLK signal

CPLL/QPLL settings
o Is the PLL frequency supported?
o Is the PLL common between many channels, or does every channel have its own PLL?
o Are all the PLL-lock signals probed in the design?

What is the TX/RXUSRCLK source?

How is it routed from source to TX/RXUSRCLK input? Explore the clock tree: is this a multi-channel or single-channel design?

Verify Application Settings:

IP Wizard settings

IP Wizards call the GT Wizard and its rules engine to instantiate the GT’s.

GT Wizard settings (If the GT is not already instantiated in the IP)

Do they comply with currently released settings (ports and attributes) for this application and device?

The Rules Engine in the GT Wizard has the most currently released settings for each protocol.

IP Checklist
o What is the IP release?
o Is the user design compliant with current IP Wizard output?

GT Wizard (if IP Wizard does not already call GT Wizard)

Was the current version of GT Wizard used?

Debug sequence of the serial line

The following should also be analyzed at Voltage and Temperature corner cases.

Verify connectivity. Is the plug connected? This is a trivial question, but a cable can be broken, a connector damaged, a decoupling capacitor unsoldered… Sometimes we neglect to test basic things just because they were working a few days ago.

Verify compliance. Is this protocol supported or not? Some protocols have a dedicated compliance report. For other protocols, we should refer to the generic characterization report and datasheet values. There are many aspects to consider and the TX/RX must work in all conditions. Even if in the laboratory we can make a link work, only the characterization report will tell us if the same link will work, with this protocol, in all field conditions.

Layout correctness
o Check presence of impedance discontinuity
o Identify possible crosstalk aggressors
o Check integrity of reference planes (i.e. split planes)
o Identify any “Swiss cheese” effect in the power or ground planes due to contiguous via presence (Figure 9)
o Check for improper/inadequate power supply filtering
o Identify possible p-n pin swaps
o Build a model of the transmission channel and simulate it

Power Supply compliance:
o DC Levels
o Maintains Datasheet Compliance under load transient conditions (Config/reset/rate change…)
o With oscilloscope use 50 ohm probe and limit measurement bandwidth to ~100MHz
o Measure as close to the package pin as possible
o Verify that peak-to-peak noise meets requirements (for example, <10mVpk-pk 10Hz to 80MHz)
    - If there is noise, attempt to identify the source
    - If harmonic/periodic, identify the frequency
o Methods for isolating power supply noise sources
    - Fundamental frequency crosstalk coupling (i.e. from other circuit/systems)
    - Combinational frequency crosstalk coupling (frequency mixed: sum and difference)
o Review PCB layout
o Perform a FFT of the TIE (Time Interval Error) trend, of transmitted serial signal
    - Identify the worst spurs and compare with power supply noise
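The last item in the list above, an FFT of the TIE trend, can be sketched offline when the scope exports the TIE samples. The following is a minimal illustration, assuming uniformly spaced TIE samples and using a plain DFT so that no external library is needed. The sample rate, spur frequency, and amplitude are synthetic values chosen for the example.

```python
import cmath
import math


def tie_spectrum(tie_s, sample_rate_hz):
    """Return [(frequency_hz, amplitude_s), ...] for a TIE trend via a DFT."""
    n = len(tie_s)
    mean = sum(tie_s) / n
    spectrum = []
    for k in range(1, n // 2):
        acc = sum((x - mean) * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(tie_s))
        # Scale so that a sine of amplitude A reports A at its own bin.
        spectrum.append((k * sample_rate_hz / n, 2.0 * abs(acc) / n))
    return spectrum


# Synthetic TIE trend: a 2 ps sinusoidal spur at 1.5 MHz, sampled at 100 MHz.
FS = 100e6
N = 200
tie = [2e-12 * math.sin(2 * math.pi * 1.5e6 * i / FS) for i in range(N)]
peak_hz, peak_amp = max(tie_spectrum(tie, FS), key=lambda p: p[1])
print(f"worst spur at {peak_hz / 1e6:.2f} MHz, {peak_amp * 1e12:.2f} ps")
```

A spur frequency that matches a known regulator switching frequency, or a sum or difference of two known frequencies, is a strong hint about the noise source.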

Power-on sequence
o Is a power-on sequence required for any MGT power rails? Compliance?
o Power-on before FPGA configuration starts?

FPGA configuration
o Voltage rails maintain compliance throughout configuration?
o Voltage rails maintain compliance after configuration (after DONE pin goes high)
o Voltage rails maintain compliance before/during/after RX/TX reset (Lane/Quad)

Power noise

Figure 9. Example of “swiss cheese” effect on power planes due to the presence of vias.

This degrades the quality of the power supplies by adding a higher-impedance path. In some cases, the via antipads remove reference-plane copper close to the transmission line and affect the impedance continuity.

Measure the TX output jitter, compare with Datasheet numbers and highlight any discrepancy

RX eyescan in all voltage and temperature conditions. Compare with simulation results, bathtub and eye diagrams.

Loopback testing (Run bit error test under each condition)
o Near-end fabric loopback verifies fabric logic inbound and outbound.
o Near-end PCS loopback verifies fabric to GT interface inbound and outbound.
o Near-end PMA loopback verifies fabric through PMA path inbound and outbound.
o Channel loopback. Prefer a loopback through as much of the channel as possible to verify channel performance
o Far-end PMA and Far-end fabric loopback, if clocking supports it
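The loopback modes above form a natural divide-and-conquer sequence: each mode extends the tested path a little further, so the first mode that shows bit errors bounds where the fault was introduced. The following is a minimal sketch of that bookkeeping. The mode names and error counts are hypothetical labels for results collected by hand or with a bit error tester.

```python
LOOPBACK_ORDER = [
    "near-end fabric",
    "near-end PCS",
    "near-end PMA",
    "channel",
    "far-end PMA",
]


def first_failing_loopback(bit_errors):
    """Return the first loopback mode, shortest path first, with bit errors.

    bit_errors: {mode_name: error_count}; modes that were not run are skipped.
    Returns None when every mode that was run is error free.
    """
    for mode in LOOPBACK_ORDER:
        count = bit_errors.get(mode)
        if count is not None and count > 0:
            return mode
    return None


# Hypothetical results: everything inside the device is clean.
results = {"near-end fabric": 0, "near-end PCS": 0,
           "near-end PMA": 0, "channel": 137}
print(first_failing_loopback(results))
```

In this made-up case the errors first appear once the signal leaves the device, which would direct attention to the channel rather than to the GT configuration.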

Figure 10. Near End Fabric loopback

Figure 11. Near End PCS loopback

Figure 12. Near End PMA loopback

Figure 13. Channel loopback

Figure 14. Far-end PMA loopback

Fine tuning of TX FIR and RX Equalizers

When the signal quality is insufficient, or when there are requirements to reduce power or minimize crosstalk, it is still possible to fine tune the setup of the transmitter and receiver to maximize the link margin. This can be done manually or with Tcl scripts that make this time-consuming process more automatic and better documented. Please open a Service Request or ask your Specialist for this kind of operation. The fine tuning process typically analyzes:

TX output metrics
o TX pre-emphasis/output amplitude
o TX Jitter

RX input metrics
o RX eyescan and bathtub
o RX equalization (with IBERT or other methods)
o LPM/DFE modes and settings – probe the adaptation values for equalizers.

Tools for accurate Hardware debug

The two main tools for Hardware debug are IBERT and the GT Debugger. The first is a standalone design; the second is capable of testing the true design GT configuration. Some IPs offer the In System IBERT option. Please open a Support Request or ask your Specialist for GT Debugger analysis.

Conclusion

Debugging board-level issues can be difficult and time consuming. Extreme care in board design, and verification with Signal Integrity (SI) and Power Integrity (PI) CAD tools before board fabrication, is the best way to mitigate the risk of time-consuming board debug.

When hardware debug is needed, the techniques reviewed in this paper provide accurate scope capture techniques and systematic debug approaches that result in quick identification of the root cause and therefore of work-arounds/fixes. The paper highlights the importance of a systematic and scientific approach to the debug effort. It leverages established and proven best practices in FPGA board-level debug and applies them uniformly to solve new issues.

Additional References

7 Series PCB Design User Guide: UG483

UltraScale and UltraScale+ PCB Design User Guide: UG583

7 Series MIG DDR3/DDR2 - Hardware Debug Guide: AR#43879

Xilinx SelectIO Solution Center – Design Assistant: AR#50926

Zynq-7000 All Programmable SoC and 7 Series Devices Memory Interface Solutions User Guide: UG586

UltraScale Architecture-Based FPGAs Memory IP Product Guide: PG150

Virtex II - What is the difference between Cycle-Cycle Jitter and Period Jitter (as discussed in the Virtex-II data sheet): AR#12010

Virtex/Spartan - There is noise/glitch on an input; however, the FPGA design works correctly: AR#31922