prototyping next-gen tegra soc - dvcon india · prototyping next-gen tegra soc ... haps. rtl...

Prototyping Next-Gen Tegra SoC Sivarama Prasad Valluri & Ramanan Sanjeevi Krishnan © Accellera Systems Initiative 1


Post on 30-Aug-2018


Page 1: Prototyping Next-Gen Tegra SoC - DVCon India · Prototyping Next-Gen Tegra SoC ... HAPS. RTL CONVERSION CHALLENGES. ... • No single button flow for HSTDM selection based on slack

Prototyping Next-Gen Tegra SoC

Sivarama Prasad Valluri & Ramanan Sanjeevi Krishnan

© Accellera Systems Initiative 1


Agenda

Introduction

Prototyping flow overview

RTL Conversion Challenges

Partitioning Challenges

Pin Multiplexing Challenges

Other Challenges

Results & Conclusion


INTRODUCTION

Project Overview


Introduction

Project overview


Introduction

• Requirements for prototyping the Tegra SoC
– Prototype RTL close to ASIC RTL
– Kernel boot on a multi-processor setup
– Faster time to prototype, for early SW development
– Faster turnaround of bit-streams, for incremental RTL drops
– Achieve FPGA prototyping an “order of magnitude” faster than emulation
– Support all HSIOs (HDMI, SATA & PCIe)


PROTOTYPING FLOW OVERVIEW


Prototyping Flow

[Figure: prototyping flow diagram]
• Inputs: Converted RTL + FDC + Board Files
• Certify: Partitioning, Pin Multiplexing, Trace Assignment, Project creation per FPGA & time budgeting
• Synplify Premier + Xilinx Vivado: Synthesis and P&R for FPGA1, FPGA2, ... FPGAn
• Prototyping Platform: Synopsys HAPS


RTL CONVERSION CHALLENGES


Handling Clocks
• Large number of clocks in the ASIC, compared to limited FPGA global clock resources
• Clock generation, gating & muxing logic in clock paths
• Clock skew introduced by FPGA partitioning

Approach
• Merged related clocks and reduced them to 6 global clocks
• Used global clocks to generate the necessary clocks in each FPGA
• Replaced logic in the clock path with equivalent FPGA blocks
• Used the synthesis tool to convert the remaining clock gates


Reset Synchronization
• Global SoC reset drives sequential elements across the entire design
• Critical to ensure that reset is released at the same time across all FPGAs

Approach
• Modify RTL to add pipeline stages to the reset signal
• Use the pipeline tree to achieve reset synchronization
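The pipeline-tree idea above can be illustrated with a small model. This is only a sketch under stated assumptions, not the authors' RTL: it assumes the reset reaches each FPGA after a known number of register stages (the `hop_latency` values are hypothetical) and computes how many extra local pipeline stages each FPGA would need so that all of them see reset release on the same cycle.

```python
def balance_reset_pipeline(hop_latency):
    """Return extra pipeline stages per FPGA so reset release lines up.

    hop_latency maps an FPGA name to the number of register stages the
    reset already passes through before reaching that FPGA. The values
    are hypothetical; in a real system they follow from the board topology.
    """
    if not hop_latency:
        return {}
    deepest = max(hop_latency.values())
    # Pad every shallower branch up to the depth of the deepest branch.
    return {fpga: deepest - stages for fpga, stages in hop_latency.items()}


def release_cycle(hop_latency, padding):
    """Cycle, relative to the reset source, at which each FPGA sees release."""
    return {fpga: hop_latency[fpga] + padding[fpga] for fpga in hop_latency}


if __name__ == "__main__":
    latency = {"fpga_a": 1, "fpga_b": 3, "fpga_c": 4}  # illustrative only
    padding = balance_reset_pipeline(latency)
    print(padding)
    print(release_cycle(latency, padding))
```

After padding, every branch has the same total depth, which is the property the reset tree is meant to provide.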


Reset Synchronization Tree

[Figure: reset synchronization tree driven by the asynchronous reset, with node labels 1; 3, 3; 3, 4, 4 and a branch marked "To Next System"]


PARTITIONING CHALLENGES


Initial Partition Approach

First-time partition
• Ran area estimation
• Partitioned the design based on
– Design hierarchies & IP area/size
– External interface proximity
– Layout of the multi-HAPS system


Interconnect Problem

Huge number of interconnects (ICs) between FPGAs
---------------------------
@W: CU603 |Actual I/O count(1558) after CPM exceeds the total I/O count(1200) for device <>
@W: CU603 |Actual I/O count(3794) after CPM exceeds the total I/O count(1200) for device <>
@W: CU603 |Actual I/O count(14677) after CPM exceeds the total I/O count(1200) for device <>
@W: CU603 |Actual I/O count(13876) after CPM exceeds the total I/O count(1200) for device <>
@W: CU603 |Actual I/O count(20724) after CPM exceeds the total I/O count(1200) for device <>
@W: CU603 |Actual I/O count(26143) after CPM exceeds the total I/O count(1200) for device <>
---------------------------

Approach
• Changed the partition to reduce interconnects
• Additionally used pin-multiplexing techniques to address this
• Used HSTDM, the Synopsys Certify pin-multiplexing scheme
• Only pin-multiplexed the flop-to-flop signals
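The scale of the problem can be read off the CU603 warnings with a little arithmetic. The sketch below computes a lower bound on the multiplexing ratio each device would need. It rests on two simplifying assumptions that are not from the slides: every I/O is eligible for multiplexing at one uniform ratio, and no pins are reserved for clocks, training or other overhead.

```python
import math

DEVICE_IO_LIMIT = 1200  # total I/O count reported in the CU603 warnings
REPORTED_IO_COUNTS = [1558, 3794, 14677, 13876, 20724, 26143]


def min_uniform_mux_ratio(required_ios, available_ios):
    """Smallest uniform ratio N such that required_ios / N fits in available_ios."""
    return math.ceil(required_ios / available_ios)


if __name__ == "__main__":
    for count in REPORTED_IO_COUNTS:
        ratio = min_uniform_mux_ratio(count, DEVICE_IO_LIMIT)
        print(f"{count} logical I/Os -> at least {ratio}:1 multiplexing")
```

Under these assumptions the largest reported count (26143) needs at least 22:1. Real ratios would be higher for the eligible signals, because only flop-to-flop signals were multiplexed.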


Partition attempts to reduce the ICs

• Moving blocks with a large number of inputs and few outputs into the source FPGAs

• Moving blocks whose signals go from one FPGA to another and then come back

[Figure: before/after block diagrams of modules M1 and M2 placed across FPGA A and FPGA B, annotated with bus widths of 256 and 300 and a connection "To FPGA C"]


Partition attempts to reduce the ICs (2)

• Huge number of non-F2F ICs going across multiple FPGAs, which cannot be pin-muxed

Approach
• Design insight from the IP team
• Identified combinational buses running across multiple IPs across FPGAs
• Moved all the logic into a single FPGA


Addressing FPGA Clock Crossings

Inter-FPGA clock crossings
• Introduce clock skew in the destination FPGA
• More clock-capable I/O pins need to be used

Approach
Used automation to address the following
• Used HDL Analyst (find/expand commands) to analyze the partitioned netlist
• Populated the list of clock crossings and their loads into logs

Fixed them by replicating the clock generation logic
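The crossing check itself is simple once the partitioned netlist has been reduced to a list of clock nets with their drivers and loads. The sketch below is in Python rather than the HDL Analyst Tcl commands the slides mention, and the input dictionaries are hypothetical stand-ins for whatever a netlist query would return.

```python
from collections import defaultdict


def find_clock_crossings(clock_nets, partition):
    """List clock nets whose loads sit in a different FPGA than the driver.

    clock_nets: {net_name: (driver_instance, [load_instances])}
    partition:  {instance_name: fpga_name}
    Returns {net_name: {fpga_name: [load_instances]}} for remote FPGAs only.
    """
    crossings = {}
    for net, (driver, loads) in clock_nets.items():
        source_fpga = partition[driver]
        remote = defaultdict(list)
        for load in loads:
            if partition[load] != source_fpga:
                remote[partition[load]].append(load)
        if remote:
            crossings[net] = dict(remote)
    return crossings


if __name__ == "__main__":
    # Instance names are illustrative only.
    nets = {"CLK_ip1": ("clkgen", ["R1", "R2", "Rm1", "Rm2"])}
    placement = {"clkgen": "FPGA1", "R1": "FPGA1", "R2": "FPGA1",
                 "Rm1": "FPGA2", "Rm2": "FPGA2"}
    for net, remote in find_clock_crossings(nets, placement).items():
        print(net, remote)
```

Each reported net is a candidate for replicating its clock generation logic in the remote FPGA, so that only the source clock crosses the boundary.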


[Figure: clock crossing example. Panels: "Full Design" (IP 1 with registers R1 ... Rn, R11 ... R1n, Rm1 ... Rmn and a clkgen block, clocks CLK and CLK_ip1), "FPGA 1" (IP 1 with R1 ... Rn, R11 ... R1n and clkgen) and "FPGA 2" (IP 1-Part2 with Rm1 ... Rmn), with the inter-FPGA net marked "Clock Crossing" and a second clkgen fed by CLK]


So many partition trials: any simpler way?

Each partitioning change takes time to make and to check for its impact on the ICs and clock crossings

• Iterative process: more than one run needed
• Manual runs: not efficient and prone to human error
• UI: not the most efficient way, as human intervention is needed
• Batch mode: how to check the impact?

Approach
Used automation to do the following
• Apply the partition file to the design
• Generate an Excel sheet with the interconnection matrix & calculated connector count info
• Check clock crossings & analyze the partitioned netlist file
• E-mail the report
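The interconnection matrix and connector count in the list above can be sketched as follows. This is an illustration under assumptions, not the authors' script: the net and partition dictionaries are hypothetical inputs, a net touching several FPGAs is counted once per FPGA pair, and `pins_per_connector` is a made-up parameter standing in for the usable pins of a board connector. Writing the result to a spreadsheet and e-mailing it are left out.

```python
import math
from collections import Counter
from itertools import combinations


def interconnect_matrix(nets, partition):
    """Count nets shared between each pair of FPGAs.

    nets:      {net_name: [instance names attached to the net]}
    partition: {instance_name: fpga_name}
    A net touching k FPGAs is counted once for every pair among them,
    which simplifies how such a net would really be routed.
    """
    matrix = Counter()
    for instances in nets.values():
        fpgas = sorted({partition[inst] for inst in instances})
        for pair in combinations(fpgas, 2):
            matrix[pair] += 1
    return matrix


def connectors_needed(matrix, pins_per_connector):
    """Connectors per FPGA pair, assuming a fixed usable pin count each."""
    return {pair: math.ceil(count / pins_per_connector)
            for pair, count in matrix.items()}


if __name__ == "__main__":
    nets = {"n1": ["u1", "u2"], "n2": ["u1", "u3"], "n3": ["u1", "u2", "u3"]}
    placement = {"u1": "A", "u2": "B", "u3": "C"}
    m = interconnect_matrix(nets, placement)
    print(dict(m))
    print(connectors_needed(m, pins_per_connector=2))
```

Running this after each trial partition gives a quick, repeatable view of which FPGA pairs carry the most traffic, without opening the UI.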


PIN MULTIPLEXING CHALLENGES


Slack Based HSTDM

Selection of appropriate HSTDM ratios
• No single-button flow for HSTDM selection based on slack
• Not possible to hand-pick the HSTDM ratios based on slack for 50k+ signals

Approach
• Developed a TCL script to do the slack-based HSTDM placements
• Script applies the HSTDM ratio based on slack
• Applies higher HSTDM ratios for slow signals and lower HSTDM ratios for fast signals
• Optimizes the number of ICs with clean timing
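The selection rule described above can be sketched in a few lines. This is a Python illustration of the idea, not the authors' TCL script. The ratio-to-delay table is hypothetical, and the sketch assumes that a higher ratio adds more delay to a crossing, so a signal with more slack can tolerate a higher ratio.

```python
import math
from collections import Counter


def pick_hstdm_ratio(slack_ns, ratio_delay_ns):
    """Pick the highest multiplexing ratio whose added delay fits in the slack.

    ratio_delay_ns maps a ratio to the delay it adds to a crossing
    (hypothetical numbers; real values depend on the tool and the board).
    Returns None when no ratio fits, meaning the signal needs a direct
    wire or a partition change.
    """
    fitting = [r for r, delay in ratio_delay_ns.items() if delay <= slack_ns]
    return max(fitting) if fitting else None


def assign_ratios(signal_slack_ns, ratio_delay_ns):
    """Apply pick_hstdm_ratio to every inter-FPGA signal."""
    return {sig: pick_hstdm_ratio(slack, ratio_delay_ns)
            for sig, slack in signal_slack_ns.items()}


def pins_used(assignment):
    """Estimate physical pins: signals with the same ratio are packed together."""
    total = 0
    for ratio, count in Counter(assignment.values()).items():
        total += count if ratio is None else math.ceil(count / ratio)
    return total


if __name__ == "__main__":
    delays = {2: 10.0, 4: 20.0, 8: 40.0}                # illustrative only
    slacks = {"slow": 50.0, "mid": 25.0, "fast": 5.0}   # illustrative only
    chosen = assign_ratios(slacks, delays)
    print(chosen, pins_used(chosen))
```

Choosing the highest ratio that still meets timing is what keeps the interconnect count down while leaving timing clean. Applied across tens of thousands of signals, it replaces the hand-picking the slide rules out.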


TIME BUDGETING CHALLENGES


Time-Budgeting Issues

Multiple issues seen in SLP time-budgeting steps
• Zero/negative values seen in the tool-generated FDC
• Slack not accurately evaluated
• Missing constraints

Approach
• Slack-based HSTDM
• Used automation to address these issues
– Certify HDL Analyst to analyze the partitioned netlist, plus TCL commands (find, expand, etc.), helps to analyze paths in batch mode and write out the evaluated constraints
– Created incremental FDCs to write the missing/zero/negative constraints, and added them to the flow to fix the constraints with issues in the original FDCs


Bit-stream Generation/Turnaround Time

Turnaround time
• Complete RTL going into all the individual FPGA projects
• All modified modules going into all the individual FPGA projects

Approach
• Developed scripts to generate the RTL list per FPGA project
• Developed scripts to identify and split the modified modules per partition
• Reduces the compilation time per partition from 3-4 hours to 30-40 minutes
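The two scripts above amount to filtering by partition membership. A minimal sketch, assuming a hypothetical mapping from each module to the FPGAs that contain it and from each module to its RTL file (neither format comes from the slides):

```python
def rtl_list_per_fpga(module_files, module_fpgas):
    """Group RTL files by FPGA so each project compiles only what it uses.

    module_files: {module_name: rtl_file_path}
    module_fpgas: {module_name: iterable of FPGA names containing it}
    """
    per_fpga = {}
    for module, path in module_files.items():
        for fpga in module_fpgas.get(module, ()):
            per_fpga.setdefault(fpga, []).append(path)
    return {fpga: sorted(paths) for fpga, paths in per_fpga.items()}


def fpgas_to_rebuild(modified_modules, module_fpgas):
    """Return only the FPGA projects touched by a set of modified modules."""
    affected = set()
    for module in modified_modules:
        affected.update(module_fpgas.get(module, ()))
    return sorted(affected)


if __name__ == "__main__":
    # Module names and paths are illustrative only.
    files = {"cpu": "rtl/cpu.v", "gpu": "rtl/gpu.v", "bus": "rtl/bus.v"}
    where = {"cpu": ["FPGA1"], "gpu": ["FPGA2"], "bus": ["FPGA1", "FPGA2"]}
    print(rtl_list_per_fpga(files, where))
    print(fpgas_to_rebuild(["gpu"], where))
```

With lists like these, an incremental RTL drop that touches one module triggers recompilation only in the partitions that contain it.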


RESULTS


Results

Project results
• Kernel booted well ahead of tape-out
– Enabled early SW development
• Kernel booted on a multi-processor setup
– SW able to execute inter-cluster tests such as cache-coherency tests
• “Order of magnitude” faster than emulation
• Able to run the interfaces at speed for driver development


Thank You


Questions
