January 25, 2018 Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution

Page 1: Ruckman, JJ Russell Matt Graham, Giovanna Lehmann, Mark ... · Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell DUNE FD DAQ: ATCA/RCE + FELIX Solution

January 25, 2018

Matt Graham, Giovanna Lehmann, Mark Convery, Ryan Herbst, Larry Ruckman, JJ Russell

DUNE FD DAQ: ATCA/RCE + FELIX Solution

Page 2

Felix + ATCA RCE Overview & Responsibilities

[Figure: block diagram of the readout chain. Recoverable block labels: Front Ends, WIBs, ATCA RCE Cluster, FELIX Cluster, Optics Up Shaft, Backend Computing, Trigger Farm, Trigger decisions, Trigger primitives.]

Figure annotations:
● WIBs: front end data passes through WIB without concentration, electrical to optical conversion only**
● ATCA RCE Cluster: provides filtering, feature extraction & SuperNova Buffering (100s). Raw data received at the ATCA RCE RTM, passed directly to DPMs
● 10Gbps links between RCE and Felix (underground); buffer for trigger
● FELIX Cluster: Event Builder, Aggregator, L3 triggering

● FE+WIB → RCE: all raw data into RTM with some custom format (e.g. COLDATA); 8B/10B (probably) at 1.28 Gbps
● RCE → FELIX: all raw data out of the RTM in some custom format (GBT etc) of multiplexed data; ~10-12 Gbps links
● FELIX → Backend Computing: triggered raw data over Ethernet on switched network
● Trigger Path: RCE-extracted primitives go either RCE → FELIX → trigger farm on a separate stream, or directly from RCE → trigger farm via Ethernet (shown)
● Lossy Buffer (not shown): RCE-extracted waveforms/time slices → lossy buffer, either through FELIX or direct from RCE

Page 3

Numerology (just CE, RCE, FELIX)

● “Cold” Electronics: 64 channels/COLDATA, 2 COLDATA/FEMB, 4 FEMB/WIB, 20 FEMB/5 WIBs/APA
  ○ these are fixed, never ever will change
● RCE System: 4 DPMs/COB, 1* RCEs/DPM, up to 14 COBs/shelf, 1 COB/APA (target)
  ○ current-gen of DPM has 2 RCEs/DPM, see later slides
● FELIX System: 2 APAs/FELIX; 2 FELIX/PC (target)
● WIB → RCE Links: 16 links/WIB @ 1.28 Gbps, 80 links/COB/APA
  ○ assume passive WIB
● RCE → FELIX Links: (raw, uncompressed) 2 10-Gbps links/DPM, 8 links/APA, 16 links/FELIX, 32 links/PC
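The counts on this slide follow from a handful of multiplications. The sketch below reproduces them; the constant and variable names are illustrative and do not come from the slides.

```python
# Per-unit counts quoted on the numerology slide (names are illustrative).
CHANNELS_PER_COLDATA = 64
COLDATA_PER_FEMB = 2
FEMB_PER_WIB = 4
WIBS_PER_APA = 5
LINKS_PER_WIB = 16        # WIB -> RCE links at 1.28 Gbps
DPMS_PER_COB = 4          # with 1 COB per APA (target)
UPLINKS_PER_DPM = 2       # 10 Gbps RCE -> FELIX links
APAS_PER_FELIX = 2
FELIX_PER_PC = 2

fembs_per_apa = FEMB_PER_WIB * WIBS_PER_APA
channels_per_apa = CHANNELS_PER_COLDATA * COLDATA_PER_FEMB * fembs_per_apa
channels_per_dpm = channels_per_apa // DPMS_PER_COB
wib_links_per_cob = LINKS_PER_WIB * WIBS_PER_APA
uplinks_per_apa = UPLINKS_PER_DPM * DPMS_PER_COB
uplinks_per_felix = uplinks_per_apa * APAS_PER_FELIX
uplinks_per_pc = uplinks_per_felix * FELIX_PER_PC

print("FEMBs/APA:", fembs_per_apa)
print("channels/APA:", channels_per_apa, "channels/DPM:", channels_per_dpm)
print("WIB->RCE links/COB:", wib_links_per_cob)
print("RCE->FELIX links per APA/FELIX/PC:",
      uplinks_per_apa, uplinks_per_felix, uplinks_per_pc)
```

The 2560 channels per APA and 640 channels per DPM that fall out of this are the same figures used on the packaging and buffering slides.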

Page 4

RTM Block Diagram

[Figure: RTM block diagram. Recoverable labels: SNAP12 (repeated, one per optical module), SFP+, QSFP, DTM, DPMs. Group labels: “WIB Connections: Support 80 links”, “FELIX Interface”, “Timing”.]

Experience with high density fiber optic RTMs

Reflexphotonics SNAP12 transmitter/receiver supports 10.3125 Gbps per lane: http://reflexphotonics.com/embedded-transceivers/snap12/

Page 5

Data Flow In ATCA RCEs

[Figure: data flow inside the ATCA RCEs. Recoverable block labels: Front Ends, Filtering (×3), Feature Extraction, Compression & Other Processing, Compression, SuperNova Pre-Buffer, SuperNova Post Buffer, Timing Interface, Felix Uplink (GBT or PGP), To Felix. One element is marked “Proposed”.]

● Flexible architecture allows front ends to be allocated across RCEs in a flexible fashion
  ○ Simply add more cards and move fibers
● Target is 640 channels per RCE (1x APA per COB) → 5 FEMBs/DPM
  ○ Numerology is important! 5 WIBs vs 4 DPMs/APA; multiplexing at WIB (2xFEMB links e.g.) reduces flexibility

Page 6

RCE To Felix Uplink

● Multiple optical links will be routed between the ATCA RCE platform and the Felix nodes
  ○ DWDM utilized to maximize uplink bandwidth and provide redundancy
● The ATCA RCE platform will utilize its powerful interconnects to provide a data routing capability
  ○ Flexible configuration of which data goes to each Felix board
    ■ Allows system to adapt to changing data needs (noise, etc)
    ■ Some channels can be used for raw data from a subset of the detector
    ■ SuperNova readout lanes (slow trickle, post trigger)
    ■ Data can be re-routed to different fibers in the case of a fiber break or Felix board failure!
  ○ Link count can be scaled to match system needs
    ■ Fewer fibers when RCEs do the majority of data processing and event selection
    ■ More fibers when computing cluster is needed for data processing
    ■ One or more fibers per DPM, one fiber per COB or 1 fiber per crate

[Figure: an RCE Cluster connected to six Felix nodes.]

Page 7

Example Data Routing

[Figure: routing diagram. Recoverable labels: Detector Data (×7), Processing RCE (×7), ATCA RCE Interconnect, Outbound RCE (×3), Felix (×3).]

Note: Processing RCEs can also serve as outbound RCEs

Page 8

Example Data Routing: 2 Active & 1 Spare Felix

[Figure: routing diagram. Recoverable labels: Detector Data (×7), Processing RCE (×7), ATCA RCE Interconnect, Outbound RCE (×3), Felix (×3).]

Note: Processing RCEs can also serve as outbound RCEs

Page 9

Example Data Routing: Felix Failure Or Fiber Break

[Figure: routing diagram. Recoverable labels: Detector Data (×7), Processing RCE (×7), ATCA RCE Interconnect, Outbound RCE (×3), Felix (×3).]

Note: Processing RCEs can also serve as outbound RCEs

ATCA RCE cross connect routes data to redundant Felix board after fiber break or Felix board failure!
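The failover idea on this slide can be pictured as a routing table that is rewritten when a Felix board or its fiber fails. The sketch below is only an illustration of that idea: the node names, the table layout and the `reroute` helper are all hypothetical, and the slides do not describe how the interconnect is actually configured.

```python
def reroute(routes, failed_felix, spares):
    """Return a new routing table with traffic for failed_felix moved to a spare.

    routes maps an outbound RCE name to the Felix node it uplinks to.
    Routes that do not touch the failed node are left unchanged.
    """
    if failed_felix not in routes.values():
        return dict(routes)  # nothing to move
    if not spares:
        raise RuntimeError("no spare Felix node available")
    spare = spares[0]
    return {rce: (spare if felix == failed_felix else felix)
            for rce, felix in routes.items()}


# Hypothetical starting point: two active Felix nodes and one spare.
routes = {"outbound_rce_0": "felix_0", "outbound_rce_1": "felix_1"}
after_failure = reroute(routes, failed_felix="felix_1", spares=["felix_2"])
print(after_failure)
```

The same table rewrite, applied to a healthy node rather than a failed one, would model the load adjustment case in which a redundant Felix board takes on extra traffic.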

Page 10

Example Data Routing: Load Adjustment

[Figure: routing diagram. Recoverable labels: Detector Data (×7), Processing RCE (×7), ATCA RCE Interconnect, Outbound RCE (×3), Felix (×3).]

Note: Processing RCEs can also serve as outbound RCEs

Redundant Felix boards can take on additional loads due to flexible data routing in ATCA RCE!

Page 11

Data from on FELIX PCs

Page 12

Benefits Of Felix + ATCA RCE

● ATCA RCE provides a powerful front end processing platform for data processing in FPGAs
  ○ RCE data processing features included in upcoming slides
    ■ Filtering
    ■ Feature extraction
    ■ Neural Network Processing
  ○ RCE could provide SuperNova data buffering (a minute, easily)
  ○ Proven packaging, cooling, interconnects, high reliability (incl. hot-swap redundant fans and power supplies)
    ■ Already used for other experiments (LSST, ATLAS CSC, ATLAS IBL, KOTO, etc), mature design, low risk
● ATCA RCE interconnect provides ability to re-route data to Felix nodes on demand
  ○ Adjust processor load to match the amount of processing needed in back end
  ○ Route data around failed uplink fibers
  ○ Move data from a failed Felix node (or host CPU) to a redundant element
● Felix provides a point to point path between the ATCA RCEs and the back end data processing
  ○ Better flow control model than Ethernet or TCP / UDP over long links
    ■ Both PGP and GBT provide proven flow control over long link distances
  ○ Transmitted frames stay in their native size instead of being chunked up into small network transfers (Ethernet MTU)
    ■ Felix has demonstrated high throughput with larger packet sizes
  ○ End to end data integrity
    ■ GBT and PGP both have data integrity checking on their transport protocols
    ■ Minimal error handling required in receiving nodes before data processing layer
    ■ Test pattern capability over GBT or PGP links
● Back end processing model follows classic Felix architecture
  ○ Receive data in Felix node with PCI-Express card
  ○ Back end data processing with CPUs and GPUs

Page 13

ATCA RCE Data Processing

Page 14

Readout Overview

Page 15

ATCA RCE Platform Clustering

• The RCE nodes are interconnected through Ethernet
• Each COB contains a low latency 10/40Gbps Ethernet switch
  - Cut through latency < 300ns
• The COB supports a full mesh 14-slot backplane
  - Each COB has a direct 10Gbps link to every other COB in a crate
  - Any RCE in an ATCA shelf has a maximum of two switches between it and every other RCE
  - 14 * 8 = 112 RCEs in a low latency cluster
• Reliable UDP protocol allows direct firmware to firmware data sharing
• Allows for low latency data sharing between nodes
  - APA combining and edge channel data sharing
  - Neural Network data sharing

[Figure: four COBs, each with DPM 0 to DPM 3, a DTM and an Ethernet Switch; one connection is labelled “Off shelf link”.]
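The "maximum of two switches" claim follows from the full mesh backplane: every COB switch has a direct link to every other COB switch. A minimal sketch of that counting is below. It assumes one switch per COB as stated on the slide, and that the 8 RCEs per COB correspond to 4 DPMs with 2 RCEs each on the current-generation DPM; the function name is made up for illustration.

```python
COBS_PER_SHELF = 14
RCES_PER_COB = 8  # assumed: 4 DPMs x 2 RCEs/DPM (current-generation DPM)


def switches_between(cob_a: int, cob_b: int) -> int:
    """Ethernet switches traversed between RCEs sitting on two COBs of one shelf."""
    for cob in (cob_a, cob_b):
        if not 0 <= cob < COBS_PER_SHELF:
            raise ValueError(f"COB slot {cob} is outside the shelf")
    # Same COB: only the local switch. Different COBs: local switch,
    # one direct backplane link, then the remote switch.
    return 1 if cob_a == cob_b else 2


cluster_size = COBS_PER_SHELF * RCES_PER_COB
worst_case = max(switches_between(a, b)
                 for a in range(COBS_PER_SHELF)
                 for b in range(COBS_PER_SHELF))
print(cluster_size, "RCEs, at most", worst_case, "switches apart")
```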

Page 16

Oxford Design: Revision C01

● ZYNQ: XCZU15EG-1FFVB1156E
● PL DDR4: 8 GB on DPM
● PS DDR4: 8 GB on DPM
● M.2 NVMe: 512 GB on DPM
  ○ Located above the DPM’s DDR ICs
● Dimensions: 85.09 mm x 110 mm
  ○ Increased by 1.27 mm for NVMe

[Figure: DPM board. Labels: XCZU15EG-1FFVB1156E, DDR4 ICs, M.2 NVMe, SD Memory Card, JTAG.]

Page 17

DPM Redesign for DUNE

● Oxford/SLAC Collaboration
● Optimized for large memory buffering on the DPM
● Only 24 GT channels on this FPGA
  ○ 20 of 24 GTs for the FEBs:
    ■ 80 links/COB @ 1.28 Gbps (8B/10B)
  ○ 2 of 24 GTs for the ETH SW:
    ■ two separate 10 GbE (10Gbps/lane, 64B/66B) to ETH SW
  ○ 2 of 24 GTs for the Felix:
    ■ 2 RX lanes and up to 22 TX lanes
● Able to support redundant Felix connections
    ■ 20 Gb/s @ 2 lanes (10Gbps/lane, 64B/66B)

[Figure: DPM block diagram. Labels: SuperNova Pre-Buffer, SuperNova Post-Buffer, Linux Kernel + SuperNova Pre-Buffer, Boot Memory.]

Unused FEB TX lanes can be used to increase bandwidth to Felix
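The lane counts on this slide can be checked with a small budget calculation. The sketch below assumes each GT channel has one RX and one TX lane and that the FEB links use only the RX side, which is what the note about unused FEB TX lanes suggests; the names are illustrative.

```python
GT_TOTAL = 24
DPMS_PER_COB = 4
GT_ALLOC = {"feb": 20, "eth_switch": 2, "felix": 2}  # GT channels per DPM

gt_used = sum(GT_ALLOC.values())
feb_links_per_cob = GT_ALLOC["feb"] * DPMS_PER_COB

# Felix gets its own 2 RX + 2 TX lanes; the TX halves of the FEB GTs are
# otherwise idle and could carry additional uplink traffic.
felix_rx_lanes = GT_ALLOC["felix"]
felix_tx_lanes_max = GT_ALLOC["felix"] + GT_ALLOC["feb"]

LANE_GBPS = 10  # line rate per lane, 64B/66B encoded
felix_baseline_gbps = GT_ALLOC["felix"] * LANE_GBPS

print("GTs used:", gt_used, "of", GT_TOTAL)
print("FEB links per COB:", feb_links_per_cob)
print("Felix lanes: RX", felix_rx_lanes, "TX up to", felix_tx_lanes_max)
print("Baseline Felix bandwidth:", felix_baseline_gbps, "Gb/s")
```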

Page 18

Backup Slides

Page 19

ATCA Packaging for DUNE

● 1 APA = 2560 channels
● 1 APA per COB
  ○ 4 DPMs per COB
  ○ 640 channels per DPM
● 150 APA for the entire system = 150 COBs
● Total Rack space: 165U
  ○ 11x 14-slot ATCA crates
  ○ 15U per 14-slot ATCA Crate
    ■ http://www.asis-pro.com/maxum-atca-systems/14-Slot-14U-MaXum-460
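The crate count and rack space follow from the APA count. A short sketch of that arithmetic, using only the figures on the slide (variable names are illustrative):

```python
import math

N_APAS = 150
CHANNELS_PER_APA = 2560
SLOTS_PER_CRATE = 14
RACK_UNITS_PER_CRATE = 15

n_cobs = N_APAS                                  # 1 APA per COB
n_crates = math.ceil(n_cobs / SLOTS_PER_CRATE)   # partially filled crate counts
rack_units = n_crates * RACK_UNITS_PER_CRATE
total_channels = N_APAS * CHANNELS_PER_APA
spare_slots = n_crates * SLOTS_PER_CRATE - n_cobs

print(n_crates, "crates,", rack_units, "U,", total_channels, "channels,",
      spare_slots, "spare slots")
```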

Page 20

ATCA Power/Cooling Estimates for DUNE

● COB Max Power: 300W
  ○ ~100W for ETH SW
  ○ 36W for RTM (limited by 3A fuse)
  ○ 160 W for digital processing
    ■ 40W per DPM
● Total Max Power: 45kW
● Cooling via forced air (integrated into the ATCA platform)
● Power and thermal monitoring via standard IPMI interface
● Example of an ATCA crate that supports 400W per slot
  ○ http://www.asis-pro.com/maxum-atca-systems/14-Slot-14U-MaXum-460
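The power figures can be cross-checked by summing the per-component numbers. The sketch below uses the slide's values plus the 150 COB count from the packaging slide; the names are illustrative.

```python
COMPONENT_W = {"eth_switch": 100, "rtm": 36, "dpm": 40}
DPMS_PER_COB = 4
COB_MAX_W = 300
N_COBS = 150

digital_w = DPMS_PER_COB * COMPONENT_W["dpm"]
cob_sum_w = COMPONENT_W["eth_switch"] + COMPONENT_W["rtm"] + digital_w
total_max_kw = N_COBS * COB_MAX_W / 1000

# The RTM figure is quoted as fuse-limited: 36 W through a 3 A fuse
# implies a 12 V supply rail.
rtm_rail_v = COMPONENT_W["rtm"] / 3

print("per-COB sum:", cob_sum_w, "W (budget", COB_MAX_W, "W)")
print("system maximum:", total_max_kw, "kW")
print("implied RTM rail:", rtm_rail_v, "V")
```

The itemised components come to slightly less than the 300 W per-COB maximum, and the 45 kW total uses the 300 W figure.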

Page 21

ATCA Costs for DUNE (Updated for quantity)

● RCE Cost Estimate:
  ○ Upgraded DPM + Flash: $2.5K
  ○ Upgrade COB: $4K
  ○ RTM: $2K
  ○ DTM: $1K
  ○ ~$17k/unit
● 14-slot ATCA crates, in quantity, 2019
  ○ ~$7k/unit
  ○ IPMI + shelf manager + 10GbE/40GbE backplane + fans + power supplies
● Total ATCA Hardware Cost: $2.7M
  ○ 11x ATCA crates
  ○ 150x RCE ATCA slots
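The unit and total costs can be reproduced from the line items. The sketch below assumes the ~$17k unit price covers one COB populated with four DPMs (the DPM count comes from the numerology slide, not from this one); names are illustrative.

```python
DPM_USD = 2_500       # upgraded DPM + flash
COB_USD = 4_000
RTM_USD = 2_000
DTM_USD = 1_000
DPMS_PER_COB = 4      # assumption taken from the numerology slide
CRATE_USD = 7_000
N_SLOTS = 150
N_CRATES = 11

unit_usd = DPMS_PER_COB * DPM_USD + COB_USD + RTM_USD + DTM_USD
total_usd = N_SLOTS * unit_usd + N_CRATES * CRATE_USD

print(f"per RCE slot: ${unit_usd:,}")
print(f"total hardware: ${total_usd:,}")
```

Under these assumptions the itemised total comes out a little below the quoted $2.7M, which is consistent with the slide's figure being rounded up.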

Page 22

Packaging And Architecture Thoughts

Page 23

ATCA Components
Proven standard, built to be robust and reliable, also fully monitored

[Figure: ATCA shelf with labelled components: Air Intake Filter, Intake Fans, Exit Fans, Power supply (DC or AC input), Shelf Manager.]

● Telecom standard designed for “5 nines” uptime
● Almost all components can be replaced in the field
● Redundancy is available if desired
  ○ N + 1 redundancy for power supplies
  ○ Redundant shelf managers
● System is designed to handle one fan failure in each fan tray
  ○ Shelf manager generates alarm to request fan tray replacement

Page 24

ATCA Provides Management & Monitoring Features Required In Reliable & Maintainable DAQ Designs

[Figure: shelf management block diagram. Recoverable labels: Shelf Managers, Ethernet, Console, Power Supplies, Fan Trays, EPROMs, Application Card, IPMC, EEPROM.]

● ATCA uses IPMI for management purposes
  ○ Intelligent Platform Management Interface
● Manages and monitors all shelf based components
  ○ Power supply status and power
  ○ Shelf inlet and exit temperatures
  ○ Fan speed control and monitoring
  ○ Application card control and monitoring
● Redundant EEPROMs contain all shelf information
  ○ Shelf serial number, location and ID
  ○ Shelf manager IP/MAC address
● Application card hosts IPMC
  ○ Intelligent Platform Management Controller
● IPMC hosts all application card information in local EEPROM
  ○ MAC addresses
  ○ Serial number, card type & revision

Page 25

Supernova Buffering In Two Stages (Update)

● Pre trigger buffer stores data in a ring buffer waiting for a supernova trigger
  ○ 640 channels per RCE (1x APA per COB)
  ○ 2 MHz ADC sampling rate
  ○ 12-bits per ADC
  ○ Raw Bandwidth: 15.36 Gbps (1.92 GB/s)
    ■ 640 x 2MHz x 12b
  ○ Each DPM has 16 GB RAM:
    ■ 9.6 TB DDR4 RAM for the whole system across 150x COBs
  ○ Total Memory for supernova “pre-buffering”: 15 GB
    ■ PL 8 GB + PS 7 GB (1GB for Kernel & OS)
  ○ Without compression: 7.8 seconds pre-trigger buffer
    ■ Assuming 12-bit packing to remove 4-bit overhead when packing into bytes
● Post trigger buffer stores data in flash based SSD before backend DAQ
  ○ Write sequence occurs once per supernova trigger: Low write wearing over experiment lifetime
  ○ Low bandwidth background readout post trigger: Does not impact normal data taking
  ○ ~$180K for NVMe M.2 SSD buffering (150x COBs x 4 DPMs/COB x $300/NVMe)
  ○ 512GB/DPM = 266 second post-trigger buffer
  ○ Samsung NVMe SSD 960 PRO: Sequential write up to 2.1GB/s
    ■ SSD write bandwidth matches well with 640 channels of uncompressed data

NOTE: NO compression factor is applied in slide (only RAW bandwidths)
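The buffer depths quoted on this slide follow directly from the raw data rate. The sketch below reproduces them with decimal units (1 GB = 1e9 bytes), which is what the slide's numbers imply; the variable names are illustrative, and the 16-bit case is added only to show what the 12-bit packing buys.

```python
CHANNELS = 640
SAMPLE_RATE_HZ = 2_000_000
ADC_BITS = 12
PRE_BUFFER_GB = 15      # PL 8 GB + PS 7 GB
SSD_GB = 512            # per DPM
N_COBS, DPMS_PER_COB = 150, 4
RAM_PER_DPM_GB = 16
NVME_USD = 300

raw_gbps = CHANNELS * SAMPLE_RATE_HZ * ADC_BITS / 1e9      # gigabits per second
raw_gbytes_per_s = raw_gbps / 8

pre_trigger_s = PRE_BUFFER_GB / raw_gbytes_per_s
post_trigger_s = SSD_GB / raw_gbytes_per_s

# Without 12-bit packing, each sample would occupy a 16-bit word.
unpacked_gbytes_per_s = raw_gbytes_per_s * 16 / ADC_BITS
pre_trigger_unpacked_s = PRE_BUFFER_GB / unpacked_gbytes_per_s

system_ram_tb = N_COBS * DPMS_PER_COB * RAM_PER_DPM_GB / 1000
ssd_cost_usd = N_COBS * DPMS_PER_COB * NVME_USD

print(f"raw rate: {raw_gbps:.2f} Gbps = {raw_gbytes_per_s:.2f} GB/s")
print(f"pre-trigger depth: {pre_trigger_s:.1f} s "
      f"({pre_trigger_unpacked_s:.1f} s if stored as 16-bit words)")
print(f"post-trigger depth: {post_trigger_s:.0f} s")
print(f"system RAM: {system_ram_tb} TB, SSD cost: ${ssd_cost_usd:,}")
```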

Page 26

Zynq Ultrascale+ and M.2 SSD Performance

● Benchmarked read/write bandwidth into Samsung NVMe SSD 960 PRO with the ZYNQ PS PCIe root complex interface
● M.2 SSD mounted and formatted as EXT4 hard drive
● Running on ArchLinux
● Measuring ~1.6GB/s for reading/writing dummy data generated by the CPU
  ○ Limited by the Zynq’s PCIe GEN2 x 4 lane interface (Theoretical limit: 2.0GB/s)
    ■ Not limited by M.2 SSD’s controller
● Because the input bandwidth is 1.92GB/s > 1.6 GB/s SSD write speed, we would be able to buffer for 37 seconds in DDR before 100% back pressure
● Need some amount of compression before the SSD to prevent bottlenecking at the SSD
● This is a very simple test with only one process
  ○ Need to do stress testing of other interfaces in parallel with the SSD to confirm rate is still 1.6GB/s
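The back pressure estimate is a fill-rate calculation: the DDR buffer fills at the difference between the input rate and the SSD write rate. The sketch below works through it. The slide does not say which DDR capacity the 37 second figure assumes; roughly 12 GB reproduces it, and that value is an inference here, not a number from the slide. The function name is illustrative.

```python
def seconds_until_full(buffer_gb: float, in_gb_s: float, out_gb_s: float) -> float:
    """Time for a buffer to fill when it is written faster than it is drained."""
    fill_rate = in_gb_s - out_gb_s
    if fill_rate <= 0:
        return float("inf")  # the drain keeps up; the buffer never fills
    return buffer_gb / fill_rate


INPUT_GB_S = 1.92   # 640 channels of raw 12-bit data
SSD_GB_S = 1.6      # measured write rate

# PCIe Gen2: 5 GT/s per lane with 8b/10b encoding, 4 lanes, 8 bits per byte.
pcie_gen2_x4_gb_s = 5 * 4 * (8 / 10) / 8

t_12gb = seconds_until_full(12, INPUT_GB_S, SSD_GB_S)   # inferred capacity
t_15gb = seconds_until_full(15, INPUT_GB_S, SSD_GB_S)   # full pre-buffer memory
min_compression = INPUT_GB_S / SSD_GB_S                  # ratio needed to keep up

print(f"PCIe Gen2 x4 limit: {pcie_gen2_x4_gb_s:.1f} GB/s")
print(f"12 GB buffer lasts {t_12gb:.1f} s, 15 GB lasts {t_15gb:.1f} s")
print(f"compression factor needed to avoid back pressure: {min_compression:.2f}")
```

Any compression factor above about 1.2 brings the input rate below the measured SSD rate, which is the point of the slide's remark that some compression is needed ahead of the SSD.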

Page 27

Optional Compression

● Past development has shown firmware compression can be costly in FPGA resources
● If compression is done in firmware, a minimal LUT footprint would be required
● With the high performance Zynq Ultrascale+, real-time software compression does become a reality.

Algorithm                         kLUTs/DPM   kFFs/DPM    DSP48/DPM   RAM(Mb)/DPM
Arithmetic Probability Encoding   292 (86%)   120 (18%)   75 (<1%)    22.3 (38%)
Huffman                           143 (43%)   60 (9%)     75 (<1%)    22.3 (38%)

FPGA Resources for 640 channel per DPM compression
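For readers unfamiliar with the Huffman option in the table, the sketch below shows the idea in software: frequent symbols get short codes. It is not the firmware design behind the resource numbers, which the slides do not describe. Coding sample-to-sample differences is an assumption made here for illustration, and the waveform is a made-up pedestal with one small pulse.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(n, next(tie), {s: 0}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**left, **right}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


# Hypothetical 12-bit ADC samples (illustrative values only).
samples = [900, 901, 900, 899, 900, 900, 905, 920,
           910, 901, 900, 900, 899, 900, 901, 900]
deltas = [b - a for a, b in zip(samples, samples[1:])]

lengths = huffman_code_lengths(deltas)
coded_bits = 12 + sum(lengths[d] for d in deltas)  # first sample sent raw
raw_bits = 12 * len(samples)
print(f"raw: {raw_bits} bits, Huffman-coded deltas: {coded_bits} bits")
```

The bit count ignores the cost of transmitting the code table, so it overstates the gain for a block this short.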

Page 28

Waveform Extraction

● See slides from JJ Russell here: https://docs.google.com/presentation/d/1XufamuZOdFGkIlHZEw4N8nXMSUEbK9OlhQ9pcAGn4wk/edit?usp=sharing