Introduction Reliability in WRN Summary Q&A

Reliability in White Rabbit Network

Maciej Lipinski

Hardware and Timing Section / Institute of Electronic Systems
CERN / Warsaw University of Technology

1st June 2012
Wilga Symposium

Maciej Lipinski White Rabbit 1/17

Outline

1 Introduction

2 Reliability in WRN
  Definition
  Data Redundancy
  Topology Redundancy

3 Summary

4 Q&A

What is White Rabbit?

Accelerator’s control and timing

International collaboration

Based on well-known technologies

Open Hardware and Open Software

Main features:
  transparent, high-accuracy synchronization
  low-latency, deterministic data delivery
  designed for high reliability
  plug & play

White Rabbit applications

Existing applications:
  CERN Neutrino to Gran Sasso

Future applications:
  CERN and GSI
  HiSCORE: Gamma & Cosmic-Ray experiment (Tunka, Siberia)
  The Large High Altitude Air Shower Observatory (Tibet)

Potential applications:
  Cherenkov Telescope Array
  ITER
  European deep-sea research infrastructure (KM3NET)

White Rabbit – enhanced Ethernet

Two separate services (enhancements to Ethernet) provided by WR:
  High accuracy/precision synchronization
  Deterministic, reliable and low-latency Control Data delivery

Reliability in a White Rabbit Network (WRN)

WRN is functional if it provides all its services to all its clients at any time.

Control Data

Two types of data:
  Control Data (High Priority, HP)
  Standard Data (Best Effort)

Characteristics of Control Data:
  Sent in Control Messages
  Sent by Data Master(s)
  Broadcast (one-to-many)
  Deterministic and low latency
  Reliable delivery

Data Redundancy: Forward Error Correction (FEC)

Re-transmission of Control Data not possible

Forward Error Correction – additional transparent layer:
  One Control Message encoded into N Ethernet frames
  Recovery of Control Message from any M (M<N) frames

FEC can prevent data loss due to:
  bit error
  network reconfiguration
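
The "any M of N frames" idea can be illustrated with the simplest possible erasure code: M data frames plus one XOR parity frame (N = M + 1). This is only a sketch of the principle; the slides do not say which code White Rabbit's FEC uses, and the function names `encode` and `decode` are hypothetical.

```python
from functools import reduce
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(message: bytes, m: int) -> List[bytes]:
    """Split the message into m data frames and append one XOR parity frame."""
    size = -(-len(message) // m)  # ceiling division
    padded = message.ljust(size * m, b"\0")
    data = [padded[i * size:(i + 1) * size] for i in range(m)]
    return data + [reduce(xor_bytes, data)]


def decode(received: Dict[int, bytes], m: int, length: int) -> bytes:
    """Rebuild the message from any m of the m + 1 frames (keyed by index)."""
    if len(received) < m:
        raise ValueError("too many frames lost for this code")
    frames = dict(received)
    missing = [i for i in range(m) if i not in frames]
    if missing:
        # One data frame is missing, so the parity frame (index m) is present:
        # XOR of all surviving frames reproduces the lost data frame.
        frames[missing[0]] = reduce(xor_bytes, frames.values())
    return b"".join(frames[i] for i in range(m))[:length]


message = b"WR control message"
frames = encode(message, m=4)  # N = 5 frames on the wire
survivors = {i: f for i, f in enumerate(frames) if i != 2}  # frame 2 lost
assert decode(survivors, m=4, length=len(message)) == message
```

A single parity frame tolerates only one lost frame; tolerating more losses needs a stronger code, which this sketch does not attempt.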

Topology Redundancy

Standard Ethernet solution:
  Rapid/Multi Spanning Tree Protocol
  Reconfiguration time: ≈ 1 s (best: milliseconds)
  1 s ≈ 82 000 Ethernet frames lost

Extensive research:
  existing standards
  academic experts
  expert companies

Solution:
  take advantage of FEC
  speed up (R/M)STP -> eRSTP
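
The order of magnitude of the frame-loss figure can be checked with a short calculation. The slide does not state its assumptions; the sketch below assumes Gigabit Ethernet and back-to-back maximum-size untagged frames, which gives roughly 81 000 frames per second, close to the quoted ≈ 82 000.

```python
# Assumptions (not stated on the slide): 1 Gb/s line rate, maximum-size frames.
LINE_RATE_BPS = 1_000_000_000
FRAME_BYTES = 1518           # header + 1500-byte payload + FCS
PREAMBLE_SFD_BYTES = 8       # preamble and start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12    # minimum idle time between frames


def frames_per_second(line_rate_bps: int = LINE_RATE_BPS) -> int:
    """Whole frames that fit on the wire in one second."""
    bits_per_frame = 8 * (FRAME_BYTES + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES)
    return line_rate_bps // bits_per_frame


lost_in_one_second = frames_per_second()
print(f"frames lost during a 1 s reconfiguration: {lost_in_one_second}")
```

Smaller frames would make the loss count larger, so the figure is a lower bound for a fully loaded link under these assumptions.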

eRSTP + FEC

eRSTP + FEC = seamless redundancy <=> max 2 frames lost

500-byte message (288-byte FEC) – max re-configuration time ≈ 2.3 µs
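
One way to read the ≈ 2.3 µs figure is as the time a single 288-byte FEC frame occupies the wire. The slide does not give the derivation, so the sketch below rests on two assumptions: 288 bytes is the size of one FEC-encoded frame, and the link runs at 1 Gb/s.

```python
LINE_RATE_BPS = 1_000_000_000  # assumption: Gigabit Ethernet
FEC_FRAME_BYTES = 288          # assumption: size of one FEC-encoded frame


def transmission_time_s(frame_bytes: int, line_rate_bps: int) -> float:
    """Time needed to put frame_bytes on the wire, in seconds."""
    return frame_bytes * 8 / line_rate_bps


frame_time_us = transmission_time_s(FEC_FRAME_BYTES, LINE_RATE_BPS) * 1e6
print(f"one FEC frame on the wire: {frame_time_us:.3f} us")
```

Under these assumptions the budget is 288 × 8 = 2304 bit times, i.e. about 2.3 µs, which suggests the switch-over has to complete within roughly one frame time.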

enhanced Rapid Spanning Tree Protocol (eRSTP)

RSTP’s a priori information (alternate/backup)

Limited number of topologies

Drop only on reception – within VLAN, except self-sending

Take advantage of broadcast characteristic of Control Data

Do it in hardware!!!
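
The use of RSTP's a priori information can be sketched as a toy model: the alternate port is already known before the failure, so a failed root port can be replaced without waiting for a new protocol exchange. This is an illustrative software model only, not the eRSTP hardware logic; `Port`, `Role` and `on_link_down` are hypothetical names, and choosing the lowest-numbered alternate port is a simplification of RSTP's priority comparison.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Role(Enum):
    ROOT = "root"
    DESIGNATED = "designated"
    ALTERNATE = "alternate"
    BACKUP = "backup"
    DISABLED = "disabled"


@dataclass
class Port:
    number: int
    role: Role
    link_up: bool = True
    forwarding: bool = False


def on_link_down(ports: Dict[int, Port], failed: int) -> Optional[int]:
    """Handle a link failure; return the new root port number, if any."""
    port = ports[failed]
    was_root = port.role is Role.ROOT
    port.link_up = False
    port.forwarding = False
    port.role = Role.DISABLED
    if not was_root:
        return None
    # A priori information: alternate ports were computed before the failure.
    candidates = sorted(
        p.number for p in ports.values()
        if p.role is Role.ALTERNATE and p.link_up
    )
    if not candidates:
        return None
    new_root = ports[candidates[0]]
    new_root.role = Role.ROOT
    new_root.forwarding = True
    return new_root.number


ports = {
    1: Port(1, Role.ROOT, forwarding=True),
    2: Port(2, Role.ALTERNATE),
    3: Port(3, Role.DESIGNATED, forwarding=True),
}
print("new root port:", on_link_down(ports, failed=1))
```

Because the decision is a table lookup, it is the kind of logic that could be moved into hardware, as the slide proposes.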

Topology Resolution Unit (TRU)

Universal and decoupled unit for topology resolution

Common firmware base for many different solutions

Two solutions considered:
  enhanced Rapid Spanning Tree Protocol (eRSTP)
  enhanced Link Aggregation Control Protocol (eLACP)

Status

Deterministic Packet Delivery

Cut-through

Separate resources

Output queuing

Optimization

Topology redundancy

Extensive study

Hardware (eRSTP)

Software (eRSTP)

Synchronization Resilience

WRPTP support - improvements for further study

Hardware support

Data Resilience

FEC Encoder - more work

FEC Decoder

Conclusions

Timing-wise WR is working now
  focus on data

Interest of standardization bodies: WR presented to ITU-T and IEEE

First deployment at CERN of WR timing and control network for AD

Increasing number of applications

First commercially available WR switch by the end of 2012

Questions and answers

[One more slide after Q&A]

Piotr Doniec (1987-2012)
