Network Telemetry:
Pushing Boundaries
Ramki Krishnan, Distinguished Engineer, SP & NFV
OSM Plenary, Dell EMC Campus
Santa Clara, CA
Network Telemetry: Where are we today?
• Primary focus on end-to-end aspects
• Relies on injected packets
• Virtualization challenges are poorly addressed
• Popular standards – Y.1731 (ITU-T), TWAMP (IETF)
• Not ready for @scale hyper-converged SP infrastructure
User Facing Converged Infrastructure Evolution
Key Takeaway: Low latency is key for HFT, VR gaming, connected cars, AR
Source: https://www.ciscoknowledgenetwork.com/files/584_04-26-16-CKN_Webinar_v2.pdf?PRIORITY_CODE=194542_20
User Facing Converged Infrastructure Evolution (2)
Goal: Low-latency edge cloud app service assurance
Gaps: Real-time per-hop and per-network-function visibility, hotspot identification
Source: https://www.ciscoknowledgenetwork.com/files/584_04-26-16-CKN_Webinar_v2.pdf?PRIORITY_CODE=194542_20
User Plane – packet data plane; SGi-LAN – service chaining of virtualized network functions such as a video transcoder
Real-time Network Monitoring – Emerging Solutions
• Initial focus on data plane latency monitoring – high value and not difficult to implement
• Appends timestamp information at each hop in the Layer 2/3/4 header
• Benefits
  • Amenable to HW implementation in programmable ASICs
  • Can compute per-hop latency
• Issues
  • Packet size varies across intermediate hops, leading to non-deterministic performance
• Key infrastructure requirement
  • Real-time network timing synchronization – IEEE 1588 PTP
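The per-hop latency computation noted above can be sketched as follows. This is a minimal illustration, not an actual ASIC or NIC implementation: it assumes every hop's clock is synchronized (e.g. via IEEE 1588 PTP) so that timestamps are directly comparable, and the hop names and nanosecond values are hypothetical.

```python
from typing import List, Tuple

# One (hop_name, timestamp_ns) entry is appended to the packet at each hop.
Stamp = Tuple[str, int]

def per_hop_latency(stamps: List[Stamp]) -> List[Tuple[str, str, int]]:
    """Return (from_hop, to_hop, latency_ns) for each consecutive pair of hops."""
    return [
        (a_name, b_name, b_ts - a_ts)
        for (a_name, a_ts), (b_name, b_ts) in zip(stamps, stamps[1:])
    ]

# Hypothetical timestamps (ns) accumulated by one packet along its path.
stamps = [("nic-a", 1_000), ("tor", 1_450), ("leaf", 2_100), ("spine", 2_600)]

for src, dst, ns in per_hop_latency(stamps):
    print(f"{src} -> {dst}: {ns} ns")
```

Because each hop appends a new entry, the packet grows along the path, which is the variable-size issue listed above.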
Real-time Network Monitoring – New Directions
• Real-time end-to-end data collection for selective flows (e.g. live video) across NICs and routers - @scale with no intermediate node monitoring
• Monitor a programmable number of nodes with a pre-defined header size – deterministic performance
• Hierarchical monitoring framework – service chain, overlay, underlay etc.
[Figure: L3 network fabric (OS9/10) with Spine Z9500, Leaf S6000 and ToR S4048 switches, 40G links, rack servers (R630, R730 etc.), and Virtual Network Functions (A) and (B)]
- Pre-construct programmable timestamp header for all hops
- Use timestamp append model in NICs, switches/routers
- Mirror packet with entire timestamp information in the last hop
- Examine latency deviation over baseline for anomalies
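The four steps above can be sketched in software as follows. This is an illustrative model only: the header layout (a 1-byte hop counter plus fixed 64-bit timestamp slots), the names MAX_HOPS, stamp, latencies and deviates, and the 50% tolerance are all assumptions, not the format of any real NIC, switch, or INT specification.

```python
import struct
from typing import List

MAX_HOPS = 4                      # assumed programmable number of monitored hops
HEADER_FMT = f"!B{MAX_HOPS}Q"     # hop counter + MAX_HOPS 64-bit ns timestamp slots
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # constant, so packet size never changes

def new_header() -> bytes:
    """Pre-construct the header with every timestamp slot zeroed."""
    return struct.pack(HEADER_FMT, 0, *([0] * MAX_HOPS))

def stamp(header: bytes, now_ns: int) -> bytes:
    """Write this hop's timestamp into the next free slot (size is unchanged)."""
    count, *slots = struct.unpack(HEADER_FMT, header)
    if count < MAX_HOPS:          # hops beyond the programmed limit are not recorded
        slots[count] = now_ns
        count += 1
    return struct.pack(HEADER_FMT, count, *slots)

def latencies(header: bytes) -> List[int]:
    """At the last hop: per-hop latencies (ns) from the mirrored header."""
    count, *slots = struct.unpack(HEADER_FMT, header)
    used = slots[:count]
    return [b - a for a, b in zip(used, used[1:])]

def deviates(latency_ns: int, baseline_ns: int, tolerance: float = 0.5) -> bool:
    """Flag a latency that exceeds the baseline by more than the tolerance."""
    return latency_ns > baseline_ns * (1 + tolerance)

# Hypothetical path of three hops with made-up timestamps.
hdr = new_header()
for ts in (1_000, 1_450, 2_100):
    hdr = stamp(hdr, ts)
print(len(hdr), latencies(hdr))
```

The header length stays at HEADER_SIZE bytes at every hop, which is what gives the deterministic performance referred to above.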
Real-time Network Monitoring – New Directions (2)
• Latency typically follows a long-tail distribution
• Average latency not a useful metric for anomaly baseline
• Start with an approximate value of anomaly baseline
• Refine the baseline using simple predictive analytics techniques, e.g. Holt-Winters time series forecasting algorithm
• Advanced predictive analytics techniques, e.g. machine learning – research area
• Automatic dependency/clustering detection between latency and other events such as packet drops, queue depth, poor video QoE, noisy neighbors etc.
Source: https://www.ietf.org/proceedings/96/slides/slides-96-bmwg-8.pdf
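A rough sketch of the baseline ideas above. It first shows why the mean misleads on long-tail data, then refines a baseline with Holt's linear (double) exponential smoothing. Note the substitution: this is a simplified relative of Holt-Winters that omits the seasonal component. The sample values, smoothing factors, and anomaly margin are hypothetical.

```python
import statistics
from typing import List

def percentile(samples: List[int], p: int) -> int:
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)   # ceil(p * n / 100)
    return ordered[rank - 1]

def holt_forecasts(series: List[float], alpha: float = 0.5, beta: float = 0.5) -> List[float]:
    """One-step-ahead forecasts; forecasts[i] predicts series[i + 1]."""
    level, trend = series[0], series[1] - series[0]
    forecasts = [level + trend]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        forecasts.append(level + trend)
    return forecasts

def anomalies(series: List[float], margin: float) -> List[int]:
    """Indexes whose value exceeds the forecast baseline by more than margin."""
    f = holt_forecasts(series)
    return [i + 1 for i in range(len(series) - 1) if series[i + 1] > f[i] + margin]

# Long-tail latencies (microseconds): 98 fast samples and two slow outliers.
samples = [10] * 98 + [500, 900]
print(statistics.mean(samples), percentile(samples, 50), percentile(samples, 99))

# Baseline refinement on a steadily rising latency series with one spike.
series = [100, 110, 120, 130, 300]
print(anomalies(series, margin=50))
```

In the long-tail sample the mean sits well above the median yet far below the 99th percentile, so neither typical nor tail behaviour is captured by the average alone.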
Real-time Network Monitoring – New Directions (3)
• Beyond network latency …
  • Other key aspects to monitor are queue depth, ingress/egress port bandwidth etc.
  • These are not as straightforward to implement in the packet data path as latency
• Last but not least …
  • Orchestration is a key piece of the overall solution
  • Goal is to align with OSM and other orchestrators
• Dell EMC Industry Leadership …
  • P4 In-band Network Telemetry: http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf
  • IETF NFVRG Leadership (https://irtf.org/nfvrg): Real-time properties work item
Acknowledgements
• Co-conspirator
• Anoop Ghanwani, Dell EMC