![Page 1: Research Faculty Summit 2018...Machine Learning in Azure Networking (a few sample problems) David A. Maltz Distinguished Engineer Azure Physical Networking Team dmaltz@microsoft.com](https://reader033.vdocuments.net/reader033/viewer/2022042323/5f0dc2277e708231d43bf0f1/html5/thumbnails/1.jpg)
Systems | Fueling future disruptions
Research Faculty Summit 2018
Machine Learning in Azure Networking (a few sample problems)
David A. Maltz
Distinguished Engineer
Azure Physical Networking Team
dmaltz@microsoft.com
Large Scale Creates Large Problems
• 100,000s of links in each datacenter
• 10,000s of links in each MAN
• 1,000s of links in the WAN
→ High availability is job #1 for the network
At scale, the law of large numbers is not your friend
• “Occam’s Razor” (the simplest explanation is the most likely) stops being a reliable guide
• Instead, “Murphy’s Law” applies: whatever can go wrong, will
Finding the cause of perceived network problems is hard
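A rough way to see why scale works against availability: if each link is independently down with some small probability, the chance that at least one link is down somewhere grows quickly with the link count. The sketch below assumes independent failures and a made-up per-link probability (`P_FAIL` is hypothetical, not a figure from the talk); the link counts are the orders of magnitude listed above.

```python
def prob_any_failure(num_links: int, p_fail: float) -> float:
    """Probability that at least one of num_links independent links is down."""
    return 1.0 - (1.0 - p_fail) ** num_links


def expected_failures(num_links: int, p_fail: float) -> float:
    """Expected number of links that are down at a given moment."""
    return num_links * p_fail


# Hypothetical probability that any single link is down at a given moment.
P_FAIL = 1e-4

for name, links in [("WAN", 1_000), ("MAN", 10_000), ("datacenter", 100_000)]:
    print(
        f"{name:>10}: P(at least one link down) = "
        f"{prob_any_failure(links, P_FAIL):.4f}, "
        f"expected links down = {expected_failures(links, P_FAIL):.1f}"
    )
```

Under these assumptions, a fault that is rare for any one link becomes close to certain somewhere in a datacenter-sized population.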
Machine Learning in Azure Network: A Few Sample Problems
Problem → Machine Learning or Rules-Based System
• Machine Learning: “I don't understand the underlying physics that causes this; however, I see outcomes, I know good vs bad, and I want to try and understand the outcome”
• Rules-Based System: “I have a good physical model and understanding of causes”
Topology: Which cables would you choose?
Region and Path Availability
Machine Learning in Azure Network: Layer-1 Sample Problems
Wavelength & Performance Optimization
• Gaussian Noise (GN) model code is implemented as GNPy on GitHub
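As a minimal illustration of the kind of trade-off a Gaussian Noise model captures, the sketch below uses the textbook simplification in which nonlinear interference grows as the cube of per-channel launch power, so the signal-to-noise ratio peaks at a finite power. This is not GNPy's API; the function names, `P_ASE_MW`, and `ETA_PER_MW2` are hypothetical values chosen only to make the example run.

```python
import math


def gsnr(p_ch_mw: float, p_ase_mw: float, eta_per_mw2: float) -> float:
    """Linear SNR when nonlinear noise is modeled as eta * P^3 (simplified GN form)."""
    return p_ch_mw / (p_ase_mw + eta_per_mw2 * p_ch_mw ** 3)


def optimal_launch_power(p_ase_mw: float, eta_per_mw2: float) -> float:
    """Power that maximizes gsnr(); setting the derivative to zero gives P_ASE = 2 * eta * P^3."""
    return (p_ase_mw / (2.0 * eta_per_mw2)) ** (1.0 / 3.0)


# Hypothetical link parameters (assumptions, not values from the talk).
P_ASE_MW = 1e-3      # amplifier (ASE) noise power in the channel bandwidth, mW
ETA_PER_MW2 = 1e-3   # nonlinear interference coefficient, 1/mW^2

p_opt = optimal_launch_power(P_ASE_MW, ETA_PER_MW2)
best = gsnr(p_opt, P_ASE_MW, ETA_PER_MW2)
print(f"optimal launch power: {p_opt:.3f} mW")
print(f"SNR at optimum: {10 * math.log10(best):.1f} dB")
```

Launching below the optimum leaves the channel limited by amplifier noise, and launching above it leaves the channel limited by nonlinear interference.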
Network Availability
“Gray” switch failures are the worst
• Switch stays in service
• Drops some fraction of packets
Find the needle in the haystack
• Pingmesh
• Targeted probe packets
• Error messages sent from switches
• Service-level health metrics
Combine them all to localize the problem to the most likely switch
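One simple way to picture combining these signals is a per-switch suspicion score. The sketch below is an assumption-laden toy, not the Azure system: it scores each switch by the loss rate of probes that traversed it, plus a weighted term for switch-reported errors. The function names, the `error_weight` parameter, and the topology are all hypothetical.

```python
from collections import defaultdict


def rank_switches(probes, error_counts=None, error_weight=0.5):
    """Rank switches by suspicion score, most suspicious first.

    probes: iterable of (path, delivered), where path is a list of switch names.
    error_counts: optional dict of switch name -> number of reported errors.
    """
    sent = defaultdict(int)
    lost = defaultdict(int)
    for path, delivered in probes:
        for switch in path:
            sent[switch] += 1
            if not delivered:
                lost[switch] += 1

    error_counts = error_counts or {}
    max_errors = max(error_counts.values(), default=0)

    scores = {}
    for switch in sent:
        loss_rate = lost[switch] / sent[switch]
        # Normalize error counts to [0, 1] so the two signals are comparable.
        error_term = error_counts.get(switch, 0) / max_errors if max_errors else 0.0
        scores[switch] = loss_rate + error_weight * error_term
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def make_probes(path, total, lost):
    """Build `total` probe results for one path, the first `lost` of them dropped."""
    return [(path, i >= lost) for i in range(total)]


# Hypothetical topology: agg_b silently drops 30% of the probes that cross it.
probes = (
    make_probes(["tor1", "agg_a", "tor2"], 10, 0)
    + make_probes(["tor1", "agg_b", "tor2"], 10, 3)
    + make_probes(["tor3", "agg_a", "tor4"], 10, 0)
    + make_probes(["tor3", "agg_b", "tor4"], 10, 3)
)
ranking = rank_switches(probes, error_counts={"agg_b": 4, "tor3": 1})
for switch, score in ranking:
    print(f"{switch}: {score:.3f}")
```

Switches that share paths with the faulty one also show some loss, which is why no single signal is decisive and the ranking only suggests the most likely culprit.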
Machine Learning in the Azure Network Team: Practitioner's Guide
Thank you!