Greenfielding Network and Systems Automation in a Large and Highly Dynamic Public Transit Network
Logan Best
DevOps Engineer
Transit Wireless
Share your automation story
1. How did you get started with Ansible?
2. How long have you been using it?
3. What's your favorite thing to do when you Ansible?
-vvv
Disclaimer:
This talk will be intentionally vague in some cases due to NDA and proprietary IP that I cannot divulge.
Any opinions expressed are my own and not my employer's.
Core Network
Cisco
● IOS
● IOS-XE
● IOS-XR
● NX-OS
● ASA
Extreme
● NX9500
● NX9600
● VX9000
Nokia
● ALu
● ALE
Westell
Mikrotik
Digi LTE
Debian
Ubuntu
CentOS
Proxmox
Oxidized
Zabbix
The list goes on….
What does this all come down to?
● We have a massive footprint of vendors, versions, and platforms to cover
● Almost 20,000 devices just in NYC
● network_cli just isn’t enough sometimes
● Yes, that means some things rely on telnet >.<
● LOADS of underlying groundwork required
So how do you even begin?
● Talk to your peers about existing pain points
● Where’s the low-hanging fruit you can get easy wins with?
● How’s the existing infrastructure set up? What’s missing?
What are the current projects?
● Find out what your team or related teams are working on
● How can those tasks be automated?
What was missing?
● Source of Truth
● Secrets Management
● CMDB
● Central Authentication
● Self Service
● DEVELOPERS DEVELOPERS DEVELOPERS
Whew… So how do we even get started?
● Crawl, Walk, Run principle
● K.I.S.S.
● Have a BIG emphasis on team training and buy-in
● Network Audit
● Get corporate buy-in on conferences, trainings, and certifications
● Use the small initial wins as leverage
Crawl
● Utilize Network Audit to gather facts about the network
● Team Education
● Monitoring
● Automation used as needed, with validated and reviewed additive-only changes
● Start introducing input validation to reduce change risk
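A minimal sketch of what input validation for an additive-only change could look like. The talk does not show its actual validation code, so the request fields (`vlan_id`, `name`, `subnet`) and the function `validate_vlan_request` are hypothetical, chosen only to illustrate rejecting bad input before anything reaches a device.

```python
import ipaddress
import re

# Hypothetical naming rule; a real network would define its own.
VLAN_NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,32}$")


def validate_vlan_request(request, existing_vlan_ids):
    """Return a list of error strings; an empty list means the request passed."""
    errors = []

    vlan_id = request.get("vlan_id")
    # bool is a subclass of int in Python, so reject it explicitly.
    if (
        not isinstance(vlan_id, int)
        or isinstance(vlan_id, bool)
        or not 1 <= vlan_id <= 4094
    ):
        errors.append("vlan_id must be an integer between 1 and 4094")
    elif vlan_id in existing_vlan_ids:
        # Additive only: refuse to touch anything that already exists.
        errors.append(f"vlan {vlan_id} already exists; refusing to modify it")

    name = request.get("name", "")
    if not isinstance(name, str) or not VLAN_NAME_RE.match(name):
        errors.append("name must be 1-32 characters of letters, digits, '_' or '-'")

    try:
        # Strict parsing rejects prefixes with host bits set, e.g. 10.20.30.1/24.
        ipaddress.ip_network(request.get("subnet", ""))
    except (ValueError, TypeError):
        errors.append("subnet must be a valid CIDR prefix such as 10.20.30.0/24")

    return errors
```

Returning every error at once, rather than failing on the first, lets a reviewer fix a request in one pass. The same checks could run before a playbook is launched, with the validated values passed in as extra variables.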
Walk
● Introduce Netbox as Source of Truth
● Build your Inventory strategy
● Set up DNS and LDAP/RADIUS AAA
● Start simple and small when making changes to the network
● Severely limit your initial footprint to reduce risk to prod
● LAB EVERYTHING!!!
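One way an inventory strategy built on a source of truth might be sketched: turn device records into the JSON structure that Ansible accepts from a script-based dynamic inventory (groups with a `hosts` list, plus `_meta.hostvars`). The records in `DEVICES` are hypothetical stand-ins for data that would come from Netbox, and the grouping by platform and site is an assumption, not the scheme used in the talk.

```python
import json
from collections import defaultdict

# Hypothetical device records, standing in for a source-of-truth export.
DEVICES = [
    {"name": "core-sw-01", "platform": "nxos", "site": "lab", "primary_ip": "192.0.2.10"},
    {"name": "edge-rtr-01", "platform": "iosxr", "site": "lab", "primary_ip": "192.0.2.20"},
    {"name": "legacy-sw-07", "platform": "ios", "site": "station-a", "primary_ip": None},
]


def build_inventory(devices):
    """Group devices by platform and site; skip any without a management IP."""
    groups = defaultdict(lambda: {"hosts": []})
    hostvars = {}
    skipped = []

    for device in devices:
        name = device["name"]
        if not device.get("primary_ip"):
            # Incomplete records are reported, not guessed at.
            skipped.append(name)
            continue
        hostvars[name] = {"ansible_host": device["primary_ip"]}
        groups[f"platform_{device['platform']}"]["hosts"].append(name)
        # Hyphens are replaced so group names remain valid variable-style names.
        groups[f"site_{device['site'].replace('-', '_')}"]["hosts"].append(name)

    inventory = dict(groups)
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory, skipped


if __name__ == "__main__":
    inventory, skipped = build_inventory(DEVICES)
    print(json.dumps(inventory, indent=2, sort_keys=True))
```

Grouping by site makes it easy to limit a run to the lab first, which fits the advice above to keep the initial footprint small. Devices that land in `skipped` double as a to-do list for cleaning up the source of truth.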
Run
● Netbox implementation complete
● Monitoring adds new automation and device-specific metrics
● Implement rollback: integrate with Oxidized to back up on each run and restore if needed
● Automated ZTP with Ansible instead of console provisioning
● Introduce Jira and proper change/project management culture
● Auto-documenting Jira issues with Ansible!
● Getting closer to no manual changes as playbooks evolve and become more robust
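The backup-then-restore idea behind the rollback bullet can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: `FakeDevice` is a hypothetical in-memory stand-in for a real device, the backup is held in a local variable where the talk uses Oxidized, and `apply_with_rollback` is an invented name.

```python
class FakeDevice:
    """Hypothetical in-memory stand-in for a network device."""

    def __init__(self, config):
        self.config = list(config)

    def get_config(self):
        # Return a copy so callers cannot mutate device state by accident.
        return list(self.config)

    def push_config(self, lines):
        self.config = list(lines)


def apply_with_rollback(device, new_lines, verify):
    """Back up, apply additive lines, verify, and restore the backup on failure.

    Returns True if the change was kept and False if it was rolled back.
    """
    backup = device.get_config()  # taken on every run, before any change
    device.push_config(backup + list(new_lines))

    try:
        ok = bool(verify(device))
    except Exception:
        # A verification step that crashes is treated as a failed change.
        ok = False

    if not ok:
        device.push_config(backup)
        return False
    return True
```

A usage example: `apply_with_rollback(dev, ["vlan 120"], lambda d: "vlan 120" in d.get_config())` keeps the change only if the post-change check passes. Taking the backup inside the same function as the change means no run can skip it.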
How are we doing all of this?
● Python
● Ansible
● AWX
● Netbox
● Stackstorm
● Zabbix
● Jira
● Slack
● Viewflow.io
So where’s the “highly dynamic” part?
Wifi onboard the trains!
A C C A
Some train operators don’t keep cars together
How can we keep our sanity?
● Rigorous testing
● Get so good at it that you can write a whitepaper on it
● Innovate using existing protocols
● Have a backup strategy for when it all fails to provision
In the end...
● Don’t be afraid to start slow
● Don’t be afraid to start small
● Have a well-thought-out vision
● Advocate for education for yourself and your peers
● You will eventually break something.