Infrastructure API Lightning Talk by Jeremy Pollard of box.com


Upload: devops4networks

Post on 03-Dec-2014


Jeremy Pollard

What If Your Network Was Smarter Than You?

Who Am I?

• Jeremy Pollard

• Network Engineer @ Box.com

• SIGGRAPH 2015 GraphicsNet Committee Chair

• Automator

• Lindy-Hop and Blues Dancer

Complete Network Overhaul

Networks that grow organically don’t scale. News to no one.

Network Overhaul

• Old design grew as needed
  ‒ Need a switch? Add a switch.
  ‒ Flat layer 2 design.
  ‒ Did not scale.

• New design
  ‒ Greenfield!
  ‒ New hardware!
  ‒ New design!
  ‒ New datacenter!

“Let’s build a smarter network.”

- Said everyone, everywhere.

How do we do this?

What are we trying to solve?

We’re Network Engineers…

And We Like…

• Standards

• Specifications

• Designing with scalability in mind

• Repeatable patterns

And Yet We Still Have To Answer Questions Like…

• Which IP address should I use?

• Where is this host located?

• Do you know how this device is supposed to be cabled?

• Which port should I use?

• Did you configure that new switch?

Boring

Error Prone

A Waste Of Time

Costs The Company $$$

How Did Box Approach This?

By thinking outside the Box… HA! Get it?!

*crickets*

New Network Design

In 30 seconds or less

• Core / Agg / ToR model

• Fully routed to the ToR

• Two ToRs per cabinet

• Pattern-based port assignment

• Mathematically generated
  ‒ IP addresses
  ‒ Hostnames
  ‒ VLANs

• ID numbers to indicate Datacenter, Pod, Cabinet
  ‒ More on this later!

For Every Pair of ToRs

• Over 300 pieces of unique information
  ‒ IP addresses/subnets
  ‒ Pinned routes
  ‒ RADIUS / Logging / NTP / etc. servers
  ‒ Interface descriptions

• ~180 DNS records

• Cabling instructions
  ‒ 8 upstream port assignments
  ‒ 2 serial consoles
  ‒ 2 management ports

Highly Complex

Highly Automatable

Time to build a smarter network

The Infrastructure API

Infrastructure API

• HTTP-based REST API

• All things IP / Network / Datacenter

• Single source of truth

It’s our design specification

Implemented in code

Infrastructure API

• IP address management for network devices and hosts
  ‒ In-band and Out-of-Band

• Hostname generation

• DNS registration

• Generates all 300 unique pieces of info for ToR provisioning

• Generates physical cable mappings and port assignments

• Host to Security zone mapping

• Provides network information for a given IP

• Provides physical location for a given IP

Infrastructure API

• Returns JSON objects

• Easily integrates into token-based templates
  ‒ Full text configuration
  ‒ Cabling instructions

• Can be easily integrated into other services
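
To illustrate how a JSON object can feed a token-based template, here is a minimal Python sketch. The talk does not show the real response shape or template syntax, so the field names (`hostname`, `mgmt_ip`, `ntp_server`), the template text, and the `render_config` helper are all hypothetical.

```python
import json
from string import Template

# Hypothetical JSON response; the real field names are not given in the talk.
RESPONSE = '{"hostname": "tsw01", "mgmt_ip": "10.0.0.11", "ntp_server": "10.0.0.2"}'

# Hypothetical token-based template for a fragment of switch configuration.
CONFIG_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface Management1\n"
    "   ip address $mgmt_ip/24\n"
    "ntp server $ntp_server\n"
)


def render_config(response_text: str) -> str:
    """Parse the JSON object and substitute its values into the template."""
    values = json.loads(response_text)
    # substitute() raises KeyError if the response lacks a token the template needs,
    # which is safer than silently emitting a half-filled config.
    return CONFIG_TEMPLATE.substitute(values)


if __name__ == "__main__":
    print(render_config(RESPONSE))
```

The same JSON object could be rendered through a second template to produce cabling instructions, which is the appeal of keeping the data and the text layout separate.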

How Does It Work?

Fundamentals First

• Procedurally Generated

• Single Seed

• Remember the IDs?
  ‒ Datacenter
  ‒ Pod
  ‒ Cabinet
  ‒ Host Type (Production side only)
  ‒ Rack-u (Out-of-Band side only)

[Figure: an IP address in binary, 0001010.10101000.10100001.00010100, with bit ranges labeled Static, Datacenter, Pod, Cab, and Host Type]
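
The idea of packing IDs into the bits of an address can be sketched in Python as follows. The field names come from the slide, but the field widths below are purely hypothetical, since the transcript does not preserve where one field ends and the next begins; `FIELDS` and `decode_ids` are illustrative names.

```python
import ipaddress
from typing import Dict

# Hypothetical field widths, most significant bits first; they must sum to 32.
# The talk names these fields, but the actual sizes are not recoverable here.
FIELDS = [
    ("static", 8),
    ("datacenter", 4),
    ("pod", 6),
    ("cabinet", 6),
    ("host_type", 8),
]


def decode_ids(ip: str) -> Dict[str, int]:
    """Split a dotted-quad IPv4 address into the ID fields defined in FIELDS."""
    value = int(ipaddress.IPv4Address(ip))
    ids: Dict[str, int] = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        # Shift the field down to bit 0, then mask off everything above it.
        ids[name] = (value >> shift) & ((1 << width) - 1)
    return ids


if __name__ == "__main__":
    print(decode_ids("10.168.161.20"))
```

Because the mapping is pure bit arithmetic, no lookup table is needed to go from an address to a location, which is what makes the management IP usable as a single seed.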

Seeds

• IP -> Datacenter / Pod / Cabinet / Type IDs

• IDs -> Everything Else
  ‒ $cab_count = ($MAX_POD_SIZE * ($pod_id - 1)) + $cab_id
  ‒ $hostname = sprintf('tsw%02d', $cab_count)
  ‒ $serial_server_number = $cab_count / 32 + 7 * ($pod_id - 1) + 4
  ‒ $serial_port_number = 33 + (($cab_count - 1) % 32) / 2

• And so on…
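
The derivations above translate almost directly into Python. This sketch makes three assumptions that the slide does not confirm: the first formula is read as `MAX_POD_SIZE * (pod_id - 1) + cab_id` (pods and cabinets numbered from 1), the divisions are integer divisions, and `MAX_POD_SIZE` is 32, which is a placeholder value. The function name `derive` is illustrative.

```python
from typing import Dict, Union

MAX_POD_SIZE = 32  # placeholder; the talk does not state the real value


def derive(pod_id: int, cab_id: int,
           max_pod_size: int = MAX_POD_SIZE) -> Dict[str, Union[int, str]]:
    """Derive per-cabinet values from the pod and cabinet IDs (both 1-based)."""
    # Global cabinet counter across pods (assumed parenthesization).
    cab_count = max_pod_size * (pod_id - 1) + cab_id
    hostname = "tsw%02d" % cab_count
    # Integer division is an assumption; server and port numbers must be whole.
    serial_server_number = cab_count // 32 + 7 * (pod_id - 1) + 4
    serial_port_number = 33 + ((cab_count - 1) % 32) // 2
    return {
        "cab_count": cab_count,
        "hostname": hostname,
        "serial_server_number": serial_server_number,
        "serial_port_number": serial_port_number,
    }


if __name__ == "__main__":
    print(derive(pod_id=1, cab_id=1))
    print(derive(pod_id=2, cab_id=3))
```

Every value is a pure function of the IDs, so the same inputs always reproduce the same hostname and console port without consulting a database.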

New Switch Provisioning

A Use Case

In The Datacenter

• DC Tech enters rack information to get cabling specifications for the cabinet

Once Racking and Cabling is Complete:

• Manually configure the management IP address
  ‒ This will be our seed!
  ‒ We’re working on DHCP…

• Download provision.sh to the switch and execute it
  ‒ Downloads latest EOS
  ‒ Detects management IP
  ‒ API Call: device_config with management IP as the argument
    ‒ Infrastructure API generates the config
    ‒ Config is then saved to startup-config
  ‒ API Call: register_dns with management IP as the argument
    ‒ Infrastructure API calls our DNS API to register all records
  ‒ Downloads first_boot.sh
  ‒ Reboots device
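
The API-driven part of the steps above can be sketched in Python. This is not the actual provision.sh: only the call names `device_config` and `register_dns` come from the talk, while `provision`, the `ApiCall` signature, and the fake API used in the demo are hypothetical. The EOS download, the first_boot.sh download, and the reboot are left out.

```python
from typing import Callable, Dict, List

# Hypothetical helper signature: performs one Infrastructure API call,
# given the call name and the management IP, and returns the response text.
ApiCall = Callable[[str, str], str]


def provision(mgmt_ip: str, api_call: ApiCall,
              save_startup_config: Callable[[str], None]) -> List[str]:
    """Run the API-driven provisioning steps and return a log of what was done."""
    log: List[str] = []

    # Step 1: fetch the generated device config, seeded by the management IP.
    config = api_call("device_config", mgmt_ip)
    if not config.strip():
        # Refuse to overwrite startup-config with an empty response.
        raise RuntimeError(f"empty config returned for {mgmt_ip}")
    save_startup_config(config)
    log.append("startup-config saved")

    # Step 2: have the API register DNS records for the same seed.
    api_call("register_dns", mgmt_ip)
    log.append("dns registered")
    return log


if __name__ == "__main__":
    saved: Dict[str, str] = {}

    def fake_api(call: str, ip: str) -> str:
        # Stand-in for the real HTTP request; returns canned text.
        return f"! config for {ip}\n" if call == "device_config" else "ok"

    print(provision("10.0.0.11", fake_api, lambda cfg: saved.update(startup=cfg)))
    print(saved["startup"])
```

Passing the API call and the save step in as functions keeps the ordering logic testable without a switch or a live API.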

After Reboot

• first_boot.sh executed 2 minutes after boot

• API Call: inventory_update
  ‒ Inventory API scans the device collecting:
    ‒ Hostname
    ‒ Serial Numbers
    ‒ Interface IP Addresses
    ‒ Interface States

• Success!!
  ‒ Switch successfully provisioned
  ‒ Automatically added to monitoring

Other uses?

• Core / Datacenter teams host provisioning
  ‒ Host IP address assignment
  ‒ Hostname generation / DNS registration

• Hadoop rack awareness

• Assists in automating inventory audits
  ‒ Physical / logical mappings
  ‒ Host locating

• If you build it, they will come.

Humans are still needed… Right?

Right?!

You Bet!

• All those IDs need to be defined
  ‒ Thankfully it’s crazy easy!

• YAML based data structure

• Datacenters are assigned pods

• Pods exist in cages

• Pods are assigned Cabs

• Etc…
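
As a rough picture of what such a hierarchy looks like once the YAML is loaded, the Python sketch below uses a nested structure and checks it for reused IDs. The key names (`datacenters`, `pods`, `cage`, `cabs`, `id`), the sample values, and `find_duplicate_ids` are all hypothetical; the talk does not show the real schema.

```python
from typing import List

# Hypothetical shape of the ID definitions after loading the YAML file.
TOPOLOGY = {
    "datacenters": [
        {
            "name": "dc-a",
            "id": 1,
            "pods": [
                {"id": 1, "cage": "cage-1", "cabs": [{"id": 1}, {"id": 2}]},
                {"id": 2, "cage": "cage-1", "cabs": [{"id": 1}]},
            ],
        }
    ]
}


def find_duplicate_ids(topology: dict) -> List[str]:
    """Return a description of every ID that is reused within its parent scope."""
    problems: List[str] = []
    seen_dcs = set()
    for dc in topology["datacenters"]:
        if dc["id"] in seen_dcs:
            problems.append(f"duplicate datacenter id {dc['id']}")
        seen_dcs.add(dc["id"])
        seen_pods = set()
        for pod in dc["pods"]:
            if pod["id"] in seen_pods:
                problems.append(f"datacenter {dc['id']}: duplicate pod id {pod['id']}")
            seen_pods.add(pod["id"])
            seen_cabs = set()
            for cab in pod["cabs"]:
                if cab["id"] in seen_cabs:
                    problems.append(f"pod {pod['id']}: duplicate cab id {cab['id']}")
                seen_cabs.add(cab["id"])
    return problems


if __name__ == "__main__":
    print(find_duplicate_ids(TOPOLOGY) or "no duplicate IDs")
```

Since every generated value depends on these IDs, a check like this is a cheap guard on the one part of the system that humans still edit by hand.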

We’re just not answering these questions anymore…

• Which IP address should I use?

• Where is this host located?

• Do you know how this device is supposed to be cabled?

• Which port should I use?

• Did you configure that new switch?

“This sounds great! But what are the potential problems?”

- Said anyone still paying attention

Problems…

• Screw up ID allocation

• DC tech cables devices incorrectly or racks them in the wrong physical location

• Need to move an existing cab to another pod

• Bugs!

What’s Next?

To the future!!

Yet To Come

• Get DHCP working for management addresses

• Dynamically generate topology diagrams
  ‒ Graphviz
  ‒ D3
  ‒ Take your pick

• Automated validation of link health
  ‒ Up / Down
  ‒ Light levels
  ‒ dB loss
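
For the Graphviz option, generating a diagram mostly means emitting DOT text from a list of links. The Python sketch below assumes the cable mappings are available as pairs of device names; that input format, the sample hostnames, and `to_dot` are hypothetical.

```python
from typing import Iterable, List, Tuple


def to_dot(links: Iterable[Tuple[str, str]]) -> str:
    """Render undirected device-to-device links as a Graphviz DOT graph."""
    # Sort each pair so (a, b) and (b, a) collapse into a single edge,
    # then sort the edges so the output is stable between runs.
    unique = sorted({tuple(sorted(link)) for link in links})
    lines: List[str] = ["graph network {"]
    for a, b in unique:
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [("tsw01", "agg01"), ("agg01", "tsw01"), ("tsw01", "agg02")]
    print(to_dot(sample))
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` command, for example `dot -Tpng network.dot -o network.png`.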

Thanks!