stupid boot tricks: using ipxe and chef to get to boot management bliss

Stupid Boot Tricks - Jason Cook - @macros

Upload: macslide

Post on 27-Jun-2015

DESCRIPTION

In this talk I will cover how I built a boot system using iPXE and Chef's API to create a lightweight tool for managing installs and firmware updates of hosts and network gear.

TRANSCRIPT

Page 1: Stupid Boot Tricks: using ipxe and chef to get to boot management bliss

Stupid Boot Tricks

Jason Cook - @macros

Me

• Co-founder and Principal Engineer at Fastly

• Former Operations Engineer at Wikia

• Lots of Sysadmin and Linux consulting

A little history

First Racks (the bad old days)

• 2-6 machines per location

• Installs over IPMI

• Organic growth

• No local management infrastructure

Scaling Up and Out

Fastly “Mega” Design

• Single platform for caching clusters

• Deployed as a unit, limited to no incremental growth

• Same components for 4 to 32 machine clusters

• Able to justify management infrastructure

• Able to lean on convention

The “oob” machine

• Private link to internet

• Provides local provisioning

• DHCP

• Squid

• Donner

Existing Tools

• Cobbler

• Razor

• Foreman

Why not existing?

• 20+ "datacenters"

• No backbone/internal network

• Too many moving pieces

• Host network complexity

Donner

• Sinatra app and cookbook for booting things over HTTP

• iPXE

• Chef as datastore

• Open source soon (stupid Heartbleed)

iPXE

• Open source implementation of PXE

• Fork of gPXE, which was formerly known as Etherboot

• ROM image that can be burned into firmware

• Can also boot from floppy, USB, hard disk, or chainloaded from another PXE ROM

Why iPXE?

• Boots off more than just TFTP targets

• HTTP, iSCSI, ATA over Ethernet (AoE), Fibre Channel

• Scriptable

• Minimal hardware and network inventory data

Why Chef for the datastore?

• Already available as a common service

• Multiple sources of truth suck

• Data bags as integration point

Why data bags?

• Hardware lifecycle is independent from the node object

• Searchable

• Easy to consume from other tools

Partial Search?

• Fast

• Somewhat convenient API

• I’m too lazy to deal with the data bag API for reads

The Workflow

• Shipment Manifest

• Racking/Cabling

• Map Serial to Real World Location

• Power on machines and wait

Vendor Data

• For each shipment vendor provides a spreadsheet

• Serial number

• MAC addresses

• Converted to data bag entries
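
The conversion step could look roughly like the Ruby sketch below. It is an illustration, not Donner's actual importer: the CSV layout (a header row with `serial` and `mac` columns), the use of the serial number as the item id, and the helper name `manifest_to_items` are all assumptions.

```ruby
require 'csv'
require 'json'

# Hypothetical helper: turn a vendor manifest (CSV text) into data bag items.
# Assumes a header row with "serial" and "mac" columns; real vendor
# spreadsheets will differ.
def manifest_to_items(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    serial = row['serial'].strip
    {
      # Keep ids to a conservative character set (assumption).
      'id'     => serial.downcase.gsub(/[^a-z0-9_-]/, '-'),
      'serial' => serial,
      # Normalize MACs to lowercase, colon-separated form.
      'mac'    => row['mac'].strip.downcase.tr('-', ':')
    }
  end
end

# Made-up sample manifest for illustration only.
manifest = <<~CSV
  serial,mac
  S12345X,00-25-90-86-91-D8
  S12346X,00:25:90:86:91:DA
CSV

items = manifest_to_items(manifest)
puts JSON.pretty_generate(items)
```

Each resulting hash could then be written out as one JSON file per item and uploaded with the usual data bag tooling.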

Inventory Data Bag

{
  "environment": "production",
  "datacenter": "LCY",
  "id": "cache-lcy1122",
  "mac": "00:25:90:86:91:d8",
  "hostname": "cache-lcy1122",
  "publicip": "185.31.18.22",
  "mgmtip": "172.16.6.22",
  "profile": "mega16"
}

Site Details

• Racking/Cabling done by remote hands

• Labels applied to physical position

• Labels mapped to serial numbers in data bags

From Bare Metal to Chef

1. Get address

2. Assign boot image

3. Build installer config

4. Build post-install config

5. Install

6. Run chef on first boot

Getting iPXE in your PXE

• ISC dhcpd can do conditional responses

subnet 172.16.16.0 netmask 255.255.255.0 {
  range 172.16.16.225 172.16.16.254;

  if exists user-class and option user-class = "iPXE" {
    # Client is already running iPXE: hand it the boot script over HTTP
    filename "http://172.16.16.7/images/dhcpd.ipxe";
  } elsif substring(hardware, 1, 3) = 00:1c:73 {
    # Arista switch (OUI 00:1C:73; byte 0 of "hardware" is the link type)
    option bootfile-name "http://172.16.16.7:1080/ztp";
  } else {
    # Stock PXE ROM: chainload iPXE over TFTP first
    filename "undionly.kpxe";
  }

  option routers 172.16.16.7;
  option domain-name-servers 172.16.16.7;
}
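
The two-pass nature of this config (the PXE ROM asks first and gets iPXE, then iPXE asks again and gets a script URL) is easier to see as a plain function. The Ruby below is only a hypothetical mirror of the decision table for reasoning or unit testing; dhcpd does the real work, and the function name, its arguments, and matching on Arista's registered 00:1C:73 prefix are assumptions.

```ruby
ARISTA_OUI = '00:1c:73' # Arista's registered MAC prefix (assumed match rule)

# Hypothetical mirror of the dhcpd conditionals; not part of Donner.
def boot_target(mac:, user_class: nil)
  if user_class == 'iPXE'
    'http://172.16.16.7/images/dhcpd.ipxe' # already iPXE: fetch the script
  elsif mac.downcase.start_with?(ARISTA_OUI)
    'http://172.16.16.7:1080/ztp'          # switch: zero touch provisioning
  else
    'undionly.kpxe'                        # stock PXE ROM: chainload iPXE
  end
end

mac = '00:25:90:86:91:d8'
first  = boot_target(mac: mac)                     # request from the PXE ROM
second = boot_target(mac: mac, user_class: 'iPXE') # request from iPXE itself
puts first, second
```

The same host therefore gets two different answers a few seconds apart, which is what breaks the chainloading loop.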

Scripting the boot image

#!ipxe
:net
isset ${net0/mac} && dhcp net0 || goto target
set dhcp_mac ${net0/mac:hexhyp}
:target
chain http://172.16.16.7:1180/pxe/${dhcp_mac} || goto error
:error
sleep 15
goto net
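
One detail worth noting: the `hexhyp` setting type makes iPXE send the MAC hyphen-separated (`00-25-90-86-91-d8`), while the inventory data bag stores it with colons. The server side therefore has to normalize before matching. A minimal sketch, assuming an in-memory list in place of a Chef search; `INVENTORY`, `normalize_mac`, and `find_machine` are hypothetical names:

```ruby
# Inventory items shaped like the data bag entry shown earlier.
INVENTORY = [
  { 'id' => 'cache-lcy1122', 'mac' => '00:25:90:86:91:d8', 'profile' => 'mega16' }
].freeze

# Reduce any MAC spelling (colons, hyphens, mixed case) to bare lowercase hex.
def normalize_mac(mac)
  mac.to_s.downcase.delete('^0-9a-f')
end

# Hypothetical lookup; a real service would query Chef instead of an array.
def find_machine(items, mac)
  wanted = normalize_mac(mac)
  items.find { |item| normalize_mac(item['mac']) == wanted }
end

machine = find_machine(INVENTORY, '00-25-90-86-91-d8') # as sent by iPXE
puts machine ? machine['id'] : 'unknown machine'
```

An unknown MAC returns `nil` here, which a boot service could turn into an error response so the script above falls through to its retry loop.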

Booting the installer

#!ipxe
echo Installation node: <%= @machine['hostname'] %>
sleep 3
kernel http://<%= @serverip %>/images/<%= @image['kernel'] %> <%= @bootargs %> || goto error
initrd http://<%= @serverip %>/images/<%= @image['initrd'] %> || goto error
boot
:error
echo Something went wrong, dropping to a shell...
shell
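
Outside of Sinatra, a template like this can be rendered with Ruby's standard `ERB` class, which is handy for checking the generated script by eye. This is a sketch under assumptions: the `BootContext` class, the shortened template, and the sample kernel name and boot arguments are hypothetical. Only the instance variable names come from the slide.

```ruby
require 'erb'

# Hypothetical rendering context; instance variable names match the template.
class BootContext
  def initialize(machine:, image:, serverip:, bootargs:)
    @machine = machine
    @image = image
    @serverip = serverip
    @bootargs = bootargs
  end

  # Evaluate the template with this object's instance variables in scope.
  def render(template)
    ERB.new(template).result(binding)
  end
end

# Shortened version of the installer template, for illustration.
TEMPLATE = <<~'IPXE'
  #!ipxe
  echo Installation node: <%= @machine['hostname'] %>
  kernel http://<%= @serverip %>/images/<%= @image['kernel'] %> <%= @bootargs %> || goto error
  boot
IPXE

context = BootContext.new(
  machine:  { 'hostname' => 'cache-lcy1122' },
  image:    { 'kernel' => 'linux' }, # made-up image name
  serverip: '172.16.16.7',
  bootargs: 'auto=true'              # made-up boot argument
)
script = context.render(TEMPLATE)
puts script
```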

The Install

• Ubuntu with preseed in our case

• Another erb template

• Nothing special here

The post-install

• Annoying amount of our magic happens here

• Lots of netconfig the installer can’t handle

• Install internal apt keys and repos

• Install our chef package and kernels

• Configure chef for first boot

• Generated from a template with access to chef objects
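
As one illustration of the "configure chef for first boot" step, the sketch below derives a first-boot attributes file and a `chef-client` invocation from an inventory item. The mapping from `profile` to a role name, the file path, and the helper name are assumptions. The `-j` (JSON attributes) and `-E` (environment) flags are standard chef-client options.

```ruby
require 'json'

# Hypothetical helper: build first-boot settings from an inventory item.
def first_boot_config(machine, path: '/etc/chef/first-boot.json')
  # Assumes each hardware profile has a Chef role of the same name.
  attributes = { 'run_list' => ["role[#{machine['profile']}]"] }
  {
    json:    JSON.pretty_generate(attributes),
    command: "chef-client -j #{path} -E #{machine['environment']}"
  }
end

machine = { 'hostname' => 'cache-lcy1122', 'profile' => 'mega16', 'environment' => 'production' }
config = first_boot_config(machine)
puts config[:json]
puts config[:command]
```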

For more than just installers

• BIOS/Firmware Update ISOs

• Boot a live debug image

• Network Gear

Boot an ISO

FreeDOS ISO + vendor firmware

#!ipxe
echo Installing Supermicro Firmware for: <%= @machine['hostname'] %>
sleep 3
initrd http://<%= @serverip %>/images/current_firmware.iso || goto error
kernel http://<%= @serverip %>/images/memdisk.iso || goto error
boot
:error
echo Something went wrong, dropping to a shell...
shell

Network Gear

• Arista supports DHCP + HTTP

get '/ztp' do
  # Rack exposes request headers in env as HTTP_* keys
  mac = request.env['HTTP_X_ARISTA_SYSTEMMAC']
  @device = lookup_device(mac)
  erb :ztp
end
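
The same idea can be exercised without Sinatra as a bare Rack-style callable, which makes the header handling explicit. This is a sketch, not Donner's code: the `DEVICES` table, the response body, and the assumption that the header carries a colon-separated MAC are all hypothetical.

```ruby
# Hypothetical device table keyed by system MAC; a real service would
# look the switch up in Chef instead.
DEVICES = { '00:1c:73:aa:bb:cc' => { 'hostname' => 'switch-lcy01' } }.freeze

# A bare Rack-compatible app: call(env) returns [status, headers, body].
ZTP_APP = lambda do |env|
  # Rack exposes the X-Arista-SystemMAC request header under this key.
  mac = env['HTTP_X_ARISTA_SYSTEMMAC'].to_s.downcase
  device = DEVICES[mac]
  if device
    [200, { 'content-type' => 'text/plain' }, ["hostname #{device['hostname']}\n"]]
  else
    [404, { 'content-type' => 'text/plain' }, ["unknown device #{mac}\n"]]
  end
end

status, _headers, body = ZTP_APP.call({ 'HTTP_X_ARISTA_SYSTEMMAC' => '00:1C:73:AA:BB:CC' })
puts status, body.join
```

Returning a 404 for unknown hardware keeps an unregistered switch from picking up someone else's config.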

Thanks!

Jason Cook

@macros