Living the Nomadic Life - Nic Jackson
TRANSCRIPT
[email protected]
@sheriffjackson
NIC JACKSON
AGENDA
HASHICORP
SCHEDULING
NOMAD
• Overview
• Fundamentals
• Job Configuration
• Scheduling
• Demo
HASHICORP OVERVIEW
FOUNDED 2012 by Mitchell Hashimoto and Armon Dadgar
MISSION We enable organizations to provision, secure, and run any infrastructure for any application
INVESTORS Mayfield Fund, GGV Capital, Redpoint and True Ventures
KEY PRODUCTS Vagrant, Packer, Terraform, Vault, Nomad, Consul
COMPANY OVERVIEW
OSS TO ENTERPRISE
SOFTWARE INNOVATORS TECHNOLOGY PARTNERS
PRODUCT SUITE
NOMAD
Nomad
SCHEDULING OVERVIEW
Schedulers map a set of work to a set of resources
CPU SCHEDULER
[Diagram: the kernel's CPU scheduler maps Apache, Redis, and Bash onto four cores]
CPU SCHEDULER
[Diagram: the same CPU scheduler maps Apache, Redis, and Bash onto two cores]
SCHEDULERS IN THE WILD
Type               Work              Resources
CPU Scheduler      Threads           Physical Cores
EC2 / Nova         Virtual Machines  Hypervisors
Hadoop YARN        MapReduce Jobs    Client Nodes
Cluster Scheduler  Applications      Machines
SCHEDULER ADVANTAGES
Higher Resource Utilization
• Bin Packing
• Over-Subscription
• Job Queueing
Decouple Work from Resources
• Abstraction
• API Contracts
• Standardization
Better Quality of Service
• Priorities
• Resource Isolation
• Pre-emption
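The job queueing and priority ideas listed under the scheduler advantages can be sketched in a few lines. This is a toy illustration, not how any real scheduler is implemented: the job names, priorities, and CPU figures are invented, and only one hypothetical machine and one resource (CPU) are considered.

```python
import heapq


def place_from_queue(pending, free_cpu):
    """Place queued jobs in priority order while capacity remains.

    pending:  list of (priority, name, cpu) tuples; higher priority goes first.
    free_cpu: CPU (MHz) currently available on one hypothetical machine.
    Returns (placed_names, still_queued_names).
    """
    # heapq is a min-heap, so negate the priority to pop the highest first.
    heap = [(-priority, name, cpu) for priority, name, cpu in pending]
    heapq.heapify(heap)

    placed, queued = [], []
    while heap:
        _, name, cpu = heapq.heappop(heap)
        if cpu <= free_cpu:
            free_cpu -= cpu
            placed.append(name)
        else:
            # Not enough room right now: the job waits instead of failing.
            queued.append(name)
    return placed, queued


# Illustrative workload (all values are assumptions).
placed, queued = place_from_queue(
    [(50, "web", 500), (80, "db", 1000), (20, "batch", 800)],
    free_cpu=1600,
)
print(placed, queued)
```

The highest-priority jobs get capacity first, and the low-priority batch job stays queued until resources free up.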
NOT A NEW CONCEPT
BASED ON RESEARCH
NOMAD OVERVIEW
NOMAD DESIGN PRINCIPLES
HashiCorp confidential do not distribute
Integrated scheduler and cluster manager
Distributed, shared state, optimistically concurrent
Agent-based, client/server
No dependencies
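"Optimistically concurrent" can be illustrated with a small sketch: schedulers plan against a snapshot of shared state without taking locks, and the plan is re-checked when it is applied. Everything here (the `PlanApplier` class, the `schedule` helper, the node names and CPU figures) is hypothetical and far simpler than Nomad's real plan queue.

```python
class PlanApplier:
    """Toy leader-side state: free CPU per node, verified when a plan is applied."""

    def __init__(self, free_cpu):
        self.free_cpu = dict(free_cpu)

    def snapshot(self):
        # Schedulers work from a copy, so they never block each other.
        return dict(self.free_cpu)

    def apply(self, node, cpu):
        # The plan may come from a stale snapshot, so verify it against
        # the current state before committing.
        if self.free_cpu[node] < cpu:
            return False  # conflict: the scheduler has to retry
        self.free_cpu[node] -= cpu
        return True


def schedule(applier, cpu, max_attempts=3):
    """Pick the first node that fits in a fresh snapshot, retrying on conflict."""
    for _ in range(max_attempts):
        snap = applier.snapshot()
        candidates = [n for n, free in sorted(snap.items()) if free >= cpu]
        if not candidates:
            return None
        if applier.apply(candidates[0], cpu):
            return candidates[0]
    return None


applier = PlanApplier({"node-a": 1000, "node-b": 1000})
stale = applier.snapshot()               # a second scheduler's view
first = schedule(applier, 800)           # takes most of node-a
conflict = applier.apply("node-a", 800)  # stale plan is rejected
second = schedule(applier, 800)          # retry lands on node-b
print(first, conflict, second)
```

The rejected plan costs one retry, which is the trade-off optimistic concurrency makes in exchange for letting schedulers run in parallel.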
NOMAD CHARACTERISTICS
Multi-datacenter and multi-region
Highly performant and highly available
Hybrid workloads with multiple schedulers and drivers
Seamlessly integrates with HashiCorp ecosystem
NOMAD FUNDAMENTALS
SINGLE REGION DEPLOYMENT
[Diagram: three servers (one leader, two followers) replicate state and forward requests to the leader; clients in DC1, DC2, and DC3 talk to the servers over RPC]
MULTI REGION DEPLOYMENT
[Diagram: Region A and Region B each run three servers (one leader, two followers) with replication and forwarding inside the region; the regions are linked by gossip and region forwarding]
SERVER ARCHITECTURE
Omega Class Scheduler
Pluggable Logic
Internal Coordination and State
Multi-Region / Multi-Datacenter
CLIENT ARCHITECTURE
Broad OS Support
Host Fingerprinting
Pluggable Drivers
Job restarts and lifecycle management
CLIENT DRIVERS
Containerized  Docker, rkt, Windows Server Containers
Virtualized    Qemu / KVM, Hyper-V, Xen
Standalone     Java Jar, C#, Static Binaries
CLIENT FINGERPRINTING
Type                 Examples
Operating System     Kernel, OS, Version
Hardware             CPU, Memory, Disk
Apps (Capabilities)  Docker, Java, Consul
Environment          AWS, GCE
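As a rough idea of what fingerprinting a host involves, the sketch below gathers a few of the attribute types from the table using only the Python standard library. The attribute keys and the "is it on PATH" capability check are assumptions made for illustration; they are not Nomad's actual attribute names or detection logic.

```python
import os
import platform
import shutil


def fingerprint():
    """Collect a few host attributes, loosely following the table above."""
    attrs = {
        # Operating system
        "kernel.name": platform.system().lower(),
        "kernel.version": platform.release(),
        # Hardware
        "cpu.numcores": os.cpu_count(),
    }
    # Apps (capabilities): treat an executable on PATH as "available".
    # This is a simplification chosen for the example.
    for tool in ("docker", "java", "consul"):
        attrs[f"app.{tool}"] = shutil.which(tool) is not None
    return attrs


attrs = fingerprint()
for key in sorted(attrs):
    print(f"{key} = {attrs[key]}")
```

A scheduler can then match job requirements against these attributes when deciding which nodes are eligible.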
NOMAD JOB CONFIGURATION
JOB FILE
Declarative
Scheduler, driver, and resource needs
Lifecycle behavior
Constraints
Versioned
redis.nomad
JOB FILE
job "redis" {
  datacenters = ["us-east-1"]

  task "redis" {
    driver = "docker"

    config {
      image = "redis:v13"
    }

    resources {
      cpu    = 500 # MHz
      memory = 256 # MB

      network {
        mbits         = 10
        dynamic_ports = ["redis"]
      }
    }
  }
}
redis.nomad
JOB FILE: TASK GROUPS
job "app" {
  group "app" {
    task "redis" {
      # ...
    }

    task "app" {
      # ...
    }
  }
}
redis.nomad
JOB FILE: CONSTRAINTS
job "redis" {
  constraint {
    attribute = "${attr.kernel.version}"
    operator  = "version"
    value     = "> 3.19"
  }

  constraint {
    attribute = "${attr.platform.aws.instance-type}"
    value     = "p2.16xlarge"
  }

  task "redis" {
    # ...
  }
}
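To make the "version" operator in the constraint above concrete, here is a minimal sketch of how a check such as "> 3.19" could be evaluated against a node's kernel version. The helper names are hypothetical and the parsing is deliberately simple; Nomad's own version-constraint handling supports more syntax than this.

```python
import operator
import re

_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def parse_version(text):
    """Turn '4.4.0-72-generic' into (4, 4, 0) using the leading numeric parts."""
    match = re.match(r"\d+(\.\d+)*", text)
    if not match:
        raise ValueError(f"not a version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_ok(actual, constraint):
    """Check a constraint such as '> 3.19' against a node's version string."""
    op, wanted = constraint.split()
    a, w = parse_version(actual), parse_version(wanted)
    # Pad to equal length so that 3.19 compares like 3.19.0.
    width = max(len(a), len(w))
    a += (0,) * (width - len(a))
    w += (0,) * (width - len(w))
    return _OPS[op](a, w)


print(version_ok("4.4.0-72-generic", "> 3.19"))
print(version_ok("3.13.0", "> 3.19"))
print(version_ok("3.2", "> 3.19"))
```

Comparing numeric components rather than strings matters: as text, "3.2" sorts after "3.19", but as a version it is older.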
redis.nomad
JOB FILE: CONSUL SERVICE DISCOVERY
job "redis" {
  task "redis" {
    # ...
    service {
      port = "redis"

      check {
        type     = "tcp"
        interval = "10s"
      }
    }
  }
}
redis.nomad
JOB FILE: CONSUL CONFIGURATION
job "redis" {
  task "redis" {
    # ...
    template {
      data = <<EOH
bind_port: {{ env "NOMAD_PORT_db" }}
scratch_dir: {{ env "NOMAD_TASK_DIR" }}
service_key: {{ key "service/my-key" }}
EOH
      destination = "local/file.yml"
    }
  }
}
redis.nomad
JOB FILE: VAULT INTEGRATION
job "redis" {
  task "redis" {
    # ...
    template {
      data = <<EOH
{{ with secret "secret/credentials" }}
username: {{ .Data.username }}
password: {{ .Data.password }}{{ end }}
EOH
      destination = "local/file.yml"
    }
  }
}
redis.nomad
JOB FILE: PARAMETERIZED
job "encode" {
  type = "batch"

  parameterized {
    payload       = "required"
    meta_required = ["s3-input", "s3-output", ...]
  }

  # ...
  task "ffmpeg" {
    driver = "exec"

    config {
      command = "ffmpeg"

      # When dispatched, the payload is written to a file that is then
      # read by the created task upon startup
      args = ["-config=${NOMAD_TASK_DIR}/config.json"]
      # ...
    }
  }
}
$ nomad job dispatch encode video-config.json
$
$ cat video-config.json
{
  "s3-input": "https://s3-us-west-1.com/video-bucket/cb31dabb1",
  "s3-output": "https://s3-us-west-1.com/video-bucket/a149adbe3",
  "input-codec": "mp4",
  "output-codec": "webm",
  "quality": "1080p"
}
JOB FILE: PARAMETERIZED
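The same dispatch can be driven programmatically. The sketch below only builds the request and sends nothing. It assumes the Nomad HTTP API shape as I understand it (POST to /v1/job/<job_id>/dispatch with a base64-encoded "Payload" string and an optional "Meta" map); check the API documentation for your Nomad version before relying on it. The function name and the sample values are made up for illustration.

```python
import base64
import json


def build_dispatch_request(job_id, payload, meta=None):
    """Return (path, body) for dispatching a parameterized job.

    Assumed API shape: POST /v1/job/<job_id>/dispatch with a base64-encoded
    "Payload" string and an optional "Meta" map of strings.
    """
    raw = json.dumps(payload).encode("utf-8")
    body = {
        "Payload": base64.b64encode(raw).decode("ascii"),
        "Meta": meta or {},
    }
    return f"/v1/job/{job_id}/dispatch", body


# Illustrative values only.
config = {"input-codec": "mp4", "output-codec": "webm", "quality": "1080p"}
path, body = build_dispatch_request("encode", config)
print(path)
print(json.dumps(body, indent=2))
```

The payload travels as opaque bytes, which is why the job file writes it to a file for the task to read at startup.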
NOMAD MULTI-CLOUD
Why Multi-Cloud?
• High Availability
• Redundancy
• Burstable Workload
• Cloud Migration
• Because we can
[Diagram: a Google Cloud region and an AWS region, each running Consul and Nomad server clusters (one leader and two followers, with replication and forwarding) alongside nodes A and B behind load balancers; NATS cloud messaging sits in Google Cloud, and the two regions are connected by region forwarding over VPN]
NOMAD SCHEDULING
SCHEDULING
Schedulers process evaluations and generate allocation plans.
Placement is determined using the relevant scheduler.
Scheduling involves feasibility checking and ranking.
Feasibility filters out nodes missing necessary drivers and those failing the specified constraints.
Ranking scores feasible nodes to find the best fit (bin packing).
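The two phases described above, feasibility checking followed by ranking, can be sketched as follows. This is a simplified illustration under stated assumptions: the node and job dictionaries are invented, constraints are plain equality checks, and the score is a basic "how full would the node be" average rather than Nomad's actual scoring function.

```python
def feasible(node, job):
    """Keep nodes that have the driver, satisfy constraints, and have room."""
    return (
        job["driver"] in node["drivers"]
        and all(node["attrs"].get(k) == v for k, v in job["constraints"].items())
        and node["free_cpu"] >= job["cpu"]
        and node["free_mem"] >= job["mem"]
    )


def score(node, job):
    """Bin-packing score: higher means the node ends up fuller after placement."""
    cpu_used = 1 - (node["free_cpu"] - job["cpu"]) / node["total_cpu"]
    mem_used = 1 - (node["free_mem"] - job["mem"]) / node["total_mem"]
    return (cpu_used + mem_used) / 2


def place(nodes, job):
    """Filter to feasible nodes, then pick the best-scoring one (or None)."""
    candidates = [n for n in nodes if feasible(n, job)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, job))["name"]


# Hypothetical cluster: n1 is nearly empty, n2 is busy, n3 only runs Java.
nodes = [
    {"name": "n1", "drivers": {"docker"}, "attrs": {"os": "linux"},
     "total_cpu": 4000, "free_cpu": 3500, "total_mem": 8192, "free_mem": 7000},
    {"name": "n2", "drivers": {"docker"}, "attrs": {"os": "linux"},
     "total_cpu": 4000, "free_cpu": 1000, "total_mem": 8192, "free_mem": 2048},
    {"name": "n3", "drivers": {"java"}, "attrs": {"os": "linux"},
     "total_cpu": 4000, "free_cpu": 600, "total_mem": 8192, "free_mem": 512},
]
redis = {"driver": "docker", "constraints": {"os": "linux"}, "cpu": 500, "mem": 256}
print(place(nodes, redis))
```

Bin packing prefers the busier node n2 over the emptier n1, which keeps n1 free for larger jobs. A job whose driver no node offers returns None, the toy equivalent of a "missing drivers" placement failure.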
SCHEDULER TYPES
Service  Long-running applications and services
Batch    Short-lived data processing jobs (benefit from fast placement)
System   Lower-level jobs that run on all clients (logging, monitoring)
$ nomad plan example.nomad
+ Job: "example"
+ Task Group: "cache" (1 create)
  + Task: "redis" (forces create)

Scheduler dry-run:
- All tasks successfully allocated.

$
SCHEDULING: PLAN
$ nomad plan example.nomad.java
+ Job: "example"
+ Task Group: "web" (1 create)
  + Task: "tomcat" (forces create)

Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "web" (failed to place 1 allocation):
    * Constraint "missing drivers" filtered 2 nodes

$
SCHEDULING: PLAN
$ nomad run example.nomad
==> Monitoring evaluation "4b8b7779"
    Evaluation triggered by job "example"
    Allocation "38720b8e" created: node "ec2f0830", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "4b8b7779" finished with status "complete"

$
SCHEDULING: RUN
$ nomad run -region=gcp events.nomad
==> Monitoring evaluation "e2a8dfe6"
    Evaluation triggered by job "events"
    Allocation "6615b39f" modified: node "0d6a6103", group "pubsub"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "e2a8dfe6" finished with status "complete"

$
SCHEDULING: RUN DIFFERENT REGION
$ nomad status example
ID          = example
Name        = example
Type        = service
Priority    = 50
Datacenters = us-west-1
Status      = running

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
38720b8e  4b8b7779  ec2f0830  cache       run      running  04/26/17 ...

$
SCHEDULING: STATUS
DEMO!
Nomad Million Container Challenge
1,000 Jobs
1,000 Tasks per Job
5,000 Hosts on GCE
1,000,000 Containers
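The headline figure follows directly from the numbers above. The per-host average below assumes containers are spread evenly across hosts, which the slide does not state.

```python
jobs = 1_000
tasks_per_job = 1_000
hosts = 5_000

containers = jobs * tasks_per_job   # one container per task
per_host = containers / hosts       # assumes an even spread across hosts

print(f"{containers:,} containers, about {per_host:.0f} per host")
```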
MILLION CONTAINER CHALLENGE
"640 KB ought to be enough for anybody." - Bill Gates
REAL WORLD SCALE
2nd Largest Hedge Fund
18K Cores
5 Hours
2,200 Containers/second
Q/A AND HASHICONF
SEPTEMBER 18-20
AUSTIN, TEXAS
www.hashiconf.com
#hashiconf
Links:
https://www.nomadproject.io
https://github.com/nicholasjackson/terraform-nomad-multi-cloud