Tyler Langlois, Software Engineer, Elastic
October 12th, 2017
@leothrix, github: tylerjl
Custom Types and Providers: Modeling Modern REST Interfaces and Beyond
2
Obligatory “About Me” Slide
Infrastructure/Operations/Software Engineer @ Elastic
• Been with the company since 2014
• Co-maintainer of the Elastic Puppet modules (primarily Elasticsearch and Kibana)
• Puppet-ing in one way or another over my whole professional career
• Brought too many Elastic stickers that need to be given away (please partake)
• Talk to me about Elasticsearch/Logstash/Kibana/Beats!
3
Who is This Presentation For?
Developers who work with Puppet modules
Puppet users who want to dip into native type/provider development
“What in the %@#$ is the Elasticsearch module doing”
Operators who want to automate against APIs
Hopefully empowers you to implement custom resources on your own
4
WHAT DOES THIS TALK’S TITLE EVEN MEAN?
5
6
Managing Resources with Raw APIs
Example: CloudFormation
• Pro:
• Infrastructure resources are data
• Extensible
• Con:
• Managing changes
• Grokking huge chunks of JSON
7
Modeling Resources with Raw APIs
Example: Terraform
• Pro:
• Readable
• Manageable
• Lifecycle + changes
• Interoperability between other systems
8
Modeling Resources in Puppet
A DSL to Model Disparate Resources
A Graph to Manage Relationships
A Concept of Changes to Manage Lifecycles
ls, stat, chmod, chown
sysv, systemd, upstart
deb, rpm, pkg
9
Modeling Resources in Puppet
Abstraction is Powerful
    file { "/tmp/foo":
      source => "puppet:///foo",
    } ->
    package { "foo":
      source => "/tmp/foo",
    } ~>
    service { "foo":
      ensure => "running",
    }
10
Modeling Resources in Puppet
Elasticsearch
Logstash
Other REST APIs
→ ?
11
Modeling Resources in Puppet
Extending the idea to APIs
    elasticsearch::template { "logstash":
      content => {
        "template" => "*",
        "settings" => {
          "number_of_replicas" => 0
        }
      }
    } ->
    service { "es-app":
      ensure => "running"
    }
12
Modeling Resources in Puppet
Benefits
• State Changes
  • Instead of manually comparing GET responses against template files, preview the differences during a no-op run
  • A change in state can drive dependencies and send refresh events to other resources
  • Trickling changes up via reports lends better visibility
13
Modeling Resources in Puppet
Benefits
• State Changes
• More finely-grained control
  • Most resources can be represented as Puppet hashes, so Hiera can be fully leveraged
  • Communicating via full Ruby HTTP libraries means CA files, auth, and more are easier to control
  • TESTS!
14
Modeling Resources in Puppet
• State Changes
• More finely-grained control
• Some existing API-based resources:
• Kubernetes module (swagger-generated)
• Google Cloud
• Following examples will be low-level (i.e. with just native Ruby HTTP libraries)
• …hopefully, will help you write your own for $system
15
Let’s (briefly) talk about Puppet Types and Providers
16
Types, Providers, and their Resources
Underlying Resource
• Has some way to change a property
• Its state is introspectable and discoverable
• Uniquely identified
Puppet Provider
• How Ruby interacts with actual commands/system properties
• Knows how to discover the properties of resources
Puppet Type
• Normalized provider API to Puppet DSL
• Somewhat typed, catalog compilation
• Abstraction over providers
17
Types, Providers, and their Resources: service
Underlying Resource
• systemctl/service/rc commands
• Startup visibility with enable/chkconfig/etc.
• Primarily shell-based for state
Puppet Provider
• One provider for each init system
• Ruby knows which shell commands to invoke to start, stop, enable, etc.
Puppet Type
• Unified API to start, enable, and restart a general service resource
• Abstraction over provider-specific implementations
• What we see in a manifest
18
Types, Providers, and their Resources: elasticsearch
Underlying Resource
• REST API endpoints
• Objects modeled in JSON
• Individual endpoints via _template, _ingest, etc.
Puppet Provider
• One provider base class, one provider per resource type
• Native Ruby HTTP APIs are high-level enough
• A better alternative to `exec { "curl":`
Puppet Type
• Resource properties expressed in Puppet DSL hashes
• We don’t make API calls, we declare desired state
19
Then:
20
Now:
Case Study: Elasticsearch Pipelines
curl vs. Puppet
22
Ingest Pipelines
23
Ingest Pipelines
24
Ingest Pipelines
Key observations
• All pipelines are uniquely identified by a name (like defined or native types!)
• Endpoints to manage pipelines:
  • GET to retrieve a JSON object enumerating all pipelines
    • Note: a single pipeline can also be retrieved by name alone
  • PUT to create with a JSON body
• Note that we’re using unauthenticated APIs right now
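The observations above map onto a couple of small HTTP requests. What follows is a minimal sketch using Ruby's standard Net::HTTP, assuming an unauthenticated node at localhost:9200 and the `_ingest/pipeline` endpoint. The helper names (`pipeline_get`, `pipeline_put`, `send_request`) are hypothetical and are not taken from the Elasticsearch module.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed base URL for a local, unauthenticated Elasticsearch node.
PIPELINE_ES_URL = 'http://localhost:9200'.freeze

# Build a GET for all pipelines, or for a single pipeline when a name is given.
def pipeline_get(name = nil)
  path = ['/_ingest/pipeline', name].compact.join('/')
  Net::HTTP::Get.new(path)
end

# Build a PUT that creates or replaces the named pipeline from a Ruby hash.
def pipeline_put(name, definition)
  req = Net::HTTP::Put.new("/_ingest/pipeline/#{name}")
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(definition)
  req
end

# Send a request to the node; only call this with Elasticsearch running.
def send_request(req, base = PIPELINE_ES_URL)
  uri = URI(base)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
end
```

Because the pipeline name is the only identifier in both paths, it is a natural fit for a Puppet resource title.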
25
Ingest Pipelines: Puppet Type
26
Ingest Pipelines: Puppet Type (Implementation)
27
Ingest Pipelines: Puppet Type (Implementation)
…what the included abstraction does
28
Ingest Pipelines: Puppet Provider (Implementation)
29
Ingest Pipelines: Puppet Provider (details)
…what the parent class does
30
Ingest Pipelines: Puppet Provider (details)
…what the parent class does
31
Ingest Pipelines: Puppet Tests
32
Ingest Pipelines
Summary
• That’s most of it!
• Test-driven development + rspec makes it smooth
• Bulk is abstracted; the beefy parts are in parent classes and reused by templates, indices, etc.
• Native types and providers ≠ scary
33
Fitting REST Resources Into Puppet
Considerations
1. `exists?` versus `prefetch`
2. Leveraging type-level tools
3. HTTP
4. API availability
34
35
An Example of Returning a Hash to Prefetch
Automatically Gathering Resources
    require 'json'
    require 'net/http'
    require 'uri'

    # Fetch every template in one request
    uri = URI("http://localhost:9200/_template")
    http = Net::HTTP.new uri.host, uri.port
    req = Net::HTTP::Get.new uri.request_uri
    response = http.request req
    # One hash per API object; `name` is the provider's own name
    JSON.parse(response.body).map do |object_name, api_object|
      {
        :name     => object_name,
        :ensure   => :present,
        :content  => api_object,
        :provider => name
      }
    end
36
Advantages
Prefetching resources versus vanilla exists?
• puppet resource functionality
• Minimizes chatter with API endpoints
  • e.g., checking for existence versus properties, etc.
• Call flush only when necessary
• Additional API freebies (e.g., centralized access in flush(), etc.)
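The pairing step behind prefetch can be sketched without Puppet itself. In this rough illustration, `DeclaredResource` and `DiscoveredInstance` are hypothetical stand-ins for Puppet's resource and provider objects, and `pair_prefetched` mimics the usual shape of a provider's `self.prefetch`. It is not code from the module.

```ruby
# Plain-Ruby stand-ins for Puppet's resource and provider objects
# (hypothetical names, used only so the sketch runs without Puppet).
DeclaredResource = Struct.new(:name, :provider)
DiscoveredInstance = Struct.new(:name, :content)

# Pair each instance discovered from the API with the catalog resource of the
# same name. Instances with no matching resource are left alone.
def pair_prefetched(resources, instances)
  instances.each do |instance|
    resource = resources[instance.name]
    resource.provider = instance if resource
  end
  resources
end

catalog = {
  'logstash' => DeclaredResource.new('logstash'),
  'metrics'  => DeclaredResource.new('metrics')
}
found = [DiscoveredInstance.new('logstash', { 'order' => 0 })]
pair_prefetched(catalog, found)
```

A single API round trip populates every declared resource, so later property checks read from memory instead of issuing one request per resource.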
37
38
39
Response Content vs. Request Content
Rarely a 1:1 mapping
    {
      "logstash": {
        "order": 0,
        "version": 60001,
        "index_patterns": [ "logstash-*" ],
        . . .
vs.
    elasticsearch::template { 'logstash':
      content => {
        'template' => '*',
        'settings' => {
        . . .
40
Types To the Rescue
Managing response data
• A resource’s desired state is almost never the plain response to a query against the resource
  • Example: a Kubernetes Deployment versus the state of a Deployment
• munge can help unify the resource and the JSON so they are comparable
• insync? can be enhanced to understand which fields are being explicitly controlled by a user
  • e.g., I want {"foo": "bar"} set, I don’t care about what’s in {"another": "field"}
• Used pretty heavily in puppet-elasticsearch
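The "only compare what the user declared" idea can be sketched as a recursive subset check. This is an illustrative helper under the assumption that both sides are plain Ruby hashes. `subset_in_sync?` is a hypothetical name, not a Puppet or puppet-elasticsearch API.

```ruby
# Return true when every key the user declared in `should` has the same value
# in `is`; keys present only in `is` (set by the API) are ignored.
def subset_in_sync?(is, should)
  return is == should unless should.is_a?(Hash)
  return false unless is.is_a?(Hash)

  should.all? do |key, wanted|
    is.key?(key) && subset_in_sync?(is[key], wanted)
  end
end
```

Inside a property's `insync?`, a check like this would stand in for a plain equality comparison, so fields the API adds on its own do not register as drift.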
41
Example: Setting Default Fields
Elasticsearch template
    # Set default values for templates
    munge do |value|
      {
        'order'    => 0,
        'aliases'  => {},
        'mappings' => {}
      }.merge(value)
    end
42
Example: Unifying Formatting
Elasticsearch template
    # Normalize then compare the Puppet hash and JSON
    def insync?(is)
      Puppet_X::Elastic.deep_implode(is) == \
        Puppet_X::Elastic.deep_implode(should)
    end
    { "foo": { "bar": "value" } }
    { "foo.bar": "value" }
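The flattening used in the example above can be sketched in plain Ruby. This is a hypothetical re-implementation for illustration only, not the code behind `Puppet_X::Elastic.deep_implode`. The name `implode` and the choice to keep empty hashes as leaves are assumptions.

```ruby
# Flatten nested hashes into dotted keys, so {"foo" => {"bar" => 1}}
# and {"foo.bar" => 1} normalize to the same hash.
# Non-hash values and empty hashes are kept as leaves.
def implode(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash) && !value.empty?
      flat.merge!(implode(value, path))
    else
      flat[path] = value
    end
  end
end
```

With both sides normalized this way, the nested and dotted spellings of the same setting compare as equal.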
43
44
HTTP In Providers
45
HTTP In Providers
• Native HTTP libraries let us more easily control and pass:
• TLS certificate authorities and verification booleans
• HTTP basic auth credentials
• Failure cases (timeouts, 4xx/5xx response codes, etc.)
• In this case with Elasticsearch, error responses can return JSON messages for more helpful Puppet failures
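The controls listed above can be sketched with Ruby's standard Net::HTTP. The helper names (`build_client`, `build_get`, `failure_message`) and their defaults are hypothetical. The error parsing assumes the API returns a JSON body with an `error` field that is either a string or an object carrying a `reason`.

```ruby
require 'json'
require 'net/http'
require 'openssl'
require 'uri'

# Build a Net::HTTP client with explicit TLS and timeout settings.
def build_client(url, ca_file: nil, verify: true, timeout: 10)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = uri.scheme == 'https'
  http.ca_file = ca_file if ca_file
  http.verify_mode = verify ? OpenSSL::SSL::VERIFY_PEER : OpenSSL::SSL::VERIFY_NONE
  http.open_timeout = timeout
  http.read_timeout = timeout
  http
end

# Build a GET request, adding HTTP basic auth when credentials are supplied.
def build_get(path, username: nil, password: nil)
  req = Net::HTTP::Get.new(path)
  req.basic_auth(username, password) if username
  req
end

# Turn a response body into a failure message, preferring the JSON error the
# API returned and falling back to the bare status code.
def failure_message(code, body)
  detail = JSON.parse(body)['error']
  detail = detail['reason'] if detail.is_a?(Hash)
  "HTTP #{code}: #{detail || 'no error detail'}"
rescue JSON::ParserError, TypeError
  "HTTP #{code}"
end
```

In a provider, the string from `failure_message` would typically be what gets raised as the Puppet error, which is how API-side detail ends up in the run report.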
46
47
API Availability
Weird edge cases when controlling APIs as opposed to hosts
• What happens if:
  • An API-based REST resource requires an API to be up, not just a daemon?
  • A resource should block until one is available?
  • An unrelated resource needs that API as well?
48
API Availability
• es_instance_conn_validator doesn’t resolve until a connection can be made
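The "block until a connection can be made" behavior can be sketched as a small retry loop. This is a generic illustration, not the implementation of es_instance_conn_validator. `wait_until_available`, `port_open?`, and the retry defaults are hypothetical, and a plain TCP check is only a rough proxy for "the API is up".

```ruby
require 'socket'

# Retry the given probe until it returns true or the attempts run out.
def wait_until_available(attempts: 10, delay: 2)
  attempts.times do |i|
    return true if yield
    sleep(delay) if i < attempts - 1
  end
  false
end

# Probe that checks whether a TCP connection to host:port can be opened.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end
```

A call such as `wait_until_available { port_open?('localhost', 9200) }` would gate dependent resources on connectivity. Swapping the block for an HTTP request would gate them on the API actually answering.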
Some observations after a couple years…
50
Results From the Field
Extensibility
• One parent class makes creating more providers easy
• Supported REST-based resources include:
  • indices
  • templates
  • pipelines
  • + more
Reliability
• rspec + webmock for great testing
• Writing against the ES docs + specs first has made some implementations first-try successes
• Good mocks make some acceptance tests unnecessary (faster CI!)
+ more
• Much easier to extend to new OSes (e.g., Windows)
• Greater control has made some tasks (like the 3.x → 4.x module update) smooth