Posted on 04-Oct-2020
TRANSCRIPT
100% Containers Powered Carpooling
Maxime Fouilleul, Database Reliability Engineer
Today’s agenda
BlaBlaCar - Facts & Figures
Infrastructure Ecosystem - 100% containers powered carpooling
Stateful Services into containers - MariaDB as an example
Next challenges - Kubernetes, the Cloud
BlaBlaCar - Facts & Figures
60 million members
Founded in 2006
1 million tonnes less CO2 in the past year
30 million mobile app downloads (iPhone and Android)
15 million travellers / quarter
Currently in 22 countries: France, Spain, UK, Italy, Poland, Hungary, Croatia, Serbia, Romania, Germany, Belgium, India, Mexico, The Netherlands, Luxembourg, Portugal, Ukraine, Czech Republic, Slovakia, Russia, Brazil and Turkey.
Facts and Figures
Our prod data ecosystem
MariaDB - Transactional
Redis - Volatile
PostgreSQL - Spatial
Cassandra - Distributed
Kafka - Stream
ElasticSearch - Search
Infrastructure Ecosystem 100% containers powered carpooling
Why containers?
Homogeneous Hardware - From this
[Diagram: each service (svc_001 ... svc_014) pinned to its own dedicated server (srv_001 ... srv_014)]
Homogeneous Hardware - To that
[Diagram: the same services (svc_001 ... svc_014) packed as containers across a shared pool of servers (srv_001 ... srv_008)]
Homogeneous Hardware - “Pets vs Cattle”
Easier to replace broken hardware
Cost Effective
Easier to manage
Homogeneous Deployment - the trip-meeting-point application and its redis
cat ./prod-dc1/services/trip-meeting-point/service-manifest.yml
---
containers:
  - aci.blbl.cr/aci-trip-meeting-point:20180928.145115-v-979da34
  - aci.blbl.cr/aci-go-synapse:15-40
  - aci.blbl.cr/aci-go-nerve:21-27
  - aci.blbl.cr/aci-logshipper:27
nodes:
  - hostname: trip-meeting-point1
    gelf:
      level: INFO
    fleet:
      - MachineMetadata=rack=110
      - Conflicts=*trip-meeting-point*
  - hostname: trip-meeting-point2
    fleet:
      - MachineMetadata=rack=210
      - Conflicts=*trip-meeting-point*
  - hostname: trip-meeting-point3
    fleet:
      - MachineMetadata=rack=310
      - Conflicts=*trip-meeting-point*
cat ./prod-dc1/services/redis-meeting-point/service-manifest.yml
---
containers:
  - aci.blbl.cr/aci-redis:4.0.2-1
  - aci.blbl.cr/aci-redis-dictator:20
  - aci.blbl.cr/aci-go-nerve:21-27
  - aci.blbl.cr/aci-prometheus-redis-exporter:0.12.2-1
nodes:
  - hostname: redis-meeting-point1
    fleet:
      - MachineMetadata=rack=110
      - Conflicts=*redis-meeting-point*
  - hostname: redis-meeting-point2
    fleet:
      - MachineMetadata=rack=210
      - Conflicts=*redis-meeting-point*
  - hostname: redis-meeting-point3
    fleet:
      - MachineMetadata=rack=310
      - Conflicts=*redis-meeting-point*
ggn prod-dc1 trip-meeting-point update -y
ggn prod-dc1 redis-meeting-point update -y
Volatile by design - trip-meeting-point dependencies
cat ./prod-dc1/services/trip-meeting-point/service-manifest.yml
---
containers:
  - aci.blbl.cr/aci-trip-meeting-point:20180928.145115-v-979da34
  - aci.blbl.cr/aci-go-synapse:15-41
  - aci.blbl.cr/aci-go-nerve:21-27
  - aci.blbl.cr/aci-logshipper:27
[...]
cat ./aci-trip-meeting-point/aci-manifest.yml
---
name: aci.blbl.cr/aci-trip-meeting-point:{{.version}}
aci:
  dependencies:
    - aci.blbl.cr/aci-java:1.8.181-2
[...]

cat ./aci-java/aci-manifest.yml
---
name: aci.blbl.cr/aci-java:1.8.181-2
aci:
  dependencies:
    - aci.blbl.cr/aci-debian:9.5-9
    - aci.blbl.cr/aci-common:7
[Diagram: dependency tree - the trip-meeting-point pod bundles aci-trip-meeting-point, aci-go-synapse, aci-go-nerve, aci-logshipper and aci-hindsight; aci-trip-meeting-point depends on aci-java, which depends on aci-debian and aci-common]
Volatile - When should I redeploy?
A change in my own app/container: “immutable”
Noisy neighbours: “mutualization”
A change on a sidecar container or its dependencies
When you are ready for instability, you are HA.
How?
Infrastructure Ecosystem
Hardware - bare-metal servers, 1 type of hardware, 3 disk profiles
CoreOS hosts clustered with fleet ("Distributed init system"), backed by etcd and driven by ggn
dgr builds containers from the Service Codebase and stores them in the Container Registry
rkt runs the PODs on every host (e.g. mysql-main1: mysqld, nerve, monitoring; front1: php, nginx, synapse, nerve, monitoring)
zookeeper provides the Service Discovery
[The same diagram is shown again, this time with kubernetes and helm replacing fleet and ggn as the orchestration layer, scheduling backend and client pods]
Service Discovery
go-nerve does health checks and reports to zookeeper in service keys (e.g. /database/node1)
go-synapse watches zookeeper service keys and reloads HAProxy if changes are detected
Applications hit their local HAProxy to access backends
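The go-synapse pattern above can be sketched in Python. This is a minimal illustration, not the real go-synapse API (which is written in Go): `Node`, `render_backend` and `needs_reload` are hypothetical names, and the server line follows standard HAProxy backend syntax.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    """One backend as it would come back from the zookeeper service keys."""
    name: str
    host: str
    port: int
    weight: int = 255

def render_backend(service: str, nodes: list[Node]) -> str:
    """Render an HAProxy backend section for the discovered nodes."""
    lines = [f"backend {service}"]
    for node in sorted(nodes, key=lambda n: n.name):
        lines.append(f"    server {node.name} {node.host}:{node.port} weight {node.weight} check")
    return "\n".join(lines)

def needs_reload(old: str, new: str) -> bool:
    """Only reload HAProxy when the rendered config actually changed."""
    return old != new

nodes = [Node("mysql-main1", "192.168.1.2", 3306)]
cfg = render_backend("mysql-main_read", nodes)
```

The reload-only-on-change check is what keeps a busy discovery tree from restarting HAProxy in a loop.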
Stateful Services into containers - MariaDB as an example
“Stateful” and “volatile by design”?
The recipe/prereqs/pillars to succeed:
Be Quiet! "A node should be able to restart without impacting the app"
Abolish Slavery: "For a given service, every node has the same role"
Build Smart: "Services can be operated by any SRE"
MariaDB as an example
Abolish Slavery: "For a given service, every node has the same role"
Asynchronous vs. Synchronous
[Diagram: asynchronous replication - one Master replicating to three Slaves - vs. a synchronous MariaDB Cluster where every node talks wsrep to the others]
MariaDB Cluster means
No Single Point of Failure
No Replication Lag
Automatic State Transfers (IST/SST)
As fast as the slowest node
The Target
[Diagram: a MariaDB Cluster (wsrep) running in containers - writes go to one node, reads are balanced on the others]
How to hit the target? Service Discovery
# zookeepercli -c lsr /services/mysql/main
mysql-main1_192.168.1.2_ba0f1f8b
mysql-main2_192.168.1.3_734d63da
mysql-main3_192.168.1.4_dde45787
# zookeepercli -c get /services/mysql/main/mysql-main1_192.168.1.2_ba0f1f8b
{
  "available": true,
  "host": "192.168.1.2",
  "port": 3306,
  "name": "mysql-main1",
  "weight": 255,
  "labels": {
    "host": "r10-srv4"
  }
}
Nerve - Track and report service status
# cat env/prod-dc1/services/mysql-main/attributes/nerve.yml
---
override:
  nerve:
    services:
      - name: "mysql-main"
        port: 3306
        reporters:
          - {type: zookeeper, path: /services/mysql/main}
        checks:
          - type: sql
            driver: mysql
            datasource: "local_mon:local_mon@tcp(127.0.0.1:3306)/"
# cat env/prod-dc1/services/tripsearch/attributes/tripsearch.yml
---
override:
  tripsearch:
    database:
      read:
        host: localhaproxy
        database: tripsearch
        user: tripsearch_rd
        port: 3307
      write:
        host: localhaproxy
        database: tripsearch
        user: tripsearch_wr
        port: 3308

Synapse - Service discovery router
# cat env/prod-dc1/services/tripsearch/attributes/synapse.yml
---
override:
  synapse:
    services:
      - name: mysql-main_read
        path: /services/mysql/main
        port: 3307
      - name: mysql-main_write
        path: /services/mysql/main
        port: 3308
        serverOptions: backup
        serverSort: date
Be Quiet! "A node should be able to restart without impacting the app"
Nerve - "Readiness Probe"
# cat env/prod-dc1/services/mysql-main/attributes/nerve.yml
---
override:
  nerve:
    services:
      - name: "mysql-main"
        port: 3306
        reporters:
          - {type: zookeeper, path: /services/mysql/main}
        checks:
          - type: sql
            driver: mysql
            request: "SELECT 1"
            datasource: "local_mon:local_mon@tcp(127.0.0.1:3306)/"
mysql -h 127.0.0.1 -ulocal_mon -plocal_mon -P 3306 -e 'SELECT 1;'
Starting Pod mysql-main1 -> Nerve check is KO
Starting MySQL -> Nerve check is KO
MySQL is syncing (IST/SST) -> Nerve check is KO
MySQL is ready -> Nerve check is OK
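The sequence above boils down to one rule: the node is only advertised once it actually answers SQL. A minimal Python sketch, where `run_check` is an illustrative stand-in for Nerve's sql check rather than its real implementation:

```python
from typing import Callable

def run_check(query_fn: Callable[[str], object]) -> str:
    """Return 'OK' if the node answers 'SELECT 1', else 'KO'."""
    try:
        return "OK" if query_fn("SELECT 1") == 1 else "KO"
    except Exception:
        return "KO"

# While the pod or MySQL is starting, or Galera is syncing (IST/SST),
# the query fails -> the check is KO and the node stays out of discovery.
def syncing(_query):
    raise ConnectionError("node is in SST")

# Once MySQL is ready, the query answers -> the check turns OK.
def ready(_query):
    return 1
```

Because the check exercises the database itself and not just the TCP port, a node that is up but still syncing never receives traffic.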
Nerve - "Grace Period"
# cat env/prod-dc1/services/mysql-main/attributes/nerve.yml
---
override:
  nerve:
    services:
      - name: "mysql-main"
        port: 3306
        reporters:
          - {type: zookeeper, path: /services/mysql/main}
        checks:
          - type: sql
            driver: mysql
            datasource: "local_mon:local_mon@tcp(127.0.0.1:3306)/"
        disableCommand: "/report_remaining_processes.sh"
        disableMaxDurationInMilli: 180000
1. Call /disable on Nerve's API: set weight to 0 = no more new sessions will go into the service.
2. Wait: the remaining sessions are finishing their job.
   SELECT COUNT(1) FROM processlist WHERE user LIKE 'app_%';
3. Stop Pod: the service can be shut down without risk.
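The grace period can be sketched as a drain loop in Python. `drain` and `count_sessions` are illustrative names; the 180000 ms bound mirrors `disableMaxDurationInMilli`, and `count_sessions` stands in for the processlist query:

```python
from typing import Callable

def drain(count_sessions: Callable[[], int],
          max_duration_ms: int = 180_000, poll_ms: int = 1_000) -> bool:
    """Poll remaining application sessions; True if they all finished in time."""
    waited = 0
    while waited <= max_duration_ms:
        if count_sessions() == 0:
            return True  # safe to stop the pod
        waited += poll_ms  # stand-in for an actual sleep between polls
    return False  # deadline hit; report the remaining processes instead
```

The point of the bound is that a stuck long-running session cannot block a deploy forever: past the deadline the operator gets a report and decides.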
Build Smart: "Services can be operated by any SRE"
Use Service Discovery to find peers
Example: the wsrep_cluster_address attribute in Galera Cluster
Description: The addresses of cluster nodes to connect to when starting up. Good practice is to specify all possible cluster nodes, in the form gcomm://<node1 or ip:port>,<node2 or ip2:port>,<node3 or ip3:port>. Specifying an empty ip (gcomm://) will cause the node to start a new cluster.
node1 asks the Service Discovery for mysql-main peers -> no peer found
wsrep_cluster_address = gcomm:// (node1 bootstraps a new cluster)
node2 asks -> node1 is found
wsrep_cluster_address = gcomm://node1
node3 asks -> node1 and node2 are found
wsrep_cluster_address = gcomm://node1,node2
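The bootstrap logic above reduces to one line of address building. `cluster_address` is a hypothetical helper, assuming the peer list has already been fetched from service discovery:

```python
def cluster_address(peers: list[str]) -> str:
    """Derive wsrep_cluster_address from the peers found in service discovery.

    An empty list yields "gcomm://", which bootstraps a new cluster;
    otherwise the node joins the listed peers.
    """
    return "gcomm://" + ",".join(peers)
```

This is what makes the nodes role-free: no node is configured as "the first one", the first to find no peers simply becomes it.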
Next challenges - Kubernetes, the Cloud
Kubernetes, the Cloud, why now?
Fleet is deprecated - fleet is no longer developed and maintained by CoreOS.
Kubernetes - from a simple "Distributed init system" to the standard for container orchestration.
Docker - the rkt-based implementation of Kubernetes has poor adoption.
Service Oriented Architecture - delegated ownership.
Google Kubernetes Engine & Managed Services - allow us to focus on services.
3-year-old servers - we need to renew our hardware.
Kubernetes and stateful services?
Kubernetes Statefulsets
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment, scaling and rolling updates.
StatefulSets control Pods that are based on an identical spec.
Google Kubernetes Engine...
Why are we excited about GKE?
Native support of Liveness and Readiness probes
Release granularity, from Pod to Deployment/StatefulSet
Native Service Discovery (kube-proxy and Services)
GCEPersistentDisk provisioner to manage Persistent Volumes
This, plus resource limits, makes for powerful orchestration
See you next year for 100% GKE Powered Carpooling!