Joel Jacobson
Scaling DataStax in Docker
© DataStax, All Rights Reserved. 2
How it started
Internal project at dotCloud
Pivoted to Docker Inc.
Execution using libcontainer
Huge adoption
What is Docker? And why is it important?
3 key concepts
Images
Registries
Containers
Example Dockerfile image
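The Dockerfile shown on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of what a Dockerfile looks like, written out from a shell script. The base image, the installed package and the default command are all placeholder assumptions chosen for illustration; they are not the contents of the original slide.

```shell
#!/bin/sh
# Write a small illustrative Dockerfile into a scratch directory.
# Base image, package and command are placeholders, not the slide's content.
workdir=$(mktemp -d)

cat > "$workdir/Dockerfile" <<'EOF'
# Start from an existing image pulled from a registry
FROM debian:stable-slim
# Each instruction adds a layer to the resulting image
RUN apt-get update && apt-get install -y --no-install-recommends curl
# Default process started when a container is created from the image
CMD ["curl", "--version"]
EOF

echo "Dockerfile written to $workdir/Dockerfile"

# To build and run it (requires a Docker daemon):
#   docker build -t example-image "$workdir"
#   docker run --rm example-image
```

The three key concepts map onto this directly: the `FROM` line pulls an image from a registry, `docker build` produces a new image, and `docker run` starts a container from it.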
Why are containers important?
Speeding up application development
Better resource utilization
Mobility
Faster provisioning
Microservices
Why are containers important?
[Diagram labels: WEB UI, BILLING, CUSTOMER, PAYMENTS, SERVICE X, SERVICE Y, REST API, DB ADAPTER, MYSQL, EXT SERVICE (x2)]
Why are containers important?
[Diagram labels: WEB UI; BILLING, CUSTOMER, PAYMENTS, SERVICE X and SERVICE Y, each with its own REST API; CASSANDRA, SPARK, SOLR; EXT SERVICE (x2)]
DataStax Enterprise in Docker
Why are containers important?
Build once, deploy anywhere
Flexibility for sharing binaries and libraries across applications
Managing, maintaining and deploying becomes turnkey
Officially supported since DSE 4.8
DSE processes
Core DSE JVM
One or more Spark executor processes
Single Spark worker process
Multiple processes for the Hadoop stack
Ad-hoc processes (Spark job server, Spark SQL, CLI, etc.)
OpsCenter agent
DataStax Enterprise configuration
Cassandra configuration (seeds, cluster_name, etc.)
Where to manage Cassandra data
Optimal JVM heap size
Optimal garbage collector
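As an illustration of the first bullet, the sketch below rewrites `cluster_name` and `seeds` in a cassandra.yaml before the node starts. It works on a scratch file that mimics only the relevant lines of a real cassandra.yaml. The variable names `CLUSTER_NAME` and `SEEDS`, their default values and the scratch file are assumptions made for the example, not part of DSE.

```shell
#!/bin/sh
# Sketch: set cluster_name and seeds in a cassandra.yaml before starting the node.
# CLUSTER_NAME, SEEDS and the scratch file are illustrative assumptions.
CLUSTER_NAME="${CLUSTER_NAME:-DockerCluster}"
SEEDS="${SEEDS:-10.0.0.1,10.0.0.2}"

conf=$(mktemp)

# Minimal stand-in for the relevant lines of a real cassandra.yaml
cat > "$conf" <<'EOF'
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
EOF

# Rewrite the two settings in place (the -i.bak form works with GNU and BSD sed)
sed -i.bak \
    -e "s/^cluster_name:.*/cluster_name: '$CLUSTER_NAME'/" \
    -e "s/^\( *- seeds:\).*/\1 \"$SEEDS\"/" \
    "$conf"

cat "$conf"
```

In a container image, a step like this would typically run from the entrypoint script, pointed at the real configuration file, so the same image can join different clusters.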
DataStax Enterprise configuration
Default capability limits of Docker break mlockall
Add -XX:+AlwaysPreTouch to the JVM arguments
ulimits inherited from Docker daemon
Disable swap on host OS
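One way these points could translate into a command line is sketched below. It only prints the `docker run` command rather than executing it. `--cap-add` and `--ulimit` are standard `docker run` options; the image name `dse-image` and the specific limit values are assumptions for illustration and should be checked against current DataStax guidance.

```shell
#!/bin/sh
# Sketch: print a docker run command that grants what mlockall needs.
# IMAGE is a hypothetical image name; the ulimit values are illustrative.
IMAGE="${IMAGE:-dse-image}"

dse_run_cmd() {
    # --cap-add=IPC_LOCK : allow the JVM to lock memory (mlockall)
    # --ulimit           : set limits per container instead of inheriting the daemon's
    echo docker run -d \
        --cap-add=IPC_LOCK \
        --ulimit memlock=-1:-1 \
        --ulimit nofile=100000:100000 \
        "$IMAGE"
}

# On the host, as root, swap would be disabled separately:
#   swapoff -a
# Inside the image, the JVM flag could be appended in cassandra-env.sh:
#   JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch"

dse_run_cmd
```

Printing the command first makes it easy to review the flags before running it against a real Docker daemon.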
Networking
Default networking (via Linux bridge) not recommended
Instead use docker run --net=host
Use pipework or weave for consistent IP addresses
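A minimal sketch of the host-networking recommendation follows. It prints the command instead of running it, and the image name `dse-image` is a hypothetical placeholder.

```shell
#!/bin/sh
# Sketch: print a docker run command that uses the host's network stack.
# IMAGE is a hypothetical image name.
IMAGE="${IMAGE:-dse-image}"

net_host_cmd() {
    # --net=host : the container shares the host's network namespace,
    #              so there is no bridge or NAT hop and no -p port mapping
    echo docker run -d --net=host "$IMAGE"
}

net_host_cmd
```

With host networking the node listens directly on the host's interfaces, so only one such container per host can bind a given port.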
Storage
Everything in /var/lib/cassandra:
commitlog
saved_caches
data directories
Use supported filesystem
Storage
Data volumes can be shared and reused among containers
Changes to a data volume are made directly
Changes to a volume will not be included when you update an image
Data volumes persist if container is deleted
Storage
docker run -v <some root dir>/<dse_image_name>-data:/data -v <some root dir>/<dse_image_name>-conf:/conf -v <some root dir>/<dse_image_name>-logs:/logs -d <dse_image_name>
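The command above can be wrapped in a small script that first creates the host directories. This is a sketch only: `ROOT_DIR` and the image name `dse-image` are hypothetical stand-ins for the placeholders in the slide, and the script prints the `docker run` command rather than executing it.

```shell
#!/bin/sh
# Sketch: create host directories for data, conf and logs, then print the
# matching docker run command. ROOT_DIR and IMAGE are hypothetical values.
ROOT_DIR="${ROOT_DIR:-$(mktemp -d)}"
IMAGE="${IMAGE:-dse-image}"

# One host directory per mounted volume
for kind in data conf logs; do
    mkdir -p "$ROOT_DIR/$IMAGE-$kind"
done

storage_run_cmd() {
    # Each -v maps a host directory to a path inside the container
    echo docker run \
        -v "$ROOT_DIR/$IMAGE-data:/data" \
        -v "$ROOT_DIR/$IMAGE-conf:/conf" \
        -v "$ROOT_DIR/$IMAGE-logs:/logs" \
        -d "$IMAGE"
}

storage_run_cmd
```

Because the directories live on the host, they outlive the container, so a replacement container started with the same `-v` flags sees the same data.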
DSE Docker Demo
Futures
Splitting up DSE processes into separate containers
Integration with Kubernetes, Mesos
Deployment model on public/private clouds
Summary
Configure OS and JVM
Map storage volumes
Avoid bridge/NAT networking
Test. Test. Test.
Useful Information
Links and information
Datastax.com
http://www.datastax.com/wp-content/uploads/resources/DataStax-WP-Best_Practices_Running_DSE_Within_Docker.pdf
github.com/joeljacobson/dse-docker
academy.datastax.com
Thank you