NoSQL and Big Data for DevOps
Gustavo Fernandes
Sunday, 3 February 13

Upload: gustavo-fernandes

Post on 01-Jul-2015

TRANSCRIPT

Page 1: NoSQL and Big Data for DevOps

NoSQL and Big Data for DevOps

Gustavo Fernandes

Sunday, 3 February 13

Page 2: NoSQL and Big Data for DevOps

Agenda

• DevOps?

• Toolset

• BigData

• NoSQL

• Demo

• Q&A

Page 3: NoSQL and Big Data for DevOps

DevOps - Motivations

• Development and ops work in silos

• Slow release cycles

• Lack of awareness on either side

Page 4: NoSQL and Big Data for DevOps

Developers

• Paid to add new features constantly

• “Works on my laptop”

• Usually IGNORE non-functional requirements

Page 5: NoSQL and Big Data for DevOps

Operations

• Keep it stable

• Reliable

• Monitoring

• Distance from code

Page 6: NoSQL and Big Data for DevOps

DevOps - Enablers

• Agile

• Infrastructure as software

Page 7: NoSQL and Big Data for DevOps

What is DevOps?

• Development + Operations

• Discipline/Philosophy/Methodology

• Role - coder of non-functional requirements

• Faster, more reliable, continuous releases to production

Page 8: NoSQL and Big Data for DevOps

DevOps - Principles

• Automate everything: release, deployment, provisioning

• Infrastructure as code - TDD, tags, branches, ...

• Agile applied to ops

Page 9: NoSQL and Big Data for DevOps

Open Source Toolset

• Configuration management tools

  • Puppet, Chef

• Application lifecycle management tools

  • Build tools: Maven, Gradle, Buildr, SBT, Rake

  • Maven repository: Nexus, Artifactory

• Provisioning tools

  • Vagrant, BoxGrinder

• CI servers: Jenkins

Page 10: NoSQL and Big Data for DevOps

Puppet

• Custom declarative language

• Describes resources and states

• Applies states to servers

• Standalone/client-server/pub-sub

• Testable

Page 11: NoSQL and Big Data for DevOps

Puppet Resources

    file { "/my/file":
      source => "/path/to/",
      backup => main,
      mode   => "0644"
    }

    cron { logrotate:
      command => "/usr/sbin/logrotate",
      user    => root,
      hour    => 2,
      minute  => 0
    }

    exec { "tar -xf /Volumes/nfs02/important.tar":
      cwd     => "/var/tmp",
      creates => "/var/tmp/myfile",
      path    => ["/usr/bin", "/usr/sbin"]
    }

    user { 'opuser':
      ensure   => 'present',
      password => '$1$9VC1vFFa$GHKWgtdODti8eKqkQ7Ruv.'
    }

Page 12: NoSQL and Big Data for DevOps

Classes

    class mongodb($replicaset = '', $disablenuma = '') {

      $mongo_tgz = "mongodb-${arch}-${version}.tgz"
      $base_dir  = "${base}"

      group { "mongodb":
        ensure => present
      }

      user { "mongodb":
        ensure => present,
        gid    => "mongodb",
        shell  => "/sbin/nologin"
      }

      file { "$base_dir":
        ensure => "directory",
        owner  => "mongodb",
        group  => "mongodb",
        alias  => "mongo-base"
      }
    }

Page 13: NoSQL and Big Data for DevOps

File Server

    file { "/etc/sudoers":
      mode   => 440,
      owner  => root,
      group  => root,
      source => "puppet:///modules/name/sudoers"
    }

    file { "$installdir/conf/mongo.conf":
      mode    => 0744,
      ensure  => present,
      content => template('mongodb/mongodb.conf.erb'),
    }

Page 14: NoSQL and Big Data for DevOps

ERB Templates

    $servers = ['server1.domain', 'server2.domain', 'server3.domain']

    file { '/etc/foo.conf':
      ensure  => file,
      content => template('foo/foo.conf.erb'),
    }

    # foo.conf.erb
    <% @servers.each do |server| -%>
    <%= server %>
    <% end -%>

    # /etc/foo.conf (rendered)
    server1.domain
    server2.domain
    server3.domain
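A template like this one can be checked outside Puppet before it is deployed. The sketch below is not from the slides: it renders the same loop with Ruby's standard ERB library. The TemplateScope class is a hypothetical stand-in for the scope Puppet provides, and it assumes the template reads its variable as an instance variable (@servers).

```ruby
require 'erb'

# Template body from the slide, reading the list as an instance variable.
FOO_CONF_ERB = <<~'ERB'
  <% @servers.each do |server| -%>
  <%= server %>
  <% end -%>
ERB

# Hypothetical stand-in for Puppet's template scope (not a Puppet class):
# every key in the hash becomes an instance variable visible to the template.
class TemplateScope
  def initialize(vars)
    vars.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  # trim_mode '-' makes a closing "-%>" swallow the newline that follows it.
  def render(template)
    ERB.new(template, trim_mode: '-').result(binding)
  end
end

servers = %w[server1.domain server2.domain server3.domain]
puts TemplateScope.new(servers: servers).render(FOO_CONF_ERB)
```

Without the `-%>` endings, each loop tag would leave a blank line in the generated file.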

Page 15: NoSQL and Big Data for DevOps

Modules

Static Files

ERB Templates

Module manifest

Page 16: NoSQL and Big Data for DevOps

Site

    node "box1.domain" {
      include java
      include hadoop_master
    }

site.pp

Page 17: NoSQL and Big Data for DevOps

Manifest

• Puppet entry point

• Nodes

    node "box1.domain" {
      include java
      include hadoop_master
      class { 'myapp':
        version    => "1.5-SNAPSHOT",
        maven_repo => "http://devserver:8081/nexus/content/repositories/snapshots/",
      }
    }

    node "box2.domain" {
      include java
      include hadoop_slave
    }

Page 18: NoSQL and Big Data for DevOps

Facter Inventory

• Companion Ruby utility for Puppet

• Collects facts about an environment and exposes them as a map

• Puppet uses it to decide what to deliver to a node
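To make the "facts as a map" idea concrete, here is a minimal sketch that gathers a few comparable values using only Ruby's standard library. It is not how Facter is implemented, and collect_facts is a hypothetical helper; the key names are borrowed from Facter's output, but the values may differ from what Facter reports on the same machine (for example, the hostname may come back fully qualified).

```ruby
require 'etc'
require 'socket'

# Hypothetical helper: collect a handful of machine properties into a Hash,
# keyed the way Facter names the corresponding facts.
def collect_facts
  uname = Etc.uname
  {
    'architecture'   => uname[:machine],
    'hostname'       => Socket.gethostname,
    'kernel'         => uname[:sysname],
    'processorcount' => Etc.nprocessors
  }
end

# Print in the same "name => value" layout that the facter command uses.
collect_facts.sort.each { |name, value| puts "#{name} => #{value}" }
```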

Page 19: NoSQL and Big Data for DevOps

Facter - Example

    $ facter
    architecture => x86_64
    domain => domain
    fqdn => devserver.domain
    hardwaremodel => x86_64
    hostname => devserver
    ipaddress_eth1 => 192.168.95.15
    ipaddress_lo => 127.0.0.1
    is_virtual => true
    kernel => Linux
    memorytotal => 491.11 MB
    network_eth0 => 10.0.2.0
    network_eth1 => 192.168.95.0
    operatingsystem => OpenSuSE
    operatingsystemrelease => 12.2
    physicalprocessorcount => 1
    processor0 => Intel(R) Core(TM) i7-2677M CPU @ 1.80GHz
    processorcount => 1
    ......

Page 20: NoSQL and Big Data for DevOps

Using facter values

    ...
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value><%= @processorcount %></value>
    </property>
    ...

mapred-site.xml.erb

Page 21: NoSQL and Big Data for DevOps

Puppet Extensions

• The language itself

• New types

• New functions

• New resources

Page 22: NoSQL and Big Data for DevOps

The maven resource

    maven { "download-artifact":
      groupid    => "com.gustavonalle",
      artifactid => "$artifact",
      version    => "1.0-SNAPSHOT",
      repos      => "http://devserver:8081/repo/snapshots/",
      directory  => "/opt/myapp",
      classifier => "jar",
      require    => File["basedir"],
      before     => Exec["unzip"]
    }

Page 23: NoSQL and Big Data for DevOps

Maven resource

• Support for snapshots

• Support for releases

• No need to install Maven on the server

• Support for authentication

• Support for http and https

• Support for puppet:// protocol

Page 24: NoSQL and Big Data for DevOps

Installing Hadoop

Page 25: NoSQL and Big Data for DevOps

Hadoop - Processes

• Name Node running on at least one node

• Secondary Name Node running elsewhere

• Data Node process running on nodes that are part of HDFS

• Job Tracker running in the cluster

• Task Tracker running on each node that can execute map or reduce tasks

Page 26: NoSQL and Big Data for DevOps

Hadoop HDFS

• Name node knows all the slaves from <HADOOP_HOME>/conf/slaves

    slave1.domain.com
    slave2.domain.com
    ...

• Slaves need to point at the namenode in the file <HADOOP_HOME>/conf/core-site.xml

    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode:9000</value>
    </property>
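Both files follow mechanically from a single description of the cluster, which is why they are natural candidates for generation. The following is a rough sketch under that assumption; slaves_file and core_site_xml are hypothetical helper names, while the file names, the fs.default.name property, and port 9000 come from the slide.

```ruby
# Hypothetical helper: one hostname per line, as expected in conf/slaves.
def slaves_file(slaves)
  slaves.join("\n") + "\n"
end

# Hypothetical helper: a minimal core-site.xml pointing at the name node.
def core_site_xml(namenode, port = 9000)
  <<~XML
    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://#{namenode}:#{port}</value>
      </property>
    </configuration>
  XML
end

puts slaves_file(%w[slave1.domain.com slave2.domain.com])
puts core_site_xml('namenode')
```

Adding a slave then means changing one list rather than editing files on every machine.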

Page 27: NoSQL and Big Data for DevOps

Hadoop Map Reduce

• Job Tracker knows all the slaves from <HADOOP_HOME>/conf/slaves

    slave1.domain.com
    slave2.domain.com
    ...

• Slaves need to point at the jobtracker in the file <HADOOP_HOME>/conf/mapred-site.xml

    <property>
      <name>mapred.job.tracker</name>
      <value>master:9001</value>
    </property>

Page 28: NoSQL and Big Data for DevOps

Hadoop - SSH

• SSH is required for cluster-wide operations

• 'ssh localhost' must work without prompting for a password

• su hadoop -c 'ssh slave01' must work without prompting for a password
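One way to verify this requirement from a script is to ask ssh to fail rather than prompt. The sketch below assumes an OpenSSH client on the PATH; the two helper names are hypothetical, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```ruby
# Build an ssh command line that runs the no-op `true` on the remote host.
# BatchMode=yes makes ssh exit with an error instead of asking for a password.
def passwordless_ssh_command(host)
  ['ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=5', host, 'true']
end

# Returns true only if the login succeeded without any prompt.
# system returns false on a non-zero exit and nil if ssh could not be started.
def passwordless_ssh?(host)
  system(*passwordless_ssh_command(host), out: File::NULL, err: File::NULL) == true
end

puts passwordless_ssh_command('slave01').join(' ')
```

Calling passwordless_ssh?('localhost') as the hadoop user would then cover the first check on the slide.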

Page 29: NoSQL and Big Data for DevOps

How puppet can help

• Facter calculates optimal values for memory, cpu, number of maps

• Generate ssh keys on the fly

• Obtain hostnames automatically

• Hide most of the complexity and expose only the bare minimum

• Install a large number of slaves in parallel
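As an illustration of the first point, the sketch below derives two Hadoop settings from a map of facts. The map-slot rule (one slot per processor) mirrors the mapred-site.xml.erb slide; the heap rule, which gives half of the machine's memory to task JVMs, is an arbitrary assumption made for this example and not a recommendation from the talk. Both helper names are hypothetical.

```ruby
# Parse a Facter-style memory string such as "491.11 MB" into megabytes.
def memory_mb(text)
  match = /\A([\d.]+)\s*(MB|GB)\z/.match(text.strip)
  raise ArgumentError, "unrecognised memory value: #{text}" unless match

  value = match[1].to_f
  match[2] == 'GB' ? value * 1024 : value
end

# Derive per-node Hadoop settings from a facts map.
def hadoop_settings(facts)
  slots = Integer(facts.fetch('processorcount'))
  # Assumption for illustration only: half of RAM is shared by the task JVMs.
  heap = (memory_mb(facts.fetch('memorytotal')) * 0.5 / slots).floor
  {
    'mapred.tasktracker.map.tasks.maximum' => slots,
    'mapred.child.java.opts'               => "-Xmx#{heap}m"
  }
end

facts = { 'processorcount' => '1', 'memorytotal' => '491.11 MB' }
hadoop_settings(facts).each { |name, value| puts "#{name} = #{value}" }
```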

Page 30: NoSQL and Big Data for DevOps

MongoDB - Replicaset

Primary

Secondaries

Arbiter (no data)

Page 31: NoSQL and Big Data for DevOps

MongoDB - Creating cluster

• The replica set must be initiated with all servers running

• Done with command-line tools and a bit of JavaScript

Page 32: NoSQL and Big Data for DevOps

How puppet can help

• Ensure all servers are running and configured

• Generate .js files and configuration files using templates
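The generated .js file for a replica set can be very small. The sketch below is an assumption about what such a template might look like, since the author's actual templates are not shown in the slides. It builds the configuration document for the mongo shell's rs.initiate() from the replica set name and hosts used in the demo, assuming MongoDB's default port 27017. REPLSET_JS, replset_config, and replset_js are hypothetical names.

```ruby
require 'erb'
require 'json'

# Hypothetical template for the script that initiates the replica set.
REPLSET_JS = <<~'ERB'
  rs.initiate(<%= JSON.pretty_generate(config) %>);
ERB

# Build the replica set configuration document: data-bearing members first,
# then the arbiter, which votes in elections but stores no data.
def replset_config(name, primary, secondaries, arbiter, port = 27017)
  members = ([primary] + secondaries).each_with_index.map do |host, index|
    { '_id' => index, 'host' => "#{host}:#{port}" }
  end
  members << { '_id' => members.size, 'host' => "#{arbiter}:#{port}", 'arbiterOnly' => true }
  { '_id' => name, 'members' => members }
end

# Render the template with the config exposed as a local variable.
def replset_js(config)
  ERB.new(REPLSET_JS).result_with_hash(config: config)
end

puts replset_js(replset_config('fosdem', 'box1.domain', ['box2.domain'], 'devserver.domain'))
```

The output is a script that could be run in the mongo shell on the intended primary once every member is up, which is the ordering constraint Puppet has to enforce.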

Page 33: NoSQL and Big Data for DevOps

Demo

Jenkins

Nexus

Puppet Master

github.com/gustavonalle/puppet

Page 34: NoSQL and Big Data for DevOps

Demo

[Diagram: deployment across box1.domain, box2.domain, and devserver.domain]

Components shown: Name Node, Job Tracker, Data Node, Task Tracker, M&R Job, Mongo Primary, Mongo Secondary, Mongo Arbiter, JVM

Legend: ** 2 CPUs, * 1 CPU

Page 35: NoSQL and Big Data for DevOps

site.pp

    class mongo_replicaset {
      class { 'mongodb':
        replicaset  => 'fosdem',
        primary     => 'box1.domain',
        secondaries => ['box2.domain'],
        arbiter     => 'devserver.domain'
      }
    }

    class customApp {
      class { 'myapp':
        version    => "1.0-SNAPSHOT",
        maven_repo => "http://devserver.domain:8081/repo/snapshots/",
      }
    }

Page 36: NoSQL and Big Data for DevOps

site.pp (cont.)

    node "box1.domain" {
      include java
      class { 'hadoop':
        master => box1,
        slaves => [box1,box2]
      }
      include mongo_replicaset
      include customApp
    }

    node "box2.domain" {
      include java
      include mongo_replicaset
      class { 'hadoop':
        master => box1,
      }
    }

Page 37: NoSQL and Big Data for DevOps

Wrapping up

• Server side software is not getting any simpler

• But infrastructure is now “software”

• DevOps is here to stay

Page 38: NoSQL and Big Data for DevOps

Thank you

github.com/gustavonalle/puppet
