Putting Rails and CouchDB on the Cloud - IndicThreads Cloud Computing Conference 2011


Upload: indicthreads

Post on 11-May-2015


DESCRIPTION

Session presented at the 2nd IndicThreads.com Conference on Cloud Computing held in Pune, India on 3-4 June 2011. http://CloudComputing.IndicThreads.com

Session Abstract: Apache CouchDB is a document-oriented NoSQL database that can be queried and indexed in a MapReduce fashion using JavaScript. CouchDB offers an easy introduction to the world of NoSQL. In this session we will learn how to work with CouchDB, how to install it on an Amazon EC2 instance and how to insert and query data in it. We will then create a Ruby on Rails application, host it on the cloud through Heroku and integrate it with our CouchDB. After this session, the audience will be able to work with CouchDB, understand its strengths and run it on an EC2 instance. The audience will also be able to appreciate the ease of hosting a Rails application with Heroku and how quickly one can launch and scale applications on the cloud with the combination of these two technologies.

Speaker: Rocky Jaiswal is a Software Architect at McKinsey & Company and has more than 8 years of experience in software analysis, design and programming. His primary area of expertise is application development using Java/JEE/Spring & Hibernate. He has worked as a consultant for major investment banks like Goldman Sachs and Morgan Stanley. He has extensive international experience and has worked in the UK, USA, Netherlands, Japan and Mexico. Rocky is a strong believer in Agile methodologies for software development, particularly Scrum and XP.

TRANSCRIPT

Page 1: Putting rails and couch db on the cloud -  Indicthreads cloud computing conference 2011

06/06/11 1


ABOUT ME

Rocky Jaiswal

Daytime job – Software Architect at McKinsey & Co

Programmer / Agilist at heart

Have been programming for almost 9 years, plan to do it for a long loooong time

I love Java/Ruby/JRuby/jQuery and anything to do with web application development

Want to build good looking, scalable and performing websites that help people

Blog – www.rockyj.in

Twitter – www.twitter.com/whatsuprocky


Why we need the cloud

I am a developer. I don’t want the hassle of maintaining infrastructure.

We are a small organization. We want cheap and flexible infrastructure.

I / we want to scale easily, whether that means scaling up or scaling down.

* The choice of technology also determines how easily you can scale, e.g. using NoSQL instead of an RDBMS.


A WORKING EXAMPLE

http://biblefind.in


HELLO COUCHDB

Apache CouchDB is a document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript.

CouchDB also offers incremental replication with bi-directional conflict detection and resolution.

CouchDB provides a RESTful JSON API that can be accessed from any environment that allows HTTP requests.

+ It offers an easy introduction to the world of NoSQL


HELLO COUCHDB – DOCUMENT ORIENTED

A different way to model your data.

Data stored in documents

Think of it as a de-normalized table row


HELLO COUCHDB - MAPREDUCE

MapReduce – Divide and Rule for programmers

How would you count the occurrences of each word in a book, given a group of helpers?
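One way to picture the answer is the small Ruby sketch below. It is only an illustration of the idea, not code from the talk: each helper "maps" a page to word/count pairs, and a final "reduce" step adds the counts up. The `pages` array and the method names are made up for the example.

```ruby
# Map step: one helper turns its share of the pages into [word, 1] pairs.
def map_words(text)
  text.downcase.scan(/[a-z']+/).map { |word| [word, 1] }
end

# Reduce step: add up the counts emitted for each word.
def reduce_counts(pairs)
  pairs.each_with_object(Hash.new(0)) { |(word, n), totals| totals[word] += n }
end

# Hypothetical "book", split into pages that are handed out to helpers.
pages = [
  "In the beginning was the Word",
  "and the Word was with God"
]

emitted = pages.flat_map { |page| map_words(page) }
word_counts = reduce_counts(emitted)
puts word_counts.inspect
```

Because every page is mapped independently, the pages could be handed to different helpers (or machines) and only the small lists of pairs need to be brought back together.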


HELLO COUCHDB – JSON/HTTP

When we talk of databases we talk of drivers

CouchDB’s protocol is HTTP

The data exchange + storage language is JSON

{
  "Subject": "I like JSON",
  "Author": "Rocky",
  "Tags": ["Web", "Programming", "Data Exchange"]
}

And the queries are written in JavaScript
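Since the "driver" is just HTTP, storing the document above needs nothing beyond Ruby's standard library. The sketch below builds the request without sending it; the host, the database name `blog` and the document id `my_first_post` are assumptions for illustration, not values from the talk.

```ruby
require "json"
require "net/http"
require "uri"

# The document from the slide, as a plain Ruby hash.
post_doc = {
  "Subject" => "I like JSON",
  "Author"  => "Rocky",
  "Tags"    => ["Web", "Programming", "Data Exchange"]
}

# Assumed local CouchDB URL; "blog" and "my_first_post" are made-up names.
uri = URI("http://127.0.0.1:5984/blog/my_first_post")

# PUT /<database>/<document id> with a JSON body creates the document.
put_request = Net::HTTP::Put.new(uri)
put_request["Content-Type"] = "application/json"
put_request.body = JSON.generate(post_doc)

# Uncomment to actually send it to a running CouchDB:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(put_request) }
# puts response.body

puts put_request.method, put_request.path, put_request.body
```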


PARSING THE BIBLE AND STORING IT IN COUCHDB
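The slide's original content is not preserved in this transcript, so the following is only a rough sketch of what such a parsing step could look like. It assumes a plain-text input with one verse per line in the form "Book Chapter:Verse Text"; that format, the sample lines and the method name are assumptions, not the speaker's actual code.

```ruby
# Assumed input format (not from the talk): one verse per line,
# written as "Book Chapter:Verse Text".
VERSE_LINE = /\A(?<book>.+?)\s+(?<chapter>\d+):(?<verse>\d+)\s+(?<text>.+)\z/

# Turn one line into a hash that can be stored as a CouchDB document.
def verse_to_doc(line)
  m = VERSE_LINE.match(line.strip)
  return nil unless m

  {
    "book"    => m[:book],
    "chapter" => m[:chapter].to_i,
    "verse"   => m[:verse].to_i,
    "text"    => m[:text]
  }
end

sample_lines = [
  "Genesis 1:1 In the beginning God created the heaven and the earth.",
  "1 John 4:8 He that loveth not knoweth not God; for God is love."
]

verse_docs = sample_lines.map { |line| verse_to_doc(line) }.compact
puts verse_docs.inspect
```

Each resulting hash carries `book`, `chapter`, `verse` and `text` fields and could then be saved as one document per verse.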


CREATING VIEWS IN COUCHDB

Views are like queries. Hmmm… more like Stored Procedures

Views are expressed as Map + Reduce functions written in JavaScript

For example my view to query all the verses –

{
  "lookup": {
    "map": "function(doc) {
      if (doc.book && doc.chapter && doc.verse && doc.text) {
        key = [doc.book, doc.chapter, doc.verse];
        emit(key, doc.text);
      }
    }"
  }
}

CouchDB runs the function for every document in the database and stores the results in a B-Tree.
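Because the view's keys are `[book, chapter, verse]` arrays kept in sorted order, a range of verses can be fetched with CouchDB's `startkey` and `endkey` view parameters, each passed as JSON. The sketch below only builds such a URL. The design document name `verses`, the host, and the assumption that chapter and verse are stored as integers are all illustrative guesses, not details from the talk.

```ruby
require "json"
require "uri"

# Assumptions: the design document is named "verses" and
# chapter/verse are stored as integers in each document.
def lookup_url(base, startkey, endkey)
  query = URI.encode_www_form(
    "startkey" => JSON.generate(startkey),
    "endkey"   => JSON.generate(endkey)
  )
  "#{base}/_design/verses/_view/lookup?#{query}"
end

# Psalms 23, verses 1 to 4, from an assumed local database.
url = lookup_url("http://127.0.0.1:5984/the_bible", ["Psalms", 23, 1], ["Psalms", 23, 4])
puts url
```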


THE COUCHREST GEM

CouchRest lightly wraps CouchDB's HTTP API, managing JSON serialization, and remembering the URI-paths to CouchDB's API endpoints so you don't have to.

@db = CouchRest.database!("http://127.0.0.1:5984/the_bible")

@db.save_doc({:key => 'value', 'another key' => 'another value'})


THE RAILS APPLICATION

So our back-end is set

We only need a front-end now

Nothing much needs to be done

1 Controller

1 View

Some jQuery for autocomplete


REGEX NIGHTMARES

Matthew 1 – One whole chapter

Mark 2:3 – One verse

Psalms 23:1-4 – A set of verses

For God so loved the world – Free Text
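The four input shapes above can be told apart with a single pattern plus a fallback. This is a minimal sketch of one possible approach, not the regex used in the actual application; the pattern, the method name and the result hash layout are invented for illustration.

```ruby
# Reference pattern: book name (optionally starting with a digit, e.g. "1 John"),
# chapter, optional ":verse", optional "-last verse".
REFERENCE = /\A(?<book>(?:\d\s)?[A-Za-z ]+?)\s+(?<chapter>\d+)(?::(?<from>\d+)(?:-(?<to>\d+))?)?\z/

# Classify a search box entry; anything that is not a reference is free text.
def parse_query(input)
  m = REFERENCE.match(input.strip)
  return { type: :free_text, text: input.strip } unless m

  result = { book: m[:book], chapter: m[:chapter].to_i }
  if m[:from].nil?
    result.merge(type: :chapter)
  elsif m[:to].nil?
    result.merge(type: :verse, verse: m[:from].to_i)
  else
    result.merge(type: :range, from: m[:from].to_i, to: m[:to].to_i)
  end
end

["Matthew 1", "Mark 2:3", "Psalms 23:1-4", "For God so loved the world"].each do |q|
  puts parse_query(q).inspect
end
```

Free text falls through to the last branch, which is where a full-text index becomes necessary.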


LUCENE INTEGRATION

Couchdb-lucene (https://github.com/rnewson/couchdb-lucene)

Java project

CouchDB View –

@db.save_doc({
  "_id" => "_design/lucene",
  :fulltext => {
    :by_text => {
      :index => "function(doc) { var ret = new Document(); ret.add(doc.text); return ret }"
    }
  }
})


LUCENE INTEGRATION CONTD..

CouchDB config -

[external]
fti=python /home/rocky/Apps/couchdb-lucene-0.7-SNAPSHOT/tools/couchdb-external-hook.py

[httpd_db_handlers]
_fti = {couch_httpd_external, handle_external_req, <<"fti">>}


USING HEROKU

Rails application hosting provider

Free for 1 “Dyno” + 1 Shared database


USING HEROKU CONTD..


USING SLICEHOST


PUTTING IT OUT IN THE BIG BAD WORLD

Buy a domain name from http://www.godaddy.com

Or a domain provider of your choice

In Heroku add the domain name

Add the Zerigo add-on in Heroku

In GoDaddy’s admin console, point your nameservers to Zerigo’s nameservers

Wait …


SCALING

Heroku makes scaling dead easy


SCALING

Heroku makes scaling dead easy (if we were using SQL)


SCALING

In NoSQL world, replication is a first class citizen

POST /_replicate {"source": "a", "target": "b", "continuous": true}
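From Ruby, the same replication request can be put together with the standard library alone. The sketch below builds the request and leaves the actual send commented out; the host and port are assumed to be a local CouchDB, and `continuous` is sent as a JSON boolean.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed local CouchDB; "a" and "b" are the database names from the slide.
uri = URI("http://127.0.0.1:5984/_replicate")

replication = { "source" => "a", "target" => "b", "continuous" => true }

replicate_request = Net::HTTP::Post.new(uri)
replicate_request["Content-Type"] = "application/json"
replicate_request.body = JSON.generate(replication)

# Uncomment to send it to a running CouchDB:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(replicate_request) }
# puts response.body

puts replicate_request.method, replicate_request.path, replicate_request.body
```

Swapping `source` and `target` in a second request would set up replication in the other direction as well.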


SCALING (HOT BACKUP)


SCALING WITH A DUMB PROXY


SCALING WITH COUCHDB LOUNGE

Have a look at CouchDB Lounge

It consists of –

- a dumb proxy that is a module for nginx
- a smart proxy that distributes work

All in all, make a cluster –

- Have continuous replication
- Use Lounge to distribute load


WHEN NOT TO USE COUCHDB

When the number of writes far exceeds the number of reads (and the data volume is very high)

- This would create a bottleneck for replication
- And you may encounter more conflicts

When you need ad-hoc queries

- You cannot use the power of views in this case

Use CouchDB’s brother MongoDB in this case.


THANKS