Couchbase Server 2.0 and Incremental Map Reduce for Real-Time Analytics
TRANSCRIPT
Couchbase Server 2.0 - Webinar Series
Couchbase Server 2.0 Use Cases Overview
Introducing Couchbase Server 2.0
Couchbase Server 2.0 and Indexing/Querying
Couchbase Server 2.0 and Incremental Map Reduce for Real-Time Analytics
Couchbase Server 2.0 and Cross Data Center Replication
Couchbase Server 2.0 and Full-Text Search Integration
1Wednesday, October 10, 12
http://www.couchbase.com/webinars
Incremental Map Reduce for Real-Time Analytics
Jasdeep Jaitla, Technical Evangelist
New in Two
• JSON support
• Indexing and Querying
• Cross data center replication
• Incremental Map Reduce
What we'll talk about
• Quick Relational vs Document Databases
• Why Views are Helpful
• Anatomy of Views
  • Map
  • Reduce
• Simple Example of Map Reduce
• Use Case - Analyzing Reddit in Real-Time
  • Demo
  • Breakdown
• Final Words on Views
DOCUMENT DATABASE PRIMER
Relational vs Document Data Model
Relational data model: Highly-structured table organization with rigidly-defined data formats and record structure.
Document data model: Collection of complex documents with arbitrary, nested data formats and varying "record" format.
SQL Normalized Data
To get information about a specific user, you perform a join across two tables.
Users
KEY  First    Last    ZIP_ID (foreign key)
1    Jasdeep  Jaitla  2
2    Joe      Smith   2
3    Ali      Dodson  2
4    John     Doe     3
Addresses
ZIP_ID  CITY  STATE  ZIP
1       DEN   CO     30303
2       MV    CA     94040
3       CHI   IL     60609
4       NY    NY     10010
SELECT * FROM Users u INNER JOIN Addresses a ON u.zip_id = a.zip_id WHERE u.key = 1
Documents are Aggregates
All data in a single document
User row (1, Jasdeep, Jaitla) + address row (SF, CA, 94103) = one JSON document:
{ "ID": 1, "First": "Jasdeep", "Last": "Jaitla", "ZIP": "94103", "CITY": "SF", "STATE": "CA" }
Document Data is an Aggregate
couchbase.get("user::1")
Document Database Schema is Flexible & Dynamic
Just add information to a document
{ "ID": 1, "FIRST": "Jasdeep", "LAST": "Jaitla", "ZIP": "94103", "CITY": "SF", "STATE": "CA",
  "STATUS": { "TEXT": "Wow!", "GEO_LOC": "27.4", "LIKES": 45 } }
MAP-REDUCE BASICS
Document Keys
Document Keys Come In Many Flavors
• Human Readable
• Incremental Counter Index
• UUID
• Timestamp Based
• Social Media Account ID
• Random Numbers
Q: Does Couchbase have a mechanism for creating unique keys?
Q: If I use unique usernames or emails for keys, will I need a map-query?
Q: If I use UUIDs for IDs, will I need a map-reduce to find Documents?
A: If your keys are indeterminable, you will need Secondary Indexes (Views, either Map or Map/Reduce, or Elastic Search) to find Documents.
A: There are many patterns for key creation; it's a skill and an art to design your keys.
A: If you want to find Documents based on more than one parameter, you may need Views as well.
A: In many cases Lookups can also be done without Views, using a Lookup Pattern, but that's not always the case, especially for time based or geo based values.
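A minimal sketch of one common form of the Lookup Pattern, assuming an in-memory Map as a stand-in for a bucket. The email:: key prefix, the helper names, and the example.com address are hypothetical and are not part of any Couchbase API.

```javascript
// Stand-in for a bucket: a plain Map from document key to document.
// (Assumption: a real application would call its Couchbase client instead.)
const bucket = new Map();

// Store the user under its primary key, plus a small lookup document
// keyed by email whose value is the primary key.
function saveUser(id, user) {
  const userKey = "user::" + id;
  bucket.set(userKey, user);
  bucket.set("email::" + user.email, userKey);
}

// Two key-based gets, no View required.
function findUserByEmail(email) {
  const userKey = bucket.get("email::" + email);
  return userKey === undefined ? null : bucket.get(userKey);
}

saveUser(1, { first: "Jasdeep", last: "Jaitla", email: "user1@example.com" });
console.log(findUserByEmail("user1@example.com").first);
```

This works because the email is known exactly at lookup time. Range questions (all posts in an hour, everything near a location) have no single key to fetch, which is where Views come in.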
ANATOMY OF A VIEW
Buckets >> Design Documents >> Views
Couchbase Bucket
  Design Document 1: View, View, View
  Design Document 2: View, View
• Indexers Are Allocated Per Design Doc
• All Updated at Same Time (the Views within a Design Document)
• Can Only Access Data in the Bucket Namespace
Map() function = index
Every Document passes through View Map() functions
function(doc, meta) {
  emit(doc.username, doc.email);
}
• doc: json doc
• meta: doc metadata
• emit(): create row
• doc.username: indexed key
• doc.email: output value(s)
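To see what "every document passes through Map()" means in practice, here is a small local sketch that plays the role of the view engine. The emit() stand-in, the sample documents, and the plain string sort are assumptions for illustration; the real engine supplies emit() itself and uses its own collation rules.

```javascript
// Rows collected by our stand-in emit(); the real engine writes them to the index.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// The map function from the slide.
const map = function (doc, meta) {
  emit(doc.username, doc.email);
};

// Hypothetical documents with their metadata.
const docs = [
  { meta: { id: "u::1" }, doc: { username: "zoe", email: "zoe@example.com" } },
  { meta: { id: "u::2" }, doc: { username: "adam", email: "adam@example.com" } },
];

// Pass every document through map(), then order rows by indexed key.
for (const { doc, meta } of docs) {
  currentId = meta.id;
  map(doc, meta);
}
rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
console.log(rows);
```

Each row keeps the id of the document that produced it, so a query on the indexed key can lead back to the full document.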
Text or Numeric Based Keys
function(doc, meta) {
  emit(doc.email, meta.id);
}
doc.email is a text key
doc.email          meta.id
[email protected]  u::1
[email protected]  u::2
[email protected]  u::3
Array Based Index Keys
function(doc, meta) {
  emit(dateToArray(doc.timestamp), 1);
}
dateToArray(doc.timestamp) is an array key
Array Based Index Keys get sorted by each element starting with first element
dateToArray(doc.timestamp)  value
[2012,10,9,18,45]           1
[2012,9,26,11,15]           1
[2012,8,13,2,12]            1
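A short sketch of element-by-element ordering for array keys. dateToArraySketch is a hypothetical stand-in written for this example (UTC, five elements to match the table above), not the engine's own dateToArray(), and the comparator handles numeric elements only.

```javascript
// Hypothetical stand-in for dateToArray(): [year, month, day, hour, minute] in UTC.
function dateToArraySketch(isoString) {
  const d = new Date(isoString);
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    d.getUTCDate(),
    d.getUTCHours(),
    d.getUTCMinutes(),
  ];
}

// Compare two numeric array keys element by element, first element first.
function compareArrayKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length; // a shorter prefix sorts before the longer key
}

const keys = [
  dateToArraySketch("2012-10-09T18:45:00Z"),
  dateToArraySketch("2012-08-13T02:12:00Z"),
  dateToArraySketch("2012-09-26T11:15:00Z"),
];
keys.sort(compareArrayKeys);
console.log(JSON.stringify(keys));
```

Because the year comes first, then month, then day, a range over a prefix such as [2012, 9] covers everything in September 2012.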
Querying Views
Beer Database Example
{ "name": "Aventinus Weizenstarkbier / Doppel Weizen Bock", "abv": 8.2, "ibu": 0, "srm": 0, "upc": 0, "type": "beer", "brewery_id": "110f1f2012", "updated": "2010-07-22 20:00:20", "description": "Dark-ruby, almost black-colored and streaked with fine top-fermenting yeast, this beer has a compact and persistent head. This is a very intense wheat doppelbock with a complex spicy chocolate-like arome with a hint of banana and raisins. On the palate, you experience a soft touch and on the tongue it is very rich and complex, though fresh with a hint of caramel. It finishes in a rich soft and lightly bitter impression.", "style": "South German-Style Weizenbock", "category": "German Ale"}
{ "id": "110f37fa30", "rev": "1-000000000", "expiration": 0, "flags": 0, "type": "json"}
The first object above is the doc, the second is its meta. doc.abv is alcohol by volume (abv), doc.brewery_id is the brewery_id (key), and meta.id is the document key.
The index definition
[Screenshot of the view editor. Callouts: emit() adds a row made of an indexed key and value(s).]
The result set: beers keyed by brewery_id
[Screenshot of the query result. Callouts: brewery_id, document key (of the beer), alcohol by volume (abv).]
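The map function on the index definition slide was a screenshot and did not survive in this transcript. Judging from the result set (beers keyed by brewery_id, with abv shown), a map function along the following lines would produce it. This is an assumed reconstruction, not the presenter's code, and the emit() harness is a local stand-in.

```javascript
// Local stand-in for the engine's emit().
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Assumed index definition: key each beer by its brewery, output its abv.
const map = function (doc, meta) {
  if (doc.type == "beer" && doc.brewery_id) {
    emit(doc.brewery_id, doc.abv);
  }
};

// A few fields from the sample beer document and its meta.
const beer = {
  name: "Aventinus Weizenstarkbier / Doppel Weizen Bock",
  abv: 8.2,
  type: "beer",
  brewery_id: "110f1f2012",
};
currentId = "110f37fa30";
map(beer, { id: currentId, type: "json" });
console.log(rows);
```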
We are reducing doc.abv with _stats
add _stats built-in reduction
Use a built-in reduce function with a group query
Find average alcohol by volume per brewery.
• add _stats built-in reduction
• set group=true & reduce=true
Group reduce (reduce by unique key)
group=true & reduce=true
Callouts on the result: number of beers by this brewery, min abv, max abv
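A rough sketch of what a _stats reduction with group=true works out for each unique key: sum, count, min, max, and sum of squares. The statsByKey helper and the brewery rows are made up for illustration; the built-in runs inside the server, not in your code.

```javascript
// Compute _stats-style figures per unique key from map output rows.
function statsByKey(rows) {
  const out = {};
  for (const { key, value } of rows) {
    const s = out[key] || (out[key] = { sum: 0, count: 0, min: value, max: value, sumsqr: 0 });
    s.sum += value;
    s.count += 1;
    s.min = Math.min(s.min, value);
    s.max = Math.max(s.max, value);
    s.sumsqr += value * value;
  }
  return out;
}

// Hypothetical map output: brewery_id as key, abv as value.
const rows = [
  { key: "brewery_a", value: 5 },
  { key: "brewery_a", value: 7 },
  { key: "brewery_b", value: 4 },
];

const stats = statsByKey(rows);
// The average is not stored directly; derive it from sum and count.
const avgA = stats.brewery_a.sum / stats.brewery_a.count;
console.log(stats.brewery_a, avgA);
```

The count is the number of beers for that brewery, and the average abv per brewery is sum divided by count.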
Using Incremental Map-Reduce
Use Case Example
reddalyzer.com
Real-Time Analysis of Reddit using Couchbase & Clojure
Quick Demo
Sample Reddit Post - Document
{ "over_18": false, "banned_by": null, "is_self": false, "link_flair_text": null, "hidden": false, "edited": false, "kind": "link", "subreddit_id": "t5_2qh55", "downs": 5, "domain": "ibelieveicanfry.com", "selftext": "", "approved_by": null, "score": 5, "author": "ibelieveicanfry", "name": "t3_yph1p", "num_comments": 0, "selftext_html": null, "link_flair_css_class": null, "likes": null, "media_embed": { }, "media": null, "title": "I don't buy the bottled Thai Sweet Chili Sauce anymore...", "thumbnail": "", "permalink": "/r/food/comments/yph1p/i_dont_buy_the_bottled_thai_sweet_chili_sauce/", "url": "http://www.ibelieveicanfry.com/2012/08/thai-sweet-chili-sauce.html", "created": 1345745189, "num_reports": null, "saved": false, "subreddit": "food", "ups": 10, "created_utc": 1345745189, "author_flair_css_class": null, "id": "yph1p", "author_flair_text": null, "clicked": false}
Highlighted fields:
"score": 5
"subreddit": "food"
"created_utc": 1345745189
"kind": "link"
Map Subreddits by Name & Day
function (doc, meta) {
  // Skip documents that aren't JSON
  if (meta.type == "json") {
    // Skip docs that aren't links
    if (doc.kind == "link") {
      var dt = new Date(doc.created_utc * 1000);
      // Get day of week, but start week on Saturday, not Sunday, so that
      // we can pull out the weekend easily.
      var ssday = dt.getUTCDay() + 1;
      if (ssday == 7) ssday = 0;
      emit([doc.subreddit, ssday], {hour: dt.getUTCHours(), score: doc.score});
    }
  }
}
• ensure meta.type == "json"
• ensure doc.kind == "link"
• convert doc.created_utc to Date Object
• calculate day of week
• emit (create) row
• key [doc.subreddit, ssday]: order by doc.subreddit then order by day of week
• value {hour, score}: output hour of day, output karma score
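To check the day-of-week shift, the map function can be run locally against the sample post. The emit() stand-in and the second, non-link document are assumptions added for this sketch; only the fields the function reads are copied from the sample post.

```javascript
// Local stand-in for the engine's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// The map function from the slide.
const map = function (doc, meta) {
  if (meta.type == "json") {
    if (doc.kind == "link") {
      var dt = new Date(doc.created_utc * 1000);
      // Shift so that Saturday = 0 and Sunday = 1.
      var ssday = dt.getUTCDay() + 1;
      if (ssday == 7) ssday = 0;
      emit([doc.subreddit, ssday], { hour: dt.getUTCHours(), score: doc.score });
    }
  }
};

// Fields from the sample Reddit post.
const post = { kind: "link", subreddit: "food", score: 5, created_utc: 1345745189 };
map(post, { id: "yph1p", type: "json" });

// Hypothetical non-link document: produces no row.
map({ kind: "comment" }, { id: "other", type: "json" });

console.log(JSON.stringify(rows));
```

The timestamp 1345745189 falls on a Thursday at 18:06 UTC, so getUTCDay() returns 4, the shifted day is 5, and the hour is 18.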
Map Output
emit([doc.subreddit, ssday], {hour: dt.getUTCHours(), score: doc.score});
{"id":"zx4sc","key":["funny",0],"value":{"hour":9,"score":0}},
{"id":"zxak2","key":["funny",0],"value":{"hour":13,"score":1}},
{"id":"ytw3t","key":["funny",1],"value":{"hour":0,"score":938}},
{"id":"yv3uf","key":["funny",1],"value":{"hour":19,"score":2508}},
......
"key" is the indexed key; "value" holds the output value(s).
Reduce Map
function (keys, values, rereduce) {
  var out = {freqs: [], score: []};
  // Prefill the arrays with zeroes
  for (var i = 0; i < 24; i++) {
    out.freqs[i] = 0;
    out.score[i] = 0;
  }
  for (var v in values) {
    if (!rereduce) {
      // Values are the output of map
      out.freqs[values[v].hour] += 1;
      out.score[values[v].hour] += values[v].score;
    } else {
      // Values are the output of reduce
      // Combine the arrays
      for (var h in values[v].freqs) {
        out.freqs[h] += values[v].freqs[h];
        out.score[h] += values[v].score[h];
      }
    }
  }
  return out;
}
• For every row increment post count and post score (karma)
• Array of Results (one entry per hour of day, 0 to 23)
http://localhost:8092/reddalyzr/_design/reddit/_view/posthours?stale=update_after
{
"freqs": [ 178344, 174476, 171569, 161836, 146411, 120881, 94139, 75880, 62617, 56553, 57811, 70185, 88880, 114252, 137301, 156750, 166376, 172562, 177094, 182093, 180485, 180434, 178706, 176525 ],
"score": [ 2856922, 2688783, 2392233, 1954973, 1623642, 1355241, 1187087, 1061364, 1009152, 1165220, 1506009, 2207945, 3081796, 3868605, 4441859, 4633668, 4200795, 4291777, 3986492, 3757385, 3420142, 3032258, 3029148, 2975291 ] }
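The rereduce branch matters because the function may be handed earlier reduce outputs instead of raw map values. A quick local check, using the four values from the Map Output rows, shows that reducing everything at once and combining two partial reductions give the same answer. How the server actually batches values is not shown in the slides; the split below is arbitrary.

```javascript
// The reduce function from the slide.
function reduce(keys, values, rereduce) {
  var out = { freqs: [], score: [] };
  for (var i = 0; i < 24; i++) {
    out.freqs[i] = 0;
    out.score[i] = 0;
  }
  for (var v in values) {
    if (!rereduce) {
      // Values are the output of map
      out.freqs[values[v].hour] += 1;
      out.score[values[v].hour] += values[v].score;
    } else {
      // Values are the output of reduce
      for (var h in values[v].freqs) {
        out.freqs[h] += values[v].freqs[h];
        out.score[h] += values[v].score[h];
      }
    }
  }
  return out;
}

// Map values from the Map Output rows.
const mapped = [
  { hour: 9, score: 0 },
  { hour: 13, score: 1 },
  { hour: 0, score: 938 },
  { hour: 19, score: 2508 },
];

// One pass over all values.
const all = reduce(null, mapped, false);

// Two partial reductions, then a rereduce over their outputs.
const partA = reduce(null, mapped.slice(0, 2), false);
const partB = reduce(null, mapped.slice(2), false);
const combined = reduce(null, [partA, partB], true);

console.log(JSON.stringify(all) === JSON.stringify(combined));
```

A reduce function written this way can fold a partial result for new rows into an existing result rather than starting over from the raw data.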
View Updating
Couchbase Bucket
  Design Document 1: View, View, View
  Design Document 2: View, View
• Updates every 3 seconds or 5000 document operations
• This is a Configurable Setting
• Can also be Triggered to Update by client queries by using the stale=false parameter
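A small sketch of building view query URLs with the stale parameter. The base URL is the one from the demo; the viewUrl helper is hypothetical. As I understand the parameter, stale=false asks for the index to be updated before results are returned, while stale=update_after returns what is already indexed and triggers an update afterwards.

```javascript
// Base view URL from the demo (host, bucket, design document, and view name).
const base = "http://localhost:8092/reddalyzr/_design/reddit/_view/posthours";

// Hypothetical helper: attach the stale setting and any extra query parameters.
function viewUrl(stale, extra) {
  const url = new URL(base);
  url.searchParams.set("stale", stale);
  for (const [name, value] of Object.entries(extra || {})) {
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}

console.log(viewUrl("false"));
console.log(viewUrl("update_after", { group: true }));
```

Use stale=false only where a query must see the latest writes, since it waits for indexing to catch up.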
Why is it Incremental?
View Indexes are Append Only B+ Trees, so new data is just added to them, and they are compacted and optimized automatically.
Views are only Re-Indexed if you change their definition and republish them. The original index stays available until the new redefined index completes indexing.