real-time search at yammer - by aleksandrovsky boris

Realtime revolution at work
REAL-TIME SEARCH AT YAMMER
May 25, 2011
By Boris Aleksandrovsky, Yammer, Inc.
http://www.linkedin.com/in/baleksan

Upload: lucenerevolution

Post on 11-May-2015


DESCRIPTION

See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011

This talk focuses on the architecture, scalability concerns, performance bottlenecks, operational characteristics and lessons learned while designing and implementing Yammer's distributed real-time search system. Yammer is an enterprise social network SaaS offering with over 100,000 networks (including 85% of the Fortune 100) and nearly 2 million users. The search system we developed scales well up to 1B messages and serves as a foundation for the knowledge base analysis services Yammer is developing.

TRANSCRIPT

Page 1: Real-time Search at Yammer - By Aleksandrovsky Boris

Realtime revolution at work

REAL-TIME SEARCH AT YAMMER

May 25, 2011

By Boris Aleksandrovsky

Yammer, Inc.
http://www.linkedin.com/in/baleksan

Page 2: Real-time Search at Yammer - By Aleksandrovsky Boris

• Communication is hard, search is harder
• What me grammar?
• Private language
• Conversational language
• Time compressed
• Transient
• Poorly organized
• Authority is suspect
• Social pressures

Page 3: Real-time Search at Yammer - By Aleksandrovsky Boris


Page 4: Real-time Search at Yammer - By Aleksandrovsky Boris

Challenges - From information to knowledge

[Diagram labels: Information, Facts, Knowledge; Attention, Engagement, Retention; Messages, Metadata, Personalized Search]

Page 5: Real-time Search at Yammer - By Aleksandrovsky Boris

Agenda

• Background
• Why search?
• Indexing
• Search
• Tools and methodologies
• Lessons learned
• Future
• Q&A

Page 6: Real-time Search at Yammer - By Aleksandrovsky Boris

Yammer: Putting Social Media to Work

Knowledge Management: Document-oriented
Enterprise Collaboration: Outcome-focused
Social Media: People-centric

Yammer makes work
• Real-time, Social, Mobile
• Collaborative, Contextual
• More Human!

Similar to:
• Facebook
• Twitter
• Wikis
• Groups

Page 7: Real-time Search at Yammer - By Aleksandrovsky Boris

Yammer: The Enterprise Social Network

• Messaging and Feeds
• Direct Messaging
• User Profiles
• Company Directory
• Groups (Internal)
• Communities (External)
• File Sharing
• Applications
• Integrations
• Web, Desktop, Mobile, Tablet
• Translations
• Network Consultation and Support

Easy. Shared. Searchable. Real-time. Where your company’s knowledge lives.

Page 8: Real-time Search at Yammer - By Aleksandrovsky Boris


100,000+ companies, including 85% of the Fortune 500 – and growing.

Page 9: Real-time Search at Yammer - By Aleksandrovsky Boris

What do you discuss at work, and with whom?

What do our employees think of our 401K program? Is everybody saving?
What’s the latest with the XYZ account?
What are our recommendations for financial and regulatory reform given the latest news about…?
What will be discussed at our Quarterly Sales Kickoff?
Where can I find out more about customer events here at the ABC conference? Who’s free to meet up?
How can my team better prepare for our next product release?
Who has any fresh ideas for…

• Who do you need to communicate with, across the company?
• How often are the same questions asked?
• Who has the answers? Who has new ideas? Who can help?

Who will I be working with on this new project?

Page 10: Real-time Search at Yammer - By Aleksandrovsky Boris


Page 11: Real-time Search at Yammer - By Aleksandrovsky Boris

Search use case - Transient Awareness

• Reverse-chronological
• Simple queries
• Facet
  • Date
  • Sender
  • Group

Page 12: Real-time Search at Yammer - By Aleksandrovsky Boris

Search use case - Knowledge Exploration

• Complicated relevance story
  • tf/idf
  • popularity
  • engagement
  • social distance
• Complicated queries
• Facet
  • Date
  • Sender
  • Group
  • Object type

Page 13: Real-time Search at Yammer - By Aleksandrovsky Boris

Challenges for Yammer’s search engine

• More knowledge is generated in realtime
  • Availability latency < 1 sec
  • Not always well formed
• Complicated relevance story
  • experts and their reputation
  • popularity
  • social graph
  • tagging/topics
  • engagement signals
  • timeliness
  • location

Page 14: Real-time Search at Yammer - By Aleksandrovsky Boris

Team

• 2 engineers
• 8 man months
• Lots of fun

Page 15: Real-time Search at Yammer - By Aleksandrovsky Boris


Indexing

• DB to replica


Page 16: Real-time Search at Yammer - By Aleksandrovsky Boris


Page 17: Real-time Search at Yammer - By Aleksandrovsky Boris

Replication

• Independent near-replicas based on a single distributed source of truth
• Can (will) get out of sync
• Automatic monitoring of replication quality
  • Are replicas out of sync with other replicas?
    • number of docs
    • alert > X
  • Are replicas out of sync with the DB?
    • statistical sample of docs
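The two replication checks on this slide can be sketched roughly as follows. This is a minimal illustration, not the monitoring code used at Yammer: the function names, the `max_gap` parameter (standing in for the slide's X) and the `in_index` lookup callback are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, List


def replicas_out_of_sync(doc_counts: Dict[str, int], max_gap: int) -> bool:
    """Replica-vs-replica check: alert when doc counts differ by more than max_gap."""
    counts = list(doc_counts.values())
    if not counts:
        return False
    return max(counts) - min(counts) > max_gap


def sample_mismatch_rate(db_ids: List[int],
                         in_index: Callable[[int], bool],
                         sample_size: int,
                         seed: int = 0) -> float:
    """Replica-vs-DB check: fraction of a random sample of DB docs missing from the index."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    sample = rng.sample(db_ids, min(sample_size, len(db_ids)))
    if not sample:
        return 0.0
    missing = sum(1 for doc_id in sample if not in_index(doc_id))
    return missing / len(sample)
```

Comparing counts is cheap enough to run continuously, while the sampled comparison against the DB trades completeness for a bounded cost per run.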

Page 18: Real-time Search at Yammer - By Aleksandrovsky Boris


Indexing

• In-replica to index


Page 19: Real-time Search at Yammer - By Aleksandrovsky Boris


30s

Page 20: Real-time Search at Yammer - By Aleksandrovsky Boris

Why is it hard?

• No timeliness guarantee
• Fragmentation
• Out-of-order deliveries
• Index dependencies
  • Need to denormalize the information
• Need to build for network partition tolerance and redundancy
• But
  • Eventual consistency
  • Eventual delivery

Page 21: Real-time Search at Yammer - By Aleksandrovsky Boris

How do we cope?

• Out-of-order delivery is the source of (most) evil
• ?
  • A) Assure in-order delivery
    • buffer and wait
    • degrades performance, availability and timeliness, and is only very eventually consistent
  • B) Minimize probability and ignore
    • timestamp precision
    • clock skew
  • C) Arbitrate
    • timestamp / vector clocks
    • semantics
    • need to index lifecycle events
• Need to build for network partition tolerance and redundancy
• But
  • Consistency guarantee
  • Eventual delivery
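Option C mentions vector clocks as one way to arbitrate between events. The slide does not say which mechanism the production system uses, so the following is only a generic sketch of the standard vector clock comparison; the `VectorClock` alias and the `compare` function are names chosen for this example.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node name -> logical counter (illustrative type alias)


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal' or 'concurrent' for clock a relative to clock b."""
    nodes = set(a) | set(b)
    # A missing entry counts as 0: that node has not been observed yet.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates; needs a semantic tie-break
```

A "concurrent" result is where the slide's "semantics" bullet comes in: the clocks alone cannot order the two events, so the indexer has to apply a domain rule.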

Page 22: Real-time Search at Yammer - By Aleksandrovsky Boris

Delete-update race

• [create Message “hello” id=5 ts=12:34:39]
• [delete Message “hello there” id=5 ts=12:45:01]
• [modify Message “hello there” id=5 ts=12:45:01]

id  timestamp  tombstone
5   12:34:39   no
5   12:45:01   yes
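One way to get the outcome in the table regardless of arrival order is to keep a tombstone per document and arbitrate on timestamps. The sketch below is an assumption-laden illustration rather than the talk's implementation: `Entry` and `apply_event` are hypothetical names, timestamps are compared as zero-padded "HH:MM:SS" strings within a single day, and the rule that a delete wins a timestamp tie is a choice made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    timestamp: str            # zero-padded "HH:MM:SS", so string order is time order
    text: Optional[str]
    tombstone: bool = False


def apply_event(index: Dict[int, Entry], kind: str, doc_id: int,
                timestamp: str, text: Optional[str] = None) -> None:
    """Apply a create/modify/delete event; stale events are ignored."""
    current = index.get(doc_id)
    if current is not None and current.tombstone:
        return  # a deleted document is never resurrected
    if kind == "delete":
        # Assumption for this sketch: a delete wins a timestamp tie.
        if current is None or timestamp >= current.timestamp:
            index[doc_id] = Entry(timestamp, None, tombstone=True)
        return
    # create/modify: only strictly newer events replace the stored state
    if current is None or timestamp > current.timestamp:
        index[doc_id] = Entry(timestamp, text)
```

Replaying the three events from the slide in any order leaves id=5 tombstoned at 12:45:01. Two modifies with the same timestamp are still resolved by arrival order here, which is why timestamp precision matters.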

Page 23: Real-time Search at Yammer - By Aleksandrovsky Boris

Multiple update race

• [create Message “hello” id=5 ts=12:34:39]
• [modify Message “hello there now” id=5 ts=12:45:01]
• [modify Message “hello there” id=5 ts=12:45:01]

id  timestamp  text
5   12:34:39   hello
5   12:45:01   hello there now

Page 24: Real-time Search at Yammer - By Aleksandrovsky Boris

Dupes

• [create Message “hello” id=5 ts=12:34:39]
• [like Message id=5 userId=3 ts=12:45:01]
• [like Message id=5 userId=3 ts=12:45:02]
• [unlike Message id=5 userId=3 ts=12:45:04]

id  timestamp  numLikes
5   12:34:39   0
5   12:45:01   1
5   12:45:02   1
5   12:45:04   0
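The numLikes column stays at 1 after the duplicate like, which is what happens if likes are stored as a set of user ids rather than as a counter. A minimal sketch of that idea follows; the `apply_like_event` function and the `likers` mapping are hypothetical names, and the talk does not state how the real index stores likes.

```python
from typing import Dict, Set


def apply_like_event(likers: Dict[int, Set[int]], kind: str,
                     message_id: int, user_id: int) -> int:
    """Apply a like/unlike event and return the message's current like count."""
    users = likers.setdefault(message_id, set())
    if kind == "like":
        users.add(user_id)        # adding the same user twice is a no-op
    elif kind == "unlike":
        users.discard(user_id)    # removing an absent user is a no-op
    return len(users)
```

This makes duplicate deliveries harmless, but it does not by itself handle an unlike that arrives before its like; that case still needs timestamp arbitration.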

Page 25: Real-time Search at Yammer - By Aleksandrovsky Boris


Thread example


Page 26: Real-time Search at Yammer - By Aleksandrovsky Boris

Zoie

• Realtime indexing system
• Open sourced by LinkedIn
• Used by LinkedIn in production for about 3 years
• Deployed at a dozen or so locations
• Thanks Xiaoyang Gu, Yasuhiro Matsuda, John Wang and Lei Wang

Page 27: Real-time Search at Yammer - By Aleksandrovsky Boris

Zoie

• Push events into buffer and the transaction log
• Push buffer into Zoie
• When Zoie commits, transaction log is truncated.
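The buffer and transaction log flow can be sketched as below. This does not use the Zoie API: `BufferedIndexer` is a hypothetical class, the `commit` callback stands in for handing a batch to Zoie and waiting for its commit, and the in-memory list stands in for a log that would live on disk.

```python
from typing import Callable, List


class BufferedIndexer:
    """Sketch: log every event, batch it, truncate the log only after a commit."""

    def __init__(self, commit: Callable[[List[str]], None], batch_size: int = 3):
        self.commit = commit                    # hypothetical hook into the indexer
        self.batch_size = batch_size
        self.buffer: List[str] = []
        self.transaction_log: List[str] = []    # stand-in for an on-disk log

    def push(self, event: str) -> None:
        self.transaction_log.append(event)      # record first, so a crash loses nothing
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        batch = list(self.buffer)
        self.commit(batch)                      # if this raises, log and buffer are kept
        del self.transaction_log[:len(batch)]   # truncate only what was committed
        self.buffer.clear()

    def recover(self) -> List[str]:
        """Events that were accepted but not yet committed; replay these on restart."""
        return list(self.transaction_log)
```

Because replay after a crash can deliver an event twice, this design relies on the indexing step being idempotent.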

Page 28: Real-time Search at Yammer - By Aleksandrovsky Boris

Indexing HA

• Cluster queue systems
  • Round-robin of Rabbits introduces further out-of-order problems.
• Transaction log
  • Between RabbitMQ dequeue and Zoie disk commit

Page 29: Real-time Search at Yammer - By Aleksandrovsky Boris

Dual indexing

• Primary for serving out
• Secondary for reindexing
• Verify secondary index consistency
• foreach replica do
  • shutdown
  • mv secondary to primary
  • restart
• Availability should not be affected except for slight chance of system failure
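The per-replica loop on this slide (shutdown, mv secondary to primary, restart) might look like the following sketch. The directory names `primary`, `secondary` and `primary.old`, and the `stop` and `start` callbacks, are assumptions for illustration; the slide does not describe the actual layout or process control.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable


def swap_secondary_into_primary(replica_dir: Path) -> None:
    """Replace replica_dir/primary with replica_dir/secondary (hypothetical layout)."""
    primary = replica_dir / "primary"
    secondary = replica_dir / "secondary"
    if not secondary.is_dir():
        raise FileNotFoundError(f"no secondary index in {replica_dir}")
    retired = replica_dir / "primary.old"
    if retired.exists():
        shutil.rmtree(retired)
    if primary.exists():
        primary.rename(retired)   # keep the old index until the new one is in place
    secondary.rename(primary)
    if retired.exists():
        shutil.rmtree(retired)


def rolling_swap(replica_dirs: Iterable[Path],
                 stop: Callable[[Path], None],
                 start: Callable[[Path], None]) -> None:
    """Swap one replica at a time so the remaining replicas keep serving queries."""
    for replica_dir in replica_dirs:
        stop(replica_dir)
        swap_secondary_into_primary(replica_dir)
        start(replica_dir)
```

Doing the swap one replica at a time is what keeps availability unaffected, provided the other replicas can absorb the load while one is down.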

Page 30: Real-time Search at Yammer - By Aleksandrovsky Boris

Index consistency problems

• Detect
  • integrity check against the “source of truth”
• Reindex
  • gaps
  • whole
  • reindex into secondary, swap with primary
• Repair
  • patch in place
  • run on restart

Page 31: Real-time Search at Yammer - By Aleksandrovsky Boris


Search

• <insert animated architecture slide>


Page 32: Real-time Search at Yammer - By Aleksandrovsky Boris

Goal

• 50/50-500/100 per partition
  • 50M docs
  • 50 msec P75 - 500 msec P99
  • 100 qps
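The latency part of the goal (50 msec at P75, 500 msec at P99) can be checked against recorded query times with a small helper. The sketch assumes the nearest-rank definition of a percentile; the talk does not say which definition its metrics use, and the function names are made up for the example.

```python
import math
from typing import List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def meets_latency_goal(latencies_ms: List[float],
                       p75_limit_ms: float = 50.0,
                       p99_limit_ms: float = 500.0) -> bool:
    """True when both latency targets from the slide are met."""
    return (percentile(latencies_ms, 75) <= p75_limit_ms
            and percentile(latencies_ms, 99) <= p99_limit_ms)
```

Tracking two percentiles rather than a mean keeps a small number of slow queries from being hidden by many fast ones.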

Page 34: Real-time Search at Yammer - By Aleksandrovsky Boris

Payload

• Payload is usually a small JSON object
• For security reasons only ids and scores are sent out
• One page (usually 10 items) x 6 index types.

Page 35: Real-time Search at Yammer - By Aleksandrovsky Boris


Payload


Page 36: Real-time Search at Yammer - By Aleksandrovsky Boris

Web Server

• Jersey over Jetty
• http://jetty.codehaus.org/jetty/
  • Custom configuration
  • tuned to the required 100 qps
  • generally impeccable, occasional lock contention
• http://jsr311.java.net/
  • Annotation driven
  • Much easier to test

Page 37: Real-time Search at Yammer - By Aleksandrovsky Boris

Search master

• More like a router
• Knows about partitioning scheme
• Performs load normalization
  • Call all, take the first
    • Possible to use multicast
  • Round Robin
    • switch to for scale
  • DLB (Least busy)
• Maintains primary SLA metrics
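Of the three load normalization strategies listed, "least busy" is the one that needs state in the master. A rough sketch, assuming the master tracks in-flight requests per replica (the `LeastBusyRouter` class and its method names are hypothetical, not taken from the talk):

```python
from typing import Dict, List


class LeastBusyRouter:
    """Route each query to the replica with the fewest requests in flight."""

    def __init__(self, replicas: List[str]):
        if not replicas:
            raise ValueError("need at least one replica")
        self.in_flight: Dict[str, int] = {name: 0 for name in replicas}

    def acquire(self) -> str:
        """Pick the least busy replica; ties go to the replica listed first."""
        name = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[name] += 1
        return name

    def release(self, name: str) -> None:
        """Call when the replica has answered (or failed)."""
        self.in_flight[name] -= 1
```

A caller would wrap each query in `acquire()` and `release()`, ideally in a try/finally so that a failed request still frees its slot.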

Page 38: Real-time Search at Yammer - By Aleksandrovsky Boris

Partitioning

• Simple Jenkins 64bit hash of networkId
• 2 level hash to split large partitions
• Exception list to split large partition
• Limitation: Cannot partition inside a single network
• Repartitioning story is expensive
• Consistent hashing?
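The basic scheme, a 64-bit hash of networkId plus an exception list for oversized networks, can be sketched as follows. Note the substitution: the slide names a Jenkins 64-bit hash, while this example uses BLAKE2b truncated to 8 bytes from Python's standard library as a stand-in. The function names and the shape of the `exceptions` mapping are assumptions, and the two-level hash is not shown because the slide does not describe how it works.

```python
import hashlib
from typing import Dict, Optional


def hash64(value: str) -> int:
    """64-bit hash. BLAKE2b is a stand-in here, NOT the Jenkins hash from the slide."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def partition_for(network_id: int, num_partitions: int,
                  exceptions: Optional[Dict[int, int]] = None) -> int:
    """Map a network to a partition; an exception entry overrides the hash."""
    if exceptions and network_id in exceptions:
        return exceptions[network_id]   # e.g. a large network pinned to its own partition
    return hash64(str(network_id)) % num_partitions
```

With plain modulo placement, changing `num_partitions` moves most networks to a different partition, which is one reason the repartitioning story is expensive and consistent hashing is raised as a question.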

Page 39: Real-time Search at Yammer - By Aleksandrovsky Boris

Testing

• Indexing
  • Idempotent
  • Out-of-order delivery
  • Duplicate and incomplete docs tolerance
  • 10K docs delivered in random order with X% of dupes and Y% incomplete records
• Search
  • Small manual index by recording event
  • Unit style tests (testng) with Asserts
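The indexing test described here (events delivered in random order with a share of duplicates, same final index expected) can be sketched against a toy indexer. Everything below is illustrative: the `Event` tuple, the version-based indexer and the helper names are assumptions, the original tests were written with testng in Java, and incomplete records (the Y% case) are not covered.

```python
import random
from typing import Dict, List, Tuple

Event = Tuple[int, int, str]   # (doc_id, version, text); an illustrative event shape


def index_event(index: Dict[int, Tuple[int, str]], event: Event) -> None:
    """Keep only the highest version seen per document, so replays are harmless."""
    doc_id, version, text = event
    current = index.get(doc_id)
    if current is None or version > current[0]:
        index[doc_id] = (version, text)


def build_index(events: List[Event]) -> Dict[int, Tuple[int, str]]:
    index: Dict[int, Tuple[int, str]] = {}
    for event in events:
        index_event(index, event)
    return index


def scrambled(events: List[Event], dupe_fraction: float, seed: int) -> List[Event]:
    """Return the events in random order with a fraction of them delivered twice."""
    rng = random.Random(seed)
    dupes = rng.sample(events, int(len(events) * dupe_fraction))
    noisy = events + dupes
    rng.shuffle(noisy)
    return noisy
```

A test then builds the index once from the in-order stream and asserts that `build_index(scrambled(events, 0.1, seed))` produces the same result for several seeds.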

Page 40: Real-time Search at Yammer - By Aleksandrovsky Boris

Production

• Measure
• Hardware is cheap, people are not
• People require more maintenance
• Have enough redundancy

Page 41: Real-time Search at Yammer - By Aleksandrovsky Boris


Metrics

• JVM, Queue, Logging and Configuration

Page 42: Real-time Search at Yammer - By Aleksandrovsky Boris


Metrics

• Gauges

Page 43: Real-time Search at Yammer - By Aleksandrovsky Boris


Metrics

• Meters

Page 44: Real-time Search at Yammer - By Aleksandrovsky Boris


Metrics

• Timers

Page 45: Real-time Search at Yammer - By Aleksandrovsky Boris


Metrics

• https://github.com/codahale/metrics

Page 46: Real-time Search at Yammer - By Aleksandrovsky Boris

Lessons

• Do not underestimate your data model
• Tradeoff between consistency, RT availability and correctness
• Measure
• Flexible partitioning scheme
• Data recovery plan

Page 47: Real-time Search at Yammer - By Aleksandrovsky Boris

Future

• Dynamic routing
• Zookeeper
• Partition rebalancing
• Multiple sub-partitions with different SLAs
• Work on relevancy
• Multiple languages
• Document parsing
• External data
• Scala

Page 48: Real-time Search at Yammer - By Aleksandrovsky Boris


Q&A Session: What’s On Your Mind?