migration best practices: from rdbms to cassandra without a hitch
TRANSCRIPT
#Cassandra @doanduyhai
Who am I ?
2
DuyHai Doan
Achilles Cassandra Technical Advocate @ Datastax
Former Java Developer @ Libon
#Cassandra @doanduyhai
Agenda • Libon context
• Migration strategy
• Business code migration
• Data Modeling
• Take Away
3
#Cassandra @doanduyhai
What is Libon ? • Messaging app
• VOIP (out)
• Custom voicemail & greetings
• SMS/chat/file transfer
• Contacts matching
5
#Cassandra @doanduyhai
Project Context • Application grew over the years
• Already using Cassandra to handle events
• messaging / file sharing / SMS / notifications
• Cassandra R/W latencies ≈ 0,4 ms
• server response time under 10 ms
11
#Cassandra @doanduyhai
Project Context • About contacts …
• stored as relational model in RDBMS (Oracle)
13
#Cassandra @doanduyhai
Project Context • About contacts …
• stored as relational model in RDBMS (Oracle)
• 1 user ≈ 300 contacts
14
#Cassandra @doanduyhai
Project Context • About contacts …
• stored as relational model in RDBMS (Oracle)
• 1 user ≈ 300 contacts
• with millions users ☞ billions of contacts to handle
15
#Cassandra @doanduyhai
Project Context • About contacts …
• stored as relational model in RDBMS (Oracle)
• 1 user ≈ 300 contacts
• with millions users ☞ billions of contacts to handle
• query latency unpredictable
16
#Cassandra @doanduyhai
Fixing the problem • Tune the RDBMS
• indices
• partitioning
• less joins, simplified relational model
21
#Cassandra @doanduyhai
Fixing the problem • Tune the RDBMS
• indices
• partitioning
• less joins, simplified relational model
• hardware capacity increased
22
#Cassandra @doanduyhai
Fixing the problem • Tune the RDBMS
• indices
• partitioning
• less joins, simplified relational model
• hardware capacity increased
23
That worked
#Cassandra @doanduyhai
Fixing the problem • Tune the RDBMS
• indices
• partitioning
• less joins, simplified relational model
• hardware capacity increased
24
That worked but …
#Cassandra @doanduyhai
Next Challenges • High Availability (DB failure, site failure …)
• Predictable performance at scale
28
#Cassandra @doanduyhai
Next Challenges • High Availability (DB failure, site failure …)
• Predictable performance at scale
• Going to multi data-centers
29
#Cassandra @doanduyhai
Next Challenges • High Availability (DB failure, site failure …)
• Predictable performance at scale
• Going to multi data-centers
☞ Cassandra, what else ?
30
#Cassandra @doanduyhai
Objectives • No downtime
• No concurrency corner-cases
• Safe rollback possible
34
#Cassandra @doanduyhai
Objectives • No downtime
• No concurrency corner-cases
• Safe rollback possible
• Replay-ability & resume-ability
35
#Cassandra @doanduyhai
Strategy • 4 phases
• Write contacts to both data stores
• Old contacts migration
38
#Cassandra @doanduyhai
Strategy • 4 phases
• Write contacts to both data stores
• Old contacts migration
• Switch to Cassandra (but keep RDBMS in case of…)
39
#Cassandra @doanduyhai
Strategy • 4 phases
• Write contacts to both data stores
• Old contacts migration
• Switch to Cassandra (but keep RDBMS in case of…)
• Remove the RDBMS code
40
#Cassandra @doanduyhai
Migration Phase 1
41
Back end server
· ·
·
SQL SQL SQL
C*
C*
C* C*
C*
Write
contactUUID
contactId … contactUUID 129363 123e4567-
e89b-12d3… 834849
contacId(long) + contactUUID
#Cassandra @doanduyhai
Migration Phase 2 • On live production, migrate old contacts
43
SQL SQL SQL
C*
C*
C* C*
C*
For each batch of users SELECT * FROM contacts WHERE user_id = … AND contact_uuid IS NULL
Old contacts created before phase 1
#Cassandra @doanduyhai
Migration Phase 2 • On live production, migrate old contacts
44
SQL SQL SQL
C*
C*
C* C*
C*
For each batch of users SELECT * FROM contacts WHERE user_id = … AND contact_uuid IS NULL
Logged batches of INSERT INTO contacts(..) VALUES(…) USING TIMESTAMP now() - 1 week
Old contacts created before phase 1
#Cassandra @doanduyhai
Migration Phase 2 • During data migration …
• … concurrent writes from the migration batch …
47
#Cassandra @doanduyhai
Migration Phase 2 • During data migration …
• … concurrent writes from the migration batch …
• … and updates from production for the same contact
48
#Cassandra @doanduyhai
Migration Phase 2
49
contact_uuid name (now -1 week) … name (now) …
Johny … Johnny …
Insert from batch (to the past)
Update from production
#Cassandra @doanduyhai
Migration Phase 2
50
contact_uuid name (now -1 week) … name (now) …
Johny … Johnny …
Future reads pick the most up-to-date value
#Cassandra @doanduyhai
Last Write Win in action
51
Case 1 Case 2
Batchpast(Johny) t1
Prodnow(Johnny) t2
t3 Read(Johnny)
Batchpast(Johny)
t1 Prodnow(Johnny)
t2
t3 Read(Johnny)
#Cassandra @doanduyhai
Migration Phase 2
52
"Write to the Past… to save the Future"
Libon – 2014/10/08
#Cassandra @doanduyhai
Migration Phase 4
54
Back end server
· ·
·
SQL SQL SQL
C*
C*
C* C*
C*
Write
❌
#Cassandra @doanduyhai
Code Inventory • Written for RDBMS
• Lots of joins (no surprise)
• Designed around transactions
58
#Cassandra @doanduyhai
Code Inventory • Written for RDBMS
• Lots of joins (no surprise)
• Designed around transactions
• Spring @Transactional everywhere
59
#Cassandra @doanduyhai
Code Inventory cont. • Entities go through Services & Repositories
60
Repositories
Services
ContactEntity
#Cassandra @doanduyhai
Code Inventory cont. • Hibernate is auto-magic
• lazy loading
• 1st level cache
• N+1 select
62
Repositories
Services
ContactEntity
#Cassandra @doanduyhai
Which options ? • Throw existing code …
• … and re-design from scratch for Cassandra
63
#Cassandra @doanduyhai
Which options ? • Throw existing code …
• … and re-design from scratch for Cassandra
64
No way !
#Cassandra @doanduyhai
Code Quality • Existing business code has…
• … ≈ 3500 unit tests
• and ≈600+ integration tests
66
#Cassandra @doanduyhai
Code Quality
67
"The code coverage is one of your most
valuable technical asset" Libon – since beginning
#Cassandra @doanduyhai
Repositories
Services
Refactoring Strategy
68
ContactMatchingService ContactService ContactSync
ContactEntity
n 1 n n
#Cassandra @doanduyhai
Repositories
Services
Refactoring Strategy
69
ContactMatchingService ContactService
ContactNoSQLEntity
ContactSync
ContactEntity
n 1 n n
Proxy
#Cassandra @doanduyhai
Repositories
Services
Refactoring Strategy
70
ContactMatchingService ContactService
ContactNoSQLEntity
ContactSync
ContactEntity
n 1 n n
Denorm2 … DenormN Denorm1
Proxy
#Cassandra @doanduyhai
Refactoring Strategy • Use CQRS
• ContactReadRepository
• ContactWriteRepository
• ContactUpdateRepository
• ContactDeleteRepository
71
#Cassandra @doanduyhai
Refactoring Strategy • ContactReadRepository
• direct sequential read
• no joins
• 1 read ≈ 1 SELECT
72
#Cassandra @doanduyhai
Refactoring Strategy • ContactWriteRepository
• write to all denormalized tables
• using CQL logged batches
• use TTLs
73
#Cassandra @doanduyhai
Refactoring Strategy • ContactUpdateRepository
• read-before-write most of the time 😟
• rare updates ☞ acceptable perf penalty
74
#Cassandra @doanduyhai
Outcome • 5 months of 2 men work
• Many iterations to fix bugs (thanks to IT)
77
#Cassandra @doanduyhai
Outcome • 5 months of 2 men work
• Many iterations to fix bugs (thanks to IT)
• Lots of performance benchmarks using Gatling
78
#Cassandra @doanduyhai
Outcome • 5 months of 2 men work
• Many iterations to fix bugs (thanks to IT)
• Lots of performance benchmarks using Gatling
☞ data model & code validation
80
#Cassandra @doanduyhai
Outcome • 5 months of 2 men work
• Many iterations to fix bugs (thanks to IT)
• Lots of performance benchmarks using Gatling
☞ data model & code validation
• … we are almost there for production
81
#Cassandra @doanduyhai
Denormalization, the good • Support fast reads
• 1 read ≈ 1 SELECT
• Worthy because mostly read, few updates
83
#Cassandra @doanduyhai
Denormalization, the bad • Updating mutable data can be nightmare
• Data model bound by existing client-facing API
• Update paths very error-prone without tests
84
#Cassandra @doanduyhai
Data model in detail
85
Contacts_by_id
Contacts_by_identifiers
Contacts_in_profiles
Contacts_by_modification_date
Contacts_by_firstname_lastname
Contacts_linked_user
#Cassandra @doanduyhai
Data model in detail
86
Contacts_by_id
Contacts_by_identifiers
Contacts_in_profiles
Contacts_by_modification_date
Contacts_by_firstname_lastname
Contacts_linked_user user_id always component
of partition key
#Cassandra @doanduyhai
Scalable design
87
n1
n2
n3
n4
n5
n6
n7
n8
A
B
C
D
E
F
G
H
user_id1
user_id2
user_id3
user_id4
user_id5
#Cassandra @doanduyhai
Scalable design
88
n1
n2
n3
n4
n5
n6
n7
n8
A
B
C
D
E
F
G
H
user_id1 user_id2
user_id3
user_id4
user_id5
#Cassandra @doanduyhai
Bloom filters in action • For some tables, partition key = (user_id, contact_id)
☞ fast look-up, leverages Bloom filters
☞ touches 1 SSTable most of the time
89
#Cassandra @doanduyhai
Data model in detail
90
Contacts_by_id
Contacts_by_identifiers
Contacts_in_profiles
Contacts_by_modification_date
Contacts_by_firstname_lastname
Contacts_linked_user Wide partition
#Cassandra @doanduyhai
A "queue" story • contacts_by_modification_date
• queue-like pattern 😭
☞ buckets to the rescue
92
user_id:2014-12 date35 date12 … … date47
… … … …
user_id:2014-11 date11 date12 … … date34
… … … …
#Cassandra @doanduyhai
Data model summary • 7 tables for denormalization
• Normalize some tables because rare access
94
#Cassandra @doanduyhai
Data model summary • 7 tables for denormalization
• Normalize some tables because rare access
• Read-before write in most update scenarios 😟
95
#Cassandra @doanduyhai
Notes on contact_id • In SQL, auto-generated long using sequence
• In Cassandra, auto-generated timeuuid
96
#Cassandra @doanduyhai
Notes on contact_id • How to store both types ?
• As text ? ☞ easy solution …
98
#Cassandra @doanduyhai
Notes on contact_id • How to store both types ?
• As text ? ☞ easy solution …
• … but waste of space !
• because encoded as UTF-8 or ASCII in Cassandra
99
#Cassandra @doanduyhai
Notes on contact_id • Long ☞ 8 bytes
• Long as text(UTF-8: 1 byte) ☞ "digits count" bytes
100
#Cassandra @doanduyhai
Notes on contact_id • UUID ☞ 16 bytes
E81D4C70-A638-11E4-83CB-DEB70BF9330F
• 32 hex chars + 4 hyphens = 36 chars
• UUID as text(UTF-8: 1 byte) ☞ 36 bytes
• Bytes overhead = 36 – 16 = 20 bytes
101
#Cassandra @doanduyhai
Notes on contact_id • 20 bytes wasted per contact uuid
• × 7 denormalizations = 140 bytes per contact uuid
103
#Cassandra @doanduyhai
Notes on contact_id • 20 bytes wasted per contact uuid
• × 7 denormalizations = 140 bytes per contact uuid
• × 109 contacts = 140 GB wasted
104
😠 not even counting replication factor …
#Cassandra @doanduyhai
Notes on contact_id • ☞ just save contact id as byte[ ]
• Achilles @TypeTransformer for automatic conversion (see later)
106
#Cassandra @doanduyhai
Notes on contact_id • ☞ just save contact id as byte[ ]
• Achilles @TypeTransformer for automatic conversion (see later)
• Use blobAsBigInt( ) or blobAsUUID( ) to view data
107
#Cassandra @doanduyhai
Achilles • Advanced "object mapper"
• Fluent API
• Tons of features
• TDD friendly
108
#Cassandra @doanduyhai
Achilles • Dirty checking, what is it ?
• 1 contact ≈ 8 mutable fields
• × 7 denormalizations = 56 update combinations …
111
#Cassandra @doanduyhai
Achilles • Dirty checking, what is it ?
• 1 contact ≈ 8 mutable fields
• × 7 denormalizations = 56 update combinations …
• and not even counting multiple fields updates …
112
#Cassandra @doanduyhai
Achilles • Are you going to manually generate 56+ prepared
statements for all possible updates ?
113
#Cassandra @doanduyhai
Achilles • Are you going to manually generate 56+ prepared
statements for all possible updates ?
• Or just use dynamic plain string statements and get some perf penalty ?
114
#Cassandra @doanduyhai
Achilles • Dirty check in action
115
//No read-before-write ContactEntity proxy = manager.forUpdate(ContactEntity.class, contactId); proxy.setFirstName(…); proxy.setLastName(…); //type-safe updates proxy.setAddress(…);
manager.update(proxy);
#Cassandra @doanduyhai
Achilles • Dynamic statements generation
117
UPDATE contacts SET firstname=?, lastname=?,address=? WHERE contact_id=?
prepared statements are cached, of course
#Cassandra @doanduyhai
Achilles • Simple INSERT prepared statement
119
INSERT INTO contacts(contact_id,name,age,address,gender,avatar,…) VALUES(?, ?, ?, ? … ?);
#Cassandra @doanduyhai
Achilles • Runtime values binding
• some columns are optional
120
preparedStatement.bind(49374,’John DOE’,33, null, null, …, null);
#Cassandra @doanduyhai
Achilles
124
Inserting null ≡ creating tombstones × 7 denormalizations
× billions of contacts created
😱 not even counting replication factor …
#Cassandra @doanduyhai
Achilles • Simple annotation
125
@Entity(table = "contacts_by_id ») @Strategy(insert = InsertStrategy.NOT_NULL_FIELDS) public class ContactById {
}
#Cassandra @doanduyhai
Achilles • Runtime dynamic INSERT statement
126
INSERT INTO contacts(contact_id, name, age, address,) VALUES(:contact_id, :name, :age, :address);
prepared statements are cached, of course
#Cassandra @doanduyhai
Achilles • Remember the contactId ⇄ byte[ ] conversion ?
127
@PartitionKey @Column(name = "contact_id") @TypeTransformer(valueCodecClass = ContactIdToBytes.class) private ContactId contactId;
BYOC ☞ Bring Your Own Codec
#Cassandra @doanduyhai
Achilles
128
public interface Codec<FROM, TO> { Class<FROM> sourceType(); Class<TO> targetType(); TO encode(FROM fromJava) FROM decode(TO fromCassandra); }
#Cassandra @doanduyhai
Achilles • Dynamic logging in action
129
2014-12-01 14:25:20,554 Bound statement : [INSERT INTO contacts.contacts_by_modification_date(user_id,month_bucket,modification_date,...) VALUES (:user_id,:month_bucket,:modification_date,...) USING TTL :ttl;] with CONSISTENCY LEVEL [LOCAL_QUORUM] 2014-12-01 14:25:20,554 bound values : [222130151, 2014-12, e13d0d50-7965-11e4-af38-90b11c2549e0, ...]
2014-12-01 14:25:20,701 Bound statement : [SELECT birthday,middlename,avatar_size,... FROM contacts.contacts_by_modification_date WHERE user_id=:user_id AND month_bucket=:month_bucket AND (modification_date)>=(:modification_date) ORDER BY modification_date ASC;] with CONSISTENCY LEVEL [LOCAL_QUORUM] 2014-12-01 14:25:20,701 bound values : [222130151, 2014-10, be6bc010-6109-11e4-b385-000038377ead]
#Cassandra @doanduyhai
Achilles • Dynamic logging
• runtime activation
• no need to recompile/re-deploy
• save us hours of debugging
• TRACE log level ☞ query tracing
130
#Cassandra @doanduyhai
Conditions for success • Data modeling is crucial
• Double-run strategy & timestamp trick FTW
133
#Cassandra @doanduyhai
Conditions for success • Data modeling is crucial
• Double-run strategy & timestamp trick FTW
• Data type conversion can be tricky
134
#Cassandra @doanduyhai
Conditions for success • Data modeling is crucial
• Double-run strategy & timestamp trick FTW
• Data type conversion can be tricky
• Benchmark !
135
#Cassandra @doanduyhai
Conditions for success • Data modeling is crucial
• Double-run strategy & timestamp trick FTW
• Data type conversion can be tricky
• Benchmark !
• Mindset shifts for the team
136