swarm
![Page 2: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/2.jpg)
No more capacity limits!
Google had 147GB of data in 1998. Now, ~$100 buys you a 128GB microSD, and that is in your phone! Storage is pervasive, abundant and cheap.
With 64-bit multicore CPUs, even phones can store and process lots of data.
The network is the bottleneck now!
1998 vs. 2014
![Page 3: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/3.jpg)
Users wait for the data to load too long, too often.
Web/mobile apps go data-heavy... but RTT* does not improve.
Mobile devices rely on wireless... which is unreliable by its nature.
A user has many devices... so instant sync is expected.
The network is often slow and unreliable! So?
* network round-trip time
![Page 4: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/4.jpg)
Solution: cache everything, sync it as needed
Once the data is delivered, caching is free. Once data is prefetched and cached:
• there are no "loading" stalls;
• offline mode is OK;
• intermittent connection is also OK.
So, huge UX improvement!
But, total caching poses a challenge:
• the data is changed on both sides;
• invalidation no longer works;
• need versioning and synchronization!
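To make "versioning and synchronization" concrete, here is a minimal sketch of incremental sync between two replicas. It is not Swarm's API: `VersionedLog`, `missingFor`, and `sync` are hypothetical names, and the sketch assumes that operations from one origin always arrive in counter order. Each replica remembers the highest counter it has seen per origin, so a sync ships only the operations the other side has not seen yet.

```javascript
// Hypothetical illustration, not Swarm's actual API.
class VersionedLog {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.clock = 0;         // counter for changes made on this replica
    this.ops = [];          // every operation seen so far, in arrival order
    this.seen = new Map();  // origin -> highest counter seen from that origin
  }

  // Record a local change, stamped with (origin, counter).
  write(key, value) {
    const op = { origin: this.replicaId, counter: ++this.clock, key, value };
    this.apply(op);
    return op;
  }

  // Apply an operation unless it was already seen. Returns true if it was new.
  // Assumption: operations of one origin arrive in counter order, without gaps.
  apply(op) {
    const known = this.seen.get(op.origin) || 0;
    if (op.counter <= known) return false;
    this.seen.set(op.origin, op.counter);
    this.ops.push(op);
    return true;
  }

  // Operations that a peer with the given version map has not seen yet.
  missingFor(peerSeen) {
    return this.ops.filter(op => op.counter > (peerSeen.get(op.origin) || 0));
  }
}

// Two-way incremental sync: each side receives only what it lacks.
function sync(a, b) {
  const toB = a.missingFor(b.seen);
  const toA = b.missingFor(a.seen);
  toB.forEach(op => b.apply(op));
  toA.forEach(op => a.apply(op));
  return { sentToB: toB.length, sentToA: toA.length };
}

const phone = new VersionedLog('phone');
const server = new VersionedLog('server');
phone.write('title', 'Draft');     // changed on the client, possibly offline
server.write('color', 'red');      // changed on the server meanwhile
console.log(sync(phone, server));  // first sync ships one operation each way
console.log(sync(phone, server));  // nothing new, so nothing is sent
```

The sketch only decides what to send. How concurrent changes to the same data are reconciled is a separate question.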
![Page 5: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/5.jpg)
CRDT enables total caching and incremental sync
CRDT (commutative replicated data types)
• real-time background sync
• versioned data (detects new and seen)
• offline work, caching, prefetching
• conflict-free merge for concurrent changes
• CRDTs are used by Cassandra, Riak
Causal trees: collaborative real-time editing
• a CRDT replacement for OT*
• offline-first, perfectly cacheable
• in-browser (JavaScript, contentEditable)
• authorship attribution (who wrote what)
• change detection (what has been changed?)
• initially developed for letters.yandex.ru
* Operational Transformation
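As an illustration of "conflict-free merge for concurrent changes", the sketch below implements a textbook last-writer-wins map, one of the simplest CRDTs. It is neither Swarm's code nor a causal tree. The `ts` and `origin` stamp fields and the function names are assumptions made for the example. The point it shows is that merging is order-independent: both replicas converge to the same state whichever side merges first.

```javascript
// Textbook last-writer-wins (LWW) map; hypothetical names, not Swarm's API.
// Each field stores its value together with a stamp (ts, origin).

// True if stamp a wins over stamp b; ties on ts are broken by origin id.
function newer(a, b) {
  if (a.ts !== b.ts) return a.ts > b.ts;
  return a.origin > b.origin;
}

// Merge two states field by field, keeping the entry with the winning stamp.
// The result does not depend on argument order, and merging twice changes nothing.
function merge(x, y) {
  const out = { ...x };
  for (const [key, entry] of Object.entries(y)) {
    const has = Object.prototype.hasOwnProperty.call(out, key);
    if (!has || newer(entry, out[key])) out[key] = entry;
  }
  return out;
}

// A local write is just a merge of a one-field state.
function set(state, key, value, ts, origin) {
  return merge(state, { [key]: { value, ts, origin } });
}

// Two replicas change the same object concurrently.
const onPhone = set({}, 'title', 'Draft', 1, 'phone');
let onLaptop = set({}, 'title', 'Final', 2, 'laptop');
onLaptop = set(onLaptop, 'color', 'red', 1, 'laptop');

// Either merge order yields the same title and color.
console.log(merge(onPhone, onLaptop));
console.log(merge(onLaptop, onPhone));
```

Last-writer-wins silently drops the losing write, which is acceptable for simple fields but not for text. That gap is what structures such as causal trees address.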
![Page 6: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/6.jpg)
Swarm: client-side CRDT implementation
Swarm: real-time synchronized object cache
• a replicated model library, M of MVC
• think of "Dropbox for objects"
• client-side: JavaScript (ObjC, Java are planned)
• server-side: node.js (Java is planned)
• Backbonish, 2KLoC
Citrea: collaborative real-time editor
• builds on regular contentEditable
• advanced versioning/authorship tracking
• think of "Google Docs, embedded"
![Page 7: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/7.jpg)
Building a total cache system from scratch takes man-years
• "There are only two hard things in Computer Science: cache invalidation and naming things" -- attributed to P.Karlton
• Data on the client turns a Web system (simple) into an AP* system (complex)
• That is man-years.
* by the CAP theorem
![Page 8: Swarm](https://reader034.vdocuments.net/reader034/viewer/2022042700/554ba569b4c905b8618b4e9e/html5/thumbnails/8.jpg)
Team: we implement CRDTs faster than the theory is written! *
Victor Grishchenko, PhD, USU and Delft University of Technology, Bank of Russia, Yandex, does rocket science.
Alexei Balandin, USU, Beeline, AT Consulting, gosuslugi.ru e-gov, does enterprisey stuff.
* we actually do sometimes, as we found at PaPEC'14