Improving Server Applications with System Transactions
Sangman Kim, Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel, Donald E. Porter
![Page 1: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/1.jpg)
![Page 2: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/2.jpg)
Poor OS API Support for Concurrency
▪ Fine-grained locking: bug-prone, hard to maintain, and poorly supported by the OS
▪ Coarse-grained locking: reduced resource utilization
[Figure: parallelism (vertical axis) versus maintainability (horizontal axis)]
![Page 3: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/3.jpg)
System Transaction Improves OS API Concurrency
[Figure: parallelism (vertical axis) versus maintainability (horizontal axis). Labels: "Server applications working with OS API" (twice) and "System transaction".]
![Page 4: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/4.jpg)
Improving System Transactions
▪ TxOS provides operating system transactions [Porter et al., SOSP 2009]
  ▪ Transactions for OS objects (e.g., files, pipes)
▪ Middleware state sharing with multithreading
[Diagram: the application and JVM sit above the TxOS system calls and TxOS (Linux). Labels: "System transaction in TxOS", "Middleware state sharing".]
![Page 5: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/5.jpg)
Improving System Transactions
▪ TxOS provides operating system transactions [Porter et al., SOSP 2009]
  ▪ Transactions for OS objects (e.g., files, pipes)
▪ Synchronization in legacy code
[Diagram: the application and JVM sit above the TxOS system calls and TxOS. Labels: "Synchronization primitives", "Middleware state sharing".]
![Page 6: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/6.jpg)
Improving System Transactions
▪ TxOS provides operating system transactions [Porter et al., SOSP 2009]
  ▪ Transactions for OS objects (e.g., files, pipes)
▪ TxOS+: improved system transactions
  ▪ Pause/resume, commit ordering, and more
  ▪ Up to 88% throughput improvement
  ▪ At most 40 application line changes
[Diagram: the application and JVM sit above the TxOS system calls and TxOS+. Labels: "Synchronization primitives", "Middleware state sharing".]
![Page 7: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/7.jpg)
Outline
▪ Background: system transaction
▪ System transactions in action
▪ Challenges for rewriting applications
▪ Implementation and evaluation
![Page 8: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/8.jpg)
Background: System Transaction
▪ Transaction interface and semantics
  ▪ System calls: xbegin(), xend(), xabort()
  ▪ ACID semantics
    ▪ Atomic: all or nothing
    ▪ Consistent: one consistent state to another
    ▪ Isolated: updates as if only one concurrent transaction
    ▪ Durable: committed transactions on disk
  ▪ Optimistic concurrency control
▪ Fix synchronization issues with OS APIs
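The all-or-nothing behavior of the three calls can be illustrated with a small in-memory model. This is only a sketch in Python, not the TxOS interface: `ToyStore` and its `write` method are hypothetical stand-ins, and the real system calls operate on kernel objects rather than a dictionary.

```python
class ToyStore:
    """In-memory stand-in for OS state (hypothetical; NOT the TxOS API)."""

    def __init__(self):
        self.committed = {}   # state visible to everyone
        self.pending = None   # private writes of the open transaction

    def xbegin(self):
        self.pending = {}

    def write(self, key, value):
        if self.pending is None:
            self.committed[key] = value   # non-transactional write
        else:
            self.pending[key] = value     # buffered until commit

    def xend(self):
        self.committed.update(self.pending)  # publish all writes together
        self.pending = None

    def xabort(self):
        self.pending = None                  # discard every buffered write


store = ToyStore()

store.xbegin()
store.write("a", 1)
store.write("b", 2)
store.xabort()
after_abort = dict(store.committed)   # nothing from the aborted transaction

store.xbegin()
store.write("a", 1)
store.write("b", 2)
store.xend()
after_commit = dict(store.committed)  # both writes appear together
```

In this model an aborted transaction leaves no trace, and a committed one publishes every write in a single step.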
![Page 9: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/9.jpg)
Background: System Transaction
▪ Lazy versioning: speculative copy for data
▪ TxOS requires no special hardware
xbegin();
write(f, buf);
xend();
[Diagram: an inode header (inum, lock, …) points to the inode data (size, mode, …). The transaction works on a copy of the inode data; it either commits, or aborts on a conflict.]
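Lazy versioning combined with optimistic concurrency control can be sketched as follows. The slide does not say how TxOS detects conflicts, so this example swaps in a simple version-counter check; `VersionedObject`, `Transaction`, and `Conflict` are hypothetical names used only for illustration.

```python
import copy


class Conflict(Exception):
    """Raised when a transaction loses a commit race."""


class VersionedObject:
    """Toy kernel object: a header with a version counter plus a data dict."""

    def __init__(self, **data):
        self.version = 0
        self.data = data


class Transaction:
    def __init__(self, obj):
        self.obj = obj
        self.start_version = obj.version
        self.shadow = copy.deepcopy(obj.data)  # speculative private copy

    def write(self, field, value):
        self.shadow[field] = value  # the shared data stays untouched

    def commit(self):
        # Assumed validation rule: fail if anyone committed since we began.
        if self.obj.version != self.start_version:
            raise Conflict("object changed since the transaction began")
        self.obj.data = self.shadow  # install the private copy
        self.obj.version += 1


inode = VersionedObject(size=0, mode=0o644)
t1 = Transaction(inode)
t2 = Transaction(inode)
t1.write("size", 4096)
t2.write("size", 512)
size_during = inode.data["size"]  # speculative writes are not visible yet

t1.commit()
try:
    t2.commit()
    t2_result = "committed"
except Conflict:
    t2_result = "aborted"
```

Because each transaction writes only to its own copy, aborting the loser costs nothing more than dropping that copy.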
![Page 10: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/10.jpg)
![Page 11: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/11.jpg)
Applications Parallelized with OS Transactions
Parallelizing applications that synchronize on OS state
▪ Example 1: State-machine replication
  ▪ Constraint: deterministic state update
▪ Example 2: IMAP email server
  ▪ Constraint: consistent file system operations
![Page 12: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/12.jpg)
Example 1: Parallelizing State-machine Replication
▪ Core component of fault-tolerant services
  ▪ e.g., Chubby, Zookeeper, Autopilot
▪ Replicas execute the same sequence of operations
  ▪ Often single-threaded to avoid non-determinism
▪ Ordered transactions
  ▪ Make parallel OS state updates deterministic
  ▪ Applications determine the commit order of transactions
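The idea of application-determined commit order can be sketched with ordinary threads. This is an illustration under assumptions, not the TxOS mechanism: `OrderedCommitter` is a hypothetical helper in which workers compute in parallel but publish their results strictly by sequence number.

```python
import threading


class OrderedCommitter:
    """Workers run in parallel; results are published in sequence order."""

    def __init__(self):
        self.next_seq = 0
        self.cond = threading.Condition()
        self.log = []

    def commit(self, seq, update):
        with self.cond:
            # Wait until every earlier sequence number has committed.
            while seq != self.next_seq:
                self.cond.wait()
            self.log.append(update)
            self.next_seq += 1
            self.cond.notify_all()


def worker(committer, seq):
    result = f"op-{seq}"           # speculative work, done in parallel
    committer.commit(seq, result)  # published only when it is our turn


committer = OrderedCommitter()
threads = [
    threading.Thread(target=worker, args=(committer, seq))
    for seq in (3, 1, 0, 2)        # deliberately started out of order
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever the thread scheduling, the log ends up in sequence order, which is the property replicas need in order to stay identical.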
![Page 13: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/13.jpg)
Example 2: Parallelizing IMAP Email Servers
▪ Everyone has concurrent email clients
  ▪ Desktop, laptop, tablets, phones, ...
  ▪ Need concurrent access to stored emails
▪ Brief history of email storage formats
  ▪ mbox: single file, file locking
  ▪ Lockless Maildir
  ▪ Dovecot Maildir: return of file locking
![Page 14: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/14.jpg)
mbox: Database Without Parallelism
▪ mbox: single-file mailbox of email messages
▪ Synchronization with file locking
  ▪ One of fcntl(), flock(), or a lock file (.mbox.lock)
  ▪ Very coarse-grained locking
~/.mbox
From MAILER-DAEMON Wed Apr 11 09:32:28 2012
From: Sangman Kim <[email protected]>
To: EuroSys 2012 audience
Subject: mbox needs file lock. Maildir hides message.
…
From MAILER-DAEMON Wed Apr 11 09:34:51 2012
From: Sangman Kim <[email protected]>
To: EuroSys 2012 audience
Subject: System transactions good, file locks bad!
…
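The coarse-grained locking described on this slide looks roughly like the following sketch, which uses Python's `fcntl.flock` on a Unix system. The `append_message` helper and the temporary mailbox path are assumptions made for the example; real mail software may use fcntl() record locks or a lock file instead.

```python
import fcntl
import os
import tempfile


def append_message(mbox_path, message):
    """Append one message while holding an exclusive lock on the whole file."""
    with open(mbox_path, "a") as f:
        # Every other flock() caller on this mailbox waits here,
        # even if it only wants to touch a different message.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(message)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


tmpdir = tempfile.mkdtemp()
mbox = os.path.join(tmpdir, "mbox")
append_message(mbox, "From MAILER-DAEMON Wed Apr 11 09:32:28 2012\nSubject: first\n\n")
append_message(mbox, "From MAILER-DAEMON Wed Apr 11 09:34:51 2012\nSubject: second\n\n")

with open(mbox) as f:
    message_count = sum(1 for line in f if line.startswith("From "))
```

One lock guards the entire mailbox, so operations on unrelated messages still serialize.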
![Page 15: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/15.jpg)
Maildir: Parallelism Through Lockless Design
▪ Maildir: lockless alternative to mbox
▪ Directories of message files
  ▪ Each file contains a message
  ▪ Directory access with no synchronization (originally)
▪ Message filenames contain flags
Maildir/cur
00000000.00201.host:2,T (Trashed)
00001000.00305.host:2,R (Replied)
00002000.02619.host:2,T (Trashed)
00010000.08919.host:2,S (Seen)
00015000.10019.host:2,S (Seen)
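The filename convention above can be decoded with a few lines of Python. This is a small sketch: `parse_flags` is a hypothetical helper, and the flag letters beyond T, R, and S (shown on the slide) follow the usual Maildir convention.

```python
FLAG_NAMES = {
    "D": "Draft",
    "F": "Flagged",
    "P": "Passed",
    "R": "Replied",
    "S": "Seen",
    "T": "Trashed",
}


def parse_flags(filename):
    """Split a Maildir filename into its unique part and a list of flag names."""
    base, separator, flags = filename.partition(":2,")
    if not separator:
        return base, []            # no flag section in this filename
    return base, [FLAG_NAMES.get(letter, letter) for letter in flags]


listing = [
    "00000000.00201.host:2,T",
    "00001000.00305.host:2,R",
    "00010000.08919.host:2,S",
]
decoded = [parse_flags(name) for name in listing]
```

Since the flags live in the filename, changing a flag means renaming the file, which is what makes concurrent directory listing fragile.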
![Page 16: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/16.jpg)
Messages Hidden with Lockless Maildir
PROCESS 1 (LISTING)
while (f = readdir("Maildir/cur")):
    print f.name
PROCESS 2 (MARKING)
if (access("043:2,S")):
    rename("043:2,S", "043:2,R")
"Maildir/cur" directory
018:2,S (Seen)  021:2,S (Seen)  043:2,S (Seen)  052:2,S (Seen)  061:2,S (Seen)
![Page 17: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/17.jpg)
Messages Hidden with Lockless Maildir
PROCESS 1 (LISTING)
while (f = readdir("Maildir/cur")):
    print f.name
PROCESS 2 (MARKING)
if (access("043:2,S")):
    rename("043:2,S", "043:2,R")
"Maildir/cur" directory
018:2,S (Seen)  021:2,S (Seen)  043:2,S (Seen)  052:2,S (Seen)  061:2,S (Seen)
043:2,S (Seen) is renamed to 043:2,R (Replied) while Process 1 is listing
Process 1 result
018:2,S  021:2,S  052:2,S  061:2,S
Message missing!
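The lost-message scenario can be reproduced with a deterministic toy model. This sketch assumes a slot-based directory in which a rename frees one slot and reuses the first free slot; real file systems differ in detail, and `ToyDirectory` and `listing` are hypothetical names.

```python
class ToyDirectory:
    """Slot-based directory model (an assumption, not a real file system)."""

    def __init__(self, names):
        self.slots = [None] + list(names)  # slot 0 starts out free

    def rename(self, old, new):
        self.slots[self.slots.index(old)] = None  # free the old entry
        self.slots[self.slots.index(None)] = new  # reuse the first free slot


def listing(directory):
    """readdir()-style scan that keeps only a cursor, with no locking."""
    position = 0
    while position < len(directory.slots):
        name = directory.slots[position]
        position += 1
        if name is not None:
            yield name


cur = ToyDirectory(["018:2,S", "021:2,S", "043:2,S", "052:2,S", "061:2,S"])

reader = listing(cur)                      # Process 1 starts listing
seen = [next(reader), next(reader)]        # it has read the first two entries
cur.rename("043:2,S", "043:2,R")           # Process 2 marks the message
seen += list(reader)                       # Process 1 finishes the scan

missing = not any(name.startswith("043") for name in seen)
```

The renamed entry lands in a slot the reader has already passed, so message 043 never appears in the listing under either name.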
![Page 18: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/18.jpg)
Return of the Coarse-grained File Locking
Maildir synchronization
▪ Lockless
  ▪ "certain anomalous situations may result" (Courier IMAP manpage)
▪ File locks
  ▪ Per-directory coarse-grained locking
  ▪ Complexity of Maildir, performance of mbox
▪ System transactions
![Page 19: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/19.jpg)
Maildir Parallelized with System Transaction
PROCESS 1 (MARKING)
xbegin()
if (access("XXX:2,S")):
    rename("XXX:2,S", "XXX:2,R")
xend()
PROCESS 2 (MESSAGE LISTING)
xbegin()
while (f = readdir("Maildir/cur")):
    print f.name
xend()
Consistent directory accesses with better parallelism
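The effect of wrapping both processes in transactions can be sketched with a toy model. Isolation is modeled here with a snapshot taken when the listing starts; this is an assumption for illustration, since TxOS may instead detect the conflict and abort one side. `ToyTxDirectory` and its methods are hypothetical.

```python
class ToyTxDirectory:
    """Directory model with transaction-like isolation (not the TxOS API)."""

    def __init__(self, names):
        self.entries = list(names)

    def list_in_transaction(self):
        snapshot = list(self.entries)  # isolated view, taken when the scan starts
        for name in snapshot:
            yield name

    def rename_in_transaction(self, old, new):
        updated = [new if name == old else name for name in self.entries]
        self.entries = updated         # published in one step at commit


cur = ToyTxDirectory(["018:2,S", "021:2,S", "043:2,S", "052:2,S", "061:2,S"])

reader = cur.list_in_transaction()             # listing transaction begins
seen = [next(reader), next(reader)]
cur.rename_in_transaction("043:2,S", "043:2,R")  # marking transaction commits
seen += list(reader)                           # listing transaction finishes

after = list(cur.list_in_transaction())        # a later listing sees the new flag
```

The in-flight listing still reports five messages, and the next listing sees the renamed file; no message disappears in between.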
![Page 20: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/20.jpg)
![Page 21: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/21.jpg)
Challenges of Rewriting Applications
1. Middleware state sharing
2. Deterministic parallel update for system state
3. Composing with other synchronization primitives
![Page 22: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/22.jpg)
Middleware and System Transaction
Problem with memory management
▪ Multiple threads share the same heap
Thread 1 (in transaction)
xbegin();
p1 = malloc();
xabort();
Thread 2
p2 = malloc();
*p2 = 1;
[Diagram: middleware (libc) grows the heap through mmap() in the kernel; the kernel keeps a transactional object for the heap, which holds p1.]
![Page 23: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/23.jpg)
Middleware and System Transaction
Problem with memory management
▪ Multiple threads share the same heap
Thread 1 (in transaction)
xbegin();
p1 = malloc();
xabort();
Thread 2
p2 = malloc();
*p2 = 1;
[Diagram: middleware (libc) sits above the kernel's transactional object for the heap; after the abort, the heap region holding p1 and p2 is unmapped.]
FAULT!
Certain middleware actions should not roll back
![Page 24: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/24.jpg)
Two Types of Actions on Middleware State
USER-INITIATED ACTION
▪ User changes system state
  ▪ Most file accesses
  ▪ Most synchronization
MIDDLEWARE-INITIATED ACTION
▪ System state changed as a side effect of a user action
  ▪ malloc() memory mapping
  ▪ Java garbage collection
  ▪ Dynamic linking
▪ Middleware state is shared among user threads
  ▪ Can't just roll back!
![Page 25: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/25.jpg)
Handling Middleware-Initiated Actions
▪ Transaction pause/resume
  ▪ Expose state changes by middleware-initiated actions to other threads
  ▪ Additional system calls: xpause(), xresume()
▪ Limited complexity increase
  ▪ We used pause/resume 8 times in glibc, 4 times in JVM
  ▪ Only used in application for debugging
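Why pause/resume helps can be seen in a toy model of the heap problem from the earlier slides. This is a sketch under assumptions: `ToyKernel` tracks mapped regions in a set, and `grow_heap` stands in for what a malloc() implementation might do when it needs more memory; none of these names come from TxOS or glibc.

```python
class ToyKernel:
    """Tracks mapped heap regions (a stand-in for mmap() state, not TxOS)."""

    def __init__(self):
        self.mapped = set()
        self.tx_log = None     # regions mapped by the open transaction
        self.paused = False

    def xbegin(self):
        self.tx_log = []

    def xpause(self):
        self.paused = True

    def xresume(self):
        self.paused = False

    def xabort(self):
        for region in self.tx_log:
            self.mapped.discard(region)  # roll back transactional mappings
        self.tx_log = None

    def mmap(self, region):
        self.mapped.add(region)
        # Work done while paused is not part of the transaction.
        if self.tx_log is not None and not self.paused:
            self.tx_log.append(region)


def grow_heap(kernel, region, use_pause):
    """Hypothetical malloc() slow path that maps a new heap region."""
    if use_pause:
        kernel.xpause()
    kernel.mmap(region)
    if use_pause:
        kernel.xresume()


def p2_survives_abort(use_pause):
    kernel = ToyKernel()
    kernel.xbegin()
    grow_heap(kernel, "heap+1", use_pause)  # thread 1: p1 = malloc()
    p2_region = "heap+1"                    # thread 2: p2 = malloc(), same region
    kernel.xabort()                         # thread 1 aborts
    return p2_region in kernel.mapped       # can thread 2 still write *p2?


without_pause = p2_survives_abort(use_pause=False)
with_pause = p2_survives_abort(use_pause=True)
```

Without the pause, the abort unmaps memory that another thread is already using; with it, the mapping is treated as outside the transaction and survives.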
![Page 26: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/26.jpg)
Pause/Resume in JVM Execution
Java code
SysTransaction.begin();
files = dir.list();
SysTransaction.end();
JVM execution
xbegin();
files = dir.list();
xpause()
VM operations (garbage collection)
xresume()
xend();
![Page 27: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/27.jpg)
Other Challenges for Maturing TxOS
▪ 17,000 lines of kernel changes
  ▪ Transactionalizing the file descriptor table
  ▪ Handling page locks for disk I/O
  ▪ Memory protection
  ▪ Optimization with directory caching
  ▪ Reorganizing data structures
  ▪ and more
▪ Details in the paper
![Page 28: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/28.jpg)
![Page 29: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/29.jpg)
Application 1: Parallelized BFT Application
▪ Implemented in the UpRight BFT library
▪ Fault-tolerant routing backend
  ▪ Graph stored in a file
  ▪ Compute shortest path
  ▪ Edge add/remove
▪ Ordered transactions for deterministic update
![Page 30: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/30.jpg)
Minimal Application Code Change
| Component | Total LOC | Changed LOC |
|---|---|---|
| Routing application | 1,006 | 18 (1.8%) |
| UpRight library | 22,767 | 174 (0.7%) |
| JVM | 496,305 | 384 (0.08%) |
| glibc | 1,027,399 | 826 (0.08%) |
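The changed-code ratios can be recomputed directly from the line counts in the table. This short Python check uses only the totals and changed-line counts shown on the slide; the `changed_percent` helper is introduced here for illustration.

```python
# (total lines, changed lines) per component, taken from the table
components = {
    "Routing application": (1006, 18),
    "UpRight library": (22767, 174),
    "JVM": (496305, 384),
    "glibc": (1027399, 826),
}


def changed_percent(total, changed):
    """Changed lines as a percentage of the component's total lines."""
    return 100.0 * changed / total


percentages = {
    name: changed_percent(total, changed)
    for name, (total, changed) in components.items()
}

for name, percent in percentages.items():
    print(f"{name}: {percent:.2f}%")
```

With these inputs the JVM and glibc ratios both come out at roughly 0.08 percent, and the routing application at about 1.8 percent.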
![Page 31: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/31.jpg)
Deterministic State Update with Better Throughput
[Figure: BFT graph server throughput (req/s) versus write ratio (%), comparing TxOS and Linux on the dense graph]
▪ Dense graph: 88% throughput improvement
▪ Sparse graph: 11% throughput improvement
  ▪ Work to add/delete edges is small compared to scheduling overhead
![Page 32: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/32.jpg)
Application 2: Dovecot Maildir Access
▪ Dovecot mail server
  ▪ Uses directory lock files for Maildir accesses
▪ Locking is replaced with system transactions
  ▪ Changed LoC: 40 out of 138,723
▪ Benchmark: parallel IMAP clients
  ▪ Each client executes operations on a random message
    ▪ Read: message read
    ▪ Write: message creation/deletion
  ▪ 1500 messages total
![Page 33: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/33.jpg)
Mailbox Consistency with Better Throughput
[Figure: Dovecot benchmark with 4 clients; throughput improvement (%) versus write ratio (0, 10, 25, 50, 100%)]
▪ Better block scheduling enhances write performance
![Page 34: Improving Server Applications with System Transactions](https://reader035.vdocuments.net/reader035/viewer/2022062520/568165dd550346895dd8f62c/html5/thumbnails/34.jpg)
Conclusion: OS Transactions Improve Server Performance
▪ System transactions parallelize tricky server applications
  ▪ Parallel Dovecot Maildir operations
  ▪ Parallel BFT state update
▪ System transactions improve throughput with few application changes
  ▪ Up to 88% throughput improvement
  ▪ At most 40 changed lines of application code