presented by, mysql & o’reilly media, inc. falcon from the beginning jim starkey...
TRANSCRIPT
Why Falcon? Because the World is Changing!
Hardware is evolving rapidly
Customers need ACID transactions
Atomic – the books should balance
Consistent – the alternative is chaos
Isolated – preserve programmer’s sanity (sic)
Durable – who wants to lose data?
Where Hardware is going
CPUs breed like rabbits – more sockets, more cores per socket, more threads per core
Memory is bigger, faster, and cheaper
Disks are bigger and cheaper but not much faster
(Boxes are cheaper and more plentiful, but that’s a different story)
Where Applications are going
Batch – dead!
Timesharing – dead!
Departmental computing – dead!
Client server – fading fast
Application servers for most of us
Web services for the really big guys
The Database challenge
Traditional challenge:
Exhaust CPU, memory, and disk simultaneously
Today’s challenge:
Exhaust CPU and memory and avoid the disk
Falcon tradeoffs
Use memory (page cache) to avoid disk reads
Use memory (record cache) to avoid page cache manipulation
Use CPU to find the fastest path to a record
Use CPU to minimize record size
Synchronize most data structures with user mode read/write locks
Synchronize high contention data structures with interlocked instructions
The Falcon architecture
Incomplete in-memory database with disk backfill
Multi-version concurrency control in memory
Updates in memory until commit
Group commits to a single serial log write
Post-commit multi-threaded pipeline to move updates to disk
Incomplete in-memory database
Selected records cached in memory
Separate cache for disk pages
Record cache hit is 15% of the cost of a page cache hit
Record cache is more memory efficient than page cache
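The lookup order described on this slide can be sketched as follows. This is a minimal illustration, not Falcon's implementation: the class name `TwoTierStore`, the dict-based "disk", and the fixed `records_per_page` layout are all assumptions made for the example.

```python
class TwoTierStore:
    """Hypothetical record cache in front of a page cache in front of a 'disk'."""

    def __init__(self, disk_pages, records_per_page=4):
        self.disk_pages = disk_pages          # page_no -> list of records (stands in for the disk)
        self.records_per_page = records_per_page
        self.page_cache = {}                  # page_no -> page contents
        self.record_cache = {}                # record_id -> record
        self.stats = {"record_hits": 0, "page_hits": 0, "disk_reads": 0}

    def fetch(self, record_id):
        # Cheapest path: the record itself is already cached.
        if record_id in self.record_cache:
            self.stats["record_hits"] += 1
            return self.record_cache[record_id]

        # Next: locate the page, reading it from "disk" only if it is not cached.
        page_no, slot = divmod(record_id, self.records_per_page)
        page = self.page_cache.get(page_no)
        if page is None:
            self.stats["disk_reads"] += 1
            page = self.disk_pages[page_no]
            self.page_cache[page_no] = page
        else:
            self.stats["page_hits"] += 1

        # Remember the record so the next fetch skips the page cache entirely.
        record = page[slot]
        self.record_cache[record_id] = record
        return record
```

A second fetch of the same record is served from the record cache without touching the page cache, which is the case the slide claims is much cheaper.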
Record Encoding - Cache Efficiency
Records encoded by value, not declaration
String “abc” occupies the same space in varchar(3) or varchar(4096)
The number 7 is the same whether declared smallint, mediumint, int, bigint, decimal, or numeric
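Encoding by value rather than by declaration might look like the sketch below. The byte layout (a length prefix followed by the value) is a hypothetical format chosen for illustration and is not Falcon's actual record encoding; the function names are likewise assumptions.

```python
def encode_int(value: int) -> bytes:
    """Encode an integer using only as many bytes as its value needs."""
    # bit_length() excludes the sign bit, so reserve one extra bit for it.
    length = max(1, (value.bit_length() + 8) // 8)
    return bytes([length]) + value.to_bytes(length, "big", signed=True)


def encode_str(value: str) -> bytes:
    """Encode a string as a 2-byte length followed by its UTF-8 bytes."""
    data = value.encode("utf-8")
    return len(data).to_bytes(2, "big") + data


def encode_field(value, declared_type: str) -> bytes:
    """Encode a field; the declared column type is deliberately ignored."""
    if isinstance(value, int):
        return encode_int(value)
    return encode_str(value)
```

Because `declared_type` never enters the encoding, `encode_field(7, "smallint")` and `encode_field(7, "bigint")` produce identical bytes, as do `"abc"` in `varchar(3)` and `varchar(4096)`.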
Multi-Version Concurrency Control
Update operations create new record versions
New version is tagged with transaction id, points to old version
System tracks which transactions should see which versions
Readers don’t block writers
Everyone sees a consistent view of the data
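The version chain described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: visibility is decided by a snapshot of committed transaction ids taken at `begin`, and write-write conflicts and garbage collection are not handled. The names `Version` and `MvccStore` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: object
    txn_id: int                    # transaction that created this version
    prior: Optional["Version"]     # pointer to the older version, if any


class MvccStore:
    def __init__(self):
        self.heads = {}            # key -> newest Version
        self.next_txn = 1
        self.committed = set()     # ids of committed transactions
        self.snapshots = {}        # txn_id -> committed ids visible to it

    def begin(self) -> int:
        txn = self.next_txn
        self.next_txn += 1
        # A transaction sees only what was committed when it started.
        self.snapshots[txn] = frozenset(self.committed)
        return txn

    def write(self, txn: int, key, value) -> None:
        # An update never overwrites: it pushes a new version onto the chain.
        self.heads[key] = Version(value, txn, self.heads.get(key))

    def read(self, txn: int, key):
        # Walk from newest to oldest until a visible version is found.
        version = self.heads.get(key)
        while version is not None:
            if version.txn_id == txn or version.txn_id in self.snapshots[txn]:
                return version.value
            version = version.prior
        return None

    def commit(self, txn: int) -> None:
        self.committed.add(txn)
```

A reader that began before a writer committed keeps seeing the old version, so neither blocks the other.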
Updates Are in Memory Until Commit
Updates held in memory pending commit (well, usually)
Index changes held in memory pending commit (same caveat)
Verb rollback is dirt cheap
Transaction rollback is dirt cheap
At Commit…
Pending record updates flushed to serial log
Pending index updates flushed to serial log
Commit record written to serial log
Serial log flushed to the oxide
And the transaction is committed!
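The commit sequence, combined with the group commit mentioned on the architecture slide, can be sketched as below. Everything here is an assumption for illustration: `SerialLog`, `Transaction`, and `group_commit` are hypothetical names, and a Python list stands in for the disk.

```python
class SerialLog:
    def __init__(self):
        self.buffer = []      # entries written but not yet durable
        self.durable = []     # stands in for the disk ("the oxide")
        self.flushes = 0

    def append(self, entry):
        self.buffer.append(entry)

    def flush(self):
        # One physical write makes everything buffered so far durable.
        if self.buffer:
            self.durable.extend(self.buffer)
            self.buffer.clear()
            self.flushes += 1


class Transaction:
    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.record_updates = []   # held in memory until commit
        self.index_updates = []    # held in memory until commit


def group_commit(log, transactions):
    """Commit several transactions with a single log flush."""
    for txn in transactions:
        for update in txn.record_updates:
            log.append(("record", txn.txn_id, update))
        for update in txn.index_updates:
            log.append(("index", txn.txn_id, update))
        log.append(("commit", txn.txn_id))
    log.flush()   # only after this are the transactions committed
```

In this sketch, rollback before commit costs nothing beyond discarding the in-memory lists, since nothing has reached the log yet.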
Alas, Memory isn’t infinite, so
Large transaction chills uncommitted data (flushes it to the log early)
Chilled records can be thawed (fetched from the log)
Scavenger garbage collects unloved records periodically
When things get really bad, entire record chains flushed to backlog
(Note: This is hard and we aren’t done.)
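Chilling and thawing can be pictured with the toy model below. It is a sketch only, assuming a simple count-based memory limit and a list standing in for the serial log; `ChillingTransaction` and its methods are hypothetical, and the scavenger and backlog are not modeled.

```python
class ChillingTransaction:
    def __init__(self, log, memory_limit):
        self.log = log                    # list standing in for the serial log
        self.memory_limit = memory_limit  # max uncommitted records kept in memory
        self.in_memory = {}               # key -> uncommitted value
        self.chilled = {}                 # key -> offset of the value in the log

    def update(self, key, value):
        self.chilled.pop(key, None)       # a newer value supersedes a chilled one
        self.in_memory[key] = value
        if len(self.in_memory) > self.memory_limit:
            self.chill()

    def chill(self):
        # Flush uncommitted records to the log early and free the memory.
        for key, value in self.in_memory.items():
            self.chilled[key] = len(self.log)
            self.log.append((key, value))
        self.in_memory.clear()

    def read(self, key):
        if key in self.in_memory:
            return self.in_memory[key]
        if key in self.chilled:
            # Thaw: fetch the record back from the log into memory.
            _, value = self.log[self.chilled.pop(key)]
            self.in_memory[key] = value
            return value
        return None
```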
Falcon Weaknesses
Transactions are ACID but not serializable
Latency advantage disappears at saturation
Very large transactions degrade performance
Optimized for Web, not batch
Falcon Strengths
Runs like a memory database when data fits in cache
Scales like disk-based database when data doesn’t fit in cache
Lowest possible latency for Web applications
Absorbs huge spiky loads
Performance Measurement
Generally benchmark against InnoDB (transactional engines)
We use the DBT2 benchmark:
High contention
Write intensive – 40% of records touched are updated
Measures only performance at saturation
DBT2 (we believe) is InnoDB’s best spot and Falcon’s worst
Benchmarking Results
16 and 8 CPU systems: Falcon exceeds InnoDB performance
4 CPU systems: Falcon exceeds InnoDB performance for moderate to large number of threads
2 CPU systems: Rough parity, advantage to InnoDB
1 CPU systems: InnoDB wins
Caveat: Results subject to change! Both systems are moving targets!!!
When should you use what?
If you don’t need ACID, MyISAM is probably fastest
For Uniprocessors and small memory systems, InnoDB is a good choice
For large transaction batch, InnoDB may be best match
For multi-cores and large number of threads, Falcon is probably best
For the Web, Falcon is hard to beat.
Questions?