concurrency at the database layer
DESCRIPTION
A presentation for the Reactive Programming Enthusiasts Denver meet-up. http://www.meetup.com/Reactive-Programming-Enthusiasts-Denver/ How ReactiveMongo helps you utilize your hardware better and achieve a non-blocking application from the bottom up.
TRANSCRIPT
Concurrency at the Database Layer
Scala / Reactive Mongo Example For Reactive Programming Enthusiasts Denver
Mark Wilson
Why?
• To be non-blocking all the way through
• Because of Amdahl’s law
• and the Reactive Manifesto
The Contextual Explanation
Amdahl's Law
"…Therefore it is important that the entire solution is asynchronous and non-blocking." - Reactive Manifesto
Reactive
• Use resources better to achieve scale, resilience, and performance.
• Do this by replacing thread-managed state with event-driven messaging.
• Replace synchronous implementations with non-blocking code.
• Pillars: Responsive, Scalable, Event-Driven, Resilient
Why (cont.)?
The specific explanations for concurrency at the database layer:
• Because most DB drivers are blocking.
• Because concurrency beats thread tuning at large scale.
• Because it is faster on multicore machines (which most are) and more fully utilizes the machine's resources.
Presentation
• Restful services with GET, POST, PUT supported by a concurrent database layer
• Demo app resembling a real-world structure
• Coding with the library. Example implementation
• Some load testing: blocking vs. non-blocking.
Focus and Outline
What I wanted to achieve
Demo Application
"Whale Sighting" single-page application
https://github.com/p44/FishStore
• The UI
• The Services
• The DB Layer
The POST
POST {"breed":"Sei Whale","count":1,"description":"Pretty whale."}
Concurrency with Future(s). So What’s a Future?
• A future is an object holding a value that may become available at some point in time.
• A future is either completed or not completed
• A completed future is either:
• Success(value)
• Failure(exception)
• A future frees the current thread by establishing a callback
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.util.{Failure, Success}

val fs: Future[String] = Future { "a" + "b" }
fs.onComplete {
  case Success(x) => println(x)
  case Failure(e) => throw e
}
Future Example
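Beyond onComplete, futures compose without blocking via map and flatMap (for-comprehensions). A minimal standard-library sketch (the values here are illustrative):

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Two independent futures, started eagerly
val fa = Future { 2 }
val fb = Future { 3 }

// Compose them without blocking any thread; the sum is itself a Future
val sum: Future[Int] = for { a <- fa; b <- fb } yield a + b

// Await only at the edge (e.g. in a test); application code would map instead
val result = Await.result(sum, 5.seconds)
println(result) // prints 5
```

Because fa and fb are created before the for-comprehension, they run concurrently; declaring them inside it would serialize them.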
Execution contexts execute the tasks submitted to them; you can think of an execution context as a thread pool.
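That point can be sketched with the standard library: an ExecutionContext backed by an explicit thread pool (the pool size of 4 is arbitrary, not a recommendation):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// An ExecutionContext backed by an explicit fixed thread pool
val pool = Executors.newFixedThreadPool(4)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

// The Future's body runs on one of the pool's threads, not the caller's
val f: Future[String] = Future { Thread.currentThread.getName }
val threadName = Await.result(f, 5.seconds)
println(threadName) // a pool thread name such as "pool-1-thread-1"
pool.shutdown()
```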
The Insert
val fResult: Future[LastError] = WhaleSighting.insertOneAsFuture(db, ws)
object WhaleSighting extends Persistable[WhaleSighting] {…}

trait Persistable[T] {
  …
  def getCollection(db: DefaultDB): BSONCollection = {
    db.collection(collName)
  }

  def getBsonForInsert(obj: T): BSONDocument // BSON for insert

  def insertOneAsFuture(db: DefaultDB, obj: T): Future[LastError] = {
    insertOneAsFuture(getCollection(db), obj)
  }

  def insertOneAsFuture(coll: BSONCollection, obj: T): Future[LastError] = {
    coll.insert(getBsonForInsert(obj))
  }
  …
}
The Insert
import reactivemongo.core.netty.BufferSequence

protected def watchFailure[T](future: => Future[T]): Future[T] =
  Try(future).recover { case e: Throwable => Future.failed(e) }.get

def insert[T](document: T, writeConcern: GetLastError = GetLastError())
             (implicit writer: Writer[T], ec: ExecutionContext): Future[LastError] = watchFailure {
  val op = Insert(0, fullCollectionName)
  val bson = writeDoc(document, writer)
  val checkedWriteRequest = CheckedWriteRequest(op, BufferSequence(bson), writeConcern)
  Failover(checkedWriteRequest, db.connection, failoverStrategy).future.mapEither(LastError.meaningful(_))
}
inside ReactiveMongo's GenericCollection
Netty in a Nutshell
Netty architecture in a nutshell -
http://docs.jboss.org/netty/3.1/guide/html/architecture.html
• Summary, 3 main areas:
• 1. Buffer - ChannelBuffer - customized types, dynamically sized buffers, faster than ByteBuffer
• 2. Channel - universal asynchronous I/O - abstracts away all the operations required for point-to-point communication (ChannelFactory)
• 3. Event Model - ChannelEvent, ChannelHandler, ChannelPipeline (chain events and send them upstream or downstream)
A Tale of 3 Load Tests
• Non-blocking with ReactiveMongo
• Blocking with Await on each insert with ReactiveMongo
• Synchronous with Casbah
Load Test 1
Non-blocking with ReactiveMongo
Load Test 2
Blocking with ReactiveMongo
Load Test 3
Synchronous with Casbah
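The difference between tests 1 and 2 can be sketched with stdlib futures alone, substituting a 50 ms delay for a real driver insert (fakeInsert is a stand-in, not a ReactiveMongo call, and the counts are arbitrary):

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.duration._

// Stand-in for a driver insert: completes after a 50 ms delay
def fakeInsert(doc: String): Future[String] =
  Future { blocking { Thread.sleep(50) }; doc }

// Load test 1 style: fire all the inserts, wait once at the end
val t1 = System.nanoTime
Await.ready(Future.sequence((1 to 10).map(i => fakeInsert(s"doc-$i"))), 10.seconds)
val concurrentMs = (System.nanoTime - t1) / 1000000

// Load test 2 style: Await on each insert before starting the next
val t2 = System.nanoTime
(1 to 10).foreach(i => Await.result(fakeInsert(s"doc-$i"), 10.seconds))
val sequentialMs = (System.nanoTime - t2) / 1000000

println(s"concurrent: ${concurrentMs}ms, sequential: ${sequentialMs}ms")
```

The sequential version pays the full 10 × 50 ms; the concurrent version overlaps the delays, which is the effect the load tests measure at much larger scale.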
Next Load Tests…
• Not just inserts. MongoDB is fast at inserts.
• Heavy Queries
• Updates
Details: _id
ObjectId is a 12-byte BSON type, constructed using:
• a 4-byte value representing the seconds since the Unix epoch,
• a 3-byte machine identifier,
• a 2-byte process id, and
• a 3-byte counter, starting with a random value.
It is indexed and efficient.
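That layout can be checked by assembling the 12 bytes by hand; the machine id, process id, and counter values below are made up for illustration:

```scala
import java.nio.ByteBuffer

// Assemble the 12 ObjectId bytes from the four parts described above
val seconds   = (System.currentTimeMillis / 1000).toInt  // 4-byte Unix timestamp
val machineId = Array[Byte](0x0a, 0x0b, 0x0c)            // 3-byte machine identifier
val pid       = 0x1234.toShort                           // 2-byte process id
val counter   = Array[Byte](0x00, 0x00, 0x01)            // 3-byte counter

val buf = ByteBuffer.allocate(12)
buf.putInt(seconds).put(machineId).putShort(pid).put(counter)

// The usual string form is the 24-character hex encoding of those 12 bytes
val hex = buf.array.map(b => f"${b & 0xff}%02x").mkString
println(hex)
```

This 24-character hex string is the same representation BSONObjectID.stringify produces on the next slide.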
Serializing the _id to/from BSON

object WhaleSighting extends Persistable[WhaleSighting] {…}

trait Persistable[T] {
  …
  // MongoId converters
  def getMongoId(doc: BSONDocument, fieldName_id: String): String = {
    val oid: Option[BSONObjectID] = doc.getAs[BSONObjectID](fieldName_id)
    oid match {
      case None => {
        val m = "Failed mongoId conversion for field name " + fieldName_id
        throw new Exception(m)
      }
      case Some(id) => id.stringify // hexadecimal String representation
    }
  }

  def hexStringToMongoId(hexString: String): BSONObjectID = BSONObjectID(hexString)
  …
}
Find And Modify
https://github.com/ReactiveMongo/ReactiveMongo/blob/master/driver/samples/SimpleUseCasesSample.scala#L186-212
// finds all documents with lastName = Godbillon and replaces lastName with GODBILLON
def findAndModify() = {
  val selector = BSONDocument("lastName" -> "Godbillon")
  val modifier = BSONDocument("$set" -> BSONDocument("lastName" -> "GODBILLON"))
  val command = FindAndModify(collection.name, selector, Update(modifier, false))

  db.command(command).onComplete {
    case Failure(error) => {
      throw new RuntimeException("got an error while performing findAndModify", error)
    }
    case Success(maybeDocument) =>
      println("findAndModify successfully done with original document = " +
        // if there is an original document returned, print it in a pretty format
        maybeDocument.map(doc => {
          // stringify the document with BSONDocument.pretty
          BSONDocument.pretty(doc)
        }))
  }
}
Streaming From a Collection
https://github.com/sgodbillon/reactivemongo-tailablecursor-demo
def watchCollection = WebSocket.using[JsValue] { request =>
  val collection = db.collection[JSONCollection]("acappedcollection")

  // Inserts the received messages into the capped collection
  val in = Iteratee.flatten(futureCollection.map(collection =>
    Iteratee.foreach[JsValue] { json =>
      println("received " + json)
      collection.insert(json)
    }))

  // Enumerates the capped collection
  val out = {
    val futureEnumerator = futureCollection.map { collection =>
      // so we are sure that the collection exists and is a capped one
      val cursor: Cursor[JsValue] = collection
        .find(Json.obj()) // we want all the documents
        .options(QueryOpts().tailable.awaitData) // tailable and await data
        .cursor[JsValue]
      cursor.enumerate
    }
    Enumerator.flatten(futureEnumerator)
  }

  // We're done!
  (in, out)
}
Connections
import reactivemongo.api.MongoDriver

val driver = new MongoDriver
// Without any parameters, MongoDriver creates a new Akka ActorSystem.

// Then you can connect to a MongoDB server.
val connection = driver.connection(List("localhost")) // a connection pool
val db = connection.db("somedatabase")
val collection = db.collection("somecollection")
http://reactivemongo.org/releases/0.10/documentation/tutorial/connect-database.html
Play! Thread Pools
Play uses a number of different thread pools for different purposes:
• Netty boss/worker thread pools - These are used internally by Netty for handling Netty IO. An application's code should never be executed by a thread in these thread pools.
• Play Internal Thread Pool - This is used internally by Play. No application code should ever be executed by a thread in this thread pool, and no blocking should ever be done in this thread pool. Its size can be configured by setting internal-threadpool-size in application.conf, and it defaults to the number of available processors.
• Play default thread pool - This is the default thread pool in which all application code in Play Framework is executed. It is an Akka dispatcher, and can be configured by configuring Akka, described below. By default, it has one thread per processor.
• Akka thread pool - This is used by the Play Akka plugin, and can be configured the same way that you would configure Akka.
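Since the default thread pool above is an Akka dispatcher, tuning it is ordinary Akka configuration in application.conf. A sketch along the lines of the Play 2.x docs (the parallelism numbers are illustrative, not recommendations):

```
play {
  akka {
    actor {
      default-dispatcher = {
        fork-join-executor {
          # defaults to one thread per core; widen if application code must block
          parallelism-min = 8
          parallelism-max = 64
        }
      }
    }
  }
}
```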
Relational Database Options?
https://github.com/mauricio/postgresql-async
Non-blocking drivers for MySQL and PostgreSQL
A tour through the code
Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.
- Clifford Stoll