concurrency at the database layer
DESCRIPTION
A presentation for the Reactive Programming Enthusiasts Denver meet-up. http://www.meetup.com/Reactive-Programming-Enthusiasts-Denver/ How ReactiveMongo helps you utilize your hardware better and achieve a non-blocking application from the bottom up.
TRANSCRIPT
Concurrency at the Database Layer
Scala / Reactive Mongo Example For Reactive Programming Enthusiasts Denver
Mark Wilson
Why?
• To be non-blocking all the way through
• Because of Amdahl’s law
• and the Reactive Manifesto
The Contextual Explanation
Amdahl's Law
"…Therefore it is important that the entire solution is asynchronous and non-blocking." - Reactive Manifesto
Reactive
• Use resources better to achieve scale, resilience, and performance.
• Do this by replacing thread-managed state with event-driven messaging.
• Replace synchronous implementations with non-blocking code.
• Pillars: Responsive, Scalable, Event-Driven, Resilient
Why (cont.)?
The specific explanations for concurrency at the database layer:
• Because most DB drivers are blocking.
• Because concurrency beats thread tuning at large scale.
• Because it is faster on multicore machines (which most are) and more fully utilizes the machine's resources.
Presentation
• Restful services with GET, POST, PUT supported by a concurrent database layer
• Demo app resembling a real-world structure
• Coding with the library. Example implementation
• Some load testing: blocking vs. non-blocking.
Focus and Outline
What I wanted to achieve
Demo Application
"Whale Sighting" single-page application
https://github.com/p44/FishStore
• The UI
• The Services
• The DB Layer
The POST
POST {"breed":"Sei Whale","count":1,"description":"Pretty whale."}
Concurrency with Future(s). So What’s a Future?
• A future is an object holding a value that may become available at some point in time.
• A future is either completed or not completed
• A completed future is either:
• Success(value)
• Failure(exception)
• A future frees the current thread by establishing a callback
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.util.{Failure, Success}

val fs: Future[String] = Future { "a" + "b" }
fs.onComplete {
  case Success(x) => println(x)
  case Failure(e) => throw e
}
Future Example
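Beyond onComplete, futures compose without blocking via map and flatMap (for-comprehensions). A minimal standard-library sketch (the values here are illustrative):

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Two independent futures, started eagerly
val fa = Future { 2 }
val fb = Future { 3 }

// Compose them without blocking any thread; the sum is itself a Future
val sum: Future[Int] = for { a <- fa; b <- fb } yield a + b

// Await only at the edge (e.g. in a test); application code would map instead
val result = Await.result(sum, 5.seconds)
println(result) // prints 5
```

Because fa and fb are created before the for-comprehension, they run concurrently; declaring them inside it would serialize them.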
Execution contexts execute the tasks submitted to them; you can think of an execution context as a thread pool.
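That point can be sketched with the standard library: an ExecutionContext backed by an explicit thread pool (the pool size of 4 is arbitrary, not a recommendation):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// An ExecutionContext backed by an explicit fixed thread pool
val pool = Executors.newFixedThreadPool(4)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

// The Future's body runs on one of the pool's threads, not the caller's
val f: Future[String] = Future { Thread.currentThread.getName }
val threadName = Await.result(f, 5.seconds)
println(threadName) // a pool thread name such as "pool-1-thread-1"
pool.shutdown()
```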
The Insert
val fResult: Future[LastError] = WhaleSighting.insertOneAsFuture(db, ws)
object WhaleSighting extends Persistable[WhaleSighting] {…}

trait Persistable[T] {
  …
  def getCollection(db: DefaultDB): BSONCollection = {
    db.collection(collName)
  }

  def getBsonForInsert(obj: T): BSONDocument // BSON for insert

  def insertOneAsFuture(db: DefaultDB, obj: T): Future[LastError] = {
    insertOneAsFuture(getCollection(db), obj)
  }

  def insertOneAsFuture(coll: BSONCollection, obj: T): Future[LastError] = {
    coll.insert(getBsonForInsert(obj))
  }
  …
}
The Insert
import reactivemongo.core.netty.BufferSequence

protected def watchFailure[T](future: => Future[T]): Future[T] =
  Try(future).recover { case e: Throwable => Future.failed(e) }.get

def insert[T](document: T, writeConcern: GetLastError = GetLastError())
             (implicit writer: Writer[T], ec: ExecutionContext): Future[LastError] = watchFailure {
  val op = Insert(0, fullCollectionName)
  val bson = writeDoc(document, writer)
  val checkedWriteRequest = CheckedWriteRequest(op, BufferSequence(bson), writeConcern)
  Failover(checkedWriteRequest, db.connection, failoverStrategy).future.mapEither(LastError.meaningful(_))
}
inside ReactiveMongo's GenericCollection
Netty in a Nutshell
Netty architecture in a nutshell -
http://docs.jboss.org/netty/3.1/guide/html/architecture.html
• Summary, 3 main areas:
• 1. Buffer - ChannelBuffer - customized types, dynamically sized buffers, faster than ByteBuffer
• 2. Channel - universal asynchronous I/O - abstracts away all the operations required for point-to-point communication (ChannelFactory)
• 3. Event Model - ChannelEvent, ChannelHandler, ChannelPipeline (chain events and send them upstream or downstream)
A Tale of 3 Load Tests
• Non-blocking with ReactiveMongo
• Blocking with Await on each insert with ReactiveMongo
• Synchronous with Casbah
Load Test 1
Non-blocking with ReactiveMongo
Load Test 2
Blocking with ReactiveMongo
Load Test 3
Synchronous with Casbah
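The difference between tests 1 and 2 can be sketched with stdlib futures alone, substituting a 50 ms delay for a real driver insert (fakeInsert is a stand-in, not a ReactiveMongo call, and the counts are arbitrary):

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.duration._

// Stand-in for a driver insert: completes after a 50 ms delay
def fakeInsert(doc: String): Future[String] =
  Future { blocking { Thread.sleep(50) }; doc }

// Load test 1 style: fire all the inserts, wait once at the end
val t1 = System.nanoTime
Await.ready(Future.sequence((1 to 10).map(i => fakeInsert(s"doc-$i"))), 10.seconds)
val concurrentMs = (System.nanoTime - t1) / 1000000

// Load test 2 style: Await on each insert before starting the next
val t2 = System.nanoTime
(1 to 10).foreach(i => Await.result(fakeInsert(s"doc-$i"), 10.seconds))
val sequentialMs = (System.nanoTime - t2) / 1000000

println(s"concurrent: ${concurrentMs}ms, sequential: ${sequentialMs}ms")
```

The sequential version pays the full 10 × 50 ms; the concurrent version overlaps the delays, which is the effect the load tests measure at much larger scale.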
Next Load Tests…
• Not just inserts. MongoDB is fast at inserts.
• Heavy Queries
• Updates
Details: _id
ObjectId is a 12-byte BSON type, constructed using:
• a 4-byte value representing the seconds since the Unix epoch,
• a 3-byte machine identifier,
• a 2-byte process id, and
• a 3-byte counter, starting with a random value.
It is indexed and efficient.
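That layout can be checked by assembling the 12 bytes by hand; the machine id, process id, and counter values below are made up for illustration:

```scala
import java.nio.ByteBuffer

// Assemble the 12 ObjectId bytes from the four parts described above
val seconds   = (System.currentTimeMillis / 1000).toInt  // 4-byte Unix timestamp
val machineId = Array[Byte](0x0a, 0x0b, 0x0c)            // 3-byte machine identifier
val pid       = 0x1234.toShort                           // 2-byte process id
val counter   = Array[Byte](0x00, 0x00, 0x01)            // 3-byte counter

val buf = ByteBuffer.allocate(12)
buf.putInt(seconds).put(machineId).putShort(pid).put(counter)

// The usual string form is the 24-character hex encoding of those 12 bytes
val hex = buf.array.map(b => f"${b & 0xff}%02x").mkString
println(hex)
```

This 24-character hex string is the same representation BSONObjectID.stringify produces on the next slide.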
Serializing the _id to/from BSON

object WhaleSighting extends Persistable[WhaleSighting] {…}

trait Persistable[T] {
  …
  // MongoId converters
  def getMongoId(doc: BSONDocument, fieldName_id: String): String = {
    val oid: Option[BSONObjectID] = doc.getAs[BSONObjectID](fieldName_id)
    oid match {
      case None => {
        val m = "Failed mongoId conversion for field name " + fieldName_id
        throw new Exception(m)
      }
      case Some(id) => id.stringify // hexadecimal String representation
    }
  }

  def hexStringToMongoId(hexString: String): BSONObjectID = BSONObjectID(hexString)
  …
}
Find And Modify
https://github.com/ReactiveMongo/ReactiveMongo/blob/master/driver/samples/SimpleUseCasesSample.scala#L186-212
// finds all documents with lastName = Godbillon and replaces lastName with GODBILLON
def findAndModify() = {
  val selector = BSONDocument("lastName" -> "Godbillon")
  val modifier = BSONDocument("$set" -> BSONDocument("lastName" -> "GODBILLON"))
  val command = FindAndModify(collection.name, selector, Update(modifier, false))

  db.command(command).onComplete {
    case Failure(error) => {
      throw new RuntimeException("got an error while performing findAndModify", error)
    }
    case Success(maybeDocument) =>
      println("findAndModify successfully done with original document = " +
        // if there is an original document returned, print it in a pretty format
        maybeDocument.map(doc => {
          // stringify the document with BSONDocument.pretty
          BSONDocument.pretty(doc)
        }))
  }
}
Streaming From a Collection
https://github.com/sgodbillon/reactivemongo-tailablecursor-demo
def watchCollection = WebSocket.using[JsValue] { request =>
  val collection = db.collection[JSONCollection]("acappedcollection")

  // Inserts the received messages into the capped collection
  val in = Iteratee.flatten(futureCollection.map(collection =>
    Iteratee.foreach[JsValue] { json =>
      println("received " + json)
      collection.insert(json)
    }))

  // Enumerates the capped collection
  val out = {
    val futureEnumerator = futureCollection.map { collection =>
      // so we are sure that the collection exists and is a capped one
      val cursor: Cursor[JsValue] = collection
        .find(Json.obj()) // we want all the documents
        .options(QueryOpts().tailable.awaitData) // tailable and await data
        .cursor[JsValue]
      cursor.enumerate
    }
    Enumerator.flatten(futureEnumerator)
  }

  // We're done!
  (in, out)
}
Connections
import reactivemongo.api.MongoDriver

val driver = new MongoDriver
// Without any parameters, MongoDriver creates a new Akka ActorSystem.

// Then you can connect to a MongoDB server.
val connection = driver.connection(List("localhost")) // a connection pool
val db = connection.db("somedatabase")
val collection = db.collection("somecollection")
http://reactivemongo.org/releases/0.10/documentation/tutorial/connect-database.html
Play! Thread Pools
Play uses a number of different thread pools for different purposes:
• Netty boss/worker thread pools - These are used internally by Netty for handling Netty IO. An application's code should never be executed by a thread in these thread pools.
• Play Internal Thread Pool - This is used internally by Play. No application code should ever be executed by a thread in this thread pool, and no blocking should ever be done in this thread pool. Its size can be configured by setting internal-threadpool-size in application.conf, and it defaults to the number of available processors.
• Play default thread pool - This is the default thread pool in which all application code in Play Framework is executed. It is an Akka dispatcher, and can be configured by configuring Akka, described below. By default, it has one thread per processor.
• Akka thread pool - This is used by the Play Akka plugin, and can be configured the same way that you would configure Akka.
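Since the default thread pool above is an Akka dispatcher, tuning it is ordinary Akka configuration in application.conf. A sketch along the lines of the Play 2.x docs (the parallelism numbers are illustrative, not recommendations):

```
play {
  akka {
    actor {
      default-dispatcher = {
        fork-join-executor {
          # defaults to one thread per core; widen if application code must block
          parallelism-min = 8
          parallelism-max = 64
        }
      }
    }
  }
}
```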
Relational Database Options?
https://github.com/mauricio/postgresql-async
Non-blocking drivers for MySQL and PostgreSQL
A tour through the code
Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.
- Clifford Stoll