Distributed & Highly Available Server Applications in Java and Scala
Max Alexejev, Aleksei Kornev
JavaOne Moscow 2013, 24 April 2013

DESCRIPTION

Presentation Aleksei Kornev and I gave at JavaOne Moscow 2013.

TRANSCRIPT

Page 1: Distributed & Highly Available server applications in Java and Scala

Distributed & highly available server applications in Java and Scala
Max Alexejev, Aleksei Kornev

JavaOne Moscow 2013

24 April 2013

Page 2: Distributed & Highly Available server applications in Java and Scala

What is talkbits?

Maxim Alexejev
Create a "JavaOne Moscow 2013" geo-channel?
Page 3: Distributed & Highly Available server applications in Java and Scala

Architecture

by Max Alexejev

Page 4: Distributed & Highly Available server applications in Java and Scala

Lightweight SOA

Key principles

•S1, S2 - edge services

•Each service consists of 0..1 server and 0..N clients built together

•No special "broker" services

•All services are stateless

•All instances are equal

What about state?

State is kept in specialized distributed systems and fronted by dedicated services.

Example follows...

Page 5: Distributed & Highly Available server applications in Java and Scala

Case study: Talkbits backend

Recursive call

Page 6: Distributed & Highly Available server applications in Java and Scala

Requirements for a distributed RPC system: must-haves and nice-to-haves

•Elastic and reliable discovery - should handle nodes being brought up and shut down transparently, and must not be a SPOF itself

•Support for N-N topology of client and server instances

•Disconnect detection and transparent reconnects

•Fault tolerance - for example, retries against the remaining instances when the called instance goes down

•Built-in client backoff - i.e., clients should not overload servers when load spikes, as far as possible

•Configurable load distribution - i.e., which server instance to call for this specific request

•Configurable networking layer (keepalives & heartbeats, timeouts, connection pools, etc.)

•Distributed tracing facilities

•Portability among different platforms

•Distributed stack traces for exceptions

•Transactions

Page 7: Distributed & Highly Available server applications in Java and Scala

Key principles: stay lightweight and get rid of architectural waste

•Java SE

•No containers - even servlet containers, where used, are lightweight and embedded

•Standalone applications: unified configuration, deployment, metrics, logging, single development framework - more on this later

•All launched instances are equal and process requests - no "special" nodes or "active-standby" patterns

•Minimal dependencies and JAR size

•Minimal memory footprint

•One service - one purpose

•Highly tuned for this one purpose (app, JVM, OS, HW)

•Isolated fault domains - i.e., single datasource or external service is fronted by one service only

No bloatware in technology stack!

"Lean" services

Page 8: Distributed & Highly Available server applications in Java and Scala

Finagle library

(twitter.github.io/finagle) acts as a distributed RPC framework.

Services are written in Java and Scala and use the Thrift communication protocol.

Talkbits implementation choices

Apache Zookeeper (zookeeper.apache.org)

Provides reliable service discovery mechanics. Finagle has a nice built-in integration with Zookeeper.
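As a hedged sketch of what this integration might look like (Finagle 6.x-era APIs; the Zookeeper host, path and shard id below are placeholders, and plain HTTP is used only to keep the example self-contained, since Thrift would require generated interfaces): a server announces itself under a Zookeeper path, and clients resolve that path instead of a fixed host list.

import com.twitter.finagle.{Http, Service}
import com.twitter.finagle.http.{Request, Response}
import com.twitter.util.{Await, Future}

object EchoServer extends App {
  // Trivial HTTP service used only to illustrate discovery.
  val service = new Service[Request, Response] {
    def apply(req: Request): Future[Response] = {
      val rep = Response()
      rep.contentString = "hello from echo"
      Future.value(rep)
    }
  }

  val server = Http.serve(":8080", service)
  // Announce this instance under a Zookeeper path (requires finagle-serversets).
  server.announce("zk!zk-host:2181!/services/echo!0")
  Await.ready(server)
}

object EchoClient extends App {
  // Resolve all live instances registered under the same path.
  val client: Service[Request, Response] =
    Http.newService("zk!zk-host:2181!/services/echo")
  val response = Await.result(client(Request("/")))
  println(response.contentString)
}

With this style of resolution, instances that join or leave the Zookeeper path are picked up by clients without redeploying or reconfiguring them.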

Page 9: Distributed & Highly Available server applications in Java and Scala

Finagle server: networking

Finagle is built on top of Netty, an asynchronous, non-blocking TCP server.

Finagle codec

trait Codec[Req, Rep]

class ThriftClientFramedCodec(...) extends Codec[ThriftClientRequest, Array[Byte]] {
  pipeline.addLast("thriftFrameCodec", new ThriftFrameCodec)
  pipeline.addLast("byteEncoder", new ThriftClientChannelBufferEncoder)
  pipeline.addLast("byteDecoder", new ThriftChannelBufferDecoder)
  ...
}

Finagle comes with ready-made codecs for Thrift, HTTP, Memcache, Kestrel, HTTP streaming.

Page 10: Distributed & Highly Available server applications in Java and Scala

Finagle services and filters

// Service is simply a function from a request to a future of a response.
trait Service[Req, Rep] extends (Req => Future[Rep])

// Filter[A, B, C, D] converts a Service[C, D] to a Service[A, B].
abstract class Filter[-ReqIn, +RepOut, +ReqOut, -RepIn] extends ((ReqIn, Service[ReqOut, RepIn]) => Future[RepOut])

abstract class SimpleFilter[Req, Rep] extends Filter[Req, Rep, Req, Rep]

// Service transformation example
val serviceWithTimeout: Service[Req, Rep] =
  new RetryFilter[Req, Rep](..) andThen
  new TimeoutFilter[Req, Rep](..) andThen
  service

Finagle comes with rate limiting, retries, statistics, tracing, uncaught exceptions handling, timeouts and more.
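As a hedged illustration of the filter mechanism (the LatencyFilter name and its timing logic are invented for this example, not part of the talk), a custom SimpleFilter that measures request latency composes in front of any service just like the built-in ones:

import com.twitter.finagle.{Service, SimpleFilter}
import com.twitter.util.{Future, Stopwatch}

// Hypothetical filter: records how long each request takes.
class LatencyFilter[Req, Rep] extends SimpleFilter[Req, Rep] {
  def apply(request: Req, service: Service[Req, Rep]): Future[Rep] = {
    val elapsed = Stopwatch.start()
    service(request) ensure {
      println(s"request took ${elapsed().inMilliseconds} ms")
    }
  }
}

// Composed exactly like the built-in filters above:
// val timedService = new LatencyFilter[Req, Rep] andThen service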

Page 11: Distributed & Highly Available server applications in Java and Scala

Functional composition

Given Future[A]

Sequential composition

def map[B](f: A => B): Future[B]

def flatMap[B](f: A => Future[B]): Future[B]

def rescue[B >: A](rescueException: PartialFunction[Throwable, Future[B]]): Future[B]

Concurrent composition

def collect[A](fs: Seq[Future[A]]): Future[Seq[A]]

def select[A](fs: Seq[Future[A]]): Future[(Try[A], Seq[Future[A]])]

And more

times(), whileDo() etc.
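A small hedged sketch of how these combinators behave on plain com.twitter.util Futures (the values and the error type are made up for illustration):

import com.twitter.util.{Await, Future}

val answer: Future[Int] = Future.value(21)

// Sequential composition with map and flatMap.
val doubled: Future[Int] = answer map { _ * 2 }
val described: Future[String] = doubled flatMap { n => Future.value(s"result=$n") }

// Recovering from failures with rescue.
val risky: Future[Int] = Future.exception(new IllegalStateException("boom"))
val recovered: Future[Int] = risky rescue {
  case _: IllegalStateException => Future.value(-1)
}

// Concurrent composition with collect.
val all: Future[Seq[Int]] = Future.collect(Seq(doubled, recovered))
println(Await.result(all)) // Seq(42, -1) once both futures are satisfied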

Page 12: Distributed & Highly Available server applications in Java and Scala

Functional composition on RPC calls

Sequential composition

val nearestChannel: Future[Channel] =
  metadataClient.getUserById(uuid) flatMap { user =>
    geolocationClient.getNearestChannelId(user.getLocation())
  } flatMap { channelId =>
    metadataClient.getChannelById(channelId)
  }

Concurrent composition

val userA: Future[User] = metadataClient.getUserById("a")
val userB: Future[User] = metadataClient.getUserById("b")
val userC: Future[User] = metadataClient.getUserById("c")

val users = Future.collect(Seq(userA, userB, userC)).get()

*All this stuff works in Java just like in Scala, but does not look as cool.

Page 13: Distributed & Highly Available server applications in Java and Scala

Finagle server: threading model

To achieve high performance (throughput), you should never block worker threads.

For blocking IO or long computations, delegate to a FuturePool.

val diskIoFuturePool = FuturePool(Executors.newFixedThreadPool(4))

diskIoFuturePool( { scala.io.Source.fromFile(..) } )

Boss thread accepts new client connections and binds NIO Channel to a specific worker thread.

Worker threads perform all client IO.
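A hedged sketch of the delegation pattern described above (the pool size and file path are arbitrary): blocking file IO runs on a dedicated FuturePool, so the Netty worker thread only attaches non-blocking continuations.

import java.util.concurrent.Executors
import com.twitter.util.{Future, FuturePool}
import scala.io.Source

// Dedicated pool for blocking disk IO, sized independently of the Netty workers.
val diskIoFuturePool = FuturePool(Executors.newFixedThreadPool(4))

def readGreeting(path: String): Future[String] =
  diskIoFuturePool {
    // Runs on a pool thread; the worker thread is never blocked here.
    val src = Source.fromFile(path)
    try src.mkString finally src.close()
  }

// The worker thread only composes the resulting Future.
val shouted: Future[String] = readGreeting("/tmp/greeting.txt") map { _.toUpperCase }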

Page 14: Distributed & Highly Available server applications in Java and Scala

More gifts and bonuses from Finagle

In addition to everything said before, Finagle also provides (a hedged wiring sketch follows this list):

•Load distribution in N-N topologies - HeapBalancer ("least active connections") by default

•Client backoff strategies - comes with TruncatedBinaryBackoff implementation

•Failure detection

•Failover/Retry

•Connection Pooling

•Distributed Tracing (Zipkin project based on Google Dapper paper)
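A hedged wiring sketch of several of these features via ClientBuilder (the host list, limits and timeouts are placeholders, and exact builder options varied between Finagle releases):

import com.twitter.conversions.time._
import com.twitter.finagle.builder.ClientBuilder
import com.twitter.finagle.http.Http

// The builder load-balances over all listed hosts (N-N topology),
// pools connections per host and retries failed requests,
// which gives failover to the remaining instances.
val client = ClientBuilder()
  .codec(Http())
  .hosts("10.0.0.1:8080,10.0.0.2:8080") // placeholder host list
  .hostConnectionLimit(10)              // connection pooling
  .requestTimeout(2.seconds)            // per-request timeout
  .retries(3)                           // failover / retry
  .build()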

Page 15: Distributed & Highly Available server applications in Java and Scala

Finagle, Thrift & Java: lessons learned

Pros

•Gives a lot out of the box

•Production-proven and stable

•Active development community

•Lots of extension points in the library

Cons

•Good for Scala, usable with Java

•Works well with Thrift and HTTP (plus trivial protocols), but lacks support for Protobuf and other stuff

•Poor exception-handling experience in Java (no Scala pattern matches) and ugly code

•finagle-thrift is a pain (old libthrift version lock-in, dependency clashes with Cassandra, cannot return nulls, and more). All problems are avoidable, though.

•The cluster scatters and never re-gathers when the whole Zookeeper ensemble is down.

Page 16: Distributed & Highly Available server applications in Java and Scala

Finagle: competitors & alternatives

Trending

•Akka 2.0 (Scala, OpenSource) by Typesafe

•ZeroRPC (Python & Node.js, OpenSource) by DotCloud

•RxJava (Java, OpenSource) by Netflix

Old

•JGroups (Java, OpenSource)

•JBOSS Remoting (Java, OpenSource) by JBOSS

•Spread Toolkit (C/C++, Commercial & OpenSource)

Page 17: Distributed & Highly Available server applications in Java and Scala

Configuration, deployment, monitoring and logging

by Aleksei Kornev

Page 18: Distributed & Highly Available server applications in Java and Scala

Get stuff done...

Page 19: Distributed & Highly Available server applications in Java and Scala

Typical application

Page 20: Distributed & Highly Available server applications in Java and Scala

Architecture of talkbits service

One way to configure services, logs and metrics.

One way to package and deploy services.

One way to launch services.

Bundled in a single one-jar.

Page 21: Distributed & Highly Available server applications in Java and Scala

One delivery unit. Contains:

Java service

In a single executable fat-jar.

Installation script

[Re]installs service on the machine, registers it in /etc/init.d

Init.d script

Contains instructions to start, stop, and restart the JVM and to get a quick status.

Delivery

Page 22: Distributed & Highly Available server applications in Java and Scala

Logging

Configuration

•SLF4J as the API; all other logging libraries are redirected to it

•Logback as a logging implementation

•Each service logs to /var/log/talkbits/... (application logs, GC logs)

•Daily rotation policy applied

•Also sent to loggly.com for aggregation, grouping etc.

Aggregation

•loggly.com

•sshfs for analyzing logs with Linux tools such as grep, tail, less, etc.

Aggregation alternatives

Splunk.com, Flume, Scribe, etc...

Page 23: Distributed & Highly Available server applications in Java and Scala

Metrics

Application metrics and health checks are implemented with the CodaHale Metrics library (metrics.codahale.com), which reports metrics via JMX.

The Jolokia JVM agent (www.jolokia.org/agent/jvm.html) exposes JMX beans via REST (JSON over HTTP), using the JVM's internal HTTP server.

The monitoring agent uses the Jolokia REST interface to fetch metrics and send them to the monitoring system.

All metrics are divided into common metrics (HW, JVM, etc) and service-specific metrics.
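A hedged Scala sketch of the metrics side (the metric names and the health check are invented examples, not the actual Talkbits code): a CodaHale MetricRegistry is populated by the service and exported over JMX, where the Jolokia agent can then serve it as JSON.

import com.codahale.metrics.{Gauge, JmxReporter, MetricRegistry}
import com.codahale.metrics.health.{HealthCheck, HealthCheckRegistry}

val metrics = new MetricRegistry()
val healthChecks = new HealthCheckRegistry()

// Service-specific metrics.
val requestTimer = metrics.timer("talkbits.requests")
metrics.register("talkbits.queue-depth", new Gauge[Int] {
  override def getValue: Int = 0 // replace with a real probe
})

// A simple health check registered alongside the metrics.
healthChecks.register("zookeeper-session", new HealthCheck {
  override def check(): HealthCheck.Result = HealthCheck.Result.healthy()
})

// Report everything via JMX so Jolokia can expose it over HTTP.
JmxReporter.forRegistry(metrics).build().start()

// Usage inside a request handler:
val ctx = requestTimer.time()
try { /* handle the request */ } finally ctx.stop()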

Page 24: Distributed & Highly Available server applications in Java and Scala

Deployment

Fabric (http://fabfile.org) is used for environment provisioning and service deployment.

Process

•The Fabric script provisions a new environment (or uses an existing one) according to the cluster scheme

•Amazon instances are automatically tagged with a list of services (i.e., instance roles)

•The Fabric script reads the instance roles and deploys (or redeploys) the appropriate components.

Page 25: Distributed & Highly Available server applications in Java and Scala

Monitoring

As the monitoring platform we chose Datadoghq.com. Datadog is a SaaS that is easy to integrate into your infrastructure. The Datadog agent is open-sourced and implemented in Python. There are many predefined check sets (plugins, or integrations) for popular products out of the box, including the JVM, Cassandra, Zookeeper and ElasticSearch.

Datadog provides REST API.

Alternatives

•Nagios, Zabbix - these require having a bearded admin on the team. We wanted to go SaaS and outsource infrastructure as far as possible.

•Amazon CloudWatch, LogicMonitor, ManageEngine, etc.

Process

Each service has its own monitoring agent instance on its machine. If a node has the 'monitoring-agent' role in the roles tag of its EC2 instance, a monitoring agent is installed for each service on that node.

Page 26: Distributed & Highly Available server applications in Java and Scala

Talkbits cluster structure

Page 27: Distributed & Highly Available server applications in Java and Scala

Q&A

Max Alexejev
http://ru.linkedin.com/pub/max-alexejev/51/820/ab9
http://www.slideshare.net/MaxAlexejev/
[email protected]

Aleksei Kornev
[email protected]