Post on 16-May-2020

Next-Generation Scala Architectures

Ryan Knight, Enterprise Solution Engineer

Twitter: @knight_cloud

My Experience

• Sun Microsystems

• Oracle

• Family Search / LDS Church

• Riot Games

• Adobe / T-Mobile

• Deloitte / State of Louisiana

• Typesafe

• Tomax / Demandware

• DataStax - Enterprise Technical Sales

Which is Faster?

• Fighter Jet

• Mantis Shrimp

• Bullet

• Mushroom Spores

Sphagnum Moss!

• Launches Spores at 89 MPH in less than a thousandth of a second

• Spores Travel over 80 Height of the Launching Capsule

"You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete."

Agenda

• Architecting the Application Tier

• Architecting the Data Tier

• Fundamental Architectural Principles

Architecting the Application Tier

with Scala

Evaluation Criteria

Professor Zapinsky proved that the squid is more intelligent than the house cat when challenged under similar conditions.

Flaw of Performance Benchmarking

• Unrealistic Load Scenario

• Unrealistic Application Scenario

• Performance is only one criterion

• Framework optimized for benchmark

Four Traits of Reactive Architectures

Why Scala?

• Type Inference

• Uniform Access Principle - a member is accessed the same way whether it is defined as a method or a field

• Traits

• Value Classes

• Package-level methods & fields

• Default and Named Parameters

• Higher-Kinded Types

• Functions as First Class Citizens

• Currying / Methods with multiple parameter lists

• Qualified Imports

• Scoped access modifiers

• Case Classes

• Singleton Objects

• Default Methods - apply / unapply / set

• Implicit Conversion and Views

• Macros

• Parser Combinators

• Multi-Line Strings

• String Interpolation

• Default Public Access

• Type Classes

• Extractor Patterns
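A few of the features listed above, shown together in a minimal sketch. The names (`FeatureDemo`, `Celsius`, `Reading`, `describe`, `isHot`) are invented for illustration and are not from the talk.

```scala
object FeatureDemo {
  // Value class: a type-safe wrapper around a Double.
  final case class Celsius(value: Double) extends AnyVal

  // Case class: immutable fields, structural equality, copy and an extractor for free.
  final case class Reading(sensor: String, temp: Celsius, note: String = "n/a")

  // Default and named parameters plus string interpolation.
  def describe(r: Reading, unit: String = "C"): String =
    s"${r.sensor}: ${r.temp.value}$unit (${r.note})"

  // Pattern matching relies on the extractors generated for the case classes.
  def isHot(r: Reading): Boolean = r match {
    case Reading(_, Celsius(t), _) if t > 30.0 => true
    case _                                     => false
  }

  def main(args: Array[String]): Unit = {
    val r = Reading(sensor = "roof", temp = Celsius(31.5)) // named arguments, note defaulted
    println(describe(r))
    println(isHot(r.copy(temp = Celsius(12.0))))
  }
}
```

Type inference does most of the work here: only the public method signatures carry explicit types.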

Functional

XKCD

Why Functional Rocks!

• Immutability

• Higher-Level of Abstraction

• Define the What not the How

• Eliminating side effects

• Inherent Parallelism
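As a rough illustration of "define the what, not the how", the sketch below computes per-customer totals without a loop or a reassigned variable. `FunctionalDemo`, `Order` and `bigSpenders` are hypothetical names chosen for this example.

```scala
object FunctionalDemo {
  final case class Order(customer: String, total: BigDecimal)

  // Declarative pipeline: group, sum, filter. Nothing is mutated, and the
  // function has no side effects, so calls are independent of each other.
  def bigSpenders(orders: Seq[Order], threshold: BigDecimal): Map[String, BigDecimal] =
    orders
      .groupBy(_.customer)
      .map { case (customer, os) => customer -> os.map(_.total).sum }
      .filter { case (_, total) => total >= threshold }

  def main(args: Array[String]): Unit = {
    val orders = Seq(
      Order("ann", BigDecimal(60)),
      Order("bob", BigDecimal(20)),
      Order("ann", BigDecimal(50))
    )
    println(bigSpenders(orders, BigDecimal(100)))
  }
}
```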

Functional in Reactive Programming

• Easy to create callbacks

• Easy to handle Events and Async Results

Statements vs. Expressions

def errMsg(errorCode: Int): String = {
  var result: String = ""  // a local var must be initialized explicitly
  errorCode match {
    case 1 => result = "Network Failure"
    case 2 => result = "I/O Failure"
    case _ => result = "Unknown Error"
  }
  return result
}

Statements vs. Expressions

def errMsg(errorCode: Int): String = errorCode match {
  case 1 => "Network Failure"
  case 2 => "I/O Failure"
  case _ => "Unknown Error"
}

No Imperative Code!

• Imperative programming - Describes computation in terms of statements that change a program state.

import scala.collection.mutable

def findPeopleIn(city: String, people: Seq[People]): Set[People] = {
  val found = new mutable.HashSet[People]
  for (person <- people) {
    for (address <- person.addresses) {
      if (address.city == city) found += person
    }
  }
  return found.toSet  // convert to the immutable Set the signature promises
}

No Imperative Code!

def findPeopleIn(city: String, people: Seq[People]): Set[People] =
  for {
    person  <- people.toSet[People]
    address <- person.addresses
    if address.city == city
  } yield person

Down with Null Pointers!

def authenticateSession(
    session: HttpSession,
    username: Option[String],
    password: Option[Array[Char]]) =
  for {
    u <- username
    p <- password
    if canAuthenticate(u, p)
    privileges <- privilegesFor.get(u)
  } injectPrivs(session, privileges)
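A self-contained variant of the same idea, for experimenting outside a web application. `canAuthenticate` and `privilegesFor` are hypothetical stand-ins for the helpers named on the slide, with hard-coded data, and the result is returned with `yield` rather than injected into a session.

```scala
object OptionDemo {
  // Hypothetical stand-ins for the slide's helpers.
  private val privilegesFor: Map[String, Set[String]] =
    Map("ada" -> Set("read", "write"))

  private def canAuthenticate(user: String, password: Array[Char]): Boolean =
    user == "ada" && password.mkString == "s3cret"

  // Each generator short-circuits to None when a value is missing,
  // so no null check is ever written.
  def privileges(username: Option[String],
                 password: Option[Array[Char]]): Option[Set[String]] =
    for {
      u <- username
      p <- password
      if canAuthenticate(u, p)
      privs <- privilegesFor.get(u)
    } yield privs

  def main(args: Array[String]): Unit = {
    println(privileges(Some("ada"), Some("s3cret".toCharArray)))
    println(privileges(None, Some("s3cret".toCharArray)))
  }
}
```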

NO BLOCKING!

Scala Futures

Future API

import scala.concurrent._

import ExecutionContext.Implicits.global

def calcInt(x: Int) = {

Future(x * 5)

}

calcInt(10).map { rslt => println(rslt) } // prints 50
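A slightly fuller sketch of the same API, showing that a failure inside a `Future` is captured as a value and can be handled with `recover`. `FutureDemo` and `safeDivide` are made-up names; `Await` appears only so the demo can print before the JVM exits.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureDemo {
  // Runs on the global pool; the caller gets a Future back immediately.
  def calcInt(x: Int): Future[Int] = Future(x * 5)

  // The exception is captured inside the Future instead of being thrown
  // at the caller; recover maps it to a fallback value.
  def safeDivide(a: Int, b: Int): Future[Int] =
    Future(a / b).recover { case _: ArithmeticException => 0 }

  def main(args: Array[String]): Unit = {
    // Blocking with Await is for demonstration only.
    println(Await.result(calcInt(10), 2.seconds))
    println(Await.result(safeDivide(1, 0), 2.seconds))
  }
}
```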

Traditional Request/Response

Client --(blocking)--> Server --(blocking)--> Service

Problems?

Reactive Request/Response

def getTweets = Action.async {
  WS.url("http://twitter.com/").get.map { response => Ok(response.body) }
}

Client --(non-blocking)--> Server --(non-blocking)--> Service

Reactive Composition

Async & Non-Blocking

def foo = Action.async {

val futureTS = WS.url("http://www.typesafe.com").get

val futureTwitter = WS.url("http://www.twitter.com").get

for {

ts <- futureTS

twitter <- futureTwitter

} yield Ok(ts.body + twitter.body)

}

• Futures Treated as Collections

• For Expression used to represent a “callback”
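The same composition can be tried with only the standard library. In this sketch `fetch` is a hypothetical stand-in for a non-blocking HTTP call, and `CompositionDemo` is an invented name.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CompositionDemo {
  // Hypothetical stand-in for a non-blocking HTTP call.
  def fetch(site: String): Future[String] = Future(s"<body of $site>")

  def combined(): Future[String] = {
    // Both futures are created before the for expression, so they run concurrently.
    val futureTS      = fetch("typesafe.com")
    val futureTwitter = fetch("twitter.com")
    // The for expression desugars to flatMap and map; the yield body is the
    // "callback" that runs once both results are available.
    for {
      ts      <- futureTS
      twitter <- futureTwitter
    } yield ts + twitter
  }

  def main(args: Array[String]): Unit =
    // Await is used only so the demo can print before exiting.
    println(Await.result(combined(), 2.seconds))
}
```

Creating the futures inside the for expression instead would run the two calls one after the other, because the second generator is only evaluated once the first has completed.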

Akka

• Actor Based Toolkit

• Simple Concurrency & Distribution

• Error Handling and Self-Healing

• Elastic and Decentralized

• Adaptive Load Balancing

What is an Actor?

• Isolated lightweight processes

• Message Based / Event Driven

• Non-Request Based Lifecycle

• Share nothing

• Isolated Failure Handling

• Same Semantics for Local and Remote

Akka Clustering

• Peer-to-peer based cluster membership service

• No single point of failure or single point of bottleneck.

• Automatic node failure detector

• Cluster Events / Cluster-Aware Routers

• Cluster Routing

• Cluster Sharding

Programming Actors

import akka.actor.{Actor, ActorLogging, ActorSystem, Props}

case class Greeting(who: String)
case class Departure(who: String)

class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who)  => log.info(s"Hello ${who}")
    case Departure(who) => log.info(s"Goodbye ${who}")
  }
}

val system = ActorSystem("MySystem")
val greeter = system.actorOf(Props[GreetingActor], name = "greeter")
greeter ! Greeting("Charlie Parker")

Location Transparency!

Akka Supervisor Hierarchies

• Parents send work to Children

• Router to Balance Work

• Parents supervise children actors

• Children delegate failure to parent

• Error-prone tasks delegated to children - "Error Kernel Pattern"

[Diagram: supervisor hierarchy tree with actors A through G]

Failure Recovery

• Supervisor hierarchies with "let-it-crash" semantics

• Lifecycle Monitoring

• Parent can resume, restart or terminate Child

• Error-prone tasks are delegated to child Actors - “Error Kernel Pattern”
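A parent that resumes, restarts or stops a failing child might look like the sketch below. It uses the classic Akka actor API (`OneForOneStrategy` and the `Resume`, `Restart` and `Stop` directives) and assumes `akka-actor` is on the classpath. The `Worker` and `Supervisor` actors and the mapping from exception type to directive are invented for illustration.

```scala
import akka.actor.{Actor, ActorRef, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Restart, Resume, Stop}
import scala.concurrent.duration._

// Child that performs the error-prone work, keeping failures away from the parent.
class Worker extends Actor {
  def receive = {
    case n: Int    => sender() ! (100 / n) // throws ArithmeticException when n == 0
    case s: String => throw new IllegalStateException(s)
  }
}

class Supervisor extends Actor {
  // The parent decides what happens when a child throws.
  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 3, withinTimeRange = 1.minute) {
      case _: ArithmeticException   => Resume  // keep the child and its state
      case _: IllegalStateException => Restart // replace it with a fresh instance
      case _: Exception             => Stop    // terminate the child
    }

  private val worker: ActorRef = context.actorOf(Props(classOf[Worker]), "worker")

  // The parent only delegates; it does no risky work itself.
  def receive = {
    case msg => worker forward msg
  }
}
```

To try it, create the parent with `system.actorOf(Props(classOf[Supervisor]), "supervisor")` and send it `0`, a `String`, and then a non-zero `Int`: the worker keeps answering after each failure because the strategy resumed or restarted it.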

Reference Architecture

[Diagram: Web Tier with two Reactive Servers, each shown with four UserActors; an Akka Router; Work Tier with Tweet Service, Geo Location and Data Service]

Architecting the Data Tier

It’s all Trade-offs

Intelligent Data

• Not just about Big Data or NoSQL

• Batch processing is dead! (à la Hadoop)

• Real-time data processing!

• Fluent API

• Integrated Batch, Iterative and Streaming Analysis!

The Event Log

• Append-Only Logging

• Database of Facts

• Disks are Cheap

• Why Delete Data any more?

• Replay Events

Akka Persistence Webinar

Domain Events

• Things that have completed, facts

• Immutable

• Verbs in past tense

  • CustomerRelocated

  • CargoShipped

  • InvoiceSent

• State Transitions
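The event log and domain events fit together roughly as follows: events are immutable facts, a state transition is a pure function of state and event, and replaying the log rebuilds the current state. `CargoShipped` comes from the list above; every other name in this sketch (`CargoBooked`, `CargoDelivered`, `CargoState`, `applyEvent`, `replay`) is hypothetical.

```scala
object EventLogDemo {
  // Domain events: immutable facts named with past-tense verbs.
  sealed trait CargoEvent
  final case class CargoBooked(id: String, origin: String) extends CargoEvent
  final case class CargoShipped(id: String, to: String)    extends CargoEvent
  final case class CargoDelivered(id: String)              extends CargoEvent

  final case class CargoState(location: Option[String] = None, delivered: Boolean = false)

  // A state transition is a pure function of (state, event).
  def applyEvent(state: CargoState, event: CargoEvent): CargoState = event match {
    case CargoBooked(_, origin) => state.copy(location = Some(origin))
    case CargoShipped(_, to)    => state.copy(location = Some(to))
    case CargoDelivered(_)      => state.copy(delivered = true)
  }

  // Replaying the append-only log from the start rebuilds the current state.
  def replay(log: Seq[CargoEvent]): CargoState =
    log.foldLeft(CargoState())(applyEvent)

  def main(args: Array[String]): Unit = {
    val log = Vector(CargoBooked("c1", "Oslo"), CargoShipped("c1", "Lisbon"))
    println(replay(log))
    println(replay(log :+ CargoDelivered("c1"))) // appending never rewrites earlier facts
  }
}
```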

“In general, application developers simply do not implement large scalable applications assuming distributed transactions.”

- Pat Helland, "Life beyond Distributed Transactions: an Apostate’s Opinion"

What is Cassandra?

Distributed Database

✓ Individual DBs (nodes)

✓ Working in a cluster

✓ Nothing is shared

[Diagram: Client and a Cassandra (C*) cluster]

Why Cassandra?

It’s Hugely Scalable (High Throughput)

Spark

• Clustered In-Memory Data Analytics

• Fault Tolerant Distributed Datasets

• Batch, iterative and streaming analysis

• In Memory Storage and Disk

• 2-5× less code

• 10x faster on disk, 100x faster in memory than Hadoop MR

Spark Cassandra Connector

• Loads data from Cassandra to Spark

• Writes data from Spark to Cassandra

• Implicit Type Conversions and Object Mapping

• Implemented in Scala (offers a Java API)

• Open Source

• Exposes Cassandra Tables as Spark RDDs + Spark DStreams (Soon)

Spark Cassandra Connector

• Data locality-aware (speed)

• Server-Side filters (where clauses)

• Cross-table operations (JOIN, UNION, etc.)

• Data transformation, aggregation, etc.

• Natural Time Series Integration

Intelligent Data Architecture

Initialization

val conf = new SparkConf(loadDefaults = true)
  .set("spark.cassandra.connection.host", "127.0.0.1")
  .setMaster("spark://127.0.0.1:7077")

val sc = new SparkContext(conf)

val table: CassandraRDD[CassandraRow] = sc.cassandraTable("keyspace", "tweets")

val ssc = new StreamingContext(sc, Seconds(30))
val stream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafka.kafkaParams, Map(topic -> 1), StorageLevel.MEMORY_ONLY)
stream.map(_._2).countByValue().saveToCassandra("demo", "wordcount")
ssc.start()
ssc.awaitTermination()

val sc = new SparkContext("local", "Inverted Index")
sc.textFile("data/crawl")
  .map { line =>
    val array = line.split("\t", 2)
    (array(0), array(1))
  }
  .flatMap { case (path, text) =>
    text.split("""\W+""") map { word => (word, path) }
  }
  .map { case (w, p) => ((w, p), 1) }
  .reduceByKey { (n1, n2) => n1 + n2 }
  .map { case ((w, p), n) => (w, (p, n)) }  // re-key by word so the groupBy pattern matches
  .groupBy { case (w, (p, n)) => w }
  .map { case (w, seq) =>

Architectural Principles

How to Fail

Shared Mutable State +

Locks / Thread Libraries

AVOID AT ALL COSTS!

Traditional Request/Response

Client --(blocking)--> Server --(blocking)--> Service

Problems?

Failure Recovery in Java/C/C# etc.

• SINGLE thread of control

• If the thread blows up, you are screwed!

• Explicit error handling WITHIN this single thread

• Errors do not propagate between threads, so there is NO WAY OF EVEN FINDING OUT that something has failed

Go Async

Never block

• ...unless you really have to

• Blocking kills scalability (and performance)

• Never sit on resources you don’t use

• Use non-blocking IO

Use Bulkheads

• Isolate the failure

• Compartmentalize

• Manage failure locally

• Avoid cascading failures
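One common way to build a bulkhead on the JVM is to give each downstream dependency its own bounded thread pool, so that a slow dependency can only exhaust its own threads. The sketch below assumes two made-up services, `slowService` and `fastService`. `Thread.sleep` deliberately simulates a blocking call, and the pool sizes are arbitrary.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BulkheadDemo {
  // One bounded pool per dependency: the compartments of the bulkhead.
  private val slowPool = Executors.newFixedThreadPool(2)
  private val fastPool = Executors.newFixedThreadPool(2)
  private val slowEc: ExecutionContext = ExecutionContext.fromExecutorService(slowPool)
  private val fastEc: ExecutionContext = ExecutionContext.fromExecutorService(fastPool)

  // Simulated blocking dependency; it can only tie up threads in slowPool.
  def slowService(): Future[String] = Future { Thread.sleep(500); "slow" }(slowEc)

  // Runs in its own compartment, unaffected by a saturated slowPool.
  def fastService(): Future[String] = Future("fast")(fastEc)

  def shutdown(): Unit = { slowPool.shutdown(); fastPool.shutdown() }

  def main(args: Array[String]): Unit = {
    // Four slow calls queue up on the two slowPool threads.
    (1 to 4).foreach(_ => slowService())
    // The fast call still completes promptly because it has its own pool.
    println(Await.result(fastService(), 1.second))
    shutdown()
  }
}
```

Had both services shared one two-thread pool, the fast call would have waited behind the four slow ones.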

Backpressure

• http://ferd.ca/queues-don-t-fix-overload.html

Handling Backpressure

• Fail Fast

• Circuit Breaker with default responses

• Load Shedding - Bounded Mailboxes

• Worker Pull Pattern vs. Push to Overload

• Throttling
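To make "fail fast" and "circuit breaker with default responses" concrete, here is a hand-rolled sketch using only the standard library. It is not Akka's own circuit breaker. The class name, its parameters and the injectable `now` clock are assumptions made for this example.

```scala
import scala.util.{Failure, Success, Try}

object BackpressureDemo {
  // After maxFailures consecutive failures the circuit opens: calls return
  // the default response without touching the service until resetMillis
  // has elapsed, after which one trial call is let through.
  final class CircuitBreaker[A](maxFailures: Int, resetMillis: Long, default: A,
                                now: () => Long = () => System.currentTimeMillis()) {
    private var failures = 0
    private var openedAt = 0L

    def isOpen: Boolean = synchronized {
      failures >= maxFailures && now() - openedAt < resetMillis
    }

    def call(body: => A): A =
      if (isOpen) default // fail fast: skip the struggling service entirely
      else Try(body) match {
        case Success(value) =>
          synchronized { failures = 0 } // a success closes the circuit
          value
        case Failure(_) =>
          synchronized {
            failures += 1
            if (failures >= maxFailures) openedAt = now() // open, or re-open after a failed trial
          }
          default
      }
  }

  def main(args: Array[String]): Unit = {
    val breaker = new CircuitBreaker[String](maxFailures = 2, resetMillis = 1000L, default = "fallback")
    println(breaker.call(throw new RuntimeException("service down")))
    println(breaker.call(throw new RuntimeException("service down")))
    println(breaker.isOpen)
    println(breaker.call("never evaluated while the circuit is open"))
  }
}
```

The caller always gets an answer immediately, which keeps requests from piling up behind a service that is already overloaded.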

Questions?

©DataStax 2015 – All Rights Reserved
