Assorted Topics on Scalable Distributed Services
- Sharma Podila, Senior Software Engineer
PS: Lots of things can, and will, fail at that scale
Topics
Managing Service Dependencies
Asynchronous Event Processing (RxJava)
Async IO
Mesos - DataCenter OS
Deployment and Build Automation
Managing Service Dependencies
Distributed architectures can have dozens of dependencies
Each can fail independently
Even 0.01% downtime on each of dozens of services can add up to hours of downtime a month if the system is not engineered for resilience
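As a rough worked example of the downtime figure above, the sketch below multiplies per-service uptimes together. The count of 30 services, the 99.99% uptime for each, the 30-day month, and the assumption that failures are independent are all illustrative choices, not numbers from the talk.

```java
final class CompoundAvailability {
    /** Probability that every dependency is up at once, assuming independent failures. */
    static double combinedUptime(double perServiceUptime, int services) {
        return Math.pow(perServiceUptime, services);
    }

    /** Expected hours of downtime over a period of the given length. */
    static double downtimeHours(double uptime, double periodHours) {
        return (1.0 - uptime) * periodHours;
    }

    public static void main(String[] args) {
        // Assumed inputs: 30 services at 99.99% uptime each, 30-day month.
        double uptime = combinedUptime(0.9999, 30);
        double hours = downtimeHours(uptime, 30 * 24);
        System.out.printf("combined uptime: %.3f%%%n", uptime * 100);
        System.out.printf("expected downtime per month: %.2f hours%n", hours);
    }
}
```

Under these assumptions the combined uptime comes out near 99.7%, which is a little over two hours of downtime in a month.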
[Diagram labels: request, Service A, Service B, Service C, Dependency, Dependency]
A Healthy Request Flow
When One Backend Service Becomes Latent
Threads and other resources can exhaust with high volume
Hystrix to the Rescue
Wrap calls to external systems in a dependency command object that runs in a separate thread
Time out calls that take longer than roughly the 99.5th percentile of observed latencies
Control the number of threads with a pool or semaphore
Measure success, and trip a circuit if needed
Perform fallback logic
Monitor metrics and make changes in real time
Hystrix Wiki
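The first few points in the list above can be sketched with plain java.util.concurrent. This is not the Hystrix API: DependencyCommand and its execute method are hypothetical names, and circuit breaking and metrics are left out. It only shows a call running on a bounded pool, a timeout, and a fallback.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical wrapper; Hystrix's own command classes do much more than this. */
final class DependencyCommand<T> {
    private final ExecutorService pool;   // bounded pool caps threads spent on this dependency
    private final long timeoutMillis;

    DependencyCommand(ExecutorService pool, long timeoutMillis) {
        this.pool = pool;
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the call on the pool; returns the fallback on timeout or failure. */
    T execute(Callable<T> call, Supplier<T> fallback) {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            future.cancel(true);
            return fallback.get();
        } catch (ExecutionException | TimeoutException e) {
            future.cancel(true);                // stop waiting on the latent call
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        DependencyCommand<String> command = new DependencyCommand<>(pool, 100);
        String fast = command.execute(() -> "live response", () -> "fallback");
        String slow = command.execute(() -> {
            Thread.sleep(2_000);                // simulated latent dependency
            return "late response";
        }, () -> "fallback");
        System.out.println(fast + " / " + slow);
        pool.shutdownNow();
    }
}
```

The caller never waits longer than the timeout, and a latent dependency can tie up at most the threads in its own pool rather than the whole application.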
Taming Tail Latencies of Service Calls
Real-time metrics show problems as they occur
Trends help configure timeouts
Set timeouts based on histogram data
99.5th percentile + buffer is a good start
Tier timeouts for retries on other servers (e.g., 5, 15, 30)
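One way to turn the "99.5th percentile + buffer" rule into a number is sketched below. It uses a nearest-rank percentile over raw samples; the class name, the sample data, and the 50 ms buffer are assumptions for illustration, and a production system would more likely read the percentile from its metrics histogram.

```java
import java.util.Arrays;

final class TimeoutFromLatencies {
    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] latenciesMillis, double pct) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** 99.5th percentile of the samples plus a fixed buffer. */
    static long suggestedTimeout(long[] latenciesMillis, long bufferMillis) {
        return percentile(latenciesMillis, 99.5) + bufferMillis;
    }

    public static void main(String[] args) {
        long[] samples = new long[1000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i + 1;                 // synthetic latencies of 1..1000 ms
        }
        System.out.println("suggested timeout (ms): " + suggestedTimeout(samples, 50));
    }
}
```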
Reactive Data/Event Stream Processing
Processing a Data/Event Stream
Iterator<T> iterator = dataStream.iterator();
while(iterator.hasNext()) { process(iterator.next()); }
What if dataStream represents an unbounded stream?
What if data comes over the network, with latencies and failures?
What if data comes from multiple sources?
How would you manage concurrency? Threads? Semaphores?
The RxJava implementation of Reactive Extensions addresses these questions
“...provides a collection of operators with which you can filter, select, transform, combine, and compose Observables. This allows for efficient execution and composition…”
RxJava
Java implementation of Reactive Extensions
A library for composing asynchronous and event-based programs by using observable sequences
Extends the observer pattern to support sequences of data/events and adds operators that allow you to compose sequences together declaratively while abstracting away concerns about things like low-level threading, synchronization, thread-safety, concurrent data structures, and non-blocking I/O
Event             Iterable (pull)     Observable (push)
retrieve data     T next()            onNext(T)
discover error    throws Exception    onError(Exception)
complete          returns             onComplete()
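The pull and push columns of the table can be put side by side in a small sketch. SimpleObserver is a hypothetical interface written for this example so that it needs no library; it mirrors the three callbacks in the table and is not RxJava's own Observer type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PullVersusPush {
    /** Hypothetical minimal observer mirroring the three callbacks in the table. */
    interface SimpleObserver<T> {
        void onNext(T value);
        void onError(Exception error);
        void onComplete();
    }

    /** Pull: the consumer drives the loop and asks for each element. */
    static <T> List<T> pull(Iterable<T> source) {
        List<T> seen = new ArrayList<>();
        Iterator<T> it = source.iterator();
        while (it.hasNext()) {
            seen.add(it.next());      // an error would surface here as a thrown exception
        }
        return seen;                  // completion is signalled by returning
    }

    /** Push: the producer drives the loop and calls back into the consumer. */
    static <T> void push(Iterable<T> source, SimpleObserver<T> observer) {
        try {
            for (T value : source) {
                observer.onNext(value);
            }
        } catch (Exception e) {
            observer.onError(e);      // errors are delivered as a callback, not thrown
            return;
        }
        observer.onComplete();        // completion is delivered as a callback
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println("pulled: " + pull(data));
        push(data, new SimpleObserver<String>() {
            @Override public void onNext(String value) { System.out.println("onNext => " + value); }
            @Override public void onError(Exception error) { System.out.println("onError => " + error); }
            @Override public void onComplete() { System.out.println("onComplete"); }
        });
    }
}
```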
RxJava Operator Examples
Example Code: Iterable and Observable
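// Iterable (pull): the consumer iterates over data held in local memory
getDataFromLocalMemory()
    .skip(10)
    .take(5)
    .map({ s -> return s + " transformed" })
    .forEach({ println "next => " + it })

// Observable (push): the same pipeline, with items delivered to the subscriber
getDataFromNetwork()
    .skip(10)
    .take(5)
    .map({ s -> return s + " transformed" })
    .subscribe({ println "onNext => " + it })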
Data can be pushed from multiple sources
No need to block for result availability
RxJava is a tool to react to pushed data
Java Futures are an alternative, but become non-trivial with nested async execution
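To illustrate the point about Futures, here is a minimal sketch of two dependent asynchronous calls made with plain java.util.concurrent.Future. The lookupUser and lookupOrders methods are hypothetical stand-ins for remote calls. Because the second call needs the result of the first, the calling thread has to block between them.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class NestedFutures {
    // Hypothetical stand-ins for two remote calls; the second depends on the first.
    static String lookupUser(int id) { return "user-" + id; }
    static String lookupOrders(String user) { return user + ":orders"; }

    static String ordersFor(ExecutorService pool, int id) {
        try {
            Callable<String> userCall = () -> lookupUser(id);
            Future<String> user = pool.submit(userCall);
            String resolvedUser = user.get();   // blocks until the first call finishes
            Callable<String> ordersCall = () -> lookupOrders(resolvedUser);
            Future<String> orders = pool.submit(ordersCall);
            return orders.get();                // blocks again for the dependent call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        System.out.println(ordersFor(pool, 7));
        pool.shutdown();
    }
}
```

Each additional level of dependency adds another blocking get() or another layer of hand-written coordination, which is the kind of plumbing that composing Observables is meant to remove.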
Async IO
Async IO with Netty, RxNetty
Netty is an NIO client-server framework (see Java IO vs. NIO)
Supports non-blocking IO
High throughput, low latency, less resource consumption
RxNetty is a Reactive Extensions adaptor for Netty
When using something like Netty, total #threads in the app = total #cores in the system
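The sizing rule above can be sketched with the standard library alone. This is not Netty's API, and Netty configures its own event loops; CoreSizedPool is a hypothetical helper that only shows a worker pool sized to the number of cores the JVM reports.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CoreSizedPool {
    /** Number of worker threads: one per core reported by the JVM, and at least one. */
    static int workerThreads() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    /** A fixed pool with one thread per core; suits work that never blocks a thread. */
    static ExecutorService newCoreSizedPool() {
        return Executors.newFixedThreadPool(workerThreads());
    }

    public static void main(String[] args) {
        ExecutorService pool = newCoreSizedPool();
        System.out.println("worker threads: " + workerThreads());
        pool.shutdown();
    }
}
```

The rule only holds when handlers never block: a single blocking call on one of these threads stalls every connection that thread serves.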
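RxNetty Server Example
// Imports assume RxNetty 0.3.x and RxJava 1.x package names; the class name is illustrative.
import io.netty.buffer.ByteBuf;
import io.reactivex.netty.RxNetty;
import io.reactivex.netty.protocol.http.server.HttpServerRequest;
import io.reactivex.netty.protocol.http.server.HttpServerResponse;
import io.reactivex.netty.protocol.http.server.RequestHandler;
import rx.Notification;
import rx.Observable;
import rx.functions.Func1;

import java.nio.charset.Charset;
import java.util.Map;

public final class RxNettyServerExample {
    public static void main(final String[] args) {
        final int port = 8080;
        RxNetty.createHttpServer(port, new RequestHandler<ByteBuf, ByteBuf>() {
            @Override
            public Observable<Void> handle(HttpServerRequest<ByteBuf> request, final HttpServerResponse<ByteBuf> response) {
                // Log the request line and headers
                System.out.println("New request received");
                System.out.println(request.getHttpMethod() + " " + request.getUri() + ' ' + request.getHttpVersion());
                for (Map.Entry<String, String> header : request.getHeaders().entries()) {
                    System.out.println(header.getKey() + ": " + header.getValue());
                }
                // Turn content events into Notifications so completion and errors are handled in one place
                return request.getContent().materialize()
                        .flatMap(new Func1<Notification<ByteBuf>, Observable<Void>>() {
                            @Override
                            public Observable<Void> call(Notification<ByteBuf> notification) {
                                if (notification.isOnCompleted()) {
                                    // Request body fully read: write the response
                                    return response.writeStringAndFlush("Welcome!!!");
                                } else if (notification.isOnError()) {
                                    return Observable.error(notification.getThrowable());
                                } else {
                                    // A chunk of request content: print it and keep reading
                                    ByteBuf next = notification.getValue();
                                    System.out.println(next.toString(Charset.defaultCharset()));
                                    return Observable.empty();
                                }
                            }
                        });
            }
        }).startAndWait();
    }
}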
Mesos
Mesos Cluster Manager
Resource allocation across distributed applications (aka Frameworks) on a shared pool of nodes
Akin to Google Borg
Pluggable isolation for CPU, I/O, etc. via Linux cgroups, Docker, etc.
Fault-tolerant leader election via ZooKeeper
Used at Twitter, Airbnb, etc.
Mesos Architecture
Mesos Resource Offers
Mesos Framework Development
Implement Framework Scheduler and Executor
Scheduler:
resourceOffers(SchedulerDriver driver, java.util.List<Offer> offers)
executorLost(SchedulerDriver driver, ExecutorID executorId, SlaveID slaveId, int status)
statusUpdate(SchedulerDriver driver, TaskStatus status)
… and more
Executor:
launchTask(ExecutorDriver driver, TaskInfo task)
killTask(ExecutorDriver driver, TaskID taskId)
… and more
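As a rough illustration of what a scheduler might do when offers arrive, the sketch below packs pending tasks into offers first-fit. Offer and TaskRequest here are hypothetical plain classes, not the Mesos protobuf types, and the matching policy is an assumption; a real framework would run logic like this inside resourceOffers and act on the result through the SchedulerDriver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class OfferMatching {
    /** Hypothetical stand-in for a resource offer from one node. */
    static final class Offer {
        final String id; final double cpus; final double memMb;
        Offer(String id, double cpus, double memMb) { this.id = id; this.cpus = cpus; this.memMb = memMb; }
    }

    /** Hypothetical description of a task waiting to be launched. */
    static final class TaskRequest {
        final String name; final double cpus; final double memMb;
        TaskRequest(String name, double cpus, double memMb) { this.name = name; this.cpus = cpus; this.memMb = memMb; }
    }

    /**
     * First-fit packing: walks each offer, assigns every pending task that still fits,
     * and removes assigned tasks from the pending list. Returns "task@offer" pairs.
     */
    static List<String> assign(List<Offer> offers, List<TaskRequest> pending) {
        List<String> launched = new ArrayList<>();
        for (Offer offer : offers) {
            double cpusLeft = offer.cpus;
            double memLeft = offer.memMb;
            Iterator<TaskRequest> it = pending.iterator();
            while (it.hasNext()) {
                TaskRequest task = it.next();
                if (task.cpus <= cpusLeft && task.memMb <= memLeft) {
                    cpusLeft -= task.cpus;
                    memLeft -= task.memMb;
                    launched.add(task.name + "@" + offer.id);
                    it.remove();
                }
            }
        }
        return launched;
    }

    public static void main(String[] args) {
        List<Offer> offers = Arrays.asList(new Offer("offer-1", 2, 4096), new Offer("offer-2", 1, 1024));
        List<TaskRequest> pending = new ArrayList<>(Arrays.asList(
                new TaskRequest("a", 1, 2048), new TaskRequest("b", 2, 1024), new TaskRequest("c", 1, 1024)));
        System.out.println("launched: " + assign(offers, pending));
        System.out.println("still pending: " + pending.size());
    }
}
```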
Mesos Framework Fault Tolerance
Mesos task reconciliation
Periodic heartbeats from executors
State engine
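// getJobId() and isTerminal() are hypothetical accessors on the task state type
taskStatesStream
    .groupBy(state -> state.getJobId())
    .flatMap(statesForJob -> statesForJob
        // stop tracking a job once it reaches a terminal state
        .takeWhile(state -> !state.isTerminal())
        // wait for 2 seconds without a further state change
        .debounce(2000, TimeUnit.MILLISECONDS)
        // then schedule a check for a task stuck in this state
        .doOnNext(state -> schedule(taskStuckInState(state), stateTimeout)))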
RxJava Style
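The same idea, detecting tasks that sit too long in a non-terminal state, can be sketched without RxJava. StuckTaskDetector is a hypothetical class that swaps the stream operators for a polled map of last-seen states. It does not reproduce the debounce step, and it takes timestamps as arguments so that the behaviour is easy to test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StuckTaskDetector {
    /** Last non-terminal state seen for a job and when it was seen. */
    private static final class Seen {
        final String state; final long atMillis;
        Seen(String state, long atMillis) { this.state = state; this.atMillis = atMillis; }
    }

    private final Map<String, Seen> latest = new HashMap<>();
    private final long stateTimeoutMillis;

    StuckTaskDetector(long stateTimeoutMillis) {
        this.stateTimeoutMillis = stateTimeoutMillis;
    }

    /** Records a state update; a terminal state ends tracking for that job. */
    void onState(String jobId, String state, boolean terminal, long nowMillis) {
        if (terminal) {
            latest.remove(jobId);
        } else {
            latest.put(jobId, new Seen(state, nowMillis));
        }
    }

    /** Jobs whose latest state is at least stateTimeoutMillis old, sorted by job id. */
    List<String> stuckJobs(long nowMillis) {
        List<String> stuck = new ArrayList<>();
        for (Map.Entry<String, Seen> entry : latest.entrySet()) {
            if (nowMillis - entry.getValue().atMillis >= stateTimeoutMillis) {
                stuck.add(entry.getKey());
            }
        }
        Collections.sort(stuck);
        return stuck;
    }

    public static void main(String[] args) {
        StuckTaskDetector detector = new StuckTaskDetector(5_000);
        detector.onState("job-1", "Launched", false, 0);
        detector.onState("job-2", "Started", false, 0);
        detector.onState("job-2", "Finished", true, 1_000);
        System.out.println("stuck at t=6000: " + detector.stuckJobs(6_000));
    }
}
```

This version is not thread-safe; a caller feeding it from several threads would need to synchronize access or use a concurrent map.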