![Page 1: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/1.jpg)
ID2210
Gautier Berthou (SICS)
Managing large clusters resources
![Page 2: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/2.jpg)
Big Data Processing with No Data Locality
Job(“/crawler/bot/jd.io/1”)
Workflow Manager
21 3 5 6 53 6 21 4 41 52 34 6
Compute Grid Node Job
submit
This doesn’t scale.Bandwidth is the bottleneck
1 6 3 2 54
![Page 3: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/3.jpg)
MapReduce – Data Locality
Job(“/crawler/bot/jd.io/1”)
Job Tracker
21 3 5 6 53 6 21 4 41 52 34 6
Task Tracker
Task Tracker
Task Tracker
Task Tracker
Task Tracker
Task Tracker
submit
Job Job Job Job Job Job
DN DN DN DN DN DN
R RR R = esultFile(s)
![Page 4: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/4.jpg)
First step
Hadoop 1.x
HDFS(distributed storage)
MapReduce(resource mgmt, job scheduler,
data processing)
Single Processing Framework
Batch Apps
06/05/2015 Managing large clusters resources, Gautier Berthou 4
![Page 5: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/5.jpg)
MapReduce – Data Locality
Job(“/crawler/bot/jd.io/1”)
Job Tracker
21 3 5 6 53 6 21 4 41 52 34 6
Task Tracker
Task Tracker
Task Tracker
Task Tracker
Task Tracker
Task Tracker
submit
Job Job Job Job Job Job
DN DN DN DN DN DN
R RR R = esultFile(s)
![Page 6: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/6.jpg)
The Job Tracker Job
•Distribute the map and reduce tasks on the nodes of the cluster
•Ensure fairness of the cluster resource attributions
•Track the progress of these tasks
•Authenticate job tenants and make sure that each job is isolated from the others
•Etc.
06/05/2015 Managing large clusters resources, Gautier Berthou 6
![Page 7: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/7.jpg)
Limitations of MapReduce [Zaharia’11]
Map
Map
Map
Reduce
Reduce
Input Output
•MapReduce is based on an acyclic data flow from stable storage to stable storage.
- Slow writes data to HDFS at every stage in the pipeline
•Acyclic data flow is inefficient for applications that repeatedly reuse a working set of data:
-Iterative algorithms (machine learning, graphs)
-Interactive data mining tools (R, Excel, Python)
![Page 8: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/8.jpg)
Limitations
•Only one programming model.
•The map reduce framework is not using the cluster at its maximum.
•The job tracker is a bottle neck.
•The job tracker is a single point of failure.
06/05/2015 Managing large clusters resources, Gautier Berthou 8
![Page 9: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/9.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Goals for a new Scheduler
•Being able to run different frameworks
•Scale
•Provide advance scheduling policies
•Run efficiently with different kind of workloads
9
![Page 10: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/10.jpg)
Second step
Hadoop 2.x
MapReduce(data processing)
HDFS(distributed storage)
YARN(resource mgmt, job scheduler)
Others(spark, mpi, giraph, etc)
Multiple Processing Frameworks
Batch, Interactive, Streaming …
06/05/2015 Managing large clusters resources, Gautier Berthou 10
![Page 11: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/11.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Examples of scheduling Policies
•Capacity scheduler:
- Applications have different levels of priorities.
•Fair scheduler:
- Applications have different levels of priorities.
- Used resources can be preempted.
•Reservation-based scheduler(1):
- Applications can indicate how long they will run and whenthey have to be finished.
11
(1) Reservation-based Scheduling: If you’re late don’t blame us!, C. Curino & al., Microsoft tech-report
![Page 12: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/12.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Scenario
•3 kinds of jobs:
- Emergency jobs: need to be run as soon as possible.
- Production jobs: have a deadline, a known running time and are very exigent on the nodes they can be scheduled on.
- Best effort jobs: interactive jobs that have lower priority, but on which users expect low latency.
12
![Page 13: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/13.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Capacity scheduler
13
Best effort
Best effort
Best effort
production
Best effort
emergencyBest effort
Best effort
NowProduction work deadline
production
![Page 14: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/14.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Fair scheduler
14
Best effort
Best effort
emergency
NowProduction work deadline
Best effort
Best effort
Best effort
production
![Page 15: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/15.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Reservation-based scheduler
15
Best effort
Best effort
emergency
NowProduction work deadline
Best effort
productionBest effort
production
![Page 16: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/16.jpg)
Scheduler Architectures
Omega: flexible, scalable schedulers for large compute clusters, Malte Schwarzkopf & al., EuroSys’13
06/05/2015 Managing large clusters resources, Gautier Berthou 16
![Page 17: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/17.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
The monolithic Scheduler
•Yarn:- Apache Hadoop YARN: Yet Another Resource Negotiator, V. K.
Vavilapalli & al., SoCC’13.
•Borg:- Large-scale cluster management at Google with Borg, A. Verma &
al., EuroSys’15.
17
![Page 18: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/18.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 1/3
18
![Page 19: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/19.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 2/3
19
Data node
Data node
Data node
Data node
Data node
Resources Manager
![Page 20: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/20.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 3/3
20
Data node
Data node
Data node
Data node
Data node
Resources ManagerStandby
Resources Manager
StandbyResources Manager
MasterResources Manager
zookeeper
MasterResources Manager
![Page 21: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/21.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Pros and Cons
•Pros:
- Fine knowledge of the state of the cluster state -> optimal use of the cluster resources.
- Easy to implement new scheduling policies.
•Cons:
- Bottle neck.
- The failure of the master scheduler has a big impact on the cluster usage.
21
![Page 22: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/22.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Two level Scheduler
•Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, B. Hindman & al., NSDI’11
22
![Page 23: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/23.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 1/2
23
![Page 24: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/24.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 2/2
24
Data node
Data node
Data node
Data node
Data node
Mesos Master
MapReduceScheduler
SparkScheduler
FlinkScheduler
Partial State
![Page 25: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/25.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Pros and Cons
•Pros:
- Scale out by adding schedulers.
- Concurrent scheduling of tasks.
•Cons:
- Suboptimal use of the cluster. Especially when there exist long running tasks.
25
![Page 26: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/26.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Shared State Scheduler
•Omega: flexible, scalable schedulers for large compute clusters, M. Schwarzkopf & al. EuroSys’13
26
![Page 27: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/27.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 1/2
27
![Page 28: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/28.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 2/2
28
Data node
Data node
Data node
Data node
Data node
State Manager
MapReduceScheduler
SparkScheduler
FlinkScheduler
Global state`
![Page 29: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/29.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Architecture 2/2
29
Data node
Data node
Data node
Data node
Data node
State Manager
MapReduceScheduler
SparkScheduler
FlinkScheduler
Global state Conflict
![Page 30: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/30.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Pros and Cons
•Pros
- Scalable.
- Good use of the cluster resources.
•Cons
- Unpredictable interaction between the different schedulers’ policies.
30
![Page 31: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/31.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Comparison
31
![Page 32: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/32.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Performance comparison 1/
32
Two-level
Simple monolithic Advence monolithic
Shared state
![Page 33: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/33.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Performance Comparison 2/
33
What the previous evaluation does not show about the Two-levelscheduling:
![Page 34: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/34.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Performance Comparison 3/
34
How to write a good paper.How to write an accepted paper.
Trying to handle more batch jobs in Omega by running several batch schedulers in parallel.
![Page 35: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/35.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Sum up
•Two-Level and Shared state Schedulers scale better.
•Shared state Schedulers use the cluster resources more optimally than Two-level Schedulers.
•Monolithic Scheduler are a potential Bottleneck.
•But as Monolithic schedulers are easier to design, allow finer allocation of resources and more advance scheduling policies, they are the ones used in practice.
35
![Page 36: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/36.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Making Yarn more scalable
•HOPS YARN: a one and a half level scheduler
36
![Page 37: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/37.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Hadoop Yarn HA Implementation
37
Data node
Data node
Data node
Data node
Data node
Resources ManagerStandby
Resources Manager
StandbyResources Manager
MasterResources Manager
zookeeper
MasterResources Manager
![Page 38: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/38.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Hops Yarn HA Implementation3/3
38
Data node
Data node
Data node
Data node
Data node
Resources ManagerStandby
Resources Manager
StandbyResources Manager
MasterResources Manager
MasterResources Manager
NDB
![Page 39: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/39.jpg)
39
•Distributed, In-memory
•2-Phase Commit
- Replicate DB, not the Log!
•Real-time
- Low TransactionInactive timeouts
•Commodity Hardware
•Scales out
- Millions of transactions/sec
- TB-sized datasets (48 nodes)
•Split-Brain solved with Arbitrator Pattern
•SQL and Native Blocking/Non-Blocking APIs
MySQL Cluster (NDB) – Shared Nothing DB
SQL API NDB API
30+ million update transactions/second
on a 30-node cluster
![Page 40: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/40.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Standby is boring
40
Data node
Data node
Data node
Data node
Data node
StandbyResources Manager
StandbyResources Manager
MasterResources Manager
![Page 41: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/41.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Dificulties 1/2
41
![Page 42: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/42.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Difficulties 2/2
•Pulling from the database when the state is needed is inefficient.
•Having an independent thread that regularly pull from the database is difficult to tune and cause lock problems.
42
![Page 43: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/43.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
Solution
•Luckily NDB has an event API.
43
![Page 44: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/44.jpg)
06/05/2015 Managing large clusters resources, Gautier Berthou
With streaming
44
Data node
Data node
Data node
Data node
Data node
StandbyResources Manager
StandbyResources Manager
MasterResources Manager
![Page 45: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/45.jpg)
Conclusion
•There exists three architectures for large cluster resource scheduling:
- Monolithic
- Two-levels
- Shared State
•Each of these architectures has pros and cons.
•The monolithic architectur is the one presently usedbecause it is easyer to use and develop.
•At KTH and SICS we are exploring the possibilitiesfor a new architecture ensuring more scalabilitywhile keeping the advantages of the monolithicarchitecture.
![Page 46: Managing large clusters resources - KTH...•Provide advance scheduling policies •Run efficiently with different kind of workloads 9. Second step Hadoop 2.x MapReduce (data processing)](https://reader034.vdocuments.net/reader034/viewer/2022042223/5ec999927b57b771a76679d8/html5/thumbnails/46.jpg)
References
•Reservation-based Scheduling: If you’re late don’t blame us!, C. Curino & al., Microsoft tech-report
•Omega: flexible, scalable schedulers for large compute clusters, Malte Schwarzkopf & al., EuroSys’13
•Apache Hadoop YARN: Yet Another Resource Negotiator, V. K. Vavilapalli & al., SoCC’13.
•Large-scale cluster management at Google with Borg, A. Verma & al., EuroSys’15.
•Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, B. Hindman & al., NSDI’11