P2P Control System based on Map/Reduce
Youngil Kim, Awalin Sopan, Sonia Ng Zeng
Outline
Introduction
System architecture
Implementation – HDFS
Implementation – System Analysis
◦ System Information Logger (SIL)
◦ System Information Gatherer (SIG)
◦ Map/Reduce
Implementation – Visualization
Implementation – P2P Application
Demo
Introduction
How can we get system information from many nodes?
◦ With too many nodes, it is hard to track which node has a problem
But DFS & Map/Reduce make it easy!
◦ Analyze system information using Map/Reduce
◦ A kind of network management system, like HP
System Architecture
[Figure: system architecture diagram. Labeled components: System Info Gatherer (Hadoop Master), Hadoop Slave Nodes, HDFS, System Manager (Visualization), and four nodes each with a local P2P app. and a Sys Info Logger. Labeled links: System Control Network, P2P Network, System Information.]
Implementation - HDFS
Hadoop for DFS & Map/Reduce Framework
◦ Master: brood00
◦ Slaves: currently tested with 5 nodes (bug51 ~ bug55)
◦ Using each node's local storage (not the home directory)
◦ Network ports: HDFS (9000), JobTracker (9001), NameNode interface (50070), JobTracker interface (50030)
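The port layout above can be written down in a few lines of Python. This is only a sketch: `PORTS`, `web_interfaces`, and `is_port_open` are illustrative names that do not come from the original system; only the host name `brood00` and the port numbers are taken from the slide.

```python
import socket

# Port numbers from the slide; the dictionary keys are illustrative names.
PORTS = {
    "hdfs": 9000,
    "jobtracker": 9001,
    "namenode_ui": 50070,
    "jobtracker_ui": 50030,
}


def web_interfaces(master="brood00"):
    """Return the URLs of the two web interfaces served by the master."""
    return {
        "namenode": f"http://{master}:{PORTS['namenode_ui']}/",
        "jobtracker": f"http://{master}:{PORTS['jobtracker_ui']}/",
    }


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

A quick check such as `is_port_open("brood00", PORTS["hdfs"])` could be run from any slave to confirm that the master is reachable before a job is submitted.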
Implementation - System Analysis
System Information Logger (SIL)
mr_syslog.py
◦ Implemented in Python
◦ Saves information in both local storage and HDFS
◦ Gathers information about every 10 secs
◦ Creates log files based on time
Information of each node is saved in the following format
◦ < 20110501_2252_bug51.log >
◦ bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
◦ bug51 1304304724: mem(75.50), cpu(1.50), disk(10.00)
◦ bug51 1304304727: mem(75.51), cpu(0.40), disk(10.00)
◦ bug51 1304304729: mem(75.51), cpu(0.50), disk(10.00)
◦ bug51 1304304732: mem(75.50), cpu(0.50), disk(10.00)
◦ bug51 1304304734: mem(75.50), cpu(0.40), disk(10.00)
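The log format above can be reproduced with two small helpers. This is a sketch of the formatting only, not the actual mr_syslog.py: the function names are hypothetical, and how the real logger samples memory, CPU, and disk usage is not shown in the slides.

```python
from datetime import datetime


def logfile_name(node, when):
    """Build a time-based log file name such as 20110501_2252_bug51.log."""
    return f"{when:%Y%m%d_%H%M}_{node}.log"


def format_record(node, timestamp, mem, cpu, disk):
    """Format one sample as '<node> <unix time>: mem(..), cpu(..), disk(..)'."""
    return (
        f"{node} {timestamp}: "
        f"mem({mem:.2f}), cpu({cpu:.2f}), disk({disk:.2f})"
    )


if __name__ == "__main__":
    print(logfile_name("bug51", datetime(2011, 5, 1, 22, 52)))
    print(format_record("bug51", 1304304720, 75.5, 1.0, 10.0))
```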
System Information Gatherer (SIG)
Functions
◦ Finds the current resource usage of each node using Map/Reduce
    Currently, it shows the maximum values per one-minute time slot
◦ Acts as the communication gateway between the nodes and the visualization tool
    Sends "QUERY" to each P2P application
    Sends node status to the visualization tool (node ID, (in)active, CPU usage, memory usage, storage)
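One way to form the per-minute time slots is to round each Unix timestamp down to the start of its minute. The sketch below assumes that approach; the slides do not say whether the gatherer derives slots from timestamps or from the per-minute log file names, and `minute_slot` and `group_by_slot` are hypothetical names.

```python
def minute_slot(timestamp):
    """Round a Unix timestamp down to the start of its minute."""
    return timestamp - timestamp % 60


def group_by_slot(records):
    """Group (node, timestamp, values) records by (node, minute slot)."""
    slots = {}
    for node, timestamp, values in records:
        key = (node, minute_slot(timestamp))
        slots.setdefault(key, []).append(values)
    return slots
```

With the sample log shown earlier, all six records of bug51 (timestamps 1304304720 to 1304304734) fall into the single slot that starts at 1304304720.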
Map/Reduce
Map:
◦ Input: each node's log file
    Key: position in file
    Value: raw data, one line per key
◦ Output
    Key: node ID
    Value: set of system information (CPU/memory/storage usage)
    Eg: < bug51, [30.0, 29.0, 12.0] >
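The map step can be sketched as a plain Python function that turns one raw log line into a (node ID, values) pair. This is an illustration, not the project's mapper: `map_line` and `LINE_RE` are hypothetical names, the function is not tied to any Hadoop API, and the input key (position in file) is simply ignored. The output order [CPU, memory, storage] follows the slide.

```python
import re

# Matches lines such as:
#   bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
LINE_RE = re.compile(
    r"^(?P<node>\S+) (?P<ts>\d+): "
    r"mem\((?P<mem>[\d.]+)\), cpu\((?P<cpu>[\d.]+)\), disk\((?P<disk>[\d.]+)\)$"
)


def map_line(line):
    """Return (node ID, [cpu, mem, disk]) for one log line, or None if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    values = [float(match["cpu"]), float(match["mem"]), float(match["disk"])]
    return match["node"], values
```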
Map/Reduce
Reduce:
◦ Input: from Map
    Key: node ID
    Value: set of sets of system information
    Eg: < bug51, [ [30.0, 29.0, 12.0], [33.0, 40.0, 9.0], … ] >
◦ Output
    Key: node ID
    Value: maximum value for each piece of information
    Eg: < bug51, [33.0, 40.0, 12.0] >
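The reduce step amounts to an element-wise maximum over all value lists of one node. A minimal sketch, using the hypothetical name `reduce_node` and no Hadoop API:

```python
def reduce_node(node, values):
    """Return (node ID, element-wise maximum of all value lists)."""
    # zip(*values) regroups the lists by position: all CPU values,
    # then all memory values, then all storage values.
    return node, [max(column) for column in zip(*values)]


if __name__ == "__main__":
    print(reduce_node("bug51", [[30.0, 29.0, 12.0], [33.0, 40.0, 9.0]]))
```

Applied to the two value lists from the slide, this yields < bug51, [33.0, 40.0, 12.0] >, matching the example output.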
Implementation - Visualization
Implementation – P2P Application
Not a real application
◦ Only shows how to control an application or system on each node through the visualization tool
◦ Only has STOP/RESUME operations
Functions
◦ Responds to "QUERY": shows active/inactive
◦ Responds to "CONTROL": changes status based on the control argument
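The QUERY/CONTROL behavior described above can be sketched as a tiny state machine. The message syntax here ("QUERY", "CONTROL STOP", "CONTROL RESUME") and the reply strings are assumptions made for illustration; the slides name only the two message types and the STOP/RESUME operations, not a wire format. `P2PApp` is a hypothetical class name.

```python
class P2PApp:
    """Toy application that can be queried and stopped or resumed."""

    def __init__(self):
        self.active = True

    def handle(self, message):
        """Process one text message and return a reply string."""
        parts = message.split()
        if parts == ["QUERY"]:
            return "active" if self.active else "inactive"
        if len(parts) == 2 and parts[0] == "CONTROL":
            if parts[1] == "STOP":
                self.active = False
            elif parts[1] == "RESUME":
                self.active = True
            else:
                return "error"  # unknown control argument
            return "ok"
        return "error"  # unknown message type
```

In use, a gatherer would call something like `app.handle("CONTROL STOP")` followed by `app.handle("QUERY")`, which would then report the node as inactive.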
Demo
System set-up and initialization (video file)
Show NameNode & JobTracker interfaces
Show Map/Reduce jobs
Show visualization tool
◦ Changes of each status
◦ Control each P2P application