ci report for nsf visit

Post on 23-Jan-2018

Building a Scalable, Cost-effective Cyberinfrastructure for Multidisciplinary Scientific Research in Big Data Era

Students: Xiangron Ma and Zhao Fu

Advisors: Yingtao Jiang and Mei Yang

University of Nevada, Las Vegas

Department of Electrical and Computer Engineering

Overview

• Infrastructure

• Distributed Storage System

• Connectivity

• Processing and Visualization Framework

• User Interface

• Research Results and Plan

• Education/Outreach/Broader Impact

Infrastructure

• Hardware

• System Performance

• Network Architecture

• Virtualization Resources

Hardware

Unit | Qty. | Details (per unit)
Head node | 1 | Processor: 2X 6-Core 2.4 GHz Xeon; RAM: 64 GB; Disk: 2X 1.2 TB SAS; Network: 2X 10 GE, 1X 10 GE QSFP+
Compute/Storage node | 10 | Processor: 2X Quad-Core 2.5 GHz Xeon; RAM: 32 GB; Disk: 4X 1 TB SATA, 1X 150 GB SAS; Network: 2X 10 GE
Ethernet switch | 1 | 48-port 10GE RJ45; 4-port 40GE QSFP+
Misc. | N/A | Enclosure, Ethernet cables, accessories, KVM, labor

Table 1. Hardware Specification
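As a quick cross-check of Table 1, the per-unit figures can be totalled across the cluster. The sketch below is only an illustration: the dictionary layout and the function name `cluster_totals` are assumptions, 150 GB is counted as 0.15 TB, and the disk total is raw capacity before any replication or virtualization overhead.

```python
# Per-unit figures copied from Table 1; the data layout is an illustrative assumption.
NODES = {
    "head": {"qty": 1, "sockets": 2, "cores_per_socket": 6, "ram_gb": 64,
             "disks_tb": [1.2, 1.2]},
    "compute_storage": {"qty": 10, "sockets": 2, "cores_per_socket": 4, "ram_gb": 32,
                        "disks_tb": [1.0, 1.0, 1.0, 1.0, 0.15]},
}


def cluster_totals(nodes):
    """Sum physical cores, RAM, and raw disk capacity over all units."""
    cores = 0
    ram_gb = 0
    disk_tb = 0.0
    for spec in nodes.values():
        cores += spec["qty"] * spec["sockets"] * spec["cores_per_socket"]
        ram_gb += spec["qty"] * spec["ram_gb"]
        disk_tb += spec["qty"] * sum(spec["disks_tb"])
    return {"cores": cores, "ram_gb": ram_gb, "raw_disk_tb": round(disk_tb, 2)}


totals = cluster_totals(NODES)
print(totals)
```

Under these assumptions the cluster totals 92 physical cores, 384 GB of RAM, and about 43.9 TB of raw disk.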

System Performance

• System Performance Evaluation

Fig. 1. Benchmark results (Parsec.Bodytrack). Series: Base and Opt; x-axis: 1, 2, 4, 8, and 16 cores; y-axis range: 0 to 9.

Fig. 2. Benchmark results (Matrix Multiplication)
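The slide text does not say which metric the benchmark figures plot. One common way to summarize runs at 1, 2, 4, 8, and 16 cores is speedup relative to the single-core run, together with parallel efficiency. The sketch below illustrates that calculation only; the function name `speedup_table` and the timings are hypothetical and are not taken from the figures.

```python
def speedup_table(timings):
    """Map core count -> (speedup, efficiency) relative to the 1-core run.

    speedup(n) = T(1) / T(n); efficiency(n) = speedup(n) / n.
    """
    t1 = timings[1]
    return {n: (t1 / t, t1 / t / n) for n, t in sorted(timings.items())}


# Hypothetical wall-clock times in seconds, NOT measurements from the figures.
example = {1: 80.0, 2: 40.0, 4: 20.0, 8: 16.0, 16: 10.0}

for cores, (speedup, efficiency) in speedup_table(example).items():
    print(f"{cores:>2} cores: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

An efficiency well below 1.0 at higher core counts would indicate that the workload is no longer scaling linearly.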

Network Architecture

Networks
• External Network: 192.168.0.0/28
• Management Network: 10.1.1.1 – 10.1.1.11
• Tunnel Network: 10.1.1.102 – 10.1.1.111
• IPMI Network: 10.1.0.101 – 10.1.0.111
• Internet access: Firewall, NAT, Gateway 10.1.0.1/23

Nodes
• Namenode (n1): Interface 1 192.168.1.3/28; Interface 2 10.1.1.2/23
• Controller/Network Node (n2): Interface 1 10.1.1.2/24; Interface 2 10.1.1.102/24; IPMI Interface 10.1.0.102/23
• Compute/Network Node (n3): Interface 1 10.1.1.3/24; Interface 2 10.1.1.103/24; IPMI Interface 10.1.0.103/23
• ...
• Compute/Network Node (n11): Interface 1 10.1.1.11/24; Interface 2 10.1.1.111/24; IPMI Interface 10.1.0.111/23

Fig. 3. Network Architecture in NRDC
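The addresses shown for n2, n3, and n11 in Fig. 3 appear to follow a regular pattern: the last octet of each interface is derived from the node number. The sketch below generates that plan with Python's standard `ipaddress` module. The offset rule, the constant names, and the function `node_addresses` are assumptions inferred from the three nodes shown, not a documented configuration.

```python
import ipaddress

# Base addresses inferred from Fig. 3 (assumption: last octet = base + node number).
MGMT_BASE = ipaddress.ip_address("10.1.1.0")       # Interface 1, management network
TUNNEL_BASE = ipaddress.ip_address("10.1.1.100")   # Interface 2, tunnel network
IPMI_BASE = ipaddress.ip_address("10.1.0.100")     # IPMI network
GATEWAY_NET = ipaddress.ip_network("10.1.0.0/23")  # network of gateway 10.1.0.1/23


def node_addresses(n):
    """Return the inferred addresses for node n (n2 through n11)."""
    if not 2 <= n <= 11:
        raise ValueError("the pattern is only shown for n2..n11")
    return {
        "management": MGMT_BASE + n,
        "tunnel": TUNNEL_BASE + n,
        "ipmi": IPMI_BASE + n,
    }


plan = {f"n{n}": node_addresses(n) for n in range(2, 12)}
for name, addrs in plan.items():
    # Every generated address should fall inside the gateway's /23 network.
    assert all(addr in GATEWAY_NET for addr in addrs.values())
    print(name, {kind: str(addr) for kind, addr in addrs.items()})
```

For n3, for example, this yields 10.1.1.3, 10.1.1.103, and 10.1.0.103, matching the three interfaces listed for that node in the figure.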

Distributed Storage System

Figure components:
• Physical Storage Cluster: compute nodes with HDFS volumes and Cinder volumes; Physical Storage Namespace (Name Node); Cinder Storage Service
• Virtual Storage Cluster: instances with VHDFS volumes; Virtual Storage Namespace
• Nova Compute Service
• Neutron Network Service

Fig. 4. Distributed Storage System Architecture

Virtualization Resources

• OpenStack Dashboard

Fig. 5. Cloud Computing Resources

Connectivity

Query APIs
• list_sites(): enumerate all deployed sites (sensor towers and cameras).
• list_properties(site_id): enumerate all monitored properties available at the specified site.
• list_streams(site_id): enumerate the camera streams.
• list_image_sites_names(): list the names of all sites in the storage system.

Transfer APIs
• get_sensor_data(sensor_ids): download sensor data from NRDC; returns a dataframe.
• get_csv(sensor_id): download sensor data and save it as a CSV file.
• import_images(siteid, presets, startdate, enddate, timerange, savedir): download images taken in the specified direction at one site within the given time range.

Synchronization APIs
• db_sync(): synchronize the entire remote database to HDFS.

Table 2. Connectivity API
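A typical session with the calls in Table 2 might discover a site, list its properties, and then download the readings. Since the real client is not reproduced here, the sketch below uses a stand-in class: the method names follow Table 2, but `FakeConnectivityClient`, its sample data, the use of property names as sensor IDs, and the plain-tuple return type (instead of a dataframe) are all assumptions made to keep the example self-contained.

```python
# Stand-in for the connectivity client: method names follow Table 2, but the
# class, the sample data, and the return shapes are illustrative assumptions.
class FakeConnectivityClient:
    def __init__(self):
        # Hypothetical site and readings, for illustration only.
        self._sites = {"site_a": ["air_temperature", "wind_speed"]}
        self._data = {
            "air_temperature": [("2018-01-01T00:00", 3.5),
                                ("2018-01-01T01:00", 4.5)],
        }

    def list_sites(self):
        """Enumerate the known site IDs."""
        return sorted(self._sites)

    def list_properties(self, site_id):
        """Enumerate the monitored properties of one site."""
        return list(self._sites[site_id])

    def get_sensor_data(self, sensor_ids):
        """Return (sensor_id, timestamp, value) rows for the given sensors.

        Table 2 says a dataframe is returned; plain tuples keep this
        sketch free of third-party dependencies.
        """
        return [(sid, ts, value)
                for sid in sensor_ids
                for ts, value in self._data.get(sid, [])]


client = FakeConnectivityClient()
site = client.list_sites()[0]
props = client.list_properties(site)
rows = client.get_sensor_data(props)

temps = [value for sid, _, value in rows if sid == "air_temperature"]
mean_temp = sum(temps) / len(temps)
print(site, props, len(rows), mean_temp)
```

The same discover-then-download flow would apply to images, with import_images taking the site, direction presets, and time range instead of sensor IDs.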

Processing and Visualization Framework

Figure components:
• Services: Storage Service, Connectivity Service, Processing & Visualization Service
• Data and cluster tools: Hadoop, Hive, Apache Spark (cluster)
• Connectivity API (REST client)
• Domain-specific libraries
• Numerical and analysis tools: GNU Octave, R, SKLearn, SciPy, SymPy, NumPy
• Visualization libraries (Matplotlib, ggplot, plotly, etc.)
• IPython Kernel
• IPython Notebook

Fig. 6. Processing and Visualization Framework

User Interface

• IPython Notebook
– Code execution
– Rich text
– Visualization
– Rich media
– Documentation

Fig. 7. User Interface

Research Results and Plan

• Research
– Publications
  • Published 2 journal papers and 3 conference papers
  • Submitted 1 conference paper; 2 journal papers in preparation
– Proposals
  • Submitted two NSF grant proposals and one proposal to Toyota
  • One proposal to DoD, in collaboration with a Nexus researcher, is under preparation
– Future plan
  • Service engagement
  • Further optimization/improvement
  • CI-enabled data mining

Education/Outreach/Broader Impact

• Results
– Hosted two CI workshops to train Nexus researchers and UNLV students
– Trained more than 35 graduate students
– Engaged 3 undergraduate students in development work

• Future plan
– Develop more workshops/tutorials for Nexus researchers
– Engage more Nexus researchers to use the CI node in their research work
– Better user management
– Public user support

Demo
