Marist College Enterprise Computing Research Lab
Joint Studies / Marist College Enterprise Computing Research Lab Undergraduate Projects Howard Baker, Andrew Evans, Chris Cordisco, Junaid Kapadia June 11, 2013


Page 1: Marist College Enterprise Computing Research Lab

Joint Studies/

Marist College Enterprise Computing Research Lab

Undergraduate Projects Howard Baker, Andrew Evans, Chris Cordisco, Junaid Kapadia

June 11, 2013

Page 2: Marist College Enterprise Computing Research Lab

IBM and Marist College Joint Studies

IBM Gets

Product Ecosystem

• Collaborative Research Projects

• Tools, S/W, Libraries

• Conference presentations, demos, workshops, courseware

Talent Pool

• Recruit top talent

Sales

• Direct Sales

• Sales Enablement – Briefings/Reference Acct

IP/JDA

• Identify Intellectual Property (IP) and/or Joint Development Agreement (JDA) opportunities

Marist Gets

Funding (in-kind/cash)

• SUR Grants

• Faculty Awards

Industry and/or Government Partners: ADVA, NEC, CIENA, Brocade, NSF, Ellucian, Sakai Community, NY State, BigSwitch

Recruitment

• IBM Internships

• Full-time hire – Many Marist grads at IBM

Hardware

• Access to IBM hardware/software; Marist in turn provides access to many other schools (z and p)

Research

• Collaborative Research Projects

• Cutting edge technologies – Software Defined Networking

• Areas of mutual interest

Key Focus Areas

Education for a Smarter Planet/Smarter Classrooms

• Open source systems (Sakai enabled on IBM HW/SW, Integration w/existing ERP systems)

• Cloud Computing (K-12 SaaS hosting, Virtualization, Project Greystone)

• Analytic Tools (Cognos, SPSS, OAAI project)

• Virtual Computing Lab (VCL)

Smarter/Dynamic Infrastructure

• Technology infrastructure (e.g. systems/data centers, OpenFlow, PaaS)

• Virtualization (e.g. hybrid multi-core systems, cloud computing, server provisioning)

Course Development (25+ courses)

• z/OS, AIX/Power

• Converged Networking, SDN

• Cloud/Mobile App Development

• Business Intelligence/Analytics

Page 3: Marist College Enterprise Computing Research Lab

Student Intern Team

• Ryan Flaherty - OpenFlow - Grad CS

• Andrew Evans - Analytics - Senior CS

• Chris Cordisco - NSF Intern - Grad CS

• Kevin Pietrow - OpenFlow - Junior CS

• Junaid Kapadia - ECRL Projects - Senior IT

• Rebecca Murphy - Digital Archives - Junior CS

• Mary Miller - OpenFlow - Sophomore CS

• Devin Young - OpenFlow - Senior CS

• Zachary Meath - OpenFlow - Sophomore CS

Page 4: Marist College Enterprise Computing Research Lab

MARIST ECRL zEnterprise

• Junaid Kapadia

• Christopher Cordisco

Page 5: Marist College Enterprise Computing Research Lab

zEnsemble Layout

1. Management network

2. Intranode management network

3. Intranode management network - extension

4. Intraensemble data network

5. FC-attached disk storage

Page 6: Marist College Enterprise Computing Research Lab

zBX Specifications

• 2 PS701 Power blades

  • 64 GB of memory each

  • Eight 3.0 GHz processors; 64 "virtual processors" each blade

  • 300 GB internal HDD

• 2 HX5 X blades

  • Eight 8 GB memory kits (total 64 GB memory) each

  • Two 2.13 GHz processors with eight cores; total 16 cores (virtual processors) each

  • 100 GB SSD internal disk

Page 7: Marist College Enterprise Computing Research Lab

System z Breakdown

[Diagram: z114 divided into LP1, LP2, and LP3, running z/VM and z/OS. Labeled servers: SUSE Linux on z (Ron Coleman); SUSE Linux Cognos Server; SUSE Linux Cognos database; Extra Redhat Server; Extra SUSE Server; IBM COP SUSE Server; Linux Router; DB2; z/VM.]

Page 8: Marist College Enterprise Computing Research Lab

zBX Breakdown

[Diagram: zBX with a Power VM Hypervisor (P.Blade 1.1, P.blade 1.2) and an X Hypervisor (X.blade 1.05, X.blade 1.04). Labeled servers: NIM; Matt Johnson AIX; z/OS IEDN OSA server (two); R. Coleman P blade Scaly; Scott Frank – Underwater Acoustics; Windows Deployment Server; Johns Hopkins Research; Cognos Framework Manager; Eitel Lauria - OAAI; Tivoli Monitoring; zSentinel; Ron Coleman - Scaly; Cognos Express Server; zBX OpenFlow Controller; VCL Management Node; SuSE PXE; Network File System; DayTrader.]

Page 9: Marist College Enterprise Computing Research Lab

Drupal on z114 – First Time Ever!

Page 10: Marist College Enterprise Computing Research Lab

Sakai on z114 – Mid-East Universities

Page 11: Marist College Enterprise Computing Research Lab

zSentinel

By Christopher Cordisco

Page 12: Marist College Enterprise Computing Research Lab

Overview

• The zEnterprise is a powerful system capable of running many applications concurrently.

• zSentinel is a non-interactive web application that presents a simplified, real-time view of zEnterprise resource utilization on a single webpage.

• It uses REST requests to gather data from the system, and uses a Python web server to send the data to a browser.

• The browser displays the data using a combination of traditional web elements and Google Charts.

• zSentinel provides a visual layout of the servers running on each host of the system, a tabular layout with details on each server, and an overview of processor utilization of the mainframe itself.

• zSentinel is not interactive; it is designed to run on an external monitor and give an overview of the system at a single glance.

Page 13: Marist College Enterprise Computing Research Lab

Page 14: Marist College Enterprise Computing Research Lab

Tree Map

• The tree map gives a visual representation of each host on the system.

• Each node represents a virtual server.

• The size of the node corresponds to the number of processors allocated to the server.

• The color of a node corresponds to the current resource utilization of the server.

• It cycles through each host on the system, showing the servers on each at a predefined interval.
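
The mapping from servers to tree map nodes can be sketched as follows. This is a minimal illustration and not the zSentinel source: the sample records, the field names `processors` and `utilization`, and the function `treemap_rows` are hypothetical. The row layout (node, parent, size value, color value) follows the input format that the Google Charts TreeMap accepts.

```python
def treemap_rows(host, servers):
    """Build TreeMap rows for one host: [node, parent, size, color value]."""
    # Header row, then the root node (the host), which has no parent.
    rows = [["Server", "Parent", "Processors", "Utilization"],
            [host, None, 0, 0]]
    for s in servers:
        # Size: processors allocated; color: current utilization in percent.
        rows.append([s["name"], host, s["processors"], s["utilization"]])
    return rows


# Hypothetical sample data for a single host.
sample = [
    {"name": "cognos-express", "processors": 4, "utilization": 35.0},
    {"name": "vcl-mgmt", "processors": 2, "utilization": 80.5},
]
rows = treemap_rows("X.blade 1.04", sample)
```

Cycling through hosts then amounts to calling such a function for the next host each time the display interval elapses.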

Page 15: Marist College Enterprise Computing Research Lab

Table & Gauges

• The table gives a more detailed view of the hosts on the system and of each server on each host.

• It gives the status of each server, as well as the resources allocated to it and its resource utilization.

• The gauges show the current processor usage of the Central Processing Complex, broken down into the shared and dedicated resources of both the IFLs and CPs.

Page 16: Marist College Enterprise Computing Research Lab

Behind The Scenes

Page 17: Marist College Enterprise Computing Research Lab

The Application

• The application's web server is written in Python.

• The server responds to simple HTTP requests by returning the resources necessary to load zSentinel in a web browser.

• The client uses JavaScript in conjunction with Google Charts to query the web server for the most recent information and display it on the zSentinel webpage.

• The server responds to the queries using a Google Charts wrapper for Python, which returns the data in the appropriate format to be displayed on the charts.
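
A server of this shape can be sketched with Python's standard library. This is an assumption-laden illustration, not the zSentinel code: it uses the Python 3 `http.server` module, and the `/data/<key>` URL scheme, the `CACHE` contents, and the page in `INDEX_HTML` are all hypothetical. It shows the two kinds of request described above: one that loads the page and one that returns the most recent data.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Latest statistics, refreshed elsewhere (for example by a background poller).
CACHE = {"cpu": {"shared": 42.0, "dedicated": 17.5}}

# Hypothetical page; a real page would load Google Charts and poll /data/...
INDEX_HTML = "<html><body><div id='charts'></div></body></html>"


def render(path, cache):
    """Map a request path to (status, content type, body)."""
    if path == "/":
        return 200, "text/html", INDEX_HTML
    if path.startswith("/data/"):
        key = path[len("/data/"):]
        if key in cache:
            return 200, "application/json", json.dumps(cache[key])
    return 404, "text/plain", "not found"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Delegate routing to render() so it can be checked without a socket.
        status, ctype, body = render(self.path, CACHE)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the server until interrupted (not called in this sketch)."""
    HTTPServer(("", port), Handler).serve_forever()
```

Keeping the routing in a plain function separates what is returned from how it is sent, which makes the response logic easy to check on its own.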

Page 18: Marist College Enterprise Computing Research Lab

Unified Resource Manager

• The Python application queries the Unified Resource Manager (URM) API for the most recent resource statistics of the zEnterprise using Python's built-in httplib. These queries run asynchronously with respect to the client's requests.

• After logging in, the URM returns a session ID, which is then sent with all other requests to query data.

• The URM responds to the requests with JSON objects.

• Python parses the JSON objects into Python dictionaries, which are then sent to the client as a Google Chart DataTable object.

• Metrics that need to be updated at shorter intervals, such as processor utilization, are returned in a comma-delimited string for efficiency.
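
The session handling described above can be sketched as follows. Everything specific here is a placeholder rather than the real URM API: the paths `/login` and `/cpc/utilization`, the `session-id` field, the `X-Session-Id` header, and the metric names are hypothetical. The HTTP exchange is injected as a `send` function so the sketch runs without a mainframe; a real transport would issue HTTPS requests, for example with `http.client` (the Python 3 name of `httplib`).

```python
import json


class UrmClient:
    """Session-based client; `send` performs the actual HTTP exchange."""

    def __init__(self, send):
        # send(method, path, headers, body) -> response text (JSON)
        self._send = send
        self._session_id = None

    def login(self, user, password):
        body = json.dumps({"userid": user, "password": password})
        reply = json.loads(self._send("POST", "/login", {}, body))
        self._session_id = reply["session-id"]  # hypothetical field name

    def get(self, path):
        if self._session_id is None:
            raise RuntimeError("login() must be called first")
        # The session ID accompanies every request after login.
        headers = {"X-Session-Id": self._session_id}  # hypothetical header
        return json.loads(self._send("GET", path, headers, None))


def to_csv(metrics, keys):
    """Pack frequently refreshed metrics into one comma-delimited string."""
    return ",".join(str(metrics[k]) for k in keys)


def fake_send(method, path, headers, body):
    """Stand-in for the URM so the sketch is self-contained."""
    if path == "/login":
        return json.dumps({"session-id": "abc123"})
    assert headers.get("X-Session-Id") == "abc123"
    return json.dumps({"cp-shared": 41, "cp-dedicated": 12, "ifl-shared": 63})


client = UrmClient(fake_send)
client.login("monitor", "secret")
stats = client.get("/cpc/utilization")  # hypothetical path
line = to_csv(stats, ["cp-shared", "cp-dedicated", "ifl-shared"])
```

The comma-delimited form drops the key names, so client and server must agree on the order of the values in advance.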

Page 19: Marist College Enterprise Computing Research Lab

Google Charts

• Data is supplied to a Google Chart through a DataTable object.

• A DataTable object contains all the necessary information to create any of the charts provided through the API.

• A Google Charts Query is constructed through JavaScript, which queries the Python server for the DataTable objects, updating the charts in real time.
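
For illustration, a DataTable can be produced on the server without a wrapper library by emitting Google's JSON layout directly. This is a sketch under assumptions, not the approach zSentinel takes (the slides mention a Google Charts wrapper for Python): the helper names `data_table` and `query_response` and the sample rows are hypothetical, and the response envelope follows the Chart Tools data source protocol as commonly documented (`version`, `reqId`, `status`, `table`).

```python
import json


def data_table(columns, rows):
    """Build a DataTable in Google's JSON layout: cols plus rows of cells."""
    return {
        "cols": [{"id": cid, "label": label, "type": ctype}
                 for cid, label, ctype in columns],
        "rows": [{"c": [{"v": value} for value in row]} for row in rows],
    }


def query_response(table, req_id="0"):
    """Wrap a DataTable in the response a google.visualization.Query reads."""
    payload = {"version": "0.6", "reqId": req_id,
               "status": "ok", "table": table}
    return "google.visualization.Query.setResponse(%s);" % json.dumps(payload)


# Hypothetical sample: one row per server with its CPU utilization.
table = data_table(
    [("name", "Server", "string"), ("cpu", "CPU %", "number")],
    [["zSentinel", 12.5], ["DayTrader", 64.0]],
)
response = query_response(table)
```

The same table structure can feed a table, gauge, or tree map chart, which is why a single server-side format is enough for the whole page.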

Page 20: Marist College Enterprise Computing Research Lab

ParaBond

By Dr. Ron Coleman & Christopher Cordisco

Page 21: Marist College Enterprise Computing Research Lab

Overview

• ParaBond is an application written in Scala.

• Its objective is to test the efficiency of concurrently analyzing financial portfolios stored in a database.

• It runs on a minimum of two servers: one contains the database, and the other runs the algorithm that analyzes the portfolios.

Page 22: Marist College Enterprise Computing Research Lab

Original Setup

• Originally, ParaBond was set up among several typical quad-core desktop computers.

• One computer ran the computation; the others stored the database, which was sharded across them.

• The database used was MongoDB, a non-relational database.

• This setup worked really well, producing efficiency levels over 300%.
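
The slides do not define "efficiency". Assuming the conventional definition of parallel efficiency, speedup divided by the number of workers, a value above 100% means super-linear speedup. The sketch below uses hypothetical timings chosen only to show the arithmetic; they are not ParaBond measurements.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Parallel efficiency: speedup per worker (1.0 means 100%)."""
    return speedup(t_serial, t_parallel) / workers


# Hypothetical timings: 120 s serially, 10 s in parallel on 4 cores.
e = efficiency(120.0, 10.0, workers=4)
label = "{:.0%}".format(e)
```

Under this definition, 300% on four cores corresponds to a twelvefold speedup.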

Page 23: Marist College Enterprise Computing Research Lab

New Setup

• We wanted to discover how this application would run when placed on the zEnterprise.

• A Linux server running on an X blade with 16 cores would run the computation.

• This server would query a second server on the zEnterprise running the database on zLinux.

• The database used here was DB2, since MongoDB would not run reliably on the mainframe.

Page 24: Marist College Enterprise Computing Research Lab

MongoDB to DB2

• In converting this database to DB2, new functions had to be written to build the database and query it.

• The conversion to a relational database required an extra table as an associative entity between bonds and portfolios.
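
The associative table can be illustrated with a small schema. In a document store a portfolio can hold its list of bond IDs directly, whereas a relational schema needs a join table for the many-to-many relationship. This sketch uses SQLite as a stand-in for DB2 so that it runs anywhere; the table names, column names, and sample rows are hypothetical and not taken from ParaBond.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bond (id INTEGER PRIMARY KEY, coupon REAL, maturity INTEGER);
    CREATE TABLE portfolio (id INTEGER PRIMARY KEY);
    -- Associative entity: one row per (portfolio, bond) pairing.
    CREATE TABLE portfolio_bond (
        portfolio_id INTEGER REFERENCES portfolio(id),
        bond_id INTEGER REFERENCES bond(id),
        PRIMARY KEY (portfolio_id, bond_id)
    );
""")
conn.executemany("INSERT INTO bond VALUES (?, ?, ?)",
                 [(1, 0.05, 10), (2, 0.03, 5), (3, 0.04, 7)])
conn.executemany("INSERT INTO portfolio VALUES (?)", [(1,), (2,)])
# Bond 2 belongs to both portfolios, which is why a join table is needed.
conn.executemany("INSERT INTO portfolio_bond VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (2, 3)])


def bonds_in(portfolio_id):
    """Return the IDs of the bonds held by one portfolio."""
    cur = conn.execute(
        "SELECT b.id FROM bond b "
        "JOIN portfolio_bond pb ON pb.bond_id = b.id "
        "WHERE pb.portfolio_id = ? ORDER BY b.id",
        (portfolio_id,),
    )
    return [row[0] for row in cur]
```

Each portfolio lookup now costs a join, which is one plausible place for the relational version to lose time relative to the document store.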

Page 25: Marist College Enterprise Computing Research Lab

Results

• Preliminary tests on the zEnterprise were fast, but we did not achieve efficiency levels above 100%.

• We are still working on this project in an attempt to achieve efficiency levels equal to those of the original setup.

Page 26: Marist College Enterprise Computing Research Lab

Predictive Defect Analytics

Marist/IBM Joint Studies

Page 27: Marist College Enterprise Computing Research Lab

Background

IBM z/OS operating system

• Mature software product

Every potential defect recorded

• Defect record management system

Multiple cycles of extensive testing

• System Test

Valid Defect Distribution

Data Partitioning:

• Training – 65%

• Test – 20%

• Validate – 15%

• Same target distribution as original data
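
A 65/20/15 partition that keeps the target distribution can be sketched as a stratified split. This is an illustration in plain Python, not the SPSS Modeler partitioning the team used; the function name, the `valid` field, and the synthetic records are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, target, fractions=(0.65, 0.20, 0.15), seed=0):
    """Split records into train/test/validate, preserving the target mix."""
    # Group records by target value so each class is split separately.
    by_class = defaultdict(list)
    for r in records:
        by_class[r[target]].append(r)
    rng = random.Random(seed)
    train, test, validate = [], [], []
    for group in by_class.values():
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_test = round(len(group) * fractions[1])
        train += group[:n_train]
        test += group[n_train:n_train + n_test]
        validate += group[n_train + n_test:]  # remainder, about 15%
    return train, test, validate


# Hypothetical data: 70% valid defects, 30% invalid.
records = [{"id": i, "valid": i % 10 < 7} for i in range(1000)]
train, test, validate = stratified_split(records, "valid")
```

Splitting each class separately is what keeps the share of valid defects the same in all three partitions.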

Page 28: Marist College Enterprise Computing Research Lab

Overview

Mission:

• Automatically predict defect validity upon record creation and send notifications of that outcome in real time, leveraging historic empirical test data to drive a smarter test process

Team:

James Gilchrist

Chris Robbins

Andrew Evans

Marist Joint Studies

Michael Gildein

Page 29: Marist College Enterprise Computing Research Lab

Benefits

• Increased test efficiency by reducing invalid and duplicate defects

• Creation of base infrastructure and process for future analytics work

• Visualization reporting to determine future test focus areas

• Automate basic defect analysis reporting including release quality tracking

Page 30: Marist College Enterprise Computing Research Lab

Implementation

IBM Cognos:

Report and analyze data and results

IBM SPSS Modeler: Predictive analysis to determine potential validity

Prediction Accuracies: > 100 different algorithm variations executed

~70% overall accuracy on average

~90% confidence on average for true positive

Academic field research averages ~70% accuracy in defect predictions

Increasing Accuracy: With more training data

With more data points

With voting scheme combining multiple algorithms
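
A voting scheme of the kind listed above can be sketched as a simple majority vote. The slides do not say which scheme was planned, so this is only one possible form; the labels and model outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models for one record."""
    return Counter(predictions).most_common(1)[0][0]


def vote_all(model_outputs):
    """model_outputs: one list of labels per model, aligned by record."""
    return [majority_vote(labels) for labels in zip(*model_outputs)]


# Hypothetical outputs of three models on four defect records.
outputs = [
    ["valid", "invalid", "valid", "valid"],
    ["valid", "valid", "invalid", "valid"],
    ["invalid", "invalid", "valid", "valid"],
]
combined = vote_all(outputs)
```

With an odd number of models and two labels, ties cannot occur, so no tie-breaking rule is needed in this setting.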

Page 31: Marist College Enterprise Computing Research Lab

Data Prep Flow

Merge pulled data

Remove non-related defect records

Remove obviously erroneous records

Derived Target – Valid flag

Removed Fields: Dates

Elapsed times

Manual text fields

Mostly blank

No unique values

> 90% 1 categorical value (+~5% Accuracy Correct)

Fields not applicable

Reclassified remaining fields properly through binning: Blanks & white space

Manually entered categories

Defaults

User error
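
Part of the field-removal step can be sketched as a filter over the columns. This is a minimal illustration, not the team's SPSS stream: the 90% dominance threshold comes from the slide, while the 50% cutoff for "mostly blank", the function name, and the sample records are assumptions.

```python
from collections import Counter


def fields_to_drop(records, blank_limit=0.5, dominant_limit=0.9):
    """Return names of fields that carry little signal for a model."""
    drop = []
    for field in records[0]:
        values = [r[field] for r in records]
        # Treat None and whitespace-only strings as blank.
        filled = [v for v in values if v is not None and str(v).strip()]
        if len(filled) < len(values) * (1 - blank_limit):
            drop.append(field)  # mostly blank
            continue
        top_count = Counter(filled).most_common(1)[0][1]
        if top_count > len(values) * dominant_limit:
            drop.append(field)  # one categorical value dominates
    return drop


# Hypothetical records: "notes" is mostly blank, "os" never varies.
records = [
    {"component": "A" if i % 2 else "B",
     "notes": "" if i < 9 else "see log",
     "os": "z/OS"}
    for i in range(10)
]
dropped = fields_to_drop(records)
```

A field holding a single constant value is caught by the same dominance check, since that value accounts for 100% of the records.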

Page 32: Marist College Enterprise Computing Research Lab

Physical Infrastructure

[Diagram labels: Clients; Windows Server; IBM Data Studio; IBM SPSS Modeler Client; IBM Cognos Framework Manager; IBM Cognos Metric Designer; System z Host; zVM; zLinux; IBM DB2; IBM SPSS Modeler Server; IBM Cognos Server; Scripts & Data Crawlers; Defect Record Management Servers; Additional Input Data Servers.]

Page 33: Marist College Enterprise Computing Research Lab

SPSS Models

Page 34: Marist College Enterprise Computing Research Lab

Cognos Reports

Automated, scheduled, and on-demand real-time reporting

Analysis:

Algorithm Accuracies (release/product)

Confusion Matrices (release/product)

Defect Prediction Densities (release/product)

Management:

Non closed defect tables (release/product/team)

Current defect state graphs (release/product/team)

Project Managers:

Open vs. closed defects over time (release/product/team)

Test end criteria report

Defect rate over time of release (release/product)

Testers:

Hardware trigger vs. root cause (release/product)

Test area vs. root cause (release/product)

Lines of code changed (release/product)

Hardware trigger vs. component (release/product)

Page 35: Marist College Enterprise Computing Research Lab

Cognos Report Examples

Page 36: Marist College Enterprise Computing Research Lab

Cognos Report Examples

Page 37: Marist College Enterprise Computing Research Lab

Questions?