
Page 1

Nick Nystrom · Sr. Director of Research · [email protected] · SC16 · NVIDIA Theater · November 16, 2016

A Converged HPC & Big Data System for Nontraditional and HPC Research

Page 2

2

[Logo] is delivering Bridges

All trademarks, service marks, trade names, trade dress, product names, and logos appearing herein are the property of their respective owners.

Acquisition and operation of Bridges are made possible by the National Science Foundation through award #ACI-1445606 ($17.2M):

Bridges: From Communities and Data to Workflows and Insight

Page 3

3

Gaining Access to Bridges

Bridges is allocated through XSEDE: https://www.xsede.org/allocations

– Startup Allocation
  • Can request anytime; requires only a brief proposal/abstract
  • Up to 50,000 core-hours on RSM and GPU (128GB) nodes, 1,000 TB-hours on LSM (3TB) and ESM (12TB) nodes, and a user-specified (in GB) amount of storage on Pylon
  • Can request XSEDE ECSS (Extended Collaborative Support Service)

– Research Allocation (XRAC)
  • Appropriate for larger requests; can request ECSS
  • Community allocations can support gateways
  • Can be up to millions to tens of millions of SUs
  • Quarterly submission windows: March 15–April 15, June 15–July 15, September 15–October 15, December 15–January 15

– Educational Allocations
  • To support use of Bridges for courses

Up to 10% of Bridges’ SUs are available on a discretionary basis to industrial affiliates, Pennsylvania-based researchers, and others to foster discovery and innovation and broaden participation in data-intensive computing.

Page 4

4

The Shift to Big Data and New Users

Pan-STARRS telescope: http://pan-starrs.ifa.hawaii.edu/public/

Genome sequencers (Wikipedia Commons)

NOAA climate modeling: http://www.ornl.gov/info/ornlreview/v42_3_09/article02.shtml

Collections – Horniman Museum: http://www.horniman.ac.uk/get_involved/blog/bioblitz-insects-reviewed

Legacy documents (Wikipedia Commons)

Environmental sensors: water temperature profiles from tagged hooded seals: http://www.arctic.noaa.gov/report11/biodiv_whales_walrus.html

Library of Congress stacks: https://www.flickr.com/photos/danlem2001/6922113091/

Video (Wikipedia Commons)

Social networks and the Internet

New emphases: from structured, regular, and homogeneous data to unstructured, irregular, and heterogeneous data.

Page 5

5

Motivating Use Cases

– Data-intensive applications & workflows
– Gateways – the power of HPC without the programming
– Shared data collections & related analysis tools
– Cross-domain analytics
– Deep learning
– Graph analytics, machine learning, genome sequence assembly, and other large-memory applications
– Scaling beyond the laptop
– Scaling research to teams and collaborations
– In-memory databases
– Optimization & parameter sweeps
– Distributed & service-oriented architectures
– Data assimilation from large instruments and Internet data
– Leveraging an extensive collection of interoperating software

– Research areas that haven’t used HPC
– New approaches to “traditional HPC” fields (e.g., machine learning)
– Coupling applications in novel ways
– Leveraging large memory, GPUs, and high bandwidth

Page 6

6

Objectives and Approach

• Bring HPC to nontraditional users and research communities.

• Apply HPC effectively to big data.

• Bridge to campuses to streamline access and provide cloud-like burst capability.

• Leveraging PSC’s expertise with shared memory, Bridges will feature 3 tiers of large, coherent shared-memory nodes: 12TB, 3TB, and 128GB.

• Bridges implements a uniquely flexible environment featuring interactivity, gateways, databases, distributed (web) services, high-productivity programming languages and frameworks, virtualization, and campus bridging.

Page 7

7

Interactivity

• Interactivity is the feature most frequently requested by nontraditional HPC communities.

• Interactivity provides immediate feedback for doing exploratory data analytics and testing hypotheses.

• Bridges offers interactivity through a combination of shared and dedicated resources to maximize availability while accommodating different needs.

Page 8

8

High-Productivity Programming

Supporting languages that communities already use is vital for them to apply HPC to their research questions.

Page 9

9

Spark, Hadoop & Related Approaches

Bridges’ large memory is great for Spark!

Bridges enables workflows that integrate Spark/Hadoop, HPC, GPU, and/or large shared-memory components.
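To ground the Spark point, here is a minimal PySpark sketch of the kind of in-memory analytics a large-memory node supports; the dataset path, column name, and driver-memory setting are illustrative assumptions, not taken from Bridges documentation.

```python
# Minimal PySpark sketch: cache a sizable dataset in RAM on a large-memory node
# so iterative analytics avoid repeated disk reads. Paths/columns are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("large-memory-analytics")
         .config("spark.driver.memory", "200g")   # illustrative sizing for a 3TB LSM node
         .getOrCreate())

df = spark.read.csv("/pylon2/projectdir/events.csv", header=True, inferSchema=True)
df.cache()                                # keep the working set memory-resident
df.groupBy("category").count().show()     # "category" is a hypothetical column
spark.stop()
```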

Page 10

10

Purpose-built Intel® Omni-Path Architecture topology for data-intensive HPC:

– 800 HPE Apollo 2000 (128GB) compute nodes
  • 16 RSM nodes, each with 2 NVIDIA Tesla K80 GPUs
  • 32 RSM nodes, each with 2 NVIDIA Tesla P100 GPUs
– 42 HPE ProLiant DL580 (3TB) compute nodes
– 4 HPE Integrity Superdome X (12TB) compute nodes, each with 2 OPA↔IB gateway nodes
– 12 HPE ProLiant DL380 database nodes
– 6 HPE ProLiant DL360 web server nodes
– 20 Storage Building Blocks, implementing the parallel Pylon storage system (10 PB usable)
– 4 MDS nodes, 2 front-end nodes, 2 boot nodes, 8 management nodes
– 20 “leaf” Intel® OPA edge switches
– 6 “core” Intel® OPA edge switches: fully interconnected, 2 links per switch
– Intel® OPA cables

Bridges Virtual Tour: http://staff.psc.edu/nystrom/bvt

Page 11

11

GPU Nodes

Bridges’ GPUs are accelerating both deep learning and simulation codes

Phase 1: 16 nodes, each with:
• 2 × NVIDIA Tesla K80 GPUs (32 total)
• 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz)
• 128GB DDR4-2133 RAM

Phase 2: +32 nodes, each with:
• 2 × NVIDIA Tesla P100 GPUs (64 total)
• 2 × Intel Xeon E5-2683 v4 (16c, 2.1/3.0 GHz)
• 128GB DDR4-2400 RAM

Page 12

12

NVIDIA Tesla K80

• Kepler architecture
• 2496 CUDA cores per GPU (192/SMX)
• 7.08B transistors on 561 mm² die (28 nm)
• 2 × 12 GB GDDR5 (24 GB total); 2 × 240.6 GB/s
• 562 MHz base – 876 MHz boost
• 2.91 Tf/s (64b), 8.73 Tf/s (32b)
• Bridges: 32 K80 GPUs → 279 Tf/s (32b)

Page 13

13

NVIDIA Tesla P100

• Pascal architecture
• 3584 CUDA cores (64/SM)
• 15.3B transistors on 610 mm² die (16 nm)
• 16 GB CoWoS® HBM2 at 720 GB/s with ECC
• 1126 MHz base – 1303 MHz boost
• 4.7 Tf/s (64b), (9.3) Tf/s (32b), (18.7) Tf/s (16b)
• Page migration engine improves unified memory
• Bridges: 64 P100 GPUs → 600 Tf/s (32b)

From http://www.nvidia.com/object/tesla-p100.html. Bridges results forthcoming.
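As a quick check of the aggregate figures quoted on these two slides, multiplying the per-board single-precision peaks by the GPU counts reproduces the totals; this is a back-of-envelope calculation, not a measured result.

```python
# Back-of-envelope check of the aggregate fp32 peaks quoted on these slides.
k80_fp32 = 8.73    # Tf/s per K80 board (boost clocks)
p100_fp32 = 9.3    # Tf/s per P100 (boost clocks)

print(f"32 x K80:  {32 * k80_fp32:.0f} Tf/s")    # ~279 Tf/s
print(f"64 x P100: {64 * p100_fp32:.0f} Tf/s")   # ~595 Tf/s, quoted as ~600 Tf/s
```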

Page 14

14

Applying Deep Learning to Connectomics

Goal: Explore the potential of deep learning to automate segmentation of high-resolution scanning electron microscope (SEM) images of brain tissue and the tracing of neurons through 3D volumes, to automate generation of the connectome, a comprehensive map of neural connections.

Motivation: This project builds substantially on an ongoing collaboration between PSC, Harvard University, and the Allen Brain Institute, through which we have access to high-quality raw and human-annotated data. The SEM data volume for mouse cortex imaging is ~3TB/day, and data processing is currently human-intensive. Forthcoming camera systems will increase data bandwidth by 65×.

Datasets: Zebrafish larva (1024×1024×4900), mouse, …

Collaborators: Ishtar Nyawĩra (Pitt), Iris Qian & Annie Zhang (CMU), John Urbanic, Joel Welling, and Nick Nystrom (PSC/CMU).

Page 15

15

Connectomics

Connectomics is the building of complete mappings of the neural pathways in organisms (connectomes)

Page 16

Can we use CNNs to automate connectomics?

… especially considering how big biomedical image data tends to be?

Page 17

17

Most likely! Build on training data & pixel-wise classification.

Pixel-wise image classification of cancer cells by Andrew Janowczyk

Extensive training data (1.4M points) from ongoing collaboration:
1. Bock, D. D. et al., Network anatomy and in vivo physiology of visual cortical neurons, Nature 471, 177–182 (10 March 2011), doi:10.1038/nature09802.
2. Lee, W.-C. A. et al., Anatomy and function of an excitatory network in the visual cortex, Nature 532, 370–374 (21 April 2016), doi:10.1038/nature17192.

Page 18

Our Methods

Choosing a direction…

• Data Approach

• Neural Network

• Framework

Page 19

19

Choosing a Data Approach

5-Slice Approach
• Use the 4 surrounding slices of a single slice as the channel dimension
• (H × W × C) → (1024 × 1024 × 5)
• Takes into account that surrounding slices provide minute (rather than drastic) variances
• Difficult to use the kind of click-point data that we have

Cube Approach
• Use image volumes → 41 × 41 × 41 cubes
• Provides us with a single neuron path through a small volume of the full fish
• Could help by providing the net with pre-existing paths
• Makes it easier to use the kind of click-point data we have

Something to work around: our annotated data consists of single click locations within each neuron in the 4900 image slices, instead of outlines of each neuron.
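A small NumPy sketch may make the two decompositions concrete; the array shapes follow the slide, but the stand-in volume, function names, and click-point coordinates are hypothetical.

```python
# Illustrative NumPy sketch of the two decompositions; the stand-in volume,
# function names, and click-point coordinates are hypothetical.
import numpy as np

# Stand-in for the SEM stack (the real zebrafish stack is 4900 x 1024 x 1024).
volume = np.zeros((300, 256, 256), dtype=np.uint8)

def five_slice_sample(vol, z):
    """Stack a slice with its 4 neighbors as channels: (H, W, C)."""
    stack = vol[z - 2 : z + 3]            # slices z-2 .. z+2
    return np.moveaxis(stack, 0, -1)      # channels-last for most CNN frameworks

def cube_sample(vol, z, y, x, r=20):
    """Extract a (2r+1)^3 = 41 x 41 x 41 cube centered on an annotated click point."""
    return vol[z - r : z + r + 1, y - r : y + r + 1, x - r : x + r + 1]

print(five_slice_sample(volume, z=100).shape)          # (256, 256, 5)
print(cube_sample(volume, z=100, y=128, x=128).shape)  # (41, 41, 41)
```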

Page 20

20

Spotting the Neurons

Images courtesy Florian Engert, David Hildebrand, and their students at the Center for Brain Science, Harvard

Page 21

21

Approach

• Test two distinct data decompositions
• Select and optimize CNN architectures for the data under study
• Implement in Caffe (Caffe-SegNet) and TensorFlow
• Results planned for GTC 2017
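For illustration, here is a toy encoder-decoder CNN for per-pixel classification written with tf.keras; it sketches the general technique, not the project's actual Caffe-SegNet or TensorFlow model, and the layer sizes are arbitrary.

```python
# Toy encoder-decoder CNN for per-pixel classification, sketched in tf.keras.
# Not the project's actual model; layer sizes are arbitrary.
import tensorflow as tf
from tensorflow.keras import layers

def tiny_segnet(input_shape=(1024, 1024, 5)):      # 5 channels as in the 5-slice approach
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(2)(x)                  # encoder: downsample
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling2D(2)(x)                  # decoder: restore resolution
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # per-pixel probability
    return tf.keras.Model(inputs, outputs)

model = tiny_segnet()
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```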

Page 22

22

Deep Learning Projects Using Bridges (examples)

Florian Metze (CMU): Automatic Building of Speech Recognizers for Non-Experts

Peter Rossky (Rice U.): Development of a Hybrid Computational Approach for Macroscale Simulation of Exciton Diffusion in Polymer Thin Films, Based on Combined Machine Learning, Quantum-Classical Simulations and Master Equation Techniques

Jinbo Xu (Toyota Technological Institute at Chicago): Developing Large-Scale Distributed Deep Learning Methods for Protein Bioinformatics

Deva Ramanan (CMU): Learning to Parse Images and Videos

Eric Xing (CMU): Petuum, a Distributed System for High-Performance Machine Learning

Mark Ohman (Scripps Institution of Oceanography): Quantifying California Current Plankton using Machine Learning

Xinghua Lu (U. of Pittsburgh): Deciphering Cellular Signaling System by Deep Mining a Comprehensive Genomic Compendium

Michael Reale (SUNY Polytechnic Institute): Automatic Pain Assessment

Dokyun Lee (CMU): Education Allocation for the Course Unstructured Data & Big Data: Acquisition to Analysis

Page 23

23

Deep Learning Projects Using Bridges (examples)

Manuela Veloso (CMU): Deep Learning of Game Strategies for RoboCup

Diane Litman (U. of Pittsburgh): Automatic Evaluation of Scientific Writing

Param Singh (CMU): Image Classification Applied in Economic Studies

Matt Fredrikson (CMU): Exploring Stability, Cost, and Performance in Adversarial Deep Learning

Adriana Kovashka (U. of Pittsburgh): Enabling Robust Image Understanding Using Deep Learning

Gil Alterovitz (Harvard Medical School/Boston Children’s Hospital): Preparing Grounds to Launch All-US Students Kaggle Competition on Drug Prediction

Shaun Mahony (Penn State): Deep Learning the Gene Regulatory Code

Michael Lam (Oregon State): Deep Recurrent Models for Fine-Grained Recognition

Page 24

24

Thank You

Questions?

Page 25

25

Additional Content

Page 26

26

For Additional Information

Website: www.psc.edu/bridges

Bridges PI: Nick Nystrom, [email protected]

Co-PIs: Michael J. Levine, Ralph Roskies, J Ray Scott

Project Manager: Robin Scibek

Page 27

27

Data Infrastructure for the National Advanced Cyberinfrastructure Ecosystem

Bridges is a new resource on XSEDE, interoperating with other XSEDE resources, Advanced Cyberinfrastructure (ACI) projects, campuses, and instruments nationwide.

Examples:
– Data Infrastructure Building Blocks (DIBBs)
  ‒ Data Exacell (DXC)
  ‒ Integrating Geospatial Capabilities into HUBzero
  ‒ Building a Scalable Infrastructure for Data-Driven Discovery & Innovation in Education
  ‒ Other DIBBs projects
– Other ACI projects
– Reconstructing brain circuits from high-resolution electron microscopy
– Temple University’s new Science, Education, and Research Center
– Carnegie Mellon University’s Gates Center for Computer Science

(Image labels: high-throughput genome sequencers; social networks and the Internet)

Page 28

28

Campus Bridging

Through a pilot project with Temple University, the Bridges project will explore new ways to transition data and computing seamlessly between campus and XSEDE resources.

Federated identity management will allow users to use their local credentials for single sign-on to remote resources, facilitating data transfers between Bridges and Temple’s local storage systems.

Burst offload will enable cloud-like offloading of jobs from Temple to Bridges and vice versa during periods of unusually heavy load.


http://www.temple.edu/medicine/research/RESEARCH_TUSM/

Page 29

29

Gateways and Tools for Building Them

Gateways provide easy-to-use access to Bridges’ HPC and data resources, allowing users to launch jobs, orchestrate complex workflows, and manage data from their browsers.

– Provide “HPC Software-as-a-Service”
– Extensive use of VMs, databases, and distributed services

Interactive pipeline creation in GenePattern (Broad Institute)

Galaxy: https://galaxyproject.org/

Col*Fusion portal for the systematic accumulation, integration, and utilization of historical data: http://colfusion.exp.sis.pitt.edu/colfusion/

Page 30

30

Virtualization and Containers

• Virtual Machines (VMs) enable flexibility, security, customization, reproducibility, ease of use, and interoperability with other services.

• Users request custom database and web server installations for developing data-intensive, distributed applications, and containers for custom software stacks and portability.

• Bridges leverages OpenStack to provision resources between interactive, batch, Hadoop, and VM uses.

Page 31

31

Database and Web Server Nodes

Dedicated database nodes power persistent relational and NoSQL databases
– Support data management and data-driven workflows
– SSDs for high IOPS; HDDs for high capacity

Dedicated web server nodes
– Enable distributed, service-oriented architectures
– High-bandwidth connections to XSEDE and the Internet


Page 32

32

Data Management

Pylon: a large, central, high-performance storage system
– 10 PB usable
– Visible to all compute nodes
– Currently implemented as two complementary file systems:
  • /pylon1: Lustre, targeted for $SCRATCH use; non-wiped directories available by request
  • /pylon2: SLASH2, targeted for large datasets, community repositories, and distributed clients

Distributed (node-local) storage
– Enhances application portability
– Improves overall system performance
– Improves performance consistency relative to the shared filesystem
– Aggregate 7.2 PB (6 PB on non-GPU RSM nodes: Hadoop, Spark)
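A common pattern that follows from this layout is to stage inputs from the shared Pylon filesystem to node-local storage before an I/O-intensive step; the sketch below is illustrative, and the paths and environment variables should be checked against the Bridges user guide.

```python
# Sketch: stage inputs from shared Pylon scratch to node-local storage before an
# I/O-intensive step. Paths and environment variables are illustrative only.
import os
import shutil

scratch = os.environ.get("SCRATCH", "/pylon1/group/user")   # shared Lustre scratch
local = os.environ.get("LOCAL", "/local/user")               # node-local disk

src = os.path.join(scratch, "inputs", "reads.fastq")
dst = os.path.join(local, "reads.fastq")
os.makedirs(local, exist_ok=True)
shutil.copy(src, dst)   # many small, random reads are usually faster from local disk
print("staged", dst)
```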

Page 33

33

Intel® Omni-Path Architecture (OPA)

Bridges is the first production deployment of Omni-Path.

Omni-Path connects all nodes and the shared filesystem, providing Bridges and its users with:
– 100 Gbps line speed per port; 25 GB/s bidirectional bandwidth per port
– Measured 0.93 μs latency, 12.36 GB/s per direction
– 160M MPI messages per second
– 48-port edge switches that reduce interconnect complexity and cost
– HPC performance, reliability, and QoS
– OFA-compliant applications supported without modification
– Early access to this new, important, forward-looking technology

Bridges deploys OPA in a two-tier island (leaf-spine) topology developed by PSC for cost-effective, data-intensive HPC.
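Latency and bandwidth figures like those above are typically measured with a ping-pong micro-benchmark; the mpi4py sketch below shows the general idea and is not PSC's measurement code.

```python
# Illustrative mpi4py ping-pong micro-benchmark for point-to-point latency and
# bandwidth (run with 2 ranks, e.g. mpirun -n 2); not PSC's measurement code.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
buf = np.zeros(1 << 20, dtype=np.uint8)   # 1 MiB message
iters = 100

comm.Barrier()
t0 = MPI.Wtime()
for _ in range(iters):
    if rank == 0:
        comm.Send(buf, dest=1)
        comm.Recv(buf, source=1)
    elif rank == 1:
        comm.Recv(buf, source=0)
        comm.Send(buf, dest=0)
t1 = MPI.Wtime()

if rank == 0:
    rtt = (t1 - t0) / iters
    print(f"round trip: {rtt * 1e6:.2f} us, "
          f"bandwidth: {2 * buf.nbytes / rtt / 1e9:.2f} GB/s")
```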

Page 34

34

Omni-Path Architecture User Group (OPUG)

• Opportunity for OPA early adopters to share experiences
• Led by PSC
  – Bridges is the first large-scale deployment of Omni-Path
  – Initial conversations on advanced interconnect technology date back to 2007
• Sign up at http://www.psc.edu/index.php/opug
• PSC OPUG contacts
  – Dave Moses, [email protected]
  – J. Ray Scott, [email protected]
  – Nick Nystrom, [email protected]

OPUG meeting at SC16: November 17, 1:30–3:00pm, Room 155-A
http://sc16.supercomputing.org/presentation/?id=bof160&sess=sess354

Page 35

35

Bridges: Node Types

| Type | RAM^a | Ph | n | CPU / GPU / other | Server |
|---|---|---|---|---|---|
| ESM | 12 TB | 1 | 2 | 16 × Intel Xeon E7-8880 v3 (18c, 2.3/3.1 GHz, 45MB LLC) | HPE Integrity Superdome X |
| ESM | 12 TB | 2 | 2 | 16 × Intel Xeon E7-8880 v4 (22c, 2.2/3.3 GHz, 55MB LLC) | HPE Integrity Superdome X |
| LSM | 3 TB | 1 | 8 | 4 × Intel Xeon E7-8860 v3 (16c, 2.2/3.2 GHz, 40MB LLC) | HPE ProLiant DL580 |
| LSM | 3 TB | 2 | 34 | 4 × Intel Xeon E7-8870 v4 (20c, 2.1/3.0 GHz, 50MB LLC) | HPE ProLiant DL580 |
| RSM | 128 GB | 1 | 752 | 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz, 35MB LLC) | HPE Apollo 2000 |
| RSM-GPU | 128 GB | 1 | 16 | 2 × Intel Xeon E5-2695 v3 + 2 × NVIDIA Tesla K80 | HPE Apollo 2000 |
| RSM-GPU | 128 GB | 2 | 32 | 2 × Intel Xeon E5-2683 v4 (16c, 2.1/3.0 GHz, 40MB LLC) + 2 × NVIDIA Tesla P100 | HPE Apollo 2000 |
| DB-s | 128 GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 + SSD | HPE ProLiant DL360 |
| DB-h | 128 GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 + HDDs | HPE ProLiant DL380 |
| Web | 128 GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360 |
| Other^b | 128 GB | 1 | 16 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360, DL380 |
| Gateway | 128 GB | 1 | 4 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL380 |
| Gateway | 128 GB | 2 | 4 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL380 |
| Storage | 128 GB | 1 | 5 | 2 × Intel Xeon E5-2680 v3 (12c, 2.5/3.3 GHz, 30MB LLC) | Supermicro X10DRi |
| Storage | 128 GB | 2 | 15 | 2 × Intel Xeon E5-2680 v4 (14c, 2.4/3.3 GHz, 35MB LLC) | Supermicro X10DRi |
| Total | 282 TB | | 908 | | |

a. RAM in Bridges’ Intel Xeon v3 nodes is DDR4-2133, and RAM in Bridges’ Intel Xeon v4 nodes is DDR4-2400.
b. Other nodes = front end (2) + management/log (8) + boot (4) + MDS (4)
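The per-row node counts (both phases) sum to the stated total of 908; a one-line consistency check:

```python
# Node counts by row (phases combined) sum to the 908 total shown in the table.
counts = [2, 2, 8, 34, 752, 16, 32, 6, 6, 6, 16, 4, 4, 5, 15]
print(sum(counts))  # 908
```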

Page 36

36

Bridges: System Capacities

Compute nodes

| | RSM (Ph. 1) | RSM GPU (Ph. 1) | LSM (Ph. 1) | ESM (Ph. 1) | Ph. 1 total | RSM GPU (Ph. 2) | LSM (Ph. 2) | ESM (Ph. 2) | Ph. 2 total | Total† |
|---|---|---|---|---|---|---|---|---|---|---|
| Peak fp64 (Tf/s) | 774.9 | 80.5 | 18.0 | 21.2 | 894.6 | 333.3 | 91.4 | 24.8 | 449.5 | 1344 |
| RAM (TB) | 94 | 2 | 24 | 24 | 144 | 4 | 102 | 24 | 130 | 274 |
| HDD (TB) | 6016 | 128 | 128 | 128 | 6400 | 256 | 544 | 128 | 928 | 7328 |

Phase 2 = technical upgrade.

Storage

| | Phase 1 | Phase 2 | Total |
|---|---|---|---|
| Pylon (PB, raw) | 3.5 | 10.5 | 14 |
| Pylon (PB, usable) | 2.5 | 7.5 | 10 |
| Local† (PB, raw) | 6.4 | 0.9 | 7.3 |

† Excludes utility nodes (database, web server, front-end, management, storage, etc.)

Page 37

37

Connectionist Temporal Classification
Florian Metze, Hari Parthasarathi, Yajie Miao, and Dong Yu, Carnegie Mellon University

A revived approach to sequence classification that works well with speech at the phoneme, syllable, character, or word level.

Bridges is being used to train the Eesen Toolkit, part of the Speech Recognition Virtual Kitchen, a web-based resource that allows users to check out VMs with complete deep neural network (DNN) environments.

Page 38

38

Causal Discovery Portal
Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence

[Architecture diagram] A browser-based UI, reached over the Internet, lets users prepare and upload data, run causal discovery algorithms, and visualize results. A web node (Apache Tomcat in a VM) handles authentication, data management, and provenance, and communicates via messaging with a database node (VM, other DBs). Analytics (FGS, IMaGES, and other algorithms building on TETRAD) execute on LSM (3TB) and ESM (12TB) nodes over Omni-Path, operating on memory-resident datasets (e.g., TCGA, fMRI) staged from the Pylon filesystem.

Page 39

39

De Novo Assembly of the Sumatran Rhino Genome
James Denvir and Swanthana Rekulapa, Marshall University

First assembled the 1-gigabase Narcissus flycatcher (Ficedula narcissina) genome
– On the users’ local resources, they could only assemble ⅓ of the data due to memory limitations, taking 16 hours
– Assembly of all the data on a 3TB node of Bridges required 1.5 TB of memory and only 6.6 hours, at least 3–4× faster than anticipated based on execution times elsewhere

Then assembled the 3-gigabase Sumatran rhino (Dicerorhinus sumatrensis) genome
– Required 1.9 TB of memory and completed in only 11 hours, again 3–4× faster than anticipated

Sumatran Rhinoceroses at the Cincinnati Zoo & Botanical Garden, Charles W. Hardin, CC BY 2.0. https://upload.wikimedia.org/wikipedia/commons/3/33/Sumatran_Rhino_2.jpg

Narcissus Flycatcher (Ficedula narcissina) in Osaka, Japan, Kuribo, CC BY-SA 2.0. https://commons.wikimedia.org/wiki/File:Narcissus_Flycatcher-cropped.jpg

Page 40

40

Improved SNP Detection in Metagenomic Populations
Wenxuan Zhong, Xin Xing, and Ping Ma, Univ. of Georgia

Assembled 378 gigabase pairs (Gbp) of gut microbial DNA from normal and diabetic patients
– Massive metagenome assembly took only 16 hours using an MPI-based metagenome assembler, Ray, on 20 Bridges RM nodes connected by Omni-Path

Identified 2,480 species-level clusters in the assembled sequence data
– Ran MetaGen clustering software across 10 RM nodes, clustering 500,000 contiguous sequences in only 14 hours
– Currently testing a new likelihood-based statistical method for fast and accurate SNP detection from these metagenomic sequencing data

→ The team is now using Bridges to test a new statistical method on the sequence data to identify critical differences in gut microbes associated with diabetes.

Environmental Shotgun Sequencing (ESS). (A) Sampling from habitat; (B) filtering particles, typically by size; (C) DNA extraction and lysis; (D) cloning and library; (E) sequence the clones; (F) sequence assembly. By John C. Wooley, Adam Godzik, Iddo Friedberg, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000667, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=17664682

Page 41

41

Characterizing Diverse Microbial Ecosystems from Terabase-Scale Metagenomic Data
Brian Couger, Oklahoma State University

Assembled 11 metagenomes sampled from diverse sources, comprising over 3 trillion bases of sequence data
– Including a recent massive assembly of 1.6 Tbp of metagenomic data from an oil sands tailings pond, a bioremediation target
– Excellent performance of the MPI-based Ray assembler on 90 RM nodes completed the assembly in only 4.25 days
– Analysis of the assembled data is in progress to characterize organisms present in these diverse environments and identify new microbial phyla

Oil sands tailings pond. By NASA Earth Observatory, http://earthobservatory.nasa.gov/IOTD/view.php?id=40997, Public Domain, https://commons.wikimedia.org/w/index.php?curid=8346449

Page 42

42

Investigating Economic Impacts of Images and Natural Language in E-commerce
Dokyun Lee, CMU Tepper School of Business

• Security and uncertain quality create challenges for sharing economies
  – Lee et al. studied the impact of high-quality, verified photos for Airbnb hosts
  – 17,000 properties over 4 months
  – Used Bridges’ GPU nodes

• Difference-in-Difference (DD) analysis (a generic regression sketch follows below) showed that, on average, rooms with verified photos are booked 9% more often

• Separating the effects of photo verification from photo quality and room reviews indicates that high photo quality results in $2,455 of additional yearly earnings

• They found asymmetric spillover effects: at the neighborhood level, there appears to be higher overall demand if more rooms have verified photos
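The DD comparison can be made concrete with a generic two-period regression; this is a hypothetical illustration with made-up data and column names, not the authors' actual specification.

```python
# Generic difference-in-difference regression sketch; data and columns are made up.
# The treatment effect is the coefficient on the treated:post interaction term.
import pandas as pd
import statsmodels.formula.api as smf

# One row per property per period: booking-rate outcome, treated flag
# (received verified photos), and post-treatment period flag.
listings = pd.DataFrame({
    "booking_rate": [0.30, 0.32, 0.31, 0.40, 0.29, 0.30, 0.28, 0.31],
    "treated":      [1, 1, 1, 1, 0, 0, 0, 0],
    "post":         [0, 0, 1, 1, 0, 0, 1, 1],
})

model = smf.ols("booking_rate ~ treated * post", data=listings).fit()
print(model.params["treated:post"])   # DD estimate of the verified-photo effect
```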

Page 43

43

Thermal Hydraulics of Next-Generation Gas-Cooled Nuclear Reactors
PI Mark Kimber, Texas A&M University

Completed 3 Large Eddy Simulations of turbulent thermal mixing of helium coolant in the lower plenum of the Gen IV High-Temperature Gas-cooled Reactor (HTGR)
– Performed using OpenFOAM, an open-source CFD code
– Simulations are for scaled-down versions of representative sections of the HTGR lower plenum, each containing 7 support posts and 6 high-Reynolds-number jets in a cross-flow
– Each simulation involves a high-quality block-structured mesh with 16+ million cells and was run on 20 regular Bridges nodes (560 cores) for ~10 million time steps (~1 second of simulated time)
– Helps to understand the turbulent thermal mixing in great detail, as well as temperature distributions and hotspots on support posts

Anirban Jana (PSC) is co-PI and computational lead for this DOE project.

Distribution of vorticity in a representative “unit cell” of the HTGR lower plenum containing 7 support posts and 6 jets in a crossflow (right to left). Turbulent mixing of core coolant jets in the lower plenum can cause temperature fluctuations and thermal fatigue of the support posts.

Acknowledgements:
• DOE-NEUP (Grant no. DE-NE0008414)
• NSF-XSEDE (Grant no. CTS160002)

Page 44

44

Simulations of Nanodisc Systems
Jifey Qi and Wonpil Im, Lehigh University

A nanodisc system showing two helical proteins wrapping around a discoidal lipid bilayer.

Using NAMD on Bridges’ NVIDIA P100 GPU nodes to simulate a number of nanodisc systems to be used as examples for the Nanodisc Builder module in CHARMM-GUI
– For a system of 263,888 atoms, the simulation speed on one P100 node (6.25 ns/day) is faster than on three dual-CPU nodes (4.09 ns/day)
– CHARMM-GUI provides a web-based graphical user interface to generate various molecular simulation systems and input files (for CHARMM, NAMD, GROMACS, AMBER, GENESIS, LAMMPS, Desmond, OpenMM, and CHARMM/OpenMM) to facilitate and standardize the use of common and advanced simulation techniques

Page 45

45

Mechanisms of Chromosomal Rearrangements
Jose Ranz and Edwin Solares, UC Irvine

Assembled the Drosophila willistoni genome four ways with various assemblers
– A mix of large-memory and regular-memory nodes on Bridges enables testing various assemblers and sequencing technologies to obtain the highest-quality genome
– High-quality assemblies are critical to understanding chromosomal rearrangements, which are pivotal in natural adaptation
– Bridges’ high-core, high-memory nodes saved the researchers at least a month compared to other resources

Assembled the Anopheles arabiensis genome and transcriptomes
– Again, using multiple sequencing data types and assembly software
– Bridges’ flexibility is key to supporting these diverse needs

Drosophila willistoni Adult Male.By Mrs. Sarah L. Martin - Plate XVI, J.T. Patterson and G.B. Mainland. The Drosophilidae of Mexico. University of Texas Publications, 4445:9-101, 1944., Public Domain, https://commons.wikimedia.org/w/index.php?curid=30607942

Anopheles gambiae mosquitoBy James D. Gathany - The Public Health Image Library , ID#444, Public Domain, https://commons.wikimedia.org/w/index.php?curid=198377

Page 46

46

Integration of Membrane Proteins into the Cell Membrane
Michiel Niesen and Thomas Miller, Caltech

Calculated the integration efficiencies of protein sequences by means of their own coarse-grained (CG) model, running ensembles of ~400–1200 simulations in parallel

As of 9/13/16, they have calculated membrane protein integration efficiencies for 16 proteins on Bridges

The calculated integration efficiencies correlate well with experimentally observed expression levels for a series of three homologs of the membrane protein TatC

Experimentally observed expression levels for a series of three homologs of the membrane protein TatC (left) correlate well with the calculated integration efficiency (right).

Schematic representation of integration. Ribosome (violet), nascent loops (cyan), nascent transmembrane domains (red), lateral gate (green), lipids (light gray).