data science stuff

Upload: gt0084e1

Post on 03-Jun-2018


  • 8/12/2019 Data Science Stuff

    1/18

    Visualization of Very Large Scientific Data

    David Pugmire
    Scientific Data Group
    Oak Ridge National Laboratory
    6 March 2014

  • 8/12/2019 Data Science Stuff

    2/18

    Data Driven Science and Scientific Visualization

    Volume: increasing mesh resolutions, increasing temporal resolution
    Velocity: increasing temporal resolution, frequency of data
    Variety: multi-variate, ensembles, increasing complexity
    Veracity: uncertainty, errors, approximations
    Value: visualization and analysis, feature detection, scientific insight

  • 8/12/2019 Data Science Stuff

    3/18

    HPC Visualization Tools of Today

    Analysis cluster or supercomputer: Server (I/O, Analysis, Visualization)
    Client: GUI, Viewer, API

  • 8/12/2019 Data Science Stuff

    4/18

    Scalability of Visualization Tools

    Research Questions:
        Can current visualization tools survive at the exascale?
        What are the bottlenecks at the largest scales?
        What difference does architecture make?

    Methodology:
        Create "exascale" data: trillions of zones
        Run a simple workflow:
            Read data
            Volume render / contour data
            Render and composite

    Figure: Core-collapse supernova simulation. Data courtesy of T. Mezzacappa (GenASiS).

  • 8/12/2019 Data Science Stuff

    5/18

    Scalability of Visualization Tools

  • 8/12/2019 Data Science Stuff

    6/18

    Challenges at Exascale:

    100-200

    I/O Caveats:

    System      System Peak   I/O Peak      I/O Reality   I/O Hero
    JaguarPF    2 PF          200 GB/s      1 GB/s        60 GB/s
    Titan       20 PF         1.2 TB/s      1 GB/s        120 GB/s
    Future      1000 PF       10 TB/s (?)   ??            ??
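    The gap between peak and realistic I/O rates can be made concrete with a rough read-time estimate. The sketch below assumes a hypothetical dataset of one trillion zones with a single 8-byte value per zone (the slides only say "trillions of zones"), takes 1 GB as 10^9 bytes, and uses the Titan bandwidths from the table above. The helper `read_time_seconds` is illustrative and not part of any tool mentioned in this talk.

```python
GB = 1e9   # assumption: decimal gigabyte
TB = 1e12  # assumption: decimal terabyte


def read_time_seconds(num_zones, bytes_per_zone, bandwidth_bytes_per_s):
    """Time to stream the whole dataset once at a sustained bandwidth."""
    return num_zones * bytes_per_zone / bandwidth_bytes_per_s


# Titan bandwidths from the I/O table, converted to bytes per second.
TITAN_IO = {"peak": 1.2 * TB, "reality": 1 * GB, "hero": 120 * GB}

if __name__ == "__main__":
    zones = 10**12        # assumed dataset size: one trillion zones
    bytes_per_zone = 8    # assumed: one double-precision value per zone
    for label, bandwidth in TITAN_IO.items():
        seconds = read_time_seconds(zones, bytes_per_zone, bandwidth)
        print(f"Titan {label:8s}: {seconds:10.1f} s")
```

    Under these assumptions the 8 TB dataset takes 8000 s (over two hours) to read at the 1 GB/s "reality" rate, against roughly 67 s at the 120 GB/s "hero" rate.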

  • 8/12/2019 Data Science Stuff

    7/18

    Visualization at the Exascale

    Target approaching hardware/software ecosystems
    EAVL: Extreme-scale Analysis and Visualization Library

    Research Areas           Volume  Velocity  Variety  Veracity  Value
    Data Model               X X X X
    Heterogeneous Computing  X X X
    In situ / In transit     X X X

    And, make it all accessible for developers

  • 8/12/2019 Data Science Stuff

    8/18

    EAVL Research Goals: Data Model

    De-facto standards like VTK have a limited data model

                               Point Arrangement (Coordinates)
    Cells                      Explicit            Logical            Implicit
    Structured,   Strided      Structured Grid
    Structured,   Separated                        Rectilinear Grid   Image Data
    Unstructured, Strided      Unstructured Grid
    Unstructured, Separated

  • 8/12/2019 Data Science Stuff

    9/18

    Arbitrary Composition for Both Efficiency and Flexibility

    EAVL allows full flexibility in representation

    [Table: Cells (Structured or Unstructured, each Strided or Separated) versus Point Arrangement / Coordinates (Explicit, Logical, Implicit)]

  • 8/12/2019 Data Science Stuff

    10/18

    Data Model Gaps Addressed in EAVL

    Hybrid mesh types
    1D/2D/3D/.... coordinate systems
    Higher dimensional data
    Non-physical data, e.g. graphs
    Face and edge data
    Multiple groups of cells in one mesh, e.g. subsets, external faces
    Mixed topology meshes, e.g. molecules, embedded surfaces

    Figures: 9D mesh used by GenASiS; 2nd order quadtree from MADNESS; mixed topology molecule mesh; graph mesh

  • 8/12/2019 Data Science Stuff

    11/18

    Example: Memory and Algorithmic Efficiency

    Threshold regular grid: 35 < pressure < 45

    Traditional Data Model: explicit points, explicit cells (fully unstructured grid)
    EAVL Data Model: implicit points, explicit cells (hybrid implicit/explicit grid)
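    The difference between the two outputs can be sketched by counting what each one has to store. The code below is a minimal illustration in plain Python, not EAVL or VTK code: every function name is hypothetical, pressure is assumed to be one value per cell, and "stored values" simply counts numbers held in memory. The traditional path copies out explicit point coordinates for every surviving cell, while the hybrid path keeps the grid's dimensions, origin, and spacing and stores only the cell connectivity.

```python
# Corner offsets of a hexahedral cell (assumed ordering, for illustration only).
HEX_CORNERS = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
               (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]


def kept_cell_connectivity(dims, pressure, lo, hi):
    """Regular-grid point ids of each cell whose pressure lies in (lo, hi)."""
    nx, ny, _ = dims           # number of points along each axis
    cx, cy = nx - 1, ny - 1    # number of cells along x and y
    cells = []
    for c, p in enumerate(pressure):
        if lo < p < hi:
            ci, cj, ck = c % cx, (c // cx) % cy, c // (cx * cy)
            cells.append([(ci + di) + nx * ((cj + dj) + ny * (ck + dk))
                          for di, dj, dk in HEX_CORNERS])
    return cells


def threshold_traditional(dims, origin, spacing, pressure, lo, hi):
    """Fully unstructured output: explicit points and explicit cells."""
    nx, ny, _ = dims
    points, remap, cells = [], {}, []
    for conn in kept_cell_connectivity(dims, pressure, lo, hi):
        new_conn = []
        for pid in conn:
            if pid not in remap:
                ijk = (pid % nx, (pid // nx) % ny, pid // (nx * ny))
                remap[pid] = len(points)
                points.append(tuple(o + n * s
                                    for o, n, s in zip(origin, ijk, spacing)))
            new_conn.append(remap[pid])
        cells.append(new_conn)
    return points, cells


def threshold_hybrid(dims, origin, spacing, pressure, lo, hi):
    """Hybrid output: points stay implicit, only the cells are explicit."""
    implicit_points = (dims, origin, spacing)
    return implicit_points, kept_cell_connectivity(dims, pressure, lo, hi)


def stored_values_traditional(points, cells):
    return 3 * len(points) + sum(len(c) for c in cells)


def stored_values_hybrid(implicit_points, cells):
    return sum(len(part) for part in implicit_points) + sum(len(c) for c in cells)


if __name__ == "__main__":
    dims, origin, spacing = (3, 3, 2), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
    pressure = [30.0, 40.0, 42.0, 50.0]   # one value per cell (2 x 2 x 1 cells)
    pts, cells = threshold_traditional(dims, origin, spacing, pressure, 35, 45)
    imp, hcells = threshold_hybrid(dims, origin, spacing, pressure, 35, 45)
    print(stored_values_traditional(pts, cells), stored_values_hybrid(imp, hcells))
```

    The ratio between the two counts depends on the grid size and on how many cells survive the threshold, so this toy case is not expected to reproduce the figures measured for EAVL.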

  • 8/12/2019 Data Science Stuff

    12/18

    Example: Memory and Algorithmic Efficiency

    EAVL: 7x reduction in memory usage
    EAVL: 4-5x performance improvement

  • 8/12/2019 Data Science Stuff

    13/18

    EAVL Research Goals: Heterogeneous Computing

    Implementations for CPU, GPU, and Phi

    Surface Normal Calculation
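    A surface normal calculation suits several back-ends because each cell is processed independently. The sketch below illustrates that pattern in plain Python, assuming a triangle mesh: a per-cell functor (`triangle_normal`) is applied by a generic map (`map_cells`). Both names are hypothetical and this is not EAVL's API. On a GPU or a many-core device the same functor would be run in parallel rather than in a list comprehension.

```python
import math


def triangle_normal(p0, p1, p2):
    """Unit normal of one triangle; depends only on that triangle's points."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # Cross product of the two edge vectors.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # degenerate triangle: no defined normal
    return (nx / length, ny / length, nz / length)


def map_cells(functor, points, triangles):
    """Apply a per-cell functor to every cell; iterations are independent."""
    return [functor(*(points[i] for i in tri)) for tri in triangles]


if __name__ == "__main__":
    points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
    triangles = [(0, 1, 2), (0, 2, 3)]
    print(map_cells(triangle_normal, points, triangles))
```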

  • 8/12/2019 Data Science Stuff

    14/18

    EAVL Research Goals: Usability

    Minimal footprint
    No dependencies
    Header file only implementation
    1D, 2D, and 3D rendering with annotations
    Optional MPI, CUDA, OpenMP support
    Optional file readers
    EAVLab: lightweight tool for rapid prototyping and experimentation

  • 8/12/2019 Data Science Stuff

    15/18

    EAVL Research Goals: Tightly-coupled In Situ

    Zero-copy host and device
    Parallel rendering infrastructure
    Examples: LULESH (Hydrodynamics), Xolotl (Fusion)

  • 8/12/2019 Data Science Stuff

    16/18

    EAVL Research Goals: Loosely-coupled In Situ

    ADIOS Staging and XGC Fusion code
    Exploits network hardware support for fast data transfer to remote memory
    Application writes using ADIOS API
    Viz app reads using ADIOS API

    Diagram labels: XGC application, Staging, Viz
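    The loosely-coupled arrangement (application writes, a staging area buffers, a separate visualization process reads) can be sketched as a producer and consumer around a bounded buffer. The code below is only an analogy built on Python's standard `queue` and `threading` modules. It does not use or imitate the ADIOS API, and the names `simulation`, `viz`, and `run_pipeline`, along with the toy data, are made up for illustration.

```python
import queue
import threading


def simulation(staging, num_steps):
    """Stand-in for the application: writes one field per time step."""
    for step in range(num_steps):
        field = [float(step + i) for i in range(4)]  # toy data
        staging.put((step, field))   # blocks if the staging buffer is full
    staging.put(None)                # sentinel: no more time steps


def viz(staging, results):
    """Stand-in for the viz app: reads each step and reduces it."""
    while True:
        item = staging.get()
        if item is None:
            break
        step, field = item
        results.append((step, min(field), max(field)))


def run_pipeline(num_steps=3, staging_slots=2):
    staging = queue.Queue(maxsize=staging_slots)  # bounded staging area
    results = []
    consumer = threading.Thread(target=viz, args=(staging, results))
    consumer.start()
    simulation(staging, num_steps)
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline())
```

    The bounded queue stands in for the staging resource: the writer only stalls when the reader falls behind by more than the available slots, so the two sides are otherwise decoupled.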

  • 8/12/2019 Data Science Stuff

    17/18

    EAVL Roadmap

    Continued algorithm research and development
        Data parallel algorithms are very different
        Autonomic algorithms
        Techniques for handling uncertainty
    Continued efforts in loosely and tightly coupled in situ
    Deployment as services into data streaming frameworks
    Deployment path into HPC vis tools (e.g., VisIt and ParaView)

  • 8/12/2019 Data Science Stuff

    18/18

    Thank you for your attention