component infrastructure of cqos and its application in scientific computations

Component Infrastructure of CQoS and Its Application in Scientific Computations
Li Li (1), Boyana Norris (1), Lois Curfman McInnes (1), Kevin Huck (2), Joseph Kenny (3), Meng-Shiou Wu (4)
(1) Argonne National Laboratory, Argonne, IL; (2) University of Oregon; (3) Sandia National Laboratories, California; (4) Ames Laboratory
CCA meeting Jan. 2009




TRANSCRIPT

Page 1: Component Infrastructure of CQoS and Its Application in Scientific Computations

Component Infrastructure of CQoS and Its Application in Scientific Computations

Li Li (1), Boyana Norris (1), Lois Curfman McInnes (1), Kevin Huck (2), Joseph Kenny (3), Meng-Shiou Wu (4)

(1) Argonne National Laboratory, Argonne, IL
(2) University of Oregon
(3) Sandia National Laboratories, California
(4) Ames Laboratory

CCA meeting Jan. 2009

Page 2: Component Infrastructure of CQoS and Its Application in Scientific Computations

Outline

– Motivation
– CQoS introduction
– Database component design
– Application examples
– Ongoing and future work

Page 3: Component Infrastructure of CQoS and Its Application in Scientific Computations

Overall Goals

Automate the configuration and runtime adaptation of high-performance component applications through the so-called Computational Quality of Service (CQoS) infrastructure
– Instrumentation of component interfaces
– Performance data gathering
– Performance analysis
– Adaptive algorithm support

Motivating application examples
– Quantum Chemistry challenges: How, during runtime, can we make the best choices for reliability, accuracy, and performance of interoperable QC components?
  • When several QC components provide the same functionality, what criteria should be employed to select one implementation for a particular application instance and computational environment?
  • How do we incorporate the most appropriate externally developed components? (e.g., which algorithms to employ from numerical optimization components?)

Page 4: Component Infrastructure of CQoS and Its Application in Scientific Computations

Motivating Application Examples (cont.)

Overall simulation times for nonlinear (time-dependent) PDE-based models often depend to a large extent on the robustness and efficiency of sparse linear solvers
– Properties of the linear system change during runtime
– No single method is best because of the complexity of long-running applications

Efficient parallel structured adaptive mesh refinement (SAMR) applications depend on load-balancing algorithms
– Computational resources are dynamically concentrated in areas that need high accuracy
– Application and computer state change at runtime
– Dynamic resource allocation requires that the workload partitioning algorithm be selected at runtime according to state change

Page 5: Component Infrastructure of CQoS and Its Application in Scientific Computations

Outline

– Motivation
– CQoS introduction
– Database component design
– Application examples
– Ongoing and future work

Page 6: Component Infrastructure of CQoS and Its Application in Scientific Computations

[Figure: CQoS architecture diagram; the recoverable labels are listed below]

CQoS Analysis Infrastructure: performance monitoring, problem/solution characterization, and performance model building
– Performance Databases (historical & runtime)
– Interactive Analysis and Model Building
– Substitution Assertion Database
– Instrumented Component Application Cases
– Scientist can analyze data interactively
– Scientist can provide decisions on substitution and reparameterization

CQoS Control Infrastructure: interpretation and execution of control laws to modify an application’s behavior
– Control System (parameter changes and component substitution)
– CQoS-Enabled Component Application: Component A, Component B, Component C
– Component Substitution Set

Page 7: Component Infrastructure of CQoS and Its Application in Scientific Computations

Database Needs for the Scientific Application Adaptation

Performance analysis of candidate solver/algorithm
– Large number of performance runs
– Store, manage, and search performance data

Store and manage hardware, compiler, and application metadata
– Information essential to algorithm selection, e.g., system configurations, problem properties, application states

Optimal algorithm determination
– Input data (or problem features)
– Algorithmic parameters
– Performance models (or hints)

Page 8: Component Infrastructure of CQoS and Its Application in Scientific Computations

Database Needs for Scientific Application Adaptation (cont.)

Database use cases:
– Store historical performance data and application metadata
– Facilitate offline performance analysis
– Match the current application state against historical data through DB queries during runtime
– Search for optimal algorithm w.r.t. current application state
– Retrieve settings associated with the optimal algorithm so we can apply it immediately to the application during runtime
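The "match the current application state against historical data" use case can be illustrated with a small in-memory sketch. This is not the CQoS implementation: `TrialStore`, `matchingTrials`, and `matchingTrialsBetween` are hypothetical names, a parameter set is simplified to a name-to-number map, and a real deployment would issue the range query against the database instead.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a parameter set: parameter name -> numeric value.
typedef std::map<std::string, double> ParameterSet;

// Hypothetical in-memory stand-in for the metadata database.
class TrialStore {
public:
    // Record the parameter set observed for one trial.
    void store(int trialID, const ParameterSet& params) {
        trials_[trialID] = params;
    }

    // IDs of trials whose values lie in [lower, upper] for every parameter
    // named in `lower`. IDs come back in ascending order because std::map
    // iterates in key order.
    std::vector<int> matchingTrialsBetween(const ParameterSet& lower,
                                           const ParameterSet& upper) const {
        std::vector<int> ids;
        for (const auto& trial : trials_) {
            bool ok = true;
            for (const auto& bound : lower) {
                auto hi = upper.find(bound.first);
                auto val = trial.second.find(bound.first);
                if (hi == upper.end() || val == trial.second.end() ||
                    val->second < bound.second || val->second > hi->second) {
                    ok = false;
                    break;
                }
            }
            if (ok) ids.push_back(trial.first);
        }
        return ids;
    }

    // IDs of trials within [center - eps, center + eps]; a parameter with
    // no entry in `eps` must match exactly.
    std::vector<int> matchingTrials(const ParameterSet& center,
                                    const ParameterSet& eps) const {
        ParameterSet lower, upper;
        for (const auto& p : center) {
            auto e = eps.find(p.first);
            double width = (e == eps.end()) ? 0.0 : e->second;
            assert(width >= 0.0);
            lower[p.first] = p.second - width;
            upper[p.first] = p.second + width;
        }
        return matchingTrialsBetween(lower, upper);
    }

private:
    std::map<int, ParameterSet> trials_;
};
```

A driver would store one parameter set per trial and later call `matchingTrials` with the properties of the current state; the returned trial IDs identify the historical runs whose settings are worth retrieving.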

Page 9: Component Infrastructure of CQoS and Its Application in Scientific Computations

Outline

– Motivation
– CQoS introduction
– Database component design
– Application examples
– Ongoing and future work

Page 10: Component Infrastructure of CQoS and Its Application in Scientific Computations

CQoS Database Component Design

Designed C++ and SIDL interfaces for CQoS database management

Implemented prototype database management components
– Description and software: http://wiki.mcs.anl.gov/cqos/index.php/CQoS_database_components_version_0.0.0
– Based on PerfDMF performance data format and PERI metadata formats
– Comparator interface and corresponding component for searching and matching parameter sets

Page 11: Component Infrastructure of CQoS and Its Application in Scientific Computations

CQoS Database Component Design

[Figure: component wiring diagram; the legend distinguishes components from component connections. The AdaptiveHeuristic component is connected to the following components.]
– Perf. Comparator (performance data: compare/match)
– Perf. Database (performance data: query/store)
– Meta-Comparator (metadata: compare/match)
– Meta-Database (metadata: query/store)

Fig. 1. Connect database and comparator components to adaptive heuristics component. There can be multiple database and comparator components that deal with different data types.

Page 12: Component Infrastructure of CQoS and Its Application in Scientific Computations

Use DB interfaces in 2D driven-cavity

/* instantiate parameter 1 */
ierr = ComputeQuantity(matrix, "icmk", "splits", &res, &flg); CHKERRQ(ierr);
MatrixProperty param1("splits", "matrix_meta", res.i);

/* instantiate parameter 2 */
ierr = ComputeQuantity(matrix, "structure", "nnzeros", &res, &flg); CHKERRQ(ierr);
MatrixProperty param2("nnzeros", "matrix_meta", res.i);

/**** Store matrix property set into database. ***/
int myRank;
ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &myRank); CHKERRQ(ierr);
if (myRank == 0) {
    int localID;
    int trialID;
    string conninfo("dbname = perfdb");

    /* Generate a runtime database manager. It connects to a
       PostgreSQL database through DB interfaces. */
    RunTimeRecord *R = RunTimeRecord::instance();
    R->Connect2DB(conninfo);
    trialID = R->getTrialID();
    localID = R->getCurEvtID(cflStr);

    /* instantiate a parameter set */
    PropertySet aSet;

    /* add parameter 1 and 2 into the set */
    aSet.addAParameter(&param1);
    aSet.addAParameter(&param2);

    /* store the parameter set into database */
    R->loadParameterSet(trialID, localID, aSet);
}

Page 13: Component Infrastructure of CQoS and Its Application in Scientific Computations

CQoS Performance and Metadata

Performance (general)
– Historical performance data from different instances of the same application or related applications:
  • Obtained through source instrumentation, e.g., TAU (U. Oregon)
  • Binary instrumentation, e.g., HPCToolkit (Rice U.)

Ideally, for each application execution, the metadata should provide enough information to be able to reproduce a particular application instance. Examples:
– Input data (reduced representations)
  • E.g., molecule characteristics, matrix properties
– Algorithmic parameters
  • E.g., convergence level, maximum number of iterations
– System parameters
  • Compilers, hardware
– Domain-specific
  • Provided by scientist/algorithm developer

Page 14: Component Infrastructure of CQoS and Its Application in Scientific Computations

Outline

– Motivation
– CCA and CQoS introduction
– Database component design
– Application examples
– Ongoing and future work

Page 15: Component Infrastructure of CQoS and Its Application in Scientific Computations

Example: CQoS in Quantum Chemistry

Initial focus: parallel application configuration of QC applications so that these can run effectively on various high-performance machines
– Eliminate guesswork or trial-and-error configuration

Future work: more sophisticated analysis to configure algorithmic parameters for particular molecular targets, calculation approaches, and hardware environments

[1] J. Steensland and J. Ray, "A Partitioner-Centric Model for SAMR Partitioning Trade-Off Optimization: Part I," International Journal of High Performance Computing Applications, 2005, 19(4):409-422.

Page 16: Component Infrastructure of CQoS and Its Application in Scientific Computations

Interactions of the Quantum Chemistry Components With the Database and Comparator CQoS Components

Page 17: Component Infrastructure of CQoS and Its Application in Scientific Computations

CQoS/QC Component Wiring

Page 18: Component Infrastructure of CQoS and Its Application in Scientific Computations

CQoS Component Usage in Quantum Chemistry

CQoS database usage
– Application metadata
  • Molecule characteristics: atom types, topology, moments of inertia
  • Algorithm parameters: tunable parameters, convergence level
– System parameters
  • Compilers
  • Machine info, e.g., number of nodes, threads per node, network
– Historical performance data
  • Execution times, etc.
  • Obtained through source instrumentation, e.g., TAU
  • Can guide configuration of related new simulations

CQoS comparator components
– Compare sets of parameters within the performance database
– Quantum chemistry applications can match the current application state against historical data through database queries during runtime.
– Use metadata to guide parameter selection and application configuration
  • Match molecule similarity, basis set similarity, electronic correlation approach, etc.

Page 19: Component Infrastructure of CQoS and Its Application in Scientific Computations

Ongoing and Future Work (Incomplete List)

Integration of ongoing efforts in
– Performance tools: common interfaces and data representation (leverage PerfExplorer, TAU performance interfaces, PERI tools, and other efforts)

Support training experiment design
– To perform an empirical search for selecting the optimal solver components/parameters

Incorporate more offline performance analysis capabilities (machine learning, statistical analysis, etc.)

Apply to more problem domains, implementing extensions as necessary

Page 20: Component Infrastructure of CQoS and Its Application in Scientific Computations

Acknowledgements to Collaborators

TAU Performance Tools group, University of Oregon
Victor Eijkhout, the University of Texas at Austin
CCA Forum members
Funding:
– Department of Energy (DOE) Mathematical, Information, and Computational Science (MICS) program
– DOE Scientific Discovery through Advanced Computing (SciDAC) program
– National Science Foundation

Page 21: Component Infrastructure of CQoS and Its Application in Scientific Computations

Main Database Component Interfaces

interface DB extends gov.cca.Port {
    bool connect();
    bool disconnect();
    bool isClosed();
    void setConnectionInfo(in string info);
    string getConnectioninfo();
    int executeQuery(in string commd, out Outcome res);

    // store a parameter into DB
    void storeParameter(in int trialID, in int iterNo, in Parameter aParam);

    // store a set of parameters into DB
    void storeParameterSet(in int trialID, in int iterNo, in ParameterSet aParamSet);

    // retrieve parameter value
    void getParameter(in int trialID, in int iterNo, inout Parameter aParam);

    // retrieve parameter set value
    void getParameterSet(in int trialID, in int iterNo, inout ParameterSet aParamSet);

    // retrieve trials from database whose parameter set value is within [lower, upper]
    int getMatchingTrialsBetween(in ParameterSet lower, in ParameterSet upper, out Outcome trialIDs);

    // retrieve trials from database whose parameter set value is within
    // [lower-epsilons, lower+epsilons]
    int getMatchingTrials(in ParameterSet lower, in vector epsilons, out Outcome trialIDs);
}

interface Comparator extends gov.cca.Port {
    /* Comparison operations between parameter sets */
    void setLHS(in ParameterSet lefthand);
    void setRHS(in ParameterSet righthand);
    ParameterSet getLHS();
    ParameterSet getRHS();

    int getDimension();

    Parameter getLHSParameterAt(in string paraName);
    Parameter getRHSParameterAt(in string paraName);

    void setToleranceAt(in string name, in double epsilon);
    double getToleranceAt(in string name);

    void setRelationAt(in string name, in int aRelation);
    int getRelationAt(in string name);

    bool doCompare();
}
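The Comparator port can be read as "compare two parameter sets, parameter by parameter, under a per-parameter tolerance and relation". The C++ sketch below is one possible reading, not the CQoS implementation. The slides do not define the integer relation codes or the default behavior, so the `Relation` values, the exact-equality default, and the use of a name-to-number map for `ParameterSet` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Simplified stand-in for a parameter set: parameter name -> numeric value.
typedef std::map<std::string, double> ParameterSet;

// Hypothetical relation codes; the interface only says the relation is an int.
enum Relation { EQUAL = 0, LESS_EQUAL = 1, GREATER_EQUAL = 2 };

class SimpleComparator {
public:
    void setLHS(const ParameterSet& lhs) { lhs_ = lhs; }
    void setRHS(const ParameterSet& rhs) { rhs_ = rhs; }
    int getDimension() const { return static_cast<int>(lhs_.size()); }

    void setToleranceAt(const std::string& name, double epsilon) {
        assert(epsilon >= 0.0);
        tol_[name] = epsilon;
    }
    void setRelationAt(const std::string& name, int relation) {
        rel_[name] = relation;
    }

    // True when every LHS parameter has an RHS counterpart that satisfies
    // its relation (default EQUAL) within its tolerance (default 0).
    bool doCompare() const {
        for (const auto& p : lhs_) {
            auto r = rhs_.find(p.first);
            if (r == rhs_.end()) return false;
            auto t = tol_.find(p.first);
            double eps = (t == tol_.end()) ? 0.0 : t->second;
            auto q = rel_.find(p.first);
            int rel = (q == rel_.end()) ? EQUAL : q->second;
            double diff = p.second - r->second;  // LHS minus RHS
            bool ok = (rel == LESS_EQUAL)    ? diff <= eps
                    : (rel == GREATER_EQUAL) ? diff >= -eps
                                             : std::fabs(diff) <= eps;
            if (!ok) return false;
        }
        return true;
    }

private:
    ParameterSet lhs_, rhs_;
    std::map<std::string, double> tol_;
    std::map<std::string, int> rel_;
};
```

In this reading, the left-hand side would hold the current matrix properties and the right-hand side a stored set, with a tolerance per property deciding what counts as "similar enough".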

Page 22: Component Infrastructure of CQoS and Its Application in Scientific Computations

Database Component Usage – Example 1: 2D Driven Cavity Flow [1]

Linear solver: GMRES(30), vary only fill level of ILU preconditioner

Adaptive heuristic based on:
– Matrix properties (which change during runtime) computed with Anamod (Eijkhout, http://sourceforge.net/projects/salsa/)

[1] T. S. Coffey, C. T. Kelley, and D. E. Keyes. Pseudo-transient continuation and differential algebraic equations. SIAM J. Sci. Comp., 25:553–569, 2003.

Page 23: Component Infrastructure of CQoS and Its Application in Scientific Computations


How are the Database Components Used?

During runtime, the driver (e.g., linear solver proxy component) evaluates important matrix properties, and matches the properties to historical data in MetaDB through PropertyComparator interfaces.

Linear solver performance data is retrieved and compared given the current matrix properties. This is accomplished by the PerfComparator component.

The linear solver parameters resulting in the best performance, in this case the fill level of the ILU preconditioner, are returned to the driver.

The driver adapts accordingly to continue execution.
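The selection step in this workflow can be sketched as follows. The sketch assumes the metadata match has already produced a set of trial IDs, and that "best performance" means the lowest mean solve time per fill level. Both are illustrative assumptions, and `SolverRun` and `chooseFillLevel` are hypothetical names that do not come from the CQoS code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical performance record for one historical linear solve.
struct SolverRun {
    int trialID;          // trial the run belongs to
    int iluFillLevel;     // ILU fill level used in that run
    double solveSeconds;  // measured solve time
};

// Among runs whose trial matched the current matrix properties, return the
// fill level with the lowest mean solve time. When no run matched, keep the
// level that is currently in use.
int chooseFillLevel(const std::vector<SolverRun>& runs,
                    const std::set<int>& matchedTrials,
                    int currentLevel) {
    std::map<int, std::pair<double, int> > totals;  // level -> (sum, count)
    for (const auto& run : runs) {
        assert(run.solveSeconds >= 0.0);
        if (matchedTrials.count(run.trialID) == 0) continue;
        totals[run.iluFillLevel].first += run.solveSeconds;
        totals[run.iluFillLevel].second += 1;
    }

    int best = currentLevel;
    double bestMean = -1.0;  // negative means "no candidate yet"
    for (const auto& entry : totals) {
        double mean = entry.second.first / entry.second.second;
        if (bestMean < 0.0 || mean < bestMean) {
            bestMean = mean;
            best = entry.first;
        }
    }
    return best;
}
```

Falling back to the current level when nothing matches keeps the driver running with its existing configuration instead of guessing; a production heuristic might weigh other metrics, such as iteration counts or memory use.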