a comparison of library tracking methods in high performance computing computer system cluster and...

20
A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William Rosenberger (New Mexico Tech), Dennis Trujillo (New Mexico State University) Chris DeJager (Michigan Technological University) LA-UR-13-25963

Upload: laureen-farmer

Post on 12-Jan-2016

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

A Comparison of Library Tracking Methods in High Performance

Computing

Computer System Cluster and Networking Summer Institute 2013Poster Seminar

William Rosenberger (New Mexico Tech), Dennis Trujillo (New Mexico State University)

Chris DeJager (Michigan Technological University)

LA-UR-13-25963

Chris DeJager
Slide 2 and 3 seem to be redundent
Dennis Trujillo
I think one LANL logo is sufficient.
Page 2: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

Need for a Library Tracking Utility

• HPC requires the support of many software packages, and multiple versions must be supported at the same timeo Issues

License availability Additional licensing cost Admin support

• No existing methods of tracking user software behavior

• Current user tracking methods don't typically illustrate true use

• Eliminate less utilized software and compiler options

Page 3: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

5/3

0/0

9

6/2

/09

6/5

/09

6/8

/09

6/1

1/0

9

6/1

4/0

9

6/1

7/0

9

6/2

0/0

9

6/2

3/0

9

6/2

6/0

9

6/2

9/0

9

7/2

/09

7/5

/09

7/8

/09

7/1

1/0

9

7/1

4/0

9

0

100

200

300

400

500

600

Simulated New Software Version Acceptance

openmpi-1.5.4openmpi-1.6.5U

se

rs

Page 4: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

5/3

0/0

9

6/2

/09

6/5

/09

6/8

/09

6/1

1/0

9

6/1

4/0

9

6/1

7/0

9

6/2

0/0

9

6/2

3/0

9

6/2

6/0

9

6/2

9/0

9

7/2

/09

7/5

/09

7/8

/09

7/1

1/0

9

7/1

4/0

9

0

100

200

300

400

500

600

Simulated New Software Version Re-jection

gcc-4.4.4

gcc-4.4.7Us

ers

Page 5: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

Potential Tracking Solutions

Automatic Library Tracking Database• Developed at National Institute of Computational

Science for use on Cray systems• Presented at Cray User Group 2010• Wraps the linker 'ld' and job launcher 'aprun'• Stores to MySQL databaseLinux Auditing Utility (auditd)

• System daemon• Tracks specified files for access and or writes• Stores to log files

Page 6: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

Linux Auditing Utility (auditd)

• Provides log information on targeted files or directorieso Object access and useo Stores to log fileso Provides commands for searching log data

• Originally meant for security

• Root access required to add files to be tracked

• Log file is readable by a non-root linux group

Page 7: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

ALTD OperationMySQL Database

Linkline TableTable Data:• Linkline_id: Reference number to this

record• Unique list of libraries used in a

compilation

Link Tag TableTable Data:• Tag_id: Reference number to this record• Reference number to the Linkline entry.• Username, date linked, build machine

Jobs TableTable Data:• Reference number to the Link Tag table• Run machine, date run, username

Wrapper ScriptsLinker Wrapper

• Gets list of libraries used• Creates Linkline entry• Creates Link Tag entry

Job Launch Wrapper• Pulls reference number to the Link Tag

table from the ALTD assembly header• Creates Jobs table entry

Page 8: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

ALTD Alterations

• Support added for wrapping mpirun and mvapich2 job launchers, while still allowing aprun

• Detects version of prefered MPI wrapped compiler and MPI executable and wraps accordingly

• Created script to query the database and collect the number of times a library has been compiled and run

• Made ALTD entries generic across servers

Page 9: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

ALTD

Pros:• Data is well organized into

structured database• Theoretical constant low

overhead• Tracks all libraries

automatically

Cons:• Requires a SQL database• Tracks libraries used at

compile time, not necessarily libraries used at run time

Page 10: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

auditdPros: • Standard Linux daemon• Well tested by the Linux

community• Logs to a flat file

Cons:

• Dynamically linked libraries cannot be reliably tracked, the .so files are cached in memory and not read at every run

• Statically linked libraries can’t be track at runtime as they are in the executable

• A single make can generate numerous log entries

Page 11: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

System Overhead Testing

• Interested in increase in compile and run time of different solutions over a control time

• Linpack was run while varying the problem size and the number of processors working on the problem

• A simple "do nothing" MPI program was also tested to investigate launch time more easily

Page 12: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

4 9 16 25 36 49 64Number of Processors

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

Additional Runtime for a Simple MPI Program

ALTD

auditd

Tim

e (s

)

Page 13: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

2*2 3*3 4*4 5*5 6*6 7*7 8*8

0102030405060708090

100

500

1500

2500

3500

4500

5500

6500

7500

Linpack Runtime (Control)

Processor Grid (P*Q)

Ru

nti

me

(s

)

Pro

ble

m S

ize

(N

)

Page 14: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

2*2 3*3 4*4 5*5 6*6 7*7 8*8

0102030405060708090

100

500

1500

2500

3500

4500

5500

6500

7500

Linpack Runtime with ALTD

Processor Grid (P*Q)

Ru

nti

me

(s

)

Pro

ble

m S

ize

(N

)

Page 15: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

2*2 3*3 4*4 5*5 6*6 7*7 8*8

0102030405060708090

100

500

1500

2500

3500

4500

5500

6500

7500

Linpack Runtime with auditd

Processor Grid (P*Q)

Ru

nti

me

(s

)

Pro

ble

m S

ize

(N

)

Page 16: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

2*2 3*3 4*4 5*5 6*6 7*7 8*8-0.2

0

0.2

0.4

0.6

0.8

1

5001500 2500 3500 4500 5500 6500 7500

Additional Runtime for Linpack withALTD

Processor Grid (P*Q)

Ru

nti

me

Dif

fere

nc

e (

s)

Pro

ble

m S

ize

(N

)

Average: 0.121 (s)

Page 17: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

2*2 3*3 4*4 5*5 6*6 7*7 8*8-0.2

0

0.2

0.4

0.6

0.8

1

5001500 2500 3500 4500 5500 6500 7500

Additional Runtime for Linpack with auditd

Processor Grid (P*Q)

Ru

nti

me

Dif

fere

nc

e (

s)

Pro

ble

m S

ize

(N

)

Average: 0.00304 (s)

Page 18: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

Simple MPI Program Linpack0

0.05

0.1

0.15

0.2

0.25

Additional Compilation Time Required

ALTD

auditd

Tim

e (s

)

Page 19: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

Summary

• ALTDo The Automatic Library Tracking Database has the

potential to track libraries well in HPCo Performance overhead is minimal and seems to be

constant, regardless of job sizeo Additions to the software have provided for simplicity

in utilization and data evaluation.

• auditdo Outperforms ALTD in some cases but falls behind

compiling larger programs o auditd is not sufficient to track all types of libraries

during compilation and runtime

Page 20: A Comparison of Library Tracking Methods in High Performance Computing Computer System Cluster and Networking Summer Institute 2013 Poster Seminar William

We would like to thank the following individuals for their support throughout the workshop:

Georgia PediciniDavid GunterDane GardnerMatt BroomfieldCarol HogsettJosephine OlivasCarol ConnorGary Grider

Thanks!