mse presentation 3 by lakshmikanth ganti under the guidance of dr. virgil wallentine major...

Post on 19-Jan-2018

MSE Presentation 3

By Lakshmikanth Ganti

Under the Guidance of

Dr. Virgil Wallentine – Major Professor
Dr. Paul Smith – Committee Member
Dr. Mitch Neilsen – Committee Member

Introduction

Overview
Revised Artifacts
Component Design
Assessment Evaluation
Project Evaluation
User Manual
Conclusion

Overview

Goals:

To develop a parallel program for the simulation of a group of molecules using Molecular Dynamics Simulation.

To implement various parallel algorithms and compare their performance.

To produce good documentation of the design and the overall system.

Revised Artifacts Architecture Design

Revised with design descriptions for each of the parallel programming paradigms used.

Revised Artifacts Object Model

Component Design Classes

Atom Barrier ObjBuf EnergyWriter ParThread MdPar MdConstants

Component Design Classes

LineReader IO_Utils Semaphore BinarySemaphore CountingSemaphore

Assessment Evaluation Feature Testing

Read data from files
Read program arguments
Format values for output

Assessment Evaluation Functional Testing

The program was executed with different numbers of threads. Velocities were read from a file on each run instead of being generated from a random Gaussian distribution.
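A minimal sketch of how fixed velocities might be produced once and then reused across runs. It assumes Java and a one-value-per-line text file; the class name `VelocityFixture`, the file layout and the seed are all hypothetical and are not taken from the project.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper: not the project's LineReader or IO_Utils.
class VelocityFixture {
    // Draw `count` velocity components from a standard Gaussian distribution.
    static double[] gaussianVelocities(int count, long seed) {
        Random rng = new Random(seed);
        double[] v = new double[count];
        for (int i = 0; i < count; i++) {
            v[i] = rng.nextGaussian();
        }
        return v;
    }

    // Write one value per line (assumed format).
    static void write(Path file, double[] v) throws IOException {
        List<String> lines = new ArrayList<>();
        for (double x : v) {
            lines.add(Double.toString(x));
        }
        Files.write(file, lines);
    }

    // Read the values back so every run starts from the same velocities.
    static double[] read(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file);
        double[] v = new double[lines.size()];
        for (int i = 0; i < v.length; i++) {
            v[i] = Double.parseDouble(lines.get(i));
        }
        return v;
    }
}
```

Reading the same file for every thread count keeps the initial state identical between runs, so outputs can be compared directly.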

Assessment Evaluation Performance Evaluation

Initial Design

3-D grid shaped pattern of thread creation

Message passing by bounded buffers

Number of threads is 512; each thread is assigned one partition

No speedup achieved
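Message passing through a bounded buffer can be sketched as below. This is an illustration only: it assumes Java and uses `java.util.concurrent.Semaphore` as a stand-in for the project's own semaphore classes, and `BoundedBuffer` is a hypothetical name (the slides list an `ObjBuf` class but do not show it).

```java
import java.util.concurrent.Semaphore;

// Hypothetical bounded buffer; the project's ObjBuf may be implemented differently.
class BoundedBuffer<T> {
    private final Object[] slots;
    private int head = 0; // index of the next item to take
    private int tail = 0; // index of the next free slot
    private final Semaphore empty;                    // counts free slots
    private final Semaphore full = new Semaphore(0);  // counts filled slots
    private final Semaphore mutex = new Semaphore(1); // binary semaphore guarding the indices

    BoundedBuffer(int capacity) {
        slots = new Object[capacity];
        empty = new Semaphore(capacity);
    }

    // Blocks while the buffer is full.
    void put(T item) throws InterruptedException {
        empty.acquire();
        mutex.acquire();
        slots[tail] = item;
        tail = (tail + 1) % slots.length;
        mutex.release();
        full.release();
    }

    // Blocks while the buffer is empty.
    @SuppressWarnings("unchecked")
    T take() throws InterruptedException {
        full.acquire();
        mutex.acquire();
        T item = (T) slots[head];
        slots[head] = null;
        head = (head + 1) % slots.length;
        mutex.release();
        empty.release();
        return item;
    }
}
```

Every message costs several semaphore operations, which may help explain why a design with one thread per partition showed no speedup; the slides do not state the cause.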

Assessment Evaluation Performance Evaluation

Design I

3-D grid shaped pattern of thread creation

Message passing by bounded buffers

Number of threads can be 2x2x2 or 4x4x4

A 3-D array of partitions is assigned to each thread
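One way to map partitions onto a 2x2x2 or 4x4x4 thread grid is block decomposition, sketched here. The 8x8x8 partition grid used in the usage note is an assumption (it would match the 512 partitions of the Initial Design), and `GridAssignment` is a hypothetical name.

```java
// Hypothetical helper showing block decomposition of a cubic partition grid.
class GridAssignment {
    // Linear id of the thread that owns partition (i, j, k).
    static int ownerThread(int i, int j, int k, int partitionsPerAxis, int threadsPerAxis) {
        if (partitionsPerAxis % threadsPerAxis != 0) {
            throw new IllegalArgumentException("threads per axis must divide partitions per axis");
        }
        int block = partitionsPerAxis / threadsPerAxis; // partitions per thread along one axis
        int ti = i / block;
        int tj = j / block;
        int tk = k / block;
        return (ti * threadsPerAxis + tj) * threadsPerAxis + tk;
    }

    // Number of partitions in the 3-D block owned by each thread.
    static int partitionsPerThread(int partitionsPerAxis, int threadsPerAxis) {
        int block = partitionsPerAxis / threadsPerAxis;
        return block * block * block;
    }
}
```

Under the assumed 8x8x8 grid, `partitionsPerThread(8, 2)` gives 64 partitions per thread and `partitionsPerThread(8, 4)` gives 8.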

Assessment Evaluation Performance Evaluation

Design I, Fine Grained

Number of Threads   Time Taken   Speed-up   Efficiency (%)
1                   179625       --         --
8                   174393       1.03       25.75
64                  216415       0.83       20.75
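The speed-up column is the single-thread time divided by the n-thread time. The efficiency figures in every table equal the rounded speed-up divided by 4, expressed as a percentage, which would be consistent with normalising by four processors. The slides do not state the processor count, so the divisor below is an inference. A small sketch that reproduces the rows above (`SpeedupTable` is a hypothetical name):

```java
// Hypothetical helper reproducing the speed-up and efficiency columns.
class SpeedupTable {
    // Assumption: inferred from the reported figures, not stated in the slides.
    static final int ASSUMED_PROCESSORS = 4;

    static double speedup(long singleThreadTime, long parallelTime) {
        return (double) singleThreadTime / parallelTime;
    }

    static double efficiencyPercent(double speedup, int processors) {
        return 100.0 * speedup / processors;
    }

    public static void main(String[] args) {
        long t1 = 179625;                // Design I, fine grained, 1 thread
        int[] threads = {8, 64};
        long[] times = {174393, 216415};
        for (int n = 0; n < threads.length; n++) {
            // Round to two decimals first, as the table appears to do.
            double s = Math.round(speedup(t1, times[n]) * 100) / 100.0;
            System.out.printf("%d threads: speed-up %.2f, efficiency %.2f%%%n",
                    threads[n], s, efficiencyPercent(s, ASSUMED_PROCESSORS));
        }
    }
}
```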

Assessment Evaluation Performance Evaluation

Design I, Coarse Grained

Number of Threads   Time Taken   Speed-up   Efficiency (%)
1                   1726549      --         --
8                   1676261      1.03       25.75
64                  2105547      0.82       20.5

Assessment Evaluation Performance Evaluation

[Figure: Design I, Speedup vs Number of Threads. x-axis: Number of Threads; y-axis: Speedup; series: Fine Grained, Coarse Grained]

Assessment Evaluation Performance Evaluation

Design II

Vertical pipeline shaped pattern of thread creation

Message passing through bounded buffers

Layers of partitions assigned to each thread rather than a 3-D array of partitions

Number of threads created can be 1, 2, 4 or 8.
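The layer-based assignment can be sketched as contiguous slabs, where each thread exchanges messages only with the threads directly above and below it. The layer count of 8 used in the usage note, the non-periodic neighbour rule and the name `LayerAssignment` are assumptions for illustration.

```java
// Hypothetical helper: contiguous layers of partitions per pipeline thread.
class LayerAssignment {
    // First layer (inclusive) owned by `thread`; assumes `threads` divides `layers`.
    static int firstLayer(int thread, int layers, int threads) {
        return thread * (layers / threads);
    }

    // Last layer (inclusive) owned by `thread`.
    static int lastLayer(int thread, int layers, int threads) {
        return firstLayer(thread + 1, layers, threads) - 1;
    }

    // Neighbour above in the pipeline, or -1 for the first thread.
    static int upperNeighbour(int thread) {
        return thread - 1;
    }

    // Neighbour below in the pipeline, or -1 for the last thread.
    static int lowerNeighbour(int thread, int threads) {
        return thread + 1 < threads ? thread + 1 : -1;
    }
}
```

With 8 layers and 4 threads, thread 1 would own layers 2 to 3 and talk to threads 0 and 2. A thread in a pipeline has at most two neighbours, whereas a thread in the interior of a 3-D grid has more.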

Assessment Evaluation Performance Evaluation

Design II, Fine Grained

Number of Threads   Time Taken   Speed-up   Efficiency (%)
1                   170062       --         --
2                   158936       1.07       26.75
4                   104796       1.62       40.5
8                   126911       1.34       33.5

Assessment Evaluation Performance Evaluation

Design II, Coarse Grained

Number of Threads   Time Taken   Speed-up   Efficiency (%)
1                   1699526      --         --
2                   1559198      1.09       27.25
4                   982384       1.73       43.25
8                   1196849      1.42       35.5

Assessment Evaluation Performance Evaluation

[Figure: Design II, Speedup vs Number of Threads. x-axis: Number of Threads; y-axis: Speedup; series: Fine Grained, Coarse Grained]

Assessment Evaluation Performance Evaluation

Final Design

Vertical pipeline shaped pattern of thread creation

Synchronization by barrier; no message passing

Layers of partitions assigned to each thread rather than a 3-D array of partitions

Number of threads created can be 1, 2, 4 or 8.
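Barrier synchronization over shared data, with no message passing, might look like the sketch below. It uses `java.util.concurrent.CyclicBarrier` as a stand-in for the project's own `Barrier` class, and the averaging update is a placeholder rather than the actual molecular dynamics computation; `BarrierSketch` and its parameters are hypothetical.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical sketch: workers share arrays and meet at a barrier after every step.
class BarrierSketch {
    static double[] run(int threads, int cells, int steps) throws InterruptedException {
        double[][] buffers = {new double[cells], new double[cells]};
        for (int c = 0; c < cells; c++) {
            buffers[0][c] = c;
        }
        CyclicBarrier barrier = new CyclicBarrier(threads);
        Thread[] workers = new Thread[threads];
        int chunk = cells / threads;
        for (int t = 0; t < threads; t++) {
            final int lo = t * chunk;
            final int hi = (t == threads - 1) ? cells : lo + chunk;
            workers[t] = new Thread(() -> {
                try {
                    for (int s = 0; s < steps; s++) {
                        double[] src = buffers[s % 2];       // read-only during this step
                        double[] dst = buffers[(s + 1) % 2]; // each thread writes only its own range
                        for (int c = lo; c < hi; c++) {
                            // Placeholder update that reads cells owned by neighbouring threads.
                            double left = src[(c + cells - 1) % cells];
                            double right = src[(c + 1) % cells];
                            dst[c] = (left + src[c] + right) / 3.0;
                        }
                        // No thread starts the next step until all have finished this one.
                        barrier.await();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return buffers[steps % 2];
    }
}
```

Neighbouring data is read directly from the shared array, so each step needs one barrier wait per thread and no buffer operations.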

Assessment Evaluation Performance Evaluation

Final Design, Fine Grained

Number of Threads   Time Taken   Speed-up   Efficiency (%)
1                   162653       --         --
2                   137841       1.18       29.5
4                   62800        2.59       64.75
8                   66935        2.43       60.75

Assessment Evaluation Performance Evaluation

Final Design, Coarse Grained

Number of Threads   Time Taken   Speed-up   Efficiency (%)
1                   1684963      --         --
2                   1306172      1.29       32.25
4                   591215       2.85       71.25
8                   640670       2.63       65.75

Assessment Evaluation Performance Evaluation

[Figure: Final Design, Speedup vs Number of Threads. x-axis: Number of Threads; y-axis: Speedup; series: Fine Grained, Coarse Grained]

Project Evaluation Problems encountered

JPF
Debugging parallel programs
Limited processing power of available systems

Project Evaluation Accuracy of Estimates

Estimated duration of the project ~ 8 Months

Actual duration of the project ~ 7 months

             Estimated LOC   Actual LOC
Sequential   1435            504
Parallel     1545            1271
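The gap between estimate and outcome can be expressed as the actual value as a percentage of the estimate. A small sketch of that arithmetic, using only the figures above (`EstimateAccuracy` is a hypothetical name):

```java
// Hypothetical helper: actual value as a percentage of the estimate.
class EstimateAccuracy {
    static double percentOfEstimate(double actual, double estimated) {
        return 100.0 * actual / estimated;
    }

    public static void main(String[] args) {
        System.out.printf("Sequential LOC: %.1f%% of estimate%n", percentOfEstimate(504, 1435));
        System.out.printf("Parallel LOC:   %.1f%% of estimate%n", percentOfEstimate(1271, 1545));
        System.out.printf("Duration:       %.1f%% of estimate%n", percentOfEstimate(7, 8));
    }
}
```

The sequential code came in at roughly a third of its estimated size, while the parallel code and the duration were much closer to their estimates.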

Project Evaluation Lessons Learnt

Methodology Reviews

User Manual

Data Formats
Program Usage
User Commands
System Configuration

Conclusion

Various parallel algorithms, differing in 1) synchronization mechanism, 2) pattern of thread creation and 3) granularity, were implemented.

The implementations were compared for speedup and efficiency.

Documentation of the design and the overall system was produced.

Questions/Comments
