a relaxed synchronization approach for solving parallel

27
Introduction Problem Description Relaxed Synchronization Experimental Results Summary A Relaxed Synchronization Approach for Solving Parallel Quadratic Programming Problems with Guaranteed Convergence Kooktae Lee * , Raktim Bhattacharya * , Jyotikrishna Dass ** , V N S Prithvi Sakuru ** , Rabi N. Mahapatra ** *Aerospace Engineering **Computer Science and Engineering Texas A&M University May 24, 2016 Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 1

Upload: others

Post on 11-Dec-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

A Relaxed Synchronization Approach for Solving Parallel QuadraticProgramming Problems with Guaranteed Convergence

Kooktae Lee∗, Raktim Bhattacharya∗, Jyotikrishna Dass∗∗, V N S PrithviSakuru∗∗, Rabi N. Mahapatra∗∗

*Aerospace Engineering **Computer Science and Engineering

Texas A&M University

May 24, 2016

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 1

Page 2: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Table of Contents

1 Introduction

2 Problem Description for Parallel Quadratic Programming Problem

3 A Relaxed Synchronization approach

4 Experimental Results

5 Summary

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 2

Page 3: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

1. Introduction

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 3

Page 4: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Large-scale Distributed Optimization Problemswith focus on Parallel Quadratic Programming (PQP)

Applications of PQP:1) least square problems with linear constraints2) regression analysis and statistics3) SVMs (Support Vector Machines)4) lasso (least absolute shrinkage and selection operator)5) portfolio optimization problems

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 4

Page 5: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Problems when implementing parallel computing algorithm

Several critical issues commonly encountered by parallel computing:Load imbalanceShared memory movementCommunication overheadSynchronization bottleneck

According to the literature 1, the idle process time may be up to 50% of totalcomputation time.

1Buchholz, Peter, Markus Fischer, and Peter Kemper. ”Distributed steady state analysisusing Kronecker algebra.” Numerical Solutions of Markov Chains (NSMC’99): 76-95.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 5

Page 6: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Some facts on synchronization:Synchronization requires idle process time for multi-core computing devicesFor extreme-scale parallel computing, this leads to waste of computing timeNeed to relax the synchronization penalty!

How to avoid synchronization latency?1) Asynchronous Computing algorithm:

Proceed with computation without waiting values computed by otherprocessors ⇒ No need to synchronize dataAsynchrony causes randomness in computing valuesCannot guarantee the numerical stability and convergence of solution obtainedby async. algorithm

2) Relaxed Synchronization approach:Do not synchronize the data and hold synchronizationCommunication takes places periodically.Objective is to minimize the number of communication (synchronization)⇒ How frequently communicate?

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 6

Page 7: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

2. Problem Description for Parallel QuadraticProgramming (PQP) Problem

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 7

Page 8: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

The quadratic programming problem considered here is

Quadratic Programming Problem

minx

f (x) subject to Ax = b, (1)

where x ∈ Rn, A ∈ Rm×n, b ∈ Rm,

f (x) := 12 xT Qx + cT x ,

Q ∈ Rn×n is a symmetric, positive definite matrix, and c ∈ Rn.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 8

Page 9: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

The cost function f (x) is separable, i.e.,

Assumption

f (x) =N∑

i=1fi (xi ) =

N∑i=1

12 xT

i Qi xi + cTi xi

Ax =N∑

i=1Ai xi ,

where N denotes the total number of subproblems.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 9

Page 10: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Dual Ascent AlgorithmLagrangian:

L(x , y) := f (x) + yT (Ax − b),

where y is the dual variable or the Lagrange multiplier.

xk+1i = arg min

xiLi (xi , yk) = −Q−1

i (ATi yk + c), i = 1, . . . ,N, (2)

yk+1 = yk + αk(Axk+1 − b), (3)

where αk > 0 is the step size, the superscript k is the iteration counter, and xi arepartitions of x .

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 10

Page 11: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Figure: The schematic of Dual Ascent algorithm

In the gathering stage, the synchronization is necessary and unavoidable!

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 11

Page 12: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

3. Relaxed Synchronization Approach:Lazy Synchronization

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 12

Page 13: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

TSDA (Tightly Synchronized Dual Ascent) Algorithm

yk+1 = yk + αk(Axk+1 − b)

= yk +N∑

i=1αi

(Ai xk+1

i − bN

). (4)

LSDA (Lazily Synchronized Dual Ascent) Algorithm

yk+1 = yk +N∑

i=1αi

(Ai x tP+1

i − bN

), tP ≤ k < (t + 1)P, (5)

where t ∈ N0.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 13

Page 14: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Known: x tP+1i = −Q−1

i (ATi y tP +c)

LSDA Algorithm Development

yk+1 = yk +N∑

i=1αi

(−Ai Q−1

i(Ai y tP + c

)− b

N

), tP ≤ k < (t + 1)P. (6)

when k = (t + 1)P − 1, we have

y (t+1)P =(

I − PN∑

i=1αi(Ai Q−1

i ATi))

y tP − PN∑

i=1αi

(Ai Q−1

i c + bN

), (7)

where I stands for the identity matrix with a proper dimension.New dynamics given by:

⇒ y (t+1)P = A(P)y tP + b(P), P ∈ N

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 14

Page 15: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Three issues related to LSDA algorithm

For LSDA algorithm, the following issues have to be resolved.1) Stability issue:

Is LSDA algorithm stable?

2) Convergence issue:Does LSDA algorithm provide the same solution as compared to TSDAalgorithm?

3) Optimality issue:What is the optimal synchronization period P∗ then?

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 15

Page 16: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Lemma(Stability) The dual variable for LSDA algorithm is stable if and only if

ρ(

A(P))< 1. (8)

where A(P) := I − P∑N

i=1 αi(Ai Q−1

i ATi)

and the symbol ρ(·) denotes thespectral radius of the given matrix (i.e., the largest magnitude of the eigenvalue).

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 16

Page 17: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Proposition(Convergence) Consider the QP problem that is separable. If the condition (8)holds, then the dual variables yLSDA for LSDA and yTSDA for TSDA converge tothe same fixed-point value y∗ := limk→∞ yk

TSDA = limt→∞ y tPLSDA.

Theorem(Optimality) For the given parallel QP problem with LSDA technique, the optimalsynchronization period P? is obtained by

P? = max argminP∈N

max{|1− λ(β)P|, |1− λ̄(β)P|} (9)

where β :=∑N

i=1 αi Ai Q−1i AT

i , λ(·) and λ̄(·) denote the smallest and the largesteigenvalues of the square matrix, respectively.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 17

Page 18: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

4. Experimental Results

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 18

Page 19: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Hardware & Software DescriptionHardware:40 node cluster of Amazon Web Services (AWS)Elastic Cloud Compute (EC2) instances:Intel Xeon processors with clock speed up to 3.33 GHzOne processing unit with 1GB memory

Software:C++ with Armadillo (v5.400.2) – linear algebra libraryMPI framework library (MPICH-v3.1.4) for the inter-node communication

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 19

Page 20: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

The data was synthetically generated with random values uniformly distributedover [−1, 1]. The problem specifics are as follows:

Experimental Setup1) Number of instances in synthetic dataset, d = 200, 000.2) Step size, α = 0.27.3) Optimal Synchronization Period, P∗ = 70.4) Stopping threshold, ε = 10−5.5) Cluster Size, N = {10, 20, 32, 40}.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 20

Page 21: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Figure: LSDA algorithm: Number of iterations (k) vs Synchronization Period (P). Thenumber of iterations required for the TSDA algorithm to converge is constant.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 21

Page 22: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Figure: LSDA algorithm: Computation Time vs Synchronization Period for cluster size N= {10, 20, 32, 40}

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 22

Page 23: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Figure: Dual variable solution vs Number of iterations. LSDA algorithm converges to theoptimal solution of the dual variable significantly faster than the TSDA algorithm.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 23

Page 24: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Figure: Total execution time vs Cluster size for TSDA algorithm (left) and for LSDAalgorithm (right)

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 24

Page 25: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Computing Performance Analysis

Computing Performance Comparison between TSDA & LSDA algorithm

TSDA algorithm LSDA algorithmNo. of iteration 868 211Sync. period 1 70No. of Sync. 868 (=868/1) times 3 (=211/70) timesComm. delay reduction 99.65%Speedup 160 times

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 25

Page 26: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Summary

1 A relaxed synchronization technique was developed to solve massively parallellarge-scale QP problems.

2 Optimal synchronization period is computed analytically with guaranteedconvergence.

3 The efficiency of the proposed methods was verified through the realimplementation of parallel computing algorithm.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 26

Page 27: A Relaxed Synchronization Approach for Solving Parallel

Introduction Problem Description Relaxed Synchronization Experimental Results Summary

Thank you.

Acknowledgement:

This work was supported by AFOSR grant FA9550-15-1-0071,with Dr. Frederica Darema as the program manager.

Lab. for Uncertainty Quantification Aerospace Engineering, Texas A&M University July 1 2015, 27