
Continuous Performance Testing for Microservices

Vincenzo Ferme
Reliable Software Systems
Institute for Software Technologies
University of Stuttgart, Germany

HPI Future SOC Lab Day

info@vincenzoferme.it

Outline

1. Continuous Software Development Environments
2. Continuous Performance Testing of Microservices:
   a. Challenges
   b. Our Approach
   c. Preliminary Results
   d. BenchFlow Automation Framework
3. Approach Extensions:
   a. ContinuITy Framework
4. Future Work

Continuous Software Development Environments

Developers, Testers, Architects -> Repo -> CI Server -> CD Server -> Production

Challenges

The system changes frequently:
- Need to keep the test definition updated (manual is not an option)

Too many tests to be executed:
- Need to reduce the number of executed tests

Need to integrate tests in delivery pipelines:
- Need complete automation for test execution
- Need clear KPIs to be evaluated in the pipeline

Our Approach

1. Observe load situations in production (load level over time) and derive the empirical distribution of load situations (relative frequency per load level: 50, 200, 350, 500).
2. Sample load levels from that distribution (aggregated relative frequency per sampled load level: 100, 300, 500) to obtain the sampled load tests, together with a baseline test, scalability criteria, and a deployment configuration.
3. Execute the tests; the test results carry the aggregated relative frequencies (0.12, 0.14, 0.20, 0.16, 0.11).
4. Aggregate the test results into the Domain Metric (0.734).
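The approach starts from an empirical distribution of observed production load levels and samples the load levels to test from it. A minimal sketch of those two steps; the bin width, merge factor, and observed loads are illustrative, not values from the talk:

```python
from collections import Counter

def empirical_distribution(observed_loads, bin_width=50):
    """Bin observed production load levels (users) into relative frequencies."""
    bins = Counter((load // bin_width) * bin_width for load in observed_loads)
    total = sum(bins.values())
    return {level: count / total for level, count in sorted(bins.items())}

def sample_load_tests(distribution, merge=2):
    """Aggregate adjacent load levels so fewer load tests need to run;
    each sampled level carries the aggregated relative frequency."""
    levels = sorted(distribution)
    sampled = {}
    for i in range(0, len(levels), merge):
        group = levels[i:i + merge]
        sampled[max(group)] = sum(distribution[lvl] for lvl in group)
    return sampled

# Illustrative observed load levels (users) from production monitoring
observed = [60, 120, 140, 180, 210, 260, 280, 310, 90, 160]
dist = empirical_distribution(observed)
tests = sample_load_tests(dist)
```

Both the empirical and the aggregated frequencies sum to 1, so each sampled level's frequency is the weight it later contributes to the Domain Metric.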

Preliminary Experiments

- Production: 12 microservices, custom operations mix
- Sampled Load Tests: empirical distribution of load situations with 6 load levels (50, 100, 150, 200, 250, 300 users)
- Deployment Config.: 10 configurations, varying replicas, CPU, and RAM
- Scalability criterion: Scal = avg + 3σ
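The scalability criterion Scal = avg + 3σ can be read as: an API call passes under load when its response time stays within the baseline mean plus three standard deviations. A minimal sketch under that reading (function names and the sample values are mine):

```python
import statistics

def scalability_threshold(baseline_response_times):
    """Scal = avg + 3σ computed over the baseline (low-load) response times."""
    avg = statistics.mean(baseline_response_times)
    sigma = statistics.pstdev(baseline_response_times)
    return avg + 3 * sigma

def passes_scalability(load_response_times, threshold):
    """PASS when the mean response time under load stays within Scal."""
    return statistics.mean(load_response_times) <= threshold

baseline = [100, 105, 98, 102, 101, 99]  # ms, illustrative baseline run
threshold = scalability_threshold(baseline)
```

A run whose latency stays near the baseline passes; one that degrades well beyond three standard deviations fails, which is what feeds the PASS/FAIL column of the results slides.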

Experiments Infrastructure

- Virtual Users: 32 GB RAM, 24 cores
- System Under Test: 896 GB RAM, 80 cores

Preliminary Experiments Results

Deployment Configuration: 1 GB RAM, 0.25 CPU, 1 Replica

Custom Op. Mix, API Scalability Criteria (PASS/FAIL):

API         Scalability Criteria
GET /       PASS
GET /cart   PASS
POST /item  FAIL

Aggregated relative frequency per load level:

Users  Aggr. Rel. Freq.
50     0.10582
100    0.18519
150    0.22222
200    0.22222
250    0.20370
300    0.06085

Contribution to the Domain Metric at the 250-user level: Max 0.20370, Actual 0.13580.
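Each load level's aggregated relative frequency is the maximum it can contribute to the Domain Metric; the actual contribution shrinks when operations in the mix fail their scalability criteria. A hedged sketch of that reduction, assuming an equal operations mix (the real weights come from the custom op. mix, which the slide does not spell out); with equal weights, 0.20370 scaled by the 2 of 3 passing operations reproduces the slide's 0.13580:

```python
def contribution(aggr_rel_freq, results, weights=None):
    """Contribution of one load level to the Domain Metric: the level's
    aggregated relative frequency, scaled by the weighted fraction of
    API operations that PASS their scalability criteria."""
    if weights is None:  # assumption: equal weight per operation
        weights = {op: 1 / len(results) for op in results}
    passing = sum(weights[op] for op, ok in results.items() if ok)
    return aggr_rel_freq * passing

# PASS/FAIL verdicts at 250 users, from the slide
results_250_users = {"GET /": True, "GET /cart": True, "POST /item": False}
actual = contribution(0.20370, results_250_users)
```

If every operation passed, the contribution would reach its maximum of 0.20370.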

Preliminary Experiments Results

Deployment Configuration: 1 GB RAM, 0.25 CPU, 1 Replica

Contribution to the Domain Metric per load level:

Users  Contribution
50     0.10582
100    0.18519
150    0.22222
200    0.07999
250    0.13580
300    0.04729

Domain Metric (Max: 1), Actual: 0.77631
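The Domain Metric for a configuration is simply the sum of the per-load-level contributions, with 1 as the ceiling reached only when every load level contributes its full aggregated relative frequency. Summing the slide's values reproduces the reported 0.77631:

```python
# Per-load-level contributions from the slide (users -> contribution)
contributions = {
    50: 0.10582,
    100: 0.18519,
    150: 0.22222,
    200: 0.07999,
    250: 0.13580,
    300: 0.04729,
}

domain_metric = sum(contributions.values())
print(round(domain_metric, 5))  # 0.77631, as reported on the slide
```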

Preliminary Experiments Results

Table 4: Domain Metric per Configuration in the two environments (HPI, FUB). The configuration with the highest Domain Metric is highlighted.

RAM     CPU   # Cart Replicas  Domain Metric (HPI)  Domain Metric (FUB)
0.5 GB  0.25  1                0.61499              0.54134
1 GB    0.25  1                0.77631              0.53884
1 GB    0.5   1                0.53559              0.54106
0.5 GB  0.5   1                0.51536              0.54773
0.5 GB  0.5   2                0.50995              0.54111
1 GB    0.25  2                0.74080              0.54785
1 GB    0.5   2                0.53401              0.54106
0.5 GB  0.5   4                0.50531              0.54939
1 GB    0.25  4                0.37162              0.54272
1 GB    0.5   4                0.56718              0.54271

[...] at HPI results in performance degradation. For example, the configuration with 1 GB of RAM, 0.25 CPU share, and four cart replicas was assessed as 0.37162, while the configuration with 1 GB of RAM, 0.5 CPU share, and one cart replica was assessed as 0.55356. At FUB, the domain metric oscillates over a narrow range. Scaling beyond 0.5 GB of RAM, 0.5 CPU share, and 4 cart replicas does not lead to better performance if the number of users is higher than 150. In addition, the choice of the HPI or the FUB deployment has a significant impact on the domain metric, as shown in Table 4. These findings suggest that bottleneck analysis and careful performance engineering activities should be executed before additional resources are added to the architecture deployment configuration.

6 Threats to Validity

The following threats to validity to our research were identified:

- Operational profile data analysis: the domain metric introduced in this paper relies on the careful analysis of production usage operational profile data. Many organizations will not have access to accurate operational profile data, which might impact the accuracy of the domain metric assessments. Several approaches can be used to overcome the lack of accurate operational profile data [4], such as: using related systems as a proxy for the system under study, conducting user surveys, and analyzing log data from previous versions of the system under study.

- Experiment generation: experiment generation requires the estimation of each performance test case's probability of occurrence, which is based on the operational profile data. When the operational profile data granularity is coarse, there is a threat to the accuracy of the estimated operational profile distribution. Some of the suggested approaches to overcome the coarse granularity of the operational profile data are: performing the computation [...]
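Given Table 4 (Domain Metric per configuration in the HPI and FUB environments), choosing a deployment configuration amounts to an argmax over the Domain Metric per environment; the values below are transcribed from the table:

```python
# (RAM GB, CPU share, cart replicas) -> (Domain Metric HPI, Domain Metric FUB)
table4 = {
    (0.5, 0.25, 1): (0.61499, 0.54134),
    (1.0, 0.25, 1): (0.77631, 0.53884),
    (1.0, 0.50, 1): (0.53559, 0.54106),
    (0.5, 0.50, 1): (0.51536, 0.54773),
    (0.5, 0.50, 2): (0.50995, 0.54111),
    (1.0, 0.25, 2): (0.74080, 0.54785),
    (1.0, 0.50, 2): (0.53401, 0.54106),
    (0.5, 0.50, 4): (0.50531, 0.54939),
    (1.0, 0.25, 4): (0.37162, 0.54272),
    (1.0, 0.50, 4): (0.56718, 0.54271),
}

best_hpi = max(table4, key=lambda cfg: table4[cfg][0])  # -> (1.0, 0.25, 1)
best_fub = max(table4, key=lambda cfg: table4[cfg][1])  # -> (0.5, 0.5, 4)
```

Note how the two environments disagree on the best configuration, which is the slide's point that the deployment environment itself has a significant impact on the metric.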


BenchFlow Automation Framework

BenchFlow (BF):
- DSL: Workload Modeling, SUT Configuration, SUT Deployment
- BenchFlow Adapters
- Test Execution: Faban
- Data Collection: Minio
- Data Analysis

https://github.com/benchflow

BenchFlow Automation Framework

Generation: Test Generation -> Test Bundle -> Experiment Generation -> Experiment Bundle
Execution: Experiment Execution (Success / Failures)
Analysis: Result Analysis -> Metrics

[Ferme et al.] Vincenzo Ferme and Cesare Pautasso. A Declarative Approach for Performance Tests Execution in Continuous Software Development Environments. In Proc. of ICPE 2018, 261-272.

Approach: Summary

Production (1) -> Empirical Distribution of Load situations -> Sampled Load Tests (2); together with a Baseline Test, Scalability Criteria, and a Deployment Conf., these are executed by BenchFlow (BF) (3) to compute the Domain Metric (4), e.g. 0.734.

Per sampled load test, the contribution to the Domain Metric can be compared against the maximum contribution for each deployment configuration.

Examples of Use of Domain Metric KPI:
1. Decide if the system is ready for deployment
2. Assess different deployment configurations
3. Configure scaling plans and autoscalers
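The first use of the Domain Metric KPI, deciding whether the system is ready for deployment, amounts to a threshold gate in the delivery pipeline. The threshold value here is illustrative, a project-specific choice rather than one prescribed by the approach:

```python
def ready_for_deployment(domain_metric, threshold=0.7):
    """Gate a CD pipeline on the Domain Metric KPI.

    The threshold is an assumption: each team picks the minimum
    fraction of the production load distribution their system must
    handle before a release is allowed through."""
    return domain_metric >= threshold

print(ready_for_deployment(0.734))  # the Domain Metric from the running example
```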

Approach: Extensions

The full pipeline is the extension point: (1) observed load situations in production; (2) the empirical distribution of load situations and the sampled load levels; (3) the sampled load tests with baseline test, scalability criteria, and deployment conf.; (4) the test results aggregated into the Domain Metric (0.73).

ContinuITy Framework

Production -> Sampled Load Tests: Load Levels, API Calls, Op. Mix
Workload Model + APIs Specification + Annotations Model

[Schulz et al.] Henning Schulz, Tobias Angerstein, and André van Hoorn. Towards Automating Representative Load Testing in Continuous Software Engineering. In Proc. of ICPE 2018 Companion, 123-126.

Future Work

Automated update of tests definition:
- Automatically update load levels, API calls, and operations mix

Declarative Definitions for Load Testing:
- Make performance engineering activities more accessible

Declarative Performance Test Regression:
- Extend the proposed approach to continuously assess performance regressions and PASS/FAIL CI/CD pipelines

Prioritisation and Selection of Load Tests:
- Exploit performance models to further reduce the number of tests
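The declarative performance-test-regression idea, continuously assessing regressions and PASS/FAILing CI/CD pipelines, can be sketched as comparing the Domain Metric across builds with a tolerated relative drop. The 5% tolerance is an assumption of mine, not a value from the talk:

```python
def regression_verdict(previous_metric, current_metric, tolerance=0.05):
    """FAIL the pipeline when the Domain Metric drops by more than
    `tolerance` relative to the previous build (tolerance is assumed)."""
    if previous_metric <= 0:
        return "PASS"  # nothing meaningful to compare against
    drop = (previous_metric - current_metric) / previous_metric
    return "FAIL" if drop > tolerance else "PASS"

print(regression_verdict(0.77631, 0.734))  # about a 5.4% drop, prints FAIL
```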

Highlights

A recap of the preceding slides: Challenges, Our Approach, the Preliminary Experiments, and the Extensions and Future Work.
