Empirical Studies in Test-Driven Development
Laurie Williams, NCSU
williams@csc.ncsu.edu

Page 1: Empirical Studies in Test-Driven Development Laurie Williams, NCSU williams@csc.ncsu.edu

Empirical Studies in Test-Driven Development

Laurie Williams, williams@csc.ncsu.edu

Page 2:

© Laurie Williams 2007

Agenda

• Overview of Test-Driven Development (TDD)

• TDD Case Studies

• TDD within XP Case Studies

• Summary

Page 3:

Test-driven Development

Page 4:

Overview of TDD via JUnit
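As a minimal sketch of the red-green cycle this slide introduces: the test is written first and fails ("red"), then the production class is implemented just far enough to pass ("green"). The `ShoppingCart` class and its test are hypothetical examples, and plain checks stand in for JUnit's `@Test`/`assertEquals` so the sketch is self-contained.

```java
// Hypothetical TDD example: ShoppingCartTest.testTotalOfTwoItems was written
// before ShoppingCart existed; ShoppingCart was then implemented to pass it.

class ShoppingCart {
    private double total = 0.0;

    void add(double price) { total += price; }

    double total() { return total; }
}

public class ShoppingCartTest {
    // In JUnit this would be an @Test method discovered and run by the framework.
    static void testTotalOfTwoItems() {
        ShoppingCart cart = new ShoppingCart();
        cart.add(2.50);
        cart.add(1.25);
        if (Math.abs(cart.total() - 3.75) > 1e-9)
            throw new AssertionError("expected total 3.75, got " + cart.total());
    }

    public static void main(String[] args) {
        testTotalOfTwoItems();
        System.out.println("tests pass");
    }
}
```

The refactor step would follow with the test kept green throughout.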

Page 5:

xUnit tools

Page 6:

Agenda

• Overview of Test-Driven Development (TDD)

• TDD Case Studies

• TDD within XP Case Studies

• Summary

Page 7:

Structured Experiment

– 3 companies, 24 professional developers
  • All worked in pairs
  • RoleModel, John Deere, Ericsson

– Random groups: test-first vs. test-last
  • The test-last group “refused” to write automated JUnit tests (effectively test-never)

– Test-first results:
  • 18% more manual, black-box test cases passed
  • 16% more time
  • Good test coverage (98% method, 92% statement, 97% branch)

Page 8:

IBM Retail Device Drivers

[Figure: number of developers per release (1–10) for the Raleigh, Guadalajara, and Martinez teams. Figure: lines of code (source vs. test) per release (1–10).]

IBM research partners:Julio Sanchez (Mexico) and Michael Maximilien (Almaden)

Page 9:

Building Test Assets

[Figure: Test LOC/Source LOC ratio per release (1–10).]

Page 10:

Defect Density

[Figure: defects/KLOC per release (1–10), internal vs. external.]

Page 11:

The New Anti-aging Formula?

[Figure: cyclomatic complexity and Test LOC/Source LOC per release (1–10).]

Page 12:

Lessons Learned

1. A passionate champion for the practice and tools is necessary.

2. JUnit provides a well-structured framework relative to “ad hoc” automated testing.

3. JUnit can be extended to handle necessary manual intervention.

4. Not all tests can be automated.

5. Create a good design using object-oriented principles.

6. Execute all tests prior to checking code into code base.

7. Write tests prior to code.

8. Institute a nightly build that includes running all the unit tests.

9. Create a test whenever a defect is detected internally or externally.
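Lessons 7 and 9 can be illustrated with a small regression test: when a defect is reported, a test that reproduces it is written before the fix and kept in the suite so the defect cannot silently return. The `Temperature` class and its `parse` method below are hypothetical examples.

```java
// Hypothetical regression test capturing a reported defect: parse() used to
// throw on negative and whitespace-padded readings. The test was written from
// the defect report (failing), then the fix was made.

class Temperature {
    static int parse(String reading) {
        return Integer.parseInt(reading.trim()); // fix: trim before parsing
    }
}

public class TemperatureRegressionTest {
    public static void main(String[] args) {
        if (Temperature.parse("-5") != -5)
            throw new AssertionError("negative readings must parse");
        if (Temperature.parse(" 20 ") != 20)
            throw new AssertionError("padded readings must parse");
        System.out.println("regression tests pass");
    }
}
```

Runs in a nightly build alongside the rest of the suite (lesson 8).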

Page 13:

Test-first Performance

[Flowchart: test-first performance process. Design tests, implement a feature, and run functional tests; once they pass, execute tests for performance-critical devices and inspect the performance log file. If performance problems are found, fix them before check-in; repeat until the project is finished. Then execute the master test suite, take performance measurements, and make suggestions for performance improvement.]

Research partners: Chih-wei Ho; IBM: Mike Johnson and Michael Maximilien
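The test-first performance loop can be sketched as a performance test with an explicit budget, written before tuning begins. The 50 ms budget and the `processBatch` workload below are made-up stand-ins for a performance-critical device operation, and the printed measurement stands in for the performance log file.

```java
// Hypothetical test-first performance check: the budget is asserted in the
// test itself, so a regression fails the build rather than going unnoticed.

public class ThroughputTest {
    // Stand-in for a performance-critical device operation.
    static long processBatch(int items) {
        long checksum = 0;
        for (int i = 0; i < items; i++) checksum += i;
        return checksum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        processBatch(1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // "Performance log file" step: record the measurement.
        System.out.println("processBatch elapsed: " + elapsedMs + " ms");

        // "Performance problems found?" step: fail if over budget.
        if (elapsedMs > 50)
            throw new AssertionError("processBatch exceeded 50 ms budget");
    }
}
```

The same pattern extends to the master-suite measurements taken at the end of the project.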

Page 14:

Sample Results

[Figure: printer throughput tests (taller is better) for ADept, ASmkt, MaxS, and SDept, comparing series S1–S4 and Previous vs. S3.]

Page 15:

Agenda

• Overview of Test-Driven Development (TDD)

• TDD Case Studies

• TDD within XP Case Studies

• Summary

Page 16:

IBM: XP-Context Factors (XP-cf)

• Small team (7-10)
• Co-located
• Web development (toolkit)
• Supplier and customer distributed (US and overseas)
• Examined one release “old” (low XP) to the next “new” (more XP)

Page 17:

IBM: XP-Adherence Metrics (XP-am)

XP-am Metric                              | Practice         | Old       | New
Automated test classes per user story     | Testing          | 0.11      | 0.45
Test coverage (statement)                 | Testing          | 30%       | 46%
Unit test runs per person-day             | Testing          | 14%       | 11%
Test LOC/Source LOC                       | Testing          | 0.26      | 0.42
Acceptance test execution                 | Testing          | Manual    | Manual
Did customers run your acceptance tests?  | Testing          | No        | No
Pairing frequency                         | Pair Programming | <5%       | 48%
Release length                            | Short Release    | 10 months | 5 months
Iteration length                          | Short Release    | Weekly    | Weekly

Page 18:

IBM: XP-Outcome Measures (XP-om)

XP Outcome Measures                                          | Old | New
Response to customer change ((user stories in + out)/total)  | NA  | 0.23
Pre-release quality (test defects/KLOEC)                     | 1.0 | 0.50
Post-release quality (released defects/KLOEC)                | 1.0 | 0.61
Productivity (stories/PM)                                    | 1.0 | 1.34
Productivity (relative KLOEC/PM)                             | 1.0 | 1.70
Productivity (Putnam productivity parameter)                 | 1.0 | 1.92
Customer satisfaction                                        | NA  | High (qualitative)
Morale (via survey)                                          | 1.0 | 1.11

Page 19:

Sabre-A: XP Context Factors (XP-cf)

Page 20:

Sabre-A: XP-Adherence Metrics (XP-am)

XP-am Metric                                  | Practice         | Old       | New
Automated test classes per new/changed class  | Testing          | 0.036     | 0.572
Test coverage (statement)                     | Testing          | N/A       | 32.9%
Unit test runs per person-day                 | Testing          | 0         | 1.0
Test LOC/Source LOC                           | Testing          | 0.054     | 0.296
Acceptance test execution                     | Testing          | Manual    | Manual
Did customers run your acceptance tests?      | Testing          | No        | No
Pairing frequency                             | Pair Programming | <0%       | 50%
Release length                                | Short Release    | 18 months | 3.5 months
Iteration length                              | Short Release    | --        | 10 days

Page 21:

Sabre-A: XP-Outcome Measures (XP-om)

XP Outcome Measures                                          | Old | New
Response to customer change ((user stories in + out)/total)  | NA  | N/A
Pre-release quality (test defects/KLOEC)                     | 1.0 | 0.35
Post-release quality (released defects/KLOEC)                | 1.0 | 0.70
Productivity (stories/PM)                                    | N/A | N/A
Productivity (relative KLOEC/PM)                             | 1.0 | 1.46
Productivity (Putnam productivity parameter)                 | 1.0 | 2.89
Customer satisfaction                                        | NA  | High (anecdotal)
Morale (via survey)                                          | N/A | 68.1%

Page 22:

Sabre-P: XP Context Factors (XP-cf)

• Medium-sized team (15)
• Co-located
• Large web application (1M LOC)
• Customers domestic & overseas
• Examined 13th release of the product; 20 months after starting XP

Page 23:

Sabre-P: XP-Adherence Metrics (XP-am)

XP-am Metric                                  | Practice         | New
Automated test classes per new/changed class  | Testing          | 0.0225
Test coverage (statement)                     | Testing          | 7.7%
Unit test runs per person-day                 | Testing          | 0.4
Test LOC/Source LOC                           | Testing          | 0.296
Pair programming                              | Pair Programming | 70%
Release length                                | Short Release    | 3 months
Iteration length                              | Short Release    | 10 days

Page 24:

Sabre-P: XP-Outcome Measures (XP-om)

XP Outcome Measures        | Bangalore SPIN benchmarking group | Capers Jones
Pre-release defect density | Similar                           | Lower
Total defect density       | Lower                             | Lower
Productivity               | Similar                           | Higher

Page 25:

Tekelec: XP Context Factors (XP-cf)

• Small team (4-7; 2 during maintenance phase)

• Geographically distributed: contractors in the Czech Republic working for a US development organization (Tekelec)

• Simulator for a telecommunications signal transfer point system (used to train new customers)

• Considerable requirements volatility

Page 26:

Tekelec: XP-Adherence Metrics (XP-am)

XP-am Metric                                  | Practice         | New
Automated test classes per new/changed class  | Testing          | 1.0
Test coverage (statement)                     | Testing          | N/A
Unit test runs per person-day                 | Testing          | 1/day for all; 1/hour for quickset
Test LOC/Source LOC                           | Testing          | 0.91
Pair programming                              | Pair Programming | 77.5%
Release length                                | Short Release    | 4 months
Iteration length                              | Short Release    | 10 days

Page 27:

Tekelec: XP-Outcome Measures (XP-om)

Outcome Measure                                    | F-15 project
Pre-release quality (test defects/KLOEC)           | N/A
Post-release quality (post-release defects/KLOEC)  | 1.62 defects/KLOEC [lower than industry standards]
Customer satisfaction (interview)                  | Capability: Neutral; Reliability: Satisfied; Communication: Very Satisfied
Productivity                                       | 1.22 KLOEC/PM [lower than industry standards]; 2.32 KLOEC/PM including test code [on par with industry standards]

Page 28:

Empirical Studies of XP Teams

Page 29:

Agenda

• Overview of Test-Driven Development (TDD)

• TDD Case Studies

• TDD within XP Case Studies

• Summary

Page 30:

Summary

• Increased quality with “no” long-run productivity impact

• Valuable test assets created in the process

• Indications:
  – Improved design
  – Anti-aging