Better Software Magazine, January/February 2008
by Jennitta Andrea

CHRISTMAS-MORNING HEARTBREAK

Beginning just after Halloween and ending on Christmas morning, children's anticipation is fueled by relentless marketing for the season's hottest toys. With visions of talking, flying, and magical toys, children embark on the genetically programmed process of convincing their parents they need all the toys. Harried parents weave through crowded malls in search of the coveted toys. On Christmas Eve, festively wrapped presents are carefully placed under the tree.

Christmas-morning happiness quickly turns to heartbreak once the endless layers of protective packaging have been removed and reality sets in. Space Conquerors can't fly. Fishbowl Monkeys don't look anything like the package. What good is the Fashion Runway doll without all the clothes and accessories?

These children aren't spoiled rotten, their parents aren't completely incompetent, the toys aren't broken, and the commercials weren't outrageous lies. The underlying problem is that, amidst the season's hustle and bustle, no one took the time to carefully read the fine print: "Batteries not included," "Not exactly as seen on TV," "Parts sold separately," and "For ages 6+." Disaster could have been averted if parents had acted accordingly.

Likewise, test-driven development (TDD) heartbreak begins just after a slick demo or training class and ends with the first production release. In between, presentations, books, and articles by people selling state-of-the-art tools and consulting services fuel the team's expectations. With visions of "It's as simple as red-green-refactor," "The tests take less than a minute to write," "Deliver defect-free code faster and cheaper," and "No more documentation," the team convinces management that it needs it all. On "Production Eve," carefully written code is placed in the repository, ready for production.

Teams frequently experience "Christmas-morning heartbreak" because they, too, haven't "read the fine print" and acted accordingly. Early success is short lived, and TDD gets abandoned like yesterday's tinsel because:

• Tests take significantly longer to maintain than system code.

• Tests fail intermittently and inexplicably.

• Tests are slow and are not run frequently.

• The "defect-free" system crashes in production.

It doesn't have to be this way. TDD actually does work. It just takes a tremendous amount of discipline and an understanding of where the "landmines" are. This article applies a magnifying glass to the TDD fine print and suggests paths to safely navigate some key landmines.

SOME ASSEMBLY REQUIRED

The box for a train set contains a jumble of different pieces: segments of straight, curved, and y-shaped track; train cars; a remote control; scenery; and more. A large part of the fun is putting the train set together in new ways.

The "box" for TDD contains many interdependent pieces: unit tests, functional tests, user stories, user goals, business processes, iterations, releases, etc. All of these pieces support one another and amplify the others' power like interconnected gears. Like the train set, these pieces must be assembled prior to use. As anyone who has had to assemble toys knows, the ones with gears take the most patience and skill to get working.

"Red-green-refactor" is the TDD mantra (see the sidebar), but it's not the complete story. The full TDD cycle starts with supporting a business process and results in multiple layers of interconnected functional and unit tests, as shown in figure 1.

Successful TDD isn't adopted as an isolated practice—planning and specification activities are also impacted. To assist in planning development iterations, the system features that support the business process are decomposed into user stories, which are ultra-lightweight, transient descriptions (see the StickyNotes for more on user stories). Stories direct the incremental development of functional tests, which starts a chain reaction of TDD cycles. The green cycle of one layer triggers the red cycle of the next layer.

Notice the overall shape formed by these interconnected elements. Unit tests, like other things at the bottom of a food chain, are plentiful; coarser-grained functional tests provide context and direction. The linkage between all of these elements should be managed explicitly through grouping, organization, metadata, and intent-revealing naming schemes.

Figure 1: The full TDD cycle

RED-GREEN-REFACTOR

Red: Failing automated tests guide design. The cycle starts with writing automated tests that describe a new or enhanced capability. While the tests are being developed, a minimal skeleton of the application API is created to enable the tests to compile, as shown in figure A. Emphasis is on correctness and completeness of the tests. This stage benefits significantly from the continuous design and code review achieved by pair programming. The tests are run to ensure that they fail due to the incomplete system methods.

Green: Passing automated tests gauge completion. The focus shifts from the tests to the system code. The API and underlying logic are implemented just enough to cause the tests to pass. While the code is being developed, the tests are run frequently to ascertain progress; we're done when the tests are green.

Refactor: Passing automated tests grant localized safety. The term refactoring refers to a disciplined, step-by-step process of cleaning up the code while preserving the original behavior. Technical debt is accumulated over time as the system grows incrementally; frequent refactoring episodes are the means for paying down this debt. Refactoring should occur only when the system is in a known state of stability, such as when all of the tests are passing. The test suite is rerun after every discrete refactoring step to ensure this stability is preserved. See the StickyNotes for more on refactoring.

Figure A: TDD cycle
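The sidebar's rhythm is easiest to see in code. Below is a minimal sketch of one red-green-refactor pass using JUnit and the article's video store domain; MovieInventory and DuplicateTitleException are hypothetical names invented for illustration, not taken from the article's figures.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.fail;

import java.util.HashSet;
import java.util.Set;

import org.junit.Test;

// Red: this test is written first and fails (or does not compile)
// until the skeleton below exists.
public class MovieInventoryTest {

    @Test
    public void rejectsDuplicateTitles() {
        MovieInventory inventory = new MovieInventory();
        inventory.addTitle("The Lost Skeleton of Cadavra");
        try {
            inventory.addTitle("The Lost Skeleton of Cadavra");
            fail("expected the duplicate title to be rejected");
        } catch (DuplicateTitleException expected) {
            // Green: the guard clause below makes this pass.
        }
        assertEquals(1, inventory.titleCount());
    }
}

// Minimal skeleton, grown just enough to compile (red) and then to pass
// (green); renaming and cleanup wait for the refactor step, with the
// suite rerun after each discrete change.
class MovieInventory {
    private final Set<String> titles = new HashSet<String>();

    void addTitle(String title) {
        if (!titles.add(title)) {
            throw new DuplicateTitleException(title);
        }
    }

    int titleCount() {
        return titles.size();
    }
}

class DuplicateTitleException extends RuntimeException {
    DuplicateTitleException(String title) {
        super("Duplicate title: " + title);
    }
}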

DESIGN FOR TESTABILITY

Designing and assembling a train track is not an end in itself; we want to play with the train. Similarly, TDD is just a means to an end; we want to build the right software (specification) and build the software right (confirmation). If your team's practice of TDD focuses more on tests than on the software being developed, you are heading toward a landmine. The focus should be on how TDD can help make the system's design more testable. A testable design facilitates controlled experiments of the following form: Given the system is in state x, when we do y, we expect the outcome to be z.

A highly testable system makes it easy for people and tools to:

• Set the starting state for the system and all dependent systems

• Control environmental factors, such as date and time

• Completely isolate the system from its dependents

• Trigger an action on the system

• Control responses from dependent systems

• Access all direct and indirect (side effect) outcomes

Designing a system for this type of flexibility, controllability, isolatability, accessibility, and repeatability involves new design strategies. As Gerard Meszaros describes, decisions about how dependencies are managed, test environments are coordinated, and test data is handled will have longer-lasting implications than decisions about which automated tool to use (see the StickyNotes for more information).
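To make the "control environmental factors, such as date and time" bullet concrete, here is one common sketch, assuming a hand-rolled Clock abstraction (Clock, FixedClock, and LateFeeCalculator are all illustrative names, not from the article): production code asks an injected clock for the time, so a test can pin "now" to a known instant.

import java.util.Date;

// Production code asks an injected Clock instead of calling
// new Date() directly, so tests can control "now".
interface Clock {
    Date now();
}

class LateFeeCalculator {
    private final Clock clock;

    LateFeeCalculator(Clock clock) {
        this.clock = clock;
    }

    boolean isOverdue(Date dueDate) {
        return clock.now().after(dueDate);
    }
}

// Test double: a clock frozen at a known instant.
class FixedClock implements Clock {
    private final Date fixed;

    FixedClock(Date fixed) {
        this.fixed = fixed;
    }

    public Date now() {
        return fixed;
    }
}

class LateFeeCalculatorTest {
    public static void main(String[] args) {
        // Given the clock is pinned to day 2, when we check a day-1 due
        // date, we expect it to be overdue (the "given x, when y,
        // expect z" form).
        Date day1 = new Date(1 * 86400000L);
        Date day2 = new Date(2 * 86400000L);
        LateFeeCalculator calculator = new LateFeeCalculator(new FixedClock(day2));
        System.out.println("overdue? " + calculator.isOverdue(day1)); // prints true
    }
}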


BATTERIES NOT INCLUDED

Children are creative enough to play with power toys even when the batteries are missing. However, it is a more exhilarating experience with batteries installed. Similarly, teams following the basic red-green-refactor TDD rhythm typically end up with a heightened sense of confidence, clarity of purpose, and simpler code. Simply taking TDD at face value won't magically result in highly testable systems or produce superior specification and design artifacts. To realize these kinds of long-term benefits, our batteries must be charged by investing time and energy in concepts like functional tests as effective requirement specifications and domain-specific testing languages.

TESTS AS REQUIREMENT SPECIFICATIONS

The most unfortunate thing about TDD is that the name contains the word "test." On one level, it is a constant reminder of the radical reordering of process steps (test before you code). However, when the words are taken at face value, functional requirement specifications tend to resemble detailed test scripts of old, even with state-of-the-art tools like FIT and Watir, and regardless of whether the target is the UI or the API. To illustrate, when I show figure 2 to audiences, many indicate that it resembles their TDD functional specifications. Can you identify the business rule being expressed? Would you sign off that this is correct and complete?

Figure 2: Detailed test script—Add movie title, reject duplicates

Answer: This script is for a video store administration system and describes the business rule to reject duplicates for the "add movie title" feature. I wouldn't recommend signing off, as this specification contains several significant omissions and flaws (see the StickyNotes for more on effective functional tests).

Replacing traditional requirements specifications with a collection of test scripts like this is one of the biggest landmines associated with TDD. How can unreadable, ambiguous, and excessively detailed requirements help teams build better systems?

To bring us back to the original intent, TDD experts like Brian Marick and Dave Astels suggest different terms, such as "executable examples" and "behavior-driven development (BDD)" (see the StickyNotes for more information). Figure 3 illustrates what we should be aiming for—specific, succinct, unambiguous, and relevant scenarios that illuminate the business rule or workflow.

Figure 3: Executable example—Add movie title, reject duplicates

You can sign off on this one. This tiny specification covers everything in figure 2 and, additionally, fixes the flaws—the detailed test script had an ambiguous precondition and failed to check that the inventory was unchanged.
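Neither figure is reproduced in this transcript, so the sketch below is only a guess at the shape such an executable example might take when coded, based on the properties the article lists: an explicit precondition, domain vocabulary, one action, and checks on both the rejection and the unchanged inventory. All helper names (givenInventoryContains, addMovieTitle, and so on) are hypothetical.

import static org.junit.Assert.*;

import java.util.HashSet;
import java.util.Set;

import org.junit.Test;

public class AddMovieTitleExampleTest {
    private final Set<String> inventory = new HashSet<String>();
    private boolean rejectedAsDuplicate;

    @Test
    public void rejectDuplicateTitle() {
        givenInventoryContains("Chud II");

        addMovieTitle("Chud II");

        assertTrue("duplicate title should be rejected", rejectedAsDuplicate);
        assertInventorySizeIs(1); // the check the detailed script forgot
    }

    // Helpers expressed in domain vocabulary (no clicks, no buttons).
    private void givenInventoryContains(String title) {
        inventory.add(title);
    }

    private void addMovieTitle(String title) {
        rejectedAsDuplicate = !inventory.add(title);
    }

    private void assertInventorySizeIs(int expected) {
        assertEquals(expected, inventory.size());
    }
}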


DOMAIN-SPECIFIC TESTING LANGUAGE

Vocabulary is the essential difference between the previous two examples. The test script in figure 2 is expressed in terms of application-oriented actions (click) and fine-grained objects (button). In contrast, the executable example in figure 3 is expressed in terms of business-domain goals (add movie title) and real-world objects (inventory); it focuses on what, not how, bringing why to the foreground. Effective functional specifications are based on a domain-specific language (DSL). Eric Evans, the leading proponent of domain-driven design, suggests that all people involved in a project—manager, user, tester, domain expert, analyst, developer, database administrator, and others—become fluent in their unique DSL, making it ubiquitous in all conversations, artifacts, and code (see the StickyNotes). For complex domains, the benefits of clear and consistent communication far outweigh the cost of formalizing a DSL.


A domain-specific testing language (DSTL) refers to coding DSL elements for automated tests and extending the DSL with domain-based vocabulary for setting up precondition data (e.g., inventory contains) and validating outcomes (e.g., inventory unchanged). Writing tests with a DSTL is much like building with Lego bricks—simply snap the appropriate pieces together and fill in data related to the particular example. Sophisticated compound pieces are built by snapping smaller pieces together, enabling both functional tests and unit tests to be built from the same DSTL elements, as sketched below.
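Here is a minimal sketch of that Lego-brick layering, assuming a hypothetical VideoStoreDstl helper class: single-purpose pieces at the bottom, a compound piece snapped together from them, with both functional and unit tests free to call the same elements.

import java.util.HashSet;
import java.util.Set;

// Hypothetical DSTL helper: small, single-purpose pieces at the bottom,
// compound pieces snapped together from them at the top.
class VideoStoreDstl {
    private final Set<String> inventory = new HashSet<String>();

    // Small pieces: one domain action each.
    void inventoryContains(String... titles) {
        for (String title : titles) {
            inventory.add(title);
        }
    }

    boolean addMovieTitle(String title) {
        return inventory.add(title);
    }

    int inventorySize() {
        return inventory.size();
    }

    // Compound piece: a reusable precondition built from smaller pieces,
    // so the fragment lives in one place instead of in every test.
    void inventoryWithTypicalCatalog() {
        inventoryContains("Chud II", "Plan 9 from Outer Space", "Manos");
    }
}

When a repeated precondition changes, only the compound piece needs editing, which is exactly the maintenance cost that the "Pay Me Later" path described next fails to contain.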

The landmine here is the expectation that tests will take only a minute or two to write—a pace that is sustainable only for simple domains with no precondition setup (like those typically used in demos, training courses, and articles) or if a mature DSTL exists. Without a DSTL, teams head down the "Pay Me Later" path shown in figure 4. Tests initially are churned out quickly; however, the same test fragment is repeated in many tests. Eventually progress comes to a screeching halt when maintenance is required on the repeated fragments.

Figure 4: Pay Me Later

The alternative path, "Pay As You Go," shown in figure 5, is uncomfortable because progress is much slower early on. Because each DSTL is specific both to the domain and to the system being built, you must build each Lego piece before you can use it. When building tests for the long term, you must accept the paradox: By going slower at first, you will go faster in the long run and will be able to maintain the optimal pace much longer.

Figure 5: Pay As You Go


RECOMMENDED FOR TWO OR MORE PLAYERS

Most games have a clear etiquette—instructions for setting up the game and protocol for playing your turn, taking turns, and ending a round of play. TDD is a multi-player "game." The object of the game is to make sure the code base always contains working software. The secret to winning is establishing team etiquette. TDD benefits will only be as good as the weakest level of commitment and discipline to the team etiquette. You've got to play to win, and the team can't win if you don't all play.

SETTING UP THE GAME (ENVIRONMENTS)

A shared, version-controlled code repository contains the completed code and passing tests. Developers have a local snapshot of the repository and a database sandbox. A shared test environment contains a deployed copy of the repository. Analysts, subject-matter experts, and testers use the test environment for investigative and acceptance testing.

Figure 6: Environments

YOUR TURN (WORKING LOCALLY)

Development starts by creating a functional test, which is checked in while still failing. Developers work in parallel on different aspects of the unit tests and system code related to this functional test.

The first step is to synchronize your local environment with the shared repository. Run the full test suite to verify stability of the code base. Proceed with TDD, running the tests frequently as you work on the system code.


TAKING TURNS (INTEGRATE)

Integrate when your new tests and all pre-existing tests pass. Only one person should integrate at a time. A physical token (like a stuffed toy or a hat) signals who is "it"; you can't integrate unless you have the token.

Again, the first step is to synchronize and merge your local environment with the shared repository. Run the full test suite to verify stability. Perform a check-in with the shared repository, and return the integration token. Check-in should trigger an automated process that builds the code and runs the test suite, as shown in figure 6.

ENDING A ROUND (DEPLOY)

Coding and integration continue until the functional test passes. The completed feature can be deployed to the test environment from the repository. Run the full test suite to verify stability of the build.

PARTS SOLD SEPARATELY

Another unfortunate TDD misnomer is "acceptance test," which refers to business-level automated tests. This term leads teams to believe that if their code is built test first and all "acceptance" and unit tests pass, then the code is implicitly accepted by the customer. They feel little need for investigative testing by experienced testers or subject-matter experts. Ironically, most of these teams will report that their "defect-free" system crashed in a real environment using real data. In case you were wondering, that was the sound of a landmine exploding!

A passing automated regression test suite confirms code stability and enables iteration and release testing to be more effective and efficient. Proceed with other forms of testing—exploratory, performance, load, security, usability, etc. (see the StickyNotes for more information).

BEST-BEFORE DATE

Most business systems span multiple years, teams, and projects, as the system progresses through the different phases described below and shown in figure 7.

• Greenfield: A project team builds the system from scratch using TDD. The system is deployed into production.

• Operations: Code and tests are handed down to the operational support and maintenance team. Ideally the team uses TDD on bug fixes and minor enhancements. However, if team members are not trained in TDD and the tests are difficult to maintain, the tests will be ignored and will "dry up" (nobody likes hand-me-downs).

• Enhancement: A parallel branch of code and tests is handed down to another project team to develop a major enhancement. Ideally this team uses TDD as it builds on the existing code and test base. Now that two teams are working in parallel, we could end up with clashes in the test suite that are tricky to resolve.

• Legacy: Yet another parallel branch of code is handed down to yet another team to upgrade or port the aged legacy system. If the tests have survived thus far, the functional tests are used to test drive the porting work, while the original unit tests are appropriately retired (see the StickyNotes for more on test-driven porting).

Figure 7: Full TDD lifecycle

Like modeling clay, which has a limited shelf life that can be prematurely shortened if left uncovered, much careful planning and handling is required to make sure the automated test suite doesn't "dry up" partway through. For TDD to be regarded as a true success, the functional tests should be considered reliable requirement artifacts, and all of the relevant tests should continue to pass for as long as the system is alive. Supporting the handoffs between teams is a key factor in achieving this level of success.


CHOKING HAZARD FOR CHILDREN UNDER THREE

Just as some toys are not appropriate for all children and could even pose great danger to some, TDD is not suitable for all types of projects. Testing consultant Jonathan Kohl highlights where things can go wrong: TDD isn't appropriate for quick-and-dirty prototyping or throwaway exploratory work where the quality of the end product is not a major concern. Applying TDD in cases like these would, at best, be a waste of effort and, at worst, stifle the very creativity and fluidity that the prototyping is being used to facilitate.


NEW YEAR'S RESOLUTION

I've experienced enough Christmas-morning heartbreak firsthand that I've resolved to carefully read the fine print when I shop for my daughter's toys. In recent years, I've discovered that the local news station runs a pre-Christmas series on the hottest holiday toys. The toys are arranged in a room, and children are invited in for unstructured play. We observe firsthand which toys the children are drawn to, which ones they actually spend time playing with, how engaged the play is, and what problems they encounter. This reporting from the field is an invaluable source of relevant information for harried parents.

I've received enough bruises from the TDD "School of Hard Knocks" that I have resolved to make the fine print more visible and relevant. I encourage you to study reports from the field that contain hard-won insights and hands-on experiences from ordinary practitioners. However, I can only go so far: I can lead team members to the fine print, but I can't make them read it. How have things been going on your TDD projects? Are you using it when you shouldn't be? Are you practicing it effectively? What will you resolve to do differently to make this a better TDD year?

Jennitta Andrea is a hands-on analyst, developer, tester, retrospective facilitator, coach, and instructor. Her multi-faceted experience on more than a dozen different agile projects since 2000 is reflected in her frequent publications and conference presentations (see www.theandreagroup.ca). Jennitta's main interest is in improving the state of the art of automated functional testing as it applies to agile requirements (see the StickyNotes for more information); she applies insights from her early experience with compiler technology and domain-specific language design. Jennitta is a board member of the Agile Alliance and is serving on the IEEE Software Advisory Board. She loves golf, ballet, playing games, and spending time with her family: Jim, Ava, Shark Boy, and Lava Girl.



StickyNotes

For more on the following topics go to www.StickyMinds.com/bettersoftware.

• User stories
• Functional testing
• Test-driven porting
• References and further reading
• Refactoring
