agile development and testing in python -

Agile Development and Testing in Python

Grig Gheorghiu and Titus Brown

PyCon 2006, Feb. 23, Addison, TX

Introduction – agile concepts

● what is "agile"?– rapid feedback– continuous integration– automated testing– collaboration / “whole team” approach– small frequent iterations

● testing pyramid:– unit tests– functional/acceptance tests– UI tests

Introduction – testing pyramidcourtesy of Jason “Selenium” Huggins

Introduction (cont.)

● functionality implemented as stories– a story is not done unless it is unittested and acceptance

tested● automated tests

– safety net that enables merciless refactoring– unit tests ("codefacing tests") should run fast– acceptance/functional tests ("customer tests", "business

facing tests") should run via continuous integration process


● “tracer bullet" development technique: like candlemaking– dip wick in wax, you get a thin but functional candle– keep dipping until you get a fully formed one

● TestEnhanced Development (TED)– write some code– write unit tests (different person can do it)– write more code– write more tests– etc.


Lessons learned● continuous integration and automated testing are

essential● remote pair programming works really well: each of the

two is accountable to the other● whole team approach to both development and testing● no code “thrown over the wall” to QA

MailOnnaStick: our “AUT”

● Three basic requirements:– Browse email– Search email– Comment email

Technology base

● CherryPy Web framework● Durus object database● Commentary AJAX commenting system● jwzthreading Netscape 4.xlike threading● py.lib logging & HTML generation● quixote.html HTML escaping code

Implementation history(since Dec. 2005)

● Browse mailboxes via the Web.● Add search functionality.● Add commenting.● Refactored to allow multiple mailboxes per mail source (e.g.

Mail/ dir, IMAP).● Next step: optimize mail indexing.● After that: optimize database storage/retrieval.

Throughout, test the bejeezus out of it.

Testing MOS is a challenge

● Accesses network for outside mailboxes● Has Web GUI● Commentary uses AJAX/JavaScript

Agile project management with Trac

● Wiki: allows easy editing/jotting down of ideas and encourages brainstorming

● defect tracking system: simplifies keeping track of bugs, tasks and enhancements

● source browser: makes it easy to see all the SVN changesets

● roadmap: milestones that are tied into the tracker

Agile project management with Trac (cont.)

● ability to link between bug/task tickets, changesets, source, wiki pages: everything is treated as wiki text

● “timeline”: constantlyupdated, RSSified chronological image of all the important events that happened in the project: code checkins, ticket changes, wiki changes, milestones

Trac – lessons learned

● ease of wiki editing makes it a good way to jot down ideas; same with ticketing system

● great collaborative tool for doing Test Enhanced Development

● user stories / requirements can be specified as tickets of type 'enhancement' and 'task'

Source code management with Subversion

● centralized sourcecode management system like CVS, but better in a number of ways.

● repository can be accessed via file system, SSH, Apache/DAV

● integrates with Trac● standard layout

– trunk: main body of code– branches: correspond to a specific release– tags: copy of the code corresponding to a specific release

● email notifications via SvnReporter

Unit testing

● Completely automated tests that should run quickly;● Test small, individual “units” of function;● Test functionality of specific interface or function;● DON’T test full path through some code;● Traditionally require setup/teardown (e.g. load/reset

database contents, or build mock objects).

Using nose to unit test

● Easy to build ad hoc unit test code;● Test frameworks like nose give you a structure within

which to run, manipulate, display, and manage tests and results;

● nose is one of many Python unit test frameworks.● Author Jason Pellerin;● Based on concepts from py.test;● Can be integrated into setup.py;

nose caveats

● Path handling is a bitch “where's my app code?!”● Because all code is executed within single interpreter,

can have crossmodule artifacts and inconsistencies: isolating code properly can be difficult.

● stdout/stderr capture is annoying.

nose features and hacks

● Select subset of tests to execute on the command line.● Hack unit test display to time individual unit tests.● Option to AVOID slow unit tests…

nose – lessons learned

● Tests that are both fast and easy to run (‘% nosetests’) are more likely to be used; 30 seconds was getting somewhat long…

● Unit tests serve as a kind of explicit documentation of functions/interfaces.

● In particular, this makes writing unit tests for other people’s code a very worthwhile activity: you learn the code well.

Acceptance testing with Fit/FitNesse

● FitNesse: more userfriendly variant of Ward Cunningham's Fit framework

● "businessfacing" or “customerfacing” tests, as opposed to "codefacing" tests (i.e. unit tests)

● tests are expressed as stories ("storytests")– business domain language– higher level compared to unit tests

● FitNesse tests make sure you "write the right code"● unit tests make sure you "write the code right"

Acceptance testing with Fit/FitNesse (cont.)

● wiki format encourages collaboration● Fit/FitNesse brings together business customers,

developers and testers, and it forces them to focus on the core business rules of the application

● James Shore: Done right, FIT fades into the background


● PyFIT is Python port of FIT/FitNesse● tests are written in tabular format, with inputs and

expected outputs● fixtures are a thin layer of glue code that tie the test

tables to the application


● ColumnFixture: similar to SQL select/insert row from/into table

● RowFixture: similar to SQL select from table● declarative style (ColumnFixture) vs. procedural style

(ActionFixture)● tests can be executed from the command line

– can be included in a smoke test run via a continuous integration tool such as buildbot

Fit/FitNesse – lessons learned

● acceptance tests written with FitNesse can be seen as an “alternative” GUI into the business logic of the applications– as such, they prove to be very resilient in the presence of UI

changes● GUI tests are fragile and need frequent changes to

keep up with changes in the UI

Regression testing with TextTest

● TextTest is a tool for "behaviorbased" acceptance testing– it looks at behavior of the AUT as expressed by log files,

stdout and stderr– golden images of logs, stdout/err are compared with what is

obtained at run time● documentation on project's home page is plentiful, but

lacks howtos

TextTest – lessons learned

● to get the best mileage out of it if, plan your logging carefully, with different severity levels, and have different log files for different functionality areas of the app (a single log file tends to change too much and too rapidly)

twill functional Web tests

● twill is a “HTTP driver” program for scripting Web sessions, i.e. it’s a scriptable commandline browser.

● Titus is the primary author: inject grain o’ salt.● Layers a domainspecific language on top of a simple

set of Python functions, e.g.“go http://python.org/”Becomes“go(‘http://python.org’)”

http://python.org/

more twill

● twill scripts are simple to write.● twill tests are just twill scripts that shouldn’t ever fail.● Interactive browsing via the command line is pretty

cool ;).● Extending twill with Python is easy.

Automating twill tests

● Integrating twill into unit test framework is great: completely automated functional testing.

● However, setup/teardown is annoying: must start a full HTTP server on some port; run tests; then kill server.

● Nonetheless, should definitely do it as a “smoke test”.● Other drawbacks: multiple processes/threads makes

coverage tracking and performance profiling difficult.

Inprocess twill testing

● Can use wsgi_intercept module to reroute httplib (client) requests directly to a WSGI application object.

Standard HTTP client/server

Inprocess httplib>WSGI interception

Inprocess httplib>WSGI interception

def create_app(): return wsgi_app

twill.add_wsgi_intercept('localhost', 80, create_app)

Inprocess twill testing

● lets us avoid complexities of multithreaded/multiprocess apps

● This also lets us analyze coverage of tests & profile the app.

● (also works with all other Python Web testing frameworks)

n.b. long article on how to do this for CherryPy and Quixote at advogato.org.

Simple coverage testing

● Determines which lines of code were executed, or “covered”, by a particular test;

● Only free/OSS tool is coverage.py, by Gareth Rees and Ned Batchelder.

● Uses sys.settrace to record which lines of code were executed.

Coverage testing with twill (or really any functional) tests

● Can use coverage analysis to figure out which parts of your app aren’t hit by other tests;

● Usually quite easy to quickly run through that code in a new functional test.

Profiling

● Many profilers for Python.● Most are poorly documented.● Personally, I like statprof, a statistical profiler developed

by Andy Wingo.

Profiling with twill

● Using twill, you can automate a particular user path through your site;

● Then, do profiling analysis to figure out where the bottlenecks are;

● Lets you target specific user paths, which seems cool.

twill – lessons learned

● twill scripts are easier to write quickly than Python code;

● Possible to achieve high levels of coverage quite quickly by hacking together new twill functional tests with an eye on coverage;

● twill scripts can be reused:● Automated unit tests;● Setup/teardown for other tests;● Command line tests of functionality.

Web UI testing with Selenium

● functional/acceptance testing at the UI level for Web applications

● uses an actual browser driven via JavaScript● unique features

– clientside JavaScript testing (think AJAX)– browser compatibility testing: can be used crossplatform

and crossbrowser● Selenium framework needs to be deployed on the

same server that is running the AUT

Web UI testing with Selenium (cont.)

● individual tests and test suites written as HTML tables, similar to tests written in FIT/FitNesse

● tests consist in actions (open, type, select, click) and assertions

● Selenium IDE– record/playback tool, aims for fullfledged IDE– uses Mozilla chrome to get around JavaScript security

● Other useful tools– XPath Checker and XPather Firefox extensions

Web UI testing with Selenium (cont.)

● "Driven mode"– standalone Selenium server using Twisted + XMLRPC– reverse proxy gets around JavaScript crosssite scripting

security limitation– driven by scripts written in any language with XMLRPC

bindings● Selenium tests can be added to continuous integration

process– setup/teardown via twill– post results for reporting purposes

Selenium – lessons learned

● Selenium tests are fairly brittle (as all GUI tests are) in the presence of UI changes

● it's no great fun to write the tests, but Selenium IDE helps a lot

● Selenium is the only way known to mankind for testing AJAX

Agile documentation with doctest and epydoc

● doctest: "literate testing"/"executable documentation"● unit tests expressed as stories (big docstrings) that

offer context● “storytests”: stories together with acceptance criteria

that validate their implementation● many projects generate documentation from doctests

with minimal processing code (Django, Zope3)● epydoc makes it trivial to show the storytests

Agile documentation with doctest and epydoc (cont.)

● test list: set of unit tests for a given module● test map: set of unit test functions that exercise a given

application function/method● easy to generate automatically and integrate with

epydocLessons learned● unit test duplication is not necessarily a bad thing● when writing "agile documentation", we discovered

bugs that weren't caught by the initial unit tests

Continuous integration process

Continuous integration with buildbot

● continuous integration– automate timeconsuming tasks– important for the immediate public feedback it gives you

about changes you made– the more often you build and test your software, the quicker

you will discover and fix bugs● buildbot is based on Twisted

– pretty hard to configure, but works really well once you have it up and running

– can be deployed behind Apache for access control purposes

Continuous integration with buildbot (cont.)

● master process kicks off buildandtest process on configured slaves (via scheduler or email notifications)

● easy to extend master.cfg module with extensions for running various types of commands and tests

● you get the most value out of the process if you have slaves running on as many of the OSes/Python versions/etc. you plan to support


● run as many types of testing as possible with buildbot– unit tests– acceptance w. fitnesse & texttest– unit/functional with twill– UI with selenium– coverage and profiling– egg creation and installation (this exercises setup.py!)

● the more aspects of the application you cover, the better protected you are against breakages


Lessons learned● running unit/acceptance tests as separate user under

different environment will uncover all kinds of environmentspecific issues– OS versions– Python versions– required package versions– hardcoded paths– log locations

buildbot hacks

Goal: automate everything & stick it in Buildbot.

Problem: Selenium tests run inside a real browser…

Solution: VNC and FireFox.

buildbot hacks (cont.)

Goal: be able to do effective post mortems on failed tests.

Problem: not all output belongs on stdout.

Solution: record all output & dump in buildspecific directory.

Conclusions

● “Holistic” testing: test the application at all levels (no one test type does it all)– unit testing– functional/acceptance testing of business logic and UI– regression testing

● Make it as easy as humanly possible to run all these types of tests automatically via a continuous integration tool – or they will not get run!

MetaLesson Learned

[tests] DO OR DO NOT [pass] THERE IS NO TRY

Agile Master Yoda

agile development and testing in python -

Documents