Testing and Debugging (Debugging in Haskell)
Complements the previous sections
Original used with permission from: John Hughes
http://www.cs.chalmers.se/~rjmh/
Adapted by: Claudio Cesar de Sá
What’s the Difference?
Testing means trying out a program which is believed to work, in an effort to gain confidence that it is correct. If no errors are revealed by thorough testing, then, probably, relatively few errors remain.
Debugging means observing a program which is known not to work, in an effort to localise the error. When a bug is found and fixed by debugging, testing can be resumed to see if the program now works.
This lecture: describes recently developed tools to help with each activity.
Debugging
Here’s a program with a bug:
median xs = isort xs !! (length xs `div` 2)
isort = foldr insert []
insert x [] = [x]
insert x (y:ys) | x<y  = x:y:ys
                | x>=y = y:x:ys
Median> median [8,4,6,10,2,7,3,5,9,1]
2
Median> isort [8,4,6,10,2,7,3,5,9,1]
[1,8,4,6,10,2,7,3,5,9]
A test reveals median doesn’t work. We start trying functions median calls. isort doesn’t work either.
Debugging Tools
The Manual Approach
We choose cases to try, and manually explore the behaviour of the program, by calling functions with various (hopefully revealing) arguments, and inspecting their outputs.
The Automated Approach
We connect a debugger to the program, which lets us observe internal values, giving us more information to help us diagnose the bug.
The Haskell Object Observation Debugger
Provides a function
observe :: String -> a -> a
which collects observations of its second argument, tagged with the String, and returns the argument unchanged.
Think of it as connecting an oscilloscope to the program: the program's behaviour is unchanged, but we see more.
(You need to import the library which defines observe in order to use it: add
import Observe
at the start of your program.)
Listas> :l listas_haskell.hs
Reading file "listas_haskell.hs":
Reading file "/usr/share/hugs/lib/exts/Observe.lhs":
Reading file "listas_haskell.hs":
Hugs session for:
/usr/share/hugs/lib/Prelude.hs
/usr/share/hugs/lib/exts/Observe.lhs
listas_haskell.hs
Listas>
Making sure that Observe.lhs was loaded ...
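For readers without Hugs and its Observe library, the idea of a probe can be approximated with Debug.Trace from GHC's base library. The sketch below is not Hood: probe is a hypothetical helper that prints the whole value with show when the result is demanded, so it forces the value and cannot report which parts went unused.

```haskell
import Debug.Trace (trace)

-- Hypothetical stand-in for observe (not part of Hood): print the label and
-- the value to stderr when the result is demanded, then return it unchanged.
probe :: Show a => String -> a -> a
probe label x = trace (label ++ ": " ++ show x) x

-- The probed expression computes the same result as the unprobed one.
sumOfSquares :: Int
sumOfSquares = sum [probe "n*n" (n * n) | n <- [1 .. 4]]
```

Evaluating sumOfSquares in GHCi should print one line per probed value on stderr. The order of those lines depends on evaluation order, which is one reason a dedicated tool such as Hood is more pleasant to use.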
What Do Observations Look Like?
Median> sum [observe "n*n" (n*n) | n <- [1..4]]
30
>>>>>>> Observations <<<<<<
n*n 1 4 9 16
We add a "probe" to the program. The values observed are displayed, titled with the name of the observation.
Observing a List
Median> sum (observe "squares" [n*n | n <- [1..4]])
30
>>>>>>> Observations <<<<<<
squares (1 : 4 : 9 : 16 : [])
Now there is just one observation, the list itself. Observing the entire list lets us see the order of values also. Lists are always observed in "cons" form.
Observing a Pipeline
Median> (sum . observe "squares" . map (\x->x*x)) [1..4]
30
>>>>>>> Observations <<<<<<
squares (1 : 4 : 9 : 16 : [])
We can add observers to "pipelines" -- long compositions of functions -- to see the values flowing between them.
Observing Counting Occurrences
countOccurrences = map (\ws -> (head ws, length ws))
                 . observe "after groupby" . groupBy (==)
                 . observe "after sort" . sort
                 . observe "after words" . words
Add observations after each stage.
Observing Counting Occurrences
Main> countOccurrences "hello clouds hello sky"
[("clouds",1),("hello",2),("sky",1)]
>>>>>>> Observations <<<<<<
after groupby (("clouds" : []) : ("hello" : "hello" : []) : ("sky" : []) : [])
after sort ("clouds" : "hello" : "hello" : "sky" : [])
after words ("hello" : "clouds" : "hello" : "sky" : [])
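The three observations above correspond to the three intermediate values of the pipeline. As a minimal sketch that needs no observation library, the hypothetical helper stages below returns those intermediate values directly, using only Data.List from base.

```haskell
import Data.List (groupBy, sort)

-- Hypothetical helper: return every intermediate stage of the pipeline
-- (after words, after sort, after groupBy) instead of observing them.
stages :: String -> ([String], [String], [[String]])
stages s = (ws, sorted, grouped)
  where
    ws      = words s
    sorted  = sort ws
    grouped = groupBy (==) sorted

-- The final stage pairs each distinct word with its number of occurrences.
countOccurrences :: String -> [(String, Int)]
countOccurrences s = [(w, length g) | g@(w : _) <- grouped]
  where
    (_, _, grouped) = stages s
```

Unlike observe, this changes the shape of the program, which is exactly the kind of manual instrumentation that the probe approach avoids.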
Observing Consumers
An observation tells us not only what value flowed past the observer -- it also tells us how that value was used!
Main> take 3 (observe "xs" [1..10])
[1,2,3]
>>>>>>> Observations <<<<<<
xs (1 : 2 : 3 : _)
The _ is a "don't care" value -- certainly some list appeared here, but it was never used!
Observing Length
Main> length (observe "xs" (words "hello clouds"))
2
>>>>>>> Observations <<<<<<
xs (_ : _ : [])
The length function did not need to inspect the values of the elements, so they were not observed!
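The _ entries reflect lazy evaluation: parts of a value that no consumer demands are never computed. A small sketch that makes the same point without any observation library is to put undefined where the _ appeared. The names below are illustrative only.

```haskell
-- A list whose elements would crash the program if anything inspected them.
untouchable :: [Int]
untouchable = [undefined, undefined]

-- length walks the spine only, so the elements are never evaluated.
spineOnly :: Int
spineOnly = length untouchable

-- take 3 never looks past the third cons cell, so the undefined tail is harmless.
prefixOnly :: [Int]
prefixOnly = take 3 (1 : 2 : 3 : undefined)
```

By contrast, sum untouchable would raise an exception, because sum has to inspect every element.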
Observing Functions
We can even observe functions themselves!
Main> observe "sum" sum [1..5]
15
>>>>>>> Observations <<<<<<
sum { \ (1 : 2 : 3 : 4 : 5 : []) -> 15 }
observe "sum" sum is a function, which is applied to [1..5]. We see arguments and results, for the calls which actually were made!
Observing foldr
Recall that : foldr (+) 0 [1..4] = 1 + (2 + (3 + (4 + 0)))
Let’s check this, by observing the addition function.
Main> foldr (observe "+" (+)) 0 [1..4]
10
>>>>>>> Observations <<<<<<
+ { \ 4 0 -> 4 , \ 3 4 -> 7 , \ 2 7 -> 9 , \ 1 9 -> 10 }
Observing foldl
We can do the same thing to observe foldl, which behaves as
foldl (+) 0 [1..4] = (((0 + 1) + 2) + 3) + 4
Main> foldl (observe "+" (+)) 0 [1..4]
10
>>>>>>> Observations <<<<<<
+ { \ 0 1 -> 1 , \ 1 2 -> 3 , \ 3 3 -> 6 , \ 6 4 -> 10 }
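The two call logs can also be reproduced in plain Haskell by threading a log through the fold. This is only a sketch: foldrCalls and foldlCalls are hypothetical helpers, and they record each call as (first argument, second argument, result) rather than observing it as Hood does.

```haskell
-- Hypothetical helpers: fold as usual, but also log every call of f as
-- (first argument, second argument, result), innermost call first.
foldrCalls :: (a -> b -> b) -> b -> [a] -> (b, [(a, b, b)])
foldrCalls f z = foldr step (z, [])
  where
    step x (acc, calls) = let r = f x acc in (r, calls ++ [(x, acc, r)])

foldlCalls :: (b -> a -> b) -> b -> [a] -> (b, [(b, a, b)])
foldlCalls f z = foldl step (z, [])
  where
    step (acc, calls) x = let r = f acc x in (r, calls ++ [(acc, x, r)])
```

For (+), 0 and [1..4], foldrCalls logs the calls 4 0, 3 4, 2 7, 1 9 and foldlCalls logs 0 1, 1 2, 3 3, 6 4, matching the two observations above.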
How Many Elements Does takeWhile Check?
takeWhile isAlpha "hello clouds hello sky" == "hello"
takeWhile isAlpha selects the alphabetic characters from the front of the list.
How many times does takeWhile call isAlpha?
How Many Elements Does takeWhile Check?
Main> takeWhile (observe "isAlpha" isAlpha) "hello clouds hello sky"
"hello"
>>>>>>> Observations <<<<<<
isAlpha { \ ' ' -> False , \ 'o' -> True , \ 'l' -> True , \ 'l' -> True , \ 'e' -> True , \ 'h' -> True }
takeWhile calls isAlpha six times -- the last call tells us it’s time to stop.
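The count of six can be cross-checked by reasoning rather than observation: takeWhile tests every element it keeps, plus the first element that fails, if there is one. A minimal sketch, where callsNeeded is a hypothetical helper and not part of any library:

```haskell
import Data.Char (isAlpha)

-- Hypothetical helper: how many elements takeWhile p has to test on xs.
callsNeeded :: (a -> Bool) -> [a] -> Int
callsNeeded p xs = case span p xs of
  (kept, [])    -> length kept      -- ran off the end: no failing call
  (kept, _ : _) -> length kept + 1  -- one extra call that returned False

-- The example from the slide: five letters kept, then the space fails.
helloCalls :: Int
helloCalls = callsNeeded isAlpha "hello clouds hello sky"
```

This predicts the number of calls from the input alone; the observation confirms that the library function really behaves that way.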
Observing Recursion
fac 0 = 1
fac n | n>0 = n * fac (n-1)
Main> observe "fac" fac 6
720
>>>>>>> Observations <<<<<<
fac { \ 6 -> 720 }
We observe this use of the function. We did not observe the recursive calls!
Observing Recursion
fac = observe "fac" fac'
fac' 0 = 1
fac' n | n>0 = n * fac (n-1)
Main> fac 6
720
>>>>>>> Observations <<<<<<
fac { \ 6 -> 720 , \ 5 -> 120 , \ 4 -> 24 , \ 3 -> 6 , \ 2 -> 2 , \ 1 -> 1 , \ 0 -> 1 }
We observe all calls of the fac function.
Debugging median
median xs = observe "isort xs" (isort xs) !! (length xs `div` 2)
Main> median [4,2,3,5,1]
2
>>>>>>> Observations <<<<<<
isort xs (1 : 4 : 2 : 3 : 5 : [])
Wrong answer: the median is 3. Wrong (unsorted) result from isort.
Debugging isort
isort :: Ord a => [a] -> [a]
isort = foldr (observe "insert" insert) []
Main> median [4,2,3,5,1]
2
>>>>>>> Observations <<<<<<
insert { \ 1 [] -> 1 : []
       , \ 5 (1 : []) -> 1 : 5 : []
       , \ 3 (1 : 5 : []) -> 1 : 3 : 5 : []
       , \ 2 (1 : 3 : 5 : []) -> 1 : 2 : 3 : 5 : []
       , \ 4 (1 : 2 : 3 : 5 : []) -> 1 : 4 : 2 : 3 : 5 : [] }
All well, except for the last case, where inserting 4 gives an unsorted list.
Debugging insert
insert x [] = [x]
insert x (y:ys) | x<y  = observe "x<y" (x:y:ys)
                | x>=y = observe "x>=y" (y:x:ys)
Main> median [4,2,3,5,1]
2
>>>>>>> Observations <<<<<<
x>=y (1 : 5 : [])
     (1 : 3 : 5 : [])
     (1 : 2 : 3 : 5 : [])
     (1 : 4 : 2 : 3 : 5 : [])
Observe the results from each case. Only the second case was used!
The Bug!
I forgot the recursive call…
insert x [] = [x]
insert x (y:ys) | x<y  = x:y:ys
                | x>=y = y:insert x ys
Main> median [4,2,3,5,1]
3
The right answer. Bug fixed!
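To compare the two versions side by side, the sketch below keeps both under different names. insertBuggy, insertFixed and medianWith are hypothetical names introduced here, and the second guard is written as otherwise, which is equivalent to x>=y for the totally ordered types used in the examples.

```haskell
-- The buggy insert from the slides: the second guard forgets to recurse.
insertBuggy :: Ord a => a -> [a] -> [a]
insertBuggy x [] = [x]
insertBuggy x (y : ys)
  | x < y     = x : y : ys
  | otherwise = y : x : ys

-- The corrected insert: keep walking down the list until x fits.
insertFixed :: Ord a => a -> [a] -> [a]
insertFixed x [] = [x]
insertFixed x (y : ys)
  | x < y     = x : y : ys
  | otherwise = y : insertFixed x ys

-- median parameterised over the insert function, so both can be compared.
medianWith :: (Int -> [Int] -> [Int]) -> [Int] -> Int
medianWith ins xs = foldr ins [] xs !! (length xs `div` 2)
```

With the list from the slides, medianWith insertBuggy [4,2,3,5,1] gives 2 and medianWith insertFixed [4,2,3,5,1] gives 3.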
Summary
➢ The observe function provides us with a wealth of information about how programs are evaluated, with only small changes to the programs themselves
➢ That information can help us understand how programs work (foldr, foldl, takeWhile etc.)
➢ It can also help us see where bugs are.
Testing
Testing means trying out a program which is believed to work, in an effort to gain confidence that it is correct.
Testing accounts for more than half the development effort on a large project (I’ve heard figures anywhere from 50% to 80%).
Fixing a bug in one place often causes a failure somewhere else -- so the entire system must be retested after each change. At Ericsson, this can take three months!
"Hacking" vs Systematic Testing
"Hacking"
•Try some examples until the software seems to work.
Systematic testing
•Record test cases, so that tests can be repeated after a modification (regression testing).
•Document what has been tested.
•Establish criteria for when a test is successful -- requires a specification.
•Automate testing as far as possible, so you can test extensively and often.
QuickCheck: A Tool for Testing Haskell Programs
Based on formulating properties, which
•can be tested repeatedly and automatically
•document what has been tested
•define what is a successful outcome
•are a good starting point for proofs of correctness
Properties are tested by selecting test cases at random!
Random Testing?
Is random testing sensible? Surely carefully chosen test cases are more effective?
"By taking 20% more points in a random test, any advantage a partition test might have had is wiped out."
D. Hamlet
✔ QuickCheck can generate 100 random test cases in less time than it takes you to think of one!
✔ Random testing finds common (i.e. important!) errors effectively.
A Simple QuickCheck Property
prop_Sort :: [Int] -> Bool
prop_Sort xs = ordered (sort xs)
Check that the result of sort is ordered. Random values for xs are generated.
Main> quickCheck prop_Sort
OK, passed 100 tests.
The tests were passed.
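The slides use ordered without showing its definition. The sketch below gives one possible definition (an assumption, not taken from the slides) and checks prop_Sort over every small list instead of over random ones, so that it runs with base alone. smallLists and allPass are hypothetical names.

```haskell
import Data.List (sort)

-- One possible definition of ordered (an assumption: the slides omit it).
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

prop_Sort :: [Int] -> Bool
prop_Sort xs = ordered (sort xs)

-- Stand-in for quickCheck: every list of length 0..4 over the values 0..2.
smallLists :: [[Int]]
smallLists = concatMap (\n -> sequence (replicate n [0 .. 2])) [0 .. 4]

allPass :: Bool
allPass = all prop_Sort smallLists
```

With the QuickCheck package installed (module Test.QuickCheck in current releases), quickCheck prop_Sort runs the same property on randomly generated lists instead.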
Some QuickCheck Details
import QuickCheck
prop_Sort :: [Int] -> Bool
prop_Sort xs = ordered (sort xs)
Main> quickCheck prop_Sort
OK, passed 100 tests.
We must import the QuickCheck library.
The type of a property must not be polymorphic.
quickCheck is an (overloaded) higher order function!
We give properties names beginning with "prop_" so we can easily find and test all the properties in a module.
A Property of insert
prop_Insert :: Int -> [Int] -> Bool
prop_Insert x xs = ordered (insert x xs)
Main> quickCheck prop_Insert
Falsifiable, after 4 tests:
-2
[5,-2,-5]
Whoops! This list isn’t ordered!
A Corrected Property of insert
prop_Insert :: Int -> [Int] -> Property
prop_Insert x xs = ordered xs ==> ordered (insert x xs)
Main> quickCheck prop_Insert
OK, passed 100 tests.
Result is no longer a simple Bool.
Read ==> as "implies": if xs is ordered, then so is (insert x xs). It discards test cases which are not ordered.
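The effect of a precondition can be sketched without QuickCheck: a test case either fails the precondition and is discarded, or it counts as a pass or a failure. Outcome, whenever, propInsert and smallLists are hypothetical names, insert here is the library function from Data.List, and the cases are enumerated rather than random.

```haskell
import Data.List (insert)

-- What can happen to a single test case of a conditional property.
data Outcome = Discarded | Passed | Failed
  deriving (Eq, Show)

-- Hypothetical counterpart of ==> for plain Booleans.
whenever :: Bool -> Bool -> Outcome
whenever pre post
  | not pre   = Discarded
  | post      = Passed
  | otherwise = Failed

-- One possible definition of ordered (an assumption: the slides omit it).
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

propInsert :: Int -> [Int] -> Outcome
propInsert x xs = ordered xs `whenever` ordered (insert x xs)

-- Every list of length 0..3 over the values 0..2.
smallLists :: [[Int]]
smallLists = concatMap (\n -> sequence (replicate n [0 .. 2])) [0 .. 3]

outcomes :: [Outcome]
outcomes = [propInsert x xs | x <- [0 .. 2], xs <- smallLists]
```

Half of these 120 cases are discarded because the input list is not ordered, and none of the remaining ones fail.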
Using QuickCheck to Develop Fast Queue Operations
What we’re going to do:
•Explain what a queue is, and give slow implementations of the queue operations, to act as a specification.
•Explain the idea behind the fast implementation.
•Formulate properties that say the fast implementation is "correct".
•Test them with QuickCheck.
What is a Queue?
[Figure: a queue. Elements join at the back and leave from the front.]
Examples
• Files to print
• Processes to run
• Tasks to perform
What is a Queue?
A queue contains a sequence of values. We can add elements at the back, and remove elements from the front.
We’ll implement the following operations:
•empty :: Queue a -- an empty queue
•isEmpty :: Queue a -> Bool -- tests if a queue is empty
•add :: a -> Queue a -> Queue a -- adds an element at the back
•front :: Queue a -> a -- the element at the front
•remove :: Queue a -> Queue a -- removes an element from the front
The Specification: Slow but Simple
type Queue a = [a]
empty = []
isEmpty q = q==empty
add x q = q++[x]
front (x:q) = x
remove (x:q) = q
Addition takes time depending on the number of items in the queue!
The Idea: Store the Front and Back Separately
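[Figure: Old representation: a single list a b c d e f g h i j. Fast to remove, slow to add. New representation: a front list a b c d e and a reversed back list j i h g f. Fast to add, fast to remove.]
Periodically move the back to the front.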
The Fast Implementation
type Queue a = ([a],[a])
flipQ ([],b) = (reverse b,[])
flipQ (x:f,b) = (x:f,b)
emptyQ = ([],[])
isEmptyQ q = q==emptyQ
addQ x (f,b) = (f,x:b)
removeQ (x:f,b) = flipQ (f,b)
frontQ (x:f,b) = x
Make sure the front is never empty when the back is not.
Relating the Two Implementations
What list does a "double-ended" queue represent?
retrieve :: Queue a -> [a]
retrieve (f, b) = f ++ reverse b
What does it mean to be correct?
✔ retrieve emptyQ == empty
✔ isEmptyQ q == isEmpty (retrieve q)
✔ retrieve (addQ x q) == add x (retrieve q)
✔ retrieve (removeQ q) == remove (retrieve q)
and so on.
Using Retrieve Guarantees Consistent Results
Example
frontQ (removeQ (addQ 1 (addQ 2 emptyQ)))
== front (retrieve (removeQ (addQ 1 (addQ 2 emptyQ))))
== front (remove (retrieve (addQ 1 (addQ 2 emptyQ))))
== front (remove (add 1 (retrieve (addQ 2 emptyQ))))
== front (remove (add 1 (add 2 (retrieve emptyQ))))
== front (remove (add 1 (add 2 empty)))
QuickChecking Properties
prop_Remove :: Queue Int -> Bool
prop_Remove q = retrieve (removeQ q) == remove (retrieve q)
Main> quickCheck prop_Remove
4
Program error: {removeQ ([],sized_v1740 (instArbitrary_v1…
Removing from an empty queue!
Correcting the Property
prop_Remove :: Queue Int -> Property
prop_Remove q = not (isEmptyQ q) ==>
    retrieve (removeQ q) == remove (retrieve q)
Main> quickCheck prop_Remove
0
Program error: {removeQ ([],[Arbitrary_arbitrary instArbitrary…
How can this be?
Making Assumptions Explicit
We assumed that the front of a queue will never be empty if the back contains elements!
Let’s make that explicit:
goodQ :: Queue a -> Bool
goodQ ([],[]) = True
goodQ (x:f,b) = True
goodQ ([],x:b) = False
prop_Remove q = not (isEmptyQ q) && goodQ q ==>
    retrieve (removeQ q) == remove (retrieve q)
NOW IT WORKS!
How Do We Know Only Good Queues Arise?
Queues are built by add and remove:
addQ x (f,b) = (f,x:b)
removeQ (x:f,b) = flipQ (f,b)
New properties:
prop_AddGood x q = goodQ q ==> goodQ (addQ x q)
prop_RemoveGood q =
    not (isEmptyQ q) && goodQ q ==> goodQ (removeQ q)
Whoops!
Main> quickCheck prop_AddGood
Falsifiable, after 0 tests:
2
([],[])
addQ x (f,b) = (f,x:b)
removeQ (x:f,b) = flipQ (f,b)
See the bug? Adding to an empty queue leaves the front empty while the back is not.
The fix: addQ must also call flipQ.
addQ x (f,b) = flipQ (f,x:b)
removeQ (x:f,b) = flipQ (f,b)
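Putting the pieces together, here is a self-contained sketch of the fast queue with the corrected addQ. The type signatures, the explicit error clauses and the helper fromListQ are additions made for this sketch and are not taken from the slides.

```haskell
type Queue a = ([a], [a])

-- Restore the invariant: the front is empty only if the back is empty too.
flipQ :: Queue a -> Queue a
flipQ ([], b) = (reverse b, [])
flipQ q       = q

emptyQ :: Queue a
emptyQ = ([], [])

-- Corrected version: flipQ is applied after adding at the back.
addQ :: a -> Queue a -> Queue a
addQ x (f, b) = flipQ (f, x : b)

removeQ :: Queue a -> Queue a
removeQ (_ : f, b) = flipQ (f, b)
removeQ ([], _)    = error "removeQ: empty queue"  -- added for this sketch

frontQ :: Queue a -> a
frontQ (x : _, _) = x
frontQ ([], _)    = error "frontQ: empty queue"    -- added for this sketch

-- The list that a queue represents.
retrieve :: Queue a -> [a]
retrieve (f, b) = f ++ reverse b

goodQ :: Queue a -> Bool
goodQ ([], _ : _) = False
goodQ _           = True

-- Hypothetical helper: build a queue using only the public operations.
fromListQ :: [a] -> Queue a
fromListQ = foldl (flip addQ) emptyQ
```

For example, retrieve (fromListQ [1,2,3]) gives back [1,2,3], and every queue built this way satisfies goodQ.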
Looking Back
•Formulating properties let us define precisely how the fast queue operations should behave.
•Using QuickCheck found a bug, and revealed hidden assumptions which are now explicitly stated.
•The property definitions remain in the program, documenting exactly what testing found to hold, and providing a ready made test-bed for any future versions of the Queue library.
•We were forced to reason much more carefully about the program’s correctness, and can have much greater confidence that it really works.
Summary
• Testing is a major part of any serious software development.
• Testing should be systematic, documented, and repeatable.
• Automated tools can help a lot.
• QuickCheck is a state-of-the-art testing tool for Haskell.
The remaining slides discuss an important subtlety when using QuickCheck.
Testing the Buggy insert
prop_Insert :: Int -> [Int] -> Property
prop_Insert x xs = ordered xs ==> ordered (insert x xs)
Main> quickCheck prop_Insert
Falsifiable, after 51 tests:
5
[-3,4]
Yields [-3,5,4]. Why so many tests?
Observing Test Data
prop_Insert :: Int -> [Int] -> Property
prop_Insert x xs = ordered xs ==>
    collect (length xs) (ordered (insert x xs))
Main> quickCheck prop_Insert
OK, passed 100 tests.
43% 0.
37% 1.
11% 2.
8% 3.
1% 4.
Collect values during testing. The output shows the distribution of length xs. Random lists which happen to be ordered are likely to be short!
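One way to see why ordered random lists tend to be short: of all the arrangements of n distinct values, exactly one is in ascending order, so the share of ordered lists shrinks like 1/n!. The sketch below counts this directly. orderedShare is a hypothetical helper, and since random test lists may contain duplicates the figures are only indicative.

```haskell
import Data.List (permutations)

-- One possible definition of ordered (an assumption: the slides omit it).
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- (number of ordered arrangements, number of arrangements) of n distinct values.
orderedShare :: Int -> (Int, Int)
orderedShare n = (length (filter ordered ps), length ps)
  where
    ps = permutations [1 .. n]
```

For n = 4 this gives (1,24): only about 4% of four-element arrangements would survive the precondition.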
A Better Property
prop_Insert :: Int -> Property
prop_Insert x = forAll orderedList
    (\xs -> collect (length xs) (ordered (insert x xs)))
Main> quickCheck prop_Insert2
OK, passed 100 tests.
22% 2.
15% 0.
14% 1.
8% 6.
8% 5.
8% 4.
8% 3.
4% 8.
3% 9.
3% 11.
2% 12.
2% 10.
1% 7.
1% 30.
1% 13.
Read this as: ∀xs ∈ orderedList. …
What is forAll?
A higher order function!
forAll :: (Show a, Testable b) => Gen a -> (a -> b) -> Property
Gen a: a generator for test data of type a.
(a -> b): a function which, given a generated a, produces a testable result.
forAll orderedList
    (\xs -> collect (length xs) (ordered (insert x xs)))
What is orderedList?
A test data generator:
orderedList :: Gen [Int]
Gen a, a "generator for" a, behaves like IO a, "a command producing a".
Some primitive generators
arbitrary :: Arbitrary a => Gen a
oneof :: [Gen a] -> Gen a
frequency :: [(Int,Gen a)] -> Gen a
Defining orderedList
orderedList :: Gen [Int]
orderedList = do n <- arbitrary
                 listFrom n
  where listFrom n = frequency [(1, return []),
                                (4, do m <- arbitrary
                                       ns <- listFrom (n+abs m)
                                       return (n:ns))]
We can use the do syntax to write generators, like IO, but we cannot mix Gen and IO!
Choose an n, and make a list of elements >= n.
20% of the time, just stop.
80% of the time, choose a number >= n, generate another list ns of elements >= it, and return n:ns.
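The shape of the lists that listFrom builds can be illustrated without randomness. In the sketch below, orderedFrom is a hypothetical, deterministic counterpart: n is the starting value and each m in ms stands for one value drawn by m <- arbitrary before the generator continues. It is not a Gen and involves no QuickCheck code.

```haskell
-- One possible definition of ordered (an assumption: the slides omit it).
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- Hypothetical, deterministic counterpart of listFrom. Each step emits the
-- current value and moves on to current + abs m, so the values never decrease.
-- The value reached after the last step is not emitted, because that is
-- where listFrom would choose to stop.
orderedFrom :: Int -> [Int] -> [Int]
orderedFrom n ms = init (scanl (\k m -> k + abs m) n ms)

example :: [Int]
example = orderedFrom 3 [2, -5, 0]
```

Here example is [3,5,10]: the increments are made non-negative with abs, which is what guarantees that the list is ordered by construction.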