Property-based Testing

Property-based Testing

Lets look at QuickCheck

Ported to 40 other languages

PBT used in basically all kinds of software…


















Plan

  1. Property-based Testing

  2. Random Test Generation

  3. Case-Study












Property-based Testing

Don’t (only) write individual unit-tests

  • only check particular input-output behavior of code

Do write properties desired of the functions

  • generate random inputs
  • run function
  • verify (or rather, try to falsify) the property
















Testing with Specifications

PBT emphasizes the importance of specifying correct behavior

  • Makes you think about what the code should do,

  • Finds corner-cases where the specification is violated (so code or spec are fixed)

  • Makes specs live on as machine-checked documentation.
















Properties

A Property is a function that returns a Bool

Looks like a theorem that the programmer believes is true.

  • By convention, we write the prefix "prop_"














QUIZ

Consider the property prop_revapp

It says forall xs, ys. reverse (xs ++ ys) == reverse xs ++ reverse ys.

Is the property true?

A. Yes

B. No












Question

Why specify type as [Int] -> [Int] -> Bool?

  • Why not write [a] -> [a] -> Bool?












Checking Properties

QC uses the types to generate random inputs

  • For that, it needs a specific type like [Int]

  • Cannot generate an input of type [a]

Let’s quickCheck it!

Huh? QC gives us a counterexample with xs = [0] and ys == [1]

but
















EXERCISE

Can you modify prop_revapp so that it is sensible and true ?

When you are done, we should see

We can run more tests by specifying that as a parameter

Followed by











QuickSort

Here’s a simple sorting function (quickSort)

Seems to work?










QuickCheck QuickSort

Lets check the output of qsort is ordered

and use it to write a property

which we can check












QUIZ

Lets check that the first element of the output is the smallest

What is the result of

A. Pass 100 tests

B. Fail















Properties and Assumptions

Oops!

prop_qsort_min is not true for all Int lists

  • Property only makes sense if xs is not empty

Writing specifications clarifies the assumptions

  • under which a given piece of code is supposed to work.













Conditional Properties

Lets modify prop_qsort_min so equality holds if input is non-null

Instead of Bool the function’s output is Property

  • a special type built into the QC library

The implies operator ==> is one of many

  • that allow the construction of rich properties.












QUIZ

Lets test that our qsort is identical to a trusted reference implementation

  • may be too slow to deploy but ok to use for checking correctness

Lets use the standard library’s Data.List.sort function

  • (Much faster than ours… but just for illustration!)

What is the result of

A. OK after 100 tests

B. Failed! Falsifiable...















Lets Check

Oops? What?

Ugh! So close, and yet … Can you spot the bug in our code?












Specifying No-Duplicates

We assumed that the input has no duplicates

  • Values equal to x are thrown out from ls and rs

Is this a bug? Maybe? Maybe not?

  • But at least its something we should be aware of!

Lets specify that a list has no-duplicates










Specifying a Post-Condition

We can now check that qsort outputs a list with no-duplicates











Specifying a Pre-Condition

Also, qsort is identical to sort on inputs with no duplicates


















Plan

  1. Property-based Testing
    • Properties are boolean-functions
    • Generate inputs, run function, check if result is False
  2. Random Test Generation

  3. Case-Study












Plan

  1. Property-based Testing
    • Properties are boolean-functions
    • Generate inputs, run function, check if result is False
  2. Test Generation

  3. Case-Study
















Test Generation

Lets notice something about quickCheck

If you run it once …

and if you run it again …

The falsifying tests are different!

How is this possible?
















Generators

QC defines a special Generator data type

A Gen a is a function that takes as input

  • a random number generator StdGen
  • a “seed” Int

and returns as output

  • a value of type a












Creating Generators

There are some functions to create generators, e.g.

which

  • takes a pair of (lo, hi)

  • returns a random generator for values between lo and hi












Running Generators

To execute the Gen we need access to the system’s “randomness”

Done via an IO “recipe”

Which

  • takes a Generator of a values and
  • returns a recipe that produces a list of (10) a values

We can run the recipe in ghci











EXERCISE

Lets write a function that returns a Generator over a list of elements

So elements [x0,x1,...,xn] returns a Generator that randomly produces values from x0, x1, … xn.

















PROBLEM: How to combine Generators?

Suppose I have a generator of positive Int

How can I create a generator of a pair of positive Ints?















Generator is a Monad!

  • You can see details here

  • This will let us combine generators (like combining parsers…)














QUIZ

Which of the below implements posPair :: Gen (Int, Int) ?

  • given pos :: Gen Int and sample :: (Int, Int) -> Gen Int






















EXERCISE

Lets write a function that mixes a list of Generators

oneOf [g0,g1,...,gn] should be a generator that

  • randomly selects one of g0,…gn

  • randomly generates a value from the chosen generator










oneOf is generalized into the frequency combinator

which builds weighted mixtures of individual Generators















Types that can be Generated

QC has a class for types whose values can be randomly Generated

T is an instance of Arbitrary if there is Gen T function

  • i.e. there is a generator of T values!

Many standard types have Arbitrary instances

  • Users write their own instances when testing their own types











Plan

  1. Property-based Testing
    • Properties are Boolean-functions
    • Generate inputs, run function, check if result is False
  2. Test Generation
    • Gen a is a monadic generator of a values
    • Arbitrary is a class for types with generators
  3. Case-Study











Case Study: Compiler Optimizations

Lets use QC to test compiler optimizations

  • Learn how to generate structure data (programs)

  • Learn how to specify fancy properties (equivalence)

Using the WHILE language from your HW assignment.
















WHILE: Syntax

Recall the definition of Variable and Value

which we used to define Expression

and Statement














WHILE: Semantics

Next, we defined the behavior of programs as

  • functions from starting state to final state

  • where state was defined as a map from Variable to Value

and then you wrote a function

(We can skip the details of evalS because you wrote it… right?)















Generating WHILE Programs

Lets write a program generator














Generating Variables

and we get






Generating Values

and we get






Generating Expressions

which gives us






Generating States

QC already has an way to automatically generate Maps

So for free we get a generator for WState






Specification: Program Equivalence

Next, lets specify the properties

Let p1 and p2 be two While programs.

Program p1 is equivalent to p2 written p1 === p2 if

That is, for all input states st

  • executing p1 from st and

  • executing p2 from st

produce the same state.

QUIZ

For example, suppose that

then what do the following two queries return?

A. FAIL, FAIL

B. FAIL, PASS

C. PASS, FAIL

D. PASS, PASS












Checking Compiler Optimizations

A compiler optimization can be viewed as a pair of programs

  • input written by human p_in
  • output optimized by compiler p_out

An optimization is correct if p_in === p_out

  • Compiler should not change behavior of the code











Checking an Optimization: Zero-Add-Elimination

Here’s a simple optimization

In Out
X := E + 0 X := E
X := E - 0 X := E

Should be correct because adding 0 doesn’t change anything…

We can write this as a QC property

So what does QC say?















What’s that? Lets see is W := True equivalent to W := True + 0 ?

Forgot about those pesky boolean expressions!

If you recall W := True will just assign True to W

but W := True + 0 will have a “type error” so will assign 0 to W!














Fix: Restrict Optimizations to Int Expressions

Problem was expressions like True

  • Caused strange behaviors due to “type errors”

Lets restrict to only integer-valued expressions

Now we can restrict the property to













QUIZ

Consider the property

What will be the result of

A. PASS

B. FAIL














Oops, the counterexample is Z := G and Z := G + 0

  • but now the starting state maps (G, True)

Pesky Bool sneaked right back in …

Moral: Even simple optimizations are really tricky without types!

Try at home Can you fix this property so it passes?











Checking an Optimization: Constant-Folding-ish

Lets try another optimization that doesn’t use any arithmetic

  • So should not have any problems with Int-vs-Bool

Suppose you have two back-to-back assignments

Why recompute E? Result is already stored in X! So optimize the above to

We can specify this transformation as the QC property

Mighty QC, do you agree ?















Shrinking Tests

The property fails … but the counterexample is very long!

QC has a test shrinking mechanism … also in the Arbitrary class

shrink t

  • takes as input a candidate t and
  • returns as output a list of smaller candidates

That QC will systematically search with to find a minimally failing test!

We should just keep shrinking each Expression to its sub-Expressions.

```haskell >>> quickCheck prop_const_prop *** Failed! Falsifiable (after 26 tests and 4 shrinks):
D U A + D fromList [(D,-638),(G,256),(H,False),(K,False),(O,True),(R,True),(S,-81),(T,926)] ~~~~~

Aha! Consider the two programs

and

are they equivalent? Pretty subtle, eh.












Recap: Property-Based Testing

  1. Property-based Testing
    • Properties are Boolean-functions
    • Generate inputs, run function, check if result is False
  2. Test Generation
    • Gen a is a monadic generator of a values
    • Arbitrary is a class for types with generators
  3. Case-Study
    • Compiler optimizations are very tricky
    • QC-inspired methods have found serious bugs in many compilers & databases & …












Recap: Property-Based Testing

QuickCheck is awesome!

  • Simple: typeclasses + monads

  • Useful: Can find subtle bugs or inconsistencies in your code.

Lots of literature on QC and techniques it has inspired

  • Can even use QC to generate test data systems in other languages