Black and White Box Tests

Inspired somewhat by Matt Might’s “reply-to-all” approach to (academic) blogging, and assuming that what I keep here could be considered a blog, here’s my response to a student question from a COMP1927 class.

Thanks to Curtis Millar for several intelligent and thoughtful comments that improved this write-up immensely.

“I am a bit confused at how to write white box tests, as all I can find on it are definitions of what white box testing is, with no examples. I consulted my tutor, who said white box testing and black box testing were essentially the same […]”

Oh, good gods, no.

There are two halves to the testing story.

Let’s say we have this interface.

// Bag.h
#include <stdbool.h> // for bool
#include <stddef.h>  // for size_t

typedef struct bag *Bag;
Bag bag_new (void); // create a new Bag
void bag_delete (Bag); // destroy a Bag
void bag_insert (Bag, int item); // add an item to the Bag
size_t bag_size (Bag); // get the Bag size
bool bag_contains (Bag, int item); // does Bag contain `item`?

Black-box testing, as you know, follows the shape of the specification and its interface to determine whether an implementation conforms to it.

To write and perform black-box tests, it isn’t necessary to know how a particular implementation ‘does things’ – the layout of structures, the inner workings of helper functions, etc. It can, and should, be treated as a “black box”; hence the name.

For our Bag, we may like to write tests for the bag_size function that look like this:

// in BagTest.c
void test_bag_size (void) {
    Bag b = bag_new();
    assert (bag_size (b) == 0);

    for (int i = 1; i <= 10; i++) {
        bag_insert (b, i);
        assert (bag_size (b) == (size_t)i);
    }

    assert (bag_size (b) == 10);
    bag_delete (b); // don't leak the Bag we created
}

It’s worth noting that we’re testing here for correct behaviour. It’s slightly harder to deal with testing for incorrect behaviour, unless you’ve defined various boundary behaviours to use for robustness testing – and in any case, that becomes part of the spec, and can and should be tested anyway.
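As a rough sketch of what another correct-behaviour test might look like, here's one for bag_contains. The test itself only touches the interface; the array-backed implementation underneath is a hypothetical stand-in of mine so the example compiles on its own, and any conforming implementation could replace it. I'm also assuming the Bag counts duplicates, which the interface above doesn't actually say.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// the interface from Bag.h, repeated so this sketch stands alone
typedef struct bag *Bag;
Bag bag_new (void);
void bag_delete (Bag);
void bag_insert (Bag, int item);
size_t bag_size (Bag);
bool bag_contains (Bag, int item);

// black-box test: uses nothing but the interface above
void test_bag_contains (void) {
    Bag b = bag_new ();
    assert (!bag_contains (b, 42)); // an empty Bag contains nothing

    bag_insert (b, 42);
    assert (bag_contains (b, 42));
    assert (!bag_contains (b, 7));  // never inserted

    bag_insert (b, 42);             // ASSUMPTION: duplicates are counted
    assert (bag_size (b) == 2);
    assert (bag_contains (b, 42));

    bag_delete (b);
}

// hypothetical stand-in implementation: a fixed-capacity array
#define STAND_IN_CAPACITY 16
struct bag {
    int items[STAND_IN_CAPACITY];
    size_t size;
};

Bag bag_new (void) {
    Bag b = calloc (1, sizeof *b); // zeroed, so size starts at 0
    assert (b != NULL);
    return b;
}

void bag_delete (Bag b) { free (b); }

void bag_insert (Bag b, int item) {
    assert (b->size < STAND_IN_CAPACITY);
    b->items[b->size++] = item;
}

size_t bag_size (Bag b) { return b->size; }

bool bag_contains (Bag b, int item) {
    for (size_t i = 0; i < b->size; i++) {
        if (b->items[i] == item) return true;
    }
    return false;
}
```

If the spec said a Bag ignores duplicates instead, only the `bag_size (b) == 2` line would change; that's the sense in which boundary behaviour becomes part of the spec.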

White-box testing, on the other hand, is highly subjective (and much harder to automate because of that). It’s testing with awareness of the inner structure of structures, functions, etc., and there are many approaches to it. At our level, though, here are two you’ll want to use.

The first approach, and I feel the easiest to implement, is to follow a similar tack to how you would write black-box tests: you set up some input conditions, call a function, then check the output matches your expectations.

Instead of, as you would have done with black-box testing, examining isolated return values as outputs from the purposely small ‘surface area’ exposed by the interface functions, with white-box testing you must examine the state of your ADT. Every piece of data you meaningfully store, every byte referenced by or held in your type’s structure becomes fair game for inspection.

Such tests should, of course, sit in one or more functions, to keep your code clean, and by necessity they must live in your implementation file, because that is the only place the inside of the structure is visible.

Let’s assume our Bag is implemented as a linked-list stack. There are, of course, other implementations you could use here: arrays, queues, trees, heaps, etc. etc. etc.

// in Bag.c
struct bag {
    struct bag_node *top;
    size_t size;
};
struct bag_node {
    int item;
    struct bag_node *next;
};

#define for_each_bag_node(first,this) \
    for (struct bag_node *this = first; \
        this != NULL; this = this->next)

// ... some actual implementation elided...

How would we go about writing tests, then? We have to be aware of the shape of our implementation, of course.

// still in Bag.c
static void wbt_bag_size (void); // declared before its first use

void white_box_tests (void) {
    wbt_bag_size ();
    // and other tests
}

static void wbt_bag_size (void) {
    Bag b = bag_new ();
    assert (b->size == 0);
    assert (b->top == NULL);
    assert (b->size == bag_size (b));

    for (int i = 1; i <= 10; i++) {
        bag_insert (b, i);
        assert (b->size == (size_t)i);
        assert (b->top != NULL);

        size_t node_count = 0;
        for_each_bag_node (b->top, curr) {
            node_count++;
        }
        assert (node_count == b->size && node_count == (size_t)i);

        assert (b->size == bag_size (b));
    }

    bag_delete (b); // don't leak the Bag we created
}

This looks a lot like the black-box tests above, but instead of using only the interface, I’m stabbing into my ADT’s structures: I’m walking along the list I hold to ensure the number of nodes in it actually matches the number of nodes I think I have, and so on.

Each implementation would have its own white-box tests. Some would be more complex than others, by the very nature of the implementation; some would be simpler. But each implementation has a distinct set of tests, which cannot be swapped into another implementation, unlike black-box tests which are purposefully implementation agnostic.
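To make that concrete, here's a sketch of how the same test might look against a different, hypothetical implementation: a Bag backed by a growable array. The `items` and `capacity` fields, the doubling growth policy, and the name wbt_array_bag_size are all my assumptions for illustration. Notice that none of the assertions about `top` or node counts survive; they're replaced by assertions about capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BAG_INITIAL_CAPACITY 4

// hypothetical array-backed Bag
struct bag {
    int *items;      // heap-allocated array of items
    size_t size;     // number of slots in use
    size_t capacity; // number of slots allocated
};
typedef struct bag *Bag;

Bag bag_new (void) {
    Bag b = malloc (sizeof *b);
    assert (b != NULL);
    b->items = malloc (BAG_INITIAL_CAPACITY * sizeof *b->items);
    assert (b->items != NULL);
    b->size = 0;
    b->capacity = BAG_INITIAL_CAPACITY;
    return b;
}

void bag_delete (Bag b) {
    free (b->items);
    free (b);
}

void bag_insert (Bag b, int item) {
    if (b->size == b->capacity) {
        // full: double the array (an assumed growth policy)
        size_t new_capacity = b->capacity * 2;
        int *grown = realloc (b->items, new_capacity * sizeof *b->items);
        assert (grown != NULL);
        b->items = grown;
        b->capacity = new_capacity;
    }
    b->items[b->size++] = item;
}

size_t bag_size (Bag b) { return b->size; }

// white-box test: only makes sense for this implementation
void wbt_array_bag_size (void) {
    Bag b = bag_new ();
    assert (b->items != NULL);
    assert (b->size == 0);
    assert (b->capacity == BAG_INITIAL_CAPACITY);

    for (int i = 1; i <= 10; i++) {
        bag_insert (b, i);
        assert (b->size == (size_t)i);
        assert (b->size <= b->capacity);     // never overrun the array
        assert (b->items[b->size - 1] == i); // newest item is in the last used slot
        assert (b->size == bag_size (b));
    }

    // ten inserts force two doublings: 4 -> 8 -> 16
    assert (b->capacity == 16);

    bag_delete (b);
}
```

The black-box test_bag_size would pass unchanged against either implementation; this one would not even compile against the linked-list version.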

Another approach is to use the assert(3) macro throughout your code to confirm that your conditions, or “invariants”, hold: for example, that your head and tail pointers refer to the same node iff there is exactly one node, or that each node’s previous node’s next node is itself.
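Those two example invariants belong to a doubly-linked list rather than to our Bag, so here's a minimal sketch of what checking them might look like. The `dl_list` and `dl_node` types and the function names are hypothetical, invented purely for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// hypothetical doubly-linked list, just for this illustration
struct dl_node {
    int item;
    struct dl_node *prev, *next;
};
struct dl_list {
    struct dl_node *head, *tail;
    size_t size;
};

// returns true iff every invariant we know about holds
bool dl_list_valid (const struct dl_list *l) {
    if (l == NULL) return false;
    // head and tail are NULL exactly when the list is empty
    if ((l->size == 0) != (l->head == NULL)) return false;
    if ((l->head == NULL) != (l->tail == NULL)) return false;
    // head and tail are the same node iff there is exactly one node
    if ((l->size == 1) != (l->head != NULL && l->head == l->tail)) return false;

    size_t count = 0;
    for (const struct dl_node *n = l->head; n != NULL; n = n->next) {
        // each node's previous node's next node is itself (head has no previous)
        if (n->prev == NULL ? n != l->head : n->prev->next != n) return false;
        // and symmetrically for next (tail has no next)
        if (n->next == NULL ? n != l->tail : n->next->prev != n) return false;
        if (++count > l->size) return false; // more nodes than claimed
    }
    return count == l->size;
}

void dl_list_append (struct dl_list *l, int item) {
    assert (dl_list_valid (l)); // pre-condition

    struct dl_node *n = malloc (sizeof *n);
    assert (n != NULL);
    n->item = item;
    n->prev = l->tail;
    n->next = NULL;
    if (l->tail == NULL) {
        l->head = n;
    } else {
        l->tail->next = n;
    }
    l->tail = n;
    l->size++;

    assert (dl_list_valid (l) && l->tail->item == item); // post-condition
}
```

Gathering the invariants into a single predicate is a design choice, not a requirement; it just saves repeating the same long assert in every function.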

This approach takes significantly more work, is more invasive into your code, and you have to almost engineer it in, but you may find this approach can be a bit more thorough, and allow for cleaner design. You’ll encounter more of this sort of idea in software engineering courses.

Going back to our Bag ADT, let’s have a look at what this would look like for the bag_size function.

// back in Bag.c
size_t bag_size (struct bag *b) {
    // pre-conditions
    assert ((b != NULL) &&
        ((b->size == 0 && b->top == NULL) ||
         (b->size > 0  && b->top != NULL)));

    return b->size;
}

And let’s have a quick look at bag_insert while we’re here.

void bag_insert (struct bag *b, int value) {
    // pre-conditions
    assert ((b != NULL) &&
        ((b->size == 0 && b->top == NULL) ||
         (b->size > 0  && b->top != NULL)));

    struct bag_node *bn = bag_node_new (value);
    assert (bn != NULL && bn->item == value);
    bn->next = b->top;
    b->top = bn;
    b->size++;

    // post-conditions
    assert (b->size > 0 &&
        b->top != NULL &&
        b->top->item == value);
}

This looks much more complex, but the core of the function is still the same: create a node, prepend it to the list. We merely make statements about the nature of the bag before and after we do so, and ensure that those statements hold (or the program goes away).

You could probably, with some thought, expand the pre- and post-conditions to ensure that (more of) the integrity of the list holds, though written inline they would become messy with more than a few nodes.
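One way that expansion might go, sketched under my own assumptions: pull the checks out into a helper (I've called it bag_valid, a name that isn't part of the Bag interface) which walks the whole list and confirms the node count agrees with `size`. I've used plain malloc here where the code above calls bag_node_new, so that the sketch stands alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct bag_node {
    int item;
    struct bag_node *next;
};
struct bag {
    struct bag_node *top;
    size_t size;
};

// hypothetical helper: does the list really hold `size` nodes?
bool bag_valid (const struct bag *b) {
    if (b == NULL) return false;
    size_t node_count = 0;
    for (const struct bag_node *curr = b->top; curr != NULL; curr = curr->next) {
        // bail out early: too many nodes, or a cycle in the list
        if (++node_count > b->size) return false;
    }
    return node_count == b->size;
}

struct bag *bag_new (void) {
    struct bag *b = malloc (sizeof *b);
    assert (b != NULL);
    b->top = NULL;
    b->size = 0;
    assert (bag_valid (b)); // post-condition
    return b;
}

void bag_insert (struct bag *b, int value) {
    assert (bag_valid (b)); // pre-condition

    struct bag_node *bn = malloc (sizeof *bn);
    assert (bn != NULL);
    bn->item = value;
    bn->next = b->top;
    b->top = bn;
    b->size++;

    assert (bag_valid (b) && b->top->item == value); // post-conditions
}
```

The catch is cost: walking the list on every call makes each insert linear in the size of the Bag. Since assert compiles away when NDEBUG is defined, that price is only paid in debug builds.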

(In fact, as I was writing this example, I stuffed up and forgot to increment b->size… and then wrote the post-condition assertion and realised my mistake.)

As a marker, I’d consider that this strategy shows you have a better understanding of the code you’ve written because, by its very nature, it encourages clearer code through more thorough use of abstraction to isolate common code, which can radically affect your design.

For example, instead of seeing you iterate through a list until some counter matches some other value, you may have a function that gets the nth line of the list, and you would be able to test this in either a separate function, or merely assert your expectations.
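A minimal sketch of that idea, assuming a plain singly-linked list; the `node` type and the name list_nth are made up for the example, and I've had it return NULL when the list is too short, which is only one of several reasonable choices.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int item;
    struct node *next;
};

// hypothetical helper: the nth node (counting from 0),
// or NULL if the list has n or fewer nodes
struct node *list_nth (struct node *first, size_t n) {
    struct node *curr = first;
    for (size_t i = 0; i < n && curr != NULL; i++) {
        curr = curr->next;
    }
    return curr;
}

// a separate test function for the helper
void test_list_nth (void) {
    // a three-node list built on the stack: 10 -> 20 -> 30
    struct node c = { 30, NULL };
    struct node b = { 20, &c };
    struct node a = { 10, &b };

    assert (list_nth (&a, 0) == &a);
    assert (list_nth (&a, 1) == &b);
    assert (list_nth (&a, 2)->item == 30);
    assert (list_nth (&a, 3) == NULL);   // off the end of the list
    assert (list_nth (NULL, 0) == NULL); // empty list
}
```

Once the loop lives in one small function, every caller gets the same tested behaviour instead of its own hand-rolled counter.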

Both strategies can be used together, too.

It’s probably a bit hard to graft the latter strategy into your assignment code at this stage, what with the deadline mere hours ago… but I hope this is at least somewhat enlightening about the shape and nature of testing strategies.