K&R C Exercise 1-11: How to Test the Word Count Program

Exercise 1-11. How would you test the word count program? What kinds of input are most likely to uncover bugs if there are any?

Exercise 1-11 has no new program to write. It asks you to think like a tester — to look at the word count program K&R presented in Section 1.5.4 and design inputs that are most likely to expose any hidden bugs. This is one of the most practically useful exercises in Chapter 1, because the skill of systematic test design applies to every program you will ever write.

The Program Under Test

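Here is the K&R word count program, lightly modernized with int main(void) and an explicit return 0 so that it compiles without warnings under -Wall. It reads from standard input and prints the counts of lines, words, and characters on one line. Save it as wordcount.c.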

/* Compile: gcc -ansi -Wall wordcount.c -o wordcount */
#include <stdio.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* count lines, words, and characters in input */
int main(void)
{
    int c, nl, nw, nc, state;

    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
    return 0;
}

Compile

gcc -ansi -Wall wordcount.c -o wordcount

How the Program Works: A Two-State Machine

The variable state is either OUT (reading whitespace or at the very start of input) or IN (reading a non-whitespace character that belongs to a word). The word counter nw increments only on the OUT → IN transition — the first non-whitespace character after a gap. Every subsequent character inside the same word keeps state at IN without touching nw.

That single transition is where the most dangerous bugs live. Test cases that force the state machine through that transition in an unusual way, or prevent it from happening at all, are the most likely to expose a broken implementation.

The Test Suite

All tests pipe input using the shell's printf command, so no temporary files are needed. The output format is nl nw nc (newlines, words, characters). Work through each case in order; they build from simplest to most subtle.

Test 1: Empty Input

Why: The while loop never executes. All three counters must be initialized to zero before the loop — and they are: nl = nw = nc = 0. Any implementation that produces nonzero output here has an initialization bug.

printf "" | ./wordcount
Expected: 0 0 0

Test 2: Only Whitespace — No Words

Why: The state machine should stay in OUT the entire time. nw must remain zero because the OUT → IN transition never fires. This distinguishes “no input” from “input containing no words” — two different situations that must both produce nw = 0.

printf "   \t\n  " | ./wordcount
Expected: 1 0 7
(3 spaces + tab + newline + 2 spaces = 7 characters, 1 newline, 0 words)

Test 3: Single Word, No Trailing Newline

Why: The program reaches EOF while still in state IN. Notice that nl is 0 even though there is one word — word count and line count are independent. Any code that assumes words are always terminated by a newline will fail this case.

printf "hello" | ./wordcount
Expected: 0 1 5

Test 4: Single Word With Newline

Why: The simplest complete input. Confirms the baseline: one word, one newline, six characters (hello = 5 + the newline itself).

printf "hello\n" | ./wordcount
Expected: 1 1 6

Test 5: Multiple Spaces Between Words

Why: This is the most important test. A common bug counts a new word on every non-whitespace character rather than only on the OUT → IN transition. With three spaces between “hello” and “world”, the state machine must cycle IN → OUT → IN exactly once between the two words, producing nw = 2 — not 10 (the number of letters in both words).

printf "hello   world\n" | ./wordcount
Expected: 1 2 14

Test 6: Input Starting With Whitespace

Why: The program starts with state = OUT. Leading spaces must keep the machine in OUT, and nw must not increment until the first non-whitespace character triggers the OUT → IN transition. This probes whether leading whitespace is handled identically to whitespace elsewhere in the input, and whether the very first character of input is wrongly counted as the start of a word.

printf "   hello\n" | ./wordcount
Expected: 1 1 9
(3 spaces + 5 letters + newline = 9 characters)

Test 7: Mixed Whitespace Delimiters

Why: The whitespace condition is c == ' ' || c == '\n' || c == '\t'. A missing clause — for example, forgetting to check '\t' — would treat tabs as word characters and give the wrong count. Use all three whitespace types in a single run to test each branch.

printf "one two\tthree\nfour" | ./wordcount
Expected: 1 4 18
(space after "one", tab after "two", newline after "three", EOF after "four")

Test 8: Very Long Single Word

Why: nc grows with every character, but the state transitions to IN exactly once and stays there. nw must remain 1. This confirms that “staying in IN” across many characters is handled correctly — the else if (state == OUT) guard only fires on entry to a word, not once per character.

printf "averylongwordwithnospacesortabs" | ./wordcount
Expected: 0 1 31

Test 9: Words Alternating With Blank Lines

Why: The newline that ends "one" must reset the state to OUT, and the empty lines that follow must leave it there, so that "two" triggers a fresh OUT → IN transition. A version that fails to reset the state on '\n' would report one word instead of two. This also exercises nl across consecutive newlines.

printf "one\n\n\ntwo\n" | ./wordcount
Expected: 4 2 10
(o+n+e+\n+\n+\n+t+w+o+\n = 10 characters, 4 newlines, 2 words)

Which Tests Are Most Likely to Expose Bugs?

The highest-value tests are the ones that put the state machine through transitions in unexpected ways:

  • Test 5 (multiple spaces): catches the most common implementation error, counting words per character instead of per OUT → IN transition.
  • Test 2 (whitespace-only): catches a version that counts separators as words, or an uninitialized nw. No character ever enters a word, so nw must stay at 0.
  • Test 3 (no trailing newline): catches any assumption that a word is only complete when followed by a newline. The program counts words by state transition, not by line ending, so this should work, but an altered version that increments nw only on '\n' would fail here. Because the first character read is already part of a word, this test also catches a wrong or uninitialized starting state.
  • Test 6 (leading whitespace): catches an off-by-one where the first character of input always triggers a word count, or where leading whitespace is treated differently from whitespace elsewhere.
  • Test 7 (mixed delimiters): catches an incomplete whitespace check that recognizes spaces but not tabs, or tabs but not newlines.

The general principle: test every state transition, every way it can be reached, and the case where it should not fire at all. Tests 1 and 2 cover the case where the transition never fires. Tests 3, 4, 6, and 8 cover it firing exactly once, reached in different ways. Tests 5, 7, and 9 cover it firing several times, with unusual spacing or mixed delimiters between the words.

What This Exercise Teaches

  • State machine thinking: recognizing that the IN/OUT variable makes this a state machine, and that transitions are where implementation errors appear
  • Boundary-condition test design: testing zero, one, and many before the typical case; the empty input and single-word inputs are as important as multi-word inputs
  • Whitespace equivalence in C: ' ', '\n', and '\t' are distinct integer values; the program must check each one explicitly, and omitting any one is a real bug
  • Counter independence: nl, nw, and nc are separate and can take very different values for the same input; never assume they move together

Book:
The C Programming Language, 2nd Ed — Kernighan & Ritchie
