Exercise 1-11. How would you test the word count program? What kinds of input are most likely to uncover bugs if there are any?
Exercise 1-11 has no new program to write. It asks you to think like a tester — to look at the word count program K&R presented in Section 1.5.4 and design inputs that are most likely to expose any hidden bugs. This is one of the most practically useful exercises in Chapter 1, because the skill of systematic test design applies to every program you will ever write.
The Program Under Test
Here is the K&R word count program. It reads from standard input and prints the count of lines, words, and characters on one line. Save it as wordcount.c.
/* Compile: gcc -ansi -Wall wordcount.c -o wordcount */
#include <stdio.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* count lines, words, and characters in input */
int main(void)
{
    int c, nl, nw, nc, state;

    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
    return 0;
}
Compile
gcc -ansi -Wall wordcount.c -o wordcount
How the Program Works: A Two-State Machine
The variable state is either OUT (reading whitespace or at the very start of input) or IN (reading a non-whitespace character that belongs to a word). The word counter nw increments only on the OUT → IN transition — the first non-whitespace character after a gap. Every subsequent character inside the same word keeps state at IN without touching nw.
That single transition is where the most dangerous bugs live. Any test case that forces the state machine through that transition in an unusual way — or prevents it from happening at all — is the most likely to expose a broken implementation.
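To make the transition visible, here is a minimal sketch that runs the same two-state logic over a string and records the state after each character. The helper name trace_states and its trace buffer are assumptions made for illustration; they are not part of the K&R program.

```c
#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* Hypothetical helper, not part of K&R's program.
   Writes 'I' or 'O' into trace for the state after each character of s
   and returns the number of OUT -> IN transitions (the word count).
   trace must have room for one byte per character of s plus a '\0'. */
int trace_states(const char *s, char *trace)
{
    int state = OUT;
    int transitions = 0;

    for (; *s != '\0'; ++s) {
        if (*s == ' ' || *s == '\n' || *s == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;          /* first character of a new word */
            ++transitions;
        }
        *trace++ = (state == IN) ? 'I' : 'O';
    }
    *trace = '\0';
    return transitions;
}
```

For the input "hi  yo\n" (two spaces between the words), the trace is IIOOIIO and the return value is 2: the second space keeps the machine in OUT, so only the h and the y count as word starts.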
The Test Suite
All tests pipe input using printf so no temporary files are needed. The output format is nl nw nc (newlines words characters). Work through each case in order — they build from simplest to most subtle.
Test 1: Empty Input
Why: The while loop never executes. All three counters must be initialized to zero before the loop — and they are: nl = nw = nc = 0. Any implementation that produces nonzero output here has an initialization bug.
printf "" | ./wordcount
Expected: 0 0 0
Test 2: Only Whitespace — No Words
Why: The state machine should stay in OUT the entire time. nw must remain zero because the OUT → IN transition never fires. This distinguishes “no input” from “input containing no words” — two different situations that must both produce nw = 0.
printf "   \t\n  " | ./wordcount
Expected: 1 0 7 (3 spaces + tab + newline + 2 spaces = 7 characters, 1 newline, 0 words)
Test 3: Single Word, No Trailing Newline
Why: The program reaches EOF while still in state IN. Notice that nl is 0 even though there is one word — word count and line count are independent. Any code that assumes words are always terminated by a newline will fail this case.
printf "hello" | ./wordcount
Expected: 0 1 5
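As an illustration of the failure this test targets, the sketch below shows a deliberately broken variant that credits a word only when a newline ends it. The function name is invented for this example, and the logic is an assumption about how such a bug might look, not code from K&R.

```c
#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* Hypothetical buggy variant: a word is counted only when a newline
   arrives while the machine is inside a word, so any word that is not
   directly followed by '\n' is lost. */
int count_words_newline_only(const char *s)
{
    int state = OUT;
    int nw = 0;

    for (; *s != '\0'; ++s) {
        if (*s == '\n') {
            if (state == IN)
                ++nw;            /* word credited at end of line only */
            state = OUT;
        } else if (*s == ' ' || *s == '\t')
            state = OUT;
        else
            state = IN;
    }
    return nw;
}
```

This variant returns 1 for "hello\n" but 0 for "hello", whereas the K&R program reports one word in both cases. An input without a trailing newline is what separates the two.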
Test 4: Single Word With Newline
Why: The simplest complete input. Confirms the baseline: one word, one newline, six characters (hello = 5 + the newline itself).
printf "hello\n" | ./wordcount
Expected: 1 1 6
Test 5: Multiple Spaces Between Words
Why: This is the most important test. A common bug counts a new word on every non-whitespace character rather than only on the OUT → IN transition; a related bug counts one word per separator character. With three spaces between “hello” and “world”, the state machine must cycle IN → OUT → IN exactly once between the two words, producing nw = 2, not 10 (the number of letters in both words) and not 4 (the number of whitespace characters).
printf "hello   world\n" | ./wordcount
Expected: 1 2 14
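The two sketches below are hypothetical broken word counters, written only to show what this test would catch. Neither is from K&R, and the function names are invented. The first counts every non-whitespace character; the second counts every whitespace character as the end of a word.

```c
/* Hypothetical bug 1: the state variable was dropped, so every
   non-whitespace character is counted as a new word. */
int count_words_per_char(const char *s)
{
    int nw = 0;

    for (; *s != '\0'; ++s)
        if (*s != ' ' && *s != '\n' && *s != '\t')
            ++nw;
    return nw;
}

/* Hypothetical bug 2: every whitespace character is treated as the end
   of a word, so a run of spaces is counted several times. */
int count_words_per_separator(const char *s)
{
    int nw = 0;

    for (; *s != '\0'; ++s)
        if (*s == ' ' || *s == '\n' || *s == '\t')
            ++nw;
    return nw;
}
```

On "hello   world\n" with three spaces, the first returns 10 and the second returns 4, while the correct answer is 2. With a single space, "hello world\n", the second variant returns 2 and looks correct, which is why the repeated spaces matter.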
Test 6: Input Starting With Whitespace
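Why: The program starts with state = OUT. Leading spaces must keep the machine in OUT, and nw must not increment until the first non-whitespace character triggers the OUT → IN transition. This probes whether leading whitespace is handled identically to whitespace elsewhere in the input and whether the first character is wrongly counted as a word regardless of what it is. It does not probe the initial value of state, because the first space resets state to OUT; inputs that begin with a word character (Tests 3 and 4) do that.
printf "   hello\n" | ./wordcount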
Expected: 1 1 9 (3 spaces + 5 letters + newline = 9 characters)
Test 7: Mixed Whitespace Delimiters
Why: The whitespace condition is c == ' ' || c == '\n' || c == '\t'. A missing clause — for example, forgetting to check '\t' — would treat tabs as word characters and give the wrong count. Use all three whitespace types in a single run to test each branch.
printf "one two\tthree\nfour" | ./wordcount
Expected: 1 4 18 (space after "one", tab after "two", newline after "three", EOF after "four")
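One way to see the effect of a missing clause is to make the delimiter set a parameter. This is a sketch under that assumption: count_words_with is a hypothetical generalization, not K&R's code, and the original program corresponds to the delimiter string " \n\t".

```c
#include <string.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* Hypothetical generalization: delims lists the characters treated as
   whitespace. Passing " \n\t" reproduces the K&R condition. */
int count_words_with(const char *s, const char *delims)
{
    int state = OUT;
    int nw = 0;

    for (; *s != '\0'; ++s) {
        if (strchr(delims, *s) != NULL)
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    return nw;
}
```

For the Test 7 input "one two\tthree\nfour", the full set " \n\t" gives 4. Dropping any one delimiter gives 3, because two neighboring words fuse into one: " \n" fuses two and three, " \t" fuses three and four, and "\n\t" fuses one and two.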
Test 8: Very Long Single Word
Why: nc grows with every character, but the state transitions to IN exactly once and stays there. nw must remain 1. This confirms that “staying in IN” across many characters is handled correctly — the else if (state == OUT) guard only fires on entry to a word, not once per character.
printf "averylongwordwithnospacesortabs" | ./wordcount
Expected: 0 1 31
Test 9: Words Alternating With Blank Lines
Why: After each blank line, the state must return to OUT so that the next word triggers a fresh OUT → IN transition. A missed state reset after a blank line would fail to count the second word correctly. This also exercises nl across multiple newlines.
printf "one\n\n\ntwo\n" | ./wordcount
Expected: 4 2 10 (o+n+e+\n+\n+\n+t+w+o+\n = 10 characters, 4 newlines, 2 words)
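The nine cases can also be collected into a table and checked in one pass. The sketch below assumes the counting loop is moved into a function that reads a string instead of standard input; struct counts, count_text, and run_suite are hypothetical names introduced for this example. The inputs use the spacing described in the expected results above (for example, three spaces in Tests 5 and 6).

```c
#include <stdio.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* Hypothetical container for the three counters. */
struct counts { int nl, nw, nc; };

/* The K&R counting loop applied to a string instead of stdin. */
static struct counts count_text(const char *s)
{
    struct counts r;
    int state = OUT;

    r.nl = r.nw = r.nc = 0;
    for (; *s != '\0'; ++s) {
        ++r.nc;
        if (*s == '\n')
            ++r.nl;
        if (*s == ' ' || *s == '\n' || *s == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++r.nw;
        }
    }
    return r;
}

/* One row per test: a label, the input, and the expected nl, nw, nc. */
struct test_case {
    const char *name;
    const char *input;
    int nl, nw, nc;
};

static const struct test_case cases[] = {
    { "empty input",         "",                                0, 0,  0 },
    { "whitespace only",     "   \t\n  ",                       1, 0,  7 },
    { "no trailing newline", "hello",                           0, 1,  5 },
    { "word plus newline",   "hello\n",                         1, 1,  6 },
    { "multiple spaces",     "hello   world\n",                 1, 2, 14 },
    { "leading whitespace",  "   hello\n",                      1, 1,  9 },
    { "mixed delimiters",    "one two\tthree\nfour",            1, 4, 18 },
    { "long single word",    "averylongwordwithnospacesortabs", 0, 1, 31 },
    { "blank lines",         "one\n\n\ntwo\n",                  4, 2, 10 }
};

/* Runs every case, prints each mismatch, and returns the failure count. */
int run_suite(void)
{
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof cases / sizeof cases[0]; ++i) {
        struct counts got = count_text(cases[i].input);

        if (got.nl != cases[i].nl || got.nw != cases[i].nw ||
            got.nc != cases[i].nc) {
            printf("FAIL %s: got %d %d %d, expected %d %d %d\n",
                   cases[i].name, got.nl, got.nw, got.nc,
                   cases[i].nl, cases[i].nw, cases[i].nc);
            ++failures;
        }
    }
    return failures;
}
```

Calling run_suite() from main and returning its result as the exit status would give a quick pass/fail check. Keeping the cases in a table makes it easy to add a new row whenever a new bug is found.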
Which Tests Are Most Likely to Expose Bugs?
The highest-value tests are the ones that put the state machine through transitions in unexpected ways:
- Test 5 (multiple spaces): catches the most common implementation error, counting words per character or per separator instead of per OUT → IN transition.
- Test 2 (whitespace-only): reveals when nw fires even though it should not, for example when the counter starts at 1 or a word is counted per separator.
- Test 3 (no trailing newline): catches any assumption that a word is only complete when followed by a newline. The program correctly counts words by state transition, not by line ending, so this should work, but an altered version that increments nw only on '\n' would fail here. Because the input begins with a word character, it can also expose a missing state = OUT initialization.
- Test 6 (leading whitespace): catches an off-by-one where the first character always triggers a word count, whether or not it is whitespace.
- Test 7 (mixed delimiters): catches an incomplete whitespace check that recognizes spaces but not tabs, or tabs but not newlines.
The general principle: test every state transition, every way it can be reached, and the case where it should not fire at all. Tests 1 and 2 cover “transition never fires.” Tests 3, 4, 6, and 8 cover “transition fires exactly once,” reached in different ways. Tests 5, 7, and 9 cover “transition fires several times,” with repeated spaces, mixed delimiters, or blank lines in between.
What This Exercise Teaches
- State machine thinking: recognizing that the IN/OUT variable makes this a state machine, and that transitions are where implementation errors appear
- Boundary-condition test design: testing zero, one, and many before the typical case; the empty input and single-word inputs are as important as multi-word inputs
- Whitespace equivalence in C: ' ', '\n', and '\t' are distinct integer values; the program must check each one explicitly, and omitting any one is a real bug
- Counter independence: nl, nw, and nc are separate and can take very different values for the same input; never assume they move together
Book:
The C Programming Language, 2nd Ed — Kernighan & Ritchie