K&R C Exercise 3-3: expand — Expand a-z Shorthand into Full Range

Exercise 3-3. Write a function expand(s1,s2) that expands shorthand notations like a-z in the string s1 into the equivalent complete list abc...xyz in s2. Allow for letters or digits on either end of a range, and take care of cases like a-b-c and a-z0-9 and -a. Arrange that a leading or trailing - is taken literally.

The core technique is a lookahead: when processing the character at position i, peek at s1[i+1] and s1[i+2] before deciding what to emit. If you see the pattern lo-hi where lo and hi belong to the same character class (both lowercase letters, both uppercase letters, or both digits) and lo <= hi, expand the range and advance by three. Otherwise copy the current character literally and advance by one. This covers the edge cases: a leading - can only ever be tested as lo, and - belongs to none of the three classes, so it is copied as is; a trailing - has s1[i+2] == '\0', so the lookahead fails and both the preceding character and the dash are copied; a-z0-9 expands a-z, then resumes at 0 and expands 0-9 as a second, independent range.
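
The class test described above can be pulled out into a small predicate. This is only a sketch: the name is_range is a hypothetical helper introduced here for illustration and is not part of the solution that follows, which writes the same test inline.

```c
#include <ctype.h>

/* Hypothetical helper (assumed name, not in the original solution).
 * Returns 1 when lo-hi is an expandable range: both ends are in the
 * same class (lowercase, uppercase, or digit) and lo does not come
 * after hi. Returns 0 otherwise.
 * Callers should pass values converted from unsigned char. */
int is_range(int lo, int hi)
{
    if (lo > hi)
        return 0;                      /* reversed range such as z-a */
    return (islower(lo) && islower(hi)) ||
           (isupper(lo) && isupper(hi)) ||
           (isdigit(lo) && isdigit(hi));
}
```

For example, is_range('a', 'z') and is_range('0', '9') are true, while is_range('z', 'a'), is_range('a', '9'), and is_range('-', 'a') are false, so those inputs fall through to the literal copy.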

Solution

/* K&R Exercise 3-3 — expand shorthand range notation
 * Compile: gcc -ansi -Wall ex3-3.c -o ex3-3 */
#include <stdio.h>
#include <ctype.h>

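/* Copy s1 to s2, expanding ranges such as a-z, A-F, or 0-9.
 * s2 must be large enough to hold the expanded result. */
void expand(char s1[], char s2[])
{
    int i, j, lo, hi, k;

    i = j = 0;
    while (s1[i] != '\0') {
        /* Look ahead: a range needs a '-' followed by a real character. */
        if (s1[i+1] == '-' && s1[i+2] != '\0') {
            /* Cast so the <ctype.h> tests never see a negative value. */
            lo = (unsigned char) s1[i];
            hi = (unsigned char) s1[i+2];
            if ((islower(lo) && islower(hi) && lo <= hi) ||
                (isupper(lo) && isupper(hi) && lo <= hi) ||
                (isdigit(lo) && isdigit(hi) && lo <= hi)) {
                for (k = lo; k <= hi; ++k)
                    s2[j++] = k;
                i += 3;    /* consume lo, '-', and hi together */
                continue;
            }
        }
        s2[j++] = s1[i++];    /* not a range: copy one character literally */
    }
    s2[j] = '\0';
}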

int main(void)
{
    char out[256];

    expand("a-z", out);    printf("a-z    -> %s\n", out);
    expand("0-9", out);    printf("0-9    -> %s\n", out);
    expand("a-z0-9", out); printf("a-z0-9 -> %s\n", out);
    expand("-a", out);     printf("-a     -> %s\n", out);
    expand("a-", out);     printf("a-     -> %s\n", out);
    expand("a-b-c", out);  printf("a-b-c  -> %s\n", out);
    expand("A-F", out);    printf("A-F    -> %s\n", out);
    return 0;
}

Compile and Run

gcc -ansi -Wall ex3-3.c -o ex3-3
./ex3-3

Sample Output

a-z    -> abcdefghijklmnopqrstuvwxyz
0-9    -> 0123456789
a-z0-9 -> abcdefghijklmnopqrstuvwxyz0123456789
-a     -> -a
a-     -> a-
a-b-c  -> ab-c
A-F    -> ABCDEF

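Note a-z0-9: after a-z is expanded, the index has moved past the z, so scanning resumes at 0. There is no dash between z and 0; the next three characters are 0-9, which is recognized as a second range and expanded as well, giving ...xyz0123456789. And a-b-c: a-b expands to ab and consumes the b, so the next character examined is the second -. A dash in the lo position never starts a range, so it is copied literally, followed by c, giving ab-c. The exercise does not spell out what a-b-c should produce; abc is another reasonable reading, and this solution deliberately treats each range as a separate lo-hi triple.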
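
If you prefer the reading where a-b-c becomes abc, one way is to let the upper end of a range serve as the lower end of the next one. The following is a sketch of that alternative, not the solution above: expand_chained and same_class are hypothetical names introduced for illustration. It emits lo up to hi - 1 and leaves the index on hi, so hi is either copied on the next pass or starts a new range.

```c
#include <ctype.h>

/* Hypothetical helper: nonzero when lo..hi is a valid same-class range. */
static int same_class(int lo, int hi)
{
    return lo <= hi &&
           ((islower(lo) && islower(hi)) ||
            (isupper(lo) && isupper(hi)) ||
            (isdigit(lo) && isdigit(hi)));
}

/* Hypothetical variant of expand: chained ranges share their endpoint,
 * so a-b-c yields abc. s2 must be large enough for the result. */
void expand_chained(const char s1[], char s2[])
{
    int i = 0, j = 0, k, lo, hi;

    while (s1[i] != '\0') {
        if (s1[i+1] == '-' && s1[i+2] != '\0') {
            lo = (unsigned char) s1[i];
            hi = (unsigned char) s1[i+2];
            if (same_class(lo, hi)) {
                /* Emit lo..hi-1 only; hi is handled on the next pass. */
                for (k = lo; k < hi; ++k)
                    s2[j++] = (char) k;
                i += 2;            /* skip lo and '-', stay on hi */
                continue;
            }
        }
        s2[j++] = s1[i++];         /* literal copy */
    }
    s2[j] = '\0';
}
```

With this variant a-b-c yields abc, a-z0-9 still yields both full ranges, and -a and a- are still copied unchanged. The trade-off is that every dash between same-class characters is now treated as a range separator, which may or may not be what you want for input like a-c-e.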

What This Exercise Teaches

  • String lookahead — reading s1[i+1] and s1[i+2] to classify the pattern before deciding what to emit
  • Edge-case enumeration — leading/trailing dash, reversed range, mixed character classes all need explicit handling
  • <ctype.h> predicates (islower, isupper, isdigit) for portable character classification without ASCII magic numbers
  • Advance-by-3 pattern — consuming three input characters (lo, -, hi) in one logical step using i += 3; continue

Book:
The C Programming Language, 2nd Ed — Kernighan & Ritchie
