Thoughts on the Craft of Programming

Abstraction, Refactoring and How Changes Introduce Bugs

by

Michael J. Harrison

Abstract

Starting with a fragment of code known to contain an error, I correct the error, make a series of improvements and finally add an enhancement. I discuss the motivations for the improvements, hinting at some general principles that may guide programmers to produce code of high quality. I also reveal how in the course of making changes I inadvertantly introduced an new error into the code, to illustrate how easily this can happen even when one is being careful. I conclude by drawing a few morals from the exercise.

Introduction

This essay is about the craft of programming. It attempts to show how the grand concepts that we bandy about, such as modularity and abstraction, are reflected in the everyday work of a programmer. It also illustrates how, as a craft, programming combines elements of both science and art. The scientific element comes about because programming concerns logical propositions that are objectively true or false; the artistic element arises because there are many ways to formulate these propositions and subjective judgments are required to distinguish the good ways from the bad.

I start by examining a fragment of code that is known to contain an error. I explain the error and show a simple change that will remove it. I then make a series of improvements to the code to make it easier to read and maintain. Finally I add a new feature. It's a trivially simple piece of code, much too small to really show the advantages of modularity and refactoring, but I believe it is big enough to illustrate the principles. Because the original fragment was written in C and uses native types, my changes and additions do also, and I have kept the original's K&R style because I feel it is, as Tony Hoare said about Algol-60, such an improvement on all its successors. For clarity, I have omitted the #include statements that would be required.

Fixing a Bug

The Listing One comes from one of the advertisements that Gimpel Software runs every month in Dr. Dobb's Journal; each advertisement illustrates a bug that could be found by their enhanced version of lint and that therefore can be discovered by visual inspection of the code. The symptom of failure stated in the ad is "This function which is intended to count the vowels in the string provided is taking a long time to do so," but I found that under either Windows or Linux the actual failure is that an addressing exception is generated.

Listing One

/* Version 1: Original code from Gimpel ad.
 * Gimpel Bug #527, DDJ December 1999, p. 96
 */
int count_vowels(char *s)
{
     int sum = 0;

     for (;;)
          switch (*s++) {
          case 'a':
          case 'e':
          case 'i':
          case 'o':
          case 'u':
               sum++; continue;
          default:    continue;
          case '\0':  break;
          }
     return sum;
}
The error results from the coder's failure to see and appreciate the implication of C syntax: the break statement is in the scope of the switch statement, not that of the for statement, and consequently causes exit from the switch but not from the for. One is misled by the two continue statements, which are indeed in the scope of the for loop. So the for statement in fact has no termination condition, and thus after a potentially long series of character accesses an invalid address is generated.

Making the Initial Bug Fix

An explicit infinite loop (such as the C idiom "for (;;)") should always be viewed with suspicion, for very few loops, if indeed any, are truly endless. Let's put the termination condition in the loop, which is clearer and also fixes the bug. The new for statement is another piece of C idiom (a cliché, if you will) that is instantly recognizable to an experienced C programmer as an instance of the scan-the-whole-string pattern (though some C programmers would omit the explicit comparison to the null character and rely on C's implicit comparison). Note that we no longer need the "case '\0':", since the loop condition now guarantees this case would never arise. This code is definitely more readable than the original, and an experienced programmer could comprehend the whole thing in little more than a glance.

Although Java's for and switch statements work in exactly the same way as C's and expose the programmer to the same error, a Java programmer almost certainly would not have made it. That is because Java strings use a length rather than a terminating character, and a Java programmer would automatically write the loop in the Java idiom [1]:

for (int i = 0; i < s.length(); i++)
So, by using the termination condition directly in the loop, the bug can be fixed as shown in Listing Two.

Listing Two

/* Version 2: Use termination condition directly in loop definition.
 */
int count_vowels(char *s)
{
     int sum = 0;

     for (; *s != '\0'; s++)
          switch (*s) {
          case 'a':
          case 'e':
          case 'i':
          case 'o':
          case 'u':
               sum++; continue;
          default:    continue;
          }
     return sum;
}

Cleaning Up the switch Statement

That fixed the bug, but the switch statement looks a bit silly now, and would look even sillier if I had stripped it to its bare essentials by also removing the default case and the preceding continue (which is made redundant by the removal of the default) — even though, if compiled using a branch table, the switch statement would be quite efficient. In any case, I tend to look askance at continue statements [2]. Let's replace the switch with a more straightforward if statement.

Listing Three

/* Version 3: With no real alternatives, switch statement is poor fit.
   Use if instead.
 */
int count_vowels(char *s)
{
     int sum = 0;

     for (; *s != '\0'; s++)
          if (*s == 'a' | *s == 'e' | *s == 'i' | *s == 'o' | *s == 'u')
               sum++;
     return sum;
}

Replacing the Ugly if

That's reasonable and not too inefficient, but the if statement looks a bit ugly. We now need to switch our focus from the scientific — getting the code correct — to the artistic — making the code good. Let's replace the ugly statement with a nicer test, using the C run-time library: a tiny [3] example of component-based development, with the library acting as a component. Usually, strchr is employed to search for a given character in an arbitrary string; here we invert that to search for an arbitrary character in a given string. Its specification is "strchr searches a string for the first occurrence of a character; if the character is found in the string, a pointer to the first occurrence is returned, otherwise a null pointer is returned." We use it in the logically equivalent way: the character occurs in the string if and only if strchr returns a value other than the null pointer value. On x86 CPUs with a decent compiler this version will also be a little faster. Notice that the total amount of text (Listing Four) is now substantially reduced, which also contributes to the code's being easier to read and understand.

Listing Four

/* Version 4: The if statement is ugly. Use a cleaner and faster test.
 */
int count_vowels(char *s)
{
     static char vowels[] = "aeiuo";
     int sum = 0;

     for (; *s != '\0'; s++)
          if (strchr(vowels, *s) != 0)
               sum++;
     return sum;
}

Making a Simple Enhancement

We notice from Listing Four that only lower case letters are counted as vowels. Let's change the code to accept upper case vowels as well. This is now very easy and safe to do, since the only change needed is to extend a static array. Note that if we had made an equivalent change to Version 3 the if statement would have become really ugly and significantly inefficient. The change shown in Listing Five is by no means the only way to implement this enhancement; arguably it would be just as good, for example, to initialize the vowels array with only the upper case vowels, and pass toupper(*s) to strchr.

Listing Five

/* Version 5: Deal with upper case as well. Note how easy this change is,
 * now that the logic is contained in a static declaration.
 */
int count_vowels(char *s)
{
     static char vowels[] = "aeiouAEIOU";
     int sum = 0;

     for (; *s != '\0'; s++)
          if (strchr(vowels, *s))       /* Implicit pointer comparison */
               sum++;
     return sum;
}

Refactoring to Minimize the Scope of the Definition of "Vowel"

That code looks reasonably good, but it does combine two functions that are really quite separate: scanning the string, and deciding whether a character is a vowel. Once again, our artistic antennae are alerted. Scanning a string is an instance of the more general function of traversing a data collection, so with the Visitor pattern at the back of our minds we decide to separate the two functions. This also makes sense because the definition of "vowel" may vary between languages (and indeed may not exist at all in some), so this is a part of the program that looks as though it might depend on the target environment and hence one that should be isolated into a separate function. With this change, the total amount of text has increased a little but the conceptual complexity has been reduced because each function can now be viewed in isolation.

Although I will not demonstrate it here, the code in Listing Six is ready for further generalization, because any is_x function with the same signature as is_vowel could be passed as an argument to the string scanning function and so all sorts of different tests could be realized by writing only additional code (i.e., not changing existing code). In C++ or Java, instead of passing a function address (impossible in Java) one would wrap each function in its own class declaring is_x as a method and pass as the argument an object of the appropriate class (the functor idiom), and in C# one could use a delegate, but the principle is exactly the same.

Now that it is a separate function, is_vowel invites a more efficient implementation, for example, one based on a 256-byte array of character properties in the manner of the isalpha function and its companions [4]. This would be worth doing if profiling revealed that this function was an execution bottleneck.

The function is_vowel is a candidate for an independent unit test. Indeed, since there are only 256 possible values of c [5], this is an example of that rare case, a piece of code that can be exhaustively tested. We should take advantage of this and write and execute such a test in this instance; this would be particularly worthwhile if we planned to improve the implementation as suggested in the previous paragraph. I'll return to this point a little later.

Listing Six

/* Version 6: Refactor, to minimize the scope of the definition of "vowel".
 */
int is_vowel(char c)
{
     static char vowels[] = "aeiouAEIOU";
     return strchr(vowels, c) != 0;
}

int count_vowels(char *s)
{
     int sum = 0;

     for (; *s != '\0'; s++)
          if (is_vowel(*s))
               sum++;
     return sum;
}

A New Feature: Counting the Number of Vowel Pairs

The new feature is to count the number of vowel pairs in the string, or more precisely the number of instances where s[i] and s[i+1] are both vowels; for example, the vowel pair count for "beautiful" is 2, for "ea" and "au" [6]. Here is the new function count_vowelpairs and its supporting function is_vowelpair:

Listing Seven

/* Version 7: Implement a new feature - count vowel pairs.
 */
int is_vowelpair(const char *p)
{
     return is_vowel(*p) && is_vowel(*(p+1));
}

int count_vowelpairs(char *s)
{
     int sum = 0;

     for (; *s != '\0'; s++)
          if (is_vowelpair(s))
               sum++;
     return sum;
}

That was easy — a little copy and paste, a minor edit, a one-line function, and we're done. Note how, because of the previous refactoring, we were able to use the is_vowel function to implement is_vowelpair. Note also that if in the original implementation of is_vowel we had passed a character pointer instead of a character, we could have implemented is_vowelpair as an is_x function and used the function pointer protocol mentioned previously to implement the new feature without having to implement count_vowelpairs as a separate new function. But we were not sufficiently farsighted, and the decision to pass a character was a wise one, based on the principle that it is safer not to pass a pointer if it can be avoided.

But are we really done? No, we have been a little cavalier in our approach, and the program contains an error, one that might well evade testing [7].

On hearing that, a programmer might first be drawn to the is_vowelpair function because it dereferences (p+1). Is that always a valid address? Yes, in fact it is, because the test in the for statement guarantees that p always points to a character in the string, so even when p is pointing to the last character in the string (p+1) is pointing at the null character terminating the string and so is still a valid address. We were lucky when we made the change, because we had not explicitly considered this. There is no way in C to protect is_vowelpair from being invoked with a pointer p such that (p+1) is an invalid address; we resolve to ensure that the specification for is_vowelpair contains a prominent notice of this precondition.

So what is the bug? It is this: if the string ends in a vowel x, the character pair "x\0" will be incorrectly identified as a vowel pair. The groundwork for this bug was laid in the improvement from Version 3 to Version 4; it turns out that the if test and the strchr test are not exactly equivalent. The function strchr considers the terminating null character to be part of the string for the purposes of the search, so searching for the null character will always succeed and return a non zero pointer. This is documented in the specification for strchr, but is not well known; it is not mentioned in the MSDN Library documentation for the Microsoft Foundation Classes method CString::Find (which itself calls strchr). Yet it is on such minutiæ that the correct execution of a program may hinge.

How did you answer the question about unit testing in the previous section? If you thought then it would not be worth the trouble, would you like to reconsider that decision now? If you would have chosen to perform unit testing, would you have tried the case when c is zero?

How should we fix this bug?

One way would be to replace the body of is_vowelpair with the statement:

     return is_vowel(*p) && *(p+1) != '\0' && is_vowel(*(p+1));
This will fix the bug and in one sense is quite correct, because it simply compensates for the slightly awkward behavior of strchr in our context. Another interpretation might be that the error lies in is_vowel, which classifies the null character as a vowel, and that this function should be changed to correct the error (I hear Bertrand Meyer in the background chiding us for an insufficiently precisely defined precondition for is_vowel) [8]. I maintain, however, that either of these changes fixes the symptom rather than the disease. Our original suspicion of (p+1) was justified; I contend that the proper precondition for is_vowelpair is that both p and (p+1) point to actual characters within the string, not to the terminator. Therefore a correct fix for the bug is to leave is_vowelpair as it is and replace the for statement in count_vowelpairs with this:
     for (; *s != '\0' && *(s+1) != '\0'; s++)
It is necessary to test both characters because testing only *(s+1) will fail if the string is null (for then *(s+1) would reference an invalid address). This is a bit clumsy, and we might want to consider testing for a null string before entering the loop or using a different termination condition, perhaps one based directly on the length of the string. We might also want to consider weakening the precondition for is_vowel by making it classify the null character as not a vowel. As the saying goes, these things are left as exercises for the reader.

An Aside on Programming by Contract

The functions count_vowels and count_vowelpairs both receive as input argument a pointer to a C string. Neither function applies any validity check to the pointer, so there is an implicit precondition that the caller supply a valid pointer. To follow the rules of programming by contract [9] we should make this an explicit precondition, albeit one that is so common and obvious that it hardly seems to need stating.

There are four possibilities for the pointer passed in as argument:

Our contract imposes the precondition that the pointer must be valid as specified in the first case above. There is no reasonable way to detect the second case, and for the third case it is usual to delegate detection (and handling) to the exception mechanism of the operating system [10].

The fourth case is the interesting one. Null is a valid value for a pointer, but it is (by definition) an invalid address that will generally cause an addressing exception if dereferenced. It is, however, trivial for the function to detect that a null pointer argument has been passed. Therefore, we can propose a general rule that makes life easier for callers of a function: if a pointer to a null object is a valid argument, a null pointer should also be a valid argument with the same meaning. In the specific examples we are studying here, we should change the contract by weakening the precondition to allow the first and fourth cases, and implement the function to immediately return the value 0 if a null pointer is passed as the argument. The purpose of this rule is to relieve callers of the need to make the test for a null pointer; for a language like C the gain is not obvious, but for functional style languages like Lisp the gain is significant.

Programming by contract places responsibility squarely on the shoulders of the caller for ensuring that specified preconditions for a function are met, and relieves the function of any responsibility for verifying that preconditions are met. As intended, this has the desirable effect of eliminating redundant validity checking. In real life, however, it also has an undesirable side effect: if the caller makes a mistake and does not in fact meet the preconditions, the function is likely to fail but the cause of the failure may be difficult to track down. A pointer that is null or contains an invalid address is usually easy to diagnose, since an exception will be caused immediately an attempt is made to deference it, but other errors may cause failures far removed from the root cause. Therefore it is good practice for a function to perform reasonable validity checks for its preconditions, and in the C language the preferred technique is to use assertions. "Reasonable" is, of course, a slippery word; it would be reasonable, for example, to test whether an integer lies within a specified range, but not to test whether a binary search tree was indeed valid.

Lessons

The first thing that strikes me is how easy it is to write programs that are almost right, and how hard it is to write ones that are exactly right. The original bug is a result of some unfortunate choices in the syntax of C (for example, the overloading of the break statement), but even apart from the specific error the whole style of the original program is really bad [11]. However, it's quite possible to get large programs with equally poor style to work, more or less. I know because I have seen them and even, I am ashamed to say, written them (before I knew any better).

It is possible to get away with programming in poor style because, unlike with physical artifacts, the cost of destructive testing of a program is very small. Therefore it is economically feasible to build a program by an iterative process of construction and testing ("cut and try") in a way that would be out of the question for, say, an automobile or an aircraft. Just imagine what software development would be like if any processor exception caused the CPU to be destroyed.

Unfortunately, it is quite infeasible to perform an exhaustive test of even a small program, and so any program developed by cut and try is bound to contain errors, even serious ones. It's perfectly reasonable to argue that because destructive testing of software is very cheap the development process should take advantage of this; but the current sad state of software development is due in large part to driving this argument way past its logical conclusion to an irrational extreme. Testing is necessary and even reasonably effective, but we also have to go about developing code in such a way that for the most part we do not introduce errors in the first place [12].

When we are talking about component-based development and modularity, what we are really talking about is abstraction. What I see in the switch statement in the original piece of code is a total lack of abstraction. To me, the test of whether a given character is a vowel is a simple test of set membership, but I see no reflection of that insight in the original code, and I am forced to conclude that the programmer did not conceive of it in that way at all. Yet it's difficult to see how anyone can make effective use of component-based development without thinking in terms of programming abstractions.

Undoubtedly, part of the problem is the programming languages that are used. C is a fine language in its way, but its direct support for abstraction is limited to functions, macros and arrays. Pascal provides sets (though they are limited in cardinality), and it would be entirely natural for a Pascal programmer to write is_vowel as "c in ['a', 'e', 'i', 'o', 'u']"; thus a Pascal programmer is more used to thinking in terms of set operations, which are inherently more abstract than tests and assignments. My use of strchr is, of course, nothing but a C implementation of a set membership test, but how many C programmers would see it that way?

It seems to me that we must encourage and help developers to think in a more abstract way about the programming task. I suspect, however, that this would be an uphill battle. Languages such as APL, Lisp, or Prolog that encourage or require a more abstract view of programming have never been widely popular and are usually regarded as hard to learn. On the other hand, Python, which nicely combines aspects of procedural, object-oriented and functional languages, has proved quite popular, so there are some grounds for optimism.

In my series of transformations on the original program, I was guided, as much as anything, by an essentially aesthetic sense, most clearly in my feeling that there was something wrong about the nonterminating for loop, and my description of the if statement as ugly. There is a classic quote from the theoretical physicist Paul Dirac: "It seems that if one is working from the point of view of getting beauty in one's equations, and if one has really a sound insight, one is on a sure line of progress." Programs are very rarely beautiful in the Dirac sense [13], and we are usually limited to trying to get the ugliness out, rather than the beauty in, but the essence of Dirac's point still applies. We need to instill in developers an intuitive sense of what makes a program beautiful or ugly.

The final lesson I draw is how dangerous it is to make changes to code. It is so easy unwittingly to violate essential preconditions in the existing code, which are usually only implicit and often not completely known even by the original programmers. Furthermore, such violations are often not revealed by even quite thorough testing. The only other known countermeasures, which are not wholly effective, are extreme care on the part of the developers making changes, and thorough review, preferably by people intimately familiar with the existing code.


Footnotes

  1. A helpful reader has pointed out that in practice an experienced Java programmer would almost certainly write the loop like this, to avoid calling the length function more than once:
    for (int i = 0, imax = s.length(); i < imax; i++)
  2. Because continue is a sort of "do nothing" statement, rather similar to an if-else statement with a null if part; and, of course, continue, like break, is a clandestine goto. In production code I occasionally find a need for break but I have never used continue.
  3. I will readily agree that this is an extremely tiny example, but the string functions of the C run-time library really do form a component, albeit within the strictly procedural paradigm.
  4. Admittedly, this is a rather farfetched and contrived example, but it does illustrate the principle.
  5. An 8-bit character has 256 possible values, 128 of which are valid ASCII characters. However, C often promotes a char to an int, and in fact will do so in the course of passing the char argument, so the claim that this function may be exhaustively tested is perhaps a little optimistic.
  6. We checked with our client and that is really how it is to be defined; if you think that once a pair has been identified both members of the pair should be removed from further consideration that's fine, but it's not what the client wants.
  7. In fact, I noticed this bug as I was thinking about the succession of code changes described here, and tested the code only to confirm my deduction of what would happen.
  8. I agree that the implementation using strchr is in fact incorrect as it stands, and should be corrected so as not to classify the null character as a vowel. Note that when we originally did the refactoring, count_vowels in fact guaranteed that is_vowel would never be called with the null character as argument, so the implicit precondition could never be violated. Indeed, we did not consider that aspect of the matter at the time. Also, a C programmer is predisposed never to think of the null character as an actual character and so is unlikely to envision null being passed as an argument. That's why oversights of this sort happen.
  9. Bertrand Meyer has been the most prolific author on this topic. See, for example, his well-known book Object-Oriented Software Construction and the paper Applying "Design by Contract".
  10. The best way to handle exceptional conditions is a large topic in its own right that I do not have space to cover here.
  11. Given the context, I am sure the code fragment as presented by Gimpel was abstracted from a larger piece of code and stripped to its bare essentials. I'm not sure whether this would make its style better or worse than that of the original.
  12. This hints at the long-running and contentious argument about the utility of formal methods, an argument that's far too complicated to cover here. Close readers may be able to discern where my sympathies lie. Relevant material can be found in:
  13. David Parnas, "Why Software Jewels Are Rare", IEEE Computer, February 1996, p. 57. Available (for the moment, at least) here.

© Copyright 2002-2005 Michael J. Harrison. All rights reserved.

<mike@mjharrison.com>