Your program fails. How can this be? The answer is that the programmer creates a defect in the code. When the code is executed, the defect causes an infection in the program state, which later becomes visible as a failure. To find the defect, one must reason backward, starting with the failure. This chapter defines the essential concepts used when talking about debugging and hints at the techniques discussed subsequently – hopefully whetting your appetite for the remainder of this book.
MY PROGRAM DOES NOT WORK!
Oops! Your program fails. Now what? This is a common situation that interrupts our routine and requires immediate attention. Because the program mostly worked until now, we assume that something external has crept into our machine – something that is natural and unavoidable; something we are not responsible for – namely, a bug. If you are a user, you have probably already learned to live with bugs. You may even think that bugs are unavoidable when it comes to software. As a programmer, though, you know that bugs do not creep out of mother nature into our programs. (See Bug Story 1 for an exception.) Rather, bugs are inherent parts of the programs we produce. At the beginning of any bug story stands a human who produces the program in question.
The following is a small program I once produced. The sample program is a very simple sorting tool. Given a list of numbers as command-line arguments, sample prints them as a sorted list on the standard output ($ is the command-line prompt).
$ ./sample 9 7 8
Output: 7 8 9
$ _
Unfortunately, sample does not always work properly, as demonstrated by the following failure:
$ ./sample 11 14
Output: 0 11
$ _
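What "work properly" means for a sorting tool can be written down as an explicit check. The following is a minimal sketch in Python, not part of sample itself; the function name is_correct_sort is a hypothetical helper introduced only for illustration. It assumes that a correct run prints exactly the numbers it was given, in ascending order.

```python
from collections import Counter


def is_correct_sort(args, output):
    """Return True if output is args rearranged into ascending order.

    Hypothetical helper for illustration; not part of the sample program.
    """
    # Every element must be less than or equal to its successor.
    is_sorted = all(a <= b for a, b in zip(output, output[1:]))
    # The output must hold the same values, the same number of times each.
    same_items = Counter(args) == Counter(output)
    return is_sorted and same_items


# The two runs shown above, with the printed values entered by hand.
print(is_correct_sort([9, 7, 8], [7, 8, 9]))  # True
print(is_correct_sort([11, 14], [0, 11]))     # False
```

The second run is rejected even though its output is in ascending order and has the right length: the values printed are not the values passed in.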
BUG STORY 1
The First Bug
We do not know when the first defect in a program was introduced. What we know, though, is when the first actual bug was found. It may have been in search of plant food, or a warm place to lay its eggs, or both. Now it wandered around in this humming, warm machine that constantly clicked and rattled. But suddenly, it got stuck between the metal contacts of a relay – actually, one of 13,000 high-performance relays commissioned for this particular machine. The current killed it instantly – and its remains caused the machine to fail.
This first actual bug was a moth, retrieved by a technician from the Harvard Mark II machine on September 9, 1947. The moth got taped into the logbook, with the comment “1545 Relay #70 Panel F (moth) in relay. First actual case of bug being found.” The moth thus became the living proof that computer problems could indeed be caused by actual bugs.
Although the sample output is sorted and contains the right number of items, some original arguments are missing and replaced by bogus numbers. Here, 14 is missing and replaced by 0. (Actual bogus numbers and behavior on your system may vary.) From the sample failure, we can deduce that sample has a bug (or, more precisely, a defect). This brings us to the key question of this chapter:
HOW DOES A DEFECT CAUSE A FAILURE, AND HOW CAN WE FIX IT?
FROM DEFECTS TO FAILURES
In general, a failure such as that in the sample program comes about in the four stages discussed in the following.
1. The programmer creates a defect. A defect is a piece of the code that can cause an infection. Because the defect is part of the code, and because all code is initially written by a programmer, the defect is technically created by the programmer. If the programmer creates a defect, does that mean the programmer was at fault? Not necessarily. Consider the following: