Plagiarism in a Programming Context

Mon, Jan 5, 2015

This post was originally written with students in introductory programming courses in mind. It was originally posted on the A-State CS Department wiki in January, 2015.

Foreword

Students first encountering a programming course often have some confusion or misunderstandings about what plagiarism means in the context of programming, and why it is unacceptable. The following is an attempt to address each of these misunderstandings.

What is Plagiarism?

Let’s start with a definition. According to Merriam-Webster (http://www.merriam-webster.com/dictionary/plagiarism), plagiarism is:

the act of using another person’s words or ideas without giving credit to that person : the act of plagiarizing something

That definition is sufficiently general to apply to plagiarism of any kind. Since we are concerned specifically with plagiarism of programming code here, consider the following specialization of the definition:

Plagiarism with respect to programming code is the act of using another person’s implementation of an algorithm without giving credit to that person.

The most obvious way to plagiarize programming code it to directly copy it. Unfortunately, computers make this a trivial task. It is up to your own ethics as a programmer to avoid the temptation to copy code.

But Code Reuse is Good, Right?

One of the guiding Principles of good software design is to “Reuse, Reuse, Reuse”. A programmer should never spend time re-developing a tool that already exists in the same form. A single correct tool should be built for every unique task, then those tools should be re-used whenever that task is encountered. The caveat here is that someone has to build the tool the first time. A second caveat is that you cannot expect to correctly use a tool you don’t understand. And in programming, the most sure way to understand how a piece of software works it to write it yourself (if only once, for the exercise). In fact, learning to program is more like learning a craft such as cooking than like learning a liberal art such as history or reading (although there is certainly a language aspect as well). This brings us to the next point:

Programming is a Craft

Computer Science concerns itself with the fundamental capabilities of computing machines and the expression of algorithms in such a way that they can be executed correctly by those machines. The day-to-day business of how we actually express those algorithms to build useful things is the focus of the study of computer programming. Programming itself is more closely akin to a craft than to a science. Computer science studies the things that are possible and then it is up to programmers to make those things a reality. In the same way, physical science taught us about electricity and now electricians and electrical engineers can make useful things with that knowledge.

Programming is a necessary craft for computer scientists — in order to design experiments, and test hypotheses, a computer scientist must be able to interact with the computing machine s/he is studying. Programming provides the set of tools that make this possible.

Programming is a desirable craft for other disciplines — modern advances in mathematics, physics, chemistry, and biology have come as a result of the application of computational power to those areas. This is only possible because programmers applied their tools to those problems. This means that scholars from almost every discipline would be better off if they learned the craft of programming. The tools provided by programming create a “force multiplier” effect — you can do so much more with the help of a computer (through programming) than you can without it.

Crafts Require Practice

To make this point, consider programming alongside another common craft: Woodworking. There are plenty of books available on the subject. The Complete Manual of Woodworking by Albert Jackson, David Day, and Simon Jennings is a good place to start. It contains chapters ranging from “Wood: The Raw Material” to “Joinery” and “Wood Carving”. In Chapter 2, “Designing in Wood”, the authors include a section “Principles of Chair Construction”. The section, like the rest of the book, is fully illustrated. It discusses how the seat angle should relate to the angle of the chair back, and how this is affected by the human body and how it in turn affects the posture (and comfort) of the person sitting in the chair. Suggested measurements are given, and a discussion of methods of joinery involved in different styles of chairs is presented. Even the order of construction is clearly stated.

Would you expect an average person — most likely not having a background in woodworking — to be able to successfully create a quality chair after reading this book and passing a test over the terms and concepts? Probably not. What is missing here is practice. In order to learn the craft of woodworking, one must practice it by building things. Early projects are likely to fail, but over time the woodworker will become more and more proficient through a combination of trial-and-error and repetition.

Just like you would not commission a “book educated” woodworker with no practical experience to build a fine set of furnishings for a fancy dining hall, you would not hire a “programmer” with no actual programming experience to build a payroll application for your company. The results in either case would be disastrous, and the money spent would have been wasted. Programming simply cannot be learned from a book alone. As with any craft, practice is a necessity. Trial and error, struggle, and overcoming difficulty eventually lead to experience (and maybe enlightenment). The basic things get easier, and learning new concepts becomes faster because of this prior experience. The quality of the programmer’s work will improve over time. The “novice” craftsman eventually becomes skilled, and maybe progresses to become a “master” of the craft.

Coding Practice is a Solitary Pursuit

Just as with the woodworking example, programming requires every skill to be practiced in order to achieve mastery. You cannot allow someone else to build the tables while you build the chairs, and then honestly claim that you understand everything that went into building the dining suite. Likewise, a programmer must learn to develop programs from “first principles” by practicing those skills. Programming is a unique craft in that we first build the tools that we use later to build even more powerful tools, and then eventually use those to build the final product. You cannot expect to jump directly to the final product without any understanding of the parts (and tools) used to create it. Of course, you will eventually get to collaborate with a team on a large project, but this only works when every team member trusts that every other team member deeply understands the way each part will interact to produce the final product.

Plagiarism in programming amounts to going to the local furniture store and buying the dining room suite, then trying to claim it was an original design of your own. In some cases, you might get by with this approach. But when your customer comes back later and wants a unique piece of furniture the likes of which have never been seen before — well, then you will be exposed as a fraud. The craft of programming involves constantly solving new and novel problems by combining the tools and building blocks developed along the way in new and different ways. It requires an intricate understanding of what is happening in every piece of the code in order to create something truly new and different. There is no shortcut to the mastery required for this; it comes only through practice and struggle.

Plagiarism isn’t Just Copying

It was stated earlier that the most obvious form of plagiarism is direct copying of code. This is not the only way to end up in violation of the Academic Misconduct Policy in the Plagiarism section. Plagiarism broadly encompasses several related ethical violations that relate to producing code that isn’t original. The set of related ethical violations that fall into the “plagiarism” category with respect to grading in a programming course are:

directly copying code
- even if some changes are made to it
re-typing someone else’s implementation of an algorithm
working together with another student
- even if you are typing the programs independently, but working together on the contents
re-using someone else’s code from a previous semester
- even if you make some changes to it
using code from the Internet
- even if you type it yourself, not just a copy/paste
allowing someone else to write your program for you

To sum it up succinctly, it is a violation of the Plagiarism and Academic Misconduct Policy if you turn in code that was not completely produced by your own industry as a result of your own ingenuity.

The Reality of the Internet

The Internet is a modern programmer’s most valuable source of reference material; you are encouraged to make use of it as such. A danger posed by this is that you may uncover a solution to the exact problem you are trying to solve (or a nearly identical solution). This leaves you with a serious ethical dilemma. If you use the solution you just discovered, you are in clear violation of the Academic Misconduct Policy. But you had good ethical intentions with respect to searching for the information that led you here. What should you do?

The answer is that you should absolutely not use the code that you have discovered and try to claim it as your own. You have two possible choices:

Close the page immediately and (later) produce your own independent solution without consulting it again.
Produce your own solution (inspired by, or perhaps making use of the one you have seen) and cite the source of the original version.

The first choice may seem like the “best option” at first — but there is a problem. If the code you saw was reasonably short, there is a good chance it “imprinted” on you when you read through it. It is impossible to “unsee” the solution once you have seen it. There may be no way — even after closing the page — to produce a solution that would not be the same as the one you saw. If this is the case, then option 2 is your only ethical choice. If the code was long enough that you think you can close the page and then produce an original solution, a recommendation would be to leave the room for a while (go get a coffee or something) then come back and see if you can write a solution to the problem without consulting the page again. If you can, you are probably ethically in the clear... It would be a good idea to consider citing the original inspiration for your solution even in this case though, as “imprinting” probably happened to some extent whether you realize it or not.

The second choice — cite the source — is the way a situation like this should most often be handled in industry. If the source was a good solution to the problem, and was free of any licensing encumbrance, then you are probably free to use it with just a citation. You should always cite the source of the code, not just for ethical reasons, but also so that you can return to it later for reference in case something needs to be fixed or changed.

In the context of a programming course, the second option (citing the source) is still not ideal, since you have not actually produced the solution yourself (and you didn’t gain the experience that comes with struggle). As such, you probably won’t receive “full credit” for a solution that wasn’t original, but at least you are not in violation of any ethics policies.

In Summary (or TL;DR)

Learning to program is akin to learning a craft like cooking, pottery, or woodworking. You cannot successfully master the necessary skills without significant time spent practicing those skills on your own. Through struggle, trial-and-error, and overcoming difficulty, experience and mastery are gained. Any attempt to “take a shortcut” by using someone else’s code, whether by blatantly copying or by re-producing manually is not only dishonest (and a violation of the Academic Misconduct Policy), it is also a waste of time. Your time is too valuable to waste; spend it practicing and improving, not plagiarizing.

Fixing Pointer-Related Segmentation Faults using GDB

Sat, Mar 22, 2014

This post was originally written for the A-State CS Department wiki on March 22, 2014. It provides a beginner’s overview of using gdb.

So, your program crashed...

While testing a program involving pointers, it is common to experience a program crash related to a bad pointer value somewhere in the code. On Linux systems, the message displayed when your program crashes is often “Segmentation Fault” — meaning the actual “crime” your code committed (as judged by the OS) was to try to access memory outside the “segment” in which it was allocated. Without getting to far into the details of how operating systems handle memory, suffice it to say that this shouldn’t have happened unless you have mis-handled a pointer (or an array access) in some way.

Segmentation faults and other crashes related to pointer mis-handling can be tricky to track down using program tracing techniques (like adding “debug output”) alone. To find the source of these problems, it really helps to be able to “see” exactly what the program was trying to do at the moment that the crash occurred. Debuggers are special programs designed to help us do exactly that. One such debugger that is available on our class virtual servers is the GNU Project Debugger, known as “GDB” and accessed through the ‘gdb’ command.

What is GDB?

The following was taken from the official GNU Project Debugger website, under the heading “What is GDB?”:

GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.

GDB can do four main kinds of things (plus other things in support of these) to help you catch bugs in the act:

Start your program, specifying anything that might affect its behavior.

Make your program stop on specified conditions.

Examine what has happened, when your program has stopped.

Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.

The program being debugged can be written in Ada, C, C++, Objective-C, Pascal (and many other languages). Those programs might be executing on the same machine as GDB (native) or on another machine (remote). GDB can run on most popular UNIX and Microsoft Windows variants.

Learn by Example

GDB has a substantial user manual, and provides an extensive set of commands for controlling the debugging session. In reality, you can get a lot of benefit from GDB with only a small subset of the total available command set. This will be shown by working through an example below, given some code that is crashing with a Segmentation Fault error. The commands that will be used in this example are:

The Program

The program for this example will consist of a single header file (olist_with_bug.h) and a single source file (debugexample.cpp). The header file contains a class definition and implementation for a singly-linked ordered list. The source file contains a main() function and some code to fill the list, print it to the screen, remove some elements, and print it out again. The source code for each file is shown below:

`olist_with_bug.h`

/**
 * @file olist.h
 * Defines OList Ordered Linked List
 *
 * OList is a minimally functional singly-linked ordered list
 * designed to contain integer values.
 */

#ifndef OLLIST_H
#define OLLIST_H

#include<iostream>
#include<string>

/**
 * OList is a minimally functional singly-linked list of integer
 * values.  
 */
class OList{
    public:
        // ctor, dtor
        OList()     : head(0) , m_count(0)  {                   }                       
        ~OList()                            { erase();          }                       

        // methods
        void          insert(int value);                                                    /// insert a new item
        bool          remove(int value);                                                    /// remove an item, if it is there
        bool          contains(int value) const;                                            /// see if the list contains an item
        int           count() const         { return m_count;   }                           /// get the current number of items in the list
        bool          is_empty() const      { return head == 0; }                           /// see if the list is empty
        void          erase();                                                              /// erase all values in the list
        std::ostream& print(std::ostream& stream=std::cout, std::string delim=", ") const;  /// print the list to the specified stream

    private:
        /**
         * container node for an integer and a pointer to the next item in the list.
         */
        class Node {
            public:
                // ctor
                Node(int value) 
                    : data(value), next(0)  { }                                             /// Node requires data value to construct
                
                // Attributes:
                int   data;                                                                 /// data value
                Node* next;                                                                 /// pointer to the next Node in the list
        };

        Node* _find(int value) const;
        Node* _find(int value, Node*& previous) const;

        Node* head;
        int   m_count;
};

/**
 * inserts a new data value in the ordered list
 * @param value the new data item (an integer) to insert
 */
void OList::insert(int value){
    OList::Node* previous;
    OList::Node* newnode = new Node(value);
    _find(value, previous);
    if(!is_empty()){
        newnode->next  = previous->next;
        previous->next = newnode;
    }
    else{
        head           = newnode;
    }
    m_count++;
}

/**
 * remove an item from the list if it is there
 * @param  value item to remove
 * @return       true if item was found and removed, false otherwise
 */
bool OList::remove(int value){
    OList::Node* previous;
    OList::Node* victim = _find(value, previous);
    bool  found  = victim != 0;
    if(victim){
        previous->next = victim->next;
        if(victim == head){
            head = victim->next;
        }
        delete victim;
        m_count--;
    }
    return found;
}

/**
 * returns true if the list contains the value
 * @param  value item to look for
 * @return       true if item is found in the list, false otherwise
 */
bool OList::contains(int value) const {
    return _find(value) != 0;
}

/**
 * erase the entire list, leaving it empty
 */
void OList::erase(){
    for(OList::Node* current = head; current != 0;){
        OList::Node* victim = current;
        current             = current->next;
        delete victim;
    }
    m_count = 0;
    head    = 0;
}

/**
 * print the list to the stream specified by the first parameter, 
 * delimited by the string given in the second parameter
 * @param  stream std::ostream stream to send output to
 * @param  delim  string to place between items (defaults to comma-separated)
 * @return        the modified stream object is returned
 */
std::ostream& OList::print(std::ostream& stream, std::string delim) const{
    for(OList::Node* current = head; current != 0; current=current->next){
        stream << current->data << (current->next != 0 ? delim : "");
    }
    return stream;
}

/**
 * (internal use) - finds location of a value in the list, 
 * @param  value value to find
 * @return       pointer to the Node where `value` was found, or NULL if not found
 */
OList::Node* OList::_find(int value) const{
    OList::Node* n;
    return _find(value, n);
}

/**
 * (internal use) - finds location of a value in the list, along with
 * the node immediately to its "left" (the previous node) in the list.
 * In cases where the node isn't in the list, the previous pointer will
 * be set to the node immediately to the "left" of where the node should
 * be located in the list.  This makes the function useful for determining
 * where a node should be added in the list.
 * @param           value    value to find
 * @param[out]      previous pointer to the previous Node in the list (or NULL if the item's position is at head)
 * @return                   pointer to the Node where `value` was found, or NULL if not found
 */
OList::Node* OList::_find(int value, Node*& previous)const {
    OList::Node* current = head;
    previous             = 0;
    while(current && current->data < value){
        previous = current;
        current  = current->next;
    }
    return current && current->data == value ? current : 0;
}

#endif

`debugexample.cpp`

#include<iostream>
#include<cstdlib>
#include<ctime>
#include<string>

//#include "olist_with_bug.h"
#include "olist_with_bug.h"


int main()
{
    // Un-comment the appropriate line below to
    srand(time(0));     // produce different pseudo-random values on every run
    //  ... or ... 
    //srand(5);           // Produce the same set of "random" values every time.
    OList l;
    
    // Add some values
    for(int i = 0; i < 30; i++){
        l.insert(rand() % 100);
    }
    std::cout << "Initial List: ";
    l.print() << "\n\n";

    // Remove some values:
    for(int i = 0; i < 30; i++){
        l.remove(rand() % 100);
    }

    std::cout << "Final List: ";
    l.print() << "\n\n";

    return 0;
}

Compile and Run

At this point, the usual thing to do is to try to compile and run the program as usual. The corresponding terminal session is shown below:

exampleuser@exampleserver:~/OList_with_error$ g++ -W -Wall -ansi -pedantic debugexample.cpp 
exampleuser@exampleserver:~/OList_with_error$ ./a.out 
Initial List: 3, 7, 8, 12, 17, 20, 21, 22, 24, 25, 25, 30, 33, 33, 36, 48, 53, 53, 57, 61, 63, 64, 71, 78, 82, 85, 86, 87, 90, 92

Final List: 3, 7, 8, 12, 17, 20, 21, 25, 25, 30, 33, 33, 53, 53, 57, 61, 63, 64, 71, 82, 87, 90, 92

exampleuser@exampleserver:~/OList_with_error$ ./a.out 
Segmentation fault

Notice here that on the first run, everything went OK. The main program adds data values at random to the list, then tries to remove them at random... In some cases, the program may run as expected. But in other cases, the program may crash with a Segmentation fault, and it did that on the second run.

This behavior is typical of the way these sorts of errors behave in the real world. It is often the case that these errors only show up intermittently; reproducing them during testing can sometimes be difficult. A good approach is to make sure that your testing code exercises all possible “edge conditions” in the source - for an ordered list, that would mean adding to an empty list, adding a value that would be inserted at the head, and adding a value that would be inserted at the end of the list (in addition to the “normal” case of the value being added in the middle somewhere).

Compiling for Debugging

The compiler flags shown above did not produce an executable file that was ideal for debugging. To be most useful, the debugger will need information like the original source code, the line numbers, and the symbol (variable and function) names. This information is removed from the executable produced by the compiler by default (and for good reason -- it increases the output file size significantly). But since the program crashed and a debugging session would be helpful, the program should now be re-compiled in such a way that the compiler will produce a debugger-friendly executable.

Since the debugger used in this example is GDB, the compiler flag -ggdb will be added to the g++ command (the -g flag tells the compiler to produce debugging information, and the -ggdb flag says to produce that information optimized for the GDB debugger). The full command is:

g++ -W -Wall -ansi -pedantic -ggdb debugexample.cpp

No additional compiler output is shown on the console window, but the size of the executable file a.out has increased (on the system being used to produce this example, it was ~13KiB before and ~40KiB after).

The Debugging Session

Now that the program has been compiled with debugging information included, a debugging session can be initiaited. The command to start GDB is ‘gdb’, and the program’s executable name should be passed as a command-line argument. The full command to start debugging this example (whose executable is named a.out) is shown below:

gdb a.out

This assumes that a.out is in the current working directory. The actual terminal session (with the command and its output) is shown below:

exampleuser@exampleserver:~/OList_with_error$ gdb a.out
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/exampleuser/OList_with_error/a.out...done.
(gdb)

At this point, GDB has loaded the executable file and is ready for a command. The (gdb) at the end is GDB’s own command prompt. First thing’s first - run the program to try to reproduce the error:

(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x08048969 in OList::insert (this=0xbffff6b4, value=35) at olist_with_bug.h:65
65          newnode->next  = previous->next;
(gdb)

In this case, the error occurred on the first try. If it had not, the run command could be repeated until the crash occurred.

Notice that the function call where the Segmentation fault occurred is shown (OList::insert()) and the actual parameters are also shown. At this point it is clear that the object being operated on (this) is located at address 0xbffff6b4, and the value passed to the value parameter was 35. There is nothing obviously wrong with either of those actual parameters, so continue looking...

The line of code that was actually being executed when the SIGSEGV occurred was line 65 of olist_with_bug.h, which contains the code:

newnode->next  = previous->next;

So, something bad happened on that line of code. Notice that there are two pointer dereferences (by way of the -> operator) on this line. Either of those could be to blame. We need to know what values are contained in the newnode and previous pointers. To find out, the command backtrace full can be used to show the function call stack as well as the values of any local variables along the way:

(gdb) backtrace full
#0  0x08048969 in OList::insert (this=0xbffff6b4, value=35) at olist_with_bug.h:65
        previous = 0x0
        newnode = 0x804c038
#1  0x08048c63 in main () at debugexample.cpp:20
        i = 3
        l = {head = 0x804c008, m_count = 3}
(gdb)

From this output, it is clear that previous is NULL (contains the value 0x0, which is zero expressed as hexidecimal). Therefore, it was the attempt to dereference previous (in previous->next) that caused the fault. Dereferencing a NULL pointer is always illegal.

So the question is: “How did previous come to be NULL in the first place?”

To answer this, we can consult the source code itself. The code from the insert() function is shown below:

void OList::insert(int value){
    OList::Node* previous;
    OList::Node* newnode = new Node(value);
    _find(value, previous);
    if(!is_empty()){
        newnode->next  = previous->next;        // how is `previous` NULL here?
        previous->next = newnode;
    }
    else{
        newnode->next  = head;
        head           = newnode;
    }
    m_count++;
}

Looking at the path taken to arrive at that particular line, we see that previous is intended to receive its value on the line:

    _find(value, previous);

Which is line 63 in the file. This means that the value being placed in previous by the _find() method is NULL. So the next thing to do is examine the _find() method to see if there is a bug there, or perhaps it is just being used incorrectly. The _find() method (along with its documentation) is:

/**
 * (internal use) - finds location of a value in the list, along with
 * the node immediately to its "left" (the previous node) in the list.
 * In cases where the node isn't in the list, the previous pointer will
 * be set to the node immediately to the "left" of where the node should
 * be located in the list.  This makes the function useful for determining
 * where a node should be added in the list.
 * @param       value    value to find
 * @param[out]  previous pointer to the previous Node in the list (or NULL if the item's position is at head)
 * @return               pointer to the Node where `value` was found, or NULL if not found
 */
OList::Node* OList::_find(int value, Node*& previous)const {
    OList::Node* current = head;
    previous             = 0;
    while(current && current->data < value){
        previous = current;
        current  = current->next;
    }
    return current && current->data == value ? current : 0;
}

From reading the method’s documentation, it seems that previous will indeed be set to NULL if the item’s position is at the head of the list. This makes sense — there is nothing to the logical “left” of the head of a singly-linked list, after all.

So it probably isn’t an error that the _find() method is setting previous to NULL; it just means that the item that is being inserted belongs to the logical “left” of the current head of the list (thus becoming the new head item). Has this situation been properly handled in the insert() method? Take another look:

void OList::insert(int value){
    OList::Node* previous;
    OList::Node* newnode = new Node(value);
    _find(value, previous);                  
    if(!is_empty()){
        newnode->next  = previous->next;        
        previous->next = newnode;
    }
    else{
        head           = newnode;
    }
    m_count++;
}

Now that it is clear that previous will be set to NULL any time the new node belongs at the head of the list, it is worth listing the situations in which that will be the case:

the list is empty, meaning that any item inserted will become the head

the item being inserted is smaller than the item currently at head, and should become the new head of the list

Examining the code carefully shows that the “empty list” case has been planned for, but the “new smallest item” case has not. In either of these cases, the new node’s next pointer should be pointed to the current head (which may be NULL if the list is empty, but that is not a problem). Then, the head pointer must be pointed to the new node (the item being inserted becomes the first item). In both cases, the previous pointer will be set to NULL by the call to _find(), so that can be used as the condition in the if, replacing the check for “not empty”:

    if(previous != 0){      // replaces "if(!is_empty()){"

The else clause must also be modified to properly set the new node’s next pointer (for the “new first item” case):

    else{
        newnode->next  = head;      // link in `newnode` in case list is non-empty
        head           = newnode;
    }

This means that the modifed insert() function should look like:

void OList::insert(int value){
    OList::Node* previous;
    OList::Node* newnode = new Node(value);
    _find(value, previous);                  
    if(previous != 0){             // if list is non-empty and new node isn't a minimum
        newnode->next  = previous->next;        
        previous->next = newnode;
    }
    else{
        newnode->next  = head;     // link in `newnode` in case list is non-empty
        head           = newnode;
    }
    m_count++;
}

Now it is once again time to compile and test the code. The GDB debugging session is still running, so it should be stopped before re-compiling:

(gdb) quit
A debugging session is active.

    Inferior 1 [process 5617] will be killed.

Quit anyway? (y or n) y
exampleuser@exampleserver:~/OList_with_error$

The last line shown indicates that the shell prompt is back and waiting for a command. The compile (with debugger output) and test session (testing directly in GDB just in case) is shown below:

exampleuser@exampleserver:~/OList_with_error$ g++ -W -Wall -ansi -pedantic -ggdb debugexample.cpp
exampleuser@exampleserver:~/OList_with_error$ gdb a.out
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/exampleuser/OList_with_error/a.out...done.
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out 
Initial List: 4, 7, 10, 11, 15, 20, 24, 25, 29, 35, 35, 36, 37, 39, 39, 41, 42, 48, 56, 57, 65, 67, 68, 73, 73, 82, 82, 86, 89, 96

Final List: 4, 7, 10, 11, 15, 20, 25, 29, 35, 35, 37, 39, 39, 48, 56, 57, 65, 68, 73, 73, 82, 89, 96

[Inferior 1 (process 6529) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out 
Initial List: 0, 3, 13, 20, 20, 22, 23, 33, 34, 35, 38, 44, 46, 48, 49, 53, 58, 60, 62, 64, 68, 70, 70, 74, 81, 90, 90, 94, 95, 99

Final List: 0, 3, 13, 20, 20, 22, 23, 33, 34, 35, 38, 44, 46, 48, 49, 53, 60, 62, 68, 70, 70, 74, 81, 90, 99

[Inferior 1 (process 6533) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out 
Initial List: 1, 6, 6, 7, 9, 11, 12, 13, 14, 15, 16, 17, 18, 24, 25, 31, 34, 38, 42, 44, 57, 65, 66, 68, 69, 69, 70, 73, 90, 94

Final List: 1, 6, 6, 7, 9, 11, 14, 16, 17, 18, 24, 31, 34, 44, 65, 66, 73, 90, 94

[Inferior 1 (process 6534) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out 
Initial List: 1, 6, 14, 15, 16, 17, 19, 26, 34, 37, 38, 39, 40, 42, 43, 47, 50, 50, 51, 52, 57, 58, 63, 63, 69, 69, 70, 72, 80, 98

Final List: 1, 6, 17, 19, 26, 34, 37, 38, 39, 40, 42, 43, 47, 50, 50, 51, 57, 58, 63, 69, 69, 70, 72, 98

[Inferior 1 (process 6537) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out 
Initial List: 1, 4, 6, 12, 16, 17, 18, 21, 22, 27, 37, 42, 42, 43, 46, 48, 50, 51, 51, 56, 57, 61, 65, 74, 77, 92, 94, 94, 95, 96


Program received signal SIGSEGV, Segmentation fault.
0x080489d5 in OList::remove (this=0xbffff6b4, value=1) at olist_with_bug.h:85
85          previous->next = victim->next;
(gdb)

As you can see, the program ran well four times before crashing with another Segmentation fault on the fifth run!

This error is in a different place (line 85 in the remove() function). Once again, the actual parameters look reasonable. There are two pointers being dereferenced in that line, so a backtrace with local variable values might help shed some light on what happened:

(gdb) backtrace full
#0  0x080489d5 in OList::remove (this=0xbffff6b4, value=1) at olist_with_bug.h:85
        previous = 0x0
        victim = 0x804c0c8
        found = true
#1  0x08048d29 in main () at debugexample.cpp:27
        i = 29
        l = {head = 0x804c0c8, m_count = 20}
(gdb)

Once again, the previous pointer seems to be to blame, containing a NULL value at the time it is being dereferenced. Back to the source code! The remove() function looks like:

bool OList::remove(int value){
    OList::Node* previous;
    OList::Node* victim = _find(value, previous);
    bool  found  = victim != 0;
    if(victim){
        previous->next = victim->next;
        if(victim == head){
            head = victim->next;
        }
        delete victim;
        m_count--;
    }
    return found;
}

And once again the _find() helper method is being used. From the earlier experience, it is clear that previous will be set to NULL if the value being searched for isn’t found or if it is found at the head of the list. A quick look a the documentation for the _find() method’s return value indicates that if the value is not found, the return value is also NULL. That leaves only one situation: The value being searched for must be at the beginning of the list.

Like before, it seems that the code is handling the “value not found” and “empty list” special case just fine (in fact, those cases are the same when trying to remove a value). It is the special case where the value is at the beginning of the list that is being ignored. In that case, the item at the head of the list is being removed... The code is already checking for this on the line:

        if(victim == head){

All that is needed is to take advantage of this already-existing check to specify that the previous pointer should only be accessed when the “victim” node is not the head node. This is most easily accomplished by adding an else clause for that if statement and moving the line causing the crash into it. The modified remove() function looks like:

bool OList::remove(int value){
    OList::Node* previous;
    OList::Node* victim = _find(value, previous);
    bool  found  = victim != 0;
    if(victim){
        if(victim == head){
            head = victim->next;
        }
        else{
            previous->next = victim->next;  // only if `victim` is not `head`!
        }
        delete victim;
        m_count--;
    }
    return found;
}

After making the code adjustment, stop the GDB session using the quit command, then re-compile and test the program. At this point the program no longer crashes under any sequence of data additions and deletions! The Segmentation fault errors have been tracked down and squashed thanks to some help from GDB.

Don’t forget to re-compile without the additional debugging information if you are intending to distribute your object/executable file (to reduce the file size).

Other Useful GDB Commands

This example only made use of the run and backtrace full commands from GDB; it is compelling that only two commands can help speed the correction of subtle coding errors so dramatically, but sometimes these two just aren’t quite enough. Adding just a few more commands to your GDB arsenal will allow you to debug a majority of simple (common) programming errors of all kinds:

run : Start program execution from the beginning of the program. Allows command-line arguments to be listed, and also allows basic I/O redirection.
backtrace : If (or when) a program has crashed, this command will show trace of where you are currently (which functions you are in). The “trace” consists of the current function call stack, where each call (referred to as a “frame”) is shown in order from most recent to earliest (which will be the entry into main()), and each is numbered for reference.
backtrace full : Just like backtrace, but local variables are also shown for each frame.
list : List the source code surrounding the current execution point, or in the currently selected frame.
print : Prints the value contained in a variable, or in some cases the result returned by a function that has been called in the current frame. print also has several options available to select the desired representation of the value.
break : create a breakpoint, which is a point at which execution of the program will stop during your debugging session so that you can examine local variables, etc.
step : “steps into” next line of code - will drop into a function call
next : like step but will not drop into a function call
continue : continue running the program as usual, until next breakpoint is encountered or program exits

Epsilon Comparison

Tue, Nov 5, 2013

This post was originally written for the A-State CS Department wiki in November, 2013.

The Numerical (In)accuracy Problem

Computers use binary (base-2) numbers internally. This means that our familiar decimal (base-10) numbers must be converted to binary before they can be stored in the computer. The problem is that some—many, actually—base-10 real numbers do not have an exact representation in base-2. This may seem odd until you consider that we deal with lots of numbers that do not have an exact representation in base-10 either. Two examples are $\pi$ and $e$. Neither of these values can be written exactly in decimal no matter how many digits you allow after the decimal point. Numbers of this variety are called irrational numbers. It turns out that many numbers that are easily written in decimal become irrational numbers when translated to binary. A simple example is the number 0.1. It is easy enough to write in decimal, but in binary it becomes impossible to represent exactly.

So what do we do when we can’t exactly represent a number? We round. If you were asked to write down $\pi$, what would you write? You would probably write down some value between 3.14 and 3.14159, depending on how precise you wanted to be. Neither number is exactly correct, but both are reasonable representations given the number of digits used. In a computer, there is a limit to the number of binary digits (bits) that may be used to represent a number. Because of this, numbers like 0.1 must be rounded to some limited precision. This rounding produces a “pretty good”, but inexact representation of 0.1.

At this point, you are probably asking yourself “why should I care?”, and that is a reasonable question. The answer is: Because it will cause errors in your programs! To help illustrate why, open the interactive Python interpreter and follow along with the following example:

First, let’s see how 0.1 is approximated in Python (Python is used for this example because it’s default numeric output mode shows more decimal places than C++). At the interactive prompt, type 0.1 and <ENTER>. You should see something like the following:

>>> 0.1
0.10000000000000001

Hmm. It looks like 0.1 gained an extra 1 in the 17^th^ decimal position! Now enter this relational expression and note the result:

0.1 == 3.1 - 3

Here’s the output that was produced on the computer used to write this page:

>>> 0.1 == 3.1 - 3
False

Why was the result False? To see the answer, let’s enter each operand into the interpreter separately and look at the result:

>>> 0.1
0.10000000000000001
>>> 3.1 - 3
0.10000000000000009

Notice that 0.1 is represented internally as 0.10000000000000001 (as we saw before). Now notice that 3.1 - 3 (which should also be 0.1) is represented internally as 0.10000000000000009. There is a difference in the 17^th^ digit following the decimal point! When we compare these two values with the == operator, the expression will return False since the two numbers (internally) are not exactly the same! We refer to this as rounding error, and if we want to compare real numbers for equality in a computer program, we must learn how to deal with rounding error properly.

A C++ Example

In C++, showing the inaccuracy is a little more difficult due to differences in the internal representation of floating-point values. The following example may or may not show the “error”, but it is likely to do so:

double pi1 = acos(-1);
double pi2 = (pi1 * (0.1 * 0.1)) / (0.1 * 0.1);
if(pi1 == pi2){
    cout << "OK, " << pi1 << " == " << pi2 << "!\n";
}
else{
    cout << "An ERROR occurred: " << pi1 << " != " << pi2 << "!\n";
}

Here, a very accurate value for $\pi$ is calculated using the inverse cosine function acos(). Then, this version of $\pi$ is used to calculate another representation (by calculating the area of a circle of radius 0.1 and then solving for $\pi$ by dividing out the $r^2$ term). In theory, these two values of $\pi$ should be exactly the same — but in the process of doing imprecise floating-point operations (the operations involving 0.1), rounding error was (possibly) introduced. Therefore, it is likely that the following output is produced:

An error occurred: 3.14159 != 3.14159!

Even worse, since the default behavior of the << operator in C++ is to round the output, you cannot even see the difference that caused the two values to be considered “not equal” (it was beyond the 5^th^ decimal place).

The Right Way to Compare Real Numbers for Equality

Ok, if we can’t use the == operator by itself to compare real numbers, how can we hope to determine if two real number values are equal? The answer is that we must accept the fact that we can’t compare real numbers exactly with a computer—not by using the built-in float type anyway. Once we have accepted that limitation, we can find a way to compare real numbers that takes the rounding error into account. We will decide on some amount of error that we will consider acceptable, then we will determine if the two values are reasonably close to equal. This makes sense in the real world, since you would probably say that both 3.1416 and 3.14159 are reasonable representations of $\pi$. To determine if two numbers are functionally equal (or “close enough”), we can ask the following question:

“Is the difference between the two values so small that we can ignore it?”

If the answer to this question is “Yes”, we will consider the numbers equal. Otherwise, we will consider them to be unequal. To answer this question, we can use an expression like the following (which compares 0.1 and 3.1 - 3 ):

In Python

abs(0.1 - (3.1 - 3)) < 1e-11

In C++

fabs(0.1 - (3.1 - 3)) < 1e-11

In the expression above, we take the difference of the two values (by subtracting them), then we look at the magnitude of that number (by taking its absolute value with abs() (in Python), or fabs() (in C++) to remove any negative sign). If the magnitude of the difference of the two numbers is less than a very small value ($1\times10^{-11}$ in this case), we consider the numbers equal. We usually refer to the small number representing allowable error as EPSILON. Thus, a comparison of this type is usually referred to as an EPSILON comparison. At the Python interactive prompt, enter the expression above and note the result. You should see something like the following:

>>> abs(0.1 - (3.1 - 3)) < 1e-11
True

For the C++ example, consider the same code testing the two values of $\pi$ as shown above, but now using an EPSILON comparison:

double pi1 = acos(-1);
double pi2 = (pi1 * (0.1 * 0.1)) / (0.1 * 0.1);

if(fabs(pi1 - pi2) < 1e-11){
    cout << "OK, " << pi1 << " == " << pi2 << "!\n";
}
else{
    cout << "An ERROR occurred: " << pi1 << " != " << pi2 << "!\n";
}

Now, you can be confident that the outcome will always be as expected:

OK, 3.14159 == 3.14159!