This post was originally written with students in introductory programming courses in mind. It was originally posted on the A-State CS Department wiki in January, 2015.
Foreword
Students first encountering a programming course often have some
confusion or misunderstandings about what plagiarism means in the
context of programming, and why it is unacceptable. The following is an
attempt to address each of these misunderstandings.
What is Plagiarism?
Let’s start with a definition. According to Merriam-Webster
(http://www.merriam-webster.com/dictionary/plagiarism), plagiarism is:
the act of using another person’s words or ideas without giving
credit to that person : the act of plagiarizing something
That definition is sufficiently general to apply to plagiarism of any
kind. Since we are concerned specifically with plagiarism of programming
code here, consider the following specialization of the definition:
Plagiarism with respect to programming code is the act of using
another person’s implementation of an algorithm without giving credit
to that person.
The most obvious way to plagiarize programming code it to directly copy
it. Unfortunately, computers make this a trivial task. It is up to your
own ethics as a programmer to avoid the temptation to copy code.
But Code Reuse is Good, Right?
One of the guiding Principles of good software design is to “Reuse,
Reuse, Reuse”. A programmer should never spend time re-developing a
tool that already exists in the same form. A single correct tool should
be built for every unique task, then those tools should be re-used
whenever that task is encountered. The caveat here is that someone has
to build the tool the first time. A second caveat is that you cannot
expect to correctly use a tool you don’t understand. And in
programming, the most sure way to understand how a piece of software
works it to write it yourself (if only once, for the exercise). In fact,
learning to program is more like learning a craft such as cooking than
like learning a liberal art such as history or reading (although there
is certainly a language aspect as well). This brings us to the next
point:
Programming is a Craft
Computer Science concerns itself with the fundamental capabilities of
computing machines and the expression of algorithms in such a way that
they can be executed correctly by those machines. The day-to-day
business of how we actually express those algorithms to build useful
things is the focus of the study of computer programming. Programming
itself is more closely akin to a craft than to a science. Computer
science studies the things that are possible and then it is up to
programmers to make those things a reality. In the same way, physical
science taught us about electricity and now electricians and electrical
engineers can make useful things with that knowledge.
Programming is a necessary craft for computer scientists — in
order to design experiments, and test hypotheses, a computer scientist
must be able to interact with the computing machine s/he is studying.
Programming provides the set of tools that make this possible.
Programming is a desirable craft for other disciplines — modern
advances in mathematics, physics, chemistry, and biology have come as a
result of the application of computational power to those areas. This is
only possible because programmers applied their tools to those problems.
This means that scholars from almost every discipline would be better
off if they learned the craft of programming. The tools provided by
programming create a “force multiplier” effect — you can do so much
more with the help of a computer (through programming) than you can
without it.
Crafts Require Practice
To make this point, consider programming alongside another common craft:
Woodworking. There are plenty of books available on the subject. The
Complete Manual of Woodworking by Albert Jackson, David Day, and Simon
Jennings is a good place to start. It contains chapters ranging from
“Wood: The Raw Material” to “Joinery” and “Wood Carving”. In
Chapter 2, “Designing in Wood”, the authors include a section
“Principles of Chair Construction”. The section, like the rest of the
book, is fully illustrated. It discusses how the seat angle should
relate to the angle of the chair back, and how this is affected by the
human body and how it in turn affects the posture (and comfort) of the
person sitting in the chair. Suggested measurements are given, and a
discussion of methods of joinery involved in different styles of chairs
is presented. Even the order of construction is clearly stated.
Would you expect an average person — most likely not having a
background in woodworking — to be able to successfully create a
quality chair after reading this book and passing a test over the terms
and concepts? Probably not. What is missing here is practice. In order
to learn the craft of woodworking, one must practice it by building
things. Early projects are likely to fail, but over time the woodworker
will become more and more proficient through a combination of
trial-and-error and repetition.
Just like you would not commission a “book educated” woodworker with
no practical experience to build a fine set of furnishings for a fancy
dining hall, you would not hire a “programmer” with no actual
programming experience to build a payroll application for your company.
The results in either case would be disastrous, and the money spent
would have been wasted. Programming simply cannot be learned from a book
alone. As with any craft, practice is a necessity. Trial and error,
struggle, and overcoming difficulty eventually lead to experience (and
maybe enlightenment). The basic things get easier, and learning new
concepts becomes faster because of this prior experience. The quality of
the programmer’s work will improve over time. The “novice” craftsman
eventually becomes skilled, and maybe progresses to become a “master”
of the craft.
Coding Practice is a Solitary Pursuit
Just as with the woodworking example, programming requires every skill
to be practiced in order to achieve mastery. You cannot allow someone
else to build the tables while you build the chairs, and then honestly
claim that you understand everything that went into building the dining
suite. Likewise, a programmer must learn to develop programs from
“first principles” by practicing those skills. Programming is a unique
craft in that we first build the tools that we use later to build even
more powerful tools, and then eventually use those to build the final
product. You cannot expect to jump directly to the final product without
any understanding of the parts (and tools) used to create it. Of course,
you will eventually get to collaborate with a team on a large project,
but this only works when every team member trusts that every other team
member deeply understands the way each part will interact to produce the
final product.
Plagiarism in programming amounts to going to the local furniture store
and buying the dining room suite, then trying to claim it was an
original design of your own. In some cases, you might get by with this
approach. But when your customer comes back later and wants a unique
piece of furniture the likes of which have never been seen before —
well, then you will be exposed as a fraud. The craft of programming
involves constantly solving new and novel problems by combining the
tools and building blocks developed along the way in new and different
ways. It requires an intricate understanding of what is happening in
every piece of the code in order to create something truly new and
different. There is no shortcut to the mastery required for this; it
comes only through practice and struggle.
Plagiarism isn’t Just Copying
It was stated earlier that the most obvious form of plagiarism is
direct copying of code. This is not the only way to end up in violation
of the Academic Misconduct Policy in the Plagiarism section. Plagiarism
broadly encompasses several related ethical violations that relate to
producing code that isn’t original. The set of related ethical
violations that fall into the “plagiarism” category with respect to
grading in a programming course are:
- directly copying code
- even if some changes are made to it
- re-typing someone else’s implementation of an algorithm
- working together with another student
- even if you are typing the programs independently, but working
together on the contents
- re-using someone else’s code from a previous semester
- even if you make some changes to it
- using code from the Internet
- even if you type it yourself, not just a copy/paste
- allowing someone else to write your program for you
To sum it up succinctly, it is a violation of the Plagiarism and
Academic Misconduct Policy if you turn in code that was not completely
produced by your own industry as a result of your own ingenuity.
The Reality of the Internet
The Internet is a modern programmer’s most valuable source of reference
material; you are encouraged to make use of it as such. A danger posed
by this is that you may uncover a solution to the exact problem you are
trying to solve (or a nearly identical solution). This leaves you with a
serious ethical dilemma. If you use the solution you just discovered,
you are in clear violation of the Academic Misconduct Policy. But you
had good ethical intentions with respect to searching for the
information that led you here. What should you do?
The answer is that you should absolutely not use the code that you
have discovered and try to claim it as your own. You have two possible
choices:
- Close the page immediately and (later) produce your own independent
solution without consulting it again.
- Produce your own solution (inspired by, or perhaps making use of the
one you have seen) and cite the source of the original version.
The first choice may seem like the “best option” at first — but
there is a problem. If the code you saw was reasonably short, there is a
good chance it “imprinted” on you when you read through it. It is
impossible to “unsee” the solution once you have seen it. There may be
no way — even after closing the page — to produce a solution that
would not be the same as the one you saw. If this is the case, then
option 2 is your only ethical choice. If the code was long enough that
you think you can close the page and then produce an original solution,
a recommendation would be to leave the room for a while (go get a coffee
or something) then come back and see if you can write a solution to the
problem without consulting the page again. If you can, you are probably
ethically in the clear... It would be a good idea to consider citing
the original inspiration for your solution even in this case though, as
“imprinting” probably happened to some extent whether you realize it
or not.
The second choice — cite the source — is the way a situation like
this should most often be handled in industry. If the source was a good
solution to the problem, and was free of any licensing encumbrance, then
you are probably free to use it with just a citation. You should
always cite the source of the code, not just for ethical reasons, but
also so that you can return to it later for reference in case something
needs to be fixed or changed.
In the context of a programming course, the second option (citing the
source) is still not ideal, since you have not actually produced the
solution yourself (and you didn’t gain the experience that comes with
struggle). As such, you probably won’t receive “full credit” for a
solution that wasn’t original, but at least you are not in violation of
any ethics policies.
In Summary (or TL;DR)
Learning to program is akin to learning a craft like cooking, pottery,
or woodworking. You cannot successfully master the necessary skills
without significant time spent practicing those skills on your own.
Through struggle, trial-and-error, and overcoming difficulty, experience
and mastery are gained. Any attempt to “take a shortcut” by using
someone else’s code, whether by blatantly copying or by re-producing
manually is not only dishonest (and a violation of the Academic
Misconduct Policy), it is also a waste of time. Your time is too
valuable to waste; spend it practicing and improving, not
plagiarizing.
This post was originally written for the A-State CS Department wiki on March 22, 2014. It provides a beginner’s overview of using gdb.
So, your program crashed...
While testing a program involving pointers, it is common to experience a
program crash related to a bad pointer value somewhere in the code. On
Linux systems, the message displayed when your program crashes is often
“Segmentation Fault” — meaning the actual “crime” your code
committed (as judged by the OS) was to try to access memory outside the
“segment” in which it was allocated. Without getting to far into the
details of how operating systems handle memory, suffice it to say that
this shouldn’t have happened unless you have mis-handled a pointer (or
an array access) in some way.
Segmentation faults and other crashes related to pointer mis-handling
can be tricky to track down using program tracing techniques (like
adding “debug output”) alone. To find the source of these problems, it
really helps to be able to “see” exactly what the program was trying
to do at the moment that the crash occurred. Debuggers are special
programs designed to help us do exactly that. One such debugger that is
available on our class virtual servers is the GNU Project Debugger,
known as “GDB” and accessed through the ‘gdb’ command.
What is GDB?
The following was taken from the official GNU Project Debugger
website, under the heading “What is
GDB?”:
GDB, the GNU Project debugger, allows you to see what is going on
‘inside’ another program while it executes – or what another
program was doing at the moment it crashed.
GDB can do four main kinds of things (plus other things in support of
these) to help you catch bugs in the act:
- Start your program, specifying anything that might affect its
behavior.
- Make your program stop on specified conditions.
- Examine what has happened, when your program has stopped.
- Change things in your program, so you can experiment with
correcting the effects of one bug and go on to learn about
another.
The program being debugged can be written in Ada, C, C++, Objective-C,
Pascal (and many other languages). Those programs might be executing
on the same machine as GDB (native) or on another machine (remote).
GDB can run on most popular UNIX and Microsoft Windows variants.
Learn by Example
GDB has a substantial user
manual, and provides
an extensive set of
commands for
controlling the debugging session. In reality, you can get a lot of
benefit from GDB with only a small subset of the total available command
set. This will be shown by working through an example below, given some
code that is crashing with a Segmentation Fault error. The commands that
will be used in this example are:
run : Start program execution from the beginning of the program. Allows command-line arguments to be listed, and also allows basic I/O redirection.
backtrace : If (or when) a program has crashed, this command will show trace of where you are currently (which functions you are in). The “trace” consists of the current function call stack, where each call (referred to as a “frame”) is shown in order from most recent to earliest (which will be the entry into main()), and each is numbered for reference.
backtrace full : Just like backtrace, but local variables are also shown for each frame.
The Program
The program for this example will consist of a single header file
(olist_with_bug.h) and a single source file (debugexample.cpp).
The header file contains a class definition and implementation for a
singly-linked ordered list. The source file contains a main() function
and some code to fill the list, print it to the screen, remove some
elements, and print it out again. The source code for each file is shown
below:
olist_with_bug.h
/**
* @file olist.h
* Defines OList Ordered Linked List
*
* OList is a minimally functional singly-linked ordered list
* designed to contain integer values.
*/
#ifndef OLLIST_H
#define OLLIST_H
#include<iostream>
#include<string>
/**
* OList is a minimally functional singly-linked list of integer
* values.
*/
class OList{
public:
// ctor, dtor
OList() : head(0) , m_count(0) { }
~OList() { erase(); }
// methods
void insert(int value); /// insert a new item
bool remove(int value); /// remove an item, if it is there
bool contains(int value) const; /// see if the list contains an item
int count() const { return m_count; } /// get the current number of items in the list
bool is_empty() const { return head == 0; } /// see if the list is empty
void erase(); /// erase all values in the list
std::ostream& print(std::ostream& stream=std::cout, std::string delim=", ") const; /// print the list to the specified stream
private:
/**
* container node for an integer and a pointer to the next item in the list.
*/
class Node {
public:
// ctor
Node(int value)
: data(value), next(0) { } /// Node requires data value to construct
// Attributes:
int data; /// data value
Node* next; /// pointer to the next Node in the list
};
Node* _find(int value) const;
Node* _find(int value, Node*& previous) const;
Node* head;
int m_count;
};
/**
* inserts a new data value in the ordered list
* @param value the new data item (an integer) to insert
*/
void OList::insert(int value){
OList::Node* previous;
OList::Node* newnode = new Node(value);
_find(value, previous);
if(!is_empty()){
newnode->next = previous->next;
previous->next = newnode;
}
else{
head = newnode;
}
m_count++;
}
/**
* remove an item from the list if it is there
* @param value item to remove
* @return true if item was found and removed, false otherwise
*/
bool OList::remove(int value){
OList::Node* previous;
OList::Node* victim = _find(value, previous);
bool found = victim != 0;
if(victim){
previous->next = victim->next;
if(victim == head){
head = victim->next;
}
delete victim;
m_count--;
}
return found;
}
/**
* returns true if the list contains the value
* @param value item to look for
* @return true if item is found in the list, false otherwise
*/
bool OList::contains(int value) const {
return _find(value) != 0;
}
/**
* erase the entire list, leaving it empty
*/
void OList::erase(){
for(OList::Node* current = head; current != 0;){
OList::Node* victim = current;
current = current->next;
delete victim;
}
m_count = 0;
head = 0;
}
/**
* print the list to the stream specified by the first parameter,
* delimited by the string given in the second parameter
* @param stream std::ostream stream to send output to
* @param delim string to place between items (defaults to comma-separated)
* @return the modified stream object is returned
*/
std::ostream& OList::print(std::ostream& stream, std::string delim) const{
for(OList::Node* current = head; current != 0; current=current->next){
stream << current->data << (current->next != 0 ? delim : "");
}
return stream;
}
/**
* (internal use) - finds location of a value in the list,
* @param value value to find
* @return pointer to the Node where `value` was found, or NULL if not found
*/
OList::Node* OList::_find(int value) const{
OList::Node* n;
return _find(value, n);
}
/**
* (internal use) - finds location of a value in the list, along with
* the node immediately to its "left" (the previous node) in the list.
* In cases where the node isn't in the list, the previous pointer will
* be set to the node immediately to the "left" of where the node should
* be located in the list. This makes the function useful for determining
* where a node should be added in the list.
* @param value value to find
* @param[out] previous pointer to the previous Node in the list (or NULL if the item's position is at head)
* @return pointer to the Node where `value` was found, or NULL if not found
*/
OList::Node* OList::_find(int value, Node*& previous)const {
OList::Node* current = head;
previous = 0;
while(current && current->data < value){
previous = current;
current = current->next;
}
return current && current->data == value ? current : 0;
}
#endif
debugexample.cpp
#include<iostream>
#include<cstdlib>
#include<ctime>
#include<string>
//#include "olist_with_bug.h"
#include "olist_with_bug.h"
int main()
{
// Un-comment the appropriate line below to
srand(time(0)); // produce different pseudo-random values on every run
// ... or ...
//srand(5); // Produce the same set of "random" values every time.
OList l;
// Add some values
for(int i = 0; i < 30; i++){
l.insert(rand() % 100);
}
std::cout << "Initial List: ";
l.print() << "\n\n";
// Remove some values:
for(int i = 0; i < 30; i++){
l.remove(rand() % 100);
}
std::cout << "Final List: ";
l.print() << "\n\n";
return 0;
}
Compile and Run
At this point, the usual thing to do is to try to compile and run the
program as usual. The corresponding terminal session is shown below:
exampleuser@exampleserver:~/OList_with_error$ g++ -W -Wall -ansi -pedantic debugexample.cpp
exampleuser@exampleserver:~/OList_with_error$ ./a.out
Initial List: 3, 7, 8, 12, 17, 20, 21, 22, 24, 25, 25, 30, 33, 33, 36, 48, 53, 53, 57, 61, 63, 64, 71, 78, 82, 85, 86, 87, 90, 92
Final List: 3, 7, 8, 12, 17, 20, 21, 25, 25, 30, 33, 33, 53, 53, 57, 61, 63, 64, 71, 82, 87, 90, 92
exampleuser@exampleserver:~/OList_with_error$ ./a.out
Segmentation fault
Notice here that on the first run, everything went OK. The main program
adds data values at random to the list, then tries to remove them at
random... In some cases, the program may run as expected. But in other
cases, the program may crash with a Segmentation fault, and it did that
on the second run.
This behavior is typical of the way these sorts of errors behave in the
real world. It is often the case that these errors only show up
intermittently; reproducing them during testing can sometimes be
difficult. A good approach is to make sure that your testing code
exercises all possible “edge conditions” in the source - for an
ordered list, that would mean adding to an empty list, adding a value
that would be inserted at the head, and adding a value that would be
inserted at the end of the list (in addition to the “normal” case of
the value being added in the middle somewhere).
Compiling for Debugging
The compiler flags shown above did not produce an executable file that
was ideal for debugging. To be most useful, the debugger will need
information like the original source code, the line numbers, and the
symbol (variable and function) names. This information is removed from
the executable produced by the compiler by default (and for good reason
-- it increases the output file size significantly). But since the
program crashed and a debugging session would be helpful, the program
should now be re-compiled in such a way that the compiler will produce a
debugger-friendly executable.
Since the debugger used in this example is GDB, the compiler flag
-ggdb will be added to the g++ command (the -g flag tells the
compiler to produce debugging information, and the -ggdb flag says to
produce that information optimized for the GDB debugger). The full
command is:
g++ -W -Wall -ansi -pedantic -ggdb debugexample.cpp
No additional compiler output is shown on the console window, but the
size of the executable file a.out has increased (on the system being
used to produce this example, it was ~13KiB before and ~40KiB after).
The Debugging Session
Now that the program has been compiled with debugging information
included, a debugging session can be initiaited. The command to start
GDB is ‘gdb’, and the program’s executable name should be passed as
a command-line argument. The full command to start debugging this
example (whose executable is named a.out) is shown below:
This assumes that a.out is in the current working directory. The
actual terminal session (with the command and its output) is shown
below:
exampleuser@exampleserver:~/OList_with_error$ gdb a.out
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/exampleuser/OList_with_error/a.out...done.
(gdb)
At this point, GDB has loaded the executable file and is ready for a
command. The (gdb) at the end is GDB’s own command prompt. First
thing’s first - run the program to try to reproduce the error:
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out
Program received signal SIGSEGV, Segmentation fault.
0x08048969 in OList::insert (this=0xbffff6b4, value=35) at olist_with_bug.h:65
65 newnode->next = previous->next;
(gdb)
In this case, the error occurred on the first try. If it had not, the
run command could be repeated until the crash occurred.
Notice that the function call where the Segmentation fault occurred is
shown (OList::insert()) and the actual parameters are also shown. At
this point it is clear that the object being operated on (this) is
located at address 0xbffff6b4, and the value passed to the value
parameter was 35. There is nothing obviously wrong with either of those
actual parameters, so continue looking...
The line of code that was actually being executed when the SIGSEGV
occurred was line 65 of olist_with_bug.h, which contains the code:
newnode->next = previous->next;
So, something bad happened on that line of code. Notice that there are
two pointer dereferences (by way of the -> operator) on this line.
Either of those could be to blame. We need to know what values are
contained in the newnode and previous pointers. To find out, the
command backtrace full can be used to show the function call stack
as well as the values of any local variables along the way:
(gdb) backtrace full
#0 0x08048969 in OList::insert (this=0xbffff6b4, value=35) at olist_with_bug.h:65
previous = 0x0
newnode = 0x804c038
#1 0x08048c63 in main () at debugexample.cpp:20
i = 3
l = {head = 0x804c008, m_count = 3}
(gdb)
From this output, it is clear that previous is NULL (contains the
value 0x0, which is zero expressed as hexidecimal). Therefore, it was
the attempt to dereference previous (in previous->next) that caused
the fault. Dereferencing a NULL pointer is always illegal.
So the question is: “How did previous come to be NULL in the first
place?”
To answer this, we can consult the source code itself. The code from the
insert() function is shown below:
void OList::insert(int value){
OList::Node* previous;
OList::Node* newnode = new Node(value);
_find(value, previous);
if(!is_empty()){
newnode->next = previous->next; // how is `previous` NULL here?
previous->next = newnode;
}
else{
newnode->next = head;
head = newnode;
}
m_count++;
}
Looking at the path taken to arrive at that particular line, we see that
previous is intended to receive its value on the line:
Which is line 63 in the file. This means that the value being placed in
previous by the _find() method is NULL. So the next thing to do is
examine the _find() method to see if there is a bug there, or perhaps
it is just being used incorrectly. The _find() method (along with its
documentation) is:
/**
* (internal use) - finds location of a value in the list, along with
* the node immediately to its "left" (the previous node) in the list.
* In cases where the node isn't in the list, the previous pointer will
* be set to the node immediately to the "left" of where the node should
* be located in the list. This makes the function useful for determining
* where a node should be added in the list.
* @param value value to find
* @param[out] previous pointer to the previous Node in the list (or NULL if the item's position is at head)
* @return pointer to the Node where `value` was found, or NULL if not found
*/
OList::Node* OList::_find(int value, Node*& previous)const {
OList::Node* current = head;
previous = 0;
while(current && current->data < value){
previous = current;
current = current->next;
}
return current && current->data == value ? current : 0;
}
From reading the method’s documentation, it seems that previous will
indeed be set to NULL if the item’s position is at the head of the
list. This makes sense — there is nothing to the logical “left” of
the head of a singly-linked list, after all.
So it probably isn’t an error that the _find() method is setting
previous to NULL; it just means that the item that is being inserted
belongs to the logical “left” of the current head of the list (thus
becoming the new head item). Has this situation been properly handled
in the insert() method? Take another look:
void OList::insert(int value){
OList::Node* previous;
OList::Node* newnode = new Node(value);
_find(value, previous);
if(!is_empty()){
newnode->next = previous->next;
previous->next = newnode;
}
else{
head = newnode;
}
m_count++;
}
Now that it is clear that previous will be set to NULL any time the
new node belongs at the head of the list, it is worth listing the
situations in which that will be the case:
- the list is empty, meaning that any item inserted will become the
head
or
- the item being inserted is smaller than the item currently at
head, and should become the new head of the list
Examining the code carefully shows that the “empty list” case has been
planned for, but the “new smallest item” case has not. In either
of these cases, the new node’s next pointer should be pointed to the
current head (which may be NULL if the list is empty, but that is not
a problem). Then, the head pointer must be pointed to the new node
(the item being inserted becomes the first item). In both cases, the
previous pointer will be set to NULL by the call to _find(), so that
can be used as the condition in the if, replacing the check for
“not empty”:
if(previous != 0){ // replaces "if(!is_empty()){"
The else clause must also be modified to properly set the new
node’s next pointer (for the “new first item” case):
else{
newnode->next = head; // link in `newnode` in case list is non-empty
head = newnode;
}
This means that the modifed insert() function should look like:
void OList::insert(int value){
OList::Node* previous;
OList::Node* newnode = new Node(value);
_find(value, previous);
if(previous != 0){ // if list is non-empty and new node isn't a minimum
newnode->next = previous->next;
previous->next = newnode;
}
else{
newnode->next = head; // link in `newnode` in case list is non-empty
head = newnode;
}
m_count++;
}
Now it is once again time to compile and test the code. The GDB
debugging session is still running, so it should be stopped before
re-compiling:
(gdb) quit
A debugging session is active.
Inferior 1 [process 5617] will be killed.
Quit anyway? (y or n) y
exampleuser@exampleserver:~/OList_with_error$
The last line shown indicates that the shell prompt is back and waiting
for a command. The compile (with debugger output) and test session
(testing directly in GDB just in case) is shown below:
exampleuser@exampleserver:~/OList_with_error$ g++ -W -Wall -ansi -pedantic -ggdb debugexample.cpp
exampleuser@exampleserver:~/OList_with_error$ gdb a.out
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/exampleuser/OList_with_error/a.out...done.
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out
Initial List: 4, 7, 10, 11, 15, 20, 24, 25, 29, 35, 35, 36, 37, 39, 39, 41, 42, 48, 56, 57, 65, 67, 68, 73, 73, 82, 82, 86, 89, 96
Final List: 4, 7, 10, 11, 15, 20, 25, 29, 35, 35, 37, 39, 39, 48, 56, 57, 65, 68, 73, 73, 82, 89, 96
[Inferior 1 (process 6529) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out
Initial List: 0, 3, 13, 20, 20, 22, 23, 33, 34, 35, 38, 44, 46, 48, 49, 53, 58, 60, 62, 64, 68, 70, 70, 74, 81, 90, 90, 94, 95, 99
Final List: 0, 3, 13, 20, 20, 22, 23, 33, 34, 35, 38, 44, 46, 48, 49, 53, 60, 62, 68, 70, 70, 74, 81, 90, 99
[Inferior 1 (process 6533) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out
Initial List: 1, 6, 6, 7, 9, 11, 12, 13, 14, 15, 16, 17, 18, 24, 25, 31, 34, 38, 42, 44, 57, 65, 66, 68, 69, 69, 70, 73, 90, 94
Final List: 1, 6, 6, 7, 9, 11, 14, 16, 17, 18, 24, 31, 34, 44, 65, 66, 73, 90, 94
[Inferior 1 (process 6534) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out
Initial List: 1, 6, 14, 15, 16, 17, 19, 26, 34, 37, 38, 39, 40, 42, 43, 47, 50, 50, 51, 52, 57, 58, 63, 63, 69, 69, 70, 72, 80, 98
Final List: 1, 6, 17, 19, 26, 34, 37, 38, 39, 40, 42, 43, 47, 50, 50, 51, 57, 58, 63, 69, 69, 70, 72, 98
[Inferior 1 (process 6537) exited normally]
(gdb) run
Starting program: /home/exampleuser/OList_with_error/a.out
Initial List: 1, 4, 6, 12, 16, 17, 18, 21, 22, 27, 37, 42, 42, 43, 46, 48, 50, 51, 51, 56, 57, 61, 65, 74, 77, 92, 94, 94, 95, 96
Program received signal SIGSEGV, Segmentation fault.
0x080489d5 in OList::remove (this=0xbffff6b4, value=1) at olist_with_bug.h:85
85 previous->next = victim->next;
(gdb)
As you can see, the program ran well four times before crashing with
another Segmentation fault on the fifth run!
This error is in a different place (line 85 in the remove() function).
Once again, the actual parameters look reasonable. There are two
pointers being dereferenced in that line, so a backtrace with local
variable values might help shed some light on what happened:
(gdb) backtrace full
#0 0x080489d5 in OList::remove (this=0xbffff6b4, value=1) at olist_with_bug.h:85
previous = 0x0
victim = 0x804c0c8
found = true
#1 0x08048d29 in main () at debugexample.cpp:27
i = 29
l = {head = 0x804c0c8, m_count = 20}
(gdb)
Once again, the previous pointer seems to be to blame, containing a
NULL value at the time it is being dereferenced. Back to the source
code! The remove() function looks like:
bool OList::remove(int value){
OList::Node* previous;
OList::Node* victim = _find(value, previous);
bool found = victim != 0;
if(victim){
previous->next = victim->next;
if(victim == head){
head = victim->next;
}
delete victim;
m_count--;
}
return found;
}
And once again the _find() helper method is being used. From the
earlier experience, it is clear that previous will be set to NULL if
the value being searched for isn’t found or if it is found at the
head of the list. A quick look a the documentation for the _find()
method’s return value indicates that if the value is not found, the
return value is also NULL. That leaves only one situation: The value
being searched for must be at the beginning of the list.
Like before, it seems that the code is handling the “value not found”
and “empty list” special case just fine (in fact, those cases are the
same when trying to remove a value). It is the special case where the
value is at the beginning of the list that is being ignored. In that
case, the item at the head of the list is being removed... The code
is already checking for this on the line:
All that is needed is to take advantage of this already-existing check
to specify that the previous pointer should only be accessed when
the “victim” node is not the head node. This is most easily
accomplished by adding an else clause for that if statement and
moving the line causing the crash into it. The modified remove()
function looks like:
bool OList::remove(int value){
OList::Node* previous;
OList::Node* victim = _find(value, previous);
bool found = victim != 0;
if(victim){
if(victim == head){
head = victim->next;
}
else{
previous->next = victim->next; // only if `victim` is not `head`!
}
delete victim;
m_count--;
}
return found;
}
After making the code adjustment, stop the GDB session using the quit
command, then re-compile and test the program. At this point the program
no longer crashes under any sequence of data additions and deletions!
The Segmentation fault errors have been tracked down and squashed thanks
to some help from GDB.
Don’t forget to re-compile without the additional debugging
information if you are intending to distribute your object/executable
file (to reduce the file size).
Other Useful GDB Commands
This example only made use of the run and backtrace full commands
from GDB; it is compelling that only two commands can help speed the
correction of subtle coding errors so dramatically, but sometimes these
two just aren’t quite enough. Adding just a few more commands to your
GDB arsenal will allow you to debug a majority of simple (common)
programming errors of all kinds:
run : Start program execution from the beginning of the program. Allows command-line arguments to be listed, and also allows basic I/O redirection.
backtrace : If (or when) a program has crashed, this command will show trace of where you are currently (which functions you are in). The “trace” consists of the current function call stack, where each call (referred to as a “frame”) is shown in order from most recent to earliest (which will be the entry into main()), and each is numbered for reference.
backtrace full : Just like backtrace, but local variables are also shown for each frame.
list : List the source code surrounding the current execution point, or in the currently selected frame.
print : Prints the value contained in a variable, or in some cases the result returned by a function that has been called in the current frame. print also has several options available to select the desired representation of the value.
break : create a breakpoint, which is a point at which execution of the program will stop during your debugging session so that you can examine local variables, etc.
step : “steps into” next line of code - will drop into a function call
next : like step but will not drop into a function call
continue : continue running the program as usual, until next breakpoint is encountered or program exits
This post was originally written for the A-State CS Department wiki in November, 2013.
The Numerical (In)accuracy Problem
Computers use binary (base-2) numbers internally. This means that our
familiar decimal (base-10) numbers must be converted to binary before
they can be stored in the computer. The problem is that some—many,
actually—base-10 real numbers do not have an exact representation in
base-2. This may seem odd until you consider that we deal with lots of
numbers that do not have an exact representation in base-10 either. Two
examples are $\pi$ and $e$. Neither of these values can be written
exactly in decimal no matter how many digits you allow after the decimal
point. Numbers of this variety are called irrational numbers. It turns
out that many numbers that are easily written in decimal become
irrational numbers when translated to binary. A simple example is the
number 0.1. It is easy enough to write in decimal, but in binary it
becomes impossible to represent exactly.
So what do we do when we can’t exactly represent a number? We round. If
you were asked to write down $\pi$, what would you write? You would
probably write down some value between 3.14 and 3.14159, depending on
how precise you wanted to be. Neither number is exactly correct, but
both are reasonable representations given the number of digits used. In
a computer, there is a limit to the number of binary digits (bits)
that may be used to represent a number. Because of this, numbers like
0.1 must be rounded to some limited precision. This rounding produces a
“pretty good”, but inexact representation of 0.1.
At this point, you are probably asking yourself “why should I care?”,
and that is a reasonable question. The answer is: Because it will cause
errors in your programs! To help illustrate why, open the interactive
Python interpreter and follow along with the following example:
First, let’s see how 0.1 is approximated in Python (Python is used for
this example because it’s default numeric output mode shows more
decimal places than C++). At the interactive prompt, type 0.1 and
<ENTER>. You should see something like the following:
>>> 0.1
0.10000000000000001
Hmm. It looks like 0.1 gained an extra 1 in the 17^th^ decimal position!
Now enter this relational expression and note the result:
Here’s the output that was produced on the computer used to write this
page:
Why was the result False? To see the answer, let’s enter each operand
into the interpreter separately and look at the result:
>>> 0.1
0.10000000000000001
>>> 3.1 - 3
0.10000000000000009
Notice that 0.1 is represented internally as 0.10000000000000001 (as
we saw before). Now notice that 3.1 - 3 (which should also be 0.1) is
represented internally as 0.10000000000000009. There is a difference
in the 17^th^ digit following the decimal point! When we compare these
two values with the == operator, the expression will return False
since the two numbers (internally) are not exactly the same! We refer to
this as rounding error, and if we want to compare real numbers for
equality in a computer program, we must learn how to deal with rounding
error properly.
A C++ Example
In C++, showing the inaccuracy is a little more difficult due to
differences in the internal representation of floating-point values. The
following example may or may not show the “error”, but it is likely to
do so:
double pi1 = acos(-1);
double pi2 = (pi1 * (0.1 * 0.1)) / (0.1 * 0.1);
if(pi1 == pi2){
cout << "OK, " << pi1 << " == " << pi2 << "!\n";
}
else{
cout << "An ERROR occurred: " << pi1 << " != " << pi2 << "!\n";
}
Here, a very accurate value for $\pi$ is calculated using the inverse
cosine function
acos(). Then, this version of $\pi$ is used to calculate another
representation (by calculating the area of a circle of radius 0.1 and
then solving for $\pi$ by dividing out the $r^2$ term). In theory, these
two values of $\pi$ should be exactly the same — but in the process
of doing imprecise floating-point operations (the operations involving
0.1), rounding error was (possibly) introduced. Therefore, it is
likely that the following output is produced:
An error occurred: 3.14159 != 3.14159!
Even worse, since the default behavior of the << operator in C++ is to
round the output, you cannot even see the difference that caused the
two values to be considered “not equal” (it was beyond the 5^th^
decimal place).
The Right Way to Compare Real Numbers for Equality
Ok, if we can’t use the == operator by itself to compare real
numbers, how can we hope to determine if two real number values are
equal? The answer is that we must accept the fact that we can’t compare
real numbers exactly with a computer—not by using the built-in float
type anyway. Once we have accepted that limitation, we can find a way to
compare real numbers that takes the rounding error into account. We will
decide on some amount of error that we will consider acceptable, then we
will determine if the two values are reasonably close to equal. This
makes sense in the real world, since you would probably say that both
3.1416 and 3.14159 are reasonable representations of $\pi$. To determine
if two numbers are functionally equal (or “close enough”), we can
ask the following question:
“Is the difference between the two values so small that we can ignore
it?”
If the answer to this question is “Yes”, we will consider the numbers
equal. Otherwise, we will consider them to be unequal. To answer this
question, we can use an expression like the following (which compares
0.1 and 3.1 - 3 ):
In Python
abs(0.1 - (3.1 - 3)) < 1e-11
In C++
fabs(0.1 - (3.1 - 3)) < 1e-11
In the expression above, we take the difference of the two values (by
subtracting them), then we look at the magnitude of that number (by
taking its absolute value with abs() (in Python), or fabs() (in C++)
to remove any negative sign). If the magnitude of the difference of the
two numbers is less than a very small value ($1\times10^{-11}$ in this
case), we consider the numbers equal. We usually refer to the small
number representing allowable error as EPSILON. Thus, a comparison of
this type is usually referred to as an EPSILON comparison. At the
Python interactive prompt, enter the expression above and note the
result. You should see something like the following:
>>> abs(0.1 - (3.1 - 3)) < 1e-11
True
For the C++ example, consider the same code testing the two values of
$\pi$ as shown above, but now using an EPSILON comparison:
double pi1 = acos(-1);
double pi2 = (pi1 * (0.1 * 0.1)) / (0.1 * 0.1);
if(fabs(pi1 - pi2) < 1e-11){
cout << "OK, " << pi1 << " == " << pi2 << "!\n";
}
else{
cout << "An ERROR occurred: " << pi1 << " != " << pi2 << "!\n";
}
Now, you can be confident that the outcome will always be as expected: