Conquering Debugging
If you don’t have a memory-checking tool at your disposal, and the normal strategies for debugging are not helping, you may need to resort to code inspection. Once you’ve narrowed down the part of the code containing the bug, here are some specific items to look for.
Object and Class-related Errors
Verify that your classes with dynamically allocated memory have destructors that free exactly the memory that’s allocated in the object: no more, and no less.
Ensure that your classes handle copying and assignment correctly with copy constructors and assignment operators, as described in Chapter 9.
Check for suspicious casts. If you are casting a pointer to an object from one type to another, make sure that it’s valid.
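As an illustration of the first two checks, here is a minimal sketch of a class that owns a dynamically allocated array. The class name IntBuffer and its live-allocation counter are hypothetical, introduced only for this example; the counter is a simple way to confirm during inspection or testing that every array allocated by the class is freed exactly once.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical example class: owns a heap array and counts how many
// arrays are currently alive so that leaks or double frees show up.
class IntBuffer
{
public:
    explicit IntBuffer(std::size_t size)
        : mSize(size), mData(new int[size]())
    {
        ++sLiveArrays;
    }

    // Copy constructor: allocate a separate array and copy the elements.
    IntBuffer(const IntBuffer& src)
        : mSize(src.mSize), mData(new int[src.mSize])
    {
        std::copy(src.mData, src.mData + mSize, mData);
        ++sLiveArrays;
    }

    // Assignment: allocate the new array first, then free the old one,
    // so the object stays valid if the allocation throws.
    IntBuffer& operator=(const IntBuffer& rhs)
    {
        if (this != &rhs) {
            int* fresh = new int[rhs.mSize];
            std::copy(rhs.mData, rhs.mData + rhs.mSize, fresh);
            delete[] mData;   // one array freed, one allocated: count unchanged
            mData = fresh;
            mSize = rhs.mSize;
        }
        return *this;
    }

    // Destructor: free exactly the memory this object allocated.
    ~IntBuffer()
    {
        delete[] mData;
        --sLiveArrays;
    }

    int& at(std::size_t i) { return mData[i]; }
    std::size_t size() const { return mSize; }

    static int liveArrays() { return sLiveArrays; }

private:
    std::size_t mSize;
    int* mData;
    static int sLiveArrays;
};

int IntBuffer::sLiveArrays = 0;
```

If IntBuffer::liveArrays() is not zero after all IntBuffer objects have gone out of scope, one of the three special methods is allocating or freeing incorrectly.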
General Memory Errors
Make sure that every call to new is matched with exactly one call to delete. Similarly, every call to malloc, calloc, or realloc should be matched with one call to free, and every call to new[] should be matched with one call to delete[]. Duplicate free calls may appear harmless, but they are undefined behavior and can corrupt the heap, particularly if that same memory was handed out in a different memory allocation call after the first free.
Check for buffer overruns. Anytime you iterate over an array or write into or read from a C-style string, verify that you are not accessing memory past the end of the array.
Check for dereferencing invalid pointers.
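The buffer-overrun check can be made concrete with a small helper. The following is only a sketch: the function name copyBounded and its return convention are assumptions made for this example, not part of any library. It copies a C-style string into a fixed-size buffer without ever writing past the end.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest, whose total capacity
// (including the terminating '\0') is destSize bytes.
// Returns true if all of src fit, false if it had to be truncated.
// dest is always null-terminated when destSize is greater than zero.
bool copyBounded(char* dest, std::size_t destSize, const char* src)
{
    if (destSize == 0) {
        return false;   // no room even for the terminator
    }
    std::size_t len = std::strlen(src);
    std::size_t toCopy = (len < destSize) ? len : destSize - 1;
    std::memcpy(dest, src, toCopy);
    dest[toCopy] = '\0';
    return toCopy == len;
}
```

When inspecting code, any loop or copy that lacks an equivalent of the destSize comparison above is a candidate for an out-of-bounds access.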
Debugging Multithreaded Programs
Unlike Java, the C++ language described in this book (the standard prior to C++11) does not provide any mechanisms for threading and synchronization between threads. However, multithreaded C++ programs are common, so it is important to think about the special issues involved in debugging a multithreaded program. Bugs in multithreaded programs are often caused by variations in timing in the operating system scheduling, and can be difficult to reproduce. Thus, debugging multithreaded programs takes a special set of techniques:
1. Use cout debugging. When debugging multithreaded programs, cout debugging is often more effective than using a debugger. Most debuggers do not handle multiple threads of execution very well, or at least don’t make it easy to debug a multithreaded program. It is difficult to step through your program when you don’t know which thread will run at any given time. Add debug statements to your program before and after critical sections, and before acquiring and after releasing locks. Often by watching this output, you will be able to detect deadlocks and race conditions because you will be able to see that two threads are in a critical section at the same time or that one thread is stuck waiting for a lock.
2. Insert forced sleeps and context switches. If you are having trouble reproducing the problem consistently, or have a hunch about the root cause but want to verify it, you can force certain thread-scheduling behavior by making your threads sleep for specified amounts of time.
Although this version of C++ has no standard way to make a thread sleep, most platforms provide a call, often called sleep(). Sleeping for several seconds right before releasing a lock, immediately before signaling a condition variable, or directly before accessing shared data can reveal race conditions that would otherwise go undetected.
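The forced-sleep technique can be sketched as follows. This example assumes a C++11 or later compiler, whose standard library supplies std::thread, std::mutex, and std::this_thread::sleep_for; those facilities postdate the discussion above, and on older toolchains a platform call such as sleep() would take their place. The function names slowIncrement and runTwoThreads are hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <mutex>
#include <thread>

// Adds 1 to counter in two separate steps (read, then write) and sleeps
// between them. The forced sleep widens the window in which another
// thread can interleave, making a lost update far more likely to appear.
void slowIncrement(std::atomic<int>& counter, std::mutex* lock)
{
    std::unique_lock<std::mutex> guard;
    if (lock != nullptr) {
        guard = std::unique_lock<std::mutex>(*lock);   // critical section
    }
    int seen = counter.load();
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    counter.store(seen + 1);
}

// Runs two threads through slowIncrement() and returns the final count.
// With useLock == true the result is always 2. Without the lock, both
// threads usually read 0 before either writes, so the result is usually 1.
int runTwoThreads(bool useLock)
{
    std::atomic<int> counter(0);
    std::mutex m;
    std::mutex* lock = useLock ? &m : nullptr;
    std::thread t1(slowIncrement, std::ref(counter), lock);
    std::thread t2(slowIncrement, std::ref(counter), lock);
    t1.join();
    t2.join();
    return counter.load();
}
```

Calling runTwoThreads(false) a few times and comparing the result with runTwoThreads(true) shows how a sleep placed directly before the write to shared data turns an intermittent race into one that reproduces on nearly every run.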
Chapter 20
Debugging Example: Article Citations
This section presents a buggy program and shows you the steps to take in order to debug it and fix the problem.
Suppose that you’re part of a team writing a Web page that allows users to search for the research articles that cite a particular paper. This type of service is useful for authors who are trying to find work similar to their own. Once they find one paper representing a related work, they can look for every paper that cites that one to find other related work.
In this project, you are responsible for the code that reads the raw citations data from text files. For simplicity, assume that the citation info for each paper is found in its own file. Furthermore, assume that the first line of each file contains the author, title, and publication info for the paper; the second line is always empty; and all subsequent lines contain the citations from the article (one on each line). Here is an example file for one of the most important papers in Computer Science:
Alan Turing,”On Computable Numbers with an Application to the Entscheidungsproblem”,\
Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936 - 37) pages\
230 to 265.

Godel, “Uber formal unentscheidbare Satze der Principia Mathernatica und verwant der\
Systeme, I”, Monatshefte Math. Phys., 38 (1931). 173-198.
Alonzo Church. “An unsolvable problem of elementary number theory”, American J of\
Math., 58(1936), 345 363.
Alonzo Church. “A note on the Entscheidungsprob1em”, J. of Symbolic logic, 1 (1930),\
40 41.
Cf. Hobson, “Theory of functions of a real variable (2nd ed., 1921)”, 87, 88.\
Proc. London Math. Soc (2) 42 (1936 7), 230 265.
Note that the \ character is the continuation character to ensure that the computer treats the multiple lines as a single line during processing.
Buggy Implementation of an ArticleCitations Class
You decide to structure your program by writing an ArticleCitations class that reads the file and stores the information. This class stores the article info from the first line in one string, and the citations info in an array of strings. Please note that this design decision is not necessarily the best possible. However, for the purposes of illustrating buggy applications, it’s perfect! The class definition looks like this:
#include <string>
using std::string;

class ArticleCitations
{
public:
    ArticleCitations(const string& fileName);
    ~ArticleCitations();
    ArticleCitations(const ArticleCitations& src);
    ArticleCitations& operator=(const ArticleCitations& rhs);

    string getArticle() const { return mArticle; }
    int getNumCitations() const { return mNumCitations; }
    string getCitation(int i) const { return mCitations[i]; }

protected:
    void readFile(const string& fileName);

    string mArticle;
    string* mCitations;
    int mNumCitations;
};
The implementations of the methods follow. This program is buggy! Don’t use it verbatim or as a model.
#include "ArticleCitations.h"
#include <iostream>
#include <fstream>
#include <string>
#include <stdexcept>
using namespace std;

ArticleCitations::ArticleCitations(const string& fileName)
{
    // All we have to do is read the file.
    readFile(fileName);
}

ArticleCitations::ArticleCitations(const ArticleCitations& src)
{
    // Copy the article name, author, etc.
    mArticle = src.mArticle;
    // Copy the number of citations.
    mNumCitations = src.mNumCitations;
    // Allocate an array of the correct size.
    mCitations = new string[mNumCitations];
    // Copy each element of the array.
    for (int i = 0; i < mNumCitations; i++) {
        mCitations[i] = src.mCitations[i];
    }
}

ArticleCitations& ArticleCitations::operator=(const ArticleCitations& rhs)
{
    // Check for self-assignment.
    if (this == &rhs) {
        return (*this);
    }
    // Free the old memory.
    delete [] mCitations;
    // Copy the article name, author, etc.
    mArticle = rhs.mArticle;
    // Copy the number of citations.
    mNumCitations = rhs.mNumCitations;
    // Allocate a new array of the correct size.
    mCitations = new string[mNumCitations];
    // Copy each citation.
    for (int i = 0; i < mNumCitations; i++) {
        mCitations[i] = rhs.mCitations[i];
    }
    return (*this);
}

ArticleCitations::~ArticleCitations()
{
    delete[] mCitations;
}

void ArticleCitations::readFile(const string& fileName)
{
    // Open the file and check for failure.
    ifstream istr(fileName.c_str());
    if (istr.fail()) {
        throw invalid_argument("Unable to open file\n");
    }
    // Read the article author, title, etc. line.
    getline(istr, mArticle);
    // Skip the white space before the citations start.
    istr >> ws;
    int count = 0;
    // Save the current position so we can return to it.
    int citationsStart = istr.tellg();
    // First count the number of citations.
    while (!istr.eof()) {
        string temp;
        getline(istr, temp);
        // Skip white space before the next entry.
        istr >> ws;
        count++;
    }
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(istr, temp);
            mCitations[count] = temp;
        }
    }
}
Testing the ArticleCitations class
Following the advice of Chapter 19, you decide to unit test your ArticleCitations class before proceeding, though for simplicity in this example, the unit test does not use a test framework. The following program asks the user for a filename, constructs an ArticleCitations object with that filename, and passes the object by value to the processCitations() function, which prints out the info using the public accessor methods on the object.
#include "ArticleCitations.h"
#include <iostream>
using namespace std;

void processCitations(ArticleCitations cit);

int main(int argc, char** argv)
{
    string fileName;
    while (true) {
        cout << "Enter a file name (\"STOP\" to stop): ";
        cin >> fileName;
        if (fileName == "STOP") {
            break;
        }
        // Test constructor
        ArticleCitations cit(fileName);
        processCitations(cit);
    }
    return (0);
}

void processCitations(ArticleCitations cit)
{
    cout << cit.getArticle() << endl;
    int num = cit.getNumCitations();
    for (int i = 0; i < num; i++) {
        cout << cit.getCitation(i) << endl;
    }
}
cout Debugging
You decide to test the program on the Alan Turing example (stored in a file called paper1.txt). Here is the output:
Enter a file name (“STOP” to stop): paper1.txt
Alan Turing.”On Computable Numbers with an Application to the Entscheidungsproblem”, Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936 - 37) pages 230 to 265.
Enter a file name (“STOP” to stop): STOP
That doesn’t look right! There are supposed to be five citations printed instead of five blank lines.
For this bug, you decide to try good ole cout debugging. In this case, it makes sense to start by looking at the function that reads the citations from the file. If that doesn’t work right, then obviously the object won’t have the citations. You can modify readFile() as follows:
void ArticleCitations::readFile(const string& fileName)
{
    // Open the file and check for failure.
    ifstream istr(fileName.c_str());
    if (istr.fail()) {
        throw invalid_argument("Unable to open file\n");
    }
    // Read the article author, title, etc. line.
    getline(istr, mArticle);
    // Skip the white space before the citations start.
    istr >> ws;
    int count = 0;
    // Save the current position so we can return to it.
    int citationsStart = istr.tellg();
    // First count the number of citations.
    cout << "readFile(): counting number of citations\n";
    while (!istr.eof()) {
        string temp;
        getline(istr, temp);
        // Skip white space before the next entry.
        istr >> ws;
        cout << "Citation " << count << ": " << temp << endl;
        count++;
    }
    cout << "Found " << count << " citations\n" << endl;
    cout << "readFile(): reading citations\n";
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(istr, temp);
            cout << temp << endl;
            mCitations[count] = temp;
        }
    }
}
Running the same test on this program gives this output:
Enter a file name (“STOP” to stop): paper1.txt
readFile(): counting number of citations
Citation 0: Godel, “Uber formal unentscheidbare Satze der Principia Mathernatica und verwant der Systeme, I”, Monatshefte Math. Phys., 38 (1931). 173-198.
Citation 1: Alonzo Church. “An unsolvable problem of elementary number theory”, American J of Math., 58(1936), 345 363.
Citation 2: Alonzo Church. “A note on the Entscheidungsprob1em”, J. of Symbolic logic, 1 (1930), 40 41.
Citation 3: Cf. Hobson, “Theory of functions of a real variable (2nd ed., 1921)”, 87, 88.
Citation 4: Proc. London Math. Soc (2) 42 (1936 7), 230 265.
Found 5 citations
readFile(): reading citations
Alan Turing,”On Computable Numbers with an Application to the Entscheidungsproblem”, Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936 - 37) pages 230 to 265.
Enter a file name (“STOP” to stop):
As you can see from the output, the first time the program reads the citations from the file, in order to count them, they are read correctly. However, the second time, they are not read correctly. Why not? One way to delve deeper into this issue is to add some debugging code to check the state of the file stream after each attempt to read a citation:
void printStreamState(const istream& istr)
{
    if (istr.good()) {
        cout << "stream state is good\n";
    }
    if (istr.bad()) {
        cout << "stream state is bad\n";
    }
    if (istr.fail()) {
        cout << "stream state is fail\n";
    }
    if (istr.eof()) {
        cout << "stream state is eof\n";
    }
}

void ArticleCitations::readFile(const string& fileName)
{
    // Open the file and check for failure.
    ifstream istr(fileName.c_str());
    if (istr.fail()) {
        throw invalid_argument("Unable to open file\n");
    }
    // Read the article author, title, etc. line.
    getline(istr, mArticle);
    // Skip the white space before the citations start.
    istr >> ws;
    int count = 0;
    // Save the current position so we can return to it.
    int citationsStart = istr.tellg();
    // First count the number of citations.
    cout << "readFile(): counting number of citations\n";
    while (!istr.eof()) {
        string temp;
        getline(istr, temp);
        // Skip white space before the next entry.
        istr >> ws;
        printStreamState(istr);
        cout << "Citation " << count << ": " << temp << endl;
        count++;
    }
    cout << "Found " << count << " citations\n" << endl;
    cout << "readFile(): reading citations\n";
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(istr, temp);
            printStreamState(istr);
            cout << temp << endl;
            mCitations[count] = temp;
        }
    }
}
When you run your program this time, you find some interesting information:
Enter a file name (“STOP” to stop): paper1.txt
readFile(): counting number of citations
stream state is good
Citation 0: Godel, “Uber formal unentscheidbare Satze der Principia Mathernatica und verwant der Systeme, I”, Monatshefte Math. Phys., 38 (1931). 173-198.
stream state is good
Citation 1: Alonzo Church. “An unsolvable problem of elementary number theory”, American J of Math., 58(1936), 345 363.
stream state is good
Citation 2: Alonzo Church. “A note on the Entscheidungsprob1em”, J. of Symbolic logic, 1 (1930), 40 41.
stream state is good
Citation 3: Cf. Hobson, “Theory of functions of a real variable (2nd ed., 1921)”, 87, 88.
stream state is eof
Citation 4: Proc. London Math. Soc (2) 42 (1936 7), 230 265.
Found 5 citations
readFile(): reading citations
stream state is fail
stream state is eof
stream state is fail
stream state is eof
stream state is fail
stream state is eof
stream state is fail
stream state is eof
stream state is fail
stream state is eof
Alan Turing,”On Computable Numbers with an Application to the Entscheidungsproblem”, Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936 - 37) pages 230 to 265.
Enter a file name (“STOP” to stop):
It looks like the stream state is good until after the final citation is read for the first time. Then, the stream state is eof, because the end-of-file has been reached. That is expected. What is not expected is that the stream state is both fail and eof after all attempts to read the citations a second time. That doesn’t appear to make sense at first: the code uses seekg() to seek back to the beginning of the citations before reading them a second time, so the file shouldn’t still be at the end. However, recall from Chapter 14 that streams maintain their error and eof states until you clear them explicitly. With the library used here, seekg() doesn’t clear the eof state automatically. (Since C++11, seekg() clears eofbit before seeking, but it never clears failbit, so an explicit clear() remains the safe choice.) When in an error or eof state, streams fail to read data correctly, which explains why the stream state is also fail after trying to read the citations a second time. A closer look at your method reveals that it fails to call clear() on the istream after reaching the end of file. If you modify the method by adding a call to clear(), it will read the citations properly.
Here is the corrected readFile() method without the debugging cout statements:
void ArticleCitations::readFile(const string& fileName)
{
    // CODE OMITTED FOR BREVITY
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Clear the previous eof.
        istr.clear();
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(istr, temp);
            mCitations[count] = temp;
        }
    }
}
Using a Debugger
The following example uses the gdb debugger on the Linux operating system.
Now that your ArticleCitations class seems to work well on one citations file, you decide to blaze ahead and test some special cases, starting with a file with no citations. The file looks like this, and is stored in a file named paper2.txt:
Author with no citations
When you try to run your program on this file, you get the following result:
Enter a file name (“STOP” to stop): paper1.txt
Alan Turing.”On Computable Numbers with an Application to the Entscheidungsproblem”, Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936 - 37) pages 230 to 265.
Godel, “Uber formal unentscheidbare Satze der Principia Mathernatica und verwant der Systeme, I”, Monatshefte Math. Phys., 38 (1931). 173-198.
Alonzo Church. “An unsolvable problem of elementary number theory”, American J of Math., 58(1936), 345 363.
Alonzo Church. “A note on the Entscheidungsprob1em”, J. of Symbolic logic, 1 (1930), 40 41.
Cf. Hobson, “Theory of functions of a real variable (2nd ed., 1921)”, 87, 88.
Proc. London Math. Soc (2) 42 (1936 7), 230 265.
Enter a file name (“STOP” to stop): paper2.txt
Author with no citations
Segmentation fault
Oops. There must be some sort of memory error. This time you decide to give the debugger a shot. The GNU Debugger (gdb) is widely available on Unix platforms, and works quite well. First, you must compile your program with debugging info (-g with g++). After that, you can launch the program under gdb. Here’s an example session using the debugger to find the root cause of this problem:
>gdb buggyprogram
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type “show copying” to see the conditions.
There is absolutely no warranty for GDB. Type “show warranty” for details. This GDB was configured as “ia64-redhat-linux”...
(gdb) run
Starting program: buggyprogram
Enter a file name (“STOP” to stop): paper1.txt
Alan Turing. “On Computable Numbers with an Application to the Entscheidungsproblem”, Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936 - 37) pages 230 to 265.
Godel, “Uber formal unentscheidbare Satze der Principia Mathematica und verwant der Systeme, I”, Monatshefte Math. Phys., 38 (1931). 173-198.
Alonzo Church. “An unsolvable problem of elementary number theory”, American J of Math., 58(1936), 345 363.
Alonzo Church. “A note on the Entscheidungsproblem”, J. of Symbolic logic, 1 (1930), 40 41.
Cf. Hobson, “Theory of functions of a real variable (2nd ed., 1921)”, 87, 88. Proc. London Math. Soc (2) 42 (1936 7), 230 265.
Enter a file name (“STOP” to stop): paper2.txt
Author with no citations
Program received signal SIGSEGV, Segmentation fault.
__libc_free (mem=0x6000000000010320) at malloc.c:3143
3143    malloc.c: No such file or directory.
in malloc.c
Current language: auto; currently c
When the SEGV occurs, the debugger lets you poke around in the state of the program at the time of the crash. The bt command shows the current stack trace. You can move up and down the function calls in the stack with the up and down commands.
(gdb) bt
#0 __libc_free (mem=0x6000000000010320) at malloc.c:3143
#1 0x2000000000089010 in __builtin_delete () from /usr/lib/libstdc++-libc6.2-2.so.3
#2 0x2000000000089050 in __builtin_vec_delete () from /usr/lib/libstdc++-libc6.2-2.so.3
#3 0x400000000000a820 in ArticleCitations::~ArticleCitations (this=0x80000fffffffb920, __in_chrg=2) at ArticleCitations.cpp:51
#4 0x4000000000004f40 in main (argc=1, argv=0x80000fffffffb968) at BuggyProgram.cpp:20
One item of interest in this stack trace is that delete calls free(). It’s actually fairly common for new and delete to be implemented in terms of malloc() and free(). More importantly, from this stack trace you can see that there seems to be some sort of problem in the ArticleCitations destructor. The list command shows the code in the current stack frame.
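To see how delete can end up in free(), here is a minimal sketch of replacement global operator new and operator delete built on malloc() and free(). It is not the actual libstdc++ implementation. The counters gAllocCount and gFreeCount are hypothetical names added only to make the calls observable, and the sketch assumes a C++11 or later compiler.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters, present only so the effect can be observed.
static int gAllocCount = 0;
static int gFreeCount = 0;

// Replacement global operator new: forwards the request to malloc().
void* operator new(std::size_t size)
{
    if (size == 0) {
        size = 1;  // malloc(0) may return a null pointer; new must not
    }
    void* p = std::malloc(size);
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    ++gAllocCount;
    return p;
}

// Replacement global operator delete: forwards the pointer to free().
void operator delete(void* p) noexcept
{
    if (p != nullptr) {
        ++gFreeCount;
        std::free(p);
    }
}

// Sized form (C++14), forwarded to the unsized one above.
void operator delete(void* p, std::size_t) noexcept
{
    ::operator delete(p);
}
```

With these definitions in place, a pointer that did not come from this allocator and is handed to delete goes straight into free(), which is the kind of call chain the stack trace shows.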
(gdb) up 3
#3 0x400000000000a820 in ArticleCitations::~ArticleCitations (this=0x80000fffffffb920, __in_chrg=2) at ArticleCitations.cpp:51
51        delete [] mCitations;
Current language: auto; currently c++
(gdb) list
46        return (*this);
47    }
48
49    ArticleCitations::~ArticleCitations()
50    {
51        delete [] mCitations;
52    }
53
54    void ArticleCitations::readFile(const string& fileName)
55    {
The only thing in the destructor is a single delete[] call. In gdb, you can print values available in the current scope with print. In order to root-cause the problem, you can try printing some of the object member variables. Recall that the string type in C++ is really a typedef of the basic_string template instantiated for chars.
(gdb) print mCitations
$3 = (basic_string<char,string_char_traits<char>,__default_alloc_template<true, 0> > *) 0x6000000000010338
Hmm, mCitations looks like a valid pointer (though it’s hard to tell, of course).
(gdb) print mNumCitations
$2 = 5
Aha! Here’s the problem. This article isn’t supposed to have any citations, so why is mNumCitations set to 5? Take another look at the code in readFile() for the case in which there are no citations. In that case, it never initializes mNumCitations or mCitations, so both are left with whatever junk is already in memory at those locations. Here, the previous ArticleCitations object had the value 5 in mNumCitations. The second ArticleCitations object must have been placed at the same location in memory and so picked up that same value. Likewise, the garbage value that ended up in mCitations is certainly not a valid pointer to delete! You need to initialize mCitations and mNumCitations whether or not you actually find any citations in the file. Here is the fixed code:
void ArticleCitations::readFile(const string& fileName)
{
    // CODE OMITTED FOR BREVITY

    // Always initialize both members, even when the file has no citations.
    mCitations = NULL;
    mNumCitations = 0;
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Clear the previous eof.
        istr.clear();
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(istr, temp);
            mCitations[count] = temp;
        }
    }
}
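The fix works because delete[] on a null pointer does nothing, so the destructor is safe once mCitations is known to be either null or a valid array. The sketch below shows that invariant in isolation. CitationList and its resize(), size(), and empty() members are hypothetical stand-ins for illustration, not the book's ArticleCitations class, and the code assumes C++11 or later.

```cpp
#include <string>

// Hypothetical stand-in that models only the two members from the text.
class CitationList
{
public:
    // Initialize both members up front so the destructor never sees junk.
    CitationList() : mCitations(nullptr), mNumCitations(0) {}

    // Safe even when there are no citations: delete[] on null is a no-op.
    ~CitationList() { delete [] mCitations; }

    // Copying is disabled to keep the sketch free of double deletes.
    CitationList(const CitationList&) = delete;
    CitationList& operator=(const CitationList&) = delete;

    // Replace the stored citations with `count` empty entries (may be 0).
    void resize(int count)
    {
        delete [] mCitations;
        mCitations = nullptr;   // reset first, as in the fixed readFile()
        mNumCitations = 0;
        if (count > 0) {
            mCitations = new std::string[count];
            mNumCitations = count;
        }
    }

    int size() const { return mNumCitations; }
    bool empty() const { return mCitations == nullptr; }

private:
    std::string* mCitations;
    int mNumCitations;
};
```

An object that never receives any citations can now be destroyed without touching an invalid pointer, which is the situation the paper2.txt test case triggers.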
As this example shows, memory errors don’t always show up right away. It often takes a debugger and some persistence to figure them out.
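One way to avoid this class of bug altogether is to let a standard container own the memory. The following is a sketch of an alternative design, not the approach used in the example above. The helper names readCitations() and readCitationsFromText() are hypothetical, and the code assumes the stream is already positioned at the first citation.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: treats every remaining non-empty line as a citation.
// The vector starts out empty, so "no citations" needs no special case.
std::vector<std::string> readCitations(std::istream& in)
{
    std::vector<std::string> citations;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            citations.push_back(line);
        }
    }
    return citations;
}

// Convenience wrapper for trying the helper without a file on disk.
std::vector<std::string> readCitationsFromText(const std::string& text)
{
    std::istringstream in(text);
    return readCitations(in);
}
```

Because the vector manages its own storage, there is no count to keep in sync with a raw pointer and no delete[] to get wrong.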
If you attempt to replicate this debugging session on a different platform, you may find that, due to the vagaries of memory errors, the program crashes in a different place than this example shows.