- •Contents
- •Introduction
- •Who This Book Is For
- •What This Book Covers
- •How This Book Is Structured
- •What You Need to Use This Book
- •Conventions
- •Source Code
- •Errata
- •p2p.wrox.com
- •The Basics of C++
- •The Obligatory Hello, World
- •Namespaces
- •Variables
- •Operators
- •Types
- •Conditionals
- •Loops
- •Arrays
- •Functions
- •Those Are the Basics
- •Diving Deeper into C++
- •Pointers and Dynamic Memory
- •Strings in C++
- •References
- •Exceptions
- •The Many Uses of const
- •C++ as an Object-Oriented Language
- •Declaring a Class
- •Your First Useful C++ Program
- •An Employee Records System
- •The Employee Class
- •The Database Class
- •The User Interface
- •Evaluating the Program
- •What Is Programming Design?
- •The Importance of Programming Design
- •Two Rules for C++ Design
- •Abstraction
- •Reuse
- •Designing a Chess Program
- •Requirements
- •Design Steps
- •An Object-Oriented View of the World
- •Am I Thinking Procedurally?
- •The Object-Oriented Philosophy
- •Living in a World of Objects
- •Object Relationships
- •Abstraction
- •Reusing Code
- •A Note on Terminology
- •Deciding Whether or Not to Reuse Code
- •Strategies for Reusing Code
- •Bundling Third-Party Applications
- •Open-Source Libraries
- •The C++ Standard Library
- •Designing with Patterns and Techniques
- •Design Techniques
- •Design Patterns
- •The Reuse Philosophy
- •How to Design Reusable Code
- •Use Abstraction
- •Structure Your Code for Optimal Reuse
- •Design Usable Interfaces
- •Reconciling Generality and Ease of Use
- •The Need for Process
- •Software Life-Cycle Models
- •The Stagewise and Waterfall Models
- •The Spiral Method
- •The Rational Unified Process
- •Software-Engineering Methodologies
- •Extreme Programming (XP)
- •Software Triage
- •Be Open to New Ideas
- •Bring New Ideas to the Table
- •Thinking Ahead
- •Keeping It Clear
- •Elements of Good Style
- •Documenting Your Code
- •Reasons to Write Comments
- •Commenting Styles
- •Comments in This Book
- •Decomposition
- •Decomposition through Refactoring
- •Decomposition by Design
- •Decomposition in This Book
- •Naming
- •Choosing a Good Name
- •Naming Conventions
- •Using Language Features with Style
- •Use Constants
- •Take Advantage of const Variables
- •Use References Instead of Pointers
- •Use Custom Exceptions
- •Formatting
- •The Curly Brace Alignment Debate
- •Coming to Blows over Spaces and Parentheses
- •Spaces and Tabs
- •Stylistic Challenges
- •Introducing the Spreadsheet Example
- •Writing Classes
- •Class Definitions
- •Defining Methods
- •Using Objects
- •Object Life Cycles
- •Object Creation
- •Object Destruction
- •Assigning to Objects
- •Distinguishing Copying from Assignment
- •The Spreadsheet Class
- •Freeing Memory with Destructors
- •Handling Copying and Assignment
- •Different Kinds of Data Members
- •Static Data Members
- •Const Data Members
- •Reference Data Members
- •Const Reference Data Members
- •More about Methods
- •Static Methods
- •Const Methods
- •Method Overloading
- •Default Parameters
- •Inline Methods
- •Nested Classes
- •Friends
- •Operator Overloading
- •Implementing Addition
- •Overloading Arithmetic Operators
- •Overloading Comparison Operators
- •Building Types with Operator Overloading
- •Pointers to Methods and Members
- •Building Abstract Classes
- •Using Interface and Implementation Classes
- •Building Classes with Inheritance
- •Extending Classes
- •Overriding Methods
- •Inheritance for Reuse
- •The WeatherPrediction Class
- •Adding Functionality in a Subclass
- •Replacing Functionality in a Subclass
- •Respect Your Parents
- •Parent Constructors
- •Parent Destructors
- •Referring to Parent Data
- •Casting Up and Down
- •Inheritance for Polymorphism
- •Return of the Spreadsheet
- •Designing the Polymorphic Spreadsheet Cell
- •The Spreadsheet Cell Base Class
- •The Individual Subclasses
- •Leveraging Polymorphism
- •Future Considerations
- •Multiple Inheritance
- •Inheriting from Multiple Classes
- •Naming Collisions and Ambiguous Base Classes
- •Interesting and Obscure Inheritance Issues
- •Special Cases in Overriding Methods
- •Copy Constructors and the Equals Operator
- •The Truth about Virtual
- •Runtime Type Facilities
- •Non-Public Inheritance
- •Virtual Base Classes
- •Class Templates
- •Writing a Class Template
- •How the Compiler Processes Templates
- •Distributing Template Code between Files
- •Template Parameters
- •Method Templates
- •Template Class Specialization
- •Subclassing Template Classes
- •Inheritance versus Specialization
- •Function Templates
- •Function Template Specialization
- •Function Template Overloading
- •Friend Function Templates of Class Templates
- •Advanced Templates
- •More about Template Parameters
- •Template Class Partial Specialization
- •Emulating Function Partial Specialization with Overloading
- •Template Recursion
- •References
- •Reference Variables
- •Reference Data Members
- •Reference Parameters
- •Reference Return Values
- •Deciding between References and Pointers
- •Keyword Confusion
- •The const Keyword
- •The static Keyword
- •Order of Initialization of Nonlocal Variables
- •Types and Casts
- •typedefs
- •Casts
- •Scope Resolution
- •Header Files
- •C Utilities
- •Variable-Length Argument Lists
- •Preprocessor Macros
- •How to Picture Memory
- •Allocation and Deallocation
- •Arrays
- •Working with Pointers
- •Array-Pointer Duality
- •Arrays Are Pointers!
- •Not All Pointers Are Arrays!
- •Dynamic Strings
- •C-Style Strings
- •String Literals
- •The C++ string Class
- •Pointer Arithmetic
- •Custom Memory Management
- •Garbage Collection
- •Object Pools
- •Function Pointers
- •Underallocating Strings
- •Memory Leaks
- •Double-Deleting and Invalid Pointers
- •Accessing Out-of-Bounds Memory
- •Using Streams
- •What Is a Stream, Anyway?
- •Stream Sources and Destinations
- •Output with Streams
- •Input with Streams
- •Input and Output with Objects
- •String Streams
- •File Streams
- •Jumping around with seek() and tell()
- •Linking Streams Together
- •Bidirectional I/O
- •Internationalization
- •Wide Characters
- •Non-Western Character Sets
- •Locales and Facets
- •Errors and Exceptions
- •What Are Exceptions, Anyway?
- •Why Exceptions in C++ Are a Good Thing
- •Why Exceptions in C++ Are a Bad Thing
- •Our Recommendation
- •Exception Mechanics
- •Throwing and Catching Exceptions
- •Exception Types
- •Throwing and Catching Multiple Exceptions
- •Uncaught Exceptions
- •Throw Lists
- •Exceptions and Polymorphism
- •The Standard Exception Hierarchy
- •Catching Exceptions in a Class Hierarchy
- •Writing Your Own Exception Classes
- •Stack Unwinding and Cleanup
- •Catch, Cleanup, and Rethrow
- •Use Smart Pointers
- •Common Error-Handling Issues
- •Memory Allocation Errors
- •Errors in Constructors
- •Errors in Destructors
- •Putting It All Together
- •Why Overload Operators?
- •Limitations to Operator Overloading
- •Choices in Operator Overloading
- •Summary of Overloadable Operators
- •Overloading the Arithmetic Operators
- •Overloading Unary Minus and Unary Plus
- •Overloading Increment and Decrement
- •Overloading the Subscripting Operator
- •Providing Read-Only Access with operator[]
- •Non-Integral Array Indices
- •Overloading the Function Call Operator
- •Overloading the Dereferencing Operators
- •Implementing operator*
- •Implementing operator->
- •What in the World Is operator->* ?
- •Writing Conversion Operators
- •Ambiguity Problems with Conversion Operators
- •Conversions for Boolean Expressions
- •How new and delete Really Work
- •Overloading operator new and operator delete
- •Overloading operator new and operator delete with Extra Parameters
- •Two Approaches to Efficiency
- •Two Kinds of Programs
- •Is C++ an Inefficient Language?
- •Language-Level Efficiency
- •Handle Objects Efficiently
- •Use Inline Methods and Functions
- •Design-Level Efficiency
- •Cache as Much as Possible
- •Use Object Pools
- •Use Thread Pools
- •Profiling
- •Profiling Example with gprof
- •Cross-Platform Development
- •Architecture Issues
- •Implementation Issues
- •Platform-Specific Features
- •Cross-Language Development
- •Mixing C and C++
- •Shifting Paradigms
- •Linking with C Code
- •Mixing Java and C++ with JNI
- •Mixing C++ with Perl and Shell Scripts
- •Mixing C++ with Assembly Code
- •Quality Control
- •Whose Responsibility Is Testing?
- •The Life Cycle of a Bug
- •Bug-Tracking Tools
- •Unit Testing
- •Approaches to Unit Testing
- •The Unit Testing Process
- •Unit Testing in Action
- •Higher-Level Testing
- •Integration Tests
- •System Tests
- •Regression Tests
- •Tips for Successful Testing
- •The Fundamental Law of Debugging
- •Bug Taxonomies
- •Avoiding Bugs
- •Planning for Bugs
- •Error Logging
- •Debug Traces
- •Asserts
- •Debugging Techniques
- •Reproducing Bugs
- •Debugging Reproducible Bugs
- •Debugging Nonreproducible Bugs
- •Debugging Memory Problems
- •Debugging Multithreaded Programs
- •Debugging Example: Article Citations
- •Lessons from the ArticleCitations Example
- •Requirements on Elements
- •Exceptions and Error Checking
- •Iterators
- •Sequential Containers
- •Vector
- •The vector<bool> Specialization
- •deque
- •list
- •Container Adapters
- •queue
- •priority_queue
- •stack
- •Associative Containers
- •The pair Utility Class
- •multimap
- •multiset
- •Other Containers
- •Arrays as STL Containers
- •Strings as STL Containers
- •Streams as STL Containers
- •bitset
- •The find() and find_if() Algorithms
- •The accumulate() Algorithms
- •Function Objects
- •Arithmetic Function Objects
- •Comparison Function Objects
- •Logical Function Objects
- •Function Object Adapters
- •Writing Your Own Function Objects
- •Algorithm Details
- •Utility Algorithms
- •Nonmodifying Algorithms
- •Modifying Algorithms
- •Sorting Algorithms
- •Set Algorithms
- •The Voter Registration Audit Problem Statement
- •The auditVoterRolls() Function
- •The getDuplicates() Function
- •The RemoveNames Functor
- •The NameInList Functor
- •Testing the auditVoterRolls() Function
- •Allocators
- •Iterator Adapters
- •Reverse Iterators
- •Stream Iterators
- •Insert Iterators
- •Extending the STL
- •Why Extend the STL?
- •Writing an STL Algorithm
- •Writing an STL Container
- •The Appeal of Distributed Computing
- •Distribution for Scalability
- •Distribution for Reliability
- •Distribution for Centrality
- •Distributed Content
- •Distributed versus Networked
- •Distributed Objects
- •Serialization and Marshalling
- •Remote Procedure Calls
- •CORBA
- •Interface Definition Language
- •Implementing the Class
- •Using the Objects
- •A Crash Course in XML
- •XML as a Distributed Object Technology
- •Generating and Parsing XML in C++
- •XML Validation
- •Building a Distributed Object with XML
- •SOAP (Simple Object Access Protocol)
- •. . . Write a Class
- •. . . Subclass an Existing Class
- •. . . Throw and Catch Exceptions
- •. . . Read from a File
- •. . . Write to a File
- •. . . Write a Template Class
- •There Must Be a Better Way
- •Smart Pointers with Reference Counting
- •Double Dispatch
- •Mix-In Classes
- •Object-Oriented Frameworks
- •Working with Frameworks
- •The Model-View-Controller Paradigm
- •The Singleton Pattern
- •Example: A Logging Mechanism
- •Implementation of a Singleton
- •Using a Singleton
- •Example: A Car Factory Simulation
- •Implementation of a Factory
- •Using a Factory
- •Other Uses of Factories
- •The Proxy Pattern
- •Example: Hiding Network Connectivity Issues
- •Implementation of a Proxy
- •Using a Proxy
- •The Adapter Pattern
- •Example: Adapting an XML Library
- •Implementation of an Adapter
- •Using an Adapter
- •The Decorator Pattern
- •Example: Defining Styles in Web Pages
- •Implementation of a Decorator
- •Using a Decorator
- •The Chain of Responsibility Pattern
- •Example: Event Handling
- •Implementation of a Chain of Responsibility
- •Using a Chain of Responsibility
- •Example: Event Handling
- •Implementation of an Observer
- •Using an Observer
- •Chapter 1: A Crash Course in C++
- •Chapter 3: Designing with Objects
- •Chapter 4: Designing with Libraries and Patterns
- •Chapter 5: Designing for Reuse
- •Chapter 7: Coding with Style
- •Chapters 8 and 9: Classes and Objects
- •Chapter 11: Writing Generic Code with Templates
- •Chapter 14: Demystifying C++ I/O
- •Chapter 15: Handling Errors
- •Chapter 16: Overloading C++ Operators
- •Chapter 17: Writing Efficient C++
- •Chapter 19: Becoming Adept at Testing
- •Chapter 20: Conquering Debugging
- •Chapter 24: Exploring Distributed Objects
- •Chapter 26: Applying Design Patterns
- •Beginning C++
- •General C++
- •I/O Streams
- •The C++ Standard Library
- •C++ Templates
- •Integrating C++ and Other Languages
- •Algorithms and Data Structures
- •Open-Source Software
- •Software-Engineering Methodology
- •Programming Style
- •Computer Architecture
- •Efficiency
- •Testing
- •Debugging
- •Distributed Objects
- •CORBA
- •XML and SOAP
- •Design Patterns
- •Index
Writing Efficient C++
Use Thread Pools
Thread pools are very similar to object pools. Instead of creating and deleting threads dynamically throughout your program lifetime, you can create a pool of threads that can be used as needed. This technique is often used in programs that process incoming network requests. Your Web server might keep a pool of threads ready to lookup pages in response to each client request that arrives.
As discussed further in Chapter 18, threading support is platform-specific, so we do not show an example of a thread pool here. However, you can write one in a similar way to an object pool.
Profiling
Although we urge you to think about efficiency as you design and code, you should accept that not every finished application will perform as well as it could. It is easy for efficiency to fall by the wayside in an attempt to generate a functional program; in our experience, most efficiency optimization is performed on already working programs. Even if you did consider efficiency in your development, you might not have optimized the right parts of the program! Recall from Chapter 4 that 90 percent of the running time of most programs is spent in only 10 percent of the code. This means that you could optimize 90 percent of your code out of existence, but still only improve the running time of the program by 10 percent. Obviously, you want to optimize the parts of the code that are exercised the most for the specific workload that you expect the program to run.
Consequently, it is often helpful to profile your program to determine which parts of the code require optimization. There are many profiling tools available that analyze programs as they run in order to generate data about their performance. Most profiling tools provide analysis at the function level by specifying the amount of time (or percent of total execution time) spent in each function in the program. After running a profiler on your program, you can usually tell immediately which parts of the program need optimization. Profiling before and after optimizing is also useful to prove that your optimizations had an effect.
In our experience, the best profiling tool is Rational Quantify from IBM. It requires significant license fees, but you should check if your company or academic institution has a license for its use. If the license restriction is prohibitive, there are several free profiling tools. One of the most well-known is gprof, which can be found on most Unix systems, including Solaris and Linux.
Profiling Example with gprof
The power of profiling can best be seen with a real coding example. As a disclaimer, the performance bugs in the first attempt shown are not subtle. Real efficiency issues would probably be more complex, but a program long enough to demonstrate them would be too lengthy for this book.
Suppose that you work for the United States Social Security Administration. Every year the administration puts up a Web site that allows users to look up the popularity of new baby names from the previous year. Your job is to write the back-end program that looks up names for users. Your input is a file containing the name of every new baby. This file will obviously contain redundant names. For example, in the file for boys for 2003, the name Jacob was the most popular, showing up 29,195 times. Your program must read the file to construct an in-memory database. A user may then request the absolute number of babies with a given name, or the rank of that name among all the babies.
479
Chapter 17
First Design Attempt
A logical design for this program consists of a NameDB class with the following public methods:
#include <string> #include <stdexcept>
using std::string;
class NameDB
{
public:
//Reads the list of baby names in nameFile to populate the database.
//Throws invalid_argument if nameFile cannot be opened or read. NameDB(const string& nameFile) throw (std::invalid_argument);
//Returns the rank of the name (1st, 2nd, etc).
//Returns –1 if the name is not found.
int getNameRank(const string& name) const;
//Returns the number of babies with this name.
//Returns –1 if the name is not found.
int getAbsoluteNumber(const string &name) const;
// Protected and private members and methods not shown
};
The hard part is choosing a good data structure for the in-memory database. A first attempt might be an array, or a vector from the STL, of name/count pairs. Each entry in the vector would store one of the names, along with a count of the number of times that name shows up in the raw data file. Here is the complete class definition with such a design:
#include <string> #include <stdexcept> #include <vector>
using std::string;
class NameDB
{
public:
NameDB(const string& nameFile) throw (std::invalid_argument);
int getNameRank(const string& |
name) |
const; |
int getAbsoluteNumber(const string& |
name) const; |
|
protected: |
|
|
std::vector<std::pair<string, |
int> > mNames; |
|
// Helper methods |
|
|
bool nameExists(const string& |
name) |
const; |
void incrementNameCount(const |
string& name); |
|
void addNewName(const string& |
name); |
|
private:
480
Writing Efficient C++
// Prevent assignment and pass-by-value. NameDB(const NameDB& src);
NameDB& operator=(const NameDB& rhs);
};
Note the use of the STL vector and pair. A pair is simply a utility class that combines two variables of different types. Consult Chapters 21 to 23 for details on the STL.
Here is the implementation of the constructor and the helper functions nameExists(), incrementNameCount(), and addNewName(). If you’re unfamiliar with the STL, you might be confused by the loops in nameExists() and incrementNameCount(). They simply iterate over all the elements of the mNames vector.
//
//Reads the names from the file and populates the database.
//The database is vector of name/count pairs, storing the
//number of times each name shows up in the raw data.
//
NameDB::NameDB(const string& nameFile) throw (invalid_argument)
{
//Open the file and check for errors. ifstream inFile(nameFile.c_str());
if (!inFile) {
throw invalid_argument(“Unable to open file\n”);
}
//Read the names one at a time.
string name;
while (inFile >> name) {
// Look up the name in the database so far. if (nameExists(name)) {
//If the name exists in the database, just
//increment the count. incrementNameCount(name);
}else {
//If the name doesn’t yet exist, add it with
//a count of 1.
addNewName(name);
}
}
inFile.close();
}
//
//nameExists
//Returns true if the name exists in the database. Returns false otherwise.
bool NameDB::nameExists(const string& name) const
{
//Iterate through the vector of names looking for the name.
for (vector<pair<string, int> >::const_iterator it = mNames.begin(); it != mNames.end(); ++it) {
if (it->first == name) {
481
Chapter 17
return (true);
}
}
return (false);
}
//
//incrementNameCount
//Precondition: name exists in the vector of names.
//Postcondition: the count associated with name is incremented.
void NameDB::incrementNameCount(const string& name)
{
for (vector<pair<string, int> >::iterator it = mNames.begin(); it != mNames.end(); ++it) {
if (it->first == name) {
it->second++; return;
}
}
}
//
//addNewName
//Adds a new name to the database
void NameDB::addNewName(const string& name)
{
mNames.push_back(make_pair<string, int>(name, 1));
}
Note that in the preceding example, you could use an algorithm like find_if to accomplish the same thing as the loops in nameExists() and incrementNameCount(). We show the loops explicitly in order to emphasize the performance problems.
The savvy reader might notice some performance problems already. What if there are hundreds of thousands of names? The many linear searches involved in populating the database might become slow.
In order to complete the example, here are the implementations of the two public methods:
//
//getNameRank
//Returns the rank of the name.
//First looks up the name to obtain the number of babies with that name.
//Then iterates through all the names, counting all the names with a higher
//count than the specified name. Returns that count as the rank.
//
int NameDB::getNameRank(const string& name) const
{
// Make use of the getAbsoluteNumber() method. int num = getAbsoluteNumber(name);
482
Writing Efficient C++
// Check if we found the name. if (num == -1) {
return (-1);
}
//
//Now count all the names in the vector that have a
//count higher than this one. If no name has a higher count,
//this name is rank number 1. Every name with a higher count
//decreases the rank of this name by 1.
//
int rank = 1;
for (vector<pair<string, int> >::const_iterator it = mNames.begin(); it != mNames.end(); ++it) {
if (it->second > num) { rank++;
}
}
return (rank);
}
//
//getAbsoluteNumber
//Returns the count associated with this name
int NameDB::getAbsoluteNumber(const string& name) const
{
for (vector<pair<string, int> >::const_iterator it = mNames.begin(); it != mNames.end(); ++it) {
if (it->first == name) {
return(it->second);
}
}
return (-1);
}
Profile of the First Attempt
In order to test the program, you need a main function:
#include “NameDB.h”
int main(int argc, char** argv)
{
NameDB boys(“boys_long.txt”);
cout << boys.getNameRank(“Daniel”) << endl; cout << boys.getNameRank(“Jacob”) << endl; cout << boys.getNameRank(“William”) << endl;
return (0);
}
483
Chapter 17
This main creates one NameDB database called boys, telling it to populate itself with the file boys_long
.txt This file contains 500,500 names.
There are three steps to using gprof:
1.Compile your program with a special flag that causes it to log raw execution information next time it is run. On Solaris 9, using the SunOne Studio 8 C++ Compiler, the flag is –xpg:
>CC -o namedb -xpg main.cpp NameDB.cpp
2.Next, run your pogram. This run should generate a file gmon.out in the working directory.
3.The final step is to run the gprof command in order to analyze the gmon.out profiling information and produce a (somewhat) readable report. The –C option tells gprof to demangle C++ function names so they are more readable. gprof outputs to standard out, so you should redirect the output to a file:
>gprof -C namedb gmon.out > gprof_analysis.out
Now you can analyze the data. Unfortunately, the output file is somewhat cryptic and intimidating. It takes a little while to learn how to interpret it. gprof provides two separate sets of information. The second set summarizes the amount of time spent executing each function in the program. The first, and more useful, set summarizes the amount of time spent executing each function and its descendents. Here is some of the output from the gprof_analysis.out file, edited to make it more readable:
[2] |
85.1 |
0.00 |
48.21 |
1 |
main [2] |
The preceding line means that main() and its descendents took 85.1 percent of the total execution time of the program, for a total of 48.21 seconds. The remaining 14.9 percent of the time was spent performing other tasks like looking for dynamically linked libraries and initializing global variables. The next entry shows that the NameDB constructor and its descendents took 48.18 seconds, which is almost the entire time of main(). The nested entries below NameDB::NameDB show which of its descendents took the most time. Here you can see that nameExists() and incrementNameCount() both took approximately 14 seconds. Remember that these times are the sums of all the calls to the functions. The third column
in those lines shows the number of calls to the function (500,500 to nameExists() and 499,500 to incrementNameCont()). No other function took a significant amount of the NameDB time.
[3] |
85.1 |
0.03 |
48.18 |
1 |
NameDB::NameDB |
|
|
9.60 |
14.04 |
500500/500500 |
bool NameDB::nameExists |
|
|
8.36 |
14.00 |
499500/499500 |
void NameDB::incrementNameC |
ount |
|
|
|
|
|
|
|
|
|
|
|
Without going any further in this analysis, two things should jump out at you:
1.48 seconds to populate the database of approximately 500,000 names is slow. Perhaps you need a better data structure.
2.nameExists() and incrementNameCount() take almost identical time, and are called almost the same number of times. If you think about the application domain, that makes sense: most names in the text file input are duplicates, so the vast majority of the calls to nameExists() are followed by a call to incrementNameCount(). If you look back at the code, you can see that these functions are almost identical; they could probably be combined. In addition, most of what they are doing is searching the vector. It would probably be better to use a sorted data structure to reduce the searching time.
484
Writing Efficient C++
Second Attempt
With these two observations from the gprof output, it’s time to redesign the program. The new design uses a map instead of a vector. Recall from Chapter 4 that the STL map employs an underlying tree structure that keeps the entries sorted, and provides O(log n) lookup instead of the O(n) searches in the vector.
The new version of the program also combines nameExists() and incrementNameCount() into one nameExistsAndIncrement() function.
Here is the new class definition:
#include <string> #include <stdexcept> #include <map>
using std::string;
class NameDB
{
public:
NameDB(const string& nameFile) throw (std::invalid_argument);
int getNameRank(const string& name) const;
int getAbsoluteNumber(const string& name) const;
protected:
std::map<string, int> mNames;
bool nameExistsAndIncrement(const string& name); void addNewName(const string& name);
private:
// Prevent assignment and pass-by-value NameDB(const NameDB& src);
NameDB& operator=(const NameDB& rhs);
};
Here are the new method implementations:
//
//Reads the names from the file and populates the database.
//The database is a map associating names with their frequency.
NameDB::NameDB(const string& nameFile) throw (invalid_argument)
{
//
// Open the file and check for errors.
//
ifstream inFile(nameFile.c_str()); if (!inFile) {
throw invalid_argument(“Unable to open file\n”);
}
485
Chapter 17
//
// Read the names one at a time.
//
string name;
while (inFile >> name) {
//
// Look up the name in the database so far.
//
if (!nameExistsAndIncrement(name)) {
//
//If the name exists in the database, the
//function incremented it, so we just continue.
//We get here if it didn’t exist, in case which
//we add it with a count of 1.
//
addNewName(name);
}
}
inFile.close();
}
//
//nameExistsAndIncrement
//Returns true if the name exists in the database. false
//otherwise. If it finds it, it increments it.
//
bool NameDB::nameExistsAndIncrement(const string& name)
{
//
//Find the name in the map.
//
map<string, int>::iterator res = mNames.find(name); if (res != mNames.end()) {
res->second++; return (true);
}
return (false);
}
//
//addNewName
//Adds a new name to the database
void NameDB::addNewName(const string& name)
{
mNames.insert(make_pair<string, int>(name, 1));
}
//
//getNameRank
//Returns the
486
Writing Efficient C++
int NameDB::getNameRank(const string& name) const
{
int num = getAbsoluteNumber(name);
//
// Check if we found the name.
//
if (num == -1) { return (-1);
}
//
//Now count all the names in the map that have
//count higher than this one. If no name has a higher count,
//this name is rank number 1. Every name with a higher count
//decreases the rank of this name by 1.
//
int rank = 1;
for (map<string, int>::const_iterator it = mNames.begin(); it != mNames.end(); ++it) {
if (it->second > num) {
rank++;
}
}
return (rank);
}
//
//getAbsoluteNumber
//Returns the count associated with this name
int NameDB::getAbsoluteNumber(const string& name) const
{
map<string, int>::const_iterator res = mNames.find(name); if (res != mNames.end()) {
return (res->second);
}
return (-1);
}
Profile of the Second Attempt
By following the same steps shown earlier, you can obtain the gprof performance data on the new version of the program. The data are quite encouraging:
[2] |
85.3 |
0.00 |
3.19 |
1 |
main [2] |
main() now takes only 3.19 seconds: a 15-fold improvement! There are certainly further improvements that you could make on this program, but we leave those as an exercise for the reader. One hint is that caching could come in handy for ranking the names.
487
Chapter 17
Summar y
This chapter discussed the key aspects of efficiency and performance in C++ programs, and provided several specific tips and techniques for designing and writing more efficient applications. We hope you gained an appreciation for the importance of performance and for the power of profiling tools. Remember to think about performance and efficiency from the beginning of your program life cycle: design-level efficiency is far more important than is language-level efficiency.
488