- •Contents
- •Introduction
- •Who This Book Is For
- •What This Book Covers
- •How This Book Is Structured
- •What You Need to Use This Book
- •Conventions
- •Source Code
- •Errata
- •p2p.wrox.com
- •The Basics of C++
- •The Obligatory Hello, World
- •Namespaces
- •Variables
- •Operators
- •Types
- •Conditionals
- •Loops
- •Arrays
- •Functions
- •Those Are the Basics
- •Diving Deeper into C++
- •Pointers and Dynamic Memory
- •Strings in C++
- •References
- •Exceptions
- •The Many Uses of const
- •C++ as an Object-Oriented Language
- •Declaring a Class
- •Your First Useful C++ Program
- •An Employee Records System
- •The Employee Class
- •The Database Class
- •The User Interface
- •Evaluating the Program
- •What Is Programming Design?
- •The Importance of Programming Design
- •Two Rules for C++ Design
- •Abstraction
- •Reuse
- •Designing a Chess Program
- •Requirements
- •Design Steps
- •An Object-Oriented View of the World
- •Am I Thinking Procedurally?
- •The Object-Oriented Philosophy
- •Living in a World of Objects
- •Object Relationships
- •Abstraction
- •Reusing Code
- •A Note on Terminology
- •Deciding Whether or Not to Reuse Code
- •Strategies for Reusing Code
- •Bundling Third-Party Applications
- •Open-Source Libraries
- •The C++ Standard Library
- •Designing with Patterns and Techniques
- •Design Techniques
- •Design Patterns
- •The Reuse Philosophy
- •How to Design Reusable Code
- •Use Abstraction
- •Structure Your Code for Optimal Reuse
- •Design Usable Interfaces
- •Reconciling Generality and Ease of Use
- •The Need for Process
- •Software Life-Cycle Models
- •The Stagewise and Waterfall Models
- •The Spiral Method
- •The Rational Unified Process
- •Software-Engineering Methodologies
- •Extreme Programming (XP)
- •Software Triage
- •Be Open to New Ideas
- •Bring New Ideas to the Table
- •Thinking Ahead
- •Keeping It Clear
- •Elements of Good Style
- •Documenting Your Code
- •Reasons to Write Comments
- •Commenting Styles
- •Comments in This Book
- •Decomposition
- •Decomposition through Refactoring
- •Decomposition by Design
- •Decomposition in This Book
- •Naming
- •Choosing a Good Name
- •Naming Conventions
- •Using Language Features with Style
- •Use Constants
- •Take Advantage of const Variables
- •Use References Instead of Pointers
- •Use Custom Exceptions
- •Formatting
- •The Curly Brace Alignment Debate
- •Coming to Blows over Spaces and Parentheses
- •Spaces and Tabs
- •Stylistic Challenges
- •Introducing the Spreadsheet Example
- •Writing Classes
- •Class Definitions
- •Defining Methods
- •Using Objects
- •Object Life Cycles
- •Object Creation
- •Object Destruction
- •Assigning to Objects
- •Distinguishing Copying from Assignment
- •The Spreadsheet Class
- •Freeing Memory with Destructors
- •Handling Copying and Assignment
- •Different Kinds of Data Members
- •Static Data Members
- •Const Data Members
- •Reference Data Members
- •Const Reference Data Members
- •More about Methods
- •Static Methods
- •Const Methods
- •Method Overloading
- •Default Parameters
- •Inline Methods
- •Nested Classes
- •Friends
- •Operator Overloading
- •Implementing Addition
- •Overloading Arithmetic Operators
- •Overloading Comparison Operators
- •Building Types with Operator Overloading
- •Pointers to Methods and Members
- •Building Abstract Classes
- •Using Interface and Implementation Classes
- •Building Classes with Inheritance
- •Extending Classes
- •Overriding Methods
- •Inheritance for Reuse
- •The WeatherPrediction Class
- •Adding Functionality in a Subclass
- •Replacing Functionality in a Subclass
- •Respect Your Parents
- •Parent Constructors
- •Parent Destructors
- •Referring to Parent Data
- •Casting Up and Down
- •Inheritance for Polymorphism
- •Return of the Spreadsheet
- •Designing the Polymorphic Spreadsheet Cell
- •The Spreadsheet Cell Base Class
- •The Individual Subclasses
- •Leveraging Polymorphism
- •Future Considerations
- •Multiple Inheritance
- •Inheriting from Multiple Classes
- •Naming Collisions and Ambiguous Base Classes
- •Interesting and Obscure Inheritance Issues
- •Special Cases in Overriding Methods
- •Copy Constructors and the Equals Operator
- •The Truth about Virtual
- •Runtime Type Facilities
- •Non-Public Inheritance
- •Virtual Base Classes
- •Class Templates
- •Writing a Class Template
- •How the Compiler Processes Templates
- •Distributing Template Code between Files
- •Template Parameters
- •Method Templates
- •Template Class Specialization
- •Subclassing Template Classes
- •Inheritance versus Specialization
- •Function Templates
- •Function Template Specialization
- •Function Template Overloading
- •Friend Function Templates of Class Templates
- •Advanced Templates
- •More about Template Parameters
- •Template Class Partial Specialization
- •Emulating Function Partial Specialization with Overloading
- •Template Recursion
- •References
- •Reference Variables
- •Reference Data Members
- •Reference Parameters
- •Reference Return Values
- •Deciding between References and Pointers
- •Keyword Confusion
- •The const Keyword
- •The static Keyword
- •Order of Initialization of Nonlocal Variables
- •Types and Casts
- •typedefs
- •Casts
- •Scope Resolution
- •Header Files
- •C Utilities
- •Variable-Length Argument Lists
- •Preprocessor Macros
- •How to Picture Memory
- •Allocation and Deallocation
- •Arrays
- •Working with Pointers
- •Array-Pointer Duality
- •Arrays Are Pointers!
- •Not All Pointers Are Arrays!
- •Dynamic Strings
- •C-Style Strings
- •String Literals
- •The C++ string Class
- •Pointer Arithmetic
- •Custom Memory Management
- •Garbage Collection
- •Object Pools
- •Function Pointers
- •Underallocating Strings
- •Memory Leaks
- •Double-Deleting and Invalid Pointers
- •Accessing Out-of-Bounds Memory
- •Using Streams
- •What Is a Stream, Anyway?
- •Stream Sources and Destinations
- •Output with Streams
- •Input with Streams
- •Input and Output with Objects
- •String Streams
- •File Streams
- •Jumping around with seek() and tell()
- •Linking Streams Together
- •Bidirectional I/O
- •Internationalization
- •Wide Characters
- •Non-Western Character Sets
- •Locales and Facets
- •Errors and Exceptions
- •What Are Exceptions, Anyway?
- •Why Exceptions in C++ Are a Good Thing
- •Why Exceptions in C++ Are a Bad Thing
- •Our Recommendation
- •Exception Mechanics
- •Throwing and Catching Exceptions
- •Exception Types
- •Throwing and Catching Multiple Exceptions
- •Uncaught Exceptions
- •Throw Lists
- •Exceptions and Polymorphism
- •The Standard Exception Hierarchy
- •Catching Exceptions in a Class Hierarchy
- •Writing Your Own Exception Classes
- •Stack Unwinding and Cleanup
- •Catch, Cleanup, and Rethrow
- •Use Smart Pointers
- •Common Error-Handling Issues
- •Memory Allocation Errors
- •Errors in Constructors
- •Errors in Destructors
- •Putting It All Together
- •Why Overload Operators?
- •Limitations to Operator Overloading
- •Choices in Operator Overloading
- •Summary of Overloadable Operators
- •Overloading the Arithmetic Operators
- •Overloading Unary Minus and Unary Plus
- •Overloading Increment and Decrement
- •Overloading the Subscripting Operator
- •Providing Read-Only Access with operator[]
- •Non-Integral Array Indices
- •Overloading the Function Call Operator
- •Overloading the Dereferencing Operators
- •Implementing operator*
- •Implementing operator->
- •What in the World Is operator->* ?
- •Writing Conversion Operators
- •Ambiguity Problems with Conversion Operators
- •Conversions for Boolean Expressions
- •How new and delete Really Work
- •Overloading operator new and operator delete
- •Overloading operator new and operator delete with Extra Parameters
- •Two Approaches to Efficiency
- •Two Kinds of Programs
- •Is C++ an Inefficient Language?
- •Language-Level Efficiency
- •Handle Objects Efficiently
- •Use Inline Methods and Functions
- •Design-Level Efficiency
- •Cache as Much as Possible
- •Use Object Pools
- •Use Thread Pools
- •Profiling
- •Profiling Example with gprof
- •Cross-Platform Development
- •Architecture Issues
- •Implementation Issues
- •Platform-Specific Features
- •Cross-Language Development
- •Mixing C and C++
- •Shifting Paradigms
- •Linking with C Code
- •Mixing Java and C++ with JNI
- •Mixing C++ with Perl and Shell Scripts
- •Mixing C++ with Assembly Code
- •Quality Control
- •Whose Responsibility Is Testing?
- •The Life Cycle of a Bug
- •Bug-Tracking Tools
- •Unit Testing
- •Approaches to Unit Testing
- •The Unit Testing Process
- •Unit Testing in Action
- •Higher-Level Testing
- •Integration Tests
- •System Tests
- •Regression Tests
- •Tips for Successful Testing
- •The Fundamental Law of Debugging
- •Bug Taxonomies
- •Avoiding Bugs
- •Planning for Bugs
- •Error Logging
- •Debug Traces
- •Asserts
- •Debugging Techniques
- •Reproducing Bugs
- •Debugging Reproducible Bugs
- •Debugging Nonreproducible Bugs
- •Debugging Memory Problems
- •Debugging Multithreaded Programs
- •Debugging Example: Article Citations
- •Lessons from the ArticleCitations Example
- •Requirements on Elements
- •Exceptions and Error Checking
- •Iterators
- •Sequential Containers
- •Vector
- •The vector<bool> Specialization
- •deque
- •list
- •Container Adapters
- •queue
- •priority_queue
- •stack
- •Associative Containers
- •The pair Utility Class
- •multimap
- •multiset
- •Other Containers
- •Arrays as STL Containers
- •Strings as STL Containers
- •Streams as STL Containers
- •bitset
- •The find() and find_if() Algorithms
- •The accumulate() Algorithms
- •Function Objects
- •Arithmetic Function Objects
- •Comparison Function Objects
- •Logical Function Objects
- •Function Object Adapters
- •Writing Your Own Function Objects
- •Algorithm Details
- •Utility Algorithms
- •Nonmodifying Algorithms
- •Modifying Algorithms
- •Sorting Algorithms
- •Set Algorithms
- •The Voter Registration Audit Problem Statement
- •The auditVoterRolls() Function
- •The getDuplicates() Function
- •The RemoveNames Functor
- •The NameInList Functor
- •Testing the auditVoterRolls() Function
- •Allocators
- •Iterator Adapters
- •Reverse Iterators
- •Stream Iterators
- •Insert Iterators
- •Extending the STL
- •Why Extend the STL?
- •Writing an STL Algorithm
- •Writing an STL Container
- •The Appeal of Distributed Computing
- •Distribution for Scalability
- •Distribution for Reliability
- •Distribution for Centrality
- •Distributed Content
- •Distributed versus Networked
- •Distributed Objects
- •Serialization and Marshalling
- •Remote Procedure Calls
- •CORBA
- •Interface Definition Language
- •Implementing the Class
- •Using the Objects
- •A Crash Course in XML
- •XML as a Distributed Object Technology
- •Generating and Parsing XML in C++
- •XML Validation
- •Building a Distributed Object with XML
- •SOAP (Simple Object Access Protocol)
- •. . . Write a Class
- •. . . Subclass an Existing Class
- •. . . Throw and Catch Exceptions
- •. . . Read from a File
- •. . . Write to a File
- •. . . Write a Template Class
- •There Must Be a Better Way
- •Smart Pointers with Reference Counting
- •Double Dispatch
- •Mix-In Classes
- •Object-Oriented Frameworks
- •Working with Frameworks
- •The Model-View-Controller Paradigm
- •The Singleton Pattern
- •Example: A Logging Mechanism
- •Implementation of a Singleton
- •Using a Singleton
- •Example: A Car Factory Simulation
- •Implementation of a Factory
- •Using a Factory
- •Other Uses of Factories
- •The Proxy Pattern
- •Example: Hiding Network Connectivity Issues
- •Implementation of a Proxy
- •Using a Proxy
- •The Adapter Pattern
- •Example: Adapting an XML Library
- •Implementation of an Adapter
- •Using an Adapter
- •The Decorator Pattern
- •Example: Defining Styles in Web Pages
- •Implementation of a Decorator
- •Using a Decorator
- •The Chain of Responsibility Pattern
- •Example: Event Handling
- •Implementation of a Chain of Responsibility
- •Using a Chain of Responsibility
- •Example: Event Handling
- •Implementation of an Observer
- •Using an Observer
- •Chapter 1: A Crash Course in C++
- •Chapter 3: Designing with Objects
- •Chapter 4: Designing with Libraries and Patterns
- •Chapter 5: Designing for Reuse
- •Chapter 7: Coding with Style
- •Chapters 8 and 9: Classes and Objects
- •Chapter 11: Writing Generic Code with Templates
- •Chapter 14: Demystifying C++ I/O
- •Chapter 15: Handling Errors
- •Chapter 16: Overloading C++ Operators
- •Chapter 17: Writing Efficient C++
- •Chapter 19: Becoming Adept at Testing
- •Chapter 20: Conquering Debugging
- •Chapter 24: Exploring Distributed Objects
- •Chapter 26: Applying Design Patterns
- •Beginning C++
- •General C++
- •I/O Streams
- •The C++ Standard Library
- •C++ Templates
- •Integrating C++ and Other Languages
- •Algorithms and Data Structures
- •Open-Source Software
- •Software-Engineering Methodology
- •Programming Style
- •Computer Architecture
- •Efficiency
- •Testing
- •Debugging
- •Distributed Objects
- •CORBA
- •XML and SOAP
- •Design Patterns
- •Index
Developing Cross-Platform
and Cross-Language
Applications
C++ programs can be compiled to run on a variety of computing platforms and the language has been rigorously defined to ensure that programming in C++ for one platform is very similar to programming in C++ for another. Yet, despite the standardization of the language, platform differences eventually come into play when writing professional-quality programs in C++. Even when development is limited to a particular platform, small differences in compilers can elicit major programming headaches. This chapter examines the necessary complication of programming in
a world with multiple platforms and multiple programming languages.
The first part of this chapter surveys the platform-related issues that C++ programmers encounter. A platform is the collection of all of the details that make up your development and/or run-time system. For example, your platform may be a Microsoft C++ compiler running on Windows XP on a Pentium processor. Alternatively, your platform might be the gcc compiler running on Linux on a PowerPC processor. Both of these platforms are able to compile and run C++ programs, but there are significant differences between them.
The second part of this chapter looks at how C++ can interact with other programming languages. While C++ is a general-purpose language, it may not always be the right tool for the job. Through a variety of mechanisms, you can integrate C++ with other languages that may better serve your needs.
Cross-Platform Development
There are several reasons why the C++ language encounters platform issues. Even though C++ is a high-level language, its definition includes low-level implementation details. For example, C++ arrays are defined to live in contiguous blocks of memory. Such a specific implementation detail
Chapter 18
exposes the language to the possibility that not all systems arrange and manage their memory in the same way. C++ also faces the challenge of providing a standard language and a standard library without a standard implementation. Varying interpretations of the specification among C++ compiler and library vendors can lead to trouble when moving from one system to another. Finally, C++ is selective in what the language provides as standard. Despite the presence of a standard library, sophisticated programs often need functionality that is not provided by the language. This functionality generally comes from third-party libraries or the platform, and can vary greatly.
Architecture Issues
The term architecture generally refers to the processor, or family of processors, on which a program runs. A standard PC running Windows or Linux generally runs on the x86 architecture and Mac OS is usually found on the PowerPC architecture. As a high-level language, C++ shields you from the differences between these architectures. For example, a Pentium processor may have a single instruction that performs the same functionality as six PowerPC instructions. As a C++ programmer, you don’t need to know what this difference is or even that it exists. One advantage to using a high-level language is that the compiler takes care of converting your code into the processor’s native assembly code format.
Processor differences do, however, rise up to the level of C++ code at times. You won’t face most of these issues unless you are doing particularly low-level work, but you should be aware that they exist.
Binary Compatibility
As you probably already know, you cannot take a program written and compiled for a Pentium computer and run it on a Mac. These two platforms are not binary compatible because their processors do not support the same set of instructions. Recall that when you compile a C++ program, your source code is turned into binary instructions that the computer executes. That binary format is defined by the platform, not by the C++ language.
The solution for binary compatibility issues is usually cross-compiling. When you cross-compile a program, you build a separate version for each architecture on which it is destined to run. Some compilers support cross-compiling directly. Others require that you build each version separately on the destination architecture.
Another solution to differences in binary representation is open source distribution. By making your source available to the end user, she can compile it natively on her system and build a version of the program that is in the correct binary format for her machine. As discussed in Chapter 4, open-source software has become increasingly popular in the last several years. One of the major reasons is that it allows programmers to collaboratively develop software and increase the number of platforms on which it can run.
Word and Type Sizes
A word is the fundamental unit of storage for computer architectures. In most systems, a word is the size of an address and/or a single processor instruction. When someone describes an architecture as 32-bit, they most likely mean that the word size is 32 bits, or 4 bytes. In general, a system with a larger word size can handle more memory and operate more quickly on complex programs.
Since pointers are memory addresses, they are inherently tied to word sizes. Many programmers are taught that pointers are always 4 bytes, but this is not always the case. For example, consider the following program, which outputs the size of a pointer.
490
Developing Cross-Platform and Cross-Language Applications
#include <iostream>
using namespace std;
int main(int argc, char** argv)
{
int *ptr;
cout << “ptr size is “ << sizeof(ptr) << “ bytes” << endl;
}
If this program is run on a 32-bit Pentium architecture, the output is:
ptr size is 4 bytes
On a 64-bit Itanium system, the output is:
ptr size is 8 bytes
From a programmer’s point of view, the upshot of varying pointer sizes is simply that you cannot equate a pointer with 4 bytes. More generally, you need to be aware that most sizes are not prescribed by the C++ standard. The standard only says that a short integer has as much, or less, space as an integer, which has as much, or less, space as a long integer. An integer itself is supposed to contain enough space to hold a word, but as you saw above, this number can vary.
Word Order
All modern computers store numbers in a binary representation, but the representation of the same number on two platforms may not be identical. This sounds contradictory, but as you’ll see, there are two approaches to reading numbers that both make sense.
A single slot in your computer’s memory is usually a byte because most computers are byte addressable. Number types in C++ are usually multiple bytes. For example, a short may be 2 bytes. Imagine that your program contains the following line:
short myShort = 513;
In binary, the number 513 is 0000001000000001. This number contains 16 1s and 0s, or 16 bits. Because there are 8 bits in a byte, the computer would need 2 bytes to store the number. Because each individual memory address contains 1 byte, the computer needs to split the number up into multiple bytes. Assuming that a short is 2 bytes, the number will get split into two even parts. The higher part of the number is put into the high-order byte and the lower part of the number is put into the low-order byte. In this case, the high-order byte is 00000010 and the low-order byte is 00000001.
Now that the number has been split up into memory-sized parts, the only question that remains is how to store them in memory. Two bytes are needed, but the order of the bytes is unclear and in fact depends on the architecture of the system in question.
One way to represent the number is to put the high-order byte first in memory and the low-order byte next. This strategy is called big-endian ordering because the bigger part of the number comes first. PowerPC and Sparc processors use a big-endian approach. Some other processors, such as x86, order the bytes
in the opposite order, putting the low-order byte first in memory. This approach is called little-endian
491
Chapter 18
ordering because the smaller part of the number comes first. An architecture may choose one approach or the other, usually based on backward compatibility. For the curious, the terms “big-endian” and “littleendian” predate modern computers by several hundred years. Jonathan Swift coined the terms in his eighteenth-century novel Gulliver’s Travels to describe the opposing camps of a debate about the proper end on which to break an egg. Many computer scientists feel that the current debate about endianness is at least as silly as the one described by Swift.
Regardless of the word ordering a particular architecture uses, your program can continue to use numerical values without paying any attention to whether the machine uses big-endian ordering or little-endian ordering. The word ordering only comes into play when data moves between architectures. For example, if you are sending binary data across a network, you may need to consider the word ordering of the other system. Similarly, if you are writing binary data to a file, you may need to consider what will happen if that file is opened on a system with opposite word ordering.
Implementation Issues
When a C++ compiler is written, it is designed by a human being who attempts to adhere to the C++ standard. Unfortunately, the C++ standard is several hundred pages long and written in a combination of prose, language grammars, and examples. Two human beings implementing a compiler according to such a standard are unlikely to interpret every piece of prescribed information in the exact same way or to catch every single edge case. As difficult as it is to believe, even compilers have bugs.
Compiler Quirks and Extensions
The first compiler bug you encounter is a surreal experience. After all these years of tracking down and correcting your own bugs, you’ve finally discovered that the very program you have been depending on contains flaws! C++ compilers have improved greatly since the creation of the language, but bugs do exist in C++ compilers. At best, these are simply different interpretations of the specification or omitted language features. From time to time, however, you may find a case where the compiler simply does the wrong thing.
There is no simple rule for finding or avoiding compiler bugs. The best you can do is stay up to date on compiler updates and perhaps subscribe to a mailing list or newsgroup for your compiler. If you suspect that you have encountered a compiler bug, a simple Web search for the error message or condition you have witnessed could uncover a workaround or patch.
One area that compilers are notorious for having trouble with is the set of more recent language additions. For example, some of the template and run-time type features in C++ weren’t originally part of the language. As mentioned in Chapter 11, some compilers still don’t properly support these features.
Another issue to be aware of is that compilers often include their own language extensions without making it obvious to the programmer. For example, variable-sized stack-based arrays are not part of the C++ language, yet the following line compiles with the g++ compiler:
int i = 4;
char myStackArray[i]; // Not a standard language feature!
Some compiler extensions may be useful, but if there is a chance that you will switch compilers at some point, you should see if your compiler has a strict mode where it will avoid such extensions. For example, compiling the previous line with the pedantic flag passed to g++ will yield the following warning:
492