Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

You Can Program In C++ (2006) [eng]

.pdf
Скачиваний:
80
Добавлен:
16.08.2013
Размер:
4.33 Mб
Скачать

128

CHAPTER 7

Now try replacing line 5 with:

std::cout << max(i * 1.5, j + 2.3);

Notice that now we get the function-template definition used, because both those expressions are doubles.

Please experiment with other choices until you are happy you understand how a function template interacts with a plain function of the same name.

Function Templates Can Be Specialized

Suppose we try the following version of main( ):

int main( ){ try{

std::string s1("help"); std::string s2("me"); std::cout << max(s1, s2);

}

catch(...){ }

}

You may be surprised that it does not compile, because we used something similar earlier on without any problems when we were writing code with a typedef. However, what may surprise you even more is the reason the compiler gives for rejecting the code: it complains of ambiguity. Remember that when I introduced a program that used the template, I warned you to avoid types from the Standard C++ Library. The compiler has found a second template called max; this one is hidden away in namespace std. Why has it now found this one and added it to the potential candidates? The first thing to notice is that the compiler has called an ambiguity error, not a redefinition error. That tells us that it is all right to overload function templates, but that the problem was that it could not choose one as better than the other. However, how did it find the second one? The answer lies in something called argument-dependent lookup (ADL). When looking for a function name, the compiler will also look in the namespaces of the arguments provided in the function call; we call that ADL. In this case, the argument type is std::string, so it not only searches the obviously visible declarations and finds ours, but it also searches the parts of std namespace that it can see. There it finds another declaration of a function template called max. ADL is intended to be helpful, and it usually is, but sometimes it springs a surprise. This is such a case where a name hidden away in a namespace gets unexpectedly exposed and causes a conflict.

Fixing ADL Ambiguity

We can easily fix the problem by using the fully qualified name of the function we are using. That disables ADL and causes the compiler to look only in the namespace specified by the qualification. In this case we need to replace max with ::max (i.e. explicitly specify the global version we have provided).

The new code compiles and executes fine. But there is more. We might want to compare words but ignore the case used, so that ‘word’, ‘Word’, and ‘WORD’ will all be treated as equivalent. We need a special version of max( ) to do that.

GENERIC FUNCTIONS

129

Specializing max( )

We could write a plain function that has two std::string arguments. However, that enables conversions (i.e. allows use of any type from which a std::string can be implicitly created). If we do not want to allow conversions, we must stick with a template. The following code is an example of specialization that specializes our max( ) function template for std::string.

1 template<>

2 std::string const & max<std::string const &>(std::string const & first, 3 std::string const & second){

4 std::string s1(first);

5std::string s2(second);

6 std::transform(s1.begin( ), s1.end( ), s1.begin( ), tolower);

7 std::transform(s2.begin( ), s2.end( ), s2.begin( ), tolower);

8 return s1 > s2 ? first : second;

9 }

W A L K T H R O U G H

Line 1 is the way we tell the compiler that the following code is a special case of a template for which we are providing a new definition. Line 2 declares the original template with the type(s) for the special case replacing the name(s) given in the original function template. That is, we have to provide the type’s name (or names if there was more than one template type parameter) as template arguments (the <std::string> following max).

Lines 4 and 5 simply create copies of the function’s arguments, because we are going to change them but still want to be able to return a copy of the original.

Lines 6 and 7 use std::transform, one of the Standard C++ Library functions (declared in <algorithm>, so we will need to include that header when we try to compile this code). std::transform takes four parameters. The first two give the beginning and end of the sequence to be transformed (just as std::sort( ) did for a sequence to be sorted). The third parameter identifies where the transformed sequence will start – in this case, the transformation is in situ, i.e. the transformed elements will replace the originals. The fourth parameter names the function that is applied to each element of the sequence to create the elements of the transformed sequence.

Line 8 compares the lowercase strings to select which of the originals to return.

T R Y T H I S

Try including the specialization of max( ) for std::string with the version of main( ) that compares two std::string objects. Experiment with different values stored in the std::string objects until you are happy that the resulting code is making a case-independent comparison.

130

CHAPTER 7

EXERCISES

4.Write a template for squaring numbers. The important issue here is that the value of the square of a number should have the same type as that of the value provided. Your function template should work for any type that supports the multiplication operator.

5.Write a function template that has three parameters of the same type and returns the middle of the three in order of size.

6.Specialize your function template for middle so that it will select the middle of three words in a caseindependent way.

Overloading Function Templates

The error message earlier when we tried to use max( ) for a std::string hinted that it might be possible to overload function templates. Not only is it possible but it can also be useful.

The example I am going to give next lacks full generality because I am going to deal with a single type of C++ container – std::vector – when there are actually many types of containers in C++ (some of which we will look at in later chapters). The following code declares and defines a function template that returns the maximum value found in a std::vector<T> object, where T will be replaced by the exact type when the function is generated from the template.

1 template<typename T>

2 T max(std::vector<T> & data){

3 if(not data.size( )) throw std::range_error("No data");

4T maximum(data[0]);

5 for(size_t i(1); i != data.size( ); ++i){

6maximum = ::max(maximum, data[i]);

7}

8 return maximum;

9 }

W A L K T H R O U G H

The source code uses std::vector and std::range error, so the compiler will have had to have seen the contents of the <vector> and <stdexcept> headers if it is to compile code using this max( ) function template.

Line 1 declares that the following is some generic code based on a type called T (it is idiomatic to name a template type parameter with T, just as mathematicians use x, y, and z for unknowns but a, b, and c for constants). Line 2 says that this is a function that will be given a reference to std::vector of T values as an argument and will return a T value.

Notice that the function parameter declared in line 2 is a reference. The reference is important because we certainly would not normally want to pass containers such as std::vector<T> around by value (i.e. copying the container). Containers can contain large numbers of elements. In addition, the elements themselves can be large and expensive to copy.

GENERIC FUNCTIONS

131

Line 3 is an example of something that is easy to miss. It is just possible that a programmer hands over an empty vector. With the simple design we are using this will be an error that we need to detect and deal with. In this case, I have borrowed a Standard C++ Library type used to report problems with ranges. It is an imperfect choice but will do for now. The controlling expression in the if statement uses the Boolean not operator (if you prefer you can use the symbolic operator !). If data.size( ) is zero, not data.size( ) will be true; otherwise the expression will evaluate to false.

Line 4 sets the provisional value of maximum to the value of data[0]. At this stage, we know that data[0] exists because we checked that there was at least one value in data at line 3.

Lines 5 –7 loop through all the values in data keeping track of which is the largest so far. Notice that line 6 uses the other max( ) function template. This is a common technique and demonstrates that the compiler can keep track of what version of an overloaded set of function templates it should use. Also notice that I have used a fully elaborated name to avoid any possible confusion with std::max( ). Even if there were no name collision, it is still good practice to use fully elaborated names inside template definitions. We cannot know what names may be dragged in by the function arguments of a specific function generated from a function template by providing the template type arguments.

T R Y T H I S

Create a project to produce a program from the following version of main( ). You will need to add the definitions of the two function templates for max( ) as well as including suitable headers.

int main( ){ try{

std::vector<int> data; std::cout << max(data);

}

catch(std::range_error & error){ std::cerr << error.what( );} catch(...){std::cerr << "Caught unknown exception\n";}

}

There are two things to notice about this code. Firstly, it tests the bad case where there is no data. Secondly, it demonstrates how we can deal with a specific type of exception. When provided with a list of catch clauses the program will use the first one that fits the exception thrown. In this case both catch(std::range error & error) and catch(...) could deal

with the exception thrown by our call of max( ) with an empty vector. If you reverse the order, the compiler will give you an error, because catch(...) (i.e. catch all exceptions) must be the last catch clause. Note that catch clauses look a bit like functions with one parameter, and they behave very similarly. The C++ Standard Library exceptions support a what( ) member function that will regurgitate whatever message was provided at the point where the exception was created.

Now try inserting some values into data with a few lines such as:

data.push_back(5);

132CHAPTER 7

Also, try modifying the program so that it has values of some other type stored in a suitable std::vector. Make sure you try it for std::string. You may notice that, unlike our previous function template for max( ), there is no problem with ambiguity for that last case. That is because the Standard C++ Library does not provide a function template for max( ) that could be confused with the one we have written using a std::vector<T> parameter.

C++ Iterators

The C++ iterator is a type whose values locate another object.

Language Note: Most computing languages incorporate some form of implementation of the iterator concept. Some, such as Java, try to keep to minimalist support because general iterators are sources of confusion for many people. Others, such as C, provide extensive arithmetic support for manipulating iterators.

C++ has an entire hierarchy of iterator categories built on the simple idea of an object that can store values that locate other objects. The following is a summary of the most important types of C++ iterator. Each category includes the functionality of all previously described categories. So, for example, a bidirectional iterator can be used wherever a forward iterator is required.

trivial iterator: The simplest kind of iterator, a trivial iterator, just provides the location of an object and nothing more. It is rather like a URL that we use when locating material on the World Wide Web. Given an iterator, we get the object it locates by preceding it with an * (asterisk). So if location is an iterator, *location is the object itself. The unary * operator is called the dereference operator, because in the context of computer science, ‘dereferencing’ means getting whatever is identified by a pointer or address.

Language Note: A C pointer that points to a single object rather than into an array is an example of a trivial iterator. Function pointers in most languages that support them will be trivial iterators. Languages such as C allow arrays of function pointers, but the individual elements of such an array are trivial pointers. We will eventually deal with pointers in C++ (which are almost identical to pointers in C).

forward iterator: This kind of iterator can use the following operators:

unary * to obtain the object that the iterator value locates

-> to access a member of the object located (this will be covered when we deal with the C++ classes)

++ for both preand post-increment, to change the value stored in an iterator object to locate the next object of the kind being located by the iterator values

In addition we can compare two iterator values for equality (==) or inequality (!=).

There are two special types of forward iterator – an input iterator and an output iterator. The special feature of these iterators is that they are essentially for traversing data exactly once. The data may come from a transient source (such as a keyboard) or go to a transient destination such as a printer.

bidirectional iterator: These are iterators that add the functionality of going backwards through a sequence by using the preand post-decrement operators. So, for example, as long as iter does not locate the start of a sequence, --iter will locate the previous element of a sequence. There is a requirement that the

GENERIC FUNCTIONS

133

decrement operation undoes an increment one and vice versa so long as the intermediate iterator values are valid.

random-access iterator: These are bidirectional iterators that also support all the operations that allow going forwards and backwards by any integral number of elements that does not take the iterator outside the sequence.

Random-access iterators support addition (+) and subtraction (-) of an integer to/from an iterator value. They also support += and -= to adjust the stored value. They support [ ] (the subscript or index operator), so iter[n] is equivalent to *(iter + n).

Given that iter and jter are iterators into the same sequence, n = iter - jter is required to make the value of n such that iter == jter + n.

Finally < (less than) is required to be applicable to operands iter and jter that are random-access iterators into the same sequence. iter < jter returns true if and only if iter locates an object that is strictly before that located by jter. The other logical operators are defined in terms of <. So, for example: iter > jter is defined to be equivalent to jter < iter; iter != jter is equivalent to ((iter < jter) or (jter < iter)); and iter == jter is equivalent to (not(iter != jter)). Note that the implication of these definitions is that iterators are only required to provide a weak ordering, and that it is possible that two iterators compare equal even though they do not locate the same object.

Language Note: A C++ (or C) pointer into an array – we will be dealing with these later – is an example of a random access iterator.

Reference material: Chapter 7 of Generic Programming and the STL [Austern 1999] provides an in-depth study of C++ iterators.

Version of max(std::vector) Using an Iterator

Less experienced programmers may wish to skip this section when reading this book for the first time as it gets into some fairly advanced aspects of C++.

It is more normal in C++ to use iterators rather than values when we are dealing with collections or sequences. Given an iterator, it is easy to extract the value of the object it locates. However, given an iterator, we can do much more, such as changing the value of the object the iterator locates. Iterators generally give us access to objects rather than just values.

Here is a version of max(std::vector<T>) that returns an iterator to the object with the largest value.

1 template <typename T>

2 typename std::vector<T>::iterator max(std::vector<T> & data){

3 typename std::vector<T>::iterator maximum(data.begin( ));

4 for(typename std::vector<T>::iterator iter(data.begin( ));

5iter != data.end( ); ++iter){

6maximum = (*iter > *maximum ? iter : maximum);

7}

8 return maximum;

9 }

134

CHAPTER 7

W A L K T H R O U G H

Look carefully at line 2. Now focus on the return type of max( ) (that is, everything before max). We want to return an iterator for an object that is an element of a std::vector<T>. The type of an iterator for a std::vector is provided by the definition of the std::vector class template (do not worry – we will deal with the details of class templates later, but for now that is just the proper name for this kind of type). The name of the type for such an iterator is std::vector<T>::iterator with T replaced by the exact type the vector contains. However, in the context of templates, the compiler needs to have it confirmed that std::vector<T>::iterator is the name of a type and not the name of something else such as a variable or function. That is what the typename does here (and why the keyword was introduced into C++ in the first place).

Line 3 just sets up an iterator to hold the location of the largest element of the vector. You may wonder why I no longer check that there are any elements. The mechanism for dealing with empty containers is that the iterator value for their start is the same as that for their end. data.begin( ) gives the location of the start of the vector’s data and data.end( ) will give a special value that marks that there are no more elements. If the container is empty data.begin( ) will have that same special value as data.end( ). One advantage of this mechanism is that we can often treat empty containers the same way as we treat all the others. As long as we do not try to access an object located by the

.end( ) iterator all will be fine.

Lines 4 –7 simply loop through the container from start to finish, updating maximum so that it points to the element with the largest value so far. The *iter and *maximum are used so that we compare the values of the objects, not the values of the iterators. That is why we cannot call our other max function template: it would compare the iterators, not the objects that the iterators locate.

T R Y T H I S

Modify your earlier program so that it uses this function template for max( ) to find the maximum value of the values stored in a vector. Make sure you try versions for several different types.

The fgw::read Function Templates

The Standard C++ Library has many examples of function templates. We often use them without even realizing that that is what we are doing. In many cases, the compiler can deduce the types from the function arguments used in the call. When I write std::sort(data.begin( ), data.end( )) the compiler recognizes that I want a specific instance of the std::sort( ) function template that uses iterators of the type of the arguments data.begin( ) and data.end( ). This is a major benefit because it provides a great deal of versatility coupled with transparency. The programmer does not have to spend time making things explicit when the compiler can deduce what we mean.

However, there is another powerful use of function templates: we can reuse a function name even when the parameter list is not sufficiently different for ordinary overloading to work. The classic case is when we want to carry out essentially the same process, with the same parameters but with different return types.

Consider the problem of getting data from some form of input; if the data provided does not match the type required the input stream fails. In effect that means that we should always check that data input succeeded. Such checks are tedious and repetitious because they are effectively the same for all types that can get values from input by using the >> operator.

Remember the problem of getting an int or a double from std::cin? Now look at the following function template (if it seems too complicated for you to have written, do not worry as you do not need to understand exactly how it does its job in order to use it):

 

GENERIC FUNCTIONS

135

1

template<typename in_type>

 

2

in_type read(std::string const & prompt, unsigned int max_tries = 3){

 

3

in_type temp(in_type( ));

 

4unsigned int tries(0);

5 while(tries++ != max_tries){

6 std::cout << prompt;

7std::cin >> temp;

8if(not std::cin.eof( )) eat_ws_to_eol(std::cin);

9if(not std::cin.fail( ) or std::cin.eof( )) return temp;

10std::cin.clear( ); // if it has failed, reset it to normal

11std::cin.ignore(INT_MAX, '\n'); // flush cin

12std::cout << "\n That input was incorrect, try again: \n";

13}

14throw fgw::bad_input("Too many attempts to read data.");

15}

W A L K T H R O U G H

Lines 1 and 2 tell us that this is a generic function (a function template) called read. The template type provides the return type. The generated functions have two parameters. The first is a std::string that provides a prompt, and the second has a default argument that will be used if the caller does not provide a second argument. As the template type argument is only used for the return type, the compiler cannot deduce the template type argument from a call. That tells us that we will always have to provide it explicitly.

Line 3 defines a default-initialized object of the in type, which we call temp. The in type( ) used as an initializer in the definition of temp is a special syntax that tells the compiler to do whatever is appropriate to default-initialize the object. In the case of fundamental types such as int and double, this results in temp being initialized to zero.

Line 4 defines an object that we will use to track retries. Line 5 effectively says we should keep trying until either we succeed or we have exceeded the maximum number of tries (3 by default).

Line 6 outputs the provided prompt to the console, and line 7 attempts to extract a suitable value from the console.

The rest of the function gets complicated and demonstrates the value of a template solution. One possibility is that input succeeded but was terminated by the user explicitly providing an end-of-file character (Ctrl+Z on a Microsoft Windows machine and Ctrl+D on a UNIX-based one). Usually users do not terminate input that way but do so by pressing the Enter key. However, users sometimes type ahead when they are familiar with a program, or add redundant spaces at the end of a response. The purpose of line 8 is to call a special function (from my library) that removes such redundant input.

Line 9 effectively says that as long as the input succeeded, the current value of temp (i.e. whatever was read at line 7, or possibly if the user hit the end-of-file key immediately, the default value) is returned to the caller.

We can only get to line 10 if input failed. In this case we first reset std::cin by calling the clear( ) function, which clears the failure flags. Next line 11 clears out all the input from the failure (that is what the call of ignore(INT MAX, '\n') achieves). Line 12 provides a generic message explaining why the input was rejected. As long as the maximum number of retries has not been exceeded, the code loops back to try again.

If the maximum number of retries is exceeded, the function exits by throwing an exception of a type provided by my library.

136

CHAPTER 7

Using read<>

The above source code is a slightly modified version of a function template that is provided in the fgw text.h header file. This is one of a set of overloaded function templates to provide support in extracting data from input (the overload set includes provision for extraction from other sources, such as files). The only restriction on the type for which these function templates can be used is that they must be able to use the >> operator to extract data from an input source.

Eventually you will be able to write such templates, and probably improve on mine, but the important point is to be able to use them, and by doing so appreciate how templates can help with your programming. For example here is a small program to determine the biggest positive value input with std::cin using fgw::read<> function templates.

1 // created on 15/10/2004

2 #include "fgw_text.h"

3 #include <iostream>

4

5 int main( ){

6try {

7int biggest(0);

8std::string const prompt(

9"Type in an integer (zero or a negative value to end): "));

10do{

11int const i(fgw::read<int>(prompt));

12if(i < 1) break;

13if(biggest < i) biggest = i;

14} while(true);

15std::cout << "The largest number input was " << biggest << '\n';

16}

17catch(fgw::bad_input & except){

18std::cerr << except.report( );

19}

20catch(...){

21std::cerr << "***An exception was thrown.***\n";

22}

23}

W A L K T H R O U G H

By now, most of the above code should be standard. Lines 8 and 9 just move the prompt used for input outside the loop. Line 10 is the interesting line because it not only demonstrates the use of fgw::read<> but shows how it can be used to initialize a const object at run time. Line 16 is another example of catching a specific exception and dealing with it appropriately. Line 19 then provides for dealing with any other exceptions that happen even though we were not explicitly expecting them. fgw::bad input objects support a function called report( ), which provides access to the message provided by the creation of the exception object.

GENERIC FUNCTIONS

137

Using read<> With the Default Prompt

If you call fgw::read<>( ) without supplying any arguments, the compiler will use an overload of the function template that supplies a default prompt. Here is the definition of that function template:

template<typename in_type> in_type read( ){

return read<in_type>(": ");

}

Notice how easy it is to write such a definition, which simply delegates all the real work to the alreadywritten general version. Another advantage of this is that function templates such as this one automatically acquire any improvements, bug fixes, etc. that may be acquired by the general version.

Here is the program from the start of this chapter modified to use fgw::read<> to get the data:

1 #include "fgw_text.h"

2 #include <iostream>

3 #include <vector>

4

5 int max(int first, int second){

6 return first > second ? first : second;

7 }

8

9 int main( ){

10try{

11std::vector<int> data;

12int value(0);

13std::cout << "Type in some integers. End input with -9999\\";

14do{

15value = read<int>( );

16data.push_back(value);

17} while(value != -9999);

18int maximum(-9999);

19for(size_t i(0); i != data.size( ); ++i){

20maximum = max(maximum, data[i]);

21}

22std::cout << "The largest input value was " << maximum << ".\";

23}

24catch(fgw::bad_input & except)std::cerr << except.report( );

25catch(...)std::cerr << "\***An exception was thrown***\";

26}

I have highlighted the added or modified lines. Line 1 provides the compiler with access to the definitions of the function templates for fgw::read<>. Line 24 explicitly deals with the exception that might result from the user repeatedly failing to supply appropriate input (perhaps their pet cat is trying a little computer use). The valuable line is line 15, which now handles invalid input almost transparently.