Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

C++ For Mathematicians (2006) [eng]

.pdf
Скачиваний:
193
Добавлен:
16.08.2013
Размер:
31.64 Mб
Скачать

76

 

 

C++ for Mathematicians

 

 

2 7 7

 

 

 

 

98

 

 

 

 

99

3 3 11

 

 

 

 

100

2 2 5

5

 

 

 

 

 

 

5.4A procedure to calculate Euler’s totient

With the factor procedure built, our next step is to build a procedure to calculate Euler’s totient. The totient procedure takes a long argument (n) and returns a long result (ϕ(n)). Here is the header file totient.h.

Program 5.4: Header file for the totient procedure.

1 #ifndef TOTIENT_H

2#define TOTIENT_H

3

4/**

5* Euler’s totient function.

6* @param n the number whose totient we seek

7* @return the number of elements in {1,2,...,n} that are relatively

8* prime to n.

9 */

10

11 long totient(long n);

12

13 #endif

We use Theorem 5.4 to design the totient procedure. For example, let n = 36,750. Factoring n gives 36,750 = 2 ×3 ×5 ×5 ×7 ×7. Then

ϕ(n) = (2 −1) ×(3 −1) ×5 ×5 ×(5 −1) ×7 ×(7 −1) = 8400.

If the prime p appears e times in the factorization of n, then in ϕ(n) it contributes pe−1 ×(p −1).

The procedure to calculate ϕ(n) begins by factoring n and saving the result in an array, flist. We then step through flist one element at a time. If flist[k] equals flist[k+1], then we multiply by flist[k]; otherwise, we multiply by flist[k]-1. To do this, we use an expanded version of the if statement. The expanded version is called an if-else statement and is structured like this:

if (condition) { statements1;

}

else { statements2;

}

When this structure is encountered, the condition is evaluated. If it evaluates to TRUE then statements1 are executed and statements2 are skipped. However, if

Arrays

77

condition evaluates to FALSE, then statements1 are skipped and statements2 are executed.

We need to be careful when we reach the end of the array; we do not want to access the element past the end of the array.

Here is the code.

Program 5.5: The code for the totient procedure.

1 #include "totient.h"

2#include "factor.h"

3

4long totient(long n) {

5 // handle special cases

6 if (n <= 0) return 0;

7if (n == 1) return 1;

8

9// factor n

10long flist[100];

11long nfactors = factor(n,flist);

12

13 long ans = 1;

14

15 for (long k=0; k<nfactors; k++) {

16

17if (k < nfactors-1) {

18if (flist[k] == flist[k+1]) {

19ans *= flist[k];

20}

21else {

22ans *= flist[k]-1;

23}

24}

25else {

26ans *= flist[k]-1;

27}

28}

29return ans;

30}

After factoring n, we set up a variable named ans that holds the result to be returned (line 13).

Lines 15–28 form the core of the procedure. This is a for loop that steps through the array flist. If we are not yet at the last element of the array (checked on line 17) we compare the current element of the array and the next element of the array for equality. If they are equal, we multiply ans by flist[k] (line 19) but otherwise (see the else on line 21) we multiply by flist[k]-1 (line 22).

If we are at the last element of the array (the else on line 25), then the proper factor is flist[k]-1.

We test the totient procedure with this simple main:

#include "totient.h" #include <iostream> using namespace std;

78

C++ for Mathematicians

/**

* A program to test the totient procedure. */

int main() {

for (int k=1; k<=100; k++) {

cout << k << "\t" << totient(k) << endl;

}

return 0;

}

The resulting output looks like this:

11

21

32

42

54

62

76

84

96

104

1110

(many lines omitted)

9024

9172

9244

9360

9446

9572

9632

9796

9842

9960100 40

5.5The Sieve of Eratosthenes: new and delete[]

The totient procedure we developed in the previous section works well, but relies on an inefficient factoring procedure. Because we expect to be calculating ϕ(k) for millions of different values of k, it is worth the effort to make the procedure as efficient as possible.

One of the inefficiencies in the factor procedure arises from the lack of a table of prime numbers. If we had a table of prime numbers we could avoid wasted steps. If we only needed to factor one number or only wanted one value of ϕ, it might not be worth the effort. However, because we plan to compute ϕ millions of times, we can greatly increase the speed of our program by first building a table of primes.

Arrays

79

An efficient method to build a table of primes is known as the Sieve of Eratosthenes. To find all the primes up to some value N, we write down all the integers from 2 up to N (we skip 1). We circle 2 (it’s prime) and then cross off all other multiples of 2. The first unmarked number is 3. We circle 3 then cross off all other multiples of 3. Notice that 4 is crossed off, so the next unmarked number is 5. We circle 5, and then cross off all other multiples of 5. We continue in this manner until we reach N. At the end, the circled numbers are exactly the primes; every other entry in the table has been crossed off. The algorithm is illustrated in Figure 5.2.

2

3

4

5

6

7

8

9

10 11 12

2

3

4

5

6

7

8

9

10 11 12

2

3

4

5

6

7

8

9

10 11 12

2

3

4

5

6

7

8

9

10 11 12

2

3

4

5

6

7

8

9

10 11 12

Figure 5.2: Illustrating the Sieve of Eratosthenes algorithm.

We now create the sieve procedure. In a header file, sieve.h we declare the procedure as follows.

Program 5.6: The header file for a procedure to build a table of primes via the Sieve of Eratosthenes.

1 #ifndef SIEVE_H

2#define SIEVE_H

3

4/**

5* The Sieve of Eratosthenes: Generate a table of primes.

6*

7* @param n upper limit on the primes (i.e., we find all primes

8* less than or equal to n).

9* @param primes array to hold the table of primes.

10* @return the number of primes we found.

11*/

12

13 long sieve(long n, long* primes);

14

15 #endif

The parameter n is an upper bound on the primes we seek. The array primes is a place to hold the primes found by the procedure. It is the responsibility of the procedure that calls sieve to make sure that primes is big enough to hold the array.

80

C++ for Mathematicians

For example, if we wish to generate all primes up to ten million (107), how big does primes need to be? The prime number theorem gives us an estimate. The number of primes less than or equal to n is approximately n/logn. In the case n = 107, this gives an estimated 620,421 primes. In fact, the number is a bit higher: 664,579 to be exact. To be safe, the calling procedure should make sure that primes is (say) 1.5 times the estimate in the prime number theorem.

The code for the sieve procedure introduces a new idea. The procedure requires an array to hold the sieve. If n is 107, then this table requires 107 entries. For efficiency’s sake, we want the individual entries in the table to use as few bytes as possible. We could declare this table to be made up of short integers, but the char type is only one byte on most systems. Let’s call the table theSieve. We may be tempted to begin our program as follows.

long sieve(long n, long* primes) {

char theSieve[n]; // array for the sieve

Unfortunately, this is illegal. In declaring an array, the size of the array must be a constant; it may not be a number that can’t be known until the program is run.

We could declare theSieve to be a huge array (say, of size ten million). However, if we wish to use this procedure for larger values of n we would need to rewrite the program.

A better solution is to create an array using C++’s new operator. This operator allows us to create an array whose size is determined when the program is running.

The beginning of the sieve procedure looks like this:

long sieve(long n, long* primes) {

if (n<2) return 0; // no primes unless n is at least 2.

char* theSieve;

theSieve = new char[n+1]; // hold the marks

The initial if statement handles the case when a user might call sieve with, say, n negative. The next two lines are the interesting part.

The line char* theSieve; declares theSieve to be a name that can refer to an array of chars. However, this name may not be used yet because the array is not created. The star on the type name char is important; without it, the name theSieve would refer to a single character, not an array.

At this point we have a name to use for the array, but no space in memory to hold the array. The next line allocates the space. The statement

theSieve = new char[n+1];

does two things. First, it requests a block of n+1 pieces of memory, each big enough to hold a char. Second, it causes theSieve to refer to the start of that block of memory. In the program, we want to use the entries up to theSieve[n]; this is why we requested n + 1 array elements.

There is an important difference between an array that is declared with the usual sort of statement (such as long primes[10];) and an array that is created with new (such as long* primes = new long[10];). Arrays created with the standard

Arrays

81

declaration automatically disappear when the procedure in which they are defined exits; at that point, the memory they use is automatically freed to be used by other parts of the program. However, an array allocated using the new statement remains reserved until it is explicitly released.

A block of memory allocated by the use of new must later be released with a delete[] statement like this:

delete[] theSieve;

There must be one and only one delete[] balancing each use of new. If the delete[] is missing, the array persists until the program exits. If one has repeated requests to new (without matching delete[]s), then more and more memory is tied up in arrays until the computer runs out of memory to honor the requests. This situation is known as a memory leak.

On the other hand, if one tries to perform a delete[] on the same block of memory more than once, the gates of the underworld will open and demons will rule the earth. Or your program might crash.

The sieve program follows.

Program 5.7: The sieve procedure.

1#include "sieve.h"

2

3long sieve(long n, long* primes) {

4

5 if (n<2) return 0; // no primes unless n is at least 2.

6

7char* theSieve;

8

9theSieve = new char[n+1]; // hold the marks

10

11 // Names of marks to put in theSieve

12 const char blank = 0;

13 const char marked = 1;

14

15// Make sure theSieve is blank to begin

16for (long k=2; k<=n; k++) theSieve[k] = blank;

17

18 long idx = 0; // index into the primes array

19

20for (long k=2; k<=n; k++) {

21if (theSieve[k]==blank) { // we found an unmarked entry

22

theSieve[k] = marked;

// mark it as a prime

23

primes[idx] = k;

// record k in the primes array

24

idx++;

 

25

 

 

26// Now mark off all multiples of k

27for(long d=2*k; d<=n; d+=k) theSieve[d] = marked;

28}

29}

30delete[] theSieve;

31return idx;

32}

 

82

C++ for Mathematicians

 

 

Line 9 allocates the array theSieve. The matching delete[] is on line 30.

 

 

On lines 12 and 13 we create names for the marks we use in the array theSieve.

 

We use two types of mark to distinguish the two types of cells: blank cells and

 

marked cells. We could have simply used the values 0 and 1 in the program, but

 

unnamed constants are to be shunned. By giving these names we make the code

 

more understandable. The const qualifier in the declaration tells the compiler that

 

these values, blank and marked, never change. This does two good things. It

 

prevents us from accidentally writing code that would change these symbols and it

 

enables the compiler to produce more efficient object code.

 

 

Line 16 ensures that the array theSieve is entirely populated with blank (i.e.,

 

zero) before we begin. In a perfect world arrays are given to you filled with sensible

 

default values (such as zero). However, it is folly to rely on this. An initial run

 

through the array to make sure it is in the state we hope is quick and easy.

 

 

The variable idx on line 18 is an index into the array primes. It refers to the

 

next available cell in primes at all points in the program. At the end, it will have

 

been incremented once for every prime we record, and so it will hold the number of

 

primes found. That is why we use idx as the return value on line 31.

 

 

The sieving takes place on lines 20–29. When we come to an entry k in theSieve

 

that is blank it must be a prime. We mark that location, record the number k in

 

primes (and increment idx). Then (line 27) we place a mark in every cell that is a

 

multiple of k.

 

 

Here is a main to test the sieve procedure.2

 

 

 

 

 

Program 5.8: A program to test the sieve procedure.

1

#include "sieve.h"

2

#include <iostream>

3using namespace std;

4

 

 

5

const long N = 10000000;

// ten million

6const long TABLE_SIZE = 800000; // prime number theorem overestimate

7

8/**

9* A program to test the sieve procedure.

10

*/

11

 

12int main() {

13long primes[TABLE_SIZE];

14long np = sieve(N,primes);

15

16 cout << "We found " << np << " primes" << endl;

17

18 cout << "The first 10 primes we found are these: " << endl;

2Note: On Windows computers this program might crash because of line 13. Some computers place a limit on the maximum size array one can declare. The solution is to allocate large arrays dynamically. That is, replace line 13 with this: long* primes; primes = new long[TABLE SIZE]; Remember to delete[] primes; before the end of the program.

Arrays

83

19for (long k=0; k<10; k++) cout << primes[k] << " ";

20cout << endl;

21

22 cout << "The largest prime we found is " << primes[np-1] << endl;

23

24return 0;

25}

Finding all the primes up to ten million is quick. The output from the program

 

 

appeared on my screen in under four seconds.

 

 

 

 

 

We found 664579 primes

 

 

 

 

 

 

 

 

 

The first 10 primes we

found are these:

 

 

 

 

 

2 3

5 7

11 13 17 19 23

29

 

 

 

 

 

The

largest prime we found is 9999991

 

 

 

 

 

 

 

 

 

 

5.6A faster totient

With a table of primes at our disposal, we can calculate ϕ(n) without first factoring n; here’s how. For each prime p in the table, we check if p divides n. If so, we replace n by (n/p)(p −1). In the end, we have calculated

 

p1

 

p2

···

pt

 

n

p1 −1

 

p2 −1

 

 

pt −1

 

 

 

 

 

 

 

which, by Theorem 5.4, is ϕ(n).

The procedure we create has two arguments. The declaration of this function (in the file totient.h) looks like this:

long totient(long n, const long* primes);

Ignore the const keyword for a moment.

The first argument is n: the number for which we wish to calculate ϕ.

The second argument is a table of primes. The type of this argument is long* which indicates that primes holds the starting position of an array of long integers. The table of primes is not duplicated; what we pass to the totient procedure is the address of the table.

The procedure returns a long: the totient of n.

Now we consider the extra word const in the declaration. When we pass an array to a procedure it is possible for the procedure to change the values held in the array. Indeed, we relied on that fact when we created the sieve procedure. Recall that sieve is declared as long sieve(long n, long* primes);. The sieve procedure receives the array address primes and can then populate that array with the desired values.

84

C++ for Mathematicians

In the case of our new totient procedure, we use the values housed in the array primes, but we do not alter them. The const qualifier asserts that the procedure totient does not modify any element in the array primes.

Although the procedure totient would work equally well without the const qualifier, it is a good habit to declare arguments as const when appropriate. If (by mistake) the code in your procedure is capable of changing elements in the array, the compiler will complain and help you spot the error.

The code for the new totient procedure is this:

Program 5.9: A faster totient procedure that employs a table of primes.

1#include "totient.h"

2

3 long totient(long n, const long* primes) {

4if (n<=0) return 0;

5

6long ans = n;

7 for (long k=0; primes[k] <= n; k++) {

8 if (n%primes[k]==0) {

9 ans /= primes[k];

10ans *= primes[k]-1;

11}

12}

13return ans;

14}

The program is fairly straightforward. We make a copy of n in a variable named ans. (This isn’t necessary, but improves clarity.) In lines 7–12 we consider all primes that are less than or equal to n. If such a prime p is a factor of n, we modify ans by dividing out p and then multiplying by p −1 (lines 9–10). In the end, ans holds

ϕ(n).

The only caveat is that we must be sure that the array primes contains all the prime factors of n. One way to do this is to generate all primes up to n using sieve. For example, the following main tests the faster totient procedure.

#include "totient.h" #include "sieve.h" #include <iostream> using namespace std;

/**

*A main to test the faster version of Euler’s totient on

*the integers from 1 to 100.

*/

int main() {

const int N = 100;

// testing up to N

long primes[10*N]; // table of primes sieve(10*N, primes);

Arrays

85

for (long k=1; k<=N; k++) {

cout << k << "\t" << totient(k,primes) << endl;

}

}

The output is a two-column table. The first column contains the integers from 1 to 100, and the second column contains Euler’s totient of these.

5.7Computing pn for large n

Recall from the beginning of this chapter that we can calculate pn by the following formula,

 

n

"

k=1

#

 

1

 

n

 

pn =

2

 

1 + 2 ϕ(k) .

In this section we write a program to calculate pn for n equal to one million.

The main part of the program adds ϕ(k) as k goes from 1 to one million. Because we know pn is around 0.6, we expect the final sum to be around 0.6 ×1012 which is

larger than a long on a system where sizeof(long) is 4 bytes. (A 32-bit integer can store values up to about two billion, but not in the trillions.) So we need to use a long long (or int64); fortunately on my computer this is an 8-byte quantity and can hold values that are nearly 1019. This is more than adequate to the task.

Calculating this sum takes many minutes (but not many hours). We can request the program to report its progress along the way. In the program we present, we report pk whenever k is a multiple of 105. Here is the program.

Program 5.10: A program to calculate pn for n equal to one million.

1 #include "totient.h"

2 #include "sieve.h"

3 #include <iostream>

4#include <iomanip>

5using namespace std;

6

7/**

8* A program to calculate the probability that two integers chosen in

9* {1,2,...,n} are relatively prime. This probability is calculated

10* for values of n up to ten million.

11*/

12

13

int main() {

 

 

14

 

 

 

15

const long

N = 1000000;

// one million

16

const long

TABLE_SIZE = 200000;

// prime number th’m overestimate

17

 

 

 

18

// set up the table of primes