
Matrix Theory and Linear Algebra

I

INTRODUCTION

Matrix Theory and Linear Algebra, interconnected branches of mathematics that serve as fundamental tools in pure and applied mathematics and are becoming increasingly important in the physical, biological, and social sciences.

II

MATRIX THEORY

A matrix is a rectangular array of numbers or elements of a ring (see Algebra). One of the principal uses of matrices is in representing systems of equations of the first degree in several unknowns. Each matrix row represents one equation, and the entries in a row are the coefficients of the variables in the equations, in some fixed order.

A matrix is usually enclosed in brackets:

In the above matrices, a, b, and c are arbitrary numbers. In place of brackets, parentheses or double vertical lines may be used to enclose the arrays. The horizontal lines, called rows, are numbered from the top down; the vertical lines, or columns, are numbered from left to right; thus, -1 is the element in the second row, third column of M1. A row or column is called a line.

The size of a matrix is given by the number of rows and columns, so that M1, M2, M3, and M4 are, in that order, of sizes 3 × 3 (3 by 3), 3 × 3, 3 × 2, and 2 × 3. The general matrix of size m × n is frequently represented in double-subscript notation, with the first subscript i indicating the row number, and the second subscript j indicating the column number; a23 is the element in the second row, third column. This general matrix

may be abbreviated to A = [aij], in which the ranges i = 1, 2, ..., m and j = 1, 2, ..., n should be explicitly given if they are not implied by the text. If m = n, the matrix is square, and the number of rows (or columns) is the order of the matrix. Two matrices, A = [aij] and B = [bij], are equal if and only if they are of the same size and if, for every i and j, aij = bij. The elements a11, a22, a33, ... constitute the main or principal diagonal of the matrix A = [aij], if it is square. The transpose Aᵀ of a matrix A is the matrix in which the ith row is the ith column of A and in which the jth column is the jth row of A; thus, from the matrix M3, above,

which is the transpose of M3.

Addition and multiplication of matrices can be defined so that certain sets of matrices form algebraic systems. Let the elements of the matrices considered be arbitrary real numbers, although the elements could have been chosen from other fields or rings. A zero matrix is one in which all the elements are zero; an identity matrix, Im of order m, is a square matrix of order m in which all the elements are zero except those on the main diagonal, which are 1. The order of an identity matrix may be omitted if implied by the text, and Im is then shortened to I.

The sum of two matrices is defined only if they are of the same size; if A = [aij] and B = [bij] are of the same size, then C = A + B is defined as the matrix [cij], in which cij = aij + bij; that is, two matrices of the same size are added merely by adding corresponding elements.

The set of all matrices of a fixed size has the property that addition is closed, associative, and commutative; a unique matrix O exists such that for any matrix A, A + O = O + A = A; and, corresponding to any matrix A, there exists a unique matrix B such that A + B = B + A = O.
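
These group properties can be checked with a short Python sketch; the matrices here are stand-ins, since the article's example matrices were not reproduced:

```python
# Element-wise addition of two matrices of the same size.
# The example matrices are illustrative, not the article's M1-M4.

def mat_add(A, B):
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "sizes must match"
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

A = [[1, 2], [3, 4]]
O = [[0, 0], [0, 0]]                        # the zero matrix of the same size
neg_A = [[-x for x in row] for row in A]    # the additive inverse of A

print(mat_add(A, O) == A)        # True: A + O = A
print(mat_add(A, neg_A) == O)    # True: A + (-A) = O
```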

The product AB of two matrices, A and B, is defined only if the number of columns of the left factor A is the same as the number of rows of the right factor B; if A = [aij] is of size m × n and B = [bjk] is of size n × p, the product AB = C = [cik] is of size m × p, and cik is given by

cik = ai1b1k + ai2b2k + ... + ainbnk

That is, the element in the ith row and kth column of the product is the sum of the products of the elements of the ith row of the left factor multiplied by the corresponding elements of the kth column of the right factor.
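
A minimal sketch of this row-by-column rule; the example matrices are illustrative, not the article's M1 through M4:

```python
# Matrix product C = AB: c_ik is the sum of products of the i-th row of A
# with the corresponding elements of the k-th column of B.

def mat_mul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "columns of A must equal rows of B"
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]          # size 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]           # size 3 x 2

print(mat_mul(A, B))     # [[58, 64], [139, 154]], a 2 x 2 matrix
```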

III

LINEAR ALGEBRA

The geometric concept of a vector as a line segment of given length and direction can be advantageously generalized as follows. An n-vector (n-dimensional vector, vector of order n, vector of length n) is an ordered set of n elements of a field. As in matrix theory, the elements are assumed to be real numbers. An n-vector v is represented as:

v = [x1, x2, ..., xn]

In particular, the lines of a matrix are vectors; the horizontal lines are row vectors, the vertical lines are column vectors. The x's are called the components of the vector.

Addition of vectors (of the same length) and scalar multiplication are defined as for matrices and satisfy the same laws. If

w = [y1, y2, ..., yn]

and k is a scalar (real number), then

v + w = [x1 + y1, x2 + y2, ..., xn + yn]

kv = [kx1, kx2, ..., kxn]

If k1, k2, ..., km are scalars and v1, v2, ..., vm are n-vectors, the n-vector

v = k1v1 + k2v2 + ... + kmvm

is called a linear combination of the vectors v1, v2, ..., vm. The m n-vectors are linearly independent if the only linear combination equal to the zero n-vector, 0 = [0,0, ..., 0], is the one in which k1 = k2 = ... = km = 0; otherwise, the vectors are linearly dependent. For example, if v1 = [0, 1, 2, 3], v2 = [1, 2, 3, 4], v3 = [2, 2, 4, 4], v4 = [3, 4, 7, 8], then v1, v2, v3 are linearly independent, because k1v1 + k2v2 + k3v3 = 0 if and only if k1 = k2 = k3 = 0; v2, v3, and v4 are linearly dependent because v2 + v3 - v4 = 0. If A is a matrix of rank r, then at least one set of r row, or column, vectors is a linearly independent set, and every set of more than r row, or column, vectors is a linearly dependent set.
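
The dependence relation quoted above can be verified directly; vectors are plain Python lists here:

```python
# Verifying the text's example: v2 + v3 - v4 = 0, so v2, v3, v4 are
# linearly dependent.

v2 = [1, 2, 3, 4]
v3 = [2, 2, 4, 4]
v4 = [3, 4, 7, 8]

def lin_comb(coeffs, vectors):
    n = len(vectors[0])
    return [sum(k * v[i] for k, v in zip(coeffs, vectors)) for i in range(n)]

print(lin_comb([1, 1, -1], [v2, v3, v4]))   # [0, 0, 0]: a dependence relation
```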

A vector space V is a nonempty set of vectors (see Set Theory), with the properties that (1) if v ∈ V and w ∈ V, then v + w ∈ V, and (2) if v ∈ V and k is any scalar, then kv ∈ V. If S = {vi} is a set of vectors, all of the same length, all linear combinations of the v's form a vector space said to be spanned by the v's. If the set B = {wi} spans the same vector space V and is a linearly independent set, the set B is a basis for V. If a basis for V contains m vectors, every basis for V will contain exactly m vectors, and V is called a vector space of dimension m. Two- and three-dimensional Euclidean spaces are vector spaces when their points are regarded as specified by ordered pairs or triples of real numbers. Matrices may be used to describe linear changes from one vector space into another.

Contributed By: James Singer

Algebra

I

INTRODUCTION

Algebra, branch of mathematics in which letters are used to represent basic arithmetic relations. As in arithmetic, the basic operations of algebra are addition, subtraction, multiplication, division, and the extraction of roots. Arithmetic, however, cannot generalize mathematical relations such as the Pythagorean theorem, which states that the sum of the squares of the two legs of any right triangle equals the square of the hypotenuse. Arithmetic can only produce specific instances of these relations (for example, 3, 4, and 5, where 3² + 4² = 5²). But algebra can make a purely general statement that fulfills the conditions of the theorem: a² + b² = c². Any number multiplied by itself is termed squared and is indicated by a superscript number 2. For example, 3 × 3 is notated 3²; similarly, a × a is equivalent to a² (see Exponent; Power; Root).

Classical algebra, which is concerned with solving equations, uses symbols instead of specific numbers and uses arithmetic operations to establish ways of handling symbols (see Equation; Equations, Theory of). Modern algebra has evolved from classical algebra by increasing its attention to the structures within mathematics. Mathematicians consider modern algebra to be a set of objects with rules for connecting or relating them. As such, in its most general form, algebra may fairly be described as the language of mathematics.

II

HISTORY

The history of algebra began in ancient Egypt and Babylon, where people learned to solve linear (ax = b) and quadratic (ax² + bx = c) equations, as well as indeterminate equations such as x² + y² = z², in which several unknowns are involved. The ancient Babylonians solved arbitrary quadratic equations by essentially the same procedures taught today. They also could solve some indeterminate equations.

The Alexandrian mathematicians Hero of Alexandria and Diophantus continued the traditions of Egypt and Babylon, but Diophantus's book Arithmetica is on a much higher level and gives many surprising solutions to difficult indeterminate equations. This ancient knowledge of solutions of equations in turn found a home early in the Islamic world, where it was known as the “science of restoration and balancing.” (The Arabic word for restoration, al-jabru, is the root of the word algebra.) In the 9th century, the Arab mathematician al-Khwārizmī wrote one of the first Arabic algebras, a systematic exposition of the basic theory of equations, with both examples and proofs. By the end of the 9th century, the Egyptian mathematician Abu Kamil had stated and proved the basic laws and identities of algebra and solved such complicated problems as finding x, y, and z such that x + y + z = 10, x² + y² = z², and xz = y².

Ancient civilizations wrote out algebraic expressions using only occasional abbreviations, but by medieval times Islamic mathematicians were able to talk about arbitrarily high powers of the unknown x and work out the basic algebra of polynomials (without yet using modern symbolism). This included the ability to multiply, divide, and find square roots of polynomials as well as a knowledge of the binomial theorem. The Persian mathematician, astronomer, and poet Omar Khayyam showed how to express roots of cubic equations by line segments obtained by intersecting conic sections, but he could not find a formula for the roots. A Latin translation of al-Khwārizmī's Algebra appeared in the 12th century. In the early 13th century, the great Italian mathematician Leonardo Fibonacci achieved a close approximation to the solution of the cubic equation x³ + 2x² + cx = d. Because Fibonacci had traveled in Islamic lands, he probably used an Arabic method of successive approximations.

Early in the 16th century, the Italian mathematicians Scipione del Ferro, Niccolò Tartaglia, and Gerolamo Cardano solved the general cubic equation in terms of the constants appearing in the equation. Cardano's pupil, Ludovico Ferrari, soon found an exact solution to equations of the fourth degree, and as a result, mathematicians for the next several centuries tried to find a formula for the roots of equations of degree five or higher. Early in the 19th century, however, the Norwegian mathematician Niels Abel and the French mathematician Évariste Galois proved that no such formula exists.

An important development in algebra in the 16th century was the introduction of symbols for the unknown and for algebraic powers and operations. As a result of this development, Book III of La géométrie (1637), written by the French philosopher and mathematician René Descartes, looks much like a modern algebra text. Descartes's most significant contribution to mathematics, however, was his discovery of analytic geometry, which reduces the solution of geometric problems to the solution of algebraic ones. His geometry text also contained the essentials of a course on the theory of equations, including his so-called rule of signs for counting the number of what Descartes called the “true” (positive) and “false” (negative) roots of an equation. Work continued through the 18th century on the theory of equations, but not until 1799 was the proof published, by the German mathematician Carl Friedrich Gauss, showing that every polynomial equation has at least one root in the complex plane (see Number: Complex Numbers).

By the time of Gauss, algebra had entered its modern phase. Attention shifted from solving polynomial equations to studying the structure of abstract mathematical systems whose axioms were based on the behavior of mathematical objects, such as complex numbers, that mathematicians encountered when studying polynomial equations. Two examples of such systems are groups (see Group) and quaternions, which share some of the properties of number systems but also depart from them in important ways. Groups began as systems of permutations and combinations of roots of polynomials, but they became one of the chief unifying concepts of 19th-century mathematics. Important contributions to their study were made by the French mathematicians Galois and Augustin Cauchy, the British mathematician Arthur Cayley, and the Norwegian mathematicians Niels Abel and Sophus Lie. Quaternions were discovered by the British mathematician and astronomer William Rowan Hamilton, who extended the arithmetic of complex numbers to quaternions; while complex numbers are of the form a + bi, quaternions are of the form a + bi + cj + dk.

Immediately after Hamilton's discovery, the German mathematician Hermann Grassmann began investigating vectors. Despite its abstract character, vector algebra was recognized by the American physicist J. W. Gibbs as a system of great utility for physicists, just as Hamilton had recognized the usefulness of quaternions. The widespread influence of this abstract approach led George Boole to write The Laws of Thought (1854), an algebraic treatment of basic logic. Since that time, modern algebra—also called abstract algebra—has continued to develop. Important new results have been discovered, and the subject has found applications in all branches of mathematics and in many of the sciences as well.

IV

OPERATIONS WITH POLYNOMIALS

In operating with polynomials, the assumption is that the usual laws of the arithmetic of numbers hold. In arithmetic, the numbers used are the set of rational numbers (see Number; Number Theory). Arithmetic alone cannot go beyond this, but algebra and geometry can include both irrational numbers, such as the square root of 2, and complex numbers. The set of all rational and irrational numbers taken together constitutes the set of what are called real numbers.

A

Laws of Addition

A1. The sum of any two real numbers a and b is again a real number, denoted a + b. The real numbers are closed under the operations of addition, subtraction, multiplication, division, and the extraction of roots; this means that applying any of these operations to real numbers yields a quantity that also is a real number.

A2. No matter how terms are grouped in carrying out additions, the sum will always be the same: (a + b) + c = a + (b + c). This is called the associative law of addition.

A3. Given any real number a, there is a real number, zero (0), called the additive identity, such that a + 0 = 0 + a = a.

A4. Given any real number a, there is a number (-a), called the additive inverse of a, such that a + (-a) = 0.

A5. No matter in what order addition is carried out, the sum will always be the same: a + b = b + a. This is called the commutative law of addition.

Any set of numbers obeying laws A1 through A4 is said to form a group. If the set also obeys A5, it is said to be an Abelian, or commutative, group.

B

Laws of Multiplication

Laws similar to those for addition also apply to multiplication. Special attention should be given to the multiplicative identity and inverse, M3 and M4.

M1. The product of any two real numbers a and b is again a real number, denoted a·b or ab.

M2. No matter how terms are grouped in carrying out multiplications, the product will always be the same: (ab)c = a(bc). This is called the associative law of multiplication.

M3. Given any real number a, there is a number, one (1), called the multiplicative identity, such that a(1) = 1(a) = a.

M4. Given any nonzero real number a, there is a number (a⁻¹), or (1/a), called the multiplicative inverse, such that a(a⁻¹) = (a⁻¹)a = 1.

M5. No matter in what order multiplication is carried out, the product will always be the same: ab = ba. This is called the commutative law of multiplication.

Any set of elements obeying these five laws is said to be an Abelian, or commutative, group under multiplication. The set of all real numbers, excluding zero (because division by zero is inadmissible), forms such a commutative group under multiplication.

C

Distributive Laws

Another important property of the set of real numbers links addition and multiplication in two distributive laws as follows:

D1. a(b + c) = ab + ac

D2. (b + c)a = ba + ca

Any set of elements with an equality relation and for which two operations (such as addition and multiplication) are defined, and which obeys all the laws for addition A1 through A5, the laws for multiplication M1 through M5, and the distributive laws D1 and D2, constitutes a field.

Number Systems

I

INTRODUCTION

Number Systems, in mathematics, various notational systems that have been or are being used to represent the abstract quantities called numbers. A number system is defined by the base it uses, the base being the number of different symbols required by the system to represent any of the infinite series of numbers. Thus, the decimal system in universal use today (except for computer application) requires ten different symbols, or digits, to represent numbers and is therefore a base-10 system.

Throughout history, many different number systems have been used; in fact, any whole number greater than 1 can be used as a base. Some cultures have used systems based on the numbers 3, 4, or 5. The Babylonians used the sexagesimal system, based on the number 60, and the Romans used (for some purposes) the duodecimal system, based on the number 12. The Mayas used the vigesimal system, based on the number 20. The binary system, based on the number 2, was used by some tribes and, together with the system based on 8, is used today in computer systems. For historical background, see Numerals.

II

PLACE VALUES

Except for computer work, the universally adopted system of mathematical notation today is the decimal system, which, as stated, is a base-10 system. As in other number systems, the position of a symbol in a base-10 number denotes the value of that symbol in terms of exponential values of the base. That is, in the decimal system, the quantity represented by any of the ten symbols used—0, 1, 2, 3, 4, 5, 6, 7, 8, and 9—depends on its position in the number. Thus, the number 3,098,323 is an abbreviation for (3 × 10⁶) + (0 × 10⁵) + (9 × 10⁴) + (8 × 10³) + (3 × 10²) + (2 × 10¹) + (3 × 10⁰, or 3 × 1). The first “3” (reading from right to left) represents 3 units; the second “3,” 300 units; and the third “3,” 3 million units. In this system the zero plays a double role; it represents naught, and it also serves to indicate the multiples of the base 10: 100, 1000, 10,000, and so on. It is also used to indicate fractions of integers: 1/10 is written as 0.1, 1/100 as 0.01, 1/1000 as 0.001, and so on.

Two digits—0, 1—suffice to represent a number in the binary system; 6 digits—0, 1, 2, 3, 4, 5—are needed to represent a number in the base-6 system; and 12 digits—0, 1, 2, 3, 4, 5, 6, 7, 8, 9, t (ten), e (eleven)—are needed to represent a number in the duodecimal system. The number 30155 in the base-6 system is the number (3 × 6⁴) + (0 × 6³) + (1 × 6²) + (5 × 6¹) + (5 × 6⁰) = 3959 in the decimal system; the number 2et in the duodecimal system is the number (2 × 12²) + (11 × 12¹) + (10 × 12⁰) = 430 in the decimal system.

To write a given base-10 number n as a base-b number, divide (in the decimal system) n by b, divide the quotient by b, the new quotient by b, and so on until the quotient 0 is obtained. The successive remainders, read in reverse order, are the digits in the base-b expression for n. For example, to express 3959 (base 10) in base 6, one writes

3959 ÷ 6 = 659, remainder 5
659 ÷ 6 = 109, remainder 5
109 ÷ 6 = 18, remainder 1
18 ÷ 6 = 3, remainder 0
3 ÷ 6 = 0, remainder 3

from which, reading the remainders from bottom to top, 3959₁₀ = 30155₆. (The base is frequently written in this way as a subscript of the number.) The larger the base, the more symbols are required, but the fewer digits are needed to express a given number. The number 12 is convenient as a base because it is exactly divisible by 2, 3, 4, and 6; for this reason, some mathematicians have advocated adoption of base 12 in place of base 10.
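
A minimal Python sketch of this repeated-division procedure; the helper name to_base and its digit string are ours, chosen to match the text's t and e digits for base 12:

```python
# Convert a nonnegative base-10 integer to its base-b representation
# by repeated division, collecting remainders (least significant first).
DIGITS = "0123456789te"   # 't' for ten and 'e' for eleven, as in the text

def to_base(n, b):
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, b)
        digits.append(DIGITS[r])
    return "".join(reversed(digits))

print(to_base(3959, 6))    # 30155, matching the division above
print(to_base(430, 12))    # 2et, the duodecimal example
```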

III

BINARY SYSTEM

The binary system plays an important role in computer technology. The first 20 numbers in the binary notation are 1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, 10000, 10001, 10010, 10011, 10100. The zero here also has the role of place marker, as in the decimal system. Any decimal number can be expressed in the binary system as the sum of different powers of two. For example, starting from the right, 10101101 represents (1 × 2⁰) + (0 × 2¹) + (1 × 2²) + (1 × 2³) + (0 × 2⁴) + (1 × 2⁵) + (0 × 2⁶) + (1 × 2⁷) = 173. This example can be used for the conversion of binary numbers into decimal numbers. For the conversion of decimal numbers to binary numbers, the same principle can be used, but the other way around. To convert, the highest power of two that does not exceed the given number is sought first, and a 1 is placed in the corresponding position in the binary number. For example, the highest power of two in the decimal number 519 is 2⁹ = 512. Thus, a 1 can be inserted as the 10th digit, counted from the right: 1000000000. In the remainder, 519 - 512 = 7, the highest power of 2 is 2² = 4, so the third zero from the right can be replaced by a 1: 1000000100. The next remainder, 3, consists of the sum of two powers of 2: 2¹ + 2⁰, so the first and second zeros from the right are replaced by 1: 519₁₀ = 1000000111₂.
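
These conversions can be checked with Python's built-in base handling:

```python
# Check the text's binary examples with built-in conversions.
print(int("10101101", 2))    # 173
print(format(519, "b"))      # 1000000111
```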

Arithmetic operations in the binary system are extremely simple. The basic rules are: 1 + 1 = 10, and 1 × 1 = 1. Zero plays its usual role: 1 × 0 = 0, and 1 + 0 = 1. Addition, subtraction, and multiplication are done in a fashion similar to that of the decimal system:
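
The worked columns that followed are not reproduced here; as a small substitute, the same operations via built-in conversions (the operands 101 and 11 are our example):

```python
# Binary addition and multiplication, shown via strings and built-ins.
a, b = "101", "11"                            # 5 and 3 in binary
print(format(int(a, 2) + int(b, 2), "b"))     # 1000  (5 + 3 = 8)
print(format(int(a, 2) * int(b, 2), "b"))     # 1111  (5 x 3 = 15)
```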

Because only two digits (or bits) are involved, the binary system is used in computers, since any binary number can be represented by, for example, the positions of a series of on-off switches. The “on” position corresponds to a 1, and the “off” position to a 0. Instead of switches, magnetized dots on a magnetic tape or disk also can be used to represent binary numbers: a magnetized dot stands for the digit 1, and the absence of a magnetized dot is the digit 0. Flip-flops—electronic devices that can only carry two distinct voltages at their outputs and that can be switched from one state to the other state by an impulse—can also be used to represent binary numbers; the two voltages correspond to the two digits. Logic circuits in computers (see Computer; Electronics) carry out the different arithmetic operations of binary numbers; the conversion of decimal numbers to binary numbers for processing, and of binary numbers to decimal numbers for the readout, is done electronically.

Numerals

I

INTRODUCTION

Numerals, signs or symbols for graphic representation of numbers. The earliest forms of number notation were simply groups of straight lines, either vertical or horizontal, each line corresponding to the number 1. Such a system is inconvenient when dealing with large numbers, and as early as 3400 bc in Egypt and 3000 bc in Mesopotamia a special symbol was adopted for the number 10. The addition of this second number symbol made it possible to express the number 11 with 2 instead of 11 individual symbols and the number 99 with 18 instead of 99 individual symbols. Later numeral systems introduced extra symbols for a number between 1 and 10, usually either 4 or 5, and additional symbols for numbers greater than 10. In Babylonian cuneiform notation the numeral used for 1 was also used for 60 and for powers of 60; the value of the numeral was indicated by its context. This was a logical arrangement from the mathematical point of view because 60⁰ = 1, 60¹ = 60, and 60² = 3600. The Egyptian hieroglyphic system used special symbols for 10, 100, 1000, and 10,000.

The ancient Greeks had two parallel systems of numerals. The earlier of these was based on the initial letters of the names of numbers: The number 5 was indicated by the letter pi; 10 by the letter delta; 100 by the antique form of the letter H; 1000 by the letter chi; and 10,000 by the letter mu. The later system, which was first introduced about the 3rd century bc, employed all the letters of the Greek alphabet plus three letters borrowed from the Phoenician alphabet as number symbols. The first nine letters of the alphabet were used for the numbers 1 to 9, the second nine letters for the tens from 10 to 90, and the last nine letters for the hundreds from 100 to 900. Thousands were indicated by placing a bar to the left of the appropriate numeral, and tens of thousands by placing the appropriate letter over the letter M. The late Greek system had the advantage that large numbers could be expressed with a minimum of symbols, but it had the disadvantage of requiring the user to memorize a total of 27 symbols.

II

ROMAN NUMERALS

The system of number symbols created by the Romans had the merit of expressing all numbers from 1 to 1,000,000 with a total of seven symbols: I for 1, V for 5, X for 10, L for 50, C for 100, D for 500, and M for 1000. Roman numerals are read from left to right. The symbols representing the largest quantities are placed at the left; immediately to the right of those are the symbols representing the next largest quantities, and so on. The symbols are usually added together. For example, LX = 60, and MMCIII = 2103. When a numeral is smaller than the numeral to its right, however, it is subtracted rather than added. For instance, XIV = 14 and IX = 9. A small bar placed over a numeral multiplies it by 1000; thus, an M with a bar over it represents 1,000,000. Theoretically, by using enough bars, it is possible to express the numbers from 1 to infinity. In practice, however, one bar is usually used; two are rarely used, and more than two are almost never used. Roman numerals are still used today, more than 2000 years after their introduction. The Roman system's one drawback, however, is that it is not suitable for rapid written calculations.
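
A short sketch of this additive-subtractive reading rule (the function name is ours):

```python
# Read a Roman numeral left to right: add each symbol's value, but
# subtract it instead when it is smaller than the symbol to its right.
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(s):
    total = 0
    for i, ch in enumerate(s):
        v = VALUES[ch]
        if i + 1 < len(s) and v < VALUES[s[i + 1]]:
            total -= v          # e.g. the I in IX or XIV
        else:
            total += v
    return total

print(roman_to_int("LX"))      # 60
print(roman_to_int("MMCIII"))  # 2103
print(roman_to_int("XIV"))     # 14
```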

III

ARABIC NUMERALS

The common system of number notation in use in most parts of the world today is the Arabic system. This system was first developed by the Hindus and was in use in India in the 3rd century bc. At that time the numerals 1, 4, and 6 were written in substantially the same form used today. The Hindu numeral system was probably introduced into the Arab world about the 7th or 8th century ad. The first recorded use of the system in Europe was in ad 976.

The important innovation in the Arabic system was the use of positional notation, in which individual number symbols assume different values according to their position in the written numeral. Positional notation is made possible by the use of a symbol for zero. The symbol 0 makes it possible to differentiate between 11, 101, and 1001 without the use of additional symbols, and all numbers can be expressed in terms of ten symbols, the numerals from 1 to 9 plus 0. Positional notation also greatly simplifies all forms of written numerical calculation.

Axiom

Axiom, in logic and mathematics, a basic principle that is assumed to be true without proof. The use of axioms in mathematics stems from the ancient Greeks, most probably during the 5th century bc, and represents the beginnings of pure mathematics as it is known today. Examples of axioms are the following: “No sentence can be true and false at the same time” (the principle of contradiction); “If equals are added to equals, the sums are equal”; “The whole is greater than any of its parts.” Logic and pure mathematics begin with such unproved assumptions from which other propositions (theorems) are derived. This procedure is necessary to avoid circularity, or an infinite regression in reasoning. The axioms of any system must be consistent with one another, that is, they should not lead to contradictions. They should be independent in the sense that they cannot be derived from one another. They should also be few in number. Axioms have sometimes been interpreted as self-evident truths. The present tendency is to avoid this claim and simply to assert that an axiom is assumed to be true without proof in the system of which it is a part.

The terms axiom and postulate are often used synonymously. Sometimes the word axiom is used to refer to basic principles that are assumed by every deductive system, and the term postulate is used to refer to first principles peculiar to a particular system, such as Euclidean geometry. Infrequently, the word axiom is used to refer to first principles in logic, and the term postulate is used to refer to first principles in mathematics.

Vector (mathematics)

Vector (mathematics), quantity having both magnitude and direction. For example, an ordinary quantity, or scalar, can be exemplified by the distance 6 km; a vector quantity can be exemplified by the term 6 km north. Vectors are usually represented by directed line segments, such as the one in the diagram below; the length of the line segment is a measure of the vector quantity, and its direction is the same as that of the vector.

The simplest use of vectors and calculation by means of vectors is illustrated in the diagram, drawn to represent a boat moving across a stream. Vector a indicates the motion of the boat in the course of a given interval of time if it were moving through still water; vector b shows the drift or flow of the current during the same period of time. The actual path of travel of the boat under the influence of its own propulsion and of the current is represented by vector c. By the use of vectors, any type of problem involving the motion of an object being acted on by several forces can be solved graphically.

This method of problem solution, known as vector addition, is performed as follows. A vector representing one force is drawn from the origin O in the proper direction. The length of the vector is made to agree with any convenient arbitrary scale, such as a given number of centimeters to the kilometer. In the diagram the rate of rowing was 2.2 km/h, the time rowed was 1 hr, and the scale is 1 cm to 1 km. Therefore, vector a is drawn 2.2 cm long, to equal 2.2 km. The current speed of 6 km/h is then represented by a vector b that is 6 cm long, indicating a distance of 6 km that the current moved during 1 hr. This second vector is drawn with its origin at the end of vector a in a direction parallel to the flow of the current. The point B at the end of the second vector represents the actual position of the boat at the end of 1 hr of travel, and the actual distance traveled is represented by the length (in this case, about 6.4 km) of the vector c.
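
Because the two component vectors in this example are perpendicular, the resultant length can be checked numerically; a sketch assuming the boat is rowed straight across the current:

```python
# Resultant of two perpendicular displacement vectors:
# 2.2 km across the stream plus 6 km downstream in the same hour.
import math

across, downstream = 2.2, 6.0
resultant = math.hypot(across, downstream)   # sqrt(2.2**2 + 6.0**2)
print(round(resultant, 1))                   # 6.4 (km), matching the diagram
```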

Problems in vector addition and subtraction such as the one above can be easily solved by graphic methods and can also be calculated by means of trigonometry. This type of calculation is useful in solving problems in navigation and motion as well as in mechanics and other branches of physics. In present-day advanced mathematics, a vector is considered an ordered set of quantities with appropriate rules of manipulation. Vector analysis, that is, the algebra, geometry, and calculus of vector quantities, enters into the applied mathematics of every field of science and engineering.

Sequence and Series

Sequence and Series, in mathematics, an ordered succession of numbers or other quantities, and the indicated sum of such a succession, respectively.

A sequence is represented as a1, a2, ..., an, .... The a's are numbers or quantities, distinct or not; a1 is the first term, a2 the second term, and so on. If the expression has a last term, the sequence is finite; otherwise, it is infinite. A sequence is established or defined only if a rule is given that determines the nth term for every positive integer n; this rule may be given as a formula for the nth term. For example, all the positive integers, in natural order, form an infinite sequence; this sequence is defined by the formula an = n. The formula an = n² determines the sequence 1, 4, 9, 16, .... The rule of starting with 0, 1, then letting each term be the sum of the two preceding terms determines the sequence 0, 1, 1, 2, 3, 5, 8, 13, ...; this is known as the Fibonacci sequence.

Important types of sequences include arithmetic sequences (also known as arithmetic progressions), in which the differences between successive terms are constant, and geometric sequences (also known as geometric progressions), in which the ratios of successive terms are constant. Examples arise when a sum of money is invested. If the money is invested at a simple interest of 8 percent, then after n years an initial principal of P dollars grows to an = P + n × (0.08)P dollars. Since (0.08)P dollars is added each year, the amounts an form an arithmetic progression. If the interest is instead compounded, the amounts present after a sequence of years form a geometric progression, gn = P × (1.08)ⁿ. In both of these cases, it is clear that an and gn will eventually become larger than any preassigned whole number N, however large N may be.
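
A short comparison of the two progressions, using the text's 8 percent rate with an assumed principal of 100 dollars:

```python
# Simple vs. compound interest on principal P at 8 percent:
# a_n = P + n*0.08*P grows arithmetically, g_n = P*(1.08)**n geometrically.
P = 100.0
for n in (1, 10, 50):
    a_n = P + n * 0.08 * P
    g_n = P * 1.08 ** n
    print(n, round(a_n, 2), round(g_n, 2))
# Both eventually exceed any preassigned bound; g_n does so far faster.
```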

Terms in a sequence, however, do not always increase without limit. For example, as n increases, the sequence an = 1/n approaches 0 as a limiting value, and bn = A + B/n approaches A. In any such case some finite number L exists such that, whatever tolerance e is specified, the values of the sequence all eventually lie within a distance e of L. For example, in the case of the sequence 2 + (-1)ⁿ/(2n), L = 2. Even if e is as small as 1/10,000, it can be seen that if n is greater than 5000, all terms of the sequence are within e of 2. The number L is called the limit of the sequence, since even though individual terms of the sequence may be bigger or smaller than L, the terms eventually cluster closer and closer to L. When the sequence has a limit L, it is said to converge to L. For the sequence an, for example, this is written as lim an = L, which is read as “the limit of an as n goes to infinity is L.”
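
The claim about n greater than 5000 can be tested directly:

```python
# The sequence 2 + (-1)**n / (2*n) converges to L = 2: past n = 5000
# every term lies within e = 1/10000 of the limit.
e, L = 1 / 10_000, 2.0
terms = [2 + (-1) ** n / (2 * n) for n in range(5001, 5011)]
print(all(abs(t - L) < e for t in terms))   # True
```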

The term series refers to the indicated sum, a1 + a2 + ... + an, or a1 + a2 + ... + an + ..., of the terms of a sequence. A series is either finite or infinite, depending on whether the corresponding sequence of terms is finite or infinite.

The sequence s1 = a1, s2 = a1 + a2, s3 = a1 + a2 + a3, ..., sn = a1 + a2 + ... + an, ..., is called the sequence of partial sums of the series a1 + a2 + ... + an + .... The series converges or diverges as the sequence of partial sums converges or diverges. A constant-term series is one in which the terms are numbers; a series of functions is one in which the terms are functions of one or more variables. In particular, a power series is the series a0 + a1(x - c) + a2(x - c)² + ... + an(x - c)ⁿ + ..., in which c and the a's are constants. For power series, the problem is to describe the values of x for which they converge. If a series converges for some x, then the set of all x for which it converges consists of a single point or some connected interval. The basic theory of convergence was worked out by the French mathematician Augustin Louis Cauchy in the 1820s.
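
A small numeric illustration of partial sums converging, using the geometric series with ratio 1/2 (our example, not the text's):

```python
# Partial sums of the series 1/2 + 1/4 + 1/8 + ... converge to 1.
from itertools import accumulate

terms = [1 / 2 ** n for n in range(1, 21)]
partial_sums = list(accumulate(terms))
print(partial_sums[-1])   # 0.9999990463..., approaching the limit 1
```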

The theory and application of infinite series are important in virtually every branch of pure and applied mathematics.

Logarithm

Logarithm, in mathematics, the exponent or power to which a stated number, called the base, is raised to yield a specific number. For example, in the expression 10² = 100, the logarithm of 100 to the base 10 is 2. This is written log₁₀ 100 = 2. Logarithms were originally invented to help simplify the arithmetical processes of multiplication, division, expansion to a power, and extraction of a root, but they are now used for a variety of purposes in pure and applied mathematics.

The first tables of logarithms were published independently by the Scottish mathematician John Napier in 1614 and the Swiss mathematician Justus Byrgius in 1620. The first table of common logarithms was compiled by the English mathematician Henry Briggs. Common logarithms use the number 10 as the base number. A system of logarithms often employed uses the transcendental number e as a base; they are called natural logarithms.

The method of logarithms can be illustrated by considering a sequence of powers of the number 2: 2¹, 2², 2³, 2⁴, 2⁵, and 2⁶, corresponding to the sequence of numbers 2, 4, 8, 16, 32, and 64. The exponents 1, 2, 3, 4, 5, and 6 are the logarithms of these numbers to the base 2. To multiply any number in this sequence by any other number in the sequence, it is only necessary to add the logarithms of the numbers, then find the antilogarithm of the sum of the logarithms, which is equal to the base number raised to the power of the sum. Thus, to multiply 16 by 4, first note that the logarithm of 16 is 4, and the logarithm of 4 is 2. The sum of the logarithms 4 and 2 is equal to 6, and the antilogarithm of 6 is 64, which is the product desired. In division the logarithms are subtracted. To divide 32 by 8, subtract 3 from 5, giving 2, which is the logarithm of the quotient, 4.

To raise a number to any power, multiply the logarithm by the power desired, and take the antilogarithm of the product. Thus, to find 4³: log₂ 4 = 2; 3 × 2 = 6; antilog 6 = 64, which is the third power of 4. Roots are extracted by dividing the logarithm by the desired root. To find the fifth root of 32: log₂ 32 = 5; 5 ÷ 5 = 1; antilog 1 = 2, which is the fifth root of 32.
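
The same log-based arithmetic, sketched with the standard math module:

```python
# Multiplication via logarithms: add the base-2 logs, then take the antilog.
import math

log16 = math.log2(16)          # 4.0
log4 = math.log2(4)            # 2.0
print(2 ** (log16 + log4))     # 64.0, i.e. 16 * 4
print(2 ** (math.log2(32) - math.log2(8)))   # 4.0, i.e. 32 / 8
```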

The problem in constructing a table of logarithms is to make the intervals between successive entries sufficiently small. In the above example, where the entries are the powers 2, 4, 8, and so on, the entries are too far apart to be useful in multiplying any larger numbers. By advanced mathematical processes, the logarithm of any number to any base can be calculated, and exhaustive tables of logarithms have been prepared. Each logarithm consists of a whole number and a decimal fraction, called respectively the characteristic and the mantissa. In the common system of logarithms, which has the base 10, the logarithm of the number 7 has the characteristic 0 and the mantissa .84510 (correct to five decimal places) and is written 0.84510. The logarithm of the number 70 is 1.84510; and the logarithm of the number 700 is 2.84510. The logarithm of the number .7 is -0.15490, which is sometimes written 9.84510 − 10 for convenience in calculation. Logarithm tables have been replaced by electronic calculators and computers with logarithmic functions.
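
The characteristic-and-mantissa pattern is easy to confirm:

```python
# Numbers with the same significant digits share a mantissa; only the
# characteristic (the integer part) changes.
import math

for x in (7, 70, 700):
    print(x, round(math.log10(x), 5))   # 0.8451, 1.8451, 2.8451
```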

Trigonometry

I

INTRODUCTION

Trigonometry, branch of mathematics that deals with the relationships between the sides and angles of triangles and with the properties and applications of the trigonometric functions of angles. The two branches of trigonometry are plane trigonometry, which deals with figures lying wholly in a single plane, and spherical trigonometry, which deals with triangles that are sections of the surface of a sphere.

The earliest applications of trigonometry were in the fields of navigation, surveying, and astronomy, in which the main problem generally was to determine an inaccessible distance, such as the distance between the earth and the moon, or a distance that could not be measured directly, such as the distance across a large lake. Other applications of trigonometry are found in physics, chemistry, and almost all branches of engineering, particularly in the study of periodic phenomena, such as vibration studies of sound, a bridge, or a building, or the flow of alternating current.

II

PLANE TRIGONOMETRY

The concept of the trigonometric angle is basic to the study of trigonometry. A trigonometric angle is generated by a rotating ray. The rays OA and OB (Fig. 1a, 1b, and 1c) are considered originally coincident at OA, which is called the initial side. The ray OB then rotates to a final position called the terminal side. An angle and its measure are considered positive if they are generated by counterclockwise rotation in the plane, and negative if they are generated by clockwise rotation. Two trigonometric angles are equal if they are congruent and if their rotations are in the same direction and of the same magnitude.

An angular unit of measure usually is defined as an angle with a vertex at the center of a circle and with sides that subtend, or cut off, a certain part of the circumference (Fig. 2).

If the subtended arc s (AB) is equal to one-fourth of the total circumference C, that is, s = ¼C, so that OA is perpendicular to OB, the angular unit is a right angle. If s = ½C, so that the points A, O, and B are on a straight line, the angular unit is a straight angle. If s = (1/360)C, the angular unit is one degree. If s = C/2π, so that the subtended arc is equal to the radius of the circle, the angular unit is a radian. By equating the various values of C, it follows that

1 straight angle = 2 right angles = 180 degrees = π radians

Each degree is subdivided into 60 equal parts called minutes, and each minute is subdivided into 60 equal parts called seconds. For finer measurements, decimal parts of a second may be used. Radian measurements smaller than a radian are expressed in decimals. The symbol for degree is °; for minutes, ′; and for seconds, ″. For radian measures either the abbreviation rad or no symbol at all may be used. Thus

The angular unit radian is understood in the last entry. (The notation 42".14 may be used instead of 42.14" to indicate decimal parts of seconds.)

By convention, a trigonometric angle is labeled with the Greek letter theta (θ). If the angle θ is given in radians, then the formula s = rθ may be used to find the length of the arc s; if θ is given in degrees, then s = (π/180)rθ.
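
Both arc-length formulas, sketched with an assumed radius of 10:

```python
# Arc length from the angle: s = r*theta with theta in radians,
# s = (pi/180)*r*theta with theta in degrees.
import math

r = 10.0
print(r * (math.pi / 2))          # quarter circle, theta = pi/2 rad
print((math.pi / 180) * r * 90)   # the same arc, theta = 90 degrees
```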

A

Trigonometric Functions

Trigonometric functions are unitless values that vary with the size of an angle. An angle placed in a rectangular coordinate plane is said to be in standard position if its vertex coincides with the origin and its initial side coincides with the positive x-axis.

In Fig. 3, let P, with coordinates x and y, be any point other than the vertex on the terminal side of the angle θ, and let r be the distance between P and the origin. Each of the coordinates x and y may be positive or negative, depending on the quadrant in which the point P lies; x may be zero, if P is on the y-axis, or y may be zero, if P is on the x-axis. The distance r is necessarily positive and is equal to

r = √(x² + y²)

in accordance with the Pythagorean theorem (see Geometry).

The six commonly used trigonometric functions are defined as follows:

sin θ = y/r  cos θ = x/r  tan θ = y/x
csc θ = r/y  sec θ = r/x  cot θ = x/y

Since x and y do not change if 2π radians are added to the angle—that is, 360° are added—it is clear that sin (θ + 2π) = sin θ. Similar statements hold for the five other functions. By definition, three of these functions are reciprocals of the three others, that is,

csc θ = 1/sin θ  sec θ = 1/cos θ  cot θ = 1/tan θ

If point P, in the definition of the general trigonometric function, is on the y-axis, x is 0; therefore, because division by zero is inadmissible in mathematics, the tangent and secant of such angles as 90°, 270°, and -270° do not exist. If P is on the x-axis, y is 0; in this case, the cotangent and cosecant of such angles as 0°, 180°, and -180° do not exist. All angles have sines and cosines, because r is never equal to 0.
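
A sketch of these coordinate definitions, returning None where the text says a function does not exist:

```python
# The six trigonometric functions from a point (x, y) on the terminal side.
import math

def trig_from_point(x, y):
    r = math.hypot(x, y)                 # r = sqrt(x**2 + y**2), always positive
    return {
        "sin": y / r, "cos": x / r,
        "tan": y / x if x else None,     # undefined when P is on the y-axis
        "cot": x / y if y else None,     # undefined when P is on the x-axis
        "sec": r / x if x else None,
        "csc": r / y if y else None,
    }

print(trig_from_point(3, 4))   # sin = 0.8, cos = 0.6, tan = 1.333...
```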

Since r is never less than the absolute value of x or of y, the values of sin θ and cos θ range from -1 to +1; tan θ and cot θ are unlimited, assuming any real value; sec θ and csc θ may be either equal to or greater than 1, or equal to or less than -1.

It is readily shown that the value of a trigonometric function of an angle does not depend on the particular choice of point P, provided that it is on the terminal side of the angle, because the ratios depend only on the size of the angle, not on where the point P is located on the side of the angle.

If θ is one of the acute angles of a right triangle, the definitions of the trigonometric functions given above can be applied to θ as follows (Fig. 4). Imagine the vertex A is placed at the intersection of the x-axis and y-axis in Fig. 3, that AC extends along the positive x-axis, and that B is the point P, so that AB = AP = r. Then sin θ = y/r = a/c, and so on, as follows:

sin θ = a/c  cos θ = b/c  tan θ = a/b
csc θ = c/a  sec θ = c/b  cot θ = b/a

The numerical values of the trigonometric functions of a few angles can be readily obtained; for example, either acute angle of an isosceles right triangle is 45°, as shown in Fig. 4. Therefore, it follows that

sin 45° = cos 45° = √2/2 ≈ 0.7071  tan 45° = cot 45° = 1

The numerical values of the trigonometric functions of any angle can be determined approximately by drawing the angle in standard position with a ruler, compass, and protractor; by measuring x, y, and r; and then by calculating the appropriate ratios. Actually, it is necessary to calculate the values of sin θ and cos θ only for a few selected angles, because the values for other angles and for the other functions may be found by using one or more of the trigonometric identities that are listed below.
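
As a quick check of the 45° values against a library implementation:

```python
# Checking the 45-degree values against the math library.
import math

theta = math.radians(45)
print(math.sin(theta), math.cos(theta))   # both 0.7071..., i.e. sqrt(2)/2
print(math.tan(theta))                    # 0.9999999999999999, i.e. 1
```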

B

Trigonometric Identities

The following formulas, called identities, which show the relationships between the trigonometric functions, hold for all values of the angle θ, or of two angles, θ and φ, for which the functions involved are defined:

By repeated use of one or more of the formulas in group V, which are known as reduction formulas, sin θ and cos θ can be expressed for any value of θ, in terms of the sine and cosine of angles between 0° and 90°. By use of the formulas in groups I and II, the values of tan θ, cot θ, sec θ, and csc θ may be found from the values of sin θ and cos θ. It is therefore sufficient to tabulate the values of sin θ and cos θ for values of θ between 0° and 90°; in practice, to avoid tedious calculations, the values of the other four functions also have been made available in tabulations for the same range of θ.

The variation of the values of the trigonometric functions for different angles may be represented by graphs, as in Fig. 5. It is readily ascertained from these curves that each of the trigonometric functions is periodic, that is, the value of each is repeated at regular intervals called periods. The period of all the functions, except the tangent and the cotangent, is 360°, or 2π radians. Tangent and cotangent have a period of 180°, or π radians.

Many other trigonometric identities can be derived from the fundamental identities. All are needed for the applications and further study of trigonometry.

C

Inverse Functions

The statement y is the sine of θ, or y = sin θ, is equivalent to the statement θ is an angle, the sine of which is equal to y, written symbolically as θ = arc sin y = sin⁻¹ y. The arc form is preferred. The inverse functions, arc cos y, arc tan y, arc cot y, arc sec y, arc csc y, are similarly defined. In the statement y = sin θ, or θ = arc sin y, a given value of y will determine infinitely many values of θ. Thus, sin 30° = sin 150° = sin (30° + 360°) = sin (150° + 360°) . . . = 1/2; therefore, if θ = arc sin 1/2, then θ = 30° + n360° or θ = 150° + n360°, in which n is any integer, positive, negative, or zero. The value 30° is designated the basic or principal value of arc sin 1/2. When used in this sense, the term arc generally is written with a capital A. Although custom is not uniform, the principal value of Arc sin y, Arc cos y, Arc tan y, Arc cot y, Arc sec y, or Arc csc y commonly is defined to be the angle between 0° and 90° if y is positive; and, if y is negative, by the inequalities
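
A sketch of principal versus general values (math.asin returns only the principal value):

```python
# arc sin 1/2: the principal value is 30 degrees; 150 degrees (and any
# multiple of 360 degrees added to either) has the same sine.
import math

print(math.degrees(math.asin(0.5)))   # 30.000..., the principal value
print(math.sin(math.radians(150)))    # 0.4999..., i.e. 1/2 as well
```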

D

The General Triangle

Practical applications of trigonometry often involve determining distances that cannot be measured directly. Such a problem may be solved by making the required distance one side of a triangle, measuring other sides or angles of the triangle, and then applying the formulas below.

If A, B, C are the three angles of a triangle, and a, b, c the respective opposite sides, it may be proved that

a/sin A = b/sin B = c/sin C (the law of sines)

c² = a² + b² - 2ab cos C (the law of cosines)

(a - b)/(a + b) = tan ½(A - B)/tan ½(A + B) (the law of tangents)

The cosine and tangent laws can each be given two other forms by rotating the letters a, b, c and A, B, C.

These three relationships can be used to solve any triangle, that is, the unknown sides or angles can be found when one side and two angles, two sides and the included angle, two sides and an angle opposite one of them (usually there are two triangles in this case), or when three sides are given.
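
A sketch of the two-sides-and-included-angle case, using the law of cosines and then the law of sines; the side lengths and angle are made-up values:

```python
# Solving a triangle from two sides and the included angle (SAS).
import math

a, b, C = 5.0, 7.0, math.radians(60)
c = math.sqrt(a**2 + b**2 - 2*a*b*math.cos(C))   # law of cosines
A = math.asin(a * math.sin(C) / c)               # law of sines (A is acute here)
B = math.pi - A - C                              # angles sum to 180 degrees
print(round(c, 3), round(math.degrees(A), 1), round(math.degrees(B), 1))
# 6.245 43.9 76.1
```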

III

SPHERICAL TRIGONOMETRY

Spherical trigonometry, which is used principally in navigation and astronomy, is concerned with spherical triangles, that is, figures that are arcs of great circles (see Navigation) on the surface of a sphere. The spherical triangle, like the plane triangle, has six elements, the three sides a, b, c and the angles A, B, C. But the three sides of the spherical triangle are angular as well as linear magnitudes, being arcs of great circles on the surface of a sphere and measured by the angle subtended at the center. The triangle is completely determined when any three of its six elements are given, since relations exist between the various parts by means of which unknown elements may be found.

In the right-angled or quadrantal triangle, however, as in the case of the right-angled plane triangle, only two elements are needed to determine all of the remaining parts. Thus, given c, A in the right-angled triangle ABC, with C = 90°, the remaining parts are given by the formulas sin a = sin c sin A; tan b = tan c cos A; cot B = cos c tan A. When any other two parts are given, the corresponding formulas may be obtained by Napier's rules concerning the relations of the five circular parts, a, b, complement of c, complement of A, complement of B. With respect to any particular part, the remaining parts are classified as adjacent and opposite; the sine of any part is equal to the product of the tangents of the adjacent parts and also to the product of the cosines of the opposite parts.
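
A sketch of these three formulas for assumed values of c and A:

```python
# Right spherical triangle (C = 90 degrees): given hypotenuse c and angle A,
# the text's formulas give the remaining parts a, b, B.
import math

c, A = math.radians(80), math.radians(50)
a = math.asin(math.sin(c) * math.sin(A))          # sin a = sin c sin A
b = math.atan(math.tan(c) * math.cos(A))          # tan b = tan c cos A
B = math.atan(1 / (math.cos(c) * math.tan(A)))    # cot B = cos c tan A
print([round(math.degrees(x), 2) for x in (a, b, B)])   # about [48.98, 74.66, 78.31]
```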

In the case of oblique triangles no simple rules have been found, but each case depends on the appropriate formula. Thus in the oblique triangle ABC, given a, b, and A, the formulas for the remaining parts are

In spherical trigonometry, as well as in plane, three elements taken at random may not satisfy the conditions for a triangle, or they may satisfy the conditions for more than one. The treatment of certain cases in spherical trigonometry is quite formidable, because every line intersects every other line in two points and multiplies the cases to be considered. The measurement of spherical polygons may be made to depend upon that of the triangle. If, by drawing diagonals, one can divide the polygons into triangles, each of which contains three known or obtainable elements, then all the parts of the polygon can be determined.

Spherical trigonometry is of great importance in the theory of stereographic projection and in geodesy. It is also the basis of the chief calculations of astronomy; for example, the solution of the so-called astronomical triangle is involved in finding the latitude and longitude of a place, the time of day, the position of a star, and various other data.

IV

HISTORY

The history of trigonometry goes back to the earliest recorded mathematics in Egypt and Babylon. The Babylonians established the measurement of angles in degrees, minutes, and seconds. Not until the time of the Greeks, however, did any considerable amount of trigonometry exist. In the 2nd century bc the astronomer Hipparchus compiled a trigonometric table for solving triangles. Starting with 7½° and going up to 180° by steps of 7½°, the table gave for each angle the length of the chord subtending that angle in a circle of a fixed radius r. Such a table is equivalent to a sine table. The value that Hipparchus used for r is not certain, but 300 years later the astronomer Ptolemy used r = 60 because the Hellenistic Greeks had adopted the Babylonian base-60 (sexagesimal) numeration system (see Mathematics).

In his great astronomical handbook, The Almagest, Ptolemy provided a table of chords for steps of ½°, from 0° to 180°, that is accurate to 1/3600 of a unit. He also explained his method for constructing his table of chords, and in the course of the book he gave many examples of how to use the table to find unknown parts of triangles from known parts. Ptolemy provided what is now known as Menelaus's theorem for solving spherical triangles, as well, and for several centuries his trigonometry was the primary introduction to the subject for any astronomer. At perhaps the same time as Ptolemy, however, Indian astronomers had developed a trigonometric system based on the sine function rather than the chord function of the Greeks. This sine function, unlike the modern one, was not a ratio but simply the length of the side opposite the angle in a right triangle of fixed hypotenuse. The Indians used various values for the hypotenuse.

Late in the 8th century, Muslim astronomers inherited both the Greek and the Indian traditions, but they seem to have preferred the sine function. By the end of the 10th century they had completed the sine and the five other functions and had discovered and proved several basic theorems of trigonometry for both plane and spherical triangles. Several mathematicians suggested using r = 1 instead of r = 60; this produces exactly the modern values of the trigonometric functions. The Muslims also introduced the polar triangle for spherical triangles. All of these discoveries were applied both for astronomical purposes and as an aid in astronomical time-keeping and in finding the direction of Mecca for the five daily prayers required by Muslim law. Muslim scientists also produced tables of great precision. For example, their tables of the sine and tangent, constructed for steps of 1/60 of a degree, were accurate to better than one part in 700 million. Finally, the great astronomer Nasir ad-Din at-Tusi wrote the Book of the Transversal Figure, which was the first treatment of plane and spherical trigonometry as independent mathematical sciences.

The Latin West became acquainted with Muslim trigonometry through translations of Arabic astronomy handbooks, beginning in the 12th century. The first major Western work on the subject was written by the German astronomer and mathematician Johann Müller, known as Regiomontanus. In the next century the German astronomer Georg Joachim, known as Rheticus, introduced the modern conception of trigonometric functions as ratios instead of as the lengths of certain lines. The French mathematician François Viète introduced the polar triangle into spherical trigonometry, and stated the multiple-angle formulas for sin(nθ) and cos(nθ) in terms of the powers of sin(θ) and cos(θ).

Trigonometric calculations were greatly aided by the Scottish mathematician John Napier, who invented logarithms early in the 17th century. He also invented some memory aids for ten laws for solving spherical triangles, and some proportions (called Napier's analogies) for solving oblique spherical triangles.

Almost exactly half a century after Napier's publication of his logarithms, Isaac Newton invented the differential and integral calculus. One of the foundations of this work was Newton's representation of many functions as infinite series in the powers of x (see Sequence and Series). Thus Newton found the series for sin(x) and similar series for cos(x) and tan(x). With the invention of calculus, the trigonometric functions were taken over into analysis, where they still play important roles in both pure and applied mathematics.

Finally, in the 18th century the Swiss mathematician Leonhard Euler defined the trigonometric functions in terms of complex numbers (see Number). This made the whole subject of trigonometry just one of the many applications of complex numbers, and showed that the basic laws of trigonometry were simply consequences of the arithmetic of these numbers.

Equation

Equation, statement of an equality between two expressions, used in almost all branches of pure and applied mathematics and in the physical, biological, and social sciences. An equation usually involves one or more unknown quantities, called variables or indeterminates. These are commonly denoted by letters or other symbols, as in the equations x² + x - 4 = 8, y = sin x + x, and 3y = log x. An equation is named for the number of variables it contains, called an equation in one, two, three, or more variables.

An equation is said to be satisfied or to be true for certain values of the variables if, when the variables are replaced by these values, the expression on the left side of the equals sign is equal to that on the right side. For example, the equation 2x + 5 = 13 is satisfied when x = 4. If one or more values of the variable fail to satisfy the equation, the equation is called conditional. The equation in two variables 3x + 4y = 8 is a conditional equation because it is not satisfied when x = 1 and y = 3. An equation is called an identity if it is satisfied by all possible values of the variables. For example, the equations (x + y)² = x² + 2xy + y² and sin²x + cos²x = 1 are identities because they are both true for all possible values of the unknowns. A solution of a conditional equation is a value of the variable, or a set of values of the variables, that satisfies the equation; thus, 3 is a solution of the equation x² - 2x = 3; and x = 2, y = 4 is a solution of the equation 3x² + 4y = 28. A solution of an equation in one variable is commonly called a root of the equation.

A polynomial equation has the form

a₀xⁿ + a₁xⁿ⁻¹ + a₂xⁿ⁻² + ... + aₙ₋₂x² + aₙ₋₁x + aₙ = 0

in which the coefficients a₀, a₁, ..., aₙ are constants, the leading coefficient a₀ is not equal to zero, and n is a positive integer. The greatest exponent n is the degree of the equation. Equations of the first, second, third, fourth, and fifth degrees are often called, respectively, linear, quadratic, cubic, biquadratic or quartic, and quintic equations. Other important types of equations are algebraic, in which the unknown may appear under a radical sign; trigonometric, as in sin x + cos 2x = 1; logarithmic, as in log x + 2 log (x + 1) = 8; and exponential, as in 3ˣ + 2ˣ - 5 = 0. Diophantine equations are equations in one or more unknowns, usually with integral coefficients, for which integral solutions are sought. Differential and integral equations, which involve derivatives or differentials and integrals, occur in calculus and its applications.
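
As a concrete illustration, the roots of a polynomial equation can be computed numerically. The Python sketch below (using the widely available NumPy library) finds the roots of x² - 2x - 3 = 0, which is the conditional equation x² - 2x = 3 cited earlier, rewritten in the standard form:

    import numpy as np

    # Coefficients listed from a0 down to an, for x**2 - 2*x - 3 = 0.
    coefficients = [1, -2, -3]
    print(np.roots(coefficients))   # [ 3. -1.] -- 3 is the root cited above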

A system of simultaneous equations is a set of two or more equations in two or more unknowns. A solution of such a system is a set of values of the unknowns that satisfies every equation of the set simultaneously.
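
A system of simultaneous linear equations can likewise be solved numerically by writing it as a matrix equation. In the sketch below, the first equation is 3x + 4y = 8 from the text; the second, x - y = 5, is an invented companion equation added purely for illustration:

    import numpy as np

    # Solve 3x + 4y = 8 and x - y = 5 as the matrix equation A v = b.
    A = np.array([[3.0, 4.0],
                  [1.0, -1.0]])
    b = np.array([8.0, 5.0])

    x, y = np.linalg.solve(A, b)                 # the unique simultaneous solution
    assert np.allclose(A @ np.array([x, y]), b)  # both equations are satisfied
    print(x, y)                                  # 4.0 -1.0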

Set Theory

Set Theory, branch of mathematics, first given formal treatment by the German mathematician Georg Cantor in the 19th century. The set concept is one of the most basic in mathematics, even more primitive than the process of counting, and is found, explicitly or implicitly, in every area of pure and applied mathematics. Explicitly, the principles and terminology of sets are used to make mathematical statements more clear and precise and to clarify concepts such as the finite and the infinite.

A set is an aggregate, class, or collection of objects, which are called the elements of the set. In symbols, a ∈ S means that the element a belongs to or is contained in the set S, or that the set S contains the element a. A set S is defined if, given any object a, one and only one of these statements holds: a ∈ S or a ∉ S (that is, a is not contained in S). A set is frequently designated by the symbol S = { }, with the braces enclosing the elements of S, either by writing all of them in explicitly or by giving a formula, rule, or statement that describes all of them. Thus, S1 = {2, 4}; S2 = {2, 4, 6, ..., 2n, ...} = {all positive even integers}; S3 = {x | x² - 6x + 11 ≥ 3}; S4 = {all living males named John}. In S3 it is implied that x is a number; S3 is read as the set of all x such that x² - 6x + 11 ≥ 3.

If every element of a set R also belongs to a set S, R is a subset of S, and S is a superset of R; in symbols, R ⊆ S, or S ⊇ R. A set is both a subset and a superset of itself. If R ⊆ S, but at least one element in S is not in R, R is called a proper subset of S, and S is a proper superset of R; in symbols, R ⊂ S, S ⊃ R. If R ⊆ S and S ⊆ R, that is, if every element of one set is an element of the other, then R and S are the same, written R = S. Thus, in the examples cited above, S1 is a proper subset of S2.

If A and B are two subsets of a set S, the elements found in A or in B or in both form a subset of S called the union of A and B, written A ∪ B. The elements common to A and B form a subset of S called the intersection of A and B, written A ∩ B. If A and B have no elements in common, the intersection is empty; it is convenient, however, to think of the intersection as a set, designated by ∅ and called the empty, or null, set. Thus, if A = {2, 4, 6}, B = {4, 6, 8, 10}, and C = {10, 14, 16, 26}, then A ∪ B = {2, 4, 6, 8, 10}, A ∪ C = {2, 4, 6, 10, 14, 16, 26}, A ∩ B = {4, 6}, A ∩ C = ∅. The set of elements that are in A but not in B is called the difference between A and B, written A - B (sometimes A\B); thus, in the illustration above, A - B = {2}, B - A = {8, 10}. If A is a subset of a set I, the set of elements in I that are not in A, that is, I - A, is called the complement of A (with respect to I), written I - A = A′ (also written Ā, Ã, ~A).

The following statements are basic consequences of the above definitions, with A, B, C, ... representing subsets of a set I. 1. A ∪ B = B ∪ A. 2. A ∩ B = B ∩ A. 3. (A ∪ B) ∪ C = A ∪ (B ∪ C). 4. (A ∩ B) ∩ C = A ∩ (B ∩ C). 5. A ∪ ∅ = A. 6. A ∩ ∅ = ∅. 7. A ∪ I = I. 8. A ∩ I = A. 9. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). 10. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). 11. A ∪ A′ = I. 12. A ∩ A′ = ∅. 13. (A ∪ B)′ = A′ ∩ B′. 14. (A ∩ B)′ = A′ ∪ B′. 15. A ∪ A = A ∩ A = A. 16. (A′)′ = A. 17. A - B = A ∩ B′. 18. (A - B) - C = A - (B ∪ C). 19. If A ∩ B = ∅, then (A ∪ B) - B = A. 20. A - (B ∪ C) = (A - B) ∩ (A - C).

These are laws of the algebra of sets, which is an example of the algebraic system that mathematicians call Boolean algebra.
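
Because Python's built-in sets implement exactly these operations (| for union, & for intersection, - for difference), the laws can be spot-checked directly. The sketch below uses the example sets A, B, and C given above and takes their union as the enclosing set (called I in the text, U below):

    # Spot-check a few laws of the algebra of sets.
    A = {2, 4, 6}
    B = {4, 6, 8, 10}
    C = {10, 14, 16, 26}
    U = A | B | C                                  # enclosing set

    def complement(X):
        return U - X

    assert A | B == B | A                          # law 1
    assert (A | B) | C == A | (B | C)              # law 3
    assert A & (B | C) == (A & B) | (A & C)        # law 10
    assert complement(A | B) == complement(A) & complement(B)   # law 13
    assert A - B == A & complement(B)              # law 17
    print("laws verified")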

If S is a set, the set of all subsets of S is a new set D, sometimes called the derived set of S. Thus, if S = {a, b, c}, then D = {{}, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}. Here, {} is used in place of the null set ∅ of S; it is an element of D. If S has n elements, the derived set D has 2ⁿ elements. Larger and larger sets are obtained by taking the derived set D2 of D, the derived set D3 of D2, and so on.
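
The derived set can be generated mechanically. A minimal Python sketch, using itertools.combinations, lists every subset and confirms the 2ⁿ count:

    from itertools import combinations

    def derived_set(s):
        # All subsets of s: taken 0 at a time, 1 at a time, ..., len(s) at a time.
        elements = list(s)
        return [frozenset(c) for r in range(len(elements) + 1)
                for c in combinations(elements, r)]

    D = derived_set({"a", "b", "c"})
    print(len(D))   # 8, that is, 2**3, as stated above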

If A and B are two sets, the set of all possible ordered pairs of the form (a, b), with a in A and b in B, is called the Cartesian product of A and B, frequently written A × B. For example, if A = {1, 2}, B = {x, y, z}, then A × B = {(1, x), (1, y), (1, z), (2, x), (2, y), (2, z)} and B × A = {(x, 1), (y, 1), (z, 1), (x, 2), (y, 2), (z, 2)}. Here, A × B ≠ B × A, because the pair (1, x) must be distinguished from the pair (x, 1).
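
The Cartesian product can be enumerated the same way. The sketch below, using itertools.product, reproduces the example and confirms that A × B and B × A differ:

    from itertools import product

    A = {1, 2}
    B = {"x", "y", "z"}

    AxB = set(product(A, B))   # all ordered pairs (a, b)
    BxA = set(product(B, A))   # all ordered pairs (b, a)

    print(len(AxB))            # 6 pairs, as in the text
    print(AxB == BxA)          # False: (1, 'x') is not the pair ('x', 1)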

The elements of the set A = {1, 2, 3} can be matched or paired with the elements of the set B = {x, y, z} in several (actually, six) ways such that each element of B is matched with an element of A, each element of A is matched with an element of B, and different elements of one set are matched with different elements of the other. For example, the elements may be matched (1, y), (2, z), (3, x). A matching of this type is called a one-to-one (1-1) correspondence between the elements of A and B. The set A = {1, 2, 3} cannot be put into a 1-1 correspondence with any one of its proper subsets and is therefore called a finite set, or a set with finite cardinality. The set B = {1, 2, 3, ...} can be put into a 1-1 correspondence with its proper subset C = {3, 4, 5, ...} by matching, for example, n of B with n + 2 of C, n = 1, 2, 3, .... A set with this property is called an infinite set, or a set of infinite cardinality. Two sets having elements that can be placed in a 1-1 correspondence are said to have the same cardinality.

Gauss, Carl Friedrich

Gauss, Carl Friedrich (1777-1855), German mathematician, noted for his wide-ranging contributions to physics, particularly the study of electromagnetism.

Born in Braunschweig on April 30, 1777, Gauss studied ancient languages in college, but at the age of 17 he became interested in mathematics and attempted a solution of the classical problem of constructing a regular heptagon, or seven-sided figure, with ruler and compass. He not only succeeded in proving this construction impossible, but went on to give methods of constructing figures with 17, 257, and 65,537 sides. In so doing he proved that the construction, with compass and ruler, of a regular polygon with an odd number of sides was possible only when the number of sides was a prime number of the series 3, 5, 17, 257, and 65,537 or was a product of two or more distinct numbers of this series. With this discovery he gave up his intention to study languages and turned to mathematics. He studied at the University of Göttingen from 1795 to 1798; for his doctoral thesis he submitted a proof that every algebraic equation has at least one root, or solution. This theorem, which had challenged mathematicians for centuries, is still called “the fundamental theorem of algebra” (see Algebra; Equations, Theory of). His volume on the theory of numbers, Disquisitiones Arithmeticae (Inquiries into Arithmetic, 1801), is a classic work in the field of mathematics.

Gauss next turned his attention to astronomy. A faint planetoid, Ceres, had been discovered in 1801; and because astronomers thought it was a planet, they observed it with great interest until losing sight of it. From the early observations Gauss calculated its exact position, so that it was easily rediscovered. He also worked out a new method for calculating the orbits of heavenly bodies. In 1807 Gauss was appointed professor of mathematics and director of the observatory at Göttingen, holding both positions until his death there on February 23, 1855.

Although Gauss made valuable contributions to both theoretical and practical astronomy, his principal work was in mathematics and mathematical physics. In theory of numbers, he developed the important prime-number theorem (see e). He was the first to develop a non-Euclidean geometry (see Geometry), but Gauss failed to publish these important findings because he wished to avoid publicity. In probability theory, he developed the important method of least squares and the fundamental laws of probability distribution (see Probability; Statistics). The normal probability graph is still called the Gaussian curve. He made geodetic surveys, and applied mathematics to geodesy (see Geophysics). With the German physicist Wilhelm Eduard Weber, Gauss did extensive research on magnetism. His applications of mathematics to both magnetism and electricity are among his most important works; the unit of intensity of magnetic fields is today called the gauss. He also carried out research in optics, particularly in systems of lenses. Scarcely a branch of mathematics or mathematical physics was untouched by Gauss.

Descartes, René

I

INTRODUCTION

Descartes, René (1596-1650), French philosopher, scientist, and mathematician, sometimes called the father of modern philosophy.

Born in La Haye, Touraine (a region and former province of France), Descartes was the son of a minor nobleman and belonged to a family that had produced a number of learned men. At the age of eight he was enrolled in the Jesuit school of La Flèche in Anjou, where he remained for eight years. Besides the usual classical studies, Descartes received instruction in mathematics and in Scholastic philosophy, which attempted to use human reason to understand Christian doctrine (see Scholasticism). Roman Catholicism exerted a strong influence on Descartes throughout his life. Upon graduation from school, he studied law at the University of Poitiers, graduating in 1616. He never practiced law, however; in 1618 he entered the service of Prince Maurice of Nassau, leader of the United Provinces of the Netherlands, with the intention of following a military career. In succeeding years Descartes served in other armies, but his attention had already been attracted to the problems of mathematics and philosophy to which he was to devote the rest of his life. He made a pilgrimage to Italy from 1623 to 1624 and spent the years from 1624 to 1628 in France. While in France, Descartes devoted himself to the study of philosophy and also experimented in the science of optics. In 1628, having sold his properties in France, he moved to the Netherlands, where he spent most of the rest of his life. Descartes lived for varying periods in a number of different cities in the Netherlands, including Amsterdam, Deventer, Utrecht, and Leiden.

It was probably during the first years of his residence in the Netherlands that Descartes wrote his first major work, Essais philosophiques (Philosophical Essays), published in 1637. The work contained four parts: an essay on geometry, another on optics, a third on meteors, and Discours de la méthode (Discourse on Method), which described his philosophical speculations. This was followed by other philosophical works, among them Meditationes de Prima Philosophia (Meditations on First Philosophy, 1641; revised 1642) and Principia Philosophiae (The Principles of Philosophy, 1644). The latter volume was dedicated to Princess Elizabeth Stuart of Bohemia, who lived in the Netherlands and with whom Descartes had formed a deep friendship. In 1649 Descartes was invited to the court of Queen Christina of Sweden in Stockholm to give the queen instruction in philosophy. The rigors of the northern winter brought on the pneumonia that caused his death in 1650.

II

PHILOSOPHY

Descartes attempted to apply the rational inductive methods of science, and particularly of mathematics, to philosophy. Before his time, philosophy had been dominated by the method of Scholasticism, which was entirely based on comparing and contrasting the views of recognized authorities. Rejecting this method, Descartes stated, “In our search for the direct road to truth, we should busy ourselves with no object about which we cannot attain a certitude equal to that of the demonstration of arithmetic and geometry.” He therefore determined to hold nothing true until he had established grounds for believing it true. The single sure fact from which his investigations began was expressed by him in the famous words Cogito, ergo sum, “I think, therefore I am.” From this postulate that a clear consciousness of his thinking proved his own existence, he argued the existence of God. God, according to Descartes's philosophy, created two classes of substance that make up the whole of reality. One class was thinking substances, or minds, and the other was extended substances, or bodies.

III

SCIENCE

Descartes's philosophy, sometimes called Cartesianism, carried him into elaborate and erroneous explanations of a number of physical phenomena. These explanations, however, had value, because he substituted a system of mechanical interpretations of physical phenomena for the vague spiritual concepts of most earlier writers. Although Descartes had at first been inclined to accept the Copernican theory of the universe with its concept of a system of spinning planets revolving around the sun, he abandoned this theory when it was pronounced heretical by the Roman Catholic church. In its place he devised a theory of vortices in which space was entirely filled with matter, in various states, whirling about the sun.

In the field of physiology, Descartes held that part of the blood was a subtle fluid, which he called animal spirits. The animal spirits, he believed, came into contact with thinking substances in the brain and flowed out along the channels of the nerves to animate the muscles and other parts of the body.

Descartes's study of optics led him to the independent discovery of the fundamental law of reflection: that the angle of incidence is equal to the angle of reflection. His essay on optics was the first published statement of this law. Descartes's treatment of light as a type of pressure in a solid medium paved the way for the undulatory theory of light.

IV

MATHEMATICS

The most notable contribution that Descartes made to mathematics was the systematization of analytic geometry (see Geometry: Analytic Geometry). He was the first mathematician to attempt to classify curves according to the types of equations that produce them. He also made contributions to the theory of equations. Descartes was the first to use the last letters of the alphabet to designate unknown quantities and the first letters to designate known ones. He also invented the method of indices (as in x²) to express the powers of numbers. In addition, he formulated the rule, known as Descartes's rule of signs, for finding the number of positive and negative roots for any algebraic equation.

Dyson, Freeman

Dyson, Freeman (1923- ), British-born American theoretical physicist and astrophysicist. Dyson was born in Crowthorne, England, and educated at the University of Cambridge, where he worked in applied mathematics for the British government even before his graduation. He received his bachelor's degree in 1945. After six years as a researcher in the United Kingdom and the United States, Dyson became professor of physics at Cornell University in 1951. He moved to the Institute for Advanced Study in Princeton, New Jersey, in 1953.

Dyson's research began at an important time in the study of physics. During the years just after World War II (1939-1945), new experimental evidence raised questions about how quantum theory, developed during the 1920s and 1930s to describe the relationships between electrons and atomic nuclei, might be extended to cover the interactions of matter and light. But the mathematical techniques used to extend quantum theory had internal difficulties that led to seemingly absurd results. However, about 1949 two dominant, and seemingly unrelated, theoretical solutions to these mathematical problems emerged: one theory was developed by the American physicist Richard P. Feynman, and the other was developed by the American physicist Julian S. Schwinger and independently by the Japanese physicist Tomonaga Shin’ichirō. About 1950 Dyson showed that both theories were reducible to a single formalism, and he became a major figure in the application of these ideas. The resulting theory and the mathematical techniques associated with it became central to modern theoretical physics during the second half of the 20th century.

Since the time he was a student Dyson has been interested in the military applications of physics and the peaceful applications of nuclear energy. His books that discuss nuclear strategy and arms control include his autobiography, Disturbing the Universe (1979), and Weapons and Hope (1984).

Quantum Theory

I

INTRODUCTION

Quantum Theory, in physics, description of the particles that make up matter and how they interact with each other and with energy. Quantum theory explains in principle how to calculate what will happen in any experiment involving physical or biological systems, and how to understand how our world works. The name “quantum theory” comes from the fact that the theory describes the matter and energy in the universe in terms of single indivisible units called quanta (singular quantum). Quantum theory is different from classical physics. Classical physics is an approximation of the set of rules and equations in quantum theory. Classical physics accurately describes the behavior of matter and energy in the everyday universe. For example, classical physics explains the motion of a car accelerating or of a ball flying through the air. Quantum theory, on the other hand, can accurately describe the behavior of the universe on a much smaller scale, that of atoms and smaller particles. The rules of classical physics do not explain the behavior of matter and energy on this small scale. Quantum theory is more general than classical physics, and in principle, it could be used to predict the behavior of any physical, chemical, or biological system. However, explaining the behavior of the everyday world with quantum theory is too complicated to be practical.

Quantum theory not only specifies new rules for describing the universe but also introduces new ways of thinking about matter and energy. The tiny particles that quantum theory describes do not have defined locations, speeds, and paths like objects described by classical physics. Instead, quantum theory describes positions and other properties of particles in terms of the chances that the property will have a certain value. For example, it allows scientists to calculate how likely it is that a particle will be in a certain position at a certain time.

Quantum description of particles allows scientists to understand how particles combine to form atoms. Quantum description of atoms helps scientists understand the chemical and physical properties of molecules, atoms, and subatomic particles. Quantum theory enabled scientists to understand the conditions of the early universe, how the Sun shines, and how atoms and molecules determine the characteristics of the material that they make up. Without quantum theory, scientists could not have developed nuclear energy or the electric circuits that provide the basis for computers.

Quantum theory describes all of the fundamental forces—except gravitation—that physicists have found in nature. The forces that quantum theory describes are the electrical, the magnetic, the weak, and the strong. Physicists often refer to these forces as interactions, because the forces control the way particles interact with each other. Interactions also affect spontaneous changes in isolated particles.

II

WAVES AND PARTICLES

One of the striking differences between quantum theory and classical physics is that quantum theory describes energy and matter both as waves and as particles. The type of energy physicists study most often with quantum theory is light. Classical physics considers light to be only a wave, and it treats matter strictly as particles. Quantum theory acknowledges that both light and matter can behave like waves and like particles.

It is important to understand how scientists describe the properties of waves in order to understand how waves fit into quantum theory. A familiar type of wave occurs when a rope is tied to a solid object and someone moves the free end up and down. Waves travel along the rope. The highest points on the rope are called the crests of the waves. The lowest points are called troughs. One full wave consists of a crest and trough. The distance from crest to crest or from trough to trough—or from any point on one wave to the identical point on the next wave—is called the wavelength. The frequency of the waves is the number of waves per second that pass by a given point along the rope.

If waves traveling down the rope hit the stationary end and bounce back, like water waves bouncing against a wall, two waves on the rope may meet each other, hitting the same place on the rope at the same time. These two waves will interfere, or combine (see Interference). If the two waves exactly line up—that is, if the crests and troughs of the waves line up—the waves interfere constructively. This means that the trough of the combined wave is deeper and the crest is higher than those of the waves before they combined. If the two waves are offset by exactly half of a wavelength, the trough of one wave lines up with the crest of the other. This alignment creates destructive interference—the two waves cancel each other out and a momentary flat spot appears on the rope. See also Wave Motion.
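
The two cases of interference can be imitated numerically. In the Python sketch below (the wave shapes are idealized sine waves, an assumption made for simplicity), two aligned waves double in height while two waves offset by half a wavelength cancel:

    import numpy as np

    x = np.linspace(0.0, 4 * np.pi, 1000)

    aligned = np.sin(x) + np.sin(x)           # crests line up: constructive
    offset = np.sin(x) + np.sin(x + np.pi)    # half-wavelength offset: destructive

    print(aligned.max())                      # ~2.0, twice the single-wave crest
    print(np.abs(offset).max())               # ~0.0, the waves cancel out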

A

Light as a Wave and as a Particle

Like classical physics, quantum theory sometimes describes light as a wave, because light behaves like a wave in many situations. Light is not a vibration of a solid substance, such as a rope. Instead, a light wave is made up of a vibration in the intensity of the electric and magnetic fields that surround any electrically charged object.

Like the waves moving along a rope, light waves travel and carry energy. The amount of energy depends on the frequency of the light waves: the higher the frequency, the higher the energy. The frequency of a light wave is also related to the color of the light. For example, blue light has a higher frequency than that of red light. Therefore, a beam of blue light has more energy than an equally intense beam of red light has.

Unlike classical physics, quantum theory also describes light as a particle. Scientists revealed this aspect of light behavior in several experiments performed during the early 20th century. In one experiment, physicists discovered an interaction between light and particles in a metal. They called this interaction the photoelectric effect. In the photoelectric effect, a beam of light shining on a piece of metal makes the metal emit electrons. The light adds energy to the metal’s electrons, giving them enough energy to break free from the metal. If light was made strictly of waves, each electron in the metal could absorb many continuous waves of light and gain more and more energy. Increasing the intensity of the light, or adding more light waves, would add more energy to the emitted electrons. Shining a more and more intense beam of light on the metal would make the metal emit electrons with more and more energy.

Scientists found, however, that shining a more intense beam of light on the metal just made the metal emit more electrons. Each of these electrons had the same energy as that of electrons emitted with low intensity light. The electrons could not be interacting with waves, because adding more waves did not add more energy to the electrons. Instead, each electron had to be interacting with just a small piece of the light beam. These pieces were like little packets of light energy, or particles of light. The size, or energy, of each packet depended only on the frequency, or color, of the light—not on the intensity of the light. A more intense beam of light just had more packets of light energy, but each packet contained the same amount of energy. Individual electrons could absorb one packet of light energy and break free from the metal. Increasing the intensity of the light added more packets of energy to the beam and enabled a greater number of electrons to break free—but each of these emitted electrons had the same amount of energy. Scientists could only change the energy of the emitted electrons by changing the frequency, or color, of the beam. Changing from red light to blue light, for example, increased the energy of each packet of light. In this case, each emitted electron absorbed a bigger packet of light energy and had more energy after it broke free of the metal. Using these results, physicists developed a model of light as a particle, and they called these light particles photons.

In 1922 American physicist Arthur Compton discovered another interaction, now called the Compton effect, that reveals the particle nature of light. In the Compton effect, light collides with an electron. The collision knocks the electron off course and changes the frequency, and therefore the energy, of the light. The light affects the electron in the same way a particle with momentum would: It bumps the electron and changes the electron’s path. The light is also affected by the collision as though it were a particle, in that its energy and momentum change.

Momentum is a quantity that can be defined for all particles. For light particles, or photons, momentum depends on the frequency, or color, of the photon, which in turn depends on the photon’s energy. The energy of a photon is equal to a constant number, called Planck’s constant, times the frequency of the photon. Planck’s constant is named for German physicist Max Planck, who first proposed the relationship between energy and frequency. The accepted value of Planck’s constant is 6.626 × 10-34 joule-second. This number is very small—written out, it is a decimal point followed by 33 zeroes, followed by the digits 6626. The energy of a single photon is therefore very small.
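
The relationship between photon energy and frequency is simple to evaluate. The sketch below uses rough, assumed frequencies for red and blue light to show how small a single photon’s energy is and why blue photons carry more of it:

    h = 6.626e-34       # Planck's constant, joule-seconds

    f_red = 4.3e14      # hertz; an approximate frequency for red light
    f_blue = 6.5e14     # hertz; an approximate frequency for blue light

    print(h * f_red)    # ~2.8e-19 joules per red photon
    print(h * f_blue)   # ~4.3e-19 joules -- higher frequency, more energy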

The dual nature of light seems puzzling because we have no everyday experience with wave-particle duality. Waves are everyday phenomena; we are all familiar with waves on a body of water or on a vibrating rope. Particles, too, are everyday objects—baseballs, cars, buildings, and even people can be thought of as particles. But to our senses, there are no everyday objects that are both waves and particles. Scientists increasingly find that the rules that apply to the world we see are only approximations of the rules that govern the unseen world of light and subatomic particles.

B

Matter as Waves and Particles

In 1923 French physicist Louis de Broglie suggested that all particles—not just photons—have both wave and particle properties. He calculated that every particle has a wavelength (represented by λ, the Greek letter lambda) equal to Planck’s constant (h) divided by the momentum (p) of the particle: λ = h/p. Electrons, atoms, and all other particles have de Broglie wavelengths. The momentum of an object depends on its speed and mass, so the faster and heavier an object is, the larger its momentum (p) will be. Because Planck’s constant (h) is an extremely tiny number, the de Broglie wavelength (h/p) of any visible object is exceedingly small. In fact, the de Broglie wavelength of anything much larger than an atom is smaller than the size of one of its atoms. For example, the de Broglie wavelength of a baseball moving at 150 km/h (90 mph) is 1.1 × 10⁻³⁴ m (3.6 × 10⁻³⁴ ft). The diameter of a hydrogen atom (the simplest and smallest atom) is about 5 × 10⁻¹¹ m (about 2 × 10⁻¹⁰ ft), more than 100 billion trillion times larger than the de Broglie wavelength of the baseball. The de Broglie wavelengths of everyday objects are so tiny that the wave nature of these objects does not affect their visible behavior, so their wave-particle duality is undetectable to us.
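
The arithmetic behind these comparisons is shown in the sketch below; the baseball’s mass is an assumed typical value, since the text gives only its speed:

    h = 6.626e-34                  # Planck's constant, joule-seconds

    # De Broglie wavelength of the baseball from the text: lambda = h / p.
    m_ball = 0.145                 # kg, a typical baseball mass (assumed)
    v_ball = 150.0 / 3.6           # 150 km/h converted to m/s
    print(h / (m_ball * v_ball))   # ~1.1e-34 m, the figure quoted above

    # For comparison, an electron moving at 1 percent of the speed of light.
    m_el = 9.109e-31               # kg, electron mass
    v_el = 0.01 * 3.0e8            # m/s
    print(h / (m_el * v_el))       # ~2.4e-10 m, comparable to atomic sizes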

De Broglie wavelengths become important when the mass, and therefore momentum, of particles is very small. Particles the size of atoms and electrons have demonstrable wavelike properties. One of the most dramatic and interesting demonstrations of the wave behavior of electrons comes from the double-slit experiment. This experiment consists of a barrier set between a source of electrons and an electron detector. The barrier contains two slits, each about the width of the de Broglie wavelength of an electron. On this small scale, the wave nature of electrons becomes evident, as described in the following paragraphs.

Scientists can determine whether the electrons are behaving like waves or like particles by comparing the results of double-slit experiments with those of similar experiments performed with visible waves and particles. To establish how visible waves behave in a double-slit apparatus, physicists can replace the electron source with a device that creates waves in a tank of water. The slits in the barrier are about as wide as the wavelength of the water waves. In this experiment, the waves spread out spherically from the source until they hit the barrier. The waves pass through the slits and spread out again, producing two new wave fronts with centers as far apart as the slits are. These two new sets of waves interfere with each other as they travel toward the detector at the far end of the tank.

The waves interfere constructively in some places (adding together) and destructively in others (canceling each other out). The most intense waves—that is, those formed by the most constructive interference—hit the detector at the spot opposite the midpoint between the two slits. These strong waves form a peak of intensity on the detector. On either side of this peak, the waves destructively interfere and cancel each other out, creating a low point in intensity. Further out from these low points, the waves are weaker, but they constructively interfere again and create two more peaks of intensity, smaller than the large peak in the middle. The intensity then drops again as the waves destructively interfere. The intensity of the waves forms a symmetrical pattern on the detector, with a large peak directly across from the midpoint between the slits and alternating low points and smaller and smaller peaks on either side.

To see how particles behave in the double-slit experiment, physicists replace the water with marbles. The barrier slits are about the width of a marble, as the point of this experiment is to allow particles (in this case, marbles) to pass through the barrier. The marbles are put in motion and pass through the barrier, striking the detector at the far end of the apparatus. The results show that the marbles do not interfere with each other or with themselves like waves do. Instead, the marbles strike the detector most frequently in the two points directly opposite each slit.

When physicists perform the double-slit experiment with electrons, the detection pattern matches that produced by the waves, not the marbles. These results show that electrons do have wave properties. However, if scientists run the experiment using a barrier whose slits are much wider than the de Broglie wavelength of the electrons, the pattern resembles the one produced by the marbles. This shows that tiny particles such as electrons behave as waves in some circumstances and as particles in others.

C

Uncertainty Principle

Before the development of quantum theory, physicists assumed that, with perfect equipment in perfect conditions, measuring any physical quantity as accurately as desired was possible. Quantum mechanical equations show that accurate measurement of both the position and the momentum of a particle at the same time is impossible. This rule is called Heisenberg’s uncertainty principle after German physicist Werner Heisenberg, who derived it from other rules of quantum theory. The uncertainty principle means that as physicists measure a particle’s position with more and more accuracy, the momentum of the particle becomes less and less precise, or more and more uncertain, and vice versa.

Heisenberg formally stated his principle by describing the relationship between the uncertainty in the measurement of a particle’s position and the uncertainty in the measurement of its momentum. Heisenberg said that the uncertainty in position (represented by Δx) times the uncertainty in momentum (represented by Δp) must be greater than a constant number equal to Planck’s constant (h) divided by 4π (π is a constant approximately equal to 3.14). Mathematically, the uncertainty principle can be written as Δx Δp > h/4π. This relationship means that as a scientist measures a particle’s position more and more accurately—so the uncertainty in its position becomes very small—the uncertainty in its momentum must become large to compensate and make this expression true. Likewise, if the uncertainty in momentum, Δp, becomes small, Δx must become large to make the expression true.
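
The inequality can be used directly to estimate how uncertain a particle’s momentum must be once its position is pinned down; the confinement distance below is an assumed atom-sized value:

    import math

    h = 6.626e-34                     # Planck's constant, joule-seconds

    dx = 1e-10                        # meters; position known to atomic size
    dp_min = h / (4 * math.pi * dx)   # smallest dp allowed by dx * dp > h/4π
    print(dp_min)                     # ~5.3e-25 kg*m/s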

One way to understand the uncertainty principle is to consider the dual wave-particle nature of light and matter. Physicists can measure the position and momentum of an atom by bouncing light off of the atom. If they treat the light as a wave, they have to consider a property of waves called diffraction when measuring the atom’s position. Diffraction occurs when waves encounter an object—the waves bend around the object instead of traveling in a straight line. If the length of the waves is much shorter than the size of the object, the bending of the waves just at the edges of the object is not a problem. Most of the waves bounce back and give an accurate measurement of the object’s position. If the length of the waves is close to the size of the object, however, most of the waves diffract, making the measurement of the object’s position fuzzy. Physicists must bounce shorter and shorter waves off an atom to measure its position more accurately. Using shorter wavelengths of light, however, increases the uncertainty in the measurement of the atom’s momentum.

Light carries energy and momentum because of its particle nature (described in the Compton effect). Photons that strike the atom being measured will change the atom’s energy and momentum. The fact that measuring an object also affects the object is an important principle in quantum theory. Normally the effect is so small that it does not matter, but on the small scale of atoms it becomes important. The bump to the atom increases the uncertainty in the measurement of the atom’s momentum. Light with more energy and momentum will knock the atom harder and create more uncertainty. The momentum of light is equal to Planck’s constant divided by the light’s wavelength, or p = h/λ. Physicists can increase the wavelength to decrease the light’s momentum and measure the atom’s momentum more accurately. Because of diffraction, however, increasing the light’s wavelength increases the uncertainty in the measurement of the atom’s position. Physicists most often use the uncertainty principle that describes the relationship between position and momentum, but a similar and important uncertainty relationship also exists between the measurement of energy and the measurement of time.

III

PROBABILITY AND WAVE FUNCTIONS

Quantum theory gives exact answers to many questions, but it can only give probabilities for some values. A probability is the likelihood of an answer being a certain value. Probability is often represented by a graph, with the highest point on the graph representing the most likely value and the lowest representing the least likely value. For example, the likelihood of finding the electron of a hydrogen atom at a certain distance from the nucleus can be drawn as a graph of probability against distance.

The nucleus of the atom is at the left of the graph. The probability of finding the electron very near the nucleus is very low. The probability reaches a definite peak, marking the spot at which the electron is most likely to be.

Scientists use a mathematical expression called a wave function to describe the characteristics of a particle that are related to time and space—such as position and velocity. The wave function helps determine the probability of these aspects being certain values. The wave function of a particle is not the same as the wave suggested by wave-particle duality. A wave function is strictly a mathematical way of expressing the characteristics of a particle. Physicists usually represent these types of wave functions with the Greek letter psi, Ψ. The wave function of the electron in a hydrogen atom in its lowest energy state is:

Ψ = (1/√(πa³)) e^(-r/a)

where r is the distance of the electron from the nucleus. The symbol π and the letter e in this equation represent constant numbers derived from mathematics. The letter a is also a constant number, called the Bohr radius for the hydrogen atom. The square of a wave function, or a wave function multiplied by itself, is equal to the probability density of the particle that the wave function describes. The probability density of a particle gives the likelihood of finding the particle at a certain point.
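
The probability curve described earlier can be recomputed from this wave function. The sketch below squares Ψ, weights it by the area of a spherical shell at each radius (4πr², so that it gives the chance of finding the electron at a given distance in any direction), and locates the peak:

    import numpy as np

    a = 5.29e-11                        # Bohr radius, meters

    r = np.linspace(1e-13, 5 * a, 100000)
    psi = np.exp(-r / a) / np.sqrt(np.pi * a**3)   # ground-state wave function
    radial_density = 4 * np.pi * r**2 * psi**2     # probability per unit radius

    r_peak = r[np.argmax(radial_density)]
    print(r_peak / a)                   # ~1.0: the peak sits at the Bohr radius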

The wave function described above does not depend on time. An isolated hydrogen atom does not change over time, so leaving time out of the atom’s wave function is acceptable. For particles in systems that change over time, physicists use wave functions that depend on time. This lets them calculate how the system and the particle’s properties change over time. Physicists represent a time-dependent wave function with Ψ(t), where t represents time.

The wave function for a single atom can only reveal the probability that an atom will have a certain set of characteristics at a certain time. Physicists call the set of characteristics describing an atom the state of the atom. The wave function cannot describe the actual state of the atom, just the probability that the atom will be in a certain state.

The wave functions of individual particles can be added together to create a wave function for a system, so quantum theory allows physicists to examine many particles at once. The rules of probability state that probabilities and actual values match better and better as the number of particles (or dice thrown, or coins tossed, whatever the probability refers to) increases. Therefore, if physicists consider a large group of atoms, the wave function for the group of atoms provides information that is more definite and useful than that provided by the wave function of a single atom.

An example of a wave function for a single atom is one that describes an atom that has absorbed some energy. The energy has boosted the atom’s electrons to a higher energy level, and the atom is said to be in an excited state. It can return to its normal ground state (or lowest energy state) by emitting energy in the form of a photon. Scientists call the wave function of the initial excited state Ψi (with “i” indicating it is the initial state) and the wave function of the final ground state Ψf (with “f” representing the final state). To describe the atom’s state over time, they multiply Ψi by some function, a(t), that decreases with time, because the chances of the atom being in this excited state decrease with time. They multiply Ψf by some function, b(t), that increases with time, because the chances of the atom being in this state increase with time. The physicist completing the calculation chooses a(t) and b(t) based on the characteristics of the system. The complete wave function for the transition is the following:

Ψ = a(t) Ψi + b(t) Ψf.

At any time, the rules of probability state that the probability of finding a single atom in either state is found by multiplying the coefficient of its wave function (a(t) or b(t)) by itself. For one atom, this does not give a very satisfactory answer. Even though the physicist knows the probability of finding the atom in one state or the other, whether or not reality will match probability is still far from certain. This idea is similar to rolling a pair of dice. There is a 1 in 6 chance that the roll will add up to seven, which is the most likely sum. Each roll is random, however, and not connected to the rolls before it. It would not be surprising if ten rolls of the dice failed to yield a sum of seven. However, the actual number of times that seven appears matches probability better as the number of rolls increases. For one million or one billion rolls of the dice, one of every six rolls would almost certainly add up to seven.

Similarly, for a large number of atoms, the probabilities become approximate percentages of atoms in each state, and these percentages become more accurate as the number of atoms observed increases. For example, if the square of a(t) at a certain time is 0.2, then in a very large sample of atoms, 20 percent (0.2) of the atoms will be in the initial state and 80 percent (0.8) will be in the final state.
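
The bookkeeping in this example can be made concrete with a toy choice of coefficients. The functions a(t) and b(t) below are assumptions, picked only so that the two probabilities always add up to 1:

    import numpy as np

    def a(t):
        return np.exp(-t / 2.0)           # decreasing: chance of staying excited

    def b(t):
        return np.sqrt(1.0 - np.exp(-t))  # increasing: chance of having decayed

    t = np.log(5.0)                       # the time at which a(t)**2 equals 0.2
    print(a(t) ** 2, b(t) ** 2)           # 0.2 and 0.8, as in the example above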

One of the most puzzling results of quantum mechanics is the effect of measurement on a quantum system. Before a scientist measures the characteristics of a particle, its characteristics do not have definite values. Instead, they are described by a wave function, which gives the characteristics only as fuzzy probabilities. In effect, the particle does not exist in an exact location until a scientist measures its position. Measuring the particle fixes its characteristics at specific values, effectively “collapsing” the particle’s wave function. The particle’s position is no longer fuzzy, so if the position of a particle has just been measured, the graph of its probability density shows a single high, sharp peak at the measured position.

In the 1930s physicists proposed an imaginary experiment to demonstrate how measurement causes complications in quantum mechanics. They imagined a system that contained two particles with opposite values of spin, a property of particles that is analogous to angular momentum. The physicists can know that the two particles have opposite spins by setting the total spin of the system to be zero. They can measure the total spin without directly measuring the spin of either particle. Because they have not yet measured the spin of either particle, the spins do not actually have defined values. They exist only as fuzzy probabilities. The spins only take on definite values when the scientists measure them.

In this hypothetical experiment the scientists do not measure the spin of each particle right away. They send the two particles, called an entangled pair, off in opposite directions until they are far apart from each other. The scientists then measure the spin of one of the particles, fixing its value. Instantaneously, the spin of the other particle becomes known and fixed. It is no longer a fuzzy probability but must be the opposite of the other particle, so that their spins will add to zero. It is as though the first particle communicated with the second. This apparent instantaneous passing of information from one particle to the other would violate the rule that nothing, not even information, can travel faster than the speed of light. The two particles do not, however, communicate with each other. Physicists can instantaneously know the spin of the second particle because they set the total spin of the system to be zero at the beginning of the experiment. In 1997 Austrian researchers performed an experiment similar to the hypothetical experiment of the 1930s, confirming the effect of measurement on a quantum system.

IV

THE QUANTUM ATOM

The first great achievement of quantum theory was to explain how atoms work. Physicists found explaining the structure of the atom with classical physics to be impossible. Atoms consist of negatively charged electrons bound to a positively charged nucleus. The nucleus of an atom contains positively charged particles called protons and may contain neutral particles called neutrons. Protons and neutrons are about the same size but are much larger and heavier than electrons are. Classical physics describes a hydrogen atom as an electron orbiting a proton, much as the Moon orbits Earth. By the rules of classical physics, the electron has a property called inertia that makes it want to continue traveling in a straight line. The attractive electrical force of the positively charged proton overcomes this inertia and bends the electron’s path into a circle, making it stay in a closed orbit. The classical theory of electromagnetism says that charged particles (such as electrons) radiate energy when they bend their paths. If classical physics applied to the atom, the electron would radiate away all of its energy. It would slow down and its orbit would collapse into the proton within a fraction of a second. However, physicists know that atoms can be stable for centuries or longer.

Quantum theory gives a model of the atom that explains its stability. It still treats atoms as electrons surrounding a nucleus, but the electrons do not orbit the nucleus like moons orbiting planets. Quantum mechanics gives the location of an electron as a probability instead of pinpointing it at a certain position. Even though the position of an electron is uncertain, quantum theory prohibits the electron from being at some places. The easiest way to describe the differences between the allowed and prohibited positions of electrons in an atom is to think of the electron as a wave. The wave-particle duality of quantum theory allows electrons to be described as waves, using the electron’s de Broglie wavelength.

If the electron is described as a continuous wave, its motion may be described as that of a standing wave. Standing waves occur when a continuous wave occupies one of a set of certain distances. These distances enable the wave to interfere with itself in such a way that the wave appears to remain stationary. Plucking the string of a musical instrument sets up a standing wave in the string that makes the string resonate and produce sound. The length of the string, or the distance the wave on the string occupies, is equal to a whole or half number of wavelengths. At these distances, the wave bounces back at either end and constructively interferes with itself, which strengthens the wave. Similarly, an electron wave occupies a distance around the nucleus of an atom, or a circumference, that enables it to travel a whole or half number of wavelengths before looping back on itself. The electron wave therefore constructively interferes with itself and remains stable.

An electron wave cannot occupy a distance that is not equal to a whole or half number of wavelengths. In a distance such as this, the wave would interfere with itself in a complicated way, and would become unstable.

An electron has a certain amount of energy when its wave occupies one of the allowed circumferences around the nucleus of an atom. This energy depends on the number of wavelengths in the circumference, and it is called the electron’s energy level. Because only certain circumferences, and therefore energy levels, are allowed, physicists say that the energy levels are quantized. This quantization means that the energies of the levels can only take on certain values.
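
Combining this standing-wave condition (a whole number n of de Broglie wavelengths around the circumference, so that n·h/(mv) = 2πr) with the electrical force holding the electron in its path reproduces the quantized radii of hydrogen. The sketch below is the textbook Bohr-model calculation, a simplified picture rather than a full quantum treatment:

    import math

    h = 6.626e-34    # Planck's constant, joule-seconds
    m = 9.109e-31    # electron mass, kg
    e = 1.602e-19    # elementary charge, coulombs
    k = 8.988e9      # Coulomb constant, N*m^2/C^2

    # n wavelengths fit the circumference: 2*pi*r = n*h/(m*v); balancing the
    # Coulomb force, m*v**2/r = k*e**2/r**2, gives r = (n*h/(2*pi))**2/(m*k*e**2).
    for n in (1, 2, 3):
        r = (n * h / (2 * math.pi)) ** 2 / (m * k * e ** 2)
        print(n, r)  # n = 1 gives ~5.3e-11 m, the lowest allowed orbit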

The regions of space in which electrons are most likely to be found are called orbitals. Orbitals look like fuzzy, three-dimensional shapes. More than one orbital, meaning more than one shape, may exist at certain energy levels. Electron orbitals are also quantized, meaning that only certain shapes are allowed in each energy level. The quantization of electron orbitals and energy levels in atoms explains the stability of atoms. An electron in an energy level that allows only one wavelength is at the lowest possible energy level. An atom with all of its electrons in their lowest possible energy levels is said to be in its ground state. Unless it is affected by external forces, an atom will stay in its ground state forever.

The quantum theory explanation of the atom led to a deeper understanding of the periodic table of the chemical elements. The periodic table of elements is a chart of the known elements. Scientists originally arranged the elements in this table in order of increasing atomic number (which is equal to the number of protons in the nuclei of each element’s atoms) and according to the chemical behavior of the elements. They grouped elements that behave in a similar way together in columns. Scientists found that elements that behave similarly occur in a periodic fashion according to their atomic number. For example, a family of elements called the noble gases all share similar chemical properties. The noble gases include neon, xenon, and argon. They do not react easily with other elements and are almost never found in chemical compounds. The atomic numbers of the noble gases increase from one element to the next in a periodic way. They belong to the same column at the far right edge of the periodic table.

Quantum theory showed that an element’s chemical properties have little to do with the nucleus of the element’s atoms, but instead depend on the number and arrangement of the electrons in each atom. An atom has the same number of electrons as protons, making the atom electrically neutral. The arrangement of electrons in an atom depends on two important parts of quantum theory. The first is the quantization of electron energy, which limits the regions of space that electrons can occupy. The second part is a rule called the Pauli exclusion principle, first proposed by Austrian-born Swiss physicist Wolfgang Pauli.

The Pauli exclusion principle states that no electron can have exactly the same characteristics as those of another electron. These characteristics include orbital, direction of rotation (called spin), and direction of orbit. Each energy level in an atom has a set number of ways these characteristics can combine. The number of combinations determines how many electrons can occupy an energy level before the electrons have to start filling up the next level.

An atom is the most stable when it has the least amount of energy, so its lowest energy levels fill with electrons first. Each energy level must be filled before electrons begin filling up the next level. These rules, and the rules of quantum theory, determine how many electrons an atom has in each energy level, and in particular, how many it has in its outermost level. Using the quantum mechanical model of the atom, physicists found that all the elements in the same column of the periodic table also have the same number of electrons in the outer energy level of their atoms. Quantum theory shows that the number of electrons in an atom’s outer level determines the atom’s chemical properties, or how it will react with other atoms.

The number of electrons in an atom’s outer energy level is important because atoms are most stable when their outermost energy level is filled, which is the case for atoms of the noble gases. Atoms imitate the noble gases by donating electrons to, taking electrons from, or sharing electrons with other atoms. If an atom’s outer energy level is only partially filled, it will bond easily with atoms that can help it fill its outer level. Atoms that are missing the same number of electrons from their outer energy level will react similarly to fill their outer energy level.

Quantum theory also explains why different atoms emit and absorb different wavelengths of light. An atom stores energy in its electrons. An atom with all of its electrons at their lowest possible energy levels has its lowest possible energy and is said to be in its ground state. One of the ways atoms can gain more energy is to absorb light in the form of photons, or particles of light. When a photon hits an atom, one of the atom’s electrons absorbs the photon. The photon’s energy makes the electron jump from its original energy level up to a higher energy level. This jump leaves an empty space in the original inner energy level, making the atom less stable. The atom is now in an excited state, but it cannot store the new energy indefinitely, because atoms always seek their most stable state. When the atom releases the energy, the electron drops back down to its original energy level. As it does, the electron releases a photon.

Quantum theory defines the possible energy levels of an atom, so it defines the particular jumps that an electron can make between energy levels. The difference between the old and new energy levels of the electron is equal to the amount of energy the atom stores. Because the energy levels are quantized, atoms can only absorb and store photons with certain amounts of energy. The photon’s energy is related to its frequency, or color. As the frequency of photons increases, their energy increases. Atoms can only absorb certain amounts of energy, so only certain frequencies of light can excite atoms. Likewise, atoms only emit certain frequencies of light when they drop to their ground state. The different frequencies available to different atoms help astronomers, for example, determine the chemical makeup of a star by observing which wavelengths are especially weak or strong in the star’s light. See also Spectroscopy.
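
The relationship referred to here is the Planck relation E = hf, where h is Planck's constant and f is the photon's frequency; a brief sketch comparing the photon energies of two frequencies of visible light (the frequency values are illustrative):

# Sketch: the Planck relation E = h*f links a photon's energy to its
# frequency, which is why only certain frequencies can excite an atom.

PLANCK_H = 6.626e-34          # Planck's constant, joule-seconds

def photon_energy(frequency_hz):
    """Energy in joules of a photon with the given frequency."""
    return PLANCK_H * frequency_hz

# Red light (~4.3e14 Hz) carries less energy per photon than blue (~7.5e14 Hz).
print(photon_energy(4.3e14))  # about 2.8e-19 J
print(photon_energy(7.5e14))  # about 5.0e-19 J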

V

DEVELOPMENT OF QUANTUM THEORY

The development of quantum theory began with German physicist Max Planck's proposal in 1900 that matter can emit or absorb energy only in small, discrete packets, called quanta. This idea introduced the particle nature of light. In 1905 German-born American physicist Albert Einstein used Planck's work to explain the photoelectric effect, in which light hitting a metal makes the metal emit electrons. In 1911 British physicist Ernest Rutherford showed that atoms consist of electrons bound to a nucleus. In 1913 Danish physicist Niels Bohr proposed that classical mechanics could not explain the structure of the atom and developed a model of the atom with electrons in fixed orbits. Bohr's model of the atom proved difficult to apply to all but the simplest atoms.

In 1923 French physicist Louis de Broglie suggested that matter could be described as a wave, just as light could be described as a particle. The wave model of the electron allowed Austrian physicist Erwin Schrödinger to develop a mathematical method of determining the probability that an electron will be at a particular place at a certain time. Schrödinger published his theory of wave mechanics in 1926. Around the same time, German physicist Werner Heisenberg developed a way of calculating the characteristics of electrons that was quite different from Schrödinger’s method but yielded the same results. Heisenberg’s method was called matrix mechanics.

In 1925 Austrian-born Swiss physicist Wolfgang Pauli developed the Pauli exclusion principle, which allowed physicists to calculate the structure of the quantum atom for the first time. In 1926 Heisenberg and two of his colleagues, German physicists Max Born and Ernst Pascual Jordan, published a theory that combined the principles of quantum theory with the classical theory of light (called electrodynamics). Heisenberg made another important contribution to quantum theory in 1927 when he introduced the Heisenberg uncertainty principle.

Since these first breakthroughs in quantum mechanical research, physicists have focused on testing and refining quantum theory, further connecting the theory to other theories, and finding new applications. In 1928 British physicist Paul Dirac refined the theory that combined quantum theory with electrodynamics. He developed a model of the electron that was consistent with both quantum theory and Einstein’s special theory of relativity, and in doing so he created a theory that came to be known as quantum electrodynamics, or QED. In the early 1950s Japanese physicist Tomonaga Shin’ichirō and American physicists Richard Feynman and Julian Schwinger each independently improved the scientific community’s understanding of QED and made it an experimentally testable theory that successfully predicted or explained the results of many experiments.

VI

CURRENT RESEARCH AND APPLICATIONS

At the turn of the 21st century, physicists were still finding new problems to study with quantum theory and new applications for it, and this research will probably continue for many decades. Quantum theory is technically a fully formulated theory: in principle, any question about the physical world can be answered by a quantum mechanical calculation, although some calculations are too complicated to carry out in practice. The attempts to find a quantum explanation of gravitation and a unified description of all the forces in nature are promising and active areas of research. Researchers try to understand why quantum theory explains the way nature works; they may never find an answer, but the effort is underway. Physicists also study the complicated region of overlap between classical physics and quantum mechanics and work on the applications of quantum mechanics.

Studying the intersection of quantum theory and classical physics requires developing a theory that can predict how quantum systems will behave as they get larger or as the number of particles involved approaches the size of problems described by classical physics. The mathematics involved is extremely difficult, but physicists continue to advance in their research. The constantly increasing power of computers should continue to help scientists with these calculations.

New research in quantum theory also promises new applications and improvements to known applications. One of the most potentially powerful applications is quantum computing, in which scientists make use of the behavior of subatomic particles to perform calculations. Making calculations on the atomic level, a quantum computer could theoretically investigate all the possible answers to a query at the same time and make many calculations in parallel. This ability would make quantum computers thousands or even millions of times faster than current computers. Advances in quantum theory also hold promise for the fields of optics, chemistry, and atomic theory.

Boolean Algebra

Boolean Algebra, branch of mathematics having laws and properties similar to, but different from, those of ordinary high school algebra. Formally, a Boolean algebra is a mathematical system consisting of a set of elements, which may be called B, together with two binary operations, which may be denoted by the symbols ∨ and ∧. These operations are defined on the set B and satisfy the following axioms:

1. ∨ and ∧ are both commutative operations. That is, for any elements x, y of the set B, it is true that x ∨ y = y ∨ x and x ∧ y = y ∧ x.

2. Each of the operations ∨ and ∧ distributes over the other. That is, for any elements x, y, and z of the set B, it is true that x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) and x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z).

3. There exists in the set B a distinct identity element for each of the operations ∨ and ∧. These elements are usually denoted by the symbols 0 and 1, with 0 ≠ 1, and they have the property that 0 ∨ x = x and 1 ∧ x = x for any element x in the set B.

4. For each element x in the set B there exists a distinct corresponding element called the complement of x, usually denoted by the symbol x′. With respect to the operations ∨ and ∧, the element x′ has the property that x ∨ x′ = 1 and x ∧ x′ = 0.
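
These axioms can be checked concretely on the smallest Boolean algebra, the two-element set {0, 1}, with ∨ taken as logical OR and ∧ as logical AND. The following Python sketch (an illustration added here, not part of the axioms themselves) verifies all four axioms by brute force:

# Sketch: verify the four Boolean algebra axioms on B = {0, 1},
# taking x | y as the "join" (v) and x & y as the "meet" (^).
from itertools import product

B = (0, 1)
complement = {0: 1, 1: 0}

for x, y, z in product(B, repeat=3):
    assert x | y == y | x and x & y == y & x                   # axiom 1
    assert x | (y & z) == (x | y) & (x | z)                    # axiom 2
    assert x & (y | z) == (x & y) | (x & z)
    assert 0 | x == x and 1 & x == x                           # axiom 3
    assert x | complement[x] == 1 and x & complement[x] == 0   # axiom 4

print("all four axioms hold on {0, 1}")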

A Boolean algebra may be defined by other sets of axioms, all of which can be shown to be equivalent to those just given. The axioms given here are essentially those first published by the American mathematician Edward Huntington in Postulates for the Algebra of Logic (1904). The first treatment of the subject was given in 1854 by the English mathematician George Boole. It is possible to denote the operations ∨ and ∧ by any two symbols; +, ∪, and ⊕ are sometimes used instead of ∨, and ×, ^, ∩, ·, and ○ instead of ∧.

As an example of a Boolean algebra, consider any set X and let P(X) stand for the collection of all possible subsets of the set X. P(X) is sometimes called the power set of the set X. P(X), together with ordinary set union (∪) and set intersection (∩), forms a Boolean algebra. In fact, every Boolean algebra may be represented as an algebra of sets.

From the symmetry of the axioms with respect to the two operations and their respective identities, one is able to prove the so-called principle of duality. This principle asserts that any algebraic statement deducible from the axioms of Boolean algebra remains true if the operations ∨ and ∧ and the identities 1 and 0 are interchanged throughout the statement. Of the many theorems that can be deduced from the axioms of a Boolean algebra, De Morgan's laws, that (x ∨ y)′ = x′ ∧ y′ and that (x ∧ y)′ = x′ ∨ y′, are particularly noteworthy.

The elements that are contained in the set B of a Boolean algebra may be abstract objects, or concrete things such as numbers, propositions, sets, or electrical networks. In Boole's original development, the elements of a Boolean algebra were a collection of propositions, or simple declarative sentences having the property that they were either true or false but not both. The operations were essentially conjunction and disjunction, denoted by the symbols ∧ and ∨, respectively. If x and y represent two propositions, then the expression x ∨ y (read x or y) would be true if and only if either x or y or both were true. The statement x ∧ y (read x and y) would be true if and only if both x and y were true. In this type of Boolean algebra, the complement of an element or proposition is simply the negation of the statement.

A Boolean algebra of propositions and a Boolean algebra of sets are closely connected. For example, let p be the statement, “The ball is blue,” and let P be the set of all elements for which the statement p is true, that is, the set of all blue balls. P is called the truth set for the proposition p. Indeed, if P and Q are the truth sets for statements p and q, then the truth set for the statement p ∨ q is clearly P ∪ Q, and for p ∧ q the truth set is P ∩ Q.

Boolean algebra has many practical applications in the physical sciences, in electric-circuit theory and particularly in the field of computers.

As an example of an application of Boolean algebra in electrical-circuit theory, let p and q denote two propositions, that is, declarative sentences that are either true or false but not both. If each of the propositions p and q is associated with a switch that will be closed if the proposition is true, and open if the proposition is false, then the statement p ∧ q may be represented by connecting the switches in series. The current will flow in this circuit if and only if both switches are closed, that is, if both p and q are true. Similarly, a circuit with switches connected in parallel can be used to represent the statement p ∨ q. In this case the current will flow if either p or q or both are true and the respective switches are closed. More complicated statements give rise to more complex switching circuits.
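
A minimal Python sketch of this correspondence, with closed switches modeled as True, series wiring as and, and parallel wiring as or:

# Sketch: switches as booleans (True = closed); series wiring behaves
# as AND, parallel wiring as OR, as described above.

def series(p, q):
    """Current flows only if both switches are closed (p and q)."""
    return p and q

def parallel(p, q):
    """Current flows if either switch is closed (p or q)."""
    return p or q

for p in (False, True):
    for q in (False, True):
        print(p, q, "series:", series(p, q), "parallel:", parallel(p, q))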

Arithmetic Progression

Arithmetic Progression, sequence of numbers that increase or diminish by a common difference so that any number in the sequence is the arithmetic mean, or average, of the numbers preceding and following it (See also Geometric Progression; Sequence and Series). The numbers 7, 10, 13, 16, 19, 22 form an arithmetic progression, as do the numbers 12, 10½, 9, 7½, 6. The natural numbers 1, 2, 3, 4 form an arithmetic progression in which the difference is 1. To find the sum of any arithmetic progression, multiply the sum of the first and last terms by half the number of terms. Thus, the sum of the first ten natural numbers is (1 + 10) × (10 ÷ 2) = 55.

The general arithmetic progression is a, a + d, a + 2d, a + 3d, ..., where the first term, a, and the common difference, d, are arbitrary numbers. The nth term of this progression (often denoted by an) is given by the formula an = a + (n - 1)d, and the sum of the first n terms is ½n[2a + (n - 1)d], or ½n(a + an).
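
A short Python sketch of the two formulas just given, checked against the examples above:

# Sketch: nth term and sum of the first n terms of an arithmetic
# progression with first term a and common difference d.

def nth_term(a, d, n):
    return a + (n - 1) * d

def sum_first_n(a, d, n):
    return n * (2 * a + (n - 1) * d) / 2    # equivalently n * (a + nth term) / 2

print(nth_term(7, 3, 6))      # 22, the 6th term of 7, 10, 13, 16, 19, 22
print(sum_first_n(1, 1, 10))  # 55.0, the sum of the first ten natural numbers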

Geometric Progression

Geometric Progression, in mathematics, sequence of numbers in which the ratio of any term, after the first, to the preceding term is a fixed number, called the common ratio. For example, the sequence of numbers 2, 4, 8, 16, 32, 64, 128 is a geometric progression in which the common ratio is 2, and 1, ½, ¼, ⅛, 1/16, 1/32, ..., (½)ⁱ⁻¹, ... is a geometric progression in which the common ratio is ½. The first is a finite geometric progression with seven terms; the second is an infinite geometric progression. In general, a geometric progression may be described by denoting the first term in the progression by a, the common ratio by r, and, in a finite progression, the number of terms by n. A finite geometric progression may then be written formally as

a, ar, ar², ..., arⁿ⁻¹

and an infinite geometric progression as

a, ar, ar², ..., arⁿ⁻¹, ...

In general, if the nth term of a geometric progression is denoted by an, it follows from the definition that

an = arⁿ⁻¹

If the symbol Sn denotes the sum of the first n terms of a geometric progression, it can be proved that

Sn = a(1 - rⁿ)/(1 - r)   (r ≠ 1)

The terms in a geometric progression between ai and aj, i < j, are called geometric means. The geometric mean between two positive numbers x and y is the same as the mean proportional √(xy) between the two numbers. In particular, an is the geometric mean, or mean proportional, between an-1 and an+1.

The formal sum of the terms of an infinite geometric progression, written as

a + ar + ar² + ... + arⁿ⁻¹ + ...

is called a geometric series (see Sequence and Series). In analysis it can be proved that a geometric series converges if the absolute value of the common ratio is less than 1; otherwise, the series diverges. If the series does converge, the limit, S, can be shown to be

S = lim(n→∞) Sn = a/(1 - r)

The symbol lim(n→∞) Sn is read “the limit of Sn as n increases without bound.”

Geometric series and geometric progressions have many applications in the physical, biological, and social sciences, as well as in investments and banking. Many problems in compound interest and annuities are easily solved using these concepts.
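
As an illustration of the connection to compound interest, a deposit a growing at rate r per period takes the successive values a, a(1 + r), a(1 + r)², ..., a geometric progression with common ratio 1 + r. A brief Python sketch (the figures are purely illustrative):

# Sketch: geometric progressions in compound interest and series sums.

def gp_sum(a, r, n):
    """Sum of the first n terms, a*(1 - r**n)/(1 - r), valid for r != 1."""
    return a * (1 - r**n) / (1 - r)

# Value of $100 after 10 periods at 5% per period: the 11th term a*(1+r)**10.
print(100 * 1.05**10)        # about 162.89

# The series 1 + 1/2 + 1/4 + ... approaches a/(1 - r) = 1/(1 - 1/2) = 2.
print(gp_sum(1, 0.5, 50))    # about 2.0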

Mathematical Symbols

I

INTRODUCTION

Mathematical Symbols, various signs and abbreviations used in mathematics to indicate entities, relations, or operations.

II

HISTORY

The origin and development of mathematical symbols are not entirely clear. For the probable origin of the remarkable digits 1 through 9, see Numerals. The origin of zero is unknown, because no authentic record exists of its history before AD 400. The extension of the decimal position system below unity is attributed to the Dutch mathematician Simon Stevin, who called tenths, hundredths, and thousandths primes, sekondes, and terzes and circled digits to denote the orders; thus, 4.628 was written as 4⓪6①2②8③. A period was used to set off the decimal part of a number as early as 1492, and later a bar was also used. In the Exempelbüchlein of 1530 by the German mathematician Christoff Rudolf, a problem in compound interest is solved, and some use is made of the decimal fraction. The German astronomer Johannes Kepler used the comma to set off the decimal orders, and the Swiss mathematician Justus Byrgius used the decimal fraction in such forms as 3.2.

Although the early Egyptians had symbols for addition and equality, and the Greeks, Hindus, and Arabs had symbols for equality and the unknown quantity, from earliest times mathematical processes were cumbersome because proper symbols of operation were lacking. The expressions for such processes were either written out in full or denoted by word abbreviations. The later Greeks, the Hindus, and the German-born mathematician Jordanus Nemorarius indicated addition by juxtaposition; the Italians usually denoted it by the letter P or p with a line drawn through it, but their symbols were not uniform: some mathematicians used p, some e, and the mathematician Niccolò Tartaglia commonly expressed the operation by a symbol of his own. German and English algebraists introduced the sign +, but spoke of it as signum additorum and first used it only to indicate excess. The Greek mathematician Diophantus indicated subtraction by the symbol ↗. The Hindus used a dot, and the Italian algebraists denoted it by M or m with a line drawn through the letter. The German and English algebraists were the first to use the present symbol - and described it as signum subtractorum. The symbols + and - were first shown together in 1489 by the German Johann Widman.

The English mathematician William Oughtred first used the symbol × for “times.” The German mathematician Gottfried Wilhelm Leibniz used a period to indicate multiplication, and in 1637 the French mathematician René Descartes used juxtaposition. In 1688 Leibniz employed the sign ∩ to denote multiplication and ∪ to denote division. The Hindus wrote the divisor under the dividend. Leibniz used the familiar form a:b. Descartes made popular the notation aⁿ for involution; the English mathematician John Wallis defined the negative exponent and first used the symbol ∞ for infinity.

The symbol of equality, =, was originated by the English mathematician Robert Recorde, and the symbols > and < for “greater than” and “less than” originated with Thomas Harriot, also an Englishman. The French mathematician François Viète introduced various symbols of aggregation. The symbols of differentiation, dx, and integration, ∫, as used in calculus, originated with Leibniz, as did the symbol ~ for similarity, as used in geometry. The Swiss mathematician Leonhard Euler was largely responsible for the symbols φ, f, F, as used in the theory of functions.

III

THE HIERARCHY OF NUMBERS

The hierarchy of numbers is the following: million, billion, trillion, quadrillion, quintillion, sextillion, septillion, octillion, nonillion, decillion, undecillion, duodecillion, tredecillion, quat(t)uordecillion, quindecillion, sexdecillion, septendecillion, octodecillion, novemdecillion, vigintillion.

In the French and American system of notation, each number after a million is a thousand times the preceding number; in the English and German system, each number is a million times the preceding. A vigintillion is written as a 1 followed by 63 zeros in the French and American system; by 120 zeros in England and Germany.

Decimals are written in the form 1.23 in the United States, 1·23 in the United Kingdom, and 1,23 in continental Europe. In standard scientific notation, a number such as 0.000000123 is written as 1.23 × 10⁻⁷.

Fermat’s Last Theorem

Fermat’s Last Theorem, in mathematics, famous theorem which has led to important discoveries in algebra and analysis. It was proposed by the French mathematician Pierre de Fermat. While studying the work of the ancient Greek mathematician Diophantus, Fermat became interested in the chapter on Pythagorean numbers, that is, the sets of three numbers, a, b, and c, such as 3, 4, and 5, for which the equation a² + b² = c² is true. He wrote in the margin of his copy, “I have discovered a truly remarkable proof which this margin is too small to contain.” Fermat added that when the Pythagorean relation is altered to read aⁿ + bⁿ = cⁿ, the new equation cannot be solved in integers for any value of n greater than 2. That is, no set of positive integers a, b, and c can be found to satisfy, for example, the equation a³ + b³ = c³ or a⁴ + b⁴ = c⁴.

Fermat’s simple theorem turned out to be surprisingly difficult to prove. For more than 350 years, many mathematicians tried to prove Fermat’s statement or to disprove it by finding an exception. In June 1993, Andrew Wiles, an English mathematician at Princeton University, claimed to have proved the theorem; however, in December of that year reviewers found a gap in his proof. On October 6, 1994, Wiles sent a revised proof to three colleagues. On October 25, 1994, after his colleagues judged it complete, Wiles published his proof.

Despite the special and somewhat impractical nature of Fermat’s theorem, it was important because attempts at solving the problem led to many important discoveries in both algebra and analysis.

Pascal, Blaise

I

INTRODUCTION

Pascal, Blaise (1623-62), French philosopher, mathematician, and physicist, considered one of the great minds in Western intellectual history.

Pascal was born in Clermont-Ferrand on June 19, 1623, and his family settled in Paris in 1629. Under the tutelage of his father, Pascal soon proved himself a mathematical prodigy, and at the age of 16 he formulated one of the basic theorems of projective geometry, known as Pascal's theorem and described in his Essai pour les coniques (Essay on Conics, 1639). In 1642 he invented the first mechanical adding machine. Pascal proved by experimentation in 1648 that the level of the mercury column in a barometer is determined by an increase or decrease in the surrounding atmospheric pressure rather than by a vacuum, as previously believed. This discovery verified the hypothesis of the Italian physicist Evangelista Torricelli concerning the effect of atmospheric pressure on the equilibrium of liquids. Six years later, in conjunction with the French mathematician Pierre de Fermat, Pascal formulated the mathematical theory of probability, which has become important in such fields as actuarial, mathematical, and social statistics and as a fundamental element in the calculations of modern theoretical physics. Pascal's other important scientific contributions include the derivation of Pascal's law or principle, which states that fluids transmit pressures equally in all directions, and his investigations in the geometry of infinitesimals. His methodology reflected his emphasis on empirical experimentation as opposed to analytical, a priori methods, and he believed that human progress is perpetuated by the accumulation of scientific discoveries resulting from such experimentation.

II

LATER LIFE AND WORKS

Pascal espoused Jansenism and in 1654 entered the Jansenist community at Port Royal, where he led a rigorously ascetic life until his death eight years later. In 1656 and 1657 he wrote the famous 18 Lettres provinciales (Provincial Letters), in which he attacked the Jesuits for their attempts to reconcile 16th-century naturalism with orthodox Roman Catholicism. His most positive religious statement appeared posthumously (he died August 19, 1662); it was published in fragmentary form in 1670 as Apologie de la religion Chrétienne (Apology of the Christian Religion). In these fragments, which later were incorporated into his major work, he posed the alternatives of potential salvation and eternal damnation, with the implication that only by conversion to Jansenism could salvation be achieved. Pascal asserted that whether or not salvation was achieved, humanity's ultimate destiny is an afterlife belonging to a supernatural realm that can only be known intuitively. Pascal's final important work was Pensées sur la religion et sur quelques autres sujets (Thoughts on Religion and on Other Subjects), also published in 1670. In the Pensées Pascal attempted to explain and justify the difficulties of human life by the doctrine of original sin, and he contended that revelation can be comprehended only by faith, which in turn is justified by revelation. Pascal's writings urging acceptance of the Christian life contain frequent applications of the calculations of probability; he reasoned that the value of eternal happiness is infinite and that although the probability of gaining such happiness by religion may be small it is infinitely greater than by any other course of human conduct or belief. A reclassification of the Pensées, a careful work begun in 1935 and continued by several scholars, does not reconstruct the Apologie, but allows the reader to follow the plan that Pascal himself would have followed.

III

EVALUATION

Pascal was one of the most eminent mathematicians and physicists of his day and one of the greatest mystical writers in Christian literature. His religious works are personal in their speculation on matters beyond human understanding. He is generally ranked among the finest French polemicists, especially in the Lettres provinciales, a classic in the literature of irony. Pascal's prose style is noted for its originality and, in particular, for its total lack of artifice. He affects his readers by his use of logic and the passionate force of his dialectic.

Probability

Probability, also theory of probability, branch of mathematics that deals with measuring or determining quantitatively the likelihood that an event or experiment will have a particular outcome. Probability is based on the study of permutations and combinations and is the necessary foundation for statistics.

The foundation of probability is usually ascribed to the 17th-century French mathematicians Blaise Pascal and Pierre de Fermat, but mathematicians as early as Gerolamo Cardano had made important contributions to its development. Mathematical probability began in an attempt to answer certain questions arising in games of chance, such as how many times a pair of dice must be thrown before the chance that a six will appear is 50-50. Or, in another example, if two players of equal ability, in a match to be won by the first to win ten games, are obliged to suspend play when one player has won five games, and the other seven, how should the stakes be divided?

The probability of an outcome is represented by a number between 0 and 1, inclusive, with “probability 0” indicating certainty that an event will not occur and “probability 1” indicating certainty that it will occur. The simplest problems are concerned with the probability of a specified “favorable” result of an event that has a finite number of equally likely outcomes. If an event has n equally likely outcomes and f of them are termed favorable, the probability, p, of a favorable outcome is f/n. For example, a fair die can be cast in six equally likely ways; therefore, the probability of throwing a 5 or a 6 is 2/6. More involved problems are concerned with events in which the various possible outcomes are not equally likely. For example, in finding the probability of throwing a 5 or 6 with a pair of dice, the various sums (2, 3, ..., 12) are not all equally likely. Some problems involve events with infinitely many outcomes, such as finding the probability that a chord drawn at random in a circle will be longer than the radius.
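
The two-dice case can be settled by simply enumerating the 36 equally likely outcomes; a minimal Python sketch:

# Sketch: probability as favorable outcomes / equally likely outcomes,
# enumerating the 36 equally likely rolls of a pair of dice.
from itertools import product

rolls = list(product(range(1, 7), repeat=2))        # the 36 outcomes
favorable = [r for r in rolls if sum(r) in (5, 6)]  # sum of 5 or 6
print(len(favorable), "/", len(rolls))              # 9 / 36, i.e. 1/4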

Problems involving repeated trials form one of the connections between probability and statistics. To illustrate, what is the probability that exactly five 3s and at least four 6s will occur in 50 tosses of a fair die? Or, a person, tossing a fair coin twice, takes a step to the north, east, south, or west, according to whether the coin falls head, head; head, tail; tail, head; or tail, tail. What is the probability that at the end of 50 steps the person will be within 10 steps of the starting point?
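
Questions like the random-walk problem are often estimated by simulation; a Monte Carlo sketch follows (it takes “within 10 steps” to mean a Manhattan distance of at most 10, one possible reading of the problem):

# Sketch: estimate the random-walk probability by repeated trials.
import random

MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # N, E, S, W, each probability 1/4

def final_position(steps=50):
    x = y = 0
    for _ in range(steps):
        dx, dy = random.choice(MOVES)
        x, y = x + dx, y + dy
    return x, y

trials = 100_000
near = sum(1 for _ in range(trials)
           if sum(map(abs, final_position())) <= 10)
print(near / trials)   # estimated probability of ending near the start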

In probability problems, two outcomes of an event are mutually exclusive if the probability of their joint occurrence is zero; two outcomes are independent if the probability of their joint occurrence is given as the product of the probability of their separate occurrences. Two outcomes are mutually exclusive if the occurrence of one precludes the occurrence of the other; two outcomes are independent if the occurrence or nonoccurrence of one does not alter the probability that the other will or will not occur. Compound probability is the probability of all outcomes of a certain set occurring jointly; total probability is the probability that at least one of a certain set of outcomes will occur. Conditional probability is the probability of an outcome when it is known that some other outcome has occurred or will occur.

If the probability that an outcome will occur is p, the probability that it will not occur is q = 1 - p. The odds in favor of the occurrence are given by the ratio p:q, and the odds against the occurrence are given by the ratio q:p. If the probabilities of two mutually exclusive outcomes X and Y are p and P, respectively, the odds in favor of X and against Y are p to P. If an event must result in one of the mutually exclusive outcomes O1, O2, ..., On, with probabilities p1, p2, ..., pn, respectively, and if v1, v2, ..., vn are numerical values attached to the respective outcomes, the expectation of the event is E = p1v1 + p2v2 + ... + pnvn. For example, a person throws a die and wins 40 cents if it falls 1, 2, or 3; 30 cents for 4 or 5; but loses $1.20 if it falls 6. The expectation on a single throw is 3/6 × $0.40 + 2/6 × $0.30 - 1/6 × $1.20 = $0.10.
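
The die-throwing expectation above, restated as a short Python check:

# Sketch: the expectation E = p1*v1 + ... + pn*vn for the die game described.

outcomes = [
    (3/6,  0.40),   # faces 1, 2, 3 win 40 cents
    (2/6,  0.30),   # faces 4, 5 win 30 cents
    (1/6, -1.20),   # face 6 loses $1.20
]
expectation = sum(p * v for p, v in outcomes)
print(round(expectation, 2))   # 0.1, i.e. ten cents per throw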

The most common interpretation of probability is used in statistical analysis. For example, the probability of throwing a 7 in one throw of two dice is 1/6, and this answer is interpreted to mean that if two fair dice are randomly thrown a very large number of times, about one-sixth of the throws will be 7s. This concept is frequently used to statistically determine the probability of an outcome that cannot readily be tested or is impossible to obtain. Thus, if long-range statistics show that out of every 100 people between 20 and 30 years of age, 42 will be alive at age 70, the assumption is that a person between those ages has a 42 percent probability of surviving to the age of 70.

Mathematical probability is widely used in the physical, biological, and social sciences and in industry and commerce. It is applied in such diverse areas as genetics, quantum mechanics, and insurance. It also involves deep and important theoretical problems in pure mathematics and has strong connections with the theory, known as mathematical analysis, that developed out of calculus.

Calculus (mathematics)

I

INTRODUCTION

Calculus (mathematics), branch of mathematics concerned with the study of such concepts as the rate of change of one variable quantity with respect to another, the slope of a curve at a prescribed point, the computation of the maximum and minimum values of functions, and the calculation of the area bounded by curves. Evolved from algebra, arithmetic, and geometry, it is the basis of that part of mathematics called analysis.

Calculus is widely employed in the physical, biological, and social sciences. It is used, for example, in the physical sciences to study the speed of a falling body, the rates of change in a chemical reaction, or the rate of decay of a radioactive material. In the biological sciences a problem such as the rate of growth of a colony of bacteria as a function of time is easily solved using calculus. In the social sciences calculus is widely used in the study of statistics and probability.

Calculus can be applied to many problems involving the notion of extreme amounts, such as the fastest, the most, the slowest, or the least. These maximum or minimum amounts may be described as values for which a certain rate of change (increase or decrease) is zero. By using calculus it is possible to determine how high a projectile will go by finding the point at which its change of altitude with respect to time, that is, its velocity, is equal to zero. Many general principles governing the behavior of physical processes are formulated almost invariably in terms of rates of change. It is also possible, through the insights provided by the methods of calculus, to resolve such problems in logic as the famous paradoxes posed by the Greek philosopher Zeno.

The fundamental concept of calculus, which distinguishes it from other branches of mathematics and is the source from which all its theory and applications are developed, is the theory of limits of functions of variables (see Function).

Let f be a function of the real variable x, which is denoted f(x), defined on some set of real numbers surrounding the number x0. It is not required that the function be defined at the point x0 itself. Let L be a real number. The expression

lim(x→x0) f(x) = L

is read: “The limit of the function f(x), as x approaches x0, is equal to the number L.” The notation is designed to convey the idea that f(x) can be made as “close” to L as desired simply by choosing an x sufficiently close to x0. For example, if the function f(x) is defined as f(x) = x² + 3x + 2, and if x0 = 3, then from the definition above it is true that

lim(x→3) f(x) = 20

This is because, as x approaches 3 in value, x² approaches 9, 3x approaches 9, and 2 does not change, so their sum approaches 9 + 9 + 2, or 20.
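
The limit can also be seen numerically by evaluating f(x) on both sides of 3; a quick Python sketch:

# Sketch: watch f(x) = x**2 + 3*x + 2 approach 20 as x approaches 3.

def f(x):
    return x**2 + 3*x + 2

for x in (2.9, 2.99, 2.999, 3.001, 3.01, 3.1):
    print(x, f(x))   # the values approach 20 from both sides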

Another type of limit important in the study of calculus can be illustrated as follows. Let the domain of a function f(x) include all of the numbers greater than some fixed number m. L is said to be the limit of the function f(x) as x becomes positively infinite, if, corresponding to a given positive number ε, no matter how small, there exists a number M such that the numerical difference between f(x) and L (the absolute value |f(x) - L|) is less than ε whenever x is greater than M. In this case the limit is written as

lim(x→∞) f(x) = L

For example, the function f(x) = 1/x approaches the number 0 as x becomes positively infinite.

It is important to note that a limit, as just presented, is a two-way, or bilateral, concept: a dependent variable approaches a limit as an independent variable approaches a number or becomes infinite. The limit concept can be extended to a variable that is dependent on several independent variables. The statement “u is an infinitesimal,” meaning “u is a variable approaching 0 as a limit,” found in a few present-day and in many older texts on calculus, is confusing and should be avoided. Further, it is essential to distinguish between the limit of f(x) as x approaches x0 and the value of f(x) when x is x0, that is, the correspondent of x0. For example, if f(x) = sin x/x, then

lim(x→0) (sin x)/x = 1

however, no value of f(x) corresponding to x = 0 exists, because division by 0 is undefined in mathematics.

The two branches into which elementary calculus is usually divided are differential calculus, based on the consideration of the limit of a certain ratio, and integral calculus, based on the consideration of the limit of a certain sum.

II

DIFFERENTIAL CALCULUS

Let the dependent variable y be a function of the independent variable x, expressed by y = f(x). If x0 is a value of x in its domain of definition, then y0 = f(x0) is the corresponding value of y. Let h and k be real numbers, and let y0 + k = f(x0 + h). (Δx, read “delta x,” is used quite frequently in place of h.) When Δx is used in place of h, Δy is used in place of k. Then clearly

k = f(x0 + h) - f(x0)

and

k/h = [f(x0 + h) - f(x0)]/h

This ratio is called a difference quotient. Its intuitive meaning can be grasped from the geometrical interpretation of the graph of y = f(x). Let A and B be the points (x0, y0), (x0 + h, y0 + k), respectively, as in the Derivatives illustration. Draw the secant AB and the lines AC and CB, parallel to the x and y axes, respectively, so that h = AC, k = CB. Then the difference quotient k/h equals the tangent of angle BAC and is therefore, by definition, the slope of the secant AB. It is evident that if an insect were crawling along the curve from A to B, the abscissa x would always increase along its path but the ordinate y would first increase, slow down, then decrease. Thus, y varies with respect to x at different rates between A and B. If a second insect crawled from A to B along the secant, the ordinate y would vary at a constant rate, equal to the difference quotient k/h, with respect to the abscissa x. As the two insects start and end at the same points, the difference quotient may be regarded as the average rate of change of y = f(x) with respect to x in the interval AC.

If the limit of the ratio k/h exists as h approaches 0, this limit is called the derivative of y with respect to x, evaluated at x = x0. For example, let y = x² and x = 3, so that y = 9. Then 9 + k = (3 + h)²; k = (3 + h)² - 9 = 6h + h²; k/h = 6 + h; and

lim(h→0) k/h = lim(h→0) (6 + h) = 6

Referring back to the Derivatives illustration, the secant AB pivots around A and approaches a limiting position, the tangent AT, as h approaches 0. The derivative of y with respect to x, at x = x0, may be interpreted as the slope of the tangent AT, and this slope is defined as the slope of the curve y = f(x) at x = x0. Further, the derivative of y with respect to x, at x = x0, may be interpreted as the instantaneous rate of change of y with respect to x at x0.

If the derivative of y with respect to x is found for all values of x (in its domain) for which the derivative is defined, a new function is obtained, the derivative of y with respect to x. If y = f(x), the new function is written as y′ or f′(x), Dxy or Dxf(x), dy/dx or df(x)/dx. Thus, if y = x², then y + k = (x + h)²; k = (x + h)² - x² = 2xh + h²; k/h = 2x + h, whence

y′ = f′(x) = lim(h→0) (2x + h) = 2x

Thus, as before, y’ = f’(x) = 6 at x = 3, or f’(3) = 6; also, f’(2) = 4, f’(0) = 0, and f’(-2) = -4.
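
Numerically, the difference quotient k/h approaches the derivative as h shrinks; a brief Python sketch at x = 3 for y = x²:

# Sketch: the difference quotient (f(x+h) - f(x))/h approaching f'(3) = 6.

def difference_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda x: x**2
for h in (1, 0.1, 0.01, 0.001):
    print(h, difference_quotient(f, 3, h))   # 7, 6.1, 6.01, 6.001 -> 6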

As the derivative f′(x) of a function f(x) of x is itself a function of x, its derivative with respect to x can be found; it is called the second (order) derivative of y with respect to x, and is designated by any one of the symbols y″ or f″(x), Dx²y or Dx²f(x), d²y/dx² or d²f(x)/dx². Third- and higher-order derivatives are similarly designated.

Every application of differential calculus stems directly or indirectly from one or both of the two interpretations of the derivative as the slope of the tangent to the curve and as the rate of change of the dependent variable with respect to the independent variable. In a detailed study of the subject, rules and methods developed by the limit process are provided for rapid calculation of the derivatives of various functions directly by means of various known formulas. Differentiation is the name given to the process of finding a derivative.

Differential calculus provides a method of finding the slope of the tangent to a curve at a certain point; related rates of change, such as the rate at which the area of a circle increases (in square feet per minute) in terms of the radius (in feet) and the rate at which the radius increases (in feet per minute); velocities (rates of change of distance with respect to time) and accelerations (rates of change of velocities with respect to time, therefore represented as second derivatives of distance with respect to time) of points moving on straight lines or other curves; and absolute and relative maxima and minima.

III

INTEGRAL CALCULUS

Let y = f(x) be a function defined for all x’s in the interval [a, b], that is, the set of x’s from x = a to x = b, including a and b, where a < b (suitable modifications can be made in the definitions to follow for more restricted ranges or domains). Let x0, x1, ..., xn be a sequence of values of x such that a = x0 < x1 < x2 < ... < xn-1 < xn = b, and let h1 = x1 - x0, h2 = x2 - x1, ..., hn = xn - xn-1; in brief, hi = xi - xi-1, where i = 1, 2, ..., n. The x’s form a partition of the interval [a, b]; an h with a value not exceeded by any other h is called the norm of the partition. Let n values of x, say X1, X2, ..., Xn, be chosen so that xi-1 ≤ Xi ≤ xi, where i = 1, 2, ..., n. The sum of the areas of the rectangles is given by

f(X1)h1 + f(X2)h2 + ... + f(Xn)hn

usually abbreviated to

Σ f(Xi)hi (the sum running from i = 1 to i = n)

(Σ is the Greek capital letter sigma.) Aside from the given function f(x) and the given a and b, the value of the sum clearly depends on n and on the choices of the xi’s and Xi’s. In particular, if, after the xi’s are chosen, the Xi’s are chosen so that f(Xi), for each i, is a maximum in the interval [xi-1, xi] (that is, no ordinate from xi-1 to xi exceeds the ordinate at Xi), the sum is called an upper sum; similarly, if, after the xi’s are chosen, the Xi’s are chosen so that f(Xi), for each i, is a minimum in the interval [xi-1, xi], the sum is called a lower sum. It can be proved that the upper sums and the lower sums each have a limit as the norm approaches 0. If these two limits are equal and have the common value S, S is called the definite integral of f(x) from a to b and is written

S = ∫ₐᵇ f(x) dx

The symbol ∫ is an elongated S (for sum); the f(x) dx is suggested by a term f(Xi)hi = f(Xi) Δxi of the sum which is used in defining the definite integral.
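
Upper and lower sums can be computed directly in a simple case; a Python sketch for f(x) = x² on [0, 1], where both sums approach the definite integral 1/3 (the code takes the minimum and maximum of f at the endpoints of each subinterval, which is valid here because x² is increasing on [0, 1]):

# Sketch: upper and lower sums for f(x) = x**2 on [0, 1] with a uniform partition.

def lower_upper_sums(f, a, b, n):
    h = (b - a) / n
    lower = upper = 0.0
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        lo, hi = min(f(left), f(right)), max(f(left), f(right))
        lower += lo * h
        upper += hi * h
    return lower, upper

f = lambda x: x**2
for n in (10, 100, 1000):
    print(n, lower_upper_sums(f, 0, 1, n))   # both sums tend to 1/3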

If y = g(x), then by differentiation y′ = g′(x). Let g′(x) = f(x), and let C be any constant. Then f(x) is also the derivative of g(x) + C. The expression g(x) + C is called the antiderivative of f(x), or the indefinite integral of f(x), and it is represented by

∫ f(x) dx = g(x) + C

The dual use of the term integral is justified by one of the fundamental theorems of calculus, namely, if g(x) is an antiderivative of f(x), then, under suitable restrictions on f(x) and g(x),

∫ₐᵇ f(x) dx = g(b) - g(a)

The process of finding either an indefinite or a definite integral of a function f(x) is called integration; the fundamental theorem relates differentiation and integration.

If the antiderivative, g(x), of f(x) is not readily obtainable or is not known, the definite integral can be approximated by the trapezoidal rule, (b - a)[f(a) + f(b)]/2, or by the more accurate Simpson’s rule:

(b - a)[f(a) + 4f((a + b)/2) + f(b)]/6

If |b - a| is small, Simpson’s rule gives a fairly close result. If |b - a| is large, a good approximation can be obtained by dividing the interval from a to b into a number of small intervals and applying Simpson’s rule to the subintervals.
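
A short Python sketch of that composite form, applying Simpson’s rule to each subinterval and tested on the integral of x² from 0 to 1, whose exact value is 1/3:

# Sketch: composite Simpson's rule over n subintervals of [a, b].

def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] with n applications of Simpson's rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        left = a + i * h
        right = left + h
        total += (h / 6) * (f(left) + 4 * f((left + right) / 2) + f(right))
    return total

print(simpson(lambda x: x**2, 0, 1))   # 0.3333..., exact up to rounding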

Integral calculus involves the inverse process of finding the derivative of a function, that is, it is the process of finding the function itself when its derivative is known. For example, integral calculus makes it possible to find the equation of a curve if the slope of the tangent is known at an arbitrary point; to find distance in terms of time if the velocity (or acceleration) is known; and to find the equation of a curve if its curvature is known. Integral calculus can also be used to find the lengths of curves, the areas of plane and curved surfaces, volumes of solids of revolution, centroids, moments of inertia, and total mass and total force.

IV

DIFFERENTIAL EQUATIONS

Calculus leads directly to the branch of mathematics called differential equations, which is extremely useful in engineering and in the physical sciences. An ordinary differential equation is an equation involving an independent variable, a dependent variable (one or both of these two may be missing), and one or more derivatives (at least one derivative must be present). Many physical laws or statements are initially expressed as differential equations. For example, the law that the acceleration of gravity is a constant g can be expressed mathematically by the differential equation d²x/dt² = g; the principle that the rate of disintegration of radium is proportional to the amount present is expressed as dR/dt = -kR. A differential equation is solved if an equivalent equation is found involving only the independent and dependent variables.
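
The decay equation dR/dt = -kR can also be solved numerically; a sketch using Euler’s method, the simplest such scheme, with illustrative values of k and R0, compared against the exact solution R = R0e^(-kt):

# Sketch: step the decay equation dR/dt = -k*R forward in time with
# Euler's method and compare with the exact exponential solution.
import math

def euler_decay(R0, k, t_end, dt=0.001):
    """Advance R by dR = -k*R*dt from t = 0 to t_end."""
    R, t = R0, 0.0
    while t < t_end:
        R += dt * (-k * R)
        t += dt
    return R

R0, k, t_end = 1.0, 0.5, 2.0
print(euler_decay(R0, k, t_end))   # about 0.368
print(R0 * math.exp(-k * t_end))   # 0.36787..., the exact value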

This article has considered functions of a single independent variable only. Partial derivatives, multiple integrals, and partial differential equations are defined and studied in investigating functions of two or more independent variables.

V

DEVELOPMENT OF CALCULUS

The English and German mathematicians, respectively, Isaac Newton and Gottfried Wilhelm Leibniz invented calculus in the 17th century, but isolated results about its fundamental problems had been known for thousands of years. For example, the Egyptians discovered the rule for the volume of a pyramid as well as an approximation of the area of a circle. In ancient Greece, Archimedes proved that if c is the circumference and d the diameter of a circle, then 3 10/71 d < c < 3 1/7 d. His proof extended the method of inscribed and circumscribed figures developed by the Greek astronomer and mathematician Eudoxus. Archimedes used the same technique for his other results on areas and volumes. Archimedes discovered his results by means of heuristic arguments involving parallel slices of the figures and the law of the lever. Unfortunately, his treatise The Method was only rediscovered in the 19th century, so later mathematicians believed that the Greeks deliberately concealed their secret methods.

During the late middle ages in Europe, mathematicians studied translations of Archimedes’ treatises from Arabic. At the same time, philosophers were studying problems of change and the infinite, such as the addition of infinitely many quantities. Greek thinkers had seen only contradictions there, but medieval thinkers aided mathematics by making the infinite philosophically respectable.

By the early 17th century, mathematicians had developed methods for finding areas and volumes of a great variety of figures. In his Geometry by Indivisibles, the Italian mathematician F. B. Cavalieri, a student of the Italian physicist and astronomer Galileo, expanded on the work of the German astronomer Johannes Kepler on measuring volumes. He used what he called “indivisible magnitudes” to investigate areas under the curves y = xⁿ, n = 1, ..., 9. Also, his theorem on the volumes of figures contained between parallel planes (now called Cavalieri’s theorem) was known all over Europe. At about the same time, the French mathematician René Descartes’ La Géométrie appeared. In this important work, Descartes showed how to use algebra to describe curves and obtain an algebraic analysis of geometric problems. A codiscoverer of this analytic geometry was the French mathematician Pierre de Fermat, who also discovered a method of finding the greatest or least value of some algebraic expressions—a method close to those now used in differential calculus.

About 20 years later, the English mathematician John Wallis published The Arithmetic of Infinites, in which he extrapolated from patterns that held for finite processes to get formulas for infinite processes. His colleague at the University of Cambridge was Newton’s teacher, the English mathematician Isaac Barrow, who published a book that stated geometrically the inverse relationship between problems of finding tangents and areas, a relationship known today as the fundamental theorem of calculus.

Although many other mathematicians of the time came close to discovering calculus, the real founders were Newton and Leibniz. Newton’s discovery (1665-66) combined infinite sums (infinite series), the binomial theorem for fractional exponents, and the algebraic expression of the inverse relation between tangents and areas into methods we know today as calculus. Newton, however, was reluctant to publish, so Leibniz became recognized as a codiscoverer because he published his discovery of differential calculus in 1684 and of integral calculus in 1686. It was Leibniz, also, who replaced Newton’s symbols with those familiar today.

In the following years, one problem that led to new results and concepts was that of describing mathematically the motion of a vibrating string. Leibniz’s students, the Bernoulli family of Swiss mathematicians (see Bernoulli, Daniel), used calculus to solve this and other problems, such as finding the curve of quickest descent connecting two given points in a vertical plane. In the 18th century, the great Swiss-Russian mathematician Leonhard Euler, who had studied with Johann Bernoulli, wrote his Introduction to the Analysis of Infinites, which summarized known results and also contained much new material, such as a strictly analytic treatment of trigonometric and exponential functions.

Despite these advances in technique, calculus remained without logical foundations. Only in 1821 did the French mathematician A. L. Cauchy succeed in giving a secure foundation to the subject by his theory of limits, a purely arithmetic theory that did not depend on geometric intuition or infinitesimals. Cauchy then showed how this could be used to give a logical account of the ideas of continuity, derivatives, integrals, and infinite series. In the next decade, the Russian mathematician N. I. Lobachevsky and German mathematician P. G. L. Dirichlet both gave the definition of a function as a correspondence between two sets of real numbers, and the logical foundations of calculus were completed by the German mathematician J. W. R. Dedekind in his theory of real numbers, in 1872.

Zeno of Elea

Zeno of Elea (flourished 5th century bc), Greek mathematician and philosopher of the Eleatic school, known for his philosophical paradoxes.

Zeno was born in Elea, in southwestern Italy. He became a favorite disciple of the Greek philosopher Parmenides and accompanied him to Athens at the age of about 40. In Athens, Zeno taught philosophy for some years, concentrating on the Eleatic system of metaphysics. The Athenian statesmen Pericles and Callias (flourished 5th century bc) studied under him. Zeno later returned to Elea and, according to traditional accounts, joined a conspiracy to rid his native town of the tyrant Nearchus; the conspiracy failed and Zeno was severely tortured, but he refused to betray his accomplices. Further circumstances of his life are not known.

Only a few fragments of Zeno's works remain, but the writings of Plato and Aristotle provide textual references to Zeno's writings. Philosophically, Zeno accepted Parmenides' belief that the universe, or being, is a single, undifferentiated substance, a oneness, although it may appear diversified to the senses. Zeno's intention was to discredit the senses, which he sought to do through a brilliant series of arguments, or paradoxes, on time and space that have remained complex intellectual puzzles to this day. A typical paradox asserts that a runner cannot reach a goal because, in order to do so, he must traverse a distance; but he cannot traverse that distance without first traversing half of it, and so on, ad infinitum. Because an infinite number of bisections exist in a spatial distance, one cannot travel any distance in finite time, however short the distance or great the speed. This argument, like several others of Zeno, is intended to demonstrate the logical impossibility of motion. In that the senses lead us to believe in the existence of motion, the senses are illusory and therefore no obstacle to accepting the otherwise implausible theories of Parmenides. Zeno is noted not only for his paradoxes, but for inventing the type of philosophical argument they exemplify. Thus Aristotle named him the inventor of dialectical reasoning.

Function

Function, in mathematics, term used to indicate the relationship or correspondence between two or more quantities. The term function was first used in 1637 by the French mathematician René Descartes to designate a power xn of a variable x. In 1694 the German mathematician Gottfried Wilhelm Leibniz applied the term to various aspects of a curve, such as its slope. The most widely used meaning until quite recently was defined in 1829 by the German mathematician Peter Dirichlet. Dirichlet conceived of a function as a variable y, called the dependent variable, having its values fixed or determined in some definite manner by the values assigned to the independent variable x, or to several independent variables x1, x2, ..., xk.

The values of both the dependent and independent variables were real or complex numbers. The statement y = f(x), read “y is a function of x,” indicated the interdependence between the variables x and y; f(x) was usually given as an explicit formula, such as f(x) = x² - 3x + 5, or by a rule stated in words, such as f(x) is the first integer larger than x for all x’s that are real numbers (see Number). If a is a number, then f(a) is the value of the function for the value x = a. Thus, in the first example, f(3) = 3² - 3 · 3 + 5 = 5, f(-4) = (-4)² - 3(-4) + 5 = 33; in the second example, f(3) = f(3.1) = f(π) = 4.

The emergence of set theory first extended and then altered substantially the concept of a function. The function concept in present-day mathematics may be illustrated as follows. Let X and Y be two sets with arbitrary elements; let the variable x represent a member of the set X, and let the variable y represent a member of the set Y. The elements of these two sets may or may not be numbers, and the elements of X are not necessarily of the same type as those of Y. For example, X might be the set of the 50 states of the United States and Y the set of positive integers. Let P be the set of all possible ordered pairs (x, y) and F a subset of P with the property that if (x1, y1) and (x2, y2) are two elements of F, then y1 ≠ y2 implies that x1 ≠ x2—that is, F contains no more than one ordered pair with a given x as its first member. (If x1 ≠ x2, however, it may happen that y1 = y2.) A function is now regarded as the set F of ordered pairs with the stated condition and is written F: X → Y. The set X1 of x’s that occur as first elements in the ordered pairs of F is called the domain of the function F; the set Y1 of y’s that occur as second elements in the ordered pairs is called the range of the function F. Thus, {(New York, 7), (Ohio, 4), (Utah, 4)} is one function that has X = the set of the 50 U.S. states and Y = the set of all positive integers; the domain is the three states named, and the range is {4, 7}.

The modern concept of a function is related to the Dirichlet concept. Dirichlet regarded y = x² - 3x + 5 as a function; today, y = x² - 3x + 5 is thought of as the rule that determines the correspondent y for a given x of an ordered pair of the function; thus, the preceding rule determines (3, 5), (-4, 33) as two of the infinitely many elements of the function. Although y = f(x) is still used today, it is better to read it as “y is functionally related to x.”

A function is also called a transformation or mapping in many branches of mathematics. If the range Y1 is a proper subset of Y (that is, at least one y is in Y but not in Y1), then F is a function or transformation or mapping of the domain X1 into Y; if Y1 = Y, F is a function or transformation or mapping of X1 onto Y.
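
The ordered-pair view translates directly into code; a small Python sketch of the states-to-integers example above, including the check that no first element repeats:

# Sketch: a function as a set of ordered pairs with no repeated first element.

F = {("New York", 7), ("Ohio", 4), ("Utah", 4)}

xs = [x for x, y in F]
assert len(xs) == len(set(xs)), "not a function: some x has two correspondents"

domain = {x for x, y in F}    # {'New York', 'Ohio', 'Utah'}
range_ = {y for x, y in F}    # {4, 7}
print(domain, range_)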

Leibniz, Gottfried Wilhelm

I

INTRODUCTION

Leibniz, Gottfried Wilhelm, also Leibnitz, Baron Gottfried Wilhelm von (1646-1716), German philosopher, mathematician, and statesman, regarded as one of the supreme intellects of the 17th century.

Leibniz was born in Leipzig. He was educated at the universities of Leipzig, Jena, and Altdorf. Beginning in 1666, the year in which he was awarded a doctorate in law, he served Johann Philipp von Schönborn, archbishop elector of Mainz, in a variety of legal, political, and diplomatic capacities. In 1673, when the elector's reign ended, Leibniz went to Paris. He remained there for three years and also visited Amsterdam and London, devoting his time to the study of mathematics, science, and philosophy. In 1676 he was appointed librarian and privy councillor at the court of Hannover. For the 40 years until his death, he served Ernest Augustus, duke of Brunswick-Lüneburg, later elector of Hannover, and George Louis, elector of Hannover, later George I, king of Great Britain and Ireland.

Leibniz was considered a universal genius by his contemporaries. His work encompasses not only mathematics and philosophy but also theology, law, diplomacy, politics, history, philology, and physics.

II

MATHEMATICS

Leibniz's contribution in mathematics was to discover, in 1675, the fundamental principles of infinitesimal calculus. This discovery was arrived at independently of the discoveries of the English scientist Sir Isaac Newton, whose system of calculus was invented in 1666. Leibniz's system was published in 1684, Newton's in 1687, and the method of notation devised by Leibniz was universally adopted (see Mathematical Symbols). In 1672 he also invented a calculating machine capable of multiplying, dividing, and extracting square roots, and he is considered a pioneer in the development of mathematical logic.

III

PHILOSOPHY

In the philosophy expounded by Leibniz, the universe is composed of countless conscious centers of spiritual force or energy, known as monads. Each monad represents an individual microcosm, mirroring the universe in varying degrees of perfection and developing independently of all other monads. The universe that these monads constitute is the harmonious result of a divine plan. Humans, however, with their limited vision, cannot accept such evils as disease and death as part of a universal harmony. This Leibnizian universe, “the best of all possible worlds,” is satirized as a utopia by the French author Voltaire in his novel Candide (1759).

Important philosophical works by Leibniz include Essays in Theodicy on the Goodness of God, the Liberty of Man, and the Origin of Evil (2 volumes, 1710; translated in Philosophical Works,1890), Monadology (1714; published in Latin as Principia Philosophiae,1721; translated 1890), and New Essays Concerning Human Understanding (1703; published 1765; translated 1916). The latter two greatly influenced German philosophers of the 18th century, including Christian von Wolff and Immanuel Kant.

Digital Logic

Digital Logic, also called binary logic, in computer science, a strict set of rules for representing the relationships and interactions among numbers, words, symbols, and other data stored or entered in the memory of a computer. Digital logic is at the heart of the operation of all modern digital computers. The system uses binary arithmetic, in which a sequence of 1s and 0s (called bits) represents a number. These bits are combined in meaningful ways through the operations of digital logic and physically correspond to electrical voltage states in a computer’s circuitry. Digital logic uses the bit value 1 to represent a transistor with electric current flowing through it and the bit value 0 to represent a transistor with no current flowing through it.

The instructions that direct a computer’s operation are known as machine code, and they are written as a sequence of binary digits. These binary digits, or bits, switch specific groups of transistors, called gates, on or off (see Transistor). There are three basic logic states, or functions, for logic gates: AND, OR, and NOT. An AND gate takes the value of two input bits and tests them to see if they are both equal to 1. If they are, the output from the AND gate is a 1, or true. If they are not, the AND gate will output a 0, or false. An OR gate tests two input bits to see if either of the bits is equal to 1. If either input bit is equal to 1, the gate outputs a 1; if both input bits are 0, it outputs a 0. A NOT gate negates the input bit, so an input of 1 results in an output of 0 and vice versa.
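
The behavior of each gate can be summarized in a truth table. The following sketch models the three basic gates in Python purely for illustration; the function names are our own, not part of any hardware standard:

# Minimal models of the three basic logic gates, operating on bits 0 and 1.

def AND(a, b):
    # Outputs 1 only when both inputs are 1.
    return 1 if a == 1 and b == 1 else 0

def OR(a, b):
    # Outputs 1 when at least one input is 1.
    return 1 if a == 1 or b == 1 else 0

def NOT(a):
    # Negates the input bit.
    return 1 - a

# Print the truth table for all input combinations.
for a in (0, 1):
    for b in (0, 1):
        print(f"a={a} b={b}  AND={AND(a, b)}  OR={OR(a, b)}  NOT a={NOT(a)}")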

Combinations of logic gates in open or closed positions can be used to represent and execute operations on data. A series of logic gates together form a logic circuit. The output of a logic circuit can provide input to another circuit or produce the result of an operation. Extremely complex operations can be performed using combinations of the AND, OR, and NOT functions.
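
As a small illustration of how gates compose into circuits, the following sketch builds a half adder, a circuit that adds two bits and produces a sum bit and a carry bit, using only the AND, OR, and NOT functions modeled above (again, illustrative Python rather than a hardware description):

def half_adder(a, b):
    # The sum bit is the exclusive OR of the inputs, built from AND, OR, NOT:
    # XOR(a, b) = OR(AND(a, NOT(b)), AND(NOT(a), b)).
    sum_bit = OR(AND(a, NOT(b)), AND(NOT(a), b))
    # The carry bit is 1 only when both inputs are 1.
    carry = AND(a, b)
    return sum_bit, carry

# 1 + 1 in binary is 10: sum bit 0, carry bit 1.
print(half_adder(1, 1))  # (0, 1)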

Binary logic was first proposed by 19th-century British logician and mathematician George Boole, who in 1847 invented a two-valued system of algebra that represented logical relationships and operations. This system of algebra, called Boolean algebra, was used by German engineer Konrad Zuse in the 1930s for his Z1 calculating machine. It was also used in the design of the first digital computer in the late 1930s by American physicist John Atanasoff and his graduate student Clifford Berry. During 1944 and 1945 Hungarian-born American mathematician John von Neumann suggested using the binary arithmetic system for storing programs in computers. In the 1930s and 1940s British mathematician Alan Turing and American mathematician Claude Shannon also recognized that binary logic was well suited to the development of digital computers.

Complex Number

I

INTRODUCTION

Complex Number, in mathematics, the sum of a real number and an imaginary number. An imaginary number is a multiple of i, where i is the square root of -1. Complex numbers can be expressed in the form a + bi, where a and b are real numbers. They have the algebraic structure of a field in mathematics. In engineering and physics, complex numbers are used extensively to describe electric circuits and electromagnetic waves (see Electromagnetic Radiation). The number i appears explicitly in the Schrödinger wave equation (see Schrödinger, Erwin), which is fundamental to the quantum theory of the atom. Complex analysis, which combines complex numbers with ideas from calculus, has been widely applied to subjects as different as the theory of numbers and the design of airplane wings.

II

HISTORY

Historically, complex numbers arose in the search for solutions to equations such as x² = -1. Because there is no real number x for which the square is -1, early mathematicians believed this equation had no solution. However, by the middle of the 16th century, Italian mathematician Gerolamo Cardano and his contemporaries were experimenting with solutions to equations that involved the square roots of negative numbers. Cardano suggested that the real number 40 could be expressed as

(5 + √-15) × (5 - √-15)

Swiss mathematician Leonhard Euler introduced the modern symbol i for √-1 in 1777 and expressed the famous relationship

e^(iπ) = -1

which connects four of the fundamental numbers of mathematics. For his doctoral dissertation in 1799, German mathematician Carl Friedrich Gauss proved the fundamental theorem of algebra, which states that every nonconstant polynomial with complex coefficients has a complex root. The study of complex functions was continued by French mathematician Augustin Louis Cauchy, who in 1825 generalized the real definite integral of calculus to functions of a complex variable.

III

PROPERTIES

For a complex number a + bi, a is called the real part and b is called the imaginary part. Thus, the complex number -2 + 3i has the real part -2 and the imaginary part 3. Addition of complex numbers is performed by adding the real and imaginary parts separately. To add 1 + 4i and 2 - 2i, for example, add the real parts 1 and 2 and then the imaginary parts 4 and -2 to obtain the complex number 3 + 2i. The general rule for addition is

(a + bi) + (c + di) = (a + c) + (b + d)i

Multiplication of complex numbers is based on the premise that i × i = -1 and the assumption that multiplication distributes over addition. This gives the rule

(a + bi) × (c + di) = (ac - bd) + (ad + bc)i

For example,

(1 + 4i) × (2 - 2i) = 10 + 6i
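
These rules can be checked with Python's built-in complex type, in which the imaginary unit is written j rather than i; a brief sketch:

# Python writes a + bi as complex(a, b) or as the literal a + bj.
z = 1 + 4j
w = 2 - 2j

print(z + w)  # (3+2j): real and imaginary parts are added separately
print(z * w)  # (10+6j): follows (ac - bd) + (ad + bc)i with a=1, b=4, c=2, d=-2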

If z = a + bi is any complex number, then, by definition, the complex conjugate of z is

z̄ = a - bi

and the absolute value, or modulus, of z is

|z| = √(a² + b²)

For example, the complex conjugate of 1 + 4i is 1 - 4i, and the modulus of 1 + 4i is

√(1² + 4²) = √17

A basic relationship connecting absolute value and complex conjugate is

|z|² = z × z̄

IV

THE COMPLEX PLANE

In the same way that real numbers can be thought of as points on a line, complex numbers can be thought of as points in a plane. The number a+bi is identified with the point in the plane with x coordinate a and y coordinate b. The points 1 + 4i and 2 - 2i are plotted in Figure 1 and correspond to the points (1,4) and (2,-2). In 1806 Swiss bookkeeper Jean Robert Argand was one of the first people to express complex numbers geometrically as points in the plane. For this reason, Figure 1 is sometimes referred to as an Argand diagram. If a complex number in the plane is thought of as a vector joining the origin to that point, then addition of complex numbers corresponds to standard vector addition. Figure 1 shows the complex number 3 + 2i obtained by adding the vectors 1 + 4i and 2 - 2i.

Since points in the plane can be written in terms of the polar coordinates r and θ (see Coordinate System), every complex number z can be written in the form

z = r(cos θ + i sin θ)

Here, r is the modulus, or distance to the origin, and θ is the argument of z, the angle that z makes with the x axis. If z = r(cos θ + i sin θ) and w = s(cos φ + i sin φ) are two complex numbers in polar form, then their product in polar form is given by

zw = rs(cos(θ + φ) + i sin(θ + φ))

This has a simple geometric interpretation that is illustrated in Figure 2.
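
The polar product rule can be verified numerically with the cmath module from Python's standard library; a small sketch:

import cmath

z = 1 + 4j
w = 2 - 2j

# polar() returns (r, theta): the modulus and the argument of a complex number.
r, theta = cmath.polar(z)
s, phi = cmath.polar(w)

# Multiplying moduli and adding arguments gives the product, per the rule above.
product = cmath.rect(r * s, theta + phi)
print(product)   # approximately (10+6j)
print(z * w)     # (10+6j), for comparison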

V

SOLUTIONS TO POLYNOMIALS

There are many polynomial equations that have no real solutions, such as

x² + 1 = 0

However, if x is allowed to be complex, the equation has the solutions x = ±i, where i and -i are the roots of the polynomial x² + 1. The equation

x² - 2x + 2 = 0

has the solutions x = 1 ± i. In his fundamental theorem of algebra, Gauss showed that every nonconstant polynomial (one of degree at least 1) with complex coefficients must have at least one complex root. From this it follows that every complex polynomial of degree n must have exactly n roots, although some roots may be the same. Consequently, every complex polynomial of degree n can be written as a product of exactly n linear, or first-degree, factors.
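
Such solutions can be computed directly. The following sketch (the helper function is our own) applies the quadratic formula, using cmath.sqrt so that negative discriminants yield complex roots:

import cmath

# Solve a*x**2 + b*x + c = 0 using the quadratic formula;
# cmath.sqrt returns complex results for negative arguments.
def quadratic_roots(a, b, c):
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

print(quadratic_roots(1, -2, 2))  # ((1+1j), (1-1j)), i.e. x = 1 ± i
print(quadratic_roots(1, 0, 1))   # (1j, -1j), the roots of x**2 + 1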

Calculator

Calculator, handheld device that performs mathematical calculations. A calculator can also be a program on a computer that simulates a handheld calculator or offers more sophisticated calculation features.

A standard calculator is rectangular in shape and has a keypad through which numbers and operations are entered, as well as a display on which the entered numbers and the results of calculations are shown. Modern calculators can perform many types of mathematical computations, as well as permit the user to store and access data from memory. Common handheld calculators have the ability to use complicated geometric, algebraic, trigonometric, statistical, and calculus functions. Many can also be programmed for specialized tasks. Calculators operate on electrical power supplied by batteries, solar cells, or standard electrical current. Modern calculators have digital displays, usually using some form of LCD (liquid crystal display).

Calculator programs are common accessories included with most personal computer operating systems. For example, both the Macintosh and Windows operating systems have a simple desktop calculator program.

While the calculator is a very modern invention, machines able to perform addition and subtraction have existed for centuries. The abacus, a simple instrument for carrying out arithmetic operations, has been used by many cultures and dates back to ancient times. Another calculating instrument called the slide rule was invented in the early 1600s by the English mathematician William Oughtred. The slide rule was based on logarithms and made multiplication and division much easier. Engineers and scientists used slide rules until the introduction of calculators in the early 1970s.

The invention of the calculating machine is commonly credited to the French mathematician Blaise Pascal. In 1642 Pascal created a machine to free his father, who was a tax collector, from the tedious task of adding columns of numbers. Pascal’s machine used a complicated arrangement of numbered wheels connected by gears, which could add and subtract numbers up to nine digits long.

Between Pascal's time and the late 1960s, hundreds of different types of calculating machines were invented, employing many different technologies. Two of the most popular were the stepped-drum and pin-wheel mechanisms. These early calculators, even when refined for the desktop, were large, heavy, and expensive compared to today's calculators. Early models required the user to press a hand lever to turn the calculating mechanism; later versions used electrical power to turn the mechanism. Most early calculators printed numbers, operations, and results on paper tape. For the most part, these mechanical devices could perform only four basic operations: addition, subtraction, multiplication, and division.

In 1967 a team of three engineers from Texas Instruments, Inc. invented the portable, electronic, handheld calculator. Jack Kilby, widely known as the inventor of the integrated circuit (IC), or computer chip, along with Jerry Merryman and James Van Tassel, built an IC-based, battery-powered miniature calculator that could add, subtract, multiply, and divide. This basic calculator could accept 6-digit numbers and display results as large as 12 digits. The prototype of this device is now displayed at the Smithsonian Institution in Washington, D.C.

Proof, Mathematical

Proof, Mathematical, an argument that is used to show the truth of a mathematical assertion. In modern mathematics, a proof begins with one or more statements called premises and demonstrates, using the rules of logic, that if the premises are true then a particular conclusion must also be true.

The accepted methods and strategies used to construct a convincing mathematical argument have evolved since ancient times and continue to change. Consider the Pythagorean theorem, named after the 6th century bc Greek mathematician and philosopher Pythagoras, which states that in a right-angled triangle, the square of the hypotenuse is equal to the sum of the squares of the other two sides. Many early civilizations considered this theorem true because it agreed with their observations in practical situations. But the early Greeks, among others, realized that observation and commonly held opinion do not guarantee mathematical truth. For example, before the 5th century bc it was widely believed that all lengths could be expressed as the ratio of two whole numbers. But an unknown Greek mathematician proved that this was not true by showing that the length of the diagonal of a square with an area of 1 is the irrational number √2 (see Number).

An example of a mathematical proof is the following argument, which proves that the Pythagorean theorem is true. Figure 1 and Figure 2 demonstrate that the relationship A² + B² = C² holds in a right-angled triangle with sides A and B and hypotenuse C. Figure 1 shows that a square of side A + B can be divided into four of the right-angled triangles, a square of side A, and a square of side B. Figure 2 shows that a square of side A + B can also be dissected into four of the right-angled triangles and a square of side C. Since the two squares of side A + B have the same area, they must still have the same area once the four triangles are removed from each of them. The total area of the squares that remain on the left side is A² + B², and the area of the square remaining on the right side is C². Thus A² + B² = C².
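
The same dissection can be written as a short algebraic calculation. Each square of side A + B has area (A + B)², and each right triangle has area AB/2, so the two dissections give

(A + B)² = A² + B² + 4(AB/2) and (A + B)² = C² + 4(AB/2)

Subtracting the common triangle area 4(AB/2) = 2AB from both equations leaves A² + B² = C².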

The Greek mathematician Euclid laid down some of the conventions central to modern mathematical proofs. His book The Elements, written about 300 bc, contains many proofs in the fields of geometry and algebra. This book illustrates the Greek practice of writing mathematical proofs by first clearly identifying the initial assumptions and then reasoning from them in a logical way in order to obtain a desired conclusion. As part of such an argument, Euclid used results that had already been shown to be true, called theorems, or statements that were explicitly acknowledged to be self-evident, called axioms; this practice continues today.

In the 20th century, proofs have been written that are so complex that no one person understands every argument used in them. In 1976 a computer was used to complete the proof of the four-color theorem. This theorem states that four colors are sufficient to color any map in such a way that regions with a common boundary line have different colors. The use of a computer in this proof inspired considerable debate in the mathematical community. At issue was whether a theorem can be considered proven if human beings have not actually checked every detail of the proof.

Hardy-Weinberg Rule

I

INTRODUCTION

Hardy-Weinberg Rule, set of algebraic formulas that describe how the proportion of different genes, the hereditary units that determine a particular characteristic in an organism, can remain the same over time in a large population of individuals.

Specifically, this rule indicates how often particular alleles, alternate forms of a particular gene that contain specific information about a trait (eye color, for example), should occur in a population. The rule also reveals how often particular genotypes, the actual combination of genes an organism carries and may pass on to its offspring, should appear in that same population. By studying these allelic and genotypic frequencies, scientists can identify populations that are changing genetically, or evolving. They can also predict the occurrence of genetic defects in populations.

British mathematician Godfrey Harold Hardy and German physician Wilhelm Weinberg independently described the rule in 1908. American mathematician Sewall Wright, British mathematician Sir Ronald Fisher, and British geneticist John B. S. Haldane then used the Hardy-Weinberg rule to develop mathematical theories of evolution, including several based on the concept of natural selection, in which the organisms best adapted to their environment survive and pass on their genetic characteristics. These theories formed the basis for a new branch of science known as population genetics—the study of how genes spread through populations of organisms (see Genetics).

II

ALLELIC AND GENOTYPIC FREQUENCIES

Each individual in a population has two alleles for every gene. These alleles may be the same or different, and one allele may be dominant over the other. For example, in a sample group of 100 individuals from a particular population, the gene for a certain trait has alleles A and a, in which A is dominant over a. Each individual in the group carries two of these alleles in one of the following combinations, or genotypes: AA, Aa, or aa. In the sample group of 100 people, 33 individuals have the AA genotype, or two A alleles; 54 individuals have the Aa genotype, or one A and one a allele; and 13 individuals have the aa genotype, or two a alleles.

The actual frequency of each allele in the sample group—that is, the proportion of all alleles that are A or a—is determined by dividing the total number of each allele type by the total number of all alleles. For example, the actual frequency of the A allele in the sample group is 0.60, derived by dividing 120, the total number of A alleles (two each from the 33 individuals with the AA genotype and one each from the 54 individuals with the Aa genotype), by 200, the total number of all alleles (two each from the 100 individuals).

The Hardy-Weinberg rule uses the actual allelic frequencies of a population to predict the population's expected genotypic frequencies—that is, the number of genotypes that should occur in the population. Assuming that a gene has two alleles, A and a (whose frequencies are represented mathematically as p and q, respectively), that can form three genotypes, AA, Aa, and aa, the following formulas can be used to predict expected genotypic frequencies:

Frequency of AA = p × p = p²
Frequency of Aa = 2 × p × q = 2pq
Frequency of aa = q × q = q²

For example, if the frequency of the A allele in a population is equal to 0.60, then the expected frequency of individuals with an AA genotype is 0.36, which is 0.60 multiplied by 0.60.
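
These calculations are easy to script. A short sketch in Python (variable names are our own) that derives the allelic frequencies from the sample counts above and then the expected genotypic frequencies:

# Genotype counts from the sample group of 100 individuals.
counts = {"AA": 33, "Aa": 54, "aa": 13}
total_alleles = 2 * sum(counts.values())  # two alleles per individual

# Allelic frequencies: p for A, q for a.
p = (2 * counts["AA"] + counts["Aa"]) / total_alleles
q = (2 * counts["aa"] + counts["Aa"]) / total_alleles

print(p, q)  # 0.6 0.4

# Expected genotypic frequencies under Hardy-Weinberg equilibrium.
print("AA:", p * p)      # 0.36
print("Aa:", 2 * p * q)  # 0.48
print("aa:", q * q)      # 0.16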

III

HARDY-WEINBERG EQUILIBRIUM

Scientists compare a population’s expected genotypic frequencies to its actual genotypic frequencies (determined by dividing the total number of each genotype in the group by the total number of individuals in the group) to determine whether the population is maintaining the same ratio, or equilibrium, of genotypes over time.

According to the Hardy-Weinberg rule, this equilibrium remains the same in a population as long as four conditions are met. First, individuals must select mates randomly without regard to visible, or phenotypic, traits. Second, no genotype can be favored in such a way that it will increase in frequency in the population over time. The third condition states that no new alleles can be introduced into the population, either by individuals from outside the population or by alleles that have changed, or mutated, from one form to another. The final condition specifies that the number of individuals and genotypes in the population remain high. A population that meets these conditions maintains the same proportions of different genes over time—the genetic makeup of the population never changes. Rare genes never disappear and common genes remain numerous.

IV

EVOLVING POPULATIONS

Most populations do not maintain Hardy-Weinberg equilibrium, however; existing genes are replaced over time by new or more advantageous genes. This evolution may be due to natural selection—that is, some members of the population producing more or stronger, healthier offspring. Changes can also be caused by genetic mutation, an inheritable change in the character of a gene; genetic drift, the loss of an allele from a population caused when offspring in a generation inherit the alternate form of an allele for a gene; the migration of individuals to and from the population; or a decrease in population size. All of these factors occur naturally over time. Genetic mutations, however, can also be caused by exposure to harmful chemicals and radioactive materials.

Determinant

Determinant, mathematical notation consisting of a square array of numbers or other elements between two vertical bars; the value of the expression is determined by its expansion according to certain rules. Determinants were first investigated by the Japanese mathematician Seki Kowa about 1683 and independently by the German philosopher and mathematician Gottfried Wilhelm Leibniz about 1693. Determinants are used in almost every branch of mathematics and in the natural sciences.

The symbol

| a11  a12 |
| a21  a22 |

is a determinant of the second order, because it is an array of two rows and two columns. Each letter a stands for a number or variable. The determinant itself also represents a number or variable, the value of which is defined as a11a22 - a12a21. For example:

| 2  1 |
| 3  4 |  =  2 × 4 - 1 × 3 = 5

A determinant of the nth order is a square array of n rows and n columns represented by the symbol

| a11  a12  ...  a1n |
| a21  a22  ...  a2n |
| ...  ...  ...  ... |
| an1  an2  ...  ann |

The minor, Mij, of any element aij in the array is the determinant formed of the elements remaining after deleting the row i and the column j in which the element aij occurs. The cofactor, Aij, of an element aij is equal to (-1)^(i+j) × Mij.

The value of any determinant may be expressed in terms of the elements of any row (or column) and their respective cofactors in accordance with the following rule. Each element in the selected row (or column) is multiplied by its corresponding cofactor; the sum of these products is the value of the determinant. Formally, this may be written

|A| = ai1Ai1 + ai2Ai2 + ... + ainAin

if the expansion is in terms of the ith row, or

|A| = a1jA1j + a2jA2j + ... + anjAnj

if it is in terms of the jth column. Thus, to find the value of a third-order determinant using the elements in the first column,

|A| = a11A11 + a21A21 + a31A31 = a11M11 - a21M21 + a31M31

These terms may be evaluated in accordance with the definition of the second-order determinant given above. For determinants of higher orders than the third, the process is repeated on the determinants formed by the minors until the determinants can be expanded easily.
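
This recursive expansion translates directly into code. The following Python sketch (written for clarity rather than efficiency, with names of our own choosing) evaluates a determinant by cofactor expansion along the first column:

# Determinant by cofactor expansion along the first column.
def det(m):
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        # Minor: delete row i and column 0.
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        # Cofactor sign is (-1)**(i + j) with j = 0 here.
        total += (-1) ** i * m[i][0] * det(minor)
    return total

print(det([[2, 1], [3, 4]]))                    # 5
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3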

Because this method of finding the value of a determinant may be quite laborious, various properties of a determinant are developed and utilized to lessen the amount of calculation needed to evaluate it. Among these properties are the following: (1) a determinant is equal to zero if all the elements in one row (or column) are identical with or proportional to the elements in another row (or column); (2) a determinant is multiplied by a given factor if each element of a row (or column) is multiplied by the same factor; and (3) the value of a determinant is not changed by adding to each element of a row (or column) the corresponding element of another row (or column) multiplied by a constant factor. Hence, through the use of these and other properties, determinants of higher order can be reduced to third-order determinants for simple expansion.

Application of determinants in analytical geometry is illustrated in the following example: If P1(x1, y1), P2(x2, y2), and P3(x3, y3) are three distinct points in a rectangular coordinate plane, the area A of triangle P1P2P3, apart from algebraic sign, is given by

          | x1  y1  1 |
A = 1/2 × | x2  y2  1 |
          | x3  y3  1 |

When the three points are collinear, the determinant is equal to zero.

An example of the use of determinants in solving linear equations is as follows. Let

a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
...
an1x1 + an2x2 + ... + annxn = bn

be a system of n linear equations in the n unknowns x1, x2, ..., xn. The determinant Δ given above is the determinant of coefficients; let Δk be the determinant obtained by deleting the kth column of Δ and replacing it by the column of constants b1, b2, ..., bn, where k = 1, 2, ..., n. If Δ ≠ 0, the equations are consistent—that is, a solution is possible. In this case only one solution is possible; it is given by

xk = Δk/Δ, for k = 1, 2, ..., n

If Δ = 0, further investigation is necessary to determine the number and nature of the solutions.
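
The rule is equally direct to sketch in code; the following illustrative snippet solves a small 2 × 2 system, reusing the det function from the determinant sketch above:

# Solve the system 2x + y = 5, 3x + 4y = 10 by Cramer's rule.
coeffs = [[2, 1], [3, 4]]
constants = [5, 10]

delta = det(coeffs)  # determinant of coefficients, here 5

solution = []
for k in range(len(constants)):
    # Replace column k of the coefficient matrix with the constants.
    mk = [row[:k] + [constants[i]] + row[k + 1:] for i, row in enumerate(coeffs)]
    solution.append(det(mk) / delta)

print(solution)  # [2.0, 1.0], i.e. x = 2, y = 1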

Binomial

Binomial, algebraic expression that consists of exactly two terms separated by + or -, such as x + y or ab - cd. The binomial theorem asserts that the general expansion of a binomial, such as (x + y), raised to the nth power is given by

(x + y)^n = x^n + nx^(n-1)y + [n(n-1)/2]x^(n-2)y² + ... + nxy^(n-1) + y^n

The coefficient of the term x^(n-k)y^k in the above expression is

n!/(k!(n - k)!)

and is usually denoted by the symbol (n k), read “n choose k.” The expansion of (x + y)^n contains n + 1 terms. Formulated in medieval times, the binomial theorem was developed (about 1676) for fractional exponents by the English scientist Sir Isaac Newton, enabling him to apply his newly discovered methods of calculus to many difficult problems. The binomial theorem is useful in various branches of mathematics, particularly in the theory of probability.
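
Python's standard library computes these coefficients directly through math.comb (available since Python 3.8); a brief illustration:

from math import comb

# Coefficient of x**(n-k) * y**k in (x + y)**n is n! / (k! * (n - k)!).
print(comb(4, 2))  # 6

# All n + 1 coefficients of (x + y)**4: one row of Pascal's triangle.
print([comb(4, k) for k in range(5)])  # [1, 4, 6, 4, 1]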

Pi

Pi, Greek letter (π) used in mathematics as the symbol for the ratio of the circumference of a circle to its diameter. Its value is approximately 22/7; the approximate value of π to five decimal places is 3.14159. The formula for the area of a circle, A = πr² (r is the radius), uses the constant. Various approximations of the numerical value of the ratio were used in biblical times and later. In the Bible, the value was taken to be 3; the Greek mathematician Archimedes correctly asserted that the value was between 3 10/71 and 3 1/7. With computers, the value has been figured to more than 200 billion decimal places. The ratio is actually an irrational number, so the decimal places go on infinitely without repeating or ending in zeros. The symbol π for the ratio was first used in 1706 by the Welsh mathematician William Jones, but it became popular only after its adoption by the Swiss mathematician Leonhard Euler in 1737. In 1882 the German mathematician Ferdinand Lindemann proved that π is a transcendental number—that is, it is not the root of any polynomial equation with rational coefficients (for example, x³ - (5/7)x² - 21x + 17 = 0). Consequently, Lindemann was able to demonstrate that it is impossible to square the circle algebraically or by use of the ruler and compass.
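
Simple approximations of π are easy to generate. The sketch below sums the classic Leibniz series π/4 = 1 - 1/3 + 1/5 - 1/7 + ..., which converges slowly but illustrates the idea:

import math

# Partial sum of the Leibniz series: pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
def leibniz_pi(terms):
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))

print(leibniz_pi(1_000_000))  # approximately 3.1415916..., close to pi
print(math.pi)                # 3.141592653589793, for comparison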

Linear Programming

Linear Programming, mathematical and operations-research technique, used in administrative and economic planning to maximize or minimize a linear function of a large number of variables, subject to certain constraints (see Algebra; Function; Mathematics). The development of high-speed electronic computers and data-processing techniques has brought about many recent advances in linear programming, and the technique is now widely used in industrial and military operations. See Computer.

Linear programming is basically used to find a set of values, chosen from a prescribed set of numbers, that will maximize or minimize a given polynomial form (see Theory of Equations), and this is illustrated by the following example of a particular kind of problem and a method of solution. A manufacturer makes two varieties, V1 and V2, of an article having parts that must be cut, assembled, and finished; the manufacturer knows that as many articles as are produced can be sold. Variety V1 takes 25 min to cut, 60 min to assemble, and 68 min to finish; it yields $30 profit. Variety V2 takes 75 min to cut, 60 min to assemble, and 34 min to finish, and yields a $40 profit. Not more than 450 min of cutting time, 480 min of assembly time, and 476 min of finishing time are available each day. How many articles of each variety should be manufactured each day to maximize profit?

Let x and y be the numbers of articles of varieties V1 and V2, respectively, that should be manufactured each day to maximize profit. Because x and y cannot be negative numbers,

(1) x ≥ 0
(2) y ≥ 0

The cutting, assembly, and finishing data determine the following inequalities:

(3) 25x + 75y ≤ 450
(4) 60x + 60y ≤ 480
(5) 68x + 34y ≤ 476

The profit is given by

(6) p = 30x + 40y

The problem is to find the values of x and y, if any, subject to restrictions (1) through (5), that will maximize the linear polynomial or linear form (6).

The equation 25x + 75y = 450 represents a straight line in the Cartesian plane (see Geometry); if point P has coordinates (r, s), then P is above the line, on the line, or below the line according as 25r + 75s is greater than, equal to, or less than 450. Therefore, condition (3) is satisfied by the coordinates of any point that lies on the line 25x + 75y = 450 or below it. Similarly, each of the conditions (1), (2), (4), and (5) is satisfied by the coordinates of all points in a certain half plane. To satisfy all five conditions, the point must lie on the boundary or in the interior of the convex, polygonal region OABCD in Figure 1. The region is convex because if R and S are any two points of the region, so is every point of the line segment RS; it is polygonal because its boundary consists of line segments.

The equation 30x + 40y = p, indicating the profit, also represents a straight line; the equation 30x + 40y = p’ represents a parallel line that is above, coincides with, or is below the first line, as p’ is greater than, equal to, or less than p. The profit will be maximized by choosing the line, of the family of parallel lines, that just touches the region OABCD above, namely, the line through the vertex B(3,5). The manufacturer will earn a maximum profit (of $290) if 3 articles of variety V1 and 5 of variety V2 are made per day. Any other quantities of the two varieties, within the constraints of the time limitations, will yield a smaller profit.
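
In practice, problems of this kind are usually handed to a solver. The following sketch uses the linprog routine from the SciPy library (assuming SciPy is available; since linprog minimizes, the profit is negated):

from scipy.optimize import linprog

# Maximize 30x + 40y, i.e. minimize -30x - 40y,
# subject to the cutting, assembly, and finishing limits (3)-(5).
result = linprog(
    c=[-30, -40],
    A_ub=[[25, 75], [60, 60], [68, 34]],
    b_ub=[450, 480, 476],
    bounds=[(0, None), (0, None)],  # conditions (1) and (2): x, y >= 0
)

print(result.x)     # approximately [3. 5.]
print(-result.fun)  # approximately 290.0, the maximum daily profit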

Linear programming is applied to many other kinds of problems, and many other methods of solution exist, but the above example is generally illustrative.

Inequality (mathematics)

Inequality (mathematics), mathematical relationship that makes use of the way in which numbers are ordered. Figure 1 shows the symbols used to denote inequality. For example, the inequality 3 < 10 says that the number 3 is less than the number 10. The inequality x² ≥ 0 expresses the fact that the square of any real number is always greater than or equal to zero.

Inequalities often arise in describing areas and volumes. For example, if P is any point on the diagonal of the square shown in figure 2, then the area of the two rectangles that are shaded blue is always less than or equal to (≤) the area of the two squares that are shaded red.

The solutions of an inequality such as -2x + 6 > 0 are the values of x for which the expression -2x + 6 is greater than zero. The rules of algebra can be applied to solve this inequality, except that the direction of the inequality must be reversed when multiplying or dividing by negative numbers. So, to solve the inequality -2x + 6 > 0, first subtract 6 from both sides of the inequality to get -2x > -6. Next, divide both sides of -2x > -6 by -2, reversing the direction of the inequality since -2 is negative. This gives x < 3, meaning that any value for x that is less than 3 will be a solution of -2x + 6 > 0.
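
Computer algebra systems apply the same rules, including the sign reversal. A quick check with the SymPy library, assuming it is installed:

from sympy import symbols, solveset, S

x = symbols('x', real=True)

# Solve -2x + 6 > 0 over the real numbers.
print(solveset(-2 * x + 6 > 0, x, domain=S.Reals))
# Interval.open(-oo, 3), i.e. all x < 3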

Root (mathematics)

Root (mathematics), in mathematics, a number that when multiplied by itself a stated number of times yields as a result a second, given number. The index of the root is the number of times that the root appears in the multiplication. For example, the numbers 2 and -2 are the square, or second, roots of 4; 3 and -3 are the square roots of 9; and 4 and -4 are the square roots of 16. A cube, or third, root of 8 is 2, and a cube root of -8 is -2; a cube root of 27 is 3, and of -27 is -3, and so on. The usual symbol for root is √ with the appropriate index, except in the case of the square root, which is written without an index. For example,

√9 = 3,  ³√27 = 3,  ⁴√16 = 2

It is also possible to express roots in the form of fractional exponents; thus

√a = a^(1/2),  ³√a = a^(1/3),  and, in general, the nth root of a is a^(1/n)

In an algebraic equation, when a quantity inserted in place of the unknown quantity renders the equation a true statement, that quantity is called a root of the equation. See Power.
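
Fractional exponents carry over directly to most programming languages; a small Python illustration (note the floating-point rounding in the cube root):

import math

print(16 ** 0.5)      # 4.0: the principal square root of 16
print(math.sqrt(9))   # 3.0
print(27 ** (1 / 3))  # 3.0000000000000004: floating-point cube root of 27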

Fractal

Fractal, in mathematics, a geometric shape that is complex and detailed in structure at any level of magnification. Often fractals are self-similar—that is, they have the property that each small portion of the fractal can be viewed as a reduced-scale replica of the whole. One example of a fractal is the “snowflake” curve constructed by taking an equilateral triangle and repeatedly erecting smaller equilateral triangles on the middle third of the progressively smaller sides. Theoretically, the result would be a figure of finite area but with a perimeter of infinite length, consisting of an infinite number of vertices. In mathematical terms, such a curve cannot be differentiated (see Calculus). Many such self-repeating figures can be constructed, but from their first appearance in the 19th century they were long regarded as merely bizarre curiosities.

A turning point in the study of fractals came with the discovery of fractal geometry by the Polish-born French mathematician Benoit B. Mandelbrot in the 1970s. Mandelbrot adopted a much more abstract definition of dimension than that used in Euclidean geometry, stating that the dimension of a fractal must be used as an exponent when measuring its size. The result is that a fractal cannot be treated as existing strictly in one, two, or any other whole-number dimension. Instead, it must be handled mathematically as though it has some fractional dimension. The “snowflake” curve, for example, has a fractal dimension of approximately 1.2618.

Fractal geometry is not simply an abstract development. A coastline, if measured down to its least irregularity, would tend toward infinite length just as does the “snowflake” curve. Mandelbrot has suggested that mountains, clouds, aggregates, galaxy clusters, and other natural phenomena are similarly fractal in nature, and fractal geometry's application in the sciences has become a rapidly expanding field. In addition, the beauty of fractals has made them a key element in computer graphics.

Fractals have also been used to compress still and video images on computers. In 1987, English-born mathematician Dr. Michael F. Barnsley discovered the Fractal Transform™, which automatically detects fractal codes in real-world images (digitized photographs). The discovery spawned fractal image compression, used in a variety of multimedia and other image-based computer applications.

Construction of a Fractal Snowflake

A Koch snowflake is constructed by making progressive additions to a simple triangle. The additions are made by dividing the equilateral triangle’s sides into thirds, then creating a new triangle on each middle third. Thus, each frame shows more complexity, but every new triangle in the design looks exactly like the initial one. This reflection of the larger design in its smaller details is characteristic of all fractals.
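
The construction is easy to quantify: each step replaces every side with four sides one-third as long. The following sketch (illustrative Python, with names of our own choosing) tracks the number of sides and their length to show how the perimeter grows without bound:

# Each construction step replaces every side of the figure with 4 sides,
# each one-third the length, so the perimeter grows by a factor of 4/3.
def koch_perimeter(side_length, steps):
    sides, length = 3, side_length  # start from an equilateral triangle
    for _ in range(steps):
        sides *= 4
        length /= 3
    return sides * length

for n in range(5):
    print(n, koch_perimeter(1.0, n))
# The perimeter (3, 4, 16/3, ...) keeps increasing as steps are added.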

Julia Set

The fractal shown here is the graphical representation of a mathematical function called the Julia set. The set is named after French mathematician Gaston Julia, who worked on the mathematics of fractals early in the 20th century, before the term “fractal” was coined by Polish-born French mathematician Benoit Mandelbrot in 1975. The pattern of the whole shape in a fractal repeats itself on smaller and smaller scales, so that magnifying a fractal produces a shape that is similar to the original.

Mandelbrot Set

Polish-born French mathematician Benoit Mandelbrot coined the term “fractal” to describe complex geometric shapes that, when magnified, continue to resemble the shape’s larger structure. This property, in which the pattern of the whole repeats itself on smaller and smaller scales, is called self similarity. The fractal shown here, called the Mandelbrot set, is the graphical representation of a mathematical function.
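
Membership in the Mandelbrot set can be tested numerically: a point c belongs to the set if iterating z → z² + c from z = 0 never drives |z| beyond 2. A minimal escape-time sketch in Python (the function name and iteration cap are our own choices):

# A point c belongs to the Mandelbrot set if iterating z -> z**2 + c
# from z = 0 never drives |z| beyond 2.
def in_mandelbrot(c, max_iterations=100):
    z = 0
    for _ in range(max_iterations):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

print(in_mandelbrot(0))           # True: the orbit stays at 0
print(in_mandelbrot(-1))          # True: the orbit cycles between -1 and 0
print(in_mandelbrot(1))           # False: 0, 1, 2, 5, ... escapes
print(in_mandelbrot(0.3 + 0.5j))  # a point near the boundary of the set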

Prism

Prism, in geometry, three-dimensional solid whose bases lie in two parallel planes. The faces of the prism in these planes are congruent polygons. The lateral faces of the prism are parallelograms (see Fig. 1). The intersections of the lateral faces, called the lateral edges, are parallel to each other. A prism is called a right prism if the lateral edges are perpendicular to the bases; if they are not, it is called an oblique prism. A prism is triangular, square, and so on, according as its bases are triangles, squares, or some other geometrical figure. A parallelopiped is a prism that has parallelograms as the bases; a rectangular parallelopiped, or box, is one in which all six faces (four lateral faces and two bases) are rectangles (see Fig. 2); in a cube all six faces are squares (see Fig. 3).

The altitude of a prism is the perpendicular distance between the planes of the bases. A truncated prism is that portion of a prism between a base and a section formed by a plane not parallel to the base but cutting all lateral edges (see Fig. 4). The volume, V, of a prism is given by the area, B, of a base multiplied by the altitude, h; in symbols, V = Bh. If a, b, c are the lengths of three edges of a rectangular parallelopiped that meet at one vertex (the length, width, and depth of a box), the volume is given by V = abc. In particular, if a is the length of one of the twelve equal edges of a cube, the volume of the cube is V = a³ and the total surface area, S, of the cube is S = 6a².
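
These formulas are simple enough to check in a few lines of Python (the function name is our own):

# Volume of any prism: base area times altitude.
def prism_volume(base_area, height):
    return base_area * height

# A 3 x 4 x 5 box is a prism with a 3 x 4 base and altitude 5.
print(prism_volume(3 * 4, 5))  # 60, the same as V = abc

# Cube of edge a: volume a**3, total surface area 6 * a**2.
a = 2
print(a ** 3, 6 * a ** 2)  # 8 24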

Prisms: Figures 1-4

Prisms are commonly used to bend the path of light in devices such as binoculars and periscopes. White light passed through a glass prism will divide into the colors of the spectrum. In geometry, prisms are three-dimensional solids in which the bases lie in two parallel planes, and the faces in these planes are congruent polygons. As shown in Figure 1, the lateral faces of the prism are parallelograms. Figure 2 depicts a prism in which all six faces are rectangles, a special case of a prism called a rectangular parallelopiped. Figure 3 is also a parallelopiped, but each of its faces is a square, forming a cube. A truncated prism, shown in Figure 4, is a portion of a prism formed by cutting all lateral edges with a plane not parallel to the base.

Separation of White Light into Colored Light

Light that contains many colors, such as sunlight, appears white. When white light passes through a prism-shaped transparent block, the prism separates the light into a spectrum of different colors. The prism separates the light by refracting, or bending, light of different colors at different angles. Rays of red light bend the least and rays of violet light bend the most.

Circle

Circle, in geometry, plane curve such that each point on the curve is the same distance from a fixed point. This point is called the center of the circle. The circle belongs to the class of curves known as conic sections because a circle can be described as the intersection of a right circular cone with a plane that is perpendicular to the axis of the cone (see Geometry: Conic Sections).

Any line segment that passes through the center and is terminated by the circle is called a diameter of the circle. A radius is a line segment from the center of the circle to a point on the circle. A chord is any straight-line segment whose endpoints lie on the circle. An arc of a circle is a portion lying between two points on the circle. A central angle is an angle with its vertex at the center of the circle and with sides forming radii of the circle. A central angle is subtended by the arc that lies between the points at which the central angle's sides intersect the circle.

Of all plane figures having the same perimeter, the circle has the greatest area. The ratio of the circumference to the diameter of a circle is a constant designated by the symbol π, or pi. Pi is one of the most important mathematical constants and plays a role in many calculations and proofs in mathematics, physics, engineering, and other sciences. Pi is approximately 3.141592, although 3.1416 and even 3 are sufficiently accurate for ordinary purposes. The Greek mathematician Archimedes described the value of π as lying between 3 10/71 and 3 1/7.

The center of a circle is a point of symmetry, and any diameter of a circle is an axis of symmetry. Concentric circles—that is, circles having different perimeters but the same center—never intersect. The area of a circle is equal to π multiplied by the square of the circle's radius (A = πr²). An arc of a circle is proportional to the central angle that subtends it, and conversely; this property forms the basis of angular measure. There are 360° in a circle.
