
Jones D.M. The new C standard (C90 and C++). An economic and cultural commentary. 2005



105 Their characteristics define and constrain the results of executing conforming C programs constructed according to the syntactic and semantic rules for conforming implementations.

C++

The C++ Standard makes no such observation.

5.1 Conceptual models

106 Forward references: In this clause, only a few of many possible forward references have been noted.

C++

In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.5)

This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.

1.9p1

Footnote 5

5.1.1 Translation environment

5.1.1.1 Program structure

108 The text of the program is kept in units called source files, (or preprocessing files) in this International Standard.

C90

The term preprocessing files is new in C99.

C++

The C++ Standard follows the wording in C90 and does not define the term preprocessing files.

109 A source file together with all the headers and source files included via the preprocessing directive #include is known as a preprocessing translation unit.

C90

The term preprocessing translation unit is new in C99.

C++

Like C90, the C++ Standard does not define the term preprocessing translation unit.

110 After preprocessing, a preprocessing translation unit is called a translation unit.

C90


September 2, 2005

v 1.0b

A source file together with all the headers and source files included via the preprocessing directive #include, less any source lines skipped by any of the conditional inclusion preprocessing directives, is called a translation unit.

This definition differs from C99 in that it does not specify whether macro definitions are part of a translation unit.

C++

The C++ Standard, 2p1, contains the same wording as C90.

5.1.1.2 Translation phases

115 The precedence among the syntax rules of translation is specified by the following phases.5)

C++


C++ has nine translation phases. An extra phase has been inserted between what are called phases 7 and 8 in C. This additional phase is needed to handle templates, which are not supported in C. The C++ Standard specifies what the C Rationale calls model A.

116 1. Physical source file multibyte characters are mapped, in an implementation-defined manner, to the source character set (introducing new-line characters for end-of-line indicators) if necessary.

C90

In C90 the source file contains characters (the 8-bit kind), not multibyte characters.

C++

2.1p1

1. Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. . . . Any source file character not in the basic source character set (2.2) is replaced by the universal-character-name that designates that character.

#define mkstr(s) #s

char *dollar = mkstr($); // The string "\u0024" is assigned
                         /* The string "$", if that character is supported */

Rationale

The C++ Committee defined its Standard in terms of model A, just because that was the clearest to specify (used the fewest hypothetical constructs) because the basic source character set is a well-defined finite set.

The situation is not the same for C given the already existing text for the standard, which allows multibyte characters to appear almost anywhere (the most notable exception being in identifiers), and given the more low-level (or close to the metal) nature of some uses of the language.

Therefore, the C committee agreed in general that model B, keeping UCNs and native characters until as late as possible, is more in the “spirit of C” and, while probably more difficult to specify, is more able to encompass the existing diversity. The advantage of model B is also that it might encompass more programs and users’ intents than the two others, particularly if shift states are significant in the source text as is often the case in East Asia.


In any case, translation phase 1 begins with an implementation-defined mapping; and such mapping can choose to implement model A or C (but the implementation must document it). As a by-product, a strictly conforming program cannot rely on the specifics handled differently by the three models: examples of non-strict conformance include handling of shift states inside strings and calls like fopen("\\ubeda\\file.txt","r") and #include "sys\udefault.h". Shift states are guaranteed to be handled correctly, however, as long as the implementation performs no mapping at the beginning of phase 1; and the two specific examples given above can be made much more portable by rewriting these as fopen("\\" "ubeda\\file.txt", "r") and #include "sys/udefault.h".

118 2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines.

C++


The first sentence of 2.1p2 is the same as C90.

The following sentence is not in the C Standard:

2.1p2

If, as a result, a character sequence that matches the syntax of a universal-character-name is produced, the behavior is undefined.

#include <stdio.h>

int \u1F\
5F; // undefined behavior
    /* defined behavior */
void f(void)
{
printf("\\u0123"); /* No UCNs. */
printf("\\u\
0123"); /* same as above, no UCNs */
// undefined, character sequence that matches a UCN created
}

119 Only the last backslash on any physical source line shall be eligible for being part of such a splice.

C90

This fact was not explicitly specified in the C90 Standard.

C++

The C++ Standard uses the wording from C90.
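The effect of phase 2 splicing can be sketched in C as follows. This is an illustrative fragment only; the names SPLICED_GREETING, spliced_literal, and spliced_length are hypothetical and do not come from either standard.

```c
#include <stddef.h>
#include <string.h>

/* A macro definition continued across two physical source lines: the
   backslash/new-line pair is deleted in translation phase 2, so the
   preprocessor sees one logical source line. */
#define SPLICED_GREETING "hello, " \
    "world"

/* A string literal split in the middle of the token. The continuation
   starts in the first column, because any leading white space on the
   second physical line would become part of the literal. */
static const char spliced_literal[] = "abc\
def";

/* Hypothetical helper: length of the literal after splicing. */
size_t spliced_length(void)
{
    return strlen(spliced_literal);
}
```

After splicing, spliced_literal holds the six characters of "abcdef"; the two adjacent literals in SPLICED_GREETING are joined later, in phase 6, not by the splice itself.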

121 A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.


C90

The wording, “ . . . before any such splicing takes place.”, is new in C99.

127 4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed.

C90

Support for the _Pragma unary operator is new in C99.


C++

Support for the _Pragma unary operator is new in C99 and is not available in C++.

128 If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined.

C90

Support for universal character names is new in C99.

C++

In C++ universal character names are only processed during translation phase 1. Character sequences created during subsequent phases of translation, which might be interpreted as a universal character name, are not interpreted as such by a translator.

130 All preprocessing directives are then deleted.

C++

This explicit requirement was added in C99 and is not stated in the C++ Standard.

132 if there is no corresponding member, it is converted to an implementation-defined member other than the null (wide) character.7)

C90

The C90 Standard did not contain this statement. It was added in C99 to handle the fact that the UCN notation supports the specification of numeric values that may not represent any specified (by ISO 10646) character.

C++

2.2p3

The values of the members of the execution character sets are implementation-defined, and any additional members are locale-specific.

C++ handles implementation-defined character members during translation phase 1.

133 6. Adjacent string literal tokens are concatenated.

C90

6. Adjacent character string literal tokens are concatenated and adjacent wide string literal tokens are concatenated.

It was a constraint violation to concatenate the two types of string literals together in C90. Character and wide string literals are treated on the same footing in C99.

The introduction of the macros for format specifiers in C99 created the potential need to support the concatenation of character string literals with wide string literals. These macros are required to expand to character string literals. A program that wanted to use them in a format specifier, containing wide character string literals, would be unable to do so without this change of specification.
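The motivating case can be sketched as below, assuming a C99 implementation that provides <inttypes.h> and swprintf. The function name format_value is hypothetical and chosen for illustration only.

```c
#include <inttypes.h>
#include <stddef.h>
#include <wchar.h>

/* Formats v into buf as a wide string.
   PRId32 expands to a character string literal (for example "d").
   In C99 it may sit between two wide string literals; phase 6
   concatenates all three into a single wide string literal. */
int format_value(wchar_t *buf, size_t n, int32_t v)
{
    return swprintf(buf, n, L"value = %" PRId32 L"\n", v);
}
```

Under the C90 rules the mixed concatenation in the format argument would not have been usable, which is the situation the changed wording addresses.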

 

 

 

 

137 8. All external object and function references are resolved.

C++

The C translation phase 8 is numbered as translation phase 9 in C++ (in C++, translation phase 8 specifies the instantiation of templates).

143 7) An implementation need not convert all non-corresponding source characters to the same execution character.

C++

The C++ Standard specifies that the conversion is implementation-defined (2.1p1, 2.13.2p5) and does not explicitly specify this special case.

5.1.1.3 Diagnostics

144 A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined.

C++


1.4p2

— If a program contains a violation of any diagnosable rule, a conforming implementation shall issue at least one diagnostic message, except that

— If a program contains a violation of a rule for which no diagnostic is required, this International Standard places no requirement on implementations with respect to that program.

If a program contains “a violation of a rule for which no diagnostic is required” (for instance, on line 1) followed by “a violation of any diagnosable rule” (for instance, on line 2), a C++ translator is not required to issue a diagnostic message.

5.1.2 Execution environments

149 All objects with static storage duration shall be initialized (set to their initial values) before program startup.

C++

In C++ the storage occupied by any object of static storage duration is first zero-initialized at program startup (3.6.2p1, 8.5), before any other initialization takes place. The storage is then initialized by any nonzero values. C++ permits static storage duration objects to be initialized using nonconstant values (not supported in C). The order of initialization is the textual order of the definitions in the source code, within a single translation unit. However, there is no defined order across translation units. Because C requires the values used to initialize objects of static storage duration to be constant, there are no initializer ordering dependencies.
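The C side of this comparison can be sketched as follows. The identifiers call_count, table, table_len, next_call, and get_table_len are hypothetical; the fragment only illustrates that every initializer shown is a constant expression, so no ordering question arises.

```c
#include <stddef.h>

/* No explicit initializer: the object is set to zero before
   program startup. */
static int call_count;

/* Constant-expression initializers, the only kind C accepts for
   objects with static storage duration. */
static const int table[] = { 1, 2, 3, 4 };
static size_t table_len = sizeof table / sizeof table[0];

/* An initializer such as
       static int bad = next_call();
   is a constraint violation in C; C++ accepts it and initializes
   the object dynamically. */

int next_call(void)
{
    return ++call_count;
}

size_t get_table_len(void)
{
    return table_len;
}
```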

151 Program termination returns control to the execution environment.

C++

3.6.1p1

[Note: in a freestanding environment, start-up and termination is implementation defined;

3.6.1p5

A return statement in main has the effect of leaving the main function (destroying . . . duration) and calling exit with the return value as the argument.

18.3p8

The function exit() has additional behavior in this International Standard:

. . . Finally, control is returned to the host environment.

5.1.2.1 Freestanding environment

5.1.2.2 Hosted environment

156 A hosted environment need not be provided, but shall conform to the following specifications if present.

C++

1.4p7

For a hosted implementation, this International Standard defines the set of available libraries.

17.4.1.3p1

For a hosted implementation, this International Standard describes the set of available headers.

158 Of course, an implementation is free to produce any number of diagnostics as long as a valid program is still correctly translated.

C++

The C++ Standard does not explicitly give this permission. However, producing diagnostic messages that the C++ Standard does not require to be generated might be regarded as an extension, and these are explicitly permitted (1.4p8).

5.1.2.2.1 Program startup

164 or equivalent;9)

C++

The C++ Standard gives no such explicit permission.

165 or in some other implementation-defined manner.

C90

Support for this latitude is new in C99.

C++

The C++ Standard explicitly gives permission for an implementation to define this function using different parameter types, but it specifies that the return type is int.

3.6.1p2


It shall have a return type of int, but otherwise its type is implementation-defined.

. . .

[Note: it is recommended that any further (optional) parameters be added after argv. ]

170 The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment.

C++

The C++ Standard does not specify any intent behind its support for this functionality.

171 If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.

C++

The C++ Standard is silent on this issue.

175 — The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.


C++

The C++ Standard is silent on this issue.

5.1.2.2.2 Program execution

177 9) Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char ** argv, and so on.

C++

The C++ Standard does not make this observation.


5.1.2.2.3 Program termination

178 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument;10)

C90

Support for a return type of main other than int is new in C99.

C++

The C++ wording is essentially the same as C90.

179 reaching the } that terminates the main function returns a value of 0.

C90

This requirement is new in C99.

If the main function executes a return that specifies no value, the termination status returned to the host environment is undefined.

180 If the return type is not compatible with int, the termination status returned to the host environment is unspecified.

C90

Support for main returning a type that is not compatible with int is new in C99.

C++

3.6.1p2

It shall have a return type of int, . . .

Like C90, C++ does not support main having any return type other than int.

5.1.2.3 Program execution

187 In the abstract machine, all expressions are evaluated as specified by the semantics.

C++

The C++ Standard specifies no such requirement.

189 When the processing of the abstract machine is interrupted by receipt of a signal, only the values of objects as of the previous sequence point may be relied on.

C++

1.9p9

When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects with type other than volatile sig_atomic_t are unspecified, and the value of any object not of volatile sig_atomic_t that is modified by the handler becomes undefined.

This additional wording closely follows that given in the description of the signal function in the library clause of the C Standard.
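A minimal sketch of the conservative pattern this wording points to is given below: the handler writes only to an object of type volatile sig_atomic_t. The names got_signal, on_sigint, and raise_and_check are hypothetical, and SIGINT is chosen only because every hosted implementation defines it.

```c
#include <signal.h>

/* The one kind of object whose value is reliable after a handler
   has stored into it. */
static volatile sig_atomic_t got_signal = 0;

static void on_sigint(int signum)
{
    (void)signum;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT synchronously, and reports
   whether the handler ran. Returns -1 if either library call fails. */
int raise_and_check(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return got_signal;
}
```

Because raise does not return until the handler has returned, the flag can be read immediately afterwards; a signal delivered asynchronously gives no such ordering for objects of other types.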

190 Objects that may be modified between the previous sequence point and the next sequence point need not have received their correct values yet.

C++

The C++ Standard does not make this observation.

193 10) In accordance with 6.2.4, the lifetimes of objects with automatic storage duration declared in main will have ended in the former case, even where they would not have in the latter.

C90

This footnote did not appear in the C90 Standard and was added by the response to DR #085.

194 11) The IEC 60559 standard for binary floating-point arithmetic requires certain user-accessible status flags and control modes.

C90

The dependence on this floating-point format is new in C99, but support for the format is still not required.

C++

The C++ Standard does not make these observations about IEC 60559.


195 Floating-point operations implicitly set the status flags;

C++

The C++ Standard does not say anything about status flags in the context of side effects. However, if a C++ implementation supports IEC 60559 (i.e., is_iec559 is true, 18.2.1.2p52) then floating-point operations will implicitly set the status flags (as required by that standard).

196 modes affect result values of floating-point operations.

C++

The C++ Standard does not say anything about floating-point modes in the context of side effects.

197 Implementations that support such floating-point state are required to regard changes to it as side effects—see annex F for details.

C++

The C++ Standard does not specify any such requirement.

198 The floating-point environment library <fenv.h> provides a programming facility for indicating when these side effects matter, freeing the implementations in other cases.

C90

Support for <fenv.h> is new in C99.

C++

Support for <fenv.h> is new in C99, and there is no equivalent library header specified in the C++ Standard.

199 — At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.

C++

1.9p11

— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

The C++ Standard is technically more accurate in recognizing that the output of a conforming program may vary, if it contains unspecified behavior.

207 EXAMPLE 4 Implementations employing wide registers have to take care to honor appropriate semantics. Values are independent of whether they are represented in a register or in memory. For example, an implicit spilling of a register is not permitted to alter the value. Also, an explicit store and load is required to round to the precision of the storage type. In particular, casts and assignments are required to perform their specified conversion. For the fragment

double d1, d2; float f;

d1 = f = expression;

d2 = (float) expression;

the values assigned to d1 and d2 are required to have been converted to float.

C90

This example is new in C99.
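The fragment in EXAMPLE 4 can be wrapped in two small functions to make the required conversion observable. The function names are hypothetical, and the comparison values below assume that float has less precision than double, as in IEC 60559 single and double formats.

```c
/* d1 = f = expression: the value stored in d1 has passed through
   an object of type float. */
double assign_through_float(double expression)
{
    double d1;
    float f;

    d1 = f = expression;
    return d1;
}

/* d2 = (float) expression: the cast performs the same conversion,
   even if the value never leaves a wide register. */
double cast_through_float(double expression)
{
    double d2;

    d2 = (float) expression;
    return d2;
}
```

With an argument such as 0.1, which is not exactly representable in either format, both functions return the float approximation widened back to double rather than the original double value.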


208 EXAMPLE 5 Rearrangement for floating-point expressions is often restricted because of limitations in precision as well as range. The implementation cannot generally apply the mathematical associative rules for addition or multiplication, nor the distributive rule, because of roundoff error, even in the absence of overflow and underflow. Likewise, implementations cannot generally replace decimal constants in order to rearrange expressions. In the following fragment, rearrangements suggested by mathematical rules for real numbers are often not valid (see F.8).

double x, y, z;
/* ... */
x = (x * y) * z; // not equivalent to x *= y * z;
z = (x - y) + y ; // not equivalent to z = x;
z = x + x * y; // not equivalent to z = x * (1.0 + y);
y = x / 5.0; // not equivalent to y = x * 0.2;

C90

This example is new in C99.
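Two of the rearrangements in EXAMPLE 5 can be checked with concrete values. The sketch below assumes IEC 60559 double arithmetic evaluated without excess precision and without value-changing optimization options; the function names are hypothetical.

```c
/* (x - y) + y, evaluated as written. */
double sub_then_add(double x, double y)
{
    return (x - y) + y;
}

/* x / 5.0, evaluated as written. */
double div_by_five(double x)
{
    return x / 5.0;
}

/* The tempting replacement x * 0.2; the constant 0.2 is not exactly
   representable in binary, so the product can differ from the quotient. */
double mul_by_fifth(double x)
{
    return x * 0.2;
}
```

Under those assumptions sub_then_add(1.0, 1e100) yields 0.0 rather than 1.0, because 1.0 is lost when it is subtracted from a value of that magnitude, and div_by_five(3.0) differs from mul_by_fifth(3.0) in the last bit.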

209 EXAMPLE 6 To illustrate the grouping behavior of expressions, in the following fragment

int a, b;
/* ... */
a = a + 32760 + b + 5;

the expression statement behaves exactly the same as

a = (((a + 32760) + b) + 5);

due to the associativity and precedence of these operators. Thus, the result of the sum (a + 32760) is next added to b, and that result is then added to 5 which results in the value assigned to a. On a machine in which overflows produce an explicit trap and in which the range of values representable by an int is [-32768, +32767], the implementation cannot rewrite this expression as

a = ((a + b) + 32765);

since if the values for a and b were, respectively, -32754 and -15, the sum a + b would produce a trap while the original expression would not; nor can the expression be rewritten either as

a = ((a + 32765) + b);

or

a = (a + (b + 32765));

since the values for a and b might have been, respectively, 4 and -8 or -17 and 12. However, on a machine in which overflow silently generates some value and where positive and negative overflows cancel, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur.

C90

The C90 Standard used the term exception rather than trap.
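The trapping machine of EXAMPLE 6 can be simulated on a host with a wider int. The sketch below is an illustration only: add16, TRAP_MIN, TRAP_MAX, original_traps, and regrouped_traps are hypothetical names, and a "trap" is modeled by setting a flag instead of stopping the program.

```c
#include <stdbool.h>

#define TRAP_MIN (-32768L)
#define TRAP_MAX 32767L

/* Adds a and b the way the example's machine would: records a trap
   if the mathematical sum leaves the range [-32768, +32767]. */
static long add16(long a, long b, bool *trapped)
{
    long sum = a + b;

    if (sum < TRAP_MIN || sum > TRAP_MAX)
        *trapped = true;
    return sum;
}

/* ((a + 32760) + b) + 5: the grouping the Standard requires. */
bool original_traps(int a, int b)
{
    bool trapped = false;

    (void)add16(add16(add16(a, 32760, &trapped), b, &trapped), 5, &trapped);
    return trapped;
}

/* (a + b) + 32765: the first rewrite the example rules out. */
bool regrouped_traps(int a, int b)
{
    bool trapped = false;

    (void)add16(add16(a, b, &trapped), 32765, &trapped);
    return trapped;
}
```

For a equal to -32754 and b equal to -15 the original grouping stays in range at every step (6, then -9, then -4), whereas the regrouped form starts with a + b, which is -32769 and would trap.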

5.2 Environmental considerations

5.2.1 Character sets

212 Two sets of characters and their associated collating sequences shall be defined: the set in which source files are written (the source character set), and the set interpreted in the execution environment (the execution character set).

C90

The C90 Standard did not explicitly define the terms source character set and execution character set.
