A.7.1 Declarators
See §4.9.1, Chapter 5 (pointers and arrays), §7.7 (pointers to functions), and §15.5 (pointers to
members).
init-declarator-list:
init-declarator
init-declarator-list , init-declarator
init-declarator:
declarator initializeropt
declarator:
direct-declarator
ptr-operator declarator
direct-declarator:
declarator-id
direct-declarator ( parameter-declaration-clause ) cv-qualifier-seqopt exception-specificationopt
direct-declarator [ constant-expressionopt ]
( declarator )
ptr-operator:
* cv-qualifier-seqopt
&
::opt nested-name-specifier * cv-qualifier-seqopt
cv-qualifier-seq:
cv-qualifier cv-qualifier-seqopt
cv-qualifier:
const
volatile


declarator-id:
::opt id-expression
::opt nested-name-specifieropt type-name
type-id:
type-specifier-seq abstract-declaratoropt
type-specifier-seq:
type-specifier type-specifier-seqopt
abstract-declarator:
ptr-operator abstract-declaratoropt
direct-abstract-declarator
direct-abstract-declarator:
direct-abstract-declaratoropt ( parameter-declaration-clause ) cv-qualifier-seqopt exception-specificationopt
direct-abstract-declaratoropt [ constant-expressionopt ]
( abstract-declarator )
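As a rough illustration of how the ptr-operator and direct-declarator productions combine, the sketch below declares one entity per form. All names (counter, table, twice, Point) are invented for this example and are not part of the grammar.

```cpp
int counter = 0;                     // declarator-id followed by an initializer
int* p = &counter;                   // ptr-operator * applied to the declarator p
int& r = counter;                    // ptr-operator &
int table[3] = { 1, 2, 3 };          // direct-declarator [ constant-expression ]

int twice(int x) { return 2 * x; }   // direct-declarator ( parameter-declaration-clause )
int (*pf)(int) = &twice;             // ( declarator ): the parentheses bind * to pf

struct Point { int x; int y; };      // hypothetical class, used only for the member pointer
int Point::* pm = &Point::y;         // nested-name-specifier * : pointer to member
```

Without the parentheses, int *pf(int) would declare a function returning int* rather than a pointer to function.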

The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.



parameter-declaration-clause:
parameter-declaration-listopt ...opt
parameter-declaration-list , ...
parameter-declaration-list:
parameter-declaration
parameter-declaration-list , parameter-declaration
parameter-declaration:
decl-specifier-seq declarator
decl-specifier-seq declarator = assignment-expression
decl-specifier-seq abstract-declaratoropt
decl-specifier-seq abstract-declaratoropt = assignment-expression

function-definition:
decl-specifier-seqopt declarator ctor-initializeropt function-body
decl-specifier-seqopt declarator function-try-block
function-body:
compound-statement
initializer:
= initializer-clause
( expression-list )
initializer-clause:
assignment-expression
{ initializer-list ,opt }
{ }
initializer-list:
initializer-clause
initializer-list , initializer-clause

A volatile specifier is a hint to a compiler that an object may change its value in ways not specified by the language so that aggressive optimizations must be avoided. For example, a real time clock might be declared:
extern const volatile int clock;
Two successive reads of clock might give different results.

A.8 Classes
See Chapter 10.
class-name:
identifier
template-id
class-specifier:
class-head { member-specificationopt }


class-head:
class-key identifieropt base-clauseopt
class-key nested-name-specifier identifier base-clauseopt
class-key nested-name-specifier template template-id base-clauseopt

class-key:
class
struct
union
member-specification:
member-declaration member-specificationopt
access-specifier : member-specificationopt
member-declaration:
decl-specifier-seqopt member-declarator-listopt ;
function-definition ;opt
::opt nested-name-specifier templateopt unqualified-id ;
using-declaration
template-declaration
member-declarator-list:
member-declarator
member-declarator-list , member-declarator
member-declarator:
declarator pure-specifieropt
declarator constant-initializeropt
identifieropt : constant-expression
pure-specifier:
= 0
constant-initializer:
= constant-expression

To preserve C compatibility, a class and a non-class of the same name can be declared in the same
scope (§5.7). For example:
struct stat { /* ... */ };
int stat(char* name, struct stat* buf);
In this case, the plain name (stat) is the name of the non-class. The class must be referred to using a class-key prefix.
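The rule can be tried out with a small sketch. The namespace demo, the member size, and the body of the function are assumptions made for illustration; they only keep the example apart from any system header that declares its own stat.

```cpp
namespace demo {                        // hypothetical namespace for this sketch
    struct stat { int size; };          // the class

    // From here on, the plain name stat refers to this function.
    int stat(const char* name, struct stat* buf)
    {
        buf->size = (name && *name) ? 1 : 0;
        return 0;
    }

    int use()
    {
        struct stat s;                  // the class needs the class-key prefix
        s.size = -1;
        int rc = stat("file", &s);      // the plain name is the function
        return rc == 0 ? s.size : -1;
    }
}
```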
Constant expressions are defined in §C.5.
A.8.1 Derived Classes
See Chapter 12 and Chapter 15.
base-clause:
: base-specifier-list


base-specifier-list:
base-specifier
base-specifier-list , base-specifier
base-specifier:
::opt nested-name-specifieropt class-name
virtual access-specifieropt ::opt nested-name-specifieropt class-name
access-specifier virtualopt ::opt nested-name-specifieropt class-name
access-specifier:
private
protected
public

A.8.2 Special Member Functions
See §11.4 (conversion operators), §10.4.6 (class member initialization), and §12.2.2 (base initialization).
conversion-function-id:
operator conversion-type-id
conversion-type-id:
type-specifier-seq conversion-declaratoropt
conversion-declarator:
ptr-operator conversion-declaratoropt
ctor-initializer:
: mem-initializer-list
mem-initializer-list:
mem-initializer
mem-initializer , mem-initializer-list
mem-initializer:
mem-initializer-id ( expression-listopt )
mem-initializer-id:
::opt nested-name-specifieropt class-name
identifier
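A minimal sketch of these productions in use, assuming two invented classes (Base and Celsius): the constructor carries a mem-initializer-list that names both a base class and a member, and the class declares one conversion function.

```cpp
struct Base {
    int id;
    explicit Base(int i) : id(i) {}                       // mem-initializer-id is an identifier
};

struct Celsius : Base {
    double degrees;
    Celsius(int i, double d) : Base(i), degrees(d) {}     // class-name, then identifier
    operator double() const { return degrees; }           // conversion-function-id: operator double
};
```

Note that a conversion function has no declared return type; the conversion-type-id after operator plays that role.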

A.8.3 Overloading
See Chapter 11.
operator-function-id:
operator operator

operator: one of
new  delete  new[]  delete[]
+    -    *    /    %    ^    &    |    ~
!    =    <    >    +=   -=   *=   /=   %=
^=   &=   |=   <<   >>   >>=  <<=  ==   !=
<=   >=   &&   ||   ++   --   ,    ->*  ->
()   []

A.9 Templates
Templates are explained in Chapter 13 and §C.13.
template-declaration:

exportopt template < template-parameter-list > declaration
template-parameter-list:
template-parameter
template-parameter-list , template-parameter
template-parameter:
type-parameter
parameter-declaration
type-parameter:
class identifieropt
class identifieropt = type-id
typename identifieropt
typename identifieropt = type-id
template < template-parameter-list > class identifieropt
template < template-parameter-list > class identifieropt = template-name
template-id:
template-name < template-argument-listopt >
template-name:
identifier
template-argument-list:
template-argument
template-argument-list , template-argument
template-argument:
assignment-expression
type-id
template-name
explicit-instantiation:
template declaration
explicit-specialization:
template < > declaration


The explicit template argument specification opens up the possibility of an obscure syntactic ambiguity. Consider:
void h()
{
    f<1>(0); // ambiguity: ((f)<1) > (0) or (f<1>)(0) ?
             // resolution: f<1> is called with argument 0
}

The resolution is simple and effective: if f is a template name, f< is the beginning of a qualified template name and the subsequent tokens must be interpreted based on that; otherwise, < means less-than. Similarly, the first non-nested > terminates a template argument list. If a greater-than is needed, parentheses must be used:
f< a>b >(0);   // syntax error
f< (a>b) >(0); // ok

A similar lexical ambiguity can occur when terminating >s get too close. For example:
list<vector<int>> lv1;   // syntax error: unexpected >> (right shift)
list< vector<int> > lv2; // correct: list of vectors

Note the space between the two >s; >> is the right-shift operator. That can be a real nuisance.
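The two resolutions can be checked with a small sketch. The function template f and the constants a and b are hypothetical, chosen only to make the calls compile. C++11 and later standards accept >> as closing two template argument lists, so the space is needed only for the language described here.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical function template, for illustration only.
template<int N> int f(int x) { return x + N; }

const int a = 2;
const int b = 1;

int call_plain()   { return f<1>(0); }        // f is a template name, so < opens an argument list
int call_greater() { return f< (a>b) >(0); }  // parentheses protect the greater-than; (a>b) converts to 1

std::size_t nested_size()
{
    std::list< std::vector<int> > lv;         // space between the two >s
    lv.push_back(std::vector<int>(3));
    return lv.front().size();
}
```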

A.10 Exception Handling
See §8.3 and Chapter 14.
try-block:
try compound-statement handler-seq
function-try-block:
try ctor-initializeropt function-body handler-seq
handler-seq:
handler handler-seqopt
handler:
catch ( exception-declaration ) compound-statement
exception-declaration:
type-specifier-seq declarator
type-specifier-seq abstract-declarator

type-specifier-seq
...
throw-expression:
throw assignment-expressionopt
exception-specification:
throw ( type-id-listopt )
type-id-list:
type-id
type-id-list , type-id
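The sketch below exercises most of these productions. The functions classify and guarded are invented for the example. The throw ( type-id-list ) exception-specification is left out because later C++ standards removed it.

```cpp
#include <stdexcept>

int classify(int v)
{
    try {                                     // try-block: try compound-statement handler-seq
        if (v < 0) throw std::out_of_range("negative");
        if (v == 0) throw 0;                  // throw-expression with an assignment-expression
        return 1;
    }
    catch (const std::out_of_range&) {        // type-specifier-seq abstract-declarator
        return -1;
    }
    catch (...) {                             // the ... form catches anything else
        return 0;
    }
}

int guarded(int v)
try {                                         // function-try-block
    if (v > 100) throw std::out_of_range("too large");
    return v;
}
catch (const std::out_of_range& e) {          // type-specifier-seq declarator
    return e.what() ? 100 : 0;                // clamp to 100 when the range check fails
}
```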


A.11 Preprocessing Directives
The preprocessor is a relatively unsophisticated macro processor that works primarily on lexical
tokens rather than individual characters. In addition to the ability to define and use macros (§7.8),
the preprocessor provides mechanisms for including text files and standard headers (§9.2.1) and
conditional compilation based on macros (§9.3.3). For example:
#if OPT==4
#include "header4.h"
#elif 0<OPT
#include "someheader.h"
#else
#include<cstdlib>
#endif

All preprocessor directives start with a #, which must be the first non-whitespace character on its
line.
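To see which branch of such a conditional is taken without needing the headers themselves, the sketch below replaces each #include with a constant. The default value given to OPT and the name selected are assumptions for this example; in practice OPT would normally be supplied by the build.

```cpp
#ifndef OPT
#define OPT 4                // assumption: normally supplied by the build
#endif

#if OPT == 4
const int selected = 4;      // stands in for #include "header4.h"
#elif 0 < OPT
const int selected = 1;      // stands in for #include "someheader.h"
#else
const int selected = 0;      // stands in for #include <cstdlib>
#endif
```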
preprocessing-file:
groupopt
group:
group-part
group group-part
group-part:
pp-tokensopt new-line
if-section
control-line
if-section:
if-group elif-groupsopt else-groupopt endif-line
if-group:
# if constant-expression new-line groupopt
# ifdef identifier new-line groupopt
# ifndef identifier new-line groupopt
elif-groups:
elif-group
elif-groups elif-group

elif-group:
# elif constant-expression new-line groupopt
else-group:
# else new-line groupopt
endif-line:
# endif new-line


control-line:
# include pp-tokens new-line
# define identifier replacement-list new-line
# define identifier lparen identifier-listopt ) replacement-list new-line
# undef identifier new-line
# line pp-tokens new-line
# error pp-tokensopt new-line
# pragma pp-tokensopt new-line
# new-line
lparen:
the left-parenthesis character without preceding white-space

replacement-list:
pp-tokensopt
pp-tokens:
preprocessing-token
pp-tokens preprocessing-token
new-line:
the new-line character

Appendix B

Compatibility
You go ahead and follow your customs,
and I'll follow mine.
– C. Napier

C/C++ compatibility — silent differences between C and C++ — C code that is not C++
— deprecated features — C++ code that is not C — coping with older C++ implementations — headers — the standard library — namespaces — allocation errors — templates

— for-statement initializers — advice — exercises.

B.1 Introduction
This appendix discusses the incompatibilities between C and C++ and between Standard C++ (as
defined by ISO/IEC 14882) and earlier versions of C++. The purpose is to document differences
that can cause problems for the programmer and point to ways of dealing with such problems.
Most compatibility problems surface when people try to upgrade a C program to a C++ program,
try to port a C++ program from one pre-standard version of C++ to another, or try to compile C++
using modern features with an older compiler. The aim here is not to drown you in the details of
every compatibility problem that ever surfaced in an implementation, but rather to list the most frequently occurring problems and present their standard solutions.
When you look at compatibility issues, a key question to consider is the range of implementations under which a program needs to work. For learning C++, it makes sense to use the most complete and helpful implementation. For delivering a product, a more conservative strategy might be
in order to maximize the number of systems on which the product can run. In the past, this has
been a reason (and sometimes just an excuse) to avoid C++ features deemed novel. However,
implementations are converging, so the need for portability across platforms is less cause for
extreme caution than it was a couple of years ago.

The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright ©2000 by AT&T.
Published by Addison Wesley Inc. ISBN 0-201-70073-5. All rights reserved.



B.2 C/C++ Compatibility
With minor exceptions, C++ is a superset of C (meaning C89, defined by ISO/IEC 9899:1990).
Most differences stem from C++’s greater emphasis on type checking. Well-written C programs
tend to be C++ programs as well. A compiler can diagnose every difference between C++ and C.

B.2.1 ‘‘Silent’’ Differences
With a few exceptions, programs that are both C++ and C have the same meaning in both languages. Fortunately, these ‘‘silent differences’’ are rather obscure:
In C, the size of a character constant and of an enumeration equals sizeof(int). In C++, sizeof('a') equals sizeof(char), and a C++ implementation is allowed to choose whatever size is most appropriate for an enumeration (§4.8).
C++ provides the // comments; C does not (although many C implementations provide them as
an extension). This difference can be used to construct programs that behave differently in the two
languages. For example:
int f(int a, int b)
{
    return a //* pretty unlikely */ b
    ;   /* unrealistic: semicolon on separate line to avoid syntax error */
}

C99 (meaning C as defined by ISO/IEC 9899:1999(E)), also provides //.
A structure name declared in an inner scope can hide the name of an object, function, enumerator, or type in an outer scope. For example:
int x[99];
void f()
{
    struct x { int a; };
    sizeof(x); /* size of the array in C, size of the struct in C++ */
}
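Two of these silent differences can be observed from the C++ side with a short sketch. The function names are invented for the example; compiling the same bodies as C would give different values.

```cpp
#include <cstddef>

int x[99];

std::size_t char_literal_size()
{
    return sizeof('a');      // a character constant has type char in C++
}

std::size_t hidden_size()
{
    struct x { int a; };
    return sizeof(x);        // in C++ the local struct hides the global array
}
```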

B.2.2 C Code That Is Not C++
The C/C++ incompatibilities that cause most real problems are not subtle. Most are easily caught
by compilers. This section gives examples of C code that is not C++. Most are deemed poor style
or even obsolete in modern C.
In C, most functions can be called without a previous declaration. For example:
main() /* poor style C. Not C++ */
{
    double sq2 = sqrt(2);                        /* call undeclared function */
    printf("the square root of 2 is %g\n",sq2);  /* call undeclared function */
}

Complete and consistent use of function declarations (function prototypes) is generally recommended for C. Where that sensible advice is followed, and especially where C compilers provide options to enforce it, C code conforms to the C++ rule. Where undeclared functions are called, you have to know the functions and the rules for C pretty well to know whether you have made a mistake or introduced a portability problem. For example, the previous main() contains at least two errors as a C program.
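For comparison, a version that is valid C++ might look like the sketch below. It declares both library functions by including their standard headers; the function names root_of_two and report are made up for the example.

```cpp
#include <cmath>
#include <cstdio>

double root_of_two()
{
    return std::sqrt(2.0);   // declared in <cmath>, so the argument is checked against double
}

int report()
{
    // declared in <cstdio>; returns the number of characters written
    return std::printf("the square root of 2 is %g\n", root_of_two());
}
```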
In C, a function declared without specifying any argument types can take any number of arguments of any type at all. Such use is deemed obsolescent in Standard C, but it is not uncommon:
void f(); /* argument types not mentioned */
void g()
{
    f(2); /* poor style C. Not C++ */
}

In C, functions can be defined using a syntax that optionally specifies argument types after the list
of arguments:
void f(a,p,c) char *p; char c; { /* ... */ } /* C. Not C++ */

Such definitions must be rewritten:

void f(int a, char* p, char c) { /* ... */ }

In C and in pre-standard versions of C++, the type specifier defaults to int. For example:
const a = 7; /* In C, type int assumed. Not C++ */

C99 disallows ‘‘implicit int,’’ just as in C++.
C allows the definition of structs in return type and argument type declarations. For example:
struct S { int x,y; } f();        /* C. Not C++ */
void g(struct S { int x,y; } y);  /* C. Not C++ */

The C++ rules for defining types make such declarations useless, and they are not allowed.
In C, integers can be assigned to variables of enumeration type:
enum Direction { up, down };
enum Direction d = 1; /* error: int assigned to Direction; ok in C */
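In C++ the same initialization has to name an enumerator or use an explicit conversion. A possible sketch, with the helper from_int invented for the example:

```cpp
enum Direction { up, down };

Direction from_int(int i)
{
    // Explicit conversion: the programmer takes responsibility for the value.
    return static_cast<Direction>(i);
}

Direction d1 = down;            // preferred: use the enumerator
Direction d2 = from_int(1);     // explicit conversion where an int is all there is
```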

C++ provides many more keywords than C does. If one of these appears as an identifier in a C program, that program must be modified to make it a C++ program:
C++ Keywords That Are Not C Keywords
__________________________________________________________________________
and       catch         explicit  namespace  or_eq             template  typename
and_eq    class         export    new        private           this      using
asm       compl         false     not        protected         throw     virtual
bitand    const_cast    friend    not_eq     public            true      wchar_t
bitor     delete        inline    operator   reinterpret_cast  try       xor
bool      dynamic_cast  mutable   or         static_cast       typeid    xor_eq
__________________________________________________________________________


In C, some of the C++ keywords are macros defined in standard headers:
C++ Keywords That Are C Macros
_______________________________________________________________
and       bitor   false    or      wchar_t
and_eq    bool    not      or_eq   xor
bitand    compl   not_eq   true    xor_eq
_______________________________________________________________
This implies that in C they can be tested using #ifdef, redefined, etc.
In C, a global data object may be declared several times in a single translation unit without using the extern specifier. As long as at most one such declaration provides an initializer, the object is considered defined only once. For example:
int i; int i; /* defines or declares a single integer ‘i’; not C++ */

In C++, an entity must be defined exactly once; §9.2.3.
In C++, a class may not have the same name as a typedef declared to refer to a different type in the same scope; §5.7.
In C, a void* may be used as the right-hand operand of an assignment to or initialization of a variable of any pointer type; in C++ it may not (§5.6). For example:
void f(int n)
{
    int* p = malloc(n*sizeof(int)); /* not C++. In C++, allocate using ‘new’ */
}
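The C++ alternatives could be sketched as follows. The three function names are hypothetical; the point is only that the void* returned by malloc() needs an explicit conversion, whereas new and std::vector need none.

```cpp
#include <cstdlib>
#include <vector>

int* with_malloc(int n)
{
    // C++ requires the explicit conversion from void*.
    return static_cast<int*>(std::malloc(n * sizeof(int)));
}

int* with_new(int n)
{
    return new int[n];                  // typed allocation, released with delete[]
}

std::vector<int> with_vector(int n)
{
    return std::vector<int>(n);         // the container owns and releases its elements
}
```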

C allows transfer of control to a labeled-statement (§A.6) to bypass an initialization; C++ does not.
In C, a global const by default has external linkage; in C++ it does not and must be initialized, unless explicitly declared extern (§5.4).
In C, names of nested structures are placed in the same scope as the structure in which they are
nested. For example:
struct S {
    struct T { /* ... */ };
    // ...
};
struct T x; /* ok in C meaning ‘S::T x;’. Not C++ */
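In C++ the nested name has to be qualified. A minimal sketch, with the members value and member and the variable t added only so that the example can be checked:

```cpp
struct S {
    struct T { int value; };
    T member;                // inside S, the unqualified name T is in scope
};

S::T t = { 42 };             // outside S, T must be qualified
```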

In C, an array can be initialized by an initializer that has more elements than the array requires. For
example:
char v[5] = "Oscar"; /* ok in C, the terminating 0 is not used. Not C++ */

B.2.3 Deprecated Features
By deprecating a feature, the standards committee expresses the wish that the feature would go
away. However, the committee does not have a mandate to remove a heavily used feature – however redundant or dangerous it may be. Thus, a deprecation is a strong hint to the users to avoid the
feature.
The keyword static, which usually means ‘‘statically allocated,’’ can be used to indicate that a function or an object is local to a translation unit. For example:
// file1:
static int glob;
// file2:
static int glob;

This program genuinely has two integers called glob. Each glob is used exclusively by functions defined in its translation unit.
The use of static to indicate ‘‘local to translation unit’’ is deprecated in C++. Use unnamed namespaces instead (§8.2.5.1).
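A sketch of the recommended replacement for one translation unit follows. The helper functions bump and use_glob are invented for the example.

```cpp
namespace {                  // unnamed namespace: names are local to this translation unit
    int glob = 0;

    int bump() { return ++glob; }
}

int use_glob()
{
    bump();
    return bump();           // members are used without qualification
}
```

Another translation unit can contain its own unnamed namespace with its own glob; the two never clash at link time.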
The implicit conversion of a string literal to a (non-const) char* is deprecated. Use named arrays of char or avoid assignment of string literals to char*s (§5.2.2).
C-style casts should have been deprecated when the new-style casts were introduced. Programmers should seriously consider banning C-style casts from their own programs. Where explicit type conversion is necessary, static_cast, reinterpret_cast, const_cast, or a combination of these can do what a C-style cast can. The new-style casts should be preferred because they are more explicit and more visible (§6.2.7).
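The following sketch shows what the two pieces of advice might look like in practice. All function names are hypothetical, and the comments name the C-style cast that each new-style cast replaces.

```cpp
double ratio(int num, int den)
{
    return static_cast<double>(num) / den;                // replaces (double)num
}

const unsigned char* bytes_of(const int& i)
{
    return reinterpret_cast<const unsigned char*>(&i);    // replaces (const unsigned char*)&i
}

char first_after_edit()
{
    char name[] = "Oscar";           // named array of char: a modifiable copy of the literal
    name[0] = 'o';
    const char* fixed = "Oscar";     // a literal itself is best viewed through const char*
    return name[0] == fixed[0] ? 'x' : name[0];
}
```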
B.2.4 C++ Code That Is Not C
This section lists facilities offered by C++ but not by C. The features are sorted by purpose. However, many classifications are possible and most features serve multiple purposes, so this classification should not be taken too seriously.
– Features primarily for notational convenience:
[1] // comments (§2.3); added to C99
[2] Support for restricted character sets (§C.3.1); partially added to C99
[3] Support for extended character sets (§C.3.3); added to C99
[4] Non-constant initializers for objects in static storage (§9.4.1)
[5] const in constant expressions (§5.4, §C.5)
[6] Declarations as statements (§6.3.1); added to C99
[7] Declarations in for-statement initializers (§6.3.3); added to C99
[8] Declarations in conditions (§6.3.2.1)
[9] Structure names need not be prefixed by struct (§5.7)
– Features primarily for strengthening the type system:
[1] Function argument type checking (§7.1); later added to C (§B.2.2)
[2] Type-safe linkage (§9.2, §9.2.3)
[3] Free store management using new and delete (§6.2.6, §10.4.5, §15.6)
[4] const (§5.4, §5.4.1); later added to C
[5] The Boolean type bool (§4.2); partially added to C99
[6] New cast syntax (§6.2.7)
– Facilities for user-defined types:
[1] Classes (Chapter 10)
[2] Member functions (§10.2.1) and member classes (§11.12)
[3] Constructors and destructors (§10.2.3, §10.4.1)
[4] Derived classes (Chapter 12, Chapter 15)
[5] virtual functions and abstract classes (§12.2.6, §12.3)
[6] Public/protected/private access control (§10.2.2, §15.3, §C.11)
[7] friends (§11.5)
[8] Pointers to members (§15.5, §C.12)
[9] static members (§10.2.4)
[10] mutable members (§10.2.7.2)
[11] Operator overloading (Chapter 11)
[12] References (§5.5)
– Features primarily for program organization (in addition to classes):
[1] Templates (Chapter 13, §C.13)
[2] Inline functions (§7.1.1); added to C99
[3] Default arguments (§7.5)
[4] Function overloading (§7.4)
[5] Namespaces (§8.2)
[6] Explicit scope qualification (operator ::; §4.9.4)
[7] Exception handling (§8.3, Chapter 14)
[8] Run-time Type Identification (§15.4)
The keywords added by C++ (§B.2.2) can be used to spot most C++-specific facilities. However, some facilities, such as function overloading and consts in constant expressions, are not identified by a keyword. In addition to the features listed, the C++ library (§16.1.2) is mostly C++ specific.
The __cplusplus macro can be used to determine whether a program is being processed by a C or a C++ compiler (§9.2.4).
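A common use of __cplusplus is a header that must be acceptable to both compilers. The sketch below assumes a hypothetical function shared_add; a C compiler sees only the plain declaration, while a C++ compiler also sees the linkage block.

```cpp
#ifdef __cplusplus
extern "C" {                         // C++ compiler: give the declaration C linkage
#endif

int shared_add(int a, int b);        // hypothetical function shared between C and C++ code

#ifdef __cplusplus
}
#endif

// Definition; it inherits the C linkage of the earlier declaration.
int shared_add(int a, int b) { return a + b; }
```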

B.3 Coping with Older C++ Implementations
C++ has been in constant use since 1983 (§1.4). Since then, several versions have been defined and
many separately developed implementations have emerged. The fundamental aim of the standards
effort was to ensure that implementers and users would have a single definition of C++ to work
from. Until that definition becomes pervasive in the C++ community, however, we have to deal with the fact that not every implementation provides every feature described in this book.
It is unfortunately not uncommon for people to take their first serious look at C++ using a five-year-old implementation. The typical reason is that such implementations are widely available and
free. Given a choice, no self-respecting professional would touch such an antique. For a novice,
older implementations come with serious hidden costs. The lack of language features and library
support means that the novice must struggle with problems that have been eliminated in newer
implementations. Using a feature-poor older implementation also warps the novice’s programming
style and gives a biased view of what C++ is. The best subset of C++ to initially learn is not the set
of low-level facilities (and not the common C and C++ subset; §1.2). In particular, I recommend
relying on the standard library and on templates to ease learning and to get a good initial impression of what C++ programming can be.
The first commercial release of C++ was in late 1985. The language was defined by the first
edition of this book. At that point, C++ did not offer multiple inheritance, templates, run-time type
information, exceptions, or namespaces. Today, I see no reason to use an implementation that doesn’t provide at least some of these features. I added multiple inheritance, templates, and exceptions to the definition of C++ in 1989. However, early support for templates and exceptions was uneven and often poor. If you find problems with templates or exceptions in an older implementation, consider an immediate upgrade.
In general, it is wise to use an implementation that conforms to the standard wherever possible
and to minimize the reliance on implementation-defined and undefined aspects of the language.
Design as if the full language were available and then use whatever workarounds are needed. This
leads to better organized and more maintainable programs than designing for the lowest-common-denominator subset of C++. Also, be careful to use implementation-specific language extensions
only when absolutely necessary.

B.3.1 Headers
Traditionally, every header file had a .h suffix. Thus, C++ implementations provided headers such as <map.h> and <iostream.h>.
When the standards committee needed headers for redefined versions of standard libraries and for newly added library facilities, naming those headers became a problem. Using the old .h names would have caused compatibility problems. The solution was to drop the .h suffix in standard header names. The suffix is redundant anyway because the < > notation indicates that a standard header is being named.
Thus, the standard library provides non-suffixed headers, such as <iostream> and <map>. The declarations in those files are placed in namespace std. Older headers place their declarations in the global namespace and use a .h suffix. Consider:
#include<iostream>
int main()
{
    std::cout << "Hello, world!\n";
}

If this fails to compile on an implementation, try the more traditional version:
#include<iostream.h>
int main()
{
    cout << "Hello, world!\n";
}

Some of the most serious portability problems occur because of incompatible headers. The standard headers are only a minor contributor to this. Often, a program depends on a large number of
headers that are not present on all systems, on a large number of declarations that don’t appear in
the same headers on all systems, and on declarations that appear to be standard (because they are
found in headers with standard names) but are not part of any standard.
There are no fully-satisfactory approaches to dealing with portability in the face of inconsistent
headers. A general idea is to avoid direct dependencies on inconsistent headers and localize the
remaining dependencies. That is, we try to achieve portability through indirection and localization.

For example, if declarations that we need are provided in different headers in different systems, we may choose to #include an application specific header that in turn #includes the appropriate header(s) for each system. Similarly, if some functionality is provided in slightly different forms on different systems, we may choose to access that functionality through application-specific interface classes and functions.
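One possible shape for such an application-specific header is sketched below. The file name my_io.h, the macro MYAPP_OLD_IOSTREAM, and the function greet are all assumptions made for illustration; a real project would pick its own names and its own way of detecting the old implementation.

```cpp
// my_io.h (hypothetical application-specific header)
#ifdef MYAPP_OLD_IOSTREAM        // hypothetical macro, set by the build for old systems
#include <iostream.h>            // pre-standard header: cout is in the global namespace
#else
#include <iostream>
using std::cout;                 // make the standard name usable the same way
#endif

// The rest of the application includes my_io.h and uses cout
// without caring which branch was taken.
inline bool greet()
{
    cout << "Hello, world!\n";
    return cout.good();
}
```

The dependency on the inconsistent header is now confined to one file, which is the localization the text recommends.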
B.3.2 The Standard Library
Naturally, pre-standard-C++ implementations may lack parts of the standard library. Most will
have iostreams, non-templated complex, a different string class, and the C standard library.
However, some may lack map, list, valarray, etc. In such cases, use the – typically proprietary –
libraries available in a way that will allow conversion when your implementation gets upgraded to
the standard. It is usually better to use a non-standard string, list, and map than to revert to C-style
programming in the absence of these standard library classes. Also, good implementations of the
STL part of the standard library (Chapter 16, Chapter 17, Chapter 18, Chapter 19) are available free
for downloading.
Early implementations of the standard library were incomplete. For example, some had containers that didn’t support allocators and others required allocators to be explicitly specified for
each class. Similar problems occurred for other ‘‘policy arguments,’’ such as comparison criteria.
For example:
    list<int> li;                           // ok, but some implementations require an allocator
    list<int,allocator<int> > li2;          // ok, but some implementations don't implement allocators

    map<string,Record> m1;                  // ok, but some implementations require a less-operation
    map<string,Record,less<string> > m2;
Use whichever version an implementation accepts. Eventually, the implementations will accept all.
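One way to keep such a choice in a single place is a typedef. The sketch below is an illustration only: APP_EXPLICIT_POLICIES is a hypothetical configuration macro, and Record, Int_list, Record_map, and count_for() are names invented for the example.

```cpp
#include <functional>
#include <list>
#include <map>
#include <memory>
#include <string>

struct Record { int count; };

// Define APP_EXPLICIT_POLICIES (hypothetical) for an implementation that
// insists on explicit allocator and comparison arguments.
#ifdef APP_EXPLICIT_POLICIES
typedef std::list<int, std::allocator<int> > Int_list;
typedef std::map<std::string, Record, std::less<std::string> > Record_map;
#else
typedef std::list<int> Int_list;
typedef std::map<std::string, Record> Record_map;
#endif

// User code names only the typedefs, never the policy arguments.
int count_for(Record_map& m, const std::string& key)
{
    return ++m[key].count;    // a new Record is value-initialized, so count starts at 0
}
```

When the implementation is upgraded, only the typedefs need to change.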
Early C++ implementations provided istrstream and ostrstream defined in <strstream.h>
instead of istringstream and ostringstream defined in <sstream>. The strstreams operated
directly on a char[] (see §21.10[26]).
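For comparison, here is a small sketch of the standard string streams from <sstream>. The functions sum_of_ints() and describe() are hypothetical helpers written for this illustration.

```cpp
#include <sstream>
#include <string>

// Read whitespace-separated integers from a string and add them up.
int sum_of_ints(const std::string& text)
{
    std::istringstream in(text);
    int sum = 0;
    for (int x; in >> x; ) sum += x;
    return sum;
}

// Format a value into a string; the stream manages its own buffer.
std::string describe(int n)
{
    std::ostringstream out;
    out << "n=" << n;
    return out.str();
}
```

Unlike a strstream, neither stream needs a caller-supplied char[].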
The streams in pre-standard-C++ implementations were not parameterized. In particular, the
templates with the basic_ prefix are new in the standard, and the basic_ios class used to be called
ios. Curiously enough, iostate used to be called io_state.
B.3.3 Namespaces
If your implementation does not support namespaces, use source files to express the logical structure of the program (Chapter 9). Similarly, use header files to express interfaces that you provide
for implementations or that are shared with C.
In the absence of namespaces, use static to compensate for the lack of unnamed namespaces.
Also use an identifying prefix to global names to distinguish your names from those of other parts
of the code. For example:

    // for use on pre-namespace implementations:

    class bs_string { /* ... */ };            // Bjarne's string
    typedef int bs_bool;                      // Bjarne's Boolean type

    class joe_string;                         // Joe's string
    enum joe_bool { joe_false, joe_true };    // Joe's bool

Be careful when choosing a prefix. Existing C and C++ libraries are littered with such prefixes.
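For contrast, the sketch below shows roughly what the same names might look like once namespaces are available. The namespace names bs and joe, the members, and next_call() are hypothetical; the unnamed namespace takes over the role that static plays above.

```cpp
namespace bs {                        // replaces the bs_ prefix
    struct string { const char* p; };
    typedef int Bool;
}

namespace joe {                       // replaces the joe_ prefix
    struct string { int length; };
    enum Bool { False, True };
}

namespace {                           // replaces a file-level static
    int call_count = 0;
}

// Uses a name that is local to this translation unit.
int next_call() { return ++call_count; }
```

Here bs::string and joe::string coexist without clashing, and the prefix no longer has to be repeated inside each library.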
B.3.4 Allocation Errors
In pre-exception-handling-C++, operator new returned 0 to indicate allocation failure. Standard
C++’s new throws bad_alloc by default.
In general, it is best to convert to the standard. In this case, this means modify the code to catch
bad_alloc rather than test for 0. In either case, coping with memory exhaustion beyond giving an
error message is hard on many systems.
However, when converting from testing 0 to catching bad_alloc is impractical, you can sometimes
modify the program to revert to the pre-exception-handling behavior. If no _new_handler is
installed, using the nothrow allocator will cause a 0 to be returned in case of allocation failure:

    X* p1 = new X;             // throws bad_alloc if no memory
    X* p2 = new(nothrow) X;    // returns 0 if no memory
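The two styles can be put side by side in a small sketch. The type X and the helper functions make_checked() and make_or_null() are hypothetical; std::nothrow and std::bad_alloc come from <new>.

```cpp
#include <new>

struct X {
    int value;
    X() : value(7) {}
};

// Standard style: allocation failure arrives as std::bad_alloc.
bool make_checked(X*& out)
{
    try {
        out = new X;
        return true;
    }
    catch (const std::bad_alloc&) {
        out = 0;
        return false;
    }
}

// Pre-exception style: the caller tests the returned pointer against 0.
X* make_or_null()
{
    return new(std::nothrow) X;
}
```

Code that already tests for 0 can call make_or_null() unchanged, whereas new code would normally let bad_alloc propagate to a handler that can report the error.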

B.3.5 Templates
The standard introduced new template features and clarified the rules for several existing ones.
If your implementation doesn’t support partial specialization, use a separate name for the template that would otherwise have been a specialization. For example:
    template<class T> class plist : private list<void*> {
        // ...
    };

If your implementation doesn’t support member templates, some techniques become infeasible. In
particular, member templates allow the programmer to specify construction and conversion with a
flexibility that cannot be matched without them (§13.6.2). Sometimes, providing a nonmember
function that constructs an object is an alternative. Consider:

    template<class T> class X {
        // ...
        template<class A> X(const A& a);
    };

In the absence of member templates, we must restrict ourselves to specific types:

    template<class T> class X {
        // ...
        X(const A1& a);
        X(const A2& a);
        // ...
    };
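The nonmember-function alternative might be sketched as follows. The function make_X() and the members of X shown here are hypothetical; the sketch assumes that a static_cast from A to T is the intended conversion.

```cpp
template<class T> class X {
    T val;
public:
    explicit X(const T& v) : val(v) {}    // an ordinary, non-template constructor
    const T& get() const { return val; }
};

// Nonmember function template: builds an X<T> from any A convertible to T.
// T is given explicitly; A is deduced from the argument.
template<class T, class A> X<T> make_X(const A& a)
{
    return X<T>(static_cast<T>(a));
}
```

A call such as make_X<double>(3) does what the member template constructor would have done for A = int, at the price of a slightly different notation.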

Most early implementations generated definitions for all member functions defined within a template
class when that template class was instantiated. This could lead to errors in unused member
functions (§C.13.9.1). The solution is to place the definition of the member functions after the
class declaration. For example, rather than

    template<class T> class Container {
        // ...
    public:
        void sort() { /* use < */ }    // in-class definition
    };

    class Glob { /* no < for Glob */ };

    Container<Glob> cg;    // some pre-standard implementations try to define Container<Glob>::sort()

use

    template<class T> class Container {
        // ...
    public:
        void sort();
    };

    template<class T> void Container<T>::sort() { /* use < */ }    // out-of-class definition

    class Glob { /* no < for Glob */ };

    Container<Glob> cg;    // no problem as long as cg.sort() isn't called

Early implementations of C++ did not handle the use of members defined later in a class. For
example:

    template<class T> class Vector {
    public:
        T& operator[](size_t i) { return v[i]; }    // v declared below
        // ...
    private:
        T* v;          // oops: not found!
        size_t sz;
    };

In such cases, either sort the member declarations to avoid the problem or place the definition of
the member function after the class declaration.
Some pre-standard-C++ implementations do not accept default arguments for templates
(§13.4.1). In that case, every template parameter must be given an explicit argument. For example:

    template<class Key, class T, class LT = less<Key> > class map {
        // ...
    };

    map<string,int> m;                   // Oops: default template arguments not implemented
    map<string,int,less<string> > m2;    // workaround: be explicit


B.3.6 For-Statement Initializers
Consider:
    void f(vector<char>& v, int m)
    {
        for (int i=0; i<v.size() && i<=m; ++i) cout << v[i];

        if (i == m) {    // error: i referred to after end of for-statement
            // ...
        }
    }

Such code used to work because in the original definition of C++, the scope of the controlled variable extended to the end of the scope in which the for-statement appears. If you find such code,
simply declare the controlled variable before the for-statement:
    void f2(vector<char>& v, int m)
    {
        int i=0;    // i needed after the loop
        for (; i<v.size() && i<=m; ++i) cout << v[i];

        if (i == m) {
            // ...
        }
    }

B.4 Advice
[1] For learning C++, use the most up-to-date and complete implementation of Standard C++ that
you can get access to; §B.3.
[2] The common subset of C and C++ is not the best initial subset of C++ to learn; §1.6, §B.3.
[3] For production code, remember that not every C++ implementation is completely up-to-date.
Before using a major new feature in production code, try it out by writing small programs to
test the standards conformance and performance of the implementations you plan to use; for
example, see §8.5[6-7], §16.5[10], §B.5[7].
[4] Avoid deprecated features such as global statics; also avoid C-style casts; §6.2.7, §B.2.3.
[5] ‘‘implicit int’’ has been banned, so explicitly specify the type of every function, variable,
const, etc.; §B.2.2.
[6] When converting a C program to C++, first make sure that function declarations (prototypes)
and standard headers are used consistently; §B.2.2.
[7] When converting a C program to C++, rename variables that are C++ keywords; §B.2.2.
[8] When converting a C program to C++, cast the result of malloc() to the proper type or change
all uses of malloc() to uses of new; §B.2.2.
[9] When converting from malloc() and free() to new and delete, consider using vector,
push_back(), and reserve() instead of realloc(); §3.8, §16.3.5.
[10] When converting a C program to C++, remember that there are no implicit conversions from
ints to enumerations; use explicit type conversion where necessary; §4.8.
[11] A facility defined in namespace std is defined in a header without a suffix (e.g. std::cout is
declared in <iostream>). Older implementations have standard library facilities in the global
namespace and declared in headers with a .h suffix (e.g. ::cout declared in <iostream.h>);
§9.2.2, §B.3.1.
[12] If older code tests the result of new against 0, it must be modified to catch bad_alloc or to use
new(nothrow); §B.3.4.
[13] If your implementation doesn’t support default template arguments, provide arguments
explicitly; typedefs can often be used to avoid repetition of template arguments (similar to the way
the typedef string saves you from saying basic_string<char, char_traits<char>,
allocator<char> >); §B.3.5.
[14] Use <string> to get std::string (<string.h> holds the C-style string functions); §B.3.1.
[15] For each standard C header <X.h> that places names in the global namespace, the header
<cX> places the names in namespace std; §B.3.1.
[16] Many systems have a "String.h" header defining a string type. Note that such strings differ
from the standard library string.
[17] Prefer standard facilities to non-standard ones; §20.1, §B.3, §C.2.
[18] Use extern "C" when declaring C functions; §9.2.4.
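Advice [9] can be illustrated with a short sketch. The function squares() is a hypothetical example of code that would have grown a malloc()ed array with realloc() in C.

```cpp
#include <cstddef>
#include <vector>

// Build a growing sequence without malloc(), realloc(), or free().
std::vector<int> squares(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);                                 // optional: request the space up front
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i * i));     // grows as needed
    return v;                                     // storage is released by the destructor
}
```

The call to reserve() is an optimization only; push_back() alone would give the same result.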


B.5 Exercises
1. (∗2.5) Take a C program and convert it to a C++ program; list the kinds of non-C++ constructs
used and determine if they are valid ANSI C constructs. First convert the program to strict
ANSI C (adding prototypes, etc.), then to C++. Estimate the time it would take to convert a
100,000 line C program to C++.
2. (∗2.5) Write a program to help convert C programs to C++ by renaming variables that are C++
keywords, replacing calls of malloc() by uses of new, etc. Hint: don’t try to do a perfect job.
3. (∗2) Replace all uses of malloc() in a C-style C++ program (maybe a recently converted C
program) to uses of new. Hint: §B.4[8-9].
4. (∗2.5) Minimize the use of macros, global variables, uninitialized variables, and casts in a
C-style C++ program (maybe a recently converted C program).
5. (∗3) Take a C++ program that is the result of a crude conversion from C and critique it as a C++
program considering locality of information, abstraction, readability, extensibility, and potential
for reuse of parts. Make one significant change to the program based on that critique.
6. (∗2) Take a small (say, 500 line) C++ program and convert it to C. Compare the original with
the result for size and probable maintainability.
7. (∗3) Write a small set of test programs to determine whether a C++ implementation has ‘‘the
latest’’ standard features. For example, what is the scope of a variable defined in a
for-statement initializer? (§B.3.6), are default template arguments supported? (§B.3.5), are member
templates supported? (§13.6.2), and is argument-based lookup supported? (§8.2.6). Hint:
§B.2.4.
8. (∗2.5) Take a C++ program that uses <X.h> headers and convert it to using <X> and <cX>
headers. Minimize the use of using-directives.




Appendix C

Technicalities
Deep in the fundamental
heart of mind and Universe,
there is a reason.
– Slartibartfast

What the standard promises – character sets – integer literals – constant expressions
– promotions and conversions – multidimensional arrays – fields and unions –
memory management – garbage collection – namespaces – access control – pointers
to data members – templates – static members – friends – templates as template
parameters – template argument deduction – typename and template qualification –
instantiation – name binding – templates and namespaces – explicit instantiation –
advice.


C.1 Introduction and Overview
This chapter presents technical details and examples that do not fit neatly into my presentation of
the main C++ language features and their uses. The details presented here can be important when
you are writing a program and essential when reading code written using them. However, I consider them technical details that should not be allowed to distract from the student’s primary task of
learning to use C++ well or the programmer’s primary task of expressing ideas as clearly and as
directly as possible in C++.

C.2 The Standard
Contrary to common belief, strictly adhering to the C++ language and library standard doesn’t guarantee good code or even portable code. The standard doesn’t say whether a piece of code is good


or bad; it simply says what a programmer can and cannot rely on from an implementation. One can
write perfectly awful standard-conforming programs, and most real-world programs rely on features not covered by the standard.
Many important things are deemed implementation-defined by the standard. This means that
each implementation must provide a specific, well-defined behavior for a construct and that behavior must be documented. For example:
    unsigned char c1 = 64;      // well-defined: a char has at least 8 bits and can always hold 64
    unsigned char c2 = 1256;    // implementation-defined: truncation if a char has only 8 bits

The initialization of c1 is well-defined because a char must be at least 8 bits. However, the
behavior of the initialization of c2 is implementation-defined because the number of bits in a char is
implementation-defined. If the char has only 8 bits, the value 1256 will be truncated to 232
(§C.6.2.1). Most implementation-defined features relate to differences in the hardware used to run
a program.
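The arithmetic behind the value 232 can be checked with a small sketch. The function stored_value() and the constant char_bits are hypothetical names; the sketch relies on the rule that conversion to an unsigned type reduces the value modulo 2 to the power N, where N is the number of bits in the type.

```cpp
#include <climits>

// Number of bits in a char on this implementation (at least 8).
const int char_bits = CHAR_BIT;

// Store n in an unsigned char and report what was actually kept.
unsigned int stored_value(int n)
{
    unsigned char c = static_cast<unsigned char>(n);    // reduced modulo UCHAR_MAX+1
    return c;
}
```

With 8-bit chars, 1256 is 4*256 + 232, so stored_value(1256) yields 232; with wider chars it yields 1256 unchanged.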
When writing real-world programs, it is usually necessary to rely on implementation-defined
behavior. Such behavior is the price we pay for the ability to operate effectively on a large range of
systems. For example, the language would have been much simpler if all characters had been 8 bits
and all integers 32 bits. However, 16-bit and 32-bit character sets are not uncommon – nor are
integers too large to fit in 32 bits. For example, many computers now have disks that hold more
than 32G bytes, so 48-bit or 64-bit integers can be useful for representing disk addresses.
To maximize portability, it is wise to be explicit about what implementation-defined features
we rely on and to isolate the more subtle examples in clearly marked sections of a program. A typical example of this practice is to present all dependencies on hardware sizes in the form of constants and type definitions in some header file. To support such techniques, the standard library
provides numeric_limits (§22.2).
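A header of this kind might look like the sketch below. The namespace platform and the function fits_in_int() are hypothetical; std::numeric_limits and the members used here come from <limits>.

```cpp
#include <limits>

// Hypothetical "platform facts" header: every dependency on hardware
// sizes is stated here, once, under a descriptive name.
namespace platform {
    const int  char_bits      = std::numeric_limits<unsigned char>::digits;
    const bool char_is_signed = std::numeric_limits<char>::is_signed;
    const int  int_max        = std::numeric_limits<int>::max();
    const int  int_min        = std::numeric_limits<int>::min();
}

// Check a range explicitly instead of assuming that int has 32 bits.
bool fits_in_int(long value)
{
    return platform::int_min <= value && value <= platform::int_max;
}
```

Code that needs these facts refers to platform::char_bits and its siblings rather than to literal constants such as 8 or 32.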
Undefined behavior is nastier. A construct is deemed undefined by the standard if no reasonable behavior is required by an implementation. Typically, some obvious implementation technique will cause a program using an undefined feature to behave very badly. For example:
    const int size = 4*1024;
    char page[size];

    void f()
    {
        page[size+size] = 7;    // undefined
    }

Plausible outcomes of this code fragment include overwriting unrelated data and triggering a hardware error/exception. An implementation is not required to choose among plausible outcomes.
Where powerful optimizers are used, the actual effects of undefined behavior can become quite
unpredictable. If a set of plausible and easily implementable alternatives exist, a feature is deemed
implementation-defined rather than undefined.
It is worth spending considerable time and effort to ensure that a program does not use something deemed undefined by the standard. In many cases, tools exist to help do this.


C.3 Character Sets
The examples in this book are written using the U.S. variant of the international 7-bit character set
ISO 646-1983 called ASCII (ANSI3.4-1968). This can cause three problems for people who use
C++ in an environment with a different character set:
[1] ASCII contains punctuation characters and operator symbols – such as ], {, and ! – that
are not available in some character sets.

[2] We need a notation for characters that do not have a convenient character representation
(e.g., newline and ‘‘the character with value 17’’).
[3] ASCII doesn’t contain characters – such as ζ, æ, and Π – that are used for writing languages other than English.
C.3.1 Restricted Character Sets
The ASCII special characters [, ], {, }, |, and \ occupy character set positions designated as
alphabetic by ISO. In most European national ISO-646 character sets, these positions are occupied
by letters not found in the English alphabet. For example, the Danish national character set uses
them for the vowels Æ, æ, Ø, ø, Å, and å. No significant amount of text can be written in Danish
without them.
A set of trigraphs is provided to allow national characters to be expressed in a portable way
using a truly standard minimal character set. This can be useful for interchange of programs, but it
doesn’t make it easier for people to read programs. Naturally, the long-term solution to this problem is for C++ programmers to get equipment that supports both their native language and C++
well. Unfortunately, this appears to be infeasible for some, and the introduction of new equipment
can be a frustratingly slow process. To help programmers stuck with incomplete character sets,
C++ provides alternatives:
    Keywords          Digraphs       Trigraphs
    and      &&       <%    {        ??=   #
    and_eq   &=       %>    }        ??(   [
    bitand   &        <:    [        ??<   {
    bitor    |        :>    ]        ??/   \
    compl    ~        %:    #        ??)   ]
    not      !        %:%:  ##       ??>   }
    or       ||                      ??'   ^
    or_eq    |=                      ??!   |
    xor      ^                       ??-   ~
    xor_eq   ^=
    not_eq   !=

Programs using the keywords and digraphs are far more readable than the equivalent programs
written using trigraphs. However, if characters such as { are not available, trigraphs are necessary
for putting ‘‘missing’’ characters into strings and character constants. For example, '{' becomes
'??<'.
Some people prefer the keywords such as and to their traditional operator notation.
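To give an impression of how the keywords and digraphs read in practice, here is a small sketch. All function names are hypothetical. Trigraphs are deliberately left out, and some compilers may need a conformance option before they accept the keywords.

```cpp
// Keyword forms of &&, !, |, &, and !=
bool in_range(int x, int lo, int hi)
{
    return lo <= x and x <= hi;
}

bool outside(int x, int lo, int hi)
{
    return not in_range(x, lo, hi);
}

unsigned set_flag(unsigned flags, unsigned bit)
{
    return flags bitor bit;
}

bool has_flag(unsigned flags, unsigned bit)
{
    return (flags bitand bit) not_eq 0;
}

// Digraph forms of { } [ ]
int first_of_three()
<%
    int a<:3:> = <% 10, 20, 30 %>;
    return a<:0:>;
%>
```

The keyword versions read almost like prose, whereas first_of_three() shows why digraphs are best reserved for keyboards that really lack braces and brackets.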


C.3.2 Escape Characters
A few characters have standard names that use the backslash \ as an escape character:
    Name               ASCII Name    C++ Name
    newline            NL (LF)       \n
    horizontal tab     HT            \t
    vertical tab       VT            \v
    backspace          BS            \b
    carriage return    CR            \r
    form feed          FF            \f
    alert              BEL           \a
    backslash          \             \\
    question mark      ?             \?
    single quote       '             \'
    double quote       "             \"
    octal number       ooo           \ooo
    hex number         hhh           \xhhh ...

Despite their appearance, these are single characters.
It is possible to represent a character as a one-, two-, or three-digit octal number (\ followed by
octal digits) or as a hexadecimal number (\x followed by hexadecimal digits). There is no limit to
the number of hexadecimal digits in the sequence. A sequence of octal or hexadecimal digits is
terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. For
example:
    Octal      Hexadecimal    Decimal    ASCII
    '\6'       '\x6'          6          ACK
    '\60'      '\x30'         48         '0'
    '\137'     '\x05f'        95         '_'

This makes it possible to represent every character in the machine’s character set and, in particular,
to embed such characters in character strings (see §5.2.2). Using any numeric notation for characters makes a program nonportable across machines with different character sets.
It is possible to enclose more than one character in a character literal, for example 'ab'. Such
uses are archaic, implementation-dependent, and best avoided.
When embedding a numeric constant in a string using the octal notation, it is wise always to use
three digits for the number. The notation is hard enough to read without having to worry about
whether or not the character after a constant is a digit. For hexadecimal constants, use two digits.
Consider these examples:
    char v1[] = "a\xah\129";     // 6 chars: 'a' '\xa' 'h' '\12' '9' '\0'
    char v2[] = "a\xah\127";     // 5 chars: 'a' '\xa' 'h' '\127' '\0'
    char v3[] = "a\xad\127";     // 4 chars: 'a' '\xad' '\127' '\0'
    char v4[] = "a\xad\0127";    // 5 chars: 'a' '\xad' '\012' '7' '\0'
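The character counts in these comments can be confirmed with sizeof, as in the sketch below. It declares the same four arrays (as const, a choice made for this illustration) and assumes an implementation that accepts '\xad' in a narrow string literal.

```cpp
// The four strings discussed above; sizeof counts the terminating '\0'.
const char v1[] = "a\xah\129";     // \xa stops at 'h'; \12 stops at '9'
const char v2[] = "a\xah\127";     // \127 uses all three octal digits
const char v3[] = "a\xad\127";     // 'd' is a hex digit, so the escape is \xad
const char v4[] = "a\xad\0127";    // \012 takes three digits; '7' is left over

const unsigned long n1 = sizeof v1;
const unsigned long n2 = sizeof v2;
const unsigned long n3 = sizeof v3;
const unsigned long n4 = sizeof v4;
```

Comparing n1 through n4 with the expected counts is a cheap way to catch an escape sequence that swallowed one character too many.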


C.3.3 Large Character Sets
A C++ program may be written and presented to the user in character sets that are much richer than
the 128-character ASCII set. Where an implementation supports larger character sets, identifiers,
comments, character constants, and strings may contain characters such as å, β, and Γ. However, to
be portable the implementation must map these characters into an encoding using only characters
available to every C++ user. In principle, this translation into the C++ basic source character set
(the set used in this book) occurs before the compiler does any other processing. Therefore, it does
not affect the semantics of the program.
The standard encoding of characters from large character sets into the smaller set supported
directly by C++ is presented as sequences of four or eight hexadecimal digits:
universal-character-name:
\U X X X X X X X X
\u X X X X

Here, X represents a hexadecimal digit. For example, \u1e2b. The shorter notation \uXXXX is
equivalent to \U0000XXXX. A number of hexadecimal digits different from four or eight is a
lexical error.
A programmer can use these character encodings directly. However, they are primarily meant
as a way for an implementation that internally uses a small character set to handle characters from a
large character set seen by the programmer.
If you rely on special environments to provide an extended character set for use in identifiers,
the program becomes less portable. A program is hard to read unless you understand the natural
language used for identifiers and comments. Consequently, for programs used internationally it is
usually best to stick to English and ASCII.
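The equivalence of the short and long universal-character-name forms can be shown in a few lines. This sketch assumes a C++11 or later compiler, because it uses char32_t literals, which the text above predates; short_form() and long_form() are hypothetical names.

```cpp
// The same character written with four and with eight hexadecimal digits.
char32_t short_form() { return U'\u1e2b'; }
char32_t long_form()  { return U'\U00001e2b'; }
```

Both functions return the value 0x1e2b. The literal is spelled in the basic source character set only, so the file itself stays plain ASCII.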

C.3.4 Signed and Unsigned Characters
It is implementation-defined whether a plain char is considered signed or unsigned. This opens the
possibility for some nasty surprises and implementation dependencies. For example:

    char c = 255;    // 255 is ‘‘all ones,’’ hexadecimal 0xFF
    int i = c;

What will be the value of i? Unfortunately, the answer is undefined. On all implementations I
know of, the answer depends on the meaning of the ‘‘all ones’’ char bit pattern when extended into
an int. On a SGI Challenge machine, a char is unsigned, so the answer is 255. On a Sun SPARC
or an IBM PC, where a char is signed, the answer is -1. In this case, the compiler might warn
about the conversion of the literal 255 to the char value -1. However, C++ does not offer a general
mechanism for detecting this kind of problem. One solution is to avoid plain char and use the
specific char types only. Unfortunately, some standard library functions, such as strcmp(), take plain
chars only (§20.4.1).
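When a plain char has to be widened, going through unsigned char removes the dependency. The functions widen_as_is() and widen_unsigned() in this sketch are hypothetical names.

```cpp
#include <limits>

// Widen a plain char directly: the result depends on whether char is signed.
int widen_as_is(char c)
{
    return c;
}

// Widen through unsigned char: the result is never negative.
int widen_unsigned(char c)
{
    return static_cast<unsigned char>(c);
}

// Reports which of the two behaviors this implementation has chosen.
const bool plain_char_is_signed = std::numeric_limits<char>::is_signed;
```

For a char holding the ‘‘all ones’’ pattern, widen_unsigned() gives 255 on an implementation with 8-bit chars, whether or not plain char is signed there.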
A char must behave identically to either a signed char or an unsigned char. However, the
three char types are distinct, so you can’t mix pointers to different char types. For example:
