www.gameinstitute.com Introduction to C and C++ : Week 2: Page 1 of 39










Game Institute



































by Stan Trujillo



Week 2

Introduction to C and C++







© 2001, eInstitute, Inc.

You may print one copy of this document for your own personal use.
You agree to destroy any worn copy prior to printing another. You may
not distribute this document in paper, fax, magnetic, electronic or other
telecommunications format to anyone else.















This is the companion text to the www.gameinstitute.com course of the
same title. With minor modifications made for print formatting, it is
identical to the viewable text, but without the audio.















Table of Contents

Lesson 2 – Data Structures
Complex Data Types
Arrays
Structures
Mixing Complex Types
Memory Usage
Pointers
Pointers and Functions
References
Pointer arithmetic
Memory Allocation
Automatic Variables
Dynamic Memory
Global Memory
The PlayerList Sample
Exercises
What’s next?



Lesson 2 – Data Structures

In Lesson 1 we used the data types that are intrinsic to C and C++, such as int, float, and char. As
intrinsic types, these data types are directly supported by the language. In a sense, these are the only types
of data that are supported. For game programming, all of the creepy monsters, hordes of aliens, simulated
vehicles, and the virtual worlds in which these entities exist, are ultimately represented by these simple
data types.

Thankfully, C and C++ allow us to use the intrinsic data types to construct larger and more complex
types. The intrinsic data types can be used directly, or as building blocks from which virtually any entity
or system of entities can be represented.

Once a complex data type has been defined, it can be used just like an intrinsic type. Variables of that
type can be declared, passed to functions, saved to disk, and manipulated by assigning new values. The
only difference is that the complex data structure is bigger (it occupies more memory), and it represents
something that is more specific than each of its individual parts.

In this lesson, in addition to learning how to create complex data structures, we’ll learn how to manage
them efficiently: how to pass them to functions, and manage collections of data structures. We’ll also
learn the different ways in which the memory required to represent these data types can be allocated.
Complex Data Types
There are two basic ways in which multiple data elements can be assembled into a larger data element.
The first is to assemble a collection of homogenous types. This is called an array. An array uses a single
data type as a building block, and is a new data type only because it represents two or more instances of
that type.

Arrays can be used to represent a collection of any data type. In Lesson 1, we used an array of the char
type to represent a string. In this case each element in the array represents a character, and together the
characters represent a text string. Strings are often used in games, to store player names, display in-game
status information, and to display menus. Games can also use arrays of other data types. Some examples
are lists of scores, lists of monsters, or lists of network addresses. We’ll continue our discussion of arrays
in the next section.

The second method of defining a complex type is to use heterogeneous data types to create a new data
type. This method allows any number of different data types to be used to define a new category of data
type. This is called a structure.

Unlike an array, which is formed as a collection of similar items, a structure can be used to represent
entities that require multiple data types for representation. In a racing game, for example, a car might be
represented as a structure that contains the make, model, weight, dimensions, fuel capacity, and handling
characteristics of the vehicle. Structures are used extensively in games, and are best defined, at least in a
rough form, early in the game development process. We’ll learn how to create and use structures after
we’ve covered arrays.

If you’re familiar with the concepts behind object-oriented languages, you might be asking yourself why
objects haven’t entered the picture yet. We will cover objects in detail in Lesson 3, building directly on
what we cover in this lesson. By concentrating on data structures now, we’ll have less to digest when we
introduce objects, because objects rely on the same data structures we’re using in this lesson.

Arrays
C++ uses square brackets to denote an array. Declaring an array looks very much like a non-array
variable declaration, but arrays require that the variable name be followed by square brackets. Typically
the square bracket set contains the number of elements in the array. We can declare an array of integers
like this:

int playerScores[8];


This snippet declares an array of 8 integers that are collectively represented by the variable name
playerScores. Each element of the playerScores array has int for a data type, and contains a value that
can be inspected or modified using any operator that is appropriate for integers.

Arrays use square brackets for declaration, and for accessing array elements. For declaration, the brackets
contain the array size. For accessing array elements after the array has been declared, the brackets contain
a numeric value called an index that indicates the desired element. Using the playerScores array declared
earlier, a value can be stored in the first array element like this:

playerScores[0] = 1;

This assigns the value 1 to the first element in the array. In this example we’ve used an index of zero,
which indicates the first element.

C++ uses a zero-based indexing scheme: the first element of an array is indexed as 0, not 1. This means
that the last element of the array in our example has an index of 7, and not 8. This frequently leads to
bugs for people with experience in Basic or Pascal, where arrays are commonly indexed starting at 1.

The code above demonstrates how a single array element can be assigned. To assign all of the elements in
this array, we could use eight similar assignments, each with a different index, like this:

playerScores[0] = 0;
playerScores[1] = 0;
playerScores[2] = 0;
playerScores[3] = 0;
playerScores[4] = 0;
playerScores[5] = 0;
playerScores[6] = 0;
playerScores[7] = 0;


Clearly, this is impractical for large arrays. Alternatively a loop can be used to iterate through the array:

for (int i = 0; i < 8; i++)
{
    playerScores[i] = 0;
}

Instead of a literal index, we’re using the variable i as the index, so that each element of the array is
affected in turn. In this case we’re setting each player score to zero using the assignment operator.

Notice that we’re using the number 8 in the for loop as part of the terminating condition. This works only
because the terminating condition requires that i be less than 8. If we used the “less than or equal to”
operator (<=) instead, the loop would assign nine array elements, as shown here:


for (int i = 0; i <= 8; i++) // out of bounds error (not detected by the compiler!)
{
    playerScores[i] = 0;
}

This loop is problematic because it assigns a ninth array element, which does not exist. The last iteration
of this loop sets i to 8, which, because C++ arrays are indexed starting with zero, indicates the ninth
element. The assignment therefore writes to memory outside the array.
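
One way to guard against this kind of mistake is to test an index before using it. The following is a minimal sketch rather than part of the lesson's samples; the names kPlayerCount and isValidIndex are hypothetical and were chosen only for illustration.

```cpp
// Hypothetical constant: the number of elements in the score array.
// Naming the size avoids repeating the literal 8 throughout the code.
const int kPlayerCount = 8;

// Hypothetical helper: returns true when index refers to an existing
// element of an array that holds count elements (valid: 0 .. count - 1).
bool isValidIndex(int index, int count)
{
    return index >= 0 && index < count;
}
```

A caller would test isValidIndex(i, kPlayerCount) before reading or writing element i. With eight elements, the check accepts the indices 0 through 7 and rejects 8.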

The square bracket syntax used to indicate array elements can be used on either side of the assignment
operator. This loop, for example, retrieves each element value, and displays it on the screen:

for (int i = 0; i < 8; i++)
{
    int score = playerScores[i];
    cout << "player " << i << " has a score of " << score << endl;
}

The loop above uses an integer called score to store each retrieved value, and then provides it to cout.
Alternatively this temporary value can be omitted, like this:

for (int i = 0; i < 8; i++)
{
    cout << "player " << i << " has a score of " << playerScores[i] << endl;
}

Each array element can be manipulated using the arithmetic operators. Since our example array contains
player scores, a score might be incremented like this:

playerScores[4] = playerScores[4] + 100; // adds 100 to the 5th score

Or, using the C++ shorthand notation:

playerScores[4] += 100; // exactly the same as above

Similarly, any function that accepts an int as an argument can accept an element of our array. For
example, we can write a function that has this prototype:

void DisplayPlayerScore( int s );

We can call it like this:


DisplayPlayerScore( playerScores[2] );

This function call passes the 3rd score in the array to the DisplayPlayerScore function. Alternatively, a
loop could be used to pass each score to DisplayPlayerScore, like this:

for (int i = 0; i < 8; i++)
{
    DisplayPlayerScore( playerScores[i] );
}


Providing an index to an array selects an array element. This array element is a variable of that type, and
as such can be used in any context that is legal for that type. Attempting to pass the entire array in
the same context is illegal:

DisplayPlayerScore( playerScores ); // compiler error!

This won’t work because DisplayPlayerScore, as we’ve defined it, takes an integer, and not an array
of integers. In order to pass the entire array to a function, the function in question would have to accept an
array instead of an integer. A function with this prototype, for example:

void DisplayAllPlayerScores( int s[8] );

Would accept the entire array, like this:

DisplayAllPlayerScores( playerScores );


For reasons that we’ll discuss later in this lesson, passing entire arrays to functions in this manner is best
avoided in most cases.

Like non-array variable declarations, a newly declared array has an undetermined state. The playerScores
array used in this section, for example, contains 8 integer elements that have virtually random values.
Using these values without first assigning known values will likely lead to undesirable results.

Arrays can either be initialized using a loop shortly after declaration, as shown previously, or initialized
when they are declared. In Lesson 1 we learned that intrinsic data types can be declared and initialized in
a single statement, as shown below:

int variable1 = 0;
float variable2 = 0.0f;

Likewise, an array of scores might be initialized like this:

int scores[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };

This declaration initializes each score to a known value. The first value to appear (zero) is assigned to the
first array element, the second value to the second array element, and so forth. Initializing arrays requires
that the declaration be followed by the assignment operator and that the initial values are enclosed in
curly braces, separated by commas.

All of the intrinsic data types support this notation. Here are some more examples:

bool engineStates[4] = { true, true, false, true };
float fueltankStates[2] = { 100.0f, 0.0f };

The declarations above create an array of Booleans and an array of floating point values. Each array
element is assigned initial values, so there’s no ambiguity about the array contents.


When initial array values are provided in this fashion, there is no need to provide the array size. The
previous declarations can also be written this way:

bool engineStates[ ] = { true, true, false, true };

float fueltankStates[ ] = { 100.0f, 0.0f };

The compiler can determine the size of the array for us by counting the number of initial values provided,
so the two arrays above have 4 and 2 elements respectively.

The char data type, when used in an array, is the standard format for representing text strings. As such,
char arrays get some special treatment. This support takes the form of special initialization syntax, and a
host of string handling functions that are provided in the standard libraries. For example, consider these
two declarations, which result in two arrays with identical content.

char str1[] = { 'a', 'b', 'c', 0 };
char str2[] = "abc";

The first declaration initializes each array element using the typical array initialization syntax, whereas
the second uses a string literal. Notice that the first declaration, in addition to being harder to read,
requires that we add a null-terminator explicitly (the zero as the last array element). Failing to do so
would cause subsequent string operations to fail. In the second declaration, the compiler automatically
adds a null-terminator. As a result, both of these arrays have a length of 4. Notice also that C++ requires
that individual characters be enclosed in single quotations when used as literals. Of the two declarations
above, the latter is preferred, due to better readability and the implicit null-termination.
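
To confirm that the two declarations really do produce the same string, we can compare them with strcmp and measure them with strlen, both of which are standard library functions declared in the cstring header. This is only a sketch; the wrapper function sameContent is a hypothetical name used for illustration.

```cpp
#include <cstring>

// Hypothetical helper: returns true when the two declarations from the
// text hold the same null-terminated string.
bool sameContent()
{
    char str1[] = { 'a', 'b', 'c', 0 };
    char str2[] = "abc";

    // strcmp returns 0 when both strings match character for character.
    // strlen counts the characters before the terminator, so it reports 3
    // even though each array occupies 4 elements.
    return std::strcmp(str1, str2) == 0 && std::strlen(str2) == 3;
}
```

Note the difference between the array length (4, including the terminator) and the string length reported by strlen (3).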

Not every array element must be initialized during declaration. If you wanted the first two array elements
to be initialized, but didn’t care to assign values to the remainder of the array, you could write this:


int scores[8] = { 100, 200 };

This declaration creates an array with 8 elements, and initializes the first two elements to 100 and 200.
The remaining six elements are not explicitly initialized. Interestingly, the presence of an array
initializer list prompts the compiler to initialize any remaining elements to zero. For example:

long largeArray[64000] = { 0 };

In this example, despite the fact that only one array element is explicitly initialized, all 64,000 elements
will be initialized to zero. The same rule applies when the explicit value is not zero. In this example:

long largeArray[64000] = { 1 };

The first element is assigned the value 1, and the remaining 63,999 elements are assigned a zero value.
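
The zero-fill rule is easy to check by adding up the elements of a partially initialized array. The sketch below reuses the scores declaration shown earlier; the function name sumOfScores is hypothetical.

```cpp
// Hypothetical helper: adds up all 8 elements of an array in which only
// the first two elements were given explicit values.
int sumOfScores()
{
    int scores[8] = { 100, 200 }; // elements 2 through 7 are set to zero

    int sum = 0;
    for (int i = 0; i < 8; i++)
    {
        sum += scores[i];
    }
    return sum; // 100 + 200 + six zeros
}
```

Because the six remaining elements are zero, the sum is exactly 300. Without the initializer list the elements would hold unknown values and the result would be unpredictable.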

It is important to remember that array elements are not initialized unless at least one element is initialized.
Here are some examples that explore the different array declaration notations:

int a[10];                       // all 10 array elements contain unknown values
int b[10] = { 0 };               // all 10 array elements are initialized to zero
int c[10] = { 100 };             // the first element is assigned to 100, the rest to zero
int d[];                         // illegal (compiler error): array size or initializer list required
int e[] = { 33, 44 };            // array size is 2, as determined by the initializer list
char f[] = "zxy";                // creates a char array, or string, that has a length of 4
char g[] = { 'z', 'x', 'y' };    // acceptable but risky (should not be treated as a string)
char h[] = { 'z', 'x', 'y', 0 }; // safe, but not as readable as f



Note that if the g array is used as a string, a bug is likely due to the lack of a null-terminator. Furthermore,
the size of the array is set to 3 because 3 values are provided in the initializer list, so there is no room in
the array to add a terminator later without shortening the effective string length to 2.

A common misconception is that a string that is initialized during declaration cannot be modified because
its value was assigned at declaration. We’ll talk about conditions where this is true in Lesson 3, but in all
of the declarations we’ve looked at so far, the contents of the resulting array can be modified at will.

To demonstrate this fact, and write some code that manipulates an array, let’s write a sample called
StringEdit. This sample displays a string, and allows the user to modify the contents of the underlying
array by specifying an index. The StringEdit sample looks like this:

[Figure: output of the StringEdit sample]



The StringEdit sample uses a char array to represent a string. The array is initialized with text that
describes the array, and is displayed using cout. This string appears between the two dotted lines, as
shown above. The user is then prompted for an index or one of three special values. Entering the number
44 causes the user to be prompted for an index at which a null-terminator is to be added (44 is used
because it is not a valid index—the string has just 40 elements.) This has the effect of truncating the
string, demonstrating the significance of the value zero when used in a string. Entering 55 prompts the
user for an index where a space will be assigned to the array (we’re using cin for input, which doesn’t
accept spaces as input.) Entering a valid index (0 through 38) prompts the user for a character to be placed
in the array. Entering negative 1 (-1) causes the sample to terminate.

A valid index is a value between zero (indicating the first array element) and 38. We prevent the user
from modifying index 39 because that element is used for the null-terminator. If this element were to be
reassigned, cout would display any and all characters until the value zero is encountered. This would
likely involve the display of a large amount of data outside our array, and would constitute a bug.
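
The truncating effect of a null-terminator can be seen in isolation with a few lines of code. This is a minimal sketch and not part of the StringEdit sample itself; the function name lengthAfterTruncation is hypothetical, and strlen is the standard library function that counts characters up to the first terminator.

```cpp
#include <cstring>

// Hypothetical helper: writes a null-terminator at the given index of a
// local copy of the sample's string and returns the resulting string
// length, or -1 if the index is outside the range 0 through 38.
int lengthAfterTruncation(int index)
{
    char str[40] = "initial string - declared char str[40]";

    if (index < 0 || index >= 39)
        return -1;

    str[index] = 0; // everything from this element onward is now ignored
    return static_cast<int>(std::strlen(str));
}
```

Placing the terminator at index 7, for example, leaves only the word "initial" visible to string functions, so the reported length is 7. The characters after the terminator are still stored in the array; they are simply no longer treated as part of the string.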



The StringEdit sample is implemented using just the main function. A loop is used to repeatedly display
the current state of the array, and to process user input. The main function appears here, in its entirety:

#include <iostream>
using namespace std;

int main()
{
    char str[40] = "initial string - declared char str[40]";

    while ( true )
    {
        cout << endl;
        cout << " string contents " << endl;
        cout << str << endl;
        cout << " " << endl << endl;
        cout << "Enter index (0-38), 44 for a terminator, 55 for a space, or -1 to quit: ";

        int index;
        cin >> index;

        if (index == -1)
            return 0;

        if (index >= 0 && index < 39)
        {
            // replace the character at a valid index
            cout << "enter new character: ";
            char ch;
            cin >> ch;
            str[index] = ch;
        }
        else if (index == 44)
        {
            // truncate the string by writing a null-terminator
            cout << "Enter index for terminator: ";
            int terminatorIndex;
            cin >> terminatorIndex;
            if (terminatorIndex >= 0 && terminatorIndex < 39)
                str[terminatorIndex] = 0;
            else
                cout << "Invalid index for terminator" << endl;
        }
        else if (index == 55)
        {
            // cin skips whitespace, so spaces get their own command
            cout << "Enter index for space: ";
            int spaceIndex;
            cin >> spaceIndex;
            if (spaceIndex >= 0 && spaceIndex < 39)
                str[spaceIndex] = ' ';
            else
                cout << "Invalid index for space " << endl;
        }
        else
            cout << "invalid index" << endl;
    }

    return 0;
}



A while loop is used, but instead of testing a variable condition, as is normally the case, the true keyword
is used. This creates an infinite loop that iterates forever—in theory. The sample provides a way out of the
loop, by simply returning from the function. The loop is forced to terminate when the user enters an index
of -1 because a return statement is executed which causes the entire function to terminate.

The remainder of the cases are handled using an if/else if construct that handles each case. The array itself
is named str, and is modified according to user request.
Structures
Now let’s turn our attention to another form of complex data types: structures. Structures allow any
number of data types to be combined into a new data type. This allows new types to be created that
represent virtually any entity. The resulting type can be used to declare variables, define function
argument lists, and, as we’ll learn in Lesson 3, these new types can be made to behave just like intrinsic
data types and even perform custom operations. Furthermore, the information provided in this section
applies to the creation of objects, as we’ll see in Lesson 3.

Structures are supported through the struct keyword, which prefaces the definition of a new structure.
This keyword is followed by a name for the structure, and a body containing one or more variable
declarations. Let’s begin with a simple example that represents a game player. We’ll include two data
items inside the structure: a string for the player name, and an integer indicating his score:

struct Player
{
char name[80];
int score;
};

The structure body is enclosed in curly braces, and is followed by a semicolon. The body contains
variable declarations that collectively form the new data type. The Player structure contains two such
variables, one for the player name, and one for the player’s score. These entries are often called fields, but
it is more accurate to call them data members. In this case the Player structure is said to have one data
member called name, and one called score.

At first glance a structure definition might look like a function definition, but the two differ in the
following ways:

• Structure definitions begin with the struct keyword
• Structure definitions do not have argument lists
• Instead of statements, loops, and conditionals, structures contain only data member declarations
• Structure definitions are terminated with a semicolon (functions aren’t)

It is important to note that the data member definitions in a structure are not variable declarations. The
Player structure defined above is a data type—not a variable. Although the structure appears to have
variables inside it, they are used only to describe the format of the Player structure. As a data type, the
Player structure provides a data description, or blueprint, but by itself is not functional. It cannot store
values, and it occupies no memory. The Player structure does not represent a single player—it is a data
type that can be used to represent a player.

A structure can be used to create variables that can store values and occupy memory. A variable
declaration for the Player structure looks like this:


Player player1;

The syntax is exactly like that of a variable that is based on an intrinsic type. This is because we have
created a new data type: a Player. It isn’t provided by C++, but, now that we’ve created it, we can use it
as though it were.

As with variables based on intrinsic types, the new variable created above (player1) is uninitialized. We
can assign values to its data members using the dot (.) operator, like this:


player1.score = 0;

The name to the left of the dot operator indicates the variable based on the structure, and the name to the
right indicates the data member within the structure that we wish to access. In this case we’re accessing
the score field (an int) and setting it to zero.

It is logical to assume that we would proceed by assigning the name field in the same fashion, like this:

player1.name = "unnamed player"; // compiler error!

The problem is that the name field is a char array, and the assignment operator cannot be used on arrays
except during initialization. We can’t use the assignment operator here because the player1 variable has
already been declared. Instead, we can use the strcpy function, which is provided in the standard C++
library (strcpy is declared in the string.h header file):

strcpy( player1.name, "unnamed player" );

The strcpy function copies string data, and takes two arguments. The first is the string to be assigned (the
‘destination’ string), and the second is the source string. The code above uses the strcpy function to
assign the string “unnamed player” into the data member name contained within the player1 variable.
(We’ll study strcpy in more detail later in this lesson when we talk about pointers.)
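
Putting the two assignments together, a Player variable can be filled in after declaration as sketched here. The function name makeUnnamedPlayer is hypothetical; the Player structure is the one defined in this section, and strcpy is the standard function described above.

```cpp
#include <cstring>

struct Player
{
    char name[80];
    int score;
};

// Hypothetical helper: declares a Player, assigns both data members,
// and returns the result to the caller.
Player makeUnnamedPlayer()
{
    Player p;

    p.score = 0;                           // int member: plain assignment
    std::strcpy(p.name, "unnamed player"); // char array member: copy the text

    return p;
}
```

Keep in mind that strcpy does not check the size of the destination. The source string, terminator included, must fit within the 80 elements of the name array.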

Alternatively, the player1 variable can be initialized in the same way that we initialized arrays in the
previous section, like this:

Player player1 = { "unnamed player", 100 };

Initializing a structure looks just like array initialization, except that each initial value must
match the type of the data member to which it refers. In this case we’re using a string to initialize the
first data member (name), and an integer to initialize the second data member (score). This is clearly
preferable, but is limited to declaration. This syntax can’t be used on a variable that already exists. Also,
the initializer values must appear in the order in which the data members are declared in the structure.
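
As with arrays, a structure initializer list does not have to supply every value. In the sketch below only the name is provided, and the score member, which has no initializer, is set to zero. The function name defaultScore and the player name used are hypothetical.

```cpp
struct Player
{
    char name[80];
    int score;
};

// Hypothetical helper: returns the score of a Player that was
// initialized with a name only.
int defaultScore()
{
    Player p = { "player one" }; // score is not listed, so it is set to zero
    return p.score;
}
```

Some compilers issue a warning about the missing initializer, but the declaration is legal and the resulting score is zero.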
Mixing Complex Types
Arrays and structures can be mixed in any fashion. You can create arrays of structures, structures
containing arrays, arrays of arrays, and structures that are composed of other structures. In addition, you
can embed structures and arrays into larger types over and over. You can define structures within
structures within structures, arrays within arrays, within arrays and…well, you get the idea. Let’s look at
some examples:

struct Projectile
{

float locationX, locationY;
float velocityX, velocityY;
};

Projectile projectileArray[100];

In this example, a structure called Projectile is defined that contains data members representing the
current location and velocity of a bullet or missile. The Projectile structure is then used to declare an
array containing 100 projectiles. This is an example of an array of structures. Modifying or inspecting the
contents of this array requires both the square bracket syntax and the dot operator, like this:

projectileArray[0].locationX = 1000.0f; // assigns the locationX data member of the 1st element
projectileArray[99].velocityX = 0.0f; // assigns the velocityX data member of the last element
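
Arrays of structures are usually processed in a loop, combining an index with the dot operator on each pass. The following sketch moves each projectile along its velocity. It rests on assumptions that are not in the text: velocities are taken to be in units per second, and the names advance and firstLocationAfterOneSecond are hypothetical.

```cpp
struct Projectile
{
    float locationX, locationY;
    float velocityX, velocityY;
};

// Hypothetical helper: returns a copy of p moved along its velocity for
// the given number of seconds (assumes velocity is in units per second).
Projectile advance(Projectile p, float seconds)
{
    p.locationX += p.velocityX * seconds;
    p.locationY += p.velocityY * seconds;
    return p;
}

// Hypothetical demonstration: moves every projectile in a small array one
// second forward and returns the new X location of the first element.
float firstLocationAfterOneSecond()
{
    // Only the first element is given values; the other two are zeroed.
    Projectile projectiles[3] = { { 0.0f, 0.0f, 10.0f, 5.0f } };

    for (int i = 0; i < 3; i++)
    {
        projectiles[i] = advance(projectiles[i], 1.0f);
    }
    return projectiles[0].locationX;
}
```

The first projectile starts at location 0 with an X velocity of 10, so after one second its X location is 10.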


We’ve already seen an example of a structure that contains an array: the Player structure:

struct Player
{
char name[80];
int score;
};

Previously we used the dot operator to assign the name array with the help of the strcpy function.
Alternatively, each element in the array can be accessed using the square bracket syntax together with the
dot operator:

Player p1;
p1.name[4] = 'v'; // assigns the character 'v' to the 5th character of the player name
p1.name[79] = 0; // adds a null-terminator to the end of the player name data member in p1

Structures and arrays can be used as building blocks for new types. For example, we can use the Player
structure as the basis for a new structure, like this:

struct GamePlayers
{
Player allPlayers[10];
};

This example uses the Player structure to define a new structure called GamePlayers. While Player
represents a single player, the GamePlayers structure represents all of the players in the game (the game
is limited to 10 players in this case). The GamePlayers structure contains an array of Player structures.
GamePlayers is an example of a structure that contains an array, and the allPlayers field is another
example of an array of structures. As the complexity of the data type increases, so does the syntax
required to access variables of that type. Accessing individual characters of a player’s name, for example,
requires two dot operators and two array indices:

GamePlayers players;

players.allPlayers[4].name[0] = 'a'; // assigns 'a' to the first character of the 5th player name

This code creates a variable called players that represents all of the players in a game. The next line
assigns the character ‘a’ as the first character of the 5th player’s name.
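
The same nested syntax works for reading values. As a sketch, the function below searches a GamePlayers variable for the player with the highest score. The name highestScoreIndex is hypothetical, and the structure is passed by value purely for simplicity, which means the whole structure is copied on each call.

```cpp
struct Player
{
    char name[80];
    int score;
};

struct GamePlayers
{
    Player allPlayers[10];
};

// Hypothetical helper: returns the index of the player with the highest
// score. The argument is passed by value, so all 10 players are copied.
int highestScoreIndex(GamePlayers players)
{
    int best = 0;
    for (int i = 1; i < 10; i++)
    {
        if (players.allPlayers[i].score > players.allPlayers[best].score)
            best = i;
    }
    return best;
}
```

If the 5th player (index 4) holds the only non-zero score, the function returns 4. When several players share the highest score, the lowest index among them is returned.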

Memory Usage
When I was in college, I had a professor who was obsessed with saving memory. His assignments were
usually along the lines of “write a program that does such and such using no more than 6 bytes.” His
fascination with saving memory was born of his having learned how to program on computers that
had less than a kilobyte of RAM (fewer than 1024 bytes).

It is not uncommon for modern PCs to be equipped with 512 megabytes of RAM, or 536,870,912 bytes.
Clearly, concerning yourself with saving a byte or two is unnecessary. Why then, given the ample
memory stores on today’s desktops, are we looking at memory usage?
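
As a quick check on the scale involved: a kilobyte is 1024 bytes and a megabyte is 1024 kilobytes, so a size given in megabytes can be converted to bytes with two multiplications. The function name megabytesToBytes is hypothetical and is used here only to show the arithmetic.

```cpp
// Hypothetical helper: converts a size in megabytes to bytes,
// using 1 MB = 1024 * 1024 bytes.
unsigned long megabytesToBytes(unsigned long megabytes)
{
    return megabytes * 1024UL * 1024UL;
}
```

For 512 megabytes the result is 512 × 1,048,576 = 536,870,912 bytes.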

The problem is that, while each intrinsic type is very cheap to use memory-wise, they add up quickly,
especially when arrays are involved. In Lesson 1 we learned that variables occupy memory, and that the
amount of memory a variable requires depends on its data type. Here’s a table with the key intrinsic types,
and the memory required by each variable of that type. (The values displayed are for 32-bit compilers.)


Memory usage for intrinsic types

Type      Bits   Bytes
bool         8       1
char         8       1
short       16       2
long        32       4
int         32       4
float       32       4
double      64       8

The size of any data type can be retrieved using the sizeof operator. This operator, given a data type,
returns the number of bytes required to represent variables of the given type. The values in the table
above can be displayed with this code:

#include <iostream>
using namespace std;

int main()
{
    cout << "sizeof(bool) = " << sizeof(bool) << endl;
    cout << "sizeof(char) = " << sizeof(char) << endl;
    cout << "sizeof(short) = " << sizeof(short) << endl;
    cout << "sizeof(int) = " << sizeof(int) << endl;
    cout << "sizeof(long) = " << sizeof(long) << endl;
    cout << "sizeof(float) = " << sizeof(float) << endl;
    cout << "sizeof(double) = " << sizeof(double) << endl;

    return 0;
}

The sizeof operator also accepts variable names, so it can be used this way as well:


float f;
cout << "sizeof(f) = " << sizeof(f) << endl;

The size of any data type can be retrieved, be it intrinsic or otherwise, so we can use sizeof to determine
the size of structures and arrays. For example:

struct Player
{

char name[80];
int score;
};

cout << sizeof( Player ) << endl;

This snippet displays the value 84 because the Player structure contains an 80 byte array (one byte for
each char), and 4 bytes for the score integer. The GamePlayers structure, because it contains an array of
10 Player structures, requires 840 bytes:

struct GamePlayers
{
Player allPlayers[10];
};

cout << sizeof( GamePlayers ) << endl;
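
Because sizeof works on arrays as well as on individual elements, dividing one by the other gives the number of elements without hard-coding it. This is a sketch of that common idiom; the function name playerCapacity is hypothetical.

```cpp
#include <cstddef>

struct Player
{
    char name[80];
    int score;
};

struct GamePlayers
{
    Player allPlayers[10];
};

// Hypothetical helper: computes the number of elements in the allPlayers
// array from the size of the array and the size of one element.
std::size_t playerCapacity()
{
    GamePlayers players = {};
    return sizeof(players.allPlayers) / sizeof(players.allPlayers[0]);
}
```

The result is 10 regardless of how large a Player is. That matters because the exact size of a structure can vary between compilers, which are allowed to insert unused padding bytes between or after data members.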

So, although these structures and arrays ultimately contain small intrinsic data types, the combination of
small types results in higher memory requirements. For the next few sections, we’ll talk about the
implications of using complex data types due to their memory requirements, and discuss the memory
allocation options that C++ provides. First, however, we need to discuss pointers.
Pointers
Up to this point, our code snippets and samples have been written without the use of pointers. Until now,
we didn’t need them. Despite their bad reputation, we could certainly have introduced them earlier, but
instead we waited until we had two specific uses for them. This way we can put pointers to use
immediately after we talk about the necessary concepts and syntax.

A pointer is similar to intrinsic types in the sense that it relies on data type definitions and variable
declarations. Unlike the intrinsic types, however, all pointers serve one purpose: the storage of memory
addresses.

What is a memory address? All of the data stored in a program is stored in the memory installed in the
computer. For 32-bit operating systems, memory is arranged in a linear fashion, and can be addressed
starting at address zero and continuing upward in 1 byte increments until the end of the installed memory
is reached. The addresses used to indicate specific locations within memory are merely numbers, but are
almost always expressed in hexadecimal (base 16) instead of decimal (base 10). To differentiate
hexadecimal numbers from base 10 numbers, C and C++ use ‘0x’ as a prefix. A typical memory address
therefore looks like this: 0x00A44C30, which indicates the 10,767,408th location in memory.
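
Hexadecimal and decimal are simply two ways of writing the same number, which a short sketch can confirm. The names kExampleAddress and sameNumber are hypothetical; the value is the example address from the paragraph above, stored as an ordinary number rather than as a pointer.

```cpp
// The example address from the text, written as a hexadecimal literal.
// It is stored here as a plain number, not as a pointer.
const unsigned long kExampleAddress = 0x00A44C30;

// Hypothetical helper: returns true when the hexadecimal literal and
// its decimal form are the same value.
bool sameNumber()
{
    return kExampleAddress == 10767408UL;
}
```

The compiler converts both literals to the same binary value, so the comparison is true. The 0x prefix only affects how the number is written in the source code.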

Although it hasn’t been explicit, we have been using memory addresses all along. Each time we declare a
variable, the compiler generates code that reserves memory at an address of its choosing. This address is
where the variable data is stored in memory. When we compile a code statement that assigns a value to a
variable, the compiler generates code that stores the provided value at the memory address that represents
the variable. We don’t see the memory address, but we’re writing code that manipulates the memory at
that address. The figure below illustrates the relationship of a variable to the memory it represents:





In this figure an integer called var is declared and then assigned a value. The memory required to
represent var is represented as a square. The state of the memory corresponding to the variable is
modified by each code statement on the left. The variable declaration itself allocates the memory required
to represent the new variable (in this case 4 bytes – recall that an integer consumes 4 bytes of memory),
but, unless an initializer is used, the memory contains an undetermined value. The second operation—an
assignment—changes the contents of the memory. In the figure, the arbitrary memory address of 0x9898
is used as the starting address where the memory representing the var variable resides. The code makes
no explicit mention of any memory address; the address is used implicitly. You should note as well that
since the integer consumes 4 bytes of memory, the ending address for this variable would be 0x989B.
Thus the four bytes of linear physical memory used to store this single integer value are 0x9898, 0x9899,
0x989A, 0x989B.

By using pointers, we can gain access to values via memory addresses instead of variable names. Why
would we want to do that? For reasons that we’ll discuss in the next two sections, we won’t always have
access to the variable name. Accessing its value via its address will be our only option.

C and C++ assign special properties to two operators for pointer declarations and operations: the asterisk
(*) and the ampersand (&). Both of these operators are also used in multiple contexts. The asterisk, for
example, in addition to serving as the multiplication operator, has two pointer-related meanings as well.
Still, the meaning of these operators can always be determined through the context in which they are used.

The asterisk, when used with pointers, is often called the star operator. This code, for example, declares
an integer pointer:

int* pVar;

A pointer is a variable. It has a data type and a variable name. In this case the data type is “int *” (the star
is part of the type, making the resulting variable a pointer), and the name of the pointer is pVar.


Except for the presence of the star operator, this declaration is identical to that of an integer. The star tells
the compiler that instead of an int variable, we want a variable that is a pointer to an int. The variable
name is prefaced with a ‘p’ to indicate that it is a pointer. (This is a naming convention—it is not required
that pointer names be prefaced in this fashion, but is a common practice among C++ programmers.)

The new pointer, as it appears above, is un-initialized; its value is unknown, and therefore should not be
used before it is assigned to a safe value. For this reason, pointers are often declared like this:


int* pVar = 0;

This initializes pVar to zero as soon as it is declared. Because this is a pointer, the assigned value has a
special meaning: it is a memory address. In fact, zero is a special memory address, because it is reserved.
The memory at address zero, although it exists, is never used to store program data. The compiler will let
you assign pointers to zero, and it will generate code that attempts to access the “0th” memory address
without errors or warnings, but on a typical desktop operating system these attempts fail at runtime.

It is for exactly this reason that pointers that aren’t actively pointing to valid data should be assigned to
zero. This way, if you inadvertently forget to assign a pointer with a valid memory address, the executable
will fail with an error that indicates that a “null pointer” was used, and you’ll know what the problem is.
Leaving pointers “dangling” (not pointing either to valid data or to zero) is a dangerous practice, because
you’re guaranteed no such error message when the pointer is used, making the bug much harder to find.

With this in mind, let’s assign our new pointer to a valid, non-zero memory address right away. In order
to do so, we’ll need an additional variable of the appropriate type that we can assign the pointer to “point”
to:


int var = 100;
int* pVar = &var;

The pVar pointer now appears along with a declaration of an integer called var, which is initialized to
100. The pointer is initialized to point to var, with the help of the ampersand operator, which serves as the
address of operator. This assignment can be described as “assign to pVar the address of var.” The pVar
variable now contains a value that indicates the location of var in memory. pVar points to var. This is
illustrated below:



The first line of code declares and initializes an integer called var. The second line declares and initializes
a pointer. This pointer is initialized with the “address of” the variable var. The result is that pVar points
to var.

In this figure, we’re using an arbitrary address, but it’s important to point out that this address could be
any value. We have no control over where var is stored. Still, we can display memory addresses
contained within pointers, like this:

cout << "var is stored at memory address: " << pVar << endl;


On my computer, using Visual C++ to compile a debug build, this code results in this output:

var is stored at memory address: 0x0012FF7C

But, compiling a release build yields this output:


var is stored at memory address: 0x0012FF80

And the Borland compiler produces this output:

var is stored at memory address: 0012FF88

In each case the memory address displayed is correct; it indicates the location of the var variable. But this
location is determined at run-time, and, obviously, is not consistent. This means that if the address of a
variable is required, it must be retrieved at runtime. Including literal memory addresses in your source
code is asking for trouble, because what is stored at the given location can change according to
compilation settings, and even from one execution to another.

So, what does it mean now that pVar points to var? pVar now contains the memory address where var is
stored, so we can access the value of var without using the var variable. For example, we can retrieve the
value of var like this:

int var = 33;
int* pVar = &var;
int var2 = *pVar;

The added code retrieves the value stored at the memory location indicated by pVar, and stores it in a
new variable called var2. In this case the asterisk is used to de-reference the pointer, returning not the
memory address contained in pVar, but the data contained at the memory address. Because we initialized
var to 33, var2 is now equal to 33 as well.

When used with pointers, the asterisk has two meanings, depending on context (it is the multiplication
operator when it is not used with pointers). When used as part of a data type during variable declaration, it
indicates that the resulting variable should be a pointer. Alternatively, when used to prefix a pointer that
has already been declared, it serves as the de-reference operator.


Pointers can also be used to assign values. For example, we can overwrite the value contained in var like
this:

*pVar = 200;

Now the value of var has changed even though we didn’t use the var variable. We can verify this by
adding some statements to display the value of our variables:

int var = 33;
cout << "var is initialized to " << var << endl;

int* pVar = &var;
cout << "pVar is assigned to point to var (memory address " << pVar << ")" << endl;

int var2 = *pVar;
cout << "var2 is initialized to *pVar (var2 is now " << var2 << ")" << endl;

*pVar = 200;
cout << "*pVar is assigned to 200" << endl;

cout << "final result: var is " << var << ", var2 is " << var2 << endl;

This code is available for download as the Pointer sample, and produces this output:



The result is that we were able to read and assign the value of a variable without using the variable name.
For code that appears in a single function, as is the case with the Pointer sample, it certainly would have
been easier to merely use the variable name. But, as we’ll see in the next section, sharing variables
between functions is a situation where the simple use of a variable name isn’t possible; pointers must be
used instead.
Pointers and Functions
Functions normally have inputs and outputs. Input is data provided in the form of parameters, which are
used in the function body to perform calculations. Output typically takes the form of a return
value. However, there are situations where parameters must be used for output as well. If a function
provides two output items, for example, the return value isn’t sufficient, as it can only provide one value.

In the Celsius sample from the previous lesson, we used functions to divide the work of gathering user
input, performing the conversion from Fahrenheit, and displaying the results. Some of the functions we
used accepted arguments that allowed input values to be provided. The DisplayOutput function, for
example, took two such parameters:

void DisplayOutput(float f, float c)
{
cout << f << " degrees Fahrenheit is " << c << " Celsius" << endl;
}

In order to call this function, the calling function must provide two variables of the float type. This
function uses the provided values to display the conversion results.

This style of parameter usage is called passing by value because it allows one function to provide
another with the values of one or more variables. The calling function shares values with the function it
calls, but each function has its own copy of the values. The fact that each function contains its own
variables can be demonstrated with this code:


void GetValue(int test)
{
test = 99;
}

int main()
{
int test = 0;

GetValue( test );

// what value does test have now?

return 0;
}

In this example the main function declares a variable called test, and initializes it to 0. It then calls a
function called GetValue using test as an argument. GetValue accepts a variable called test, to which it
assigns the value 99. The GetValue function then returns. What value does test have now? The answer is
zero, because main and GetValue both have their own copies of test. The copy of test in the main function
is assigned to zero, and that value (not the variable itself) is passed to GetValue. GetValue receives the
value zero in its own copy of test, which it then overwrites with the value 99, but this value is discarded
when GetValue returns. main’s copy of test is not altered by the function call.

The example above is misleading because both functions use the same variable name. We can clarify the
code by renaming one of the variables:

void GetValue(int testParam)
{
testParam = 99;
}


int main()
{
int test = 0;

GetValue( test );

return 0;
}

Now it’s easier to see why the variable in main isn’t affected by the call to GetValue. But the lesson here
isn’t that all the variables in your code should have different names. On the contrary, each function can
have variables that belong to it, and cannot be accessed by other functions. The programmer is free to
choose variable names that are specifically tailored to their use within the function. Because these
variables are specific to one function, they are called local, or automatic variables. All of the variables
we’ve used thus far have been local variables.

Local variables are advantageous because they are easy to create and they are not as prone to bugs as the
alternatives. But the downside is that local variables have a very limited lifespan: they exist in only the
function in which they are declared. Local variables are automatically discarded when the function in
which they reside returns. We’ll talk about the alternatives to local variables later in this lesson, but first
let’s explore how local variables can be shared between functions.


Local variables cannot be shared with other functions. The addresses of these variables, on the other hand,
can be shared through the use of pointers. In the case of the previous example, the function name
GetValue implies that we want the argument to be modified, so we can fix the code by changing it
to look like this:


void GetValue(int* pFuncTest)
{
*pFuncTest = 99;
}

int main()
{
int test = 0;

GetValue( &test );

// test is now 99

return 0;
}

Instead of passing the value of test to GetValue, main provides the address of test using the address of
operator. GetValue has been modified to accept a pointer to an int instead of an int. Because it now
accepts a pointer, the GetValue function must de-reference the parameter with the star operator in order
to assign a new value. The result is that the value of test is modified as a result of the function call. This is
called passing by reference. Instead of providing the value of a variable, the address of the variable is
given. Using pointers to pass variables by reference is one way to share data between functions. This is
just one example of how pointers can be used. We’ll talk about other uses soon, but first let’s talk about
another technique that can be used to share variables between functions.
References
Both C and C++ support pointers, so programmers that learned C first are usually comfortable with
pointers, and tend to use them liberally. As a result, even now that they are using C++, these programmers
tend to use pointers in situations where references are more appropriate. References are specific to C++,
and, although they can’t do everything that pointers can do, they are simpler to use, and provide safety
features that pointers lack.


Like pointers, references are data types that allow variables to be passed by reference instead of by value,
but without the potential for bugs associated with pointers. (The phrase “passed by reference” refers to the
communication of data between functions by address as opposed to value, and applies both to the use of
pointers and references.)

C++ uses the ampersand to denote a reference. This symbol also serves as the address of operator, but
the two uses are easy to tell apart: the ampersand denotes a reference only when it appears as part of a
data type in a declaration, whereas the address of operator is applied to a variable name in an
expression. Declaring a reference looks like this:

int score;
int& scoreRef = score;


This code snippet declares an integer called score, and an integer reference called scoreRef. The
reference is initialized using the score variable, which results in scoreRef being a reference to score.
scoreRef can now be used to access the data contained in the score variable.

Technically speaking, a reference works just like a pointer. The assignment shown above assigns the
address of score to scoreRef, not the value. Unlike a pointer, no special syntax is required to assign or de-
reference references, so scoreRef can be used as an alias, or synonym for score. For example:

int score = 0;
int& scoreRef = score;

cout << score << endl;
cout << scoreRef << endl;

scoreRef = 99;


cout << score << endl;
cout << scoreRef << endl;

This code displays the value contained in both score and scoreRef, and then assigns a new value to
scoreRef. Both variables are displayed again, demonstrating that the value of score has changed as well.
scoreRef is a reference to score, so assigning one affects the value of the other, hence this output:

0
0
99
99

Using references, we can rewrite the main and GetValue functions that were previously fitted to use
pointers:

void GetValue(int& funcTest)
{
funcTest = 99;
}

int main()
{
int test = 0;

GetValue( test );

// test is now 99

return 0;
}

By modifying GetValue to accept a reference, this code accomplishes the same result as the pointer-
based version: the value of test is modified by the call to GetValue. But this version has the advantage of
simplicity. No special operators are required to call GetValue, nor to perform the assignment in the
GetValue function body.


References are also safer to use than pointers: code that uses pointers is more susceptible to programmer
error. To illustrate these potential problems, let’s take another look at the pointer-based version:

void GetValue(int* pTestParam)
{
*pTestParam = 99;
}

int main()
{
int test = 0;

GetValue( &test );

// test is now 99

return 0;
}

The potential for error with this code involves the fact that pointers can easily be made to point to invalid
data. For example, there’s nothing stopping us from calling GetValue like this:


GetValue( 0 );

This is legal, since zero is a valid value for a pointer (the null pointer). This code will compile, but
GetValue will fail at runtime when it tries to de-reference pTestParam.

With the reference-based version of GetValue, this function call simply won’t compile. A compiler error
will be displayed that directs our attention to this function call, and notifies us that this argument is
invalid in this context.

Another potential problem is un-initialized pointers. Pointers can be declared without being initialized,
like this:

float* pData;

This declaration creates a pointer to a float, but the pointer doesn’t point to a valid floating point variable,
or even to zero. We don’t know what this pointer points to because it hasn’t been initialized. If we de-
reference this pointer, like this:

*pData = 1000; // assign a value to the memory indicated by pData

the code will compile, but will most likely crash at runtime, because pData contains a virtually
random memory address.

A reference, on the other hand, cannot be declared without being initialized, and furthermore cannot be
initialized to reference invalid data. For this reason, none of these declarations will compile:

float& r; // reference to float, not initialized – compiler error
short& s = 0; // reference to short, initialized to literal value – compiler error
char& c = 'c'; // reference to char, initialized to literal value – compiler error



References must be initialized with other variables of the same type. The following references will
compile, and will not fail at runtime because they reference valid variables:

float throttlePosition;
float& f = throttlePosition;

short arrayIndex;
short& s = arrayIndex;

char ch;
char& c = ch;

It may seem like you’re being warned away from pointers—but that’s not the lesson to take from this
section. References are safer than pointers, but are not nearly as powerful. There are situations where
pointers must be used, or provide a performance advantage over non-pointer solutions. In general it’s a
good idea to follow these rules:

• Avoid pointers if a reference can be used instead
• If pointers are necessary, examine your code carefully, and test it thoroughly. Pointer bugs are
easy to create and hard to debug.

In the next section we’ll talk about some of the situations where references can’t be used, so pointers are
the only option.
Pointer arithmetic
In addition to allowing functions to share data, pointers can be used to manipulate arrays. This requires
pointer arithmetic, a feature that allows a pointer to be directed to point at various locations in memory
through the use of the addition and subtraction operators (+ and -) and the increment and decrement
operators (++ and --). Before we employ pointers in this fashion, let’s review regular array manipulation,
and point out a few facts that we didn’t cover the first time around.

An array is a collection of similarly typed elements. C++ allows the use of square brackets to indicate the
array size during declaration, and to indicate specific elements when the array is used:

char array[5]; // declares a 5 element array called 'array'

array[0] = 'a'; // assigns 'a' to the first element
array[4] = 'e'; // assigns 'e' to the last element
array[0] = array[4]; // copies the contents of the last element ('e') to the first

Arrays are represented sequentially in memory. The first element of the array is followed immediately by
the second; the second is followed by the third; and so forth. Consider this array:

char array[] = "abcde"; // array size is 6 with null terminator

The result of this declaration is that the characters a, b, c, d, e, and the null terminator are stored in
memory sequentially. This means that if we assign a pointer to the memory address of the first element in
the array, we can access the second array element by incrementing the address contained within the
pointer. We can create a pointer and assign it to point to the first array element like this:

char* p = &array[0]; // p points at the first array element, which contains ‘a’


This code creates a pointer called p, and assigns it to the address of the first array element. If we de-
reference p, we’ll get the character ‘a’:

cout << *p << endl; // displays ‘a’


But we can also increment p, causing it to point to the next memory address—where ‘b’ is stored:

cout << *p << endl; // displays ‘a’;
p = p + 1; // increments p
cout << *p << endl; // displays ‘b’

This code, which increments p by one in order to move to the next array entry, is illustrated below:



The figure illustrates that the array is stored using a contiguous range of memory addresses. The arbitrary
memory address 1000 (in hex) is used for the beginning of the array, where the first element is stored.
The pointer p is initialized to point to this first address. It therefore contains the memory address 1000.
The pointer is then incremented by one, so that it points to the address 1001. The result is that p now
points to the second array element.

This is called pointer arithmetic: using addition and subtraction to iterate through or jump to specific
locations in memory. Here’s a slight modification on the previous example, using the ++ operator instead:

cout << *p << endl; // displays a
p++; // increments p by one
cout << *p << endl; // displays b

Despite the more compact syntax, this version behaves exactly like the previous version.

We can also add or subtract values other than 1. This version increments p by 4 instead of one, causing p
to jump from the first array element to the fifth (index 4), which contains ‘e’:


cout << *p << endl; // displays a
p += 4; // increments p by 4
cout << *p << endl; // displays e

So what do we gain by accessing an array this way? After all, any of the above snippets can be written
using square brackets containing array indices to accomplish the same goal. Not long ago, pointers were
clearly advantageous because they were faster. The square bracket syntax caused the compiler to generate
more machine code than was required to compile the pointer-based code. This was particularly true for
