=================== LECTURE 1 =============================
Comments
C++ supports two ways to insert comments:
// line comment
/* block comment */
Identifiers
A valid identifier is a sequence of one or more letters, digits or underscore characters (_).
Spaces, punctuation marks and other symbols cannot be part of an identifier.
In addition, an identifier cannot begin with a digit: it has to begin with a letter or an underscore. Names that begin with an underscore are best avoided, because many of them are reserved for the compiler and the standard library.
Another
rule that you have to consider when inventing your own identifiers is that they
cannot match any keyword of the C++ language nor your compiler's specific ones,
which are reserved keywords. The standard reserved keywords are:
asm, auto, bool, break, case, catch, char,
class, const, const_cast, continue, default, delete, do, double, dynamic_cast,
else, enum, explicit, export, extern, false, float, for, friend, goto, if,
inline, int, long, mutable, namespace, new, operator, private, protected,
public, register, reinterpret_cast, return, short, signed, sizeof, static,
static_cast, struct, switch, template, this, throw, true, try, typedef, typeid,
typename, union, unsigned, using, virtual, void, volatile, wchar_t, while
Additionally,
alternative representations for some operators cannot be used as identifiers
since they are reserved words under some circumstances:
and, and_eq, bitand, bitor, compl, not,
not_eq, or, or_eq, xor, xor_eq
Very important: C++ is a case-sensitive language. That means that an identifier written in capital letters is not equivalent to another one with the same name written in small letters. Thus, for example, the RESULT variable is not the same as the result variable or the Result variable. These are three different variable identifiers.
Fundamental data types
The sizes and ranges below are typical of a 32-bit system; the exact values depend on the compiler and platform.
Name | Description | Size | Range |
char | Character or small integer | 1 byte | signed: -128 to 127; unsigned: 0 to 255 |
short (short int) | Short integer | 2 bytes | signed: -32768 to 32767; unsigned: 0 to 65535 |
int | Integer | 4 bytes | signed: -2147483648 to 2147483647; unsigned: 0 to 4294967295 |
long (long int) | Long integer | 4 bytes | signed: -2147483648 to 2147483647; unsigned: 0 to 4294967295 |
bool | Boolean value | 1 byte | true or false |
float | Floating point number | 4 bytes | ±3.4e±38 (~7 digits) |
double | Double precision floating point number | 8 bytes | ±1.7e±308 (~15 digits) |
Declaration of variables
int a;
float mynumber;
int a, b, c;
int a;
int b;
int c;
unsigned int NextYear;
int a = 0, b = 3;
int a(0), b(3);
char c = 'X';
char mystring[] = "This is a string";
string mystring = "This is a string";
Character and string literals have certain peculiarities, like the escape codes. These are special characters that are difficult or impossible to express otherwise in the source code of a program, like newline (\n) or tab (\t). All of them are preceded by a backslash (\). Some of these escape codes are:
\n newline
\t tab
\" double quote
\' single quote
\\ backslash
"one\ntwo\nthree"
#define PI 3.14159
const float PI = 3.14159;
=================== LECTURE 2 =============================
Operators (Estonian: operaatorid, tehted; Russian: операции)
Assignment (=)
The assignment operator assigns a value to a variable.
a = 5; x = a;
u = v = w = 13;
Arithmetic operators ( +, -, *, /, % )
Modulo (%) is the operation that gives the remainder of a division of two values. For example, if we write:
a = 11 % 3;
the variable a will contain the value 2, since 2 is the remainder of dividing 11 by 3.
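As a small sketch of the remainder operator in use (the function names remainder_of and is_even are made up for this illustration), a common application is testing whether a number is even:

```cpp
// Hypothetical helper: returns the remainder of an integer division.
int remainder_of(int dividend, int divisor)
{
    return dividend % divisor;   // divisor must not be 0
}

// A number is even when dividing it by 2 leaves no remainder.
bool is_even(int n)
{
    return n % 2 == 0;
}
```

With these definitions, remainder_of(11, 3) gives 2, matching the example above.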
Compound assignment ( +=, -=, *=, /=, %=, >>=, <<=, &=, ^=, |= )
Increase and decrease ( ++, -- )
Relational and equality operators ( ==, !=, >, <, >=, <= )
Logical operators ( !, &&, || )
! not
&& and
|| or
Conditional operator ( ?: )
Comma operator ( , )
Bitwise operators ( &, |, ^, ~, <<, >> )
Explicit
type casting operator
int i;
float f = 3.14;
i = (int) f;
sizeof()
This operator accepts one parameter, which can be either a type or
a variable itself and returns the size in bytes of that type or object:
a = sizeof (char);
Other operators
Precedence of operators
Control Structures (Statements; Estonian: laused; Russian: операторы)
With the introduction of control structures we are going to have to introduce a new concept: the compound statement or block.
A block
is a group of statements which are separated by semicolons (;) like all C++
statements, but grouped together in a block enclosed in braces: { }:
{ statement1; statement2; statement3; }
Most of the control structures that we will see in this section require a generic statement as part of their syntax.
A
statement can be either a simple statement (a simple instruction ending with a
semicolon) or a compound
statement
(several instructions grouped in a block), like the one just described.
In the
case that we want the statement to be a simple statement, we do not need to
enclose it in braces ({}). But in the case that we want the statement to be a compound
statement it must be enclosed between braces ({}), forming a block.
Conditional
structure: if and else
The if keyword is
used to execute a statement or block only if a condition is fulfilled. Its form
is:
if (condition) statement
if (x == 100)
  cout << "x is 100";
If we want more than a single statement
to be executed in case that the condition is true we can specify a block
using braces { }:
if (x == 100)
{
  cout << "x is ";
  cout << x;
}
We can additionally specify what we
want to happen if the condition is not fulfilled by using the keyword else.
Its form used in conjunction with if is: if ( condition ) statement1 else statement2
if (x > 0)
  cout << "x is positive";
else if (x < 0)
  cout << "x is negative";
else
  cout << "x is 0";
Iteration structures (loops)
The while loop
Its format is: while (expression) statement
and its functionality is simply to repeat statement while the condition set in expression is true.
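A minimal sketch of a while loop, assuming a made-up helper called countdown that collects its output in a string instead of printing it:

```cpp
#include <string>

// Hypothetical helper: builds a countdown such as "3, 2, 1, liftoff".
std::string countdown(int start)
{
    std::string out;
    int n = start;
    while (n > 0) {                        // the condition is checked before every iteration
        out += std::to_string(n) + ", ";
        --n;                               // without this the condition would never become false
    }
    out += "liftoff";
    return out;
}
```

If start is 0 or negative the condition is false on the first check, so the body is never executed and only "liftoff" is returned.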
The do-while loop
Its format is: do statement while (condition);
Its functionality is exactly the same as the while loop, except that condition in the do-while loop is evaluated after the execution of statement instead of before, guaranteeing at least one execution of statement even if condition is never fulfilled.
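The difference between the two loops can be sketched by counting how often the body runs. The function names below are invented for this comparison:

```cpp
// Hypothetical helper: counts the iterations of a do-while loop.
int runs_with_do_while(int n)
{
    int runs = 0;
    do {
        ++runs;            // the body runs before the condition is looked at
        --n;
    } while (n > 0);
    return runs;
}

// Hypothetical helper: the same loop written with while.
int runs_with_while(int n)
{
    int runs = 0;
    while (n > 0) {        // the condition is looked at first
        ++runs;
        --n;
    }
    return runs;
}
```

For n = 0 the do-while version still executes its body once, while the while version does not execute it at all.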
The for loop
Its format is: for (initialization; condition; increase) statement;
and its main function is to repeat statement while condition remains true, like the while loop.
But in addition, the for loop provides specific locations to contain an initialization statement and an increase statement. So this loop is specially designed to perform a repetitive action with a counter which is initialized and increased on each iteration.
It works in the following way:
1. initialization is executed. Generally it is an initial value setting for a counter variable. This is executed only once.
2. condition is checked. If it is true the loop continues, otherwise the loop ends and statement is skipped (not executed).
3. statement is executed. As usual, it can be either a single statement or a block enclosed in braces { }.
4. finally, whatever is specified in the increase field is executed and the loop gets back to step 2.
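The four steps above can be sketched with a counter that adds up the numbers from 1 to a limit (sum_to is a name chosen only for this example):

```cpp
// Hypothetical helper: adds the integers 1, 2, ..., limit.
int sum_to(int limit)
{
    int total = 0;
    for (int n = 1;        // step 1: initialization, executed once
         n <= limit;       // step 2: condition, checked before each iteration
         ++n)              // step 4: increase, executed after each iteration
    {
        total += n;        // step 3: the statement
    }
    return total;
}
```

When limit is 0 the condition is false at the first check, so the statement is never executed and the result is 0.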
Jump statements.
The break statement
Using break we can leave a loop even if the condition for its end is not fulfilled. It can be used to end an infinite loop, or to force it to end before its natural end.
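As an illustrative sketch, break is often used to stop a search as soon as the answer is known. The function first_negative below is an invented example, not part of any library:

```cpp
// Hypothetical helper: returns the index of the first negative element, or -1 if there is none.
int first_negative(const int values[], int length)
{
    int index = -1;
    for (int n = 0; n < length; ++n) {
        if (values[n] < 0) {
            index = n;
            break;         // leave the loop now; the remaining elements are not examined
        }
    }
    return index;
}
```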
The continue statement
The continue statement causes the program to skip the rest of the loop in the current iteration as if the end of the statement block had been reached, causing it to jump to the start of the following iteration.
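A short sketch of continue, using an invented function sum_of_odds that skips the even numbers while adding up a range:

```cpp
// Hypothetical helper: adds the odd numbers between 1 and limit.
int sum_of_odds(int limit)
{
    int total = 0;
    for (int n = 1; n <= limit; ++n) {
        if (n % 2 == 0)
            continue;      // skip the rest of the body; the increase (++n) still runs
        total += n;
    }
    return total;
}
```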
The goto statement
goto makes an absolute jump to another point in the same function. You should use this feature with caution since its execution causes an unconditional jump ignoring any type of nesting limitations.
The destination point is identified by a label, which is then used as an argument for the goto statement. A label is made of a valid identifier followed by a colon (:).
Generally speaking, this instruction has no concrete use in structured or object-oriented programming aside from those that low-level programming fans may find for it.
The exit function
exit is a function declared in the <cstdlib> header. The purpose of exit is to terminate the current program with a specific exit code. Its prototype is: void exit (int exitcode);
The exitcode is used by some operating systems and may be used by calling programs. By convention, an exit code of 0 means that the program finished normally and any other value means that some error or unexpected result happened.
The selective structure: switch.
=================== LECTURE 3 =============================
Functions
A function is a group of statements that is executed when it is called from some point of the program. The following is its format:
type name ( parameter1, parameter2, ...) { statements }
where:
type is the data type specifier of the data returned by the function.
name is the identifier by which it will be possible to call the function.
parameters (as many as needed): Each parameter consists of a data type specifier followed by an identifier, like any regular variable declaration (for example: int x), and acts within the function as a regular local variable. Parameters allow arguments to be passed to the function when it is called. The different parameters are separated by commas.
statements is the function's body. It is a block of statements surrounded by braces { }.
Here you
have the first function example:
// function example
#include <iostream>
using namespace std;
int addition (int a, int b)
{
  int r;
  r=a+b;
  return r;
}
int main ()
{
  int z;
  z = addition (5,3);   // NB! Passing By Value
  cout << "The result is " << z;
  return 0;
}
The parameters and arguments have a clear correspondence. Within the main function we called addition passing two values, 5 and 3, that correspond to the int a and int b parameters declared for function addition.
At the point at which the function is called from within main, control passes from main to function addition. The values of both arguments passed in the call (5 and 3) are copied to the local variables int a and int b within the function.
Function addition declares another local variable (int r), and by means of the expression r=a+b, it assigns to r the result of a plus b. Because the actual parameters passed for a and b are 5 and 3 respectively, the result is 8.
Functions with no type. The use of void.
// void function example
#include <iostream>
using namespace std;
void printmessage ()
{
cout << "I'm a function!";
}
int main ()
{
printmessage ();
return 0;
}
void can also be used in the function's
parameter list to explicitly specify that we want the function to take no
actual parameters when it is called.
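Arguments passed by value and by reference.
In the functions seen so far, arguments were passed by value: the function receives copies of the values, and changing a parameter inside the function does not affect the variable used in the call. But there might be some cases where you need to manipulate from inside a function the value of an external variable. For that purpose we can use arguments passed by reference, as in the function duplicate of the following example: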
// passing parameters by reference
#include <iostream>
using namespace std;
void duplicate (int & a, int & b, int & c)
{
a*=2;
b*=2;
c*=2;
}
int main ()
{
int x=1, y=3, z=7;
duplicate (x, y, z);
cout << "x=" << x << ", y=" << y << ", z=" << z;
return 0;
}
The first thing to notice is that in the declaration of duplicate the type of each parameter is followed by an ampersand sign (&). This ampersand specifies that the corresponding arguments are to be passed by reference instead of by value.
When a variable is passed by reference we are not passing a copy of its value; we are passing the variable itself to the function, and any modification that we make to the parameters inside the function has an effect on the corresponding variables passed as arguments in the call.
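To make the contrast concrete, here is a small sketch with two invented functions that differ only in the ampersand:

```cpp
// Hypothetical helper: receives a copy, so the caller's variable is untouched.
void double_copy(int a)
{
    a *= 2;
}

// Hypothetical helper: receives the variable itself, so the caller sees the change.
void double_in_place(int& a)
{
    a *= 2;
}
```

Calling double_copy(x) with x equal to 5 leaves x at 5, whereas double_in_place(x) changes x to 10.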
Default values in parameters.
Overloaded functions.
Recursivity.
Recursivity is the property that functions have of being able to call themselves. It is useful for many tasks, such as sorting or calculating the factorial of a number. For example, the factorial of a number (n!) is given by the mathematical formula:
n! = n * (n-1) * (n-2) * (n-3) ... * 1
More concretely, 5! (factorial of 5) would be:
5! = 5 * 4 * 3 * 2 * 1 = 120
and a recursive function to calculate this in C++ could be:
// factorial calculator
#include <iostream>
using namespace std;
long factorial (long a)
{
  if (a > 1)
    return a * factorial (a-1);
  else
    return 1;
}
int main ()
{
  long number;
  cout << "Please type a number: ";
  cin >> number;
  cout << number << "! = " << factorial (number);
  return 0;
}
Notice how in function factorial we included a call to itself, but only if the argument passed was greater than 1, since otherwise the function would perform an infinite recursive loop in which once it arrived at 0 it would continue multiplying by all the negative numbers (probably provoking a stack overflow error at runtime).
This function has a limitation because of the data type we used in its design (long) for simplicity. The result overflows once it no longer fits in a long: with a 32-bit long the largest factorial that can be represented is 12!, and with a 64-bit long it is 20!.
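One possible way to deal with this limitation is to detect the overflow before it happens. The sketch below is an iterative variant, not the recursive function from the example; the name checked_factorial and its interface are assumptions made for this illustration:

```cpp
#include <limits>

// Hypothetical helper: stores n! in result and returns true when the value fits
// in an unsigned long long; returns false (leaving result unchanged) otherwise.
bool checked_factorial(unsigned int n, unsigned long long& result)
{
    const unsigned long long limit = std::numeric_limits<unsigned long long>::max();
    unsigned long long value = 1;
    for (unsigned int i = 2; i <= n; ++i) {
        if (value > limit / i)   // value * i would not fit
            return false;
        value *= i;
    }
    result = value;
    return true;
}
```

Assuming a 64-bit unsigned long long, which is the usual case, this succeeds up to 20! and reports failure from 21! onwards.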
Declaring functions.
Static local variables
The static keyword has several distinct meanings.
Normally, variables defined local to a function
disappear at the end of the function scope. When you call the function again,
storage for the variables is created anew and the values are re-initialized.
If you want a value to be extant throughout the life
of a program, you can define a function’s local variable to be static
and give it an initial value. The initialization is performed only the first
time the function is called, and the data retains its value between function
calls. This way, a function can “remember” some piece of information between
function calls.
You may wonder why a global variable isn’t used
instead. The beauty of a static variable is that it is unavailable
outside the scope of the function, so it can’t be inadvertently changed. This
localizes errors.
Here’s an example of the use of static
variables:
// Using a static variable in a function
#include <iostream>
using namespace std;
void func() {
  static int i = 0;
  cout << "i = " << ++i << endl;
}
int main() {
  for(int x = 0; x < 10; x++)
    func();
}
Each time func( ) is called in the
for loop, it prints a different value. If the keyword static is not
used, the value printed will always be ‘1’.
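The same idea can be sketched with a function that hands out consecutive numbers (next_id is a name invented for this example):

```cpp
// Hypothetical helper: returns 1 on the first call, 2 on the second, and so on.
int next_id()
{
    static int id = 0;   // initialized only once, the first time next_id() runs
    return ++id;         // the increased value survives until the next call
}
```

Because id is local to next_id, no other function can reset or modify the counter by accident.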
Arrays
Like a regular variable, an array must be declared before it is used. A typical declaration for an array in C++ is:
type name [elements];
where type is a valid type (like int, float...), name is a valid identifier and the elements field (which is always enclosed in square brackets []), specifies how many of these elements the array has to contain.
int billy [5];
int molly [5] = { 16, 2, 77, 40, 12071 };
int terry [] = { 16, 2, 77, 40, 12071 };
billy[2] = 75;
a = billy[2];
In C++ it is syntactically correct to exceed the valid range of indices for an array. This can create problems, since accessing out-of-range elements does not cause compilation errors, but the behavior of the program is then undefined: it may crash at runtime or silently read or overwrite unrelated memory. The reason why this is allowed will be seen further ahead when we begin to use pointers.
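Since the language does not check indices for us, one defensive approach is to check them ourselves before accessing the array. The following is only a sketch; get_checked is an invented helper and takes the array length as an extra argument:

```cpp
// Hypothetical helper: copies values[index] into out and returns true when the
// index is valid; returns false without touching the array otherwise.
bool get_checked(const int values[], int length, int index, int& out)
{
    if (index < 0 || index >= length)
        return false;      // out of range: refuse instead of reading unrelated memory
    out = values[index];
    return true;
}
```

For an array of 5 elements the valid indices are 0 to 4, so a request for index 5 is rejected.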
At this
point it is important to be able to clearly distinguish between the two uses
that brackets [ ] have related to arrays. They perform two different tasks: one is to
specify the size of arrays when they are declared; and the
second
one is to specify indices for concrete array elements. Do not confuse these two
possible uses of brackets [] with arrays.
Multidimensional
arrays
Arrays as parameters
At some moment we may need to pass an array to a function as a
parameter.
In C++ it is not possible to pass a complete block of memory by
value as a parameter to a function, but we are allowed to pass its address. In
practice this has almost the same effect and it is a much faster and more
efficient operation.
In order to accept arrays as parameters the only thing that we have to do when declaring the function is to specify in its parameters the element type of the array, an identifier and a pair of empty brackets []. For example, the following function:
void procedure (int arg[])
accepts a parameter of type "array of int" called arg. In order to pass to this function
an array declared as:
int myarray [40];
it would be enough to write a call like this:
procedure (myarray);
Here you have a complete example:
// arrays as parameters
#include <iostream>
using namespace std;
void printarray (int arg[], int length) {
  for (int n=0; n<length; n++)
    cout << arg[n] << " ";
  cout << "\n";
}
int main ()
{
  int firstarray[] = {5, 10, 15};
  int secondarray[] = {2, 4, 6, 8, 10};
  printarray (firstarray,3);
  printarray (secondarray,5);
  return 0;
}
As you can see, the first parameter (int arg[]) accepts any array whose elements are of type int, whatever its length. For that reason we have included a second parameter that tells the function the length of each array that we pass to it as its first parameter. This allows the for loop that prints out the array to know the range to iterate over without going out of range.
Arrays passed as function parameters are a common source of errors for novice programmers. I recommend reading the chapter about Pointers for a better understanding of how arrays operate.
Character Sequences
As you may already know, the C++ Standard Library implements a
powerful string class, which is very useful to handle and manipulate
strings of characters.
However, because strings are in fact sequences of characters, we
can represent them also as plain arrays of char elements.
For example, the following array:
char jenny[20];
is an array that can store up to 20 elements of type char. It can be represented as:
|   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Therefore, in this array, in theory, we can store sequences of
characters up to 20 characters long. But we can also
store shorter sequences. For example, jenny could store at some point in a program either the sequence
"Hello"
or the sequence "Merry christmas",
since both are shorter than 20 characters.
| H | e | l | l | o | \0 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
| M | e | r | r | y |   | c | h | r | i | s | t | m | a | s | \0 |   |   |   |   |
Therefore, since the array of characters can store shorter
sequences than its total length, a special character is
used to signal the end of the valid sequence: the null character, whose
literal constant can be written as '\0'
(backslash, zero).
Initialization
of null-terminated character sequences
Double quoted strings (") are literal constants whose type is in fact a null-terminated array of characters. So string literals enclosed between double quotes always have a null character ('\0') automatically appended at the end. Therefore we can initialize the array of char elements called myword with a null-terminated sequence of characters by either one of these two methods:
char myword [] = { 'H', 'e', 'l', 'l', 'o', '\0' };
char myword [] = "Hello";
In both cases the array of characters myword is declared with a size of 6 elements of type char: the 5 characters that compose the word "Hello" plus a final null character ('\0') which specifies the end of the sequence and which, in the second case, when using double quotes ("), is appended automatically.
Using null-terminated sequences of characters
Null-terminated sequences of characters are the way strings are treated in C, and C++ inherits them, so they can be used as such in many procedures. In fact, regular string literals have this kind of type (an array of const char) and can also be used in most cases.
For example, cin and cout support null-terminated sequences as valid containers for sequences of characters, so they can be used directly to extract strings of characters from cin or to insert them into cout. For example:
// null-terminated sequences of characters
#include <iostream>
using namespace std;
int main ()
{
  char question[] = "Please, enter your first name: ";
  char greeting[] = "Hello, ";
  char yourname [80];
  cout << question;
  cin >> yourname;   // reads one word; it must be shorter than 80 characters
  cout << greeting << yourname << "!";
  return 0;
}
Finally, sequences of characters stored in char arrays can easily be converted into string objects just by using the
assignment operator:
string mystring;
char myntcs[]="some text";
mystring = myntcs;
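Once the characters are in a string object they can be handled with the string operators. A minimal sketch, using an invented function called shout:

```cpp
#include <string>

// Hypothetical helper: copies a null-terminated sequence into a string and appends "!".
std::string shout(const char text[])
{
    std::string result = text;   // conversion from char array to string
    result += "!";               // the string grows as needed, no fixed size to manage
    return result;
}
```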
<cstring>
C Strings
This header file defines several functions to manipulate C strings and arrays.
Copying:
memcpy Copy block of memory (function )
memmove Move block of memory (function )
strcpy Copy string (function )
strncpy Copy characters from string (function )
Concatenation:
strcat Concatenate strings (function )
strncat Append characters from string (function )
Comparison:
memcmp Compare two blocks of memory (function )
strcmp Compare two strings (function )
strcoll Compare two strings using locale (function )
strncmp Compare characters of two strings (function )
strxfrm Transform string using locale (function )
Searching:
memchr Locate character in block of memory (function )
strchr Locate first occurrence of character in string (function )
strcspn Get span until character in string (function )
strpbrk Locate characters in string (function )
strrchr Locate last occurrence of character in string (function )
strspn Get span of character set in string (function )
strstr Locate substring (function )
strtok Split string into tokens (function )
Other:
memset Fill block of memory (function )
strerror Get pointer to error message string (function )
strlen Get string length (function )
NULL Null pointer (macro )
size_t Unsigned integral type (type )
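A few of these functions can be combined as in the following sketch. The function build_greeting and its capacity parameter are assumptions made for this illustration; strlen, strcpy and strcat are the standard functions listed above:

```cpp
#include <cstring>

// Hypothetical helper: writes "Hello, " followed by name into buffer.
// Returns false without writing anything when buffer is too small.
bool build_greeting(char buffer[], std::size_t capacity, const char name[])
{
    const char prefix[] = "Hello, ";
    if (std::strlen(prefix) + std::strlen(name) + 1 > capacity)
        return false;                // not enough room, counting the final '\0'
    std::strcpy(buffer, prefix);     // copies the prefix, terminator included
    std::strcat(buffer, name);       // appends name after the prefix
    return true;
}
```

The size check comes first because strcpy and strcat do not know how large the destination array is.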
<string>
String class
Strings are objects that represent sequences of characters. The standard string class provides support for such objects with an interface similar to that of standard containers, but adding features specifically designed to operate with strings of characters.
(constructor) Construct string object (public member function )
(destructor) String destructor (public member function )
operator= String assignment (public member function )
Iterators:
begin Return iterator to beginning (public member function )
end Return iterator to end (public member function )
rbegin Return reverse iterator to reverse beginning (public member function )
rend Return reverse iterator to reverse end (public member function )
cbegin Return const_iterator to beginning (public member function )
cend Return const_iterator to end (public member function )
crbegin Return const_reverse_iterator to reverse beginning (public member function )
crend Return const_reverse_iterator to reverse end (public member function )
Capacity:
size Return length of string (public member function )
length Return length of string (public member function )
max_size Return maximum size of string (public member function )
resize Resize string (public member function )
capacity Return size of allocated storage (public member function )
reserve Request a change in capacity (public member function )
clear Clear string (public member function )
empty Test if string is empty (public member function )
shrink_to_fit Shrink to fit (public member function )
Element access:
operator[] Get character of string (public member function )
at Get character in string (public member function )
back Access last character (public member function )
front Access first character (public member function )
Modifiers:
operator+= Append to string (public member function )
append Append to string (public member function )
push_back Append character to string (public member function )
assign Assign content to string (public member function )
insert Insert into string (public member function )
erase Erase characters from string (public member function )
replace Replace portion of string (public member function )
swap Swap string values (public member function )
pop_back Delete last character (public member function )
String operations:
c_str Get C string equivalent (public member function )
data Get string data (public member function )
get_allocator Get allocator (public member function )
copy Copy sequence of characters from string (public member function )
find Find content in string (public member function )
rfind Find last occurrence of content in string (public member function )
find_first_of Find character in string (public member function )
find_last_of Find character in string from the end (public member function )
find_first_not_of Find absence of character in string (public member function )
find_last_not_of Find non-matching character in string from the end (public member function )
substr Generate substring (public member function )
compare Compare strings (public member function )
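As a brief sketch of how some of these members work together, the invented function file_extension below uses rfind and substr from the list above, plus the constant string::npos that the find functions return when nothing is found:

```cpp
#include <string>

// Hypothetical helper: returns the text after the last '.', or an empty string.
std::string file_extension(const std::string& filename)
{
    std::string::size_type dot = filename.rfind('.');   // position of the last dot
    if (dot == std::string::npos)
        return "";                                       // no dot at all
    return filename.substr(dot + 1);                     // everything after the dot
}
```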
Pointers
We have already seen how variables are seen as memory cells that can be accessed using their identifiers. This way we did not have to care about the physical location of our data within memory, we simply used its identifier whenever we wanted to refer to our variable.
The memory of your computer can be imagined as a succession of memory cells, each one of the minimal size that computers manage (one byte). These single-byte memory cells are numbered consecutively, so that, within any block of memory, every cell has the number of the previous one plus one.
This way, each cell can be easily located in the memory because it has a unique address and all the memory cells follow a successive pattern. For example, if we are looking for cell 1776 we know that it is going to be right between cells 1775 and 1777, exactly one thousand cells after 776 and exactly one thousand cells before cell 2776.
As soon as we declare a variable, the amount of memory needed is assigned for it at a specific location in memory (its memory address). We generally do not actively decide the exact location of the variable within the panel of cells that we have imagined the memory to be. Fortunately, that is a task automatically performed by the operating system during runtime. However, in some cases we may be interested in knowing the address where our variable is being stored during runtime in order to operate with relative positions to it.
The address that locates a variable within memory is what we call a reference to that variable. This reference to a variable can be obtained by preceding the identifier of a variable with an ampersand sign (&), known as the reference operator (also called the address-of operator), and which can be literally translated as "address of". For example:
ted = &andy;
This would assign to ted
the address of variable andy,
since when preceding the name of the variable andy with the reference operator (&) we are no longer
talking about the content of the variable itself, but about its reference
(i.e., its address in memory).
From now on we are going to assume that andy is placed during runtime in the memory address 1776. This number (1776) is just an arbitrary
assumption we are inventing right now in order to help clarify some concepts in
this tutorial, but in reality, we cannot know before runtime the real value the
address of a variable will have in memory.
Consider the following code fragment:
// my first pointer
#include <iostream>
using namespace std;
int main ()
{
  int firstvalue, secondvalue;
  int * mypointer;
  mypointer = &firstvalue;
  *mypointer = 10;
  mypointer = &secondvalue;
  *mypointer = 20;
  cout << "firstvalue is " << firstvalue << endl;
  cout << "secondvalue is " << secondvalue << endl;
  return 0;
}
Notice that even though we have never directly set a value to either firstvalue or secondvalue, both end up with a value set indirectly through the use of mypointer. This is the procedure:
First, we have assigned as value of mypointer a reference to firstvalue using the reference operator (&). Then we have assigned the value 10 to the memory location pointed to by mypointer; because at this moment it is pointing to the memory location of firstvalue, this in fact modifies the value of firstvalue.
In order to demonstrate that a pointer may take several different values during the same program I have repeated the process with secondvalue and that same pointer, mypointer.
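The same indirect assignment can be wrapped in a function that receives the pointer as a parameter. This is only a sketch and set_through is an invented name:

```cpp
// Hypothetical helper: writes value into whatever int the pointer refers to.
void set_through(int* target, int value)
{
    *target = value;   // changes the pointed-to variable, not the pointer itself
}
```

Calling set_through(&firstvalue, 10) and then set_through(&secondvalue, 20) would leave the two variables with the same values as in the program above.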
Here is a slightly more elaborate example:
// more pointers
#include <iostream>
using namespace std;
int main ()
{
  int firstvalue = 5, secondvalue = 15;
  int * p1, * p2;
  p1 = &firstvalue;   // p1 = address of firstvalue
  p2 = &secondvalue;  // p2 = address of secondvalue
  *p1 = 10;           // value pointed by p1 = 10
  *p2 = *p1;          // value pointed by p2 = value pointed by p1
  p1 = p2;            // p1 = p2 (value of pointer is copied)
  *p1 = 20;           // value pointed by p1 = 20
  cout << "firstvalue is " << firstvalue << endl;
  cout << "secondvalue is " << secondvalue << endl;
  return 0;
}
I have included as a comment on each line how the code
can be read: ampersand (&) as "address of" and asterisk (*) as "value pointed by".
Notice that there are expressions with pointers p1 and p2, both with and without dereference operator (*). The meaning
of an expression using the dereference operator (*) is very different from one that does not: When this
operator precedes the pointer name, the expression refers to the value being
pointed, while when a pointer name appears without this operator, it refers to
the value of the pointer itself (i.e. the address of what the pointer is
pointing to).
Another thing that may call your attention is the line:
int * p1, * p2;
This declares the two pointers used in the previous example. But notice that there is an asterisk (*) for each pointer, in order for both to have type int* (pointer to int).
Otherwise, the type for the second variable declared in that line would have been int (and not int*), because the asterisk belongs to the individual name being declared and not to the type at the start of the line. If we had written:
int * p1, p2;
p1 would indeed have int* type, but p2 would have type int (spaces do not matter at all for this purpose). But anyway, simply remembering that you have to put one asterisk per pointer is enough for most pointer users.
The concept of array is very much bound to the one of pointer. In fact, when the identifier of an array is used in an expression it is converted to the address of its first element, just as a pointer holds the address of the element it points to, so in many situations the two can be used in the same way. For example, supposing these two declarations:
int numbers [20];
int * p;
The following assignment operation would be valid:
p = numbers;
After that, p and numbers would refer to the same elements and could be used in the same way to access them. The only difference is that we could change the value of pointer p by another one, whereas numbers will always refer to the first of the 20 elements of type int with which it was defined. Therefore, unlike p, which is an ordinary pointer, numbers is an array, and an array can be considered a constant pointer for this purpose.
Therefore, the following assignment would not be valid:
numbers = p;
Because numbers is an array, so it operates as a constant pointer, and we cannot assign values to constants.
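The relationship can be checked with a small sketch. Both function names are invented; the second function also shows one way in which an array is still more than a pointer, namely that sizeof sees all of its elements:

```cpp
#include <cstddef>

// Hypothetical helper: true when a pointer assigned from the array holds the
// address of the array's first element.
bool pointer_matches_first_element()
{
    int numbers[20] = {};
    int* p = numbers;        // valid: numbers converts to &numbers[0]
    // numbers = p;          // invalid: does not compile, an array cannot be assigned to
    return p == &numbers[0];
}

// Hypothetical helper: computes the number of elements from the array's total size.
std::size_t element_count()
{
    int numbers[20] = {};
    return sizeof(numbers) / sizeof(numbers[0]);
}
```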
Due to the characteristics of variables, all expressions that include pointers
in the following example are perfectly valid:
// more pointers
#include <iostream>
using namespace std;
int main ()
{
  int numbers[5];
  int * p;
  p = numbers;  *p = 10;
  p++;  *p = 20;
  p = &numbers[2];  *p = 30;
  p = numbers + 3;  *p = 40;
  p = numbers;  *(p+4) = 50;
  for (int n=0; n<5; n++)
    cout << numbers[n] << ", ";
  return 0;
}
In the chapter about arrays we used brackets ([]) several times in order to specify the index of an element of the array to which we wanted to refer. Well, these bracket sign operators [] are also a dereference operator known as offset operator. They dereference the variable they follow just as * does, but they also add the number between brackets to the address being dereferenced. For example, for an offset n:
a[n]
*(a+n)
These two expressions are equivalent and valid both if a is a pointer or if a is an array.
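The equivalence can be verified directly. In this sketch same_element is an invented helper that compares both the addresses and the values produced by the two notations:

```cpp
// Hypothetical helper: true when a[n] and *(a+n) designate the same element.
bool same_element(int a[], int n)
{
    return &a[n] == a + n      // same address
        && a[n] == *(a + n);   // and therefore the same value
}
```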
Arithmetical operations on pointers work a little differently from those on regular integer data types. To begin with, only addition and subtraction are allowed; the others make no sense in the world of pointers. But both addition and subtraction have a different behavior with pointers according to the size of the data type to which they point.
When we saw the different fundamental data types, we saw that some occupy more
or less space than others in the memory. For example, let's assume that in a
given compiler for a specific machine, char
takes 1 byte, short takes 2 bytes and
long takes 4.
Suppose that we define three pointers in this compiler:
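char *mychar;
short *myshort;
long *mylong;
and that we know that they point to memory locations 1000, 2000 and 3000 respectively.
So if we write:
mychar++;
myshort++;
mylong++;
mychar would then contain the value 1001, as one might expect. Less obviously, myshort would contain 2002 and mylong would contain 3004, even though each has been increased only once. The reason is that adding one to a pointer makes it point to the following element of the same type, so the size in bytes of the type pointed to is added to the address.
This is applicable both when adding and subtracting any number to a pointer. It would happen exactly the same if we write:
mychar = mychar + 1;
myshort = myshort + 1;
mylong = mylong + 1;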
Both the increase (++)
and decrease (--) operators have
greater operator precedence than the dereference operator (*), but both have a special
behavior when used as suffix (the expression is evaluated with the value it had
before being increased). Therefore, the following expression may lead to
confusion:
*p++
Because ++
has greater precedence than *,
this expression is equivalent to *(p++).
Therefore, what it does is to increase the value of p (so it now points to the
next element), but because ++ is used as postfix the whole expression is
evaluated as the value pointed by the original reference (the address the
pointer pointed to before being increased).
Notice the difference with:
(*p)++
Here, the expression would have
been evaluated as the value pointed by p
increased by one. The value of p
(the pointer itself) would not be modified (what is being modified is what it
is being pointed to by this pointer).
If we write:
*p++ = *q++;
Because ++ has a higher
precedence than *, both p and q are increased, but because
both increase operators (++)
are used as postfix and not prefix, the value assigned to *p is *q before both p and q are increased. And then both
are increased. It would be roughly equivalent to:
*p = *q;
++p;
++q;
As always, I recommend using parentheses () to avoid unexpected results and to make the code more legible.
C++ allows the use of pointers that point to pointers, which in turn point to data (or even to other pointers). To do that, we only need to add an asterisk (*) for each level of indirection in their declarations:
char a;
char * b;
char ** c;
a = 'z';
b = &a;
c = &b;
A null pointer is a regular pointer of any pointer type which has a special value that indicates that it is not pointing to any valid reference or memory address. This value is the result of type-casting the integer value zero to any pointer type.
int * p;
p = 0;     // p has a null pointer value
A null pointer is a value that any pointer may take to represent that it is pointing to "nowhere".
Until now, in all our programs,
we have only had as much memory available as we declared for our variables,
having the size of all of them to be determined in the source code, before the
execution of the program. But, what if we need a variable amount of memory that
can only be determined during runtime? For example, in the case that we need
some user input to determine the necessary amount of memory space.
The answer is dynamic memory, for which C++ integrates the operators new and delete.
In order to request dynamic
memory we use the operator new.
new is followed by a
data type specifier and -if a sequence of more than one element is required- the
number of these within brackets [].
It returns a pointer to the beginning of the new block of memory allocated. Its
form is:
pointer = new type
pointer = new type [number_of_elements]
The first expression is used to allocate memory to contain one single element
of type type. The second one is
used to assign a block (an array) of elements of type type, where number_of_elements is an
integer value representing the amount of these. For example:
int * bobby;
bobby = new int [5];
In this case, the system dynamically assigns space for five elements of type int and returns a pointer to
the first element of the sequence, which is assigned to bobby. Therefore, now, bobby points to a valid block
of memory with space for five elements of type int.
The first element pointed by bobby can be accessed either
with the expression bobby[0] or the expression *bobby. Both are equivalent
as has been explained in the section about pointers. The second element can be
accessed either with bobby[1] or *(bobby+1) and so on...
You could be wondering the difference between declaring a normal array and
assigning dynamic memory to a pointer, as we have just done. The most important
difference is that the size of an array has to be a constant value, which limits
its size to what we decide at the moment of designing the program, before its
execution, whereas the dynamic memory allocation allows us to assign memory
during the execution of the program (runtime) using any variable or constant
value as its size.
The dynamic memory requested by our program is allocated by the system from the
memory heap. However, computer memory is a limited resource, and it can be
exhausted. Therefore, it is important to have some mechanism to check if our
request to allocate memory was successful or not.
Operators delete and delete[]
Operators new and delete are exclusive to C++. They are not available in the C language. But using the pure C language and its library, dynamic memory can also be used through the functions malloc, calloc, realloc and free, which are also available in C++ by including the <cstdlib> header file.
The memory blocks allocated by these functions are not necessarily compatible
with those returned by new, so each one should be manipulated with its own set
of functions or operators.
Standard Containers
A container is a holder object that stores a collection of
other objects (its elements). They are implemented as class templates, which
allows a great flexibility in the types supported as elements.
The container manages the storage space for its elements and provides member
functions to access them, either directly or through iterators (reference
objects with similar properties to pointers).
Containers replicate structures very commonly used in programming: dynamic
arrays (vector), queues (queue), stacks (stack), heaps (priority_queue), linked lists (list), trees (set), associative arrays (map)...
Many containers have several member functions in common, and share
functionalities. The decision of which type of container to use for a specific
need does not generally depend only on the functionality offered by the
container, but also on the efficiency of some of its members (complexity). This
is especially true for sequence containers, which offer different trade-offs in
complexity between inserting/removing elements and accessing them.
stack, queue and priority_queue are implemented as container adaptors. Container adaptors are not full container classes, but classes that provide a specific interface relying on an object of one of the container classes (such as deque or list) to handle the elements. The underlying container is encapsulated in such a way that its elements are accessed by the members of the container adaptor independently of the underlying container class used.
<vector>
template < class T, class Alloc = allocator<T> > class vector; // generic template
Vector
Just like arrays, vectors use contiguous storage locations for their elements, which means that their elements can also be accessed using offsets on regular pointers to their elements, and just as efficiently as in arrays. But unlike arrays, their size can change dynamically, with their storage being handled automatically by the container.
Internally, vectors use a dynamically allocated array to store their elements. This array may need to be reallocated in order to grow in size when new elements are inserted, which implies allocating a new array and moving all elements to it. This is a relatively expensive task in terms of processing time, and thus, vectors do not reallocate each time an element is added to the container.
Instead, vectors may allocate some extra storage to accommodate possible growth, so the container may have an actual capacity greater than the storage strictly needed to contain its elements (its size).
Therefore, compared to arrays, vectors consume more memory in exchange for the ability to manage storage and grow dynamically in an efficient way.
Compared to the other dynamic sequence containers (deques, lists and forward_lists), vectors are very efficient at accessing their elements (just like arrays) and relatively efficient at adding or removing elements from their end.
For operations that involve inserting or removing elements at positions other
than the end, they perform worse than the others, and have less consistent
iterators and references than lists and
forward_lists.
Sequence
Elements in sequence containers are ordered in a strict linear sequence. Individual elements are accessed by their position in this sequence.
Dynamic array
Allows direct access to any element in the sequence, even through pointer arithmetics, and provides relatively fast addition/removal of elements at the end of the sequence.
Allocator-aware
The container uses an allocator object to dynamically handle its storage needs.
(constructor): Construct vector
(destructor): Vector destructor
operator=: Assign content
Iterators:
begin: Return iterator to beginning
end: Return iterator to end
rbegin: Return reverse iterator to reverse beginning
rend: Return reverse iterator to reverse end
cbegin: Return const_iterator to beginning
cend: Return const_iterator to end
crbegin: Return const_reverse_iterator to reverse beginning
crend: Return const_reverse_iterator to reverse end
Capacity:
size: Return size
max_size: Return maximum size
resize: Change size
capacity: Return size of allocated storage capacity
empty: Test whether vector is empty
reserve: Request a change in capacity
shrink_to_fit: Shrink to fit
Element access:
operator[]: Access element
at: Access element
front: Access first element
back: Access last element
data: Access data
Modifiers:
assign: Assign vector content
push_back: Add element at the end
pop_back: Delete last element
insert: Insert elements
erase: Erase elements
swap: Swap content
clear: Clear content
emplace: Construct and insert element
emplace_back: Construct and insert element at the end
Data structures
We have already learned how groups of sequential data can be used in C++. But this is somewhat restrictive, since on many occasions what we want to store are not mere sequences of elements all of the same data type, but sets of different elements with different data types.
Data structures
A data structure is a group of data elements grouped together under one name. These data elements, known as members, can have different types and different lengths. Data structures are declared in C++ using the following syntax:
struct structure_name {
  member_type1 member_name1;
  member_type2 member_name2;
  member_type3 member_name3;
  .
  .
} object_names;
where structure_name is a name for the structure type, and object_names can be a set of valid identifiers for objects that have the type of this structure. Within braces { } there is a list with the data members, each one specified with a type and a valid identifier as its name.
The first thing we have to know is that a data structure creates a new type: once a data structure is declared, a new type with the identifier specified as structure_name is created and can be used in the rest of the program as if it were any other type.
For example:
struct product {
int weight;
float price;
};
product apple;
product banana, melon;
We have first declared a structure type called product with two members: weight and price, each of a different fundamental type. We have then used this name of the structure type (product) to declare three objects of that type: apple, banana and melon, as we would have done with any fundamental data type.
Once declared, product has become a new valid type name like the fundamental ones int, char or short, and from that point on we are able to declare objects (variables) of this compound new type, like we have done with apple, banana and melon.
Right at the end of the struct declaration, and before the ending semicolon, we can use the optional field object_names to directly declare objects of the structure type. For example, we can also declare the structure objects apple, banana and melon at the moment we define the data structure type this way:
struct product {
int weight;
float price;
} apple, banana, melon;
It is important to clearly differentiate between what is the structure type name, and what is an object (variable) that has this structure type. We can instantiate many objects (i.e. variables, like apple, banana and melon) from a single structure type (product).
Once we have declared our three objects of a given structure type (apple, banana and melon) we can operate directly with their members. To do that we use a dot (.) inserted between the object name and the member name. For example, we could operate with any of these elements as if they were standard variables of their respective types:
apple.weight
apple.price
banana.weight
banana.price
melon.weight
melon.price
Each one of these has the data type corresponding to the member they refer to: apple.weight, banana.weight and melon.weight are of type int, while apple.price, banana.price and melon.price are of type float.
Let's see a real example where you can see how a structure type can be used in the same way as fundamental types:
// example about structures
#include <iostream>
#include <string>
using namespace std;

struct movies_t {
  string title;
  int year;
} mine, yours;

void printmovie (movies_t movie);   // forward declaration

int main ()
{
  mine.title = "2001 A Space Odyssey";
  mine.year = 1968;
  cout << "Enter title: ";
  getline (cin,yours.title);
  cout << "Enter year: ";
  cin >> yours.year;
  cout << "My favorite movie is:\n ";
  printmovie (mine);
  cout << "And yours is:\n ";
  printmovie (yours);
  return 0;
}

void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
The example shows how we can use the members of an object as regular variables. For example, the member yours.year is a valid variable of type int, and mine.title is a valid variable of type string.
The objects mine and yours can also be treated as valid variables of type movies_t; for example, we have passed them to the function printmovie as we would have done with regular variables. Therefore, one of the most important advantages of data structures is that we can either refer to their members individually or to the entire structure as a block with only one identifier.
Data structures are a feature that can be used to represent databases, especially if we consider the possibility of building arrays of them.
Pointers to structures
Like any other type, structures can be pointed to by their own type of pointers:
struct movies_t {
string title;
int year;
};
movies_t amovie;
movies_t * pmovie;
Here amovie is an object of structure type movies_t, and pmovie is a pointer to point to objects of structure type movies_t. So, the following code would also be valid:
pmovie = &amovie;
The pointer pmovie would be assigned the address of the object amovie.
We will now go with another example that includes pointers, which will serve to introduce a new operator: the arrow operator (->):
// pointers to structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
};

int main ()
{
  string mystr;
  movies_t amovie;
  movies_t * pmovie;
  pmovie = &amovie;
  cout << "Enter title: ";
  getline (cin, pmovie->title);
  cout << "Enter year: ";
  getline (cin, mystr);
  stringstream(mystr) >> pmovie->year;
  cout << "\nYou have entered:\n";
  cout << pmovie->title;
  cout << " (" << pmovie->year << ")\n";
  return 0;
}
The previous code includes an important introduction: the arrow operator (->). This is a dereference operator that is used exclusively with pointers to objects with members. This operator serves to access a member of an object to which we have a pointer. In the example we used:
pmovie->title
which is for all purposes equivalent to:
(*pmovie).title
Both expressions pmovie->title and (*pmovie).title are valid and both mean that we are evaluating the member title of the data structure pointed to by a pointer called pmovie. It must be clearly differentiated from:
*pmovie.title
which is equivalent to:
*(pmovie.title)
That expression would access the value pointed to by a hypothetical pointer member called title of an object called pmovie, which is not the case here, since title is not a pointer and pmovie is not an object with members.
Nesting structures
Structures can also be nested so that a valid element of a structure can in turn be another structure.
struct movies_t {
string title;
int year;
};
struct friends_t {
string name;
string email;
movies_t favorite_movie;
} charlie, maria;
friends_t * pfriends = &charlie;
After the previous declaration we could use any of the following expressions:
charlie.name
maria.favorite_movie.title
charlie.favorite_movie.year
pfriends->favorite_movie.year
(where, by the way, the last two expressions refer to the same member).
Other Data Types
Defined data types (typedef)
C++ allows the definition of our own types based on other existing data types. We can do this using the keyword typedef, whose format is:
typedef existing_type new_type_name ;
where existing_type is a C++ fundamental or compound type and new_type_name is the name for the new type we are defining. For example:
typedef char C;
typedef unsigned int WORD;
typedef char * pChar;
typedef char field [50];
In this case we have defined four data types: C, WORD, pChar and field as char, unsigned int, char* and char[50] respectively, which we could then use in declarations like any other valid type:
C mychar, anotherchar, *ptc1;
WORD myword;
pChar ptc2;
field name;
typedef does not create different types. It only creates synonyms of existing types.
Unions
Anonymous unions
Enumerations (enum)
Enumerations create new data types to contain something different that is not limited to the values fundamental data types may take. Their form is the following:
enum enumeration_name {
value1,
value2,
value3,
.
.
} object_names;
For example, we could create a new type of variable called colors_t to store colors with the following declaration:
enum colors_t {black, blue, green, cyan, red, purple, yellow, white};
Notice that we do not include any fundamental data type in the declaration. To say it somehow, we have created a whole new data type from scratch without basing it on any other existing type. The possible values that variables of this new type colors_t may take are the new constant values included within braces. For example, once the colors_t enumeration is declared the following expressions will be valid:
colors_t mycolor;
mycolor = blue;
if (mycolor == green) mycolor = red;
Enumerations are type compatible with numeric variables, so their constants are always assigned an integer numerical value internally. If it is not specified, the integer value equivalent to the first possible value is 0 and the following ones follow a +1 progression. Thus, in our data type colors_t that we have defined above, black would be equivalent to 0, blue would be equivalent to 1, green to 2, and so on.
We can explicitly specify an integer value for any of the constant values that our enumerated type can take. If the constant value that follows it is not given an integer value, it is automatically assigned the value of the previous one plus one. For example:
enum months_t { january=1, february, march, april, may, june, july,
                august, september, october, november, december} y2k;
In this case, variable y2k of enumerated type months_t can contain any of the 12 possible values that go from january to december and that are equivalent to values between 1 and 12 (not between 0 and 11, since we have made january equal to 1).
Input/Output with files
C++ provides the following classes to perform output and input of characters to/from files:
ofstream: Stream class to write on files
ifstream: Stream class to read from files
fstream: Stream class to both read and write from/to files.
These classes are derived directly or indirectly from the classes istream and ostream. We have already used objects whose types were these classes: cin is an object of class istream and cout is an object of class ostream. Therefore, we have already been using classes that are related to our file streams. And in fact, we can use our file streams the same way we are already used to using cin and cout, with the only difference that we have to associate these streams with physical files. Let's see an example:
// basic file operations
#include <iostream>
#include <fstream>
using namespace std;
int main ()
{
ofstream myfile;
myfile.open ("example.txt");
myfile << "Writing this to a file.\n";
myfile.close();
return 0;
}
This code creates a file called example.txt and inserts a sentence into it in the same way we are used to doing with cout, but using the file stream myfile instead.
Open a file
The first operation generally performed on an object of one of these classes is to associate it with a real file. This procedure is known as opening a file. An open file is represented within a program by a stream object (an instantiation of one of these classes; in the previous example this was myfile) and any input or output operation performed on this stream object will be applied to the physical file associated with it.
In order to open a file with a stream object we use its member function open():
open (filename, mode);
where filename is a null-terminated character sequence of type const char * (the same type that string literals have) representing the name of the file to be opened, and mode is an optional parameter with a combination of the following flags:
ios::in      Open for input operations.
ios::out     Open for output operations.
ios::binary  Open in binary mode.
ios::ate     Set the initial position at the end of the file. If this flag is not set, the initial position is the beginning of the file.
ios::app     All output operations are performed at the end of the file, appending the content to the current content of the file. This flag can only be used in streams open for output-only operations.
ios::trunc   If the file opened for output operations already existed before, its previous content is deleted and replaced by the new one.
All these flags can be combined using the bitwise operator OR (|).
For example, if we want to open the file example.bin in binary mode to add data we could do it by the following call to member function open():
ofstream myfile;
myfile.open ("example.bin", ios::out | ios::app | ios::binary);
Each one of the open() member functions of the classes ofstream, ifstream and fstream has a default mode that is used if the file is opened without a second argument:
class       default mode parameter
ofstream    ios::out
ifstream    ios::in
fstream     ios::in | ios::out
For ifstream and ofstream classes, ios::in and ios::out are automatically and respectively assumed, even if a mode that does not include them is passed as second argument to the open() member function.
Since the first task that is performed on a file stream object is generally to open a file, these three classes include a constructor that automatically calls the open() member function and has the exact same parameters as this member. Therefore, we could also have declared the previous myfile object and conducted the same opening operation in our previous example by writing:
ofstream myfile ("example.bin", ios::out | ios::app | ios::binary);
This combines object construction and stream opening in a single statement. Both forms of opening a file are valid and equivalent.
To check whether a file stream was successful in opening a file, call the member function is_open() with no arguments. This member function returns a bool value of true if the stream object is indeed associated with an open file, or false otherwise:
if (myfile.is_open()) { /* ok, proceed with output */ }
Closing a file
When we are
finished with our input and output operations on a file we shall close it so
that its resources become available again. In order to do that we have to call
the stream's member function close(). This member function takes no parameters, and what it does is to
flush the associated buffers and close the file:
myfile.close();
Once this
member function is called, the stream object can be used to open another file,
and the file is available again to be opened by other processes.
Text files
Text
file streams are those where we do not include the ios::binary flag in their opening mode.
These
files are designed to store text and thus all values that we input or output
from/to them can suffer some formatting transformations, which do not
necessarily correspond to their literal binary value.
Data output operations on text files are performed in the same way we operated with cout:
// writing on a text file
#include <iostream>
#include <fstream>
using namespace std;
int main ()
{
ofstream myfile ("example.txt");
if (myfile.is_open())
{
myfile << "This is a line.\n";
myfile << "This is another line.\n";
myfile.close();
}
else cout << "Unable to open file";
return 0;
}
Data input from a file can also be performed in the same way that we did with cin:
// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
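int main () {
  string line;
  ifstream myfile ("example.txt");
  if (myfile.is_open())
  {
    // getline returns the stream, which tests false at end of file or on error;
    // this avoids processing a failed read, as a loop on eof() alone would
    while ( getline (myfile,line) )
    {
      cout << line << endl;
    }
    myfile.close();
  }
  else cout << "Unable to open file";
  return 0;
}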
Checking state flags
In addition to eof(), which checks if the end of file has been reached, other member functions exist to check the state of a stream (all of them return a bool value):
bad()   Returns true if a reading or writing operation fails. For example, in the case that we try to write to a file that is not open for writing or if the device where we try to write has no space left.
fail()  Returns true in the same cases as bad(), but also in the case that a format error happens, like when an alphabetical character is extracted when we are trying to read an integer number.
eof()   Returns true if a file open for reading has reached the end.
good()  It is the most generic state flag: it returns false in the same cases in which calling any of the previous functions would return true.
In order to reset the state flags checked by any of these member functions we can use the member function clear(), which can be called with no arguments.
get and put stream pointers
All i/o stream objects have at least one internal stream pointer:
ifstream, like istream, has a pointer known as the get pointer that points to the element to be read in the next input operation.
ofstream, like ostream, has a pointer known as the put pointer that points to the location where the next element has to be written.
Finally, fstream inherits both the get and the put pointers from iostream (which is itself derived from both istream and ostream).
These internal stream pointers that point to the reading or writing locations within a stream can be manipulated using the following member functions:
tellg() and tellp()
These two member functions have no parameters and return a value of the member type pos_type, which represents the current position of the get stream pointer (in the case of tellg) or the put stream pointer (in the case of tellp).
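seekg() and seekp()
These functions allow us to change the position of the get and put stream pointers.
seekg ( position );
seekp ( position );
Using this prototype the stream pointer is changed to the absolute position position (counting from the beginning of the file).
seekg ( offset, direction );
seekp ( offset, direction );
Using this prototype, the position is set to an offset relative to a point given by direction: ios::beg (the beginning of the stream), ios::cur (the current position) or ios::end (the end of the stream).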
The following example uses the member functions
we have just seen to obtain the size of a file:
// obtaining file size
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  streampos begin, end;   // type returned by tellg()
  ifstream myfile ("example.txt");
  begin = myfile.tellg();
  myfile.seekg (0, ios::end);
  end = myfile.tellg();
  myfile.close();
  cout << "size is: " << (end-begin) << " bytes.\n";
  return 0;
}
Preprocessor directives
Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.
These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).
Macro definitions (#define, #undef)
To define preprocessor macros we can use #define. Its format is:
#define identifier replacement
When
the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement.
This replacement can be an expression, a statement, a block or simply
anything. The preprocessor does not understand C++, it simply replaces any
occurrence of identifier by replacement.
#define TABLE_SIZE 100
int table1[TABLE_SIZE];
int table2[TABLE_SIZE];
After
the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:
int table1[100];
int table2[100];
Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)
These directives allow us to include or discard part of the code of a program if a certain condition is met. #ifdef allows a section of a program to be compiled only if the macro that is specified as the parameter has been defined, no matter what its value is. For example:
#ifdef TABLE_SIZE
int table[TABLE_SIZE];
#endif
In
this case, the line of code int table[TABLE_SIZE]; is
only compiled if TABLE_SIZE was previously defined with #define, independently of its value. If it was not defined, that
line will not be included in the program compilation. #ifndef serves for the exact opposite: the code between #ifndef and #endif directives is only compiled if the
specified identifier has not been previously defined. For example:
#ifndef TABLE_SIZE
#define TABLE_SIZE 100
#endif
int table[TABLE_SIZE];
In this case, if when arriving at
this piece of code, the TABLE_SIZE macro has not been defined
yet, it would be defined to a value of 100. If it already existed it would keep
its previous value since the #define directive would not be
executed.
Source file inclusion (#include)
This directive has also been used assiduously in other sections of this tutorial. When the preprocessor finds an #include directive it replaces it by the entire content of the specified file. There are two ways to specify a file to be included:
#include "file"
#include <file>
The only difference between both expressions is the places (directories) where the compiler is going to look for the file.
In the first case where the file name is specified between double-quotes, the file is searched first in the same directory that includes the file containing the directive. In case that it is not there, the compiler searches the file in the default directories where it is configured to look for the standard header files.
If the file name is enclosed between angle-brackets <> the file is searched directly where the compiler is configured to look for the standard header files. Therefore, standard header files are usually included in angle-brackets, while other specific header files are included using quotes.
In each header file that contains a structure, you should first check to
see if this header has already been included in this particular cpp
file.
You do this by testing a preprocessor flag. If the flag isn’t set, the file
wasn’t included and you should set the flag (so the structure can’t get re-declared)
and declare the structure. If the flag was set then that type has already been
declared so you should just ignore the code that declares it.
Here’s how the header file should look:
#ifndef HEADER_FLAG
#define HEADER_FLAG
// Type declaration here...
#endif // HEADER_FLAG
As you can see, the first time the header file is included, the contents of
the header file (including your type declaration) will be included by the
preprocessor.
All the subsequent times it is included
– in a single compilation unit – the type declaration will be ignored.
The name HEADER_FLAG can be any unique name, but a reliable standard to
follow is to capitalize the name of the header file and replace periods with underscores
(leading underscores, however, are reserved for system names).
Here’s an example:
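//: Simple.h
// Simple header that prevents re-definition
#ifndef SIMPLE_H
#define SIMPLE_H
struct Simple {
  int quantity;
  float price;
};
// inline, so that including this header from several source files
// does not produce multiple definitions at link time
inline float total(Simple s) { return s.quantity * s.price; }
#endif // SIMPLE_H ///:~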
Although the SIMPLE_H after the #endif is commented out and
thus ignored by the preprocessor, it is useful for documentation.
These preprocessor statements that prevent
multiple inclusion are often referred to as include guards.
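The effect of an include guard can be sketched in a single file by writing out what the preprocessor would see if a guarded header were included twice. The header name point.h, the flag POINT_H and the Point type are hypothetical, chosen only for this illustration:

```cpp
// --- first inclusion of a hypothetical point.h ---
#ifndef POINT_H
#define POINT_H
struct Point {
  int x;
  int y;
};
#endif // POINT_H

// --- second inclusion of the same header ---
// POINT_H is already defined, so everything up to #endif is skipped
// and Point is not declared a second time.
#ifndef POINT_H
#define POINT_H
struct Point {
  int x;
  int y;
};
#endif // POINT_H

// Uses the single declaration of Point that survived preprocessing.
int sum_coordinates (Point p) {
  return p.x + p.y;
}
```

Without the #ifndef/#define/#endif lines, the second copy of struct Point would be a redefinition and the compiler would reject the file.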
When programmers talk about creating programs, they often say, "it compiles fine" or, when asked if the program works, "let's compile it and see".
This colloquial usage might
later be a source of confusion for new programmers. Compiling isn't quite
the same as creating an executable file!
Instead, creating an executable is a multistage process divided into two components: compilation and linking.
In reality, even if a program "compiles fine" it might not actually work because of errors during the linking phase. The total process of going from source code files to an executable might better be referred to as a build.
Compilation refers to the processing of source code files (.c, .cc, or .cpp) and the creation of an 'object' file.
This step doesn't create anything the user can actually run. Instead, the compiler merely produces the machine language instructions that correspond to the source code file that was compiled.
For instance, if you compile (but don't link) three separate files, you will have three object files created as output, each with the name <filename>.o or <filename>.obj (the extension will depend on your compiler). Each of these files contains a translation of your source code file into a machine language file -- but you can't run them yet! You need to turn them into executables your operating system can use. That's where the linker comes in.
Linking refers to the creation of a single executable file from multiple object files.
In this step, it is common that the linker will complain about undefined functions (commonly, main itself).
During compilation, if the compiler cannot find the definition of a particular function, it simply assumes that the function is defined in another file. If this isn't the case, the compiler has no way of knowing, since it doesn't look at the contents of more than one file at a time.
The linker, on the other hand, can look at multiple files and tries to find definitions for the functions that were referenced but not defined.
You might ask why there are separate compilation and linking steps.
First, it's probably easier to implement things that way. The compiler does its thing, and the linker does its thing -- by keeping the functions separate, the complexity of the program is reduced.
Another (more obvious) advantage is that this allows the creation of large programs without having to redo the compilation step every time a file is changed.
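Instead, using so-called "conditional compilation" (here meaning that files are recompiled only when needed, not the #ifdef directives described earlier), it is necessary to compile only those source files that have changed; for the rest, the existing object files are sufficient input for the linker.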
Finally, this makes it simple
to implement libraries of pre-compiled code: just create object files and link
them just like any other object file. (The fact that each file is compiled
separately from information contained in other files, incidentally, is called
the "separate compilation model".)
To get the full benefits of conditional compilation, it's probably easier to get a program to help you than to try to remember which files you've changed since you last compiled. (You could, of course, just recompile every file that has a timestamp later than the timestamp of the corresponding object file.)
If you're working with an
integrated development environment (IDE) it may already take care of this for
you. If you're using command line tools, there's a nifty utility called make that comes with most
*nix distributions. Along with conditional compilation, it has several other
nice features for programming, such as allowing different compilations of your
program -- for instance, if you have a version producing verbose output for
debugging.
Knowing the difference between the compilation phase and the link phase can
make it easier to hunt for bugs.
Compiler errors are usually syntactic in nature -- a missing semicolon, an extra parenthesis.
Linking errors usually have to do with missing or multiple definitions. If the linker reports that a function or variable is defined multiple times, that's a good indication that two of your source code files define the same function or variable.
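The split between the two phases can be sketched with a function that is declared before it is used and defined only afterwards. The names square and sum_of_squares are hypothetical, and both parts are kept in one file here only so that the example is self-contained:

```cpp
// Declaration only: this is all the compiler needs in order to
// accept calls to square() in this file.
int square (int n);

// Compiles using just the declaration above.
int sum_of_squares (int a, int b) {
  return square(a) + square(b);
}

// Definition: in a larger program this could live in a separate
// source file. If it were missing everywhere, this file would still
// compile, but the linker would typically report square as undefined.
int square (int n) {
  return n * n;
}
```

Defining square a second time in another source file of the same program would produce the opposite kind of linker error: a multiple definition.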
Object Oriented Programming
Classes
A class is an expanded
concept of a data structure: instead of holding only data, it can hold both
data and functions.
An object is an instantiation
of a class. In terms of variables, a class would be the type, and an object
would be the variable.
Classes are generally declared using the keyword class, with the following format:
class class_name {
  access_specifier_1:
    member1;
  access_specifier_2:
    member2;
  ...
} object_names;
where class_name is a valid identifier for the class and object_names is an optional list of names for objects of this class.
The body of the declaration can contain members, which can be either data or function declarations, and optionally access specifiers.
All of this is very similar to the declaration of data structures, except that we can now also include functions as members, as well as this new thing called an access specifier.
An access specifier is one of the following three keywords: private, public or protected.
These specifiers modify the access rights that the members following them
acquire:
· private members of a class are accessible only from within other members of the same class or from their friends.
· protected members are accessible from members of their same class and from their friends, but also from members of their derived classes.
· Finally, public members are accessible from anywhere the object is visible.
By default, all members of a class declared with the class keyword have private access. Therefore, any member that is declared before any other access specifier automatically has private access. For example:
class CRectangle {
int x, y;
public:
void set_values (int,int);
int area (void);
} rect;
This declares a class (i.e., a type) called CRectangle and an object (i.e., a variable) of this class called rect. This class contains four members: two data members of type int (member x and member y) with private access (because private is the default access level) and two member functions with public access: set_values() and area(), of which for now we have only included the declaration, not the definition.
Notice the difference between the class name and the object name:
In the previous example, CRectangle was the class name (i.e., the type),
whereas rect was an object of type CRectangle.
It is the same relationship int and a have in the following declaration:
int a;
where int is the type name (the
class) and a is the variable name (the object).
After the previous declarations of CRectangle and
rect, we can refer within the body of the program to any of
the public members of the object rect as if they were normal
functions or normal variables, just by putting the object's name followed by a
dot (.)
and then the name of the member. All very similar to what we did with plain
data structures before. For example:
rect.set_values (3,4);
myarea = rect.area();
The only members of rect that we cannot access from the body of our program outside the class are x and y, since they have private access and can only be referred to from within other members of that same class.
Here is the complete example of class CRectangle:
// classes example
#include <iostream>
using namespace std;

class CRectangle {
    int x, y;
  public:
    void set_values (int,int);
    int area () {return (x*y);}
};

void CRectangle::set_values (int a, int b) {
  x = a;
  y = b;
}

int main () {
  CRectangle rect;
  rect.set_values (3,4);
  cout << "area: " << rect.area();
  return 0;
}
The most important new thing in this code is the scope operator (::, two colons) included in the definition of set_values(). It is used to define a member of a class from outside the class definition itself.
You may notice that the definition of the member function area() has been included directly within the definition of the CRectangle class, given its extreme simplicity, whereas set_values() has only its prototype declared within the class, with its definition outside it.
In this outside definition, we must use the scope operator (::) to specify that we are defining a function that is a member of the class CRectangle and not a regular global function. The scope operator specifies the class to which the member being defined belongs, granting exactly the same scope properties as if the function definition were directly included within the class definition.
For example, in the function set_values() of the previous code, we have been able to use the variables x and y, which are private members of class CRectangle and are therefore only accessible from other members of their class.
The only difference between defining a class member function completely within its class and including only the prototype, with the definition given later, is that in the first case the function is automatically considered an inline member function by the compiler, while in the second it is a normal (non-inline) class member function. This makes no difference in behavior.
Members x and y have private
access (remember that if nothing else is said, all members of a class defined
with keyword class have private access). By declaring them private we deny
access to them from anywhere outside the class.
This
makes sense, since we have already defined a member function to set values for
those members within the object: the member function set_values().
Therefore, the rest of the program does not need to have direct access to them.
Perhaps in an example as simple as this one it is difficult to see the usefulness of protecting those two variables, but in larger projects it may be very important that values cannot be modified in an unexpected way (unexpected from the point of view of the object).
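One way to see the benefit is a setter that refuses values the object considers invalid. The following is only a sketch: CCheckedRectangle is a hypothetical variation of CRectangle, and the rule that sides must not be negative is an assumption made for the illustration.

```cpp
// CCheckedRectangle is a hypothetical variation of CRectangle.
class CCheckedRectangle {
    int x, y;                      // private: reachable only through members
  public:
    bool set_values (int,int);
    int area () {return (x*y);}
};

// Stores the new sides only if both are non-negative; otherwise
// leaves the object unchanged and reports failure.
bool CCheckedRectangle::set_values (int a, int b) {
  if (a < 0 || b < 0) return false;
  x = a;
  y = b;
  return true;
}
```

Because x and y are private, set_values() is the only route by which the rest of the program can change them, so a negative side can never be stored. As in the original class, area() should only be called after a successful set_values().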
Classes defined with struct and union
Classes can be defined not only with the keyword class, but also with the keywords struct and union.
The concepts of class and data structure are so similar that both keywords (struct and class) can be used in C++ to declare classes (i.e. structs can also have function members in C++, not only data members). The only difference between the two is that members of classes declared with the keyword struct have public access by default, while members of classes declared with the keyword class have private access by default. For all other purposes both keywords are equivalent.
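The difference in default access can be sketched with two otherwise identical types. SPair, CPair and the two helper functions are hypothetical names used only for this illustration:

```cpp
// SPair and CPair are hypothetical types used only for illustration.
struct SPair {
  int first, second;               // public by default
  int sum () {return first+second;}
};

class CPair {
    int first, second;             // private by default
  public:
    void set_values (int a, int b) { first = a; second = b; }
    int sum () {return first+second;}
};

// Members of a struct can be assigned directly from outside.
int struct_demo () {
  SPair p;
  p.first = 1;
  p.second = 2;
  return p.sum();
}

// Members of a class need a public member function to be set.
int class_demo () {
  CPair p;
  // p.first = 1;                  // would not compile: first is private
  p.set_values (1,2);
  return p.sum();
}
```

Both functions return 3; the only thing that changes is which members the code outside the type is allowed to touch.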
Classes (continued)
One of the great advantages of a class is that, as with any other type, we can declare several objects of it. For example, continuing with the previous example of class CRectangle, we could have declared the object rectb in addition to the object rect:
// example: one class, two objects
#include <iostream>
using namespace std;

class CRectangle {
    int x, y;
  public:
    void set_values (int,int);
    int area () {return (x*y);}
};

void CRectangle::set_values (int a, int b) {
  x = a;
  y = b;
}

int main () {
  CRectangle rect, rectb;
  rect.set_values (3,4);
  rectb.set_values (5,6);
  cout << "rect area: " << rect.area() << endl;
  cout << "rectb area: " << rectb.area() << endl;
  return 0;
}
In this concrete case, the class (the type of the objects) that we are talking about is CRectangle, of which there are two instances or objects: rect and rectb. Each one of them has its own member variables and member functions.
Notice that the call to rect.area() does not give the same result as the call to rectb.area(). This is because each object of class CRectangle has its own variables x and y and, in a sense, also its own function members set_values() and area(), each of which operates on its object's own variables.
That is the basic concept of object-oriented programming: data and functions are both members of the object. We no longer use sets of global variables that we pass from one function to another as parameters, but instead we handle objects that have their own data and functions embedded as members.
Notice that we have not had to give any parameters in any of the calls to rect.area() or rectb.area(). Those member functions directly used the data members of their respective objects rect and rectb.
Constructors and destructors
Objects generally need to initialize variables or allocate dynamic memory during their creation in order to become operative and to avoid returning unexpected values during their execution.
For example, what would happen if in the previous example we called the member function area() before having called the function set_values()? We would probably get an indeterminate result, since the members x and y would never have been assigned a value.
In order to avoid that, a class can include a special function called a constructor, which is automatically called whenever a new object of this class is created. This constructor function must have the same name as the class, and cannot have any return type, not even void.
We are going to implement CRectangle including a constructor:
// example: class constructor
#include <iostream>
using namespace std;

class CRectangle {
    int width, height;
  public:
    CRectangle (int,int);
    int area () {return (width*height);}
};

CRectangle::CRectangle (int a, int b) {
  width = a;
  height = b;
}

int main () {
  CRectangle rect (3,4);
  CRectangle rectb (5,6);
  cout << "rect area: " << rect.area() << endl;
  cout << "rectb area: " << rectb.area() << endl;
  return 0;
}
As you can see, the result of this example is identical to the previous one. But now we have removed the member function set_values() and have included instead a constructor that performs a similar action: it initializes the values of width and height with the parameters that are passed to it.
Notice how these arguments are passed to the constructor at the moment at which the objects of this class are created:
CRectangle rect (3,4);
CRectangle rectb (5,6);
Constructors cannot be called explicitly as if
they were regular member functions. They are only executed when a new object of
that class is created.
Overloading Constructors
Like any other function, a constructor can also be overloaded with several versions that have the same name but different types or numbers of parameters.
Remember that for overloaded functions the compiler will call the one whose parameters match the arguments used in the function call. In the case of constructors, which are automatically called when an object is created, the one executed is the one that matches the arguments passed in the object declaration:
// overloading class constructors
#include <iostream>
using namespace std;

class CRectangle {
    int width, height;
  public:
    CRectangle ();
    CRectangle (int,int);
    int area (void) {return (width*height);}
};

CRectangle::CRectangle () {
  width = 5;
  height = 5;
}

CRectangle::CRectangle (int a, int b) {
  width = a;
  height = b;
}

int main () {
  CRectangle rect (3,4);
  CRectangle rectb;
  cout << "rect area: " << rect.area() << endl;
  cout << "rectb area: " << rectb.area() << endl;
  return 0;
}
In this case, rectb was declared without any arguments, so it has been initialized with the constructor that has no parameters, which initializes both width and height with a value of 5.
Important: Notice how if we declare a new object and we want to use its default constructor (the one without parameters), we do not include parentheses ():
CRectangle rectb;   // right
CRectangle rectb(); // wrong! this declares a function named rectb that returns a CRectangle
Default constructor
If you do not declare any constructors in a class definition, the
compiler assumes the class to have a default constructor with no arguments.
Therefore, after declaring a class like this one:
class CExample {
  public:
    int a,b,c;
    void multiply (int n, int m) { a=n; b=m; c=a*b; }
};
The compiler assumes that CExample has a default constructor, so you can
declare objects of this class by simply declaring them without any arguments:
CExample ex;
But as soon as you declare your own constructor for a class, the
compiler no longer provides an implicit default constructor. So you have to
declare all objects of that class according to the constructor prototypes you
defined for the class:
class CExample {
  public:
    int a,b,c;
    CExample (int n, int m) { a=n; b=m; }
    void multiply () { c=a*b; }
};
Here we have declared a constructor that takes two parameters of type int. Therefore the following object declaration would be correct:
CExample ex (2,3);
But,
CExample ex;
would not be correct, since we have declared the class to have an explicit constructor, thus replacing the default constructor.
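If both forms of declaration are wanted, one option is to declare a constructor without parameters alongside the two-parameter one, using the overloading described earlier. The following is a sketch of such a variation of CExample, not part of the original example; the choice of 0 as the initial value is an assumption:

```cpp
// Variation of CExample with two overloaded constructors.
class CExample {
  public:
    int a,b,c;
    // Declared explicitly, because the compiler no longer
    // provides a default constructor on its own.
    CExample () { a=0; b=0; c=0; }
    CExample (int n, int m) { a=n; b=m; c=0; }
    void multiply () { c=a*b; }
};
```

With this version, both CExample ex; and CExample ex (2,3); are accepted by the compiler.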
Pointers to classes
Overloading operators
The keyword this
The keyword this represents a pointer to the object whose member function is being executed. It is a pointer to the object itself.
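One possible use of this is to check whether an argument passed to a member function is the object itself. The sketch below uses a hypothetical class CDummy and a hypothetical member isitme(), neither of which appears elsewhere in these notes:

```cpp
// CDummy is a hypothetical class used only for illustration.
class CDummy {
  public:
    // Returns true when param is the very object on which
    // the function was called.
    bool isitme (CDummy& param) {
      return &param == this;
    }
};
```

Given CDummy a; and CDummy* b = &a;, the call b->isitme(a) is true, because inside the function this holds the address of a. Called on any other CDummy object, isitme(a) is false.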
Static members
Friend functions
Friend classes
Inheritance between classes
Polymorphism