Programming: Principles and Practice Using C++ (2014)

Part I: The Basics

8. Technicalities: Functions, etc.

“No amount of genius can overcome
obsession with detail.”

—Traditional

In this chapter and the next, we change our focus from programming to our main tool for programming: the C++ programming language. We present language-technical details to give a slightly broader view of C++’s basic facilities and to provide a more systematic view of those facilities. These chapters also act as a review of many of the programming notions presented so far and provide an opportunity to explore our tool without adding new programming techniques or concepts.


8.1 Technicalities

8.2 Declarations and definitions

8.2.1 Kinds of declarations

8.2.2 Variable and constant declarations

8.2.3 Default initialization

8.3 Header files

8.4 Scope

8.5 Function call and return

8.5.1 Declaring arguments and return type

8.5.2 Returning a value

8.5.3 Pass-by-value

8.5.4 Pass-by-const-reference

8.5.5 Pass-by-reference

8.5.6 Pass-by-value vs. pass-by-reference

8.5.7 Argument checking and conversion

8.5.8 Function call implementation

8.5.9 constexpr functions

8.6 Order of evaluation

8.6.1 Expression evaluation

8.6.2 Global initialization

8.7 Namespaces

8.7.1 using declarations and using directives


8.1 Technicalities

Given a choice, we’d much rather talk about programming than about programming language features; that is, we consider how to express ideas as code far more interesting than the technical details of the programming language that we use to express those ideas. To pick an analogy from natural languages: we’d much rather discuss the ideas in a good novel and the way those ideas are expressed than study the grammar and vocabulary of English. What matters are ideas and how those ideas can be expressed in code, not the individual language features.

However, we don’t always have a choice. When you start programming, your programming language is a foreign language for which you need to look at “grammar and vocabulary.” This is what we will do in this chapter and the next, but please don’t forget:

• Our primary study is programming.

• Our output is programs/systems.

• A programming language is (only) a tool.

Keeping this in mind appears to be amazingly difficult. Many programmers come to care passionately about apparently minor details of language syntax and semantics. In particular, too many get the mistaken belief that the way things are done in their first programming language is “the one true way.” Please don’t fall into that trap. C++ is in many ways a very nice language, but it is not perfect; neither is any other programming language.

Most design and programming concepts are universal, and many such concepts are widely supported by popular programming languages. That means that the fundamental ideas and techniques we learn in a good programming course carry over from language to language. They can be applied — with varying degrees of ease — in all languages. The language technicalities, however, are specific to a given language. Fortunately, programming languages do not develop in a vacuum, so much of what you learn here will have reasonably obvious counterparts in other languages. In particular, C++ belongs to a group of languages that also includes C (Chapter 27), Java, and C#, so quite a few technicalities are shared with those languages.

Note that when we are discussing language-technical issues, we deliberately use nondescriptive names, such as f, g, X, and y. We do that to emphasize the technical nature of such examples, to keep those examples very short, and to try to avoid confusing you by mixing language technicalities and genuine program logic. When you see nondescriptive names (of a kind that should never be used in real code), please focus on the language-technical aspects of the code. Technical examples typically contain code that simply illustrates language rules. If you compiled and ran them, you’d get many “variable not used” warnings, and few such technical program fragments would do anything sensible.

Please note that what we write here is not a complete description of C++’s syntax and semantics, not even for the facilities we describe. The ISO C++ standard is 1300+ pages of dense technical language and The C++ Programming Language by Stroustrup is 1300+ pages of text aimed at experienced programmers (each covering both the C++ language and its standard library). We do not try to compete with those in completeness and comprehensiveness; we compete with them in comprehensibility and value for time spent reading.

8.2 Declarations and definitions

A declaration is a statement that introduces a name into a scope (§8.4):

• Specifying a type for what is named (e.g., a variable or a function)

• Optionally, specifying an initializer (e.g., an initializer value or a function body)

For example:

int a = 7;                             // an int variable
const double cd = 8.7;      // a double-precision floating-point constant
double sqrt(double);        // a function taking a double argument
                                           // and returning a double result
vector<Token> v;              // a vector-of-Tokens variable

Before a name can be used in a C++ program, it must be declared. Consider:

int main()
{
         cout << f(i) << '\n';
}

The compiler will give at least three “undeclared identifier” errors for this: cout, f, and i are not declared anywhere in this program fragment. We can get cout declared by including the header std_lib_facilities.h, which contains its declaration:

#include "std_lib_facilities.h"         // we find the declaration of cout in here
int main()
{
          cout << f(i) << '\n';
}

Now, we get only two “undeclared identifier” errors. As you write real-world programs, you’ll find that most declarations are found in headers. That’s where we define interfaces to useful facilities defined “elsewhere.” Basically, a declaration defines how something can be used; it defines the interface of a function, variable, or class. Please note one obvious but invisible advantage of this use of declarations: we didn’t have to look at the details of how cout and its << operators were defined; we just #included their declarations. We didn’t even have to look at their declarations; from textbooks, manuals, code examples, or other sources, we just know how cout is supposed to be used. The compiler reads the declarations in the header that it needs to “understand” our code.

However, we still have to declare f and i. We could do that like this:

#include "std_lib_facilities.h"       // we find the declaration of cout in here

int f(int);                                           // declaration of f

int main()
{
          int i = 7;                                  // declaration of i
          cout << f(i) << '\n';
}

This will compile because every name has been declared, but it will not link (§2.4) because we have not defined f(); that is, nowhere have we specified what f() actually does.
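
As a minimal sketch, here is how the link error goes away once f() gets a definition somewhere in the program. The body of f() (returning its argument plus one) and the helper name call_f are assumptions made for illustration; the text does not say what f() is supposed to do.

```cpp
int f(int);              // declaration: enough for the compiler to check calls to f

int call_f(int i)        // hypothetical caller, standing in for main()
{
    return f(i);         // compiles because f has been declared above
}

int f(int x)             // definition: now the linker can find what f actually does
{
    return x + 1;        // assumed body, for illustration only
}
```

With the definition present, a call such as call_f(7) both compiles and links.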

A declaration that (also) fully specifies the entity declared is called a definition. For example:

int a = 7;
vector<double> v;
double sqrt(double d) { /* . . . */ }

Every definition is (by definition) also a declaration, but only some declarations are also definitions. Here are some examples of declarations that are not definitions; if the entity it refers to is used, each must be matched by a definition elsewhere in the code:

double sqrt(double);      // no function body here
extern int a;                     // “extern plus no initializer” means “not definition”

When we contrast definitions and declarations, we follow convention and use declarations to mean “declarations that are not definitions” even though that’s slightly sloppy terminology.

A definition specifies exactly what a name refers to. In particular, a definition of a variable sets aside memory for that variable. Consequently, you can’t define something twice. For example:

double sqrt(double d) { /* . . . */ }        // definition
double sqrt(double d) { /* . . . */ }        // error: double definition

int a;                                                      // definition
int a;                                                      // error: double definition

In contrast, a declaration that isn’t also a definition simply tells how you can use a name; it is just an interface and doesn’t allocate memory or specify a function body. Consequently, you can declare something as often as you like as long as you do so consistently:

int x = 7;                                             // definition
extern int x;                                       // declaration
extern int x;                                       // another declaration

double sqrt(double);                        // declaration
double sqrt(double d) { /* . . . */ }   // definition
double sqrt(double);                        // another declaration of sqrt
double sqrt(double);                        // yet another declaration of sqrt

int sqrt(double);                                // error: inconsistent declarations of sqrt

Why is that last declaration an error? Because there cannot be two functions called sqrt taking an argument of type double and returning different types (int and double).

The extern keyword used in the second declaration of x simply states that this declaration of x isn’t a definition. It is rarely useful. We recommend that you don’t use it, but you’ll see it in other people’s code, especially code that uses too many global variables (see §8.4 and §8.6.2).

Why does C++ offer both declarations and definitions? The declaration/definition distinction reflects the fundamental distinction between what we need to use something (an interface) and what we need for that something to do what it is supposed to (an implementation). For a variable, a declaration supplies the type but only the definition supplies the object (the memory). For a function, a declaration again provides the type (argument types plus return type) but only the definition supplies the function body (the executable statements). Note that function bodies are stored in memory as part of the program, so it is fair to say that function and variable definitions consume memory, whereas declarations don’t.

The declaration/definition distinction allows us to separate a program into many parts that can be compiled separately. The declarations allow each part of a program to maintain a view of the rest of the program without bothering with the definitions in other parts. As all declarations (including the one definition) must be consistent, the use of names in the whole program will be consistent. We’ll discuss that further in §8.3. Here, we’ll just remind you of the expression parser from Chapter 6: expression() calls term() which calls primary() which calls expression(). Since every name in a C++ program has to be declared before it is used, there is no way we could just define those three functions:

double expression();             // just a declaration, not a definition

double primary()
{
          // . . .
          expression();
          // . . .
}

double term()
{
          // . . .
          primary();
          // . . .
}

double expression()
{
          // . . .
          term();
          // . . .
}

We can order those three functions any way we like; there will always be one call to a function defined below it. Somewhere, we need a “forward” declaration. Therefore, we declared expression() before the definition of primary() and all is well. Such cyclic calling patterns are very common.
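
The same cyclic pattern can be seen in miniature with two functions that call each other. This is only a sketch: the names is_even and is_odd are hypothetical, and both functions assume a nonnegative argument.

```cpp
bool is_odd(int n);          // forward declaration, needed by is_even below

// Both functions assume n >= 0 (an assumption of this sketch).
bool is_even(int n)
{
    if (n == 0) return true;
    return is_odd(n - 1);    // call to a function defined further down
}

bool is_odd(int n)
{
    if (n == 0) return false;
    return is_even(n - 1);   // is_even is already declared (and defined) above
}
```

Whichever of the two we define first, the other must have been declared before it.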

Why does a name have to be declared before it is used? Couldn’t we just require the language implementation to read the program (just as we do) and find the definition to see how a function must be called? We could, but that would lead to “interesting” technical problems, so we decided against that. The C++ definition requires declaration before use (except for class members; see §9.4.4). After all, this is already the convention for ordinary (non-program) writing: when you read a textbook, you expect the author to define terminology before using it; otherwise, you have to guess or go to the index all the time. The “declaration before use” rule simplifies reading for both humans and compilers. In a program, there is a second reason that “declare before use” is important. In a program of thousands of lines (maybe hundreds of thousands of lines), most of the functions we want to call will be defined “elsewhere.” That “elsewhere” is often a place we don’t really want to know about. Having to know the declarations only of what we use saves us (and the compiler) from looking through huge amounts of program text.

8.2.1 Kinds of declarations

There are many kinds of entities that a programmer can define in C++. The most interesting are

• Variables

• Constants

• Functions (see §8.5)

• Namespaces (see §8.7)

• Types (classes and enumerations; see Chapter 9)

• Templates (see Chapter 19)

8.2.2 Variable and constant declarations

The declaration of a variable or a constant specifies a name, a type, and optionally an initializer. For example:

int a;                                          // no initializer
double d = 7;                           // initializer using the = syntax
vector<int> vi(10);                  // initializer using the ( ) syntax
vector<int> vi2 {1,2,3,4};       // initializer using the { } syntax

You can find the complete grammar in the ISO C++ standard.

Constants have the same declaration syntax as variables. They differ in having const as part of their type and requiring an initializer:

const int x = 7;         // initializer using the = syntax
const int x2 {9};       // initializer using the {} syntax
const int y;               // error: no initializer

The reason for requiring an initializer for a const is obvious: how could a const be a constant if it didn’t have a value? It is almost always a good idea to initialize variables also; an uninitialized variable is a recipe for obscure bugs. For example:

void f(int z)
{
          int x;                              // uninitialized
          // . . . no assignment to x here . . .
          x = 7;                             // give x a value
          // . . .
}

This looks innocent enough, but what if the first . . . included a use of x? For example:

void f(int z)
{
          int x;                              // uninitialized
          // . . . no assignment to x here . . .
          if (z>x) {
          // . . .
          }
          // . . .
          x = 7;                        // give x a value
          // . . .
}

Because x is uninitialized, executing z>x would be undefined behavior. The comparison z>x could give different results on different machines and different results in different runs of the program on the same machine. In principle, z>x might cause the program to terminate with a hardware error, but most often that doesn’t happen. Instead we get unpredictable results.

Naturally, we wouldn’t do something like that deliberately, but if we don’t consistently initialize variables it will eventually happen by mistake. Remember, most “silly mistakes” (such as using an uninitialized variable before it has been assigned to) happen when you are busy or tired. Compilers try to warn, but in complicated code — where such errors are most likely to occur — compilers are not smart enough to catch all such errors. There are people who are not in the habit of initializing their variables, often because they learned to program in languages that didn’t allow or encourage consistent initialization; so you’ll see examples in other people’s code. Please just don’t add to the problem by forgetting to initialize the variables you define yourself.

We have a preference for the { } initializer syntax. It is the most general and it most explicitly says “initializer.” We tend to use it except for very simple initializations, where we sometimes use = out of old habits, and ( ) for specifying the number of elements of a vector (see §17.4.4).
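
For vector, the choice between ( ) and { } changes the meaning, which is why we keep ( ) for element counts. The sketch below uses two hypothetical helper functions and writes std::vector explicitly instead of relying on std_lib_facilities.h.

```cpp
#include <vector>

int size_with_parens()
{
    std::vector<int> v(10);      // ( ): ten elements, each initialized to 0
    return static_cast<int>(v.size());
}

int size_with_braces()
{
    std::vector<int> v {10};     // { }: one element with the value 10
    return static_cast<int>(v.size());
}
```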

8.2.3 Default initialization

You might have noticed that we often don’t provide an initializer for strings, vectors, etc. For example:

vector<string> v;
string s;
while (cin>>s) v.push_back(s);

This is not an exception to the rule that variables must be initialized before use. What is going on here is that string and vector are defined so that variables of those types are initialized with a default value whenever we don’t supply one explicitly. Thus, v is empty (it has no elements) and s is the empty string ("") before we reach the loop. The mechanism for guaranteeing default initialization is called a default constructor; see §9.7.3.

Unfortunately, the language doesn’t allow us to make such guarantees for built-in types. A global variable (§8.4) is default initialized to 0, but you should minimize the use of global variables. The most useful variables, local variables and class members, are uninitialized unless you provide an initializer (or a default constructor). You have been warned!
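
A small sketch of the difference, using hypothetical function names. The empty { } initializer used for the int is not discussed in this section; we use it here only as one way of giving a built-in type a known value.

```cpp
#include <string>
#include <vector>

bool library_types_start_empty()
{
    std::string s;               // default constructor: s is ""
    std::vector<int> v;          // default constructor: v has no elements
    return s.empty() && v.empty();
}

int explicitly_initialized_int()
{
    int n {};                    // empty { }: n gets the value 0
    // int m;                    // by contrast, a local declared like this is uninitialized
    return n;
}
```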

8.3 Header files

How do we manage our declarations and definitions? After all, they have to be consistent, and in real-world programs there can be tens of thousands of declarations; programs with hundreds of thousands of declarations are not rare. Typically, when we write a program, most of the definitions we use are not written by us. For example, the implementations of cout and sqrt() were written by someone else many years ago. We just use them.

The key to managing declarations of facilities defined “elsewhere” in C++ is the header. Basically, a header is a collection of declarations, typically defined in a file, so a header is also called a header file. Such headers are then #included in our source files. For example, we might decide to improve the organization of the source code for our calculator (Chapters 6 and 7) by separating out the token management. We could define a header file token.h containing declarations needed to use Token and Token_stream:

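[Figure: the header token.h, #included by token.cpp and by the source files that use Token and Token_stream]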

The declarations of Token and Token_stream are in the header token.h. Their definitions are in token.cpp. The .h suffix is the most common for C++ headers, and the .cpp suffix is the most common for C++ source files. Actually, the C++ language doesn’t care about file suffixes, but some compilers and most program development environments insist, so please use this convention for your source code.

In principle, #include "file.h" simply copies the declarations from file.h into your file at the point of the #include. For example, we could write a header f.h:

// f.h
int f(int);

and include it in our file user.cpp:

// user.cpp
#include "f.h"
int g(int i)
{
          return f(i);
}

When compiling user.cpp the compiler would do the #include and compile

int f(int);
int g(int i)
{
          return f(i);
}

Since #includes logically happen before anything else a compiler does, handling #includes is part of what is called preprocessing (§A.17).

To ease consistency checking, we #include a header both in source files that use its declarations and in source files that provide definitions for those declarations. That way, the compiler catches errors as soon as possible. For example, imagine that the implementer of Token_stream::putback() made mistakes:

Token Token_stream::putback(Token t)
{
          buffer.push_back(t);
          return t;
}

This looks innocent enough. Fortunately, the compiler catches the mistakes because it sees the (#included) declaration of Token_stream::putback(). Comparing that declaration with our definition, the compiler finds that putback() should not return a Token and that buffer is a Token, rather than a vector<Token>, so we can’t use push_back(). Such mistakes occur when we work on our code to improve it, but don’t quite get a change consistent throughout a program.
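
For comparison, here is a sketch of declarations and matching definitions that the compiler would accept. It is shown as one piece of code with comments marking where token.h would end and token.cpp would begin. The member layout is an assumption based on the calculator of Chapters 6 and 7; the text above tells us only that putback() does not return a Token and that buffer is a single Token. The get() shown is deliberately simplified.

```cpp
#include <iostream>

// --- token.h (sketch): declarations only ---
struct Token {
    char kind;
    double value;
};

class Token_stream {
public:
    Token get();                 // get the next Token
    void putback(Token t);       // put a Token back; note: returns nothing
private:
    bool full = false;           // is there a Token in the buffer?
    Token buffer {};             // a single Token, not a vector<Token>
};

// --- token.cpp (sketch): definitions that match the declarations above ---
void Token_stream::putback(Token t)
{
    buffer = t;                  // buffer is a Token, so we assign rather than push_back
    full = true;
}

Token Token_stream::get()
{
    if (full) {                  // a Token was put back: hand it out first
        full = false;
        return buffer;
    }
    char ch = 0;
    std::cin >> ch;              // simplified: a real get() would also read numbers
    return Token{ch, 0.0};
}
```

Because token.cpp would #include token.h, any disagreement between a definition and its declaration is reported when token.cpp is compiled.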

Similarly, consider these mistakes:

Token t = ts.gett();          // error: no member gett
// . . .
ts.putback();                    // error: argument missing

The compiler would immediately give errors; the header token.h gives it all the information it needs for checking.

Our std_lib_facilities.h header contains declarations for the standard library facilities we use, such as cout, vector, and sqrt(), together with a couple of simple utility functions, such as error(), that are not part of the standard library. In §12.8 we show how to use the standard library headers directly.

A header will typically be included in many source files. That means that a header should only contain declarations that can be duplicated in several files (such as function declarations, class definitions, and definitions of numeric constants).

8.4 Scope

A scope is a region of program text. A name is declared in a scope and is valid (is “in scope”) from the point of its declaration until the end of the scope in which it was declared. For example:

void f()
{
          g();                  // error: g() isn’t (yet) in scope
}
void g()
{
          f();                   // OK: f() is in scope
}
void h()
{
          int x = y;         // error: y isn’t (yet) in scope
          int y = x;         // OK: x is in scope
          g();                  // OK: g() is in scope
}

Names in a scope can be seen from within scopes nested within it. For example, the call of f() is within the scope of g() which is “nested” in the global scope. The global scope is the scope that’s not nested in any other. The rule that a name must be declared before it can be used still holds, so f() cannot call g().

There are several kinds of scopes that we use to control where our names can be used:

• The global scope: the area of text outside any other scope

• A namespace scope: a named scope nested in the global scope or in another namespace; see §8.7

• A class scope: the area of text within a class; see §9.2

• A local scope: between { . . . } braces of a block or in a function argument list

• A statement scope: e.g., in a for-statement

The main purpose of a scope is to keep names local, so that they won’t interfere with names declared elsewhere. For example:

void f(int x)                  // f is global; x is local to f
{
          int z = x+7;        // z is local
}
int g(int x)                    // g is global; x is local to g
{
          int f = x+2;         // f is local
          return 2*f;
}

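Or graphically:

[Figure: the global scope contains f and g; the scope of f contains x and z; the scope of g contains x and f]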

Here f()’s x is different from g()’s x. They don’t “clash” because they are not in the same scope: f()’s x is local to f and g()’s x is local to g. Two incompatible declarations in the same scope are often referred to as a clash. Similarly, the f defined and used within g() is (obviously) not the function f().

Here is a logically equivalent but more realistic example of the use of local scope:

int max(int a, int b)            // max is global; a and b are local
{
          return (a>=b) ? a : b;
}
int abs(int a)                       // not max()’s a
{
          return (a<0) ? -a : a;
}

You find max() and abs() in the standard library, so you don’t have to write them yourself. The ?: construct is called an arithmetic if or a conditional expression. The value of (a>=b)?a:b is a if a>=b and b otherwise. A conditional expression saves us from writing long-winded code like this:

int max(int a, int b)          // max is global; a and b are local
{
          int m;                       // m is local
          if (a>=b)
                    m = a;
          else
                    m = b;
          return m;
}

So, with the notable exception of the global scope, a scope keeps names local. For most purposes, locality is good, so keep names as local as possible. When I declare my variables, functions, etc. within functions, classes, namespaces, etc., they won’t interfere with yours. Remember: Real programs have many thousands of named entities. To keep such programs manageable, most names have to be local.

Here is a larger technical example illustrating how names go out of scope at the end of statements and blocks (including function bodies):

// no r, i, or v here
class My_vector {
          vector<int> v;   // v is in class scope
public:
          int largest()
          {
                    int r = 0;                                    // r is local (smallest nonnegative int)
                    for (int i = 0; i<v.size(); ++i)
                              r = max(r,abs(v[i]));      // i is in the for’s statement scope
                    // no i here
                    return r;
          }
          // no r here
};
// no v here
int x;                               // global variable — avoid those where you can
int y;
int f()
{
          int x;                     // local variable, hides the global x
          x = 7;                    // the local x
          {
                    int x = y;    // local x initialized by global y, hides the previous local x
                    ++x;           // the x from the previous line
          }
          ++x;                     // the x from the first line of f()
          return x;
}

Whenever you can, avoid such complicated nesting and hiding. Remember: “Keep it simple!”
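
As a sketch of what “simple” could look like here, the same computation as f() above can be written with a distinct name for each variable, so that nothing hides anything. The names global_y, outer, inner, and f_without_hiding are hypothetical, and global_y is given the value 0 only to keep the sketch self-contained.

```cpp
int global_y = 0;        // hypothetical global standing in for y in the example above

int f_without_hiding()
{
    int outer = 7;               // was: the first local x
    {
        int inner = global_y;    // was: the nested x that hid the outer one
        ++inner;                 // changes inner only
    }
    ++outer;                     // clearly the variable from the first line
    return outer;
}
```

The logic is unchanged, but a reader no longer has to work out which x each line refers to.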

The larger the scope of a name is, the longer and more descriptive its name should be: x, y, and f are horrible as global names. The main reason that you don’t want global variables in your program is that it is hard to know which functions modify them. In large programs, it is basically impossible to know which functions modify a global variable. Imagine that you are trying to debug a program and you find that a global variable has an unexpected value. Who gave it that value? Why? What functions write to that value? How would you know? The function that wrote a bad value to that variable may be in a source file you have never seen! A good program will have only very few (say, one or two), if any, global variables. For example, the calculator in Chapters 6 and 7 had two global variables: the token stream, ts, and the symbol table, names.

Note that most C++ constructs that define scopes nest:

• Functions within classes: member functions (see §9.4.2)

class C {
public:
          void f();
          void g()    // a member function can be defined within its class
          {
                    // . . .
          }
          // . . .
};
void C::f()         // a member definition can be outside its class
{
          // . . .
}

This is the most common and useful case.

• Classes within classes: member classes (also called nested classes)

class C {
public:
          struct M {
                    // . . .
          };
          // . . .
};

This tends to be useful only in complicated classes; remember that the ideal is to keep classes small and simple.

• Classes within functions: local classes

void f()
{
          class L {
                    // . . .
          };
          // . . .
}

Avoid this; if you feel the need for a local class, your function is probably far too long.

• Functions within functions: local functions (also called nested functions)

void f()
{
          void g()          // illegal
          {
                    // . . .
          }
          // . . .
}

This is not legal in C++; don’t do it. The compiler will reject it.

• Blocks within functions and other blocks: nested blocks

void f(int x, int y)
{
          if (x>y) {
                    // . . .
          }
          else {
                    // . . .
                    {
                              // . . .
                    }
                    // . . .
          }
}

Nested blocks are unavoidable, but be suspicious of complicated nesting: it can easily hide errors.

C++ also provides a language feature, namespace, exclusively for expressing scoping; see §8.7.

Note our consistent indentation to indicate nesting. Without consistent indentation, nested constructs become unreadable. For example:

// dangerously ugly code
struct X {
void f(int x) {
struct Y {
int f() { return 1; } int m; };
int m;
m=x; Y m2;
return f(m2.f()); }
int m; void g(int m) {
if (m) f(m+2); else {
g(m+2); }}
X() { } void m3() {
}

void main() {
X a; a.f(2);}
};

Hard-to-read code usually hides bugs. When you use an IDE, it tries to automatically make your code properly indented (according to some definition of “properly”), and there exist “code beautifiers” that will reformat a source code file for you (often offering you a choice of formats). However, the ultimate responsibility for your code being readable rests with you.

8.5 Function call and return

Functions are the way we represent actions and computations. Whenever we want to do something that is worthy of a name, we write a function. The C++ language gives us operators (such as + and *) with which we can produce new values from operands in expressions, and statements (such as for and if) with which we can control the order of execution. To organize code made out of these primitives, we have functions.

To do its job, a function usually needs arguments, and many functions return a result. This section focuses on how arguments are specified and passed.

8.5.1 Declaring arguments and return type

Functions are what we use in C++ to name and represent computations and actions. A function declaration consists of a return type followed by the name of the function followed by a list of formal arguments in parentheses. For example:

double fct(int a, double d);                               // declaration of fct (no body)
double fct(int a, double d) { return a*d; }       // definition of fct

A definition contains the function body (the statements to be executed by a call), whereas a declaration that isn’t a definition just has a semicolon. Formal arguments are often called parameters. If you don’t want a function to take arguments, just leave out the formal arguments. For example:

int current_power();         // current_power doesn’t take an argument

If you don’t want to return a value from a function, give void as its return type. For example:

void increase_power(int level);       // increase_power doesn’t return a value

Here, void means “doesn’t return a value” or “return nothing.”

You can name a parameter or not as it suits you in both declarations and definitions. For example:

// search for s in vs;
// vs[hint] might be a good place to start the search
// return the index of a match; -1 indicates “not found”
int my_find(vector<string> vs, string s, int hint);    // naming arguments

int my_find(vector<string>, string, int);                   // not naming arguments

In declarations, formal argument names are not logically necessary, just very useful for writing good comments. From a compiler’s point of view, the second declaration of my_find() is just as good as the first: it has all the information necessary to call my_find().

Usually, we name all the arguments in the definition. For example:

int my_find(vector<string> vs, string s, int hint)
// search for s in vs starting at hint
{
          if (hint<0 || vs.size()<=hint) hint = 0;
          for (int i = hint; i<vs.size(); ++i)     // search starting from hint
                    if (vs[i]==s) return i;
          if (0<hint) {                            // if we didn’t find s, search before hint
                    for (int i = 0; i<hint; ++i)
                              if (vs[i]==s) return i;
          }
          return -1;
}

The hint complicates the code quite a bit, but the hint was provided under the assumption that users could use it to good effect by knowing roughly where in the vector a string will be found. However, imagine that we had used my_find() for a while and then discovered that callers rarely used hint well, so that it actually hurt performance. Now we don’t need hint anymore, but there is lots of code “out there” that calls my_find() with a hint. We don’t want to rewrite that code (or can’t because it is someone else’s code), so we don’t want to change the declaration(s) of my_find(). Instead, we just don’t use the last argument. Since we don’t use it we can leave it unnamed:

int my_find(vector<string> vs, string s, int)       // 3rd argument unused
{
          for (int i = 0; i<vs.size(); ++i)
                    if (vs[i]==s) return i;
          return -1;
}

You can find the complete grammar for function definitions in the ISO C++ standard.

8.5.2 Returning a value

We return a value from a function using a return-statement:

T f()         // f() returns a T
{
          V v;
          // . . .
          return v;
}

T x = f();

Here, the value returned is exactly the value we would have gotten by initializing a variable of type T with a value of type V:

V v;
// . . .
T t(v);    // initialize t with v

That is, value return is a form of initialization.

A function declared to return a value must return a value. In particular, it is an error to “fall through the end of the function”:

double my_abs(int x)         // warning: buggy code
{
          if (x < 0)
                    return -x;
          else if (x > 0)
                    return x;
}         // error: no value returned if x is 0

Actually, the compiler probably won’t notice that we “forgot” the case x==0. In principle it could, but few compilers are that smart. For complicated functions, it can be impossible for a compiler to know whether or not you return a value, so be careful. Here, “being careful” means to make really sure that you have a return-statement or an error() for every possible way out of the function.

For historical reasons, main() is a special case. Falling through the bottom of main() is equivalent to returning the value 0, meaning “successful completion” of the program.

In a function that does not return a value, we can use return without a value to cause a return from the function. For example:

void print_until_s(vector<string> v, string quit)
{
          for (string s : v) {
                    if (s==quit) return;
                    cout << s << '\n';
          }
}

As you can see, it is acceptable to “drop through the bottom” of a void function. This is equivalent to a return;.

8.5.3 Pass-by-value

The simplest way of passing an argument to a function is to give the function a copy of the value you use as the argument. An argument of a function f() is a local variable in f() that’s initialized each time f() is called. For example:

// pass-by-value (give the function a copy of the value passed)
int f(int x)
{
          x = x+1;                               // give the local x a new value
          return x;
}
int main()
{
          int xx = 0;
          cout << f(xx) << '\n';         // write: 1
          cout << xx << '\n';             // write: 0; f() doesn’t change xx

          int yy = 7;
          cout << f(yy) << '\n';              // write: 8
          cout << yy << '\n';                  // write: 7; f() doesn’t change yy
}

Since a copy is passed, the x=x+1 in f() does not change the values xx and yy passed in the two calls. We can illustrate a pass-by-value argument passing like this:

[Figure: in each call, f()’s x is a separate copy of the argument (xx, then yy)]

Pass-by-value is pretty straightforward and its cost is the cost of copying the value.

8.5.4 Pass-by-const-reference

Pass-by-value is simple, straightforward, and efficient when we pass small values, such as an int, a double, or a Token (§6.3.2). But what if a value is large, such as an image (often, several million bits), a large table of values (say, thousands of integers), or a long string (say, hundreds of characters)? Then, copying can be costly. We should not be obsessed by cost, but doing unnecessary work can be embarrassing because it is an indication that we didn’t directly express our idea of what we wanted. For example, we could write a function to print out a vector of floating-point numbers like this:

void print(vector<double> v)                     // pass-by-value; appropriate?
{
          cout << "{ ";
          for (int i = 0; i<v.size(); ++i) {
                    cout << v[i];
                    if (i!=v.size()-1) cout << ", ";
          }
          cout << " }\n";
}

We could use this print() for vectors of all sizes. For example:

void f(int x)
{
          vector<double> vd1(10);              // small vector
          vector<double> vd2(1000000);    // large vector
          vector<double> vd3(x);            // vector of some unknown size
          // . . . fill vd1, vd2, vd3 with values . . .
          print(vd1);
          print(vd2);
          print(vd3);
}

This code works, but the first call of print() has to copy ten doubles (probably 80 bytes), the second call has to copy a million doubles (probably 8 megabytes), and we don’t know how much the third call has to copy. The question we must ask ourselves here is: “Why are we copying anything at all?” We just wanted to print the vectors, not to make copies of their elements. Obviously, there has to be a way for us to pass a variable to a function without copying it. As an analogy, if you were given the task to make a list of books in a library, the librarians wouldn’t ship you a copy of the library building and all its contents; they would send you the address of the library, so that you could go and look at the books. So, we need a way of giving our print() function “the address” of the vector to print() rather than the copy of the vector. Such an “address” is called a reference and is used like this:

void print(const vector<double>& v)      // pass-by-const-reference
{
          cout << "{ ";
          for (int i = 0; i<v.size(); ++i) {
                    cout << v[i];
                    if (i!=v.size()-1) cout << ", ";
          }
          cout << " }\n";
}

The & means “reference” and the const is there to stop print() modifying its argument by accident. Apart from the change to the argument declaration, all is the same as before; the only change is that instead of operating on a copy, print() now refers back to the argument through the reference. Note the phrase “refer back”; such arguments are called references because they “refer” to objects defined elsewhere. We can call this print() exactly as before:

void f(int x)
{
          vector<double> vd1(10);              // small vector
          vector<double> vd2(1000000);    // large vector
          vector<double> vd3(x);                // vector of some unknown size
          // . . . fill vd1, vd2, vd3 with values . . .
          print(vd1);
          print(vd2);
          print(vd3);
}

We can illustrate that graphically:

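[Figure: in each call, print()’s v refers to the caller’s vector (vd1, vd2, or vd3); no elements are copied]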

A const reference has the useful property that we can’t accidentally modify the object passed. For example, if we made a silly error and tried to assign to an element from within print(), the compiler would catch it:

void print(const vector<double>& v)       // pass-by-const-reference
{
          // . . .
          v[i] = 7;                                                // error: v is a const (is not mutable)
          // . . .
}

Pass-by-const-reference is a useful and popular mechanism. Consider again the my_find() function (§8.5.1) that searches for a string in a vector of strings. Pass-by-value could be unnecessarily costly:

int my_find(vector<string> vs, string s);    // pass-by-value: copy

If the vector contained thousands of strings, you might notice the time spent even on a fast computer. So, we could improve my_find() by making it take its arguments by const reference:

// pass-by-const-reference: no copy, read-only access
int my_find(const vector<string>& vs, const string& s);

8.5.5 Pass-by-reference

But what if we did want a function to modify its arguments? Sometimes, that’s a perfectly reasonable thing to wish for. For example, we might want an init() function that assigns values to vector elements:

void init(vector<double>& v)     // pass-by-reference
{
          for (int i = 0; i<v.size(); ++i) v[i] = i;
}
void g(int x)
{
          vector<double> vd1(10);               // small vector
          vector<double> vd2(1000000);     // large vector
          vector<double> vd3(x);                 // vector of some unknown size

          init(vd1);
          init(vd2);
          init(vd3);
}

Here, we wanted init() to modify the argument vector, so we did not copy (did not use pass-by-value) or declare the reference const (did not use pass-by-const-reference) but simply passed a “plain reference” to the vector.

Let us consider references from a more technical point of view. A reference is a construct that allows a user to declare a new name for an object. For example, int& is a reference to an int, so we can write

Image

That is, any use of r is really a use of i.

References can be useful as shorthand. For example, we might have a

vector< vector<double> > v;      // vector of vector of double

and we need to refer to some element v[f(x)][g(y)] several times. Clearly, v[f(x)][g(y)] is a complicated expression that we don’t want to repeat more often than we have to. If we just need its value, we could write

double val = v[f(x)][g(y)];       // val is the value of v[f(x)][g(y)]

and use val repeatedly. But what if we need to both read from v[f(x)][g(y)] and write to v[f(x)][g(y)]? Then, a reference comes in handy:

double& var = v[f(x)][g(y)];      // var is a reference to v[f(x)][g(y)]

Now we can read and write v[f(x)][g(y)] through var. For example:

var = var/2+sqrt(var);

This key property of references, that a reference can be a convenient shorthand for some object, is what makes them useful as arguments. For example:

// pass-by-reference (let the function refer back to the variable passed)
int f(int& x)
{
          x = x+1;
          return x;
}
int main()
{
          int xx = 0;
          cout << f(xx) << '\n';              // write: 1
          cout << xx << '\n';                  // write: 1; f() changed the value of xx
          int yy = 7;
          cout << f(yy) << '\n';              // write: 8
          cout << yy << '\n';                  // write: 8; f() changed the value of yy
}

We can illustrate a pass-by-reference argument passing like this:

[Figure: in each call, f()’s x refers directly to the caller’s variable (xx, then yy)]

Compare this to the similar example in §8.5.3.

Pass-by-reference is clearly a very powerful mechanism: we can have a function operate directly on any object to which we pass a reference. For example, swapping two values is an immensely important operation in many algorithms, such as sorting. Using references, we can write a function that swaps doubles like this:

void swap(double& d1, double& d2)
{
          double temp = d1;             // copy d1’s value to temp
          d1 = d2;                                // copy d2’s value to d1
          d2 = temp;                           // copy d1’s old value to d2
}
int main()
{
          double x = 1;
          double y = 2;
          cout << "x == " << x << " y== " << y << '\n';   // write: x==1 y==2
          swap(x,y);
          cout << "x == " << x << " y== " << y << '\n';   // write: x==2 y==1
}

The standard library provides a swap() for every type that you can copy, so you don’t have to write swap() yourself for each type.

8.5.6 Pass-by-value vs. pass-by-reference

When should you use pass-by-value, pass-by-reference, and pass-by-const-reference? Consider first a technical example:

void f(int a, int& r, const int& cr)
{
          ++a;            // change the local a
          ++r;            // change the object referred to by r
          ++cr;          // error: cr is const
}

If you want to change the value of the object passed, you must use a non-const reference: pass-by-value gives you a copy and pass-by-const-reference prevents you from changing the value of the object passed. So we can try

void g(int a, int& r, const int& cr)
{
          ++a;                  // change the local a
          ++r;                  // change the object referred to by r
          int x = cr;         // read the object referred to by cr
}
int main()
{
          int x = 0;
          int y = 0;
          int z = 0;

          g(x,y,z);      // x==0; y==1; z==0
          g(1,2,3);      // error: reference argument r needs a variable to refer to
          g(1,y,3);      // OK: since cr is const we can pass a literal
}

So, if you want to change the value of an object passed by reference, you have to pass an object. Technically, the integer literal 2 is just a value (an rvalue), rather than an object holding a value. What you need for g()’s argument r is an lvalue, that is, something that could appear on the left-hand side of an assignment.

Note that a const reference doesn’t need an lvalue. It can perform conversions exactly as initialization or pass-by-value. Basically, what happens in that last call, g(1,y,3), is that the compiler sets aside an int for g()’s argument cr to refer to:

g(1,y,3);      // means: int __compiler_generated = 3; g(1,y,__compiler_generated)

Such a compiler-generated object is called a temporary object or just a temporary.

Our rule of thumb is:

1. Use pass-by-value to pass very small objects.

2. Use pass-by-const-reference to pass large objects that you don’t need to modify.

3. Return a result rather than modifying an object through a reference argument.

4. Use pass-by-reference only when you have to.

These rules lead to the simplest, least error-prone, and most efficient code. By “very small” we mean one or two ints, one or two doubles, or something like that. If we see an argument passed by non-const reference, we must assume that the called function will modify that argument.

That third rule reflects that you have a choice when you want to use a function to change the value of a variable. Consider:

int incr1(int a) { return a+1; }          // return the new value as the result
void incr2(int& a) { ++a; }                // modify object passed as reference

int x = 7;
x = incr1(x);                                        // pretty obvious
incr2(x);                                              // pretty obscure

Why do we ever use non-const-reference arguments? Occasionally, they are essential:

• For manipulating containers (e.g., vector) and other large objects

• For functions that change several objects (we can have only one return value)

For example:

void larger(vector<int>& v1, vector<int>& v2)
          // make each element in v1 the larger of the corresponding
          // elements in v1 and v2;
          // similarly, make each element of v2 the smaller
{
          if (v1.size()!=v2.size()) error("larger(): different sizes");
          for (int i=0; i<v1.size(); ++i)
                    if (v1[i]<v2[i])
                              swap(v1[i],v2[i]);
}
void f()
{
          vector<int> vx;
          vector<int> vy;
          // read vx and vy from input
          larger(vx,vy);
          // . . .
}

Using pass-by-reference arguments is the only reasonable choice for a function like larger().

It is usually best to avoid functions that modify several objects. In theory, there are always alternatives, such as returning a class object holding several values. However, there are a lot of programs “out there” expressed in terms of functions that modify one or more arguments, so you are likely to encounter them. For example, in Fortran — the major programming language used for numerical calculation for about 50 years — all arguments are traditionally passed by reference. Many numeric programmers copy Fortran designs and call functions written in Fortran. Such code often uses pass-by-reference or pass-by-const-reference.

If we use a reference simply to avoid copying, we use a const reference. Consequently, when we see a non-const-reference argument, we assume that the function changes the value of its argument; that is, when we see a pass-by-non-const-reference we assume that not only can that function modify the argument passed, but it will, so that we have to look extra carefully at the call to make sure that it does what we expect it to.

8.5.7 Argument checking and conversion

Passing an argument is the initialization of the function’s formal argument with the actual argument specified in the call. Consider:

void f(T x);
f(y);
T x = y;          // initialize x with y (see §8.2.2)

The call f(y) is legal whenever the initialization T x=y; is, and when it is legal both xs get the same value. For example:

void f(double x);
void g(int y)
{
          f(y);
          double x = y;      // initialize x with y (see §8.2.2)
}

Note that to initialize x with y, we have to convert an int to a double. The same happens in the call of f(). The double value received by f() is the same as the one stored in x.

Conversions are often useful, but occasionally they give surprising results (see §3.9.2). Consequently, we have to be careful with them. Passing a double as an argument to a function that requires an int is rarely a good idea:

void ff(int x);

void gg(double y)
{
          ff(y);                 // how would you know if this makes sense?
          int x = y;          // how would you know if this makes sense?
}

If you really mean to truncate a double value to an int, say so explicitly:

void ggg(double x)
{
          int x1 = x;                                     // truncate x
          int x2 = int(x);
          int x3 = static_cast<int>(x);         // very explicit conversion (§17.8)

          ff(x1);
          ff(x2);
          ff(x3);

          ff(x);                                             // truncate x
          ff(int(x));
          ff(static_cast<int>(x));                 // very explicit conversion (§17.8)
}

That way, the next programmer to look at this code can see that you thought about the problem.

8.5.8 Function call implementation

But how does a computer really do a function call? The expression(), term(), and primary() functions from Chapters 6 and 7 are perfect for illustrating this except for one detail: they don’t take any arguments, so we can’t use them to explain how arguments are passed. But wait! They must take some input; if they didn’t, they couldn’t do anything useful. They do take an implicit argument: they use a Token_stream called ts to get their input; ts is a global variable. That’s a bit sneaky. We can improve these functions by letting them take a Token_stream& argument. Here they are with a Token_stream& parameter added and everything that doesn’t concern function call implementation removed.

First, expression() is completely straightforward; it has one argument (ts) and two local variables (left and t):

double expression(Token_stream& ts)
{
          double left = term(ts);
          Token t = ts.get();
          // . . .
}

Second, term() is much like expression(), except that it has an additional local variable (d) that it uses to hold the result of a divisor for '/':

double term(Token_stream& ts)
{
          double left = primary(ts);
          Token t = ts.get();
          // . . .
                    case '/':
                    {
                              double d = primary(ts);
                              // . . .
                    }
          // . . .
}

Third, primary() is much like term() except that it doesn’t have a local variable left:

double primary(Token_stream& ts)
{
          Token t = ts.get();
          switch (t.kind) {
          case '(':
                    {       double d = expression(ts);
                             // . . .
                    }
                    // . . .
          }
}

Now they don’t use any “sneaky global variables” and are perfect for our illustration: they have an argument, they have local variables, and they call each other. You may want to take the opportunity to refresh your memory of what the complete expression(), term(), and primary() look like, but the salient features as far as function call is concerned are presented here.

When a function is called, the language implementation sets aside a data structure containing a copy of all its parameters and local variables. For example, when expression() is first called, the compiler ensures that a structure like this is created:

[Figure: activation record for the first call of expression(): ts, left, t, and “implementation stuff”]

The “implementation stuff” varies from implementation to implementation, but that’s basically the information that the function needs to return to its caller and to return a value to its caller. Such a data structure is called a function activation record, and each function has its own detailed layout of its activation record. Note that from the implementation’s point of view, a parameter is just another local variable.

So far, so good, and now expression() calls term(), so the compiler ensures that an activation record for this call of term() is generated:

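[Figure: the stack now holds activation records for expression() and term(); term()’s record holds ts, left, t, d, and “implementation stuff”]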

Note that term() has an extra variable d that needs to be stored, so we set aside space for that in the call even though the code may never get around to using it. That’s OK. For reasonable functions (such as every function we directly or indirectly use in this book), the run-time cost of laying down a function activation record doesn’t depend on how big it is. The local variable d will be initialized only if we execute its case '/'.

Now term() calls primary() and we get

[Figure: the stack now holds activation records for expression(), term(), and primary()]

This is starting to get a bit repetitive, but now primary() calls expression():

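[Figure: the stack now holds activation records for expression(), term(), primary(), and the second call of expression()]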

So this call of expression() gets its own activation record, different from the first call of expression(). That’s good or else we’d be in a terrible mess, since left and t will be different in the two calls. A function that directly or (as here) indirectly calls itself is called recursive. As you see, recursive functions follow naturally from the implementation technique we use for function call and return (and vice versa).

So, each time we call a function the stack of activation records, usually just called the stack, grows by one record. Conversely, when the function returns, its record is no longer used. For example, when that last call of expression() returns to primary(), the stack will revert to this:

[Figure: the stack holds activation records for expression(), term(), and primary() again]

And when that call of primary() returns to term(), we get back to

[Figure: the stack holds activation records for expression() and term() again]

And so on. The stack, also called the call stack, is a data structure that grows and shrinks at one end according to the rule “Last in, first out.”

Please remember that the details of how a call stack is implemented and used vary from C++ implementation to C++ implementation, but the basics are as outlined here. Do you need to know how function calls are implemented to use them? Of course not; you have done well enough before this implementation subsection, but many programmers like to know and many use phrases like “activation record” and “call stack,” so it’s better to know what they mean.

8.5.9 constexpr functions

A function represents a calculation, and sometimes we want to do a calculation at compile time. The reason to want a calculation to be evaluated by the compiler is usually to avoid having the same calculation done millions of times at run time. We use functions to make our calculations comprehensible, so naturally we sometimes want to use a function in a constant expression. We convey our intent to have a function evaluated by the compiler by declaring the function constexpr. A constexpr function can be evaluated by the compiler if it is given constant expressions as arguments. For example:

constexpr double xscale = 10;       // scaling factors
constexpr double yscale = 0.8;

constexpr Point scale(Point p) { return {xscale*p.x,yscale*p.y}; }

Assume that Point is a simple struct with members x and y representing 2D coordinates. Now, when we give scale() a Point argument, it returns a Point with coordinates scaled according to the factors xscale and yscale. For example:

void user(Point p1)
{
          constexpr Point p2 {10,10};     // p2 is a constant expression

          Point p3 = scale(p1);     // OK: run-time evaluation is fine; the value depends on p1
          Point p4 = scale(p2);     // p4 == {100,8}

          constexpr Point p5 = scale(p1);   // error: scale(p1) is not a constant
                                                                // expression
          constexpr Point p6 = scale(p2);   // p6 == {100,8}
          // . . .
}

A constexpr function behaves just like an ordinary function until you use it where a constant is needed. Then, it is calculated at compile time provided its arguments are constant expressions (e.g., p2) and gives an error if they are not (e.g., p1). To enable that, a constexpr function must be so simple that the compiler (every standard-conforming compiler) can evaluate it. In C++11, that means that a constexpr function must have a body consisting of a single return-statement (like scale()); in C++14, we can also write simple loops. A constexpr function may not have side effects; that is, it may not change the value of variables outside its own body, except those it is assigned to or uses to initialize.

Here is an example of a function that violates those rules for simplicity:

int glob = 9;

constexpr void bad(int & arg)        // error: no return value
{
          ++arg;                                   // error: modifies caller through argument
          glob = 7;                              // error: modifies nonlocal variable
}

If a compiler cannot determine that a constexpr function is “simple enough” (according to detailed rules in the standard), the function is considered an error.

8.6 Order of evaluation

The evaluation of a program — also called the execution of a program — proceeds through the statements according to the language rules. When this “thread of execution” reaches the definition of a variable, the variable is constructed; that is, memory is set aside for the object and the object is initialized. When the variable goes out of scope, the variable is destroyed; that is, the object it refers to is in principle removed and the compiler can use its memory for something else. For example:

string program_name = "silly";
vector<string> v;                                            // v is global

void f()
{
          string s;                                                // s is local to f
          while (cin>>s && s!="quit") {
                    string stripped;                         // stripped is local to the loop
                    string not_letters;
                    for (int i=0; i<s.size(); ++i)       // i has statement scope
                              if (isalpha(s[i]))
                                        stripped += s[i];
                              else
                                        not_letters += s[i];
                    v.push_back(stripped);
                    // . . .
          }
          // . . .
}

Global variables, such as program_name and v, are initialized before the first statement of main() is executed. They “live” until the program terminates, and then they are destroyed. They are constructed in the order in which they are defined (that is, program_name before v) and destroyed in the reverse order (that is, v before program_name).

When someone calls f(), first s is constructed; that is, s is initialized to the empty string. It will live until we return from f().

Each time we enter the block that is the body of the while-statement, stripped and not_letters are constructed. Since stripped is defined before not_letters, stripped is constructed before not_letters. They live until the end of the loop, where they are destroyed in the reverse order of construction (that is, not_letters before stripped) before the condition is reevaluated. So, if ten strings are seen before we encounter the string quit, stripped and not_letters will each be constructed and destroyed ten times.

Each time we reach the for-statement, i is constructed. Each time we exit the for-statement, i is destroyed before we reach the v.push_back(stripped); statement.

Please note that compilers (and linkers) are clever beasts and they are allowed to — and do — optimize code as long as the results are equivalent to what we have described here. In particular, compilers are clever at not allocating and deallocating memory more often than is really necessary.

8.6.1 Expression evaluation

The order of evaluation of sub-expressions is governed by rules designed to please an optimizer rather than to make life simple for the programmer. That’s unfortunate, but you should avoid complicated expressions anyway, and there is a simple rule that can keep you out of trouble: if you change the value of a variable in an expression, don’t read or write it twice in that same expression. For example:

v[i] = ++i;                                       // don’t: undefined order of evaluation
v[++i] = i;                                       // don’t: undefined order of evaluation
int x = ++i + ++i;                           // don’t: undefined order of evaluation
cout << ++i << ' ' << i << '\n';      // don’t: undefined order of evaluation
f(++i,++i);                                    // don’t: undefined order of evaluation

Unfortunately, not all compilers warn if you write such bad code; it’s bad because you can’t rely on the results being the same if you move your code to another computer, use a different compiler, or use a different optimizer setting. Compilers really differ for such code; just don’t do it.

Note in particular that = (assignment) is considered just another operator in an expression, so there is no guarantee that the left-hand side of an assignment is evaluated before the right-hand side. That’s why v[++i] = i is undefined.

8.6.2 Global initialization

Global variables (and namespace variables; see §8.7) in a single translation unit are initialized in the order in which they appear. For example:

// file f1.cpp
int x1 = 1;
int y1 = x1+2;        // y1 becomes 3

This initialization logically takes place “before the code in main() is executed.”

Using a global variable in anything but the most limited circumstances is usually not a good idea. We have mentioned the problem of the programmer having no really effective way of knowing which parts of a large program read and/or write a global variable (§8.4). Another problem is that the order of initialization of global variables in different translation units is not defined. For example:

// file f2.cpp
extern int y1;
int y2 = y1+2;         // y2 becomes 2 or 5

Such code is to be avoided for several reasons: it uses global variables, it gives the global variables short names, and it uses complicated initialization of the global variables. If the globals in file f1.cpp are initialized before the globals in f2.cpp, y2 will be initialized to 5 (as a programmer might naively and reasonably expect). However, if the globals in file f2.cpp are initialized before the globals in f1.cpp, y2 will be initialized to 2 (because the memory used for global variables is initialized to 0 before complicated initialization is attempted). Avoid such code, and be very suspicious when you see global variables with nontrivial initializers; consider any initializer that isn’t a constant expression complicated.

But what do you do if you really need a global variable (or constant) with a complicated initializer? A plausible example would be that we wanted a default value for a Date type we were providing for a library supporting business transactions:

const Date default_date(1970,1,1);      // the default date is January 1, 1970

How would we know that default_date was never used before it was initialized? Basically, we can’t know, so we shouldn’t write that definition. The technique that we use most often is to call a function that returns the value. For example:

const Date default_date()                // return the default date
{
          return Date(1970,1,1);
}

This constructs the Date every time we call default_date(). That is often fine, but if default_date() is called often and it is expensive to construct Date, we’d like to construct the Date once only. That is done like this:

const Date& default_date()
{
          static const Date dd(1970,1,1);        // initialize dd first time we get here
          return dd;
}

The static local variable is initialized (constructed) only the first time its function is called. Note that we returned a reference to eliminate unnecessary copying and, in particular, we returned a const reference to prevent the calling function from accidentally changing the value. The arguments about how to pass an argument (§8.5.6) also apply to returning values.

8.7 Namespaces

We use blocks to organize code within a function (§8.4). We use classes to organize functions, data, and types into a type (Chapter 9). A function and a class both do two things for us:

• They allow us to define a number of “entities” without worrying that their names clash with other names in our program.

• They give us a name to refer to what we have defined.

What we lack so far is something to organize classes, functions, data, and types into an identifiable and named part of a program without defining a type. The language mechanism for such grouping of declarations is a namespace. For example, we might like to provide a graphics library with classes called Color, Shape, Line, Function, and Text (see Chapter 13):

namespace Graph_lib {
          struct Color { /* . . . */ };
          struct Shape { /* . . . */ };
          struct Line : Shape { /* . . . */ };
          struct Function : Shape { /* . . . */ };
          struct Text : Shape { /* . . . */ };
          // . . .
          int gui_main() { /* . . . */ }
}

Most likely somebody else in the world has used those names, but now that doesn’t matter. You might define something called Text, but our Text doesn’t interfere. Graph_lib::Text is one of our classes and your Text is not. We have a problem only if you have a class or a namespace called Graph_lib with Text as its member. Graph_lib is a slightly ugly name; we chose it because the “pretty and obvious” name Graphics had a greater chance of already being used somewhere.

Let’s say that your Text was part of a text manipulation library. The same logic that made us put our graphics facilities into namespace Graph_lib should make you put your text manipulation facilities into a namespace called something like TextLib:

namespace TextLib {
          class Text { /* . . . */ };
          class Glyph { /* . . . */ };
          class Line { /* . . . */ };
          // . . .
}

Had we both used the global namespace, we could have been in real trouble. Someone trying to use both of our libraries would have had really bad name clashes for Text and Line. Worse, if we both had users for our libraries we would not have been able to change our names, such as Line and Text, to avoid clashes. We avoided that problem by using namespaces; that is, our Text is Graph_lib::Text and yours is TextLib::Text. A name composed of a namespace name (or a class name) and a member name combined by :: is called a fully qualified name.
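A small sketch can show the two Text classes living side by side in one program. The kind() member functions and describe_both() are hypothetical names added only so that there is something to call; they are not part of either library.

```cpp
#include <string>

namespace Graph_lib {
    struct Text {
        // hypothetical member, for illustration only
        std::string kind() const { return "Graph_lib::Text"; }
    };
}

namespace TextLib {
    class Text {
    public:
        // hypothetical member, for illustration only
        std::string kind() const { return "TextLib::Text"; }
    };
}

// Both Text classes are used in the same function;
// the fully qualified names say which one we mean
std::string describe_both()
{
    Graph_lib::Text label;
    TextLib::Text paragraph;
    return label.kind() + " and " + paragraph.kind();
}
```

Neither definition of Text interferes with the other, because each use names its namespace explicitly.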

8.7.1 using declarations and using directives

Writing fully qualified names can be tedious. For example, the facilities of the C++ standard library are defined in namespace std and can be used like this:

#include<string>                // get the string library
#include<iostream>           // get the iostream library
int main()
{
          std::string name;
          std::cout << "Please enter your first name\n";
          std::cin >> name;
          std::cout << "Hello, " << name << '\n';
}

Having seen the standard library string and cout thousands of times, we don’t really want to have to refer to them by their “proper” fully qualified names std::string and std::cout all the time. A solution is to say that “by string, I mean std::string,” “by cout, I mean std::cout,” etc.:

using std::string;         // string means std::string
using std::cout;           // cout means std::cout
// . . .

That construct is called a using declaration; it is the programming equivalent to using plain “Greg” to refer to Greg Hansen, when there are no other Gregs in the room.
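A using declaration obeys the usual scope rules, so we can place it inside a function and limit its effect to that function. Here is a minimal sketch; join_names() is a hypothetical function invented for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical example function: join names with ", " between them
std::string join_names(const std::vector<std::string>& names)
{
    using std::string;    // from here to the end of this function, string means std::string

    string result;
    for (const string& n : names) {
        if (!result.empty()) result += ", ";
        result += n;
    }
    return result;
}
```

Outside join_names(), the plain name string is not made available by this using declaration, which is why the function's own declaration still says std::string.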

Sometimes, we prefer an even stronger “shorthand” for the use of names from a namespace: “If you don’t find a declaration for a name in this scope, look in std.” The way to say that is to use a using directive:

using namespace std;      // make names from std directly accessible

So we get this common style:

#include<string>                // get the string library
#include<iostream>          // get the iostream library
using namespace std;      // make names from std directly accessible

int main()
{
          string name;
          cout << "Please enter your first name\n";
          cin >> name;
          cout << "Hello, " << name << '\n';
}

The cin is std::cin, the string is std::string, etc. As long as you use std_lib_facilities.h, you don’t need to worry about standard headers and the std namespace.

It is usually a good idea to avoid using directives for any namespace except for a namespace, such as std, that’s extremely well known in an application area. The problem with overuse of using directives is that you lose track of which names come from where, so that you again start to get name clashes. Explicit qualification with namespace names and using declarations doesn’t suffer from that problem. So, putting a using directive in a header file (so that users can’t avoid it) is a very bad habit. However, to simplify our initial code we did place a using directive for std in std_lib_facilities.h. That allowed us to write

#include "std_lib_facilities.h"

int main()
{
          string name;
          cout << "Please enter your first name\n";
          cin >> name;
          cout << "Hello, " << name << '\n';
}

We promise never to do that for any namespace except std.
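To make the name-clash problem concrete, here is a minimal sketch of what can happen when using directives are applied to two namespaces that both happen to define the same name. The name() functions and both_names() are hypothetical, invented for this illustration.

```cpp
#include <string>

namespace Graph_lib {
    std::string name() { return "Graph_lib"; }    // hypothetical function
}

namespace TextLib {
    std::string name() { return "TextLib"; }      // hypothetical function
}

using namespace Graph_lib;    // for illustration only; this is the style we advise against
using namespace TextLib;

std::string both_names()
{
    // return name();    // error: ambiguous; Graph_lib::name() or TextLib::name()?
    return Graph_lib::name() + "+" + TextLib::name();    // explicit qualification still works
}
```

The commented-out line would not compile, because both using directives make a name() visible and the compiler cannot choose between them. Explicit qualification resolves the clash.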

Drill

1. Create three files: my.h, my.cpp, and use.cpp. The header file my.h contains

extern int foo;
void print_foo();
void print(int);

The source code file my.cpp #includes my.h and std_lib_facilities.h, defines print_foo() to print the value of foo using cout, and print(int i) to print the value of i using cout.

The source code file use.cpp #includes my.h, defines main() to set the value of foo to 7 and print it using print_foo(), and to print the value of 99 using print(). Note that use.cpp does not #include std_lib_facilities.h as it doesn’t directly use any of those facilities.

Get these files compiled and run. On Windows, you need to have both use.cpp and my.cpp in a project and use { char cc; cin>>cc; } in use.cpp to be able to see your output. Hint: You need to #include <iostream> to use cin.

2. Write three functions swap_v(int,int), swap_r(int&,int&), and swap_cr(const int&, const int&). Each should have the body

{ int temp; temp = a; a=b; b=temp; }

where a and b are the names of the arguments.

Try calling each swap like this

int x = 7;
int y = 9;
swap_?(x,y);                      // replace ? by v, r, or cr
swap_?(7,9);
const int cx = 7;
const int cy = 9;
swap_?(cx,cy);
swap_?(7.7,9.9);
double dx = 7.7;
double dy = 9.9;
swap_?(dx,dy);
swap_?(7.7,9.9);

Which functions and calls compiled, and why? After each swap that compiled, print the value of the arguments after the call to see if they were actually swapped. If you are surprised by a result, consult §8.5.

3. Write a program using a single file containing three namespaces X, Y, and Z so that the following main() works correctly:

int main()
{
          X::var = 7;
          X::print();                  // print X’s var
          using namespace Y;
          var = 9;
          print();                       // print Y’s var
          {          using Z::var;
                      using Z::print;
                      var = 11;
                      print();           // print Z’s var
          }
          print();                      // print Y’s var
          X::print();                // print X’s var
}

Each namespace needs to define a variable called var and a function called print() that outputs the appropriate var using cout.

Review

1. What is the difference between a declaration and a definition?

2. How do we syntactically distinguish between a function declaration and a function definition?

3. How do we syntactically distinguish between a variable declaration and a variable definition?

4. Why can’t you use the functions in the calculator program from Chapter 6 without declaring them first?

5. Is int a; a definition or just a declaration?

6. Why is it a good idea to initialize variables as they are declared?

7. What can a function declaration consist of?

8. What good does indentation do?

9. What are header files used for?

10. What is the scope of a declaration?

11. What kinds of scope are there? Give an example of each.

12. What is the difference between a class scope and local scope?

13. Why should a programmer minimize the number of global variables?

14. What is the difference between pass-by-value and pass-by-reference?

15. What is the difference between pass-by-reference and pass-by-const-reference?

16. What is a swap()?

17. Would you ever define a function with a vector<double>-by-value parameter?

18. Give an example of undefined order of evaluation. Why can undefined order of evaluation be a problem?

19. What do x&&y and x||y, respectively, mean?

20. Which of the following is standard-conforming C++: functions within functions, functions within classes, classes within classes, classes within functions?

21. What goes into an activation record?

22. What is a call stack and why do we need one?

23. What is the purpose of a namespace?

24. How does a namespace differ from a class?

25. What is a using declaration?

26. Why should you avoid using directives in a header?

27. What is namespace std?

Terms

activation record

argument

argument passing

call stack

class scope

const

constexpr

declaration

definition

extern

forward declaration

function

function definition

global scope

header file

initializer

local scope

namespace

namespace scope

nested block

parameter

pass-by-const-reference

pass-by-reference

pass-by-value

recursion

return

return value

scope

statement scope

technicalities

undeclared identifier

using declaration

using directive

Exercises

1. Modify the calculator program from Chapter 7 to make the input stream an explicit parameter (as shown in §8.5.8), rather than simply using cin. Also give the Token_stream constructor (§7.8.2) an istream& parameter so that when we figure out how to make our own istreams (e.g., attached to files), we can use the calculator for those. Hint: Don’t try to copy an istream.

2. Write a function print() that prints a vector of ints to cout. Give it two arguments: a string for “labeling” the output and a vector.

3. Create a vector of Fibonacci numbers and print them using the function from exercise 2. To create the vector, write a function, fibonacci(x,y,v,n), where x and y are ints, v is an empty vector<int>, and n is the number of elements to put into v; v[0] will be x and v[1] will be y. A Fibonacci number is one that is part of a sequence where each element is the sum of the two previous ones. For example, starting with 1 and 2, we get 1, 2, 3, 5, 8, 13, 21, . . . . Your fibonacci() function should make such a sequence starting with its x and y arguments.

4. An int can hold integers only up to a maximum number. Find an approximation of that maximum number by using fibonacci().

5. Write two functions that reverse the order of elements in a vector<int>. For example, 1, 3, 5, 7, 9 becomes 9, 7, 5, 3, 1. The first reverse function should produce a new vector with the reversed sequence, leaving its original vector unchanged. The other reverse function should reverse the elements of its vector without using any other vectors (hint: swap).

6. Write versions of the functions from exercise 5, but with a vector<string>.

7. Read five names into a vector<string> name, then prompt the user for the ages of the people named and store the ages in a vector<double> age. Then print out the five (name[i],age[i]) pairs. Sort the names (sort(name.begin(),name.end())) and print out the (name[i],age[i]) pairs. The tricky part here is to get the age vector in the correct order to match the sorted name vector. Hint: Before sorting name, take a copy and use that to make a copy of age in the right order after sorting name.

8. Then, do that exercise again but allowing an arbitrary number of names.

9. Write a function that given two vector<double>s price and weight computes a value (an “index”) that is the sum of all price[i]*weight[i]. Make sure to have weight.size()==price.size().

10. Write a function maxv() that returns the largest element of a vector argument.

11. Write a function that finds the smallest and the largest element of a vector argument and also computes the mean and the median. Do not use global variables. Either return a struct containing the results or pass them back through reference arguments. Which of the two ways of returning several result values do you prefer and why?

12. Improve print_until_s() from §8.5.2. Test it. What makes a good set of test cases? Give reasons. Then, write a print_until_ss() that prints until it sees a second occurrence of its quit argument.

13. Write a function that takes a vector<string> argument and returns a vector<int> containing the number of characters in each string. Also find the longest and the shortest string and the lexicographically first and last string. How many separate functions would you use for these tasks? Why?

14. Can we declare a non-reference function argument const (e.g., void f(const int);)? What might that mean? Why might we want to do that? Why don’t people do that often? Try it; write a couple of small programs to see what works.

Postscript

We could have put much of this chapter (and much of the next) into an appendix. However, you’ll need most of the facilities described here in Part II of this book. You’ll also encounter most of the problems that these facilities were invented to help solve very soon. Most simple programming projects that you might undertake will require you to solve such problems. So, to save time and minimize confusion, a somewhat systematic approach is called for, rather than a series of “random” visits to manuals and appendices.