Learning Python (2013)

Part III. Statements and Syntax

Chapter 11. Assignments, Expressions, and Prints

Now that we’ve had a quick introduction to Python statement syntax, this chapter begins our in-depth tour of specific Python statements. We’ll begin with the basics: assignment statements, expression statements, and print operations. We’ve already seen all of these in action, but here we’ll fill in important details we’ve skipped so far. Although they’re relatively simple, as you’ll see, there are optional variations for each of these statement types that will come in handy once you begin writing realistic Python programs.

Assignment Statements

We’ve been using the Python assignment statement for a while to assign objects to names. In its basic form, you write the target of an assignment on the left of an equals sign, and the object to be assigned on the right. The target on the left may be a name or object component, and the object on the right can be an arbitrary expression that computes an object. For the most part, assignments are straightforward, but here are a few properties to keep in mind:

§  Assignments create object references. As discussed in Chapter 6, Python assignments store references to objects in names or data structure components. They always create references to objects instead of copying the objects. Because of that, Python variables are more like pointers than data storage areas.

§  Names are created when first assigned. Python creates a variable name the first time you assign it a value (i.e., an object reference), so there’s no need to predeclare names ahead of time. Some (but not all) data structure slots are created when assigned, too (e.g., dictionary entries, some object attributes). Once assigned, a name is replaced with the value it references whenever it appears in an expression.

§  Names must be assigned before being referenced. It’s an error to use a name to which you haven’t yet assigned a value. Python raises an exception if you try, rather than returning some sort of ambiguous default value. This turns out to be crucial in Python because names are not predeclared—if Python provided default values for unassigned names used in your program instead of treating them as errors, it would be much more difficult for you to spot name typos in your code.

§  Some operations perform assignments implicitly. In this section we’re concerned with the = statement, but assignment occurs in many contexts in Python. For instance, we’ll see later that module imports, function and class definitions, for loop variables, and function arguments are all implicit assignments. Because assignment works the same everywhere it pops up, all these contexts simply bind names to object references at runtime.
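The properties in this list can be checked in a few lines. The following is a minimal sketch rather than code from this chapter; the names first, second, undefined_total, double, and letter are made up for illustration.

```python
# Assignment stores a reference to an object; it never copies the object.
first = [1, 2, 3]
second = first                  # second references the same list object
second.append(4)                # the in-place change is visible through both names
same_object = first is second

# A name must be assigned before it is referenced.
try:
    undefined_total + 1         # hypothetical name that was never assigned
except NameError as exc:
    error_kind = type(exc).__name__

# import, def, and for all bind names implicitly, just as = does.
import math                     # binds the name math

def double(x):                  # binds the name double; each call binds x
    return x * 2

for letter in 'ab':             # binds letter on each iteration
    pass

print(first, same_object, error_kind, double(4), letter)
```

Because second is only another reference to the list, the append shows up through first too; the for loop leaves letter bound to the last item it visited.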

Assignment Statement Forms

Although assignment is a general and pervasive concept in Python, we are primarily interested in assignment statements in this chapter. Table 11-1 illustrates the different assignment statement forms in Python, and their syntax patterns.

Table 11-1. Assignment statement forms

Operation                       Interpretation
spam = 'Spam'                   Basic form
spam, ham = 'yum', 'YUM'        Tuple assignment (positional)
[spam, ham] = ['yum', 'YUM']    List assignment (positional)
a, b, c, d = 'spam'             Sequence assignment, generalized
a, *b = 'spam'                  Extended sequence unpacking (Python 3.X)
spam = ham = 'lunch'            Multiple-target assignment
spams += 42                     Augmented assignment (equivalent to spams = spams + 42)

The first form in Table 11-1 is by far the most common: binding a name (or data structure component) to a single object. In fact, you could get all your work done with this basic form alone. The other table entries represent special forms that are all optional, but that programmers often find convenient in practice:

Tuple- and list-unpacking assignments

The second and third forms in the table are related. When you code a tuple or list on the left side of the =, Python pairs objects on the right side with targets on the left by position and assigns them from left to right. For example, in the second line of Table 11-1, the name spam is assigned the string 'yum', and the name ham is bound to the string 'YUM'. In this case Python internally may make a tuple of the items on the right, which is why this is called tuple-unpacking assignment.

Sequence assignments

In later versions of Python, tuple and list assignments were generalized into instances of what we now call sequence assignment—any sequence of names can be assigned to any sequence of values, and Python assigns the items one at a time by position. We can even mix and match the types of the sequences involved. The fourth line in Table 11-1, for example, pairs a tuple of names with a string of characters: a is assigned 's', b is assigned 'p', and so on.

Extended sequence unpacking

In Python 3.X (only), a new form of sequence assignment allows us to be more flexible in how we select portions of a sequence to assign. The fifth line in Table 11-1, for example, matches a with the first character in the string on the right and b with the rest: a is assigned 's', and b is assigned the list ['p', 'a', 'm'] (the starred name always receives a list, even when the subject is a string). This provides a simpler alternative to assigning the results of manual slicing operations.

Multiple-target assignments

The sixth line in Table 11-1 shows the multiple-target form of assignment. In this form, Python assigns a reference to the same object (the object farthest to the right) to all the targets on the left. In the table, the names spam and ham are both assigned references to the same string object, 'lunch'. The effect is the same as if we had coded ham = 'lunch' followed by spam = ham, as ham evaluates to the original string object (i.e., not a separate copy of that object).

Augmented assignments

The last line in Table 11-1 is an example of augmented assignment, a shorthand that combines an expression and an assignment in a concise way. Saying spams += 42, for example, has the same effect as spams = spams + 42, but the augmented form requires less typing and is generally quicker to run. In addition, if the subject is mutable and supports the operation, an augmented assignment may run even quicker by choosing an in-place update operation instead of an object copy. There is one augmented assignment statement for each of Python’s binary arithmetic and bitwise expression operators.
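As a quick check, the forms in Table 11-1 can be run together in Python 3.X. This is a sketch only: the names head, rest, eggs, and toast are substituted here so that each result stays visible, and the starting value of spams is an assumption (the table does not show one).

```python
spam = 'Spam'                       # Basic form
spam, ham = 'yum', 'YUM'            # Tuple assignment (positional)
[spam, ham] = ['yum', 'YUM']        # List assignment (positional)
a, b, c, d = 'spam'                 # Sequence assignment, generalized
head, *rest = 'spam'                # Extended unpacking (3.X): rest is a list
eggs = toast = 'lunch'              # Multiple-target: one shared object

spams = 0                           # Assumed starting value
spams += 42                         # Augmented: same result as spams = spams + 42

print(spam, ham, a, d, head, rest, eggs is toast, spams)
```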

Sequence Assignments

We’ve already used and explored basic assignments in this book, so we’ll take them as a given. Here are a few simple examples of sequence-unpacking assignments in action:

% python

>>> nudge = 1                      # Basic assignment

>>> wink  = 2

>>> A, B = nudge, wink             # Tuple assignment

>>> A, B                           # Like A = nudge; B = wink

(1, 2)

>>> [C, D] = [nudge, wink]         # List assignment

>>> C, D

(1, 2)

Notice that we really are coding two tuples in the third line in this interaction—we’ve just omitted their enclosing parentheses. Python pairs the values in the tuple on the right side of the assignment operator with the variables in the tuple on the left side and assigns the values one at a time.

Tuple assignment leads to a common coding trick in Python that was introduced in a solution to the exercises at the end of Part II. Because Python creates a temporary tuple that saves the original values of the variables on the right while the statement runs, unpacking assignments are also a way to swap two variables’ values without creating a temporary variable of your own—the tuple on the right remembers the prior values of the variables automatically:

>>> nudge = 1

>>> wink  = 2

>>> nudge, wink = wink, nudge      # Tuples: swaps values

>>> nudge, wink                    # Like T = nudge; nudge = wink; wink = T

(2, 1)

In fact, the original tuple and list assignment forms in Python have been generalized to accept any type of sequence (really, iterable) on the right as long as it is of the same length as the sequence on the left. You can assign a tuple of values to a list of variables, a string of characters to a tuple of variables, and so on. In all cases, Python assigns items in the sequence on the right to variables in the sequence on the left by position, from left to right:

>>> [a, b, c] = (1, 2, 3)          # Assign tuple of values to list of names

>>> a, c

(1, 3)

>>> (a, b, c) = "ABC"              # Assign string of characters to tuple

>>> a, c

('A', 'C')

Technically speaking, sequence assignment actually supports any iterable object on the right, not just any sequence. This is a more general category that includes collections both physical (e.g., lists) and virtual (e.g., a file’s lines), which was defined briefly in Chapter 4 and has popped up in passing ever since. We’ll firm up this term when we explore iterables in Chapter 14 and Chapter 20.
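To make the "any iterable" point concrete, here is a small sketch in Python 3.X that unpacks objects that are not sequences at all. The countdown generator is a hypothetical helper written for this illustration, and io.StringIO stands in for a real file so that the example is self-contained.

```python
import io

def countdown(n):
    """Hypothetical generator: yields n, n-1, ..., 1 on demand."""
    while n > 0:
        yield n
        n -= 1

a, b, c = countdown(3)                                  # Values produced by a generator
first_line, second_line = io.StringIO('spam\nham\n')    # Lines of a file-like object
upper_a, upper_b = map(str.upper, ('ab', 'cd'))         # A map object (an iterable in 3.X)

print(a, b, c)
print(repr(first_line), repr(second_line))
print(upper_a, upper_b)
```

In each case Python simply steps through the object on the right and assigns one result per name, so the usual length rule still applies.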

Advanced sequence assignment patterns

Although we can mix and match sequence types around the = symbol, we must generally have the same number of items on the right as we have variables on the left, or we’ll get an error. Python 3.X allows us to be more general with extended unpacking * syntax, described in the next section. But normally in 3.X—and always in 2.X—the number of items in the assignment target and subject must match:

>>> string = 'SPAM'

>>> a, b, c, d = string                            # Same number on both sides

>>> a, d

('S', 'M')

>>> a, b, c = string                               # Error if not

...error text omitted...

ValueError: too many values to unpack (expected 3)

To be more flexible, we can slice in both 2.X and 3.X. There are a variety of ways to employ slicing to make this last case work:

>>> a, b, c = string[0], string[1], string[2:]     # Index and slice

>>> a, b, c

('S', 'P', 'AM')

>>> a, b, c = list(string[:2]) + [string[2:]]      # Slice and concatenate

>>> a, b, c

('S', 'P', 'AM')

>>> a, b = string[:2]                              # Same, but simpler

>>> c = string[2:]

>>> a, b, c

('S', 'P', 'AM')

>>> (a, b), c = string[:2], string[2:]             # Nested sequences

>>> a, b, c

('S', 'P', 'AM')

As the last example in this interaction demonstrates, we can even assign nested sequences, and Python unpacks their parts according to their shape, as expected. In this case, we are assigning a tuple of two items, where the first item is a nested sequence (a string), exactly as though we had coded it this way:

>>> ((a, b), c) = ('SP', 'AM')                     # Paired by shape and position

>>> a, b, c

('S', 'P', 'AM')

Python pairs the first string on the right ('SP') with the first tuple on the left ((a, b)) and assigns one character at a time, before assigning the entire second string ('AM') to the variable c all at once. In this event, the sequence-nesting shape of the object on the left must match that of the object on the right. Nested sequence assignment like this is somewhat rare to see, but it can be convenient for picking out the parts of data structures with known shapes.

For example, we’ll see in Chapter 13 that this technique also works in for loops, because loop items are assigned to the target given in the loop header:

for (a, b, c) in [(1, 2, 3), (4, 5, 6)]: ...          # Simple tuple assignment

for ((a, b), c) in [((1, 2), 3), ((4, 5), 6)]: ...    # Nested tuple assignment
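A runnable version of these two loops might look like the following sketch; the list names flat and nested are introduced here only to collect the values assigned on each iteration.

```python
flat = []
for (a, b, c) in [(1, 2, 3), (4, 5, 6)]:            # Simple tuple assignment
    flat.append(a + b + c)

nested = []
for ((a, b), c) in [((1, 2), 3), ((4, 5), 6)]:      # Nested tuple assignment
    nested.append((a, b, c))

print(flat)      # [6, 15]
print(nested)    # [(1, 2, 3), (4, 5, 6)]
```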

In a note in Chapter 18, we’ll also see that this nested tuple (really, sequence) unpacking assignment form works for function argument lists in Python 2.X (though not in 3.X), because function arguments are passed by assignment as well:

def f(((a, b), c)): ...          # For arguments too in Python 2.X, but not 3.X

f(((1, 2), 3))
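In Python 3.X, where the header form above is a syntax error, much the same effect can be had by accepting a single argument and unpacking it in the function body. This is a sketch of one common workaround, not the only possible one; the parameter name arg is arbitrary.

```python
def f(arg):
    (a, b), c = arg          # Unpack by shape inside the body instead of the header
    return a, b, c

result = f(((1, 2), 3))
print(result)                # (1, 2, 3)
```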

Sequence-unpacking assignments also give rise to another common coding idiom in Python—assigning an integer series to a set of variables:

>>> red, green, blue = range(3)

>>> red, blue

(0, 2)

This initializes the three names to the integer codes 0, 1, and 2, respectively (it’s Python’s equivalent of the enumerated data types you may have seen in other languages). To make sense of this, you need to know that the range built-in function generates a series of successive integers: a list in 2.X, and in 3.X an iterable that must be wrapped in a list call if you wish to display its values all at once like this:

>>> list(range(3))                       # list() required in Python 3.X only

[0, 1, 2]

This call was previewed briefly in Chapter 4; because range is commonly used in for loops, we’ll say more about it in Chapter 13.

Another place you may see a tuple assignment at work is for splitting a sequence into its front and the rest in loops like this:

>>> L = [1, 2, 3, 4]

>>> while L:

...     front, L = L[0], L[1:]           # See next section for 3.X * alternative

...     print(front, L)

...

1 [2, 3, 4]

2 [3, 4]

3 [4]

4 []

The tuple assignment in the loop here could be coded as the following two lines instead, but it’s often more convenient to string them together:

...     front = L[0]

...     L = L[1:]

Notice that this code is using the list as a sort of stack data structure, which can often also be achieved with the append and pop methods of list objects; here, front = L.pop(0) would have much the same effect as the tuple assignment statement, but it would be an in-place change. We’ll learn more about while loops, and other (often better) ways to step through a sequence with for loops, in Chapter 13.
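The difference between the slicing form and the in-place pop call shows up only when another name references the same list. The sketch below assumes such a second reference, named alias here purely for illustration.

```python
L = [1, 2, 3, 4]
alias = L                          # A second reference to the same list

while L:
    front, L = L[0], L[1:]         # Slicing builds a new list each time around
after_slicing = alias[:]           # The original list object was never changed

L = alias                          # Start over with the original object
while L:
    front = L.pop(0)               # Removes the first item in place
after_popping = alias[:]           # The shared list has been emptied

print(after_slicing, after_popping)
```

With slicing, L is rebound to a fresh list on every pass, so alias still sees all four items; with pop, the one shared list shrinks until it is empty.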

Extended Sequence Unpacking in Python 3.X

The prior section demonstrated how to use manual slicing to make sequence assignments more general. In Python 3.X (but not 2.X), sequence assignment has been generalized to make this easier. In short, a single starred name, *X, can be used in the assignment target in order to specify a more general matching against the sequence—the starred name is assigned a list, which collects all items in the sequence not assigned to other names. This is especially handy for common coding patterns such as splitting a sequence into its “front” and “rest,” as in the preceding section’s last example.

Extended unpacking in action

Let’s look at an example. As we’ve seen, sequence assignments normally require exactly as many names in the target on the left as there are items in the subject on the right. We get an error if the lengths disagree in both 2.X and 3.X (unless we manually sliced on the right, as shown in the prior section):

C:\code> c:\python33\python

>>> seq = [1, 2, 3, 4]

>>> a, b, c, d = seq

>>> print(a, b, c, d)

1 2 3 4

>>> a, b = seq

ValueError: too many values to unpack (expected 2)

In Python 3.X, though, we can use a single starred name in the target to match more generally. In the following continuation of our interactive session, a matches the first item in the sequence, and b matches the rest:

>>> a, *b = seq

>>> a

1

>>> b

[2, 3, 4]

When a starred name is used, the number of items in the target on the left need not match the length of the subject sequence. In fact, the starred name can appear anywhere in the target. For instance, in the next interaction b matches the last item in the sequence, and a matches everything before the last:

>>> *a, b = seq

>>> a

[1, 2, 3]

>>> b

4

When the starred name appears in the middle, it collects everything between the other names listed. Thus, in the following interaction a and c are assigned the first and last items, and b gets everything in between them:

>>> a, *b, c = seq

>>> a

1

>>> b

[2, 3]

>>> c

4

More generally, wherever the starred name shows up, it will be assigned a list that collects every item not matched by the other names at that position:

>>> a, b, *c = seq

>>> a

1

>>> b

2

>>> c

[3, 4]

Naturally, like normal sequence assignment, extended sequence unpacking syntax works for any sequence types (really, again, any iterable), not just lists. Here it is unpacking characters in a string and a range (an iterable in 3.X):

>>> a, *b = 'spam'

>>> a, b

('s', ['p', 'a', 'm'])

>>> a, *b, c = 'spam'

>>> a, b, c

('s', ['p', 'a'], 'm')

>>> a, *b, c = range(4)

>>> a, b, c

(0, [1, 2], 3)

This is similar in spirit to slicing, but not exactly the same: a sequence unpacking assignment always assigns a list to the starred name, whereas slicing returns a sequence of the same type as the object sliced:

>>> S = 'spam'

>>> S[0], S[1:]    # Slices are type-specific, * assignment always returns a list

('s', 'pam')

>>> S[0], S[1:3], S[3]

('s', 'pa', 'm')
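The type difference is easy to see side by side in Python 3.X. In this sketch the names head, tail, first, and rest are arbitrary; the conversions at the end show one way to get back to the original type when that matters.

```python
S = 'spam'
head, *tail = S                    # tail is a list of characters
T = (1, 2, 3, 4)
first, *rest = T                   # rest is a list, although T is a tuple

print(tail, S[1:])                 # ['p', 'a', 'm'] pam
print(rest, T[1:])                 # [2, 3, 4] (2, 3, 4)

# Convert explicitly if the original type is needed
same_string = ''.join(tail) == S[1:]
same_tuple = tuple(rest) == T[1:]
```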

Given this extension in 3.X, as long as we’re processing a list, the last example of the prior section becomes even simpler, since we don’t have to manually slice to get the first and rest of the items:

>>> L = [1, 2, 3, 4]

>>> while L:

...     front, *L = L                    # Get first, rest without slicing

...     print(front, L)

...

1 [2, 3, 4]

2 [3, 4]

3 [4]

4 []

Boundary cases

Although extended sequence unpacking is flexible, some boundary cases are worth noting. First, the starred name may match just a single item, but is always assigned a list:

>>> seq = [1, 2, 3, 4]

>>> a, b, c, *d = seq

>>> print(a, b, c, d)

1 2 3 [4]

Second, if there is nothing left to match the starred name, it is assigned an empty list, regardless of where it appears. In the following, a, b, c, and d have matched every item in the sequence, but Python assigns e an empty list instead of treating this as an error case:

>>> a, b, c, d, *e = seq

>>> print(a, b, c, d, e)

1 2 3 4 []

>>> a, b, *e, c, d = seq

>>> print(a, b, c, d, e)

1 2 3 4 []

Finally, errors can still be triggered if there is more than one starred name, if the number of values does not match the number of names and there is no star (as before), and if the starred name is not itself coded inside a sequence:

>>> a, *b, c, *d = seq

SyntaxError: two starred expressions in assignment

>>> a, b = seq

ValueError: too many values to unpack (expected 2)

>>> *a = seq

SyntaxError: starred assignment target must be in a list or tuple

>>> *a, = seq

>>> a

[1, 2, 3, 4]
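One more boundary case is worth a quick check in Python 3.X: a star can absorb zero items, but it cannot make up for too few items for the plain names around it. The sketch below catches the resulting exception rather than quoting its message, since the wording may vary between releases; the names outcome and g are made up for the illustration.

```python
seq = [1, 2, 3, 4]

try:
    a, b, c, d, e, *f = seq        # Five plain names need at least five items
except ValueError as exc:
    outcome = type(exc).__name__
else:
    outcome = 'no error'

[*g] = seq                         # A list target also satisfies the "inside a sequence" rule

print(outcome, g, g is seq)
```

Note that g is a new list holding the same items, not another reference to seq.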

A useful convenience

Keep in mind that extended sequence unpacking assignment is just a convenience. We can usually achieve the same effects with explicit indexing and slicing (and in fact must in Python 2.X), but extended unpacking is simpler to code. The common “first, rest” splitting coding pattern, for example, can be coded either way, but slicing involves extra work:

>>> seq

[1, 2, 3, 4]

>>> a, *b = seq                        # First, rest

>>> a, b

(1, [2, 3, 4])

>>> a, b = seq[0], seq[1:]             # First, rest: traditional

>>> a, b

(1, [2, 3, 4])

The also-common “rest, last” splitting pattern can similarly be coded either way, but the new extended unpacking syntax requires noticeably fewer keystrokes:

>>> *a, b = seq                        # Rest, last

>>> a, b

([1, 2, 3], 4)

>>> a, b = seq[:-1], seq[-1]           # Rest, last: traditional

>>> a, b

([1, 2, 3], 4)

Because it is not only simpler but, arguably, more natural, extended sequence unpacking syntax will likely become widespread in Python code over time.

Application to for loops

Because the loop variable in the for loop statement can be any assignment target, extended sequence assignment works here too. We met the for loop iteration tool briefly in Chapter 4 and will study it formally in Chapter 13. In Python 3.X, extended assignments may show up after the word for, where a simple variable name is more commonly used:

for (a, *b, c) in [(1, 2, 3, 4), (5, 6, 7, 8)]:

    ...

When used in this context, on each iteration Python simply assigns the next tuple of values to the tuple of names. On the first loop, for example, it’s as if we’d run the following assignment statement:

a, *b, c = (1, 2, 3, 4)                            # b gets [2, 3]

The names a, b, and c can be used within the loop’s code to reference the extracted components. In fact, this is really not a special case at all, but just an instance of general assignment at work. As we saw earlier in this chapter, we can do the same thing with simple tuple assignment in both Python 2.X and 3.X:

for (a, b, c) in [(1, 2, 3), (4, 5, 6)]:           # a, b, c = (1, 2, 3), ...

And we can always emulate 3.X’s extended assignment behavior in 2.X by manually slicing:

for all in [(1, 2, 3, 4), (5, 6, 7, 8)]:

    a, b, c = all[0], all[1:3], all[3]

Since we haven’t learned enough to get more detailed about the syntax of for loops, we’ll return to this topic in Chapter 13.
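In the meantime, the starred loop and its slicing emulation can be compared directly. This sketch uses the loop variable name row instead of all, to avoid hiding the built-in function of that name, and collects results in two lists introduced here for illustration.

```python
rows = [(1, 2, 3, 4), (5, 6, 7, 8)]

starred = []
for (a, *b, c) in rows:                        # 3.X extended unpacking
    starred.append((a, b, c))

sliced = []
for row in rows:                               # Manual slicing, 2.X or 3.X
    a, b, c = row[0], row[1:3], row[3]
    sliced.append((a, b, c))

print(starred)    # [(1, [2, 3], 4), (5, [6, 7], 8)]
print(sliced)     # [(1, (2, 3), 4), (5, (6, 7), 8)]
```

The two are not quite identical: the starred name receives a list, while slicing a tuple yields a tuple.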

Multiple-Target Assignments

A multiple-target assignment simply assigns all the given names to the object all the way to the right. The following, for example, assigns the three variables a, b, and c to the string 'spam':

>>> a = b = c = 'spam'

>>> a, b, c

('spam', 'spam', 'spam')

This form is equivalent to (but easier to code than) these three assignments:

>>> c = 'spam'

>>> b = c

>>> a = b

Multiple-target assignment and shared references

Keep in mind that there is just one object here, shared by all three variables (they all wind up pointing to the same object in memory). This behavior is fine for immutable types—for example, when initializing a set of counters to zero (recall that variables must be assigned before they can be used in Python, so you must initialize counters to zero before you can start adding to them):

>>> a = b = 0

>>> b = b + 1

>>> a, b

(0, 1)

Here, changing b only changes b because numbers do not support in-place changes. As long as the object assigned is immutable, it’s irrelevant if more than one name references it.

As usual, though, we have to be more cautious when initializing variables to an empty mutable object such as a list or dictionary:

>>> a = b = []

>>> b.append(42)

>>> a, b

([42], [42])

This time, because a and b reference the same object, appending to it in place through b will impact what we see through a as well. This is really just another example of the shared reference phenomenon we first met in Chapter 6. To avoid the issue, initialize mutable objects in separate statements instead, so that each creates a distinct empty object by running a distinct literal expression:

>>> a = []

>>> b = []                 # a and b do not share the same object

>>> b.append(42)

>>> a, b

([], [42])

A tuple assignment like the following has the same effect—by running two list expressions, it creates two distinct objects:

>>> a, b = [], []          # a and b do not share the same object
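The is operator, which tests object identity, confirms the difference between the two forms. The following sketch uses the extra names c and d only to keep both results around.

```python
a = b = []                 # One list object, two names
shared = a is b

c, d = [], []              # Two list expressions, two objects
distinct = c is not d

b.append(42)               # Visible through a as well
d.append(42)               # Not visible through c

print(shared, distinct, a, c)
```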

Augmented Assignments

Beginning with Python 2.0, the set of additional assignment statement formats listed in Table 11-2 became available. Known as augmented assignments, and borrowed from the C language, these formats are mostly just shorthand. They imply the combination of a binary expression and an assignment. For instance, the following two formats are roughly equivalent:

X = X + Y                       # Traditional form

X += Y                          # Newer augmented form

Table 11-2. Augmented assignment statements

X += Y      X &= Y      X -= Y      X |= Y
X *= Y      X ^= Y      X /= Y      X >>= Y
X %= Y      X <<= Y     X **= Y     X //= Y

Augmented assignment works on any type that supports the implied binary expression. For example, here are two ways to add 1 to a name:

>>> x = 1

>>> x = x + 1                   # Traditional

>>> x

2

>>> x += 1                      # Augmented

>>> x

3

When applied to a sequence such as a string, the augmented form performs concatenation instead. Thus, the second line here is equivalent to typing the longer S = S + "SPAM":

>>> S = "spam"

>>> S += "SPAM"                 # Implied concatenation

>>> S

'spamSPAM'

As shown in Table 11-2, there are analogous augmented assignment forms for each of Python’s binary arithmetic and bitwise expression operators (i.e., each such operator with values on the left and right side). For instance, X *= Y multiplies and assigns, X >>= Y shifts right and assigns, and so on. X //= Y (for floor division) was added in version 2.2.
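A few of the less common forms in Table 11-2 are exercised below. This is just a sketch with made-up names (count, flags, text); each comment gives the value after the statement runs.

```python
count = 10
count //= 3            # Floor division: 3
count **= 2            # Power: 9
count <<= 1            # Shift left: 18
count %= 5             # Remainder: 3

flags = 0b1100
flags &= 0b1010        # Bitwise and: 0b1000 (8)
flags |= 0b0001        # Bitwise or:  0b1001 (9)
flags ^= 0b1111        # Bitwise xor: 0b0110 (6)

text = 'ab'
text *= 3              # Repetition for sequences: 'ababab'

print(count, flags, text)
```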

Augmented assignments have three advantages:[22]

§  There’s less for you to type. Need I say more?

§  The left side has to be evaluated only once. In X += Y, X may be a complicated object expression. In the augmented form, its code must be run only once. However, in the long form, X = X + Y, X appears twice and must be run twice. Because of this, augmented assignments usually run faster.

§  The optimal technique is automatically chosen. That is, for objects that support in-place changes, the augmented forms automatically perform in-place change operations instead of slower copies.
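The second advantage in this list can be observed directly when the target contains a call with a side effect. The slot function below is a hypothetical helper written only to count how many times the index expression runs.

```python
calls = []

def slot():
    """Hypothetical helper: records each call, then returns index 0."""
    calls.append('called')
    return 0

data = [10]

data[slot()] += 1                      # Augmented: the index expression runs once
augmented_calls = len(calls)

calls.clear()
data[slot()] = data[slot()] + 1        # Long form: the index expression runs twice
long_form_calls = len(calls)

print(augmented_calls, long_form_calls, data)
```

Both statements add 1 to the same item, but the long form evaluates slot() once for the fetch and once for the store.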

The last point here requires a bit more explanation. For augmented assignments, in-place operations may be applied for mutable objects as an optimization. Recall that lists can be extended in a variety of ways. To add a single item to the end of a list, we can concatenate or call append:

>>> L = [1, 2]

>>> L = L + [3]                 # Concatenate: slower

>>> L

[1, 2, 3]

>>> L.append(4)                 # Faster, but in place

>>> L

[1, 2, 3, 4]

And to add a set of items to the end, we can either concatenate again or call the list extend method:[23]

>>> L = L + [5, 6]              # Concatenate: slower

>>> L

[1, 2, 3, 4, 5, 6]

>>> L.extend([7, 8])            # Faster, but in place

>>> L

[1, 2, 3, 4, 5, 6, 7, 8]

In both cases, concatenation is less prone to the side effects of shared object references but will generally run slower than the in-place equivalent. Concatenation operations must create a new object, copy in the list on the left, and then copy in the list on the right. By contrast, in-place method calls simply add items at the end of a memory block (it can be a bit more complicated than that internally, but this description suffices).

When we use augmented assignment to extend a list, we can largely forget these details—Python automatically calls the quicker extend method instead of using the slower concatenation operation implied by +:

>>> L += [9, 10]                # Mapped to L.extend([9, 10])

>>> L

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Note, however, that because of this equivalence, += for a list is not exactly the same as a + and = in all cases: for lists, += allows arbitrary sequences (just like extend), but concatenation normally does not:

>>> L = []

>>> L += 'spam'                 # += and extend allow any sequence, but + does not!

>>> L

['s', 'p', 'a', 'm']

>>> L = L + 'spam'

TypeError: can only concatenate list (not "str") to list

Augmented assignment and shared references

This behavior is usually what we want, but notice that it implies that the += is an in-place change for lists; thus, it is not exactly like + concatenation, which always makes a new object. As for all shared reference cases, this difference might matter if other names reference the object being changed:

>>> L = [1, 2]

>>> M = L                       # L and M reference the same object

>>> L = L + [3, 4]              # Concatenation makes a new object

>>> L, M                        # Changes L but not M

([1, 2, 3, 4], [1, 2])

>>> L = [1, 2]

>>> M = L

>>> L += [3, 4]                 # But += really means extend

>>> L, M                        # M sees the in-place change too!

([1, 2, 3, 4], [1, 2, 3, 4])

This only matters for mutables like lists and dictionaries, and it is a fairly obscure case (at least, until it impacts your code!). As always, make copies of your mutable objects if you need to break the shared reference structure.
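Two short cases round out this point. The sketch below first copies the list with a slice before changing it, and then repeats the experiment with a tuple, which cannot be changed in place; the names T and U are introduced here for the tuple case.

```python
L = [1, 2]
M = L[:]                   # Top-level copy: M no longer shares L's object
L += [3, 4]                # In-place change affects only L
print(L, M)                # [1, 2, 3, 4] [1, 2]

T = (1, 2)
U = T                      # T and U reference the same tuple
T += (3, 4)                # Tuples are immutable, so += builds a new tuple
print(T, U)                # (1, 2, 3, 4) (1, 2)
```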

Variable Name Rules

Now that we’ve explored assignment statements, it’s time to get more formal about the use of variable names. In Python, names come into existence when you assign values to them, but there are a few rules to follow when choosing names for the subjects of your programs:

Syntax: (underscore or letter) + (any number of letters, digits, or underscores)

Variable names must start with an underscore or letter, which can be followed by any number of letters, digits, or underscores. _spam, spam, and Spam_1 are legal names, but 1_Spam, spam$, and @#! are not.

Case matters: SPAM is not the same as spam

Python always pays attention to case in programs, both in names you create and in reserved words. For instance, the names X and x refer to two different variables. For portability, case also matters in the names of imported module files, even on platforms where the filesystems are case-insensitive. That way, your imports still work after programs are copied to differing platforms.

Reserved words are off-limits

Names you define cannot be the same as words that mean special things in the Python language. For instance, if you try to use a variable name like class, Python will raise a syntax error, but klass and Class work fine. Table 11-3 lists the words that are currently reserved (and hence off-limits for names of your own) in Python.

Table 11-3. Python 3.X reserved words

False      class       finally     is          return
None       continue    for         lambda      try
True       def         from        nonlocal    while
and        del         global      not         with
as         elif        if          or          yield
assert     else        import      pass
break      except      in          raise

Table 11-3 is specific to Python 3.X. In Python 2.X, the set of reserved words differs slightly:

§  print is a reserved word, because printing is a statement, not a built-in function (more on this later in this chapter).

§  exec is a reserved word, because it is a statement, not a built-in function.

§  nonlocal is not a reserved word because this statement is not available.

In older Pythons the story is also more or less the same, with a few variations:

§  with and as were not reserved until 2.6, when context managers were officially enabled.

§  yield was not reserved until Python 2.3, when generator functions came online.

§  yield morphed from statement to expression in 2.5, but it’s still a reserved word, not a built-in function.

As you can see, most of Python’s reserved words are all lowercase. They are also all truly reserved—unlike names in the built-in scope that you will meet in the next part of this book, you cannot redefine reserved words by assignment (e.g., and = 1 results in a syntax error).[24]

Besides being of mixed case, the first three entries in Table 11-3, True, False, and None, are somewhat unusual in meaning—they also appear in the built-in scope of Python described in Chapter 17, and they are technically names assigned to objects. In 3.X they are truly reserved in all other senses, though, and cannot be used for any other purpose in your script other than that of the objects they represent. All the other reserved words are hardwired into Python’s syntax and can appear only in the specific contexts for which they are intended.
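These rules can be verified without risking a crash by compiling small assignment statements and checking for a syntax error. The sketch below assumes Python 3.X; assignable is a hypothetical helper, while compile and the standard library’s keyword module are real. The exact contents of keyword.kwlist depend on the Python version you run.

```python
import keyword

def assignable(name):
    """Hypothetical helper: True if `name = 1` compiles as a statement."""
    try:
        compile(name + ' = 1', '<check>', 'exec')   # Compiles only; nothing is run
    except SyntaxError:
        return False
    return True

names = ('and', 'class', 'True', 'None', 'klass', 'Class', 'open')
results = {name: assignable(name) for name in names}

print(results)
print(keyword.iskeyword('yield'), keyword.iskeyword('open'))
```

Reserved words, including True and None in 3.X, fail to compile as targets; klass, Class, and the built-in name open are accepted.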

Furthermore, because module names in import statements become variables in your scripts, variable name constraints extend to your module filenames too. For instance, you can code files called and.py and my-code.py and run them as top-level scripts, but you cannot import them: their names without the “.py” extension become variables in your code and so must follow all the variable rules just outlined. Reserved words are off-limits, and dashes won’t work, though underscores will. We’ll revisit this module idea in Part V of this book.
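A rough check for whether a filename will work in an import statement can be built from two real tools: the str.isidentifier method and keyword.iskeyword. The importable_name function itself is a hypothetical helper, and this sketch assumes Python 3.X, where isidentifier also accepts some non-ASCII letters.

```python
import keyword

def importable_name(filename):
    """Hypothetical helper: could this .py file be named in an import statement?"""
    stem = filename[:-3] if filename.endswith('.py') else filename
    return stem.isidentifier() and not keyword.iskeyword(stem)

files = ('and.py', 'my-code.py', 'my_code.py', '1_Spam.py', '_spam.py')
checks = {name: importable_name(name) for name in files}

print(checks)
```

The reserved word, the dash, and the leading digit are all rejected, while the underscore versions pass.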

PYTHON’S DEPRECATION PROTOCOL

It is interesting to note how reserved word changes are gradually phased into the language. When a new feature might break existing code, Python normally makes it an option and begins issuing “deprecation” warnings one or more releases before the feature is officially enabled. The idea is that you should have ample time to notice the warnings and update your code before migrating to the new release. This is not true for major new releases like 3.0 (which breaks existing code freely), but it is generally true in other cases.

For example, yield was an optional extension in Python 2.2, but is a standard keyword as of 2.3. It is used in conjunction with generator functions. This was one of a small handful of instances where Python broke with backward compatibility. Still, yield was phased in over time: it began generating deprecation warnings in 2.2 and was not enabled until 2.3.

Similarly, in Python 2.6, the words with and as became new reserved words for use in context managers (a newer form of exception handling). These two words are not reserved in 2.5, unless the context manager feature is turned on manually with a from __future__ import (discussed later in this book). When used in 2.5, with and as generate warnings about the upcoming change, except in the version of IDLE in Python 2.5, which appears to have enabled this feature for you (that is, using these words as variable names does generate errors in 2.5, but only in its version of the IDLE GUI).

Naming conventions

Besides these rules, there is also a set of naming conventions—rules that are not required but are followed in normal practice. For instance, because names with two leading and trailing underscores (e.g., __name__) generally have special meaning to the Python interpreter, you should avoid this pattern for your own names. Here is a list of the conventions Python follows:

§  Names that begin with a single underscore (_X) are not imported by a from module import * statement (described in Chapter 23).

§  Names that have two leading and trailing underscores (__X__) are system-defined names that have special meaning to the interpreter.

§  Names that begin with two underscores and do not end with two more (__X) are localized (“mangled”) to enclosing classes (see the discussion of pseudoprivate attributes in Chapter 31).

§  The name that is just a single underscore (_) retains the result of the last expression when you are working interactively.
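To make the third of these conventions a bit more concrete, here is a small sketch of name mangling at work. The class and attribute names are arbitrary, and the full story waits for Chapter 31, but it shows that a __X name assigned inside a class is stored under an expanded name that includes the class's name:

```python
class Spam:
    def __init__(self):
        self.__eggs = 42        # Stored as _Spam__eggs because it is coded inside Spam

s = Spam()
print(s._Spam__eggs)            # 42: the mangled name is still reachable
print(hasattr(s, '__eggs'))     # False: no attribute exists under the original name
```

As this suggests, mangling localizes a name to its class rather than truly hiding it.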

In addition to these Python interpreter conventions, there are various other conventions that Python programmers usually follow. For instance, later in the book we’ll see that class names commonly start with an uppercase letter and module names with a lowercase letter, and that the name self, though not reserved, usually has a special role in classes. In Chapter 17 we’ll also study another, larger category of names known as the built-ins, which are predefined but not reserved (and so can be reassigned: open = 42 works, though sometimes you might wish it didn’t!).

Names have no type, but objects do

This is mostly review, but remember that it’s crucial to keep Python’s distinction between names and objects clear. As described in Chapter 6, objects have a type (e.g., integer, list) and may be mutable or not. Names (a.k.a. variables), on the other hand, are always just references to objects; they have no notion of mutability and have no associated type information, apart from the type of the object they happen to reference at a given point in time.

Thus, it’s OK to assign the same name to different kinds of objects at different times:

>>> x = 0               # x bound to an integer object

>>> x = "Hello"         # Now it's a string

>>> x = [1, 2, 3]       # And now it's a list

In later examples, you’ll see that this generic nature of names can be a decided advantage in Python programming. In Chapter 17, you’ll also learn that names live in something called a scope, which defines where they can be used; the place where you assign a name determines where it is visible.[25]

NOTE

For additional naming suggestions, see the discussion of naming conventions in Python’s semi-official style guide, known as PEP 8. This guide is available at http://www.python.org/dev/peps/pep-0008, or via a web search for “Python PEP 8.” Technically, this document formalizes coding standards for Python library code.

Though useful, the usual caveats about coding standards apply here. For one thing, PEP 8 comes with more detail than you are probably ready for at this point in the book. And frankly, it has become more complex, rigid, and subjective than it may need to be—some of its suggestions are not at all universally accepted or followed by Python programmers doing real work. Moreover, some of the most prominent companies using Python today have adopted coding standards of their own that differ.

PEP 8 does codify useful rule-of-thumb Python knowledge, though, and it’s a great read for Python beginners, as long as you take its recommendations as guidelines, not gospel.


[22] C/C++ programmers take note: although Python now supports statements like X += Y, it still does not have C’s auto-increment/decrement operators (e.g., X++, --X). These don’t quite map to the Python object model because Python has no notion of in-place changes to immutable objects like numbers.

[23] As suggested in Chapter 6, we can also use slice assignment (e.g., L[len(L):] = [11,12,13]), but this works roughly the same as the simpler and more mnemonic list extend method.

[24] In standard CPython, at least. Alternative implementations of Python might allow user-defined variable names to be the same as Python reserved words. See Chapter 2 for an overview of alternative implementations, such as Jython.

[25] If you’ve used a more restrictive language like C++, you may be interested to know that there is no notion of C++’s const declaration in Python; certain objects may be immutable, but names can always be assigned. Python also has ways to hide names in classes and modules, but they’re not the same as C++’s declarations (if hiding attributes matters to you, see the coverage of _X module names in Chapter 25, __X class names in Chapter 31, and the Private and Public class decorators example in Chapter 39).

Expression Statements

In Python, you can use an expression as a statement, too—that is, on a line by itself. But because the result of the expression won’t be saved, it usually makes sense to do so only if the expression does something useful as a side effect. Expressions are commonly used as statements in two situations:

For calls to functions and methods

Some functions and methods do their work without returning a value. Such functions are sometimes called procedures in other languages. Because they don’t return values that you might be interested in retaining, you can call these functions with expression statements.

For printing values at the interactive prompt

Python echoes back the results of expressions typed at the interactive command line. Technically, these are expression statements, too; they serve as a shorthand for typing print statements.

Table 11-4 lists some common expression statement forms in Python. Calls to functions and methods are coded with zero or more argument objects (really, expressions that evaluate to objects) in parentheses, after the function/method name.

Table 11-4. Common Python expression statements

Operation | Interpretation

spam(eggs, ham) | Function calls

spam.ham(eggs) | Method calls

spam | Printing variables in the interactive interpreter

print(a, b, c, sep='') | Printing operations in Python 3.X

yield x ** 2 | Yielding expression statements

The last two entries in Table 11-4 are somewhat special cases—as we’ll see later in this chapter, printing in Python 3.X is a function call usually coded on a line by itself, and the yield operation in generator functions (discussed in Chapter 20) is often coded as a statement as well. Both are really just instances of expression statements.

For instance, though you normally run a 3.X print call on a line by itself as an expression statement, it returns a value like any other function call (its return value is None, the default return value for functions that don’t return anything meaningful):

>>> x = print('spam')         # print is a function call expression in 3.X

spam

>>> print(x)                  # But it is coded as an expression statement

None

Also keep in mind that although expressions can appear as statements in Python, statements cannot be used as expressions. A statement that is not an expression must generally appear on a line all by itself, not nested in a larger syntactic structure. For example, Python doesn’t allow you to embed assignment statements (=) in other expressions. The rationale for this is that it avoids common coding mistakes; you can’t accidentally change a variable by typing = when you really mean to use the == equality test. You’ll see how to code around this restriction when you meet the Python while loop in Chapter 13.
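You can verify this restriction without crashing a script by handing source code strings to the built-in compile function, which raises SyntaxError for code that Python's parser rejects. The following is just a sketch, and the helper name compiles is invented for this example:

```python
def compiles(source):
    """Return True if source is syntactically valid Python code."""
    try:
        compile(source, '<demo>', 'exec')
        return True
    except SyntaxError:
        return False

print(compiles('x = 1'))              # True: assignment as a statement
print(compiles('y = (x = 1) + 2'))    # False: assignment nested in an expression
print(compiles('x == 1'))             # True: an equality test is an expression
```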

Expression Statements and In-Place Changes

This brings up another mistake that is common in Python work. Expression statements are often used to run list methods that change a list in place:

>>> L = [1, 2]

>>> L.append(3)               # Append is an in-place change

>>> L

[1, 2, 3]

However, it’s not unusual for Python newcomers to code such an operation as an assignment statement instead, intending to assign L to the larger list:

>>> L = L.append(4)           # But append returns None, not L

>>> print(L)                  # So we lose our list!

None

This doesn’t quite work, though. Calling an in-place change operation such as append, sort, or reverse on a list always changes the list in place, but these methods do not return the list they have changed; instead, they return the None object. Thus, if you assign such an operation’s result back to the variable name, you effectively lose the list (and it is probably garbage-collected in the process!).

The moral of the story is, don’t do this—call in-place change operations without assigning their results. We’ll revisit this phenomenon in the section Common Coding Gotchas because it can also appear in the context of some looping statements we’ll meet in later chapters.
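The same trap applies to sort. When you really do want a result to assign, one option is the sorted built-in function, which returns a new list and leaves its argument unchanged. A brief comparison, with arbitrary list contents:

```python
L = [3, 1, 2]
result = L.sort()          # In-place change: L is sorted, result is None
print(L, result)           # [1, 2, 3] None

M = [3, 1, 2]
N = sorted(M)              # sorted returns a new list; M is left as it was
print(M, N)                # [3, 1, 2] [1, 2, 3]
```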

Print Operations

In Python, print prints things—it’s simply a programmer-friendly interface to the standard output stream.

Technically, printing converts one or more objects to their textual representations, adds some minor formatting, and sends the resulting text to either standard output or another file-like stream. In a bit more detail, print is strongly bound up with the notions of files and streams in Python:

File object methods

In Chapter 9, we learned about file object methods that write text (e.g., file.write(str)). Printing operations are similar, but more focused—whereas file write methods write strings to arbitrary files, print writes objects to the stdout stream by default, with some automatic formatting added. Unlike with file methods, there is no need to convert objects to strings when using print operations.

Standard output stream

The standard output stream (often known as stdout) is simply a default place to send a program’s text output. Along with the standard input and error streams, it’s one of three data connections created when your script starts. The standard output stream is usually mapped to the window where you started your Python program, unless it’s been redirected to a file or pipe in your operating system’s shell.

Because the standard output stream is available in Python as the stdout file object in the built-in sys module (i.e., sys.stdout), it’s possible to emulate print with file write method calls. However, print is noticeably easier to use and makes it easy to print text to other files and streams.

Printing is also one of the most visible places where Python 3.X and 2.X have diverged. In fact, this divergence is usually the first reason that most 2.X code won’t run unchanged under 3.X. Specifically, the way you code print operations depends on which version of Python you use:

§  In Python 3.X, printing is a built-in function, with keyword arguments for special modes.

§  In Python 2.X, printing is a statement with specific syntax all its own.

Because this book covers both 3.X and 2.X, we will look at each form in turn here. If you are fortunate enough to be able to work with code written for just one version of Python, feel free to pick the section that is relevant to you. Because your needs may change, however, it probably won’t hurt to be familiar with both cases. Moreover, users of recent Python 2.X releases can also import and use 3.X’s flavor of printing in their Pythons if desired—both for its extra functionality and to ease future migration to 3.X.

The Python 3.X print Function

Strictly speaking, printing is not a separate statement form in 3.X. Instead, it is simply an instance of the expression statement we studied in the preceding section.

The print built-in function is normally called on a line of its own, because it doesn’t return any value we care about (technically, it returns None, as we saw in the preceding section). Because it is a normal function, though, printing in 3.X uses standard function-call syntax, rather than a special statement form. And because it provides special operation modes with keyword arguments, this form is both more general and supports future enhancements better.

By comparison, Python 2.X print statements have somewhat ad hoc syntax to support extensions such as end-of-line suppression and target files. Further, the 2.X statement does not support separator specification at all; in 2.X, you wind up building strings ahead of time more often than you do in 3.X. Rather than adding yet more ad hoc syntax, Python 3.X’s print takes a single, general approach that covers them all.

Call format

Syntactically, calls to the 3.X print function have the following form (the flush argument is new as of Python 3.3):

print([object, ...][, sep=' '][, end='\n'][, file=sys.stdout][, flush=False])

In this formal notation, items in square brackets are optional and may be omitted in a given call, and values after = give argument defaults. In English, this built-in function prints the textual representation of one or more objects separated by the string sep and followed by the string end to the stream file, flushing buffered output or not per flush.

The sep, end, file, and (in 3.3 and later) flush parts, if present, must be given as keyword arguments—that is, you must use a special “name=value” syntax to pass the arguments by name instead of position. Keyword arguments are covered in depth in Chapter 18, but they’re straightforward to use. The keyword arguments sent to this call may appear in any left-to-right order following the objects to be printed, and they control the print operation:

§  sep is a string inserted between each object’s text, which defaults to a single space if not passed; passing an empty string suppresses separators altogether.

§  end is a string added at the end of the printed text, which defaults to a \n newline character if not passed. Passing an empty string avoids dropping down to the next output line at the end of the printed text—the next print will keep adding to the end of the current output line.

§  file specifies the file, standard stream, or other file-like object to which the text will be sent; it defaults to the sys.stdout standard output stream if not passed. Any object with a file-like write(string) method may be passed, but real files should be already opened for output.

§  flush, added in 3.3, defaults to False. It allows prints to mandate that their text be flushed through the output stream immediately to any waiting recipients. Normally, whether printed output is buffered in memory or not is determined by file; passing a true value to flush forcibly flushes the stream.

The textual representation of each object to be printed is obtained by passing the object to the str built-in call (or its equivalent inside Python); as we’ve seen, this built-in returns a “user friendly” display string for any object.[26] With no arguments at all, the print function simply prints a newline character to the standard output stream, which usually displays a blank line.
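Two of these keywords are easy to overlook, so here is a short sketch of each. It assumes Python 3.3 or later (for flush) and uses the standard library's io.StringIO class as a stand-in for any object with a write method; the variable name buffer is arbitrary:

```python
import io
import sys

buffer = io.StringIO()                        # In-memory object with a write method
print('spam', 99, ['eggs'], sep='|', end='!\n', file=buffer)
print(repr(buffer.getvalue()))                # "spam|99|['eggs']!\n"

print('working...', end='', flush=True)       # Ask for the text to be flushed right away
print(' done', file=sys.stdout)               # Same target as the default
```

The first print never reaches the screen; its text is collected in buffer, with each object converted by str and joined by the given separator.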

The 3.X print function in action

Printing in 3.X is probably simpler than some of its details may imply. To illustrate, let’s run some quick examples. The following prints a variety of object types to the default standard output stream, with the default separator and end-of-line formatting added (these are the defaults because they are the most common use case):

C:\code> c:\python33\python

>>> print()                                      # Display a blank line

>>> x = 'spam'

>>> y = 99

>>> z = ['eggs']

>>> 

>>> print(x, y, z)                               # Print three objects per defaults

spam 99 ['eggs']

There’s no need to convert objects to strings here, as would be required for file write methods. By default, print calls add a space between the objects printed. To suppress this, send an empty string to the sep keyword argument, or send an alternative separator of your choosing:

>>> print(x, y, z, sep='')                       # Suppress separator

spam99['eggs']

>>> 

>>> print(x, y, z, sep=', ')                     # Custom separator

spam, 99, ['eggs']

Also by default, print adds an end-of-line character to terminate the output line. You can suppress this and avoid the line break altogether by passing an empty string to the end keyword argument, or you can pass a different terminator of your own including a \n character to break the line manually if desired (the second of the following is two statements on one line, separated by a semicolon):

>>> print(x, y, z, end='')                        # Suppress line break

spam 99 ['eggs']>>>

>>> 

>>> print(x, y, z, end=''); print(x, y, z)        # Two prints, same output line

spam 99 ['eggs']spam 99 ['eggs']

>>> print(x, y, z, end='...\n')                   # Custom line end

spam 99 ['eggs']...

>>> 

You can also combine keyword arguments to specify both separators and end-of-line strings—they may appear in any order but must appear after all the objects being printed:

>>> print(x, y, z, sep='...', end='!\n')          # Multiple keywords

spam...99...['eggs']!

>>> print(x, y, z, end='!\n', sep='...')          # Order doesn't matter

spam...99...['eggs']!

Here is how the file keyword argument is used—it directs the printed text to an open output file or other compatible object for the duration of the single print (this is really a form of stream redirection, a topic we will revisit later in this section):

>>> print(x, y, z, sep='...', file=open('data.txt', 'w'))      # Print to a file

>>> print(x, y, z)                                             # Back to stdout

spam 99 ['eggs']

>>> print(open('data.txt').read())                             # Display file text

spam...99...['eggs']

Finally, keep in mind that the separator and end-of-line options provided by print operations are just conveniences. If you need to display more specific formatting, don’t print this way. Instead, build up a more complex string ahead of time or within the print itself using the string tools we met in Chapter 7, and print the string all at once:

>>> text = '%s: %-.4f, %05d' % ('Result', 3.14159, 42)

>>> print(text)

Result: 3.1416, 00042

>>> print('%s: %-.4f, %05d' % ('Result', 3.14159, 42))

Result: 3.1416, 00042

As we’ll see in the next section, almost everything we’ve just seen about the 3.X print function also applies directly to 2.X print statements—which makes sense, given that the function was intended to both emulate and improve upon 2.X printing support.

The Python 2.X print Statement

As mentioned earlier, printing in Python 2.X uses a statement with unique and specific syntax, rather than a built-in function. In practice, though, 2.X printing is mostly a variation on a theme; with the exception of separator strings (which are supported in 3.X but not 2.X) and flushes on prints (available as of 3.3 only), everything we can do with the 3.X print function has a direct translation to the 2.X print statement.

Statement forms

Table 11-5 lists the print statement’s forms in Python 2.X and gives their Python 3.X print function equivalents for reference. Notice that the comma is significant in print statements—it separates objects to be printed, and a trailing comma suppresses the end-of-line character normally added at the end of the printed text (not to be confused with tuple syntax!). The >> syntax, normally used as a bitwise right-shift operation, is used here as well, to specify a target output stream other than the sys.stdout default.

Table 11-5. Python 2.X print statement forms

Python 2.X statement | Python 3.X equivalent | Interpretation

print x, y | print(x, y) | Print objects’ textual forms to sys.stdout; add a space between the items and an end-of-line at the end

print x, y, | print(x, y, end='') | Same, but don’t add end-of-line at end of text

print >> afile, x, y | print(x, y, file=afile) | Send text to afile.write, not to sys.stdout.write

The 2.X print statement in action

Although the 2.X print statement has more unique syntax than the 3.X function, it’s similarly easy to use. Let’s turn to some basic examples again. The 2.X print statement adds a space between the items separated by commas and by default adds a line break at the end of the current output line:

C:\code> c:\python27\python

>>> x = 'a'

>>> y = 'b'

>>> print x, y

a b

This formatting is just a default; you can choose to use it or not. To suppress the line break so you can add more text to the current line later, end your print statement with a comma, as shown in the second line of Table 11-5 (the following uses a semicolon to separate two statements on one line again):

>>> print x, y,; print x, y

a b a b

To suppress the space between items, again, don’t print this way. Instead, build up an output string using the string concatenation and formatting tools covered in Chapter 7, and print the string all at once:

>>> print x + y

ab

>>> print '%s...%s' % (x, y)

a...b

As you can see, apart from their special syntax for usage modes, 2.X print statements are roughly as simple to use as 3.X’s function. The next section uncovers the way that files are specified in 2.X prints.

Print Stream Redirection

In both Python 3.X and 2.X, printing sends text to the standard output stream by default. However, it’s often useful to send it elsewhere—to a text file, for example, to save results for later use or testing purposes. Although such redirection can be accomplished in system shells outside Python itself, it turns out to be just as easy to redirect a script’s streams from within the script.

The Python “hello world” program

Let’s start off with the usual (and largely pointless) language benchmark—the “hello world” program. To print a “hello world” message in Python, simply print the string per your version’s print operation:

>>> print('hello world')               # Print a string object in 3.X

hello world

>>> print 'hello world'                # Print a string object in 2.X

hello world

Because expression results are echoed on the interactive command line, you often don’t even need to use a print statement there—simply type the expressions you’d like to have printed, and their results are echoed back:

>>> 'hello world'                      # Interactive echoes

'hello world'

This code isn’t exactly an earth-shattering piece of software mastery, but it serves to illustrate printing behavior. Really, the print operation is just an ergonomic feature of Python—it provides a simple interface to the sys.stdout object, with a bit of default formatting. In fact, if you enjoy working harder than you must, you can also code print operations this way:

>>> import sys                         # Printing the hard way

>>> sys.stdout.write('hello world\n')

hello world

This code explicitly calls the write method of sys.stdout—an attribute preset when Python starts up to an open file object connected to the output stream. The print operation hides most of those details, providing a simple tool for simple printing tasks.

Manual stream redirection

So, why did I just show you the hard way to print? The sys.stdout print equivalent turns out to be the basis of a common technique in Python. In general, print and sys.stdout are directly related as follows. This statement:

print(X, Y)                            # Or, in 2.X: print X, Y

is equivalent to the longer:

import sys

sys.stdout.write(str(X) + ' ' + str(Y) + '\n')

which manually performs a string conversion with str, adds a separator and newline with +, and calls the output stream’s write method. Which would you rather code? (He says, hoping to underscore the programmer-friendly nature of prints...)

Obviously, the long form isn’t all that useful for printing by itself. However, it is useful to know that this is exactly what print operations do because it is possible to reassign sys.stdout to something different from the standard output stream. In other words, this equivalence provides a way of making your print operations send their text to other places. For example:

import sys

sys.stdout = open('log.txt', 'a')       # Redirects prints to a file

...

print(x, y, x)                          # Shows up in log.txt

Here, we reset sys.stdout to a manually opened file named log.txt, located in the script’s working directory and opened in append mode (so we add to its current content). After the reset, every print operation anywhere in the program will write its text to the end of the file log.txt instead of to the original output stream. The print operations are happy to keep calling sys.stdout’s write method, no matter what sys.stdout happens to refer to. Because there is just one sys module in your process, assigning sys.stdout this way will redirect every print anywhere in your program.

In fact, as the sidebar Why You Will Care: print and stdout will explain, you can even reset sys.stdout to an object that isn’t a file at all, as long as it has the expected interface: a method named write to receive the printed text string argument. When that object is a class, printed text can be routed and processed arbitrarily per a write method you code yourself.
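As a preview of the idea, the following sketch routes printed text to an instance of a class instead of a file. The class name UpperWriter and its behavior (collecting text in uppercase) are invented for this illustration; the only requirement taken from print itself is the write method:

```python
import sys

class UpperWriter:
    """Hypothetical file-like object: collects printed text in uppercase."""
    def __init__(self):
        self.chunks = []
    def write(self, text):
        self.chunks.append(text.upper())

saved = sys.stdout
sys.stdout = UpperWriter()
print('spam', 99)                  # Text is passed to UpperWriter.write
captured = sys.stdout
sys.stdout = saved                 # Restore the real output stream
print(''.join(captured.chunks))    # SPAM 99, followed by the saved line break
```

Because print may call write more than once per operation, the class simply saves each piece of text it receives and joins them later.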

This trick of resetting the output stream might be more useful for programs originally coded with print statements. If you know that output should go to a file to begin with, you can always call file write methods instead. To redirect the output of a print-based program, though, resetting sys.stdout provides a convenient alternative to changing every print statement or using system shell-based redirection syntax.

In other roles, streams may be reset to objects that display them in pop-up windows in GUIs, colorize them in IDEs like IDLE, and so on. It’s a general technique.

Automatic stream redirection

Although redirecting printed text by assigning sys.stdout is a useful tool, a potential problem with the last section’s code is that there is no direct way to restore the original output stream should you need to switch back after printing to a file. Because sys.stdout is just a normal file object, though, you can always save it and restore it if needed:[27]

C:\code> c:\python33\python

>>> import sys

>>> temp = sys.stdout                   # Save for restoring later

>>> sys.stdout = open('log.txt', 'a')   # Redirect prints to a file

>>> print('spam')                       # Prints go to file, not here

>>> print(1, 2, 3)

>>> sys.stdout.close()                  # Flush output to disk

>>> sys.stdout = temp                   # Restore original stream

>>> print('back here')                  # Prints show up here again

back here

>>> print(open('log.txt').read())       # Result of earlier prints

spam

1 2 3

As you can see, though, manual saving and restoring of the original output stream like this involves quite a bit of extra work. Because this crops up fairly often, a print extension is available to make it unnecessary.
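Before moving on to that extension, note that when you do reset sys.stdout by hand, it is worth making sure the original stream comes back even if something fails along the way. One way to do so is a try/finally statement, a tool covered later in this book. The following is only a sketch for 3.X, and the function name print_to_log is hypothetical:

```python
import sys

def print_to_log(path, *objects):
    """Hypothetical helper: print objects to a file, then restore sys.stdout."""
    saved = sys.stdout
    sys.stdout = open(path, 'a')       # Redirect prints to the file
    try:
        print(*objects)
    finally:
        sys.stdout.close()             # Flush output to disk
        sys.stdout = saved             # Runs even if the print raises an error
```

A call such as print_to_log('log.txt', 'spam', 1, 2, 3) appends one line to the file and leaves later prints going to the original stream.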

In 3.X, the file keyword allows a single print call to send its text to the write method of a file (or file-like object), without actually resetting sys.stdout. Because the redirection is temporary, normal print calls keep printing to the original output stream. In 2.X, a print statement that begins with a >> followed by an output file object (or other compatible object) has the same effect. For example, the following again sends printed text to a file named log.txt:

log =  open('log.txt', 'a')             # 3.X

print(x, y, z, file=log)                # Print to a file-like object

print(a, b, c)                          # Print to original stdout

log =  open('log.txt', 'a')             # 2.X

print >> log, x, y, z                   # Print to a file-like object

print a, b, c                           # Print to original stdout

These redirected forms of print are handy if you need to print to both files and the standard output stream in the same program. If you use these forms, however, be sure to give them a file object (or an object that has the same write method as a file object), not a file’s name string. Here is the technique in action:

C:\code> c:\python33\python

>>> log = open('log.txt', 'w')

>>> print(1, 2, 3, file=log)            # For 2.X: print >> log, 1, 2, 3

>>> print(4, 5, 6, file=log)

>>> log.close()

>>> print(7, 8, 9)                      # For 2.X: print 7, 8, 9

7 8 9

>>> print(open('log.txt').read())

1 2 3

4 5 6

These extended forms of print are also commonly used to print error messages to the standard error stream, available to your script as the preopened file object sys.stderr. You can either use its file write methods and format the output manually, or print with redirection syntax:

>>> import sys

>>> sys.stderr.write(('Bad!' * 8) + '\n')

Bad!Bad!Bad!Bad!Bad!Bad!Bad!Bad!

>>> print('Bad!' * 8, file=sys.stderr)     # In 2.X: print >> sys.stderr, 'Bad!' * 8

Bad!Bad!Bad!Bad!Bad!Bad!Bad!Bad!

Now that you know all about print redirections, the equivalence between printing and file write methods should be fairly obvious. The following interaction prints both ways in 3.X, then redirects the output to an external file to verify that the same text is printed:

>>> X = 1; Y = 2

>>> print(X, Y)                                            # Print: the easy way

1 2

>>> import sys                                             # Print: the hard way

>>> sys.stdout.write(str(X) + ' ' + str(Y) + '\n')

1 2

4

>>> print(X, Y, file=open('temp1', 'w'))                   # Redirect text to file

>>> open('temp2', 'w').write(str(X) + ' ' + str(Y) + '\n') # Send to file manually

4

>>> print(open('temp1', 'rb').read())                      # Binary mode for bytes

b'1 2\r\n'

>>> print(open('temp2', 'rb').read())

b'1 2\r\n'

As you can see, unless you happen to enjoy typing, print operations are usually the best option for displaying text. For another example of the equivalence between prints and file writes, watch for a 3.X print function emulation example in Chapter 18; it uses this code pattern to provide a general 3.X print function equivalent for use in Python 2.X.

Version-Neutral Printing

Finally, if you need your prints to work on both Python lines, you have some options. This is true whether you’re writing 2.X code that strives for 3.X compatibility, or 3.X code that aims to support 2.X too.

2to3 converter

For one, you can code 2.X print statements and let 3.X’s 2to3 conversion script translate them to 3.X function calls automatically. See the Python 3.X manuals for more details about this script; it attempts to translate 2.X code to run under 3.X—a useful tool, but perhaps more than you want to make just your print operations version-neutral. A related tool named 3to2 attempts to do the inverse: convert 3.X code to run on 2.X; see Appendix C for more information.

Importing from __future__

Alternatively, you can code 3.X print function calls in code to be run by 2.X, by enabling the function call variant with a statement like the following coded at the top of a script, or anywhere in an interactive session:

from __future__ import print_function

This statement changes 2.X to support 3.X’s print functions exactly. This way, you can use 3.X print features and won’t have to change your prints if you later migrate to 3.X. Two usage notes here:

§  This statement is simply ignored if it appears in code run by 3.X—it doesn’t hurt if included in 3.X code for 2.X compatibility.

§  This statement must appear at the top of each file that prints in 2.X. Because it modifies the parser for a single file only, it’s not enough to import another file that includes this statement.

Neutralizing display differences with code

Also keep in mind that simple prints, like those in the first row of Table 11-5, work in either version of Python. Because any expression may be enclosed in parentheses, we can always pretend to be calling a 3.X print function in 2.X by adding outer parentheses. The main downside is that this makes a tuple out of your printed objects if there is more than one, or none; they will print with extra enclosing parentheses. In 3.X, for example, any number of objects may be listed in the call’s parentheses:

C:\code> c:\python33\python

>>> print('spam')                       # 3.X print function call syntax

spam

>>> print('spam', 'ham', 'eggs')        # These are multiple arguments

spam ham eggs

The first of these works the same in 2.X, but the second generates a tuple in the output:

C:\code> c:\python27\python

>>> print('spam')                       # 2.X print statement, enclosing parens

spam

>>> print('spam', 'ham', 'eggs')        # This is really a tuple object!

('spam', 'ham', 'eggs')

The same applies when there are no objects printed to force a line-feed: 2.X shows a tuple, unless you print an empty string:

c:\code> py -2

>>> print()                             # This is just a line-feed on 3.X

()

>>> print('')                           # This is a line-feed in both 2.X and 3.X

Strictly speaking, outputs may in some cases differ in more than just extra enclosing parentheses in 2.X. If you look closely at the preceding results, you’ll notice that the strings also print with enclosing quotes in 2.X only. This is because objects may print differently when nested in another object than they do as top-level items. Technically, nested appearances display with repr and top-level objects with str—the two alternative display formats we noted in Chapter 5.

Here this just means extra quotes around strings nested in the tuple that is created for printing multiple parenthesized items in 2.X. Displays of nested objects can differ much more for other object types, though, and especially for class objects that define alternative displays with operator overloading—a topic we’ll cover in Part VI in general and Chapter 30 in particular.
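To make the str versus repr distinction concrete, here is a minimal sketch that calls the two conversions directly. The class Tag and the variable names are hypothetical and exist only for illustration; the class simply returns different text for each display format.

```python
items = ('spam', 'ham', 'eggs')

top_level = str('spam')            # A top-level string displays without quotes
nested = str(items)                # Strings nested in a tuple display with repr

class Tag:                         # Hypothetical class, for illustration only
    def __repr__(self):
        return 'Tag(repr)'         # Used for nested appearances
    def __str__(self):
        return 'Tag(str)'          # Used for top-level prints

alone = str(Tag())
inside = str((Tag(),))             # One-item tuple containing a Tag

print(top_level)                   # spam
print(nested)                      # ('spam', 'ham', 'eggs')
print(alone)                       # Tag(str)
print(inside)                      # (Tag(repr),)
```

The second and fourth results show the text a 2.X print statement would display when handed a parenthesized tuple.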

To be truly portable without enabling 3.X prints everywhere, and to sidestep display difference for nested appearances, you can always format the print string as a single object to unify displays across versions, using the string formatting expression or method call, or other string tools that we studied in Chapter 7:

>>> print('%s %s %s' % ('spam', 'ham', 'eggs'))

spam ham eggs

>>> print('{0} {1} {2}'.format('spam', 'ham', 'eggs'))

spam ham eggs

>>> print('answer: ' + str(42))

answer: 42
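If this pattern recurs, it can be wrapped in a small function. The following is only a sketch: the name joined is hypothetical, and it assumes that a single space between items (the 3.X default separator) is what you want.

```python
def joined(*objects):
    """Hypothetical helper: build one string from any number of objects."""
    return ' '.join(str(obj) for obj in objects)

line = joined('spam', 'ham', 42)
print(line)                        # spam ham 42
print(joined())                    # Empty string: just a line-feed
```

Because each call passes print a single string, the parenthesized form displays the same way whether it is run as a 2.X statement or a 3.X function call.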

Of course, if you can use 3.X exclusively you can forget such mappings entirely, but many Python programmers will at least encounter, if not write, 2.X code and systems for some time to come. We’ll use both __future__ and version-neutral code to achieve 2.X/3.X portability in many examples in this book.

NOTE

I use Python 3.X print function calls throughout this book. I’ll often make prints version-neutral, and will usually warn you when the results may differ in 2.X, but I sometimes don’t, so please consider this note a blanket warning. If you see extra parentheses in your printed text in 2.X, either drop the parentheses in your print statements, import 3.X prints from the __future__, recode your prints using the version-neutral scheme outlined here, or learn to love superfluous text.

WHY YOU WILL CARE: PRINT AND STDOUT

The equivalence between the print operation and writing to sys.stdout is important. It makes it possible to reassign sys.stdout to any user-defined object that provides the same write method as files. Because the print statement just sends text to the sys.stdout.write method, you can capture printed text in your programs by assigning sys.stdout to an object whose write method processes the text in arbitrary ways.

For instance, you can send printed text to a GUI window, or tee it off to multiple destinations, by defining an object with a write method that does the required routing. You’ll see an example of this trick when we study classes in Part VI of this book, but abstractly, it looks like this:

class FileFaker:

    def write(self, string):

        pass                    # Do something with printed text in string

import sys

sys.stdout = FileFaker()

print(someObjects)              # Sends to class write method

This works because print is what we will call in the next part of this book a polymorphic operation—it doesn’t care what sys.stdout is, only that it has a method (i.e., interface) called write. This redirection to objects is made even simpler with the file keyword argument in 3.X and the >> extended form of print in 2.X, because we don’t need to reset sys.stdout explicitly—normal prints will still be routed to the stdout stream:

myobj = FileFaker()             # 3.X: Redirect to object for one print

print(someObjects, file=myobj)  # Does not reset sys.stdout

myobj = FileFaker()             # 2.X: same effect

print >> myobj, someObjects     # Does not reset sys.stdout
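As a runnable 3.X sketch of both techniques, the version of FileFaker below simply collects printed text in a list. The chunks attribute and the variable names are assumptions made for this illustration, not part of any standard interface; only the write method is required by print.

```python
import sys

class FileFaker:
    """Collects printed text in a list instead of displaying it."""
    def __init__(self):
        self.chunks = []
    def write(self, string):
        self.chunks.append(string)

# Technique 1: reassign sys.stdout, then restore it
saved = sys.stdout
sys.stdout = FileFaker()
try:
    print('spam', 42)                      # Routed to FileFaker.write
    captured = ''.join(sys.stdout.chunks)
finally:
    sys.stdout = saved                     # Always restore the original stream

# Technique 2: redirect a single print with file=
myobj = FileFaker()
print('ham', 'eggs', file=myobj)           # sys.stdout is not reset
one_shot = ''.join(myobj.chunks)

print(repr(captured))                      # 'spam 42\n'
print(repr(one_shot))                      # 'ham eggs\n'
```

The try/finally wrapper is a design choice rather than a requirement: it guarantees that the original stream is restored even if a print in the redirected section raises an exception.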

Python 3.X’s built-in input function (named raw_input in 2.X) reads from the sys.stdin file, so you can intercept read requests in a similar way, using classes that implement file-like read methods instead. See the input and while loop example in Chapter 10 for more background on this function.
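The input side can be sketched the same way. The class InputFaker below is hypothetical, and the code assumes standard CPython 3.X behavior, in which input calls the readline method of whatever object is assigned to sys.stdin when that object is not the real console.

```python
import sys

class InputFaker:
    """Hypothetical stand-in for sys.stdin that serves canned lines."""
    def __init__(self, lines):
        self.lines = list(lines)
    def readline(self):
        if self.lines:
            return self.lines.pop(0) + '\n'    # Next line, with its line-end
        return ''                              # Empty string signals end-of-file

saved = sys.stdin
sys.stdin = InputFaker(['spam', 'eggs'])
try:
    first = input()                            # Served by InputFaker.readline
    second = input()
finally:
    sys.stdin = saved                          # Restore the original stream

print(first, second)                           # spam eggs
```

A third input call in the redirected section would raise EOFError, because readline would return an empty string.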

Notice that because printed text goes to the stdout stream, it’s also the way to print HTML reply pages in CGI scripts used on the Web, and enables you to redirect Python script input and output at the operating system’s shell command line as usual:

python script.py < inputfile > outputfile

python script.py | filterProgram

Python’s print operation redirection tools are essentially pure-Python alternatives to these shell syntax forms. See other resources for more on CGI scripts and shell syntax.


[26] Technically, printing uses the equivalent of str in the internal implementation of Python, but the effect is the same. Besides this to-string conversion role, str is also the name of the string data type and can be used to decode Unicode strings from raw bytes with an extra encoding argument, as we’ll learn in Chapter 37; this latter role is an advanced usage that we can safely ignore here.

[27] In both 2.X and 3.X you may also be able to use the __stdout__ attribute in the sys module, which refers to the original value sys.stdout had at program startup time. You still need to restore sys.stdout to sys.__stdout__ to go back to this original stream value, though. See the sys module documentation for more details.

Chapter Summary

In this chapter, we began our in-depth look at Python statements by exploring assignments, expressions, and print operations. Although these are generally simple to use, they have some alternative forms that, while optional, are often convenient in practice: augmented assignment statements and the redirection form of print operations, for example, allow us to avoid some manual coding work. Along the way, we also studied the syntax of variable names, stream redirection techniques, and a variety of common mistakes to avoid, such as assigning the result of an append method call back to a variable.

In the next chapter, we’ll continue our statement tour by filling in details about the if statement, Python’s main selection tool; there, we’ll also revisit Python’s syntax model in more depth and look at the behavior of Boolean expressions. Before we move on, though, the end-of-chapter quiz will test your knowledge of what you’ve learned here.

Test Your Knowledge: Quiz

1.    Name three ways that you can assign three variables to the same value.

2.    Why might you need to care when assigning three variables to a mutable object?

3.    What’s wrong with saying L = L.sort()?

4.    How might you use the print operation to send text to an external file?

Test Your Knowledge: Answers

1.    You can use multiple-target assignments (A = B = C = 0), sequence assignment (A, B, C = 0, 0, 0), or multiple assignment statements on three separate lines (A = 0, B = 0, and C = 0). With the last technique, as introduced in Chapter 10, you can also string the three separate statements together on the same line by separating them with semicolons (A = 0; B = 0; C = 0).

2.    If you assign them this way:

A = B = C = []

all three names reference the same object, so changing it in place from one (e.g., A.append(99)) will affect the others. This is true only for in-place changes to mutable objects like lists and dictionaries; for immutable objects such as numbers and strings, this issue is irrelevant.

3.    The list sort method is like append in that it makes an in-place change to the subject list: it returns None, not the list it changes. The assignment back to L sets L to None, not to the sorted list. As discussed both earlier and later in this book (e.g., Chapter 8), a newer built-in function, sorted, sorts any sequence and returns a new list with the sorting result; because this is not an in-place change, its result can be meaningfully assigned to a name.

4.    To print to a file for a single print operation, you can use 3.X’s print(X, file=F) call form, use 2.X’s extended print >> file, X statement form, or assign sys.stdout to a manually opened file before the print and restore the original after. You can also redirect all of a program’s printed text to a file with special syntax in the system shell, but this is outside Python’s scope.
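The answers about shared references, sort, and printing to a file can be checked with a short 3.X session like the following sketch. The io.StringIO object is an assumption used here as an in-memory stand-in for a manually opened file, and the variable names are illustrative only.

```python
import io

# Shared references: all three names point at one list object
A = B = C = []
A.append(99)
print(B, C)                        # [99] [99]

# sort changes the list in place and returns None
L = [3, 1, 2]
result = L.sort()
print(L, result)                   # [1, 2, 3] None

# sorted returns a new list that can be assigned
M = sorted([3, 1, 2])
print(M)                           # [1, 2, 3]

# Printing to a file-like object for a single print (3.X form)
F = io.StringIO()                  # In-memory stand-in for an opened file
print('spam', 42, file=F)
print(repr(F.getvalue()))          # 'spam 42\n'
```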