A practical guide to Fedora and Red Hat Enterprise Linux, 7th Edition (2014)

Part V: Programming Tools

Chapter 28. The Python Programming Language

In This Chapter

Invoking Python

Lists

Dictionaries

Control Structures

Reading from and Writing to Files

Pickle

Regular Expressions

Defining a Function

Using Libraries

Lambda Functions

List Comprehensions


Objectives

After reading this chapter you should be able to:

Give commands using the Python interactive shell

Write and run a Python program stored in a file

Demonstrate how to instantiate a list and how to remove elements from and add elements to a list

Describe a dictionary and give examples of how it can be used

Describe three Python control structures

Write a Python program that iterates through a list or dictionary

Read from and write to a file

Demonstrate exception processing

Preserve an object using pickle()

Write a Python program that uses regular expressions

Define a function and use it in a program

Introduction

Python is a friendly and flexible programming language in widespread use everywhere from Fortune 500 companies to large-scale open-source projects. Python is an interpreted language: The interpreter translates source code into bytecode (page 1241) at runtime and executes the bytecode within the Python virtual machine. Contrast Python with the C language, which is a compiled language: The C compiler compiles C source code into architecture-specific machine code before the program is run. Python programs are not compiled into machine code ahead of time; you run a Python program the same way you run a bash or Perl script. Because Python programs are distributed as source code rather than as machine code, they are portable between operating systems and architectures. In other words, the same Python program will run on any system to which the Python virtual machine has been ported.

Object oriented

Python supports the object-oriented (OO) paradigm but does not require you to use it. You can use Python with little or no understanding of object-oriented concepts; this chapter covers OO programming minimally while still explaining Python’s important features.

Libraries

Python comes with hundreds of prewritten tools that are organized into logical libraries. These libraries are accessible to Python programs but are not loaded into memory automatically when a program starts, because doing so would significantly increase startup times. Instead, a library (or an individual module within it) is loaded into memory only when the program requests it.

Version

Python is available in two main development branches: Python 2.x and Python 3.x. This chapter focuses on Python 2.x because the bulk of Python written today uses 2.x. The following commands show that two versions of Python are installed and that the python command runs Python 2.7.5:

whereis python
python: /usr/bin/python3.3m /usr/bin/python2.7 /usr/bin/python ...
ls -l $(which python)
lrwxrwxrwx. 1 root root 7 07-02 12:53 /usr/bin/python -> python2
python -V
Python 2.7.5

Invoking Python

This section discusses the methods you can use to run a Python program.

Interactive shell

Most of the examples in this chapter use the Python interactive shell because you can use it to debug and execute code one line at a time and see the results immediately. Although this shell is handy for testing, it is not a good choice for running longer, more complex programs. You start a Python interactive shell by calling the python utility (just as you would start a bash shell by calling bash). The primary Python prompt is >>>. When Python requires more input to complete a command, it displays its secondary prompt (...).

python
Python 2.7.5 (default, May 16 2013, 13:44:12)
[GCC 4.8.0 20130412 (Red Hat 4.8.0-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

While you are using the Python interactive shell, you can give Python a command by entering the command and pressing RETURN.

>>> print 'Good morning!'
Good morning!


Tip: Implied display

Within the Python interactive shell, Python displays output from any command line that does not have an action. The output is similar to what print would display, although it might not be exactly the same. The following examples show explicit print and implicit display actions:

>>> print 'Good morning!'
Good morning!
>>> 'Good morning!'
'Good morning!'

>>> print 2 + 2
4
>>> 2 + 2
4

Implied display allows you to display the value of a variable by typing its name:

>>> x = 'Hello'
>>> x
'Hello'

Implied display does not work unless you are running the Python interactive shell (i.e., Python does not invoke an implicit display action when it is run from a file).
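The difference between the two forms of display can be made explicit with the builtin str() and repr() functions. The following sketch assumes the standard interactive shell, where implied display shows what repr() returns; the variable names are illustrative only, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
x = 'Hello'

shown_by_print = str(x)       # the text that print x displays: Hello
shown_implicitly = repr(x)    # the text that typing x at >>> displays: 'Hello'

print(shown_by_print)
print(shown_implicitly)
```

The quotation marks in the second line of output are part of the string repr() returns, which is why implied display shows them and print does not.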


Program file

Most of the time a Python program is stored in a text file. Although not required, the file typically has a filename extension of .py. Use chmod (page 193) to make the file executable. As explained on page 338, the #! at the start of the first line of the file instructs the shell to pass the rest of the file to /usr/bin/python for execution.

chmod 755 gm.py
cat gm.py
#!/usr/bin/python
print 'Good morning!'

./gm.py
Good morning!

You can also run a Python program by specifying the name of the program file as an argument or as standard input to python.

python gm.py
Good morning!

python < gm.py
Good morning!

cat gm.py | python
Good morning!

echo "print 'Good morning! '" | python
Good morning!

Because the shell interprets an exclamation point immediately followed by a character other than a SPACE as an event number (page 381), the final command includes a SPACE after the exclamation point.

Command line

Using the python -c option, you can run a program from the shell command line. In the preceding and following commands, the double quotation marks keep the shell from removing the single quotation marks from the command line before passing it to Python.

python -c "print 'Good morning! '"
Good morning!


Tip: Single and double quotation marks are functionally equivalent

You can use single quotation marks and double quotation marks interchangeably in a Python program and you can use one to quote the other. When Python displays quotation marks around a string it uses single quotation marks.

>>> a = "hi"
>>> a
'hi'
>>> print "'hi'"
'hi'
>>> print '"hi"'
"hi"


More Information

Local

python man page, pydoc program

Python interactive shell

From the Python interactive shell give the command help() to use the Python help feature. When you call it, this utility displays information to help you get started using it. Alternatively, you can give the command help('object') where object is the name of an object you have instantiated or the name of a data structure or module such as list or pickle.

Web

Python home page: www.python.org

Documentation: docs.python.org

Index of Python Enhancement Proposals (PEPs): www.python.org/dev/peps

PEP 8 Style Guide for Python Code: www.python.org/dev/peps/pep-0008

PyPI (Python Package Index): pypi.python.org

Writing to Standard Output and Reading from Standard Input

raw_input()

Python makes it easy to write to standard output and read from standard input, both of which the shell (e.g., bash) connects to the terminal by default. The raw_input() function writes to standard output and reads from standard input. It returns the value of the string it reads after stripping the trailing control characters (RETURN-NEWLINE or NEWLINE).

In the following example, raw_input() displays its argument (Enter your name:) and waits for the user. When the user types something and presses RETURN, Python assigns the value returned by raw_input(), which is the string the user entered, to the variable my_in.

print

Within a print statement, a plus sign (+) concatenates the strings on either side of it. The print statement in the example displays Hello, the value of my_in, and an exclamation point.

>>> my_in = raw_input('Enter your name: ')
Enter your name: Neo
>>> print 'Hello, ' + my_in + '!'
Hello, Neo!
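The plus sign joins two values only when both are strings. The following sketch shows one way to include a number in a message by converting it with the builtin str() function first; the names count and message are illustrative, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
count = 3

# 'Messages: ' + count would raise a TypeError because count is an integer,
# so convert it to a string before joining.
message = 'Messages: ' + str(count)

print(message)
```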

Functions and Methods

Functions

Functions in Python, as in most programming languages, are key to improving code readability, efficiency, and maintenance. Python has a number of builtin functions and functions you can immediately import into a Python program. These functions are available when you install Python. For example, the int() function returns the integer (truncated) part of a floating-point number:

>>> int(8.999)
8

Additional downloadable libraries hold more functions. For more information refer to “Using Libraries” on page 1103. Table 28-1 lists a few of the most commonly used functions.

Table 28-1 Commonly used functions

Methods

Functions and methods are very similar. The difference is that functions stand on their own while methods work on and are specific to objects. You will see the range() function used by itself [range(args)] and the readlines() method used as an object method [f.readlines(), where f is the object (a file) that readlines() reads from]. Table 28-2 on page 1089 and Table 28-4 on page 1097 list some methods.

Table 28-2 List methods

Table 28-3 File modes

Table 28-4 File object methods

Scalar Variables, Lists, and Dictionaries

This section discusses some of the Python builtin data types. Scalar data types include number and string types. Compound data types include dictionary and list types.

Scalar Variables

As in most programming languages, you declare and initialize a variable using an equal sign. Python does not require the SPACEs around the equal sign, nor does it require you to identify a scalar variable with a prefix (Perl and bash require a leading dollar sign). As explained in the tip on page 1083, you can use the print statement to display the value of a variable or you can just specify the variable name as a command:

>>> lunch = 'Lunch time!'
>>> print lunch
Lunch time!
>>> lunch
'Lunch time!'

Python performs arithmetic as you might expect:

>>> n1 = 5
>>> n2 = 8
>>> n1 + n2
13

Floating-point numbers

Whether Python performs floating-point or integer arithmetic depends on the values it is given. If all of the numbers involved in a calculation are integers (that is, if none of the numbers includes a decimal point), Python performs integer arithmetic and discards the fractional part of an answer, rounding the result down. If at least one of the numbers involved in a calculation is a floating-point number (includes a decimal point), Python performs floating-point arithmetic. Be careful when performing division: If the answer could include a fraction, be sure to include a decimal point in one of the numbers or explicitly specify one of the numbers as a floating-point number:

>>> 3/2
1
>>> 3/2.0
1.5
>>> float(3)/2
1.5
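The following sketch restates the division rules in a form that gives the same results under Python 2.7 and Python 3. It uses the // operator, which this chapter does not otherwise cover, to request integer division explicitly; the variable names are illustrative.

```python
whole = 7 // 2            # 3: the fractional part is discarded
rounded_down = -7 // 2    # -4: the result is rounded down, not toward zero
exact = float(7) / 2      # 3.5: one operand is a floating-point number
also_exact = 7 / 2.0      # 3.5: the decimal point makes 2.0 floating point

print(whole)
print(rounded_down)
print(exact)
print(also_exact)
```

Under Python 2.7 the expression 7 / 2 gives the same result as 7 // 2; under Python 3 it gives 3.5, which is one reason to write // when you want an integer result.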

Lists

A Python list is an object that comprises zero or more elements; it is similar to an array in C or Java. Lists are ordered and use zero-based indexing (i.e., the first element of a list is numbered zero). A list is called iterable because it can provide successive elements on each iteration through a looping control structure such as for; see page 1090 for a discussion.

This section shows one way to instantiate (create) a list. The following commands instantiate and display a list named a that holds four values:

>>> a = ['bb', 'dd', 'zz', 'rr']
>>> a
['bb', 'dd', 'zz', 'rr']

Indexes

You can access an element of a list by specifying its index (remember—the first element of a list is numbered zero). The first of the following commands displays the value of the third element of a; the next assigns the value of the first element of a to x and displays the value of x.

>>> a[2]
'zz'

>>> x = a[0]
>>> x
'bb'

When you specify a negative index, Python counts from the end of the list.

>>> a[-1]
'rr'

>>> a[-2]
'zz'

Replacing an element

You can replace an element of a list by assigning a value to it.

>>> a[1] = 'qqqq'
>>> a
['bb', 'qqqq', 'zz', 'rr']

Slicing

The next examples show how to access a slice, or portion, of a list; they assume a holds its original four values (bb, dd, zz, and rr). The first example displays elements 0 up to, but not including, 2 of the list (elements 0 and 1):

>>> a[0:2]
['bb', 'dd']

If you omit the number that follows the colon, Python displays from the element with the index specified before the colon through the end of the list. If you omit the number before the colon, Python displays from the beginning of the list up to the element with the number specified after the colon.

>>> a[2:]
['zz', 'rr']
>>> a[:2]
['bb', 'dd']

You can use negative numbers when slicing a list. The first of the following commands displays element 1 up to, but not including, the last element of the list (element –1); the second displays from the next-to-last element of the list (element –2) through the end of the list.

>>> a[1:-1]
['dd', 'zz']
>>> a[-2:]
['zz', 'rr']

remove()

Following Python’s object-oriented paradigm, the list data type includes builtin methods. The remove(x) method removes the first element of a list whose value is x, and decreases the length of the list by 1. The following command removes the first element of list a whose value is bb.

>>> a.remove('bb')
>>> a
['dd', 'zz', 'rr']

append()

The append(x) method appends an element whose value is x to the list, and increases the length of the list by 1.

>>> a.append('mm')
>>> a
['dd', 'zz', 'rr', 'mm']

reverse()

The reverse() method does not take an argument. It is an efficient method that reverses elements of a list in place, overwriting elements of the list with new values.

>>> a.reverse()
>>> a
['mm', 'rr', 'zz', 'dd']

sort()

The sort() method does not require an argument. It sorts the elements in a list in place, overwriting elements of the list with new values.

>>> a.sort()
>>> a
['dd', 'mm', 'rr', 'zz']

sorted()

If you do not want to alter the contents of the list (or other iterable data structure) you are sorting, use the sorted() function. This function returns a new sorted list and does not change the original list. In the following example, a holds the unsorted values it held before the call to sort().

>>> b = sorted(a)
>>> a
['mm', 'rr', 'zz', 'dd']
>>> b
['dd', 'mm', 'rr', 'zz']

len()

The len() function returns the number of elements in a list or other iterable data structure.

>>> len(a)
4

Table 28-2 lists some of the methods that work on lists. The command help(list) displays a complete list of methods you can use with a list.
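The list operations described in this section can be run together as one short program, which makes it easier to follow the state of the list from step to step. The following is a minimal sketch that uses only the methods and functions shown above; the name unsorted_copy is illustrative, and the parenthesized form of print is used so the program runs under both Python 2.7 and Python 3.

```python
a = ['bb', 'dd', 'zz', 'rr']

a.remove('bb')            # a is now ['dd', 'zz', 'rr']
a.append('mm')            # a is now ['dd', 'zz', 'rr', 'mm']
a.reverse()               # a is now ['mm', 'rr', 'zz', 'dd']

b = sorted(a)             # b is a new, sorted list; a is unchanged
unsorted_copy = a[:]      # remember the unsorted order for comparison

a.sort()                  # a itself is now sorted in place

print(unsorted_copy)
print(a)
print(b)
print(len(a))
```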

Working with Lists

Passing a list by reference

Python passes all objects, including lists, by reference. That is, it passes an object by passing a pointer to the object. When you assign one object to another, you are simply creating another name for the object; you are not creating a new object. When you change the object using either name, you can view the change using either name. In the following example, names is instantiated as a list that holds the values sam, max, and zach; copy is set equal to names, setting up another name for (reference to) the same list. When the value of the first element of copy is changed, displaying names shows that its first element has also changed.

>>> names = ['sam', 'max', 'zach']
>>> copy = names
>>> names
['sam', 'max', 'zach']
>>> copy[0] = 'helen'
>>> names
['helen', 'max', 'zach']

Copying a list

When you use the syntax b = a[:] to copy a list, each list remains independent of the other. The next example is the same as the previous one, except copy2 points to a different location than names because the list was copied, not passed by reference: Look at the difference in the values of the first element (zero index) of both lists.

>>> names = ['sam', 'max', 'zach']
>>> copy2 = names[:]
>>> copy2[0] = 'helen'
>>> names
['sam', 'max', 'zach']
>>> copy2
['helen', 'max', 'zach']
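One way to check whether two names refer to the same list is the is operator, which this chapter does not otherwise cover. The following sketch contrasts assignment with the [:] copy syntax; the names alias and clone are illustrative, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
names = ['sam', 'max', 'zach']
alias = names             # a second name for the same list object
clone = names[:]          # a new list that holds the same elements

alias[0] = 'helen'        # the change is visible through names as well

same_object = alias is names        # True: one list, two names
separate = clone is not names       # True: clone is an independent list

print(names)
print(clone)
```

Note that [:] copies only the list itself; if the elements were themselves lists, both copies would still share those inner lists.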

Lists Are Iterable

An important feature of lists is that they are iterable, meaning a control structure such as for (page 1095) can loop (iterate) through each item in a list. In the following example, the for control structure iterates over the list a, assigning one element of a to item each time through the loop. The loop terminates after it has assigned each of the elements of a to item. The comma at the end of the print statement replaces the NEWLINE print normally outputs with a SPACE. You must indent lines within a control structure (called a logical block; page 1092).

>>> a
['bb', 'dd', 'zz', 'rr']
>>> for item in a:
...     print item,
...
bb dd zz rr

The next example returns the largest element in a list. In this example, the list is embedded in the code; see page 1106 for a similar program that uses random numbers. The program initializes my_rand_list as a list holding ten numbers and largest as a scalar with a value of –1. The for structure retrieves the elements of my_rand_list in order, one each time through the loop. It assigns the value it retrieves to item. Within the for structure, an if statement tests to see if item is larger than largest. If it is, the program assigns the value of item to largest and displays the new value (so you can see how the program is progressing). When control exits from the for structure, the program displays a message and the largest number. Be careful when a logical block (a subblock) appears within another logical block: You must indent the subblock one level more than its superior logical block.

cat my_max.py
#!/usr/bin/python

my_rand_list = [5, 6, 4, 1, 7, 3, 2, 0, 9, 8]
largest = -1
for item in my_rand_list:
    if (item > largest):
        largest = item
        print largest,
print
print 'largest number is ', largest

./my_max.py
5 6 7 9
largest number is  9

See page 1107 for an easier way to find the maximum value in a list.
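Initializing largest to –1 works for my_max.py only because every number in its list is zero or greater. The following variation is a sketch of one way to remove that assumption: it starts with the first element of the list and compares the remaining elements against it. The list my_list is made up for illustration, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
my_list = [-5, -2, -9]

largest = my_list[0]              # start with a value that is in the list
for item in my_list[1:]:          # compare each remaining element
    if (item > largest):
        largest = item

print(largest)
```

With a starting value of –1, the same loop would report –1, a number that does not appear in the list.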

Dictionaries

A Python dictionary holds unordered key–value pairs in which the keys must be unique. Other languages refer to this type of data structure as an associative array, hash, or hashmap. Like lists, dictionaries are iterable. A dictionary provides fast lookups. The class name for dictionary is dict; thus you must type help(dict) to display the help page for dictionaries. Use the following syntax to instantiate a dictionary:

dict = { key1 : value1, key2 : value2, key3 : value3 ... }

Working with Dictionaries

The following example instantiates and displays a telephone extension dictionary named ext. Because dictionaries are unordered, Python does not display a dictionary in a specific order and usually does not display it in the order it was created.

>>> ext = {'sam': 44, 'max': 88, 'zach': 22}
>>> ext
{'max': 88, 'zach': 22, 'sam': 44}

keys() and values()

You can use the keys() and values() methods to display all keys or values held in a dictionary:

>>> ext.keys()
['max', 'zach', 'sam']
>>> ext.values()
[88, 22, 44]

You can add a key–value pair:

>>> ext['helen'] = 92
>>> ext
{'max': 88, 'zach': 22, 'sam': 44, 'helen': 92}

If you assign a value to a key that is already in the dictionary, Python replaces the value (keys must be unique):

>>> ext['max'] = 150
>>> ext
{'max': 150, 'zach': 22, 'sam': 44, 'helen': 92}

The following example shows how to remove a key–value pair from a dictionary:

>>> del ext['max']
>>> ext
{'zach': 22, 'sam': 44, 'helen': 92}

You can also query the dictionary. Python returns the value when you supply the key:

>>> ext['zach']
22

items()

The items() method returns key–value pairs as a list of tuples (pairs of values). Because a dictionary is unordered, the order of the tuples returned by items() can vary from run to run.

>>> ext.items()
[('zach', 22), ('sam', 44), ('helen', 92)]

Because dictionaries are iterable, you can loop through them using for.

>>> ext = {'sam': 44, 'max': 88, 'zach': 22}
>>> for i in ext:
...     print i
...
max
zach
sam

Using this syntax, the dictionary returns just keys; it is as though you wrote for i in ext.keys(). If you want to loop through values, use for i in ext.values().
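Because a dictionary is unordered, a loop over it can display entries in any order. When the order matters, one approach is to loop over the sorted keys and look up each value, as in the following sketch. It also uses the in operator to test whether a key is present; the names lines and has_helen are illustrative, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
ext = {'sam': 44, 'max': 88, 'zach': 22}

lines = []
for name in sorted(ext):                  # sorted() returns the keys in order
    lines.append(name + ': ' + str(ext[name]))

for line in lines:
    print(line)

has_max = 'max' in ext                    # True: max is a key
has_helen = 'helen' in ext                # False: no such key
```

Testing with in before a lookup avoids the KeyError that ext['helen'] would raise here.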

Keys and values can be of different types within a dictionary:

>>> dic = {500: 2, 'bbbb': 'BBBB', 1000: 'big'}
>>> dic
{1000: 'big', 'bbbb': 'BBBB', 500: 2}

Control Structures

Control flow statements alter the order of execution of statements within a program. Starting on page 982, Chapter 27 discusses bash control structures in detail and includes flow diagrams of their operation. Python control structures perform the same functions as their bash counterparts, although the two languages use different syntaxes. The description of each control structure in this section references the discussion of the same control structure under bash.

In this section, the bold italic words in the syntax description are the items you supply to cause the structure to have the desired effect; the nonbold italic words are the keywords Python uses to identify the control structure. Many of these structures use an expression, denoted as expr, to control their execution. The examples in this chapter delimit expr using parentheses for clarity and consistency; the parentheses are not always required.

Indenting logical blocks

In most programming languages, control structures are delimited by pairs of parentheses, brackets, or braces; bash uses keywords (e.g., if...fi, do...done). Python uses a colon (:) as the opening token for a control structure. While including SPACEs (or TABs) at the beginning of lines in a control structure is good practice in other languages, Python requires these elements; they indicate a logical block, or section of code, that is part of a control structure. The last indented line marks the end of the control structure; the change in indent level is the closing token that matches the opening colon.

if

Similar to the bash if...then control structure (page 983), the Python if control structure has the following syntax:

if expr:
       ...

As with all Python control structures, the control block, denoted by ..., must be indented.

In the following example, my_in != '' (if my_in is not an empty string) evaluates to true if the user entered something before pressing RETURN. If expr evaluates to true, Python executes the following indented print statement. Python executes any number of indented statements that follow an if statement as part of the control structure. If expr evaluates to false, Python skips any number of indented statements following the if statement.

cat if1.py
#!/usr/bin/python
my_in = raw_input('Enter your name: ')
if (my_in != ''):
    print 'Thank you, ' + my_in
print 'Program running, with or without your input.'

./if1.py
Enter your name: Neo
Thank you, Neo
Program running, with or without your input.

if...else

Similar to the bash if...then...else control structure (page 987), the if...else control structure implements a two-way branch using the following syntax:

if expr:
       ...
else:
       ...

If expr evaluates to true, Python executes the statements in the if control block. Otherwise, it executes the statements in the else control block. The following example builds on the previous one, displaying an error message and exiting from the program if the user does not enter something.

cat if2.py
#!/usr/bin/python
my_in = raw_input('Enter your name: ')
if (my_in != ''):
    print 'Thank you, ' + my_in
else:
    print 'Program requires input to continue.'
    exit()
print 'Program running with your input.'

./if2.py
Enter your name: Neo
Thank you, Neo
Program running with your input.

./if2.py
Enter your name:
Program requires input to continue.

if...elif...else

Similar to the bash if...then...elif control structure (page 989), the Python if...elif...else control structure implements a nested set of if...else structures using the following syntax:

if (expr):
       ...
elif (expr):
       ...
else:
       ...

This control structure can include as many elif control blocks as necessary. In the following example, the if statement evaluates the Boolean expression following it within parentheses and enters the indented logical block below the statement if the expression evaluates to true. The if, elif, and else statements are part of one control structure and Python will execute statements in only one of the indented logical blocks.

The if and elif statements are each followed by a Boolean expression. Python executes each of their corresponding logical blocks only if their expression evaluates to true. If none of the expressions evaluates to true, control falls through to the else logical block.

cat bignum.py
#!/usr/bin/python
input = raw_input('Please enter a number: ')
if (input == '1'):
    print 'You entered one.'
elif (input == '2'):
    print 'You entered two.'
elif (input == '3'):
    print 'You entered three.'
else:
    print 'You entered a big number...'
print 'End of program.'

In the preceding program, even though the user enters an integer/scalar value, Python stores it as a string. Thus each of the comparisons checks whether this value is equal to a string. You can use int() to convert a string to an integer. If you do so, you must remove the quotation marks from around the values:

...
input = int(raw_input('Please enter a number: '))
if (input == 1):
    print 'You entered one.'
...
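To see every branch of the integer version of the program run without typing at a prompt, you can feed it a list of sample strings. The following is a sketch only: the list entries stands in for what raw_input() would return, the names entries and messages are illustrative, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
entries = ['1', '2', '3', '42']       # stand-ins for strings read from the user

messages = []
for entry in entries:
    number = int(entry)               # convert the string to an integer
    if (number == 1):
        messages.append('You entered one.')
    elif (number == 2):
        messages.append('You entered two.')
    elif (number == 3):
        messages.append('You entered three.')
    else:
        messages.append('You entered a big number...')

for message in messages:
    print(message)
```

Each string selects exactly one branch. A string that does not hold an integer, such as 'two', would cause int() to raise a ValueError.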

while

The while control structure (page 999) evaluates a Boolean expression and continues execution while the expression evaluates to true. The following program, run using the Python interactive shell, displays 0 through 10. As you enter the command, Python displays its secondary prompt (...) when it requires more input to complete a statement; you must still enter SPACE or TAB characters to indent the logical block.

First, the program initializes count to 0. The first time through the loop, the while expression evaluates to true and Python executes the indented statements that make up the while control structure logical block. The comma at the end of the print statement causes print to output a SPACE instead of a NEWLINE after each string, and the count += 1 statement increments count each time through the loop. When control reaches the bottom of the loop, Python returns control to the while statement, where count is now 1. The loop continues while count is less than or equal to 10. When count equals 11, the while statement evaluates to false and control passes to the first statement after the logical block (the first statement that is not indented; there is none in this example).

>>> count = 0
>>> while (count <= 10):
...     print count,
...     count += 1
...
0 1 2 3 4 5 6 7 8 9 10


Tip: Be careful not to create an infinite loop

It is easy to accidentally create an infinite loop using a while statement. Ensure there is a reachable exit condition (e.g., a counter is compared to a finite value and the counter is incremented each time through the loop).
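When the exit condition depends on a calculation rather than on a simple counter, one defensive approach is to add a second condition that limits the number of passes. The following sketch halves a number until it drops below 1; the limit max_passes and the other names are illustrative assumptions, and the parenthesized form of print is used so the lines run under both Python 2.7 and Python 3.

```python
value = 100.0
passes = 0
max_passes = 50                       # safety limit in case value never drops

while (value >= 1) and (passes < max_passes):
    value = value / 2                 # moves value toward the exit condition
    passes += 1                       # counts the trips through the loop

print(passes)
print(value)
```

The loop ends after seven passes because the first condition becomes false; the second condition is there only to guarantee that the loop cannot run forever.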


for

The for control structure (page 997) assigns values from a list, string, or other iterable data structure (page 1090) to a loop index variable each time through the loop.

Lists are iterable

In the following example, lis is a list that holds the names of four types of animals. The for statement iterates through the elements of lis in order, starting with turkey, and assigning a value to nam each time it is called. The print statement in the logical block displays the value of nam each time through the loop. Python exits from the logical block when it runs out of elements in lis.

>>> lis = ['turkey', 'pony', 'dog', 'fox']
>>> for nam in lis:
...     print nam
...
turkey
pony
dog
fox

Strings are iterable

The next example demonstrates that strings are iterable. The string named string holds My name is Sam. and the for statement iterates through string, assigning one character to char each time through the loop. The print statement displays each character; the comma causes print to put a SPACE after each character instead of a NEWLINE.

>>> string = 'My name is Sam.'
>>> for char in string:
...     print char,
...
M y   n a m e   i s   S a m .

range()

The range() function returns a list that holds the integers between the two values specified as its arguments, including the first but excluding the last. An optional third parameter defines a step value.

>>> range(1,6)
[1, 2, 3, 4, 5]
>>> range(0,10,3)
[0, 3, 6, 9]

The next example shows how to use range() in a for loop. In this example, range() returns a list comprising 0, 3, 6, and 9. The for control structure loops over these values, executing the indented statement each time through the loop.

>>> for cnt in range(0,10,3):
...     print cnt
...
0
3
6
9
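The step value can be negative, in which case range() counts down, and range() combined with len() yields the index of each element of a list. The following sketch shows both uses. The call to list() is not needed under Python 2.7, where range() already returns a list, but it is harmless there and lets the lines run under Python 3 as well; the names countdown and numbered are illustrative.

```python
countdown = list(range(10, 0, -3))    # counts down from 10, stopping before 0

animals = ['turkey', 'pony', 'dog', 'fox']
numbered = []
for idx in range(len(animals)):       # idx takes the values 0, 1, 2, 3
    numbered.append(str(idx) + ' ' + animals[idx])

print(countdown)
for entry in numbered:
    print(entry)
```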


Optional

xrange()

The range() function is useful for generating short lists but, because it stores the list it returns in memory, it takes up a lot of system resources when it generates longer lists. In contrast, xrange() has a fixed memory footprint that is independent of the number of values it produces; as a consequence, it uses fewer system resources than range() when working with long sequences.

The two functions work differently. Whereas range() fills a list with values and stores that list in memory, xrange() produces each value only when you iterate through it: The entire list is never stored in memory.

>>> range(1,11)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> xrange(1,11)
xrange(1, 11)

>>> for cnt in xrange(1,11):
...     print cnt,
...
1 2 3 4 5 6 7 8 9 10
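The following sketch adds up a longer run of numbers without building a list of them. Because xrange() does not exist under Python 3, where range() itself produces values on demand, the sketch picks whichever is available; the name lazy_range is an illustrative assumption and is not part of Python.

```python
try:
    lazy_range = xrange       # Python 2: xrange() produces values on demand
except NameError:
    lazy_range = range        # Python 3: range() already works this way

total = 0
for cnt in lazy_range(1, 100001):     # 1 through 100000, one value at a time
    total += cnt

print(total)
```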


Reading from and Writing to Files

Python allows you to work with files in many ways. This section explains how to read from and write to text files and how to preserve an object in a file using pickle.

File Input and Output

open()

The open() function opens a file and returns a file object called a file handle; it can open a file in one of several modes (Table 28-3). Opening a file in w (write) mode truncates the file; use a (append) mode if you want to add to a file. The following statement opens the file in Max’s home directory named test_file in read mode; the file handle is named f.

f = open('/home/max/test_file', 'r')

Once the file is opened, you direct input and output using the file handle with one of the methods listed in Table 28-4. When you are finished working with a file, use close() to close it and free the resources the open file is using.

The following example reads from /home/max/test_file, which holds three lines. It opens this file in read mode and assigns the file handle f to the open file. It uses the readlines() method, which reads the entire file into a list and returns that list. Because the list is iterable, Python passes to the for control structure one line from test_file each time through the loop. The for structure assigns the string value of this line to ln, which print then displays. The strip() method removes whitespace, including the NEWLINE, from both ends of a line. Without strip(), print would output two NEWLINEs: the one that terminates the line from the file and the one it automatically appends to each line it outputs. After reading and displaying all lines from the file, the example closes the file.

>>> f = open('/home/max/test_file', 'r')
>>> for ln in f.readlines():
...     print ln.strip()
...
This is the first line
and here is the second line
of this file.
>>> f.close()

The next example opens the same file in append mode and writes a line to it using write(). The write() method does not append a NEWLINE to the line it outputs, so you must terminate the string you write to the file with a \n.

>>> f = open('/home/max/test_file','a')
>>> f.write('Extra line!\n')
>>> f.close()


Optional

In the example that uses for, Python does not call the readlines() method each time through the for loop. Instead, it reads the file into a list the first time readlines() is called and then iterates over the list, setting ln to the value of the next line in the list each time it is called subsequently. It is the same as if you had written

>>> f = open('/home/max/test_file', 'r')
>>> lines = f.readlines()
>>> for ln in lines:
...     print ln.strip()

It is more efficient to iterate over the file handle directly because this technique does not store the file in memory.

>>> f = open('/home/max/test_file', 'r')
>>> for ln in f:
...     print ln.strip()


Exception Handling

An exception is an error condition that changes the normal flow of control in a program. Although you can try to account for every problem your code will need to deal with, it is not always possible to do so: Unknown circumstances might arise. What if the file the previous programs opened does not exist? Python raises an IOError (input/output error) number 2 and displays the message No such file or directory.

>>> f = open('/home/max/test_file', 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: '/home/max/test_file'

Instead of allowing Python to display what might be an incomprehensible error message and quit, a well-written program handles exceptions like this one gracefully. Good exception handling can help you debug a program and can also provide a nontechnical user with clear information about why the program failed.

The next example wraps the open statement that failed with an exception handler in the form of a try...except control structure. This structure attempts to execute the try block. If execution of the try block fails (has an exception), it executes the code in the except block. If execution of the try block succeeds, it skips the except block. Depending on the severity of the error, the except block should warn the user that something might not be right or should display an error message and exit from the program.

>>> try:
...     f = open('/home/max/test_file', 'r')
... except:
...     print "Error on opening the file."
...
Error on opening the file.

You can refine the type of error an except block handles. Previously, the open statement raised an IOError. The following program tests for an IOError when it attempts to open a file, displays an error message, and exits. If it does not encounter an IOError, it continues normally.

$ cat except1.py
#!/usr/bin/python
try:
    f = open('/home/max/test_file', 'r')
except IOError:
    print "Cannot open file."
    exit()
print "Processing file."

$ ./except1.py
Cannot open file.
$ touch test_file
$ ./except1.py
Processing file.
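Instead of exiting, an except block can also report the failure to the calling code. The following is a minimal sketch under stated assumptions: the function name first_line and the pathname are hypothetical, and the pathname is assumed not to exist. It catches only IOError, as except1.py does, and uses print with parentheses so it runs under both Python 2 and Python 3.

```python
def first_line(path):
    """Return the first line of a file, or None if the file cannot be opened."""
    try:
        f = open(path, 'r')
    except IOError:
        # open() failed: for example, the file does not exist
        return None
    line = f.readline().strip()
    f.close()
    return line

# Hypothetical pathname that is assumed not to exist
missing = '/nonexistent_dir_for_example/test_file'
result = first_line(missing)
if result is None:
    print('Cannot open file.')
else:
    print('Processing file.')
```

Returning None lets the caller decide whether the error is severe enough to stop the program.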

Pickle

The pickle module allows you to store an object in a file in a standard format for later use by the same or a different program. The object you store can be any type of object as long as it does not require an operating system resource such as a file handle or network socket. The standard pickle filename extension is .p. For more information visit wiki.python.org/moin/UsingPickle.


Security: Never unpickle data received from an untrusted source

The pickle module is not secure against maliciously constructed data. When you unpickle an object, you are trusting the person who created it. Do not unpickle an object if you do not know or trust its source.


This section discusses two pickle methods: dump(), which writes an object to disk, and load(), which reads an object from disk. The syntax for these methods is

pickle.dump(objectname, open(filename, 'wb'))

objectname = pickle.load(open(filename, 'rb'))

It is critical that you open the file in binary mode (wb and rb). The load() method returns the object; you can assign the object the same name as or a different name from the original object. Before you can use pickle, you must import it (page 1105).

In the following example, after importing pickle, pickle.dump() creates a file named pres.p in wb (write binary) mode with a dump of the preserves list, and exit() leaves the Python interactive shell:

>>> import pickle

>>> preserves = ['apple', 'cherry', 'blackberry', 'apricot']
>>> preserves
['apple', 'cherry', 'blackberry', 'apricot']
>>> pickle.dump(preserves, open('pres.p', 'wb'))
>>> exit()

The next example calls the Python interactive shell and pickle.load() reads the pres.p file in rb (read binary) mode. This method returns the object you originally saved using dump(). You can give the object any name you like.

$ python
...
>>> import pickle

>>> jams = pickle.load(open('pres.p', 'rb'))
>>> jams
['apple', 'cherry', 'blackberry', 'apricot']
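The two pickle steps can also be run as a single round trip. The following is a minimal sketch, assuming a scratch file created with tempfile in place of pres.p; unlike the examples above, it assigns each file handle to a name (out and inp, both hypothetical) so it can close the file explicitly. It uses print with parentheses so it runs under both Python 2 and Python 3.

```python
import os
import pickle
import tempfile

preserves = ['apple', 'cherry', 'blackberry', 'apricot']

# Hypothetical scratch file standing in for pres.p
fd, path = tempfile.mkstemp(suffix='.p')
os.close(fd)

out = open(path, 'wb')            # binary write mode
pickle.dump(preserves, out)
out.close()                       # closing the file flushes the data to disk

inp = open(path, 'rb')            # binary read mode
jams = pickle.load(inp)           # load() takes only the file and returns the object
inp.close()
os.remove(path)                   # clean up the scratch file

print(jams == preserves)          # True: the contents are the same
print(jams is preserves)          # False: load() built a new object
```

The restored list is a copy: changing jams does not change preserves.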

Regular Expressions

The Python re (regular expression) module handles regular expressions. Python regular expressions follow the rules covered in Appendix A. This section discusses a few of the tools in the Python re library. You must give the command import re before you can use the re methods. To display Python help on the re module, give the command help() from the Python interactive shell and then type re.

findall()

One of the simplest re methods is findall(), which returns a list that holds matches using the following syntax:

re.findall(regex, string)

where regex is the regular expression and string is the string you are looking for a match in.

The regex in the following example (hi) matches the three occurrences of hi in the string (hi hi hi hello).

>>> import re
>>> a = re.findall('hi', 'hi hi hi hello')
>>> print a
['hi', 'hi', 'hi']

Because findall() returns a list (it is iterable), you can use it in a for statement. The following example uses the period (.) special character, which, in a regular expression, matches any character. The regex (hi.) matches the three occurrences of hi followed by any character in the string.

>>> for mat in re.findall('hi.', 'hit him hid hex'):
...     print mat,
...
hit him hid
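When the regex does not match, findall() returns an empty list, which tests as false. The following minimal sketch reuses the string from the example above; the variable names matches and no_matches are hypothetical, and print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
import re

text = 'hit him hid hex'

matches = re.findall('hi.', text)      # hi followed by any one character
no_matches = re.findall('xx.', text)   # nothing in text matches xx.

for mat in matches:
    print(mat)

if not no_matches:                     # an empty list is false in a Boolean context
    print('No match for xx.')
```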

search()

The search() re method uses the same syntax as findall() and looks through string for a match to regex. Instead of returning a list if it finds a match, however, it returns a MatchObject. Many re methods return a MatchObject (and not a list) when they find a match for the regex in the string.

>>> a = re.search('hi.', 'bye hit him hex')
>>> print a
<_sre.SRE_Match object at 0xb7663a30>

bool()

The bool() function returns true or false based on its argument. Because you can test the MatchObject directly, bool() is not used often in Python programming, but is included here for its instructional value. A MatchObject evaluates to true because it indicates a match. [Although findall() does not return a MatchObject, it does evaluate to true when it finds a match.]

>>> bool(a)
True

group()

The group() method allows you to access the match a MatchObject holds.

>>> a.group(0)
'hit'

type()

When no match exists, search() returns None, an object of type NoneType [as shown by the type() function]. In a Boolean expression, None evaluates to false.

>>> a = re.search('xx.', 'bye hit him hex')
>>> type(a)
<type 'NoneType'>
>>> print a
None
>>> bool(a)
False

Because a re method in a Boolean context evaluates to true or false, you can use a re method as the expression an if statement evaluates. The next example uses search() as the expression in an if statement; because there is a match, search() evaluates as true, and Python executes the print statement.

>>> name = 'sam'
>>> if(re.search(name,'zach max sam helen')):
...     print 'The list includes ' + name
...
The list includes sam

match()

The match() method of the re object uses the same syntax as search(), but looks only at the beginning of string for a match to regex.

>>> name = 'zach'
>>> if(re.match(name,'zach max sam helen')):
...     print 'The list includes ' + name
...
The list includes zach
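The difference between search() and match() is easiest to see when the same regex is tried with both. The following is a minimal sketch; the helper names found_anywhere and found_at_start are hypothetical wrappers, not part of the re module. It uses print with parentheses so it runs under both Python 2 and Python 3.

```python
import re

names = 'zach max sam helen'

def found_anywhere(regex, string):
    """True if regex matches anywhere in string (uses search())."""
    return re.search(regex, string) is not None

def found_at_start(regex, string):
    """True if regex matches at the beginning of string (uses match())."""
    return re.match(regex, string) is not None

print(found_anywhere('sam', names))    # True
print(found_at_start('sam', names))    # False: sam is not at the beginning
print(found_at_start('zach', names))   # True

# Test the result before calling group(); search() returns None on no match
m = re.search('hi.', 'bye hit him hex')
if m:
    print(m.group(0))                  # hit
```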

Defining a Function

A Python function definition must be evaluated before the function is called, so it generally appears in the code before the call to the function. The contents of the function, as with other Python logical blocks, must be indented.

The syntax of a function definition is

def my_function(args):
       ...

Python passes lists and other data structures to a function by reference (page 1089), meaning that when a function modifies a data structure that was passed to it as an argument, it modifies the original data structure. The following example demonstrates this fact:

>>> def add_ab(my_list):
...     my_list.append('a')
...     my_list.append('b')
...
>>> a = [1,2,3]
>>> add_ab(a)
>>> a
[1, 2, 3, 'a', 'b']
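A function changes the caller's list only when it modifies the list in place. Assigning a new value to the parameter name inside the function has no effect on the caller. The following minimal sketch contrasts the two cases; replace_list is a hypothetical function added for illustration, and print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
def add_ab(my_list):
    my_list.append('a')        # modifies the caller's list in place
    my_list.append('b')

def replace_list(my_list):
    my_list = ['x', 'y']       # rebinds the local name only
    return my_list

a = [1, 2, 3]
add_ab(a)
b = replace_list(a)

print(a)                       # [1, 2, 3, 'a', 'b']
print(b)                       # ['x', 'y']
```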

You can pass arguments to a function in three ways. Assume the function place_stuff is defined as follows. The values assigned to the arguments in the function definition are defaults.

>>> def place_stuff(x = 10, y = 20, z = 30):
...     return x, y, z
...

If you call the function and specify arguments, the function uses those arguments:

>>> place_stuff(1,2,3)
(1, 2, 3)

If you do not specify arguments, the function uses the defaults:

>>> place_stuff()
(10, 20, 30)

Alternatively, you can specify values for some or all of the arguments by name; the arguments you do not name keep their defaults:

>>> place_stuff(z=100)
(10, 20, 100)
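The three ways of passing arguments can be mixed in one call, as long as positional arguments come before named ones. The following minimal sketch reuses place_stuff; the particular calls are illustrative only, and print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
def place_stuff(x=10, y=20, z=30):
    return x, y, z

print(place_stuff(1, z=3))       # (1, 20, 3): x by position, z by name, y defaults
print(place_stuff(y=2, x=1))     # (1, 2, 30): named arguments can be in any order
```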

Using Libraries

This section discusses the Python standard library, nonstandard libraries, and Python namespace, as well as how to import and use a function.

Standard Library

The Python standard library, which is usually included in packages installed with Python, provides a wide range of facilities, including functions, constants, string services, data types, file and directory access, cryptographic services, and file formats. Visit docs.python.org/library/index.html for a list of the contents of the standard library.

Nonstandard Libraries

In some cases a module you want might be part of a library that is not included with Python. You can usually find what you need in the Python Package Index (PyPI; pypi.python.org), a repository of more than 22,000 Python packages.

You can find lists of modules for the distribution you are using by searching the Web for distro package database, where distro is the name of the Linux distribution you are using, and then searching one of the databases for python.

SciPy and NumPy Libraries

Two popular libraries are SciPy and NumPy. The SciPy (“Sigh Pie”; scipy.org) library holds Python modules for mathematics, science, and engineering. It depends on NumPy (numpy.scipy.org), a library of Python modules for scientific computing.

You must download and install the package that holds NumPy before you can use any of its modules. Under Debian/Ubuntu/Mint and openSuSE, the package is named python-numpy; under Fedora/RHEL, it is named numpy. See page 534 for instructions on downloading and installing packages. Alternately, you can obtain the libraries from scipy.org or pypi.python.org. Once you import SciPy (import scipy), help(scipy) will list the functions you can import individually.

Namespace

A namespace comprises a set of names (identifiers) in which all names are unique. For example, the namespace for a program might include an object named planets. You might instantiate planets as an integer:

>>> planets = 5
>>> type(planets)
<type 'int'>

Although it is not good programming practice, later on in the program you could assign planets a string value. It would then be an object of type string.

>>> planets = 'solar system'
>>> type(planets)
<type 'str'>

You could make planets a function, a list, or another type of object. Regardless, there would always be only one object named planets (identifiers in a namespace must be unique).

When you import a module (including a function), Python can merge the namespace of the module with the namespace of your program, creating a conflict. For example, if you import the function named sample from the library named random and then define a function with the same name, you will no longer be able to access the original function:

>>> from random import sample
>>> sample(range(10), 10)
[6, 9, 0, 7, 3, 5, 2, 4, 1, 8]
>>> def sample(a, b):
...     print 'Problem?'
...
>>> sample(range(10), 10)
Problem?

The next section discusses different ways you can import objects from a library and steps you can take to avoid the preceding problem.

Importing a Module

You can import a module in one of several ways. How you import the module determines whether Python merges the namespace of the module with that of your program.

The simplest thing to do is to import the whole module. In this case Python does not merge the namespaces but allows you to refer to an object from the module by prefixing its name with the name of the module. The following code imports the random module. Using this syntax, the function named sample is not defined: You must call it as random.sample.

>>> import random
>>> sample(range(10), 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sample' is not defined

>>> random.sample(range(10), 10)
[1, 0, 6, 9, 8, 3, 2, 7, 5, 4]

This setup allows you to define your own function named sample. Because Python has not merged the namespaces from your program and the module named random, the two functions can coexist.

>>> def sample(a, b):
...     print 'Not a problem.'
...
>>> sample(1, 2)
Not a problem.
>>> random.sample(range(10), 10)
[2, 9, 6, 5, 1, 3, 4, 0, 7, 8]

Importing part of a module

Another way to import a module is to specify the module and the object.

>>> from random import sample

When you import an object using this syntax, you import only the function named sample; no other objects from that module will be available. This technique is very efficient. However, Python merges the namespaces from your program and from the object, which can give rise to the type of problem illustrated in the previous section.

You can also use from module import *. This syntax imports all names from module into the namespace of your program; it is generally not a good idea to use this technique.
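The two import styles can be compared side by side. The following minimal sketch imports random both ways, saves the imported function under the hypothetical name original, and then redefines sample. The name sample now refers to the new function, but the original remains reachable as random.sample. The code uses print with parentheses so it runs under both Python 2 and Python 3.

```python
import random
from random import sample

original = sample                  # keep a reference to the imported function

def sample(a, b):                  # redefining the name hides the imported one
    return 'Problem?'

print(sample(range(10), 10))       # Problem?
print(random.sample == original)   # True: still reachable through the module
print(len(random.sample(range(10), 3)))   # 3
```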

Example of Importing a Function

The my_max.py program on page 1090 finds the largest element in a predefined list. The following program works the same way, except at runtime it fills a list with random numbers.

The following command imports the sample() function from the standard library module named random. You can import an object from a nonstandard library, such as NumPy, the same way.

from random import sample

After importing this function, the command help(sample) displays information about sample(). The sample() function has the following syntax:

sample(list, number)

where list is a list holding the population, or values sample() can return, and number is the number of random numbers in the list sample() returns.

>>> sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 4)
[7, 1, 2, 5]
>>> sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 6)
[0, 5, 4, 1, 3, 7]

The following program uses the range() function (page 1096), which returns a list holding the numbers from 0 through 1 less than its argument:

>>> range(8)
[0, 1, 2, 3, 4, 5, 6, 7]
>>> range(16)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

When you combine the two functions, range() provides a list of values and sample() selects values randomly from that list.

>>> sample(range(100),10)
[5, 32, 70, 93, 74, 29, 90, 7, 30, 11]

In the following program, sample generates a list of random numbers and for iterates through them.

$ cat my_max2.py
#!/usr/bin/python

from random import sample
my_rand_list = sample(range(100), 10)
print 'Random list of numbers:', my_rand_list
largest = -1
for item in my_rand_list:
        if (item > largest):
                largest = item
                print largest,
print
print 'largest number is ', largest
$ ./my_max2.py
Random list of numbers: [67, 40, 1, 29, 9, 49, 99, 95, 77, 51]
67 99
largest number is  99
$ ./my_max2.py
Random list of numbers: [53, 33, 76, 35, 71, 13, 75, 58, 74, 50]
53 76
largest number is  76

max()

The algorithm used in this example is not the most efficient way of finding the maximum value in a list. It is more efficient to use the max() builtin function.

>>> from random import sample
>>> max(sample(range(100), 10))
96
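One way to gain confidence in a hand-written loop is to compare its result with max(). The following is a minimal sketch; the function name largest_by_loop is hypothetical. Unlike my_max2.py, it starts with the first element of the list instead of -1, an assumption that lets it handle lists of negative numbers as well. It uses print with parentheses so it runs under both Python 2 and Python 3.

```python
from random import sample

def largest_by_loop(numbers):
    """Return the largest element of a nonempty list, found with a for loop."""
    largest = numbers[0]           # start with the first element, not -1
    for item in numbers:
        if item > largest:
            largest = item
    return largest

my_rand_list = sample(range(100), 10)
print(largest_by_loop(my_rand_list) == max(my_rand_list))   # True
```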


Optional

Lambda Functions

Python supports Lambda functions—functions that might not be bound to a name. You might also see them referred to as anonymous functions. Lambda functions are more restrictive than other functions because they can hold only a single expression. In its most basic form, Lambda is another syntax for defining a function. In the following example, the object named a is a Lambda function and performs the same task as the function named add_one:

>>> def add_one(x):
...     return x + 1
...
>>> type (add_one)
<type 'function'>

>>> add_one(2)
3

>>> a = lambda x: x + 1
>>> type(a)
<type 'function'>

>>> a(2)
3

map()

You can use the Lambda syntax to define a function inline as an argument to a function such as map() that expects another function as an argument. The syntax of the map() function is

map(func, seq1[, seq2, ...])

where func is a function that is applied to the sequence of arguments represented by seq1 (and seq2 ...). Typically the sequences that are arguments to map() and the object returned by map() are lists. The next example first defines a function named times_two():

>>> def times_two(x):
...     return x * 2
...
>>> times_two(8)
16

Next, the map() function applies times_two() to a list:

>>> map(times_two, [1, 2, 3, 4])
[2, 4, 6, 8]

You can define an inline Lambda function as an argument to map(). In this example the Lambda function is not bound to a name.

>>> map(lambda x: x * 2, [1, 2, 3, 4])
[2, 4, 6, 8]
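When you give map() more than one sequence, it passes one element from each sequence to the function on each call. The following minimal sketch shows a one-sequence and a two-sequence call; the variable names doubled and sums are hypothetical. Under Python 2, map() returns a list as shown above; under Python 3 it returns an iterator, so the sketch wraps each call in list(), which is harmless under Python 2.

```python
# One sequence: the Lambda function takes one argument
doubled = list(map(lambda x: x * 2, [1, 2, 3, 4]))

# Two sequences: the Lambda function takes one argument from each
sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))

print(doubled)    # [2, 4, 6, 8]
print(sums)       # [11, 22, 33]
```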

List Comprehensions

List comprehensions apply functions to lists. For example, the following code, which does not use a list comprehension, uses for to iterate over items in a list:

>>> my_list = []
>>> for x in range(10):
...     my_list.append(x + 10)
...
>>> my_list
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

You can use a list comprehension to perform the same task neatly and efficiently. The syntax is similar, but a list comprehension is enclosed within square brackets and the operation (x + 10) precedes the iteration [for x in range(10)].

>>> my_list = [x + 10 for x in range(10)]
>>> my_list
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

The results when using a for structure and a list comprehension are the same. The next example uses a list comprehension to fill a list with powers of 2:

>>> potwo = [2**x for x in range(1, 13)]
>>> print potwo
[2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]

The next list comprehension fills a list with even numbers. The if clause returns values only if the remainder after dividing a number by 2 is 0 (if x % 2 == 0).

>>> [x for x in range(1,11) if x % 2 == 0]
[2, 4, 6, 8, 10]

The final example shows nested list comprehensions. It nests for loops and uses x + y to catenate the elements of both lists in all combinations.

>>> A = ['a', 'b', 'c']
>>> B = ['1', '2', '3']
>>> all = [x + y for x in A for y in B]
>>> print all
['a1', 'a2', 'a3', 'b1', 'b2', 'b3', 'c1', 'c2', 'c3']
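A list comprehension with two for clauses produces the same result as two nested for loops, with the first for clause acting as the outer loop. The following minimal sketch builds the list both ways; the names combos and loop_combos are hypothetical, and print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
A = ['a', 'b', 'c']
B = ['1', '2', '3']

# List comprehension: the first for clause is the outer loop
combos = [x + y for x in A for y in B]

# Equivalent nested for loops
loop_combos = []
for x in A:
    for y in B:
        loop_combos.append(x + y)

print(combos == loop_combos)    # True
```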


Chapter Summary

Python is an interpreted language: It translates code into bytecode at runtime and executes the bytecode within the Python virtual machine. You can run a Python program from the Python interactive shell or from a file. The Python interactive shell is handy for development because you can use it to debug and execute code one line at a time and see the results immediately. Within the Python interactive shell, Python displays output from any command line that does not have an action. From this shell you can give the command help() to use the Python help feature or you can give the command help('object') to display help on object.

Functions help improve code readability, efficiency, and expandability. Many functions are available when you first call Python (builtin functions) and many more are found in libraries that you can download and/or import. Methods are similar to functions except they work on objects whereas functions stand on their own. Python allows you to define both normal functions and nameless functions called Lambda functions.

Python enables you to read from and write to files in many ways. The open() function opens a file and returns a file object called a file handle. Once the file is opened, you direct input and output using the file handle. When you are finished working with a file, it is good practice to close it. The pickle module allows you to store an object in a file in a standard format for later use by the same or a different program.

A Python list is an object that comprises one or more elements; it is similar to an array in C or Java. An important feature of lists is that they are iterable, meaning a control structure such as for can loop (iterate) through each item in a list. A Python dictionary holds unordered key–value pairs in which the keys are unique. Like lists, dictionaries are iterable.

Python implements many control structures, including if...else, if...elif...else, while, and for. Unlike most languages, Python requires SPACEs (or TABs) at the beginning of lines in a control structure. The indented code marks a logical block, or section of code, that is part of the control structure.

Python regular expressions are implemented by the Python re (regular expression) module. You must give the command import re before you can use the re methods.

Exercises

1. What is meant by implied display? Is it available in the Python interactive shell or from a program file? Provide a simple example of implied display.

2. Write and run a Python program that you store in a file. The program should demonstrate how to prompt the user for input and display the string the user entered.

3. Using the Python interactive shell, instantiate a list that holds three-letter abbreviations for the first six months of the year and display the list.

4. Using the Python interactive shell, use a for control structure to iterate through the elements of the list you instantiated in exercise 3 and display each abbreviated name followed by a period on a line by itself. (Hint: The period is a string.)

5. Using the Python interactive shell, put the elements of the list you instantiated in exercise 3 in alphabetical order.

6. Instantiate a dictionary in which the keys are the months in the third quarter of the year and the values are the number of days in the corresponding month. Display the dictionary, the keys, and the values. Add the tenth month of the year to the dictionary and display the value of that month only.

7. What does iterable mean? Name two builtin objects that are iterable. Which control structure can you use to loop through an iterable object?

8. Write and demonstrate a Lambda function named stg() that appends .txt to its argument. What happens when you call the function with an integer?

Advanced Exercises

9. Define a function named cents that returns its argument divided by 100 and truncated to an integer. For example:

>>> cents(12345)
123

10. Define a function named cents2 that returns its argument divided by 100 exactly (and includes decimal places if necessary). Make sure your function does not truncate the answer. For example:

>>> cents2(12345)
123.45

11. Create a list that has four elements. Make a copy of the list and change one of the elements in the copy. Show that the same element in the original list did not change.

12. Why does the following assignment statement generate an error?

>>> x.y = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined

13. Call map() with two arguments:

1. A Lambda function that returns the square of the number it was called with

2. A list holding the even numbers between 4 and 15; generate the list inline using range()

14. Use a list comprehension to display the numbers from 1 through 30 inclusive that are divisible by 3.

15. Write a function that takes an integer, val, as an argument. The function asks the user to enter a number. If the number is greater than val, the function displays Too high. and returns 1; if the number is less than val, the function displays Too low. and returns –1; if the number equals val, the function displays Got it! and returns 0. Call the function repeatedly until the user enters the right number.

16. Rewrite exercise 15 to call the function with a random number between 0 and 10 inclusive. (Hint: The randint function in the random library returns a random number between its two arguments inclusive.)

17. Write a function that counts the characters in a string the user inputs. Then write a routine that calls the function and displays the following output.

./count_letters.py
Enter some words: The rain in Spain
The string "The rain in Spain" has  17 characters in it.

18. Write a function that counts the vowels (aeiou) in a string the user inputs. Make sure it counts upper- and lowercase vowels. Then write a routine that calls the function and displays the following output.

./count_vowels.py
Enter some words: Go East young man!
The string "Go East young man!" has  6 vowels in it.

19. Write a function that counts all the characters and the vowels in a string the user inputs. Then write a routine that calls the function and displays the following output.

./count_all.py
Enter some words: The sun rises in the East and sets in the West.
13 letters in 47 are vowels.