Learning Python (2013)

Part II. Types and Operations

Chapter 5. Numeric Types

This chapter begins our in-depth tour of the Python language. In Python, data takes the form of objects—either built-in objects that Python provides, or objects we create using Python tools and other languages such as C. In fact, objects are the basis of every Python program you will ever write. Because they are the most fundamental notion in Python programming, objects are also our first focus in this book.

In the preceding chapter, we took a quick pass over Python’s core object types. Although essential terms were introduced in that chapter, we avoided covering too many specifics in the interest of space. Here, we’ll begin a more careful second look at data type concepts, to fill in details we glossed over earlier. Let’s get started by exploring our first data type category: Python’s numeric types and operations.

Numeric Type Basics

Most of Python’s number types are fairly typical and will probably seem familiar if you’ve used almost any other programming language in the past. They can be used to keep track of your bank balance, the distance to Mars, the number of visitors to your website, and just about any other numeric quantity.

In Python, numbers are not really a single object type, but a category of similar types. Python supports the usual numeric types (integers and floating points), as well as literals for creating numbers and expressions for processing them. In addition, Python provides more advanced numeric programming support and objects for more advanced work. A complete inventory of Python’s numeric toolbox includes:

§  Integer and floating-point objects

§  Complex number objects

§  Decimal: fixed-precision objects

§  Fraction: rational number objects

§  Sets: collections with numeric operations

§  Booleans: true and false

§  Built-in functions and modules: round, math, random, etc.

§  Expressions; unlimited integer precision; bitwise operations; hex, octal, and binary formats

§  Third-party extensions: vectors, libraries, visualization, plotting, etc.

Because the types in this list’s first bullet item tend to see the most action in Python code, this chapter starts with basic numbers and fundamentals, then moves on to explore the other types on this list, which serve specialized roles. We’ll also study sets here, which have both numeric and collection qualities, but are generally considered more the former than the latter. Before we jump into code, though, the next few sections get us started with a brief overview of how we write and process numbers in our scripts.

Numeric Literals

Among its basic types, Python provides integers, which are positive and negative whole numbers, and floating-point numbers, which are numbers with a fractional part (sometimes called “floats” for verbal economy). Python also allows us to write integers using hexadecimal, octal, and binary literals; offers a complex number type; and allows integers to have unlimited precision—they can grow to have as many digits as your memory space allows. Table 5-1 shows what Python’s numeric types look like when written out in a program as literals or constructor function calls.

Table 5-1. Numeric literals and constructors

Literal                                Interpretation
1234, -24, 0, 99999999999999           Integers (unlimited size)
1.23, 1., 3.14e-10, 4E210, 4.0e+210    Floating-point numbers
0o177, 0x9ff, 0b101010                 Octal, hex, and binary literals in 3.X
0177, 0o177, 0x9ff, 0b101010           Octal, octal, hex, and binary literals in 2.X
3+4j, 3.0+4.0j, 3J                     Complex number literals
set('spam'), {1, 2, 3, 4}              Sets: 2.X and 3.X construction forms
Decimal('1.0'), Fraction(1, 3)         Decimal and fraction extension types
bool(X), True, False                   Boolean type and constants

In general, Python’s numeric type literals are straightforward to write, but a few coding concepts are worth highlighting here:

Integer and floating-point literals

Integers are written as strings of decimal digits. Floating-point numbers have a decimal point and/or an optional signed exponent introduced by an e or E and followed by an optional sign. If you write a number with a decimal point or exponent, Python makes it a floating-point object and uses floating-point (not integer) math when the object is used in an expression. Floating-point numbers are implemented as C “doubles” in standard CPython, and therefore get as much precision as the C compiler used to build the Python interpreter gives to doubles.

Integers in Python 2.X: normal and long

In Python 2.X there are two integer types, normal (often 32 bits) and long (unlimited precision), and an integer may end in an l or L to force it to become a long integer. Because integers are automatically converted to long integers when their values overflow their allocated bits, you never need to type the letter L yourself—Python automatically converts up to long integer when extra precision is needed.

Integers in Python 3.X: a single type

In Python 3.X, the normal and long integer types have been merged—there is only integer, which automatically supports the unlimited precision of Python 2.X’s separate long integer type. Because of this, integers can no longer be coded with a trailing l or L, and integers never print with this character either. Apart from this, most programs are unaffected by this change, unless they do type testing that checks for 2.X long integers.

Hexadecimal, octal, and binary literals

Integers may be coded in decimal (base 10), hexadecimal (base 16), octal (base 8), or binary (base 2), the last three of which are common in some programming domains. Hexadecimals start with a leading 0x or 0X, followed by a string of hexadecimal digits (0–9 and A–F). Hex digits may be coded in lower- or uppercase. Octal literals start with a leading 0o or 0O (zero and lower- or uppercase letter o), followed by a string of digits (0–7). In 2.X, octal literals can also be coded with just a leading 0, but not in 3.X—this original octal form is too easily confused with decimal, and is replaced by the new 0o format, which can also be used in 2.X as of 2.6. Binary literals, new as of 2.6 and 3.0, begin with a leading 0b or 0B, followed by binary digits (0–1).

Note that all of these literals produce integer objects in program code; they are just alternative syntaxes for specifying values. The built-in calls hex(I), oct(I), and bin(I) convert an integer to its representation string in these three bases, and int(str, base) converts a runtime string to an integer per a given base.
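
As a quick sketch of these conversions, the following Python 3.X code spells one value in all four bases and converts it both ways. The variable names are made up for illustration, and the 0o form shown by oct is the 3.X display.

```python
# Four spellings of the same integer value, 255
values = [255, 0xFF, 0o377, 0b11111111]
all_equal = all(v == 255 for v in values)

# Integer -> digit string in a given base
as_hex, as_oct, as_bin = hex(255), oct(255), bin(255)

# Digit string -> integer, per an explicit base
from_hex = int('ff', 16)
from_bin = int('0b1010', 2)   # A prefix that matches the base is accepted
from_any = int('0o17', 0)     # Base 0: infer the base from the prefix

print(all_equal, as_hex, as_oct, as_bin)   # True 0xff 0o377 0b11111111
print(from_hex, from_bin, from_any)        # 255 10 15
```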

Complex numbers

Python complex literals are written as realpart+imaginarypart, where the imaginarypart is terminated with a j or J. The realpart is technically optional, so the imaginarypart may appear on its own. Internally, complex numbers are implemented as pairs of floating-point numbers, but all numeric operations perform complex math when applied to complex numbers. Complex numbers may also be created with the complex(real, imag) built-in call.
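
Here is a brief sketch of both construction forms; the names z, w, and pure are arbitrary, and the values were chosen so that the floating-point results come out exact.

```python
z = 3 + 4j                  # Literal form: realpart+imaginarypart
w = complex(3, 4)           # Built-in call form
pure = 2j                   # Real part omitted: defaults to 0.0

same = (z == w)             # True: same value either way
parts = (z.real, z.imag)    # (3.0, 4.0): stored as a pair of floats
magnitude = abs(z)          # 5.0: complex math, not element-wise
rotated = z * 1j            # (-4+3j): multiplying by j rotates 90 degrees

print(same, parts, magnitude, rotated, pure.real)
```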

Coding other numeric types

As we’ll see later in this chapter, there are additional numeric types at the end of Table 5-1 that serve more advanced or specialized roles. You create some of these by calling functions in imported modules (e.g., decimals and fractions), and others have literal syntax all their own (e.g., sets).

Built-in Numeric Tools

Besides the built-in number literals and construction calls shown in Table 5-1, Python provides a set of tools for processing number objects:

Expression operators

+, -, *, /, >>, **, &, etc.

Built-in mathematical functions

pow, abs, round, int, hex, bin, etc.

Utility modules

random, math, etc.

We’ll meet all of these as we go along.

Although numbers are primarily processed with expressions, built-ins, and modules, they also have a handful of type-specific methods today, which we’ll meet in this chapter as well. Floating-point numbers, for example, have an as_integer_ratio method that is useful for the fraction number type, and an is_integer method to test if the number is an integer. Integers have various attributes, including a new bit_length method introduced in Python 3.1 that gives the number of bits necessary to represent the object’s value. Moreover, as part collection and part number, sets also support both methods and expressions.
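
The methods just named can be tried directly on literals, as in this small sketch. The parentheses around each number are needed only so that Python does not read the dot as a decimal point; the variable names are illustrative.

```python
from fractions import Fraction

ratio = (2.5).as_integer_ratio()     # (5, 2): 2.5 is exactly 5/2
as_fraction = Fraction(*ratio)       # Feeds straight into the fraction type

whole = (4.0).is_integer()           # True: no fractional part
not_whole = (4.5).is_integer()       # False

bits_255 = (255).bit_length()        # 8: 0b11111111
bits_256 = (256).bit_length()        # 9: 0b100000000

print(ratio, as_fraction, whole, not_whole, bits_255, bits_256)
```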

Since expressions are the most essential tool for most number types, though, let’s turn to them next.

Python Expression Operators

Perhaps the most fundamental tool that processes numbers is the expression: a combination of numbers (or other objects) and operators that computes a value when executed by Python. In Python, you write expressions using the usual mathematical notation and operator symbols. For instance, to add two numbers X and Y you would say X + Y, which tells Python to apply the + operator to the values named by X and Y. The result of the expression is the sum of X and Y, another number object.

Table 5-2 lists all the operator expressions available in Python. Many are self-explanatory; for instance, the usual mathematical operators (+, -, *, /, and so on) are supported. A few will be familiar if you’ve used other languages in the past: % computes a division remainder, << performs a bitwise left-shift, & computes a bitwise AND result, and so on. Others are more Python-specific, and not all are numeric in nature: for example, the is operator tests object identity (i.e., address in memory, a strict form of equality), and lambda creates unnamed functions.

Table 5-2. Python expression operators and precedence

Operators                                       Description
yield x                                         Generator function send protocol
lambda args: expression                         Anonymous function generation
x if y else z                                   Ternary selection (x is evaluated only if y is true)
x or y                                          Logical OR (y is evaluated only if x is false)
x and y                                         Logical AND (y is evaluated only if x is true)
not x                                           Logical negation
x in y, x not in y                              Membership (iterables, sets)
x is y, x is not y                              Object identity tests
x < y, x <= y, x > y, x >= y, x == y, x != y    Magnitude comparison, set subset and superset; value equality operators
x | y                                           Bitwise OR, set union
x ^ y                                           Bitwise XOR, set symmetric difference
x & y                                           Bitwise AND, set intersection
x << y, x >> y                                  Shift x left or right by y bits
x + y, x - y                                    Addition, concatenation; subtraction, set difference
x * y, x % y, x / y, x // y                     Multiplication, repetition; remainder, format; division: true and floor
-x, +x                                          Negation, identity
~x                                              Bitwise NOT (inversion)
x ** y                                          Power (exponentiation)
x[i]                                            Indexing (sequence, mapping, others)
x[i:j:k]                                        Slicing
x(...)                                          Call (function, method, class, other callable)
x.attr                                          Attribute reference
(...)                                           Tuple, expression, generator expression
[...]                                           List, list comprehension
{...}                                           Dictionary, set, set and dictionary comprehensions

Since this book addresses both Python 2.X and 3.X, here are some notes about version differences and recent additions related to the operators in Table 5-2:

§  In Python 2.X, value inequality can be written as either X != Y or X <> Y. In Python 3.X, the latter of these options is removed because it is redundant. In either version, best practice is to use X != Y for all value inequality tests.

§  In Python 2.X, a backquotes expression `X` works the same as repr(X) and converts objects to display strings. Due to its obscurity, this expression is removed in Python 3.X; use the more readable str and repr built-in functions, described in “Numeric Display Formats.”

§  The X // Y floor division expression always truncates fractional remainders in both Python 2.X and 3.X. The X / Y expression performs true division in 3.X (retaining remainders) and classic division in 2.X (truncating for integers). See Division: Classic, Floor, and True.

§  The syntax [...] is used for both list literals and list comprehension expressions. The latter of these performs an implied loop and collects expression results in a new list. See Chapter 4, Chapter 14, and Chapter 20 for examples.

§  The syntax (...) is used for tuples and expression grouping, as well as generator expressions—a form of list comprehension that produces results on demand, instead of building a result list. See Chapter 4 and Chapter 20 for examples. The parentheses may sometimes be omitted in all three contexts.

§  The syntax {...} is used for dictionary literals, and in Python 3.X and 2.7 for set literals and both dictionary and set comprehensions. See the set coverage in this chapter as well as Chapter 4, Chapter 8, Chapter 14, and Chapter 20 for examples.

§  The yield and ternary if/else selection expressions are available in Python 2.5 and later. The former returns send(...) arguments in generators; the latter is shorthand for a multiline if statement. yield requires parentheses if not alone on the right side of an assignment statement.

§  Comparison operators may be chained: X < Y < Z produces the same result as X < Y and Y < Z. See Comparisons: Normal and Chained for details.

§  In recent Pythons, the slice expression X[I:J:K] is equivalent to indexing with a slice object: X[slice(I, J, K)].

§  In Python 2.X, magnitude comparisons of mixed types are allowed, and convert numbers to a common type, and order other mixed types according to type names. In Python 3.X, nonnumeric mixed-type magnitude comparisons are not allowed and raise exceptions; this includes sorts by proxy.

§  Magnitude comparisons for dictionaries are also no longer supported in Python 3.X (though equality tests are); comparing sorted(aDict.items()) is one possible replacement.
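
Two of the preceding notes are easy to verify with a short sketch: the slice-object equivalence, and the sorted-items replacement for dictionary magnitude comparisons. This is written for Python 3.X, and the names letters, d1, and d2 are invented for the example.

```python
letters = list('abcdefgh')

# Slice syntax and an explicit slice object select the same items
by_syntax = letters[1:6:2]
by_object = letters[slice(1, 6, 2)]

d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 3}

# Equality tests on dictionaries still work in 3.X
equal = (d1 == d2)

# d1 < d2 would raise TypeError in 3.X; compare sorted item lists instead
smaller = sorted(d1.items()) < sorted(d2.items())

print(by_syntax, by_object)   # ['b', 'd', 'f'] ['b', 'd', 'f']
print(equal, smaller)         # False True
```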

We’ll see most of the operators in Table 5-2 in action later; first, though, we need to take a quick look at the ways these operators may be combined in expressions.

Mixed operators follow operator precedence

As in most languages, in Python, you code more complex expressions by stringing together the operator expressions in Table 5-2. For instance, the sum of two multiplications might be written as a mix of variables and operators:

A * B + C * D

So, how does Python know which operation to perform first? The answer to this question lies in operator precedence. When you write an expression with more than one operator, Python groups its parts according to what are called precedence rules, and this grouping determines the order in which the expression’s parts are computed. Table 5-2 is ordered by operator precedence:

§  Operators lower in the table have higher precedence, and so bind more tightly in mixed expressions.

§  Operators in the same row in Table 5-2 generally group from left to right when combined (except for exponentiation, which groups right to left, and comparisons, which chain left to right).

For example, if you write X + Y * Z, Python evaluates the multiplication first (Y * Z), then adds that result to X because * has higher precedence (is lower in the table) than +. Similarly, in this section’s original example, both multiplications (A * B and C * D) will happen before their results are added.
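
To make the grouping rules concrete, here is a small sketch with arbitrary sample values. It shows multiplication binding before addition, the usual left-to-right grouping within a row, and the right-to-left exception for exponentiation.

```python
A, B, C, D = 2, 3, 4, 5
X, Y, Z = 2, 3, 4

total = A * B + C * D      # (2 * 3) + (4 * 5) = 26
mixed = X + Y * Z          # 2 + (3 * 4) = 14

left_to_right = 10 - 4 - 3   # (10 - 4) - 3 = 3
right_to_left = 2 ** 3 ** 2  # 2 ** (3 ** 2) = 512, not (2 ** 3) ** 2 = 64

print(total, mixed, left_to_right, right_to_left)
```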

Parentheses group subexpressions

You can forget about precedence completely if you’re careful to group parts of expressions with parentheses. When you enclose subexpressions in parentheses, you override Python’s precedence rules; Python always evaluates expressions in parentheses first before using their results in the enclosing expressions.

For instance, instead of coding X + Y * Z, you could write one of the following to force Python to evaluate the expression in the desired order:

(X + Y) * Z

X + (Y * Z)

In the first case, + is applied to X and Y first, because this subexpression is wrapped in parentheses. In the second case, the * is performed first (just as if there were no parentheses at all). Generally speaking, adding parentheses in large expressions is a good idea—it not only forces the evaluation order you want, but also aids readability.
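
The following sketch, again with made-up values, shows how much the grouping can matter. The last pair is a case where many readers guess the precedence wrong, which is a good argument for writing the parentheses.

```python
X, Y, Z = 2, 3, 4

plus_first = (X + Y) * Z     # 5 * 4 = 20
times_first = X + (Y * Z)    # 2 + 12 = 14, same as X + Y * Z

# ** binds more tightly than unary minus on its left
implicit = -2 ** 2           # -(2 ** 2) = -4
explicit = (-2) ** 2         # 4

print(plus_first, times_first, implicit, explicit)
```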

Mixed types are converted up

Besides mixing operators in expressions, you can also mix numeric types. For instance, you can add an integer to a floating-point number:

40 + 3.14

But this leads to another question: what type is the result—integer or floating point? The answer is simple, especially if you’ve used almost any other language before: in mixed-type numeric expressions, Python first converts operands up to the type of the most complicated operand, and then performs the math on same-type operands. This behavior is similar to type conversions in the C language.

Python ranks the complexity of numeric types like so: integers are simpler than floating-point numbers, which are simpler than complex numbers. So, when an integer is mixed with a floating point, as in the preceding example, the integer is converted up to a floating-point value first, and floating-point math yields the floating-point result:

>>> 40 + 3.14       # Integer to float, float math/result

43.14

Similarly, any mixed-type expression where one operand is a complex number results in the other operand being converted up to a complex number, and the expression yields a complex result. In Python 2.X, normal integers are also converted to long integers whenever their values are too large to fit in a normal integer; in 3.X, integers subsume longs entirely.

You can force the issue by calling built-in functions to convert types manually:

>>> int(3.1415)     # Truncates float to integer

3

>>> float(3)        # Converts integer to float

3.0

However, you won’t usually need to do this: because Python automatically converts up to the more complex type within an expression, the results are normally what you want.

Also, keep in mind that all these mixed-type conversions apply only when mixing numeric types (e.g., an integer and a floating point) in an expression, including those using numeric and comparison operators. In general, Python does not convert across any other type boundaries automatically. Adding a string to an integer, for example, results in an error, unless you manually convert one or the other; watch for an example when we meet strings in Chapter 7.
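
A minimal sketch of where automatic conversion stops, run under Python 3.X with illustrative names: numbers of different types mix freely, but a string and a number do not, so one side has to be converted by hand.

```python
# Numeric types mix freely: the simpler operand is converted up
int_plus_float = 40 + 3.14          # int -> float
float_plus_complex = 1.5 + 2j       # float -> complex

# Strings and numbers do not mix automatically
try:
    '40' + 2
except TypeError as exc:
    error_name = type(exc).__name__

# Manual conversion makes the intent explicit
as_number = int('40') + 2           # 42
as_text = '40' + str(2)             # '402'

print(type(int_plus_float).__name__, type(float_plus_complex).__name__)
print(error_name, as_number, as_text)
```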

NOTE

In Python 2.X, nonnumeric mixed types can be compared, but no conversions are performed; instead, mixed types compare according to a rule that is deterministic but not aesthetically pleasing: it compares the string names of the objects’ types. In 3.X, nonnumeric mixed-type magnitude comparisons are never allowed and raise exceptions. Note that this applies to comparison operators such as > only; other operators like + do not allow mixed nonnumeric types in either 3.X or 2.X.

Preview: Operator overloading and polymorphism

Although we’re focusing on built-in numbers right now, all Python operators may be overloaded (i.e., implemented) by Python classes and C extension types to work on objects you create. For instance, you’ll see later that objects coded with classes may be added or concatenated with x + y expressions, indexed with x[i] expressions, and so on.

Furthermore, Python itself automatically overloads some operators, such that they perform different actions depending on the type of built-in objects being processed. For example, the + operator performs addition when applied to numbers but performs concatenation when applied to sequence objects such as strings and lists. In fact, + can mean anything at all when applied to objects you define with classes.

As we saw in the prior chapter, this property is usually called polymorphism—a term indicating that the meaning of an operation depends on the type of the objects being operated on. We’ll revisit this concept when we explore functions in Chapter 16, because it becomes a much more obvious feature in that context.
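
As a preview only, here is a rough sketch of the idea. Pair is a hypothetical class invented for this example, not something Python provides; it implements + and indexing through the special methods __add__ and __getitem__, which later chapters cover properly.

```python
class Pair:
    """Hypothetical two-item container, for illustration only."""

    def __init__(self, first, second):
        self.items = (first, second)

    def __add__(self, other):
        # Called for: self + other
        return Pair(self.items[0] + other.items[0],
                    self.items[1] + other.items[1])

    def __getitem__(self, index):
        # Called for: self[index]
        return self.items[index]


# The same + means different things for different built-in types...
number_sum = 1 + 2                     # Addition: 3
string_sum = 'spam' + 'eggs'           # Concatenation: 'spameggs'
list_sum = [1, 2] + [3]                # Concatenation: [1, 2, 3]

# ...and whatever we choose for our own class
pair_sum = Pair(1, 2) + Pair(10, 20)

print(number_sum, string_sum, list_sum, pair_sum[0], pair_sum[1])
```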

Numbers in Action

On to the code! Probably the best way to understand numeric objects and expressions is to see them in action, so with those basics in hand let’s start up the interactive command line and try some simple but illustrative operations (be sure to see Chapter 3 for pointers if you need help starting an interactive session).

Variables and Basic Expressions

First of all, let’s exercise some basic math. In the following interaction, we first assign two variables (a and b) to integers so we can use them later in a larger expression. Variables are simply names—created by you or Python—that are used to keep track of information in your program. We’ll say more about this in the next chapter, but in Python:

§  Variables are created when they are first assigned values.

§  Variables are replaced with their values when used in expressions.

§  Variables must be assigned before they can be used in expressions.

§  Variables refer to objects and are never declared ahead of time.

In other words, these assignments cause the variables a and b to spring into existence automatically:

% python

>>> a = 3                  # Name created: not declared ahead of time

>>> b = 4

I’ve also used a comment here. Recall that in Python code, text after a # mark and continuing to the end of the line is considered to be a comment and is ignored by Python. Comments are a way to write human-readable documentation for your code, and an important part of programming. I’ve added them to most of this book’s examples to help explain the code. In the next part of the book, we’ll meet a related but more functional feature—documentation strings—that attaches the text of your comments to objects so it’s available after your code is loaded.

Because code you type interactively is temporary, though, you won’t normally write comments in this context. If you’re working along, this means you don’t need to type any of the comment text from the # through to the end of the line; it’s not a required part of the statements we’re running this way.

Now, let’s use our new integer objects in some expressions. At this point, the values of a and b are still 3 and 4, respectively. Variables like these are replaced with their values whenever they’re used inside an expression, and the expression results are echoed back immediately when we’re working interactively:

>>> a + 1, a - 1           # Addition (3 + 1), subtraction (3 - 1)

(4, 2)

>>> b * 3, b / 2           # Multiplication (4 * 3), division (4 / 2)

(12, 2.0)

>>> a % 2, b ** 2          # Modulus (remainder), power (4 ** 2)

(1, 16)

>>> 2 + 4.0, 2.0 ** b      # Mixed-type conversions

(6.0, 16.0)

Technically, the results being echoed back here are tuples of two values because the lines typed at the prompt contain two expressions separated by commas; that’s why the results are displayed in parentheses (more on tuples later). Note that the expressions work because the variables a and b within them have been assigned values. If you use a different variable that has not yet been assigned, Python reports an error rather than filling in some default value:

>>> c * 2

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

NameError: name 'c' is not defined

You don’t need to predeclare variables in Python, but they must have been assigned at least once before you can use them. In practice, this means you have to initialize counters to zero before you can add to them, initialize lists to an empty list before you can append to them, and so on.
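
In script form, the initialization rule looks something like the following sketch. The names total, squares, and undefined_counter are arbitrary; the last of these is deliberately never assigned so that the error can be caught and shown.

```python
total = 0            # Counter: must exist before we can add to it
squares = []         # List: must exist before we can append to it

for n in range(1, 5):
    total += n
    squares.append(n ** 2)

# Updating a name that was never assigned fails instead of defaulting to 0
try:
    undefined_counter += 1
except NameError as exc:
    failure = type(exc).__name__

print(total, squares)   # 10 [1, 4, 9, 16]
print(failure)
```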

Here are two slightly larger expressions to illustrate operator grouping and more about conversions, and preview a difference in the division operator in Python 3.X and 2.X:

>>> b / 2 + a               # Same as ((4 / 2) + 3)   [use 2.0 in 2.X]

5.0

>>> b / (2.0 + a)           # Same as (4 / (2.0 + 3)) [use print before 2.7]

0.8

In the first expression, there are no parentheses, so Python automatically groups the components according to its precedence rules—because / is lower in Table 5-2 than +, it binds more tightly and so is evaluated first. The result is as if the expression had been organized with parentheses as shown in the comment to the right of the code.

Also, notice that all the numbers are integers in the first expression. Because of that, Python 2.X’s / performs integer division and addition and will give a result of 5, whereas Python 3.X’s / performs true division, which always retains fractional remainders and gives the result 5.0 shown. If you want 2.X’s integer division in 3.X, code this as b // 2 + a; if you want 3.X’s true division in 2.X, code this as b / 2.0 + a (more on division in a moment).

In the second expression, parentheses are added around the + part to force Python to evaluate it first (i.e., before the /). We also made one of the operands floating point by adding a decimal point: 2.0. Because of the mixed types, Python converts the integer referenced by a to a floating-point value (3.0) before performing the +. If instead all the numbers in this expression were integers, integer division (4 / 5) would yield the truncated integer 0 in Python 2.X but the floating point 0.8 shown in Python 3.X. Again, stay tuned for formal division details.
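
If code has to give the same answers under both lines, one option is to spell out the kind of division you mean, as this sketch does. The results in the comments are for Python 3.X; the // and 2.0 forms should give the same values in 2.X as well.

```python
a, b = 3, 4

version_dependent = b / 2 + a      # 5.0 in 3.X (5 in 2.X)
always_floor = b // 2 + a          # 5: floor division in either line
always_true = b / 2.0 + a          # 5.0: a float operand forces float math

floor_grouped = b // (2 + a)       # 4 // 5 = 0
true_grouped = b / (2.0 + a)       # 4 / 5.0 = 0.8

print(version_dependent, always_floor, always_true)
print(floor_grouped, true_grouped)
```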

Numeric Display Formats

If you’re using Python 2.6, Python 3.0, or earlier, the result of the last of the preceding examples may look a bit odd the first time you see it:

>>> b / (2.0 + a)           # Pythons <= 2.6: echoes give more (or fewer) digits

0.80000000000000004

>>> print(b / (2.0 + a))    # But print rounds off digits

0.8

We met this phenomenon briefly in the prior chapter, and it’s not present in Pythons 2.7, 3.1, and later. The full story behind this odd result has to do with the limitations of floating-point hardware and its inability to exactly represent some values in a limited number of bits. Because computer architecture is well beyond this book’s scope, though, we’ll finesse this by saying that your computer’s floating-point hardware is doing the best it can, and neither it nor Python is in error here.

In fact, this is really just a display issue—the interactive prompt’s automatic result echo shows more digits than the print statement here only because it uses a different algorithm. It’s the same number in memory. If you don’t want to see all the digits, use print; as this chapter’s sidebar str and repr Display Formats will explain, you’ll get a user-friendly display. As of 2.7 and 3.1, Python’s floating-point display logic tries to be more intelligent, usually showing fewer decimal digits, but occasionally more.

Note, however, that not all values have so many digits to display:

>>> 1 / 2.0

0.5

and that there are more ways to display the bits of a number inside your computer than using print and automatic echoes (the following are all run in Python 3.3, and may vary slightly in older versions):

>>> num = 1 / 3.0

>>> num                      # Auto-echoes

0.3333333333333333

>>> print(num)               # Print explicitly

0.3333333333333333

>>> '%e' % num               # String formatting expression

'3.333333e-01'

>>> '%4.2f' % num            # Alternative floating-point format

'0.33'

>>> '{0:4.2f}'.format(num)   # String formatting method: Python 2.6, 3.0, and later

'0.33'

The last three of these expressions employ string formatting, a tool that allows for format flexibility, which we will explore in the upcoming chapter on strings (Chapter 7). Its results are strings that are typically printed to displays or reports.

STR AND REPR DISPLAY FORMATS

Technically, the difference between default interactive echoes and print corresponds to the difference between the built-in repr and str functions:

>>> repr('spam')           # Used by echoes: as-code form

"'spam'"

>>> str('spam')            # Used by print: user-friendly form

'spam'

Both of these convert arbitrary objects to their string representations: repr (and the default interactive echo) produces results that look as though they were code; str (and the print operation) converts to a typically more user-friendly format if available. Some objects have both—a str for general use, and a repr with extra details. This notion will resurface when we study both strings and operator overloading in classes, and you’ll find more on these built-ins in general later in the book.

Besides providing print strings for arbitrary objects, the str built-in is also the name of the string data type, and in 3.X may be called with an encoding name to decode a Unicode string from a byte string (e.g., str(b'xy', 'utf8')), and serves as an alternative to the bytes.decode method we met in Chapter 4. We’ll study the latter advanced role in Chapter 37 of this book.
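
For an object that has both display forms, the difference is easier to see. This sketch uses the standard library's Fraction type, which we'll meet later in this chapter; the variable names are illustrative.

```python
from fractions import Fraction

third = Fraction(1, 3)

as_code = repr(third)        # 'Fraction(1, 3)': looks like the code that built it
friendly = str(third)        # '1/3': what print would show

quoted = repr('spam')        # "'spam'": quotes are part of the as-code form
plain = str('spam')          # 'spam'

print(as_code, friendly, quoted, plain)
```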

Comparisons: Normal and Chained

So far, we’ve been dealing with standard numeric operations (addition and multiplication), but numbers, like all Python objects, can also be compared. Normal comparisons work for numbers exactly as you’d expect—they compare the relative magnitudes of their operands and return a Boolean result, which we would normally test and take action on in a larger statement and program:

>>> 1 < 2                  # Less than

True

>>> 2.0 >= 1               # Greater than or equal: mixed-type 1 converted to 1.0

True

>>> 2.0 == 2.0             # Equal value

True

>>> 2.0 != 2.0             # Not equal value

False

Notice again how mixed types are allowed in numeric expressions (only); in the second test here, Python compares values in terms of the more complex type, float.

Interestingly, Python also allows us to chain multiple comparisons together to perform range tests. Chained comparisons are a sort of shorthand for larger Boolean expressions. The expression (A < B < C), for instance, tests whether B is between A and C; it is equivalent to the Boolean test (A < B and B < C) but is easier on the eyes (and the keyboard). For example, assume the following assignments:

>>> X = 2

>>> Y = 4

>>> Z = 6

The following two expressions have identical effects, but the first is shorter to type, and it may run slightly faster since Python needs to evaluate Y only once:

>>> X < Y < Z              # Chained comparisons: range tests

True

>>> X < Y and Y < Z

True

The same equivalence holds for false results, and arbitrary chain lengths are allowed:

>>> X < Y > Z

False

>>> X < Y and Y > Z

False

>>> 1 < 2 < 3.0 < 4

True

>>> 1 > 2 > 3.0 > 4

False

You can use other comparisons in chained tests, but the resulting expressions can become nonintuitive unless you evaluate them the way Python does. The following, for instance, is false just because 1 is not equal to 2:

>>> 1 == 2 < 3        # Same as: 1 == 2 and 2 < 3

False                 # Not same as: False < 3 (which means 0 < 3, which is true!)

Python does not compare the 1 == 2 expression’s False result to 3—this would technically mean the same as 0 < 3, which would be True (as we’ll see later in this chapter, True and False are just customized 1 and 0).
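
The claim that a chained test evaluates its middle operand only once can be checked with a small experiment. In this sketch, middle is a hypothetical helper that records each call; the names are invented for the example.

```python
calls = []

def middle():
    """Hypothetical helper: returns 4 and records that it was called."""
    calls.append('called')
    return 4

X, Z = 2, 6

chained = X < middle() < Z               # Middle operand evaluated once
calls_chained = len(calls)

del calls[:]                             # Reset the call log
spelled_out = X < middle() and middle() < Z
calls_spelled_out = len(calls)           # Evaluated twice this time

odd_chain = (1 == 2 < 3)                 # 1 == 2 and 2 < 3: False
forced_group = ((1 == 2) < 3)            # False < 3, i.e. 0 < 3: True

print(chained, calls_chained, spelled_out, calls_spelled_out)
print(odd_chain, forced_group)
```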

One last note here before we move on: chaining aside, numeric comparisons are based on magnitudes, which are generally simple—though floating-point numbers may not always work as you’d expect, and may require conversions or other massaging to be compared meaningfully:

>>> 1.1 + 2.2 == 3.3             # Shouldn't this be True?...

False

>>> 1.1 + 2.2                    # Close to 3.3, but not exactly: limited precision

3.3000000000000003

>>> int(1.1 + 2.2) == int(3.3)   # OK if convert: see also round, floor, trunc ahead

True                             # Decimals and fractions (ahead) may help here too

This stems from the fact that floating-point numbers cannot represent some values exactly due to their limited number of bits—a fundamental issue in numeric programming not unique to Python, which we’ll learn more about later when we meet decimals and fractions, tools that can address such limitations. First, though, let’s continue our tour of Python’s core numeric operations, with a deeper look at division.
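
Truncating with int is only one way to massage such values, and it discards the fractional part entirely. The sketch below shows a few other common approaches; note the assumption that math.isclose is available, which requires Python 3.5 or later, so it is not an option in the 3.3 used for this chapter's sessions.

```python
import math
from decimal import Decimal

total = 1.1 + 2.2

exact = (total == 3.3)                            # False: tiny representation error
rounded = (round(total, 2) == round(3.3, 2))      # True: compare at 2 decimal places
within = abs(total - 3.3) < 1e-9                  # True: explicit tolerance
close = math.isclose(total, 3.3)                  # True: needs Python 3.5+

# Decimals built from strings avoid the binary representation issue
decimal_exact = (Decimal('1.1') + Decimal('2.2') == Decimal('3.3'))

print(exact, rounded, within, close, decimal_exact)
```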

Division: Classic, Floor, and True

You’ve seen how division works in the previous sections, so you should know that it behaves slightly differently in Python 3.X and 2.X. In fact, there are actually three flavors of division, and two different division operators, one of which changes in 3.X. This story gets a bit detailed, but it’s another major change in 3.X and can break 2.X code, so let’s get the division operator facts straight:

X / Y

Classic and true division. In Python 2.X, this operator performs classic division, truncating results for integers, and keeping remainders (i.e., fractional parts) for floating-point numbers. In Python 3.X, it performs true division, always keeping remainders in floating-point results, regardless of types.

X // Y

Floor division. Added in Python 2.2 and available in both Python 2.X and 3.X, this operator always truncates fractional remainders down to their floor, regardless of types. Its result type depends on the types of its operands.

True division was added to address the fact that the results of the original classic division model are dependent on operand types, and so can be difficult to anticipate in a dynamically typed language like Python. Classic division was removed in 3.X because of this constraint: the / and // operators implement true and floor division in 3.X. Python 2.X defaults to classic and floor division, but you can enable true division as an option. In sum:

§  In 3.X, the / now always performs true division, returning a float result that includes any remainder, regardless of operand types. The // performs floor division, which truncates the remainder and returns an integer for integer operands or a float if any operand is a float.

§  In 2.X, the / does classic division, performing truncating integer division if both operands are integers and float division (keeping remainders) otherwise. The // does floor division and works as it does in 3.X, performing truncating division for integers and floor division for floats.

Here are the two operators at work in 3.X and 2.X—the first operation in each set is the crucial difference between the lines that may impact code:

C:\code> C:\Python33\python

>>> 

>>> 10 / 4            # Differs in 3.X: keeps remainder

2.5

>>> 10 / 4.0          # Same in 3.X: keeps remainder

2.5

>>> 10 // 4           # Same in 3.X: truncates remainder

2

>>> 10 // 4.0         # Same in 3.X: truncates to floor

2.0

C:\code> C:\Python27\python

>>> 

>>> 10 / 4            # This might break on porting to 3.X!

2

>>> 10 / 4.0

2.5

>>> 10 // 4           # Use this in 2.X if truncation needed

2

>>> 10 // 4.0

2.0

Notice that the data type of the result for // is still dependent on the operand types in 3.X: if either is a float, the result is a float; otherwise, it is an integer. Although this may seem similar to the type-dependent behavior of / in 2.X that motivated its change in 3.X, the type of the return value is much less critical than differences in the return value itself.

Moreover, because // was provided in part as a compatibility tool for programs that rely on truncating integer division (and this is more common than you might expect), it must return integers for integers. Using // instead of / in 2.X when integer truncation is required helps make code 3.X-compatible.

Supporting either Python

Although / behavior differs in 2.X and 3.X, you can still support both versions in your code. If your programs depend on truncating integer division, use // in both 2.X and 3.X as just mentioned. If your programs require floating-point results with remainders for integers, use float to guarantee that one operand is a float around a / when run in 2.X:

X = Y // Z        # Always truncates, always an int result for ints in 2.X and 3.X

X = Y / float(Z)  # Guarantees float division with remainder in either 2.X or 3.X

Alternatively, you can enable 3.X / division in 2.X with a __future__ import, rather than forcing it with float conversions:

C:\code> C:\Python27\python

>>> from __future__ import division         # Enable 3.X "/" behavior

>>> 10 / 4

2.5

>>> 10 // 4                                 # Integer // is the same in both

2

This special from statement applies to the rest of your session when typed interactively like this, and must appear as the first executable line when used in a script file (and alas, we can import from the future in Python, but not the past; insert something about talking to “the Doc” here...).

Floor versus truncation

One subtlety: the // operator is informally called truncating division, but it’s more accurate to refer to it as floor division—it truncates the result down to its floor, which means the closest whole number below the true result. The net effect is to round down, not strictly truncate, and this matters for negatives. You can see the difference for yourself with the Python math module (modules must be imported before you can use their contents; more on this later):

>>> import math

>>> math.floor(2.5)           # Closest number below value

2

>>> math.floor(-2.5)

-3

>>> math.trunc(2.5)           # Truncate fractional part (toward zero)

2

>>> math.trunc(-2.5)

-2

The // operator always floors its result. For positive results this is the same as truncation, but for negative results the floor is the next lower integer, not the truncated value. Here’s the case for 3.X:

C:\code> c:\python33\python

>>> 5 / 2, 5 / -2

(2.5, -2.5)

>>> 5 // 2, 5 // -2           # Truncates to floor: rounds to first lower integer

(2, -3)                       # 2.5 becomes 2, -2.5 becomes -3

>>> 5 / 2.0, 5 / -2.0

(2.5, -2.5)

>>> 5 // 2.0, 5 // -2.0       # Ditto for floats, though result is float too

(2.0, -3.0)

The 2.X case is similar, but / results differ again:

C:\code> c:\python27\python

>>> 5 / 2, 5 / -2             # Differs in 3.X

(2, -3)

>>> 5 // 2, 5 // -2           # This and the rest are the same in 2.X and 3.X

(2, -3)

>>> 5 / 2.0, 5 / -2.0

(2.5, -2.5)

>>> 5 // 2.0, 5 // -2.0

(2.0, -3.0)

If you really want truncation toward zero regardless of sign, you can always run a float division result through math.trunc, regardless of Python version (also see the round built-in for related functionality, and the int built-in, which has the same effect here but requires no import):

C:\code> c:\python33\python

>>> import math

>>> 5 / -2                      # Keep remainder

-2.5

>>> 5 // -2                     # Floor below result

-3

>>> math.trunc(5 / -2)          # Truncate instead of floor (same as int())

-2

C:\code> c:\python27\python

>>> import math

>>> 5 / float(-2)               # Remainder in 2.X

-2.5

>>> 5 / -2, 5 // -2             # Floor in 2.X

(-3, -3)

>>> math.trunc(5 / float(-2))   # Truncate in 2.X

-2
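Routing the result through a float works for small values, but floats cannot hold very large integers exactly. As a sketch of an integer-only alternative, the hypothetical trunc_div helper below (not a built-in) floors the magnitudes and then restores the sign. It also shows how floor division pairs with the % remainder operator:

```python
import math

def trunc_div(a, b):
    """Integer division that truncates toward zero (hypothetical helper)."""
    quotient = abs(a) // abs(b)             # Floor of the magnitudes
    return quotient if (a < 0) == (b < 0) else -quotient

floor_results = (5 // 2, 5 // -2, -5 // 2, -5 // -2)      # (2, -3, -3, 2)
trunc_results = (trunc_div(5, 2), trunc_div(5, -2),
                 trunc_div(-5, 2), trunc_div(-5, -2))      # (2, -2, -2, 2)

# Floor division pairs with %: this identity holds for any nonzero b.
a, b = 5, -2
identity_holds = (a // b) * b + (a % b) == a               # True: -3 * -2 + -1
```

For small operands like these, trunc_div agrees with math.trunc applied to a true division result.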

Why does truncation matter?

As a wrap-up, if you are using 3.X, here is the short story on division operators for reference:

>>> (5 / 2), (5 / 2.0), (5 / -2.0), (5 / -2)        # 3.X true division

(2.5, 2.5, -2.5, -2.5)

>>> (5 // 2), (5 // 2.0), (5 // -2.0), (5 // -2)    # 3.X floor division

(2, 2.0, -3.0, -3)

>>> (9 / 3), (9.0 / 3), (9 // 3), (9 // 3.0)        # Both

(3.0, 3.0, 3, 3.0)

For 2.X readers, division works as follows (the three integer / results, 2, -3, and 3, differ from 3.X):

>>> (5 / 2), (5 / 2.0), (5 / -2.0), (5 / -2)        # 2.X classic division (differs)

(2, 2.5, -2.5, -3)

>>> (5 // 2), (5 // 2.0), (5 // -2.0), (5 // -2)    # 2.X floor division (same)

(2, 2.0, -3.0, -3)

>>> (9 / 3), (9.0 / 3), (9 // 3), (9 // 3.0)        # Both

(3, 3.0, 3, 3.0)

It’s possible that the nontruncating behavior of / in 3.X may break a significant number of 2.X programs. Perhaps because of a C language legacy, many programmers rely on division truncation for integers and will have to learn to use // in such contexts instead. You should do so in all new 2.X and 3.X code you write today—in the former for 3.X compatibility, and in the latter because / does not truncate in 3.X. Watch for a simple prime number while loop example in Chapter 13, and a corresponding exercise at the end of Part IV that illustrates the sort of code that may be impacted by this / change. Also stay tuned for more on the special from command used in this section; it’s discussed further in Chapter 25.
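To make this porting hazard concrete, here is a small sketch of one common pattern: computing a middle index. The list contents are made up for illustration. The // form works in either line, while the / form fails in 3.X because a float is not a valid index:

```python
items = ['spam', 'eggs', 'ham', 'toast', 'jam']

# Portable: // yields an int in both 2.X and 3.X, so it is a valid index.
middle = items[len(items) // 2]            # 'ham'

# Under 3.X, / yields the float 2.5 here, which cannot be used as an index.
try:
    items[len(items) / 2]
    slash_index_ok = True                  # Reached in 2.X (classic division)
except TypeError:
    slash_index_ok = False                 # Reached in 3.X (true division)
```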

Integer Precision

Division may differ slightly across Python releases, but it’s still fairly standard. Here’s something a bit more exotic. As mentioned earlier, Python 3.X integers support unlimited size:

>>> 999999999999999999999999999999 + 1         # 3.X

1000000000000000000000000000000

Python 2.X has a separate type for long integers, but it automatically converts any number too large to store in a normal integer to this type. Hence, you don’t need to code any special syntax to use longs, and the only way you can tell that you’re using 2.X longs is that they print with a trailing “L”:

>>> 999999999999999999999999999999 + 1         # 2.X

1000000000000000000000000000000L

Unlimited-precision integers are a convenient built-in tool. For instance, you can use them to count the U.S. national debt in pennies in Python directly (if you are so inclined, and have enough memory on your computer for this year’s budget). They are also why we were able to raise 2 to such large powers in the examples in Chapter 3. Here are the 3.X and 2.X cases:

>>> 2 ** 200

1606938044258990275541962092341162602522202993782792835301376

>>> 2 ** 200

1606938044258990275541962092341162602522202993782792835301376L

Because Python must do extra work to support their extended precision, integer math is usually substantially slower than normal when numbers grow large. However, if you need the precision, the fact that it’s built in for you to use will likely outweigh its performance penalty.
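As a brief illustration of what that precision buys you, the following sketch (run as a 3.X script) contrasts an exact integer result with the same value squeezed into a float. The specific values are arbitrary choices for the example:

```python
import math

big = 2 ** 1000                        # An ordinary int in 3.X
digit_count = len(str(big))            # 302 decimal digits

# Integer results stay exact no matter how large they grow...
exact = math.factorial(25)             # 15511210043330985984000000

# ...whereas a float keeps only roughly 15 to 17 significant digits,
# so converting to float and back does not recover the original value.
lossy = float(exact)
round_trip_differs = int(lossy) != exact    # True
```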

Complex Numbers

Although less commonly used than the types we’ve been exploring thus far, complex numbers are a distinct core object type in Python. They are typically used in engineering and science applications. If you know what they are, you know why they are useful; if not, consider this section optional reading.

Complex numbers are represented as two floating-point numbers (the real and imaginary parts), and you code them by adding a j or J suffix to the imaginary part. We can also write complex numbers with a nonzero real part by adding the two parts with a +. For example, the complex number with a real part of 2 and an imaginary part of -3 is written 2 + -3j. Here are some examples of complex math at work:

>>> 1j * 1J

(-1+0j)

>>> 2 + 1j * 3

(2+3j)

>>> (2 + 1j) * 3

(6+3j)

Complex numbers also allow us to extract their parts as attributes, support all the usual mathematical expressions, and may be processed with tools in the standard cmath module (the complex version of the standard math module). Because complex numbers are rare in most programming domains, though, we’ll skip the rest of this story here. Check Python’s language reference manual for additional details.
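For readers who do use complex numbers, here is a minimal sketch of the attributes and cmath support just mentioned, using only standard attributes and calls:

```python
import cmath

z = 2 + -3j                        # Real part 2, imaginary part -3

parts = (z.real, z.imag)           # (2.0, -3.0): both stored as floats
mirror = z.conjugate()             # (2+3j)
magnitude = abs(3 + 4j)            # 5.0

# math.sqrt(-1) raises ValueError; cmath.sqrt returns a complex result.
root = cmath.sqrt(-1)              # 1j
```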

Hex, Octal, Binary: Literals and Conversions

Python integers can be coded in hexadecimal, octal, and binary notation, in addition to the normal base-10 decimal coding we’ve been using so far. The first three of these may at first seem foreign to 10-fingered beings, but some programmers find them convenient alternatives for specifying values, especially when their mapping to bytes and bits is important. The coding rules were introduced briefly at the start of this chapter; let’s look at some live examples here.

Keep in mind that these literals are simply an alternative syntax for specifying the value of an integer object. For example, the following literals coded in Python 3.X or 2.X produce normal integers with the specified values in all three bases. In memory, an integer’s value is the same, regardless of the base we use to specify it:

>>> 0o1, 0o20, 0o377           # Octal literals: base 8, digits 0-7 (3.X, 2.6+)

(1, 16, 255)

>>> 0x01, 0x10, 0xFF           # Hex literals: base 16, digits 0-9/A-F (3.X, 2.X)

(1, 16, 255)

>>> 0b1, 0b10000, 0b11111111   # Binary literals: base 2, digits 0-1 (3.X, 2.6+)

(1, 16, 255)

Here, the octal value 0o377, the hex value 0xFF, and the binary value 0b11111111 are all decimal 255. The F digits in the hex value, for example, each mean 15 in decimal and a 4-bit 1111 in binary, and reflect powers of 16. Thus, the hex value 0xFF and others convert to decimal values as follows:

>>> 0xFF, (15 * (16 ** 1)) + (15 * (16 ** 0))     # How hex/binary map to decimal

(255, 255)

>>> 0x2F, (2  * (16 ** 1)) + (15 * (16 ** 0))

(47, 47)

>>> 0xF, 0b1111, (1*(2**3) + 1*(2**2) + 1*(2**1) + 1*(2**0))

(15, 15, 15)

Python prints integer values in decimal (base 10) by default but provides built-in functions that allow you to convert integers to other bases’ digit strings, in Python-literal form—useful when programs or users expect to see values in a given base:

>>> oct(64), hex(64), bin(64)               # Numbers=>digit strings

('0o100', '0x40', '0b1000000')

The oct function converts decimal to octal, hex to hexadecimal, and bin to binary. To go the other way, the built-in int function converts a string of digits to an integer, and an optional second argument lets you specify the numeric base—useful for numbers read from files as strings instead of coded in scripts:

>>> 64, 0o100, 0x40, 0b1000000              # Digits=>numbers in scripts and strings

(64, 64, 64, 64)

>>> int('64'), int('100', 8), int('40', 16), int('1000000', 2)

(64, 64, 64, 64)

>>> int('0x40', 16), int('0b1000000', 2)    # Literal forms supported too

(64, 64)

The eval function, which you’ll meet later in this book, treats strings as though they were Python code. Therefore, it has a similar effect, but usually runs more slowly—it actually compiles and runs the string as a piece of a program, and it assumes the string being run comes from a trusted source—a clever user might be able to submit a string that deletes files on your machine, so be careful with this call:

>>> eval('64'), eval('0o100'), eval('0x40'), eval('0b1000000')

(64, 64, 64, 64)
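If the strings may come from an untrusted source, one alternative to eval is to pass a base of 0 to int, which asks it to infer the base from the literal's prefix without running anything as code. The parse_int wrapper below is a hypothetical helper name used only for this sketch:

```python
def parse_int(text):
    """Convert a digit string to an int, honoring any 0x/0o/0b prefix.

    Hypothetical helper: base 0 tells int to infer the base from the
    literal prefix, and the string is never executed as code.
    """
    return int(text, 0)

values = [parse_int(s) for s in ('64', '0o100', '0x40', '0b1000000')]
# values is [64, 64, 64, 64]

# Anything that is not an integer literal is rejected rather than run.
try:
    parse_int('__import__("os").getcwd()')
    rejected = False
except ValueError:
    rejected = True
```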

Finally, you can also convert integers to base-specific strings with string formatting method calls and expressions, which return just digits, not Python literal strings:

>>> '{0:o}, {1:x}, {2:b}'.format(64, 64, 64)     # Numbers=>digits, 2.6+

'100, 40, 1000000'

>>> '%o, %x, %x, %X' % (64, 64, 255, 255)        # Similar, in all Pythons

'100, 40, ff, FF'

String formatting is covered in more detail in Chapter 7.

Two notes before moving on. First, per the start of this chapter, Python 2.X users should remember that you can code octals with simply a leading zero, the original octal format in Python:

>>> 0o1, 0o20, 0o377     # New octal format in 2.6+ (same as 3.X)

(1, 16, 255)

>>> 01, 020, 0377        # Old octal literals in all 2.X (error in 3.X)

(1, 16, 255)

In 3.X, the syntax in the second of these examples generates an error. Even though it’s not an error in 2.X, be careful not to begin a string of digits with a leading zero unless you really mean to code an octal value. Python 2.X will treat it as base 8, which may not work as you’d expect: 010 is always decimal 8 in 2.X, not decimal 10 (despite what you may or may not think!). This, along with symmetry with the hex and binary forms, is why the octal format was changed in 3.X. You must use 0o010 in 3.X, and probably should in 2.6 and 2.7 both for clarity and forward-compatibility with 3.X.

Secondly, note that these literals can produce arbitrarily long integers. The following, for instance, creates an integer with hex notation and then displays it first in decimal and then in octal and binary with converters (run in 3.X here: in 2.X the decimal and octal displays have a trailing L to denote its separate long type, and octals display without the letter o):

>>> X = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFF

>>> X

5192296858534827628530496329220095

>>> oct(X)

'0o17777777777777777777777777777777777777'

>>> bin(X)

'0b111111111111111111111111111111111111111111111111111111111 ...and so on... 11111'

Speaking of binary digits, the next section shows tools for processing individual bits.

Bitwise Operations

Besides the normal numeric operations (addition, subtraction, and so on), Python supports most of the numeric expressions available in the C language. This includes operators that treat integers as strings of binary bits, and can come in handy if your Python code must deal with things like network packets, serial ports, or packed binary data produced by a C program.

We can’t dwell on the fundamentals of Boolean math here—again, those who must use it probably already know how it works, and others can often postpone the topic altogether—but the basics are straightforward. For instance, here are some of Python’s bitwise expression operators at work performing bitwise shift and Boolean operations on integers:

>>> x = 1               # 1 decimal is 0001 in bits

>>> x << 2              # Shift left 2 bits: 0100

4

>>> x | 2               # Bitwise OR (either bit=1): 0011

3

>>> x & 1               # Bitwise AND (both bits=1): 0001

1

In the first expression, a binary 1 (in base 2, 0001) is shifted left two slots to create a binary 4 (0100). The last two operations perform a binary OR to combine bits (0001|0010 = 0011) and a binary AND to select common bits (0001&0001 = 0001). Such bit-masking operations allow us to encode and extract multiple flags and other values within a single integer.
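As a sketch of the flag encoding just described, the following packs three on/off settings into one integer. The flag names and bit assignments are hypothetical, chosen only for illustration:

```python
# Hypothetical permission flags, one bit each.
READ    = 0b001
WRITE   = 0b010
EXECUTE = 0b100

perms = READ | WRITE                    # Combine flags with OR: 0b011

can_write = bool(perms & WRITE)         # True: the WRITE bit is set
can_exec  = bool(perms & EXECUTE)       # False: the EXECUTE bit is clear

perms |= EXECUTE                        # Turn a bit on:  0b111
perms &= ~WRITE                         # Turn a bit off (~ inverts bits): 0b101
final_bits = bin(perms)                 # '0b101'
```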

This is one area where the binary and hexadecimal number support in Python as of 3.0 and 2.6 becomes especially useful, because it allows us to code and inspect numbers by bit-strings:

>>> X = 0b0001          # Binary literals

>>> X << 2              # Shift left

4

>>> bin(X << 2)         # Binary digits string

'0b100'

>>> bin(X | 0b010)      # Bitwise OR: either

'0b11'

>>> bin(X & 0b1)        # Bitwise AND: both

'0b1'

This is also true for values that begin life as hex literals, or undergo base conversions:

>>> X = 0xFF            # Hex literals

>>> bin(X)

'0b11111111'

>>> X ^ 0b10101010      # Bitwise XOR: either but not both

85

>>> bin(X ^ 0b10101010)

'0b1010101'

>>> int('01010101', 2)  # Digits=>number: string to int per base

85

>>> hex(85)             # Number=>digits: Hex digit string

'0x55'

Also in this department, Python 3.1 and 2.7 introduced a new integer bit_length method, which allows you to query the number of bits required to represent a number’s value in binary. You can often achieve the same effect by subtracting 2 from the length of the bin string using the len built-in function we met in Chapter 4 (to account for the leading “0b”), though it may be less efficient:

>>> X = 99

>>> bin(X), X.bit_length(), len(bin(X)) - 2

('0b1100011', 7, 7)

>>> bin(256), (256).bit_length(), len(bin(256)) - 2

('0b100000000', 9, 9)

We won’t go into much more detail on such “bit twiddling” here. It’s supported if you need it, but bitwise operations are often not as important in a high-level language such as Python as they are in a low-level language such as C. As a rule of thumb, if you find yourself wanting to flip bits in Python, you should think about which language you’re really coding. As we’ll see in upcoming chapters, Python’s lists, dictionaries, and the like provide richer (and usually better) ways to encode information than bit strings, especially when your data’s audience includes readers of the human variety.

Other Built-in Numeric Tools

In addition to its core object types, Python also provides both built-in functions and standard library modules for numeric processing. The pow and abs built-in functions, for instance, compute powers and absolute values, respectively. Here are some examples of the standard library math module (which contains most of the tools in the C language’s math library) and a few built-in functions at work in 3.3; as described earlier, some floating-point displays may show more or fewer digits in Pythons before 2.7 and 3.1:

>>> import math

>>> math.pi, math.e                               # Common constants

(3.141592653589793, 2.718281828459045)

>>> math.sin(2 * math.pi / 180)                   # Sine, tangent, cosine

0.03489949670250097

>>> math.sqrt(144), math.sqrt(2)                  # Square root

(12.0, 1.4142135623730951)

>>> pow(2, 4), 2 ** 4, 2.0 ** 4.0                 # Exponentiation (power)

(16, 16, 16.0)

>>> abs(-42.0), sum((1, 2, 3, 4))                 # Absolute value, summation

(42.0, 10)

>>> min(3, 1, 2, 4), max(3, 1, 2, 4)              # Minimum, maximum

(1, 4)

The sum function shown here works on a sequence of numbers, and min and max accept either a sequence or individual arguments. There are a variety of ways to drop the decimal digits of floating-point numbers. We met truncation and floor earlier; we can also round, both numerically and for display purposes:

>>> math.floor(2.567), math.floor(-2.567)         # Floor (next-lower integer)

(2, -3)

>>> math.trunc(2.567), math.trunc(-2.567)         # Truncate (drop decimal digits)

(2, -2)

>>> int(2.567), int(-2.567)                       # Truncate (integer conversion)

(2, -2)

>>> round(2.567), round(2.467), round(2.567, 2)   # Round (Python 3.X version)

(3, 2, 2.57)

>>> '%.1f' % 2.567, '{0:.2f}'.format(2.567)       # Round for display (Chapter 7)

('2.6', '2.57')

As we saw earlier, the last of these produces strings that we would usually print and supports a variety of formatting options. As also described earlier, the second-to-last test here will also output (3, 2, 2.57) prior to 2.7 and 3.1 if we wrap it in a print call to request a more user-friendly display. String formatting is still subtly different, though, even in 3.X; round rounds and drops decimal digits but still produces a floating-point number in memory, whereas string formatting produces a string, not a number:

>>> (1 / 3.0), round(1 / 3.0, 2), ('%.2f' % (1 / 3.0))

(0.3333333333333333, 0.33, '0.33')
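One detail of the 3.X round that the examples above do not exercise is how exact halves are handled. The following sketch assumes Python 3.X; in 2.X, round instead returns a float and rounds halves away from zero:

```python
# Python 3.X: exact halves round to the nearest even integer.
halves = (round(0.5), round(1.5), round(2.5), round(3.5))    # (0, 2, 2, 4)

# With one argument, 3.X round returns an int; with two, a float.
kinds = (type(round(2.567)).__name__, type(round(2.567, 2)).__name__)

# Some apparent halves are not stored exactly: 2.675 is held as a value
# slightly below 2.675, so it rounds down.
surprise = round(2.675, 2)                                    # 2.67
```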

Interestingly, there are three ways to compute square roots in Python: using a module function, an expression, or a built-in function (if you’re interested in performance, we will revisit these in an exercise and its solution at the end of Part IV, to see which runs quicker):

>>> import math

>>> math.sqrt(144)              # Module

12.0

>>> 144 ** .5                   # Expression

12.0

>>> pow(144, .5)                # Built-in

12.0

>>> math.sqrt(1234567890)       # Larger numbers

35136.41828644462

>>> 1234567890 ** .5

35136.41828644462

>>> pow(1234567890, .5)

35136.41828644462

Notice that standard library modules such as math must be imported, but built-in functions such as abs and round are always available without imports. In other words, modules are external components, but built-in functions live in an implied namespace that Python automatically searches to find names used in your program. This namespace simply corresponds to the standard library module called builtins in Python 3.X (and __builtin__ in 2.X). There is much more about name resolution in the function and module parts of this book; for now, when you hear “module,” think “import.”

The standard library random module must be imported as well. This module provides an array of tools, for tasks such as picking a random floating-point number between 0 and 1, and selecting a random integer between two numbers:

>>> import random

>>> random.random()

0.5566014960423105

>>> random.random()              # Random floats, integers, choices, shuffles

0.051308506597373515

>>> random.randint(1, 10)

5

>>> random.randint(1, 10)

9

This module can also choose an item at random from a sequence, and shuffle a list of items randomly:

>>> random.choice(['Life of Brian', 'Holy Grail', 'Meaning of Life'])

'Holy Grail'

>>> random.choice(['Life of Brian', 'Holy Grail', 'Meaning of Life'])

'Life of Brian'

>>> suits = ['hearts', 'clubs', 'diamonds', 'spades']

>>> random.shuffle(suits)

>>> suits

['spades', 'hearts', 'diamonds', 'clubs']

>>> random.shuffle(suits)

>>> suits

['clubs', 'diamonds', 'hearts', 'spades']

Though we’d need additional code to make this more tangible here, the random module can be useful for shuffling cards in games, picking images at random in a slideshow GUI, performing statistical simulations, and much more. We’ll deploy it again later in this book (e.g., in Chapter 20’s permutations case study), but for more details, see Python’s library manual.
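Because results differ on every run, code that uses random can be awkward to test. One common approach, sketched here, is to create a private generator with a fixed seed so that a run can be repeated. The seed value 42 is an arbitrary assumption:

```python
import random

# Two private generators seeded with the same arbitrary value.
rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]
same_sequence = (rolls_a == rolls_b)            # True: same seed, same results

deck = ['hearts', 'clubs', 'diamonds', 'spades']
shuffled = deck[:]                              # Copy: shuffle changes its list in place
rng_a.shuffle(shuffled)
same_cards = sorted(shuffled) == sorted(deck)   # True: order changes, contents don't
```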

Other Numeric Types

So far in this chapter, we’ve been using Python’s core numeric types—integer, floating point, and complex. These will suffice for most of the number crunching that most programmers will ever need to do. Python comes with a handful of more exotic numeric types, though, that merit a brief look here.

Decimal Type

Python 2.4 introduced a new core numeric type: the decimal object, formally known as Decimal. Syntactically, you create decimals by calling a function within an imported module, rather than running a literal expression. Functionally, decimals are like floating-point numbers, but they have a fixed number of decimal digits. Hence, decimals are fixed-precision floating-point values.

For example, with decimals, we can have a floating-point value that always retains just two decimal digits. Furthermore, we can specify how to round or truncate the extra decimal digits beyond the object’s cutoff. Although it generally incurs a performance penalty compared to the normal floating-point type, the decimal type is well suited to representing fixed-precision quantities like sums of money and can help you achieve better numeric accuracy.

Decimal basics

The last point merits elaboration. As previewed briefly when we explored comparisons, floating-point math is less than exact because of the limited space used to store values. For example, the following should yield zero, but it does not. The result is close to zero, but there are not enough bits to be precise here:

>>> 0.1 + 0.1 + 0.1 - 0.3                         # Python 3.3

5.551115123125783e-17

On Pythons prior to 3.1 and 2.7, printing the result to produce the user-friendly display format doesn’t completely help either, because the hardware related to floating-point math is inherently limited in terms of accuracy (a.k.a. precision). In 3.3, the following print gives the same result as the previous output; older Pythons show fewer digits, as here:

>>> print(0.1 + 0.1 + 0.1 - 0.3)                  # Pythons < 2.7, 3.1

5.55111512313e-17

However, with decimals, the result can be dead-on:

>>> from decimal import Decimal

>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')

Decimal('0.0')

As shown here, we can make decimal objects by calling the Decimal constructor function in the decimal module and passing in strings that have the desired number of decimal digits for the resulting object (using the str function to convert floating-point values to strings if needed). When decimals of different precision are mixed in expressions, Python converts up to the largest number of decimal digits automatically:

>>> Decimal('0.1') + Decimal('0.10') + Decimal('0.10') - Decimal('0.30')

Decimal('0.00')

In Pythons 2.7, 3.1, and later, it’s also possible to create a decimal object from a floating-point object, with a call of the form decimal.Decimal.from_float(1.25), and recent Pythons allow floating-point numbers to be used directly. The conversion is exact but can sometimes yield a large default number of digits, unless they are fixed per the next section:

>>> Decimal(0.1) + Decimal(0.1) + Decimal(0.1) - Decimal(0.3)

Decimal('2.775557561565156540423631668E-17')
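The reason for all those digits is that a float passed to Decimal is converted exactly, inaccuracy included. The sketch below contrasts the two construction routes. It assumes a Python recent enough to accept a float directly (otherwise, Decimal.from_float plays the same role):

```python
from decimal import Decimal

from_string = Decimal('0.1')          # Exactly one-tenth
from_float  = Decimal(0.1)            # Exact value of the binary float nearest 0.1
via_str     = Decimal(str(0.1))       # str gives '0.1', so this is one-tenth again

float_is_exact_tenth = (from_float == from_string)    # False
str_is_exact_tenth   = (via_str == from_string)       # True
digits_in_float = len(from_float.as_tuple().digits)   # Many more than 1
```

In other words, when you want the value as typed, passing a string (or converting with str first) is usually the safer route.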

In Python 3.3 and later, the decimal module was also optimized to improve its performance radically: the reported speedup for the new version is 10X to 100X, depending on the type of program benchmarked.

Setting decimal precision globally

Other tools in the decimal module can be used to set the precision of all decimal numbers, arrange error handling, and more. For instance, a context object in this module allows for specifying precision (number of decimal digits) and rounding modes (down, ceiling, etc.). The precision is applied globally for all decimals created in the calling thread:

>>> import decimal

>>> decimal.Decimal(1) / decimal.Decimal(7)                     # Default: 28 digits

Decimal('0.1428571428571428571428571429')

>>> decimal.getcontext().prec = 4                               # Fixed precision

>>> decimal.Decimal(1) / decimal.Decimal(7)

Decimal('0.1429')

>>> Decimal(0.1) + Decimal(0.1) + Decimal(0.1) - Decimal(0.3)   # Closer to 0

Decimal('1.110E-17')

This is especially useful for monetary applications, where cents are represented as two decimal digits. Decimals are essentially an alternative to manual rounding and string formatting in this context:

>>> 1999 + 1.33      # This has more digits in memory than displayed in 3.3

2000.33

>>> 

>>> decimal.getcontext().prec = 2

>>> pay = decimal.Decimal(str(1999 + 1.33))

>>> pay

Decimal('2000.33')
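Two details are worth noting here: context precision counts total significant digits rather than digits after the decimal point, and it applies to the results of arithmetic, not to values as they are constructed (which is why pay above keeps all six of its digits). For rounding to cents specifically, the Decimal quantize method is one option. The following is a sketch for a fresh session with the default context; the price and tax rate are made-up values:

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext

CENT = Decimal('0.01')                     # Target exponent: two digits after the point

price = Decimal('19.99')
subtotal = price * 3                       # Decimal('59.97'), exact
tax = subtotal * Decimal('0.0825')         # Hypothetical 8.25% rate: 4.947525
tax_cents = tax.quantize(CENT, rounding=ROUND_HALF_UP)   # Decimal('4.95')
total = subtotal + tax_cents               # Decimal('64.92')

# By contrast, context precision counts significant digits in results,
# and does not affect values as they are constructed.
with localcontext() as ctx:
    ctx.prec = 2
    built = Decimal('2000.33')             # Construction keeps every digit
    computed = built + 0                   # Arithmetic rounds: Decimal('2.0E+3')
```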

Decimal context manager

In Python 2.6 and 3.0 and later, it’s also possible to reset precision temporarily by using the with context manager statement. The precision is reset to its original value on statement exit; in a new Python 3.3 session (per Chapter 3 the “...” here is Python’s interactive prompt for continuation lines in some interfaces and requires manual indentation; IDLE omits this prompt and indents for you):

C:\code> C:\Python33\python

>>> import decimal

>>> decimal.Decimal('1.00') / decimal.Decimal('3.00')

Decimal('0.3333333333333333333333333333')

>>> 

>>> with decimal.localcontext() as ctx:

...     ctx.prec = 2

...     decimal.Decimal('1.00') / decimal.Decimal('3.00')

...

Decimal('0.33')

>>> 

>>> decimal.Decimal('1.00') / decimal.Decimal('3.00')

Decimal('0.3333333333333333333333333333')

Though useful, this statement requires much more background knowledge than you’ve obtained at this point; watch for coverage of the with statement in Chapter 34.

Because use of the decimal type is still relatively rare in practice, I’ll defer to Python’s standard library manuals and interactive help for more details. And because decimals address some of the same floating-point accuracy issues as the fraction type, let’s move on to the next section to see how the two compare.

Fraction Type

Python 2.6 and 3.0 debuted a new numeric type, Fraction, which implements a rational number object. It essentially keeps both a numerator and a denominator explicitly, so as to avoid some of the inaccuracies and limitations of floating-point math. Like decimals, fractions do not map as closely to computer hardware as floating-point numbers. This means their performance may not be as good, but it also allows them to provide extra utility in a standard tool where required or useful.

Fraction basics

Fraction is a functional cousin to the Decimal fixed-precision type described in the prior section, as both can be used to address the floating-point type’s numerical inaccuracies. It’s also used in similar ways—like Decimal, Fraction resides in a module; import its constructor and pass in a numerator and a denominator to make one (among other schemes). The following interaction shows how:

>>> from fractions import Fraction

>>> x = Fraction(1, 3)                    # Numerator, denominator

>>> y = Fraction(4, 6)                    # Simplified to 2, 3 by gcd

>>> x

Fraction(1, 3)

>>> y

Fraction(2, 3)

>>> print(y)

2/3

Once created, Fractions can be used in mathematical expressions as usual:

>>> x + y

Fraction(1, 1)

>>> x - y                           # Results are exact: numerator, denominator

Fraction(-1, 3)

>>> x * y

Fraction(2, 9)

Fraction objects can also be created from floating-point number strings, much like decimals:

>>> Fraction('.25')

Fraction(1, 4)

>>> Fraction('1.25')

Fraction(5, 4)

>>> 

>>> Fraction('.25') + Fraction('1.25')

Fraction(3, 2)

Numeric accuracy in fractions and decimals

Notice that this is different from floating-point-type math, which is constrained by the underlying limitations of floating-point hardware. To compare, here are the same operations run with floating-point objects, and notes on their limited accuracy—they may display fewer digits in recent Pythons than they used to, but they still aren’t exact values in memory:

>>> a = 1 / 3.0                     # Only as accurate as floating-point hardware

>>> b = 4 / 6.0                     # Can lose precision over many calculations

>>> a

0.3333333333333333

>>> b

0.6666666666666666

>>> a + b

1.0

>>> a - b

-0.3333333333333333

>>> a * b

0.2222222222222222

This floating-point limitation is especially apparent for values that cannot be represented accurately given their limited number of bits in memory. Both Fraction and Decimal provide ways to get exact results, albeit at the cost of some speed and code verbosity. For instance, in the following example (repeated from the prior section), floating-point numbers do not accurately give the zero answer expected, but both of the other types do:

>>> 0.1 + 0.1 + 0.1 - 0.3           # This should be zero (close, but not exact)

5.551115123125783e-17

>>> from fractions import Fraction

>>> Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10) - Fraction(3, 10)

Fraction(0, 1)

>>> from decimal import Decimal

>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')

Decimal('0.0')
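In program code, the same comparison might look like the sketch below. The variable names and the 1e-9 tolerance are illustrative assumptions, not fixed rules; the point is that only the float version needs a tolerance test at all. The last line also shows that a fraction built from the float 0.1 inherits that float's inexactness, which is why the exact versions start from integers or strings.

```python
from decimal import Decimal
from fractions import Fraction

float_result = 0.1 + 0.1 + 0.1 - 0.3
fraction_result = Fraction(1, 10) * 3 - Fraction(3, 10)
decimal_result = Decimal('0.1') * 3 - Decimal('0.3')

print(float_result == 0)                  # False: tiny residue remains
print(fraction_result == 0)               # True: exact rational math
print(decimal_result == 0)                # True: exact decimal math

# Floats are usually compared with a tolerance instead (value is arbitrary)
print(abs(float_result) < 1e-9)           # True

# A Fraction made from the float 0.1 is not exactly 1/10
print(Fraction.from_float(0.1) == Fraction(1, 10))    # False
```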

Moreover, fractions and decimals both allow more intuitive and accurate results than floating points sometimes can, in different ways—by using rational representation and by limiting precision:

>>> 1 / 3                           # Use a ".0" in Python 2.X for true "/"

0.3333333333333333

>>> Fraction(1, 3)                  # Numeric accuracy, two ways

Fraction(1, 3)

>>> import decimal

>>> decimal.getcontext().prec = 2

>>> Decimal(1) / Decimal(3)

Decimal('0.33')

In fact, fractions both retain accuracy and automatically simplify results. Continuing the preceding interaction:

>>> (1 / 3) + (6 / 12)              # Use a ".0" in Python 2.X for true "/"

0.8333333333333333

>>> Fraction(6, 12)                 # Automatically simplified

Fraction(1, 2)

>>> Fraction(1, 3) + Fraction(6, 12)

Fraction(5, 6)

>>> decimal.Decimal(str(1/3)) + decimal.Decimal(str(6/12))

Decimal('0.83')

>>> 1000.0 / 1234567890

8.100000073710001e-07

>>> Fraction(1000, 1234567890)      # Substantially simpler!

Fraction(100, 123456789)

Fraction conversions and mixed types

To support fraction conversions, floating-point objects now have a method that yields their numerator and denominator ratio, fractions have a from_float method, and float accepts a Fraction as an argument. Trace through the following interaction to see how this pans out (the * in the second test is special syntax that expands a tuple into individual arguments; more on this when we study function argument passing in Chapter 18):

>>> (2.5).as_integer_ratio()               # float object method

(5, 2)

>>> f = 2.5

>>> z = Fraction(*f.as_integer_ratio())    # Convert float -> fraction: two args

>>> z                                      # Same as Fraction(5, 2)

Fraction(5, 2)

>>> x                                      # x from prior interaction

Fraction(1, 3)

>>> x + z

Fraction(17, 6)                            # 5/2 + 1/3 = 15/6 + 2/6

>>> float(x)                               # Convert fraction -> float

0.3333333333333333

>>> float(z)

2.5

>>> float(x + z)

2.8333333333333335

>>> 17 / 6

2.8333333333333335

>>> Fraction.from_float(1.75)              # Convert float -> fraction: other way

Fraction(7, 4)

>>> Fraction(*(1.75).as_integer_ratio())

Fraction(7, 4)
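To confirm that the two float-to-fraction routes agree, and that converting back to float recovers the original value, we might run a check like the following. This is a minimal sketch; the sample values are arbitrary.

```python
from fractions import Fraction

f = 1.75
via_ratio = Fraction(*f.as_integer_ratio())     # Unpack (7, 4) into two args
via_classmethod = Fraction.from_float(f)        # Same result, one call
print(via_ratio == via_classmethod)             # True

# A fraction made from a float holds that float's exact stored value,
# so converting back with float() yields the float we started with
samples = (2.5, 0.1, 1 / 3.0)
round_trips = [float(Fraction.from_float(v)) == v for v in samples]
print(round_trips)                              # [True, True, True]
```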

Finally, some type mixing is allowed in expressions, though Fraction must sometimes be manually propagated to retain accuracy. Study the following interaction to see how this works:

>>> x

Fraction(1, 3)

>>> x + 2                                  # Fraction + int -> Fraction

Fraction(7, 3)

>>> x + 2.0                                # Fraction + float -> float

2.3333333333333335

>>> x + (1./3)                             # Fraction + float -> float

0.6666666666666666

>>> x + (4./3)

1.6666666666666665

>>> x + Fraction(4, 3)                     # Fraction + Fraction -> Fraction

Fraction(5, 3)
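The propagation rule can be checked directly by inspecting result types. In this sketch (names are ours), a single float operand is enough to turn the whole result into a float, while integers and other fractions keep it exact.

```python
from fractions import Fraction

x = Fraction(1, 3)

print(type(x + 2).__name__)          # Fraction: ints mix exactly
print(type(x + 2.0).__name__)        # float: accuracy is lost here

exact = x + Fraction(4, 3)           # Keep every operand a Fraction
approximate = x + (4.0 / 3)          # One float operand makes a float

print(exact)                         # 5/3
print(isinstance(approximate, float))
```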

Caveat: although you can convert from floating point to fraction, in some cases there is an unavoidable precision loss when you do so, because the number is inaccurate in its original floating-point form. When needed, you can simplify such results by limiting the maximum denominator value:

>>> 4.0 / 3

1.3333333333333333

>>> (4.0 / 3).as_integer_ratio()                # Precision loss from float

(6004799503160661, 4503599627370496)

>>> x

Fraction(1, 3)

>>> a = x + Fraction(*(4.0 / 3).as_integer_ratio())

>>> a

Fraction(22517998136852479, 13510798882111488)

>>> 22517998136852479 / 13510798882111488.      # 5 / 3 (or close to it!)

1.6666666666666667

>>> a.limit_denominator(10)                     # Simplify to closest fraction

Fraction(5, 3)
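Here is the same idea in script form, as a sketch. The limit_denominator method takes an optional maximum denominator, which defaults to 1000000 in the standard library; the pi approximation mirrors an example in Python's own library manual.

```python
from fractions import Fraction

inexact = Fraction(*(4.0 / 3).as_integer_ratio())
print(inexact == Fraction(4, 3))            # False: float error carried over

print(inexact.limit_denominator(10))        # 4/3
print(inexact.limit_denominator())          # 4/3 (default limit is 1000000)

# The same call finds compact rational approximations of other values
approx_pi = Fraction('3.1415926535897932').limit_denominator(1000)
print(approx_pi)                            # 355/113
```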

For more details on the Fraction type, experiment further on your own and consult the Python 2.6, 2.7, and 3.X library manuals and other documentation.

Sets

Besides decimals, Python 2.4 also introduced a new collection type, the set—an unordered collection of unique and immutable objects that supports operations corresponding to mathematical set theory. By definition, an item appears only once in a set, no matter how many times it is added. Accordingly, sets have a variety of applications, especially in numeric and database-focused work.

Because sets are collections of other objects, they share some behavior with objects such as lists and dictionaries that are outside the scope of this chapter. For example, sets are iterable, can grow and shrink on demand, and may contain a variety of object types. As we’ll see, a set acts much like the keys of a valueless dictionary, but it supports extra operations.

However, because sets are unordered and do not map keys to values, they are neither sequence nor mapping types; they are a type category unto themselves. Moreover, because sets are fundamentally mathematical in nature (and for many readers, may seem more academic and be used much less often than more pervasive objects like dictionaries), we’ll explore the basic utility of Python’s set objects here.

Set basics in Python 2.6 and earlier

There are a few ways to make sets today, depending on which Python you use. Since this book covers all, let’s begin with the case for 2.6 and earlier, which also is available (and sometimes still required) in later Pythons; we’ll refine this for 2.7 and 3.X extensions in a moment. To make a set object, pass in a sequence or other iterable object to the built-in set function:

>>> x = set('abcde')

>>> y = set('bdxyz')

You get back a set object, which contains all the items in the object passed in (notice that sets do not have a positional ordering, and so are not sequences—their order is arbitrary and may vary per Python release):

>>> x

set(['a', 'c', 'b', 'e', 'd'])                    # Pythons <= 2.6 display format

Sets made this way support the common mathematical set operations with expression operators. Note that we can’t perform the following operations on plain sequences like strings, lists, and tuples—we must create sets from them by passing them to set in order to apply these tools:

>>> x - y                                         # Difference

set(['a', 'c', 'e'])

>>> x | y                                         # Union

set(['a', 'c', 'b', 'e', 'd', 'y', 'x', 'z'])

>>> x & y                                         # Intersection

set(['b', 'd'])

>>> x ^ y                                         # Symmetric difference (XOR)

set(['a', 'c', 'e', 'y', 'x', 'z'])

>>> x > y, x < y                                  # Superset, subset

(False, False)

The notable exception to this rule is the in set membership test—this expression is also defined to work on all other collection types, where it also performs membership (or a search, if you prefer to think in procedural terms). Hence, we do not need to convert things like strings and lists to sets to run this test:

>>> 'e' in x                                      # Membership (sets)

True

>>> 'e' in 'Camelot', 22 in [11, 22, 33]          # But works on other types too

(True, True)

In addition to expressions, the set object provides methods that correspond to these operations and more, and that support set changes—the set add method inserts one item, update is an in-place union, and remove deletes an item by value (run a dir call on any set instance or the set type name to see all the available methods). Assuming x and y are still as they were in the prior interaction:

>>> z = x.intersection(y)                         # Same as x & y

>>> z

set(['b', 'd'])

>>> z.add('SPAM')                                 # Insert one item

>>> z

set(['b', 'd', 'SPAM'])

>>> z.update(set(['X', 'Y']))                     # Merge: in-place union

>>> z

set(['Y', 'X', 'b', 'd', 'SPAM'])

>>> z.remove('b')                                 # Delete one item

>>> z

set(['Y', 'X', 'd', 'SPAM'])

As iterable containers, sets can also be used in operations such as len, for loops, and list comprehensions. Because they are unordered, though, they don’t support sequence operations like indexing and slicing:

>>> for item in set('abc'): print(item * 3)

aaa

ccc

bbb

Finally, although the set expressions shown earlier generally require two sets, their method-based counterparts can often work with any iterable type as well:

>>> S = set([1, 2, 3])

>>> S | set([3, 4])          # Expressions require both to be sets

set([1, 2, 3, 4])

>>> S | [3, 4]

TypeError: unsupported operand type(s) for |: 'set' and 'list'

>>> S.union([3, 4])          # But their methods allow any iterable

set([1, 2, 3, 4])

>>> S.intersection((1, 3, 5))

set([1, 3])

>>> S.issubset(range(-5, 5))

True

For more details on set operations, see Python’s library reference manual or a reference book. Although set operations can be coded manually in Python with other types, like lists and dictionaries (and often were in the past), Python’s built-in sets use efficient algorithms and implementation techniques to provide quick and standard operation.
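As a quick cross-check of the operators and methods covered in this section, the following sketch pairs each expression with its method counterpart. The pairs list is just our own scaffolding for the comparison.

```python
x = set('abcde')
y = set('bdxyz')

# Each operator has an equivalent method
pairs = [
    (x - y, x.difference(y)),
    (x | y, x.union(y)),
    (x & y, x.intersection(y)),
    (x ^ y, x.symmetric_difference(y)),
]
all_match = all(by_operator == by_method for (by_operator, by_method) in pairs)
print(all_match)                            # True

# Methods also accept plain iterables such as strings
print(sorted(x.difference('bdxyz')))        # ['a', 'c', 'e']

# > and < are strict tests; >= and <= allow equal sets
print(x >= x)                               # True
print(x > x)                                # False
```

Results are sorted before printing here only to make the display order predictable; the sets themselves remain unordered.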

Set literals in Python 3.X and 2.7

If you think sets are “cool,” they eventually became noticeably cooler, with new syntax for set literals and comprehensions initially added in the Python 3.X line only, but back-ported to Python 2.7 by popular demand. In these Pythons we can still use the set built-in to make set objects, but also a new set literal form, using the curly braces formerly reserved for dictionaries. In 3.X and 2.7, the following are equivalent:

set([1, 2, 3, 4])                # Built-in call (all)

{1, 2, 3, 4}                     # Newer set literals (2.7, 3.X)

This syntax makes sense, given that sets are essentially like valueless dictionaries: because a set’s items are unordered, unique, and immutable, the items behave much like a dictionary’s keys. This operational similarity is even more striking given that dictionary key lists in 3.X are view objects, which support set-like behavior such as intersections and unions (see Chapter 8 for more on dictionary view objects).

Regardless of how a set is made, 3.X displays it using the new literal format. Python 2.7 accepts the new literal syntax, but still displays sets using the 2.6 display form of the prior section. In all Pythons, the set built-in is still required to create empty sets and to build sets from existing iterable objects (short of using set comprehensions, discussed later in this chapter), but the new literal is convenient for initializing sets of known structure.

Here’s what sets look like in 3.X; it’s the same in 2.7, except that set results display with 2.X’s set([...]) notation, and item order may vary per version (which by definition is irrelevant in sets anyhow):

C:\code> c:\python33\python

>>> set([1, 2, 3, 4])            # Built-in: same as in 2.6

{1, 2, 3, 4}

>>> set('spam')                  # Add all items in an iterable

{'s', 'a', 'p', 'm'}

>>> {1, 2, 3, 4}                 # Set literals: new in 3.X (and 2.7)

{1, 2, 3, 4}

>>> S = {'s', 'p', 'a', 'm'}

>>> S

{'s', 'a', 'p', 'm'}

>>> S.add('alot')                # Methods work as before

>>> S

{'s', 'a', 'p', 'alot', 'm'}

All the set processing operations discussed in the prior section work the same in 3.X, but the result sets print differently:

>>> S1 = {1, 2, 3, 4}

>>> S1 & {1, 3}                  # Intersection

{1, 3}

>>> {1, 5, 3, 6} | S1            # Union

{1, 2, 3, 4, 5, 6}

>>> S1 - {1, 3, 4}               # Difference

{2}

>>> S1 > {1, 3}                  # Superset

True

Note that {} is still a dictionary in all Pythons. Empty sets must be created with the set built-in, and print the same way:

>>> S1 - {1, 2, 3, 4}            # Empty sets print differently

set()

>>> type({})                     # Because {} is an empty dictionary

<class 'dict'>

>>> S = set()                    # Initialize an empty set

>>> S.add(1.23)

>>> S

{1.23}
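The empty-braces rule is easy to trip over, so here is a short sketch of the distinction, along with how duplicates in a literal collapse. It requires 2.7 or 3.X for the literal syntax; the variable names are illustrative.

```python
empty_dict = {}                      # Braces alone always mean a dictionary
empty_set = set()                    # The only way to spell an empty set

print(type(empty_dict).__name__)     # dict
print(type(empty_set).__name__)      # set

# Duplicates in a literal are stored just once
print({1, 2, 2, 3, 3} == set([1, 2, 3]))    # True

# Items that compare equal count as the same item, whatever their type
print(len({1, 1.0, True}))           # 1
```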

As in Python 2.6 and earlier, sets created with 3.X/2.7 literals support the same methods, some of which allow general iterable operands that expressions do not:

>>> {1, 2, 3} | {3, 4}

{1, 2, 3, 4}

>>> {1, 2, 3} | [3, 4]

TypeError: unsupported operand type(s) for |: 'set' and 'list'

>>> {1, 2, 3}.union([3, 4])

{1, 2, 3, 4}

>>> {1, 2, 3}.union({3, 4})

{1, 2, 3, 4}

>>> {1, 2, 3}.union(set([3, 4]))

{1, 2, 3, 4}

>>> {1, 2, 3}.intersection((1, 3, 5))

{1, 3}

>>> {1, 2, 3}.issubset(range(-5, 5))

True

Immutable constraints and frozen sets

Sets are powerful and flexible objects, but they do have one constraint in both 3.X and 2.X that you should keep in mind—largely because of their implementation, sets can only contain immutable (a.k.a. “hashable”) object types. Hence, lists and dictionaries cannot be embedded in sets, but tuples can if you need to store compound values. Tuples compare by their full values when used in set operations:

>>> S

{1.23}

>>> S.add([1, 2, 3])                   # Only immutable objects work in a set

TypeError: unhashable type: 'list'

>>> S.add({'a':1})

TypeError: unhashable type: 'dict'

>>> S.add((1, 2, 3))

>>> S                                  # No list or dict, but tuple OK

{1.23, (1, 2, 3)}

>>> S | {(4, 5, 6), (1, 2, 3)}         # Union: same as S.union(...)

{1.23, (4, 5, 6), (1, 2, 3)}

>>> (1, 2, 3) in S                     # Membership: by complete values

True

>>> (1, 4, 3) in S

False

Tuples in a set, for instance, might be used to represent dates, records, IP addresses, and so on (more on tuples later in this part of the book). Sets may also contain modules, type objects, and more. Sets themselves are mutable too, and so cannot be nested in other sets directly; if you need to store a set inside another set, the frozenset built-in call works just like set but creates an immutable set that cannot change and thus can be embedded in other sets.
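To make the frozenset point concrete, the following sketch nests immutable sets inside a normal one. It uses 3.X/2.7 literals, and the names inner and outer are ours.

```python
inner = frozenset([1, 2])
outer = {inner, frozenset([3])}             # Frozen sets can be set items

print(frozenset([2, 1]) in outer)           # True: compared by contents

try:
    {set([1, 2])}                           # A normal set cannot be nested
except TypeError as exc:
    print('TypeError: %s' % exc)

print(hasattr(inner, 'add'))                # False: no in-place changes
print(inner | {3} == frozenset([1, 2, 3]))  # True: new sets can still be made
```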

Set comprehensions in Python 3.X and 2.7

In addition to literals, Python 3.X grew a set comprehension construct that was back-ported to Python 2.7 too. As with the 3.X set literal, 2.7 accepts its syntax, but displays its results in 2.X set notation. The set comprehension expression is similar in form to the list comprehension we previewed in Chapter 4, but is coded in curly braces instead of square brackets and run to make a set instead of a list. Set comprehensions run a loop and collect the result of an expression on each iteration; a loop variable gives access to the current iteration value for use in the collection expression. The result is a new set you create by running the code, with all the normal set behavior. Here is a set comprehension in 3.3 (again, result display and order differ in 2.7):

>>> {x ** 2 for x in [1, 2, 3, 4]}         # 3.X/2.7 set comprehension

{16, 1, 4, 9}

In this expression, the loop is coded on the right, and the collection expression is coded on the left (x ** 2). As for list comprehensions, we get back pretty much what this expression says: “Give me a new set containing X squared, for every X in a list.” Comprehensions can also iterate across other kinds of objects, such as strings (the first of the following examples illustrates the comprehension-based way to make a set from an existing iterable):

>>> {x for x in 'spam'}                    # Same as: set('spam')

{'m', 's', 'p', 'a'}

>>> {c * 4 for c in 'spam'}                # Set of collected expression results

{'pppp', 'aaaa', 'ssss', 'mmmm'}

>>> {c * 4 for c in 'spamham'}

{'pppp', 'aaaa', 'hhhh', 'ssss', 'mmmm'}

>>> S = {c * 4 for c in 'spam'}

>>> S | {'mmmm', 'xxxx'}

{'pppp', 'xxxx', 'mmmm', 'aaaa', 'ssss'}

>>> S & {'mmmm', 'xxxx'}

{'mmmm'}
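If the comprehension form seems opaque, it may help to see it beside an equivalent loop. This is a rough sketch of the correspondence, not a statement of how Python implements comprehensions internally.

```python
squares = {x ** 2 for x in [1, 2, 3, 4]}    # Comprehension form

manual = set()                              # Roughly equivalent loop
for x in [1, 2, 3, 4]:
    manual.add(x ** 2)

print(squares == manual)                    # True

# Repeated results are stored once: 'a' and 'm' each occur twice in the input
print(len({c * 4 for c in 'spamham'}))      # 5
```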

Because the rest of the comprehensions story relies upon underlying concepts we’re not yet prepared to address, we’ll postpone further details until later in this book. In Chapter 8, we’ll meet a first cousin in 3.X and 2.7, the dictionary comprehension, and I’ll have much more to say about all comprehensions—list, set, dictionary, and generator—later on, especially in Chapter 14 and Chapter 20. As we’ll learn there, all comprehensions support additional syntax not shown here, including nested loops and if tests, which can be challenging to understand until you’ve had a chance to study larger statements.

Why sets?

Set operations have a variety of common uses, some more practical than mathematical. For example, because items are stored only once in a set, sets can be used to filter duplicates out of other collections, though items may be reordered in the process because sets are unordered in general. Simply convert the collection to a set, and then convert it back again (sets work in the list call here because they are iterable, another technical artifact that we’ll unearth later):

>>> L = [1, 2, 1, 3, 2, 4, 5]

>>> set(L)

{1, 2, 3, 4, 5}

>>> L = list(set(L))                                  # Remove duplicates

>>> L

[1, 2, 3, 4, 5]

>>> list(set(['yy', 'cc', 'aa', 'xx', 'dd', 'aa']))   # But order may change

['cc', 'xx', 'yy', 'dd', 'aa']
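When the original order matters, one common alternative is to keep a set only for the membership test and build the result in a list. The function below is a minimal sketch of that approach; unique_in_order is a name we made up, not a built-in.

```python
def unique_in_order(items):
    """Return items with duplicates dropped, keeping first occurrences."""
    seen = set()                 # Fast membership tests
    result = []                  # Preserves the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order(['yy', 'cc', 'aa', 'xx', 'dd', 'aa']))
# ['yy', 'cc', 'aa', 'xx', 'dd']
```

Like sets themselves, this works only for hashable items.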

Sets can be used to isolate differences in lists, strings, and other iterable objects too: simply convert to sets and take the difference, though again the unordered nature of sets means that the order of the results may not match that of the originals. The last two of the following compare attribute lists of string object types in 3.X (results vary in 2.7):

>>> set([1, 3, 5, 7]) - set([1, 2, 4, 5, 6])          # Find list differences

{3, 7}

>>> set('abcdefg') - set('abdghij')                   # Find string differences

{'c', 'e', 'f'}

>>> set('spam') - set(['h', 'a', 'm'])                # Find differences, mixed

{'p', 's'}

>>> set(dir(bytes)) - set(dir(bytearray))             # In bytes but not bytearray

{'__getnewargs__'}

>>> set(dir(bytearray)) - set(dir(bytes))

{'append', 'copy', '__alloc__', '__imul__', 'remove', 'pop', 'insert', ...more...}

You can also use sets to perform order-neutral equality tests by converting to a set before the test, because order doesn’t matter in a set. More formally, two sets are equal if and only if every element of each set is contained in the other—that is, each is a subset of the other, regardless of order. For instance, you might use this to compare the outputs of programs that should work the same but may generate results in different order. Sorting before testing has the same effect for equality, but sets don’t rely on an expensive sort, and sorts order their results to support additional magnitude tests that sets do not (greater, less, and so on):

>>> L1, L2 = [1, 3, 5, 2, 4], [2, 5, 3, 4, 1]

>>> L1 == L2                                          # Order matters in sequences

False

>>> set(L1) == set(L2)                                # Order-neutral equality

True

>>> sorted(L1) == sorted(L2)                          # Similar but results ordered

True

>>> 'spam' == 'asmp', set('spam') == set('asmp'), sorted('spam') == sorted('asmp')

(False, True, True)
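One difference between the two approaches is worth keeping in mind: because sets discard duplicates, a set-based test ignores how many times each item appears, while a sort-based test does not. The sketch below uses lists of our own choosing to show the two tests disagreeing.

```python
L1, L2 = [1, 2, 2, 3], [3, 2, 1, 1]

print(set(L1) == set(L2))           # True: both reduce to {1, 2, 3}
print(sorted(L1) == sorted(L2))     # False: [1, 2, 2, 3] vs [1, 1, 2, 3]
```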

Sets can also be used to keep track of where you’ve already been when traversing a graph or other cyclic structure. For example, the transitive module reloader and inheritance tree lister examples we’ll study in Chapter 25 and Chapter 31, respectively, must keep track of items visited to avoid loops, as Chapter 19 discusses in the abstract. Using a list in this context is inefficient because searches require linear scans. Although recording states visited as keys in a dictionary is efficient, sets offer an alternative that’s essentially equivalent (and may be more or less intuitive, depending on whom you ask).
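As a preview of the visited-set technique, here is a minimal sketch that collects every node reachable from a starting point in a graph containing cycles. The reachable function and the dictionary-of-lists graph format are assumptions made for this illustration; they are not the reloader or tree lister examples coded later in the book.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start, tolerating cycles."""
    visited = set()
    pending = [start]                       # Nodes still to be explored
    while pending:
        node = pending.pop()
        if node in visited:                 # Already seen: avoid looping
            continue
        visited.add(node)
        pending.extend(graph.get(node, ()))
    return visited

# A -> B -> A forms a cycle; E points in but is never reached from A
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A'], 'D': ['B'], 'E': ['A']}
print(sorted(reachable(graph, 'A')))        # ['A', 'B', 'C', 'D']
```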

Finally, sets are also convenient when you’re dealing with large data sets (database query results, for example)—the intersection of two sets contains objects common to both categories, and the union contains all items in either set. To illustrate, here’s a somewhat more realistic example of set operations at work, applied to lists of people in a hypothetical company, using 3.X/2.7 set literals and 3.X result displays (use set in 2.6 and earlier):

>>> engineers = {'bob', 'sue', 'ann', 'vic'}

>>> managers  = {'tom', 'sue'}

>>> 'bob' in engineers                   # Is bob an engineer?

True

>>> engineers & managers                 # Who is both engineer and manager?

{'sue'}

>>> engineers | managers                 # All people in either category

{'bob', 'tom', 'sue', 'vic', 'ann'}

>>> engineers - managers                 # Engineers who are not managers

{'vic', 'ann', 'bob'}

>>> managers - engineers                 # Managers who are not engineers

{'tom'}

>>> engineers > managers                 # Are all managers engineers? (superset)

False

>>> {'bob', 'sue'} < engineers           # Are both engineers? (subset)

True

>>> (managers | engineers) > managers    # All people is a superset of managers

True

>>> managers ^ engineers                 # Who is in one but not both?

{'tom', 'vic', 'ann', 'bob'}

>>> (managers | engineers) - (managers ^ engineers)     # Intersection!

{'sue'}

You can find more details on set operations in the Python library manual and some mathematical and relational database theory texts. Also stay tuned for Chapter 8’s revival of some of the set operations we’ve seen here, in the context of dictionary view objects in Python 3.X.

Booleans

Some may argue that the Python Boolean type, bool, is numeric in nature because its two values, True and False, are just customized versions of the integers 1 and 0 that print themselves differently. Although that’s all most programmers need to know, let’s explore this type in a bit more detail.

More formally, Python today has an explicit Boolean data type called bool, with the values True and False available as preassigned built-in names. Internally, the names True and False are instances of bool, which is in turn just a subclass (in the object-oriented sense) of the built-in integer type int. True and False behave exactly like the integers 1 and 0, except that they have customized printing logic—they print themselves as the words True and False, instead of the digits 1 and 0. bool accomplishes this by redefining str and repr string formats for its two objects.

Because of this customization, the output of Boolean expressions typed at the interactive prompt prints as the words True and False instead of the older and less obvious 1 and 0. In addition, Booleans make truth values more explicit in your code. For instance, an infinite loop can now be coded as while True: instead of the less intuitive while 1:. Similarly, flags can be initialized more clearly with flag = False. We’ll discuss these statements further in Part III.

Again, though, for most practical purposes, you can treat True and False as though they are predefined variables set to integers 1 and 0. Most programmers had been preassigning True and False to 1 and 0 anyway; the bool type simply makes this standard. Its implementation can lead to curious results, though. Because True is just the integer 1 with a custom display format, True + 4 yields integer 5 in Python!

>>> type(True)

<class 'bool'>

>>> isinstance(True, int)

True

>>> True == 1                # Same value

True

>>> True is 1                # But a different object: see the next chapter

False

>>> True or False            # Same as: 1 or 0

True

>>> True + 4                 # (Hmmm)

5

Since you probably won’t come across an expression like the last of these in real Python code, you can safely ignore any of its deeper metaphysical implications.
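That said, the integer nature of Booleans does occasionally surface in practical code, most often when counting how many tests succeed. The following sketch uses sample values of our own to show this, along with the customized display described above.

```python
print(repr(True))                        # 'True' display, not '1'
print(int(True))                         # 1: the underlying integer value
print(issubclass(bool, int))             # True

# Summing Booleans counts the tests that passed
values = [3, -1, 0, 8, -5]
positives = sum(v > 0 for v in values)
print(positives)                         # 2

# A Boolean can even serve as an index, though this is rarely wise
print(['no', 'yes'][True])               # yes
```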

We’ll revisit Booleans in Chapter 9 to define Python’s notion of truth, and again in Chapter 12 to see how Boolean operators like and and or work.

Numeric Extensions

Finally, although Python core numeric types offer plenty of power for most applications, there is a large library of third-party open source extensions available to address more focused needs. Because numeric programming is a popular domain for Python, you’ll find a wealth of advanced tools.

For example, if you need to do serious number crunching, an optional extension for Python called NumPy (Numeric Python) provides advanced numeric programming tools, such as a matrix data type, vector processing, and sophisticated computation libraries. Hardcore scientific programming groups at places like Los Alamos and NASA use Python with NumPy to implement the sorts of tasks they previously coded in C++, FORTRAN, or Matlab. The combination of Python and NumPy is often compared to a free, more flexible version of Matlab—you get NumPy’s performance, plus the Python language and its libraries.

Because it’s so advanced, we won’t talk further about NumPy in this book. You can find additional support for advanced numeric programming in Python, including graphics and plotting tools, extended precision floats, statistics libraries, and the popular SciPy package by searching the Web. Also note that NumPy is currently an optional extension; it doesn’t come with Python and must be installed separately, though you’ll probably want to do so if you care enough about this domain to look it up on the Web.

Chapter Summary

This chapter has taken a tour of Python’s numeric object types and the operations we can apply to them. Along the way, we met the standard integer and floating-point types, as well as some more exotic and less commonly used types such as complex numbers, decimals, fractions, and sets. We also explored Python’s expression syntax, type conversions, bitwise operations, and various literal forms for coding numbers in scripts.

Later in this part of the book, we’ll continue our in-depth type tour by filling in some details about the next object type—the string. In the next chapter, however, we’ll take some time to explore the mechanics of variable assignment in more detail than we have here. This turns out to be perhaps the most fundamental idea in Python, so make sure you check out the next chapter before moving on. First, though, it’s time to take the usual chapter quiz.

Test Your Knowledge: Quiz

1.    What is the value of the expression 2 * (3 + 4) in Python?

2.    What is the value of the expression 2 * 3 + 4 in Python?

3.    What is the value of the expression 2 + 3 * 4 in Python?

4.    What tools can you use to find a number’s square root, as well as its square?

5.    What is the type of the result of the expression 1 + 2.0 + 3?

6.    How can you truncate and round a floating-point number?

7.    How can you convert an integer to a floating-point number?

8.    How would you display an integer in octal, hexadecimal, or binary notation?

9.    How might you convert an octal, hexadecimal, or binary string to a plain integer?

Test Your Knowledge: Answers

1.    The value will be 14, the result of 2 * 7, because the parentheses force the addition to happen before the multiplication.

2.    The value will be 10, the result of 6 + 4. Python’s operator precedence rules are applied in the absence of parentheses, and multiplication has higher precedence than (i.e., happens before) addition, per Table 5-2.

3.    This expression yields 14, the result of 2 + 12, for the same precedence reasons as in the prior question.

4.    Functions for obtaining the square root, as well as pi, tangents, and more, are available in the imported math module. To find a number’s square root, import math and call math.sqrt(N). To get a number’s square, use either the exponent expression X ** 2 or the built-in function pow(X, 2). Either of these last two can also compute the square root when given a power of 0.5 (e.g., X ** .5).

5.    The result will be a floating-point number: the integers are converted up to floating point, the most complex type in the expression, and floating-point math is used to evaluate it.

6.    The int(N) and math.trunc(N) functions truncate, and the round(N, digits) function rounds. We can also compute the floor with math.floor(N) and round for display with string formatting operations.

7.    The float(I) function converts an integer to a floating point; mixing an integer with a floating point within an expression will result in a conversion as well. In some sense, Python 3.X / division converts too—it always returns a floating-point result that includes the remainder, even if both operands are integers.

8.    The oct(I) and hex(I) built-in functions return the octal and hexadecimal string forms for an integer. The bin(I) call also returns a number’s binary digits string in Pythons 2.6, 3.0, and later. The % string formatting expression and format string method also provide targets for some such conversions.

9.    The int(S, base) function can be used to convert from octal, hexadecimal, and binary strings to normal integers (pass in 8, 16, or 2 for the base). The eval(S) function can be used for this purpose too, but it’s more expensive to run and can have security issues. Note that integers are always stored in binary form in computer memory; these are just display string format conversions.
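To verify these answers yourself, you might run a script like the following sketch. The variable names are ours; the displays noted in comments are for Python 3.X (in 2.X, oct returns '0100' and math.floor returns a float).

```python
import math

precedence = (2 * (3 + 4), 2 * 3 + 4, 2 + 3 * 4)
print(precedence)                            # (14, 10, 14)

roots = (math.sqrt(144), 144 ** .5, pow(12, 2))
print(roots)                                 # (12.0, 12.0, 144)

print(type(1 + 2.0 + 3).__name__)            # float

truncated = (int(2.567), math.trunc(2.567))
print(truncated)                             # (2, 2)
print(round(2.567, 2))                       # 2.57
print(math.floor(-2.567))                    # -3 in 3.X

print(float(3))                              # 3.0

as_strings = (oct(64), hex(64), bin(64))
print(as_strings)                            # 3.X: ('0o100', '0x40', '0b1000000')

as_integers = (int('100', 8), int('40', 16), int('1000000', 2))
print(as_integers)                           # (64, 64, 64)
```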