Learning Python (2013)

Part VI. Classes and OOP

Chapter 31. Designing with Classes

So far in this part of the book, we’ve concentrated on using Python’s OOP tool, the class. But OOP is also about design issues—that is, how to use classes to model useful objects. This chapter will touch on a few core OOP ideas and present some additional examples that are more realistic than many shown so far.

Along the way, we’ll code some common OOP design patterns in Python, such as inheritance, composition, delegation, and factories. We’ll also investigate some design-focused class concepts, such as pseudoprivate attributes, multiple inheritance, and bound methods.

One note up front: some of the design terms mentioned here require more explanation than I can provide in this book. If this material sparks your curiosity, I suggest exploring a text on OOP design or design patterns as a next step. As we’ll see, the good news is that Python makes many traditional design patterns trivial.

Python and OOP

Let’s begin with a review—Python’s implementation of OOP can be summarized by three ideas:

Inheritance

Inheritance is based on attribute lookup in Python (in X.name expressions).

Polymorphism

In X.method, the meaning of method depends on the type (class) of subject object X.

Encapsulation

Methods and operators implement behavior, though data hiding is a convention by default.

By now, you should have a good feel for what inheritance is all about in Python. We’ve also talked about Python’s polymorphism a few times already; it flows from Python’s lack of type declarations. Because attributes are always resolved at runtime, objects that implement the same interfaces are automatically interchangeable; clients don’t need to know what sorts of objects are implementing the methods they call.

Encapsulation means packaging in Python—that is, hiding implementation details behind an object’s interface. It does not mean enforced privacy, though that can be implemented with code, as we’ll see in Chapter 39. Encapsulation is available and useful in Python nonetheless: it allows the implementation of an object’s interface to be changed without impacting the users of that object.

Polymorphism Means Interfaces, Not Call Signatures

Some OOP languages also define polymorphism to mean overloading functions based on the type signatures of their arguments—the number passed and/or their types. Because there are no type declarations in Python, this concept doesn’t really apply; as we’ve seen, polymorphism in Python is based on object interfaces, not types.

If you’re pining for your C++ days, you can try to overload methods by their argument lists, like this:

class C:

    def meth(self, x):

        ...

    def meth(self, x, y, z):

        ...

This code will run, but because the def simply assigns an object to a name in the class’s scope, the last definition of the method function is the only one that will be retained. Put another way, it’s just as if you say X = 1 and then X = 2; X will be 2. Hence, there can be only one definition of a method name.
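To see the rebinding in action, here is a minimal sketch that fills in the two method bodies with placeholder return values (the strings are my own, added only for illustration) and then tries both call patterns:

```python
class C:
    def meth(self, x):              # First definition: bound to the name meth...
        return 'one argument'

    def meth(self, x, y, z):        # ...then rebound: this def replaces the first
        return 'three arguments'

obj = C()
print(obj.meth(1, 2, 3))            # The surviving three-argument version runs

try:
    obj.meth(1)                     # The one-argument version no longer exists
except TypeError as exc:
    print('TypeError:', exc)
```

The second call fails with a TypeError because only the three-argument definition remains attached to the name meth.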

If they are truly required, you can always code type-based selections using the type-testing ideas we met in Chapter 4 and Chapter 9, or the argument list tools introduced in Chapter 18:

class C:

    def meth(self, *args):

        if len(args) == 1:              # Branch on number of arguments

            ...

        elif type(args[0]) == int:      # Branch on argument types (or isinstance())

            ...

You normally shouldn’t do this, though—it’s not the Python way. As described in Chapter 16, you should write your code to expect only an object interface, not a specific data type. That way, it will be useful for a broader category of types and applications, both now and in the future:

class C:

    def meth(self, x):

        x.operation()                   # Assume x does the right thing

It’s also generally considered better to use distinct method names for distinct operations, rather than relying on call signatures (no matter what language you code in).
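As a rough illustration of coding to an interface, the following sketch passes two unrelated objects to the same method. The Duck and Robot classes are hypothetical stand-ins invented for this example; all that matters is that each provides an operation method:

```python
class Duck:
    def operation(self):
        return 'quack'

class Robot:
    def operation(self):
        return 'beep'

class C:
    def meth(self, x):
        return x.operation()        # Works for any object with an operation method

c = C()
results = [c.meth(obj) for obj in (Duck(), Robot())]
print(results)
```

Neither class inherits from the other, and meth never tests types; any future class with a compatible operation method would work here too.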

Although Python’s object model is straightforward, much of the art in OOP is in the way we combine classes to achieve a program’s goals. The next section begins a tour of some of the ways larger programs use classes to their advantage.

OOP and Inheritance: “Is-a” Relationships

We’ve explored the mechanics of inheritance in depth already, but I’d now like to show you an example of how it can be used to model real-world relationships. From a programmer’s point of view, inheritance is kicked off by attribute qualifications, which trigger searches for names in instances, their classes, and then any superclasses. From a designer’s point of view, inheritance is a way to specify set membership: a class defines a set of properties that may be inherited and customized by more specific sets (i.e., subclasses).

To illustrate, let’s put that pizza-making robot we talked about at the start of this part of the book to work. Suppose we’ve decided to explore alternative career paths and open a pizza restaurant (not bad, as career paths go). One of the first things we’ll need to do is hire employees to serve customers, prepare the food, and so on. Being engineers at heart, we’ve decided to build a robot to make the pizzas; but being politically and cybernetically correct, we’ve also decided to make our robot a full-fledged employee with a salary.

Our pizza shop team can be defined by the four classes in the following Python 3.X and 2.X example file, employees.py. The most general class, Employee, provides common behavior such as bumping up salaries (giveRaise) and printing (__repr__). There are two kinds of employees, and so two subclasses of Employee—Chef and Server. Both override the inherited work method to print more specific messages. Finally, our pizza robot is modeled by an even more specific class—PizzaRobot is a kind of Chef, which is a kind of Employee. In OOP terms, we call these relationships “is-a” links: a robot is a chef, which is an employee. Here’s the employees.py file:

# File employees.py (2.X + 3.X)

from __future__ import print_function

class Employee:

    def __init__(self, name, salary=0):

        self.name   = name

        self.salary = salary

    def giveRaise(self, percent):

        self.salary = self.salary + (self.salary * percent)

    def work(self):

        print(self.name, "does stuff")

    def __repr__(self):

        return "<Employee: name=%s, salary=%s>" % (self.name, self.salary)

class Chef(Employee):

    def __init__(self, name):

        Employee.__init__(self, name, 50000)

    def work(self):

        print(self.name, "makes food")

class Server(Employee):

    def __init__(self, name):

        Employee.__init__(self, name, 40000)

    def work(self):

        print(self.name, "interfaces with customer")

class PizzaRobot(Chef):

    def __init__(self, name):

        Chef.__init__(self, name)

    def work(self):

        print(self.name, "makes pizza")

if __name__ == "__main__":

    bob = PizzaRobot('bob')       # Make a robot named bob

    print(bob)                    # Run inherited __repr__

    bob.work()                    # Run type-specific action

    bob.giveRaise(0.20)           # Give bob a 20% raise

    print(bob); print()

    for klass in Employee, Chef, Server, PizzaRobot:

        obj = klass(klass.__name__)

        obj.work()

When we run the self-test code included in this module, we create a pizza-making robot named bob, which inherits names from three classes: PizzaRobot, Chef, and Employee. For instance, printing bob runs the Employee.__repr__ method, and giving bob a raise invokes Employee.giveRaise because that’s where the inheritance search finds that method:

c:\code> python employees.py

<Employee: name=bob, salary=50000>

bob makes pizza

<Employee: name=bob, salary=60000.0>

Employee does stuff

Chef makes food

Server interfaces with customer

PizzaRobot makes pizza

In a class hierarchy like this, you can usually make instances of any of the classes, not just the ones at the bottom. For instance, the for loop in this module’s self-test code creates instances of all four classes; each responds differently when asked to work because the work method is different in each. bob the robot, for example, gets work from the most specific (i.e., lowest) PizzaRobot class.
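If you want to verify the “is-a” links yourself, the built-in isinstance and issubclass functions, plus the class __mro__ attribute, can report them. The following is a trimmed-down sketch of the hierarchy, not the full employees.py file: it omits Server, and its work methods return strings instead of printing so the results are easy to inspect. The __mro__ display assumes Python 3.X, where all classes are new-style:

```python
class Employee:
    def __init__(self, name, salary=0):
        self.name = name
        self.salary = salary
    def work(self):
        return self.name + ' does stuff'

class Chef(Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 50000)
    def work(self):
        return self.name + ' makes food'

class PizzaRobot(Chef):
    def work(self):
        return self.name + ' makes pizza'

bob = PizzaRobot('bob')
print(isinstance(bob, Chef), isinstance(bob, Employee))   # A robot is a chef and an employee
print(issubclass(Employee, Chef))                         # But not every employee is a chef
print([klass.__name__ for klass in PizzaRobot.__mro__])   # Search order, most specific first
print(bob.work())                                         # Found first in PizzaRobot
```

The __mro__ list is the order in which inheritance searches these classes, which is why bob gets work from PizzaRobot but salary setup from Chef and Employee.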

Of course, these classes just simulate real-world objects; work prints a message for the time being, but it could be expanded to do real work later (see Python’s interfaces to devices such as serial ports, Arduino boards, and the Raspberry Pi if you’re taking this section much too literally!).

OOP and Composition: “Has-a” Relationships

The notion of composition was introduced in Chapter 26 and Chapter 28. From a programmer’s point of view, composition involves embedding other objects in a container object, and activating them to implement container methods. To a designer, composition is another way to represent relationships in a problem domain. But, rather than set membership, composition has to do with components—parts of a whole.

Composition also reflects the relationships between parts, called “has-a” relationships. Some OOP design texts refer to composition as aggregation, or distinguish between the two terms by using aggregation to describe a weaker dependency between container and contained. In this text, a “composition” simply refers to a collection of embedded objects. The composite class generally provides an interface all its own and implements it by directing the embedded objects.

Now that we’ve implemented our employees, let’s put them in the pizza shop and let them get busy. Our pizza shop is a composite object: it has an oven, and it has employees like servers and chefs. When a customer enters and places an order, the components of the shop spring into action—the server takes the order, the chef makes the pizza, and so on. The following example—file pizzashop.py—runs the same on Python 3.X and 2.X and simulates all the objects and relationships in this scenario:

# File pizzashop.py (2.X + 3.X)

from __future__ import print_function

from employees import PizzaRobot, Server

class Customer:

    def __init__(self, name):

        self.name = name

    def order(self, server):

        print(self.name, "orders from", server)

    def pay(self, server):

        print(self.name, "pays for item to", server)

class Oven:

    def bake(self):

        print("oven bakes")

class PizzaShop:

    def __init__(self):

        self.server = Server('Pat')         # Embed other objects

        self.chef   = PizzaRobot('Bob')     # A robot named bob

        self.oven   = Oven()

    def order(self, name):

        customer = Customer(name)           # Activate other objects

        customer.order(self.server)         # Customer orders from server

        self.chef.work()

        self.oven.bake()

        customer.pay(self.server)

if __name__ == "__main__":

    scene = PizzaShop()                     # Make the composite

    scene.order('Homer')                    # Simulate Homer's order

    print('...')

    scene.order('Shaggy')                   # Simulate Shaggy's order

The PizzaShop class is a container and controller; its constructor makes and embeds instances of the employee classes we wrote in the prior section, as well as an Oven class defined here. When this module’s self-test code calls the PizzaShop order method, the embedded objects are asked to carry out their actions in turn. Notice that we make a new Customer object for each order, and we pass on the embedded Server object to Customer methods; customers come and go, but the server is part of the pizza shop composite. Also notice that employees are still involved in an inheritance relationship; composition and inheritance are complementary tools.

When we run this module, our pizza shop handles two orders—one from Homer, and then one from Shaggy:

c:\code> python pizzashop.py

Homer orders from <Employee: name=Pat, salary=40000>

Bob makes pizza

oven bakes

Homer pays for item to <Employee: name=Pat, salary=40000>

...

Shaggy orders from <Employee: name=Pat, salary=40000>

Bob makes pizza

oven bakes

Shaggy pays for item to <Employee: name=Pat, salary=40000>

Again, this is mostly just a toy simulation, but the objects and interactions are representative of composites at work. As a rule of thumb, classes can represent just about any objects and relationships you can express in a sentence; just replace nouns with classes (e.g., Oven), and verbs with methods (e.g., bake), and you’ll have a first cut at a design.

Stream Processors Revisited

For a composition example that may be a bit more tangible than pizza-making robots, recall the generic data stream processor function we partially coded in the introduction to OOP in Chapter 26:

def processor(reader, converter, writer):

    while True:

        data = reader.read()

        if not data: break

        data = converter(data)

        writer.write(data)

Rather than using a simple function here, we might code this as a class that uses composition to do its work in order to provide more structure and support inheritance. The following 3.X/2.X file, streams.py, demonstrates one way to code the class:

class Processor:

    def __init__(self, reader, writer):

        self.reader = reader

        self.writer = writer

    def process(self):

        while True:

            data = self.reader.readline()

            if not data: break

            data = self.converter(data)

            self.writer.write(data)

    def converter(self, data):

        assert False, 'converter must be defined'       # Or raise exception

This class defines a converter method that it expects subclasses to fill in; it’s an example of the abstract superclass model we outlined in Chapter 29 (more on assert in Part VII—it simply raises an exception if its test is false). Coded this way, reader and writer objects are embedded within the class instance (composition), and we supply the conversion logic in a subclass rather than passing in a converter function (inheritance). The file converters.py shows how:

from streams import Processor

class Uppercase(Processor):

    def converter(self, data):

        return data.upper()

if __name__ == '__main__':

    import sys

    obj = Uppercase(open('trispam.txt'), sys.stdout)

    obj.process()

Here, the Uppercase class inherits the stream-processing loop logic (and anything else that may be coded in its superclasses). It needs to define only what is unique about it—the data conversion logic. When this file is run, it makes and runs an instance that reads from the file trispam.txt and writes the uppercase equivalent of that file to the stdout stream:

c:\code> type trispam.txt

spam

Spam

SPAM!

c:\code> python converters.py

SPAM

SPAM

SPAM!

To process different sorts of streams, pass in different sorts of objects to the class construction call. Here, we use an output file instead of a stream:

C:\code> python

>>> import converters

>>> prog = converters.Uppercase(open('trispam.txt'), open('trispamup.txt', 'w'))

>>> prog.process()

C:\code> type trispamup.txt

SPAM

SPAM

SPAM!

But, as suggested earlier, we could also pass in arbitrary objects coded as classes that define the required input and output method interfaces. Here’s a simple example that passes in a writer class that wraps up the text inside HTML tags:

C:\code> python

>>> from converters import Uppercase

>>> 

>>> class HTMLize:

         def write(self, line):

            print('<PRE>%s</PRE>' % line.rstrip())

>>> Uppercase(open('trispam.txt'), HTMLize()).process()

<PRE>SPAM</PRE>

<PRE>SPAM</PRE>

<PRE>SPAM!</PRE>

If you trace through this example’s control flow, you’ll see that we get both uppercase conversion (by inheritance) and HTML formatting (by composition), even though the core processing logic in the original Processor superclass knows nothing about either step. The processing code only cares that writers have a write method and that a method named converter is defined; it doesn’t care what those methods do when they are called. Such polymorphism and encapsulation of logic is behind much of the power of classes in Python.
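Because the processor depends only on interfaces, it can also run without any files at all. The following self-contained sketch, written for Python 3.X, repeats the two classes and substitutes in-memory io.StringIO objects for the reader and writer. It also raises NotImplementedError in place of the assert, which is one reasonable reading of the “Or raise exception” comment rather than the only way to code it:

```python
import io

class Processor:
    def __init__(self, reader, writer):
        self.reader = reader
        self.writer = writer

    def process(self):
        while True:
            data = self.reader.readline()
            if not data: break
            data = self.converter(data)
            self.writer.write(data)

    def converter(self, data):
        raise NotImplementedError('converter must be defined')   # Alternative to assert

class Uppercase(Processor):
    def converter(self, data):
        return data.upper()

reader = io.StringIO('spam\nSpam\nSPAM!\n')     # Stands in for open('trispam.txt')
writer = io.StringIO()                          # Collects output in memory
Uppercase(reader, writer).process()
print(writer.getvalue(), end='')
```

Any object with a readline method can serve as the reader, and any object with a write method can serve as the writer; the processing loop is unchanged.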

As is, the Processor superclass only provides a file-scanning loop. In more realistic work, we might extend it to support additional programming tools for its subclasses, and, in the process, turn it into a full-blown application framework. Coding such a tool once in a superclass enables you to reuse it in all of your programs. Even in this simple example, because so much is packaged and inherited with classes, all we had to code was the HTML formatting step; the rest was free.

For another example of composition at work, see exercise 9 at the end of Chapter 32 and its solution in Appendix D; it’s similar to the pizza shop example. We’ve focused on inheritance in this book because that is the main tool that the Python language itself provides for OOP. But, in practice, composition may be used as much as inheritance as a way to structure classes, especially in larger systems. As we’ve seen, inheritance and composition are often complementary (and sometimes alternative) techniques. Because composition is a design issue outside the scope of the Python language and this book, though, I’ll defer to other resources for more on this topic.

WHY YOU WILL CARE: CLASSES AND PERSISTENCE

I’ve mentioned Python’s pickle and shelve object persistence support a few times in this part of the book because it works especially well with class instances. In fact, these tools are often compelling enough to motivate the use of classes in general—by pickling or shelving a class instance, we get data storage that contains both data and logic combined.

For example, besides allowing us to simulate real-world interactions, the pizza shop classes developed in this chapter could also be used as the basis of a persistent restaurant database. Instances of classes can be stored away on disk in a single step using Python’s pickle or shelve modules. We used shelves to store instances of classes in the OOP tutorial in Chapter 28, but the object pickling interface is remarkably easy to use as well:

import pickle

object = SomeClass()

file   = open(filename, 'wb')     # Create external file

pickle.dump(object, file)         # Save object in file

import pickle

file   = open(filename, 'rb')

object = pickle.load(file)        # Fetch it back later

Pickling converts in-memory objects to serialized byte streams (in Python, strings), which may be stored in files, sent across a network, and so on; unpickling converts back from byte streams to identical in-memory objects. Shelves are similar, but they automatically pickle objects to an access-by-key database, which exports a dictionary-like interface:

import shelve

object = SomeClass()

dbase  = shelve.open(filename)

dbase['key'] = object             # Save under key

import shelve

dbase  = shelve.open(filename)

object = dbase['key']             # Fetch it back later

In our pizza shop example, using classes to model employees means we can get a simple database of employees and shops with little extra work—pickling such instance objects to a file makes them persistent across Python program executions:

>>> from pizzashop import PizzaShop

>>> shop = PizzaShop()

>>> shop.server, shop.chef

(<Employee: name=Pat, salary=40000>, <Employee: name=Bob, salary=50000>)

>>> import pickle

>>> pickle.dump(shop, open('shopfile.pkl', 'wb'))

This stores an entire composite shop object in a file all at once. To bring it back later in another session or program, a single step suffices as well. In fact, objects restored this way retain both state and behavior:

>>> import pickle

>>> obj = pickle.load(open('shopfile.pkl', 'rb'))

>>> obj.server, obj.chef

(<Employee: name=Pat, salary=40000>, <Employee: name=Bob, salary=50000>)

>>> obj.order('LSP')

LSP orders from <Employee: name=Pat, salary=40000>

Bob makes pizza

oven bakes

LSP pays for item to <Employee: name=Pat, salary=40000>

This just runs a simulation as is, but we might extend the shop to keep track of inventory, revenue, and so on; saving it to its file after changes would retain its updated state. See the standard library manual and related coverage in Chapter 9, Chapter 28, and Chapter 37 for more on pickles and shelves.

OOP and Delegation: “Wrapper” Proxy Objects

Besides inheritance and composition, object-oriented programmers often speak of delegation, which usually implies controller objects that embed other objects to which they pass off operation requests. The controllers can take care of administrative activities, such as logging or validating accesses, adding extra steps to interface components, or monitoring active instances.

In a sense, delegation is a special form of composition, with a single embedded object managed by a wrapper (sometimes called a proxy) class that retains most or all of the embedded object’s interface. The notion of proxies sometimes applies to other mechanisms too, such as function calls; in delegation, we’re concerned with proxies for all of an object’s behavior, including method calls and other operations.

This concept was introduced by example in Chapter 28, and in Python is often implemented with the __getattr__ method hook we studied in Chapter 30. Because this operator overloading method intercepts accesses to nonexistent attributes, a wrapper class can use __getattr__ to route arbitrary accesses to a wrapped object. Because this method allows attribute requests to be routed generically, the wrapper class retains the interface of the wrapped object and may add additional operations of its own.

By way of review, consider the file trace.py (which runs the same in 2.X and 3.X):

class Wrapper:

    def __init__(self, object):

        self.wrapped = object                    # Save object

    def __getattr__(self, attrname):

        print('Trace: ' + attrname)              # Trace fetch

        return getattr(self.wrapped, attrname)   # Delegate fetch

Recall from Chapter 30 that __getattr__ gets the attribute name as a string. This code makes use of the getattr built-in function to fetch an attribute from the wrapped object by name string—getattr(X,N) is like X.N, except that N is an expression that evaluates to a string at runtime, not a variable. In fact, getattr(X,N) is similar to X.__dict__[N], but the former also performs an inheritance search, like X.N, while the latter does not (see Chapter 22 and Chapter 29 for more on the __dict__ attribute).
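The difference between getattr and direct __dict__ indexing is easy to check. In this small sketch, the Super and Sub classes and their attribute names are made up for illustration:

```python
class Super:
    shared = 'class-level value'    # Lives in the class, not in instances

class Sub(Super):
    def __init__(self):
        self.own = 'instance-level value'

X = Sub()
N = 'ow' + 'n'                      # N is any expression that yields a string

print(getattr(X, N))                # Same as X.own
print(X.__dict__[N])                # Also works: own is stored in the instance
print(getattr(X, 'shared'))         # Inheritance search finds Super.shared
print('shared' in X.__dict__)       # False: not in the instance's own namespace
print(getattr(X, 'missing', None))  # Optional third argument: default if not found
```

Indexing X.__dict__ with 'shared' would raise a KeyError here, because that name is stored in the class and is reachable only through inheritance.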

You can use the approach of this module’s wrapper class to manage access to any object with attributes: lists, dictionaries, and even classes and instances. Here, the Wrapper class simply prints a trace message on each attribute access and delegates the attribute request to the embedded wrapped object:

>>> from trace import Wrapper

>>> x = Wrapper([1, 2, 3])                       # Wrap a list

>>> x.append(4)                                  # Delegate to list method

Trace: append

>>> x.wrapped                                    # Print my member

[1, 2, 3, 4]

>>> x = Wrapper({'a': 1, 'b': 2})                # Wrap a dictionary

>>> list(x.keys())                               # Delegate to dictionary method

Trace: keys

['a', 'b']

The net effect is to augment the entire interface of the wrapped object, with additional code in the Wrapper class. We can use this to log our method calls, route method calls to extra or custom logic, adapt a class to a new interface, and so on.

We’ll revive the notions of wrapped objects and delegated operations as one way to extend built-in types in the next chapter. If you are interested in the delegation design pattern, also watch for the discussions in Chapter 32 and Chapter 39 of function decorators, a strongly related concept designed to augment a specific function or method call rather than the entire interface of an object, and class decorators, which serve as a way to automatically add such delegation-based wrappers to all instances of a class.

NOTE

Version skew note: As we saw by example in Chapter 28, delegation of object interfaces by general proxies has changed substantially in 3.X when wrapped objects implement operator overloading methods. Technically, this is a new-style class difference, and can appear in 2.X code too if it enables this option; per the next chapter, it’s mandatory in 3.X and thus often considered a 3.X change.

In Python 2.X’s default classes, operator overloading methods run by built-in operations are routed through generic attribute interception methods like __getattr__. Printing a wrapped object directly, for example, calls this method for __repr__ or __str__, which then passes the call on to the wrapped object. This pattern holds for __iter__, __add__, and the other operator methods of the prior chapter.

In Python 3.X, this no longer happens: printing does not trigger __getattr__ (or its __getattribute__ cousin we’ll study in the next chapter) and a default display is used instead. In 3.X, new-style classes look up methods invoked implicitly by built-in operations in classes and skip the normal instance lookup entirely. Explicit name attribute fetches are routed to __getattr__ the same way in both 2.X and 3.X, but built-in operation method lookup differs in ways that may impact some delegation-based tools.

We’ll return to this issue in the next chapter as a new-style class change, and see it live in Chapter 38 and Chapter 39, in the context of managed attributes and decorators. For now, keep in mind that for delegation coding patterns, you may need to redefine operator overloading methods in wrapper classes (either by hand, by tools, or by superclasses) if they are used by embedded objects and you want them to be intercepted in new-style classes.
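As a preview of what this means in practice, here is a minimal sketch run under Python 3.X. The SizedWrapper subclass is a hypothetical name of my own, and redefining __len__ and __getitem__ by hand is just one of the options mentioned above; it is not a complete proxy:

```python
class Wrapper(object):                           # New-style in both 2.X and 3.X
    def __init__(self, obj):
        self.wrapped = obj
    def __getattr__(self, attrname):
        return getattr(self.wrapped, attrname)   # Catches explicit fetches only

class SizedWrapper(Wrapper):
    def __len__(self):                           # Redefined so len() is intercepted
        return len(self.wrapped)
    def __getitem__(self, index):                # Redefined so indexing is intercepted
        return self.wrapped[index]

plain = Wrapper([1, 2, 3])
plain.append(4)                                  # Explicit name: routed by __getattr__
try:
    len(plain)                                   # Built-in operation: skips __getattr__
except TypeError as exc:
    print('TypeError:', exc)

sized = SizedWrapper([1, 2, 3])
print(len(sized), sized[0])                      # Works: methods found in the class
```

The explicit append fetch still reaches the wrapped list, but len fails on the plain wrapper because the built-in looks for __len__ in the class and never consults __getattr__.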

Pseudoprivate Class Attributes

Besides larger structuring goals, class designs often must address name usage too. In Chapter 28’s case study, for example, we noted that methods defined within a general tool class might be modified by subclasses if exposed, and noted the tradeoffs of this policy—while it supports method customization and direct calls, it’s also open to accidental replacements.

In Part V, we learned that every name assigned at the top level of a module file is exported. By default, the same holds for classes—data hiding is a convention, and clients may fetch or change attributes in any class or instance to which they have a reference. In fact, attributes are all “public” and “virtual,” in C++ terms; they’re all accessible everywhere and are looked up dynamically at runtime.[61]

That said, Python today does support the notion of name “mangling” (i.e., expansion) to localize some names in classes. Mangled names are sometimes misleadingly called “private attributes,” but really this is just a way to localize a name to the class that created it—name mangling does not prevent access by code outside the class. This feature is mostly intended to avoid namespace collisions in instances, not to restrict access to names in general; mangled names are therefore better called “pseudoprivate” than “private.”

Pseudoprivate names are an advanced and entirely optional feature, and you probably won’t find them very useful until you start writing general tools or larger class hierarchies for use in multiprogrammer projects. In fact, they are not always used even when they probably should be—more commonly, Python programmers code internal names with a single underscore (e.g., _X), which is just an informal convention to let you know that a name shouldn’t generally be changed (it means nothing to Python itself).

Because you may see this feature in other people’s code, though, you need to be somewhat aware of it, even if you don’t use it yourself. And once you learn its advantages and contexts of use, you may find this feature to be more useful in your own code than some programmers realize.

Name Mangling Overview

Here’s how name mangling works: within a class statement only, any names that start with two underscores but don’t end with two underscores are automatically expanded to include the name of the enclosing class at their front. For instance, a name like __X within a class named Spam is changed to _Spam__X automatically: the original name is prefixed with a single underscore and the enclosing class’s name. Because the modified name contains the name of the enclosing class, it’s generally unique; it won’t clash with similar names created by other classes in a hierarchy.

Name mangling happens only for names that appear inside a class statement’s code, and then only for names that begin with two leading underscores. It works for every name preceded with double underscores, though—both class attributes (including method names) and instance attribute names assigned to self. For example, in a class named Spam, a method named __meth is mangled to _Spam__meth, and an instance attribute reference self.__X is transformed to self._Spam__X.
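The expansion rules are simple to observe directly. This sketch uses a Spam class like the one described above; the __Y attribute and the total method are extras I’ve added for illustration:

```python
class Spam:
    __X = 1                          # Class attribute: stored as _Spam__X

    def __init__(self):              # Ends with two underscores too: not mangled
        self.__Y = 2                 # Instance attribute: stored as _Spam__Y

    def __meth(self):                # Method name: stored as _Spam__meth
        return self.__X + self.__Y   # References inside the class are expanded too

    def total(self):
        return self.__meth()

s = Spam()
print(s.total())
print(sorted(name for name in dir(s) if name.startswith('_Spam')))
print(hasattr(s, '__Y'))             # False: no such name outside the class
```

The dir listing shows only the expanded names, while code inside the class keeps using the short double-underscore forms.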

Despite the mangling, as long as the class uses the double underscore version everywhere it refers to the name, all its references will still work. Because more than one class may add attributes to an instance, though, this mangling helps avoid clashes—but we need to move on to an example to see how.

Why Use Pseudoprivate Attributes?

One of the main issues that the pseudoprivate attribute feature is meant to alleviate has to do with the way instance attributes are stored. In Python, all instance attributes wind up in the single instance object at the bottom of the class tree, and are shared by all class-level method functions the instance is passed into. This is different from the C++ model, where each class gets its own space for data members it defines.

Within a class’s method in Python, whenever a method assigns to a self attribute (e.g., self.attr = value), it changes or creates an attribute in the instance (recall that inheritance searches happen only on reference, not on assignment). Because this is true even if multiple classes in a hierarchy assign to the same attribute, collisions are possible.

For example, suppose that when a programmer codes a class, it is assumed that the class owns the attribute name X in the instance. In this class’s methods, the name is set, and later fetched:

class C1:

    def meth1(self): self.X = 88         # I assume X is mine

    def meth2(self): print(self.X)

Suppose further that another programmer, working in isolation, makes the same assumption in another class:

class C2:

    def metha(self): self.X = 99         # Me too

    def methb(self): print(self.X)

Both of these classes work by themselves. The problem arises if the two classes are ever mixed together in the same class tree:

class C3(C1, C2): ...

I = C3()                                 # Only 1 X in I!

Now, the value that each class gets back when it says self.X will depend on which class assigned it last. Because all assignments to self.X refer to the same single instance, there is only one X attribute—I.X—no matter how many classes use that attribute name.
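Here is the collision played out in a runnable sketch. It reuses the two classes, but has meth2 and methb return the value instead of printing it so the results can be compared side by side:

```python
class C1:
    def meth1(self): self.X = 88         # I assume X is mine
    def meth2(self): return self.X

class C2:
    def metha(self): self.X = 99         # Me too
    def methb(self): return self.X

class C3(C1, C2): pass

I = C3()
I.meth1(); I.metha()                     # Second assignment overwrites the first
print(I.__dict__)                        # Only one X in the instance
print(I.meth2(), I.methb())              # Both classes now see C2's value
```

C1’s value of 88 is silently lost, which is exactly the sort of bug that is hard to track down when the two classes were written independently.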

This isn’t a problem if it’s expected, and indeed, this is how classes communicate: the instance is shared memory. To guarantee that an attribute belongs to the class that uses it, though, prefix the name with double underscores everywhere it is used in the class, as in this 2.X/3.X file, pseudoprivate.py:

class C1:

    def meth1(self): self.__X = 88       # Now X is mine

    def meth2(self): print(self.__X)     # Becomes _C1__X in I

class C2:

    def metha(self): self.__X = 99       # Me too

    def methb(self): print(self.__X)     # Becomes _C2__X in I

class C3(C1, C2): pass

I = C3()                                 # Two X names in I

I.meth1(); I.metha()

print(I.__dict__)

I.meth2(); I.methb()

When thus prefixed, the X attributes will be expanded to include the names of their classes before being added to the instance. If you run a dir call on I or inspect its namespace dictionary after the attributes have been assigned, you’ll see the expanded names, _C1__X and _C2__X, but not X. Because the expansion makes the names far less likely to clash within the instance, the class coders can be fairly safe in assuming that they truly own any names that they prefix with two underscores:

% python pseudoprivate.py

{'_C2__X': 99, '_C1__X': 88}

88

99

This trick can avoid potential name collisions in the instance, but note that it does not amount to true privacy. If you know the name of the enclosing class, you can still access either of these attributes anywhere you have a reference to the instance by using the fully expanded name (e.g., I._C1__X = 77). Moreover, names could still collide if unknowing programmers use the expanded naming pattern explicitly (unlikely, but not impossible). On the other hand, this feature makes it less likely that you will accidentally step on a class’s names.
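The following short sketch illustrates both halves of that point: the expanded name is fully accessible from outside the class, while the unexpanded name is not, because the expansion happens only for code that appears inside a class statement. The single class and the values used are illustrative:

```python
class C1:
    def meth1(self): self.__X = 88       # Stored in the instance as _C1__X
    def meth2(self): return self.__X

I = C1()
I.meth1()
print(I._C1__X)                          # 88: reachable via the expanded name
I._C1__X = 77                            # ...and changeable from outside
print(I.meth2())                         # 77

try:
    I.__X                                # Not expanded outside a class statement
except AttributeError as exc:
    print('AttributeError:', exc)
```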

Pseudoprivate attributes are also useful in larger frameworks or tools, both to avoid introducing new method names that might accidentally hide definitions elsewhere in the class tree and to reduce the chance of internal methods being replaced by names defined lower in the tree. If a method is intended for use only within a class that may be mixed into other classes, the double underscore prefix virtually ensures that the method won’t interfere with other names in the tree, especially in multiple-inheritance scenarios:

class Super:

    def method(self): ...                  # A real application method

class Tool:

    def __method(self): ...                # Becomes _Tool__method

    def other(self): self.__method()       # Use my internal method

class Sub1(Tool, Super):

    def actions(self): self.method()       # Runs Super.method as expected

class Sub2(Tool):

    def __init__(self): self.method = 99   # Doesn't break Tool.__method

We met multiple inheritance briefly in Chapter 26 and will explore it in more detail later in this chapter. Recall that superclasses are searched according to their left-to-right order in class header lines. Here, this means Sub1 prefers Tool attributes to those in Super. Although in this example we could force Python to pick the application class’s methods first by switching the order of the superclasses listed in the Sub1 class header, pseudoprivate attributes resolve the issue altogether. Pseudoprivate names also prevent subclasses from accidentally redefining the internal method’s names, as in Sub2.
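Here is a filled-in version of the preceding outline that can be run as is. The returned strings are hypothetical stand-ins for the elided method bodies, added only so that we can see which method each call reaches:

```python
class Super:
    def method(self): return 'Super.method'        # A real application method

class Tool:
    def __method(self): return 'Tool.__method'     # Becomes _Tool__method
    def other(self): return self.__method()        # Use my internal method

class Sub1(Tool, Super):
    def actions(self): return self.method()        # Finds Super.method, not Tool's

class Sub2(Tool):
    def __init__(self): self.method = 99           # Doesn't break Tool.__method

print(Sub1().actions())                            # Super.method
print(Sub1().other())                              # Tool.__method
print(Sub2().other())                              # Tool.__method
```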

Again, I should note that this feature tends to be of use primarily for larger, multiprogrammer projects, and then only for selected names. Don’t be tempted to clutter your code unnecessarily; only use this feature for names that truly need to be controlled by a single class. Although useful in some general class-based tools, for simpler programs, it’s probably overkill.

For more examples that make use of the __X naming feature, see the lister.py mix-in classes introduced later in this chapter in the multiple inheritance section, as well as the discussion of Private class decorators in Chapter 39.

If you care about privacy in general, you might want to review the emulation of private instance attributes sketched in the section Attribute Access: __getattr__ and __setattr__ in Chapter 30, and watch for the more complete Private class decorator we’ll build with delegation in Chapter 39. Although it’s possible to emulate true access controls in Python classes, this is rarely done in practice, even for large systems.


[61] This tends to scare people with a C++ background disproportionately. In Python, it’s even possible to change or completely delete a class’s method at runtime. On the other hand, almost nobody ever does this in practical programs. As a scripting language, Python is more about enabling than restricting. Also, recall from our discussion of operator overloading in Chapter 30 that __getattr__ and __setattr__ can be used to emulate privacy, but are generally not used for this purpose in practice. More on this when we code a more realistic privacy decorator in Chapter 39.

Methods Are Objects: Bound or Unbound

Methods in general, and bound methods in particular, simplify the implementation of many design goals in Python. We met bound methods briefly while studying __call__ in Chapter 30. The full story, which we’ll flesh out here, turns out to be more general and flexible than you might expect.

In Chapter 19, we learned how functions can be processed as normal objects. Methods are a kind of object too, and can be used generically in much the same way as other objects—they can be assigned to names, passed to functions, stored in data structures, and so on—and like simple functions, qualify as “first class” objects. Because a class’s methods can be accessed from an instance or a class, though, they actually come in two flavors in Python:

Unbound (class) method objects: no self

Accessing a function attribute of a class by qualifying the class returns an unbound method object. To call the method, you must provide an instance object explicitly as the first argument. In Python 3.X, an unbound method is the same as a simple function and can be called through the class’s name; in 2.X it’s a distinct type and cannot be called without providing an instance.

Bound (instance) method objects: self + function pairs

Accessing a function attribute of a class by qualifying an instance returns a bound method object. Python automatically packages the instance with the function in the bound method object, so you don’t need to pass an instance to call the method.

Both kinds of methods are full-fledged objects; they can be transferred around a program at will, just like strings and numbers. Both also require an instance in their first argument when run (i.e., a value for self). This is why we’ve had to pass in an instance explicitly when calling superclass methods from subclass methods in previous examples (including this chapter’s employees.py); technically, such calls produce unbound method objects along the way.

When calling a bound method object, Python provides an instance for you automatically—the instance used to create the bound method object. This means that bound method objects are usually interchangeable with simple function objects, and makes them especially useful for interfaces originally written for functions (see the sidebar Why You Will Care: Bound Method Callbacks for a realistic use case in GUIs).

To illustrate in simple terms, suppose we define the following class:

class Spam:

    def doit(self, message):

        print(message)

Now, in normal operation, we make an instance and call its method in a single step to print the passed-in argument:

object1 = Spam()

object1.doit('hello world')

Really, though, a bound method object is generated along the way, just before the method call’s parentheses. In fact, we can fetch a bound method without actually calling it. An object.name expression evaluates to an object as all expressions do. In the following, it returns a bound method object that packages the instance (object1) with the method function (Spam.doit). We can assign this bound method pair to another name and then call it as though it were a simple function:

object1 = Spam()

x = object1.doit        # Bound method object: instance+function

x('hello world')        # Same effect as object1.doit('...')

On the other hand, if we qualify the class to get to doit, we get back an unbound method object, which is simply a reference to the function object. To call this type of method, we must pass in an instance as the leftmost argument—there isn’t one in the expression otherwise, and the method expects it:

object1 = Spam()

t = Spam.doit           # Unbound method object (a function in 3.X: see ahead)

t(object1, 'howdy')     # Pass in instance (if the method expects one in 3.X)

By extension, the same rules apply within a class’s method if we reference self attributes that refer to functions in the class. A self.method expression is a bound method object because self is an instance object:

class Eggs:

    def m1(self, n):

        print(n)

    def m2(self):

        x = self.m1     # Another bound method object

        x(42)           # Looks like a simple function

Eggs().m2()             # Prints 42

Most of the time, you call methods immediately after fetching them with attribute qualification, so you don’t always notice the method objects generated along the way. But if you start writing code that calls objects generically, you need to be careful to treat unbound methods specially—they normally require an explicit instance object to be passed in.

NOTE

For an optional exception to this rule, see the discussion of static and class methods in the next chapter, and the brief mention of one in the next section. Like bound methods, static methods can masquerade as basic functions because they do not expect instances when called. Formally speaking, Python supports three kinds of class-level methods—instance, static, and class—and 3.X allows simple functions in classes, too. Chapter 40’s metaclass methods are distinct too, but they are essentially class methods with less scope.

Unbound Methods Are Functions in 3.X

In Python 3.X, the language has dropped the notion of unbound methods. What we describe as an unbound method here is treated as a simple function in 3.X. For most purposes, this makes no difference to your code; either way, an instance will be passed to a method’s first argument when it’s called through an instance.

Programs that do explicit type testing might be impacted, though—if you print the type of an instance-less class-level method, it displays “unbound method” in 2.X, and “function” in 3.X.
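As a rough sketch of what such a type test looks like, the following assumes Python 3.X and uses the standard library's types module; the results noted in comments apply to 3.X only, since the class-level fetch is a distinct unbound method type in 2.X:

```python
import types

class Spam:
    def doit(self, message):
        return message

obj = Spam()

print(type(Spam.doit))                               # <class 'function'> in 3.X
print(type(obj.doit))                                # <class 'method'>

print(isinstance(Spam.doit, types.FunctionType))     # True in 3.X
print(isinstance(obj.doit, types.MethodType))        # True: a bound method
```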

Moreover, in 3.X it is OK to call a method without an instance, as long as the method does not expect one and you call it only through the class and never through an instance. That is, Python 3.X will pass along an instance to methods only for through-instance calls. When calling through a class, you must pass an instance manually only if the method expects one:

C:\code> c:\python33\python

>>> class Selfless:

        def __init__(self, data):

            self.data = data

        def selfless(arg1, arg2):               # A simple function in 3.X

            return arg1 + arg2

        def normal(self, arg1, arg2):           # Instance expected when called

            return self.data + arg1 + arg2

>>> X = Selfless(2)

>>> X.normal(3, 4)                  # Instance passed to self automatically: 2+(3+4)

9

>>> Selfless.normal(X, 3, 4)        # self expected by method: pass manually

9

>>> Selfless.selfless(3, 4)         # No instance: works in 3.X, fails in 2.X!

7

The last test in this session fails in 2.X, because unbound methods require an instance to be passed by default; it works in 3.X because such methods are treated as simple functions not requiring an instance. Although this removes some potential error trapping in 3.X (what if a programmer accidentally forgets to pass an instance?), it allows a class’s methods to be used as simple functions as long as they are not passed and do not expect a “self” instance argument.

The following two calls still fail in both 3.X and 2.X, though—the first (calling through an instance) automatically passes an instance to a method that does not expect one, while the second (calling through a class) does not pass an instance to a method that does expect one (error message text here is per 3.3):

>>> X.selfless(3, 4)

TypeError: selfless() takes 2 positional arguments but 3 were given

>>> Selfless.normal(3, 4)

TypeError: normal() missing 1 required positional argument: 'arg2'

Because of this change, the staticmethod built-in function and decorator described in the next chapter is not needed in 3.X for methods without a self argument that are called only through the class name, and never through an instance—such methods are run as simple functions, without receiving an instance argument. In 2.X, such calls are errors unless an instance is passed manually or the method is marked as being static (more on static methods in the next chapter).
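As a brief preview only (the next chapter has the details), here is a minimal sketch of the static method alternative. It reuses the Selfless name from the session above but drops its constructor and normal method for brevity. Marked this way, the method receives no instance whether it is called through the class or through an instance:

```python
class Selfless:
    @staticmethod
    def selfless(arg1, arg2):          # No instance passed, however it is called
        return arg1 + arg2

X = Selfless()
print(Selfless.selfless(3, 4))         # 7: called through the class
print(X.selfless(3, 4))                # 7: called through an instance too
```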

It’s important to be aware of the differences in behavior in 3.X, but bound methods are generally more important from a practical perspective anyway. Because they pair together the instance and function in a single object, they can be treated as callables generically. The next section demonstrates what this means in code.

NOTE

For a more visual illustration of unbound method treatment in Python 3.X and 2.X, see also the lister.py example in the multiple inheritance section later in this chapter. Its classes print the value of methods fetched from both instances and classes, in both versions of Python—as unbound methods in 2.X and simple functions in 3.X. Also note that this change is inherent in 3.X itself, not the new-style class model it mandates.

Bound Methods and Other Callable Objects

As mentioned earlier, bound methods can be processed as generic objects, just like simple functions—they can be passed around a program arbitrarily. Moreover, because bound methods combine both a function and an instance in a single package, they can be treated like any other callable object and require no special syntax when invoked. The following, for example, stores four bound method objects in a list and calls them later with normal call expressions:

>>> class Number:

        def __init__(self, base):

            self.base = base

        def double(self):

            return self.base * 2

        def triple(self):

            return self.base * 3

>>> x = Number(2)                                       # Class instance objects

>>> y = Number(3)                                       # State + methods

>>> z = Number(4)

>>> x.double()                                          # Normal immediate calls

4

>>> acts = [x.double, y.double, y.triple, z.double]     # List of bound methods

>>> for act in acts:                                    # Calls are deferred

        print(act())                                    # Call as though functions

4

6

9

8

Like simple functions, bound method objects have introspection information of their own, including attributes that give access to the instance object and method function they pair. Calling the bound method simply dispatches the pair:

>>> bound = x.double

>>> bound.__self__, bound.__func__

(<__main__.Number object at 0x...etc...>, <function Number.double at 0x...etc...>)

>>> bound.__self__.base

2

>>> bound()                   # Calls bound.__func__(bound.__self__, ...)

4
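Written as a script, the dispatch described in the last comment can be spelled out by hand. This is only a sketch of the equivalence under Python 3.X, not something you would normally code, and it trims the Number class to a single method:

```python
class Number:
    def __init__(self, base):
        self.base = base
    def double(self):
        return self.base * 2

x = Number(2)
bound = x.double                                   # Fetch without calling

print(bound.__self__ is x)                         # True: the paired instance
print(bound.__func__ is Number.double)             # True in 3.X: the paired function
manual = bound.__func__(bound.__self__)            # What calling bound() does for us
print(manual, bound())                             # 4 4
```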

Other callables

In fact, bound methods are just one of a handful of callable object types in Python. As the following demonstrates, simple functions coded with a def or lambda, instances that inherit a __call__, and bound instance methods can all be treated and called the same way:

>>> def square(arg):

        return arg ** 2                          # Simple functions (def or lambda)

>>> class Sum:

        def __init__(self, val):                 # Callable instances

            self.val = val

        def __call__(self, arg):

            return self.val + arg

>>> class Product:

        def __init__(self, val):                 # Bound methods

            self.val = val

        def method(self, arg):

            return self.val * arg

>>> sobject = Sum(2)

>>> pobject = Product(3)

>>> actions = [square, sobject, pobject.method]  # Function, instance, method

>>> for act in actions:                          # All three called same way

        print(act(5))                            # Call any one-arg callable

25

7

15

>>> actions[-1](5)                               # Index, comprehensions, maps

15

>>> [act(5) for act in actions]

[25, 7, 15]

>>> list(map(lambda act: act(5), actions))

[25, 7, 15]

Technically speaking, classes belong in the callable objects category too, but we normally call them to generate instances rather than to do actual work—a single action is better coded as a simple function than a class with a constructor, but the class here serves to illustrate its callable nature:

>>> class Negate:

        def __init__(self, val):                 # Classes are callables too

            self.val = -val                      # But called for object, not work

        def __repr__(self):                      # Instance print format

            return str(self.val)

>>> actions = [square, sobject, pobject.method, Negate]     # Call a class too

>>> for act in actions:

        print(act(5))

25

7

15

-5

>>> [act(5) for act in actions]                     # Runs __repr__ not __str__!

[25, 7, 15, -5]

>>> table = {act(5): act for act in actions}        # 3.X/2.7 dict comprehension

>>> for (key, value) in table.items():

        print('{0:2} => {1}'.format(key, value))    # 2.6+/3.X str.format

25 => <function square at 0x0000000002987400>

15 => <bound method Product.method of <__main__.Product object at ...etc...>>

-5 => <class '__main__.Negate'>

 7 => <__main__.Sum object at 0x000000000298BE48>

As you can see, bound methods, and Python’s callable objects model in general, are some of the many ways that Python’s design makes for an incredibly flexible language.
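When a collection may mix callable and noncallable objects, code that calls objects generically can filter them first. The sketch below assumes the callable built-in, which is available in 2.X and again in 3.2 and later; the candidates list is made up for illustration and reuses trimmed versions of the classes above:

```python
def square(arg):
    return arg ** 2

class Sum:
    def __init__(self, val):
        self.val = val
    def __call__(self, arg):               # Makes instances callable
        return self.val + arg

class Product:
    def __init__(self, val):
        self.val = val
    def method(self, arg):                 # No __call__: instances not callable
        return self.val * arg

candidates = [square, Sum(2), Product(3).method, Product(3), 'spam', 42]

flags = [callable(obj) for obj in candidates]
results = [obj(5) for obj in candidates if callable(obj)]

print(flags)                               # [True, True, True, False, False, False]
print(results)                             # [25, 7, 15]
```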

You should now understand the method object model. For other examples of bound methods at work, see the upcoming sidebar Why You Will Care: Bound Method Callbacks as well as the prior chapter’s discussion of callback handlers in the section on the method __call__.

WHY YOU WILL CARE: BOUND METHOD CALLBACKS

Because bound methods automatically pair an instance with a class’s method function, you can use them anywhere a simple function is expected. One of the most common places you’ll see this idea put to work is in code that registers methods as event callback handlers in the tkinter GUI interface (named Tkinter in Python 2.X) we’ve met before. As review, here’s the simple case:

def handler():

    ...use globals or closure scopes for state...

...

widget = Button(text='spam', command=handler)

To register a handler for button click events, we usually pass a callable object that takes no arguments to the command keyword argument. Function names (and lambdas) work here, and so do class-level methods—though they must be bound methods if they expect an instance when called:

class MyGui:

    def handler(self):

        ...use self.attr for state...

    def makewidgets(self):

        b = Button(text='spam', command=self.handler)

Here, the event handler is self.handler—a bound method object that remembers both self and MyGui.handler. Because self will refer to the original instance when handler is later invoked on events, the method will have access to instance attributes that can retain state between events, as well as class-level methods. With simple functions, state normally must be retained in global variables or enclosing function scopes instead.

See also the discussion of __call__ operator overloading in Chapter 30 for another way to make classes compatible with function-based APIs, and lambda in Chapter 19 for another tool often used in callback roles. As noted in the former of these, you don’t generally need to wrap a bound method in a lambda; the bound method in the preceding example already defers the call (note that there are no parentheses to trigger one), so adding a lambda here would be pointless!
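To try the idea without a GUI toolkit, here is a self-contained sketch. FakeButton is a hypothetical stand-in for a real tkinter Button, written only to store a no-argument callable and run it on a simulated click; the clicks counter is likewise an assumption added to show state retained between events:

```python
class FakeButton:
    """Hypothetical stand-in for a GUI button: stores a no-argument callable."""
    def __init__(self, text, command):
        self.text = text
        self.command = command
    def click(self):                       # Simulate a user click event
        return self.command()

class MyGui:
    def __init__(self):
        self.clicks = 0                    # State retained between events
    def handler(self):
        self.clicks += 1                   # self is the original instance
        return self.clicks
    def makewidgets(self):
        # Bound method registered without parentheses: the call is deferred
        self.button = FakeButton(text='spam', command=self.handler)

gui = MyGui()
gui.makewidgets()
gui.button.click()
gui.button.click()
print(gui.clicks)                          # 2
```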

Classes Are Objects: Generic Object Factories

Sometimes, class-based designs require objects to be created in response to conditions that can’t be predicted when a program is written. The factory design pattern allows such a deferred approach. Due in large part to Python’s flexibility, factories can take multiple forms, some of which don’t seem special at all.

Because classes are also “first class” objects, it’s easy to pass them around a program, store them in data structures, and so on. You can also pass classes to functions that generate arbitrary kinds of objects; such functions are sometimes called factories in OOP design circles. Factories can be a major undertaking in a strongly typed language such as C++ but are almost trivial to implement in Python.

For example, the call syntax we met in Chapter 18 can call any class with any number of positional or keyword constructor arguments in one step to generate any sort of instance:[62]

def factory(aClass, *pargs, **kargs):        # Varargs tuple, dict

    return aClass(*pargs, **kargs)           # Call aClass (or apply in 2.X only)

class Spam:

    def doit(self, message):

        print(message)

class Person:

    def __init__(self, name, job=None):

        self.name = name

        self.job  = job

object1 = factory(Spam)                      # Make a Spam object

object2 = factory(Person, "Arthur", "King")  # Make a Person object

object3 = factory(Person, name='Brian')      # Ditto, with keywords and default

In this code, we define an object generator function called factory. It expects to be passed a class object (any class will do) along with zero or more arguments for the class’s constructor. The function uses special “varargs” call syntax to call the class and return an instance.

The rest of the example simply defines two classes and generates instances of both by passing them to the factory function. And that’s the only factory function you’ll ever need to write in Python; it works for any class and any constructor arguments. If you run this live (factory.py), your objects will look like this:

>>> object1.doit(99)

99

>>> object2.name, object2.job

('Arthur', 'King')

>>> object3.name, object3.job

('Brian', None)

By now, you should know that everything is a “first class” object in Python—including classes, which are usually just compiler input in languages like C++. It’s natural to pass them around this way. As mentioned at the start of this part of the book, though, only objects derived from classes do full OOP in Python.

Why Factories?

So what good is the factory function (besides providing an excuse to illustrate first-class class objects in this book)? Unfortunately, it’s difficult to show applications of this design pattern without listing much more code than we have space for here. In general, though, such a factory might allow code to be insulated from the details of dynamically configured object construction.

For instance, recall the processor example presented in the abstract in Chapter 26, and then again as a composition example earlier in this chapter. It accepts reader and writer objects for processing arbitrary data streams. The original version of this example manually passed in instances of specialized classes like FileWriter and SocketReader to customize the data streams being processed; later, we passed in hardcoded file, stream, and formatter objects. In a more dynamic scenario, external devices such as configuration files or GUIs might be used to configure the streams.

In such a dynamic world, we might not be able to hardcode the creation of stream interface objects in our scripts, but might instead create them at runtime according to the contents of a configuration file.

Such a file might simply give the string name of a stream class to be imported from a module, plus an optional constructor call argument. Factory-style functions or code might come in handy here because they would allow us to fetch and pass in classes that are not hardcoded in our program ahead of time. Indeed, those classes might not even have existed at all when we wrote our code:

classname = ...parse from config file...

classarg  = ...parse from config file...

import streamtypes                           # Customizable code

aclass = getattr(streamtypes, classname)     # Fetch from module

reader = factory(aclass, classarg)           # Or aclass(classarg)

processor(reader, ...)

Here, the getattr built-in is again used to fetch a module attribute given a string name (it’s like saying obj.attr, but attr is a string). Because this code snippet assumes a single constructor argument, it doesn’t strictly need factory; we could make an instance with just aclass(classarg). The factory function may prove more useful in the presence of unknown argument lists, however, and the general factory coding pattern can improve the code’s flexibility.
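For a version of this idea that runs as is, the following sketch makes several assumptions of its own: the config dictionary stands in for a parsed configuration file, the standard library's io module and its StringIO class stand in for the hypothetical streamtypes module and its stream classes, and importlib.import_module is used to import a module given its string name:

```python
import importlib

def factory(aClass, *pargs, **kargs):
    return aClass(*pargs, **kargs)

# Hypothetical settings; a real program would parse these from a config file
config = {'module': 'io', 'classname': 'StringIO', 'classarg': 'spam\neggs\n'}

module = importlib.import_module(config['module'])   # Import by string name
aclass = getattr(module, config['classname'])        # Fetch class by string name
reader = factory(aclass, config['classarg'])         # Or aclass(classarg)

first = reader.readline()                            # Use the object generically
print(repr(first))                                   # 'spam\n'
```

Nothing in this code names StringIO directly; changing the configuration values is enough to create a different kind of object.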


[62] Actually, this syntax can invoke any callable object, including functions, classes, and methods. Hence, the factory function here can also run any callable object, not just a class (despite the argument name). Also, as we learned in Chapter 18, Python 2.X has an alternative to aClass(*pargs, **kargs): the apply(aClass, pargs, kargs) built-in call, which has been removed in Python 3.X because of its redundancy and limitations.

Multiple Inheritance: “Mix-in” Classes

Our last design pattern is one of the most useful, and will serve as a subject for a more realistic example to wrap up this chapter and point toward the next. As a bonus, the code we’ll write here may be a useful tool.

Many class-based designs call for combining disparate sets of methods. As we’ve seen, in a class statement, more than one superclass can be listed in parentheses in the header line. When you do this, you leverage multiple inheritance—the class and its instances inherit names from all the listed superclasses.

When searching for an attribute, Python’s inheritance search traverses all superclasses in the class header from left to right until a match is found. Technically, because any of the superclasses may have superclasses of its own, this search can be a bit more complex for larger class trees:

§  In classic classes (the default until Python 3.0), the attribute search in all cases proceeds depth-first all the way to the top of the inheritance tree, and then from left to right. This order is usually called DFLR, for its depth-first, left-to-right path.

§  In new-style classes (optional in 2.X and standard in 3.X), the attribute search is usually as before, but in diamond patterns proceeds across by tree levels before moving up, in a more breadth-first fashion. This order is usually called the new-style MRO, for method resolution order, though it’s used for all attributes, not just methods.

The second of these search rules is explained fully in the new-style class discussion in the next chapter. Though difficult to understand without the next chapter’s code (and somewhat rare to create yourself), diamond patterns appear when multiple classes in a tree share a common superclass; the new-style search order is designed to visit such a shared superclass just once, and after all its subclasses. In either model, though, when a class has multiple superclasses, they are searched from left to right according to the order listed in the class statement header lines.
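The new-style order can be inspected directly. The following minimal sketch assumes Python 3.X, where all classes are new-style, and uses made-up class names; it builds a small diamond and prints the search order through the __mro__ class attribute:

```python
class A:
    attr = 'A'

class B(A):
    pass

class C(A):
    attr = 'C'

class D(B, C):                               # Diamond: B and C share superclass A
    pass

print([cls.__name__ for cls in D.__mro__])   # ['D', 'B', 'C', 'A', 'object']
print(D().attr)                              # C: found in C before the shared A
```

Under the classic DFLR order described earlier, the same tree would instead be searched D, B, A, C, so the attr in A would be found first.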

In general, multiple inheritance is good for modeling objects that belong to more than one set. For instance, a person may be an engineer, a writer, a musician, and so on, and inherit properties from all such sets. With multiple inheritance, objects obtain the union of the behavior in all their superclasses. As we’ll see ahead, multiple inheritance also allows classes to function as general packages of mixable attributes.

Though a useful pattern, multiple inheritance’s chief downside is that it can pose a conflict when the same method (or other attribute) name is defined in more than one superclass. When this occurs, the conflict is resolved either automatically by the inheritance search order, or manually in your code:

§  Default: By default, inheritance chooses the first occurrence of an attribute it finds when an attribute is referenced normally—by self.method(), for example. In this mode, Python chooses the lowest and leftmost in classic classes, and in nondiamond patterns in all classes; new-style classes may choose an option to the right before one above in diamonds.

§  Explicit: In some class models, you may sometimes need to select an attribute explicitly by referencing it through its class name—with superclass.method(self), for instance. Your code breaks the conflict and overrides the search’s default—to select an option to the right of or above the inheritance search’s default.
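The two modes in this list can be sketched in a few lines. The class names here are hypothetical, and the returned strings exist only to show which version of the method runs:

```python
class Left:
    def method(self): return 'Left.method'

class Right:
    def method(self): return 'Right.method'

class Default(Left, Right):
    pass                                   # Default: inherits leftmost, Left.method

class Explicit(Left, Right):
    def method(self):
        return Right.method(self)          # Explicit: pick by class name, pass self

print(Default().method())                  # Left.method
print(Explicit().method())                 # Right.method
```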

This is an issue only when the same name appears in multiple superclasses, and you do not wish to use the first one inherited. Because this isn’t as common an issue in typical Python code as it may sound, we’ll defer details on this topic until we study new-style classes and their MRO and super tools in the next chapter, and revisit this as a “gotcha” at the end of that chapter. First, though, the next section demonstrates a practical use case for multiple inheritance-based tools.

Coding Mix-in Display Classes

Perhaps the most common way multiple inheritance is used is to “mix in” general-purpose methods from superclasses. Such superclasses are usually called mix-in classes—they provide methods you add to application classes by inheritance. In a sense, mix-in classes are similar to modules: they provide packages of methods for use in their client subclasses. Unlike simple functions in modules, though, methods in mix-in classes also can participate in inheritance hierarchies, and have access to the self instance for using state information and other methods in their trees.

For example, as we’ve seen, Python’s default way to print a class instance object isn’t incredibly useful:

>>> class Spam:

        def __init__(self):                     # No __repr__ or __str__

            self.data1 = "food"

>>> X = Spam()

>>> print(X)                                    # Default: class name + address (id)

<__main__.Spam object at 0x00000000029CA908>    # Same in 2.X, but says "instance"

As you saw in both Chapter 28’s case study and Chapter 30’s operator overloading coverage, you can provide a __str__ or __repr__ method to implement a custom string representation of your own. But, rather than coding one of these in each and every class you wish to print, why not code it once in a general-purpose tool class and inherit it in all your classes?

That’s what mix-ins are for. Defining a display method in a mix-in superclass once enables us to reuse it anywhere we want to see a custom display format—even in classes that may already have another superclass. We’ve already seen tools that do related work:

§  Chapter 28’s AttrDisplay class formatted instance attributes in a generic __repr__ method, but it did not climb class trees and was utilized in single-inheritance mode only.

§  Chapter 29’s classtree.py module defined functions for climbing and sketching class trees, but it did not display object attributes along the way and was not architected as an inheritable class.

Here, we’re going to revisit these examples’ techniques and expand upon them to code a set of three mix-in classes that serve as generic display tools for listing instance attributes, inherited attributes, and attributes on all objects in a class tree. We’ll also use our tools in multiple-inheritance mode and deploy coding techniques that make classes better suited to use as generic tools.

Unlike Chapter 28, we’ll also code this with a __str__ instead of a __repr__. This is partially a style issue and limits their role to print and str, but the displays we’ll be developing will be rich enough to be categorized as more user-friendly than as-code. This policy also leaves client classes the option of coding an alternative lower-level display for interactive echoes and nested appearances with a __repr__. Using __repr__ here would still allow an alternative __str__, but the nature of the displays we’ll be implementing more strongly suggests a __str__ role. See Chapter 30 for a review of these distinctions.

Listing instance attributes with __dict__

Let’s get started with the simple case—listing attributes attached to an instance. The following class, coded in the file listinstance.py, defines a mix-in called ListInstance that overloads the __str__ method for all classes that include it in their header lines. Because this is coded as a class, ListInstance is a generic tool whose formatting logic can be used for instances of any subclass client:

#!python

# File listinstance.py (2.X + 3.X)

class ListInstance:

    """

    Mix-in class that provides a formatted print() or str() of instances via

    inheritance of __str__ coded here;  displays instance attrs only;  self is

    instance of lowest class; __X names avoid clashing with client's attrs

    """

    def __attrnames(self):

        result = ''

        for attr in sorted(self.__dict__):

            result += '\t%s=%s\n' % (attr, self.__dict__[attr])

        return result

    def __str__(self):

        return '<Instance of %s, address %s:\n%s>' % (

                           self.__class__.__name__,         # My class's name

                           id(self),                        # My address

                           self.__attrnames())              # name=value list

if __name__ == '__main__':

    import testmixin

    testmixin.tester(ListInstance)

All the code in this section runs in both Python 2.X and 3.X. A coding note: this code exhibits a classic comprehension pattern, and you could save some program real estate by implementing the __attrnames method here more concisely with a generator expression that is triggered by the string join method, but it’s arguably less clear—expressions that wrap lines like this should generally make you consider simpler coding alternatives:

    def __attrnames(self):

        return ''.join('\t%s=%s\n' % (attr, self.__dict__[attr])

                          for attr in sorted(self.__dict__))

ListInstance uses some previously explored tricks to extract the instance’s class name and attributes:

§  Each instance has a built-in __class__ attribute that references the class from which it was created, and each class has a __name__ attribute that references the name in the header, so the expression self.__class__.__name__ fetches the name of an instance’s class.

§  This class does most of its work by simply scanning the instance’s attribute dictionary (remember, it’s exported in __dict__) to build up a string showing the names and values of all instance attributes. The dictionary’s keys are sorted to finesse any ordering differences across Python releases.

In these respects, ListInstance is similar to Chapter 28’s attribute display; in fact, it’s largely just a variation on a theme. Our class here uses two additional techniques, though:

§  It displays the instance’s memory address by calling the id built-in function, which returns any object’s address (by definition, a unique object identifier, which will be useful in later mutations of this code).

§  It uses the pseudoprivate naming pattern for its worker method: __attrnames. As we learned earlier in this chapter, Python automatically localizes any such name to its enclosing class by expanding the attribute name to include the class name (in this case, it becomes _ListInstance__attrnames). This holds true for both class attributes (like methods) and instance attributes attached to self. As noted in Chapter 28’s first-cut version, this behavior is useful in a general tool like this, as it ensures that its names don’t clash with any names used in its client subclasses.
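To see the name expansion concretely, here is a minimal sketch that uses two hypothetical classes, Lister and Client, which are not part of this chapter's example files. Both define a method named __helper, yet neither replaces the other, because each name is expanded with the name of its own enclosing class:

```python
# Hypothetical classes for illustration only (not from listinstance.py)
class Lister:
    def __helper(self):                  # Becomes _Lister__helper
        return 'lister helper'
    def show(self):
        return self.__helper()           # Always resolves to _Lister__helper

class Client(Lister):
    def __helper(self):                  # Becomes _Client__helper: no clash
        return 'client helper'

c = Client()
result = c.show()                        # Runs the Lister version
mangled = sorted(n for n in dir(c) if n.endswith('__helper'))
print(result)
print(mangled)
```

Had both classes used a single-underscore name such as _helper instead, the client's version would have silently replaced the tool's.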

Because ListInstance defines a __str__ operator overloading method, instances derived from this class display their attributes automatically when printed, giving a bit more information than a simple address. Here is the class in action, in single-inheritance mode, mixed in to the previous section’s class (this code works the same in both Python 3.X and 2.X, though 2.X default repr displays use the label “instance” instead of “object”):

>>> from listinstance import ListInstance

>>> class Spam(ListInstance):                    # Inherit a __str__ method

        def __init__(self):

            self.data1 = 'food'

>>> x = Spam()

>>> print(x)                                     # print() and str() run __str__

<Instance of Spam, address 43034496:

        data1=food

>

You can also fetch and save the listing output as a string with str, without printing it. Interactive echoes still use the default format, because we’ve left __repr__ as an option for clients:

>>> display = str(x)                             # Print this to interpret escapes

>>> display

'<Instance of Spam, address 43034496:\n\tdata1=food\n>'

>>> x                                            # The __repr__ still is a default

<__main__.Spam object at 0x000000000290A780>
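Because the mix-in claims only __str__, a client remains free to code its own __repr__ for echoes and nested displays. The following sketch illustrates this with a stripped-down stand-in for the lister; the names MiniLister and Point are hypothetical, chosen only for this example:

```python
class MiniLister:                            # Hypothetical stand-in for ListInstance
    def __str__(self):
        attrs = ''.join('\t%s=%s\n' % (name, self.__dict__[name])
                        for name in sorted(self.__dict__))
        return '<Instance of %s:\n%s>' % (self.__class__.__name__, attrs)

class Point(MiniLister):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __repr__(self):                      # Client's own as-code display
        return 'Point(%r, %r)' % (self.x, self.y)

p = Point(1, 2)
friendly = str(p)                            # Mix-in's __str__
as_code  = repr(p)                           # Client's __repr__
nested   = str([p])                          # Containers use __repr__ for items
print(friendly)
print(as_code, nested)
```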

The ListInstance class is useful for any classes you write—even classes that already have one or more superclasses. This is where multiple inheritance comes in handy: by adding ListInstance to the list of superclasses in a class header (i.e., mixing it in), you get its __str__ “for free” while still inheriting from the existing superclass(es). The file testmixin0.py demonstrates with a first-cut testing script:

# File testmixin0.py

from listinstance import ListInstance # Get lister tool class

class Super:

    def __init__(self):               # Superclass __init__

        self.data1 = 'spam'           # Create instance attrs

    def ham(self):

        pass

class Sub(Super, ListInstance):       # Mix in ham and a __str__

    def __init__(self):               # Listers have access to self

        Super.__init__(self)

        self.data2 = 'eggs'           # More instance attrs

        self.data3 = 42

    def spam(self):                   # Define another method here

        pass

if __name__ == '__main__':

    X = Sub()

    print(X)                          # Run mixed-in __str__

Here, Sub inherits names from both Super and ListInstance; it’s a composite of its own names and names in both its superclasses. When you make a Sub instance and print it, you automatically get the custom representation mixed in from ListInstance (in this case, this script’s output is the same under both Python 3.X and 2.X, except for object addresses, which can naturally vary per process):

c:\code> python testmixin0.py

<Instance of Sub, address 44304144:

        data1=spam

        data2=eggs

        data3=42

>

This testmixin0 testing script works, but it hardcodes the tested class’s name in the code, and makes it difficult to experiment with alternatives—as we will in a moment. To be more flexible, we can borrow a page from Chapter 25’s module reloaders, and pass in the object to be tested, as in the following improved test script, testmixin—the one actually used by all the lister class modules’ self-test code. In this context the object passed in to the tester is a mix-in class instead of a function, but the principle is similar: everything qualifies as a passable “first class” object in Python:

#!python

# File testmixin.py (2.X + 3.X)

"""

Generic lister mixin tester: similar to transitive reloader in

Chapter 25, but passes a class object to tester (not function),

and testByNames adds loading of both module and class by name

strings here, in keeping with Chapter 31's factories pattern.

"""

import importlib

def tester(listerclass, sept=False):

    class Super:

        def __init__(self):            # Superclass __init__

            self.data1 = 'spam'        # Create instance attrs

        def ham(self):

            pass

    class Sub(Super, listerclass):     # Mix in ham and a __str__

        def __init__(self):            # Listers have access to self

            Super.__init__(self)

            self.data2 = 'eggs'        # More instance attrs

            self.data3 = 42

        def spam(self):                # Define another method here

            pass

    instance = Sub()                   # Return instance with lister's __str__

    print(instance)                    # Run mixed-in __str__ (or via str(x))

    if sept: print('-' * 80)

def testByNames(modname, classname, sept=False):

    modobject   = importlib.import_module(modname)  # Import by namestring

    listerclass = getattr(modobject, classname)     # Fetch attr by namestring

    tester(listerclass, sept)

if __name__ == '__main__':

    testByNames('listinstance',  'ListInstance',  True)      # Test all three here

    testByNames('listinherited', 'ListInherited', True)

    testByNames('listtree',      'ListTree',      False)

While it’s at it, this script also adds the ability to specify test module and class by name string, and leverages this in its self-test code—an application of the factory pattern’s mechanics described earlier. Here is the new script in action, being run by the lister module that imports it to test its own class (with the same results in 2.X and 3.X again); we can run the test script itself too, but that mode tests the two lister variants, which we have yet to see (or code!):

c:\code> python listinstance.py

<Instance of Sub, address 43256968:

        data1=spam

        data2=eggs

        data3=42

>

c:\code> python testmixin.py

<Instance of Sub, address 43977584:

        data1=spam

        data2=eggs

        data3=42

>

...and tests of two other lister classes coming up...
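The by-name loading in testByNames is not specific to the lister modules. As a small sketch of the same two steps, the following uses the standard library's collections module and its OrderedDict class as stand-ins; the helper name load_class is hypothetical:

```python
import importlib

def load_class(modname, classname):                 # Hypothetical helper name
    modobject = importlib.import_module(modname)    # Import by name string
    return getattr(modobject, classname)            # Fetch attr by name string

klass = load_class('collections', 'OrderedDict')
obj = klass(a=1)                                    # Use it like any class object
print(klass.__name__, obj['a'])
```

Since both names are plain strings, they could just as well come from a configuration file or the command line.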

The ListInstance class we’ve coded so far works in any class it’s mixed into because self refers to an instance of the subclass that pulls this class in, whatever that may be. Again, in a sense, mix-in classes are the class equivalent of modules—packages of methods useful in a variety of clients. For example, here is ListInstance working again in single-inheritance mode on a different class’s instances, loaded with import, and displaying attributes assigned outside the class:

>>> import listinstance

>>> class C(listinstance.ListInstance): pass

>>> x = C()

>>> x.a, x.b, x.c = 1, 2, 3

>>> print(x)

<Instance of C, address 43230824:

        a=1

        b=2

        c=3

>

Besides the utility they provide, mix-ins optimize code maintenance, like all classes do. For example, if you later decide to extend ListInstance’s __str__ to also print all the class attributes that an instance inherits, you’re safe; because it’s an inherited method, changing __str__ automatically updates the display of each subclass that imports the class and mixes it in. And since it’s now officially “later,” let’s move on to the next section to see what such an extension might look like.

Listing inherited attributes with dir

As it is, our ListInstance mix-in displays instance attributes only (i.e., names attached to the instance object itself). It’s trivial to extend the class to display all the attributes accessible from an instance, though: both its own and those it inherits from its classes. The trick is to use the dir built-in function instead of scanning the instance’s __dict__ dictionary; the latter holds instance attributes only, but the former also collects all inherited attributes in Python 2.2 and later.
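As a quick illustration of the difference, the following sketch compares the two name sources on a small hypothetical class pair (these Super and Sub classes are defined here just for the comparison):

```python
class Super:
    def ham(self):
        pass

class Sub(Super):
    def __init__(self):
        self.data = 'spam'

x = Sub()
own  = sorted(x.__dict__)                    # Instance attributes only
seen = dir(x)                                # Plus names inherited from classes
print(own)
print('ham' in own, 'ham' in seen)
```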

The following mutation codes this scheme; I’ve coded this in its own module to facilitate simple testing, but if existing clients were to use this version instead they would pick up the new display automatically (and recall from Chapter 25 that an import’s as clause can rename a new version to a prior name being used):

#!python

# File listinherited.py (2.X + 3.X)

class ListInherited:

    """

    Use dir() to collect both instance attrs and names inherited from

    its classes;  Python 3.X shows more names than 2.X because of the

    implied object superclass in the new-style class model;  getattr()

    fetches inherited names not in self.__dict__;  use __str__, not

    __repr__, or else this loops when printing bound methods!

    """

    def __attrnames(self):

        result = ''

        for attr in dir(self):                              # Instance dir()

            if attr[:2] == '__' and attr[-2:] == '__':      # Skip internals

                result += '\t%s\n' % attr

            else:

                result += '\t%s=%s\n' % (attr, getattr(self, attr))

        return result

    def __str__(self):

        return '<Instance of %s, address %s:\n%s>' % (

                           self.__class__.__name__,         # My class's name

                           id(self),                        # My address

                           self.__attrnames())              # name=value list

if __name__ == '__main__':

    import testmixin

    testmixin.tester(ListInherited)

Notice that this code skips __X__ names’ values; most of these are internal names that we don’t generally care about in a generic listing like this. This version also must use the getattr built-in function to fetch attributes by name string instead of using instance attribute dictionary indexing—getattr employs the inheritance search protocol, and some of the names we’re listing here are not stored on the instance itself.
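The need for getattr can be verified in isolation. In this minimal sketch, with classes that are hypothetical and unrelated to the test script, indexing the instance dictionary fails for an inherited name, while getattr finds it by inheritance search:

```python
class Super:
    def ham(self):
        return 'ham result'

class Sub(Super):
    def __init__(self):
        self.data = 'spam'

x = Sub()
method = getattr(x, 'ham')                   # Found by inheritance search
try:
    x.__dict__['ham']                        # Not stored on the instance itself
except KeyError:
    indexed = False
else:
    indexed = True
print(method(), indexed)
```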

To test the new version, run its file directly; it passes the class it defines to the testmixin.py file’s test function to be used as a mix-in in a subclass. The output of this test and lister class varies per release, though, because dir results differ. In Python 2.X, we get the following; notice the name mangling at work in the lister’s method name (I truncated some of the full value displays to fit on this page):

c:\code> c:\python27\python listinherited.py

<Instance of Sub, address 35161352:

        _ListInherited__attrnames=<bound method Sub.__attrnames of <test...more...>>

        __doc__

        __init__

        __module__

        __str__

        data1=spam

        data2=eggs

        data3=42

        ham=<bound method Sub.ham of <testmixin.Sub instance at 0x00000...more...>>

        spam=<bound method Sub.spam of <testmixin.Sub instance at 0x00000...more...>>

>

In Python 3.X, more attributes are displayed because all classes are “new style” and inherit names from the implied object superclass; more on this in Chapter 32. Because so many names are inherited from the default superclass, I’ve omitted many here—there are 32 in total in 3.3. Run this on your own for the full listing:

c:\code> c:\python33\python listinherited.py

<Instance of Sub, address 43253152:

        _ListInherited__attrnames=<bound method Sub.__attrnames of <test...more...>>

        __class__

        __delattr__

        __dict__

        __dir__

        __doc__

        __eq__

        ...more names omitted 32 total...

        __repr__

        __setattr__

        __sizeof__

        __str__

        __subclasshook__

        __weakref__

        data1=spam

        data2=eggs

        data3=42

        ham=<bound method Sub.ham of <testmixin.tester.<locals>.Sub ...more...>>

        spam=<bound method Sub.spam of <testmixin.tester.<locals>.Sub ...more...>>

>

As one possible improvement to address the proliferation of inherited built-in names and long values here, the following alternative for __attrnames in file listinherited2.py of the book’s examples package groups the double-underscore names separately, and minimizes line wrapping for large attribute values; notice how it escapes a % with %% so that just one remains for the final formatting operation at the end:

    def __attrnames(self, indent=' '*4):

        result  = 'Unders%s\n%s%%s\nOthers%s\n' % ('-'*77, indent, '-'*77)

        unders = []

        for attr in dir(self):                              # Instance dir()

            if attr[:2] == '__' and attr[-2:] == '__':      # Skip internals

                unders.append(attr)

            else:

                display = str(getattr(self, attr))[:82-(len(indent) + len(attr))]

                result += '%s%s=%s\n' % (indent, attr, display)

        return result % ', '.join(unders)

With this change, the class’s test output is a bit more sophisticated, but also more concise and usable:

c:\code> c:\python27\python listinherited2.py

<Instance of Sub, address 36299208:

Unders-----------------------------------------------------------------------------

    __doc__, __init__, __module__, __str__

Others-----------------------------------------------------------------------------

    _ListInherited__attrnames=<bound method Sub.__attrnames of <testmixin.Sub insta

    data1=spam

    data2=eggs

    data3=42

    ham=<bound method Sub.ham of <testmixin.Sub instance at 0x000000000229E1C8>>

    spam=<bound method Sub.spam of <testmixin.Sub instance at 0x000000000229E1C8>>

>

c:\code> c:\python33\python listinherited2.py

<Instance of Sub, address 43318912:

Unders-----------------------------------------------------------------------------

    __class__, __delattr__, __dict__, __dir__, __doc__, __eq__, __format__, __ge__,

__getattribute__, __gt__, __hash__, __init__, __le__, __lt__, __module__, __ne__,

__new__, __qualname__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__,

__str__, __subclasshook__, __weakref__

Others-----------------------------------------------------------------------------

    _ListInherited__attrnames=<bound method Sub.__attrnames of <testmixin.tester.<l

    data1=spam

    data2=eggs

    data3=42

    ham=<bound method Sub.ham of <testmixin.tester.<locals>.Sub object at 0x0000000

    spam=<bound method Sub.spam of <testmixin.tester.<locals>.Sub object at 0x00000

>

Display format is an open-ended problem (e.g., Python’s standard pprint “pretty printer” module may offer options here too), so we’ll leave further polishing as a suggested exercise. The tree lister of the next section may be more useful in any event.
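If the two-stage % formatting in the alternative __attrnames seems obscure, it may help to run it in isolation. This sketch uses shortened separator lines and a made-up list of names, but the same template technique: the first operation turns %%s into a literal %s, which the second operation then fills in:

```python
indent = ' ' * 4

# First pass: fills the three %s targets; %%s survives as a literal %s
template = 'Unders%s\n%s%%s\nOthers%s\n' % ('-' * 3, indent, '-' * 3)

# Second pass: fills the one remaining %s target
final = template % ', '.join(['__doc__', '__init__'])

print(repr(template))
print(final)
```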

NOTE

Looping in __repr__: One caution here—now that we’re displaying inherited methods too, we have to use __str__ instead of __repr__ to overload printing. With __repr__, this code will fall into recursive loops—displaying the value of a method triggers the __repr__ of the method’s class, in order to display the class. That is, if the lister’s __repr__ tries to display a method, displaying the method’s class will trigger the lister’s __repr__ again. Subtle, but true! Change __str__ to __repr__ here to see this for yourself. If you must use __repr__ in such a context, you can avoid the loops by using isinstance to compare the type of attribute values against types.MethodType in the standard library, to know which items to skip.
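As a rough sketch of the isinstance workaround just mentioned, the following hypothetical variant (the class name ListInheritedRepr and its placeholder text are assumptions, not code from the book's examples) overloads __repr__ but shows a fixed label for bound methods instead of their values, so the method display never calls back into the lister:

```python
import types

class ListInheritedRepr:                     # Hypothetical __repr__-based variant
    def __repr__(self):
        lines = []
        for attr in dir(self):
            if attr.startswith('__') and attr.endswith('__'):
                continue                     # Skip internals entirely here
            value = getattr(self, attr)
            if isinstance(value, types.MethodType):
                lines.append('\t%s=<method>\n' % attr)   # Don't display: would loop
            else:
                lines.append('\t%s=%r\n' % (attr, value))
        return '<Instance of %s:\n%s>' % (self.__class__.__name__, ''.join(lines))

class Demo(ListInheritedRepr):
    def __init__(self):
        self.data = 'spam'
    def action(self):
        pass

text = repr(Demo())                          # No recursion: methods are skipped
print(text)
```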

Listing attributes per object in class trees

Let’s code one last extension. As it is, our latest lister includes inherited names, but doesn’t give any sort of designation of the classes from which the names are acquired. As we saw in the classtree.py example near the end of Chapter 29, though, it’s straightforward to climb class inheritance trees in code. The following mix-in class, coded in the file listtree.py, makes use of this same technique to display attributes grouped by the classes they live in—it sketches the full physical class tree, displaying attributes attached to each object along the way. The reader must still infer attribute inheritance, but this gives substantially more detail than a simple flat list:

#!python

# File listtree.py (2.X + 3.X)

class ListTree:

    """

    Mix-in that returns an __str__ trace of the entire class tree and all

    its objects' attrs at and above self;  run by print(), str() returns

    constructed string;  uses __X attr names to avoid impacting clients;

    recurses to superclasses explicitly, uses str.format() for clarity;

    """

    def __attrnames(self, obj, indent):

        spaces = ' ' * (indent + 1)

        result = ''

        for attr in sorted(obj.__dict__):

            if attr.startswith('__') and attr.endswith('__'):

                result += spaces + '{0}\n'.format(attr)

            else:

                result += spaces + '{0}={1}\n'.format(attr, getattr(obj, attr))

        return result

    def __listclass(self, aClass, indent):

        dots = '.' * indent

        if aClass in self.__visited:

            return '\n{0}<Class {1}:, address {2}: (see above)>\n'.format(

                           dots,

                           aClass.__name__,

                           id(aClass))

        else:

            self.__visited[aClass] = True

            here  = self.__attrnames(aClass, indent)

            above = ''

            for super in aClass.__bases__:

                above += self.__listclass(super, indent+4)

            return '\n{0}<Class {1}, address {2}:\n{3}{4}{5}>\n'.format(

                           dots,

                           aClass.__name__,

                           id(aClass),

                           here, above,

                           dots)

    def __str__(self):

        self.__visited = {}

        here  = self.__attrnames(self, 0)

        above = self.__listclass(self.__class__, 4)

        return '<Instance of {0}, address {1}:\n{2}{3}>'.format(

                           self.__class__.__name__,

                           id(self),

                           here, above)

if __name__ == '__main__':

    import testmixin

    testmixin.tester(ListTree)

This class achieves its goal by traversing the inheritance tree—from an instance’s __class__ to its class, and then from the class’s __bases__ to all superclasses recursively, scanning each object’s attribute __dict__ along the way. Ultimately, it concatenates each tree portion’s string as the recursion unwinds.
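Stripped of attribute listing and formatting, the traversal itself is quite small. The following sketch, with a hypothetical function name and test classes, shows just the climb from an instance's __class__ through each class's __bases__, building its result as the recursion unwinds:

```python
def class_tree(cls, indent=0):               # Hypothetical helper, names only
    lines = ['.' * indent + cls.__name__]
    for base in cls.__bases__:               # Recur to all superclasses
        lines += class_tree(base, indent + 4)
    return lines

class A: pass
class B(A): pass
class C(B): pass

inst = C()
tree = class_tree(inst.__class__)            # Start at the instance's class
print('\n'.join(tree))
```

In 3.X the listing ends with the implied object superclass; in 2.X it stops at A for classic classes like these.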

It can take a while to understand recursive programs like this, but given the arbitrary shape and depth of class trees, we really have no choice here (apart from explicit stack equivalents of the sorts we met in Chapter 19 and Chapter 25, which tend to be no simpler, and which we’ll omit here for space and time). This class is coded to keep its business as explicit as possible, though, to maximize clarity.

For example, you could replace the __listclass method’s loop statement in the first of the following with the implicitly run generator expression in the second, but the second seems unnecessarily convoluted in this context—recursive calls embedded in a generator expression—and has no obvious performance advantage, especially given this program’s limited scope (neither alternative makes a temporary list, though the first may create more temporary results depending on the internal implementation of strings, concatenation, and join—something you’d need to time with Chapter 21’s tools to determine):

            above = ''

            for super in aClass.__bases__:

                above += self.__listclass(super, indent+4)

...or...

            above = ''.join(

                    self.__listclass(super, indent+4) for super in aClass.__bases__)

You could also code the else clause in __listclass like the following, as in the prior edition of this book—an alternative that embeds everything in the format arguments list; relies on the fact that the join call kicks off the generator expression and its recursive calls before the format operation even begins building up the result text; and seems more difficult to understand, despite the fact that I wrote it (never a good sign!):

            self.__visited[aClass] = True

            genabove = (self.__listclass(c, indent+4) for c in aClass.__bases__)

            return '\n{0}<Class {1}, address {2}:\n{3}{4}{5}>\n'.format(

                           dots,

                           aClass.__name__,

                           id(aClass),

                           self.__attrnames(aClass, indent),   # Runs before format!

                           ''.join(genabove),

                           dots)

As always, explicit is better than implicit, and your code can be as big a factor in this as the tools it uses.

Also notice how this version uses the Python 3.X and 2.6/2.7 string format method instead of % formatting expressions, in an effort to make substitutions arguably clearer; when many substitutions are applied like this, explicit argument numbers may make the code easier to decipher. In short, in this version we exchange the first of the following lines for the second:

        return '<Instance of %s, address %s:\n%s%s>' % (...)          # Expression

        return '<Instance of {0}, address {1}:\n{2}{3}>'.format(...)  # Method

This policy has an unfortunate downside in 3.2 and 3.3 too, but we have to run the code to see why.

Running the tree lister

Now, to test, run this class’s module file as before; it passes the ListTree class to testmixin.py to be mixed in with a subclass in the test function. The file’s tree-sketcher output in Python 2.X is as follows:

c:\code> c:\python27\python listtree.py

<Instance of Sub, address 36690632:

 _ListTree__visited={}

 data1=spam

 data2=eggs

 data3=42

....<Class Sub, address 36652616:

     __doc__

     __init__

     __module__

     spam=<unbound method Sub.spam>

........<Class Super, address 36652712:

         __doc__

         __init__

         __module__

         ham=<unbound method Super.ham>

........>

........<Class ListTree, address 30795816:

         _ListTree__attrnames=<unbound method ListTree.__attrnames>

         _ListTree__listclass=<unbound method ListTree.__listclass>

         __doc__

         __module__

         __str__

........>

....>

>

Notice in this output how methods are unbound now under 2.X, because we fetch them from classes directly. In the previous section’s version they displayed as bound methods, because ListInherited fetched these from instances with getattr instead (the first version indexed the instance __dict__ and did not display inherited methods on classes at all). Also observe how the lister’s __visited table has its name mangled in the instance’s attribute dictionary; unless we’re very unlucky, this won’t clash with other data there. Some of the lister class’s methods are mangled for pseudoprivacy as well.

Under Python 3.X in the following, we again get extra attributes which may vary within the 3.X line, and extra superclasses—as we’ll learn in the next chapter, all top-level classes inherit from the built-in object class automatically in 3.X; Python 2.X classes do so manually if they desire new-style class behavior. Also notice that the attributes that were unbound methods in 2.X are simple functions in 3.X, as described earlier in this chapter (and that again, I’ve deleted most built-in attributes in object to save space here; run this on your own for the complete listing):

c:\code> c:\python33\python listtree.py

<Instance of Sub, address 44277488:

 _ListTree__visited={}

 data1=spam

 data2=eggs

 data3=42

....<Class Sub, address 36990264:

     __doc__

     __init__

     __module__

     __qualname__

     spam=<function tester.<locals>.Sub.spam at 0x0000000002A3C840>

........<Class Super, address 36989352:

         __dict__

         __doc__

         __init__

         __module__

         __qualname__

         __weakref__

         ham=<function tester.<locals>.Super.ham at 0x0000000002A3C730>

............<Class object, address 506770624:

             __class__

             __delattr__

             __dir__

             __doc__

             __eq__

             ...more omitted: 22 total...

             __repr__

             __setattr__

             __sizeof__

             __str__

             __subclasshook__

............>

........>

........<Class ListTree, address 36988440:

         _ListTree__attrnames=<function ListTree.__attrnames at 0x0000000002A3C158>

         _ListTree__listclass=<function ListTree.__listclass at 0x0000000002A3C1E0>

         __dict__

         __doc__

         __module__

         __qualname__

         __str__

         __weakref__

............<Class object:, address 506770624: (see above)>

........>

....>

>

This version avoids listing the same class object twice by keeping a table of classes visited so far (this is why an object’s id is included—to serve as a key for a previously displayed item in the report). Like the transitive module reloader of Chapter 25, a dictionary works to avoid repeats in the output because class objects are hashable and thus may be dictionary keys; a set would provide similar functionality.
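To illustrate the set alternative, here is a minimal sketch of the same repeat-avoidance logic on a diamond-shaped tree; the function and class names are hypothetical, and the report is reduced to class names only:

```python
def list_once(cls, visited=None, indent=0):  # Hypothetical helper
    if visited is None:
        visited = set()                      # Classes are hashable: set is fine
    dots = '.' * indent
    if cls in visited:
        return [dots + cls.__name__ + ' (see above)']
    visited.add(cls)
    lines = [dots + cls.__name__]
    for base in cls.__bases__:
        lines += list_once(base, visited, indent + 4)
    return lines

class Top: pass
class Left(Top): pass
class Right(Top): pass
class Bottom(Left, Right): pass              # Top is reachable by two paths

report = list_once(Bottom)
print('\n'.join(report))
```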

Technically, cycles are not generally possible in class inheritance trees—a class must already have been defined to be named as a superclass, and Python raises an exception as it should if you attempt to create a cycle later by __bases__ changes—but the visited mechanism here avoids relisting a class twice:

>>> class C: pass

>>> class B(C): pass

>>> C.__bases__ = (B,)        # Deep, dark magic!

TypeError: a __bases__ item causes an inheritance cycle

Usage variation: Showing underscore name values

This version also takes care to avoid displaying large internal objects by skipping __X__ names again. If you comment out the code that treats these names specially:

        for attr in sorted(obj.__dict__):

#            if attr.startswith('__') and attr.endswith('__'):

#                result += spaces + '{0}\n'.format(attr)

#            else:

                result += spaces + '{0}={1}\n'.format(attr, getattr(obj, attr))

then their values will display normally. Here’s the output in 2.X with this temporary change made, giving the values of every attribute in the class tree:

c:\code> c:\python27\python listtree.py

<Instance of Sub, address 35750408:

 _ListTree__visited={}

 data1=spam

 data2=eggs

 data3=42

....<Class Sub, address 36353608:

     __doc__=None

     __init__=<unbound method Sub.__init__>

     __module__=testmixin

     spam=<unbound method Sub.spam>

........<Class Super, address 36353704:

         __doc__=None

         __init__=<unbound method Super.__init__>

         __module__=testmixin

         ham=<unbound method Super.ham>

........>

........<Class ListTree, address 31254568:

         _ListTree__attrnames=<unbound method ListTree.__attrnames>

         _ListTree__listclass=<unbound method ListTree.__listclass>

         __doc__=

    Mix-in that returns an __str__ trace of the entire class tree and all

    its objects' attrs at and above self;  run by print(), str() returns

    constructed string;  uses __X attr names to avoid impacting clients;

    recurses to superclasses explicitly, uses str.format() for clarity;

         __module__=__main__

         __str__=<unbound method ListTree.__str__>

........>

....>

>

This test’s output is much larger in 3.X and may justify isolating underscore names in general as we did earlier. In fact, this test may not even work in some currently recent 3.X releases as is:

c:\code> c:\python33\python listtree.py

   ...etc...

   File "listtree.py", line 18, in __attrnames

    result += spaces + '{0}={1}\n'.format(attr, getattr(obj, attr))

TypeError: Type method_descriptor doesn't define __format__

I debated recoding to work around this issue, but it serves as a fair example of debugging requirements and techniques in a dynamic open source project like Python. Per the following note, the str.format call no longer supports certain object types that are the values of built-in attribute names—yet another reason these names are probably better skipped.

NOTE

Debugging a str.format issue: In 3.X, running the commented-out version works in 3.0 and 3.1, but there seems to be a bug, or at least a regression, here in 3.2 and 3.3—these Pythons fail with an exception because five built-in methods in object do not define a __format__ expected by str.format, and the default in object is apparently no longer applied correctly in such cases with empty and generic formatting targets. To see this live, it’s enough to run simplified code that isolates the problem:

c:\code> py -3.1

>>> '{0}'.format(object.__reduce__)

"<method '__reduce__' of 'object' objects>"

c:\code> py -3.3

>>> '{0}'.format(object.__reduce__)

TypeError: Type method_descriptor doesn't define __format__

Per both prior behavior and current Python documentation, empty targets like this are supposed to convert the object to its str print string (see both the original PEP 3101 and the 3.3 language reference manual). Oddly, the {0} and {0:s} string targets both now fail, but the {0!s} forced str conversion target works, as does manual str preconversion—apparently reflecting a change for a type-specific case that neglected perhaps more common generic usage modes:

c:\code> py -3.3

>>> '{0:s}'.format(object.__reduce__)

TypeError: Type method_descriptor doesn't define __format__

>>> '{0!s}'.format(object.__reduce__)

"<method '__reduce__' of 'object' objects>"

>>> '{0}'.format(str(object.__reduce__))

"<method '__reduce__' of 'object' objects>"

To fix, wrap the format call in a try statement to catch the exception; use % formatting expressions instead of the str.format method; use one of the aforementioned still-working str.format usage modes and hope it does not change too; or wait for a repair of this in a later 3.X release. Here’s the recommended workaround using the tried-and-true % (it’s also noticeably shorter, but I won’t repeat Chapter 7’s comparisons here):

c:\code> py -3.3

>>> '%s' % object.__reduce__

"<method '__reduce__' of 'object' objects>"

To apply this in the tree lister’s code, replace the first of the following lines with the second:

result += spaces + '{0}={1}\n'.format(attr, getattr(obj, attr))

result += spaces + '%s=%s\n' % (attr, getattr(obj, attr))

Python 2.X has the same regression in 2.7 but not 2.6—inherited from the 3.2 change, apparently—but does not show object methods in this chapter’s example. Since this example generates too much output in 3.X anyhow, it’s a moot point here, but is a decent example of real-world coding. Unfortunately, using newer features like str.format sometimes puts your code in the awkward position of beta tester in the current 3.X line!
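The first fix listed in this note, catching the exception, can be sketched as a small helper. This is only an illustration: the name format_pair is hypothetical and is not part of listtree.py. It tries str.format first and falls back on a % expression only if a TypeError is raised, so on Pythons where str.format accepts the value the fallback is simply never used:

```python
def format_pair(attr, value, spaces=''):
    """Return one 'attr=value' display line (hypothetical helper)."""
    try:
        return spaces + '{0}={1}\n'.format(attr, value)
    except TypeError:
        # str.format rejected the value's type: use % formatting instead
        return spaces + '%s=%s\n' % (attr, value)

line = format_pair('__reduce__', object.__reduce__, spaces='    ')
print(line, end='')
```

Whether this is preferable to switching to % formatting outright is a judgment call; the try version keeps the original str.format code path but adds a branch that is hard to exercise on Pythons without the problem.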

Usage variation: Running on larger modules

For more fun, uncomment the underscore handler lines to enable them again, and try mixing this class into something more substantial, like the Button class of Python’s tkinter GUI toolkit module. In general, you’ll want to name ListTree first (leftmost) in a class header, so its __str__ is picked up; Button has one, too, and the leftmost superclass is always searched first in multiple inheritance.

The output of the following is fairly massive (20K characters and 330 lines in 3.X—and 38K if you forget to uncomment the underscore detection!), so run this code on your own to see the full listing. Notice how our lister’s __visited dictionary attribute mixes harmlessly with those created by tkinter itself. If you’re using Python 2.X, also recall that you should use Tkinter for the module name instead of tkinter:

>>> from listtree import ListTree

>>> from tkinter import Button                  # Both classes have a __str__

>>> class MyButton(ListTree, Button): pass      # ListTree first: use its __str__

>>> B = MyButton(text='spam')

>>> open('savetree.txt', 'w').write(str(B))     # Save to a file for later viewing

20513

>>> len(open('savetree.txt').readlines())       # Lines in the file

330

>>> print(B)                                    # Print the display here

<Instance of MyButton, address 43363688:

 _ListTree__visited={}

 _name=43363688

 _tclCommands=[]

 _w=.43363688

 children={}

 master=.

 ...much more omitted...

>>> S = str(B)                                  # Or print just the first part

>>> print(S[:1000])

Experiment arbitrarily on your own. The main point here is that OOP is all about code reuse, and mix-in classes are a powerful example. Like almost everything else in programming, multiple inheritance can be a useful device when applied well. In practice, though, it is an advanced feature and can become complicated if used carelessly or excessively. We’ll revisit this topic as a gotcha at the end of the next chapter.
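The leftmost-first rule that lets ListTree’s __str__ win over Button’s can also be seen without a GUI toolkit. The following is a minimal sketch using two stand-in classes (Lister and Widget are hypothetical names, not the book’s classes); both define __str__, and the order in the class header decides which one an instance uses:

```python
class Lister:
    def __str__(self):
        return '<Lister display>'

class Widget:
    def __str__(self):
        return '<Widget display>'

class ListerFirst(Lister, Widget): pass     # Lister is leftmost: its __str__ wins
class WidgetFirst(Widget, Lister): pass     # Widget is leftmost: its __str__ wins

print(ListerFirst())
print(WidgetFirst())
```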

Collector module

Finally, to make importing our tools even easier, we can provide a collector module that combines them in a single namespace—importing just the following gives access to all three lister mix-ins at once:

# File lister.py

# Collect all three listers in one module for convenience

from listinstance  import ListInstance

from listinherited import ListInherited

from listtree      import ListTree

Lister = ListTree  # Choose a default lister

Importers can use the individual class names as is, or alias them to a common name used in subclasses that can be modified in the import statement:

>>> import lister

>>> lister.ListInstance                          # Use a specific lister

<class 'listinstance.ListInstance'>

>>> lister.Lister                                # Use Lister default

<class 'listtree.ListTree'>

>>> from lister import Lister                    # Use Lister default

>>> Lister

<class 'listtree.ListTree'>

>>> from lister import ListInstance as Lister    # Use Lister alias

>>> Lister

<class 'listinstance.ListInstance'>

Python often makes flexible tool APIs nearly automatic.

Room for improvement: MRO, slots, GUIs

As with most software, there’s much more we could do here. The following gives some pointers on extensions you may wish to explore. Some are interesting projects, and two serve as a segue to the next chapter, but for space reasons they will have to remain in the suggested exercise category here.

General ideas: GUIs, built-ins

Grouping double-underscore names as we did earlier may help reduce the size of the tree display, though some like __init__ are user-defined and may merit special treatment. Sketching the tree in a GUI might be a natural next step too—the tkinter toolkit that we utilized in the prior section’s lister examples ships with Python and provides basic but easy support, and others offer richer but more complex alternatives. See the notes at the end of Chapter 28’s case study for more pointers in this department.

Physical trees versus inheritance: using the MRO (preview)

In the next chapter, we’ll also meet the new-style class model, which modifies the search order for one special multiple inheritance case (diamonds). There, we’ll also study the class.__mro__ new-style class object attribute—a tuple giving the class tree search order used by inheritance, known as the new-style MRO.

As is, our ListTree tree lister sketches the physical shape of the inheritance tree, and expects the viewer to infer from this where an attribute is inherited from. This was its goal, but a general object viewer might also use the MRO tuple to automatically associate an attribute with the class from which it is inherited—by scanning the new-style MRO (or the classic classes’ DFLR ordering) for each inherited attribute in a dir result, we can simulate Python’s inheritance search, and map attributes to their source objects in the physical class tree displayed.

In fact, we will write code that comes very close to this idea in the next chapter’s mapattrs module, and reuse this example’s test classes there to demonstrate the idea, so stay tuned for an epilogue to this story. This might be used instead of or in addition to displaying attribute physical locations in __attrnames here; both forms might be useful data for programmers to see. This approach is also one way to deal with slots, the topic of the next note.
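As a rough preview of the idea, and not the next chapter’s mapattrs module itself, the scan might be sketched as follows. The function name attr_sources and the test classes A and B are hypothetical, and the code assumes new-style classes (all classes in 3.X). It also simplifies Python’s real lookup by always checking the instance first:

```python
def attr_sources(instance):
    """Map each dir() name to the first object on the search path storing it."""
    search = [instance] + list(type(instance).__mro__)
    sources = {}
    for name in dir(instance):
        for obj in search:
            # Instances without a __dict__ (e.g., slots-only) are skipped here
            if name in getattr(obj, '__dict__', {}):
                sources[name] = obj
                break
    return sources

class A:
    x = 1

class B(A):
    y = 2

I = B()
I.z = 3
found = attr_sources(I)
for name in ('x', 'y', 'z'):
    print(name, '=>', found[name])
```

Here x maps to class A, y to class B, and z to the instance itself, while names such as __init__ map to the built-in object class.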

Virtual data: slots, properties, and more (preview)

Because they scan instance __dict__ namespace dictionaries, the ListInstance and ListTree classes presented here raise some subtle design issues. In Python classes, some names associated with instance data may not be stored at the instance itself. This includes topics presented in the next chapter such as new-style properties, slots, and descriptors, but also attributes dynamically computed in all classes with tools like __getattr__. None of these “virtual” attributes’ names are stored in an instance’s namespace dictionary, so none will be displayed as part of an instance’s own data.

Of these, slots seem the most strongly associated with an instance; they store data on instances, even though their names don’t appear in instance namespace dictionaries. Properties and descriptors are associated with instances too, but they don’t reserve space in the instance, their computed nature is much more explicit, and they may seem closer to class-level methods than instance data.

As we’ll see in the next chapter, slots function like instance attributes, but are created and managed by automatically created items in classes. They are a relatively infrequently used new-style class option, where instance attributes are declared in a __slots__ class attribute, and not physically stored in an instance’s __dict__; in fact, slots may suppress a __dict__ entirely. Because of this, tools that display instances by scanning their namespaces alone won’t directly associate the instance with attributes stored in slots. As is, ListTree displays slots as class attributes wherever they appear (though not at the instance), and ListInstance doesn’t display them at all.

Though this will make more sense after we study this feature in the next chapter, it impacts code here and similar tools. For example, if in testmixin.py we assign __slots__=['data1'] in Super and __slots__=['data3'] in Sub, only the data2 attribute is displayed in the instance by these two lister classes. ListTree also displays data1 and data3, but as attributes of the Super and Sub class objects and with a special format for their values (technically, they are class-level descriptors, another new-style tool introduced in the next chapter).
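The effect can be reproduced in isolation. The following sketch is not the book’s test file: Mixin is a hypothetical stand-in for a lister class with no __slots__ of its own, which is why instances still receive a __dict__ for data2. The slot names live in the class namespaces as descriptors, not in the instance’s dictionary:

```python
class Mixin:                         # No __slots__: instances keep a __dict__
    pass

class Super:
    __slots__ = ['data1']
    def __init__(self):
        self.data1 = 'spam'

class Sub(Super, Mixin):
    __slots__ = ['data3']
    def __init__(self):
        Super.__init__(self)
        self.data2 = 'eggs'          # Normal attribute: stored in __dict__
        self.data3 = 42              # Slot: stored on the instance, outside __dict__

I = Sub()
print(I.__dict__)                                # Only data2 appears
print(type(Super.__dict__['data1']).__name__)    # Slot manager lives in the class
print(sorted(n for n in dir(I) if n.startswith('data')))
```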

As the next chapter will explain, to show slot attributes as instance names, tools generally need to use dir to get a list of all attributes (both physically present and inherited) and then either use getattr to fetch their values from the instance, or fetch values from their inheritance source via __dict__ in tree scans and accept that some will be displayed as their class-level implementations. Because dir includes the names of inherited “virtual” attributes (including both slots and properties), they would be included in the instance set. As we’ll also find, the MRO might assist here to map dir attributes to their sources, or restrict instance displays to names coded in user-defined classes by filtering out names inherited from the built-in object.
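A minimal sketch of the dir-plus-getattr approach follows. The names virtual_attrs and Person are hypothetical, and the filtering policy (drop names inherited from object and all other double-underscore names) is just one possible choice. A default is passed to getattr because a declared slot that was never assigned raises AttributeError on fetch:

```python
class Person:
    __slots__ = ['name', 'age']
    def __init__(self, name):
        self.name = name             # The age slot is left unassigned
    @property
    def label(self):
        return self.name.upper()     # Computed on fetch, stored nowhere

def virtual_attrs(instance, missing='<unset>'):
    """Collect instance-related names via dir, including slots and properties."""
    skip = set(dir(object))
    result = {}
    for name in dir(instance):
        if name in skip or (name.startswith('__') and name.endswith('__')):
            continue
        result[name] = getattr(instance, name, missing)
    return result

print(virtual_attrs(Person('bob')))
```

Note that this also picks up ordinary methods if the class defines any, so a real tool would probably filter or label callables separately.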

ListInherited is immune to most of this, because it already displays the full dir results set, which includes both __dict__ names and all classes’ __slots__ names, though its display is of marginal use as is. A ListTree variant using the dir technique along with the MRO sequence to map attributes to classes would apply to slots too, because slots-based names appear in classes’ __dict__ results individually as slot management tools, though not in the instance __dict__.

Alternatively, as a policy we could simply let our code handle slot-based attributes as it currently does, rather than complicating it for a rarely used, advanced feature that’s even questionable practice today. Slots and normal instance attributes are different kinds of names. In fact, displaying slot names as attributes of classes instead of instances is technically more accurate: as we’ll see in the next chapter, their implementation is at classes, though their space is at instances.

Ultimately, attempting to collect all the “virtual” attributes associated with a class may be a bit of a pipe dream anyhow. Techniques like those outlined here may address slots and properties, but some attributes are entirely dynamic, with no physical basis at all: those computed on fetch by generic methods such as __getattr__ are not data in the classic sense. Tools that attempt to display data in a wildly dynamic language like Python must come with the caveat that some data is ethereal at best!
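To see why such attributes defeat any lister, consider this small sketch (the class name Computed is hypothetical). Its __getattr__ manufactures a value for any undefined non-underscore name, yet nothing about those names is recorded in dir results or in the instance’s namespace dictionary:

```python
class Computed:
    def __getattr__(self, name):
        # Run only for names not found by normal attribute lookup
        if name.startswith('_'):
            raise AttributeError(name)
        return name.upper()

X = Computed()
print(X.spam)                # Value is computed on fetch
print('spam' in dir(X))      # But the name is unknown to dir
print(X.__dict__)            # And to the instance namespace
```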

We’ll also make a minor extension to this section’s code in the exercises at the end of this part of the book, to list superclass names in parentheses at the start of instance displays, so keep it filed for future reference for now. To better understand the last of the preceding two points, we need to wrap up this chapter and move on to the next and last in this part of the book.

Other Design-Related Topics

In this chapter, we’ve studied inheritance, composition, delegation, multiple inheritance, bound methods, and factories—all common patterns used to combine classes in Python programs. We’ve really only scratched the surface here in the design patterns domain, though. Elsewhere in this book you’ll find coverage of other design-related topics, such as:

§  Abstract superclasses (Chapter 29)

§  Decorators (Chapter 32 and Chapter 39)

§  Type subclasses (Chapter 32)

§  Static and class methods (Chapter 32)

§  Managed attributes (Chapter 32 and Chapter 38)

§  Metaclasses (Chapter 32 and Chapter 40)

For more details on design patterns, though, we’ll delegate to other resources on OOP at large. Although patterns are important in OOP work and are often more natural in Python than other languages, they are not specific to Python itself, and are a subject that’s often best learned through experience.

Chapter Summary

In this chapter, we sampled common ways to use and combine classes to optimize their reusability and factoring benefits—what are usually considered design issues that are often independent of any particular programming language (though Python can make them easier to implement). We studied delegation (wrapping objects in proxy classes), composition (controlling embedded objects), and inheritance (acquiring behavior from other classes), as well as some more esoteric concepts such as pseudoprivate attributes, multiple inheritance, bound methods, and factories.

The next chapter ends our look at classes and OOP by surveying more advanced class-related topics. Some of its material may be of more interest to tool writers than application programmers, but it still merits a review by most people who will do OOP in Python—if not for your code, then for the code of others you may need to understand. First, though, here’s another quick chapter quiz to review.

Test Your Knowledge: Quiz

1.    What is multiple inheritance?

2.    What is delegation?

3.    What is composition?

4.    What are bound methods?

5.    What are pseudoprivate attributes used for?

Test Your Knowledge: Answers

1.    Multiple inheritance occurs when a class inherits from more than one superclass; it’s useful for mixing together multiple packages of class-based code. The left-to-right order in class statement headers determines the general order of attribute searches.

2.    Delegation involves wrapping an object in a proxy class, which adds extra behavior and passes other operations to the wrapped object. The proxy retains the interface of the wrapped object.

3.    Composition is a technique whereby a controller class embeds and directs a number of objects, and provides an interface all its own; it’s a way to build up larger structures with classes.

4.    Bound methods combine an instance and a method function; you can call them without passing in an instance object explicitly because the original instance is still available.

5.    Pseudoprivate attributes (whose names begin, but do not end, with two underscores: __X) are used to localize names to the enclosing class. This includes both class attributes like methods defined inside the class, and self instance attributes assigned inside the class’s methods. Such names are expanded to include the class name, which makes them generally unique.
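Answers 4 and 5 are easy to verify with a few lines of code. In this sketch (the Counter class is a hypothetical example), fetching C.bump without calling it yields a bound method that remembers C, and the __count attribute assigned inside the class is stored under its expanded name:

```python
class Counter:
    def __init__(self):
        self.__count = 0             # Stored as _Counter__count
    def bump(self):
        self.__count += 1
        return self.__count

C = Counter()
bound = C.bump                       # Bound method: pairs C with the bump function
first = bound()                      # No instance passed: C is remembered
second = bound()
print(first, second)
print(bound.__self__ is C)
print(list(C.__dict__))              # Shows the expanded attribute name
```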