Learning Python (2013)

Part VI. Classes and OOP

Chapter 32. Advanced Class Topics

This chapter concludes our look at OOP in Python by presenting a few more advanced class-related topics: we will survey subclassing built-in types, “new style” class changes and extensions, static and class methods, slots and properties, function and class decorators, the MRO and the super call, and more.

As we’ve seen, Python’s OOP model is, at its core, relatively simple, and some of the topics presented in this chapter are so advanced and optional that you may not encounter them very often in your Python applications-programming career. In the interest of completeness, though—and because you never know when an “advanced” topic may crop up in code you use—we’ll round out our discussion of classes with a brief look at these advanced tools for OOP work.

As usual, because this is the last chapter in this part of the book, it ends with a section on class-related “gotchas,” and the set of lab exercises for this part. I encourage you to work through the exercises to help cement the ideas we’ve studied here. I also suggest working on or studying larger OOP Python projects as a supplement to this book. As with much in computing, the benefits of OOP tend to become more apparent with practice.

NOTE

Content notes: This chapter collects advanced class topics, but some are too large for this chapter to cover well. Topics such as properties, descriptors, decorators, and metaclasses are mentioned only briefly here, and given a fuller treatment in the final part of this book, after exceptions. Be sure to look ahead for more complete examples and extended coverage of some of the subjects that fall into this chapter’s category.

You’ll also notice that this is the largest chapter in this book—I’m assuming that readers courageous enough to take on this chapter’s topics are ready to roll up their sleeves and explore its in-depth coverage. If you’re not looking for advanced OOP topics, you may wish to skip ahead to chapter-end materials, and come back here when you confront these tools in the code of your programming future.

Extending Built-in Types

Besides implementing new kinds of objects, classes are sometimes used to extend the functionality of Python’s built-in types to support more exotic data structures. For instance, to add queue insert and delete methods to lists, you can code classes that wrap (embed) a list object and export insert and delete methods that process the list specially, like the delegation technique we studied in Chapter 31. As of Python 2.2, you can also use inheritance to specialize built-in types. The next two sections show both techniques in action.

Extending Types by Embedding

Do you remember those set functions we wrote in Chapter 16 and Chapter 18? Here’s what they look like brought back to life as a Python class. The following example (the file setwrapper.py) implements a new set object type by moving some of the set functions to methods and adding some basic operator overloading. For the most part, this class just wraps a Python list with extra set operations. But because it’s a class, it also supports multiple instances and customization by inheritance in subclasses. Unlike our earlier functions, using classes here allows us to make multiple self-contained set objects with preset data and behavior, rather than passing lists into functions manually:

class Set:

   def __init__(self, value = []):    # Constructor

       self.data = []                 # Manages a list

       self.concat(value)

   def intersect(self, other):        # other is any sequence

       res = []                       # self is the subject

       for x in self.data:

           if x in other:             # Pick common items

               res.append(x)

       return Set(res)                # Return a new Set

   def union(self, other):            # other is any sequence

       res = self.data[:]             # Copy of my list

       for x in other:                # Add items in other

           if not x in res:

               res.append(x)

       return Set(res)

   def concat(self, value):           # value: list, Set...

       for x in value:                # Removes duplicates

           if not x in self.data:

               self.data.append(x)

   def __len__(self):          return len(self.data)            # len(self), if self

   def __getitem__(self, key): return self.data[key]            # self[i], self[i:j]

   def __and__(self, other):   return self.intersect(other)     # self & other

   def __or__(self, other):    return self.union(other)         # self | other

   def __repr__(self):         return 'Set:' + repr(self.data)  # print(self),...

   def __iter__(self):         return iter(self.data)           # for x in self,...

To use this class, we make instances, call methods, and run defined operators as usual:

from setwrapper import Set

x = Set([1, 3, 5, 7])

print(x.union(Set([1, 4, 7])))       # prints Set:[1, 3, 5, 7, 4]

print(x | Set([1, 4, 6]))            # prints Set:[1, 3, 5, 7, 4, 6]

Overloading operations such as indexing and iteration also enables instances of our Set class to often masquerade as real lists. Because you will interact with and extend this class in an exercise at the end of this chapter, I won’t say much more about this code until Appendix D.
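To see what this masquerading means in practice, here is a minimal sketch run under Python 3.X. The class MiniSet is a hypothetical, trimmed-down stand-in for setwrapper.Set (it is not one of this book's files), kept short so the example is self-contained:

```python
class MiniSet:                                   # Hypothetical stand-in for setwrapper.Set
    def __init__(self, value=()):
        self.data = []                           # Manages a list, as before
        for x in value:
            if x not in self.data:               # Removes duplicates
                self.data.append(x)
    def __len__(self):          return len(self.data)
    def __getitem__(self, key): return self.data[key]
    def __iter__(self):         return iter(self.data)

x = MiniSet([1, 3, 5, 7, 3])

size = len(x)                                    # __len__
first, middle = x[0], x[1:3]                     # __getitem__: index and slice
doubled = [item * 2 for item in x]               # __iter__: comprehensions, loops
as_tuple = tuple(x)                              # __iter__: any iteration context
has_five = 5 in x                                # Falls back on __iter__ here

print(size, first, middle, doubled, as_tuple, has_five)
```

The instance is not a real list, though: list methods such as append or sort are absent unless the class defines or delegates them.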

Extending Types by Subclassing

Beginning with Python 2.2, all the built-in types in the language can now be subclassed directly. Type-conversion functions such as list, str, dict, and tuple have become built-in type names—although transparent to your script, a type-conversion call (e.g., list('spam')) is now really an invocation of a type’s object constructor.
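As a quick check of this claim, the following sketch (run under Python 3.X, with variable names chosen only for illustration) shows that list is itself a class object, and that a conversion call simply makes an instance of it:

```python
# Illustrative check: conversion "functions" are really type (class) objects
chars = list('spam')                   # Calls the list type's constructor

is_class = isinstance(list, type)      # list is a class object like any other
same_type = type(chars) is list        # The result is an instance of that class

print(chars, is_class, same_type)
```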

This change allows you to customize or extend the behavior of built-in types with user-defined class statements: simply subclass the new type names to customize them. Instances of your type subclasses can generally be used anywhere that the original built-in type can appear. For example, suppose you have trouble getting used to the fact that Python list offsets begin at 0 instead of 1. Not to worry—you can always code your own subclass that customizes this core behavior of lists. The file typesubclass.py shows how:

# Subclass built-in list type/class

# Map 1..N to 0..N-1; call back to built-in version.

class MyList(list):

    def __getitem__(self, offset):

        print('(indexing %s at %s)' % (self, offset))

        return list.__getitem__(self, offset - 1)

if __name__ == '__main__':

    print(list('abc'))

    x = MyList('abc')               # __init__ inherited from list

    print(x)                        # __repr__ inherited from list

    print(x[1])                     # MyList.__getitem__

    print(x[3])                     # Customizes list superclass method

    x.append('spam'); print(x)      # Attributes from list superclass

    x.reverse();      print(x)

In this file, the MyList subclass extends the built-in list’s __getitem__ indexing method only, to map indexes 1 to N back to the required 0 to N−1. All it really does is decrement the submitted index and call back to the superclass’s version of indexing, but it’s enough to do the trick:

% python typesubclass.py

['a', 'b', 'c']

['a', 'b', 'c']

(indexing ['a', 'b', 'c'] at 1)

a

(indexing ['a', 'b', 'c'] at 3)

c

['a', 'b', 'c', 'spam']

['spam', 'c', 'b', 'a']

This output also includes tracing text the class prints on indexing. Of course, whether changing indexing this way is a good idea in general is another issue—users of your MyList class may very well be confused by such a core departure from Python sequence behavior! The ability to customize built-in types this way can be a powerful asset, though.
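One likely source of that confusion is sketched below, under Python 3.X and with the tracing print removed for brevity: only indexing expressions are remapped, so every inherited list method still counts from 0.

```python
class MyList(list):                              # Same idea as typesubclass.py, minus tracing
    def __getitem__(self, offset):
        return list.__getitem__(self, offset - 1)

x = MyList('abc')

by_index = x[1]                                  # Customized: 1-based indexing
position = x.index('a')                          # Inherited: still reports offset 0
popped = x.pop(0)                                # Inherited: still takes a 0-based offset

print(by_index, position, popped)
```

Making the whole interface consistently 1-based would mean redefining each such method too, which is part of why this sort of change is rarely worth the effort.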

For instance, this coding pattern gives rise to an alternative way to code a set—as a subclass of the built-in list type, rather than a standalone class that manages an embedded list object as shown in the prior section. As we learned in Chapter 5, Python today comes with a powerful built-in set object, along with literal and comprehension syntax for making new sets. Coding one yourself, though, is still a great way to learn about type subclassing in general.

The following class, coded in the file setsubclass.py, customizes lists to add just methods and operators related to set processing. Because all other behavior is inherited from the built-in list superclass, this makes for a shorter and simpler alternative, since everything not defined here is routed to list directly:

from __future__ import print_function    # 2.X compatibility

class Set(list):

    def __init__(self, value = []):      # Constructor

        list.__init__(self)              # Customizes list

        self.concat(value)               # Copies mutable defaults

    def intersect(self, other):          # other is any sequence

        res = []                         # self is the subject

        for x in self:

            if x in other:               # Pick common items

                res.append(x)

        return Set(res)                  # Return a new Set

    def union(self, other):              # other is any sequence

        res = Set(self)                  # Copy me and my list

        res.concat(other)

        return res

    def concat(self, value):             # value: list, Set, etc.

        for x in value:                  # Removes duplicates

            if not x in self:

                self.append(x)

    def __and__(self, other): return self.intersect(other)

    def __or__(self, other):  return self.union(other)

    def __repr__(self):       return 'Set:' + list.__repr__(self)

if __name__ == '__main__':

    x = Set([1,3,5,7])

    y = Set([2,1,4,5,6])

    print(x, y, len(x))

    print(x.intersect(y), y.union(x))

    print(x & y, x | y)

    x.reverse(); print(x)

Here is the output of the self-test code at the end of this file. Because subclassing core types is a somewhat advanced feature with a limited target audience, I’ll omit further details here, but I invite you to trace through these results in the code to study its behavior (which is the same on Python 3.X and 2.X):

% python setsubclass.py

Set:[1, 3, 5, 7] Set:[2, 1, 4, 5, 6] 4

Set:[1, 5] Set:[2, 1, 4, 5, 6, 3, 7]

Set:[1, 5] Set:[1, 3, 5, 7, 2, 4, 6]

Set:[7, 5, 3, 1]

There are more efficient ways to implement sets with dictionaries in Python, which replace the nested linear search scans in the set implementations shown here with more direct dictionary index operations (hashing) and so run much quicker. For more details, see the continuation of this thread in the follow-up book Programming Python. Again, if you’re interested in sets, also take another look at the set object type we explored in Chapter 5; this type provides extensive set operations as built-in tools. Set implementations are fun to experiment with, but they are no longer strictly required in Python today.
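As a rough idea of what a dictionary-based version might look like, here is a minimal sketch. It is not the implementation from Programming Python. The class name DictSet is hypothetical, items are assumed to be hashable, and the display order shown relies on dictionaries retaining insertion order (Python 3.7 and later):

```python
class DictSet:                                   # Hypothetical name, for illustration only
    def __init__(self, value=()):
        self.data = {}                           # Items are stored as dictionary keys
        self.concat(value)

    def concat(self, value):                     # value: any iterable of hashable items
        for x in value:
            self.data[x] = None                  # Hashing removes duplicates

    def intersect(self, other):
        # "x in other" is a hashed lookup when other is a DictSet or dict
        return DictSet(x for x in self.data if x in other)

    def union(self, other):
        res = DictSet(self.data)                 # Copy my keys
        res.concat(other)
        return res

    def __contains__(self, x):  return x in self.data
    def __iter__(self):         return iter(self.data)
    def __len__(self):          return len(self.data)
    def __and__(self, other):   return self.intersect(other)
    def __or__(self, other):    return self.union(other)
    def __repr__(self):         return 'DictSet:' + repr(list(self.data))

x = DictSet([1, 3, 5, 7])
y = DictSet([2, 1, 4, 5, 6])
print(x & y, x | y)
```

Membership tests here are dictionary lookups rather than list scans, which is the source of the speedup when both operands are DictSet objects.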

For another type subclassing example, explore the implementation of the bool type in Python 2.3 and later. As mentioned earlier in the book, bool is a subclass of int with two instances (True and False) that behave like the integers 1 and 0 but inherit custom string-representation methods that display their names.
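The relationship between bool and int is easy to verify interactively. The following sketch (variable names are illustrative only) checks each part of it in turn:

```python
is_subclass = issubclass(bool, int)              # bool derives from int
total = True + True                              # Integer math: the result is a plain int
equal = (True == 1, False == 0)                  # Same values as 1 and 0
names = (repr(True), str(False))                 # Customized display methods
plain = int.__repr__(True)                       # The superclass display, run explicitly

print(is_subclass, total, equal, names, plain)
```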

The “New Style” Class Model

In release 2.2, Python introduced a new flavor of classes, known as new-style classes; classes following the original and traditional model became known as classic classes when compared to the new kind. In 3.X the class story has merged, but it remains split for Python 2.X users and code:

§  In Python 3.X, all classes are automatically what were formerly called “new style,” whether they explicitly inherit from object or not. Coding the object superclass is optional and implied.

§  In Python 2.X, classes must explicitly inherit from object (or another built-in type) to be considered “new style” and enable and obtain all new-style behavior. Classes without this are “classic.”

Because all classes are automatically new-style in 3.X, the features of new-style classes are simply normal class features in that line. I’ve opted to keep their descriptions in this section separate, however, in deference to users of Python 2.X code—classes in such code acquire new-style features and behavior only when they are derived from object.

In other words, when Python 3.X users see descriptions of “new style” topics in this book, they should take them to be descriptions of existing properties of their classes. For 2.X readers, these are a set of optional changes and extensions that you may choose to enable or not, unless the code you must use already employs them.

In Python 2.X, the identifying syntactic difference for new-style classes is that they are derived from either a built-in type, such as list, or a special built-in class known as object. The built-in name object is provided to serve as a superclass for new-style classes if no other built-in type is appropriate to use:

class newstyle(object):                    # 2.X explicit new-style derivation

    ...normal class code...                # Not required in 3.X: automatic

Any class derived from object, or any other built-in type, is automatically treated as a new-style class. That is, as long as a built-in type is somewhere in its superclass tree, a 2.X class acquires new-style class behavior and extensions. Classes not derived from built-ins such as object are considered classic.
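The following sketch shows the 3.X side of this story; the class names are hypothetical. Under 3.X, a class coded without a superclass lands in the same place as one that lists object explicitly. If the same code is run under 2.X, Plain is a classic class with an empty __bases__ tuple instead.

```python
class Plain: pass                                # 3.X: object is implied
class Explicit(object): pass                     # Explicit derivation, required in 2.X
class Sub(list): pass                            # Any built-in type works as well

plain_bases = Plain.__bases__                    # (object,) in 3.X
same_bases = Plain.__bases__ == Explicit.__bases__
rooted = issubclass(Sub, object)                 # list itself derives from object
has_mro = hasattr(Plain, '__mro__')              # A new-style class attribute

print(plain_bases, same_bases, rooted, has_mro)
```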

Just How New Is New-Style?

As we’ll see, new-style classes come with profound differences that impact programs broadly, especially when code leverages their added advanced features. In fact, at least in terms of its OOP support, these changes on some levels transform Python into a different language altogether—one that’s mandated in the 3.X line, one that’s optional in 2.X only if ignored by every programmer, and one that borrows much more from (and is often as complex as) other languages in this domain.

New-style classes stem in part from an attempt to merge the notion of class with that of type around the time of Python 2.2, though they went unnoticed by many until they were escalated to required knowledge in 3.X. You’ll need to judge the success of that merging for yourself, but as we’ll see, there are still distinctions in the model—now between class and metaclass—and one of its side effects is to make normal classes more powerful but also substantially more complex. The new-style inheritance algorithm formalized in Chapter 40, for example, grows in complexity by at least a factor of 2.

Still, some programmers using straightforward application code may notice only slight divergence from traditional “classic” classes. After all, we’ve managed to get to this point in this book writing substantial class examples, with mostly just passing mentions of this change. Moreover, the classic class model still available in 2.X works exactly as it has for some two decades.[63]

However, because they modify core class behaviors, new-style classes had to be introduced in Python 2.X as a distinct tool so as to avoid impacting any existing code that depends on the prior model. For example, some subtle differences, such as diamond pattern inheritance search and the interaction of built-in operations and managed attribute methods such as __getattr__ can cause some existing code to fail if left unchanged. Using optional extensions in the new model such as slots can have the same effect.

The class model split is removed in Python 3.X, which mandates new-style classes, but it still exists for readers using 2.X, or reusing the vast amount of existing 2.X code in production use. Because this has been an optional extension in 2.X, code written for that line may use either class model.

The next two top-level sections provide overviews of the ways in which new-style classes differ and the new tools they provide. These topics represent potential changes to some Python 2.X readers, but simply additional advanced class topics to many Python 3.X readers. If you’re in the latter group, you’ll find full coverage here, though some of it is presented in the context of changes—which you can accept as features, but only if you never must deal with any of the millions of lines of existing 2.X code.


[63] As a data point, the book Programming Python, a 1,600-page applications programming follow-up to this book that uses 3.X exclusively, neither uses nor needs to accommodate any of the new-style class tools of this chapter, and still manages to build significant programs for GUIs, websites, systems programming, databases, and text. It’s mostly straightforward code that leverages built-in types and libraries to do its work, not obscure and esoteric OOP extensions. When it does use classes, they are relatively simple, providing structure and code factoring. That book’s code is also probably more representative of real-world programming than some in this language tutorial text—which suggests that many of Python’s advanced OOP tools may be artificial, having more to do with language design than practical program goals. Then again, that book has the luxury of restricting its toolset to such code; as soon as your coworker finds a way to use an arcane language feature, all bets are off!

New-Style Class Changes

New-style classes differ from classic classes in a number of ways, some of which are subtle but can impact both existing 2.X code and common coding styles. As preview and summary, here are some of the most prominent ways they differ:

Attribute fetch for built-ins: instance skipped

The __getattr__ and __getattribute__ generic attribute interception methods are still run for attributes accessed by explicit name, but no longer for attributes implicitly fetched by built-in operations. They are not called for __X__ operator overloading method names in built-in contexts only—the search for such names begins at classes, not instances. This breaks or complicates objects that serve as proxies for another object’s interface, if wrapped objects implement operator overloading. Such methods must be redefined for the sake of differing built-ins dispatch in new-style classes.

Classes and types merged: type testing

Classes are now types, and types are now classes. In fact, the two are essentially synonyms, though the metaclasses that now subsume types are still somewhat distinct from normal classes. The type(I) built-in returns the class an instance is made from, instead of a generic instance type, and is normally the same as I.__class__. Moreover, classes are instances of the type class, and type may be subclassed to customize class creation with metaclasses coded with class statements. This can impact code that tests types or otherwise relies on the prior type model.

Automatic object root class: defaults

All new-style classes (and hence types) inherit from object, which comes with a small set of default operator overloading methods (e.g., __repr__). In 3.X, this class is added automatically above the user-defined root (i.e., topmost) classes in a tree, and need not be listed as a superclass explicitly. This can affect code that assumes the absence of method defaults and root classes.

Inheritance search order: MRO and diamonds

Diamond patterns of multiple inheritance have a slightly different search order—roughly, at diamonds they are searched across before up, and more breadth-first than depth-first. This attribute search order, known as the MRO, can be traced with a new __mro__ attribute available on new-style classes. The new search order largely applies only to diamond class trees, though the new model’s implied object root itself forms a diamond in all multiple inheritance trees. Code that relies on the prior order will not work the same.

Inheritance algorithm: Chapter 40

The algorithm used for inheritance in new-style classes is substantially more complex than the depth-first model of classic classes, incorporating special cases for descriptors, metaclasses, and built-ins. We won’t be able to formalize this until Chapter 40 after we’ve studied metaclasses and descriptors in more depth, but it can impact code that does not anticipate its extra convolutions.

New advanced tools: code impacts

New-style classes have a set of new class tools, including slots, properties, descriptors, super, and the __getattribute__ method. Most of these have very specific tool-building purposes. Their use can also impact or break existing code, though; slots, for example, sometimes prevent creation of an instance namespace dictionary altogether, and generic attribute handlers may require different coding.

We’ll explore the extensions noted in the last of these items in a later top-level section of its own, and will defer formal inheritance algorithm coverage until Chapter 40 as noted. Because the other items on this list have the potential to break traditional Python code, though, let’s take a closer look at each in turn here.
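As a brief preview of two of the items in the preceding list, the following sketch (run under 3.X, with hypothetical single-letter class names) traces the search order of a small diamond and checks the merged type model:

```python
class A:       attr = 'A'                        # Common superclass: top of the diamond
class B(A):    pass
class C(A):    attr = 'C'
class D(B, C): pass                              # Bottom of the diamond

order = [klass.__name__ for klass in D.__mro__]  # The new-style search order
found = D().attr                                 # C's version is reached before A's

I = D()
merged = type(I) is I.__class__ is D             # type() returns the instance's class

print(order, found, merged)
```

Under the classic depth-first rule, the search would instead climb from B to A before ever visiting C, and so would pick up the attr in A.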

NOTE

Content note: Keep in mind that new-style class changes apply to both 3.X and 2.X, even though they are an option in the latter. This chapter and book sometimes label features as 3.X changes to contrast with traditional 2.X code, but some are technically introduced by new-style classes—which are mandated in 3.X, but can show up in 2.X code too. For space, this distinction is called out often but not dogmatically here. Complicating this distinction, some 3.X class-related changes owe to new-style classes (e.g., skipping __getattr__ for operator methods) but some do not (e.g., replacing unbound methods with functions). Moreover, many 2.X programmers stick to classic classes, ignoring what they view as a 3.X feature. New-style classes are not new, though, and apply to both Pythons—if they appear in 2.X code, they’re required reading for 2.X users too.

Attribute Fetch for Built-ins Skips Instances

We introduced this new-style class change in sidebars in both Chapter 28 and Chapter 31 because of their impact on prior examples and topics. In new-style classes (and hence all classes in 3.X), the generic instance attribute interception methods __getattr__ and __getattribute__ are no longer called by built-in operations for __X__ operator overloading method names—the search for such names begins at classes, not instances. Attributes accessed by explicit name, however, are routed through these methods, even if they are __X__ names. Hence, this is primarily a change to the behavior of built-in operations.

More formally, if a class defines a __getitem__ index overload method and X is an instance of this class, then an index expression like X[I] is roughly equivalent to X.__getitem__(I) for classic classes, but type(X).__getitem__(X, I) for new-style classes—the latter beginning its search in the class, and thus skipping a __getattr__ step from the instance for an undefined name.
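A short sketch makes this equivalence concrete. Here a hypothetical class Indexer defines __getitem__, and an instance is given a same-named attribute of its own; under new-style classes (all classes in 3.X), the index expression ignores the instance's version, but an explicit fetch by name does not:

```python
class Indexer:                                   # Hypothetical example class
    def __getitem__(self, index):
        return 'class-level: %s' % index

X = Indexer()
X.__getitem__ = lambda index: 'instance-level: %s' % index

implicit = X[0]                                  # Built-in operation: search starts at class
through_type = type(X).__getitem__(X, 0)         # Its rough new-style equivalent
explicit = X.__getitem__(0)                      # Explicit name: instance is checked first

print(implicit, through_type, explicit, sep='\n')
```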

Technically, this method search for built-in operations like X[I] uses normal inheritance beginning at the class level, and inspects only the namespace dictionaries of all the classes from which X derives—a distinction that can matter in the metaclass model we’ll meet later in this chapter and focus on in Chapter 40, where classes may acquire behavior differently. The instance, however, is omitted by built-ins’ search.

Why the lookup change?

You can find formal rationales for this change elsewhere; this book is disinclined to parrot justifications for a change that breaks many working programs. But this is imagined as both an optimization path and a solution to a seemingly obscure call pattern issue. The former rationale is supported by the frequency of built-in operations. If every +, for example, requires extra steps at the instance, it can degrade program speed—especially so given the new-style model’s many attribute-level extensions.

The latter rationale is more obscure, and is described in Python manuals; in short, it reflects a conundrum introduced by the metaclass model. Because classes are now instances of metaclasses, and because metaclasses can define built-in operator methods to process the classes they generate, a method call run for a class must skip the class itself and look one level higher to pick up a method that processes the class, rather than selecting the class’s own version. Its own version would result in an unbound method call, because the class’s own method processes lower instances. This is just the usual unbound method model we discussed in the prior chapter, but is potentially aggravated by the fact that classes can acquire type behavior from metaclasses too.
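The following is a minimal sketch of this conundrum, using the 3.X metaclass declaration syntax we'll study later; the names Meta and C are hypothetical. Both the metaclass and the class define __len__, and the built-in must pick the one that is a level above its subject:

```python
class Meta(type):
    def __len__(cls):                            # Processes classes created by Meta
        return 99

class C(metaclass=Meta):                         # 3.X-only syntax
    def __len__(self):                           # Processes instances of C
        return 3

class_len = len(C)                               # Skips C itself: runs Meta.__len__(C)
instance_len = len(C())                          # Skips the instance: runs C.__len__(I)

try:
    C.__len__()                                  # Explicit fetch finds C's own version...
except TypeError as exc:
    failure = str(exc)                           # ...which has no instance to process

print(class_len, instance_len, failure)
```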

As a result, because classes are both types and instances in their own right, all instances are skipped for built-in operation method lookup. This is supposedly applied to normal instances for uniformity and consistency, but both non-built-in names and direct and explicit calls to built-in names still check the instance anyhow. Though perhaps a consequence of the new-style class model, to some this may seem a solution arrived at for the sake of a usage pattern that was more artificial and obscure than the widely used one it broke. Its role as optimization path seems more defensible, but also not without repercussions.

In particular, this has potentially broad implications for the delegation-based classes, often known as proxy classes, when embedded objects implement operator overloading. In new-style classes, such a proxy object’s class must generally redefine any such names to catch and delegate, either manually or with tools. The net effect is to either significantly complicate or wholly obviate an entire category of programs. We explored delegation in Chapter 28 and Chapter 31; it’s a common pattern used to augment or adapt another class’s interface—to add validation, tracing, timing, and many other sorts of logic. Though proxies may be more the exception than the rule in typical Python code, many Python programs depend upon them.

Implications for attribute interception

In simple terms, and run in Python 2.X to show how new-style classes differ, indexing and prints are routed to __getattr__ in traditional classes, but not for new-style classes, where printing uses a default:[64]

>>> class C:

        data = 'spam'

        def __getattr__(self, name):             # Classic in 2.X: catches built-ins

            print(name)

            return getattr(self.data, name)

>>> X = C()

>>> X[0]

__getitem__

's'

>>> print(X)                                     # Classic doesn't inherit default

__str__

spam

>>> class C(object):                             # New-style in 2.X and 3.X

        ...rest of class unchanged...

>>> X = C()                                      # Built-ins not routed to getattr

>>> X[0]

TypeError: 'C' object does not support indexing

>>> print(X)

<__main__.C object at 0x02205780>

Though apparently rationalized in the name of class metaclass methods and optimizing built-in operations, this divergence is not addressed by special-casing normal instances having a __getattr__, and applies only to built-in operations—not to normally named methods, or explicit calls to built-in methods by name:

>>> class C: pass                                # 2.X classic class

>>> X = C()

>>> X.normal = lambda: 99

>>> X.normal()

99

>>> X.__add__ = lambda y: 88 + y

>>> X.__add__(1)

89

>>> X + 1

89

>>> class C(object): pass                        # 2.X/3.X new-style class

>>> X = C()

>>> X.normal = lambda: 99

>>> X.normal()                                   # Normals still from instance

99

>>> X.__add__ = lambda y: 88 + y

>>> X.__add__(1)                                 # Ditto for explicit built-in names

89

>>> X + 1

TypeError: unsupported operand type(s) for +: 'C' and 'int'

This behavior winds up being inherited by the __getattr__ attribute interception method:

>>> class C(object):

        def __getattr__(self, name): print(name)

>>> X = C()

>>> X.normal             # Normal names are still routed to getattr

normal

>>> X.__add__            # Direct calls by name are too, but expressions are not!

__add__

>>> X + 1

TypeError: unsupported operand type(s) for +: 'C' and 'int'

Proxy coding requirements

In a more realistic delegation scenario, this means that built-in operations like expressions no longer work the same as their traditional direct-call equivalent. Asymmetrically, direct calls to built-in method names still work, but equivalent expressions do not because through-type calls fail for names not at the class level and above. In other words, this distinction arises in built-in operations only; explicit fetches run correctly:

>>> class C(object):

        data = 'spam'

        def __getattr__(self, name):

            print('getattr: ' + name)

            return getattr(self.data, name)

>>> X = C()

>>> X.__getitem__(1)           # Traditional mapping works but new-style's does not

getattr: __getitem__

'p'

>>> X[1]

TypeError: 'C' object does not support indexing

>>> type(X).__getitem__(X, 1)

AttributeError: type object 'C' has no attribute '__getitem__'

>>> X.__add__('eggs')          # Ditto for +: instance skipped for expression only

getattr: __add__

'spameggs'

>>> X + 'eggs'

TypeError: unsupported operand type(s) for +: 'C' and 'str'

>>> type(X).__add__(X, 'eggs')

AttributeError: type object 'C' has no attribute '__add__'

The net effect: to code a proxy of an object whose interface may in part be invoked by built-in operations, new-style classes require both __getattr__ for normal names, as well as method redefinitions for all names accessed by built-in operations—whether coded manually, obtained from superclasses, or generated by tools. When redefinitions are so incorporated, calls through both instances and types are equivalent to built-in operations, though redefined names are no longer routed to the generic __getattr__ undefined name handler, even for explicit name calls:

>>> class C(object):                                    # New-style: 3.X and 2.X

        data = 'spam'

        def __getattr__(self, name):                    # Catch normal names

            print('getattr: ' + name)

            return getattr(self.data, name)

        def __getitem__(self, i):                       # Redefine built-ins

            print('getitem: ' + str(i))

            return self.data[i]                         # Run expr or getattr

        def __add__(self, other):

            print('add: ' + other)

            return getattr(self.data, '__add__')(other)

>>> X = C()

>>> X.upper

getattr: upper

<built-in method upper of str object at 0x0233D670>

>>> X.upper()

getattr: upper

'SPAM'

>>> X[1]                            # Built-in operation (implicit)

getitem: 1

'p'

>>> X.__getitem__(1)                # Traditional equivalence (explicit)

getitem: 1

'p'

>>> type(X).__getitem__(X, 1)       # New-style equivalence

getitem: 1

'p'

>>> X + 'eggs'                      # Ditto for + and others

add: eggs

'spameggs'

>>> X.__add__('eggs')

add: eggs

'spameggs'

>>> type(X).__add__(X, 'eggs')

add: eggs

'spameggs'

For more details

We will revisit this change in Chapter 40 on metaclasses, and by example in the contexts of attribute management in Chapter 38 and privacy decorators in Chapter 39. In the latter of these, we’ll also explore coding structures for providing proxies with the required operator methods generically—it’s not an impossible task, and may need to be coded just once if done well. For more of the sort of code influenced by this issue, see those later chapters, as well as the earlier examples in Chapter 28 and Chapter 31.

Because we’ll expand on this issue later in the book, we’ll cut the coverage short here. For external links and pointers on this issue, though, see the following (along with your local search engine):

§  Python Issue 643841: this issue has been discussed widely, but its most official history seems to be documented at http://bugs.python.org/issue643841. There, it was raised as a concern for real programs and escalated to be addressed, but a proposed library remedy or broader change in Python was struck down in favor of a simple documentation change to describe the new mandated behavior.

§  Tool recipes: also see http://code.activestate.com/recipes/252151, an ActiveState Python recipe that describes a tool that automatically fills in special method names as generic call dispatchers in a proxy class created with metaclass techniques introduced later in this chapter. This tool must still ask you to pass in the operator method names that a wrapped object may implement, though (it must, as interface components of a wrapped object may be inherited from arbitrary sources).

§  Other approaches: a web search today will uncover numerous additional tools that similarly populate proxy classes with overloading methods; it’s a widespread concern! Again, in Chapter 39, we’ll also see how to code, just once, straightforward and general superclasses that provide the required methods or attributes as mix-ins, without metaclasses, redundant code generation, or similarly complex techniques.

This story may evolve over time, of course, but has been an issue for many years. As things stand today, classic class proxies for objects that do any operator overloading are effectively broken as new-style classes. Such classes in both 2.X and 3.X require coding or generating wrappers for all the implicitly invoked operator methods a wrapped object may support. This is not ideal for such programs, as some proxies may require dozens of wrapper methods (potentially over 50!), but it reflects, or is at least an artifact of, the design goals of new-style class developers.
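As a rough illustration of the wrapper generation just described, the following sketch adds delegating operator methods to a proxy class in a loop. The names Proxy, OpProxy, add_operators, and _wrapped are hypothetical and invented for this example; this is neither the ActiveState recipe nor the technique coded in Chapter 39, just one minimal way the idea might be expressed. It assumes Python 3.X for the print displays.

```python
class Proxy(object):                          # New-style in both 2.X and 3.X
    def __init__(self, wrapped):
        self._wrapped = wrapped               # Hypothetical attribute name
    def __getattr__(self, name):              # Catches explicit name fetches only
        return getattr(self._wrapped, name)

def add_operators(cls, names):
    """Hypothetical helper: add one delegating method per operator name."""
    def make(name):
        def method(self, *args):
            return getattr(self._wrapped, name)(*args)
        method.__name__ = name
        return method
    for name in names:
        setattr(cls, name, make(name))        # Store in the class, not the instance
    return cls

class OpProxy(Proxy): pass
add_operators(OpProxy, ['__getitem__', '__add__', '__len__'])

plain, full = Proxy('spam'), OpProxy('spam')
print(plain.upper())                          # Named fetch: routed by __getattr__
try:
    plain[1]                                  # Implicit fetch: skips __getattr__
except TypeError as exc:
    print('TypeError: %s' % exc)
print(full[1], full + 'eggs', len(full))      # Generated methods live in the class
```

As in the tools mentioned above, the caller must still list the operator names to generate, because the proxy cannot know in advance which ones a wrapped object supports.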

NOTE

Be sure to see Chapter 40’s metaclass coverage for an additional illustration of this issue and its rationale. We’ll also see there that this behavior of built-ins qualifies as a special case in new-style inheritance. Understanding this well requires more background on metaclasses than the current chapter can provide, a regrettable byproduct of metaclasses in general—they’ve become prerequisite to more usage than their originators may have foreseen.

Type Model Changes

On to our next new-style change: depending on your assessment, in new-style classes the distinction between type and class has either been greatly muted or has vanished entirely. Specifically:

Classes are types

The type object generates classes as its instances, and classes generate instances of themselves. Both are considered types, because they generate instances. In fact, there is no real difference between built-in types like lists and strings and user-defined types coded as classes. This is why we can subclass built-in types, as shown earlier in this chapter—a subclass of a built-in type such as list qualifies as a new-style class and becomes a new user-defined type.

Types are classes

New class-generating types may be coded in Python as the metaclasses we’ll meet later in this chapter—user-defined type subclasses that are coded with normal class statements, and control creation of the classes that are their instances. As we’ll see, metaclasses are both class and type, though they are distinct enough to support a reasonable argument that the prior type/class dichotomy has become one of metaclass/class, perhaps at the cost of added complexity in normal classes.

Besides allowing us to subclass built-in types and code metaclasses, this type/class merging is most visible in practice when we do explicit type testing. With Python 2.X’s classic classes, the type of a class instance is a generic “instance,” but the types of built-in objects are more specific:

C:\code> c:\python27\python

>>> class C: pass                       # Classic classes in 2.X

>>> I = C()                             # Instances are made from classes

>>> type(I), I.__class__

(<type 'instance'>, <class __main__.C at 0x02399768>)

>>> type(C)                             # But classes are not the same as types

<type 'classobj'>

>>> C.__class__

AttributeError: class C has no attribute '__class__'

>>> type([1, 2, 3]), [1, 2, 3].__class__

(<type 'list'>, <type 'list'>)

>>> type(list), list.__class__

(<type 'type'>, <type 'type'>)

But with new-style classes in 2.X, the type of a class instance is the class it’s created from, since classes are simply user-defined types—the type of an instance is its class, and the type of a user-defined class is the same as the type of a built-in object type. Classes have a __class__ attribute now, too, because they are instances of type:

C:\code> c:\python27\python

>>> class C(object): pass               # New-style classes in 2.X

>>> I = C()                             # Type of instance is class it's made from

>>> type(I), I.__class__

(<class '__main__.C'>, <class '__main__.C'>)

>>> type(C), C.__class__                # Classes are user-defined types

(<type 'type'>, <type 'type'>)

The same is true for all classes in Python 3.X, since all classes are automatically new-style, even if they have no explicit superclasses. In fact, the distinction between built-in types and user-defined class types seems to melt away altogether in 3.X:

C:\code> c:\python33\python

>>> class C: pass

>>> I = C()                             # All classes are new-style in 3.X

>>> type(I), I.__class__                # Type of instance is class it's made from

(<class '__main__.C'>, <class '__main__.C'>)

>>> type(C), C.__class__                # Class is a type, and type is a class

(<class 'type'>, <class 'type'>)

>>> type([1, 2, 3]), [1, 2, 3].__class__

(<class 'list'>, <class 'list'>)

>>> type(list), list.__class__          # Classes and built-in types work the same

(<class 'type'>, <class 'type'>)

As you can see, in 3.X classes are types, but types are also classes. Technically, each class is generated by a metaclass—a class that is normally either type itself, or a subclass of it customized to augment or manage generated classes. Besides impacting code that does type testing, this turns out to be an important hook for tool developers. We’ll talk more about metaclasses later in this chapter, and again in more detail in Chapter 40.
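One way to see that classes really are instances of type is to call type directly with three arguments (a class name, a tuple of superclasses, and an attribute dictionary), which builds a class much as a class statement would. The sketch below is illustrative only; the names C, hello, and attr are made up for the example.

```python
def hello(self):
    return 'hello from ' + type(self).__name__

# Roughly equivalent to a "class C(object):" statement with attr and hello inside
C = type('C', (object,), {'attr': 1, 'hello': hello})

I = C()
print(type(C), type(I))               # The class's type is type; the instance's is C
print(I.attr, I.hello())              # Attributes come from the dictionary passed in
print(type(C) is type(list))          # Same type as a built-in: both are made by type
```

A metaclass is essentially a subclass of type that customizes this construction step.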

Implications for type testing

Besides providing for built-in type customization and metaclass hooks, the merging of classes and types in the new-style class model can impact code that does type testing. In Python 3.X, for example, the types of class instances compare directly and meaningfully, and in the same way as built-in type objects. This follows from the fact that classes are now types, and an instance’s type is the instance’s class:

C:\code> c:\python33\python

>>> class C: pass

>>> class D: pass

>>> c, d = C(), D()

>>> type(c) == type(d)                 # 3.X: compares the instances' classes

False

>>> type(c), type(d)

(<class '__main__.C'>, <class '__main__.D'>)

>>> c.__class__, d.__class__

(<class '__main__.C'>, <class '__main__.D'>)

>>> c1, c2 = C(), C()

>>> type(c1) == type(c2)

True

With classic classes in 2.X, though, comparing instance types is almost useless, because all instances have the same “instance” type. To truly compare types, the instance __class__ attributes must be compared (if you care about portability, this works in 3.X, too, but it’s not required there):

C:\code> c:\python27\python

>>> class C: pass

>>> class D: pass

>>> c, d = C(), D()

>>> type(c) == type(d)                 # 2.X: all instances are same type!

True

>>> c.__class__ == d.__class__         # Compare classes explicitly if needed

False

>>> type(c), type(d)

(<type 'instance'>, <type 'instance'>)

>>> c.__class__, d.__class__

(<class __main__.C at 0x024585A0>, <class __main__.D at 0x024588D0>)

And as you should expect by now, new-style classes in 2.X work the same as all classes in 3.X in this regard—comparing instance types compares the instances’ classes automatically:

C:\code> c:\python27\python

>>> class C(object): pass

>>> class D(object): pass

>>> c, d = C(), D()

>>> type(c) == type(d)                 # 2.X new-style: same as all in 3.X

False

>>> type(c), type(d)

(<class '__main__.C'>, <class '__main__.D'>)

>>> c.__class__, d.__class__

(<class '__main__.C'>, <class '__main__.D'>)

Of course, as I’ve pointed out numerous times in this book, type checking is usually the wrong thing to do in Python programs (we code to object interfaces, not object types), and the more general isinstance built-in is more likely what you’ll want to use in the rare cases where instance class types must be queried. However, knowledge of Python’s type model can help clarify the class model in general.
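To make the isinstance recommendation concrete, the following short sketch contrasts it with direct type comparison. The class names Base and Sub are hypothetical; the classes derive from object so that the results are the same in 2.X and 3.X.

```python
class Base(object): pass
class Sub(Base): pass

s = Sub()
print(type(s) == Base)                # False: exact comparison ignores subclasses
print(type(s) == Sub)                 # True: only the exact class matches
print(isinstance(s, Base))            # True: isinstance honors inheritance
print(isinstance(s, (int, Base)))     # True: a tuple means "any of these"
print(issubclass(Sub, Base))          # True: the analogous test for classes
```

Because isinstance also accepts instances of subclasses, it is usually the safer choice when a class test cannot be avoided.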

All Classes Derive from “object”

Another ramification of the type change in the new-style class model is that because all classes derive (inherit) from the class object either implicitly or explicitly, and because all types are now classes, every object derives from the object built-in class, whether directly or through a superclass. Consider the following interaction in Python 3.X:

>>> class C: pass                     # For new-style classes

>>> X = C()

>>> type(X), type(C)                  # Type is class instance was created from

(<class '__main__.C'>, <class 'type'>)

As before, the type of a class instance is the class it was made from, and the type of a class is the type class because classes and types have merged. It is also true, though, that the instance and class are both derived from the built-in object class and type, an implicit or explicit superclass of every class:

>>> isinstance(X, object)

True

>>> isinstance(C, object)             # Classes always inherit from object

True

The preceding returns the same results for both new-style and classic classes in 2.X today, though 2.X type results differ. More importantly, as we’ll see ahead, object is not added to or present in a 2.X classic class’s __bases__ tuple, and so is not a true superclass.

The same relationship holds true for built-in types like lists and strings, because types are classes in the new-style model—built-in types are now classes, and their instances derive from object, too:

>>> type('spam'), type(str)

(<class 'str'>, <class 'type'>)

>>> isinstance('spam', object)        # Same for built-in types (classes)

True

>>> isinstance(str, object)

True

In fact, type itself derives from object, and object is in turn an instance of type, even though the two are different objects. This circular relationship caps the object model and stems from the fact that types are classes that generate classes:

>>> type(type)                        # All classes are types, and vice versa

<class 'type'>

>>> type(object)

<class 'type'>

>>> isinstance(type, object)          # All classes derive from object, even type

True

>>> isinstance(object, type)          # Types make classes, and type is a class

True

>>> type is object

False
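Because isinstance reports both "is an instance of" and "inherits from" relationships, it can help to separate the two with issubclass and the __bases__ tuple. The following sketch uses only built-ins and should behave the same in 2.X and 3.X apart from display formats:

```python
print(issubclass(type, object))       # True: type inherits from object
print(issubclass(object, type))       # False: object does not inherit from type
print(isinstance(object, type))       # True: but object is an instance of type
print(isinstance(type, type))         # True: type is an instance of itself
print(type.__bases__)                 # A one-item tuple holding object
print(object.__bases__)               # (): object is the root of every tree
```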

Implications for defaults

The preceding may seem obscure, but this model has a number of practical implications. For one thing, it means that we sometimes must be aware of the method defaults that come with the explicit or implicit object root class in new-style classes only:

c:\code> py -2

>>> dir(object)

['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__',

 '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',

 '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

>>> class C: pass

>>> C.__bases__                       # Classic classes do not inherit from object

()

>>> X = C()

>>> X.__repr__

AttributeError: C instance has no attribute '__repr__'

>>> class C(object): pass             # New-style classes inherit object defaults

>>> C.__bases__

(<type 'object'>,)

>>> X = C()

>>> X.__repr__

<method-wrapper '__repr__' of C object at 0x00000000020B5978>

c:\code> py -3

>>> class C: pass                     # This means all classes get defaults in 3.X

>>> C.__bases__

(<class 'object'>,)

>>> C().__repr__

<method-wrapper '__repr__' of C object at 0x0000000002955630>

This model also makes for fewer special cases than the prior type/class distinction of classic classes, and it allows us to write code that can safely assume and use an object superclass (e.g., by assuming it as an “anchor” in some super built-in roles described ahead, and by passing it method calls to invoke default behavior). We’ll see examples of the latter later in the book; for now, let’s move on to explore the last major new-style change.
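As a small preview of passing method calls up to object to invoke default behavior, the following sketch intercepts attribute assignment and then delegates to the inherited default. The class name Traced and its changes list are hypothetical, chosen only for this illustration.

```python
class Traced(object):
    changes = []                                  # Shared, class-level log
    def __setattr__(self, name, value):
        Traced.changes.append(name)               # Custom step first
        object.__setattr__(self, name, value)     # Then run the inherited default

x = Traced()
x.a = 1
x.b = 2
print(Traced.changes)                             # ['a', 'b']
print(x.a + x.b)                                  # 3
print(object.__repr__(x).startswith('<'))         # True: default display, by explicit call
```

Calling object.__setattr__ directly avoids the recursive loop that a plain self.name = value assignment would trigger inside __setattr__.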

Diamond Inheritance Change

Our final new-style class model change is also one of its most visible: its slightly different inheritance search order for so-called diamond pattern multiple inheritance trees. This is a tree pattern in which more than one superclass leads to the same higher superclass further above (its name comes from the diamond shape of the tree if you sketch it out: a square resting on one of its corners).

The diamond pattern is a fairly advanced design concept, only occurs in multiple inheritance trees, and tends to be coded rarely in Python practice, so we won’t cover this topic in full depth. In short, though, the differing search orders were introduced briefly in the prior chapter’s multiple inheritance coverage:

For classic classes (the default in 2.X): DFLR

The inheritance search path is strictly depth first, and then left to right—Python climbs all the way to the top, hugging the left side of the tree, before it backs up and begins to look further to the right. This search order is known as DFLR for the first letters in its path’s directions.

For new-style classes (optional in 2.X and automatic in 3.X): MRO

The inheritance search path is more breadth-first in diamond cases—Python first looks in any superclasses to the right of the one just searched before ascending to the common superclass at the top. In other words, this search proceeds across by levels before moving up. This search order is called the new-style MRO for “method resolution order” (and often just MRO for short when used in contrast with the DFLR order). Despite the name, this is used for all attributes in Python, not just methods.

The new-style MRO algorithm is a bit more complex than just described—and we’ll expand on it a bit more formally later—but this is as much as many programmers need to know. Still, it has both important benefits for new-style class code, as well as program-breaking potential for existing classic class code.

For example, the new-style MRO allows lower superclasses to overload attributes of higher superclasses, regardless of the sort of multiple inheritance trees they are mixed into. Moreover, the new-style search rule avoids visiting the same superclass more than once when it is accessible from multiple subclasses. It’s arguably better than DFLR, but applies to a small subset of Python user code; as we’ll see, though, the new-style class model itself makes diamonds much more common, and the MRO more important.

At the same time, the new MRO will locate attributes differently, creating a potential incompatibility for 2.X classic classes. Let’s move on to some code to see how its differences pan out in practice.

Implications for diamond inheritance trees

To illustrate how the new-style MRO search differs, consider this simplistic incarnation of the diamond multiple inheritance pattern for classic classes. Here, D’s superclasses B and C both lead to the same common ancestor, A:

>>> class A:       attr = 1           # Classic (Python 2.X)

>>> class B(A):    pass               # B and C both lead to A

>>> class C(A):    attr = 2

>>> class D(B, C): pass               # Tries A before C

>>> x = D()

>>> x.attr                            # Searches x, D, B, A

1

The attribute x.attr here is found in superclass A, because with classic classes, the inheritance search climbs as high as it can before backing up and moving right. The full DFLR search order would visit x, D, B, A, C, and then A. For this attribute, the search stops as soon as attr is found in A, above B.

However, with new-style classes derived from a built-in like object (and all classes in 3.X), the search order is different: Python looks in C to the right of B, before trying A above B. The full MRO search order would visit x, D, B, C, and then A. For this attribute, the search stops as soon as attr is found in C:

>>> class A(object): attr = 1         # New-style ("object" not required in 3.X)

>>> class B(A):      pass

>>> class C(A):      attr = 2

>>> class D(B, C):   pass             # Tries C before A

>>> x = D()

>>> x.attr                            # Searches x, D, B, C

2

This change in the inheritance search procedure is based upon the assumption that if you mix in C lower in the tree, you probably intend to grab its attributes in preference to A’s. It also assumes that C is always intended to override A’s attributes in all contexts, which is probably true when it’s used standalone but may not be when it’s mixed into a diamond with classic classes—you might not even know that C may be mixed in like this when you code it.

Since it is most likely that the programmer meant that C should override A in this case, though, new-style classes visit C first. Otherwise, C could be essentially pointless in a diamond context for any names in A too—it could not customize A and would be used only for names unique to C.

Explicit conflict resolution

Of course, the problem with assumptions is that they assume things! If this search order deviation seems too subtle to remember, or if you want more control over the search process, you can always force the selection of an attribute from anywhere in the tree by assigning or otherwise naming the one you want at the place where the classes are mixed together. The following, for example, chooses new-style order in a classic class by resolving the choice explicitly:

>>> class A:       attr = 1           # Classic

>>> class B(A):    pass

>>> class C(A):    attr = 2

>>> class D(B, C): attr = C.attr      # <== Choose C, to the right

>>> x = D()

>>> x.attr                            # Works like new-style (all 3.X)

2

Here, a tree of classic classes is emulating the search order of new-style classes for a specific attribute: the assignment to the attribute in D picks the version in C, thereby subverting the normal inheritance search path (D.attr will be lowest in the tree). New-style classes can similarly emulate classic classes by choosing the higher version of the target attribute at the place where the classes are mixed together:

>>> class A(object): attr = 1         # New-style

>>> class B(A):      pass

>>> class C(A):      attr = 2

>>> class D(B, C):   attr = B.attr    # <== Choose A.attr, above

>>> x = D()

>>> x.attr                            # Works like classic (default 2.X)

1

If you are willing to always resolve conflicts like this, you may be able to largely ignore the search order difference and not rely on assumptions about what you meant when you coded your classes.

Naturally, attributes picked this way can also be method functions—methods are normal, assignable attributes that happen to reference callable function objects:

>>> class A:

        def meth(s): print('A.meth')

>>> class C(A):

        def meth(s): print('C.meth')

>>> class B(A):

        pass

>>> class D(B, C): pass               # Use default search order

>>> x = D()                           # Will vary per class type

>>> x.meth()                          # Defaults to classic order in 2.X

A.meth

>>> class D(B, C): meth = C.meth      # <== Pick C's method: new-style (and 3.X)

>>> x = D()

>>> x.meth()

C.meth

>>> class D(B, C): meth = B.meth      # <== Pick B's method: classic

>>> x = D()

>>> x.meth()

A.meth

Here, we select methods by explicitly assigning to names lower in the tree. We might also simply call the desired class explicitly; in practice, this pattern might be more common, especially for things like constructors:

class D(B, C):

    def meth(self):                   # Redefine lower

        ...

        C.meth(self)                  # <== Pick C's method by calling

Such selections by assignment or call at mix-in points can effectively insulate your code from this difference in class flavors. This applies only to the attributes you handle this way, of course, but explicitly resolving the conflicts ensures that your code won’t vary per Python version, at least in terms of attribute conflict selection. In other words, this can serve as a portability technique for classes that may need to be run under both the new-style and classic class models.

NOTE

Explicit is better than implicit—for method resolution too: Even without the classic/new-style class divergence, the explicit method resolution technique shown here may come in handy in multiple inheritance scenarios in general. For instance, if you want part of a superclass on the left and part of a superclass on the right, you might need to tell Python which same-named attributes to choose by using explicit assignments or calls in subclasses. We’ll revisit this notion in a “gotcha” at the end of this chapter.

Also note that diamond inheritance patterns might be more problematic in some cases than I’ve implied here (e.g., what if B and C both have required constructors that call to the constructor in A?). Since such contexts are rare in real-world Python, we’ll defer this topic until we explore the super built-in function near the end of this chapter; besides providing generic access to superclasses in single inheritance trees, super supports a cooperative mode for resolving conflicts in multiple inheritance trees by ordering method calls per the MRO—assuming this order makes sense in this context too!
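As a brief preview of the constructor issue raised here, the following sketch contrasts explicit superclass calls with cooperative super calls in a diamond. All class names and the calls and coop lists are hypothetical, and the explicit super(Class, self) form is used so that the code runs in both 2.X (new-style) and 3.X; the chapter's later super coverage is the authoritative treatment.

```python
calls = []

class A(object):
    def __init__(self): calls.append('A')
class B(A):
    def __init__(self): calls.append('B'); A.__init__(self)      # Explicit call
class C(A):
    def __init__(self): calls.append('C'); A.__init__(self)      # Explicit call
class D(B, C):
    def __init__(self):
        calls.append('D')
        B.__init__(self)
        C.__init__(self)

D()
print(calls)          # ['D', 'B', 'A', 'C', 'A']: A's constructor runs twice

coop = []

class A2(object):
    def __init__(self): coop.append('A2')
class B2(A2):
    def __init__(self): coop.append('B2'); super(B2, self).__init__()
class C2(A2):
    def __init__(self): coop.append('C2'); super(C2, self).__init__()
class D2(B2, C2):
    def __init__(self): coop.append('D2'); super(D2, self).__init__()

D2()
print(coop)           # ['D2', 'B2', 'C2', 'A2']: each runs once, in MRO order
```

Note that the super call in B2 dispatches to C2 here, a sibling rather than a superclass of B2, which is part of what makes this mode subtle.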

Scope of search order change

In sum, by default, the diamond pattern is searched differently for classic and new-style classes, and this is a non-backward-compatible change. Keep in mind, though, that this change primarily affects diamond pattern cases of multiple inheritance; new-style class inheritance works the same for most other inheritance tree structures. Further, it’s not impossible that this entire issue may be of more theoretical than practical importance—because the new-style search wasn’t significant enough to address until Python 2.2 and didn’t become standard until 3.0, it seems unlikely to impact most Python code.

Having said that, I should also note that even though you might not code diamond patterns in classes you write yourself, because the implied object superclass is above every root class in 3.X as we saw earlier, every case of multiple inheritance exhibits the diamond pattern today. That is, in new-style classes, object automatically plays the role that the class A does in the example we just considered. Hence the new-style MRO search rule not only modifies logical semantics, but is also an important performance optimization—it avoids visiting and searching the same class more than once, even the automatic object.

Just as important, we’ve also seen that the implied object superclass in the new-style model provides default methods for a variety of built-in operations, including the __str__ and __repr__ display format methods. Run a dir(object) to see which methods are provided. Without the new-style MRO search order, in multiple inheritance cases the defaults in object would always override redefinitions in user-coded classes, unless they were always made in the leftmost superclass. In other words, the new-style class model itself makes using the new-style search order more critical!
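To see why this matters, consider the following sketch, which compares a classic-style DFLR ordering, computed by a small hypothetical dflr helper, with the actual new-style MRO for a class whose second superclass redefines __repr__. The class names are illustrative only.

```python
class A(object): pass
class B(object):
    def __repr__(self): return 'B.__repr__'
class C(A, B): pass

def dflr(cls):
    """Hypothetical helper: depth-first, left-to-right class order."""
    order = [cls]
    for sup in cls.__bases__:
        order += dflr(sup)
    return order

print([c.__name__ for c in dflr(C)])      # ['C', 'A', 'object', 'B', 'object']
print([c.__name__ for c in C.__mro__])    # ['C', 'A', 'B', 'object']
print(repr(C()))                          # B.__repr__
```

Under a pure DFLR search, object and its default __repr__ would be reached through A before B is ever consulted; the MRO defers object to the end, so B's redefinition is found first.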

For a more visual example of the implied object superclass in 3.X, and other examples of diamond patterns created by it, see the ListTree class’s output in the lister.py example in the preceding chapter, as well as the classtree.py tree walker example in Chapter 29—and the next section.

More on the MRO: Method Resolution Order

To trace how new-style inheritance works by default, we can also use the new class.__mro__ attribute mentioned in the preceding chapter’s class lister examples—technically a new-style extension, but useful here to explore a change. This attribute returns a class’s MRO—the order in which inheritance searches classes in a new-style class tree. This MRO is based on the C3 superclass linearization algorithm initially developed in the Dylan programming language, but later adopted by other languages including Python 2.3 and Perl 6.

The MRO algorithm

This book avoids a full description of the MRO algorithm deliberately, because many Python programmers don’t need to care (this only impacts diamonds, which are relatively rare in real-world code); because it differs between 2.X and 3.X; and because the details of the MRO are a bit too arcane and academic for this text. As a rule, this book avoids formal algorithms and prefers to teach informally by example.

On the other hand, some readers may still have an interest in the formal theory behind new-style MRO. If this set includes you, it’s described in full detail online; search Python’s manuals and the Web for current MRO links. In short, though, the MRO essentially works like this:

1.    List all the classes that an instance inherits from using the classic class’s DFLR lookup rule, and include a class multiple times if it’s visited more than once.

2.    Scan the resulting list for duplicate classes, removing all but the last occurrence of duplicates in the list.

The resulting MRO list for a given class includes the class, its superclasses, and all higher superclasses up to the object root class at the top of the tree. It’s ordered such that each class appears before its parents, and multiple parents retain the order in which they appear in the __bases__ superclass tuple.

Crucially, though, because common parents in diamonds appear only at the position of their last visitation, lower classes are searched first when the MRO list is later used by attribute inheritance. Moreover, each class is included and thus visited just once, no matter how many classes lead to it.
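The two informal steps above can be sketched in a few lines of code. The dflr and simple_mro functions here are hypothetical helpers written for this illustration; they implement only the simplified description given in this section, not the full C3 algorithm that Python actually runs, which applies additional consistency checks and may order or reject more complex trees differently.

```python
def dflr(cls):
    """Step 1: depth-first, left-to-right list, duplicates included."""
    path = [cls]
    for sup in cls.__bases__:
        path += dflr(sup)
    return path

def simple_mro(cls):
    """Step 2: keep only the last occurrence of each class."""
    path = dflr(cls)
    return [c for (i, c) in enumerate(path) if c not in path[i + 1:]]

class A(object): pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print([c.__name__ for c in dflr(D)])
# ['D', 'B', 'A', 'object', 'C', 'A', 'object']
print([c.__name__ for c in simple_mro(D)])
# ['D', 'B', 'C', 'A', 'object']
print(simple_mro(D) == list(D.__mro__))      # True for this simple diamond
```

For this diamond, removing the earlier duplicates of A and object pushes both behind C, which yields the same order as Python's own __mro__ for D.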

We’ll see applications of this algorithm later in this chapter, including that in super—a built-in that elevates the MRO to required reading if you wish to fully understand how methods are dispatched by this call, should you choose to use it. As we’ll see, despite its name, this call invokes the next class on the MRO, which might not be a superclass at all.

Tracing the MRO

If you just want to see how Python’s new-style inheritance orders superclasses in general, though, new-style classes (and hence all classes in 3.X) have a class.__mro__ attribute, which is a tuple giving the linear search order Python uses to look up attributes in superclasses. Really, this attribute is the inheritance order in new-style classes, and is often as much MRO detail as many Python users need.

Here are some illustrative examples, run in 3.X; for diamond inheritance patterns only, the search is the new order we’ve been studying—across before up, per the MRO for new-style classes always used in 3.X, and available as an option in 2.X:

>>> class A: pass

>>> class B(A): pass         # Diamonds: order differs for new-style

>>> class C(A): pass         # Breadth-first across lower levels

>>> class D(B, C): pass

>>> D.__mro__

(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>,

<class '__main__.A'>, <class 'object'>)

For nondiamonds, though, the search is still as it has always been (albeit with an extra object root)—to the top, and then to the right (a.k.a. DFLR, depth first and left to right, the model used for all classic classes in 2.X):

>>> class A: pass

>>> class B(A): pass         # Nondiamonds: order same as classic

>>> class C: pass            # Depth first, then left to right

>>> class D(B, C): pass

>>> D.__mro__

(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.A'>,

<class '__main__.C'>, <class 'object'>)

The MRO of the following tree, for example, is the same as the earlier diamond, per DFLR:

>>> class A: pass

>>> class B: pass            # Another nondiamond: DFLR

>>> class C(A): pass

>>> class D(B, C): pass

>>> D.__mro__

(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>,

<class '__main__.A'>, <class 'object'>)

Notice how the implied object superclass always shows up at the end of the MRO; as we’ve seen, it’s added automatically above root (topmost) classes in new-style class trees in 3.X (and optionally in 2.X):

>>> A.__bases__              # Superclass links: object at two roots

(<class 'object'>,)

>>> B.__bases__

(<class 'object'>,)

>>> C.__bases__

(<class '__main__.A'>,)

>>> D.__bases__

(<class '__main__.B'>, <class '__main__.C'>)

Technically, the implied object superclass always creates a diamond in multiple inheritance even if your classes do not—your classes are searched as before, but the new-style MRO ensures that object is visited last, so your classes can override its defaults:

>>> class X: pass

>>> class Y: pass

>>> class A(X): pass         # Nondiamond: depth first then left to right

>>> class B(Y): pass         # Though implied "object" always forms a diamond

>>> class D(A, B): pass

>>> D.mro()

[<class '__main__.D'>, <class '__main__.A'>, <class '__main__.X'>,

<class '__main__.B'>, <class '__main__.Y'>, <class 'object'>]

>>> X.__bases__, Y.__bases__

((<class 'object'>,), (<class 'object'>,))

>>> A.__bases__, B.__bases__

((<class '__main__.X'>,), (<class '__main__.Y'>,))

The class.__mro__ attribute is available only on new-style classes; it’s not present in 2.X unless classes derive from object. Strictly speaking, new-style classes also have a class.mro() method, used in the prior example for variety; it’s called when the class itself is created, and its return value is a list used to initialize the class’s __mro__ attribute (the method is available for customization in metaclasses, described later). You can also select MRO names if classes’ object displays are too detailed, though this book usually shows the objects to remind you of their true form:

>>> D.mro() == list(D.__mro__)

True

>>> [cls.__name__ for cls in D.__mro__]

['D', 'A', 'X', 'B', 'Y', 'object']

However you access or display them, class MRO paths might be useful to resolve confusion, and in tools that must imitate Python’s inheritance search order. The next section shows the latter role in action.

Example: Mapping Attributes to Inheritance Sources

As a prime MRO use case, we noted at the end of the prior chapter that class tree climbers—such as the class tree lister mix-in we wrote there—might benefit from the MRO. As coded, the tree lister gave the physical locations of attributes in a class tree. However, by mapping the list of inherited attributes in a dir result to the linear MRO sequence (or DFLR order for classic classes), such tools can more directly associate attributes with the classes from which they are inherited—also a useful relationship for programmers.

We won’t recode our tree lister here, but as a first major step, the following file, mapattrs.py, implements tools that can be used to associate attributes with their inheritance source; as an added bonus, its mapattrs function demonstrates how inheritance actually searches for attributes in class tree objects, though the new-style MRO is largely automated for us:

"""

File mapattrs.py (3.X + 2.X)

Main tool: mapattrs() maps all attributes on or inherited by an

instance to the instance or class from which they are inherited.

Assumes dir() gives all attributes of an instance.  To simulate

inheritance, uses either the class's MRO tuple, which gives the

search order for new-style classes (and all in 3.X), or a recursive

traversal to infer the DFLR order of classic classes in 2.X.

Also here: inheritance() gives version-neutral class ordering;

assorted dictionary tools using 3.X/2.7 comprehensions.

"""

import pprint

def trace(X, label='', end='\n'):

    print(label + pprint.pformat(X) + end)  # Print nicely

def filterdictvals(D, V):

    """

    dict D with entries for value V removed.

    filterdictvals(dict(a=1, b=2, c=1), 1) => {'b': 2}

    """

    return {K: V2 for (K, V2) in D.items() if V2 != V}

def invertdict(D):

    """

    dict D with values changed to keys (grouped by values).

    Values must all be hashable to work as dict/set keys.

    invertdict(dict(a=1, b=2, c=1)) => {1: ['a', 'c'], 2: ['b']}

    """

    def keysof(V):

        return sorted(K for K in D.keys() if D[K] == V)

    return {V: keysof(V) for V in set(D.values())}

def dflr(cls):

    """

    Classic depth-first left-to-right order of class tree at cls.

    Cycles not possible: Python disallows on __bases__ changes.

    """

    here = [cls]

    for sup in cls.__bases__:

        here += dflr(sup)

    return here

def inheritance(instance):

    """

    Inheritance order sequence: new-style (MRO) or classic (DFLR)

    """

    if hasattr(instance.__class__, '__mro__'):

        return (instance,) + instance.__class__.__mro__

    else:

        return [instance] + dflr(instance.__class__)

def mapattrs(instance, withobject=False, bysource=False):

    """

    dict with keys giving all inherited attributes of instance,

    with values giving the object that each is inherited from.

    withobject: False=remove object built-in class attributes.

    bysource:   True=group result by objects instead of attributes.

    Supports classes with slots that preclude __dict__ in instances.

    """

    attr2obj = {}

    inherits = inheritance(instance)

    for attr in dir(instance):

        for obj in inherits:

            if hasattr(obj, '__dict__') and attr in obj.__dict__:      # See slots

                attr2obj[attr] = obj

                break

    if not withobject:

        attr2obj = filterdictvals(attr2obj, object)

    return attr2obj if not bysource else invertdict(attr2obj)

if __name__ == '__main__':

    print('Classic classes in 2.X, new-style in 3.X')

    class A:         attr1 = 1

    class B(A):      attr2 = 2

    class C(A):      attr1 = 3

    class D(B, C):   pass

    I = D()

    print('Py=>%s' % I.attr1)                        # Python's search == ours?

    trace(inheritance(I),             'INH\n')       # [Inheritance order]

    trace(mapattrs(I),                'ATTRS\n')     # Attrs  => Source

    trace(mapattrs(I, bysource=True), 'OBJS\n')      # Source => [Attrs]

    print('New-style classes in 2.X and 3.X')

    class A(object): attr1 = 1                       # "(object)" optional in 3.X

    class B(A):      attr2 = 2

    class C(A):      attr1 = 3

    class D(B, C):   pass

    I = D()

    print('Py=>%s' % I.attr1)

    trace(inheritance(I),             'INH\n')

    trace(mapattrs(I),                'ATTRS\n')

    trace(mapattrs(I, bysource=True), 'OBJS\n')

This file assumes dir gives all an instance’s attributes. It maps each attribute in a dir result to its source by scanning either the MRO order for new-style classes, or the DFLR order for classic classes, searching each object’s namespace __dict__ along the way. For classic classes, the DFLR order is computed with a simple recursive scan. The net effect is to simulate Python’s inheritance search in both class models.
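To see the core idea in isolation, here is a minimal sketch for 3.X only, where all classes are new-style. It reuses the file's dflr logic on a diamond, and adds a helper named first_definer, which is a hypothetical name of my own and not part of mapattrs.py. The helper scans an ordering and returns the first class whose own namespace holds a name, which is roughly what inheritance does:

```python
def dflr(cls):
    """Depth-first, left-to-right order, as in mapattrs.py."""
    here = [cls]
    for sup in cls.__bases__:
        here += dflr(sup)
    return here

def first_definer(order, name):
    """Hypothetical helper: first class in order whose own dict has name."""
    for cls in order:
        if name in vars(cls):            # vars(cls) is cls.__dict__
            return cls
    return None

class A:       attr1 = 1
class B(A):    attr2 = 2
class C(A):    attr1 = 3
class D(B, C): pass

classic  = first_definer(dflr(D), 'attr1')     # Simulated classic search
newstyle = first_definer(D.__mro__, 'attr1')   # Actual new-style search
print(classic.__name__, newstyle.__name__)     # A C
```

The DFLR scan reaches A through B before it ever visits C, while the MRO visits C before A. That single difference accounts for the two attr1 results.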

This file’s self-test code applies its tools to the diamond multiple-inheritance trees we saw earlier. It uses Python’s pprint library module to display lists and dictionaries nicely—pprint.pprint is its basic call, and its pformat returns a print string. Run this on Python 2.7 to see both classic DFLR and new-style MRO search orders; on Python 3.3, the object derivation is unnecessary, and both tests give the same, new-style results. Importantly, attr1, whose value is labeled with “Py=>” and whose name appears in the results lists, is inherited from class A in classic search, but from class C in new-style search:

c:\code> py -2 mapattrs.py

Classic classes in 2.X, new-style in 3.X

Py=>1

INH

[<__main__.D instance at 0x000000000225A688>,

 <class __main__.D at 0x0000000002248828>,

 <class __main__.B at 0x0000000002248768>,

 <class __main__.A at 0x0000000002248708>,

 <class __main__.C at 0x00000000022487C8>,

 <class __main__.A at 0x0000000002248708>]

ATTRS

{'__doc__': <class __main__.D at 0x0000000002248828>,

 '__module__': <class __main__.D at 0x0000000002248828>,

 'attr1': <class __main__.A at 0x0000000002248708>,

 'attr2': <class __main__.B at 0x0000000002248768>}

OBJS

{<class __main__.A at 0x0000000002248708>: ['attr1'],

 <class __main__.B at 0x0000000002248768>: ['attr2'],

 <class __main__.D at 0x0000000002248828>: ['__doc__', '__module__']}

New-style classes in 2.X and 3.X

Py=>3

INH

(<__main__.D object at 0x0000000002257B38>,

 <class '__main__.D'>,

 <class '__main__.B'>,

 <class '__main__.C'>,

 <class '__main__.A'>,

 <type 'object'>)

ATTRS

{'__dict__': <class '__main__.A'>,

 '__doc__': <class '__main__.D'>,

 '__module__': <class '__main__.D'>,

 '__weakref__': <class '__main__.A'>,

 'attr1': <class '__main__.C'>,

 'attr2': <class '__main__.B'>}

OBJS

{<class '__main__.A'>: ['__dict__', '__weakref__'],

 <class '__main__.B'>: ['attr2'],

 <class '__main__.C'>: ['attr1'],

 <class '__main__.D'>: ['__doc__', '__module__']}

As a larger application of these tools, the following is our inheritance simulator at work in 3.3 on the preceding chapter’s testmixin0.py file’s test classes (I’ve deleted some built-in names here for space; as usual, run live for the whole list). Notice how __X pseudoprivate names are mapped to their defining classes, and how ListInstance appears in the MRO before object, which has a __str__ that would otherwise be chosen first—as you’ll recall, mixing this method in was the whole point of the lister classes!

c:\code> py -3

>>> from mapattrs import trace, dflr, inheritance, mapattrs

>>> from testmixin0 import Sub

>>> I = Sub()                      # Sub inherits from Super and ListInstance roots

>>> trace(dflr(I.__class__))       # 2.X search order: implied object before lister!

[<class 'testmixin0.Sub'>,

 <class 'testmixin0.Super'>,

 <class 'object'>,

 <class 'listinstance.ListInstance'>,

 <class 'object'>]

>>> trace(inheritance(I))          # 3.X (+ 2.X newstyle) search order: lister first

(<testmixin0.Sub object at 0x0000000002974630>,

 <class 'testmixin0.Sub'>,

 <class 'testmixin0.Super'>,

 <class 'listinstance.ListInstance'>,

 <class 'object'>)

>>> trace(mapattrs(I))

{'_ListInstance__attrnames': <class 'listinstance.ListInstance'>,

 '__init__': <class 'testmixin0.Sub'>,

 '__str__': <class 'listinstance.ListInstance'>,

 ...etc...

 'data1': <testmixin0.Sub object at 0x0000000002974630>,

 'data2': <testmixin0.Sub object at 0x0000000002974630>,

 'data3': <testmixin0.Sub object at 0x0000000002974630>,

 'ham': <class 'testmixin0.Super'>,

 'spam': <class 'testmixin0.Sub'>}

>>> trace(mapattrs(I, bysource=True))

{<testmixin0.Sub object at 0x0000000002974630>: ['data1', 'data2', 'data3'],

 <class 'listinstance.ListInstance'>: ['_ListInstance__attrnames', '__str__'],

 <class 'testmixin0.Super'>: ['__dict__', '__weakref__', 'ham'],

 <class 'testmixin0.Sub'>: ['__doc__',

                            '__init__',

                            '__module__',

                            '__qualname__',

                            'spam']}

>>> trace(mapattrs(I, withobject=True))

{'_ListInstance__attrnames': <class 'listinstance.ListInstance'>,

 '__class__': <class 'object'>,

 '__delattr__': <class 'object'>,

 ...etc...

Here’s the bit you might run if you want to label class objects with names inherited by an instance, though you may want to filter out some built-in double-underscore names for the sake of users’ eyesight!

>>> amap = mapattrs(I, withobject=True, bysource=True)

>>> trace(amap)

{<testmixin0.Sub object at 0x0000000002974630>: ['data1', 'data2', 'data3'],

 <class 'listinstance.ListInstance'>: ['_ListInstance__attrnames', '__str__'],

 <class 'testmixin0.Super'>: ['__dict__', '__weakref__', 'ham'],

 <class 'testmixin0.Sub'>: ['__doc__',

                            '__init__',

                            '__module__',

                            '__qualname__',

                            'spam'],

 <class 'object'>: ['__class__',

                    '__delattr__',

                    ...etc...

                    '__sizeof__',

                    '__subclasshook__']}

Finally, and as both a follow-up to the prior chapter’s ruminations and segue to the next section here, the following shows how this scheme works for class-based slots attributes too. Because a class’s __dict__ includes both normal class attributes and individual entries for the instance attributes defined by its __slots__ list, the slots attributes inherited by an instance will be correctly associated with the implementing class from which they are acquired, even though they are not physically stored in the instance’s __dict__ itself:

# mapattrs-slots.py: test __slots__ attribute inheritance

from mapattrs import mapattrs, trace

class A(object): __slots__ = ['a', 'b']; x = 1; y = 2

class B(A):      __slots__ = ['b', 'c']

class C(A):      x = 2

class D(B, C):

    z = 3

    def __init__(self): self.name = 'Bob'

I = D()

trace(mapattrs(I, bysource=True))     # Also: trace(mapattrs(I))

For explicitly new-style classes like those in this file, the results are the same under both 2.7 and 3.3, though 3.3 adds an extra built-in name to the set. The attribute names here reflect all those inherited by the instance from user-defined classes, even those implemented by slots defined at classes and stored in space allocated in the instance:

c:\code> py -3 mapattrs-slots.py

{<__main__.D object at 0x00000000028988E0>: ['name'],

 <class '__main__.C'>: ['x'],

 <class '__main__.D'>: ['__dict__',

                        '__doc__',

                        '__init__',

                        '__module__',

                        '__qualname__',

                        '__weakref__',

                        'z'],

 <class '__main__.A'>: ['a', 'y'],

 <class '__main__.B'>: ['__slots__', 'b', 'c']}

But we need to move ahead to understand the role of slots better—and understand why mapattrs must be careful to check to see if a __dict__ is present before fetching it!

Study this code for more insight. For the prior chapter’s tree lister, your next step might be to index the mapattrs function’s bysource=True dictionary result to obtain an object’s attributes during the tree sketch traversal, instead of (or perhaps in addition to?) its current physical __dict__ scan. You’ll probably need to use getattr on the instance to fetch attribute values, because some may be implemented as slots or other “virtual” attributes at their source classes, and fetching these at the class directly won’t return the instance’s value. If I code any more here, though, I’ll deprive readers of the remaining fun, and the next section of its subject matter.
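As a small hint on the getattr point only, the following sketch (3.X syntax, with made-up classes A and B) contrasts what a slot name yields when fetched from its class's namespace with what getattr returns for the instance. It is an illustration under these assumptions, not a recoding of the tree lister:

```python
class A(object):
    __slots__ = ['a']

class B(A):
    __slots__ = ['b']

I = B()
I.a, I.b = 1, 2

for cls in type(I).__mro__:                  # B, A, object
    for name in sorted(vars(cls)):
        if name.startswith('__'):
            continue                         # Skip built-in names
        at_class    = vars(cls)[name]        # A descriptor object, not data
        at_instance = getattr(I, name)       # The instance's actual value
        print(cls.__name__, name, type(at_class).__name__, at_instance)
```

In CPython this reports a member_descriptor at each class and the values 2 and 1 for the instance, which is why a tool that wants values must go through the instance.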

NOTE

Python’s pprint module used in this example works as shown in Pythons 3.3 and 2.7, but appears to have an issue in Pythons 3.2 and 3.1 where it raises a wrong-number-arguments exception internally for the objects displayed here. Since I’ve already devoted too much space to covering transitory Python defects, and since this has been repaired in the versions of Python used in this edition, we’ll leave working around this in the suggested exercises column for readers running this on the infected Pythons; change trace to simple prints as needed, and mind the note on battery dependence in Chapter 1!


[64] As of this chapter’s interaction listings, I’ve started omitting some blank lines and shortening some hex addresses to 32 bits in object displays, to reduce size and clutter. I’m going to assume that by this point in the book, you’ll find such small details irrelevant.

New-Style Class Extensions

Beyond the changes described in the prior section (some of which, frankly, may seem too academic and obscure to matter to many readers of this book), new-style classes provide a handful of more advanced class tools that have more direct and practical application—slots, properties, descriptors, and more. The following sections provide an overview of each of these additional features, available for new-style classes in Python 2.X and all classes in Python 3.X. Also in this extensions category are the __mro__ attribute and the super call, both covered elsewhere—the former in the previous section to explore a change, and the latter postponed until chapter end to serve as a larger case study.

Slots: Attribute Declarations

By assigning a sequence of string attribute names to a special __slots__ class attribute, we can enable a new-style class to both limit the set of legal attributes that instances of the class will have, and optimize memory usage and possibly program speed. As we’ll find, though, slots should be used only in applications that clearly warrant the added complexity. They will complicate your code, may complicate or break code you may use, and require universal deployment to be effective.

Slot basics

To use slots, assign a sequence of string names to the special __slots__ variable and attribute at the top level of a class statement: only those names in the __slots__ list can be assigned as instance attributes. However, like all names in Python, instance attribute names must still be assigned before they can be referenced, even if they’re listed in __slots__:

>>> class limiter(object):

        __slots__ = ['age', 'name', 'job']

>>> x = limiter()

>>> x.age                                           # Must assign before use

AttributeError: age

>>> x.age = 40                                      # Looks like instance data

>>> x.age

40

>>> x.ape = 1000                                    # Illegal: not in __slots__

AttributeError: 'limiter' object has no attribute 'ape'

This feature is envisioned as both a way to catch typo errors like this (assignments to illegal attribute names not in __slots__ are detected) as well as an optimization mechanism.

Allocating a namespace dictionary for every instance object can be expensive in terms of memory if many instances are created and only a few attributes are required. To save space, instead of allocating a dictionary for each instance, Python reserves just enough space in each instance to hold a value for each slot attribute, along with inherited attributes in the common class to manage slot access. This might additionally speed execution, though this benefit is less clear and might vary per program, platform, and Python.
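If you want to gauge the space difference on your own machine, one rough approach is sketched below. It assumes the standard library's tracemalloc module, which arrived after Python 3.3, so it won't run on the Pythons used in this edition. The class names and the bytes_for helper are hypothetical, and the numbers it prints will vary per Python and platform:

```python
import tracemalloc

class WithDict(object):
    def __init__(self):
        self.a, self.b, self.c = 1, 2, 3

class WithSlots(object):
    __slots__ = ['a', 'b', 'c']
    def __init__(self):
        self.a, self.b, self.c = 1, 2, 3

def bytes_for(cls, count=10000):
    """Approximate bytes allocated while making count instances of cls."""
    tracemalloc.start()
    before = tracemalloc.get_traced_memory()[0]    # (current, peak)[0]
    objs = [cls() for i in range(count)]           # Keep instances alive
    after = tracemalloc.get_traced_memory()[0]
    tracemalloc.stop()
    return after - before

dict_bytes = bytes_for(WithDict)
slot_bytes = bytes_for(WithSlots)
print('dict:', dict_bytes, 'slots:', slot_bytes)
```

Treat the result as a ballpark only: it includes the list that holds the instances, and it says nothing about speed.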

Slots are also something of a major break with Python’s core dynamic nature, which dictates that any name may be created by assignment. In fact, they imitate C++ for efficiency at the expense of flexibility, and even have the potential to break some programs. As we’ll see, slots also come with a plethora of special-case usage rules. Per Python’s own manual, they should not be used except in clearly warranted cases—they are difficult to use correctly, and are, to quote the manual:

best reserved for rare cases where there are large numbers of instances in a memory-critical application.

In other words, this is yet another feature that should be used only if clearly warranted. Unfortunately, slots seem to be showing up in Python code much more often than they should; their obscurity seems to be a draw in itself. As usual, knowledge is your best ally in such things, so let’s take a quick look here.

NOTE

In Python 3.3, non-slots attribute space requirements have been reduced with a key-sharing dictionary model, where the __dict__ dictionaries used for objects’ attributes may share part of their internal storage, including that of their keys. This may lessen some of the value of __slots__ as an optimization tool; per benchmark reports, this change reduces memory use by 10% to 20% for object-oriented programs, gives a small improvement in speed for programs that create many similar objects, and may be optimized further in the future. On the other hand, this won’t negate the presence of __slots__ in existing code you may need to understand!

Slots and namespace dictionaries

Potential benefits aside, slots can complicate the class model—and code that relies on it—substantially. In fact, some instances with slots may not have a __dict__ attribute namespace dictionary at all, and others will have data attributes that this dictionary does not include. To be clear: this is a major incompatibility with the traditional class model—one that can complicate any code that accesses attributes generically, and may even cause some programs to fail altogether.

For instance, programs that list or access instance attributes by name string may need to use more storage-neutral interfaces than __dict__ if slots may be used. Because an instance’s data may include class-level names such as slots—either in addition to or instead of namespace dictionary storage—both attribute sources may need to be queried for completeness.

Let’s see what this means in terms of code, and explore more about slots along the way. First off, when slots are used, instances do not normally have an attribute dictionary—instead, Python uses the class descriptors feature introduced ahead to allocate and manage space reserved for slot attributes in the instance. In Python 3.X, and in 2.X for new-style classes derived from object:

>>> class C:                             # Requires "(object)" in 2.X only

        __slots__ = ['a', 'b']           # __slots__ means no __dict__ by default

>>> X = C()

>>> X.a = 1

>>> X.a

1

>>> X.__dict__

AttributeError: 'C' object has no attribute '__dict__'

However, we can still fetch and set slot-based attributes by name string using storage-neutral tools such as getattr and setattr (which look beyond the instance __dict__ and thus include class-level names like slots) and dir (which collects all inherited names throughout a class tree):

>>> getattr(X, 'a')

1

>>> setattr(X, 'b', 2)                   # But getattr() and setattr() still work

>>> X.b

2

>>> 'a' in dir(X)                        # And dir() finds slot attributes too

True

>>> 'b' in dir(X)

True

Also keep in mind that without an attribute namespace dictionary, it’s not possible to assign new names to instances that are not names in the slots list:

>>> class D:                             # Use D(object) for same result in 2.X

        __slots__ = ['a', 'b']

        def __init__(self):

            self.d = 4                   # Cannot add new names if no __dict__

>>> X = D()

AttributeError: 'D' object has no attribute 'd'

We can still accommodate extra attributes, though, by including __dict__ explicitly in __slots__, in order to create an attribute namespace dictionary too:

>>> class D:

        __slots__ = ['a', 'b', '__dict__']    # Name __dict__ to include one too

        c = 3                                 # Class attrs work normally

        def __init__(self):

            self.d = 4                        # d stored in __dict__, a is a slot

>>> X = D()

>>> X.d

4

>>> X.c

3

>>> X.a                          # All instance attrs undefined until assigned

AttributeError: a

>>> X.a = 1

>>> X.b = 2

In this case, both storage mechanisms are used. This renders __dict__ too limited for code that wishes to treat slots as instance data, but generic tools such as getattr still allow us to process both storage forms as a single set of attributes:

>>> X.__dict__                   # Some objects have both __dict__ and slot names

{'d': 4}                         # getattr() can fetch either type of attr

>>> X.__slots__

['a', 'b', '__dict__']

>>> getattr(X, 'a'), getattr(X, 'c'), getattr(X, 'd')    # Fetches all 3 forms

(1, 3, 4)

Because dir also returns all inherited attributes, though, it might be too broad in some contexts; it also includes class-level methods, and even all object defaults. Code that wishes to list just instance attributes may in principle still need to allow for both storage forms explicitly. We might at first naively code this as follows:

>>> for attr in list(X.__dict__) + X.__slots__:          # Wrong...

        print(attr, '=>', getattr(X, attr))

Since either can be omitted, we may more correctly code this as follows, using getattr to allow for defaults—a noble but nonetheless inaccurate approach, as the next section will explain:

>>> for attr in list(getattr(X, '__dict__', [])) + getattr(X, '__slots__', []):

        print(attr, '=>', getattr(X, attr))

d => 4

a => 1                                                   # Less wrong...

b => 2

__dict__ => {'d': 4}

Multiple __slots__ lists in superclasses

The preceding code works in this specific case, but in general it’s not entirely accurate. Specifically, this code addresses only slot names in the lowest __slots__ attribute inherited by an instance, but slot lists may appear more than once in a class tree. That is, a name’s absence in the lowest __slots__ list does not preclude its existence in a higher __slots__. Because slot names become class-level attributes, instances acquire the union of all slot names anywhere in the tree, by the normal inheritance rule:

>>> class E:

        __slots__ = ['c', 'd']            # Superclass has slots

>>> class D(E):

        __slots__ = ['a', '__dict__']     # But so does its subclass

>>> X = D()

>>> X.a = 1; X.b = 2; X.c = 3             # The instance is the union (slots: a, c)

>>> X.a, X.c

(1, 3)

Inspecting just the inherited slots list won’t pick up slots defined higher in a class tree:

>>> E.__slots__                           # But slots are not concatenated

['c', 'd']

>>> D.__slots__

['a', '__dict__']

>>> X.__slots__                           # Instance inherits *lowest* __slots__

['a', '__dict__']

>>> X.__dict__                            # And has its own attr dict

{'b': 2}

>>> for attr in list(getattr(X, '__dict__', [])) + getattr(X, '__slots__', []):

        print(attr, '=>', getattr(X, attr))

b => 2                                    # Other superclass slots missed!

a => 1

__dict__ => {'b': 2}

>>> dir(X)                                # But dir() includes all slot names

[...many names omitted... 'a', 'b', 'c', 'd']

In other words, in terms of listing instance attributes generically, one __slots__ isn’t always enough—they are potentially subject to the full inheritance search procedure. See the earlier mapattrs-slots.py for another example of slots appearing in multiple superclasses. If multiple classes in a class tree have their own __slots__ attributes, generic programs must develop other policies for listing attributes—as the next section explains.
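One such policy, sketched here for 3.X under the assumption that every class in the tree is new-style, is to walk the MRO and collect each class's own __slots__ entry. The allslots name is hypothetical, and the function is an illustration rather than a complete tool:

```python
def allslots(obj):
    """Union of slot names declared anywhere in obj's class tree."""
    names = []
    for cls in type(obj).__mro__:
        slots = vars(cls).get('__slots__', ())   # This class's own list only
        if isinstance(slots, str):               # A lone string names one slot
            slots = (slots,)
        for name in slots:
            if name not in names:
                names.append(name)
    return names

class E:
    __slots__ = ['c', 'd']

class D(E):
    __slots__ = ['a', '__dict__']

X = D()
X.a = 1; X.b = 2; X.c = 3
print(allslots(X))                               # ['a', '__dict__', 'c', 'd']

for name in allslots(X):                         # Allow for unassigned slots
    print(name, '=>', getattr(X, name, '(unassigned)'))
```

Note the default passed to getattr: a slot that has been declared but never assigned, like d here, raises AttributeError on a plain fetch.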

Handling slots and other “virtual” attributes generically

At this point, you may wish to review the discussion of slots policy options at the coverage of the lister.py display mix-in classes near the end of the preceding chapter—a prime example of why generic programs may need to care about slots. Such tools that attempt to list instance data attributes generically must account for slots, and perhaps other such “virtual” instance attributes like properties and descriptors discussed ahead—names that similarly reside in classes but may provide attribute values for instances on request. Slots are the most data-centric of these, but are representative of a larger category.

Such attributes require inclusive approaches, special handling, or general avoidance—the latter of which becomes unsatisfactory as soon as any programmer uses slots in subject code. Really, class-level instance attributes like slots probably necessitate a redefinition of the term instance data—as locally stored attributes, the union of all inherited attributes, or some subset thereof.

For example, some programs might classify slot names as attributes of classes instead of instances; these attributes do not exist in instance namespace dictionaries, after all. Alternatively, as shown earlier, programs can be more inclusive by relying on dir to fetch all inherited attribute names and getattr to fetch their corresponding values for the instance—without regard to their physical location or implementation. If you must support slots as instance data, this is likely the most robust way to proceed:

>>> class Slotful:

        __slots__ = ['a', 'b', '__dict__']

        def __init__(self, data):

            self.c = data

>>> I = Slotful(3)

>>> I.a, I.b = 1, 2

>>> I.a, I.b, I.c                            # Normal attribute fetch

(1, 2, 3)

>>> I.__dict__                               # Both __dict__ and slots storage

{'c': 3}

>>> [x for x in dir(I) if not x.startswith('__')]

['a', 'b', 'c']

>>> I.__dict__['c']                          # __dict__ is only one attr source

3

>>> getattr(I, 'c'), getattr(I, 'a')         # dir+getattr is broader than __dict__

(3, 1)                                       # applies to slots, properties, descriptors

>>> for a in (x for x in dir(I) if not x.startswith('__')):

        print(a, getattr(I, a))

a 1

b 2

c 3

Under this dir/getattr model, you can still map attributes to their inheritance sources, and filter them more selectively by source or type if needed, by scanning the MRO—as we did earlier in both mapattrs.py and its application to slots in mapattrs-slots.py. As an added bonus, such tools and policies for handling slots will potentially apply automatically to properties and descriptors too, though these attributes are more explicitly computed values, and less obviously instance-related data than slots.
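The dir/getattr model can be packaged as a small function. The following is a sketch in 3.X syntax with a hypothetical instancedata name. It also handles two cases the preceding session did not hit: slots that are declared but not yet assigned, and methods, which dir reports too:

```python
def instancedata(obj):
    """Non-callable, non-dunder attributes of obj, wherever they are stored."""
    result = {}
    for name in dir(obj):
        if name.startswith('__'):
            continue
        try:
            value = getattr(obj, name)
        except AttributeError:               # E.g., a slot not yet assigned
            continue
        if not callable(value):              # Drop methods; keep data
            result[name] = value
    return result

class Slotful:
    __slots__ = ['a', 'b', '__dict__']
    def __init__(self, data):
        self.c = data
    def method(self): pass

I = Slotful(3)
I.a = 1                                      # b is left unassigned
print(instancedata(I))                       # {'a': 1, 'c': 3}
```

Whether callables and class-level data should be excluded is a policy choice; this version simply picks one.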

Also keep in mind that this is not just a tools issue. Class-based instance attributes like slots also impact the traditional coding of the __setattr__ operator overloading method we met in Chapter 30. Because slots and some other attributes are not stored in the instance __dict__, and may even imply its absence, new-style classes must instead generally run attribute assignments by routing them to the object superclass. In practice, this may make this method fundamentally different in some classic and new-style classes.
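To make the __setattr__ difference concrete, here is a brief sketch with two hypothetical classes, written for 3.X and for new-style classes in 2.X. The first uses the classic __dict__ assignment, which cannot work when slots remove the dictionary; the second routes the assignment to the object superclass:

```python
class DictStyle(object):
    __slots__ = ['age']
    def __setattr__(self, name, value):
        self.__dict__[name] = value              # Fails: no __dict__ here

class Routed(object):
    __slots__ = ['age']
    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)    # Works for slots and dicts

failed = False
try:
    DictStyle().age = 40
except AttributeError as exc:
    failed = True
    print('failed:', exc)

X = Routed()
X.age = 40
print(X.age)                                     # 40
```

The routed form also works in classes that have a normal instance dictionary, so it is the safer general coding.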

Slot usage rules

Slot declarations can appear in multiple classes in a class tree, but when they do they are subject to a number of constraints that are somewhat difficult to rationalize unless you understand the implementation of slots as class-level descriptors for each slot name that are inherited by the instances where the managed space is reserved (descriptors are an advanced tool we’ll study in detail in the last part of this book):

§  Slots in subs are pointless when absent in supers: If a subclass inherits from a superclass without a __slots__, the instance __dict__ attribute created for the superclass will always be accessible, making a __slots__ in the subclass largely pointless. The subclass still manages its slots, but doesn’t compute their values in any way, and doesn’t avoid a dictionary—the main reason to use slots.

§  Slots in supers are pointless when absent in subs: Similarly, because the meaning of a __slots__ declaration is limited to the class in which it appears, subclasses will produce an instance __dict__ if they do not define a __slots__, rendering a __slots__ in a superclass largely pointless.

§  Redefinition renders super slots pointless: If a class defines the same slot name as a superclass, its redefinition hides the slot in the superclass per normal inheritance. You can access the version of the name defined by the superclass slot only by fetching its descriptor directly from the superclass.

§  Slots prevent class-level defaults: Because slots are implemented as class-level descriptors (along with per-instance space), you cannot use class attributes of the same name to provide defaults as you can for normal instance attributes: assigning the same name in the class overwrites the slot descriptor.

§  Slots and __dict__: As shown earlier, __slots__ preclude both an instance __dict__ and assigning names not listed, unless __dict__ is listed explicitly too.

We’ve already seen the last of these in action, and the earlier mapattrs-slots.py illustrates the third. It’s easy to demonstrate how the new rules here translate to actual code—most crucially, a namespace dictionary is created when any class in a tree omits slots, thereby negating the memory optimization benefit:

>>> class C: pass                        # Bullet 1: slots in sub but not super

>>> class D(C): __slots__ = ['a']        # Makes instance dict for nonslots

>>> X = D()                              # But slot name still managed in class

>>> X.a = 1; X.b = 2

>>> X.__dict__

{'b': 2}

>>> D.__dict__.keys()

dict_keys([... 'a', '__slots__', ...])

>>> class C: __slots__ = ['a']           # Bullet 2: slots in super but not sub

>>> class D(C): pass                     # Makes instance dict for nonslots

>>> X = D()                              # But slot name still managed in class

>>> X.a = 1; X.b = 2

>>> X.__dict__

{'b': 2}

>>> C.__dict__.keys()

dict_keys([... 'a', '__slots__', ...])

>>> class C: __slots__ = ['a']           # Bullet 3: only lowest slot accessible

>>> class D(C): __slots__ = ['a']

>>> class C: __slots__ = ['a']; a = 99  # Bullet 4: no class-level defaults

ValueError: 'a' in __slots__ conflicts with class variable
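The third bullet's listing defines the classes but doesn't show the hidden slot being reached. As a sketch of what the rule implies in CPython, the superclass's descriptor can be fetched from the superclass and called by hand with its __get__ and __set__ methods:

```python
class C(object):
    __slots__ = ['a']

class D(C):
    __slots__ = ['a']            # Hides C's slot of the same name

X = D()
X.a = 1                          # Normal access uses D's descriptor
C.a.__set__(X, 2)                # C's slot: reachable only via its descriptor

print(X.a)                       # 1
print(C.a.__get__(X, D))         # 2
```

Each instance therefore carries space for both slots, though only one is reachable by ordinary attribute syntax.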

In other words, besides their program-breaking potential, slots essentially require both universal and careful deployment to be effective—because slots do not compute values dynamically like properties (coming up in the next section), they are largely pointless unless each class in a tree uses them and is cautious to define only new slot names not defined by other classes. It’s an all-or-nothing feature—an unfortunate property shared by the super call discussed ahead:

>>> class C: __slots__ = ['a']           # Assumes universal use, differing names

>>> class D(C): __slots__ = ['b']

>>> X = D()

>>> X.a = 1; X.b = 2

>>> X.__dict__

AttributeError: 'D' object has no attribute '__dict__'

>>> C.__dict__.keys(), D.__dict__.keys()

(dict_keys([... 'a', '__slots__', ...]), dict_keys([... 'b', '__slots__', ...]))

Such rules—among others regarding weak references omitted here for space—are part of the reason slots are not generally recommended, except in pathological cases where their space reduction is significant. Even then, their potential to complicate or break code should be ample cause to carefully consider the tradeoffs. Not only must they be spread almost neurotically throughout a framework, they may also break tools you rely on.
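For the curious, the weak reference rule mentioned in passing can be sketched in a few lines. The class names are hypothetical. Instances of a class with slots cannot be weakly referenced unless __weakref__ is listed among the slot names, much as __dict__ must be listed to get a namespace dictionary:

```python
import weakref

class NoWeak(object):
    __slots__ = ['a']

class Weak(object):
    __slots__ = ['a', '__weakref__']     # Name __weakref__ to allow weak refs

blocked = False
try:
    weakref.ref(NoWeak())
except TypeError as exc:
    blocked = True
    print('TypeError:', exc)

obj = Weak()
ref = weakref.ref(obj)
print(ref() is obj)                      # True
```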

Example impacts of slots: ListTree and mapattrs

As a more realistic example of slots’ effects, due to the first bullet in the prior section, Chapter 31’s ListTree class does not fail when mixed in to a class that defines __slots__, even though it scans instance namespace dictionaries. The lister class’s own lack of slots is enough to ensure that the instance will still have a __dict__, and hence not trigger an exception when fetched or indexed. For example, both of the following display without error—the second also allows names not in the slots list to be assigned as instance attributes, including any required by the superclass:

class C(ListTree): pass

X = C()                                        # OK: no __slots__ used

print(X)

class C(ListTree): __slots__ = ['a', 'b']      # OK: superclass produces __dict__

X = C()

X.c = 3

print(X)                                       # Displays c at X, a and b at C

The following classes display correctly as well—any nonslot class like ListTree generates an instance __dict__, and can thus safely assume its presence:

class A: __slots__ = ['a']                     # Both OK by bullet 1 above

class B(A, ListTree): pass

class A: __slots__ = ['a']

class B(A, ListTree): __slots__ = ['b']        # Displays b at B, a at A

Although it renders subclass slots pointless, this is a positive side effect for tools classes like ListTree (and its Chapter 28 predecessor). In general, though, some tools might need to catch exceptions when __dict__ is absent or use a hasattr or getattr to test or provide defaults if slot usage may preclude a namespace dictionary in instance objects inspected.

For example, you should now be able to understand why the mapattrs.py program earlier in this chapter must check for the presence of a __dict__ before fetching it—instance objects created from classes with __slots__ won’t have one. In fact, if we use the highlighted alternative line in the following, the mapattrs function fails with an exception when attempting to look for an attribute name in the instance at the front of the inheritance path sequence:

def mapattrs(instance, withobject=False, bysource=False):

    for attr in dir(instance):

        for obj in inherits:

            if attr in obj.__dict__:           # May fail if __slots__ used

>>> class C: __slots__ = ['a']

>>> X = C()

>>> mapattrs(X)

AttributeError: 'C' object has no attribute '__dict__'

Either of the following works around the issue, and allows the tool to support slots—the first provides a default, and the second is more verbose but seems marginally more explicit in its intent:

            if attr in getattr(obj, '__dict__', {}):

            if hasattr(obj, '__dict__') and attr in obj.__dict__:

As mentioned earlier, some tools may benefit from mapping dir results to objects in the MRO this way, instead of scanning an instance __dict__ in general—without this more inclusive approach, attributes implemented by class-level tools like slots won’t be reported as instance data. Even so, this doesn’t necessarily excuse such tools from allowing for a missing __dict__ in the instance too!
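The following is a simplified sketch of that idea, not the book's mapattrs.py itself: a hypothetical attr_sources function maps each dir result to the first object on the search path whose namespace dictionary holds it, using the getattr default shown above so that slot-only instances are tolerated. The classes C and D are assumptions made for the demo, and Python 3.X is assumed:

```python
def attr_sources(instance):
    """Map each dir() name to the first object in the search path defining it."""
    path = [instance] + list(type(instance).__mro__)
    sources = {}
    for attr in dir(instance):
        for obj in path:
            if attr in getattr(obj, '__dict__', {}):   # Tolerates a missing __dict__
                sources[attr] = obj
                break
    return sources

class C:
    __slots__ = ['a']                  # Instances of C have no __dict__

class D:
    pass                               # Instances of D have a __dict__

x = C(); x.a = 1
y = D(); y.b = 2
print(attr_sources(x)['a'] is C)       # True: slot a is a class-level descriptor
print(attr_sources(y)['b'] is y)       # True: b lives in the instance __dict__
```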

What about slots speed?

Finally, while slots primarily optimize memory use, their speed impact is less clear-cut. Here’s a simple test script using the timeit techniques we studied in Chapter 21. For both the slots and nonslots (instance dictionary) storage models, it makes 1,000 instances, assigns and fetches 4 attributes on each, and repeats 1,000 times—for both models taking the best of 3 runs that each exercise a total of 8M attribute operations:

# File slots-test.py

from __future__ import print_function

import timeit

base = """

Is = []

for i in range(1000):

    X = C()

    X.a = 1; X.b = 2; X.c = 3; X.d = 4

    t = X.a + X.b + X.c + X.d

    Is.append(X)

"""

stmt = """

class C:

    __slots__ = ['a', 'b', 'c', 'd']

""" + base

print('Slots   =>', end=' ')

print(min(timeit.repeat(stmt, number=1000, repeat=3)))

stmt = """

class C:

    pass

""" + base

print('Nonslots=>', end=' ')

print(min(timeit.repeat(stmt, number=1000, repeat=3)))

At least on this code, on my laptop, and in my installed versions (Python 3.3 and 2.7), the best times imply that slots are slightly quicker in 3.X and a wash in 2.X, though this says little about memory space, and is prone to change arbitrarily in the future:

c:\code> py -3 slots-test.py

Slots   => 0.7780903942045899

Nonslots=> 0.9888108080898417

c:\code> py -2 slots-test.py

Slots   => 0.80868754371

Nonslots=> 0.802224740747

For more on slots in general, see the Python standard manual set. Also watch for the Private decorator case study of Chapter 39, an example that naturally allows for attributes based on both __slots__ and __dict__ storage, by using delegation and storage-neutral accessor tools like getattr.
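Because the timing script measures speed only, a rough space comparison may be a useful complement. The sketch below is an assumption-laden illustration rather than a benchmark: it relies on the standard library's tracemalloc module, which is available only in Python 3.4 and later (so not in the 3.3 and 2.7 installs timed above), and the helper name bytes_for is hypothetical. Results vary by Python version and platform, so no expected output is shown:

```python
import tracemalloc

class Slotted:
    __slots__ = ['a', 'b', 'c', 'd']

class Plain:
    pass

def bytes_for(cls, count=10000):
    """Return bytes still allocated after building count instances of cls."""
    tracemalloc.start()
    objs = []
    for i in range(count):
        X = cls()
        X.a = 1; X.b = 2; X.c = 3; X.d = 4
        objs.append(X)                 # Keep instances alive while measuring
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current

print('Slots   =>', bytes_for(Slotted))
print('Nonslots=>', bytes_for(Plain))
```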

Properties: Attribute Accessors

Our next new-style extension is properties—a mechanism that provides another way for new-style classes to define methods called automatically for access or assignment to instance attributes. This feature is similar to properties (a.k.a. “getters” and “setters”) in languages like Java and C#, but in Python is generally best used sparingly, as a way to add accessors to attributes after the fact as needs evolve and warrant. Where needed, though, properties allow attribute values to be computed dynamically without requiring method calls at the point of access.

Though properties cannot support generic attribute routing goals, at least for specific attributes they are an alternative to some traditional uses of the __getattr__ and __setattr__ overloading methods we first studied in Chapter 30. Properties have a similar effect to these two methods, but by contrast incur an extra method call only for accesses to names that require dynamic computation—other nonproperty names are accessed normally with no extra calls. Although __getattr__ is invoked only for undefined names, the __setattr__ method is instead called for assignment to every attribute.

Properties and slots are related too, but serve different goals. Both implement instance attributes that are not physically stored in instance namespace dictionaries—a sort of “virtual” attribute—and both are based on the notion of class-level attribute descriptors. In contrast, slots manage instance storage, while properties intercept access and compute values arbitrarily. Because their underlying descriptor implementation tool is too advanced for us to cover here, properties and descriptors both get full treatment in Chapter 38.
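As a small sketch of this shared basis (the class name C and its attributes are assumptions for illustration, and Python 3.X is assumed), both a slot and a property appear in the class's namespace as objects with a __get__ method, and neither is stored in an instance dictionary:

```python
class C(object):
    __slots__ = ['a']                          # Slot: manages instance storage
    b = property(lambda self: 40)              # Property: computes on fetch

x = C()
x.a = 1
print(hasattr(x, '__dict__'))                  # False: no instance dictionary at all
print(hasattr(C.__dict__['a'], '__get__'))     # True: slot is a class-level descriptor
print(hasattr(C.__dict__['b'], '__get__'))     # True: so is the property
print(C.__dict__['a'].__get__(x, C))           # 1: what x.a runs implicitly
```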

Property basics

As a brief introduction, though, a property is a type of object assigned to a class attribute name. You generate a property by calling the property built-in function, passing in up to three accessor methods—handlers for get, set, and delete operations—as well as an optional docstring for the property. If any argument is passed as None or omitted, that operation is not supported.

The resulting property object is typically assigned to a name at the top level of a class statement (e.g., name=property()), and a special @ syntax we’ll meet later is available to automate this step. When thus assigned, later accesses to the class property name itself as an object attribute (e.g., obj.name) are automatically routed to one of the accessor methods passed into the property call.
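To make the call signature concrete, here is a minimal sketch that passes all four arguments, plus a second property with only a get handler. The class Person and its accessor names are hypothetical, chosen just for this illustration:

```python
class Person(object):
    def __init__(self):
        self._name = 'Bob'

    def getname(self):
        return self._name

    def setname(self, value):
        self._name = value

    def delname(self):
        del self._name

    # (get, set, del, docs); pass None or omit to leave an operation unsupported
    name = property(getname, setname, delname, 'name property docs')
    shout = property(lambda self: self._name.upper())   # Fetch only

p = Person()
print(p.name)              # Runs getname: Bob
p.name = 'Sue'             # Runs setname
print(p.shout)             # SUE
del p.name                 # Runs delname
try:
    p.shout = 'x'          # No set handler was passed
except AttributeError:
    print('shout cannot be assigned')
```

Because shout was given no set handler, assigning to it raises an AttributeError instead of storing a value on the instance.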

For example, we’ve seen how the __getattr__ operator overloading method allows classes to intercept undefined attribute references in both classic and new-style classes:

>>> class operators:

        def __getattr__(self, name):

            if name == 'age':

                return 40

            else:

                raise AttributeError(name)

>>> x = operators()

>>> x.age                                         # Runs __getattr__

40

>>> x.name                                        # Runs __getattr__

AttributeError: name

Here is the same example, coded with properties instead; note that properties are available for all classes but require the new-style object derivation in 2.X to work properly for intercepting attribute assignments (and won’t complain if you forget this—but will silently overwrite your property with the new data!):

>>> class properties(object):                     # Need object in 2.X for setters

        def getage(self):

            return 40

        age = property(getage, None, None, None)  # (get, set, del, docs), or use @

>>> x = properties()

>>> x.age                                         # Runs getage

40

>>> x.name                                        # Normal fetch

AttributeError: 'properties' object has no attribute 'name'

For some coding tasks, properties can be less complex and quicker to run than the traditional techniques. For example, when we add attribute assignment support, properties become more attractive—there’s less code to type, and no extra method calls are incurred for assignments to attributes we don’t wish to compute dynamically:

>>> class properties(object):                     # Need object in 2.X for setters

        def getage(self):

            return 40

        def setage(self, value):

            print('set age: %s' % value)

            self._age = value

        age = property(getage, setage, None, None)

>>> x = properties()

>>> x.age                                         # Runs getage

40

>>> x.age = 42                                    # Runs setage

set age: 42

>>> x._age                                        # Normal fetch:  no getage call

42

>>> x.age                                         # Runs getage

40

>>> x.job = 'trainer'                             # Normal assign: no setage call

>>> x.job                                         # Normal fetch:  no getage call

'trainer'

The equivalent class based on operator overloading incurs extra method calls for assignments to attributes not being managed and needs to route attribute assignments through the attribute dictionary to avoid loops (or, for new-style classes, to the object superclass’s __setattr__ to better support “virtual” attributes such as slots and properties coded in other classes):

>>> class operators:

        def __getattr__(self, name):              # On undefined reference

            if name == 'age':

                return 40

            else:

                raise AttributeError(name)

        def __setattr__(self, name, value):       # On all assignments

            print('set: %s %s' % (name, value))

            if name == 'age':

                self.__dict__['_age'] = value     # Or object.__setattr__()

            else:

                self.__dict__[name] = value

>>> x = operators()

>>> x.age                                         # Runs __getattr__

40

>>> x.age = 41                                    # Runs __setattr__

set: age 41

>>> x._age                                        # Defined: no __getattr__ call

41

>>> x.age                                         # Runs __getattr__

40

>>> x.job = 'trainer'                             # Runs __setattr__ again

set: job trainer

>>> x.job                                         # Defined: no __getattr__ call

'trainer'

Properties seem like a win for this simple example. However, some applications of __getattr__ and __setattr__ still require more dynamic or generic interfaces than properties directly provide.

For example, in many cases the set of attributes to be supported cannot be determined when the class is coded, and may not even exist in any tangible form (e.g., when delegating arbitrary attribute references to a wrapped/embedded object generically). In such contexts, a generic __getattr__ or a __setattr__ attribute handler with a passed-in attribute name is usually preferable. Because such generic handlers can also support simpler cases, properties are often an optional and redundant extension, albeit one that may avoid extra calls on assignments, and one that some programmers may prefer when applicable.
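The generic delegation case can be sketched as follows. The Wrapper class and its _wrapped attribute are hypothetical names for this illustration; the point is that the set of supported names is whatever the embedded object happens to provide, which a fixed list of properties could not anticipate:

```python
class Wrapper(object):
    def __init__(self, wrapped):
        self._wrapped = wrapped                # Embedded object

    def __getattr__(self, name):               # On undefined attribute fetches only
        print('Trace: %s' % name)
        return getattr(self._wrapped, name)    # Delegate any name generically

x = Wrapper([1, 2, 3])
x.append(4)                                    # Routed to the embedded list
print(x._wrapped)                              # [1, 2, 3, 4]: defined, so no trace
```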

For more details on both options, stay tuned for Chapter 38 in the final part of this book. As we’ll see there, it’s also possible to code properties using the @ symbol function decorator syntax—a topic introduced later in this chapter, and an equivalent and automatic alternative to manual assignment in the class scope:

class properties(object):

    @property                          # Coding properties with decorators: ahead

    def age(self):

        ...

    @age.setter

    def age(self, value):

        ...

To make sense of this decorator syntax, though, we must move ahead.

__getattribute__ and Descriptors: Attribute Tools

Also in the class extensions department, the __getattribute__ operator overloading method, available for new-style classes only, allows a class to intercept all attribute references, not just undefined references. This makes it more potent than its __getattr__ cousin we used in the prior section, but also trickier to use—it’s prone to loops much like __setattr__, but in different ways.
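As a brief sketch of the looping hazard (the Tracer class and its attribute names are assumptions for illustration), any attribute fetch on self inside __getattribute__ would trigger the method again, so internal fetches are routed through the object superclass instead:

```python
class Tracer(object):                          # New-style: object needed in 2.X
    def __init__(self):
        self.fetched = []                      # Names fetched so far
        self.age = 40

    def __getattribute__(self, name):          # On every attribute fetch
        # self.fetched here would trigger __getattribute__ again and loop,
        # so route internal fetches through the object superclass instead
        log = object.__getattribute__(self, 'fetched')
        log.append(name)
        return object.__getattribute__(self, name)

x = Tracer()
print(x.age)                                   # 40, via __getattribute__
print(x.fetched)                               # ['age', 'fetched']
```

The second display includes 'fetched' itself, because even the fetch of the log list is intercepted.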

For more specialized attribute interception goals, in addition to properties and operator overloading methods, Python supports the notion of attribute descriptors—classes with __get__ and __set__ methods, assigned to class attributes and inherited by instances, that intercept read and write accesses to specific attributes. As a preview, here’s one of the simplest descriptors you’re likely to encounter:

>>> class AgeDesc(object):

        def __get__(self, instance, owner): return 40

        def __set__(self, instance, value): instance._age = value

>>> class descriptors(object):

        age = AgeDesc()

>>> x = descriptors()

>>> x.age                                         # Runs AgeDesc.__get__

40

>>> x.age = 42                                    # Runs AgeDesc.__set__

>>> x._age                                        # Normal fetch: no AgeDesc call

42

Descriptors have access to state in instances of themselves as well as their client class, and are in a sense a more general form of properties; in fact, properties are a simplified way to define a specific type of descriptor—one that runs functions on access. Descriptors are also used to implement the slots feature we met earlier, and other Python tools.
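To illustrate the two kinds of state, here is a hedged sketch using hypothetical Counted and Client classes: the default value and fetch counter live in the descriptor object itself (and so are shared by all client instances), while the assigned value lives in each client instance:

```python
class Counted(object):
    def __init__(self, default):
        self.default = default                 # State in the descriptor itself
        self.fetches = 0                       # Shared by all client instances

    def __get__(self, instance, owner):
        if instance is None:                   # Fetched through the class
            return self
        self.fetches += 1
        return instance.__dict__.get('_age', self.default)   # State in the client

    def __set__(self, instance, value):
        instance._age = value

class Client(object):
    age = Counted(40)

x, y = Client(), Client()
x.age = 42
print('%s %s' % (x.age, y.age))                # 42 40
print(Client.age.fetches)                      # 2
```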

Because __getattribute__ and descriptors are too substantial to cover well here, we’ll defer the rest of their coverage, as well as much more on properties, to Chapter 38 in the final part of this book. We’ll also employ them in examples in Chapter 39 and study how they factor into inheritance in Chapter 40.

Other Class Changes and Extensions

As mentioned, we’re also postponing coverage of the super built-in—an additional major new-style class extension that relies on its MRO—until the end of this chapter. Before we get there, though, we’re going to explore additional class-related changes and extensions that are not necessarily bound to new-style classes, but were introduced at roughly the same time: static and class methods, decorators, and more.

Many of the changes and feature additions of new-style classes integrate with the notion of subclassable types mentioned earlier in this chapter, because subclassable types and new-style classes were introduced in conjunction with a merging of the type/class dichotomy in Python 2.2 and beyond. As we’ve seen, in 3.X, this merging is complete: classes are now types, and types are classes, and Python classes today still reflect both that conceptual merging and its implementation.

Along with these changes, Python also grew a more coherent and generalized protocol for coding metaclasses—classes that subclass the type object, intercept class creation calls, and may provide behavior acquired by classes. Accordingly, they provide a well-defined hook for management and augmentation of class objects. They are also an advanced topic that is optional for most Python programmers, so we’ll postpone further details here. We’ll glimpse metaclasses again later in this chapter in conjunction with class decorators—a feature whose roles often overlap—but we’ll postpone their full coverage until Chapter 40, in the final part of this book. For our purpose here, let’s move on to a handful of additional class-related extensions.

Static and Class Methods

As of Python 2.2, it is possible to define two kinds of methods within a class that can be called without an instance: static methods work roughly like simple instance-less functions inside a class, and class methods are passed a class instead of an instance. Both are similar to tools in other languages (e.g., C++ static methods). Although this feature was added in conjunction with the new-style classes discussed in the prior sections, static and class methods work for classic classes too.

To enable these method modes, you must call special built-in functions named staticmethod and classmethod within the class, or invoke them with the special @name decoration syntax we’ll meet later in this chapter. These functions are required to enable these special method modes in Python 2.X, and are generally needed in 3.X. In Python 3.X, a staticmethod declaration is not required for instance-less methods called only through a class name, but is still required if such methods are called through instances.

Why the Special Methods?

As we’ve learned, a class’s method is normally passed an instance object in its first argument, to serve as the implied subject of the method call—that’s the “object” in “object-oriented programming.” Today, though, there are two ways to modify this model. Before I explain what they are, I should explain why this might matter to you.

Sometimes, programs need to process data associated with classes instead of instances. Consider keeping track of the number of instances created from a class, or maintaining a list of all of a class’s instances that are currently in memory. This type of information and its processing are associated with the class rather than its instances. That is, the information is usually stored on the class itself and processed apart from any instance.

For such tasks, simple functions coded outside a class can often suffice—because they can access class attributes through the class name, they have access to class data and never require access to an instance. However, to better associate such code with a class, and to allow such processing to be customized with inheritance as usual, it would be better to code these types of functions inside the class itself. To make this work, we need methods in a class that are not passed, and do not expect, a self instance argument.

Python supports such goals with the notion of static methods—simple functions with no self argument that are nested in a class and are designed to work on class attributes instead of instance attributes. Static methods never receive an automatic self argument, whether called through a class or an instance. They usually keep track of information that spans all instances, rather than providing behavior for instances.

Although less commonly used, Python also supports the notion of class methods—methods of a class that are passed a class object in their first argument instead of an instance, regardless of whether they are called through an instance or a class. Such methods can access class data through their class argument—what we’ve called self thus far—even if called through an instance. Normal methods, now known in formal circles as instance methods, still receive a subject instance when called; static and class methods do not.

Static Methods in 2.X and 3.X

The concept of static methods is the same in both Python 2.X and 3.X, but its implementation requirements have evolved somewhat in Python 3.X. Since this book covers both versions, I need to explain the differences in the two underlying models before we get to the code.

Really, we already began this story in the preceding chapter, when we explored the notion of unbound methods. Recall that both Python 2.X and 3.X always pass an instance to a method that is called through an instance. However, Python 3.X treats methods fetched directly from a class differently than 2.X—a difference in Python lines that has nothing to do with new-style classes:

§  Both Python 2.X and 3.X produce a bound method when a method is fetched through an instance.

§  In Python 2.X, fetching a method from a class produces an unbound method, which cannot be called without manually passing an instance.

§  In Python 3.X, fetching a method from a class produces a simple function, which can be called normally with no instance present.

In other words, methods in a Python 2.X class always require an instance to be passed in, whether they are called through an instance or a class. By contrast, in Python 3.X we are required to pass an instance to a method only if the method expects one; methods that do not include an instance argument can be called through the class without passing an instance. That is, 3.X allows simple functions in a class, as long as they do not expect and are not passed an instance argument. The net effect is that:

§  In Python 2.X, we must always declare a method as static in order to call it without an instance, whether it is called through a class or an instance.

§  In Python 3.X, we need not declare such methods as static if they will be called through a class only, but we must do so in order to call them through an instance.
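The fetch rules above can be checked directly in Python 3.X with a short sketch. The class here is a throwaway example, and the types module names used are standard library; in 2.X the class-level fetch would instead produce an unbound method:

```python
import types

class Spam:
    def meth(self):
        return 'called with %s' % type(self).__name__

x = Spam()
bound = x.meth                         # Fetched through instance: bound method
plain = Spam.meth                      # Fetched through class: simple function in 3.X

print(isinstance(bound, types.MethodType))     # True
print(isinstance(plain, types.FunctionType))   # True in 3.X
print(bound())                                 # Instance passed automatically
print(plain(x))                                # Instance passed manually
```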

To illustrate, suppose we want to use class attributes to count how many instances are generated from a class. The following file, spam.py, makes a first attempt—its class has a counter stored as a class attribute, a constructor that bumps up the counter by one each time a new instance is created, and a method that displays the counter’s value. Remember, class attributes are shared by all instances. Therefore, storing the counter in the class object itself ensures that it effectively spans all instances:

class Spam:

    numInstances = 0

    def __init__(self):

        Spam.numInstances = Spam.numInstances + 1

    def printNumInstances():

        print("Number of instances created: %s" % Spam.numInstances)

The printNumInstances method is designed to process class data, not instance data—it’s about all the instances, not any one in particular. Because of that, we want to be able to call it without having to pass an instance. Indeed, we don’t want to make an instance to fetch the number of instances, because this would change the number of instances we’re trying to fetch! In other words, we want a self-less “static” method.

Whether this code’s printNumInstances works or not, though, depends on which Python you use, and which way you call the method—through the class or through an instance. In 2.X, calls to a self-less method function through both the class and instances fail (as usual, I’ve omitted some error text here for space):

C:\code> c:\python27\python

>>> from spam import Spam

>>> a = Spam()                       # Cannot call unbound class methods in 2.X

>>> b = Spam()                       # Methods expect a self object by default

>>> c = Spam()

>>> Spam.printNumInstances()

TypeError: unbound method printNumInstances() must be called with Spam instance

as first argument (got nothing instead)

>>> a.printNumInstances()

TypeError: printNumInstances() takes no arguments (1 given)

The problem here is that unbound instance methods aren’t exactly the same as simple functions in 2.X. Even though there are no arguments in the def header, the method still expects an instance to be passed in when it’s called, because the function is associated with a class. In Python 3.X, calls to self-less methods made through classes work, but calls from instances fail:

C:\code> c:\python33\python

>>> from spam import Spam

>>> a = Spam()                       # Can call functions in class in 3.X

>>> b = Spam()                       # Calls through instances still pass a self

>>> c = Spam()

>>> Spam.printNumInstances()         # Differs in 3.X

Number of instances created: 3

>>> a.printNumInstances()

TypeError: printNumInstances() takes 0 positional arguments but 1 was given

That is, calls to instance-less methods like printNumInstances made through the class fail in Python 2.X but work in Python 3.X. On the other hand, calls made through an instance fail in both Pythons, because an instance is automatically passed to a method that does not have an argument to receive it:

Spam.printNumInstances()             # Fails in 2.X, works in 3.X

instance.printNumInstances()         # Fails in both 2.X and 3.X (unless static)

If you’re able to use 3.X and stick with calling self-less methods through classes only, you already have a static method feature. However, to allow self-less methods to be called through classes in 2.X and through instances in both 2.X and 3.X, you need to either adopt other designs or be able to somehow mark such methods as special. Let’s look at both options in turn.

Static Method Alternatives

Short of marking a self-less method as special, you can sometimes achieve similar results with different coding structures. For example, if you just want to call functions that access class members without an instance, perhaps the simplest idea is to use normal functions outside the class, not methods coded in the class. This way, an instance isn’t expected in the call. The following mutation of spam.py illustrates, and works the same in Python 3.X and 2.X:

def printNumInstances():

    print("Number of instances created: %s" % Spam.numInstances)

class Spam:

    numInstances = 0

    def __init__(self):

        Spam.numInstances = Spam.numInstances + 1

C:\code> c:\python33\python

>>> import spam

>>> a = spam.Spam()

>>> b = spam.Spam()

>>> c = spam.Spam()

>>> spam.printNumInstances()           # But function may be too far removed

Number of instances created: 3         # And cannot be changed via inheritance

>>> spam.Spam.numInstances

3

Because the class name is accessible to the simple function as a global variable, this works fine. Also, note that the name of the function becomes global, but only to this single module; it will not clash with names in other files of the program.

Prior to static methods in Python, this structure was the general prescription. Because Python already provides modules as a namespace-partitioning tool, one could argue that there’s not typically any need to package functions in classes unless they implement object behavior. Simple functions within modules like the one here do much of what instance-less class methods could, and are already associated with the class because they live in the same module.

Unfortunately, this approach is still less than ideal. For one thing, it adds to this file’s scope an extra name that is used only for processing a single class. For another, the function is much less directly associated with the class by structure; in fact, its definition could be hundreds of lines away. Perhaps worse, simple functions like this cannot be customized by inheritance, since they live outside a class’s namespace: subclasses cannot directly replace or extend such a function by redefining it.

We might try to make this example work in a version-neutral way by using a normal method and always calling it through (or with) an instance, as usual:

class Spam:

    numInstances = 0

    def __init__(self):

        Spam.numInstances = Spam.numInstances + 1

    def printNumInstances(self):

        print("Number of instances created: %s" % Spam.numInstances)

C:\code> c:\python33\python

>>> from spam import Spam

>>> a, b, c = Spam(), Spam(), Spam()

>>> a.printNumInstances()

Number of instances created: 3

>>> Spam.printNumInstances(a)

Number of instances created: 3

>>> Spam().printNumInstances()         # But fetching counter changes counter!

Number of instances created: 4

Unfortunately, as mentioned earlier, such an approach is completely unworkable if we don’t have an instance available, and making an instance changes the class data, as illustrated in the last line here. A better solution would be to somehow mark a method inside a class as never requiring an instance. The next section shows how.

Using Static and Class Methods

Today, there is another option for coding simple functions associated with a class that may be called through either the class or its instances. As of Python 2.2, we can code classes with static and class methods, neither of which requires an instance argument to be passed in when invoked. To designate such methods, classes call the built-in functions staticmethod and classmethod, as hinted in the earlier discussion of new-style classes. Both mark a function object as special—that is, as requiring no instance if static and requiring a class argument if a class method. For example, in the file bothmethods.py (which unifies 2.X and 3.X printing with lists, though displays still vary slightly for 2.X classic classes):

# File bothmethods.py

class Methods:

    def imeth(self, x):            # Normal instance method: passed a self

        print([self, x])

    def smeth(x):                  # Static: no instance passed

        print([x])

    def cmeth(cls, x):             # Class: gets class, not instance

        print([cls, x])

    smeth = staticmethod(smeth)    # Make smeth a static method (or @: ahead)

    cmeth = classmethod(cmeth)     # Make cmeth a class method (or @: ahead)

Notice how the last two assignments in this code simply reassign (a.k.a. rebind) the method names smeth and cmeth. Attributes are created and changed by any assignment in a class statement, so these final assignments simply overwrite the assignments made earlier by the defs. As we’ll see in a few moments, the special @ syntax works here as an alternative to this just as it does for properties—but makes little sense unless you first understand the assignment form here that it automates.
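The rebinding can be observed by looking in the class's namespace dictionary. The sketch below is a variation on the class above that returns values instead of printing and adds a hypothetical extra name, before, to capture the function prior to wrapping; Python 3.X is assumed:

```python
class Methods:
    def smeth(x):
        return [x]
    before = smeth                         # Still a simple function here
    smeth = staticmethod(smeth)            # Rebind the name to a wrapper object

    def cmeth(cls, x):
        return [cls.__name__, x]
    cmeth = classmethod(cmeth)

print(type(Methods.__dict__['before']).__name__)   # function
print(type(Methods.__dict__['smeth']).__name__)    # staticmethod
print(type(Methods.__dict__['cmeth']).__name__)    # classmethod
print(Methods().smeth(1), Methods().cmeth(2))      # [1] ['Methods', 2]
```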

Technically, Python now supports three kinds of class-related methods, with differing argument protocols:

§  Instance methods, passed a self instance object (the default)

§  Static methods, passed no extra object (via staticmethod)

§  Class methods, passed a class object (via classmethod, and inherent in metaclasses)

Moreover, Python 3.X extends this model by also allowing simple functions in a class to serve the role of static methods without extra protocol, when called through a class object only. Despite its name, the bothmethods.py module illustrates all three method types, so let’s expand on these in turn.

Instance methods are the normal and default case that we’ve seen in this book. An instance method must always be called with an instance object. When you call it through an instance, Python passes the instance to the first (leftmost) argument automatically; when you call it through a class, you must pass along the instance manually:

>>> from bothmethods import Methods    # Normal instance methods

>>> obj = Methods()                    # Callable through instance or class

>>> obj.imeth(1)

[<bothmethods.Methods object at 0x0000000002A15710>, 1]

>>> Methods.imeth(obj, 2)

[<bothmethods.Methods object at 0x0000000002A15710>, 2]

Static methods, by contrast, are called without an instance argument. Unlike simple functions outside a class, their names are local to the scopes of the classes in which they are defined, and they may be looked up by inheritance. Instance-less functions can be called through a class normally in Python 3.X, but never by default in 2.X. Using the staticmethod built-in allows such methods to also be called through an instance in 3.X and through both a class and an instance in Python 2.X (that is, the first of the following works in 3.X without staticmethod, but the second does not):

>>> Methods.smeth(3)                   # Static method: call through class

[3]                                    # No instance passed or expected

>>> obj.smeth(4)                       # Static method: call through instance

[4]                                    # Instance not passed

Class methods are similar, but Python automatically passes the class (not an instance) in to a class method’s first (leftmost) argument, whether it is called through a class or an instance:

>>> Methods.cmeth(5)                   # Class method: call through class

[<class 'bothmethods.Methods'>, 5]     # Becomes cmeth(Methods, 5)

>>> obj.cmeth(6)                       # Class method: call through instance

[<class 'bothmethods.Methods'>, 6]     # Becomes cmeth(Methods, 6)

In Chapter 40, we’ll also find that metaclass methods—a unique, advanced, and technically distinct method type—behave similarly to the explicitly-declared class methods we’re exploring here.

Counting Instances with Static Methods

Now, given these built-ins, here is the static method equivalent of this section’s instance-counting example—it marks the method as special, so it will never be passed an instance automatically:

class Spam:

    numInstances = 0                         # Use static method for class data

    def __init__(self):

        Spam.numInstances += 1

    def printNumInstances():

        print("Number of instances: %s" % Spam.numInstances)

    printNumInstances = staticmethod(printNumInstances)

Using the static method built-in, our code now allows the self-less method to be called through the class or any instance of it, in both Python 2.X and 3.X:

>>> from spam_static import Spam

>>> a = Spam()

>>> b = Spam()

>>> c = Spam()

>>> Spam.printNumInstances()                 # Call as simple function

Number of instances: 3

>>> a.printNumInstances()                    # Instance argument not passed

Number of instances: 3

Compared to simply moving printNumInstances outside the class, as prescribed earlier, this version requires an extra staticmethod call (or an @ line we’ll see ahead). However, it also localizes the function name in the class scope (so it won’t clash with other names in the module); moves the function code closer to where it is used (inside the class statement); and allows subclasses to customize the static method with inheritance—a more convenient and powerful approach than importing functions from the files in which superclasses are coded. The following subclass and new testing session illustrate (be sure to start a new session after changing files, so that your from imports load the latest version of the file):

class Sub(Spam):

    def printNumInstances():                 # Override a static method

        print("Extra stuff...")              # But call back to original

        Spam.printNumInstances()

    printNumInstances = staticmethod(printNumInstances)

>>> from spam_static import Spam, Sub

>>> a = Sub()

>>> b = Sub()

>>> a.printNumInstances()                    # Call from subclass instance

Extra stuff...

Number of instances: 2

>>> Sub.printNumInstances()                  # Call from subclass itself

Extra stuff...

Number of instances: 2

>>> Spam.printNumInstances()                 # Call original version

Number of instances: 2

Moreover, classes can inherit the static method without redefining it—it is run without an instance, regardless of where it is defined in a class tree:

>>> class Other(Spam): pass                  # Inherit static method verbatim

>>> c = Other()

>>> c.printNumInstances()

Number of instances: 3

Notice how this also bumps up the superclass’s instance counter, because its constructor is inherited and run—a behavior that begins to encroach on the next section’s subject.

Counting Instances with Class Methods

Interestingly, a class method can do similar work here—the following has the same behavior as the static method version listed earlier, but it uses a class method that receives the instance’s class in its first argument. Rather than hardcoding the class name, the class method uses the automatically passed class object generically:

class Spam:

    numInstances = 0                         # Use class method instead of static

    def __init__(self):

        Spam.numInstances += 1

    def printNumInstances(cls):

        print("Number of instances: %s" % cls.numInstances)

    printNumInstances = classmethod(printNumInstances)

This class is used in the same way as the prior versions, but its printNumInstances method receives the Spam class, not the instance, when called from both the class and an instance:

>>> from spam_class import Spam

>>> a, b = Spam(), Spam()

>>> a.printNumInstances()                    # Passes class to first argument

Number of instances: 2

>>> Spam.printNumInstances()                 # Also passes class to first argument

Number of instances: 2

When using class methods, though, keep in mind that they receive the most specific (i.e., lowest) class of the call’s subject. This has some subtle implications when trying to update class data through the passed-in class. For example, if in module spam_class.py we subclass to customize as before, augment Spam.printNumInstances to also display its cls argument, and start a new testing session:

class Spam:

    numInstances = 0                         # Trace class passed in

    def __init__(self):

        Spam.numInstances += 1

    def printNumInstances(cls):

        print("Number of instances: %s %s" % (cls.numInstances, cls))

    printNumInstances = classmethod(printNumInstances)

class Sub(Spam):

    def printNumInstances(cls):              # Override a class method

        print("Extra stuff...", cls)         # But call back to original

        Spam.printNumInstances()

    printNumInstances = classmethod(printNumInstances)

class Other(Spam): pass                      # Inherit class method verbatim

The lowest class is passed in whenever a class method is run, even for subclasses that have no class methods of their own:

>>> from spam_class import Spam, Sub, Other

>>> x = Sub()

>>> y = Spam()

>>> x.printNumInstances()                           # Call from subclass instance

Extra stuff... <class 'spam_class.Sub'>

Number of instances: 2 <class 'spam_class.Spam'>

>>> Sub.printNumInstances()                         # Call from subclass itself

Extra stuff... <class 'spam_class.Sub'>

Number of instances: 2 <class 'spam_class.Spam'>

>>> y.printNumInstances()                           # Call from superclass instance

Number of instances: 2 <class 'spam_class.Spam'>

In the first call here, a class method call is made through an instance of the Sub subclass, and Python passes the lowest class, Sub, to the class method. All is well in this case—since Sub’s redefinition of the method calls the Spam superclass’s version explicitly, the superclass method in Spam receives its own class in its first argument. But watch what happens for an object that inherits the class method verbatim:

>>> z = Other()                                     # Call from lower sub's instance

>>> z.printNumInstances()

Number of instances: 3 <class 'spam_class.Other'>

This last call here passes Other to Spam’s class method. This works in this example because fetching the counter finds it in Spam by inheritance. If this method tried to assign to the passed class’s data, though, it would update Other, not Spam! In this specific case, Spam is probably better off hardcoding its own class name to update its data if it means to count instances of all its subclasses too, rather than relying on the passed-in class argument.
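To make the assignment pitfall concrete, here is a minimal sketch of a variation that is not one of this section's files: the counter is updated through the passed-in cls rather than through the hardcoded Spam name, and the subclass has no counter of its own. The count method name is an assumption made for illustration:

```python
class Spam:
    numInstances = 0

    def count(cls):
        cls.numInstances += 1              # Assigns on cls, which may be a subclass
    count = classmethod(count)

    def __init__(self):
        self.count()                       # Passes the instance's class to count

class Other(Spam): pass                    # No numInstances of its own

a = Spam()                                 # Spam.numInstances: 0 -> 1
z = Other()                                # Reads 1 from Spam, stores 2 on Other

print(Spam.numInstances)                   # 1: Spam's counter missed the Other instance
print(Other.numInstances)                  # 2: a new attribute created on Other
print('numInstances' in Other.__dict__)    # True
```

The fetch half of += finds the counter in Spam by inheritance, but the assignment half always creates or changes the attribute on the class passed in, so the two counters drift apart.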

Counting instances per class with class methods

In fact, because class methods always receive the lowest class in an instance’s tree:

§  Static methods and explicit class names may be a better solution for processing data local to a class.

§  Class methods may be better suited to processing data that may differ for each class in a hierarchy.

Code that needs to manage per-class instance counters, for example, might be best off leveraging class methods. In the following, the top-level superclass uses a class method to manage state information that varies for and is stored on each class in the tree—similar in spirit to the way instance methods manage state information that varies per class instance:

class Spam:

    numInstances = 0

    def count(cls):                    # Per-class instance counters

        cls.numInstances += 1          # cls is lowest class above instance

    def __init__(self):

        self.count()                   # Passes self.__class__ to count

    count = classmethod(count)

class Sub(Spam):

    numInstances = 0

    def __init__(self):                # Redefines __init__

        Spam.__init__(self)

class Other(Spam):                     # Inherits __init__

    numInstances = 0

>>> from spam_class2 import Spam, Sub, Other

>>> x = Spam()

>>> y1, y2 = Sub(), Sub()

>>> z1, z2, z3 = Other(), Other(), Other()

>>> x.numInstances, y1.numInstances, z1.numInstances             # Per-class data!

(1, 2, 3)

>>> Spam.numInstances, Sub.numInstances, Other.numInstances

(1, 2, 3)

Static and class methods have additional advanced roles, which we will finesse here; see other resources for more use cases. In recent Python versions, though, the static and class method designations have become even simpler with the advent of function decoration syntax—a way to apply one function to another that has roles well beyond the static method use case that was its initial motivation. This syntax also allows us to augment classes in Python 2.X and 3.X—to initialize data like the numInstances counter in the last example, for instance. The next section explains how.

NOTE

For a postscript on Python’s method types, be sure to watch for coverage of metaclass methods in Chapter 40—because these are designed to process a class that is an instance of a metaclass, they turn out to be very similar to the class methods defined here, but require no classmethod declaration, and apply only to the shadowy metaclass realm.

Decorators and Metaclasses: Part 1

Because the staticmethod and classmethod call technique described in the prior section initially seemed obscure to some observers, a device was eventually added to make the operation simpler. Python decorators—similar to the notion and syntax of annotations in Java—both addressed this specific need and provided a general tool for adding logic that manages both functions and classes, or later calls to them.

This is called a “decoration,” but in more concrete terms is really just a way to run extra processing steps at function and class definition time with explicit syntax. It comes in two flavors:

§  Function decorators—the initial entry in this set, added in Python 2.4—augment function definitions. They specify special operation modes for both simple functions and classes’ methods by wrapping them in an extra layer of logic implemented as another function, usually called a metafunction.

§  Class decorators—a later extension, added in Python 2.6 and 3.0—augment class definitions. They do the same for classes, adding support for management of whole objects and their interfaces. Though perhaps simpler, they often overlap in roles with metaclasses.

Function decorators turn out to be very general tools: they are useful for adding many types of logic to functions besides the static and class method use cases. For instance, they may be used to augment functions with code that logs calls made to them, checks the types of passed arguments during debugging, and so on. Function decorators can be used to manage either functions themselves or later calls to them. In the latter mode, function decorators are similar to the delegation design pattern we explored in Chapter 31, but they are designed to augment a specific function or method call, not an entire object interface.

Python provides a few built-in function decorators for operations such as marking static and class methods and defining properties (as sketched earlier, the property built-in works as a decorator automatically), but programmers can also code arbitrary decorators of their own. Although they are not strictly tied to classes, user-defined function decorators often are coded as classes to save the original functions for later dispatch, along with other data as state information.

This proved such a useful hook that it was extended in Python 2.6, 2.7, and 3.X—class decorators bring augmentation to classes too, and are more directly tied to the class model. Like their function cohorts, class decorators may manage classes themselves or later instance creation calls, and often employ delegation in the latter mode. As we’ll find, their roles also often overlap with metaclasses; when they do, the newer class decorators may offer a more lightweight way to achieve the same goals.

Function Decorator Basics

Syntactically, a function decorator is a sort of runtime declaration about the function that follows. A function decorator is coded on a line by itself just before the def statement that defines a function or method. It consists of the @ symbol, followed by what we call a metafunction—a function (or other callable object) that manages another function. Static methods since Python 2.4, for example, may be coded with decorator syntax like this:

class C:

   @staticmethod                    # Function decoration syntax

   def meth():

       ...

Internally, this syntax has the same effect as the following—passing the function through the decorator and assigning the result back to the original name:

class C:

   def meth():

       ...

   meth = staticmethod(meth)        # Name rebinding equivalent

Decoration rebinds the method name to the decorator’s result. The net effect is that calling the method function’s name later actually triggers the result of its staticmethod decorator first. Because a decorator can return any sort of object, this allows the decorator to insert a layer of logic to be run on every call. The decorator function is free to return either the original function itself, or a new proxy object that saves the original function passed to the decorator to be invoked indirectly after the extra logic layer runs.
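The two options a decorator has, returning the original function or returning a proxy, can be sketched as follows. The register and shout decorators and the registry list are hypothetical names invented for this illustration, not built-ins:

```python
registry = []                              # Hypothetical module-level list

def register(func):                        # Manages the function itself
    registry.append(func.__name__)
    return func                            # Original function returned unchanged

def shout(func):                           # Manages later calls
    def wrapper(*args):
        return func(*args).upper()         # Run the original, then post-process
    return wrapper                         # Name is rebound to this proxy

@register                                  # Same as greet = register(greet)
def greet(name):
    return 'hello ' + name

@shout                                     # Same as cheer = shout(cheer)
def cheer(name):
    return 'go ' + name

print(registry)                            # ['greet']
print(greet('bob'))                        # hello bob
print(cheer('bob'))                        # GO BOB
print(cheer.__name__)                      # wrapper: the name now refers to the proxy
```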

With decorator syntax, here’s a better way to code our static method example from the prior section in either Python 2.X or 3.X:

class Spam:

    numInstances = 0

    def __init__(self):

        Spam.numInstances = Spam.numInstances + 1

    @staticmethod

    def printNumInstances():

        print("Number of instances created: %s" % Spam.numInstances)

>>> from spam_static_deco import Spam

>>> a = Spam()

>>> b = Spam()

>>> c = Spam()

>>> Spam.printNumInstances()            # Calls from classes and instances work

Number of instances created: 3

>>> a.printNumInstances()

Number of instances created: 3

Because they also accept and return functions, the classmethod and property built-in functions may be used as decorators in the same way—as in the following mutation of the prior bothmethods.py:

# File bothmethods_decorators.py

class Methods(object):             # object needed in 2.X for property setters

    def imeth(self, x):            # Normal instance method: passed a self

        print([self, x])

    @staticmethod

    def smeth(x):                  # Static: no instance passed

        print([x])

    @classmethod

    def cmeth(cls, x):             # Class: gets class, not instance

        print([cls, x])

    @property                      # Property: computed on fetch

    def name(self):

        return 'Bob ' + self.__class__.__name__

>>> from bothmethods_decorators import Methods

>>> obj = Methods()

>>> obj.imeth(1)

[<bothmethods_decorators.Methods object at 0x0000000002A256A0>, 1]

>>> obj.smeth(2)

[2]

>>> obj.cmeth(3)

[<class 'bothmethods_decorators.Methods'>, 3]

>>> obj.name

'Bob Methods'

Keep in mind that staticmethod and its kin here are still built-in functions; they may be used in decoration syntax, just because they take a function as an argument and return a callable to which the original function name can be rebound. In fact, any such function can be used in this way—even user-defined functions we code ourselves, as the next section explains.

A First Look at User-Defined Function Decorators

Although Python provides a handful of built-in functions that can be used as decorators, we can also write custom decorators of our own. Because of their wide utility, we’re going to devote an entire chapter to coding decorators in the final part of this book. As a quick example, though, let’s look at a simple user-defined decorator at work.

Recall from Chapter 30 that the __call__ operator overloading method implements a function-call interface for class instances. The following code uses this to define a call proxy class that saves the decorated function in the instance and catches calls to the original name. Because this is a class, it also has state information—a counter of calls made:

class tracer:

    def __init__(self, func):          # Remember original, init counter

        self.calls = 0

        self.func  = func

    def __call__(self, *args):         # On later calls: add logic, run original

        self.calls += 1

        print('call %s to %s' % (self.calls, self.func.__name__))

        return self.func(*args)

@tracer                                # Same as spam = tracer(spam)

def spam(a, b, c):                     # Wrap spam in a decorator object

    return a + b + c

print(spam(1, 2, 3))                   # Really calls the tracer wrapper object

print(spam('a', 'b', 'c'))             # Invokes __call__ in class

Because the spam function is run through the tracer decorator, when the original spam name is called it actually triggers the __call__ method in the class. This method counts and logs the call, and then dispatches it to the original wrapped function. Note how the *name argument syntax is used to pack and unpack the passed-in arguments; because of this, this decorator can be used to wrap any function with any number of positional arguments.

The net effect, again, is to add a layer of logic to the original spam function. Here is the script’s 3.X and 2.X output—the first line comes from the tracer class, and the second gives the return value of the spam function itself:

c:\code> python tracer1.py

call 1 to spam

6

call 2 to spam

abc

Trace through this example’s code for more insight. As it is, this decorator works for any function that takes positional arguments, but it does not handle keyword arguments, and cannot decorate class-level method functions (in short, for methods its __call__ would be passed a tracer instance only). As we’ll see in Part VIII, there are a variety of ways to code function decorators, including nested def statements; some of the alternatives are better suited to methods than the version shown here.

For example, by using nested functions with enclosing scopes for state, instead of callable class instances with attributes, function decorators often become more broadly applicable to class-level methods too. We’ll postpone the full details on this, but here’s a brief look at this closure-based coding model; it uses function attributes for counter state for portability, but could leverage variables and nonlocal instead in 3.X only:

def tracer(func):                      # Remember original

    def oncall(*args):                 # On later calls

        oncall.calls += 1

        print('call %s to %s' % (oncall.calls, func.__name__))

        return func(*args)

    oncall.calls = 0

    return oncall

class C:

    @tracer

    def spam(self, a, b, c): return a + b + c

x = C()

print(x.spam(1, 2, 3))

print(x.spam('a', 'b', 'c'))           # Same output as tracer1 (in tracer2.py)
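As a hedged sketch of the 3.X-only alternative just mentioned, the following variation keeps the counter in an enclosing-scope variable declared nonlocal. It also accepts keyword arguments with **kwargs, an extension beyond the versions above; the eggs function is a hypothetical addition for illustration:

```python
def tracer(func):                          # Remember original
    calls = 0                              # State in the enclosing scope
    def oncall(*args, **kwargs):           # Also accept keyword arguments
        nonlocal calls                     # 3.X only
        calls += 1
        print('call %s to %s' % (calls, func.__name__))
        return func(*args, **kwargs)
    return oncall

class C:
    @tracer
    def spam(self, a, b, c): return a + b + c

@tracer
def eggs(x, y=10): return x * y            # Hypothetical simple function

x = C()
print(x.spam(1, 2, c=3))                   # call 1 to spam, then 6
print(eggs(2, y=4))                        # call 1 to eggs, then 8
print(eggs(3))                             # call 2 to eggs, then 30
```

Each decoration runs tracer anew, so every decorated function or method gets its own calls counter.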

A First Look at Class Decorators and Metaclasses

Function decorators turned out to be so useful that Python 2.6 and 3.0 expanded the model, allowing decorators to be applied to classes as well as functions. In short, class decorators are similar to function decorators, but they are run at the end of a class statement to rebind a class name to a callable. As such, they can be used to either manage classes just after they are created, or insert a layer of wrapper logic to manage instances when they are later created. Symbolically, the code structure:

def decorator(aClass): ...

@decorator                             # Class decoration syntax

class C: ...

is mapped to the following equivalent:

def decorator(aClass): ...

class C: ...                           # Name rebinding equivalent

C = decorator(C)

The class decorator is free to augment the class itself, or return a proxy object that intercepts later instance construction calls. For example, in the code of the section Counting instances per class with class methods, we could use this hook to automatically augment the classes with instance counters and any other data required:

def count(aClass):

    aClass.numInstances = 0

    return aClass                 # Return class itself, instead of a wrapper

@count

class Spam: ...                   # Same as Spam = count(Spam)

@count

class Sub(Spam): ...              # numInstances = 0 not needed here

@count

class Other(Spam): ...

In fact, as coded, this decorator can be applied to classes or functions—it happily returns the object being defined in either context after initializing the object’s attribute:

@count

def spam(): pass        # Like spam = count(spam)

@count

class Other: pass       # Like Other = count(Other)

spam.numInstances       # Both are set to zero

Other.numInstances
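Putting the pieces together, here is one possible runnable combination of the count decorator with the per-class counting scheme from earlier. The class method is renamed bump here (a hypothetical name) only to keep it visually distinct from the decorator:

```python
def count(aClass):
    aClass.numInstances = 0                # Give each decorated class its own counter
    return aClass                          # Return class itself, not a wrapper

@count
class Spam:
    def __init__(self):
        self.bump()                        # Passes self.__class__ to bump

    @classmethod
    def bump(cls):
        cls.numInstances += 1              # Updates the counter on the lowest class

@count
class Sub(Spam): pass                      # No numInstances = 0 line required

@count
class Other(Spam): pass

x = Spam()
y1, y2 = Sub(), Sub()
z1, z2, z3 = Other(), Other(), Other()
print(Spam.numInstances, Sub.numInstances, Other.numInstances)    # 1 2 3
```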

Though this decorator manages a function or class itself, as we’ll see later in this book, class decorators can also manage an object’s entire interface by intercepting construction calls, and wrapping the new instance object in a proxy that deploys attribute accessor tools to intercept later requests—a multilevel coding technique we’ll use to implement class attribute privacy in Chapter 39. Here’s a preview of the model:

def decorator(cls):                             # On @ decoration

    class Proxy:

        def __init__(self, *args):              # On instance creation: make a cls

            self.wrapped = cls(*args)

        def __getattr__(self, name):            # On attribute fetch: extra ops here

            return getattr(self.wrapped, name)

    return Proxy

@decorator

class C: ...        # Like C = decorator(C)

X = C()             # Makes a Proxy that wraps a C, and catches later X.attr
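To watch the proxy at work, the following sketch fills in the preview with a trace print and a small client class. The Person class and the trace message are assumptions added for illustration, and only positional constructor arguments are handled:

```python
def decorator(cls):                        # On @ decoration
    class Proxy:
        def __init__(self, *args):         # On instance creation: make a cls
            self.wrapped = cls(*args)
        def __getattr__(self, name):       # Runs only for names Proxy lacks
            print('fetch:', name)          # Extra operation: trace the access
            return getattr(self.wrapped, name)
    return Proxy

@decorator
class Person:                              # Like Person = decorator(Person)
    def __init__(self, name):
        self.name = name
    def greet(self):
        return 'hi, ' + self.name

bob = Person('Bob')                        # Really makes a Proxy wrapping a Person
print(type(bob).__name__)                  # Proxy
print(bob.name)                            # fetch: name, then Bob
print(bob.greet())                         # fetch: greet, then hi, Bob
```

Because wrapped is stored on the proxy instance itself, fetching it never reaches __getattr__, so the delegation does not loop.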

Metaclasses, mentioned briefly earlier, are a similarly advanced class-based tool whose roles often intersect with those of class decorators. They provide an alternate model, which routes the creation of a class object to a subclass of the top-level type class, at the conclusion of a class statement:

class Meta(type):

    def __new__(meta, classname, supers, classdict):

        ...extra logic + class creation via type call...

class C(metaclass=Meta):

    ...my creation routed to Meta...            # Like C = Meta('C', (), {...})

In Python 2.X, the effect is the same, but the coding differs—use a class attribute instead of a keyword argument in the class header:

class C:

    __metaclass__ = Meta

    ... my creation routed to Meta...

In either line, Python calls a class’s metaclass to create the new class object, passing in the data defined during the class statement’s run; in 2.X, the metaclass simply defaults to the classic class creator:

 classname = Meta(classname, superclasses, attributedict)

To assume control of the creation or initialization of a new class object, a metaclass generally redefines the __new__ or __init__ method of the type class that normally intercepts this call. The net effect, as with class decorators, is to define code to be run automatically at class creation time. Here, this step binds the class name to the result of a call to a user-defined metaclass. In fact, a metaclass need not be a class at all—a possibility we’ll explore later that blurs some of the distinction between this tool and decorators, and may even qualify the two as functionally equivalent in many roles.
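As a concrete but simplified sketch in 3.X syntax, the following metaclass adds an instance counter to every class it creates, much like the count class decorator shown earlier. The counter logic and the trace print are assumptions made for illustration; Chapter 40 covers the full protocol:

```python
class Meta(type):
    def __new__(meta, classname, supers, classdict):
        print('creating', classname)                   # Extra logic at class creation
        classdict['numInstances'] = 0                  # Augment the new class's namespace
        return type.__new__(meta, classname, supers, classdict)

class Spam(metaclass=Meta):                # Creation routed to Meta
    def __init__(self):
        type(self).numInstances += 1       # Count on the instance's own class

class Sub(Spam): pass                      # Metaclass is inherited: Meta runs again

a = Spam()
b, c = Sub(), Sub()
print(type(Spam) is Meta, type(Sub) is Meta)           # True True
print(Spam.numInstances, Sub.numInstances)             # 1 2
```

Unlike a class decorator, which must be listed above each class statement, the metaclass here is acquired by subclasses automatically.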

Both schemes, class decorators and metaclasses, are free to augment a class or return an arbitrary object to replace it—a protocol with almost limitless class-based customization possibilities. As we’ll see later, metaclasses may also define methods that process their instance classes, rather than normal instances of them—a technique that’s similar to class methods, and might be emulated in spirit by methods and data in class decorator proxies, or even a class decorator that returns a metaclass instance. Such mind-binding concepts will require Chapter 40’s conceptual groundwork (and quite possibly sedation!).

For More Details

Naturally, there’s much more to the decorator and metaclass stories than I’ve shown here. Although they are a general mechanism whose usage may be required by some packages, coding new user-defined decorators and metaclasses is an advanced topic of interest primarily to tool writers, not application programmers. Because of this, we’ll defer additional coverage until the final and optional part of this book:

§  Chapter 38 shows how to code properties using function decorator syntax in more depth.

§  Chapter 39 has much more on decorators, including more comprehensive examples.

§  Chapter 40 covers metaclasses, and more on the class and instance management story.

Although these chapters cover advanced topics, they’ll also provide us with a chance to see Python at work in more substantial examples than much of the rest of the book was able to provide. For now, let’s move on to our final class-related topic.

The super Built-in Function: For Better or Worse?

So far, I’ve mentioned Python’s super built-in function only briefly in passing because it is relatively uncommon and may even be controversial to use. Given this call’s increased visibility in recent years, though, it merits some further elaboration in this edition. Besides introducing super, this section also serves as a language design case study to close out a chapter on so many tools whose presence may to some seem curious in a scripting language like Python.

Some of this section calls this proliferation of tools into question, and I encourage you to judge any subjective content here for yourself (and we’ll return to such things at the end of this book after we’ve expanded on other advanced tools such as metaclasses and descriptors). Still, Python’s rapid growth rate in recent years represents a strategic decision point for its community going forward, and super seems as good a representative example as any.

The Great super Debate

As noted in Chapter 28 and Chapter 29, Python has a super built-in function that can be used to invoke superclass methods generically, but was deferred until this point of the book. This was deliberate—because super has substantial downsides in typical code, and a sole use case that seems obscure and complex to many observers, most beginners are better served by the traditional explicit-name call scheme used so far. See the sidebar What About super? in Chapter 28 for a brief summary of the rationale for this policy.

The Python community itself seems split on this subject, with online articles about it running the gamut from “Python’s Super Considered Harmful” to “Python’s super() considered super!”[65] Frankly, in my live classes this call seems to be most often of interest to Java programmers starting to use Python anew, because of its conceptual similarity to a tool in that language (many a new Python feature ultimately owes its existence to programmers of other languages bringing their old habits to a new model). Python’s super is not Java’s—it translates differently to Python’s multiple inheritance, and has a use case beyond Java’s—but it has managed to generate both controversy and misunderstanding since its conception.

This book postponed the super call until now (and omitted it almost entirely in prior editions) because it has significant issues—it’s prohibitively cumbersome to use in 2.X, differs in form between 2.X and 3.X, is based upon unusual semantics in 3.X, and mixes poorly with Python’s multiple inheritance and operator overloading in typical Python code. In fact, as we’ll see, in some code super can actually mask problems, and discourage a more explicit coding style that offers better control.

In its defense, this call does have a valid use case too—cooperative same-named method dispatch in diamond multiple inheritance trees—but it seems to ask a lot of newcomers. It requires that super be used universally and consistently (if not neurotically), much like __slots__ discussed earlier; relies on the arguably obscure MRO algorithm to order calls; and addresses a use case that seems far more the exception than the norm in Python programs. In this role, super seems an advanced tool based upon esoteric principles, which may be beyond much of Python’s audience, and seems artificial to real program goals. That aside, its expectation of universal use seems unrealistic for the vast amount of existing Python code.

Because of all these factors, this introductory-level book has preferred the traditional explicit-name call scheme thus far and recommends the same for newcomers. You’re better off learning the traditional scheme first, and might be better off sticking with that in general, rather than using an extra special-case tool that may not work in some contexts, and relies on arcane magic in the valid but atypical use case it addresses. This is not just your author’s opinion; despite its advocates’ best intentions, super is not widely recognized as “best practice” in Python today, for completely valid reasons.

On the other hand, just as for other tools, the increasing use of this call in Python code in recent years makes it no longer optional for many Python programmers—the first time you see it, it’s officially mandatory! For readers who may wish to experiment with super, and for other readers who may have it imposed upon them, this section provides a brief look at this tool and its rationale—beginning with alternatives to it.

Traditional Superclass Call Form: Portable, General

In general, this book’s examples prefer to call back to superclass methods when needed by naming the superclass explicitly, because this technique is traditional in Python, because it works the same in both Python 2.X and 3.X, and because it sidesteps limitations and complexities related to this call in both 2.X and 3.X. As shown earlier, the traditional superclass method call scheme to augment a superclass method works as follows:

>>> class C:                    # In Python 2.X and 3.X

        def act(self):

            print('spam')

>>> class D(C):

        def act(self):

            C.act(self)         # Name superclass explicitly, pass self

            print('eggs')

>>> X = D()

>>> X.act()

spam

eggs

This form works the same in 2.X and 3.X, follows Python’s normal method call mapping model, applies to all inheritance tree forms, and does not lead to confusing behavior when operator overloading is used. To see why these distinctions matter, let’s see how super compares.

Basic super Usage and Its Tradeoffs

In this section, we’ll both introduce super in basic, single-inheritance mode, and look at its perceived downsides in this role. As we’ll find, in this context super does work as advertised, but is not much different from traditional calls, relies on unusual semantics, and is cumbersome to deploy in 2.X. More critically, as soon as your classes grow to use multiple inheritance, this super usage mode can both mask problems in your code and route calls in ways you may not expect.

Odd semantics: A magic proxy in Python 3.X

The super built-in actually has two intended roles. The more esoteric of these—cooperative multiple inheritance dispatch protocols in diamond multiple-inheritance trees (yes, a mouthful!)—relies on the 3.X MRO, was borrowed from the Dylan language, and will be covered later in this section.

The role we’re interested in here is more commonly used, and more frequently requested by people with Java backgrounds—to allow superclasses to be named generically in inheritance trees. This is intended to promote simpler code maintenance, and to avoid having to type long superclass reference paths in calls. In Python 3.X, this call seems at least at first glance to achieve this purpose well:

>>> class C:                    # In Python 3.X (only: see 2.X super form ahead)

        def act(self):

            print('spam')

>>> class D(C):

        def act(self):

            super().act()       # Reference superclass generically, omit self

            print('eggs')

>>> X = D()

>>> X.act()

spam

eggs

This works, and minimizes code changes—you don’t need to update the call if D’s superclass changes in the future. One of the biggest downsides of this call in 3.X, though, is its reliance on deep magic: though prone to change, it operates today by inspecting the call stack in order to automatically locate the self argument and find the superclass, and pairs the two in a special proxy object that routes the later call to the superclass version of the method. If that sounds complicated and strange, it’s because it is. In fact, this call form doesn’t work at all outside the context of a class’s method:

>>> super                       # A "magic" proxy object that routes later calls

<class 'super'>

>>> super()

SystemError: super(): no arguments

>>> class E(C):

        def method(self):       # self is implicit in super...only!

            proxy = super()     # This form has no meaning outside a method

            print(proxy)        # Show the normally hidden proxy object

            proxy.act()         # No arguments: implicitly calls superclass method!

>>> E().method()

<super: <class 'E'>, <E object>>

spam

Really, this call’s semantics resemble nothing else in Python: it’s neither a bound nor unbound method, and somehow finds a self even though you omit one in the call. In single inheritance trees, a superclass is available from self via the path self.__class__.__bases__[0], but the heavily implicit nature of this call makes this difficult to see, and even flies in the face of Python’s explicit self policy that holds true everywhere else. That is, this call violates a fundamental Python idiom for a single use case. It also soundly contradicts Python’s longstanding EIBTI (“explicit is better than implicit”) design rule (run an “import this” for more on this rule).
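To make the implicit pieces a bit more visible, the following sketch compares the zero-argument call with the explicit two-argument form that it abbreviates in 3.X. The __thisclass__ and __self__ attributes inspected here are present on super proxies in CPython, but they are rarely used in application code and appear only for illustration; the returned strings are stand-ins for the prints used above.

```python
class C:
    def act(self):
        return 'spam'

class D(C):
    def act(self):
        implicit = super()            # 3.X zero-argument form
        explicit = super(D, self)     # the same request, spelled out in full
        # Both proxies record the class to search after and the instance to bind
        assert implicit.__thisclass__ is explicit.__thisclass__ is D
        assert implicit.__self__ is explicit.__self__ is self
        return explicit.act() + ' eggs'

result = D().act()
print(result)                         # spam eggs
```

In other words, the class and instance are still passed to super in 3.X; they are simply filled in for you.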

Pitfall: Adding multiple inheritance naively

Besides its unusual semantics, even in 3.X this super role applies most directly to single inheritance trees, and can become problematic as soon as classes employ multiple inheritance with traditionally coded classes. This seems a major limitation of scope; due to the utility of mix-in classes in Python, multiple inheritance from disjoint and independent superclasses is probably more the norm than the exception in realistic code. The super call seems a recipe for disaster in classes coded to naively use its basic mode, without allowing for its much more subtle implications in multiple inheritance trees.

The following illustrates the trap. This code begins its life happily deploying super in single-inheritance mode to invoke a method one level up from C:

>>> class A:                      # In Python 3.X

        def act(self): print('A')

>>> class B:

        def act(self): print('B')

>>> class C(A):

        def act(self):

            super().act()         # super applied to a single-inheritance tree

>>> X = C()

>>> X.act()

A

If such classes later grow to use more than one superclass, though, super can become error-prone, and even unusable—it does not raise an exception for multiple inheritance trees, but will naively pick just the leftmost superclass having the method being run (technically, the first per the MRO), which may or may not be the one that you want:

>>> class C(A, B):                # Add a B mix-in class with the same method

        def act(self):

            super().act()         # Doesn't fail on multi-inher, but picks just one!

>>> X = C()

>>> X.act()

A

>>> class C(B, A):

        def act(self):

            super().act()         # If B is listed first, A.act() is no longer run!

>>> X = C()

>>> X.act()

B

Perhaps worse, this silently masks the fact that you should probably be selecting superclasses explicitly in this case, as we learned earlier in both this chapter and its predecessor. In other words, super usage may obscure a common source of errors in Python—one so common that it shows up again in this part’s “Gotchas.” If you may need to use direct calls later, why not use them earlier too?

>>> class C(A, B):                # Traditional form

        def act(self):            # You probably need to be more explicit here

            A.act(self)           # This form handles both single and multiple inher

            B.act(self)           # And works the same in both Python 3.X and 2.X

>>> X = C()                       # So why use the super() special case at all?

>>> X.act()

A

B

As we’ll see in a few moments, you might also be able to address such cases by deploying super calls in every class of the tree. But that’s also one of the biggest downsides of super—why code it in every class, when it’s usually not needed, and when using the preceding simpler traditional form in a single class will usually suffice? Especially in existing code—and new code that uses existing code—this super requirement seems harsh, if not unrealistic.

Much more subtly, as we’ll also see ahead, once you step up to multiple inheritance calls this way, the super calls in your code might not invoke the class you expect them to. They’ll be routed per the MRO order, which, depending on where else super might be used, may invoke a method in a class that is not the caller’s superclass at all—an implicit ordering that might make for interesting debugging sessions! Unless you completely understand what super means once multiple inheritance is introduced, you may be better off not deploying it in single-inheritance mode either.

This coding situation isn’t nearly as abstract as it may seem. Here’s a real-world example of such a case, taken from the PyMailGUI case study in Programming Python—the following very typical Python classes use multiple inheritance to mix in both application logic and window tools from independent, standalone classes, and hence must invoke both superclass constructors explicitly with direct calls by name. As coded, a super().__init__() here would run only one constructor, and adding super throughout this example’s disjoint class trees would be more work, would be no simpler, and wouldn’t make sense in tools meant for arbitrary deployment in clients that may use super or not:

class PyMailServerWindow(PyMailServer, windows.MainWindow):

    "a Tk, with extra protocol and mixed-in methods"

    def __init__(self):

        windows.MainWindow.__init__(self, appname, srvrname)

        PyMailServer.__init__(self)

class PyMailFileWindow(PyMailFile, windows.PopupWindow):

    "a Toplevel, with extra protocol and mixed-in methods"

    def __init__(self, filename):

        windows.PopupWindow.__init__(self, appname, filename)

        PyMailFile.__init__(self, filename)

The crucial point here is that using super for just the single inheritance cases where it applies most clearly is a potential source of error and confusion, and means that programmers must remember two ways to accomplish the same goal, when just one—explicit direct calls—could suffice for all cases.

In other words, unless you can be sure that you will never add a second superclass to a class in a tree over your software’s entire lifespan, you cannot use super in single-inheritance mode without understanding and allowing for its much more sophisticated role in multiple-inheritance trees. We’ll discuss the latter ahead, but it’s not optional if you deploy super at all.

From a more practical view, it’s also not clear that the trivial amount of code maintenance that this super role is envisioned to avoid fully justifies its presence. In Python practice, superclass names in headers are rarely changed; when they are, there are usually at most a very small number of superclass calls to update within the class. And consider this: if you add a new superclass in the future that doesn’t use super (as in the preceding example), you’ll have to either wrap it in an adaptor proxy or augment all the super calls in your class to use the traditional explicit-name call scheme anyhow—a maintenance task that seems just as likely, but perhaps more error-prone if you’ve grown to rely on super magic.

Limitation: Operator overloading

As briefly noted in Python’s library manual, super also doesn’t fully work in the presence of __X__ operator overloading methods. If you study the following code, you’ll see that direct named calls to overload methods in the superclass operate normally, but using the super result in an expression fails to dispatch to the superclass’s overload method:

>>> class C:                            # In Python 3.X

        def __getitem__(self, ix):      # Indexing overload method

            print('C index')

>>> class D(C):

        def __getitem__(self, ix):      # Redefine to extend here

            print('D index')

            C.__getitem__(self, ix)     # Traditional call form works

            super().__getitem__(ix)     # Direct name calls work too

            super()[ix]                 # But operators do not! (__getattribute__)

>>> X = C()

>>> X[99]

C index

>>> X = D()

>>> X[99]

D index

C index

C index

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

  File "<stdin>", line 6, in __getitem__

TypeError: 'super' object is not subscriptable

This behavior is due to the very same new-style (and 3.X) class change described earlier in this chapter (see Attribute Fetch for Built-ins Skips Instances)—because the proxy object returned by super uses __getattribute__ to catch and dispatch later method calls, it fails to intercept the automatic __X__ method invocations run by built-in operations including expressions, as these begin their search in the class instead of the instance. This may seem less severe than the multiple-inheritance limitation, but operators should generally work the same as the equivalent method call, especially for a built-in like this. Not supporting this adds another exception for super users to confront and remember.
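The difference between the two lookup paths can be seen in a short sketch. It assumes the same C and D shape as above, but returns values instead of printing so that the failure can be caught and examined; the names by_name and error are illustrative only.

```python
class C:
    def __getitem__(self, ix):
        return 'C index %s' % ix

class D(C):
    def __getitem__(self, ix):
        proxy = super()
        by_name = proxy.__getitem__(ix)     # explicit fetch: the proxy routes it to C
        try:
            proxy[ix]                       # implicit: looked up on the proxy's own type
        except TypeError as exc:
            error = str(exc)
        return by_name, error

by_name, error = D()[99]
print(by_name)                              # C index 99
print(error)                                # reports that super is not subscriptable
print('__getitem__' in vars(super))         # False: the proxy's class defines no such method
```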

Other languages’ mileage may vary, but in Python, self is explicit, multiple-inheritance mix-ins and operator overloading are common, and superclass name updates are rare. Because super adds an odd special case to the language—one with strange semantics, limited scope, rigid requirements, and questionable reward—most Python programmers may be better served by the more broadly applicable traditional call scheme. While super has some advanced applications too that we’ll study ahead, they may be too obscure to warrant making it a mandatory part of every Python programmer’s toolbox.

Use differs in Python 2.X: Verbose calls

If you are a Python 2.X user reading this dual-version book, you should also know that the super technique is not portable between Python lines. Its form differs between 2.X and 3.X—and not just between classic and new-style classes. It’s really a different tool in 2.X, which cannot run 3.X’s simpler form.

To make this call work in Python 2.X, you must first use new-style classes. Even then, you must also explicitly pass in the immediate class name and self to super, making this call so complex and verbose that in most cases it’s probably easier to avoid it completely, and simply name the superclass explicitly per the previous traditional code pattern (for brevity, I’ll leave it to readers to consider what changing a class’s own name means for code maintenance when using the 2.X super form!):

>>> class C(object):                # In Python 2.X: for new-style classes only

        def act(self):

            print('spam')

>>> class D(C):

        def act(self):

            super(D, self).act()    # 2.X: different call format - seems too complex

            print('eggs')           # "D" may be just as much to type/change as "C"!

>>> X = D()

>>> X.act()

spam

eggs

Although you can use the 2.X call form in 3.X for backward compatibility, it’s too cumbersome to deploy in 3.X-only code, and the more reasonable 3.X form is not usable in 2.X:

>>> class D(C):

        def act(self):

            super().act()           # Simpler 3.X call format fails in 2.X

            print('eggs')

>>> X = D()

>>> X.act()

TypeError: super() takes at least 1 argument (0 given)

On the other hand, the traditional call form with explicit class names works in 2.X in both classic and new-style classes, and exactly as it does in 3.X:

>>> class D(C):

        def act(self):

            C.act(self)             # But traditional pattern works portably

            print('eggs')           # And may often be simpler in 2.X code

>>> X = D()

>>> X.act()

spam

eggs

So why use a technique that works in only limited contexts instead of one that works in many more? Though its basis is complex, the next sections attempt to rally support for the super cause.

The super Upsides: Tree Changes and Dispatch

Having just shown you the downsides of super, I should also confess that I’ve been tempted to use this call in code that would only ever run on 3.X, and which used a very long superclass reference path through a module package (that is, mostly for laziness, but coding brevity can matter too). To be fair, super may still be useful in some use cases, the chief among which merit a brief introduction here:

•  Changing class trees at runtime: When a superclass may be changed at runtime, it’s not possible to hardcode its name in a call expression, but it is possible to dispatch calls via super.

On the other hand, this case is extremely rare in Python programming, and other techniques can often be used in this context as well.

•  Cooperative multiple inheritance method dispatch: When multiple inheritance trees must dispatch to the same-named method in multiple classes, super can provide a protocol for orderly call routing.

On the other hand, the class tree must rely upon the ordering of classes by the MRO (a complex tool in its own right that is artificial to the problem a program is meant to address) and must be coded or augmented to use super in each version of the method in the tree to be effective. Such dispatch can also often be implemented in other ways (e.g., via instance state).

As discussed earlier, super can also be used to select a superclass generically as long as the MRO’s default makes sense, though in traditional code naming a superclass explicitly is often preferable, and may even be required. Moreover, even valid super use cases tend to be uncommon in many Python programs—to the point of seeming academic curiosity to some. The two cases just listed, however, are most often cited as super rationales, so let’s take a quick look at each.

Runtime Class Changes and super

Superclasses that might be changed dynamically at runtime preclude hardcoding their names in a subclass’s methods, while super will happily look up the current superclass dynamically. Still, this case may be too rare in practice to warrant the super model by itself, and can often be implemented in other ways in the exceptional cases where it is needed. To illustrate, the following changes the superclass of C dynamically by changing the subclass’s __bases__ tuple in 3.X:

>>> class X:

        def m(self): print('X.m')

>>> class Y:

        def m(self): print('Y.m')

>>> class C(X):                                 # Start out inheriting from X

        def m(self): super().m()                # Can't hardcode class name here

>>> i = C()

>>> i.m()

X.m

>>> C.__bases__ = (Y,)                          # Change superclass at runtime!

>>> i.m()

Y.m

This works (and shares behavior-morphing goals with other deep magic, such as changing an instance’s __class__), but seems rare in the extreme. Moreover, there may be other ways to achieve the same effect—perhaps most simply, calling through the current superclass tuple’s value indirectly: special code to be sure, but only for a very special case (and perhaps not any more special than implicit routing by MROs):

>>> class C(X):

        def m(self): C.__bases__[0].m(self)     # Special code for a special case

>>> i = C()

>>> i.m()

X.m

>>> C.__bases__ = (Y,)                          # Same effect, without super()

>>> i.m()

Y.m

Given the preexisting alternatives, this case alone doesn’t seem to justify super, though in more complex trees, the next rationale—based on the tree’s MRO order instead of physical superclass links—may apply here as well.
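Part of why super tracks the change is that Python recomputes a class’s MRO when its __bases__ tuple is assigned. A brief check, reusing the X, Y, and C names from above with the methods returning strings rather than printing (the before, after, first, and second names are illustrative only):

```python
class X:
    def m(self): return 'X.m'

class Y:
    def m(self): return 'Y.m'

class C(X):
    def m(self): return super().m()

i = C()
before = [k.__name__ for k in C.__mro__]     # ['C', 'X', 'object']
first = i.m()                                # 'X.m'

C.__bases__ = (Y,)                           # reassign the superclass at runtime
after = [k.__name__ for k in C.__mro__]      # ['C', 'Y', 'object']
second = i.m()                               # 'Y.m'

print(before, first)
print(after, second)
```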

Cooperative Multiple Inheritance Method Dispatch

The second of the use cases listed earlier is the main rationale commonly given for super, and also borrows from other programming languages (most notably, Dylan), where its use case may be more common than it is in typical Python code. It generally applies to diamond pattern multiple inheritance trees, discussed earlier in this chapter, and allows for cooperative and conformant classes to route calls to a same-named method coherently among multiple class implementations. Especially for constructors, which have multiple implementations normally, this can simplify call routing protocol when used consistently.

In this mode, each super call selects the method from a next class following it in the MRO ordering of the class of the self subject of a method call. The MRO was introduced earlier; it’s the path Python follows for inheritance in new-style classes. Because the MRO’s linear ordering depends on which class self was made from, the order of method dispatch orchestrated by super can vary per class tree, and visits each class just once as long as all classes use super to dispatch.

Since every class participates in a diamond under object in 3.X (and 2.X new-style classes), the applications are broader than you might expect. In fact, some of the earlier examples that demonstrated super shortcomings in multiple inheritance trees could use this call to achieve their dispatch goals. To do so, however, super must be used universally in the class tree to ensure that method call chains are passed on—a fairly major requirement that may be difficult to enforce in much existing and new code.
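As a rough model of this selection rule, the function below walks the MRO of an instance’s class and returns the first class after a given one that defines the requested name. The helper next_in_mro is hypothetical and only approximates what super does internally (it ignores descriptors and method binding, for instance), but it shows why the result depends on the class of self:

```python
def next_in_mro(current, instance, name):
    """Hypothetical helper: find the class after `current`, in the MRO of
    the instance's class, whose own namespace defines `name`."""
    mro = type(instance).__mro__
    start = mro.index(current) + 1           # begin just past the calling class
    for klass in mro[start:]:
        if name in vars(klass):              # check each class's own attributes only
            return klass
    raise AttributeError(name)

class A:
    def __init__(self): pass

class B(A):
    def __init__(self): super().__init__()

class C(A):
    def __init__(self): super().__init__()

class D(B, C): pass

print(next_in_mro(B, B(), '__init__').__name__)    # A: next after B for a B instance
print(next_in_mro(B, D(), '__init__').__name__)    # C: next after B for a D instance
```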

The basics: Cooperative super call in action

Let’s take a look at what this role means in code. In this and the following sections, we’ll both learn how super works, and explore the tradeoffs it implies along the way. To get started, consider the following traditionally coded Python classes (condensed somewhat here as usual for space):

>>> class B:

        def __init__(self): print('B.__init__')      # Disjoint class tree branches

>>> class C:

        def __init__(self): print('C.__init__')

>>> class D(B, C): pass

>>> x = D()                                          # Runs leftmost only by default

B.__init__

In this case, superclass tree branches are disjoint (they don’t share a common explicit ancestor), so subclasses that combine them must call through each superclass by name—a common situation in much existing Python code that super cannot address directly without code changes:

>>> class D(B, C):

        def __init__(self):                          # Traditional form

            B.__init__(self)                         # Invoke supers by name

            C.__init__(self)

>>> x = D()

B.__init__

C.__init__

In diamond class tree patterns, though, explicit-name calls may by default trigger the top-level class’s method more than once, though this might be subverted with additional protocols (e.g., status markers in the instance):

>>> class A:

        def __init__(self): print('A.__init__')

>>> class B(A):

        def __init__(self): print('B.__init__'); A.__init__(self)

>>> class C(A):

        def __init__(self): print('C.__init__'); A.__init__(self)

>>> x = B()

B.__init__

A.__init__

>>> x = C()                                # Each super works by itself

C.__init__

A.__init__

>>> class D(B, C): pass                    # Still runs leftmost only

>>> x = D()

B.__init__

A.__init__

>>> class D(B, C):

        def __init__(self):                # Traditional form

            B.__init__(self)               # Invoke both supers by name

            C.__init__(self)

>>> x = D()                                # But this now invokes A twice!

B.__init__

A.__init__

C.__init__

A.__init__

By contrast, if all classes use super, or are appropriately coerced by proxies to behave as if they do, the method calls are dispatched according to class order in the MRO, such that the top-level class’s method is run just once:

>>> class A:

        def __init__(self): print('A.__init__')

>>> class B(A):

        def __init__(self): print('B.__init__'); super().__init__()

>>> class C(A):

        def __init__(self): print('C.__init__'); super().__init__()

>>> x = B()                   # Runs B.__init__, A is next super in self's B MRO

B.__init__

A.__init__

>>> x = C()

C.__init__

A.__init__

>>> class D(B, C): pass

>>> x = D()                   # Runs B.__init__, C is next super in self's D MRO!

B.__init__

C.__init__

A.__init__

The real magic behind this is the linear MRO list constructed for the class of self—because each class appears just once on this list, and because super dispatches to the next class on this list, it ensures an orderly invocation chain that visits each class just once. Crucially, the next class following B in the MRO differs depending on the class of self—it’s A for a B instance, but C for a D instance, accounting for the order of constructors run:

>>> B.__mro__

(<class '__main__.B'>, <class '__main__.A'>, <class 'object'>)

>>> D.__mro__

(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>,

<class '__main__.A'>, <class 'object'>)

The MRO and its algorithm were presented earlier in this chapter. By selecting a next class in the MRO sequence, a super call in a class’s method propagates the call through the tree, so long as all classes do the same. In this mode super does not necessarily choose a superclass at all; it picks the next in the linearized MRO, which might be a sibling—or even a lower relative—in the class tree of a given instance. See Tracing the MRO for other examples of the path super dispatch would follow, especially for nondiamonds.

The preceding works—and may even seem clever at first glance—but its scope may also appear limited to some. Most Python programs do not rely on the nuances of diamond pattern multiple inheritance trees (in fact, many Python programmers I’ve met do not know what the term means!). Moreover, super applies most directly to single inheritance and cooperative diamond cases, and may seem superfluous for disjoint nondiamond cases, where we might want to invoke superclass methods selectively or independently. Even cooperative diamonds can be managed in other ways that may afford programmers more control than an automatic MRO ordering can. To evaluate this tool objectively, though, we need to look deeper.
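As one illustration of such manual management, the status-marker idea mentioned earlier might be sketched as follows. The _a_inited attribute name is invented for this example, and a log list stands in for the prints used above:

```python
log = []

class A:
    def __init__(self):
        if getattr(self, '_a_inited', False):    # hypothetical marker: already run?
            return
        self._a_inited = True
        log.append('A.__init__')

class B(A):
    def __init__(self): log.append('B.__init__'); A.__init__(self)

class C(A):
    def __init__(self): log.append('C.__init__'); A.__init__(self)

class D(B, C):
    def __init__(self):
        B.__init__(self)                         # explicit-name calls, in any order
        C.__init__(self)

D()
print(log)       # ['B.__init__', 'A.__init__', 'C.__init__']
```

This keeps dispatch order under the subclass’s control, at the cost of an extra test in the shared superclass.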

Constraint: Call chain anchor requirement

The super call comes with complexities that may not be apparent on first encounter, and may even seem initially like features. For example, because all classes inherit from object in 3.X automatically (and explicitly in 2.X new-style classes), the MRO ordering can be used even in cases where the diamond is only implicit—in the following, triggering constructors in independent classes automatically:

>>> class B:

        def __init__(self): print('B.__init__'); super().__init__()

>>> class C:

        def __init__(self): print('C.__init__'); super().__init__()

>>> x = B()                   # object is an implied super at the end of MRO

B.__init__

>>> x = C()

C.__init__

>>> class D(B, C): pass       # Inherits B.__init__ but B's MRO differs for D

>>> x = D()                   # Runs B.__init__, C is next super in self's D MRO!

B.__init__

C.__init__

Technically, this dispatch model generally requires that the method being called by super must exist, and must have the same argument signature across the class tree, and every appearance of the method but the last must use super itself. This prior example works only because the implied object superclass at the end of the MRO of all three classes happens to have a compatible __init__ that satisfies these rules:

>>> B.__mro__

(<class '__main__.B'>, <class 'object'>)

>>> D.__mro__

(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class 'object'>)

Here, for a D instance, the next class in the MRO after B is C, which is followed by object whose __init__ silently accepts the call from C and ends the chain. Thus, B’s method calls C’s, which ends in object’s version, even though C is not a superclass to B.

Really, though, this example is atypical, and perhaps even lucky. In most cases, no such suitable default will exist in object, and it may be less trivial to satisfy this model’s expectations. Most trees will require an explicit (and possibly extra) superclass to serve the anchoring role that object does here, to accept but not forward the call. Other trees may require careful design to adhere to this requirement. Moreover, unless Python optimizes it away, the call to object (or other anchor) defaults at the end of the chain may also add extra performance costs.
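To see why an anchor is needed, consider a method name that object does not define. In this sketch, the Root and Loner classes and the setup method name are hypothetical; Root accepts the call and deliberately does not pass it on, while a class with no such anchor fails at the end of its chain:

```python
log = []

class Root:                                  # hypothetical anchor class
    def setup(self):
        log.append('Root.setup')             # accept the call, but do not forward it

class B(Root):
    def setup(self): log.append('B.setup'); super().setup()

class C(Root):
    def setup(self): log.append('C.setup'); super().setup()

class D(B, C): pass

D().setup()
print(log)                                   # ['B.setup', 'C.setup', 'Root.setup']

class Loner:                                 # no anchor: object is next on the MRO
    def setup(self): super().setup()

try:
    Loner().setup()
    failed = False
except AttributeError:                       # object has no setup method to receive the call
    failed = True
print(failed)                                # True
```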

By contrast, in such cases direct calls incur neither extra coding requirements nor added performance cost, and make dispatch more explicit and direct:

>>> class B:

        def __init__(self): print('B.__init__')

>>> class C:

        def __init__(self): print('C.__init__')

>>> class D(B, C):

        def __init__(self): B.__init__(self); C.__init__(self)

>>> x = D()

B.__init__

C.__init__

Scope: An all-or-nothing model

Also keep in mind that traditional classes that were not written to use super in this role cannot be directly used in such cooperative dispatch trees, as they will not forward calls along the MRO chain. It’s possible to incorporate such classes with proxies that wrap the original object and add the requisite super calls, but this imposes both additional coding requirements and performance costs on the model. Given that there are many millions of lines of existing Python code that do not use super, this seems a major detriment.

Watch what happens, for example, if any one class fails to pass along the call chain by omitting a super, ending the call chain prematurely—like __slots__, super is generally an all-or-nothing feature:

>>> class B:

        def __init__(self): print('B.__init__'); super().__init__()

>>> class C:

        def __init__(self): print('C.__init__'); super().__init__()

>>> class D(B, C):

        def __init__(self): print('D.__init__'); super().__init__()

>>> X = D()

D.__init__

B.__init__

C.__init__

>>> D.__mro__

(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class 'object'>)

# What if you must use a class that doesn't call super?

>>> class B:

        def __init__(self): print('B.__init__')

>>> class D(B, C):

        def __init__(self): print('D.__init__'); super().__init__()

>>> X = D()

D.__init__

B.__init__             # It's an all-or-nothing tool...

Satisfying this mandatory propagation requirement may be no simpler than direct by-name calls—which you might still forget, but which you won’t need to require of all the code your classes employ. As mentioned, it’s possible to adapt a class like B by inheriting from a proxy class that embeds B instances, but that seems artificial to program goals, adds an extra call to each wrapped method, is subject to the new-style class problems we met earlier regarding interface proxies and built-ins, and seems an extraordinary and even stunning added coding requirement inherent in a model intended to simplify code.
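A minimal sketch of such an adapter follows. The BAdapter name and its _wrapped attribute are hypothetical, and a complete proxy would also need to delegate attribute access to the embedded object, which is omitted here:

```python
log = []

class B:                                     # existing class: does not call super
    def __init__(self): log.append('B.__init__')

class C:
    def __init__(self): log.append('C.__init__'); super().__init__()

class BAdapter:                              # hypothetical proxy standing in for B
    def __init__(self):
        self._wrapped = B()                  # embed a B, running its constructor
        super().__init__()                   # then pass the call along the MRO

class D(BAdapter, C):
    def __init__(self): log.append('D.__init__'); super().__init__()

D()
print(log)       # ['D.__init__', 'B.__init__', 'C.__init__']
```

Note that B’s state lives in the embedded instance here rather than in the D instance itself, one of the ways this approach differs from ordinary inheritance.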

Flexibility: Call ordering assumptions

Routing with super also assumes that you really mean to pass method calls throughout all your classes per the MRO, which may or may not match your call ordering requirements. For example, imagine that—irrespective of other inheritance ordering needs—the following requires that the class C’s version of a given method be run before B’s in some contexts. If the MRO says otherwise, you’re back to traditional calls, which may conflict with super usage—in the following, invoking C’s method twice:

# What if method call ordering needs differ from the MRO?

>>> class B:

        def __init__(self): print('B.__init__'); super().__init__()

>>> class C:

        def __init__(self): print('C.__init__'); super().__init__()

>>> class D(B, C):

        def __init__(self): print('D.__init__'); C.__init__(self); B.__init__(self)

>>> X = D()

D.__init__

C.__init__

B.__init__

C.__init__             # It's the MRO xor explicit calls...

Similarly, if you want some methods to not run at all, the super automatic path won’t apply as directly as explicit calls may, and will make it difficult to take more explicit control of the dispatch process. In realistic programs with many methods, resources, and state variables, these seem entirely plausible scenarios. While you could reorder superclasses in D for this method, that may break other expectations.

Customization: Method replacement

On a related note, the universal deployment expectations of super may make it difficult for a single class to replace (override) an inherited method altogether. Not passing the call higher with super—intentionally in this case—works fine for the class itself, but may break the call chain of trees it’s mixed into, thereby preventing methods elsewhere in the tree from running. Consider the following tree:

>>> class A:

        def method(self): print('A.method'); super().method()

>>> class B(A):

        def method(self): print('B.method'); super().method()

>>> class C:

        def method(self): print('C.method')       # No super: must anchor the chain!

>>> class D(B, C):

        def method(self): print('D.method'); super().method()

>>> X = D()

>>> X.method()

D.method

B.method

A.method               # Dispatch to all per the MRO automatically

C.method

Method replacement here breaks the super model, and probably leads us back to the traditional form:

# What if a class needs to replace a super's default entirely?

>>> class B(A):

        def method(self): print('B.method')       # Drop super to replace A's method

>>> class D(B, C):

        def method(self): print('D.method'); super().method()

>>> X = D()

>>> X.method()

D.method

B.method               #  But replacement also breaks the call chain...

>>> class D(B, C):

        def method(self): print('D.method'); B.method(self); C.method(self)

>>> D().method()

D.method

B.method

C.method               # It's back to explicit calls...

Once again, the problem with assumptions is that they assume things! Although the assumption of universal routing might be reasonable for constructors, it would also seem to conflict with one of the core tenets of OOP: unrestricted subclass customization. This might suggest restricting super usage to constructors, but even these might sometimes warrant replacement, and this adds an odd special-case requirement for one specific context. A tool that can be used only for certain categories of methods might be seen by some as redundant, and even spurious, given the extra complexity it implies.

Coupling: Application to mix-in classes

Subtly, when we say super selects the next class in the MRO, we really mean the next class in the MRO that implements the requested method—it technically skips ahead until it finds a class with the requested name. This matters for independent mix-in classes, which might be added to arbitrary client trees. Without this skipping-ahead behavior, such mix-ins wouldn’t work at all—they would otherwise drop the call chain of their clients’ arbitrary methods, and couldn’t rely on their own super calls to work as expected.

In the following independent branches, for example, C’s call to method is passed on, even though Mixin, the next class in the C instance’s MRO, doesn’t define that method’s name. As long as method name sets are disjoint, this just works—the call chains of each branch can exist independently:

# Mix-ins work for disjoint method sets

>>> class A:

        def other(self): print('A.other')

>>> class Mixin(A):

        def other(self): print('Mixin.other'); super().other()

>>> class B:

        def method(self): print('B.method')

>>> class C(Mixin, B):

        def method(self): print('C.method'); super().other(); super().method()

>>> C().method()

C.method

Mixin.other

A.other

B.method

>>> C.__mro__

(<class '__main__.C'>, <class '__main__.Mixin'>, <class '__main__.A'>,

<class '__main__.B'>, <class 'object'>)

Similarly, mixing the other way doesn’t break call chains of the mix-in either. For instance, in the following, even though B doesn’t define other when it is called in C, classes later in the MRO do. In fact, the call chains work even if one of the branches doesn’t use super at all: as long as a method is defined somewhere ahead on the MRO, its call works:

>>> class C(B, Mixin):

        def method(self): print('C.method'); super().other(); super().method()

>>> C().method()

C.method

Mixin.other

A.other

B.method

>>> C.__mro__

(<class '__main__.C'>, <class '__main__.B'>, <class '__main__.Mixin'>,

<class '__main__.A'>, <class 'object'>)

This is also true in the presence of diamonds—disjoint method sets are dispatched as expected, even if not implemented by each disjoint branch, because we select the next on the MRO with the method. Really, because the MRO contains the same classes in these cases, and because a subclass always appears before its superclass in the MRO, they are equivalent contexts. For example, the call in Mixin to other in the following still finds it in A, even though the next class after Mixin on the MRO is B (the call to method in C works again for similar reasons):

# Explicit diamonds work too

>>> class A:

        def other(self): print('A.other')

>>> class Mixin(A):

        def other(self): print('Mixin.other'); super().other()

>>> class B(A):

        def method(self): print('B.method')

>>> class C(Mixin, B):

        def method(self): print('C.method'); super().other(); super().method()

>>> C().method()

C.method

Mixin.other

A.other

B.method

>>> C.__mro__

(<class '__main__.C'>, <class '__main__.Mixin'>, <class '__main__.B'>,

<class '__main__.A'>, <class 'object'>)

# Other mix-in orderings work too

>>> class C(B, Mixin):

        def method(self): print('C.method'); super().other(); super().method()

>>> C().method()

C.method

Mixin.other

A.other

B.method

>>> C.__mro__

(<class '__main__.C'>, <class '__main__.B'>, <class '__main__.Mixin'>,

<class '__main__.A'>, <class 'object'>)

Still, this has an effect that is no different—but may seem wildly more implicit—than direct by-name calls, which also work the same in this case regardless of superclass ordering, and whether there is a diamond or not. In this case, the motivation for relying on MRO ordering seems on shaky ground, if the traditional form is both simpler and more explicit, and offers more control and flexibility:

# But direct calls work here too: explicit is better than implicit

>>> class C(Mixin, B):

        def method(self): print('C.method'); Mixin.other(self); B.method(self)

>>> X = C()

>>> X.method()

C.method

Mixin.other

A.other

B.method

More crucially, this example so far assumes that method names are disjoint in its branches; the dispatch order for same-named methods in diamonds like this may be much less fortuitous. In a diamond like the preceding, for example, it’s not impossible that a client class could invalidate a super call’s intent: the call to method in Mixin in the following works to run A’s version as expected, unless it’s mixed into a tree that drops the call chain:

# But for nondisjoint methods: super creates overly strong coupling

>>> class A:

        def method(self): print('A.method')

>>> class Mixin(A):

        def method(self): print('Mixin.method'); super().method()

>>> Mixin().method()

Mixin.method

A.method

>>> class B(A):

        def method(self): print('B.method')      # super here would invoke A after B

>>> class C(Mixin, B):

        def method(self): print('C.method'); super().method()

>>> C().method()

C.method

Mixin.method

B.method                                         # We miss A in this context only!

It may be that B shouldn’t redefine this method anyhow (and frankly, we may be encroaching on problems inherent in multiple inheritance in general), but this need not also break the mix-in—direct calls give you more control in such cases, and allow mix-in classes to be much more independent of usage contexts:

# And direct calls do not: they are immune to context of use

>>> class A:

        def method(self): print('A.method')

>>> class Mixin(A):

        def method(self): print('Mixin.method'); A.method(self)       # C irrelevant

>>> class C(Mixin, B):

        def method(self): print('C.method'); Mixin.method(self)

>>> C().method()

C.method

Mixin.method

A.method

More to the point, by making mix-ins more self-contained, direct calls minimize component coupling that always skews program complexity higher—a fundamental software principle that seems neglected by super’s variable and context-specific dispatch model.

Customization: Same-argument constraints

As a final note, you should also consider the consequences of using super when method arguments differ per class—because a class coder can’t be sure which version of a method super might invoke (indeed, this may vary per tree!), every version of the method must generally accept the same arguments list, or choose its inputs with analysis of generic argument lists—either of which imposes additional requirements on your code. In realistic programs, this constraint may in fact be a true showstopper for many potential super applications, precluding its use entirely.

To illustrate why this can matter, recall the pizza shop employee classes we wrote in Chapter 31. As coded there, both subclasses use direct by-name calls to invoke the superclass constructor, filling in an expected salary argument automatically—the logic being that the subclass implies the pay grade:

>>> class Employee:

        def __init__(self, name, salary):                  # Common superclass

            self.name = name

            self.salary = salary

>>> class Chef1(Employee):

        def __init__(self, name):                          # Differing arguments

            Employee.__init__(self, name, 50000)           # Dispatch by direct call

>>> class Server1(Employee):

        def __init__(self, name):

            Employee.__init__(self, name, 40000)

>>> bob = Chef1('Bob')

>>> sue = Server1('Sue')

>>> bob.salary, sue.salary

(50000, 40000)

This works, but since this is a single-inheritance tree, we might be tempted to deploy super here to route the constructor calls generically. Doing so works for either subclass in isolation, since its MRO includes just itself and its actual superclass:

>>> class Chef2(Employee):

        def __init__(self, name):

            super().__init__(name, 50000)                  # Dispatch by super()

>>> class Server2(Employee):

        def __init__(self, name):

            super().__init__(name, 40000)

>>> bob = Chef2('Bob')

>>> sue = Server2('Sue')

>>> bob.salary, sue.salary

(50000, 40000)

Watch what happens, though, when an employee is a member of both categories. Because the constructors in the tree have differing argument lists, we’re in trouble:

>>> class TwoJobs(Chef2, Server2): pass

>>> tom = TwoJobs('Tom')

TypeError: __init__() takes 2 positional arguments but 3 were given

The problem here is that the super call in Chef2 no longer invokes its Employee superclass, but instead invokes its sibling class and follower on the MRO, Server2. Since this sibling has a differing argument list than the true superclass—expecting just self and name—the code breaks. This is inherent in super use: because the MRO can differ per tree, it might call different versions of a method in different trees—even some you may not be able to anticipate when coding a class by itself:

>>> TwoJobs.__mro__

(<class '__main__.TwoJobs'>, <class '__main__.Chef2'>, <class '__main__.Server2'>,

<class '__main__.Employee'>, <class 'object'>)

>>> Chef2.__mro__

(<class '__main__.Chef2'>, <class '__main__.Employee'>, <class 'object'>)

By contrast, the direct by-name call scheme still works when the classes are mixed, though the results are a bit dubious—the combined category gets the pay of the leftmost superclass:

>>> class TwoJobs(Chef1, Server1): pass

>>> tom = TwoJobs('Tom')

>>> tom.salary

50000

Really, we probably want to route the call to the top-level class in this event with a new salary—a model that is possible with direct calls but not with super alone. Moreover, calling Employee directly in this one class means our code uses two dispatch techniques when just one—direct calls—would suffice:

>>> class TwoJobs(Chef1, Server1):

        def __init__(self, name): Employee.__init__(self, name, 70000)

>>> tom = TwoJobs('Tom')

>>> tom.salary

70000

>>> class TwoJobs(Chef2, Server2):

        def __init__(self, name): super().__init__(name, 70000)

>>> tom = TwoJobs('Tom')

TypeError: __init__() takes 2 positional arguments but 3 were given

This example may warrant redesign in general—splitting off shareable parts of Chef and Server to mix-in classes without a constructor, for example. It’s also true that polymorphism in general assumes that the methods in an object’s external interface have the same argument signature, though this doesn’t quite apply to customization of superclass methods—an internal implementation technique that should by nature support variation, especially in constructors.
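One way such a redesign might look is sketched below. It assumes that the job-specific behavior can live in constructor-free mix-ins; the CookingSkills and ServingSkills names, their methods, and the 70000 salary for the combined role are hypothetical and used for illustration only. Because only Employee has a constructor to reach, each concrete class can call it directly with whatever arguments it likes:

```python
class Employee:
    def __init__(self, name, salary):          # The only constructor to reach
        self.name = name
        self.salary = salary

# Hypothetical mix-ins: behavior only, no constructors to conflict
class CookingSkills:
    def work(self):
        return self.name + ' makes food'

class ServingSkills:
    def serve(self):
        return self.name + ' interfaces with customer'

class Chef(CookingSkills, Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 50000)   # Direct call, own argument list

class Server(ServingSkills, Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 40000)

class TwoJobs(CookingSkills, ServingSkills, Employee):
    def __init__(self, name):
        Employee.__init__(self, name, 70000)   # Combined role sets its own pay

tom = TwoJobs('Tom')
print(tom.salary, tom.work(), tom.serve())
```

This is just one possible factoring; whether it suits a real program depends on how much state the mixed-in roles need to own.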

But the crucial point here is that because direct calls do not make code dependent on a magic ordering that can vary per tree, they more directly support argument list flexibility. More broadly, the questionable (or weak) performances super turns in on method replacement, mix-in coupling, call ordering, and argument constraints should make you evaluate its deployment carefully. Even in single-inheritance mode, its potential for later impacts as trees grow is considerable.

In sum, the three requirements of super in this role are also the source of most of its usability issues:

§  The method called by super must exist—which requires extra code if no anchor is present.

§  The method called by super must have the same argument signature across the class tree—which impairs flexibility, especially for implementation-level methods like constructors.

§  Every appearance of the method called by super but the last must use super itself—which makes it difficult to use existing code, change call ordering, override methods, and code self-contained classes.

Taken together, these seem to make for a tool with both substantial complexity and significant tradeoffs—downsides that will assert themselves the moment the code grows to incorporate multiple inheritance.
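To make the three requirements concrete, here is a minimal sketch of a cooperative tree that meets them all, followed by a variant in which a single class opts out. The class names and the log argument are hypothetical, chosen only to record the order of calls:

```python
class Anchor:
    def method(self, log):            # Requirement 1: the chain needs an end point
        log.append('Anchor')

class A(Anchor):
    def method(self, log):            # Requirement 2: same signature in every class
        log.append('A')
        super().method(log)           # Requirement 3: every link passes the call on

class B(Anchor):
    def method(self, log):
        log.append('B')
        super().method(log)

class C(A, B):
    def method(self, log):
        log.append('C')
        super().method(log)

calls = []
C().method(calls)
print(calls)                          # ['C', 'A', 'B', 'Anchor']

class BDrops(Anchor):
    def method(self, log):
        log.append('BDrops')          # No super call: the chain stops here

class C2(A, BDrops):
    pass

dropped = []
C2().method(dropped)
print(dropped)                        # ['A', 'BDrops']: Anchor never runs
```

In the second tree, A’s code is unchanged, yet its super call no longer reaches Anchor, because a class that A knows nothing about sits between them on the MRO.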

Naturally, there may be creative workarounds for the super dilemmas just posed, but additional coding steps would further dilute the call’s benefits—and we’ve run out of space here in any event. There are also alternative non-super solutions to some diamond method dispatch problems, but these will have to be left as a user exercise for space reasons too. In general, when superclass methods are called by explicit name, root classes of diamonds might check state in instances to avoid firing twice—a similarly complex coding pattern, but required rarely in most code, and which to some may seem no more difficult than using super itself.
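As a rough illustration of the state-check idea, the following sketch uses direct by-name calls throughout and lets the root of a diamond guard itself with a flag stored on the instance. The class names, the setup method, and the _root_done flag are all hypothetical; this is one possible coding, not a prescribed pattern:

```python
class Root:
    def setup(self):
        # Guard flag on the instance: run the root's logic just once
        if getattr(self, '_root_done', False):
            return
        self._root_done = True
        self.log = ['Root.setup']

class Left(Root):
    def setup(self):
        Root.setup(self)              # Direct by-name call
        self.log.append('Left.setup')

class Right(Root):
    def setup(self):
        Root.setup(self)              # Would run Root twice without the guard
        self.log.append('Right.setup')

class Bottom(Left, Right):
    def setup(self):
        Left.setup(self)              # The caller picks the order explicitly
        Right.setup(self)
        self.log.append('Bottom.setup')

obj = Bottom()
obj.setup()
print(obj.log)    # ['Root.setup', 'Left.setup', 'Right.setup', 'Bottom.setup']
```

Note that this simple flag is never reset, so a second setup call on the same instance would skip the root’s logic; a fuller version would have to decide when the guard should be cleared.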

The super Summary

So there it is—the bad and the good. As with all Python extensions, you should be the judge on this one too. I’ve tried to give both sides of the debate a fair shake here to help you decide. But because the super call:

§  Differs in form between 2.X and 3.X

§  In 3.X, relies on arguably non-Pythonic magic, and does not fully apply to operator overloading or traditionally coded multiple-inheritance trees

§  In 2.X, seems so verbose in this intended role that it may make code more complex instead of less

§  Claims code maintenance benefits that may be more hypothetical than real in Python practice

even ex–Java programmers should also consider this book’s preferred traditional technique of explicit-name superclass calls to be at least as valid a solution as Python’s super—a call that on some levels seems an unusual and limited answer to a question that was not being asked by most Python programmers, and was not deemed important for much of Python’s history.

At the same time, the super call offers one solution to the difficult problem of same-named method dispatch in multiple inheritance trees, for programs that choose to use it universally and consistently. But therein lies one of its largest obstacles: it requires universal deployment to address a problem most programmers probably do not have. Moreover, at this point in Python’s history, asking programmers to change their existing code to use this call widely enough to make it reliable seems highly unrealistic.

Perhaps the chief problem of this role, though, is the role itself—same-named method dispatch in multiple inheritance trees is relatively rare in real Python programs, and obscure enough to have generated both much controversy and much misunderstanding surrounding this role. People don’t use Python the same way they use C++, Java, or Dylan, and lessons from other such languages do not necessarily apply.

Also keep in mind that using super makes your program’s behavior dependent on the MRO algorithm: a procedure that we’ve covered only informally here due to its complexity, that is artificial to your program’s purpose, and that seems tersely documented and understood in the Python world. As we’ve seen, even if you understand the MRO, its implications on customization, coupling, and flexibility are remarkably subtle. If you don’t completely understand this algorithm, or have goals that its application does not address, you may be better served not relying on it to implicitly trigger actions in your code.

Or, to quote a Python motto from its import this creed:

If the implementation is hard to explain, it’s a bad idea.

The super call seems firmly in this category. Most programmers won’t use an arcane tool aimed at a rare use case, no matter how clever it may be. This is especially true in a scripting language that bills itself as friendly to nonspecialists. Regrettably, use by any programmer can impose such a tool on others anyhow—the real reason I’ve covered it here, and a theme we’ll revisit at the end of this book.

As usual, time and user base will tell if this call’s tradeoffs or momentum lead to broader adoption or not. At the least, it behooves you to also know about the traditional explicit-name superclass call technique, as it is still commonly used and often either simpler or required in today’s real-world Python programming. If you do choose to use this tool, my own advice to readers is to remember that using super:

§  In single-inheritance mode can mask later problems and lead to unexpected behavior as trees grow

§  In multiple-inheritance mode brings with it substantial complexity for an atypical Python use case

For other opinions on Python’s super that go into further details both good and bad, search the Web for related articles. You can find plenty of additional positions, though in the end, Python’s future relies as much on yours as any other.


[65] Both are opinion pieces in part, but are suggested reading. The first was eventually retitled “Python’s Super is nifty, but you can’t use it,” and is today at https://fuhm.net/super-harmful. Oddly, and despite its subjective tone, the second article (“Python’s super() considered super!”) alone somehow found its way into Python’s official library manual; see its link in the manual’s super section... and consider demanding that differing opinions be represented more evenly in your tools’ documentation, or omitted altogether. Python’s manuals are not the place for personal opinion and one-sided propaganda!

Class Gotchas

We’ve reached the end of the primary OOP coverage in this book. After exceptions, we’ll explore additional class-related examples and topics in the last part of the book, but that part mostly just gives expanded coverage to concepts introduced here. As usual, let’s wrap up this part with the standard warnings about pitfalls to avoid.

Most class issues can be boiled down to namespace issues—which makes sense, given that classes are just namespaces with a handful of extra tricks. Some of the items in this section are more like class usage pointers than problems, but even experienced class coders have been known to stumble on a few.

Changing Class Attributes Can Have Side Effects

Theoretically speaking, classes (and class instances) are mutable objects. As with built-in lists and dictionaries, you can change them in place by assigning to their attributes—and as with lists and dictionaries, this means that changing a class or instance object may impact multiple references to it.

That’s usually what we want, and is how objects change their state in general, but awareness of this issue becomes especially critical when changing class attributes. Because all instances generated from a class share the class’s namespace, any changes at the class level are reflected in all instances, unless they have their own versions of the changed class attributes.

Because classes, modules, and instances are all just objects with attribute namespaces, you can normally change their attributes at runtime by assignments. Consider the following class. Inside the class body, the assignment to the name a generates an attribute X.a, which lives in the class object at runtime and will be inherited by all of X’s instances:

>>> class X:

        a = 1       # Class attribute

>>> I = X()

>>> I.a             # Inherited by instance

1

>>> X.a

1

So far, so good—this is the normal case. But notice what happens when we change the class attribute dynamically outside the class statement: it also changes the attribute in every object that inherits from the class. Moreover, new instances created from the class during this session or program run also get the dynamically set value, regardless of what the class’s source code says:

>>> X.a = 2         # May change more than X

>>> I.a             # I changes too

2

>>> J = X()         # J inherits from X's runtime values

>>> J.a             # (but assigning to J.a changes a in J, not X or I)

2
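The parenthetical note in the last comment can be checked with a short script. This is a minimal sketch using the same class; the value 99 is arbitrary:

```python
class X:
    a = 1                     # Class attribute

I = X()
J = X()
J.a = 99                      # Creates a in J's own namespace only
X.a = 2                       # Still changes what I inherits

print(I.a, J.a, X.a)                           # 2 99 2
print('a' in I.__dict__, 'a' in J.__dict__)    # False True
```

Once J has its own a, later changes to X.a no longer show through J, because the instance’s namespace is searched before the class.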

Is this a useful feature or a dangerous trap? You be the judge. As we learned in Chapter 27, you can actually get work done by changing class attributes without ever making a single instance—a technique that can simulate the use of “records” or “structs” in other languages. As a refresher, consider the following unusual but legal Python program:

class X: pass                       # Make a few attribute namespaces

class Y: pass

X.a = 1                             # Use class attributes as variables

X.b = 2                             # No instances anywhere to be found

X.c = 3

Y.a = X.a + X.b + X.c

for X.i in range(Y.a): print(X.i)   # Prints 0..5

Here, the classes X and Y work like “fileless” modules—namespaces for storing variables we don’t want to clash. This is a perfectly legal Python programming trick, but it’s less appropriate when applied to classes written by others; you can’t always be sure that class attributes you change aren’t critical to the class’s internal behavior. If you’re out to simulate a C struct, you may be better off changing instances than classes, as that way only one object is affected:

class Record: pass

X = Record()

X.name = 'bob'

X.job  = 'Pizza maker'

Changing Mutable Class Attributes Can Have Side Effects, Too

This gotcha is really an extension of the prior. Because class attributes are shared by all instances, if a class attribute references a mutable object, changing that object in place from any instance impacts all instances at once:

>>> class C:

        shared = []                 # Class attribute

        def __init__(self):

            self.perobj = []        # Instance attribute

>>> x = C()                         # Two instances

>>> y = C()                         # Implicitly share class attrs

>>> y.shared, y.perobj

([], [])

>>> x.shared.append('spam')         # Impacts y's view too!

>>> x.perobj.append('spam')         # Impacts x's data only

>>> x.shared, x.perobj

(['spam'], ['spam'])

>>> y.shared, y.perobj              # y sees change made through x

(['spam'], [])

>>> C.shared                        # Stored on class and shared

['spam']

This effect is no different than many we’ve seen in this book already: mutable objects are shared by simple variables, globals are shared by functions, module-level objects are shared by multiple importers, and mutable function arguments are shared by the caller and the callee. All of these are cases of general behavior—multiple references to a mutable object—and all are impacted if the shared object is changed in place from any reference. Here, this occurs in class attributes shared by all instances via inheritance, but it’s the same phenomenon at work. It may be made more subtle by the different behavior of assignments to instance attributes themselves:

x.shared.append('spam')    # Changes shared object attached to class in place

x.shared = 'spam'          # Changes or creates instance attribute attached to x

But again, this is not a problem, it’s just something to be aware of; shared mutable class attributes can have many valid uses in Python programs.
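As one example of a valid use, a class might deliberately keep a shared list of every instance it has made. The following sketch assumes a hypothetical Tracked class, and also shows how assigning to the name through an instance differs from changing the shared object in place:

```python
class Tracked:
    instances = []                        # One list, shared on purpose

    def __init__(self, name):
        self.name = name
        Tracked.instances.append(self)    # In-place change: seen everywhere

a = Tracked('a')
b = Tracked('b')
print([obj.name for obj in b.instances])  # ['a', 'b']

a.instances = []                          # Assignment: new attribute on a only
print(len(a.instances), len(b.instances), len(Tracked.instances))   # 0 2 2
```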

Multiple Inheritance: Order Matters

This may be obvious by now, but it’s worth underscoring: if you use multiple inheritance, the order in which superclasses are listed in the class statement header can be critical. Python always searches superclasses from left to right, according to their order in the header line.

For instance, in the multiple inheritance example we studied in Chapter 31, suppose that the Super class implemented a __str__ method, too:

class ListTree:

    def __str__(self): ...

class Super:

    def __str__(self): ...

class Sub(ListTree, Super): pass   # Get ListTree's __str__ by listing it first

x = Sub()                      # Inheritance searches ListTree before Super

Which class would we inherit it from: ListTree or Super? As inheritance searches proceed from left to right, we would get the method from whichever class is listed first (leftmost) in Sub’s class header. Presumably, we would list ListTree first because its whole purpose is its custom __str__ (indeed, we had to do this in Chapter 31 when mixing this class with a tkinter.Button that had a __str__ of its own).

But now suppose Super and ListTree have their own versions of other same-named attributes, too. If we want one name from Super and another from ListTree, the order in which we list them in the class header won’t help—we will have to override inheritance by manually assigning to the attribute name in the Sub class:

class ListTree:

    def __str__(self): ...

    def other(self): ...

class Super:

    def __str__(self): ...

    def other(self): ...

class Sub(ListTree, Super):    # Get ListTree's __str__ by listing it first

    other = Super.other        # But explicitly pick Super's version of other

    def __init__(self):

        ...

x = Sub()                      # Inheritance searches Sub before ListTree/Super

Here, the assignment to other within the Sub class creates Sub.other, a reference back to the Super.other object. Because it is lower in the tree, Sub.other effectively hides ListTree.other, the attribute that the inheritance search would normally find. Similarly, if we listed Super first in the class header to pick up its other, we would need to select ListTree’s method explicitly:

class Sub(Super, ListTree):               # Get Super's other by order

    __str__ = ListTree.__str__            # Explicitly pick ListTree.__str__
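Both orderings can be tried in a runnable form. In this sketch the method bodies are stand-ins that simply return their own names, and Sub1 and Sub2 are hypothetical names for the two variants:

```python
class ListTree:
    def __str__(self): return 'ListTree.__str__'
    def other(self): return 'ListTree.other'

class Super:
    def __str__(self): return 'Super.__str__'
    def other(self): return 'Super.other'

class Sub1(ListTree, Super):          # __str__ by order
    other = Super.other               # other by explicit assignment

class Sub2(Super, ListTree):          # other by order
    __str__ = ListTree.__str__        # __str__ by explicit assignment

for cls in (Sub1, Sub2):
    x = cls()
    print(str(x), x.other())          # ListTree.__str__ Super.other (both times)
```

Either way, the subclass ends up with the same pair of methods; the assignment simply overrides whatever the left-to-right search would otherwise have found.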

Multiple inheritance is an advanced tool. Even if you understood the last paragraph, it’s still a good idea to use it sparingly and carefully. Otherwise, the meaning of a name may come to depend on the order in which classes are mixed in an arbitrarily far-removed subclass. (For another example of the technique shown here in action, see the discussion of explicit conflict resolution in “The ‘New-Style’ Class Model”, as well as the earlier super coverage.)

As a rule of thumb, multiple inheritance works best when your mix-in classes are as self-contained as possible—because they may be used in a variety of contexts, they should not make assumptions about names related to other classes in a tree. The pseudoprivate __X attributes feature we studied in Chapter 31 can help by localizing names that a class relies on owning and limiting the names that your mix-in classes add to the mix. In this example, for instance, if ListTree only means to export its custom __str__, it can name its other method __other to avoid clashing with like-named classes in the tree.
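A minimal sketch of that renaming follows. The method bodies are placeholders; the point is only that ListTree’s helper is mangled to include its class name, so it cannot hide Super’s other, regardless of the order of mixing:

```python
class ListTree:
    def __str__(self):
        return self.__other()         # Mangled to self._ListTree__other
    def __other(self):                # Private helper: not exported as "other"
        return 'ListTree display'

class Super:
    def other(self):
        return 'Super.other'

class Sub(ListTree, Super):           # ListTree first, for its __str__
    pass

x = Sub()
print(str(x))                         # ListTree display
print(x.other())                      # Super.other
```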

Scopes in Methods and Classes

When working out the meaning of names in class-based code, it helps to remember that classes introduce local scopes, just as functions do, and methods are simply further nested functions. In the following example, the generate function returns an instance of the nested Spam class. Within its code, the class name Spam is assigned in the generate function’s local scope, and hence is visible to any further nested functions, including code inside method; it’s the E in the “LEGB” scope lookup rule:

def generate():

    class Spam:                  # Spam is a name in generate's local scope

        count = 1

        def method(self):

            print(Spam.count)    # Visible in generate's scope, per LEGB rule (E)

    return Spam()

generate().method()

This example works in Python since version 2.2 because the local scopes of all enclosing function defs are automatically visible to nested defs (including nested method defs, as in this example).

Even so, keep in mind that method defs cannot see the local scope of the enclosing class; they can see only the local scopes of enclosing defs. That’s why methods must go through the self instance or the class name to reference methods and other attributes defined in the enclosing class statement. For example, code in the method must use self.count or Spam.count, not just count.
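The three ways a method might try to reach count can be compared directly. This sketch extends the earlier example with hypothetical method names, and assumes that no global variable named count exists in the enclosing module:

```python
def generate():
    class Spam:
        count = 1
        def by_class(self):
            return Spam.count         # OK: Spam is in generate's scope (E)
        def by_self(self):
            return self.count         # OK: attribute lookup via the instance
        def bare(self):
            return count              # Fails: the class's scope is not searched
    return Spam()

obj = generate()
print(obj.by_class(), obj.by_self())  # 1 1
try:
    obj.bare()
except NameError as exc:
    print('NameError:', exc)
```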

To avoid nesting, we could restructure this code such that the class Spam is defined at the top level of the module: the nested method function and the top-level generate will then both find Spam in their global scopes; it’s not localized to a function’s scope, but is still local to a single module:

def generate():

    return Spam()

class Spam:                    # Define at top level of module

    count = 1

    def method(self):

        print(Spam.count)      # Works: in global (enclosing module)

generate().method()

In fact, this approach is recommended for all Python releases—code tends to be simpler in general if you avoid nesting classes and functions. On the other hand, class nesting is useful in closure contexts, where the enclosing function’s scope retains state used by the class or its methods. In the following, the nested method has access to its own scope, the enclosing function’s scope (for label), the enclosing module’s global scope, anything saved in the self instance by the class, and the class itself via its nonlocal name:

>>> def generate(label):       # Returns a class instead of an instance

        class Spam:

            count = 1

            def method(self):

                print("%s=%s" % (label, Spam.count))

        return Spam

>>> aclass = generate('Gotchas')

>>> I = aclass()

>>> I.method()

Gotchas=1

Miscellaneous Class Gotchas

Here’s a handful of additional class-related warnings, mostly as review.

Choose per-instance or class storage wisely

On a similar note, be careful when you decide whether an attribute should be stored on a class or its instances: the former is shared by all instances, and the latter will differ per instance. This can be a crucial design issue in practice. In a GUI program, for instance, if you want information to be shared by all of the window class objects your application will create (e.g., the last directory used for a Save operation, or an already entered password), it must be stored as class-level data; if stored in the instance as self attributes, it will vary per window or be missing entirely when looked up by inheritance.
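The distinction might be coded as follows. This is a GUI-free sketch with a hypothetical Window class and a made-up directory path; the key detail is that the shared value is assigned through the class name rather than through self:

```python
class Window:
    last_dir = None                   # Class-level: shared by all windows

    def __init__(self, title):
        self.title = title            # Instance-level: differs per window

    def save(self, directory):
        Window.last_dir = directory   # Assign on the class, not on self

w1 = Window('Editor')
w2 = Window('Viewer')
w1.save('/tmp/docs')
print(w2.last_dir)                    # /tmp/docs: visible from every window
print(w1.title, w2.title)             # Editor Viewer
```

Had save assigned to self.last_dir instead, only w1 would see the new value, and w2 would still inherit None from the class.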

You usually want to call superclass constructors

Remember that Python runs only one __init__ constructor method when an instance is made—the lowest in the class inheritance tree. It does not automatically run the constructors of all superclasses higher up. Because constructors normally perform required startup work, you’ll usually need to run a superclass constructor from a subclass constructor—using a manual call through the superclass’s name (or super), passing along whatever arguments are required—unless you mean to replace the super’s constructor altogether, or the superclass doesn’t have or inherit a constructor at all.
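The effect of skipping this step is easy to see in a small sketch. The class and attribute names here are hypothetical:

```python
class Base:
    def __init__(self):
        self.ready = True             # Startup work that subclasses rely on

class Forgetful(Base):
    def __init__(self):
        self.extra = 1                # Base.__init__ never runs

class Careful(Base):
    def __init__(self):
        Base.__init__(self)           # Manual call through the superclass name
        self.extra = 1

print(hasattr(Forgetful(), 'ready'), hasattr(Careful(), 'ready'))   # False True
```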

Delegation-based classes in 3.X: __getattr__ and built-ins

Another reminder: as described earlier in this chapter and elsewhere, classes that use the __getattr__ operator overloading method to delegate attribute fetches to wrapped objects may fail in Python 3.X (and 2.X when new-style classes are used) unless operator overloading methods are redefined in the wrapper class. The names of operator overloading methods implicitly fetched by built-in operations are not routed through generic attribute-interception methods. To work around this, you must redefine such methods in wrapper classes, either manually, with tools, or by definition in superclasses; we’ll see how in Chapter 40.
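As a quick reminder of what this looks like in practice, the following sketch assumes Python 3.X and uses hypothetical Wrapper and FixedWrapper classes around a list. Explicitly named attributes are delegated, but the implicit __len__ fetch made by the len built-in is not, until the wrapper redefines it:

```python
class Wrapper:
    def __init__(self, obj):
        self.wrapped = obj
    def __getattr__(self, name):              # Runs for undefined names only
        return getattr(self.wrapped, name)

class FixedWrapper(Wrapper):
    def __len__(self):                        # Redefined for the len built-in
        return len(self.wrapped)
    def __getitem__(self, index):             # Redefined for indexing
        return self.wrapped[index]

plain = Wrapper([1, 2, 3])
plain.append(4)                               # Explicit name: delegated
try:
    len(plain)                                # Implicit __len__: not delegated
    implicit_ok = True
except TypeError:
    implicit_ok = False

fixed = FixedWrapper([1, 2, 3])
print(implicit_ok, plain.wrapped, len(fixed), fixed[0])   # False [1, 2, 3, 4] 3 1
```

Redefining each such method by hand is only one of the options mentioned above, and it becomes tedious when many operators are involved.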

KISS Revisited: “Overwrapping-itis”

When used well, the code reuse features of OOP make it excel at cutting development time. Sometimes, though, OOP’s abstraction potential can be abused to the point of making code difficult to understand. If classes are layered too deeply, code can become obscure; you may have to search through many classes to discover what an operation does.

For example, I once worked in a C++ shop with thousands of classes (some machine-generated), and up to 15 levels of inheritance. Deciphering method calls in such a complex system was often a monumental task: multiple classes had to be consulted for even the most basic of operations. In fact, the logic of the system was so deeply wrapped that understanding a piece of code in some cases required days of wading through related files. This obviously isn’t ideal for programmer productivity!

The most general rule of thumb of Python programming applies here, too: don’t make things complicated unless they truly must be. Wrapping your code in multiple layers of classes to the point of incomprehensibility is always a bad idea. Abstraction is the basis of polymorphism and encapsulation, and it can be a very effective tool when used well. However, you’ll simplify debugging and aid maintainability if you make your class interfaces intuitive, avoid making your code overly abstract, and keep your class hierarchies short and flat unless there is a good reason to do otherwise. Remember: code you write is generally code that others must read. See Chapter 20 for more on KISS.

Chapter Summary

This chapter presented an assortment of advanced class-related topics, including subclassing built-in types, new-style classes, static methods, and decorators. Most of these are optional extensions to the OOP model in Python, but they may become more useful as you start writing larger object-oriented programs, and are fair game if they appear in code you must understand. As mentioned earlier, our discussion of some of the more advanced class tools continues in the final part of this book; be sure to look ahead if you need more details on properties, descriptors, decorators, and metaclasses.

This is the end of the class part of this book, so you’ll find the usual lab exercises at the end of the chapter: be sure to work through them to get some practice coding real classes. In the next chapter, we’ll begin our look at our last core language topic, exceptions—Python’s mechanism for communicating errors and other conditions to your code. This is a relatively lightweight topic, but I’ve saved it for last because new exceptions are supposed to be coded as classes today. Before we tackle that final core subject, though, take a look at this chapter’s quiz and the lab exercises.

Test Your Knowledge: Quiz

1.    Name two ways to extend a built-in object type.

2.    What are function and class decorators used for?

3.    How do you code a new-style class?

4.    How are new-style and classic classes different?

5.    How are normal and static methods different?

6.    Are tools like __slots__ and super valid to use in your code?

7.    How long should you wait before lobbing a “Holy Hand Grenade”?

Test Your Knowledge: Answers

1.    You can embed a built-in object in a wrapper class, or subclass the built-in type directly. The latter approach tends to be simpler, as most original behavior is automatically inherited.

2.    Function decorators are generally used to manage a function or method, or add to it a layer of logic that is run each time the function or method is called. They can be used to log or count calls to a function, check its argument types, and so on. They are also used to “declare” static methods (simple functions in a class that are not passed an instance when called), as well as class methods and properties. Class decorators are similar, but manage whole objects and their interfaces instead of a function call.

3.    New-style classes are coded by inheriting from the object built-in class (or any other built-in type). In Python 3.X, all classes are new-style automatically, so this derivation is not required (but doesn’t hurt); in 2.X, classes with this explicit derivation are new-style and those without it are “classic.”

4.    New-style classes search the diamond pattern of multiple inheritance trees differently—they essentially search breadth-first (across), instead of depth-first (up) in diamond trees. New-style classes also change the result of the type built-in for instances and classes, do not run generic attribute fetch methods such as __getattr__ for built-in operation methods, and support a set of advanced extra tools including properties, descriptors, super, and __slots__ instance attribute lists.

5.    Normal (instance) methods receive a self argument (the implied instance), but static methods do not. Static methods are simple functions nested in class objects. To make a method static, it must either be run through a special built-in function or be decorated with decorator syntax. Python 3.X allows simple functions in a class to be called through the class without this step, but calls through instances still require static method declaration.

6.    Of course, but you shouldn’t use advanced tools automatically without carefully considering their implications. Slots, for example, can break code; super can mask later problems when used for single inheritance, and in multiple inheritance brings with it substantial complexity for an isolated use case; and both require universal deployment to be most useful. Evaluating new or advanced tools is a primary task of any engineer, and is why we explored tradeoffs so carefully in this chapter. This book’s goal is not to tell you which tools to use, but to underscore the importance of objectively analyzing them—a task often given too low a priority in the software field.

7.    Three seconds. (Or, more accurately: “And the Lord spake, saying, ‘First shalt thou take out the Holy Pin. Then, shalt thou count to three, no more, no less. Three shalt be the number thou shalt count, and the number of the counting shall be three. Four shalt thou not count, nor either count thou two, excepting that thou then proceed to three. Five is right out. Once the number three, being the third number, be reached, then lobbest thou thy Holy Hand Grenade of Antioch towards thy foe, who, being naughty in my sight, shall snuff it.’”)[66]


[66] This quote is from Monty Python and the Holy Grail (and if you didn’t know that, it may be time to find a copy!).
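
If you'd like to verify answers 4 and 5 for yourself, the following is a minimal sketch, not a definitive demo. The class names A, B, C, D, and Spam are made up for illustration, and the results assume Python 3.X, where all classes are new-style:

```python
# Hypothetical diamond: D inherits from B and C, which both inherit from A
class A:
    attr = 'A'

class B(A):
    pass                                     # No attr here: search continues

class C(A):
    attr = 'C'                               # Lower than A, but to B's right

class D(B, C):
    pass

X = D()
print(X.attr)                                # New-style search reaches C before A
print([cls.__name__ for cls in D.__mro__])   # The order the search follows

class Spam:
    numInstances = 0                         # Class-level counter

    def __init__(self):                      # Normal method: receives self
        Spam.numInstances += 1

    @staticmethod
    def printNumInstances():                 # Static method: no self passed
        return 'Instances: %s' % Spam.numInstances

a, b = Spam(), Spam()
print(Spam.printNumInstances())              # Call through the class
print(a.printNumInstances())                 # Call through an instance
```

In 3.X the first print displays C, and both static method calls report two instances. In 2.X, the same diamond yields A instead unless class A derives from object, because classic classes search depth-first.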

Test Your Knowledge: Part VI Exercises

These exercises ask you to write a few classes and experiment with some existing code. Of course, the problem with existing code is that it must be existing. To work with the set class in exercise 5, either pull the class source code off this book’s website (see the preface for a pointer) or type it up by hand (it’s fairly brief). These programs are starting to get more sophisticated, so be sure to check the solutions at the end of the book for pointers. You’ll find them in Appendix D, under Part VI.

1.    Inheritance. Write a class called Adder that exports a method add(self, x, y) that prints a “Not Implemented” message. Then, define two subclasses of Adder that implement the add method:

ListAdder

With an add method that returns the concatenation of its two list arguments

DictAdder

With an add method that returns a new dictionary containing the items in both of its dictionary arguments (any definition of dictionary addition will do)

Experiment by making instances of all three of your classes interactively and calling their add methods.

Now, extend your Adder superclass to save an object in the instance with a constructor (e.g., assign self.data a list or a dictionary), and overload the + operator with an __add__ method to automatically dispatch to your add methods (e.g., X + Y triggers X.add(X.data,Y)). Where is the best place to put the constructors and operator overloading methods (i.e., in which classes)? What sorts of objects can you add to your class instances?

In practice, you might find it easier to code your add methods to accept just one real argument (e.g., add(self,y)), and add that one argument to the instance’s current data (e.g., self.data + y). Does this make more sense than passing two arguments to add? Would you say this makes your classes more “object-oriented”?

2.    Operator overloading. Write a class called MyList that shadows (“wraps”) a Python list: it should overload most list operators and operations, including +, indexing, iteration, slicing, and list methods such as append and sort. See the Python reference manual or other documentation for a list of all possible methods to support. Also, provide a constructor for your class that takes an existing list (or a MyList instance) and copies its components into an instance attribute. Experiment with your class interactively. Things to explore:

a.     Why is copying the initial value important here?

b.    Can you use an empty slice (e.g., start[:]) to copy the initial value if it’s a MyList instance?

c.     Is there a general way to route list method calls to the wrapped list?

d.    Can you add a MyList and a regular list? How about a list and a MyList instance?

e.     What type of object should operations like + and slicing return? What about indexing operations?

f.      If you are working with a reasonably recent Python release (version 2.2 or later), you may implement this sort of wrapper class by embedding a real list in a standalone class, or by extending the built-in list type with a subclass. Which is easier, and why?

3.    Subclassing. Make a subclass of MyList from exercise 2 called MyListSub, which extends MyList to print a message to stdout before each call to the + overloaded operation and counts the number of such calls. MyListSub should inherit basic method behavior from MyList. Adding a sequence to a MyListSub should print a message, increment the counter for + calls, and perform the superclass’s method. Also, introduce a new method that prints the operation counters to stdout, and experiment with your class interactively. Do your counters count calls per instance, or per class (for all instances of the class)? How would you program the other option? (Hint: it depends on which object the count members are assigned to: class members are shared by instances, but self members are per-instance data.)

4.    Attribute methods. Write a class called Attrs with methods that intercept every attribute qualification (both fetches and assignments), and print messages listing their arguments to stdout. Create an Attrs instance, and experiment with qualifying it interactively. What happens when you try to use the instance in expressions? Try adding, indexing, and slicing the instance of your class. (Note: a fully generic approach based upon __getattr__ will work in 2.X’s classic classes but not in 3.X’s new-style classes, which are optional in 2.X, for reasons noted in Chapter 28, Chapter 31, and Chapter 32, and summarized in the solution to this exercise.)

5.    Set objects. Experiment with the set class described in “Extending Types by Embedding”. Run commands to do the following sorts of operations:

a.     Create two sets of integers, and compute their intersection and union by using & and | operator expressions.

b.    Create a set from a string, and experiment with indexing your set. Which methods in the class are called?

c.     Try iterating through the items in your string set using a for loop. Which methods run this time?

d.    Try computing the intersection and union of your string set and a simple Python string. Does it work?

e.     Now, extend your set by subclassing to handle arbitrarily many operands using the *args argument form. (Hint: see the function versions of these algorithms in Chapter 18.) Compute intersections and unions of multiple operands with your set subclass. How can you intersect three or more sets, given that & has only two sides?

f.      How would you go about emulating other list operations in the set class? (Hint: __add__ can catch concatenation, and __getattr__ can pass most named list method calls like append to the wrapped list.)

6.    Class tree links. In “Namespaces: The Whole Story” in Chapter 29 and in “Multiple Inheritance: ‘Mix-in’ Classes” in Chapter 31, we learned that classes have a __bases__ attribute that returns a tuple of their superclass objects (the ones listed in parentheses in the class header). Use __bases__ to extend the lister.py mix-in classes we wrote in Chapter 31 so that they print the names of the immediate superclasses of the instance’s class. When you’re done, the first line of the string representation should look like this (your address will almost certainly vary):

<Instance of Sub(Super, Lister), address 7841200:

8.    Composition. Simulate a fast-food ordering scenario by defining four classes:

Lunch

A container and controller class

Customer

The actor who buys food

Employee

The actor from whom a customer orders

Food

What the customer buys

To get you started, here are the classes and methods you’ll be defining:

class Lunch:
    def __init__(self)               # Make/embed Customer and Employee
    def order(self, foodName)        # Start a Customer order simulation
    def result(self)                 # Ask the Customer what Food it has

class Customer:
    def __init__(self)                        # Initialize my food to None
    def placeOrder(self, foodName, employee)  # Place order with an Employee
    def printFood(self)                       # Print the name of my food

class Employee:
    def takeOrder(self, foodName)    # Return a Food, with requested name

class Food:
    def __init__(self, name)         # Store food name

The order simulation should work as follows:

a.     The Lunch class’s constructor should make and embed an instance of Customer and an instance of Employee, and it should export a method called order. When called, this order method should ask the Customer to place an order by calling its placeOrder method. The Customer’s placeOrder method should in turn ask the Employee object for a new Food object by calling Employee’s takeOrder method.

b.    Food objects should store a food name string (e.g., “burritos”), passed down from Lunch.order, to Customer.placeOrder, to Employee.takeOrder, and finally to Food’s constructor. The top-level Lunch class should also export a method called result, which asks the customer to print the name of the food it received from the Employee via the order (this can be used to test your simulation).

Note that Lunch needs to pass either the Employee or itself to the Customer to allow the Customer to call Employee methods.

Experiment with your classes interactively by importing the Lunch class, calling its order method to run an interaction, and then calling its result method to verify that the Customer got what he or she ordered. If you prefer, you can also simply code test cases as self-test code in the file where your classes are defined, using the module __name__ trick of Chapter 25. In this simulation, the Customer is the active agent; how would your classes change if Employee were the object that initiated customer/employee interaction instead?

9.    Zoo animal hierarchy. Consider the class tree shown in Figure 32-1.

Code a set of six class statements to model this taxonomy with Python inheritance. Then, add a speak method to each of your classes that prints a unique message, and a reply method in your top-level Animal superclass that simply calls self.speak to invoke the category-specific message printer in a subclass below (this will kick off an independent inheritance search from self). Finally, remove the speak method from your Hacker class so that it picks up the default above it. When you’re finished, your classes should work this way:

% python
>>> from zoo import Cat, Hacker
>>> spot = Cat()
>>> spot.reply()                   # Animal.reply: calls Cat.speak
meow
>>> data = Hacker()
>>> data.reply()                   # Animal.reply: calls Primate.speak
Hello world!

Figure 32-1. A zoo hierarchy composed of classes linked into a tree to be searched by attribute inheritance. Animal has a common “reply” method, but each class may have its own custom “speak” method called by “reply”.

10.    The Dead Parrot Sketch. Consider the object embedding structure captured in Figure 32-2.

Code a set of Python classes to implement this structure with composition. Code your Scene object to define an action method, and embed instances of the Customer, Clerk, and Parrot classes (each of which should define a line method that prints a unique message). The embedded objects may either inherit from a common superclass that defines line and simply provide message text, or define line themselves. In the end, your classes should operate like this:

% python
>>> import parrot
>>> parrot.Scene().action()        # Activate nested objects
customer: "that's one ex-bird!"
clerk: "no it isn't..."
parrot: None

Figure 32-2. A scene composite with a controller class (Scene) that embeds and directs instances of three other classes (Customer, Clerk, Parrot). The embedded instance’s classes may also participate in an inheritance hierarchy; composition and inheritance are often equally useful ways to structure classes for code reuse.
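
As a side note on the hint in exercise 3, the difference between class members and self members is easy to see in isolation. The following is only a rough sketch of that one idea, not a solution to the exercise; the class Tally and its attribute names are invented for this illustration:

```python
class Tally:                       # Hypothetical class, not part of the exercises
    total = 0                      # Class attribute: shared by all instances

    def __init__(self):
        self.mine = 0              # Instance attribute: per-object data

    def bump(self):
        Tally.total += 1           # Changes the single class-level counter
        self.mine += 1             # Changes only this instance's counter

x, y = Tally(), Tally()
x.bump()
x.bump()
y.bump()
print(x.mine, y.mine, Tally.total)     # Per-instance counts, then the shared count
```

Here x and y each keep their own count in mine, while total reflects calls made through every instance, because it lives on the class object that both instances inherit from.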

WHY YOU WILL CARE: OOP BY THE MASTERS

When I teach Python classes, I invariably find that about halfway through the class, people who have used OOP in the past are following along intensely, while people who have not are beginning to glaze over (or nod off completely). The point behind the technology just isn’t apparent.

In a book like this, I have the luxury of including material like the new Big Picture overview in Chapter 26, and the gradual tutorial of Chapter 28—in fact, you should probably review that section if you’re starting to feel like OOP is just some computer science mumbo-jumbo. Though it adds much more structure than the generators we met earlier, OOP similarly relies on some magic (inheritance search and a special first argument) that beginners can find difficult to rationalize.

In real classes, however, to help get the newcomers on board (and keep them awake), I have been known to stop and ask the experts in the audience why they use OOP. The answers they’ve given might help shed some light on the purpose of OOP, if you’re new to the subject.

Here, then, with only a few embellishments, are the most common reasons to use OOP, as cited by my students over the years:

Code reuse

This one’s easy (and is the main reason for using OOP). By supporting inheritance, classes allow you to program by customization instead of starting each project from scratch.

Encapsulation

Wrapping up implementation details behind object interfaces insulates users of a class from code changes.

Structure

Classes provide new local scopes, which minimizes name clashes. They also provide a natural place to write and look for implementation code, and to manage object state.

Maintenance

Classes naturally promote code factoring, which allows us to minimize redundancy. Thanks both to the structure and code reuse support of classes, usually only one copy of the code needs to be changed.

Consistency

Classes and inheritance allow you to implement common interfaces, and hence create a common look and feel in your code; this eases debugging, comprehension, and maintenance.

Polymorphism

This is more a property of OOP than a reason for using it, but by supporting code generality, polymorphism makes code more flexible and widely applicable, and hence more reusable.

Other

And, of course, the number one reason students gave for using OOP: it looks good on a résumé! (OK, I threw this one in as a joke, but it is important to be familiar with OOP if you plan to work in the software field today.)
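
To make the code reuse and polymorphism entries a bit more concrete, here is a small sketch of programming by customization. The Worker and Lead classes, their attributes, and the pay figures are all made up for this illustration, and this is just one of many ways such classes might be coded:

```python
class Worker:                                # Hypothetical general class
    def __init__(self, name, pay):
        self.name = name
        self.pay = pay

    def give_raise(self, percent):           # percent is a whole number, e.g. 10
        self.pay += self.pay * percent // 100

class Lead(Worker):                          # Customize: code only what differs
    def give_raise(self, percent, bonus=10):
        Worker.give_raise(self, percent + bonus)   # Reuse the original logic

staff = [Worker('Bob', 50000), Lead('Sue', 50000)]
for person in staff:
    person.give_raise(10)                    # Polymorphism: the object's class decides
    print(person.name, person.pay)
```

The loop never tests which kind of object it has: each give_raise call is routed to the version in the object's own class, and the raise logic itself lives in just one place, so Bob ends up with 55000 and Sue with 60000.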

Finally, keep in mind what I said at the beginning of this part of the book: you won’t fully appreciate OOP until you’ve used it for a while. Pick a project, study larger examples, work through the exercises—do whatever it takes to get your feet wet with OO code; it’s worth the effort.