Functional Python Programming (2015)

Chapter 11. Decorator Design Techniques

Python offers us many ways to create higher-order functions. In Chapter 5, Higher-order Functions, we looked at two techniques: defining a function that accepts a function as an argument, and defining a subclass of Callable that is either initialized with a function or called with a function as an argument.

In this chapter, we'll look at using a decorator to build a function based on another function. We'll also look at two functions from the functools module, the update_wrapper() and wraps() functions, that can help us build decorators.

One of the benefits of decorated functions is that we can create composite functions. These are single functions that embody functionality from several sources. A composite function, $f \circ g(x)$, can be somewhat more expressive of a complex algorithm than $f(g(x))$. It's often helpful to have a number of syntax alternatives for expressing complex processing.

Decorators as higher-order functions

The core idea of a decorator is to transform some original function into another form. A decorator creates a kind of composite function based on the decorator and the original function being decorated.

A decorator function can be used in one of the two following ways:

·        As a prefix that creates a new function with the same name as the base function as follows:

         @decorator
         def original_function():
             pass

·        As an explicit operation that returns a new function, possibly with a new name:

         def original_function():
             pass

         original_function = decorator(original_function)

These are two different syntaxes for the same operation. The prefix notation has the advantages of being tidy and succinct. The prefix location is more visible to some readers. The suffix notation is explicit and slightly more flexible. While the prefix notation is common, there is one reason for using the suffix notation: we might not want the resulting function to replace the original function. We might want to execute the following command that allows us to use both the decorated and the undecorated functions:

new_function = decorator(original_function)
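As a minimal sketch of keeping both versions available, consider the following. The shout() decorator and the greet() function are hypothetical names introduced only for this illustration; they aren't part of the chapter's examples.

```python
from functools import wraps

def shout(function):
    """Hypothetical decorator: upper-cases the string result of function."""
    @wraps(function)
    def shout_wrapper(*args, **kw):
        return function(*args, **kw).upper()
    return shout_wrapper

def greet(name):
    return "hello, " + name

# Explicit (suffix) form: the original name is left untouched.
loud_greet = shout(greet)

print(greet("world"))       # hello, world
print(loud_greet("world"))  # HELLO, WORLD
```

Had we used the @shout prefix, only the decorated version would be reachable under the name greet.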

Python functions are first-class objects. A function that accepts a function as an argument and returns a function as the result is clearly a built-in feature of the language. The open question then is how do we update or adjust the internal code structure of a function?

The answer is we don't.

Rather than messing about on the inside of the code, it's much cleaner to define a new function that wraps the original function. We have two tiers of higher-order functions involved in defining a decorator as follows:

·        The decorator function applies a wrapper to a base function and returns the new wrapper. This function can do some one-time only evaluation as part of building the decorated function.

·        The wrapper function can (and usually does) evaluate the base function. This function will be evaluated every time the decorated function is evaluated.
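The two tiers can be made visible with a small sketch. The traced() decorator and the events list below are assumptions made for illustration; they simply record when each tier runs.

```python
from functools import wraps

events = []  # records when each tier runs (illustrative only)

def traced(function):
    # Decorator tier: runs once, when the function is decorated.
    events.append("decorating " + function.__name__)

    @wraps(function)
    def traced_wrapper(*args, **kw):
        # Wrapper tier: runs on every call of the decorated function.
        events.append("calling " + function.__name__)
        return function(*args, **kw)
    return traced_wrapper

@traced
def double(x):
    return 2 * x

double(3)
double(4)
print(events)
# ['decorating double', 'calling double', 'calling double']
```

The decorating step appears once, while the wrapper's step appears once per call.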

Here's an example of a simple decorator:

from functools import wraps

def nullable(function):

    @wraps(function)

    def null_wrapper(arg):

        return None if arg is None else function(arg)

    return null_wrapper

We almost always want to use the functools.wraps() function to ensure that the decorated function retains the attributes of the original function. Copying the __name__ and __doc__ attributes, for example, ensures that the resulting decorated function has the name and docstring of the original function.
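The effect of wraps() can be seen by comparing two otherwise identical decorators. This is only a sketch; the names plain(), preserving(), and square() are hypothetical.

```python
from functools import wraps

def plain(function):
    # No @wraps: the wrapper keeps its own name and docstring.
    def wrapper(arg):
        return function(arg)
    return wrapper

def preserving(function):
    # With @wraps: name and docstring are copied from function.
    @wraps(function)
    def wrapper(arg):
        return function(arg)
    return wrapper

def square(x):
    """Return x squared."""
    return x * x

print(plain(square).__name__)       # wrapper
print(preserving(square).__name__)  # square
print(preserving(square).__doc__)   # Return x squared.
```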

The resulting composite function, called null_wrapper() in the definition of the decorator, is also a kind of higher-order function that combines the original function, function(), in an expression that preserves the None values. The original function is not an explicit argument; it is a free variable that will get its value from the context in which the null_wrapper() function is defined.

The decorator function returns the newly minted function. It's important that decorators only return functions, and not attempt any processing of data. Decorators are meta-programming: code that creates code. The null_wrapper() function, however, will be used to process the real data.

We can apply our @nullable decorator to create a composite function as follows:

import math

nlog = nullable(math.log)

We now have a function, nlog(), which is a null-aware version of the math.log() function. We can use our composite, nlog() function, as follows:

>>> some_data = [10, 100, None, 50, 60]

>>> scaled = map(nlog, some_data)

>>> list(scaled)

[2.302585092994046, 4.605170185988092, None, 3.912023005428146, 4.0943445622221]

We've applied the function to a collection of data values. The None value politely leads to a None result. There was no exception processing involved.

Note

This example isn't really suitable for unit testing. We'll need to round the values for testing purposes. For this, we'll need a null-aware round() function too.

Here's how we can create a null-aware rounding function using decorator notation:

@nullable

def nround4(x):

    return round(x, 4)

This function is a partial application of the round() function, wrapped to be null-aware. In some respects, this is a relatively sophisticated bit of functional programming that's readily available to Python programmers.

We could also create the null-aware rounding function using the following:

nround4 = nullable(lambda x: round(x, 4))

This has the same effect, at some cost in clarity.

We can use this nround4() function to create a better test case for our nlog() function as follows:

>>> some_data = [10, 100, None, 50, 60]

>>> scaled = map(nlog, some_data)

>>> [nround4(v) for v in scaled]

[2.3026, 4.6052, None, 3.912, 4.0943]

This result will be independent of any platform considerations.

This decorator makes an assumption that the decorated function is unary. We would need to revisit this design to create a more general-purpose null-aware decorator that works with arbitrary collections of arguments.
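One possible generalization is sketched below. The name nullable_any() and the rule it applies (return None if any positional or keyword argument is None) are assumptions; other policies, such as checking only the first argument, are equally plausible.

```python
from functools import wraps

def nullable_any(function):
    """Hypothetical variant: return None if any argument is None."""
    @wraps(function)
    def null_wrapper(*args, **kw):
        has_none = (
            any(a is None for a in args)
            or any(v is None for v in kw.values())
        )
        return None if has_none else function(*args, **kw)
    return null_wrapper

@nullable_any
def ratio(a, b):
    return a / b

print(ratio(6, 3))         # 2.0
print(ratio(6, None))      # None
print(ratio(a=None, b=3))  # None
```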

In Chapter 14, The PyMonad Library, we'll look at an alternative approach to this problem of tolerating the None values. The PyMonad library defines a Maybe class of objects which may have a proper value or may be the None value.

Using the functools update_wrapper() function

The @wraps decorator applies the update_wrapper() function to preserve a few attributes of a wrapped function. In general, this does everything we need by default. This function copies a specific list of attributes from the original function to the resulting function created by a decorator.

The update_wrapper() function relies on a module global variable, WRAPPER_ASSIGNMENTS, to determine which attributes to copy by default. At the time of writing, the default value is this tuple of attribute names (later Python releases extend it):

('__module__', '__name__', '__qualname__', '__doc__', '__annotations__')

It's difficult to make meaningful modifications to this list. In order to copy additional attributes, we have to assure that our functions are defined with these additional attributes. This is challenging, since the internals of the def statement aren't open to simple modification or change.

Because we can't easily fold in new attributes, it's difficult to locate reasons to modify or extend the way the wrapping works on a function. It's mostly interesting to use this variable as a piece of reference information.
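As a piece of reference information, the variable can be inspected directly, and the explicit update_wrapper() call can be compared with the @wraps shorthand. This is a minimal sketch; base() and wrapper() are placeholder names, and the printed tuple depends on the Python version in use.

```python
import functools

# The names copied by default; the exact tuple depends on the Python version.
print(functools.WRAPPER_ASSIGNMENTS)

def base(x):
    """Docstring of the base function."""
    return x

def wrapper(x):
    return base(x)

# Applying @wraps(base) to wrapper is shorthand for this explicit call.
functools.update_wrapper(wrapper, base)

print(wrapper.__name__)  # base
print(wrapper.__doc__)   # Docstring of the base function.
```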

If we're going to use callable objects, then we might have a class that provides some additional attributes as part of the definition. We could then have a situation where a decorator might need to copy these additional attributes from the original wrapped callable object to the wrapping function being created. However, it seems simpler to make these kinds of changes in the class definition itself, rather than exploit tricky decorator techniques.

While there's a lot of flexibility available, much of it isn't helpful for ordinary application development.

Cross-cutting concerns

One general principle behind decorators is to allow us to build a composite function from the decorator and the original function to which the decorator is applied. The idea is to have a library of common decorators that can provide implementations for common concerns.

We often call these cross-cutting concerns because they apply across several functions. These are the sorts of things that we would like to design once via a decorator and have them applied in relevant classes throughout an application or a framework.

Concerns that are often centralized as described previously include the following:

·        Logging

·        Auditing

·        Security

·        Handling incomplete data

A logging decorator, for example, might write standardized messages to the application's logfile. An audit decorator might write details surrounding a database update. A security decorator might check some runtime context to be sure that the login user has the necessary permissions.
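A logging decorator of this kind might be sketched as follows. The logged() name, the choice of logger, and the message formats are all assumptions for illustration; a real application would follow its own logging conventions.

```python
import logging
from functools import wraps

def logged(function):
    """Hypothetical logging decorator; the message format is an assumption."""
    logger = logging.getLogger(function.__module__)

    @wraps(function)
    def log_wrapper(*args, **kw):
        logger.info("calling %s", function.__name__)
        result = function(*args, **kw)
        logger.info("%s returned %r", function.__name__, result)
        return result
    return log_wrapper

@logged
def add(a, b):
    return a + b

logging.basicConfig(level=logging.INFO)
print(add(2, 3))  # 5, with two INFO messages written to the log
```

The add() function itself contains no logging code; the concern lives entirely in the decorator.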

Our example of a null-aware wrapper for a function is a cross-cutting concern. In this case, we'd like to have a number of functions handle the None values by returning the None values instead of raising an exception. In applications where data is incomplete, we may have a need to process rows in a simple, uniform way without having to write lots of distracting if statements to handle missing values.

Composite design

The common mathematical notation for a composite function looks as follows:

$f \circ g(x) = f(g(x))$

The idea is that we can define a new function, $f \circ g(x)$, that combines two other functions, $f(y)$ and $g(x)$.

Python's multiple-line definition of the form is as follows:

@f

def g(x):

    something

This is vaguely equivalent to $f \circ g(x)$. The equivalence isn't very precise because the @f decorator isn't the same as the mathematical abstraction of composing $f(y)$ and $g(x)$. For the purposes of discussing function composition, we'll ignore the implementation disconnect between the abstraction of $f(y)$ and the @f decorator.

Because decorators wrap another function, Python offers a slightly more generalized composition. We can think of Python design as follows:

$w \circ g(x) = (w_\beta \circ g \circ w_\alpha)(x) = w_\beta(g(w_\alpha(x)))$

A decorator applied to some application function, $g(x)$, will include a wrapper function. One portion of the wrapper, $w_\alpha(y)$, applies before the wrapped function and the other portion, $w_\beta(z)$, applies after the wrapped function.

The wrapper function often looks as follows:

@wraps(argument_function)
def something_wrapper(*args, **kw):
    # The "before" part, w_α, applied to *args or **kw
    result = argument_function(*args, **kw)
    # The "after" part, w_β, applied to the result
    return result

Details will vary, and vary widely. There are many clever things that can be done within this general framework.
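To make the general framework concrete, here is a small sketch with both a "before" and an "after" part. The decorator name comma_tolerant_rounded() and its particular choices (removing commas before the call, rounding to two places after it) are assumptions picked only to illustrate the shape.

```python
from functools import wraps

def comma_tolerant_rounded(function):
    """Hypothetical decorator with a "before" and an "after" part."""
    @wraps(function)
    def wrapper(text):
        cleaned = text.replace(",", "")  # the "before" part, w_α
        result = function(cleaned)       # the wrapped function, g
        return round(result, 2)          # the "after" part, w_β
    return wrapper

@comma_tolerant_rounded
def to_float(text):
    return float(text)

print(to_float("1,234.5678"))  # 1234.57
```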

A great deal of functional programming amounts to $f(g(x))$ kinds of constructs. We often spell these functions out because there's no real benefit from summarizing the function into a composite, $f \circ g(x)$. In some cases, however, we might want to use a composite function with a higher-order function like map(), filter(), or reduce().

We can always resort to the map(f, map(g, x)) method. It might be more clear, however, to use the map(f_g, x) method to apply a composite to a collection. It's important to note that there's no inherent performance advantage to either technique. The map() function is lazy: with two map() functions, one item will be taken from x, processed by the g() function, and then processed by the f() function. With a single map() function, an item will be taken from x and then processed by the f_g() composite function.
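The two spellings can be compared with a short sketch. The compose() helper is a hypothetical name introduced here, and f() and g() are trivial placeholder functions.

```python
def compose(f, g):
    """Hypothetical helper: return the composite of f and g."""
    def f_g(x):
        return f(g(x))
    return f_g

def g(x):
    return x + 1

def f(y):
    return y * 2

data = [1, 2, 3]
f_g = compose(f, g)

nested = list(map(f, map(g, data)))  # two lazy map() objects
single = list(map(f_g, data))        # one map() over the composite

print(nested)  # [4, 6, 8]
print(single)  # [4, 6, 8]
```

Both forms process one item at a time and produce the same results; the choice is about readability.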

In Chapter 14, The PyMonad Library, we'll look at an alternative approach to this problem of creating composite functions from individual curried functions.

Preprocessing bad data

One cross-cutting concern in some exploratory data analysis applications is how to handle numeric values that are missing or cannot be parsed. We often have a mixture of float, int, and Decimal currency values that we'd like to process with some consistency.

In other contexts, we have "not applicable" or "not available" data values that shouldn't interfere with the main thread of the calculation. It's often handy to allow the Not Applicable values to pass through an expression without raising an exception. We'll focus on three bad-data conversion functions: bd_int(), bd_float(), and bd_decimal(). The composite feature we're adding will be defined before the built-in conversion function.

Here's a simple bad-data decorator:

import decimal

def bad_data(function):

    @wraps(function)

    def wrap_bad_data(text, *args, **kw):

        try:

            return function(text, *args, **kw)

        except (ValueError, decimal.InvalidOperation):

            cleaned= text.replace(",", "")

            return function(cleaned, *args, **kw)

    return wrap_bad_data

This function wraps a given conversion function to try a second conversion in the event the first conversion involved bad data. In the case of preserving the None values as a Not Applicable code, the exception handling would simply return the None value.

In this case, we've provided Python *args and **kw parameters. This assures that the wrapped functions can have additional argument values provided.

We can use this wrapper as follows:

bd_int = bad_data(int)
bd_float = bad_data(float)
bd_decimal = bad_data(decimal.Decimal)

This will create a suite of functions that can do conversions of good data as well as a limited amount of data cleansing to handle specific kinds of bad data.

Following are some examples of using the bd_int() function:

>>> bd_int("13")

13

>>> bd_int("1,371")

1371

>>> bd_int("1,371", base=16)

4977

We've applied the bd_int() function to a string that converted neatly and a string with the specific type of punctuation that we'll tolerate. We've also shown that we can provide additional parameters to each of these conversion functions.

We might like to have a more flexible decorator. One feature that we might like to add is the ability to handle a variety of data scrubbing alternatives. Simply removing commas isn't always what we need. We may also need to remove $ or ° symbols. We'll look at more sophisticated, parameterized decorators in the next section.

Adding a parameter to a decorator

A common requirement is to customize a decorator with additional parameters. Rather than simply creating a composite $f \circ g(x)$, we're doing something a bit more complex. We're creating $(f(c) \circ g)(x)$. We've applied a parameter, c, as part of creating the wrapper. This parameterized composite, $f(c) \circ g$, can then be used with the actual data, x.

In Python syntax, we can write it as follows:

@deco(arg)

def func( ):

    something

This applies the parameterized decorator, deco(arg), to the base function definition.

The effect is as follows:

def func( ):

    something

func= deco(arg)(func)

We've done three things and they are as follows:

1.    Define a function, func.

2.    Apply the abstract decorator, deco(), to its arguments to create a concrete decorator, deco(arg).

3.    Apply the concrete decorator, deco(arg), to the base function to create the decorated version of the function, deco(arg)(func).

A decorator with arguments involves indirect construction of the final function. We seem to have moved beyond merely higher-order functions into something even more abstract: higher-order functions that create higher-order functions.
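The three steps can be traced with a small sketch. The multiply_by() decorator and the inc() functions are hypothetical names used only to show that the prefix form and the explicit deco(arg)(func) form produce equivalent functions.

```python
from functools import wraps

def multiply_by(factor):
    """Hypothetical abstract decorator; factor customizes the wrapper."""
    def concrete_decorator(function):
        @wraps(function)
        def wrapper(x):
            return factor * function(x)
        return wrapper
    return concrete_decorator

# Prefix form: all three steps happen at definition time.
@multiply_by(10)
def inc(x):
    return x + 1

# The same three steps, spelled out explicitly.
def inc_plain(x):
    return x + 1

concrete = multiply_by(10)          # step 2: build the concrete decorator
inc_explicit = concrete(inc_plain)  # step 3: apply it to the base function

print(inc(1), inc_explicit(1))  # 20 20
```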

We can expand our bad-data aware decorator to create a slightly more flexible conversion. We'll define a decorator that can accept parameters of characters to remove. Following is a parameterized decorator:

import decimal

def bad_char_remove(*char_list):

    def cr_decorator(function):

        @wraps(function)

        def wrap_char_remove(text, *args, **kw):

            try:

                return function(text, *args, **kw)

            except (ValueError, decimal.InvalidOperation):

                cleaned= clean_list(text, char_list)

                return function(cleaned, *args, **kw)

        return wrap_char_remove

    return cr_decorator
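The wrapper relies on a clean_list() helper that isn't shown in this section. A minimal sketch, assuming the helper simply removes every character named in char_list from the text, might look like this; the actual helper may be written differently.

```python
def clean_list(text, char_list):
    """Assumed helper: remove every character in char_list from text."""
    for char in char_list:
        text = text.replace(char, "")
    return text

print(clean_list("$1,701.00", ("$", ",")))  # 1701.00
```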

A parameterized decorator has three parts and they are as follows:

·        The overall decorator. This defines and returns the abstract decorator. In this case, bad_char_remove() is the overall decorator and cr_decorator is the abstract decorator that it returns. The cr_decorator function has a free variable, char_list, that comes from the overall decorator.

·        The abstract decorator. In this case, the cr_decorator decorator will have its free variable, char_list, bound so that it can be applied to a function.

·        The decorating wrapper. In this example, the wrap_char_remove function will replace the wrapped function. Because of the @wraps decorator, the __name__ (and other attributes) of the wrapper will be replaced with those of the function being wrapped.

We can use this decorator to create conversion functions as follows:

from decimal import Decimal

@bad_char_remove("$", ",")
def currency(text, **kw):
    return Decimal(text, **kw)

We've used our decorator to wrap a currency() function. The essential feature of the currency() function is a reference to the decimal.Decimal constructor.

This currency() function will now handle some variant data formats:

>>> currency("13")

Decimal('13')

>>> currency("$3.14")

Decimal('3.14')

>>> currency("$1,701.00")

Decimal('1701.00')

We can now process input data using a relatively simple map(currency, row) method to convert source data from strings to usable Decimal values. The try:/except: error-handling has been isolated to a function that we've used to build a composite conversion function.

We can use a similar design to create null-tolerant functions. These functions would use a similar try:/except: wrapper, but would simply return the None value when the conversion fails.
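A null-tolerant variant might be sketched as follows. The null_on_error() name and the exact set of exceptions it catches are assumptions; TypeError is included here so that a None input also yields None.

```python
import decimal
from decimal import Decimal
from functools import wraps

def null_on_error(function):
    """Hypothetical decorator: return None when the conversion fails."""
    @wraps(function)
    def wrap_null(text, *args, **kw):
        try:
            return function(text, *args, **kw)
        except (ValueError, TypeError, decimal.InvalidOperation):
            return None
    return wrap_null

n_int = null_on_error(int)
n_decimal = null_on_error(Decimal)

print(n_int("42"), n_int("N/A"))            # 42 None
print(n_decimal("3.14"), n_decimal("N/A"))  # 3.14 None
```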

Implementing more complex decorators

We can easily write the following commands:

@f_wrap

@g_wrap

def h(x):

    something

There's nothing in Python to stop us. This has a meaning somewhat like $f \circ g \circ h(x)$. However, the name is merely $h(x)$. Because of this potential confusion, we need to be cautious when creating functions that involve deeply nested decorators. If our intent is simply to handle some cross-cutting concerns, then each decorator can handle a concern without creating much confusion.

If, on the other hand, we're using a decoration to create a composite function, it might also be better to use the following command:

f_g_h = f_wrap(g_wrap(h))

This clarifies precisely what is going on. Decorator functions don't correspond precisely with the mathematical abstraction of functions being composed. The decorator function actually contains a wrapper function that will contain the function being composed. This distinction between a function and a decorator that creates a composite from the function can become a problem when trying to understand an application.
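The order of application and the naming issue can both be seen in a short sketch. The bodies of f_wrap() and g_wrap() below are hypothetical; each simply marks its contribution in the result string.

```python
from functools import wraps

def f_wrap(function):
    @wraps(function)
    def f_wrapper(x):
        return "f(" + function(x) + ")"
    return f_wrapper

def g_wrap(function):
    @wraps(function)
    def g_wrapper(x):
        return "g(" + function(x) + ")"
    return g_wrapper

# The decorator closest to the def is applied first.
@f_wrap
@g_wrap
def h(x):
    return "h(" + x + ")"

print(h("x"))      # f(g(h(x)))
print(h.__name__)  # h
```

The composite behaves like three nested functions, yet its name reveals only the innermost one.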

As with other aspects of functional programming, a succinct and expressive program is the goal. Decorators that are expressive are welcome. Writing an über-meta-super-callable that can do everything in the application with only minor customizations may be succinct, but it's rarely expressive.

Recognizing design limitations

In the case of our data cleanup, the simplistic removal of stray characters may not be sufficient. When working with the geolocation data, we may have a wide variety of input formats that include simple degrees (37.549016197), degrees and minutes (37° 32.94097′), and degrees-minutes-seconds (37° 32′ 56.46″). Of course, there can be even more subtle cleaning problems: some devices will create an output with the Unicode U+00BA character, º, instead of the similar-looking degree character, °, which is U+00B0.

For this reason, it is often necessary to provide a separate cleansing function that's bundled in with the conversion function. This function will handle the more sophisticated conversions required by inputs that are as wildly inconsistent in format as latitudes and longitudes.

How can we implement this? We have a number of choices. Simple higher-order functions are a good choice. A decorator, on the other hand, doesn't work out terribly well. We'll look at a decorator-based design to see that there are limitations to what makes sense in a decorator.

The requirements have two orthogonal design considerations and they are as follows:

1.    The output conversion (int, float, Decimal)

2.    The input cleaning (clean stray characters, reformat coordinates)

Ideally, one of these aspects is an essential function that gets wrapped and the other aspect is something that's included via a wrapper. The choice of essence versus wrap isn't clear. One of the reasons it isn't clear is that our previous examples are a bit more complex than a simple two-part composite.

In the previous examples, we were actually creating a three-part composite:

·        The output conversion (int, float, Decimal)

·        The input cleansing—either a simple replace or a more complex multiple-character replacement

·        The function which attempted the conversion, did the cleansing as a response to an exception, and attempted the conversion again

The third part, attempting the conversion and retrying, is the actual wrapper that also forms a part of the composite function. As we noted previously, a wrapper contains a before phase and an after phase, which we've called $w_\alpha$ and $w_\beta$, respectively.

We want to use this wrapper to create a composite of two additional functions. We have two choices for the syntax. We could include the cleansing function as an argument to the decorator on the conversion as follows:

@cleanse_before(cleanser)

def convert(text):

    something

Or, we could include the conversion function as an argument to the decorator for a cleansing function as follows:

@then_convert(converter)

def clean(text):

    something

In this case, we can choose the @then_convert(converter) style decorator because we're relying—for the most part—on the built-in conversions. Our point is to show that the choice is not crystal clear.

The decorator looks as follows:

def then_convert(convert_function):

    def clean_convert_decorator(clean_function):

        @wraps(clean_function)

        def cc_wrapper(text, *args, **kw):

            try:

                return convert_function(text, *args, **kw)

            except (ValueError, decimal.InvalidOperation):

                cleaned= clean_function(text)

                return convert_function(cleaned, *args, **kw)

        return cc_wrapper

    return clean_convert_decorator

We've defined a three-layer decorator. At the heart is the cc_wrapper() function that applies the convert_function function. If this fails, then it uses the clean_function function and then tries the convert_function function again. This function is wrapped around the clean_function function by the clean_convert_decorator() concrete decorator function. The concrete decorator has the convert_function function as a free variable. The concrete decorator is created by the decorator interface, then_convert(), which is customized by a conversion function.

We can now build a slightly more flexible cleanse and convert function as follows:

@then_convert(int)

def drop_punct(text):

    return text.replace(",", "").replace("$", "")

The integer conversion is a decorator applied to the given cleansing function. In this case, the cleansing function removes $ and , characters. The integer conversion is wrapped around this cleansing.

We can use the integer conversion as follows:

>>> drop_punct("1,701")

1701

>>> drop_punct("97")

97

While this can encapsulate some sophisticated cleansing and converting into a very tidy package, the results are potentially confusing. The name of the function is the name of the core cleansing algorithm; the other function's contribution to the composite is lost.

As an alternative, we can use the integer conversion as follows:

def drop_punct(text):

    return text.replace(",", "").replace("$", "")

drop_punct_int = then_convert(int)(drop_punct)

This will allow us to provide a new name to the decorated cleaning function. This solves the naming problem, but the construction of the final function via the then_convert(int)(drop_punct) method is rather opaque.

It seems like we've reached the edge of the envelope here. The decorator model isn't ideal for this kind of design. Generally, decorators work well when we have a number of relatively simple and fixed aspects that we want to include with a given function (or a class). Decorators are also important when these additional aspects can be looked at as an infrastructure or a support, and not something essential to the meaning of the application code.

For something that involves multiple orthogonal dimensions, we might want to resort to Callable classes with various kinds of plugin strategy objects. This might provide something more palatable. We might also want to look closely at creating higher-order functions. We can then create partial functions with various combinations of parameters for the higher-order functions.
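One way the higher-order function alternative might look is sketched below. The clean_convert() function and its argument order are assumptions made for illustration; the point is only that partial() lets us name each combination of cleaner and converter explicitly.

```python
from functools import partial

def clean_convert(cleaner, converter, text):
    """Hypothetical higher-order function: convert, clean on failure, retry."""
    try:
        return converter(text)
    except ValueError:
        return converter(cleaner(text))

def drop_punct(text):
    return text.replace(",", "").replace("$", "")

# Each combination of the two orthogonal choices gets its own clear name.
drop_punct_int = partial(clean_convert, drop_punct, int)
drop_punct_float = partial(clean_convert, drop_punct, float)

print(drop_punct_int("1,701"))       # 1701
print(drop_punct_float("$1,701.5"))  # 1701.5
```

Both the cleaning and the conversion are visible where the final function is built, which addresses the naming concern noted for the decorator-based design.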

The typical examples of logging or security testing can be considered as the kind of background processing that isn't specific to the problem domain. When we have processing that is as ubiquitous as the air that surrounds us, then a decorator might be more appropriate.

Summary

In this chapter, we've looked at two kinds of decorators: the simple decorator with no arguments and parameterized decorators. We've seen how decorators involve an indirect composition between functions: the decorator wraps a function (defined inside the decorator) around another function.

Using the functools.wraps() decorator assures that our decorators will properly copy attributes from the function being wrapped. This should be a piece of every decorator we write.

In the next chapter, we'll look at the multiprocessing and multithreading techniques that are available to us. These packages become particularly helpful in a functional programming context. When we eliminate a complex shared state and design around nonstrict processing, we can leverage parallelism to improve the performance.