Python (2016)

CHAPTER 20: Python Idioms

Idioms are not limited to natural languages; programming languages have idioms, too. Python is a great example, being by nature a strongly idiomatic language. A programming idiom is a conventional way of expressing a recurring construct that is not already a built-in feature of the language. To be “fluent” in “speaking” a programming language, one has to understand the idioms associated with it; this is what allows the programmer to write code the way the language is meant to be “spoken”. Because Python’s philosophy favors a single obvious way to do things, it relies strongly on its idioms for structure.

Basic Principles

●         Aside from conflicting programming language philosophies, there are a couple of conflicting acronyms: EAFP and LBYL. EAFP stands for “[It’s] Easier to Ask Forgiveness than Permission”, while LBYL means “Look Before You Leap”. While LBYL may be more useful in real life, it is EAFP that is honored in Python. Specifically, it encourages the programmer to use exceptions for error checking: any action that may fail should be put inside a try...except block.

●         For managing resources such as files, use context managers (to be discussed in detail in a later chapter). For ad hoc cleanup, use a finally clause. However, it is preferable to write your own context manager to encapsulate the cleanup.

●         Do not use getter-setter pairs; instead, use properties.

●         For dynamic records, it is better to use dictionaries. For static records, classes are preferred. For even simpler records, collections.namedtuple from the Python standard library can be a good help. If records have the same fields all the time, this is best made explicit in a class. If the presence of the fields may vary, a dictionary is in order.

●         For throwaway variables, use the underscore. This applies when discarding some of the return values when a tuple is returned, or when indicating that a specific parameter is being ignored. You may use *_ and **__ to discard the positional and keyword arguments passed to a function (the two names must differ, because a function cannot have two parameters with the same name). These correspond to the common *args, **kwargs parameters, except that they are explicitly discarded. You may also place them after the named or positional parameters that you do use. This allows you to use some arguments while discarding any excess.

●         Use implicit True/False, except when you need to distinguish between falsy values such as None, 0, and []. In that case, use an explicit check, such as == 0 or is None. And yes, that was “falsy” as opposed to “false”: a falsy value is a non-Boolean that evaluates as false in a Boolean context. (In the same vein, a non-Boolean that evaluates as true is called “truthy”.)

●         While else is completely optional, it is good practice to use it after try, while, and for as well, not just after if.
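Several of the principles above can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions: the names parse_port, Temperature, Point, and first_only are hypothetical and were made up for this example.

```python
from collections import namedtuple


def parse_port(text, default=8080):
    """EAFP: attempt the conversion and handle failure instead of pre-checking."""
    try:
        port = int(text)
    except (TypeError, ValueError):
        # The conversion failed, so fall back to the default.
        return default
    else:
        # Runs only when the try block raised no exception.
        return port


class Temperature:
    """A property in place of a get_celsius()/set_celsius() pair."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value


# A static record with fixed fields; a dictionary would suit fields that vary.
Point = namedtuple("Point", ["x", "y"])


def first_only(first, *_, **__):
    """Use the first argument and explicitly discard any extras.

    The two throwaway names differ because duplicate parameter names
    are a syntax error.
    """
    return first


# Throwaway variable: keep the quotient, discard the remainder.
quotient, _ = divmod(17, 5)
```

Callers of Temperature read and assign t.celsius like a plain attribute, yet the validation in the setter still runs on every assignment.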

Imports

As good coding practice, import only modules and not names (such as classes or functions). Importing a name creates a new (name) binding, which may not always stay in sync with the existing binding. For example, for a module m that defines a function f, importing the function using from m import f means that m.f and f may differ if either of them is assigned to (creating a new binding).

In practice, this is often ignored, especially in smaller-scale code, because modifying modules after importing them is relatively rare. In such cases, both functions and classes may be imported from modules so that they can be referred to without a prefix. For large-scale robust code, however, this is an important convention, because ignoring it can create bugs that are very hard to find.
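The way m.f and f can drift apart is easy to reproduce. The sketch below builds a stand-in module named m in memory so that it runs on its own; the module and its function f are hypothetical and exist only for this demonstration.

```python
import sys
import types

# Build a tiny stand-in module named "m" so the example is self-contained.
m = types.ModuleType("m")
m.f = lambda: "original"
sys.modules["m"] = m

from m import f  # creates a new binding named f in this namespace

# Rebinding the attribute on the module does not touch the local name f.
m.f = lambda: "replaced"

print(f())    # calls the function object that was imported by name
print(m.f())  # looks the attribute up on the module at call time
```

Code that always writes m.f() picks up the replacement, while code that imported the bare name keeps calling the old function.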

To reduce typing, the programmer can use a renaming import to abbreviate a longer module name.

import this_module_has_a_very_long_name as n1
n1.f()  # this is a lot easier than a very long name, while just as robust

Take note that using from in order to import subpackages or submodules from a specific package is completely acceptable. Here is an example:

from O import sm  # completely alright
sm.f()
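Both import styles can be tried with a package from the standard library. The following is a small sketch using xml.etree, chosen here only because it ships with Python; the XML snippet is made up.

```python
import xml.etree.ElementTree as ET   # renaming import for a long module path
from xml.etree import ElementTree    # importing a submodule from a package

# Either name refers to the same module object, so both stay in sync.
same_module = ET is ElementTree

root = ET.fromstring("<config><item/></config>")
print(root.tag, same_module)
```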

Operations

●         To swap values, simply use b, a = a, b

●         To access an attribute (particularly to call a method) on a value that could be an object or possibly None, use the following:

a and a.x

a and a.f()

●         For substring checking, you can use the in operator.
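The three operations above might look as follows in practice. The User class and its greeting method are hypothetical, introduced only to have an object with an attribute and a method.

```python
class User:
    def __init__(self, name):
        self.name = name

    def greeting(self):
        return "Hello, " + self.name


a, b = 1, 2
b, a = a, b  # swap without a temporary variable

user = User("Ada")
missing = None

name = user and user.name            # attribute access on a real object
no_name = missing and missing.name   # short-circuits; .name is never evaluated
hello = user and user.greeting()     # method call guarded the same way

has_substring = "idiom" in "Python idioms"
```

Keep in mind that a and a.x yields a itself whenever a is falsy, so for a value such as 0 or an empty string the result is that value rather than None.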

Data Types

You can use enumerate() to keep track of the index while iterating over an iterable:

for i, x in enumerate(l):
    # ...

This is as opposed to the following technique, which is considered an anti-idiom:

for i in range(len(l)):
    x = l[i]
    # ...

The second version goes from the list to the numbers and then back to the list, which is unpythonic.
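Here is a short runnable sketch of enumerate(), including its optional start argument for counting from a number other than zero. The list of colors is made up for the example.

```python
colors = ["red", "green", "blue"]

# Index and value arrive together; no range(len(...)) is needed.
labels = []
for i, color in enumerate(colors):
    labels.append(f"{i}: {color}")

# start=1 gives human-friendly numbering without manual arithmetic.
numbered = [f"{n}. {color}" for n, color in enumerate(colors, start=1)]

print(labels)
print(numbered)
```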

Finding the first matching element

Python sequences have an index method, which returns the index of the first occurrence of a specific value in the sequence. To find the first occurrence of a value that satisfies a condition instead, use next with a generator expression. This is demonstrated in the following code:

try:
    x = next(i for i, n in enumerate(l) if n > 0)
except StopIteration:
    print('No positive number')
else:
    print('The first positive number index is', x)

When the value is needed and not its index of occurrence, this can be directly obtained through the following technique:

try:
    x = next(a for a in l if a > 0)
except StopIteration:
    print('No positive number')
else:
    print('The first positive number is', x)

The reason the code is constructed this way is two-fold. First, exceptions let you signal “no match found”, which solves the semipredicate problem: since the value itself is returned rather than an index, no return value can be reserved to mean “not found”. Second, generator expressions let the programmer use an expression without the need for a lambda (an “anonymous” function, one not bound to a name) or for the introduction of new grammar.
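As a variation on the technique above, and not a replacement for it, the built-in next also accepts a default value as a second argument, which avoids the try block. The sketch below uses None as that default; this is a reasonable choice only under the assumption that None can never be a legitimate element, since otherwise the semipredicate problem returns.

```python
values = [-3, 0, 4, 7]

# The generator expression needs its own parentheses when a default follows.
first_positive = next((v for v in values if v > 0), None)
none_found = next((v for v in [-1, -2] if v > 0), None)

# The same pattern works for the index of the first match.
first_index = next((i for i, v in enumerate(values) if v > 0), None)

print(first_positive, none_found, first_index)
```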

Truncating

In the case of mutable sequences, use the del statement instead of reassigning the variable to a slice:

del l[j:]
del l[:i]

The anti-idiom for this is as follows:

l = l[:j]
l = l[i:]

The plain reason for this is that del makes your intention of truncating clear. More subtly, slicing builds a new list and rebinds the variable to it, so any other reference still points to the original, untruncated list, and the leftover data can only be reclaimed once nothing refers to it any longer. Deleting instead modifies the list in place, which is faster than creating a slice and then assigning it to the existing variable. It also lets Python release the deleted elements immediately instead of leaving them for later collection.
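The difference between the two approaches becomes visible as soon as a second name refers to the same list. The following sketch uses made-up variable names to compare them side by side.

```python
data = list(range(10))
alias = data            # a second name for the same list object

del data[5:]            # truncates the one shared list in place

data2 = list(range(10))
alias2 = data2
data2 = data2[:5]       # builds a new list and rebinds data2 only

print(alias)    # truncated, because alias and data are the same object
print(alias2)   # still all ten elements; only data2 points at the slice
```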

Admittedly, there are some cases where you would want two slices of the same list, though this is rare in basic programming. It is rarer still to want a reference to the entire list and then replace the original variable with a slice, without changing the other reference. This improbability is demonstrated in the following code:

m = l
l = l[i:j]

Instead of this, you can just try:

m = l[i:j]

Sorted list from iterables

You may create sorted lists directly from iterables, without having to make a list first and then sort it. This also covers sets and dictionaries (sorting a dictionary sorts its keys):

x = {'b', 'a', ...}
y = sorted(x)
z = {'a': 1, ...}
y = sorted(z)
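A complete, runnable version of the same idea is sketched below. The names and ages are made-up sample data; the key and reverse arguments are standard parameters of sorted().

```python
names = {"carol", "alice", "bob"}              # a set
ages = {"carol": 35, "alice": 30, "bob": 25}   # a dictionary

sorted_names = sorted(names)          # a new list; the set is unchanged
sorted_keys = sorted(ages)            # iterating a dict yields its keys
by_age = sorted(ages, key=ages.get)   # keys ordered by their values

# Any iterable works, including a generator expression.
squares_desc = sorted((n * n for n in range(4)), reverse=True)

print(sorted_names, by_age, squares_desc)
```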

Tuples

For constant sequences, use tuples. While not a hard rule, this helps the code become more Pythonic by making the intention of the programmer clear.

Strings

When checking for substrings, use the in operator. However, do not use in against a string to check whether a value is one of several single characters. This will match substrings, returning spurious matches. Instead, use a tuple of the valid values. Let’s take the following code as an example of what not to imitate:

def valid_sign(sign):
    return sign in '+-'

This is wrong, since it will also return True for sign == '+-'. Instead of that, use a tuple as follows:

def valid_sign(sign):
    return sign in ('+', '-')
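Running both versions side by side shows the spurious matches. In this sketch the anti-idiom is renamed valid_sign_substring (a hypothetical name) so that the two functions can coexist.

```python
def valid_sign_substring(sign):
    # Anti-idiom: a substring test against the string '+-'.
    return sign in "+-"


def valid_sign(sign):
    # Membership test against a tuple of the allowed values.
    return sign in ("+", "-")


for candidate in ("+", "-", "+-", ""):
    print(repr(candidate), valid_sign_substring(candidate), valid_sign(candidate))
```

Besides '+-', the empty string also slips through the substring test, because an empty string is a substring of every string.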

Building Strings

If you need to build a long string in increments, first build a list and then join it using '' (or using newlines, if you are building a text file; do not forget to add a final newline in this case). This will be both clearer and faster than appending directly to a string, which is oftentimes slow.

There are, however, a few optimizations in CPython that make simple appending to strings fast. Appending strings in CPython 2.5+ and appending bytestrings in CPython 3.0+ are both fast, but for building Unicode strings (called unicode in Py2.X and str in Py3.X), joining is faster. Be aware of this when you expect to manipulate a lot of strings, and profile your code accordingly.

Let’s take the following code as an example of what you shouldn’t do:

x = ''
for y in l:
    x += y

The above code makes a new string on each iteration, since strings are immutable. Instead of that, use the following code:

# ...
# x.append(y)
s = ''.join(x)

You may also use generator expressions, which may prove to be very efficient:

s = ''.join(f(y) for y in l)

You can also use StringIO for mutable string-like objects.
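The approaches in this section can be compared in one short Python 3 sketch. The word list is made-up sample data, and io.StringIO is the Python 3 location of StringIO.

```python
import io

words = ["build", "strings", "with", "join"]

# Collect the pieces in a list, then join once at the end.
parts = []
for w in words:
    parts.append(w.upper())
joined = "".join(parts)

# Joining with newlines for a text file, plus the final newline.
lines = "\n".join(words) + "\n"

# io.StringIO behaves like an in-memory text file that can be written to.
buffer = io.StringIO()
for w in words:
    buffer.write(w)
from_buffer = buffer.getvalue()

print(joined)
print(from_buffer)
```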

Dictionaries

You can use the following code to iterate over keys:

for z in y:
    ...

For iterating over values in Python 3:

for x in y.values():
    ...

For iterating over values in Python 2 (note that dict.values() returns a copy in Py2.X):

for x in y.itervalues():
    ...

For iterating over keys and values in Python 3:

for z, x in y.items():
    ...

For iterating over keys and values in Python 2 (note that dict.items() returns a copy in Py2.X):

for z, x in y.iteritems():
    ...

On the other hand, anti-idioms would be:

●         for z, _ in y.items(): instead of for z in y:

●         for _, x in y.items(): instead of for x in y.values():
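Put together, the three Python 3 iteration styles look like this. The stock dictionary is made-up sample data.

```python
stock = {"apples": 3, "pears": 0, "plums": 7}

# Keys only: iterate over the dictionary itself.
keys = [name for name in stock]

# Values only: use values() rather than unpacking items() and discarding the key.
total = sum(count for count in stock.values())

# Keys and values together: items() is the right tool when both are used.
in_stock = [name for name, count in stock.items() if count > 0]

print(keys, total, in_stock)
```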