Performant Python
There are many ways to improve Python's performance, such as cython or writing C modules and calling them from Python. These are invasive methods that require a specific setup or build. However, there are a number of ways to improve Python's performance that are non-invasive and require little effort.
Python has been designed to be user-friendly and intuitive and simpler constructs are presented as defaults. Sometimes performance is needed and Python has many performant constructs that provide significant improvements in memory and execution time.
Memory
Python uses a lot of memory but there are alternatives to default data structures that can reduce memory usage.
Counting memory in Python is complicated, there are a number of ways to do it, often producing different results, even if off by few bytes. As general rule, I will try to get the exclusive memory of an object.
I am going to use the objsize library to count the exclusive memory in use by an item; for generators I will use a custom script and sys.getsizeof since objsize does not work with generators.
Tuples
Tuples are immutable lists.
('hello', 'world')
Tuples take a bit less memory than lists. With more elements, the relative memory savings becomes smaller:
Tuples can also replace dictionaries, infact dict() can produce a dictionary from a tuple, in case one needs that transformation:
t = (('company', 'Strangemachines'), ('city', 'Amsterdam'))
dict(t) # {'company': 'Strangemachines', 'city': 'Amsterdam'}
Tuples provide significant savings when replacing dictionaries:
Generators
Generators are lazy lists. Instead of computing and putting in memory the entire list, they do so one item at a time and only when the item is requested. Working with generators is more complicated than regular lists, because they can't be indexed or sliced.
g = (x for x in data)
Generators are smaller than large tuples, but bigger than small tuples:
Slots
Slots are a mechanism that reduces the memory used by class instances. Slots disable setting dynamic properties on instances and they must be declared when defining the class.
class Users:
__slots__ = ('username', 'email')
def __init__(self, username, email):
self.username = username
self.email = email
Comparing the memory for small classes, we can see that a slotted class will use at least 50% less memory. Larger clases have smaller savings, but still around half.
Named tuples
Named tuples provide a way to create objects without declaring a class:
from collections import namedtuple
Point = namedtuple('Point', 'x', 'y')
p = Point(0, 1) # we can access p.x and p.y
Many Python performance articles encourage to use namedtuples for objects on the fly; at Strangemachines we know enough metaprogramming and maths to know better: a namedtuple is smaller than a class, but a slotted class is smaller than a namedtuple.
Since Namedtuple is used to create inline objects, we'll need to use type to create a slotted class, in order to replace a namedtuple:
def init(self, x, y):
self.x = x
self.y = y
Point = type('Point', (), {'__slots__': ('x', 'y'), '__init__': init})
Iteration time
In Python it's often possible to save iteration time by using alternatives to for.
As a note of caution, notice that for in Python is not slow by itself. It becomes slower in cases where there is an optimized alternative for that specific type of iteration.
For example, calling append to add elements to a list will make for slower than map, but the decrease in performance comes from the additional functions calls of the for loop.
Tuples
Iterating lists is slightly faster than tuples.
Generators
Iterating generators is much faster than lists.
Map
Map is a function that applies a function to the elements of an iterable and produces a new iterable with the results of the function calls.
def plus_one(x):
return x + 1
map(plus_one, [1, 2, 3, 4, 5]) # [2, 3, 4, 5]
With for it would look like this:
result = []
for i in [1, 2, 3, 4, 5]:
result.append(i+1)
Map is faster than for, but only if the map is not converted to a list (probably because of the extra function call):
Filter
Filter is a function that applies a function to the elements of an iterable, producing a new iterable that contains only the element for which the function returned a truthy value.
def even(number):
if number % 2 == 0:
return True
filter(even, [1, 2, 3, 4, 5]) # [2, 4]
Again the equivalent with for:
result = []
for i in [1, 2, 3, 4, 5]:
if i % 2 == 0:
result.append(i)
List Comprehensions
List comprehensions are mechanism to process an iterable and create a list the same time. Common cases include filtering or applying a function to the elements of the iterable.
[city.name for city in cities]
List comprehensions are slower than map and filter and using generators instead of list comprehension does not change the outcome:
Even with a lambda function, map is faster:
Dictionary comprehensions
Comprehensions also work on dictionaries:
{city.name: city for city in cities}
Dictionary comprehensions are slower than map and filter as well:
Conclusion
I know it's a lot of graphs, so I'll summarize them:
- Tuples save a bit of memory when replacing lists
- Tuples save lots of memory when replacing dictionaries
- Generators save memory only for large tuples, but are faster to iterate
- Slotted classes save a lot of memory
- Namedtuples are better than classes but worse than slotted classes
- Map and filter are sometimes faster than comprehensions