Python and the need for speed
nathan.ernst at gmail.com
Tue Apr 11 20:28:30 EDT 2017
I used to write Python modules in C++. Well, more accurately, I wrapped
already-written C++ APIs to expose them to Python using Boost.Python. This
wasn't due to performance issues, but to avoid reimplementing APIs.
That said, I believe Python gets a bad rap with regard to performance for a
variety of reasons, chief among them being: Python's prime audience tends
to be mathematical/scientific/statistical/financial people first, who are
developers by happenstance (I know this is a large generalization). Point
is, for these audiences, the prime driver is to get a result. Efficiency is
secondary. For instance, in a separate thread in the last week, someone was
asking if there was a faster way of doing something along these lines:
result = [(expensive_calc(x) + 1, expensive_calc(x)) for x in some_data]
A valid, yet sub-optimal response was suggested:
temp = [expensive_calc(x) for x in some_data]
result = [(x + 1, x) for x in temp]
A better response would have been:
temp = (expensive_calc(x) for x in some_data)
result = [(x + 1, x) for x in temp]
Note: the difference is subtle, but significant. The first suggestion
builds a list holding the entire temporary result, while the second uses a
generator that yields each value only as it is consumed. For small sizes of
"some_data", you're not likely to notice. For large sizes of "some_data",
this is huge.
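
As a rough sketch of the three forms above (expensive_calc and some_data
here are hypothetical stand-ins, and the call counter is only added for
illustration), the following counts how often the costly function runs:

```python
# Hypothetical stand-ins for the names used above.
calls = 0

def expensive_calc(x):
    """Placeholder for a costly computation; counts how often it runs."""
    global calls
    calls += 1
    return x * x

some_data = range(5)

# Original form: expensive_calc runs twice per element.
calls = 0
twice = [(expensive_calc(x) + 1, expensive_calc(x)) for x in some_data]
calls_twice = calls

# Generator form: each value is computed once, on demand, and no
# intermediate list is built.
calls = 0
temp = (expensive_calc(x) for x in some_data)
once = [(y + 1, y) for y in temp]
calls_once = calls

print(calls_twice, calls_once)  # 10 5
```

Both forms produce the same result list; only the number of calls and the
intermediate storage differ.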
Writing performant Python code is possible, but like writing performant
code in any other language, you need to be aware of what's happening. This
means paying attention to things that may cause memory allocations (which
are largely hidden from you in Python).
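
One way to make those hidden allocations visible is the standard library's
tracemalloc module. This is only a sketch under assumptions: expensive_calc,
the helper names, and the input size are made up for illustration, and the
exact numbers will vary by interpreter version and platform.

```python
import tracemalloc

def expensive_calc(x):
    # Hypothetical placeholder; returns a new float object per element.
    return x * 1.5

def with_list(data):
    # The whole intermediate list exists at once.
    temp = [expensive_calc(x) for x in data]
    return [(x + 1, x) for x in temp]

def with_generator(data):
    # Intermediate values are produced one at a time.
    temp = (expensive_calc(x) for x in data)
    return [(x + 1, x) for x in temp]

def peak_bytes(func, data):
    """Return the peak traced memory (in bytes) while func(data) runs."""
    tracemalloc.start()
    result = func(data)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del result
    return peak

some_data = range(50_000)
list_peak = peak_bytes(with_list, some_data)
gen_peak = peak_bytes(with_generator, some_data)
print(f"list: {list_peak:,} bytes, generator: {gen_peak:,} bytes")
```

Note that the final result list is the same size in both cases, so what
this measures is the extra cost of keeping the intermediate list alive.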
I worked on http://www.marketswiki.com/wiki/CMDX - in particular I wrote
most of the Migration Utility mentioned to migrate paper CDS trades to
standardized CDS contracts against CME. Most of the migration util was
written in native Python 2.5 (it was written in 2008) using a single
thread. Performance wasn't super critical, but desired. At the end of the
project, I was processing ~100K positions per second. Memory usage of the
app was constant and processing time of a portfolio scaled linearly with
the number of positions in the portfolio. Python wasn't the limiting factor
for the app - it was the write speed to the database (and we were using the
bcp interface of pysybase to write to a Sybase DB).
Basically, what I'm getting at is Python *can* be performant. It's also
easy to screw up and introduce non-obvious slowness. Threading in Python is
bad - don't bother (until we can get rid of the GIL, I doubt the situation
If you have a performance problem with Python, before you blame Python,
take a step back and look at your own code (or libraries you're using) and
ask yourself: "Is my code optimal?"
Yes, Python is not the fastest language/runtime in existence. But for
probably 99% of the people out there that complain about Python's speed,
there's probably plenty of suboptimal or outright wasteful code that they
should fix first, before complaining. For the other 1%, Python was probably
the wrong choice to begin with.
I don't intend this to be seen or implied as an attack or criticism of
anyone. I'm just trying to provide an insight into my experience of using
On Tue, Apr 11, 2017 at 3:58 PM, Mikhail V <mikhailwas at gmail.com> wrote:
> On 11 April 2017 at 16:56, Steve D'Aprano <steve+python at pearwood.info>
> > On Tue, 11 Apr 2017 07:56 pm, Brecht Machiels wrote:
> >> DropBox and
> >> Google seem to agree that there are no good solutions, since they are
> >> moving to Go.
> > That's a good solution! Maybe we should be writing extensions in Go,
> > instead of C. Or for maths-heavy work, using extensions written in Julia.
> Just my curiosity, I've always been interested in such question: are devs
> still writing extensions in C, I mean type in C code? Aren't they using
> some translator or IDE which at least hides the brackets and semicolons?
> I personally don't have problems with understanding low-level
> concepts of programming, but I find it pretty hard to see
> through the mangroves of brackets, asterisks and Co.