[Tutor] Efficient programming questions. Tuples vs Lists; Custom Objects vs Lists.
project5 at redrival.net
Mon Sep 25 14:42:52 CEST 2006
Wesley Brooks <wesbrooks <at> gmail.com> writes:
> Most of these are issues relating to a mix of speed of execution
> for the code, and scripting best practice.
Generally speaking, performance bottlenecks can be determined using the profile
module. Things often turn out differently than you might expect, so talking
about performance in general terms like "is a tuple better than a list" may not
be very useful.
> Firstly tuples vs lists. I'm guessing that lists use more memory than tuples
> as they provide more functions? Are they also more CPU intensive to use?
s = [(i, i+1, i+2, i+3, i+4, i+5, i+6) for i in xrange(500000)]
s = [[i, i+1, i+2, i+3, i+4, i+5, i+6] for i in xrange(500000)]
The second (lists) takes about 85 MB on my machine, while the tuple version
takes about 75 MB. Tuples seem a bit more memory efficient for this simple test,
but not dramatically so.
When it comes to CPU: lists and tuples have different capabilities, so I'm not
sure how you'd compare their performance in a generic way.
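If you want to measure the per-object difference directly rather than watching
total process memory, sys.getsizeof does it (note: getsizeof was added in a
later Python than this thread, so treat this as a sketch for newer versions):

```python
import sys

# Size of the containers themselves, not of the ints they refer to.
t = (1, 2, 3)
l = [1, 2, 3]

# A tuple stores exactly as many slots as it has items; a list carries
# extra bookkeeping (and may over-allocate) so that append() stays cheap.
print(sys.getsizeof(t))  # smaller
print(sys.getsizeof(l))  # larger
```

On CPython the tuple comes out smaller, which matches the rough MB figures
above.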
> Currently if I'm not likely to add or remove Items I use a tuple (eg, a
> coordinate in 3D space), but when I do I prefer using a list.
Tuples are immutable, meaning you have no choice if you need to
add/remove/modify items :).
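A quick demonstration of that, using your 3D-coordinate case:

```python
coord = (1.0, 2.0, 3.0)      # a point in 3D space as a tuple

try:
    coord[0] = 5.0           # tuples reject item assignment
except TypeError as exc:
    print("immutable:", exc)

# To "change" a tuple you must build a new one:
moved = (coord[0] + 1.0,) + coord[1:]
print(moved)                 # (2.0, 2.0, 3.0)

# A list accepts the same change in place:
lcoord = [1.0, 2.0, 3.0]
lcoord[0] = 5.0
print(lcoord)                # [5.0, 2.0, 3.0]
```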
> This leads on to another question: If you use an object many times, for
> instance a list, does the interpreter remember that each new object is a list
> and when a function is called on a list look at one section of memory which
> details the list functions, or for each new object does it dedicate a new
> section of memory to the functions of that object?
OO languages usually do, but let's test.
>>> s = [1, 2]
>>> t = [3, 4]
>>> id(s.append) == id(t.append)
True
Yep, same address - the method is stored once on the list type, not once per
instance.
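One caveat on that id() trick: each attribute access creates a fresh bound
method on the fly, and the comparison succeeds partly because the first one is
freed before the second is created, so they can reuse the same address. A
cleaner check is to compare the underlying function on the type itself:

```python
s = [1, 2]
t = [3, 4]

# Two bound methods held alive at the same time are distinct objects:
a, b = s.append, t.append
print(a is b)                            # False

# ...but both wrap the very same function, stored once on the list type:
print(type(s).append is type(t).append)  # True
```

So the conclusion stands: the methods live in one place per class, not per
object.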
> Secondly, a similar question to the first. A list object is something which is
> in the standard python library. I guess in a CPython distribution that this is
> C/C++ code being called by the python interpreter when the list object is
> used? If so then this would imply that a list object would be significantly
> quicker/less memory to use than an equivalent object scripted in python. I'm
Python implements a lot of performance-sensitive parts (the language itself, but
also much of the standard library) in C. If you wrote your own equivalent
functionality in Python (say, a brand new string class), it would probably be
slower. However, the point is that you don't write your own primitives: you use
high-level, optimized primitives provided by Python and build useful
functionality on top of them.
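You can see the gap with timeit, comparing the C-implemented builtin sum against
an equivalent hand-written loop (a rough sketch; exact ratios vary by machine):

```python
import timeit

def py_sum(seq):
    """The same reduction as the builtin sum(), written in pure Python."""
    total = 0
    for x in seq:
        total += x
    return total

data = list(range(10000))
assert py_sum(data) == sum(data)      # identical results...

t_builtin = timeit.timeit(lambda: sum(data), number=200)
t_python = timeit.timeit(lambda: py_sum(data), number=200)
print(t_builtin, t_python)            # ...but the builtin is typically much faster
```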
> within the list. My code would be a lot more elegant and easier to read if I
> used custom objects for some of these but I'm worried they would be much
> slower. Would it be appropriate to write a function that inherited the methods
Write a prototype and profile it. I don't know what you're trying to do, but I
do think it's in principle better to have custom objects which encapsulate
relevant behavior and data than to mess around with lists and procedures.
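For instance (a hypothetical sketch - the class name and methods are mine, not
from your code), the coordinate case could look like this; __slots__ keeps the
per-instance memory cost close to a tuple's by dropping the per-instance dict:

```python
class Point3(object):
    """A 3D coordinate with its behavior attached."""
    __slots__ = ('x', 'y', 'z')   # fixed attribute set: no per-instance dict

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def translated(self, dx, dy, dz):
        # Returns a new point, like building a new tuple would.
        return Point3(self.x + dx, self.y + dy, self.z + dz)

p = Point3(1.0, 2.0, 3.0).translated(1.0, 0.0, 0.0)
print(p.x, p.y, p.z)   # 2.0 2.0 3.0
```

Readers of p.translated(...) know what is going on; readers of p[0] + dx have
to guess.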
> Lastly why can't python be compiled? I understand that there are certain
It is compiled, but not to machine code. The dynamic nature of Python makes this
a difficult task. There are efforts to mitigate this (in order of decreasing
ease of use):
- Psyco can make certain kinds of code a lot faster. It's trivial to use, but
you'll have to profile in order to identify the bottlenecks. Apply Psyco to them
and see if it helps. If the bottleneck is some library, it probably won't.
- Pyrex is a Python-like language that compiles to C. Handy if you want to
get some performance-sensitive module in C with as little effort as possible.
- IronPython compiles to .Net and from what I've read performs faster in certain
tasks than CPython. Nothing revolutionary though.
- There is a Python-like language with static typing available for .Net, called
Boo. If used with static typing, it will have C#-ish performance IIRC.
- PyPy is a reimplementation of Python in Python that aims to eventually be
faster using some magic bootstrapping I don't quite understand :). It's not
finished and currently slower than CPython.
- ShedSkin is also a Python-to-C++ compiler, but I don't know what its current
status is.
> situations where speed is critical? Is CPython's interpreter effectively a C
The typical course of action is to write in Python, identify the performance
problems and see what you can do about them. Often the problem can be
ameliorated by Psyco, algorithm improvements, implementing some caching
mechanism, switching to a different module (e.g. use a different DB, or another
XML parser), and as a last resort rewriting the performance-sensitive part in
Pyrex.
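The caching idea is often the cheapest win. A minimal memoization decorator
(hand-rolled here, since it's only a few lines) caches results of a pure
function so an expensive lookup runs at most once per distinct argument:

```python
def memoized(func):
    """Cache results of a pure function, keyed by its positional arguments."""
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    wrapper.cache = cache   # exposed so callers can inspect or clear it
    return wrapper

calls = []

@memoized
def slow_lookup(key):
    calls.append(key)       # stands in for an expensive DB/XML/parsing step
    return key * 2

slow_lookup(21)
slow_lookup(21)             # served from the cache; the body runs only once
print(len(calls))           # 1
```

This only makes sense for functions whose result depends solely on their
arguments, and whose arguments are hashable.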
> program that carries out C functions as requested in the script? If so why is
> it not possible to have a program that reads in the whole python script,
> translates it to C and compiles it? Is it simply that the C functions are
Because the script as you see it might not be what is executed. Python programs
can be modified dynamically at runtime, e.g. I might add a method to an object
based on user input, or add complete new classes, etc. Or I might call a method
on a certain object without knowing what that object is - the interpreter needs
to examine the object at runtime to determine whether the method is available.
The potential
compiler would have to handle all of these cases, meaning you'd end up with...
well, CPython. Typical compiler efforts in the past have limited the flexibility
of the language.
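A toy illustration of both points - a method attached at runtime under a name
no compiler could have seen, then dispatched by name without knowing the
object's type up front:

```python
class Plugin(object):
    pass

def greet(self, name):
    return "hello, " + name

# The method name could just as well have come from user input or a config file:
method_name = "greet"
setattr(Plugin, method_name, greet)   # Plugin grows a method at runtime

p = Plugin()
print(p.greet("world"))               # hello, world

# Dispatch by name, checking at runtime whether the method exists:
handler = getattr(p, method_name, None)
if handler is not None:
    print(handler("again"))           # hello, again
```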