Real Problems with Python
neelk at cswcasa.com
Wed Feb 9 13:56:11 EST 2000
Python isn't perfect, and I enjoy thoughtful criticism. But I am sick
to death of the -stupid- whitespace flamewar, since it's a) not a
problem, and b) not thoughtful.
So in an effort to throttle that monster, here's a list of half-a-dozen
*real* problems with Python, along with what the current workarounds
are, and my subjective assessment of the prospects of them being
solved in the long term. Enjoy, argue, whatever.
1. Reference counting memory management
If you need to do anything with any sort of sophisticated cyclic
data structures, then reference-counting is deeply annoying. This
is not something that someone in a high-level language should have
to worry about!
In the long run, the solution is to use a conservative garbage
collection algorithm (such as the Boehm collector), and to use
reference counts to make sure that finalizers are called in
topological order. (Cyclic references with finalizers have no good
solution, period, so it's not worth worrying about them imo.)
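The kind of structure that defeats pure reference counting takes only a few lines to build. The sketch below uses modern CPython, which later grew a cycle detector (the gc module) on top of refcounting, so the leak that a purely refcounted interpreter would suffer is made visible instead; the Node class is a hypothetical example, not from the original post:

```python
import gc

class Node:
    """A tree node whose parent back-reference creates a cycle."""
    def __init__(self, name):
        self.name = name
        self.children = []
        self.parent = None

    def add(self, child):
        child.parent = self          # child -> parent -> child: a cycle
        self.children.append(child)

root = Node("root")
root.add(Node("leaf"))
del root                             # refcounts never reach zero; pure
                                     # refcounting alone would leak both nodes

# gc.collect() returns the number of unreachable objects found,
# which here includes both nodes of the cycle.
collected = gc.collect()
print(collected >= 2)                # True
```

With only reference counts, `del root` would free nothing, which is exactly why a tracing collector (Boehm or otherwise) is needed for cyclic data.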
Workarounds:
o Use JPython. It hijacks Java's garbage collection, so there are
no problems with cyclic data structures. It doesn't ever call
__del__ methods, though I don't think this is a problem in
practice.
o Use John Max Skaller's Viper
Viper is an interpreter written in OCaml, so like JPython it
uses the host language's garbage collection. It's still an
alpha, though worth looking at if only to learn OCaml. :)
o Use Neil Schemenauer's gc patch.
Neil Schemenauer has added support for using the Boehm garbage
collector in Python. It uses reference counts in addition to the
gc for handling finalization, so things like file objects get
closed at the expected times, while still collecting cyclic
objects.
Prognosis:
Excellent. It looks quite likely that Python 3K will have
gc, and in the meantime Neil Schemenauer's patch is entirely
usable.
2. Lack of lexical scoping
Tim Peters disagrees, but I miss it a lot, even after using Python
for years. It makes writing callbacks harder, especially when
dealing with Tkinter and re.sub, os.path.walk, and basically every
time a higher-order function is the natural solution to a problem.
(Classes with __call__ methods are too cumbersome, and lambdas too
weak, when you need a small function closure that's basically an if
statement.)
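The workaround of that era was the default-argument trick; the closure patch mentioned below eventually landed as nested scopes (standard since Python 2.2), so both forms now work. A sketch of the before and after, with make_adder as an illustrative name:

```python
# Without lexical scoping, an inner function could not see the enclosing
# frame's locals, so pre-2.1 code smuggled them in as default arguments:
def make_adder_old(n):
    return lambda x, n=n: x + n   # the classic default-argument trick

# With lexical scoping, the closure captures n directly:
def make_adder(n):
    return lambda x: x + n

print(make_adder_old(3)(1))   # 4
print(make_adder(3)(1))       # 4
```

The default-argument version works, but it pollutes the function's signature and silently breaks if a caller passes the extra argument, which is the kind of cumbersomeness the post is complaining about.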
There's another, more subtle problem -- much of the work that's
been done on optimizing and analyzing highly dynamic languages has
come from the Scheme/ML/Lisp world, and almost all of that work
assumes lexically-scoped environments. So using well-understood
optimization techniques is made harder by the difference.
Workarounds:
o Greg Ewing has a closure patch that makes functions and
lambdas work without creating too much cyclic garbage.
Prognosis:
Pretty good. It looks like the consensus is gelling around adding
it eventually, if only to shut up complainers like me. :)
3. Multiple inheritance is unsound
By 'unsound' I mean that a method call to a method inherited by a
subclass of two other classes can fail, even if that method call
would work on an instance of the base class. So inheritance isn't a
type-safe operation in Python. This is the result of using
depth-first search for name resolution in MI. (Other languages,
like Dylan, do support sound versions of MI at the price of
more involved rules for name resolution.)
This makes formal analysis of program properties (for example,
value flow analyses for IDEs) somewhere between 'harder' and
'impossible'.
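The failure mode is the classic diamond. The sketch below runs under modern Python, whose C3 linearization (adopted in 2.3) fixes exactly this case; the comments note what the old depth-first rule would have done. The class names are illustrative:

```python
# Diamond inheritance: D inherits from B and C, both subclasses of A.
# C overrides A.whoami; B does not.
class A:
    def whoami(self):
        return "A"

class B(A):
    pass

class C(A):
    def whoami(self):
        return "C"

class D(B, C):
    pass

# Classic (pre-2.2) depth-first lookup searched D, B, A, C -- so it found
# A.whoami before ever reaching C's override.  Modern Python's C3 MRO
# puts C before A, so the override is respected:
print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
print(D().whoami())                         # 'C'
```

Under depth-first lookup, D silently gets A's behavior even though C deliberately replaced it, which is the unsoundness described above.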
Workarounds:
o Don't use multiple inheritance in Python.
Prognosis:
I don't think there's any hope of this ever being fixed. It's just
too much a part of Python, and there isn't much pressure to fix
it. Just avoid using MI in most circumstances.
4. Lack of support for highly declarative styles of programming
It's often claimed that Python reads like pseudocode, but
very often when people are describing algorithms the natural
description is "find the solution that satisfies the following
conditions" rather than "nest for loops five deep".
If you've done any Prolog (or even SQL) programming you know how
powerful the constraint-solving mindset is. When a problem can be
represented as a set of constraints, then writing out the solution
manually feels very tedious, compared to telling the computer to
solve the problem for you. :)
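List comprehensions, which did eventually land in Python 2.0, show what this constraint-style reads like. A sketch, stating "find all Pythagorean triples up to 20" as conditions rather than as hand-rolled nested loops:

```python
# "Find all (x, y, z) with x <= y <= z <= 20 and x^2 + y^2 == z^2",
# written as a set of constraints rather than five nested for loops:
n = 20
triples = [(x, y, z)
           for x in range(1, n + 1)
           for y in range(x, n + 1)
           for z in range(y, n + 1)
           if x * x + y * y == z * z]
print(triples)
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```

The loops are still there underneath, but the reader sees the specification of the answer, not the search procedure.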
Workarounds:
o Greg Ewing has implemented list comprehensions for Python. This
is a big increase in expressive power, since it starts to permit
the use of a constraint-solving style.
o Christian Tismer's Stackless Python patch enables the use of
first-class continuations in regular Python code. This lets
people easily and in pure Python create and experiment with all
sorts of funky control structures, like coroutines, generators,
and nondeterministic evaluation.
Prognosis:
Pretty good, if Stackless Python makes it in, along with list
comprehensions. (Personally, I suspect that this is the single most
important improvement possible to Python, since it opens up a
whole new category of expressiveness to the language.)
5. The type/class dichotomy, and the lack of a metaobject system.
C types and Python classes are both types, but they don't really
understand each other. C types are (kind of) primitive objects, in
that they can't be subclassed, and this is of course frustrating
for the usual over-discussed reasons. Also, if everything is an
object, then subclassing Class becomes possible, which is basically
all you need for a fully functional metaobject protocol. This
allows for some extremely neat things.
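The "subclass Class" idea eventually became real: in modern Python, a metaclass is just a subclass of type that intercepts class creation, which is the core of a metaobject protocol. A minimal sketch, with Registered and Plugin as hypothetical names:

```python
# A metaclass that records every class created with it -- the kind of
# "extremely neat thing" a metaobject protocol enables (e.g. plugin
# registries, interface checking, automatic instrumentation).
class Registered(type):
    registry = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        Registered.registry.append(name)   # runs at class-definition time
        return cls

class Plugin(metaclass=Registered):
    pass

class JSONPlugin(Plugin):   # inherits the metaclass, so it registers too
    pass

print(Registered.registry)  # ['Plugin', 'JSONPlugin']
```

At the time of the post, the Don Beaudry hook was the only way to approximate this; the type/class unification in Python 2.2 made it the sketch above.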
Workarounds:
o Use Jim Fulton's ExtensionClasses to write C extensions that
can be subclassed in Python code.
o Read GvR's paper "Metaclasses in Python 1.5" to write metaclasses
using the Don Beaudry hook. It's not as nice as just subclassing
Class, though.
Prognosis:
Good. The current workarounds are messy, but workable, and
long-term there's a real determination to solve the problem once
and for all.
6. The iteration protocol
The iteration protocol is kind of hacky. This is partly a function
of the interface, and partly due to the way the type-class
dichotomy prevents arranging the collection classes into a nice
hierarchy.
The design of the iteration protocol also makes looping over
recursive data structures (like trees) either slow, if done in the
obvious and comprehensible way, or clumsy and weird-looking, if
you try to define iterator objects.
An example: Try writing a linked list class that you can iterate
over using a for loop. Then try to write a tree class that you
can iterate over in preorder, inorder, and postorder.
Then make them efficient, taking no more than O(n) time and O(1)
space to iterate over all elements. Then try to reuse duplicated
code. Is the result beautiful?
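The "better way" Guido was working on arrived as the iterator protocol and generators (Python 2.2 onward), which dissolve most of this exercise. A sketch of the tree part, with Tree as an illustrative class; note that recursive generators use O(depth) space, not O(1), so even this does not fully meet the challenge:

```python
class Tree:
    """A binary tree whose traversals are plain generator methods."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

    def preorder(self):
        yield self.value                      # node first...
        for child in (self.left, self.right):
            if child is not None:
                yield from child.preorder()   # ...then each subtree

    def inorder(self):
        if self.left is not None:
            yield from self.left.inorder()
        yield self.value                      # node between its subtrees
        if self.right is not None:
            yield from self.right.inorder()

t = Tree(2, Tree(1), Tree(3))
print(list(t.preorder()))  # [2, 1, 3]
print(list(t.inorder()))   # [1, 2, 3]
```

Each traversal is a direct transcription of its recursive definition, duplicated code and explicit iterator objects both gone, which is roughly the outcome the post was hoping for.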
Workarounds:
o Implement iterator objects and just eat the ugliness of the
implementation for the sake of a clean interface.
o Don't use trees or graphs, or at least don't expect to use
them with the standard loop constructs. :)
o Wait for Guido to invent a better way: Tim Peters has said on the
newsgroup that GvR is working on a better design.
Prognosis:
Good. There aren't any good short-term fixes, but the long term
outlook is probably fine. We aren't likely to get Smalltalk-style
blocks, though.
Neel