Homogeneous Sequences in Python?

Tue Dec 26 19:11:00 EST 2000

It seems that we have recently has yet another discussion about what
kind of performance improvements might be made to the Python
interpreter. It seems that there is some consensus that there are
perhaps a lot of little things that could be done for special cases, but
the big "bang for the buck" stuff has been done. For the most part, I
find that Python is fast enough as it stands for most of my needs, but
that it does fall down flat for a lot of what I do: numerical computing.

A lot of the performance hit comes from what it know to be a limitation
of virtually all interpreted languages: slow looping. While this is
true, I think Python suffers more because of it's extreme dynamicism.

The classic answer to why Python cannot be easily made faster, even if
it were compiled, is that because of Python's dynamic nature, an
expression like "x = a + b" cannot be compiled, because it is not known
until runtime what types a and b are, and what it means to __add__
them. I am a BIG fan of Python's dynamic nature, so I would not like to
see any of that flexibility go away.

I have seen proposals for "optional static typing" that would address
this issue (I believe Guido is talking about it for Py3k). That would
certainly help, but it could be somewhat ugly as well. What occurs to me
are two observations:

1) the occasional staticly typed integer is not going to make much
difference. What would make a difference is a large collection of
integers, for example

2) More often than not, when I build a large sequence, it is
homogeneous, that is a large list (or tuple, whatever) of the same type:
maybe a list of strings ( file.readlines() ), a list of integers
(range(n) ), or even a list of lists.

So, what we need to speed things up is to have a way for Python to
process large sequences of the same type without having to type check
each item in the sequence.

The simple way to implement this is to have an optionally static type
declaration for sequence: list.is_list_of_integers, or something like
that. Then when this list is processed, the interpreter would know
what it was dealing with. It then struck me that it could certainly be
done automatically, on-the-fly: Every time an item was added to a
sequence, it's type would be checked, and some kind of "homogeneous
flag" would be set:

a = [] # type unknown

a.append(4) # homogeneous list of integers

a.append(56) # still a homogeneous list of integers

a.append(4.5) # Not homogeneous

I have not idea how much this would slow append operations, but I
suspect much less than the potential speed up of know that a list is
homogeneous when it is. A plus of doing this automatically, is that it
would take NO changes to existing code to take advantage of the new
feature.

Would this really speed up Python code much, even for large homogeneous
sequences? I have no idea. I am in no way qualified to make that
judgment, I'd like to hear what folks think.

I suspect that it won't speed up straight, run-through-the-interpreter
code enough to make it worth it, but it could have some real benefits in
three places:

1) some of the sequence oriented operations that are being introduced:
list comprehensions, and the possible future Elementwise/Objectwise
Operators (see PEP # 225)

2) Py2c, the Python to C translator. I think a really important element
of Python's future may be Py2C (and perhaps eventually a Python compiler
or JIT compiler). It seems that has been the direction of number of
interpreted languages. With optional static typing and compilation,
LISP can be as fast (or faster) that C++ for numerics. MATLAB now has a
translator to C++ that can create faster code and fully compiled stand
alone applications. That has almost eliminated the need to write
external functions in C. Py2C can only be so fast without some kind of
static typing. I think homogeneous sequences could do it. Right now Py2C
can convert "for i in range(1000)" into a fast C loop, but can't do
anything with "for i in list_of_integers", If it had a way of knowing
that it was a list of integers, it could be highly optimized.

3) Extensions that you have to pass a lot of data to. An example is
wxPython. In order to draw a polyline with a couple of thousand points,
for example, you have to pass a list of 2-tuples of integers of the
coordinates of the points. The extension has to do a type check on every
one of the numbers in every one of the tuples in the list. Frankly,
this is still pretty fast, but not as fast as it could be.

My goal is to be able to write applications that contain some
numerically intensive code (or other repetitious simple processing) and
not to have to use anything but Python. I'm going to have to keep
writing C for a large amount of the core code if something isn't done a
little differently.

What about Numeric?

NumPy provides the Multiarray object, which does provide what I am
talking about, a homogeneous sequence ( at least of numbers, which is
mostly what I need). As a result, it can implement very fast array
oriented operations that eliminate the need for a large amount of hand
coded C extensions. The problem is that core Python has no idea that it
is a homogeneous sequence, so when you have to do some looping, things
slow WAY down.

As an example, I recently wrote a function that had to loop through a
NumPy array of 1000 items, a list of 200 Numpy arrays, and each row in
each of the Arrays in that list. The result was a triple nested loop,
and a very slow run time. I probably could have avoided one of those
loops using array oriented computation, but there was no way to get rid
of all of them. I wrote the functions in the inner part of the loop in
C, and got about a 25% speed up. I then wrote the entire thing in C, and
got a 100X speed up. what had taken 100 seconds, now took 1 second. This
was going to get done a lot, so I needed to do that. Having been working
almost entirely in Python lately, I really didn't enjoy writing the C !

How this could be implemented:

I am not qualified to talk about the actual code in the Python
interpreter, but what I am envisioning is that the concept of the
homogeneous sequence would be get into the Python code, and hopefully
some optimizations of list comprehensions and the like could be done. In
the short term, I would imagine little other benefit, but once it was in
there, it could be used to great advantage in Py2C, and used by
extension writers to really make things easier. In the long term,
homogeneous sequences might provide a basis for the high performance
Python compiler of the future.

My questions:

Is this even a remotely good idea, or have I just gotten carried away?

Could it be useful to dynamically test for homogeneous sequences, or
would they have to be statically defined?

Looking at some of the stuff on the Web about Guido's ideas about static
typing, I see a way to do homogeneous sequences, but no mention of the
concept itself. Is it there somewhere?

What are all the other flaws in my ideas that I haven't come up with
yet?

Sent via Deja.com
http://www.deja.com/