[Python-Dev] Google Summer of Code proposal: New class for work with binary trees AVL and RB as with the standard dictionary.

Josiah Carlson jcarlson at uci.edu
Wed Apr 26 20:18:18 CEST 2006


"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> Josiah Carlson wrote:
> > There exists various C and Python implementations of both AVL and
> > Red-Black trees.  For users of Python who want to use AVL and/or
> > Red-Black trees, I would urge them to use the Python implementations. 
> > In the case of *needing* the speed of a C extension, there already
> > exists a CPython extension module for AVL trees:
> >     http://www.python.org/pypi/pyavl/1.1
> >
> > I would suggest you look through some suggested SoC projects in the
> > wiki:
> >     http://wiki.python.org/moin/SummerOfCode
> >
> >  - Josiah
> >
> >   
> Thanks for the answer!
> 
> I already saw pyavl-1.1. But for this reason I would like to see the module
> in a standard package python. Functionality for pyavl and dict to compare
> difficultly. Functionality of my module will differ from functionality dict
> in the best party. I have lead some tests on for work with different types
> both for a package pyavl-1.1, and for the prototype of own module. The 
> script
> of check is resulted in attached a file avl-timeit.py In files
> timeit-result-*-*.txt results of this test. The first in the name of a file
> means quantity of the added elements, the second - argument of a method
> timeit. There it is visible, that in spite of the fact that the module 
> xtree
> is more combined in comparison with pyavl the module (for everyone again
> inserted pair [the key, value], is created two elements: python object - 
> pair,
> and an internal element of a tree), even in this case it works a little bit
> more quickly. Besides the module pyavl is unstable for work in multithread
> appendices (look the attached file module-avl-is-thread-unsafe.py).

I'm not concerned about the speed of the external AVL module, and I'm
not concerned about the speed of trees in Python; so far people have
found that dictionaries are generally sufficient.  More specifically,
while new lambda syntaxes are presented almost weekly, I haven't heard
anyone complain about Python's lack of a tree module in months.  As a
result, I don't find the possible addition of any tree structure to the
collections module as being a generally compelling addition.

Again, I believe that the existance of 3rd party extension modules which
implement AVL and red-black trees, regardless of their lack of thread
safety, slower than your module by a constant, or lack of particular
functionality to be basically immaterial to this discussion.  In my mind,
there are only two relevant items to this discussion:

1. Is having a tree implementation (AVL or Red-Black) important, and if
so, is it important enough to include in the collections module?
2. Is a tree implementation important enough for google to pay for its
inclusion, given that there exists pyavl and a collectionsmodule.c patch
for a red-black tree implementation?

Then again, you already have your own implementation of a tree module,
and it seems as though you would like to be paid for this already-done
work.  I don't know how google feels about such things, but you should
remember to include this information in your application.


> I think, that creation of this type (together with type of pair), will make
> programming more convenient since sorting by function sort will be required
> less often.

How often are you sorting data?  I've found that very few of my
operations involve sorting data of any kind, and even if I did have need
of sorting, Python's sorting algorithm is quite fast, likely faster by
nearly a factor of 2 (in the number of comparisons) than the on-line
construction of an AVL or Red-Black tree.


> I can probably borrow in this module beyond the framework of the project
> google. The demand of such type of data is interesting. Because of 
> necessity
> of processing `gcmodule.c' and `obmalloc.c' this module cannot be realized
> as the external module.

It seems to me (after checking collectionsmodule.c and a few others)
that proper C modules only seem to import Python.h and maybe
structmember.h . If you are manually including gcmodule.c and obmalloc.c,
you may be doing something wrong; I'll leave it to others to further
comment on this aspect.

 - Josiah



More information about the Python-Dev mailing list