[Python-Dev] Google Summer of Code proposal: New class for work with binary trees AVL and RB as with the standard dictionary.
Josiah Carlson
jcarlson at uci.edu
Wed Apr 26 20:18:18 CEST 2006
"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> Josiah Carlson wrote:
> > There exists various C and Python implementations of both AVL and
> > Red-Black trees. For users of Python who want to use AVL and/or
> > Red-Black trees, I would urge them to use the Python implementations.
> > In the case of *needing* the speed of a C extension, there already
> > exists a CPython extension module for AVL trees:
> > http://www.python.org/pypi/pyavl/1.1
> >
> > I would suggest you look through some suggested SoC projects in the
> > wiki:
> > http://wiki.python.org/moin/SummerOfCode
> >
> > - Josiah
> >
> >
> Thanks for the answer!
>
> I already saw pyavl-1.1. But for this reason I would like to see the module
> in a standard package python. Functionality for pyavl and dict to compare
> difficultly. Functionality of my module will differ from functionality dict
> in the best party. I have lead some tests on for work with different types
> both for a package pyavl-1.1, and for the prototype of own module. The
> script
> of check is resulted in attached a file avl-timeit.py In files
> timeit-result-*-*.txt results of this test. The first in the name of a file
> means quantity of the added elements, the second - argument of a method
> timeit. There it is visible, that in spite of the fact that the module
> xtree
> is more combined in comparison with pyavl the module (for everyone again
> inserted pair [the key, value], is created two elements: python object -
> pair,
> and an internal element of a tree), even in this case it works a little bit
> more quickly. Besides the module pyavl is unstable for work in multithread
> appendices (look the attached file module-avl-is-thread-unsafe.py).
I'm not concerned about the speed of the external AVL module, and I'm
not concerned about the speed of trees in Python; so far people have
found that dictionaries are generally sufficient. More specifically,
while new lambda syntaxes are presented almost weekly, I haven't heard
anyone complain about Python's lack of a tree module in months. As a
result, I don't find the possible addition of any tree structure to the
collections module as being a generally compelling addition.
Again, I believe that the existance of 3rd party extension modules which
implement AVL and red-black trees, regardless of their lack of thread
safety, slower than your module by a constant, or lack of particular
functionality to be basically immaterial to this discussion. In my mind,
there are only two relevant items to this discussion:
1. Is having a tree implementation (AVL or Red-Black) important, and if
so, is it important enough to include in the collections module?
2. Is a tree implementation important enough for google to pay for its
inclusion, given that there exists pyavl and a collectionsmodule.c patch
for a red-black tree implementation?
Then again, you already have your own implementation of a tree module,
and it seems as though you would like to be paid for this already-done
work. I don't know how google feels about such things, but you should
remember to include this information in your application.
> I think, that creation of this type (together with type of pair), will make
> programming more convenient since sorting by function sort will be required
> less often.
How often are you sorting data? I've found that very few of my
operations involve sorting data of any kind, and even if I did have need
of sorting, Python's sorting algorithm is quite fast, likely faster by
nearly a factor of 2 (in the number of comparisons) than the on-line
construction of an AVL or Red-Black tree.
> I can probably borrow in this module beyond the framework of the project
> google. The demand of such type of data is interesting. Because of
> necessity
> of processing `gcmodule.c' and `obmalloc.c' this module cannot be realized
> as the external module.
It seems to me (after checking collectionsmodule.c and a few others)
that proper C modules only seem to import Python.h and maybe
structmember.h . If you are manually including gcmodule.c and obmalloc.c,
you may be doing something wrong; I'll leave it to others to further
comment on this aspect.
- Josiah
More information about the Python-Dev
mailing list