[Python-Dev] Google Summer of Code proposal: New class for work with binary trees AVL and RB as with the standard dictionary.
Josiah Carlson
jcarlson at uci.edu
Mon Apr 24 20:51:30 CEST 2006
There exists various C and Python implementations of both AVL and
Red-Black trees. For users of Python who want to use AVL and/or
Red-Black trees, I would urge them to use the Python implementations.
In the case of *needing* the speed of a C extension, there already
exists a CPython extension module for AVL trees:
http://www.python.org/pypi/pyavl/1.1
I would suggest you look through some suggested SoC projects in the
wiki:
http://wiki.python.org/moin/SummerOfCode
- Josiah
"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
>
> I would like to participate in Google Summer of Code. The idea consists in
> creation of a specialized class for work with binary trees AVL and RB. The
> part of ideas is taken from Parrot (Perl6) where for pair values the
> specialized type is stipulated. As advantages it is possible to note:
> * High speed. On simple types at quantity of elements up to 10000 speed of
> work concedes to the standard dictionary of all in 2.7 times.
> * Safety of work in a multithread operating mode.
> * The method for frosts of object is stipulated.
> * The data storage is carried out in the sorted kind.
> * A high overall performance `item' and `iteritem' methods as it is not
> necessary to create new objects.
> * At lots of objects in a tree memory is used much less intensively.
> * Absence of collisions. As consequence, it is impossible to generate bad
> data for creation DoS of the attacks influencing the dictionary of
> transferred arguments.
> * For objects existence of a method __ hash __ is not necessary.
> * The same kind of the dictionary by means of the overloaded operations
> probably change of its behaviour so that it supported the
> multivariational
> representation based on stratification given and subsequent recoils.
>
> Additional requirements to key objects:
> * The opportunity of comparison of key objects function cmp is necessary.
>
> There are the reasons, braking development of this project:
> * Compulsion of heading of a module `gc' for each compound object (this
> structure is unessential to some objects, but updating `gc' the module
> is required).
> * Absence of standard object the pair, probably depends on the
> previous item.
>
> Lacks of a binary tree:
> * Average time of search for hash - O (1), and for trees - O (log2 N).
> * A lot of memory under a single element of a tree:
> (3*sizeof (void *) + sizeof (int))*2,
> one element is used rather - the pair, the second site of memory is
> allocated under node of a tree.
>
> In protection of object "pair":
> * The logic of methods of the given object noticeably is easier than
> tuple,
> that as a result can affect speed of work of the program in the best
> party.
> * Alignment on 8 bytes has no big sense at present architecture where in
> cache sample a minimum on 64 bytes is made. Use of type "pair" will give
> an appreciable prize at use of alignment in 4 bytes. Otherwise on 64-bit
> platforms it is much more favourable to use tuple object.
>
> The given project can demand processing of the module `gcmodule.c' and
> `tupleobject.c'. It is necessary to reduce the size of static objects,
> for this
> purpose the opportunity is necessary is transparent to pass objects not
> having
> direct support from the module `gcmodule.c'.
>
> Also it will be partially necessary to process the module `obmalloc.c'
> for more
> effective distribution of memory.
>
> I shall be glad to answer questions on this theme.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu
More information about the Python-Dev
mailing list