[Python-3000] How will unicode get used?

Josiah Carlson jcarlson at uci.edu
Thu Sep 28 05:49:38 CEST 2006


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson schrieb:
> > Attached is my untested sample implementation (I'm away for the next
> > week or so, and can't test), that should give an idea of what I was
> > talking about.
> 
> Thanks. It is hard to tell what the impact on the implementation is.
> For example, ISTM that you have to regenerate the tree each time
> a new string is created. E.g. if you slice a string, you would
> have to regenerate the tree for the slice. Right?

Generally, yes.  We could use the pre-existing tree information, but
it would probably be simpler (and faster) to scan the string and
re-create it.

Really, one would create the tree when someone wants to access an index
for the first time (or during creation, for fewer surprises), then use
the index finding function to return the address of character i.

> As for the implementation: If you are using a array-based heap,
> couldn't you just drop the left and right child pointers, and
> instead use indices 2*k and 2*k+1 to find the child nodes?
> This would get down memory overhead significantly; you'd only
> need the length of the array to determine what a leaf node is.

Good point.  I had originally malloced each node individually, but I
zoned the heap optimization when I went with that style of construction.

 - Josiah



More information about the Python-3000 mailing list