Python speed-up

Wed Sep 22 10:40:47 EDT 2004

"Guyon Morée" <gumuz at NO_looze_SPAM.net> writes:

> Hi all,
> 
> I am working on a Huffman encoding exercise, but it is kinda slow. This is
> not a big problem, I do this to educate myself :)
> 
> So I started profiling the code and the slowdown was actually taking place
> at places where I didn't expect it.
> 
> After creating a lookup-table dictionary with encodings like
> {'d':'0110', 'e':'01' etc}, I encode the original text like this:
> 
> for c in original_text:
>     encoded_text += table[c]

I probably shouldn't be guessing like this, but I know for sure that
Python strings are immutable, and that this has to allocate a new
string every time through the loop.

How about 

encoded_text = ''.join([table[c] for c in original_text]) # untested, beware

instead?
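
Here is a small sketch of the two approaches side by side, so the
results can at least be checked against each other.  The table and the
sample text are made up for illustration (they are not from the
original post), and the function names are my own.  Newer CPython
versions can sometimes do the += in place, so time both on real data
before believing either of us.

```python
# Hypothetical prefix-free table and sample text, for illustration only;
# a real table would be derived from the Huffman tree.
table = {'a': '0', 'b': '10', 'd': '110', 'e': '111'}
original_text = 'deadbed'


def encode_concat(text, table):
    """Build the result with +=, as in the original loop."""
    encoded = ''
    for c in text:
        # Strings are immutable, so this may copy everything
        # accumulated so far on each pass.
        encoded += table[c]
    return encoded


def encode_join(text, table):
    """Look up every code, then join them all in a single pass."""
    return ''.join(table[c] for c in text)


encoded_text = encode_join(original_text, table)
```

Both functions return the same string; join just builds the result
once instead of growing it piece by piece.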

> I can appreciate the length of the text is big, but this isn't a problem at
> character frequency counting for example. Why is this slow?
> 
> 
> the second place the slowdown occurs is when I try to chop the encoded string
> of 0's and 1's in pieces of eight like this:
> 
> chr_list = [] # resulting list
> while 1:
>     chr_list.append(encoded_text[:8]) # take 8 bits from string and put them in the list
>     encoded_text = encoded_text[8:] # truncate the string
>     if len(encoded_text) < 8: # end of string reached
>         chr_list.append(encoded_text)
>         break

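Again with the immutable strings!  Each encoded_text[8:] copies the
whole remainder of the string, so the loop does far more copying than
it appears to.  Probably wizards can improve on the first thing that
came to my mind: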

chr_list = [encoded_text[i:i+8] 
            for i in range(0, len(encoded_text), 8)] # Tested, but carelessly

At the very least it's shorter my way.
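
For anyone who wants to compare, here is a sketch with the slicing
wrapped in a function next to the original loop.  The names chunk_bits
and chunk_loop are mine, not from the original post.  One difference
to watch for: when the input length is an exact multiple of 8, the
while loop appends a trailing empty string, and the slicing version
does not.

```python
def chunk_bits(bits, size=8):
    """Split a string of 0's and 1's into pieces of `size` characters.

    The last piece is shorter when len(bits) is not a multiple of size.
    """
    return [bits[i:i + size] for i in range(0, len(bits), size)]


def chunk_loop(encoded_text):
    """The original while loop, kept only for comparison."""
    chr_list = []
    while 1:
        chr_list.append(encoded_text[:8])  # take 8 bits from the string
        encoded_text = encoded_text[8:]    # copies the whole remainder
        if len(encoded_text) < 8:          # end of string reached
            chr_list.append(encoded_text)
            break
    return chr_list


# Hypothetical 18-bit input, for illustration only.
sample = '110111011010111110'
pieces = chunk_bits(sample)
```

On this sample the two agree; on sixteen zeros, chunk_loop returns an
extra '' at the end.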

> I hope someone can tell me why these are slow.

Des
isn't sure.
-- 
"[T]he structural trend in linguistics which took root with the
International Congresses of the twenties and early thirties [...] had
close and effective connections with phenomenology in its Husserlian
and Hegelian versions." -- Roman Jakobson