[Tutor] How can I make this run faster?

Mon Dec 21 19:19:19 CET 2009

"Emad Nawfal (عمـ نوفل ـاد)" <emadnawfal at gmail.com> wrote

> def devocalize(word):
>     vowels = "aiou"
Should this include 'e'?
>     return "".join([letter for letter in word if letter not in vowels])

It's probably faster to use a regular expression replacement.
Simply replace any vowel with the empty string.
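
A minimal sketch of what I mean, assuming the same vowel set as your
function (the names VOWELS and devocalize_re are mine, purely for
illustration). Whether it really beats the join is something to measure
rather than assume:

```python
import re

# Same vowels as the original devocalize(); add 'e' to the class if it
# should be stripped as well.
VOWELS = re.compile(r"[aiou]")

def devocalize_re(word):
    """Return word with every character matched by VOWELS removed."""
    return VOWELS.sub("", word)

print(devocalize_re("hum"))  # hm
```

Compiling the pattern once at module level avoids rebuilding it for
each of the 500,000 words.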

> vowelled = ['him', 'ham', 'hum', 'fun', 'fan']  # input, usually a large list of around 500,000 items
> vowelled = set(vowelled)

How do you process the file? Do you read it all into memory and
then convert it to a set? Or do you process each line (one word
per line?) and add the words to the set one by one? The latter
is probably faster.
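
Something like the following is what I have in mind for the one-by-one
approach. It is only a sketch: I am assuming one word per line, and
load_words is a made-up name. An io.StringIO object stands in for the
real file here so the example runs on its own:

```python
import io

def load_words(lines):
    """Build a set of words from an iterable of lines (e.g. an open file)."""
    words = set()
    for line in lines:
        word = line.strip()
        if word:                # skip blank lines
            words.add(word)
    return words

# Stand-in for the real input file; with a file you would pass the
# object returned by open() instead.
sample = io.StringIO("him\nham\nhum\n\nfun\nfan\nham\n")
vowelled = load_words(sample)
print(len(vowelled))  # 5
```

Iterating over the file object reads a line at a time, so the full list
of words never has to exist alongside the set.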

> unvowelled = set([devocalize(word) for word in vowelled])
> for lex in unvowelled:
>     d = {}
>     d[lex] = [word for word in vowelled if devocalize(word) == lex]

I think you could remove the comprehensions and do all of
this inside a single loop. This is one of those cases where a
single explicit loop is faster than two comprehensions and a loop.
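
Here is a rough sketch of the single-loop version. The function name
group_by_skeleton and the use of collections.defaultdict are my own
choices, not anything from your code:

```python
from collections import defaultdict

def devocalize(word):
    vowels = "aiou"
    return "".join(letter for letter in word if letter not in vowels)

def group_by_skeleton(words):
    """Map each devocalized form to the list of words that produce it."""
    groups = defaultdict(list)
    for word in set(words):      # set() drops duplicate input words
        groups[devocalize(word)].append(word)
    return dict(groups)

result = group_by_skeleton(['him', 'ham', 'hum', 'fun', 'fan'])
print(sorted(result))  # ['fn', 'hm']
```

Each word is devocalized exactly once, instead of once for every
unvowelled form it is compared against. It also keeps one dictionary for
the whole run, whereas the quoted loop creates d afresh on every pass.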

But the only way to be sure is to test/profile to see where the
slowdown occurs.
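
For a quick comparison, the standard timeit module is enough. This is
just a sketch: the word list is a small stand-in for your real data, and
the helper names are invented for the example:

```python
import re
import timeit

VOWELS = re.compile(r"[aiou]")

def devocalize_join(word):
    return "".join([letter for letter in word if letter not in "aiou"])

def devocalize_re(word):
    return VOWELS.sub("", word)

# Stand-in data; substitute a sample of the real word list.
words = ['him', 'ham', 'hum', 'fun', 'fan'] * 1000

def run(func):
    """Apply func to every word and return the results."""
    return [func(w) for w in words]

t_join = timeit.timeit(lambda: run(devocalize_join), number=20)
t_re = timeit.timeit(lambda: run(devocalize_re), number=20)
print("join: %.3fs  regex: %.3fs" % (t_join, t_re))
```

The numbers will vary by machine and Python version, so run it against
your own data before picking one. For the whole script, cProfile will
show which function the time is actually spent in.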

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/