[Tutor] simple python script for collocation discovery

Kent Johnson kent37 at tds.net
Sun Aug 17 15:15:40 CEST 2008


On 8/16/08, Emad Nawfal (عماد نوفل) <emadnawfal at gmail.com> wrote:
> Thank you so much Steve,
> I followed your advice about calculating on the fly and it really rang a
> bell. Now I have this script. It's faster and does not give me the nasty
> memory error message the first one sometimes did:

A few hints:
- You should split each line just once and save the result - the
current code splits the line several times for each word.
- You could make a generator that returns pairs of words; that would
simplify the loop code.
- There is a difference between making the code fast for a single
test, and making it fast for many tests. For a single test, the best
solution is probably something like what you have, computing the
numbers you need as you make a single pass through the data. For
multiple tests, you will do better if you create some helper
dictionaries that allow you to look up words and pairs quickly. You
could use the pickle module to save the dicts and re-read them on
subsequent runs.
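
As a rough illustration of the three hints above, here is a minimal
sketch. It is not the original script (which is not shown here), and
every name in it (word_pairs, build_counts, save_counts, load_counts)
is hypothetical. It assumes a Python version that has
collections.Counter, and it assumes pairs are only formed from
adjacent words within a single line.

```python
import pickle
from collections import Counter


def word_pairs(words):
    """Yield each pair of adjacent words from an already-split line."""
    for i in range(len(words) - 1):
        yield words[i], words[i + 1]


def build_counts(lines):
    """Make one pass over the lines, counting words and adjacent pairs."""
    word_counts = Counter()
    pair_counts = Counter()
    for line in lines:
        words = line.split()  # split once and reuse the result
        word_counts.update(words)
        pair_counts.update(word_pairs(words))
    return word_counts, pair_counts


def save_counts(path, word_counts, pair_counts):
    """Pickle both dictionaries so later runs can skip the counting pass."""
    with open(path, "wb") as f:
        pickle.dump((word_counts, pair_counts), f)


def load_counts(path):
    """Re-read the dictionaries written by save_counts."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

For a single test you would call build_counts(open(filename)) and use
the counts directly. For many tests, calling save_counts once and
load_counts on later runs avoids re-reading the corpus, at the cost of
keeping both dictionaries in memory.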

HTH - I can't flesh these out right now.
Kent

