List of Numbers
Jim Richardson
warlock at eskimo.com
Sat Apr 12 22:44:37 EDT 2003
On Sat, 12 Apr 2003 21:11:39 GMT,
Alex Martelli <aleax at aleax.it> wrote:
> Jim Richardson wrote:
>
>> On Sat, 05 Apr 2003 20:13:45 +0100,
>> Simon Faulkner <news at titanic.co.uk> wrote:
>>> I have a list of about 5000 numbers in a text file - up to 14 digits
>>> each.
>>>
>>> I need to check for duplicates.
>>>
>>> What would people suggest as a good method?
>>>
>>> Simon
>>
>> I'd use sort|uniq, but I don't know if that's available for MS type osen.
>
> You can get implementations of sort and uniq for MS, but a short
> Python script is better IMHO.
yeah, but that's usually the case :)
>
>
>> In python, just stuff them all in a dictionary, any repeats, will be
>> eliminated. But this is rather crude and probably slow. But it would
>> work.
>
> Anything but slow! Python dictionaries are quite fast. But removing
> duplicates is not the same as 'checking for duplicates' -- Simon
> might rather want (e.g.) a list of all numbers that WERE in fact
> duplicate. A script that plays with a Python dict is still no doubt
> the right solution, but it's hard to write one without more precise
> specifications regarding what is desired.
>
>
Yeah, I didn't look at the "check for" part; I just parsed it as "get
rid of"... <sigh> I must need a brain upgrade.
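
For the "check for" reading, here is a minimal sketch of the dict
approach. It assumes one number per line and treats each number as
text, so "007" and "7" would count as different entries. The name
find_duplicates and the sample data are made up for illustration.

```python
def find_duplicates(lines):
    """Return the numbers that occur more than once, sorted as text."""
    counts = {}
    for line in lines:
        number = line.strip()
        if not number:
            continue  # skip blank lines
        # dict.get supplies 0 the first time a number is seen
        counts[number] = counts.get(number, 0) + 1
    return sorted(n for n, c in counts.items() if c > 1)


# Made-up sample data; a real run would pass an open file instead,
# e.g. find_duplicates(open("numbers.txt")) with a hypothetical file name.
sample = ["12345678901234\n", "42\n", "12345678901234\n", "7\n", "42\n", "\n"]
print(find_duplicates(sample))
```

Keeping the counts rather than just the keys means the same dict can
answer either question: its keys are the de-duplicated list, and the
entries with a count above one are the duplicates.
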
I don't know how fast or slow the dict would be, to tell the truth. It
just doesn't seem that "elegant", and elegance is often (wrongly, I
know) associated with speed.
Having said that, I have found that usually the simpler the script, and
the closer to the pythonic "metal" it is, the faster it is. For some
reason, the folks who wrote Python are a lot better at programming than
I am :)
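
Rather than guess at the speed, one could time it. The sketch below
is only a rough illustration: it assumes 5000 randomly generated
numbers of up to 14 digits (made-up data, not Simon's file) and
compares the dict approach with a sort-and-compare-neighbours
approach, which is roughly what sort piped into uniq -d does. The
function names are hypothetical, and the timings will vary by machine.

```python
import random
import timeit


def dups_with_dict(numbers):
    """Count occurrences in a dict, keep those seen more than once."""
    counts = {}
    for n in numbers:
        counts[n] = counts.get(n, 0) + 1
    return sorted(n for n, c in counts.items() if c > 1)


def dups_with_sort(numbers):
    """Sort, then compare each item with its neighbour."""
    ordered = sorted(numbers)
    found = []
    for prev, cur in zip(ordered, ordered[1:]):
        # record each duplicated value only once, even if it appears 3+ times
        if prev == cur and (not found or found[-1] != cur):
            found.append(cur)
    return found


random.seed(0)  # fixed seed so repeated runs use the same made-up data
numbers = [random.randrange(10**14) for _ in range(4990)]
numbers += numbers[:10]  # force a few duplicates
random.shuffle(numbers)

# Both approaches should agree before their timings are compared.
assert dups_with_dict(numbers) == dups_with_sort(numbers)

for func in (dups_with_dict, dups_with_sort):
    seconds = timeit.timeit(lambda: func(numbers), number=100)
    print(f"{func.__name__}: {seconds / 100 * 1000:.3f} ms per run")
```
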
--
Jim Richardson http://www.eskimo.com/~warlock
Linux, because eventually, you grow up enough to be trusted with a fork()