hard memory limits
John Machin
sjmachin at lexicon.net
Fri May 6 23:32:36 EDT 2005
On Sat, 07 May 2005 02:29:48 GMT, bokr at oz.net (Bengt Richter) wrote:
>On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <mauriceling at acm.org> wrote:
>>
>>It doesn't seem to help. I'm thinking that it might be a SOAPpy
>>problem. The allocation fails when I grab a list of more than 150K
>>elements through SOAP, but allocating a 1-million-element list
>>directly in Python is fine.
>>
>>Now I have a performance problem...
>>
>>Say I have 3 lists (20K elements, 1G elements, and 0 elements); call
>>them 'a', 'b', and 'c'. I want to copy everything that is in 'b' but
>>not in 'a' into 'c'...
>>
>> >>> a = range(1, 100000, 5)
>> >>> b = range(0, 1000000)
>> >>> c = []
>> >>> for i in b:
>>... if i not in a: c.append(i)
>>...
>>
>>This takes forever to complete. Is there any way to optimize this?
>>
>Checking whether something is in a list means, on average, an equality
>test against half the elements of the list. Checking for membership in
>a set should be much faster for any set/list of significant size.
>I.e., just changing to
>
> a = set(range(1, 100000, 5))
>
>should help. I assume those aren't examples of your real data ;-)
>You must have a lot of memory if you are keeping 1G elements there and
>copying a significant portion of them. Do you need to do this file-to-file,
>keeping 'a' in memory? Perhaps page-file thrashing is part of the time problem?
Since when was 1000000 == 1G??
Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it might be a good idea if you gave us a little bit
more detail. You haven't even posted the actual *PYTHON* error message
and stack trace that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_malloc message and staggers on somehow ...
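FWIW, here's a minimal sketch of the set-based filtering that Bengt
suggests (the names follow your example, the sizes are invented, and it
assumes your real elements are hashable):

    # Membership tests against a set are O(1) on average, versus
    # O(len(a)) for a list, so the whole pass over b drops from
    # O(len(a) * len(b)) comparisons to roughly O(len(b)).
    a = set(range(1, 100000, 5))
    b = range(0, 1000000)
    c = [i for i in b if i not in a]

If the order of 'c' doesn't matter and 'b' fits in memory as a set, a
set difference does the same job in one step: c = list(set(b) - a).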
Regards,
John