[Chicago] is there really no built-in file/iter split() thing?
Carl Karsten
carl at personnelware.com
Sun Dec 2 03:17:25 CET 2007
Kumar McMillan wrote:
> On Dec 1, 2007 4:59 PM, Carl Karsten <carl at personnelware.com> wrote:
>> > Assuming open('big.sql').read().split(';') would be a dumb idea,
>>
>> How about we just not assume that? If it is, let's see the proof so we have a
>> good idea how bad it is, which will help gauge how elaborate a workaround is
>> justified.
>
> The file I was parsing was 45M. If you want to test it on *your*
> machine, go ahead and post back the results :) It would be nice to
> see, actually. My assumption is that it will try to allocate at least
> 90M of memory but, yes, it is still just an assumption.
carl@vaio:~$ free -m
             total       used       free     shared    buffers     cached
Mem:           376         26        349          0          0          4
-/+ buffers/cache:         21        354
Swap:          627         56        570
carl@vaio:~$ time python
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> s=datetime.datetime.now()
>>> x='abc;'*(45000000/4)
>>> datetime.datetime.now() - s
datetime.timedelta(0, 9, 131028)
>>> len(x)
45000000
>>> s=datetime.datetime.now()
>>> y=x.split(';')
>>> datetime.datetime.now() - s
datetime.timedelta(0, 23, 222340)
>>> len(y)
11250001
>>>
real 2m48.391s
user 0m4.016s
sys 0m2.320s
In a second shell, after doing y=...:
carl@vaio:~$ ps vp 7191
  PID TTY      STAT   TIME  MAJFL   TRS    DRS    RSS %MEM COMMAND
 7191 pts/2    S+     0:03    283   985 461442 340092 88.1 python
Anyone know what that means?
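As far as I can tell, RSS is the resident memory in kilobytes, so roughly 340 MB
was resident after the split. One rough way to see where that goes is to add up
sys.getsizeof for the list and for each piece. This is only a sketch: it is
written for Python 3 rather than the 2.5.1 session above, so the per-object
sizes will differ, and the helper name estimate_split_memory and the scaled-down
sample are made up for illustration.

```python
import sys


def estimate_split_memory(text, sep=";"):
    """Rough size in bytes of the list produced by text.split(sep)."""
    parts = text.split(sep)
    # getsizeof(list) only covers the list's pointer array, so each
    # string object has to be added separately.
    return sys.getsizeof(parts) + sum(sys.getsizeof(p) for p in parts)


# Scaled down from the 45,000,000 characters used in the session.
sample = "abc;" * 100000
total = estimate_split_memory(sample)
print("input chars:", len(sample))
print("estimated bytes for the split result:", total)
print("bytes per input char:", total / len(sample))
```

The point is that every 3-character piece carries its own object header plus a
list slot, so the split result can be many times larger than the input string.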
Most of the 2m48s came after I hit ^D to exit Python. I'm not really sure why
that would take so much longer than creating y. I have too much stuff open on my
box with 1 GB.
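For comparison, here is a minimal sketch of the kind of lazy split the subject
line asks about: read the file in chunks and yield pieces as they complete,
instead of holding the whole list in memory. It is Python 3, not a built-in, and
the name iter_split and the chunk_size default are assumptions of mine.

```python
import io


def iter_split(fileobj, sep=";", chunk_size=64 * 1024):
    """Yield the pieces of fileobj separated by sep, reading in chunks."""
    buf = ""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        pieces = buf.split(sep)
        # The last piece may be incomplete (or end in a partial separator),
        # so hold it back until more data arrives.
        buf = pieces.pop()
        for piece in pieces:
            yield piece
    # Whatever is left at EOF is the final piece, possibly empty,
    # which matches what str.split would return.
    yield buf


# Small in-memory stand-in for a real file such as big.sql.
demo = io.StringIO("select 1;select 2;;select 3")
for statement in iter_split(demo, chunk_size=4):
    print(repr(statement))
```

With a real file it would be used like `for stmt in iter_split(open('big.sql')):`,
and only one chunk plus the current piece needs to be in memory at a time.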
Carl K