[Chicago] is there really no built-in file/iter split() thing?

Carl Karsten carl at personnelware.com
Sun Dec 2 03:17:25 CET 2007


Kumar McMillan wrote:
> On Dec 1, 2007 4:59 PM, Carl Karsten <carl at personnelware.com> wrote:
>>  > Assuming open('big.sql').read().split(';') would be a dumb idea,
>>
>> How about we just not assume that?  If it is, let's see the proof so we have a
>> good idea how bad it is, which will help gauge how elaborate a workaround is
>> justified.
> 
> the file I was parsing was 45M.  If you want to test it on *your*
> machine, go ahead and post back the results :)  It would be nice to
> see, actually.  My assumption is that it will try to allocate at least
> 90M of memory but, yes, it is still just an assumption.

carl at vaio:~$ free -m
              total       used       free     shared    buffers     cached
Mem:           376         26        349          0          0          4
-/+ buffers/cache:         21        354
Swap:          627         56        570
carl at vaio:~$ time python
Python 2.5.1 (r251:54863, Oct  5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import datetime
 >>> s=datetime.datetime.now()
 >>> x='abc;'*(45000000/4)
 >>> datetime.datetime.now() - s
datetime.timedelta(0, 9, 131028)
 >>> len(x)
45000000
 >>> s=datetime.datetime.now()
 >>> y=x.split(';')
 >>> datetime.datetime.now() - s
datetime.timedelta(0, 23, 222340)
 >>> len(y)
11250001
 >>>

real    2m48.391s
user    0m4.016s
sys     0m2.320s

in a 2nd shell, after doing y=...
carl at vaio:~$ ps vp 7191
   PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
  7191 pts/2    S+     0:03    283   985 461442 340092 88.1 python

Anyone know what that means?
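
As far as I can tell from the ps man page, RSS is the resident set size in
KiB, so 340092 is roughly 332 MiB of physical RAM, which fits the 88.1 %MEM
column. DRS and TRS are the data and text (code) sizes, and MAJFL counts major
page faults. Here is a rough sketch of why a 45M string can turn into that
much: every piece split() returns is its own string object with a fixed header,
plus one pointer slot in the list. The helper name estimate_split_memory is
made up for illustration, and sys.getsizeof needs Python 2.6 or later (so not
the 2.5.1 used above); the numbers it reports depend on the interpreter and on
32 vs 64 bit, so treat it as an estimate only.

```python
import sys

def estimate_split_memory(piece, count):
    """Rough size in bytes of a list holding `count` separate copies of `piece`.

    Hypothetical helper: it counts per-string object overhead plus one
    list slot per element, and ignores allocator overhead.
    """
    per_string = sys.getsizeof(piece)
    sample = [piece] * 1000
    # Approximate cost of one list slot (a pointer) from a 1000-element list.
    per_slot = (sys.getsizeof(sample) - sys.getsizeof([])) / 1000.0
    return int(count * (per_string + per_slot))

# RSS from the ps output above, converted from KiB to MiB.
rss_kib = 340092
print("RSS: %.1f MiB" % (rss_kib / 1024.0))

# 11250001 pieces came out of the split above; almost all of them are 'abc'.
estimate = estimate_split_memory('abc', 11250001)
print("estimated list size: %.0f MiB" % (estimate / (1024.0 * 1024.0)))
```

The original 45M string x is still alive at that point too, so it adds to
whatever the list costs.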

Most of the 2m48s elapsed after I hit ^D to exit Python.  I'm not really sure
why exiting would take so much longer than creating y.  I've got too much stuff
open on my box with 1gb.
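
For the original question, I don't know of a built-in that does this, but a
small generator gets most of the way there. This is only a sketch: the name
iter_split and the chunk_size default are my own choices, not anything from the
standard library. It reads the file a chunk at a time, splits what it has, and
holds back the last (possibly incomplete) piece until more data arrives, so
memory use is roughly one chunk plus the longest single piece.

```python
import io

def iter_split(fileobj, sep=';', chunk_size=64 * 1024):
    """Yield the pieces of fileobj separated by sep, without reading it all.

    Hypothetical helper; yields the same pieces as fileobj.read().split(sep).
    """
    buf = ''
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        pieces = buf.split(sep)
        # The last piece may be cut off mid-statement (or mid-separator),
        # so keep it in the buffer for the next round.
        buf = pieces.pop()
        for piece in pieces:
            yield piece
    # Whatever remains at EOF is the final piece, possibly empty.
    yield buf

# Small in-memory stand-in for open('big.sql').
fake_file = io.StringIO(u'select 1;select 2;;select 3')
for statement in iter_split(fake_file, sep=u';', chunk_size=4):
    print(repr(statement))
```

Because the tail is kept in the buffer, a separator longer than one character
that straddles a chunk boundary is still found on the next pass.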

Carl K

