[Chicago] is there really no built-in file/iter split() thing?
Kumar McMillan
kumar.mcmillan at gmail.com
Sun Dec 2 04:58:50 CET 2007
On Dec 1, 2007 8:17 PM, Carl Karsten <carl at personnelware.com> wrote:
> Kumar McMillan wrote:
> > On Dec 1, 2007 4:59 PM, Carl Karsten <carl at personnelware.com> wrote:
> >> > Assuming open('big.sql').read().split(';') would be a dumb idea,
> >>
> >> How about we just not assume that? If it is, let's see the proof so we have a
> >> good idea how bad it is, which will help gauge how elaborate a workaround is
> >> justified.
> >
> > the file I was parsing was 45M. If you want to test it on *your*
> > machine, go ahead and post back the results :) It would be nice to
> > see, actually. My assumption is that it will try to allocate at least
> > 90M of memory but, yes, it is still just an assumption.
>
> carl at vaio:~$ free -m
>              total       used       free     shared    buffers     cached
> Mem:           376         26        349          0          0          4
> -/+ buffers/cache:         21        354
> Swap:          627         56        570
> carl at vaio:~$ time python
> Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
> [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import datetime
> >>> s=datetime.datetime.now()
> >>> x='abc;'*(45000000/4)
> >>> datetime.datetime.now() - s
> datetime.timedelta(0, 9, 131028)
> >>> len(x)
> 45000000
> >>> s=datetime.datetime.now()
> >>> y=x.split(';')
> >>> datetime.datetime.now() - s
> datetime.timedelta(0, 23, 222340)
> >>> len(y)
> 11250001
> >>>
>
> real 2m48.391s
> user 0m4.016s
> sys 0m2.320s
>
> in a 2nd shell, after doing y=...
> carl at vaio:~$ ps vp 7191
>   PID TTY      STAT   TIME  MAJFL   TRS    DRS    RSS %MEM COMMAND
>  7191 pts/2    S+     0:03    283   985 461442 340092 88.1 python
>
> Anyone know what that means?
you could try using pysizer instead:
http://pysizer.8325.org/
http://pysizer.8325.org/doc/tutorial.html
"PySizer is a memory usage profiler for Python code."
I've never tried using it myself.
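
On the ps question: in "ps v" output, RSS is the resident set size in kilobytes, so 340092 is roughly 332 MB resident, well past the 90M guess. Most of the gap is likely per-object overhead, since split() returned 11250001 separate string objects plus the list that points at them. A rough sketch of how to see that effect follows. It assumes sys.getsizeof, which only exists in Python 2.6 and later (so not the 2.5.1 used above), approx_split_size is a made-up helper name, and the exact numbers depend on the interpreter build:

```python
import sys


def approx_split_size(text, sep=";"):
    """Return (bytes used by text, approximate bytes used by text.split(sep))."""
    parts = text.split(sep)
    # Size of the list object itself (its array of pointers), not its items.
    total = sys.getsizeof(parts)
    # Count each distinct string object once; the interpreter may share
    # some objects (the empty string, for example).
    seen = set()
    for part in parts:
        if id(part) not in seen:
            seen.add(id(part))
            total += sys.getsizeof(part)
    return sys.getsizeof(text), total


if __name__ == "__main__":
    # A small stand-in for the 45M test string used in the session above.
    source_bytes, split_bytes = approx_split_size("abc;" * 1000)
    print("source string: %d bytes" % source_bytes)
    print("split result:  %d bytes (approx.)" % split_bytes)
```

On a small input like this the second number should come out noticeably larger than the first, which is the same pattern the ps output shows at full scale.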
>
> Most of the 2m48s was after I hit ^D to exit python. Not really sure why that
> would take so much longer than creating y. I've got too much stuff open on my
> box with 1 GB.
>
>
> Carl K
> _______________________________________________
> Chicago mailing list
> Chicago at python.org
> http://mail.python.org/mailman/listinfo/chicago
>
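
For the original question in the subject line, one way to avoid read().split(';') on a big file is a small generator that reads fixed-size chunks and yields pieces as they complete. This is only a minimal sketch, not a built-in: iter_split and chunk_size are names invented here, and like plain split() it knows nothing about ';' inside SQL string literals.

```python
def iter_split(fileobj, sep=";", chunk_size=64 * 1024):
    """Yield sep-delimited pieces of fileobj without reading it all at once.

    Produces the same pieces as fileobj.read().split(sep), but holds only
    one chunk plus one unfinished piece in memory at a time.
    """
    buf = ""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        parts = buf.split(sep)
        # The last piece may be cut off mid-way (or mid-separator), so keep
        # it in the buffer until the next chunk arrives.
        buf = parts.pop()
        for part in parts:
            yield part
    # Whatever is left after the final separator (possibly an empty string).
    yield buf


if __name__ == "__main__":
    import io

    demo = io.StringIO("select 1;select 2;select 3")
    for statement in iter_split(demo, chunk_size=8):
        print(repr(statement))
```

Something like "for statement in iter_split(open('big.sql')): ..." would then walk the file one statement at a time, with memory use bounded by the chunk size and the longest single statement.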