Silly question about ungzipping a file
Mike C. Fletcher
mcfletch at rogers.com
Thu Sep 26 17:21:39 EDT 2002
Try something like (untested):
import gzip

chunk = 4096*1024 # 4MB chunks, play with this value to suit
myFile = gzip.open( 'blah.gz', 'rb' ) # binary mode (also the default for gzip.open)
outFile = open( 'blah','wb') # note wb mode for windows machines!
data = myFile.read( chunk )
while data:
    outFile.write( data )
    data = myFile.read( chunk )
outFile.close()
myFile.close()
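
The same chunked copy can be handed to the standard library. What follows is only a sketch for a current Python 3, not something from the original exchange: gunzip_file and its chunk_size parameter are names made up for illustration, while gzip.open and shutil.copyfileobj are the real stdlib calls.

```python
import gzip
import shutil


def gunzip_file(src_path, dst_path, chunk_size=4096 * 1024):
    """Decompress src_path into dst_path, chunk_size bytes at a time.

    gunzip_file is a hypothetical helper name; only one chunk of
    decompressed data is held in memory at any moment.
    """
    # Both files are opened in binary mode, so no newline translation
    # happens on Windows. The with block closes them even on error.
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        # copyfileobj loops over read(chunk_size) / write() until
        # read() returns an empty bytes object.
        shutil.copyfileobj(src, dst, chunk_size)
```

Usage would be something like gunzip_file('blah.gz', 'blah'). Large reads should also cost far less CPU than going line by line, since the per-call overhead is paid once per chunk instead of once per line.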
HTH,
Mike
Lemniscate wrote:
> Hi everyone,
>
> This may be a ridiculously easy question, but I've kind of hit a wall
> and I was wondering if I am just missing something. I want to
> automate the retrieval and unzipping of a *.gz file. The issue is
> that the file, when it is unzipped, usually has a size somewhere in
> the range of 128MB. Occasionally, I get memory errors when I try to
> run it. Here is a quick idea of what I am doing...
>
>
> >>> import gzip
> >>> myFile = gzip.open('LL_tmpl.gz')
> >>> file('output.txt', 'w').write(myFile.read())
>
> Now, is there any way to do this so that less memory is used? I mean,
> if I wanted to do some processing on the resulting output file, I
> would use xreadlines or something like that to keep memory consumption
> to a minimum. Is there something roughly equivalent that I am not
> noticing in the gzip documentation? Let me also say that I have tried
> the following as well:
>
>
> >>> myFile = gzip.open('LL_tmpl.gz')
> >>> fout = file('output.txt', 'w')
> >>> while myFile.readline():
> ...     fout.write(myFile.readline())
> ...     fout.write('\n')
> ...
> >>> fout.close()
> >>> myFile = gzip.open('LL_tmpl.gz')
> >>> fout = file('output2.txt', 'w')
> >>> while myFile.readline():
> ...     fout.writelines(myFile.readline())
> ...
> >>> fout.close()
>
>
> These do solve my memory problem, but there are other issues. First
> of all, my CPU gets pegged and it takes FOREVER (okay, not forever,
> but about ...let me test real quick... at least 4-5 times as long, and
> my computer is pretty much useless during that time (side note: can
> you tell I am working on a woefully underpowered machine?)). Is there
> something in-between that anybody can think of? The other, and much
> more immediate, issue is puzzling to me. It seems that the resulting
> files from the code are only about 65MB (64.7 to be exact) versus
> 129MB. I'm sure I'm just missing something simple, but why is that?
> Thanks for your time.
>
> Lem
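
On the halved output in the quoted session: each pass through those loops calls readline() twice, once in the while test (where the result is thrown away) and once in the body, so only every other line reaches the output file. That would fit the 64.7MB versus 129MB you are seeing. Below is a rough sketch of a line-by-line copy that reads each line exactly once. It assumes a current Python 3, where file() and xreadlines no longer exist, and gunzip_by_line is a name made up for illustration.

```python
import gzip


def gunzip_by_line(src_path, dst_path):
    """Decompress src_path into dst_path one line at a time.

    gunzip_by_line is a hypothetical helper name. Returns the number
    of lines written.
    """
    count = 0
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        # Iterating over the gzip file yields each line once, so
        # nothing is skipped the way the double readline() skips lines.
        for line in src:
            # Each line still carries its own newline; writing an
            # extra '\n' would double-space the output.
            dst.write(line)
            count += 1
    return count
```

This keeps memory use low but will still tend to be slower than reading in large chunks, so it is mostly worth it when each line needs processing on the way through.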
--
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/