[Tutor] file.read() doesn't give full contents of compressed files

Tue Feb 20 13:53:01 CET 2007

Barton David wrote:
> Hi,
> I'm really confused, and I hope somebody can explain this for me...
>  
> I've been playing with compression and archives, and have some .zip, 
> .tar, .gz and .tgz example files to test my code on.
> I can read them using either zipfile, tarfile, gzip or zlib, and that's 
> fine. But just reading them in 'raw' doesn't give me the whole string of 
> (compressed) bytes.
>  
> i.e...
>  
> len( file("mytestfile","r").read() ) != os.path.getsize("mytestfile")
>  
> Not even close, in fact. It seems like file.read() just stops after 
> reading a small portion of each example file, but why would that happen? 
> And what could I do if I wanted to read in the entire (compressed) 
> contents as a string?

You are reading the file in text mode. On Windows, text mode converts 
\r\n to \n as it reads and treats a Ctrl-Z byte (0x1A) as end-of-file, 
so on binary data read() can stop early and len() comes out smaller 
than getsize(). Open the file in binary mode instead: 
file("mytestfile","rb").
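
A minimal sketch of the binary-mode read, in case it helps to see it 
next to the size check. The helper name read_all_bytes and the sample 
bytes are made up for illustration (they are not from your code), and 
open() is used in place of file() so the snippet also runs on newer 
Pythons:

```python
import os
import tempfile

def read_all_bytes(path):
    # "rb" = binary mode: no newline translation and no special
    # treatment of the Ctrl-Z byte, so every byte on disk is returned.
    with open(path, "rb") as f:
        return f.read()

# Hypothetical stand-in for a compressed file: arbitrary bytes that
# include \r\n and 0x1A, the values text mode on Windows interferes with.
sample = b"PK\x03\x04\r\n\x1a\n" + b"\x00" * 100

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(sample)
    data = read_all_bytes(path)
    size = os.path.getsize(path)
finally:
    os.remove(path)

# In binary mode the two numbers should agree.
print(len(data) == size)
```

The same read_all_bytes("mytestfile") call should then give you the 
whole compressed string for any of your .zip, .tar, .gz or .tgz files.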

Kent