Determine file type (binary or text)

Karl Scalet news at
Wed Aug 13 14:39:57 CEST 2003

Michael Peuser wrote:
> Hi,
> yes there is more than just Unix in the world ;-)
> Windows directories have no means to specify their contents type in any way.

That's even more true on Linux/Unix, where there is no need for
anything like line-terminator conversion.

> The approved method is using three-letter extensions, though this rule is
> not strictly followed (lots of files without extensions nowadays!).
> When I had a similar problem I read 1000 characters, counted the number of
> <32 and >255 characters and classified the file as "binary" when this quota
> exceeded 20%. I have no idea whether it will work well with Chinese Unicode
> files or some funny depositories or project files that store uncompressed
> texts....
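
The counting heuristic described above could be sketched roughly as follows. This is only an illustration under my own assumptions, not Michael's actual code: the names looks_binary and is_binary_file are made up, the sample is read as bytes (so no value can exceed 255 and only the "<32" part is counted), and tab, newline, form feed and carriage return are treated as ordinary text.

```python
def looks_binary(data: bytes, threshold: float = 0.20) -> bool:
    """Guess whether a byte sample is binary using a control-character quota."""
    if not data:
        return False  # assumption: an empty sample counts as text
    # Assumption: these control characters are normal in text files
    # (tab, line feed, form feed, carriage return).
    allowed = {9, 10, 12, 13}
    suspicious = sum(1 for b in data if b < 32 and b not in allowed)
    return suspicious / len(data) > threshold


def is_binary_file(path, sample_size=1000):
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as f:
        return looks_binary(f.read(sample_size))
```

Note that ASCII-range text saved as UTF-16 has a NUL in every other byte, so this sketch would call it binary, which fits the doubt about Unicode files raised above.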

Based on the idea from Mr. "bromden", why not use mimetypes.MimeTypes()
and guess_type('file://...') and analyze the returned string?
This should work on Windows / Linux / Unix / whatever.
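
A minimal sketch of that idea, with the caveat that guess_type() goes by the file name (its extension) and does not look at the contents, so a file without a known extension gives no answer. The helper name guess_is_text and the decision to treat compressed wrappers such as gzip as "not text" are my own assumptions.

```python
import mimetypes


def guess_is_text(name):
    """Return True or False when the type is recognised, None when it is not."""
    mime_type, encoding = mimetypes.MimeTypes().guess_type(name)
    if encoding is not None:
        return False  # assumption: compressed wrappers (e.g. gzip) are not text
    if mime_type is None:
        return None  # unknown or missing extension: no guess possible
    return mime_type.startswith("text/")
```

For example, guess_is_text("notes.txt") gives True, guess_is_text("photo.png") gives False, and a name without any extension gives None, in which case some content-based check would still be needed.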


> Kindly
> Michael P
> "Sami Viitanen" <none at> wrote in message
> news:v7p_a.1558$k4.32814 at
>>Works well in Unix but I'm making a script that works on both
>>Unix and Windows.
>>Win doesn't have that 'file -bi' command.
>>"bromden" <bromden at> wrote in message
>>news:bhd559$ku9$1 at
>>>>How can I check if a file is binary or text?
>>> >>> import os
>>> >>> f = os.popen('file -bi', 'r')
>>> >>>'text')
>>>(btw, returns 'text/x-java; charset=us-ascii\n')
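
For reference, the 'file -bi' approach quoted above might look like this with the subprocess module. It is a sketch under assumptions: the helper name file_says_text is made up, the startswith("text") test is my reading of the quoted snippet, and the -i flag is assumed to print a MIME type as it does on Linux (other systems may differ). It returns None where no file command is installed, as on a plain Windows box.

```python
import shutil
import subprocess


def file_says_text(path):
    """Ask the external 'file' command whether path holds text.

    Returns True or False, or None when 'file' is not available.
    """
    if shutil.which("file") is None:
        return None  # e.g. Windows without a 'file' command
    result = subprocess.run(
        ["file", "-bi", path],
        capture_output=True,
        text=True,
        check=False,
    )
    # On Linux the output looks like 'text/x-java; charset=us-ascii\n'.
    return result.stdout.startswith("text")
```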
