Opening multiple Files in Different Encoding

Oscar Benjamin oscar.j.benjamin at
Wed Jul 11 22:55:31 CEST 2012

On 11 July 2012 19:15, <subhabangalore at> wrote:

> On Tuesday, July 10, 2012 11:16:08 PM UTC+5:30, Subhabrata wrote:
> > Dear Group,
> >
> > I kept a good number of files in a folder. Now I want to read all of
> > them. They are in different formats and different encoding. Using
> > listdir/glob.glob I am able to find the list but how to open/read or
> > process them for different encodings?
> >
> > If anyone can help me out. I am using Python 3.2 on Windows.
> >
> > Regards,
> > Subhabrata Banerjee.
> Dear Group,
> No, generally I know glob.glob and the encodings, as I work a lot on
> non-ASCII stuff, but I recently found an interesting issue: suppose there
> are .doc, .docx, .txt, .xls and .pdf files with different encodings.

Some of the formats you have listed are not text-based. What do you mean by
the encoding of e.g. a .doc or .xls file?

My understanding is that these are binary files. You won't be able to read
them without the help of a special module (I don't know of one that can).
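
For the "which type is it" part, a rough sketch (not a complete detector) is
to look at the first few bytes of the file. The function names below are my
own invention. The signatures are the usual ones: PDF files start with
%PDF, legacy Office files (.doc, .xls) share the OLE2 compound-file header,
and .docx/.xlsx are ZIP archives. Note that this only identifies the
container, so it cannot tell a .doc from an .xls.

```python
# Hypothetical helpers: guess a file's container format from its first bytes.
OLE2_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'  # legacy .doc / .xls / .ppt
ZIP_MAGIC = b'PK\x03\x04'                          # .docx / .xlsx (and any zip)
PDF_MAGIC = b'%PDF'


def sniff_container(header):
    """Return a coarse format name for the leading bytes of a file."""
    if header.startswith(PDF_MAGIC):
        return 'pdf'
    if header.startswith(OLE2_MAGIC):
        return 'ole2'
    if header.startswith(ZIP_MAGIC):
        return 'zip'
    return 'unknown'  # possibly plain text, possibly something else


def sniff_path(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, 'rb') as f:
        return sniff_container(f.read(8))
```

Anything reported as 'unknown' may be plain text, but that still has to be
checked separately.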

> 1) First I have to determine on the fly the file type.
> 2) I cannot assign encoding="..."; whatever the encoding is, I have to
> read it.

Perhaps you just want to open the file as binary? The following will read
the contents of any file, binary or text, regardless of encoding:

f = open('spreadsheet.xls', 'rb')
data = f.read()  # returns binary data (bytes) rather than text
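
Applied to a whole folder, that might look something like the sketch below.
The function names and the list of candidate encodings are assumptions of
mine, not anything standard. Bear in mind that a successful decode does not
prove the guess was right: latin-1 accepts any byte sequence, so it only
makes sense as a last resort.

```python
import os


def read_all_bytes(folder):
    """Return {filename: bytes} for every regular file in `folder`."""
    contents = {}
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            with open(path, 'rb') as f:
                contents[name] = f.read()
    return contents


def try_decode(data, encodings=('utf-8', 'cp1252', 'latin-1')):
    """Try each candidate encoding in order; return (text, encoding).

    Returns (None, None) if every candidate raises UnicodeDecodeError.
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return None, None
```

You would only call try_decode on the files you believe are text (the .txt
ones, say); the bytes of a .doc or .pdf need a format-specific parser
rather than a decode.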

> Any idea. Thinking.
> Thanks in Advance,
> Regards,
> Subhabrata Banerjee.
> --