[Python-Dev] Do I misunderstand how codecs.EncodedFile is supposed to work?
Martin v. Loewis
martin@v.loewis.de
07 Aug 2002 08:46:59 +0200
Skip Montanaro <skip@pobox.com> writes:
> I thought the whole purpose of the EncodedFile class was to provide
> transparent encoding.
""" Return a wrapped version of file which provides transparent
encoding translation.
Strings written to the wrapped file are interpreted according
to the given data_encoding and then written to the original
file as string using file_encoding. The intermediate encoding
will usually be Unicode but depends on the specified codecs.
Strings are read from the file using file_encoding and then
passed back to the caller as string using data_encoding.
If file_encoding is not given, it defaults to data_encoding.
"""
So, no. It provides transparent recoding between two byte encodings:
a file encoding and a data encoding.
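To illustrate what "recoding" means here, the following is a minimal sketch, written for Python 3 (where byte strings are `bytes`) rather than the interpreter current when this was posted. The `io.BytesIO` object is only a stand-in for a real file, and the latin-1/utf-8 pairing is an arbitrary choice for the example.

```python
import codecs
import io

# In-memory stand-in for a file opened in binary mode.
raw = io.BytesIO()

# data_encoding: what the caller reads and writes (utf-8 bytes).
# file_encoding: what is actually stored in the underlying file (latin-1 bytes).
wrapped = codecs.EncodedFile(raw, data_encoding='utf-8', file_encoding='latin-1')

# The caller hands over utf-8 encoded bytes, not a Unicode string.
wrapped.write('café'.encode('utf-8'))

# The underlying file now holds the same text recoded to latin-1.
stored = raw.getvalue()

# Reading goes the other way: latin-1 in the file, utf-8 back to the caller.
raw.seek(0)
read_back = wrapped.read()
```

Both ends of the wrapper deal in encoded bytes; Unicode appears only as the intermediate form.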
I never found this class useful.
What you want is a StreamWriter:
f = codecs.getwriter('utf-8')(open('unicode-test', 'wb'))
Of course, *this* specific case can be written much more easily as
f = codecs.open('unicode-test', 'w', encoding='utf-8')
The getwriter form is useful if you already have a file-like object
from somewhere.
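As a rough sketch of the "already have a file-like object" case, assuming Python 3 and using `io.BytesIO` as a stand-in for whatever binary stream was handed to you, `codecs.getwriter` (the name the codecs module actually uses) can wrap it like this:

```python
import codecs
import io

# Stand-in for a binary file-like object obtained from elsewhere
# (a socket file, a pipe, an already-open file, ...).
buffer = io.BytesIO()

# getwriter() returns the StreamWriter class for the codec;
# calling it with the stream gives a writer that accepts Unicode strings.
writer = codecs.getwriter('utf-8')(buffer)

# Unicode goes in; utf-8 encoded bytes land in the underlying stream.
writer.write('Grüße\n')

encoded = buffer.getvalue()
```

The caller never calls .encode() itself; the writer does the encoding on every write.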
> Shouldn't it support transparent encoding of Unicode
> objects? That is, I told the system I want writes to be in utf-8 when I
> instantiated the class.
You also told it that the input data are in utf-8: you passed only one
encoding, so file_encoding defaults to data_encoding.
> I don't think I should have to call .encode() directly. I realize I
> can wrap the function in a class that adds the transparency I
> desire, but it seems the whole point should be to make it easy to
> write Unicode objects to files.
Not this class, no.
Now, you may ask what else is the purpose of this class. I really
don't know - it is against everything I'm advocating, as it assumes
that you have byte strings in a certain encoding in your memory that
you want to save in a different encoding. That should never happen -
all your text data should be Unicode strings.
Regards,
Martin