(Simple?) Unicode Question
shashank.sunny.singh at gmail.com
Thu Aug 27 18:39:06 CEST 2009
I have a very simple (and probably stupid) question that has been eluding me.
When exactly is the char-set information needed?
To make my question clear consider reading a file.
While reading a file, all I get is basically an array of bytes.
Now suppose a file has 10 bytes in it (all of it data, no metadata;
forget the BOM and such for a moment). I read it into an array of 10
bytes, replace, say, the 2nd byte, and write all the bytes back to a new
file. Do I need the character encoding mumbo jumbo anywhere in this?
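
To make the scenario concrete, here is a minimal sketch in Python 3 of
the operation I have in mind. The helper name replace_byte, the file
names and the replacement value 0xFF are made up for illustration; the
only point is that both files are opened in binary mode, so no encoding
is named anywhere.

```python
import os
import tempfile


def replace_byte(src_path, dst_path, index, new_value):
    """Copy src_path to dst_path, overwriting the byte at `index`.

    Hypothetical helper for illustration only. Both files are opened in
    binary mode, so the data is never decoded into characters.
    """
    with open(src_path, "rb") as src:
        data = bytearray(src.read())   # mutable sequence of ints 0..255
    data[index] = new_value            # index 1 is the 2nd byte
    with open(dst_path, "wb") as dst:
        dst.write(data)


# Throwaway 10-byte input file (contents are arbitrary sample data).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.bin")
dst = os.path.join(workdir, "out.bin")
with open(src, "wb") as f:
    f.write(bytes(range(10)))

replace_byte(src, dst, 1, 0xFF)

with open(dst, "rb") as f:
    result = f.read()
```

Here result is still 10 bytes long and differs from the input only at
index 1.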
Further, does anything except a printing device need to know the
encoding of a piece of "text"? I mean, as long as we are not trying
to get a symbolic representation of the "text" or get its "i"th
character, all we need to do is carry the intended encoding along as
auxiliary information with the data stored as a byte array.
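
As a rough illustration of the "i"th character case, the Python 3
sketch below decodes the same byte array under two different encodings.
The sample word and the choice of UTF-8 versus Latin-1 are assumptions
made up for the example, and the (bytes, encoding name) tuple is just
one possible way of carrying the encoding alongside the data.

```python
# The same bytes, interpreted under two different encodings.
raw = "na\u00efve".encode("utf-8")   # 6 bytes: the i-with-diaeresis takes 2

as_utf8 = raw.decode("utf-8")        # 5 characters
as_latin1 = raw.decode("latin-1")    # 6 characters, one per byte

# Byte-level operations need no encoding at all.
byte_count = len(raw)

# Character-level operations depend on which encoding is assumed.
third_char_utf8 = as_utf8[2]
third_char_latin1 = as_latin1[2]

# Carrying the intended encoding as auxiliary information with the data.
tagged = (raw, "utf-8")
data, encoding = tagged
text = data.decode(encoding)
```

The length of raw is the same whichever encoding is assumed, but the
third character is not: it is U+00EF under UTF-8 and U+00C3 under
Latin-1.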
More information about the Python-list mailing list