
On Fri, Aug 22, 2014 at 04:42:29AM +0200, Oleg Broytman wrote:
> On Thu, Aug 21, 2014 at 05:30:14PM -0700, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
> > This brings up the other key problem. If file names are (almost) arbitrary bytes, how do you write one to/read one from a text file with a particular encoding? (Or, for that matter, display it on a terminal?)
>
> There is no such thing as an encoding of text files.

I don't understand this comment. It seems to me that *text* files have to have an encoding, otherwise you can't interpret the contents as text. Files, of course, only contain bytes, but to be treated as text you need some way of transforming byte N to char C (or multiple bytes to C), which is an encoding. Perhaps you just mean that encodings are not recorded in the text file itself?

To answer Chris' question, you typically cannot include arbitrary bytes in text files, and displaying them to the user is likewise problematic. The usual solution is to support some form of escaping, like \t, &#x0A; or %0D, to give a few examples.

-- Steven
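
To make the escaping idea concrete, here is a minimal Python 3 sketch. It is only an illustration, not a recommendation from anyone in this thread: the file name `raw_name` is made up, and percent-escaping via `urllib.parse` is just one of the escaping styles mentioned above. The second half shows Python's `surrogateescape` error handler, which lets undecodable bytes survive a round trip through `str`.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Hypothetical file name: a Latin-1 byte (0xE9) that is not valid UTF-8,
# plus an embedded tab. Both are awkward to store in a text file.
raw_name = b"caf\xe9\tmenu.txt"

# Percent-escape every byte outside the unreserved ASCII set, so the
# result is plain ASCII and safe to write in any ASCII-compatible encoding.
escaped = quote_from_bytes(raw_name, safe="")

# Reverse the escaping to recover the exact original bytes.
restored = unquote_to_bytes(escaped)

# Alternative: decode with surrogateescape, which maps each undecodable
# byte to a lone surrogate code point instead of raising an error.
as_text = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into bytes.
round_trip = as_text.encode("utf-8", errors="surrogateescape")

print(escaped)
print(restored == raw_name, round_trip == raw_name)
```

Note that the `surrogateescape` string still cannot be written to a text file with a strict encoder, so an explicit escaping step like the first one is still needed for storage or display.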