[Tutor] name shortening in a csv module output
Dave Angel
davea at davea.name
Thu Apr 23 23:40:34 CEST 2015
On 04/23/2015 05:08 PM, Mark Lawrence wrote:
>
> Slight aside, why a BOM, all I ever think of is Inspector Clouseau? :)
>
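As I recall, it stands for "Byte Order Mark". It matters mainly for
storage formats with multi-byte code units (e.g. UTF-16), where it lets
the reader decide which byte order was used. (UTF-8 can carry one too,
but there it is only a signature, since there is no byte order to pick.)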
For example, a file that reads
fe ff 00 41 00 42
would be the big-endian version of UTF-16 (the text "AB")
while
ff fe 41 00 42 00
would be the little-endian version of the same data.
I probably have it all inside out and backwards, but that's the general
idea. If the BOM appears backwards, you switch between BE and LE and
the data will make sense.
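
Here is a minimal sketch in Python 3 of how a reader might use the BOM to
pick the byte order. The function name decode_utf16 and the sample bytes
are made up for illustration; the codecs.BOM_UTF16_BE and
codecs.BOM_UTF16_LE constants come from the standard library.

```python
import codecs


def decode_utf16(raw):
    """Decode UTF-16 bytes, choosing the byte order from a leading BOM.

    Hypothetical helper for illustration only.
    """
    if raw.startswith(codecs.BOM_UTF16_BE):    # bytes fe ff
        return raw[2:].decode("utf-16-be")
    if raw.startswith(codecs.BOM_UTF16_LE):    # bytes ff fe
        return raw[2:].decode("utf-16-le")
    raise ValueError("no UTF-16 BOM found")


# The text "AB" stored both ways, each with its BOM in front.
big = bytes.fromhex("feff00410042")
little = bytes.fromhex("fffe41004200")

print(decode_utf16(big))
print(decode_utf16(little))
```

In practice you rarely need to write this yourself: the plain "utf-16"
codec reads the BOM and does the same thing, so big.decode("utf-16")
and little.decode("utf-16") give the same string.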
The same concept was used many years ago in two places I know of.
TIFF files (the binary format often used for faxes) begin with "II" or
"MM", for Intel (little-endian) or Motorola (big-endian) byte order.
And the UCSD p-System program format used a number (I think it was 0001)
which would decode wrong if you were on the wrong processor type. The
idea was that instead of coding an explicit check, you just looked at
one piece of data, and if it was wrong, you had to swap all the
byte-pairs. That way if you read the file on the same machine, no work
was needed at all.
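
The "look at one known value and swap if it's wrong" idea can be sketched
in Python like this. It is not the actual p-System file layout; the
sentinel value 0x0001, the 16-bit word layout, and the function names are
all assumptions chosen for illustration.

```python
import struct

SENTINEL = 0x0001  # assumed marker value, stored as the first 16-bit word


def write_words(words, byteorder):
    """Pack the sentinel plus 16-bit words; byteorder is "<" or ">"."""
    fmt = "%s%dH" % (byteorder, len(words) + 1)
    return struct.pack(fmt, SENTINEL, *words)


def read_words(data):
    """Read 16-bit words in native order, swapping byte pairs if needed."""
    count = len(data) // 2
    words = struct.unpack("=%dH" % count, data)    # "=" means native order
    if words[0] != SENTINEL:
        # The marker decoded wrong, so the file came from the other kind
        # of machine: swap the two bytes of every word.
        words = tuple(((w & 0xFF) << 8) | (w >> 8) for w in words)
    if words[0] != SENTINEL:
        raise ValueError("sentinel not found in either byte order")
    return list(words[1:])


payload = [0x1234, 0xABCD]
print(read_words(write_words(payload, "<")))
print(read_words(write_words(payload, ">")))
```

Whichever of the two files matches your machine's own byte order is read
with no swapping at all; the other gets its byte pairs swapped. Either
way the same numbers come back.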
Seems to me the Java bytecode does something similar, but I don't know.
All of these are from memory, and subject to mis-remembering.
--
DaveA