Python version of perl's "if (-T ..)" and "if (-B ...)"?
MRAB
python at mrabarnett.plus.com
Fri Feb 12 12:05:37 EST 2010
Christian Heimes wrote:
> Lloyd Zusman wrote:
>> The -T and -B switches work as follows. The first block or so
>> of the file is examined for odd characters such as strange control
>> codes or characters with the high bit set. If too many strange
>> characters (>30%) are found, it's a -B file; otherwise it's a -T
>> file. Also, any file containing null in the first block is
>> considered a binary file. [ ... ]
>
> That's a butt ugly heuristic that will lead to lots of false positives
> if your text happens to be UTF-16 encoded or non-english text UTF-8 encoded.
>
...or non-English Latin-1 text...
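
For reference, the heuristic as quoted above could be sketched in Python roughly like this. It is only an illustration of the quoted description, not Perl's actual implementation: the names looks_binary and is_binary_file, the 512-byte block size, the choice of which control bytes count as "normal", and treating an empty block as text are all assumptions.

```python
def looks_binary(block: bytes, threshold: float = 0.30) -> bool:
    """Guess whether a block of bytes is binary, per the quoted -T/-B description."""
    if not block:
        return False  # assumption: an empty block is treated as text
    if b"\x00" in block:
        return True  # any null byte in the first block means binary
    # Assumption: printable ASCII plus common whitespace/control bytes are "normal".
    text_bytes = bytes(range(0x20, 0x7F)) + b"\t\n\r\f\b"
    # Deleting the normal bytes leaves only the "odd" ones.
    odd = len(block.translate(None, text_bytes))
    return odd / len(block) > threshold


def is_binary_file(path: str, blocksize: int = 512) -> bool:
    """Apply the heuristic to the first block of a file (block size is an assumption)."""
    with open(path, "rb") as f:
        return looks_binary(f.read(blocksize))
```

With this sketch, looks_binary("hello".encode("utf-16")) and looks_binary("äöüß".encode("latin-1")) both return True, which are exactly the false positives mentioned above: UTF-16 text contains null bytes, and non-English Latin-1 text can easily exceed the 30% high-bit threshold.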