Python version of perl's "if (-T ..)" and "if (-B ...)"?

MRAB python at mrabarnett.plus.com
Fri Feb 12 12:05:37 EST 2010


Christian Heimes wrote:
> Lloyd Zusman wrote:
>> .... The -T and -B switches work as follows. The first block or so
>> .... of the file is examined for odd characters such as strange control
>> .... codes or characters with the high bit set. If too many strange
>> .... characters (>30%) are found, it's a -B file; otherwise it's a -T
>> .... file. Also, any file containing null in the first block is
>> .... considered a binary file. [ ... ]
> 
> That's a butt-ugly heuristic that will lead to lots of false positives
> if your text happens to be UTF-16 encoded or non-English text encoded as UTF-8.
> 
...or non-English Latin-1 text...
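
As a rough illustration, the heuristic described in the quoted text could be sketched in Python 3 like this. The names looks_binary and is_binary_file, the 512-byte block size, and the exact set of bytes counted as ordinary text are assumptions made for the example. It follows the quoted description only and is not a port of Perl's actual implementation.

```python
# Assumption: "ordinary" bytes are printable ASCII plus a few common
# control codes (tab, newline, carriage return, form feed, backspace, escape).
# Everything else, including any byte with the high bit set, counts as "odd".
TEXT_BYTES = set(b"\t\n\r\f\b\x1b") | set(range(0x20, 0x7F))


def looks_binary(block, threshold=0.30):
    """Apply the quoted -T/-B heuristic to a block of bytes."""
    if not block:
        return False  # assumption: an empty block is treated as text
    if b"\x00" in block:
        return True  # any null byte means "binary"
    odd = sum(1 for byte in block if byte not in TEXT_BYTES)
    return odd / len(block) > threshold


def is_binary_file(path, blocksize=512):
    """Check only the first block of the file (block size is an assumption)."""
    with open(path, "rb") as f:
        return looks_binary(f.read(blocksize))


if __name__ == "__main__":
    samples = [
        b"plain ASCII text\n",
        "hello".encode("utf-16"),       # contains null bytes
        "καλημέρα".encode("utf-8"),     # every byte has the high bit set
        "été".encode("latin-1"),        # two of three bytes have the high bit set
    ]
    for data in samples:
        print(repr(data), looks_binary(data))
```

With these assumptions, only the plain ASCII sample is classified as text. The UTF-16, non-English UTF-8 and Latin-1 samples are all reported as binary, which is exactly the kind of false positive pointed out above.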


