Python version of perl's "if (-T ..)" and "if (-B ...)"?

Christian Heimes lists at cheimes.de
Fri Feb 12 09:14:07 EST 2010


Lloyd Zusman wrote:
> .... The -T  and -B  switches work as follows. The first block or so
> .... of the file is examined for odd characters such as strange control
> .... codes or characters with the high bit set. If too many strange
> .... characters (>30%) are found, it's a -B file; otherwise it's a -T
> .... file. Also, any file containing null in the first block is
> .... considered a binary file. [ ... ]

That's a butt-ugly heuristic that will produce lots of false positives
(text reported as binary) if your text happens to be UTF-16 encoded, or
if it is non-English text encoded as UTF-8.
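
To make the objection concrete, here is a minimal sketch of the heuristic
as described in the quoted text, not Perl's actual implementation. The
names block_looks_binary, looks_binary and TEXT_BYTES, the 512-byte block
size, the exact set of "ordinary" bytes, and treating an empty block as
text are all assumptions made for illustration.

```python
# Bytes treated as ordinary text: printable ASCII plus common whitespace
# and control characters. The exact set is an assumption.
TEXT_BYTES = bytes(range(0x20, 0x7F)) + b"\t\n\r\f\b"


def block_looks_binary(block, threshold=0.30):
    """Apply the quoted rule to one block of bytes."""
    if not block:
        return False  # assumption: an empty block counts as text
    if b"\x00" in block:
        return True  # any null byte means "binary"
    # Delete the ordinary bytes; whatever remains is "strange".
    odd = len(block.translate(None, TEXT_BYTES))
    return odd / len(block) > threshold


def looks_binary(path, block_size=512, threshold=0.30):
    """Check only the first block of a file (block size is an assumption)."""
    with open(path, "rb") as f:
        return block_looks_binary(f.read(block_size), threshold)
```

Under these assumptions, plain ASCII passes as text, but
block_looks_binary("hello".encode("utf-16")) is True because of the null
bytes, and block_looks_binary("Привет, мир".encode("utf-8")) is True
because nearly every byte has the high bit set.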

Christian


