Python version of perl's "if (-T ..)" and "if (-B ...)"?
Christian Heimes
lists at cheimes.de
Fri Feb 12 09:14:07 EST 2010
Lloyd Zusman wrote:
> The -T and -B switches work as follows. The first block or so
> of the file is examined for odd characters such as strange control
> codes or characters with the high bit set. If too many strange
> characters (>30%) are found, it's a -B file; otherwise it's a -T
> file. Also, any file containing null in the first block is
> considered a binary file. [ ... ]
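That's a butt ugly heuristic that will lead to lots of false positives
(text reported as binary) if your text happens to be UTF-16 encoded, or
is non-English text encoded as UTF-8. UTF-16 puts a null byte next to
every ASCII character, and non-ASCII characters in UTF-8 consist
entirely of bytes with the high bit set.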
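
For illustration, here is a minimal Python sketch of the heuristic as quoted above. It is not Perl's actual implementation: the names looks_binary and is_binary_file, the 512-byte block size, the set of control bytes counted as normal text, and the handling of empty input are all assumptions made for this example.

```python
# Hypothetical sketch of the quoted -T/-B rule; not Perl's real code.
# Assumption: these control bytes (tab, LF, CR, FF, BS, ESC) count as text.
TEXT_CONTROL = frozenset(b"\t\n\r\f\b\x1b")


def looks_binary(block, threshold=0.30):
    """Return True if a block of bytes looks binary under the quoted rule."""
    if not block:
        return False  # assumption: treat empty input as text
    if b"\x00" in block:
        return True  # any null byte in the block means binary
    odd = 0
    for byte in bytearray(block):
        high_bit = byte >= 0x80
        strange_control = (byte < 0x20 and byte not in TEXT_CONTROL) or byte == 0x7F
        if high_bit or strange_control:
            odd += 1
    # Binary if more than `threshold` of the bytes are "odd".
    return odd / len(block) > threshold


def is_binary_file(path, block_size=512):
    """Apply looks_binary() to the first block of a file (block size assumed)."""
    with open(path, "rb") as handle:
        return looks_binary(handle.read(block_size))
```

With this sketch, looks_binary(b"hello world\n") is False, while both "hello".encode("utf-16") and a Russian greeting encoded as UTF-8 come out as binary, which is the false-positive problem described above.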
Christian