[Python-ideas] os.path.isbinary

Steven D'Aprano steve at pearwood.info
Thu Aug 1 03:59:48 CEST 2013


On 01/08/13 05:03, Ryan wrote:
> 1. The link I provided wasn't how I wanted it to be. I was using it as an example to show it wasn't impossible.

But it *is* impossible even in principle to tell the difference between "text" and "binary", since both text and binary files are made up of the same bytes. Whether something is text or binary depends in part on the intention of the reader.

E.g. a text file containing the 32-byte ASCII string "Greetings and salutations Ryan\r\n" is bit-for-bit identical with a binary file containing four C doubles (eight bytes each, read in little-endian byte order):

1.6937577544703708e+190
2.6890193974129695e+161
9.083672029092351e+223
2.9908963169274674e-260
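
A minimal sketch of that comparison, assuming the doubles are IEEE 754 values stored in little-endian order (the "<4d" format string encodes that assumption; the variable names are only illustrative). It reads the same 32 bytes once as ASCII text and once as four doubles, and should reproduce the values quoted above:

```python
import struct

# The same 32 bytes, interpreted two different ways.
data = b"Greetings and salutations Ryan\r\n"
assert len(data) == 32  # 4 doubles * 8 bytes each

# Interpretation 1: ASCII text.
as_text = data.decode("ascii")

# Interpretation 2: four little-endian C doubles (an assumption about
# byte order; a big-endian reader would see different numbers).
as_doubles = struct.unpack("<4d", data)

print(repr(as_text))
for value in as_doubles:
    print(repr(value))
```

Nothing in the bytes themselves says which reading is the intended one.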


So any such "is binary" function cannot determine whether a file actually is binary or not. The best it can do is "might be text".

That perhaps leads to a less bad (although maybe not actually good) idea, a function which takes an encoding and tries to determine whether or not the contents of the file could be text in that encoding.
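
A rough sketch of what such a function might look like. The names could_be_text and file_could_be_text are hypothetical, not an existing or proposed os.path API, and the check is deliberately naive: it only asks whether the bytes decode without error in the given encoding.

```python
def could_be_text(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if data decodes cleanly in the given encoding.

    True only means "might be text"; it never proves the bytes
    were intended as text.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def file_could_be_text(path: str, encoding: str = "utf-8") -> bool:
    """Hypothetical helper: apply could_be_text to a whole file."""
    with open(path, "rb") as f:
        return could_be_text(f.read(), encoding)


# The greeting decodes as ASCII, even though the same bytes are
# also a valid array of four doubles.
print(could_be_text(b"Greetings and salutations Ryan\r\n", "ascii"))

# 0xFF can never appear in valid UTF-8, so this is rejected...
print(could_be_text(b"\xff", "utf-8"))

# ...but Latin-1 maps every byte to a character, so it accepts anything.
print(could_be_text(b"\xff", "latin-1"))
```

The Latin-1 case shows the limit of the approach: for some encodings every byte string "could be text", so the answer carries no information.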

But really, file type guessing is too complex to be a simple function like "isbinary" or even "maybetext".


> 3. Did no one get the 'nothingness/is/eternal' joke?

Not me.




-- 
Steven

