[Python-ideas] os.path.isbinary
Clay Sweetser
clay.sweetser at gmail.com
Wed Jul 31 19:15:37 CEST 2013
On Jul 31, 2013 12:22 PM, "Eli Bendersky" <eliben at gmail.com> wrote:
>
> On Wed, Jul 31, 2013 at 8:40 AM, Ryan <rymg19 at gmail.com> wrote:
>>
>> Here's something more interesting than my shlex idea.
>>
>> os.path is, pretty much, the Python FS toolbox, along with shutil. But
>> there's one feature missing: checking whether a file is binary. It isn't
>> hard; see http://code.activestate.com/recipes/173220/. But writing 50 lines
>> of code for such a common task isn't really Pythonic.
>>
>> So...
>>
>> What if os.path had a binary checker that works just like isfile:
>> os.path.isbinary('/nothingness/is/eternal') # Returns boolean
Besides the high chance of false positives, what makes this method (and the
problem it tries to solve) so difficult is that binary files may contain
large amounts of what is considered text, and text files may contain
pieces of binary data.
For example, consider a Windows executable file: much of the data in such
a file is considered binary, but there are defined sections where
strings and text resources are stored. Any heuristic algorithm like the one
mentioned will be insufficient in such cases.
Although I can't think of a situation offhand where the opposite is
true (binary data embedded in what is considered a text file), I'm
pretty sure such a situation exists.
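
To make the failure modes concrete, here is a minimal sketch of the kind of
heuristic under discussion. It is not the ActiveState recipe or any existing
os.path function; the name looks_binary, the set of "text" bytes, and the 30%
threshold are all assumptions picked for illustration.

```python
import string

# Assumption: "text" means printable ASCII plus the usual whitespace,
# with backspace and escape also tolerated.
_TEXT_BYTES = frozenset(string.printable.encode("ascii")) | {0x08, 0x1B}


def looks_binary(chunk: bytes, threshold: float = 0.30) -> bool:
    """Guess whether a chunk of bytes is binary (hypothetical helper)."""
    if not chunk:
        return False  # nothing to inspect, so call it text
    if b"\x00" in chunk:
        return True  # any NUL byte is treated as a sign of binary data
    # Count the bytes that fall outside the assumed "text" set.
    non_text = sum(byte not in _TEXT_BYTES for byte in chunk)
    return non_text / len(chunk) > threshold


# Ordinary ASCII text is classified as text.
plain = b"Hello, world!\n"

# The same text encoded as UTF-16 contains NUL bytes, so it is flagged
# as binary even though a person would call it a text file.
utf16 = "Hello, world!\n".encode("utf-16")

# A made-up chunk resembling a string table inside a binary file: almost
# entirely readable text, so the heuristic calls it text.
resource_chunk = b"\x90\x90" + b"File not found. Press any key to continue.\r\n" * 10

for name, data in [("plain", plain), ("utf16", utf16), ("resource_chunk", resource_chunk)]:
    print(name, looks_binary(data))
```

Both mistakes come from the same place: the function only sees the bytes in
the chunk it is given, not what the file as a whole is meant to be.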
>
>
>
> Some time ago I put on a gas mask and dove into the Perl source code to
> figure out how its "is binary" and "is text" operators work:
> http://eli.thegreenplace.net/2011/10/19/perls-guess-if-file-is-text-or-binary-implemented-in-python/
>
> I would recommend against including such a simplistic heuristic in the
> Python stdlib.
>
> Eli