How to know if a file is a text file
Nobody
nobody at nowhere.com
Sun Nov 15 13:56:01 EST 2009
On Sun, 15 Nov 2009 13:49:54 +0100, Luca wrote:
> I was quite sure that this is not a very simple task. Right now, searching
> only within the ASCII encoding is not enough for me (my native language is
> outside this encoding :-)
> Checking every single byte can be a good solution...
>
> I can start by using the mimetypes module and, if the file has no
> extension, check the bytes one by one, as the "file" command (commonly) does.
> Better: I can use the "file" command if available.
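For what it's worth, the approach quoted above could be sketched roughly like this in Python 3. This is only a heuristic under stated assumptions: the name looks_like_text, the 1024-byte sample size, and the set of "allowed" bytes are all made up for illustration, and high bytes (0x80-0xFF) are accepted so that non-ASCII text is not rejected.

```python
import mimetypes

# Assumption: these bytes count as "text": a few common control characters,
# printable ASCII, and every high byte (so non-ASCII encodings pass).
_TEXT_BYTES = (
    bytes([7, 8, 9, 10, 12, 13, 27])
    + bytes(range(0x20, 0x7F))
    + bytes(range(0x80, 0x100))
)


def looks_like_text(path, sample_size=1024):
    """Guess whether the file at `path` is text (hypothetical helper)."""
    # Step 1: if the extension is known to mimetypes, trust it.
    mime, _ = mimetypes.guess_type(path)
    if mime is not None:
        return mime.startswith("text/")

    # Step 2: no usable extension, so look at the first bytes instead.
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    if not sample:
        return True  # assumption: an empty file is treated as text

    # Delete every "text" byte; anything left over (NUL, other control
    # characters) suggests binary data.
    return not sample.translate(None, _TEXT_BYTES)
```

Known limits of this sketch: types such as application/json are reported as non-text because only the "text/" prefix is checked, and UTF-16 files contain NUL bytes, so they would be classified as binary.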
Another possible solution:
Universal Encoding Detector
Character encoding auto-detection in Python 2 and 3
http://chardet.feedparser.org/
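
A minimal sketch of how it might be used, assuming the chardet package is installed (it is third-party, not part of the standard library). chardet.detect() takes a bytes object and returns a dict that includes "encoding" and "confidence"; the helper name guess_encoding and the 4096-byte sample size are just for illustration.

```python
try:
    import chardet  # third-party package
except ImportError:
    chardet = None  # the sketch degrades gracefully without it


def guess_encoding(path, sample_size=4096):
    """Return (encoding, confidence) for the start of a file (hypothetical helper)."""
    if chardet is None:
        return None, 0.0
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    result = chardet.detect(sample)
    # "encoding" may be None when chardet cannot make a guess.
    return result["encoding"], result["confidence"]
```

A result with no encoding, or with a very low confidence, is a hint that the file may not be text at all, although where to put the confidence threshold is a judgment call.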