Efficient, built-in way to determine if string has non-ASCII chars outside ASCII 32-127, CRLF, Tab?
python at bdurham.com
Mon Oct 31 15:54:34 EDT 2011
I'm wondering if there's a fast, efficient built-in way to
determine whether a string contains any chars other than
ASCII 32-127, CR, LF, or Tab?
I know I can look at the chars of a string individually and
compare them against a set of legal chars using standard Python
code (and this works fine), but I will be working with some very
large files, from hundreds of GB to several TB in size, so I
thought I'd check whether there is a built-in implemented in C
that might handle this type of check more efficiently.
Does this sound like a use case for Cython or PyPy?
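
For illustration only, here is a minimal sketch of one possible
approach that stays inside the standard library. It assumes
Python 3, that the file is read as raw bytes, and that scanning in
fixed-size chunks is acceptable. The names ALLOWED,
has_illegal_bytes, file_has_illegal_bytes, and the 1 MiB chunk
size are hypothetical choices, not anything from an existing
library. It relies on bytes.translate() with a delete argument,
which strips the legal bytes in C so that anything left over must
be illegal.

```python
# Legal bytes as described above: ASCII 32-127 plus Tab, LF, and CR.
ALLOWED = bytes(range(32, 128)) + b"\t\n\r"


def has_illegal_bytes(chunk: bytes) -> bool:
    """Return True if chunk contains any byte outside ALLOWED."""
    # translate(None, ALLOWED) deletes every legal byte; the loop over
    # the data runs in C. A non-empty result means an illegal byte.
    return bool(chunk.translate(None, ALLOWED))


def file_has_illegal_bytes(path: str, chunk_size: int = 1 << 20) -> bool:
    """Scan a file chunk by chunk (assumed 1 MiB) and stop at the first hit."""
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            if has_illegal_bytes(chunk):
                return True
    return False
```

Because each byte is judged on its own, splitting the file at
arbitrary chunk boundaries should not change the answer, and
memory use stays bounded by the chunk size even for very large
files.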
Thanks,
Malcolm