<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<html><head><title></title><style type="text/css"><!-- body{padding:1ex;margin:0;font-family:sans-serif;font-size:small}a[href]{color:-moz-hyperlinktext!important;text-decoration:-moz-anchor-decoration}blockquote{margin:0;border-left:2px solid #144fae;padding-left:1em}blockquote blockquote{border-color:#006312}blockquote blockquote blockquote{border-color:#540000} --></style></head><body><div style="font-family: Arial; font-size: medium;" dir="ltr"><div>

        I'm wondering if there's a fast, efficient built-in way to determine whether a string contains any characters other than ASCII 32-127, CR, LF, or Tab?</div>

<div>

         </div>

<div>

        I know I can examine the characters of a string individually and compare them against a set of legal characters using standard Python code (and this works fine), but I will be working with some very large files, from hundreds of GB to several TB in size, so I thought I'd check whether there is a built-in implemented in C that might handle this type of check more efficiently.</div>
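<div>

        To make the question concrete, here is a minimal sketch of the kind of check I mean, pushed down into a C-level call. It assumes the data is read as bytes and uses bytes.translate with a delete set, so any byte left over is an illegal one. The names ALLOWED, has_illegal_bytes and file_has_illegal_bytes are hypothetical, and the 1 MB chunk size is an arbitrary assumption, not a tuned value.</div>

<pre>
```python
# Hypothetical helper names; the allowed set follows the range stated above:
# ASCII 32-127 plus CR, LF and Tab.
ALLOWED = bytes(range(32, 128)) + b"\r\n\t"


def has_illegal_bytes(data):
    """Return True if data contains any byte outside ALLOWED."""
    # translate(None, ALLOWED) deletes every allowed byte in C code;
    # whatever remains must be an illegal byte.
    return bool(data.translate(None, ALLOWED))


def file_has_illegal_bytes(path, chunk_size=1024 * 1024):
    """Scan a file in fixed-size chunks so it never has to fit in memory."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without finding an illegal byte.
                return False
            if has_illegal_bytes(chunk):
                return True
```
</pre>

<div>

        Reading in binary mode avoids decoding overhead, and since each byte is checked on its own, chunk boundaries do not affect the result.</div>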

<div>

         </div>

<div>

        Does this sound like a use case for Cython or PyPy?</div>

<div>

         </div>

<div>

        Thanks,</div>

<div>

        Malcolm</div>


</div></body></html>