Efficient, built-in way to determine if string has non-ASCII chars outside ASCII 32-127, CRLF, Tab?

MRAB python at mrabarnett.plus.com
Tue Nov 1 15:29:50 EDT 2011


On 01/11/2011 18:54, Duncan Booth wrote:
> Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
>
>> LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
>> MASK = ''.join('\01' if chr(n) in LEGAL else '\0' for n in range(128))
>>
>> # Untested
>> def is_ascii_text(text):
>>      for c in text:
>>          n = ord(c)
>>          if n >= len(MASK) or MASK[n] == '\0': return False
>>      return True
>>
>>
>> Optimizing it is left as an exercise :)
>>
>
> #untested
> LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
> MASK = [True if chr(n) in LEGAL else False for n in range(128)]
>
Instead of:

     True if chr(n) in LEGAL else False

why not:

     chr(n) in LEGAL
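
The `in` test already evaluates to True or False, so the conditional
expression around it adds nothing. A quick sketch to check that both
spellings build the same mask (MASK_VERBOSE, MASK_SHORT and `same` are
names made up for this comparison; they are not from the thread):

```python
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'

# Original form, with the redundant conditional expression.
MASK_VERBOSE = [True if chr(n) in LEGAL else False for n in range(128)]

# Shortened form: the membership test is already a bool.
MASK_SHORT = [chr(n) in LEGAL for n in range(128)]

# Both lists should be identical, element for element.
same = MASK_VERBOSE == MASK_SHORT
```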

> # Untested
> def is_ascii_text(text):
>    try:
>      return all(MASK[ord(c)] for c in text)
>    except IndexError:
>      return False
>
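
Since both versions above are marked untested, here is a minimal
assembled sketch with the shorter mask expression. It assumes `text` is
a str (in Python 3, iterating over bytes yields ints, so ord() would
fail there), and it keeps the thread's definition of LEGAL, which
includes form feed and DEL (127):

```python
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
MASK = [chr(n) in LEGAL for n in range(128)]

def is_ascii_text(text):
    """Return True if every character of text is in LEGAL."""
    try:
        # MASK[ord(c)] is False for disallowed control characters;
        # any code point >= 128 raises IndexError instead.
        return all(MASK[ord(c)] for c in text)
    except IndexError:
        return False
```

For example, is_ascii_text('line one\r\n\tline two') is accepted, while
a string containing '\xe9' or '\x00' is rejected.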


