Efficient, built-in way to determine if string has non-ASCII chars outside ASCII 32-127, CRLF, Tab?
MRAB
python at mrabarnett.plus.com
Tue Nov 1 15:29:50 EDT 2011
On 01/11/2011 18:54, Duncan Booth wrote:
> Steven D'Aprano<steve+comp.lang.python at pearwood.info> wrote:
>
>> LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
>> MASK = ''.join('\01' if chr(n) in LEGAL else '\0' for n in range(128))
>>
>> # Untested
>> def is_ascii_text(text):
>>     for c in text:
>>         n = ord(c)
>>         if n >= len(MASK) or MASK[n] == '\0': return False
>>     return True
>>
>>
>> Optimizing it is left as an exercise :)
>>
>
> #untested
> LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
> MASK = [True if chr(n) in LEGAL else False for n in range(128)]
>
Instead of:

    True if chr(n) in LEGAL else False

why not just:

    chr(n) in LEGAL

The "in" test already evaluates to True or False.

> # Untested
> def is_ascii_text(text):
>     try:
>         return all(MASK[ord(c)] for c in text)
>     except IndexError:
>         return False
>
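Putting the pieces from this thread together, a minimal sketch might look like the following. It assumes Python 3 and keeps the names LEGAL, MASK and is_ascii_text from the quoted code. The docstring and comments are additions for illustration, and it has only been lightly checked.

```python
# Allowed characters: ASCII 32-127 (as in the quoted code) plus LF, CR,
# tab and form feed.
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'

# "chr(n) in LEGAL" is already a bool, so no conditional expression is needed.
MASK = [chr(n) in LEGAL for n in range(128)]

def is_ascii_text(text):
    """Return True if every character of text is in LEGAL."""
    try:
        # all() stops at the first False entry; any character with
        # ord(c) >= 128 falls off the end of MASK and raises IndexError.
        return all(MASK[ord(c)] for c in text)
    except IndexError:
        return False
```

For example, is_ascii_text("hello\r\n\tworld") is True, while is_ascii_text("caf\xe9") and is_ascii_text("a\x00b") are both False. The list lookup avoids a substring search per character, and all() returns as soon as one illegal character is found.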