Fastest way to detect a non-ASCII character in a list of strings.

Tim Chase python.list at
Mon Oct 18 04:38:55 CEST 2010

On 10/17/10 19:04, Rhodri James wrote:
>     import string
>     return set("".join(L))<= set(string.printable)
> I've no idea whether this is faster or slower than any of your
> suggestions.

For set("".join(L)) to return, it has to scan the entire input
list/string, even when the very first character is unprintable.  Imagine

   s = UNPRINTABLE_CHAR + ('a'*1000000)

I'd sooner do something like

   printable_set = set(string.printable)
   return all((c in printable_set) for c in s)

which will bail as soon as it encounters a character that isn't
printable.
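
To make the short-circuit idea concrete for a list of strings (rather
than one string s), here is a minimal sketch.  The names PRINTABLE,
all_printable and all_printable_via_set are made up for illustration,
and it assumes Python 3 and that "valid" means string.printable:

```python
import string
from itertools import chain

# Build the lookup set once, not once per call (hypothetical name).
PRINTABLE = frozenset(string.printable)

def all_printable(strings):
    """Return True if every character of every string is in PRINTABLE."""
    # chain.from_iterable walks the strings lazily, so all() can stop at
    # the first offending character without joining the list first.
    return all(c in PRINTABLE for c in chain.from_iterable(strings))

def all_printable_via_set(strings):
    """Set-based variant: always consumes the whole input before answering."""
    return set("".join(strings)) <= PRINTABLE
```

Both functions should give the same answer; the difference is only in
how much of the input they have to look at before giving it.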

This also somewhat addresses Seebs's concern about defining what 
you want -- put "valid" characters in the set.  But the various 
algorithms you (the OP, Dun) propose don't behave the same way
for characters below ASCII 32 (space).  Your #2 is more like

   ord(c) < 128

instead of "31 < ord(c) < 128"
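
A small sketch of how the two tests differ, assuming Python 3; the
function names is_ascii and is_ascii_no_control are hypothetical and
are not the OP's code:

```python
def is_ascii(c):
    # Accepts every code point below 128, control characters included.
    return ord(c) < 128

def is_ascii_no_control(c):
    # Rejects everything below space (32); still accepts DEL (127).
    return 31 < ord(c) < 128

# Print both verdicts side by side for a few sample characters.
for ch in ("\t", " ", "~", "\x7f", "\xe9"):
    print(repr(ch), is_ascii(ch), is_ascii_no_control(ch))
```

The two only disagree on the control characters 0 through 31, such as
tab and newline.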

As a modest speed-up on your (OP's) #3, you can do one 
conversion-to-char for your endpoints instead of N conversions:

   start = chr(31)
   end = chr(127)   # excludes DEL (127); chr(128) would match "ord(c) < 128"
   return all(start < c < end for c in ...)
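
For comparison, a sketch of the two forms side by side, assuming
Python 3 (where single-character strings compare by code point).  The
names LOW, HIGH, in_range_by_ord and in_range_by_char are made up here,
and HIGH is chr(128) so that the result matches "31 < ord(c) < 128"
exactly; with chr(127) as the upper bound, DEL would be rejected too:

```python
LOW = chr(31)    # converted once, outside the loop
HIGH = chr(128)  # mirrors "ord(c) < 128"

def in_range_by_ord(s):
    # One ord() call per character.
    return all(31 < ord(c) < 128 for c in s)

def in_range_by_char(s):
    # Compares characters directly against the precomputed endpoints.
    return all(LOW < c < HIGH for c in s)
```

Whether this actually saves anything measurable is something to check
with timeit on real data rather than assume.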

> You could "timeit" and see, or you could wait a bit and not
> optimise prematurely.

But this is sage advice regardless of the algorithm :)
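
If you do want numbers, something along these lines would be a
starting point.  It is only a sketch: via_set and via_all are
hypothetical names, the input is the contrived worst case from above
(bad character first, shortened so it runs quickly), and Python 3 is
assumed:

```python
import string
import timeit

PRINTABLE = frozenset(string.printable)
s = "\x00" + "a" * 100000   # offending character right at the start

def via_set(text):
    # Builds a set from the whole string before comparing.
    return set(text) <= PRINTABLE

def via_all(text):
    # Stops at the first character that is not in PRINTABLE.
    return all(c in PRINTABLE for c in text)

t_set = timeit.timeit(lambda: via_set(s), number=20)
t_all = timeit.timeit(lambda: via_all(s), number=20)
print("set: %.6fs  all: %.6fs" % (t_set, t_all))
```

On input that is entirely printable the picture may well reverse, since
the set version does its looping in C, so it is worth timing against
data that looks like yours.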

