Checking strings for "bad" characters
peter at engcorp.com
Tue Aug 27 23:15:14 EDT 2002
Harvey Thomas wrote:
> I've got some very long Unicode strings which I wish to test for the presence of ASCII characters 0-8 and 14-31. My first thought was to use regular expressions, e.g.:
> import re
> r = re.compile(u'[%s%s]' % (u''.join([unichr(x) for x in range(0, 9)]), u''.join([unichr(x) for x in range(14, 32)])))
> amatch = r.search(s)  # s stands for the Unicode string under test; the original searched r itself
> if amatch:
>     print "Bad characters"
> else:
>     print "OK"
> but is there a better or faster method?
If you could use string.maketrans and .translate() to convert all bad characters
that might be present into a single code (e.g. \x00), and then do a simple
.find() for that character, you might get the benefits of simplicity and extreme speed.
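
As a rough illustration of the translate-then-find idea, here is a minimal sketch. It is written for Python 3, which postdates this thread, and it uses a dict-based table with str.translate() rather than string.maketrans, since in the Python 2 of the time string.maketrans built tables for byte strings only. The names BAD_CODES, SENTINEL, BAD_TABLE and has_bad_chars are made up for the example, and no speed comparison against the regular expression has been done here.

```python
# Sketch only (Python 3); all names below are illustrative assumptions.
BAD_CODES = list(range(0, 9)) + list(range(14, 32))  # code points 0-8 and 14-31
SENTINEL = "\x00"  # itself one of the bad characters, so reusing it is safe

# str.translate() accepts a dict keyed by code point; every bad
# character is mapped onto the single sentinel character.
BAD_TABLE = dict.fromkeys(BAD_CODES, SENTINEL)


def has_bad_chars(text):
    """Return True if text contains any code point in 0-8 or 14-31."""
    return text.translate(BAD_TABLE).find(SENTINEL) != -1


if __name__ == "__main__":
    for sample in ("plain text\n", "bell\x07inside"):
        print(repr(sample), "Bad characters" if has_bad_chars(sample) else "OK")
```

Tab, newline and carriage return (9, 10 and 13) are outside both ranges, so ordinary text passes the check.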