C's isprint() concept?
Michael P. Reilly
arcege at shore.net
Sun Aug 15 21:15:40 EDT 1999
Jeff Pinyan <jeffp at crusoe.net> wrote:
:> If I want to replace all non-printable characters in a string with a single
:> space, what would be the best way? Do I need to loop over the entire string
:> character by character checking the ord() value of each one? Anyone have
:> a sane way to do this with regular expressions?
: In Perl, I could do it this way:
: $string =~ tr/\x00-\x1f\x80-\xff//d;
: What that means is this:
: remove the following characters from $string:
: characters whose ASCII value is from \x00 (0) to \x1f (31)
: characters whose ASCII value is from \x80 (128) to \xff (255)
: This can be done, almost as painlessly, in Python.
Considering this was asked on a Python newsgroup, how about showing
how to do it in Python?
    import string
    # works for ASCII
    control_chars = string._idmap[:ord(' ')]    # 0 to 31
    high_chars = string._idmap[ord('~')+1:]     # 127 to 255
    to_remove = control_chars + high_chars
    # map every character in to_remove to a space
    table = string.maketrans(to_remove, ' ' * len(to_remove))
    midstr = string.translate(instr, table)
    # collapse runs of whitespace into single spaces
    outstr = string.join(string.split(midstr))
or:
    import re
    outstr = re.sub(r'[^ -~]+', ' ', instr)
(Indented for "easy cut&paste" ;)
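
For anyone reading this on Python 3, where string.maketrans, string.translate, string.join and string.split are no longer module-level functions, here is a rough sketch of both approaches. The names clean_translate, clean_regex, NON_PRINTABLE and TABLE are made up for illustration and are not from the post above. It assumes "printable" means ASCII ' ' through '~', as above. The translate table only covers code points 0 to 255, so characters beyond that pass through it untouched, while the regex treats them as non-printable.

```python
import re

# Assumption: "printable" means ASCII ' ' (32) through '~' (126).
# Everything else in 0-255 is treated as non-printable.
NON_PRINTABLE = [c for c in map(chr, range(256)) if not (' ' <= c <= '~')]

# str.maketrans with a dict maps each listed character to a space.
TABLE = str.maketrans({c: ' ' for c in NON_PRINTABLE})


def clean_translate(instr):
    """Hypothetical helper: blank out non-printables, then collapse whitespace."""
    midstr = instr.translate(TABLE)
    # split() with no arguments also collapses runs of ordinary spaces
    # and strips leading/trailing whitespace.
    return ' '.join(midstr.split())


def clean_regex(instr):
    """Hypothetical helper: replace each run of non-printables with one space."""
    return re.sub(r'[^ -~]+', ' ', instr)


if __name__ == '__main__':
    sample = 'ab\x00\x01cd  ef\xff'
    print(repr(clean_translate(sample)))  # 'ab cd ef'
    print(repr(clean_regex(sample)))      # 'ab cd  ef '
```

Note that the two are not quite equivalent: the translate version also squeezes existing runs of spaces and trims the ends, while the regex version leaves printable spaces alone and only touches the non-printable runs.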
-Arcege