SUMMARY: C's isprint() concept

Jeff Blaine jblaine at shore.net
Tue Aug 17 15:39:30 CEST 1999


Thanks to those of you who replied with help or attempted help and no
thanks to those of you who replied and were snippy.

Method 1:  Loop over every character in the line and check its ord() value

Method 2:  Use a regexp (re.sub)

Method 3:  Use string.maketrans and string.translate (the Python
           distribution's documentation on these functions needs work,
           if you ask me).

Results:   Method 3 was fastest by far.  Method 1 was second fastest.
           Method 2 (the regexp) was slowest.
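
For anyone who wants to re-run the comparison, here is a rough sketch in
present-day Python 3 (not the Python of this post). The function names, the
sample text, and the use of timeit are my own assumptions, not part of the
original tests, and the ranking on a current interpreter may not match the
results reported above.

```python
import re
import timeit

# Hypothetical sample input: printable text mixed with control and high characters
SAMPLE = "plain text \x00\x07 with \x1b control \x9f characters\n" * 200


def clean_loop(line):
    """Method 1: test each character's ord() value."""
    good = []
    for character in line:
        if 32 <= ord(character) <= 126:
            good.append(character)
        else:
            good.append(' ')
    return ''.join(good)


# One-for-one replacement (no trailing +) so all three functions agree
NOT_PRINTABLE = re.compile(r'[^ -~]')


def clean_regex(line):
    """Method 2: substitute with a compiled regular expression."""
    return NOT_PRINTABLE.sub(' ', line)


# Map every code point below 256 that is outside 32..126 to a space
TABLE = {code: ' ' for code in range(256) if not 32 <= code <= 126}


def clean_translate(line):
    """Method 3: a single str.translate call with a prebuilt table."""
    return line.translate(TABLE)


def time_all(number=20):
    """Return {function name: seconds} for `number` runs over SAMPLE."""
    return {
        func.__name__: timeit.timeit(lambda: func(SAMPLE), number=number)
        for func in (clean_loop, clean_regex, clean_translate)
    }


if __name__ == "__main__":
    for name, seconds in sorted(time_all().items(), key=lambda item: item[1]):
        print(f"{name}: {seconds:.4f}s")
```

The three functions are written to return identical output, so the timings
compare like with like.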

#-----------------------------------------------------------------------------
# Snippet from Method 1 - assumes an open fd
#-----------------------------------------------------------------------------
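import string

for line in fd.readlines():
    good = []
    for character in line:
        ordval = ord(character)
        # Printable ASCII runs from 32 (space) through 126 (~)
        if ordval < 32 or ordval > 126:
            good.append(' ')
        else:
            good.append(character)
    # Reassemble the cleaned characters into a single string
    new_line = string.join(good, '')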

#-----------------------------------------------------------------------------
# Snippet from Method 2 - assumes an open fd
#-----------------------------------------------------------------------------
import re

# Match characters outside the printable range ' ' (32) through '~' (126)
NOTPRINTABLE = r'[^ -~]+'
npre = re.compile(NOTPRINTABLE)
for line in fd.readlines():
    new_line = npre.sub(' ', line)
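
One thing worth noting about the pattern above: because of the trailing +,
a run of several non-printable characters is replaced by a single space,
whereas Methods 1 and 3 replace each character one for one. A small sketch
in present-day Python 3 (the variable names are only for illustration):

```python
import re

run_pattern = re.compile(r'[^ -~]+')   # the pattern from the snippet above
char_pattern = re.compile(r'[^ -~]')   # same character class, without the +

text = "ab\x00\x01\x02cd"

collapsed = run_pattern.sub(' ', text)     # one space for the whole run
one_for_one = char_pattern.sub(' ', text)  # one space per character

print(repr(collapsed))    # 'ab cd'
print(repr(one_for_one))  # 'ab   cd'
```

Dropping the + keeps the line length unchanged, which matters if column
positions in the output need to line up with the input.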

#-----------------------------------------------------------------------------
# Snippet from Method 3 - assumes an open fd
#-----------------------------------------------------------------------------
import string

# Build the translation table once; it maps the listed characters to spaces
ttable = string.maketrans('\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8A\x8B\x8C\x8D\x8E\x8F\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9A\x9B\x9C\x9D\x9E\x9F', '                                                                ')
for line in fd.readlines():
    new_line = string.translate(line, ttable)
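
Reading the table above closely, it lists 0x00-0x1F and 0x80-0x9F only, so
DEL (0x7F) and 0xA0-0xFF pass through untouched, unlike the ord() check in
Method 1. Below is a sketch of building the table programmatically so both
agree. It assumes present-day Python 3 and bytes.maketrans rather than the
string module functions used in this post, and the names are illustrative.

```python
# Bytes the Method 3 table above replaces: 0x00-0x1F and 0x80-0x9F
listed = bytes(range(0x00, 0x20)) + bytes(range(0x80, 0xA0))
table_as_posted = bytes.maketrans(listed, b' ' * len(listed))

# Bytes Method 1 replaces: everything outside 32..126
non_printable = bytes(code for code in range(256) if not 32 <= code <= 126)
table_full = bytes.maketrans(non_printable, b' ' * len(non_printable))

sample = b'caf\xe9 \x7f ok\x00'

print(sample.translate(table_as_posted))  # 0xE9 and 0x7F survive
print(sample.translate(table_full))       # both become spaces
```

Generating the table from the same 32..126 test also avoids having to keep
two long literal strings the same length by hand.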



