Most direct way to strip unprintable characters out of a string?

George Sakkis gsakkis at rutgers.edu
Sun Sep 25 09:16:09 CEST 2005


"Steve Bergman" <steve at rueb.com> wrote:

> George Sakkis wrote:
>
> >
> >
> >If by straightforward you mean one-liner, there is:
> >''.join(c for c in input_string if c in string.printable)
> >
> >If you care about performance though, string.translate is faster; as always, the best way to
> >decide on a performance issue is to profile the alternatives on your data and see if it's worth
> >going for the fastest one at the expense of readability.
> >
> >
> >
> Thank you for the reply.  I was really thinking of some function in the
> standard library like:
>
> s = stripUnprintable(s)
>
> When I learned php, I more or less took the route of using whatever I
> found that 'worked'.  In learning Python, I'm trying to take my time and
> learn the 'right' (that's pronounced 'beautiful') way of doing things.
>
> As it stands, I've stashed the string.translate code in a short function
> with a comment explaining what it does and how.  I mainly didn't want to
> use that if there was some trivial built-in that everyone else uses.

No, there's no stripUnprintable in a standard module AFAIK, and that's a good thing; if every
little function that one might ever want made it into the standard library, the language would be
overwhelming.

Make sure you calculate the unprintable characters only the first time the function is called, not
on every call. Here's a way to encapsulate this in the function itself, without polluting the global
namespace with allchars and delchars:

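import string

def stripUnprintable(input_string):
    # Python 2 only: relies on string.maketrans and two-argument str.translate.
    try:
        filterUnprintable = stripUnprintable.filter
    except AttributeError:  # only the first time it is called
        allchars = string.maketrans('', '')  # identity table of all 256 chars
        delchars = allchars.translate(allchars, string.printable)  # the unprintable ones
        # cache the filter as a function attribute so the tables are built once
        filterUnprintable = stripUnprintable.filter = (
            lambda s: s.translate(allchars, delchars))
    return filterUnprintable(input_string)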
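
For anyone reading this on Python 3, where string.maketrans no longer exists, here is a rough
equivalent. It is only a sketch: the names strip_unprintable_bytes, strip_unprintable and the
module-level constants are made up for illustration, and "printable" is assumed to mean exactly
the characters in string.printable. The bytes version is the closest analogue of the translate
trick; the str version falls back to the one-liner.

```python
import string

# Hypothetical helper names; "printable" is assumed to mean string.printable.
_PRINTABLE = frozenset(string.printable)

# Every byte value whose character is not printable, computed once at import time.
_DELETE_BYTES = bytes(b for b in range(256) if chr(b) not in _PRINTABLE)


def strip_unprintable_bytes(data):
    """Return a copy of the bytes object with unprintable bytes removed."""
    # table=None means "no remapping"; the second argument lists bytes to delete.
    return data.translate(None, _DELETE_BYTES)


def strip_unprintable(text):
    """Return a copy of the str with characters outside string.printable removed."""
    return ''.join(c for c in text if c in _PRINTABLE)
```

Computing the constants at module level does the same job as the function-attribute cache: the
table is built once rather than on every call.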

George




