[Python-ideas] New explicit methods to trim strings

Paul Moore p.f.moore at gmail.com
Tue Apr 2 07:23:15 EDT 2019


On Tue, 2 Apr 2019 at 12:07, Rhodri James <rhodri at kynesim.co.uk> wrote:
> So far we have two slightly dubious use-cases.
>
> 1. Stripping file extensions.  Personally I find that treating filenames
> like filenames (i.e. using os.path or (nowadays) pathlib) results in me
> thinking more appropriately about what I'm doing.

I'd go further and say that filename manipulation is a great example
of a place where generic string functions should definitely *not* be
used.
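
As a rough illustration of why (the filenames here are made up, and
this is only a sketch, not a recommendation of any particular API
beyond what pathlib already offers), compare a generic string method
with pathlib on the same name:

```python
from pathlib import PurePosixPath

filename = "report.txt"  # made-up example name

# str.rstrip() treats its argument as a set of characters, not as a
# suffix, so it also removes the final "t" of "report".
naive = filename.rstrip(".txt")

# pathlib treats the extension as a unit.
path = PurePosixPath(filename)
stem = path.stem      # name without the final extension
suffix = path.suffix  # the final extension, including the dot

# A leading dot marks a hidden file, not an extension, and pathlib
# knows that; generic string code would have to special-case it.
hidden = PurePosixPath(".bashrc")

print(naive, stem, suffix, repr(hidden.suffix))
```

The point is less the rstrip() trap itself than that pathlib encodes
rules about filenames that a generic affix-stripping method never
would.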

> 2. Stripping prefixes and suffixes to get to root words.  Python has
> been used for natural language work for over a decade, and I don't think
> I've heard any great call from linguists for the functionality.  English
> isn't a girl who puts out like that on a first date :-)  There are too
> many common exception cases for such a straightforward approach not to
> cause confusion.

Agreed, using prefix/suffix stripping on natural language is at best a
"quick hack". For robust usage, one of the natural language processing
packages from PyPI is likely a far better fit. But "quick hacks" using
the stdlib are not an unrealistic use case, so I don't think we should
completely discount this. It's certainly not *compelling*, though.
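
To make the "quick hack" point concrete, here is a small sketch. The
strip_suffix() helper is hypothetical (it is not an existing str
method, just my stand-in for the kind of thing being proposed) and the
word list is made up:

```python
def strip_suffix(word, suffix):
    """Hypothetical helper: drop `suffix` from the end of `word` if present."""
    if suffix and word.endswith(suffix):
        return word[:-len(suffix)]
    return word

words = ["walking", "running", "making", "king"]
roots = [strip_suffix(w, "ing") for w in words]

# Only the first result is a real root word; the others show the
# exception cases (doubled consonant, dropped "e", not a suffix at all).
print(roots)
```

That is fine for a throwaway script where you know your data, but it
shows why a dedicated NLP package is the better fit for anything
robust.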

> 3. My most common use case (not very common at that) is for stripping
> annoying prompts off text-based APIs.  I'm happy using .startswith() and
> string slicing for that, though your point about the repeated use of the
> string to be stripped off (or worse, hard-coding its length) is well made.
>
> I am beginning to worry slightly that actually there are usually more
> appropriate things to do than simply cutting off affixes, and that in
> providing these particular batteries we might be encouraging poor practice.
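
For reference, the .startswith() and slicing idiom from point 3 looks
roughly like this. The prompt string and the strip_prompt() name are
assumptions of mine for illustration, not anything from a real API:

```python
PROMPT = ">>> "  # assumed prompt text, purely for illustration

def strip_prompt(line, prompt=PROMPT):
    """Hypothetical helper: remove a leading prompt from `line` if present."""
    # The prompt is named twice (once to test, once to measure), which
    # is the repetition complained about above. Writing line[4:] instead
    # would avoid that, but breaks silently if the prompt ever changes.
    if line.startswith(prompt):
        return line[len(prompt):]
    return line

stripped = strip_prompt(">>> print(1)")
untouched = strip_prompt("plain output")
print(stripped, untouched)
```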

It would be really helpful if someone could go through the various use
cases presented in this thread and classify them - filename
manipulation, natural language uses, and "other". We could then focus
on the "other" category to get a better feel for what use cases might
act as a good argument for the feature. To me, it's starting to feel
like a proposal that looks deceptively valuable: it's a "natural", or
"obvious", addition to make, and plenty of people can think of cases
where they "might find it useful", but many of those cases turn out
not to be as good a fit for the feature as they seem at first glance.
The people in favour of the proposal could dispel that impression, and
help make their case, by giving a clearer summary of the expected use
cases.

Paul

