[Python-ideas] Give regex operations more sugar

Steven D'Aprano steve at pearwood.info
Thu Jun 14 02:08:00 EDT 2018


On Wed, Jun 13, 2018 at 10:43:43PM +0200, Michel Desmoulin wrote:

> str.replace comes to mind. It's annoying to have to chain it 5 times
> while we could optionally pass a tuple.

It's not so simple. "Multiple replacements" underspecifies the behaviour.
The simplest behaviour is to have

    astring.replace((spam, eggs, cheese), new)

be simply syntactic sugar for:

    astring.replace(spam, new).replace(eggs, new).replace(cheese, new)

which is nice and simple to explain and nice and simple to implement 
(it's just a loop calling the method for each argument in the tuple), 
but it's probably not the most useful solution:

    # replace any of "salad", "cheese" or "ham" with "cheesecake".
    s = "Lunch course are cheese & coffee, salad & cream, or ham & peas"
    s.replace("salad", "cheesecake").replace("cheese", "cheesecake").replace("ham", "cheesecake")

    => 'Lunch course are cheesecake & coffee, cheesecakecake & cream, or cheesecake & peas'

which is highly unlikely to be what anyone wants. But it isn't clear 
what people *will* want.
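
To make the problem concrete, here is a minimal sketch of the "loop calling
the method" version. The name replace_each is made up for illustration;
nothing like it exists on str.

```python
def replace_each(text, targets, new):
    # Hypothetical helper: the naive "sugar" version, one full pass per target.
    for target in targets:
        text = text.replace(target, new)
    return text


s = "Lunch course are cheese & coffee, salad & cream, or ham & peas"

# Same targets, different order.
a = replace_each(s, ("salad", "cheese", "ham"), "cheesecake")
b = replace_each(s, ("cheese", "salad", "ham"), "cheesecake")
print(a)
print(b)
```

With "salad" first, the later "cheese" pass rescans the text that was just
inserted and produces "cheesecakecake"; with "cheese" first it happens not
to. So the result depends on the order of the targets.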

So we need to decide what replace with multiple targets actually means. 
Here are some suggestions:

- the order of targets ought to be irrelevant: replace((a, b), ...)
  and replace((b, a), ...) ought to mean the same thing;

- should targets match longest first or shortest first? or a flag
  to choose which you want?

- what if you have multiple targets and you need to give some longer
  ones priority, and some shorter ones?

- there ought to be a single pass through the string, not multiple
  passes -- this is not just syntactic sugar for calling replace in 
  a loop!

- the replacement string should be skipped and not scanned.
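
For comparison, here is one possible sketch of those semantics using the re
module. It is only an illustration, not a proposal for how str.replace
should work: replace_any is a hypothetical name, and it hard-codes
"longest target first", which is just one of the choices listed above. It
does nothing for the case where some shorter targets need priority over
longer ones.

```python
import re


def replace_any(text, targets, new):
    # Hypothetical helper: replace any of `targets` with `new` in one pass.
    # Assumption: longest target wins, so sort by length, longest first.
    # Empty targets are dropped because they would match at every position.
    ordered = sorted({t for t in targets if t}, key=len, reverse=True)
    if not ordered:
        return text
    # re.escape makes each target match literally, not as a pattern.
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    # A function as the replacement avoids backslash processing in `new`.
    # re.sub resumes scanning after each match, so `new` is never rescanned.
    return pattern.sub(lambda match: new, text)


s = "Lunch course are cheese & coffee, salad & cream, or ham & peas"
print(replace_any(s, ("salad", "cheese", "ham"), "cheesecake"))
print(replace_any("cheesecake and cheese", ("cheese", "cheesecake"), "X"))
```

Because the targets are sorted before the pattern is built, the order in
which the caller passes them makes no difference.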



-- 
Steve

