[New-bugs-announce] [issue41080] re.sub treats * incorrectly?
Ryan Westlund
report at bugs.python.org
Mon Jun 22 13:28:11 EDT 2020
New submission from Ryan Westlund <rlwestlund at gmail.com>:
```
>>> re.sub('a*', '-', 'a')
'--'
>>> re.sub('a*', '-', 'aa')
'--'
>>> re.sub('a*', '-', 'aaa')
'--'
```
Shouldn't this return one dash, not two, since the greedy quantifier matches all the a's? I understand why substituting with the pattern 'b*' returns '-a-', but shouldn't the cases above constitute only one match? In Python 2.7, it behaves as I expect:
```
>>> re.sub('a*', '-', 'a')
'-'
>>> re.sub('a*', '-', 'aa')
'-'
>>> re.sub('a*', '-', 'aaa')
'-'
```
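One way to see what is being replaced is to list the matches directly. This is only a sketch and assumes Python 3.7 or later, where `re.sub` also replaces an empty match that sits right after a non-empty one:

```python
import re

# Collect the (start, end) span of every match of 'a*' in 'aaa'.
spans = [m.span() for m in re.finditer('a*', 'aaa')]
print(spans)  # on Python 3.7+: [(0, 3), (3, 3)]

# re.sub replaces each match once, so two matches produce two dashes.
result = re.sub('a*', '-', 'aaa')
print(result)  # on Python 3.7+: '--'
```

The second span is an empty match at the end of the string, which would account for the extra dash.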
The original case that led me to this was trying to normalize a path to end in exactly one slash. I used `re.sub('/*$', '/', path)`, but a path ending in one or more slashes came out ending in two.
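For the path case, two possible workarounds are sketched below. The helper names are hypothetical and only for illustration; the first avoids regular expressions entirely, and the second limits `re.sub` to a single replacement with `count=1`:

```python
import re

def ensure_one_trailing_slash(path):
    # Hypothetical helper: strip every trailing slash, then add exactly one.
    return path.rstrip('/') + '/'

def ensure_one_trailing_slash_re(path):
    # Hypothetical helper: count=1 stops after the first match, so the
    # trailing empty match at the end of the string is never replaced.
    return re.sub('/*$', '/', path, count=1)

for p in ['foo', 'foo/', 'foo///']:
    print(ensure_one_trailing_slash(p), ensure_one_trailing_slash_re(p))
```

Both helpers should give `foo/` for each of these inputs.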
----------
components: Regular Expressions
messages: 372104
nosy: Yujiri, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.sub treats * incorrectly?
type: behavior
versions: Python 3.10, Python 3.7, Python 3.8
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41080>
_______________________________________