[New-bugs-announce] [issue3231] re.compile fails with some bytes patterns

Antoine Pitrou report at bugs.python.org
Sun Jun 29 01:11:55 CEST 2008


New submission from Antoine Pitrou <pitrou at free.fr>:

Some patterns can be compiled in str form but not in bytes form. This
was overlooked because the test suite wasn't correctly adapted for py3k:

>>> re.compile('[\\1]')
<_sre.SRE_Pattern object at 0xb7be1410>
>>> re.compile('\\09')
<_sre.SRE_Pattern object at 0xb7c4f2f0>
>>> re.compile('\\n')
<_sre.SRE_Pattern object at 0xb7be1f50>

but:

>>> re.compile(b'[\\1]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 240, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_compile.py", line 497, in compile
    p = sre_parse.parse(p, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 685, in parse
    p = _parse_sub(source, pattern, 0)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 320, in _parse_sub
    itemsappend(_parse(source, state))
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 409, in _parse
    this = sourceget()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 215, in get
    self.__next()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 204, in __next
    char = char + c
TypeError: Can't convert 'int' object to str implicitly
>>> re.compile(b'\\09')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 240, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_compile.py", line 497, in compile
    p = sre_parse.parse(p, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 678, in parse
    source = Tokenizer(str)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 187, in __init__
    self.__next()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 204, in __next
    char = char + c
TypeError: Can't convert 'int' object to str implicitly
>>> re.compile(b'\\n')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 240, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_compile.py", line 497, in compile
    p = sre_parse.parse(p, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 678, in parse
    source = Tokenizer(str)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 187, in __init__
    self.__next()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 204, in __next
    char = char + c
TypeError: Can't convert 'int' object to str implicitly
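
Judging only from the last frame of each traceback (char = char + c), one
plausible cause is that the tokenizer fetches the character after a
backslash by indexing the pattern. In Python 3, indexing a str gives a
one-character str, but indexing a bytes object gives an int, so the
concatenation would fail for bytes patterns only. The sketch below
reproduces that mechanism in isolation. read_escape and read_escape_fixed
are hypothetical helpers written for illustration, not the code in
sre_parse.py, and the chr() conversion is just one possible fix:

```python
def read_escape(pattern, index=0):
    """Hypothetical stand-in for the tokenizer step; not the sre_parse code.

    Returns the token starting at `index`, joining a backslash with the
    character that follows it.
    """
    char = pattern[index:index + 1]      # slicing keeps the input type
    if char and isinstance(char, bytes):
        char = chr(char[0])              # assumption: tokens are handled as str
    if char == "\\":
        c = pattern[index + 1]           # str -> 1-char str, bytes -> int
        char = char + c                  # str + int raises TypeError
    return char


def read_escape_fixed(pattern, index=0):
    """Same step, converting the indexed byte to str before concatenating."""
    char = pattern[index:index + 1]
    if char and isinstance(char, bytes):
        char = chr(char[0])
    if char == "\\":
        c = pattern[index + 1]
        if isinstance(c, int):           # bytes input: turn the int into str
            c = chr(c)
        char = char + c
    return char


if __name__ == "__main__":
    print(read_escape("\\n"))            # str pattern works
    try:
        read_escape(b"\\n")
    except TypeError as exc:
        print("bytes pattern fails:", exc)
    print(read_escape_fixed(b"\\n"))     # bytes pattern works with the conversion
```

The exact wording of the TypeError differs between Python versions, so the
sketch prints whatever message the running interpreter produces.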

----------
components: Regular Expressions
messages: 68925
nosy: pitrou
severity: normal
status: open
title: re.compile fails with some bytes patterns
type: behavior
versions: Python 3.0

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3231>
_______________________________________

