[Python-bugs-list] [ python-Bugs-405894 ] sre fails on r'[\w-]' patterns

nobody nobody@sourceforge.net
Sun, 04 Mar 2001 14:57:04 -0800


Bugs #405894, was updated on 2001-03-04 14:57
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=405894&group_id=5470

Category: Regular Expressions
Group: Platform-specific
Status: Open
Priority: 5
Submitted By: Nobody/Anonymous
Assigned to: Nobody/Anonymous
Summary: sre fails on r'[\w-]' patterns

Initial Comment:
While trying to match email addresses I used a pattern from
"Python Essential Reference" (middle of page 115):
   import re
   addrStr = 'peterb@selresearch.net'
   print re.search(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)+)', addrStr)

I believe importing re brings in sre by default these days.
On the same machine, explicitly importing pre and calling
pre.search(...) instead works ;-).
Explicitly importing sre fails in the same way as just
importing re: the search returns None.

The machine I'm on is:
  SunOS burn 5.8 Generic_108528-02 sun4u sparc SUNW,Ultra-1
with Python 1.6 compiled with gcc 2.95.2, with no special
attention given to the build.

I have isolated the failure to the character class r'[\w-]',
and the test suite doesn't test for this case.
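
A minimal sketch of a regression check for the case described above,
written for a current Python 3 re module rather than the 1.6-era
re/sre/pre modules in this report. The names ADDR_PATTERN,
class_matches and address_span are made up for illustration; they
appear nowhere in the report or in the Python test suite.

```python
import re

# Pattern quoted in the report (from "Python Essential Reference").
# ADDR_PATTERN is a hypothetical name used only in this sketch.
ADDR_PATTERN = r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)+)'


def class_matches(text):
    # True if the isolated class [\w-] matches anywhere in text.
    # The trailing '-' inside the brackets is meant as a literal hyphen.
    return re.search(r'[\w-]', text) is not None


def address_span(text):
    # Span of the first address-like match, or None if nothing matches.
    m = re.search(ADDR_PATTERN, text)
    return m.span() if m else None


if __name__ == '__main__':
    print(class_matches('-'))
    print(address_span('peterb@selresearch.net'))
```

Checking the bare character class separately from the full address
pattern keeps the failing construct isolated, so a failure points at
the class itself rather than at the rest of the expression.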

Thought you should know -

Thanks for all the great software

;;peter

- - - - here is a run on my machine showing the behavior - - - -

burn <14:54:55># python
Python 1.6 (#1, Feb 28 2001, 12:56:49)  [GCC 2.95.2 19991024 (release)] on sunos5
Copyright (c) 1995-2000 Corporation for National Research Initiatives.
All Rights Reserved.
Copyright (c) 1991-1995 Stichting Mathematisch Centrum, Amsterdam.
All Rights Reserved.
>>> import re
>>> addrStr = 'peterb@selresearch.net'
>>> print re.search(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)+)', addrStr)
None
>>> 
>>> import pre
>>> print pre.search(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)+)', addrStr)
<pre.MatchObject instance at 1c1058>
>>> print pre.search(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)+)', addrStr).span()
(0, 22)
>>> 
>>> import sre
>>> print sre.search(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)+)', addrStr)
None
>>> 
>>> ^D
burn <14:56:58># 

- - - - end of run - - - -



