[Python-checkins] r87905 - in python/branches/release31-maint: Doc/howto/regex.rst Misc/NEWS
terry.reedy
python-checkins at python.org
Mon Jan 10 22:49:11 CET 2011
Author: terry.reedy
Date: Mon Jan 10 22:49:11 2011
New Revision: 87905
Log:
Issue #10875: Update Regular Expression HOWTO; patch by 'SilentGhost'.
Modified:
python/branches/release31-maint/Doc/howto/regex.rst
python/branches/release31-maint/Misc/NEWS
Modified: python/branches/release31-maint/Doc/howto/regex.rst
==============================================================================
--- python/branches/release31-maint/Doc/howto/regex.rst (original)
+++ python/branches/release31-maint/Doc/howto/regex.rst Mon Jan 10 22:49:11 2011
@@ -5,7 +5,6 @@
****************************
:Author: A.M. Kuchling <amk at amk.ca>
-:Release: 0.05
.. TODO:
Document lookbehind assertions
@@ -24,11 +23,6 @@
Introduction
============
-The :mod:`re` module was added in Python 1.5, and provides Perl-style regular
-expression patterns. Earlier versions of Python came with the :mod:`regex`
-module, which provided Emacs-style patterns. The :mod:`regex` module was
-removed completely in Python 2.5.
-
Regular expressions (called REs, or regexes, or regex patterns) are essentially
a tiny, highly specialized programming language embedded inside Python and made
available through the :mod:`re` module. Using this little language, you specify
@@ -264,7 +258,7 @@
>>> import re
>>> p = re.compile('ab*')
>>> p
- <_sre.SRE_Pattern object at 80b4150>
+ <_sre.SRE_Pattern object at 0x...>
:func:`re.compile` also accepts an optional *flags* argument, used to enable
various special features and syntax variations. We'll go over the available
@@ -362,8 +356,8 @@
and more.
You can learn about this by interactively experimenting with the :mod:`re`
-module. If you have Tkinter available, you may also want to look at
-:file:`Tools/scripts/redemo.py`, a demonstration program included with the
+module. If you have :mod:`tkinter` available, you may also want to look at
+:file:`Tools/demo/redemo.py`, a demonstration program included with the
Python distribution. It allows you to enter REs and strings, and displays
whether the RE matches or fails. :file:`redemo.py` can be quite useful when
trying to debug a complicated RE. Phil Schwartz's `Kodos
@@ -373,11 +367,10 @@
This HOWTO uses the standard Python interpreter for its examples. First, run the
Python interpreter, import the :mod:`re` module, and compile a RE::
- Python 2.2.2 (#1, Feb 10 2003, 12:57:01)
>>> import re
>>> p = re.compile('[a-z]+')
>>> p
- <_sre.SRE_Pattern object at 80c3c28>
+ <_sre.SRE_Pattern object at 0x...>
Now, you can try matching various strings against the RE ``[a-z]+``. An empty
string shouldn't match at all, since ``+`` means 'one or more repetitions'.
@@ -395,7 +388,7 @@
>>> m = p.match('tempo')
>>> m
- <_sre.SRE_Match object at 80c4f68>
+ <_sre.SRE_Match object at 0x...>
Now you can query the :class:`MatchObject` for information about the matching
string. :class:`MatchObject` instances also have several methods and
@@ -434,7 +427,7 @@
>>> print(p.match('::: message'))
None
>>> m = p.search('::: message') ; print(m)
- <re.MatchObject instance at 80c9650>
+ <_sre.SRE_Match object at 0x...>
>>> m.group()
'message'
>>> m.span()
@@ -459,11 +452,11 @@
:meth:`findall` has to create the entire list before it can be returned as the
result. The :meth:`finditer` method returns a sequence of :class:`MatchObject`
-instances as an :term:`iterator`. [#]_ ::
+instances as an :term:`iterator`::
>>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
>>> iterator
- <callable-iterator object at 0x401833ac>
+ <callable_iterator object at 0x...>
>>> for match in iterator:
... print(match.span())
...
@@ -485,7 +478,7 @@
>>> print(re.match(r'From\s+', 'Fromage amk'))
None
>>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998')
- <re.MatchObject instance at 80c5978>
+ <_sre.SRE_Match object at 0x...>
Under the hood, these functions simply create a pattern object for you
and call the appropriate method on it. They also store the compiled object in a
@@ -687,7 +680,7 @@
line, the RE to use is ``^From``. ::
>>> print(re.search('^From', 'From Here to Eternity'))
- <re.MatchObject instance at 80c1520>
+ <_sre.SRE_Match object at 0x...>
>>> print(re.search('^From', 'Reciting From Memory'))
None
@@ -699,11 +692,11 @@
or any location followed by a newline character. ::
>>> print(re.search('}$', '{block}'))
- <re.MatchObject instance at 80adfa8>
+ <_sre.SRE_Match object at 0x...>
>>> print(re.search('}$', '{block} '))
None
>>> print(re.search('}$', '{block}\n'))
- <re.MatchObject instance at 80adfa8>
+ <_sre.SRE_Match object at 0x...>
To match a literal ``'$'``, use ``\$`` or enclose it inside a character class,
as in ``[$]``.
@@ -728,7 +721,7 @@
>>> p = re.compile(r'\bclass\b')
>>> print(p.search('no class at all'))
- <re.MatchObject instance at 80c8f28>
+ <_sre.SRE_Match object at 0x...>
>>> print(p.search('the declassified algorithm'))
None
>>> print(p.search('one subclass is'))
@@ -746,7 +739,7 @@
>>> print(p.search('no class at all'))
None
>>> print(p.search('\b' + 'class' + '\b') )
- <re.MatchObject instance at 80c3ee0>
+ <_sre.SRE_Match object at 0x...>
Second, inside a character class, where there's no use for this assertion,
``\b`` represents the backspace character, for compatibility with Python's
@@ -1316,8 +1309,8 @@
be *very* complicated. Use an HTML or XML parser module for such tasks.)
-Not Using re.VERBOSE
---------------------
+Using re.VERBOSE
+----------------
By now you've probably noticed that regular expressions are a very compact
notation, but they're not terribly readable. REs of moderate complexity can
@@ -1366,8 +1359,3 @@
now-removed :mod:`regex` module, which won't help you much.) Consider checking
it out from your library.
-
-.. rubric:: Footnotes
-
-.. [#] Introduced in Python 2.2.2.
-
Modified: python/branches/release31-maint/Misc/NEWS
==============================================================================
--- python/branches/release31-maint/Misc/NEWS (original)
+++ python/branches/release31-maint/Misc/NEWS Mon Jan 10 22:49:11 2011
@@ -31,6 +31,8 @@
Library
-------
+- Issue #10875: Update Regular Expression HOWTO; patch by 'SilentGhost'.
+
 - Issue #10869: Fixed bug where ast.increment_lineno modified the root
   node twice.
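
The patch replaces hard-coded object reprs with ``0x...`` placeholders because the class names and addresses printed by the interpreter vary between versions and runs. One way to check the HOWTO's examples without depending on those reprs is to compare match results instead. The following is a minimal sketch, assuming any Python 3 interpreter; the variable names are illustrative and are not part of the patch:

```python
import re

# Pattern from the HOWTO's interactive example.
p = re.compile('[a-z]+')

# '+' means one or more repetitions, so the empty string does not match.
empty_result = p.match('')

# A successful match returns a match object; inspect it rather than its repr.
m = p.match('tempo')
matched_text = m.group()
matched_span = m.span()

# \b is a word boundary: 'class' matches only as a whole word.
word = re.compile(r'\bclass\b')
found = word.search('no class at all')
not_found = word.search('the declassified algorithm')

# finditer() yields match objects lazily instead of building a list.
digits = re.compile(r'\d+')
spans = [match.span()
         for match in digits.finditer('12 drummers drumming, 11 ... 10 ...')]

# Type names are printed only for information; they differ across versions.
print(type(p).__name__, type(m).__name__)
print(matched_text, matched_span, spans)
```

Comparing ``group()`` and ``span()`` values keeps such checks stable across releases, which is the same concern the ``0x...`` placeholders address in the documentation.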