Limits on search length
John Machin
sjmachin at lexicon.net
Mon Oct 1 22:20:43 EDT 2007
On Oct 2, 3:16 am, Daryl Lee <d... at altaregos.com> wrote:
> I am trying to locate all lines in a suite of files with quoted strings of
> particular lengths. A search pattern like r'".{15}"' finds 15-character
> strings very nicely. But I have some very long ones, and a pattern like
> r'".{272}"' fails miserably, even though I know I have at least one
> 272-character string.
>
> In the short term, I can resort to locating the character positions of the
> quotes, but this seemed like such an elegant solution I hate to see it not
> work. The program is given below (sans imports), in case someone can spot
> something I'm overlooking:
>
> # Example usage: search.py *.txt \".{15}\"
>
> filePattern = sys.argv[1]
> searchPattern = sys.argv[2]
1. Learn an elementary debugging technique called "print the input".
print "pattern is", repr(searchPattern)
2. Fix your regular expression:
>>> import re
>>> patt = r'".{15}"'
>>> patt
'".{15}"'
>>> rx = re.compile(patt)
>>> o = rx.search('"123456789012345"'); o
<_sre.SRE_Match object at 0x00B96918>
>>> o.group()
'"123456789012345"'
>>> o = rx.search('"1234567" "12345"'); o
<_sre.SRE_Match object at 0x00B96950>
>>> o.group()
'"1234567" "12345"' ########## whoops ##########
>>>
>>> patt = r'"[^"]{15}"' # exclude quotes; a non-greedy ? has no effect on an exact {15}
>>> rx = re.compile(patt)
>>> o = rx.search('"123456789012345"'); o
<_sre.SRE_Match object at 0x00B96918>
>>> o.group()
'"123456789012345"'
>>> o = rx.search('"1234567" "12345"'); o
>>> o.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'NoneType' object has no attribute 'group'
3. Try building scripts from small TESTED parts e.g. in this case
write a function to find all quoted strings of length n inside a given
string. If you do that, you will KNOW there is no limit that stops you
finding a string of length 272, and you can then look for your error
elsewhere.
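A minimal sketch of such a function, for illustration only. The names find_quoted, sample, short_hits, long_body and long_hits are hypothetical and do not come from the original script; the pattern is the [^"] version from point 2, with the length filled in.

```python
import re

def find_quoted(text, n):
    """Return every double-quoted string in text whose body is exactly n characters."""
    # [^"] stops the match from running across a closing quote,
    # which is what '.' did in the "whoops" example above.
    pattern = re.compile(r'"[^"]{%d}"' % n)
    return pattern.findall(text)

# Small case first: one 15-character string, two shorter ones.
sample = 'x = "123456789012345"; y = "1234567" "12345"'
short_hits = find_quoted(sample, 15)

# Then the length from the original question.
long_body = "a" * 272
long_hits = find_quoted('msg = "%s"' % long_body, 272)
```

Here short_hits holds only the 15-character string and long_hits holds the single 272-character one, so the regex engine is not what limits the search. Note that this sketch still knows nothing about which quote opens and which closes: text lying between two separate quoted strings can match if it happens to be n characters long.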
HTH,
John