regex problem with re and fnmatch
Fabian Braennstroem
f.braennstroem at gmx.de
Wed Nov 21 16:05:16 EST 2007
Hi John,
John Machin wrote on 11/20/2007 09:40 PM:
> On Nov 21, 8:05 am, Fabian Braennstroem <f.braennstr... at gmx.de> wrote:
>> Hi,
>>
>> I would like to use re to search for lines in a file with
>> the word "README_x.org", where x is any number.
>> E.g. the structure would look like this:
>> [[file:~/pfm_v99/README_1.org]]
>>
>> I tried to use these kinds of patterns:
>> # org_files='.*README\_1.org]]'
>> org_files='.*README\_*.org]]'
>> if re.match(org_files,line):
>
> First tip is to drop the leading '.*' and use search() instead of
> match(). The second tip is to use raw strings always for your
> patterns.
>
>> Unfortunately, it matches all entries with "README.org", but
>> not the wanted number!?
>
> \_* matches 0 or more occurrences of _ (the \ is redundant). You need
> to specify one or more digits -- use \d+ or [0-9]+
>
> The . in .org matches ANY character except a newline. You need to
> escape it with a \.
>
> >>> pat = r'README_\d+\.org'
> >>> re.search(pat, 'xxxxREADME.org')
> >>> re.search(pat, 'xxxxREADME_.org')
> >>> re.search(pat, 'xxxxREADME_1.org')
> <_sre.SRE_Match object at 0x00B899C0>
> >>> re.search(pat, 'xxxxREADME_9999.org')
> <_sre.SRE_Match object at 0x00B899F8>
> >>> re.search(pat, 'xxxxREADME_9999Zorg')
> >>>
Thanks a lot, works really nice!
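For the archive, here is a minimal sketch of how the pattern could be applied to the lines of a file. The helper name find_readme_links and the sample lines are only assumptions for illustration, and it is written for Python 3:

```python
import re

# Pattern from John's reply: "README_", one or more digits, a literal ".org".
README_PAT = re.compile(r'README_\d+\.org')

def find_readme_links(lines):
    """Return the README_<number>.org names found in an iterable of lines."""
    found = []
    for line in lines:
        m = README_PAT.search(line)  # search() scans the whole line
        if m:
            found.append(m.group(0))
    return found

# Hypothetical sample lines in the format from the original question.
sample = [
    "[[file:~/pfm_v99/README_1.org]]",
    "[[file:~/pfm_v99/README.org]]",
    "[[file:~/pfm_v99/README_42.org]]",
]
print(find_readme_links(sample))  # ['README_1.org', 'README_42.org']
```

The same function should work on an open file object, since iterating over a file yields its lines.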
>> After some splitting and replacing I am able to check, if
>> the above file exists. If it does not, I start to search for
>> it using the 'walk' procedure:
>
> I presume that you mean something like: """.. check if the above file
> exists in some directory. If it does not, I start to search for it
> somewhere else ..."""
>
>> for root, dirs, files in os.walk("/home/fab/org"):
>
>>     for name in dirs:
>>         dirs=os.path.join(root, name) + '/'
>
> The above looks rather suspicious ...
> for thing in container:
>     container = something_else
> ????
> What are you trying to do?
>
>
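One way to avoid rebinding the loop containers is to give the joined paths their own names. This is only a sketch of that idea for Python 3; the function name list_tree and the two result lists are assumptions made for the example, not something from the original script:

```python
import os

def list_tree(top):
    """Collect the full paths of all directories and files below top."""
    dir_paths, file_paths = [], []
    for root, dirs, files in os.walk(top):
        for name in dirs:
            # The joined path gets a new home; 'dirs' itself stays untouched.
            dir_paths.append(os.path.join(root, name))
        for name in files:
            file_paths.append(os.path.join(root, name))
    return dir_paths, file_paths
```

Leaving dirs alone also keeps the option of pruning it in place later, which is how os.walk is told to skip subdirectories.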
>>     for name in files:
>>         files=os.path.join(root, name)
>
> and again ....
>
>>         if fnmatch.fnmatch(str(files), "README*"):
>
> Why str(name) ?
>
>>             print "File Found"
>>             print str(files)
>>             break
>
>
> fnmatch is not as capable as re; in particular it can't express "one
> or more digits". To search a directory tree for the first file whose
> name matches a pattern, you need something like this:
> def find_one(top, pat):
>     for root, dirs, files in os.walk(top):
>         for fname in files:
>             if re.match(pat + '$', fname):
>                 return os.path.join(root, fname)
>
>
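A self-contained variant of find_one for Python 3 might look like the sketch below. Note that it swaps re.match(pat + '$', ...) for re.fullmatch (available from Python 3.4), which is my substitution and not what John wrote. The two lines at the end illustrate the fnmatch limitation with a made-up file name:

```python
import fnmatch
import os
import re

def find_one(top, pat):
    """Return the path of the first file under top whose whole name matches pat, else None."""
    regex = re.compile(pat)
    for root, dirs, files in os.walk(top):
        for fname in files:
            # fullmatch() requires the entire file name to match the pattern.
            if regex.fullmatch(fname):
                return os.path.join(root, fname)
    return None

# The closest glob, "one digit, then anything", still accepts non-digits.
print(fnmatch.fnmatch("README_1abc.org", "README_[0-9]*.org"))  # True
print(re.fullmatch(r"README_\d+\.org", "README_1abc.org"))      # None
```

Which match is "first" depends on the order in which os.walk visits the tree, so this is only suitable when any matching file will do.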
>> As soon as it finds the file,
>
> "the" file or "a" file???
>
> Ummm ... aren't you trying to locate a file whose EXACT name you found
> in the first exercise??
>
> def find_it(top, required):
>     for root, dirs, files in os.walk(top):
>         if required in files:
>             return os.path.join(root, required)
Great :-) Thanks a lot for your help... it can be so easy :-)
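For anyone finding this thread later, here is a rough sketch of how the two steps could be combined in Python 3: pull the path out of the link, check whether it exists, and otherwise search a directory tree for the exact file name. The names LINK_PAT and locate_linked_file, and the assumption that links always look like [[file:...]], are mine and only for illustration:

```python
import os
import re

# Assumed link format: [[file:<path ending in README_<number>.org>]]
LINK_PAT = re.compile(r'\[\[file:(.*README_\d+\.org)\]\]')

def find_it(top, required):
    """Return the path of the file named exactly `required` under top, else None."""
    for root, dirs, files in os.walk(top):
        if required in files:
            return os.path.join(root, required)
    return None

def locate_linked_file(line, search_top):
    """Resolve a link line to an existing path, searching search_top as a fallback."""
    m = LINK_PAT.search(line)
    if m is None:
        return None  # the line holds no README_<number>.org link
    path = os.path.expanduser(m.group(1))  # expand a leading "~"
    if os.path.exists(path):
        return path
    # The linked location is gone: look for the same file name elsewhere.
    return find_it(search_top, os.path.basename(path))
```

A call such as locate_linked_file(line, "/home/fab/org") would then return either a usable path or None.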
Fabian