[Tutor] recursion in script with re module

McCarney, James Alexander James.Alexander.McCarney@Cognicase.com
Thu, 10 Oct 2002 19:55:52 -0400


Ok list,
Thank you for your hints and answers... Now I have the text file displaying
on my screen, but what I want -- as Magnus helpfully pointed out -- is a
regular expression. So I have imported re.

I read the docs -- or tried to! ;-) -- and I looked at the regular expression
how-to -- and I am starting to get Perl hives. ;-)

Anyway, now that the file is opened in the script... I tried to implement
Magnus's suggestions. What I want is everything between the < > signs. 

So re.findall(r"(<[^>]+?/>)",alllines) should work. But it doesn't, because
alllines is a list (readlines() returns a list of lines) instead of a string.
Can I pop everything out of a list at once? Or--as I seem to recall from the
tutorial--is it a one-by-one thing?
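
On the list-versus-string question, here is a minimal sketch of both routes:
gluing the list into one string "all at once", or searching "one by one". The
sample list is made up to stand in for what readlines() returns, and the names
alltext, tags_at_once and tags_one_by_one are only illustrative. The pattern is
the one quoted above, unchanged.

```python
import re

# Hypothetical sample standing in for what rtfile.readlines() returns.
alllines = ['<doc>\n', '  <item id="1"/>text\n', '  <br/><hr/>\n', '</doc>\n']

pattern = r"(<[^>]+?/>)"  # the pattern from the post, unchanged

# Option 1: "all at once" -- join the list into a single string first.
alltext = "".join(alllines)
tags_at_once = re.findall(pattern, alltext)

# Option 2: "one by one" -- search each line and collect the results.
tags_one_by_one = []
for line in alllines:
    tags_one_by_one.extend(re.findall(pattern, line))
```

Both routes give the same list for this sample. They can differ when a tag is
split across two lines: [^>] also matches a newline, so only the joined string
lets the pattern see such a tag whole.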

Here is the error:

Traceback (most recent call last):
  File "<pyshell#134>", line 1, in ?
    tagfinger()
  File "C:\Documents and Settings\jamccarn\Desktop\tagfinger.py", line 26, in tagfinger
    out_tags=re.findall(r"(<[^>]+?/>)",alllines)
  File "C:\Python22\lib\sre.py", line 166, in findall
    return _compile(pattern, 0).findall(string)
TypeError: expected string or buffer

Here is the code:

def tagfinger():
    """Finds mark-up, and prints it to a text file."""
    #imports
    import sys, string, os, os.path, glob, re
    #welcome
    print "Welcome to TagFinger!"
    print ""
    path_in=raw_input("Please type the pathname where you wish to search: ")
    print "The pathname you have chosen is: ",path_in
    print ""
    from os import chdir
    chdir(path_in)
    print "You are now in path: ",path_in
    print "We found the following...",glob.glob("*")
    list1=glob.glob("*")
    print ""
    answer=raw_input("Which file's tags do you wish to extract? ")
    print "You chose",answer
    if answer not in list1:
        print "Warning:",answer,"is not in the list above."
    print "...Opening...",answer
    rtfile=file(answer, 'r')
    print '...Reading...',answer
    alllines=rtfile.readlines()
    print alllines
    out_tags=re.findall(r"(<[^>]+?/>)",alllines)
    print out_tags
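
One possible way to get past the TypeError, offered as a sketch rather than the
definitive fix: read the file with read(), which returns a single string,
instead of readlines(), which returns a list. The helper names find_tags and
find_tags_in_file are hypothetical, as is the sample text. The pattern is also
an assumption: the one in the post ends in /> and so only matches self-closing
tags, while <[^>]+> matches anything between < and >.

```python
import re

def find_tags(text):
    """Return every <...> tag found in text (hypothetical helper)."""
    # '<', then one or more characters that are not '>', then '>'.
    return re.findall(r"<[^>]+>", text)

def find_tags_in_file(filename):
    """Read the whole file as one string and return its tags."""
    rtfile = open(filename, "r")
    try:
        # read() returns one string; readlines() would return a list.
        alltext = rtfile.read()
    finally:
        rtfile.close()
    return find_tags(alltext)

# Made-up sample text, used here in place of a real file.
sample = '<doc>\n<item id="1"/>text<b>bold</b>\n</doc>\n'
found = find_tags(sample)
```

In the script, out_tags=find_tags_in_file(answer) would then take the place of
the readlines() and findall() lines.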