Hello everyone,<br><br>I am new to this mailing list as well as to Python (uptime: 3 weeks). Today I learnt about REs from <a href="http://www.amk.ca/python/howto/regex/" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
http://www.amk.ca/python/howto/regex/</a>. This one was really helpful. I started working through a few examples on my own. The first one was to collect all the HTML tags used in an HTML file. I wrote this code.<br><br>------------------------------
<br>
import re<br>file1=open(raw_input("\nEnter The path of the HTML file: "),"r")<br>ans=""<br>while 1:<br>    data=file1.readline()<br>    if data=="":<br>        break<br>    ans=ans+data
<br> <br>ans1=re.sub(r' .*?',">",ans) # to make tags such as <link rel..> to <link>rel<br>match=re.findall(r'<[^/]?[a-zA-Z]+.*?>',ans1)<br>print match<br clear="all">---------------------------------
<br><br>I get the output, but with tags repeated. I want to display every tag used in the file, without repetitions. For example, the output I got for one of my HTML files was: "<html><link><span style="font-weight: bold;">
<a><br><a><br></span>"<br><br>Instead of writing a new 'for, if' loop to filter the repeated tags out of the list, is there something I can add to the RE itself so that each tag is matched only once?
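<br><br>To make the goal concrete, here is a minimal sketch of the result I am after, assuming the deduplication happens after matching rather than inside the RE: it captures only the tag names and passes them through a set. The names TAG_NAME and unique_tags and the sample string are made up for illustration, and the output is sorted rather than kept in document order.

```python
import re

# Hypothetical pattern name. The group captures only the tag name, so
# attributes do not need to be stripped first; closing tags such as </a>
# are skipped because '<' must be followed by a letter.
TAG_NAME = re.compile(r'<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>')

def unique_tags(html):
    """Return the distinct opening-tag names found in html, sorted."""
    names = TAG_NAME.findall(html)
    # Lowercase so that <A> and <a> count as the same tag; the set
    # discards the repeats.
    return sorted(set(name.lower() for name in names))

# Made-up sample input for illustration only.
sample = '<html><link rel="stylesheet"><a href="x">one</a><a href="y">two</a></html>'
print(unique_tags(sample))
```

This still deduplicates outside the RE, so it only shows the desired output, not an answer to whether the pattern itself can do it.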
<br><br>Thank You<br>-- <br>Intercodes<br>