How to print out html tags excluding the attributes
Michael F. Stemper
michael.stemper at gmail.com
Sun Jul 21 16:41:45 EDT 2019
On 20/07/2019 20.04, sum abiut wrote:
> I want to use regular expression to print out the HTML tags excluding the
> attributes.
>
> for example:
>
> import re
> html = '<h1>Hi</h1><p>test <span class="time">test</span></p>'
> tags = re.findall(r'<[^>]+>', html)
> for a in tags:
>     print(a)
>
>
> The output is:
>
> <h1>
> </h1>
> <p>
> <span class="time">
> </span>
> </p>
>
> But I just want the tag, not the attributes
Try this:
for a in tags:
    a = re.sub( " .*>", ">", a )
    print(a)
(The two statements could be combined.)
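For what it's worth, here is a rough sketch of the combined form, together with one possible alternative that captures only the tag name in the first pass. The function names strip_attrs and bare_tags are made up for illustration and are not from the original post. Both versions assume simple, well-formed markup; a regular expression like this will be fooled by a ">" inside an attribute value.

```python
import re

html = '<h1>Hi</h1><p>test <span class="time">test</span></p>'


def strip_attrs(markup):
    """Find every tag, then cut everything from the first space to the '>'."""
    # Same two regexes as above, with the sub() and the result combined.
    return [re.sub(r" .*>", ">", tag) for tag in re.findall(r"<[^>]+>", markup)]


def bare_tags(markup):
    """Alternative: capture only the (optionally '/'-prefixed) tag name."""
    # The group grabs e.g. 'span' or '/span'; [^>]* skips any attributes.
    return ["<" + name + ">" for name in re.findall(r"<(/?\w+)[^>]*>", markup)]


for tag in strip_attrs(html):
    print(tag)
```

On the sample string, both functions give <h1>, </h1>, <p>, <span>, </span>, </p>, in that order.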
--
Michael F. Stemper
Galatians 3:28