regexp

johnzenger at gmail.com
Tue Dec 19 16:15:17 EST 2006


You want re.sub("(?s)<!--.*?-->", "", htmldata)

Explanation:  By default the dot matches any character except a
newline.  To make it match newlines too, you need to set the DOTALL
flag.  You can set the flag inside the pattern itself using the (?s)
syntax, which is explained in section 4.2.1 of the Python Library
Reference.
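
To see the difference the flag makes, here is a minimal sketch. The
sample string is made up for illustration; only the two patterns come
from this thread.

```python
import re

# Hypothetical sample input: one HTML comment spread over three lines.
htmldata = "before<!--start\nof\ncomment, end-->after"

# Without DOTALL, '.' stops at the first newline, so nothing matches
# and the string comes back unchanged.
unchanged = re.sub("<!--.*?-->", "", htmldata)

# With the inline (?s) flag, '.' also matches '\n', so the whole
# multi-line comment is removed.
stripped = re.sub("(?s)<!--.*?-->", "", htmldata)

print(unchanged == htmldata)
print(stripped)
```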

A more readable way to do this is:

obj = re.compile("<!--.*?-->", re.DOTALL)
obj.sub("", htmldata)
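
A slightly fuller sketch of the compiled-pattern approach, which also
shows why the non-greedy ".*?" matters once DOTALL is on. The names
comment_re and greedy_re and the sample string are assumptions made
for this example. The flags= keyword on re.sub only exists in newer
Python versions (2.7 and 3.1 onward), so treat that line as optional.

```python
import re

# Hypothetical sample input with two comments, the first one multi-line.
htmldata = "a<!--one\nline-->b<!--two-->c"

# Non-greedy: each comment is matched and removed separately.
comment_re = re.compile("<!--.*?-->", re.DOTALL)
cleaned = comment_re.sub("", htmldata)

# Greedy: with DOTALL the match runs from the first "<!--" to the
# last "-->", so the text between the two comments is lost as well.
greedy_re = re.compile("<!--.*-->", re.DOTALL)
greedy = greedy_re.sub("", htmldata)

# Newer Pythons accept the flag directly. It must be passed by keyword,
# because the fourth positional argument of re.sub is count.
same = re.sub("<!--.*?-->", "", htmldata, flags=re.DOTALL)

print(cleaned)
print(greedy)
```

The same greedy-versus-non-greedy concern applies to the
"<script.*>.*</script>" pattern quoted below.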


On Dec 19, 3:59 pm, vertigo <s... at spam.pl> wrote:
> Hello
>
> > On Tuesday 19 December 2006 13:15, vertigo wrote:
> >> Hello
>
> >> I need to use some regular expressions for more than one line.
> >> And i would like to use some modificators like: /m or /s in perl.
> >> For example:
> >> re.sub("<script.*>.*</script>","",data)
>
> >> will not cut out all javascript code if it's spread on many lines.
> >> I could use something like /s from perl which treats . as all signs
> >> (including new line). How can i do that ?
>
> >> Maybe there is other way to achieve the same results ?
>
> >> Thanx
>
> > Take a look at Chapter 8 of 'Dive Into Python.'
> >http://diveintopython.org/toc/index.html
>
> I read the whole regexp chapter, but there was no solution for my problem.
> Example:
>
> re.sub("<!--.*-->","",htmldata)
> would remove only comments which are in one line.
> If comment is in many lines like this:
> <!--start
> of
> commend, end-->
>
> it would not work. It's because '.' sign does not matches '\n' sign.
>
> Does anybody knows solution for this particular problem ?
> 
> Thanx



