regexp question

Jon Clements joncle at googlemail.com
Fri Nov 6 17:06:26 EST 2009


On Nov 6, 9:50 pm, Jabba Laci <jabba.l... at gmail.com> wrote:
> Hi,
>
> How to find all occurrences of a substring in a string? I want to
> convert the following Perl code to Python.
>
> Thanks,
>
> Laszlo
>
> ==========
>
> my $text = '<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>';
>
> while ($text =~ m#href="?(.*?)"?>#g)
> {
>    print $1, "\n";
> }
>
> # output:
> #
> # ad1
> # ad2
> # ad3

There are numerous threads on why using regexps to process HTML is not
a great idea. Search Google Groups.
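
For reference, a literal translation of the Perl loop might look like the
sketch below. It reuses the pattern from the question with re.findall; the
variable names are mine. It gives the same three values for this sample,
but it is still a regexp over HTML, so treat it as fragile.

```python
import re

text = '<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>'

# Same pattern as the Perl m#href="?(.*?)"?>#g. With a single capture
# group, findall returns a list of that group's text for every
# non-overlapping match, which replaces the while/$1 loop.
hrefs = re.findall(r'href="?(.*?)"?>', text)

for href in hrefs:
    print(href)
```

re.finditer works the same way if you need the match objects rather than
just the captured strings.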

You're better off using BeautifulSoup (an HTML parsing library). The
API is simple, and for real-world data, which is often malformed, it is
a much better choice than a regexp.
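
To show the parser-based idea without a third-party install, here is a
minimal sketch that uses the standard library's html.parser.HTMLParser
instead of BeautifulSoup. This is a swapped-in stand-in, not the
BeautifulSoup API, and the HrefCollector class name is made up for the
example.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # dealt with quoted and unquoted attribute values.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


text = '<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>'

collector = HrefCollector()
collector.feed(text)
collector.close()

for href in collector.hrefs:
    print(href)
```

The parser decides where a tag and its attributes begin and end, so the
code never has to guess at quoting the way the regexp does.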

hth
Jon.


