[Tutor] find data in html file

Ed Singleton singletoned at gmail.com
Wed Sep 28 10:25:53 CEST 2005


On 27/09/05, lmac <lopoff at gmx.net> wrote:
> Hi there,
> I have a basic question. If I want to read some data out of a line
> in an HTML file, and I know the start tag and the end tag, how do I
> recognize whether it spans more than one line?
>
> Example:
>
> <td>Some text<a href>link</a>text ..... DATA ....</tr></td> etc.
>
> I would use >text as the starting tag to locate the beginning of the DATA,
> and then </tr> as the ending tag of the DATA. But if there is a \n, then
> the DATA spans more than one line.

Hopefully it's just a typo or something, but you appear to have your
ending </tr> and </td> tags the wrong way round.

You should be closing the cell before you close the row.

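How do you want to get the data out?  This case is simple enough that
you could use a lazy (non-greedy) regex for it.  Something
like "<td>([\s\S]+?)</td>" would do it.  The [\s\S] class matches any
character, newlines included, so it makes no difference whether the
data sits on one line or several.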
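
For illustration, here is a minimal sketch of that regex approach in
Python.  The sample HTML string and the variable names are made up for
the example, and it assumes the cell is closed with </td> as it should
be.  It is only meant for simple, predictable markup like this.

```python
import re

# Hypothetical sample input: the cell content spans several lines.
html = """<tr><td>Some text<a href="#">link</a>text
DATA line one
DATA line two</td></tr>"""

# [\s\S] matches any character, including newlines.
# +? is lazy (non-greedy), so the match stops at the first </td>.
cell_re = re.compile(r"<td>([\s\S]+?)</td>")

match = cell_re.search(html)
cell = match.group(1) if match else None
print(cell)
```

An equivalent pattern would be "<td>(.+?)</td>" compiled with the
re.DOTALL flag, which lets "." match newlines as well.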

Ed

