[Tutor] splice a string object based on embedded html tag...
Isr Gish
isrgish at fusemail.com
Fri Jan 30 14:35:43 EST 2004
Stella Rockford wrote:
>there is a lot of data I have no use for and it slows the parser down
>I have studied the HTML code and found that the info I need
>is, naturally, nested in a table with a unique id for CSS
>
>I am assuming that removing everything but this table
>before it gets parsed will allow sgmllib to function faster...
>
>I would like to SPLICE everything before and after this table off of
>the file object
>this would be the first operation on the object, but when I looked up
>string's methods
>I couldn't quite find what i am looking for to do this.
>
> indexing and splicing of a string seems only responds to integers
I don't know HTML, but to search for the ID you would do something like this:
>>> import string
>>> ID = '<1234>' # substring to look for in the big string
>>> str = 'This is first part of string. <1234> This is after the ID.'
>>> indx = string.find(str, ID) # index of the first occurrence of ID in str (-1 if not found)
>>> newstr = str[indx:] # slice from indx to the end, so it starts with the ID itself
>>> print newstr
<1234> This is after the ID.
If you want it without the ID, add the length of ID to indx, like this:
>>> newindx = indx + len(ID)
>>> print str[newindx:]
 This is after the ID.
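For the table case, the same idea could be sketched roughly as below. This is only a sketch, written in present-day Python 3 syntax (string methods instead of the string module). It assumes the table you want has no other tables nested inside it, and the function name slice_between and the id "results" are made up for illustration.

```python
def slice_between(text, start_marker, end_marker):
    """Return text from start_marker through end_marker, or None if either is missing."""
    start = text.find(start_marker)  # find() returns -1 when the marker is absent
    if start == -1:
        return None
    # Look for the end marker only after the start marker.
    end = text.find(end_marker, start + len(start_marker))
    if end == -1:
        return None
    # Include the end marker itself in the returned slice.
    return text[start:end + len(end_marker)]


# Hypothetical page; the id "results" is an assumption, not taken from the real site.
page = ('<html><body><p>junk</p>'
        '<table id="results"><tr><td>42</td></tr></table>'
        '<p>more junk</p></body></html>')

table = slice_between(page, '<table id="results"', '</table>')
print(table)
```

Checking for -1 matters here: slicing with an index of -1 would not fail, it would quietly return the last character of the string. The trimmed string can then be handed to the parser in place of the whole page.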
Good luck
Isr