[Tutor] splice a string object based on embedded html tag...

Isr Gish isrgish at fusemail.com
Fri Jan 30 14:35:43 EST 2004


Stella Rockford wrote:

    >there is a lot of data I have no use for and it slows the parser down
    >I have studied the HTML code and found that the info I need
    >is, naturally, nested in a table with a unique id for CSS
    >
    >I am assuming that removing everything but this table
    >before it gets parsed will allow sgmllib to function faster...
    >
    >I would like to SPLICE everything before and after this table off of
    >the file object
    >this would be the first operation on the object, but when I looked up
    >string's methods
    >I couldn't quite find what I am looking for to do this.
    >
    >indexing and splicing of a string seems to respond only to integers

I don't know HTML, but to search for the ID you would do something like this:
>>> ID = '<1234>'  # substring to look for in the big string
>>> text = 'This is first part of string. <1234> This is after the ID.'
>>> indx = text.find(ID)  # index of the first occurrence of ID, or -1 if it is not found
>>> newstr = text[indx:]  # slice from indx to the end of the string
>>> print(newstr)
<1234> This is after the ID.

If you want it without the ID, add the length of ID to indx, like this:
>>> newindx = indx + len(ID)
>>> print(text[newindx:])
 This is after the ID.
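
For the table case in the question, the same idea can be applied twice: find where the opening tag starts, then find the closing tag after it, and slice between the two. The following is only a rough sketch under some assumptions: the helper name extract_between, the sample page, and the id "results" are all made up for illustration, and it assumes the wanted table contains no nested tables.

```python
def extract_between(text, start_marker, end_marker):
    """Return the slice of text from start_marker through end_marker.

    Returns None if either marker is missing. (Hypothetical helper,
    not part of any library.)
    """
    start = text.find(start_marker)
    if start == -1:          # find() returns -1 when nothing matches
        return None
    # Search for the end marker only after the start marker.
    end = text.find(end_marker, start)
    if end == -1:
        return None
    # Include the end marker itself in the result.
    return text[start:end + len(end_marker)]


# Made-up page; the real id would come from the HTML being parsed.
page = ('<html><body><p>junk</p>'
        '<table id="results"><tr><td>42</td></tr></table>'
        '<p>more junk</p></body></html>')

table = extract_between(page, '<table id="results"', '</table>')
print(table)
```

If the table does contain other tables, the first '</table>' found belongs to an inner one and the slice will be cut short, so in that case a real parser is the safer choice.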

Good luck
Isr



