[Tutor] find second occurrence of string in line
Peter Otten
__peter__ at web.de
Wed Sep 9 09:55:24 CEST 2015
richard kappler wrote:
> On Tue, Sep 8, 2015 at 1:40 PM, Peter Otten <__peter__ at web.de> wrote:
>> I'm inferring from the above that you do not really want the "second"
>> timestamp in the line -- there is no line left intact anyway ;) -- but
>> rather the one in the <general>...</general> part.
>>
>> Here's a way to get these (all of them) with lxml:
>>
>> import lxml.etree
>>
>> tree = lxml.etree.parse("example.xml")
>> print tree.xpath("//objectdata/general/timestamp/text()")
> No no, I'm not sure how I can be much more clear, that is one (1) line of
> xml that I provided, not many, and I really do want what I said in the
> very beginning, the second instance of <timestamp> for each of those
> lines.
It looks like I was not clear enough: XML doesn't have the concept of
lines. When you process XML "by line", you have buggy code.
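
To illustrate the point, here is a minimal sketch. It assumes a made-up
sample document (only the objectdata/general/timestamp names come from this
thread; the <root> wrapper and the timestamp values are hypothetical), and it
uses the standard library's xml.etree.ElementTree instead of lxml so that it
runs without installing anything. The same document is written once on a
single line and once with one tag per line; a parser finds the same
timestamp in both.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the <root> wrapper and the values are invented.
one_line = ("<root><objectdata><timestamp>2015-09-08T10:00:00</timestamp>"
            "<general><timestamp>2015-09-08T10:00:05</timestamp></general>"
            "</objectdata></root>")

# The same document as another tool might write it: one tag per line.
pretty = one_line.replace("><", ">\n<")


def general_timestamps(xml_text):
    """Return the text of every objectdata/general/timestamp element."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//objectdata/general/timestamp")]


# The number of physical lines differs, the parsed result does not.
print(len(one_line.splitlines()), general_timestamps(one_line))
print(len(pretty.splitlines()), general_timestamps(pretty))
```

A line-based search for "the second <timestamp> in the line" would only work
for the first form, because in the second form no line contains two
timestamps.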
> Got it figured out with guidance from Alan's response though:
>
> #!/usr/bin/env python
>
> with open("example.xml", 'r') as f:
>     for line in f:
>         if "objectdata" in line:
>             if "<timestamp>" in line:
>                 x = "<timestamp>"
>                 a = "</timestamp>"
>                 first = line.index(x)
>                 second = line.index(x, first+1)
>                 b = line.index(a)
>                 c = line.index(a, b+1)
>                 y = second + 11
>                 timestamp = line[y:c]
>                 print timestamp
Just for fun, take five minutes to install lxml and compare the output of
the two scripts. If it's the same now, there's no harm in switching to lxml,
and you make future failures less likely.
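
One way such a comparison could look is sketched below. This is not either
of the original scripts: by_line() is a condensed variant of the line-based
approach, by_parser() uses the standard library's xml.etree.ElementTree as a
stand-in for lxml, and the sample file content and the helper names
(by_line, by_parser, compare) are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

TAG_OPEN, TAG_CLOSE = "<timestamp>", "</timestamp>"


def by_line(path):
    """Second <timestamp> of every line that mentions "objectdata"."""
    result = []
    with open(path) as f:
        for line in f:
            if "objectdata" in line and line.count(TAG_OPEN) >= 2:
                first = line.index(TAG_OPEN)
                start = line.index(TAG_OPEN, first + 1) + len(TAG_OPEN)
                end = line.index(TAG_CLOSE, start)
                result.append(line[start:end])
    return result


def by_parser(path):
    """Text of every objectdata/general/timestamp element."""
    root = ET.parse(path).getroot()
    # ElementTree's ".//" only matches elements below the root,
    # so this assumes objectdata is not itself the root element.
    return [el.text for el in root.findall(".//objectdata/general/timestamp")]


def compare(path):
    """Return (results_match, line_based_result, parser_result)."""
    from_lines, from_parser = by_line(path), by_parser(path)
    return from_lines == from_parser, from_lines, from_parser


# Hypothetical one-line sample written to a temporary file for the demo.
sample = ("<root><objectdata><timestamp>2015-09-08T10:00:00</timestamp>"
          "<general><timestamp>2015-09-08T10:00:05</timestamp></general>"
          "</objectdata></root>\n")
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as tmp:
    tmp.write(sample)
same, from_lines, from_parser = compare(tmp.name)
os.remove(tmp.name)
print(same, from_lines, from_parser)
```

Running compare() on the real file instead of the temporary one shows
whether the two approaches agree on today's data.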