[Tutor] Fwd: find second occurance of string in line

Peter Otten __peter__ at web.de
Wed Sep 9 15:53:33 CEST 2015


Albert-Jan Roskam wrote:

>> To: tutor at python.org
>> From: __peter__ at web.de
>> Date: Tue, 8 Sep 2015 21:37:07 +0200
>> Subject: Re: [Tutor] Fwd:  find second occurance of string in line
>> 
>> Albert-Jan Roskam wrote:
>> 
>> >> import lxml.etree
>> >>
>> >> tree = lxml.etree.parse("example.xml")
>> >> print tree.xpath("//objectdata/general/timestamp/text()")
>> > 
>> > Nice. I do need to try lxml some time. Is the "text()" part xpath as
>> > well?
>>  
>> Yes. I think ElementTree supports a subset of XPath.
> 
> Aha, I see. I studied lxml.de a bit last night and it seems to be better
> in many ways. Writing appears to be much faster than cElementTree while
> many other situations are comparable to cElementTree. I love the objectify
> part of the package. Would you say that (given iterparse) lxml is also the
> module to process giant (i.e. larger than RAM) XML files?

Yes; but that would not be backed by experience ;)
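
For what it's worth, here is a minimal sketch of the incremental approach, not something I have measured on a really big file. It uses iterparse() from the standard library's xml.etree.ElementTree; lxml.etree offers an iterparse() with a similar interface. The element names are taken from the XPath earlier in the thread; the sample data and the function name iter_timestamps are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real file would be opened from disk instead.
SAMPLE = (
    b"<root>"
    b"<objectdata><general><timestamp>2015-09-08T21:37:07</timestamp></general></objectdata>"
    b"<objectdata><general><timestamp>2015-09-09T15:53:33</timestamp></general></objectdata>"
    b"</root>"
)


def iter_timestamps(source):
    """Yield one timestamp per <objectdata> record while the file is parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "objectdata":
            # The record is complete at its "end" event, so it can be read now.
            yield elem.findtext("general/timestamp")
            # Drop the record's children and text so they can be freed.
            elem.clear()


for stamp in iter_timestamps(io.BytesIO(SAMPLE)):
    print(stamp)
```

Note that the cleared (now empty) objectdata elements still hang off the root element, so for a truly giant file some further pruning may be needed.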
 
> The webpage recommends a cascade of try-except ImportError statements:
> first lxml, then cElementTree, then ElementTree. But given that there are
> slight API differences, is that really a good idea?

I tend to shy away from such complications, just as I write Python-3-only 
code now.
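
If you do want the fallback, a sketch of a shortened cascade for Python 3 could look like the following. It assumes you stick to the part of the API that both modules share: findall() with a simple path works in both, while xpath() and "text()" are lxml only. cElementTree is left out because xml.etree.ElementTree uses the C accelerator automatically since Python 3.3. The sample document and the function name timestamps are invented for the example.

```python
try:
    from lxml import etree  # preferred when it is installed
except ImportError:
    import xml.etree.ElementTree as etree  # standard library fallback

# Hypothetical sample document using the element names from this thread.
SAMPLE = (
    "<root><objectdata><general>"
    "<timestamp>2015-09-08T21:37:07</timestamp>"
    "</general></objectdata></root>"
)


def timestamps(xml_text):
    """Return all timestamp texts using only calls both modules provide."""
    root = etree.fromstring(xml_text)
    # Read .text on the elements instead of asking XPath for text() nodes.
    return [el.text for el in root.findall(".//objectdata/general/timestamp")]


print(timestamps(SAMPLE))
```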

> How would you test
> whether the code runs under both lxml and under the alternatives? Would
> you uninstall lxml to force that one of the alternatives is used?

Again, I don't have much experience with code that must cope with different 
environments.

I have simulated a missing module (not lxml) with the following trick:

$ mkdir missing_lxml
$ echo 'raise ImportError' > missing_lxml/lxml.py
$ python3 -c 'import lxml'
$ PYTHONPATH=missing_lxml python3 -c 'import lxml'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/peter/missing_lxml/lxml.py", line 1, in <module>
    raise ImportError
ImportError

Those who regularly need different configurations probably use virtualenv, 
or virtual machines when the differences are not limited to Python.
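
A different technique from the shadowing trick above, which works from inside a test, is to put None into sys.modules: Python then raises ImportError when that name is imported. The sketch below assumes the fallback is wrapped in a function; load_etree and load_etree_without_lxml are hypothetical names, not part of lxml or the standard library.

```python
import importlib
import sys
from unittest import mock


def load_etree():
    """Return lxml.etree if it can be imported, else the stdlib ElementTree."""
    try:
        return importlib.import_module("lxml.etree")
    except ImportError:
        return importlib.import_module("xml.etree.ElementTree")


def load_etree_without_lxml():
    """Run load_etree() while pretending that lxml is not installed."""
    # A None entry in sys.modules makes importing that name raise ImportError.
    # patch.dict restores the original entries when the block is left.
    with mock.patch.dict(sys.modules, {"lxml": None, "lxml.etree": None}):
        return load_etree()


print(load_etree_without_lxml().__name__)
```

This exercises the fallback branch without uninstalling anything, whether or not lxml is present on the machine.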



