[Python-Dev] itertools addition: getitem()
Walter Dörwald
walter at livinglogic.de
Wed Jul 11 10:11:35 CEST 2007
Giovanni Bajo wrote:
> On 09/07/2007 21.23, Walter Dörwald wrote:
>
>> >>> from ll.xist import parsers, xfind
>> >>> from ll.xist.ns import html
>> >>> e = parsers.parseURL("http://www.python.org", tidy=True)
>> >>> print e.walknode(html.h2 & xfind.hasclass("news"))[-1]
>> Google Adds Python Support to Google Calendar Developer's Guide
>>
>>
>> Get the first comment line from a python file:
>>
>> >>> getitem((line for line in open("Lib/codecs.py") if
>> line.startswith("#")), 0)
>> '### Registry and builtin stateless codec functions\n'
>>
>>
>> Create a new unused identifier:
>>
>> >>> def candidates(base):
>> ...     yield base
>> ...     for suffix in count(2):
>> ...         yield "%s%d" % (base, suffix)
>> ...
>> >>> usedids = set(("foo", "bar"))
>> >>> getitem((i for i in candidates("foo") if i not in usedids), 0)
>> 'foo2'
>
> You keep posting examples where you call your getitem() function with "0" as
> index, or -1.
>
> getitem(it, 0) already exists and it's spelled it.next(). getitem(it, -1)
> might be useful in fact, and it might be spelled last(it) (or it.last()). Then
> one may want to add first() for symmetry, but that's it:
>
> first(i for i in candidates("foo") if i not in usedids)
> last(line for line in open("Lib/codecs.py") if line[0] == '#')
>
> Are there real-world use cases for getitem(it, n) with n not in (0, -1)? I
> share Raymond's feelings on this. And by the way, if you wonder, I have these
> exact feelings as well for islice... :)
It's useful for screen scraping HTML. Suppose you have the following HTML
table:
<table>
<tr><td>01.01.2007</td><td>12.34</td><td>Foo</td></tr>
<tr><td>13.01.2007</td><td>23.45</td><td>Bar</td></tr>
<tr><td>04.02.2007</td><td>45.56</td><td>Baz</td></tr>
<tr><td>27.02.2007</td><td>56.78</td><td>Spam</td></tr>
<tr><td>17.03.2007</td><td>67.89</td><td>Eggs</td></tr>
<tr><td> </td><td>164.51</td><td>Total</td></tr>
<tr><td> </td><td>(incl. VAT)</td><td></td></tr>
</table>
To extract the total sum, you want the second column from the
second-to-last row, i.e. something like:
row = getitem((r for r in table if r.name == "tr"), -2)
col = getitem((c for c in row if c.name == "td"), 1)
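
For illustration, here is a minimal sketch of how such a getitem() could
behave on arbitrary iterators, including negative indices. It is not the
proposed itertools implementation: raising IndexError when the iterator is
too short is an assumption, and Node and make_row are hypothetical
stand-ins for parsed HTML nodes (here the children live in a .content
attribute instead of being reached by iterating over the node itself).

```python
from collections import deque, namedtuple
from itertools import islice

# Hypothetical stand-in for a parsed HTML node: a tag name plus its content.
Node = namedtuple("Node", ["name", "content"])


def getitem(iterable, index):
    """Return the item at position `index` of any iterable (sketch only)."""
    it = iter(iterable)
    if index >= 0:
        # Skip `index` items, then return the next one if there is one.
        for item in islice(it, index, None):
            return item
        raise IndexError("iterator index out of range")
    # Negative index: consume everything, remembering only the last -index items.
    tail = deque(it, maxlen=-index)
    if len(tail) < -index:
        raise IndexError("iterator index out of range")
    return tail[0]


def make_row(*cells):
    # Hypothetical helper: build a <tr> node holding one <td> node per cell.
    return Node("tr", [Node("td", text) for text in cells])


table = [
    make_row("01.01.2007", "12.34", "Foo"),
    make_row("13.01.2007", "23.45", "Bar"),
    make_row("", "164.51", "Total"),
    make_row("", "(incl. VAT)", ""),
]

row = getitem((r for r in table if r.name == "tr"), -2)
col = getitem((c for c in row.content if c.name == "td"), 1)
print(col.content)
```

The negative case has to exhaust the iterator, but the bounded deque keeps
memory use proportional to the index, not to the length of the input.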
Servus,
Walter