x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up
Veek. M
vek.m1234 at gmail.com
Sun Jan 31 01:31:45 EST 2016
Veek. M wrote:
> Chris Angelico wrote:
>
>> On Sun, Jan 31, 2016 at 3:58 PM, Veek. M <vek.m1234 at gmail.com> wrote:
>>> I'm parsing HTML and I'm doing:
>>>
>>> x = root.find_class(...
>>> y = root.find_class(..
>>> z = root.find_class(..
>>>
>>> All 3 are likely to fail, so typically I'd have to stick each one in
>>> a try. This is a huge pain for obvious reasons.
>>>
>>> try:
>>>     ....
>>> except something:
>>>     x = 'default_1'
>>> (repeat 3 times)
>>>
>>> Is there some other nice way to wrap this stuff up?
>>
>> I'm not sure what you're using to parse HTML here (there are several
>> libraries for doing that), but the first thing I'd look for is an
>> option to have it return a default if it doesn't find something - even
>> if that default has to be (say) None.
>>
>> But failing that, you can always write your own wrapper:
>>
>> def find_class(root, ...):
>>     try:
>>         return root.find_class(...)
>>     except something:
>>         return 'default_1'
>>
>> Or have the default as a parameter, if it's different for the different
>> ones.
>>
>> ChrisA
>
> I'm using lxml.html
>
> def parse_page(self, root):
>     for li_item in root.xpath('//li[re:test(@id, "^item[a-z0-9]+$")]',
>             namespaces={'re': "http://exslt.org/regular-expressions"}):
>         description = li_item.find_class('vip')[0].text_content()
>         link = li_item.find_class('vip')[0].get('href')
>         price_dollar = li_item.find_class('lvprice prc')[0].xpath('span')[0].text
>         bids = li_item.find_class('lvformat')[0].xpath('span')[0].text
>
>         tme_time = li_item.find_class('tme')[0].xpath('span')[0].get('timems')
>         if tme_time:
>             time_hrs = int(tme_time)/1000 - time.time()
>         else:
>             time_hrs = 'No time found'
>
>         shipping = li_item.find_class('lvshipping')[0].xpath('span/span/span')[0].text_content()
>
>         print('{} {} {} {} {}'.format(link, price_dollar, time_hrs,
>                                       shipping, bids))
>
>         print('-----------------------------------------------------------------')
Someone suggested I refactor the find_class/xpath calls into wrapper
functions, but I tried it and it didn't look all that great.
Just give me a general idea of how to deal with messy crud like this.
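
One possible shape for this, as a minimal sketch rather than a definitive answer: instead of one wrapper per lookup, a single helper can take the whole lookup chain as a callable and return a default when it fails. The names attempt and FakeItem are hypothetical and not part of lxml; FakeItem is only a stand-in so the example runs without lxml installed. The sketch assumes the failures show up as IndexError (from [0] on an empty list) or AttributeError (from calling a method on None).

```python
def attempt(func, default=None, exceptions=(IndexError, AttributeError)):
    """Call func() and return its result, or default if it raises one of exceptions."""
    try:
        return func()
    except exceptions:
        return default


class FakeItem:
    """Hypothetical stand-in for a parsed element, so the sketch runs without lxml."""

    def __init__(self, classes):
        self._classes = classes  # maps a class name to a list of values

    def find_class(self, name):
        # Return a (possibly empty) list, so [0] on a miss raises IndexError.
        return self._classes.get(name, [])


item = FakeItem({'vip': ['Some description']})

# Each lookup chain is wrapped in a lambda so it only runs inside attempt().
description = attempt(lambda: item.find_class('vip')[0], default='No description')
bids = attempt(lambda: item.find_class('lvformat')[0], default='No bids found')

print(description, '|', bids)
```

With real lxml elements the lambdas would hold the full chains from parse_page, for example lambda: li_item.find_class('lvformat')[0].xpath('span')[0].text. Keeping the exceptions tuple narrow is deliberate, so unrelated bugs still surface instead of silently turning into defaults.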