[Tutor] list comprehension, testing for multiple conditions

Dave Angel d at davea.name
Fri Aug 24 06:26:39 CEST 2012


(you replied off-list, so I'm cc'ing the list here, to keep it public)

On 08/23/2012 10:42 PM, Pete O'Connell wrote:
> On Fri, Aug 24, 2012 at 1:39 PM, Dave Angel <d at davea.name> wrote:
> 
>> On 08/23/2012 09:11 PM, Pete O'Connell wrote:
>>> Hi, I have tried to simplify things and am running into a bit of trouble.
>>> What I am really trying to do is: keep all the lines starting with "v " and
>>> then delete those lines whose index modulo 5 doesn't equal zero
>>>
>>> I have written it like this, which seems to take a really long time (a
>>> couple of minutes when iterating over a folder with 200 files to parse)
>>> #####################################
>>> with open(theFilePath) as lines:
>>>     #keep only the lines beginning with "v " (this works)
>>>     theGoodLines = [line.strip("\n") for line in lines if "v " == line[0:2]]
>>
>> Better to use startswith(), since short lines will cause the if
>> expression above to blow up.
>>
> Thanks that looks much safer.

Sorry about that.  Eryksun corrected me on that.  Your present code
won't blow up, but I'd still prefer startswith().
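
To illustrate the point, here is a minimal sketch (the sample strings are made up): slicing past the end of a string just yields a shorter string, whereas indexing a single character is the form that would raise.

```python
# Made-up sample lines: one too short, one empty, one that matches.
samples = ["v", "", "v 1.0 2.0 3.0"]

# A slice past the end of a string returns a shorter (possibly empty)
# string rather than raising, so the comparison is simply False.
by_slice = [line[0:2] == "v " for line in samples]

# startswith() gives the same answers and states the intent directly.
by_startswith = [line.startswith("v ") for line in samples]

# Indexing a single character is what fails on an empty line.
empty = ""
try:
    empty[0]
except IndexError:
    index_raises = True
else:
    index_raises = False

print(by_slice, by_startswith, index_raises)
```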

> 
>>>     theLinesAsListSubset = []
>>>     for i in range(len(theGoodLines)):
>>
>> When you see a line like this, it's usually clearer to do:
>>          for i, line in enumerate(theGoodLines):
>>>         nuke.tprint(i)
>>>         if i%5 != 0:
>>>             continue
>>>         elif i%5 == 0:
>>>             theLinesAsListSubset.append(theGoodLines[i])
>> It's confusing to me whether you meant to keep only one of every 5 lines
>> of the filtered input, or to keep only those lines of the filtered input
>> that came from the appropriate indices of the original data.  You need a
>> more precise spec before you can safely combine the two loops.  (You may
>> have it precise in your head;  I'm just saying it isn't clear to me)
>>
> Sorry that wasn't clear. I want to keep every fifth line.

Fifth of which list?  The one you start with, or the one you get after
throwing out the lines that don't begin with "v "?  You have presently
coded the latter, and if that's what you want, I can't see any
reasonable way to make it a single list comprehension.
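
As a rough sketch of how the two readings differ (the sample data below is made up, and the variable names are only for illustration): taking every fifth line of the original list fits in one comprehension with enumerate(), while taking every fifth line of the filtered list is easiest as two steps, a filter followed by a slice.

```python
# Hypothetical stand-in for the lines of one file.
lines = [
    "v a\n", "vt x\n", "v b\n", "v c\n", "f 1 2 3\n",
    "v d\n", "v e\n", "v f\n", "vn y\n", "v g\n", "v h\n",
]

# Reading 1: every fifth line of the ORIGINAL list, kept only if it
# also starts with "v ".  The index comes from the unfiltered input.
every_fifth_of_original = [
    line.strip("\n")
    for i, line in enumerate(lines)
    if i % 5 == 0 and line.startswith("v ")
]

# Reading 2: filter first, then keep every fifth line of what is left.
# This is what the posted two-loop code does; the slice replaces the
# second loop.
good_lines = [line.strip("\n") for line in lines if line.startswith("v ")]
every_fifth_of_filtered = good_lines[::5]

print(every_fifth_of_original)  # ['v a', 'v d', 'v h']
print(every_fifth_of_filtered)  # ['v a', 'v f']
```

The two results differ on this sample, which is why the spec has to be settled before the loops are combined.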

> 
>>
>>> ########################################
>>>
>>> I think it would be better to include the modulus test within the original
>>> list comprehension but I am not sure how to access the index of "line":
>>>     #something like this is a sketch of what I mean (I know it's wrong)
>>>     theGoodLines = [line.strip("\n") for line in lines if "v " == line[0:2] and line.getIndex() % 5 == 0]
>>>
>>>
>>> Do I need enumerate for this maybe?
>> Good call.   Question is whether to do the enumerate on the original
>> list, or on the list you get after.  That decision would be based on the
>> question above.
>>
> I have noticed that part of the slowness comes from the feedback I am
> getting on the command line with print statements. When I streamline those
> it is much faster.
> It is usable even in its current state. Still, out of curiosity and for my
> Python understanding, it would be nice to know if it is possible to write
> it all within one list comprehension.
> Thanks
> Pete
> 
>>

-- 

DaveA

