Exclude 'None' from list comprehension of dicts

Fri Aug 5 16:44:37 EDT 2022

Benchmarking aside, Loris, there are some ideas about such things.

You are describing a case, in abstract terms, where an algorithm grinds away
and produces results that may include an occasional or a common unwanted
result. The question is when to eliminate the unwanted. Do you eliminate
them immediately at the expense of some extra code at that point, or do you
wait till much later or even at the end?

The answer is it DEPENDS and let me point out that many problems can start
multi-dimensional (as in processing a 5-D matrix) and produce a linear
output (as in a 1-D list) or it can be the other way around. Sometimes what
you want eliminated is something like duplicates. Is it easier to remove
duplicates as they happen, or later when you have some huge data structure
containing oodles of copies of each duplicate?
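
To make the duplicates question concrete, here is a minimal sketch of both
approaches. The function names and the sample words are made up for
illustration and come from nowhere in this thread.

```python
def dedupe_as_you_go(items):
    """Yield each item the first time it appears, preserving order."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


def dedupe_at_end(items):
    """Collect everything first, then drop duplicates in one pass."""
    collected = list(items)  # may hold many copies of each value
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(collected))


words = ["spam", "eggs", "spam", "ham", "eggs", "spam"]
early = list(dedupe_as_you_go(words))
late = dedupe_at_end(words)
```

Both give the same answer here. The first never holds more than one copy of
each item, while the second briefly holds them all, which only matters if
the intermediate structure gets huge.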

You can imagine many scenarios and sometimes you need to also look at costs.
What does it cost to check if a token is valid, as in can the word be found
in a dictionary? Is it cheaper to wait till you have lots of words including
duplicates and do one lookup to find a bad word then mark it so future
occurrences are removed without that kind of lookup? Or is it better to read
in the dictionary once and hash it so later access is easy?
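
As a rough sketch of those two options, assuming a tiny made-up word list
in place of a real dictionary and a hypothetical slow_lookup function
standing in for an expensive check:

```python
# Hypothetical word list standing in for a real dictionary file.
DICTIONARY_WORDS = ["apple", "banana", "cherry"]

# Option 1: read the dictionary once and hash it, so each later
# membership test is cheap.
valid_words = frozenset(DICTIONARY_WORDS)


def keep_valid(tokens, valid=valid_words):
    """Keep only the tokens found in the hashed dictionary."""
    return [t for t in tokens if t in valid]


# Option 2: the check itself is expensive, so remember the verdict
# for each distinct token and never look the same one up twice.
def keep_valid_with_memo(tokens, is_valid):
    verdicts = {}
    kept = []
    for t in tokens:
        if t not in verdicts:
            verdicts[t] = is_valid(t)
        if verdicts[t]:
            kept.append(t)
    return kept


tokens = ["apple", "zzz", "banana", "zzz", "apple"]
calls = []  # records which words actually reached the slow check


def slow_lookup(word):
    """Stand-in for an expensive validity check."""
    calls.append(word)
    return word in DICTIONARY_WORDS


filtered = keep_valid_with_memo(tokens, slow_lookup)
```

In this sketch the repeated "zzz" and "apple" reach slow_lookup only once
each. Which option wins in practice depends on how big the dictionary is
and how often tokens repeat.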

In your case, you have a single simple criterion for recognizing an item to
leave out. So the above may not apply. But I note we often use pre-created
software that simply returns a result and then the only reasonable way to
remove things is after calling it. Empty or unwanted items may take up some
room, though, so a long-running process may be better off pruning as it
goes.
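
For your specific case, pruning as you go versus pruning afterwards might
look something like the sketch below. The body of get_job_efficiency_dict
is a stand-in I invented (it pretends odd job ids fail), since I have not
seen your real function, and the first form needs Python 3.8 or later for
the := operator.

```python
def get_job_efficiency_dict(job_id):
    """Stand-in for the real function: returns a dict, or None on failure."""
    if job_id % 2:  # pretend that odd job ids fail
        return None
    return {"job_id": job_id, "efficiency": 0.5}


job_ids = [1, 2, 3, 4]

# Prune while building: the None results never enter the list.
data = [
    d
    for job_id in job_ids
    if (d := get_job_efficiency_dict(job_id)) is not None
]

# Prune afterwards: build the full list, then filter it in a second pass.
raw = [get_job_efficiency_dict(job_id) for job_id in job_ids]
filtered_data = [d for d in raw if d is not None]
```

One difference from filter(None, data) is that it drops anything falsy,
including an empty dict, while the "is not None" test drops only None.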

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
Behalf Of Loris Bennett
Sent: Friday, August 5, 2022 1:50 AM
To: python-list at python.org
Subject: Re: Exclude 'None' from list comprehension of dicts

Antoon Pardon <antoon.pardon at vub.be> writes:

> Op 4/08/2022 om 13:51 schreef Loris Bennett:
>> Hi,
>>
>> I am constructing a list of dictionaries via the following list
>> comprehension:
>>
>>    data = [get_job_efficiency_dict(job_id) for job_id in job_ids]
>>
>> However,
>>
>>    get_job_efficiency_dict(job_id)
>>
>> uses 'subprocess.Popen' to run an external program and this can fail.
>> In this case, the dict should just be omitted from 'data'.
>>
>> I can have 'get_job_efficiency_dict' return 'None' and then run
>>
>>    filtered_data = list(filter(None, data))
>>
>> but is there a more elegant way?
>
> Just wondering, why don't you return an empty dictionary in case of a
> failure?
> In that case your list will be all dictionaries and empty ones will be
> processed fast enough.

When the list of dictionaries is processed, I would have to check each
element to see if it is empty.  That strikes me as being less efficient than
filtering out the empty dictionaries in one go, although obviously one would
need to benchmark that.

Cheers,

Loris

--
This signature is currently under construction.