Using namedtuples field names for column indices in a list of lists

Peter Otten __peter__ at web.de
Mon Jan 9 09:51:14 EST 2017


Deborah Swanson wrote:

> Even better, to get hold of all the records with the same Description as
> the current row, compare them all, mark all but the different ones for
> deletion, and then resume processing the records after the last one?

If you compare all fields for deduplication anyway, there's no need to
treat one field (Description) specially. Just

records = set(records)

should be fine, provided the records are hashable (namedtuples are, as
long as their field values are). As the initial order is lost (*), you
probably want to sort afterwards. The code then becomes

import operator

records = sorted(
    set(records),
    key=operator.attrgetter("Description")
)
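
To make that concrete, here is a small self-contained sketch. The Record
type, its field names and the sample rows are made up for illustration;
only the set()/sorted() combination comes from the snippet above.

```python
import operator
from collections import namedtuple

# Hypothetical record type; the field names are assumptions.
Record = namedtuple("Record", ["Description", "Price", "Location"])

records = [
    Record("house", 900, "south"),
    Record("flat", 500, "north"),
    Record("flat", 500, "north"),   # exact duplicate of the row above
    Record("flat", 550, "north"),   # same Description, different Price
]

# Namedtuples hash and compare like plain tuples, so set() only drops
# rows that are equal in every field.
unique = sorted(set(records), key=operator.attrgetter("Description"))
```

Rows that share a Description come out in arbitrary relative order,
because the set has none. If that matters, sorting on the whole tuple
with a plain sorted(set(records)) gives a deterministic result, as long
as the field values can be compared with each other.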

Now if you want to fill in missing values, you should probably do this
before deduplication, because records that differ only by a missing
value do not compare equal until the gap is filled. The complete()
function introduced in

https://mail.python.org/pipermail/python-list/2016-December/717847.html

can be adapted to work with namedtuples instead of dicts.
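
The linked complete() is not reproduced here. What follows is only a
rough sketch of how filling might look with namedtuples, under two
assumptions that are mine: a missing value is an empty string, and gaps
are filled from another record with the same Description. The
fill_missing() name and the Record fields are hypothetical. Since
namedtuples are immutable, the updated rows are built with _replace().

```python
from collections import namedtuple

# Hypothetical record type; the field names are assumptions.
Record = namedtuple("Record", ["Description", "Price", "Location"])


def fill_missing(records, missing=""):
    """Return new records with gaps filled from same-Description rows."""
    # Description -> {field name: first non-missing value seen}
    known = {}
    for rec in records:
        values = known.setdefault(rec.Description, {})
        for name, value in rec._asdict().items():
            if value != missing:
                values.setdefault(name, value)

    filled = []
    for rec in records:
        values = known[rec.Description]
        updates = {
            name: values[name]
            for name, value in rec._asdict().items()
            if value == missing and name in values
        }
        # _replace() returns a new namedtuple; the original is untouched.
        filled.append(rec._replace(**updates))
    return filled


records = [
    Record("flat", "500", ""),
    Record("flat", "", "north"),
    Record("house", "900", ""),
]
filled = fill_missing(records)
unique = set(filled)
```

After filling, the two "flat" rows are equal in every field, so the set
keeps only one of them. The "house" row keeps its empty Location because
no other row supplies a value for it.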

(*) If you want to preserve the initial order, you can use the records
as keys of a collections.OrderedDict instead of putting them in a set.
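
A minimal sketch of that variant, again with a made-up Record type and
sample rows:

```python
from collections import OrderedDict, namedtuple

# Hypothetical record type; the field names are assumptions.
Record = namedtuple("Record", ["Description", "Price"])

records = [
    Record("house", 900),
    Record("flat", 500),
    Record("house", 900),   # duplicate of the first row
]

# The keys of an OrderedDict are unique and keep the order in which
# they were first inserted; the values (None here) are not used.
unique_in_order = list(OrderedDict.fromkeys(records))
```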



