Using namedtuples field names for column indices in a list of lists
Peter Otten
__peter__ at web.de
Mon Jan 9 09:51:14 EST 2017
Deborah Swanson wrote:
> Even better, to get hold of all the records with the same Description as
> the current row, compare them all, mark all but the different ones for
> deletion, and then resume processing the records after the last one?
When you look at all fields for deduplication anyway, there's no need to
treat one field (Description) specially. Just
    records = set(records)
should be fine. As the initial order is lost (*), you probably want to sort
afterwards. The code then becomes
    import operator

    records = sorted(
        set(records),
        key=operator.attrgetter("Description")
    )
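
To illustrate why the set-based approach works: namedtuples are hashable and
compare equal when all their fields are equal, so a set keeps exactly one copy
of each distinct row. The following is only a sketch; the Record type, its
Location and Price fields, and the sample rows are made up for the example
(only the Description field comes from the thread).

```python
import operator
from collections import namedtuple

# Hypothetical record type; only "Description" is taken from the thread.
Record = namedtuple("Record", ["Description", "Location", "Price"])

records = [
    Record("flat", "Berlin", "500"),
    Record("house", "Bonn", "900"),
    Record("flat", "Berlin", "500"),   # exact duplicate of the first row
    Record("flat", "Hamburg", "700"),  # same Description, different data
]

# set() drops rows that are equal in every field; sorted() restores a
# predictable grouping by Description.
deduped = sorted(set(records), key=operator.attrgetter("Description"))
```

Note that rows sharing a Description come out in arbitrary relative order,
because the set does not remember insertion order. Sorting without a key
(i.e. by the whole tuple) would make the result fully deterministic.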
Now if you want to fill in missing values, you should probably do this
before deduplication -- and the complete() function introduced in
https://mail.python.org/pipermail/python-list/2016-December/717847.html
can be adapted to work with namedtuples instead of dicts.
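
The linked complete() is not reproduced here. As a rough sketch of the idea
with namedtuples, the hypothetical fill_missing() helper below assumes that a
missing value is an empty string and that rows with the same Description are
meant to share their values; both are assumptions for illustration, not the
original function.

```python
from collections import namedtuple

# Hypothetical record type; only "Description" is taken from the thread.
Record = namedtuple("Record", ["Description", "Location", "Price"])


def fill_missing(records, key="Description"):
    """Fill empty fields from other records that share the same key field.

    Hypothetical helper: "missing" is assumed to mean a falsy value such
    as the empty string.
    """
    known = {}  # key value -> {field name: first non-empty value seen}
    for rec in records:
        values = known.setdefault(getattr(rec, key), {})
        for name, value in rec._asdict().items():
            if value:
                values.setdefault(name, value)

    filled = []
    for rec in records:
        values = known[getattr(rec, key)]
        # Only touch fields that are empty; keep them empty if no other
        # record supplies a value.
        updates = {
            name: values.get(name, value)
            for name, value in rec._asdict().items()
            if not value
        }
        filled.append(rec._replace(**updates))
    return filled


records = [
    Record("flat", "Berlin", ""),
    Record("flat", "", "500"),
    Record("house", "Bonn", "900"),
]
filled = fill_missing(records)
unique = set(filled)  # the two "flat" rows are now identical
```

Filling in first matters because two rows that differ only in a missing value
are not equal as tuples, so the set would keep both of them.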
(*) If you want to preserve the initial order you can use a
collections.OrderedDict instead of the set.
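
A minimal sketch of the order-preserving variant, again with a made-up Record
type and sample rows: OrderedDict.fromkeys() uses the records as keys, so
duplicates collapse while the first occurrence keeps its position.

```python
from collections import OrderedDict, namedtuple

# Hypothetical record type; only "Description" is taken from the thread.
Record = namedtuple("Record", ["Description", "Location", "Price"])

records = [
    Record("house", "Bonn", "900"),
    Record("flat", "Berlin", "500"),
    Record("house", "Bonn", "900"),   # exact duplicate of the first row
    Record("flat", "Hamburg", "700"),
]

# Keys of an OrderedDict are unique and keep insertion order.
unique = list(OrderedDict.fromkeys(records))
```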