[Chennaipy] Finding duplicates of row in csv
Rajagopal Jagannathan
rajagopal.jagannathan at gmail.com
Mon Feb 12 08:18:51 EST 2018
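You can do this in multiple ways.
In a shell script:
sort <filename> | uniq -c
This prefixes each distinct line with the number of times it occurs, so any
line with a count above 1 is a duplicate. To print only the duplicated lines,
use uniq -d instead of uniq -c.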
If you have to use Python, a quick way to do it is to load the csv into a
pandas dataframe, which lets you call
dataframe.duplicated()
to get a boolean mask marking each row that repeats an earlier one.
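A minimal sketch of the pandas approach, in case it helps. The inline sample
and its comma-separated layout are assumptions made for illustration; with a
real file you would pass its path to pd.read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical sample, assumed to be comma-separated.
# For a real file, use pd.read_csv("your_file.csv") instead.
SAMPLE = """Name,age,employer
Kumar,28,133678
Kumar,28,133678
Anil,42,133567
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# keep=False flags every member of a duplicate group,
# not only the second and later occurrences.
dupes = df[df.duplicated(keep=False)]
print(dupes)
```

With the default keep="first", only the later copies are flagged, which is
handy if you want to drop them rather than list them.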
If you don't want to use pandas, then you can loop through the csv and
build a dictionary (hashmap) with the row as the key and its count as the
value. Any row whose count ends up greater than 1 is a duplicate.
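The loop-and-count idea could look roughly like this with only the standard
library. The function name find_duplicate_rows and the comma-separated sample
are assumptions for illustration. Each row is converted to a tuple because
lists cannot be used as dictionary keys.

```python
import csv
import io
from collections import Counter

# Hypothetical sample, assumed to be comma-separated.
SAMPLE = """Name,age,employer
Kumar,28,133678
Kumar,28,133678
Anil,42,133567
"""


def find_duplicate_rows(lines):
    """Return {row_tuple: count} for every row that occurs more than once."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row, if there is one
    counts = Counter(tuple(row) for row in reader)
    return {row: n for row, n in counts.items() if n > 1}


duplicates = find_duplicate_rows(io.StringIO(SAMPLE))
for row, n in duplicates.items():
    print(n, row)
```

For a file on disk, open it with open("your_file.csv", newline="") and pass
the file object to the function in place of the StringIO.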
Hope this helps.
On Sun, Feb 11, 2018 at 9:07 AM, Saravanan Muthu <saravana4285 at gmail.com>
wrote:
> Hello All,
> I have a csv with multiple columns, and I need to figure out the
> duplicate entries. I have imported the csv and assigned the rows to a
> dictionary. Please share the logic to find the duplicates. Sample data is below:
>
> Name age employer
> Kumar 28 133678
> Kumar 28 133678
> Anil. 42. 133567
>
> The Kumar entry needs to be found.