Python3, column names from array - numpy or pandas

Rhodri James rhodri at kynesim.co.uk
Thu Dec 15 08:51:59 EST 2016


On 15/12/16 01:56, renjith madhavan wrote:
> I have a dataset in the below format.
>
> id	A	B	C	D	E
> 100	1	0	0	0	0
> 101	0	1	1	0	0
> 102	1	0	0	0	0
> 103	0	0	0	1	1
>
> I would like to convert this into below:
> 100, A
> 101, B C
> 102, A
> 103, D E
>
> How do I do this ? I tried numpy argsort but I am new to Python and finding this challenging.
> Appreciate any help in this.
>

Numpy or pandas?  Neither: this is a straightforward bit of text
manipulation you can do without needing to import anything.  I wouldn't
bother considering either unless your dataset is massive and speed is
actually an issue.

with open("data.txt") as datafile:
    # First line holds the column names; drop the leading "id" header
    line = next(datafile)
    columns = line.split()[1:]
    # Now iterate through the rest
    for line in datafile:
        # data[0] is the row id, data[1:] are the 0/1 flags
        data = line.split()
        results = []
        for col, val in zip(columns, data[1:]):
            if val == "1":
                results.append(col)
        print("{0}, {1}".format(data[0], " ".join(results)))

Obviously there's no defensive coding for blank lines or unexpected data
in there, and if you want to use the results later on you probably want to
stash them in a dictionary, but that will do the job.
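
If you do want the dictionary, something along these lines might serve as a
starting point.  It is only a sketch: the function name active_columns and
the inline sample data are made up for illustration, and the only defensive
step it takes is skipping blank lines.  Malformed rows are not checked.

```python
def active_columns(lines):
    """Map each row id to the list of column names whose value is "1"."""
    rows = iter(lines)
    # The first non-blank line holds the column names; drop the "id" header.
    columns = []
    for line in rows:
        if line.strip():
            columns = line.split()[1:]
            break
    results = {}
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # fields[0] is the row id, fields[1:] are the 0/1 flags
        results[fields[0]] = [
            col for col, val in zip(columns, fields[1:]) if val == "1"
        ]
    return results


# Hypothetical sample standing in for the contents of data.txt
sample = """id A B C D E
100 1 0 0 0 0
101 0 1 1 0 0

102 1 0 0 0 0
103 0 0 0 1 1
"""

table = active_columns(sample.splitlines())
for row_id, names in table.items():
    print("{0}, {1}".format(row_id, " ".join(names)))
```

Because the function takes any iterable of lines, you can pass it an open
file object in place of sample.splitlines().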

-- 
Rhodri James *-* Kynesim Ltd

