Python3, column names from array - numpy or pandas
Rhodri James
rhodri at kynesim.co.uk
Thu Dec 15 08:51:59 EST 2016
On 15/12/16 01:56, renjith madhavan wrote:
> I have a dataset in the below format.
>
> id A B C D E
> 100 1 0 0 0 0
> 101 0 1 1 0 0
> 102 1 0 0 0 0
> 103 0 0 0 1 1
>
> I would like to convert this into below:
> 100, A
> 101, B C
> 102, A
> 103, D E
>
> How do I do this ? I tried numpy argsort but I am new to Python and finding this challenging.
> Appreciate any help in this.
>
Numpy or pandas? Neither: this is a straightforward bit of text
manipulation you can do without importing anything. I wouldn't
bother with either unless your dataset is massive and speed is
actually an issue.

with open("data.txt") as datafile:
    # First line needs handling separately: it holds the column names
    line = next(datafile)
    columns = line.split()[1:]
    # Now iterate through the rest
    for line in datafile:
        data = line.split()
        results = []
        # Pair each column name with this row's value for it
        for col, val in zip(columns, data[1:]):
            if val == "1":
                results.append(col)
        print("{0}, {1}".format(data[0], " ".join(results)))

Obviously there's no defensive coding for blank lines or unexpected data
in there, and if you want to use the results later on you probably want
to stash them in a dictionary, but that will do the job.
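
For what it's worth, the defensive checks and the dictionary mentioned
above might look something like the sketch below. The function name
parse_flags, the decision to skip blank lines and the decision to raise
ValueError on a short or long row are all assumptions made for
illustration, not anything the original question asked for. It takes any
iterable of lines, so an open file object works as well as the list of
strings used here.

```python
def parse_flags(lines):
    """Map each row id to the list of column names whose value is "1"."""
    # Split every line into fields and drop lines that are blank
    rows = (line.split() for line in lines)
    rows = (fields for fields in rows if fields)
    try:
        header = next(rows)
    except StopIteration:
        return {}  # no header at all: nothing to report
    columns = header[1:]
    results = {}
    for fields in rows:
        row_id, values = fields[0], fields[1:]
        # Assumed policy: a row with the wrong number of values is an error
        if len(values) != len(columns):
            raise ValueError(
                "row {0!r}: expected {1} values, got {2}".format(
                    row_id, len(columns), len(values)))
        results[row_id] = [col for col, val in zip(columns, values)
                           if val == "1"]
    return results


sample = """\
id A B C D E
100 1 0 0 0 0
101 0 1 1 0 0

102 1 0 0 0 0
103 0 0 0 1 1
"""
results = parse_flags(sample.splitlines())
for row_id, names in results.items():
    print("{0}, {1}".format(row_id, " ".join(names)))
```

With a real file you would call it as parse_flags(datafile) inside the
"with open(...)" block. Note that the ids come back as strings, and that
the printed order only matches the input order on Python 3.7 or later,
where dictionaries keep insertion order.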
--
Rhodri James *-* Kynesim Ltd