[Tutor] Iterating over columns in a tab-delimited text and
writing SQL statements
Kent Johnson
kent_johnson at skillsoft.com
Sat Nov 6 13:41:36 CET 2004
Oh, I forgot - my guess is the reason you get an IndexError is that you
have a blank line in your file. I would check for the expected number of
fields in each data line before processing it. I've modified my program
below to do this.
<...goes to change the program to check for short lines...>
No, actually, I don't have to change my program. The semantics of slicing
and zip() make it work fine for short lines.
data[2:] just returns an empty list if len(data) <= 2
zip(header, []) also returns an empty list
so the iteration over zip(header, data[2:]) is just empty when a data line
is too short. For lines that are longer than 2 items but shorter than
expected it will just use what is there.
Kent
At 12:33 AM 11/6/2004 -0500, Kent Johnson wrote:
>You need two nested loops to do this. zip() is also very handy, it lets
>you iterate two lists at the same time. I think this does what you want:
>
>f1 = open('GeneName.txt', 'r')
>header = f1.readline().split()[3:]
>
>for line in f1:
> data = line.split()
> for sample, expr in zip(header, data[2:]):
> sql = '''INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('%s', '%s', '%s', %s)''' % (data[0], data[1], sample,
> expr)
> print sql
>
>For your sample data, it prints
>INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('xyz', '1000_at', 'AL21', 272.3)
>INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('xyz', '1000_at', 'AL11', 345.2)
>INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('xyz', '1000_at', 'AL12', -32.45)
>INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('abc', '1001_at', 'AL21', 134.4)
>INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('abc', '1001_at', 'AL11', 45.3)
>INSERT into expression (gene_name,probe_name,sample,exprs)
> VALUES ('abc', '1001_at', 'AL12', 342.59)
More information about the Tutor
mailing list