[Tutor] map one file and print it out following the sequence
Dave Angel
d at davea.name
Thu Oct 13 17:43:15 CEST 2011
On 10/13/2011 09:09 AM, lina wrote:
> <snip>
>
>> I think your final version of sortfile() might look something like:
>>
>> def sortfile(infilename=INFILENAME, outfilename=OUTFILENAME):
>>     infile = open(infilename, "r")
>>     intext = infile.readlines()
>>     outfile = open(outfilename, "w")
>>     for chainid in CHAINID:
>>         print("chain id = ",chainid)
>>         sortoneblock(chainid, intext, outfile)
>>     infile.close()
>>     outfile.close()
>>
>
> $ python3 map-to-itp.py
> {'O4': '2', 'C19': '3', 'C21': '1'}
> C
> Traceback (most recent call last):
> File "map-to-itp.py", line 55, in <module>
> sortfile()
> File "map-to-itp.py", line 17, in sortfile
> sortoneblock(chainid,intext,OUTFILENAME)
> File "map-to-itp.py", line 29, in sortoneblock
> f.write(line[1].strip() for line in temp)
> TypeError: must be str, not generator
>
>
When you see an error message that mentions a generator, it usually means
you have used a for-expression as a value.
At your stage of learning you should probably be ignoring generators and
list comprehensions, and just writing simple for loops. So you should
replace the f.write with a loop.
    for item in temp:
        f.write(something + "\n")
One advantage is that you can easily stuff print() functions into the
loop, to debug what's really happening. After you're sure it's right,
it might be appropriate to use either a generator or a list comprehension.
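
To make the error concrete, here is a minimal sketch. It assumes an
in-memory io.StringIO in place of the real output file and a made-up temp
list of (index, line) tuples; neither comes from the original script.

```python
import io

# Hypothetical stand-in data: (index, line) tuples like those kept in temp.
temp = [("1", "ATOM C21\n"), ("2", "ATOM O4\n")]

f = io.StringIO()  # assumption: stands in for the real output file

# Passing a generator expression to write() fails with TypeError,
# the same kind of error shown in the traceback.
try:
    f.write(line[1].strip() for line in temp)
    failed = False
except TypeError:
    failed = True

# A plain loop passes one string per write() call, and it is easy to
# drop a print() inside it while debugging.
for item in temp:
    f.write(item[1].strip() + "\n")

result = f.getvalue()
```

The failed write() adds nothing to the buffer, so result holds only what
the loop wrote.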
> I don't know how to fix the writing issue.
>
> can I write the different chainID one into the same OUTFILE?
>
> Thanks, I attached the code I used below:
>
> #!/usr/bin/python3
>
> import os.path
>
> LINESTOSKIP=0
> CHAINID="CDEFGHI"
> INFILENAME="pdbone.pdb"
> OUTFILENAME="sortedone.pdb"
> DICTIONARYFILE="itpone.itp"
> mapping={}
> valuefromdict={}
>
> def sortfile():
> intext=fetchonefiledata(INFILENAME)
> for chainid in CHAINID:
> print(chainid)
> sortoneblock(chainid,intext,OUTFILENAME)
>
One way to get all the output into one file is to create the file in
sortfile(), and pass the file object. Look again at what I suggested
for sortfile(). If you can open the file once, here, you won't have the
overhead of constantly opening the same file that nobody closed, and
you'll have the side benefit that the old contents of the file will be
overwritten.
Andreas' suggestion of using append would make more sense if you wanted
the output to accumulate over multiple runs of the program. If you
don't want the output file to be the history of all the runs, then
you'll need to do one open(name, "w"), probably in sortfile(), and then
you might as well pass the file object as I suggested.
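
As a rough sketch of that idea: the sortoneblock() below is a deliberately
simplified stand-in (it only copies lines that start with the chain id),
and the extra parameters to sortfile() and the temporary path are
assumptions made so the example runs on its own. It uses a with statement
instead of an explicit close(), which has the same effect here.

```python
import os
import tempfile

def sortoneblock(chainid, intext, outfile):
    # Simplified stand-in: write the chain id, then the lines for that chain.
    outfile.write(chainid + "\n")
    for line in intext:
        if line.startswith(chainid):
            outfile.write(line.rstrip() + "\n")

def sortfile(intext, outfilename, chainids):
    # Open the file once with "w": old contents are discarded, and every
    # chain is written through the same file object.
    with open(outfilename, "w") as outfile:
        for chainid in chainids:
            sortoneblock(chainid, intext, outfile)

# Hypothetical input and output location, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "sortedone.pdb")
lines = ["D second\n", "C first\n"]

sortfile(lines, path, "CD")
sortfile(lines, path, "CD")  # a second run overwrites instead of accumulating

with open(path) as f:
    content = f.read()
```

With mode "a" in place of "w", the second call would have doubled the
file's contents.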
>
>
> def sortoneblock(cID,TEXT,OUTFILE):
If you followed my suggestions for sortfile(), then the last parameter to
this function would be outfile, and you could use outfile.write().
As Andreas says, don't use uppercase for non-constants.
> temp = []
    #this writes the cID to the output file, once per cID
    outfile.write(cID + "\n")
> for line in TEXT:
> blocks=line.strip().split()
> if len(blocks)== 11 and blocks[3] == "CUR" and blocks[4] == cID and
> blocks[2] in mapping.keys():
        if (len(blocks)== 11 and blocks[3] == "CUR"
                and blocks[4] == cID and blocks[2] in mapping ):
Having the .keys() in that test is redundant. "in" already knows how to
look things up efficiently in a dictionary. In Python 2, .keys() builds a
list first, which turns a fast hash lookup into a slow linear search; in
Python 3 it returns a view, so the cost is small, but the call still adds
nothing.
Also, if you put parentheses around the whole if clause, you can span it
across multiple lines without doing anything special.
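
A small sketch of both points. The mapping is the one printed in your run;
the sample line is made up to have 11 whitespace-separated fields and is
not taken from your real .pdb file.

```python
# Dictionary as printed by generatedictionary() in the run above.
mapping = {"O4": "2", "C19": "3", "C21": "1"}

# Hypothetical 11-field line, for illustration only.
line = "ATOM 1 C21 CUR C 1 0.0 0.0 0.0 1.00 0.00"
blocks = line.strip().split()
cID = "C"

# The parentheses let the condition span several lines with no backslashes.
matched = (len(blocks) == 11 and blocks[3] == "CUR"
           and blocks[4] == cID and blocks[2] in mapping)

# Testing against the dict gives the same answer as testing against .keys().
same = (blocks[2] in mapping) == (blocks[2] in mapping.keys())
```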
> temp.append((mapping[blocks[2]],line))
> temp.sort()
> with open(OUTFILE,"w") as f:
> f.write(line[1].strip() for line in temp)
>
See comment above for splitting this write into a loop. You are also
going to have to decide what to write, as you have a tuple containing both
an index number and a string in each item of temp. Probably you want to
write the second item of the tuple. Combining these changes, you
would have
    for index, line in temp:
        outfile.write(line + "\n")
Note that the following are equivalent:
    for item in temp:
        index, line = item
        outfile.write(line + "\n")

    for item in temp:
        outfile.write(item[1] + "\n")
But I like the first form, since it makes it clear what's been stored in
temp. That sort of thing is important if you ever change it.
>
>
>
> def generatedictionary(dictfilename):
> text=fetchonefiledata(DICTIONARYFILE)
> for line in text:
> parts=line.strip().split()
> if len(parts)==8:
> mapping[parts[4]]=parts[0]
> print(mapping)
>
>
>
> def fetchonefiledata(infilename):
> text=open(infilename).readlines()
> if os.path.splitext(infilename)[1]==".itp":
> return text
> if os.path.splitext(infilename)[1]==".pdb":
> return text[LINESTOSKIP:]
> infilename.close()
>
>
> if __name__=="__main__":
> generatedictionary(DICTIONARYFILE)
> sortfile()
>
Final note: write() doesn't automatically append a newline, so I tend to
add an explicit one in the write() itself. But if you start seeing
double spacing, that's presumably because the line already had a newline
in it. You could use rstrip() on it (my choice), or remove the + "\n"
in the write() method.
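
A quick sketch of the double-spacing problem and the rstrip() fix, again
assuming an in-memory io.StringIO and made-up lines instead of your real
data:

```python
import io

# Hypothetical lines: one still has its newline, one does not.
lines = ["first\n", "second"]

# Adding "\n" blindly doubles the newline on lines that already end in one.
doubled = io.StringIO()
for line in lines:
    doubled.write(line + "\n")

# rstrip() removes any trailing whitespace first, so each line ends in
# exactly one newline.
clean = io.StringIO()
for line in lines:
    clean.write(line.rstrip() + "\n")

doubled_text = doubled.getvalue()
clean_text = clean.getvalue()
```

Note that rstrip() with no argument also removes trailing spaces and tabs,
which may or may not matter for fixed-width formats.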
--
DaveA