No subject
Shashank Singh
shashank.sunny.singh at gmail.com
Sat May 1 02:31:56 EDT 2010
Here is my quick take on it, using re:
import re

strings = ["1 ALA Helix Sheet Helix Coil",
           "2 ALA Coil Coil Coil Sheet",
           "3 ALA Helix Sheet Coil Turn",
           "4 ALA Helix Sheet Helix Sheet"]

# A space, then a token (lazy match up to a word boundary) that
# appears again later in the same line (backreference in a lookahead).
regex = re.compile(r" (.+?\b)(?=.*\1)")

for s in strings:
    moreThanOnce = list(set(regex.findall(s)))
    count = len(moreThanOnce)
    if count == 1:
        print(moreThanOnce[0])
    elif count == 2:
        print("doubtful")
    else:
        print("error")
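
To make it easier to see what the pattern captures, here is a small sketch that wraps it in a helper and returns the distinct repeated tokens for one line. The name repeated_tokens and the "Coiled" input are only illustrations, not part of the original question. One caveat it shows: the backreference \1 inside the lookahead can also match inside a longer word, so "Coil" would be reported as repeated if a later token were "Coiled".

```python
import re

# Same pattern as in the post: a space, a lazily matched token ending at a
# word boundary, accepted only if the same text occurs again later on.
regex = re.compile(r" (.+?\b)(?=.*\1)")


def repeated_tokens(line):
    """Return the sorted, distinct tokens the pattern finds repeated.

    repeated_tokens is a hypothetical helper name used for illustration.
    """
    return sorted(set(regex.findall(line)))


for line in ["1 ALA Helix Sheet Helix Coil",
             "4 ALA Helix Sheet Helix Sheet",
             "5 ALA Coil Coiled Turn Sheet"]:  # made-up line showing the caveat
    print(repeated_tokens(line))
```

For the four sample lines this gives the same classification as the loop above; the made-up fifth line is where the substring caveat would matter.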
Although this is short, it's probably not the most efficient.
A more verbose and efficient version would be:
for s in strings:
    l = s.split()[2:]  # drop the residue number and residue name
    counts = {}
    for ss in l:
        if ss in counts:
            counts[ss] += 1
        else:
            counts[ss] = 1
    # structures that occur at least twice on this line
    filtered = [ss for ss in counts if counts[ss] >= 2]
    filteredCount = len(filtered)
    if filteredCount == 1:
        print(filtered[0])
    elif filteredCount > 1:
        print("doubtful")
    else:
        print("error")
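
The same counting idea can also be sketched with collections.Counter, assuming a Python version that ships it (2.7 or later). The function name classify is hypothetical, and it assumes, as in the sample lines, that the first two columns are the residue number and residue name.

```python
from collections import Counter


def classify(line):
    """Classify one line such as '1 ALA Helix Sheet Helix Coil'.

    classify is a hypothetical helper; it assumes the first two
    whitespace-separated columns are not structure predictions.
    """
    predictions = line.split()[2:]
    counts = Counter(predictions)
    # Keep every structure that occurs at least twice on this line.
    repeated = [name for name, n in counts.items() if n >= 2]
    if len(repeated) == 1:
        return repeated[0]
    elif len(repeated) > 1:
        return "doubtful"
    return "error"


for s in ["1 ALA Helix Sheet Helix Coil",
          "2 ALA Coil Coil Coil Sheet",
          "3 ALA Helix Sheet Coil Turn",
          "4 ALA Helix Sheet Helix Sheet"]:
    print(classify(s))
```

Returning the label instead of printing it inside the function should make it straightforward to apply to each line of a file such as the CalcSecondary4.txt mentioned in the question.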
HTH
On Sat, May 1, 2010 at 9:03 AM, mannu jha <mannu_0523 at rediffmail.com> wrote:
> Dear all,
>
> I am trying my problem in this way:
>
> import re
> expr = re.compile("Helix Helix| Sheet Sheet| Turn Turn| Coil Coil")
> f = open("CalcSecondary4.txt")
> for line in f:
>     if expr.search(line):
>         print line
>
> but with this it is printing only those line in which helix, sheet, turn
> and coil are coming twice. Kindly suggest how should I modify it so that
> whatever secondary structure is coming more than or equal to two times it
> should write that as final secondary structure and if two secondary structure
> are coming two-two times in one line itself like:
>
> 4 ALA Helix Sheet Helix Sheet
>
> then it should write that as doubtful and rest it should write as error.
>
> Thanks,
>
>
> Dear all,
>
> I have a file like:
>
> 1 ALA Helix Sheet Helix Coil
> 2 ALA Coil Coil Coil Sheet
> 3 ALA Helix Sheet Coil Turn
>
> now what I want is that write a python program in which I will put the
> condition that in each line whatever secondary structure is coming more than
> or equal to two times it should write that as final secondary structure and
> if two secondary structure are coming two-two times in one line itself like:
>
> 4 ALA Helix Sheet Helix Sheet
>
> then it should write that as doubtful and rest it should write as error.
>
> Thanks,
--
Regards
Shashank Singh
Senior Undergraduate, Department of Computer Science and Engineering
Indian Institute of Technology Bombay
shashank.sunny.singh at gmail.com
http://www.cse.iitb.ac.in/~shashanksingh