[Tutor] mapping problem
Srinivas Iyyer
srini_iyyer_bio at yahoo.com
Wed Feb 1 16:58:49 CET 2006
Dear group,
I am looking for a method to solve a problem where I
want to walk through a lineage of terms and group
them from right to left.
A snippet of the data is below. The terms are stored
in a tab-delimited file.
a b c d car
a b c f truck
a b c d van
a b c d SUV
a b c f 18-wheeler
a b j k boat
a b j a submarine
a b d a B-747
a b j c cargo-ship
a b j p passenger-cruise ship
a b a a bicycle
a b a b motorcycle
Now my goal is to enrich for members that have an
identical lineage but different leaves.
For example: a b c d - car, van, SUV. I have three terms
in this path and I am not happy with so few. I wish to
have more.
Then: a b c - car, van, truck, SUV and 18-wheeler
(automobiles that travel on road). I am happy with
this grouping, and I get more items if I walk up the
lineage to (a-b-c).
Thus, I want to do this enrichment for all 21K
lineage lines.
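
The walk described above (shorten the lineage from the right until
the group is large enough) could be sketched roughly as follows. This
is only an illustration under assumptions: the names index_prefixes
and enrich and the min_size threshold are hypothetical, and "happy
with this grouping" is taken to mean "at least min_size leaves".

```python
from collections import defaultdict

def index_prefixes(rows):
    """Map every lineage prefix to the leaves found beneath it."""
    leaves_under = defaultdict(list)
    for *path, leaf in rows:
        for depth in range(1, len(path) + 1):
            leaves_under[tuple(path[:depth])].append(leaf)
    return leaves_under

def enrich(path, leaves_under, min_size):
    """Shorten path from the right until it covers min_size leaves.

    min_size is an assumed threshold, not something the data defines.
    """
    path = tuple(path)
    while len(path) > 1 and len(leaves_under.get(path, [])) < min_size:
        path = path[:-1]
    return path, leaves_under.get(path, [])

# A few rows from the snippet, already split on tabs.
rows = [
    ("a", "b", "c", "d", "car"),
    ("a", "b", "c", "f", "truck"),
    ("a", "b", "c", "d", "van"),
    ("a", "b", "c", "d", "SUV"),
    ("a", "b", "c", "f", "18-wheeler"),
    ("a", "b", "j", "k", "boat"),
]
leaves_under = index_prefixes(rows)
best_path, members = enrich(("a", "b", "c", "d"), leaves_under, min_size=5)
print(best_path, members)
```

Indexing every prefix once means each later lookup is a dictionary
access, so the 21K lines are read only a single time.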
My question:
Is there a way to automate this?
My idea of doing this:
Since this is a tab-delimited file, I want to read a
line with, say, 5 tab-separated columns and search for
other lines with the same item in column 4 (because
leaf items in column 5 could be unique). If I find a
hit, I then check whether columns 3 and 2 are also
identical and, if so, create a list.
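
One way to express this idea without comparing every line against
every other line is to key a dictionary on all columns except the
last. The sketch below is a minimal illustration, not a tested
solution for the real file; the function name group_by_parent_path
is hypothetical, and it assumes the leaf is always the last column.

```python
from collections import defaultdict

def group_by_parent_path(lines):
    """Map each parent path (all columns but the last) to its leaves."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        *path, leaf = fields  # assumption: the leaf is the last column
        groups[tuple(path)].append(leaf)
    return groups

# A few tab-delimited lines taken from the snippet above.
sample = [
    "a\tb\tc\td\tcar",
    "a\tb\tc\tf\ttruck",
    "a\tb\tc\td\tvan",
    "a\tb\tc\td\tSUV",
    "a\tb\tj\tk\tboat",
]
groups = group_by_parent_path(sample)
print(groups[("a", "b", "c", "d")])
```

Lines with matching columns 1 to 4 land in the same list in a single
pass over the file, so no recursion is needed for this step.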
This approach seems recursive and costly in time and
resources, but I cannot think of an easier
solution.
Would you please suggest a nice and simple method to
solve this problem?
For people who are into bioinformatics (I know Danny
Yoo is a bioinformatician) the question is about GO
terms. I parsed the OBO file and laid out the term
lineages that constitute the OBO-DAG structure. I want
to enrich the terms to do an enrichment analysis for a
set of terms that I am interested in.
Thank you in advance.
cheers
Srini