[Tutor] mapping problem

Srinivas Iyyer srini_iyyer_bio at yahoo.com
Wed Feb 1 16:58:49 CET 2006


Dear group,
  I am trying to find a method to walk through a
lineage of terms and group them from right to left.

A snippet of the data is below. The terms are stored in
a tab-delimited file.

a	b	c	d	car
a	b	c	f	truck
a	b	c	d	van
a	b	c	d	SUV
a	b	c	f	18-wheeler
a	b	j	k	boat
a	b	j	a	submarine
a	b	d	a	B-747
a	b	j	c	cargo-ship
a	b	j	p	passenger-cruise ship
a	b	a	a	bicycle
a	b	a	b	motorcycle


Now my question is how to enrich members that have an
identical lineage but a different leaf.
For example: a b c d - car, van, SUV. I have only three
terms in this path and I am not happy with that. I wish
to have more.

Then: a b c - car, van, truck, SUV and 18-wheeler
(automobiles that travel on road). I am happy with
this grouping, and I enrich more items if I walk up
the lineage to (a-b-c).
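
To make the grouping concrete, here is a minimal sketch in Python that groups leaves by the first N lineage columns. The names ROWS and group_by_prefix are made up for illustration, and the rows are just the road-vehicle lines from the snippet above.

```python
from collections import defaultdict

# Road-vehicle rows from the snippet: four lineage columns, then the leaf.
ROWS = [
    ("a", "b", "c", "d", "car"),
    ("a", "b", "c", "f", "truck"),
    ("a", "b", "c", "d", "van"),
    ("a", "b", "c", "d", "SUV"),
    ("a", "b", "c", "f", "18-wheeler"),
]

def group_by_prefix(rows, depth):
    """Map the first `depth` lineage columns to the leaves found under them."""
    groups = defaultdict(list)
    for row in rows:
        *path, leaf = row
        groups[tuple(path[:depth])].append(leaf)
    return dict(groups)

full = group_by_prefix(ROWS, 4)     # separate groups for a-b-c-d and a-b-c-f
shorter = group_by_prefix(ROWS, 3)  # one merged group for a-b-c
```

Dropping the rightmost lineage column merges the two smaller groups into one larger group, which is the kind of enrichment described above.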


Thus, I want to try to enrich all 21K lines of
lineages.

My question:

Is there a way to automate this?

My idea of doing this:

Since this is a tab-delimited file, I want to read a line
with, say, 5 columns (4 tabs) and search for other lines
with the same item in column 4 (I skip column 5 because
leaf items could be unique). If I find a hit, I then check
whether columns 3 and 2 are also identical and, if so,
create a list.
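
The idea above might be sketched as a single pass over the file, using the lineage columns as a dictionary key instead of searching the file again for every line. This is only a sketch under assumptions: read_groups and key_columns are hypothetical names, the StringIO object stands in for the real file, and fields are stripped in case a term carries a stray space.

```python
import csv
import io
from collections import defaultdict

def read_groups(handle, key_columns=4):
    """Group leaf terms by their first `key_columns` columns in one pass."""
    groups = defaultdict(list)
    for fields in csv.reader(handle, delimiter="\t"):
        fields = [f.strip() for f in fields]  # tolerate stray spaces around terms
        if len(fields) <= key_columns:
            continue                          # skip blank or short lines
        groups[tuple(fields[:key_columns])].append(fields[-1])
    return dict(groups)

# Stand-in for the real file (assumption); a real run would pass an open
# file handle instead of this in-memory text.
sample = io.StringIO(
    "a\tb\tc\td\tcar\n"
    "a\tb\tc\tf\ttruck\n"
    "a\tb\tc\td\tvan\n"
    "a\tb \tj\tk\tboat\n"
)
groups = read_groups(sample)
```

Lines that agree on columns 1 to 4 end up in the same list, so no second search through the file is needed.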

Although this approach is recursive and consumes a lot of
time and resources, I cannot think of an easier
solution.
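
One possible way to avoid the repeated searching, sketched here under assumptions, is to index every prefix of every lineage once and then shorten a path from the right until its group is large enough. The names build_index, enrich and min_size are hypothetical, and the threshold of 4 is only an example value.

```python
from collections import defaultdict

def build_index(rows):
    """Index every prefix of every lineage to the leaves below it."""
    index = defaultdict(list)
    for *path, leaf in rows:
        for depth in range(1, len(path) + 1):
            index[tuple(path[:depth])].append(leaf)
    return index

def enrich(path, index, min_size):
    """Shorten `path` from the right until it covers at least `min_size` leaves."""
    path = tuple(path)
    while len(path) > 1 and len(index.get(path, [])) < min_size:
        path = path[:-1]
    return path, index.get(path, [])

rows = [
    ("a", "b", "c", "d", "car"),
    ("a", "b", "c", "f", "truck"),
    ("a", "b", "c", "d", "van"),
    ("a", "b", "c", "d", "SUV"),
    ("a", "b", "c", "f", "18-wheeler"),
]
index = build_index(rows)
# Example threshold (assumption): ask for at least 4 leaves per group.
best_path, leaves = enrich(("a", "b", "c", "d"), index, min_size=4)
```

Building the index touches each line once per lineage column, so the work grows with the number of lines rather than with the number of line pairs.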

Would you please suggest a nice and simple method to
solve this problem?

For people who are into bioinformatics (I know Danny
Yoo is a bioinformatician), the question is about GO
terms. I parsed an OBO file and laid out the term
lineages that constitute the OBO DAG structure. I want
to group the terms so that I can do an enrichment
analysis for a set of terms that I am interested in.

Thank you in advance.

cheers
Srini



