[Tutor] How to create a dictionary for count elements
Peter Otten
__peter__ at web.de
Wed Jun 4 12:53:16 CEST 2014
jarod_v6 at libero.it wrote:
> Dear all, thanks for your suggestions!
> Thanks to them I created this structure:
>
> with open("prova.csv") as p:
>     for i in p:
>         lines = i.rstrip("\n").split("\t")
>         print lines
>
> ['programs ', 'sample', 'gene', 'values']
> ['program1', 'sample1', 'TP53', '2']
> ['program1', 'sample1', 'TP53', '3']
> ['program1', 'sample2', 'PRNP', '4']
> ['program1', 'sample2', 'ATF3', '3']
> ['program2', 'sample1', 'TP53', '2']
> ['program2', 'sample1', 'PRNP', '5']
> ['program2', 'sample2', 'TRIM32', '4']
> ['program2', 'sample2', 'TLK1', '4']
>
> In [4]: with open("prova.csv") as p:
>    ...:     for i in p:
>    ...:         lines = i.rstrip("\n").split("\t")
>    ...:         line = (lines[0], lines[1])
>    ...:         diz.setdefault(line, set()).add(lines[2])
>    ...:
>
> In [5]: diz
> Out[5]:
> {('program1', 'sample1'): {'TP53'},
> ('program1', 'sample2'): {'ATF3', 'PRNP'},
> ('program2', 'sample1'): {'PRNP', 'TP53'},
> ('program2', 'sample2'): {'TLK1', 'TRIM32'},
> ('programs ', 'sample'): {'gene'}}
>
>
> So what I want to do is intersect the gene sets stored under the keys,
> for example:
>
> In [14]: s = diz[('program2', 'sample1')]
>
> In [15]: s
> Out[15]: {'PRNP', 'TP53'}
>
> In [16]: a
> Out[16]: {'ATF3', 'PRNP'}
>
> In [17]: s.intersection(a)
> Out[17]: {'PRNP'}
>
> How can I intersect all the entries of my dictionary, e.g. ('program1',
> 'sample1') vs ('program1', 'sample2') and so on?
> I want to count how many genes they have in common.
> Thanks in advance for your help!
For that you have to map genes to program/sample pairs -- or just count them
as already suggested:
>>> import csv
>>> from collections import Counter
>>> with open("prova.csv") as f:
...     rows = csv.DictReader(f, delimiter="\t")
...     freq = Counter(row["gene"] for row in rows)
...
>>> freq.most_common(2)
[('TP53', 3), ('PRNP', 2)]
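
The other option, mapping genes to program/sample pairs, could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper names load_genes(), invert() and common_genes() are made up for this example, the rows are the ones quoted above, pasted into a string and read through io.StringIO instead of opening prova.csv, and the header is skipped by position because its first column is spelled 'programs ' with a trailing space.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Rows copied from the quoted prova.csv output (tab-separated).
# Inlined here only so that the sketch runs without the file.
SAMPLE = (
    "programs \tsample\tgene\tvalues\n"
    "program1\tsample1\tTP53\t2\n"
    "program1\tsample1\tTP53\t3\n"
    "program1\tsample2\tPRNP\t4\n"
    "program1\tsample2\tATF3\t3\n"
    "program2\tsample1\tTP53\t2\n"
    "program2\tsample1\tPRNP\t5\n"
    "program2\tsample2\tTRIM32\t4\n"
    "program2\tsample2\tTLK1\t4\n"
)


def load_genes(f):
    """Return {(program, sample): set of genes}, skipping the header row."""
    reader = csv.reader(f, delimiter="\t")
    next(reader, None)  # drop the header so it does not become a key
    genes = defaultdict(set)
    for program, sample, gene, _value in reader:
        genes[program, sample].add(gene)
    return genes


def invert(genes):
    """Map every gene to the set of (program, sample) pairs it occurs in."""
    pairs = defaultdict(set)
    for key, names in genes.items():
        for gene in names:
            pairs[gene].add(key)
    return pairs


def common_genes(genes):
    """Intersect the gene sets of every two distinct (program, sample) keys."""
    return {
        (a, b): genes[a] & genes[b]
        for a, b in combinations(sorted(genes), 2)
    }


genes = load_genes(io.StringIO(SAMPLE))
pairs = invert(genes)
common = common_genes(genes)

# Print only the key combinations that actually share genes.
for (a, b), shared in sorted(common.items()):
    if shared:
        print(a, b, sorted(shared), len(shared))
```

With the real file you would pass the open file object to load_genes() instead of the StringIO. Note that the Counter above counts rows, so TP53 scores 3 because it is listed twice for program1/sample1, while len(pairs["TP53"]) counts distinct program/sample pairs and gives 2.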