Pairwise frequency count from an incidence matrix of group membership

Jean-Michel Pichavant jeanmichel at sequans.com
Fri Apr 22 12:57:38 CEST 2011


Shafique, M. (UNU-MERIT) wrote:
> Hi,
> I have a number of different groups g1, g2, … g100 in my data. Each 
> group is comprised of a known but different set of members (m1, m2, 
> …m1000) from the population. The data has been organized in an 
> incidence matrix:
>     g1 g2 g3 g4 g5
> m1   1  1  1  0  1
> m2   1  0  0  1  0
> m3   0  1  1  0  0
> m4   1  1  0  1  1
> m5   0  0  1  1  0
>
> I need to count how many groups each possible pair of members shares 
> (i.e., both are members of). 
> I would prefer the result as a pairwise edgelist with weight/frequency 
> in a format like the following:
> m1, m1, 4
> m1, m2, 1
> m1, m3, 2
> m1, m4, 3
> m1, m5, 1
> m2, m2, 2
> ... and so on.
>
> I would highly appreciate it if anybody could suggest/share some 
> code/tool/module which could help do this.
>
> Best regards,
> Muhammad
>
Here are some clues: store each member's row as a list, then count the 
positions (groups) where both rows hold a 1.

m1 = [1,1,1,0,1]
m2 = [1,0,0,1,0]

def foo(list1, list2):
    # Count the groups (columns) in which both members have a 1.
    return sum(1 for a, b in zip(list1, list2) if a and b)

>>> foo(m1, m1)
4

>>> foo(m1, m2)
1

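To get from there to the full edgelist, one option is to apply the same 
count to every pair of members, including each member with itself. This 
is only a sketch: the names members, shared_groups and pairwise_edgelist 
are made up for illustration, and the data is the 5x5 sample from the 
question typed in by hand rather than read from a file.

```python
from itertools import combinations_with_replacement

# Sample incidence matrix from the question: one (name, row) entry per
# member, one column per group g1..g5. Hard-coded for illustration.
members = [
    ("m1", [1, 1, 1, 0, 1]),
    ("m2", [1, 0, 0, 1, 0]),
    ("m3", [0, 1, 1, 0, 0]),
    ("m4", [1, 1, 0, 1, 1]),
    ("m5", [0, 0, 1, 1, 0]),
]


def shared_groups(row_a, row_b):
    """Count the columns in which both rows hold a 1."""
    return sum(1 for a, b in zip(row_a, row_b) if a and b)


def pairwise_edgelist(rows):
    """Return (name_a, name_b, weight) for every unordered pair of members.

    Pairs of a member with itself are included, as in the requested output.
    """
    return [
        (name_a, name_b, shared_groups(row_a, row_b))
        for (name_a, row_a), (name_b, row_b)
        in combinations_with_replacement(rows, 2)
    ]


for name_a, name_b, weight in pairwise_edgelist(members):
    print("%s, %s, %d" % (name_a, name_b, weight))
```

This compares every pair, so for 1000 members it does about half a 
million row comparisons, which should still be manageable. If the matrix 
is already in a NumPy array, the same counts are the entries of the 
product of the matrix with its transpose.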

JM


