Pairwise count of frequency from an incidence matrix of group membership
__peter__ at web.de
Wed Apr 20 10:30:28 CEST 2011
Shafique, M. (UNU-MERIT) wrote:
> I have a number of different groups g1, g2, … g100 in my data. Each group
> is comprised of a known but different set of members from the population
> m1, m2, …m1000. The data has been organized in an incidence matrix:
> I need to count how many groups each possible pair of members share (i.e.,
> both are members of).
> I shall prefer the result in a pairwise edgelist with weight/frequency in
> a format like the following:
> m1, m1, 4
> m1, m2, 1
> m1, m3, 2
> m1, m4, 3
> m1, m5, 1
> m2, m2, 2
> ... and so on.
> I shall highly appreciate if anybody could suggest/share some
> code/tool/module which could help do this.
Homework? What have you tried?
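One strategy is to create a list of sets, one per member, each containing the
groups that member belongs to, taken from the rows of the matrix:
matrix = [
    [1, 1, 1, 0, 1],  # m1
    [1, 0, 0, 1, 0],  # m2
]  # remaining rows not shown
sets = [  # zero-based group indices, one set per member
    {0, 1, 2, 4},  # m1
    {0, 3},        # m2
]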
The enumerate() builtin may help you with the conversion. You can then find
the shared groups with set arithmetic:
sets[0] & sets[1]  # groups shared by m1 and m2
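
Putting the hints above together, here is a minimal sketch rather than a
definitive solution. It assumes that rows of the matrix are members and columns
are groups, and it uses only the two rows shown above. The helper names
membership_sets and pairwise_counts are made up for illustration;
itertools.combinations_with_replacement is used so that each member is also
paired with itself, as in the requested "m1, m1, 4" line.

```python
from itertools import combinations_with_replacement

# Assumption: rows are members, columns are groups.
# Only the two rows quoted in the post are used here.
matrix = [
    [1, 1, 1, 0, 1],  # m1
    [1, 0, 0, 1, 0],  # m2
]


def membership_sets(matrix):
    """Return one set of zero-based group indices per member (row)."""
    return [{g for g, flag in enumerate(row) if flag} for row in matrix]


def pairwise_counts(sets):
    """Yield (i, j, n) for all i <= j, where n is the number of shared groups."""
    for i, j in combinations_with_replacement(range(len(sets)), 2):
        yield i, j, len(sets[i] & sets[j])


sets = membership_sets(matrix)

# Build the edgelist with one-based member labels m1, m2, ...
edges = [
    ("m%d" % (i + 1), "m%d" % (j + 1), n)
    for i, j, n in pairwise_counts(sets)
]

for a, b, n in edges:
    print("%s, %s, %d" % (a, b, n))
```

With 1000 members this loop visits roughly half a million pairs, which should
still be manageable in plain Python. Pairs that share no group are reported
with a count of 0; filter on n if you only want actual edges.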