[Mailman-Developers] [GSoC14] Full Anonymization Project Idea

Wed Mar 5 05:27:31 CET 2014

In general, what is missing from these "anonymization" proposals are
use cases: user stories that show the reasons for anonymity and pin
down what "anonymity" means (for example, should repeated posts from a
given subscriber have the same "From" field or not?).  For instance, the
following organizations want a "fully anonymized" list:

1.  An Alcoholics Anonymous meeting.
2.  A therapy group for battered wives led by a professional therapist
    (whose identity is known to all, and who knows all realspace
    identities -- but maybe can't match them to list identities).
3.  A corporate whistleblower/suggestion box.
4.  A terrorist cell.  (I'm not suggesting we should *care* about
    serving these people well, and maybe we should try *not* to serve
    them -- it's an intellectual exercise.)
5.  A tax evaders users' group.  (ditto)

How do their needs differ?  How are they similar?  How well does your
proposal serve their needs?  I'm not too serious about that specific
list; individual students may or may not have experience with or
knowledge of those use cases.  But I would strongly prefer to work with a
student who thinks about these issues *explicitly* and *concretely* in
terms of use cases.  In particular, it may seem obvious that we can't
protect the subscriber database against the site admin/root, but then
we have to give up on use case 3 above.  Or do we?

These are hard, *hard*, HARD questions.  Even Bruce Schneier (if you
don't know who he is, find out!) might not get the answers at first
try.  I don't ask you to get the hard ones at all!  But sometimes the
answer is obvious from just asking the question (use case 3 vs root
access), so you'd best ask some of those easy ones. :-)

About this specific proposal:

Kshitij Gupta writes:

 > As I understand, we can do this in the following ways:
 > 1. For each subscriber on the mailing list generate a random encryption and
 > decryption key, which will be stored in the database.

If the keys never leave the database, why not use symmetric
encryption?  Why do different subscribers need different keys?

If they do leave the database, how are they distributed?  How is the
distribution protected from the standard attacks (eg, man in the
middle)?

 > 2. Every time a user sends mail we can encrypt the email to a hash which
 > will then be used as the pseudo id for the user. To do this we can either
 > use salt (as in time) to ensure a new email id is generated every time

I don't understand why you would ever want this, let alone why there
is a use case common enough to be worth implementing in Mailman.

That doesn't mean there isn't any, but please explain.
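
To make the choice in item 2 concrete, here is a minimal sketch of
both variants.  It is not Mailman code: SITE_KEY and both function
names are made up for illustration, a keyed HMAC stands in for the
unspecified "encrypt to a hash" step, and a random salt stands in for
the time-based one in the proposal.

```python
import hashlib
import hmac
import os

# Hypothetical site-wide secret, assumed to be known only to the list server.
SITE_KEY = b"example-site-secret"


def stable_pseudonym(address: str) -> str:
    """Unsalted variant: one address always maps to the same pseudo-id."""
    digest = hmac.new(SITE_KEY, address.lower().encode("utf-8"), hashlib.sha256)
    return "anon-" + digest.hexdigest()[:12]


def per_post_pseudonym(address: str) -> str:
    """Salted variant: a fresh random salt yields a new pseudo-id per post."""
    salt = os.urandom(16)
    data = salt + address.lower().encode("utf-8")
    digest = hmac.new(SITE_KEY, data, hashlib.sha256)
    return "anon-" + digest.hexdigest()[:12]
```

With the stable variant readers can follow one poster across a
thread; with the salted variant they cannot, and neither can the
server unless it records the salt somewhere.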

 > or without salt which equivalently fixes a single id for the user.
 > 3. From the email we can clean up headers, converting the user's timezone to
 > a standard UTC timezone.

You also probably need to handle Message-ID specially.
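
A rough sketch of that header cleanup, using only the Python standard
library "email" package.  The function name, the default list domain,
and the set of headers to drop are assumptions chosen for
illustration, not an existing Mailman interface.  Message-ID is
replaced because the original often embeds the sender's host name.

```python
from datetime import timezone
from email import message_from_string
from email.utils import format_datetime, make_msgid, parsedate_to_datetime

# Hypothetical selection of headers that can reveal the sender's route or host.
REVEALING_HEADERS = ("Received", "X-Originating-IP", "User-Agent", "X-Mailer")


def scrub_headers(raw: str, list_domain: str = "lists.example.org"):
    """Return a parsed message with Date in UTC and a list-generated Message-ID."""
    msg = message_from_string(raw)

    if msg["Date"]:
        sent = parsedate_to_datetime(msg["Date"])
        if sent.tzinfo is None:
            # A "-0000" zone parses as naive; treat it as UTC already.
            sent = sent.replace(tzinfo=timezone.utc)
        utc_date = format_datetime(sent.astimezone(timezone.utc))
        del msg["Date"]
        msg["Date"] = utc_date

    # Replace the sender's Message-ID with one minted under the list's domain.
    del msg["Message-ID"]
    msg["Message-ID"] = make_msgid(domain=list_domain)

    for name in REVEALING_HEADERS:
        del msg[name]  # removes every occurrence; no error if absent
    return msg
```

Threading is not handled here: In-Reply-To and References in later
replies would still need a mapping from original to replacement IDs.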

 > 4. We can also hash the user's original email id and append it as a
 > signature to sign the mail, ensuring the authenticity of mail in a
 > conversation.

That's not how digital signatures are done, and only those with access
to a decryption key can check authenticity.
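
To illustrate the objection: a bare hash of the address is identical
for every message and can be computed by anyone who knows the
address, so it authenticates nothing.  The sketch below contrasts it
with a keyed tag over sender and content.  HMAC is a message
authentication code, not a public-key digital signature (the standard
library has none), and all names here are hypothetical.

```python
import hashlib
import hmac


def address_hash(address: str) -> str:
    """What item 4 describes: a plain hash of the sender's address."""
    return hashlib.sha256(address.encode("utf-8")).hexdigest()


def mac_tag(key: bytes, address: str, body: str) -> str:
    """A keyed tag that depends on both the sender and the message body."""
    data = address.encode("utf-8") + b"\0" + body.encode("utf-8")
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key: bytes, address: str, body: str, tag: str) -> bool:
    """Checking the tag requires the same secret key used to create it."""
    return hmac.compare_digest(mac_tag(key, address, body), tag)
```

Note that verify() needs the key, which is the second half of the
objection: ordinary subscribers could not check such a tag
themselves.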

 > 5. For replies, the person replying can respond to the message, the email
 > address will then be decrypted by matching against the list of all
 > decryption keys and matching the digest of the mail id for additional
 > security and forwarding it to the intended user.

I'm not sure I understand the "additional security" part.  In any
case, if you have a "digest", why not use that as a unique key into
the user database, so that the actual decryption becomes an
authentication, and only needs to be done once?
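
A minimal sketch of what I mean, with an in-memory dict standing in
for the user database.  SITE_KEY, digest_of, and SubscriberIndex are
invented for illustration, and the final comparison is only a
stand-in for the one decryption that would confirm the match.

```python
import hashlib
import hmac

# Hypothetical site-wide secret, assumed to be known only to the list server.
SITE_KEY = b"example-site-secret"


def digest_of(address: str) -> str:
    """Keyed digest of an address, used as the lookup key."""
    data = address.lower().encode("utf-8")
    return hmac.new(SITE_KEY, data, hashlib.sha256).hexdigest()


class SubscriberIndex:
    """Maps digest -> address so a reply needs one lookup, not a scan of all keys."""

    def __init__(self):
        self._by_digest = {}

    def subscribe(self, address: str) -> str:
        digest = digest_of(address)
        self._by_digest[digest] = address
        return digest

    def resolve(self, digest: str):
        """Return the real address for a digest, or None if unknown."""
        address = self._by_digest.get(digest)
        if address is None:
            return None
        # Single confirmation step for the one candidate record.
        if not hmac.compare_digest(digest_of(address), digest):
            return None
        return address
```

Lookup cost no longer grows with the number of subscribers, which is
the point of keying on the digest.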

 > The above steps (in my understanding):
 > 1. Will allow users to anonymously post to mailing lists.

Except that the site admin knows where to find each user.  The site
admin had better be the only entity with such access.

 > 2. Ensure nobody can pretend to be someone else in a thread via the
 > personal salt.

But how about spoofing subscriptions?  Do we care about that?  What if
a user happens to know the address of another user, and spoofs that?

 > 3. Allow users to communicate in threads and reply to each other.
 > 4. Use a constant space for users in the database, at the cost of matching
 > against multiple decryption keys and then checking against the hashed email.

Is constant space really an issue?

 > Look forward to some feedback and hope to contribute to the mailman
 > community.
 > 
 > [1]- https://code.launchpad.net/apparmor-profile-tools , however the code
 > was recently merged into the branch upstream at:
 > https://code.launchpad.net/apparmor where development continues.
 > 
 > Regards,
 > Kshitij Gupta