Re: [Mailman-Developers] [GSoC14] Full Anonymization Project Idea

I understand that claiming to provide total privacy to any individual is vague and debatable in this case. If we apply the usual encryption, it becomes very easy to hack, so normal cryptography is ruled out. But what if we implement salting? We generate a salt when a user joins the list and store it in the database with the individual's password and other details (assuming the SQL database is secure and does not give access rights to the moderator or anyone else). When a user sends a mail, we store the salt used and the fake mail id generated in another database (under the same assumptions). When someone replies to that address, we look up the fake mail id and the corresponding salt and regenerate the address, at the same time applying the same fake mail id generation to the replying sender's mail id. We also need to filter all possible traces out of the headers.
All of this depends on who can access the database in question, and these processes would need to run someplace that is secure from middlemen or malware (the server, maybe). Or maybe the only way around is to trust the list admin, but then the scheme cannot serve the case where we want to keep this information out of the admin's hands: it only keeps users anonymous from each other, while the admin knows everything.
If I am still thinking about this the wrong way, please guide me on how to approach the problem.
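
For concreteness, the salting scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Mailman code: the two dictionaries stand in for the two "secure databases", the keyed hash (HMAC-SHA256) is one possible way to derive the fake mail id, and the names `subscribe`, `pseudonym_for`, `resolve`, and the domain `lists.example.org` are all hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stand-ins for the two databases in the proposal.
salts = {}       # real address -> per-subscriber salt
pseudonyms = {}  # fake local part -> real address


def subscribe(address: str) -> None:
    """Generate and store a random salt when a user joins the list."""
    salts[address.lower()] = secrets.token_bytes(16)


def pseudonym_for(address: str, list_domain: str = "lists.example.org") -> str:
    """Derive a stable fake address from the real address and its salt."""
    key = address.lower()
    digest = hmac.new(salts[key], key.encode("utf-8"), hashlib.sha256).hexdigest()
    local_part = "anon-" + digest[:16]
    # Remember the mapping so that replies can be routed back later.
    pseudonyms[local_part] = key
    return f"{local_part}@{list_domain}"


def resolve(fake_address: str) -> str:
    """Map a fake address back to the real one, e.g. to deliver a reply."""
    local_part = fake_address.split("@", 1)[0]
    return pseudonyms[local_part]


subscribe("alice@example.com")
fake = pseudonym_for("alice@example.com")
print(fake, "->", resolve(fake))
```

Note that whoever can read both tables can undo the mapping, which is exactly the trust question raised in the message.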

Rashi Karanpuria writes:
If I am still thinking it all the wrong way. Please guide me as to how do I approach the problem.
The problem with your thinking is that you're thinking that there's a technical solution to a social problem called "full anonymization".
I'm 99.44% sure that whatever you're thinking of doing will be a good solution to *somebody's* problem, so I wouldn't worry about that end of things.
What I *do* worry about is that (1) I don't have a use case in mind for all the gymnastics you propose to do, because (2) it seems to me that if the site admin is not trusted she can work around any of the measures you describe, and otherwise I don't see a need for more than keeping the list keys secret from everybody else (including list admins). (3) Encryption doesn't cover the whole attack surface, which includes various kinds of traffic analysis (trace headers give a lot of information about network location, timestamps may help provide geographical location, and of course message content and writing style can provide very strong clues).
You can say "I know that". The problem is that your users frequently will not, and may read more into "*full* anonymization" than can possibly be delivered. If we're going to deliver this feature as part of Mailman, it's really important that we be able to explain what use cases it's good for, and what it's not.
None of the use cases you've proposed so far are particularly appealing to me, but the one that comes closest is the "group therapy" application. So let's look at that use case and what its requirements are.
- Who can be trusted with the keys?
- Who needs to be anonymous?
- What are the social threats if anonymity is breached?
- What are the technical threats to anonymity (i.e., how can it be breached)?
I'm sure there are more questions needing answers, but that's a good place to start.
Regards,

On Sat 2015-02-21 08:49:49 -0500, "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
I think it's important to distinguish between attempts to anonymize the users of the mailing list and attempts to hide the content of the messages. It's also important to understand *from whom* we are protecting the respective pieces of sensitive information.
The mailing list server *must* know the addresses of the subscribed parties, in order to be able to send mail to them, for example.
The PSELS project [0] provides a mechanism for separation of the List Moderator (who manages subscriptions and removals from the list, potentially in a mostly-offline fashion) from the List Server (which manages the online operation -- forwarding and message distribution). In their model, the List Server still knows the subscribed addresses and their keys, and knows how to transform the messages such that recipients can read them, but it cannot read the messages itself. This doesn't hide from the List Server who is receiving the mail or who is sending it, but it does hide the contents.
Other projects might focus on stripping the metadata from message headers at the mailman installation -- this doesn't hide the information from the mailing list, but it might mean that recipients of messages from the list wouldn't know things like where on the network other senders were. It could even strip or replace the "From:" header so that each recipient wouldn't know who the others are. But is this useful? Surely to have a conversation on a mailing list (as we are here) it's useful to at least have persistent pseudonymous connections between messages, otherwise how do you know who's talking to whom?
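
As a rough illustration of that kind of header scrubbing, here is a minimal sketch using Python's standard `email` package. It is not Mailman's actual pipeline: the function name `scrub`, the choice of headers in `TRACE_HEADERS`, and the example addresses are assumptions made for the example, and a real deployment would need a much more careful list.

```python
from email import message_from_string
from email.message import Message

# Headers assumed (for illustration only) to leak the sender's identity
# or network location. "Date" is deliberately left alone here, although
# timestamps can leak information too.
TRACE_HEADERS = ("Received", "X-Originating-IP", "Return-Path",
                 "User-Agent", "X-Mailer")


def scrub(msg: Message, pseudonym: str) -> Message:
    """Drop trace headers and replace From: with a persistent pseudonym."""
    for name in TRACE_HEADERS:
        del msg[name]  # removes every occurrence; no error if absent
    del msg["From"]
    msg["From"] = pseudonym
    return msg


raw = (
    "Received: from host.example.net ([192.0.2.1])\n"
    "From: Alice <alice@example.com>\n"
    "To: list@lists.example.org\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(scrub(message_from_string(raw), "anon-1234@lists.example.org").as_string())
```

Using the same pseudonym for every message from one subscriber keeps the conversation followable, while the message body and writing style remain untouched and can still identify the author.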
This kind of threat analysis is critical to making any sort of useful proposal in this space. It doesn't have to be complete, and it can potentially change over time if you need it to, but please start with these questions so that we understand what problem you're looking to solve, and so that you can better evaluate whether a proposed set of changes actually addresses the identified problem.
--dkg

participants (3): Daniel Kahn Gillmor, Rashi Karanpuria, Stephen J. Turnbull