Re: [Mailman-Developers] [Bug 985149] [NEW] Add List-Post value to permalink hash input

Barry,
I definitely agree that "Now's the time".
I don't understand the proposal. By "added to this hash", do you mean "included in the set of elements that get hashed" or do you mean "appended to the hash value"?
Presumably, the sole purpose in publishing an algorithm to create the hash is to make it possible for two handlers to independently develop the same hash given only the message. Otherwise, a "secret" method could be used to assign a unique identifier to the message.
In either case, this suggested change renews my argument that the resulting hash should be tagged, visibly, with a "protocol revision designator". Omitting that designation transforms the chosen calculation method into a "secret".
Richard
On Apr 18, 2012, at 1:53 PM, Barry Warsaw wrote:
Public bug reported:
Currently, we define the X-Message-ID-Hash as the base32 encoding of the sha1 hash of the Message-ID content (sans angle brackets as defined in RFC 5322). The suggestion is made that the List-Post value should be added to this hash so as to be able to distinguish cross-posted messages.
This should be fine, and pretty easy. My only concern is that the header name is now a misnomer.
I wonder, is it worth coming up with a better header? Now's the time to do it since it's likely that there are almost no consumers of this standard.
What about Permalink-Hash?
** Affects: mailman
   Importance: High
   Status: Confirmed
** Tags: mailman3

On Apr 18, 2012, at 07:22 PM, Richard Wackerbarth wrote:
I don't understand the proposal. By "added to this hash", do you mean "included in the set of elements that get hashed" or do you mean "appended to the hash value"?
I mean "append" (or prepend, we have to decide ;) to the hash input.
Specifically, let's say you have this message snippet:
List-Post: foo.example.com
Message-ID: <bar>
The hash under the current algorithm is:
>>> from base64 import b32encode
>>> from hashlib import sha1
>>> s = sha1('bar')
>>> b32encode(s.digest())
'MLG3OAQP7EQOLKTEFQ6UAZUVBXI7AH2N'
but after the elaboration suggested in this bug would be:
>>> s = sha1('bar')
>>> s.update('foo.example.com')
>>> b32encode(s.digest())
'P67IMDMX6CRPP3TXX26OMJEOX2DDK6WN'
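For readers on Python 3, where sha1() only accepts bytes, the two variants above might be sketched as follows. This is an illustration only: the function names are hypothetical, and the UTF-8 encoding is an assumption, since the thread does not say how header values should be encoded before hashing.

```python
from base64 import b32encode
from hashlib import sha1


def message_id_hash(message_id: str) -> str:
    """Current scheme: base32 of the sha1 of the Message-ID (no angle brackets)."""
    # UTF-8 is an assumption of this sketch; the thread names no encoding.
    digest = sha1(message_id.encode("utf-8")).digest()
    return b32encode(digest).decode("ascii")


def permalink_hash(message_id: str, list_post: str) -> str:
    """Proposed scheme: the List-Post value is appended to the hash input."""
    h = sha1(message_id.encode("utf-8"))
    h.update(list_post.encode("utf-8"))
    return b32encode(h.digest()).decode("ascii")


print(message_id_hash("bar"))
print(permalink_hash("bar", "foo.example.com"))
```

Because update() simply extends the hash input, the second function is equivalent to hashing the concatenated string "barfoo.example.com".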
Presumably, the sole purpose in publishing an algorithm to create the hash is to make it possible for two handlers to independently develop the same hash given only the message. Otherwise, a "secret" method could be used to assign a unique identifier to the message.
Exactly.
In either case, this suggested change renews my argument that the resulting hash should be tagged, visibly, with a "protocol revision designator". Omitting that designation transforms the chosen calculation method into a "secret".
The way to do that is probably to use a parameter on the header, e.g.
Permalink-Hash: P67IMDMX6CRPP3TXX26OMJEOX2DDK6WN; version=1
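As a rough sketch of how such a versioned header could be written and read back with Python's standard email package: the Permalink-Hash name and the version parameter are only proposals from this thread, not an agreed standard, and the split on ";" is just one simple way to recover the hash itself.

```python
from email.message import Message

msg = Message()
# add_header() renders keyword arguments as header parameters.
# Note that it quotes the value, producing version="1" rather than version=1.
msg.add_header("Permalink-Hash", "P67IMDMX6CRPP3TXX26OMJEOX2DDK6WN", version="1")

raw = msg["Permalink-Hash"]
# Everything before the first ";" is the hash itself.
hash_value = raw.split(";", 1)[0].strip()
# get_param() can read parameters from any header, not just Content-Type.
version = msg.get_param("version", header="Permalink-Hash")

print(raw)
print(hash_value, version)
```

A consumer that sees an unknown version could then refuse to recompute or compare the hash, instead of silently using the wrong algorithm.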

On Apr 18, 2012, at 02:22 PM, Richard Wackerbarth wrote:
I definitely agree that "Now's the time".
Full response in the bug, but tl;dr:
- Proposal is to append the List-Post value as input to the hash, after the Message-ID value (sans angle brackets).
- Add version=1 as a parameter to the header value, whatever we decide that will be (assuming we all agree that with this elaboration X-Message-ID-Hash is a misnomer).
https://bugs.launchpad.net/mailman/+bug/985149
Cheers, -Barry

On Thu, Apr 19, 2012 at 5:03 AM, Barry Warsaw <barry@list.org> wrote:
- Proposal is to append the List-Post value as input to the hash, after the Message-ID value (sans angle brackets).
First, List-POST, not List-ID? List-Post is not permanent!
Second, that order is wrong IMHO; the idea of the hash is to identify the message in a fixed-length format. If you want to qualify it with list information, why not add the list identifier to the *output* of the hash? Now you have a well-defined[1] format that (1) allows you to distinguish cross-posted instances of the same message *and* (2) identify cross-posted instances of the same message, depending on your application.
Yoroshiku, Steve
[1] I haven't read the List-ID RFC recently, but I think its format is quite restricted and likely to be of reasonable length. I don't see why Mailman can't require a List-ID for every list.
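The alternative Steve describes above, hashing only the Message-ID and qualifying the output with the list identifier, might look roughly like this. The function name and the "@" separator are made up for this sketch, and UTF-8 encoding is assumed; the thread does not settle on a concrete format.

```python
from base64 import b32encode
from hashlib import sha1


def qualified_permalink(message_id: str, list_id: str) -> str:
    """Hash only the Message-ID, then qualify the result with the List-ID."""
    digest = sha1(message_id.encode("utf-8")).digest()
    message_hash = b32encode(digest).decode("ascii")
    # The "@" separator is hypothetical, chosen only for illustration.
    return f"{message_hash}@{list_id}"


a = qualified_permalink("bar", "foo.example.com")
b = qualified_permalink("bar", "baz.example.org")

# (1) The full identifiers differ, so cross-posted copies can be told apart.
print(a != b)
# (2) The hash parts match, so the copies are recognizably the same message.
print(a.split("@", 1)[0] == b.split("@", 1)[0])
```

Both checks print True. With the list value mixed into the hash input instead, only the first property would hold.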

At 13:03 18-04-2012, Barry Warsaw wrote:
The List-ID: can be assumed to be unique across different mailing lists. There's a corner case though.
Regards, -sm
participants (5)
- Barry Warsaw
- Barry Warsaw
- Richard Wackerbarth
- SM
- Stephen J. Turnbull