What characters should be allowed in listnames

I have just filed <https://gitlab.com/mailman/mailman/issues/311>, which is about Mailman core allowing lists to be created with a slash '/' in the name.
Core validates listnames by ensuring the fqdn_listname is a valid email address. This is too liberal. RFC 5321 allows many characters in the local part of a list name. We don't allow quite all of them, but we allow this set [-0-9a-z!#$%&'*+./=?@_`{}~].
Since list names form parts of a URI, both in Postorius/HyperKitty and in the Core's REST API, it is clear that characters that will cause problems there should not be allowed. These include [#%&/?] and maybe others. Additionally, I don't think we want @ in an email address local part, and + and = might cause problems with VERP. That whittles the set down to [-0-9a-z!$'*._`{}~], but I'm thinking of being even more conservative and going with just [-0-9a-z._].
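To make the difference between the two sets concrete, here is a minimal sketch that tests a few candidate names against the current liberal class and the proposed conservative class. The names LIBERAL, CONSERVATIVE and accepted_by are illustrative only and are not taken from Mailman's code.

```python
import re

# Character classes copied from the message above; the names are hypothetical.
LIBERAL = re.compile(r"[-0-9a-z!#$%&'*+./=?@_`{}~]+", re.IGNORECASE)
CONSERVATIVE = re.compile(r"[-0-9a-z._]+", re.IGNORECASE)


def accepted_by(pattern, listname):
    """Return True if every character of listname is in the pattern's class."""
    return pattern.fullmatch(listname) is not None


for name in ("my-list", "my/list", "my+list", "team.dev_1"):
    print(name, accepted_by(LIBERAL, name), accepted_by(CONSERVATIVE, name))
```

Names such as my/list and my+list pass the liberal class but fail the conservative one.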
I don't intend to change the email address validation except maybe to remove @, but the code is such that an address with multiple @ won't validate anyway.
I'd like feedback on this. What are your thoughts on what characters should be allowed in list names?
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan

On Feb 12, 2017, at 03:58 PM, Mark Sapiro wrote:
I suppose if we did continue to allow them, they would have to be escaped in the URL. I'm not sure how much that helps, or even whether it should be part of our decision to allow them or not.
I think it's entirely reasonable for Mailman to narrow the list of allowable characters in the local part of list names. We already make some opinionated decisions about allowable email addresses; for example, we support case-preserving, case-insensitive addresses so we treat bob@example.com and BOB@example.com as identical.
I'm amenable to the conservative set you propose (obviously, case insensitive), although I have some concerns about how dots in the local part would interfere with any List-ID operations. E.g. foo.bar@example.com becomes List-ID: foo.bar.example.com. As an identifier with comparison rules according to RFC 2919 I think it's fine (it just has to be unique). I'm not sure whether in practice it would cause problems with the core.
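One way to see the concern about dots is a small sketch that derives a List-ID the way the example above does, by joining the local part and the domain with a dot. The function list_id is hypothetical and only mirrors that example; it is not claimed to be how the core computes the value.

```python
def list_id(fqdn_listname):
    """Join local part and domain with a dot, as in the example above."""
    local_part, _, domain = fqdn_listname.partition("@")
    return f"{local_part}.{domain}"


a = list_id("foo.bar@example.com")
b = list_id("foo@bar.example.com")
print(a, b, a == b)
```

Under this derivation two different posting addresses yield the same identifier, so any uniqueness check would have to account for that once dots are allowed in the local part.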
The other question is whether we're unfairly closing the door on i18n list names. OTOH, we haven't yet had any requests for that afaict.
Certainly some narrowing is appropriate. We could just clamp it down as you suggest, understanding that there may already be lists in existence that use the more liberal character set, and acknowledging that we may want to relax the set based on future bug reports.
What about this: come up with an absolute black list set, e.g. the ones that will break Mailman. Come up with a second set of discouraged but allowed characters, and a third set which is the narrow list you propose. Then make the allowable set configurable, except that the black list characters are always disallowed. Now, that might be too complicated, so I'm also fine with making it narrow now, and letting the set relax based on user feedback.
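A rough sketch of that tiered idea, assuming the always-forbidden set is the URI-problem characters [#%&/?] named earlier in the thread and the narrow set is [-0-9a-z._]. FORBIDDEN, PREFERRED, classify and is_allowed are hypothetical names.

```python
import string

# Assumed tiers: the black list is the set of URI-problem characters from
# earlier in the thread; the narrow set is the conservative proposal.
FORBIDDEN = set("#%&/?")
PREFERRED = set(string.ascii_lowercase + string.digits + "-._")


def classify(listname):
    """Place a list name in one of three tiers (case-insensitive)."""
    chars = set(listname.lower())
    if not chars or chars & FORBIDDEN:
        return "forbidden"
    if chars <= PREFERRED:
        return "preferred"
    return "discouraged"


def is_allowed(listname, configured=PREFERRED):
    """Apply a configurable allowed set; black-listed characters always lose."""
    chars = set(listname.lower())
    return bool(chars) and not (chars & FORBIDDEN) and chars <= set(configured)
```

For example, is_allowed("my+list", configured=PREFERRED | {"+"}) succeeds because a site has opted in to +, while adding / to the configured set has no effect.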
Cheers, -Barry

On 02/12/2017 05:27 PM, Barry Warsaw wrote:
Thanks Barry. FWIW, MM 2.1 has an ACCEPTABLE_LISTNAME_CHARACTERS config setting which defaults to '[-+_.=a-z0-9]'. I don't really like the + and = in that list because of their possible interaction with VERP. I have a WIP MR at <https://gitlab.com/mailman/mailman/merge_requests/248> that allows only [-_.a-z0-9] (IGNORECASE) and has no config override.
The narrow, overridable config combined with a blacklist or some kind of limitation on the overrides would be the most flexible. I'll look at adding that to the MR. Basically, I'm thinking of a fixed list of allowed characters which is liberal, testing that first and if that passes, testing the config set.
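The two-step check described here might look roughly like the following. The fixed liberal class is an assumption (the current set minus [#%&/?@]), valid_listname is a hypothetical helper, and listname_chars stands in for the configurable class; none of this is taken from the merge request itself.

```python
import re

# Assumed hard limit: the current liberal set minus the characters that
# break URIs and minus '@'. Not copied from the actual merge request.
FIXED_LIBERAL = re.compile(r"[-0-9a-z!$'*+.=_`{}~]+", re.IGNORECASE)


def valid_listname(name, listname_chars=r"[-_.a-z0-9]"):
    """Test the fixed liberal class first, then the configurable class."""
    if FIXED_LIBERAL.fullmatch(name) is None:
        return False
    configured = re.compile(f"(?:{listname_chars})+", re.IGNORECASE)
    return configured.fullmatch(name) is not None


print(valid_listname("my-list"))                    # default class
print(valid_listname("my+list", r"[-_.+a-z0-9]"))   # site opts in to '+'
print(valid_listname("my/list", r"[-_./a-z0-9]"))   # '/' fails the fixed class
```

Because the fixed class is tested first, a configuration override can widen the default set but cannot reintroduce a character outside the hard limit.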
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan

Mark Sapiro writes:
> I'd like feedback on this. What are your thoughts on what characters should be allowed in list names?
Uh, RFC 6532 ....
Probably that can wait for when we actually support it :-), but while you're doing this we should (= I should when life gets sane ;-) make sure that whatever that restriction is, is encapsulated in one place.
Steve

On 02/14/2017 10:52 AM, Stephen J. Turnbull wrote:
That doesn't really address my question. That has to do with internationalized email addresses. Granted the listname must be a valid local part of an email address, but that doesn't mean every valid local part has to be a valid list name.
In particular, the issue that raises this question is a list name containing a slash. While 'my/list@example.com' is a legal email address, https://lists.example.com/mailman3/lists/my/list.example.com/ is not a URL which will work in Postorius.
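A short illustration of why the slash is a problem, using only the standard library. It splits the example URL's path into segments and shows what percent-escaping the list identifier would produce; the variable names are illustrative.

```python
from urllib.parse import quote, urlsplit

list_id = "my/list.example.com"

# Unescaped, the slash in the list name becomes a path separator.
raw_url = f"https://lists.example.com/mailman3/lists/{list_id}/"
segments = [s for s in urlsplit(raw_url).path.split("/") if s]
print(segments)   # ['mailman3', 'lists', 'my', 'list.example.com']

# Escaping keeps the identifier in a single path segment.
escaped = quote(list_id, safe="")
print(escaped)    # my%2Flist.example.com
```

Even with escaping, some web stacks decode %2F before routing, so escaping alone may not be enough to make such a name safe in a URL.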
I think I've done that in my current MR on this at <https://gitlab.com/mailman/mailman/merge_requests/248> which implements a [mailman] config setting described in schema.cfg as
The only questions are whether these are the right sets for the "outside this are not allowed" class and the default listname_chars class.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan

On February 15, 2017 8:03:33 AM GMT+05:30, Mark Sapiro <mark@msapiro.net> wrote:
My gut feeling is that allowing slashes is going to lead to confusing side effects with mail clients and even potential security bugs down the road, and we should just not do it unless there's a clear need to support slashes.
But that's really just a feeling based on exploit experience, where "be conservative in what you expect" is best practice. It doesn't speak to what the standards say or what's doable.
Terri

Mark Sapiro writes:
The problem I thought we might face is that internationalized mailboxes and domain names *are still ASCII*, which encodes Unicode.
OK, I looked it up, and I was almost certainly wrong. The relevant RFCs are actually 6531 (SMTPUTF8 extension), 5890 (IDNA), and 3492 (Punycode). IDNA allows "U-labels" (UTF-8), which we can gracefully extend to, and "A-labels" (encoded using Punycode, which uses the ASCII repertoire). AFAICT (haven't really looked carefully), Punycode uses only letters, digits, and the hyphen ("-").
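The claim about the Punycode repertoire can be checked quickly with Python's built-in codecs. Note that the standard library's "idna" codec implements the older IDNA 2003 rules rather than RFC 5890, so this is only an illustration of what an A-label looks like; the sample label is an arbitrary choice.

```python
import re

u_label = "b\u00fccher"                       # "bücher", a non-ASCII label

# Raw Punycode, and the A-label form with its "xn--" prefix.
punycode = u_label.encode("punycode").decode("ascii")
a_label = u_label.encode("idna").decode("ascii")
print(punycode)   # bcher-kva
print(a_label)    # xn--bcher-kva

# The encoded form uses only letters, digits, and the hyphen.
print(re.fullmatch(r"[-0-9a-z]+", a_label) is not None)   # True
```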
So I withdraw the comment.
Thanks for your work on this, Mark!
Steve

Hi Stephen, At 10:12 PM 2/18/2017, Stephen J. Turnbull wrote:
RFC 5890 can be used for the domain name part. The issue would be about what to do for the local part. If I recall correctly, the question was not discussed as part of RFC 6531. I suggest taking a look at RFC 7564.
Regards, -sm

participants (5): Barry Warsaw, Mark Sapiro, SM, Stephen J. Turnbull, Terri Oda