
Hi all, I noticed (from a DMARC mitigation utility that Lindsay extracted) that Mailman has its own approach to using the PSL. Of course, development must go on, and sometimes it is a waste of time to build super-duper scaffolding for a job that can be done in keeping with the KISS principle. At any rate, what is the future of DMARC lookups in Mailman?
- The specs say that "DMARC should be amended to use [a method better than PSL] as soon as it is generally available" [1]. I believe that sentence refers to RDAP, which was released more or less at the same time (March 2015) [2].
[1] https://tools.ietf.org/html/rfc7489#appendix-A.6
[2] https://datatracker.ietf.org/wg/weirds/documents/
- There are various Python packages for domain name splitting. They obviously use the PSL, but presumably would switch transparently to a better method should one become available. If Mailman used one such package, a practical advantage for users would be that the PSL needs updating in only one place, provided their other software used the same dependency. I found six packages.
tldextract [3] is the only one of them that caches a JSON object rather than the original textual representation of the list. It uses a frozenset. tld [4] and publicsuffixlist [5] also build a set. publicsuffix [6] and publicsuffix2 [7] build lists of nested dictionaries from all the labels. dnspy [8] builds a dictionary of FQDNs, somewhat like Mailman.
How does the time to build the structure compare with the time taken by DNS queries?
[3] https://pypi.python.org/pypi/tldextract
[4] https://pypi.python.org/pypi/tld
[5] https://pypi.python.org/pypi/publicsuffixlist
[6] https://pypi.python.org/pypi/publicsuffix
[7] https://pypi.python.org/pypi/publicsuffix2
[8] https://pypi.python.org/pypi/dnspy
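
To make the comparison between the set-based and the nested-dictionary structures a bit more concrete, here is a minimal sketch in plain Python. It is not taken from any of the six packages, nor from Mailman. The names SAMPLE_PSL, parse_rules, build_label_tree, organizational_domain and timed are made up for illustration, and the sample rules are a tiny hand-written excerpt in PSL text format. The lookup follows the usual "longest matching rule, exceptions win" reading of the PSL, and is only meant to show the shape of the job.

```python
import time

# Tiny hand-written excerpt in PSL text format (assumption: the real list
# is thousands of lines long and would be read from a file).
SAMPLE_PSL = """\
// comment lines start with two slashes
com
uk
co.uk
*.ck
!www.ck
"""


def parse_rules(text):
    """Return the rules as a frozenset, skipping comments and blank lines."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("//"):
            rules.add(line.split()[0].lower())
    return frozenset(rules)


def build_label_tree(rules):
    """Return nested dictionaries keyed by label, rightmost label first."""
    root = {}
    for rule in rules:
        node = root
        for label in reversed(rule.split(".")):
            node = node.setdefault(label, {})
        node[""] = True  # empty key marks the end of a rule
    return root


def organizational_domain(name, rules):
    """Return the public suffix plus one label, or None if there is none."""
    labels = name.lower().rstrip(".").split(".")
    suffix_len = 1  # implicit default rule: the last label is a suffix
    for i in range(len(labels)):  # longest candidate first
        candidate = ".".join(labels[i:])
        wildcard = ".".join(["*"] + labels[i + 1:])
        if "!" + candidate in rules:
            suffix_len = len(labels) - i - 1  # exception rule wins
            break
        if candidate in rules or wildcard in rules:
            suffix_len = len(labels) - i
            break
    if len(labels) <= suffix_len:
        return None  # the name is itself a public suffix
    return ".".join(labels[-(suffix_len + 1):])


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    rules, t_set = timed(parse_rules, SAMPLE_PSL)
    tree, t_tree = timed(build_label_tree, rules)
    print("set build:  %.6f s" % t_set)
    print("tree build: %.6f s" % t_tree)
    print(organizational_domain("mail.example.co.uk", rules))
```

Feeding the full list to timed() instead of the sample would give a rough figure for the build time, to be set against the round-trip time of a _dmarc TXT query on the same machine. I have not measured either here.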
- Debian distributes a publicsuffix package which ships a textual version of the list. Since stretch, it also ships a "dafsa" (deterministic acyclic finite state automaton) version. Nowadays, most C implementations (Firefox, Chromium) use dafsa. They build the structure using offsets rather than pointers, so that the blob can be defined in a source file as a literal static array of chars, in order to minimize loading time. That strategy works well as long as the relevant package is upgraded more frequently than the PSL. Otherwise, as with libpsl, one ends up using obsolete data.
Surprisingly, the publicsuffix package itself is not upgraded as frequently as the PSL. This bug [9] is what prompted me to write this message. I guess you, as Mailman developers, have pondered this subject and I'd be interested to know what you think.
[9] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=879008
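
As an aside on the "offsets rather than pointers" point above, the following is a rough sketch of the idea in Python. It is a plain character trie flattened into a single array of integers, not a real DAFSA (there is no suffix merging), and the layout is invented here; it is not the format used by Firefox, Chromium or libpsl. The names build_trie, flatten and contains are hypothetical. The point is only that, once every child link is an index into the same array, the whole structure can be dumped as one literal and used without any build step at load time.

```python
from array import array


def build_trie(words):
    """Build an ordinary pointer-based trie as nested dictionaries."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[None] = True  # end-of-word marker
    return root


def flatten(node, blob):
    """Append node and its descendants to blob; return the node's offset.

    Invented layout per node:
    [terminal flag, child count, (char code, child offset) pairs...]
    """
    keys = sorted(k for k in node if k is not None)
    offset = len(blob)
    blob.append(1 if None in node else 0)
    blob.append(len(keys))
    slots = len(blob)
    blob.extend([0, 0] * len(keys))  # reserve room for the pairs
    for i, ch in enumerate(keys):
        blob[slots + 2 * i] = ord(ch)
        blob[slots + 2 * i + 1] = flatten(node[ch], blob)
    return offset


def contains(blob, word, offset=0):
    """Walk the flat array by offsets; no pointers or dictionaries needed."""
    for ch in word:
        count = blob[offset + 1]
        for i in range(count):
            pair = offset + 2 + 2 * i
            if blob[pair] == ord(ch):
                offset = blob[pair + 1]
                break
        else:
            return False  # no child for this character
    return blob[offset] == 1


if __name__ == "__main__":
    blob = []
    flatten(build_trie(["com", "uk", "co.uk"]), blob)
    table = array("I", blob)  # compact, position-independent table
    print(len(table), "integers")
    print(contains(table, "co.uk"), contains(table, "co"))
```

The staleness problem is visible here too: the table is only as fresh as the moment flatten() was run, which in the C case means the moment the package was built.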