Re: [Mailman-Developers] Missing footers with latest CVS
"Dan" == Dan Mick <dmick@utopia.west.sun.com> writes:
Dan> Well, so, one of them has no charset expressed at all that I
Dan> can see.
That means their charset is us-ascii. Is the list set to some other language? Could you please post the configuration of the list, and an example message without footer that was sent to the list?
Basically, we need to deal with the case where a list is configured for something like iso-8859-2, but a user sends a message in iso-8859-1, or utf-8, etc. In these cases, we can't just tack the footer on -- we'll get a garbage message! We have to avoid adding a footer if the charsets mismatch; no other way about it.
Why a garbage message? Why not just a (potentially) garbage footer?
Dan> Not really, if appropriate workaround is "ignore the incoming
Dan> charset and add this footer unconditionally please".
But this is the worst thing you can do. What happens when I post a message in UTF-8 and then a Japanese ISO-2022-JP footer gets tacked on? Not good.
I'd be more concerned about what happened to the message, since it's apparently sent in a language that can't be understood by its audience.
There's something about the fullness of charset processing that I don't grok. I think it has to do with design. Are there design notes somewhere?
"Dan" == Dan Mick <dmick@utopia.west.sun.com> writes:
Ben> Basically, we need to deal with the case where a list is
Ben> configured for something like iso-8859-2, but a user sends a
Ben> message in iso-8859-1, or utf-8, etc. In these cases, we
Ben> can't just tack the footer on -- we'll get a garbage message!
Ben> We have to avoid adding a footer if the charsets mismatch; no
Ben> other way about it.
Dan> Why a garbage message? Why not just a (potentially) garbage
Dan> footer?
Here's an example.
My Japanese terminal accepts EUC-JP and ISO-2022-JP only. If I displayed a Japanese ISO-2022-JP message with an illegal ISO-8859-1 footer on it, not only would it be a garbage footer, but any further output to the terminal AFTER the footer would be complete garbage, because the illegal 8-bit characters would "shift" my terminal into a special Japanese-only mode.
Basically, illegal footers can be worse than just illegal -- they can render the reader's terminal completely useless, requiring a total restart. This is not acceptable.
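To make the failure concrete, here is a minimal Python sketch (not Mailman's actual code). It shows that appending ISO-8859-1 footer bytes to an ISO-2022-JP body gives a byte stream that no longer decodes as ISO-2022-JP, and it applies the skip-on-mismatch rule discussed in this thread. The helper names `charsets_match`, `decodes_cleanly`, and `add_footer` and the sample strings are made up for illustration; treating a missing charset as us-ascii follows the remark earlier in the thread.

```python
import codecs


def charsets_match(message_charset, list_charset):
    """Compare two charset labels after normalizing aliases (latin-1 == iso-8859-1)."""
    # A message with no declared charset is treated as us-ascii.
    msg_name = codecs.lookup(message_charset or "us-ascii").name
    list_name = codecs.lookup(list_charset).name
    return msg_name == list_name


def decodes_cleanly(data, charset):
    """Return True if the bytes are valid in the given charset."""
    try:
        data.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


def add_footer(body, message_charset, footer_text, list_charset):
    """Append the footer only when both sides use the same charset."""
    if not charsets_match(message_charset, list_charset):
        return body  # mismatch: leave the message untouched
    return body + footer_text.encode(list_charset)


# Hypothetical sample data: a Japanese body and a Latin-1 footer.
jp_body = "こんにちは\n".encode("iso-2022-jp")
latin1_footer = "Café list footer\n".encode("iso-8859-1")

# Naive concatenation: the 8-bit footer byte is illegal in 7-bit ISO-2022-JP.
naive = jp_body + latin1_footer
print(decodes_cleanly(jp_body, "iso-2022-jp"))
print(decodes_cleanly(naive, "iso-2022-jp"))
```

A strict decoder rejects the mixed stream outright; a terminal has no such safety net and simply misinterprets the bytes that follow.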
Dan> I'd be more concerned about what happened to the message,
Dan> since it's apparently sent in a language that can't be
Dan> understood by its audience.
Why? You can set a list's default charset to Japanese, but often get messages in English, Chinese, or Korean. Adding a Japanese footer to these unconditionally without making the whole thing a MIME message with separate parts with their own charsets would break everyone's terminals.
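The MIME alternative mentioned above can be sketched with Python's standard `email` package. This is only an illustration of the idea, not how Mailman implements it: the function name `wrap_with_footer` and the sample texts are assumptions, and a real implementation would also have to carry over the original headers and handle messages that are already multipart.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def wrap_with_footer(body_text, body_charset, footer_text, footer_charset):
    """Build a multipart/mixed message whose parts each declare their own charset."""
    outer = MIMEMultipart("mixed")
    outer.attach(MIMEText(body_text, "plain", body_charset))
    outer.attach(MIMEText(footer_text, "plain", footer_charset))
    return outer


# Hypothetical example: a UTF-8 post on a list whose footer is ISO-2022-JP.
msg = wrap_with_footer("Grüße aus Köln\n", "utf-8",
                       "日本語のフッター\n", "iso-2022-jp")
for part in msg.get_payload():
    print(part.get_content_type(), part.get_content_charset())
```

Because each part carries its own Content-Type header, a MIME-aware reader can decode the body and the footer separately instead of guessing one encoding for the whole message.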
Dan> There's something about the fullness of charset processing
Dan> that I don't grok. I think it has to do with design. Are
Dan> there design notes somewhere?
I'm not exactly sure what you mean about fullness, but here's a little explanation from (gasp) Microsoft that covers 7-bit ASCII, 8-bit IBM PC-DOS characters, double-byte character sets like Japanese and Chinese, and Unicode:
http://www.microsoft.com/typography/unicode/cs.htm
Ben
-- Brought to you by the letters G and X and the number 8. "You have my pills!" Debian GNU/Linux maintainer of Gimp and Nethack -- http://www.debian.org/
"Dan" == Dan Mick <dmick@utopia.West.Sun.COM> writes:
Dan> Why a garbage message? Why not just a (potentially) garbage
Dan> footer?
Well, if it garbles HTML, it's going to hose our web interface. This is very possible with the Asian double-byte 7-bit encodings which are rife with random usage of octets like <, ", etc.
Also, some MUAs (think Emacs RMail, but also Outhouse Abcess) autodetect the encoding of the message buffer, usually based on the first 3000-4000 characters. Very long messages would be OK, but typical messages would have the confusing footer within that space. I really can't predict what Abcess would do with that (although it's quite good at generating that kind of stuff itself, Microsoft Unicode BOM followed by ASCII HTML markup intermixed with Unicode P?CDATA, for one egregious example of a beta version ;-).
Recently things have gotten better but when I first got to Japan I got to the point where I could read my boss's name and the name of the department in both base64 and raw bytes. Alphanumerics are easy; the double byte versions are #1, #2, ..., #A, .... ;-) So bogus encodings are no joke on this side of the big blue puddle.
For more information, Jukka Korpela's page is quite good; there must be an intro or seven there, and you don't even need to read Finnish!
http://www.cs.tut.fi/~jkorpela/
-- Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Don't ask how you can "do" free software business; ask what your business can "do for" free software.
[... I'm a man of wealth and taste...]
I recently started running a relatively small mailing list using Mailman and am quite happy with it. I am a proficient Python developer and the fact that Mailman is Python-based was one reason I chose it with enthusiasm. I hope that in my spare time I'll be able to make some useful contributions.
Here are some of the areas I was thinking of working in. I throw them out in case someone else has already started or has ideas:
1. I would like to implement self.filter_prog, which I expect would be a filter on each incoming message. There are some special things I can think of I'd like to do with my list. E.g., automatic tallying of poll results.
2. Along those lines, I would like users to be able to elect to filter out HTML from their regular mail, in addition to digests. Seems this should be easy.
3. Longer term, I would like the option of adding some info fields to the user database. E.g., I would like to store real names, addresses and phone numbers on one of my lists.
4. It would be nice to reuse the existing list security as an umbrella to cover other arbitrary, list-members-only web pages. E.g., some listers hate large graphics attachments (and they are problematic generally). I'd like to remove the image from the incoming message, publish it on a secure, lister-only web page and forward the mail with a URL substituted for the image attachment.
Having barely glanced at the code and Mailman architecture generally, these items seem eminently doable and generally useful. But maybe I'm crazy. Don't hesitate to be frank, as I usually don't.
Looking forward to working with you folks.
Regards
--jb
-- James J. Besemer 503-280-0838 voice http://cascade-sys.com 503-280-0375 fax mailto:jb@cascade-sys.com
"JJB" == James J Besemer <jb@cascade-sys.com> writes:
JJB> I recently started running a relatively small mailing list
JJB> using Mailman and am quite happy with it. I am a proficient
JJB> Python developer and the fact that Mailman is Python-based
JJB> was one reason I chose it with enthusiasm. I hope that in
JJB> my spare time I'll be able to make some useful contributions.
Cool! Since I'm strapped for time right now, I'm just going to comment briefly.
JJB> Here are some of the areas I was thinking of working in. I
JJB> throw them out in case someone else has already started or
JJB> has ideas:
Be sure you're working with MM2.1, and preferably the CVS tree. Very soon now, MM2.1 will go into beta, which will mean a feature freeze.
JJB> 1. I would like to implement self.filter_prog, which I expect
JJB> would be a filter on each incoming message. There are some
JJB> special things I can think of I'd like to do with my list.
JJB> E.g., automatic tallying of poll results.
The old filter_prog stuff is gone. A much better way to do this in MM2.1 is to write a handler module. See the Mailman/Handlers directory for examples.
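As a rough illustration of the handler approach: modules in Mailman/Handlers are built around a `process(mlist, msg, msgdata)` function that the delivery pipeline calls for each message. The sketch below follows that shape for the poll-tallying idea, but everything else in it is an assumption made for illustration: the `VOTE:` line convention, the in-memory `TALLY` counter, and the use of Python 3's standard library so the example runs on its own without Mailman installed.

```python
import re
from collections import Counter
from email.message import Message

# Hypothetical convention: a ballot is a line of the form "VOTE: <choice>".
VOTE_RE = re.compile(r"^[ \t]*VOTE:[ \t]*(\w+)[ \t]*$", re.IGNORECASE | re.MULTILINE)

# Hypothetical in-memory tally; a real handler would persist this somewhere.
TALLY = Counter()


def process(mlist, msg, msgdata):
    """Count ballot lines in simple single-part text messages."""
    if msg.is_multipart():
        return  # this sketch only looks at plain, single-part bodies
    body = msg.get_payload()
    if not isinstance(body, str):
        return
    for choice in VOTE_RE.findall(body):
        TALLY[choice.lower()] += 1


# Stand-alone usage with a hand-built message; mlist is unused in this sketch.
ballot = Message()
ballot["Subject"] = "Re: Poll"
ballot.set_payload("I agree with the proposal.\nVOTE: Yes\n")
process(None, ballot, {})
print(dict(TALLY))
```

The handler only reads the message and leaves it unmodified, so in principle it could sit anywhere in the pipeline ahead of delivery.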
JJB> 2. Along those lines, I would like users to be able to elect
JJB> to filter out HTML from their regular mail, in addition to
JJB> digests. Seems this should be easy.
You'd think! I've had a couple of patches contributed that filter out HTML, but I've not been able to whip them into shape for inclusion. I've basically given up hope for MM2.1, but will look at it again for the next release. The problem is that the naive approach isn't difficult, but making it robust is much harder.
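For reference, the naive approach might look something like the following sketch, written against Python's standard `email` package rather than Mailman's code. It only drops text/html parts from multipart/alternative containers that also offer text/plain; the function name `strip_html_alternatives` and the sample message are hypothetical.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def strip_html_alternatives(msg):
    """Drop text/html parts where a text/plain alternative exists."""
    if not msg.is_multipart():
        return msg  # single-part messages are left alone, even if HTML-only
    parts = msg.get_payload()
    if msg.get_content_type() == "multipart/alternative":
        if any(p.get_content_type() == "text/plain" for p in parts):
            parts = [p for p in parts if p.get_content_type() != "text/html"]
    # Recurse so nested containers are handled too.
    msg.set_payload([strip_html_alternatives(p) for p in parts])
    return msg


alt = MIMEMultipart("alternative")
alt.attach(MIMEText("Hello, list.\n", "plain", "us-ascii"))
alt.attach(MIMEText("<p>Hello, list.</p>\n", "html", "us-ascii"))
strip_html_alternatives(alt)
print([p.get_content_type() for p in alt.get_payload()])
```

The gaps are where the robustness problem shows up: HTML-only messages, HTML sent as an attachment, multipart/related bundles with inline images, and mislabeled content types all pass through this sketch untouched.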
JJB> 3. Longer term, I would like the option of adding some info
JJB> fields to the user database. E.g., I would like to store
JJB> real names, addresses and phone numbers on one of my lists.
Not hard to do in MM2.1, but I doubt I'll accept much extension in this area. The whole backend user database will be rewritten in a future version and IMO, such extra information ought to be kept in an external database like LDAP or some such. Then those databases ought to be easily integrated into Mailman's rosters.
JJB> 4. It would be nice to reuse the existing list security as an
JJB> umbrella to cover other arbitrary, list-members-only web
JJB> pages. E.g., some listers hate large graphics attachments
JJB> (and they are problematic generally). I'd like to remove the
JJB> image from the incoming message, publish it on a secure,
JJB> lister-only web page and forward the mail with a URL
JJB> substituted for the image attachment.
Again, not hard to do. The MemberAdaptor API can help you here, and also take a look at SecurityManager.py. Note also that MM2.1's Pipermail has some code to strip out attachments and store them separately in URL space.
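The attachment half of that idea can be sketched independently of Mailman. The example below uses only Python's standard `email` package; it is not Pipermail's code and does not touch MemberAdaptor or SecurityManager.py. The base URL, the dict used as storage, and the function name `replace_images_with_links` are all hypothetical stand-ins, and access control for the resulting URLs is left out entirely.

```python
import hashlib
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical members-only location; the real one would sit behind list auth.
BASE_URL = "https://lists.example.org/private/attachments"


def replace_images_with_links(msg, store):
    """Move image/* parts into `store` and leave a short text note with a URL."""
    if not msg.is_multipart():
        return msg
    new_parts = []
    for part in msg.get_payload():
        if part.get_content_maintype() == "image":
            data = part.get_payload(decode=True)
            name = part.get_filename() or "image"
            # Content hash keeps keys unique; a real system must sanitize `name`.
            key = hashlib.sha256(data).hexdigest()[:16] + "-" + name
            store[key] = data
            note = "An image attachment was removed. Members can view it at:\n"
            note += "%s/%s\n" % (BASE_URL, key)
            new_parts.append(MIMEText(note, "plain", "us-ascii"))
        else:
            new_parts.append(replace_images_with_links(part, store))
    msg.set_payload(new_parts)
    return msg


# Stand-alone usage with fake image bytes and an in-memory store.
post = MIMEMultipart("mixed")
post.attach(MIMEText("Photo from the meetup attached.\n", "plain", "us-ascii"))
image = MIMEImage(b"\x89PNG\r\n\x1a\n" + b"not a real image", _subtype="png")
image.add_header("Content-Disposition", "attachment", filename="meetup.png")
post.attach(image)

saved = {}
replace_images_with_links(post, saved)
print([p.get_content_type() for p in post.get_payload()], sorted(saved))
```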
Cheers, -Barry
On Wed, Mar 06, 2002 at 10:33:40AM -0500, barry@zope.com wrote:
Not hard to do in MM2.1, but I doubt I'll accept much extension in this area. The whole backend user database will be rewritten in a future version and IMO, such extra information ought to be kept in an external database like LDAP or some such. Then those databases ought to be easily integrated into Mailman's rosters.
<nit pedantic_level="high"> "...in an external database, possibly with an LDAP interface..." </nit>
This is much akin to XML, which everyone commonly (and incorrectly) conflates with application architectures which utilize it...
Cheers, -- jra
Jay R. Ashworth jra@baylink.com Member of the Technical Staff Baylink RFC 2100 The Suncoast Freenet The Things I Think Tampa Bay, Florida http://baylink.pitas.com +1 727 647 1274
"If you don't have a dream; how're you gonna have a dream come true?" -- Captain Sensible, The Damned (from South Pacific's "Happy Talk")
At 08:52 PM 04/03/02 -0800, James J. Besemer wrote:
- It would be nice to reuse the existing list security as an umbrella to cover other arbitrary, list-members-only web pages. E.g., some listers hate large graphics attachments (and they are problematic generally). I'd like to remove the image from the incoming message, publish it on a secure, lister-only web page and forward the mail with a URL substituted for the image attachment.
Since no one else has commented on this yet, I thought I should do so lest it be forgotten. :)
I think it'd be very cool if I could set up one of my list servers to grab images and do this automatically, perhaps after confirmation from the admin interface (so if it were *really* huge I could refuse to store it, or if I had it grabbing anything that was larger than the limit, I'd probably want to check to make sure it wasn't another copy of SirCam before posting :P ). I'm not too sure it'd be useful to the world as a whole, but I do this manually now for one server's worth of lists, so maybe implementing it for myself would be worth it.
It's funny that you should mention it, because I got a handful of images sent to one of my smaller lists this week and I was actually thinking about this just before you posted.
It's good to know that using list security is easy, too. Ages ago, we contemplated using mailman passwords to allow access to a linuxchix wiki (It seems sorta wrong to limit access to a wiki, but linuxchix is a likely target for trolls... although thankfully we haven't had many on the mailing lists), and although I figured it could be done, it's neat to know where to start if I ever have a similar idea brought up. :)
participants (7)
- barry@zope.com
- Ben Gertzfield
- Dan Mick
- James J. Besemer
- Jay R. Ashworth
- Stephen J. Turnbull
- Terri Oda