I'm already using the "jobs" infrastructure provided by the
django-extensions package:
http://django-extensions.readthedocs.org/en/latest/jobs_scheduling.html
Cool. I didn't know about this extension, but it looks like it does what we
need. So the background process would be its own file in the jobs
directory, and we could leave it to the admin to set up the crontab?
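Something like this is what I have in mind, as a minimal sketch;
rebuild_archives() is a hypothetical placeholder for the actual export
logic:

    # hyperkitty/jobs/daily/export_mbox.py  (hypothetical module path)
    from django_extensions.management.jobs import DailyJob

    class Job(DailyJob):
        help = "Rebuild the downloadable mbox archives."

        def execute(self):
            # Placeholder: the real export logic would live elsewhere.
            from hyperkitty.lib.mbox import rebuild_archives
            rebuild_archives()

The admin's crontab would then only need to run
"python manage.py runjobs daily" once a day.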
I have another test server with more current info if you want, but I
break it regularly. It's lists-dev.cloud.fedoraproject.org
Thanks for linking this. I got my own local dev server working yesterday,
but this one is much more populated.
We do put the attachment in the mbox, as a MIME component like in
every email.
I see how this works now. Are the attachments always Base64 encoded?
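For example, I could inspect an existing archive with something like
this ("archive.mbox" is a placeholder path):

    import mailbox

    # Report how each attachment part is transfer-encoded.
    for message in mailbox.mbox("archive.mbox"):
        for part in message.walk():
            if part.get_filename():  # parts with a filename are attachments
                print("%s: %s" % (part.get_filename(),
                                  part.get("Content-Transfer-Encoding",
                                           "7bit")))

but I'm not sure whether the encoding is uniform across archives.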
Another possible "nice-to-have" feature I thought of yesterday is a
download link that scripts can use to get archives (e.g.
"/download?year=x&month=y"). On the other hand, maybe this is just a
security risk that has no actual use case, but I'd still like to have a
second opinion on this.
Well, there still is the authentication issue.
I guess getting the scripts to authenticate would be a little complicated,
but otherwise does this seem like something worth including? If my proposal
gets accepted, I'm ok with leaving this as an open question until it
becomes clear whether or not I'm going to have extra time at the end of the
summer.
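For illustration, a script could then be as simple as this; the URL, the
query parameters, and HTTP Basic auth are all assumptions here, not
anything HyperKitty supports today:

    import requests

    # Hypothetical download endpoint and credentials.
    resp = requests.get(
        "https://lists.example.com/archives/mylist/download",
        params={"year": 2013, "month": 5},
        auth=("username", "password"),  # assumes Basic auth is enabled
    )
    resp.raise_for_status()
    with open("2013-05.mbox", "wb") as f:
        f.write(resp.content)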
In my proposal I suggested using one of several asynchronous job queue
libraries, such as Celery or Huey, which typically use Redis (or an AMQP
broker) as a back-end. Because I have no experience with asynchronous
job queues, I'm not sure whether this is too much baggage for our
purposes. Maybe we just don't want the extra dependencies.
Yeah, we don't want to add another database or an AMQP server just for
that. We must keep it simple for admins to deploy.
Regarding cron jobs, there's also django-background-task, a simple
Django add-on that might do what we need. Again, if we don't want/need
the extra dependency, rolling our own cron job should be fairly
straightforward.
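By "rolling our own" I mean something like a custom management command
that the admin's crontab invokes periodically; update_mbox_files() is a
placeholder for the real logic:

    # myapp/management/commands/update_mbox.py  (hypothetical path)
    #   crontab entry: 0 * * * * python manage.py update_mbox
    from django.core.management.base import BaseCommand

    class Command(BaseCommand):
        help = "Append newly archived messages to the pre-built mbox files."

        def handle(self, *args, **options):
            from hyperkitty.lib.mbox import update_mbox_files  # placeholder
            update_mbox_files()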
I'm already using the "jobs" infrastructure provided by the
django-extensions package:
http://django-extensions.readthedocs.org/en/latest/jobs_scheduling.html
I did consider django-background-task but django-extensions seemed
like a better fit, because django-background-task seems written for
delayed tasks, not periodic tasks (well, a task could call itself
again when done, but it seems like a hack). I'm not opposed to
switching to django-background-task if we use the "delayed job"
feature or if we need the extra flexibility of choosing exactly how
many seconds apart we want our tasks to run.
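For reference, the "task calls itself" hack would look roughly like
this, assuming django-background-task's @background decorator works the
way I think it does (update_mbox_files() is a hypothetical helper):

    from background_task import background

    @background(schedule=3600)  # run an hour from now
    def update_mbox_task():
        update_mbox_files()  # hypothetical real work
        update_mbox_task()   # re-schedule itself for another hour

It works, but a declared schedule (cron or runjobs) seems much cleaner.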
If we choose to pre-build the mbox files, we can't simply have them
served through the webserver, because some lists are private.
Then there is also an authentication step?
Yeah, we must use HyperKitty's authentication and check if the user is
allowed to see the archive. So the files can't be served by the
webserver like static files.
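Concretely, such a view would look roughly like this; is_list_member()
and the on-disk layout are placeholders for HyperKitty's real permission
check and storage:

    from django.contrib.auth.decorators import login_required
    from django.http import HttpResponse, HttpResponseForbidden

    @login_required
    def download_mbox(request, list_name, year, month):
        # Placeholder check: HyperKitty's real logic decides whether
        # this user may read this (possibly private) list.
        if not is_list_member(request.user, list_name):
            return HttpResponseForbidden("This archive is private.")
        # Assumed on-disk layout for the pre-built files.
        path = "/var/lib/hyperkitty/mbox/%s/%04d-%02d.mbox" % (
            list_name, int(year), int(month))
        response = HttpResponse(open(path, "rb").read(),
                                content_type="application/mbox")
        response["Content-Disposition"] = (
            "attachment; filename=%s-%s-%s.mbox" % (list_name, year, month))
        return response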
I noticed on the test server that I can't actually look at any of the
mailing lists because they're all private.
If you're looking at lists.stg.fedoraproject.org, it's currently very
outdated (still running the Python2-compatible branch of Mailman 3). I
have another test server with more current info if you want, but I
break it regularly. It's lists-dev.cloud.fedoraproject.org
When we create the mbox file, do we simply note that an attachment
existed (e.g. "Attachment: myattachment.txt") or do we actually put the
attachment in the mbox? AFAIK mbox is a plaintext format, so if the
latter is the case then I'm not exactly sure how this would work...
We do put the attachment in the mbox, as a MIME component like in
every email. If you choose "view source" when looking at an email with
attachments, you'll see how it's done.
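For example, this is roughly what the email package produces; the
payload bytes here are made up:

    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email.mime.application import MIMEApplication

    # The attachment becomes a MIME part whose bytes are Base64-encoded
    # into plain ASCII, so it fits in the plaintext mbox format.
    msg = MIMEMultipart()
    msg["Subject"] = "Example with attachment"
    msg.attach(MIMEText("See the attached file."))
    part = MIMEApplication(b"file contents here", Name="myattachment.txt")
    part["Content-Disposition"] = 'attachment; filename="myattachment.txt"'
    msg.attach(part)
    print(msg.as_string())  # this flat text is what ends up in the mbox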
Are there going to be any issues handling non-ASCII (Unicode)
characters, or with file locks? Right now it looks like we should have
only one process handling the mbox, but is it possible that more than
one could be spawned somehow?
No, mbox files are not designed for concurrent writes, so it's better
to have a single process write to them.
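If we write through the stdlib mailbox module, it can at least take the
mbox lock while writing. A minimal sketch ("archive.mbox" is a
placeholder path):

    import mailbox
    from email.message import Message

    msg = Message()
    msg["Subject"] = "test"
    msg.set_payload("body")

    box = mailbox.mbox("archive.mbox")
    box.lock()  # dot-lock plus flock, handled by the stdlib
    try:
        box.add(msg)
        box.flush()
    finally:
        box.unlock()
        box.close()

But that only protects against other well-behaved writers, so a single
writer process is still the safer design.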
Another possible "nice-to-have" feature I thought of yesterday is a
download link that scripts can use to get archives (e.g.
"/download?year=x&month=y"). On the other hand, maybe this is just a
security risk that has no actual use case, but I'd still like to have a
second opinion on this.
Well, there still is the authentication issue.
Aurélien