Last night, I added some code to queue messages that fail delivery when using SMTPDirect. What happens is this:
If a message either fails delivery completely (e.g. the smtp socket connect fails) or fails for some, but not all, recipients, then the message is stored on the file system to be retried later.
For every failed message, two files are created in a new directory called `qfiles'. The base name of these files is the SHA hexdigest of the message text, which should be nearly guaranteed unique. The first file is the complete plain text of the failed message. The second file is a marshal of useful information about the failed delivery: the listname and the list of failed recipients, along with a few other moderately useful bits of info.
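To make the file layout concrete, here is a minimal sketch of the queueing step, not the code being checked in. The function name `enqueue_failed', the `.msg' and `.db' extensions, the dictionary keys, and the use of hashlib.sha1 as the SHA digest are all assumptions made for illustration.

```python
import hashlib
import marshal
import os


def enqueue_failed(qdir, msgtext, listname, failed_recips):
    """Store a failed message as two files named after its SHA hexdigest.

    Hypothetical helper: the extensions and dict keys are illustrative.
    """
    os.makedirs(qdir, exist_ok=True)
    # Base name shared by both files: hexdigest of the message text.
    base = hashlib.sha1(msgtext.encode("utf-8")).hexdigest()
    # First file: the complete plain text of the failed message.
    msgpath = os.path.join(qdir, base + ".msg")
    with open(msgpath, "w", encoding="utf-8", newline="") as fp:
        fp.write(msgtext)
    # Second file: a marshal of the information needed to retry delivery.
    info = {"listname": listname, "recips": list(failed_recips)}
    with open(os.path.join(qdir, base + ".db"), "wb") as fp:
        marshal.dump(info, fp)
    return base
```

Because the base name depends only on the message text, queueing the same text twice overwrites the earlier pair of files rather than creating a second pair.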
There's a new cron script called `qrunner' which cruises the files in qfiles. It claims a lock (to prevent multiple qrunner processes from running at once) and then goes through each file it finds, attempting redelivery. If there are any problems reading a qfile, it logs a message and skips the file until next time, on the assumption that the problem is transient. When qrunner notices that the message has been handed off to the smtp daemon for all outstanding recipients, it deletes the two message files.
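The qrunner pass described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual script: `run_queue', the `deliver' callable (which is assumed to return the recipients that still failed), the `qrunner.lock' file name, and the `.msg'/`.db' extensions are all hypothetical. Rewriting the marshal file with the remaining recipients after a partial success is also an assumption; the description above only says the files are deleted once every recipient has been handed off.

```python
import logging
import marshal
import os

log = logging.getLogger("qrunner")


def run_queue(qdir, deliver):
    """Attempt redelivery of every queued message found in qdir.

    Returns False if another runner holds the lock, True otherwise.
    """
    lockpath = os.path.join(qdir, "qrunner.lock")
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    try:
        for name in sorted(os.listdir(qdir)):
            if not name.endswith(".db"):
                continue
            dbpath = os.path.join(qdir, name)
            msgpath = dbpath[:-len(".db")] + ".msg"
            try:
                with open(dbpath, "rb") as fp:
                    info = marshal.load(fp)
                listname, recips = info["listname"], info["recips"]
                with open(msgpath, encoding="utf-8", newline="") as fp:
                    msgtext = fp.read()
            except (OSError, EOFError, ValueError, TypeError, KeyError) as e:
                # Assume the problem is transient: log it, retry next run.
                log.warning("skipping %s: %s", name, e)
                continue
            remaining = deliver(msgtext, listname, recips)
            if remaining:
                # Assumption: remember only the recipients still outstanding.
                info["recips"] = list(remaining)
                with open(dbpath, "wb") as fp:
                    marshal.dump(info, fp)
            else:
                # Handed off for all recipients: delete both files.
                os.remove(msgpath)
                os.remove(dbpath)
    finally:
        os.close(fd)
        os.remove(lockpath)
    return True
```

One limitation of this simple lock: if a runner is killed before its cleanup runs, the lock file stays behind and has to be removed by hand before the next pass can proceed.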
I've moderately tested this stuff with total delivery failure by shutting off my smtp daemon, attempting some deliveries, turning it back on and running qrunner. I don't have the time right now to test partial delivery failures, but I still claim that without DSN support, these will be unlikely. Hopefully some of you can help look at this.
I'm about to check all this stuff in. Let me know what you think. -Barry