Thanks for starting this discussion. Since the thread's already long, I'm just going to answer randomly with my own thoughts.
One thing I have a real problem with is defining the database query layer as the interface between components. To me that just unacceptably ties us to a specific database, and/or a specific protocol. For example, I do not want to *require* Postgres in order to run Mailman, or to integrate *a* web ui with the core. I just think that as convenient as that might seem today, it will lock us into a system design we're going to regret somewhere down the road.
So let's say for the moment that we agree that all the user data should live in one place. I don't have a problem with that conceptually, and I actually don't care whether that's part of the core or in a separate component. The other problem I have is extending the core's data model to include things it doesn't care about. When you realize all of that has to be documented and tested, that just seems like it's adding lots of extra baggage to the core.
For example, today you might want Twitter and Facebook ids in that database. Five years ago maybe you also wanted an AIM id in there. Do you today? Will you still want Google+ ids in there, or BrowserIDs, or OpenIDs five years from now? Yet, if it's part of the core's data model, we have to support it, test it, document it, go through deprecation cycles, etc. etc.
One of the important design decisions I made was using Zope interfaces to formally define the touch points between the different components of the system. This isn't just for the fun of it; instead, it gives us great implementation flexibility.
For example, if you need to know what email addresses a user has registered, you access that through the IUser interface. Rosters are another great example of where you access things through the IRoster interface and nothing else. Nothing except the implementation of that even cares that they are implemented as queries and don't exist as *real* objects in the system. They can return whatever they want, as long as they conform to the IUser or whatever interface.
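To make that concrete, here is a minimal sketch of the idea, not the core's actual code. It uses `typing.Protocol` from the standard library as a stand-in for a Zope interface so that it runs without extra packages. The `addresses()` method, both roster classes, and the `member` table are hypothetical names invented for illustration.

```python
import sqlite3
from typing import Iterable, Protocol


class IRoster(Protocol):
    """Stand-in for an interface: callers rely on this and nothing else."""

    def addresses(self) -> Iterable[str]:
        ...


class ListBackedRoster:
    """A roster that really holds its members in memory."""

    def __init__(self, emails):
        self._emails = list(emails)

    def addresses(self):
        return sorted(self._emails)


class QueryBackedRoster:
    """A roster that is only a query; it stores no member objects itself."""

    def __init__(self, connection, fqdn_listname):
        self._connection = connection
        self._fqdn_listname = fqdn_listname

    def addresses(self):
        rows = self._connection.execute(
            "SELECT email FROM member WHERE list_name = ? ORDER BY email",
            (self._fqdn_listname,),
        )
        return [email for (email,) in rows]


def count_members(roster: IRoster) -> int:
    """Client code: it cannot tell which implementation it was given."""
    return len(list(roster.addresses()))


# Hypothetical schema and data, only to have something to query.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE member (email TEXT, list_name TEXT)")
connection.executemany(
    "INSERT INTO member VALUES (?, ?)",
    [
        ("anne@example.com", "test@example.com"),
        ("bart@example.com", "test@example.com"),
        ("cris@example.com", "other@example.com"),
    ],
)

in_memory = ListBackedRoster(["bart@example.com", "anne@example.com"])
queried = QueryBackedRoster(connection, "test@example.com")
```

Here `count_members()` gives the same answer for both objects, which is the whole point: only the implementation knows whether a query is involved.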
This all might lead to inefficiencies, but I don't think that matters right now. It probably will some day, but let's worry about that if and when we need to. What we care about now is the *flexibility* and the *stability* of the system.
For the sake of argument, let's say that all the user information should be stored in Postorius. What kind of changes would be needed in the core to keep its view of the user world in sync with Postorius's view of the world? No matter how you slice it, you are going to have two separate processes that need to be kept in sync.
You actually could, as I think Richard advocates, just expose the SQL queries to both processes. You would in theory have to only re-implement a handful of interfaces to keep the rest of the system humming. IOW, when the IUserManager needs to look up a user by their email address, instead of running a query against the local SQLite database, you would run it against the Postorius database. But - and here's the key thing - you would *still* return some object that implements the IUser interface. If you do that, you've localized the changes you have to make to the core and everything else Just Works (again, in theory ;).
One of the things I've tried to do, with unknown success because nobody's tried it, is to keep in mind three broad slices of data: the user data, the list data, and the message data. So for example, an IMember associates an IAddress with an IMailingList. The standard implementation of that doesn't use a foreign key between the member table and the mailinglist table. Instead it uses the fqdn_listname, i.e. a string. What that *should* mean is that you could move the user data anywhere you want and not have to also move the list data and message store data.
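As a rough illustration of why the string link matters, the sketch below puts the member rows and the mailing list rows in two separate SQLite databases. The table and column names are made up for the example and are not the core's real schema. The only thing the two sides share is the fqdn_listname string.

```python
import sqlite3

# Two independent databases: one for the user slice, one for the list slice.
user_db = sqlite3.connect(":memory:")
list_db = sqlite3.connect(":memory:")

# Hypothetical list-side table, keyed by the fully qualified list name.
list_db.execute(
    "CREATE TABLE mailinglist (fqdn_listname TEXT PRIMARY KEY, description TEXT)"
)
list_db.execute(
    "INSERT INTO mailinglist VALUES (?, ?)",
    ("test@example.com", "A test list"),
)

# Hypothetical user-side table. The list is named by a plain string, so there
# is no foreign key reaching into the other database.
user_db.execute(
    "CREATE TABLE member (address TEXT, fqdn_listname TEXT, role TEXT)"
)
user_db.execute(
    "INSERT INTO member VALUES (?, ?, ?)",
    ("anne@example.com", "test@example.com", "member"),
)


def lists_for_address(address):
    """Ask the user-side store which lists an address belongs to."""
    rows = user_db.execute(
        "SELECT fqdn_listname FROM member WHERE address = ?", (address,)
    )
    return [name for (name,) in rows]


def describe_list(fqdn_listname):
    """Resolve a list name against the list-side store, or return None."""
    row = list_db.execute(
        "SELECT description FROM mailinglist WHERE fqdn_listname = ?",
        (fqdn_listname,),
    ).fetchone()
    return None if row is None else row[0]


descriptions = [
    describe_list(name) for name in lists_for_address("anne@example.com")
]
```

The trade-off is that nothing enforces referential integrity between the two stores, so code that resolves a name has to cope with a list that isn't there.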
There *should* be enough hooks in the system already for a system administrator to say "I want to use Postorius, so I must enable the Postorius IUserManager implementation". For global objects like this, we use the Zope Component Architecture (ZCA), so in a Postorius-owns-the-world scenario, what has to happen is that
    usermanager = getUtility(IUserManager)
must return the PostoriusUserManager instance and not the SQLite-based UserManager instance. Once you've done that, you have to change *nothing* else in the system because everything talks to that object through the interface, and as long as that keeps its contract, the rest of the system should, again, Just Work.
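Here is a small sketch of that swap. To keep it runnable with only the standard library, `provideUtility()` and `getUtility()` below are tiny stand-ins modeled on the zope.component functions of the same names, and `IUserManager` is an abstract base class rather than a real Zope interface. The `get_user()` method, the `User` class, and both manager classes are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    email: str
    display_name: str


class IUserManager(ABC):
    """Stand-in for the interface the rest of the system talks to."""

    @abstractmethod
    def get_user(self, email):
        """Return the User for this address, or None."""


# Stand-in utility registry (assumption: one utility per interface, no names).
_utilities = {}


def provideUtility(component, provides):
    _utilities[provides] = component


def getUtility(interface):
    return _utilities[interface]


class LocalUserManager(IUserManager):
    """Pretends to be the manager backed by the local database."""

    def __init__(self):
        self._users = {
            "anne@example.com": User("anne@example.com", "Anne (local)"),
        }

    def get_user(self, email):
        return self._users.get(email)


class PostoriusUserManager(IUserManager):
    """Pretends to look users up in a store owned by Postorius."""

    def __init__(self, remote_records):
        self._remote_records = remote_records

    def get_user(self, email):
        record = self._remote_records.get(email)
        if record is None:
            return None
        # Whatever the backend is, the caller still gets a User back.
        return User(email, record["display_name"])


def greeting(email):
    """Client code: it only knows about the interface and the registry."""
    user = getUtility(IUserManager).get_user(email)
    return f"Hello, {user.display_name}"


provideUtility(LocalUserManager(), IUserManager)
local_greeting = greeting("anne@example.com")

# The administrator's choice amounts to registering a different utility.
provideUtility(
    PostoriusUserManager({"anne@example.com": {"display_name": "Anne (Postorius)"}}),
    IUserManager,
)
postorius_greeting = greeting("anne@example.com")
```

Note that `greeting()` is untouched between the two registrations; only the object handed back by the registry changed.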
I have no idea whether the above will be easy or not, since nobody's tried it. But the system design should allow you to do it this way, and I would be very open to the right hooks, fixes, and extensions to make this possible. I hope you can see how this approach lets someone run Mailman in many different configurations, from a core-only system, to Postorius, to a system where all the user database is in ZODB or already locked up in a proprietary closed database.
There is another approach of course, which may end up being simpler, if more brittle. You could just try to keep the two databases in sync. It doesn't matter too much which is the master, you just have to decide. This is essentially how Launchpad's integration with Mailman 2 works. Launchpad is the master database and whenever something in that database changes that could affect Mailman, that information is communicated to the Mailman server. The details are mostly unimportant, and yes, it does work. It's been brittle in the past, but now with enough logging, monitoring, and fail-safes it works great.
How would you keep these two in sync? First, if something changes in the core, the idea is that an event is triggered. Other components of the system watch for those events and react to the ones they care about. For example, let's say a user changes their password via email command. Once the core acts on that change, it will trigger a PasswordChangeEvent which has a reference to the user effecting that change. If Postorius were the master database for passwords, we'd have to add a little event subscriber which, when it got a PasswordChangeEvent, then talked to Postorius to make that change. Or maybe it updated the shared user data component, or made the appropriate SQL UPDATE call. The key thing, again, is that the core just fires the PasswordChangeEvent, and other things react to it.
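A minimal sketch of such a subscriber might look like the following. The `subscribers` list and `notify()` function are stand-ins modeled on zope.event so the example runs on its own. The `User` class, `FakePostorius`, and `change_password()` are hypothetical, and the password is kept in the clear only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class User:
    email: str
    password: str  # Illustration only; a real system would store a hash.


@dataclass
class PasswordChangeEvent:
    """Carries a reference to the user whose password changed."""

    user: User


# Stand-in event machinery: a list of callables, each given every event.
subscribers = []


def notify(event):
    for subscriber in list(subscribers):
        subscriber(event)


class FakePostorius:
    """Hypothetical stand-in for whatever owns the master password data."""

    def __init__(self):
        self.passwords = {}

    def set_password(self, email, password):
        self.passwords[email] = password


postorius = FakePostorius()


def push_password_to_postorius(event):
    """React only to the events this subscriber cares about."""
    if isinstance(event, PasswordChangeEvent):
        postorius.set_password(event.user.email, event.user.password)


subscribers.append(push_password_to_postorius)


def change_password(user, new_password):
    """What the core does: act on the change, then fire the event."""
    user.password = new_password
    notify(PasswordChangeEvent(user))


anne = User("anne@example.com", "old-secret")
change_password(anne, "new-secret")
```

The core's side, `change_password()`, knows nothing about Postorius. Swapping the subscriber for one that issues an SQL UPDATE would not touch it.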
Alternatively, let's say a user changes their password through the web ui. I think this case is already covered, because the way to keep that in sync with the core is to make the appropriate REST call, probably PATCHing the user's password.
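For illustration, the web ui side of that might build a request along these lines. Everything specific here is an assumption: the base URL and port, the `/users/<id>` path, the `cleartext_password` field name, and the credentials. The sketch only constructs the request and does not send it.

```python
import base64
import urllib.parse
import urllib.request


def build_password_patch(base_url, user_id, new_password, api_user, api_password):
    """Build (but do not send) a PATCH request that changes a user's password."""
    # Assumed resource layout: <base_url>/users/<user id or address>.
    url = f"{base_url}/users/{urllib.parse.quote(user_id, safe='')}"
    # Assumed field name for the new password.
    body = urllib.parse.urlencode({"cleartext_password": new_password})
    request = urllib.request.Request(
        url, data=body.encode("ascii"), method="PATCH"
    )
    # HTTP Basic authentication with assumed admin credentials.
    token = base64.b64encode(
        f"{api_user}:{api_password}".encode("utf-8")
    ).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


request = build_password_patch(
    "http://localhost:8001/3.0",  # hypothetical REST endpoint
    "anne@example.com",
    "new-secret",
    "restadmin",  # hypothetical credentials
    "restpass",
)

# Sending it would be: urllib.request.urlopen(request)
# That is left out here because it needs a running REST server.
```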
Very likely we don't have enough events defined to cover all the actions that the core must take (e.g. through email commands). But events are easy to add and again, I'm not opposed to adding any events which make the integration easier.
It's also likely that the REST API doesn't yet cover every bit of information Postorius wants to get into the core or out of the core. Again, it's easy to extend the REST API, so let's fill in what's missing.
I hope this lays out the basic design constraints that I want to follow. Maybe it sparks some thoughts about different possibilities.
Cheers, -Barry