
Jason Pruim wrote:
But how would you scale that to the size of say... Yahoo? Multiple data centers around the world, all processing mail for different domains under Yahoo's control... How would one be able to synchronize all that data from tons of different places like that?
Well,
Scale-out is always hard, and enterprises like Yahoo face a lot of difficult challenges in their environments. But if I may put in my two cents: how about letting the gateways query some sort of probabilistic data structure, e.g. a Bloom filter, to find out whether an address is known? You could generate Bloom filters from the different directories and distribute those filters via DNS or LDAP.
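To make the idea a bit more concrete, here is a minimal sketch of such a filter in Python. Everything in it is my own assumption for illustration: the class name BloomFilter, the use of SHA-256 with double hashing to derive the bit positions, and the lowercasing of addresses before hashing. A real gateway would more likely use an existing, tested implementation.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter for mail addresses (hypothetical helper)."""

    def __init__(self, capacity, error_rate=0.001):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, address):
        # Assumption: addresses are compared case-insensitively.
        digest = hashlib.sha256(address.strip().lower().encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        # Double hashing: derive k bit positions from two base hashes.
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, address):
        for pos in self._positions(address):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, address):
        # True means "probably known"; False means "definitely not added".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(address))


known = BloomFilter(capacity=1000)
known.add("alice@example.com")
print("alice@example.com" in known)    # True: an added address is always found
print("mallory@example.com" in known)  # almost certainly False
```

The bytearray in `bits` is the only state, so that is the blob each directory would publish and each gateway would load.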
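With a few hundred megabytes of memory you can store millions of entries in the filter with an error probability of something like under 1/1000. A Bloom filter never gives a false negative, so a known address is always accepted; only an unknown address can occasionally slip through as a false positive. That means you end up sending fewer than 1/1000th of the DSN messages you are sending now.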
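For anyone who wants to check the memory figure, the textbook sizing formula for an optimally tuned Bloom filter is m = -n * ln(p) / (ln 2)^2 bits for n entries at false-positive rate p, which works out to roughly 14.4 bits per entry at p = 1/1000. The small script below just evaluates that formula; the function name bloom_size and the entry counts are made up for illustration.

```python
import math


def bloom_size(entries, error_rate):
    """Return (bits, hash_count) for an optimally sized Bloom filter."""
    bits = math.ceil(-entries * math.log(error_rate) / math.log(2) ** 2)
    hashes = max(1, round(bits / entries * math.log(2)))
    return bits, hashes


# Hypothetical directory sizes, from one million to one hundred million addresses.
for entries in (1_000_000, 10_000_000, 100_000_000):
    bits, hashes = bloom_size(entries, 0.001)
    mib = bits / 8 / 2**20
    print(f"{entries:>11,} entries: {mib:8.1f} MiB, {hashes} hash functions")
```

By this formula a few million entries fit in a handful of megabytes, so the few-hundred-megabyte estimate above leaves plenty of headroom.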
-- Eino Tuominen