Britton, Sam and Stephen have each reported to me, at different times, that one of the processors in a parallel job sometimes seems to hang for a while and then race to catch up at the end. Have any of you ever managed to localize this problem and figure out exactly where it hangs? I think it would show up in per-processor profiling: look at which functions take the most time on each processor, and at how those times differ across processors. I'd *really* like to track this down, as it's now causing us some real problems.
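
To make the per-processor profiling idea concrete, here is a rough sketch, assuming the job is Python and using only the standard-library cProfile and pstats modules. All of the names in it (profile_rank, largest_disparities, fake_work, io_step, compute_step) are hypothetical and made up for illustration, and the "ranks" are simulated serially with one of them deliberately stalled. In a real parallel job each processor would profile its own work and the per-rank tables would be gathered on one processor (for example, with mpi4py the rank could come from MPI.COMM_WORLD.Get_rank(), but that is an assumption about the setup, not something established here).

```python
import cProfile
import pstats
import time


def profile_rank(rank, work):
    """Run work(rank) under cProfile; return {function label: cumulative seconds}."""
    profiler = cProfile.Profile()
    profiler.runcall(work, rank)
    stats = pstats.Stats(profiler)
    totals = {}
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    for (filename, lineno, name), (_cc, _nc, _tt, cumtime, _callers) in stats.stats.items():
        totals[f"{name} ({filename}:{lineno})"] = cumtime
    return totals


def largest_disparities(per_rank, top=5):
    """Sort functions by (max - min) cumulative time across ranks, largest first.

    Returns a list of (spread in seconds, function label, slowest rank).
    """
    labels = set().union(*per_rank.values())
    spreads = []
    for label in labels:
        times = {rank: table.get(label, 0.0) for rank, table in per_rank.items()}
        slowest = max(times, key=times.get)
        spreads.append((max(times.values()) - min(times.values()), label, slowest))
    spreads.sort(reverse=True)
    return spreads[:top]


# --- Hypothetical stand-in workload: rank 2 stalls inside io_step ---
def compute_step(rank):
    return sum(i * i for i in range(1000))


def io_step(rank):
    if rank == 2:
        time.sleep(0.2)  # simulated hang on one processor only


def fake_work(rank):
    compute_step(rank)
    io_step(rank)


if __name__ == "__main__":
    # Simulate three ranks one after another; a real job would gather these.
    per_rank = {rank: profile_rank(rank, fake_work) for rank in range(3)}
    for spread, label, slowest in largest_disparities(per_rank, top=3):
        print(f"{spread:8.3f}s  slowest rank {slowest}  {label}")
```

The functions at the top of that listing are the ones whose time differs most between processors, which is where I would start looking for the hang. Cumulative time is used here, so callers of the stalled function show up alongside it.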