The overhead of importing is not in trying too many names, but in loading the module and executing its bytecode.
That was my conclusion as well when I did some profiling last fall at the Python core sprint. My lazy execution experiments are an attempt to solve this:
I expect that Mercurial is already doing a lot of tricks to make execution more lazy. They have a lazy module import hook but they probably do other things to not execute more bytecode at startup then is needed. My lazy execution idea is that this could happen more automatically. I.e. don't pay for something you don't use. Right now, with eager module imports, you usually pay a price for every bit of bytecode that your program potentially uses.
Another idea, suggested to me by Carl Shapiro, is to store unmarshalled Python data in the heap section of the executable (or in DLLs). Then, the OS page fault handling would take care of only loading the data into RAM that is actually being used. The linker would take care of fixing up pointer references. There are a lot of details to work out with this idea but I have heard that Jeethu Rao (Carl's colleague at Instagram) has a prototype implementation that shows promise.