[IPython-dev] Patch for paren completion glitch
Fernando Perez
Fernando.Perez at colorado.edu
Thu Jul 22 02:42:10 EDT 2004
Ville Vainio wrote:
> I managed to make "(foo)bar" completion work for filenames, without
> breaking completion for space characters (or anything else I could try
> out quickly). It seems to work on Linux at least - haven't tried it on
> Windows yet, I thought it might be a good idea to show the patch for
> some quick feedback before that.
>
> As you can guess it's brutally hackish, but I figure that's the way it
> goes with readline ;-).
>
> It should be trivial to add the functionality for other delimiter chars.
> I used shlex.split directly, it should be changed to shlex_split from
> magic for backwards compatibility but I thought I'll do it later on...
It's pretty ugly, but I doubt much better is achievable given the problem at
hand. I am, however, a bit concerned about performance given that this stuff
gets called for _every_ filename in the completions match, which can be a lot
in a big directory. People expect tab-completion to be near-instantaneous,
and I'd like to keep it that way. In particular, I think that
def unprotect_filename(s):
    chs = []
    in_escape = False
    for ch in s:
        if in_escape:
            chs.append(ch)
            in_escape = False
            continue
        if ch == '\\':
            in_escape = True
            continue
        chs.append(ch)
    return "".join(chs)
is essentially:
# Alternative unprotect_filename
# About 5 times faster than the original
unprotect_filename2 = lambda s:s.replace('\\','')
Am I right? If that's the case, it can (and should) be explicitly inlined,
since function call overhead in python is violent. Even as a function, the
second form is about 5 times faster for short strings, which is significant.
Once inlined, the payoff will be even bigger.
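
A minimal sketch of how the two versions could be compared, assuming made-up sample names (they are not taken from the patch). One caveat worth checking before inlining: the two agree whenever a backslash only ever escapes a non-backslash character, but they differ on an escaped backslash, where the loop keeps one backslash and the replace() version drops both.

```python
import timeit

def unprotect_filename(s):
    """Loop version: drop each escaping backslash, keep the char after it."""
    chs = []
    in_escape = False
    for ch in s:
        if in_escape:
            chs.append(ch)
            in_escape = False
            continue
        if ch == '\\':
            in_escape = True
            continue
        chs.append(ch)
    return "".join(chs)

# replace() version: removes every backslash, escaped or not
unprotect_filename2 = lambda s: s.replace('\\', '')

# Hypothetical sample names; both versions agree on all of these
samples = ['plain.txt', 'my\\ file.txt', 'a\\(b\\)c', 'trailing\\']
for s in samples:
    assert unprotect_filename(s) == unprotect_filename2(s)

# An escaped backslash is where the two versions part ways
escaped_backslash = 'a\\\\b'  # the four characters: a \ \ b
loop_result = unprotect_filename(escaped_backslash)      # keeps one backslash
replace_result = unprotect_filename2(escaped_backslash)  # drops both

def compare_speed(s='my\\ file.txt', number=20000):
    """Return (loop_seconds, replace_seconds) for `number` calls each."""
    t_loop = timeit.timeit(lambda: unprotect_filename(s), number=number)
    t_repl = timeit.timeit(lambda: unprotect_filename2(s), number=number)
    return t_loop, t_repl

if __name__ == '__main__':
    print('loop: %.4fs  replace: %.4fs' % compare_speed())
```

If filenames containing a literal backslash matter, the replace() shortcut would need a guard for that case; otherwise it is a drop-in substitute.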
I'm attaching a file which tests that indeed these two return identical
results for a bunch of random tests. I also checked the protect_filename, and
could manage very minor improvements by using a string instead of a list for
the 'in' check: checking 'char in string' is faster than 'char in list_of_chars'.
You can use the tester at the end for other checks like the one I suggest below.
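
As a rough sketch of the 'in' check comparison: the set of protected characters and the body of protect_filename below are assumptions for illustration, not the code from the patch. Both variants produce the same output; only the container used for the membership test changes.

```python
import timeit

# Assumed set of characters needing a backslash; the real patch may differ
PROTECT_LIST = [' ', '(', ')']
PROTECT_STR = ' ()'

def protect_with_list(s):
    """Escape protected characters, testing membership against a list."""
    return ''.join(['\\' + ch if ch in PROTECT_LIST else ch for ch in s])

def protect_with_str(s):
    """Same logic, testing membership against a string."""
    return ''.join(['\\' + ch if ch in PROTECT_STR else ch for ch in s])

def compare_speed(s='my file (copy).txt', number=20000):
    """Return (list_seconds, str_seconds) for `number` calls each."""
    t_list = timeit.timeit(lambda: protect_with_list(s), number=number)
    t_str = timeit.timeit(lambda: protect_with_str(s), number=number)
    return t_list, t_str

if __name__ == '__main__':
    print('list: %.4fs  str: %.4fs' % compare_speed())
```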
It would be worth also checking if this:
+ lsplit = shlex.split(lbuf[:self.readline.get_endidx()])[-1]
is faster when done with a regexp instead of shlex.split (the latter is HUGE,
so I expect it to be pretty slow).
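
One possible regexp version, as a sketch only: the pattern and the helper names are assumptions, and unlike shlex.split it knows nothing about quoting. It only handles backslash escapes and whitespace-separated words, so it would need checking against whatever input the completer really sees.

```python
import re
import shlex
import timeit

# Last run of "escaped char" or "non-space, non-backslash char" up to the end
_LAST_TOKEN = re.compile(r'(?:\\.|[^\s\\])+$')
# A backslash followed by the character it escapes
_ESCAPE = re.compile(r'\\(.)')

def last_token_shlex(buf):
    """Reference behaviour: what the patch line computes with shlex.split."""
    return shlex.split(buf)[-1]

def last_token_re(buf):
    """Regexp sketch: grab the trailing token, then strip the escapes."""
    m = _LAST_TOKEN.search(buf)
    if m is None:
        return ''
    return _ESCAPE.sub(r'\1', m.group(0))

def compare_speed(buf='cd my\\ dir/fi', number=5000):
    """Return (shlex_seconds, regexp_seconds) for `number` calls each."""
    t_shlex = timeit.timeit(lambda: last_token_shlex(buf), number=number)
    t_re = timeit.timeit(lambda: last_token_re(buf), number=number)
    return t_shlex, t_re

if __name__ == '__main__':
    print('shlex: %.4fs  regexp: %.4fs' % compare_speed())
```

The two agree on simple unquoted input such as 'cd my\ dir/fi'; input with quotes is where they would be expected to diverge.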
Here:
+ matches = [text0 + protect_filename(f[len(lsplit):]) for f in m0]
the len(lsplit) should be kept in a local outside, since python does not lift
constants out of loops or listcomps (the python compiler is absolutely
primitive in the optimizations it attempts).
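
A small sketch of what I mean by keeping the length in a local. The protect_filename body and the sample values are stand-ins, not the patch's code; only the shape of the list comprehension matters here.

```python
# Stand-in for the patch's protect_filename; the protected chars are assumed
def protect_filename(s, protectables=' ()'):
    return ''.join(['\\' + ch if ch in protectables else ch for ch in s])

def build_matches_inline(text0, lsplit, m0):
    """As in the patch line: len(lsplit) is re-evaluated for every f."""
    return [text0 + protect_filename(f[len(lsplit):]) for f in m0]

def build_matches_hoisted(text0, lsplit, m0):
    """Same result, with the length computed once outside the listcomp."""
    n = len(lsplit)
    return [text0 + protect_filename(f[n:]) for f in m0]

# Hypothetical completer state: the user typed 'fo' and two files match
m0 = ['foo bar', 'fo(x)']
matches = build_matches_hoisted('fo', 'fo', m0)
assert matches == build_matches_inline('fo', 'fo', m0)
```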
I agree that it's necessary to do this correctly because people do have
filenames with these chars in them. But since this is smack in the middle of
the interactive loop, I really want to be sure that the code is as absolutely
tight as possible. Also keep in mind that there may be users out there with
hardware much slower than yours, so coding for absolute efficiency is
important, even if it seems to run fine on good hardware.
I'm sure we'll converge on a nice solution shortly. Just go over every line
with a maniac eye for optimization fine-tunings. I've always tried to write
ipython that way, in the code paths which lie in the middle of the interactive
loop. I think the fact that even with all the stuff that goes on it still
feels reasonably responsive is a testament to the effort being worth it (and
obviously to the quality of python's implementation).
Thanks for the work!
Best,
f
-------------- next part --------------
A non-text attachment was scrubbed...
Name: comp_strfuns.py
Type: text/x-python
Size: 2238 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/ipython-dev/attachments/20040722/1f0f61fb/attachment.py>