[IPython-dev] Function specific hooks into the ipython tab completion system

Fri Nov 30 07:49:23 EST 2012

Hey,

[tl;dr: (1) Prototype working; (2) Dynamic language = limited potential; (3) Function annotations and
getting the ball rolling on annotation interoperability schemes]

I have a working prototype of the system at https://github.com/rmcgibbo/ipython/tree/functab.
Currently, the only type of matches that are supported are for enumerated string literals.
I will be adding more soon, and making the interface easy to extend with
new argument-specific tab completion logic. I think the heavy lifting is done.

Here's what you can do currently:

In [1]: from IPython.extensions.customtab import tab_completion

In [2]: @tab_completion(mode=['read', 'write'])
   ...: def my_io(fname, mode):
   ...:     print 'lets go now!'
   ...:     

In [3]: my_io('f.tgz', <TAB>
'read'   'write'  

(or if there's only one matching completion, it'll obviously just fill in)

So when your cursor is in position to add the second argument, it will do tab completions based on
the two static strings read from the decorator. It works pretty much as you'd expect, supporting keyword args, etc.
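
A minimal sketch of how a decorator like this could record its choices, and how a completer could look them up by argument position. This is an illustration under assumptions, not the code in the functab branch: the `_tab_completions` attribute and the helpers `argname_at` and `completions_for` are hypothetical names.

```python
def tab_completion(**choices):
    """Record per-argument completion choices on the decorated function."""
    def decorator(func):
        # Hypothetical attribute name; the real prototype may store this elsewhere.
        func._tab_completions = dict(choices)
        return func
    return decorator


def argname_at(func, position):
    """Name of the positional argument at `position`, or None if out of range."""
    code = func.__code__
    names = code.co_varnames[:code.co_argcount]
    return names[position] if position < len(names) else None


def completions_for(func, argname, prefix=''):
    """Registered choices for `argname` that start with `prefix`."""
    registered = getattr(func, '_tab_completions', {})
    return [c for c in registered.get(argname, []) if c.startswith(prefix)]


@tab_completion(mode=['read', 'write'])
def my_io(fname, mode):
    pass


# Cursor is on the second positional argument (index 1), nothing typed yet.
print(completions_for(my_io, argname_at(my_io, 1)))
```

Keyword arguments fit the same lookup: for something like `my_io('f.tgz', mode=<TAB>` the completer would pass 'mode' to `completions_for` directly instead of going through `argname_at`.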

The one thing that's tricky is dealing with quotation mark literals (' and ") in readline. This was my first
time coding against readline, so maybe I was making some rookie mistakes, but it seems to me like if
you don't play any tricks, the quotation marks are not treated correctly. When the current token is '"rea',
as if you've just started typing "read", and you hit tab, readline seems to tokenize the line and associate
the opening quotation mark with the previous token, not the current one. This means that if you include it
in the completion that you give, and you hit tab, the line will end up with TWO opening quotation marks,
ala '""read'. (And it'll keep going if you keep hitting tab). I had to hack my way around this, but if
anyone has any suggestions I'm all ears.
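
One possible workaround, sketched as a pure function so it doesn't depend on readline itself. It assumes the behavior described above (readline only replaces the text after the opening quotation mark), and it is not necessarily the hack used in the prototype; `open_quote` and `string_matches` are hypothetical names. The idea is to detect a string literal left open at the cursor and, in that case, offer completions without the opening quote.

```python
def open_quote(line):
    """Return (quote_char, index) for a string literal left open at the end
    of `line`, or None if the line does not end inside a string."""
    quote, start = None, -1
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is None:
            if ch in ('"', "'"):
                quote, start = ch, i      # a string literal starts here
        elif ch == '\\':
            i += 1                        # skip the escaped character
        elif ch == quote:
            quote = None                  # the literal is closed
        i += 1
    return (quote, start) if quote is not None else None


def string_matches(line, choices):
    """Candidates to offer for the text before the cursor in `line`."""
    state = open_quote(line)
    if state is None:
        # No quote typed yet: offer fully quoted literals.
        return [repr(c) for c in choices]
    quote, start = state
    typed = line[start + 1:]
    # The opening quote is already on the line, so leave it out of the
    # completion and only append the closing one.
    return [c + quote for c in choices if c.startswith(typed)]


print(string_matches("my_io('f.tgz', \"rea", ['read', 'write']))
```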

---

As Tom said, there's a lot you could do with this -- enumerated string literals are just the tip of the
iceberg. Other possibilities include glob-completed string literals for filenames and type-based
matching of Python objects in the current namespace (e.g. only show tab completions for ndarray
objects in your session).
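
A rough sketch of what those two kinds of matchers could look like, using only the standard library. The function names `filename_matches` and `instance_matches` are hypothetical, and the namespace is passed in as a plain dict rather than taken from a running IPython session.

```python
import fnmatch
import glob
import os


def filename_matches(prefix, pattern='*'):
    """Paths starting with `prefix` whose basename matches the glob `pattern`.
    Directories are always kept so that you can keep descending into them."""
    matches = []
    for path in sorted(glob.glob(prefix + '*')):
        if os.path.isdir(path):
            matches.append(path + os.sep)
        elif fnmatch.fnmatch(os.path.basename(path), pattern):
            matches.append(path)
    return matches


def instance_matches(namespace, cls, prefix=''):
    """Names in `namespace` that start with `prefix` and are bound to
    instances of `cls`."""
    return sorted(name for name, obj in namespace.items()
                  if name.startswith(prefix) and isinstance(obj, cls))


# Only the names bound to floats are offered.
print(instance_matches({'a': 1.5, 'b': 'text', 'ab': 2.5}, float, 'a'))
```

With numpy available, the same call with `cls=numpy.ndarray` would give the "only ndarray objects" behavior for names that already exist in the session.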

You obviously can't do as much as you could in a statically typed language. If you have type-based tab
completion looking for ndarray objects, you won't be able to infer that the return value of
some random function like np.random.randn is going to be an ndarray, so you can't offer that call as a completion too.
While things like type checking can be done with decorators / annotations, this isn't really the place. In particular,
the code for tab completion gets evaluated before you as a user ever hit enter and execute your line of
code -- and it's not appropriate to actually exec anything -- so there's no way to know the type of the first
argument to a function like foo(bar(x), 1<TAB> at the time that the tab is executed.
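
To make the limitation concrete, here is a sketch of how far a completer could get without executing anything. It uses the standard `ast` module to look at an argument expression: plain names can be looked up in the namespace and literals can be evaluated safely, but a call like bar(x) has no knowable type. The function name `static_type` is hypothetical.

```python
import ast


def static_type(expr, namespace):
    """Type of `expr` if it can be determined without running user code,
    otherwise None."""
    try:
        node = ast.parse(expr, mode='eval').body
    except SyntaxError:
        return None
    if isinstance(node, ast.Name):
        # An existing variable: its current type is known.
        return type(namespace[node.id]) if node.id in namespace else None
    try:
        # Literals such as 1, 'abc' or [1, 2] are safe to evaluate.
        return type(ast.literal_eval(node))
    except (ValueError, TypeError):
        # Calls, attribute access, arithmetic on names, etc. would need exec.
        return None


namespace = {'x': [1, 2, 3]}
print(static_type('x', namespace))
print(static_type('bar(x)', namespace))
```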

----

As to the decorator vs function annotation syntax, I am hesitant to use annotations for a few reasons. First, selfishly,
I don't use python3. (I would, but I depend on PyTables for everything, and it isn't py3k ready.) Without the
syntactic support, manually attaching __annotations__ to functions seems like a lame API.

Second, in order to do annotations "right", in such a way that two independent projects don't have annotation semantics
that fail to interoperate, everyone needs to agree on some system. If one library expects the annotations to be strings
(or to mean a specific thing, like a type for typechecking) and another library expects something different, then everyone is
in a bad spot. PEP 3107 seems to want some convention to "develop organically". As the author of the PEP said on
Python-ideas:

> When the first big project like Zope or Twisted announces that "we will do
> annotation interop like so...", everyone else will be pressured to
> line up behind them.
> I don't want to pick a mechanism,  only to have to roll
> another release when Zope or Twisted or some other big project pushes
> out their own solution.

But, on the other hand, maybe there's no time like the present to start. Although the PEP was finalized in 2010,
it doesn't seem like Django or any other big package is using annotations, so maybe we should lead the way?

In that spirit, I propose two possible schemes for annotation interoperability:

(1) All the annotations should be dicts. That way we can put tab-complete specific stuff under the 'tab' key and other
libraries/features can put their stuff under a different key, like 'type' or what have you. One advantage is that it's
simple. On the other hand, it's kind of verbose, and IMO ugly.

def my_io(fname, mode : {'tab': ['read', 'write']}):
    pass

def my_io(fname, mode : {'tab': ['read', 'write'], 'foo':'bar'}):
    pass
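
A small sketch of how a completer could read the dict scheme back out. It assumes Python 3 (for the annotation syntax) and a hypothetical helper name `tab_choices`; anything stored under other keys, like 'foo' here, is simply ignored by the tab-completion side.

```python
def tab_choices(func, argname):
    """Choices stored under the 'tab' key of an argument's dict annotation."""
    annotation = getattr(func, '__annotations__', {}).get(argname)
    if isinstance(annotation, dict):
        return list(annotation.get('tab', []))
    # Unannotated arguments, or annotations in some other scheme.
    return []


def my_io(fname, mode: {'tab': ['read', 'write'], 'foo': 'bar'}):
    pass


print(tab_choices(my_io, 'mode'))
print(tab_choices(my_io, 'fname'))
```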

(2) Inspired by ggplot, where you compose plots by summing components, the annotations could be the sum of
objects that overload __add__, like:

def my_io(fname, mode : tab('read', 'write')):
    pass

or 

def my_io(fname, mode : tab('read', 'write') + typecheck(str)):
    pass

Where tab and typecheck are something like:

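class tab(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
        # links to the neighboring components in the sum
        self.next = None
        self.prev = None

    def __add__(self, other):
        if not hasattr(other, '__add__'):
            raise ValueError('"{}" must also be composable'.format(other))
        # chain the two components together and return the right-hand one,
        # so that a + b + c keeps extending the same chain
        self.next = other
        other.prev = self
        return other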

Then you could basically recover something analogous to the dictionary in proposal one
by walking through the next/prev pointers. The advantages of this are chiefly visual. 

Obviously other things like lists or tuples or generic iterables could be done too.

def my_io(fname, mode : (tab('read', 'write'), typecheck(str))):
    pass

or

def my_io(fname, mode : [tab('read', 'write'), typecheck(str)]):
    pass

With some cleverness, it might be possible to add an __iter__ method to the tab class that
would let you turn the sum into a list via list(iterable), such that via polymorphism the ggplot
style and the tuple/list style could exist side-by-side.
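
A sketch of that __iter__ idea, assuming a shared base class (called `Composable` here, a hypothetical name) for tab and typecheck. Since __add__ returns the right-hand component, iteration first walks the prev pointers back to the start of the chain and then follows next.

```python
class Composable(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
        self.next = None
        self.prev = None

    def __add__(self, other):
        if not isinstance(other, Composable):
            raise ValueError('"{}" must also be composable'.format(other))
        self.next = other
        other.prev = self
        return other

    def __iter__(self):
        # Rewind to the first component of the sum ...
        node = self
        while node.prev is not None:
            node = node.prev
        # ... then yield every component in left-to-right order.
        while node is not None:
            yield node
            node = node.next


class tab(Composable):
    pass


class typecheck(Composable):
    pass


# The summed style and the tuple/list style both reduce to a list of components.
summed = list(tab('read', 'write') + typecheck(str))
listed = list((tab('read', 'write'), typecheck(str)))
print([type(c).__name__ for c in summed])
print([type(c).__name__ for c in listed])
```

One caveat with this design: the sum mutates its operands, so the same tab(...) object can't safely be reused in two different annotations.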

Basically, IMHO, using the function annotation system requires some serious thought -- probably
above my pay grade -- to do right.

(Sorry this email was so long)

-Robert

On Nov 29, 2012, at 6:38 AM, Tom Dimiduk wrote:

> That looks pretty great.  I would use something like that.  
> 
> For the filename('.txt'), it could be handy to be able to pass arbitrary globs (as for glob.glob).  You could still default to matching against the end of the string for extensions, but adding the glob support costs little (since you probably want to use glob internally anyway).  
> 
> For extra (and unnecessary) fanciness, I could also see use cases for 
> from tablib import tabcompletion, instance
> class bar:
>     pass
> @tabcompletion(foo=instance(bar))
> to be able to only complete for specific types of objects for other parameters (it would do an isinstance test).  
> 
> Even more bonus points if the decorator could parse numpy styled docstrings to grab that kind of information about parameters.  I guess at this point you could do type checking, file existence checking, and a variety of other fun stuff there as well once you have that information, but that is almost certainly going out of the scope of your proposal.  
> 
> Sorry if I am growing your proposal too much, the basic thing you proposed would still be very useful.  If I can grab some spare mental cycles, I would collaborate with you on it if you end up writing it.  
> 
> Tom
> 
> On 11/29/2012 05:28 AM, Robert McGibbon wrote:
>> Hi,
>> 
>> Good spot, Matthias. I didn't see that method was already exposed -- I was just looking at IPCompleter.matchers, which is what that method inserts into.
>> 
>> Annotations are cool, but they're not obviously composable. I worry that if I use them for this and then fix one syntax of how the annotation is parsed, and somebody else
>> is using annotations in their lib for something else, the two schemes won't be able to interoperate. Also they're py3k only.
>> 
>> My preferred syntax would be
>> 
>> from tablib import tabcompletion, filename
>> 
>> @tabcompletion(fname=filename, mode=['r', 'w'])
>> def load(fname, mode, other_argument):
>>     pass
>> 
>> or maybe with a parameterized filename to get specific extensions
>> 
>> @tabcompletion(fname=filename('.txt'))
>> def f(fname, other_argument):
>>     pass
>> 
>> Is this something that other people would be interested in?
>> 
>> -Robert
>> 
>> On Nov 29, 2012, at 2:02 AM, Matthias BUSSONNIER wrote:
>> 
>>> Hi, 
>>> 
>>> I may be wrong, but IIRC you can insert your own completer in the IPython completer chain and decide to filter the previous completion.
>>> 
>>> You should be able to have a custom completer that just forwards the previous completion in most cases,
>>> and just does a dir completion if the object is np.loadtxt in your case (or looks at __annotations__ if you wish).
>>> 
>>> I've found one reference to inserting a custom completer here
>>> http://ipython.org/ipython-doc/dev/api/generated/IPython.core.interactiveshell.html?highlight=interactiveshell#IPython.core.interactiveshell.InteractiveShell.set_custom_completer
>>> 
>>> 
>>> -- 
>>> Matthias
>>> 
>>> Le 29 nov. 2012 à 10:27, Aaron Meurer a écrit :
>>> 
>>>> I've often thought this as well.  Probably a full-blown IPEP is in
>>>> order here.  Perhaps __annotations__ would be the correct way to go
>>>> here.
>>>> 
>>>> Aaron Meurer
>>>> 
>>>> On Thu, Nov 29, 2012 at 12:58 AM, Robert McGibbon <rmcgibbo at gmail.com> wrote:
>>>>> Hey,
>>>>> 
>>>>> Tab completion in IPython is one of the things that makes it so useful,
>>>>> especially the context specific tab completion for things like "from ..."
>>>>> where only packages are shown, or obviously the special completion for attributes when
>>>>> the line contains a dot.
>>>>> 
>>>>> I use IPython for interactive data analysis a lot, and one of the most
>>>>> frequent operations is loading up data with something like numpy.loadtxt()
>>>>> or various related functions.
>>>>> 
>>>>> It would be really awesome if we could annotate functions to interact with
>>>>> the tab completion system, perhaps for instance saying that argument 0 to
>>>>> numpy.loadtxt() is supposed to be a filename, so let's give tab-complete
>>>>> suggestions that try to look for directories/files. Some functions only
>>>>> accept files with specific extensions, so you could filter based on that or
>>>>> whatever.
>>>>> 
>>>>> By hacking on the code for completerlib.py:cd_completer, I sketched out a
>>>>> little demo of what you could do with this: https://gist.github.com/4167151.
>>>>> The architecture is totally wrong, but it lets you get behavior like:
>>>>> 
>>>>> ```
>>>>> In [1]: ls
>>>>> datfile.dat  dir1/        dir2/        file.gz      random_junk  test.py
>>>>> 
>>>>> In [2]: directory_as_a_variable = 'sdfsfsd'
>>>>> 
>>>>> In [3]: f = np.loadtxt(<TAB>
>>>>> datfile.dat  dir1/        dir2/        file.gz
>>>>> 
>>>>> In [4]: normal_function(<TAB>
>>>>> Display all 330 possibilities? (y or n)
>>>>> 
>>>>> In [5]: g = np.loadtxt(di<TAB>
>>>>> dict                      dir1/                     directory_as_a_variable   divmod
>>>>> dir                       dir2/                     directory_of_my_choosing
>>>>> ```
>>>>> 
>>>>> Basically hitting the tab completion, when np.loadtxt is on the input line,
>>>>> only shows directories and files that end with a certain extension. If you
>>>>> start to type in the name of an object in your namespace, it'll show up too,
>>>>> but only once you've typed in more than 1 matching character.
>>>>> 
>>>>> The implementation in my gist is pretty lame. The way I've coded it up, the
>>>>> special behavior is based on simply finding the string "np.loadtxt" on the
>>>>> input line, not on the actual function. This means you can't really make the
>>>>> behavior specific to your position in the argument list (i.e. I know that
>>>>> the first arg is a filename, and so should be tab completed like this, but
>>>>> the other ones are not). I suspect the right way to do the implementation is
>>>>> via function decorators to specify the behavior and then adding to
>>>>> IPCompleter instead.
>>>>> 
>>>>> I think I'm up for giving this a shot.
>>>>> 
>>>>> Thoughts? Is this a feature anyone else would find interesting?
>>>>> 
>>>>> -Robert
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> IPython-dev mailing list
>>>>> IPython-dev at scipy.org
>>>>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>>>> 
