[Python-ideas] Better command line version of python -c

Fri Jan 9 06:46:10 CET 2015

On Thursday, January 8, 2015 9:02 PM, Russell Stewart <Russell.S.Stewart at gmail.com> wrote:

>I think if you actually used one of the tools in the space for a little while (be it pythonpy or spy, which seems good, or the not so great pyp), you may come to realize why your intuition for the user interface is so flawed.

Which part of it is flawed?

After playing with it a bit, I still think that `-i` would be better than `--i` and that `-fx` feels weird; that Cygwin users do care about the command line; that I'd like a daemon mode (or some other magic way to access `_`); that if JSON is useful, it would be more useful to parse a stream of multi-line JSON objects than to require one object per line; that I have no need for any CSV options besides delimiter or dialect; that I don't understand what you were trying to say in your iterable-vs.-list paragraph; etc.
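
To make the JSON point concrete, here is a minimal sketch of looping over `raw_decode` to pull a sequence of (possibly multi-line) JSON values out of one string. The function name `iter_json_values` is made up for illustration and is not part of pythonpy. It also assumes the whole input is already in memory; a real streaming version would have to buffer partial input.

```python
import json


def iter_json_values(text):
    """Yield each JSON value found in text, in order.

    Values may span multiple lines and need not be one per line.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the decoded value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value
```

Something like `for obj in iter_json_values(sys.stdin.read()): ...` is not something I'd want to type as a one-liner, which is why it seems worth having built in.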

And I'm even _more_ convinced that I'd like a shortcut for sys.stdin as an iterator rather than as a list (and for iterators to be printed line by line as lists are, but lazily--except that I think you already _do_ that, even though the docs say otherwise); it's easy to stall the `-l` pipeline with a huge input, which other Unix tools don't have a problem with.
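
Roughly what I have in mind, as a sketch only: the helper names `lazy_filter` and `print_lines` are hypothetical and say nothing about how pythonpy is actually implemented. The point is that each line is read, tested, and written before the next one is read, so nothing waits for the whole input.

```python
import sys


def lazy_filter(lines, predicate):
    """Yield matching lines one at a time, never building a list."""
    for line in lines:
        line = line.rstrip("\n")
        if predicate(line):
            yield line


def print_lines(iterable, out=sys.stdout):
    """Write each item on its own line as soon as it is produced."""
    for item in iterable:
        out.write(f"{item}\n")
```

With these, `print_lines(lazy_filter(sys.stdin, lambda s: "foo" in s))` starts producing output after the first matching line, whereas anything built on `list(sys.stdin)` has to read to EOF first.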

More generally, it would help if you'd reply inline rather than top-posting; it's hard to figure out what your replies are in reply to.

>On Thu, Jan 8, 2015 at 8:27 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
>
>On Jan 8, 2015, at 15:14, Russell Stewart <Russell.S.Stewart at gmail.com> wrote:
>>
>>
>>[lots of unmarked snips below]
>>
>>
>>Json handling:
>>>I think I would remove this if I did it again. I originally found it quite useful, but the fact that it doesn't support multi-line json inputs makes it weird. Also, all the functionality can be accessed through the json module with python (e.g. py -l 'json.loads(l)').
>>
>>
>>But, as I pointed out, it would be pretty easy to make it handle a stream of multi-line JSON inputs by looping over raw_decode--and that would _not_ be easy to access in a one-liner by the user. So with that change, it could be useful.
>>
>>-csv option:
>>>I considered this at length, and originally the --si was motivated by commas. But ultimately --si is good for lots of things other than commas. A separate csv command seems nice, until you realize that there are too many options to specify in parameters, just like in the json case.
>>
>>
>>I don't understand what you mean here. In the JSON case, there aren't any options to specify at all. In the CSV case, there are lots of dialect options, but most of them are rarely used; being able to specify a delimiter (which you already need for --si) or a named dialect covers 90% of what people need.
>>
>>py vs python -m pythonpy ...:
>>>Yes, using this with python -m is a very real option. People could simply alias py='python -m pythonpy' to avoid typing issues. This is probably the most straightforward distribution option. It would be great to keep the ipython style tab completion, but I'm not sure that it's necessary.
>>
>>
>>I don't understand what you mean about tab completion. If you're talking about having bash (or zsh or whatever) do IPython-style tab completion in the expression, you can write a completion script that sets a variable when it sees "-m pythonpy" and, if that variable is set, completes non-option arguments with your Python completion instead of with files.
>>
>>
>>The only trick with the alias is that you have to pass Python's arguments before the -m and pythonpy's after it… but I suspect that most people will never need Python arguments, and those who do will need them often enough that they'll just set up a separate alias for `python -u -m pythonpy`.
>>
>>py -fx flag:
>>>Originally, I didn't want to name this -f because I didn't want the active variable x to be magically introduced. When you type -fx, it's clear where the variable x comes from. This was probably a mistake, because everyone hates 2 character single-dash flags.
>>
>>
>>It's not that everyone hates 2 character single-dash flags, it's that everyone hates inconsistency. Your script is _almost_ compatible with the GNU standard (single-char single-dash arguments and multi-char double-dash arguments), but in a few places (-fx instead of --fx, --i instead of -i) you violate it. Why not just follow it consistently?
>>
>>Should be changed in the case of an overhaul. This command is also only marginally useful, due to the fact that grep already handles this use case so well, and because you can get the functionality with py -x 'x if foo(x) else None' just as well.
>>
>>
>>Does that mean your script doesn't print out None values? I can see how that's useful in some cases, but it could be pretty bad--and surprising--in others.
>>
>>pythonpy should be Iterable based, not list-based:
>>>I agree somewhat. The current dictionary functionality is weird, only printing the list of keys. Again, I don't want to change this for backwards compat reasons. This could be done a lot of different ways, and I think the current strategy is pretty good, but leaves a little bit to be desired in terms of simplicity. The -x and -l flags capture a lot of the list-printing that you would want to do, so it's not clear you need to do list-printing with no arguments. Using a separate flag for list printing would be a good design consideration here. Keep in mind that the current motivation for all the list strangeness is so that pythonpy will play well with unix pipes, which it does. That is really important for some core use cases.
>>
>>I don't understand this paragraph. What does any of this have to do with just changing all the stuff that requires lists to require any iterable, and/or changing -l to give you an iterator instead of a list?
>>
>>
>>In fact, the reason I want to be able to output any iterable--in particular, an iterator--line by line without having to convert to a list is to use it with standard Unix pipelines. If I have a huge input file to filter, I'd rather use, say, a generator expression and have it read and output lines one at a time than have to process the whole thing into a list in memory before outputting anything. (Yes, I realize that -x covers many of the same cases--but just as map doesn't make genexprs useless in scripts, I don't think -x would make them useless in one-liners.)
>>
>>
>>Daemon mode:
>>>I hadn't thought of this one before. Maybe it could be done well, but daemon modes tend to incur a high cost of mental overhead on the user.
>>
>>
>>If designed right, there really isn't any overhead as long as either (a) the user doesn't mind leaving it running forever, or (b) there's a timeout in case you forget to kill it. Then I'd just alias `py` to `python -m pythonpy --daemon=300` and everything would work exactly the same as before, except that it would start faster, and I could use variables set by `-c`/`-C` or the previous result (as `_`) if I wanted to.
>>
>>I've worked hard to keep the startup time within a factor of 2 from the python interpreter, so I do care a lot about that.
>>
>>
>>I know "Windows users don't care about the command line"--but Cygwin users do, and they still have to deal with the Windows process launch delay, and there are some Unixes that people still use that can't launch processes as fast as Linux or BSD.
>>
>>--i flag:
>>>Again, the naming here is up for change. It's useful in a few cases, where python throws exceptions easily (e.g. py -x re.match(r"b", x).group(0)). But I haven't used it all that much, and it really could be removed.
>>
>>
>>I wasn't arguing that it was useless--on the contrary, I think I'd use it often. (Although again, I'd also like a "short exception output" option.) I was just suggesting that it should be `-i` rather than `--i` (both to stick with the standard you're almost following, and to save a keystroke on something I think would be useful often, but not often enough to make a new alias for...).
>>
>>So in general I like the python -m idea. Support for tab completion under this would be really nice. I've long thought that python -c and python -m should have ipython style tab completion anyways, 
>>
>>
>>On my system, `python -m` does have tab completion (although it's not as smart as it could be--e.g., even with -s it still searches user site for completions). I'm not sure what package this comes from, but it's pretty trivial.
>>
>>
>>Making `python -c` do completion for Python statements is cool, it just requires someone writing the completer. It sounds like you've already written one for pythonpy; it's just a matter of modifying it to handle statements instead of expressions and hooking it to the `-c` argument of your python completion. That seems like it would be a useful thing in its own right (although I'm not sure the bash-completions and zsh people would take it upstream, so it might have to be a separate package).
>>
>>so that one could type python -m Sim<tab> and get python -m SimpleHTTPServer. Another idea would be to try and get a shorter binary name distributed with python. It sounds superficial, but is ultimately quite important if python wants to compete with ruby, perl, and sed.
>>
>>
>>I really don't think that's important. The reason people use perl and sed isn't that the command is 2 or 3 characters shorter (if it were, they'd just create an alias), it's that many trivial one-liners are more verbose in Python. For example, consider looping over each line in each filename passed on the command line. In perl that's a few characters; in python (even pythonpy) you have to loop over fileinput.input to do the same. And now imagine that what you want to do with each line is an re.sub; in Python, just the quoting is longer than the whole thing in Perl. In general, this isn't a weakness (it's what makes Python code readable), but if you're really interested in "competing with Perl" on its own terms, those are the parts you have to compete with, not the name of the command.
>>
>>
>>
>>
>>On Thu, Jan 8, 2015 at 2:35 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
>>>
>>>On Thursday, January 8, 2015 1:11 PM, Russell Stewart <Russell.S.Stewart at gmail.com> wrote:
>>>>
>>>>
>>>>>I am the original dev for pypi.python.org/pypi/pythonpy, which offers much of the same functionality as python -c, but in a more convenient form. I'd be really interested in figuring out whether the core python tools could support many of the use cases that pythonpy covers, so that users wouldn't have to install it as a separate tool. I don't have a concrete idea of what it would look like, but many of my users have said they would much prefer an officially supported tool. Perhaps someone on this mailing list can provide some insight into the available options.
>>>>
>>>>This looks pretty cool to me—so I have lots of questions and comments. :)
>>>>
>>>>The obvious problem is what to call it. `py` (well, `py.exe`) is already the name of the PEP 397 launcher tool (the thing that lets you use shebang lines on Windows), so this would need a new name.
>>>>
>>>>Maybe it would be better as just a different flag or set of flags on `python` itself, like `--pprint`/`-p`? Of course then there's the problem of where to stick the flags for `-p` (for GNU-style long names this is pretty easy: `--print=filter,lines`, but obviously for something meant to be used for quick&dirty command-line usage you need short names as well…). And the "backport" to 3.4 (and 2.x) would act differently from the standard 3.5+ version. But it might be worth looking at anyway.
>>>>
>>>>
>>>>Or, of course, just make this a stdlib module, so it's just `python [PYTHON OPTIONS] -m YOUR_MODULENAME [YOUR MODULE OPTIONS]`, and you can alias that to whatever you want.
>>>>
>>>>Is `-fx` a multi-character single-hyphen flag (in which case it's very weird to mix those with GNU-style long arguments in the same program), or is it a combination of `-f` and `-x`? And is the filter expression an argument to `-fx`, or does `-f` just change the interpretation of the argument?
>>>>
>>>>
>>>>Speaking of GNU-style long arguments, why `--i` instead of just `-i` (as a short name for `--ignore_exceptions` or something)?
>>>>
>>>>The documentation is pretty Python 2-specific: the majority of the examples use the `range` function and depend on it returning a list.
>>>>
>>>>
>>>>You seem to have built a non-trivial custom pretty-printer for this tool (e.g., to print lists row by row) as well as an auto-importer (e.g., to use `collections.Counter` without an `import collections`); maybe some or all of that should be separated out and made available to Python code somewhere in the stdlib, and then the script (or flag) could just use that function?
>>>>
>>>>This seems to be pretty strongly list-based for no good reason. Why not print _any_ iterable line by line, make -l set `l = sys.stdin` instead of `l = list(sys.stdin)` (I realize that doesn't work for the specific example of `py -l 'l[::-1]'`, but `py -l reversed(l)` would work just as well in that case, and it's hard to think of other examples where there would even be a problem), etc.?
>>>>
>>>>It might be nice to have an option to see the `repr` instead of the `str` of everything, to match what you'd see from the interactive interpreter.
>>>>
>>>>While --si is cool, people already misuse `str.split` and `re.split` to try to parse CSV and similar input that has quotes or escapes; it might be nice to have a `--csv` mode to parse the input with `csv.reader` (and that could still take an optional delimiter, of course; passing other dialect flags to `reader` might be out of scope).
>>>>
>>>>A columnar reading mode might also be nice, since that's one of those things that novices have a hard time writing without statements.
>>>>
>>>>An option for pretty JSON output instead of compact JSON output might be nice.
>>>>
>>>>And maybe an option to read multi-line JSON input: not just a single JSON value (which is trivial with `json.loads(l)`) but, since JSON is self-delimiting, a stream of them, without the requirement of one per line (e.g., by looping over `json.JSONDecoder.raw_decode`).
>>>>
>>>>I assume you're evaluating the expression with `eval`. Are you passing it a custom locals and globals (so any internals of your script itself aren't available), or not? I could see disadvantages of both (the former may prevent some useful quick&dirty hacks; the latter could raise safety concerns), but either way, it's probably worth documenting.
>>>>
>>>>From the summary help, it looks like you can apply both `-x` and `-fx`. If you do that, do they run in a specific order, or in the order specified? For that matter, can you apply multiple mappings and/or filterings? (If not, it seems like that could be handy.)
>>>>
>>>>It might be useful to have some kind of "daemon mode", where you start up an interpreter in the background and then pipe input to it. Then you can pipe multiple things to the same interpreter session, both for persistence, and to avoid the process startup cost on platforms where it matters (like Windows). Maybe with an optional timeout, where the daemon stops a few minutes after last use, so you don't have to remember to `py -d sys.exit(0)` or similar. And maybe the magic `_` should work in daemon mode, as it does in the interactive interpreter. (I could see writing some one-liner that takes 90 seconds to run, then realizing I wanted it in JSON format, so `py --daemon -jo _` would be handy to avoid re-running everything…)
>>>>
>>>>It would be really cool if, when run under PowerShell, this could handle scriptlet input and output instead of plain text (so, e.g., you pass it an array of strings and it processes each string).
>>>>
>>>>Being able to somehow pass command-line arguments through to `python` itself (assuming this isn't merged into the main interpreter, of course) might be handy. In particular, I could see `-u` being useful, but there might be others (including platform/implementation-specific `-X` options). Of course that would come for free with the `-m` interface, but since I'm not sure that's the best option...
>>>>
>>>>Instead of just being able to ignore exceptions, it might be nice to enable one-line exception output (e.g., print just the type and message, no traceback, and to stdout instead of stderr).
>>>>
>>>>With `-x`, an option to prefix each line of output with the corresponding line of input could be handy, similar to the `-v` option to commands like `cp`, `tar`, etc.
>>>>
>>>>A way to feed input files in could be handy to allow the kind of one-liners people often fall back to perl for, although I'm not sure exactly what you'd want this to look like.
>>>>
>>>>Some of these are probably way out of scope even for a third-party tool, much less for a built-in stdlib version, but I'm not sure exactly which, so I've just dumped everything on you to let you sort them out. :)
>>>>
>>>
>
>
>