On 18 September 2014 18:01, Andrew Barnert
> On Sep 17, 2014, at 23:15, Nick Coghlan <ncoghlan(a)gmail.com> wrote:
>> However, now that CPython ships with pip by default, we may want to
>> consider providing more explicit pointers to such "If you want more
>> advanced functionality than the standard library provides" libraries.
> I love this idea, but there's one big potential problem, and one smaller one.
> Many of the most popular and useful packages require C extensions. In itself,
> that doesn't have to be a problem; if you provide wheels for the official 3.4+ Win32,
> Win64, and Mac64 CPython builds, it can still be as simple as `pip install spam`
> for most users, including the ones with the least ability to figure it out for themselves.
OK, the key thing to look at here is the user experience for someone
who has Python installed, and has a job to do, but needs to branch out
into external packages because the stdlib doesn't provide enough functionality.
To make this example concrete, I'll focus on a specific use case,
which I believe is relatively common, although I can't back this up
with hard data.
* A user who is comfortable with Python, or with scripting languages in general
* No licensing or connectivity issues to worry about
* An existing manual process that the user wants to automate
In my line of work, this constitutes the vast bulk of Python use -
informal, simple automation scripts.
So I'm writing this script, and I discover I need to do something that
the stdlib doesn't cover, but I feel like it should be available "out
there", and it's sufficiently fiddly that I'd prefer not to write it
myself. Examples I've come across in the past:
* A console progress bar
* Scraping some data off a web page
* Writing data into an Excel spreadsheet with formatting
* Querying an Oracle database
Every time an issue like this comes up, I know that I'm looking to do
"pip install XXX". It's working out what XXX is that's the problem.
So I go and ask Google. A quick check on the progress bar case gets me
to a StackOverflow question that offers a lot of "write it yourself"
solutions, plus pointers to a couple of libraries. Further down there
are a few pointers to python-progressbar, which in turn leads me to
its PyPI page.
The latest version (2.3-dev) is not hosted on PyPI, so I hit all the
fun of --allow-external.
Ironically, "pip install tqdm" gives me what I want instantly. But it
never came up via Google.
The rest of the cases are similar, lots of Google searching, often
combined with evaluating multiple options, followed by more or less
pain installing the software. Things that aren't Python 3 or Windows
compatible suck me into the "shall I patch it and submit a PR?" dilemma.
For the last case (an Oracle driver), where I need a C extension and
access to external libraries, ironically it's pretty easy. There's no
real competition to cx_Oracle, and the PyPI page has what I need,
although they ship wininst exes rather than wheels, which means I need
to do a download, then a wheel convert, then a pip install. Not ideal,
but doable.
From this example, I'd like to see the following improvements to the process:
1. Somewhere I can go to find useful modules, that's better than Google.
2. Someone else choosing the "best option" - I don't want to evaluate
3 different progressbar modules, I just want to write "57% complete"
and a few dots!
3. C extensions aren't a huge problem to me on Windows, although I'm
looking forward to the day when everyone distributes wheels (wheel
convert is good enough for now though). 
4. Much more community pressure for projects to host their code on
PyPI. Some projects have genuine issues with hosting on PyPI, and
there are changes being looked at to support them, but for most
projects it seems to just be history and inertia.
A Linux/OS X user might have more issues with C extensions.
Maybe this can't be solved in any meaningful sense, and maybe it's not
something the "Python project" should take responsibility for, but
without any doubt, it's the single most significant improvement that
could be made to my experience with PyPI.
PS I should also note that even in its current state, PyPI is streets
ahead of the 3rd party module story I've experienced for any other
language - C/C++, Lua, Powershell, and Java are all far worse.
Perl/CPAN may be as good or better, it's so long since I used Perl
that I don't really know these days.
Why did the CPython core developers decide to force the display of
ASCII characters in the printable representation of bytes objects in
CPython 3? For example:
>>> import struct
>>> # In go bytes for four floats:
>>> my_packed_bytes = struct.pack('ffff', 3.544294848931151e-12,
... 1.853266900760489e+25, 1.6215185358725202e-19, 0.9742483496665955)
>>> # And out comes a speciously human-readable representation:
>>> my_packed_bytes
b'Why, Guido? Why?'
>>> # But it's just an illusion; it's truly bytes underneath!
>>> a_reasonable_representation = bytes((0x57, 0x68, 0x79, 0x2c,
... 0x20, 0x47, 0x75, 0x69, 0x64, 0x6f, 0x3f, 0x20, 0x57, 0x68, 0x79, 0x3f))
>>> my_packed_bytes == a_reasonable_representation
True
>>> this_also_seems_reasonable = b'\x57\x68\x79\x2c\x20\x47\x75\x69\x64\x6f\x3f\x20\x57\x68\x79\x3f'
>>> my_packed_bytes == this_also_seems_reasonable
True
I understand bytes literals were brought into Python 3 to aid the
transition from Python 2 to Python 3, but that does not imply that
`repr()` on a bytes object ought to display bytes mapping to ASCII
characters as ASCII characters. I have not yet found a PEP describing
why this decision was made. I am now seeking to put forth a PEP to
change the printable representation of bytes to be simple, consistent,
and easy to understand.
The current behavior, displaying elements of bytes that map to
printable ASCII characters as those characters, seems to violate
multiple tenets of the Zen of Python:
* "Explicit is better than implicit." This display happens without the
user's explicit request.
* "Simple is better than complex." The printable representation of
bytes is complex, surprising, and unintuitive: elements of bytes shall
be displayed as their hexadecimal value, unless that value maps to a
printable ASCII character, in which case the character shall be
displayed instead of the hexadecimal value. The underlying values of
each element, however, are always integers, and indexing a bytes
object always yields an integer. The simple thing is to show the hex
value for every byte.
* "Special cases aren't special enough to break the rules." Implicit
decoding of bytes to ASCII characters comes in handy only some of the time.
* "In the face of ambiguity, refuse the temptation to guess." Python
is guessing that I want to see some bytes as ASCII characters. In the
example above, though, what I gave Python was bytes from four
floating-point numbers.
* "There should be one-- and preferably only one --obvious way to do
it." `bytes.decode('ascii', errors='backslashreplace')` already
provides users the means to display ASCII characters among bytes, as a
string.
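For illustration, here is a minimal sketch of what the proposed all-hex representation could look like; `hex_repr` is a hypothetical helper of my own, not an existing API:

```python
def hex_repr(data):
    # hypothetical helper: render every byte as \xNN,
    # with no ASCII special-casing, as the proposal suggests
    return "b'" + "".join("\\x%02x" % b for b in data) + "'"

print(hex_repr(b'Why?'))  # b'\x57\x68\x79\x3f'
print(repr(b'Why?'))      # b'Why?' -- the current, character-based form
```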
To be fair, there are two tenets of the Zen of Python that support
the implicit display of ASCII characters in bytes:
* "Readability counts."
* "Although practicality beats purity."
In counterargument, though, I would say that the extra readability and
practicality only apply in special cases (which are not special enough
to break the rules).
Much ado was (and continues to be) raised over Python 3 enforcing
distinction between (Unicode) strings and bytes. A lot of this
resentment comes from Python programmers who do not yet appreciate the
difference between bytes and text†, or from those who remain apathetic
and prefer Python 2's it-works-'til-it-doesn't strings. This implicit
displaying of ASCII characters in bytes conflates the two data types
even more deeply in novice programmers' minds.
In the example above, `my_packed_bytes` looks like a string. It reads
like a string. But it is not a string. The ASCII characters are a lie,
as evidenced when trying to access elements of a bytes instance:
>>> b'Why, Guido? Why?'[0]
87
>>> # Oh, perhaps you were expecting b'W'?
I find this behavior harmful to Python 3 advocacy, and novices and
those accustomed to Python 2 find this yet another deterrent in the
way of Python 3 adoption.
I would like to gauge the feasibility of a PEP to change the printable
representation of bytes in CPython 3 to display all elements by their
hexadecimal values, and only by their hexadecimal values.
† I write this as someone who, himself, didn't appreciate or
understand the difference between bytes, strings, and Unicode. I have
Ned Batchelder and his illuminating "Pragmatic Unicode" presentation
to thank for getting me on the right path.
Please see this discussion on python-list:
Currently `float('inf') // 1` is equal to NaN. I think that this is really
weird. If I understand correctly, it's to maintain the invariant `div*y +
mod == x`. The question is, do we really care more about maintaining this
invariant than about providing a mathematically reasonable value for floor
division?
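A quick session illustrating the current behavior (note that even math.floor cannot return an integer for an infinite quotient):

```python
import math

x, y = float('inf'), 1.0
print(x // y)             # nan under current behavior
q, r = divmod(x, y)
print(q, r)               # nan nan: q*y + r == x cannot hold for inf anyway
try:
    math.floor(x / y)     # floor(inf) has no integer value
except OverflowError as e:
    print(e)
```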
It seems that this didn't reach the list directly (see https://mail.python.org/pipermail/python-ideas/2014-August/028956.html), so I'm resending:
Erik Bray (the author of the +FLOAT_CMP extension in Astropy), Bruce Leban, and I had a short off-thread email discussion. Here are the points:
- [Bruce]: ALMOST_EQUAL is the best flag name.
- [Erik]: If there's agreement on this, Erik will develop a patch as soon as he can.
- [Erik]: There's no way to adjust the tolerance because there seems to be no easy way to parameterize doctest flags. Ideas are welcome.
- [Erik]: Still, "This +FLOAT_CMP flag enabled removing tons of ellipses from the test outputs [of Astropy], and restoring the full outputs, which certainly read better in the docs... For more complete unit tests of course we use assert_almost_equal type functions."
- [Erik]: This PR is a better link than the one I gave: https://github.com/astropy/astropy/pull/2087
- [Erik]: Most of the code is from the SymPy project with improvements. Erik had started on a similar feature when he found that their implementation was further developed.
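As a rough sketch of how such a flag could plug into doctest (the class name and the fixed relative tolerance here are my own assumptions, not Erik's patch):

```python
import doctest

FLOAT_CMP = doctest.register_optionflag('FLOAT_CMP')

class FloatOutputChecker(doctest.OutputChecker):
    # sketch: when FLOAT_CMP is set and both strings parse as floats,
    # compare with a fixed relative tolerance instead of exact text match
    def check_output(self, want, got, optionflags):
        if optionflags & FLOAT_CMP:
            try:
                w, g = float(want), float(got)
                return abs(w - g) <= 1e-8 * max(1.0, abs(w))
            except ValueError:
                pass
        return doctest.OutputChecker.check_output(self, want, got, optionflags)

checker = FloatOutputChecker()
print(checker.check_output("0.3333333333", "0.3333333333333333", FLOAT_CMP))  # True
print(checker.check_output("0.5", "0.75", FLOAT_CMP))                          # False
```

Parameterizing the tolerance per-example remains the open problem Erik mentioned, since doctest option flags are simple bit flags.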
Hi Andrew, Hi List,
> [ some discussion about calling yield from from the command line skipped ]
> I would love to see this. I'm not sure if I'd love it in practice or not, but until
> someone implements it and I can play with it I'm not sure how I'd become sure.
> So... You just volunteered, right? Go build it and put it on PyPI, I want it and
> I'll be your best friend forever and ever no takebacks if you do it. :)
Well, so I did: I wrote an IPython extension that does it and put it up on
PyPI. It's more a mock-up of how it should actually look than a finished
product, but it works.
So now you can write on the command line stuff like:
>>> %load_ext yf
>>> from asyncio import sleep, async
>>> def f():
...     yield from sleep(3)
...
>>> yield from f()
#[wait three seconds]
>>> #[wait three seconds, or type other commands] done
So as you see, the event loop runs while you are typing commands,
and while they are executed.
One of the problems with new Python programmers using 3.x is that they
first read 'print x' in 2.x based material, try 'print x' in 3.x, get
"SyntaxError: invalid syntax" (note the uninformative redundant
message), and go "huh?" or worse.
Would it be possible to detect this particular error and print a
more useful message? I am thinking of something like
SyntaxError: calling the 'print' function requires ()s, as in "print(x)"
SyntaxError: did you mean "print(...)"?
I was 'inspired' by a recent SO question
which was closed as a duplicate of the 2009 question
I imagine that there have been other duplicates. The same question (and
answer) has appeared multiple times on python-list also.
If we do this, I am sure someone will ask why we do not automatically
'fix' the error. One answer would be that the closing ) is needed to
determine the intended end of the call. A longer version would be that
if we insert (, we are just guessing that the insertion is correct and
we still would not know, without guessing, where to put the ).
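To make the idea concrete, here is a toy heuristic along these lines; it is purely illustrative (CPython would do this inside the parser's error path, not with a regex, and the function name is my own):

```python
import re

def print_statement_hint(line):
    # illustrative only: flag "print x"-style statements
    # that lack parentheses after the name
    if re.match(r"\s*print\s+[^(\s]", line):
        return 'did you mean "print(...)"?'
    return None

print(print_statement_hint("print x"))    # did you mean "print(...)"?
print(print_statement_hint("print(x)"))   # None
```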
Terry Jan Reedy
>>> from collections import namedtuple
>>> A = namedtuple("A", ["foo"])
>>> A(foo=1)
A(foo=1)
The relevant code is
I propose we bring this behavior to regular classes.
>>> class A(object):
...     def __init__(self):
...         self.foo = 1
...
>>> repr(A())
'<__main__.A object at 0x1090c0990>'
We should be able to see the current attribute values in the display.
1. Helps debugging (via pdb, print and logging). We no longer have to do
A().foo to find out.
2. I don't know how often people actually rely on repr(A()) or str(A()) and
parse the string, so the risk of breaking compatibility is probably low.
3. People who wish to define their own repr and str are welcome to do
so. Django models, for example, have a more explicit representation by
default (although many Django users do redefine the representation on
their own). datetime.datetime, as a library, is also explicit by
default. So customization will still happen.
The main challenge:
Where and how do we actually look for what attributes are relevant?
namedtuple can do it because it has __slots__ and we know in advance how
many attributes are set. In regular classes, we deal with dynamic attribute
setting and with single and multiple inheritance. I don't have an answer
for this, simply because I lack experience here. We can certainly start
with the attributes set on the instance itself and one level up in the
inheritance chain.
1. What if there are too many attributes? I don't think the number will
explode beyond 30. I chose this number out of thin air; I can do more
research on this. It doesn't actually hurt to see everything. If you do
have a class with that many attributes (whether you have that many to begin
with, or because you allow arbitrary numbers of attributes to be set -- for
example, a document from a collection in a NoSQL store like MongoDB), the
display is still very useful. We could cap the number shown by default.
2. How do we order them? We can show them unsorted or sorted. I
prefer sorted order.
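As a starting point, one possible sketch of the proposed default (the mixin name is mine, and using vars() plus sorting are the assumptions discussed above):

```python
class AutoRepr:
    # hypothetical mixin: show the instance's attributes,
    # sorted by name, in a namedtuple-like format
    def __repr__(self):
        attrs = ", ".join("%s=%r" % kv for kv in sorted(vars(self).items()))
        return "%s(%s)" % (type(self).__name__, attrs)

class A(AutoRepr):
    def __init__(self):
        self.foo = 1
        self.bar = "x"

print(repr(A()))  # A(bar='x', foo=1)
```

Note this only covers instance attributes; slots and inherited class attributes would need extra handling, per the challenge above.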
Since there seemed to be some interest in my idea of an
asyncio-enabled command line, I just sat down and wrote
it. I submitted the parts that would need to go into CPython
as Issue 22412 to the Python bug tracker. I added a simple
command line interpreter, based on code.InteractiveConsole,
which will allow for uses like
>>> from asyncio import sleep
>>> yield from sleep(10)
The following code is mostly a copy of InteractiveConsole, with
the appropriate yield froms stuck in (and comments removed):

import sys
from asyncio import get_event_loop, coroutine, input  # 'input' comes from the Issue 22412 patch
from code import InteractiveConsole

class AsyncConsole(InteractiveConsole):
    def __init__(self, locals=None, filename="<console>"):
        super().__init__(locals, filename)
        self.compile.compiler.flags |= 0x1000  # new compile flag from the patch

    @coroutine
    def runsource(self, source, filename="<input>", symbol="single"):
        try:
            code = self.compile(source, filename, symbol)
        except (OverflowError, SyntaxError, ValueError):
            self.showsyntaxerror(filename)
            return False
        if code is None:
            return True
        yield from self.runcode(code)
        return False

    @coroutine
    def runcode(self, code):
        yield from eval(code, self.locals)

    @coroutine
    def push(self, line):
        self.buffer.append(line)
        source = "\n".join(self.buffer)
        more = yield from self.runsource(source, self.filename)
        if not more:
            self.resetbuffer()
        return more

    @coroutine
    def interact(self, banner=None):
        sys.ps1 = ">>> "
        sys.ps2 = "... "
        cprt = 'Type "help", "copyright", "credits" or "license" for more information.'
        if banner is None:
            self.write("Python %s on %s\n%s\n(%s)\n" %
                       (sys.version, sys.platform, cprt,
                        self.__class__.__name__))
        else:
            self.write("%s\n" % str(banner))
        more = 0
        while True:
            prompt = sys.ps2 if more else sys.ps1
            try:
                line = yield from input(prompt)
            except EOFError:
                self.write("\n")
                break
            more = yield from self.push(line)

if __name__ == "__main__":
    console = AsyncConsole()
    get_event_loop().run_until_complete(console.interact())
Hi Terry, Hi List,
> I presume full behavior requires the call to root.mainloop(). This has two
> problems for continued interaction. First, the call blocks until the window
> is closed, making further entry impossible through normal means. If that
> were solved with a 'noblock' option, there would still be the problem of
> getting shell input to a callback that could, on demand, execute to code to
> modify the tk app. The solution would have to be different for the console
> interpreter, where tkinter is running in the same process, and Idle, where
> tkinter is running in a separate process.
You just gave a good argument for the advantages of asyncio. Because once
we have an asyncio-aware version of tkinter - and an asyncio-aware command
line, this is what I am proposing - all the problems you just described
disappear. So, I would call that a good use case for my idea.
A tkinter-aware commandline would then just look like:
async(commandline()) # i.e. the coroutine defined in my last post
get_event_loop().run_forever() # this calls root.mainloop()
I'm currently trying to convince my company that asyncio is a great
thing. After a lot of critique, the newest thing is, people complain:
I cannot test my code on the command line! And indeed they are
right, a simple
a = yield from some_coroutine()
is not possible on the command line, and doesn't make sense.
Wait a minute, really?
Well, it could make sense, in an asyncio-based command line.
I am thinking about a python interpreter whose internal loop is
cmd = yield from input_async()
code = compile(cmd, "<stdin>", "generator")
yield from exec(code)
A new compile mode would allow one to always create a generator
directly, and exec should certainly be able to handle this.
I think this would not only make people happy that want to test
code on the command line, but also all those people developing
command line-GUI combinations (IPython comes to mind),
which have to keep several event loops in sync.
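Until such a compile mode exists, the idea can be approximated with today's tools. This sketch (the helper name is hypothetical, and it uses the modern async/await spelling of the same coroutine idea) wraps the typed source in a coroutine and drives it to completion, rather than interleaving it with an already-running loop as the real proposal would:

```python
import asyncio

def run_async_snippet(source, env):
    # wrap the typed statements in a coroutine function, then run it;
    # a real async REPL would instead feed this to a running event loop
    body = "".join("    " + line + "\n" for line in source.splitlines())
    exec("async def __snippet__():\n" + body, env)
    return asyncio.run(env["__snippet__"]())

result = run_async_snippet("await asyncio.sleep(0)\nreturn 6 * 7",
                           {"asyncio": asyncio})
print(result)  # 42
```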