At the moment, the array module of the standard library allows one to
create arrays of different numeric types and to initialize them from
an iterable (e.g., another array).
What's missing is the possibility to specify the final size of the
array (number of items) up front, especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, e.g. the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB of memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow
one to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array creation in such a way that you
could pass an iterable (as now) as the second argument, but if you pass a
single integer value, it should be treated as the number of items to
allocate.
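Under that proposal, preallocation might look something like this (a
hypothetical call; this is not what array.array currently accepts):

import array

# Hypothetical: a bare integer as the second argument would mean
# "allocate this many items" (contents unspecified or zero-filled).
sa = array.array("l", 6 * 10**9)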
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """Return a new array with the given typecode
    (e.g., "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0).
    """
    a = array.array(typecode, [value] * bsize)   # reusable chunk of bsize items
    x = array.array(typecode)
    r = n                                        # items still to append
    while r >= bsize:
        x.extend(a)                              # append a full chunk
        r -= bsize
    x.extend([value] * r)                        # append the remaining items
    return x
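For example, for the genome-scale case above (assuming a 64-bit platform
where "l" is 8 bytes, and roughly 48 GB of free memory):

sa = filled_array("l", 6 * 10**9)   # ~6 billion zero-initialized long ints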
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
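A minimal reproduction of the pitfall:

def foo(x, y):
    return x + y

foo('a', 'b')   # two arguments, as intended
foo('a' 'b')    # implicit concatenation: a single argument 'ab', so this
                # raises TypeError: foo() missing 1 required positional
                # argument: 'y'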
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C, but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
Would it be reasonable to start deprecating this and eventually remove
it from the language?
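For instance, CPython already folds the explicit '+' form at compile
time, so nothing would be lost by requiring it:

import dis

# The disassembly shows a single constant 'ab' being loaded; no runtime
# concatenation takes place.
dis.dis(compile("'a' + 'b'", '<string>', 'eval'))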
--Guido van Rossum (python.org/~guido)
For context please see http://bugs.python.org/issue22937.
I have two questions I'm hoping to get answered through this thread:
- Does the change in question need a PEP? Antoine seemed to think it
didn't, as long as it was opt-in (via the unittest CLI etc.).
- Is my implementation approach sound (for traceback; unittest I
think I have covered :))?
Implementation-wise, I think it's useful to work within the current
traceback module layout - that is, to alter extract_stack to
(optionally) include rendered data about locals, and then look for that
and format it in format_list.
I'm sure there is code out there that depends on the quadruple nature
of extract_stack though, so I think we need to preserve that. Three
strategies occurred to me. One is to have parallel functions: one
quadruple, one quintuple. A second is to have the return value of
extract_stack be a quintuple when a new keyword parameter
include_locals is passed. Lastly, and this is my preferred one, we
could return a tuple subclass with an attribute containing a dict with
the rendered data on the locals; this can be present but None, or even
just absent when extract_stack was not asked to include locals.
The last option is my preferred one because the other two both imply
having a data structure which is likely to break existing code - and
while you'd have to opt into having them, it seems likely to require a
systematic set of upgrades, versus having an optional attribute that can
simply be ignored.
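A rough sketch of that third option (all names here are illustrative,
not a concrete API):

class FrameSummary(tuple):
    # A (filename, lineno, name, line) quadruple that can optionally
    # carry rendered locals without breaking tuple-unpacking callers.
    def __new__(cls, filename, lineno, name, line, locals=None):
        self = super().__new__(cls, (filename, lineno, name, line))
        self.locals = locals   # dict of rendered locals, or None
        return self

# Existing quadruple consumers keep working unchanged:
filename, lineno, name, line = FrameSummary(
    'app.py', 42, 'main', 'x = compute()', locals={'x': '1'})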
So - thoughts?
Robert Collins <rbtcollins(a)hp.com>
HP Converged Cloud
I have a situation where, no matter how the routine ends, I need to return
the focus to a widget. It would be rather clever to do this with a context
manager, and I would expect something like this to be a possible strategy:
with contextlib.context(enter=None, exit=lambda *args: my_widget.setFocus()):
    do what I need to do
As far as I know, at the moment this is not possible. Am I right? I think
it would be an easy and practical addition to the contextlib module to
quickly register two routines for enter and exit.
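For what it's worth, something close to this can already be written with
contextlib.ExitStack (my_widget as in the snippet above); a minimal
sketch:

import contextlib

with contextlib.ExitStack() as stack:
    # callback() registers a function to run on exit, however the block
    # is left (normal completion, return, or exception).
    stack.callback(my_widget.setFocus)
    ...  # do what I need to do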
The problem is that even with extreme precaution, it is impossible to keep
ALL modules compatible from one version to another.
What I want to ask is this:
- Some "packs" which can, like py-compile, generate .pyc files, but using
"old" versions of the standard library and of __builtins__.
- One will go with every minor version and will be optional in the
- Any imported .py file will be able to choose which version it wants with
"#! py recommended version X.X" or
"#! py mandatory version X.X" comments at the beginning of the file.
thank you and have a nice day/evening/night.
I sent an answer to Andrew Barnert, who sent the first answer to my
question, and not to the list...
However, I do have an answer to my question.
2014-11-29 11:51 GMT+01:00 Andrew Barnert <abarnert(a)yahoo.com>:
> On Nov 29, 2014, at 0:46, Liam Marsh <liam.marsh.home(a)gmail.com> wrote:
> yes, the bug which gave me this idea is a py3.2 to py3.3 bug with VPython,
> a 3D graphics library.
> And is the bug a stdlib bug, or a language bug?
> I also thought this could be the reconciliation between py2 and py3. (This
> is why I thought the packs would include a version of the stdlib.)
> The language differences between 2.x and 3.x are huge, and most of the
> stdlib differences between 2.6 and 3.1 or 2.7 and 3.2 are related to those
> language differences. Porting code to 3.x is primarily about fixing Unicode
> stuff, or code that was already deprecated in 2.5 or 2.6 but wasn't broken
> until 3.x. Having the 2.7 stdlib in 3.x would be a huge amount of work for
> almost no benefit.
> in fact, how do .pyc files work? Were they modified by the "language and
> implementation changes"? How do they import other modules?
> .pyc files are just compiled bytecode, with a version-specific header.
> If you fixed the 2.7 stdlib to work in 3.x (which, again, would be a huge
> amount of work), you could compile it with 3.4.
> But you're missing the fact that large chunks of the stdlib are written in
> C, and compiled against the Python C API. And parts of the stdlib
> (especially builtins and the sys module) are exposing types and functions
> written in C that are part of the core implementation, so the 2.x version
> of the sys module wouldn't be compatible with 2.x code anyway.
> Also, did you mean to write just to me instead of to the list?
> thank you!
> 2014-11-29 6:07 GMT+01:00 Andrew Barnert <abarnert(a)yahoo.com>:
>> On Nov 28, 2014, at 8:29, Liam Marsh <liam.marsh.home(a)gmail.com> wrote:
>> > hello,
>> > the problem is that even with extreme precaution, it is impossible to
>> keep ALL modules compatible from one version to another.
>> Do you have a specific library or app that you've had a problem with?
>> There were a handful of modules that had a problem with the 3.2 to 3.3
>> conversion, but every one I saw was caused by language and implementation
>> changes, not stdlib changes. I don't think I've seen anything that works
>> with 3.3 but not 3.4. I'm sure it's not impossible for such a thing to
>> happen, but it would be helpful to have at least one real-life example.
>> > What I want to ask is this:
>> > - Some "packs" which can, like py-compile, generate .pyc files, but
>> using "old" versions of the standard library and of __builtins__.
>> But how would this work? The same changes that broke a handful of
>> third-party modules between 3.2 and 3.3 probably also mean that the 3.2
>> stdlib wouldn't work in 3.3 without minor changes. And as for builtins,
>> most of those are exposing internals of the implementation, so trying to
>> make the 3.2 builtins work with 3.3 would take a lot more work than just
>> building the 3.2 code against 3.3.
>> > - One will go with every minor version and will be optional in the
>> > - Any imported .py file will be able to choose which version it wants
>> with the
>> > "#! py recommended version X.X" or
>> > "#! py mandatory version X.X" comments at the beginning of the file.
>> > thank you and have a nice day/evening/night.
On Thu, Nov 27, 2014 at 09:59:45PM +0000, Charles-François Natali wrote:
> 2014-11-27 1:18 GMT+00:00 Trent Nelson <trent(a)snakebite.org>:
> > Everything else is just normal Python, nothing special -- it just conforms
> > to the current constraints of PyParallel. Basically, the
> > HttpServer.data_received() method will be invoked from parallel threads, not
> > the main interpreter thread.
> So, still no garbage collection from the threads?
Not having garbage collection has surprisingly not gotten in the way
so far, so it's not even on the radar anymore. There are other means
available for persisting objects past the lifetime of the parallel
context, and you could always use @async.call_from_main_thread if
you want the main thread's memory allocator (and thus, GC) to manage
the object.
At one point, all these tests passed, just to give you an idea of
some of the facilities that are available:
(I haven't removed any of those facilities, I just haven't spent any
time on them since switching over to the async socket stuff, so I
can't comment on their current state.)
On Wed, Nov 26, 2014 at 10:36 PM, Trent Nelson <trent(a)trent.me> wrote:
> Have you seen this?:
> I spent the first 80-ish slides on async I/O.
> (That was a year ago. I've done 2-3 sprints on it since then and have gotten it to a point where I can back up the claims with hard numbers on load testing benchmarks, demonstrated in the most recent video: https://www.youtube.com/watch?v=4L4Ww3ROuro.)
Thanks, I had never seen it before. Do you have a detailed
description of how the I/O subsystem works? From what I see, PyParallel
relies on offloading I/O operations to IOCP. And it seems that the I/O
kernel which I'm describing could replace IOCP on Unix. Allowing the
C code to do more high-level stuff (like reconnecting and message
delimitation) would benefit PyParallel too, as it would allow
fewer heap allocations, a smaller number of heap snapshots, etc.
Still, PyParallel is a much bigger departure from current Python, as it
doesn't allow arbitrary code to run in threads; so it's a much longer-term
prospect than the I/O kernel, which will be just a library on PyPI.
Title: Change StopIteration handling inside generators
Author: Chris Angelico <rosuav(a)gmail.com>
Type: Standards Track
This PEP proposes a semantic change to ``StopIteration`` when raised
inside a generator, unifying the behaviour of list comprehensions and
generator expressions somewhat.
The interaction of generators and ``StopIteration`` is currently
somewhat surprising, and can conceal obscure bugs. An unexpected
exception should not result in subtly altered behaviour, but should
cause a noisy and easily-debugged traceback. Currently,
``StopIteration`` can be absorbed by the generator construct.
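A minimal illustration of the absorption, under the current semantics::

    def f():
        raise StopIteration

    print(list(x for x in [1] if f()))  # prints []: the generator frame
                                        # absorbs the StopIteration
    [x for x in [1] if f()]             # raises StopIteration to the caller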
If a ``StopIteration`` is about to bubble out of a generator frame, it
is replaced with some other exception (maybe ``RuntimeError``, maybe a
new custom ``Exception`` subclass, but *not* deriving from
``StopIteration``) which causes the ``next()`` call (which invoked the
generator) to fail, passing that exception out. From then on it's
just like any old exception.
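In miniature, assuming ``RuntimeError`` were the chosen replacement::

    def gen():
        yield 1
        raise StopIteration   # an unexpected StopIteration escaping the frame

    list(gen())   # currently [1], silently; under the proposal this fails
                  # loudly, e.g. RuntimeError: generator raised StopIteration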
Consequences to existing code
This change will affect existing code that depends on
``StopIteration`` bubbling up. The pure Python reference
implementation of ``groupby`` currently has comments "Exit on
``StopIteration``" where it is expected that the exception will
propagate and then be handled. This will be unusual, but not unknown,
and such constructs will fail.
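The affected pattern, in miniature::

    def pairs(iterable):
        it = iter(iterable)
        while True:
            a = next(it)      # StopIteration here currently ends the generator
            yield a, next(it)

    list(pairs([1, 2, 3, 4]))   # [(1, 2), (3, 4)]
    list(pairs([1, 2, 3]))      # currently [(1, 2)]; under the proposal the
                                # escaping StopIteration becomes an error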
(Nick Coghlan comments: """If you wanted to factor out a helper
function that terminated the generator you'd have to do "return
yield from helper()" rather than just "helper()".""")
As this can break code, it is proposed to utilize the ``__future__``
mechanism to introduce this, finally making it standard in Python 3.6.
Supplying a specific exception to raise on return
Nick Coghlan suggested a means of providing a specific
``StopIteration`` instance to the generator; if any other instance of
``StopIteration`` is raised, it is an error, but if that particular
one is raised, the generator has properly completed.
Making return-triggered StopIterations obvious
For certain situations, a simpler and fully backward-compatible
solution may be sufficient: when a generator returns, instead of
raising ``StopIteration``, it raises a specific subclass of
``StopIteration`` which can then be detected. If it is not that
subclass, it is an escaping exception rather than a return statement.
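A sketch of the mechanism (``GeneratorReturn`` is a hypothetical name)::

    class GeneratorReturn(StopIteration):
        # Raised by the generator machinery when a generator returns; a
        # plain StopIteration escaping the frame remains distinguishable.
        pass

    # The iteration machinery could then treat isinstance(exc,
    # GeneratorReturn) as normal exhaustion, and any other StopIteration
    # as a real, escaping exception.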
Unofficial and apocryphal statistics suggest that this is seldom, if
ever, a problem. Code does exist which relies on the current
behaviour, and there is the concern that this would be unnecessary
code churn to achieve little or no gain.
..  Initial mailing list comment
..  Pure Python implementation of groupby
..  Proposal by GvR
..  Response by Steven D'Aprano
This document has been placed in the public domain.