At the moment, the array module of the standard library allows you to
create arrays of different numeric types and to initialize them from
an iterable (e.g., another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, e.g. the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow
you to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array generation in such a way that you
could pass an iterator (as now) as second argument, but if you pass a
single integer value, it should be treated as the number of items to
Here is my current workaround (which is slow):
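    import array

    def filled_array(typecode, n, value=0, bsize=(1<<22)):
        """returns a new array with given typecode
        (e.g., "l" for long int, as in the array module)
        with n entries, initialized to the given value (default 0)
        """
        # one block of bsize items, reused for every full chunk
        a = array.array(typecode, [value]*bsize)
        x = array.array(typecode)
        r = n
        while r >= bsize:
            x.extend(a)
            r -= bsize
        # append the remaining r < bsize items
        x.extend([value]*r)
        return x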
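For comparison, here is a minimal sketch that uses sequence repetition instead of an explicit chunk loop. The helper name `repeated_array` is an assumption for illustration and not part of the array module; only `array.array` and its `*` operator are existing API. Whether this is fast enough for billions of items would have to be measured.

```python
import array

def repeated_array(typecode, n, value=0):
    """Return an array holding n copies of value (hypothetical helper name)."""
    # Repeat a one-item array n times; no n-element Python list is built first.
    return array.array(typecode, [value]) * n

x = repeated_array("l", 1000)
```

The repetition happens inside the array implementation, so the Python-level loop over blocks disappears, but every item is still initialized.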
I think it would be a good idea if Python tracebacks could be translated
into languages other than English - and it would set a good example.
For example, using French as my default local language, instead of
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: integer division or modulo by zero
I might get something like
    Suivi d'erreur (appel le plus récent en dernier) :
      Fichier "<stdin>", à la ligne 1, dans <module>
    ZeroDivisionError: division entière ou modulo par zéro
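As a rough sketch of the idea, the frame labels can already be swapped out at the Python level with the `traceback` module. The `FRENCH` catalogue and the `format_translated` name are assumptions made for this example; the exception message itself is left untranslated, since that text comes from the interpreter.

```python
import traceback

# Hypothetical message catalogue; the French strings are the ones quoted above.
FRENCH = {
    "header": "Suivi d'erreur (appel le plus récent en dernier) :",
    "frame": '  Fichier "{filename}", à la ligne {lineno}, dans {name}',
}

def format_translated(exc, labels=FRENCH):
    """Return the traceback of exc as a list of lines with translated labels."""
    lines = [labels["header"]]
    for frame in traceback.extract_tb(exc.__traceback__):
        lines.append(labels["frame"].format(
            filename=frame.filename, lineno=frame.lineno, name=frame.name))
    # The exception message is kept as produced by the interpreter.
    lines.append("{}: {}".format(type(exc).__name__, exc))
    return lines

try:
    1 / 0
except ZeroDivisionError as err:
    report = format_translated(err)
```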
Greg Ewing wrote:
> Mark Shannon wrote:
>> Why not have proper co-routines, instead of hacked-up generators?
> What do you mean by a "proper coroutine"?
A parallel, non-concurrent, thread of execution.
It should be able to transfer control from arbitrary places in
execution, not within generators.
Stackless provides coroutines. Greenlets are also coroutines (I think).
Lua has them, and is implemented in ANSI C, so it can be done portably.
(One of the examples in the paper uses coroutines to implement
generators, which is obviously not required in Python :) )
Here's an updated version of the PEP reflecting my
recent suggestions on how to eliminate 'codef'.
Author: Gregory Ewing <greg.ewing(a)canterbury.ac.nz>
Type: Standards Track
A syntax is proposed for defining and calling a special type of generator
called a 'cofunction'. It is designed to provide a streamlined way of
writing generator-based coroutines, and allow the early detection of
certain kinds of error that are easily made when writing such code, which
otherwise tend to cause hard-to-diagnose symptoms.
This proposal builds on the 'yield from' mechanism described in PEP 380,
and describes some of the semantics of cofunctions in terms of it. However,
it would be possible to define and implement cofunctions independently of
PEP 380 if so desired.
A cofunction is a special kind of generator, distinguished by the presence
of the keyword ``cocall`` (defined below) at least once in its body. It may
also contain ``yield`` and/or ``yield from`` expressions, which behave as
they do in other generators.
From the outside, the distinguishing feature of a cofunction is that it cannot
be called the same way as an ordinary function. An exception is raised if an
ordinary call to a cofunction is attempted.
Calls from one cofunction to another are made by marking the call with
a new keyword ``cocall``. The expression
    cocall f(*args, **kwds)
is evaluated by first checking whether the object ``f`` implements
a ``__cocall__`` method. If it does, the cocall expression is
equivalent to
    yield from f.__cocall__(*args, **kwds)
except that the object returned by __cocall__ is expected to be an
iterator, so the step of calling iter() on it is skipped.
If ``f`` does not have a ``__cocall__`` method, or the ``__cocall__``
method returns ``NotImplemented``, then the cocall expression is
treated as an ordinary call, and the ``__call__`` method of ``f``
is used.
Objects which implement __cocall__ are expected to return an object
obeying the iterator protocol. Cofunctions respond to __cocall__ the
same way as ordinary generator functions respond to __call__, i.e. by
returning a generator-iterator.
Certain objects that wrap other callable objects, notably bound methods,
will be given __cocall__ implementations that delegate to the underlying
object.
The full syntax of a cocall expression is described by the following
grammar lines:
    atom: cocall | <existing alternatives for atom>
    cocall: 'cocall' atom cotrailer* '(' [arglist] ')'
    cotrailer: '[' subscriptlist ']' | '.' NAME
Note that this syntax allows cocalls to methods and elements of sequences
or mappings to be expressed naturally. For example, the following are valid:
    y = cocall self.foo(x)
    y = cocall funcdict[key](x)
    y = cocall a.b.c[i].d(x)
Also note that the final calling parentheses are mandatory, so that for example
the following is invalid syntax:
    y = cocall f     # INVALID
New builtins, attributes and C API functions
To facilitate interfacing cofunctions with non-coroutine code, there will
be a built-in function ``costart`` whose definition is equivalent to
    def costart(obj, *args, **kwds):
        try:
            m = obj.__cocall__
        except AttributeError:
            result = NotImplemented
        else:
            result = m(*args, **kwds)
        if result is NotImplemented:
            raise TypeError("Object does not support cocall")
        return result
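To make the protocol concrete, the following sketch restates ``costart`` as runnable Python and applies it to one object that implements ``__cocall__`` and one that does not. ``costart`` is not an existing built-in, and the ``Countdown`` class is a hypothetical example invented here.

```python
def costart(obj, *args, **kwds):
    # Pure-Python stand-in for the proposed built-in; it does not exist in Python.
    try:
        m = obj.__cocall__
    except AttributeError:
        result = NotImplemented
    else:
        result = m(*args, **kwds)
    if result is NotImplemented:
        raise TypeError("Object does not support cocall")
    return result

class Countdown:
    """Hypothetical object that opts in to the protocol."""
    def __cocall__(self, n):
        # Written as a generator, so calling it returns an iterator as required.
        for i in range(n, 0, -1):
            yield i

values = list(costart(Countdown(), 3))

try:
    costart(len, "abc")      # len has no __cocall__ method
    rejected = False
except TypeError:
    rejected = True
```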
There will also be a corresponding C API function
    PyObject *PyObject_CoCall(PyObject *obj, PyObject *args, PyObject *kwds)
It is left unspecified for now whether a cofunction is a distinct type
of object or, like a generator function, is simply a specially-marked
function instance. If the latter, a read-only boolean attribute
``__iscofunction__`` should be provided to allow testing whether a given
function object is a cofunction.
Motivation and Rationale
The ``yield from`` syntax is reasonably self-explanatory when used for the
purpose of delegating part of the work of a generator to another function. It
can also be used to good effect in the implementation of generator-based
coroutines, but it reads somewhat awkwardly when used for that purpose, and
tends to obscure the true intent of the code.
Furthermore, using generators as coroutines is somewhat error-prone. If one
forgets to use ``yield from`` when it should have been used, or uses it when it
shouldn't have, the symptoms that result can be extremely obscure and confusing.
Finally, sometimes there is a need for a function to be a coroutine even though
it does not yield anything, and in these cases it is necessary to resort to
kludges such as ``if 0: yield`` to force it to be a generator.
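The situation described in these paragraphs can be sketched with plain generators and ``yield from`` as they exist today. All function names here are made up for the example, and ``run`` is a deliberately trivial driver rather than a real scheduler.

```python
def fetch():
    # Suspends once, then hands the reply back through its return value.
    reply = yield "request"
    return reply

def no_suspend():
    # Never suspends, but must still be a generator so that callers
    # can use ``yield from`` on it; hence the dummy yield.
    if 0:
        yield
    return 42

def main():
    a = yield from fetch()
    b = yield from no_suspend()
    return a + b

def run(gen, reply):
    # Trivial driver: answers every yielded request with ``reply``.
    try:
        gen.send(None)
        while True:
            gen.send(reply)
    except StopIteration as stop:
        return stop.value

total = run(main(), 1)
```

Dropping either ``yield from`` in ``main`` would bind ``a`` or ``b`` to a generator object instead of a result, which is the kind of mistake the proposal aims to catch early.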
The ``cocall`` construct addresses the first issue by making the syntax directly
reflect the intent, that is, that the function being called forms part of a
coroutine.
The second issue is addressed by making it impossible to mix coroutine and
non-coroutine code in ways that don't make sense. If the rules are violated, an
exception is raised that points out exactly what and where the problem is.
Lastly, the need for dummy yields is eliminated by making it possible for a
cofunction to call both cofunctions and ordinary functions with the same syntax,
so that an ordinary function can be used in place of a cofunction that yields
zero times.
Record of Discussion
An earlier version of this proposal required a special keyword ``codef`` to be
used in place of ``def`` when defining a cofunction, and disallowed calling an
ordinary function using ``cocall``. However, it became evident that these
features were not necessary, and the ``codef`` keyword was dropped in the
interests of minimising the number of new keywords required.
The use of a decorator instead of ``codef`` was also suggested, but the current
proposal makes this unnecessary as well.
It has been questioned whether some combination of decorators and functions
could be used instead of a dedicated ``cocall`` syntax. While this might be
possible, to achieve equivalent error-detecting power it would be necessary
to write cofunction calls as something like
    yield from cocall(f)(args)
making them even more verbose and inelegant than an unadorned ``yield from``.
It is also not clear whether it is possible to achieve all of the benefits of
the cocall syntax using this kind of approach.
An implementation of an earlier version of this proposal in the form of patches
to Python 3.1.2 can be found here:
If this version of the proposal is received favourably, the implementation will
be updated to match.
This document has been placed in the public domain.
(Reposting this from the Google Group once more, as the previous post missed
the mailing list because I was not subscribed in Mailman.)
*Static module/package inspection*
- static: without execution (as opposed to dynamic)
- module/package: .py or __init__.py file
- inspection: get an overview of the contents
*What should this do?*
The proposal is to add a mechanism to the Python interpreter to get an outline
of module/package contents without importing or executing the module/package.
The outline includes names of classes, functions, and variables. It should also
contain values for variables that could be provided without sophisticated
calculations (e.g. a string or integer, but probably not expressions, as
evaluating them may lead to security leaks).
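A minimal sketch of such an outline can be built on the standard `ast` module, which parses source without executing it. The `outline` function and the `SAMPLE` module text are assumptions made for illustration; values are recovered only when `ast.literal_eval` accepts them as plain literals.

```python
import ast

def outline(source):
    """Return top-level names of a module without importing or running it."""
    result = {"classes": [], "functions": [], "variables": {}}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            result["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            result["functions"].append(node.name)
        elif isinstance(node, ast.Assign):
            try:
                value = ast.literal_eval(node.value)   # literals only
            except (ValueError, TypeError):
                value = None                           # expression: not evaluated
            for target in node.targets:
                if isinstance(target, ast.Name):
                    result["variables"][target.id] = value
    return result

# Hypothetical single-file module used as input.
SAMPLE = '''
__version__ = "1.2"
DEBUG = compute_flag()
class Plugin: pass
def main(): pass
'''

info = outline(SAMPLE)
```

Here `compute_flag()` is never called; the variable is listed with `None` as a placeholder for "not a simple literal".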
*user story PEPx.001:*
As a Python package maintainer, I find it bothersome to repeatedly write
boilerplate code (e.g. setup.py) to package my single file module. The
reason I have to write setup.py is to provide version and description info.
This info is already available in my module source code. So I need to
either copy/paste the info from the module manually, or to import (and
hence execute) my module during packaging and installation, which I don't
want either, because modules are often installed with root privileges.
With this PEP, a packaging tool will be able to extract meta information from
my module without executing it or without me manually copying version
fields into some 'package configuration file'.
*user story PEPx.002:*
As a Python application developer, I find it really complicated to provide
a plugin extension subsystem for my users. Users need a mechanism to switch
between different versions of the plugin, and this mechanism is usually
provided by an external tool such as setuptools to manage and install multiple
versions of plugins in a local Python package repository. It is rather hard
to create an alternative approach, because you are forced to maintain
external meta-data about your plugin modules even when it is already
available inside the module.
With this PEP, a Python application will be able to inspect
meta-data embedded inside of plugins before choosing which version to load.
This will also provide a standard mechanism for applications to check
modules returned by packaging tools without executing them. This will
greatly simplify writing and debugging custom plugin loaders on different
At this stage I'd like to get a community response to two separate questions:
1. Whether everybody feels this functionality will be useful for Python
2. Whether the solution is technically feasible
On Feb 17, 2012 4:12 PM, "Nick Coghlan" <ncoghlan(a)gmail.com> wrote:
> On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen <dreamingforward(a)gmail.com>
> > Anyway... of course patches welcome, yes... ;^)
> Not really. doctest is for *testing code examples in docs*. If you try
> to use it for more than that, it's likely to drive you up the wall, so
> proposals to make it more than it is usually don't get a great
> reception (docs patches to make its limitations clearer are generally
> welcome, though). The stdlib solution for test driven development is
> unittest (the vast majority of our own regression suite is written
> that way - only a small proportion uses doctest).
This pessimistic attitude is why doctest is challenging to work with at
times, not anything to do with doctest's actual model. The constant
criticisms of doctest keep contributors away, and keep its many resolvable
problems from being resolved.
> An interesting third party alternative that has been created recently
> is behave: http://crate.io/packages/behave/
This style of test is why it's so sad that doctest is ignored and
unmaintained. It's based on testing patterns developed by people who care
to promote what they are doing, but I'm of the strong opinion that they are
inferior to doctest.
So I've recently been trying to implement something for which I had hoped the 'with' statement would be perfect, but it turns out it won't work, because Python provides no mechanism by which to skip the block of code in a with statement.
I want to create some functionality to make it easy to wrap command line programs in a caching architecture. To do this there are some things that need to happen before and after the wrapped CLI program is called; a try/except/finally version might look like this:
hashedoutput = hashon(args)
the 'with' version would look like
hashedpath = hashon(args)
So obviously the 'with' statement would be a good fit, especially since non-python programmers might be wrapping their CLI programs... unfortunately I can't use 'with' because I can't find a clean way to make the with block code conditional.
PEP 377 suggested some mechanics that seemed a bit complicated for getting the desired effect, but I think, and correct me if I'm wrong, that the same effect could be achieved by having the __enter__ function raise a StopIteration that would be caught by the context and skip directly to the __exit__ function. The semantics of this even make some sense to me, since the closest I've been able to get to what I had hoped for was using an iterator to execute the appropriate code before and after the loop block:
hashedpath = hashon(args)
for _ in cacheon(hashedpath):
this still seems non-ideal to me...
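One way the iterator-based workaround might look is sketched below. The body of `cacheon`, the extra `log` argument, and the `demo` function are assumptions added for illustration; the original post does not show how `cacheon` is written.

```python
import os
import tempfile

def cacheon(path, log):
    """Yield at most once: the loop body runs only when path is missing."""
    if os.path.exists(path):
        log.append("hit")        # cache hit: nothing is yielded, body is skipped
        return
    log.append("before")         # setup that would otherwise live in __enter__
    yield path
    log.append("after")          # cleanup; NOT reached if the body raises or breaks

def demo():
    log, runs = [], 0
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.txt")
        for _ in range(2):                      # second pass finds the cached file
            for target in cacheon(path, log):
                with open(target, "w") as fh:   # stands in for the wrapped CLI program
                    fh.write("result")
                runs += 1
    return runs, log

runs, log = demo()
```

Unlike `__exit__`, the code after the `yield` is skipped when the loop body raises, which is one reason this pattern is less robust than a skippable `with` block would be.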
I find myself wanting to use doctest for some test-driven development,
and find myself slightly frustrated and wonder if others would be
interested in seeing the following additional functionality in
1. Execution context determined by outer-scope doctest definitions.
2. Smart Comparisons that will detect output of a non-ordered type
(dict/set), lift and recast it and do a real comparison.
Without #1, "literate testing" becomes awash with re-defining re-used
variables which, generally, also detracts from the exact purpose of the
test -- this creates testdoc noise and the docs become less useful.
Without #2, "readable docs" nicely co-aligning with "testable docs"
tends towards divergence.
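As a sketch of what #2 could look like with today's doctest, the comparison step can be replaced by subclassing `doctest.OutputChecker`. The class name `LiteralOutputChecker` is made up here, and the approach only helps when both the expected and the actual output are plain Python literals.

```python
import ast
import doctest

class LiteralOutputChecker(doctest.OutputChecker):
    """Fall back to comparing parsed literals when the text differs."""

    def check_output(self, want, got, optionflags):
        if super().check_output(want, got, optionflags):
            return True
        try:
            # literal_eval accepts only literals, so no code is executed.
            return ast.literal_eval(want.strip()) == ast.literal_eval(got.strip())
        except (ValueError, SyntaxError, TypeError):
            return False

checker = LiteralOutputChecker()
same_set = checker.check_output("{1, 2, 3}\n", "{3, 1, 2}\n", 0)
same_dict = checker.check_output("{'a': 1, 'b': 2}\n", "{'b': 2, 'a': 1}\n", 0)
same_list = checker.check_output("[1, 2]\n", "[2, 1]\n", 0)
```

A checker like this can be passed to `doctest.DocTestRunner(checker=...)`. Lists remain order-sensitive, as they should be.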
Perhaps not enough developers use doctest to care, but I find it one
of the more enjoyable ways to develop python code -- I don't have to
remember test cases nor go through the trouble of setting up
unittests. AND, it encourages agile development. Another user wrote
a while back of even having a built-in test() method. Wouldn't that
really encourage agile development? And you wouldn't have to muddy
up your code with "if __name__ == "__main__": import doctest, yadda
Anyway... of course patches welcome, yes... ;^)
On Wed, Feb 29, 2012 at 8:19 AM, Barry Warsaw <barry(a)python.org> wrote:
> On Feb 27, 2012, at 05:44 PM, Ian Bicking wrote:
>>Doctest needs reliable repr's more than reversible repr's, and you can create
>>them using that. You'll still get a lot of <foobar.Foobar object at
>>0x391a9df> strings, which suck... but if you are committed to doctest then
>>maybe better to provide good __repr__ methods on your custom objects!
> +1 even if you don't use doctests! I can't tell you how many times adding a
> useful repr has vastly improved debugging. I urge everyone to flesh out your
> reprs with a little bit of useful information so you can quickly identify your
> instances at a pdb prompt.
Since this question came up recently, what do you think of adding some
more helpers to reprlib to make this even easier to do?
I know I just added some utility functions to PulpDist to avoid
reinventing that particular wheel for each of my class definitions.
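The kind of helper meant here might look like the sketch below. `simple_repr` and `Point` are hypothetical names for this example; they are not taken from reprlib or PulpDist.

```python
def simple_repr(obj, *attr_names):
    """Build 'ClassName(attr=value, ...)' from the named attributes."""
    fields = ", ".join(
        "{}={!r}".format(name, getattr(obj, name)) for name in attr_names)
    return "{}({})".format(type(obj).__name__, fields)

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # One line per class instead of a hand-written format string.
        return simple_repr(self, "x", "y")

text = repr(Point(1, 2))
```

The result is stable across runs, unlike the default repr with its memory address, so it is usable in doctest output as well as at a pdb prompt.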
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
On 27 February 2012 23:23, Mark Janssen <dreamingforward(a)gmail.com> wrote:
> On Mon, Feb 27, 2012 at 3:59 PM, Michael Foord <fuzzyman(a)gmail.com> wrote:
>> As well as fundamental problems, the particular implementation of doctest
>> suffers from these potentially resolvable problems:
>>> Execution of an individual testing section continues after a failure. So
>>> a single failure results in the *reporting* of potentially many failures.
> Hmm, perhaps I don't understand you. doctest reports how many failures
> occur, without blocking on any single failure.
Right. But you typically group a bunch of actions into a single "test". If
a doctest fails in an early action then every line after that will probably
fail - a single test failure will cause multiple *reported* failures.
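The cascade can be reproduced with a small doctest run programmatically. The `EXAMPLE` text and the `count_failures` helper are invented for this sketch; `doctest.FAIL_FAST` is an existing option flag (Python 3.4 and later) that stops at the first failing example.

```python
import doctest

# One wrong assumption (sort order) makes every later example fail too.
EXAMPLE = """
>>> items = [3, 1, 2]
>>> items.sort(reverse=True)
>>> items[0]
1
>>> items[-1]
3
"""

def count_failures(optionflags=0):
    test = doctest.DocTestParser().get_doctest(EXAMPLE, {}, "cascade", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the textual report; only the counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed

all_failures = count_failures()
first_only = count_failures(doctest.FAIL_FAST)
```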
>> The problem of being dependent on order of unorderable types (actually
>>> very difficult to solve).
> Well, a crude solution is just to lift any output text that denotes a
> non-ordered type and pass it through an "eval" operation.
Not a general solution - not all reprs are reversible (in fact very few are
as a proportion of all objects).
>> Things like shared fixtures and mocking become *harder* (although by no
>>> means impossible) in a doctest environment.
> This, I think, is what I was suggesting with doctest "scoping" where the
> execution environment is a matter of how nested the docstring is in
> relation to the "python semantic environment", with a final scope of
> "globs" that can be passed into the test environment, for anything with
> global scope.
>> Another thing I dislike is that it encourages a "test last" approach, as
>>> by far the easiest way of generating doctests is to copy and paste from the
>>> interactive interpreter. The alternative is lots of annoying typing of
>>> '>>>' and '...', and as you're editing text and not code IDE support tends
>>> to be worse (although this is a tooling issue and not a problem with
>>> doctest itself).
> This is where I think the idea of having a test() built-in, like help(),
> would really be nice. One could run test(myClass.mymethod) iteratively while
> one codes, encouraging TDD and writing tests *along with* your code. My
> TDD sense says it couldn't get any better.
>> More fundamental-ish problems:
>> Putting debugging prints into a function can break a myriad of tests
>> (because they're output based).
> That's a good point. But then it's a fairly simple matter of adding the
> output device: "print >> stderr, 'here I am'". Another possibility, if TDD
> were to become more a part of the language, is a special debug exception:
> "raise Debug("Am at the test point, ", x)" Such special exceptions could
> be caught and ignored by doctest.
>> With multiple doctest blocks in a test file running an individual
>> test can be difficult (impossible?).
> This is again solved with the test() built-in and making TDD something that
> is a feature of the language itself.
I don't fully follow you, but it shouldn't be hard to add this to doctest
and see if it is really useful.
>> I may be misremembering, but I think debugging support is also
>> problematic because of the stdout redirection
> Interesting, I try to pre-conceive tests well enough so I never need to
> invoke the debugger.
Heh. When I'm adding new features to existing code it is very common for me
to write a test that drops into the debugger after setting up some state -
and potentially using the test infrastructure (fixtures, django test client
perhaps, etc). So not being able to run a single test or drop into a
debugger puts the kybosh on that.
>> So yeah. Not a huge fan.
> That's good feedback. Thanks.
May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html