Currently in CPython, small integers from -5 to 256 inclusive are
preallocated at startup. This reduces memory consumption and the
creation time of integers in this range; in particular it affects the
speed of short enumerations. By increasing the range to the maximum
(from -32767 to 32767 inclusive), we can speed up longer enumerations
as well:
./python -m timeit "for i in range(10000): pass"
./python -m timeit -s "a=[0]*10000" "for i, x in enumerate(a): pass"
./python -m timeit -s "a=[0]*10000" "i=0" "for x in a: i+=1"
./python -m timeit -s "a=[0]*10000" "for i in range(len(a)): x=a[i]"

unpatched   patched     speedup
530 usec    337 usec    57%
1.06 msec   811 usec    31%
1.34 msec   1.13 msec   19%
1.42 msec   1.22 msec   16%
1) Memory consumption increases by a constant 1-1.5 MB (or half that if
the range is expanded only in the positive direction). This is not a
problem on most modern computers, but it would be better if the
parameters NSMALLPOSINTS and NSMALLNEGINTS were configurable at build
time.
2) Python startup time increases slightly. I was not able to measure
the difference; it is too small.
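The effect of the cache can be observed directly. A minimal sketch, relying on the CPython-specific identity of cached ints (the exact bounds are an implementation detail and should not be relied on in real code):

```python
# CPython preallocates small ints, so equal values in the cached range
# are the very same object; values outside it get fresh objects.
a = int("256")   # built at runtime to avoid compile-time constant folding
b = int("256")
print(a is b)    # True: 256 is within the default cached range

c = int("257")
d = int("257")
print(c is d)    # typically False in CPython: 257 is outside the cache
```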
attrgetter and itemgetter are both very useful functions, but both have
a significant pitfall when the arguments passed in are validated but
not controlled: if the arguments (a list of attributes, keys or
indexes) are received from an external source and *-applied, and that
source passes a sequence of one element, both functions will return a
bare element rather than a singleton (1-element tuple).
This means that such code, for instance code "slicing" a matrix of some
sort to get only some columns, where the slicing information comes from
the caller (in a situation where extracting a single column may be
perfectly sensible), has to dispatch manually between a plain getitem
(or getattr) and an itemgetter (resp. attrgetter) call, e.g.
slicer = (operator.itemgetter(*indices) if len(indices) > 1
          else lambda ar: [ar[indices[0]]])
This makes for more verbose and less straightforward code. I think it
would be useful in such situations if attrgetter and itemgetter could
be forced into always returning a tuple by way of an optional argument:
# works the same no matter what len(indices) is
slicer = operator.itemgetter(*indices, force_tuple=True)
which, in the documented equivalent code, would amount to an override
(to False) of the `len` check (`len(items) == 1` would become
`len(items) == 1 and not force_tuple`).
The argument is backward-compatible, as neither function currently
accepts any keyword argument.
Uncertainty note: whether force_tuple (or whatever its name ends up
being) should silence the error generated when len(indices) == 0 and
return an empty tuple rather than raising a TypeError.
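Pending such an argument, the dispatch can be wrapped once. A minimal sketch (the helper name is mine, and the proposed force_tuple keyword does not exist in the operator module):

```python
import operator

def itemgetter_tuple(*indices):
    # Behaves like the proposed itemgetter(*indices, force_tuple=True):
    # always returns a tuple, even for a single index.
    if len(indices) == 1:
        index = indices[0]
        return lambda obj: (obj[index],)
    return operator.itemgetter(*indices)

row = ['a', 'b', 'c']
print(itemgetter_tuple(1)(row))     # ('b',)
print(itemgetter_tuple(0, 2)(row))  # ('a', 'c')
```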
On Sep 14, 2012 10:02 PM, "Jim Jewett" <jimjjewett(a)gmail.com> wrote:
> On 9/14/12, Oscar Benjamin <oscar.j.benjamin(a)gmail.com> wrote:
> > I can see why you would expect different behaviour here, though. I tend
> > not to think of the functions in the operator module as convenience
> > functions, but as *efficient* nameable functions referring to operations
> > that are normally invoked with a non-function syntax. Which is more
> > convenient of the following:
> > 1) using operator
> > import operator
> > result = sorted(values, key=operator.attrgetter('name'))
> I would normally write that as
> from operator import attrgetter as attr
> ... # may use it several times
> result=sorted(values, key=attr('name'))
> which is about the best I could hope for, without being able to use
> the dot itself.
To be clear, I wasn't complaining about the inconvenience of importing and
referring to attrgetter. I was saying that if the obvious alternative
(lambda functions) is at least as convenient then it's odd to describe
itemgetter/attrgetter as convenience functions.
> > 2) using lambda
> > result = sorted(values, key=lambda v: v.name)
> And I honestly think that would be worse, even if lambda didn't have a
> code smell. It focuses attention on the fact that you're creating a
> callable, instead of on the fact that you're grabbing the name
I disagree here. I find the fact that a lambda function shows me the
expression I would normally use to get the quantity I'm interested in makes
it easier for me to read. When I look at it I don't see it as a callable
function but as an expression that I'm passing for use somewhere else.
> > In general it is bad to conflate scalar/sequence semantics so that a
> > caller should get a different type of object depending on the length
> > of a sequence.
> Yeah, but that can't really be solved well in python, except maybe by
> never extending an API to handle sequences. I would personally not
> consider that an improvement.
> Part of the problem is that the cleanest way to take a variable number
> of arguments is to turn them into a sequence under the covers (*args),
> even if they weren't passed that way.
You can extend an API to support sequences by adding a new entry point.
This is a common idiom in python: think list.append vs list.extend.
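A quick illustration of that idiom, where the sequence-accepting variant is a separate entry point rather than an overload of the scalar one:

```python
items = [1, 2]
items.append([3, 4])     # scalar entry point: adds one element
print(items)             # [1, 2, [3, 4]]

items = [1, 2]
items.extend([3, 4])     # sequence entry point: adds each element
print(items)             # [1, 2, 3, 4]
```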
Why is there no way to pass PYTHONPATH on the command line? Is this an
oversight, or intentional? Something like:
python -p path_item -c "import something; something.foo()"
I am aware that the __main__.py behavior lessens the need for this
feature.
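For comparison, the effect of the hypothetical -p option can be had today by setting PYTHONPATH in the environment for a single invocation. A sketch (the path name is illustrative):

```python
import os
import subprocess
import sys

# Run a child interpreter with an extra sys.path entry, equivalent to
# the proposed: python -p /tmp/extra_modules -c "..."
env = dict(os.environ, PYTHONPATH="/tmp/extra_modules")
out = subprocess.run(
    [sys.executable, "-c", "import sys; print('/tmp/extra_modules' in sys.path)"],
    env=env, capture_output=True, text=True,
)
print(out.stdout.strip())  # True
```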
Alexander Belopolsky wrote:
> Consider this:
> >>> memoryview(b'x').cast('B', ()).tolist()
> 120
> The return value of tolist() is an int, not a list.
That's because NumPy's tolist() does the same thing:
>>> x = numpy.array(120, dtype='B')
>>> x.tolist()
120
If you implement tolist() recursively like in _testbuffer.c and choose
the zeroth dimension as the base case, you arrive at single elements.
So at least it's not completely unnatural.
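The behaviour is easy to reproduce without NumPy (b'x' is the single byte 120):

```python
# A 0-dimensional cast: tolist() returns a scalar, not a list
m = memoryview(b'x').cast('B', ())
print(m.ndim)      # 0
print(m.tolist())  # 120
```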
We have been discussing the value of having namedtuple as the return
type for urlparse.urlparse and urlparse.urlsplit. See that thread
here: http://bugs.python.org/issue15824 . I jumped the gun and
submitted a patch without seeing if anyone else thought different
behavior was desirable. My argument is that it would be a major
usability improvement if the return type supported item assignment.
Currently, something like the following is necessary in order to
parse, make changes, and unparse:
url = list(urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha'))
url[1] = 'www.python.com'
new_url = urlparse.urlunparse(url)
I think this is really clunky. I don't see any reason why we should be
using a type that doesn't support item assignment and needs to be
cast to another type in order to make changes. I think an
interface like this would be more useful:
url = urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha')
url.netloc = 'www.python.com'
What do other people think?
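For the record, the immutable return type already allows a copy-with-changes via the namedtuple _replace method. A sketch using the Python 3 module names (urllib.parse) rather than the Python 2 names used above:

```python
from urllib.parse import urlparse, urlunparse

parts = urlparse('http://www.example.com/foo/bar?hehe=haha')
# _replace returns a new namedtuple with the given field changed
new_url = urlunparse(parts._replace(netloc='www.python.com'))
print(new_url)  # http://www.python.com/foo/bar?hehe=haha
```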
On Sat, Sep 8, 2012 at 9:17 AM, Terry Reedy <tjreedy(a)udel.edu> wrote:
> On 9/8/2012 2:27 AM, Guido van Rossum wrote:
>> Can someone explain what problem we are trying to solve? I fail to
>> understand what's wrong with the current behavior...
> Pairs of different things have the same representation, making the
> representation ambiguous to both people and the interpreter.
Well yeah, when designing a repr() we usually have to compromise. E.g.
if you render a class instance it often shows the class name but not
the module name (e.g. decimal.Decimal.)
> Moreover, the interpreter's guess is usually wrong.
The requirement that the interpreter can evaluate a repr() and return
a similar value is pretty weak, and I'm not sure that in this case the
fact that copying the output back into the interpreter returns an
object of a different shape matters much to anyone.
A subtler but similar bug appears with lists containing multiple
references to the same sublist, e.g.
>>> a = [1, 2]
>>> b = [a, a]
>>> b
[[1, 2], [1, 2]]
>>> a.append(3)
>>> b
[[1, 2, 3], [1, 2, 3]]
>>> x = [[1, 2], [1, 2]]
>>> x[0].append(3)
>>> x
[[1, 2, 3], [1, 2]]
I don't think we should attempt to fix this particular one -- first of
all, the analysis would be tricky (there could be a user-defined
object involved) and second of all, I can't think of a solution that
still produces a valid expression (except perhaps a very ugly one).
> In particular, the representations of recursive lists use what is now the
> Ellipsis literal '...', so they are also valid list displays for a
> non-recursive nested list containing Ellipsis. The interpreter always reads
> ... as the Ellipsis literal, which is nearly always not what is meant.
But when does it ever matter?
> It would be trivial to tweak the representations of recursive lists so they
> are not valid list displays.
To what purpose? I still don't understand what the actual use case is
where you think that will produce a better experience for the user.
--Guido van Rossum (python.org/~guido)
With the Python 3 loosening of where ... can occur, this somewhat
suboptimal behaviour occurs:
>>> x = 
Is this something that can be improved? Is it something worth improving?
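The kind of ambiguity in question can be shown directly:

```python
# The repr of a recursive list uses '...', which Python 3 reads back
# as the Ellipsis literal, yielding a different, non-recursive object.
a = [1, 2]
a.append(a)
print(repr(a))              # [1, 2, [...]]
b = eval(repr(a))
print(b)                    # [1, 2, [Ellipsis]]
print(b[2][0] is Ellipsis)  # True
```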
I think the annotations of parameters and return values of a function
are a useful practice for the user of the function. As a function can
modify or create global variables, and as this is important for the end
user, I would appreciate being able to add annotations in the global
statement.
An annotation syntax similar to that of parameters could be employed :
global var : expression
global var1 : expression1, var2 : expression2,...
Alex (geoscience modeler)
>>> memoryview(b'x').cast('B', ()).tolist()
The return value of tolist() is an int, not a list.
I suggest to deprecate memoryview.tolist() and .tobytes() methods
(soft deprecation - in documentation only) and recommend using list(m)
and bytes(m) instead.
For the multidimensional (and 0-dimensional) views, I suggest adding
an unpack([depth]) method that would unpack a view into a nested list
of tuples or subviews. For example, a single-byte scalar should unpack
like this:
>>> m = memoryview(b'x').cast('B', ())
>>> struct.unpack_from(m.format, m)
(120,)
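For one-dimensional views the recommended replacements already behave identically to the methods they would supersede; a quick check:

```python
m = memoryview(b'abc')
print(m.tolist())  # [97, 98, 99]
print(list(m))     # [97, 98, 99] -- same result for a 1-D 'B' view
print(bytes(m))    # b'abc'
```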