There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...
Essentially the permutations are, I think:
{'unadorned'|abc.abstract} x {'normal'|static|class} x {method|property|non-callable attribute}

concreteness  implicit first arg  type                    name                                                 comments
------------  ------------------  ----------------------  ---------------------------------------------------  -----------
{unadorned}   {unadorned}         method                  def foo():                                           exists now
{unadorned}   {unadorned}         property                @property                                            exists now
{unadorned}   {unadorned}         non-callable attribute  x = 2                                                exists now
{unadorned}   static              method                  @staticmethod                                        exists now
{unadorned}   static              property                @staticproperty                                      proposing
{unadorned}   static              non-callable attribute  {degenerate case - variables don't have arguments}   unnecessary
{unadorned}   class               method                  @classmethod                                         exists now
{unadorned}   class               property                @classproperty or @classmethod;@property             proposing
{unadorned}   class               non-callable attribute  {degenerate case - variables don't have arguments}   unnecessary
abc.abstract  {unadorned}         method                  @abc.abstractmethod                                  exists now
abc.abstract  {unadorned}         property                @abc.abstractproperty                                exists now
abc.abstract  {unadorned}         non-callable attribute  @abc.abstractattribute or @abc.abstract;@attribute   proposing
abc.abstract  static              method                  @abc.abstractstaticmethod                            exists now
abc.abstract  static              property                @abc.abstractstaticproperty                          proposing
abc.abstract  static              non-callable attribute  {degenerate case - variables don't have arguments}   unnecessary
abc.abstract  class               method                  @abc.abstractclassmethod                             exists now
abc.abstract  class               property                @abc.abstractclassproperty                           proposing
abc.abstract  class               non-callable attribute  {degenerate case - variables don't have arguments}   unnecessary
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
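Roughly, @classproperty could be emulated today with the descriptor
protocol; a minimal sketch (names are illustrative, not part of the
proposal):

class classproperty(object):
    def __init__(self, fget):
        self.fget = fget
    def __get__(self, obj, objtype=None):
        # Hand the class, not the instance, to the wrapped function.
        return self.fget(objtype)

class Foo(object):
    @classproperty
    def bar(cls):
        return cls.__name__.lower()

print(Foo.bar)    # 'foo' -- no throw-away instance required
print(Foo().bar)  # 'foo' as well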
--rich
At the moment, the array module of the standard library allows you to
create arrays of different numeric types and to initialize them from
an iterable (e.g., another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not let you
simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array construction in such a way that you
could pass an iterable (as now) as the second argument, but if you pass
a single integer value, it would be treated as the number of items to
allocate.
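A sketch of the proposed call (hypothetical semantics -- today this
raises TypeError):

import array

# Proposed: an integer second argument preallocates that many items,
# zero-initialized, in one step instead of growing incrementally.
sa = array.array('l', 6000000000)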
Here is my current workaround (which is slow):

import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """Return a new array with the given typecode
    (e.g., "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0).
    """
    # Build one block of bsize filled items, then extend in block-sized
    # chunks until fewer than bsize items remain.
    a = array.array(typecode, [value] * bsize)
    x = array.array(typecode)
    r = n
    while r >= bsize:
        x.extend(a)
        r -= bsize
    x.extend([value] * r)
    return x
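Incidentally, sequence repetition builds the filled array in a single
step and may already be a faster workaround (untested sketch):

import array

def filled_array2(typecode, n, value=0):
    # Repetition allocates the result once instead of extending in chunks.
    return array.array(typecode, [value]) * n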
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
Would it be reasonable to start deprecating this and eventually remove
it from the language?
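To illustrate the pitfall and the explicit alternative:

# A missing comma silently merges adjacent literals:
names = ['alice', 'bob'
         'carol']            # 2 items: ['alice', 'bobcarol']

# Explicit concatenation of two literals is folded at compile time
# anyway, so nothing is lost by spelling it out:
greeting = 'Hello, ' + 'world'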
--
--Guido van Rossum (python.org/~guido)
Stephen J. Turnbull wrote:
> Vernon D. Cole writes:
>
>> I cannot compile a Python extension module with any Microsoft compiler
>> I can obtain.
>
> Your pain is understood, but it's not simple to address it.
FWIW, I'm working on making the compiler easily obtainable. The VS 2008 link that was posted is unofficial, and could theoretically disappear at any time (I'm not in control of that), but the Windows SDK for Windows 7 and .NET 3.5 SP1 (http://www.microsoft.com/en-us/download/details.aspx?id=3138) should be around for as long as Windows 7 is supported. The correct compiler (VC9) is included in this SDK, but unfortunately does not install the vcvarsall.bat file that distutils expects. (Though it's pretty simple to add one that will switch on %1 and call the correct vcvars(86|64|...).bat.)
The SDK needed for Python 3.3 and 3.4 (VC10) is even worse - there are many files missing. I'm hoping we'll be able to set up some sort of downloadable package/tool that will fix this. While we'd obviously love to move CPython onto our latest compilers, it's simply not possible (for good reason). Python 3.4 is presumably locked to VC10, but hopefully 3.5 will be able to use whichever version is current when that decision is made.
> The basic problem is that the ABI changes. Therefore it's going to require
> a complete new set of *all* C extensions for Windows, and the duplication
> of download links for all those extensions from quite a few different vendors
> is likely to confuse a lot of users.
Specifically, the CRT changes. The CRT is an interesting mess of data structures that are exposed in header files, which means while you can have multiple CRTs loaded, they cannot touch each other's data structures at all or things will go bad/crash, and there's no nice way to set it up to avoid this (my colleague who currently owns MSVCRT suggested a not-very-nice way to do it, but I don't think it's going to be reliable enough). Python's stable ABI helps, but does not solve this problem.
The file APIs are the worst culprits. The layout of FILE* objects can and does change between CRT versions, and file descriptors are simply indices into an array of these objects that is exposed through macros rather than function calls. As a result, you cannot mix either FILE pointers or file descriptors between CRTs. The only safe option is to build with the matching CRT, and for MSVCRT, this means with the matching compiler. It's unfortunate, and the responsible teams are well aware of the limitation, but it's history at this point, so we have no choice but to work with it.
Cheers,
Steve
Dear all,
I guess this is so obvious that someone must have suggested it before:
in list comprehensions you can currently exclude items based on the if
conditional, e.g.:
[n for n in range(1,1000) if n % 4 == 0]
Why not extend this filtering by allowing a while statement in addition to
if, as in:
[n for n in range(1,1000) while n < 400]
The effect is trivial in this example, I agree, since you could achieve the
same by using range(1,400), but I hope you get the point.
This intuitively understandable extension would provide a big speed-up for
sorted lists where processing all the input is unnecessary.
Consider this:
some_names = ["Adam", "Andrew", "Arthur", "Bob", "Caroline", "Lancelot"]  # a sorted list of names

[n for n in some_names if n.startswith("A")]
# certainly gives a list of all names starting with A, but ...

[n for n in some_names while n.startswith("A")]
# would have saved two comparisons
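For comparison, itertools.takewhile already expresses the same early
exit today, though less readably:

from itertools import takewhile

some_names = ["Adam", "Andrew", "Arthur", "Bob", "Caroline", "Lancelot"]
# Stops at the first name that fails the predicate, like the proposed
# `while` clause would:
list(takewhile(lambda n: n.startswith("A"), some_names))
# -> ['Adam', 'Andrew', 'Arthur']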
Best,
Wolfgang
On Fri, Nov 22, 2013 at 1:44 PM, Gregory P. Smith <greg(a)krypto.org> wrote:
> It'd be nice to formalize a way to get rid of the __name__ == '__main__'
> idiom as well in the long long run. Sure everyone's editor types that for
> them now but it's still a wart. Anyways, digressing... ;)
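For reference, the idiom in question:

def main():
    ...

if __name__ == '__main__':
    main()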
This has come up before and is the subject of several PEPs [1][2].
The current idiom doesn't bother me too much as I try not to have
files that are both scripts and modules. However, Python doesn't make
the distinction all that clear nor does it do much to encourage people
to keep the two separate. I'd prefer improvements in both those
instead, but haven't had the time for any concrete proposal.
FWIW, aside from the idiom there are other complications that arise
from a module that also gets loaded in __main__ (run as a script).
See PEP 395 [3].
-eric
[1] http://www.python.org/dev/peps/pep-0299/
[2] http://www.python.org/dev/peps/pep-3122/
[3] http://www.python.org/dev/peps/pep-0395/ (sort of related)
Hello,
Coming back to Python after a long time.
My present project is about (yet another) top-down matching / parsing lib. There
are 2 issues that, I guess, may be rather easily solved by simple string
methods. The core point is that any scanning / parsing process ends up, at the
lowest level, constantly comparing either single-char (rather, single-code)
substrings or constant (literal) substrings of the source string. This is the
only operation which, when successful, actually advances in the source. Thus, it
is certainly worth having it efficient, or at the minimum not having it
needlessly inefficient. I suppose the same functionalities can be highly useful
in various other use cases of text processing.
Note again that I'm rediscovering Python (with some pleasure :-), thus may miss
known solutions -- but I asked on the tutor mailing list.
In both cases, I guess ordinary idiomatic Python code actually _creates_ a new
string object, as a substring of length 1 or more, which is otherwise useless;
for instance:
if s[i] == char:
    ...  # match ok -- object s[i] unneeded
if s[i:j] == substr:
    ...  # match ok -- object s[i:j] unneeded
What is actually needed is just to check for equality (or another check about a
code, see below).
The case of single-code checking appears (1) when a substring happens to hold a
single code (meaning it represents a simple or precomposed unicode char), or (2)
when matching a char from a given set, range, or more complex class (e.g. in
regex [a-zA-Z0-9_-']). In both cases, what we want is to check the code: compare
it to a constant value, check whether it belongs to a set of values, or whether
it lies inside a given range. We need the code -- not a single-code string.
Ideally, I'd like expressions like:
c = s.code(i)  # or s.ord(i) or s.ucode(i) [3]
# and then one of:
if c == code:
    ...  # match ok
if c in codes:
    ...  # match ok
if code1 <= c <= code2:
    ...  # match ok
The builtin function ord(char) does not do the job, since it only works for a
single-char string. We would again need to create a new string, with ord(s[i]).
The right solution apparently is a string method like code(self, i) giving the
code at an arbitrary index. I guess this is trivial.
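For comparison, the closest spelling available today goes through a
one-character temporary; a hypothetical helper makes the cost explicit:

def code_at(s, i):
    # What the proposed s.code(i) would return, but via the temporary
    # one-character string whose allocation the proposal wants to avoid.
    return ord(s[i])

digit_codes = set(range(ord('0'), ord('9') + 1))
print(code_at("x = 42", 4) in digit_codes)   # True: '4' is a digit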
I'm surprised it does not exist; maybe some may think this is a symptom that
there is no strong need for it; instead, I guess people routinely use a typical
Python idiom without even noticing it creates an unneeded string object. [2] [3]
What do you think?
A second need is checking substring equality against constant substrings of
arbitrary sizes. This is similar to startswith & endswith, except at any code
index in the source string; a generalisation. In a C implementation, it would
probably delegate to memcmp, with a start pointer set to p_source+i. On the
Python side, it may be a string method like sub_equals(self, substr, i). Choose
your preferred name ;-). [1] [4]
if s.sub_equals(substr, i):
    ...  # match ok
What do you think? (bis)
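Worth noting: str.startswith already accepts an optional start offset,
which seems to give exactly this behavior without slicing:

s = "parsing example"
substr = "example"
i = 8
# Compares substr against s beginning at index i; no intermediate
# substring of s is created.
if s.startswith(substr, i):
    print("match ok")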
Thank you,
Denis
[1] I am unsure whether an end index is useful; actually I don't really
understand its usage for startswith & endswith either.
[2] Actually, the compiler, if smart enough, may eliminate this object
construction and just check the code; does it? Anyway, I think it is not that
easy in the cases of ranges & sets.
[3] As a side-note, 'ord' is in my view a misnomer, since character codes are
not ordinals, with significant order, but nominals, plain numerical codes which
only need to be all distinct; they are kinds of id's. For unicode, I call them
'ucodes', an idea I stole somewhere. But I would be happy is the method is
called 'ord' anyway, since the term is established in the Python community.
[4] Would such a new method make startswith & endswith unneeded?
http://docs.python.org/2/library/string.html#template-strings
## Original Idea
stdlib lacks the most popular basic variable extension syntax
"{{ variable }}" that can be found in Django [1], Jinja2 [2] and
other templating engines [3].
## stdlib Analysis
string.Template syntax is ancient (dates back to Python 2.4
from 9 years ago). I haven't seen a template like this for a long time.
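For contrast, the existing syntax looks like this:

from string import Template

t = Template('Hello $world.')
print(t.substitute(world='is not enough'))   # Hello is not enough.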
## Scope of Enhancement
st = 'Hello {{world}}.'
world = 'is not enough'
t = Template(st, style='brace')
t.render(locals())
## Links
1. https://docs.djangoproject.com/en/dev/topics/templates/#variables
2. http://jinja.pocoo.org/docs/templates/#variables
3. http://mustache.github.io/
## Feature Creeping
# Allow overriding the {{ }} symbols to make it more generic.
# `foo.bar` attribute lookup for 2D (nested) structures.
The question is which lookup order should be supported:
`foo.bar` in Django does dictionary lookup first, then attribute lookup
`foo.bar` in Jinja2 does attribute lookup first
I am not sure which is better. I definitely don't want some method or
property on a dict passed to the render() method to hide a dict value.
--
anatoly t.
Hello all,
I would like to propose refactoring the assertions out of
unittest.TestCase. As code speaks louder than words, you can see my
initial work at https://github.com/OddBloke/cpython.
The aim of the refactor is:
(a) to reduce the amount of repeated code in the assertions,
(b) to provide a clearer framework for third-party assertions to follow,
and
(c) to make it easier to split gnarly assertions into an
easier-to-digest form.
My proposed implementation (as seen in the code above) is to move each
assertion into its own class. There will be a shared superclass (called
Assert currently; see [0]) implementing the template pattern (look at
__call__ on line 69), meaning that each assertion only has to concern
itself with its unique aspects: what makes it fail, and how that
specific failure should be presented.
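A hypothetical sketch of that template pattern (names are illustrative,
not taken from the linked branch):

class Assert(object):
    failureException = AssertionError

    def __call__(self, *args, msg=None):
        # Template method: subclasses supply only the check and the
        # default failure message.
        if not self.check(*args):
            raise self.failureException(msg or self.default_message(*args))

class AssertEqual(Assert):
    def check(self, first, second):
        return first == second

    def default_message(self, first, second):
        return '%r != %r' % (first, second)

assert_equal = AssertEqual()
assert_equal(1, 1)     # passes silently
# assert_equal(1, 2)   # would raise AssertionError: 1 != 2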
To maintain the current TestCase interface (all of the tests pass in my
branch), the existing assert* methods instantiate the Assert sub-classes
on each call with a context that captures self.longMessage,
self.failureException, self.maxDiff, and self._diffThreshold from the
TestCase instance.
Other potential aims include eventually deprecating the assertion
methods, providing a framework for custom assertions to hook in to, and
providing assertion functions (a la nose.tools) with a default context
set.
This proposal would help address #18054[1] as the new assertions could
just be implemented as separate classes (and not included in the
TestCase.assert* API); people who wanted to use them could just
instantiate them themselves.
I’d love some feedback on this proposal (and the implementation thus
far).
Cheers,
Dan (Odd_Bloke)
[0] https://github.com/OddBloke/cpython/blob/master/Lib/unittest/assertions/__i…
[1] http://bugs.python.org/issue18054