There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
    class foo(object):
        @classmethod
        @property
        def bar(cls, ...):
            ...
Essentially the permutations are, I think:
{'unadorned' | abc.abstract} x {'normal' | static | class} x
{method | property | non-callable attribute}.
concreteness  implicit 1st arg  type                    name                                                 comments
------------  ----------------  ----------------------  ---------------------------------------------------  -----------
{unadorned}   {unadorned}       method                  def foo():                                           exists now
{unadorned}   {unadorned}       property                @property                                            exists now
{unadorned}   {unadorned}       non-callable attribute  x = 2                                                exists now
{unadorned}   static            method                  @staticmethod                                        exists now
{unadorned}   static            property                @staticproperty                                      proposing
{unadorned}   static            non-callable attribute  {degenerate case}                                    unnecessary
{unadorned}   class             method                  @classmethod                                         exists now
{unadorned}   class             property                @classproperty or @classmethod;@property             proposing
{unadorned}   class             non-callable attribute  {degenerate case}                                    unnecessary
abc.abstract  {unadorned}       method                  @abc.abstractmethod                                  exists now
abc.abstract  {unadorned}       property                @abc.abstractproperty                                exists now
abc.abstract  {unadorned}       non-callable attribute  @abc.abstractattribute or @abc.abstract;@attribute   proposing
abc.abstract  static            method                  @abc.abstractstaticmethod                            exists now
abc.abstract  static            property                @abc.abstractstaticproperty                          proposing
abc.abstract  static            non-callable attribute  {degenerate case}                                    unnecessary
abc.abstract  class             method                  @abc.abstractclassmethod                             exists now
abc.abstract  class             property                @abc.abstractclassproperty                           proposing
abc.abstract  class             non-callable attribute  {degenerate case}                                    unnecessary

(The {degenerate case} rows are marked unnecessary because variables
don't have arguments.)
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
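For concreteness, here is a rough pure-Python sketch of what I mean by
@classproperty; the descriptor below (and the Circle example) is my
illustration, not an existing stdlib API, and it's read-only:

    class classproperty:
        """Like @property, but the getter receives the class, so the
        property also works when accessed on the class itself."""
        def __init__(self, fget):
            self.fget = fget
        def __get__(self, instance, owner):
            # `owner` is the class, whether accessed via class or instance.
            return self.fget(owner)

    class Circle:
        _unit_radius = 1

        @classproperty
        def unit_area(cls):
            return 3.14159 * cls._unit_radius ** 2

    print(Circle.unit_area)   # works directly on the class, no instance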
--rich
At the moment, the array module of the standard library allows one to
create arrays of different numeric types and to initialize them from an
iterable (eg, another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow one to
simply specify the number of items that should be allocated? (I do not
really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array construction in such a way that you
could pass an iterable (as now) as the second argument, but if you pass
a single integer value, it should be treated as the number of items to
allocate.
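For example, under the proposed semantics (hypothetical, not current
behaviour):

    import array

    # Proposed: a single integer means "allocate this many items"
    # (contents unspecified), e.g. for a 6-billion-entry suffix array:
    sa = array.array('l', 6 * 10**9)

    # Unchanged: an iterable still initializes the array as it does today.
    xs = array.array('l', [1, 2, 3])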
Here is my current workaround (which is slow):
    import array

    def filled_array(typecode, n, value=0, bsize=(1 << 22)):
        """Return a new array with the given typecode
        (eg, "l" for long int, as in the array module)
        with n entries, initialized to the given value (default 0).
        """
        a = array.array(typecode, [value] * bsize)
        x = array.array(typecode)
        r = n
        while r >= bsize:
            x.extend(a)
            r -= bsize
        x.extend([value] * r)
        return x
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
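For example, both of these compile to a single LOAD_CONST of 'ab' in
current CPython (exact disassembly output varies by version):

    import dis

    # The peephole optimizer folds 'a' + 'b' into the constant 'ab' at
    # compile time, matching what implicit concatenation produces.
    dis.dis(compile("x = 'a' + 'b'", '<demo>', 'exec'))
    dis.dis(compile("x = 'a' 'b'", '<demo>', 'exec'))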
Would it be reasonable to start deprecating this and eventually remove
it from the language?
--
--Guido van Rossum (python.org/~guido)
o/
The French translation of the Python documentation [1][2] now covers the
pages that account for 20% of the pageviews on docs.python.org. I think
it's the right moment to push it to docs.python.org. So there are some
questions, and I'd like feedback.
TL;DR (with my personal choices):
- URL may be "http://docs.python.org/fr/"
- For localized variations of languages we should use dash and
lowercase like "docs.python.org/pt-br/"
- po files may be hosted on the python's github
- existing script to build doc may be patched to build translations
- each translations may crosslink to others
- untranslated strings may be visually marked as so
I also opened: http://bugs.python.org/issue26546.
# Chronology, dependencies
The only blocking decision here is the URL (well, that and reviewing my
patch ...); with those two done, the translated docs can be pushed to
production, and the other steps can be discussed and applied one by one.
# The URL
## CCTLD vs path vs subdomain
I think we should use a variation of "docs.python.org/fr/" for
simplicity and clarity.
I think we should avoid using ccTLDs as they're sometimes hard or near
impossible to obtain (and may cost a lot of time); also, some are
expensive, so it's time and money we clearly don't need to lose.
The last possibility I see is to use a subdomain, like fr.docs.python.org
or docs.fr.python.org, but I don't think choosing the language is the
role / responsibility of the subdomain.
So I'm for docs.python.org/LANGUAGE_TAG/ (without moving current
documentation inside a /en/).
## Language tag in path
### Dropping the default locale of a language
I personally think we should not show the region when it's redundant: so
use "fr" instead of "fr-FR" and "de" instead of "de-DE", but keep the
possibility of a full locale code when it's not redundant, as for "pt-br"
or "de-AT" (German ('de') as used in Austria ('AT')).
I think so because I don't expect we'll have many locale variations (like
de-AT, fr-CH, fr-CA, ...), so the region would be redundant most of the
time (visually heavy, longer to type, longer to read), though we'll still
need some locales (pt-BR, typically).
### gettext VS IETF language tag format
gettext uses an underscore between the language and the region [3], while
IETF language tags use a dash [4][5].
As Sphinx uses gettext, and gettext uses the underscore, we could choose
the underscore too. But URLs are not here to leak the underlying
implementation, and the IETF format looks to be the standard way to
represent language tags. Also, I visually prefer the dash over the
underscore, so I'm for the dash here.
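As a minimal sketch of the mapping I have in mind (the helper name and
the "drop the region when it merely repeats the language" heuristic are
illustrative only, not an existing API):

    def locale_to_url_tag(locale_name):
        """Map a gettext locale name to an IETF-style URL tag:
        'pt_BR' -> 'pt-br', 'fr_FR' -> 'fr' (redundant region dropped).
        """
        parts = locale_name.split('_')
        lang = parts[0].lower()
        # Drop the region when it just repeats the language code.
        if len(parts) == 1 or parts[1].lower() == lang:
            return lang
        return '%s-%s' % (lang, parts[1].lower())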
### Lower case vs upper case local tag
RFC 5646 section 2.1 tells us language tags are not case sensitive, yet
ISO 3166-1 recommends that country codes (part of the language tag) be
capitalized. I personally prefer all lowercase, as paths in URLs typically
are lowercase. I searched for `inurl:"pt-br"` to check that I'm not too
far from actual usage, and usage seems to agree with me, although there
are some "pt-BR" URLs out there.
# Where to host the translated files
Currently we're hosting the *po* files in the AFPy's (the francophone
Python association) [6] GitHub [1], but it may make sense to use (in the
generation scripts) a more controlled / restricted clone under the python
GitHub organization, at least to have a better view of who can push to
the documentation.
We may also want to aggregate all translations under the same git
repository, but I don't feel that's useful.
# How to
Currently, a python script [7] is used to generate `docs.python.org`; I
proposed a patch in [8] to make this script clone and build the French
translation too. It's a simple and effective way; I don't think we need
more, but any idea is welcome.
On our side, we have a Makefile [12] to build the translated doc, which is
only a thin layer on top of the Sphinx Makefile. So my proposed patch to
the build scripts "just" delegates the build to our Makefile, which itself
delegates the hard work to the Sphinx Makefile.
# Next ?
## Document how to translate Python
I think I can (should) write documentation on "how to start a Python doc
translation project" and "how to migrate existing [9][10][11] Python doc
translation projects to docs.python.org", if French does go to
docs.python.org, because it may hopefully motivate people to do the same,
and I think our structure is a nice way to do it (a Makefile to generate
the doc, all versions translated, people mainly working on the latest
version, scripts to propagate translations to older versions, etc.).
## Crosslinking between existing translations
Once the translations are on `docs.python.org`, crosslinks may be
established so people reading one translation can be aware of the others
and easily switch to them. I'm not a UI/UX man, but I think we could have
a select box right before the existing version select box, in the
top-left corner. Right before because it'll reflect the path: /fr/3.5/
-> [select box fr][select box 3.5].
## Marking as "untranslated, you can help" the untranslated paragraphs
The translations will always need work to follow upstream modifications:
marking untranslated paragraphs as such may transform the "Oh, they suck,
this paragraph is not even translated :-(" reaction into "Hey, nice, I
can help translate that!". There's an open sphinx-doc ticket for this
[13], but I have not worked on it yet. As previously said, I'm really bad
at designing user interfaces, so I don't even visualize how I'd like it
to look.
[1] http://www.afpy.org/doc/python/3.5/
[2] https://github.com/afpy/python_doc_fr
[3] https://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html
[4] http://tools.ietf.org/html/rfc5646
[5] https://en.wikipedia.org/wiki/IETF_language_tag
[6] http://www.afpy.org/
[7] https://github.com/python/docsbuild-scripts/
[8] http://bugs.python.org/issue26546
[9] http://docs.python.jp/3/
[10] https://github.com/python-doc-ja/python-doc-ja
[11] http://docs.python.org.ar/tutorial/3/index.html
[12] https://github.com/AFPy/python_doc_fr/blob/master/Makefile
[13] https://github.com/sphinx-doc/sphinx/issues/1246
--
Julien Palard
Hello all,
This is a writeup of a proposal I floated here:
https://mail.python.org/pipermail/python-list/2015-August/694905.html
last Sunday. If the response is positive I wish to write a PEP.
Briefly, it is a natural expectation among users that the command:
    python -m module_name ...
used to invoke modules in "main program" mode on the command line imports
the module as "module_name". It does not; it imports it as "__main__". A
subsequent import of "module_name" within the program then makes a new
instance of the module, which causes cognitive dissonance and has the
side effect that the program now has two instances of the module.
What I propose is that the above command line _should_ bind
sys.modules['module_name'] as well as binding '__main__' as it does currently.
I'm proposing that the python -m option have this effect (Python
pseudocode):
    % python -m module.name ...
runs:
    # pseudocode, with values hardwired for clarity
    import sys
    M = new_empty_module(name='__main__', qualname='module.name')
    sys.modules['__main__'] = M
    sys.modules['module.name'] = M
    # load the module code from wherever (not necessarily a file - CPython
    # already must do this phase)
    M.execfile('/path/to/module/name.py')
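As a rough approximation of the proposed semantics with today's APIs (the
helper name is mine, and details like sys.argv and package handling are
elided):

    import sys
    import types

    def run_module_as_main(qualname, path):
        # One module object, bound under BOTH names, so that a later
        # "import qualname" finds this same instance.
        M = types.ModuleType('__main__')
        M.__qualname__ = qualname      # the proposed new attribute
        M.__file__ = path
        sys.modules['__main__'] = M
        sys.modules[qualname] = M
        with open(path) as f:
            code = compile(f.read(), path, 'exec')
        exec(code, M.__dict__)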
Specifically, this would make the following two changes to current
practice:
1) The module is imported _once_, and bound to both its canonical name
and also to __main__.
2) Imported modules acquire a new attribute __qualname__ (analogous to
the recent __qualname__ on functions). This is always the canonical name
of the module as resolved by the importer. For most modules __name__ will
be the same as __qualname__, but for the "main" module __name__ will be
'__main__'.
This change has the following advantages:
The current standard boilerplate:
    if __name__ == '__main__':
        ... invoke "main program" here ...
continues to work unchanged.
Importantly, if the program then issues "import module_name", it is already
there and the existing instance is found and used.
The thread referenced above outlines my most recent encounter with this and the
trouble it caused me. Followup messages include some support for this proposed
change, and some criticism.
The critiquing article included some workarounds for this multiple-module
situation, but they were (1) somewhat dependent on modules coming from a
file pathname and (2) cumbersome, requiring every affected end user to
adopt them. I'd like to avoid that.
Cheers,
Cameron Simpson <cs(a)zip.com.au>
The reasonable man adapts himself to the world; the unreasonable one persists
in trying to adapt the world to himself. Therefore all progress depends
on the unreasonable man. - George Bernard Shaw
Has anyone else found this to be too syntactically noisy?
from module import Foo as _Foo, bar as _bar
That is horrifically noisy IMO. The problem is, how do we
remove the noise without sacrificing intuitiveness? My first
idea was to do this:
from module import_private Foo, bar
And while it's self-explanatory, it's also too long. So I thought...
from module _import Foo, bar
I'm leaning more towards the latter, but I'm not loving it either. Any
ideas?
It's common to want to clip (or clamp) a number to a range. This feature
is commonly needed for both floating point numbers and integers:
http://stackoverflow.com/questions/9775731/clamping-floating-numbers-in-pyt…
http://stackoverflow.com/questions/4092528/how-to-clamp-an-integer-to-some-…
There are a few approaches:
* use a couple ternary operators
(e.g. https://github.com/scipy/scipy/pull/5944/files line 98, which
generated a lot of discussion)
* use a min/max construction,
* call sorted on a list of the three numbers and pick out the first, or
* use numpy.clip.
Am I right that there is no *obvious* way to do this? If so, I suggest
adding math.clip (or math.clamp) to the standard library, with the
meaning:
    def clip(number, lower, upper):
        return lower if number < lower else upper if number > upper else number
This would work for non-numeric types so long as the non-numeric types
support comparison. It might also be worth adding
    assert lower < upper
to catch some bugs.
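For example, with the clip above:

    >>> clip(15, 0, 10)
    10
    >>> clip(-3.2, 0.0, 10.0)
    0.0
    >>> clip('m', 'a', 'f')    # any type supporting comparison
    'f'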
Best,
Neil
Hello,
I have run into a use case where I want to create a wrapper object that
will star-unpack a sequence of arguments to pass to a function. Currently
one way to construct this is to create a function:
    def starcaller(f):
        def wrapper(args):
            return f(*args)
        return wrapper
Such a function feels simple enough, and specific enough to the language,
that it would fit in well in the operator module of the standard library.
We already have a similar representation of this functionality in
itertools.starmap.
Whereas the nested function implementation above produces unpickleable
objects (due to the closure), a straightforward C implementation would
produce pickleable ones, making them useful for parallel applications.
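To illustrate the pickling problem with the pure-Python version above:

    import pickle

    def add(a, b):
        return a + b

    wrapped = starcaller(add)
    print(wrapped((1, 2)))     # 3 -- one starmap-style call

    # The closure can't be pickled, so it can't be shipped to a worker
    # process (Python 3 raises for the local 'wrapper' object):
    try:
        pickle.dumps(wrapped)
    except (pickle.PicklingError, AttributeError) as e:
        print('not pickleable:', e)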
Thanks,
Dan Spitz
Maybe it's time to add a new module for sequence-specific functions
(seqtools?). It should contain at least two classes or factory functions:
1. A view that represents a sliced subsequence: a lazy equivalent of
seq[start:end:step]. This feature is implemented in the third-party
module dataview [1].
2. A view that represents a linear sequence as a 2D array. Iterating
this view emits non-intersecting chunks of the sequence. For example, it
can be used for representing a bytes object as a sequence of 1-byte bytes
objects (as in 2.x), a generalized alternative to iterbytes() from PEP
467 [2].
Neither the itertools nor the collections module looks like a good place
for these features, since they are not concrete classes and work only
with sequences, not general iterables or iterators. On the other hand,
mappingproxy and ChainMap look close; maybe the new module should be
oriented not on sequences, but on views.
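As a minimal sketch of the second idea, a chunked 2D view (the class name
and API are illustrative only):

    class ChunkedView:
        """Lazily present a flat sequence as non-intersecting
        chunks of size n."""
        def __init__(self, seq, n):
            self._seq = seq
            self._n = n
        def __len__(self):
            return (len(self._seq) + self._n - 1) // self._n
        def __getitem__(self, i):
            if not 0 <= i < len(self):
                raise IndexError(i)
            start = i * self._n
            return self._seq[start:start + self._n]

    # E.g. bytes as a sequence of 1-byte bytes objects, as iterbytes()
    # from PEP 467 would provide:
    print(list(ChunkedView(b'ACGT', 1)))    # [b'A', b'C', b'G', b'T']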
[1] https://pypi.python.org/pypi/dataview
[2] https://www.python.org/dev/peps/pep-0467
There are currently a few locations in the stdlib, such as http and socket, that are now using
Enums to replace constants; those names are all upper-case -- those aren't the names I am
speaking of.
The names I am speaking of are those in brand-new enumerations where we have full control.
As an example:
    class FederalHoliday(AutoNumberEnum):

        NewYear = "First day of the year.", 'absolute', JANUARY, 1
        MartinLutherKingJr = "Birth of Civil Rights leader.", 'relative', JANUARY, MONDAY, 3
        President = "Birth of George Washington", 'relative', FEBRUARY, MONDAY, 3
        Memorial = "Memory of fallen soldiers", 'relative', MAY, MONDAY, 5
        Independence = "Declaration of Independence", 'absolute', JULY, 4
        Labor = "American Labor Movement", 'relative', SEPTEMBER, MONDAY, 1
        Columbus = "Americas discovered", 'relative', OCTOBER, MONDAY, 2
        Veterans = "Recognition of Armed Forces service", 'relative', NOVEMBER, 11, 1
        Thanksgiving = "Day of Thanks", 'relative', NOVEMBER, THURSDAY, 4
        Christmas = "Birth of Jesus Christ", 'absolute', DECEMBER, 25

        def __init__(self, doc, type, month, day, occurrence=None):
            self.__doc__ = doc
            self.type = type
            self.month = month
            self.day = day
            self.occurrence = occurrence

        def date(self, year):
            """
            Return the observed date of the holiday for `year`.
            """
            ...

        @classmethod
        def next_business_day(cls, date, days=1):
            """
            Return the next `days` business day from date.
            """
            ...

        @classmethod
        def count_business_days(cls, date1, date2):
            """
            Return the number of business days between `date1` and `date2`.
            """
            ...

        @classmethod
        def year(cls, year):
            """
            Return a list of the actual FederalHoliday dates for `year`.
            """
            ...
Take the name "NewYear": if it had been a global constant I would have named it "NEWYEAR"; if
it had been a normal class attribute I would have named it "new_year"; however, being an Enum
member, it is neither of those things.
<context switch>
I've written some custom data types as part of my dbf package, and a few of them have instances
that are singletons that are created in the global (okay, module) namespace, and for them I
followed Python's lead in naming singletons: Python has used Title Case in such things as None,
True, and False, so I followed suit and named mine -- Null, NullDate, NullTime, NullDateTime, etc.
</context switch>
Given my past history with using and creating singleton objects, I followed suit when creating
my own Enum classes.
I was recently queried about my apparent break with PEP 8 for naming Enum members, to which I
replied:
> Considering the strange beast that an Enum is, there is not much precedent for it anywhere.
>
> Consider:
>
> - Enum is a class
> - but it is a container
> - and can be iterated over
> - and it has a length (which can be zero)
> - but it's always True in a boolean sense
>
> - Enum members are instances of the Enum class
> - but are pre-created
> - and new ones cannot be created
> - but are available as attributes on the class
>
> Given all that I have been using Title case (or CamelCase) to name the members as it helps
> distinguish an Enum member from an ordinary attribute (which Enum classes can also have).
I forgot to include in that reply that I think CamelCase also helps to emphasize the special
singleton nature of Enum members.
My question for the community: Your thoughts/opinions of my reasoning, and if you don't agree
then which casing choice would you recommend and use, and why? (Reminder: this question does
not include Enums whose names are replacements for existing constants and so the names cannot
be changed.)
--
~Ethan~