Mailman 3 September 2016 - Python-ideas

@classproperty, @abc.abstractclasspropery, etc.
by K. Richard Pixley 16 Dec '20

16 Dec '20

There's a whole matrix of these and I'm wondering why the matrix is currently sparse rather than implementing them all. Or rather, why we can't stack them as: class foo(object): @classmethod @property def bar(cls, ...): ... Essentially the permutation are, I think: {'unadorned'|abc.abstract}{'normal'|static|class}{method|property|non-callable attribute}. concreteness implicit first arg type name comments {unadorned} {unadorned} method def foo(): exists now {unadorned} {unadorned} property @property exists now {unadorned} {unadorned} non-callable attribute x = 2 exists now {unadorned} static method @staticmethod exists now {unadorned} static property @staticproperty proposing {unadorned} static non-callable attribute {degenerate case - variables don't have arguments} unnecessary {unadorned} class method @classmethod exists now {unadorned} class property @classproperty or @classmethod;@property proposing {unadorned} class non-callable attribute {degenerate case - variables don't have arguments} unnecessary abc.abstract {unadorned} method @abc.abstractmethod exists now abc.abstract {unadorned} property @abc.abstractproperty exists now abc.abstract {unadorned} non-callable attribute @abc.abstractattribute or @abc.abstract;@attribute proposing abc.abstract static method @abc.abstractstaticmethod exists now abc.abstract static property @abc.staticproperty proposing abc.abstract static non-callable attribute {degenerate case - variables don't have arguments} unnecessary abc.abstract class method @abc.abstractclassmethod exists now abc.abstract class property @abc.abstractclassproperty proposing abc.abstract class non-callable attribute {degenerate case - variables don't have arguments} unnecessary I think the meanings of the new ones are pretty straightforward, but in case they are not... @staticproperty - like @property only without an implicit first argument. Allows the property to be called directly from the class without requiring a throw-away instance. @classproperty - like @property, only the implicit first argument to the method is the class. Allows the property to be called directly from the class without requiring a throw-away instance. @abc.abstractattribute - a simple, non-callable variable that must be overridden in subclasses @abc.abstractstaticproperty - like @abc.abstractproperty only for @staticproperty @abc.abstractclassproperty - like @abc.abstractproperty only for @classproperty --rich

10 15

Specify number of items to allocate for array.array() constructor
by Sven Rahmann 21 Feb '20

21 Feb '20

At the moment, the array module of the standard library allows to create arrays of different numeric types and to initialize them from an iterable (eg, another array). What's missing is the possiblity to specify the final size of the array (number of items), especially for large arrays. I'm thinking of suffix arrays (a text indexing data structure) for large texts, eg the human genome and its reverse complement (about 6 billion characters from the alphabet ACGT). The suffix array is a long int array of the same size (8 bytes per number, so it occupies about 48 GB memory). At the moment I am extending an array in chunks of several million items at a time at a time, which is slow and not elegant. The function below also initializes each item in the array to a given value (0 by default). Is there a reason why there the array.array constructor does not allow to simply specify the number of items that should be allocated? (I do not really care about the contents.) Would this be a worthwhile addition to / modification of the array module? My suggestions is to modify array generation in such a way that you could pass an iterator (as now) as second argument, but if you pass a single integer value, it should be treated as the number of items to allocate. Here is my current workaround (which is slow): def filled_array(typecode, n, value=0, bsize=(1<<22)): """returns a new array with given typecode (eg, "l" for long int, as in the array module) with n entries, initialized to the given value (default 0) """ a = array.array(typecode, [value]*bsize) x = array.array(typecode) r = n while r >= bsize: x.extend(a) r -= bsize x.extend([value]*r) return x

14 20

Implicit string literal concatenation considered harmful?
by Guido van Rossum 14 Mar '18

14 Mar '18

I just spent a few minutes staring at a bug caused by a missing comma -- I got a mysterious argument count error because instead of foo('a', 'b') I had written foo('a' 'b'). This is a fairly common mistake, and IIRC at Google we even had a lint rule against this (there was also a Python dialect used for some specific purpose where this was explicitly forbidden). Now, with modern compiler technology, we can (and in fact do) evaluate compile-time string literal concatenation with the '+' operator, so there's really no reason to support 'a' 'b' any more. (The reason was always rather flimsy; I copied it from C but the reason why it's needed there doesn't really apply to Python, as it is mostly useful inside macros.) Would it be reasonable to start deprecating this and eventually remove it from the language? -- --Guido van Rossum (python.org/~guido)

51 165

https://docs.python.org/fr/ ?
by Julien Palard 02 Feb '17

02 Feb '17

o/ The french translation of the Python Documentation [1][2] has translated 20% of the pageviews of docs.python.org. I think it's the right moment to push it do docs.python.org. So there's some questions ! And I'd like feedback. TL;DR (with my personal choices): - URL may be "http://docs.python.org/fr/" - For localized variations of languages we should use dash and lowercase like "docs.python.org/pt-br/" - po files may be hosted on the python's github - existing script to build doc may be patched to build translations - each translations may crosslink to others - untranslated strings may be visually marked as so I also opened: http://bugs.python.org/issue26546. # Chronology, dependencies The only blocking decision here is the URL, (also reviewing my patch ...), with those two, translated docs can be pushed to production, and the other steps can be discussed and applied one by one. # The URL ## CCTLD vs path vs subdomain I think we should use a variation of "docs.python.org/fr/" for simplicity and clarity. I think we should avoid using CCTLDs as they're sometime hard or near impossible to obtain (may cost a lot of time), also some are expensive, so it's time and money we clearly don't need to loose. Last possibility I see is to use a subdomain, like fr.docs.python.org or docs.fr.python.org but I don't think it's the role / responsibility of the sub-domain to do it. So I'm for docs.python.org/LANGUAGE_TAG/ (without moving current documentation inside a /en/). ## Language tag in path ### Dropping the default locale of a language I personally think we should not show the region in case it's redundant: so to use "fr" instead of "fr-FR", "de" instead of "de-DE", but keeping the possibility to use a locale code when it's not redundant like for "pt-br" or "de-AT" (German ('de') as used in Austria ('AT')). I think so because I don't think we'll have a lot of locale variations (like de-AT, fr-CH, fr-CA, ...) so it will be most of the time redundant (visually heavy, longer to type, longer to read) but we'll still need some locale (pt-BR typically). ### gettext VS IETF language tag format gettext goes by using an underscore between language and locale [3] and IETF goes by using a dash [4][5]. As sphinx is using gettext, and gettext uses underscore we may choose underscore too. But URLs are not here to leak the underlying implementation, and the IETF looks like to be the standard way to represent language tags. Also I visually prefer the dash over the underscore, so I'm for the dash here. ### Lower case vs upper case local tag RFC 5646 section-2.1 tells us language tags are not case sensitive, yet ISO3166-1 recommends that country codes (part of the language tag) be capitalized. I personally prefer the all-lowercase one as paths in URLs typically are lowercase. I searched for `inurl:"pt-br"` to see if I'm not too far away from the usage here and usage seems to agree with me, although there's some "pt-BR" in urls. # Where to host the translated files Currently we're hosting the *po* files in the afpy's (Francophone association for python) [6] github [1] but it may make sense to use (in the generation scripts) a more controlled / restricted clone in the python github, at least to have a better view of who can push on the documentation. We may want to choose between aggregating all translations under the same git repository but I don't feel it's useful. # How to Currently, a python script [7] is used to generate `docs.python.org`, I proposed a patch in [8] to make this script clone and build the french translation too, it's a simple and effective way, I don't think we need more ? Any idea welcome. In our side, we have a Makefile [12] to build the translated doc which is only a thin layer on top of the Sphinx Makefile. So my proposed patch to build scripts "just" delegate the build to our Makefile which itself delegate the hard work to the Sphinx Makefile. # Next ? ## Document how to translate Python I think I can (should) write a documentation on "how to start a Python doc translation project" and "how to migrate existing [9][10][11] python doc translation projects to docs.python.org" if french does goes docs.python.org because it may hopefully motivate people to do the same, and I think our structure is a nice way to do it (A Makefile to generate the doc, all versions translated, people mainly working on latest version, scripts to propagating translations to older version, etc...). ## Crosslinking between existing translations Once the translations are on `docs.python.org`, crosslinks may be established so people on a version can be aware of other version, and easily switch to them. I'm not a UI/UX man but I think we may have a select box right before the existing select box about version, on the top-left corner. Right before because it'll reflect the path: /fr/3.5/ -> [select box fr][select box 3.5]. ## Marking as "untranslated, you can help" the untranslated paragraphs The translations will always need work to follow upstream modifications: marking untranslated paragraphs as so may transform the "Oh they suck, this paragraph is not even translated :-(" to "Hey, nice I can help translating that !". There's an opened sphinx-doc ticket to do so [13] but I have not worked on it yet. As previously said I'm real bad at designing user interfaces, so I don't even visualize how I'd like it to be. [1] http://www.afpy.org/doc/python/3.5/ [2] https://github.com/afpy/python_doc_fr [3] https://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html [4] http://tools.ietf.org/html/rfc5646 [5] https://en.wikipedia.org/wiki/IETF_language_tag [6] http://www.afpy.org/ [7] https://github.com/python/docsbuild-scripts/ [8] http://bugs.python.org/issue26546 [9] http://docs.python.jp/3/ [10] https://github.com/python-doc-ja/python-doc-ja [11] http://docs.python.org.ar/tutorial/3/index.html [12] https://github.com/AFPy/python_doc_fr/blob/master/Makefile [13] https://github.com/sphinx-doc/sphinx/issues/1246 -- Julien Palard

7 8

proposal: "python -m foo" should bind sys.modules['foo']
by Cameron Simpson 17 Jan '17

17 Jan '17

Hello all, This is a writeup of a proposal I floated here: https://mail.python.org/pipermail/python-list/2015-August/694905.html last Sunday. If the response is positive I wish to write a PEP. Briefly, it is a natural expectation in users that the command: python -m module_name ... used to invoke modules in "main program" mode on the command line imported the module as "module_name". It does not, it imports it as "__main__". An import within the program of "module_name" makes a new instance of the module, which causes cognitive dissonance and has the side effect that now the program has two instances of the module. What I propose is that the above command line _should_ bind sys.modules['module_name'] as well as binding '__main__' as it does currently. I'm proposing that the python -m option have this effect (python pseudocode): % python -m module.name ... runs: # pseudocode, with values hardwired for clarity import sys M = new_empty_module(name='__main__', qualname='module.name') sys.modules['__main__'] = M sys.modules['module.name'] = M # load the module code from wherever (not necessarily a file - CPython # already must do this phase) M.execfile('/path/to/module/name.py') Specificly, this would have the following two changes to current practice: 1) the module is imported _once_, and bound to both its canonical name and also to __main__. 2) imported modules acquire a new attribute __qualname__ (analogous to the recent __qualname__ on functions). This is always the conanoical name of the module as resolved by the importer. For most modules __name__ will be the same as __qualname__, but for the "main" module __name__ will be '__main__'. This change has the following advantages: The current standard boilerplate: if __name__ == '__main__': ... invoke "main program" here ... continues to work unchanged. Importantly, if the program then issues "import module_name", it is already there and the existing instance is found and used. The thread referenced above outlines my most recent encounter with this and the trouble it caused me. Followup messages include some support for this proposed change, and some criticism. The critiquing article included some workarounds for this multiple module situation, but they were (1) somewhat dependent on modules coming from a file pathname and (2) cumbersome and require every end user to adopt these changes if affected by the situation. I'd like to avoid that. Cheers, Cameron Simpson <cs(a)zip.com.au> The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man. - George Bernard Shaw

5 13

Importing public symbols and simultainiously privatizing them, is too noisy
by Rick Johnson 28 Dec '16

28 Dec '16

Has anyone else found this to be too syntactically noisy? from module import Foo as _Foo, bar as _bar That is horrifically noisy IMO. The problem is, how do we remove the noise without sacrificing intuitiveness? My first idea was to do this: from module import_private Foo, bar And while it's self explanatory, it's also too long. So i thought... from module _import Foo, bar I'm leaning more towards the latter, but i'm not loving it either. Any ideas?

9 17

singledispatch for instance methods
by Tim Mitchell 26 Dec '16

26 Dec '16

Hi All, We have a modified version of singledispatch at work which works for methods as well as functions. We have open-sourced it as methoddispatch (pypi: https://pypi.python.org/pypi/methoddispatch). IMHO I thought it would make a nice addition to python stdlib. What does everyone else think?

2 2

Null coalescing operator
by Arek Bulski 05 Nov '16

05 Nov '16

Sometimes I find myself in need of this nice operator that I used back in the days when I was programming in .NET, essentially an expression >>> expr ?? instead should return expr when it `is not None` and `instead` otherwise. A piece of code that I just wrote, you can see a use case: def _sizeof(self, context): if self.totalsizeof is not None: return self.totalsizeof else: raise SizeofError("cannot calculate size") With the oprator it would just be def _sizeof(self, context): return self.totalsizeof ?? raise SizeofError("cannot calculate size") pozdrawiam, Arkadiusz Bulski

36 123

SI scale factors alone, without units or dimensional analysis
by Steven D'Aprano 29 Oct '16

29 Oct '16

Ken has made what I consider a very reasonable suggestion, to introduce SI prefixes to Python syntax for numbers. For example, typing 1K will be equivalent to 1000. However, there are some complexities that have been glossed over. (1) Are the results floats, ints, or something else? I would expect that 1K would be int 1000, not float 1000. But what about fractional prefixes, like 1m? Should that be a float or a decimal? If I write 7981m I would expect 7.981, not 7.9809999999999999, so maybe I want a decimal float, not a binary float? Actually, what I would really want is for the scale factor to be tracked separately. If I write 7981m * 1M, I should end up with 7981000 as an int, not a float. Am I being unreasonable? Obviously if I write 1.1K then I'm expecting a float. So I'm not *entirely* unreasonable :-) (2) Decimal or binary scale factors? The SI units are all decimal, and I think if we support these, we should insist that K == 1000, not 1024. For binary scale factors, there is the IEC standard: http://physics.nist.gov/cuu/Units/binary.html which defines Ki = 2**10, Mi = 2**20, etc. (Fortunately this doesn't have to deal with fractional prefixes.) So it would be easy enough to support them as well. (3) µ or u, k or K? I'm going to go to the barricades to fight for the real SI prefixes µ and k to be supported. If people want to support the common fakes u and K as well, that's fine, I have no objection, but I think that its important to support the actual prefixes too. (Python 3 assumes UTF-8 as the default encoding, so it shouldn't cause any technical difficulties to support µ as syntax. The political difficulties though...) (4) What about E? E is tricky if we want 1E to be read as the integer 10**18, because it matches the floating point syntax 1E (which is currently a syntax error). So there's a nasty bit of ambiguity where it may be unclear whether or not 1E is intended as an int or an incomplete float, and then there's 1E1E which might be read as 1E1*10**18 or as just an error. Replacing E with (say) X is risky. The two largest current SI prefixes are Z and Y, it seems very likely that the next one added (if that ever happens) will be X. Actually, using any other letter risks clashing with a future expansion of the SI prefixes. (5) What about other numeric types? Just because there's no syntactic support for Fraction and Decimal shouldn't mean we can't use these scale factors with them. (6) What happens to int(), float() etc? I wouldn't want int("23K") to suddenly change from being an error to returning 23000. Presumably we would want int to take an optional argument to allow the interpretation of scale factors. This gives us an advantage: int("23E", scale=True) is unambiguously an int, and we can ignore the fact that it looks like a float. (7) What about repr() and str()? I don't think that the repr() or str() of numeric types should change. But perhaps format() could grow some new codes to display numbers using either the most obvious scale factor, or some specific scale factor. * * * This leads to my first proposal: require an explicit numeric prefix on numbers before scale factors are allowed, similar to how we treat non-decimal bases. 8M # remains a syntax error 0s8M # unambiguously an int with a scale factor of M = 10**6 0s1E1E # a float 1E1 with a scale factor of E = 10**18 0s1.E # a float 1. with a scale factor of E, not an exponent int('8M') # remains a ValueError int('0s8M', base=0) # returns 8*10**6 Or if that's too heavy (two whole characters, plus the suffix!) perhaps we could have a rule that the suffix must follow the final underscore of the number: 8_M # int 8*10*6 123_456_789_M # int 123456789*10**6 123_M_456 # still an error 8._M # float 8.0*10**6 int() and float() take a keyword only argument to allow a scale factor when converting from strings: int("8_M") # remains an error int("8_M", scale=True) # allowed This solves the problem with E and floats. Its only a scale factor if it immediately follows the final underscore in the float, otherwise it is the regular exponent sign. Proposal number two: don't make any changes to the syntax, but treat these as *literally* numeric scale factors. Add a simple module to the std lib defining the various factors: k = kilo = 10**3 M = mega = 10**6 G = giga = 10**9 etc. and then allow the user to literally treat them as scale factors by multiplying: from scaling import * int_value = 8*M float_value = 8.0*M fraction_value = Fraction(1, 8)*M decimal_value = Decimal("1.2345")*M and so forth. The biggest advantage of this is that there is no syntactic changes needed, it is completely backwards compatible, it works with any numeric type and even non-numbers: py> x = [None]*M py> len(x) 1000000 You can even scale by multiple factors: x = 8*M*K Disadvantages: none I can think of. (Some cleverness may be needed to have fractional scale values work with both floats and Decimals, but that shouldn't be hard.) -- Steve

14 20

if-statement in for-loop
by Dominik Gresch 05 Oct '16

05 Oct '16

Hi, I've recently found myself writing code similar to this: for i in range(10): if i == 5: continue "body" which I find a bit ugly. Obviously the same could be written as for i in range(10): if i != 5: "body" but here you would have to look at the end of the body to see if something happens when i==5. So I asked myself if a syntax as follows would be possible: for i in range(10) if i != 5: body Personally, I find this extremely intuitive since this kind of if-statement is already present in list comprehensions. What is your opinion on this? Sorry if this has been discussed before -- I didn't find anything in the archives. Best regards, Dominik Gresch

17 33