There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...
Essentially the permutations are, I think:
{unadorned | abc.abstract} {normal | static | class} {method | property | non-callable attribute}.
concreteness | implicit first arg | type | name | comments
{unadorned} | {unadorned} | method | def foo(): | exists now
{unadorned} | {unadorned} | property | @property | exists now
{unadorned} | {unadorned} | non-callable attribute | x = 2 | exists now
{unadorned} | static | method | @staticmethod | exists now
{unadorned} | static | property | @staticproperty | proposing
{unadorned} | static | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
{unadorned} | class | method | @classmethod | exists now
{unadorned} | class | property | @classproperty or @classmethod;@property | proposing
{unadorned} | class | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | {unadorned} | method | @abc.abstractmethod | exists now
abc.abstract | {unadorned} | property | @abc.abstractproperty | exists now
abc.abstract | {unadorned} | non-callable attribute | @abc.abstractattribute or @abc.abstract;@attribute | proposing
abc.abstract | static | method | @abc.abstractstaticmethod | exists now
abc.abstract | static | property | @abc.abstractstaticproperty | proposing
abc.abstract | static | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | class | method | @abc.abstractclassmethod | exists now
abc.abstract | class | property | @abc.abstractclassproperty | proposing
abc.abstract | class | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
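For anyone who wants to experiment, here is a rough sketch of what a read-only
@classproperty descriptor could look like with today's machinery (the name and
all details are mine, not an existing stdlib API):

class classproperty:
    """Like @property, but the getter receives the class instead of an instance."""
    def __init__(self, fget):
        self.fget = fget
    def __get__(self, obj, objtype=None):
        # Works for both Class.attr and instance.attr access.
        return self.fget(objtype if objtype is not None else type(obj))

class Circle:
    _unit_radius = 1.0

    @classproperty
    def unit_area(cls):
        return 3.14159 * cls._unit_radius ** 2

print(Circle.unit_area)    # 3.14159..., no throw-away instance needed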
--rich
At the moment, the array module of the standard library allows you to
create arrays of different numeric types and to initialize them from
an iterable (eg, another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow one
to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array creation in such a way that you
could pass an iterable (as now) as the second argument, but if you pass a
single integer value, it would be treated as the number of items to
allocate.
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """returns a new array with given typecode
    (eg, "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0)
    """
    a = array.array(typecode, [value] * bsize)
    x = array.array(typecode)
    r = n
    while r >= bsize:
        x.extend(a)
        r -= bsize
    x.extend([value] * r)
    return x
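For comparison, a shorter workaround uses the sequence repetition that
array.array already supports, and the proposed integer form could sit
alongside it (the integer-argument constructor below is hypothetical, not
current behaviour):

import array

# Hypothetical proposed form: a single integer means "allocate n items"
# sa = array.array('l', 6000000000)

# Existing workaround: repeat a one-element array n times
sa = array.array('l', [0]) * 1000000    # one million zero-initialized longs
print(len(sa), sa[0])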
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
Would it be reasonable to start deprecating this and eventually remove
it from the language?
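For concreteness, here is the failure mode in miniature (a throwaway sketch):

def foo(x, y):
    return x + y

foo('a', 'b')     # intended call, two arguments
# foo('a' 'b')    # missing comma: the literals are implicitly concatenated
#                 # into the single argument 'ab', so foo() raises a TypeError
#                 # about a missing positional argument

# Explicit concatenation of string literals is folded at compile time anyway,
# so there is no runtime cost to spelling it out:
s = 'a' + 'b'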
--
--Guido van Rossum (python.org/~guido)
This idea was already casually mentioned, but it sank deep into the threads
of the discussion, so I'm raising it again.
Currently, the reprs of classes and functions look like this:
>>> int
<class 'int'>
>>> int.from_bytes
<built-in method from_bytes of type object at 0x826cf60>
>>> open
<built-in function open>
>>> import collections
>>> collections.Counter
<class 'collections.Counter'>
>>> collections.Counter.fromkeys
<bound method Counter.fromkeys of <class 'collections.Counter'>>
>>> collections.namedtuple
<function namedtuple at 0xb6fc4adc>
What if we changed the default reprs of classes and functions to just the
fully qualified name, __module__ + '.' + __qualname__ (or just __qualname__
if __module__ is builtins)? This would look neater, and such reprs are
evaluable.
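Under that proposal the examples above would presumably display something like
this (illustrative only, not actual output):

>>> int
int
>>> open
open
>>> collections.Counter
collections.Counter
>>> collections.namedtuple
collections.namedtuple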
Hi Python-Ideas ML,
To summarize the idea quickly: I wish to add an "extra" attribute to
LogRecord, to make structured log generation easier.
For more details, with a use case and example, you can read the message below.
Before pushing the patch to bugs.python.org, I'm interested in your
opinions: the patch seems too simple to be honest.
Regards.
--
Ludovic Gasc (GMLudo)
http://www.gmludo.eu/
---------- Forwarded message ----------
From: Guido van Rossum <guido(a)python.org>
Date: 2015-05-24 23:44 GMT+02:00
Subject: Re: [Python-Dev] An yocto change proposal in logging module to
simplify structured logs support
To: Ludovic Gasc <gmludo(a)gmail.com>
Ehh, python-ideas?
On Sun, May 24, 2015 at 10:22 AM, Ludovic Gasc <gmludo(a)gmail.com> wrote:
> Hi,
>
> 1. The problem
>
> For now, when you want to write a log message, you concatenate the data
> from your context to generate a string: in effect, you convert your
> structured data to a string.
> When a sysadmin needs to debug your logs because something is wrong, he must
> write regular expressions to extract the interesting data.
>
> Often, he must find the beginning of the interesting log and follow the
> trail. Sometimes several requests are interleaved in the log at the same
> time, which makes the interesting lines harder to find.
> In fact, with regular expressions, the sysadmin is trying to convert the
> log lines (strings) back into structured data.
>
> 2. A possible solution
>
> You could provide a set of regular expressions to your sysadmins to help
> them find the right logs; however, another approach is possible:
> structured logs.
> Instead of breaking up your data structure to push it into the log message,
> the idea is to keep the data structure and attach it as metadata to the log
> message.
> For now, I know of at least Logstash and journald, which can handle
> structured logs and provide a query tool to extract logs easily.
>
> 3. A concrete example with structured logs
>
> Like most Web developers, we build HTTP daemons used by several different
> human clients at the same time.
> In the Python source code, supporting structured logs doesn't require a
> big change; you can use the "extra" parameter for that, for example:
>
> [handle HTTP request]
> LOG.debug('Receive a create_or_update request',
>           extra={'request_id': request.request_id,
>                  'account_id': account_id,
>                  'aiohttp_request': request,
>                  'payload': str(payload)})
> [create data in database]
> LOG.debug('Callflow created',
>           extra={'account_id': account_id,
>                  'request_id': request.request_id,
>                  'aiopg_cursor': cur,
>                  'results': row})
>
> Now, if you want, you can enhance the structured log with a custom logging
> Handler, because the standard journald handler doesn't know how to handle
> aiohttp_request or aiopg_cursor.
> My example is based on journald, but you can write an equivalent version
> with python-logstash:
> ####
> from systemdream.journal.handler import JournalHandler
>
> class Handler(JournalHandler):
>     # Tip: on a system without journald, use socat to test:
>     # socat UNIX-RECV:/run/systemd/journal/socket STDIN
>     def emit(self, record):
>         if record.extra:
>             # import ipdb; ipdb.set_trace()
>             if 'aiohttp_request' in record.extra:
>                 record.extra['http_method'] = record.extra['aiohttp_request'].method
>                 record.extra['http_path'] = record.extra['aiohttp_request'].path
>                 record.extra['http_headers'] = str(record.extra['aiohttp_request'].headers)
>                 del record.extra['aiohttp_request']
>             if 'aiopg_cursor' in record.extra:
>                 record.extra['pg_query'] = record.extra['aiopg_cursor'].query.decode('utf-8')
>                 record.extra['pg_status_message'] = record.extra['aiopg_cursor'].statusmessage
>                 record.extra['pg_rows_count'] = record.extra['aiopg_cursor'].rowcount
>                 del record.extra['aiopg_cursor']
>         super().emit(record)
> ####
>
> And you can enable this custom handler in your logging config file like
> this:
> [handler_journald]
> class=XXXXXXXXXX.utils.logs.Handler
> args=()
> formatter=detailed
>
> And now, with journalctl, you can easily extract logs. Some examples:
> Log messages from the 'lg' account:
>     journalctl ACCOUNT_ID=lg
> All HTTP requests that modify the 'lg' account (PUT, POST and DELETE):
>     journalctl ACCOUNT_ID=lg HTTP_METHOD=PUT HTTP_METHOD=POST HTTP_METHOD=DELETE
> Retrieve all logs from one specific HTTP request:
>     journalctl REQUEST_ID=130b8fa0-6576-43b6-a624-4a4265a2fbdd
> All HTTP requests with a specific path:
>     journalctl HTTP_PATH=/v1/accounts/lg/callflows
> All logs of the "create" function in the file "example.py":
>     journalctl CODE_FUNC=create CODE_FILE=/path/example.py
>
> If you have ever done troubleshooting on a production system, you should
> see the value of this: it's like having SQL query capabilities, but
> oriented towards logging.
> We have been using this for a short while on one of our critical daemons
> that handles a lot of requests across several servers, and it has already
> been adopted by our support team.
>
> 4. The yocto issue with the Python logging module
>
> I'm not describing this small part of my professional life for my own
> pleasure, but to help you understand the context and the usage, because my
> patch for logging is very small.
> If you're an expert in Python logging, you already know that the Handler
> class example I provided above can't run on stock Python logging,
> because LogRecord doesn't have an extra attribute.
>
> The extra parameter exists in the Logger, but in the LogRecord its contents
> are merged in as attributes of the LogRecord:
> https://github.com/python/cpython/blob/master/Lib/logging/__init__.py#L1386
>
> This means that when the LogRecord is sent to the Handler, you can't
> retrieve the dict that was passed as the logger's extra parameter.
> The only way to do that without patching Python logging is to rebuild the
> dict yourself against a list of the official LogRecord attributes, as is
> done in python-logstash:
>
> https://github.com/vklochan/python-logstash/blob/master/logstash/formatter.…
> At least to me, that's a little bit dirty.
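> A minimal sketch of that rebuild-it-yourself workaround (the helper name is
> mine, and the real python-logstash code differs in detail):
>
> import logging
>
> # Attributes present on every default LogRecord, plus the two that only
> # appear during formatting.
> RESERVED_ATTRS = set(vars(logging.makeLogRecord({}))) | {'message', 'asctime'}
>
> def extra_from(record):
>     # Anything not in a default LogRecord must have arrived via extra=...
>     return {k: v for k, v in vars(record).items() if k not in RESERVED_ATTRS}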
>
> My quick'n'dirty patch I use for now on our CPython on production:
>
> diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py
> index 104b0be..30fa6ef 100644
> --- a/Lib/logging/__init__.py
> +++ b/Lib/logging/__init__.py
> @@ -1382,6 +1382,7 @@ class Logger(Filterer):
>          """
>          rv = _logRecordFactory(name, level, fn, lno, msg, args, exc_info, func,
>                                 sinfo)
> +        rv.extra = extra
>          if extra is not None:
>              for key in extra:
>                  if (key in ["message", "asctime"]) or (key in rv.__dict__):
>
> It would probably be cleaner to add "extra" as a parameter
> of _logRecordFactory, but I have no idea of the side effects; I understand
> that the logging module is critical, because it's used everywhere.
> However, apart from python-logstash, to my knowledge the extra parameter
> isn't massively used.
> The only backward incompatibility I see with a new extra attribute on
> LogRecord is that if you have a log call like this:
>     LOG.debug('message', extra={'extra': 'example'})
> it will raise a KeyError("Attempt to overwrite 'extra' in LogRecord")
> exception, but, at least to me, the probability of this use case is close
> to zero.
>
> Rather than "maintaining" this yocto patch, even though it's very small, I
> would prefer to have a clean solution in Python directly.
>
> Thanks for your remarks.
>
> Regards.
> --
> Ludovic Gasc (GMLudo)
> http://www.gmludo.eu/
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev(a)python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>
--
--Guido van Rossum (python.org/~guido)
Hi Python Ideas folks,
(I previously posted a similar message on Python-Dev, but it's a
better fit for this list. See that thread here:
https://mail.python.org/pipermail/python-dev/2015-May/140063.html)
Enabling access to the AST for compiled code would make some cool
things possible (C# LINQ-style ORMs, for example), and not knowing too
much about this part of Python internals, I'm wondering how possible
and practical this would be.
Context: PonyORM (http://ponyorm.com/) allows you to write regular
Python generator expressions like this:
select(c for c in Customer if sum(c.orders.price) > 1000)
which compile into and run SQL like this:
SELECT "c"."id"
FROM "Customer" "c"
LEFT JOIN "Order" "order-1" ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
I think the Pythonic syntax here is beautiful. But the tricks PonyORM
has to go through to get it are ... not quite so beautiful. Because the AST
is not available, PonyORM decompiles Python bytecode into an AST first,
and then converts that to SQL. (More details on all that in the author's
EuroPython talk at http://pyvideo.org/video/2968)
PonyORM needs the AST just for generator expressions and
lambda functions, but obviously if this kind of AST access feature
were in Python it'd probably be more general.
I believe C#'s LINQ provides something similar, where if you're
developing a LINQ converter library (say LINQ to SQL), you essentially
get the AST of the code ("expression tree") and the library can do
what it wants with that.
(I know that there's the "ast" module and ast.parse(), which can give
you an AST given a *source string*, but that's not very convenient
here.)
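As a quick illustration of the gap (a sketch): you can already get an AST from
source text, but a live generator or lambda object only carries compiled
bytecode:

import ast, dis

# Works: AST from a source *string*
tree = ast.parse("select(c for c in Customer if sum(c.orders.price) > 1000)",
                 mode="eval")
print(ast.dump(tree.body))

# Doesn't help PonyORM: a generator object created at runtime exposes only
# its code object, so bytecode is all you can get back from it.
gen = (c for c in range(10) if c > 5)
dis.dis(gen.gi_code)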
What would it take to enable this kind of AST access in Python? Is it
possible? Is it a good idea?
-Ben
While trying to debug a problem and thinking that it may be an issue with
circular imports, I came up with an interesting idea that might be of
value. It wasn't a circular import problem in this case, but I might have
found the actual bug sooner if I hadn't needed to be concerned about that
possibility.
I have had some difficulty splitting larger modules into smaller modules in
the past: when I split the code by functionality, it doesn't correspond
with how the code is organized by dependency. The result is that an imported
module needs to import the module it's imported into, which just doesn't
feel right to me.
The solution I found was to call a function to explicitly set the shared
items in the imported module.
(The example is from a language I'm experimenting with written in python.
So don't be concerned about the shared object names in this case.)
In the main module...
import parse
parse.set_main(List=List,
               Keyword=Keyword,
               Name=Name,
               String=String,
               Express=Express,
               keywords=keywords,
               raise_with=raise_with,
               nil=nil)
And in parse...
# Sets shared objects from main module.
from collections import namedtuple

def set_main(**d):
    global main
    main = namedtuple(__name__, d.keys())
    for k, v in d.items():
        setattr(main, k, v)
After this, the submodule accesses the parent module's objects with...
main.Keyword
Just the same as if the parent module were imported as main, but it only
shares what is intended to be shared with this specific imported module.
I think that is better than using "from ... import" in the submodule, and
an improvement over importing the whole module, which can expose too much.
The benefits:
* The shared items are explicitly set by the parent module.
* If an item is missing, it results in a nice error message.
* Accessing objects works the same as if import was used.
* It avoids (most) circular import problems.
* It's easier to think about once you understand what it does.
The problem is that the submodule needs a function to make it work. I think
it would be nice if it could be made a builtin, but doing that may be tricky.
Where I've used "main", it could set the name of the shared parent
module(s) automatically.
The name of the function should probably be "shared" or "sharing" (or
some other thing that makes sense.)
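A rough sketch of how a builtin-style helper might work, called from the
parent module instead of defining set_main() in every submodule (the name
"shared" and all details here are purely illustrative):

import sys
from types import SimpleNamespace

def shared(submodule_name, **objects):
    """Attach the given objects to the named submodule as its 'main' namespace."""
    sys.modules[submodule_name].main = SimpleNamespace(**objects)

# In the parent module, after `import parse`:
#     shared('parse', Keyword=Keyword, Name=Name, nil=nil)
# and parse can then refer to main.Keyword as before.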
I would like to hear what others here think, and of course whether there are
any obvious improvements that can be made.
Would this be a good candidate for a new builtin?
Cheers,
Ron
First, let me start with The Curse of Knowledge
https://en.wikipedia.org/wiki/Curse_of_knowledge
which can be summarized as:
"Once you get something, it becomes hard
to think how it was to be without it".
I assume that all of you know difference between
decode() and encode(), so you're cursed and
therefore think that getting that right it is just a
matter of reading documentation, experience and
time. But quite a lot of had passed and Python 2
is still there, and Python 3, which is all unicode
at the core (and which is great for people who
finally get it) is not as popular. So, remember that
you are biased towards (or against)
decode/unicode perception.
Now imagine a person who has a text file. The
person needs to process it with Python. That
person is probably a journalist and doesn't know
anything that "any developer should know about
unicode". In Python 2 he just copy-pastes regular
expressions to match the letters and is happy. In
Python 3 he needs to *convert* that text to unicode.
Then he tries to read the documentation, and it
already starts to create conflict in his mind. It tells
him to "decode" the text. I don't know about you,
but when I'm told to decode a text, I
assume that it is encrypted, because I've watched a
few spy movies, including ones with Sherlock
Holmes and Stierlitz. But the text looks legit to me,
I can clearly see and read it, and now you say that
I need to decode it. You're basically ruining my
world right here. No wonder I will resist. I'm
probably stressed, have a lot of stuff to do, and you
are trying to load me down with abstract
concepts that conflict with what I know. No way!
Unless I have a really strong motivation (or a
scientific background) there is no chance I will get
this stuff right on this day. I will probably
repeat the exercise and after a few tries will get
the output right, but there is no chance I will
remember this thing the next day, because
rewiring neural paths in my brain is much harder
than paving them from scratch.
--
anatoly t.
Pip and venv have done a lot to improve the accessibility and ease of
installing python packages, but I believe there is still a lot of room for
improvement. I only realised how cumbersome I find working with python
packages when I recently spent a lot of time on a javascript project using
npm. A bit of googling and I found several articles discussing pip, venv
and npm, and all of them seemed to say the same thing, i.e. pip/venv could
learn a lot from npm.
My proposal revolves around two issues:
1. Setting up and working with virtual environments can be onerous.
Creating one is easy enough, but using them means remembering to run
`source activate` every time, which also means remembering which venv is
used for which project. Not a major issue, but still an annoyance.
2. Managing lists of required packages is not nearly as easy as in npm,
since there is no equivalent to `npm install --save ...`. The best that
pip offers is `pip freeze`. However, using that is (a) an extra step to
remember and (b) it includes all implied dependencies, which is not ideal.
My proposal is to use a similar model to npm, where each project has a
`venvrc` file which lets python-related tools know which environment to
use. In order to showcase the sort of functionality I'm proposing, I've
created a basic example on github (https://github.com/aquavitae/pyle).
This is currently py3.4 on linux only and very pre-alpha. Once I've added
a few more features that I have in mind (e.g. multiple venvs), I'll add it
to pypi, and if there is sufficient interest I'd be happy to write up a PEP
for getting it into the stdlib.
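For illustration, a project-local venvrc might contain something like this
(the keys are purely hypothetical and may differ from what the linked project
actually uses):

# venvrc (hypothetical contents)
venv = ~/.venvs/myproject
python = python3.4
requirements = requirements.txt

A wrapper around python/pip could read this file and activate the right
environment automatically, removing the need to remember `source activate`.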
Does this seem like the sort of tool that would be useful in the stdlib?
Regards
David
Would it be useful to have one Python source file with an OrderedDict of
(API_feat_lbl, [(start, None)]) mappings
and a lookup?
* [ ] feat/version segments/rays map
* [ ] .lookup("print[_function]")
Syntax ideas:
* has("print[_function]")
Advantages
* More pythonic to check for features than capabilities
* Forward maintainability
Disadvantages:
*
Alternatives:
* six, nine, future
* try/import ENOENT
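A minimal sketch of what such a feature/version map and has() lookup could
look like (labels, structure, and helper names here are purely illustrative):

from collections import OrderedDict
import sys

# Each feature maps to a list of (start, end) version segments;
# end=None is an open-ended ray ("available from start onwards").
FEATURES = OrderedDict([
    ("print_function", [((3, 0), None)]),
    ("yield_from",     [((3, 3), None)]),
])

def has(feature, version=None):
    """Return True if the named feature is available in the given version."""
    version = version or sys.version_info[:2]
    return any(start <= version and (end is None or version < end)
               for start, end in FEATURES.get(feature, []))

print(has("yield_from"))    # True on Python 3.3 and later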