Mailman 3 July 2019 - Python-ideas

for ... except, with ... except
by Serhiy Storchaka July 31, 2019


Python allows you to write code in a tight and readable form. Consider the following example.

    with connect() as stream:  # connect() or __enter__() can fail.
        for data in stream:  # __next__() can fail
            write(data)  # write() can fail

The problem is that different lines can raise an exception of the same type (for example OSError). We want to handle exceptions raised when opening a connection, when reading data and when writing data in different ways. Currently you need to expand the convenient Python statements "with" and "for" (see PEP 343)

    _mgr = connect()
    _enter = type(_mgr).__enter__
    _exit = type(_mgr).__exit__
    _value = _enter(_mgr)
    _exc = True
    try:
        stream = _value
        _it = iter(stream)
        while True:
            try:
                data = next(_it)
            except StopIteration:
                break
            write(data)
    except:
        _exc = False
        if not _exit(_mgr, *sys.exc_info()):
            raise
    finally:
        if _exc:
            _exit(_mgr, None, None, None)

and then add "try ... except" around the corresponding explicit calls of `__enter__()` and `next()`.

    try:
        _mgr = connect()
        _enter = type(_mgr).__enter__
        _exit = type(_mgr).__exit__
        _value = _enter(_mgr)
        _exc = True
    except OSError:
        handle_connection_error()
    else:
        try:
            stream = _value
            try:
                _it = iter(stream)
            except OSError:
                handle_read_error()
            else:
                while True:
                    try:
                        data = next(_it)
                    except StopIteration:
                        break
                    except OSError:
                        handle_read_error()
                        break
                    try:
                        write(data)
                    except OSError:
                        handle_write_error()
        except:
            _exc = False
            if not _exit(_mgr, *sys.exc_info()):
                raise
        finally:
            if _exc:
                _exit(_mgr, None, None, None)

Doesn't it look ugly?

I propose to add an "except" clause to the "for" and "with" statements to catch exceptions in the code that can't be wrapped with "try ... except".

    for VAR in EXPR:
        BLOCK
    except EXC:
        HANDLER

should be equivalent to

    try:
        _it = iter(EXPR)
    except EXC:
        HANDLER
    else:
        while True:
            try:
                VAR = next(_it)
            except StopIteration:
                break
            except EXC:
                HANDLER
                break
            BLOCK

and

    with EXPR as VAR:
        BLOCK
    except EXC:
        HANDLER

should be equivalent to

    try:
        _mgr = EXPR
        _enter = type(_mgr).__enter__
        _exit = type(_mgr).__exit__
        _value = _enter(_mgr)
        _exc = True
    except EXC:
        HANDLER
    else:
        try:
            VAR = _value
            BLOCK
        except:
            _exc = False
            if not _exit(_mgr, *sys.exc_info()):
                raise
        finally:
            if _exc:
                _exit(_mgr, None, None, None)

The same applies to the asynchronous versions "async for" and "async with".

You would then be able to add error handling like this:

    with connect() as stream:
        for data in stream:
            try:
                write(data)
            except OSError:
                handle_write_error()
        except OSError:
            handle_read_error()
    except OSError:
        handle_connection_error()
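
For comparison, the "for ... except" half of the proposal can be approximated in current Python with a generator. The sketch below is only an illustration: `iter_except` and `FlakyStream` are hypothetical names that do not appear in the proposal. It follows the expansion given above, so only exceptions raised by `iter()` and `next()` reach the handler, while exceptions raised in the loop body still propagate to the caller.

```python
def iter_except(iterable, exc_type, handler):
    """Yield items from iterable; route iteration errors to handler and stop."""
    try:
        it = iter(iterable)
    except exc_type:
        handler()
        return
    while True:
        try:
            item = next(it)
        except StopIteration:
            return
        except exc_type:
            handler()
            return
        # The loop body runs in the caller while this generator is suspended
        # here, so exceptions raised by the body are not caught above.
        yield item


class FlakyStream:
    """Hypothetical stream whose __next__ raises OSError after a few items."""

    def __init__(self, items, fail_after):
        self._items = list(items)
        self._fail_after = fail_after
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos == self._fail_after:
            raise OSError("read failed")
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item


events = []
received = []
for data in iter_except(FlakyStream([1, 2, 3], fail_after=2), OSError,
                        lambda: events.append("read error")):
    received.append(data)
```

The "with ... except" half has no equally direct counterpart, because a context manager cannot skip the body of its "with" block.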

16 31

Entrypoint function for modules (AKA if __name__ == '__main__' ) with built-in argument parsing
by agustinscaramuzza＠gmail.com July 31, 2019


Maybe the def __main__() argument is already a dead horse, given the number of discussions it has created that have ended nowhere, but I think one argument in favour of its implementation would be including argument parsing in it, for example:

    # main.py
    def __run__(first_num, second_num, print_operation=False):
        """
        Adds two numbers.

        positional arguments:
        - first_num: the first number.
        - second_num: the second number.

        optional arguments:
        - …
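
The message is truncated here, so the exact mechanics of `__run__` are not shown. As a rough sketch of the general idea only, command-line arguments can be derived from a function signature with `inspect` and `argparse`. Everything below is an assumption made for illustration: the helper names `build_parser` and `run`, the use of type annotations for conversion, and the mapping of boolean defaults to switches.

```python
import argparse
import inspect


def build_parser(func):
    """Derive an argparse parser from a function signature (hypothetical helper)."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(func))
    for name, param in inspect.signature(func).parameters.items():
        flag = "--" + name.replace("_", "-")
        if param.default is inspect.Parameter.empty:
            # No default: required positional, converted with the annotation if any.
            has_type = param.annotation is not inspect.Parameter.empty
            parser.add_argument(name, type=param.annotation if has_type else str)
        elif isinstance(param.default, bool):
            # Boolean default: a switch that flips the default value.
            action = "store_false" if param.default else "store_true"
            parser.add_argument(flag, dest=name, action=action)
        else:
            # Any other default: optional value converted to the default's type.
            kind = str if param.default is None else type(param.default)
            parser.add_argument(flag, dest=name, type=kind, default=param.default)
    return parser


def run(func, argv=None):
    """Parse argv (sys.argv[1:] when None) and call func with the parsed values."""
    args = build_parser(func).parse_args(argv)
    return func(**vars(args))


def add(first_num: int, second_num: int, print_operation: bool = False):
    """Adds two numbers."""
    result = first_num + second_num
    if print_operation:
        print(f"{first_num} + {second_num} = {result}")
    return result
```

With this sketch, `run(add, ["1", "2", "--print-operation"])` behaves like invoking a script as `main.py 1 2 --print-operation`.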

4 5

asyncio futures and tasks with synchronous callbacks
by Aurélien Lambert July 31, 2019


In asyncio, when a task awaits another task (or future), it can be cancelled right after the awaited task has finished. Thus, if the awaited task has consumed data, the data is lost. For instance, with the following code:

    import asyncio

    available_data = []
    data_ready = asyncio.Future()

    def feed_data(data):
        global data_ready
        available_data.append(data)
        data_ready.set_result(None)
        data_ready = asyncio.Future()

    async def consume_data():
        while not available_data:
            await asyncio.shield(data_ready)
        return available_data.pop()

    async def wrapped_consumer():
        task = asyncio.ensure_future(consume_data())
        return await task

If I perform those exact steps:

    async def test():
        task = asyncio.ensure_future(wrapped_consumer())
        await asyncio.sleep(0)
        feed_data('data')
        await asyncio.sleep(0)
        task.cancel()
        await asyncio.sleep(0)
        print('task', task)
        print('available_data', available_data)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(test())

Then I can see that the task has been cancelled despite the data being consumed. Since the result of `wrapped_consumer` cannot be retrieved, the data is forever lost.

    task <Task cancelled coro=<wrapped_consumer() done, defined at <ipython-input-1-de4ad193b1d0>:17>>
    available_data []

This side effect does not happen when awaiting a coroutine, but coroutines are not as flexible as tasks (unless manipulated as a generator). It happens when awaiting a `Future`, a `Task`, or any function like `asyncio.wait`, `asyncio.wait_for` or `asyncio.gather` (which all inherit from or use `Future`).
There is then no way to do anything equivalent to:

    stop_future = asyncio.Future()

    async def wrapped_consumer2():
        task = asyncio.ensure_future(consume_data())
        try:
            await asyncio.wait([task, stop_future])
        finally:
            task.cancel()
            if not task.cancelled():
                return task.result()
            else:
                raise RuntimeError('stopped')

This is due to the Future calling the callback asynchronously: https://github.com/python/cpython/blob/3.6/Lib/asyncio/futures.py#L214

    for callback in callbacks:
        self._loop.call_soon(callback, self)

I propose to create synchronous versions of those, or a `synchronous_callback` parameter, that turns the callbacks of `Future` synchronous. I've experimented with a simple library, `syncio`, on CPython 3.6 to do this (it is harder to patch later versions due to the massive use of private methods). Basically, it needs to:

1) replace the `Future._schedule_callbacks` method by a synchronous version
2) fix `Task._step` to not fail when cleaning `_current_tasks` (https://github.com/python/cpython/blob/3.6/Lib/asyncio/tasks.py#L245)
3) rewrite all the functions to use synchronous futures instead of normal ones

With that library, the previous functions are possible and intuitive:

    import syncio

    async def wrapped_consumer():
        task = syncio.ensure_sync_future(consume_data())
        return await task

    stop_future = asyncio.Future()

    async def wrapped_consumer2():
        task = syncio.ensure_sync_future(consume_data())
        try:
            await syncio.sync_wait([task, stop_future])
        finally:
            task.cancel()
            if not task.cancelled():
                return task.result()
            else:
                raise RuntimeError('stopped')

There is no need to use `syncio` anywhere else in the code, which makes it totally transparent for the end user. `wrapped_consumer` and `wrapped_consumer2` are now cancelled if and only if the data hasn't been consumed, whatever the order of the steps (and the presence of `asyncio.sleep`).
This "library" can be found here: https://github.com/aure-olli/aiokafka/blob/3acb88d6ece4502a78e230b234f47b90… It implements `SyncFuture`, `SyncTask`, `ensure_sync_future`, `sync_wait`, `sync_wait_for`, `sync_gather` and `sync_shield`. It works with CPython 3.6.

To conclude:

- asynchronous callbacks are preferable in most cases, but do not provide a coherent cancelled status in specific cases
- implementing a version with synchronous callbacks (or a `synchronous_callback` parameter) is rather easy (however, step 2 needs to be clarified; there is probably a cleaner way to fix this)
- it is totally transparent for the end user, as synchronous callbacks are totally compatible with asynchronous ones
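
The asynchronous callback scheduling described in this message can be observed with a minimal sketch like the one below. It is a standalone illustration, not part of the `syncio` library, and assumes Python 3.7 or later for `asyncio.run`. A done-callback added to a future does not run inside `set_result()`; it runs on a later pass of the event loop, which is the window in which a waiting task can still be cancelled.

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    order = []

    fut.add_done_callback(lambda f: order.append("callback"))
    fut.set_result(None)              # only schedules the callback via call_soon
    order.append("after set_result")  # runs before the callback does

    await asyncio.sleep(0)            # yield to the loop so the callback can run
    order.append("after sleep(0)")
    return order


order = asyncio.run(main())
print(order)
```

The callback is recorded after "after set_result" even though the future was already done at that point.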

1 0

Cartesian Product on `__mul__`
by Batuhan Taskaya July 30, 2019


I think it looks very fine when you type {1, 2, 3} * {"a", "b", "c"} and get set(itertools.product({1, 2, 3}, {"a", "b", "c"})). So I am proposing to implement set multiplication as the Cartesian product. >>>
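
The proposed behaviour can be tried out today with a small `set` subclass. This is only a sketch of the idea: `ProductSet` is a hypothetical name, and the built-in `set` type does not support `*`.

```python
from itertools import product


class ProductSet(set):
    """A set whose * operator returns the Cartesian product as a set of tuples."""

    def __mul__(self, other):
        if not isinstance(other, (set, frozenset)):
            return NotImplemented
        return ProductSet(product(self, other))


pairs = ProductSet({1, 2, 3}) * {"a", "b", "c"}
```

Here `pairs` holds the same elements as `set(itertools.product({1, 2, 3}, {"a", "b", "c"}))`, that is, nine `(number, letter)` tuples.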

7 11

Fwd: for ... except, with ... except
by Bruce Leban July 29, 2019


And another message that was rejected (I sent it from an unregistered email address).

On Sat, Jul 27, 2019 at 1:49 AM Serhiy Storchaka <storchaka(a)gmail.com> wrote:

> 26.07.19 21:52, Bruce Leban wrote:
>
> > To put this in a simpler way: the proposal is to add an except clause that applies ONLY to the direct operation of the with or for statement and not to the block. That's an interesting idea.
> >
> > The one thing I find confusing about your proposal is that the proposed syntax does not imply the behavior. In a try statement, the except appears at the end and after all possible statements that it could cover. The proposal mimics that syntax but with different semantics. Something like this would be much more clear what is going on:
> >
> >     for VARIABLE in EXPRESSION:
> >         except EXCEPTION:
> >             BLOCK
> >         BLOCK
> >
> >     with EXPRESSION as VARIABLE:
> >         except EXCEPTION:
> >             BLOCK
> >         BLOCK
> >
> >     while EXPRESSION:
> >         except EXCEPTION:
> >             BLOCK
> >         BLOCK
>
> Besides a layout that is unusual for Python (a clause has different indentation than the initial clause of the statement to which it belongs) there is another problem. The exception block is not part of the "for" or "with" block. After handling an exception in the "for" clause you do not continue to execute the "for" block, but leave the loop. After handling an exception in the "with" clause you do not continue to execute the "with" block and do not call `__exit__` when leaving it. To me, this syntax is much more confusing than my initial proposition.

And I find it less confusing. And neither of those is the standard to use. The goal is for syntax to imply semantics (which my proposal does and I do not think yours does, given several people commenting that they thought it applied to the entire loop) and to choose syntax that is more clear to more people (which requires more than two people's opinions).

Consider how you would write this if everything was an expression in Python and we had braces:

    for VAR in ( EXPR except EXCEPTION: { BLOCK; break; } ):
        BLOCK

I do agree that it is not obvious that the exception block breaks out of the loop. I think in actual code it will be fairly obvious what's happening, as continuing into the loop when the loop expression throws an exception doesn't make sense. I'm open to alternatives.

On the other hand, an except clause at the bottom of the loop that does not apply to the loop body is going to catch me every time I see it.

--- Bruce

1 0

Fwd: for ... except, with ... except
by Bruce Leban July 29, 2019


I sent this message earlier but it was rejected by the mailer.

On Fri, Jul 26, 2019 at 11:27 AM Serhiy Storchaka <storchaka(a)gmail.com> wrote:

> So you will be able to add errors handling like in:
>
>     with connect() as stream:
>         for data in stream:
>             try:
>                 write(data)
>             except OSError:
>                 handle_write_error()
>         except OSError:
>             handle_read_error()
> …

1 0

support toml for pyproject support
by Jimmy Girardet July 29, 2019


Hi, I don't know if this was already debated, but I don't know how to search the whole archive of the list. For now, adoption of the pyproject.toml file is more difficult because toml is not in the standard library. Each tool that wants to use pyproject.toml has to add a toml library as a conditional or hard dependency. Since toml is now the standard configuration file format, it's strange that Python does not support it in the stdlib, like it would have been strange to …

13 15

Fwd: Re: Universal parsing library in the stdlib to alleviate security issues
by Nam Nguyen July 29, 2019


Forward to the list because Abusix had blocked google.com initially.

Nam

---------- Forwarded message ---------
From: Nam Nguyen <bitsink(a)gmail.com>
Date: Sun, Jul 28, 2019 at 10:18 AM
Subject: Re: [Python-ideas] Re: Universal parsing library in the stdlib to alleviate security issues
To: Sebastian Kreft <skreft(a)gmail.com>
Cc: Paul Moore <p.f.moore(a)gmail.com>, python-ideas <python-ideas(a)python.org>

Let's circle back to the beginning one last time ;).

On Thu, Jul 25, 2019 at 8:15 AM Sebastian Kreft <skreft(a)gmail.com> wrote:

> Nam, I think it'd be better to frame the proposal as a security enhancement. Stating some of the common bugs/gotchas found when manually implementing parsers, and the impact this has had on python over the years. Seeing a full list of security issues (CVEs) by module would give us a sense of how widespread the problem is.

Since my final exam was done this weekend, I gathered some more info into this spreadsheet.

https://docs.google.com/spreadsheets/d/1TlWSf8iM7eIzEPXanJAP8Ztyzt4ZD28xFvU…

I think a strict parser can help with the majority of those problems. They are in HTTP headers, emails, cookies, URLs, and even low level socket code (inet_atoi).

> Then survey the stdlib for what kind of grammars are currently being parsed, what ad-hoc parsing strategy are implemented and provide examples of whether having a general purpose parser would have prevented the security issues you have previously cited.

Most grammars I have seen here come straight from RFCs, which are in ABNF and thus context-free. Current implementations are based on regexes or string splitting. My previous example showed that at least 30500, 36216, 36742 were non-issues if we started out with a strict parser.

> Right now, it is not clear what the impact of such refactor would be, nor the worth of such attempt.

Exactly the kind of response I'm looking for.
It is okay to suggest that the benefits aren't clear or that there are requirements X and Y that a general parser won't be able to meet, but it's not convincing to brush this aside because there is "existing, working code." Many of the bugs in that sheet are still open. It's not comfortable to say the code is working with a straight face, as I have experienced with my own fix for 30500. I just couldn't tell if it was doing the right thing.

> What others have said earlier is that you are the one that needs to provide some of the requirements for the proposed private parsing library. And from what I read from your emails you do have some ideas. For example, you want it to be easy to write and review (I guess here you would eventually like it to be a close translation from whatever is specified in the RFC or grammar specification).

Yes, that's the most important point because "readability counts." It's hard to reason about correctness when there are many transformations between the authoritative spec and the implementation. I definitely don't want to touch the regexes, string splits, and custom logic when I don't understand why they are that way in the first place. How do I, for example, know what this regex is about

    ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?

(It's from RFC 3986.)

> But you also need to take into consideration some of the list's concerns, the parser library has to be performant, as a performance regression is likely not to be tolerable.

Absolutely. That's where I need inputs from the list. I have provided my own set of requirements for such a parser library. I'm sure most of us have different needs too. So if a parser library can help you, let's hear what you want from it. If you think it can't, please let me understand why.

Thanks,
Nam

> On Thu, Jul 25, 2019 at 10:54 AM Nam Nguyen <bitsink(a)gmail.com> wrote:
>
>> On Thu, Jul 25, 2019 at 2:32 AM Paul Moore <p.f.moore(a)gmail.com> wrote:
>>
>>> On Thu, 25 Jul 2019 at 02:16, Nam Nguyen <bitsink(a)gmail.com> wrote:
>>> > Back to my original requests to the list: 1) Whether we want to have a (possibly private) parsing library in the stdlib
>>>
>>> In the abstract, no. Propose a specific library, and that answer would change to "maybe".
>>
>> I have no specific library to propose. I'm looking for a list of features such a library should have.
>>
>>> > and 2) What features it should have.
>>>
>>> That question only makes sense if you get agreement to the abstract proposal that "we should add a parsing library". And as I said, I don't agree to that so I can't answer the second question.
>>
>> As Chris summarized it correctly, I am advocating for a general solution to individual problems (which have the same nature). We can certainly solve the problems when they are reported, or we can take a proactive approach to make them less likely to occur. I am talking about a class of input validation issues here and I thought parsing would be a very natural solution to that. This is quite similar to a context-sensitive templating library that prevents cross-site-scripting on the output side. So I don't know why (or what it takes) to convince people that it's a good thing(tm).
>>
>>> Generally, things go into the stdlib when they have been developed externally and proved their value. The bar for designing a whole library from scratch, "specifically" targeted at stdlib inclusion, is very high, and you're nowhere near reaching it IMO.
>>
>> This is a misunderstanding. I have not proposed any from-scratch, or existing library to be used. And on this note, please allow me to make it clear one more time that I am not asking for a publicly-facing library either.
>> >> >>> >>> > These are good points to set as targets! What does it take for me to >>> get the list to agree on one such set of criteria? >>> >>> You need to start by getting agreement on the premise that adding a >>> newly-written parser to the stdlib is a good idea. And so far your >>> *only* argument seems to be that "it will avoid a class of security >>> bugs" which I find extremely unconvincing (and I get the impression >>> others do, too). >> >> >> Why? What is unconvincing about a parsing library being able... parse >> (and therefore, validate) inputs? >> >> >>> But even if "using a real parser" was useful in that >>> context, there's *still* no argument for writing one from scratch, >>> rather than using an existing, proven library. >> >> >> Never a goal. >> >> >>> At the most basic >>> level, what if there's a bug in your new parsing library? If we're >>> using it in security-critical code, such a bug would be a >>> vulnerability just like the ones you're suggesting your parser would >>> avoid. Are you asking us to believe that your code will be robust >>> enough to trust over code that's been used in production systems for >>> years? >>> >>> I think you need to stop getting distracted by details, and focus on >>> your stated initial request "Whether we want to have a (possibly >>> private) parsing library in the stdlib". You don't seem to me to have >>> persuaded anyone of this basic suggestion yet, >> >> >> Good observation. How do I convince you that complex input validation >> tasks should be left to a parser? >> >> Thanks! 
>> Nam
>
> --
> Sebastian Kreft
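
The regex quoted from RFC 3986 in the message above comes from Appendix B of that RFC, which also states what each capture group means. The sketch below spells that mapping out; the helper name `split_uri` is made up for illustration, and the sample URI is the one used in the RFC.

```python
import re

# Regex from RFC 3986, Appendix B. Groups 2, 4, 5, 7 and 9 hold the
# scheme, authority, path, query and fragment respectively.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into its five components (None when absent)."""
    match = URI_RE.match(uri)  # every part is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


parts = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
```

Note that this regex only splits a reference into components; it does not validate them against the grammar, which is the kind of gap the thread is discussing.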

2 1

Utilities for easier debugging
by James Lu July 29, 2019


Minimal strawman proposal. New keyword debug.

    debug EXPRESSION

Executes EXPRESSION when in debug mode.

    debug context

Prints all the variables of the enclosing closure and all the variable names accessed within that block. For example, if in foo you access the global variable spam, spam would be printed. The format would be:

    variableName: value
    variableTwo: value

where "value" is the repr() of the variable. Separated by new lines. The exact output format would not be part of the spec. ?…
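
Part of what "debug context" describes can be approximated with a plain function, without a new keyword. The sketch below is an assumption-laden illustration: `format_context` is a hypothetical name, it relies on CPython frame introspection through `inspect.currentframe()`, and it covers only the caller's local variables, not the globals accessed in the block.

```python
import inspect


def format_context():
    """Return the caller's local variables as 'name: repr(value)' lines."""
    # Copy the caller's locals so the result does not depend on the live frame.
    caller_locals = dict(inspect.currentframe().f_back.f_locals)
    # Sorting by name keeps the output order predictable.
    return "\n".join(
        f"{name}: {value!r}" for name, value in sorted(caller_locals.items())
    )


def foo():
    spam = 42
    eggs = "ham"
    return format_context()


report = foo()
print(report)
```

Restricting the output to debug mode could be done by checking `__debug__` before calling the helper.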

5 5

Fwd: Utilities for easier debugging
by James Lu July 28, 2019


Sent from my iPhone

Begin forwarded message:

> From: James Lu <jamtlu(a)gmail.com>
> Date: July 28, 2019 at 6:22:11 PM EDT
> To: Andrew Barnert <abarnert(a)yahoo.com>
> Subject: Re: [Python-ideas] Utilities for easier debugging
>
>> On Jul 28, 2019, at 4:26 PM, Andrew Barnert <abarnert(a)yahoo.com> wrote:
>>
>> This would break iPython’s improved interactive console, which already uses this syntax to provide a similar feature.
>
> If …

1 0