From rndblnch at gmail.com  Wed Jun  1 15:31:47 2016
From: rndblnch at gmail.com (rndblnch)
Date: Wed, 1 Jun 2016 19:31:47 +0000 (UTC)
Subject: [Python-Dev] Adding NewType() to PEP 484
References: <CAOMjWkk=gyj+Zp-F1PygoqqA9ZOoc=1cWw-G5J9scGFFEND4Jw@mail.gmail.com>
 <CAP1=2W7jNH6kS9Of5=VSov_314HoZkjZhbtMtzv-Y4aiM3jaNA@mail.gmail.com>
 <CAP7+vJLxiGg-=TdumxLCyrZbUzp2fdEL8TAhVCx9it0H8PZHMg@mail.gmail.com>
 <CAP7+vJJBsY34wx+peUdxLDUmLVMpE-+ws=uSZbYvdUzAaCp=NA@mail.gmail.com>
 <loom.20160531T214828-498@post.gmane.org>
 <38cb015b-2d32-b9a7-d5b7-eef312eb4fa7@g.nevcal.com>
 <CADiSq7fnQEZVxkKJ5T-=rU7CPtsE4c7SuyDH+1SnW8KZg-FnOA@mail.gmail.com>
Message-ID: <loom.20160601T212611-127@post.gmane.org>

Nick Coghlan <ncoghlan <at> gmail.com> writes:
> On 31 May 2016 3:12 pm, "Glenn Linderman" <v+python <at> g.nevcal.com> wrote:
> > On 5/31/2016 12:55 PM, rndblnch wrote:
> >> Guido van Rossum <gvanrossum <at> gmail.com> writes:
> >>
> >>>
> >>> Also -- the most important thing.  What to call these things?

[...]

> > Interesting! Prior art. And parallel type isn't a bad name...
> If I heard "parallel type", I'd assume it had something to do with
> parallel processing.

sure, it was 15 years ago, parallel processing was not so widespread.

but looking at synonyms for parallel, i stumbled upon: counterpart, analog,
mirror, etc.
and then from here: countertype ...

my 2 cents.
renaud

[...]
> Cheers,
> Nick.

From guido at python.org  Wed Jun  1 16:59:57 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Jun 2016 13:59:57 -0700
Subject: [Python-Dev] Adding NewType() to PEP 484
In-Reply-To: <loom.20160601T212611-127@post.gmane.org>
References: <CAOMjWkk=gyj+Zp-F1PygoqqA9ZOoc=1cWw-G5J9scGFFEND4Jw@mail.gmail.com>
 <CAP1=2W7jNH6kS9Of5=VSov_314HoZkjZhbtMtzv-Y4aiM3jaNA@mail.gmail.com>
 <CAP7+vJLxiGg-=TdumxLCyrZbUzp2fdEL8TAhVCx9it0H8PZHMg@mail.gmail.com>
 <CAP7+vJJBsY34wx+peUdxLDUmLVMpE-+ws=uSZbYvdUzAaCp=NA@mail.gmail.com>
 <loom.20160531T214828-498@post.gmane.org>
 <38cb015b-2d32-b9a7-d5b7-eef312eb4fa7@g.nevcal.com>
 <CADiSq7fnQEZVxkKJ5T-=rU7CPtsE4c7SuyDH+1SnW8KZg-FnOA@mail.gmail.com>
 <loom.20160601T212611-127@post.gmane.org>
Message-ID: <CAP7+vJJhoSNXyw9mYBGMXf9EUERDgnuxBTndHyLrmQtMgP7WMQ@mail.gmail.com>

Unless Jukka objects I am going with "distinct type" when discussing
the feature but NewType() in code.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Jun  1 20:44:40 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Jun 2016 17:44:40 -0700
Subject: [Python-Dev] Adding NewType() to PEP 484
In-Reply-To: <CAFNMPM9bzod8C3M2cmaxzwxrJtsiV79_fmo4_8quzheQKWXomQ@mail.gmail.com>
References: <CAOMjWkk=gyj+Zp-F1PygoqqA9ZOoc=1cWw-G5J9scGFFEND4Jw@mail.gmail.com>
 <CAP1=2W7jNH6kS9Of5=VSov_314HoZkjZhbtMtzv-Y4aiM3jaNA@mail.gmail.com>
 <CAP7+vJLxiGg-=TdumxLCyrZbUzp2fdEL8TAhVCx9it0H8PZHMg@mail.gmail.com>
 <CAP7+vJJBsY34wx+peUdxLDUmLVMpE-+ws=uSZbYvdUzAaCp=NA@mail.gmail.com>
 <loom.20160531T214828-498@post.gmane.org>
 <38cb015b-2d32-b9a7-d5b7-eef312eb4fa7@g.nevcal.com>
 <CADiSq7fnQEZVxkKJ5T-=rU7CPtsE4c7SuyDH+1SnW8KZg-FnOA@mail.gmail.com>
 <loom.20160601T212611-127@post.gmane.org>
 <CAP7+vJJhoSNXyw9mYBGMXf9EUERDgnuxBTndHyLrmQtMgP7WMQ@mail.gmail.com>
 <CAFNMPM9bzod8C3M2cmaxzwxrJtsiV79_fmo4_8quzheQKWXomQ@mail.gmail.com>
Message-ID: <CAP7+vJKdEKr-CmMNDYPiUU7=boxhP=RPA=2ofhT+ciG5C3WGcg@mail.gmail.com>

Everyone on the mypy team has a different opinion so the search is on. :-(

On Wed, Jun 1, 2016 at 5:37 PM, Hai Nguyen <nhai.qn at gmail.com> wrote:
> I am +1 for DistinctType (vs others) (no specific reason, just read out
> loud).
>
> Hai
>
> On Wednesday, June 1, 2016, Guido van Rossum <guido at python.org> wrote:
>>
>> Unless Jukka objects I am going with "distinct type" when discussing
>> the feature but NewType() in code.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/nhai.qn%40gmail.com

-- 
--Guido van Rossum (python.org/~guido)

From nhai.qn at gmail.com  Wed Jun  1 20:37:20 2016
From: nhai.qn at gmail.com (Hai Nguyen)
Date: Wed, 1 Jun 2016 20:37:20 -0400
Subject: [Python-Dev] Adding NewType() to PEP 484
In-Reply-To: <CAP7+vJJhoSNXyw9mYBGMXf9EUERDgnuxBTndHyLrmQtMgP7WMQ@mail.gmail.com>
References: <CAOMjWkk=gyj+Zp-F1PygoqqA9ZOoc=1cWw-G5J9scGFFEND4Jw@mail.gmail.com>
 <CAP1=2W7jNH6kS9Of5=VSov_314HoZkjZhbtMtzv-Y4aiM3jaNA@mail.gmail.com>
 <CAP7+vJLxiGg-=TdumxLCyrZbUzp2fdEL8TAhVCx9it0H8PZHMg@mail.gmail.com>
 <CAP7+vJJBsY34wx+peUdxLDUmLVMpE-+ws=uSZbYvdUzAaCp=NA@mail.gmail.com>
 <loom.20160531T214828-498@post.gmane.org>
 <38cb015b-2d32-b9a7-d5b7-eef312eb4fa7@g.nevcal.com>
 <CADiSq7fnQEZVxkKJ5T-=rU7CPtsE4c7SuyDH+1SnW8KZg-FnOA@mail.gmail.com>
 <loom.20160601T212611-127@post.gmane.org>
 <CAP7+vJJhoSNXyw9mYBGMXf9EUERDgnuxBTndHyLrmQtMgP7WMQ@mail.gmail.com>
Message-ID: <CAFNMPM9bzod8C3M2cmaxzwxrJtsiV79_fmo4_8quzheQKWXomQ@mail.gmail.com>

I am +1 for DistinctType (vs others) (no specific reason, just read out
loud).

Hai

On Wednesday, June 1, 2016, Guido van Rossum <guido at python.org> wrote:

> Unless Jukka objects I am going with "distinct type" when discussing
> the feature but NewType() in code.
>
> --
> --Guido van Rossum (python.org/~guido)

From mafagafogigante at gmail.com  Wed Jun  1 20:50:07 2016
From: mafagafogigante at gmail.com (Bernardo Sulzbach)
Date: Wed, 1 Jun 2016 21:50:07 -0300
Subject: [Python-Dev] Adding NewType() to PEP 484
In-Reply-To: <CAP7+vJKdEKr-CmMNDYPiUU7=boxhP=RPA=2ofhT+ciG5C3WGcg@mail.gmail.com>
References: <CAOMjWkk=gyj+Zp-F1PygoqqA9ZOoc=1cWw-G5J9scGFFEND4Jw@mail.gmail.com>
 <CAP1=2W7jNH6kS9Of5=VSov_314HoZkjZhbtMtzv-Y4aiM3jaNA@mail.gmail.com>
 <CAP7+vJLxiGg-=TdumxLCyrZbUzp2fdEL8TAhVCx9it0H8PZHMg@mail.gmail.com>
 <CAP7+vJJBsY34wx+peUdxLDUmLVMpE-+ws=uSZbYvdUzAaCp=NA@mail.gmail.com>
 <loom.20160531T214828-498@post.gmane.org>
 <38cb015b-2d32-b9a7-d5b7-eef312eb4fa7@g.nevcal.com>
 <CADiSq7fnQEZVxkKJ5T-=rU7CPtsE4c7SuyDH+1SnW8KZg-FnOA@mail.gmail.com>
 <loom.20160601T212611-127@post.gmane.org>
 <CAP7+vJJhoSNXyw9mYBGMXf9EUERDgnuxBTndHyLrmQtMgP7WMQ@mail.gmail.com>
 <CAFNMPM9bzod8C3M2cmaxzwxrJtsiV79_fmo4_8quzheQKWXomQ@mail.gmail.com>
 <CAP7+vJKdEKr-CmMNDYPiUU7=boxhP=RPA=2ofhT+ciG5C3WGcg@mail.gmail.com>
Message-ID: <472a3418-6030-00a9-73b2-a0c9b0afee6f@gmail.com>

On 06/01/2016 09:44 PM, Guido van Rossum wrote:
> Everyone on the mypy team has a different opinion so the search is on. :-(
>
> On Wed, Jun 1, 2016 at 5:37 PM, Hai Nguyen <nhai.qn at gmail.com> wrote:
>> I am +1 for DistinctType (vs others) (no specific reason, just read out
>> loud).
>>

At least on this thread it seems like (I haven't counted) that distinct 
type [alias] is the preferred option.

From guido at python.org  Wed Jun  1 21:04:00 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Jun 2016 18:04:00 -0700
Subject: [Python-Dev] Adding NewType() to PEP 484
In-Reply-To: <472a3418-6030-00a9-73b2-a0c9b0afee6f@gmail.com>
References: <CAOMjWkk=gyj+Zp-F1PygoqqA9ZOoc=1cWw-G5J9scGFFEND4Jw@mail.gmail.com>
 <CAP1=2W7jNH6kS9Of5=VSov_314HoZkjZhbtMtzv-Y4aiM3jaNA@mail.gmail.com>
 <CAP7+vJLxiGg-=TdumxLCyrZbUzp2fdEL8TAhVCx9it0H8PZHMg@mail.gmail.com>
 <CAP7+vJJBsY34wx+peUdxLDUmLVMpE-+ws=uSZbYvdUzAaCp=NA@mail.gmail.com>
 <loom.20160531T214828-498@post.gmane.org>
 <38cb015b-2d32-b9a7-d5b7-eef312eb4fa7@g.nevcal.com>
 <CADiSq7fnQEZVxkKJ5T-=rU7CPtsE4c7SuyDH+1SnW8KZg-FnOA@mail.gmail.com>
 <loom.20160601T212611-127@post.gmane.org>
 <CAP7+vJJhoSNXyw9mYBGMXf9EUERDgnuxBTndHyLrmQtMgP7WMQ@mail.gmail.com>
 <CAFNMPM9bzod8C3M2cmaxzwxrJtsiV79_fmo4_8quzheQKWXomQ@mail.gmail.com>
 <CAP7+vJKdEKr-CmMNDYPiUU7=boxhP=RPA=2ofhT+ciG5C3WGcg@mail.gmail.com>
 <472a3418-6030-00a9-73b2-a0c9b0afee6f@gmail.com>
Message-ID: <CAP7+vJJKcEXFsQpkfkrnJs5A_anKwEp89DVbGhugRQ+oroQz5w@mail.gmail.com>

I've merged this into PEP 484 now. The informal term used there is
actually "unique type" which is fine. End of discussion please.

On Wed, Jun 1, 2016 at 5:50 PM, Bernardo Sulzbach
<mafagafogigante at gmail.com> wrote:
> On 06/01/2016 09:44 PM, Guido van Rossum wrote:
>>
>> Everyone on the mypy team has a different opinion so the search is on. :-(
>>
>> On Wed, Jun 1, 2016 at 5:37 PM, Hai Nguyen <nhai.qn at gmail.com> wrote:
>>>
>>> I am +1 for DistinctType (vs others) (no specific reason, just read out
>>> loud).
>>>
>
> At least on this thread it seems like (I haven't counted) that distinct type
> [alias] is the preferred option.

-- 
--Guido van Rossum (python.org/~guido)

From jake at lwn.net  Thu Jun  2 20:39:59 2016
From: jake at lwn.net (Jake Edge)
Date: Thu, 2 Jun 2016 18:39:59 -0600
Subject: [Python-Dev] Start of the Python Language Summit coverage at LWN
Message-ID: <20160602183959.4728902e@chukar.edge2.net>

Howdy python-dev,

I was able to sit in on the Python Language Summit again this year
(thanks Larry and Barry!) and have some of the coverage available for
your viewing pleasure now.

The starting point is here: https://lwn.net/Articles/688969/
(or here for non-subscribers:
https://lwn.net/SubscriberLink/688969/91cbeeaf32807914/ )

So far, I have written up the first three sessions.  The rest will be
coming in over the next week or so and be added to the page above (and
will also appear in next week's weekly edition).

The future of the ssl module: https://lwn.net/Articles/688974/
https://lwn.net/SubscriberLink/688974/31cfa9f818c834e1/

Twisted and Python 3: https://lwn.net/Articles/689068/
https://lwn.net/SubscriberLink/689068/34b68a2aea6ddd2d/

Gilectomy: https://lwn.net/Articles/689548/
https://lwn.net/SubscriberLink/689548/4328423f85a47679/

The articles will be freely available (without using the
SubscriberLink) to the world at large in a week (and the next batch the
week after that) ... until then, feel free to share the SubscriberLinks

Hopefully I have captured things reasonably well.  If there are
corrections or clarifications needed, though, I recommend posting them
as comments on the article.

enjoy!

jake

-- 
Jake Edge - LWN - jake at lwn.net - http://lwn.net

From tjreedy at udel.edu  Thu Jun  2 23:26:43 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 2 Jun 2016 23:26:43 -0400
Subject: [Python-Dev] Start of the Python Language Summit coverage at LWN
In-Reply-To: <20160602183959.4728902e@chukar.edge2.net>
References: <20160602183959.4728902e@chukar.edge2.net>
Message-ID: <niqtdk$hd0$1@ger.gmane.org>

On 6/2/2016 8:39 PM, Jake Edge wrote:
>
> Howdy python-dev,
>
> I was able to sit in on the Python Language Summit again this year
> (thanks Larry and Barry!) and have some of the coverage available for
> your viewing pleasure now.
>
> The starting point is here: https://lwn.net/Articles/688969/
> (or here for non-subscribers:
> https://lwn.net/SubscriberLink/688969/91cbeeaf32807914/ )
>
> So far, I have written up the first three sessions.  The rest will be
> coming in over the next week or so and be added to the page above

Thank you.  Please continue posting the individual SubscriberLinks here 
as the page above does not have them.

> will also appear in next week's weekly edition).
>
> The future of the ssl module: https://lwn.net/Articles/688974/
> https://lwn.net/SubscriberLink/688974/31cfa9f818c834e1/
>
> Twisted and Python 3: https://lwn.net/Articles/689068/
> https://lwn.net/SubscriberLink/689068/34b68a2aea6ddd2d/
>
> Gilectomy: https://lwn.net/Articles/689548/
> https://lwn.net/SubscriberLink/689548/4328423f85a47679/
>
> The articles will be freely available (without using the
> SubscriberLink) to the world at large in a week (and the next batch the
> week after that) ... until then, feel free to share the SubscriberLinks
>
> Hopefully I have captured things reasonably well.  If there are
> corrections or clarifications needed, though, I recommend posting them
> as comments on the article.

-- 
Terry Jan Reedy

From status at bugs.python.org  Fri Jun  3 12:08:43 2016
From: status at bugs.python.org (Python tracker)
Date: Fri,  3 Jun 2016 18:08:43 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20160603160843.1677A5688D@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2016-05-27 - 2016-06-03)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    5537 ( +8)
  closed 33416 (+52)
  total  38953 (+60)

Open issues with patches: 2416 

Issues opened (41)
==================

#22331: test_io.test_interrupted_write_text() hangs on the buildbot Fr
http://bugs.python.org/issue22331  reopened by martin.panter

#27137: Python implementation of `functools.partial` is not a class
http://bugs.python.org/issue27137  opened by ebarry

#27139: Increased test coverage for statistics.median_grouped
http://bugs.python.org/issue27139  opened by juliojr77

#27140: Opcode for creating dict with constant keys
http://bugs.python.org/issue27140  opened by serhiy.storchaka

#27141: Fix collections.UserList shallow copy
http://bugs.python.org/issue27141  opened by bar.harel

#27142: Default int value with xmlrpclib / xmlrpc.client
http://bugs.python.org/issue27142  opened by julienc

#27144: concurrent.futures.as_completed() memory inefficiency
http://bugs.python.org/issue27144  opened by grzgrzgrz3

#27145: long_add and long_sub might return a new int where &small_ints
http://bugs.python.org/issue27145  opened by Oren Milman

#27149: Implement socket.sendmsg() for Windows
http://bugs.python.org/issue27149  opened by mmarkk

#27150: PEP446 (CLOEXEC by default) violation with fcntl.fcntl(..., fc
http://bugs.python.org/issue27150  opened by mmarkk

#27151: multiprocessing.Process leaves read pipes open (Process.sentin
http://bugs.python.org/issue27151  opened by Roman Bolshakov

#27152: Additional assert methods for unittest
http://bugs.python.org/issue27152  opened by serhiy.storchaka

#27154: Regression in file.writelines behavior
http://bugs.python.org/issue27154  opened by snaury

#27156: IDLE: remove unused code
http://bugs.python.org/issue27156  opened by terry.reedy

#27157: Unhelpful error message when one calls a subclass of type with
http://bugs.python.org/issue27157  opened by ppperry

#27161: Confusing exception in Path().with_name
http://bugs.python.org/issue27161  opened by Antony.Lee

#27162: Add idlelib.interface module
http://bugs.python.org/issue27162  opened by terry.reedy

#27163: IDLE entry for What's New in Python 3.6
http://bugs.python.org/issue27163  opened by terry.reedy

#27164: zlib can't decompress DEFLATE using shared dictionary
http://bugs.python.org/issue27164  opened by Vladimir Mihailenco

#27165: Skip callables when displaying exception fields in cgitb
http://bugs.python.org/issue27165  opened by Adam.Bielański

#27167: subprocess reports signal as negative exit status, not documen
http://bugs.python.org/issue27167  opened by dmacnet

#27168: Yury isn't sure comprehensions and await interact correctly
http://bugs.python.org/issue27168  opened by njs

#27169: __debug__ is not optimized out at compile time for anything bu
http://bugs.python.org/issue27169  opened by josh.r

#27170: IDLE: remove Toggle Auto Coloring or add to edit menu & doc
http://bugs.python.org/issue27170  opened by terry.reedy

#27172: Add skip_bound_arg argument to inspect.Signature.from_callable
http://bugs.python.org/issue27172  opened by ryan.petrello

#27173: Modern Unix key bindings for IDLE
http://bugs.python.org/issue27173  opened by serhiy.storchaka

#27175: Unpickling Path objects
http://bugs.python.org/issue27175  opened by Antony.Lee

#27177: re match.group should support __index__
http://bugs.python.org/issue27177  opened by jdemeyer

#27179: subprocess uses wrong encoding on Windows
http://bugs.python.org/issue27179  opened by davispuh

#27180: Doc/pathlib: Please describe the behaviour of Path().rename() 
http://bugs.python.org/issue27180  opened by hashimo

#27181: Add geometric mean to `statistics` module
http://bugs.python.org/issue27181  opened by cool-RR

#27182: PEP 519 support in the stdlib
http://bugs.python.org/issue27182  opened by ethan.furman

#27184: Support path objects in the ntpath module
http://bugs.python.org/issue27184  opened by ethan.furman

#27186: add os.fspath()
http://bugs.python.org/issue27186  opened by ethan.furman

#27187: Relax __all__ location requirement in PEP 8
http://bugs.python.org/issue27187  opened by barry

#27188: sqlite3 execute* methods return value not documented
http://bugs.python.org/issue27188  opened by Dave Sawyer

#27189: configure --with-lto with clang should find the appropriate ll
http://bugs.python.org/issue27189  opened by gregory.p.smith

#27190: Check sqlite3_version before allowing check_same_thread = Fals
http://bugs.python.org/issue27190  opened by Dave Sawyer

#27194: Tarfile superfluous truncate calls slows extraction.
http://bugs.python.org/issue27194  opened by fried

#27195: Crash when RawIOBase.write(b) evaluates b.format
http://bugs.python.org/issue27195  opened by martin.panter

#27196: Eliminate 'ThemeChanged' warning when running IDLE tests
http://bugs.python.org/issue27196  opened by terry.reedy

Most recent 15 issues with no replies (15)
==========================================

#27195: Crash when RawIOBase.write(b) evaluates b.format
http://bugs.python.org/issue27195

#27189: configure --with-lto with clang should find the appropriate ll
http://bugs.python.org/issue27189

#27188: sqlite3 execute* methods return value not documented
http://bugs.python.org/issue27188

#27180: Doc/pathlib: Please describe the behaviour of Path().rename() 
http://bugs.python.org/issue27180

#27175: Unpickling Path objects
http://bugs.python.org/issue27175

#27168: Yury isn't sure comprehensions and await interact correctly
http://bugs.python.org/issue27168

#27165: Skip callables when displaying exception fields in cgitb
http://bugs.python.org/issue27165

#27163: IDLE entry for What's New in Python 3.6
http://bugs.python.org/issue27163

#27162: Add idlelib.interface module
http://bugs.python.org/issue27162

#27151: multiprocessing.Process leaves read pipes open (Process.sentin
http://bugs.python.org/issue27151

#27144: concurrent.futures.as_completed() memory inefficiency
http://bugs.python.org/issue27144

#27139: Increased test coverage for statistics.median_grouped
http://bugs.python.org/issue27139

#27123: Allow `install_headers` command to follow specific directory s
http://bugs.python.org/issue27123

#27121: imghdr does not support jpg files with Lavc bytes
http://bugs.python.org/issue27121

#27115: IDLE/tkinter: in simpledialog, <Return> != [OK] click
http://bugs.python.org/issue27115

Most recent 15 issues waiting for review (15)
=============================================

#27194: Tarfile superfluous truncate calls slows extraction.
http://bugs.python.org/issue27194

#27190: Check sqlite3_version before allowing check_same_thread = Fals
http://bugs.python.org/issue27190

#27186: add os.fspath()
http://bugs.python.org/issue27186

#27179: subprocess uses wrong encoding on Windows
http://bugs.python.org/issue27179

#27177: re match.group should support __index__
http://bugs.python.org/issue27177

#27173: Modern Unix key bindings for IDLE
http://bugs.python.org/issue27173

#27172: Add skip_bound_arg argument to inspect.Signature.from_callable
http://bugs.python.org/issue27172

#27165: Skip callables when displaying exception fields in cgitb
http://bugs.python.org/issue27165

#27164: zlib can't decompress DEFLATE using shared dictionary
http://bugs.python.org/issue27164

#27161: Confusing exception in Path().with_name
http://bugs.python.org/issue27161

#27157: Unhelpful error message when one calls a subclass of type with
http://bugs.python.org/issue27157

#27152: Additional assert methods for unittest
http://bugs.python.org/issue27152

#27145: long_add and long_sub might return a new int where &small_ints
http://bugs.python.org/issue27145

#27144: concurrent.futures.as_completed() memory inefficiency
http://bugs.python.org/issue27144

#27141: Fix collections.UserList shallow copy
http://bugs.python.org/issue27141

Top 10 most discussed issues (10)
=================================

#27157: Unhelpful error message when one calls a subclass of type with
http://bugs.python.org/issue27157  26 msgs

#19611: inspect.getcallargs doesn't properly interpret set comprehensi
http://bugs.python.org/issue19611  15 msgs

#27179: subprocess uses wrong encoding on Windows
http://bugs.python.org/issue27179  12 msgs

#20699: Document that binary IO classes work with bytes-likes objects
http://bugs.python.org/issue20699  11 msgs

#27137: Python implementation of `functools.partial` is not a class
http://bugs.python.org/issue27137  11 msgs

#27161: Confusing exception in Path().with_name
http://bugs.python.org/issue27161  10 msgs

#27136: sock_connect fails for bluetooth (and probably others)
http://bugs.python.org/issue27136   9 msgs

#22558: Missing doc links to source code for Python-coded modules.
http://bugs.python.org/issue22558   8 msgs

#26546: Provide translated french translation on docs.python.org
http://bugs.python.org/issue26546   8 msgs

#27033: Change the decode_data default in smtpd to False
http://bugs.python.org/issue27033   8 msgs

Issues closed (50)
==================

#5252: 2to3 should detect and delete import of removed statvfs module
http://bugs.python.org/issue5252  closed by r.david.murray

#8519: doc: termios and ioctl reference links
http://bugs.python.org/issue8519  closed by orsenthil

#9327: doctest DocFileCase setUp/tearDown asymmetry
http://bugs.python.org/issue9327  closed by berker.peksag

#9363: data_files are not installed relative to sys.prefix
http://bugs.python.org/issue9363  closed by berker.peksag

#12243: getpass.getuser works on OSX
http://bugs.python.org/issue12243  closed by berker.peksag

#12691: tokenize.untokenize is broken
http://bugs.python.org/issue12691  closed by terry.reedy

#13784: Documentation of  xml.sax.xmlreader: Locator.getLineNumber() a
http://bugs.python.org/issue13784  closed by r.david.murray

#17352: Be clear that __prepare__ must be declared as a class method
http://bugs.python.org/issue17352  closed by berker.peksag

#18384: Add devhelp build instructions to the documentation makefile
http://bugs.python.org/issue18384  closed by berker.peksag

#18478: Class bodies: when does a name become local?
http://bugs.python.org/issue18478  closed by terry.reedy

#20496: function definition tutorial encourages bad practice
http://bugs.python.org/issue20496  closed by berker.peksag

#20973: Implement proper comparison operations for in _TotalOrderingMi
http://bugs.python.org/issue20973  closed by r.david.murray

#21271: reset_mock needs parameters to also reset return_value and sid
http://bugs.python.org/issue21271  closed by kushal.das

#21776: distutils.upload uses the wrong order of exceptions
http://bugs.python.org/issue21776  closed by berker.peksag

#23116: Python Tutorial 4.7.1: Improve ask_ok() to cover more input va
http://bugs.python.org/issue23116  closed by berker.peksag

#24647: Document argparse.REMAINDER as being equal to "..."
http://bugs.python.org/issue24647  closed by r.david.murray

#24671: idlelib 2.7: finish converting print statements
http://bugs.python.org/issue24671  closed by terry.reedy

#25570: urllib.request > Request.add_header("abcd","efgh") fails with 
http://bugs.python.org/issue25570  closed by martin.panter

#25926: Clarify that the itertools pure python equivalents are only ap
http://bugs.python.org/issue25926  closed by rhettinger

#25931: os.fork() command distributed in windows Python27 (in SocketSe
http://bugs.python.org/issue25931  closed by gregory.p.smith

#26526: In parsermodule.c, replace over 2KLOC of hand-crafted validati
http://bugs.python.org/issue26526  closed by python-dev

#26553: Write HTTP in uppercase
http://bugs.python.org/issue26553  closed by martin.panter

#26632: @public - an __all__ decorator
http://bugs.python.org/issue26632  closed by barry

#26739: idle: Errno 10035 a non-blocking socket operation could not be
http://bugs.python.org/issue26739  closed by zach.ware

#26829: update docs: when creating classes a new dict is created for t
http://bugs.python.org/issue26829  closed by r.david.murray

#27043: Describe what "inspect.cleandoc" does to synopsis line.
http://bugs.python.org/issue27043  closed by orsenthil

#27113: sqlite3 connect parameter "check_same_thread" not documented
http://bugs.python.org/issue27113  closed by orsenthil

#27117: turtledemo does not work with IDLE's new dark theme.
http://bugs.python.org/issue27117  closed by terry.reedy

#27124: binascii.a2b_hex raises binascii.Error and ValueError, not Typ
http://bugs.python.org/issue27124  closed by martin.panter

#27125: Typo in Python 2 multiprocessing documentation
http://bugs.python.org/issue27125  closed by martin.panter

#27138: FileFinder.find_spec() docstring needs to be corrected.
http://bugs.python.org/issue27138  closed by eric.snow

#27143: python 3.5 conflict with Mailman, ebtables and firewalld
http://bugs.python.org/issue27143  closed by barry

#27146: posixmodule.c needs stdio.h
http://bugs.python.org/issue27146  closed by gregory.p.smith

#27147: importlib docs do not mention PEP 420
http://bugs.python.org/issue27147  closed by eric.snow

#27148: Make VENV_DIR relative to Script directory
http://bugs.python.org/issue27148  closed by vinay.sajip

#27153: Default value shown by argparse.ArgumentDefaultsHelpFormatter 
http://bugs.python.org/issue27153  closed by r.david.murray

#27155: '-' sign typo in example
http://bugs.python.org/issue27155  closed by r.david.murray

#27158: `isinstance` function does not handle types that are their own
http://bugs.python.org/issue27158  closed by ebarry

#27159: Python 3.5.1's websocket's lib crashes in event that internet 
http://bugs.python.org/issue27159  closed by r.david.murray

#27160: str.format: Silent truncation of kwargs when passing keywords 
http://bugs.python.org/issue27160  closed by ebarry

#27166: Spam
http://bugs.python.org/issue27166  closed by ebarry

#27171: Fix various typos
http://bugs.python.org/issue27171  closed by martin.panter

#27174: Update URL to IPython in interactive.rst
http://bugs.python.org/issue27174  closed by berker.peksag

#27176: Addition of assertNotRaises
http://bugs.python.org/issue27176  closed by rhettinger

#27178: Unconverted RST marking in interpreter tutorial
http://bugs.python.org/issue27178  closed by berker.peksag

#27183: Clarify that Py_VISIT(NULL) does nothing
http://bugs.python.org/issue27183  closed by python-dev

#27185: Clarify Test Coverage for the String Module (test_pep292 is no
http://bugs.python.org/issue27185  closed by erinspace

#27191: Add formatting around methods in PEP 8
http://bugs.python.org/issue27191  closed by berker.peksag

#27192: Keyboard Shortcuts Consistently Cause Crashes
http://bugs.python.org/issue27192  closed by ebarry

#27193: Tkinter Unresponsive With Special Keys
http://bugs.python.org/issue27193  closed by ned.deily

From brett at python.org  Fri Jun  3 17:37:03 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 03 Jun 2016 21:37:03 +0000
Subject: [Python-Dev] frame evaluation API PEP
Message-ID: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>

For those of you who follow python-ideas or were at the PyCon US 2016
language summit, you have already seen/heard about this PEP. For those of
you who don't fall into either of those categories, this PEP proposes a
frame evaluation API for CPython. The motivating example of this work has
been Pyjion, the experimental CPython JIT Dino Viehland and I have been
working on in our spare time at Microsoft. The API also works for
debugging, though, as already demonstrated by Google having added a very
similar API internally for debugging purposes.

The PEP is pasted in below and also available in rendered form at
https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I will assign
myself a PEP number once discussion is finished, as for now it's easier
to work in git for the rich rendering of the in-progress PEP).

I should mention that the differences from the versions shown on
python-ideas and at the language summit are that the PEP now lists
Google's use of a very similar API as support, and that it clarifies
that the co_extra field on code objects doesn't change their
immutability (at least from the view of the PEP).

----------
PEP: NNN
Title: Adding a frame evaluation API to CPython
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett at python.org>,
        Dino Viehland <dinov at microsoft.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-May-2016
Post-History: 16-May-2016
              03-Jun-2016

Abstract
========

This PEP proposes to expand CPython's C API [#c-api]_ to allow for
the specification of a per-interpreter function pointer to handle the
evaluation of frames [#pyeval_evalframeex]_. This proposal also
suggests adding a new field to code objects [#pycodeobject]_ to store
arbitrary data for use by the frame evaluation function.

Rationale
=========

One place where flexibility has been lacking in Python is in the direct
execution of Python code. While CPython's C API [#c-api]_ allows for
constructing the data going into a frame object and then evaluating it
via ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_, control over the
execution of Python code comes down to individual objects instead of a
holistic control of execution at the frame level.

While wanting to have influence over frame evaluation may seem a bit
too low-level, it does open the possibility for things such as a
method-level JIT to be introduced into CPython without CPython itself
having to provide one. By allowing external C code to control frame
evaluation, a JIT can participate in the execution of Python code at
the key point where evaluation occurs. This then allows for a JIT to
conditionally recompile Python bytecode to machine code as desired
while still allowing for executing regular CPython bytecode when
running the JIT is not desired. This can be accomplished by allowing
interpreters to specify what function to call to evaluate a frame. And
by placing the API at the frame evaluation level it allows for a
complete view of the execution environment of the code for the JIT.
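
As a rough illustration of that conditional behaviour, here is a toy
Python model. It is not CPython's C API: ``ToyInterpreter``,
``eval_frame``, ``jit_eval`` and ``HOT_THRESHOLD`` are hypothetical
names, and a "frame" is reduced to a plain string. The sketch only shows
a replaceable evaluation function that falls back to the default
evaluator until a piece of code has run often enough to count as hot.

```python
from typing import Callable, Dict


def default_eval(code_name: str) -> str:
    """Stand-in for the regular bytecode interpreter."""
    return f"interpreted:{code_name}"


class ToyInterpreter:
    """Hypothetical interpreter state holding one evaluation hook."""

    def __init__(self) -> None:
        self.eval_frame: Callable[[str], str] = default_eval


HOT_THRESHOLD = 2  # assumed number of runs before "compiling"
_run_counts: Dict[str, int] = {}


def jit_eval(code_name: str) -> str:
    """Toy JIT hook: defer to the default evaluator until code is hot."""
    _run_counts[code_name] = _run_counts.get(code_name, 0) + 1
    if _run_counts[code_name] > HOT_THRESHOLD:
        return f"compiled:{code_name}"
    return default_eval(code_name)


interp = ToyInterpreter()
interp.eval_frame = jit_eval  # install the replacement evaluator
results = [interp.eval_frame("f") for _ in range(4)]
```

The first two evaluations go through ``default_eval()`` and the later
ones take the "compiled" path, which mirrors the idea of only running
the JIT when it is desired.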

This ability to specify a frame evaluation function also allows for
other use-cases beyond just opening CPython up to a JIT. For instance,
it would not be difficult to implement a tracing or profiling function
at the call level with this API. While CPython does provide the
ability to set a tracing or profiling function at the Python level,
a frame evaluation function would be able to match the data
collection of the profiler and quite possibly be faster for tracing
by simply skipping per-line tracing support.
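
For comparison, the existing Python-level hook mentioned above can be
sketched with ``sys.setprofile()``, which reports call and return events
but no per-line events. This shows what is available today, not the
proposed C API; ``count_calls``, ``helper`` and ``work`` are illustrative
names only.

```python
import sys
from collections import Counter

call_counts = Counter()


def count_calls(frame, event, arg):
    # The profile hook sees call/return events only, never "line" events.
    if event == "call":
        call_counts[frame.f_code.co_name] += 1


def helper(n):
    return n * 2


def work():
    out = []
    for i in range(3):
        out.append(helper(i))
    return out


sys.setprofile(count_calls)
try:
    result = work()
finally:
    sys.setprofile(None)  # always remove the hook again
```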

It also opens up the possibility of debugging where the frame
evaluation function only performs special debugging work when it
detects it is about to execute a specific code object. In that
instance the bytecode could theoretically be rewritten in-place to
inject a breakpoint function call at the proper point to help in
debugging, without requiring the heavy-handed approach of
``sys.settrace()``.

To facilitate these use-cases, we are also proposing the addition of
a "scratch space" on code objects via a new field. This will allow
per-code object data to be stored with the code object itself for easy
retrieval by the frame evaluation function as necessary. The field
itself will simply be a ``PyObject *`` type so that any data stored in
the field will participate in normal object memory management.

Proposal
========

All proposed C API changes below will not be part of the stable ABI.

Expanding ``PyCodeObject``
--------------------------

One field is to be added to the ``PyCodeObject`` struct
[#pycodeobject]_::

  typedef struct {
     ...
     PyObject *co_extra;  /* "Scratch space" for the code object. */
  } PyCodeObject;

The ``co_extra`` will be ``NULL`` by default and will not be used by
CPython itself. Third-party code is free to use the field as desired.
Values stored in the field are expected to not be required in order
for the code object to function, allowing the loss of the data of the
field to be acceptable (this keeps the code object as immutable from
a functionality point-of-view; this is slightly contentious and so is
listed as an open issue in `Is co_extra needed?`_). The field will be
freed like all other fields on ``PyCodeObject`` during deallocation
using ``Py_XDECREF()``.

It is not recommended that multiple users attempt to use the
``co_extra`` simultaneously. While a dictionary could theoretically be
set to the field and various users could use a key specific to the
project, there is still the issue of key collisions as well as
performance degradation from using a dictionary lookup on every frame
evaluation. Users are expected to do a type check to make sure that
the field has not been previously set by someone else.

Expanding ``PyInterpreterState``
--------------------------------

The entrypoint for the frame evaluation function is per-interpreter::

  // Same type signature as PyEval_EvalFrameEx().
  typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);

  typedef struct {
      ...
      PyFrameEvalFunction eval_frame;
  } PyInterpreterState;

By default, the ``eval_frame`` field will be initialized to a function
pointer that represents what ``PyEval_EvalFrameEx()`` currently is
(called ``PyEval_EvalFrameDefault()``, discussed later in this PEP).
Third-party code may then set their own frame evaluation function
instead to control the execution of Python code. A pointer comparison
can be used to detect if the field is set to
``PyEval_EvalFrameDefault()`` and thus has not been mutated yet.

Changes to ``Python/ceval.c``
-----------------------------

``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it currently stands
will be renamed to ``PyEval_EvalFrameDefault()``. The new
``PyEval_EvalFrameEx()`` will then become::

    PyObject *
    PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
    {
        PyThreadState *tstate = PyThreadState_GET();
        return tstate->interp->eval_frame(frame, throwflag);
    }

This allows third-party code to place themselves directly in the path
of Python code execution while being backwards-compatible with code
already using the pre-existing C API.
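
The indirection can be modelled in a few lines of Python. This is
only a toy model under stated assumptions: the real API is C, and
every name below (``InterpreterState``, ``eval_frame_ex``,
``counting_eval_frame``) is a hypothetical stand-in for the C
structures and functions described above. It shows the default
dispatch, a third-party hook that delegates to the default, and the
identity check that plays the role of the pointer comparison:

```python
# Toy Python model of the proposed indirection; the real API is in C.
def eval_frame_default(frame, throwflag):
    """Stand-in for PyEval_EvalFrameDefault(): 'runs' the frame."""
    return f"interpreted {frame}"


class InterpreterState:
    """Stand-in for PyInterpreterState with its eval_frame field."""

    def __init__(self):
        self.eval_frame = eval_frame_default


def eval_frame_ex(interp, frame, throwflag=0):
    """Stand-in for the new PyEval_EvalFrameEx(): pure dispatch."""
    return interp.eval_frame(frame, throwflag)


interp = InterpreterState()
seen = []


def counting_eval_frame(frame, throwflag):
    # Hypothetical third-party hook: record the frame, then delegate.
    seen.append(frame)
    return eval_frame_default(frame, throwflag)


# Install the hook only if nobody else has replaced the default yet.
if interp.eval_frame is eval_frame_default:
    interp.eval_frame = counting_eval_frame

result = eval_frame_ex(interp, "frame-1")
print(result, seen)
```

Callers of ``eval_frame_ex()`` never need to know whether a hook is
installed, which is the backwards-compatibility property claimed
above.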

Updating ``python-gdb.py``
--------------------------

The generated ``python-gdb.py`` file used for Python support in GDB
makes some hard-coded assumptions about ``PyEval_EvalFrameEx()``, e.g.
the names of local variables. It will need to be updated to work with
the proposed changes.

Performance impact
==================

As this PEP is proposing an API to add pluggability, performance
impact is considered only in the case where no third-party code has
made any changes.

Several runs of pybench [#pybench]_ consistently showed no performance
cost from the API change alone.

A run of the Python benchmark suite [#py-benchmarks]_ showed no
measurable cost in performance.

In terms of memory impact, since there are typically not many CPython
interpreters executing in a single process that means the impact of
``co_extra`` being added to ``PyCodeObject`` is the only worry.
According to [#code-object-count]_, a run of the Python test suite
results in about 72,395 code objects being created. On a 64-bit
CPU that would result in 579,160 bytes of extra memory being used if
all code objects were alive at once and had nothing set in their
``co_extra`` fields.
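
The estimate is one extra pointer per code object. A quick check of
the arithmetic, assuming an 8-byte ``PyObject *`` on a 64-bit build
(the constant names are illustrative only):

```python
# Back-of-the-envelope check of the memory estimate quoted above.
CODE_OBJECTS = 72395         # code objects created by a test suite run
POINTER_SIZE_BYTES = 8       # assumed size of a PyObject * on 64-bit

extra_bytes = CODE_OBJECTS * POINTER_SIZE_BYTES
extra_kib = extra_bytes / 1024

print(extra_bytes)           # 579160
print(round(extra_kib, 1))   # 565.6
```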

Example Usage
=============

A JIT for CPython
-----------------

Pyjion
''''''

The Pyjion project [#pyjion]_ has used this proposed API to implement
a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each code
object has its ``co_extra`` field set to a ``PyjionJittedCode`` object
which stores four pieces of information:

1. Execution count
2. A boolean representing whether a previous attempt to JIT failed
3. A function pointer to a trampoline (which can be type tracing or not)
4. A void pointer to any JIT-compiled machine code

The frame evaluation function has (roughly) the following algorithm::

    def eval_frame(frame, throw_flag):
        pyjion_code = frame.code.co_extra
        if not pyjion_code:
            # First execution: attach fresh bookkeeping to the code object.
            pyjion_code = frame.code.co_extra = PyjionJittedCode()
        elif not pyjion_code.jit_failed:
            if pyjion_code.jit_code:
                # Already compiled: run the machine code.
                return pyjion_code.eval(pyjion_code.jit_code, frame)
            elif pyjion_code.exec_count > 20_000:
                # Hot code: try to compile it now.
                if jit_compile(frame):
                    return pyjion_code.eval(pyjion_code.jit_code, frame)
                else:
                    pyjion_code.jit_failed = True
        pyjion_code.exec_count += 1
        return PyEval_EvalFrameDefault(frame, throw_flag)

The key point, though, is that all of this work and logic is separate
from CPython and yet with the proposed API changes it is able to
provide a JIT that is compliant with Python semantics (as of this
writing, performance is almost equivalent to CPython without the new
API). This means there's nothing technically preventing others from
implementing their own JITs for CPython by utilizing the proposed API.
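
The intended control flow of that algorithm can be exercised as a
pure-Python toy model. Everything here is a hypothetical stand-in and
not Pyjion's code: ``FakeCode`` plays the code object, ``JittedCode``
plays the ``co_extra`` payload, compilation is faked, and the
threshold is lowered from 20,000 to 3 so the transition is visible:

```python
# Toy model of the frame-evaluation algorithm; no real JIT is involved.
JIT_THRESHOLD = 3  # stand-in for the 20,000 executions mentioned above


class JittedCode:
    """Hypothetical per-code-object record (what co_extra would hold)."""

    def __init__(self):
        self.exec_count = 0
        self.jit_failed = False
        self.jit_code = None


class FakeCode:
    """Stand-in for a code object with a co_extra slot."""

    def __init__(self, compilable):
        self.co_extra = None
        self.compilable = compilable


def jit_compile(code):
    # Pretend compilation: succeeds only for "compilable" code.
    if code.compilable:
        code.co_extra.jit_code = "machine code"
        return True
    return False


def eval_frame(code):
    """Return which path executed the frame: 'jit' or 'interpreter'."""
    extra = code.co_extra
    if extra is None:
        extra = code.co_extra = JittedCode()
    elif not extra.jit_failed:
        if extra.jit_code is not None:
            return "jit"
        if extra.exec_count > JIT_THRESHOLD:
            if jit_compile(code):
                return "jit"
            extra.jit_failed = True
    extra.exec_count += 1
    return "interpreter"


hot = FakeCode(compilable=True)
paths = [eval_frame(hot) for _ in range(7)]
print(paths)  # four 'interpreter' results followed by three 'jit'
```

Code that fails to compile is marked once via ``jit_failed`` and
from then on always takes the interpreter path, so a failed attempt
is never repeated.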

Other JITs
''''''''''

It should be mentioned that the Pyston team was consulted on an
earlier version of this PEP that was more JIT-specific and they were
not interested in utilizing the changes proposed because they want
control over memory layout and had no interest in directly supporting
CPython itself. An informal discussion with a developer on the PyPy
team led to a similar comment.

Numba [#numba]_, on the other hand, suggested that they would be
interested in the proposed change in a post-1.0 future for
themselves [#numba-interest]_.

The experimental Coconut JIT [#coconut]_ could have benefitted from
this PEP. In private conversations with Coconut's creator we were told
that our API was probably superior to the one they developed for
Coconut to add JIT support to CPython.

Debugging
---------

In conversations with the Python Tools for Visual Studio team (PTVS)
[#ptvs]_, they thought they would find these API changes useful for
implementing more performant debugging. As mentioned in the Rationale_
section, this API would allow for switching on debugging functionality
only in frames where it is needed. This could allow for either
skipping information that ``sys.settrace()`` normally provides or
even going as far as dynamically rewriting bytecode prior to
execution to inject e.g. breakpoints in the bytecode.

It also turns out that Google has provided a very similar API
internally for years. It has been used for performant debugging
purposes.

Implementation
==============

A set of patches implementing the proposed API is available through
the Pyjion project [#pyjion]_. In its current form it has more
changes to CPython than just this proposed API, but that is for ease
of development instead of strict requirements to accomplish its goals.

Open Issues
===========

Allow ``eval_frame`` to be ``NULL``
-----------------------------------

Currently the frame evaluation function is expected to always be set.
It could very easily simply default to ``NULL`` instead which would
signal to use ``PyEval_EvalFrameDefault()``. The current proposal of
not special-casing the field seemed the most straightforward, but it
does require that the field not accidentally be cleared, else a crash
may occur.

Is co_extra needed?
-------------------

While discussing this PEP at PyCon US 2016, some core developers
expressed their worry of the ``co_extra`` field making code objects
mutable. The thinking seemed to be that having a field that was
mutated after the creation of the code object made the object seem
mutable, even though no other aspect of code objects changed.

The view of this PEP is that the ``co_extra`` field doesn't change
the fact that code objects are immutable. The field is specified in
this PEP so as not to contain information required to make the code
object usable, making it more of a caching field. It could be viewed
as similar to the UTF-8 cache that string objects have internally;
strings are still considered immutable even though they have a field
that is conditionally set.

The field is also not strictly necessary. While the field greatly
simplifies attaching extra information to code objects, other options
are possible, such as keeping a mapping of code object memory
addresses to what would have been kept in ``co_extra``, or perhaps
using a weak reference of the data on the code object and then
iterating through the weak references until the attached data is
found. But obviously none of these solutions are as simple or
performant as adding the ``co_extra`` field.

Rejected Ideas
==============

A JIT-specific C API
--------------------

Originally this PEP was going to propose a much larger API change
which was more JIT-specific. After soliciting feedback from the Numba
team [#numba]_, though, it became clear that the API was unnecessarily
large. The realization was made that all that was truly needed was the
opportunity to provide a trampoline function to handle execution of
Python code that had been JIT-compiled and a way to attach that
compiled machine code along with other critical data to the
corresponding Python code object. Once it was shown that there was no
loss in functionality or in performance while minimizing the API
changes required, the proposal was changed to its current form.

References
==========

.. [#pyjion] Pyjion project
   (https://github.com/microsoft/pyjion)

.. [#c-api] CPython's C API
   (https://docs.python.org/3/c-api/index.html)

.. [#pycodeobject] ``PyCodeObject``
   (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)

.. [#coreclr] .NET Core Runtime (CoreCLR)
   (https://github.com/dotnet/coreclr)

.. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
   (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)

.. [#numba] Numba
   (http://numba.pydata.org/)

.. [#numba-interest]  numba-users mailing list:
   "Would the C API for a JIT entrypoint being proposed by Pyjion help out
   Numba?"
   (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)

.. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
   (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)

.. [#py-benchmarks] Python benchmark suite
   (https://hg.python.org/benchmarks)

.. [#pyston] Pyston
   (http://pyston.org)

.. [#pypy] PyPy
   (http://pypy.org/)

.. [#ptvs] Python Tools for Visual Studio
   (http://microsoft.github.io/PTVS/)

.. [#coconut] Coconut
   (https://github.com/davidmalcolm/coconut)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From rdmurray at bitdance.com  Fri Jun  3 17:50:29 2016
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 03 Jun 2016 17:50:29 -0400
Subject: [Python-Dev] I broke the 3.5 branch, apparently
Message-ID: <20160603215031.E2C37B14024@webabinitio.net>

I don't understand how it happened, but apparently I got a merge commit
backward and merged 3.6 into 3.5 and pushed it without realizing what
had happened.  If anyone has any clue how to reverse this cleanly,
please let me know.  (There are a couple people at the sprints looking
into it, but the Mercurial guys aren't here so we are short on experts).

My apologies for the mess :(

--David

From python at mrabarnett.plus.com  Fri Jun  3 18:21:25 2016
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 3 Jun 2016 23:21:25 +0100
Subject: [Python-Dev] I broke the 3.5 branch, apparently
In-Reply-To: <20160603215031.E2C37B14024@webabinitio.net>
References: <20160603215031.E2C37B14024@webabinitio.net>
Message-ID: <5e613f77-0f91-1c14-e4dc-39dbfb1cdd4f@mrabarnett.plus.com>

On 2016-06-03 22:50, R. David Murray wrote:
> I don't understand how it happened, but apparently I got a merge commit
> backward and merged 3.6 into 3.5 and pushed it without realizing what
> had happened.  If anyone has any clue how to reverse this cleanly,
> please let me know.  (There are a couple people at the sprints looking
> into it, but the Mercurial guys aren't here so we are short on experts).
>
> My apologies for the mess :(
>
There's a lot about undoing changes here:

http://hgbook.red-bean.com/read/finding-and-fixing-mistakes.html

From rdmurray at bitdance.com  Fri Jun  3 18:29:03 2016
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 03 Jun 2016 18:29:03 -0400
Subject: [Python-Dev] FIXED: I broke the 3.5 branch, apparently
In-Reply-To: <5e613f77-0f91-1c14-e4dc-39dbfb1cdd4f@mrabarnett.plus.com>
References: <20160603215031.E2C37B14024@webabinitio.net>
 <5e613f77-0f91-1c14-e4dc-39dbfb1cdd4f@mrabarnett.plus.com>
Message-ID: <20160603222904.960CFB14024@webabinitio.net>

On Fri, 03 Jun 2016 23:21:25 +0100, MRAB <python at mrabarnett.plus.com> wrote:
> On 2016-06-03 22:50, R. David Murray wrote:
> > I don't understand how it happened, but apparently I got a merge commit
> > backward and merged 3.6 into 3.5 and pushed it without realizing what
> > had happened.  If anyone has any clue how to reverse this cleanly,
> > please let me know.  (There are a couple people at the sprints looking
> > into it, but the Mercurial guys aren't here so we are short on experts).
> >
> > My apologies for the mess :(
> >
> There's a lot about undoing changes here:
> 
> http://hgbook.red-bean.com/read/finding-and-fixing-mistakes.html

Ned Deily has fixed the problem.

--David

From benjamin at python.org  Sat Jun  4 02:11:31 2016
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 03 Jun 2016 23:11:31 -0700
Subject: [Python-Dev] C99
Message-ID: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>

PEP 7 requires CPython to use C code conforming to the venerable C89
standard. Traditionally, we've been stuck with C89 due to poor C support
in MSVC. However, MSVC 2013 and 2015 implement the key features of C99.
C99 does not offer anything earth-shattering; here are the features I
think we'd find most interesting:
- Variable declarations can be on any line: removes possibly the most
annoying limitation of C89.
- Inline functions: We can make Py_DECREF and Py_INCREF inline functions
rather than unpleasant macros.
- C++-style line comments: Not a killer feature but commonly used.
- Booleans
In summary, some niceties that would make CPython hacking a little more
fun.

So, what say you to updating PEP 7 to allow C99 features for Python 3.6
(in so much as GCC and MSVC support them)?

Regards,
Benjamin

From vadmium+py at gmail.com  Sat Jun  4 03:53:20 2016
From: vadmium+py at gmail.com (Martin Panter)
Date: Sat, 4 Jun 2016 07:53:20 +0000
Subject: [Python-Dev] C99
In-Reply-To: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
Message-ID: <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>

On 4 June 2016 at 06:11, Benjamin Peterson <benjamin at python.org> wrote:
> PEP 7 requires CPython to use C code conforming to the venerable C89
> standard. Traditionally, we've been stuck with C89 due to poor C support
> in MSVC. However, MSVC 2013 and 2015 implement the key features of C99.
> C99 does not offer anything earth-shattering; here are the features I
> think we'd find most interesting:
> - Variable declarations can be on any line: removes possibly the most
> annoying limitation of C89.
> - Inline functions: We can make Py_DECREF and Py_INCREF inline functions
> rather than unpleasant macros.
> - C++-style line comments: Not a killer feature but commonly used.
> - Booleans

My most-missed C99 feature would be designated initializers. Does MSVC
support them? It might allow you to do away with those giant pasted
slot tables, and just write the slots you need:

PyTypeObject PyUnicodeIter_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    .tp_name = "str_iterator",
    .tp_basicsize = sizeof(unicodeiterobject),
    .tp_dealloc = unicodeiter_dealloc,
    .tp_getattro = PyObject_GenericGetAttr,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
    .tp_traverse = unicodeiter_traverse,
    .tp_iter = PyObject_SelfIter,
    .tp_iternext = unicodeiter_next,
    .tp_methods = unicodeiter_methods,
};

> So, what say you to updating PEP 7 to allow C99 features for Python 3.6
> (in so much as GCC and MSVC support them)?

Sounds good for features that are well-supported by compilers that
people use. (Are there other compilers used than just GCC and MSVC?)

From storchaka at gmail.com  Sat Jun  4 04:08:39 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 4 Jun 2016 11:08:39 +0300
Subject: [Python-Dev] Improving the bytecode
Message-ID: <niu2a8$n9m$1@ger.gmane.org>

Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode),
there are other issues aimed at improving the bytecode.

1. http://bugs.python.org/issue27129
Make the bytecode more 16-bit oriented.

2. http://bugs.python.org/issue27140
Add a new opcode BUILD_CONST_KEY_MAP for building a dict with constant
keys. This optimizes the common case and is especially helpful for the
two following issues (creating and calling functions).

3. http://bugs.python.org/issue27095
Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in
oparg, the new MAKE_FUNCTION takes built tuples and dicts from the stack.
MAKE_FUNCTION and MAKE_CLOSURE are merged into a single opcode.

4. http://bugs.python.org/issue27213
Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three 
simpler and more efficient opcodes.

5. http://bugs.python.org/issue27127
Rework the for loop implementation.

6. http://bugs.python.org/issue17611
Move unwinding of stack for "pseudo exceptions" from interpreter to 
compiler.
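
For anyone who wants to see which dict-building opcode their own
interpreter emits (relevant to item 2), here is a small sketch using
the standard ``dis`` module. The function ``make_point`` is just an
example, and the opcode names printed depend on the Python version,
so no particular output is claimed:

```python
import dis


def make_point(x, y):
    # A dict display with constant keys and non-constant values.
    return {"x": x, "y": y}


# Collect the opcode names the compiler produced for make_point().
opnames = [ins.opname for ins in dis.get_instructions(make_point)]

# Keep only the opcodes that build a mapping.
map_ops = [name for name in opnames
           if name.startswith("BUILD_") and "MAP" in name]
print(map_ops)
```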

From sebastian at realpath.org  Sat Jun  4 04:12:57 2016
From: sebastian at realpath.org (Sebastian Krause)
Date: Sat, 04 Jun 2016 10:12:57 +0200
Subject: [Python-Dev] C99
In-Reply-To: <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
 (Martin Panter's message of "Sat, 4 Jun 2016 07:53:20 +0000")
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
Message-ID: <m2shwtnzhy.fsf@news.realpath.org>

Martin Panter <vadmium+py at gmail.com> wrote:
>> So, what say you to updating PEP 7 to allow C99 features for Python 3.6
>> (in so much as GCC and MSVC support them)?
>
> Sounds good for features that are well-supported by compilers that
> people use. (Are there other compilers used than just GCC and MSVC?)

clang on OS X, but it supports pretty much everything that GCC
supports as well.

From brett at python.org  Sat Jun  4 12:07:22 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 04 Jun 2016 16:07:22 +0000
Subject: [Python-Dev] Improving the bytecode
In-Reply-To: <niu2a8$n9m$1@ger.gmane.org>
References: <niu2a8$n9m$1@ger.gmane.org>
Message-ID: <CAP1=2W5a2Z5POX+Swo7aMQNLgjtfEdD74BUhM0pDtFCu1cz03Q@mail.gmail.com>

It's not on the list but I'm hoping to convince Dino to work on END_FINALLY
to be a bit more sane.

On Sat, Jun 4, 2016, 01:17 Serhiy Storchaka <storchaka at gmail.com> wrote:

> Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode),
> there are other issues aimed at improving the bytecode.
>
> 1. http://bugs.python.org/issue27129
> Make the bytecode more 16-bit oriented.
>
> 2. http://bugs.python.org/issue27140
> Add a new opcode BUILD_CONST_KEY_MAP for building a dict with constant
> keys. This optimizes the common case and is especially helpful for the
> two following issues (creating and calling functions).
>
> 3. http://bugs.python.org/issue27095
> Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in
> oparg, the new MAKE_FUNCTION takes built tuples and dicts from the stack.
> MAKE_FUNCTION and MAKE_CLOSURE are merged into a single opcode.
>
> 4. http://bugs.python.org/issue27213
> Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three
> simpler and more efficient opcodes.
>
> 5. http://bugs.python.org/issue27127
> Rework the for loop implementation.
>
> 6. http://bugs.python.org/issue17611
> Move unwinding of stack for "pseudo exceptions" from interpreter to
> compiler.

From ericsnowcurrently at gmail.com  Sat Jun  4 13:02:27 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 4 Jun 2016 11:02:27 -0600
Subject: [Python-Dev] Improving the bytecode
In-Reply-To: <niu2a8$n9m$1@ger.gmane.org>
References: <niu2a8$n9m$1@ger.gmane.org>
Message-ID: <CALFfu7DNU5foQugHsyfzCZ+GuKDHi-2Wa-5rwCwFJ+p=m4ZMSA@mail.gmail.com>

You should get in touch with Mark Shannon, while you're working on
ceval.  He has some definite improvements that can be made to the eval
loop.

-eric

On Sat, Jun 4, 2016 at 2:08 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode),
> there are other issues aimed at improving the bytecode.
>
> 1. http://bugs.python.org/issue27129
> Make the bytecode more 16-bit oriented.
>
> 2. http://bugs.python.org/issue27140
> Add a new opcode BUILD_CONST_KEY_MAP for building a dict with constant
> keys. This optimizes the common case and is especially helpful for the
> two following issues (creating and calling functions).
>
> 3. http://bugs.python.org/issue27095
> Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in
> oparg, the new MAKE_FUNCTION takes built tuples and dicts from the stack.
> MAKE_FUNCTION and MAKE_CLOSURE are merged into a single opcode.
>
> 4. http://bugs.python.org/issue27213
> Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three
> simpler and more efficient opcodes.
>
> 5. http://bugs.python.org/issue27127
> Rework the for loop implementation.
>
> 6. http://bugs.python.org/issue17611
> Move unwinding of stack for "pseudo exceptions" from interpreter to
> compiler.

From christian at python.org  Sat Jun  4 13:27:39 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 4 Jun 2016 10:27:39 -0700
Subject: [Python-Dev] C99
In-Reply-To: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
Message-ID: <niv32b$d7k$1@ger.gmane.org>

On 2016-06-03 23:11, Benjamin Peterson wrote:
> PEP 7 requires CPython to use C code conforming to the venerable C89
> standard. Traditionally, we've been stuck with C89 due to poor C support
> in MSVC. However, MSVC 2013 and 2015 implement the key features of C99.
> C99 does not offer anything earth-shattering; here are the features I
> think we'd find most interesting:
> - Variable declarations can be on any line: removes possibly the most
> annoying limitation of C89.
> - Inline functions: We can make Py_DECREF and Py_INCREF inline functions
> rather than unpleasant macros.
> - C++-style line comments: Not a killer feature but commonly used.
> - Booleans
> In summary, some niceties that would make CPython hacking a little more
> fun.
> 
> So, what say you to updating PEP 7 to allow C99 features for Python 3.6
> (in so much as GCC and MSVC support them)?

+1

- We never officially deprecated C89 platforms without 64-bit integers in
PEP 7. Victor's changes to pytime.h imply support for uint64_t and
int64_t. C99 has mandatory long long int support.

- If we also drop Solaris Studio C compiler support, we can replace
header guards (e.g. #ifndef Py_PYTHON_H) with #pragma once

Christian

From guido at python.org  Sat Jun  4 13:47:38 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 4 Jun 2016 10:47:38 -0700
Subject: [Python-Dev] C99
In-Reply-To: <niv32b$d7k$1@ger.gmane.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
Message-ID: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>

Funny. Just two weeks ago I was helping someone who discovered a
compiler that doesn't support the new relaxed variable declaration
rules. I think it was on Windows. Maybe this move is a little too
aggressively deprecating older Windows compilers?

On Sat, Jun 4, 2016 at 10:27 AM, Christian Heimes <christian at python.org> wrote:
> On 2016-06-03 23:11, Benjamin Peterson wrote:
>> PEP 7 requires CPython to use C code conforming to the venerable C89
>> standard. Traditionally, we've been stuck with C89 due to poor C support
>> in MSVC. However, MSVC 2013 and 2015 implement the key features of C99.
>> C99 does not offer anything earth-shattering; here are the features I
>> think we'd find most interesting:
>> - Variable declarations can be on any line: removes possibly the most
>> annoying limitation of C89.
>> - Inline functions: We can make Py_DECREF and Py_INCREF inline functions
>> rather than unpleasant macros.
>> - C++-style line comments: Not a killer feature but commonly used.
>> - Booleans
>> In summary, some niceties that would make CPython hacking a little more
>> fun.
>>
>> So, what say you to updating PEP 7 to allow C99 features for Python 3.6
>> (in so much as GCC and MSVC support them)?
>
> +1
>
> - We never officially deprecated C89 platforms without 64-bit integers in
> PEP 7. Victor's changes to pytime.h imply support for uint64_t and
> int64_t. C99 has mandatory long long int support.
>
> - If we also drop Solaris Studio C compiler support, we can replace
> header guards (e.g. #ifndef Py_PYTHON_H) with #pragma once
>
> Christian

-- 
--Guido van Rossum (python.org/~guido)

From dinov at microsoft.com  Sat Jun  4 14:32:30 2016
From: dinov at microsoft.com (Dino Viehland)
Date: Sat, 4 Jun 2016 18:32:30 +0000
Subject: [Python-Dev] C99
In-Reply-To: <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
Message-ID: <BN3PR03MB2195EF97F647F0E80D241A16BB5A0@BN3PR03MB2195.namprd03.prod.outlook.com>

Martin wrote:
> On 4 June 2016 at 06:11, Benjamin Peterson <benjamin at python.org> wrote:
> > PEP 7 requires CPython to use C code conforming to the venerable C89
> > standard. Traditionally, we've been stuck with C89 due to poor C
> > support in MSVC. However, MSVC 2013 and 2015 implement the key
> > features of C99.
> > C99 does not offer anything earth-shattering; here are the features I
> > think we'd find most interesting:
> > - Variable declarations can be on any line: removes possibly the most
> > annoying limitation of C89.
> > - Inline functions: We can make Py_DECREF and Py_INCREF inline
> > functions rather than unpleasant macros.
> > - C++-style line comments: Not a killer feature but commonly used.
> > - Booleans
> 
> My most-missed C99 feature would be designated initializers. Does MSVC
> support them? It might allow you to do away with those giant pasted slot
> tables, and just write the slots you need:
> 
> PyTypeObject PyUnicodeIter_Type = {
>     PyVarObject_HEAD_INIT(&PyType_Type, 0)
>     .tp_name = "str_iterator",
>     .tp_basicsize = sizeof(unicodeiterobject),
>     .tp_dealloc = unicodeiter_dealloc,
>     .tp_getattro = PyObject_GenericGetAttr,
>     .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
>     .tp_traverse = unicodeiter_traverse,
>     .tp_iter = PyObject_SelfIter,
>     .tp_iternext = unicodeiter_next,
>     .tp_methods = unicodeiter_methods,
> };

I checked and VC++ does actually support this, and it looks like they
support // comments as well.  I don't think it fully supports all of the
C99 features - it appears they just cherry picked some stuff.  The C99
standard library does appear to be fully supported with the exception of
tgmath.h.

From benjamin at python.org  Sat Jun  4 14:47:43 2016
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 04 Jun 2016 11:47:43 -0700
Subject: [Python-Dev] C99
In-Reply-To: <BN3PR03MB2195EF97F647F0E80D241A16BB5A0@BN3PR03MB2195.namprd03.prod.outlook.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
 <BN3PR03MB2195EF97F647F0E80D241A16BB5A0@BN3PR03MB2195.namprd03.prod.outlook.com>
Message-ID: <1465066063.2971497.627971209.737F379F@webmail.messagingengine.com>

On Sat, Jun 4, 2016, at 11:32, Dino Viehland wrote:
> 
> 
> Martin wrote:
> > On 4 June 2016 at 06:11, Benjamin Peterson <benjamin at python.org> wrote:
> > > PEP 7 requires CPython to use C code conforming to the venerable C89
> > > standard. Traditionally, we've been stuck with C89 due to poor C
> > > support in MSVC. However, MSVC 2013 and 2015 implement the key
> > > features of C99.
> > > C99 does not offer anything earth-shattering; here are the features I
> > > think we'd find most interesting:
> > > - Variable declarations can be on any line: removes possibly the most
> > > annoying limitation of C89.
> > > - Inline functions: We can make Py_DECREF and Py_INCREF inline
> > > functions rather than unpleasant macros.
> > > - C++-style line comments: Not a killer feature but commonly used.
> > > - Booleans
> > 
> > My most-missed C99 feature would be designated initializers. Does MSVC
> > support them? It might allow you to do away with those giant pasted slot
> > tables, and just write the slots you need:
> > 
> > PyTypeObject PyUnicodeIter_Type = {
> >     PyVarObject_HEAD_INIT(&PyType_Type, 0)
> >     .tp_name = "str_iterator",
> >     .tp_basicsize = sizeof(unicodeiterobject),
> >     .tp_dealloc = unicodeiter_dealloc,
> >     .tp_getattro = PyObject_GenericGetAttr,
> >     .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
> >     .tp_traverse = unicodeiter_traverse,
> >     .tp_iter = PyObject_SelfIter,
> >     .tp_iternext = unicodeiter_next,
> >     .tp_methods = unicodeiter_methods,
> > };
> 
> I checked and VC++ does actually support this, and it looks like they
> support // comments as well.  I don't think it fully supports all of the
> C99 features - it appears they just cherry-picked some stuff.  The C99
> standard library does appear to be fully supported with the exception of
> tgmath.h.

Are the C99 features VC++ supports documented anywhere? I couldn't find
any list.
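
For readers unfamiliar with the designated initializers Martin mentions,
here is a minimal standalone sketch. The demo_type struct, its members and
the demo_* functions are hypothetical and only loosely modelled on a type
slot table; they are not CPython's definitions. The point is only that
unnamed members are zero-initialized, so a table can list just the slots
it needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot table, loosely modelled on a type object.
   None of these names come from CPython. */
typedef struct {
    const char *name;
    size_t basicsize;
    void (*dealloc)(void *);
    int (*traverse)(void *);
    unsigned long flags;
    void *(*iternext)(void *);
} demo_type;

static void *demo_next(void *self) { return self; }

/* C99: name only the slots that matter; every member that is not
   mentioned is initialized to zero / NULL. */
static const demo_type demo_iter_type = {
    .name = "demo_iterator",
    .basicsize = sizeof(int),
    .flags = 1UL,
    .iternext = demo_next,
};

/* Returns nonzero when the type provides an iteration slot. */
static int demo_is_iterator(const demo_type *t)
{
    assert(t != NULL);
    return t->iternext != NULL;
}
```

With a C89 positional initializer the same table would have to spell out
every slot in declaration order, including the unused ones.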

From christian at python.org  Sat Jun  4 14:50:52 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 4 Jun 2016 11:50:52 -0700
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
Message-ID: <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>

On 2016-06-04 10:47, Guido van Rossum wrote:
> Funny. Just two weeks ago I was helping someone who discovered a
> compiler that doesn't support the new relaxed variable declaration
> rules. I think it was on Windows. Maybe this move is a little too
> aggressively deprecating older Windows compilers?

Yes, it's not supported in VS 2012 and 2008 for Python 3.4 and older. New
C99 features are available in VS 2013,
https://blogs.msdn.microsoft.com/vcblog/2013/06/28/c1114-stl-features-fixes-and-breaking-changes-in-vs-2013/

Python 3.5+ requires VS 2015 anyway. Traditionally we tried to keep
backwards compatibility with older compiler versions. The new features
are tempting enough to deprecate compiler versions that have been
released more than five years ago.

Christian

From guido at python.org  Sat Jun  4 14:59:07 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 4 Jun 2016 11:59:07 -0700
Subject: [Python-Dev] C99
In-Reply-To: <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
Message-ID: <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>

As long as we don't require extension module authors to use them --
they may have their own compatibility requirements.

On Sat, Jun 4, 2016 at 11:50 AM, Christian Heimes <christian at python.org> wrote:
> On 2016-06-04 10:47, Guido van Rossum wrote:
>> Funny. Just two weeks ago I was helping someone who discovered a
>> compiler that doesn't support the new relaxed variable declaration
>> rules. I think it was on Windows. Maybe this move is a little too
>> aggressively deprecating older Windows compilers?
>
> Yes, it's not supported in VS 2012 and 2008 for Python 3.4 and older. New
> C99 features are available in VS 2013,
> https://blogs.msdn.microsoft.com/vcblog/2013/06/28/c1114-stl-features-fixes-and-breaking-changes-in-vs-2013/
>
>
> Python 3.5+ requires VS 2015 anyway. Traditionally we tried to keep
> backwards compatibility with older compiler versions. The new features
> are tempting enough to deprecate compiler versions that have been
> released more than five years ago.
>
> Christian

-- 
--Guido van Rossum (python.org/~guido)

From christian at python.org  Sat Jun  4 15:05:09 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 4 Jun 2016 12:05:09 -0700
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
Message-ID: <0764173d-e69b-4f75-8797-30828ca6b471@python.org>

On 2016-06-04 11:59, Guido van Rossum wrote:
> As long as we don't require extension module authors to use them --
> they may have their own compatibility requirements.

On Windows extension modules must be compiled with a specific version of
MSVC anyway. For Python 3.6, VS 2015 or newer is a hard requirement.

We kept the old compiler directories around for embedders.

From guido at python.org  Sat Jun  4 15:07:07 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 4 Jun 2016 12:07:07 -0700
Subject: [Python-Dev] C99
In-Reply-To: <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
Message-ID: <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>

I'm talking about 3rd party extensions. Those may require source
compatibility with older Python versions. All I'm asking for is to not
require source-level use of C99 features. Of course requiring a
specific compiler to work with specific CPython versions is fine.

On Sat, Jun 4, 2016 at 12:05 PM, Christian Heimes <christian at python.org> wrote:
> On 2016-06-04 11:59, Guido van Rossum wrote:
>> As long as we don't require extension module authors to use them --
>> they may have their own compatibility requirements.
>
> On Windows extension modules must be compiled with a specific version of
> MSVC anyway. For Python 3.6, VS 2015 or newer is a hard requirement.
>
> We kept the old compiler directories around for embedders.
>

-- 
--Guido van Rossum (python.org/~guido)

From christian at python.org  Sat Jun  4 15:10:26 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 4 Jun 2016 12:10:26 -0700
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
Message-ID: <7104eb32-9e06-7e33-7e5a-542b3bc94a35@python.org>

On 2016-06-04 12:07, Guido van Rossum wrote:
> I'm talking about 3rd party extensions. Those may require source
> compatibility with older Python versions. All I'm asking for is to not
> require source-level use of C99 features. Of course requiring a
> specific compiler to work with specific CPython versions is fine.

Ah, the other way around. Yes, that makes a lot of sense.

From larry at hastings.org  Sat Jun  4 17:12:13 2016
From: larry at hastings.org (Larry Hastings)
Date: Sat, 4 Jun 2016 14:12:13 -0700
Subject: [Python-Dev] C99
In-Reply-To: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
Message-ID: <5753442D.2000008@hastings.org>

On 06/03/2016 11:11 PM, Benjamin Peterson wrote:
> So, what say you to updating PEP 7 to allow C99 features for Python 3.6
> (in so much as GCC and MSVC support them)?

+1

Clearly it'll be 3.5+ only, and clearly it'll be a specific list of 
features ("C89 but also permitting //-comments, variadic macros, 
variable declarations on any line, inline functions, and designated 
initializers").  But I'm looking forward to it!

We already had macros for inline (e.g. Py_LOCAL_INLINE), maybe we can 
remove those.
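
As a rough illustration of the feature list above, here is a small
standalone sketch that uses //-comments, a variadic macro, declarations
after statements and inline functions. All demo_* and DEMO_* names are
hypothetical and chosen for this example; in particular demo_object is not
PyObject, and demo_incref/demo_decref only hint at what an inline
Py_INCREF/Py_DECREF might look like.

```c
#include <assert.h>
#include <stdio.h>

// C99 line comment.

/* Variadic macro: forwards a format string and its arguments.
   Strict C99 requires at least one argument after fmt. */
#define DEMO_FORMAT(buf, size, fmt, ...) \
    snprintf((buf), (size), (fmt), __VA_ARGS__)

/* Hypothetical reference-counted object; not CPython's PyObject. */
typedef struct { long refcnt; } demo_object;

/* Inline functions instead of macros: the argument is evaluated once
   and the compiler type-checks it. */
static inline void demo_incref(demo_object *op) { op->refcnt++; }

static inline long demo_decref(demo_object *op)
{
    assert(op->refcnt > 0);
    return --op->refcnt;
}

static long demo_sum(const long *values, int n)
{
    if (n <= 0)
        return 0;
    long total = 0;               // declaration after a statement
    for (int i = 0; i < n; i++)   // loop-scoped declaration
        total += values[i];
    return total;
}
```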

//arry/

From meadori at gmail.com  Sat Jun  4 17:22:20 2016
From: meadori at gmail.com (Meador Inge)
Date: Sat, 4 Jun 2016 16:22:20 -0500
Subject: [Python-Dev] C99
In-Reply-To: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
Message-ID: <CAK1QoorF9sRtCjSRdTE1Jg-JGnpTHHxMN=UH9kqkv102Pz5DtQ@mail.gmail.com>

On Sat, Jun 4, 2016 at 1:11 AM, Benjamin Peterson <benjamin at python.org>
wrote:

> So, what say you to updating PEP 7 to allow C99 features for Python 3.6
> (in so much as GCC and MSVC support them)?
>

+1

# Meador

From christian at python.org  Sat Jun  4 20:26:01 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 4 Jun 2016 17:26:01 -0700
Subject: [Python-Dev] cpython: replace custom validation logic in the
 parse module with a simple DFA validator
In-Reply-To: <20160602183248.72964.43203.21CCDB55@psf.io>
References: <20160602183248.72964.43203.21CCDB55@psf.io>
Message-ID: <4005d323-bb04-9dda-00c8-a5fb271061f9@python.org>

On 2016-06-02 11:32, benjamin.peterson wrote:
> https://hg.python.org/cpython/rev/4a9159ea2536
> changeset:   101601:4a9159ea2536
> user:        Benjamin Peterson <benjamin at python.org>
> date:        Thu Jun 02 11:30:18 2016 -0700
> summary:
>   replace custom validation logic in the parse module with a simple DFA validator (closes #26526)
> 
> Patch from A. Skrobov.
> 
> files:
>   Misc/NEWS              |     3 +
>   Modules/parsermodule.c |  2545 +--------------------------
>   2 files changed, 96 insertions(+), 2452 deletions(-)
> 
> 
> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -22,6 +22,9 @@
>  Library
>  -------
>  
> +- Issue #26526: Replace custom parse tree validation in the parser
> +  module with a simple DFA validator.
> +
>  - Issue #27114: Fix SSLContext._load_windows_store_certs fails with
>    PermissionError
>  
> diff --git a/Modules/parsermodule.c b/Modules/parsermodule.c
> --- a/Modules/parsermodule.c
> +++ b/Modules/parsermodule.c
> @@ -670,9 +670,75 @@
>  
>  
>  static node* build_node_tree(PyObject *tuple);
> -static int   validate_expr_tree(node *tree);
> -static int   validate_file_input(node *tree);
> -static int   validate_encoding_decl(node *tree);
> +
> +static int
> +validate_node(node *tree)
> +{
> +    int type = TYPE(tree);
> +    int nch = NCH(tree);
> +    dfa *nt_dfa;
> +    state *dfa_state;
> +    int pos, arc;
> +
> +    assert(ISNONTERMINAL(type));
> +    type -= NT_OFFSET;
> +    if (type >= _PyParser_Grammar.g_ndfas) {
> +        PyErr_Format(parser_error, "Unrecognized node type %d.", TYPE(tree));
> +        return 0;
> +    }
> +    nt_dfa = &_PyParser_Grammar.g_dfa[type];
> +    REQ(tree, nt_dfa->d_type);
> +
> +    /* Run the DFA for this nonterminal. */
> +    dfa_state = &nt_dfa->d_state[nt_dfa->d_initial];
> +    for (pos = 0; pos < nch; ++pos) {
> +        node *ch = CHILD(tree, pos);
> +        int ch_type = TYPE(ch);
> +        for (arc = 0; arc < dfa_state->s_narcs; ++arc) {
> +            short a_label = dfa_state->s_arc[arc].a_lbl;
> +            assert(a_label < _PyParser_Grammar.g_ll.ll_nlabels);
> +            if (_PyParser_Grammar.g_ll.ll_label[a_label].lb_type == ch_type) {
> +     	        /* The child is acceptable; if non-terminal, validate it recursively. */
> +                if (ISNONTERMINAL(ch_type) && !validate_node(ch))
> +                    return 0;
> +
> +                /* Update the state, and move on to the next child. */
> +                dfa_state = &nt_dfa->d_state[dfa_state->s_arc[arc].a_arrow];
> +                goto arc_found;
> +            }
> +        }
> +        /* What would this state have accepted? */
> +        {
> +            short a_label = dfa_state->s_arc->a_lbl;
> +            int next_type;
> +            if (!a_label) /* Wouldn't accept any more children */
> +                goto illegal_num_children;
> +
> +            next_type = _PyParser_Grammar.g_ll.ll_label[a_label].lb_type;
> +            if (ISNONTERMINAL(next_type))
> +                PyErr_Format(parser_error, "Expected node type %d, got %d.",
> +                             next_type, ch_type);
> +            else
> +                PyErr_Format(parser_error, "Illegal terminal: expected %s.",
> +                             _PyParser_TokenNames[next_type]);

Coverity doesn't like that line:

CID 1362505 (#1 of 1): Out-of-bounds read (OVERRUN)
20. overrun-local: Overrunning array _PyParser_TokenNames of 58 8-byte
elements at element index 255 (byte offset 2040) using index next_type
(which evaluates to 255).

Can you add a check to verify that next_type is not out of bounds, e.g.

+            else if (next_type > N_TOKENS)
+                PyErr_Format(parser_error, "Illegal node type %d",
+                             next_type);
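
To make the general idea concrete, here is a standalone sketch of a
range-checked table lookup. The table and the demo_* names are
hypothetical stand-ins (the real _PyParser_TokenNames is larger and has
its own sizing rules), so this only shows the pattern of checking the
index before reading the array, not the exact condition the patch needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token-name table; the real table is much larger. */
static const char *const demo_token_names[] = {
    "ENDMARKER", "NAME", "NUMBER"
};
#define DEMO_N_NAMES \
    (sizeof(demo_token_names) / sizeof(demo_token_names[0]))

/* Return the name for `type`, or NULL when the index is outside the
   table, so callers never read past the end of the array. */
static const char *demo_token_name(int type)
{
    if (type < 0 || (size_t)type >= DEMO_N_NAMES)
        return NULL;
    assert(demo_token_names[type] != NULL);  /* entries are never NULL */
    return demo_token_names[type];
}
```

A caller would test the result for NULL and fall back to reporting the
numeric type.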

> +            return 0;
> +        }
> +
> +arc_found:
> +        continue;
> +    }
> +    /* Are we in a final state? If so, return 1 for successful validation. */
> +    for (arc = 0; arc < dfa_state->s_narcs; ++arc) {
> +        if (!dfa_state->s_arc[arc].a_lbl) {
> +            return 1;
> +        }
> +    }
> +
> +illegal_num_children:
> +    PyErr_Format(parser_error,
> +                 "Illegal number of children for %s node.", nt_dfa->d_name);
> +    return 0;
> +}
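
For anyone skimming the diff, the validation loop boils down to running
the children of a node through a DFA and checking that it ends in an
accepting state. Below is a toy standalone sketch of that idea. The
demo_* types and the tiny grammar (NAME (',' NAME)*) are assumptions made
for this example; they only imitate the shape of the grammar tables, with
label 0 standing for "may stop here", and they leave out the recursive
validation of non-terminal children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the grammar tables. */
typedef struct { int label; int target; } demo_arc;
typedef struct { int narcs; const demo_arc *arcs; } demo_state;

enum { DEMO_END = 0, DEMO_NAME = 1, DEMO_COMMA = 2 };

/* DFA for:  NAME (',' NAME)*  */
static const demo_arc demo_s0_arcs[] = { { DEMO_NAME, 1 } };
static const demo_arc demo_s1_arcs[] = { { DEMO_COMMA, 0 }, { DEMO_END, 1 } };
static const demo_state demo_states[] = {
    { 1, demo_s0_arcs },
    { 2, demo_s1_arcs },
};

/* Return 1 if the child types drive the DFA from state 0 to an
   accepting state, 0 otherwise. */
static int demo_validate(const int *children, size_t n)
{
    const demo_state *st = &demo_states[0];

    assert(children != NULL || n == 0);
    for (size_t pos = 0; pos < n; pos++) {
        int found = 0;
        for (int a = 0; a < st->narcs; a++) {
            int label = st->arcs[a].label;
            if (label != DEMO_END && label == children[pos]) {
                st = &demo_states[st->arcs[a].target];
                found = 1;
                break;
            }
        }
        if (!found)
            return 0;   /* no arc accepts this child */
    }
    /* Accepting states carry an arc labelled DEMO_END. */
    for (int a = 0; a < st->narcs; a++)
        if (st->arcs[a].label == DEMO_END)
            return 1;
    return 0;
}
```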

From mark at hotpy.org  Sat Jun  4 20:53:57 2016
From: mark at hotpy.org (Mark Shannon)
Date: Sat, 4 Jun 2016 17:53:57 -0700
Subject: [Python-Dev] Improving the bytecode
In-Reply-To: <CALFfu7DNU5foQugHsyfzCZ+GuKDHi-2Wa-5rwCwFJ+p=m4ZMSA@mail.gmail.com>
References: <niu2a8$n9m$1@ger.gmane.org>
 <CALFfu7DNU5foQugHsyfzCZ+GuKDHi-2Wa-5rwCwFJ+p=m4ZMSA@mail.gmail.com>
Message-ID: <57537825.7000706@hotpy.org>

On 04/06/16 10:02, Eric Snow wrote:
> You should get in touch with Mark Shannon, while you're working on
> ceval.  He has some definite improvements that can be made to the eval
> loop.

See http://bugs.python.org/issue17611 for my suggested improvements.
I've made a new comment there.

Cheers,
Mark.

>
> -eric
>
> On Sat, Jun 4, 2016 at 2:08 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode),
>> there are other issues for improving the bytecode.
>>
>> 1. http://bugs.python.org/issue27129
>> Make the bytecode more 16-bit oriented.
>>
>> 2. http://bugs.python.org/issue27140
>> Add new opcode BUILD_CONST_KEY_MAP for building a dict with constant keys.
>> This optimizes the common case and is especially helpful for the two
>> following issues (creating and calling functions).
>>
>> 3. http://bugs.python.org/issue27095
>> Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in
>> oparg, the new MAKE_FUNCTION takes built tuples and dicts from the stack.
>> MAKE_FUNCTION and MAKE_CLOSURE are merged into a single opcode.
>>
>> 4. http://bugs.python.org/issue27213
>> Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three
>> simpler and more efficient opcodes.
>>
>> 5. http://bugs.python.org/issue27127
>> Rework the for loop implementation.
>>
>> 6. http://bugs.python.org/issue17611
>> Move unwinding of stack for "pseudo exceptions" from interpreter to
>> compiler.
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com

From raymond.hettinger at gmail.com  Sun Jun  5 14:24:50 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 5 Jun 2016 11:24:50 -0700
Subject: [Python-Dev] Improving the bytecode
In-Reply-To: <niu2a8$n9m$1@ger.gmane.org>
References: <niu2a8$n9m$1@ger.gmane.org>
Message-ID: <535A1D81-B1BD-439C-8167-A93B2174507E@gmail.com>

> On Jun 4, 2016, at 1:08 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> 
> Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode), there are other issues for improving the bytecode.
> 
> 1. http://bugs.python.org/issue27129
> Make the bytecode more 16-bit oriented.

I don't think this should be done.  Adding the /2 and *2 just complicates the code and messes with my ability to reason about jumps.

With VM opcodes, there is always a tension between being close to implementation (what byte address are we jumping to) and being high level (what is the word offset).  In this case, I think we should stay with the former because they are primarily used in ceval.c and peephole.c which are close to the implementation.  At the higher level, there isn't any real benefit either (because dis.py already does a nice job of translating the jump targets).

Here is one example of the parts of the diff that cause concern that future maintenance will be made more difficult by the change:

-                j = blocks[j + i + 2] - blocks[i] - 2;
+                j = (blocks[j * 2 + i + 2] - blocks[i] - 2) / 2;

Reviewing the original line only gives me a mild headache while the second one really makes me want to avert my eyes ;-)
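
To spell out the two conventions being weighed, here is a small standalone
sketch. It assumes every instruction is exactly one 16-bit code unit
(opcode byte plus argument byte) and that a relative jump is measured from
the instruction following the jump; the demo_* names are made up for this
example and do not appear in ceval.c or peephole.c.

```c
#include <assert.h>

/* Assumption for this sketch: every instruction is one 16-bit code
   unit, i.e. two bytes. */
enum { DEMO_UNIT_SIZE = 2 };

/* Byte offset -> instruction (word) index. */
static int demo_byte_to_word(int byte_offset)
{
    assert(byte_offset >= 0 && byte_offset % DEMO_UNIT_SIZE == 0);
    return byte_offset / DEMO_UNIT_SIZE;
}

/* Instruction (word) index -> byte offset. */
static int demo_word_to_byte(int word_index)
{
    assert(word_index >= 0);
    return word_index * DEMO_UNIT_SIZE;
}

/* Relative jump target when positions and arguments count bytes. */
static int demo_rel_target_bytes(int jump_at, int arg)
{
    return jump_at + DEMO_UNIT_SIZE + arg;
}

/* The same jump when positions and arguments count words. */
static int demo_rel_target_words(int jump_at, int arg)
{
    return jump_at + 1 + arg;
}
```

Mixing the two conventions in one expression is where the extra /2 and *2
factors come from.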

> 2. http://bugs.python.org/issue27140
> Add new opcode BUILD_CONST_KEY_MAP for building a dict with constant keys. This optimizes the common case and is especially helpful for the two following issues (creating and calling functions).

This shows promise. 

The proposed name BUILD_CONST_KEY_MAP is much more clear than BUILD_MAP_EX.

> 3. http://bugs.python.org/issue27095
> Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in oparg, the new MAKE_FUNCTION takes built tuples and dicts from the stack. MAKE_FUNCTION and MAKE_CLOSURE are merged into a single opcode.
> 
> 4. http://bugs.python.org/issue27213
> Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three simpler and more efficient opcodes.

+1

> 5. http://bugs.python.org/issue27127
> Rework the for loop implementation.

I'm unclear what problem is being solved by requiring that GET_ITER always be followed immediately by FOR_ITER.

> 6. http://bugs.python.org/issue17611
> Move unwinding of stack for "pseudo exceptions" from interpreter to compiler.

I have mixed feelings on this one, at once applauding efforts to simplify an eternally messy part of the eval loop and at the same time worried that it throws away years of tweaks and improvements that came beforehand.  This is more of a major surgery than the other patches.

Raymond Hettinger

From storchaka at gmail.com  Sun Jun  5 15:16:57 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 5 Jun 2016 22:16:57 +0300
Subject: [Python-Dev] Improving the bytecode
In-Reply-To: <535A1D81-B1BD-439C-8167-A93B2174507E@gmail.com>
References: <niu2a8$n9m$1@ger.gmane.org>
 <535A1D81-B1BD-439C-8167-A93B2174507E@gmail.com>
Message-ID: <nj1tr9$cc0$1@ger.gmane.org>

On 05.06.16 21:24, Raymond Hettinger wrote:
>> On Jun 4, 2016, at 1:08 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> 1. http://bugs.python.org/issue27129
>> Make the bytecode more 16-bit oriented.
>
> I don't think this should be done.  Adding the /2 and *2 just complicates the code and messes with my ability to reason about jumps.
>
> With VM opcodes, there is always a tension between being close to implementation (what byte address are we jumping to) and being high level (what is the word offset).  In this case, I think we should stay with the former because they are primarily used in ceval.c and peephole.c which are close to the implementation.  At the higher level, there isn't any real benefit either (because dis.py already does a nice job of translating the jump targets).
>
> Here is one example of the parts of the diff that cause concern that future maintenance will be made more difficult by the change:
>
> -                j = blocks[j + i + 2] - blocks[i] - 2;
> +                j = (blocks[j * 2 + i + 2] - blocks[i] - 2) / 2;
>
> Reviewing the original line only gives me a mild headache while the second one really makes me want to avert my eyes ;-)

The /2 and *2 are added just because Victor wants to keep f_lineno
counting bytes. Please look at my first patch. It doesn't contain /2 and
*2. It even contains far fewer +2 and -2. For example, the above change
looks like this:

-                j = blocks[j + i + 2] - blocks[i] - 2;
+                j = blocks[j + i + 1] - blocks[i] - 1;

Doesn't this give you less of a headache?

>> 2. http://bugs.python.org/issue27140
>> Add new opcode BUILD_CONST_KEY_MAP for building a dict with constant keys. This optimizes the common case and is especially helpful for the two following issues (creating and calling functions).
>
> This shows promise.
>
> The proposed name BUILD_CONST_KEY_MAP is much more clear than BUILD_MAP_EX.

If you accept this patch, I'll commit it. At least two other issues are
waiting on this.

>> 5. http://bugs.python.org/issue27127
>> Rework the for loop implementation.
>
> I'm unclear what problem is being solved by requiring that GET_ITER always be followed immediately by FOR_ITER.

As I understand it, the purpose was to decrease the number of executed
opcodes. It looks to me that the existing patch is not acceptable, because
there is a reason for using two opcodes at the start of the for loop. But I
think that we can use another optimization here. I'll try to write a patch.

From sturla.molden at gmail.com  Sun Jun  5 22:28:44 2016
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 6 Jun 2016 02:28:44 +0000 (UTC)
Subject: [Python-Dev] C99
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
Message-ID: <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>

Guido van Rossum <guido at python.org> wrote:

> I'm talking about 3rd party extensions. Those may require source
> compatibility with older Python versions. All I'm asking for is to not
> require source-level use of C99 features. 

This of course removes a lot of its usefulness. E.g. macros cannot be
replaced by inline functions, as header files must still be plain C89.

Sturla Molden

From tritium-list at sdamon.com  Sun Jun  5 22:35:28 2016
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Sun, 5 Jun 2016 22:35:28 -0400
Subject: [Python-Dev] C99
In-Reply-To: <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
Message-ID: <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-bounces+tritium-
> list=sdamon.com at python.org] On Behalf Of Sturla Molden
> Sent: Sunday, June 5, 2016 10:29 PM
> To: python-dev at python.org
> Subject: Re: [Python-Dev] C99
> 
> Guido van Rossum <guido at python.org> wrote:
> 
> > I'm talking about 3rd party extensions. Those may require source
> > compatibility with older Python versions. All I'm asking for is to not
> > require source-level use of C99 features.
> 
> This of course removes a lot of its usefulness. E.g. macros cannot be
> replaced by inline functions, as header files must still be plain C89.
> 
> 
> Sturla Molden
> 

I share Guido's priority there - source compatibility is more important than
smoothing a few of C's rough edges.  Maybe this should be considered for the
next breaking-change release (python 4000... python 5000?)

From vgr255 at live.ca  Sun Jun  5 22:42:12 2016
From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=)
Date: Sun, 5 Jun 2016 22:42:12 -0400
Subject: [Python-Dev] C99
In-Reply-To: <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
Message-ID: <BLU403-EAS14032B06095DCB5DBD22849915C0@phx.gbl>

> From: Python-Dev [mailto:python-dev-
> bounces+vgr255=live.ca at python.org] On Behalf Of tritium-
> list at sdamon.com
> Sent: Sunday, June 05, 2016 10:35 PM
> To: 'Sturla Molden'; python-dev at python.org
> Subject: Re: [Python-Dev] C99
> 
> > -----Original Message-----
> > From: Python-Dev [mailto:python-dev-bounces+tritium-
> > list=sdamon.com at python.org] On Behalf Of Sturla Molden
> > Sent: Sunday, June 5, 2016 10:29 PM
> > To: python-dev at python.org
> > Subject: Re: [Python-Dev] C99
> >
> > Guido van Rossum <guido at python.org> wrote:
> >
> > > I'm talking about 3rd party extensions. Those may require source
> > > compatibility with older Python versions. All I'm asking for is to not
> > > require source-level use of C99 features.
> >
> > This of course removes a lot of its usefulness. E.g. macros cannot be
> > replaced by inline functions, as header files must still be plain C89.
> >
> >
> > Sturla Molden
> >
> 
> I share Guido's priority there - source compatibility is more important
> than smoothing a few of C's rough edges.

Correct me if I'm wrong, but I think that Guido meant that the third-party
extensions might require their own code (not CPython's) to be compatible
with versions of CPython < 3.6, and so PEP 7 shouldn't force them to break
their own backwards compatibility.

Either way I'm +1 for allowing (but not enforcing) C99 syntax.

> Maybe this should be considered for the next breaking-change release
> (python 4000... python 5000?)

Let's not!

-Emanuel

From sturla.molden at gmail.com  Sun Jun  5 22:42:15 2016
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 6 Jun 2016 02:42:15 +0000 (UTC)
Subject: [Python-Dev] C99
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
Message-ID: <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>

<tritium-list at sdamon.com> wrote:

> I share Guido's priority there - source compatibility is more important than
> smoothing a few of C's rough edges.  Maybe the next breaking change release
> this should be considered (python 4000... python 5000?)

I was simply pointing out that Guido's priority removes a lot of the
usefulness of C99 at source level. I was not saying I disagreed. If we have
to keep header files clean of C99 I think this proposal just adds clutter.

From guido at python.org  Sun Jun  5 22:52:52 2016
From: guido at python.org (Guido van Rossum)
Date: Sun, 5 Jun 2016 19:52:52 -0700
Subject: [Python-Dev] C99
In-Reply-To: <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
Message-ID: <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>

I'm not sure I meant that. But if I have a 3rd party extension that
compiles with 3.5 headers using C89, then it should still compile with
3.6 headers using C99. Also if I compile it for 3.5 and it only uses
the ABI it should still be linkable with 3.6.

On Sun, Jun 5, 2016 at 7:42 PM, Sturla Molden <sturla.molden at gmail.com> wrote:
> <tritium-list at sdamon.com> wrote:
>
>> I share Guido's priority there - source compatibility is more important than
>> smoothing a few of C's rough edges.  Maybe this should be considered for the
>> next breaking-change release (python 4000... python 5000?)
>
> I was simply pointing out that Guido's priority removes a lot of the
> usefulness of C99 at source level. I was not saying I disagreed. If we have
> to keep header files clean of C99 I think this proposal just adds clutter.

-- 
--Guido van Rossum (python.org/~guido)

From benjamin at python.org  Mon Jun  6 02:51:54 2016
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 05 Jun 2016 23:51:54 -0700
Subject: [Python-Dev] cpython: replace custom validation logic in the
 parse module with a simple DFA validator
In-Reply-To: <4005d323-bb04-9dda-00c8-a5fb271061f9@python.org>
References: <20160602183248.72964.43203.21CCDB55@psf.io>
 <4005d323-bb04-9dda-00c8-a5fb271061f9@python.org>
Message-ID: <1465195914.3950157.628917729.492FEB13@webmail.messagingengine.com>

On Sat, Jun 4, 2016, at 17:26, Christian Heimes wrote:
> On 2016-06-02 11:32, benjamin.peterson wrote:
> > https://hg.python.org/cpython/rev/4a9159ea2536
> > changeset:   101601:4a9159ea2536
> > user:        Benjamin Peterson <benjamin at python.org>
> > date:        Thu Jun 02 11:30:18 2016 -0700
> > summary:
> >   replace custom validation logic in the parse module with a simple DFA validator (closes #26526)
> > 
> > Patch from A. Skrobov.
> > 
> > files:
> >   Misc/NEWS              |     3 +
> >   Modules/parsermodule.c |  2545 +--------------------------
> >   2 files changed, 96 insertions(+), 2452 deletions(-)
> > 
> > 
> > diff --git a/Misc/NEWS b/Misc/NEWS
> > --- a/Misc/NEWS
> > +++ b/Misc/NEWS
> > @@ -22,6 +22,9 @@
> >  Library
> >  -------
> >  
> > +- Issue #26526: Replace custom parse tree validation in the parser
> > +  module with a simple DFA validator.
> > +
> >  - Issue #27114: Fix SSLContext._load_windows_store_certs fails with
> >    PermissionError
> >  
> > diff --git a/Modules/parsermodule.c b/Modules/parsermodule.c
> > --- a/Modules/parsermodule.c
> > +++ b/Modules/parsermodule.c
> > @@ -670,9 +670,75 @@
> >  
> >  
> >  static node* build_node_tree(PyObject *tuple);
> > -static int   validate_expr_tree(node *tree);
> > -static int   validate_file_input(node *tree);
> > -static int   validate_encoding_decl(node *tree);
> > +
> > +static int
> > +validate_node(node *tree)
> > +{
> > +    int type = TYPE(tree);
> > +    int nch = NCH(tree);
> > +    dfa *nt_dfa;
> > +    state *dfa_state;
> > +    int pos, arc;
> > +
> > +    assert(ISNONTERMINAL(type));
> > +    type -= NT_OFFSET;
> > +    if (type >= _PyParser_Grammar.g_ndfas) {
> > +        PyErr_Format(parser_error, "Unrecognized node type %d.", TYPE(tree));
> > +        return 0;
> > +    }
> > +    nt_dfa = &_PyParser_Grammar.g_dfa[type];
> > +    REQ(tree, nt_dfa->d_type);
> > +
> > +    /* Run the DFA for this nonterminal. */
> > +    dfa_state = &nt_dfa->d_state[nt_dfa->d_initial];
> > +    for (pos = 0; pos < nch; ++pos) {
> > +        node *ch = CHILD(tree, pos);
> > +        int ch_type = TYPE(ch);
> > +        for (arc = 0; arc < dfa_state->s_narcs; ++arc) {
> > +            short a_label = dfa_state->s_arc[arc].a_lbl;
> > +            assert(a_label < _PyParser_Grammar.g_ll.ll_nlabels);
> > +            if (_PyParser_Grammar.g_ll.ll_label[a_label].lb_type == ch_type) {
> > +     	        /* The child is acceptable; if non-terminal, validate it recursively. */
> > +                if (ISNONTERMINAL(ch_type) && !validate_node(ch))
> > +                    return 0;
> > +
> > +                /* Update the state, and move on to the next child. */
> > +                dfa_state = &nt_dfa->d_state[dfa_state->s_arc[arc].a_arrow];
> > +                goto arc_found;
> > +            }
> > +        }
> > +        /* What would this state have accepted? */
> > +        {
> > +            short a_label = dfa_state->s_arc->a_lbl;
> > +            int next_type;
> > +            if (!a_label) /* Wouldn't accept any more children */
> > +                goto illegal_num_children;
> > +
> > +            next_type = _PyParser_Grammar.g_ll.ll_label[a_label].lb_type;
> > +            if (ISNONTERMINAL(next_type))
> > +                PyErr_Format(parser_error, "Expected node type %d, got %d.",
> > +                             next_type, ch_type);
> > +            else
> > +                PyErr_Format(parser_error, "Illegal terminal: expected %s.",
> > +                             _PyParser_TokenNames[next_type]);
> 
> Coverity doesn't like that line:
> 
> CID 1362505 (#1 of 1): Out-of-bounds read (OVERRUN)
> 20. overrun-local: Overrunning array _PyParser_TokenNames of 58 8-byte
> elements at element index 255 (byte offset 2040) using index next_type
> (which evaluates to 255).

I don't think this can cause a problem because it doesn't ever come from
user-provided input.
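
For anyone skimming the quoted patch, the core idea (walk each node's children through a per-nonterminal DFA and recurse into nonterminal children) can be sketched in a few lines of Python. This is only a toy model: the GRAMMAR table, its node names, and the tuple-based tree are invented for illustration and are not CPython's _PyParser_Grammar structures.

```python
# Toy grammar: nonterminal -> (initial state, per-state arcs, accepting states).
# Each arcs dict maps an acceptable child type to the next state.
GRAMMAR = {
    "sum": (0, [{"NUMBER": 1, "paren": 1}, {"PLUS": 0}], {1}),
    "paren": (0, [{"LPAR": 1}, {"sum": 2}, {"RPAR": 3}, {}], {3}),
}


def validate_node(node):
    """Validate a (type, children) node against the toy grammar."""
    node_type, children = node
    if node_type not in GRAMMAR:
        raise ValueError("Unrecognized node type %r." % (node_type,))
    state, transitions, accepting = GRAMMAR[node_type]

    # Run the DFA for this nonterminal over its children.
    for child in children:
        child_type = child[0]
        arcs = transitions[state]
        if child_type not in arcs:
            if not arcs:  # this state would not accept any more children
                raise ValueError("Illegal number of children for %r." % (node_type,))
            raise ValueError("Expected one of %s, got %r." % (sorted(arcs), child_type))
        # The child is acceptable; if it is a nonterminal, validate it recursively.
        if child_type in GRAMMAR:
            validate_node(child)
        state = arcs[child_type]

    if state not in accepting:
        raise ValueError("Illegal number of children for %r." % (node_type,))
    return True


good = ("sum", [("NUMBER", "1"), ("PLUS", "+"),
                ("paren", [("LPAR", "("), ("sum", [("NUMBER", "2")]), ("RPAR", ")")])])
bad = ("sum", [("NUMBER", "1"), ("NUMBER", "2")])
```

Here validate_node(good) succeeds, while validate_node(bad) raises ValueError because the state reached after the first NUMBER only has an arc for PLUS.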

From sturla.molden at gmail.com  Mon Jun  6 07:23:31 2016
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 6 Jun 2016 11:23:31 +0000 (UTC)
Subject: [Python-Dev] C99
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
Message-ID: <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>

Guido van Rossum <guido at python.org> wrote:

> I'm not sure I meant that. But if I have a 3rd party extension that
> compiles with 3.5 headers using C89, then it should still compile with
> 3.6 headers using C99. Also if I compile it for 3.5 and it only uses
> the ABI it should still be linkable with 3.6.

Ok, but if third-party developers shall be free to use a C89 compiler for
their own code, we cannot have C99 in the include files. Otherwise the
include files will taint the C89 purity of their source code.

Personally I don't think we need to worry about compilers that don't
implement C99 features like inline functions in C. How long has the Linux
kernel used inline functions instead of macros? 20 years or more?

Sturla

From random832 at fastmail.com  Mon Jun  6 09:25:24 2016
From: random832 at fastmail.com (Random832)
Date: Mon, 06 Jun 2016 09:25:24 -0400
Subject: [Python-Dev] C99
In-Reply-To: <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
 <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
Message-ID: <1465219524.1062244.629232969.0CBE8C17@webmail.messagingengine.com>

On Mon, Jun 6, 2016, at 07:23, Sturla Molden wrote:
> Ok, but if third-party developers shall be free to use a C89 compiler for
> their own code, we cannot have C99 in the include files. Otherwise the
> include files will taint the C89 purity of their source code.
> 
> Personally I don't think we need to worry about compilers that don't
> implement C99 features like inline functions in C. How long has the
> Linux kernel used inline functions instead of macros? 20 years or more?

Using inline functions instead of macros doesn't have to mean anything
but a performance hit on platforms that don't support them, since the
inline keyword, or some other identifier, could be defined to expand to
an empty token sequence on platforms that do not support it. It's much
lower impact on the source code than some other C99 features.

From guido at python.org  Mon Jun  6 10:11:39 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 6 Jun 2016 07:11:39 -0700
Subject: [Python-Dev] C99
In-Reply-To: <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
 <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
Message-ID: <CAP7+vJLrk79JzzYe5gVMXuzU2kYs0e_R3P70=YVPWnZV-e9nDw@mail.gmail.com>

On Mon, Jun 6, 2016 at 4:23 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
> Guido van Rossum <guido at python.org> wrote:
>
>> I'm not sure I meant that. But if I have a 3rd party extension that
>> compiles with 3.5 headers using C89, then it should still compile with
>> 3.6 headers using C99. Also if I compile it for 3.5 and it only uses
>> the ABI it should still be linkable with 3.6.
>
> Ok, but if third-party developers shall be free to use a C89 compiler for
> their own code, we cannot have C99 in the include files. Otherwise the
> include files will taint the C89 purity of their source code.

Well, they should use the right compiler for the Python version they
are targeting. I'm just saying that they can't afford C99 features in
their own code. Not even to call C/Python APIs. I think it would be
okay if e.g. Py_INCREF was an inline function in Python 3.6, as long
as the way you use it remains the same.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Jun  6 10:12:10 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 6 Jun 2016 07:12:10 -0700
Subject: [Python-Dev] C99
In-Reply-To: <1465219524.1062244.629232969.0CBE8C17@webmail.messagingengine.com>
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
 <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
 <1465219524.1062244.629232969.0CBE8C17@webmail.messagingengine.com>
Message-ID: <CAP7+vJKGGq2qPbjb-XSmRaNqUorDGt9_C8MHO+Dy-rJU4oEeyg@mail.gmail.com>

On Mon, Jun 6, 2016 at 6:25 AM, Random832 <random832 at fastmail.com> wrote:
> On Mon, Jun 6, 2016, at 07:23, Sturla Molden wrote:
>> Ok, but if third-party developers shall be free to use a C89 compiler for
>> their own code, we cannot have C99 in the include files. Otherwise the
>> include files will taint the C89 purity of their source code.
>>
>> Personally I don't think we need to worry about compilers that don't
>> implement C99 features like inline functions in C. How long has the
>> Linux kernel used inline functions instead of macros? 20 years or more?
>
> Using inline functions instead of macros doesn't have to mean anything
> but a performance hit on platforms that don't support them, since the
> inline keyword, or some other identifier, could be defined to expand to
> an empty token sequence on platforms that do not support it. It's much
> lower impact on the source code than some other C99 features.

That could be a major performance impact.

-- 
--Guido van Rossum (python.org/~guido)

From eric at trueblade.com  Mon Jun  6 10:31:12 2016
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 6 Jun 2016 10:31:12 -0400
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJLrk79JzzYe5gVMXuzU2kYs0e_R3P70=YVPWnZV-e9nDw@mail.gmail.com>
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
 <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJLrk79JzzYe5gVMXuzU2kYs0e_R3P70=YVPWnZV-e9nDw@mail.gmail.com>
Message-ID: <57558930.2040700@trueblade.com>

On 06/06/2016 10:11 AM, Guido van Rossum wrote:
> On Mon, Jun 6, 2016 at 4:23 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
>> Guido van Rossum <guido at python.org> wrote:
>>
>>> I'm not sure I meant that. But if I have a 3rd party extension that
>>> compiles with 3.5 headers using C89, then it should still compile with
>>> 3.6 headers using C99. Also if I compile it for 3.5 and it only uses
>>> the ABI it should still be linkable with 3.6.
>>
>> Ok, but if third-party developers shall be free to use a C89 compiler for
>> their own code, we cannot have C99 in the include files. Otherwise the
>> include files will taint the C89 purity of their source code.
> 
> Well, they should use the right compiler for the Python version they
> are targeting. I'm just saying that they can't afford C99 features in
> their own code. Not even to call C/Python APIs. I think it would be
> okay if e.g. Py_INCREF was an inline function in Python 3.6, as long
> as the way you use it remains the same.

Right. So we could use C99 features in 3.6 .h files, as long as the same
extension module, unmodified, could be compiled with 3.5 .h files with a
3.5 approved (C89) compiler, and also with a 3.6 approved (C99) compiler.

The headers would be different, but so would the compilers. It's the
extension module source code that must be the same in the two scenarios.

We're not saying that an extension module must compile with a C89
compiler under 3.6.

Eric.

From guido at python.org  Mon Jun  6 10:39:55 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 6 Jun 2016 07:39:55 -0700
Subject: [Python-Dev] C99
In-Reply-To: <57558930.2040700@trueblade.com>
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
 <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJLrk79JzzYe5gVMXuzU2kYs0e_R3P70=YVPWnZV-e9nDw@mail.gmail.com>
 <57558930.2040700@trueblade.com>
Message-ID: <CAP7+vJJXOV41YBA9P_U-MXi8U_f7p-7T2-4AFQs+1yOcOo2d3Q@mail.gmail.com>

Right.

On Mon, Jun 6, 2016 at 7:31 AM, Eric V. Smith <eric at trueblade.com> wrote:
> On 06/06/2016 10:11 AM, Guido van Rossum wrote:
>> On Mon, Jun 6, 2016 at 4:23 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
>>> Guido van Rossum <guido at python.org> wrote:
>>>
>>>> I'm not sure I meant that. But if I have a 3rd party extension that
>>>> compiles with 3.5 headers using C89, then it should still compile with
>>>> 3.6 headers using C99. Also if I compile it for 3.5 and it only uses
>>>> the ABI it should still be linkable with 3.6.
>>>
>>> Ok, but if third-party developers shall be free to use a C89 compiler for
>>> their own code, we cannot have C99 in the include files. Otherwise the
>>> include files will taint the C89 purity of their source code.
>>
>> Well, they should use the right compiler for the Python version they
>> are targeting. I'm just saying that they can't afford C99 features in
>> their own code. Not even to call C/Python APIs. I think it would be
>> okay if e.g. Py_INCREF was an inline function in Python 3.6, as long
>> as the way you use it remains the same.
>
> Right. So we could use C99 features in 3.6 .h files, as long as the same
> extension module, unmodified, could be compiled with 3.5 .h files with a
> 3.5 approved (C89) compiler, and also with a 3.6 approved (C99) compiler.
>
> The headers would be different, but so would the compilers. It's the
> extension module source code that must be the same in the two scenarios.
>
> We're not saying that an extension module must compile with a C89
> compiler under 3.6.
>
> Eric.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

-- 
--Guido van Rossum (python.org/~guido)

From yselivanov.ml at gmail.com  Mon Jun  6 15:23:43 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 6 Jun 2016 15:23:43 -0400
Subject: [Python-Dev] PEP 492: __aiter__ should return async iterator
 directly instead of awaitable
Message-ID: <5755CDBF.40905@gmail.com>

There is a small flaw in the PEP 492 design -- __aiter__ should not return 
an awaitable object that resolves to an asynchronous iterator. It should 
return an asynchronous iterator directly.

Let me explain this by showing some examples.

I've discovered this while working on a new asynchronous generators 
PEP.  Let's pretend that we have them already: if we have a 'yield' 
expression in an 'async def' function, the function becomes an 
"asynchronous generator function":

    async def foo():
       await bar()
       yield 1
       await baz()
       yield 2

    # foo -- is an `asynchronous generator function`
    # foo() -- is an `asynchronous generator`

If we iterate through "foo()", it will await on "bar()", yield "1", 
await on "baz()", and yield "2":

    >>> async for el in foo():
    ...     print(el)
    1
    2
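
A runnable sketch of the example above, assuming an interpreter where asynchronous generators already exist (the feature this message is only pretending to have) and where asyncio.run() is available. The bodies of bar() and baz() are placeholders invented for illustration.

```python
import asyncio


async def bar():
    await asyncio.sleep(0)  # placeholder for real asynchronous work


async def baz():
    await asyncio.sleep(0)  # placeholder for real asynchronous work


async def foo():
    # A 'yield' inside 'async def' makes this an asynchronous generator function.
    await bar()
    yield 1
    await baz()
    yield 2


async def collect():
    items = []
    async for el in foo():  # foo() is an asynchronous generator
        items.append(el)
    return items


result = asyncio.run(collect())
print(result)
```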

If we decide to have a class with an __aiter__ that is an async 
generator, we'd write something like this:

    class Foo:
       async def __aiter__(self):
           await bar()
           yield 1
           await baz()
           yield 2

However, with the current PEP 492 design, the above code would be 
invalid!  The interpreter expects __aiter__ to return a coroutine, not 
an async generator.

I'm still working on the PEP for async generators, targeting CPython 
3.6.  And once it is ready, it might still be rejected or deferred.  But 
in any case, this PEP 492 flaw has to be fixed now, in 3.5.2 (since PEP 
492 is provisional).

I've created an issue on the bug tracker: http://bugs.python.org/issue27243

The proposed patch fixes __aiter__ in a backwards-compatible way:

1. ceval/GET_AITER opcode calls the __aiter__ method.

2. If the returned object has an '__anext__' method, GET_AITER silently 
wraps it in an awaitable, which is equivalent to the following coroutine:

     async def wrapper(aiter_result):
         return aiter_result

3. If the returned object does not have an '__anext__' method, a 
DeprecationWarning is raised.
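
The three steps above can be modelled roughly in pure Python. This is a sketch of the described behaviour, not the actual ceval code: get_aiter(), NewStyle and OldStyle are hypothetical names, and the warning text is made up.

```python
import asyncio
import warnings


async def wrapper(aiter_result):
    return aiter_result


def get_aiter(obj):
    """Rough model of the patched GET_AITER; always returns an awaitable."""
    result = type(obj).__aiter__(obj)      # step 1: call __aiter__
    if hasattr(result, "__anext__"):       # step 2: already an async iterator
        return wrapper(result)
    warnings.warn(                         # step 3: legacy awaitable
        "__aiter__ returned an awaitable instead of an asynchronous iterator",
        DeprecationWarning,
        stacklevel=2,
    )
    return result


class NewStyle:
    def __aiter__(self):                   # returns the iterator directly
        return self

    async def __anext__(self):
        raise StopAsyncIteration


class OldStyle:
    async def __aiter__(self):             # 3.5.0/3.5.1 style: a coroutine
        return self

    async def __anext__(self):
        raise StopAsyncIteration


new_obj = NewStyle()
new_result = asyncio.run(get_aiter(new_obj))

old_obj = OldStyle()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = asyncio.run(get_aiter(old_obj))
```

In both cases awaiting the result of get_aiter() yields the iterator itself; only the old-style class triggers the DeprecationWarning.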

From lukasz at langa.pl  Mon Jun  6 16:02:11 2016
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Mon, 6 Jun 2016 13:02:11 -0700
Subject: [Python-Dev] PEP 492: __aiter__ should return async iterator
 directly instead of awaitable
In-Reply-To: <5755CDBF.40905@gmail.com>
References: <5755CDBF.40905@gmail.com>
Message-ID: <3E4C21A8-4AB0-4ABC-A090-2BE8D5CFBD95@langa.pl>

On Jun 6, 2016, at 12:23 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> 
> However, with the current PEP 492 design, the above code would be invalid!  The interpreter expects __aiter__ to return a coroutine, not an async generator.
> 

Yes, I remember asking about the reason behind __aiter__ being an awaitable during the original PEP 492 review process. You added an explanation to the PEP but I don't think we ever had an example where this was needed. I'm +1 to resolve this now.

> The proposed patch fixes the __aiter__ in a backwards compatible way:
> 
> 1. ceval/GET_AITER opcode calls the __aiter__ method.
> 
> 2. If the returned object has an '__anext__' method, GET_AITER silently wraps it in an awaitable, which is equivalent to the following coroutine:
> 
>    async def wrapper(aiter_result):
>        return aiter_result
> 
> 3. If the returned object does not have an '__anext__' method, a DeprecationWarning is raised.

There's a problem with this approach. It will force people to write deprecated code because you never know if your library is going to run on 3.5.0 or 3.5.1. Barry, Ubuntu wily, xenial and yakkety currently package 3.5.0 or 3.5.1. When 3.5.2 is going to get released, are they going to get it? I'm pretty sure wily isn't and yakkety is but just wanted to confirm; especially with xenial being an LTS release.

--
Not-that-i-see-a-different-way-out-sly yours,
Ł

From yselivanov.ml at gmail.com  Mon Jun  6 16:05:53 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 6 Jun 2016 16:05:53 -0400
Subject: [Python-Dev] PEP 492: __aiter__ should return async iterator
 directly instead of awaitable
In-Reply-To: <3E4C21A8-4AB0-4ABC-A090-2BE8D5CFBD95@langa.pl>
References: <5755CDBF.40905@gmail.com>
 <3E4C21A8-4AB0-4ABC-A090-2BE8D5CFBD95@langa.pl>
Message-ID: <5755D7A1.40402@gmail.com>

On 2016-06-06 4:02 PM, Łukasz Langa wrote:
>> The proposed patch fixes the __aiter__ in a backwards compatible way:
>>
>> 1. ceval/GET_AITER opcode calls the __aiter__ method.
>>
>> 2. If the returned object has an '__anext__' method, GET_AITER 
>> silently wraps it in an awaitable, which is equivalent to the 
>> following coroutine:
>>
>>    async def wrapper(aiter_result):
>>        return aiter_result
>>
>> 3. If the returned object does not have an '__anext__' method, a 
>> DeprecationWarning is raised.
>
> There's a problem with this approach. It will force people to write 
> deprecated code because you never know if your library is going to run 
> on 3.5.0 or 3.5.1. Barry, Ubuntu wily, xenial and yakkety currently 
> package 3.5.0 or 3.5.1. When 3.5.2 is going to get released, are they 
> going to get it? I'm pretty sure wily *isn't* and yakkety *is* but 
> just wanted to confirm; especially with xenial being an LTS release.
>

Yes, I agree. OTOH, I don't see any other way of resolving this.

Another option would be to start raising the DeprecationWarning only in 3.6.

Yury

From guido at python.org  Mon Jun  6 16:21:25 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 6 Jun 2016 13:21:25 -0700
Subject: [Python-Dev] PEP 492: __aiter__ should return async iterator
 directly instead of awaitable
In-Reply-To: <5755D7A1.40402@gmail.com>
References: <5755CDBF.40905@gmail.com>
 <3E4C21A8-4AB0-4ABC-A090-2BE8D5CFBD95@langa.pl>
 <5755D7A1.40402@gmail.com>
Message-ID: <CAP7+vJJLD+ROUAmLs9=LJi2itpR5hrmojzTGh-LoO7MJzK37WQ@mail.gmail.com>

The RC for 3.5.2 is going out this coming weekend (see PEP 478
<https://www.python.org/dev/peps/pep-0478/>). We should get this out now,
or make it the first incompatibility in 3.6 (that's also an option; the 3.6
feature freeze starts in September, see PEP 494
<https://www.python.org/dev/peps/pep-0494/>).

On Mon, Jun 6, 2016 at 1:05 PM, Yury Selivanov <yselivanov.ml at gmail.com>
wrote:

>
>
> On 2016-06-06 4:02 PM, Łukasz Langa wrote:
>
>> The proposed patch fixes the __aiter__ in a backwards compatible way:
>>>
>>> 1. ceval/GET_AITER opcode calls the __aiter__ method.
>>>
>>> 2. If the returned object has an '__anext__' method, GET_AITER silently
>>> wraps it in an awaitable, which is equivalent to the following coroutine:
>>>
>>>    async def wrapper(aiter_result):
>>>        return aiter_result
>>>
>>> 3. If the returned object does not have an '__anext__' method, a
>>> DeprecationWarning is raised.
>>>
>>
>> There's a problem with this approach. It will force people to write
>> deprecated code because you never know if your library is going to run on
>> 3.5.0 or 3.5.1. Barry, Ubuntu wily, xenial and yakkety currently package
>> 3.5.0 or 3.5.1. When 3.5.2 is going to get released, are they going to get
>> it? I'm pretty sure wily *isn't* and yakkety *is* but just wanted to
>> confirm; especially with xenial being an LTS release.
>>
>>
> Yes, I agree. OTOH, I don't see any other way of resolving this.
>
> Another option would be to start raising the DeprecationWarning only in
> 3.6.
>
> Yury

-- 
--Guido van Rossum (python.org/~guido)

From barry at python.org  Mon Jun  6 16:58:54 2016
From: barry at python.org (Barry Warsaw)
Date: Mon, 6 Jun 2016 16:58:54 -0400
Subject: [Python-Dev] PEP 492: __aiter__ should return async iterator
 directly instead of awaitable
In-Reply-To: <3E4C21A8-4AB0-4ABC-A090-2BE8D5CFBD95@langa.pl>
References: <5755CDBF.40905@gmail.com>
 <3E4C21A8-4AB0-4ABC-A090-2BE8D5CFBD95@langa.pl>
Message-ID: <20160606165854.7af952c3@subdivisions.wooz.org>

On Jun 06, 2016, at 01:02 PM, Łukasz Langa wrote:

>There's a problem with this approach. It will force people to write
>deprecated code because you never know if your library is going to run on
>3.5.0 or 3.5.1. Barry, Ubuntu wily, xenial and yakkety currently package
>3.5.0 or 3.5.1. When 3.5.2 is going to get released, are they going to get
>it? I'm pretty sure wily isn't and yakkety is but just wanted to confirm;
>especially with xenial being an LTS release.

Matthias and I talked briefly about this at Pycon.  We want to get 3.5.2 into
Ubuntu 16.04.1 if it's released in time.  16.04.1 is currently scheduled for
July 21st [1] so if Larry keeps with his announced schedule that should work
out[2].

Obviously it would make it into Yakkety too.  It's not worth it for Wily
(15.10) since that EOLs next month.

Cheers,
-Barry

[1] https://wiki.ubuntu.com/XenialXerus/ReleaseSchedule
[2] https://mail.python.org/pipermail/python-dev/2016-April/144383.html

From tjreedy at udel.edu  Mon Jun  6 22:33:31 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 6 Jun 2016 22:33:31 -0400
Subject: [Python-Dev] C99
In-Reply-To: <57558930.2040700@trueblade.com>
References: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <1b9e1e41-b401-c27d-894e-913ecc485d8d@python.org>
 <CAP7+vJKM1DZdWrcyi_w8ZrQvSR9OqxBKGy_1+HoZHQ6T-Drdbw@mail.gmail.com>
 <0764173d-e69b-4f75-8797-30828ca6b471@python.org>
 <CAP7+vJ+bm1qgc=3uD=dGqOW1TQdaMRH5aE=C4PFaWRyX3HBJhA@mail.gmail.com>
 <1031128263486872749.403655sturla.molden-gmail.com@news.gmane.org>
 <0a9001d1bf9c$1710b7b0$45322710$@hotmail.com>
 <287791199486873532.937937sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJVrxggLoyZ5Woc03kLKZAbD=OnTpavrFDRDJct_r3CBw@mail.gmail.com>
 <70851808486904675.240857sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJLrk79JzzYe5gVMXuzU2kYs0e_R3P70=YVPWnZV-e9nDw@mail.gmail.com>
 <57558930.2040700@trueblade.com>
Message-ID: <nj5bq3$3m7$1@ger.gmane.org>

On 6/6/2016 10:31 AM, Eric V. Smith wrote:

> Right. So we could use C99 features in 3.6 .h files, as long as the same
> extension module, unmodified, could be compiled with 3.5 .h files with a
> 3.5 approved (C89) compiler, and also with a 3.6 approved (C99) compiler.

> The headers would be different, but so would the compilers.

On Windows, the compiler would be the 2015 MS compiler in both cases. 
Steve Dower would know if compiler flags need to be changed to enable or 
stop disabling C99 features.

 > It's the
> extension module source code that must be the same in the two scenarios.

We could run the experiment ourselves by changing one or more .h files 
to include one or more of the C99 features we want while leaving our .c 
files alone in the sense of remaining C89 compatible.  Compile and run 
the test suite.  If successful, add more.  We would soon find out 
whether any of the features we want in header files require use of C99 
features in .c files that include them.  With a .h standard established, 
we could then revise *our* .c files without imposing the same on 
extensions.

-- 
Terry Jan Reedy

From victor.stinner at gmail.com  Tue Jun  7 06:24:26 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 7 Jun 2016 12:24:26 +0200
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
Message-ID: <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>

Hi,

2016-06-04 19:47 GMT+02:00 Guido van Rossum <guido at python.org>:
> Funny. Just two weeks ago I was helping someone who discovered a
> compiler that doesn't support the new relaxed variable declaration
> rules. I think it was on Windows. Maybe this move is a little too
> aggressively deprecating older Windows compilers?

I understood that Python only has a tiny list of officially supported
compilers. For example, MinGW is somehow explicitly not supported and
I see this as a deliberate choice.

I'm quite sure that all supported compilers support C99.

Is it worth supporting a compiler that in 2016 doesn't support the C
standard released in 1999, 17 years ago?

Victor

From guido at python.org  Tue Jun  7 11:18:39 2016
From: guido at python.org (Guido van Rossum)
Date: Tue, 7 Jun 2016 08:18:39 -0700
Subject: [Python-Dev] C99
In-Reply-To: <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
Message-ID: <CAP7+vJJ6Y4fQkNan5DkwgAE7ddmkBL1ChVUCB=3jckjFk-ieOA@mail.gmail.com>

I'll ask my colleague what his compiler setup was.

On Tue, Jun 7, 2016 at 3:24 AM, Victor Stinner <victor.stinner at gmail.com>
wrote:

> Hi,
>
> 2016-06-04 19:47 GMT+02:00 Guido van Rossum <guido at python.org>:
> > Funny. Just two weeks ago I was helping someone who discovered a
> > compiler that doesn't support the new relaxed variable declaration
> > rules. I think it was on Windows. Maybe this move is a little too
> > aggressively deprecating older Windows compilers?
>
> I understood that Python only has a tiny list of officially supported
> compilers. For example, MinGW is somehow explicitly not supported and
> I see this as a deliberate choice.
>
> I'm quite sure that all supported compilers support C99.
>
> Is it worth supporting a compiler that in 2016 doesn't support the C
> standard released in 1999, 17 years ago?
>
> Victor
>

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Tue Jun  7 11:21:45 2016
From: guido at python.org (Guido van Rossum)
Date: Tue, 7 Jun 2016 08:21:45 -0700
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJJ6Y4fQkNan5DkwgAE7ddmkBL1ChVUCB=3jckjFk-ieOA@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
 <CAP7+vJJ6Y4fQkNan5DkwgAE7ddmkBL1ChVUCB=3jckjFk-ieOA@mail.gmail.com>
Message-ID: <CAP7+vJLWLdxfLjxUakLKbDpP1nEbMsRXQS75U8jdkBsOJfOjmA@mail.gmail.com>

So here are the diffs that seem to indicate we were working with a compiler
that wasn't fully C99 (or maybe previously we were working with a compiler
that had extensions?)

https://github.com/dropbox/typed_ast/commit/f7497e25abc3bcceced3ca6c3be3786d8805df41

On Tue, Jun 7, 2016 at 8:18 AM, Guido van Rossum <guido at python.org> wrote:

> I'll ask my colleague what his compiler setup was.
>
> On Tue, Jun 7, 2016 at 3:24 AM, Victor Stinner <victor.stinner at gmail.com>
> wrote:
>
>> Hi,
>>
>> 2016-06-04 19:47 GMT+02:00 Guido van Rossum <guido at python.org>:
>> > Funny. Just two weeks ago I was helping someone who discovered a
>> > compiler that doesn't support the new relaxed variable declaration
>> > rules. I think it was on Windows. Maybe this move is a little too
>> > aggressively deprecating older Windows compilers?
>>
>> I understood that Python only has a tiny list of officially supported
>> compilers. For example, MinGW is somehow explicitly not supported and
>> I see this as a deliberate choice.
>>
>> I'm quite sure that all supported compilers support C99.
>>
>> Is it worth supporting a compiler that in 2016 doesn't support the C
>> standard released in 1999, 17 years ago?
>>
>> Victor
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>

-- 
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us  Tue Jun  7 13:37:58 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 10:37:58 -0700
Subject: [Python-Dev] Proper way to specify that a method is not defined for
 a type
Message-ID: <57570676.4070700@stoneleaf.us>

For binary methods, such as __add__, either do not implement or return 
NotImplemented if the other operand/class is not supported.

For non-binary methods, simply do not define.

Except for subclasses when the super-class defines __hash__ and the 
subclass is not hashable -- then set __hash__ to None.
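
A minimal sketch of the conventions above, for illustration only (the
class names Meters, Point and MutablePoint and the helper
raises_type_error are made up for this example):

```python
class Meters:
    """Toy value type that supports addition only with other Meters."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Binary method: return NotImplemented for unsupported operands so
        # Python can try the reflected method and then raise TypeError.
        if not isinstance(other, Meters):
            return NotImplemented
        return Meters(self.value + other.value)

    # Non-binary methods such as __len__ are simply not defined here.


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


class MutablePoint(Point):
    # The superclass is hashable but the subclass is not: set __hash__ to None.
    __hash__ = None


def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False
```

With these definitions, Meters(1) + 1, len(Meters(1)) and
hash(MutablePoint(1, 2)) should each fail with TypeError, while
hash(Point(1, 2)) keeps working.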

Question:

Are there any other methods that should be set to None to tell the 
run-time that the method is not supported?  Or is this a general 
mechanism for subclasses to declare any method is unsupported?

--
~Ethan~

From ericsnowcurrently at gmail.com  Tue Jun  7 13:51:52 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 10:51:52 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
Message-ID: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>

Hi all,

Following discussion a few years back (and rough approval from Guido
[1]), I started work on using OrderedDict for the class definition
namespace by default.  The bulk of the effort lay in implementing
OrderedDict in C, which I got landed just in time for 3.5.  The
remaining work was quite minimal and the actual change is quite small.

My intention was to land the patch soon, having gone through code
review during PyCon.  However, Nick pointed out to me the benefit of
having a concrete point of reference for the change, as well as making
sure it isn't a problem for other implementations.  So in that spirit,
here's a PEP for the change.  Feedback is welcome, particularly from
other implementors.

-eric

[1] https://mail.python.org/pipermail/python-ideas/2013-February/019704.html

==================================================

PEP: XXX
Title: Ordered Class Definition Namespace
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-Jun-2016
Python-Version: 3.6
Post-History: 7-Jun-2016

Abstract
========

This PEP changes the default class definition namespace to ``OrderedDict``.
Furthermore, the order in which the attributes are defined in each class
body will now be preserved in ``type.__definition_order__``.  This allows
introspection of the original definition order, e.g. by class decorators.

Note: just to be clear, this PEP is *not* about changing ``type.__dict__``
to ``OrderedDict``.

Motivation
==========

Currently the namespace used during execution of a class body defaults
to dict.  If the metaclass defines ``__prepare__()`` then the result of
calling it is used.  Thus, before this PEP, if you needed your class
definition namespace to be ``OrderedDict`` you had to use a metaclass.

Metaclasses introduce an extra level of complexity to code and in some
cases (e.g. conflicts) are a problem.  So reducing the need for them is
worth doing when the opportunity presents itself.  Given that we now have
a C implementation of ``OrderedDict`` and that ``OrderedDict`` is the
common use case for ``__prepare__()``, we have such an opportunity by
defaulting to ``OrderedDict``.

The usefulness of ``OrderedDict``-by-default is greatly increased if the
definition order is directly introspectable on classes afterward,
particularly by code that is independent of the original class definition.
One of the original motivating use cases for this PEP is generic class
decorators that make use of the definition order.
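
As a rough illustration of the decorator use case, here is a minimal
sketch.  It does not rely on the proposed ``__definition_order__``
attribute; it assumes an interpreter in which iterating over the class
``__dict__`` already yields names in definition order (true of CPython 3.6
and later).  The names ``definition_order`` and ``fields_in_order`` are
hypothetical helpers, not part of this proposal.

```python
def definition_order(cls):
    """Return the non-dunder names from the class body, in definition order.

    Assumption: vars(cls) iterates in definition order (CPython 3.6+).
    """
    return tuple(name for name in vars(cls)
                 if not (name.startswith('__') and name.endswith('__')))


def fields_in_order(cls):
    """Hypothetical generic class decorator that records the ordered names."""
    # Compute the order first, so the new attribute is not part of it.
    cls._fields = definition_order(cls)
    return cls


@fields_in_order
class Spam:
    ham = None
    eggs = 5

    def cook(self):
        return 'cooked'
```

The decorator is independent of the class it decorates: it needs no
metaclass and no cooperation from ``Spam`` itself.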

Changing the default class definition namespace has been discussed a
number of times, including on the mailing lists and in PEP 422 and
PEP 487 (see the References section below).

Specification
=============

* the default class *definition* namespace is now ``OrderedDict``
* the order in which class attributes are defined is preserved in the
  new ``__definition_order__`` attribute on each class
* "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored
* ``__definition_order__`` is a tuple
* ``__definition_order__`` is a read-only attribute
* ``__definition_order__`` is always set:

  * if ``__definition_order__`` is defined in the class body then it
    is used
  * types that do not have a class definition (e.g. builtins) have
    their ``__definition_order__`` set to ``None``
  * types for which ``__prepare__()`` returned something other than
    ``OrderedDict`` (or a subclass) have their ``__definition_order__``
    set to ``None``

The following code demonstrates roughly equivalent semantics::

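   from collections import OrderedDict

   class Meta(type):
       @classmethod
       def __prepare__(cls, *args, **kwargs):
           # Use an ordered mapping as the class definition namespace.
           return OrderedDict()

   class Spam(metaclass=Meta):
       ham = None
       eggs = 5
       # Record every non-dunder name defined so far, in definition order.
       __definition_order__ = tuple(k for k in locals()
                                    if not (k.startswith('__') and
                                            k.endswith('__')))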

Note that [pep487_] proposes a similar solution, albeit as part of a
broader proposal.

Compatibility
=============

This PEP does not break backward compatibility, except in the case that
someone relies *strictly* on dicts as the class definition namespace.  This
shouldn't be a problem.

Changes
=======

In addition to the class syntax, the following expose the new behavior:

* builtins.__build_class__
* types.prepare_class
* types.new_class
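
For reference, a minimal sketch of how the namespace is obtained through
the ``types`` helpers listed above.  It uses an explicit metaclass, which
is what is needed without this PEP; ``OrderedMeta`` and ``body`` are
illustrative names only.

```python
import types
from collections import OrderedDict


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # Explicitly request an ordered namespace for the class body.
        return OrderedDict()


# prepare_class() returns the metaclass, the namespace produced by
# __prepare__(), and the remaining keyword arguments.
meta, namespace, kwds = types.prepare_class(
    'Spam', (), {'metaclass': OrderedMeta})


def body(ns):
    # Stand-in for executing a class body: populate the namespace in order.
    ns['ham'] = None
    ns['eggs'] = 5


# new_class() prepares the namespace, runs exec_body on it, and then
# calls the metaclass to create the class.
Spam = types.new_class('Spam', (), {'metaclass': OrderedMeta}, exec_body=body)
```

Under the proposal, the same calls without the ``metaclass`` keyword would
be expected to produce an ordered namespace as well.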

Other Python Implementations
============================

Pending feedback, the impact on Python implementations is expected to
be minimal.  If a Python implementation cannot support switching to
``OrderedDict``-by-default then it can always set ``__definition_order__``
to ``None``.

Implementation
==============

The implementation is found in the tracker. [impl_]

Alternatives
============

type.__dict__ as OrderedDict
----------------------------

Instead of storing the definition order in ``__definition_order__``,
the now-ordered definition namespace could be copied into a new
``OrderedDict``.  This would mostly provide the same semantics.

However, using ``OrderedDict`` for ``type.__dict__`` would obscure the
relationship with the definition namespace, making it less useful.
Additionally, doing this would require significant changes to the
semantics of the concrete dict C-API.

A "namespace" Keyword Arg for Class Definition
----------------------------------------------

PEP 422 introduced a new "namespace" keyword arg to class definitions
that effectively replaces the need for ``__prepare__()``. [pep422_]
However, the proposal was withdrawn in favor of the simpler PEP 487.

References
==========

.. [impl] issue #24254
   (https://bugs.python.org/issue24254)

.. [pep422] PEP 422
   (https://www.python.org/dev/peps/pep-0422/#order-preserving-classes)

.. [pep487] PEP 487
   (https://www.python.org/dev/peps/pep-0487/#defining-arbitrary-namespaces)

.. [orig] original discussion
   (https://mail.python.org/pipermail/python-ideas/2013-February/019690.html)

.. [followup1] follow-up 1
   (https://mail.python.org/pipermail/python-dev/2013-June/127103.html)

.. [followup2] follow-up 2
   (https://mail.python.org/pipermail/python-dev/2015-May/140137.html)

Copyright
=========
This document has been placed in the public domain.

From vgr255 at live.ca  Tue Jun  7 13:55:01 2016
From: vgr255 at live.ca (Émanuel Barry)
Date: Tue, 7 Jun 2016 13:55:01 -0400
Subject: [Python-Dev] Proper way to specify that a method is not defined
 for a type
In-Reply-To: <57570676.4070700@stoneleaf.us>
References: <57570676.4070700@stoneleaf.us>
Message-ID: <BLU403-EAS2801749FF1B9CB58841E96C915D0@phx.gbl>

> From: Ethan Furman
> Sent: Tuesday, June 07, 2016 1:38 PM
> To: Python Dev
> Subject: [Python-Dev] Proper way to specify that a method is not defined for
> a type

(Just so everyone follows, this is a followup of
http://bugs.python.org/issue27242 )

> For binary methods, such as __add__, either do not implement or return
> NotImplemented if the other operand/class is not supported.
> 
> For non-binary methods, simply do not define.
> 
> Except for subclasses when the super-class defines __hash__ and the
> subclass is not hashable -- then set __hash__ to None.

Should I mention the __hash__ special case in the
NotImplemented/NotImplementedError docs? If people are looking for a way to
declare this specific operation undefined, they'd find it there as well as
the hash() documentation.

> Question:
> 
> Are there any other methods that should be set to None to tell the
> run-time that the method is not supported?  Or is this a general
> mechanism for subclasses to declare any method is unsupported?

There was a discussion on one of Python-ideas or Python-dev some time ago
about exactly that, but I don't think any consensus was reached. However, I
think it would make sense for e.g. __iter__ and __reversed__ to tell the
interpreter to *not* fall back to the default sequence protocol (well, in
practice that already works, but it gives an unhelpful error message). I'm
not sure how useful it would be for arbitrary methods, though. __bytes__
(which originally sparked the issue) may or may not be a good candidate, I'm
not sure.

While I like the `__magic_method__ = None` approach, I think the main reason
__hash__ supports that is because there are legitimate use cases of
disallowing hashing (i.e. mutable objects which may or may not change hash
during their lifetime), but I don't think the same rationale applies to
everything ("accidentally" iterating over a not-meant-to-be-iterable object
will result in nonsensical data, but it won't bite the user later, unlike
changing hashes which definitely will).
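
A small sketch of the __iter__ case, for illustration (Row, NoIterRow and
is_iterable are made-up names; the exact error message depends on the
interpreter version, so only the exception type is checked):

```python
class Row:
    """Indexable by integer; iterable only via the __getitem__ fallback."""

    def __init__(self, *values):
        self._values = values

    def __getitem__(self, index):
        return self._values[index]


class NoIterRow(Row):
    # Opt out of the fallback to the __getitem__-based sequence protocol.
    __iter__ = None


def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True
```

Here is_iterable(Row(1, 2, 3)) is expected to be True, because iter()
falls back to __getitem__, while is_iterable(NoIterRow(1, 2, 3)) is
expected to be False.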

> --
> ~Ethan~

Special-cases-aren't-special-enough-but-they're-still-there'ly yrs,
-Emanuel

From guido at python.org  Tue Jun  7 13:56:37 2016
From: guido at python.org (Guido van Rossum)
Date: Tue, 7 Jun 2016 10:56:37 -0700
Subject: [Python-Dev] Proper way to specify that a method is not defined
 for a type
In-Reply-To: <57570676.4070700@stoneleaf.us>
References: <57570676.4070700@stoneleaf.us>
Message-ID: <CAP7+vJ+xQe0BTaFUGrPMPG9txDNfdP29tfx1kKJ__nXwnW4B8g@mail.gmail.com>

Setting it to None in the subclass is the intended pattern. But CPython
must explicitly handle that somewhere, so I don't know how generally it is
supported. Try defining a list subclass with __len__ set to None and see
what happens. Then try the same with MutableSequence.
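
The list half of that experiment can be sketched as follows (Unsized and
call_fails_with_type_error are illustrative names; the sketch only checks
that len() raises TypeError and makes no claim about the message, and the
MutableSequence half is not covered):

```python
class Unsized(list):
    # Experiment: mark the inherited __len__ as unsupported.
    __len__ = None


def call_fails_with_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


items = Unsized([1, 2, 3])
len_fails = call_fails_with_type_error(len, items)
```

Indexing such as items[0] should still work, since only the __len__ lookup
is affected.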

On Tue, Jun 7, 2016 at 10:37 AM, Ethan Furman <ethan at stoneleaf.us> wrote:

> For binary methods, such as __add__, either do not implement or return
> NotImplemented if the other operand/class is not supported.
>
> For non-binary methods, simply do not define.
>
> Except for subclasses when the super-class defines __hash__ and the
> subclass is not hashable -- then set __hash__ to None.
>
> Question:
>
> Are there any other methods that should be set to None to tell the
> run-time that the method is not supported?  Or is this a general mechanism
> for subclasses to declare any method is unsupported?
>
> --
> ~Ethan~
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>

-- 
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us  Tue Jun  7 14:01:43 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 11:01:43 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
Message-ID: <57570C07.9000703@stoneleaf.us>

On 06/07/2016 10:51 AM, Eric Snow wrote:

> My intention was to land the patch soon, having gone through code
> review during PyCon.  However, Nick pointed out to me the benefit of
> having a concrete point of reference for the change, as well as making
> sure it isn't a problem for other implementations.  So in that spirit,
> here's a PEP for the change.  Feedback is welcome, particularly from
> other implementors.

+1

> Specification
> =============

>    * types for which ``__prepare__()`` returned something other than
>      ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>      set to ``None``

I assume this check happens in type.__new__?  If a non-OrderedDict is 
used as the namespace, but a __definition_order__ key and value are 
supplied, is it used or still set to None?

--
~Ethan~

From ericsnowcurrently at gmail.com  Tue Jun  7 14:13:45 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 11:13:45 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <57570C07.9000703@stoneleaf.us>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <57570C07.9000703@stoneleaf.us>
Message-ID: <CALFfu7B1=4etOsiMoDJ8siyDcCurcpnMT4TuFQRdfjOdW+y_rg@mail.gmail.com>

On Tue, Jun 7, 2016 at 11:01 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 06/07/2016 10:51 AM, Eric Snow wrote:
>> Specification
>> =============
>
>
>>    * types for which ``__prepare__()`` returned something other than
>>      ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>>      set to ``None``
>
>
> I assume this check happens in type.__new__?  If a non-OrderedDict is used
> as the namespace, but a __definition_order__ key and value are supplied, is
> it used or still set to None?

A __definition_order__ in the class body always takes precedence.  So
a supplied value will be honored (and not replaced with None).

-eric

From tjreedy at udel.edu  Tue Jun  7 14:32:14 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 7 Jun 2016 14:32:14 -0400
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
Message-ID: <nj73vn$mgm$1@ger.gmane.org>

On 6/7/2016 1:51 PM, Eric Snow wrote:

> Note: just to be clear, this PEP is *not* about changing
> ``type.__dict__`` to ``OrderedDict``.

By 'type', do you mean the one and only object named 'type' or the class 
being defined?  To be really clear, will the following change?

 >>> class C: pass

 >>> type(C.__dict__)
<class 'mappingproxy'>

If the proposal only affects (slows) the class definition process, and 
then only minimally, and has no effect on class use, then +1 on being 
able to avoid metaclass and prepare for its most common current usage.

-- 
Terry Jan Reedy

From vgr255 at live.ca  Tue Jun  7 14:36:20 2016
From: vgr255 at live.ca (Émanuel Barry)
Date: Tue, 7 Jun 2016 14:36:20 -0400
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
Message-ID: <BLU403-EAS269736178C3E66726BBFA1A915D0@phx.gbl>

> From: Eric Snow
> Sent: Tuesday, June 07, 2016 1:52 PM
> To: Python-Dev
> Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
> 
> 
> Currently the namespace used during execution of a class body defaults
> to dict.  If the metaclass defines ``__prepare__()`` then the result of
> calling it is used.  Thus, before this PEP, if you needed your class
> definition namespace to be ``OrderedDict`` you had to use a metaclass.

Formatting nit: ``dict``

> Specification
> =============
> 
> * the default class *definition* namespace is now ``OrderedDict``
> * the order in which class attributes are defined is preserved in the
>   new ``__definition_order__`` attribute on each class
> * "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored

What does this imply? If I define some __dunder__ methods, will they simply
not be present in __definition_order__? What if I want to keep the order of
those? While keeping the order of these might be meaningless in most cases,
I don't think there's really a huge problem in doing so. Maybe I'm
overthinking it.

> * ``__definition_order__`` is a tuple
> * ``__definition_order__`` is a read-only attribute
> * ``__definition_order__`` is always set:
> 
>   * if ``__definition_order__`` is defined in the class body then it
>     is used
>   * types that do not have a class definition (e.g. builtins) have
>     their ``__definition_order__`` set to ``None``
>   * types for which ``__prepare__()`` returned something other than
>     ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>     set to ``None``

I would probably like a ``type.definition_order`` method, for which the
return value is bound to __definition_order__ when the class is created
(much like the link between ``type.mro`` and ``cls.__mro__``). Additionally
I'm not sure if setting the attribute to None is a good idea; I'd have it as
an empty tuple. Then again I tend to overthink a lot.

> The following code demonstrates roughly equivalent semantics::
> 
>    class Meta(type):
>        def __prepare__(cls, *args, **kwargs):
>            return OrderedDict()
> 
>    class Spam(metaclass=Meta):
>        ham = None
>        eggs = 5
>        __definition_order__ = tuple(k for k in locals()
>                                     if (!k.startswith('__') or
>                                         !k.endswith('__')))

Mixing up C and Python syntax here.

> However, using ``OrderedDict`` for ``type.__dict__`` would obscure the
> relationship with the definition namespace, making it less useful.
> Additionally, doing this would require significant changes to the
> semantics of the concrete dict C-API.

Formatting nit: ``dict``

I'm +1 on the whole idea (one of my common uses of metaclasses was to keep
the definition order *somewhere*). Thank you for doing that!
-Emanuel

From ericsnowcurrently at gmail.com  Tue Jun  7 14:39:06 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 11:39:06 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <nj73vn$mgm$1@ger.gmane.org>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <nj73vn$mgm$1@ger.gmane.org>
Message-ID: <CALFfu7CN-3zouXqnSN46WSe36D0O=rYJ7epm8kjiCe69og+x3g@mail.gmail.com>

On Tue, Jun 7, 2016 at 11:32 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/7/2016 1:51 PM, Eric Snow wrote:
>
>> Note: just to be clear, this PEP is *not* about changing
>> ``type.__dict__`` to ``OrderedDict``.
>
> By 'type', do you mean the one and only object named 'type' or the class
> being defined?  To be really clear, will the following change?
>
> >>> class C: pass
>
> >>> type(C.__dict__)
> <class 'mappingproxy'>

I mean the latter, "type" -> the class being defined.

>
> If the proposal only affects (slows) the class definition process, and then
> only minimally, and has no effect on class use, then +1 on being able to
> avoid metaclass and prepare for its most common current usage.

That is all correct.

-eric

From ethan at stoneleaf.us  Tue Jun  7 14:45:52 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 11:45:52 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7B1=4etOsiMoDJ8siyDcCurcpnMT4TuFQRdfjOdW+y_rg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <57570C07.9000703@stoneleaf.us>
 <CALFfu7B1=4etOsiMoDJ8siyDcCurcpnMT4TuFQRdfjOdW+y_rg@mail.gmail.com>
Message-ID: <57571660.2090709@stoneleaf.us>

On 06/07/2016 11:13 AM, Eric Snow wrote:
> On Tue, Jun 7, 2016 at 11:01 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> On 06/07/2016 10:51 AM, Eric Snow wrote:
>>> Specification
>>> =============
>>
>>
>>>     * types for which ``__prepare__()`` returned something other than
>>>       ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>>>       set to ``None``
>>
>>
>> I assume this check happens in type.__new__?  If a non-OrderedDict is used
>> as the namespace, but a __definition_order__ key and value are supplied, is
>> it used or still set to None?
>
> A __definition_order__ in the class body always takes precedence.  So
> a supplied value will be honored (and not replaced with None).

Will the supplied __definition_order__ be made a tuple, and still be 
read-only?

--
~Ethan~

From rdmurray at bitdance.com  Tue Jun  7 14:45:03 2016
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 07 Jun 2016 14:45:03 -0400
Subject: [Python-Dev] Proper way to specify that a method is not defined
 for a type
In-Reply-To: <CAP7+vJ+xQe0BTaFUGrPMPG9txDNfdP29tfx1kKJ__nXwnW4B8g@mail.gmail.com>
References: <57570676.4070700@stoneleaf.us>
 <CAP7+vJ+xQe0BTaFUGrPMPG9txDNfdP29tfx1kKJ__nXwnW4B8g@mail.gmail.com>
Message-ID: <20160607184504.82143B14027@webabinitio.net>

For those interested in this topic, if you are not already aware of it,
see also http://bugs.python.org/issue25958, which among other things
has a relevant proposed patch for datamodel.rst.

On Tue, 07 Jun 2016 10:56:37 -0700, Guido van Rossum <guido at python.org> wrote:
> Setting it to None in the subclass is the intended pattern. But CPython
> must explicitly handle that somewhere, so I don't know how generally it is
> supported. Try defining a list subclass with __len__ set to None and see
> what happens. Then try the same with MutableSequence.
> 
> On Tue, Jun 7, 2016 at 10:37 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> 
> > For binary methods, such as __add__, either do not implement or return
> > NotImplemented if the other operand/class is not supported.
> >
> > For non-binary methods, simply do not define.
> >
> > Except for subclasses when the super-class defines __hash__ and the
> > subclass is not hashable -- then set __hash__ to None.
> >
> > Question:
> >
> > Are there any other methods that should be set to None to tell the
> > run-time that the method is not supported?  Or is this a general mechanism
> > for subclasses to declare any method is unsupported?

From ericsnowcurrently at gmail.com  Tue Jun  7 14:51:34 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 11:51:34 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <BLU403-EAS269736178C3E66726BBFA1A915D0@phx.gbl>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <BLU403-EAS269736178C3E66726BBFA1A915D0@phx.gbl>
Message-ID: <CALFfu7A3P5PTArip7HBfQr1zLct5F8uEJXAj4YYUDTBQOnVndQ@mail.gmail.com>

On Tue, Jun 7, 2016 at 11:36 AM, Émanuel Barry <vgr255 at live.ca> wrote:
>> From: Eric Snow
>> * "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored
>
> What does this imply? If I define some __dunder__ methods, will they simply
> not be present in __definition_order__? What if I want to keep the order of
> those? While keeping the order of these might be meaningless in most cases,
> I don't think there's really a huge problem in doing so. Maybe I'm
> overthinking it.

"dunder" names (not just methods) will not be present in
__definition_order__.  I'll add an explanation to the PEP.  The gist
of it is that they are reserved for use by the interpreter and will
always clutter up __definition_order__.  Since needing dunder names
included in __definition_order__ would be rather exceptional, and
there are other options available, leaving them out by default is a
matter of practicality.

>
>> * ``__definition_order__`` is a tuple
>> * ``__definition_order__`` is a read-only attribute
>> * ``__definition_order__`` is always set:
>>
>>   * if ``__definition_order__`` is defined in the class body then it
>>     is used
>>   * types that do not have a class definition (e.g. builtins) have
>>     their ``__definition_order__`` set to ``None``
>>   * types for which ``__prepare__()`` returned something other than
>>     ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>>     set to ``None``
>
> I would probably like a ``type.definition_order`` method, for which the
> return value is bound to __definition_order__ when the class is created
> (much like the link between ``type.mro`` and ``cls.__mro__``).

What is the value of type.definition_order()?  If you need a mutable
copy then pass __definition_order__ to list().

> Additionally
> I'm not sure if setting the attribute to None is a good idea; I'd have it as
> an empty tuple. Then again I tend to overthink a lot.

None indicates that there is no order.  An empty tuple indicates that
there were no attributes.
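
To make the distinction concrete, here is a rough sketch of the rules as
described in this thread (compute_definition_order is a hypothetical
helper written for illustration, not the actual implementation):

```python
from collections import OrderedDict


def compute_definition_order(namespace):
    """Hypothetical helper mirroring the rules discussed above."""
    if '__definition_order__' in namespace:
        # An explicit value from the class body takes precedence.
        return namespace['__definition_order__']
    if not isinstance(namespace, OrderedDict):
        # Unordered namespace: the order is unknown, so report None.
        return None
    # Ordered namespace: keep every non-dunder name, possibly none at all.
    return tuple(name for name in namespace
                 if not (name.startswith('__') and name.endswith('__')))
```

So a plain dict namespace gives None ("no order"), while an empty
OrderedDict gives () ("ordered, but no attributes").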

>>        __definition_order__ = tuple(k for k in locals()
>>                                     if (!k.startswith('__') or
>>                                         !k.endswith('__')))
>
> Mixing up C and Python syntax here.

nice catch :)

> I'm +1 on the whole idea (one of my common uses of metaclasses was to keep
> the definition order *somewhere*). Thank you for doing that!

:)

-eric

From ericsnowcurrently at gmail.com  Tue Jun  7 14:53:53 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 11:53:53 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <57571660.2090709@stoneleaf.us>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <57570C07.9000703@stoneleaf.us>
 <CALFfu7B1=4etOsiMoDJ8siyDcCurcpnMT4TuFQRdfjOdW+y_rg@mail.gmail.com>
 <57571660.2090709@stoneleaf.us>
Message-ID: <CALFfu7AvP7o+Xh9oc7Xn5kKGy+ggs22nNbYjHrDNfnGbtj24Fg@mail.gmail.com>

On Tue, Jun 7, 2016 at 11:45 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 06/07/2016 11:13 AM, Eric Snow wrote:
>> A __definition_order__ in the class body always takes precedence.  So
>> a supplied value will be honored (and not replaced with None).
>
> Will the supplied __definition_order__ be made a tuple, and still be
> read-only?

I had planned on leaving a supplied one alone.  So no change to tuple.
It remains a read-only attribute though, since that is handled via a
descriptor (a la type.__dict__).

-eric

From vgr255 at live.ca  Tue Jun  7 14:57:16 2016
From: vgr255 at live.ca (Émanuel Barry)
Date: Tue, 7 Jun 2016 14:57:16 -0400
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7A3P5PTArip7HBfQr1zLct5F8uEJXAj4YYUDTBQOnVndQ@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <BLU403-EAS269736178C3E66726BBFA1A915D0@phx.gbl>
 <CALFfu7A3P5PTArip7HBfQr1zLct5F8uEJXAj4YYUDTBQOnVndQ@mail.gmail.com>
Message-ID: <BLU403-EAS196A5EC682D99DDD531C781915D0@phx.gbl>

> From: Eric Snow
> Sent: Tuesday, June 07, 2016 2:52 PM
> To: Émanuel Barry
> Cc: Python-Dev
> Subject: Re: [Python-Dev] PEP: Ordered Class Definition Namespace
> 
> "dunder" names (not just methods) will not be present in
> __definition_order__.  I'll add an explanation to the PEP.  The gist
> of it is that they are reserved for use by the interpreter and will
> always clutter up __definition_order__.  Since needing dunder names
> included in __definition_order__ would be rather exceptional, and
> there are other options available, leaving them out by default is a
> matter of practicality.

Good point. I'll assume that if we need that we'll do something in the metaclass.

> What is the value of type.definition_order()?  If you need a mutable
> copy then pass __definition_order__ to list().

I think I explained it backwards. I meant to have a method on ``type`` (which metaclasses can override at will) which will set what is passed to the resulting __definition_order__ attribute. But it might not be needed, as we can probably sneak that inside the namespace in the metaclass' __new__.

> > Additionally
> > I'm not sure if setting the attribute to None is a good idea; I'd have it as
> > an empty tuple. Then again I tend to overthink a lot.
> 
> None indicates that there is no order.  An empty tuple indicates that
> there were no attributes.

Fair enough.

> 
> -eric

-Emanuel

From ncoghlan at gmail.com  Tue Jun  7 15:30:31 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Jun 2016 12:30:31 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
Message-ID: <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>

On 7 June 2016 at 10:51, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> Specification
> =============
>
> * the default class *definition* namespace is now ``OrderedDict``
> * the order in which class attributes are defined is preserved in the
>   new ``__definition_order__`` attribute on each class
> * "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored
> * ``__definition_order__`` is a tuple
> * ``__definition_order__`` is a read-only attribute

Thinking about the class decorator use case, I think this may need to
be reconsidered, as class decorators may:

1. Remove class attributes
2. Add class attributes

This will then lead to __definition_order__ getting out of sync with
the current state of the class namespace.

One option for dealing with that would be to make type.__setattr__ and
type.__delattr__ aware of __definition_order__, and have them replace
the tuple with a new one as needed. If we did that, then the main
question would be whether updating an existing attribute changed the
definition order, and I'd be inclined to say "No" (to minimise the
side effects of monkey-patching).

The main alternative would be to make __definition_order__ writable,
so the default behaviour would be for it to reflect the original class
body, but decorators would be free to update it to reflect their
changes, as well as to make other modifications (e.g. stripping out
all callables from the list).
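
A small sketch of the problem and of the "writable" alternative.  The
record_order decorator is only a stand-in for the interpreter setting
__definition_order__, and it assumes vars(cls) iterates in definition
order (true of CPython 3.6+); all helper names here are made up.

```python
def record_order(cls):
    """Hypothetical stand-in for the interpreter recording the order."""
    cls.__definition_order__ = tuple(
        name for name in vars(cls)
        if not (name.startswith('__') and name.endswith('__')))
    return cls


def add_helper(cls):
    """Adds an attribute without touching the recorded order."""
    cls.helper = lambda self: 'help'
    return cls


def add_helper_and_update(cls):
    """Writable alternative: the decorator updates the order itself."""
    cls.helper = lambda self: 'help'
    cls.__definition_order__ += ('helper',)
    return cls


@add_helper
@record_order
class Stale:
    ham = None
    eggs = 5


@add_helper_and_update
@record_order
class Synced:
    ham = None
    eggs = 5
```

Stale.__definition_order__ no longer matches the class namespace after
decoration, whereas Synced.__definition_order__ does because the decorator
was free to rewrite it.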

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From sturla.molden at gmail.com  Tue Jun  7 15:37:21 2016
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 7 Jun 2016 19:37:21 +0000 (UTC)
Subject: [Python-Dev] C99
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
Message-ID: <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>

Victor Stinner <victor.stinner at gmail.com> wrote:

> Is it worth to support a compiler that in 2016 doesn't support the C
> standard released in 1999, 17 years ago?

MSVC only supports C99 when it's needed for C++11 or some MS extension to C.

Is it worth supporting MSVC? If not, Intel C, Clang and Cygwin GCC
are the viable options we have on Windows (and perhaps Embarcadero, but I
haven't used C++ builder for a very long time). Even MinGW does not fully
support C99, because it depends on Microsoft's CRT. If we think MSVC and
MinGW are worth supporting, we cannot just use C99 indiscriminately.

From tritium-list at sdamon.com  Tue Jun  7 16:10:28 2016
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Tue, 7 Jun 2016 16:10:28 -0400
Subject: [Python-Dev] C99
In-Reply-To: <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
 <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
Message-ID: <0cb001d1c0f8$a3322550$e9966ff0$@hotmail.com>

Doesn't Cygwin build against its POSIX abstraction layer?  Wouldn't a Python
built that way operate as though it were on a Unix of some sort?  It has been
quite a while since I messed with Cygwin - if it hasn't changed, it's not
really an option, especially when we have native Windows builds now.  It
would be too much of a downgrade in experience and performance.

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-bounces+tritium-
> list=sdamon.com at python.org] On Behalf Of Sturla Molden
> Sent: Tuesday, June 7, 2016 3:37 PM
> To: python-dev at python.org
> Subject: Re: [Python-Dev] C99
> 
> Victor Stinner <victor.stinner at gmail.com> wrote:
> 
> > Is it worth to support a compiler that in 2016 doesn't support the C
> > standard released in 1999, 17 years ago?
> 
> MSVC only supports C99 when it's needed for C++11 or some MS extension
> to C.
> 
> Is it worth supporting MSVC? If not, Intel C, Clang and Cygwin GCC are the
> viable options we have on Windows (and perhaps Embarcadero, but I haven't
> used C++ Builder for a very long time). Even MinGW does not fully support
> C99, because it depends on Microsoft's CRT. If we think MSVC and MinGW are
> worth supporting, we cannot just use C99 indiscriminately.
> 

From ethan at stoneleaf.us  Tue Jun  7 16:28:13 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 13:28:13 -0700
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes, bytearray,
 and memoryview
Message-ID: <57572E5D.4020101@stoneleaf.us>

Minor changes: updated version numbers, add punctuation.

The current text seems to take into account Guido's last comments.

Thoughts before asking for acceptance?

PEP: 467
Title: Minor API improvements for binary sequences
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-03-30
Python-Version: 3.5
Post-History: 2014-03-30 2014-08-15 2014-08-16

Abstract
========

During the initial development of the Python 3 language specification, 
the core ``bytes`` type for arbitrary binary data started as the mutable 
type that is now referred to as ``bytearray``. Other aspects of 
operating in the binary domain in Python have also evolved over the 
course of the Python 3 series.

This PEP proposes four small adjustments to the APIs of the ``bytes``, 
``bytearray`` and ``memoryview`` types to make it easier to operate 
entirely in the binary domain:

* Deprecate passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
* Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
   ``memoryview.iterbytes`` alternative iterators

Proposals
=========

Deprecation of current "zero-initialised sequence" behaviour
------------------------------------------------------------

Currently, the ``bytes`` and ``bytearray`` constructors accept an 
integer argument and interpret it as meaning to create a 
zero-initialised sequence of the given size::

     >>> bytes(3)
     b'\x00\x00\x00'
     >>> bytearray(3)
     bytearray(b'\x00\x00\x00')

This PEP proposes to deprecate that behaviour in Python 3.6, and remove 
it entirely in Python 3.7.

No other changes are proposed to the existing constructors.

Addition of explicit "zero-initialised sequence" constructors
-------------------------------------------------------------

To replace the deprecated behaviour, this PEP proposes the addition of 
an explicit ``zeros`` alternative constructor as a class method on both 
``bytes`` and ``bytearray``::

     >>> bytes.zeros(3)
     b'\x00\x00\x00'
     >>> bytearray.zeros(3)
     bytearray(b'\x00\x00\x00')

It will behave just as the current constructors behave when passed a 
single integer.

The specific choice of ``zeros`` as the alternative constructor name is 
taken from the corresponding initialisation function in NumPy (although, 
as these are 1-dimensional sequence types rather than N-dimensional 
matrices, the constructors take a length as input rather than a shape 
tuple).
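
Until such a constructor exists, the same result can be approximated
with sequence repetition. The following is only a minimal sketch: the
helper name ``zeros`` and its ``cls`` parameter are assumptions made for
illustration, not part of any current API.

```python
def zeros(cls, length):
    """Return a zero-filled instance of cls (bytes or bytearray).

    Hypothetical stand-in for the proposed cls.zeros(length).
    """
    if length < 0:
        # Mirror the existing constructors, which reject negative sizes.
        raise ValueError("negative count")
    return cls(b"\x00") * length


print(zeros(bytes, 3))      # b'\x00\x00\x00'
print(zeros(bytearray, 3))  # bytearray(b'\x00\x00\x00')
```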

Addition of explicit "single byte" constructors
-----------------------------------------------

As binary counterparts to the text ``chr`` function, this PEP proposes 
the addition of an explicit ``byte`` alternative constructor as a class 
method on both ``bytes`` and ``bytearray``::

     >>> bytes.byte(3)
     b'\x03'
     >>> bytearray.byte(3)
     bytearray(b'\x03')

These methods will only accept integers in the range 0 to 255 (inclusive)::

     >>> bytes.byte(512)
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     ValueError: bytes must be in range(0, 256)

     >>> bytes.byte(1.0)
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: 'float' object cannot be interpreted as an integer

The documentation of the ``ord`` builtin will be updated to explicitly 
note that ``bytes.byte`` is the inverse operation for binary data, while 
``chr`` is the inverse operation for text data.

Behaviourally, ``bytes.byte(x)`` will be equivalent to the current 
``bytes([x])`` (and similarly for ``bytearray``). The new spelling is 
expected to be easier to discover and easier to read (especially when 
used in conjunction with indexing operations on binary sequence types).

As a separate method, the new spelling will also work better with higher 
order functions like ``map``.
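
As a rough illustration of the intended behaviour, the sketch below
emulates the proposed constructor with the existing ``bytes([x])``
spelling. The free function ``byte`` is a hypothetical stand-in for
``bytes.byte``; it is not an existing API.

```python
def byte(value):
    """Hypothetical stand-in for the proposed bytes.byte(value)."""
    # bytes([value]) already rejects non-integers (TypeError) and
    # integers outside range(0, 256) (ValueError).
    return bytes([value])


print(byte(3))                      # b'\x03'
print(list(map(byte, [104, 105])))  # [b'h', b'i']
print(ord(byte(65)))                # 65
```

Because ``byte`` is an ordinary callable, it can be passed directly to
``map``, which is not possible with the ``bytes([x])`` spelling without
a ``lambda``.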

Addition of optimised iterator methods that produce ``bytes`` objects
---------------------------------------------------------------------

This PEP proposes that ``bytes``, ``bytearray`` and ``memoryview`` gain 
an optimised ``iterbytes`` method that produces length 1 ``bytes`` 
objects rather than integers::

     for x in data.iterbytes():
         # x is a length 1 ``bytes`` object, rather than an integer

The method can be used with arbitrary buffer exporting objects by 
wrapping them in a ``memoryview`` instance first::

     for x in memoryview(data).iterbytes():
         # x is a length 1 ``bytes`` object, rather than an integer

For ``memoryview``, the semantics of ``iterbytes()`` are defined such that::

     memview.tobytes() == b''.join(memview.iterbytes())

This allows the raw bytes of the memory view to be iterated over without 
needing to make a copy, regardless of the defined shape and format.

The main advantage this method offers over the ``map(bytes.byte, data)`` 
approach is that it is guaranteed *not* to fail midstream with a 
``ValueError`` or ``TypeError``. By contrast, when using the ``map`` 
based approach, the type and value of the individual items in the 
iterable are only checked as they are retrieved and passed through the 
``bytes.byte`` constructor.
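
The semantics described above can be approximated today with a
generator. This is a sketch under stated assumptions, not the proposed
implementation: the function name ``iterbytes`` is hypothetical, the
buffer is assumed to be C-contiguous so that it can be cast to unsigned
bytes, and no attempt is made at the optimisation the PEP has in mind.

```python
from array import array


def iterbytes(data):
    """Yield each raw byte of a buffer exporter as a length 1 bytes object.

    Hypothetical stand-in for the proposed data.iterbytes().
    """
    # View the underlying memory as a flat sequence of unsigned bytes.
    view = memoryview(data).cast("B")
    for i in range(len(view)):
        yield bytes(view[i:i + 1])


print(list(iterbytes(b"abc")))  # [b'a', b'b', b'c']

# The raw bytes are produced regardless of the item format of the buffer.
values = array("H", [1, 2])
print(b"".join(iterbytes(values)) == memoryview(values).tobytes())  # True
```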

Design discussion
=================

Why not rely on sequence repetition to create zero-initialised sequences?
-------------------------------------------------------------------------

Zero-initialised sequences can be created via sequence repetition::

     >>> b'\x00' * 3
     b'\x00\x00\x00'
     >>> bytearray(b'\x00') * 3
     bytearray(b'\x00\x00\x00')

However, this was also the case when the ``bytearray`` type was 
originally designed, and the decision was made to add explicit support 
for it in the type constructor. The immutable ``bytes`` type then 
inherited that feature when it was introduced in PEP 3137.

This PEP isn't revisiting that original design decision, just changing 
the spelling as users sometimes find the current behaviour of the binary 
sequence constructors surprising. In particular, there's a reasonable 
case to be made that ``bytes(x)`` (where ``x`` is an integer) should 
behave like the ``bytes.byte(x)`` proposal in this PEP. Providing both 
behaviours as separate class methods avoids that ambiguity.
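
The ambiguity is easy to see by comparing the spellings that exist
today. The snippet below only uses current behaviour and is meant as an
illustration of the point above, nothing more.

```python
# An integer argument is interpreted as a length ...
print(bytes(3))      # b'\x00\x00\x00'

# ... while an iterable of integers is interpreted as byte values ...
print(bytes([3]))    # b'\x03'

# ... and a bytes-like object is copied as-is.
print(bytes(b"3"))   # b'3'

# Sequence repetition gives the same result as the integer form.
print(b"\x00" * 3 == bytes(3))  # True
```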

References
==========

.. [1] Initial March 2014 discussion thread on python-ideas
    (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html)
.. [2] Guido's initial feedback in that thread
    (https://mail.python.org/pipermail/python-ideas/2014-March/027376.html)
.. [3] Issue proposing moving zero-initialised sequences to a dedicated API
    (http://bugs.python.org/issue20895)
.. [4] Issue proposing to use calloc() for zero-initialised binary sequences
    (http://bugs.python.org/issue21644)
.. [5] August 2014 discussion thread on python-dev
    (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html)

Copyright
=========

This document has been placed in the public domain.

From gvanrossum at gmail.com  Tue Jun  7 15:45:35 2016
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue, 7 Jun 2016 12:45:35 -0700
Subject: [Python-Dev] C99
In-Reply-To: <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
 <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
Message-ID: <CAP7+vJJuSreWFH=ojBphsYsBXt8cz2zoc3=6Pmh=KPu7E-VOaA@mail.gmail.com>

We should definitely keep supporting MSVC.

--Guido (mobile)
On Jun 7, 2016 12:39 PM, "Sturla Molden" <sturla.molden at gmail.com> wrote:

> Victor Stinner <victor.stinner at gmail.com> wrote:
>
> > Is it worth to support a compiler that in 2016 doesn't support the C
> > standard released in 1999, 17 years ago?
>
> MSVC only supports C99 when it's needed for C++11 or some MS extension to C.
>
> Is it worth supporting MSVC? If not, Intel C, Clang and Cygwin GCC are the
> viable options we have on Windows (and perhaps Embarcadero, but I haven't
> used C++ Builder for a very long time). Even MinGW does not fully support
> C99, because it depends on Microsoft's CRT. If we think MSVC and MinGW are
> worth supporting, we cannot just use C99 indiscriminately.
>

From njs at pobox.com  Tue Jun  7 16:54:04 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 7 Jun 2016 13:54:04 -0700
Subject: [Python-Dev] C99
In-Reply-To: <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
 <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
Message-ID: <CAPJVwBmYi5QjmHHNQwC9x7GB6cS8y9VXf2=Xcz5W26md8AEsgA@mail.gmail.com>

On Tue, Jun 7, 2016 at 12:37 PM, Sturla Molden <sturla.molden at gmail.com> wrote:
> Victor Stinner <victor.stinner at gmail.com> wrote:
>
>> Is it worth to support a compiler that in 2016 doesn't support the C
>> standard released in 1999, 17 years ago?
>
> MSVC only supports C99 when it's needed for C++11 or some MS extension to C.
>
> Is it worth supporting MSVC? If not, Intel C, Clang and Cygwin GCC are the
> viable options we have on Windows (and perhaps Embarcadero, but I haven't
> used C++ Builder for a very long time). Even MinGW does not fully support
> C99, because it depends on Microsoft's CRT. If we think MSVC and MinGW are
> worth supporting, we cannot just use C99 indiscriminately.

No-one's proposing to use C99 indiscriminately; AFAICT the proposal
was: it would make a big difference if the CPython core could start
using some of C99's basic features like long long, inline functions,
and mid-block declarations, and all interesting compilers support
these, so let's officially switch from C89-only to
C89-plus-the-bits-of-C99-that-MSVC-supports. This would be a big
improvement and is just a matter of recognizing the status quo; no
need to drag in anything controversial.

There's no chance that CPython is going to drop MSVC support in 3.6.
Intel C is hardly a viable option given that the license requires the
people running the compiler to accept unbounded liability for Intel
lawyer bills and imposes non-DFSG-free conditions on the compiled
output. And Cygwin GCC isn't even real Windows. Maybe switching to
Clang will make sense in 3.7 but that's a long ways off...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From sturla.molden at gmail.com  Tue Jun  7 17:03:57 2016
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 7 Jun 2016 21:03:57 +0000 (UTC)
Subject: [Python-Dev] C99
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
 <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
 <CAPJVwBmYi5QjmHHNQwC9x7GB6cS8y9VXf2=Xcz5W26md8AEsgA@mail.gmail.com>
Message-ID: <1399814768487025983.893766sturla.molden-gmail.com@news.gmane.org>

Nathaniel Smith <njs at pobox.com> wrote:

> No-one's proposing to use C99 indiscriminately;

> There's no chance that CPython is going to drop MSVC support in 3.6.

Stinner was proposing that by saying

"Is it worth to support a compiler that in 2016 doesn't support the C
standard released in 1999, 17 years ago?"

This is basically a suggestion to drop MSVC support, as I read it. That is
never going to happen.

Sturla

From ericsnowcurrently at gmail.com  Tue Jun  7 17:20:02 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 14:20:02 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
Message-ID: <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>

On Tue, Jun 7, 2016 at 12:30 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 7 June 2016 at 10:51, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> * ``__definition_order__`` is a tuple
>> * ``__definition_order__`` is a read-only attribute
>
> Thinking about the class decorator use case, I think this may need to
> be reconsidered, as class decorators may:
>
> 1. Remove class attributes
> 2. Add class attributes
>
> This will then lead to __definition_order__ getting out of sync with
> the current state of the class namespace.

I'm not clear on your point.  Decorators are applied after the class
has been created.  Hence they have no impact on the class's definition
order.  I'd expect __definition_order__ to strictly represent what
happened in the class body during definition, and not anything
afterward.

Certainly __definition_order__ might not align with __dict__ (or
dir()); we don't have any way to guarantee that it would, do we?  If
anything, the ability to diff __definition_order__ and __dict__ is a
positive, since it allows you to see changes on the class since it was
defined.
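
To make that diff concrete, here is a rough sketch. Since
__definition_order__ does not exist yet, it is emulated with a
metaclass; the names OrderedMeta, definition_order, tweak and Example
are all made up for this illustration and are not what the PEP would
provide.

```python
from collections import OrderedDict


def _is_dunder(name):
    return name.startswith("__") and name.endswith("__")


class OrderedMeta(type):
    """Hypothetical metaclass that records the class body's definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # An ordered namespace remembers the order of assignments.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwds)
        # Stand-in for the proposed read-only tuple.
        cls.definition_order = tuple(n for n in namespace if not _is_dunder(n))
        return cls


def tweak(cls):
    """A class decorator that removes one attribute and adds another."""
    del cls.b
    cls.extra = 1
    return cls


@tweak
class Example(metaclass=OrderedMeta):
    a = 1
    b = 2

    def c(self):
        pass


current = {n for n in vars(Example) if not _is_dunder(n)} - {"definition_order"}
removed = [n for n in Example.definition_order if n not in current]
added = sorted(current - set(Example.definition_order))

print(Example.definition_order)  # ('a', 'b', 'c')
print(removed, added)            # ['b'] ['extra']
```

The recorded order is untouched by the decorator, so comparing it with
the class namespace shows exactly what was changed after the class body
ran.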

>
> One option for dealing with that would be to make type.__setattr__ and
> type.__delattr__ aware of __definition_order__, and have them replace
> the tuple with a new one as needed. If we did that, then the main
> question would be whether updating an existing attribute changed the
> definition order, and I'd be inclined to say "No" (to minimise the
> side effects of monkey-patching).
>
> The main alternative would be to make __definition_order__ writable,
> so the default behaviour would be for it to reflect the original class
> body, but decorators would be free to update it to reflect their
> changes, as well as to make other modifications (e.g. stripping out
> all callables from the list).

I think both of those make __definition_order__ more complicated and
less useful.  As the PEP stands, folks can be confident in what
__definition_order__ represents.  What would you consider to be the
benefit of a mutable (or replaceable) __definition_order__ that
outweighs the benefit of a simpler definition of what's in it?

BTW, thanks for bringing this up. :)

-eric

From barry at python.org  Tue Jun  7 17:31:19 2016
From: barry at python.org (Barry Warsaw)
Date: Tue, 7 Jun 2016 17:31:19 -0400
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <20160607173119.36961fcf.barry@wooz.org>

On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:

>* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators

+1 but I want to go just a little farther.

We can't change bytes.__getitem__ but we can add another method that returns
single byte objects?  I think it's still a bit of a pain to extract single
bytes even with .iterbytes().

Maybe .iterbytes can take a single index argument (blech) or add a method like
.byte_at(i).  I'll let you bikeshed on the name.

Cheers,
-Barry

From ethan at stoneleaf.us  Tue Jun  7 17:34:58 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 14:34:58 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
 <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
Message-ID: <57573E02.9020206@stoneleaf.us>

On 06/07/2016 02:20 PM, Eric Snow wrote:
> On Tue, Jun 7, 2016 at 12:30 PM, Nick Coghlan wrote:
>> On 7 June 2016 at 10:51, Eric Snow wrote:

>>> * ``__definition_order__`` is a tuple
>>> * ``__definition_order__`` is a read-only attribute
>>
>> Thinking about the class decorator use case, I think this may need to
>> be reconsidered, as class decorators may:
>>
>> 1. Remove class attributes
>> 2. Add class attributes
>>
>> This will then lead to __definition_order__ getting out of sync with
>> the current state of the class namespace.
>
> I'm not clear on your point.  Decorators are applied after the class
> has been created.  Hence they have no impact on the class's definition
> order.  I'd expect __definition_order__ to strictly represent what
> happened in the class body during definition, and not anything
> afterward.
>
> Certainly __definition_order__ might not align with __dict__ (or
> dir()); we don't have any way to guarantee that it would, do we?  If
> anything, the ability to diff __definition_order__ and __dict__ is a
> positive, since it allows you to see changes on the class since it was
> defined.
>
>>
>> One option for dealing with that would be to make type.__setattr__ and
>> type.__delattr__ aware of __definition_order__, and have them replace
>> the tuple with a new one as needed. If we did that, then the main
>> question would be whether updating an existing attribute changed the
>> definition order, and I'd be inclined to say "No" (to minimise the
>> side effects of monkey-patching).
>>
>> The main alternative would be to make __definition_order__ writable,
>> so the default behaviour would be for it to reflect the original class
>> body, but decorators would be free to update it to reflect their
>> changes, as well as to make other modifications (e.g. stripping out
>> all callables from the list).
>
> I think both of those make __definition_order__ more complicated and
> less useful.  As the PEP stands, folks can be confident in what
> __definition_order__ represents.  What would you consider to be the
> benefit of a mutable (or replaceable) __definition_order__ that
> outweighs the benefit of a simpler definition of what's in it?

I think the question is which is more useful?

- a definition order that lists items that are no longer in the class and
   omits items that are in the class (those set by a decorator)

or

- a definition order that is representative of the class state after
   all decorators have been applied

One argument for the latter is that, even though the class has been 
technically "defined" (class body executed, type.__new__ called, etc.), 
applying decorators feels like continued class definition.

One argument for the former is simplified implementation, and is 
definition order really important after the class body has been 
executed?  (okay, two arguments ;)

Perhaps the best thing is just to make it writeable -- after all, if 
__class__, __name__, etc., can all be changed, why should 
__definition_order__ be special?

--
~Ethan~

From k7hoven at gmail.com  Tue Jun  7 17:34:02 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 8 Jun 2016 00:34:02 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <CAMiohohZj1DBJbf0qPdEaVHB5VJRiR845us1Fj5QrN_7zrL2xw@mail.gmail.com>

On Tue, Jun 7, 2016 at 11:28 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>
> Minor changes: updated version numbers, add punctuation.
>
> The current text seems to take into account Guido's last comments.
>
> Thoughts before asking for acceptance?
>
> PEP: 467
> Title: Minor API improvements for binary sequences
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nick Coghlan <ncoghlan at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2014-03-30
> Python-Version: 3.5
> Post-History: 2014-03-30 2014-08-15 2014-08-16
>
>
> Abstract
> ========
>
> During the initial development of the Python 3 language specification, the core ``bytes`` type for arbitrary binary data started as the mutable type that is now referred to as ``bytearray``. Other aspects of operating in the binary domain in Python have also evolved over the course of the Python 3 series.
>
> This PEP proposes four small adjustments to the APIs of the ``bytes``, ``bytearray`` and ``memoryview`` types to make it easier to operate entirely in the binary domain:
>
> * Deprecate passing single integer values to ``bytes`` and ``bytearray``
> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators
>

Why not bytes.viewbytes (or whatever name) so that one could also
subscript it? And if it were a property, one could perhaps
conveniently get the n'th byte:

b'abcde'.viewbytes[n]   # compared to b'abcde'[n:n+1]

Also, would it not be clearer to call the int -> bytes method
something like bytes.fromint or bytes.fromord and introduce the same
thing on str? And perhaps allow multiple arguments to create a
str/bytes of length > 1. I guess this may violate TOOWTDI, but anyway,
just a thought.

-- Koos

From ncoghlan at gmail.com  Tue Jun  7 17:34:37 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Jun 2016 14:34:37 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
 <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
Message-ID: <CADiSq7egZAySWQYjDhH_7qVrS_CxFqT5CnZPn+VcsF1k2XJN9A@mail.gmail.com>

On 7 June 2016 at 14:20, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Tue, Jun 7, 2016 at 12:30 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> The main alternative would be to make __definition_order__ writable,
>> so the default behaviour would be for it to reflect the original class
>> body, but decorators would be free to update it to reflect their
>> changes, as well as to make other modifications (e.g. stripping out
>> all callables from the list).
>
> I think both of those make __definition_order__ more complicated and
> less useful.  As the PEP stands, folks can be confident in what
> __definition_order__ represents.  What would you consider to be the
> benefit of a mutable (or replaceable) __definition_order__ that
> outweighs the benefit of a simpler definition of what's in it?

Mainly the fact that class decorators and metaclasses can't hide the
difference between "attributes defined in the class body" and
"attributes injected by a decorator or metaclass". I don't have a
concrete use case for that, it just bothers me on general principles
when we have things the interpreter can do that can't readily be
emulated in Python code.

However, if it proves to be a hassle in practice, making it writable
can be done later based on specific use cases, so I don't mind if the
PEP stays as it is on that front.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pmiscml at gmail.com  Tue Jun  7 17:33:50 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 8 Jun 2016 00:33:50 +0300
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <20160608003350.7a7c6641@x230>

Hello,

On Tue, 07 Jun 2016 13:28:13 -0700
Ethan Furman <ethan at stoneleaf.us> wrote:

> Minor changes: updated version numbers, add punctuation.
> 
> The current text seems to take into account Guido's last comments.
> 
> Thoughts before asking for acceptance?
> 
> 
[]

> Deprecation of current "zero-initialised sequence" behaviour
> ------------------------------------------------------------
> 
> Currently, the ``bytes`` and ``bytearray`` constructors accept an 
> integer argument and interpret it as meaning to create a 
> zero-initialised sequence of the given size::
> 
>      >>> bytes(3)
>      b'\x00\x00\x00'
>      >>> bytearray(3)
>      bytearray(b'\x00\x00\x00')
> 
> This PEP proposes to deprecate that behaviour in Python 3.6, and
> remove it entirely in Python 3.7.

Why the desire to break the applications of thousands and thousands of
people? Besides, the bytes(3) behavior is very logical. Everyone who knows
what malloc(3) does also knows what bytes(3) does. Those who don't can
learn, and will eventually be grateful that learning Python actually helped
them to learn another language as well.

[]

> Addition of explicit "single byte" constructors
> -----------------------------------------------
> 
> As binary counterparts to the text ``chr`` function, this PEP
> proposes the addition of an explicit ``byte`` alternative constructor
> as a class method on both ``bytes`` and ``bytearray``::
> 
>      >>> bytes.byte(3)
>      b'\x03'
>      >>> bytearray.byte(3)
>      bytearray(b'\x03')
> 
> These methods will only accept integers in the range 0 to 255
> (inclusive)::
> 
>      >>> bytes.byte(512)
>      Traceback (most recent call last):
>        File "<stdin>", line 1, in <module>
>      ValueError: bytes must be in range(0, 256)
> 
>      >>> bytes.byte(1.0)
>      Traceback (most recent call last):
>        File "<stdin>", line 1, in <module>
>      TypeError: 'float' object cannot be interpreted as an integer
> 
> The documentation of the ``ord`` builtin will be updated to
> explicitly note that ``bytes.byte`` is the inverse operation for
> binary data, while ``chr`` is the inverse operation for text data.

The documentation should probably also mention that bytes.byte(x) is
equivalent to x.to_bytes(1, "little"). 
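
A quick check of that equivalence, using bytes([x]) as a stand-in for
the proposed bytes.byte(x) (which does not exist yet). As far as I can
tell the two agree for every value in range(0, 256) and differ only in
the exception raised for out-of-range values:

```python
# For valid byte values the two spellings produce the same object.
for x in (0, 3, 255):
    assert x.to_bytes(1, "little") == bytes([x])

# Out-of-range values fail with different exception types.
try:
    (256).to_bytes(1, "little")
except OverflowError as exc:
    print("to_bytes:", type(exc).__name__)    # to_bytes: OverflowError

try:
    bytes([256])
except ValueError as exc:
    print("bytes([x]):", type(exc).__name__)  # bytes([x]): ValueError
```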

[]

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From pmiscml at gmail.com  Tue Jun  7 17:37:11 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 8 Jun 2016 00:37:11 +0300
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
In-Reply-To: <20160607173119.36961fcf.barry@wooz.org>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
Message-ID: <20160608003711.6149bc96@x230>

Hello,

On Tue, 7 Jun 2016 17:31:19 -0400
Barry Warsaw <barry at python.org> wrote:

> On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
> 
> >* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
> >   ``memoryview.iterbytes`` alternative iterators
> 
> +1 but I want to go just a little farther.
> 
> We can't change bytes.__getitem__ but we can add another method that
> returns single byte objects?  I think it's still a bit of a pain to
> extract single bytes even with .iterbytes().
> 
> Maybe .iterbytes can take a single index argument (blech) or add a
> method like .byte_at(i).  I'll let you bikeshed on the name.

What's wrong with b[i:i+1] ?

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From ncoghlan at gmail.com  Tue Jun  7 17:39:30 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Jun 2016 14:39:30 -0700
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160607173119.36961fcf.barry@wooz.org>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
Message-ID: <CADiSq7ekTTBT5wuEdL9MswshqN-=644O1aadC1uNv41i5bgy7w@mail.gmail.com>

On 7 June 2016 at 14:31, Barry Warsaw <barry at python.org> wrote:
> On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
>
>>* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>>   ``memoryview.iterbytes`` alternative iterators
>
> +1 but I want to go just a little farther.
>
> We can't change bytes.__getitem__ but we can add another method that returns
> single byte objects?  I think it's still a bit of a pain to extract single
> bytes even with .iterbytes().
>
> Maybe .iterbytes can take a single index argument (blech) or add a method like
> .byte_at(i).  I'll let you bikeshed on the name.

Perhaps:

 data.getbyte(i)
 data.iterbytes()

The rationale for "Why not a live view?" is that an iterator is simple
to define and implement, while we know from experience with memoryview
and the various dict views that live views are a minefield for folks
defining new container types. Since this PEP would in some sense
change what it means to implement a full "bytes-like object", it's
worth keeping implementation complexity in mind.
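
To show how little a non-view implementation needs, here is a rough
sketch of the accessor as a plain function. The name getbyte is just
the placeholder from above, and the free-function form is an assumption
for illustration; it relies only on existing integer indexing.

```python
def getbyte(data, i):
    """Hypothetical stand-in for the suggested data.getbyte(i)."""
    # Integer indexing raises IndexError for out-of-range positions,
    # and bytes([...]) wraps the resulting int in a length 1 bytes object.
    return bytes([data[i]])


print(getbyte(b"abc", 1))              # b'b'
print(getbyte(bytearray(b"abc"), -1))  # b'c'
```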

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Tue Jun  7 17:40:35 2016
From: brett at python.org (Brett Cannon)
Date: Tue, 07 Jun 2016 21:40:35 +0000
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160608003711.6149bc96@x230>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
 <20160608003711.6149bc96@x230>
Message-ID: <CAP1=2W7XSQcLHysATMV5w2fUxW=+V+-SgrYBpqSBLDupNUK-Dg@mail.gmail.com>

On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky <pmiscml at gmail.com> wrote:

> Hello,
>
> On Tue, 7 Jun 2016 17:31:19 -0400
> Barry Warsaw <barry at python.org> wrote:
>
> > On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
> >
> > >* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
> > >   ``memoryview.iterbytes`` alternative iterators
> >
> > +1 but I want to go just a little farther.
> >
> > We can't change bytes.__getitem__ but we can add another method that
> > returns single byte objects?  I think it's still a bit of a pain to
> > extract single bytes even with .iterbytes().
> >
> > Maybe .iterbytes can take a single index argument (blech) or add a
> > method like .byte_at(i).  I'll let you bikeshed on the name.
>
> What's wrong with b[i:i+1] ?
>

It always succeeds while indexing can trigger an IndexError.
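
A small sketch of the difference, using only existing ``bytes`` behaviour (the variable names are illustrative):

```python
data = b"abc"

# Indexing yields an int and fails loudly when the index is out of range.
as_int = data[0]
try:
    data[10]
except IndexError:
    index_failed = True
else:
    index_failed = False

# A one-element slice yields bytes, but an out-of-range slice is silently empty.
in_range = data[0:1]
out_of_range = data[10:11]
```

So ``b[i:i+1]`` gives the right type but hides an off-by-one or out-of-range index that ``b[i]`` would have reported.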

From tritium-list at sdamon.com  Tue Jun  7 17:50:49 2016
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Tue, 7 Jun 2016 17:50:49 -0400
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CADiSq7ekTTBT5wuEdL9MswshqN-=644O1aadC1uNv41i5bgy7w@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
 <CADiSq7ekTTBT5wuEdL9MswshqN-=644O1aadC1uNv41i5bgy7w@mail.gmail.com>
Message-ID: <0cea01d1c106$a88a3750$f99ea5f0$@hotmail.com>

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-bounces+tritium-
> list=sdamon.com at python.org] On Behalf Of Nick Coghlan
> Sent: Tuesday, June 7, 2016 5:40 PM
> To: Barry Warsaw <barry at python.org>
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] PEP 467: Minor API improvements to bytes,
> bytearray, and memoryview
> 
> On 7 June 2016 at 14:31, Barry Warsaw <barry at python.org> wrote:
> > On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
> >
> >>* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
> >>   ``memoryview.iterbytes`` alternative iterators
> >
> > +1 but I want to go just a little farther.
> >
> > We can't change bytes.__getitem__ but we can add another method that
> > returns single byte objects?  I think it's still a bit of a pain to
> > extract single bytes even with .iterbytes().
> >
> > Maybe .iterbytes can take a single index argument (blech) or add a
> > method like .byte_at(i).  I'll let you bikeshed on the name.
> 
> Perhaps:
> 
>  data.getbyte(i)
>  data.iterbytes()

data.getbyte(index_or_slice_object) ?

while it might not be... ideal... to create a sliceable live view object, we
can have a method that accepts a slice, even if we have to create it
manually (or at least make it convenient for those who wish to wrap a bytes
object in their own type and blindly pass the first-non-self arg of a custom
__getitem__ to the method).

> The rationale for "Why not a live view?" is that an iterator is simple
> to define and implement, while we know from experience with memoryview
> and the various dict views that live views are a minefield for folks
> defining new container types. Since this PEP would in some sense
> change what it means to implement a full "bytes-like object", it's
> worth keeping implementation complexity in mind.
> 
> Cheers,
> Nick.
> 
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Tue Jun  7 17:51:46 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 14:51:46 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7egZAySWQYjDhH_7qVrS_CxFqT5CnZPn+VcsF1k2XJN9A@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
 <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
 <CADiSq7egZAySWQYjDhH_7qVrS_CxFqT5CnZPn+VcsF1k2XJN9A@mail.gmail.com>
Message-ID: <CALFfu7D0OxgFYc7Zxe96XMZwwjAQYd7kttn_6SR-Ua4OzRtjjg@mail.gmail.com>

On Tue, Jun 7, 2016 at 2:34 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 7 June 2016 at 14:20, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> What would you consider to be the
>> benefit of a mutable (or replaceable) __definition_order__ that
>> outweighs the benefit of a simpler definition of what's in it.
>
> Mainly the fact that class decorators and metaclasses can't hide the
> difference between "attributes defined in the class body" and
> "attributes injected by a decorator or metaclass". I don't have a
> concrete use case for that, it just bothers me on general principles
> when we have things the interpreter can do that can't readily be
> emulated in Python code.

Yeah, I see what you mean.

>
> However, if it proves to be a hassle in practice, making it writable
> can be done later based on specific use cases, so I don't mind if the
> PEP stays as it is on that front.

Agreed.

-eric

From tritium-list at sdamon.com  Tue Jun  7 17:52:51 2016
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Tue, 7 Jun 2016 17:52:51 -0400
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <0cea01d1c106$a88a3750$f99ea5f0$@hotmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
 <CADiSq7ekTTBT5wuEdL9MswshqN-=644O1aadC1uNv41i5bgy7w@mail.gmail.com>
 <0cea01d1c106$a88a3750$f99ea5f0$@hotmail.com>
Message-ID: <0ceb01d1c106$f0eb6be0$d2c243a0$@hotmail.com>

Ignore that message.  I hit send before brain and hands were fully in sync.


From ncoghlan at gmail.com  Tue Jun  7 17:56:38 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Jun 2016 14:56:38 -0700
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160608003350.7a7c6641@x230>
References: <57572E5D.4020101@stoneleaf.us> <20160608003350.7a7c6641@x230>
Message-ID: <CADiSq7cgatAo5L7meFsX7VhDjOSBGzHwsE997+xc8gu504TQ=w@mail.gmail.com>

On 7 June 2016 at 14:33, Paul Sokolovsky <pmiscml at gmail.com> wrote:
> Hello,
>
> On Tue, 07 Jun 2016 13:28:13 -0700
> Ethan Furman <ethan at stoneleaf.us> wrote:
>
>> Minor changes: updated version numbers, add punctuation.
>>
>> The current text seems to take into account Guido's last comments.
>>
>> Thoughts before asking for acceptance?
>>
>>
> []
>
>> Deprecation of current "zero-initialised sequence" behaviour
>> ------------------------------------------------------------
>>
>> Currently, the ``bytes`` and ``bytearray`` constructors accept an
>> integer argument and interpret it as meaning to create a
>> zero-initialised sequence of the given size::
>>
>>      >>> bytes(3)
>>      b'\x00\x00\x00'
>>      >>> bytearray(3)
>>      bytearray(b'\x00\x00\x00')
>>
>> This PEP proposes to deprecate that behaviour in Python 3.6, and
>> remove it entirely in Python 3.7.
>
> Why the desire to break applications of thousands and thousands of
> people?

Same argument as any deprecation: to make existing and future defects
easier to find or easier to debug.
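
As an illustration of the kind of defect in question (a sketch using only current constructor behaviour; ``value`` is just an example name):

```python
value = 3

# Intended: a single byte whose ordinal is 3.
intended = bytes([value])

# Easy mistake: passing the int directly gives three NUL bytes instead,
# and no error is raised to point at the problem.
accidental = bytes(value)

# An explicit spelling of zero-filling that does not use the int constructor.
zeros = b"\x00" * value
```

The mistaken call succeeds and produces plausible-looking data, which is what makes it hard to track down later.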

That said, this is the main part I was referring to in the other
thread when I mentioned some of the constructor changes were
potentially controversial and probably not worth the hassle - it's the
only one with the potential to break currently working code, while the
others are just a matter of choosing suitable names.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From barry at python.org  Tue Jun  7 17:57:45 2016
From: barry at python.org (Barry Warsaw)
Date: Tue, 7 Jun 2016 17:57:45 -0400
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CAP1=2W7XSQcLHysATMV5w2fUxW=+V+-SgrYBpqSBLDupNUK-Dg@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
 <20160608003711.6149bc96@x230>
 <CAP1=2W7XSQcLHysATMV5w2fUxW=+V+-SgrYBpqSBLDupNUK-Dg@mail.gmail.com>
Message-ID: <20160607175745.291e595a@subdivisions.wooz.org>

On Jun 07, 2016, at 09:40 PM, Brett Cannon wrote:

>On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky <pmiscml at gmail.com> wrote:
>> What's wrong with b[i:i+1] ?
>It always succeeds while indexing can trigger an IndexError.

Right.  You want a method with the semantics of __getitem__() but that returns
the desired type.
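
Roughly, as a free function rather than a method. This is only a sketch: ``getbyte`` is a hypothetical name from this thread, not an existing API, and it assumes byte-formatted input.

```python
import operator


def getbyte(data, i):
    """Hypothetical helper: return the byte at index i as a length-1 bytes object."""
    i = operator.index(i)          # reject slices and non-integers, like indexing does
    n = len(data)
    if i < 0:
        i += n                     # support negative indices
    if not 0 <= i < n:
        raise IndexError("byte index out of range")
    return bytes(data[i:i + 1])    # slice keeps the bytes type; bounds were checked above
```

With this, ``getbyte(b"abc", 1)`` gives ``b"b"`` while ``getbyte(b"abc", 3)`` raises IndexError instead of returning ``b""``.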

-Barry

From k7hoven at gmail.com  Tue Jun  7 18:22:21 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 8 Jun 2016 01:22:21 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160607175745.291e595a@subdivisions.wooz.org>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
 <20160608003711.6149bc96@x230>
 <CAP1=2W7XSQcLHysATMV5w2fUxW=+V+-SgrYBpqSBLDupNUK-Dg@mail.gmail.com>
 <20160607175745.291e595a@subdivisions.wooz.org>
Message-ID: <CAMiohohcm5nTBKd00rTLMgCTLekqK3D=GQuSWhwqEC1OHbhwNg@mail.gmail.com>

On Wed, Jun 8, 2016 at 12:57 AM, Barry Warsaw <barry at python.org> wrote:
> On Jun 07, 2016, at 09:40 PM, Brett Cannon wrote:
>
>>On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky <pmiscml at gmail.com> wrote:
>>> What's wrong with b[i:i+1] ?
>>It always succeeds while indexing can trigger an IndexError.
>
> Right.  You want a method with the semantics of __getitem__() but that returns
> the desired type.
>

And if this is called __getitem__ (with slices delegated to
bytes.__getitem__) and implemented in a class, one has a view. Maybe
I'm missing something, but I fail to understand what makes this
significantly more problematic than an iterator. Ok, I guess we might
also need __len__.

-- Koos

> -Barry

From ericsnowcurrently at gmail.com  Tue Jun  7 18:27:16 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 15:27:16 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <57573E02.9020206@stoneleaf.us>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
 <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
 <57573E02.9020206@stoneleaf.us>
Message-ID: <CALFfu7Bjy9pU-5c6YE-xnbU3ToZ90hRB6BUhoXphX=B49=AJ9A@mail.gmail.com>

On Tue, Jun 7, 2016 at 2:34 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 06/07/2016 02:20 PM, Eric Snow wrote:
>> I think both of those make __definition_order__ more complicated and
>> less useful.  As the PEP stands, folks can be confident in what
>> __definition_order__ represents.  What would you consider to be the
>> benefit of a mutable (or replaceable) __definition_order__ that
>> outweighs the benefit of a simpler definition of what's in it.
>
> I think the question is which is more useful?
>
> - a definition order that lists items that are not in the class, as
>   well as not having items that are in the class (set by the decorator)
>
> or
>
> - a definition order that is representative of the class state after
>   all decorators have been applied

"definition" refers explicitly to the execution of the class body in a
class statement.  So what you've described is a bit confusing to me.
If we're talking about some other semantics then the name
"__definition_order__" is misleading.

Also, consider that __definition_order__ is, IMHO, most useful when
interpreted as the actual order of attributes in the class definition
body.  The point is that code outside the class body can leverage the
order of assigned names within that block.  So, relative to the class
definition, I'm not clear on valid use cases that divorce themselves
from the class definition, such that either of your scenarios is
relevant.

Semantics that relate more to the class namespace (__dict__) are a
separate matter from this PEP.  I'd welcome a broader solution that
still met the needs at which __definition_order__ is aiming.  For
example, consider if the class's __dict__ (or rather the proxied
value) were OrderedDict.  In fact, Guido was originally (in 2013)
amenable to that idea.  However, I tried it and making it work was a
challenge due to use of the concrete dict C-API.  I'd be glad if it
was worked out.  In the meantime, this PEP is more focused on a
practical representation of the ordering information inside just the
class definition body.

>
> One argument for the latter is that, even though the class has been
> technically "defined" (class body executed, type.__new__ called, etc.),
> applying decorators feels like continued class definition.

Perhaps.  That doesn't align with my intuition on decorators, but I'll
readily concede that my views aren't always particularly
representative of everyone else. :)  That said, there are many
different uses for decorators and modifying the class namespace
(__dict__) isn't the only one (and in my experience not necessarily
the most common).

>
> One argument for the former is simplified implementation,

I'm not sure what you're implying about the implementation.  Do you
mean that it's easier than just letting __definition_order__ be
writable (or mutable)?  It's actually slightly more work to make it a
read-only attr.  Perhaps you mean that the semantics in the PEP are
easier to implement than something that tracks changes to the class
namespace (__dict__) after definition is over?  Probably, though I
don't see anything like that happening (other than if OrderedDict were
used for __dict__).

> and is definition
> order really important after the class body has been executed?  (okay, two
> arguments ;)

Given that the focus is on class definition, I'd argue no. :)

>
> Perhaps the best thing is just to make it writeable -- after all, if
> __class__, __name__, etc., can all be changed, why should
> __definition_order__ be special?

Not all attrs are writable and it's a case-by-case situation: some of
the ones that are writable started out read-only and changed once
there was a valid reason.  If anything, it's arguably safer in general
to take an immutable-by-default approach.

-eric

From ethan at stoneleaf.us  Tue Jun  7 18:39:28 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 15:39:28 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7Bjy9pU-5c6YE-xnbU3ToZ90hRB6BUhoXphX=B49=AJ9A@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <CADiSq7f48beR4Kj6QvQs1g14_kCJKYKQNuW+v1hS7WuT_nAJxw@mail.gmail.com>
 <CALFfu7A3rRG7HY8DevRMFa-ucJtK1=kVp9Ya78LSfUgLs8vm2Q@mail.gmail.com>
 <57573E02.9020206@stoneleaf.us>
 <CALFfu7Bjy9pU-5c6YE-xnbU3ToZ90hRB6BUhoXphX=B49=AJ9A@mail.gmail.com>
Message-ID: <57574D20.3010700@stoneleaf.us>

On 06/07/2016 03:27 PM, Eric Snow wrote:

> Not all attrs are writable and it's a case-by-case situation: some of
> the ones that are writable started out read-only and changed once
> there was a valid reason.  If anything, it's arguably safer in general
> to take an immutable-by-default approach.

I'm sold.  Leave it read-only.  :)

--
~Ethan~

From ethan at stoneleaf.us  Tue Jun  7 18:46:00 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 15:46:00 -0700
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
In-Reply-To: <20160608003350.7a7c6641@x230>
References: <57572E5D.4020101@stoneleaf.us> <20160608003350.7a7c6641@x230>
Message-ID: <57574EA8.9090805@stoneleaf.us>

On 06/07/2016 02:33 PM, Paul Sokolovsky wrote:

>> This PEP proposes to deprecate that behaviour in Python 3.6, and
>> remove it entirely in Python 3.7.
>
> Why the desire to break applications of thousands and thousands of
> people? Besides, bytes(3) behavior is very logical. Everyone who knows
> what malloc(3) does also knows what bytes(3) does. Who doesn't, can
> learn, and eventually be grateful that learning Python actually helped
> them to learn other language as well.

Two reasons:

1) bytes are immutable, so creating a 3-byte 0x00 string seems
    ridiculous;

2) Python is not C, and the vagaries of malloc are not relevant to
    Python.

However, there is little point in breaking working code, so a 
deprecation without removal is fine by me.

--
~Ethan~

From ncoghlan at gmail.com  Tue Jun  7 19:03:13 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Jun 2016 16:03:13 -0700
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CAMiohohcm5nTBKd00rTLMgCTLekqK3D=GQuSWhwqEC1OHbhwNg@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org>
 <20160608003711.6149bc96@x230>
 <CAP1=2W7XSQcLHysATMV5w2fUxW=+V+-SgrYBpqSBLDupNUK-Dg@mail.gmail.com>
 <20160607175745.291e595a@subdivisions.wooz.org>
 <CAMiohohcm5nTBKd00rTLMgCTLekqK3D=GQuSWhwqEC1OHbhwNg@mail.gmail.com>
Message-ID: <CADiSq7crpoSwm-ctsC3hX8tfmGWFRg-gjeQ_Crtmi9DyRAN79Q@mail.gmail.com>

On 7 June 2016 at 15:22, Koos Zevenhoven <k7hoven at gmail.com> wrote:
> On Wed, Jun 8, 2016 at 12:57 AM, Barry Warsaw <barry at python.org> wrote:
>> On Jun 07, 2016, at 09:40 PM, Brett Cannon wrote:
>>
>>>On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky <pmiscml at gmail.com> wrote:
>>>> What's wrong with b[i:i+1] ?
>>>It always succeeds while indexing can trigger an IndexError.
>>
>> Right.  You want a method with the semantics of __getitem__() but that returns
>> the desired type.
>>
>
> And if this is called __getitem__ (with slices delegated to
> bytes.__getitem__) and implemented in a class, one has a view. Maybe
> I'm missing something, but I fail to understand what makes this
> significantly more problematic than an iterator. Ok, I guess we might
> also need __len__.

Right, it's the fact that a view is a much broader API than we need,
since most of the operations on the base type are already fine. The
two alternate operations that people are interested in are:

- like indexing, but producing bytes instead of ints
- like iteration, but producing bytes instead of ints

That said, it occurs to me that there's a reasonably strong
composability argument in favour of a view-based approach: a view will
work with operator.itemgetter() and other sequence consuming APIs,
while special methods won't. The "like-memoryview-but-not" view type
could also take any bytes-like object as input, similar to memoryview
itself.
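
A minimal sketch of what such a view might look like. ``BytesView`` is a made-up name, the class is not part of the PEP, and it assumes the wrapped object exposes a one-dimensional buffer of unsigned bytes:

```python
import operator
from collections.abc import Sequence


class BytesView(Sequence):
    """Hypothetical read-only view whose items are length-1 bytes objects."""

    def __init__(self, data):
        # memoryview accepts any bytes-like object (bytes, bytearray, ...).
        self._data = memoryview(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices behave like ordinary bytes slices.
            return bytes(self._data[index])
        # Integer indexing raises IndexError when out of range.
        return bytes([self._data[index]])


view = BytesView(bytearray(b"abc"))
first_and_last = operator.itemgetter(0, 2)(view)
as_list = list(view)    # iteration comes from the Sequence mixin
```

Because the view is an ordinary sequence, ``operator.itemgetter`` and iteration work without any further methods, which is the composability point above.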

Cheers,
Nick.

P.S. I'm starting to remember why I stopped working on this - I'm
genuinely unsure of the right way forward, so I wasn't prepared to
advocate strongly for the particular approach in the PEP :)

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From raymond.hettinger at gmail.com  Tue Jun  7 19:03:57 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 7 Jun 2016 16:03:57 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
Message-ID: <00C7A5D7-686C-45F7-9C8E-930CAB96FDFD@gmail.com>

> On Jun 7, 2016, at 10:51 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> 
> This PEP changes the default class definition namespace to ``OrderedDict``.

I think this would be a nice improvement.

> Furthermore, the order in which the attributes are defined in each class
> body will now be preserved in ``type.__definition_order__``.  This allows
> introspection of the original definition order, e.g. by class decorators.

I'm unclear on why this would be needed.  Wouldn't the OrderedDict be sufficient for preserving definition order?

Raymond

From ncoghlan at gmail.com  Tue Jun  7 19:12:15 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Jun 2016 16:12:15 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <00C7A5D7-686C-45F7-9C8E-930CAB96FDFD@gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <00C7A5D7-686C-45F7-9C8E-930CAB96FDFD@gmail.com>
Message-ID: <CADiSq7e-Rw-PnYar_JcPTnBP+C8+Jo_fREJrxAF30f43YsJD0A@mail.gmail.com>

On 7 June 2016 at 16:03, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:
>
>> On Jun 7, 2016, at 10:51 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>>
>> This PEP changes the default class definition namespace to ``OrderedDict``.
>
> I think this would be a nice improvement.
>
>> Furthermore, the order in which the attributes are defined in each class
>> body will now be preserved in ``type.__definition_order__``.  This allows
>> introspection of the original definition order, e.g. by class decorators.
>
> I'm unclear on why this would be needed.  Wouldn't the OrderedDict be sufficient for preserving definition order?

By the time decorators run, the original execution namespace is no
longer available - the contents have been copied into the class dict,
which will still be a plain dict (and there's a lot of code that calls
PyDict_* APIs on tp_dict, so replacing the latter with a subclass is
neither trivial nor particularly safe in the presence of extension
modules).
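
This can be seen with an explicit metaclass. The following is a sketch; ``Meta``, ``inspect_dict`` and the underscore attributes are illustrative names only:

```python
from collections import OrderedDict


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace)
        # The metaclass still sees the execution namespace itself.
        cls._namespace_type_seen_by_metaclass = type(namespace)
        return cls


def inspect_dict(cls):
    # A decorator only ever sees the finished class and its __dict__ proxy.
    cls._dict_type_seen_by_decorator = type(cls.__dict__)
    return cls


@inspect_dict
class Spam(metaclass=Meta):
    ham = None
    eggs = 5
```

The metaclass records ``OrderedDict``, while the decorator records the read-only ``mappingproxy`` that wraps the class's own dict.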

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pmiscml at gmail.com  Tue Jun  7 19:17:12 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 8 Jun 2016 02:17:12 +0300
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
In-Reply-To: <57574EA8.9090805@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us> <20160608003350.7a7c6641@x230>
 <57574EA8.9090805@stoneleaf.us>
Message-ID: <20160608021712.0e2e02d7@x230>

Hello,

On Tue, 07 Jun 2016 15:46:00 -0700
Ethan Furman <ethan at stoneleaf.us> wrote:

> On 06/07/2016 02:33 PM, Paul Sokolovsky wrote:
> 
> >> This PEP proposes to deprecate that behaviour in Python 3.6, and
> >> remove it entirely in Python 3.7.
> >
> > Why the desire to break applications of thousands and thousands of
> > people? Besides, bytes(3) behavior is very logical. Everyone who
> > knows what malloc(3) does also knows what bytes(3) does. Who
> > doesn't, can learn, and eventually be grateful that learning Python
> > actually helped them to learn other language as well.
> 
> Two reasons:
> 
> 1) bytes are immutable, so creating a 3-byte 0x00 string seems
>     ridiculous;

There's nothing ridiculous in sending N zero bytes over the network,
writing them to a file, or transferring them to a hardware device. That
does raise questions, e.g. how to (efficiently) fill a subsection of a
bytearray with something other than 0, and how to apply all that
consistently to array.array, but I don't even want to bring those up,
because the answer will be "we first need to deal with the subjects of
this PEP".
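
For reference, one way to fill part of a bytearray today is slice assignment. This is a sketch only: it builds a temporary bytes object for the fill pattern, so it makes no claim about being the most efficient option.

```python
buf = bytearray(8)                    # eight zero bytes

# Slice assignment with a repeated pattern; the lengths match,
# so the bytearray keeps its size.
start, stop = 2, 5
buf[start:stop] = b"\xff" * (stop - start)

# Writing through a memoryview does the same, and cannot resize the buffer.
view = memoryview(buf)
view[6:8] = b"\xaa\xaa"
view.release()
```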

> 
> 2) Python is not C, and the vagaries of malloc are not relevant to
>     Python.

Yes, but Python has always had some traits nicely similar to C, (%
formatting, os.read/write at the fingertips, this bytes/bytearray
constructor, etc.), and that certainly catered for sizable share of its
audience. It's nice that nowadays Python is truly multi-paradigm and
taught to pre-schools and used by folks who know how to analyze data
much better than how to allocate memory to hold that data in the first
place. But hopefully people who used Python since 1.x as a nice
system-level integration language, concise without much ambiguity
(definitely less than other languages, maybe COBOL excluded) shouldn't
suffer and have their stuff broken.

> 
> However, there is little point in breaking working code, so a 
> deprecation without removal is fine by me.

Thanks.

> 
> --
> ~Ethan~

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From ericsnowcurrently at gmail.com  Tue Jun  7 20:50:50 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 17:50:50 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
Message-ID: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>

I've grabbed a PEP # (520) and updated the PEP to clarify points that
were brought up earlier today.  Given positive feedback I got at PyCon
and the reaction today, I'm hopeful the PEP isn't far off from
pronouncement. :)

-eric

==========================================

PEP: 520
Title: Ordered Class Definition Namespace
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 7-Jun-2016
Python-Version: 3.6
Post-History: 7-Jun-2016

Abstract
========

This PEP changes the default class definition namespace to ``OrderedDict``.
Furthermore, the order in which the attributes are defined in each class
body will now be preserved in ``type.__definition_order__``.  This allows
introspection of the original definition order, e.g. by class decorators.

Note: just to be clear, this PEP is *not* about changing ``__dict__`` for
classes to ``OrderedDict``.

Motivation
==========

Currently the namespace used during execution of a class body defaults
to ``dict``.  If the metaclass defines ``__prepare__()`` then the result
of calling it is used.  Thus, before this PEP, if you needed your class
definition namespace to be ``OrderedDict`` you had to use a metaclass.

Metaclasses introduce an extra level of complexity to code and in some
cases (e.g. conflicts) are a problem.  So reducing the need for them is
worth doing when the opportunity presents itself.  Given that we now have
a C implementation of ``OrderedDict`` and that ``OrderedDict`` is the
common use case for ``__prepare__()``, we have such an opportunity by
defaulting to ``OrderedDict``.

The usefulness of ``OrderedDict``-by-default is greatly increased if the
definition order is directly introspectable on classes afterward,
particularly by code that is independent of the original class definition.
One of the original motivating use cases for this PEP is generic class
decorators that make use of the definition order.

Changing the default class definition namespace has been discussed a
number of times, including on the mailing lists and in PEP 422 and
PEP 487 (see the References section below).

Specification
=============

* the default class *definition* namespace is now ``OrderedDict``
* the order in which class attributes are defined is preserved in the
  new ``__definition_order__`` attribute on each class
* "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored
* ``__definition_order__`` is a tuple
* ``__definition_order__`` is a read-only attribute
* ``__definition_order__`` is always set:

  1. if ``__definition_order__`` is defined in the class body then the
     value is used as-is, though the attribute will still be read-only
  2. types that do not have a class definition (e.g. builtins) have
     their ``__definition_order__`` set to ``None``
  3. types for which ``__prepare__()`` returned something other than
     ``OrderedDict`` (or a subclass) have their ``__definition_order__``
     set to ``None`` (except where #1 applies)

The following code demonstrates roughly equivalent semantics::

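   from collections import OrderedDict

   class Meta(type):
       @classmethod
       def __prepare__(mcls, *args, **kwargs):
           return OrderedDict()

   class Spam(metaclass=Meta):
       ham = None
       eggs = 5
       # Record every non-dunder name defined so far, in definition order.
       __definition_order__ = tuple(k for k in locals()
                                    if (not k.startswith('__') or
                                        not k.endswith('__')))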

Note that [pep487_] proposes a similar solution, albeit as part of a
broader proposal.

Why a tuple?
------------

Use of a tuple reflects the fact that we are exposing the order in
which attributes on the class were *defined*.  Since the definition
is already complete by the time ``__definition_order__`` is set, the
content and order of the value won't be changing.  Thus we use a type
that communicates that state of immutability.

Why a read-only attribute?
--------------------------

As with the use of tuple, making ``__definition_order__`` a read-only
attribute communicates the fact that the information it represents is
complete.  Since it represents the state of a particular one-time event
(execution of the class definition body), allowing the value to be
replaced would reduce confidence that the attribute corresponds to the
original class body.

If a use case for a writable (or mutable) ``__definition_order__``
arises, the restriction may be loosened later.  Presently this seems
unlikely and furthermore it is usually best to go immutable-by-default.

Note that ``__definition_order__`` is centered on the class definition
body.  The use cases for dealing with the class namespace (``__dict__``)
post-definition are a separate matter.  ``__definition_order__`` would
be a significantly misleading name for a supporting feature.

See [nick_concern_] for more discussion.

Why ignore "dunder" names?
--------------------------

Names starting and ending with "__" are reserved for use by the
interpreter.  In practice they should not be relevant to the users of
``__definition_order__``.  Instead, for nearly everyone they would only
be clutter, causing the same extra work for everyone.

Why is __definition_order__ even necessary?
-------------------------------------------

Since the definition order is not preserved in ``__dict__``, it would be
lost once class definition execution completes.  Classes *could*
explicitly set the attribute as the last thing in the body.  However,
then independent decorators could only make use of classes that had done
so.  Instead, ``__definition_order__`` preserves this one bit of info
from the class body so that it is universally available.
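
As a rough illustration of the decorator use case (a sketch only, not the
reference implementation): the metaclass ``OrderedMeta`` and the decorator
``record_fields`` below are hypothetical names.  The metaclass emulates the
proposed behavior through ``__prepare__()``, so the decorator can read the
order without the class body having to set the attribute itself.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that emulates the proposed default behavior."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # Use an ordered mapping as the class definition namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwds)
        # Record the non-dunder names in the order they were first bound.
        cls.__definition_order__ = tuple(
            k for k in namespace
            if not (k.startswith('__') and k.endswith('__')))
        return cls


def record_fields(cls):
    """Hypothetical class decorator that consumes the definition order."""
    cls._fields = getattr(cls, '__definition_order__', None)
    return cls


@record_fields
class Spam(metaclass=OrderedMeta):
    ham = None
    eggs = 5

    def cook(self):
        pass
```

The decorator never inspects the class body; it relies only on the
attribute being present on the finished class.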

Compatibility
=============

This PEP does not break backward compatibility, except in the case that
someone relies *strictly* on ``dict`` as the class definition namespace.
This shouldn't be a problem.

Changes
=============

In addition to the class syntax, the following expose the new behavior:

* builtins.__build_class__
* types.prepare_class
* types.new_class

Other Python Implementations
============================

Pending feedback, the impact on Python implementations is expected to
be minimal.  If a Python implementation cannot support switching to
``OrderedDict``-by-default then it can always set ``__definition_order__``
to ``None``.

Implementation
==============

The implementation is found in the tracker. [impl_]

Alternatives
============

type.__dict__ as OrderedDict
----------------------------

Instead of storing the definition order in ``__definition_order__``,
the now-ordered definition namespace could be copied into a new
``OrderedDict``.  This would mostly provide the same semantics.

However, using ``OrderedDict`` for ``type.__dict__`` would obscure the
relationship with the definition namespace, making it less useful.
Additionally, doing this would require significant changes to the
semantics of the concrete ``dict`` C-API.

A "namespace" Keyword Arg for Class Definition
----------------------------------------------

PEP 422 introduced a new "namespace" keyword arg to class definitions
that effectively replaces the need for ``__prepare__()``. [pep422_]
However, the proposal was withdrawn in favor of the simpler PEP 487.

References
==========

.. [impl] issue #24254
   (https://bugs.python.org/issue24254)

.. [nick_concern] Nick's concerns about mutability
   (https://mail.python.org/pipermail/python-dev/2016-June/144883.html)

.. [pep422] PEP 422
   (https://www.python.org/dev/peps/pep-0422/#order-preserving-classes)

.. [pep487] PEP 487
   (https://www.python.org/dev/peps/pep-0487/#defining-arbitrary-namespaces)

.. [orig] original discussion
   (https://mail.python.org/pipermail/python-ideas/2013-February/019690.html)

.. [followup1] follow-up 1
   (https://mail.python.org/pipermail/python-dev/2013-June/127103.html)

.. [followup2] follow-up 2
   (https://mail.python.org/pipermail/python-dev/2015-May/140137.html)

Copyright
===========
This document has been placed in the public domain.

From steve at pearwood.info  Tue Jun  7 21:09:57 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 8 Jun 2016 11:09:57 +1000
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7CN-3zouXqnSN46WSe36D0O=rYJ7epm8kjiCe69og+x3g@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <nj73vn$mgm$1@ger.gmane.org>
 <CALFfu7CN-3zouXqnSN46WSe36D0O=rYJ7epm8kjiCe69og+x3g@mail.gmail.com>
Message-ID: <20160608010957.GG12028@ando.pearwood.info>

On Tue, Jun 07, 2016 at 11:39:06AM -0700, Eric Snow wrote:
> On Tue, Jun 7, 2016 at 11:32 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> > On 6/7/2016 1:51 PM, Eric Snow wrote:
> >
> >> Note: just to be clear, this PEP is *not* about changing
> >
> >> ``type.__dict__`` to ``OrderedDict``.
> >
> > By 'type', do you mean the one and only object named 'type' or the class
> > being defined?  To be really clear, will the following change?
> >
> >>>> class C: pass
> >
> >>>> type(C.__dict__)
> > <class 'mappingproxy'>
> 
> I mean the latter, "type" -> the class being defined.

Could you clarify that in the PEP please? Like Terry, I too found it 
unclear. I think there are a couple of places where you refer to `type` 
and it isn't clear whether you mean builtins.type or something else.

-- 
Steve

From ethan at stoneleaf.us  Tue Jun  7 21:20:38 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 Jun 2016 18:20:38 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
Message-ID: <575772E6.7040906@stoneleaf.us>

On 06/07/2016 05:50 PM, Eric Snow wrote:

Overall +1.  Some nits below.

> Specification
> =============

>    3. types for which ``__prepare__()`` returned something other than
>       ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>       set to ``None``

        (unless ``__definition_order__`` is present in
        the class dict either by virtue of being in the class body or
        because the metaclass inserted it before calling
        ``type.__new__``)

>         __definition_order__ = tuple(k for k in locals()
>                                      if (!k.startswith('__') or
>                                          !k.endswith('__')))

Still mixing C and Python!  ;)

> Why a tuple?
> ------------
>
> Use of a tuple reflects the fact that we are exposing the order in
> which attributes on the class were *defined*.  Since the definition
> is already complete by the time ``__definition_order__`` is set, the
> content and order of the value won't be changing.  Thus we use a type
> that communicates that state of immutability.

> Why a read-only attribute?
> --------------------------
>
> As with the use of tuple, making ``__definition_order__`` a read-only
> attribute communicates the fact that the information it represents is
> complete.  Since it represents the state of a particular one-time event
> (execution of the class definition body), allowing the value to be
> replaced would reduce confidence that the attribute corresponds to the
> original class body.
>
> If a use case for a writable (or mutable) ``__definition_order__``
> arises, the restriction may be loosened later.  Presently this seems
> unlikely and furthermore it is usually best to go immutable-by-default.

If __definition_order__ is supposed to be immutable as well as read-only
then we should convert non-tuples to tuples.  No point in letting that
user bug slip through.

> Why ignore "dunder" names?
> --------------------------
>
> Names starting and ending with "__" are reserved for use by the
> interpreter.  In practice they should not be relevant to the users of
> ``__definition_order__``.  Instead, for early everyone they would only

s/early/nearly

> Why is __definition_order__ even necessary?
> -------------------------------------------
>
> Since the definition order is not preserved in ``__dict__``, it would be
> lost once class definition execution completes.  Classes *could*
> explicitly set the attribute as the last thing in the body.  However,
> then independent decorators could only make use of classes that had done
> so.  Instead, ``__definition_order__`` preserves this one bit of info
> from the class body so that it is universally available.

s/would be/is

--
~Ethan~

From vadmium+py at gmail.com  Tue Jun  7 22:01:14 2016
From: vadmium+py at gmail.com (Martin Panter)
Date: Wed, 8 Jun 2016 02:01:14 +0000
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <CA+eR4cE9yVqoFrk6xi_VXp1+ZYqjy-15VNHosCvCVf_Vc5UANg@mail.gmail.com>

On 7 June 2016 at 20:28, Ethan Furman <ethan at stoneleaf.us> wrote:
> Addition of explicit "single byte" constructors
> -----------------------------------------------
>
> As binary counterparts to the text ``chr`` function, this PEP proposes the
> addition of an explicit ``byte`` alternative constructor as a class method
> on both ``bytes`` and ``bytearray``::
>
>     >>> bytes.byte(3)
>     b'\x03'
>     >>> bytearray.byte(3)
>     bytearray(b'\x03')

Bytes.byte() is a great idea. But what's the point or use case of
bytearray.byte(), a mutable array of one pre-defined byte?

> Addition of optimised iterator methods that produce ``bytes`` objects
> ---------------------------------------------------------------------
>
> This PEP proposes that ``bytes``, ``bytearray`` and ``memoryview`` gain an
> optimised ``iterbytes`` method that produces length 1 ``bytes`` objects
> rather than integers::
>
>     for x in data.iterbytes():
>         # x is a length 1 ``bytes`` object, rather than an integer

Might be good to have an example with concrete output, so you see the
one-byte strings coming out of it.

>>> tuple(b"ABC".iterbytes())
(b'A', b'B', b'C')
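
Since ``iterbytes`` is only proposed here and is not an existing method, the
behavior can be approximated today with length-1 slices. A minimal sketch,
where the free function iterbytes() is a hypothetical stand-in for the
proposed method:

```python
def iterbytes(data):
    """Yield each element of a bytes-like object as a length-1 bytes object.

    Hypothetical stand-in for the proposed method; it relies on slicing,
    which returns a sequence rather than an integer.
    """
    for i in range(len(data)):
        yield bytes(data[i:i + 1])


as_ints = list(b"ABC")                 # plain iteration yields integers
as_bytes = tuple(iterbytes(b"ABC"))    # the helper yields one-byte strings
```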

From steve at pearwood.info  Tue Jun  7 23:09:15 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 8 Jun 2016 13:09:15 +1000
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
In-Reply-To: <20160608021712.0e2e02d7@x230>
References: <57572E5D.4020101@stoneleaf.us> <20160608003350.7a7c6641@x230>
 <57574EA8.9090805@stoneleaf.us> <20160608021712.0e2e02d7@x230>
Message-ID: <20160608030915.GI12028@ando.pearwood.info>

On Wed, Jun 08, 2016 at 02:17:12AM +0300, Paul Sokolovsky wrote:
> Hello,
> 
> On Tue, 07 Jun 2016 15:46:00 -0700
> Ethan Furman <ethan at stoneleaf.us> wrote:
> 
> > On 06/07/2016 02:33 PM, Paul Sokolovsky wrote:
> > 
> > >> This PEP proposes to deprecate that behaviour in Python 3.6, and
> > >> remove it entirely in Python 3.7.
> > >
> > > Why the desire to break applications of thousands and thousands of
> > > people? 

I'm not so sure that *thousands* of people are relying on this 
behaviour, but your point is taken that it is a backwards-incompatible 
change.

> > > Besides, bytes(3) behavior is very logical. Everyone who
> > > knows what malloc(3) does also knows what bytes(3) does.

Most Python coders are not C coders. Knowing C is not and should not be 
a pre-requisite for using Python.

> > > Who
> > > doesn't, can learn, and eventually be grateful that learning Python
> > > actually helped them to learn other language as well.

I really don't think that learning Python will help with C.

> > Two reasons:
> > 
> > 1) bytes are immutable, so creating a 3-byte 0x00 string seems
> >     ridiculous;
> 
> There's nothing ridiculous in sending N zero bytes over network,
> writing to a file, transferring to a hardware device.

True, but there is a good way of writing N identical bytes, not limited 
to nulls, using the replication operator:

py> b'\xff'*10
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'

which is more useful than `bytes(10)` since that can only produce 
zeroes.

> That however
> raises questions e.g. how to (efficiently) fill a (subsection) of
> bytearray with something but a 0

Slicing.

py> b = bytearray(10)
py> b[4:4] = b'\xff'*4
py> b
bytearray(b'\x00\x00\x00\x00\xff\xff\xff\xff\x00\x00\x00\x00\x00\x00')
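
Note that the slice `b[4:4]` above is empty, so the assignment inserts four
bytes and the array grows to 14. A minimal sketch of an in-place fill,
assuming the goal is to overwrite bytes 4 through 7 without changing the
length:

```python
# Overwrite a subsection: the slice and the replacement have the same length,
# so the bytearray keeps its size.
buf = bytearray(10)
buf[4:8] = b'\xff' * 4

# For comparison, assigning to an empty slice inserts instead of overwriting.
grown = bytearray(10)
grown[4:4] = b'\xff' * 4
```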

-- 
Steve

From ericsnowcurrently at gmail.com  Tue Jun  7 23:17:16 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 20:17:16 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <575772E6.7040906@stoneleaf.us>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <575772E6.7040906@stoneleaf.us>
Message-ID: <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>

On Tue, Jun 7, 2016 at 6:20 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 06/07/2016 05:50 PM, Eric Snow wrote:
>>         __definition_order__ = tuple(k for k in locals()
>>                                      if (!k.startswith('__') or
>>                                          !k.endswith('__')))
>
>
> Still mixing C and Python!  ;)

I knew I was missing something!

>
>
>> Why a tuple?
>> ------------
>>
>> Use of a tuple reflects the fact that we are exposing the order in
>> which attributes on the class were *defined*.  Since the definition
>> is already complete by the time ``__definition_order__`` is set, the
>> content and order of the value won't be changing.  Thus we use a type
>> that communicates that state of immutability.
>
>
>> Why a read-only attribute?
>> --------------------------
>>
>> As with the use of tuple, making ``__definition_order__`` a read-only
>> attribute communicates the fact that the information it represents is
>> complete.  Since it represents the state of a particular one-time event
>> (execution of the class definition body), allowing the value to be
>> replaced would reduce confidence that the attribute corresponds to the
>> original class body.
>>
>> If a use case for a writable (or mutable) ``__definition_order__``
>> arises, the restriction may be loosened later.  Presently this seems
>> unlikely and furthermore it is usually best to go immutable-by-default.
>
>
> If __definition_order__ is supposed to be immutable as well as read-only
> then we should convert non-tuples to tuples.  No point in letting that
> user bug slip through.

Do you mean if a class explicitly defines __definition_order__?  If
so, I'm not clear on how that would work.  It could be set to
anything, including None or a value that does not iterate into a
definition order.  If someone explicitly set __definition_order__ then
I think it should be used as-is.

>
>
>> Why ignore "dunder" names?
>> --------------------------
>>
>> Names starting and ending with "__" are reserved for use by the
>> interpreter.  In practice they should not be relevant to the users of
>> ``__definition_order__``.  Instead, for early everyone they would only
>
>
> s/early/nearly

fixed

>
>> Why is __definition_order__ even necessary?
>> -------------------------------------------
>>
>> Since the definition order is not preserved in ``__dict__``, it would be
>> lost once class definition execution completes.  Classes *could*
>> explicitly set the attribute as the last thing in the body.  However,
>> then independent decorators could only make use of classes that had done
>> so.  Instead, ``__definition_order__`` preserves this one bit of info
>> from the class body so that it is universally available.
>
>
> s/would be/is

fixed

Thanks!

-eric

From ericsnowcurrently at gmail.com  Tue Jun  7 23:17:54 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 7 Jun 2016 20:17:54 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <20160608010957.GG12028@ando.pearwood.info>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <nj73vn$mgm$1@ger.gmane.org>
 <CALFfu7CN-3zouXqnSN46WSe36D0O=rYJ7epm8kjiCe69og+x3g@mail.gmail.com>
 <20160608010957.GG12028@ando.pearwood.info>
Message-ID: <CALFfu7BesxbWX7-g-9SRiibj_Tyn4rW4giKyhfobPtF5BaUZhw@mail.gmail.com>

On Tue, Jun 7, 2016 at 6:09 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Tue, Jun 07, 2016 at 11:39:06AM -0700, Eric Snow wrote:
>> I mean the latter, "type" -> the class being defined.
>
> Could you clarify that in the PEP please? Like Terry, I too found it
> unclear. I think there are a couple of places where you refer to `type`
> and it isn't clear whether you mean builtins.type or something else.

Yep.  Done.

-eric

From vadmium+py at gmail.com  Wed Jun  8 00:00:50 2016
From: vadmium+py at gmail.com (Martin Panter)
Date: Wed, 8 Jun 2016 04:00:50 +0000
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CADiSq7cgatAo5L7meFsX7VhDjOSBGzHwsE997+xc8gu504TQ=w@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us> <20160608003350.7a7c6641@x230>
 <CADiSq7cgatAo5L7meFsX7VhDjOSBGzHwsE997+xc8gu504TQ=w@mail.gmail.com>
Message-ID: <CA+eR4cFLWqk+EAPNJCo6HW=A6_=kYn0AVGa5QEme9thPMpF9XA@mail.gmail.com>

On 7 June 2016 at 21:56, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 7 June 2016 at 14:33, Paul Sokolovsky <pmiscml at gmail.com> wrote:
>> Ethan Furman <ethan at stoneleaf.us> wrote:
>>> Deprecation of current "zero-initialised sequence" behaviour
>>> ------------------------------------------------------------
>>>
>>> Currently, the ``bytes`` and ``bytearray`` constructors accept an
>>> integer argument and interpret it as meaning to create a
>>> zero-initialised sequence of the given size::
>>>
>>>      >>> bytes(3)
>>>      b'\x00\x00\x00'
>>>      >>> bytearray(3)
>>>      bytearray(b'\x00\x00\x00')
>>>
>>> This PEP proposes to deprecate that behaviour in Python 3.6, and
>>> remove it entirely in Python 3.7.
>>
>> Why the desire to break applications of thousands and thousands of
>> people?
>
> Same argument as any deprecation: to make existing and future defects
> easier to find or easier to debug.
>
> That said, this is the main part I was referring to in the other
> thread when I mentioned some of the constructor changes were
> potentially controversial and probably not worth the hassle - it's the
> only one with the potential to break currently working code, while the
> others are just a matter of choosing suitable names.

An argument against deprecating bytearray(n) in particular is that
this is supported in Python 2. I think I have (ab)used this fact to
work around the problem with bytes(n) in Python 2 & 3 compatible code.
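
A minimal sketch of that workaround, assuming the aim is an immutable run of
n zero bytes on both major versions (the helper name zero_bytes is
hypothetical):

```python
def zero_bytes(n):
    """Return n zero bytes as an immutable bytes object.

    bytearray(n) creates n zero bytes on both Python 2 and Python 3, so
    going through it avoids calling bytes(n) directly.
    """
    return bytes(bytearray(n))


three = zero_bytes(3)
```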

From raymond.hettinger at gmail.com  Wed Jun  8 00:27:35 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 7 Jun 2016 21:27:35 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7e-Rw-PnYar_JcPTnBP+C8+Jo_fREJrxAF30f43YsJD0A@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <00C7A5D7-686C-45F7-9C8E-930CAB96FDFD@gmail.com>
 <CADiSq7e-Rw-PnYar_JcPTnBP+C8+Jo_fREJrxAF30f43YsJD0A@mail.gmail.com>
Message-ID: <F3E9541A-CB27-4827-AA59-A96086CC8A01@gmail.com>

> On Jun 7, 2016, at 4:12 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> By the time decorators run, the original execution namespace is no
> longer available - the contents have been copied into the class dict,
> which will still be a plain dict (and there's a lot of code that calls
> PyDict_* APIs on tp_dict, so replacing the latter with a subclass is
> neither trivial nor particularly safe in the presence of extension
> modules).

That makes sense.

+1 all around.

Raymond

From storchaka at gmail.com  Wed Jun  8 01:28:03 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 8 Jun 2016 08:28:03 +0300
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
Message-ID: <nj8ad4$frh$1@ger.gmane.org>

On 07.06.16 20:51, Eric Snow wrote:
> Hi all,
>
> Following discussion a few years back (and rough approval from Guido
> [1]), I started work on using OrderedDict for the class definition
> namespace by default.  The bulk of the effort lay in implementing
> OrderedDict in C, which I got landed just in time for 3.5.  The
> remaining work was quite minimal and the actual change is quite small.
>
> My intention was to land the patch soon, having gone through code
> review during PyCon.  However, Nick pointed out to me the benefit of
> having a concrete point of reference for the change, as well as making
> sure it isn't a problem for other implementations.  So in that spirit,
> here's a PEP for the change.  Feedback is welcome, particularly from
> other implementors.

Be aware that the C implementation of OrderedDict is still not free from 
problems.

From storchaka at gmail.com  Wed Jun  8 01:42:23 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 8 Jun 2016 08:42:23 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <nj8b7v$r51$1@ger.gmane.org>

On 07.06.16 23:28, Ethan Furman wrote:
> Minor changes: updated version numbers, add punctuation.
>
> The current text seems to take into account Guido's last comments.
>
> Thoughts before asking for acceptance?
>
>
>
>
> PEP: 467
> Title: Minor API improvements for binary sequences
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nick Coghlan <ncoghlan at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2014-03-30
> Python-Version: 3.5
> Post-History: 2014-03-30 2014-08-15 2014-08-16
>
>
> Abstract
> ========
>
> During the initial development of the Python 3 language specification,
> the core ``bytes`` type for arbitrary binary data started as the mutable
> type that is now referred to as ``bytearray``. Other aspects of
> operating in the binary domain in Python have also evolved over the
> course of the Python 3 series.
>
> This PEP proposes four small adjustments to the APIs of the ``bytes``,
> ``bytearray`` and ``memoryview`` types to make it easier to operate
> entirely in the binary domain:
>
> * Deprecate passing single integer values to ``bytes`` and ``bytearray``
> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>    ``memoryview.iterbytes`` alternative iterators

"Byte" is an alias to "octet" (8-bit integer) in modern terminology. 
Iterating bytes and bytearray already produces bytes. Wouldn't this be 
confusing? Maybe name these methods "iterbytestrings", since they add 
str-like behavior?

From stephen at xemacs.org  Wed Jun  8 02:48:58 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 8 Jun 2016 15:48:58 +0900
Subject: [Python-Dev]  PEP 467:  Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <22359.49114.638710.974863@turnbull.sk.tsukuba.ac.jp>

Ethan Furman writes:

 > * Deprecate passing single integer values to ``bytes`` and
 >   ``bytearray``

Why?  This is a slightly awkward idiom compared to .zeros (EITBI etc),
but your 32-bit clock will roll over before we can actually remove it.
There are a lot of languages that do this kind of initialization of
arrays based on ``count``.  If you want to do something useful here,
add an optional argument (here in ridiculous :-) generality:

    bytes(count, tile=[0]) -> bytes(tile * count)

where ``tile`` is a Sequence of a type that is acceptable to bytes
anyway, or Sequence[int], which is treated as

    b"".join([bytes([i]) for i in tile] * count)

Interpretation of ``count`` is of course bikesheddable, with at least
one alternative interpretation (length of result bytes, with last tile
truncated if necessary).
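
Both interpretations can be sketched as ordinary functions. This is only an
illustration of the idea, not a proposed API; the names tiled_bytes and
tiled_bytes_to_length are hypothetical:

```python
def tiled_bytes(count, tile=(0,)):
    """Repeat ``tile`` ``count`` times (first interpretation)."""
    return bytes(tile) * count


def tiled_bytes_to_length(length, tile=(0,)):
    """Return exactly ``length`` bytes, truncating the last tile if needed."""
    unit = bytes(tile)
    if not unit:
        raise ValueError("tile must not be empty")
    repeats = -(-length // len(unit))  # ceiling division
    return (unit * repeats)[:length]
```

With the default tile, tiled_bytes(n) gives the same result as the current
bytes(n).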

 > * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors

This is an API break if you take the deprecation as a mandate (which
eventual removal does indicate).  And backward compatibility for
clients of the bytes API means that we violate TOOWTDI indefinitely,
on a constructor of quite specialized utility.  Yuck.

-1 on both.

Barry Warsaw writes later in thread:

 > We can't change bytes.__getitem__ but we can add another method
 > that returns single byte objects?  I think it's still a bit of a
 > pain to extract single bytes even with .iterbytes().

+1  ISTM that more than the other changes, this is the most important
one.

Steve

From leewangzhong+python at gmail.com  Wed Jun  8 03:07:26 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Wed, 8 Jun 2016 03:07:26 -0400
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
Message-ID: <CAB_e7iw+LUQ4MvWnxYGnk=e1My_cT8W-Gcf6SuzLkuHi81tv5A@mail.gmail.com>

On Jun 7, 2016 8:52 PM, "Eric Snow" <ericsnowcurrently at gmail.com> wrote:
> * the default class *definition* namespace is now ``OrderedDict``
> * the order in which class attributes are defined is preserved in the

By using an OrderedDict, names are ordered by first definition point,
rather than location of the used definition.

For example, the definition order of the following will be "x, y", even
though the definitions actually bound to the name are in order "y, x".
        class C:
            x = 0
            def y(self): return 'y'
            def x(self): return 'x'

Is that okay?
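
The underlying behavior can be seen with OrderedDict directly. A minimal
sketch, using a plain OrderedDict as a stand-in for the class definition
namespace:

```python
from collections import OrderedDict

# Stand-in for the class definition namespace of class C above.
namespace = OrderedDict()
namespace['x'] = 0            # x first bound here
namespace['y'] = 'method y'   # y bound here
namespace['x'] = 'method x'   # rebinding x keeps its original position

order = tuple(namespace)
```

Rebinding an existing key updates the value but does not move the key, which
is why the order comes out as "x, y".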

From victor.stinner at gmail.com  Wed Jun  8 04:07:27 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 8 Jun 2016 10:07:27 +0200
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
Message-ID: <CAMpsgwZJiiVo07S3vMrUvGqZ_Uc22-3GfiBMsdWZ2gtkExjG3A@mail.gmail.com>

> Abstract
> ========
>
> This PEP changes the default class definition namespace to ``OrderedDict``.
> Furthermore, the order in which the attributes are defined in each class
> body will now be preserved in ``type.__definition_order__``.  This allows
> introspection of the original definition order, e.g. by class decorators.
>
> Note: just to be clear, this PEP is *not* about changing ``__dict__`` for
> classes to ``OrderedDict``.

What is the cost in terms of performance?

What can be slower: defining a new class and/or instantiating a class?

Victor

From victor.stinner at gmail.com  Wed Jun  8 04:04:08 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 8 Jun 2016 10:04:08 +0200
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <57572E5D.4020101@stoneleaf.us>
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>

Hi,

> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> argument and interpret it as meaning to create a zero-initialised sequence
> of the given size::
> (...)
> This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
> entirely in Python 3.7.

I'm opposed to this change (presented like that). Please stop breaking
the backward compatibility in minor versions.

I have been porting Python 2 code to Python 3 for more than 2 years. At first,
Python 3 only proposed to immediately drop Python 2 support using the
2to3 tool. It simply doesn't work because you must port incrementally
all dependencies, so you must write code working with Python 2 and
Python 3 using the same code base. A few people tried to duplicate
repositories, projects, project name, etc. to have one version for
Python 2 and one version for Python 3, but IMHO it's even worse. It's
very difficult to handle dependencies using that.

It took a few years until six was widely used and pip was popular
enough to be able to add six as a *dependency* (and not put an old
copy in the project).

Basically, you propose to introduce a backward incompatible change for
free (I fail to see the benefit of replacing bytes(n) with
bytes.zeros(n)) and without obvious way to write code compatible with
Python <= 3.6 and Python >= 3.7.

Moreover, a single cycle is way too short to port all code in the wild.

It's common that users complain that Python core developers like
breaking the compatibility at each release. Recently, I saw a list of
applications which need to be ported to Python 3.5, while they work
perfectly on Python 3.4.

*If* you still want to deprecate bytes(n), you must introduce a
helper working on *all* Python versions. Obviously, the helper must be
available and work for Python 2.7. Maybe it can be the six module.
Maybe something else.

In Perl 5, there is a nice "use 5.12;" pragma to explicitly ask to
keep the compatibility with Perl 5.12. This pragma allows changing
the language more easily, since you can port code file by file. I
don't know if it's technically possible in Python, maybe not for all
kinds of backward incompatible changes.

Victor

From victor.stinner at gmail.com  Wed Jun  8 04:26:34 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 8 Jun 2016 10:26:34 +0200
Subject: [Python-Dev] C99
In-Reply-To: <CAP7+vJJuSreWFH=ojBphsYsBXt8cz2zoc3=6Pmh=KPu7E-VOaA@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <niv32b$d7k$1@ger.gmane.org>
 <CAP7+vJKNu58iKeXJ-=pZeXp_c0B=JERPgHjxYiQTf9X7sbJRqA@mail.gmail.com>
 <CAMpsgwbq6e9Va9koS884LuR5Ly9wmoDhuv=CTGJMONiO152XFw@mail.gmail.com>
 <837505337487020527.659630sturla.molden-gmail.com@news.gmane.org>
 <CAP7+vJJuSreWFH=ojBphsYsBXt8cz2zoc3=6Pmh=KPu7E-VOaA@mail.gmail.com>
Message-ID: <CAMpsgwaj1aLDTzUrHYczY65m38yKT9aCvZ+R0mBpRmTrSqESZQ@mail.gmail.com>

I guess that as usual, we should use the "common denominator" of all
compilers supported by CPython. For example, if MSVC doesn't support a
feature, we should not use it in CPython.

In practice, it's easy to check if a feature is supported or not: we
have buildbots building Python at each commit. It was very common to
get a compilation error only on MSVC when a variable was defined in
the middle of a function. We are now using
-Werror=declaration-after-statement with GCC because of MSVC!

Maybe GCC has an option to ask for the subset of the C99 standard
compatible with MSVC? Something like "-std=c99 -pedantic"?

Note: I tried -pedantic, GCC emits a lot of warnings on code which
looks valid and/or is protected with #ifdef for features specific to
GCC like computed goto.

Victor

2016-06-07 21:45 GMT+02:00 Guido van Rossum <gvanrossum at gmail.com>:
> We should definitely keep supporting MSVC.
>
> --Guido (mobile)
>
> On Jun 7, 2016 12:39 PM, "Sturla Molden" <sturla.molden at gmail.com> wrote:
>>
>> Victor Stinner <victor.stinner at gmail.com> wrote:
>>
>> > Is it worth to support a compiler that in 2016 doesn't support the C
>> > standard released in 1999, 17 years ago?
>>
>> MSVC only supports C99 when it's needed for C++11 or some MS extension to
>> C.
>>
>> Is it worth supporting MSVC? If not, Intel C, Clang and Cygwin GCC are
>> the viable options we have on Windows (and perhaps Embarcadero, but I
>> haven't used C++ builder for a very long time). Even MinGW does not fully
>> support C99, because it depends on Microsoft's CRT. If we think MSVC and
>> MinGW are worth supporting, we cannot just use C99 indiscriminately.
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>

From storchaka at gmail.com  Wed Jun  8 04:53:06 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 8 Jun 2016 11:53:06 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
Message-ID: <nj8mdj$7rg$1@ger.gmane.org>

On 08.06.16 11:04, Victor Stinner wrote:
>> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
>> argument and interpret it as meaning to create a zero-initialised sequence
>> of the given size::
>> (...)
>> This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
>> entirely in Python 3.7.
>
> I'm opposed to this change (presented like that). Please stop breaking
> the backward compatibility in minor versions.

The argument for deprecating bytes(n) is that it has a different meaning
in Python 2, so there is a risk of making a mistake when backporting code
to Python 2 or writing 2+3 compatible code. This argument does not
apply to bytearray(n).

> *If* you still want to deprecate bytes(n), you must introduce a
> helper working on *all* Python versions. Obviously, the helper must be
> available and work for Python 2.7. Maybe it can be the six module.
> Maybe something else.

The obvious way to create a bytes object of length n is b'\0' * n. It
works in all Python versions starting from 2.6. I don't see the need for
bytes(n) or bytes.zeros(n). There are no special methods for creating a
list or a string of size n.
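
To make the comparison concrete, here is a minimal sketch for Python 3. The variable names are only illustrative. The Python 2 behaviour is mentioned in a comment rather than executed, since there bytes is an alias of str.

```python
n = 8

# Repetition idiom: spelled the same way on Python 2.6+ and Python 3.
zeros_by_repeat = b'\0' * n

# Constructor form: on Python 3 this yields n zero bytes.
# On Python 2, bytes is str, so bytes(8) would be the string '8' instead.
zeros_by_ctor = bytes(n)

assert zeros_by_repeat == zeros_by_ctor
assert len(zeros_by_repeat) == n

# Lists and strings of a given size are built with the same repetition idiom.
blank_list = [0] * n
blank_str = ' ' * n
```
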

From storchaka at gmail.com  Wed Jun  8 05:11:40 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 8 Jun 2016 12:11:40 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CADiSq7crpoSwm-ctsC3hX8tfmGWFRg-gjeQ_Crtmi9DyRAN79Q@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <20160607173119.36961fcf.barry@wooz.org> <20160608003711.6149bc96@x230>
 <CAP1=2W7XSQcLHysATMV5w2fUxW=+V+-SgrYBpqSBLDupNUK-Dg@mail.gmail.com>
 <20160607175745.291e595a@subdivisions.wooz.org>
 <CAMiohohcm5nTBKd00rTLMgCTLekqK3D=GQuSWhwqEC1OHbhwNg@mail.gmail.com>
 <CADiSq7crpoSwm-ctsC3hX8tfmGWFRg-gjeQ_Crtmi9DyRAN79Q@mail.gmail.com>
Message-ID: <nj8ngc$p6i$1@ger.gmane.org>

On 08.06.16 02:03, Nick Coghlan wrote:
> That said, it occurs to me that there's a reasonably strong
> composability argument in favour of a view-based approach: a view will
> work with operator.itemgetter() and other sequence consuming APIs,
> while special methods won't. The "like-memoryview-but-not" view type
> could also take any bytes-like object as input, similar to memoryview
> itself.

Something like:

class chunks:
     def __init__(self, seq, size):
         self._seq = seq
         self._size = size

     def __len__(self):
         # number of complete chunks
         return len(self._seq) // self._size

     def __getitem__(self, i):
         # chunk i covers items [i * size, (i + 1) * size)
         start = i * self._size
         chunk = self._seq[start: start + self._size]
         if len(chunk) != self._size:
             raise IndexError
         return chunk

(but needs more checks and slices support).

It would be useful for general sequences too.
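
To illustrate the composability point, here is a self-contained sketch of such a view used with operator.itemgetter() and with plain iteration. The class name ChunkView and its negative-index handling are assumptions made for this example, not a proposed API. Slices are still not supported.

```python
from operator import itemgetter


class ChunkView:
    """Hypothetical read-only view of a sequence as consecutive chunks."""

    def __init__(self, seq, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self._seq = seq
        self._size = size

    def __len__(self):
        # Only complete chunks are counted; a trailing partial chunk is ignored.
        return len(self._seq) // self._size

    def __getitem__(self, i):
        n = len(self)
        if i < 0:
            i += n  # support negative indices like ordinary sequences
        if not 0 <= i < n:
            raise IndexError(i)
        start = i * self._size
        return self._seq[start:start + self._size]


view = ChunkView(b"abcdefgh", 2)

# itemgetter only needs __getitem__, so the view works with it directly.
first_and_last = itemgetter(0, -1)(view)

# Iteration falls back to __getitem__ with 0, 1, 2, ... until IndexError.
as_list = list(view)

# Nothing here is bytes-specific: any sliceable sequence works.
text_chunks = list(ChunkView("python", 3))
```
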

From pmiscml at gmail.com  Wed Jun  8 06:37:37 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 8 Jun 2016 13:37:37 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <nj8mdj$7rg$1@ger.gmane.org>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
 <nj8mdj$7rg$1@ger.gmane.org>
Message-ID: <20160608133737.63e6c666@x230>

Hello,

On Wed, 8 Jun 2016 11:53:06 +0300
Serhiy Storchaka <storchaka at gmail.com> wrote:

> On 08.06.16 11:04, Victor Stinner wrote:
> >> Currently, the ``bytes`` and ``bytearray`` constructors accept an
> >> integer argument and interpret it as meaning to create a
> >> zero-initialised sequence of the given size::
> >> (...)
> >> This PEP proposes to deprecate that behaviour in Python 3.6, and
> >> remove it entirely in Python 3.7.
> >
> > I'm opposed to this change (presented like that). Please stop
> > breaking the backward compatibility in minor versions.
> 
> The argument for deprecating bytes(n) is that this has different
> meaning in Python 2,

That's an artifact (as in: defect) of "bytes" (apparently) being a flat
alias of "str" in Python2, without trying to validate its arguments. It
would be sad if thinkos in the Python2 implementation dictated how Python3
should work. It's not too late to fix it in Python2 by issuing a CVE
along the lines of "Lack of argument validation in Python2 bytes()
constructor may lead to insecure code."

> and when backport a code to Python 2 or write
> 2+3 compatible code there is a risk to make a mistake. This argument
> is not applicable to bytearray(n).
> 
> > *If* you still want to deprecate bytes(n), you must introduce a
> > helper working on *all* Python versions. Obviously, the helper must
> > be available and work for Python 2.7. Maybe it can be the six
> > module. Maybe something else.
> 
> The obvious way to create the bytes object of length n is b'\0' * n.

That's very inefficient: it requires allocating a useless b'\0', then a
generic function to repeat an arbitrary memory block N times. If we want
Python not to be laughed at for being SLOW, there had better be
efficient ways to deal with blocks of binary data.

> It works in all Python versions starting from 2.6. I don't see the
> need in bytes(n) and bytes.zeros(n). There are no special methods for
> creating a list or a string of size n.

See above, unless you specifically mean having bytearray.zeros() and not
having bytes.zeros(). But then the whole purpose of the presented PEP is
to make the API more, not less, consistent. Having random gaps in the
bytes vs bytearray API isn't going to help anyone.

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From storchaka at gmail.com  Wed Jun  8 07:05:19 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 8 Jun 2016 14:05:19 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160608133737.63e6c666@x230>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
 <nj8mdj$7rg$1@ger.gmane.org> <20160608133737.63e6c666@x230>
Message-ID: <nj8u5g$7gn$1@ger.gmane.org>

On 08.06.16 13:37, Paul Sokolovsky wrote:
>> The obvious way to create the bytes object of length n is b'\0' * n.
>
> That's very inefficient: it requires allocating useless b'\0', then a
> generic function to repeat arbitrary memory block N times. If there's a
> talk of Python to not be laughed at for being SLOW, there would rather
> be efficient ways to deal with blocks of binary data.

Do you have any evidence for this claim?

$ ./python -m timeit -s 'n = 10000' -- 'bytes(n)'
1000000 loops, best of 3: 1.32 usec per loop
$ ./python -m timeit -s 'n = 10000' -- 'b"\0" * n'
1000000 loops, best of 3: 0.858 usec per loop
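
The same comparison can be sketched from within Python using the timeit module (the globals argument needs Python 3.5+). The loop counts below are arbitrary choices for this example, and results depend on machine and build, so no expected numbers are shown.

```python
import timeit

n = 10000

# Best of three runs of 10000 calls each, mirroring "best of 3" above.
t_ctor = min(timeit.repeat('bytes(n)', globals={'n': n},
                           repeat=3, number=10000))
t_repeat = min(timeit.repeat(r'b"\0" * n', globals={'n': n},
                             repeat=3, number=10000))

print('bytes(n):  %.3f usec per loop' % (t_ctor / 10000 * 1e6))
print('b"\\0" * n: %.3f usec per loop' % (t_repeat / 10000 * 1e6))
```
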

From pmiscml at gmail.com  Wed Jun  8 07:26:45 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 8 Jun 2016 14:26:45 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <nj8u5g$7gn$1@ger.gmane.org>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
 <nj8mdj$7rg$1@ger.gmane.org> <20160608133737.63e6c666@x230>
 <nj8u5g$7gn$1@ger.gmane.org>
Message-ID: <20160608142645.71884fc1@x230>

Hello,

On Wed, 8 Jun 2016 14:05:19 +0300
Serhiy Storchaka <storchaka at gmail.com> wrote:

> On 08.06.16 13:37, Paul Sokolovsky wrote:
> >> The obvious way to create the bytes object of length n is b'\0' *
> >> n.
> >
> > That's very inefficient: it requires allocating useless b'\0', then
> > a generic function to repeat arbitrary memory block N times. If
> > there's a talk of Python to not be laughed at for being SLOW, there
> > would rather be efficient ways to deal with blocks of binary data.
> 
> Do you have any evidences for this claim?

Yes, it's written above, let me repeat it: bytes(n) is (can be)
calloc(1, n) under the hood, while b"\0" * n is a more complex algorithm.

> 
> $ ./python -m timeit -s 'n = 10000' -- 'bytes(n)'
> 1000000 loops, best of 3: 1.32 usec per loop
> $ ./python -m timeit -s 'n = 10000' -- 'b"\0" * n'
> 1000000 loops, best of 3: 0.858 usec per loop

I don't know how inefficient CPython's bytes(n) is or how efficient its
repetition is (maybe 1-byte repetitions are optimized into memset()?), but
MicroPython (where bytes(n) is truly calloc(n)) gives the expected results:

$ ./run-bench-tests bench/bytealloc*
bench/bytealloc:
    3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
    11.244s (+237.35%) bench/bytealloc-2-repeat.py

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From storchaka at gmail.com  Wed Jun  8 07:45:22 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 8 Jun 2016 14:45:22 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160608142645.71884fc1@x230>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
 <nj8mdj$7rg$1@ger.gmane.org> <20160608133737.63e6c666@x230>
 <nj8u5g$7gn$1@ger.gmane.org> <20160608142645.71884fc1@x230>
Message-ID: <nj90gi$e2s$1@ger.gmane.org>

On 08.06.16 14:26, Paul Sokolovsky wrote:
> On Wed, 8 Jun 2016 14:05:19 +0300
> Serhiy Storchaka <storchaka at gmail.com> wrote:
>
>> On 08.06.16 13:37, Paul Sokolovsky wrote:
>>>> The obvious way to create the bytes object of length n is b'\0' *
>>>> n.
>>>
>>> That's very inefficient: it requires allocating useless b'\0', then
>>> a generic function to repeat arbitrary memory block N times. If
>>> there's a talk of Python to not be laughed at for being SLOW, there
>>> would rather be efficient ways to deal with blocks of binary data.
>>
>> Do you have any evidences for this claim?
>
> Yes, it's written above, let me repeat it: bytes(n) is (can be)
> calloc(1, n) underlyingly, while b"\0" * n is a more complex algorithm.
>
>>
>> $ ./python -m timeit -s 'n = 10000' -- 'bytes(n)'
>> 1000000 loops, best of 3: 1.32 usec per loop
>> $ ./python -m timeit -s 'n = 10000' -- 'b"\0" * n'
>> 1000000 loops, best of 3: 0.858 usec per loop
>
> I don't know how inefficient CPython's bytes(n) or how efficient
> repetition (maybe 1-byte repetitions are optimized into memset()?), but
> MicroPython (where bytes(n) is truly calloc(n)) gives expected results:
>
> $ ./run-bench-tests bench/bytealloc*
> bench/bytealloc:
>      3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
>      11.244s (+237.35%) bench/bytealloc-2-repeat.py

If the performance of creating an immutable array of n zero bytes is
important in MicroPython, it is worth optimizing b"\0" * n.

For now CPython is the main implementation of Python 3 and bytes(n) is 
slower than b"\0" * n in CPython.

From pmiscml at gmail.com  Wed Jun  8 08:11:47 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 8 Jun 2016 15:11:47 +0300
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <nj90gi$e2s$1@ger.gmane.org>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
 <nj8mdj$7rg$1@ger.gmane.org> <20160608133737.63e6c666@x230>
 <nj8u5g$7gn$1@ger.gmane.org> <20160608142645.71884fc1@x230>
 <nj90gi$e2s$1@ger.gmane.org>
Message-ID: <20160608151147.54531ccc@x230>

Hello,

On Wed, 8 Jun 2016 14:45:22 +0300
Serhiy Storchaka <storchaka at gmail.com> wrote:

[]

> > $ ./run-bench-tests bench/bytealloc*
> > bench/bytealloc:
> >      3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
> >      11.244s (+237.35%) bench/bytealloc-2-repeat.py
> 
> If the performance of creating an immutable array of n zero bytes is 
> important in MicroPython, it is worth to optimize b"\0" * n.

No matter how you optimize calloc + something, it's always slower than
just calloc.

> For now CPython is the main implementation of Python 3 

Indeed, and it already has bytes(N). So perhaps nothing should be done
about it except leaving it alone. Perhaps more discussion should go
into whether there's a need for .iterbytes() if there's [i:i+1] already.
(I'm personally skipping that one, as I find [i:i+1] perfectly ok; I
can't understand why people would be unhappy enough with it to want
something more, but I leave room for that possibility.)
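
For readers less familiar with the [i:i+1] idiom, a small Python 3 sketch of the difference it works around (variable names are only illustrative):

```python
data = b"abc"

# Indexing a bytes object gives an int in Python 3 ...
assert data[0] == 97

# ... while a one-element slice gives a length-1 bytes object.
assert data[0:1] == b"a"

# Iterating yields ints, so length-1 bytes objects need the slice idiom.
ints = list(data)
singles = [data[i:i + 1] for i in range(len(data))]
```
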

> and bytes(n)
> is slower than b"\0" * n in CPython.

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From ericsnowcurrently at gmail.com  Wed Jun  8 10:17:28 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 8 Jun 2016 07:17:28 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CAB_e7iw+LUQ4MvWnxYGnk=e1My_cT8W-Gcf6SuzLkuHi81tv5A@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CAB_e7iw+LUQ4MvWnxYGnk=e1My_cT8W-Gcf6SuzLkuHi81tv5A@mail.gmail.com>
Message-ID: <CALFfu7DZQfkBHgf3zBJi7WmspQdfQybTYSJWh_avPCcvDJ=bGg@mail.gmail.com>

On Wed, Jun 8, 2016 at 12:07 AM, Franklin? Lee
<leewangzhong+python at gmail.com> wrote:
> On Jun 7, 2016 8:52 PM, "Eric Snow" <ericsnowcurrently at gmail.com> wrote:
>> * the default class *definition* namespace is now ``OrderedDict``
>> * the order in which class attributes are defined is preserved in the
>
> By using an OrderedDict, names are ordered by first definition point, rather
> than location of the used definition.
>
> For example, the definition order of the following will be "x, y", even
> though the definitions actually bound to the name are in order "y, x".
>         class C:
>             x = 0
>             def y(self): return 'y'
>             def x(self): return 'x'
>
> Is that okay?

In practice that will seldom be an issue.  In the few cases where it
could possibly be a problem, the class may explicitly set
__definition_order__.
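
The behaviour in question can be reproduced today with a metaclass whose __prepare__ returns an OrderedDict. This is only a sketch: OrderedMeta and the definition_order attribute are hypothetical names for this example, not the PEP's implementation.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the class body's definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes with this mapping as its namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Skip dunder names such as __module__ and __qualname__.
        cls.definition_order = tuple(
            key for key in namespace
            if not (key.startswith('__') and key.endswith('__')))
        return cls


class C(metaclass=OrderedMeta):
    x = 0
    def y(self): return 'y'
    def x(self): return 'x'


# Rebinding x keeps its original position, so the order is ('x', 'y')
# even though the binding that survives for x comes after y.
order = C.definition_order
```
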

-eric

From barry at python.org  Wed Jun  8 10:25:54 2016
From: barry at python.org (Barry Warsaw)
Date: Wed, 8 Jun 2016 10:25:54 -0400
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CA+eR4cE9yVqoFrk6xi_VXp1+ZYqjy-15VNHosCvCVf_Vc5UANg@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <CA+eR4cE9yVqoFrk6xi_VXp1+ZYqjy-15VNHosCvCVf_Vc5UANg@mail.gmail.com>
Message-ID: <20160608102554.120a5b2b.barry@wooz.org>

On Jun 08, 2016, at 02:01 AM, Martin Panter wrote:

>Bytes.byte() is a great idea. But what's the point or use case of
>bytearray.byte(), a mutable array of one pre-defined byte?

I like Bytes.byte() too.  I would guess you'd want the same method on
bytearray for duck typing APIs.

-Barry

From ericsnowcurrently at gmail.com  Wed Jun  8 10:26:29 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 8 Jun 2016 07:26:29 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CAMpsgwZJiiVo07S3vMrUvGqZ_Uc22-3GfiBMsdWZ2gtkExjG3A@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CAMpsgwZJiiVo07S3vMrUvGqZ_Uc22-3GfiBMsdWZ2gtkExjG3A@mail.gmail.com>
Message-ID: <CALFfu7D2zuBPPfGDaQouV+mfSPzyHujfXvRHkZG790DPPjY=tw@mail.gmail.com>

On Wed, Jun 8, 2016 at 1:07 AM, Victor Stinner <victor.stinner at gmail.com> wrote:
>> Abstract
>> ========
>>
>> This PEP changes the default class definition namespace to ``OrderedDict``.
>> Furthermore, the order in which the attributes are defined in each class
>> body will now be preserved in ``type.__definition_order__``.  This allows
>> introspection of the original definition order, e.g. by class decorators.
>>
>> Note: just to be clear, this PEP is *not* about changing ``__dict__`` for
>> classes to ``OrderedDict``.
>
> What is the cost in terms of performance?

Do you mean the cost of the PEP?  The extra cost is negligible:
creating an OrderedDict + mutation operations on it.  Note that it is
only used during class definition (execution of the class body).

>
> What can be slower: defining a new class and/or instantiating a class?

By "instantiate" do you mean the equivalent of "type(...)" or do you
mean creating a new instance of a class?  As noted above, the impact
of using OrderedDict during class definition is negligible.  During
definition the cost of other operations will usually dwarf any extra
overhead from using an OrderedDict.

-eric

From steve at pearwood.info  Wed Jun  8 10:49:47 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 9 Jun 2016 00:49:47 +1000
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
Message-ID: <20160608144947.GK12028@ando.pearwood.info>

On Wed, Jun 08, 2016 at 10:04:08AM +0200, Victor Stinner wrote:

> It's common that users complain that Python core developers like
> breaking the compatibility at each release.

No more common than users complaining that Python features are badly
designed and crufty and should be fixed.

Whatever we do, we can't win. If we fix misfeatures, people complain. If 
we don't fix them, people complain. Sometimes the same people, depending 
on their specific needs. "Fix this, because it annoys me, but don't fix 
that, because I'm used to it and it doesn't annoy me any more."

*shrug*

Ultimately it comes down to a subjective feeling as to which is worse. 
My own subjective feeling is that, in the long run, we'll be better off 
fixing bytes than keeping it, and the longer we wait to fix it, the 
harder it will be.

-- 
Steve

From leewangzhong+python at gmail.com  Wed Jun  8 16:42:25 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Wed, 8 Jun 2016 16:42:25 -0400
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160608151147.54531ccc@x230>
References: <57572E5D.4020101@stoneleaf.us>
 <CAMpsgwa45AxPYfXDL89DfE3RZrFwCQUorzNKXy2QP+pEYpX0ZQ@mail.gmail.com>
 <nj8mdj$7rg$1@ger.gmane.org> <20160608133737.63e6c666@x230>
 <nj8u5g$7gn$1@ger.gmane.org> <20160608142645.71884fc1@x230>
 <nj90gi$e2s$1@ger.gmane.org> <20160608151147.54531ccc@x230>
Message-ID: <CAB_e7iz1h=jyE1VsWwBPNUfR=YJSjhA0XpwHmz1=WWGOdFjGJw@mail.gmail.com>

On Jun 8, 2016 8:13 AM, "Paul Sokolovsky" <pmiscml at gmail.com> wrote:
>
> Hello,
>
> On Wed, 8 Jun 2016 14:45:22 +0300
> Serhiy Storchaka <storchaka at gmail.com> wrote:
>
> []
>
> > > $ ./run-bench-tests bench/bytealloc*
> > > bench/bytealloc:
> > >      3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
> > >      11.244s (+237.35%) bench/bytealloc-2-repeat.py
> >
> > If the performance of creating an immutable array of n zero bytes is
> > important in MicroPython, it is worth to optimize b"\0" * n.
>
> No matter how you optimize calloc + something, it's always slower than
> just calloc.

`bytes(n)` *is* calloc + something. It's a lookup of and call to a global
function. (Unless MicroPython optimizes away lookups for builtins, in which
case it can theoretically optimize b"\0".__mul__.)

On the other hand, b"\0" is a constant, and * is an operator lookup that
succeeds on the first argument (meaning, perhaps, a successful branch
prediction). As a constant, it is only created once, so there's no
intermediate object created.

AFAICT, the first requires optimizing global function lookups + calls, and
the second requires optimizing lookup and *successful* application of
__mul__ (versus failure + fallback to some __rmul__), and repetitions of a
particular `bytes` object (which can be interned and checked against). That
means there is room for either to win, depending on the efforts of the
implementers.

(However, `bytearray` has no syntax for literals (and therefore easy
constants), and is a more valid and, AFAIK, more practical concern.)
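
The "constant" point can be checked on CPython, where a bytes literal is stored in the function's code object. This is a CPython implementation detail and other implementations may differ. The function name zeros is only illustrative.

```python
def zeros(n):
    # b"\0" is a literal, so CPython stores it once in the code object
    # instead of rebuilding it on every call.
    return b"\0" * n


literal_is_constant = b"\0" in zeros.__code__.co_consts

# bytearray has no literal syntax, so the constructor is the direct spelling.
buf = bytearray(4)
same_as_repeat = buf == bytearray(b"\0" * 4)
```
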

From neil at python.ca  Wed Jun  8 17:01:33 2016
From: neil at python.ca (Neil Schemenauer)
Date: Wed, 8 Jun 2016 14:01:33 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
Message-ID: <20160608210133.GA4318@python.ca>

[I've posted something about this on python-ideas but since I now
have some basic working code, I think it is more than an idea.]

I think the uptake of Python 3 is starting to accelerate.  That's
good.  However, there are still millions or maybe billions of lines
of Python code that need to be ported.  It is beneficial to
the Python ecosystem if this code can get ported.

My idea is to make a stepping stone version of Python, between 2.7.x
and 3.x that eases the porting job.  The high level goals are:

- code coming out of 2to3 runs correctly on this modified Python

- code that runs without warnings on this modified Python will run
  correctly on Python 3.x.

Achieving these goals is not technically possible.  Still, I want to
reduce as much as possible the manual work involved in porting.
Incrementally fixing code that generates warnings is a lot easier
than trying to fix an entire application or library at once.

I have a very early version on github:

    https://github.com/nascheme/ppython

I'm hoping if people find it useful then they would contribute
backwards compatibility fixes that help their applications or
libraries run.  I am currently running a newly 2to3 ported
application on it.  At this time there is no warning generated but I
would rather get a warning than have one of my customers run into a
porting bug.

To be clear, I'm not proposing that these backwards compatibility
features go into Python 3.x or that this modified Python becomes the
standard version.  It is purely an intermediate step in getting code
ported to Python 3.

I've temporarily named it "Pragmatic Python".  I'd like a better
name if someone can suggest one.  Maybe something like Perverted,
Debauched or Impure Python.

Regards,

  Neil

From rymg19 at gmail.com  Wed Jun  8 17:33:22 2016
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Wed, 8 Jun 2016 16:33:22 -0500
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160608210133.GA4318@python.ca>
References: <20160608210133.GA4318@python.ca>
Message-ID: <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>

On Jun 8, 2016 4:04 PM, "Neil Schemenauer" <neil at python.ca> wrote:
>
> [I've posted something about this on python-ideas but since I now
> have some basic working code, I think it is more than an idea.]
>
> I think the uptake of Python 3 is starting to accelerate.  That's
> good.  However, there are still millions or maybe billions of lines
> of Python code that need to be ported.  It is beneficial to
> the Python ecosystem if this code can get ported.
>
> My idea is to make a stepping stone version of Python, between 2.7.x
> and 3.x that eases the porting job.  The high level goals are:
>
> - code coming out of 2to3 runs correctly on this modified Python
>
> - code that runs without warnings on this modified Python will run
>   correctly on Python 3.x.
>
> Achieving these goals is not technically possible.  Still, I want to
> reduce as much as possible the manual work involved in porting.
> Incrementally fixing code that generates warnings is a lot easier
> than trying to fix an entire application or library at once.
>
> I have a very early version on github:
>
>     https://github.com/nascheme/ppython
>
> I'm hoping if people find it useful then they would contribute
> backwards compatibility fixes that help their applications or
> libraries run.  I am currently running a newly 2to3 ported
> application on it.  At this time there is no warning generated but I
> would rather get a warning than have one of my customers run into a
> porting bug.
>
> To be clear, I'm not proposing that these backwards compatibility
> features go into Python 3.x or that this modified Python becomes the
> standard version.  It is purely an intermediate step in getting code
> ported to Python 3.
>
> I've temporarily named it "Pragmatic Python".  I'd like a better
> name if someone can suggest one.  Maybe something like Perverted,
> Debauched or Impure Python.
>

...Perverted Python? Ouch.

What about something like "unpythonic" or similar?

> Regards,
>
>   Neil

--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From fred at fdrake.net  Wed Jun  8 17:40:39 2016
From: fred at fdrake.net (Fred Drake)
Date: Wed, 8 Jun 2016 17:40:39 -0400
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
Message-ID: <CAFT4OTEQ4YrOpjBv=5E88Eex7kC+Znc42A-OercO+xcO7MU+5g@mail.gmail.com>

On Wed, Jun 8, 2016 at 5:33 PM, Ryan Gonzalez <rymg19 at gmail.com> wrote:
> What about something like "unpythonic" or similar?

Or perhaps... antipythy?

  -Fred

-- 
Fred L. Drake, Jr.    <fred at fdrake.net>
"A storm broke loose in my mind."  --Albert Einstein

From greg.ewing at canterbury.ac.nz  Wed Jun  8 18:08:50 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 09 Jun 2016 10:08:50 +1200
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
Message-ID: <57589772.2010707@canterbury.ac.nz>

> On Jun 8, 2016 4:04 PM, "Neil Schemenauer" <neil at python.ca 
> <mailto:neil at python.ca>> wrote:
>  >
>  > I've temporarily named it "Pragmatic Python".  I'd like a better
>  > name if someone can suggest one.  Maybe something like Perverted,
>  > Debauched or Impure Python.

Python Two and Three Quarters.

-- 
Greg

From phd at phdru.name  Wed Jun  8 18:13:47 2016
From: phd at phdru.name (Oleg Broytman)
Date: Thu, 9 Jun 2016 00:13:47 +0200
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <57589772.2010707@canterbury.ac.nz>
References: <20160608210133.GA4318@python.ca>
 <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
 <57589772.2010707@canterbury.ac.nz>
Message-ID: <20160608221347.GA5854@phdru.name>

On Thu, Jun 09, 2016 at 10:08:50AM +1200, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >On Jun 8, 2016 4:04 PM, "Neil Schemenauer" <neil at python.ca
> ><mailto:neil at python.ca>> wrote:
> > >
> > > I've temporarily named it "Pragmatic Python".  I'd like a better
> > > name if someone can suggest one.  Maybe something like Perverted,
> > > Debauched or Impure Python.
> 
> Python Two and Three Quarters.

   QOTW! :-D

> -- 
> Greg

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From tjreedy at udel.edu  Wed Jun  8 18:22:22 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 8 Jun 2016 18:22:22 -0400
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CAMpsgwZJiiVo07S3vMrUvGqZ_Uc22-3GfiBMsdWZ2gtkExjG3A@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CAMpsgwZJiiVo07S3vMrUvGqZ_Uc22-3GfiBMsdWZ2gtkExjG3A@mail.gmail.com>
Message-ID: <nja5r7$ohm$1@ger.gmane.org>

On 6/8/2016 4:07 AM, Victor Stinner wrote:
>> Abstract
>> ========
>>
>> This PEP changes the default class definition namespace to ``OrderedDict``.
>> Furthermore, the order in which the attributes are defined in each class
>> body will now be preserved in ``type.__definition_order__``.  This allows
>> introspection of the original definition order, e.g. by class decorators.
>>
>> Note: just to be clear, this PEP is *not* about changing ``__dict__`` for
>> classes to ``OrderedDict``.
>
> What is the cost in terms of performance?
>
> What can be slower: defining a new class and/or instantiating a class?

A class is defined once, used many times to instantiate instances.  Each 
instance is typically used many times, with many lookups.  So it is 
self.class_attribute lookups, like method lookups, that likely matter 
the most, and which are not changed by the PEP.

-- 
Terry Jan Reedy

From jake at lwn.net  Wed Jun  8 19:23:58 2016
From: jake at lwn.net (Jake Edge)
Date: Wed, 8 Jun 2016 17:23:58 -0600
Subject: [Python-Dev] Round 2 of the Python Language Summit coverage at LWN
Message-ID: <20160608172358.146ae6c6@redtail.lan>

Howdy python-dev,

The second batch of articles from the Python Language Summit is now
available.

The starting point is here: https://lwn.net/Articles/688969/
(or here for non-subscribers:
https://lwn.net/SubscriberLink/688969/91cbeeaf32807914/ for the next
few hours anyway, it will be open to all after that using either link.)

I have added five more sessions since last week's three, still six more
to go, which should all be done by next week (and I'll post here again).

Python's GitHub migration and workflow changes:
https://lwn.net/Articles/689937/
https://lwn.net/SubscriberLink/689937/1fd56367a74206bf/

The state of mypy: https://lwn.net/Articles/690081/
https://lwn.net/SubscriberLink/690081/5c35679cafe42d1b/

An introduction to pytype: https://lwn.net/Articles/690150/
https://lwn.net/SubscriberLink/690150/660acde532afb8a3/

PyCharm and type hints: https://lwn.net/Articles/690186/
https://lwn.net/SubscriberLink/690186/848c447551204ffe/

Python 3.6 and 3.7 release cycles: https://lwn.net/Articles/690404/
https://lwn.net/SubscriberLink/690404/73cfb918fa21d27c/

The articles will be freely available (without using the
SubscriberLink) to the world at large in a week (and the next batch the
week after that) ... until then, feel free to share the SubscriberLinks.

Hopefully I have captured things reasonably well.  If there are
corrections or clarifications needed, though, I recommend posting them
as comments on the article.

enjoy!

jake

-- 
Jake Edge - LWN - jake at lwn.net - http://lwn.net

From ben+python at benfinney.id.au  Wed Jun  8 19:55:50 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 09 Jun 2016 09:55:50 +1000
Subject: [Python-Dev] Round 2 of the Python Language Summit coverage at
 LWN
References: <20160608172358.146ae6c6@redtail.lan>
Message-ID: <8560tj1bi1.fsf@benfinney.id.au>

Jake Edge <jake at lwn.net> writes:

> The second batch of articles from the Python Language Summit is now
> available.

Thank you for writing these (and many other good articles) for Linux
Weekly News! High-quality, dependable reporting is very valuable for our
community.

-- 
 \     "To punish me for my contempt of authority, Fate has made me an |
  `\                  authority myself." --Albert Einstein, 1930-09-18 |
_o__)                                                                  |
Ben Finney

From victor.stinner at gmail.com  Wed Jun  8 21:11:10 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 9 Jun 2016 03:11:10 +0200
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160608210133.GA4318@python.ca>
References: <20160608210133.GA4318@python.ca>
Message-ID: <CAMpsgwYfLm5jw0oxrsvQc54xSoWYLToz7DVpsmkwhP8uNCfk+A@mail.gmail.com>

2016-06-08 23:01 GMT+02:00 Neil Schemenauer <neil at python.ca>:
> - code coming out of 2to3 runs correctly on this modified Python

Stop using 2to3. This tool adds many useless changes when you only
care about Python 2.7 and Python 3.4+. I suggest using better tools like
2to6, modernize or my own tool:
https://pypi.python.org/pypi/sixer

"Add Python 3 support to Python 2 applications using the six module."

Victor

From guido at python.org  Wed Jun  8 22:35:04 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 8 Jun 2016 19:35:04 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAMpsgwYfLm5jw0oxrsvQc54xSoWYLToz7DVpsmkwhP8uNCfk+A@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CAMpsgwYfLm5jw0oxrsvQc54xSoWYLToz7DVpsmkwhP8uNCfk+A@mail.gmail.com>
Message-ID: <CAP7+vJ+bvz4Pyfks=GEBYQGRBUyZrGJBFgqe-JiQXhy9mUBfPg@mail.gmail.com>

Or write your own set of 2to3 fixers that *are* necessary.

On Wed, Jun 8, 2016 at 6:11 PM, Victor Stinner <victor.stinner at gmail.com>
wrote:

> 2016-06-08 23:01 GMT+02:00 Neil Schemenauer <neil at python.ca>:
> > - code coming out of 2to3 runs correctly on this modified Python
>
> Stop using 2to3. This tool makes many useless changes when you only
> care about Python 2.7 and Python 3.4+. I suggest using better tools like
> 2to6, modernize or my own tool:
> https://pypi.python.org/pypi/sixer
>
> "Add Python 3 support to Python 2 applications using the six module."
>
> Victor

-- 
--Guido van Rossum (python.org/~guido)

From larry at hastings.org  Thu Jun  9 07:25:04 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 04:25:04 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever waiting
 for high-quality random bits?
Message-ID: <57595210.4000508@hastings.org>

A problem has surfaced just this week in 3.5.1.  Obviously this is a 
good time to fix it for 3.5.2.  But there's a big argument over what is 
"broken" and what is an appropriate "fix".

As 3.5 Release Manager, I can put my foot down and make rulings, and 
AFAIK the only way to overrule me is with the BDFL.  In two of three 
cases I've put my foot down.  In the third I'm pretty sure I'm right, 
but IIUC literally everyone else with a stated opinion disagrees with 
me.  So I thought it best I escalate it.  Note that 3.5.2 is going to 
wait until the issue is settled and any changes to behavior are written 
and checked in.

(Blanket disclaimer for the below: in some places I'm trying to 
communicate other people's positions.  I apologize if I misrepresented 
yours; please reply and correct my mistake.  Also, sorry for the length 
of this email.  But feel even sorrier for me: this debate has already 
eaten two days this week.)

BACKGROUND

For 3.5 os.urandom() was changed: instead of reading from /dev/urandom, 
it uses the new system call getrandom() where available.  This is a new 
system call on Linux (which has already been cloned by Solaris).  
getrandom(), as CPython uses it, reads from the same PRNG that 
/dev/urandom gets its bits from.  But because it's a system call you 
don't have to mess around with file handles.  Also it always works in 
chrooted environments.  Sounds like a fine idea.

Also for 3.5, several other places where CPython internally needs random 
bits were switched from reading from /dev/urandom to calling 
getrandom().  The two that I know of: choosing the seed for hash 
randomization, and initializing the default Mersenne Twister for the 
random module.

There's one subtle but important difference between /dev/urandom and 
getrandom().  At startup, Linux seeds the urandom PRNG from the entropy 
pool.  If the entropy pool is uninitialized, what happens? CPython's 
calls to getrandom() will block until the entropy pool is initialized, 
which is usually just a few seconds (or less) after startup.  But 
/dev/urandom *guarantees* that reads will *always* work.  If the entropy 
pool hasn't been initialized, it pulls numbers from the PRNG before it's 
been properly seeded.  What this results in depends on various aspects 
of the configuration (do you have ECC RAM? how long was the machine 
powered down? does the system have a correct realtime clock?).  In 
extreme circumstances this may mean the "random" numbers are shockingly 
predictable!

Under normal circumstances this minor difference is irrelevant. After 
all, when would the entropy pool ever be uninitialized?

THE PROBLEM

Issue #26839:

    http://bugs.python.org/issue26839

(warning, the issue is now astonishingly long, and exhausting to read, 
and various bits of it are factually wrong)

A user reported that when starting CPython soon after boot on a fresh 
virtual machine, the process would hang for a long time. Someone on the 
issue reported observed delays of over 90 seconds. Later we found out 
that it wasn't 90 seconds before CPython became usable; these 90-second 
delays were how long it took before systemd timed out and simply killed 
the process.  It's not clear what the upper bound on the delay might be.

The issue author had already identified the cause: CPython was blocking 
on getrandom() in order to initialize hash randomization. On this fresh 
virtual machine the entropy pool started out uninitialized.  And since 
the only thing running on the machine was CPython, and since CPython was 
blocked on initialization, the entropy pool was initializing very, very 
slowly.

Other posters to the thread pointed out that the same thing would happen 
in "import random", if your code could get that far.  The constructor 
for the Random() object would seed the Mersenne Twister, which would 
call getrandom() and block.

Naturally, callers to os.urandom() could also block for an unbounded 
period for the same reason.

MY RULINGS SO FAR

1) The change in 3.5 that means "import random" may block for an 
unbounded period of time on Linux due to the switch to getrandom() must 
be backed out or amended so that it never blocks.

I *think* everyone agrees with this.  The Mersenne Twister is not a 
CPRNG, so seeding it with crypto-quality bits isn't necessary.  And 
unbounded delays are bad.

2) The change in 3.5 that means hash randomization initialization may 
block for an unbounded period of time on Linux due to the switch to 
getrandom() must be backed out or amended so that it never blocks.

I believe most people agree with me.  The cryptography experts 
disagree.  IIUC both Alex Gaynor and Christian Heimes feel the blocking 
is preferable to non-random hash "randomization".

Yes, the bad random data means the hashing will be predictable. Neither 
choice is exactly what you want.  But most people feel it's simply 
unreasonable that in extreme corner cases CPython can block for an 
unbounded amount of time before running user code.

OS.URANDOM()

Here's where it gets complicated--and where everyone else thinks I'm wrong.

os.urandom() is currently the best place for a Python programmer to get 
high-quality random bits.  The one-line summary for os.urandom() reads: 
"Return a string of n random bytes suitable for cryptographic use."

On 3.4 and before, on Linux, os.urandom() would never block, but if the 
entropy pool was uninitialized it could return very-very-poor-quality 
random bits.  On 3.5.0 and 3.5.1, on Linux, when using the getrandom() 
call, it will instead block for an apparently unbounded period before 
returning high-quality random bits.  The question: is this new behavior 
preferable, or should we return to the old behavior?

Since I'm the one writing this email, let me make the case for my 
position: I think that os.urandom() should never block on Linux. Why?

1) Functions in the os module that look like OS functions should behave 
predictably like thin wrappers over those OS functions.

Most of the time this is exactly what they are.  In some cases they're 
more sophisticated; examples include os.popen(), os.scandir(), and the 
byzantine os.utime().  There are also some functions provided by the os 
module that don't resemble any native functionality, but these have 
unique names that don't look like anything provided by the OS.

This makes the behavior of the Python function easy to reason about: it 
always behaves like your local OS function.  Python provides os.stat() 
and it behaves like the local stat().  So if you want to know how any os 
module function behaves, just read your local man page.  Therefore, 
os.urandom() should behave exactly like a thin shell around reading the 
local /dev/urandom.

On Linux, /dev/urandom guarantees that it will never block.  This means 
it has undesirable behavior if read immediately after a fresh boot.  But 
this guarantee is so strong that Theodore Ts'o couldn't break it to fix 
the undesirable behavior.  Instead he added the getrandom() system 
call.  But he left /dev/urandom alone. Therefore, on Linux, os.urandom() 
should behave the same way, and also never block.

2) It's unfair to change the semantics of a well-established function to 
such a radical degree.

os.urandom() has been in Python since at least 2.6--I was too lazy to go 
back any further.  From 2.6 to 3.4, it behaved exactly like 
/dev/urandom, which meant that on Linux it would never block.  As of 
3.5, on Linux, it might now block for an unbounded period of time. Any 
code that calls os.urandom() has had its behavior radically changed in 
this extreme corner case.

3) os.urandom() doesn't actually guarantee it's suitable for cryptography.

The documentation for os.urandom() has contained this sentence, 
untouched, since 2.6:

    The returned data should be unpredictable enough for cryptographic
    applications, though its exact quality depends on the OS
    implementation. On a Unix-like system this will query /dev/urandom,
    and on Windows it will use CryptGenRandom().

Of course, version 3.5 added this:

    On Linux 3.17 and newer, the getrandom() syscall is now used when
    available.

But the waffling about its suitability for cryptography remains 
unchanged.  So, while it's undesirable that os.urandom() might return 
shockingly poor quality random bits, it is *permissible* according to 
the documentation.

4) This really is a rare corner-case we're talking about.

I just want to re-state: this case on Linux where /dev/urandom returns 
totally predictable bytes, and getrandom() will block, only happens when 
the entropy pool for urandom is uninitialized. Although it has been seen 
in the field, it's extremely rare. 99.99999%+ of the time, reading 
/dev/urandom and calling getrandom() will both return the exact same 
high-quality random bits without blocking.

5) This corner-case behavior is fixable externally to CPython.

I don't really understand the subject, but apparently it's entirely 
reasonable to expect sysadmins to directly manage the entropy pools of 
virtual machines.  They should be able to spin up their VMs with a 
pre-filled entropy pool.  So it should be possible to ensure that 
os.urandom() always returns the high-quality random bits we wanted, even 
on freshly-booted VMs.

6) Guido and Tim Peters already decided once that os.urandom() should 
behave like /dev/urandom.

Issue #25003:

    http://bugs.python.org/issue25003

In 2.7.10, os.urandom() was changed to call getentropy() instead of 
reading /dev/urandom when getentropy() was available.  getentropy() was 
"stunningly slow" on Solaris, on the order of 300x slower than reading 
/dev/urandom.  Guido and Tim both participated in the discussion on the 
issue; Guido also apparently discussed it via email with Theo de Raadt.

While it's not quite apples-to-apples, I think this establishes some 
precedent that os.urandom() should
   * behave like /dev/urandom, and
   * be fast.

--

On the other side is... everybody else.  I've already spent an enormous 
amount of time researching and writing and re-writing this email.  
Rather than try (and fail) to accurately present the other sides of this 
debate, I'm just going to end the email here and let the other 
participants reply and voice their views.

Bottom line: Guido, in this extreme corner case on Linux, should 
os.urandom() return bad random data like it used to, or should it block 
forever like it does in 3.5.0 and 3.5.1?

//arry/

From victor.stinner at gmail.com  Thu Jun  9 07:35:38 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 9 Jun 2016 13:35:38 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57595210.4000508@hastings.org>
References: <57595210.4000508@hastings.org>
Message-ID: <CAMpsgwbJtJNX0VQUBXd6ZfQN6rFMNaV-_O0SEmiQnrwnGhjcdw@mail.gmail.com>

I understand that Christian Heimes and/or Donald Stufft are interested
in working on a PEP.

2016-06-09 13:25 GMT+02:00 Larry Hastings <larry at hastings.org>:
> A problem has surfaced just this week in 3.5.1.  Obviously this is a good
> time to fix it for 3.5.2.  But there's a big argument over what is "broken"
> and what is an appropriate "fix".

IMHO the bug is now fixed in 3.5.2 as I explained at:
http://haypo-notes.readthedocs.io/pep_random.html#status-of-python-3-5-2

> THE PROBLEM
>
> Issue #26839:
>
> http://bugs.python.org/issue26839
>
> (warning, the issue is now astonishingly long, and exhausting to read, and
> various bits of it are factually wrong)

You may want to read my summary:
http://haypo-notes.readthedocs.io/pep_random.html

I'm not interested in replying to Larry's email point by point. IMHO a
formal PEP is now required for Python 3.6 (to enhance os.urandom and
clarify Python behaviour before urandom is initialized). Python 3.5.2
is fixed, there is no more urgency ;-)

Victor

From christian at python.org  Thu Jun  9 07:54:38 2016
From: christian at python.org (Christian Heimes)
Date: Thu, 9 Jun 2016 13:54:38 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57595210.4000508@hastings.org>
References: <57595210.4000508@hastings.org>
Message-ID: <njbldu$p6p$1@ger.gmane.org>

On 2016-06-09 13:25, Larry Hastings wrote:
> 
> A problem has surfaced just this week in 3.5.1.  Obviously this is a
> good time to fix it for 3.5.2.  But there's a big argument over what is
> "broken" and what is an appropriate "fix".
> 
> As 3.5 Release Manager, I can put my foot down and make rulings, and
> AFAIK the only way to overrule me is with the BDFL.  In two of three
> cases I've put my foot down.  In the third I'm pretty sure I'm right,
> but IIUC literally everyone else with a stated opinion disagrees with
> me.  So I thought it best I escalate it.  Note that 3.5.2 is going to
> wait until the issue is settled and any changes to behavior are written
> and checked in.
> 
> (Blanket disclaimer for the below: in some places I'm trying to
> communicate other people's positions.  I apologize if I misrepresented
> yours; please reply and correct my mistake.  Also, sorry for the length
> of this email.  But feel even sorrier for me: this debate has already
> eaten two days this week.)

Thanks for the digest, Larry.

I would appreciate it if we could split the issue into three separate problems:

1) behavior of os.urandom()
2) initialization of _Py_HashSecret for byte, str and XML hash
randomization.
3) initialization of default random.random Mersenne-Twister

As of now, 2 and 3 are the culprits for blocking at startup. Both happen to
use _PyOS_URandom() either directly or indirectly through os.urandom().
We chose to use the OS random source because it was convenient. It is
not a necessity. The seed for Mersenne-Twister and the keys for hash
randomization don't have to be strong cryptographic values in all cases.
They just have to be hard-to-guess by an attacker. In case of scripts in
early boot, there are no viable attack scenarios.

Therefore I propose to fix problems 2 and 3:

- add a new random_seed member to _Py_HashSecret and use it to derive an
initial Mersenne-Twister state for the default random instance of the
random module.

- try CPRNG for _Py_HashSecret first, fall back to a user space RNG when
the Kernel's CPRNG would block.

For some operating systems like Windows and OSX, we can assume that
Kernel CPRNG is always available. For Linux we can use getrandom() in
non-blocking mode and handle EWOULDBLOCK. On BSD the seed state can be
queried from /proc.
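
A minimal sketch of the "try the kernel CPRNG first, fall back to a user
space source" idea, written in Python rather than C purely for illustration.
The helper names seed_bytes() and weak_seed_bytes() are hypothetical and are
not CPython internals; os.getrandom() and os.GRND_NONBLOCK are only available
on Linux with Python 3.6+, so the sketch assumes os.urandom() is acceptable
wherever they are missing.

```python
import hashlib
import os
import time


def weak_seed_bytes(n):
    """Hard-to-guess but NOT cryptographic bytes; never blocks (hypothetical)."""
    material = repr(
        (time.time_ns(), time.perf_counter_ns(), os.getpid(), id(object()))
    ).encode()
    # SHAKE-256 can produce a digest of any requested length.
    return hashlib.shake_256(material).digest(n)


def seed_bytes(n=24):
    """Return n seed bytes without ever blocking on the kernel pool."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: on platforms without getrandom(), os.urandom() is fine.
        return os.urandom(n)
    try:
        data = getrandom(n, os.GRND_NONBLOCK)
        if len(data) == n:
            return data
    except OSError:
        # BlockingIOError (EAGAIN/EWOULDBLOCK) means the pool is not ready;
        # other OSError values (e.g. ENOSYS) also take the fallback path.
        pass
    return weak_seed_bytes(n)
```

The fallback output is only meant for cases like hash randomization in early
boot scripts, where the message above argues there is no viable attack
scenario; it should not be used for keys or tokens.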

Christian

From cory at lukasa.co.uk  Thu Jun  9 08:12:22 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 9 Jun 2016 13:12:22 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <njbldu$p6p$1@ger.gmane.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
Message-ID: <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>

> On 9 Jun 2016, at 12:54, Christian Heimes <christian at python.org> wrote:
> 
> Therefore I propose to fix problems 2 and 3:
> 
> - add a new random_seed member to _Py_HashSecret and use it to derive an
> initial Mersenne-Twister state for the default random instance of the
> random module.
> 
> - try CPRNG for _Py_HashSecret first, fall back to a user space RNG when
> the Kernel's CPRNG would block.
> 
> For some operating systems like Windows and OSX, we can assume that
> Kernel CPRNG is always available. For Linux we can use getrandom() in
> non-blocking mode and handle EWOULDBLOCK. On BSD the seed state can be
> queried from /proc.
> 

I am in agreement with Christian here.

Let me add: Larry has suggested that it's ok that os.urandom() can degrade to weak random numbers in part because "os.urandom() doesn't actually guarantee it's suitable for cryptography." That's true, that is what the documentation says.

However, that documentation has been emphatically disagreed with by the entire Python ecosystem *including* the Python standard library. Both random.SystemRandom and the secrets module use os.urandom() to generate their random numbers. The secrets module says this right at the top:

"The secrets module is used for generating cryptographically strong random numbers suitable for managing data such as passwords, account authentication, security tokens, and related secrets."

Regressing the behaviour in os.urandom() would mean that this statement is not unequivocally true but only situationally true. It would be more accurate to say "The secrets module should generate cryptographically strong random numbers most of the time". So I'd argue that while os.urandom() does not make these promises, the rest of the standard library behaves like it does. While we're here I should note that the cryptography project unequivocally recommends os.urandom[0], and that this aspect of Linux's /dev/urandom behaviour is considered to be a dangerous misfeature by almost everyone in the crypto community.

The Linux kernel can't change this stuff easily because they mustn't break userspace. Python *is* userspace, we can do what we like, and we should be aiming to make sure that doing the obvious thing in Python amounts to doing the *right* thing. *Obviously* this shouldn't block startup, and obviously we should fix that, but I disagree that we should be reverting the change to os.urandom().
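
For readers who have not used them, here is a short sketch of the two standard library entry points mentioned above. It only shows typical calls; the variable names are illustrative, and the point is simply that both ultimately draw their bytes from the OS source behind os.urandom(), so whatever os.urandom() returns in early boot is what these return too.

```python
import random
import secrets

# secrets (Python 3.6+): helpers intended for tokens, passwords and similar.
token = secrets.token_hex(16)      # 16 random bytes rendered as 32 hex digits

# random.SystemRandom: the random.Random API backed by the OS source
# instead of the Mersenne Twister.
sysrand = random.SystemRandom()
pin = sysrand.randrange(10**6)     # an integer in the range [0, 10**6)
```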

Cory

[0]: https://cryptography.io/en/latest/random-numbers/


From donald at stufft.io  Thu Jun  9 08:26:20 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 08:26:20 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57595210.4000508@hastings.org>
References: <57595210.4000508@hastings.org>
Message-ID: <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>

> On Jun 9, 2016, at 7:25 AM, Larry Hastings <larry at hastings.org> wrote:
> 
> A problem has surfaced just this week in 3.5.1.  Obviously this is a good time to fix it for 3.5.2.  But there's a big argument over what is "broken" and what is an appropriate "fix".

A couple of clarifications:

random.py
---------

In the abstract it doesn't hurt to seed MT with a CSPRNG, it just doesn't
provide much (if any) benefit and in this case it is hurting us because of the
cost on import (which will exist on other platforms as well no matter what we
do here for Linux). There are a couple solutions to this problem:

* Use getrandom(GRND_NONBLOCK) for random.Random since it doesn't matter if we
  get cryptographically secure random numbers or not.

* Switch it to use something other than a CSPRNG by default since it doesn't
  need that.

* Instead of seeding itself from os.urandom on import, have it lazily do that
  the first time one of the random.rand* functions is called.

* Do nothing, and say that ``import random`` relies on having the kernel's
  urandom pool initialized.

Between these options, I have a slight preference for switching it to use a
non-CSPRNG, but I really don't care that much which of these options we pick.
Using random.Random is not secure and none of the above options meaningfully
change the security posture of something that accidentally uses it.
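
To make the "seed lazily" and "seed from something other than a CSPRNG"
options concrete, here is a rough sketch. LazyRandom is a hypothetical wrapper,
not a proposal for the actual random.py patch, and note that in a real
interpreter "import random" itself still seeds the module-level instance; the
sketch only shows the deferral pattern.

```python
import os
import random
import time


class LazyRandom:
    """Hypothetical wrapper that creates and seeds its generator on first use."""

    def __init__(self):
        self._rng = None  # nothing is seeded at construction time

    def _get(self):
        if self._rng is None:
            # Non-cryptographic seed from the clock and pid: never blocks.
            seed = time.time_ns() ^ (os.getpid() << 32)
            self._rng = random.Random(seed)
        return self._rng

    def random(self):
        return self._get().random()

    def randint(self, a, b):
        return self._get().randint(a, b)
```

Passing an explicit integer seed to random.Random() means the constructor
does not need to ask the OS for random bytes for that instance.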

SipHash and the Interpreter Startup
-----------------------------------

I have complicated thoughts on what SipHash should do. For something like a
Django process, we never want it to be initialized with "bad" entropy; however,
reading straight from /dev/urandom, or getrandom(GRND_NONBLOCK), means that we
might get that if we start the process early enough in the boot process. The
rub here is that I cannot think of a situation where, by the time you're at the
point you're starting up something like Django, you're even remotely likely to
not have an initialized random pool. The other side of this issue is that we
have Python scripts which do not need a secure random being passed to SipHash
running early enough in the boot process with systemd that we need to be able
to have SipHash initialization not block on waiting for /dev/urandom.

So I'm torn between the "Practicality beats Purity" mindset, which says we
should just let SipHash seed itself with whatever quality of random from the
urandom pool is currently available, and the "Special cases aren't special
enough to break the rules" mindset, which says that we should just make it
easier for scripts in this edge case to declare they don't care about hash
randomization to remove the need for it (in other words, a CLI flag that
matches PYTHONHASHSEED in functionality). An additional wrinkle in the mix is
that we cannot get non-blocking random on many (any?) modern OS besides Linux,
so we're going to run into this same problem if, say, FreeBSD decides to put a
Python script early enough in the boot sequence.

In the end, both of these choices make me happy and unhappy in different ways
but I would lean towards adding a CLI flag for the special case and letting the
systemd script that caused this problem invoke their Python with that flag. I
think this because:

* It leaves the interpreter so that it is secure by default, but provides the
  relevant knobs to turn off this default in cases where a user doesn't need
  or want it.
* It solves the problem in a cross platform way, that doesn't rely on the
  nuances of the CSPRNG interface on one particular supported platform.
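
The CLI flag does not exist yet, but the PYTHONHASHSEED environment variable
it would mirror does, so the effect can be sketched today. The helper name
child_hash() is made up for this example; it starts a child interpreter with
a fixed hash seed and reports the hash of a fixed string.

```python
import os
import subprocess
import sys


def child_hash(seed):
    """Run a child interpreter with PYTHONHASHSEED=seed and return a str hash."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('python-dev'))"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return int(result.stdout)


# With a fixed seed (0 disables randomization) the value is the same on
# every run, so the interpreter has no need for fresh random bytes for it.
first = child_hash(0)
second = child_hash(0)
```

An early-boot script would presumably set the variable (or the proposed flag)
in its unit file or wrapper rather than from Python as done here.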

os.urandom
----------

There have been a lot of proposals thrown around, and people pointing to
different sections of the documentation to justify different opinions. This is
easily the most contentious question we have here.

It is my belief that reading from urandom is the right thing to do for
generating cryptographically secure random numbers. This is a viewpoint held
by every major security expert and cryptographer that I'm aware of. Most (all?)
major platforms besides Linux do not allow reading from their equivalent of
/dev/urandom until it has been successfully initialized and it is widely held
by all security experts and cryptographers that I'm aware of that this property
is a good one, and the Linux behavior of /dev/urandom is a wart/footgun but
that prior to getrandom() there simply wasn't a better option on Linux.

With that in mind, I think that we should, to the best of our ability given the
platform we're on, ensure that os.urandom does not return bytes that the OS
does not think are cryptographically secure.

In practice this means that os.urandom should do one of two things in the very
early boot process on Linux:

* Block waiting on the kernel to initialize the urandom pool, and then return
  the now secure random bytes given to us.
* Raise an exception saying that the pool has not been initialized and thus
  os.urandom is not ready yet.

The key point in both of these options is that os.urandom never [1] returns
bytes prior to the OS believing that it can give us cryptographically secure
random bytes.
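
As a rough sketch of the "raise an exception" option, assuming a Python that
exposes os.getrandom() and os.GRND_NONBLOCK (Linux only, Python 3.6+): the
names strict_urandom() and EntropyNotReady are hypothetical and only
illustrate the behavior, not how CPython would implement it in C.

```python
import os


class EntropyNotReady(RuntimeError):
    """Hypothetical error: the kernel urandom pool is not initialized yet."""


def strict_urandom(n):
    """Return n secure bytes, or raise instead of returning weak ones."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: other platforms already refuse to hand out unseeded bytes.
        return os.urandom(n)
    buf = b""
    while len(buf) < n:
        try:
            # May return fewer bytes than requested, hence the loop.
            buf += getrandom(n - len(buf), os.GRND_NONBLOCK)
        except BlockingIOError:
            raise EntropyNotReady("urandom pool is not initialized") from None
    return buf
```

The blocking option is the same call without the GRND_NONBLOCK flag, i.e.
os.getrandom(n), which waits until the pool has been initialized.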

I believe I have a preference for blocking while waiting for the kernel to initialize
the urandom pool, because that makes Linux behave similarly to the other
platforms that I'm aware of.

I do not believe that adding additional public functions, as some other people
have proposed, is a good option. I think they muddy the waters and I think
that it forces us to try and convince people that "no really, yes everyone
says you should use urandom, but you actually want getrandom". Particularly
since the outcome of these two functions would be exactly the same in all but
a very narrow edge case on Linux.

Larry has suggested that os.py should only ever be thin shells around OS
provided functionality and thus os.urandom should simply mimic whatever the
behavior of /dev/urandom is on that OS. For os.urandom in particular this is
already not the case since it calls CryptGenRandom on Windows, but putting that
aside since that's a Windows vs POSIX difference, we're not talking about
adding a great amount of functionality around something provided by the OS.
We're only talking about using a different interface to access the same
underlying functionality. In this case, an interface that better suits the
actual use of os.urandom in the wild and provides better properties all around.

He's also pointed out that the documentation does not guarantee that the result
of os.urandom will be cryptographically strong in the following quote:

    This function returns random bytes from an OS-specific randomness source.
    The returned data should be unpredictable enough for cryptographic
    applications, though its exact quality depends on the OS implementation. 

My read of this quote is that it is a hedge against operating systems that
have implemented their urandom pool in such a way that it does not return
cryptographically secure random numbers, so that you don't come back and yell
at Python for it. In other words, it's a hedge against /dev/urandom being
https://xkcd.com/221/. I do not think this documentation excuses us using
a weaker interface to the OS-specific randomness source simply because its
name happens to match the name of the function. Particularly since earlier on
in that documentation it states:

    Return a string of n random bytes suitable for cryptographic use.

and the Python standard library, and the entire ecosystem as I know it, as well
as all security experts and crypto experts believe you should treat it as such.
This is largely because if your urandom pool is implemented in a way that, in
the general case it provides insecure random values, then you're beyond the
pale and there's nothing that Python, or anyone but your OS vendor, can do to
help you.

Furthermore, I think that the behavior I want (that os.urandom is secure by
default to the best of our abilities) is trickier to get right, and requires
interfacing with C code. However, getting the exact semantics of /dev/urandom
on Linux is trivial to do with a single line of Python code:

    def urandom(amt): return open("/dev/urandom", "rb").read(amt)

So if you're someone who is depending on the Linux urandom behavior in an edge
case that almost nobody is going to hit, you can trivially get the old behavior
back. Even better, if you're someone depending on this, you're going to get an
*obvious* failure rather than silently getting insecure bytes. On top of all of
that, this only matters in a small edge case, most likely to only ever be hit
by OS vendors themselves, who are in the best position to make informed
decisions about how to work around the fact the urandom entropy pool hasn't
already been initialized rather than expecting every other user to have to try
and ensure that they don't start their Python script too early.

[1] To the best of our ability, given the interfaces and implementation
    provided to us by the OS.

--
Donald Stufft


From donald at stufft.io  Thu Jun  9 08:32:02 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 08:32:02 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57595210.4000508@hastings.org>
References: <57595210.4000508@hastings.org>
Message-ID: <2D2A5F9D-05BD-4E9D-907D-0E862E11821D@stufft.io>

> On Jun 9, 2016, at 7:25 AM, Larry Hastings <larry at hastings.org> wrote:
> 
> 6) Guido and Tim Peters already decided once that os.urandom() should behave like /dev/urandom.
> 
> Issue #25003:
> http://bugs.python.org/issue25003

To be exceedingly clear, in this issue the problem wasn't that os.urandom was
blocking once, early on in the boot process before the kernel had initialized
its urandom pool. The problem was that the getentropy() function on Solaris
behaves more like /dev/random does on Linux. This behavior is something that
I, and most security experts/cryptographers that I know of, think is bad
behavior (and indeed, most OSs have gotten rid of this behavior of /dev/random
and made /dev/random and /dev/urandom behave the same... except again for
Linux).

The ask here isn't to make Linux behave like Solaris did in that issue, it's to
use the newer, better, interface to make Linux use the more secure behavior
that most (all?) of the other modern OSs have already adopted.

--
Donald Stufft


From rdmurray at bitdance.com  Thu Jun  9 08:41:01 2016
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 09 Jun 2016 08:41:01 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
Message-ID: <20160609124102.5EE4EB14024@webabinitio.net>

On Thu, 09 Jun 2016 13:12:22 +0100, Cory Benfield <cory at lukasa.co.uk> wrote:
> The Linux kernel can't change this stuff easily because they mustn't
> break userspace. Python *is* userspace, we can do what we like, and we

I don't have specific input on the rest of this discussion, but I disagree
strongly with this statement.  The environment in which python programs
run, ie: the python runtime and standard library, are *our* "userspace",
and the same constraints apply to our making changes there as apply
to the linux kernel and its userspace...even though we knowingly break
those constraints from time to time[*].

--David

[*] Which I think the twisted folks at least would argue we shouldn't
be doing :)

From doug at doughellmann.com  Thu Jun  9 08:53:51 2016
From: doug at doughellmann.com (Doug Hellmann)
Date: Thu, 09 Jun 2016 08:53:51 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609124102.5EE4EB14024@webabinitio.net>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
Message-ID: <1465476616-sup-8510@lrrr.local>

Excerpts from R. David Murray's message of 2016-06-09 08:41:01 -0400:
> On Thu, 09 Jun 2016 13:12:22 +0100, Cory Benfield <cory at lukasa.co.uk> wrote:
> > The Linux kernel can't change this stuff easily because they mustn't
> > break userspace. Python *is* userspace, we can do what we like, and we
> 
> I don't have specific input on the rest of this discussion, but I disagree
> strongly with this statement.  The environment in which python programs
> run, ie: the python runtime and standard library, are *our* "userspace",
> and the same constraints apply to our making changes there as apply
> to the linux kernel and its userspace...even though we knowingly break
> those constraints from time to time[*].
> 
> --David
> 
> [*] Which I think the twisted folks at least would argue we shouldn't
> be doing :)

I agree with David. We shouldn't break existing behavior in a way
that might lead to someone else's software being unusable.

Adding a new API that does block allows anyone to call that when
they want guaranteed random values, and the decision about whether
to block or not can be placed in the application developer's hands.

Christian's points about separating the various cases and solutions also
make sense.

Doug

From donald at stufft.io  Thu Jun  9 08:59:57 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 08:59:57 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1465476616-sup-8510@lrrr.local>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
Message-ID: <6E696C6F-5ADB-4F82-AF8D-E1C13CC71BA8@stufft.io>

> On Jun 9, 2016, at 8:53 AM, Doug Hellmann <doug at doughellmann.com> wrote:
> 
> Excerpts from R. David Murray's message of 2016-06-09 08:41:01 -0400:
>> On Thu, 09 Jun 2016 13:12:22 +0100, Cory Benfield <cory at lukasa.co.uk> wrote:
>>> The Linux kernel can't change this stuff easily because they mustn't
>>> break userspace. Python *is* userspace, we can do what we like, and we
>> 
>> I don't have specific input on the rest of this discussion, but I disagree
>> strongly with this statement.  The environment in which python programs
>> run, ie: the python runtime and standard library, are *our* "userspace",
>> and the same constraints apply to our making changes there as apply
>> to the linux kernel and its userspace...even though we knowingly break
>> those constraints from time to time[*].
>> 
>> --David
>> 
>> [*] Which I think the twisted folks at least would argue we shouldn't
>> be doing :)
> 
> I agree with David. We shouldn't break existing behavior in a way
> that might lead to someone else's software being unusable.
> 
> Adding a new API that does block allows anyone to call that when
> they want guaranteed random values, and the decision about whether
> to block or not can be placed in the application developer's hands.
> 

I think this is a terrible compromise. The new API is going to be exactly the
same as the old API in 99.9999% of cases and it's fighting against the entire
software ecosystem's suggestion of what to use ("use urandom" is basically a
meme at this point). This is like saying that we can't switch to verifying
HTTPS by default because a one in a million connection might have different
behavior instead of being silently insecure.

--
Donald Stufft

From cory at lukasa.co.uk  Thu Jun  9 09:27:36 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 9 Jun 2016 14:27:36 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1465476616-sup-8510@lrrr.local>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
Message-ID: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>

> On 9 Jun 2016, at 13:53, Doug Hellmann <doug at doughellmann.com> wrote:
> 
> I agree with David. We shouldn't break existing behavior in a way
> that might lead to someone else's software being unusable.

What does "usable" mean? Does it mean "the code must execute from beginning to end"? Or does it mean "the code must maintain the expected invariants"? If it's the second, what reasonably counts as "the expected invariants"?

The problem here is that both definitions of "broken" are unclear. If we leave os.urandom() as it is, there is a small-but-nonzero chance that your program will hang, potentially indefinitely. If we change it back, there is a small-but-nonzero chance your program will generate bad random numbers.

If we assume, for a moment, that os.urandom() doesn't get called during Python startup (that is, that we adopt Christian's approach to deal with random and SipHash as separate concerns), what we've boiled down to is: your application called os.urandom() so early that you've got weak random numbers, does it hang or proceed? Those are literally our two options.

These two options can be described a different way. If you didn't actually need strong random numbers but were affected by the hang, that program failed obviously, and it failed closed. You *will* notice that your program didn't start up, you'll investigate, and you'll take action. On the other hand, if you need strong random numbers but were affected by os.urandom() returning bad random numbers, you almost certainly will *not* notice, and your program will have failed *open*: that is, you are exposed to a security risk, and you have no way to be alerted to that fact.

For my part, I think the first failure mode is *vastly* better than the second, even if the first failure mode affects vastly more people than the second one does. Failing early, obviously, and safely is IMO much, much better than failing late, silently, and dangerously.

I'd argue that all the security disagreements that happen in this list boil down to weighting that differently. For my part, I want code that expects to be used in a secure context to fail *as loudly as possible* if it is unable to operate securely. And for that reason:

> Adding a new API that does block allows anyone to call that when
> they want guaranteed random values, and the decision about whether
> to block or not can be placed in the application developer's hands.

I'd rather flip this around. Add a new API that *does not* block. Right now, os.urandom() is trying to fill two niches, one of which is security focused. I'd much rather decide that os.urandom() is the secure API and fail as loudly as possible when people are using it insecurely than to decide that os.urandom() is the *insecure* API and require changes.

This is because, again, people very rarely notice this kind of new API introduction unless their code explodes when they migrate. If you think you can find a way to blow up the secure crypto code only, I'm willing to have that patch too, but otherwise I really think that those who expect this code to be safe should be prioritised over those who expect it to be 100% available.

My ideal solution: change os.urandom() to throw an exception if the kernel CSPRNG is not seeded, and add a new function for saying you don't care if the CSPRNG isn't seeded, with all the appropriate "don't use this unless you're sure" warnings on it.

Cory

From doug at doughellmann.com  Thu Jun  9 09:48:08 2016
From: doug at doughellmann.com (Doug Hellmann)
Date: Thu, 9 Jun 2016 09:48:08 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
Message-ID: <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>

> On Jun 9, 2016, at 9:27 AM, Cory Benfield <cory at lukasa.co.uk> wrote:
> 
> 
>> On 9 Jun 2016, at 13:53, Doug Hellmann <doug at doughellmann.com> wrote:
>> 
>> I agree with David. We shouldn't break existing behavior in a way
>> that might lead to someone else's software being unusable.
> 
> What does "usable" mean? Does it mean "the code must execute from beginning to end"? Or does it mean "the code must maintain the expected invariants"? If it's the second, what reasonably counts as "the expected invariants"?

The code must not cause the user's computer to completely freeze in a way that makes their VM appear to be failing to boot?

> 
> The problem here is that both definitions of "broken" are unclear. If we leave os.urandom() as it is, there is a small-but-nonzero chance that your program will hang, potentially indefinitely. If we change it back, there is a small-but-nonzero chance your program will generate bad random numbers.
> 
> If we assume, for a moment, that os.urandom() doesn't get called during Python startup (that is, that we adopt Christian's approach to deal with random and SipHash as separate concerns), what we've boiled down to is: your application called os.urandom() so early that you've got weak random numbers, does it hang or proceed? Those are literally our two options.

I agree those are the two options. I want the application developer to make the choice, not us.

> 
> These two options can be described a different way. If you didn't actually need strong random numbers but were affected by the hang, that program failed obviously, and it failed closed. You *will* notice that your program didn't start up, you'll investigate, and you'll take action. On the other hand, if you need strong random numbers but were affected by os.urandom() returning bad random numbers, you almost certainly will *not* notice, and your program will have failed *open*: that is, you are exposed to a security risk, and you have no way to be alerted to that fact.
> 
> For my part, I think the first failure mode is *vastly* better than the second, even if the first failure mode affects vastly more people than the second one does. Failing early, obviously, and safely is IMO much, much better than failing late, silently, and dangerously.
> 
> I'd argue that all the security disagreements that happen in this list boil down to weighting that differently. For my part, I want code that expects to be used in a secure context to fail *as loudly as possible* if it is unable to operate securely. And for that reason:
> 
>> Adding a new API that does block allows anyone to call that when
>> they want guaranteed random values, and the decision about whether
>> to block or not can be placed in the application developer's hands.
> 
> I'd rather flip this around. Add a new API that *does not* block. Right now, os.urandom() is trying to fill two niches, one of which is security focused. I'd much rather decide that os.urandom() is the secure API and fail as loudly as possible when people are using it insecurely than to decide that os.urandom() is the *insecure* API and require changes.
> 
> This is because, again, people very rarely notice this kind of new API introduction unless their code explodes when they migrate. If you think you can find a way to blow up the secure crypto code only, I'm willing to have that patch too, but otherwise I really think that those who expect this code to be safe should be prioritised over those who expect it to be 100% available.
> 
> My ideal solution: change os.urandom() to throw an exception if the kernel CSPRNG is not seeded, and add a new function for saying you don't care if the CSPRNG isn't seeded, with all the appropriate "don't use this unless you're sure" warnings on it.

All of which fails to be backwards compatible (new exceptions and hanging behavior), which means you're breaking apps. Introducing a new API lets the developers who care about strong random values use them without breaking anyone else.

Doug

From donald at stufft.io  Thu Jun  9 09:57:22 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 09:57:22 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
Message-ID: <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>

> On Jun 9, 2016, at 9:48 AM, Doug Hellmann <doug at doughellmann.com> wrote:
> 
> All of which fails to be backwards compatible (new exceptions and hanging behavior), which means you're breaking apps. Introducing a new API lets the developers who care about strong random values use them without breaking anyone else.

I assert that the vast bulk of users of os.urandom are using it because they
care about strong random values, not because they care about the nuances of
its behavior on Linux. You're suggesting that almost every [1] single use of
os.urandom in the wild should switch to this new API. Forcing the multitudes to
adapt for the minority is just pointless churn and pain. Besides, Python has
never held backwards compatibility sacred above all else and regularly breaks
it in X.Y+1 releases when there is good reason to do so. Just yesterday there
was discussion on removing bytes(n) from Python 3.x not because it's dangerous
in any way, but because its behavior makes it slightly confusing in an
extremely obvious way in a PEP that appears like it has a reasonably good
chance of being accepted. 

[1] I would almost go as far as to call it every single use, but I'm sure
    someone can dig up one person somewhere who purposely used this behavior. 

--
Donald Stufft

From cory at lukasa.co.uk  Thu Jun  9 10:32:35 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 9 Jun 2016 15:32:35 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
Message-ID: <3AB96F39-8682-49AF-B57C-E5B9E6E68FF4@lukasa.co.uk>

> On 9 Jun 2016, at 14:48, Doug Hellmann <doug at doughellmann.com> wrote:
> 
> I agree those are the two options. I want the application developer to make the choice, not us.

Right, but right now those two options aren't available: only one of them is. And one way or another we're taking an action here: either we're leaving os.urandom() as it stands now, or reverting it back to the way it was in 3.4.0.

This means that you *do* want python-dev to make a choice: specifically, you want python-dev to make the choice that was made in 3.4.0, rather than the one that was made in 3.5.0. That's fine, but we shouldn't be pretending that either side is arguing for inaction or the status quo. For Python 3.5 a choice was made with insufficient knowledge of the outcomes, and now we're arguing about whether we can revert that choice. The difference is, now we *do* know about both outcomes, which means we are consciously choosing between them.

> All of which fails to be backwards compatible (new exceptions and hanging behavior), which means you're breaking apps.

Backwards compatible with what? Python 3.5.0 and 3.5.1 both have this behaviour, so I assume you mean "backward compatible with 3.4". However, part of the point of a major release is that it doesn't have to be backward compatible in this manner: Python breaks backward compatibility all the time in major releases.

I should point out that as far as I'm aware there are exactly two applications that suffer from this problem. One of them is Debian's autopkgtest, which has resolved this problem by invoking Python with PYTHONHASHSEED=0. The other is systemd-cron, and frankly it does not seem at all unreasonable to suggest that perhaps systemd-cron should *maybe* hold off until the system's CSPRNG gets seeded before it starts executing.

Cory

From colm at tuatha.org  Thu Jun  9 10:02:53 2016
From: colm at tuatha.org (Colm Buckley)
Date: Thu, 9 Jun 2016 15:02:53 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
Message-ID: <CAPbCDfi0JnJAkWgSZi94rekCuPcKFeXDDZJmqGSwiO-e+++FLw@mail.gmail.com>

Larry Hastings wrote:

> On 3.4 and before, on Linux, os.urandom() would never block, but if the
> entropy pool was uninitialized it could return very-very-poor-quality
> random bits.  On 3.5.0 and 3.5.1, on Linux, when using the getrandom()
> call, it will instead block for an apparently unbounded period before
> returning high-quality random bits.

Just a point of information here. Ted Ts'o commented on the quality of the
pre-initialization bits; it's not a given that they're "very very poor
quality". Even before the per-boot entropy pool is initialized, the kernel
has a few sources of randomness available to it - viz: interrupt timings,
RDRAND (on x86) and a little per-machine data (uname -a). If RDRAND is
trusted, this is enough to provide quite significant entropy, however
that's not much help to all the ARM devices out there.

The most pressing issue from my perspective is the hash randomization
initialization, as there is currently nothing a script author can do to
influence its behavior (except setting PYTHONHASHSEED before invocation,
which might not be an option).

It should be possible, at least conceptually, for Python to be used to
implement /sbin/init. This isn't currently the case on Linux with Python
3.5.1 and Linux 3.17+.

For what it's worth, I do agree with Larry that os.urandom() should hew as
closely as possible to the OS-specific urandom implementation. Adding an
optional "blocking" boolean flag might be a useful addition for 3.6.

Colm

-- 
Colm Buckley / colm at tuatha.org / +353 87 2469146

From guido at python.org  Thu Jun  9 11:52:50 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 9 Jun 2016 08:52:50 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
Message-ID: <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>

Wow. I have to decide an issue on which lots of people I respect disagree
strongly. So no matter how I decide some of you are going to hate me. Oh
well. :-(

So let's summarize the easy part first. It seems that there is actually
agreement that for the initialization of hash randomization and for the
random module's Mersenne Twister initialization it is not worth waiting.

That leaves direct calls to os.urandom(). I don't think this should block
either.

I'm not a security expert. I'm not really an expert in anything. But I
often have a good sense for what users need or want. In this case it's
clear what users want: they don't want Python to hang waiting for random
numbers.

Take an example from asyncio. If os.urandom() could block, then an asyncio
coroutine that wants to call it would have to move that call to a separate
thread using loop.run_in_executor() and await the resulting Future, just to
avoid blocking all I/O. But you can't test such code, because in practice
when you're there to test it, it will never block anyway. So nobody will
write it that way, and everybody's code will have a subtle bug (i.e. a
coroutine may block without letting other coroutines run). And it's not
just bare calls to os.urandom() -- it's any call to library code that might
call os.urandom(). Who documents whether their library call uses
os.urandom()? It's unknowable. And therein lies madness.
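
A minimal sketch of the pattern described above, for illustration only. It assumes a current Python (asyncio.run() and asyncio.get_running_loop() did not exist in 3.5), and the helper name urandom_in_executor is hypothetical:

```python
import asyncio
import os


async def urandom_in_executor(nbytes):
    """Call os.urandom() in a worker thread so the event loop is never blocked."""
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, os.urandom, nbytes)


async def main():
    # Other coroutines can keep running while the worker thread waits.
    return await urandom_in_executor(16)


token = asyncio.run(main())
print(len(token))
```

The extra thread hop is only useful if os.urandom() can actually block, which is exactly the case that is hard to reproduce in a test.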

The problem with security experts is that they're always right when they
say you shouldn't do something. The only truly secure computer is one
that's disconnected and buried 6 feet under the ground. There's always a
scenario through which an attacker could exploit a certain behavior. And
there's always the possibility that the computer that's thus compromised is
guarding a list of Chinese dissidents, or a million credit card numbers, or
the key Apple uses to sign iPhone apps. But much more likely it just has my
family photos and 100 cloned GitHub projects.

And the only time when os.urandom() is going to block on me is probably
when I'm rebooting a development VM and wondering why it's so slow.

Maybe we can put in a warning when getrandom(..., GRND_NONBLOCK) returns
EAGAIN? And then award a prize to people who can make it print that
warning. Maybe we'll find a way to actually test this code.
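
One way the warning idea could look from Python code, as a sketch under stated assumptions: os.getrandom() and os.GRND_NONBLOCK are only available on Linux in Python 3.6 and later, the function name urandom_nonblocking is hypothetical, and the fallback of reading /dev/urandom directly is an illustrative choice, not what CPython does internally. It is meant for small requests only, since both calls may return fewer bytes than asked for on large reads.

```python
import os
import warnings


def urandom_nonblocking(nbytes):
    """Return random bytes without blocking; warn if the pool is not yet seeded."""
    if hasattr(os, "getrandom"):
        try:
            # Raises BlockingIOError (EAGAIN) instead of waiting for the pool.
            return os.getrandom(nbytes, os.GRND_NONBLOCK)
        except BlockingIOError:
            warnings.warn(
                "kernel entropy pool is not initialized; "
                "reading /dev/urandom without waiting",
                RuntimeWarning,
                stacklevel=2,
            )
            # Direct read: never blocks on Linux, but may be weakly seeded.
            with open("/dev/urandom", "rb", buffering=0) as dev:
                return dev.read(nbytes)
    # Platforms without getrandom(): defer to the OS-specific behavior.
    return os.urandom(nbytes)


data = urandom_nonblocking(16)
```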

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Thu Jun  9 11:58:14 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 11:58:14 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
Message-ID: <621DF93D-AC42-4A36-A54D-1F6F08F9FDAF@stufft.io>

> On Jun 9, 2016, at 11:52 AM, Guido van Rossum <guido at python.org> wrote:
> 
> Wow. I have to decide an issue on which lots of people I respect disagree strongly. So no matter how I decide some of you are going to hate me. Oh well. :-(
> 
> So let's summarize the easy part first. It seems that there is actually agreement that for the initialization of hash randomization and for the random module's Mersenne Twister initialization it is not worth waiting.
> 
> That leaves direct calls to os.urandom(). I don't think this should block either.

To be clear, it's going to block until urandom has been initialized on most non Linux OSs, so either way if the requirement of someone calling os.urandom is "must never block", then they can't use os.urandom on most non Linux systems.

> 
> I'm not a security expert. I'm not really an expert in anything. But I often have a good sense for what users need or want. In this case it's clear what users want: they don't want Python to hang waiting for random numbers.
> 
> Take an example from asyncio. If os.urandom() could block, then an asyncio coroutine that wants to call it would have to move that call to a separate thread using loop.run_in_executor() and await the resulting Future, just to avoid blocking all I/O. But you can't test such code, because in practice when you're there to test it, it will never block anyway. So nobody will write it that way, and everybody's code will have a subtle bug (i.e. a coroutine may block without letting other coroutines run). And it's not just bare calls to os.urandom() -- it's any call to library code that might call os.urandom(). Who documents whether their library call uses os.urandom()? It's unknowable. And therein lies madness.
> 
> The problem with security experts is that they're always right when they say you shouldn't do something. The only truly secure computer is one that's disconnected and buried 6 feet under the ground. There's always a scenario through which an attacker could exploit a certain behavior. And there's always the possibility that the computer that's thus compromised is guarding a list of Chinese dissidents, or a million credit card numbers, or the key Apple uses to sign iPhone apps. But much more likely it just has my family photos and 100 cloned GitHub projects.
> 
> And the only time when os.urandom() is going to block on me is probably when I'm rebooting a development VM and wondering why it's so slow.
> 
> Maybe we can put in a warning when getrandom(..., GRND_NONBLOCK) returns EAGAIN? And then award a prize to people who can make it print that warning. Maybe we'll find a way to actually test this code.
> 
> -- 
> --Guido van Rossum (python.org/~guido <http://python.org/~guido>)

--
Donald Stufft

From ethan at stoneleaf.us  Thu Jun  9 12:03:46 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 09 Jun 2016 09:03:46 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57595210.4000508@hastings.org>
References: <57595210.4000508@hastings.org>
Message-ID: <57599362.7000304@stoneleaf.us>

On 06/09/2016 04:25 AM, Larry Hastings wrote:
>
> A problem has surfaced just this week in 3.5.1.  Obviously this is a
> good time to fix it for 3.5.2.  But there's a big argument over what is
> "broken" and what is an appropriate "fix".

Having read the thread thus far, here is my take on fixing it:

- Modify os.urandom() to raise an exception instead of blocking.
   Everyone seems to agree that this is a rare corner case, and
   being rare it would be easier (at least for me) to troubleshoot
   an exception instead of a VM (or whatever) hanging and then being
   killed.

- Add a CLI knob to not raise, but instead wait for initialization.
   I think this should be under the control of the user, who knows
   (or should) the environment that Python is running under, and not
   the developer who may have never dreamed his/her little script
   would be called first thing during bootup.  Maybe we just continue
   to use the hash seed parameter for this.

- Modify the functions that don't need cryptographically strong random
   bits to use the old style (reading directly from /dev/urandom?).

This seems like it should appease the security folks, yet still allow 
those in the trenches to (more) easily diagnose and work around the problem.

--
~Ethan~

From guido at python.org  Thu Jun  9 12:16:54 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 9 Jun 2016 09:16:54 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57599362.7000304@stoneleaf.us>
References: <57595210.4000508@hastings.org> <57599362.7000304@stoneleaf.us>
Message-ID: <CAP7+vJJrXqujzeKZM-h7NKR-8UAESRUeoGUMP9zY1D3uYKEF_Q@mail.gmail.com>

To expand on my idea of printing a warning, in 3.6 we could add a new
Warning exception for this purpose, so you'd have command-line control over
the behavior of os.urandom() by specifying -WXXX on your Python command
line. For 3.5.2 that's too fancy though -- we can't add a new exception.
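
A sketch of how a dedicated warning category gives the caller control over the behavior. The class name EntropyWarning and both helper functions are hypothetical; the filter is set programmatically here, where a built-in category could instead be selected with -W on the command line:

```python
import warnings


class EntropyWarning(RuntimeWarning):
    """Hypothetical category: random bits were requested before the pool was seeded."""


def report_unseeded_pool():
    # Stand-in for the point where os.urandom() would notice an unseeded pool.
    warnings.warn("kernel entropy pool is not initialized", EntropyWarning, stacklevel=2)


def outcome_under(action):
    """Return what a caller observes for a given warnings filter action."""
    with warnings.catch_warnings():
        warnings.simplefilter(action, EntropyWarning)
        try:
            report_unseeded_pool()
        except EntropyWarning:
            # With the "error" action the warning is raised as an exception.
            return "raised"
        return "continued"


print(outcome_under("ignore"), outcome_under("error"))
```

The same call site therefore either proceeds silently or fails loudly, depending only on the filter the user configured.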

-- 
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Thu Jun  9 12:27:38 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 9 Jun 2016 12:27:38 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
Message-ID: <njc5dr$8p1$1@ger.gmane.org>

On 6/9/2016 9:48 AM, Doug Hellmann wrote:
>
>> On Jun 9, 2016, at 9:27 AM, Cory Benfield <cory at lukasa.co.uk>
>> wrote:

>> The problem here is that both definitions of "broken" are unclear.
>> If we leave os.urandom() as it is, there is a small-but-nonzero
>> chance that your program will hang, potentially indefinitely. If we
>> change it back, there is a small-but-nonzero chance your program
>> will generate bad random numbers.
>>
>> If we assume, for a moment, that os.urandom() doesn't get called
>> during Python startup (that is, that we adopt Christian's approach
>> to deal with random and SipHash as separate concerns), what we've
>> boiled down to is: your application called os.urandom() so early
>> that you've got weak random numbers, does it hang or proceed? Those
>> are literally our two options.
>
> I agree those are the two options. I want the application developer
> to make the choice, not us.

I think the 'new API' should be a parameter, not a new function. With 
just two choices, 'wait' = True/False  could work.  If 'raise an 
exception' were added, then
'action (when good bits are not immediately available)' =
'return (best possible)' or
'wait (until have good bits)' or
'raise (CryptBitsNotAvailable)'

In either case, there would then be the question of whether the default 
should match 3.5.0/1 or 3.4 and before.
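
The three-way parameter could be sketched roughly as follows. This is an illustration, not a proposed implementation: the function name urandom_with_action and the string values 'return', 'wait' and 'raise' are assumptions, os.getrandom() requires Linux and Python 3.6 or later, and small request sizes are assumed.

```python
import os


class CryptBitsNotAvailable(Exception):
    """Raised when good random bits are not immediately available."""


def urandom_with_action(nbytes, action="wait"):
    """Return random bytes; 'action' decides what happens if the pool is unseeded."""
    if action not in ("return", "wait", "raise"):
        raise ValueError("action must be 'return', 'wait' or 'raise'")
    if not hasattr(os, "getrandom"):
        # No getrandom() on this platform: defer to the OS-specific behavior.
        return os.urandom(nbytes)
    if action == "wait":
        # Blocks until the kernel entropy pool has been initialized.
        return os.getrandom(nbytes)
    try:
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        if action == "raise":
            raise CryptBitsNotAvailable("entropy pool not initialized") from None
        # action == "return": best available bits, possibly weakly seeded.
        with open("/dev/urandom", "rb", buffering=0) as dev:
            return dev.read(nbytes)


sizes = [len(urandom_with_action(8, a)) for a in ("return", "wait", "raise")]
```

Whichever value is chosen as the default is where the 3.4-versus-3.5 question reappears.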

-- 
Terry Jan Reedy

From steve at pearwood.info  Thu Jun  9 12:30:02 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Jun 2016 02:30:02 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
Message-ID: <20160609163000.GB27919@ando.pearwood.info>

On Thu, Jun 09, 2016 at 08:26:20AM -0400, Donald Stufft wrote:

> random.py
> ---------
> 
> In the abstract it doesn't hurt to seed MT with a CSPRNG, it just doesn't
> provide much (if any) benefit and in this case it is hurting us because of the
> cost on import (which will exist on other platforms as well no matter what we
> do here for Linux). There are a couple solutions to this problem:
> 
> * Use getrandom(GRND_NONBLOCK) for random.Random since it doesn't matter if we
>   get cryptographically secure random numbers or not.

+1 on this option (see below for rationale).

> * Switch it to use something other than a CSPRNG by default since it doesn't
>   need that.
[...]
> Between these options, I have a slight preference for switching it to use a non
> CSPRNG, but I really don't care that much which of these options we pick. Using
> random.Random is not secure and none of the above options meaningfully change
> the security posture of something that accidentally uses it.

I don't think that is quite right, although it will depend on your 
definition of "meaningful".

PEP 506 says:

    Demonstrated attacks against MT are typically against PHP 
    applications. It is believed that PHP's version of MT is a 
    significantly softer target than Python's version, due to
    a poor seeding technique [17] . 

https://www.python.org/dev/peps/pep-0506/#id17

specifically that PHP seeds the MT with the time, while we use the 
output of a CSPRNG. Now, we all agree that MT is completely the wrong 
thing to use for secrets, good seeding or not, but *bad* seeding could 
make it a PHP-level soft target.

The point of PEP 506 is to move people away from using random.Random for 
their secrets, but we should expect that whatever we do, there will be 
some late adopters who are slow to get the message and continue to use 
it. I would not like us to weaken the seeding technique to the point 
that those folks become an attractive target.

I think that using getrandom(GRND_NONBLOCK) will be okay, provided that 
when the entropy pool is too low and getrandom falls back to something 
cryptographically weak, it's still better (hopefully significantly 
better) than seeding with the time.

My reasoning is that the sort of applications that could be targets of 
attacks against MT are unlikely to be started up early in the boot 
process, so they're almost always going to get good crypto seeds. On the 
rare occasion that they don't, well, there's only so far that I'm 
prepared to stand up for developers' right to be ignorant of security 
concerns in 2016, and that's where I draw the line.
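
To illustrate the kind of fallback I have in mind, here is a rough sketch.
It is not CPython's seeding code: make_seed() and its 'source' argument are
made-up names, and it assumes os.getrandom() and os.GRND_NONBLOCK exist
(Linux, Python 3.6+). When the pool isn't ready, it mixes the clock with
the process ID and an object address, which is weak but at least not the
time alone.

```python
import hashlib
import os
import random
import time


def _nonblocking_entropy(n):
    """Return n kernel-random bytes, or None if the pool is not ready."""
    if hasattr(os, "getrandom"):
        try:
            return os.getrandom(n, os.GRND_NONBLOCK)
        except BlockingIOError:
            return None
    return os.urandom(n)


def make_seed(n=32, source=_nonblocking_entropy):
    """Build a seed for random.Random without ever blocking."""
    data = source(n)
    if data is None:
        # Weak fallback: NOT cryptographically secure, only harder to
        # guess than the time alone.
        parts = (time.time(), time.perf_counter(), os.getpid(), id(object()))
        data = hashlib.sha256(repr(parts).encode("ascii")).digest()[:n]
    return int.from_bytes(data, "big")


rng = random.Random(make_seed())
```

None of this makes MT suitable for secrets; it only keeps the seed from
being trivially guessable in the rare early-boot case.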

> SipHash and the Interpreter Startup
> -----------------------------------
[...]
> In the end, both of these choices make me happy and unhappy in different ways
> but I would lean towards adding a CLI flag for the special case and letting the
> systemd script that caused this problem invoke their Python with that flag. I
> think this because:
> 
> * It leaves the interpreter so that it is secure by default, but provides the
>   relevant knobs to turn off this default in cases where a user doesn't need
>   or want it.
> * It solves the problem in a cross platform way, that doesn't rely on the
>   nuances of the CSPRNG interface on one particular supported platform.

Makes sense to me.

+1

> os.urandom
> ----------
[...]
> With that in mind, I think that we should, to the best of our ability given the
> platform we're on, ensure that os.urandom does not return bytes that the OS
> does not think are cryptographically secure.

Just to be clear, you're talking about having it block rather than raise 
an exception, right?

If so, that makes sense to me. That's already the behaviour on all major 
platforms except Linux, so you're just bringing Linux into line with the 
others. Those who want the non-blocking behaviour on Linux can just read 
from /dev/urandom.

+1

-- 
Steve

From donald at stufft.io  Thu Jun  9 12:39:00 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 12:39:00 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609163000.GB27919@ando.pearwood.info>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
Message-ID: <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>

> On Jun 9, 2016, at 12:30 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> 
>> os.urandom
>> ----------
> [...]
>> With that in mind, I think that we should, to the best of our ability given the
>> platform we're on, ensure that os.urandom does not return bytes that the OS
>> does not think are cryptographically secure.
> 
> Just to be clear, you're talking about having it block rather than raise 
> an exception, right?
> 
> If so, that makes sense to me. That's already the behaviour on all major 
> platforms except Linux, so you're just bringing Linux into line with the 
> others. Those who want the non-blocking behaviour on Linux can just read 
> from /dev/urandom.

There are three options for what to do with os.urandom by default:

* Allow it to silently return data that may or may not be cryptographically secure based on what the state of the urandom pool initialization looks like.
* Raise an exception if we determine that the pool isn't initialized enough to get secure random from it.
* Block until the pool is initialized.

Historically Python has done the first option on Linux (but not on other OSs) because that was simply the only interface that Linux offered at all. In 3.5.0 Victor changed the way os.urandom worked in a way that made it use the third option (he wasn't attempting to change the security properties, just avoid using an FD, but it improved the security properties as well).

My opinion is that blocking is slightly better than raising an exception because it matches what other OSs do, but that both blocking and raising an exception are better than silently giving data that may or may not be cryptographically secure.

-- 
Donald Stufft

From benno at benno.id.au  Thu Jun  9 12:54:31 2016
From: benno at benno.id.au (Ben Leslie)
Date: Thu, 9 Jun 2016 12:54:31 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
Message-ID: <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>

On 9 June 2016 at 12:39, Donald Stufft <donald at stufft.io> wrote:
>
>> On Jun 9, 2016, at 12:30 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>>
>>>
>>> os.urandom
>>> ----------
>> [...]
>>> With that in mind, I think that we should, to the best of our ability given the
>>> platform we're on, ensure that os.urandom does not return bytes that the OS
>>> does not think are cryptographically secure.
>>
>> Just to be clear, you're talking about having it block rather than raise
>> an exception, right?
>>
>> If so, that makes sense to me. That's already the behaviour on all major
>> platforms except Linux, so you're just bringing Linux into line with the
>> others. Those who want the non-blocking behaviour on Linux can just read
>> from /dev/urandom.
>
>
> There are three options for what to do with os.urandom by default:
>
> * Allow it to silently return data that may or may not be cryptographically secure based on what the state of the urandom pool initialization looks like.
> * Raise an exception if we determine that the pool isn't initialized enough to get secure random from it.
> * Block until the pool is initialized.
>
> Historically Python has done the first option on Linux (but not on other OSs) because that was simply the only interface that Linux offered at all. In 3.5.0 Victor changed the way os.urandom worked in a way that made it use the third option (he wasn't attempting to change the security properties, just avoid using an FD, but it improved the security properties as well).
>
> My opinion is that blocking is slightly better than raising an exception because it matches what other OSs do, but that both blocking and raising an exception are better than silently giving data that may or may not be cryptographically secure.

I think an exception is much easier for a user to deal with from a
practical point of view. Trying to work out why a process has hung is
obviously possible, but not necessarily easy.

Having a process crash due to an exception is very easy to diagnose by
comparison.

Cheers,

Ben

From steve at pearwood.info  Thu Jun  9 13:14:50 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Jun 2016 03:14:50 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
Message-ID: <20160609171450.GC27919@ando.pearwood.info>

On Thu, Jun 09, 2016 at 12:39:00PM -0400, Donald Stufft wrote:

> There are three options for what to do with os.urandom by default:
> 
> * Allow it to silently return data that may or may not be 
> cryptographically secure based on what the state of the urandom pool 
> initialization looks like.

Just to be clear, this is only an option on Linux, right? All the other 
major platforms block, whatever we decide to do on Linux. Including 
Windows?

-- 
Steve

From p.f.moore at gmail.com  Thu Jun  9 13:21:32 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 9 Jun 2016 18:21:32 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
Message-ID: <CACac1F-zRKeaAhZOYW85c3K2hFA_PMi3Jww6D+a1wufPLEExAA@mail.gmail.com>

On 9 June 2016 at 17:54, Ben Leslie <benno at benno.id.au> wrote:
>> My opinion is that blocking is slightly better than raising an exception because it matches what other OSs do, but that both blocking and raising an exception are better than silently giving data that may or may not be cryptographically secure.
>
> I think an exception is much easier for a user to deal with from a
> practical point of view. Trying to work out why a process has hung is
> obviously possible, but not necessarily easy.

If we put the specific issue of applications that run very early in
system startup to one side, is there a possibility of running out of
entropy during normal system use? Even for a tiny duration? An
exception may be better than a hanging process, but a random process
crash in place of a wait of a few microseconds for the entropy buffer
to fill up again not so much.

If we could predict whether the call was going to block for a
microsecond, or for 20 minutes, I'd be OK with an exception for the
latter case. But we can't predict the future, so unless the system
call is guaranteed not to block except at system startup, then I
prefer blocking over an exception.

As for blocking vs returning less random results, I defer to others on that.

On 9 June 2016 at 18:14, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Jun 09, 2016 at 12:39:00PM -0400, Donald Stufft wrote:
>
>> There are three options for what to do with os.urandom by default:
>>
>> * Allow it to silently return data that may or may not be
>> cryptographically secure based on what the state of the urandom pool
>> initialization looks like.
>
> Just to be clear, this is only an option on Linux, right? All the other
> major platforms block, whatever we decide to do on Linux. Including
> Windows?

That's what I understood, certainly. But the place where this was an
issue in real life was a Python program being run during the startup
sequence of the OS. That's never going to be possible on Windows, so
I'd be cautious about drawing parallels with Windows in this situation
(blocking on Windows may be fine because Python can never run when
Windows could possibly have low entropy available).

Paul

From donald at stufft.io  Thu Jun  9 13:22:00 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 13:22:00 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609171450.GC27919@ando.pearwood.info>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <20160609171450.GC27919@ando.pearwood.info>
Message-ID: <A7C3ACE6-2C34-432F-B4B2-E56AB006773C@stufft.io>

> On Jun 9, 2016, at 1:14 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> On Thu, Jun 09, 2016 at 12:39:00PM -0400, Donald Stufft wrote:
> 
>> There are three options for what to do with os.urandom by default:
>> 
>> * Allow it to silently return data that may or may not be 
>> cryptographically secure based on what the state of the urandom pool 
>> initialization looks like.
> 
> Just to be clear, this is only an option on Linux, right? All the other 
> major platforms block, whatever we decide to do on Linux. Including 
> Windows?

To my knowledge, all other major platforms block or otherwise ensure that /dev/urandom can never return anything but cryptographically secure random. [1]

> 
> 
> -- 
> Steve

[1] I believe OpenBSD cannot block, but they inject randomness via the boot loader so that the system is never in a state where the kernel doesn't have enough entropy.

-- 
Donald Stufft

From donald at stufft.io  Thu Jun  9 13:24:11 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 13:24:11 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CACac1F-zRKeaAhZOYW85c3K2hFA_PMi3Jww6D+a1wufPLEExAA@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
 <CACac1F-zRKeaAhZOYW85c3K2hFA_PMi3Jww6D+a1wufPLEExAA@mail.gmail.com>
Message-ID: <AA16D123-997E-471D-86DD-2DBE5200AB63@stufft.io>

> On Jun 9, 2016, at 1:21 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> On 9 June 2016 at 17:54, Ben Leslie <benno at benno.id.au> wrote:
>>> My opinion is that blocking is slightly better than raising an exception because it matches what other OSs do, but that both blocking and raising an exception are better than silently giving data that may or may not be cryptographically secure.
>> 
>> I think an exception is much easier for a user to deal with from a
>> practical point of view. Trying to work out why a process has hung is
>> obviously possible, but not necessarily easy.
> 
> If we put the specific issue of applications that run very early in
> system startup to one side, is there a possibility of running out of
> entropy during normal system use? Even for a tiny duration? An
> exception may be better than a hanging process, but a random process
> crash in place of a wait of a few microseconds for the entropy buffer
> to fill up again not so much.
> 
> If we could predict whether the call was going to block for a
> microsecond, or for 20 minutes, I'd be OK with an exception for the
> latter case. But we can't predict the future, so unless the system
> call is guaranteed not to block except at system startup, then I
> prefer blocking over an exception.

/dev/urandom (and getrandom() on Linux) will never block once the pool
has been initialized. The concept of "running out of entropy" doesn't
apply to it. Once it has entropy it's good to go.

> 
> As for blocking vs returning less random results, I defer to others on that.
> 
> On 9 June 2016 at 18:14, Steven D'Aprano <steve at pearwood.info> wrote:
>> On Thu, Jun 09, 2016 at 12:39:00PM -0400, Donald Stufft wrote:
>> 
>>> There are three options for what to do with os.urandom by default:
>>> 
>>> * Allow it to silently return data that may or may not be
>>> cryptographically secure based on what the state of the urandom pool
>>> initialization looks like.
>> 
>> Just to be clear, this is only an option on Linux, right? All the other
>> major platforms block, whatever we decide to do on Linux. Including
>> Windows?
> 
> That's what I understood, certainly. But the place where this was an
> issue in real life was a Python program being run during the startup
> sequence of the OS. That's never going to be possible on Windows, so
> I'd be cautious about drawing parallels with Windows in this situation
> (blocking on Windows may be fine because Python can never run when
> Windows could possibly have low entropy available).
> 
> Paul

-- 
Donald Stufft

From steve at pearwood.info  Thu Jun  9 13:29:12 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Jun 2016 03:29:12 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
Message-ID: <20160609172911.GD27919@ando.pearwood.info>

On Thu, Jun 09, 2016 at 12:54:31PM -0400, Ben Leslie wrote:

> I think an exception is much easier for a user to deal with from a
> practical point of view. Trying to work out why a process has hung is
> obviously possible, but not necessarily easy.
> 
> Having a process crash due to an exception is very easy to diagnose by
> comparison.

That only makes sense if the application is going to block for (say) 
five or ten minutes. If it's going to block for three seconds, you might 
not even notice. At least not on a server.

But what are you going to do when you catch that exception?

- Sleep for a few seconds, and try again? That's just blocking.

- Stop waiting on secure randomness, and use something low quality 
  and insecure? That's how you get exploits.

- Crash?

-- 
Steve

From steve at pearwood.info  Thu Jun  9 13:49:27 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Jun 2016 03:49:27 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CACac1F-zRKeaAhZOYW85c3K2hFA_PMi3Jww6D+a1wufPLEExAA@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
 <CACac1F-zRKeaAhZOYW85c3K2hFA_PMi3Jww6D+a1wufPLEExAA@mail.gmail.com>
Message-ID: <20160609174927.GE27919@ando.pearwood.info>

On Thu, Jun 09, 2016 at 06:21:32PM +0100, Paul Moore wrote:

> If we put the specific issue of applications that run very early in
> system startup to one side, is there a possibility of running out of
> entropy during normal system use? Even for a tiny duration?

With /dev/urandom, I believe the answer to that is no.

On most platforms other than Linux, /dev/urandom is exactly the same as 
/dev/random, and both can only block straight after the machine has 
booted up before enough entropy has been collected. Then they will run 
forever without blocking. (Or at least until you reboot.)

On Linux, /dev/random *will* block, at unpredictable times, but 
fortunately we're not using /dev/random. We're using Urandom. Apart from 
just after boot up, /dev/urandom on Linux will also run forever without 
blocking, just like the other platforms.

The critical difference is just after booting up:

- Linux /dev/urandom doesn't block, but it might return predictable, 
  poor-quality pseudo-random bytes (i.e. a potential exploit);

- Other OSes may block for potentially many minutes (i.e. a 
  potential DOS).

Two links which may help explain what's happening:

http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

http://security.stackexchange.com/a/42955

-- 
Steve

From ncoghlan at gmail.com  Thu Jun  9 13:53:59 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 9 Jun 2016 10:53:59 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57595210.4000508@hastings.org>
References: <57595210.4000508@hastings.org>
Message-ID: <CADiSq7cXd52UQQgx6te1=QajqmZmkqgH3L2fUH-GFPXOYS8whQ@mail.gmail.com>

On 9 June 2016 at 04:25, Larry Hastings <larry at hastings.org> wrote:
> A user reports that when starting CPython soon after startup on a fresh
> virtual machine, the process would hang for a long time.  Someone on the
> issue reported observed delays of over 90 seconds.  Later we found out: it
> wasn't 90 seconds before CPython became usable, these 90 seconds delays were
> before systemd timed out and simply killed the process.  It's not clear what
> the upper bound on the delay might be.
>
> The issue author had already identified the cause: CPython was blocking on
> getrandom() in order to initialize hash randomization.  On this fresh
> virtual machine the entropy pool started out uninitialized.  And since the
> only thing running on the machine was CPython, and since CPython was blocked
> on initialization, the entropy pool was initializing very, very slowly.

Further analysis (mentioned later in the original Python-3.5-on-Linux
bug report) suggested that this wasn't actually a generic "waiting for
the entropy pool to initialise" problem. Instead, the problem appeared
to be specifically that the Python script was being invoked *before
the Linux kernel had initialised the entropy pool* and the boot
process was waiting for that script to run before continuing on with
other tasks (like initialising the entropy pool). That meant
os.urandom() had nothing to do with it (since the affected script
wasn't generating random numbers), and the entire problem was that we
were blocking trying to initialise CPython's internal hashing.

Born from Victor's proposal to add a "wait for entropy?" flag to
os.urandom [1], the simplest proposal for a long term fix [2] posted
so far has been to:

1. make os.urandom raise BlockingIOError if kernel entropy is not available
2. don't rely on os.urandom for internal hash initialisation
3. don't rely on os.urandom for MT seeding in the random module

Linux is currently the only OS we know of where the BlockingIOError
would be a possible result, and the only known scenarios where it
could be raised are Linux init system scripts and some embedded
systems where the kernel doesn't have any good sources of entropy. In
both those cases, the lack of entropy is potentially a real problem,
and an exception lets the software author make an informed decision to
either wait for entropy (e.g. by polling os.urandom() until it
succeeds, or selecting on /dev/random) or else read directly from
/dev/urandom (potentially getting non-cryptographically secure bits).

The virtue of this approach is that it's entirely invisible for almost
all users, and the users that it does affect will start getting an
exception in Python 3.6+ rather than silently being handed
cryptographically non-secure random data.
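
As a rough sketch of what that informed decision could look like in
calling code: secure_bytes(), insecure_bytes(), and the 'deadline', 'poll'
and 'urandom' arguments below are hypothetical names, and the code assumes
os.urandom() raises BlockingIOError as in item 1 above, which is the
proposal here rather than existing behaviour.

```python
import os
import time


def secure_bytes(n, deadline=30.0, poll=0.5, urandom=os.urandom):
    """Poll until secure bytes are available or the deadline passes.

    Assumes 'urandom' raises BlockingIOError while the kernel entropy
    pool is uninitialised (the behaviour proposed above, not a given).
    """
    give_up_at = time.monotonic() + deadline
    while True:
        try:
            return urandom(n)
        except BlockingIOError:
            if time.monotonic() >= give_up_at:
                raise  # let the caller (or a supervisor) decide
            time.sleep(poll)


def insecure_bytes(n):
    """Opt out explicitly: read the device, accepting possibly weak bytes."""
    with open("/dev/urandom", "rb") as f:
        return f.read(n)
```

Either way the choice is visible in the application's source instead of
being made silently on its behalf.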

Cheers,
Nick.

[1] http://bugs.python.org/issue27266
[2] http://bugs.python.org/issue27282

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From benno at benno.id.au  Thu Jun  9 13:57:58 2016
From: benno at benno.id.au (Ben Leslie)
Date: Thu, 9 Jun 2016 13:57:58 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609172911.GD27919@ando.pearwood.info>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
 <20160609172911.GD27919@ando.pearwood.info>
Message-ID: <CABZ0LtC31gBDML82THw+Y2_7M3A4E7agyfAwhamHzSdkFZ0stQ@mail.gmail.com>

On 9 June 2016 at 13:29, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Jun 09, 2016 at 12:54:31PM -0400, Ben Leslie wrote:
>
>> I think an exception is much easier for a user to deal with from a
>> practical point of view. Trying to work out why a process has hung is
>> obviously possible, but not necessarily easy.
>>
>> Having a process crash due to an exception is very easy to diagnose by
>> comparison.
>
> That only makes sense if the application is going to block for (say)
> five or ten minutes. If it's going to block for three seconds, you might
> not even notice. At least not on a server.
>
> But what are you going to do when you catch that exception?
>
> - Sleep for a few seconds, and try again? That's just blocking.
>
> - Stop waiting on secure randomness, and use something low quality
>   and insecure? That's how you get exploits.
>
> - Crash?

What does a program do on any exception? It really depends on the
program and the circumstances in which it is running.

But I would think that in most circumstances 'crash' is the answer.

In the circumstances where this is most likely going to occur (server
startup) you are almost certainly going to have some type of
supervisory program restarting the failed process. It will almost
certainly be logging the failure. Having logs filled with process
restarts due to this error until there is finally entropy is better
than it just hanging. At least that is what I'd prefer to diagnose.

I think the real solution here would be outside of Python; starting a
process that needs entropy when the system isn't ready yet is just as
silly as running a 'mount' on a disk where the driver is still
loading, or 'ifconfig' on a network interface where the network driver
isn't yet loaded. But that isn't really a problem that can be solved
in the context of Python.

Cheers,

Ben

From christian at python.org  Thu Jun  9 13:57:37 2016
From: christian at python.org (Christian Heimes)
Date: Thu, 9 Jun 2016 19:57:37 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609171450.GC27919@ando.pearwood.info>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <20160609171450.GC27919@ando.pearwood.info>
Message-ID: <njcamh$qjh$1@ger.gmane.org>

On 2016-06-09 19:14, Steven D'Aprano wrote:
> On Thu, Jun 09, 2016 at 12:39:00PM -0400, Donald Stufft wrote:
> 
>> There are three options for what to do with os.urandom by default:
>>
>> * Allow it to silently return data that may or may not be 
>> cryptographically secure based on what the state of the urandom pool 
>> initialization looks like.
> 
> Just to be clear, this is only an option on Linux, right? All the other 
> major platforms block, whatever we decide to do on Linux. Including 
> Windows?

To the best of my knowledge, Windows and OSX are already initialized when
Python is started. On other BSD platforms it is possible to get the
seeding state through the proc file system.

From zreed at fastmail.com  Thu Jun  9 15:41:02 2016
From: zreed at fastmail.com (zreed at fastmail.com)
Date: Thu, 09 Jun 2016 14:41:02 -0500
Subject: [Python-Dev] PEP 468
Message-ID: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>

Are there any further thoughts on including this in 3.6?  Similar to the
recent discussion on OrderedDict namespaces for metaclasses, this would
simplify / enable a number of type factory use cases where proper
metaclasses are overkill. This feature would also be quite nice in say
pandas where the (currently unspecified) field order used in the
definition of frames is preserved in user-visible displays.

From vgr255 at live.ca  Thu Jun  9 16:10:00 2016
From: vgr255 at live.ca (Émanuel Barry)
Date: Thu, 9 Jun 2016 16:10:00 -0400
Subject: [Python-Dev] PEP 468
In-Reply-To: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
Message-ID: <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>

> From: zreed at fastmail.com
> Subject: [Python-Dev] PEP 468
> 
> Are there any further thoughts on including this in 3.6?  Similar to the
> recent discussion on OrderedDict namespaces for metaclasses, this would
> simplify / enable a number of type factory use cases where proper
> metaclasses are overkill. This feature would also be quite nice in say
> pandas where the (currently unspecified) field order used in the
> definition of frames is preserved in user-visible displays.

As stated by Guido (and pointed out in the PEP):

Making **kwds ordered is still open, but requires careful design and
implementation to avoid slowing down function calls that don't benefit.

The PEP has not been updated in a while, though. Python 3.5 has been
released, and with it a C implementation of OrderedDict.

Eric, are you still interested in this? IIRC that PEP was one of the
motivating use cases for implementing OrderedDict in C. Maybe it's time for
a second round of discussion on Python-ideas?

-Emanuel

From ncoghlan at gmail.com  Thu Jun  9 17:39:40 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 9 Jun 2016 14:39:40 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <575772E6.7040906@stoneleaf.us>
 <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>
Message-ID: <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>

On 7 June 2016 at 20:17, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Tue, Jun 7, 2016 at 6:20 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> If __definition_order__ is supposed to be immutable as well as read-only
>> then we should convert non-tuples to tuples.  No point in letting that
>> user bug slip through.
>
> Do you mean if a class explicitly defines __definition_order__?  If
> so, I'm not clear on how that would work.  It could be set to
> anything, including None or a value that does not iterate into a
> definition order.  If someone explicitly set __definition_order__ then
> I think it should be used as-is.

I'm guessing Ethan is suggesting defining it as:

    __definition_order__ = tuple(ns["__definition_order__"])

when the attribute is present in the class body.

That restriction would be comparable to what we do with __slots__ today:

    >>> class C:
    ...     __slots__ = 1
    ...
    Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
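
For illustration, that coercion can be emulated with a metaclass. This is
only a sketch of the idea, not the PEP's implementation:
DefinitionOrderMeta is a hypothetical name, and the default order computed
here (non-dunder names in the order the class body defines them) is an
assumption.

```python
from collections import OrderedDict


class DefinitionOrderMeta(type):
    """Hypothetical metaclass emulating the tuple() coercion above."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Record names in the order the class body defines them.
        return OrderedDict()

    def __new__(mcls, name, bases, ns, **kwargs):
        if "__definition_order__" in ns:
            # Explicit value: coerce it, so a non-iterable raises TypeError.
            order = tuple(ns["__definition_order__"])
        else:
            order = tuple(k for k in ns
                          if not (k.startswith("__") and k.endswith("__")))
        ns["__definition_order__"] = order
        return super().__new__(mcls, name, bases, dict(ns), **kwargs)


class Point(metaclass=DefinitionOrderMeta):
    x = 0
    y = 0


class Custom(metaclass=DefinitionOrderMeta):
    __definition_order__ = ["b", "a"]  # stored as the tuple ("b", "a")
    a = 1
    b = 2
```

With this sketch, a class body that sets __definition_order__ = 1 fails at
class creation with the same TypeError as the __slots__ example.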

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Jun  9 17:55:13 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 9 Jun 2016 14:55:13 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160608210133.GA4318@python.ca>
References: <20160608210133.GA4318@python.ca>
Message-ID: <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>

On 8 June 2016 at 14:01, Neil Schemenauer <neil at python.ca> wrote:
> [I've posted something about this on python-ideas but since I now
> have some basic working code, I think it is more than an idea.]
>
> I think the uptake of Python 3 is starting to accelerate.  That's
> good.  However, there are still millions or maybe billions of lines
> of Python code that still needs to be ported.  It is beneficial to
> the Python ecosystem if this code can get ported.
>
> My idea is to make a stepping stone version of Python, between 2.7.x
> and 3.x that eases the porting job.  The high level goals are:
>
> - code coming out of 2to3 runs correctly on this modified Python
>
> - code that runs without warnings on this modified Python will run
>   correctly on Python 3.x.

As Victor noted, and as the porting guide describes in
https://docs.python.org/3/howto/pyporting.html#update-your-code, we've
determined that 2to3 isn't the best choice of tool for folks that
can't afford to immediately drop Python 2 support.

Once you switch to those now recommended more conservative migration
tools, the tool suite you request already exists:

- update your code with modernize or futurize
- check it still runs on Python 2.7
- check it doesn't generate warnings under 2.7's "-3" switch
- check it passes "pylint --py3k"
- check if it runs on Python 3.5

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Thu Jun  9 18:02:45 2016
From: brett at python.org (Brett Cannon)
Date: Thu, 09 Jun 2016 22:02:45 +0000
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
Message-ID: <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>

On Thu, 9 Jun 2016 at 14:56 Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 8 June 2016 at 14:01, Neil Schemenauer <neil at python.ca> wrote:
> > [I've posted something about this on python-ideas but since I now
> > have some basic working code, I think it is more than an idea.]
> >
> > I think the uptake of Python 3 is starting to accelerate.  That's
> > good.  However, there are still millions or maybe billions of lines
> > of Python code that still needs to be ported.  It is beneficial to
> > the Python ecosystem if this code can get ported.
> >
> > My idea is to make a stepping stone version of Python, between 2.7.x
> > and 3.x that eases the porting job.  The high level goals are:
> >
> > - code coming out of 2to3 runs correctly on this modified Python
> >
> > - code that runs without warnings on this modified Python will run
> >   correctly on Python 3.x.
>
> As Victor noted, and as the porting guide describes in
> https://docs.python.org/3/howto/pyporting.html#update-your-code, we've
> determined that 2to3 isn't the best choice of tool for folks that
> can't afford to immediately drop Python 2 support.
>
> Once you switch to those now recommended more conservative migration
> tools, the tool suite you request already exists:
>
> - update your code with modernize or futurize
> - check it still runs on Python 2.7
> - check it doesn't generate warnings under 2.7's "-3" switch
> - check it passes "pylint --py3k"
> - check if it runs on Python 3.5
>

`python3.5 -bb` is best to help keep Python 2.7 compatibility, otherwise
what Nick said. :)
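
For readers unfamiliar with the flag: -bb turns comparisons between bytes and
str into errors (BytesWarning raised as an exception) instead of silently
evaluating to False. A minimal sketch of the difference, where run_with_flags
is a hypothetical helper and the child process is simply whichever Python 3
interpreter is running the script:

```python
import subprocess
import sys


def run_with_flags(flags, code):
    """Run `code` in a child interpreter started with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
    )


snippet = "print(b'abc' == 'abc')"

plain = run_with_flags([], snippet)        # comparison is silently False
strict = run_with_flags(["-bb"], snippet)  # comparison raises BytesWarning

print("without -bb:", plain.stdout.strip(), "exit code", plain.returncode)
print("with -bb: exit code", strict.returncode)
print(strict.stderr.strip().splitlines()[-1])
```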

From ethan at stoneleaf.us  Thu Jun  9 18:11:50 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 09 Jun 2016 15:11:50 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <575772E6.7040906@stoneleaf.us>
 <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>
 <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>
Message-ID: <5759E9A6.1080706@stoneleaf.us>

On 06/09/2016 02:39 PM, Nick Coghlan wrote:
> On 7 June 2016 at 20:17, Eric Snow wrote:
>> On Tue, Jun 7, 2016 at 6:20 PM, Ethan Furman wrote:

>>> If __definition_order__ is supposed to be immutable as well as read-only
>>> then we should convert non-tuples to tuples.  No point in letting that
>>> user bug slip through.
>>
>> Do you mean if a class explicitly defines __definition_order__?  If
>> so, I'm not clear on how that would work.  It could be set to
>> anything, including None or a value that does not iterate into a
>> definition order.  If someone explicitly set __definition_order__ then
>> I think it should be used as-is.
>
> I'm guessing Ethan is suggesting defining it as:
>
>      __definition_order__ = tuple(ns["__definition_order__"])
>
> When the attribute is present in the class body.

Yup, that's it exactly.  Thanks, Nick!

--
~Ethan~

From ethan at stoneleaf.us  Thu Jun  9 18:16:55 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 09 Jun 2016 15:16:55 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAFT4OTEQ4YrOpjBv=5E88Eex7kC+Znc42A-OercO+xcO7MU+5g@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
 <CAFT4OTEQ4YrOpjBv=5E88Eex7kC+Znc42A-OercO+xcO7MU+5g@mail.gmail.com>
Message-ID: <5759EAD7.4030506@stoneleaf.us>

On 06/08/2016 02:40 PM, Fred Drake wrote:
> On Wed, Jun 8, 2016 at 5:33 PM, Ryan Gonzalez <rymg19 at gmail.com> wrote:
>> What about something like "unpythonic" or similar?
>
> Or perhaps... antipythy?

That's awfully close to antipathy [1], my path module on PyPI.

Besides, I liked the suggestion from the -ideas list: Python 2therescue. ;)

--
~Ethan~

[1] https://pypi.python.org/pypi/antipathy

From fred at fdrake.net  Thu Jun  9 18:19:14 2016
From: fred at fdrake.net (Fred Drake)
Date: Thu, 9 Jun 2016 18:19:14 -0400
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <5759EAD7.4030506@stoneleaf.us>
References: <20160608210133.GA4318@python.ca>
 <CAO41-mOuSP9YedSg2cK3o7gEnO-==f9X7=GbBhGBRunrAXuiCA@mail.gmail.com>
 <CAFT4OTEQ4YrOpjBv=5E88Eex7kC+Znc42A-OercO+xcO7MU+5g@mail.gmail.com>
 <5759EAD7.4030506@stoneleaf.us>
Message-ID: <CAFT4OTHUE_JNBkfxvToBMU_Pb40E+J-K2KnVHExCL8f+hAnrZw@mail.gmail.com>

On Thu, Jun 9, 2016 at 6:16 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
> That's awfully close to antipathy [1], my path module on PyPI.

Good point.  Increasing confusion would not help.

> Besides, I liked the suggestion from the -ideas list: Python 2therescue. ;)

Nice; I like that too.  :-)

  -Fred

-- 
Fred L. Drake, Jr.    <fred at fdrake.net>
"A storm broke loose in my mind."  --Albert Einstein

From larry at hastings.org  Thu Jun  9 18:22:35 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 15:22:35 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
Message-ID: <5759EC2B.8040208@hastings.org>

On 06/09/2016 08:52 AM, Guido van Rossum wrote:
> That leaves direct calls to os.urandom(). I don't think this should 
> block either.

Then it's you and me against the rest of the world ;-)

Okay, it's decided: os.urandom() must be changed for 3.5.2 to never 
block on a getrandom() call.  It's permissible to take advantage of 
getrandom(GRND_NONBLOCK), but if it returns EAGAIN we must read from 
/dev/urandom.

It's already well established that this will upset the cryptography 
experts.  As a concession to them, I propose adding a simple! 
predictable! function to Python 3.5.2: os.getrandom().  This would be a 
simple wrapper over getrandom, only available on platforms that expose 
it.  It would provide a way to use both extant flags, GRND_RANDOM and  
GRND_NONBLOCK, though possibly not exactly mirroring the native API.

This would enable cryptography libraries to easily do what (IIUC) they 
regard as the "correct" thing on Linux for all supported versions of Python:

    if hasattr(os, "getrandom"):
         bits = os.getrandom(n)
    else:
         bits = os.urandom(n)
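
The never-blocking behavior described above for os.urandom() can be sketched
in pure Python.  This is only an illustration under stated assumptions: it
relies on os.getrandom(size, flags) and os.GRND_NONBLOCK being exposed (which
is the case on Linux from Python 3.6 on), the function name
urandom_never_block is hypothetical, and the real change would live in C:

```python
import os


def urandom_never_block(n):
    """Return n random bytes without waiting for the entropy pool (sketch)."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        try:
            data = getrandom(n, os.GRND_NONBLOCK)
            if len(data) == n:
                return data
        except OSError:
            # BlockingIOError (EAGAIN) means the pool is not initialized yet;
            # other OSErrors (e.g. ENOSYS on an old kernel) also fall through.
            pass
    # Fallback: read the device directly, looping over short reads.
    chunks = []
    remaining = n
    with open("/dev/urandom", "rb", buffering=0) as f:
        while remaining:
            chunk = f.read(remaining)
            if not chunk:
                raise OSError("unexpected EOF reading /dev/urandom")
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)


bits = urandom_never_block(16)
```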

I'm not excited about adding a new function in 3.5.2, but on the other 
hand we are taking away this functionality they had in 3.5.0 and 3.5.1 
so only seems fair.  And the implementation of os.getrandom() should be 
very straightforward, and its semantics will mirror the native call, so 
I'm pretty confident we can get it solid in a couple of days, though we 
might slip 3.5.2rc1 by a day or two.

Guido: do you see this as an acceptable compromise?

Cryptographers: given that os.urandom() will no longer block in 3.5.2, 
do you want this?

Pointing out an alternate approach: Marc-Andre Lemburg proposes in issue 
#27279 ( http://bugs.python.org/issue27279 ) that we should add two 
"known best-practices" functions to get pseudo-random bits; one merely 
for pseudo random bits, the other for crypto-strength pseudo random 
bits.  While I think this is a fine idea, the exact spelling, semantics, 
and per-platform implementation of these functions is far from settled, 
and nobody is proposing that we do something like that for 3.5.

//arry/

From larry at hastings.org  Thu Jun  9 18:33:03 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 15:33:03 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <A7C3ACE6-2C34-432F-B4B2-E56AB006773C@stufft.io>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <20160609171450.GC27919@ando.pearwood.info>
 <A7C3ACE6-2C34-432F-B4B2-E56AB006773C@stufft.io>
Message-ID: <5759EE9F.9010203@hastings.org>

On 06/09/2016 10:22 AM, Donald Stufft wrote:
>> On Jun 9, 2016, at 1:14 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>>
>> Just to be clear, this is only an option on Linux, right? All the other
>> major platforms block, whatever we decide to do on Linux. Including
>> Windows?
> To my knowledge, all other major platforms block or otherwise ensure that /dev/urandom can never return anything but cryptographically secure random. [1]

I've done some research into this over the past couple of days.  To the 
best of my knowledge:

* Linux: /dev/urandom will never block.  If the entropy pool isn't 
initialized yet, it will return poor-quality random bits from what is 
effectively an unseeded PRNG.  (Yes: it uses a custom algorithm which 
isn't considered CSPRNG-strength; it is merely a PRNG seeded with entropy.)

* OS X: AFAICT, /dev/urandom guarantees it will never block.  It uses an 
old CSPRNG, 160-bit Yarrow.  The documentation states that if the 
entropy pool is "drained", it won't block; instead it'll degrade 
("output quality will suffer over time without any explicit indication 
from the random device itself").  It isn't clear how initialization of 
the entropy pool during early startup might affect this.  
http://www.manpages.info/macosx/random.4.html

* FreeBSD: /dev/urandom may block.  It also uses Yarrow (but maybe with 
more bits? and possibly switching soon to Yarrow's successor, 
Fortuna?).  Both devices guarantee high-quality random bits, and will 
block if they feel like they're running low on entropy.

* OpenBSD 5.1 is like FreeBSD, except the algorithm used is ARC4. In 
OpenBSD 5.5 they changed to using ChaCha20.

On all of those platforms *except* Linux, /dev/random and /dev/urandom 
are exactly the same.

Also, regarding Windows: Victor Stinner did some experiments with a VM, 
and even in early startup he was able to get random bits from 
os.urandom().  But it's hard to have a "fresh" Windows VM, so it's 
possible it had residual entropy from a previous boot, so this isn't 
conclusive.

//arry/

From ethan at stoneleaf.us  Thu Jun  9 18:44:11 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 09 Jun 2016 15:44:11 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5759EC2B.8040208@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
Message-ID: <5759F13B.2000909@stoneleaf.us>

On 06/09/2016 03:22 PM, Larry Hastings wrote:
>
> On 06/09/2016 08:52 AM, Guido van Rossum wrote:
>> That leaves direct calls to os.urandom(). I don't think this should
>> block either.
>
> Then it's you and me against the rest of the world ;-)
>
>
> Okay, it's decided: os.urandom() must be changed for 3.5.2 to never
> block on a getrandom() call.

One way to not block is to raise an exception.  Since this is such a 
rare occurrence anyway I don't see this being a problem, plus it keeps 
everybody mostly happy:  normal users won't see it hang, crypto-folk 
won't see vulnerable-from-this-cause-by-default machines, and those 
running Python early in the boot sequence will have something they can 
figure out, plus an existing knob to work around it [hashseed, I think?].

> As a concession to [the crypto experts], I propose adding a simple!
> predictable! function to Python 3.5.2: os.getrandom().

This would be unnecessary if we go the exception route.

> And the implementation of os.getrandom() should be
> very straightforward, and its semantics will mirror the native call, so
> I'm pretty confident we can get it solid in a couple of days, though we
> might slip 3.5.2rc1 by a day or two.

I would think the exception route would also not take very long to make 
solid.

Okay, I'll shut up now.  ;)

--
~Ethan~

From larry at hastings.org  Thu Jun  9 18:47:54 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 15:47:54 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5759F13B.2000909@stoneleaf.us>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
Message-ID: <5759F21A.5080905@hastings.org>

On 06/09/2016 03:44 PM, Ethan Furman wrote:
> On 06/09/2016 03:22 PM, Larry Hastings wrote:
>> Okay, it's decided: os.urandom() must be changed for 3.5.2 to never
>> block on a getrandom() call.
>
> One way to not block is to raise an exception.  Since this is such a 
> rare occurrence anyway I don't see this being a problem, plus it keeps 
> everybody mostly happy:  normal users won't see it hang, crypto-folk 
> won't see vulnerable-from-this-cause-by-default machines, and those 
> running Python early in the boot sequence will have something they can 
> figure out, plus an existing knob to work around it [hashseed, I think?].

Nope, I want the old behavior back.  os.urandom() should read 
/dev/urandom if getrandom() would block.  As the British say, "it should 
do what it says on the tin".

//arry/

From neil at python.ca  Thu Jun  9 19:08:07 2016
From: neil at python.ca (Neil Schemenauer)
Date: Thu, 9 Jun 2016 16:08:07 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
Message-ID: <20160609230807.GA8118@python.ca>

On 2016-06-09, Brett Cannon wrote:
> On Thu, 9 Jun 2016 at 14:56 Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Once you switch to those now recommended more conservative migration
> > tools, the tool suite you request already exists:
> >
> > - update your code with modernize or futurize
> > - check it still runs on Python 2.7
> > - check it doesn't generate warnings under 2.7's "-3" switch
> > - check it passes "pylint --py3k"
> > - check if it runs on Python 3.5
> >
> 
> `python3.5 -bb` is best to help keep Python 2.7 compatibility, otherwise
> what Nick said. :)

I have to wonder if you guys actually ported a lot of Python 2
code.  Maybe you somehow avoided the problematic behavior. Below is
a pretty trivial set of functions.  The tools you recommend do not
help at all.  One problem is that the str literals should be bytes
literals.  Comparison with None needs to be avoided.

With Python 2 the code runs successfully.  With Python 3 the code
crashes with a traceback.  With my modified Python 3.6, the code
runs successfully but generates the following warnings:

    test.py:13: DeprecationWarning: encoding bytes to str
      output.write('%d:' % len(s))
    test.py:14: DeprecationWarning: encoding bytes to str
      output.write(s)
    test.py:15: DeprecationWarning: encoding bytes to str
      output.write(',')
    test.py:5: DeprecationWarning: encoding bytes to str
      if c == ':':
    test.py:9: DeprecationWarning: encoding bytes to str
      size += c
    test.py:24: DeprecationWarning: encoding bytes to str
      data = data + s
    test.py:26: DeprecationWarning: encoding bytes to str
      if input.read(1) != ',':
    test.py:31: DeprecationWarning: default compare is depreciated
      if a > 0:

It is very easy for me to find code written for Python 2 that will
fail in the same way.  According to you guys, there is no problem
and we already have good enough tooling. ;-(
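
The attachment is not preserved in this archive.  Purely as a hypothetical
illustration (this is not the original test.py; the function names are
invented), netstring-style helpers of the kind the warnings point at would
need bytes literals and an explicit None check to run on Python 3:

```python
import io


def write_netstring(output, s):
    # bytes literals throughout: a binary stream rejects str on Python 3.
    output.write(b'%d:' % len(s))  # bytes %-formatting needs Python 3.5+
    output.write(s)
    output.write(b',')


def read_netstring(input):
    size = b''
    while True:
        c = input.read(1)
        if c == b':':              # comparing against ':' would never match
            break
        if not c:
            raise EOFError('short netstring read')
        size += c
    data = input.read(int(size))
    if input.read(1) != b',':
        raise ValueError('missing netstring terminator')
    return data


def is_positive(a):
    # Python 2 evaluated `None > 0` as False; Python 3 raises TypeError,
    # so the None case has to be handled explicitly.
    return a is not None and a > 0


buf = io.BytesIO()
write_netstring(buf, b'hello')
buf.seek(0)
print(read_netstring(buf))
```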
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.py
Type: text/x-python
Size: 1133 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160609/67823727/attachment.py>

From brett at python.org  Thu Jun  9 19:43:24 2016
From: brett at python.org (Brett Cannon)
Date: Thu, 09 Jun 2016 23:43:24 +0000
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160609230807.GA8118@python.ca>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
Message-ID: <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>

On Thu, 9 Jun 2016 at 16:08 Neil Schemenauer <neil at python.ca> wrote:

> On 2016-06-09, Brett Cannon wrote:
> > On Thu, 9 Jun 2016 at 14:56 Nick Coghlan <ncoghlan at gmail.com> wrote:
> > > Once you switch to those now recommended more conservative migration
> > > tools, the tool suite you request already exists:
> > >
> > > - update your code with modernize or futurize
> > > - check it still runs on Python 2.7
> > > - check it doesn't generate warnings under 2.7's "-3" switch
> > > - check it passes "pylint --py3k"
> > > - check if it runs on Python 3.5
> > >
> >
> > `python3.5 -bb` is best to help keep Python 2.7 compatibility, otherwise
> > what Nick said. :)
>
> I have to wonder if you guys actually ported a lot of Python 2
> code.

Yes I have, including code that needed to be 2.4-3.4 compatible of all
things. Plus I'm the author of the porting HOWTO so I know the edge cases
pretty well.

I don't think you meant for what you said to sound insulting, Neil, but it
did feel like it upon first reading.

> Maybe you somehow avoided the problematic behavior. Below is
> a pretty trivial set of functions.  The tools you recommend do not
> help at all.  One problem is that the str literals should be bytes
> literals.

At least for Modernize that's on purpose as it can't tell semantically what
is meant to be binary data vs. textual ASCII data (which you obviously
know, else you wouldn't be trying to add runtime warnings for this sort of
stuff).

> Comparison with None needs to be avoided.
>
> With Python 2 code runs successfully.  With Python 3 the code
> crashes with a traceback.  With my modified Python 3.6, the code
> runs successfully but generates the following warnings:
>
>     test.py:13: DeprecationWarning: encoding bytes to str
>       output.write('%d:' % len(s))
>     test.py:14: DeprecationWarning: encoding bytes to str
>       output.write(s)
>     test.py:15: DeprecationWarning: encoding bytes to str
>       output.write(',')
>     test.py:5: DeprecationWarning: encoding bytes to str
>       if c == ':':
>     test.py:9: DeprecationWarning: encoding bytes to str
>       size += c
>     test.py:24: DeprecationWarning: encoding bytes to str
>       data = data + s
>     test.py:26: DeprecationWarning: encoding bytes to str
>       if input.read(1) != ',':
>     test.py:31: DeprecationWarning: default compare is depreciated
>       if a > 0:
>
> It is very easy for me to find code written for Python 2 that will
> fail in the same way.  According to you guys, there is no problem
> and we already have good enough tooling. ;-(
>

That's not what I'm saying at all (nor what I think Nick is saying); more
tooling to ease the transition is always welcomed. The point we are trying
to make is 2to3 is not considered best practice anymore, and so targeting
its specific output might not be the best use of your time. I'm totally
happy to have your fork work out and help give warnings for situations
where runtime semantics are the only way to know there will be a problem
that static analyzing tools can't handle and have the porting HOWTO updated
so that people can run their test suite with your interpreter to help with
that final bit of porting. I personally just don't want to see you waste
time on warnings that are handled by the tools already or ignore the fact
that six, modernize, and futurize can help more than 2to3 typically can
with the easy stuff when trying to keep 2/3 compatibility. IOW some of us
have become allergic to the word "2to3" in regards to porting. :) But if
you want to target 2to3 output then by all means please do and your work
will still be appreciated.

And I should also mention in case you don't know -- and assuming I'm
remembering correctly -- that adding new Py3kWarning cases to Python 2.7 is
still allowed, so if there is a warning you want to add that makes sense to
be upstream then we can consider adding it in Python 2.7.12 (or later).

From steve.dower at python.org  Thu Jun  9 20:00:39 2016
From: steve.dower at python.org (Steve Dower)
Date: Fri, 10 Jun 2016 10:00:39 +1000
Subject: [Python-Dev] BDFL ruling request: should we block
 foreverwaiting for high-quality random bits?
In-Reply-To: <5759F21A.5080905@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org>
Message-ID: <E1bB9sx-0000CA-Dv@se2-syd.hostedmail.net.au>

(fat fingered the send button, picking up where I left off)

If the pattern is really going to be the hasattr check you posted earlier, can we just do it for people and save them writing code that won't work on different OSs?

Cheers,
Steve

Top-posted from my Windows Phone

-----Original Message-----
From: "Larry Hastings" <larry at hastings.org>
Sent: 6/10/2016 8:50
To: "python-dev at python.org" <python-dev at python.org>
Subject: Re: [Python-Dev] BDFL ruling request: should we block foreverwaiting for high-quality random bits?

On 06/09/2016 03:44 PM, Ethan Furman wrote:

On 06/09/2016 03:22 PM, Larry Hastings wrote: 

Okay, it's decided: os.urandom() must be changed for 3.5.2 to never 
block on a getrandom() call. 

One way to not block is to raise an exception.  Since this is such a rare occurrence anyway I don't see this being a problem, plus it keeps everybody mostly happy:  normal users won't see it hang, crypto-folk won't see vulnerable-from-this-cause-by-default machines, and those running Python early in the boot sequence will have something they can figure out, plus an existing knob to work around it [hashseed, I think?].

Nope, I want the old behavior back.  os.urandom() should read /dev/urandom if getrandom() would block.  As the British say, "it should do what it says on the tin".

/arry

From neil at python.ca  Thu Jun  9 20:35:57 2016
From: neil at python.ca (Neil Schemenauer)
Date: Thu, 9 Jun 2016 17:35:57 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
Message-ID: <20160610003557.GA9353@python.ca>

On 2016-06-09, Brett Cannon wrote:
> I don't think you meant for what you said to sound insulting,
> Neil, but it did feel like it upon first reading.

Sorry, I think I misunderstood what you and Nick were saying.  I've
experienced a fair amount of negative feedback on my idea so I'm
pretty cranky at this point.  Amber Brown claimed that she spent
$60k of her time porting Twisted to Python 3.  I think there is lots
of room to make our porting tools better.

Using something like modernize, 2to6, or sixer seems like a better
idea than trying to improve on 2to3.  I agree on that point.
However, those tools combined with my modified Python 3.6 makes for
a much easier migration path than going directly to Python 3.x.  My
runtime warnings catch many common problems and make it easy to see
what needs fixing.

We have a lot more freedom to put ugly, backwards compatibility
hacks into this stepping stone version, rather than changing either
Python 2.7.x or the main 3.x line.  I'm hoping to get community
contributions to add more backwards compatibility and runtime
warnings.

From steve.dower at python.org  Thu Jun  9 19:58:57 2016
From: steve.dower at python.org (Steve Dower)
Date: Fri, 10 Jun 2016 09:58:57 +1000
Subject: [Python-Dev] BDFL ruling request: should we block
 foreverwaiting for high-quality random bits?
In-Reply-To: <5759F21A.5080905@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org>
Message-ID: <E1bBAWo-0006Os-1G@se2-syd.hostedmail.net.au>

Can we get any new function on all platforms, deferring to urandom() if getrandom() isn't there?

If the pattern is really going to be the hasattr check you posted earlier

Top-posted from my Windows Phone

-----Original Message-----
From: "Larry Hastings" <larry at hastings.org>
Sent: 6/10/2016 8:50
To: "python-dev at python.org" <python-dev at python.org>
Subject: Re: [Python-Dev] BDFL ruling request: should we block foreverwaiting for high-quality random bits?

On 06/09/2016 03:44 PM, Ethan Furman wrote:

On 06/09/2016 03:22 PM, Larry Hastings wrote: 

Okay, it's decided: os.urandom() must be changed for 3.5.2 to never 
block on a getrandom() call. 

One way to not block is to raise an exception.  Since this is such a rare occurrence anyway I don't see this being a problem, plus it keeps everybody mostly happy:  normal users won't see it hang, crypto-folk won't see vulnerable-from-this-cause-by-default machines, and those running Python early in the boot sequence will have something they can figure out, plus an existing knob to work around it [hashseed, I think?].

Nope, I want the old behavior back.  os.urandom() should read /dev/urandom if getrandom() would block.  As the British say, "it should do what it says on the tin".

/arry

From greg.ewing at canterbury.ac.nz  Thu Jun  9 20:33:16 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 10 Jun 2016 12:33:16 +1200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609174927.GE27919@ando.pearwood.info>
References: <57595210.4000508@hastings.org>
 <6728E567-C894-47EF-88E0-0E0A2A678E6B@stufft.io>
 <20160609163000.GB27919@ando.pearwood.info>
 <D0F93447-6FFC-49BF-8CCB-3646D9D0C926@stufft.io>
 <CABZ0LtAF_qUJCNdU0q+9-6Q9v_WxL+jWgG4dX9KRUBwskkeKjA@mail.gmail.com>
 <CACac1F-zRKeaAhZOYW85c3K2hFA_PMi3Jww6D+a1wufPLEExAA@mail.gmail.com>
 <20160609174927.GE27919@ando.pearwood.info>
Message-ID: <575A0ACC.5040809@canterbury.ac.nz>

Steven D'Aprano wrote:
> - Linux /dev/urandom doesn't block, but it might return predictable, 
>   poor-quality pseudo-random bytes (i.e. a potential exploit);
> 
> - Other OSes may block for potentially many minutes (i.e. a 
>   potential DOS).

It's even possible that it could block *forever*.

There was a case here recently in the cosc dept where students were
running Clojure programs in a virtual machine environment. When
they updated to a newer version of Clojure, everyone's programs
started hanging on startup. It turned out the Clojure library was
initialising its RNG from /dev/random, and the VM didn't have any
real spinning disks or other devices to provide entropy.

-- 
Greg

From njs at pobox.com  Thu Jun  9 21:03:35 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 9 Jun 2016 18:03:35 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5759EC2B.8040208@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
Message-ID: <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>

On Thu, Jun 9, 2016 at 3:22 PM, Larry Hastings <larry at hastings.org> wrote:
>
> On 06/09/2016 08:52 AM, Guido van Rossum wrote:
>
>> That leaves direct calls to os.urandom(). I don't think this should block
>> either.
>
>
> Then it's you and me against the rest of the world ;-)
>
>
> Okay, it's decided: os.urandom() must be changed for 3.5.2 to never block on
> a getrandom() call.  It's permissible to take advantage of
> getrandom(GRND_NONBLOCK), but if it returns EAGAIN we must read from
> /dev/urandom.
>
> It's already well established that this will upset the cryptography experts.
> As a concession to them, I propose adding a simple! predictable! function to
> Python 3.5.2: os.getrandom().  This would be a simple wrapper over
> getrandom, only available on platforms that expose it.  It would provide a
> way to use both extant flags, GRND_RANDOM and  GRND_NONBLOCK, though
> possibly not exactly mirroring the native API.
>
> This would enable cryptography libraries to easily do what (IIUC) they
> regard as the "correct" thing on Linux for all supported versions of Python:
>
> if hasattr(os, "getrandom"):
>     bits = os.getrandom(n)
> else:
>     bits = os.urandom(n)

So I understand that the trade-offs between crypto users and regular
users are tricky, but this resolution concerns me quite a bit :-(

Specifically, it seems to me that:
1) we now have these two functions that need to be supported forever,
and AFAICT in every case where someone is currently explicitly calling
os.urandom and the behavior differs, they want os.getrandom instead.
(This is based on the assumption that the only time that explicitly
calling os.urandom is the best option is when one cares about the
cryptographic strength of the result -- I'm explicitly distinguishing
here between the hash seeding issue that triggered the original bug
report and explicit calls to os.urandom.) So in practice this change
makes it so that the only correct way of calling either of these
functions is the if/else stanza above.
2) every piece of security-sensitive software is going to spend
resources churning their code to implement the above,
3) every future security audit of Python software is going to spend
resources making sure this is on their checklist of incredibly subtle
gotchas that have to be audited for,
4) the crypto folks are going to have to spin up a whole evangelism
effort to re-educate everyone that (contrary to what we've been
telling everyone for years), os.urandom is no longer the right way to
get cryptographic randomness.

OTOH if we allow explicit calls to os.urandom to block or raise an
exception, then AFAICT from this thread this will break exactly zero
projects.

Maybe this is just rehashing the same things that have already been
discussed ad nauseam, in which case I apologize. But I really feel
like this is one of those cases where the crypto folks aren't so much
saying "oh BUT what if <incredibly unlikely situation involving
oppressive regimes and ticking bombs>"; they're more saying "oh $#@
you're going to cause me a *massive* amount of real work and churn and
ongoing costs for no perceivable gain and I'm exhausted even thinking
about it".

> I'm not excited about adding a new function in 3.5.2, but on the other hand
> we are taking away this functionality they had in 3.5.0 and 3.5.1 so only
> seems fair.  And the implementation of os.getrandom() should be very
> straightforward, and its semantics will mirror the native call, so I'm
> pretty confident we can get it solid in a couple of days, though we might
> slip 3.5.2rc1 by a day or two.
>
> Guido: do you see this as an acceptable compromise?
>
> Cryptographers: given that os.urandom() will no longer block in 3.5.2, do
> you want this?
>
>
> Pointing out an alternate approach: Marc-Andre Lemburg proposes in issue
> #27279 ( http://bugs.python.org/issue27279 ) that we should add two "known
> best-practices" functions to get pseudo-random bits; one merely for pseudo
> random bits, the other for crypto-strength pseudo random bits.  While I
> think this is a fine idea, the exact spelling, semantics, and per-platform
> implementation of these functions is far from settled, and nobody is
> proposing that we do something like that for 3.5.

We already have a function for non-crypto-strength pseudo-random bits:
random.getrandbits. os.urandom is the one for the cryptographers (I
thought).
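
A minimal sketch of the distinction drawn above, assuming only the two
standard-library calls named in this message. The helper names
(simulation_bits, crypto_bytes) are hypothetical and exist purely for
illustration:

```python
import os
import random


def simulation_bits(seed, k):
    """Return k pseudo-random bits from a seeded Mersenne Twister.

    The result is fully reproducible from the seed, which is fine for
    simulations and tests but unsuitable for keys or tokens.
    """
    return random.Random(seed).getrandbits(k)


def crypto_bytes(n):
    """Return n bytes from the operating system's randomness source."""
    return os.urandom(n)


# Same seed, same bits: an observer who learns the seed can predict the output.
assert simulation_bits(2016, 64) == simulation_bits(2016, 64)

# os.urandom() takes no seed; the caller cannot make it reproducible.
token = crypto_bytes(16)
```

The reproducibility of the seeded generator is the property that makes it
a poor fit for secrets, whatever one concludes about how os.urandom()
should behave before the kernel pool is initialised.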

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From guido at python.org  Thu Jun  9 21:18:33 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 9 Jun 2016 18:18:33 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
Message-ID: <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>

I don't think we should add a new function. I think we should convince
ourselves that there is not enough of a risk of an exploit even if
os.urandom() falls back.

On Thu, Jun 9, 2016 at 6:03 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Thu, Jun 9, 2016 at 3:22 PM, Larry Hastings <larry at hastings.org> wrote:
> >
> > On 06/09/2016 08:52 AM, Guido van Rossum wrote:
> >
> >> That leaves direct calls to os.urandom(). I don't think this should
> block
> >> either.
> >
> >
> > Then it's you and me against the rest of the world ;-)
> >
> >
> > Okay, it's decided: os.urandom() must be changed for 3.5.2 to never
> block on
> > a getrandom() call.  It's permissible to take advantage of
> > getrandom(GRND_NONBLOCK), but if it returns EAGAIN we must read from
> > /dev/urandom.
> >
> > It's already well established that this will upset the cryptography
> experts.
> > As a concession to them, I propose adding a simple! predictable!
> function to
> > Python 3.5.2: os.getrandom().  This would be a simple wrapper over
> > getrandom, only available on platforms that expose it.  It would provide
> a
> > way to use both extant flags, GRND_RANDOM and  GRND_NONBLOCK, though
> > possibly not exactly mirroring the native API.
> >
> > This would enable cryptography libraries to easily do what (IIUC) they
> > regard as the "correct" thing on Linux for all supported versions of
> Python:
> >
> > if hasattr(os, "getrandom"):
> >     bits = os.getrandom(n)
> > else:
> >     bits = os.urandom(n)
>
> So I understand that the trade-offs between crypto users and regular
> users are tricky, but this resolution concerns me quite a bit :-(
>
> Specifically, it seems to me that:
> 1) we now have these two functions that need to be supported forever,
> and AFAICT in every case where someone is currently explicitly calling
> os.urandom and the behavior differs, they want os.getrandom instead.
> (This is based on the assumption that the only time that explicitly
> calling os.urandom is the best option is when one cares about the
> cryptographic strength of the result -- I'm explicitly distinguishing
> here between the hash seeding issue that triggered the original bug
> report and explicit calls to os.urandom.) So in practice this change
> makes it so that the only correct way of calling either of these
> functions is the if/else stanza above.
> 2) every piece of security-sensitive software is going to spend
> resources churning their code to implement the above,
> 3) every future security audit of Python software is going to spend
> resources making sure this is on their checklist of incredibly subtle
> gotchas that have to be audited for,
> 4) the crypto folks are going to have to spin up a whole evangelism
> effort to re-educate everyone that (contrary to what we've been
> telling everyone for years), os.urandom is no longer the right way to
> get cryptographic randomness.
>
> OTOH if we allow explicit calls to os.urandom to block or raise an
> exception, then AFAICT from this thread this will break exactly zero
> projects.
>
> Maybe this is just rehashing the same things that have already been
> discussed ad nauseam, in which case I apologize. But I really feel
> like this is one of those cases where the crypto folks aren't so much
> saying "oh BUT what if <incredibly unlikely situation involving
> oppressive regimes and ticking bombs>"; they're more saying "oh $#@
> you're going to cause me a *massive* amount of real work and churn and
> ongoing costs for no perceivable gain and I'm exhausted even thinking
> about it".
>
> > I'm not excited about adding a new function in 3.5.2, but on the other
> hand
> > we are taking away this functionality they had in 3.5.0 and 3.5.1 so only
> > seems fair.  And the implementation of os.getrandom() should be very
> > straightforward, and its semantics will mirror the native call, so I'm
> > pretty confident we can get it solid in a couple of days, though we might
> > slip 3.5.2rc1 by a day or two.
> >
> > Guido: do you see this as an acceptable compromise?
> >
> > Cryptographers: given that os.urandom() will no longer block in 3.5.2, do
> > you want this?
> >
> >
> > Pointing out an alternate approach: Marc-Andre Lemburg proposes in issue
> > #27279 ( http://bugs.python.org/issue27279 ) that we should add two
> "known
> > best-practices" functions to get pseudo-random bits; one merely for
> pseudo
> > random bits, the other for crypto-strength pseudo random bits.  While I
> > think this is a fine idea, the exact spelling, semantics, and
> per-platform
> > implementation of these functions is far from settled, and nobody is
> > proposing that we do something like that for 3.5.
>
> We already have a function for non-crypto-strength pseudo-random bits:
> random.getrandbits. os.urandom is the one for the cryptographers (I
> thought).
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org

-- 
--Guido van Rossum (python.org/~guido)

From barry at python.org  Thu Jun  9 21:53:43 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 9 Jun 2016 21:53:43 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5759EC2B.8040208@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
Message-ID: <20160609215343.00b0190e.barry@wooz.org>

On Jun 09, 2016, at 03:22 PM, Larry Hastings wrote:

>On 06/09/2016 08:52 AM, Guido van Rossum wrote:
>> That leaves direct calls to os.urandom(). I don't think this should block either.
>
>Then it's you and me against the rest of the world ;-)

FWIW, I agree with you and Guido.  I'm also not opposed to adding a more
direct exposure of getrandom(), but in Python 3.6 only.  Like it or not,
that's the right approach for our backward compatibility policies.

Cheers,
-Barry

From barry at python.org  Thu Jun  9 22:13:56 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 9 Jun 2016 22:13:56 -0400
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160610003557.GA9353@python.ca>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
 <20160610003557.GA9353@python.ca>
Message-ID: <20160609221356.3b18e447.barry@wooz.org>

On Jun 09, 2016, at 05:35 PM, Neil Schemenauer wrote:

>Amber Brown claimed that she spent $60k of her time porting Twisted to Python
>3.  I think there is lots of room to make our porting tools better.

Amber gave a presentation at the language summit and a Pycon talk.  The latter
video is up on YouTube but the former wasn't recorded.  I'm hoping Jake will
post a summary of it though.

She's done a truly impressive amount of work in porting Twisted and has a lot
of good insight.  I've ported a fair bit, but nothing of the size and
complexity of Twisted.  FWIW, I did port the Mailman 3 core, which is now
Python 3.4 and 3.5 compatible.

In my own experience, and IIRC Amber had a similar experience, the ease of
porting to Python 3 really comes down to how bytes/unicode clean your code
base is.  Almost all the other pieces are either pretty manageable or fairly
easily automated.  But if your code isn't bytes-clean you're in for a world
of hurt because you first have to decide how to represent those things.
Twisted's job is especially fun because it's all about wire protocols, which I
think Amber described as (paraphrasing) bytes that happen to have contents
that look like strings.
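
As a rough illustration of what "bytes-clean" can mean for a wire
protocol, here is a minimal sketch, assuming a made-up header format of
the form "Name: value\r\n".  The function name parse_header and the
choice to decode only the header name are assumptions for the example,
not anything taken from Twisted or Mailman:

```python
def parse_header(line):
    """Split one raw header line into (name, value).

    The line arrives as bytes and is handled as bytes throughout.  Only
    the header name, which this sketch assumes to be ASCII, is decoded
    to str; the value stays bytes until the caller decides what it means.
    """
    name, sep, value = line.rstrip(b"\r\n").partition(b":")
    if not sep:
        raise ValueError("not a header line: %r" % (line,))
    return name.decode("ascii").lower(), value.strip()


name, value = parse_header(b"Content-Length: 42\r\n")
# name is the str 'content-length'; value is still the bytes b'42'
length = int(value)
```

Deciding once, at the boundary, which pieces are text and which remain
bytes is the representation choice described above; the rest of the
port tends to follow from it.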

I've ported some libraries that weren't bytes-clean.  With one of them, I
actually failed twice before I hit on the correct representation.  Once I got
that right the rest went much more quickly.

There does seem to be a wide variety of experiences in porting to Python 3.
I think it's worth accepting, acknowledging, and promoting that for a lot
of code, it's really not that hard, but that for some code it's really
painful.  It's within our job to help understand the remaining pain and
address it in some way.  But let's also not scare people away from Python 3,
because it *can* be very easy to port, and I think there's fairly widespread
agreement that once you're in the Python 3 world, you don't want to look back.

Cheers,
-Barry

From barry at python.org  Thu Jun  9 22:21:57 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 9 Jun 2016 22:21:57 -0400
Subject: [Python-Dev] PEP 467:  Minor API improvements to bytes,
 bytearray, and memoryview
References: <57572E5D.4020101@stoneleaf.us>
Message-ID: <20160609222157.2063ca00@anarchist.wooz.org>

On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:

>Deprecation of current "zero-initialised sequence" behaviour
>------------------------------------------------------------
>
>Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
>argument and interpret it as meaning to create a zero-initialised sequence of
>the given size::
>
>     >>> bytes(3)  
>     b'\x00\x00\x00'
>     >>> bytearray(3)  
>     bytearray(b'\x00\x00\x00')
>
>This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
>entirely in Python 3.7.
>
>No other changes are proposed to the existing constructors.

Does it need to be *actually* removed?  That does break existing code for not
a lot of benefit.  Yes, the default constructor is a little wonky, but with
the addition of the new constructors, and the fact that you're not proposing
to eventually change the default constructor, removal seems unnecessary.
Besides, once it's removed, what would `bytes(3)` actually do?  The PEP
doesn't say.

Also, since you're proposing to add `bytes.byte(3)` have you considered also
adding an optional count argument?  E.g. `bytes.byte(3, count=7)` would yield
b'\x03\x03\x03\x03\x03\x03\x03'.  That seems like it could be useful.
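
A minimal sketch of the suggested behaviour, assuming a module-level
helper named byte() as a hypothetical stand-in for the proposed
bytes.byte() (which does not exist in any released Python), next to the
current integer-argument constructor for comparison:

```python
def byte(value, count=1):
    """Hypothetical stand-in for the proposed bytes.byte(value, count=...).

    Returns a bytes object made of `count` copies of the single byte
    `value`, which must be an integer in range(256).
    """
    if not 0 <= value <= 255:
        raise ValueError("byte must be in range(0, 256)")
    return bytes([value]) * count


zeros = bytes(3)            # current behaviour: three zero bytes
single = byte(3)            # one byte with the value 3
repeated = byte(3, count=7) # seven bytes, each with the value 3
```

The same result is already spelled bytes([3]) * 7 today, so the count
argument would be a convenience rather than new functionality.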

Cheers,
-Barry

From Nikolaus at rath.org  Thu Jun  9 22:38:37 2016
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Thu, 09 Jun 2016 19:38:37 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5759F21A.5080905@hastings.org> (Larry Hastings's message of
 "Thu, 9 Jun 2016 15:47:54 -0700")
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org>
Message-ID: <87oa79ydhu.fsf@vostro.rath.org>

On Jun 09 2016, Larry Hastings <larry at hastings.org> wrote:
> On 06/09/2016 03:44 PM, Ethan Furman wrote:
>> On 06/09/2016 03:22 PM, Larry Hastings wrote:
>>> Okay, it's decided: os.urandom() must be changed for 3.5.2 to never
>>> block on a getrandom() call.
>>
>> One way to not block is to raise an exception.  Since this is such a
>> rare occurrence anyway I don't see this being a problem, plus it
>> keeps everybody mostly happy:  normal users won't see it hang,
>> crypto-folk won't see vulnerable-from-this-cause-by-default
>> machines, and those running Python early in the boot sequence will
>> have something they can figure out, plus an existing knob to work
>> around it [hashseed, I think?].
>
> Nope, I want the old behavior back.  os.urandom() should read
> /dev/random if getrandom() would block.  As the British say, "it
> should do what it says on the tin".

Aeh, what the tin says is "return random bytes". What everyone uses it
for (including the standard library) is to provide randomness for
cryptographic purposes. What it does (in the problematic case) is return
something that's not random.

To me this sounds about as sensible as having open('/dev/zero') return
non-zero values in some rare situations. And yes, for most people "the
kernel running out of zeros" makes exactly as much sense as "the kernel
runs out of random data". 

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From Nikolaus at rath.org  Thu Jun  9 22:52:31 2016
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Thu, 09 Jun 2016 19:52:31 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 (Guido van Rossum's message of "Thu, 9 Jun 2016 18:18:33 -0700")
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
Message-ID: <87lh2dycuo.fsf@vostro.rath.org>

On Jun 09 2016, Guido van Rossum <guido at python.org> wrote:
> I don't think we should add a new function. I think we should convince
> ourselves that there is not enough of a risk of an exploit even if
> os.urandom() falls back.

That will be hard, because you have to consider an active, clever
adversary.

On the other hand, convincing yourself that in practice os.urandom would
never block unless the setup is super exotic or there is active
maliciousness seems much easier.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From breamoreboy at yahoo.co.uk  Thu Jun  9 22:52:23 2016
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Fri, 10 Jun 2016 03:52:23 +0100
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
Message-ID: <njda1n$ksq$1@ger.gmane.org>

On 10/06/2016 00:43, Brett Cannon wrote:
>
> That's not what I'm saying at all (nor what I think Nick is saying);
> more tooling to ease the transition is always welcomed. The point we are
> trying to make is 2to3 is not considered best practice anymore, and so
> targeting its specific output might not be the best use of your time.
> I'm totally happy to have your fork work out and help give warnings for
> situations where runtime semantics are the only way to know there will
> be a problem that static analyzing tools can't handle and have the
> porting HOWTO updated so that people can run their test suite with your
> interpreter to help with that final bit of porting. I personally just
> don't want to see you waste time on warnings that are handled by the
> tools already or ignore the fact that six, modernize, and futurize can
> help more than 2to3 typically can with the easy stuff when trying to
> keep 2/3 compatibility. IOW some of us have become allergic to the word
> "2to3" in regards to porting. :) But if you want to target 2to3 output
> then by all means please do and your work will still be appreciated.
>

Given the above and that 2to3 appears to be unsupported* is there a case 
for deprecating it?

*  There are 46 outstanding issues on the bug tracker.  Whether the above is
the reason for this, I don't know.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

From larry at hastings.org  Thu Jun  9 22:54:51 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 19:54:51 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <87oa79ydhu.fsf@vostro.rath.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org> <87oa79ydhu.fsf@vostro.rath.org>
Message-ID: <575A2BFB.8040907@hastings.org>

On 06/09/2016 07:38 PM, Nikolaus Rath wrote:
> On Jun 09 2016, Larry Hastings <larry at hastings.org> wrote:
>> Nope, I want the old behavior back.  os.urandom() should read
>> /dev/random if getrandom() would block.  As the British say, "it
>> should do what it says on the tin".
> Aeh, what the tin says is "return random bytes".

What the tin says is "urandom", which has local man pages that dictate 
exactly how it behaves.  On Linux the "urandom" man page says:

    A read from the /dev/urandom device will not block waiting for more
    entropy.  If there is not sufficient entropy, a pseudorandom number
    generator is used to create the requested bytes.

os.urandom() needs to behave like that on Linux, which is how it behaved 
in Python 2.4 through 3.4.
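
A minimal sketch of the never-blocking behaviour discussed in this
thread (try getrandom() with GRND_NONBLOCK, and read /dev/urandom if
that fails), written in Python rather than C.  It assumes a Linux
system and a Python new enough to expose os.getrandom() and
os.GRND_NONBLOCK; the function name urandom_never_blocking is
hypothetical, and this is not the actual CPython implementation:

```python
import os


def urandom_never_blocking(n):
    """Return n random bytes without ever waiting for the entropy pool."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        chunks = []
        remaining = n
        try:
            # getrandom() may return fewer bytes than requested, so loop.
            while remaining > 0:
                chunk = getrandom(remaining, os.GRND_NONBLOCK)
                chunks.append(chunk)
                remaining -= len(chunk)
            return b"".join(chunks)
        except OSError:
            # BlockingIOError (EAGAIN) means the pool is not initialised
            # yet.  Assumption for this sketch: any other OSError, such as
            # a kernel without the syscall, is handled the same way.
            pass
    # Fallback: a read from /dev/urandom does not block waiting for entropy.
    with open("/dev/urandom", "rb", buffering=0) as dev:
        data = b""
        while len(data) < n:
            data += dev.read(n - len(data))
        return data


key = urandom_never_blocking(32)
```

The trade-off debated here is visible in the except branch: when the
pool is uninitialised the caller silently gets bytes from the fallback
path instead of an error or a wait.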

//arry/

From njs at pobox.com  Thu Jun  9 22:58:14 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 9 Jun 2016 19:58:14 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160609215343.00b0190e.barry@wooz.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
Message-ID: <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>

On Thu, Jun 9, 2016 at 6:53 PM, Barry Warsaw <barry at python.org> wrote:
> On Jun 09, 2016, at 03:22 PM, Larry Hastings wrote:
>
>>On 06/09/2016 08:52 AM, Guido van Rossum wrote:
>>> That leaves direct calls to os.urandom(). I don't think this should block either.
>>
>>Then it's you and me against the rest of the world ;-)
>
> FWIW, I agree with you and Guido.  I'm also not opposed to adding a more
> direct exposure of getrandom(), but in Python 3.6 only.  Like it or not,
> that's the right approach for our backward compatibility policies.

I suspect the crypto folks would be okay with pushing this back to
3.6, so long as the final resolution is that os.urandom remains the
standard interface for, as the docstring says, "Return[ing] a string
of n random bytes suitable for cryptographic use" using the
OS-recommended method, and they don't have to go change all their
code. After all, 3.4 and 2.7 will still have this subtle brokenness
for some time.

But I'm a little uncertain what you think would need to happen to
satisfy the backwards compatibility policies. If we can change it in
3.6 without having a warning in 3.5, then presumably we can also
change it in 3.5 without a warning in 3.4, which is what already
happened accidentally :-).

Would it be acceptable for 3.5.2 to start raising a warning "urandom
returning non-random bytes -- in 3.6 this will be an error", and then
make it an error in 3.6?

(And it would probably be good even in the long run to issue a
prominent warning if hash seeding fails.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From larry at hastings.org  Thu Jun  9 23:01:14 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 20:01:14 -0700
Subject: [Python-Dev] BDFL ruling request: should we block
 forever waiting for high-quality random bits?
In-Reply-To: <E1bB9sy-0000CB-C5@se2-syd.hostedmail.net.au>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org> <E1bB9sy-0000CB-C5@se2-syd.hostedmail.net.au>
Message-ID: <575A2D7A.6080808@hastings.org>

On 06/09/2016 05:00 PM, Steve Dower wrote:
> If the pattern is really going to be the hasattr check you posted 
> earlier, can we just do it for people and save them writing code that 
> won't work on different OSs?

No.  That's what got us into this mess in the first place.

3.5.0 and 3.5.1 *already* changed to the new behavior, and it resulted 
in the situation where CPython blocked forever at startup in these 
certain edge cases.  os.urandom() has been around for more than a 
decade, we can't unilaterally change its semantics now. os.urandom() in 
3.5 has to go back to how it behaved on Linux in 3.4.  And if I were 
release manager for 3.6, I'd say "it has to stay that way in 3.6 too".

However, Guido's already said "don't add os.getrandom() in 3.5", so the 
debate is somewhat irrelevant.

//arry/

From larry at hastings.org  Thu Jun  9 23:11:08 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 9 Jun 2016 20:11:08 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
Message-ID: <575A2FCC.5070101@hastings.org>

On 06/09/2016 07:58 PM, Nathaniel Smith wrote:
> I suspect the crypto folks would be okay with pushing this back to
> 3.6, so long as the final resolution is that os.urandom remains the
> standard interface for, as the docstring says, "Return[ing] a string
> of n random bytes suitable for cryptographic use" using the
> OS-recommended method, and they don't have to go change all their
> code.

The Linux core devs didn't like the behavior of /dev/urandom.  But they 
couldn't change its behavior without breaking userspace code. Linux 
takes backwards-compatibility very seriously, so they left /dev/urandom 
exactly the way it was and added new functionality (the getrandom() 
system call) that had the semantics they felt were best.

I don't understand why so many people seem to think it's okay to break 
old code in new versions of Python, when Python's history has shown a 
similarly strong commitment to backwards-compatibility. os.urandom() was 
added in Python 2.4, in 2004, and remained unchanged for about thirteen 
years.  That's thirteen years of people calling it and assuming its 
semantics were identical to the local "urandom" man page, which was correct.

I don't think we should change os.urandom() to block on Linux even in 
3.6.  Happily, that's no longer my fight, as I'm not 3.6 RM.

> Would it be acceptable for 3.5.2 to start raising a warning "urandom
> returning non-random bytes -- in 3.6 this will be an error", and then
> make it an error in 3.6?

No.  In 3.5.2 and the remaining 3.5 releases, os.urandom() must behave 
identically to how it behaved in 3.4 and the previous releases.

//arry/

From donald at stufft.io  Thu Jun  9 23:48:33 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Jun 2016 23:48:33 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575A2FCC.5070101@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
Message-ID: <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>

> On Jun 9, 2016, at 11:11 PM, Larry Hastings <larry at hastings.org> wrote:
> 
> I don't understand why so many people seem to think it's okay to break old code in new versions of Python, when Python's history has shown a similarly strong commitment to backwards-compatibility.

Python *regularly* breaks compatibility in X.Y+1 releases, and does it on purpose. An example from Python 3.5 would be PEP 479. I think breaking compatibility is a good thing from time to time, as long as it's not done so with wanton disregard and as long as the cost is carefully weighed against the benefits.

One of the more frustrating aspects of trying to discuss security-sensitive topics on python-dev is a feeling (at least from my end) that whenever someone wants to make something more secure [1], folks come in and try to anchor the discussion by treating backwards compatibility as some sort of sacred duty that can never be broken, and the discussion ends up feeling (from the security side that I'm typically on) like having to justify the idea of ever breaking backwards compatibility, instead of weighing the cost/benefit of a particular change. On the flip side, when a different kind of change that breaks compatibility, say to make some behavior less confusing, gets brought up, it feels like the discussion instead focuses on whether or not breaking compatibility is worth it in that particular instance.

I'm perfectly happy to accept that Python has decided to make a trade-off differently than I would prefer, but the rhetoric that is employed makes trying to improve Python's security an extremely frustrating experience for myself and others [2]. Feeling like you have to litigate that it's *ever* OK to break compatibility before you can even get to the point of discussing whether it makes sense in any particular instance, while watching other kinds of proposals not have to do that, is a pretty disheartening experience.

[1] Making code more secure pretty much by definition means taking some code that previously executed and making it so it no longer executes, ideally only in degenerate and dangerous conditions, but in general, that's always the case.

[2] I don't want to name names, as they didn't give me permission to do so, but these discussions have caused more than one person who tends to fall on the security side of things to consider avoiding contributing to Python at all, because of this kind of rhetoric.

--
Donald Stufft


From Nikolaus at rath.org  Thu Jun  9 23:50:45 2016
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Thu, 09 Jun 2016 20:50:45 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575A2BFB.8040907@hastings.org> (Larry Hastings's message of
 "Thu, 9 Jun 2016 19:54:51 -0700")
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org> <87oa79ydhu.fsf@vostro.rath.org>
 <575A2BFB.8040907@hastings.org>
Message-ID: <878tydg0ru.fsf@vostro.rath.org>

On Jun 09 2016, Larry Hastings <larry at hastings.org> wrote:
> On 06/09/2016 07:38 PM, Nikolaus Rath wrote:
>> On Jun 09 2016, Larry Hastings <larry at hastings.org> wrote:
>>> Nope, I want the old behavior back.  os.urandom() should read
>>> /dev/urandom if getrandom() would block.  As the British say, "it
>>> should do what it says on the tin".
>> Aeh, what the tin says is "return random bytes".
>
> What the tin says is "urandom", which has local man pages that dictate
> exactly how it behaves. 
[...]

I disagree. The authoritative source for the behavior of the Python
'urandom' function is the Python documentation, not the Linux manpage
for the "urandom" device.

And https://docs.python.org/3.4/library/os.html says first and foremost:

 os.urandom(n)

    Return a string of n random bytes suitable for cryptographic use.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From tim.peters at gmail.com  Thu Jun  9 23:54:15 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 9 Jun 2016 22:54:15 -0500
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575A2BFB.8040907@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org> <87oa79ydhu.fsf@vostro.rath.org>
 <575A2BFB.8040907@hastings.org>
Message-ID: <CAExdVNk5sdtEqrCKU4a5h0Ehh1hbhQqDEFx9_ENTAy88YN2ohw@mail.gmail.com>

[Nikolaus Rath]
>> Aeh, what the tin says is "return random bytes".

[Larry Hastings]
> What the tin says is "urandom", which has local man pages that dictate
> exactly how it behaves.  On Linux the "urandom" man page says:
>
>     A read from the /dev/urandom device will not block waiting for more entropy.
>     If there is not sufficient entropy, a pseudorandom number generator is used
>     to create the requested bytes.
>
> os.urandom() needs to behave like that on Linux, which is how it behaved in
> Python 2.4 through 3.4.

I agree (with Larry).  If the change hadn't already been made, nobody
would get anywhere trying to make it now.  So best to pretend it was
never made to begin with ;-)

The tin that _will_ say "return random bytes" in Python will be
`secrets.token_bytes()`.  That's self-evidently (to me) where the
"possibly block forever" implementation belongs.
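
As a minimal sketch of the API referred to above (assuming Python 3.6 or
later, where the secrets module is available), generating a token looks
like this. The 16-byte length is an arbitrary choice for illustration, and
whether the call can block at early boot depends on how os.urandom() ends
up being implemented, which is the question this thread is deciding.

```python
import secrets

# Request 16 bytes from the platform's CSPRNG; the length is an
# arbitrary choice for this illustration.
token = secrets.token_bytes(16)

# token_hex() returns the same kind of data as a hexadecimal string,
# two hex digits per byte.
hex_token = secrets.token_hex(16)
```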

From guido at python.org  Fri Jun 10 00:28:18 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 9 Jun 2016 21:28:18 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAExdVNk5sdtEqrCKU4a5h0Ehh1hbhQqDEFx9_ENTAy88YN2ohw@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <5759F13B.2000909@stoneleaf.us>
 <5759F21A.5080905@hastings.org> <87oa79ydhu.fsf@vostro.rath.org>
 <575A2BFB.8040907@hastings.org>
 <CAExdVNk5sdtEqrCKU4a5h0Ehh1hbhQqDEFx9_ENTAy88YN2ohw@mail.gmail.com>
Message-ID: <CAP7+vJJR789Wu7FN0UXJgwu68gOt0jd2aJYcWUCg_g6fZwM-jw@mail.gmail.com>

So secrets.py needs an upgrade; it currently uses random.SystemRandom.

On Thursday, June 9, 2016, Tim Peters <tim.peters at gmail.com> wrote:

> [Nikolaus Rath]
> >> Aeh, what the tin says is "return random bytes".
>
> [Larry Hastings]
> > What the tin says is "urandom", which has local man pages that dictate
> > exactly how it behaves.  On Linux the "urandom" man page says:
> >
> >     A read from the /dev/urandom device will not block waiting for more entropy.
> >     If there is not sufficient entropy, a pseudorandom number generator is used
> >     to create the requested bytes.
> >
> > os.urandom() needs to behave like that on Linux, which is how it behaved in
> > Python 2.4 through 3.4.
>
> I agree (with Larry).  If the change hadn't already been made, nobody
> would get anywhere trying to make it now.  So best to pretend it was
> never made to begin with ;-)
>
> The tin that _will_ say "return random bytes" in Python will be
> `secrets.token_bytes()`.  That's self-evidently (to me) where the
> "possibly block forever" implementation belongs.

-- 
--Guido (mobile)

From njs at pobox.com  Fri Jun 10 00:32:53 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 9 Jun 2016 21:32:53 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575A2FCC.5070101@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
Message-ID: <CAPJVwBn+TJbBhmV-K=oYc7RaO0Jvmh9dmUYbFSEzmDffnfkWXA@mail.gmail.com>

On Thu, Jun 9, 2016 at 8:11 PM, Larry Hastings <larry at hastings.org> wrote:

>
> On 06/09/2016 07:58 PM, Nathaniel Smith wrote:
>
> I suspect the crypto folks would be okay with pushing this back to
> 3.6, so long as the final resolution is that os.urandom remains the
> standard interface for, as the docstring says, "Return[ing] a string
> of n random bytes suitable for cryptographic use" using the
> OS-recommended method, and they don't have to go change all their
> code.
>
>
> The Linux core devs didn't like the behavior of /dev/urandom.  But they
> couldn't change its behavior without breaking userspace code.  Linux takes
> backwards-compatibility very seriously, so they left /dev/urandom exactly
> the way it was and added new functionality (the getrandom() system call)
> that had the semantics they felt were best.
>
> I don't understand why so many people seem to think it's okay to break old
> code in new versions of Python, when Python's history has shown a similarly
> strong commitment to backwards-compatibility.  os.urandom() was added in
> Python 2.4, in 2004, and remained unchanged for about thirteen years.
> That's thirteen years of people calling it and assuming its semantics were
> identical to the local "urandom" man page, which was correct.
>

I can only speak for myself, but the reason it doesn't bother me is
that the documentation for os.urandom has always been very clear that it is
an abstraction over multiple OS-specific sources of cryptographic
randomness -- even in the 2.4 docs [1] we read that its output "depends on
the OS implementation", and that it might be /dev/urandom, it might be
CryptGenRandom, and it might even raise an exception if "a randomness
source is not found". So as a user I've always expected that it will make a
best-effort attempt to use whatever the best source of cryptographic
randomness is in a given environment, or else make a best-effort attempt to
raise an error if it's determined that it can't give me cryptographic
randomness, and it's been doing that unchanged for thirteen years too.

But now Linux has moved forward and provided an improved OS-specific source
of cryptographic randomness, and in particular one that actually signals to
userspace when it doesn't have randomness available. So we have a choice:
either we have to break the guarantee that os.urandom is identical to
/dev/urandom, or we have to break the guarantee that os.urandom uses the
best OS-specific source of cryptographic randomness. Either way we're
breaking some guarantee we used to make. And AFAICT so far 100% of the
people who actually maintain libraries that call os.urandom are asking
python-dev to break the identical-to-/dev/urandom guarantee and preserve
the uses-the-best-OS-specific-cryptographic-randomness guarantee.
Disrupting working code is a bad thing, but in the long run, no-one is
actually asking for an os.urandom that silently falls back on the xkcd #221
PRNG [2].

All that said, on the eve of the 3.5.2 release is a terrible time to be
trying to decide this, and it makes perfect sense to me that maybe 3.5
should kick this can down the road. Your efforts as RM are appreciated and
I'm glad I'm not in your spot :-).

-n

[1] https://docs.python.org/2.4/lib/os-miscfunc.html
[2] https://xkcd.com/221/

-- 
Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>

From ethan at stoneleaf.us  Fri Jun 10 00:57:45 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 09 Jun 2016 21:57:45 -0700
Subject: [Python-Dev] PEP: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7B1=4etOsiMoDJ8siyDcCurcpnMT4TuFQRdfjOdW+y_rg@mail.gmail.com>
References: <CALFfu7BWF=83GBa0eCr8Scd8NAckbOQPCuqSBqrL5_Su+nbVYg@mail.gmail.com>
 <57570C07.9000703@stoneleaf.us>
 <CALFfu7B1=4etOsiMoDJ8siyDcCurcpnMT4TuFQRdfjOdW+y_rg@mail.gmail.com>
Message-ID: <575A48C9.5080100@stoneleaf.us>

On 06/07/2016 11:13 AM, Eric Snow wrote:
> On Tue, Jun 7, 2016 at 11:01 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> On 06/07/2016 10:51 AM, Eric Snow wrote:
>>> Specification
>>> =============
>>
>>
>>>     * types for which ``__prepare__()`` returned something other than
>>>       ``OrderedDict`` (or a subclass) have their ``__definition_order__``
>>>       set to ``None``
>>
>>
>> I assume this check happens in type.__new__?  If a non-OrderedDict is used
>> as the namespace, but a __definition_order__ key and value are supplied, is
>> it used or still set to None?
>
> A __definition_order__ in the class body always takes precedence.  So
> a supplied value will be honored (and not replaced with None).

Nice.  I'll add it to the Enum, enum34, and aenum as soon as it lands 
(give or take a couple months ;)
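
For readers following along, the behaviour Eric describes can be
approximated with a metaclass. This is only a sketch under stated
assumptions: OrderedMeta and Point are hypothetical names, the proposal
itself would put the logic in type.__new__, and skipping dunder names is a
choice made here rather than something stated in this exchange.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass recording the order of class-body names."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # An ordered namespace remembers the order of assignments.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # A value supplied in the class body takes precedence; it is
        # already set on cls via the namespace, so leave it alone.
        if "__definition_order__" not in namespace:
            cls.__definition_order__ = tuple(
                key for key in namespace
                if not (key.startswith("__") and key.endswith("__"))
            )
        return cls


class Point(metaclass=OrderedMeta):
    x = 0
    y = 0

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
```

With this sketch, Point.__definition_order__ lists x, y and norm in the
order they were written, and a class body that assigns its own
__definition_order__ keeps that value.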

--
~Ethan~

From vadmium+py at gmail.com  Fri Jun 10 01:11:02 2016
From: vadmium+py at gmail.com (Martin Panter)
Date: Fri, 10 Jun 2016 05:11:02 +0000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5759EC2B.8040208@hastings.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
Message-ID: <CA+eR4cHrnggwxSit3nSu+spdPRYn8H3bc3tx6R0+5Jcuw8bRzA@mail.gmail.com>

> On 06/09/2016 08:52 AM, Guido van Rossum wrote:
> That leaves direct calls to os.urandom(). I don't think this should block
> either.

On 9 June 2016 at 22:22, Larry Hastings <larry at hastings.org> wrote:
> Then it's you and me against the rest of the world ;-)
>
>
> Okay, it's decided: os.urandom() must be changed for 3.5.2 to never block on
> a getrandom() call.  It's permissible to take advantage of
> getrandom(GRND_NONBLOCK), but if it returns EAGAIN we must read from
> /dev/urandom.

So assuming this is the "final" decision, where to from here? I think
the latest change by Colm and committed by Victor already implements
this decision:

https://hg.python.org/cpython/rev/9de508dc4837

getrandom() is still called, but if it would block, we fall back to
trying the less-secure Linux /dev/urandom, or fail if /dev/urandom is
missing. The Python hash seed is still set using this code. And
os.urandom() calls this code. random.seed() and SystemRandom still use
os.urandom(), as documented.

So I suggest we close the original mega bug thread
<https://bugs.python.org/issue26839> as fixed. Unless people think
they can change Larry or Guido's mind, we should focus further
discussion on any changes for 3.6.
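
The fallback described above can be sketched at the Python level. This is
an illustration, not the code in the linked changeset (which is C inside
the interpreter): nonblocking_urandom is a hypothetical name, and the
sketch assumes os.getrandom() and os.GRND_NONBLOCK, which are only exposed
on Linux builds of Python 3.6 and later.

```python
import os


def nonblocking_urandom(n):
    """Return n random bytes without ever waiting on the entropy pool."""
    getrandom = getattr(os, "getrandom", None)  # assumption: Linux, 3.6+
    if getrandom is not None:
        chunks = []
        remaining = n
        try:
            while remaining > 0:
                # With GRND_NONBLOCK the call fails with EAGAIN
                # (BlockingIOError) instead of waiting for the pool
                # to be initialized.
                chunk = getrandom(remaining, os.GRND_NONBLOCK)
                chunks.append(chunk)
                remaining -= len(chunk)
            return b"".join(chunks)
        except OSError:
            # EAGAIN (would block) or ENOSYS (kernel without the
            # syscall): fall through to the device file.
            pass
    # Fallback: read the device directly; open() raises if it is missing.
    with open("/dev/urandom", "rb", buffering=0) as dev:
        data = b""
        while len(data) < n:
            data += dev.read(n - len(data))
        return data
```

The design point is that the non-blocking flag turns "not yet
initialized" into an error the caller can react to, and the reaction
chosen for 3.5.2 is to read /dev/urandom anyway.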

From christian at python.org  Fri Jun 10 02:06:39 2016
From: christian at python.org (Christian Heimes)
Date: Fri, 10 Jun 2016 08:06:39 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
Message-ID: <njdldf$h9r$1@ger.gmane.org>

On 2016-06-10 05:48, Donald Stufft wrote:
> 
>> On Jun 9, 2016, at 11:11 PM, Larry Hastings <larry at hastings.org
>> <mailto:larry at hastings.org>> wrote:
>>
>> I don't understand why so many people seem to think it's okay to break
>> old code in new versions of Python, when Python's history has shown a
>> similarly strong commitment to backwards-compatibility.
> 
> Python *regularly* breaks compatibility in X.Y+1 releases, and does it
> on purpose. An example from Python 3.5 would be PEP 479. I think
> breaking compatibility is a good thing from time to time, as long as
> it's not done so with wanton disregard and as long as the cost is
> carefully weighed against the benefits.
> 
> One of the more frustrating aspects of trying to discuss security
> sensitive topics on python-dev is a feeling (at least from my end) that
> whenever someone wants to make something more secure [1] folks come in
> and try to anchor the discussion by treating backwards compatibility as
> some sort of sacred duty that can never be broken and the discussion
> ends up feeling (from the security side that I'm typically on) like having
> to justify the idea of ever breaking backwards compatibility, instead of
> weighing the cost/benefit of a particular change. On the flip side, when
> a different kind of change that breaks compatibility, say to make some
> behavior less confusing, gets brought up it feels like the discussion
> instead focuses on whether or not breaking compatibility is worth it in
> that particular instance.
> 
> I'm perfectly happy to accept that Python has decided to make a trade
> off differently than what I would prefer it, but the rhetoric that is
> employed makes trying to improve Python's security an extremely
> frustrating experience for myself and others [2]. Feeling like you have
> to litigate that it's *ever* OK to break compatibility before you can
> even get to the point of discussing if it makes sense in any particular
> instance, while watching other kinds of proposals not have to do that is a
> pretty disheartening experience.
> 
> 
> [1] Making code more secure pretty much by definition means taking some
> code that previously executed and making it so it no longer executes,
> ideally only in degenerate and dangerous conditions, but in general,
> that's always the case.
> 
> [2] I don't want to name names, as they didn't give me permission to do
> so, but these discussions have caused more than one person who tends to
> fall on the security side of things to consider avoiding contributing to
> Python at all, because of this kind of rhetoric.

Donald,

feel free to name me. I'm mentally exhausted and frustrated by the
discussions over the last days and weeks. As of now I'm considering
stepping down from PSRT and taking a long break from Python core development.

My frustration is mostly rooted in Dunning-Kruger effects. If you still
think that a CSPRNG can run out of entropy or that it is a good idea to
implement a crypto hash function in pure Python, then please go back to
the children's table and let the grown-ups talk. You are still struggling
with basic addition and multiplication, while we discuss Laplace
transformation for linear ODEs and consult experts, who do quantum
Fourier transformation to solve a hidden subgroup problem by converting
it from finite Abelian groups to Shor's quantum algorithm [1]. Quoting
Larry: "You must be this tall to ride the security train." </rant>

I'm well aware that I'm not a trained and studied cryptographer. Cory
and Donald repeatedly stated the same. However we are aware of our
shortcomings, know our limits and constantly follow the advice of
trusted experts. At least we combine enough experience to recognize bad
ideas.

Please, please don't add unnecessary noise to security discussions.
os.urandom() is not about the concrete foundation of a bike shed. It's
the f...reaking core catcher [2] of a nuclear power plant. You want to
have a secure core catcher when the nuclear reactor goes BOOOM and
spills hot molten, extremely radioactive Corium.

Christian

[1] Yes, that is a real thing. It will break all current asymmetric
ciphers like RSA and EC.
[2] https://en.wikipedia.org/wiki/Core_catcher

From stephen at xemacs.org  Fri Jun 10 02:23:43 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 10 Jun 2016 15:23:43 +0900
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160609230807.GA8118@python.ca>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
Message-ID: <22362.23791.365192.927455@turnbull.sk.tsukuba.ac.jp>

Neil Schemenauer writes:

 > I have to wonder if you guys actually ported a lot of Python 2
 > code.

Python 3 (including stdlib) itself is quite a bit of code.

 > According to you guys, there is no problem

No, according to us, there are problems, but in the code, not in the
language or its implementation.  This is a Brooksian "no silver
bullet" problem: it's very hard to write reliable code that handles
multiple text representations (as pretty much everything does
nowadays), except by converting to internal text on input and back to
encoded text on output.  The warnings you quote (and presumably the
code that generates them) make assumptions (cf Barry's post) that are
frequently invalid.  I don't know about cross-type comparisons, but as
Barry and Brett both pointed out, mixtures of bytes and text are
*rarely* easy to fix, because it's often extremely difficult to know
which is the appropriate representation for a given variable unless
you do a complete refactoring as described above.  When I've tried to
fix such warnings one at a time, it's always been whack-a-mole.

The experience in GNU Emacs and Mailman 2 has been that it took about
ten years to get to the point where they went a whole year without an
encoding bug once non-Latin-1 encodings were being handled.  XEmacs
OTOH took only about 3 years from the proof-of-concept introduction of
multibyte characters to essentially no bugs (except in C code, of
course!) because we had the same policy as Python 3: bytes and text
don't mix, and in development we also would abort on mixing integers
and characters (in GNU Emacs, the character type was the same as the
integer type until very recently).  We affectionately referred to
those bugs as "Ebola" (not very polite, but it gets the point across
about how seriously we took the idea of making the internal text
representation completely opaque).  In Mailman 2, we still can't say
confidently that there are no Unicode bugs left even today.  We still
need an outer "except UnicodeError: quarantine_and_call_for_help(msg)"
handler, although AFAIK it hasn't been reported for a couple years.

It's not that you can't continue to run the potentially buggy code in
Python 2.  Mailman 2 does; you can, too.  What we don't support (and I
personally hope we never support) is running that code in Python 3
(warnings or no).  If you want to support that yourself, more power to
you, but I advise you that my experience suggests that it's not going
to be a panacea, and I do believe it's going to be more trouble than
biting the bullet and just thoroughly porting your code.  Even if that
takes as much time as it took Amber to port Twisted.

 > and we already have good enough tooling. ;-(

Nobody said that, just that the existing tooling is pretty good for
the problems that tools can help with, while no tool is likely to be
much help with some of the code your tool allows to run.  You're
welcome to try to prove that claim wrong -- if you do, it would indeed
be very valuable!  But I personally, based on my own experience, think
that the chance of success is too low to justify the cost.  (Granted,
I don't have to port Twisted, so in that sense I'm biased. :-/ )

BTW tools continue to be added, as well as language changes (PEP 461!)
There is no resistance to that.

What you're running into here is that several of us have substantial
experience with various of the issues raised, and that experience
convinces us that there's no silver bullet, just hard work, if you
face them.

Steve

From p.f.moore at gmail.com  Fri Jun 10 04:35:45 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 10 Jun 2016 09:35:45 +0100
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160609221356.3b18e447.barry@wooz.org>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
 <20160610003557.GA9353@python.ca> <20160609221356.3b18e447.barry@wooz.org>
Message-ID: <CACac1F-Fs2V_ATe4Q+GnxvySTRfmNpofzsrfHE_Bjf4EKbfh2Q@mail.gmail.com>

On 10 June 2016 at 03:13, Barry Warsaw <barry at python.org> wrote:
> In my own experience, and IIRC Amber had a similar experience, the ease of
> porting to Python 3 really comes down to how bytes/unicode clean your code
> base is.  Almost all the other pieces are either pretty manageable or fairly
> easily automated.  But if your code isn't bytes-clean you're in for a world
> of hurt because you first have to decide how to represent those things.
> Twisted's job is especially fun because it's all about wire protocols, which I
> think Amber described as (paraphrasing) bytes that happen to have contents
> that look like strings.

Although I have much less experience with porting than many others in
this thread, that's my experience as well. Get a clear and
well-understood separation of bytes and strings, and the rest of the
porting exercise is (relatively!) straightforward. But just once
think "I'm not quite sure, but I think I just need to decode here
to be safe" and you'll be fighting Unicode errors for ever.

My hope is that static typing tools like MyPy could help here. I
typically review Python 2 code by mentally categorising which
functions (theoretically) take bytes, which take strings, and which
are confused. And sort things out from there. Type annotations seem
like they'd help that process. But I've yet to use typing in practice,
so it may not be that simple.
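
A small sketch of what that categorisation might look like once written
down as annotations. The function names and the ASCII wire format are
hypothetical, chosen only for illustration; the idea is that each function
states on which side of the bytes/text boundary it lives, so a checker
such as mypy has something concrete to verify.

```python
def parse_status_line(raw: bytes) -> str:
    """Decode at the boundary: wire bytes come in, text goes out."""
    return raw.decode("ascii").rstrip("\r\n")


def format_status_line(text: str) -> bytes:
    """Encode at the boundary: text comes in, wire bytes go out."""
    return text.encode("ascii") + b"\r\n"


# Internal code works on str only; bytes appear at the two edges.
line = parse_status_line(b"HTTP/1.1 200 OK\r\n")
wire = format_status_line(line)
```

A "confused" function in this scheme is one whose annotation cannot be
written without a Union of bytes and str, which is a useful signal in
itself.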

Paul

From victor.stinner at gmail.com  Fri Jun 10 07:13:10 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 10 Jun 2016 13:13:10 +0200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
Message-ID: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>

Hi,

Over the last few weeks, I researched how to get stable and reliable
benchmarks, especially for the corner case of microbenchmarks. The
first result is a series of articles; here are the first three:

https://haypo.github.io/journey-to-stable-benchmark-system.html
https://haypo.github.io/journey-to-stable-benchmark-deadcode.html
https://haypo.github.io/journey-to-stable-benchmark-average.html

The second result is a new perf module which includes all "tricks"
discovered in my research: compute average and standard deviation,
spawn multiple worker child processes, automatically calibrate the
number of outer-loop iterations, automatically pin worker processes
to isolated CPUs, and more.
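
To make the "average and standard deviation" idea concrete, here is a
rough sketch using only the standard library's timeit and statistics
modules. It does not use the perf module's API, and it leaves out worker
processes, calibration and CPU pinning; the bench() helper and its default
values are assumptions made for illustration.

```python
import statistics
import timeit


def bench(stmt, setup="pass", runs=5, loops=1000):
    """Return (mean, stdev) of the per-loop time over several runs."""
    timer = timeit.Timer(stmt, setup=setup)
    timer.timeit(loops)  # warmup run, result discarded
    # repeat() returns one total time per run, each covering `loops` loops.
    totals = timer.repeat(repeat=runs, number=loops)
    samples = [total / loops for total in totals]
    return statistics.mean(samples), statistics.stdev(samples)


mean, stdev = bench("sum(range(100))")
print("Average: %.3g s +- %.3g s" % (mean, stdev))
```

Reporting the spread alongside the mean is what lets a reader judge
whether two results actually differ or are within noise of each other.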

The perf module allows storing benchmark results as JSON to analyze
them in depth later. It helps to configure a benchmark correctly and
to check manually whether it is reliable.

The perf documentation also explains how to get stable and reliable
benchmarks (ex: how to tune Linux to isolate CPUs).

perf has 3 builtin CLI commands:

* python -m perf: show and compare JSON results
* python -m perf.timeit: new better and more reliable implementation of timeit
* python -m perf.metadata: display collected metadata

Python 3 is recommended to get time.perf_counter(), the new accurate
statistics module, automatic CPU pinning (I will implement it on
Python 2 later), etc. But Python 2.7 is also supported; fallbacks
are implemented when needed.

Example with the patched telco benchmark (benchmark for the decimal
module) on a Linux system with two isolated CPUs.

First run the benchmark:
---
$ python3 telco.py --json-file=telco.json
.........................
Average: 26.7 ms +- 0.2 ms
---

Then show the JSON content to see all details:
---
$ python3 -m perf -v show telco.json
Metadata:
- aslr: enabled
- cpu_affinity: 2, 3
- cpu_count: 4
- cpu_model_name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
- hostname: smithers
- loops: 10
- platform: Linux-4.4.9-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
- python_executable: /usr/bin/python3
- python_implementation: cpython
- python_version: 3.4.3

Run 1/25: warmup (1): 26.9 ms; samples (3): 26.8 ms, 26.8 ms, 26.7 ms
Run 2/25: warmup (1): 26.8 ms; samples (3): 26.7 ms, 26.7 ms, 26.7 ms
Run 3/25: warmup (1): 26.9 ms; samples (3): 26.8 ms, 26.9 ms, 26.8 ms
(...)
Run 25/25: warmup (1): 26.8 ms; samples (3): 26.7 ms, 26.7 ms, 26.7 ms

Average: 26.7 ms +- 0.2 ms (25 runs x 3 samples; 1 warmup)
---

Note: benchmarks can be analyzed with Python 2.

I'm posting my email to python-dev because providing timeit results is
commonly requested in review of optimization patches.

The next step is to patch the CPython benchmark suite to use the perf
module. I already forked the repository and started to patch some
benchmarks.

If you are interested in Python performance in general, please join us
on the speed mailing list!
https://mail.python.org/mailman/listinfo/speed

Victor

From steve at pearwood.info  Fri Jun 10 09:20:51 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Jun 2016 23:20:51 +1000
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
Message-ID: <20160610132051.GH27919@ando.pearwood.info>

On Fri, Jun 10, 2016 at 01:13:10PM +0200, Victor Stinner wrote:
> Hi,
> 
> Last weeks, I made researchs on how to get stable and reliable
> benchmarks, especially for the corner case of microbenchmarks. The
> first result is a serie of article, here are the first three:

Thank you for this! I am very interested in benchmarking.

> https://haypo.github.io/journey-to-stable-benchmark-system.html
> https://haypo.github.io/journey-to-stable-benchmark-deadcode.html
> https://haypo.github.io/journey-to-stable-benchmark-average.html

I strongly question your statement in the third:

    [quote]
    But how can we compare performances if results are random? 
    Take the minimum?

    No! You must never (ever again) use the minimum for 
    benchmarking! Compute the average and some statistics like
    the standard deviation:
    [end quote]

While I'm happy to see a real-world use for the statistics module, I 
disagree with your logic.

The problem is that random noise can only ever slow the code down, it 
cannot speed it up. To put it another way, the random errors in the 
timings are always positive.

Suppose you micro-benchmark some code snippet and get a series of 
timings. We can model the measured times as:

    measured time t = T + ε

where T is the unknown "true" timing we wish to estimate, and ε is some
variable error due to noise in the system. But ε is always positive,
never negative, and we always measure something larger than T.

Let's suppose we somehow (magically) know what the epsilons are:

measurements = [T + 0.01, T + 0.02, T + 0.04, T + 0.01]

The average is (4*T + 0.08)/4 = T + 0.02

But the minimum is T + 0.01, which is a better estimate than the
average. Taking the average means that *worse* epsilons will affect your
estimate, while the minimum means that only the smallest epsilon affects
your estimate.

Taking the average is appropriate if the error terms can be positive
or negative, e.g. if they are *measurement error* rather than noise:

measurements = [T + 0.01, T - 0.02, T + 0.04, T - 0.01]

The average is (4*T + 0.02)/4 = T + 0.005

The minimum is T - 0.02, which is worse than the average.

Unless you have good reason to think that the timing variation is mostly 
caused by some error which can be both positive and negative, the 
minimum is the right statistic to use, not the average. But ask 
yourself: what sort of error, noise or external influence will cause the 
code snippet to run FASTER than the fastest the CPU can execute it?
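
A small simulation illustrates the two cases. This is only a sketch:
the exponential and Gaussian noise models, the seed, the sample count
and the value of T are assumptions chosen for illustration, not
measurements of any real benchmark.

```python
import random
import statistics

rng = random.Random(12345)   # fixed seed so the run is repeatable
T = 1.0                      # the "true" timing, known only in a simulation

# Case 1: noise can only slow the code down (error is never negative).
one_sided = [T + rng.expovariate(50.0) for _ in range(1000)]

# Case 2: measurement error that can be positive or negative.
two_sided = [T + rng.gauss(0.0, 0.02) for _ in range(1000)]

for name, data in (("one-sided noise", one_sided),
                   ("two-sided error", two_sided)):
    print("%s: min error = %+.4f, mean error = %+.4f"
          % (name, min(data) - T, statistics.mean(data) - T))
```

With one-sided noise the minimum lands closer to T than the mean does;
with two-sided error the minimum undershoots T and the mean is the
better estimate.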

-- 
Steve

From dmalcolm at redhat.com  Fri Jun 10 10:34:26 2016
From: dmalcolm at redhat.com (David Malcolm)
Date: Fri, 10 Jun 2016 10:34:26 -0400
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <20160610132051.GH27919@ando.pearwood.info>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
Message-ID: <1465569266.4029.43.camel@redhat.com>

On Fri, 2016-06-10 at 23:20 +1000, Steven D'Aprano wrote:
> On Fri, Jun 10, 2016 at 01:13:10PM +0200, Victor Stinner wrote:
> > Hi,
> > 
> > Last weeks, I made researchs on how to get stable and reliable
> > benchmarks, especially for the corner case of microbenchmarks. The
> > first result is a serie of article, here are the first three:
> 
> Thank you for this! I am very interested in benchmarking.
> 
> > https://haypo.github.io/journey-to-stable-benchmark-system.html
> > https://haypo.github.io/journey-to-stable-benchmark-deadcode.html
> > https://haypo.github.io/journey-to-stable-benchmark-average.html
> 
> I strongly question your statement in the third:
> 
>     [quote]
>     But how can we compare performances if results are random? 
>     Take the minimum?
> 
>     No! You must never (ever again) use the minimum for 
>     benchmarking! Compute the average and some statistics like
>     the standard deviation:
>     [end quote]
> 
> 
> While I'm happy to see a real-world use for the statistics module, I 
> disagree with your logic.
> 
> The problem is that random noise can only ever slow the code down, it
> cannot speed it up. 

Consider a workload being benchmarked running on one core, which has a
particular pattern of cache hits and misses.  Now consider another
process running on a sibling core, sharing the same cache.

Isn't it possible that under some circumstances the 2nd process could
prefetch memory into the cache in such a way that the workload under
test actually gets faster than if the 2nd process wasn't running?

[...snip...]

Hope this is constructive
Dave

From sebastian at realpath.org  Fri Jun 10 02:39:02 2016
From: sebastian at realpath.org (Sebastian Krause)
Date: Fri, 10 Jun 2016 08:39:02 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 (Nathaniel Smith's message of "Thu, 9 Jun 2016 18:03:35 -0700")
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
Message-ID: <m260th5z09.fsf@news.realpath.org>

Nathaniel Smith <njs at pobox.com> wrote:
> (This is based on the assumption that the only time that explicitly
> calling os.urandom is the best option is when one cares about the
> cryptographic strength of the result -- I'm explicitly distinguishing
> here between the hash seeding issue that triggered the original bug
> report and explicit calls to os.urandom.)

I disagree with that assumption. I've often found myself using
os.urandom for non-secure random data and seen it as the best option
simply because it directly returns the type I wanted: bytes.

The last time I looked, the random module didn't have a function to
directly give me bytes, so I would have to wrap it in something like:

bytearray(random.getrandbits(8) for _ in range(size))

Or maybe the function exists, but then it doesn't seem very
discoverable. Ideally I would only want to use the random module for
non-secure and (in 3.6) the secrets module (which could block) for
secure random data and never bother with os.urandom (and knowing how
it behaves). But then those modules should probably get new
functions to directly return bytes.
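
As a sketch of what such a function could look like on Python 3, the
helper below builds the bytes from a single getrandbits() call. The name
random_bytes and its rng parameter are hypothetical, not an existing
API, and the result is not suitable for anything security-related.

```python
import random


def random_bytes(size, rng=random):
    """Return `size` non-secure pseudo-random bytes (hypothetical helper)."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size == 0:
        # Avoid asking getrandbits() for zero bits.
        return b""
    # getrandbits(8 * size) is always below 2 ** (8 * size), so the
    # integer fits in exactly `size` bytes.
    return rng.getrandbits(8 * size).to_bytes(size, "little")


print(random_bytes(16))
```

Passing a seeded random.Random instance as rng gives reproducible
output, which is often what non-secure uses such as tests want.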

Sebastian

From cody.piersall at gmail.com  Fri Jun 10 10:09:27 2016
From: cody.piersall at gmail.com (Cody Piersall)
Date: Fri, 10 Jun 2016 09:09:27 -0500
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <20160609230807.GA8118@python.ca>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
Message-ID: <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>

> One problem is that the str literals should be bytes
> literals.  Comparison with None needs to be avoided.
>
> With Python 2 code runs successfully.  With Python 3 the code
> crashes with a traceback.  With my modified Python 3.6, the code
> runs successfully but generates the following warnings:
>
>     test.py:13: DeprecationWarning: encoding bytes to str
>       output.write('%d:' % len(s))
>     test.py:14: DeprecationWarning: encoding bytes to str
>       output.write(s)
>     test.py:15: DeprecationWarning: encoding bytes to str
>       output.write(',')
>     test.py:5: DeprecationWarning: encoding bytes to str
>       if c == ':':
>     test.py:9: DeprecationWarning: encoding bytes to str
>       size += c
>     test.py:24: DeprecationWarning: encoding bytes to str
>       data = data + s
>     test.py:26: DeprecationWarning: encoding bytes to str
>       if input.read(1) != ',':
>     test.py:31: DeprecationWarning: default compare is depreciated
>       if a > 0:
>

This seems _very_ useful; I'm surprised that other people don't think
so too.  Currently, the easiest way to find bytes/str errors in a big
application is by running the program, finding where it crashes,
fixing that one line (or hopefully wherever the data entered the
system if you can find it), and repeating the process.

This is nice because you can get in "fix my encoding errors" mode for
more than just one traceback at a time; the new method would be to run
the program, look at the millions of bytes/str errors, and fix
everything that showed up in this round at once.  That seems like a
big win for productivity to me.

Cody

From victor.stinner at gmail.com  Fri Jun 10 11:07:18 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 10 Jun 2016 17:07:18 +0200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <20160610132051.GH27919@ando.pearwood.info>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
Message-ID: <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>

I started to work on visualisation. IMHO it helps to understand the problem.

Let's create a large dataset: 500 samples (100 processes x 5 samples):
---
$ python3 telco.py --json-file=telco.json -p 100 -n 5
---

The attached plot.py script creates a histogram:
---
avg: 26.7 ms +- 0.2 ms; min = 26.2 ms

26.1 ms:   1 #
26.2 ms:  12 #####
26.3 ms:  34 ############
26.4 ms:  44 ################
26.5 ms: 109 ######################################
26.6 ms: 117 ########################################
26.7 ms:  86 ##############################
26.8 ms:  50 ##################
26.9 ms:  32 ###########
27.0 ms:  10 ####
27.1 ms:   3 ##
27.2 ms:   1 #
27.3 ms:   1 #

minimum 26.1 ms: 0.2% (1) of 500 samples
---

Replace "if 1" with "if 0" to produce a graphical view, or just view
the attached distribution.png, the numpy+scipy histogram.

The distribution looks like a Gaussian curve:
https://en.wikipedia.org/wiki/Gaussian_function

The interesting thing is that only 1 sample out of 500 is in the minimum
bucket (26.1 ms). If you say that the performance is 26.1 ms, only
0.2% of your users will be able to reproduce this timing.

The average and std dev are 26.7 ms +- 0.2 ms, so the range is 26.5 ms ..
26.9 ms: we got 109+117+86+50+32 samples in this range, which gives us
394/500 = 79%.

IMHO saying "26.7 ms +- 0.2 ms" (79% of samples) is less of a lie than
26.1 ms (0.2%).
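
The percentages can be rechecked from the histogram with a few lines of
Python. This is only a sketch: the bucket counts are copied by hand from
the text histogram above, and the variable names are made up for the
example.

```python
# Keys are tenths of a millisecond (261 means the 26.1 ms bucket) so
# that the range test uses exact integers instead of floats.
buckets = {
    261: 1, 262: 12, 263: 34, 264: 44, 265: 109, 266: 117, 267: 86,
    268: 50, 269: 32, 270: 10, 271: 3, 272: 1, 273: 1,
}

total = sum(buckets.values())

# Reported average +- std dev: 26.7 ms +- 0.2 ms, i.e. 26.5 .. 26.9 ms.
low, high = 265, 269
in_range = sum(count for key, count in buckets.items()
               if low <= key <= high)

min_bucket = min(buckets)

print("samples: %d" % total)
print("within avg +- std dev: %d (%.1f%%)"
      % (in_range, 100.0 * in_range / total))
print("minimum bucket %.1f ms: %d (%.1f%%)"
      % (min_bucket / 10.0, buckets[min_bucket],
         100.0 * buckets[min_bucket] / total))
```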

Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: distribution.png
Type: image/png
Size: 31967 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160610/11c3f393/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: telco.json
Type: application/json
Size: 58847 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160610/11c3f393/attachment-0001.json>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot.py
Type: text/x-python
Size: 1109 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160610/11c3f393/attachment-0001.py>

From p.f.moore at gmail.com  Fri Jun 10 11:09:44 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 10 Jun 2016 16:09:44 +0100
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <1465569266.4029.43.camel@redhat.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <1465569266.4029.43.camel@redhat.com>
Message-ID: <CACac1F9akVc5KjPw1ub=0_-H1cES3Xqae2drcgqf+s3Cct52dQ@mail.gmail.com>

On 10 June 2016 at 15:34, David Malcolm <dmalcolm at redhat.com> wrote:
>> The problem is that random noise can only ever slow the code down, it
>> cannot speed it up.
[...]
> Isn't it possible that under some circumstances the 2nd process could
> prefetch memory into the cache in such a way that the workload under
> test actually gets faster than if the 2nd process wasn't running?

My feeling is that it would be much rarer for random effects to speed
up the benchmark under test - possible in the sort of circumstance you
describe, but not common.

The conclusion I draw is "be careful how you interpret summary
statistics if you don't know the distribution of the underlying data
as an estimator of the value you are interested in".

In the case of Victor's article, he's specifically trying to
compensate for variations introduced by Python's hash randomisation
algorithm. And for that, you would get both positive and negative
effects on code speed, so the average makes sense. But only if you've
already eliminated the other common noise (such as other processes,
etc). In Victor's articles, he sounds like he's done this, but he's
using very Linux-specific mechanisms, and I don't know if he's done
the same for other platforms. Also, the way people commonly use
micro-benchmarks ("hey, look, this way of writing the expression goes
faster than that way") doesn't really address questions like "is the
difference statistically significant".
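
As a rough illustration of that last question, Welch's t statistic can
be computed with the statistics module alone. This is a sketch, not a
full significance test: the timings below are made up, the function name
welch_t is my own, and turning t into a p-value needs the t
distribution, which the standard library does not provide.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples of timings."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)


# Made-up timings in milliseconds, for illustration only.
old = [26.8, 26.7, 26.9, 26.6, 26.8, 26.7]
new = [26.6, 26.7, 26.5, 26.7, 26.6, 26.8]

t = welch_t(old, new)
print("mean old = %.2f ms, mean new = %.2f ms, t = %.2f"
      % (statistics.mean(old), statistics.mean(new), t))
```

As a rule of thumb, a |t| well above 2 suggests the difference is not
just noise; with samples this small and this noisy, a difference in the
means on its own says very little.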

Summary: Micro-benchmarking is hard. Victor looks like he's done some
really interesting work on it, but any "easy to use" timeit tool will
typically get used in an over-simplistic way in practice, and so you
probably shouldn't read too much into timing figures quoted in
isolation, no matter what tool was used to generate them.

Paul

From p.f.moore at gmail.com  Fri Jun 10 11:14:45 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 10 Jun 2016 16:14:45 +0100
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
Message-ID: <CACac1F_8CDuHX57VvkNUujR94EcV6O86PQXh6KmNytv6a0h-7A@mail.gmail.com>

On 10 June 2016 at 15:09, Cody Piersall <cody.piersall at gmail.com> wrote:
>> One problem is that the str literals should be bytes
>> literals.  Comparison with None needs to be avoided.
>>
>> With Python 2 code runs successfully.  With Python 3 the code
>> crashes with a traceback.  With my modified Python 3.6, the code
>> runs successfully but generates the following warnings:
>>
>>     test.py:13: DeprecationWarning: encoding bytes to str
>>       output.write('%d:' % len(s))
>>     test.py:14: DeprecationWarning: encoding bytes to str
>>       output.write(s)
>>     test.py:15: DeprecationWarning: encoding bytes to str
>>       output.write(',')
>>     test.py:5: DeprecationWarning: encoding bytes to str
>>       if c == ':':
>>     test.py:9: DeprecationWarning: encoding bytes to str
>>       size += c
>>     test.py:24: DeprecationWarning: encoding bytes to str
>>       data = data + s
>>     test.py:26: DeprecationWarning: encoding bytes to str
>>       if input.read(1) != ',':
>>     test.py:31: DeprecationWarning: default compare is depreciated
>>       if a > 0:
>>
>
> This seems _very_ useful; I'm surprised that other people don't think
> so too.  Currently, the easiest way to find bytes/str errors in a big
> application is by running the program, finding where it crashes,
> fixing that one line (or hopefully wherever the data entered the
> system if you can find it), and repeating the process.

It *is* very nice. But...

> This is nice because you can get in "fix my encoding errors" mode for
> more than just one traceback at a time; the new method would be to run
> the program, look at the millions of bytes/str errors, and fix
> everything that showed up in this round at once.  That seems like a
> big win for productivity to me.

If you're fixing encoding errors at the point they occur, rather than
looking at the high-level design of the program's handling of textual
and bytestring data, you're likely to end up in a bit of a mess no
matter how you locate the issues. Most likely because at the point in
the code where the warning occurs, you no longer know what the correct
encoding to use should be.
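
To illustrate what I mean by handling this at the design level, here is
a sketch that keeps the wire format as bytes throughout and only
converts to and from text at the edges. It assumes that the quoted
test.py is something like a netstring reader/writer, which is my guess
from the warnings; the function names are made up for the example.

```python
import io


def write_netstring(output, payload):
    """Write bytes `payload` to a binary stream as b"<length>:<payload>,"."""
    output.write(str(len(payload)).encode("ascii") + b":" + payload + b",")


def read_netstring(stream):
    """Read one netstring from a binary stream and return its payload bytes."""
    size = b""
    while True:
        c = stream.read(1)
        if c == b":":
            break
        if not c.isdigit():
            # Also covers end of stream, where read() returns b"".
            raise ValueError("malformed netstring length")
        size += c
    payload = stream.read(int(size))
    if stream.read(1) != b",":
        raise ValueError("missing netstring terminator")
    return payload


# Text is encoded once on the way in and decoded once on the way out;
# the encoding is chosen here, where it is still known.
buffer = io.BytesIO()
write_netstring(buffer, "h\u00e9llo".encode("utf-8"))
buffer.seek(0)
text = read_netstring(buffer).decode("utf-8")
print(text)
```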

But absolutely, anything that gives extra information about where the
encoding hotspots are in your code is of value.
Paul

From status at bugs.python.org  Fri Jun 10 12:08:43 2016
From: status at bugs.python.org (Python tracker)
Date: Fri, 10 Jun 2016 18:08:43 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20160610160843.612C656AAD@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2016-06-03 - 2016-06-10)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    5553 (+16)
  closed 33491 (+75)
  total  39044 (+91)

Open issues with patches: 2424 

Issues opened (69)
==================

#16484: pydoc generates invalid docs.python.org link for xml.etree.Ele
http://bugs.python.org/issue16484  reopened by martin.panter

#26243: zlib.compress level as keyword argument
http://bugs.python.org/issue26243  reopened by serhiy.storchaka

#26839: Python 3.5 running on Linux kernel 3.17+ can block at startup 
http://bugs.python.org/issue26839  reopened by haypo

#27186: add os.fspath()
http://bugs.python.org/issue27186  reopened by brett.cannon

#27197: mock.patch interactions with "from" imports
http://bugs.python.org/issue27197  opened by clarkbreyman

#27198: Adding an assertClose() method to unittest.TestCase
http://bugs.python.org/issue27198  opened by ChrisBarker

#27199: TarFile expose copyfileobj bufsize to improve throughput
http://bugs.python.org/issue27199  opened by fried

#27200: make doctest in CPython has failures
http://bugs.python.org/issue27200  opened by Jelle Zijlstra

#27201: expose the ABI name as a config variable
http://bugs.python.org/issue27201  opened by doko

#27204: Failing doctests in Doc/howto/
http://bugs.python.org/issue27204  opened by Jelle Zijlstra

#27205: Failing doctests in Library/collections.rst
http://bugs.python.org/issue27205  opened by Jelle Zijlstra

#27206: Failing doctests in Doc/tutorial/
http://bugs.python.org/issue27206  opened by Jelle Zijlstra

#27207: Failing doctests in Doc/whatsnew/3.2.rst
http://bugs.python.org/issue27207  opened by Jelle Zijlstra

#27208: Failing doctests in Library/traceback.rst
http://bugs.python.org/issue27208  opened by Jelle Zijlstra

#27209: Failing doctests in Library/email.*.rst
http://bugs.python.org/issue27209  opened by Jelle Zijlstra

#27210: Failing doctests due to environmental dependencies in Lib/*lib
http://bugs.python.org/issue27210  opened by Jelle Zijlstra

#27212: Doc for itertools, 'islice()' implementation have unwanted beh
http://bugs.python.org/issue27212  opened by alex0307

#27213: Rework CALL_FUNCTION* opcodes
http://bugs.python.org/issue27213  opened by serhiy.storchaka

#27214: a potential future bug and an optimization that mostly undermi
http://bugs.python.org/issue27214  opened by Oren Milman

#27218: improve tracing performance with f_trace set to Py_None
http://bugs.python.org/issue27218  opened by xdegaye

#27219: turtle.fillcolor doesn't accept a tuple of floats
http://bugs.python.org/issue27219  opened by Jelle Zijlstra

#27220: Add a pure Python version of 'collections.defaultdict'
http://bugs.python.org/issue27220  opened by ebarry

#27221: multiprocessing documentation is outdated regarding method pic
http://bugs.python.org/issue27221  opened by memeplex

#27222: redundant checks and a weird use of goto statements in long_rs
http://bugs.python.org/issue27222  opened by Oren Milman

#27223: _ready_ready and _write_ready should respect _conn_lost
http://bugs.python.org/issue27223  opened by lukasz.langa

#27226: distutils: unable to compile both .opt-1.pyc and .opt2.pyc sim
http://bugs.python.org/issue27226  opened by mgorny

#27227: argparse fails to parse [] when using choices and nargs='*'
http://bugs.python.org/issue27227  opened by evan_

#27231: Support the fspath protocol in the posixpath module
http://bugs.python.org/issue27231  opened by Jelle Zijlstra

#27232: os.fspath() should not use repr() on error
http://bugs.python.org/issue27232  opened by Jelle Zijlstra

#27233: Missing documentation for PyOS_FSPath
http://bugs.python.org/issue27233  opened by Jelle Zijlstra

#27235: Heap overflow occurred due to the int overflow (Python-2.7.11/
http://bugs.python.org/issue27235  opened by madness

#27238: Bare except: usages in turtle.py
http://bugs.python.org/issue27238  opened by Jelle Zijlstra

#27240: 'UnstructuredTokenList' object has no attribute '_fold_as_ew'
http://bugs.python.org/issue27240  opened by touilleMan

#27241: Catch exceptions raised in pstats add (repl)
http://bugs.python.org/issue27241  opened by llllllllll

#27242: Make the docs for NotImplemented & NotImplementedError unambig
http://bugs.python.org/issue27242  opened by ebarry

#27243: __aiter__ should return async iterator instead of awaitable
http://bugs.python.org/issue27243  opened by yselivanov

#27244: print(';;') fails in pdb with SyntaxError
http://bugs.python.org/issue27244  opened by cjw296

#27245: IDLE: Fix deletion of custom themes and key bindings
http://bugs.python.org/issue27245  opened by terry.reedy

#27248: Possible refleaks in PyType_Ready in error condition
http://bugs.python.org/issue27248  opened by xiang.zhang

#27250: Add os.urandom_block()
http://bugs.python.org/issue27250  opened by haypo

#27252: Make dict views copyable
http://bugs.python.org/issue27252  opened by serhiy.storchaka

#27253: More efficient deepcopying of Mapping
http://bugs.python.org/issue27253  opened by serhiy.storchaka

#27254: heap overflow in Tkinter module
http://bugs.python.org/issue27254  opened by Emin Ghuliev

#27255: More opcode predictions
http://bugs.python.org/issue27255  opened by serhiy.storchaka

#27256: header indentation destroyed
http://bugs.python.org/issue27256  opened by frispete

#27257: get_addresses results in traceback with an addrspec with an em
http://bugs.python.org/issue27257  opened by frispete

#27258: Exception in BytesGenerator.flatten
http://bugs.python.org/issue27258  opened by frispete

#27259: Possible missing deprecation warnings?
http://bugs.python.org/issue27259  opened by mark

#27260: Missing equality check for super objects
http://bugs.python.org/issue27260  opened by Jelle Zijlstra

#27261: io.BytesIO.truncate does not work as advertised
http://bugs.python.org/issue27261  opened by justus.winter

#27262: IDLE: move Aqua context menu code to maxosx
http://bugs.python.org/issue27262  opened by terry.reedy

#27263: IDLE sets the HOME environment variable breaking scripts
http://bugs.python.org/issue27263  opened by Jarrod Petz

#27266: Always use getrandom() in os.random() on Linux and add block=F
http://bugs.python.org/issue27266  opened by haypo

#27268: Incorrect error message on float('')
http://bugs.python.org/issue27268  opened by Drekin

#27269: ipaddress: Wrong behavior with ::ffff:1.2.3.4 style IPs
http://bugs.python.org/issue27269  opened by ThiefMaster

#27270: 'parentheses-equality' warnings when building with clang and c
http://bugs.python.org/issue27270  opened by xdegaye

#27272: random.Random should not read 2500 bytes from urandom
http://bugs.python.org/issue27272  opened by haypo

#27273: subprocess.run(cmd, input='text') should pass universal_newlin
http://bugs.python.org/issue27273  opened by akira

#27274: [ctypes] Allow from_pointer creation
http://bugs.python.org/issue27274  opened by memeplex

#27275: KeyError thrown by optimised collections.OrderedDict.popitem()
http://bugs.python.org/issue27275  opened by kaniini

#27277: Fatal Python error: Segmentation fault in test_exceptions
http://bugs.python.org/issue27277  opened by Rohit Mediratta

#27278: py_getrandom() uses an int for syscall() result
http://bugs.python.org/issue27278  opened by haypo

#27279: Add random.cryptorandom() and random.pseudorandom, deprecate o
http://bugs.python.org/issue27279  opened by lemburg

#27281: unpickling an xmlrpc.client.Fault raises TypeError
http://bugs.python.org/issue27281  opened by Uri Okrent

#27282: Raise BlockingIOError in os.urandom if kernel is not ready
http://bugs.python.org/issue27282  opened by ncoghlan

#27283: Add a "What's New" entry for PEP 519
http://bugs.python.org/issue27283  opened by brett.cannon

#27285: Deprecate pyvenv in favor of python3 -m venv
http://bugs.python.org/issue27285  opened by stevepiercy

#27286: str object got multiple values for keyword argument
http://bugs.python.org/issue27286  opened by martin.panter

#27287: SIGSEGV when calling os.forkpty()
http://bugs.python.org/issue27287  opened by Alexander Haensch

Most recent 15 issues with no replies (15)
==========================================

#27287: SIGSEGV when calling os.forkpty()
http://bugs.python.org/issue27287

#27283: Add a "What's New" entry for PEP 519
http://bugs.python.org/issue27283

#27273: subprocess.run(cmd, input='text') should pass universal_newlin
http://bugs.python.org/issue27273

#27269: ipaddress: Wrong behavior with ::ffff:1.2.3.4 style IPs
http://bugs.python.org/issue27269

#27259: Possible missing deprecation warnings?
http://bugs.python.org/issue27259

#27258: Exception in BytesGenerator.flatten
http://bugs.python.org/issue27258

#27248: Possible refleaks in PyType_Ready in error condition
http://bugs.python.org/issue27248

#27241: Catch exceptions raised in pstats add (repl)
http://bugs.python.org/issue27241

#27240: 'UnstructuredTokenList' object has no attribute '_fold_as_ew'
http://bugs.python.org/issue27240

#27227: argparse fails to parse [] when using choices and nargs='*'
http://bugs.python.org/issue27227

#27223: _ready_ready and _write_ready should respect _conn_lost
http://bugs.python.org/issue27223

#27222: redundant checks and a weird use of goto statements in long_rs
http://bugs.python.org/issue27222

#27220: Add a pure Python version of 'collections.defaultdict'
http://bugs.python.org/issue27220

#27218: improve tracing performance with f_trace set to Py_None
http://bugs.python.org/issue27218

#27214: a potential future bug and an optimization that mostly undermi
http://bugs.python.org/issue27214

Most recent 15 issues waiting for review (15)
=============================================

#27286: str object got multiple values for keyword argument
http://bugs.python.org/issue27286

#27281: unpickling an xmlrpc.client.Fault raises TypeError
http://bugs.python.org/issue27281

#27273: subprocess.run(cmd, input='text') should pass universal_newlin
http://bugs.python.org/issue27273

#27270: 'parentheses-equality' warnings when building with clang and c
http://bugs.python.org/issue27270

#27266: Always use getrandom() in os.random() on Linux and add block=F
http://bugs.python.org/issue27266

#27262: IDLE: move Aqua context menu code to maxosx
http://bugs.python.org/issue27262

#27255: More opcode predictions
http://bugs.python.org/issue27255

#27253: More efficient deepcopying of Mapping
http://bugs.python.org/issue27253

#27252: Make dict views copyable
http://bugs.python.org/issue27252

#27248: Possible refleaks in PyType_Ready in error condition
http://bugs.python.org/issue27248

#27245: IDLE: Fix deletion of custom themes and key bindings
http://bugs.python.org/issue27245

#27243: __aiter__ should return async iterator instead of awaitable
http://bugs.python.org/issue27243

#27242: Make the docs for NotImplemented & NotImplementedError unambig
http://bugs.python.org/issue27242

#27241: Catch exceptions raised in pstats add (repl)
http://bugs.python.org/issue27241

#27238: Bare except: usages in turtle.py
http://bugs.python.org/issue27238

Top 10 most discussed issues (10)
=================================

#26839: Python 3.5 running on Linux kernel 3.17+ can block at startup 
http://bugs.python.org/issue26839 144 msgs

#27266: Always use getrandom() in os.random() on Linux and add block=F
http://bugs.python.org/issue27266  61 msgs

#27186: add os.fspath()
http://bugs.python.org/issue27186  18 msgs

#27243: __aiter__ should return async iterator instead of awaitable
http://bugs.python.org/issue27243  18 msgs

#27272: random.Random should not read 2500 bytes from urandom
http://bugs.python.org/issue27272  18 msgs

#5124: IDLE - pasting text doesn't delete selection
http://bugs.python.org/issue5124  16 msgs

#27198: Adding an assertClose() method to unittest.TestCase
http://bugs.python.org/issue27198  13 msgs

#27250: Add os.urandom_block()
http://bugs.python.org/issue27250  13 msgs

#23401: Add pickle support of Mapping views
http://bugs.python.org/issue23401  12 msgs

#25548: Show the address in the repr for class objects
http://bugs.python.org/issue25548  11 msgs

Issues closed (71)
==================

#8491: Need readline command and keybinding information
http://bugs.python.org/issue8491  closed by martin.panter

#12962: TitledHelpFormatter and IndentedHelpFormatter are not document
http://bugs.python.org/issue12962  closed by berker.peksag

#13771: HTTPSConnection __init__ super implementation causes recursion
http://bugs.python.org/issue13771  closed by berker.peksag

#15476: Index "code object" and link to code object definition
http://bugs.python.org/issue15476  closed by martin.panter

#17888: docs: more information on documentation team
http://bugs.python.org/issue17888  closed by berker.peksag

#18027: distutils should access stat_result timestamps via .st_*time a
http://bugs.python.org/issue18027  closed by berker.peksag

#18117: Missing symlink:Current after Mac OS X 3.3.2 package installat
http://bugs.python.org/issue18117  closed by ned.deily

#19234: socket.fileno() documentation
http://bugs.python.org/issue19234  closed by Jelle Zijlstra

#19611: inspect.getcallargs doesn't properly interpret set comprehensi
http://bugs.python.org/issue19611  closed by ncoghlan

#20041: TypeError when f_trace is None and tracing.
http://bugs.python.org/issue20041  closed by serhiy.storchaka

#20567: test_idle causes test_ttk_guionly 'can't invoke "event" comman
http://bugs.python.org/issue20567  closed by terry.reedy

#21272: use _sysconfigdata to itinialize distutils.sysconfig
http://bugs.python.org/issue21272  closed by doko

#21277: don't try to link _ctypes with a ffi_convenience library
http://bugs.python.org/issue21277  closed by doko

#21313: Py_GetVersion() is broken when using mqueue and a long patch n
http://bugs.python.org/issue21313  closed by martin.panter

#21916: Create unit tests for turtle textonly
http://bugs.python.org/issue21916  closed by serhiy.storchaka

#22797: urllib.request.urlopen documentation falsely guarantees that a
http://bugs.python.org/issue22797  closed by r.david.murray

#23264: Add pickle support of dict views
http://bugs.python.org/issue23264  closed by serhiy.storchaka

#24617: os.makedirs()'s [mode] not correct
http://bugs.python.org/issue24617  closed by martin.panter

#24810: UX mode for IDLE targeted to 'new learners'
http://bugs.python.org/issue24810  closed by terry.reedy

#25738: http.server doesn't handle RESET CONTENT status correctly
http://bugs.python.org/issue25738  closed by martin.panter

#25941: Add 'How to Review a Patch' section to devguide
http://bugs.python.org/issue25941  closed by ned.deily

#26014: Guide users to the newer package install instructions
http://bugs.python.org/issue26014  closed by ned.deily

#26305: Make Argument Clinic to generate PEP 7 conforming code
http://bugs.python.org/issue26305  closed by serhiy.storchaka

#26372: Popen.communicate not ignoring BrokenPipeError
http://bugs.python.org/issue26372  closed by gregory.p.smith

#26437: asyncio create_server() not always accepts the 'port' paramete
http://bugs.python.org/issue26437  closed by berker.peksag

#26448: dis.findlabels ignores EXTENDED_ARG
http://bugs.python.org/issue26448  closed by serhiy.storchaka

#26809: Add __all__ list to the string module
http://bugs.python.org/issue26809  closed by python-dev

#26884: android: cross-compilation of extension module links to the wr
http://bugs.python.org/issue26884  closed by doko

#26983: float() can return not exact float instance
http://bugs.python.org/issue26983  closed by serhiy.storchaka

#27052: Python2.7.11+ as in Debian testing and Ubuntu 16.04 LTS crashe
http://bugs.python.org/issue27052  closed by doko

#27066: SystemError if custom opener returns -1
http://bugs.python.org/issue27066  closed by barry

#27072: random.getrandbits is limited to 2**31-1 bits on 64-bit Window
http://bugs.python.org/issue27072  closed by rhettinger

#27073: redundant checks in long_add and long_sub
http://bugs.python.org/issue27073  closed by serhiy.storchaka

#27105: cgi.__all__ is incomplete
http://bugs.python.org/issue27105  closed by martin.panter

#27107: mailbox.__all__ list is incomplete
http://bugs.python.org/issue27107  closed by martin.panter

#27108: mimetypes.__all__ list is incomplete
http://bugs.python.org/issue27108  closed by martin.panter

#27109: plistlib.__all__ list is incomplete
http://bugs.python.org/issue27109  closed by martin.panter

#27110: smtpd.__all__ list is incomplete
http://bugs.python.org/issue27110  closed by martin.panter

#27127: Never have GET_ITER not followed by FOR_ITER
http://bugs.python.org/issue27127  closed by Demur Rumed

#27136: sock_connect fails for bluetooth (and probably others)
http://bugs.python.org/issue27136  closed by yselivanov

#27156: IDLE: remove unused code
http://bugs.python.org/issue27156  closed by terry.reedy

#27164: zlib can't decompress DEFLATE using shared dictionary
http://bugs.python.org/issue27164  closed by martin.panter

#27167: subprocess reports signal as negative exit status, not documen
http://bugs.python.org/issue27167  closed by gregory.p.smith

#27187: Relax __all__ location requirement in PEP 8
http://bugs.python.org/issue27187  closed by python-dev

#27196: Eliminate 'ThemeChanged' warning when running IDLE tests
http://bugs.python.org/issue27196  closed by terry.reedy

#27202: make doctest fails on 2.7 release notes
http://bugs.python.org/issue27202  closed by orsenthil

#27203: Failing doctests in Doc/faq/programming.rst
http://bugs.python.org/issue27203  closed by orsenthil

#27211: Heap corruption via Python 2.7.11 IOBase readline()
http://bugs.python.org/issue27211  closed by python-dev

#27215: Docstrings of Sequence and MutableSequence seems not right
http://bugs.python.org/issue27215  closed by rhettinger

#27216: Fix capitalisation of "Python runtime" in os.path.islink descr
http://bugs.python.org/issue27216  closed by ned.deily

#27217: IDLE 3.5.1 not using Tk 8.6
http://bugs.python.org/issue27217  closed by ned.deily

#27224: IDLE: editor versus grep line number differ
http://bugs.python.org/issue27224  closed by terry.reedy

#27225: Potential refleak in type_new when setting __new__ fails
http://bugs.python.org/issue27225  closed by serhiy.storchaka

#27228: just for clearing: os.path.normpath("file://a") returns "file:
http://bugs.python.org/issue27228  closed by georg.brandl

#27229: In tree cross-build fails copying  Include/graminit.h to itsel
http://bugs.python.org/issue27229  closed by martin.panter

#27230: Calculation involving mpmath gives wrong result with Python 3.
http://bugs.python.org/issue27230  closed by ned.deily

#27234: tuple - single value with comma is assigned as type tuple
http://bugs.python.org/issue27234  closed by steven.daprano

#27236: Add CHAINED_COMPARE_OP opcode
http://bugs.python.org/issue27236  closed by serhiy.storchaka

#27237: Kafka Python Consumer Messages gets truncated
http://bugs.python.org/issue27237  closed by ned.deily

#27239: Make idlelib.macosx self-contained.
http://bugs.python.org/issue27239  closed by terry.reedy

#27246: Keyboard Shortcuts Crash Idle
http://bugs.python.org/issue27246  closed by ebarry

#27247: telnetlib AttributeError: 'error' object has no attribute 'err
http://bugs.python.org/issue27247  closed by berker.peksag

#27249: Add os.urandom_info
http://bugs.python.org/issue27249  closed by haypo

#27251: TypeError in logging.HTTPHandler.emit; possible python 2 to 3 
http://bugs.python.org/issue27251  closed by vinay.sajip

#27264: python 3.4 vs. 3.5 strftime same locale different output on Wi
http://bugs.python.org/issue27264  closed by eryksun

#27265: Hash of different, specific Decimals created from str is the s
http://bugs.python.org/issue27265  closed by mark.dickinson

#27267: memory leak in _ssl.c, function load_cert_chain
http://bugs.python.org/issue27267  closed by python-dev

#27271: asyncio lost udp packets
http://bugs.python.org/issue27271  closed by gvanrossum

#27276: FileFinder.find_spec() incompatible with finder specification
http://bugs.python.org/issue27276  closed by paulmar

#27280: Paste fail in ipaddress documentation
http://bugs.python.org/issue27280  closed by berker.peksag

#27284: Spam
http://bugs.python.org/issue27284  closed by eryksun

From victor.stinner at gmail.com  Fri Jun 10 12:09:02 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 10 Jun 2016 18:09:02 +0200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CACac1F9akVc5KjPw1ub=0_-H1cES3Xqae2drcgqf+s3Cct52dQ@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <1465569266.4029.43.camel@redhat.com>
 <CACac1F9akVc5KjPw1ub=0_-H1cES3Xqae2drcgqf+s3Cct52dQ@mail.gmail.com>
Message-ID: <CAMpsgwbDF6x0xK9S3wqkGHo05oCQAfyP2m0Xby8Gimm_4bsOQA@mail.gmail.com>

2016-06-10 17:09 GMT+02:00 Paul Moore <p.f.moore at gmail.com>:
> Also, the way people commonly use
> micro-benchmarks ("hey, look, this way of writing the expression goes
> faster than that way") doesn't really address questions like "is the
> difference statistically significant".

If you use the "python3 -m perf compare method1.json method2.json" command,
perf will check that the difference is significant using the
is_significant() method:
http://perf.readthedocs.io/en/latest/api.html#perf.is_significant
"This uses a Student's two-sample, two-tailed t-test with alpha=0.95."

FYI, this function originally comes from the Unladen Swallow
benchmark suite ;-)

We should design a CLI command to do timeit+compare at once.

Victor
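
For readers who want to see what such a check involves, here is a minimal
sketch of a pooled two-sample Student's t-test using only the standard
library. It is not perf's implementation; the names t_score and
looks_significant and the sample timings are made up for illustration, and
the critical value has to be supplied by the caller (2.228 is the usual
two-tailed 5% value for 10 degrees of freedom).

```python
import math
import statistics


def t_score(sample1, sample2):
    """Return (t, degrees_of_freedom) for a pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    df = n1 + n2 - 2
    # Pooled variance: assumes both samples share the same true variance.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    if std_err == 0:
        # No spread at all: the samples are either identical or trivially apart.
        diff = mean1 - mean2
        return (0.0 if diff == 0 else math.copysign(math.inf, diff)), df
    return (mean1 - mean2) / std_err, df


def looks_significant(sample1, sample2, critical):
    """True if |t| exceeds the caller-supplied two-tailed critical value."""
    t, _df = t_score(sample1, sample2)
    return abs(t) > critical


# Hypothetical timings in milliseconds, six runs per method.
fast = [26.1, 26.3, 26.2, 26.4, 26.2, 26.3]
slow = [26.7, 26.9, 26.8, 26.6, 26.8, 26.7]

t, df = t_score(fast, slow)
print("t = %.2f, df = %d" % (t, df))
print("significant:", looks_significant(fast, slow, critical=2.228))
```

Comparing a sample against itself gives t = 0, so the check reports no
significant difference in that case.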

From guido at python.org  Fri Jun 10 12:23:17 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 10 Jun 2016 09:23:17 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <njdldf$h9r$1@ger.gmane.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
Message-ID: <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>

I somehow feel compelled to clarify that (perhaps unlike Larry) my concern
is not the strict rules of backwards compatibility (if that was the case I
would have objected to changing this in 3.5.2).

I just don't like the potentially blocking behavior, and experts' opinions
seem to widely vary on how insecure the fallback bits really are, how
likely you are to find yourself in that situation, and how probable an
exploit would be.

-- 
--Guido van Rossum (python.org/~guido)

From brett at python.org  Fri Jun 10 12:30:26 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 10 Jun 2016 16:30:26 +0000
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <njda1n$ksq$1@ger.gmane.org>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
 <njda1n$ksq$1@ger.gmane.org>
Message-ID: <CAP1=2W7n4TZo=0oqi3s5LSf0U8Q4ieCtiaT+K9sejSaL9yKjsw@mail.gmail.com>

On Thu, 9 Jun 2016 at 19:53 Mark Lawrence via Python-Dev <
python-dev at python.org> wrote:

> On 10/06/2016 00:43, Brett Cannon wrote:
> >
> > That's not what I'm saying at all (nor what I think Nick is saying);
> > more tooling to ease the transition is always welcomed. The point we are
> > trying to make is 2to3 is not considered best practice anymore, and so
> > targeting its specific output might not be the best use of your time.
> > I'm totally happy to have your fork work out and help give warnings for
> > situations where runtime semantics are the only way to know there will
> > be a problem that static analyzing tools can't handle and have the
> > porting HOWTO updated so that people can run their test suite with your
> > interpreter to help with that final bit of porting. I personally just
> > don't want to see you waste time on warnings that are handled by the
> > tools already or ignore the fact that six, modernize, and futurize can
> > help more than 2to3 typically can with the easy stuff when trying to
> > keep 2/3 compatibility. IOW some of us have become allergic to the word
> > "2to3" in regards to porting. :) But if you want to target 2to3 output
> > then by all means please do and your work will still be appreciated.
> >
>
> Given the above and that 2to3 appears to be unsupported* is there a case
> for deprecating it?
>

I don't think so because it's still a useful transpiler tool. Basically the
community has decided the standard rewriters included with 2to3 aren't how
people prefer to port, but 2to3 as a tool is the basis of both modernize
and futurize (as are some of those rewriters, but tweaked to do something
different).

>
> *  There are 46 outstanding issues on the bug tracker.  Is the above the
> reason for this, I don't know?
>

Typically the bugs are for the rewrite rules and they are for edge cases
that no one wants to try and tackle as they are tough to cover (although
this is based on what comes through my inbox so my generalization could be
wrong).

-Brett

>
> --
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
>
> Mark Lawrence
>

From ericsnowcurrently at gmail.com  Fri Jun 10 12:42:32 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 10 Jun 2016 09:42:32 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <575772E6.7040906@stoneleaf.us>
 <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>
 <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>
Message-ID: <CALFfu7BPbof_y+S7C2z+XDwHjag5KwCcDih+GX-fwhZ6429ZUg@mail.gmail.com>

On Thu, Jun 9, 2016 at 2:39 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I'm guessing Ethan is suggesting defining it as:
>
>     __definition_order__ = tuple(ns["__definition_order__"])
>
> When the attribute is present in the method body.

Ah.  I'd rather stick to "consenting adults" in the case that
__definition_order__ is explicitly set.  We'll strongly recommend
setting it to None or a tuple of identifier strings.

>
> That restriction would be comparable to what we do with __slots__ today:
>
>     >>> class C:
>     ...     __slots__ = 1
>     ...
>     Traceback (most recent call last):
>      File "<stdin>", line 1, in <module>
>     TypeError: 'int' object is not iterable

Are you suggesting that we require it be a tuple of identifiers (or
None) and raise TypeError otherwise, similar to __slots__?  The
difference is that __slots__ has specific type requirements that do
not apply to __definition_order__, as well as a different purpose.
__definition_order__ is about preserving definition-type info that we
are currently throwing away.

-eric
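
To make the "consenting adults" position concrete, here is a rough sketch
that emulates the proposed attribute with a metaclass. It is only an
illustration of the idea under discussion, not the PEP's implementation:
OrderedMeta is a hypothetical name, and filtering out dunder names is an
assumption made here to keep the output short. An explicitly assigned
__definition_order__ is left exactly as the class body set it, with no type
check.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the order of class-body names."""

    @classmethod
    def __prepare__(mcls, name, bases):
        # Use an ordered namespace so definition order is not thrown away.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        if "__definition_order__" not in namespace:
            # Not set explicitly: derive it from the class body.
            cls.__definition_order__ = tuple(
                key for key in namespace
                if not (key.startswith("__") and key.endswith("__"))
            )
        # Set explicitly: trust the author ("consenting adults").
        return cls


class Example(metaclass=OrderedMeta):
    first = 1

    def second(self):
        pass

    third = 3


class Explicit(metaclass=OrderedMeta):
    __definition_order__ = None
    a = 1


print(Example.__definition_order__)
print(Explicit.__definition_order__)
```

A stricter variant in the spirit of the __slots__ comparison would wrap the
explicit value in tuple() and let a TypeError propagate for non-iterables.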

From ericsnowcurrently at gmail.com  Fri Jun 10 12:49:10 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 10 Jun 2016 09:49:10 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
Message-ID: <CALFfu7DfprtYgDVf3=GROqRoCXkRXddWPnPZLb-KhVunOQO26A@mail.gmail.com>

On Thu, Jun 9, 2016 at 12:41 PM,  <zreed at fastmail.com> wrote:
> Are there any further thoughts on including this in 3.6?

I don't have any plans and I don't know of anyone willing to champion
the PEP for 3.6.  Note that the implementation itself shouldn't take
very long.

>  Similar to the
> recent discussion on OrderedDict namespaces for metaclasses, this would
> simplify / enable a number of type factory use cases where proper
> metaclasses are overkill. This feature would also be quite nice in say
> pandas where the (currently unspecified) field order used in the
> definition of frames is preserved in user-visible displays.

Good point.  One weakness of the PEP has been a lack of sufficient
justification.  The greater the number of compelling use cases, the
better.  So thanks! :)

-eric
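
The type-factory use case could look roughly like the sketch below. It
assumes an interpreter in which **kwargs preserves the order of the keyword
arguments at the call site, which is what the PEP proposes; make_struct and
Point are hypothetical names used only for illustration.

```python
def make_struct(name, **defaults):
    """Build a simple class whose positional arguments follow keyword order."""
    # Relies on **defaults keeping the call-site order of the keywords.
    field_names = tuple(defaults)

    def __init__(self, *args):
        if len(args) > len(field_names):
            raise TypeError("%s takes at most %d arguments"
                            % (name, len(field_names)))
        values = dict(defaults)
        values.update(zip(field_names, args))
        for key, value in values.items():
            setattr(self, key, value)

    return type(name, (), {"_fields": field_names, "__init__": __init__})


Point = make_struct("Point", x=0, y=0, z=0)
p = Point(1, 2)
print(Point._fields, p.x, p.y, p.z)
```

Without ordered keyword arguments the mapping from positional arguments to
field names would be unpredictable, so the factory would need an explicit
list of names or a metaclass.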

From ericsnowcurrently at gmail.com  Fri Jun 10 12:54:32 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 10 Jun 2016 09:54:32 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
Message-ID: <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>

On Thu, Jun 9, 2016 at 1:10 PM, Émanuel Barry <vgr255 at live.ca> wrote:
> As stated by Guido (and pointed out in the PEP):
>
> Making **kwds ordered is still open, but requires careful design and
> implementation to avoid slowing down function calls that don't benefit.
>
> The PEP has not been updated in a while, though. Python 3.5 has been
> released, and with it a C implementation of OrderedDict.
>
> Eric, are you still interested in this?

Yes, but wasn't planning on dusting it off yet (i.e. in time for 3.6).
I'm certainly not opposed to someone picking up the banner.
<wink-wink>

> IIRC that PEP was one of the
> motivating use cases for implementing OrderedDict in C.

Correct, though I'm not sure OrderedDict needs to be involved any more.

> Maybe it's time for
> a second round of discussion on Python-ideas?

Fine with me, though I won't have a lot of time in the 3.6 timeframe
to handle a high-volume discussion or push through an implementation.

-eric

From tjreedy at udel.edu  Fri Jun 10 12:55:22 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 10 Jun 2016 12:55:22 -0400
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <20160610132051.GH27919@ando.pearwood.info>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
Message-ID: <njerdr$o46$1@ger.gmane.org>

On 6/10/2016 9:20 AM, Steven D'Aprano wrote:
> On Fri, Jun 10, 2016 at 01:13:10PM +0200, Victor Stinner wrote:
>> Hi,
>>
>> Over the last few weeks, I researched how to get stable and reliable
>> benchmarks, especially for the corner case of microbenchmarks. The
>> first result is a series of articles; here are the first three:
>
> Thank you for this! I am very interested in benchmarking.
>
>> https://haypo.github.io/journey-to-stable-benchmark-system.html
>> https://haypo.github.io/journey-to-stable-benchmark-deadcode.html
>> https://haypo.github.io/journey-to-stable-benchmark-average.html
>
> I strongly question your statement in the third:
>
>     [quote]
>     But how can we compare performances if results are random?
>     Take the minimum?
>
>     No! You must never (ever again) use the minimum for
>     benchmarking! Compute the average and some statistics like
>     the standard deviation:
>     [end quote]
>
>
> While I'm happy to see a real-world use for the statistics module, I
> disagree with your logic.
>
> The problem is that random noise can only ever slow the code down, it
> cannot speed it up. To put it another way, the random errors in the
> timings are always positive.
>
> Suppose you micro-benchmark some code snippet and get a series of
> timings. We can model the measured times as:
>
>     measured time t = T + ε
>
> where T is the unknown "true" timing we wish to estimate,

For comparative timings, we do not care about T.  So arguments about the
best estimate of T miss the point.

What we do wish to estimate is the relationship between two Ts, T0 for
'control', and T1 for 'treatment', in particular T1/T0.  I suspect
Victor is correct that mean(t1)/mean(t0) is better than min(t1)/min(t0)
as an estimate of the true ratio T1/T0 (for a particular machine).

But given that we have matched pairs of measurements with the same 
hashseed and address, it may be better yet to estimate T1/T0 from the 
ratios t1i/t0i, where i indexes experimental conditions.  But it has 
been a long time since I have read about estimation of ratios.  What I 
remember is that this is a nasty subject.
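
The three candidate estimators mentioned above can be written down in a few
lines. This is only a sketch with made-up timings; ratio_estimates is a
hypothetical helper, and it assumes that t0[i] and t1[i] were measured under
the same experimental condition (same hash seed and address layout).

```python
import statistics


def ratio_estimates(t0, t1):
    """Return three estimates of T1/T0 from matched timing samples."""
    if len(t0) != len(t1):
        raise ValueError("matched pairs need samples of equal length")
    return {
        # Ratio of the averages.
        "mean_ratio": statistics.mean(t1) / statistics.mean(t0),
        # Ratio of the minimums.
        "min_ratio": min(t1) / min(t0),
        # Average of the per-condition ratios t1i/t0i.
        "paired_ratio": statistics.mean([b / a for a, b in zip(t0, t1)]),
    }


# Hypothetical timings in milliseconds; index i is one experimental condition.
control = [10.0, 10.4, 10.2, 11.0]
treatment = [9.0, 9.4, 9.1, 10.0]

estimates = ratio_estimates(control, treatment)
for label in sorted(estimates):
    print("%-12s %.4f" % (label, estimates[label]))
```

The three numbers generally differ, which is part of why the estimation of
ratios is awkward.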

It is also the case that while an individual with one machine wants the 
best ratio for that machine, we need to make CPython patch decisions for 
the universe of machines that run Python.

> and ε is some variable error due to noise in the system.
> But ε is always positive, never negative,

A lognormal distribution might be a first guess, but what we really have
is contributions from multiple factors.

-- 
Terry Jan Reedy

From sebastian at realpath.org  Fri Jun 10 13:01:23 2016
From: sebastian at realpath.org (Sebastian Krause)
Date: Fri, 10 Jun 2016 19:01:23 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 (Guido van Rossum's message of "Fri, 10 Jun 2016 09:23:17 -0700")
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
Message-ID: <m21t455670.fsf@news.realpath.org>

Guido van Rossum <guido at python.org> wrote:
> I just don't like the potentially blocking behavior, and experts' opinions
> seem to widely vary on how insecure the fallback bits really are, how
> likely you are to find yourself in that situation, and how probable an
> exploit would be.

This is not just a theoretical problem being discussed by security
experts that *could* be exploited, there have already been multiple
real-life cases of devices (mostly embedded Linux machines)
generating predictable SSH keys because they read from an
uninitialized /dev/urandom at first boot. Most recently in the
Raspbian distribution for the Raspberry Pi:
https://www.raspberrypi.org/forums/viewtopic.php?f=66&t=126892

At least in 3.6 there should be an obvious way to get random data that
is *always* guaranteed to be secure and either fails or blocks if it
can't guarantee that.

Sebastian
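
As an illustration of the "fail rather than return weak data" option, the
sketch below uses os.getrandom() with GRND_NONBLOCK where it is available
(Python 3.6+ on Linux 3.17 or later). secure_bytes is a hypothetical helper,
not a proposed API, and the os.urandom() fallback is an assumption whose
guarantees depend on the platform.

```python
import os


def secure_bytes(n):
    """Return n random bytes, raising instead of returning weak data."""
    if hasattr(os, "getrandom"):
        data = b""
        while len(data) < n:
            # GRND_NONBLOCK makes the call raise BlockingIOError if the
            # kernel entropy pool has not been initialised yet.
            data += os.getrandom(n - len(data), os.GRND_NONBLOCK)
        return data
    # Assumption: on other platforms fall back to os.urandom(), whose
    # behaviour before the pool is initialised is platform dependent.
    return os.urandom(n)


key = secure_bytes(16)
print(len(key))
```

Dropping the GRND_NONBLOCK flag gives the blocking variant: the call waits
until the pool is initialised instead of raising.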

From zreed at fastmail.com  Fri Jun 10 13:04:31 2016
From: zreed at fastmail.com (zreed at fastmail.com)
Date: Fri, 10 Jun 2016 12:04:31 -0500
Subject: [Python-Dev] PEP 468
In-Reply-To: <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
Message-ID: <1465578271.1903265.634024905.666E8FA2@webmail.messagingengine.com>

I would be super excited for this feature, so if there's a reasonable
chance of it being picked up I don't mind doing the implementation work.

On Fri, Jun 10, 2016, at 11:54 AM, Eric Snow wrote:
> On Thu, Jun 9, 2016 at 1:10 PM, Émanuel Barry <vgr255 at live.ca> wrote:
> > As stated by Guido (and pointed out in the PEP):
> >
> > Making **kwds ordered is still open, but requires careful design and
> > implementation to avoid slowing down function calls that don't benefit.
> >
> > The PEP has not been updated in a while, though. Python 3.5 has been
> > released, and with it a C implementation of OrderedDict.
> >
> > Eric, are you still interested in this?
> 
> Yes, but wasn't planning on dusting it off yet (i.e. in time for 3.6).
> I'm certainly not opposed to someone picking up the banner.
> <wink-wink>
> 
> > IIRC that PEP was one of the
> > motivating use cases for implementing OrderedDict in C.
> 
> Correct, though I'm not sure OrderedDict needs to be involved any more.
> 
> > Maybe it's time for
> > a second round of discussion on Python-ideas?
> 
> Fine with me, though I won't have a lot of time in the 3.6 timeframe
> to handle a high-volume discussion or push through an implementation.
> 
> -eric

From tjreedy at udel.edu  Fri Jun 10 13:04:51 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 10 Jun 2016 13:04:51 -0400
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
Message-ID: <njervk$1dk$1@ger.gmane.org>

On 6/10/2016 11:07 AM, Victor Stinner wrote:
> I started to work on visualisation. IMHO it helps to understand the problem.
>
> Let's create a large dataset: 500 samples (100 processes x 5 samples):

As I finished my response to Steven, I was thinking you should do
something like this to get real data.

> ---
> $ python3 telco.py --json-file=telco.json -p 100 -n 5
> ---
>
> The attached plot.py script creates a histogram:
> ---
> avg: 26.7 ms +- 0.2 ms; min = 26.2 ms
>
> 26.1 ms:   1 #
> 26.2 ms:  12 #####
> 26.3 ms:  34 ############
> 26.4 ms:  44 ################
> 26.5 ms: 109 ######################################
> 26.6 ms: 117 ########################################
> 26.7 ms:  86 ##############################
> 26.8 ms:  50 ##################
> 26.9 ms:  32 ###########
> 27.0 ms:  10 ####
> 27.1 ms:   3 ##
> 27.2 ms:   1 #
> 27.3 ms:   1 #
>
> minimum 26.1 ms: 0.2% (1) of 500 samples
> ---
>
> Replace "if 1" with "if 0" to produce a graphical view, or just view
> the attached distribution.png, the numpy+scipy histogram.
>
> The distribution looks like a Gaussian curve:
> https://en.wikipedia.org/wiki/Gaussian_function

I am not too surprised.  If there are several somewhat independent 
sources of slowdown, their sum would tend to be normal.  I am also not 
surprised that there is also a bit of skewness, but probably not enough 
to worry about.

> The interesting thing is that only 1 sample out of 500 is in the minimum
> bucket (26.1 ms). If you say that the performance is 26.1 ms, only
> 0.2% of your users will be able to reproduce this timing.
>
> The average and std dev are 26.7 ms +- 0.2 ms, so numbers 26.5 ms ..
> 26.9 ms: we got 109+117+86+50+32 samples in this range which gives us
> 394/500 = 79%.
>
> IMHO saying "26.7 ms +- 0.2 ms" (79% of samples) is less a lie than
> 26.1 ms (0.2%).

-- 
Terry Jan Reedy
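
The percentages quoted above can be re-derived from the bucket counts in the
histogram. The sketch below is only a cross-check: buckets are keyed in
tenths of a millisecond to avoid float comparisons, and the range 26.5 ms to
26.9 ms is taken from the quoted "26.7 ms +- 0.2 ms".

```python
# Bucket (tenths of a millisecond) -> number of samples, from the histogram.
histogram = {
    261: 1, 262: 12, 263: 34, 264: 44, 265: 109, 266: 117, 267: 86,
    268: 50, 269: 32, 270: 10, 271: 3, 272: 1, 273: 1,
}

total = sum(histogram.values())
# Samples within one reported standard deviation of the reported average.
in_range = sum(count for bucket, count in histogram.items()
               if 265 <= bucket <= 269)
# Samples in the fastest bucket.
at_minimum = histogram[min(histogram)]

print("total samples: %d" % total)
print("26.5..26.9 ms: %d (%.1f%%)" % (in_range, 100.0 * in_range / total))
print("minimum bucket: %d (%.1f%%)" % (at_minimum, 100.0 * at_minimum / total))
```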

From tritium-list at sdamon.com  Fri Jun 10 13:05:58 2016
From: tritium-list at sdamon.com (Alex Walters)
Date: Fri, 10 Jun 2016 13:05:58 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <m21t455670.fsf@news.realpath.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
Message-ID: <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-bounces+tritium-
> list=sdamon.com at python.org] On Behalf Of Sebastian Krause
> Sent: Friday, June 10, 2016 1:01 PM
> To: python-dev at python.org
> Subject: Re: [Python-Dev] BDFL ruling request: should we block forever
> waiting for high-quality random bits?
> 
> Guido van Rossum <guido at python.org> wrote:
> > I just don't like the potentially blocking behavior, and experts' opinions
> > seem to widely vary on how insecure the fallback bits really are, how
> > likely you are to find yourself in that situation, and how probable an
> > exploit would be.
> 
> This is not just a theoretical problem being discussed by security
> experts that *could* be exploited, there have already been multiple
> real-life cases of devices (mostly embedded Linux machines)
> generating predictable SSH keys because they read from an
> uninitialized /dev/urandom at first boot. Most recently in the
> Raspbian distribution for the Raspberry Pi:
> https://www.raspberrypi.org/forums/viewtopic.php?f=66&t=126892
> 
> At least in 3.6 there should be an obvious way to get random data that
> is *always* guaranteed to be secure and either fails or blocks if it
> can't guarantee that.
> 
> Sebastian

And that should live in the secrets module.
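
For reference, the secrets module (PEP 506, new in 3.6) exposes helpers
along these lines. The snippet only shows the calling convention; whether
these calls block or fail on an uninitialised entropy pool depends on the
underlying OS source, which is exactly what this thread is debating.

```python
import secrets

# 32 random bytes, e.g. for a key.
token = secrets.token_bytes(32)
# 16 random bytes rendered as 32 hexadecimal characters.
hex_token = secrets.token_hex(16)

print(len(token), len(hex_token))
```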

From steve at pearwood.info  Fri Jun 10 13:04:54 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 11 Jun 2016 03:04:54 +1000
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
Message-ID: <20160610170453.GI27919@ando.pearwood.info>

On Fri, Jun 10, 2016 at 05:07:18PM +0200, Victor Stinner wrote:
> I started to work on visualisation. IMHO it helps to understand the problem.
> 
> Let's create a large dataset: 500 samples (100 processes x 5 samples):
> ---
> $ python3 telco.py --json-file=telco.json -p 100 -n 5
> ---
> 
> The attached plot.py script creates a histogram:
> ---
> avg: 26.7 ms +- 0.2 ms; min = 26.2 ms
> 
> 26.1 ms:   1 #
> 26.2 ms:  12 #####
> 26.3 ms:  34 ############
> 26.4 ms:  44 ################
> 26.5 ms: 109 ######################################
> 26.6 ms: 117 ########################################
> 26.7 ms:  86 ##############################
> 26.8 ms:  50 ##################
> 26.9 ms:  32 ###########
> 27.0 ms:  10 ####
> 27.1 ms:   3 ##
> 27.2 ms:   1 #
> 27.3 ms:   1 #
> 
> minimum 26.1 ms: 0.2% (1) of 500 samples
> ---
[...] 
> The distribution looks like a Gaussian curve:
> https://en.wikipedia.org/wiki/Gaussian_function

Lots of distributions look a bit Gaussian, but they can be skewed, or 
truncated, or both. E.g. the average life-span of a lightbulb is 
approximately Gaussian with a central peak at some value (let's say 5000 
hours), but while it is conceivable that you might be really lucky and 
find a bulb that lasts 15000 hours, it isn't possible to find one that 
lasts -10000 hours. The distribution is truncated on the left.

To me, your graph looks like the distribution is skewed: the right-hand
tail (shown at the bottom) is longer than the left-hand tail, seven
buckets compared to five buckets. There are actual statistical tests for
detecting deviation from Gaussian curves, but I'd have to look them up. 
But as a really quick and dirty test, we can count the number of samples 
on either side of the central peak (the mode):

left: 109+44+34+12+1 = 200
centre: 117
right: 500 - 200 - 117 = 183

It certainly looks *close* to Gaussian, but with the crude tests we are 
using, we can't be sure. If you took more and more samples, I would 
expect that the right-hand tail would get longer and longer, but the 
left-hand tail would not.
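
The quick and dirty count can be reproduced mechanically from the histogram.
This is just a sketch of that tally, with buckets keyed in tenths of a
millisecond to avoid float comparisons; it is not one of the formal
normality tests mentioned above.

```python
# Bucket (tenths of a millisecond) -> number of samples, from the histogram.
histogram = {
    261: 1, 262: 12, 263: 34, 264: 44, 265: 109, 266: 117, 267: 86,
    268: 50, 269: 32, 270: 10, 271: 3, 272: 1, 273: 1,
}

# The mode is the bucket with the most samples.
mode = max(histogram, key=histogram.get)

left = sum(count for bucket, count in histogram.items() if bucket < mode)
centre = histogram[mode]
right = sum(count for bucket, count in histogram.items() if bucket > mode)

left_buckets = sum(1 for bucket in histogram if bucket < mode)
right_buckets = sum(1 for bucket in histogram if bucket > mode)

print("mode bucket: %.1f ms" % (mode / 10))
print("left: %d samples in %d buckets" % (left, left_buckets))
print("centre: %d samples" % centre)
print("right: %d samples in %d buckets" % (right, right_buckets))
```

The right-hand side spans more buckets but holds fewer samples, which is
consistent with a mild skew towards slower timings.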

> The interesting thing is that only 1 sample out of 500 is in the minimum
> bucket (26.1 ms). If you say that the performance is 26.1 ms, only
> 0.2% of your users will be able to reproduce this timing.

Hmmm. Okay, that is a good point. In this case, you're not so much 
reporting your estimate of what the "true speed" of the code snippet 
would be in the absence of all noise, but your estimate of what your 
users should expect to experience "most of the time".

Assuming they have exactly the same hardware, operating system, and load 
on their system as you have.

> The average and std dev are 26.7 ms +- 0.2 ms, so numbers 26.5 ms ..
> 26.9 ms: we got 109+117+86+50+32 samples in this range which gives us
> 394/500 = 79%.
> 
> IMHO saying "26.7 ms +- 0.2 ms" (79% of samples) is less a lie than
> 26.1 ms (0.2%).

I think I understand the point you are making. I'll have to think about 
it some more to decide if I agree with you.

But either way, I think the work you have done on perf is fantastic and 
I think this will be a great tool. I really love the histogram. Can you 
draw a histogram of two functions side-by-side, for comparisons?

-- 
Steve

From ncoghlan at gmail.com  Fri Jun 10 13:40:53 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Jun 2016 10:40:53 -0700
Subject: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray,
 and memoryview
In-Reply-To: <20160609222157.2063ca00@anarchist.wooz.org>
References: <57572E5D.4020101@stoneleaf.us>
 <20160609222157.2063ca00@anarchist.wooz.org>
Message-ID: <CADiSq7cMDs3jzN0dZiZ5H9FKA_qMuVhVe-foSX14yZKMP8FNjg@mail.gmail.com>

On 9 June 2016 at 19:21, Barry Warsaw <barry at python.org> wrote:
> On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
>
>>Deprecation of current "zero-initialised sequence" behaviour
>>------------------------------------------------------------
>>
>>Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
>>argument and interpret it as meaning to create a zero-initialised sequence of
>>the given size::
>>
>>     >>> bytes(3)
>>     b'\x00\x00\x00'
>>     >>> bytearray(3)
>>     bytearray(b'\x00\x00\x00')
>>
>>This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
>>entirely in Python 3.7.
>>
>>No other changes are proposed to the existing constructors.
>
> Does it need to be *actually* removed?  That does break existing code for not
> a lot of benefit.  Yes, the default constructor is a little wonky, but with
> the addition of the new constructors, and the fact that you're not proposing
> to eventually change the default constructor, removal seems unnecessary.
> Besides, once it's removed, what would `bytes(3)` actually do?  The PEP
> doesn't say.

Raise TypeError, presumably. However, I agree this isn't worth the
hassle of breaking working code, especially since truly ludicrous
values will fail promptly with MemoryError. The only potential problem
is the particular range of values that fit within the limits of the
machine but also push it into heavy swapping.

> Also, since you're proposing to add `bytes.byte(3)` have you considered also
> adding an optional count argument?  E.g. `bytes.byte(3, count=7)` would yield
> b'\x03\x03\x03\x03\x03\x03\x03'.  That seems like it could be useful.

The purpose of bytes.byte() in the PEP is to provide a way to
roundtrip ord() calls with binary inputs, since the current spelling
is pretty unintuitive:

    >>> ord("A")
    65
    >>> chr(ord("A"))
    'A'
    >>> ord(b"A")
    65
    >>> bytes([ord(b"A")])
    b'A'

That said, perhaps it would make more sense for the corresponding
round-trip to be:

    >>> bchr(ord("A"))
    b'A'

With the "b" prefix on "chr" reflecting the "b" prefix on the output.
This also inverts the chr/unichr pairing that existed in Python 2
(replacing it with bchr/chr), and is hence very friendly to
compatibility modules like six and future (future.builtins already
provides a chr that behaves like the Python 3 one, and bchr would be
much easier to add to that than a new bytes object method).
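
To make the proposed round-trip concrete, here is a minimal sketch.
`bchr` is not an existing builtin; the definition below is a hypothetical
stand-in built on the current `bytes([i])` spelling:

```python
def bchr(i):
    """Hypothetical helper: return the length-1 bytes object for integer i.

    bytes([i]) raises ValueError when i is outside range(256).
    """
    return bytes([i])


# Text round-trip with the existing builtins.
text_roundtrip = chr(ord("A"))

# Binary round-trip: the current spelling and the proposed one agree.
current_spelling = bytes([ord(b"A")])
proposed_spelling = bchr(ord(b"A"))
```

Both binary spellings yield b'A'; the only difference is how easy the
operation is to discover and read.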

In terms of an efficient memory-preallocation interface, the
equivalent NumPy operation to request a pre-filled array is
"numpy.full":
http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.full.html
(there's also an in-place mutation method, "ndarray.fill")

For bytes and bytearray though, that has an unfortunate name collision
with "zfill", which refers to zero-padding numeric values for fixed
width display.

If the PEP just added bchr() to complement chr(), and [bytes,
bytearray].zeros() as a more discoverable alternative to passing
integers to the default constructor, I think that would be a decent
step forward, and the question of pre-initialising with arbitrary
values can be deferred for now (and perhaps left to NumPy
indefinitely).
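
For comparison, both operations discussed here can already be spelled
with today's constructors. In the sketch below, `zeros` and `filled` are
hypothetical free functions standing in for the suggested
[bytes, bytearray].zeros() and for Barry's `count` idea; neither name
exists in the standard library:

```python
def zeros(n):
    """Hypothetical stand-in for bytes.zeros(n): n zero bytes."""
    return bytes(n)


def filled(value, count):
    """Hypothetical stand-in for bytes.byte(value, count=count)."""
    return bytes([value]) * count


three_zeros = zeros(3)
seven_threes = filled(3, 7)
mutable_zeros = bytearray(3)  # the bytearray constructor behaves the same way
```

Sequence repetition already covers the arbitrary-fill case, which is one
reason deferring it costs little.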

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Fri Jun 10 13:41:48 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 10 Jun 2016 17:41:48 +0000
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <20160610170453.GI27919@ando.pearwood.info>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
 <20160610170453.GI27919@ando.pearwood.info>
Message-ID: <CAP1=2W4n5_63nnq7NTDDXOCFVmxo3oH1=nb3MWopoCkfZDdn4w@mail.gmail.com>

On Fri, 10 Jun 2016 at 10:11 Steven D'Aprano <steve at pearwood.info> wrote:

> On Fri, Jun 10, 2016 at 05:07:18PM +0200, Victor Stinner wrote:
> > I started to work on visualisation. IMHO it helps to understand the
> problem.
> >
> > Let's create a large dataset: 500 samples (100 processes x 5 samples):
> > ---
> > $ python3 telco.py --json-file=telco.json -p 100 -n 5
> > ---
> >
> > Attached plot.py script creates an histogram:
> > ---
> > avg: 26.7 ms +- 0.2 ms; min = 26.2 ms
> >
> > 26.1 ms:   1 #
> > 26.2 ms:  12 #####
> > 26.3 ms:  34 ############
> > 26.4 ms:  44 ################
> > 26.5 ms: 109 ######################################
> > 26.6 ms: 117 ########################################
> > 26.7 ms:  86 ##############################
> > 26.8 ms:  50 ##################
> > 26.9 ms:  32 ###########
> > 27.0 ms:  10 ####
> > 27.1 ms:   3 ##
> > 27.2 ms:   1 #
> > 27.3 ms:   1 #
> >
> > minimum 26.1 ms: 0.2% (1) of 500 samples
> > ---
> [...]
> > The distribution looks a gaussian curve:
> > https://en.wikipedia.org/wiki/Gaussian_function
>
> Lots of distributions look a bit Gaussian, but they can be skewed, or
> truncated, or both. E.g. the average life-span of a lightbulb is
> approximately Gaussian with a central peak at some value (let's say 5000
> hours), but while it is conceivable that you might be really lucky and
> find a bulb that lasts 15000 hours, it isn't possible to find one that
> lasts -10000 hours. The distribution is truncated on the left.
>
> To me, your graph looks like the distribution is skewed: the right-hand
> tail (shown at the bottom) is longer than the left-hand tail, six
> buckets compared to five buckets. There are actual statistical tests for
> detecting deviation from Gaussian curves, but I'd have to look them up.
> But as a really quick and dirty test, we can count the number of samples
> on either side of the central peak (the mode):
>
> left: 109+44+34+12+1 = 200
> centre: 117
> right: 500 - 200 - 117 = 183
>
> It certainly looks *close* to Gaussian, but with the crude tests we are
> using, we can't be sure. If you took more and more samples, I would
> expect that the right-hand tail would get longer and longer, but the
> left-hand tail would not.
>
>
> > The interesting thing is that only 1 sample on 500 are in the minimum
> > bucket (26.1 ms). If you say that the performance is 26.1 ms, only
> > 0.2% of your users will be able to reproduce this timing.
>
> Hmmm. Okay, that is a good point. In this case, you're not so much
> reporting your estimate of what the "true speed" of the code snippet
> would be in the absence of all noise, but your estimate of what your
> users should expect to experience "most of the time".
>
>
I think the other way to think about why you don't want to use the minimum
is what if one run just happened to get lucky and ran when nothing else was
running (some random lull on the system), while the second run didn't get
so lucky on magically hitting an equivalent lull? Using the average helps
remove the "luck of the draw" potential of taking the minimum. This is why
the PyPy folks suggested to Victor to not consider the minimum but the
average instead; minimum doesn't measure typical system behaviour.
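
A small illustration of that "luck of the draw" effect, using made-up
timings rather than real measurements. The two lists below are assumed
to be runs of the same code, and they differ only in that the first run
hit one lucky lull:

```python
import statistics

# Hypothetical timings in milliseconds; run_a contains one "lucky" sample.
run_a = [26.1, 26.6, 26.7, 26.6, 26.8]
run_b = [26.5, 26.6, 26.7, 26.6, 26.8]

# Comparing minimums attributes the whole lucky sample to the code.
min_gap = abs(min(run_a) - min(run_b))

# Comparing averages spreads that one sample over the whole run.
mean_gap = abs(statistics.mean(run_a) - statistics.mean(run_b))

print("gap between minimums: %.2f ms" % min_gap)
print("gap between averages: %.2f ms" % mean_gap)
```

With these numbers the minimums suggest a far larger difference between
two runs of identical code than the averages do.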

> Assuming they have exactly the same hardware, operating system, and load
> on their system as you have.
>

Sure, but that's true of any benchmarking. The only way to get accurate
measurements for one's own system is to run the benchmarks yourself.

-Brett

>
>
> > The average and std dev are 26.7 ms +- 0.2 ms, so numbers 26.5 ms ..
> > 26.9 ms: we got 109+117+86+50+32 samples in this range which gives us
> > 394/500 = 79%.
> >
> > IMHO saying "26.7 ms +- 0.2 ms" (79% of samples) is less a lie than
> > 26.1 ms (0.2%).
>
> I think I understand the point you are making. I'll have to think about
> it some more to decide if I agree with you.
>
> But either way, I think the work you have done on perf is fantastic and
> I think this will be a great tool. I really love the histogram. Can you
> draw a histogram of two functions side-by-side, for comparisons?
>
>
> --
> Steve

From ncoghlan at gmail.com  Fri Jun 10 13:49:37 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Jun 2016 10:49:37 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
Message-ID: <CADiSq7fcimcrADzR=bJG+nVgakMYOq8FH4UxCJWSJwUFwQiZ2g@mail.gmail.com>

On 9 June 2016 at 16:43, Brett Cannon <brett at python.org> wrote:
> That's not what I'm saying at all (nor what I think Nick is saying); more
> tooling to ease the transition is always welcomed.

What Brett said is mostly accurate for me, except with one slight
caveat: I've been explicitly trying to nudge you towards making the
*existing tools better*, rather than introducing new tools. With
modernize and futurize we have a fairly clear trade-off ("Do you want
your code to look more like Python 2 or more like Python 3?"), and
things like "pylint --py3k" and the static analyzers are purely
additive to the migration process (so folks can take them or leave
them), but alternate interpreter builds and new converters have really
high barriers to adoption.

More -3 warnings in Python 2.7 are definitely welcome (since those can
pick up runtime behaviors that the static analysers miss), and if
there are things the existing code converters and static analysers
*could* detect but don't, that's a fruitful avenue for improvement as
well.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri Jun 10 14:00:00 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Jun 2016 11:00:00 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
Message-ID: <CADiSq7f7O+fXK7Ci00FwBhWwfiwN_VOwev+Ju0R3VRy56CK4UQ@mail.gmail.com>

On 10 June 2016 at 07:09, Cody Piersall <cody.piersall at gmail.com> wrote:
>> One problem is that the str literals should be bytes
>> literals.  Comparison with None needs to be avoided.
>>
>> With Python 2 code runs successfully.  With Python 3 the code
>> crashes with a traceback.  With my modified Python 3.6, the code
>> runs successfully but generates the following warnings:
>>
>>     test.py:13: DeprecationWarning: encoding bytes to str
>>       output.write('%d:' % len(s))
>>     test.py:14: DeprecationWarning: encoding bytes to str
>>       output.write(s)
>>     test.py:15: DeprecationWarning: encoding bytes to str
>>       output.write(',')
>>     test.py:5: DeprecationWarning: encoding bytes to str
>>       if c == ':':
>>     test.py:9: DeprecationWarning: encoding bytes to str
>>       size += c
>>     test.py:24: DeprecationWarning: encoding bytes to str
>>       data = data + s
>>     test.py:26: DeprecationWarning: encoding bytes to str
>>       if input.read(1) != ',':
>>     test.py:31: DeprecationWarning: default compare is depreciated
>>       if a > 0:
>>
>
> This seems _very_ useful; I'm surprised that other people don't think
> so too.  Currently, the easiest way to find bytes/str errors in a big
> application is by running the program, finding where it crashes,
> fixing that one line (or hopefully wherever the data entered the
> system if you can find it), and repeating the process.
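
For context, a minimal sketch of the Python 3 behaviour behind those
warnings, loosely modelled on the quoted test.py lines (the variable
names here are illustrative, not taken from that script):

```python
# In Python 3, reading from a binary stream yields bytes, not str.
c = b":"

# Comparing bytes with str is silently False; running with
# "python3 -b" turns this comparison into a BytesWarning.
matches = (c == ":")

# Mixing the two types in concatenation fails outright.
try:
    data = b"5:hello" + ","
except TypeError as exc:
    error = type(exc).__name__

print(matches, error)
```

The silent False is the harder case to track down, since nothing crashes
at the line where the types first disagree.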

It could be very interesting to add an "ascii-warn" codec to Python
2.7, and then set that as the default encoding when the -3 flag is
set. The expressed lack of interest has been in the idea of
recommending people use an alternate interpreter build (which has
nothing to do with the usefulness of the added warnings, and
everything to do with the logistics of distributing and adopting
alternate runtimes), rather than in the concept of improving the
available runtime compatibility warnings.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From neil at python.ca  Fri Jun 10 14:00:45 2016
From: neil at python.ca (Neil Schemenauer)
Date: Fri, 10 Jun 2016 11:00:45 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CADiSq7fcimcrADzR=bJG+nVgakMYOq8FH4UxCJWSJwUFwQiZ2g@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
 <CADiSq7fcimcrADzR=bJG+nVgakMYOq8FH4UxCJWSJwUFwQiZ2g@mail.gmail.com>
Message-ID: <04c726d9-e53b-7242-a3e4-c0a2435efb67@python.ca>

On 6/10/2016 10:49 AM, Nick Coghlan wrote:
> What Brett said is mostly accurate for me, except with one slight
> caveat: I've been explicitly trying to nudge you towards making the
> *existing tools better*, rather than introducing new tools. With
> modernize and futurize we have a fairly clear trade-off ("Do you want
> your code to look more like Python 2 or more like Python 3?"), and
> things like "pylint --py3k" and the static analyzers are purely
> additive to the migration process (so folks can take them or leave
> them), but alternate interpreter builds and new converters have really
> high barriers to adoption.

I agree with that idea.  If there is anything that is "clean" enough, it 
should be merged with either 2.7.x or 3.x.  There is nothing in my tree 
that can be usefully merged though.

> More -3 warnings in Python 2.7 are definitely welcome (since those can
> pick up runtime behaviors that the static analysers miss), and if
> there are things the existing code converters and static analysers
> *could* detect but don't, that's a fruitful avenue for improvement as
> well.

We are really limited on what can be done with the bytes/string issue 
because in Python 2 there is no distinct type for bytes. Also, the 
standard library does all sorts of unclean mixing of str and unicode so 
a warning would spew a lot of noise.

Likewise, a warning about comparison behavior (None, default ordering of 
types) would also not be useful because there is so much standard 
library code that would spew warnings.

From ncoghlan at gmail.com  Fri Jun 10 14:16:43 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Jun 2016 11:16:43 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <04c726d9-e53b-7242-a3e4-c0a2435efb67@python.ca>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAP1=2W70JKm=FepejD_qm-rCdPBUVLuK0Rm9aJ2bNBAhrYnBrA@mail.gmail.com>
 <CADiSq7fcimcrADzR=bJG+nVgakMYOq8FH4UxCJWSJwUFwQiZ2g@mail.gmail.com>
 <04c726d9-e53b-7242-a3e4-c0a2435efb67@python.ca>
Message-ID: <CADiSq7fOznRYMqK7DdK-ruU5U1vTTGR=qqR3qAzYKzSY-AGLFA@mail.gmail.com>

On 10 June 2016 at 11:00, Neil Schemenauer <neil at python.ca> wrote:
> On 6/10/2016 10:49 AM, Nick Coghlan wrote:
>> More -3 warnings in Python 2.7 are definitely welcome (since those can
>> pick up runtime behaviors that the static analysers miss), and if
>> there are things the existing code converters and static analysers
>> *could* detect but don't, that's a fruitful avenue for improvement as
>> well.
>
> We are really limited on what can be done with the bytes/string issue
> because in Python 2 there is no distinct type for bytes. Also, the standard
> library does all sorts of unclean mixing of str and unicode so a warning
> would spew a lot of noise.
>
> Likewise, a warning about comparison behavior (None, default ordering of
> types) would also not be useful because there is so much standard library
> code that would spew warnings.

Implicitly enabling those warnings universally with -3 might not be an
option then, but it may be feasible to have those warnings ignored by
default, and allow people to enable them selectively for their own
code via the warnings module.
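
As a rough sketch of that selective approach: the warnings module can
filter on the name of the module that triggers a warning. The module
names below ("myapp.netstring" for the user's code, "email.utils" as a
stand-in for noisy standard library code) and the helper `emit` are
purely illustrative, and warn_explicit() is used only to simulate
warnings coming from those modules:

```python
import warnings


def emit(module_name):
    """Simulate a DeprecationWarning raised from the named module."""
    filename = module_name.replace(".", "/") + ".py"
    warnings.warn_explicit(
        "encoding bytes to str", DeprecationWarning, filename, 1,
        module=module_name,
    )


with warnings.catch_warnings(record=True) as caught:
    # Ignored by default...
    warnings.simplefilter("ignore", DeprecationWarning)
    # ...but enabled for the application's own package. The module
    # argument is a regular expression matched against the module name.
    warnings.filterwarnings(
        "always", category=DeprecationWarning, module=r"myapp(\.|$)"
    )
    emit("myapp.netstring")
    emit("email.utils")

reported = [w.filename for w in caught]
print(reported)
```

Only the warning attributed to the application's package is recorded;
the one attributed to the other module is dropped by the default filter.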

Failing that, you may be right that there's value in a permissive
Python 3.x variant as an optional compatibility testing tool (I admit
I originally thought you were proposing such an environment as a
production deployment target for partially migrated code, which I'd be
thoroughly against, but as a tool for running a test suite or
experimentally migrated instance it would be closer in spirit to the
-3 switch and the static analysers - folks can use it if they think it
will help them, but they don't need to worry about it if they don't
need it themselves)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mertz at gnosis.cx  Fri Jun 10 14:29:01 2016
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 10 Jun 2016 11:29:01 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
Message-ID: <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>

This is fairly academic, since I do not anticipate needing to do this
myself, but I have a specific question.  I'll assume that Python 3.5.2 will
go back to the 2.6-3.4 behavior in which os.urandom() never blocks on
Linux.  Moreover, I understand that the cases where insecure bits might
be returned are limited to Python scripts that run during system
initialization on Linux.

If I *were* someone who needed to write a Linux system initialization
script using Python 3.5.2, what would the code look like?  I think for this
use case, requiring something with a little bit of "code smell" is fine,
but I kinda hope it exists at all.

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From ncoghlan at gmail.com  Fri Jun 10 14:29:09 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Jun 2016 11:29:09 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BPbof_y+S7C2z+XDwHjag5KwCcDih+GX-fwhZ6429ZUg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <575772E6.7040906@stoneleaf.us>
 <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>
 <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>
 <CALFfu7BPbof_y+S7C2z+XDwHjag5KwCcDih+GX-fwhZ6429ZUg@mail.gmail.com>
Message-ID: <CADiSq7eoafux0HR868JCHu6dNgioWf_YvOU0zBsr1Ek1_8JnWg@mail.gmail.com>

On 10 June 2016 at 09:42, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Thu, Jun 9, 2016 at 2:39 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> That restriction would be comparable to what we do with __slots__ today:
>>
>>     >>> class C:
>>     ...     __slots__ = 1
>>     ...
>>     Traceback (most recent call last):
>>      File "<stdin>", line 1, in <module>
>>     TypeError: 'int' object is not iterable
>
> Are you suggesting that we require it be a tuple of identifiers (or
> None) and raise TypeError otherwise, similar to __slots__?  The
> difference is that __slots__ has specific type requirements that do
> not apply to __definition_order__, as well as a different purpose.
> __definition_order__ is about preserving definition-type info that we
> are currently throwing away.

If we don't enforce the tuple-of-identifiers restriction at type
creation time, everyone that *doesn't* make it a tuple-of-identifiers
is likely to have a subtle compatibility bug with class decorators and
other code that assume the default tuple-of-identifiers format is the
only possible format (aside from None). To put it in PEP 484 terms:
regardless of what the PEP says, people are going to assume the type
of __definition_order__ is Optional[Tuple[str]], as that's going to
cover almost all class definitions they encounter.

It makes sense to me to give class definitions and metaclasses the
opportunity to change the *content* of the definition order: "Use
these names in this order, not the names and order you would have
calculated by default".

It doesn't make sense to me to give them an opportunity to change the
*form* of the definition order, since that makes it incredibly
difficult to consume correctly: "Sure, it's *normally* a
tuple-of-identifiers, but it *might* be a dictionary, or a complex
number, or a set, or whatever the class author decided to make it".

By contrast, if the class machinery enforces Optional[Tuple[str]],
then it becomes a lot easier to consume reliably, and anyone violating
the constraint gets an immediate exception when defining the offending
class, rather than a potentially obscure exception from a class
decorator or other piece of code that assumes __definition_order__
could only be None or a tuple of strings.
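
A sketch of what such an enforced check might look like. This is an
assumption about the shape of the validation, not the PEP's actual
implementation, and `check_definition_order` is a hypothetical name.
Note that in PEP 484 notation a variable-length tuple of strings is
written Tuple[str, ...], which is what the check below accepts:

```python
def check_definition_order(value):
    """Accept None or a tuple of identifier strings; reject anything else."""
    if value is None:
        return None
    if not isinstance(value, tuple):
        raise TypeError(
            "__definition_order__ must be a tuple or None, not %s"
            % type(value).__name__
        )
    for name in value:
        if not isinstance(name, str) or not name.isidentifier():
            raise TypeError(
                "__definition_order__ entries must be identifiers, got %r"
                % (name,)
            )
    return value


# The content can be customised freely...
custom = check_definition_order(("spam", "eggs"))

# ...but changing the form fails immediately.
rejected = []
for bad in ({"spam": 1}, 1j, {"spam"}, ("spam", 1)):
    try:
        check_definition_order(bad)
    except TypeError:
        rejected.append(bad)
```

Failing at class creation points at the offending class, rather than at
whichever decorator happens to consume the attribute first.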

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From donald at stufft.io  Fri Jun 10 14:33:17 2016
From: donald at stufft.io (Donald Stufft)
Date: Fri, 10 Jun 2016 14:33:17 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
Message-ID: <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>

> On Jun 10, 2016, at 2:29 PM, David Mertz <mertz at gnosis.cx> wrote:
> 
> If I *were* someone who needed to write a Linux system initialization script using Python 3.5.2, what would the code look like.  I think for this use case, requiring something with a little bit of "code smell" is fine, but I kinda hope it exists at all.

Do you mean if os.urandom blocked and you wanted to call os.urandom from your boot script? Or if os.urandom doesn't block and you wanted to ensure you got good random numbers on boot?

--
Donald Stufft


From kmod at dropbox.com  Fri Jun 10 14:37:17 2016
From: kmod at dropbox.com (Kevin Modzelewski)
Date: Fri, 10 Jun 2016 11:37:17 -0700
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <20160610170453.GI27919@ando.pearwood.info>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
 <20160610170453.GI27919@ando.pearwood.info>
Message-ID: <CAO=oM6v+LS6OsbCAi4OE-g_QYxvYrdTeX0-Ueyp306Gtz5s2-A@mail.gmail.com>

Hi all, I wrote a blog post about this.
http://blog.kevmod.com/2016/06/benchmarking-minimum-vs-average/

We can rule out any argument that one (minimum or average) is strictly
better than the other, since there are cases that make either one better.
It comes down to our expectation of the underlying distribution.

Victor, if you could calculate the sample skewness
<https://en.wikipedia.org/wiki/Skewness#Sample_skewness> of your results, I
think that would be very interesting!
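
As a rough approximation, the skewness can be estimated from the
histogram quoted below. The sketch assumes every sample sits exactly at
its bucket label, which the raw data would not; it uses the simple
moment-based estimator g1 = m3 / m2**1.5, and all variable names are
illustrative:

```python
# Bucket label in ms -> number of samples, copied from the histogram.
histogram = {
    26.1: 1, 26.2: 12, 26.3: 34, 26.4: 44, 26.5: 109, 26.6: 117,
    26.7: 86, 26.8: 50, 26.9: 32, 27.0: 10, 27.1: 3, 27.2: 1, 27.3: 1,
}

n = sum(histogram.values())
mean = sum(value * count for value, count in histogram.items()) / n

# Second and third central moments of the bucketed data.
m2 = sum(count * (value - mean) ** 2 for value, count in histogram.items()) / n
m3 = sum(count * (value - mean) ** 3 for value, count in histogram.items()) / n

# Positive means a longer right-hand tail, negative a longer left-hand one.
skewness = m3 / m2 ** 1.5

print("n=%d mean=%.2f ms std=%.2f ms skewness=%.2f"
      % (n, mean, m2 ** 0.5, skewness))
```

On the bucketed data the estimate comes out mildly positive, which fits
a slightly longer right-hand tail; the raw samples would give a more
trustworthy figure.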

kmod

On Fri, Jun 10, 2016 at 10:04 AM, Steven D'Aprano <steve at pearwood.info>
wrote:

> On Fri, Jun 10, 2016 at 05:07:18PM +0200, Victor Stinner wrote:
> > I started to work on visualisation. IMHO it helps to understand the
> problem.
> >
> > Let's create a large dataset: 500 samples (100 processes x 5 samples):
> > ---
> > $ python3 telco.py --json-file=telco.json -p 100 -n 5
> > ---
> >
> > Attached plot.py script creates an histogram:
> > ---
> > avg: 26.7 ms +- 0.2 ms; min = 26.2 ms
> >
> > 26.1 ms:   1 #
> > 26.2 ms:  12 #####
> > 26.3 ms:  34 ############
> > 26.4 ms:  44 ################
> > 26.5 ms: 109 ######################################
> > 26.6 ms: 117 ########################################
> > 26.7 ms:  86 ##############################
> > 26.8 ms:  50 ##################
> > 26.9 ms:  32 ###########
> > 27.0 ms:  10 ####
> > 27.1 ms:   3 ##
> > 27.2 ms:   1 #
> > 27.3 ms:   1 #
> >
> > minimum 26.1 ms: 0.2% (1) of 500 samples
> > ---
> [...]
> > The distribution looks a gaussian curve:
> > https://en.wikipedia.org/wiki/Gaussian_function
>
> Lots of distributions look a bit Gaussian, but they can be skewed, or
> truncated, or both. E.g. the average life-span of a lightbulb is
> approximately Gaussian with a central peak at some value (let's say 5000
> hours), but while it is conceivable that you might be really lucky and
> find a bulb that lasts 15000 hours, it isn't possible to find one that
> lasts -10000 hours. The distribution is truncated on the left.
>
> To me, your graph looks like the distribution is skewed: the right-hand
> tail (shown at the bottom) is longer than the left-hand tail, six
> buckets compared to five buckets. There are actual statistical tests for
> detecting deviation from Gaussian curves, but I'd have to look them up.
> But as a really quick and dirty test, we can count the number of samples
> on either side of the central peak (the mode):
>
> left: 109+44+34+12+1 = 200
> centre: 117
> right: 500 - 200 - 117 = 183
>
> It certainly looks *close* to Gaussian, but with the crude tests we are
> using, we can't be sure. If you took more and more samples, I would
> expect that the right-hand tail would get longer and longer, but the
> left-hand tail would not.
>
>
> > The interesting thing is that only 1 sample on 500 are in the minimum
> > bucket (26.1 ms). If you say that the performance is 26.1 ms, only
> > 0.2% of your users will be able to reproduce this timing.
>
> Hmmm. Okay, that is a good point. In this case, you're not so much
> reporting your estimate of what the "true speed" of the code snippet
> would be in the absence of all noise, but your estimate of what your
> users should expect to experience "most of the time".
>
> Assuming they have exactly the same hardware, operating system, and load
> on their system as you have.
>
>
> > The average and std dev are 26.7 ms +- 0.2 ms, so numbers 26.5 ms ..
> > 26.9 ms: we got 109+117+86+50+32 samples in this range which gives us
> > 394/500 = 79%.
> >
> > IMHO saying "26.7 ms +- 0.2 ms" (79% of samples) is less a lie than
> > 26.1 ms (0.2%).
>
> I think I understand the point you are making. I'll have to think about
> it some more to decide if I agree with you.
>
> But either way, I think the work you have done on perf is fantastic and
> I think this will be a great tool. I really love the histogram. Can you
> draw a histogram of two functions side-by-side, for comparisons?
>
>
> --
> Steve

From chris.jerdonek at gmail.com  Fri Jun 10 14:42:40 2016
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Fri, 10 Jun 2016 11:42:40 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
Message-ID: <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>

On Fri, Jun 10, 2016 at 11:29 AM, David Mertz <mertz at gnosis.cx> wrote:
> This is fairly academic, since I do not anticipate needing to do this
> myself, but I have a specific question.  I'll assume that Python 3.5.2 will
> go back to the 2.6-3.4 behavior in which os.urandom() never blocks on Linux.
> Moreover, I understand that the case where the insecure bits might be
> returned are limited to Python scripts that run on system initialization on
> Linux.
>
> If I *were* someone who needed to write a Linux system initialization script
> using Python 3.5.2, what would the code look like?  I think for this use
> case, requiring something with a little bit of "code smell" is fine, but I
> kinda hope it exists at all.

Good question.  And going back to Larry's original e-mail, where he said--

On Thu, Jun 9, 2016 at 4:25 AM, Larry Hastings <larry at hastings.org> wrote:
> THE PROBLEM
> ...
> The issue author had already identified the cause: CPython was blocking on
> getrandom() in order to initialize hash randomization.  On this fresh
> virtual machine the entropy pool started out uninitialized.  And since the
> only thing running on the machine was CPython, and since CPython was blocked
> on initialization, the entropy pool was initializing very, very slowly.

it seems to me that you'd want such a solution to have code that
causes the initialization of the entropy pool to be sped up so that it
happens as quickly as possible (if that is even possible).  Is it
possible? (E.g. by causing the machine to start doing things other
than just CPython?)

--Chris

From mertz at gnosis.cx  Fri Jun 10 14:43:51 2016
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 10 Jun 2016 11:43:51 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
Message-ID: <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>

My hypothetical is "Ensure good random bits (on Python 3.5.2 and Linux),
and block rather than allow bad bits."

I'm not quite sure I understand all of your question, Donald.  On Python
3.4 (and by BDFL declaration on 3.5.2) os.urandom() *will not* block,
although it might on 3.5.1.

On Fri, Jun 10, 2016 at 11:33 AM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 10, 2016, at 2:29 PM, David Mertz <mertz at gnosis.cx> wrote:
>
> If I *were* someone who needed to write a Linux system initialization
> script using Python 3.5.2, what would the code look like?  I think for this
> use case, requiring something with a little bit of "code smell" is fine,
> but I kinda hope it exists at all.
>
>
> Do you mean if os.urandom blocked and you wanted to call os.urandom from
> your boot script? Or if os.urandom doesn't block and you wanted to ensure
> you got good random numbers on boot?
>
> --
> Donald Stufft
>
>
>
>

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From meadori at gmail.com  Fri Jun 10 14:47:19 2016
From: meadori at gmail.com (Meador Inge)
Date: Fri, 10 Jun 2016 13:47:19 -0500
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
Message-ID: <CAK1QoopjYefq0+M_ROyOKA7XEbyNhGfYJt1Lh_GUoHJ=emejPw@mail.gmail.com>

On Fri, Jun 10, 2016 at 6:13 AM, Victor Stinner <victor.stinner at gmail.com>
wrote:

> The second result is a new perf module which includes all "tricks"
> discovered in my research: compute average and standard deviation,
> spawn multiple worker child processes, automatically calibrate the
> number of outer-loop iterations, automatically pin worker processes
> to isolated CPUs, and more.
>

Apologies in advance if this is answered in one of the links you posted, but
out of curiosity was geometric mean considered?

In the compiler world this is a very common way of aggregating performance
results.
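
For readers unfamiliar with the distinction, here is a minimal sketch of the two aggregates. The sample ratios are made up for illustration, the function names are hypothetical, and nothing here is taken from the perf module itself.

```python
import math


def arithmetic_mean(samples):
    """Plain average: sum divided by count."""
    return sum(samples) / len(samples)


def geometric_mean(samples):
    """n-th root of the product, computed through logarithms."""
    if not samples or any(s <= 0 for s in samples):
        raise ValueError("need a non-empty sequence of positive values")
    return math.exp(sum(math.log(s) for s in samples) / len(samples))


# Hypothetical normalized timings (new time / old time) for three benchmarks:
# one got twice as fast, one twice as slow, one is unchanged.
ratios = [0.5, 2.0, 1.0]
print(arithmetic_mean(ratios))  # above 1.0, suggesting an overall slowdown
print(geometric_mean(ratios))   # close to 1.0, the two changes cancel out
```

The geometric mean is usually applied to ratios across different benchmarks, where a 2x speedup and a 2x slowdown should cancel; whether it suits repeated timings of a single benchmark is a separate question.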

-- Meador

From leewangzhong+python at gmail.com  Fri Jun 10 14:54:05 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Fri, 10 Jun 2016 14:54:05 -0400
Subject: [Python-Dev] PEP 468
In-Reply-To: <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
Message-ID: <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>

Eric, have you any work in progress on compact dicts?

On Fri, Jun 10, 2016 at 12:54 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Thu, Jun 9, 2016 at 1:10 PM, Émanuel Barry <vgr255 at live.ca> wrote:
>> As stated by Guido (and pointed out in the PEP):
>>
>> Making **kwds ordered is still open, but requires careful design and
>> implementation to avoid slowing down function calls that don't benefit.
>>
>> The PEP has not been updated in a while, though. Python 3.5 has been
>> released, and with it a C implementation of OrderedDict.
>>
>> Eric, are you still interested in this?
>
> Yes, but wasn't planning on dusting it off yet (i.e. in time for 3.6).
> I'm certainly not opposed to someone picking up the banner.
> <wink-wink>
>
>> IIRC that PEP was one of the
>> motivating use cases for implementing OrderedDict in C.
>
> Correct, though I'm not sure OrderedDict needs to be involved any more.
>
>> Maybe it's time for
>> a second round of discussion on Python-ideas?
>
> Fine with me, though I won't have a lot of time in the 3.6 timeframe
> to handle a high-volume discussion or push through an implementation.
>
> -eric

From donald at stufft.io  Fri Jun 10 14:55:39 2016
From: donald at stufft.io (Donald Stufft)
Date: Fri, 10 Jun 2016 14:55:39 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
Message-ID: <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>

Ok, so you're looking for how you would replicate the blocking behavior of os.urandom that exists in 3.5.0 and 3.5.1?

In that case, it's hard. I don't think Linux provides any way to externally determine whether /dev/urandom has been initialized. Probably the easiest thing to do would be to interface with the getrandom() function using a C extension, CFFI, or ctypes. If you're looking for a way of doing this without calling the getrandom() function, I believe the answer is that you can't.

The closest thing you can get is checking the /proc/sys/kernel/random/entropy_avail file, but that tells you how much entropy the system currently thinks it has (which will go up and down over time) and corresponds to /dev/random on Linux, not /dev/urandom.
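
Reading that file is straightforward. A minimal sketch follows; the function name is hypothetical, and the value is only the kernel's running estimate, so it does not say whether the /dev/urandom pool has been initialized.

```python
def entropy_avail(path="/proc/sys/kernel/random/entropy_avail"):
    """Return the kernel's current entropy estimate in bits, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Not Linux, /proc not mounted, or unexpected file contents.
        return None


print(entropy_avail())
```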

You could read from /dev/random, but that's going to randomly block outside of the pool initialization whenever the kernel thinks it doesn't have enough entropy. Cryptographers and security experts alike consider this to be pretty stupid behavior and don't recommend using it because of this "randomly block throughout the use of your application" behavior.

So really, out of the recommended solutions you only have two options: find a way to interface with the getrandom() function, or just consume /dev/urandom and hope it's been initialized.
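
A rough sketch of the ctypes route, for illustration only. It assumes a Linux system whose C library exports a getrandom() wrapper, which is not guaranteed on older systems; the function name getrandom_blocking is hypothetical. On interpreters that already provide os.getrandom() (added in Python 3.6) it simply uses that.

```python
import ctypes
import ctypes.util
import errno
import os


def getrandom_blocking(n):
    """Return n random bytes, waiting until the kernel pool is initialized."""
    if hasattr(os, "getrandom"):
        # Python 3.6+ on Linux: flags=0 means "block until initialized".
        out = b""
        while len(out) < n:
            out += os.getrandom(n - len(out))
        return out

    # Assumption: libc exports getrandom(void *buf, size_t len, unsigned flags).
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.getrandom.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_uint]
    libc.getrandom.restype = ctypes.c_ssize_t

    buf = ctypes.create_string_buffer(n)
    got = 0
    while got < n:
        r = libc.getrandom(ctypes.byref(buf, got), n - got, 0)
        if r < 0:
            err = ctypes.get_errno()
            if err == errno.EINTR:
                continue  # interrupted by a signal, retry
            raise OSError(err, os.strerror(err))
        got += r
    return buf.raw[:n]


print(len(getrandom_blocking(16)))
```

If the C library has no wrapper, the raw syscall would have to be invoked instead, and its number is architecture-specific, so that variant is left out here.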

> On Jun 10, 2016, at 2:43 PM, David Mertz <mertz at gnosis.cx> wrote:
> 
> My hypothetical is "Ensure good random bits (on Python 3.5.2 and Linux), and block rather than allow bad bits."
> 
> I'm not quite sure I understand all of your question, Donald.  On Python 3.4 (and by BDFL declaration on 3.5.2) os.urandom() *will not* block, although it might on 3.5.1.
> 
> On Fri, Jun 10, 2016 at 11:33 AM, Donald Stufft <donald at stufft.io> wrote:
> 
>> On Jun 10, 2016, at 2:29 PM, David Mertz <mertz at gnosis.cx> wrote:
>> 
>> If I *were* someone who needed to write a Linux system initialization script using Python 3.5.2, what would the code look like?  I think for this use case, requiring something with a little bit of "code smell" is fine, but I kinda hope it exists at all.
> 
> 
> Do you mean if os.urandom blocked and you wanted to call os.urandom from your boot script? Or if os.urandom doesn't block and you wanted to ensure you got good random numbers on boot?
> 
> --
> Donald Stufft
> 
> 
> 
> 
> 
> 
> -- 
> Keeping medicines from the bloodstreams of the sick; food 
> from the bellies of the hungry; books from the hands of the 
> uneducated; technology from the underdeveloped; and putting 
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.

--
Donald Stufft


From donald at stufft.io  Fri Jun 10 15:01:48 2016
From: donald at stufft.io (Donald Stufft)
Date: Fri, 10 Jun 2016 15:01:48 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
Message-ID: <FC0B91D1-4BC0-433B-870D-68A39B23EECD@stufft.io>

> On Jun 10, 2016, at 2:55 PM, Donald Stufft <donald at stufft.io> wrote:
> 
> So really, out of the recommended solutions you only have two options: find a way to interface with the getrandom() function, or just consume /dev/urandom and hope it's been initialized.

I'd note, this is one of the reasons why I felt like blocking (or raising an exception) on os.urandom was the right solution, because it's hard to get that behavior on Linux otherwise. However, if we instead kept the blocking (or exception) behavior, getting the old behavior back on Linux is trivial, since it would only require open("/dev/urandom").read(...).
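
Spelled out a little more fully, that one-liner might look like the sketch below. The helper name is hypothetical; binary mode and unbuffered reads are choices made here so that exactly n bytes are taken from the device.

```python
def read_urandom_device(n):
    """Read n bytes straight from /dev/urandom, without waiting for pool initialization."""
    chunks = []
    remaining = n
    # buffering=0 avoids pulling more bytes from the device than requested.
    with open("/dev/urandom", "rb", buffering=0) as f:
        while remaining > 0:
            chunk = f.read(remaining)
            if not chunk:
                raise OSError("unexpected end of file on /dev/urandom")
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)


print(len(read_urandom_device(32)))
```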

--
Donald Stufft


From mertz at gnosis.cx  Fri Jun 10 15:05:38 2016
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 10 Jun 2016 12:05:38 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
Message-ID: <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>

OK.  My understanding is that Guido ruled out introducing an os.getrandom()
API in 3.5.2.  But would you be happy if that interface is added to 3.6?

It feels to me like the correct spelling in 3.6 should probably be
secrets.getrandom() or something related to that.

On Fri, Jun 10, 2016 at 11:55 AM, Donald Stufft <donald at stufft.io> wrote:

> Ok, so you're looking for how you would replicate the blocking behavior of
> os.urandom that exists in 3.5.0 and 3.5.1?
>
> In that case, it's hard. I don't think Linux provides any way to
> externally determine whether /dev/urandom has been initialized. Probably
> the easiest thing to do would be to interface with the getrandom() function
> using a C extension, CFFI, or ctypes. If you're looking for a way of doing
> this without calling the getrandom() function, I believe the answer is that
> you can't.
>
> The closest thing you can get is checking
> the /proc/sys/kernel/random/entropy_avail file, but that tells you how much
> entropy the system currently thinks it has (which will go up and down over
> time) and corresponds to /dev/random on Linux, not /dev/urandom.
>
> You could read from /dev/random, but that's going to randomly block
> outside of the pool initialization whenever the kernel thinks it doesn't
> have enough entropy. Cryptographers and security experts alike consider
> this to be pretty stupid behavior and don't recommend using it because of
> this "randomly block throughout the use of your application" behavior.
>
> So really, out of the recommended solutions you only have two options: find
> a way to interface with the getrandom() function, or just consume
> /dev/urandom and hope it's been initialized.
>
>
> On Jun 10, 2016, at 2:43 PM, David Mertz <mertz at gnosis.cx> wrote:
>
> My hypothetical is "Ensure good random bits (on Python 3.5.2 and Linux),
> and block rather than allow bad bits."
>
> I'm not quite sure I understand all of your question, Donald.  On Python
> 3.4 (and by BDFL declaration on 3.5.2) os.urandom() *will not* block,
> although it might on 3.5.1.
>
> On Fri, Jun 10, 2016 at 11:33 AM, Donald Stufft <donald at stufft.io> wrote:
>
>>
>> On Jun 10, 2016, at 2:29 PM, David Mertz <mertz at gnosis.cx> wrote:
>>
>> If I *were* someone who needed to write a Linux system initialization
>> script using Python 3.5.2, what would the code look like?  I think for this
>> use case, requiring something with a little bit of "code smell" is fine,
>> but I kinda hope it exists at all.
>>
>>
>> Do you mean if os.urandom blocked and you wanted to call os.urandom from
>> your boot script? Or if os.urandom doesn't block and you wanted to ensure
>> you got good random numbers on boot?
>>
>> --
>> Donald Stufft
>>
>>
>>
>>
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.
>
>
>
> --
> Donald Stufft
>
>
>
>

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From tim.peters at gmail.com  Fri Jun 10 15:16:30 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 10 Jun 2016 14:16:30 -0500
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
Message-ID: <CAExdVNkPyfBZiYbiF9VGka4Hw2qasXn7-nfGZq2wwYF8nMp7Sg@mail.gmail.com>

[David Mertz]
> OK.  My understanding is that Guido ruled out introducing an os.getrandom()
> API in 3.5.2.  But would you be happy if that interface is added to 3.6?
>
> It feels to me like the correct spelling in 3.6 should probably be
> secrets.getrandom() or something related to that.

secrets.token_bytes() is already the way to spell "get a string of
messed-up bytes", and that's the dead obvious (according to me) place
to add the potentially blocking implementation.  Indeed, everything in
the `secrets` module should block when the OS thinks that's needed.
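
For context, a minimal usage sketch of that spelling, assuming Python 3.6 or later where the `secrets` module is available. Whether the call can block depends on how the interpreter obtains its random bytes on the platform; the code itself only shows the API.

```python
import secrets

token = secrets.token_bytes(32)   # 32 raw bytes
text = secrets.token_hex(16)      # 16 random bytes rendered as 32 hex digits
print(len(token), len(text))
```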

From donald at stufft.io  Fri Jun 10 15:17:47 2016
From: donald at stufft.io (Donald Stufft)
Date: Fri, 10 Jun 2016 15:17:47 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
Message-ID: <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>

> On Jun 10, 2016, at 3:05 PM, David Mertz <mertz at gnosis.cx> wrote:
> 
> OK.  My understanding is that Guido ruled out introducing an os.getrandom() API in 3.5.2.  But would you be happy if that interface is added to 3.6? 
> 
> It feels to me like the correct spelling in 3.6 should probably be secrets.getrandom() or something related to that.

Well we have https://docs.python.org/dev/library/secrets.html#secrets.token_bytes so adding a getrandom() function to secrets would largely be the same as that function.

The problem of course is that the secrets library in 3.6 uses os.urandom under the covers, so its security rests on the security of os.urandom. To ensure that the secrets library is actually safe even in early boot it'll need to stop using os.urandom on Linux and use the getrandom() function.

That same library exposes random.SystemRandom as secrets.SystemRandom [1], and of course SystemRandom uses os.urandom too. So if we want people to treat secrets.SystemRandom as "always secure" then it would need to stop using os.urandom and start using the getrandom() function on Linux as well.

[1] This is actually documented as "using the highest-quality sources provided by the operating system" in the secrets documentation, and I'd argue that it is not using the highest-quality source if it's reading from /dev/urandom or getrandom(GRND_NONBLOCK) on Linux systems where getrandom() is available. Of course, it's just an alias for random.SystemRandom, and that is documented as using os.urandom.

--
Donald Stufft


From mertz at gnosis.cx  Fri Jun 10 15:29:10 2016
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 10 Jun 2016 12:29:10 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
Message-ID: <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>

I believe that secrets.token_bytes() and secrets.SystemRandom() should be
changed even for 3.5.1 to use getrandom() on Linux.

Thanks for fixing my spelling of the secrets API, Donald. :-)

On Fri, Jun 10, 2016 at 12:17 PM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 10, 2016, at 3:05 PM, David Mertz <mertz at gnosis.cx> wrote:
>
> OK.  My understanding is that Guido ruled out introducing an
> os.getrandom() API in 3.5.2.  But would you be happy if that interface is
> added to 3.6?
>
> It feels to me like the correct spelling in 3.6 should probably be
> secrets.getrandom() or something related to that.
>
>
>
> Well we have
> https://docs.python.org/dev/library/secrets.html#secrets.token_bytes so
> adding a getrandom() function to secrets would largely be the same as that
> function.
>
> The problem of course is that the secrets library in 3.6 uses os.urandom
> under the covers, so its security rests on the security of os.urandom. To
> ensure that the secrets library is actually safe even in early boot it'll
> need to stop using os.urandom on Linux and use the getrandom() function.
>
> That same library exposes random.SystemRandom as secrets.SystemRandom [1],
> and of course SystemRandom uses os.urandom too. So if we want people to
> treat secrets.SystemRandom as "always secure" then it would need to stop
> using os.urandom and start using the getrandom() function on Linux as well.
>
>
> [1] This is actually documented as "using the highest-quality sources
> provided by the operating system" in the secrets documentation, and I'd
> argue that it is not using the highest-quality source if it's reading from
> /dev/urandom or getrandom(GRND_NONBLOCK) on Linux systems where getrandom()
> is available. Of course, it's just an alias for random.SystemRandom, and
> that is documented as using os.urandom.
>
> --
> Donald Stufft
>
>
>
>

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From mertz at gnosis.cx  Fri Jun 10 15:29:55 2016
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 10 Jun 2016 12:29:55 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
Message-ID: <CAEbHw4ZHD_AL1QnBry12LWQNKMx0NBUB4K2jByJOir3KaNQzzA@mail.gmail.com>

Ooops.... thinko there! Of course `secrets` won't exist in 3.5.1, so that's
a 3.6 matter instead.

On Fri, Jun 10, 2016 at 12:29 PM, David Mertz <mertz at gnosis.cx> wrote:

> I believe that secrets.token_bytes() and secrets.SystemRandom() should be
> changed even for 3.5.1 to use getrandom() on Linux.
>
> Thanks for fixing my spelling of the secrets API, Donald. :-)
>
> On Fri, Jun 10, 2016 at 12:17 PM, Donald Stufft <donald at stufft.io> wrote:
>
>>
>> On Jun 10, 2016, at 3:05 PM, David Mertz <mertz at gnosis.cx> wrote:
>>
>> OK.  My understanding is that Guido ruled out introducing an
>> os.getrandom() API in 3.5.2.  But would you be happy if that interface is
>> added to 3.6?
>>
>> It feels to me like the correct spelling in 3.6 should probably be
>> secrets.getrandom() or something related to that.
>>
>>
>>
>> Well we have
>> https://docs.python.org/dev/library/secrets.html#secrets.token_bytes so
>> adding a getrandom() function to secrets would largely be the same as that
>> function.
>>
>> The problem of course is that the secrets library in 3.6 uses os.urandom
>> under the covers, so its security rests on the security of os.urandom. To
>> ensure that the secrets library is actually safe even in early boot it'll
>> need to stop using os.urandom on Linux and use the getrandom() function.
>>
>> That same library exposes random.SystemRandom as secrets.SystemRandom
>> [1], and of course SystemRandom uses os.urandom too. So if we want people
>> to treat secrets.SystemRandom as "always secure" then it would need to stop
>> using os.urandom and start using the getrandom() function on Linux as well.
>>
>>
>> [1] This is actually documented as "using the highest-quality sources
>> provided by the operating system" in the secrets documentation, and I'd
>> argue that it is not using the highest-quality source if it's reading from
>> /dev/urandom or getrandom(GRND_NONBLOCK) on Linux systems where getrandom()
>> is available. Of course, it's just an alias for random.SystemRandom, and
>> that is documented as using os.urandom.
>>
>> --
>> Donald Stufft

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From brett at python.org  Fri Jun 10 15:33:45 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 10 Jun 2016 19:33:45 +0000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
Message-ID: <CAP1=2W4VdPC0EXqEiY1-3iWey+MUZXSHMf=y4SZjRAuXbqXDnw@mail.gmail.com>

On Fri, 10 Jun 2016 at 12:20 Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 10, 2016, at 3:05 PM, David Mertz <mertz at gnosis.cx> wrote:
>
> OK.  My understanding is that Guido ruled out introducing an
> os.getrandom() API in 3.5.2.  But would you be happy if that interface is
> added to 3.6?
>
> It feels to me like the correct spelling in 3.6 should probably be
> secrets.getrandom() or something related to that.
>
>
>
> Well we have
> https://docs.python.org/dev/library/secrets.html#secrets.token_bytes so
> adding a getrandom() function to secrets would largely be the same as that
> function.
>
> The problem of course is that the secrets library in 3.6 uses os.urandom
> under the covers, so its security rests on the security of os.urandom. To
> ensure that the secrets library is actually safe even in early boot it'll
> need to stop using os.urandom on Linux and use the getrandom() function.
>
> That same library exposes random.SystemRandom as secrets.SystemRandom [1],
> and of course SystemRandom uses os.urandom too. So if we want people to
> treat secrets.SystemRandom as "always secure" then it would need to stop
> using os.urandom and start using the getrandom() function on Linux as well.
>
>
> [1] This is actually documented as "using the highest-quality sources
> provided by the operating system" in the secrets documentation, and I'd
> argue that it is not using the highest-quality source if it's reading from
> /dev/urandom or getrandom(GRND_NONBLOCK) on Linux systems where getrandom()
> is available. Of course, it's just an alias for random.SystemRandom, and
> that is documented as using os.urandom.
>

If that's the case then we should file a bug so we are sure this is the
case and we need to decouple the secrets documentation from random so that
they can operate independently with secrets always doing whatever is
required to be as secure as possible.

From sebastian at realpath.org  Fri Jun 10 15:48:02 2016
From: sebastian at realpath.org (Sebastian Krause)
Date: Fri, 10 Jun 2016 21:48:02 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 (David Mertz's message of "Fri, 10 Jun 2016 12:05:38 -0700")
References: <57595210.4000508@hastings.org>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
Message-ID: <m2wplw4yh9.fsf@news.realpath.org>

David Mertz <mertz at gnosis.cx> wrote:
> It feels to me like the correct spelling in 3.6 should probably be
> secrets.getrandom() or something related to that.

Since there already is a secrets.randbits(k), I would keep the
naming similar and suggest something like:

secrets.randbytes(k, *, nonblock=False)

With the argument "nonblock" you can control what happens when not
enough entropy is available: it either blocks or (if nonblock=True)
raises an exception. The third option, getting insecure random data,
is simply not available in this function.
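
A minimal sketch of what such a function could look like, assuming a
Python that exposes os.getrandom() and os.GRND_NONBLOCK (added in
Python 3.6 on Linux, i.e. after this discussion). The name randbytes and
the fallback to os.urandom() on platforms without getrandom() are
assumptions made for illustration, not an existing secrets API:

```python
import os


def randbytes(k, *, nonblock=False):
    """Return k secure random bytes; block or raise, but never degrade.

    Hypothetical helper for illustration; not part of the secrets module.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not hasattr(os, "getrandom"):
        # Assumption: without getrandom(), os.urandom() is the best source.
        return os.urandom(k)
    # With GRND_NONBLOCK, getrandom() raises BlockingIOError instead of
    # waiting when the kernel's entropy pool is not yet initialized.
    flags = os.GRND_NONBLOCK if nonblock else 0
    chunks = []
    remaining = k
    while remaining > 0:
        # getrandom() may return fewer bytes than requested.
        chunk = os.getrandom(remaining, flags)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller that cannot afford to wait would use randbytes(32, nonblock=True)
and handle BlockingIOError; everyone else would just call randbytes(32).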

Then you can keep os.urandom() as it was in Python 3.4 and earlier,
but update the documentation to better warn about its behavior and
point developers to the secrets module.

Sebastian

From srkunze at mail.de  Fri Jun 10 15:45:13 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Fri, 10 Jun 2016 21:45:13 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
References: <57595210.4000508@hastings.org>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
Message-ID: <575B18C9.9090002@mail.de>

On 10.06.2016 21:17, Donald Stufft wrote:
>
>> On Jun 10, 2016, at 3:05 PM, David Mertz <mertz at gnosis.cx> wrote:
>>
>> OK.  My understanding is that Guido ruled out introducing an 
>> os.getrandom() API in 3.5.2.  But would you be happy if that 
>> interface is added to 3.6?
>>
>> It feels to me like the correct spelling in 3.6 should probably be 
>> secrets.getrandom() or something related to that.
>

I am not a security expert, but your reply makes it clear to me. So, for
me, this breaks down as:

os -> OS-dependent, and because of this it varies from OS to OS (also
quality-wise)
random -> pseudo-random, but it works for most non-critical use cases
secrets -> that's for crypto

If you don't need crypto, secrets would be a waste of resources, but if
you need crypto, then os and random are unsafe. I think that's simple
enough. At least, I would understand it.
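
A small sketch of the random/secrets half of that split, assuming Python
3.6 or later so that the secrets module exists. The variable names are
illustrative only:

```python
import random
import secrets

# random: a deterministic pseudo-random generator. Two generators created
# with the same seed produce the same sequence, which is fine for
# simulations but predictable for an attacker.
rng_a = random.Random(42)
rng_b = random.Random(42)
same_sequence = (
    [rng_a.random() for _ in range(3)] == [rng_b.random() for _ in range(3)]
)

# secrets: bytes drawn from the operating system's randomness source,
# intended for tokens, keys and similar security-sensitive values.
token = secrets.token_bytes(16)
```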

Just my 2 cents: if I need crypto, I would rather pay the price of
blocking than get an exception (what are my alternatives? I need those
bits!) or get insecure bits.

Sven

> Well we have 
> https://docs.python.org/dev/library/secrets.html#secrets.token_bytes so adding 
> a getrandom() function to secrets would largely be the same as that 
> function.
>
> The problem of course is that the secrets library in 3.6 uses
> os.urandom under the covers, so its security rests on the security of
> os.urandom. To ensure that the secrets library is actually safe even
> in early boot it'll need to stop using os.urandom on Linux and use the
> getrandom() function.
>
> That same library exposes random.SystemRandom as secrets.SystemRandom
> [1], and of course SystemRandom uses os.urandom too. So if we want
> people to treat secrets.SystemRandom as "always secure" then it would
> need to stop using os.urandom and start using the getrandom()
> function on Linux as well.
>
>
> [1] This is actually documented as "using the highest-quality sources
> provided by the operating system" in the secrets documentation, and
> I'd argue that it is not using the highest-quality source if it's
> reading from /dev/urandom or getrandom(GRND_NONBLOCK) on Linux systems
> where getrandom() is available. Of course, it's just an alias for
> random.SystemRandom, and that is documented as using os.urandom.
>
> --
> Donald Stufft

From larry at hastings.org  Fri Jun 10 15:55:04 2016
From: larry at hastings.org (Larry Hastings)
Date: Fri, 10 Jun 2016 12:55:04 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
Message-ID: <575B1B18.9020502@hastings.org>

On 06/10/2016 12:29 PM, David Mertz wrote:
> I believe that secrets.token_bytes() and secrets.SystemRandom() should 
> be changed even for 3.5.1 to use getrandom() on Linux.

Surely you meant 3.5.2?  3.5.1 shipped last December.

//arry/

From donald at stufft.io  Fri Jun 10 15:57:01 2016
From: donald at stufft.io (Donald Stufft)
Date: Fri, 10 Jun 2016 15:57:01 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP1=2W4VdPC0EXqEiY1-3iWey+MUZXSHMf=y4SZjRAuXbqXDnw@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAP1=2W4VdPC0EXqEiY1-3iWey+MUZXSHMf=y4SZjRAuXbqXDnw@mail.gmail.com>
Message-ID: <F004362B-5F04-4D7E-8E34-193DE99D67FF@stufft.io>

> On Jun 10, 2016, at 3:33 PM, Brett Cannon <brett at python.org> wrote:
> 
> If that's the case then we should file a bug so we are sure this is the case and we need to decouple the secrets documentation from random so that they can operate independently with secrets always doing whatever is required to be as secure as possible.

https://bugs.python.org/issue27288

--
Donald Stufft


From sebastian at realpath.org  Fri Jun 10 15:57:31 2016
From: sebastian at realpath.org (Sebastian Krause)
Date: Fri, 10 Jun 2016 21:57:31 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAExdVNkPyfBZiYbiF9VGka4Hw2qasXn7-nfGZq2wwYF8nMp7Sg@mail.gmail.com>
 (Tim Peters's message of "Fri, 10 Jun 2016 14:16:30 -0500")
References: <57595210.4000508@hastings.org> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <CAExdVNkPyfBZiYbiF9VGka4Hw2qasXn7-nfGZq2wwYF8nMp7Sg@mail.gmail.com>
Message-ID: <m2shwk4y1g.fsf@news.realpath.org>

Tim Peters <tim.peters at gmail.com> wrote:
> secrets.token_bytes() is already the way to spell "get a string of
> messed-up bytes", and that's the dead obvious (according to me) place
> to add the potentially blocking implementation.

I honestly didn't think that this was the dead obvious function to
use. To me the naming kind of suggested that it would do some
special magic that tokens needed, instead of just returning random
bytes (even though the best token is probably just perfectly random
data). If you want to provide a general function for secure random
bytes I would suggest at least a better naming.

Sebastian

From mertz at gnosis.cx  Fri Jun 10 16:01:12 2016
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 10 Jun 2016 13:01:12 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575B1B18.9020502@hastings.org>
References: <57595210.4000508@hastings.org>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
 <575B1B18.9020502@hastings.org>
Message-ID: <CAEbHw4bgvHS-9trwQDbs6aN6zK=hr0xXF72DY8C+YM5hTYxvEA@mail.gmail.com>

On Fri, Jun 10, 2016 at 12:55 PM, Larry Hastings <larry at hastings.org> wrote:

> On 06/10/2016 12:29 PM, David Mertz wrote:
>
> I believe that secrets.token_bytes() and secrets.SystemRandom() should be
> changed even for 3.5.1 to use getrandom() on Linux.
>
> Surely you meant 3.5.2?  3.5.1 shipped last December.
>

Yeah, that combines a couple thinkos even.  I had intended to write "for
3.5.2" ... but that is also an error, since the secrets module doesn't
exist until 3.6.  So yes, I think 3.5.2 should restore the 2.6-3.4 behavior
of os.urandom(), and the NEW APIs in secrets should use the "best available
randomness (even if it blocks)".

Donald is correct that we have the spelling secrets.token_bytes() available
in 3.6a1, so the spellings secrets.getrandom() or secrets.randbytes() are
not needed.  However, Sebastian's (adapted) suggestion to allow
secrets.token_bytes(k, *, nonblock=False) as the signature makes sense to
me (i.e. it's a choice of "block or raise exception", not an option to get
non-crypto bytes).

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From larry at hastings.org  Fri Jun 10 16:02:44 2016
From: larry at hastings.org (Larry Hastings)
Date: Fri, 10 Jun 2016 13:02:44 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
References: <57595210.4000508@hastings.org>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
Message-ID: <575B1CE4.7030902@hastings.org>

On 06/10/2016 11:55 AM, Donald Stufft wrote:
> Ok, so you're looking for how would you replicate the blocking
> behavior of os.urandom that exists in 3.5.0 and 3.5.1?
>
> In that case, it's hard. I don't think Linux provides any way to
> externally determine if /dev/urandom has been initialized or not.
> Probably the easiest thing to do would be to interface with the
> getrandom() function using a c-ext, CFFI, or ctypes. If you're looking
> for a way of doing this without calling the getrandom() function, I
> believe the answer is you can't.

I'm certain you're correct: you can't perform any operation on 
/dev/urandom to determine whether or not the urandom device has been 
initialized.  That's one of the reasons why Mr. Ts'o added 
getrandom()--you can use it to test exactly that (getrandom(GRND_NONBLOCK)).
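
As a rough sketch of that test, assuming a Python that exposes
os.getrandom() and os.GRND_NONBLOCK (added in Python 3.6 on Linux, after
this thread). The helper name urandom_initialized is hypothetical:

```python
import os


def urandom_initialized():
    """Return True/False if the state is known, or None if it cannot be told.

    Hypothetical helper for illustration only.
    """
    if not hasattr(os, "getrandom"):
        # No getrandom() wrapper on this platform or Python version.
        return None
    try:
        # Ask for a single byte without blocking.
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        # The kernel would have blocked: the pool is not initialized yet.
        return False
    except OSError:
        # For example, a kernel that lacks the getrandom() syscall.
        return None
    return True
```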

That's also why I proposed adding os.getrandom() in 3.5.2, to make it 
possible to block until urandom was initialized (without using ctypes 
etc as you suggest).  However, none of the cryptography guys jumped up 
and said they wanted it, and in any case it was overruled by Guido, so 
we're not adding it to 3.5.2.

//arry/

From tim.peters at gmail.com  Fri Jun 10 16:04:48 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 10 Jun 2016 15:04:48 -0500
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <m2shwk4y1g.fsf@news.realpath.org>
References: <57595210.4000508@hastings.org> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <CAExdVNkPyfBZiYbiF9VGka4Hw2qasXn7-nfGZq2wwYF8nMp7Sg@mail.gmail.com>
 <m2shwk4y1g.fsf@news.realpath.org>
Message-ID: <CAExdVNmRD_FxjiTqgOM-TaHpvBff9X1V0tFgVQq_1rkybObidA@mail.gmail.com>

[Tim]
>> secrets.token_bytes() is already the way to spell "get a string of
>> messed-up bytes", and that's the dead obvious (according to me) place
>> to add the potentially blocking implementation.

[Sebastian Krause]
> I honestly didn't think that this was the dead obvious function to
> use. To me the naming kind of suggested that it would do some
> special magic that tokens needed, instead of just returning random
> bytes (even though the best token is probably just perfectly random
> data). If you want to provide a general function for secure random
> bytes I would suggest at least a better naming.

There was ample bikeshedding over the names of `secrets` functions at
the time.  If token_bytes wasn't the obvious function to you, I
suspect you have scant idea what _is_ in the `secrets` module.   The
naming is logical in context, where various "token_xxx" functions
supply random-ish bytes in different formats.  In that context,
xxx=bytes is the obvious way to get raw bytes.
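
For reference, a short illustration of the token_xxx family as it exists
in the secrets module from Python 3.6 on; the variable names are
illustrative only:

```python
import secrets

raw = secrets.token_bytes(16)        # 16 random bytes as a bytes object
hexed = secrets.token_hex(16)        # 16 random bytes rendered as hex text
urlsafe = secrets.token_urlsafe(16)  # 16 random bytes as URL-safe Base64 text
```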

From larry at hastings.org  Fri Jun 10 16:06:45 2016
From: larry at hastings.org (Larry Hastings)
Date: Fri, 10 Jun 2016 13:06:45 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4bgvHS-9trwQDbs6aN6zK=hr0xXF72DY8C+YM5hTYxvEA@mail.gmail.com>
References: <57595210.4000508@hastings.org> <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
 <575B1B18.9020502@hastings.org>
 <CAEbHw4bgvHS-9trwQDbs6aN6zK=hr0xXF72DY8C+YM5hTYxvEA@mail.gmail.com>
Message-ID: <575B1DD5.4040305@hastings.org>

On 06/10/2016 01:01 PM, David Mertz wrote:
> So yes, I think 3.5.2 should restore the 2.6-3.4 behavior of os.urandom(),

That makes... five of us I think ;-) (Larry Guido Barry Tim David)

> and the NEW APIs in secrets should use the "best available randomness 
> (even if it blocks)"

I'm not particular about how the new API is spelled.  However, I do 
think os.getrandom() should be exposed as a thin wrapper over 
getrandom() in 3.6.   That would permit Python programmers to take 
maximal advantage of the features offered by their platform.  It would 
also permit the secrets module to continue to be written in pure Python.

//arry/

From barry at python.org  Fri Jun 10 16:11:59 2016
From: barry at python.org (Barry Warsaw)
Date: Fri, 10 Jun 2016 16:11:59 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
Message-ID: <20160610161159.7e4f1cce.barry@wooz.org>

On Jun 10, 2016, at 12:05 PM, David Mertz wrote:

>OK.  My understanding is that Guido ruled out introducing an os.getrandom()
>API in 3.5.2.  But would you be happy if that interface is added to 3.6?

I would.

>It feels to me like the correct spelling in 3.6 should probably be
>secrets.getrandom() or something related to that.

ISTM that secrets is a somewhat higher level API while it makes sense that a
fairly simple plumbing of the underlying C call should go in os.  But I
wouldn't argue much if folks had strong opinions to the contrary.

Cheers,
-Barry

From ericsnowcurrently at gmail.com  Fri Jun 10 16:19:37 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 10 Jun 2016 13:19:37 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
Message-ID: <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>

On Fri, Jun 10, 2016 at 11:54 AM, Franklin? Lee
<leewangzhong+python at gmail.com> wrote:
> Eric, have you any work in progress on compact dicts?

Nope.  I presume you are talking about the proposal Raymond made a while back.

-eric

From ericsnowcurrently at gmail.com  Fri Jun 10 16:25:10 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 10 Jun 2016 13:25:10 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7eoafux0HR868JCHu6dNgioWf_YvOU0zBsr1Ek1_8JnWg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <575772E6.7040906@stoneleaf.us>
 <CALFfu7DJDrBpPT4BvRG0-+coHRyvH+6tgwR=BFvPd0wRJ3NBPQ@mail.gmail.com>
 <CADiSq7fr-K_a2nBnp3LJyvi0sQSWbGTYij5xmRbA6+wwDRbmHA@mail.gmail.com>
 <CALFfu7BPbof_y+S7C2z+XDwHjag5KwCcDih+GX-fwhZ6429ZUg@mail.gmail.com>
 <CADiSq7eoafux0HR868JCHu6dNgioWf_YvOU0zBsr1Ek1_8JnWg@mail.gmail.com>
Message-ID: <CALFfu7DGBT+xZzpO0kp6ummfQd74Qdxa_j4t4p97qof7o_M-SQ@mail.gmail.com>

On Fri, Jun 10, 2016 at 11:29 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 10 June 2016 at 09:42, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> On Thu, Jun 9, 2016 at 2:39 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> That restriction would be comparable to what we do with __slots__ today:
>>>
>>>     >>> class C:
>>>     ...     __slots__ = 1
>>>     ...
>>>     Traceback (most recent call last):
>>>      File "<stdin>", line 1, in <module>
>>>     TypeError: 'int' object is not iterable
>>
>> Are you suggesting that we require it be a tuple of identifiers (or
>> None) and raise TypeError otherwise, similar to __slots__?  The
>> difference is that __slots__ has specific type requirements that do
>> not apply to __definition_order__, as well as a different purpose.
>> __definition_order__ is about preserving definition-type info that we
>> are currently throwing away.
>
> If we don't enforce the tuple-of-identifiers restriction at type
> creation time, everyone that *doesn't* make it a tuple-of-identifiers
> is likely to have a subtle compatibility bug with class decorators and
> other code that assume the default tuple-of-identifiers format is the
> only possible format (aside from None). To put it in PEP 484 terms:
> regardless of what the PEP says, people are going to assume the type
> of __definition_order__ is Optional[Tuple[str]], as that's going to
> cover almost all class definitions they encounter.
>
> It makes sense to me to give class definitions and metaclasses the
> opportunity to change the *content* of the definition order: "Use
> these names in this order, not the names and order you would have
> calculated by default".
>
> It doesn't make sense to me to give them an opportunity to change the
> *form* of the definition order, since that makes it incredibly
> difficult to consume correctly: "Sure, it's *normally* a
> tuple-of-identifiers, but it *might* be a dictionary, or a complex
> number, or a set, or whatever the class author decided to make it".
>
> By contrast, if the class machinery enforces Optional[Tuple[str]],
> then it becomes a lot easier to consume reliably, and anyone violating
> the constraint gets an immediate exception when defining the offending
> class, rather than a potentially obscure exception from a class
> decorator or other piece of code that assumes __definition_order__
> could only be None or a tuple of strings.

That makes sense.  I'll adjust the PEP (and the implementation).
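
For illustration only, the kind of check being discussed could look
roughly like the sketch below.  It is not the PEP's implementation: the
helper name check_definition_order is made up here, and the sketch
spells the "tuple of any length" type as Tuple[str, ...].

```python
from typing import Optional, Tuple


def check_definition_order(value: object) -> Optional[Tuple[str, ...]]:
    """Hypothetical helper: accept None or a tuple of identifier strings.

    Anything else raises TypeError right away, much as a bad __slots__
    value does at class creation time.
    """
    if value is None:
        return None
    if not isinstance(value, tuple):
        raise TypeError(
            "__definition_order__ must be a tuple of identifiers or None, "
            "not %s" % type(value).__name__)
    for name in value:
        # Each entry has to be a string that is a valid Python identifier.
        if not isinstance(name, str) or not name.isidentifier():
            raise TypeError(
                "__definition_order__ entries must be identifiers, "
                "got %r" % (name,))
    return value
```

With a check like this, a class decorator that reads the attribute only
has to handle None and a tuple of strings.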

-eric

From tytso at mit.edu  Fri Jun 10 15:54:11 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Fri, 10 Jun 2016 15:54:11 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <njdldf$h9r$1@ger.gmane.org>
References: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
Message-ID: <20160610195411.GA3932@thunk.org>

I will observe that feelings have gotten a little heated, so without
making any suggestions as to how the python-dev community should decide
things, let me offer some observations that might perhaps shed a
little light, and perhaps dispel a little bit of the heat.

As someone who has been working in security for a long time --- before
I started getting paid to hack Linux full-time, worked on Kerberos,
was on the Security Area Directorate of the IETF, where among other
things I was one of the working group chairs for the IP Security
(ipsec) working group --- I tend to cringe a bit when people talk
about security in terms of absolutes.  For example, the phrase
"improving Python's security".  Security is something that is best
talked about given a specific threat environment, where the value of
what you are trying to protect, the capabilities and resources of the
attackers, etc., are all well known.

This gets hard for those of us who work on infrastructure which can
get used in many different arenas, and so that's something that
applies both to the Linux Kernel and to C-Python, because how people
will use the tools that we spend so much of our passion crafting is
largely out of our control, and we may not even know how they are
using it.

As far as /dev/urandom is concerned, it's true that it doesn't block
before it has been initialized.  If you are a security academic who
likes to write papers about how great you are at finding defects in
other people's work, this is definitely a weakness.

Is it a fatal weakness?  Well, first of all, on most server and
desktop deployments, we save 1 kilobyte or so of /dev/urandom output
during the shutdown sequence, and immediately after the init scripts
are completed.  This saved entropy is then piped back into the
/dev/random infrastructure and used to initialize /dev/random and
/dev/urandom very early in the init scripts.  On a freshly installed
machine, this won't help, true, but in practice, on most systems,
/dev/urandom will get initialized from interrupt timing sampling within
a few seconds after boot.  For example, on a sample Google Compute
Engine VM which is booted into Debian and then left idle, /dev/urandom
was initialized within 2.8 seconds after boot, while the root file
system was remounted read/write 1.6 seconds after boot.

So even on Python pre-3.5.0, realistically speaking, the "weakness" of
os.random would only be an issue (a) if it is run within the first few
seconds of boot, and (b) os.random is used to directly generate a
long-term cryptographic secret.  If you fork openssl or ssh-keygen
to generate a public/private keypair, then you aren't using os.random.

Furthermore, if you are running on a modern x86 system with RDRAND,
you'll also be fine, because we mix in randomness from the CPU chip
via the RDRAND instruction.

So this whole question of whether os.random should block *is*
important in certain very specific cases, and if you are generating
long-term cryptographic secrets in Python, maybe you should be worrying
about that.  But to be honest, there are lots of other things you
should be worrying about as well, and I would hope that people writing
cryptographic code would be asking questions of how the random number
stack is working, not just at the C-Python interpreter level, but also
at the OS level.

My preference would be that os.random should block, because the odds
that people would be trying to generate long-term cryptographic
secrets within seconds after boot are very small, and if you *do* block
for a second or two, it's not the end of the world.  The problem that
triggered this was specifically because systemd was trying to use
C-Python very early in the boot process to initialize the SIPHASH used
for the dictionary, and it's not clear that it really needed to be
extremely strong because it wasn't a long-term cryptographic secret ---
certainly not how systemd was using that specific script!

The reason why I think blocking is better is that once you've solved
the "don't hang the VM for 90 seconds until python has started up",
someone who is using os.random will almost certainly not be on the
blocking path of the system boot sequence, and so blocking for 2
seconds before generating a long-term cryptographic secret is not the
end of the world.

And if it does block by accident, in a security critical scenario it
will hopefully force the programmer to think, and in a non-security
critical scenario, it should be easy to switch to either a totally
non-blocking interface, or switch to a pseudo-random interface which
is more efficient.

*HOWEVER*, on the flip side, if os.random *doesn't* block, in 99.999%
of the cases, the python script that is directly generating a
long-term secret will not be started 1.2 seconds after the root file
system is remounted read/write, so it is *also* not the end of the
world.  Realistically speaking, we do know which processes are likely
to be generating long-term cryptographic secrets immediately after
boot, and they'll most likely be using programs like openssl or
ssh-keygen to actually generate the cryptographic key, and in both
of those places, (a) it's their problem to get it right, and (b)
blocking for two seconds is a completely reasonable thing to do, and
they will probably do it, so we're fine.

So either way, I think it will be fine.  I may have a preference, but
if Python chooses another path, all will be well.  There is an old
saying that Academic politics are often so passionate because the
stakes are so small.  It may be that one of the reasons why this topic
has been so passionate is precisely because of Sayre's Law.

Peace,

						- Ted

From mal at egenix.com  Fri Jun 10 16:30:29 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 10 Jun 2016 22:30:29 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
References: <57595210.4000508@hastings.org>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
Message-ID: <575B2365.7050200@egenix.com>

On 10.06.2016 20:55, Donald Stufft wrote:
> Ok, so you're looking for how would you replicate the blocking behavior of os.urandom that exists in 3.5.0 and 3.5.1?
> 
> In that case, it's hard. I don't think linux provides any way to externally determine if /dev/urandom has been initialized or not. Probably the easiest thing to do would be to interface with the getrandom() function using a c-ext, CFFI, or ctypes. If you're looking for a way of doing this without calling the getrandom() function.. I believe the answer is you can't.

Well, you can see the effect by running Python early in the boot process.

See e.g. http://bugs.python.org/issue26839#msg267749

and if you look at the system log file, you'll find a notice
entry "random: %s pool is initialized" which gets written once the
pool is initialized:

http://lxr.free-electrons.com/source/drivers/char/random.c#L684
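
As a rough illustration of looking for that notice, the sketch below
scans a list of log lines for the message.  The helper name
initialized_pools and the sample lines are made up for illustration;
where the kernel log actually lives depends on the system, so the
function simply takes the lines as input.

```python
import re

# Matches the notice quoted above; "%s" in the kernel source is the pool name.
POOL_READY = re.compile(r"random: (\S+) pool is initialized")


def initialized_pools(log_lines):
    """Return the names of the pools reported as initialized, in log order."""
    names = []
    for line in log_lines:
        match = POOL_READY.search(line)
        if match:
            names.append(match.group(1))
    return names


# Invented sample input, only to show the call.
sample = [
    "[    1.600000] EXT4-fs (sda1): re-mounted. Opts: (null)",
    "[    2.800000] random: nonblocking pool is initialized",
]
print(initialized_pools(sample))
```

An empty result only means the notice was not found in the lines given,
not that the pool is necessarily uninitialized.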

-- 
Marc-Andre Lemburg
eGenix.com

From tjreedy at udel.edu  Fri Jun 10 16:40:16 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 10 Jun 2016 16:40:16 -0400
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwbDF6x0xK9S3wqkGHo05oCQAfyP2m0Xby8Gimm_4bsOQA@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <1465569266.4029.43.camel@redhat.com>
 <CACac1F9akVc5KjPw1ub=0_-H1cES3Xqae2drcgqf+s3Cct52dQ@mail.gmail.com>
 <CAMpsgwbDF6x0xK9S3wqkGHo05oCQAfyP2m0Xby8Gimm_4bsOQA@mail.gmail.com>
Message-ID: <njf8jh$8en$1@ger.gmane.org>

On 6/10/2016 12:09 PM, Victor Stinner wrote:
> 2016-06-10 17:09 GMT+02:00 Paul Moore <p.f.moore at gmail.com>:
>> Also, the way people commonly use
>> micro-benchmarks ("hey, look, this way of writing the expression goes
>> faster than that way") doesn't really address questions like "is the
>> difference statistically significant".
>
> If you use the "python3 -m perf compare method1.json method2.json",
> perf will check that the difference is significant using the
> is_significant() method:
> http://perf.readthedocs.io/en/latest/api.html#perf.is_significant
> "This uses a Student's two-sample, two-tailed t-test with alpha=0.95."

Depending on the sampling design, a matched-pairs t-test may be more 
appropriate.
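
To make the distinction concrete, here is a minimal sketch of the
matched-pairs statistic: each run of one method is paired with the
corresponding run of the other, and the test works on the per-pair
differences.  The function name is made up and this is not perf's code;
turning the statistic into a p-value still needs a t distribution table
or a statistics library.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a matched-pairs t-test.

    first[i] and second[i] must be timings taken under matched
    conditions (same machine state, same input, and so on).
    """
    if len(first) != len(second):
        raise ValueError("samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("differences are constant; t is undefined")
    # Standard error of the mean difference is sd / sqrt(n).
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1
```

Pairing removes the run-to-run variation that both methods share, which
a two-sample test would count as noise.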

-- 
Terry Jan Reedy

From robertc at robertcollins.net  Fri Jun 10 16:51:06 2016
From: robertc at robertcollins.net (Robert Collins)
Date: Sat, 11 Jun 2016 08:51:06 +1200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwbDF6x0xK9S3wqkGHo05oCQAfyP2m0Xby8Gimm_4bsOQA@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <1465569266.4029.43.camel@redhat.com>
 <CACac1F9akVc5KjPw1ub=0_-H1cES3Xqae2drcgqf+s3Cct52dQ@mail.gmail.com>
 <CAMpsgwbDF6x0xK9S3wqkGHo05oCQAfyP2m0Xby8Gimm_4bsOQA@mail.gmail.com>
Message-ID: <CAJ3HoZ1M5CV9s_sUekk70Y2UCbF_HgBaQNDYSpyuH1OSdO2hMA@mail.gmail.com>

On 11 June 2016 at 04:09, Victor Stinner <victor.stinner at gmail.com> wrote:
> We should design a CLI command to do timeit+compare at once.

http://judge.readthedocs.io/en/latest/ might offer some inspiration

There's also ministat -
https://www.freebsd.org/cgi/man.cgi?query=ministat&apropos=0&sektion=0&manpath=FreeBSD+8-current&format=html

From larry at hastings.org  Fri Jun 10 17:06:29 2016
From: larry at hastings.org (Larry Hastings)
Date: Fri, 10 Jun 2016 14:06:29 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160610195411.GA3932@thunk.org>
References: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <20160610195411.GA3932@thunk.org>
Message-ID: <575B2BD5.4050209@hastings.org>

On 06/10/2016 12:54 PM, Theodore Ts'o wrote:
> So even on Python pre-3.5.0, realistically speaking, the "weakness" of
> os.random would only be an issue (a) if it is run within the first few
> seconds of boot, and (b) os.random is used to directly generate a
> long-term cryptographic secret.  If you are fork openssl or ssh-keygen
> to generate a public/private keypair, then you aren't using os.random.

Just a gentle correction: wherever Mr. Ts'o says "os.random", he means 
"os.urandom()".  We don't have an "os.random" in Python.

My thanks to today's celebrity guest correspondent, Mr. Theodore Ts'o!

//arry/

From leewangzhong+python at gmail.com  Fri Jun 10 17:13:13 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Fri, 10 Jun 2016 17:13:13 -0400
Subject: [Python-Dev] PEP 468
In-Reply-To: <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
Message-ID: <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>

I am. I was just wondering if there was an in-progress effort I should be
looking at, because I am interested in extensions to it.

P.S.: If anyone is missing the relevance, Raymond Hettinger's compact dicts
are inherently ordered until a delitem happens.[1] That could be "good
enough" for many purposes, including kwargs and class definition. If
CPython implements efficient compact dicts, it would be easier to propose
order-preserving (or initially-order-preserving) dicts in some places in
the standard.

[1] Whether delitem preserves order depends on whether you want to allow
gaps in your compact entry table. PyPy implemented compact dicts and
chose(?) to make dicts ordered.
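
For anyone who has not seen the layout, here is a toy sketch of the
idea: a sparse index table that points into a dense, append-only entry
list.  It is not Raymond's or PyPy's code, the class name is made up,
and deletion is deliberately left out because that is exactly the case
where ordering becomes a design choice.

```python
class CompactDict:
    """Toy compact dict: sparse index table plus dense entry list."""

    def __init__(self):
        self._indices = [None] * 8   # sparse: slot -> position in _entries
        self._entries = []           # dense: (key, value) in insertion order

    def _probe(self, key):
        # Linear probing: return the slot holding key, or the first empty slot.
        size = len(self._indices)
        slot = hash(key) % size
        while True:
            pos = self._indices[slot]
            if pos is None or self._entries[pos][0] == key:
                return slot
            slot = (slot + 1) % size

    def _grow(self):
        # Only the small index table is rebuilt; entries keep their order.
        self._indices = [None] * (len(self._indices) * 2)
        for pos, (key, _value) in enumerate(self._entries):
            self._indices[self._probe(key)] = pos

    def __setitem__(self, key, value):
        slot = self._probe(key)
        pos = self._indices[slot]
        if pos is not None:
            self._entries[pos] = (key, value)  # overwrite keeps the position
            return
        # Keep the index table at most two-thirds full so probing terminates.
        if (len(self._entries) + 1) * 3 > len(self._indices) * 2:
            self._grow()
            slot = self._probe(key)
        self._indices[slot] = len(self._entries)
        self._entries.append((key, value))

    def __getitem__(self, key):
        pos = self._indices[self._probe(key)]
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][1]

    def __iter__(self):
        # Iteration walks the dense list, hence insertion order.
        return (key for key, _value in self._entries)
```

Iteration never touches the hash table, which is why insertion order
falls out for free as long as nothing is deleted.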

On Saturday, June 11, 2016, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> On Fri, Jun 10, 2016 at 11:54 AM, Franklin? Lee
> <leewangzhong+python at gmail.com> wrote:
> > Eric, have you any work in progress on compact dicts?
>
> Nope.  I presume you are talking about the proposal Raymond made a while back.
>
> -eric
>

From random832 at fastmail.com  Fri Jun 10 17:14:50 2016
From: random832 at fastmail.com (Random832)
Date: Fri, 10 Jun 2016 17:14:50 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160610195411.GA3932@thunk.org>
References: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org> <20160610195411.GA3932@thunk.org>
Message-ID: <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>

On Fri, Jun 10, 2016, at 15:54, Theodore Ts'o wrote:
> So even on Python pre-3.5.0, realistically speaking, the "weakness" of
> os.random would only be an issue (a) if it is run within the first few
> seconds of boot, and (b) os.random is used to directly generate a
> long-term cryptographic secret.  If you are fork openssl or ssh-keygen
> to generate a public/private keypair, then you aren't using os.random.

So, I have a question. If this "weakness" in /dev/urandom is so
unimportant to 99% of situations... why isn't there a flag that can be
passed to getrandom() to allow the same behavior?

From tim.peters at gmail.com  Fri Jun 10 17:21:38 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 10 Jun 2016 16:21:38 -0500
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
References: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org> <20160610195411.GA3932@thunk.org>
 <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
Message-ID: <CAExdVNmJw6o08krtuSJ7XjuhE8qw73mghyUuDWba=LfZMAi1mQ@mail.gmail.com>

[Random832]
> So, I have a question. If this "weakness" in /dev/urandom is so
> unimportant to 99% of situations... why isn't there a flag that can be
> passed to getrandom() to allow the same behavior?

Isn't that precisely the purpose of the GRND_NONBLOCK flag?

http://man7.org/linux/man-pages/man2/getrandom.2.html

From victor.stinner at gmail.com  Fri Jun 10 17:22:42 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 10 Jun 2016 23:22:42 +0200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAK1QoopjYefq0+M_ROyOKA7XEbyNhGfYJt1Lh_GUoHJ=emejPw@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <CAK1QoopjYefq0+M_ROyOKA7XEbyNhGfYJt1Lh_GUoHJ=emejPw@mail.gmail.com>
Message-ID: <CAMpsgwbCKB17197Qgw50aqT6u4uF6FzbpAFqchG9zqprZ=XM+w@mail.gmail.com>

2016-06-10 20:47 GMT+02:00 Meador Inge <meadori at gmail.com>:
> Apologies in advance if this is answered in one of the links you posted, but
> out of curiosity was geometric mean considered?
>
> In the compiler world this is a very common way of aggregating performance
> results.

FYI I chose to store all timings in the JSON file. So later, you are
free to recompute the average differently, compute other statistics,
etc.

I saw that the CPython benchmark suite has an *option* to compute the
geometric mean. I don't fully understand how it differs from the
arithmetic mean.

Is the geometric mean recommended to aggregate results of different
(unrelated) benchmarks, or also even for multiple runs of a single
benchmark?

Victor

From donald at stufft.io  Fri Jun 10 17:28:28 2016
From: donald at stufft.io (Donald Stufft)
Date: Fri, 10 Jun 2016 17:28:28 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAExdVNmJw6o08krtuSJ7XjuhE8qw73mghyUuDWba=LfZMAi1mQ@mail.gmail.com>
References: <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <20160610195411.GA3932@thunk.org>
 <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
 <CAExdVNmJw6o08krtuSJ7XjuhE8qw73mghyUuDWba=LfZMAi1mQ@mail.gmail.com>
Message-ID: <6523337E-0764-42C2-B637-575DBC7B8561@stufft.io>

> On Jun 10, 2016, at 5:21 PM, Tim Peters <tim.peters at gmail.com> wrote:
> 
> Isn't that precisely the purpose of the GRND_NONBLOCK flag?

It doesn't behave exactly the same as /dev/urandom. If the pool hasn't been initialized yet /dev/urandom will return possibly predictable data whereas getrandom(GRND_NONBLOCK) will fail with EAGAIN.
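
A minimal sketch of that difference as seen from Python, assuming an
interpreter that exposes os.getrandom() and os.GRND_NONBLOCK (Python 3.6
and later, Linux only).  The function name is made up, and the fallback
to os.urandom() on other platforms is just a choice made for this sketch.

```python
import os


def random_bytes_nonblocking(n):
    """Return up to n random bytes, or None if the pool is not ready.

    Relies on os.getrandom() and os.GRND_NONBLOCK where they exist;
    on platforms without them this sketch simply uses os.urandom().
    """
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        return os.urandom(n)
    try:
        # With GRND_NONBLOCK the call fails with EAGAIN instead of
        # waiting; Python reports that as BlockingIOError.
        return getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:
        return None


data = random_bytes_nonblocking(16)
print("pool not initialized yet" if data is None else data.hex())
```

The caller has to decide what None means for it: wait and retry, or
fall back to something weaker.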

--
Donald Stufft

From tjreedy at udel.edu  Fri Jun 10 17:37:31 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 10 Jun 2016 17:37:31 -0400
Subject: [Python-Dev] Cutoff time for patches for upcoming releases
Message-ID: <njfbus$ra2$1@ger.gmane.org>

A question for each of the three release managers:
when is the earliest that you might tag your release and
cutoff submission of further patches for the release?

2.7.12 ('6-12')?

3.5.2 ('6-12')?

3.6.0a2 ('6-13')?

-- 
Terry Jan Reedy

From victor.stinner at gmail.com  Fri Jun 10 18:06:31 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 11 Jun 2016 00:06:31 +0200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAO=oM6v+LS6OsbCAi4OE-g_QYxvYrdTeX0-Ueyp306Gtz5s2-A@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
 <20160610170453.GI27919@ando.pearwood.info>
 <CAO=oM6v+LS6OsbCAi4OE-g_QYxvYrdTeX0-Ueyp306Gtz5s2-A@mail.gmail.com>
Message-ID: <CAMpsgwaKgnfAeFtrwjJ9h4UJSFSsm3S4u2GUFUTJse7HChACCQ@mail.gmail.com>

Hi,

2016-06-10 20:37 GMT+02:00 Kevin Modzelewski via Python-Dev
<python-dev at python.org>:
> Hi all, I wrote a blog post about this.
> http://blog.kevmod.com/2016/06/benchmarking-minimum-vs-average/

Oh nice, it's even better to have different articles to explain the
problem of using the minimum ;-) I added it to my doc.

> We can rule out any argument that one (minimum or average) is strictly
> better than the other, since there are cases that make either one better.
> It comes down to our expectation of the underlying distribution.

Ah? In which cases do you prefer to use the minimum? Are you able to
get reliable benchmark results when using the minimum?

> Victor if you could calculate the sample skewness of your results I think
> that would be very interesting!

I'm good at copy/pasting code, but less so at computing statistics :-)
Would you be interested in writing a pull request, or at least sending
me a function computing the expected value?
https://github.com/haypo/perf

Victor

From nad at python.org  Fri Jun 10 18:23:36 2016
From: nad at python.org (Ned Deily)
Date: Fri, 10 Jun 2016 18:23:36 -0400
Subject: [Python-Dev] Reminder: 3.6.0a2 snapshot 2016-06-13 12:00 UTC
Message-ID: <23B6CAA5-6E07-4F2B-898F-B9EABF8E9BD0@python.org>

Just a quick reminder that the next alpha snapshot for the 3.6 release cycle is coming up in a couple of days.  This is the second of four alphas we have planned.  Alpha 2 follows the development sprints at PyCon US 2016 in Portland.  Thanks to all of you who were able to be there and contribute!  And to all of you who continue to contribute from afar.  While there are still plenty of proposed patches awaiting review, nearly 300 commits have been pushed to the default branch (for 3.6.0) in the four weeks since alpha 1.

As a reminder, alpha releases are intended to make it easier for the wider community to test the current state of new features and bug fixes for an upcoming Python release as a whole and for us to test the release process.  During the alpha phase, features may be added, modified, or deleted up until the start of the beta phase.  Alpha users beware!

Also note that Larry has announced plans to do a 3.5.2 release candidate sometime this weekend and Benjamin plans to do a 2.7.12 release candidate.  So get important maintenance release fixes in ASAP.

Looking ahead, the next alpha release, 3.6.0a3, will follow in about a month on 2016-07-11.

2016-06-13 ~12:00 UTC: code snapshot for 3.6.0 alpha 2

now to 2016-09-07: Alpha phase (unrestricted feature development)

2016-09-07: 3.6.0 feature code freeze, 3.7.0 feature development begins

2016-09-07 to 2016-12-04: 3.6.0 beta phase (bug and regression fixes, no new features)

2016-12-04 3.6.0 release candidate 1 (3.6.0 code freeze)

2016-12-16 3.6.0 release (3.6.0rc1 plus, if necessary, any dire emergency fixes)

--Ned

P.S. Just to be clear, this upcoming alpha snapshot will *not* contain a resolution for 3.6.0 of the current on-going discussions about the behavior of os.urandom(), the secrets module, and friends (Issue26839, Issue27288, et al).  I think the focus should be on getting 3.5.2 settled and then we can decide on and implement any changes for 3.6.0 in an upcoming alpha prior to beta 1.

https://www.python.org/dev/peps/pep-0494/

--
  Ned Deily
  nad at python.org -- []

From neil at python.ca  Fri Jun 10 19:36:24 2016
From: neil at python.ca (Neil Schemenauer)
Date: Fri, 10 Jun 2016 23:36:24 +0000 (UTC)
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
 <CADiSq7f7O+fXK7Ci00FwBhWwfiwN_VOwev+Ju0R3VRy56CK4UQ@mail.gmail.com>
Message-ID: <njfito$uhf$1@ger.gmane.org>

Nick Coghlan <ncoghlan at gmail.com> wrote:
> It could be very interesting to add an "ascii-warn" codec to Python
> 2.7, and then set that as the default encoding when the -3 flag is
> set.

I don't think that can work.  The library code in Python would spew
out warnings even in the cases when nothing is wrong with the
application code.  I think warnings have to be added to a Python
where str and bytes have been properly separated.  Without extreme
backporting efforts, that means 3.x.

We don't want to saddle 3.x with a bunch of backwards compatibility
cruft.  Maybe some of my runtime warning changes could be merged
using a command line flag to enable them.  It would be nice to have
the stepping stone version just be normal 3.x with a command line
option.  However, for the sanity of people maintaining 3.x, I think
perhaps we don't want to do it.

From steve at pearwood.info  Fri Jun 10 21:35:56 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 11 Jun 2016 11:35:56 +1000
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwaKgnfAeFtrwjJ9h4UJSFSsm3S4u2GUFUTJse7HChACCQ@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <20160610132051.GH27919@ando.pearwood.info>
 <CAMpsgwYRe7XxH7_=9-c2BRVJd1fvzaRcjE3tJoTz4WtRDTtgZw@mail.gmail.com>
 <20160610170453.GI27919@ando.pearwood.info>
 <CAO=oM6v+LS6OsbCAi4OE-g_QYxvYrdTeX0-Ueyp306Gtz5s2-A@mail.gmail.com>
 <CAMpsgwaKgnfAeFtrwjJ9h4UJSFSsm3S4u2GUFUTJse7HChACCQ@mail.gmail.com>
Message-ID: <20160611013555.GJ27919@ando.pearwood.info>

On Sat, Jun 11, 2016 at 12:06:31AM +0200, Victor Stinner wrote:

> > Victor if you could calculate the sample skewness of your results I think
> > that would be very interesting!
> 
> I'm good at copy/pasting code, but less so at computing statistics :-)
> Would you be interested in writing a pull request, or at least sending
> me a function computing the expected value?
> https://github.com/haypo/perf

I have some code and tests for calculating (population and sample) 
skewness and kurtosis. Do you think it will be useful to add it to the 
statistics module? I can polish it up and aim to have it ready by 3.6.0 
alpha 4.
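
In the meantime, a rough sketch of one common definition of sample
skewness (the adjusted Fisher-Pearson coefficient).  This is an
illustration written for this thread, not the code mentioned above, and
the function name is made up.

```python
import math


def sample_skewness(data):
    """Adjusted Fisher-Pearson sample skewness of a sequence of numbers."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least three data points")
    mean = sum(data) / n
    # Second and third central moments (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    if m2 == 0:
        raise ValueError("all values are equal; skewness is undefined")
    g1 = m3 / m2 ** 1.5                    # population skewness
    # Small-sample correction factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1
```

A clearly positive value for a set of timings means a long tail of slow
runs, which is the case where the minimum and the mean disagree most.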

-- 
Steve

From steve at pearwood.info  Fri Jun 10 21:45:49 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 11 Jun 2016 11:45:49 +1000
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwbCKB17197Qgw50aqT6u4uF6FzbpAFqchG9zqprZ=XM+w@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <CAK1QoopjYefq0+M_ROyOKA7XEbyNhGfYJt1Lh_GUoHJ=emejPw@mail.gmail.com>
 <CAMpsgwbCKB17197Qgw50aqT6u4uF6FzbpAFqchG9zqprZ=XM+w@mail.gmail.com>
Message-ID: <20160611014549.GK27919@ando.pearwood.info>

On Fri, Jun 10, 2016 at 11:22:42PM +0200, Victor Stinner wrote:
> 2016-06-10 20:47 GMT+02:00 Meador Inge <meadori at gmail.com>:
> > Apologies in advance if this is answered in one of the links you posted, but
> > out of curiosity was geometric mean considered?
> >
> > In the compiler world this is a very common way of aggregating performance
> > results.
> 
> FYI I chose to store all timings in the JSON file. So later, you are
> free to recompute the average differently, compute other statistics,
> etc.
> 
> I saw that the CPython benchmark suite has an *option* to compute the
> geometric mean. I don't fully understand how it differs from the
> arithmetic mean.
> 
> Is the geometric mean recommended to aggregate results of different
> (unrelated) benchmarks, or also even for multiple runs of a single
> benchmark?

The Wikipedia article discusses this, but sits on the fence and can't 
decide whether using the gmean for performance results is a good or bad 
idea:

https://en.wikipedia.org/wiki/Geometric_mean#Properties

Geometric mean is usually used in finance for averaging rates of growth:

https://www.math.toronto.edu/mathnet/questionCorner/geomean.html

If you express your performances as speeds (as "calculations per 
second") then the harmonic mean is the right way to average them.
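
A small worked sketch of the three means, using made-up numbers: two
workloads of equal size, one running at 40 and one at 60 calculations
per second.  The helpers are written out by hand so the example does
not depend on which functions a given statistics module version has.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp of the mean of the logs; values must all be positive.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    # Reciprocal of the mean of the reciprocals; values must be positive.
    return len(values) / sum(1.0 / v for v in values)


speeds = [40.0, 60.0]  # invented rates, in calculations per second
for mean in (arithmetic_mean, geometric_mean, harmonic_mean):
    print(mean.__name__, round(mean(speeds), 2))
```

Doing W calculations at each speed takes W/40 + W/60 seconds, so the
overall rate is 2W / (W/40 + W/60) = 48 per second, which is the
harmonic mean; the arithmetic mean of 50 overstates it.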

-- 
Steve

From benjamin at python.org  Fri Jun 10 23:45:41 2016
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 10 Jun 2016 20:45:41 -0700
Subject: [Python-Dev] Cutoff time for patches for upcoming releases
In-Reply-To: <njfbus$ra2$1@ger.gmane.org>
References: <njfbus$ra2$1@ger.gmane.org>
Message-ID: <1465616741.1960720.634423329.39FF561D@webmail.messagingengine.com>

2016-06-11 18:00 UTC

On Fri, Jun 10, 2016, at 14:37, Terry Reedy wrote:
> A question for each of the three release managers:
> when is the earliest that you might tag your release and
> cutoff submission of further patches for the release?
>
> 2.7.12 ('6-12')?
>
> 3.5.2 ('6-12')?
>
> 3.6.0a2 ('6-13')?
>
> --
> Terry Jan Reedy

From steve at pearwood.info  Sat Jun 11 03:40:14 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 11 Jun 2016 17:40:14 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <87lh2dycuo.fsf@vostro.rath.org>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org>
Message-ID: <20160611074013.GL27919@ando.pearwood.info>

On Thu, Jun 09, 2016 at 07:52:31PM -0700, Nikolaus Rath wrote:
> On Jun 09 2016, Guido van Rossum <guido at python.org> wrote:
> > I don't think we should add a new function. I think we should convince
> > ourselves that there is not enough of a risk of an exploit even if
> > os.urandom() falls back.
> 
> That will be hard, because you have to consider an active, clever
> adversary.

We know that there are exploitable bugs from Linux systems due to 
urandom, e.g. the Raspberry Pi bug referenced elsewhere in this thread.

https://www.raspberrypi.org/forums/viewtopic.php?f=66&t=126892

> On the other hand, convincing yourself that in practice os.urandom would
> never block unless the setup is super exotic or there is active
> maliciousness seems much easier.

Not that super exotic. In my day job, I've seen processes hang for five 
or ten minutes during boot up, waiting for the OS to collect enough 
entropy, although this was not recently and it wasn't involving Python. 
But VMs or embedded devices may take a long time to generate entropy. If 
the device doesn't have a hardware source of randomness, and isn't 
connected to an external source of noise like networking or a user who 
habitually fiddles with the mouse, it might take a very long time indeed 
to gather entropy...

If I have understood the consensus, I think we're on the right track:

(1) os.urandom should do whatever the OS says it should do, which on 
Linux is fall back on pseudo-random bytes when the entropy pool hasn't 
been initialised yet. It won't block and won't raise.

(2) os.getrandom will be added to 3.6, and it will block, or possibly 
raise, whichever the caller specifies.

(3) The secrets module in 3.6 will stop relying on os.urandom, and use 
os.getrandom. It may provide a switch to choose between blocking and 
non-blocking (raise an exception) behaviour. It WON'T fall back to 
predictable non-crypto bytes (unless the OS itself is completely 
broken).

(4) random will continue to seed itself from os.urandom, because it 
doesn't care if urandom provides degraded randomness. It just needs to 
be better than using the time as seed.

(5) What about random.SystemRandom? I think it should use os.getrandom.

(6) A bunch of stuff will happen to make the hash randomisation not 
break when systemd runs Python scripts early in the boot process, but I 
haven't been paying attention to that part :-)

Is this a good summary of where we are at?
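
For points (4) and (5), the difference between the two generators in the
random module can be sketched like this. It only illustrates how the two
classes behave today, not any proposed change:

```python
import os
import random

# Point (4): a Mersenne Twister seeded once from os.urandom(). After
# seeding, the output stream is fully determined by the seed, so the
# seed only has to be hard to guess, not cryptographically perfect.
seed = os.urandom(16)
mt_a = random.Random(seed)
mt_b = random.Random(seed)
same_stream = ([mt_a.random() for _ in range(3)] ==
               [mt_b.random() for _ in range(3)])

# Point (5): SystemRandom keeps no seedable state; every call asks the
# operating system for fresh bytes.
sys_rng = random.SystemRandom()
token = sys_rng.getrandbits(128)

print("same seed, same stream:", same_stream)
print("128-bit value from the OS:", hex(token))
```

Because SystemRandom goes back to the OS on every call, the quality of
its output depends directly on whichever source it reads from.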

-- 
Steve

From steve at pearwood.info  Sat Jun 11 03:49:43 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 11 Jun 2016 17:49:43 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575B1DD5.4040305@hastings.org>
References: <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
 <575B1B18.9020502@hastings.org>
 <CAEbHw4bgvHS-9trwQDbs6aN6zK=hr0xXF72DY8C+YM5hTYxvEA@mail.gmail.com>
 <575B1DD5.4040305@hastings.org>
Message-ID: <20160611074943.GM27919@ando.pearwood.info>

On Fri, Jun 10, 2016 at 01:06:45PM -0700, Larry Hastings wrote:
> 
> On 06/10/2016 01:01 PM, David Mertz wrote:
> >So yes, I think 3.5.2 should restore the 2.6-3.4 behavior of os.urandom(),
> 
> That makes... five of us I think ;-) (Larry Guido Barry Tim David)
> 
> 
> >and the NEW APIs in secrets should use the "best available randomness 
> >(even if it blocks)"
> 
> I'm not particular about how the new API is spelled.  However, I do 
> think os.getrandom() should be exposed as a thin wrapper over 
> getrandom() in 3.6.   That would permit Python programmers to take 
> maximal advantage of the features offered by their platform.  It would 
> also permit the secrets module to continue to be written in pure Python.

A big +1 for that.

Will there be platforms where os.getrandom doesn't exist? If not, then 
secrets can just rely on it, otherwise what should it do?

import os

def random_bytes(n):  # hypothetical helper inside secrets
    if hasattr(os, 'getrandom'):
        return os.getrandom(n)
    else:
        ...  # Fail? Fall back on os.urandom?

-- 
Steve

From larry at hastings.org  Sat Jun 11 04:24:15 2016
From: larry at hastings.org (Larry Hastings)
Date: Sat, 11 Jun 2016 01:24:15 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160611074943.GM27919@ando.pearwood.info>
References: <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
 <575B1B18.9020502@hastings.org>
 <CAEbHw4bgvHS-9trwQDbs6aN6zK=hr0xXF72DY8C+YM5hTYxvEA@mail.gmail.com>
 <575B1DD5.4040305@hastings.org> <20160611074943.GM27919@ando.pearwood.info>
Message-ID: <575BCAAF.5000009@hastings.org>

On 06/11/2016 12:49 AM, Steven D'Aprano wrote:
> Will there be platforms where os.getrandom doesn't exist? If not, then
> secrets can just rely on it, otherwise what should it do?
>
> import os
>
> def random_bytes(n):  # hypothetical helper inside secrets
>     if hasattr(os, 'getrandom'):
>         return os.getrandom(n)
>     else:
>         ...  # Fail? Fall back on os.urandom?

AFAIK:

  * Only Linux and Solaris have getrandom() right now.  IIUC Solaris
    duplicated Linux's API, but I don't know that for certain, and I
    don't know in particular what GRND_RANDOM does on Solaris.  (Of
    course, you don't need GRND_RANDOM for secrets.token_bytes().)
  * Only Linux and OS X have never-blocking /dev/urandom.  On Linux, you
    can choose to block by calling getrandom().  On OS X you have no
    choice, you can only use the never-blocking /dev/urandom.  (OS X
    also has a /dev/random but it behaves identically to /dev/urandom.) 
    OS X's man page reassuringly claims blocking is never necessary; the
    blogosphere disagrees.

If I were writing the function for the secrets module, I'd write it like 
you have above: call os.getrandom() if it's present, and os.urandom() if 
it isn't.  I believe that achieves current-best-practice everywhere: it 
does the right thing on Linux, it does the right thing on Solaris, it 
does the right thing on all the other OSes where reading from 
/dev/urandom can block, and it uses the only facility available to us on 
OS X.
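
A minimal sketch of that fallback, assuming os.getrandom() is exposed as
proposed. The helper name strong_random_bytes is made up for this
example, and this is not necessarily how secrets would implement it:

```python
import os


def strong_random_bytes(n):
    """Hypothetical helper: prefer os.getrandom(), else os.urandom()."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Platforms without getrandom(): os.urandom() is the only option.
        return os.urandom(n)
    # getrandom() may return fewer bytes than requested, so keep asking
    # until the buffer is full.
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = getrandom(remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


print(strong_random_bytes(16).hex())
```

The sketch does not handle a kernel that lacks the system call even
though the os module exposes it; a real implementation would need to
decide what to do in that case.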

//arry/

From steve at pearwood.info  Sat Jun 11 04:24:42 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 11 Jun 2016 18:24:42 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
References: <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
Message-ID: <20160611082437.GN27919@ando.pearwood.info>

On Fri, Jun 10, 2016 at 11:42:40AM -0700, Chris Jerdonek wrote:

> And going back to Larry's original e-mail, where he said--
> 
> On Thu, Jun 9, 2016 at 4:25 AM, Larry Hastings <larry at hastings.org> wrote:
> > THE PROBLEM
> > ...
> > The issue author had already identified the cause: CPython was blocking on
> > getrandom() in order to initialize hash randomization.  On this fresh
> > virtual machine the entropy pool started out uninitialized.  And since the
> > only thing running on the machine was CPython, and since CPython was blocked
> > on initialization, the entropy pool was initializing very, very slowly.
> 
> it seems to me that you'd want such a solution to have code that
> causes the initialization of the entropy pool to be sped up so that it
> happens as quickly as possible (if that is even possible).  Is it
> possible? (E.g. by causing the machine to start doing things other
> than just CPython?)

I don't think that's something which the Python interpreter ought to do 
for you, but you can write to /dev/urandom or /dev/random (both keep 
their own, separate, entropy pools):

with open("/dev/urandom", "wb") as f:
    f.write(b"hello world")

But of course there's the question of where you're going to get a source 
of noise to write to the file. While it's (probably?) harmless to write 
a hard-coded string to it, I don't think it's going to give you much 
entropy.

-- 
Steve

From sebastian at realpath.org  Sat Jun 11 07:00:48 2016
From: sebastian at realpath.org (Sebastian Krause)
Date: Sat, 11 Jun 2016 13:00:48 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160611082437.GN27919@ando.pearwood.info> (Steven D'Aprano's
 message of "Sat, 11 Jun 2016 18:24:42 +1000")
References: <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
 <20160611082437.GN27919@ando.pearwood.info>
Message-ID: <m28tycm1lr.fsf@news.realpath.org>

Steven D'Aprano <steve at pearwood.info> wrote:
>> it seems to me that you'd want such a solution to have code that
>> causes the initialization of the entropy pool to be sped up so that it
>> happens as quickly as possible (if that is even possible).  Is it
>> possible? (E.g. by causing the machine to start doing things other
>> than just CPython?)
>
> I don't think that's something which the Python interpreter ought to do 
> for you, but you can write to /dev/urandom or /dev/random (both keep 
> their own, separate, entropy pools):

There are projects like http://www.issihosts.com/haveged/ that use
some tiny timing fluctuations in CPUs to feed the entropy pool and
which are available in most Linux distributions. But as you said,
that is something completely outside of Python's scope.

From g.rodola at gmail.com  Sat Jun 11 07:53:29 2016
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Sat, 11 Jun 2016 13:53:29 +0200
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
Message-ID: <CAFYqXL8EoE8iS7qscJk_Ovauayh1UXCOZ8V0b7c171CNwHgBBw@mail.gmail.com>

On Fri, Jun 10, 2016 at 1:13 PM, Victor Stinner <victor.stinner at gmail.com>
wrote:

> Hi,
>
> Over the last few weeks, I researched how to get stable and reliable
> benchmarks, especially for the corner case of microbenchmarks. The
> first result is a series of articles; here are the first three:
>
> https://haypo.github.io/journey-to-stable-benchmark-system.html
> https://haypo.github.io/journey-to-stable-benchmark-deadcode.html
> https://haypo.github.io/journey-to-stable-benchmark-average.html
>
> The second result is a new perf module which includes all "tricks"
> discovered in my research: compute average and standard deviation,
> spawn multiple worker child processes, automatically calibrate the
> number of outer-loop iterations, automatically pin worker processes
> to isolated CPUs, and more.
>
> The perf module can store benchmark results as JSON so they can be
> analyzed in depth later. It helps to configure a benchmark correctly
> and to check manually whether it is reliable or not.
>
> The perf documentation also explains how to get stable and reliable
> benchmarks (ex: how to tune Linux to isolate CPUs).
>
> perf has 3 builtin CLI commands:
>
> * python -m perf: show and compare JSON results
> * python -m perf.timeit: new better and more reliable implementation of
> timeit
> * python -m perf.metadata: display collected metadata
>
> Python 3 is recommended to get time.perf_counter(), use the new
> accurate statistics module, automatic CPU pinning (I will implement it
> on Python 2 later), etc. But Python 2.7 is also supported, fallbacks
> are implemented when needed.
>
> Example with the patched telco benchmark (benchmark for the decimal
> module) on a Linux with two isolated CPUs.
>
> First run the benchmark:
> ---
> $ python3 telco.py --json-file=telco.json
> .........................
> Average: 26.7 ms +- 0.2 ms
> ---
>
>
> Then show the JSON content to see all details:
> ---
> $ python3 -m perf -v show telco.json
> Metadata:
> - aslr: enabled
> - cpu_affinity: 2, 3
> - cpu_count: 4
> - cpu_model_name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> - hostname: smithers
> - loops: 10
> - platform: Linux-4.4.9-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
> - python_executable: /usr/bin/python3
> - python_implementation: cpython
> - python_version: 3.4.3
>
> Run 1/25: warmup (1): 26.9 ms; samples (3): 26.8 ms, 26.8 ms, 26.7 ms
> Run 2/25: warmup (1): 26.8 ms; samples (3): 26.7 ms, 26.7 ms, 26.7 ms
> Run 3/25: warmup (1): 26.9 ms; samples (3): 26.8 ms, 26.9 ms, 26.8 ms
> (...)
> Run 25/25: warmup (1): 26.8 ms; samples (3): 26.7 ms, 26.7 ms, 26.7 ms
>
> Average: 26.7 ms +- 0.2 ms (25 runs x 3 samples; 1 warmup)
> ---
>
> Note: benchmarks can be analyzed with Python 2.
>
> I'm posting my email to python-dev because providing timeit results is
> commonly requested in review of optimization patches.
>
> The next step is to patch the CPython benchmark suite to use the perf
> module. I already forked the repository and started to patch some
> benchmarks.
>
> If you are interested in Python performance in general, please join us
> on the speed mailing list!
> https://mail.python.org/mailman/listinfo/speed
>
> Victor

This is very interesting and also somewhat related to psutil. I wonder...
would increasing process priority help isolating benchmarks even more? By
this I mean "os.nice(-20)".
Extra: perhaps even IO priority:
https://pythonhosted.org/psutil/#psutil.Process.ionice ?
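
A rough sketch of what I mean, using only the standard library. The
helper name is made up, and negative increments normally need root, so
the failure case is handled rather than assumed away:

```python
import os


def try_raise_priority(increment=-20):
    """Hypothetical helper: ask for a higher scheduling priority.

    Returns the resulting niceness, or None where os.nice() does not
    exist (for example on Windows).
    """
    if not hasattr(os, "nice"):
        return None
    try:
        # A negative increment lowers niceness, i.e. raises priority.
        return os.nice(increment)
    except OSError:
        # Unprivileged processes usually may not lower their niceness;
        # os.nice(0) just reports the current value.
        return os.nice(0)


print("niceness now:", try_raise_priority())
```

Whether this actually makes timings more stable would have to be
measured; it is only a suggestion.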

-- 
Giampaolo - http://grodola.blogspot.com

From christian at python.org  Sat Jun 11 08:40:45 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 11 Jun 2016 14:40:45 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
Message-ID: <njh0sd$bk0$1@ger.gmane.org>

On 2016-06-10 20:42, Chris Jerdonek wrote:
> On Fri, Jun 10, 2016 at 11:29 AM, David Mertz <mertz at gnosis.cx> wrote:
>> This is fairly academic, since I do not anticipate needing to do this
>> myself, but I have a specific question.  I'll assume that Python 3.5.2 will
>> go back to the 2.6-3.4 behavior in which os.urandom() never blocks on Linux.
>> Moreover, I understand that the case where the insecure bits might be
>> returned are limited to Python scripts that run on system initialization on
>> Linux.
>>
>> If I *were* someone who needed to write a Linux system initialization script
>> using Python 3.5.2, what would the code look like.  I think for this use
>> case, requiring something with a little bit of "code smell" is fine, but I
>> kinda hope it exists at all.
> 
> Good question.  And going back to Larry's original e-mail, where he said--
> 
> On Thu, Jun 9, 2016 at 4:25 AM, Larry Hastings <larry at hastings.org> wrote:
>> THE PROBLEM
>> ...
>> The issue author had already identified the cause: CPython was blocking on
>> getrandom() in order to initialize hash randomization.  On this fresh
>> virtual machine the entropy pool started out uninitialized.  And since the
>> only thing running on the machine was CPython, and since CPython was blocked
>> on initialization, the entropy pool was initializing very, very slowly.

I repeat for like the fifth time:

os.urandom() and Python startup are totally unrelated. They just happen
to use the same internal function to set the hash randomization state.
The startup problem can be solved without f... up the security
properties of os.urandom().

The correct questions to ask are:

1) Does hash randomization for bytes, text and XML always require
cryptographically strong random values from a potentially blocking CPRNG?

2) Does the initial state of the Mersenne-Twister of the default
random.Random instance really need cryptographically strong values?

3) Should os.urandom() always use the best CSPRNG source available and
make sure it never returns weak, predictable values (when possible)?

The answers are:

1) No
2) No
3) HELL YES!

If you think that the answer to 3 is "No" and that a CSPRNG is permitted
to return predictable values, then you are *by definition* ineligible to
vote on security issues.

Christian

From victor.stinner at gmail.com  Sat Jun 11 10:37:44 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 11 Jun 2016 16:37:44 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <njh0sd$bk0$1@ger.gmane.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
 <njh0sd$bk0$1@ger.gmane.org>
Message-ID: <CAMpsgwYYAy9V1A6zxAOaVAUN2WUYkqAzWKawo_-9B_oK2GpfWw@mail.gmail.com>

> I  repeat for like the fifth time:

So, is there a candidate to write a PEP?

I didn't read the thread. As expected, the discussion restarted for the 3rd
time, there are almost 100 emails in this thread.

Victor

From cory at lukasa.co.uk  Sat Jun 11 10:56:16 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sat, 11 Jun 2016 15:56:16 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575BCAAF.5000009@hastings.org>
References: <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <58F60D6A-4840-4A7F-8BA5-065356770036@stufft.io>
 <CAEbHw4b4R46U1vQduZ-KTLqqOEd8LzZz7q6b0xgFj4wmCyQzCg@mail.gmail.com>
 <0D09AC01-10B1-4577-AAEF-F1582ABAD8F7@stufft.io>
 <CAEbHw4Z+KHXkQLSYahQh98Pjr7r+WX5hGUy1JC=DpeFG4ArjvQ@mail.gmail.com>
 <B50734BC-E2DC-4F7C-B607-A7D624A3C0C9@stufft.io>
 <CAEbHw4aCETZ-cBZ4kvgtAOrcVW3CGSSKV2_iAxs2w4iTv4tSjA@mail.gmail.com>
 <575B1B18.9020502@hastings.org>
 <CAEbHw4bgvHS-9trwQDbs6aN6zK=hr0xXF72DY8C+YM5hTYxvEA@mail.gmail.com>
 <575B1DD5.4040305@hastings.org> <20160611074943.GM27919@ando.pearwood.info>
 <575BCAAF.5000009@hastings.org>
Message-ID: <7C877B5C-0410-413F-8589-DFFF48792BBD@lukasa.co.uk>

> On 11 Jun 2016, at 09:24, Larry Hastings <larry at hastings.org> wrote:
> Only Linux and OS X have never-blocking /dev/urandom.  On Linux, you can choose to block by calling getrandom().  On OS X you have no choice, you can only use the never-blocking /dev/urandom.  (OS X also has a /dev/random but it behaves identically to /dev/urandom.)  OS X's man page reassuringly claims blocking is never necessary; the blogosphere disagrees.
> If I were writing the function for the secrets module, I'd write it like you have above: call os.getrandom() if it's present, and os.urandom() if it isn't.  I believe that achieves current-best-practice everywhere: it does the right thing on Linux, it does the right thing on Solaris, it does the right thing on all the other OSes where reading from /dev/urandom can block, and it uses the only facility available to us on OS X.

Sorry Larry, but as far as I know this is misleading (it's not *wrong*, but it suggests that OS X's /dev/urandom is the same as Linux's, which is emphatically not true).

I've found the discussion around OS X's random devices to be weirdly abstract, given that the source code for it is public, so I went and took a look. My initial reading of it (and, to be clear, this is a high-level read of a codebase I don't know well, so please take this with the grain of salt that is intended) is that the operating system literally will not boot without at least 128 bits of entropy to read from the EFI boot loader. In the absence of 128 bits of entropy the kernel will panic, rather than continue to boot.

Generally speaking that entropy will come from RDRAND, given the restrictions on where OS X can be run (Intel CPUs for real OS X, virtualised on top of OS X, and so on top of Intel CPUs, for VMs), which imposes a baseline on the quality of the entropy you can get. Assuming that OS X is being run in a manner that is acceptable from the perspective of its license agreement (and we can all agree that no-one would violate the terms of OS X's license agreement, right?), I think it's reasonable to assume that OS X, either virtualised or not, is getting 128 bits of somewhat sensible entropy from the boot loader/CPU before it boots.

That means we can say this about OS X's /dev/urandom: the reason it never blocks is because the situation of "not enough entropy to generate good random numbers" is synonymous with "not enough entropy to boot the OS". So maybe we can stop casting aspersions on OS X's RNG now.

Cory

From yan12125 at gmail.com  Sat Jun 11 01:59:49 2016
From: yan12125 at gmail.com (Chi Hsuan Yen)
Date: Sat, 11 Jun 2016 13:59:49 +0800
Subject: [Python-Dev] Current Python 3.2 status?
Message-ID: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>

Hello all,

Georg said in February [1] that 3.2.7 is going to be released, and now it's
June. Will it ever be released?

pip [2], virtualenv [3] and setuptools [4] have all dropped Python 3.2
support, and there are no new commits since 2016/01/15 on CPython's 3.2
branch. I'd like to know CPython's attitude towards Python 3.2. Is it still
maintained? Or is it dead?

[1] https://mail.python.org/pipermail/python-dev/2016-February/143300.html
[2]
https://github.com/pypa/pip/commit/b11cb019a47ff0cf3d8a37a0c89d8ae4cf25282f
[3]
https://github.com/pypa/virtualenv/commit/8132fa3a826ff1ba0c0c065563b9733c2e5a5b6c
[4]
https://github.com/pypa/setuptools/commit/ae6c73f07680da77345f5ccfac4facde30ad4d7e

From christian at python.org  Sat Jun 11 11:08:54 2016
From: christian at python.org (Christian Heimes)
Date: Sat, 11 Jun 2016 17:08:54 +0200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAMpsgwYYAy9V1A6zxAOaVAUN2WUYkqAzWKawo_-9B_oK2GpfWw@mail.gmail.com>
References: <57595210.4000508@hastings.org>
 <20160609124102.5EE4EB14024@webabinitio.net> <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
 <njh0sd$bk0$1@ger.gmane.org>
 <CAMpsgwYYAy9V1A6zxAOaVAUN2WUYkqAzWKawo_-9B_oK2GpfWw@mail.gmail.com>
Message-ID: <162d6d54-c6ac-bf60-5912-7076e8a07261@python.org>

On 2016-06-11 16:37, Victor Stinner wrote:
>> I  repeat for like the fifth time:
> 
> So, is there a candidate to write a PEP?
> 
> I didn't read the thread. As expected, the discussion restarted for the
> 3rd time, there are almost 100 emails in this thread.

Sorry, I'm out.

I simply lack the necessary strength and mental energy to pursue the
issue any further. Donald Stufft just forwarded a quote that resonates
with my current state of mind (replace 'lists' with 'current topic'):

    "I feel I no longer possess either the necessary strength or perhaps
the necessary faith to continue rolling the stone of Sisyphus against
the forces of reaction which are triumphing everywhere. I am therefore
retiring from the lists, and ask of my dear contemporaries only one
thing - oblivion."

Christian

From guido at python.org  Sat Jun 11 11:34:20 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 08:34:20 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160611074013.GL27919@ando.pearwood.info>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org>
 <20160611074013.GL27919@ando.pearwood.info>
Message-ID: <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>

In terms of API design, I'd prefer a flag to os.urandom() indicating a
preference for
- blocking
- raising an exception
- weaker random bits
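
Roughly what I have in mind, written as a wrapper rather than a change
to os.urandom() itself. The names NoEntropy and urandom_with_policy are
invented for this sketch, which assumes Linux-style os.getrandom() and
os.GRND_NONBLOCK where available:

```python
import enum
import os


class NoEntropy(enum.Enum):
    """Hypothetical policy names; not part of the os module."""
    BLOCK = "block"
    RAISE = "raise"
    WEAK = "weak"


def urandom_with_policy(n, policy=NoEntropy.BLOCK):
    """Hypothetical wrapper: return n random bytes, honouring policy."""
    if not hasattr(os, "getrandom"):
        # No getrandom() on this platform: os.urandom() is all there is.
        return os.urandom(n)
    flags = 0 if policy is NoEntropy.BLOCK else os.GRND_NONBLOCK
    buf = b""
    while len(buf) < n:
        try:
            # getrandom() may return fewer bytes than requested.
            buf += os.getrandom(n - len(buf), flags)
        except BlockingIOError:
            # The entropy pool is not initialised yet.
            if policy is NoEntropy.RAISE:
                raise
            # WEAK: read the device directly, so the result does not
            # depend on how os.urandom() behaves in a given version.
            with open("/dev/urandom", "rb") as dev:
                return dev.read(n)
    return buf


for policy in NoEntropy:
    print(policy.name, urandom_with_policy(8, policy).hex())
```

On a machine whose entropy pool is already initialised all three
policies behave the same; they only differ early in boot.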

To those still upset by the decision, please read Ted Ts'o's message.

On Saturday, June 11, 2016, Steven D'Aprano <steve at pearwood.info> wrote:

> On Thu, Jun 09, 2016 at 07:52:31PM -0700, Nikolaus Rath wrote:
> > On Jun 09 2016, Guido van Rossum <guido at python.org> wrote:
> > > I don't think we should add a new function. I think we should convince
> > > ourselves that there is not enough of a risk of an exploit even if
> > > os.urandom() falls back.
> >
> > That will be hard, because you have to consider an active, clever
> > adversary.
>
> We know that there are exploitable bugs from Linux systems due to
> urandom, e.g. the Raspberry Pi bug referenced elsewhere in this thread.
>
> https://www.raspberrypi.org/forums/viewtopic.php?f=66&t=126892
>
>
> > On the other hand, convincing yourself that in practice os.urandom would
> > never block unless the setup is super exotic or there is active
> > maliciousness seems much easier.
>
> Not that super exotic. In my day job, I've seen processes hang for five
> or ten minutes during boot up, waiting for the OS to collect enough
> entropy, although this was not recently and it wasn't involving Python.
> But VMs or embedded devices may take a long time to generate entropy. If
> the device doesn't have a hardware source of randomness, and isn't
> connected to an external source of noise like networking or a user who
> habitually fiddles with the mouse, it might take a very long time indeed
> to gather entropy...
>
> If I have understood the consensus, I think we're on the right track:
>
> (1) os.urandom should do whatever the OS says it should do, which on
> Linux is to fall back on pseudo-random bytes when the entropy pool hasn't
> been initialised yet. It won't block and won't raise.
>
> (2) os.getrandom will be added to 3.6, and it will block, or possibly
> raise, whichever the caller specifies.
>
> (3) The secrets module in 3.6 will stop relying on os.urandom, and use
> os.getrandom. It may provide a switch to choose between blocking and
> non-blocking (raise an exception) behaviour. It WON'T fall back to
> predictable non-crypto bytes (unless the OS itself is completely
> broken).
>
> (4) random will continue to seed itself from os.urandom, because it
> doesn't care if urandom provides degraded randomness. It just needs to
> be better than using the time as seed.
>
> (5) What about random.SystemRandom? I think it should use os.getrandom.
>
> (6) A bunch of stuff will happen to make the hash randomisation not
> break when systemd runs Python scripts early in the boot process, but I
> haven't been paying attention to that part :-)
>
> Is this a good summary of where we are at?
>
>
>
> --
> Steve
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>

-- 
--Guido (mobile)

From brett at python.org  Sat Jun 11 12:35:26 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 11 Jun 2016 16:35:26 +0000
Subject: [Python-Dev] Current Python 3.2 status?
In-Reply-To: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>
References: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>
Message-ID: <CAP1=2W6HLMszu27GKzDK7ewd665N9RuEMfe-y_1e8SFF+uW0EA@mail.gmail.com>

On Sat, 11 Jun 2016 at 08:05 Chi Hsuan Yen <yan12125 at gmail.com> wrote:

> Hello all,
>
> Georg said in February [1] that 3.2.7 is going to be released, and now it's
> June. Will it ever be released?
>
> pip [2], virtualenv [3] and setuptools [4] have all dropped Python 3.2
> support, and there have been no new commits since 2016/01/15 on CPython's 3.2
> branch. I'd like to know CPython's attitude toward Python 3.2. Is it still
> maintained? Or is it dead?
>

It's up to Georg to decide to do one final source-only release. But to the
rest of us it's reached its end-of-life. Basically checking out the source
code will more-or-less be the same as whatever Georg releases (sans
tweaking some version numbers).

-Brett

>
> [1] https://mail.python.org/pipermail/python-dev/2016-February/143300.html
> [2]
> https://github.com/pypa/pip/commit/b11cb019a47ff0cf3d8a37a0c89d8ae4cf25282f
> [3]
> https://github.com/pypa/virtualenv/commit/8132fa3a826ff1ba0c0c065563b9733c2e5a5b6c
> [4]
> https://github.com/pypa/setuptools/commit/ae6c73f07680da77345f5ccfac4facde30ad4d7e
>

From berker.peksag at gmail.com  Sat Jun 11 13:02:21 2016
From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=)
Date: Sat, 11 Jun 2016 20:02:21 +0300
Subject: [Python-Dev] Current Python 3.2 status?
In-Reply-To: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>
References: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>
Message-ID: <CAF4280+4DbaEE0T0C9nNVHqS0Yoi=bmFk4aUFb_TJ2=eTJu-OQ@mail.gmail.com>

On Sat, Jun 11, 2016 at 8:59 AM, Chi Hsuan Yen <yan12125 at gmail.com> wrote:
> Hello all,
>
> Georg said in February that 3.2.7 is going to be released, and now it's
> June. Will it ever be released?

Hi,

It was delayed because of a security issue. See Georg's email at
https://mail.python.org/pipermail/python-dev/2016-February/143400.html

--Berker

From donald at stufft.io  Sat Jun 11 13:15:51 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 11 Jun 2016 13:15:51 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
Message-ID: <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>

> On Jun 11, 2016, at 11:34 AM, Guido van Rossum <guido at python.org> wrote:
> 
> In terms of API design, I'd prefer a flag to os.urandom() indicating a preference for
> - blocking
> - raising an exception
> - weaker random bits

If os.urandom can't block on Linux, then I feel like it'd be saner to add os.getrandom(). I feel like these flags are going to confuse people, particularly when you take into account that all 3 of them are only going to really matter on Linux (and particularly on newer Linux) and for things like "blocking" it's going to get confused with the blocking that /dev/random does on Linux.

Right now there are two ways to access the system CSPRNG on *nix, there is /dev/urandom pretty much always, and then there is getrandom() (or arc4random, etc, depending on the specific OS you're on).

Perhaps the right answer is to go back to making os.urandom always open("/dev/urandom").read() instead of trying to save a FD by using getrandom() and just add os.getrandom() which will interface with getrandom()/arc4random()/etc and always in blocking mode. Why always in blocking mode? Because it's the only way to get consistent behavior across different platforms, all non Linux OSs either block or they otherwise ensure that it is initialized prior to it even being possible to access the CSPRNG.

Using this, code can be smarter about what to do in edge cases than we can reasonably be in os.urandom, for example see https://bpaste.net/show/41d89e520913.

The reasons I think this is preferable to adding parameters to os.urandom are:

* If we add parameters to os.urandom, you can't feature detect their existence easily, you have to use version checks.
* With flags, unless we add even more flags we can't dictate what should happen if we're on a system where the person's desired preference can't be satisfied. We either have to just silently do something that may be wrong, or add more flags. By adding two functions people can pick which of the following they want with some programming (see example):

    * Just try to get the strongest random, but fall back to maybe not random if it's early enough in boot process.
    * Fail on old Linux rather than possibly get insecure random.
    * Actually write cross platform code to prevent blocking (since only Linux allows you to not block)
        * Fail hard rather than block if we can't get secure random bytes without blocking.
        * Soft fail and get "probably good enough" random from os.urandom on Linux.
        * Hard fail on non Linux if we would block since there's no non-blocking and "probably good enough" interface.
        * Soft fail and get "probably good enough" random from os.urandom on Linux, and use time/pid/memory offsets on non Linux.
    * Just use the best source of random available to use on the system, and block rather than fail.

I don't see any way to get the same wide set of options by just adding flags to os.urandom unless we add flags that work for every possible combination of what people may or may not want to.
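
The feature-detection argument can be sketched roughly as follows. This is only a minimal sketch, assuming os.getrandom() ends up with the signature getrandom(size, flags=0) and blocks by default; best_effort_random is a hypothetical helper name and is not taken from the bpaste example.

```python
import os


def best_effort_random(n):
    """Return n random bytes, preferring a blocking getrandom() when present.

    Hypothetical helper: the name and the fallback policy are assumptions.
    """
    # A new function can be probed with getattr()/hasattr(); a new keyword
    # argument on os.urandom() could only be detected by version checks.
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        try:
            chunks = []
            remaining = n
            while remaining > 0:
                # getrandom() may return fewer bytes than requested.
                chunk = getrandom(remaining)
                chunks.append(chunk)
                remaining -= len(chunk)
            return b"".join(chunks)
        except OSError:
            # The syscall is missing or forbidden at runtime: soft-fail.
            pass
    # "Maybe not random" early in boot on Linux, fine elsewhere.
    return os.urandom(n)
```

The point of the sketch is that the caller, not os.urandom, decides what the fallback is; swapping the last line for a raise turns the soft failure into a hard one.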

--
Donald Stufft


From tjreedy at udel.edu  Sat Jun 11 13:28:33 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 11 Jun 2016 13:28:33 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
Message-ID: <njhho5$ucr$1@ger.gmane.org>

On 6/11/2016 11:34 AM, Guido van Rossum wrote:
> In terms of API design, I'd prefer a flag to os.urandom() indicating a
> preference for
> - blocking
> - raising an exception
> - weaker random bits

+100 ;-)

I proposed exactly this 2 days ago, 5 hours after Larry's initial post.

'''
I think the 'new API' should be a parameter, not a new function. With 
just two choices, 'wait' = True/False  could work.  If 'raise an 
exception' were added, then
'action (when good bits are not immediately available)' =
'return (best possible)' or
'wait (until have good bits)' or
'raise (CryptBitsNotAvailable)'

In either case, there would then be the question of whether the default 
should match 3.5.0/1 or 3.4 and before.
'''
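
For concreteness, the three-way 'action' parameter could be prototyped outside the os module along these lines. This is a sketch only: urandom_with_action is a hypothetical wrapper, CryptBitsNotAvailable is just the placeholder name from the quoted text, and the code assumes both an os.getrandom(size, flags) with os.GRND_NONBLOCK and an os.urandom() that keeps the non-blocking 3.4 behavior.

```python
import os


class CryptBitsNotAvailable(Exception):
    """Placeholder exception name from the proposal; not a real stdlib class."""


def urandom_with_action(n, action="return"):
    """Hypothetical wrapper: action is 'return', 'wait' or 'raise'."""
    if action not in ("return", "wait", "raise"):
        raise ValueError("action must be 'return', 'wait' or 'raise'")
    getrandom = getattr(os, "getrandom", None)
    if action == "return" or getrandom is None:
        # Assumption: os.urandom() returns the best bytes it has without
        # waiting. Without getrandom() the action is only a hint.
        return os.urandom(n)
    # 'raise' asks the kernel not to block; 'wait' uses the blocking default.
    flags = os.GRND_NONBLOCK if action == "raise" else 0
    data = b""
    try:
        while len(data) < n:
            data += getrandom(n - len(data), flags)
    except BlockingIOError:
        raise CryptBitsNotAvailable("entropy pool not initialized") from None
    return data
```

Note that on a platform without getrandom() the 'wait' and 'raise' actions silently degrade to whatever os.urandom() does, which is exactly the "hint" semantics under discussion.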

Deciding on this then might have saved some hurt feelings, to the point 
where two contributors feel like disappearing, and a release manager 
must feel the same.  In any case, Guido already picked 3.4 behavior as 
the default.  Can we agree and move on?

-- 
Terry Jan Reedy

From guido at python.org  Sat Jun 11 13:39:11 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 10:39:11 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
Message-ID: <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>

Is the feature detection desire about being able to write code that runs on
older Python versions or for platforms that just don't have getrandom()?

My assumption was that nobody would actually use these flags except the
secrets module and people writing code that generates long-lived secrets --
and the latter category should be checking platform and versions anyway
since they need the whole stack to be secure (if I understand Ted Ts'o's
email right).

My assumption is also that the flags should be hints (perhaps only relevant
on Linux) -- platforms that can't perform the action desired (because their
system's API doesn't support it) would just do their default action,
assuming the system API does the best it can.

I think the problem with making os.urandom() go back to always reading
/dev/urandom is that we've come to rely on it on all platforms, so we've
passed that station.

On Sat, Jun 11, 2016 at 10:15 AM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 11, 2016, at 11:34 AM, Guido van Rossum <guido at python.org> wrote:
>
> In terms of API design, I'd prefer a flag to os.urandom() indicating a
> preference for
> - blocking
> - raising an exception
> - weaker random bits
>
>
> If os.urandom can't block on Linux, then I feel like it'd be saner to add
> os.getrandom(). I feel like these flags are going to confuse people,
> particularly when you take into account that all 3 of them are only going
> to really matter on Linux (and particularly on newer Linux) and for things
> like "blocking" it's going to get confused with the blocking that
> /dev/random does on Linux.
>
> Right now there are two ways to access the system CSPRNG on *nix, there is
> /dev/urandom pretty much always, and then there is getrandom() (or
> arc4random, etc, depending on the specific OS you're on).
>
> Perhaps the right answer is to go back to making os.urandom always
> open("/dev/urandom").read() instead of trying to save a FD by using
> getrandom() and just add os.getrandom() which will interface with
> getrandom()/arc4random()/etc and always in blocking mode. Why always in
> blocking mode? Because it's the only way to get consistent behavior across
> different platforms, all non Linux OSs either block or they otherwise
> ensure that it is initialized prior to it even being possible to access the
> CSPRNG.
>
> Using this, code can be smarter about what to do in edge cases than we can
> reasonably be in os.urandom, for example see
> https://bpaste.net/show/41d89e520913.
>
> The reasons I think this is preferable to adding parameters to os.urandom
> are:
>
> * If we add parameters to os.urandom, you can't feature detect their
> existence easily, you have to use version checks.
> * With flags, unless we add even more flags we can't dictate what should
> happen if we're on a system where the person's desired preference can't be
> satisfied. We either have to just silently do something that may be wrong,
> or add more flags. By adding two functions people can pick which of the
> following they want with some programming (see example):
>
>     * Just try to get the strongest random, but fall back to maybe not
> random if it's early enough in boot process.
>     * Fail on old Linux rather than possibly get insecure random.
>     * Actually write cross platform code to prevent blocking (since only
> Linux allows you to not block)
>         * Fail hard rather than block if we can't get secure random bytes
> without blocking.
>         * Soft fail and get "probably good enough" random from os.urandom
> on Linux.
>         * Hard fail on non Linux if we would block since there's no
> non-blocking and "probably good enough" interface.
>         * Soft fail and get "probably good enough" random from os.urandom
> on Linux, and use time/pid/memory offsets on non Linux.
>     * Just use the best source of random available to use on the system,
> and block rather than fail.
>
> I don't see any way to get the same wide set of options by just adding
> flags to os.urandom unless we add flags that work for every possible
> combination of what people may or may not want to.
>
> --
> Donald Stufft
>
>
>
>

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Sat Jun 11 13:41:19 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 10:41:19 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <njhho5$ucr$1@ger.gmane.org>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <njhho5$ucr$1@ger.gmane.org>
Message-ID: <CAP7+vJKMOf2SUpP1Q_mPKZgMeJQFPRTZhn3QHFwZtFgDoAa=Yw@mail.gmail.com>

You can add me to the list of people who feel like disappearing.

On Sat, Jun 11, 2016 at 10:28 AM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 6/11/2016 11:34 AM, Guido van Rossum wrote:
>
>> In terms of API design, I'd prefer a flag to os.urandom() indicating a
>> preference for
>> - blocking
>> - raising an exception
>> - weaker random bits
>>
>
> +100 ;-)
>
> I proposed exactly this 2 days ago, 5 hours after Larry's initial post.
>
> '''
> I think the 'new API' should be a parameter, not a new function. With just
> two choices, 'wait' = True/False  could work.  If 'raise an exception' were
> added, then
> 'action (when good bits are not immediately available)' =
> 'return (best possible)' or
> 'wait (until have good bits)' or
> 'raise (CryptBitsNotAvailable)'
>
> In either case, there would then be the question of whether the default
> should match 3.5.0/1 or 3.4 and before.
> '''
>
> Deciding on this then might have saved some hurt feelings, to the point
> where two contributors feel like disappearing, and a release manager must
> feel the same.  In any case, Guido already picked 3.4 behavior as the
> default.  Can we agree and move on?
>
> --
> Terry Jan Reedy
>
>

-- 
--Guido van Rossum (python.org/~guido)

From yan12125 at gmail.com  Sat Jun 11 13:41:59 2016
From: yan12125 at gmail.com (Chi Hsuan Yen)
Date: Sun, 12 Jun 2016 01:41:59 +0800
Subject: [Python-Dev] Current Python 3.2 status?
In-Reply-To: <CAF4280+4DbaEE0T0C9nNVHqS0Yoi=bmFk4aUFb_TJ2=eTJu-OQ@mail.gmail.com>
References: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>
 <CAF4280+4DbaEE0T0C9nNVHqS0Yoi=bmFk4aUFb_TJ2=eTJu-OQ@mail.gmail.com>
Message-ID: <CAMNjDR03g6KAWKSpN4qiFj-XkABWBx=RgLoyeH6Y1H_R6cLBOA@mail.gmail.com>

On Sun, Jun 12, 2016 at 1:02 AM, Berker Peksağ <berker.peksag at gmail.com>
wrote:

> On Sat, Jun 11, 2016 at 8:59 AM, Chi Hsuan Yen <yan12125 at gmail.com> wrote:
> > Hello all,
> >
> > Georg said in February that 3.2.7 is going to be released, and now it's
> > June. Will it ever be released?
>
> Hi,
>
> It was delayed because of a security issue. See Georg's email at
> https://mail.python.org/pipermail/python-dev/2016-February/143400.html
>
> --Berker
>

Thanks for that. I'm just curious what's happening on the 3.2 branch.

From donald at stufft.io  Sat Jun 11 14:30:19 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 11 Jun 2016 14:30:19 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
Message-ID: <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>

> On Jun 11, 2016, at 1:39 PM, Guido van Rossum <guido at python.org> wrote:
> 
> Is the feature detection desire about being able to write code that runs on older Python versions or for platforms that just don't have getrandom()?
> 
> My assumption was that nobody would actually use these flags except the secrets module and people writing code that generates long-lived secrets -- and the latter category should be checking platform and versions anyway since they need the whole stack to be secure (if I understand Ted Ts'o's email right).
> 
> My assumption is also that the flags should be hints (perhaps only relevant on Linux) -- platforms that can't perform the action desired (because their system's API doesn't support it) would just do their default action, assuming the system API does the best it can.

The problem is that someone writing software that does os.urandom(block=True) or os.urandom(exception=True) and gets some bytes back doesn't know which of these happened: it got cryptographically secure random because Python called getrandom(); it got cryptographically secure random because Python read /dev/urandom on a platform that defines that as always returning secure bytes, or on Linux with the urandom pool initialized; or it got random bytes that are not cryptographically secure because Python fell back to reading /dev/urandom on Linux prior to the pool being initialized.

The "silently does the wrong thing, even though I explicitly asked for it to do something different" behavior is something that I would consider to be a footgun, and footguns in security sensitive code make me really worried.

Outside of the security side of things, if someone goes "Ok I need some random bytes and I need to make sure it doesn't block", then doing ``os.urandom(block=False, exception=False)`` isn't going to make sure that it doesn't block except on Linux.

In other words, it's basically impossible to ensure you get the behavior you want with these flags which I feel like will make everyone unhappy (both the people who want to ensure non-blocking, and the people who want to ensure cryptographically secure). These flags are an attractive nuisance that look like they do the right thing, but silently don't.

Meanwhile if we have os.urandom that reads from /dev/urandom and os.getrandom() which reads from blocking random, then we make it easier to ensure you get the behavior you want by using the function that best suits your needs:

* If you just want the best the OS has to offer, os.getrandom falling back to os.urandom.
* If you want to ensure you get cryptographically secure bytes, os.getrandom, falling back to os.urandom on non Linux platforms and erroring on Linux.
* If you want to *ensure* that there's no blocking, then os.urandom on Linux (or os.urandom wrapped with timeout code anywhere else, as that's the only way to ensure not blocking cross platform).
* If you just don't care, YOLO it up with either os.urandom or os.getrandom or random.random.
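
The second bullet (secure bytes or an error) might look something like the sketch below. secure_random_or_fail is a hypothetical name, telling Linux apart via sys.platform is an assumption, and the code presumes a blocking os.getrandom(size) as proposed above.

```python
import os
import sys


def secure_random_or_fail(n):
    """Return n cryptographically secure bytes, or raise instead of degrading.

    Hypothetical helper illustrating one of the strategies listed above.
    """
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        # Blocking getrandom(): waits until the pool has been initialized.
        data = b""
        while len(data) < n:
            data += getrandom(n - len(data))
        return data
    if sys.platform.startswith("linux"):
        # No getrandom() here, and /dev/urandom may be unseeded early in boot.
        raise RuntimeError("getrandom() unavailable; refusing /dev/urandom")
    # Assumption from the discussion: non-Linux platforms block or seed the
    # CSPRNG before it can be read, so os.urandom() is acceptable there.
    return os.urandom(n)
```

The other bullets differ only in which branch raises and which one falls through to os.urandom, which is why two plain functions are enough to express all of them.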

> 
> I think the problem with making os.urandom() go back to always reading /dev/urandom is that we've come to rely on it on all platforms, so we've passed that station.
> 

Sorry, to be more specific I meant the 3.4 behavior, which was open("/dev/urandom").read() on *nix and CryptGenRandom on Windows.

--
Donald Stufft


From brett at python.org  Sat Jun 11 14:55:51 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 11 Jun 2016 18:55:51 +0000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
Message-ID: <CAP1=2W42oC=MswTCoZAUhK3RZR872N59zXMA5=U_Yq0E3v7p6A@mail.gmail.com>

On Sat, 11 Jun 2016 at 11:31 Donald Stufft <donald at stufft.io> wrote:

> On Jun 11, 2016, at 1:39 PM, Guido van Rossum <guido at python.org> wrote:
>
> Is the feature detection desire about being able to write code that runs
> on older Python versions or for platforms that just don't have getrandom()?
>
> My assumption was that nobody would actually use these flags except the
> secrets module and people writing code that generates long-lived secrets --
> and the latter category should be checking platform and versions anyway
> since they need the whole stack to be secure (if I understand Ted Ts'o's
> email right).
>
> My assumption is also that the flags should be hints (perhaps only
> relevant on Linux) -- platforms that can't perform the action desired
> (because their system's API doesn't support it) would just do their default
> action, assuming the system API does the best it can.
>
>
> The problem is that someone writing software that does
> os.urandom(block=True) or os.urandom(exception=True) which gets some bytes
> doesn't know if it got back cryptographically secure random because Python
> called getrandom() or if it got back cryptographically secure random
> because it called /dev/urandom and that gave it secure random because it's
> on a platform that defines that as always returning secure or because it's
> on Linux and the urandom pool is initialized or if it got back some random
> bytes that are not cryptographically secure because it fell back to reading
> /dev/urandom on Linux prior to the pool being initialized.
>
> The "silently does the wrong thing, even though I explicitly asked for it
> to do something different" is something that I would consider to be a footgun
> and footguns in security sensitive code make me really worried.
>
> Outside of the security side of things, if someone goes "Ok I need some
> random bytes and I need to make sure it doesn't block", then doing
> ``os.urandom(block=False, exception=False)`` isn't going to make sure that
> it doesn't block except on Linux.
>
> In other words, it's basically impossible to ensure you get the behavior
> you want with these flags which I feel like will make everyone unhappy
> (both the people who want to ensure non-blocking, and the people who want
> to ensure cryptographically secure). These flags are an attractive nuisance
> that look like they do the right thing, but silently don't.
>
> Meanwhile if we have os.urandom that reads from /dev/urandom and
> os.getrandom() which reads from blocking random, then we make it both
> easier to ensure you get the behavior you want, either by using the
> function that best suits your needs:
>
> * If you just want the best the OS has to offer, os.getrandom falling back
> to os.urandom.
> * If you want to ensure you get cryptographically secure bytes,
> os.getrandom, falling back to os.urandom on non Linux platforms and
> erroring on Linux.
> * If you want to *ensure* that there's no blocking, then os.urandom on
> Linux (or os.urandom wrapped with timeout code anywhere else, as that's the
> only way to ensure not blocking cross platform).
> * If you just don't care, YOLO it up with either os.urandom or
> os.getrandom or random.random.
>

I'm +1 w/ what Donald is suggesting here and below w/ proper documentation
in both the secrets and random modules to explain when to use what (i.e.
secrets for crypto-no-matter-what randomness, random for quick-and-dirty
randomness). This also includes any appropriate decoupling of the secrets
module from the random module so there's no reliance on the random module
in the docs of the secrets module beyond "this class has the same
interface", and letting the secrets module be the way people generally get
crypto randomness.
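
As a rough illustration of that split, assuming the secrets module ships with the token_hex()/token_bytes() helpers and the SystemRandom class described in PEP 506 (the variable names below are made up for the example):

```python
import random
import secrets

# secrets: values that must be unpredictable (tokens, keys, reset links).
session_token = secrets.token_hex(16)   # 16 random bytes as 32 hex characters
api_key = secrets.token_bytes(32)       # 32 raw bytes

# "This class has the same interface": secrets.SystemRandom offers the
# familiar random.Random methods, backed by the OS CSPRNG.
secure_rng = secrets.SystemRandom()
secure_roll = secure_rng.randint(1, 6)

# random: quick-and-dirty, seedable and reproducible, not for secrets.
quick_rng = random.Random(2016)
quick_roll = quick_rng.randint(1, 6)
```

Code that only ever imports secrets for security-sensitive values never has to reason about which os-level function sits underneath.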

-Brett

>
>
> I think the problem with making os.urandom() go back to always reading
> /dev/urandom is that we've come to rely on it on all platforms, so we've
> passed that station.
>
>
> Sorry, to be more specific I meant the 3.4 behavior, which was
> open("/dev/urandom").read() on *nix and CryptGenRandom on Windows.
>
>
> --
>
> Donald Stufft

From guido at python.org  Sat Jun 11 15:40:06 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 12:40:06 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
Message-ID: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>

On Sat, Jun 11, 2016 at 11:30 AM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 11, 2016, at 1:39 PM, Guido van Rossum <guido at python.org> wrote:
>
> Is the feature detection desire about being able to write code that runs
> on older Python versions or for platforms that just don't have getrandom()?
>
> My assumption was that nobody would actually use these flags except the
> secrets module and people writing code that generates long-lived secrets --
> and the latter category should be checking platform and versions anyway
> since they need the whole stack to be secure (if I understand Ted Ts'o's
> email right).
>
> My assumption is also that the flags should be hints (perhaps only
> relevant on Linux) -- platforms that can't perform the action desired
> (because their system's API doesn't support it) would just do their default
> action, assuming the system API does the best it can.
>
>
> The problem is that someone writing software that does
> os.urandom(block=True) or os.urandom(exception=True) which gets some bytes
> doesn't know if it got back cryptographically secure random because Python
> called getrandom() or if it got back cryptographically secure random
> because it called /dev/urandom and that gave it secure random because it's
> on a platform that defines that as always returning secure or because it's
> on Linux and the urandom pool is initialized or if it got back some random
> bytes that are not cryptographically secure because it fell back to reading
> /dev/urandom on Linux prior to the pool being initialized.
>
> The "silently does the wrong thing, even though I explicitly asked for it
> to do something different" is something that I would consider to be a footgun
> and footguns in security sensitive code make me really worried.
>

Yeah, but we've already established that there's a lot more upset, rhetoric
and worry than warranted by the situation.

> Outside of the security side of things, if someone goes "Ok I need some
> random bytes and I need to make sure it doesn't block", then doing
> ``os.urandom(block=False, exception=False)`` isn't going to make sure that
> it doesn't block except on Linux.
>

To people who "just want some random bytes" we should recommend the random
module.

> In other words, it's basically impossible to ensure you get the behavior
> you want with these flags which I feel like will make everyone unhappy
> (both the people who want to ensure non-blocking, and the people who want
> to ensure cryptographically secure). These flags are an attractive nuisance
> that look like they do the right thing, but silently don't.
>

OK, it looks like the flags just won't make you happy, and I'm happy to
give up on them. By default the status quo will win, and that means neither
these flags nor os.getrandom(). (But of course you can roll your own using
ctypes. :-)

> Meanwhile if we have os.urandom that reads from /dev/urandom and
> os.getrandom() which reads from blocking random, then we make it both
> easier to ensure you get the behavior you want, either by using the
> function that best suits your needs:
>
> * If you just want the best the OS has to offer, os.getrandom falling back
> to os.urandom.
>

Actually the proposal for that was the secrets module. And the secrets
module would be the only user of os.urandom(blocking=True).

> * If you want to ensure you get cryptographically secure bytes,
> os.getrandom, falling back to os.urandom on non Linux platforms and
> erroring on Linux.
>

"Erroring" doesn't sound like it satisfies the "ensure" part of the
requirement. And I don't see the advantage of os.getrandom() over the
secrets module. (Either way you have to fall back on os.urandom() to
support Python 3.5 and before.)

> * If you want to *ensure* that there's no blocking, then os.urandom on
> Linux (or os.urandom wrapped with timeout code anywhere else, as that's the
> only way to ensure not blocking cross platform).
>

That's fine with me.

> * If you just don't care, YOLO it up with either os.urandom or
> os.getrandom or random.random.
>

Now you're just taking the mickey.

>
> I think the problem with making os.urandom() go back to always reading
> /dev/urandom is that we've come to rely on it on all platforms, so we've
> passed that station.
>
>
> Sorry, to be more specific I meant the 3.4 behavior, which was
> open("/dev/urandom").read() on *nix and CryptGenRandom on Windows.
>

I am all for keeping it that way. The secrets module doesn't have to use
any of these, it can use an undocumented extension module for all I care.
Or it can use os.urandom() and trust Ted Ts'o.

-- 
--Guido van Rossum (python.org/~guido)

From larry at hastings.org  Sat Jun 11 15:53:36 2016
From: larry at hastings.org (Larry Hastings)
Date: Sat, 11 Jun 2016 12:53:36 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
Message-ID: <575C6C40.7020403@hastings.org>

On 06/11/2016 11:30 AM, Donald Stufft wrote:
> The problem is that someone writing software that does 
> os.urandom(block=True) or os.urandom(exception=True) which gets some 
> bytes doesn't know if it got back cryptographically secure random 
> because Python called getrandom() or if it got back cryptographically 
> secure random because it called /dev/urandom and that gave it secure 
> random because it's on a platform that defines that as always 
> returning secure or because it's on Linux and the urandom pool is 
> initialized or if it got back some random bytes that are not 
> cryptographically secure because it fell back to reading /dev/urandom 
> on Linux prior to the pool being initialized.

Let me jump in tangentially to say: I think os.urandom(block=True) is 
simply a bad API.  On FreeBSD and OpenBSD, /dev/urandom may block, and 
you don't have a choice.  On OS X, /dev/urandom will never block, and 
you don't have a choice.  In Victor's initial patch where he proposed 
it, the flag was accepted on all platforms but only affected its 
behavior on Linux and possibly Solaris.  I think it's bad API design to 
have a flag that seems like it would be meaningful on multiple 
platforms, but in practice is useful only in very limited 
circumstances.  If this were old code, or behavior we inherited from the 
platform and we were making the best of a bad situation, that'd be one 
thing.  But this is a proposed new API and I definitely think we can do 
better.

As I understand the proposed semantics for os.urandom(exception=True), I 
feel it falls into the same trap though not to the same degree.

Of course, both flags break backwards-compatibility if they default to 
True, and I strongly disagree with that.

It's far better in my opinion to keep the os module as a thin shell over 
platform functionality.  That makes Python's behavior more predictable 
on a platform-by-platform basis.  So I think the best approach here is 
to add os.getrandom() as a thin shell over the local getrandom() (if any).

//arry/

From brett at python.org  Sat Jun 11 16:26:39 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 11 Jun 2016 20:26:39 +0000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
Message-ID: <CAP1=2W5OmjB5_3MOkzqDY1vypQv=cGB6hOmjn9ZauPMa30qVEA@mail.gmail.com>

http://bugs.python.org/issue27288 covers updating the secrets module to use
getrandom().

http://bugs.python.org/issue27292 covers documenting the drawbacks of
os.urandom()

http://bugs.python.org/issue27293 covers documenting all of the issues
pointed out in this discussion.

Only issue I can think of that we're missing is one to track reverting
os.urandom() to 3.4 semantics (any doc updates required for the random
module?). Am I missing anything?


From donald at stufft.io  Sat Jun 11 16:48:05 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 11 Jun 2016 16:48:05 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
Message-ID: <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>

> On Jun 11, 2016, at 3:40 PM, Guido van Rossum <guido at python.org> wrote:
> 
> On Sat, Jun 11, 2016 at 11:30 AM, Donald Stufft <donald at stufft.io <mailto:donald at stufft.io>> wrote:
> 
>> On Jun 11, 2016, at 1:39 PM, Guido van Rossum <guido at python.org <mailto:guido at python.org>> wrote:
>> 
>> Is the feature detection desire about being able to write code that runs on older Python versions or for platforms that just don't have getrandom()?
>> 
>> My assumption was that nobody would actually use these flags except the secrets module and people writing code that generates long-lived secrets -- and the latter category should be checking platform and versions anyway since they need the whole stack to be secure (if I understand Ted Ts'o's email right).
>> 
>> My assumption is also that the flags should be hints (perhaps only relevant on Linux) -- platforms that can't perform the action desired (because their system's API doesn't support it) would just do their default action, assuming the system API does the best it can.
> 
> The problem is that someone writing software that does os.urandom(block=True) or os.urandom(exception=True) which gets some bytes doesn't know if it got back cryptographically secure random because Python called getrandom() or if it got back cryptographically secure random because it called /dev/urandom and that gave it secure random because it's on a platform that defines that as always returning secure or because it's on Linux and the urandom pool is initialized or if it got back some random bytes that are not cryptographically secure because it fell back to reading /dev/urandom on Linux prior to the pool being initialized.
> 
> The "silently does the wrong thing, even though I explicitly asked for it to do something different" is something that I would consider to be a footgun and footguns in security sensitive code make me really worried.
> 
> Yeah, but we've already established that there's a lot more upset, rhetoric and worry than warranted by the situation.

Have we? There are real, documented security failures in the wild because of /dev/urandom's behavior. This isn't just a theoretical problem; it actually has had consequences in real life, and those same consequences could just as easily have happened to Python. (In one of the cases that most recently comes to mind it was a C program, but that's not really relevant because the same problem would have happened if they had written it in Python using os.urandom in 3.4, but not in 3.5.0 or 3.5.1.)

>  
> Outside of the security side of things, if someone goes "Ok I need some random bytes and I need to make sure it doesn't block", then doing ``os.urandom(block=False, exception=False)`` isn't going to make sure that it doesn't block except on Linux.
> 
> To people who "just want some random bytes" we should recommend the random module.
>  
> In other words, it's basically impossible to ensure you get the behavior you want with these flags which I feel like will make everyone unhappy (both the people who want to ensure non-blocking, and the people who want to ensure cryptographically secure). These flags are an attractive nuisance that look like they do the right thing, but silently don't.
> 
> OK, it looks like the flags just won't make you happy, and I'm happy to give up on them. By default the status quo will win, and that means neither these flags nor os.getrandom(). (But of course you can roll your own using ctypes. :-)
>  
> Meanwhile if we have os.urandom that reads from /dev/urandom and os.getrandom() which reads from blocking random, then we make it both easier to ensure you get the behavior you want, either by using the function that best suits your needs:
> 
> * If you just want the best the OS has to offer, os.getrandom falling back to os.urandom.
> 
> Actually the proposal for that was the secrets module. And the secrets module would be the only user of os.urandom(blocking=True).

I'm fine if this lives in the secrets module. Steven asked for it to be an os function so that secrets.py could continue to be pure Python.

>  
> * If you want to ensure you get cryptographically secure bytes, os.getrandom, falling back to os.urandom on non Linux platforms and erroring on Linux.
> 
> "Erroring" doesn't sound like it satisfies the "ensure" part of the requirement. And I don't see the advantage of os.getrandom() over the secrets module. (Either way you have to fall back on os.urandom() to support Python 3.5 and before.)

Erroring does satisfy the ensure part, because if it's not possible to get cryptographically secure bytes then the only option is to error if you want to be ensured of cryptographically secure bytes.

It's a bit like if you did open("somefile.txt"): it's reasonable to say that we should ensure that open("somefile.txt") actually opens ./somefile.txt, and doesn't randomly open a different file if ./somefile.txt doesn't exist; if it can't open ./somefile.txt it should error. If I *need* cryptographically secure random bytes, and I'm on a platform that doesn't provide those, then erroring is often the correct behavior. This is such an important thing that OS X will flat out kernel panic and refuse to boot if it can't ensure that it can give people cryptographically secure random bytes.

It's a fairly simple decision tree. I go "hey, give me cryptographically secure random bytes, and only cryptographically secure random bytes". If it cannot give them to me because the APIs of the system cannot guarantee they are cryptographically secure then there are only two options: either A) it is explicit about its inability to do this and raises an error, or B) it does something completely different than what I asked it to do and pretends that it's what I wanted.
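
A minimal sketch of option A in Python, under stated assumptions: it presumes an interpreter that exposes os.getrandom() (an assumption here, since that function is only being proposed in this thread), and the names secure_bytes and InsecureRandomError are hypothetical, made up for illustration. The Linux-versus-elsewhere split follows the "fall back to os.urandom on non Linux platforms and error on Linux" rule quoted above; it is not a vetted security policy.

```python
import os
import sys


class InsecureRandomError(RuntimeError):
    """Raised when secure random bytes cannot be guaranteed (hypothetical name)."""


def secure_bytes(n):
    """Return n cryptographically secure bytes, or raise; never degrade silently."""
    getrandom = getattr(os, "getrandom", None)  # assumed API; may be absent
    if getrandom is not None:
        buf = b""
        while len(buf) < n:
            # flags=0: block until the kernel pool is initialized.
            # Loop because a single call may return fewer bytes than requested.
            buf += getrandom(n - len(buf), 0)
        return buf
    if sys.platform.startswith("linux"):
        # Option A: be explicit about the inability instead of guessing.
        raise InsecureRandomError(
            "getrandom() unavailable; /dev/urandom may not be seeded yet"
        )
    # Non-Linux platforms: rely on the platform's urandom, as suggested above.
    return os.urandom(n)
```

The point of the sketch is only that the caller either gets what it asked for or sees an exception; there is no third, silent path.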

>  
> * If you want to *ensure* that there's no blocking, then os.urandom on Linux (or os.urandom wrapped with timeout code anywhere else, as that's the only way to ensure not blocking cross platform).
> 
> That's fine with me.
>  
> * If you just don't care, YOLO it up with either os.urandom or os.getrandom or random.random.
> 
> Now you're just taking the mickey.

No I'm not. random.Random is such a use case: it wants to seed with bytes as secure as it can get its hands on, but it doesn't care if it falls back to insecure bytes when it's not possible to get secure ones. This code even falls back to using time as a seed if all else fails.
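
A rough illustration of that "best effort" seeding pattern, assuming nothing beyond the standard library. This is not CPython's actual seeding code; best_effort_seed is a hypothetical helper, and the 32-byte read and the clock scaling are arbitrary choices for the sketch.

```python
import os
import random
import time


def best_effort_seed():
    """Prefer OS randomness; fall back to the clock if none is available."""
    try:
        return int.from_bytes(os.urandom(32), "big")
    except NotImplementedError:
        # os.urandom() raises NotImplementedError when no randomness
        # source is found; a time-based seed is insecure but still usable
        # for simulations and similar non-security purposes.
        return int(time.time() * 256)


rng = random.Random(best_effort_seed())
```

A caller like this explicitly accepts weaker bytes, which is why it is a different use case from code that must fail when secure bytes are unavailable.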

>  
>> 
>> I think the problem with making os.urandom() go back to always reading /dev/urandom is that we've come to rely on it on all platforms, so we've passed that station.
>> 
> 
> Sorry, to be more specific I meant the 3.4 behavior, which was open("/dev/urandom").read() on *nix and CryptGenRandom on Windows.
> 
> I am all for keeping it that way. The secrets module doesn't have to use any of these, it can use an undocumented extension module for all I care. Or it can use os.urandom() and trust Ted Ts'o.
> 
> -- 
> --Guido van Rossum (python.org/~guido <http://python.org/~guido>)

--
Donald Stufft


From donald at stufft.io  Sat Jun 11 16:48:48 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 11 Jun 2016 16:48:48 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP1=2W5OmjB5_3MOkzqDY1vypQv=cGB6hOmjn9ZauPMa30qVEA@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <CAP1=2W5OmjB5_3MOkzqDY1vypQv=cGB6hOmjn9ZauPMa30qVEA@mail.gmail.com>
Message-ID: <43B25EC7-FE1D-434C-84B4-58AB9F8D703C@stufft.io>

> On Jun 11, 2016, at 4:26 PM, Brett Cannon <brett at python.org> wrote:
> 
> Only issue I can think of that we're missing is one to track reverting os.urandom() to 3.4 semantics (any doc updates required for the random module?). Am I missing anything?

It's already been reverted to 3.4 semantics (well, it will try to use getrandom(GRND_NONBLOCK) but falls back to /dev/urandom if that would have blocked).
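
Expressed in Python rather than C, the behavior just described looks roughly like the sketch below. It assumes an interpreter that exposes os.getrandom() and os.GRND_NONBLOCK (an assumption; the real logic lives in the interpreter's C code) and a Unix-like system with /dev/urandom. The name urandom_sketch is hypothetical.

```python
import os


def urandom_sketch(n):
    """Try a non-blocking getrandom(); read /dev/urandom if that is not possible."""
    getrandom = getattr(os, "getrandom", None)  # assumed API; may be absent
    if getrandom is not None:
        try:
            buf = b""
            while len(buf) < n:
                # GRND_NONBLOCK: fail instead of waiting for the pool
                buf += getrandom(n - len(buf), os.GRND_NONBLOCK)
            return buf
        except OSError:
            # Covers BlockingIOError (pool not yet initialized) as well as
            # kernels that lack the syscall; fall through to the device.
            pass
    with open("/dev/urandom", "rb") as dev:
        return dev.read(n)
```

Note that the fallback path is exactly the case under discussion: on Linux it can return bytes before the pool is initialized.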

--
Donald Stufft


From guido at python.org  Sat Jun 11 16:48:49 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 13:48:49 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575C6C40.7020403@hastings.org>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <575C6C40.7020403@hastings.org>
Message-ID: <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>

On Sat, Jun 11, 2016 at 12:53 PM, Larry Hastings <larry at hastings.org> wrote:

>
> On 06/11/2016 11:30 AM, Donald Stufft wrote:
>
> The problem is that someone writing software that does
> os.urandom(block=True) or os.urandom(exception=True) which gets some bytes
> doesn't know if it got back cryptographically secure random because Python
> called getrandom() or if it got back cryptographically secure random
> because it called /dev/urandom and that gave it secure random because it's
> on a platform that defines that as always returning secure or because it's
> on Linux and the urandom pool is initialized or if it got back some random
> bytes that are not cryptographically secure because it fell back to reading
> /dev/urandom on Linux prior to the pool being initialized.
>
>
> Let me jump in tangentially to say: I think os.urandom(block=True) is
> simply a bad API.  On FreeBSD and OpenBSD, /dev/urandom may block, and you
> don't have a choice.  On OS X, /dev/urandom will never block, and you don't
> have a choice.  In Victor's initial patch where he proposed it, the flag
> was accepted on all platforms but only affected its behavior on Linux and
> possibly Solaris.  I think it's bad API design to have a flag that seems
> like it would be meaningful on multiple platforms, but in practice is
> useful only in very limited circumstances.  If this were old code, or
> behavior we inherited from the platform and we were making the best of a
> bad situation, that'd be one thing.  But this is a proposed new API and I
> definitely think we can do better.
>
> As I understand the proposed semantics for os.urandom(exception=True), I
> feel it falls into the same trap though not to the same degree.
>
> Of course, both flags break backwards-compatibility if they default to
> True, and I strongly disagree with that.
>
> It's far better in my opinion to keep the os module as a thin shell over
> platform functionality.  That makes Python's behavior more predictable on a
> platform-by-platform basis.  So I think the best approach here is to add
> os.getrandom() as a thin shell over the local getrandom() (if any).
>

OK, the flags are unpopular, so let's forget about them.

But I find an os.getrandom() that only exists on those (few?) platforms
that support it a nuisance too -- this just encourages cargo cult code
that's unnecessarily complicated and believed to be secure without anybody
ever verifying.

I'd like to consider what people freak out about.

- You could freak out about blocking

- You could freak out about getting slightly less random bits

- You could freak out about supporting Python 3.5 and earlier

- You could freak out about supporting all platforms

You could also freak out about combinations of the above, but that gets
complicated and you should probably consider that you're over-constraining
matters. If you freak out about all at once (or both the first and the
second bullet) you should consider a career change.

If you don't freak out about any of these (meaning you're happy with Python
3.6+) you should use the secrets module.

If you freak out about support for older Python versions, try the secrets
module first and fall back to os.urandom() -- there really isn't any other
choice.
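
For concreteness, that fallback can be sketched as below. It assumes the secrets module ships in 3.6 with a token_bytes() function as specified in PEP 506; the replacement defined in the except branch is a hypothetical stand-in with the same call shape, not part of any library.

```python
try:
    from secrets import token_bytes  # Python 3.6+
except ImportError:
    import os

    def token_bytes(nbytes=32):
        """Fallback for older Pythons: same call shape, backed by os.urandom()."""
        return os.urandom(nbytes)


session_key = token_bytes(16)
```

Callers only ever see token_bytes(), so the version check stays in one place.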

If you freak out about getting slightly less random bits you should
probably do a complete security assessment of your entire stack and fix the
OS and Python version, and use the best you can get for that combination.
You may not want to rely on the standard library at all.

If you freak out about blocking you're probably on a specific platform, and
if that platform is Linux, you're in luck: use os.urandom() and avoid
Python 3.5.0 and 3.5.1. On other platforms you're out of luck.

So I still don't see why we need os.getrandom() -- it has nothing to
recommend it over the secrets module (since both won't happen before 3.6).

So what should the secrets module use? Let's make that part an extension
module.

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Sat Jun 11 17:16:04 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 11 Jun 2016 17:16:04 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <575C6C40.7020403@hastings.org>
 <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
Message-ID: <5AF510E5-5DE4-4FF0-9D3E-916987627463@stufft.io>

> On Jun 11, 2016, at 4:48 PM, Guido van Rossum <guido at python.org> wrote:
> 
> But I find an os.getrandom() that only exists on those (few?) platforms that support it a nuisance too -- this just encourages cargo cult code that's unnecessarily complicated and believed to be secure without anybody ever verifying.

Well, new enough Linux has getrandom(0), OpenBSD has getentropy(), Solaris has getrandom(), and Windows has CryptGenRandom, all of which make it possible (or it's the only way to invoke it) to get cryptographically secure random bytes or block, with no in-between. So it'd likely be possible to have os.getrandom() with blocking semantics and no FD on all of the most popular platforms we support.

If we relax the no-FD requirement, then FreeBSD and OS X also have /dev/random (or /dev/urandom; it's the same thing), which will ensure that you get cryptographically secure random bytes.

--
Donald Stufft


From guido at python.org  Sat Jun 11 17:16:21 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 14:16:21 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
Message-ID: <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>

On Sat, Jun 11, 2016 at 1:48 PM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 11, 2016, at 3:40 PM, Guido van Rossum <guido at python.org> wrote:
>
> Yeah, but we've already established that there's a lot more upset,
> rhetoric and worry than warranted by the situation.
>
>
> Have we? There are real, documented security failures in the wild because
> of /dev/urandom's behavior. This isn't just a theoretical problem, it
> actually has had consequences in real life, and those same consequences
> could just as easily have happened to Python (in one of the cases that most
> recently comes to mind it was a C program, but that's not really relevant
> because the same problem would have happened if they had written in Python
> using os.urandom in 3.4 but not in 3.5.0 or 3.5.1).
>

Actually it's not clear to me at all that it could have happened to Python.
(Wasn't it an embedded system?)

> Actually the proposal for that was the secrets module. And the secrets
> module would be the only user of os.urandom(blocking=True).
>
>
> I'm fine if this lives in the secrets module; Steven asked for it to be an
> os function so that secrets.py could continue to be pure Python.
>

The main thing that I want to avoid is that people start cargo-culting
whatever the secrets module uses rather than just using the secrets module.
Having it redundantly available as os.getrandom() is just begging for
people to show off how much they know about writing secure code.

>
>
>> * If you want to ensure you get cryptographically secure bytes,
>> os.getrandom, falling back to os.urandom on non Linux platforms and
>> erroring on Linux.
>>
>
> "Erroring" doesn't sound like it satisfies the "ensure" part of the
> requirement. And I don't see the advantage of os.getrandom() over the
> secrets module. (Either way you have to fall back on os.urandom() to
> support Python 3.5 and before.)
>
>
> Erroring does satisfy the ensure part, because if it's not possible to get
> cryptographically secure bytes then the only option is to error if you want
> to be ensured of cryptographically secure bytes.
>

> It's a bit like if you did open("somefile.txt"): it's reasonable to say
> that we should ensure that open("somefile.txt") actually opens
> ./somefile.txt, and doesn't randomly open a different file if
> ./somefile.txt doesn't exist; if it can't open ./somefile.txt it should
> error. If I *need* cryptographically secure random bytes, and I'm on a
> platform that doesn't provide those, then erroring is oftentimes the
> correct behavior. This is such an important thing that OS X will flat out
> kernel panic and refuse to boot if it can't ensure that it can give people
> cryptographically secure random bytes.
>

But what is a Python script going to do with that error? IIUC this kind of
error would only happen very early during boot time, and rarely, so the
most likely outcome is a hard-to-debug mystery failure.

> It's a fairly simple decision tree. I go "hey, give me cryptographically
> secure random bytes, and only cryptographically secure random bytes". If it
> cannot give them to me because the APIs of the system cannot guarantee they
> are cryptographically secure then there are only two options: either A) it
> is explicit about its inability to do this and raises an error or B) it
> does something completely different than what I asked it to do and pretends
> that it's what I wanted.
>

I really don't believe that there is only one kind of cryptographically
secure random bytes. There are many different applications (use cases) of
randomness and they need different behaviors. (If it was simple we wouldn't
still be arguing. :-)

>
>
>> * If you want to *ensure* that there's no blocking, then os.urandom on
>> Linux (or os.urandom wrapped with timeout code anywhere else, as that's the
>> only way to ensure not blocking cross platform).
>>
>
> That's fine with me.
>
>
>> * If you just don't care, YOLO it up with either os.urandom or
>> os.getrandom or random.random.
>>
>
> Now you're just taking the mickey.
>
>
> No I'm not; random.Random is such a use case where it wants to seed with
> as secure of bytes as it can get its hands on, but it doesn't care if it
> falls back to insecure bytes if it's not possible to get secure bytes. This
> code even falls back to using time as a seed if all else fails.
>

Fair enough. The hash randomization is the other case I suppose (since not
running any Python code at all isn't an option, and neither is waiting
indefinitely before the user's code gets control).

It does show the point that there are different use cases with different
needs. But I think the stdlib should limit the choices.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Sat Jun 11 17:24:40 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 14:24:40 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <5AF510E5-5DE4-4FF0-9D3E-916987627463@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <575C6C40.7020403@hastings.org>
 <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
 <5AF510E5-5DE4-4FF0-9D3E-916987627463@stufft.io>
Message-ID: <CAP7+vJ+_49i4TutHjMdC6AMuQKSPmi5_+_8TZommeSN7pCT0kA@mail.gmail.com>

On Sat, Jun 11, 2016 at 2:16 PM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 11, 2016, at 4:48 PM, Guido van Rossum <guido at python.org> wrote:
>
> But I find an os.getrandom() that only exists on those (few?) platforms
> that support it a nuisance too -- this just encourages cargo cult code
> that's unnecessarily complicated and believed to be secure without anybody
> ever verifying.
>
>
>
> Well, new enough Linux has getrandom(0), OpenBSD has getentropy(), Solaris
> has getrandom(), and Windows has CryptGenRandom, all of which make it
> possible (or it's the only way to invoke it) to get cryptographically secure
> random bytes or block, with no in-between. So it'd likely be possible to have
> os.getrandom() with blocking semantics and no FD on all of the most popular
> platforms we support.
>
> If we relax the no-FD requirement, then FreeBSD and OS X also have
> /dev/random (or /dev/urandom; it's the same thing), which will ensure that
> you get cryptographically secure random bytes.
>

OK, so we should implement the best we can do for the secrets module, and
leave os.urandom() alone. I think the requirement that the secrets module
remain pure Python has to be dropped. I'm not sure what it should do if
even blocking can't give it sufficiently strong random bytes, but I care
much less -- it's a new API and it doesn't resemble any OS function, so as
long as it is documented it should be fine.

An alternative would be to keep the secrets module linked to SystemRandom,
and improve the latter. Its link with os.urandom() is AFAIK undocumented.
Its API is clumsy but for code that needs some form of secret-ish bytes and
requires platform and Python version independence it might be better than
anything else. Then the secrets module is just what we recommend to new users
on Python 3.6.

-- 
--Guido van Rossum (python.org/~guido)

From tim.peters at gmail.com  Sat Jun 11 17:35:23 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 11 Jun 2016 16:35:23 -0500
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJ+_49i4TutHjMdC6AMuQKSPmi5_+_8TZommeSN7pCT0kA@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <575C6C40.7020403@hastings.org>
 <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
 <5AF510E5-5DE4-4FF0-9D3E-916987627463@stufft.io>
 <CAP7+vJ+_49i4TutHjMdC6AMuQKSPmi5_+_8TZommeSN7pCT0kA@mail.gmail.com>
Message-ID: <CAExdVN=5qQurbpV+s-bHO9dpEb2c1aLv2US0UFYCGV92QpE3Zw@mail.gmail.com>

[Guido]
> ...
> An alternative would be to keep the secrets module linked to SystemRandom,
> and improve the latter. Its link with os.urandom() is AFAIK undocumented. Its
> API is clumsy but for code that needs some form of secret-ish bytes and
> requires platform and Python version independence it might be better than
> anything else. Then the secrets module is just what we recommend to new users
> on Python 3.6.

There's an issue currently open about this:

    http://bugs.python.org/issue27288

The docs for SystemRandom are very brief, so people may have actually
noticed ;-) the first sentence:

    Class that uses the os.urandom() function for generating random numbers ...

IOW, "uses os.urandom()" has been one of its only advertised qualities.
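
For illustration, here is a minimal sketch of that advertised behavior using
random.SystemRandom. The variable names and the 6-digit PIN are made up for
the example, and how os.urandom() itself behaves early in boot is exactly
what this thread is debating, so treat this as a sketch rather than a
recommendation.

```python
import random

# SystemRandom draws from os.urandom() rather than the Mersenne Twister,
# so it keeps no reproducible internal state.
sysrand = random.SystemRandom()

# A 128-bit integer token (illustrative size, not a recommendation).
token = sysrand.getrandbits(128)

# A 6-digit PIN built from OS-provided randomness (hypothetical use case).
pin = "".join(sysrand.choice("0123456789") for _ in range(6))

# seed() is accepted for API compatibility but has no effect.
sysrand.seed(12345)
```

Because seed() is a no-op here, two runs will not reproduce the same token
or PIN.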

From donald at stufft.io  Sat Jun 11 17:46:29 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 11 Jun 2016 17:46:29 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org>
 <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
Message-ID: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>

> On Jun 11, 2016, at 5:16 PM, Guido van Rossum <guido at python.org> wrote:
> 
> On Sat, Jun 11, 2016 at 1:48 PM, Donald Stufft <donald at stufft.io> wrote:
> 
>> On Jun 11, 2016, at 3:40 PM, Guido van Rossum <guido at python.org> wrote:
>> 
>> Yeah, but we've already established that there's a lot more upset, rhetoric and worry than warranted by the situation.
> 
> Have we? There are real, documented security failures in the wild because of /dev/urandom's behavior. This isn't just a theoretical problem, it actually has had consequences in real life, and those same consequences could just as easily have happened to Python (in one of the cases that most recently comes to mind it was a C program, but that's not really relevant because the same problem would have happened if they had written in Python using os.urandom in 3.4 but not in 3.5.0 or 3.5.1).
> 
> Actually it's not clear to me at all that it could have happened to Python. (Wasn't it an embedded system?)

It was a Raspberry Pi that ran a shell script on boot that called ssh-keygen. That shell script could have just as easily been a Python script that called os.urandom via https://github.com/sybrenstuvel/python-rsa instead of a shell script that called ssh-keygen.

>> Actually the proposal for that was the secrets module. And the secrets module would be the only user of os.urandom(blocking=True).
> 
> I'm fine if this lives in the secrets module; Steven asked for it to be an os function so that secrets.py could continue to be pure Python.
> 
> The main thing that I want to avoid is that people start cargo-culting whatever the secrets module uses rather than just using the secrets module. Having it redundantly available as os.getrandom() is just begging for people to show off how much they know about writing secure code. 

I guess one question would be: what does the secrets module do if it's on a Linux that is too old to have getrandom(0)? Off the top of my head I can think of:

* Silently fall back to reading os.urandom and hope that it's been seeded.
* Fall back to os.urandom and hope that it's been seeded, and add a SecurityWarning or something like it to mention that it's falling back to os.urandom and it may be getting predictable random from /dev/urandom.
* Hard fail because it can't guarantee secure cryptographic random.

Of the three, I would probably suggest the second one: it doesn't let the problem happen silently, but it still "works" (where it's basically just hoping it's being called late enough that /dev/urandom has been seeded), and people can convert it to the third case using the warnings module to turn the warning into an exception.
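
The second option could be sketched roughly as follows. This is only an
illustration under stated assumptions: SecurityWarning is a hypothetical
category defined here (it is not in the stdlib), token_bytes_sketch and its
_getrandom parameter are made-up names, and the sketch warns whenever
os.getrandom is unavailable, without distinguishing Linux from platforms
where os.urandom is already fine.

```python
import os
import warnings


class SecurityWarning(RuntimeWarning):
    """Hypothetical warning category; not part of the standard library."""


def token_bytes_sketch(nbytes, _getrandom=getattr(os, "getrandom", None)):
    """Return nbytes random bytes, warning if the fallback path is used.

    _getrandom is injectable only so the fallback can be exercised;
    by default it is os.getrandom where the platform provides it.
    """
    if _getrandom is not None:
        try:
            buf = b""
            # With default flags the call blocks until the kernel pool is
            # initialized; it may return fewer bytes than requested.
            while len(buf) < nbytes:
                buf += _getrandom(nbytes - len(buf))
            return buf
        except OSError:
            pass  # e.g. the running kernel lacks the syscall
    warnings.warn(
        "falling back to os.urandom(); the pool may not be seeded yet",
        SecurityWarning,
        stacklevel=2,
    )
    return os.urandom(nbytes)
```

Callers who want the third behavior (hard fail) can opt in with
warnings.simplefilter("error", SecurityWarning), which turns the warning
into an exception.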

>>  
>> * If you want to ensure you get cryptographically secure bytes, os.getrandom, falling back to os.urandom on non-Linux platforms and erroring on Linux.
>> 
>> "Erroring" doesn't sound like it satisfies the "ensure" part of the requirement. And I don't see the advantage of os.getrandom() over the secrets module. (Either way you have to fall back on os.urandom() to support Python 3.5 and before.)
> 
> Erroring does satisfy the ensure part, because if it's not possible to get cryptographically secure bytes then the only option is to error if you want to be ensured of cryptographically secure bytes.
> 
> It's a bit like if you did open("somefile.txt"): it's reasonable to say that we should ensure that open("somefile.txt") actually opens ./somefile.txt, and doesn't randomly open a different file if ./somefile.txt doesn't exist; if it can't open ./somefile.txt it should error. If I *need* cryptographically secure random bytes, and I'm on a platform that doesn't provide those, then erroring is oftentimes the correct behavior. This is such an important thing that OS X will flat out kernel panic and refuse to boot if it can't ensure that it can give people cryptographically secure random bytes.
> 
> But what is a Python script going to do with that error? IIUC this kind of error would only happen very early during boot time, and rarely, so the most likely outcome is a hard-to-debug mystery failure.

Depends on why they're calling it, which is sort of the underlying problem I suspect with why there isn't agreement about what the right default behavior is. The correct answer for some application might be to hard fail and wait for the operator to fix the environment that it's running in. It depends on how important the thing that is getting this random is.

One example: If I was writing a communication platform for people who are fighting oppressive regimes or to securely discuss sexual orientation in more dangerous parts of the world, I would want to make this program hard fail if it couldn't ensure that it was using an interface that ensured cryptographic random, because the alternative is predictable numbers and someone possibly being arrested or executed. I know that's a bit of an extreme edge case, but it's also the kind of thing that people might use Python for, where the predictability of the CSPRNG it's using is of the utmost importance.

For other things, the importance will fall somewhere between best effort being good enough and predictable random numbers being catastrophic.

>  
> It's a fairly simple decision tree. I go "hey, give me cryptographically secure random bytes, and only cryptographically secure random bytes". If it cannot give them to me because the APIs of the system cannot guarantee they are cryptographically secure then there are only two options: either A) it is explicit about its inability to do this and raises an error or B) it does something completely different than what I asked it to do and pretends that it's what I wanted.
> 
> I really don't believe that there is only one kind of cryptographically secure random bytes. There are many different applications (use cases) of randomness and they need different behaviors. (If it was simple we wouldn't still be arguing. :-) 

I mean for a CSPRNG there's only one really important property: can an attacker predict the next byte? Any other property for a CSPRNG doesn't really matter. Other, non-cryptographic kinds of PRNGs want other behaviors (equidistribution, etc.) but those aren't cryptographically secure (nor do they need to be).

>>  
>> * If you want to *ensure* that there's no blocking, then os.urandom on Linux (or os.urandom wrapped with timeout code anywhere else, as that's the only way to ensure not blocking cross platform).
>> 
>> That's fine with me.
>>  
>> * If you just don't care, YOLO it up with either os.urandom or os.getrandom or random.random.
>> 
>> Now you're just taking the mickey.
> 
> No I'm not; random.Random is such a use case where it wants to seed with as secure of bytes as it can get its hands on, but it doesn't care if it falls back to insecure bytes if it's not possible to get secure bytes. This code even falls back to using time as a seed if all else fails.
> 
> Fair enough. The hash randomization is the other case I suppose (since not running any Python code at all isn't an option, and neither is waiting indefinitely before the user's code gets control).
> 
> It does show the point that there are different use cases with different needs. But I think the stdlib should limit the choices.
> 
> -- 
> --Guido van Rossum (python.org/~guido)

--
Donald Stufft


From leewangzhong+python at gmail.com  Sat Jun 11 17:53:34 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Sat, 11 Jun 2016 17:53:34 -0400
Subject: [Python-Dev] PEP 468
In-Reply-To: <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
Message-ID: <CAB_e7iwrge9+HNg7XoP_6bGWxSthzQu4AThL7grE3G_e5baibA@mail.gmail.com>

I am. I was just wondering if there was an in-progress effort I should be
looking at, because I am interested in extensions to it.

P.S.: If anyone is missing the relevance, Raymond Hettinger's compact dicts
are inherently ordered until a delitem happens.[1] That could be "good
enough" for many purposes, including kwargs and class definition. If
CPython implements efficient compact dicts, it would be easier to propose
order-preserving (or initially-order-preserving) dicts in some places in
the standard.

[1] Whether delitem preserves order depends on whether you want to allow
gaps in your compact entry table. PyPy implemented compact dicts and
chose(?) to make dicts ordered.

On Saturday, June 11, 2016, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> On Fri, Jun 10, 2016 at 11:54 AM, Franklin? Lee
> <leewangzhong+python at gmail.com> wrote:
> > Eric, have you any work in progress on compact dicts?
>
> Nope.  I presume you are talking about the proposal Raymond made a while back.
>
> -eric
>

From larry at hastings.org  Sat Jun 11 17:58:07 2016
From: larry at hastings.org (Larry Hastings)
Date: Sat, 11 Jun 2016 14:58:07 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <575C6C40.7020403@hastings.org>
 <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
Message-ID: <575C896F.3060201@hastings.org>

On 06/11/2016 01:48 PM, Guido van Rossum wrote:
> So I still don't see why we need os.getrandom() -- it has nothing to 
> recommend it over the secrets module (since both won't happen before 3.6).

I have two reasons, neither of which I think are necessarily all that 
persuasive.  Don't consider this an argument--merely some observations.

First, simply as a practical matter: the secrets module is currently 
pure Python.  ISTM that the os module is where we put miscellaneous bits 
of os functionality; getrandom() definitely falls into that category.  
Rather than adding a new _secrets module or whatever it seemed easiest 
just to add it there.

Second, I'd put this under the "consenting adults" rule.  Clearly 
cryptography is a contentious subject with sharply differing opinions.  
There are many, many cryptography libraries available on PyPI; perhaps 
those libraries would like to use getrandom(), or /dev/urandom, or even 
getentropy(), in a way different than how secrets does it.  My thinking 
is, the os module should provide platform support, the secrets module 
should be our codified best-practices, and we encourage everyone to use 
secrets.  I'd go so far as to add that recommendation to the doc *and* 
the docstrings of os.urandom(), random.SystemRandom, and os.getrandom() 
(and os.getentropy()) if we add it.  But by providing the OS 
functionality in a neutral way we allow external cryptographers to write 
what *they* view as best-practices code without wading into 
implementation details of secrets, or using ctypes, or whatnot.
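
As a sketch of what such external code might do with a neutral os-level
wrapper: the helper below probes whether the kernel pool is initialized
without blocking. It assumes the Python 3.6+ os.getrandom() and
os.GRND_NONBLOCK names (Linux only); entropy_ready is a made-up helper,
not an existing API.

```python
import os


def entropy_ready():
    """Report whether the kernel random pool looks initialized.

    Returns True or False when the platform can tell us, and None when
    os.getrandom() is unavailable (non-Linux, or an older Python).
    """
    getrandom = getattr(os, "getrandom", None)
    nonblock = getattr(os, "GRND_NONBLOCK", None)
    if getrandom is None or nonblock is None:
        return None
    try:
        # With GRND_NONBLOCK the call raises instead of waiting when the
        # pool has not been initialized yet.
        getrandom(1, nonblock)
    except BlockingIOError:
        return False
    except OSError:
        return None  # e.g. the running kernel lacks the syscall
    return True
```

A library could use the three-way result to choose between blocking,
warning, or failing, whichever its own policy calls for.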

But like I said I don't have a strong opinion.  As long as we're not 
adding mysterious flags to os.urandom() I'll probably sit the rest of 
this one out.

//arry/

From guido at python.org  Sat Jun 11 18:44:36 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 11 Jun 2016 15:44:36 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <575C896F.3060201@hastings.org>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <575C6C40.7020403@hastings.org>
 <CAP7+vJ+9cGdOMaa+zKX4GxSppPxtXaedvAVWkWrB6e7o00bL1A@mail.gmail.com>
 <575C896F.3060201@hastings.org>
Message-ID: <CAP7+vJ+zMp7D5QM4izbi8fvOe_b0goj_wLeDUomxzpJNja9TYw@mail.gmail.com>

Fortunately, 3.6 feature freeze isn't until September, so we can all cool
off and figure out the best way forward. I'm going on vacation for a week,
and after sending this I'm going to mute the thread so I won't be pulled
into it while I'm supposed to be relaxing.

-- 
--Guido van Rossum (python.org/~guido)

From random832 at fastmail.com  Sat Jun 11 19:43:18 2016
From: random832 at fastmail.com (Random832)
Date: Sat, 11 Jun 2016 19:43:18 -0400
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <20160611014549.GK27919@ando.pearwood.info>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <CAK1QoopjYefq0+M_ROyOKA7XEbyNhGfYJt1Lh_GUoHJ=emejPw@mail.gmail.com>
 <CAMpsgwbCKB17197Qgw50aqT6u4uF6FzbpAFqchG9zqprZ=XM+w@mail.gmail.com>
 <20160611014549.GK27919@ando.pearwood.info>
Message-ID: <1465688598.1078301.634935153.258EE6BF@webmail.messagingengine.com>

On Fri, Jun 10, 2016, at 21:45, Steven D'Aprano wrote:
> If you express your performances as speeds (as "calculations per 
> second") then the harmonic mean is the right way to average them.

That's true insofar as you get the same result as if you were to take
the arithmetic mean of the times and then convert from that to
calculations per second. Is there any other particular basis for
considering it "right"?
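
The equivalence is easy to check numerically. In this small sketch the
timings are made-up numbers, and statistics.harmonic_mean assumes
Python 3.6 or later.

```python
from statistics import harmonic_mean, mean

# Made-up timings: seconds taken per calculation in three runs.
times = [0.5, 1.0, 2.0]

# The same runs expressed as speeds (calculations per second).
speeds = [1 / t for t in times]

# Route 1: average the times arithmetically, then convert to a speed.
speed_via_times = 1 / mean(times)

# Route 2: take the harmonic mean of the speeds directly.
speed_via_speeds = harmonic_mean(speeds)

# The plain arithmetic mean of the speeds gives a different, larger value.
naive_speed = mean(speeds)
```

Both routes give the same figure (up to floating-point rounding), while the
arithmetic mean of the speeds overstates it.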

From stephen at xemacs.org  Sat Jun 11 20:16:44 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 12 Jun 2016 09:16:44 +0900
Subject: [Python-Dev] writing to /dev/*random [was: BDFL ruling request:
 should we block ...]
In-Reply-To: <20160611082437.GN27919@ando.pearwood.info>
References: <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
 <20160611082437.GN27919@ando.pearwood.info>
Message-ID: <22364.43500.337805.137220@turnbull.sk.tsukuba.ac.jp>

This is related to David Mertz's request for backward compatible
initialization, not to the BDFL decision.

Steven D'Aprano writes:

 > I don't think that's something which the Python interpreter ought to do 
 > for you, but you can write to /dev/urandom or /dev/random (both keep 
 > their own, separate, entropy pools):
 > 
 > open("/dev/urandom", "w").write("hello world")

This fails for unprivileged users on Mac.  I'm not sure what happens
on Linux; it appears to succeed, but the result wasn't what I
expected.

Also, when entropy gets low, it's not clear how additional entropy is
allocated between the /dev/random and /dev/urandom pools.
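
A slightly more defensive version of the quoted one-liner, as a sketch only:
mix_into_urandom is a made-up helper name, and the path default simply
mirrors the example above. As far as the Linux random(4) man page describes
it, data written this way is mixed into the pool but does not raise the
kernel's entropy estimate, which may be part of why the result looked
unexpected.

```python
def mix_into_urandom(data, path="/dev/urandom"):
    """Try to write bytes into the kernel random device.

    Returns True if the write succeeded and False if the device could not
    be opened or written (for example, for an unprivileged user on
    platforms that forbid it).
    """
    try:
        with open(path, "wb") as dev:
            dev.write(data)
    except OSError:
        return False
    return True
```

Checking the return value at least distinguishes "permission denied" from
"accepted", which the bare open(...).write(...) form does not.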

 > But of course there's the question of where you're going to get a
 > source of noise to write to the file. While it's (probably?) 
 > harmless to write a hard-coded string to it, I don't think it's
 > going to give you much entropy.

Use a Raspberry-Pi, or other advanced expensive<wink/> hardware.
There's no real excuse for not having a hardware generator if the Pi
has one!  I would guess you can probably get something with a USB
interface for $20 or so.
http://scruss.com/blog/2013/06/07/well-that-was-unexpected-the-raspberry-pis-hardware-random-number-generator/

From larry at hastings.org  Sat Jun 11 20:28:16 2016
From: larry at hastings.org (Larry Hastings)
Date: Sat, 11 Jun 2016 17:28:16 -0700
Subject: [Python-Dev] writing to /dev/*random [was: BDFL ruling request:
 should we block ...]
In-Reply-To: <22364.43500.337805.137220@turnbull.sk.tsukuba.ac.jp>
References: <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
 <20160611082437.GN27919@ando.pearwood.info>
 <22364.43500.337805.137220@turnbull.sk.tsukuba.ac.jp>
Message-ID: <575CACA0.7030400@hastings.org>

On 06/11/2016 05:16 PM, Stephen J. Turnbull wrote:
> Use a Raspberry-Pi, or other advanced expensive<wink/> hardware.
> There's no real excuse for not having a hardware generator if the Pi
> has one!

Intel CPUs added the RDRAND instruction as of Ivy Bridge, although 
there's an ongoing debate as to whether or not it's a suitable source of 
entropy to use for seeding urandom.

    https://en.wikipedia.org/wiki/RdRand#Reception

Wikipedia goes on to describe the very-new RDSEED instruction which 
might be more suitable.

//arry/

From tim.peters at gmail.com  Sat Jun 11 22:21:41 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 11 Jun 2016 21:21:41 -0500
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <m260th5z09.fsf@news.realpath.org>
References: <57595210.4000508@hastings.org> <njbldu$p6p$1@ger.gmane.org>
 <817C1F1A-5BCE-40C9-B148-0B4919B307EE@lukasa.co.uk>
 <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <m260th5z09.fsf@news.realpath.org>
Message-ID: <CAExdVNm4=0ob4ta93mytbiC2w4FfZKVf9Y6cj=2YPNG-=k1+EA@mail.gmail.com>

[Sebastian Krause]
> ...
> Ideally I would only want to use the random module for
> non-secure and (in 3.6) the secrets module (which could block) for
> secure random data and never bother with os.urandom (and knowing how
> it behaves). But then those modules should probably get new
> functions to directly return bytes.

`secrets.token_bytes()` does just that, and other token_XXX()
functions return bytes too but with different spellings (e.g., if you
want, with the byte values represented as ASCII hex digits).

I believe everyone agrees token_bytes() will potentially block in 3.6
(along with all the other `secrets` facilities) on platforms
supporting getrandom().

You're right that `random` doesn't expose such a function, and that
the closest it gets is .getrandbits() (which returns a potentially
giant int).  So far, nobody has proposed adding new functions to
`random`.
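
A minimal sketch of the calls mentioned above, assuming Python 3.6 or
later, where the `secrets` module is available.  Note that
`secrets.token_hex()` returns a str of hex digits rather than a bytes
object.  The helper `insecure_random_bytes` is a hypothetical name, not
part of the `random` module; it only shows how .getrandbits() can be
turned into bytes for non-secure uses.

```python
import random
import secrets

# Cryptographically strong tokens from the secrets module.
raw = secrets.token_bytes(16)      # 16 random bytes
hexed = secrets.token_hex(16)      # 16 random bytes spelled as 32 hex digits (str)


def insecure_random_bytes(n):
    """Return n pseudo-random bytes.  NOT suitable for secrets."""
    if n == 0:
        return b""
    # getrandbits() returns one (potentially giant) int; convert it to bytes.
    return random.getrandbits(8 * n).to_bytes(n, "big")
```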

From ericsnowcurrently at gmail.com  Sat Jun 11 22:37:17 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 11 Jun 2016 19:37:17 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace (round 3)
Message-ID: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>

I've updated the PEP to reflect feedback up to this point.  The
reception has been positive.  The only change to the original proposal
has been that a manually set __definition_order__ must be a tuple of
identifiers or None (rather than using the value as-is).  All other
updates to the PEP have been clarification.

Guido, at this point I believe the PEP is ready for pronouncement. *
I've included the most recent copy of the text below.  Thanks.

-eric

==============================

PEP: 520
Title: Ordered Class Definition Namespace
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 7-Jun-2016
Python-Version: 3.6
Post-History: 7-Jun-2016

Abstract
========

When a class is defined using a ``class`` statement, the class body is
executed within a namespace.  After the execution completes, that
namespace is copied into a new ``dict`` and the original definition
namespace is discarded.  The new copy is stored away as the class's
namespace and is exposed as ``__dict__`` through a read-only proxy.

This PEP changes the default class definition namespace to ``OrderedDict``.
The long-lived class namespace (``__dict__``) will remain a ``dict``.
Furthermore, the order in which the attributes are defined in each class
body will now be preserved in the ``__definition_order__`` attribute of
the class.  This allows introspection of the original definition order,
e.g. by class decorators.

Motivation
==========

Currently the namespace used during execution of a class body defaults
to ``dict``.  If the metaclass defines ``__prepare__()`` then the result
of calling it is used.  Thus, before this PEP, if you needed your class
definition namespace to be ``OrderedDict`` you had to use a metaclass.

Metaclasses introduce an extra level of complexity to code and in some
cases (e.g. conflicts) are a problem.  So reducing the need for them is
worth doing when the opportunity presents itself.  Given that we now have
a C implementation of ``OrderedDict`` and that ``OrderedDict`` is the
common use case for ``__prepare__()``, we have such an opportunity by
defaulting to ``OrderedDict``.

The usefulness of ``OrderedDict``-by-default is greatly increased if the
definition order is directly introspectable on classes afterward,
particularly by code that is independent of the original class definition.
One of the original motivating use cases for this PEP is generic class
decorators that make use of the definition order.
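
As an illustration of that use case, here is a minimal sketch of a
generic class decorator.  The decorator name ``ordered_fields`` and the
``fields`` attribute are hypothetical.  The sketch reads
``__definition_order__`` when the interpreter provides it; otherwise it
falls back to the order of the class ``__dict__``, which assumes an
insertion-ordered class namespace (true on Python 3.6 and later).

```python
def ordered_fields(cls):
    """Record the non-dunder names of cls in definition order."""
    order = getattr(cls, "__definition_order__", None)
    if order is None:
        # Fallback: assumes vars(cls) iterates in definition order.
        order = tuple(name for name in vars(cls)
                      if not (name.startswith("__") and name.endswith("__")))
    cls.fields = tuple(order)   # hypothetical attribute name
    return cls


@ordered_fields
class Spam:
    ham = None
    eggs = 5

    def cook(self):
        return "sizzle"
```

The decorator is independent of the class definition: it needs no
metaclass and no cooperation from the decorated class.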

Changing the default class definition namespace has been discussed a
number of times, including on the mailing lists and in PEP 422 and
PEP 487 (see the References section below).

Specification
=============

* the default class *definition* namespace is now ``OrderedDict``
* the order in which class attributes are defined is preserved in the
  new ``__definition_order__`` attribute on each class
* "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored
* ``__definition_order__`` is a ``tuple`` (or ``None``)
* ``__definition_order__`` is a read-only attribute
* ``__definition_order__`` is always set:

  1. if ``__definition_order__`` is defined in the class body then it
     must be a ``tuple`` of identifiers or ``None``; any other value
     will result in ``TypeError``
  2. classes that do not have a class definition (e.g. builtins) have
     their ``__definition_order__`` set to ``None``
  3. classes for which ``__prepare__()`` returned something other than
     ``OrderedDict`` (or a subclass) have their ``__definition_order__``
     set to ``None`` (except where #1 applies)

The following code demonstrates roughly equivalent semantics for the
default behavior::

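   from collections import OrderedDict

   class Meta(type):
       @classmethod
       def __prepare__(cls, *args, **kwargs):
           # Called on the metaclass itself, hence the classmethod.
           return OrderedDict()

   class Spam(metaclass=Meta):
       ham = None
       eggs = 5
       # locals() here is the class definition namespace.
       __definition_order__ = tuple(k for k in locals()
                                    if not (k.startswith('__') and
                                            k.endswith('__')))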

Note that [pep487_] proposes a similar solution, albeit as part of a
broader proposal.

Why a tuple?
------------

Use of a tuple reflects the fact that we are exposing the order in
which attributes on the class were *defined*.  Since the definition
is already complete by the time ``definition_order__`` is set, the
content and order of the value won't be changing.  Thus we use a type
that communicates that state of immutability.

Why a read-only attribute?
--------------------------

As with the use of tuple, making ``__definition_order__`` a read-only
attribute communicates the fact that the information it represents is
complete.  Since it represents the state of a particular one-time event
(execution of the class definition body), allowing the value to be
replaced would reduce confidence that the attribute corresponds to the
original class body.

If a use case for a writable (or mutable) ``__definition_order__``
arises, the restriction may be loosened later.  Presently this seems
unlikely and furthermore it is usually best to go immutable-by-default.

Note that ``__definition_order__`` is centered on the class definition
body.  The use cases for dealing with the class namespace (``__dict__``)
post-definition are a separate matter.  ``__definition_order__`` would
be a significantly misleading name for a feature focused on more than
class definition.

See [nick_concern_] for more discussion.

Why ignore "dunder" names?
--------------------------

Names starting and ending with "__" are reserved for use by the
interpreter.  In practice they should not be relevant to the users of
``__definition_order__``.  Instead, for nearly everyone they would only
be clutter, causing the same extra work for everyone.

Why None instead of an empty tuple?
-----------------------------------

A key objective of adding ``__definition_order__`` is to preserve
information in class definitions which was lost prior to this PEP.
One consequence is that ``__definition_order__`` implies an original
class definition.  Using ``None`` allows us to clearly distinguish
classes that do not have a definition order.  An empty tuple clearly
indicates a class that came from a definition statement but did not
define any attributes there.

Why None instead of not setting the attribute?
----------------------------------------------

The absence of an attribute requires more complex handling than ``None``
does for consumers of ``__definition_order__``.

Why constrain manually set values?
----------------------------------

If ``__definition_order__`` is manually set in the class body then it
will be used.  We require it to be a tuple of identifiers (or ``None``)
so that consumers of ``__definition_order__`` may have a consistent
expectation for the value.  That helps maximize the feature's
usefulness.
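
The constraint can be sketched in pure Python as follows.  The function
name ``check_definition_order`` is hypothetical and the error messages
are illustrative only; the actual check would live in the interpreter.

```python
def check_definition_order(value):
    """Accept None or a tuple of identifiers; raise TypeError otherwise."""
    if value is None:
        return None
    if not isinstance(value, tuple):
        raise TypeError("__definition_order__ must be a tuple or None")
    for name in value:
        # Each entry must be a str that is a valid Python identifier.
        if not isinstance(name, str) or not name.isidentifier():
            raise TypeError("__definition_order__ must contain only "
                            "identifiers, got %r" % (name,))
    return value
```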

Why is __definition_order__ even necessary?
-------------------------------------------

Since the definition order is not preserved in ``__dict__``, it is
lost once class definition execution completes.  Classes *could*
explicitly set the attribute as the last thing in the body.  However,
then independent decorators could only make use of classes that had done
so.  Instead, ``__definition_order__`` preserves this one bit of info
from the class body so that it is universally available.

Compatibility
=============

This PEP does not break backward compatibility, except in the case that
someone relies *strictly* on ``dict`` as the class definition namespace.
This shouldn't be a problem.

Changes
=============

In addition to the class syntax, the following expose the new behavior:

* builtins.__build_class__
* types.prepare_class
* types.new_class

Other Python Implementations
============================

Pending feedback, the impact on Python implementations is expected to
be minimal.  If a Python implementation cannot support switching to
``OrderedDict``-by-default then it can always set ``__definition_order__``
to ``None``.

Implementation
==============

The implementation is found in the tracker. [impl_]

Alternatives
============

<class>.__dict__ as OrderedDict
-------------------------------

Instead of storing the definition order in ``__definition_order__``,
the now-ordered definition namespace could be copied into a new
``OrderedDict``.  This would then be used as the mapping proxied as
``__dict__``.  Doing so would mostly provide the same semantics.

However, using ``OrderedDict`` for ``__dict__`` would obscure the
relationship with the definition namespace, making it less useful.
Additionally, doing this would require significant changes to the
semantics of the concrete ``dict`` C-API.

A "namespace" Keyword Arg for Class Definition
----------------------------------------------

PEP 422 introduced a new "namespace" keyword arg to class definitions
that effectively replaces the need for ``__prepare__()``. [pep422_]
However, the proposal was withdrawn in favor of the simpler PEP 487.

References
==========

.. [impl] issue #24254
   (https://bugs.python.org/issue24254)

.. [nick_concern] Nick's concerns about mutability
   (https://mail.python.org/pipermail/python-dev/2016-June/144883.html)

.. [pep422] PEP 422
   (https://www.python.org/dev/peps/pep-0422/#order-preserving-classes)

.. [pep487] PEP 487
   (https://www.python.org/dev/peps/pep-0487/#defining-arbitrary-namespaces)

.. [orig] original discussion
   (https://mail.python.org/pipermail/python-ideas/2013-February/019690.html)

.. [followup1] follow-up 1
   (https://mail.python.org/pipermail/python-dev/2013-June/127103.html)

.. [followup2] follow-up 2
   (https://mail.python.org/pipermail/python-dev/2015-May/140137.html)

Copyright
===========
This document has been placed in the public domain.

From tytso at mit.edu  Sat Jun 11 22:37:37 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Sat, 11 Jun 2016 22:37:37 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
References: <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org> <20160610195411.GA3932@thunk.org>
 <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
Message-ID: <20160612023737.GB5489@thunk.org>

On Fri, Jun 10, 2016 at 05:14:50PM -0400, Random832 wrote:
> On Fri, Jun 10, 2016, at 15:54, Theodore Ts'o wrote:
> > So even on Python pre-3.5.0, realistically speaking, the "weakness" of
> > os.urandom would only be an issue (a) if it is run within the first few
> > seconds of boot, and (b) os.urandom is used to directly generate a
> > long-term cryptographic secret.  If you fork openssl or ssh-keygen
> > to generate a public/private keypair, then you aren't using os.urandom.
> 
> So, I have a question. If this "weakness" in /dev/urandom is so
> unimportant to 99% of situations... why isn't there a flag that can be
> passed to getrandom() to allow the same behavior?

The intention behind getrandom() is that it is intended *only* for
cryptographic purposes.  For that use case, there's no point having a
"return a potentially unseeded cryptographic secret" option.  This
makes this much like FreeBSD's /dev/random and getentropy system
calls.

(BTW, I've seen an assertion on this thread that FreeBSD's
getentropy(2) never blocks.  As far as I know, this is **not** true.
FreeBSD's getentropy(2) works like its /dev/random device, in that if
it is not fully seeded, it will block.  The only reason why OpenBSD's
getentropy(2) and /dev/random devices will never block is because they
only support architectures where they can make sure that entropy is
passed from a previous boot session to the next, given specialized
bootloader support.  Linux can't do this because we support a very
large number of bootloaders, and the bootloaders are not under the
kernel developers' control.  Fundamentally, you can't guarantee both
(a) that your RNG will never block, and (b) will always be of high
cryptographic quality, in a completely general sense.  You can if you
make caveats about your hardware or when the code runs, but that's
fundamentally the problem with the documentation of os.urandom(); it's
making promises which can't be true 100% of the time, for all
hardware, operating environments, etc.)

Anyway, if you don't need cryptographic guarantees, you don't need
getrandom(2) or getentropy(2); something like this will do just fine:

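#include <stdlib.h>	/* random(), srandom() */
#include <sys/time.h>	/* gettimeofday() */
#include <unistd.h>	/* getpid() */

/* Non-cryptographic: seeds the libc PRNG once from the clock and the PID. */
long getrand(void)
{
	static int initialized = 0;
	struct timeval tv;

	if (!initialized) {
		gettimeofday(&tv, NULL);
		srandom(tv.tv_sec ^ tv.tv_usec ^ getpid());
		initialized++;
	}
	return random();
}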

So this is why I did what I did.  If Python decides to go down this
same path, you could define a new interface ala getrandom(2), which is
specifically designed for cryptographic purposes, and perhaps a new,
more efficient interface for those people who don't need cryptographic
guarantees --- and then keep the behavior of os.urandom consistent
with Python 3.4, but update the documentation to reflect the reality.

Alternatively, you could keep the implementation of os.urandom
consistent with Python 3.5, and then document that under some
circumstances, it will block.  Both approaches have certain tradeoffs,
but it's not going to be the end of the world regardless of which way
you decide to go.   I'd suggest that you use your existing mechanisms
to decide on which approach is more Pythony, and then make sure you
communicate and over-communicate it to your user/developer base.

And then --- relax.  It may seem like a big deal today, but in a year
or so people will have gotten used to whatever interface or
documentation changes you decide to make, and it will be all fine.
As Dame Julian of Norwich once said, "All shall be well, and all shall
be well, and all manner of things shall be well."  

Cheers,

						- Ted

From vgr255 at live.ca  Sat Jun 11 22:51:29 2016
From: vgr255 at live.ca (Émanuel Barry)
Date: Sat, 11 Jun 2016 22:51:29 -0400
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace (round
 3)
In-Reply-To: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>
References: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>
Message-ID: <BLU403-EAS206BE7753D1E540C23737F691520@phx.gbl>

> From: Eric Snow
> Sent: Saturday, June 11, 2016 10:37 PM
> To: Python-Dev; Guido van Rossum
> Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace (round
> 3)

> The only change to the original proposal
> has been that a manually set __definition_order__ must be a tuple of
> identifiers or None (rather than using the value as-is).

>   1. if ``__definition_order__`` is defined in the class body then it
>      must be a ``tuple`` of identifiers or ``None``; any other value
>      will result in ``TypeError``

Why not just any arbitrary iterable, which gets converted to a tuple at
runtime? __slots__ allows any arbitrary iterable:

>>> def g():
...   yield "foo"
...   yield "bar"
...   yield "baz"
>>> class C:
...   __slots__ = g()
>>> C.__slots__
<generator object g at 0x0074A9F0>
>>> C.__slots__.gi_running
False
>>> dir(C)
[<snip>, 'bar', 'baz', 'foo']

> Use of a tuple reflects the fact that we are exposing the order in
> which attributes on the class were *defined*.  Since the definition
> is already complete by the time ``definition_order__`` is set, the
> content and order of the value won't be changing.  Thus we use a type
> that communicates that state of immutability.

Typo: missing leading underscores in __definition_order__

> Compatibility
> =============
> 
> This PEP does not break backward compatibility, except in the case that
> someone relies *strictly* on ``dict`` as the class definition namespace.
> This shouldn't be a problem.

Perhaps add a mention that isinstance(namespace, dict) will still be true,
so users don't get unnecessarily confused.
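
A minimal sketch of that point, emulating the proposed default with a
metaclass.  The metaclass name `Inspecting` and the two attributes it
sets are hypothetical; they only record what kind of namespace the
metaclass received.

```python
from collections import OrderedDict


class Inspecting(type):
    """Record the type of the class definition namespace."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        cls.namespace_type = type(namespace).__name__
        # OrderedDict subclasses dict, so this check still succeeds.
        cls.namespace_is_dict = isinstance(namespace, dict)
        return cls


class Spam(metaclass=Inspecting):
    ham = None
```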

> <class>.__dict__ as OrderedDict
> -------------------------------

<class> looks weird to me. I tend to use `cls` (although `klass` isn't
uncommon). `C` might also not be a bad choice.

Thanks!
-Emanuel

From ericsnowcurrently at gmail.com  Sat Jun 11 23:01:33 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 11 Jun 2016 20:01:33 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace (round
 3)
In-Reply-To: <BLU403-EAS206BE7753D1E540C23737F691520@phx.gbl>
References: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>
 <BLU403-EAS206BE7753D1E540C23737F691520@phx.gbl>
Message-ID: <CALFfu7AfpXvVfYr5mERxkAymPX=hutyz124w4_-bwHEqH44LAA@mail.gmail.com>

On Sat, Jun 11, 2016 at 7:51 PM, Émanuel Barry <vgr255 at live.ca> wrote:
>> From: Eric Snow
>>   1. if ``__definition_order__`` is defined in the class body then it
>>      must be a ``tuple`` of identifiers or ``None``; any other value
>>      will result in ``TypeError``
>
> Why not just any arbitrary iterable, which gets converted to a tuple at
> runtime?

An arbitrary iterable does not necessarily imply a definition order.
For example, dict is an iterable but the order is undefined.  Also,
I'd rather favor simplicity for this (most likely) uncommon corner
case of manually setting __definition_order__, particularly at the
start.  If it proves to be a problematic restriction in the future we
can loosen it.

> __slots__ allows any arbitrary iterable:

Yes, but __slots__ is not order-sensitive.

>> is already complete by the time ``definition_order__`` is set, the
>
> Typo: missing leading underscores in __definition_order__

I'll fix that.

>
>> Compatibility
>> =============
>>
>> This PEP does not break backward compatibility, except in the case that
>> someone relies *strictly* on ``dict`` as the class definition namespace.
>> This shouldn't be a problem.
>
> Perhaps add a mention that isinstance(namespace, dict) will still be true,
> so users don't get unnecessarily confused.

Good point.

>
>> <class>.__dict__ as OrderedDict
>> -------------------------------
>
> <class> looks weird to me. I tend to use `cls` (although `klass` isn't
> uncommon). `C` might also not be a bad choice.

Yes, that is better.

-eric

From vgr255 at live.ca  Sat Jun 11 23:04:32 2016
From: vgr255 at live.ca (Émanuel Barry)
Date: Sat, 11 Jun 2016 23:04:32 -0400
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace (round
 3)
In-Reply-To: <CALFfu7AfpXvVfYr5mERxkAymPX=hutyz124w4_-bwHEqH44LAA@mail.gmail.com>
References: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>
 <BLU403-EAS206BE7753D1E540C23737F691520@phx.gbl>
 <CALFfu7AfpXvVfYr5mERxkAymPX=hutyz124w4_-bwHEqH44LAA@mail.gmail.com>
Message-ID: <BLU403-EAS102B8E0C0FBF62795BE871791520@phx.gbl>

> From: Eric Snow
> Sent: Saturday, June 11, 2016 11:02 PM
> To: Émanuel Barry
> Cc: Python-Dev
> Subject: Re: [Python-Dev] PEP 520: Ordered Class Definition Namespace
> (round 3)
> 
> On Sat, Jun 11, 2016 at 7:51 PM, Émanuel Barry <vgr255 at live.ca> wrote:
> >> From: Eric Snow
> >>   1. if ``__definition_order__`` is defined in the class body then it
> >>      must be a ``tuple`` of identifiers or ``None``; any other value
> >>      will result in ``TypeError``
> >
> > Why not just any arbitrary iterable, which gets converted to a tuple at
> > runtime?
> 
> An arbitrary iterable does not necessarily imply a definition order.
> For example, dict is an iterable but the order is undefined.  Also,
> I'd rather favor simplicity for this (most likely) uncommon corner
> case of manually setting __definition_order__, particularly at the
> start.  If it proves to be a problematic restriction in the future we
> can loosen it.

Point. This can always be revised later (I'm probably overthinking this as always ;)

> 
> -eric

-Emanuel

From steve at pearwood.info  Sat Jun 11 23:15:17 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 12 Jun 2016 13:15:17 +1000
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
References: <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
Message-ID: <20160612031517.GQ27919@ando.pearwood.info>

On Sat, Jun 11, 2016 at 02:16:21PM -0700, Guido van Rossum wrote:

[on the real-world consequences of degraded randomness from /dev/urandom]

> Actually it's not clear to me at all that it could have happened to Python.
> (Wasn't it an embedded system?)

A Raspberry Pi. But don't people run Python on at least some embedded 
systems? The wiki thinks so:

https://wiki.python.org/moin/EmbeddedPython

And I thought that was the purpose of µPython.

> > Actually the proposal for that was the secrets module. And the secrets
> > module would be the only user of os.urandom(blocking=True).
> >
> > I'm fine if this lives in the secrets module -- Steven asked for it to be an
> > os function so that secrets.py could continue to be pure python.
> 
> The main thing that I want to avoid is that people start cargo-culting
> whatever the secrets module uses rather than just using the secrets module.
> Having it redundantly available as os.getrandom() is just begging for
> people to show off how much they know about writing secure code.

That makes sense.

I'm happy for getrandom to be an implementation detail of secrets, but 
I'll need help with that part.

> >> * If you want to ensure you get cryptographically secure bytes,
> >> os.getrandom, falling back to os.urandom on non Linux platforms and
> >> erroring on Linux.
[...]

> But what is a Python script going to do with that error? IIUC this kind of
> error would only happen very early during boot time, and rarely, so the
> most likely outcome is a hard-to-debug mystery failure.

In my day job, I work for a Linux sys admin consulting company, and I 
can tell you from our experience that debugging a process that 
occasionally hangs mysteriously during boot is much harder than 
debugging a process that occasionally fails with an explicit error in 
the logs, especially if the error message is explicit about the cause:

OSError: entropy pool has not been initialized yet

At that point, you can take whatever action is appropriate for your 
script:

- fail altogether, just as it might fail if it requires a writable
  file system and can't find one;
- sleep for three seconds and try again;
- log the error and proceed with degraded randomness or functionality;
- change it so the script runs later in the boot process.
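
A minimal sketch of the "sleep and try again" option, assuming the
os.getrandom() function and os.GRND_NONBLOCK flag as they exist in
Python 3.6 and later on Linux (they were still under discussion when
this was written).  There, an uninitialized pool is reported as
BlockingIOError, a subclass of OSError.  The helper name
`strong_random_bytes`, the retry count and the delay are hypothetical.

```python
import os
import time


def strong_random_bytes(n, attempts=3, delay=3.0):
    """Return n bytes from the kernel CSPRNG without blocking forever."""
    if not 0 < n <= 256:
        # Linux only promises complete reads for requests up to 256 bytes.
        raise ValueError("n must be between 1 and 256")
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # No getrandom() on this platform; fall back to os.urandom().
        return os.urandom(n)
    for attempt in range(attempts):
        try:
            return getrandom(n, os.GRND_NONBLOCK)
        except BlockingIOError:
            # The entropy pool has not been initialized yet.
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise OSError("entropy pool has not been initialized yet")
```

A caller that prefers one of the other options can catch the final
OSError and log it, degrade, or exit.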

-- 
Steve

From random832 at fastmail.com  Sun Jun 12 01:49:34 2016
From: random832 at fastmail.com (Random832)
Date: Sun, 12 Jun 2016 01:49:34 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160612023737.GB5489@thunk.org>
References: <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org> <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org> <20160610195411.GA3932@thunk.org>
 <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
 <20160612023737.GB5489@thunk.org>
Message-ID: <1465710574.1145257.635059377.3AF7C63B@webmail.messagingengine.com>

On Sat, Jun 11, 2016, at 22:37, Theodore Ts'o wrote:
> On Fri, Jun 10, 2016 at 05:14:50PM -0400, Random832 wrote:
> > So, I have a question. If this "weakness" in /dev/urandom is so
> > unimportant to 99% of situations... why isn't there a flag that can be
> > passed to getrandom() to allow the same behavior?
> 
> The intention behind getrandom() is that it is intended *only* for
> cryptographic purposes. 

I'm somewhat confused now because if that's the case it seems to
accomplish multiple unrelated things. Why was this implemented as a
system call rather than a device (or an ioctl on the existing ones)? If
there's a benefit in not going through the non-atomic (and possibly
resource limited) procedure of acquiring a file descriptor, reading from
it, and closing it, why is that benefit not also extended to
non-cryptographic users of urandom via allowing the system call to be
used in that way?

> Anyway, if you don't need cryptographic guarantees, you don't need
> getrandom(2) or getentropy(2); something like this will do just fine:

Then what's /dev/urandom *for*, anyway?

From tytso at mit.edu  Sun Jun 12 02:11:42 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Sun, 12 Jun 2016 02:11:42 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
References: <87lh2dycuo.fsf@vostro.rath.org>
 <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
Message-ID: <20160612061142.GA1986@thunk.org>

On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
> 
> It was a RaspberryPI that ran a shell script on boot that called
> ssh-keygen.  That shell script could have just as easily been a
> Python script that called os.urandom via
> https://github.com/sybrenstuvel/python-rsa instead of a shell script
> that called ssh-keygen.

So I'm going to argue that the primary bug was in how the systemd
init scripts were configured.  In general, creating keypairs at boot
time is just a bad idea.  They should be created lazily, in a
just-in-time paradigm.
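
The lazy, just-in-time idea can be sketched in Python as below.  This
is only an illustration under stated assumptions: the class name
`LazyKey` is hypothetical, and plain random bytes from the `secrets`
module (Python 3.6+) stand in for a real host keypair.

```python
import secrets
import threading


class LazyKey:
    """Generate key material on first use rather than at start-up."""

    def __init__(self, nbytes=32):
        self._nbytes = nbytes
        self._key = None
        self._lock = threading.Lock()

    def get(self):
        # The randomness is requested only when a caller needs the key,
        # by which time the system has usually gathered enough entropy.
        with self._lock:
            if self._key is None:
                self._key = secrets.token_bytes(self._nbytes)
            return self._key
```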

Consider that if you assume that os.urandom can block, this isn't
necessarily going to do the right thing either --- if you use
getrandom and it blocks, and it's part of a systemd unit which is
blocking further boot progress, then the system will hang for 90
seconds, and while it's hanging, there won't be any interrupts, so the
system will be dead in the water, just like the original bug report
complaining that Python was hanging when it was using getrandom() to
initialize its SipHash.

At which point there will be another bug complaining about how python
was causing systemd to hang for 90 seconds, and there will be demand
to make os.urandom no longer block.  (Since by definition, systemd can
do no wrong; it's always other programs that have to change to
accommodate systemd.  :-)

So some people will freak out when the keygen systemd unit hangs,
blocking the boot --- and other people will freak out if the systemd
unit doesn't hang, and you get predictable SSH keys --- and some wiser
folks will be asking the question, why the *heck* is it not
openssh/systemd's fault for trying to generate keys this early,
instead of after the first time sshd needs host ssh keys?  If you wait
until the first time the host ssh keys are needed, then the system is
fully booted, so it's likely that the entropy will be collected -- and
even if it isn't, networking will already be brought up, and the
system will be in multi-user mode, so entropy will be collected very
quickly.

Sometimes, we can't solve the problem at the Python level or at the
Kernel level.  It will require security-savvy userspace/application
programmers as well.

Cheers,

						- Ted

From cory at lukasa.co.uk  Sun Jun 12 06:40:58 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sun, 12 Jun 2016 11:40:58 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160612061142.GA1986@thunk.org>
References: <87lh2dycuo.fsf@vostro.rath.org>
 <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
Message-ID: <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>

> On 12 Jun 2016, at 07:11, Theodore Ts'o <tytso at mit.edu> wrote:
> 
> On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
>> 
>> It was a RaspberryPI that ran a shell script on boot that called
>> ssh-keygen.  That shell script could have just as easily been a
>> Python script that called os.urandom via
>> https://github.com/sybrenstuvel/python-rsa instead of a shell script
>> that called ssh-keygen.
> 
> So I'm going to argue that the primary bug was in how the systemd
> init scripts were configured.  In general, creating keypairs at boot
> time is just a bad idea.  They should be created lazily, in a
> just-in-time paradigm.

Agreed. I hope that if there is only one thing every participant has learned from this (extremely painful for all concerned) discussion, it's that doing anything that requires really good random numbers should be delayed as long as possible on all systems, and should absolutely not be done during the boot process on Linux. Don't generate key pairs, don't make TLS connections, just don't perform any action that requires really good randomness at all.

> So some people will freak out when the keygen systemd unit hangs,
> blocking the boot --- and other people will freak out if the systemd
> unit doesn't hang, and you get predictable SSH keys --- and some wiser
> folks will be asking the question, why the *heck* is it not
> openssh/systemd's fault for trying to generate keys this early,
> instead of after the first time sshd needs host ssh keys?  If you wait
> until the first time the host ssh keys are needed, then the system is
> fully booted, so it's likely that the entropy will be collected -- and
> even if it isn't, networking will already be brought up, and the
> system will be in multi-user mode, so entropy will be collected very
> quickly.

As far as I know we still only have three programs that were encountering this problem: Debian's autopkgtest (which patched with PYTHONHASHSEED=0), systemd-cron (which is moving from Python to Rust anyway), and cloud-init (not formally reported but mentioned to me by a third-party). It remains unclear to me why the systemd-cron service files can't simply request to be delayed until the kernel CSPRNG is seeded: I guess systemd doesn't have any way to express that constraint? Perhaps it should.
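
For readers unfamiliar with the PYTHONHASHSEED=0 workaround: setting
that environment variable disables hash randomization, so the
interpreter has no random hash seed to obtain at startup. A minimal
sketch of the observable effect follows; the child_hash helper is
hypothetical and exists only for this illustration.

```python
import os
import subprocess
import sys


def child_hash(text, hashseed):
    """Return hash(text) as computed by a fresh interpreter started
    with the given PYTHONHASHSEED value (hypothetical helper)."""
    env = dict(os.environ, PYTHONHASHSEED=hashseed)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return int(result.stdout)


# With randomization disabled, separate interpreter runs agree.
first = child_hash("spam", "0")
second = child_hash("spam", "0")
print(first == second)
```

The trade-off is that string hashes become predictable, which is
acceptable for a test harness but gives up the hash-flooding protection
that randomization was introduced for.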

Of this set, only cloud-init worries me, and it worries me for the *opposite* reason that Guido and Larry are worried. Guido and Larry are worried that programs like cloud-init will be delayed by two minutes while they wait for entropy: that's an understandable concern. I'm much more worried that programs like cloud-init may attempt to establish TLS connections or create keys during this two minute window, leaving them staring down the possibility of performing "secure" actions with insecure keys.

This is why I advocate, like Donald does, for having *some* tool in Python that allows Python programs to crash if they attempt to generate cryptographically secure random bytes on a system that is incapable of providing them (which, in practice, can only happen on Linux systems). I don't care how it's spelled, I just care that programs that want to use a properly-seeded CSPRNG can error out effectively when one is not available. That allows us to ensure that Python programs that want to do TLS or build key pairs correctly refuse to do so when used in this state, *and* that they provide a clearly debuggable reason for why they refused. That allows the savvy application developers that Ted talked about to make their own decisions about whether their rapid startup is sufficiently important to take the risk.
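
One possible spelling of such a tool, as a minimal sketch rather than a
proposal for the actual API: it assumes a Python that exposes
os.getrandom() and os.GRND_NONBLOCK (added in Python 3.6, Linux only).
The names strict_random_bytes and EntropyUnavailable are hypothetical.

```python
import os


class EntropyUnavailable(RuntimeError):
    """Hypothetical error: the kernel CSPRNG has not been seeded yet."""


def strict_random_bytes(n):
    """Return n random bytes, or raise instead of blocking or silently
    returning output from an unseeded pool."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: without the getrandom() wrapper, os.urandom() is
        # the best this platform offers.
        return os.urandom(n)
    buf = b""
    while len(buf) < n:
        try:
            # GRND_NONBLOCK makes the call fail with EAGAIN instead of
            # waiting while the pool is still uninitialized.
            buf += getrandom(n - len(buf), os.GRND_NONBLOCK)
        except BlockingIOError as exc:
            raise EntropyUnavailable("kernel CSPRNG not yet seeded") from exc
    return buf


key_material = strict_random_bytes(32)
```

A caller that would rather start quickly than fail can catch the
exception and decide for itself, which is the choice described above.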

Cory

[0]: https://github.com/systemd-cron/systemd-cron/issues/43#issuecomment-160343989


From p.f.moore at gmail.com  Sun Jun 12 07:10:00 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 12 Jun 2016 12:10:00 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org> <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
Message-ID: <CACac1F_FiectHb7M=HPx0-kEDnrMWsot4AbKc_pNK6HT--fPZA@mail.gmail.com>

On 11 June 2016 at 22:46, Donald Stufft <donald at stufft.io> wrote:
> I guess one question would be, what does the secrets module do if it's on a
> Linux that is too old to have getrandom(0), off the top of my head I can
> think of:
>
> * Silently fall back to reading os.urandom and hope that it's been seeded.
> * Fall back to os.urandom and hope that it's been seeded and add a
> SecurityWarning or something like it to mention that it's falling back to
> os.urandom and it may be getting predictable random from /dev/urandom.
> * Hard fail because it can't guarantee secure cryptographic random.
>
> Of the three, I would probably suggest the second one, it doesn't let the
> problem happen silently, but it still "works" (where it's basically just
> hoping it's being called late enough that /dev/urandom has been seeded), and
> people can convert it to the third case using the warnings module to turn
> the warning into an exception.
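
The second option, and the way a caller could turn it into the third,
can be sketched as follows. This is only an illustration:
SecurityWarning is not an existing built-in category, and
token_bytes_with_warning and its have_getrandom parameter are
hypothetical names chosen for the example.

```python
import os
import warnings


class SecurityWarning(RuntimeWarning):
    """Hypothetical warning category; not part of the standard library."""


def token_bytes_with_warning(n, have_getrandom=hasattr(os, "getrandom")):
    """Option two: prefer getrandom(), else warn and use os.urandom()."""
    if have_getrandom:
        # Blocks until the pool is seeded. Small requests are returned
        # in full; a production version would loop on short reads.
        return os.getrandom(n)
    warnings.warn(
        "getrandom() unavailable; falling back to os.urandom(), which "
        "may be predictable early in boot",
        SecurityWarning,
        stacklevel=2,
    )
    return os.urandom(n)


# Option three, chosen by the application rather than the library:
# escalate the warning into an exception for this block only.
with warnings.catch_warnings():
    warnings.simplefilter("error", SecurityWarning)
    try:
        token_bytes_with_warning(16, have_getrandom=False)
    except SecurityWarning as exc:
        print("refusing to continue:", exc)
```

Because the escalation uses the ordinary warnings filter, it can also
be switched on from outside the program with the -W option or the
PYTHONWARNINGS environment variable.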

I have kept out of this discussion as I don't know enough about
security to comment, but in this instance I think the answer is clear
- there is no requirement for Python to protect the user against
security bugs in the underlying OS (sure, it's nice if it can, but
it's not necessary) so falling back to os.urandom (with no warning) is
fine. A warning, or even worse a hard fail, that 99.99% of the time
should be ignored (because you're *not* writing a boot script) seems
like a very bad idea.

By all means document "if your OS provides no means of getting
guaranteed secure random numbers (e.g., older versions of Linux very
early in the boot sequence) then the secrets module cannot give you
results that are any better than the OS provides". It seems
self-evident to me that this would be the case, but I see no reason to
object if the experts feel it's worth adding.

Paul

From tytso at mit.edu  Sun Jun 12 09:28:44 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Sun, 12 Jun 2016 09:28:44 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1465710574.1145257.635059377.3AF7C63B@webmail.messagingengine.com>
References: <5759EC2B.8040208@hastings.org>
 <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io>
 <njdldf$h9r$1@ger.gmane.org> <20160610195411.GA3932@thunk.org>
 <1465593290.2349072.634239529.67EEE9C8@webmail.messagingengine.com>
 <20160612023737.GB5489@thunk.org>
 <1465710574.1145257.635059377.3AF7C63B@webmail.messagingengine.com>
Message-ID: <20160612132844.GB1986@thunk.org>

On Sun, Jun 12, 2016 at 01:49:34AM -0400, Random832 wrote:
> > The intention behind getrandom() is that it is intended *only* for
> > cryptographic purposes. 
> 
> I'm somewhat confused now because if that's the case it seems to
> accomplish multiple unrelated things. Why was this implemented as a
> system call rather than a device (or an ioctl on the existing ones)? If
> there's a benefit in not going through the non-atomic (and possibly
> resource limited) procedure of acquiring a file descriptor, reading from
> it, and closing it, why is that benefit not also extended to
> non-cryptographic users of urandom via allowing the system call to be
> used in that way?

This design was taken from OpenBSD, and the goal with getentropy(2)
(which is also designed only for cryptographic use cases), was so that
a denial of service attack (fd exhaustion) could not force an application
to fall back to a weaker -- in some cases, very weak or non-existent
--- source of randomness.

Non-cryptographic users don't need to use this interface at all.  They
can just use srandom(3)/random(3) and be happy.
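
In Python terms the same split is roughly between the random module
(a seedable, reproducible PRNG, analogous to srandom(3)/random(3)) and
the secrets module, which draws on the operating system's CSPRNG. A
minimal sketch, with an arbitrary seed value chosen for the example:

```python
import random
import secrets

# Non-cryptographic use: a seeded generator, never touches the kernel
# entropy pool, and can be replayed exactly from the same seed.
rng = random.Random(12345)
sample = [rng.randrange(100) for _ in range(5)]

replay = random.Random(12345)
same_again = [replay.randrange(100) for _ in range(5)]
print(sample == same_again)

# Cryptographic use: unpredictable bytes from the OS, not reproducible.
token = secrets.token_hex(16)
```

Simulations, shuffles and test data belong in the first category;
session tokens and keys belong in the second.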

> > Anyway, if you don't need cryptographic guarantees, you don't need
> > getrandom(2) or getentropy(2); something like this will do just fine:
> 
> Then what's /dev/urandom *for*, anyway?

/dev/urandom is a legacy interface.  It was intended originally for
cryptographic use cases, but it was intended for the days when very
few programs needed a secure cryptographic random generator, and it
was assumed that application programmers would be very careful in
checking error codes, etc.

It also dates back to a time when the NSA was still pushing very hard
for cryptographic export controls (hence the use of SHA-1 versus an
encryption algorithm) and when many people questioned whether or not
the SHA-1 algorithm, as designed by the NSA, had a backdoor in it.
(As it turns out, the NSA put a back door into DUAL-EC, so in retrospect
this concern really wasn't that unreasonable.)  Because of those
concerns, the assumption was that those few applications that really wanted
to get security right (e.g., PGP, which still uses /dev/random for
long-term key generation), would want to use /dev/random and deal with
entropy accounting, and asking the user to type randomness on the
keyboard and move their mouse around while generating a random key.

But times change, and these days people are much more likely to
believe that SHA-1 is in fact cryptographically secure, and future
crypto hash algorithms are designed by teams from all over the world
and NIST/NSA merely review the submissions (along with everyone else).
So for example, SHA-3 was *not* designed by the NSA, and it was
evaluated using a much more open process than SHA-1.

Also, we have a much larger set of people writing code which is
sensitive to cryptographic issues (back when I wrote /dev/random, I
probably had met, or at least electronically corresponded with a large
number of the folks who were working on network security protocols, at
least in the non-classified world), and these days, there is much less
trust that people writing code to use /dev/[u]random are in fact
careful and competent security engineers.  Whether or not this is a
fair concern, it is true that there has been a change in API
design ethos away from the "Unix let's make things as general as
possible, in case someone clever comes up with a use case we didn't think
of", to "idiots are ingenious so they will come up with ways to misuse
an idiot-proof interface, so we need to lock it down as much as
possible."  OpenBSD's getentropy(2) interface is a strong example of
this new attitude towards API design, and getrandom(2) is not quite so
doctrinaire (I added a flags field when getentropy(2) didn't even give
those options to programmers), but it is following in the same tradition.

Cheers,

						- Ted

From tytso at mit.edu  Sun Jun 12 09:43:15 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Sun, 12 Jun 2016 09:43:15 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
References: <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
Message-ID: <20160612134315.GC1986@thunk.org>

On Sun, Jun 12, 2016 at 11:40:58AM +0100, Cory Benfield wrote:
> 
> Of this set, only cloud-init worries me, and it worries me for the
> *opposite* reason that Guido and Larry are worried. Guido and Larry
> are worried that programs like cloud-init will be delayed by two
> minutes while they wait for entropy: that's an understandable
> concern. I'm much more worried that programs like cloud-init may
> attempt to establish TLS connections or create keys during this two
> minute window, leaving them staring down the possibility of
> performing "secure" actions with insecure keys.

There are patches in the dev branch of:

     https://git.kernel.org/cgit/linux/kernel/git/tytso/random.git/

which will automatically use virtio-rng (if it is provided by the
cloud provider) to initialize /dev/urandom.  It also uses a much more
aggressive mechanism to initialize the /dev/urandom pool, so that
getrandom(2) will block for a much shorter period of time immediately
after boot time on real hardware.  I'm confident it's secure for x86
platforms.  I'm still thinking about whether I should fall back to
something more conservative for crappy embedded processors that don't
have a cycle counter or a CPU-provided RDRAND-like instruction.
Related to this is whether I should finally make the change so that
/dev/urandom will block until it is initialized.  (This would make
Linux work like FreeBSD, which *will* also block if its entropy pool
is not initialized.)

> This is why I advocate, like Donald does, for having *some* tool in
> Python that allows Python programs to crash if they attempt to
> generate cryptographically secure random bytes on a system that is
> incapable of providing them (which, in practice, can only happen on
> Linux systems).

Well, it can only happen on Linux because you insist on falling back
to /dev/urandom --- and because other OS's have the good taste not to
use systemd and/or Python very early in the boot process.  If someone
tried to run a python script in early FreeBSD init scripts, it would
block just as you were seeing on Linux --- you just haven't seen that
yet, because arguably the FreeBSD developers have better taste in
their choice of init scripts than Red Hat and Debian.  :-)

So the question is whether I should do what FreeBSD did, which will
satisfy those people who are freaking out and whinging about how
Linux could allow stupidly written or deployed Python scripts to get
cryptographically insecure bytes, by removing that option from Python
developers.  Or should I remove that one line from changes in the
random.git patch series, and allow /dev/urandom to be used even when
it might be insecure, so as to satisfy all of the people who are
freaking out and whinging about the fact that a stupidly written
and/or deployed Python script might block during early boot and hang a
system?

Note that I've tried to do what I can to make the time that
/dev/urandom might block as small as possible, but at the end of the
day, there is still the question of whether I should remove the choice
re: blocking from userspace, ala FreeBSD, or not.  And either way,
some number of people will be whinging and freaking out.  Which is why
I am completely sympathetic to how Guido might be getting a little
exasperated over this whole thread.  :-)

						- Ted

From christian at python.org  Sun Jun 12 10:36:56 2016
From: christian at python.org (Christian Heimes)
Date: Sun, 12 Jun 2016 16:36:56 +0200
Subject: [Python-Dev] New hash algorithms: SHA3, SHAKE, BLAKE2,
 truncated SHA512
In-Reply-To: <52d03a08-8d5e-9751-405d-aeeca740d832@python.org>
References: <52d03a08-8d5e-9751-405d-aeeca740d832@python.org>
Message-ID: <njjs28$cb2$1@ger.gmane.org>

On 2016-05-25 12:29, Christian Heimes wrote:
> Hi everybody,
> 
> I have three hashing-related patches for Python 3.6 that are waiting for
> review. Altogether the three patches add ten new hash algorithms to the
> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).
> 
> 
> SHA-3 / SHAKE: https://bugs.python.org/issue16113
> BLAKE2: https://bugs.python.org/issue26798
> SHA512/224 / SHA512/256: https://bugs.python.org/issue26834
> 
> 
> I like to push the patches during the sprints at PyCon. Please assist
> with reviews.

Hi,

I have unassigned myself from the tickets and will no longer pursue the
addition of new crypto hash algorithms. I might try again when blake2
and sha3 are more widely adopted and the opposition from other core
contributors has diminished. Acceptance is simply not high enough to be
worth the trouble.

Kind regards,
Christian

From michael at felt.demon.nl  Sun Jun 12 10:06:43 2016
From: michael at felt.demon.nl (Michael Felt)
Date: Sun, 12 Jun 2016 16:06:43 +0200
Subject: [Python-Dev] C99
In-Reply-To: <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
Message-ID: <4b260bf4-0eb8-960e-3e98-8e852c190dd9@felt.demon.nl>

I am using IBM xlc aka vac - version 11.

afaik it will deal with c99 features (by default I set it to behave that
way, because a common 'issue' is C++ style comments where they should not
be that style; fyi: not seen that in Python).

IMHO: GCC is not just a compiler - it brings with it a whole set of 
infrastructure requirements (aka run-time environment, rte). Certainly 
not an issue for GNU environments, but non-gnu (e.g., posix) will/may 
have continual side-effects from "competing" rte.. At least that was my 
experience when I was using gcc rather than xlc.

On 6/4/2016 9:53 AM, Martin Panter wrote:
> Sounds good for features that are well-supported by compilers that
> people use. (Are there other compilers used than just GCC and MSVC?)

From stefan at bytereef.org  Sun Jun 12 11:10:07 2016
From: stefan at bytereef.org (Stefan Krah)
Date: Sun, 12 Jun 2016 15:10:07 +0000 (UTC)
Subject: [Python-Dev] C99
References: <1465020691.2818312.627646289.6A6F4D74@webmail.messagingengine.com>
 <CA+eR4cF56FBdd86inH8NBzPLcL4W+Nb2ikixBn=xUsoOCmhhLg@mail.gmail.com>
 <4b260bf4-0eb8-960e-3e98-8e852c190dd9@felt.demon.nl>
Message-ID: <loom.20160612T170239-556@post.gmane.org>

Michael Felt <michael <at> felt.demon.nl> writes: 
> I am using IBM xlc aka vac - version 11.
> 
> afaik it will deal with c99 features (by default I set it to behave that 
> way because a common 'issue' is C++ style comments, when they should not 
> be that style (fyi: not seen that in Python).

We had a couple of exotic build machines a while ago: xlc, the
HPUX compiler and a couple of others all support the subset of C99
we are aiming for.  In fact the support of the commercial Unix
compilers for C99 is quite good -- the common error messages
suggest that several of them use the same front end (Comeau?).

Stefan Krah

From donald at stufft.io  Sun Jun 12 12:35:38 2016
From: donald at stufft.io (Donald Stufft)
Date: Sun, 12 Jun 2016 12:35:38 -0400
Subject: [Python-Dev] writing to /dev/*random [was: BDFL ruling request:
 should we block ...]
In-Reply-To: <22364.43500.337805.137220@turnbull.sk.tsukuba.ac.jp>
References: <20160609215343.00b0190e.barry@wooz.org>
 <CAPJVwB=3bcj4vagMenfH9h1x49rU9BGhRzf-Ut5gZbGTDUeNrg@mail.gmail.com>
 <575A2FCC.5070101@hastings.org>
 <981CD440-71B6-46AD-A057-585A812E083B@stufft.io> <njdldf$h9r$1@ger.gmane.org>
 <CAP7+vJ+mYXDW+eNHkX=ZpCiOSjZ093v7cuMdbPMNWc67UV1wKw@mail.gmail.com>
 <m21t455670.fsf@news.realpath.org>
 <048901d1c33a$5bf13930$13d3ab90$@sdamon.com>
 <CAEbHw4b5J0ExkCp444_-DWVb0VGARvC0Du5F0-RrdHrSjMF3yw@mail.gmail.com>
 <CAOTb1we7WsteA_5t-bdgoqX_xRS2WexaCuV4Bv7B7xZE2FaqkQ@mail.gmail.com>
 <20160611082437.GN27919@ando.pearwood.info>
 <22364.43500.337805.137220@turnbull.sk.tsukuba.ac.jp>
Message-ID: <844F63E6-2B12-4F05-8FC5-611111A3F276@stufft.io>

> On Jun 11, 2016, at 8:16 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> This fails for unprivileged users on Mac.  I'm not sure what happens
> on Linux; it appears to succeed, but the result wasn't what I
> expected.

I think that on Linux it will mix in whatever you write into the entropy, but it won't increase the entropy counter for it.

--
Donald Stufft

From njs at pobox.com  Sun Jun 12 14:07:22 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 12 Jun 2016 11:07:22 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160612061142.GA1986@thunk.org>
References: <87lh2dycuo.fsf@vostro.rath.org>
 <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
Message-ID: <CAPJVwBm8AC+HkBR_D0zMCijnTaGUwUq9-GtmEfCziksw=Lv9yQ@mail.gmail.com>

On Jun 11, 2016 11:13 PM, "Theodore Ts'o" <tytso at mit.edu> wrote:
>
> On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
> >
> > It was a RaspberryPI that ran a shell script on boot that called
> > ssh-keygen.  That shell script could have just as easily been a
> > Python script that called os.urandom via
> > https://github.com/sybrenstuvel/python-rsa instead of a shell script
> > that called ssh-keygen.
>
> So I'm going to argue that the primary bug was in how the systemd
> init scripts were configured.  In general, creating keypairs at boot
> time is just a bad idea.  They should be created lazily, in a
> just-in-time paradigm.
>
> Consider that if you assume that os.urandom can block, this isn't
> necessarily going to do the right thing either --- if you use
> getrandom and it blocks, and it's part of a systemd unit which is
> blocking further boot progress, then the system will hang for 90
> seconds, and while it's hanging, there won't be any interrupts, so the
> system will be dead in the water, just like the original bug report
> complaining that Python was hanging when it was using getrandom() to
> initialize its SipHash.

Hi Ted,

From another perspective, I guess one could also argue that the best place
to fix this is in the kernel: if a process is blocked waiting for entropy
then the kernel probably shouldn't take that as its cue to turn off all the
entropy generation mechanisms, just like how if a process is blocked
waiting for disk I/O then we probably shouldn't power down the disk
controller. Obviously this is a weird case because the kernel is
architected in a way that makes the dependency between the disk controller
and the I/O request obvious, while the dependency between the random pool
and... well... everything else, more or less, is much more subtle and goes
outside the usual channels, and we wouldn't want to rearchitect everything
just for this. But for example, if a process is actively blocked waiting
for the initial entropy, one could spawn a kernel thread that keeps the
system from quiescing by attempting to scrounge up entropy as fast as
possible, via whatever mechanisms are locally appropriate (e.g. doing a
busy-loop racing two clocks against each other, or just scheduling lots of
interrupts -- which I guess is the same thing, more or less). And the
thread would go away again as soon as userspace wasn't blocked on entropy.
That way this deadlock wouldn't be possible.

I guess someone *might* complain about the idea of the entropy pool
actually spending resources instead of being quietly parasitic, because
this is the kernel and someone will always complain about everything :-).
But complaining about this makes about as much sense as complaining about the
idea of spending resources trying to service I/O when a process is blocked
on that ("maybe if we wait long enough then some other part of the system
will just kind of accidentally page in the data we need as a side effect of
whatever it's doing, and then this thread will be able to proceed").

Is this an approach that you've considered?

> At which point there will be another bug complaining about how python
> was causing systemd to hang for 90 seconds, and there will be demand
> to make os.urandom no longer block.  (Since by definition, systemd can
> do no wrong; it's always other programs that have to change to
> accommodate systemd.  :-)

FWIW, the systemd thing is a red herring -- this was debian's configuration
of a particular daemon that is not maintained by the systemd project, and
the exact same thing would have happened with sysvinit if debian had tried
using python 3.5 early in their rcS.

-n

From cory at lukasa.co.uk  Sun Jun 12 16:01:09 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sun, 12 Jun 2016 21:01:09 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160612134315.GC1986@thunk.org>
References: <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
Message-ID: <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>

> On 12 Jun 2016, at 14:43, Theodore Ts'o <tytso at mit.edu> wrote:
> 
> Well, it can only happen on Linux because you insist on falling back
> to /dev/urandom --- and because other OS's have the good taste not to
> use systemd and/or Python very early in the boot process.  If someone
> tried to run a python script in early FreeBSD init scripts, it would
> block just as you were seeing on Linux --- you just haven't seen that
> yet, because arguably the FreeBSD developers have better taste in
> their choice of init scripts than Red Hat and Debian.  :-)

Heh, yes, so to be clear, I said "this can only happen on Linux" because I'm talking about the world that we live in: the one where I lost this debate. =D

Certainly right now the codebase as it stands could encounter the same problems on FreeBSD. That's a problem for Python to deal with.

> So the question is whether I should do what FreeBSD did, which will
> satisfy those people who are freaking out and whinging about how
> Linux could allow stupidly written or deployed Python scripts to get
> cryptographically insecure bytes, by removing that option from Python
> developers.  Or should I remove that one line from changes in the
> random.git patch series, and allow /dev/urandom to be used even when
> it might be insecure, so as to satisfy all of the people who are
> freaking out and whinging about the fact that a stupidly written
> and/or deployed Python script might block during early boot and hang a
> system?
> 
> Note that I've tried to do what I can to make the time that
> /dev/urandom might block as small as possible, but at the end of the
> day, there is still the question of whether I should remove the choice
> re: blocking from userspace, ala FreeBSD, or not.  And either way,
> some number of people will be whinging and freaking out.  Which is why
> I am completely sympathetic to how Guido might be getting a little
> exasperated over this whole thread.  :-)

I don't know that we need to talk about removing the choice. I understand the desire to commit to backwards compatibility, of course I do. My problem with /dev/urandom is not that it *exists*, per se: all kinds of stupid stuff exists for the sake of backward compatibility.

My problem with /dev/urandom is that it's a trap, lying in wait for someone who doesn't know enough about the problem they're solving to step into it. And it's the worst kind of trap: it's one you don't know you've stepped in. Nothing about the failure mode of /dev/urandom is obvious. Worse, well-written apps that try their best to do the right thing can still step into that failure mode if they're run in a situation that they weren't expecting (e.g. on an embedded device without hardware RNG or early in the boot process).

So my real problem with /dev/urandom is that the man page doesn't say, in gigantic letters, "this device has a really nasty failure mode that you cannot possibly detect by just running the code in the dangerous mode". It's understandable to have insecure weak stuff available to users: Python has loads of it. But where possible, the documentation marks it as such. It'd be good to have /dev/urandom's man page say "hey, by the way, you almost certainly don't want this: try using getrandom() instead".

Anyway, regarding changing the behaviour of /dev/urandom: as you've correctly highlighted, at this point you're damned if you do and damned if you don't. If you don't change, you'll forever have people like me saying that /dev/urandom is dangerous, and that its behaviour in the unseeded/poorly-seeded state is a misfeature. I trust you'll understand when I tell you that that opinion has nothing to do with *you* or the Linux kernel maintainership. This is all about the way software security evolves: things that used to be ok start to become not ok over time. We learn, we improve.

Of course, if you do change the behaviour, you'll rightly have programmers stumble onto this exact problem. They'll be unhappy too. And the worst part of all of this is that neither side of that debate is *wrong*: they just prioritise different things. Guido, Larry, and friends aren't wrong, any more than I am: we just rate the different concerns differently. That's fine: after all, it's probably why Guido invented and maintains an extremely popular programming language and I haven't and never will! I have absolutely no problem with breaking "working" code if I believe that that code is exposing users to risks they aren't aware of (you can check my OSS record to prove it, and I'm happy to provide references).

The best advice I can give anyone in this debate, on either side, is to make decisions that you can live with. Consider the consequences, consider the promises you've made to users, and then do what you think is right. Guido and Larry have decided to go with backward-compatibility: fine. They're responsible, the buck stops with them, they know that. The same is true for you, Ted, with the /dev/urandom device.

If it were me, I'd change the behaviour of /dev/urandom in a heartbeat. But then again, I'm not Ted Ts'o, and I suspect that instinct is part of why.

For my part, thanks for participating, Ted. It's good to know you know what the problems are, even if your solution isn't necessarily the one I'd go for. =)

Cory

From tytso at mit.edu  Sun Jun 12 17:10:38 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Sun, 12 Jun 2016 17:10:38 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBm8AC+HkBR_D0zMCijnTaGUwUq9-GtmEfCziksw=Lv9yQ@mail.gmail.com>
References: <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <CAPJVwBm8AC+HkBR_D0zMCijnTaGUwUq9-GtmEfCziksw=Lv9yQ@mail.gmail.com>
Message-ID: <20160612211038.GF1986@thunk.org>

On Sun, Jun 12, 2016 at 11:07:22AM -0700, Nathaniel Smith wrote:
> But for example, if a process is actively blocked waiting
> for the initial entropy, one could spawn a kernel thread that keeps the
> system from quiescing by attempting to scrounge up entropy as fast as
> possible, via whatever mechanisms are locally appropriate (e.g. doing a
> busy-loop racing two clocks against each other, or just scheduling lots of
> interrupts -- which I guess is the same thing, more or less).

There's a lot of snake oil, or at least, hand waving, that goes on
with respect to what will actually work to gather randomness.  One of
the worst possible choices is a standard, kernel-defined workload that
tries to just busy loop two clocks against each other.  For one thing,
on many embedded systems, all of your clocks are generated off of a
single master oscillator anyway.  And in early boot, it's not
realistic for the kernel to be able to measure network interrupt
timings and radio strength indicators from the WiFi, which ultimately
is going to be much more likely to be unpredictable by an outside
attacker sitting in Fort Meade than pretending that you can just
"schedule lots of interrupts".

Again, part of the problem here is that if you really want to be
secure, it needs to be a full stack perspective, where the hardware
designers, the OS developers, and the application level developers are
all working together.  If one side tries to exert a strong "somebody
else's problem field", it's very likely the end solution isn't going
to be secure.  Because in many cases this is simply not practical, we
all have to make assumptions at the OS and C-Python interpreter level,
and hope that the assumptions that we make are conservative
enough.

> Is this an approach that you've considered?

Ultimately, the arguments made by approaches such as Jitterbug are, to
put it succinctly and perhaps a little unfairly, "gee whillikers, the
Intel L1/L2 cache hierarchy is really complicated and it's a closed
hardware implementation so no one can understand it, and besides, the
statistical analysis of the output looks good".

To which I would say, "the first argument is an argument of security
through ignorance", and "AES(NSA_KEY, COUNTER++)" also has really
great statistical results, and if you don't know the NSA_KEY, it will
look very strong and as far as we know, we wouldn't be able to
distinguish it from a truly secure random number generator --- but it
really isn't secure.
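
To make the point concrete, here is a rough sketch, not anything taken from
the kernel.  The Python standard library has no AES, so this swaps in
HMAC-SHA256 over a counter as the keyed function; SECRET_KEY, keyed_stream
and ones_fraction are names invented for the illustration.  The output looks
statistically balanced, yet anyone holding the key can regenerate every byte.

```python
import hashlib
import hmac


def keyed_stream(key: bytes, nbytes: int) -> bytes:
    """Return nbytes of HMAC-SHA256(key, counter) output for counter = 0, 1, 2, ..."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        # Each block depends only on the key and the counter: no entropy at all.
        block = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:nbytes])


def ones_fraction(data: bytes) -> float:
    """Fraction of bits set to 1; near 0.5 for statistically 'good' output."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


# Hypothetical secret, known only to whoever built the generator.
SECRET_KEY = b"known only to the attacker"

stream = keyed_stream(SECRET_KEY, 4096)
print(ones_fraction(stream))                      # looks balanced
print(stream == keyed_stream(SECRET_KEY, 4096))   # True: reproducible with the key
```

A simple bit-balance check is of course far weaker than a real statistical
test suite, but the conclusion is the same: passing such tests says nothing
about whether an attacker can predict the output.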

So yeah, I don't buy it.  In order for it to be secure, we need to be
grabbing measurements which can't be replicated or determined by a
remote attacker.  So having the kernel kick off a kernel thread is not
going to be useful unless we can mix in entropy from the user, or the
workload, or the local configuration, or from the local environment.
(Using RSSI is helpful because the remote attacker might not know
whether your mobile handset is in the knapsack under the table, or on
the desk, and that will change the RSSI numbers.)  Remember, the whole
*point* of modern CPU designs is that huge amounts of engineering
effort are put into making the CPU predictable, and so spawning a
kernel thread in isolation isn't going to perform magic in terms of
getting guaranteed unpredictability.

> FWIW, the systemd thing is a red herring -- this was debian's configuration
> of a particular daemon that is not maintained by the systemd project, and
> the exact same thing would have happened with sysvinit if debian had tried
> using python 3.5 early in their rcS.

It's not a daemon.  It's the script in
/lib/systemd/system-generators/systemd-crontab-generator, and it's
needed because systemd subsumed the cron daemon, and developers who
wanted to not break users' existing crontab files turned to it.  I
suppose you are technically correct that it is not maintained by
systemd, but the need for it was generated out of systemd's lack of
concern for backwards compatibility.

Because FreeBSD and Mac OS are not using systemd, they are not likely
to run into this problem.  I will grant that if they decided to try to
run a python script out of their /etc/rc script, they would run into
the same problem.

						- Ted

From tytso at mit.edu  Sun Jun 12 19:28:03 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Sun, 12 Jun 2016 19:28:03 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
References: <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
Message-ID: <20160612232803.GB17328@thunk.org>

On Sun, Jun 12, 2016 at 09:01:09PM +0100, Cory Benfield wrote:
> My problem with /dev/urandom is that it's a trap, lying in wait for
> someone who doesn't know enough about the problem they're solving to
> step into it.

And my answer to that question is: absent backwards compatibility
concerns, use getrandom(2) on Linux, or getentropy(2) on *BSD, and be
happy.  Don't use /dev/urandom; use getrandom(2) instead.  That way
you also solve a number of other problems such as the file descriptor
DoS attack issue, etc.
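
For Python code, that advice might look roughly like the sketch below.  It
assumes os.getrandom(), which only exists on Linux builds of Python 3.6 and
later (so after this thread); random_bytes is a made-up helper name, and the
fallback to os.urandom() is just one possible choice for other platforms.

```python
import os


def random_bytes(n: int) -> bytes:
    """Return n random bytes, preferring getrandom(2) where Python exposes it."""
    if hasattr(os, "getrandom"):
        # With flags=0, os.getrandom() blocks until the kernel pool has been
        # initialized.  It may return fewer bytes than requested, so loop.
        chunks = []
        remaining = n
        while remaining > 0:
            chunk = os.getrandom(remaining)
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    # Assumed fallback for platforms where os.getrandom() is unavailable.
    return os.urandom(n)


token = random_bytes(32)
print(len(token))
```

No file descriptor is opened on the getrandom() path, which is what avoids
the descriptor-exhaustion problem mentioned above.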

The problem with Python is that you *do* have backwards compatibility
concerns.  At which point you are faced with the same issues that we
are in the kernel; except I gather that the commitment to
backwards compatibility isn't quite as absolute (although it is
strong).  Which is why I've been trying very hard not to tell
python-dev what to do, but rather to give you folks the best
information I can, and then encouraging you to do whatever seems most
"Pythony" --- which might or might not be the same as the decisions
we've made in the kernel.

Cheers,

						- Ted

P.S.  BTW, I probably won't change the behaviour of /dev/urandom to
make it be blocking.  Before I found out about Python Bug #26839, I
actually had patches that did make /dev/urandom blocking, and they
were planned for the next kernel merge window.  But ultimately, the
reason why I won't is because there is a set of real users (Debian
Stretch users on Amazon AWS and Google GCE) for which if I changed how
/dev/urandom worked, then I would be screwing them over, even if
Python 3.5.2 falls back to /dev/urandom.  It's not a problem for bare
metal hardware and cloud systems with virtio-rng; I have patches that
will take care of those scenarios.

Unfortunately, both AWS and GCE don't support virtio-rng currently,
and as much as some people are worried about the hypothetical problems
of stupidly written/deployed Python scripts that try to generate
long-term secrets during early boot, weighed against the very real
prospect of user lossage on two of the most popular Cloud environments
out there --- it's simply no contest.

From larry at hastings.org  Sun Jun 12 20:21:14 2016
From: larry at hastings.org (Larry Hastings)
Date: Sun, 12 Jun 2016 17:21:14 -0700
Subject: [Python-Dev] Reminder: 3.6.0a2 snapshot 2016-06-13 12:00 UTC
In-Reply-To: <23B6CAA5-6E07-4F2B-898F-B9EABF8E9BD0@python.org>
References: <23B6CAA5-6E07-4F2B-898F-B9EABF8E9BD0@python.org>
Message-ID: <575DFC7A.7000403@hastings.org>

On 06/10/2016 03:23 PM, Ned Deily wrote:
> Also note that Larry has announced plans to do a 3.5.2 release candidate sometime this weekend and Benjamin plans to do a 2.7.12 release candidate.  So get important maintenance release fixes in ASAP.

To clarify: /both/ 3.5.2rc1 /and/ 3.4.5rc1 were tagged yesterday and 
will ship later today.

//arry/

From njs at pobox.com  Sun Jun 12 21:53:54 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 12 Jun 2016 18:53:54 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160612232803.GB17328@thunk.org>
References: <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
Message-ID: <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>

On Sun, Jun 12, 2016 at 4:28 PM, Theodore Ts'o <tytso at mit.edu> wrote:
> P.S.  BTW, I probably won't change the behaviour of /dev/urandom to
> make it be blocking.  Before I found out about Python Bug #26839, I
> actually had patches that did make /dev/urandom blocking, and they
> were planned for the next kernel merge window.  But ultimately, the
> reason why I won't is because there is a set of real users (Debian
> Stretch users on Amazon AWS and Google GCE) for which if I changed how
> /dev/urandom worked, then I would be screwing them over, even if
> Python 3.5.2 falls back to /dev/urandom.  It's not a problem for bare
> metal hardware and cloud systems with virtio-rng; I have patches that
> will take care of those scenarios.
>
> Unfortunately, both AWS and GCE don't support virtio-rng currently,
> and as much as some people are worried about the hypothetical problems
> of stupidly written/deployed Python scripts that try to generate
> long-term secrets during early boot, weighed against the very real
> prospect of user lossage on two of the most popular Cloud environments
> out there --- it's simply no contest.

Speaking of full-stack perspectives, would it affect your decision if
Debian Stretch were made robust against blocking /dev/urandom on
AWS/GCE? Because I think we could find lots of people who would be
overjoyed to fix Stretch before the next merge window even opens
(AFAICT the quick fix is literally a 1 line patch), if that allowed
the blocking /dev/urandom patches to go in upstream...

(It looks like Jessie isn't affected, because while Jessie does
provide a systemd-cron package for those who decide to install it,
Jessie's systemd-cron is still using python2, python2 doesn't have
hash randomization so it doesn't touch /dev/urandom at startup, and
systemd-cron doesn't have any code that would trigger access to
/dev/urandom otherwise. It looks like Xenial *is* affected, because
they ship systemd-cron with python3, but their python3 is still
unconditionally using getrandom() in blocking mode, so they need to
patch that regardless, and could just as easily make it robust against
blocking /dev/urandom at the same time. I don't understand the RPM
world as well, but I can't find any evidence that Fedora or SuSE ship
systemd-cron at all.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From larry at hastings.org  Sun Jun 12 23:16:12 2016
From: larry at hastings.org (Larry Hastings)
Date: Sun, 12 Jun 2016 20:16:12 -0700
Subject: [Python-Dev] [RELEASED] Python 3.4.5rc1 and Python 3.5.2rc1 are now
 available
Message-ID: <575E257C.5020808@hastings.org>

On behalf of the Python development community and the Python 3.4 and 
Python 3.5 release teams, I'm pleased to announce the availability of 
Python 3.4.5rc1 and Python 3.5.2rc1.

Python 3.4 is now in "security fixes only" mode.  This is the final 
stage of support for Python 3.4.  All changes made to Python 3.4 since 
Python 3.4.4 should be security fixes only; conventional bug fixes are 
not accepted.  Also, Python 3.4.5rc1 and all future releases of Python 
3.4 will only be released as source code--no official binary installers 
will be produced.

Python 3.5 is still in active "bug fix" mode.  Python 3.5.2rc1 contains 
many incremental improvements over Python 3.5.1.

Both these releases are "release candidates".  They should not be 
considered the final releases, although the final releases should 
contain only minor differences.  Python users are encouraged to test 
with these releases and report any problems they encounter.

You can find Python 3.4.5rc1 here:

    https://www.python.org/downloads/release/python-345rc1/

And you can find Python 3.5.2rc1 here:

    https://www.python.org/downloads/release/python-352rc1/ 

Python 3.4.5 final and Python 3.5.2 final are both scheduled for release 
on June 26th, 2016.

Happy Pythoneering,

//arry/

From benjamin at python.org  Sun Jun 12 23:35:25 2016
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 12 Jun 2016 20:35:25 -0700
Subject: [Python-Dev] [RELEASE] Python 2.7.12 release candidate 1
Message-ID: <1465788925.287521.635663601.5B6BD6AA@webmail.messagingengine.com>

Python 2.7.12 release candidate 1 is now available for download. This is
a preview release of the next bugfix release in the Python 2.7.x series.
Assuming no horrible regressions are located, a final release will
follow in two weeks.

Downloads for 2.7.12rc1 can be found on python.org:
    https://www.python.org/downloads/release/python-2712rc1/

The complete changelog may be viewed at
    https://hg.python.org/cpython/raw-file/v2.7.12rc1/Misc/NEWS

Please test the pre-release and report any bugs to
   https://bugs.python.org

Servus,
Benjamin

From steve at pearwood.info  Mon Jun 13 00:29:31 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 13 Jun 2016 14:29:31 +1000
Subject: [Python-Dev] Stop using timeit, use perf.timeit!
In-Reply-To: <1465688598.1078301.634935153.258EE6BF@webmail.messagingengine.com>
References: <CAMpsgwYTa1N+Eg1GwDx1pbRkTQ+D8qxhamiFhP-PcvM1QG+=sA@mail.gmail.com>
 <CAK1QoopjYefq0+M_ROyOKA7XEbyNhGfYJt1Lh_GUoHJ=emejPw@mail.gmail.com>
 <CAMpsgwbCKB17197Qgw50aqT6u4uF6FzbpAFqchG9zqprZ=XM+w@mail.gmail.com>
 <20160611014549.GK27919@ando.pearwood.info>
 <1465688598.1078301.634935153.258EE6BF@webmail.messagingengine.com>
Message-ID: <20160613042931.GS27919@ando.pearwood.info>

On Sat, Jun 11, 2016 at 07:43:18PM -0400, Random832 wrote:
> On Fri, Jun 10, 2016, at 21:45, Steven D'Aprano wrote:
> > If you express your performances as speeds (as "calculations per 
> > second") then the harmonic mean is the right way to average them.
> 
> That's true in so far as you get the same result as if you were to take
> the arithmetic mean of the times and then converted from that to
> calculations per second. Is there any other particular basis for
> considering it "right"?

I think this is getting off-topic, so extended discussion should 
probably go off-list. But the brief answer is that it gives a physically 
meaningful result if you replace each of the data points with the mean. 
Which specific mean you use depends on how you are using the data 
points.

http://mathforum.org/library/drmath/view/69480.html

Consider the question:

Dave can paint a room in 5 hours, and Sue can paint the same room in 3 
hours. How long will it take them, working together, to paint the room?

The right answer can be found the long way:

Dave paints 1/5 of a room per hour, and Sue paints 1/3 of a room per 
hour, so together they paint (1/5+1/3) = 8/15 of a room per hour. So to 
paint one full room, it takes 15/8 = 1.875 hours.

(Sanity check: after 1.875 hours, Sue has painted 1.875/3 of the room, 
or 62.5%. In that same time, Dave has painted 1.875/5 of the room, or 
37.5%. Add the percentages together, and you have 100% of the room.)

Using the harmonic mean, the problem is simple:

data = 5, 3  # time taken per person
mean = 3.75  # time taken per person on average

Since they are painting the room in parallel, each person need only 
paint half the room on average, giving total time of:

3.75/2 = 1.875 hours

If we were to use the arithmetic mean (5+3)/2 = 4 hours, we'd get the 
wrong answer.
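
The arithmetic above can be checked with a short sketch using exact
fractions.  The harmonic_mean helper here is written out by hand for
illustration; it is not taken from timeit or perf.

```python
from fractions import Fraction


def harmonic_mean(values):
    """Harmonic mean: the count divided by the sum of the reciprocals."""
    values = [Fraction(v) for v in values]
    return len(values) / sum(1 / v for v in values)


times = [5, 3]                      # hours per room for Dave and Sue
mean_time = harmonic_mean(times)    # Fraction(15, 4), i.e. 3.75 hours
together = mean_time / len(times)   # Fraction(15, 8), i.e. 1.875 hours

# Cross-check against "the long way": combined rate is 1/5 + 1/3 rooms/hour.
long_way = 1 / (Fraction(1, 5) + Fraction(1, 3))
print(float(mean_time), float(together), together == long_way)
```

Both routes give 15/8 hours, whereas the arithmetic mean of 4 hours would
give 2 hours for the pair.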

-- 
Steve

From tytso at mit.edu  Mon Jun 13 08:26:54 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Mon, 13 Jun 2016 08:26:54 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
Message-ID: <20160613122654.GE17328@thunk.org>

On Sun, Jun 12, 2016 at 06:53:54PM -0700, Nathaniel Smith wrote:
> 
> Speaking of full-stack perspectives, would it affect your decision if
> Debian Stretch were made robust against blocking /dev/urandom on
> AWS/GCE? Because I think we could find lots of people who would be
> overjoyed to fix Stretch before the next merge window even opens
> (AFAICT the quick fix is literally a 1 line patch), if that allowed
> the blocking /dev/urandom patches to go in upstream...

Alas, it's not just Debian.  Apparently it breaks the boot on Openwrt
as well as Ubuntu Quantal:

	https://lkml.org/lkml/2016/6/13/48
	https://lkml.org/lkml/2016/5/31/599

(Yay for an automated test infrastructure that fires off as soon as
you push to an externally visible git repository.  :-)

I haven't investigated to see exactly *why* it's blowing up on these
userspace setups, but it's a great reminder for why changing an
established interface is something that has to be done very carefully
indeed.

						- Ted

From leewangzhong+python at gmail.com  Mon Jun 13 09:35:20 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Mon, 13 Jun 2016 09:35:20 -0400
Subject: [Python-Dev] PEP 468
In-Reply-To: <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
Message-ID: <CAB_e7ixJHtZMw7p4OBRX6K4c05g_T+DCVs-gpLBn=_L1SAjUAA@mail.gmail.com>

I am. I was just wondering if there was an in-progress effort I should be
looking at, because I am interested in extensions to it.

P.S.: If anyone is missing the relevance, Raymond Hettinger's compact dicts
are inherently ordered until a delitem happens.[1] That could be "good
enough" for many purposes, including kwargs and class definition. If
CPython implements efficient compact dicts, it would be easier to propose
order-preserving (or initially-order-preserving) dicts in some places in
the standard.

[1] Whether delitem preserves order depends on whether you want to allow
gaps in your compact entry table. PyPy implemented compact dicts and
chose(?) to make dicts ordered.

On Saturday, June 11, 2016, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> On Fri, Jun 10, 2016 at 11:54 AM, Franklin? Lee
> <leewangzhong+python at gmail.com> wrote:
> > Eric, have you any work in progress on compact dicts?
>
> Nope.  I presume you are talking the proposal Raymond made a while back.
>
> -eric
>

From ethan at stoneleaf.us  Mon Jun 13 12:34:07 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 13 Jun 2016 09:34:07 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
Message-ID: <575EE07F.8040102@stoneleaf.us>

On 06/10/2016 02:13 PM, Franklin? Lee wrote:

> P.S.: If anyone is missing the relevance, Raymond Hettinger's compact
> dicts are inherently ordered until a delitem happens.[1] That could be
> "good enough" for many purposes, including kwargs and class definition.

It would be great for kwargs, but not for class definition: del's can 
happen there, so we need PEP 520 with OrderedDict so the definition 
order is not lost when an item is deleted during class creation.
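
A small sketch of the situation I mean, assuming a metaclass that hands the
class body an OrderedDict.  RecordingMeta and the definition_order attribute
are made-up names for illustration, not what PEP 520 actually specifies.

```python
from collections import OrderedDict


class RecordingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes with this mapping as its namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Record the surviving non-dunder names in definition order.
        cls.definition_order = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Example(metaclass=RecordingMeta):
    a = 1
    tmp = 2
    b = 3
    del tmp        # a deletion in the middle of the class body
    c = 4


print(Example.definition_order)
```

With an OrderedDict the remaining names still come out as a, b, c after the
del; a mapping that is only ordered until the first deletion could not
promise that.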

--
~Ethan~

From berker.peksag at gmail.com  Mon Jun 13 14:12:56 2016
From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=)
Date: Mon, 13 Jun 2016 21:12:56 +0300
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace (round
 3)
In-Reply-To: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>
References: <CALFfu7DXG9o_drdkS5+r6g03eVVzDCDbrKcXymwsCqVDkz+3OQ@mail.gmail.com>
Message-ID: <CAF4280JHB5bCkY6FH7FWrXVxxYJkkXyDXUUDkn5VeofNdEdbng@mail.gmail.com>

On Sun, Jun 12, 2016 at 5:37 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> The following code demonstrates roughly equivalent semantics for the
> default behavior::
>
>    class Meta(type):
>        def __prepare__(cls, *args, **kwargs):

Shouldn't this be wrapped with a classmethod decorator?
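
A quick sketch of why I ask; PlainMeta, ClassMethodMeta and the seen dict
are names made up for the demonstration.  __prepare__ is looked up on the
metaclass and called as Meta.__prepare__(name, bases), so without the
decorator the first parameter receives the class name rather than the
metaclass.

```python
seen = {}


class PlainMeta(type):
    def __prepare__(cls, *args, **kwargs):
        # Called without binding: "cls" is really the new class's name.
        seen["plain"] = cls
        return {}


class ClassMethodMeta(type):
    @classmethod
    def __prepare__(cls, *args, **kwargs):
        # Bound to the metaclass, as the parameter name suggests.
        seen["classmethod"] = cls
        return {}


class A(metaclass=PlainMeta):
    pass


class B(metaclass=ClassMethodMeta):
    pass


print(seen["plain"])          # the string 'A'
print(seen["classmethod"])    # the ClassMethodMeta class
```

The undecorated version runs without error only because of the *args
catch-all, which hides the mismatch.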

+1 from me.

--Berker

From gvanrossum at gmail.com  Mon Jun 13 17:34:56 2016
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 13 Jun 2016 14:34:56 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <CAB_e7izNguoUiJE3sfHMGEWstedTFYrZznWi-mz0KruBMaqHWw@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izNguoUiJE3sfHMGEWstedTFYrZznWi-mz0KruBMaqHWw@mail.gmail.com>
Message-ID: <CAP7+vJKfVEfb1zt2kob-xdD3CdR1g_18f49imVzkjqg4MYcZgw@mail.gmail.com>

Can someone block Franklin until his mailer stops resending this message?

--Guido (mobile)
On Jun 13, 2016 2:26 PM, "Franklin? Lee" <leewangzhong+python at gmail.com>
wrote:

> I am. I was just wondering if there was an in-progress effort I should be
> looking at, because I am interested in extensions to it.
>
> P.S.: If anyone is missing the relevance, Raymond Hettinger's compact
> dicts are inherently ordered until a delitem happens.[1] That could be
> "good enough" for many purposes, including kwargs and class definition. If
> CPython implements efficient compact dicts, it would be easier to propose
> order-preserving (or initially-order-preserving) dicts in some places in
> the standard.
>
> [1] Whether delitem preserves order depends on whether you want to allow
> gaps in your compact entry table. PyPy implemented compact dicts and
> chose(?) to make dicts ordered.
>
> On Saturday, June 11, 2016, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>
>> On Fri, Jun 10, 2016 at 11:54 AM, Franklin? Lee
>> <leewangzhong+python at gmail.com> wrote:
>> > Eric, have you any work in progress on compact dicts?
>>
>> Nope.  I presume you are talking the proposal Raymond made a while back.
>>
>> -eric
>>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>

From mail at timgolden.me.uk  Mon Jun 13 18:08:26 2016
From: mail at timgolden.me.uk (Tim Golden)
Date: Mon, 13 Jun 2016 23:08:26 +0100
Subject: [Python-Dev] PEP 468
In-Reply-To: <CAP7+vJKfVEfb1zt2kob-xdD3CdR1g_18f49imVzkjqg4MYcZgw@mail.gmail.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izNguoUiJE3sfHMGEWstedTFYrZznWi-mz0KruBMaqHWw@mail.gmail.com>
 <CAP7+vJKfVEfb1zt2kob-xdD3CdR1g_18f49imVzkjqg4MYcZgw@mail.gmail.com>
Message-ID: <0b940ccd-33f1-ae21-5295-025bb2b46006@timgolden.me.uk>

I've set him to moderation for now. Beyond that we'd have to unsubscribe 
him altogether and ask him to resubscribe later.

TJG

On 13/06/2016 22:34, Guido van Rossum wrote:
> Can someone block Franklin until his mailer stops resending this message?
>
> --Guido (mobile)
>
> On Jun 13, 2016 2:26 PM, "Franklin? Lee" <leewangzhong+python at gmail.com
> <mailto:leewangzhong%2Bpython at gmail.com>> wrote:
>
>     I am. I was just wondering if there was an in-progress effort I
>     should be looking at, because I am interested in extensions to it.
>
>     P.S.: If anyone is missing the relevance, Raymond
>     Hettinger's compact dicts are inherently ordered until a
>     delitem happens.[1] That could be "good enough" for many purposes,
>     including kwargs and class definition. If CPython implements
>     efficient compact dicts, it would be easier to propose
>     order-preserving (or initially-order-preserving) dicts in some
>     places in the standard.
>
>     [1] Whether delitem preserves order depends on whether you want to
>     allow gaps in your compact entry table. PyPy implemented compact
>     dicts and chose(?) to make dicts ordered.
>
>     On Saturday, June 11, 2016, Eric Snow <ericsnowcurrently at gmail.com
>     <mailto:ericsnowcurrently at gmail.com>> wrote:
>
>         On Fri, Jun 10, 2016 at 11:54 AM, Franklin? Lee
>         <leewangzhong+python at gmail.com> wrote:
>         > Eric, have you any work in progress on compact dicts?
>
>         Nope.  I presume you are talking about the proposal Raymond made a
>         while back.
>
>         -eric
>
>

From python at mrabarnett.plus.com  Mon Jun 13 20:05:06 2016
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 14 Jun 2016 01:05:06 +0100
Subject: [Python-Dev] PEP 468
In-Reply-To: <575EE07F.8040102@stoneleaf.us>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
 <575EE07F.8040102@stoneleaf.us>
Message-ID: <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>

On 2016-06-13 17:34, Ethan Furman wrote:
> On 06/10/2016 02:13 PM, Franklin? Lee wrote:
>
>> P.S.: If anyone is missing the relevance, Raymond Hettinger's compact
>> dicts are inherently ordered until a delitem happens.[1] That could be
>> "good enough" for many purposes, including kwargs and class definition.
>
> It would be great for kwargs, but not for class definition: del's can
> happen there, so we need PEP 520 with OrderedDict so the definition
> order is not lost when an item is deleted during class creation.
>
The order can be lost when an item is deleted because it moves the last 
item into the 'hole' left by the deleted item.

This could be avoided by expanding the items to include the index of the 
'previous' and 'next' item, so that they could be handled like a 
doubly-linked list.

The disadvantage would be that it would use more memory.
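
A minimal sketch of the "move the last item into the hole" deletion described
above. The class name SwapDeleteDict is hypothetical, and a plain dict stands
in for the sparse hash index of a real compact dict; this is only an
illustration of why order is lost, not CPython's or PyPy's implementation.

```python
class SwapDeleteDict:
    """Toy model: a dense entry list plus a key -> position index."""

    def __init__(self):
        self.entries = []   # dense list of (key, value) pairs, no gaps
        self.index = {}     # assumption: stands in for the sparse hash table

    def __setitem__(self, key, value):
        if key in self.index:
            self.entries[self.index[key]] = (key, value)
        else:
            self.index[key] = len(self.entries)
            self.entries.append((key, value))

    def __delitem__(self, key):
        pos = self.index.pop(key)
        last = self.entries.pop()
        if pos < len(self.entries):
            # Fill the hole with the former last entry: no gaps remain,
            # but the moved entry now sits out of insertion order.
            self.entries[pos] = last
            self.index[last[0]] = pos

    def keys(self):
        return [key for key, _ in self.entries]


d = SwapDeleteDict()
for name in "abcd":
    d[name] = name.upper()
del d["b"]
print(d.keys())  # ['a', 'd', 'c']: 'd' moved into the slot 'b' occupied
```

The entry table stays dense, which keeps memory use and iteration simple, at
the price of the ordering guarantee.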

From larry at hastings.org  Mon Jun 13 20:47:12 2016
From: larry at hastings.org (Larry Hastings)
Date: Mon, 13 Jun 2016 17:47:12 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
 <575EE07F.8040102@stoneleaf.us>
 <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>
Message-ID: <575F5410.6080106@hastings.org>

On 06/13/2016 05:05 PM, MRAB wrote:
> This could be avoided by expanding the items to include the index of 
> the 'previous' and 'next' item, so that they could be handled like a 
> doubly-linked list.
>
> The disadvantage would be that it would use more memory.

Another, easier technique: don't fill holes.  Same disadvantage 
(increased memory use), but easier to write and maintain.

//arry/

From python at mrabarnett.plus.com  Mon Jun 13 21:14:26 2016
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 14 Jun 2016 02:14:26 +0100
Subject: [Python-Dev] PEP 468
In-Reply-To: <575F5410.6080106@hastings.org>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
 <575EE07F.8040102@stoneleaf.us>
 <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>
 <575F5410.6080106@hastings.org>
Message-ID: <fa9581a5-371b-644e-ad83-1b265985d100@mrabarnett.plus.com>

On 2016-06-14 01:47, Larry Hastings wrote:
> On 06/13/2016 05:05 PM, MRAB wrote:
>> This could be avoided by expanding the items to include the index of
>> the 'previous' and 'next' item, so that they could be handled like a
>> doubly-linked list.
>>
>> The disadvantage would be that it would use more memory.
>
> Another, easier technique: don't fill holes.  Same disadvantage
> (increased memory use), but easier to write and maintain.
>
When iterating over the dict, you'd need to skip over the holes, so it
would be a good idea to compact it at some point, when there are too many
holes.

From njs at pobox.com  Mon Jun 13 21:33:57 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 13 Jun 2016 18:33:57 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <fa9581a5-371b-644e-ad83-1b265985d100@mrabarnett.plus.com>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
 <575EE07F.8040102@stoneleaf.us>
 <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>
 <575F5410.6080106@hastings.org>
 <fa9581a5-371b-644e-ad83-1b265985d100@mrabarnett.plus.com>
Message-ID: <CAPJVwB=LadX7=QqbgnRb8UZ=NsO3OG1smVta5NgB7D84RoTSAA@mail.gmail.com>

On Jun 13, 2016 6:16 PM, "MRAB" <python at mrabarnett.plus.com> wrote:
>
> On 2016-06-14 01:47, Larry Hastings wrote:
>>
>> On 06/13/2016 05:05 PM, MRAB wrote:
>>>
>>> This could be avoided by expanding the items to include the index of
>>> the 'previous' and 'next' item, so that they could be handled like a
>>> doubly-linked list.
>>>
>>> The disadvantage would be that it would use more memory.
>>
>>
>> Another, easier technique: don't fill holes.  Same disadvantage
>> (increased memory use), but easier to write and maintain.
>>
> When iterating over the dict, you'd need to skip over the holes, so it
> would be a good idea to compact it at some point, when there are too many
> holes.

Right -- but if you wait for some ratio of holes to filled space before
compacting, you can amortize the cost down, and have a good big-O
complexity for both del and iteration simultaneously. Same basic principle
as using proportional overallocation when appending to a list, just in
reverse.

I believe this is what pypy's implementation actually does.
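
A rough sketch of that idea, under stated assumptions: deletions leave a hole
(None) in the entry list, and the list is rebuilt only when holes outnumber
live entries. The class name HoleyOrderedDict and the threshold are
hypothetical, and a plain dict stands in for the hash index; this is not
PyPy's actual code.

```python
class HoleyOrderedDict:
    """Toy ordered mapping: deletions leave holes, compacted lazily."""

    def __init__(self):
        self.entries = []   # (key, value) pairs, or None where an item was deleted
        self.index = {}     # assumption: stands in for the sparse hash table
        self.holes = 0

    def __setitem__(self, key, value):
        pos = self.index.get(key)
        if pos is None:
            self.index[key] = len(self.entries)
            self.entries.append((key, value))
        else:
            self.entries[pos] = (key, value)

    def __delitem__(self, key):
        pos = self.index.pop(key)
        self.entries[pos] = None    # leave a hole; order of the rest is kept
        self.holes += 1
        # Compact only once holes outnumber live entries, so the O(n)
        # rebuild is paid for by the deletions that preceded it.
        if self.holes > len(self.index):
            self._compact()

    def _compact(self):
        self.entries = [e for e in self.entries if e is not None]
        self.index = {key: i for i, (key, _) in enumerate(self.entries)}
        self.holes = 0

    def keys(self):
        return [e[0] for e in self.entries if e is not None]


d = HoleyOrderedDict()
for name in "abcdef":
    d[name] = name.upper()
del d["b"]
del d["d"]
print(d.keys(), len(d.entries))  # order kept, holes still present
del d["e"]
del d["a"]                       # holes now outnumber live entries: compacted
print(d.keys(), len(d.entries))
```

Because the table never holds more holes than live entries after a deletion,
iteration skips at most a constant factor of extra slots.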

-n

From ethan at stoneleaf.us  Mon Jun 13 22:23:10 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 13 Jun 2016 19:23:10 -0700
Subject: [Python-Dev] PEP 468
In-Reply-To: <575F5410.6080106@hastings.org>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
 <575EE07F.8040102@stoneleaf.us>
 <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>
 <575F5410.6080106@hastings.org>
Message-ID: <575F6A8E.3010803@stoneleaf.us>

On 06/13/2016 05:47 PM, Larry Hastings wrote:
> On 06/13/2016 05:05 PM, MRAB wrote:

>> This could be avoided by expanding the items to include the index of
>> the 'previous' and 'next' item, so that they could be handled like a
>> doubly-linked list.
>>
>> The disadvantage would be that it would use more memory.
>
> Another, easier technique: don't fill holes.  Same disadvantage
> (increased memory use), but easier to write and maintain.

I hope this is just an academic discussion: suddenly having Python's 
dicts grow continuously is going to have nasty consequences somewhere.

--
~Ethan~

From nad at python.org  Mon Jun 13 23:57:02 2016
From: nad at python.org (Ned Deily)
Date: Mon, 13 Jun 2016 23:57:02 -0400
Subject: [Python-Dev] Python 3.6.0a2 is now available
Message-ID: <D8EAD32E-506D-4594-8831-A693EC048CE7@python.org>

On behalf of the Python development community and the Python 3.6 release
team, I'm happy to announce the availability of Python 3.6.0a2.
3.6.0a2 is the second of four planned alpha releases of Python 3.6,
the next major release of Python.  During the alpha phase, Python 3.6
remains under heavy development: additional features will be added
and existing features may be modified or deleted.  Please keep in mind
that this is a preview release and its use is not recommended for
production environments.

You can find Python 3.6.0a2 here:

https://www.python.org/downloads/release/python-360a2/ 

The next release of Python 3.6 will be 3.6.0a3, currently scheduled for
2016-07-11.

Enjoy!

--Ned

--
  Ned Deily
  nad at python.org

From g.brandl at gmx.net  Tue Jun 14 02:17:13 2016
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 14 Jun 2016 08:17:13 +0200
Subject: [Python-Dev] Current Python 3.2 status?
In-Reply-To: <CAMNjDR03g6KAWKSpN4qiFj-XkABWBx=RgLoyeH6Y1H_R6cLBOA@mail.gmail.com>
References: <CAMNjDR3hcfXjWywBfUAJqxuq60+oK=1d5hU-k-drAUKMoFOBvA@mail.gmail.com>
 <CAF4280+4DbaEE0T0C9nNVHqS0Yoi=bmFk4aUFb_TJ2=eTJu-OQ@mail.gmail.com>
 <CAMNjDR03g6KAWKSpN4qiFj-XkABWBx=RgLoyeH6Y1H_R6cLBOA@mail.gmail.com>
Message-ID: <njo7h9$ggd$1@ger.gmane.org>

On 06/11/2016 07:41 PM, Chi Hsuan Yen wrote:
> 
> 
> On Sun, Jun 12, 2016 at 1:02 AM, Berker Peksağ <berker.peksag at gmail.com
> <mailto:berker.peksag at gmail.com>> wrote:
> 
>     On Sat, Jun 11, 2016 at 8:59 AM, Chi Hsuan Yen <yan12125 at gmail.com
>     <mailto:yan12125 at gmail.com>> wrote:
>     > Hello all,
>     >
>     > Georg said in February that 3.2.7 is going to be released, and now it's
>     > June. Will it ever be released?
> 
>     Hi,
> 
>     It was delayed because of a security issue. See Georg's email at
>     https://mail.python.org/pipermail/python-dev/2016-February/143400.html
> 
>     --Berker
> 
> 
> Thanks for that. I'm just curious what's happening on the 3.2 branch.

Patches being available now, I'll do the releases this weekend.

Georg

From nikita at nemkin.ru  Tue Jun 14 05:41:39 2016
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Tue, 14 Jun 2016 14:41:39 +0500
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
Message-ID: <CANawmycixVLySD9KAAHmxjCoPKfBRGDKWmczwvHmUo3+vuWDkA@mail.gmail.com>

Is there any rationale for rejecting alternatives like:

1. Adding a standard metaclass with an ordered namespace.
2. Adding `namespace` or `ordered` args to the default metaclass.
3. Making the compiler fill in __definition_order__ for every class
    (just like __qualname__) without touching the runtime.
?

To me, any of the above seems preferable to complicating
the core part of the language forever.

The vast majority of Python classes don't care about their member
order; this is a minority use case receiving majority treatment.

Also, wiring OrderedDict into class creation means elevating it
from a peripheral utility to an indispensable built-in type.
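
For concreteness, alternative 1 might look roughly like the sketch below. The
metaclass name OrderedMeta is hypothetical, and storing the result in
__definition_order__ simply borrows the attribute name from the PEP; this is
an illustration of the opt-in approach, not the PEP's proposed implementation.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Opt-in metaclass that records the order in which members are defined."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes in this mapping, so insertion order
        # survives even if a name is deleted partway through the body.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Skip dunder names such as __module__ and __qualname__.
        cls.__definition_order__ = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Point(metaclass=OrderedMeta):
    x = 1
    y = 2

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


print(Point.__definition_order__)  # ('x', 'y', 'norm')
```

Only classes that ask for the metaclass pay for the ordered namespace, which
is the trade-off the question is about.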

From asimkostas at gmail.com  Tue Jun 14 04:44:01 2016
From: asimkostas at gmail.com (asimkon)
Date: Tue, 14 Jun 2016 11:44:01 +0300
Subject: [Python-Dev] mod_python compilation error in VS 2008 for py2.7.1
Message-ID: <CA+9XG=O3oo6sYCmNwBobPjzVTud-b3=qZ2tWCsPxK5dhFntfiA@mail.gmail.com>

I would like to ask a technical question about compiling a Python module
for Python 2.7.1.

I want to compile the mod_python library
<https://app.box.com/s/orsffo3t4g6h9ftkq6p1>
for Apache 2.2 and Python 2.7 <https://www.python.org/downloads/> on Win32 so
that I can use it for the PSP and py scripts I have written. I tried to
compile it using VS 2008 (VC++), and unfortunately I get an error in
pyconfig.h (Py2.7/include): error C2632: 'int' followed by 'int' is illegal.

This problem occurs when I run the bat file in the mod_python/dist
folder. Does anyone have an idea or suggestion for how to build it in a
Win 7 Pro (Win32) environment and produce the final Apache
module (.so)?

To help you assist me, I attach the necessary files and the error log
(the output I get during compilation). I have posted the same
question here
<http://stackoverflow.com/questions/37696936/vc-compilation-error-in-pyconfig-h-vs-2008>,
but unfortunately I have had no luck!

Additionally, here are the compilation instructions that I follow (I also
tried MinGW-w64 and got the same error) to produce the final output:

Compiling

Open a command prompt with VS2008 support. The easiest way to do this is to
use "Start | All Programs | Microsoft Visual Studio 2008 | Visual Studio
Tools | Visual Studio 2008 Command Prompt". (This puts the VS2008 binaries
in the path and sets up the lib/include environmental variables for the
Platform SDK.)

1. cd to the mod_python\dist folder.

2. Tell mod_python where Apache is: set APACHESRC=C:\Apache

3. Run build_installer.bat.

If it succeeds, an installer.exe will be created in a subfolder. Run that
to install the module.

Kind  Regards

Kostas Asimakopoulos
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unistd.h
Type: text/x-chdr
Size: 1753 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160614/787db14d/attachment-0003.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: getopt.h
Type: text/x-chdr
Size: 18564 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160614/787db14d/attachment-0004.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mod_python_error.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 17248 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160614/787db14d/attachment-0001.docx>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pyconfig.h
Type: text/x-chdr
Size: 22098 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160614/787db14d/attachment-0005.h>

From leewangzhong+python at gmail.com  Tue Jun 14 03:47:56 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Tue, 14 Jun 2016 03:47:56 -0400
Subject: [Python-Dev] PEP 468
In-Reply-To: <575F6A8E.3010803@stoneleaf.us>
References: <1465501262.461706.633110089.19D9C3C8@webmail.messagingengine.com>
 <BLU403-EAS1716C1FD5A1F423FBF668CB915F0@phx.gbl>
 <CALFfu7DO3LddxtFAQJCeOA+mxJJG-9qF-oC26G9e8c2zyT8dQg@mail.gmail.com>
 <CAB_e7iw7A46SzWGmnvSn9d0TzYJSZ927iOsppFWTQxtym3bH-g@mail.gmail.com>
 <CALFfu7DOUSBM+EmKKmeU_DaDv30RCtXPu=zG2Bzv9OOeFNBLyg@mail.gmail.com>
 <CAB_e7izLkQ4C=6GdnZ7t3CdY5g-OKY9fkq2Eu5imHjgV9EcZwA@mail.gmail.com>
 <575EE07F.8040102@stoneleaf.us>
 <fb9aa7fa-4683-dc0b-4f6c-9f129838ab10@mrabarnett.plus.com>
 <575F5410.6080106@hastings.org> <575F6A8E.3010803@stoneleaf.us>
Message-ID: <CAB_e7izVcKuhzyp-mSvagLowFQvLAnTYE91GGU=q5v22BxyF0w@mail.gmail.com>

Compact OrderedDicts can leave gaps, and compactify once in a while. For
example, whenever the entry table is full, it can decide whether to resize
(and only copy non-gaps), or just compactify.

Compact regular dicts can swap from the back and have no gaps.

I don't see the point of discussing these details. Isn't it enough to say
that these are solvable problems, which we can worry about if/when someone
actually decides to sit down and implement compact dicts?

P.S.: Sorry about the repeated emails. I think it was the iOS Gmail app.

On Jun 13, 2016 10:23 PM, "Ethan Furman" <ethan at stoneleaf.us> wrote:
>
> On 06/13/2016 05:47 PM, Larry Hastings wrote:
>>
>> On 06/13/2016 05:05 PM, MRAB wrote:
>
>
>>> This could be avoided by expanding the items to include the index of
>>> the 'previous' and 'next' item, so that they could be handled like a
>>> doubly-linked list.
>>>
>>> The disadvantage would be that it would use more memory.
>>
>>
>> Another, easier technique: don't fill holes.  Same disadvantage
>> (increased memory use), but easier to write and maintain.
>
>
> I hope this is just an academic discussion: suddenly having Python's
> dicts grow continuously is going to have nasty consequences somewhere.
>
> --
> ~Ethan~

From steve at pearwood.info  Tue Jun 14 11:07:14 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 15 Jun 2016 01:07:14 +1000
Subject: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom()
 using getrandom() on Linux
In-Reply-To: <20160614143358.10086.1428.9B00D7BE@psf.io>
References: <20160614143358.10086.1428.9B00D7BE@psf.io>
Message-ID: <20160614150713.GX27919@ando.pearwood.info>

Is this right? I thought we had decided that os.urandom should *not* 
fall back on getrandom on Linux?

On Tue, Jun 14, 2016 at 02:36:27PM +0000, victor. stinner wrote:
> https://hg.python.org/cpython/rev/e028e86a5b73
> changeset:   102033:e028e86a5b73
> branch:      3.5
> parent:      102031:a36238de31ae
> user:        Victor Stinner <victor.stinner at gmail.com>
> date:        Tue Jun 14 16:31:35 2016 +0200
> summary:
>   Fix os.urandom() using getrandom() on Linux
> 
> Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> Truncate size to INT_MAX and loop until we collected enough random bytes,
> instead of casting a directly Py_ssize_t to int.
> 
> files:
>   Misc/NEWS       |  4 ++++
>   Python/random.c |  2 +-
>   2 files changed, 5 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -13,6 +13,10 @@
>  Library
>  -------
>  
> +- Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> +  Truncate size to INT_MAX and loop until we collected enough random bytes,
> +  instead of casting a directly Py_ssize_t to int.
> +
>  - Issue #26386: Fixed ttk.TreeView selection operations with item id's
>    containing spaces.
>  
> diff --git a/Python/random.c b/Python/random.c
> --- a/Python/random.c
> +++ b/Python/random.c
> @@ -143,7 +143,7 @@
>             to 1024 bytes */
>          n = Py_MIN(size, 1024);
>  #else
> -        n = size;
> +        n = Py_MIN(size, INT_MAX);
>  #endif
>  
>          errno = 0;
> 
> -- 
> Repository URL: https://hg.python.org/cpython

> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> https://mail.python.org/mailman/listinfo/python-checkins

From steve at pearwood.info  Tue Jun 14 11:19:35 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 15 Jun 2016 01:19:35 +1000
Subject: [Python-Dev] Why does base64 return bytes?
Message-ID: <20160614151935.GY27919@ando.pearwood.info>

Normally I'd take a question like this to Python-List, but this question
has turned out to be quite divisive, with people having strong opinions
but no definitive answer. So I thought I'd ask here and hope that some
of the core devs would have an idea.

Why does base64 encoding in Python return bytes?

base64.b64encode takes bytes as input and returns bytes. Some people are
arguing that this is wrong behaviour, as RFC 3548 specifies that Base64 
should transform bytes to characters:

https://tools.ietf.org/html/rfc3548.html

albeit US-ASCII characters. E.g.:

    The encoding process represents 24-bit groups of input bits 
    as output strings of 4 encoded characters. 
    [...]
    Each 6-bit group is used as an index into an array of 64 printable
    characters.  The character referenced by the index is placed in the
    output string.

Are they misinterpreting the standard? Has Python got it wrong? Is there 
a good reason for returning bytes?
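
For reference, the behaviour in question looks like this in Python 3. This is
just a small illustration using the standard base64 module; the variable names
are arbitrary.

```python
import base64

encoded = base64.b64encode(b"hello")
print(type(encoded), encoded)   # <class 'bytes'> b'aGVsbG8='

# The output only ever contains ASCII, so getting text is one explicit step.
text = encoded.decode("ascii")
print(type(text), text)         # <class 'str'> aGVsbG8=

# Decoding accepts either the bytes or the ASCII-only str form.
assert base64.b64decode(encoded) == b"hello"
assert base64.b64decode(text) == b"hello"
```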

I see that other languages choose different strategies. Microsoft's
languages C#, F# and VB (plus their C++ compiler) take an array of bytes
as input and output a UTF-16 string:

https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx

Java's base64 encoder takes and returns bytes:

https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html

and JavaScript's Base64 encoder takes input as UTF-16 encoded text and
returns the same:

https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding

I'm not necessarily arguing that Python's strategy is the wrong one, but 
I am interested in what (if any) reasons are behind it.

Thanks in advance,

Steve

From jelle.zijlstra at gmail.com  Tue Jun 14 11:27:01 2016
From: jelle.zijlstra at gmail.com (Jelle Zijlstra)
Date: Tue, 14 Jun 2016 08:27:01 -0700
Subject: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom()
 using getrandom() on Linux
In-Reply-To: <20160614150713.GX27919@ando.pearwood.info>
References: <20160614143358.10086.1428.9B00D7BE@psf.io>
 <20160614150713.GX27919@ando.pearwood.info>
Message-ID: <CAFp3-p-hnEdsFsPaBzxO-9d2TXiYw_gR0DGcXgxsvFevNu67hA@mail.gmail.com>

I think this is an issue unrelated to the big discussion from a little
while ago. The problem isn't that os.urandom() uses getrandom(), it's that
it calls it in a mode that may block.

2016-06-14 8:07 GMT-07:00 Steven D'Aprano <steve at pearwood.info>:

> Is this right? I thought we had decided that os.urandom should *not*
> fall back on getrandom on Linux?
>
>
>
> On Tue, Jun 14, 2016 at 02:36:27PM +0000, victor. stinner wrote:
> > https://hg.python.org/cpython/rev/e028e86a5b73
> > changeset:   102033:e028e86a5b73
> > branch:      3.5
> > parent:      102031:a36238de31ae
> > user:        Victor Stinner <victor.stinner at gmail.com>
> > date:        Tue Jun 14 16:31:35 2016 +0200
> > summary:
> >   Fix os.urandom() using getrandom() on Linux
> >
> > Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> > Truncate size to INT_MAX and loop until we collected enough random bytes,
> > instead of casting a directly Py_ssize_t to int.
> >
> > files:
> >   Misc/NEWS       |  4 ++++
> >   Python/random.c |  2 +-
> >   2 files changed, 5 insertions(+), 1 deletions(-)
> >
> >
> > diff --git a/Misc/NEWS b/Misc/NEWS
> > --- a/Misc/NEWS
> > +++ b/Misc/NEWS
> > @@ -13,6 +13,10 @@
> >  Library
> >  -------
> >
> > +- Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> > +  Truncate size to INT_MAX and loop until we collected enough random bytes,
> > +  instead of casting a directly Py_ssize_t to int.
> > +
> >  - Issue #26386: Fixed ttk.TreeView selection operations with item id's
> >    containing spaces.
> >
> > diff --git a/Python/random.c b/Python/random.c
> > --- a/Python/random.c
> > +++ b/Python/random.c
> > @@ -143,7 +143,7 @@
> >             to 1024 bytes */
> >          n = Py_MIN(size, 1024);
> >  #else
> > -        n = size;
> > +        n = Py_MIN(size, INT_MAX);
> >  #endif
> >
> >          errno = 0;
> >
> > --
> > Repository URL: https://hg.python.org/cpython
>

From jsbueno at python.org.br  Tue Jun 14 11:29:25 2016
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Tue, 14 Jun 2016 12:29:25 -0300
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160614151935.GY27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
Message-ID: <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>

On 14 June 2016 at 12:19, Steven D'Aprano <steve at pearwood.info> wrote:
> Is there
> a good reason for returning bytes?

What about: it returns numeric values 0-255 for each position in a stream,
with no clue whatsoever as to how those values map to text characters
beyond the 32-128 range?

Maybe base64.decode could take an optional "encoding" parameter, or there
could be a separate "decode_to_text" method that would explicitly take a
text codec name. Otherwise, no, you simply can't take a bunch of bytes and
say they represent text.

João

(see ^- the "ã" ?)

From victor.stinner at gmail.com  Tue Jun 14 11:35:15 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 14 Jun 2016 17:35:15 +0200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160614151935.GY27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
Message-ID: <CAMpsgwZwiSGvwUNLzNMJjcs03F0iv4scOPdSqB-PUYuOQ2Rq8A@mail.gmail.com>

To port OpenStack to Python 3, I wrote 4 (2x2) helper functions which
accept bytes *and* Unicode as input. xxx_as_bytes() functions return bytes,
xxx_as_text() return Unicode:
http://docs.openstack.org/developer/oslo.serialization/api.html
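
A minimal sketch of what such a 2x2 set of helpers could look like, written
against the standard base64 module. The function names and the utf-8 default
are assumptions chosen for illustration; this is not the oslo.serialization
source, so check the linked documentation for the real API.

```python
import base64


def encode_as_bytes(data, encoding="utf-8"):
    """Base64-encode bytes or text; return bytes."""
    if isinstance(data, str):
        data = data.encode(encoding)
    return base64.b64encode(data)


def encode_as_text(data, encoding="utf-8"):
    """Base64-encode bytes or text; return text."""
    # Base64 output is pure ASCII, so this decode cannot fail.
    return encode_as_bytes(data, encoding).decode("ascii")


def decode_as_bytes(encoded):
    """Base64-decode bytes or ASCII text; return bytes."""
    # base64.b64decode accepts both bytes and ASCII-only str.
    return base64.b64decode(encoded)


def decode_as_text(encoded, encoding="utf-8"):
    """Base64-decode bytes or ASCII text; return text in the given encoding."""
    return decode_as_bytes(encoded).decode(encoding)


print(encode_as_text("hello"))       # aGVsbG8=
print(decode_as_text(b"aGVsbG8="))   # hello
```

The caller states which type is wanted at each call site, so the bytes/text
decision is never implicit.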

Victor
On 14 June 2016 5:21 PM, "Steven D'Aprano" <steve at pearwood.info> wrote:

> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
>
> Why does base64 encoding in Python return bytes?
>
> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
>
> https://tools.ietf.org/html/rfc3548.html
>
> albeit US-ASCII characters. E.g.:
>
>     The encoding process represents 24-bit groups of input bits
>     as output strings of 4 encoded characters.
>     [...]
>     Each 6-bit group is used as an index into an array of 64 printable
>     characters.  The character referenced by the index is placed in the
>     output string.
>
> Are they misinterpreting the standard? Has Python got it wrong? Is there
> a good reason for returning bytes?
>
> I see that other languages choose different strategies. Microsoft's
> languages C#, F# and VB (plus their C++ compiler) take an array of bytes
> as input and output a UTF-16 string:
>
> https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx
>
> Java's base64 encoder takes and returns bytes:
>
> https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html
>
> and Javascript's Base64 encoder takes input as UTF-16 encoded text and
> returns the same:
>
>
> https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
>
> I'm not necessarily arguing that Python's strategy is the wrong one, but
> I am interested in what (if any) reasons are behind it.
>
>
> Thanks in advance,
>
>
>
>
> Steve

From victor.stinner at gmail.com  Tue Jun 14 11:38:40 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 14 Jun 2016 17:38:40 +0200
Subject: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom()
 using getrandom() on Linux
In-Reply-To: <20160614150713.GX27919@ando.pearwood.info>
References: <20160614143358.10086.1428.9B00D7BE@psf.io>
 <20160614150713.GX27919@ando.pearwood.info>
Message-ID: <CAMpsgwZBcks+t_2PDF+QZgPYh_ZcsueUveXZLnqx5Ov8x+QoAQ@mail.gmail.com>

Sorry, I don't have the bandwidth to follow the huge discussion around random
in Python. If you want my help, please write a PEP to summarize the
discussion.

My change fixes an obvious bug. Even if the Python API changes, I don't
expect that all the C code will be removed.
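
To illustrate the shape of the fix (cap each request, then loop until enough
bytes have been collected), here is a Python sketch. The real change is the
one-line C patch to Python/random.c quoted below; the function read_random and
its max_chunk parameter are hypothetical, and os.urandom merely stands in for
the underlying getrandom() call.

```python
import os


def read_random(size, max_chunk=2**31 - 1):
    """Collect `size` random bytes without ever asking for more than max_chunk at once."""
    chunks = []
    remaining = size
    while remaining > 0:
        # Mirrors `n = Py_MIN(size, INT_MAX)`: cap the request instead of
        # passing an oversized value that would be truncated by a cast.
        n = min(remaining, max_chunk)
        chunk = os.urandom(n)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


data = read_random(10, max_chunk=4)  # small cap just to exercise the loop
print(len(data))
```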

Victor
On 14 June 2016 5:11 PM, "Steven D'Aprano" <steve at pearwood.info> wrote:

> Is this right? I thought we had decided that os.urandom should *not*
> fall back on getrandom on Linux?
>
>
>
> On Tue, Jun 14, 2016 at 02:36:27PM +0000, victor. stinner wrote:
> > https://hg.python.org/cpython/rev/e028e86a5b73
> > changeset:   102033:e028e86a5b73
> > branch:      3.5
> > parent:      102031:a36238de31ae
> > user:        Victor Stinner <victor.stinner at gmail.com>
> > date:        Tue Jun 14 16:31:35 2016 +0200
> > summary:
> >   Fix os.urandom() using getrandom() on Linux
> >
> > Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> > Truncate size to INT_MAX and loop until we collected enough random bytes,
> > instead of casting a directly Py_ssize_t to int.
> >
> > files:
> >   Misc/NEWS       |  4 ++++
> >   Python/random.c |  2 +-
> >   2 files changed, 5 insertions(+), 1 deletions(-)
> >
> >
> > diff --git a/Misc/NEWS b/Misc/NEWS
> > --- a/Misc/NEWS
> > +++ b/Misc/NEWS
> > @@ -13,6 +13,10 @@
> >  Library
> >  -------
> >
> > +- Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> > +  Truncate size to INT_MAX and loop until we collected enough random bytes,
> > +  instead of casting a directly Py_ssize_t to int.
> > +
> >  - Issue #26386: Fixed ttk.TreeView selection operations with item id's
> >    containing spaces.
> >
> > diff --git a/Python/random.c b/Python/random.c
> > --- a/Python/random.c
> > +++ b/Python/random.c
> > @@ -143,7 +143,7 @@
> >             to 1024 bytes */
> >          n = Py_MIN(size, 1024);
> >  #else
> > -        n = size;
> > +        n = Py_MIN(size, INT_MAX);
> >  #endif
> >
> >          errno = 0;
> >
> > --
> > Repository URL: https://hg.python.org/cpython
>
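The "truncate and loop" idea in the commit message can be sketched in Python. This is only an analogy to the C change, not the code in Python/random.c: the helper name `read_random` and its `max_chunk` parameter are hypothetical, and `os.urandom()` stands in for the underlying system call.

```python
import os


def read_random(size, max_chunk=1024):
    """Collect exactly `size` random bytes, requesting at most `max_chunk` per call.

    Hypothetical helper: it mirrors the pattern of capping each request
    (compare Py_MIN(size, INT_MAX) in the diff) and looping until done.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    chunks = []
    remaining = size
    while remaining > 0:
        n = min(remaining, max_chunk)  # never ask for more than the cap
        chunk = os.urandom(n)
        chunks.append(chunk)
        remaining -= len(chunk)        # keep looping until enough bytes arrived
    return b"".join(chunks)


sample = read_random(5000)
```

The cap only limits a single request; the loop is what guarantees the caller still receives the full amount.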

From victor.stinner at gmail.com  Tue Jun 14 11:40:33 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 14 Jun 2016 17:40:33 +0200
Subject: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom()
 using getrandom() on Linux
In-Reply-To: <CAFp3-p-hnEdsFsPaBzxO-9d2TXiYw_gR0DGcXgxsvFevNu67hA@mail.gmail.com>
References: <20160614143358.10086.1428.9B00D7BE@psf.io>
 <20160614150713.GX27919@ando.pearwood.info>
 <CAFp3-p-hnEdsFsPaBzxO-9d2TXiYw_gR0DGcXgxsvFevNu67hA@mail.gmail.com>
Message-ID: <CAMpsgwYWXAstEmJbDF4Z_QnizE5if6B1LkCFAV8QEfmuW_vBRg@mail.gmail.com>

On 14 June 2016 5:28 PM, "Jelle Zijlstra" <jelle.zijlstra at gmail.com> wrote:
> The problem isn't that os.urandom() uses getrandom(), it's that it calls
> it in a mode that may block.

Unless it changed very recently, os.urandom() doesn't block anymore
thanks to my previous change ;-)

Victor

From p.f.moore at gmail.com  Tue Jun 14 11:51:44 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 14 Jun 2016 16:51:44 +0100
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160614151935.GY27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
Message-ID: <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>

On 14 June 2016 at 16:19, Steven D'Aprano <steve at pearwood.info> wrote:
> Why does base64 encoding in Python return bytes?

I seem to recall there was a debate about this around the time of the
Python 3 move. (IIRC, it was related to the fact that there used to be
a base64 "codec", that wasn't available in Python 3 because it wasn't
clear whether it converted bytes to text or bytes). I don't remember
any of the details, let alone if a conclusion was reached, but a
search of the archives may find something.

Paul

From a.badger at gmail.com  Tue Jun 14 12:32:30 2016
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Tue, 14 Jun 2016 09:32:30 -0700
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
Message-ID: <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>

On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno" <jsbueno at python.org.br> wrote:
>
> On 14 June 2016 at 12:19, Steven D'Aprano <steve at pearwood.info> wrote:
> > Is there
> > a good reason for returning bytes?
>
> What about: it returns 0-255 numeric values for each position in a
> stream, with no clue whatsoever to how those values map to text
> characters beyond the 32-128 range?
>
> Maybe base64.decode could take an optional "encoding" parameter - or
> there could be a separate "decode_to_text" method that would explicitly
> take a text codec name.
> Otherwise, no, you simply can't take a bunch of bytes and say they
> represent text.
>
Although it's not explicit, the question seems to be about the output of
encoding (and for symmetry, the input of decoding).  In both of those
cases, valid output will consist only of ascii characters.

The input to encoding would have to remain bytes (that's the main purpose
of base64... to turn bytes into an ascii string).

-Toshio

From tjreedy at udel.edu  Tue Jun 14 12:38:46 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 14 Jun 2016 12:38:46 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160614151935.GY27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
Message-ID: <njpbur$v61$1@ger.gmane.org>

On 6/14/2016 11:19 AM, Steven D'Aprano wrote:
> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
>
> Why does base64 encoding in Python return bytes?

Ultimately, because we never decided to change this in 3.0.

> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
>
> https://tools.ietf.org/html/rfc3548.html
>
> albeit US-ASCII characters. E.g.:
>
>     The encoding process represents 24-bit groups of input bits
>     as output strings of 4 encoded characters.

One could argue that 'encoded character' means 'bytes' in Python, but I 
don't know what the standard writer meant, as unicode characters always 
have some internal encoding.

>     [...]
>     Each 6-bit group is used as an index into an array of 64 printable
>     characters.  The character referenced by the index is placed in the
>     output string.
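The two RFC passages quoted above can be followed by hand for a single 24-bit group. This is a minimal sketch, assuming the standard Base64 alphabet; the names `ALPHABET` and `encode_group` are made up for illustration, and padding of short groups is deliberately left out.

```python
import base64

# The standard Base64 alphabet: index 0..63 maps to one printable character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes):
    """Encode exactly 3 bytes (one 24-bit group) as 4 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    value = int.from_bytes(three_bytes, "big")  # the 24-bit group
    # Split into four 6-bit indices, most significant first.
    indices = [(value >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


group = encode_group(b"Man")
library_result = base64.b64encode(b"Man")
```

Here `group` is a str of four characters, while `library_result` is the bytes object holding the ASCII codes of those same four characters, which is exactly the distinction the thread is arguing about.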

-- 
Terry Jan Reedy

From breamoreboy at yahoo.co.uk  Tue Jun 14 12:29:12 2016
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Tue, 14 Jun 2016 17:29:12 +0100
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
Message-ID: <njpbcr$lsk$1@ger.gmane.org>

On 14/06/2016 16:51, Paul Moore wrote:
> On 14 June 2016 at 16:19, Steven D'Aprano <steve at pearwood.info> wrote:
>> Why does base64 encoding in Python return bytes?
>
> I seem to recall there was a debate about this around the time of the
> Python 3 move. (IIRC, it was related to the fact that there used to be
> a base64 "codec", that wasn't available in Python 3 because it wasn't
> clear whether it converted bytes to text or bytes). I don't remember
> any of the details, let alone if a conclusion was reached, but a
> search of the archives may find something.
>
> Paul
>

As I've the time to play detective I'd suggest 
https://mail.python.org/pipermail/python-3000/2007-July/008975.html

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

From tjreedy at udel.edu  Tue Jun 14 12:43:38 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 14 Jun 2016 12:43:38 -0400
Subject: [Python-Dev] mod_python compilation error in VS 2008 for py2.7.1
In-Reply-To: <CA+9XG=O3oo6sYCmNwBobPjzVTud-b3=qZ2tWCsPxK5dhFntfiA@mail.gmail.com>
References: <CA+9XG=O3oo6sYCmNwBobPjzVTud-b3=qZ2tWCsPxK5dhFntfiA@mail.gmail.com>
Message-ID: <23fff550-c19f-ff6f-c663-c206a806b1d9@udel.edu>

On 6/14/2016 4:44 AM, asimkon wrote:
> I would like to ask you a technical question regarding python module
> compilation for python 2.7.1.

So you know, python-list, where you cross-posted this, is the right 
place for discussion of development *with* Python.

python-dev is for development *of* the Python language and future CPython
versions, so this is off-topic here.

-- 
Terry Jan Reedy

From jsbueno at python.org.br  Tue Jun 14 13:05:19 2016
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Tue, 14 Jun 2016 14:05:19 -0300
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
Message-ID: <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>

On 14 June 2016 at 13:32, Toshio Kuratomi <a.badger at gmail.com> wrote:
>
> On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno" <jsbueno at python.org.br> wrote:
>>
>> On 14 June 2016 at 12:19, Steven D'Aprano <steve at pearwood.info> wrote:
>> > Is there
>> > a good reason for returning bytes?
>>
>> What about: it returns 0-255 numeric values for each position in  a
>> stream, with
>> no clue whatsoever to how those values map to text characters beyond
>> the 32-128 range?
>>
>> Maybe base64.decode could take an optional "encoding" parameter - or
>> there could be a separate "decode_to_text" method that would explicitly
>> take a text codec name.
>> Otherwise, no, you simply can't take a bunch of bytes and say they
>> represent text.
>>
> Although it's not explicit, the question seems to be about the output of
> encoding (and for symmetry, the input of decoding).  In both of those cases,
> valid output will consist only of ascii characters.
>
> The input to encoding would have to remain bytes (that's the main purpose of
> base64... to turn bytes into an ascii string).
>

Sorry, it is 2016, and I don't think at this point anyone can consider
an ASCII string as a representative pattern of textual data in any field
of application. Bytes are not text. Bytes with an associated, meaningful
encoding are text. I thought this had been settled when Python 3 came out.

Unless you are working with COBOL-generated data (and intend to keep
the file format), it does not make sense in any real-world field
(supposing your COBOL data is ASCII and not EBCDIC).

> -Toshio

From pmiscml at gmail.com  Tue Jun 14 13:19:09 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Tue, 14 Jun 2016 20:19:09 +0300
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
Message-ID: <20160614201909.3cd4322b@x230>

Hello,

On Tue, 14 Jun 2016 16:51:44 +0100
Paul Moore <p.f.moore at gmail.com> wrote:

> On 14 June 2016 at 16:19, Steven D'Aprano <steve at pearwood.info> wrote:
> > Why does base64 encoding in Python return bytes?
> 
> I seem to recall there was a debate about this around the time of the
> Python 3 move. (IIRC, it was related to the fact that there used to be
> a base64 "codec", that wasn't available in Python 3 because it wasn't
> clear whether it converted bytes to text or bytes). I don't remember
> any of the details, let alone if a conclusion was reached, but a
> search of the archives may find something.

Well, it's easy to remember the conclusion - it was decided to return
bytes. The reason also wouldn't be hard to imagine - regardless of the
fact that base64 uses ASCII codes for digits and letters, it's still
essentially binary data. And the most natural step for it is to send
it down the socket (socket.send() accepts bytes), etc.

I'd find it a bit more surprising that binascii.hexlify() returns
bytes, but I personally got used to it, and consider it a
consistency thing in the binascii module.
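For reference, a short sketch of the hexlify behaviour mentioned above, assuming Python 3.5 or later (where `bytes.hex()` exists as a str-returning alternative). The sample value `raw` is arbitrary.

```python
import binascii

raw = b"\x00\xffPy"

hex_bytes = binascii.hexlify(raw)   # bytes holding ASCII hex digits
hex_text = raw.hex()                # str with the same digits (Python 3.5+)

round_trip = binascii.unhexlify(hex_bytes)
```

Both spellings carry the same digits; only the type of the result differs.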

Generally, with Python3 by default using (inefficient) Unicode for
strings, any efficient data processing would use bytes, and then one
appreciates the fact that data encoding/decoding routines also return
bytes, avoiding implicit expensive conversion to strings.

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From random832 at fastmail.com  Tue Jun 14 13:45:00 2016
From: random832 at fastmail.com (Random832)
Date: Tue, 14 Jun 2016 13:45:00 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
Message-ID: <1465926300.85154.637554673.09FC6B35@webmail.messagingengine.com>

On Tue, Jun 14, 2016, at 13:05, Joao S. O. Bueno wrote:
> Sorry, it is 2016, and I don't think at this point anyone can consider
> an ASCII string
> as a representative pattern of textual data in any field of application.
> Bytes are not text. Bytes with an associated, meaningful, encoding are
> text.
>   I thought this had been through when Python 3 was out.

Of all the things that anyone has said in this thread, this makes the
*least* contextual sense. The input to base64 encoding, which is what is
under discussion, is not text in any way. It is images, it is zip files,
it is executables, it could be the output of os.urandom (at least,
provided it doesn't block ;) for all anyone cares.

The *output* is only an ascii string in the sense that it is a text
string consisting of characters within (a carefully chosen subset of)
ASCII's repertoire, but the output wasn't what he was claiming should be
bytes in the sentence you replied to. Is your objection to the phrase
"ascii string"?

From jsbueno at python.org.br  Tue Jun 14 14:00:24 2016
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Tue, 14 Jun 2016 15:00:24 -0300
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <1465926300.85154.637554673.09FC6B35@webmail.messagingengine.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <1465926300.85154.637554673.09FC6B35@webmail.messagingengine.com>
Message-ID: <CAH0mxTTh21McosYKM1tBT3x53FBFnWLe02unuEU8g56fTNOPyA@mail.gmail.com>

On 14 June 2016 at 14:45, Random832 <random832 at fastmail.com> wrote:
> On Tue, Jun 14, 2016, at 13:05, Joao S. O. Bueno wrote:
>> Sorry, it is 2016, and I don't think at this point anyone can consider
>> an ASCII string
>> as a representative pattern of textual data in any field of application.
>> Bytes are not text. Bytes with an associated, meaningful, encoding are
>> text.
>>   I thought this had been through when Python 3 was out.
>
> Of all the things that anyone has said in this thread, this makes the
> *least* contextual sense. The input to base64 encoding, which is what is
> under discussion, is not text in any way. It is images, it is zip files,
> it is executables, it could be the output of os.urandom (at least,
> provided it doesn't block ;) for all anyone cares.
>
> The *output* is only an ascii string in the sense that it is a text
> string consisting of characters within (a carefully chosen subset of)
> ASCII's repertoire, but the output wasn't what he was claiming should be
> bytes in the sentence you replied to. Is your objection to the phrase
> "ascii string"?
Sorry - everything I wrote, I was thinking about _decoding_ base 64.
As for the result of an encoded base64, yes, of course it fits into ASCII.

The arguments about compactness and what is most likely to happen
next apply (transmission through a binary network protocol),
but the strong objection I had was just because I thought it was
a suggestion of decoding base 64 automatically to text without providing
a text encoding.
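The concern about decoding can be illustrated with a small sketch: the bytes that come out of a base64 decoder only become text once a codec is named explicitly. The sample string and the two codecs are arbitrary choices for illustration.

```python
import base64

original = "naïve café"
encoded = base64.b64encode(original.encode("utf-8"))

payload = base64.b64decode(encoded)        # always bytes
text_utf8 = payload.decode("utf-8")        # correct codec: text restored
text_latin1 = payload.decode("latin-1")    # wrong codec: no error, wrong text
```

Nothing in the base64 payload records which codec produced the bytes, so the caller has to supply it.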


From random832 at fastmail.com  Tue Jun 14 14:02:02 2016
From: random832 at fastmail.com (Random832)
Date: Tue, 14 Jun 2016 14:02:02 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160614201909.3cd4322b@x230>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <20160614201909.3cd4322b@x230>
Message-ID: <1465927322.90071.637565817.6473B401@webmail.messagingengine.com>

On Tue, Jun 14, 2016, at 13:19, Paul Sokolovsky wrote:
> Well, it's easy to remember the conclusion - it was decided to return
> bytes. The reason also wouldn't be hard to imagine - regardless of the
> fact that base64 uses ASCII codes for digits and letters, it's still
> essentially binary data.

Only in the sense that all text is binary data. There's nothing in the
definition of base64 specifying ASCII codes. It specifies *characters*
that all happen to be in ASCII's character repertoire.

>And the most natural step for it is to send
> it down the socket (socket.send() accepts bytes), etc.

How is that more natural than to send it to a text buffer that is
ultimately encoded (maybe not even in an ASCII-compatible encoding...
though probably) and sent down a socket or written to a file by a layer
that is outside your control? Yes, everything eventually ends up as
bytes. That doesn't mean that we should obsessively convert things to
bytes as early as possible.

I mean if we were gonna do that why bother even having a unicode string
type at all?
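One concrete instance of the "text buffer" case is a JSON document, sketched below under the assumption that the standard `json` module is the text layer in question. The field name "data" and the sample `blob` are made up.

```python
import base64
import json

blob = bytes(range(8))
encoded = base64.b64encode(blob)           # bytes

# json.dumps() only serialises str values, so the bytes result is rejected.
try:
    json.dumps({"data": encoded})
    bytes_accepted = True
except TypeError:
    bytes_accepted = False

# An explicit ASCII decode turns the result into text the JSON layer accepts.
document = json.dumps({"data": encoded.decode("ascii")})
restored = base64.b64decode(json.loads(document)["data"])
```

The final encoding of `document` to bytes is then left to whichever layer writes it out.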

> I'd find it a bit more surprising that binascii.hexlify() returns
> bytes, but I personally got used to it, and consider it a
> consistency thing on binascii module.
> 
> Generally, with Python3 by default using (inefficient) Unicode for
> strings, 

Why is it inefficient?

> any efficient data processing would use bytes, and then one
> appreciates the fact that data encoding/decoding routines also return
> bytes, avoiding implicit expensive conversion to strings.

From rdmurray at bitdance.com  Tue Jun 14 14:05:55 2016
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 14 Jun 2016 14:05:55 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
Message-ID: <20160614180556.9A1C0B1401C@webabinitio.net>

On Tue, 14 Jun 2016 14:05:19 -0300, "Joao S. O. Bueno" <jsbueno at python.org.br> wrote:
> On 14 June 2016 at 13:32, Toshio Kuratomi <a.badger at gmail.com> wrote:
> >
> > On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno" <jsbueno at python.org.br> wrote:
> >>
> >> On 14 June 2016 at 12:19, Steven D'Aprano <steve at pearwood.info> wrote:
> >> > Is there
> >> > a good reason for returning bytes?
> >>
> >> What about: it returns 0-255 numeric values for each position in  a
> >> stream, with
> >> no clue whatsoever to how those values map to text characters beyond
> >> the 32-128 range?
> >>
> >> Maybe base64.decode could take an optional "encoding" parameter - or
> >> there could be a separate "decode_to_text" method that would explicitly
> >> take a text codec name.
> >> Otherwise, no, you simply can't take a bunch of bytes and say they
> >> represent text.
> >>
> > Although it's not explicit, the question seems to be about the output of
> > encoding (and for symmetry, the input of decoding).  In both of those cases,
> > valid output will consist only of ascii characters.
> >
> > The input to encoding would have to remain bytes (that's the main purpose of
> > base64... to turn bytes into an ascii string).
> >
> 
> Sorry, it is 2016, and I don't think at this point anyone can consider
> an ASCII string as a representative pattern of textual data in any field
> of application. Bytes are not text. Bytes with an associated, meaningful
> encoding are text. I thought this had been settled when Python 3 came out.
>
> Unless you are working with COBOL-generated data (and intend to keep
> the file format), it does not make sense in any real-world field
> (supposing your COBOL data is ASCII and not EBCDIC).

The fundamental purpose of the base64 encoding is to take a series
of arbitrary bytes and reversibly turn them into another series of
bytes in which the eighth bit is not significant.  Its utility is for
transmitting eight bit bytes over a channel that is not eight bit clean.
Before unicode, that meant bytes.  Now that we have unicode in use in
lots of places, you can think of unicode as a communications channel
that is not eight bit clean.  So, we might want to use base64 encoding to
transmit arbitrary bytes over a unicode channel.  This gives a legitimate
reason to want unicode output from a base64 encoder.   However, it is
equally legitimate in the Python context to say you should be explicit
about your intentions by decoding the bytes output of the base64 encoder
using the ASCII codec.

This was indeed discussed at length.  For a while we didn't even allow
unicode input on either side, but we relaxed that.  My understanding of
Python's current stance on functions that handle both bytes and string
is that *either* the function accepts both types and outputs the *same*
type as the input, *or* it accepts both types but always outputs *one*
type or the other.

You can't have unicode output if you give unicode input to the base64
decoder in the general case.  So decode, at least, has to always give
bytes output.  Likewise, there is small to zero utility for using unicode
input to the base64 encoder, since the unicode would have to be ASCII
only and there'd be no point in doing the encoding.  So, the only thing
that makes sense is to follow the "one output type" rule here.

Now, you can argue whether or not it would make sense for the encoder
to always produce unicode.  However, you then immediately run into the
backward compatibility issue:  the primary use case of the base64 encoding
is to produce *wire ready* bytes.  This is what the email package uses
it for, for example.  So for backward compatibility reasons, which
are consonant with its primary use case, it makes more sense for the
encoder to produce bytes than string.  If you need to transmit bytes
over a unicode channel, you can decode it from ASCII.  That is,
unicode is the *exceptional* use case here, not the rule.  That might
in fact be changing, but for backward compatibility reasons, Python
won't change.

And that should answer Steve's original question :)
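The behaviour described above can be checked with a short sketch against the standard `base64` module in Python 3: the encoder takes bytes and returns bytes, the decoder accepts bytes or an ASCII-only str and always returns bytes, and the ASCII decode is the explicit step for a unicode channel. The sample `data` is arbitrary.

```python
import base64

data = b"\x00\x80\xff binary"

encoded = base64.b64encode(data)        # bytes in, bytes out ("wire ready")
as_text = encoded.decode("ascii")       # explicit step for a unicode channel

from_bytes = base64.b64decode(encoded)  # bytes input
from_text = base64.b64decode(as_text)   # ASCII-only str input is accepted too

# The encoder does not accept str input.
try:
    base64.b64encode("text")
    str_input_ok = True
except TypeError:
    str_input_ok = False
```

Both decode calls return bytes regardless of the input type, matching the "one output type" rule.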

--David

From dholth at gmail.com  Tue Jun 14 14:13:11 2016
From: dholth at gmail.com (Daniel Holth)
Date: Tue, 14 Jun 2016 18:13:11 +0000
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <1465927322.90071.637565817.6473B401@webmail.messagingengine.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <20160614201909.3cd4322b@x230>
 <1465927322.90071.637565817.6473B401@webmail.messagingengine.com>
Message-ID: <CAG8k2+5XTz+wh+0vRpG_+_3Hioa=gdkxrryKUF71ucwe8vLPuA@mail.gmail.com>

IMO this is more a philosophical problem than a programming problem. base64
has a dual-nature. It is both text and bytes. At least it should fit in a
1-byte-per-character efficient Python 3 unicode string also.
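A rough way to look at the storage claim, assuming CPython 3.3 or later, where the flexible string representation of PEP 393 applies. `sys.getsizeof()` reports implementation-specific numbers, so this is an observation about one interpreter rather than a language guarantee.

```python
import base64
import sys

encoded_text = base64.b64encode(bytes(300)).decode("ascii")

# Growth in object size when the string doubles, divided by the added
# length, approximates the storage cost per character.
doubled = encoded_text + encoded_text
per_char = (sys.getsizeof(doubled) - sys.getsizeof(encoded_text)) / len(encoded_text)
```

On other implementations the figure may differ or `sys.getsizeof()` may be unavailable.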

From pmiscml at gmail.com  Tue Jun 14 14:17:16 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Tue, 14 Jun 2016 21:17:16 +0300
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <1465927322.90071.637565817.6473B401@webmail.messagingengine.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <20160614201909.3cd4322b@x230>
 <1465927322.90071.637565817.6473B401@webmail.messagingengine.com>
Message-ID: <20160614211716.354bf102@x230>

Hello,

On Tue, 14 Jun 2016 14:02:02 -0400
Random832 <random832 at fastmail.com> wrote:

> On Tue, Jun 14, 2016, at 13:19, Paul Sokolovsky wrote:
> > Well, it's easy to remember the conclusion - it was decided to
> > return bytes. The reason also wouldn't be hard to imagine -
> > regardless of the fact that base64 uses ASCII codes for digits and
> > letters, it's still essentially binary data.
> 
> Only in the sense that all text is binary data. There's nothing in the
> definition of base64 specifying ASCII codes. It specifies *characters*
> that all happen to be in ASCII's character repertoire.
> 
> >And the most natural step for it is to send
> > it down the socket (socket.send() accepts bytes), etc.
> 
> How is that more natural than to send it to a text buffer that is

It's more natural because it's more efficient. It's more natural in the
same sense that the most natural way to get from point A to point B is
a straight line.

> ultimately encoded (maybe not even in an ASCII-compatible encoding...
> though probably) and sent down a socket or written to a file by a
> layer that is outside your control? Yes, everything eventually ends
> up as bytes. That doesn't mean that we should obsessively convert
> things to bytes as early as possible.

It's vice-versa - there's no need to obsessively convert the simple,
primary type of bytes (everything in computers is bytes) to more
complex things like Unicode strings.

> I mean if we were gonna do that why bother even having a unicode
> string type at all?

You're trying to raise a topic which has been the subject of gigantic
flame wars on python-list for years. Here's my summary: not using the
unicode string type *at all* is better than not using the bytes type at
all. So, feel free to use unicode strings *only* when needed, which is
*only* when you accept input from or produce output for a *human* (like
a real human, walking down a street to do grocery shopping). In all
other cases, data should stay bytes (mind - stay, as it's bytes in the
beginning, and it requires extra effort to convert it to strings).

> > I'd find it a bit more surprising that binascii.hexlify() returns
> > bytes, but I personally got used to it, and consider it a
> > consistency thing on binascii module.
> > 
> > Generally, with Python3 by default using (inefficient) Unicode for
> > strings, 
> 
> Why is it inefficient?

Because bytes is the most efficient basic representation of data.
Everything which tries to convert it to something is less efficient in
general. Less efficient == inefficient.

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From tjreedy at udel.edu  Tue Jun 14 14:17:55 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 14 Jun 2016 14:17:55 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
Message-ID: <njphop$vr5$1@ger.gmane.org>

On 6/14/2016 12:32 PM, Toshio Kuratomi wrote:

> The input to encoding would have to remain bytes (that's the main
> purpose of base64... to turn bytes into an ascii string).

The purpose is to turn arbitrary binary data (commonly images) into 
'safe bytes' that will not get mangled on transmission (7 bit channels 
were once common) and that will not mangle a display of data transmitted 
or received.  Ignoring the EBCDIC world, which Python mostly does, the 
set of 'safe bytes' is the set that encodes printable ascii characters. 
Those bytes pass through 7 bit channels and display on ascii-based 
terminals.
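The "safe bytes" property can be checked directly. This sketch encodes every possible byte value with the standard alphabet and confirms that the output stays inside a small printable, 7-bit set; the names `allowed`, `all_safe` and `seven_bit` are illustrative.

```python
import base64
import string

every_byte = bytes(range(256))
encoded = base64.b64encode(every_byte)

# Letters, digits, '+', '/' and the '=' padding character, as byte values.
allowed = set((string.ascii_letters + string.digits + "+/=").encode("ascii"))

all_safe = set(encoded) <= allowed           # only alphabet bytes appear
seven_bit = all(b < 0x80 for b in encoded)   # eighth bit is never set
restored = base64.b64decode(encoded)
```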

-- 
Terry Jan Reedy

From pmiscml at gmail.com  Tue Jun 14 14:25:48 2016
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Tue, 14 Jun 2016 21:25:48 +0300
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAG8k2+5XTz+wh+0vRpG_+_3Hioa=gdkxrryKUF71ucwe8vLPuA@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <20160614201909.3cd4322b@x230>
 <1465927322.90071.637565817.6473B401@webmail.messagingengine.com>
 <CAG8k2+5XTz+wh+0vRpG_+_3Hioa=gdkxrryKUF71ucwe8vLPuA@mail.gmail.com>
Message-ID: <20160614212548.7361f8a3@x230>

Hello,

On Tue, 14 Jun 2016 18:13:11 +0000
Daniel Holth <dholth at gmail.com> wrote:

> IMO this is more a philosophical problem than a programming problem.
> base64 has a dual-nature. It is both text and bytes. At least it
> should fit in a 1-byte-per-character efficient Python 3 unicode
> string also.

You probably mean "CPython3 1-byte-per-character "efficient" string".
But CPython3 is merely one of a half-dozen Python3 language
implementations. Yup, a special one, but hopefully it's special in the
respect that it doesn't abuse its powers to make language API *changes*
based on its own implementation details. API changes, because API
*decisions* were made long ago already.

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

From tjreedy at udel.edu  Tue Jun 14 14:42:36 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 14 Jun 2016 14:42:36 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <njpbcr$lsk$1@ger.gmane.org>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <njpbcr$lsk$1@ger.gmane.org>
Message-ID: <njpj72$n9i$1@ger.gmane.org>

On 6/14/2016 12:29 PM, Mark Lawrence via Python-Dev wrote:

> As I've the time to play detective I'd suggest
> https://mail.python.org/pipermail/python-3000/2007-July/008975.html

Thank you for finding that.  I reread it and still believe that bytes
was the right choice.  Base64 is a generic edge encoding for binary
data.  It fits in with the standard paradigm as an edge encoding.

Receive encoded bytes.
Decode bytes to python objects
Manipulate python objects
Encode python objects to bytes
Send bytes.

Receive and send can be from and to either local files or sockets 
usually connected to remote systems.  Transmissions can have blocks with 
different encodings. In the latter case, the bytes need to be parsed 
into blocks with different encodings.

In the (fairly common) special case that a transmission consists 
entirely of text in *1* encoding (ignoring any transmission wrappers), 
decode and encode can be incorporated into a text-mode file object.  If 
a transmission consists entirely or partly of binary, one can open in 
binary mode and .write one or more blocks of encoded bytes, possibly 
with encoding data.
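
A minimal Python 3 sketch of the five steps listed above, with Base64 as
the edge encoding.  The payload and the in-memory buffers standing in for
a binary-mode file or socket are assumptions made for illustration.

```python
import base64
import io

# A BytesIO buffer stands in for a binary-mode file or socket (assumption).
wire = io.BytesIO(base64.b64encode(b"\x00\x01binary payload\xff"))

received = wire.read()                            # 1. receive encoded bytes
payload = base64.b64decode(received)              # 2. decode to a Python object
payload = payload.replace(b"binary", b"BINARY")   # 3. manipulate the object
encoded = base64.b64encode(payload)               # 4. encode back to bytes

out = io.BytesIO()
out.write(encoded)                                # 5. send bytes; no str involved
```

Because b64encode returns bytes, step 5 needs no further conversion.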

-- 
Terry Jan Reedy

From stephen at xemacs.org  Tue Jun 14 14:44:47 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 15 Jun 2016 03:44:47 +0900
Subject: [Python-Dev]  Why does base64 return bytes?
In-Reply-To: <20160614151935.GY27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
Message-ID: <22368.20639.247590.870541@turnbull.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

 > base64.b64encode takes bytes as input and returns bytes. Some people are 
 > arguing that this is wrong behaviour, as RFC 3548

That RFC is obsolete: the replacement is RFC 4648.  However, the text
is essentially unchanged.

 > specifies that Base64  should transform bytes to characters:

Without defining "character" except as a "subset" of ASCII.  That
omission is evidently deliberate.  Unfortunately the RFC is unclear
whether a subset of the ASCII repertoire of (abstract) characters is
meant, or a subset of the ASCII codes.  I believe the latter is meant,
but either way, it does refer to *encoded* characters as the output of
the encoding process:

 >     The encoding process represents 24-bit groups of input bits 
 >     as output strings of 4 encoded characters. 

and I see no reason to deny that the bytes output by base64.b64encode
are the octets representing the ASCII codes for the characters of the
BASE64 alphabet.
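
A short Python 3 check of this reading; the sample input b"Man" is an
illustrative choice, not something taken from the RFC discussion here.

```python
import base64

encoded = base64.b64encode(b"Man")   # one 24-bit group of input

# Three input octets become four output octets.
print(encoded)                                 # b'TWFu'

# Each output octet is the ASCII code of a character in the Base64 alphabet.
print([chr(octet) for octet in encoded])       # ['T', 'W', 'F', 'u']
print(all(octet < 128 for octet in encoded))   # True
```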

 > Are they misinterpreting the standard?

I think they are.  As I understand it, the intention of the standard
in using "character" to denote the code unit is similar to that of RFC
3986: BASE encodings are intended to be printable and recognizable to
humans.  If you're using a non-ASCII-superset encoding such as EBCDIC
for text I/O, then you should translate from ASCII to that encoding
for display, and in the (unlikely) case that a human types BASE
encoding from the terminal, the reverse transformation is necessary.

 > Has Python got it wrong?

I can't see anything in the RFC that suggests that.  And, in the end,
an RFC is not concerned with Python's internal fiddling, but rather
with what goes out over the wire.  All of the implementations you
mention will eventually send to the wire octets that are interpreted
as ASCII-encoded characters according to their integer values.

 > Is there a good reason for returning bytes?

I suppose practicality over purity: BASE encodings are normally used
on the wire, and so programs need to encode text to appropriately
encoded octets *before* BASE encoding, and then normally immediately
put the BASE-encoded content on the wire.  Why round-trip from UTF-8
bytes to a str in BASE64 representation, and then do the (trivial)
conversion back to bytes?  OK, it's not that expensive, but still...

From mailing at franzoni.eu  Tue Jun 14 16:25:24 2016
From: mailing at franzoni.eu (Alan Franzoni)
Date: Tue, 14 Jun 2016 22:25:24 +0200
Subject: [Python-Dev] ValuesView abc: why doesn't it (officially) inherit
 from Iterable?
Message-ID: <CAF3z5=mMjJ3oxv6W_PxeMraVi+D6Vbhz9w3JcZp6cNF5_W1Qpg@mail.gmail.com>

Hello,
I hope not to bother anyone with a somewhat trivial question, I was
unable to get an answer from other channels.

I was just checking out some docs on ABCs for a project of mine, where
I need to do some type-related work. Those are the official docs about
the ValuesView type, in both Python 2 and 3:

https://docs.python.org/2/library/collections.html#collections.ValuesView
https://docs.python.org/3/library/collections.abc.html

and this is the source (Python 2, but same happens in Python 3)

https://hg.python.org/releases/2.7.11/file/9213c70c67d2/Lib/_abcoll.py#l479

I was very puzzled about the ValuesView interface, because from a
logical standpoint it should inherit from Iterable, IMHO (it's even
got the __iter__ Mixin method); on the contrary the docs say that it
just inherits from MappingView, which inherits from Sized, which
doesn't inherit from Iterable.

So I fired up my 2.7 interpreter:

>>> from collections import Iterable
>>> d = {1:2, 3:4}
>>> isinstance(d.viewvalues(), Iterable)
True
>>>

It looks iterable, after all, because of Iterable's own subclasshook.
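
A Python 3 sketch of what the subclass hook does.  The class name
OnlyIter is hypothetical, and collections.abc is assumed as the
Python 3 home of these ABCs.

```python
from collections.abc import Iterable

class OnlyIter:
    """Defines __iter__ but never inherits from or registers with Iterable."""
    def __iter__(self):
        return iter(())

# Iterable.__subclasshook__ looks for an __iter__ method on the class.
print(isinstance(OnlyIter(), Iterable))        # True
print(Iterable in OnlyIter.__mro__)            # False: no explicit inheritance
print(isinstance({1: 2}.values(), Iterable))   # True
```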

But I don't understand why ValuesView isn't explicitly Iterable. Other
ABCs, like Sequence, are explicitly inheriting Iterable. Is there some
arcane reason behind that, or it's just a documentation+implementation
shortcoming (with no real-world impact) for a little-used feature?

Bye,

-- 
www.franzoni.eu - Twitter: @alanfranz
contact me at public@[mysurname].eu

From brett at python.org  Tue Jun 14 16:44:19 2016
From: brett at python.org (Brett Cannon)
Date: Tue, 14 Jun 2016 20:44:19 +0000
Subject: [Python-Dev] ValuesView abc: why doesn't it (officially)
 inherit from Iterable?
In-Reply-To: <CAF3z5=mMjJ3oxv6W_PxeMraVi+D6Vbhz9w3JcZp6cNF5_W1Qpg@mail.gmail.com>
References: <CAF3z5=mMjJ3oxv6W_PxeMraVi+D6Vbhz9w3JcZp6cNF5_W1Qpg@mail.gmail.com>
Message-ID: <CAP1=2W4BNVffG8iF8pckUhQRwhd6quX+fcqFEXL1jETDq6pBAQ@mail.gmail.com>

On Tue, 14 Jun 2016 at 13:30 Alan Franzoni <mailing at franzoni.eu> wrote:

> Hello,
> I hope not to bother anyone with a somewhat trivial question, I was
> unable to get an answer from other channels.
>
> I was just checking out some docs on ABCs for a project of mine, where
> I need to do some type-related work. Those are the official docs about
> the ValuesView type, in both Python 2 and 3:
>
> https://docs.python.org/2/library/collections.html#collections.ValuesView
> https://docs.python.org/3/library/collections.abc.html
>
> and this is the source (Python 2, but same happens in Python 3)
>
> https://hg.python.org/releases/2.7.11/file/9213c70c67d2/Lib/_abcoll.py#l479
>
> I was very puzzled about the ValuesView interface, because from a
> logical standpoint it should inherit from Iterable, IMHO (it's even
> got the __iter__ Mixin method); on the contrary the docs say that it
> just inherits from MappingView, which inherits from Sized, which
> doesn't inherit from Iterable.
>
> So I fired up my 2.7 interpreter:
>
> >>> from collections import Iterable
> >>> d = {1:2, 3:4}
> >>> isinstance(d.viewvalues(), Iterable)
> True
> >>>
>
> It looks iterable, after all, because of Iterable's own subclasshook.
>
> But I don't understand why ValuesView isn't explicitly Iterable. Other
> ABCs, like Sequence, are explicitly inheriting Iterable. Is there some
> arcane reason behind that, or it's just a documentation+implementation
> shortcoming (with no real-world impact) for a little-used feature?
>

To add some extra info, both KeysView and ItemsView inherit from Set, which
does inherit from Iterable. I personally don't know why ValuesView doesn't
inherit from Set (although Iterable does override __subclasshook__(), so
there isn't a direct functional loss, which, if this turns out to be a bug,
would explain why no one has noticed until now).

Alan, would you mind filing an issue at bugs.python.org about this?

From mailing at franzoni.eu  Tue Jun 14 16:47:24 2016
From: mailing at franzoni.eu (Alan Franzoni)
Date: Tue, 14 Jun 2016 22:47:24 +0200
Subject: [Python-Dev] ValuesView abc: why doesn't it (officially)
 inherit from Iterable?
In-Reply-To: <CAP1=2W4BNVffG8iF8pckUhQRwhd6quX+fcqFEXL1jETDq6pBAQ@mail.gmail.com>
References: <CAF3z5=mMjJ3oxv6W_PxeMraVi+D6Vbhz9w3JcZp6cNF5_W1Qpg@mail.gmail.com>
 <CAP1=2W4BNVffG8iF8pckUhQRwhd6quX+fcqFEXL1jETDq6pBAQ@mail.gmail.com>
Message-ID: <CAF3z5==m4Zi7CTJO4cH0wMGp-r9w3j8qtgdoEOFtcrdjqqVc-g@mail.gmail.com>

ValuesView doesn't inherit from Set because the values in a dictionary
can contain duplicates. That makes sense. It's just the missing
Iterable, which is a weaker contract, that doesn't.
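
A quick Python 3 illustration, using a made-up dictionary:

```python
from collections.abc import Set

d = {1: "a", 2: "a"}      # two keys share one value

print(list(d.values()))             # ['a', 'a'], so not a set
print(isinstance(d.keys(), Set))    # True
print(isinstance(d.items(), Set))   # True
print(isinstance(d.values(), Set))  # False
```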

I'm filing the bug tomorrow.

On Tue, Jun 14, 2016 at 10:44 PM, Brett Cannon <brett at python.org> wrote:
> On Tue, 14 Jun 2016 at 13:30 Alan Franzoni <mailing at franzoni.eu> wrote:
>>
>> Hello,
>> I hope not to bother anyone with a somewhat trivial question, I was
>> unable to get an answer from other channels.
>>
>> I was just checking out some docs on ABCs for a project of mine, where
>> I need to do some type-related work. Those are the official docs about
>> the ValuesView type, in both Python 2 and 3:
>>
>> https://docs.python.org/2/library/collections.html#collections.ValuesView
>> https://docs.python.org/3/library/collections.abc.html
>>
>> and this is the source (Python 2, but same happens in Python 3)
>>
>>
>> https://hg.python.org/releases/2.7.11/file/9213c70c67d2/Lib/_abcoll.py#l479
>>
>> I was very puzzled about the ValuesView interface, because from a
>> logical standpoint it should inherit from Iterable, IMHO (it's even
>> got the __iter__ Mixin method); on the contrary the docs say that it
>> just inherits from MappingView, which inherits from Sized, which
>> doesn't inherit from Iterable.
>>
>> So I fired up my 2.7 interpreter:
>>
>> >>> from collections import Iterable
>> >>> d = {1:2, 3:4}
>> >>> isinstance(d.viewvalues(), Iterable)
>> True
>> >>>
>>
>> It looks iterable, after all, because of Iterable's own subclasshook.
>>
>> But I don't understand why ValuesView isn't explicitly Iterable. Other
>> ABCs, like Sequence, are explicitly inheriting Iterable. Is there some
>> arcane reason behind that, or it's just a documentation+implementation
>> shortcoming (with no real-world impact) for a little-used feature?
>
>
> To add some extra info, both KeysView and ItemsView inherit from Set, which
> does inherit from Iterable. I personally don't know why ValuesView doesn't
> inherit from Set (although Iterable does override __subclasshook__(), so
> there isn't a direct functional loss, which, if this turns out to be a bug,
> would explain why no one has noticed until now).
>
> Alan, would you mind filing an issue at bugs.python.org about this?

-- 
My development blog: ollivander.franzoni.eu . @franzeur on Twitter
contact me at public@[mysurname].eu

From greg.ewing at canterbury.ac.nz  Tue Jun 14 19:37:34 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 15 Jun 2016 11:37:34 +1200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAH0mxTTh21McosYKM1tBT3x53FBFnWLe02unuEU8g56fTNOPyA@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <1465926300.85154.637554673.09FC6B35@webmail.messagingengine.com>
 <CAH0mxTTh21McosYKM1tBT3x53FBFnWLe02unuEU8g56fTNOPyA@mail.gmail.com>
Message-ID: <5760953E.8050703@canterbury.ac.nz>

Joao S. O. Bueno wrote:
> The arguments about compactness and what is most likely to happen
> next apply (transmission through a binary network protocol),

I'm not convinced that this is what is most likely to
happen next *in a Python program*. How many people
implement their own binary network protocols in Python?
It seems to me most people will be using a protocol
library written by someone else.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Tue Jun 14 19:51:05 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 15 Jun 2016 11:51:05 +1200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160614180556.9A1C0B1401C@webabinitio.net>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <20160614180556.9A1C0B1401C@webabinitio.net>
Message-ID: <57609869.4060304@canterbury.ac.nz>

R. David Murray wrote:
> The fundamental purpose of the base64 encoding is to take a series
> of arbitrary bytes and reversibly turn them into another series of
> bytes in which the eighth bit is not significant.

No, it's not. If that were its only purpose, it would be
called base128, and the RFC would describe it purely in
terms of bit patterns and not mention characters or
character sets at all.

The RFC does *not* do that. It describes the output in
terms of characters, and does not specify any bit patterns
for the output. The intention is clearly to represent
binary data as *text*.

-- 
Greg

From steve at pearwood.info  Tue Jun 14 22:48:05 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 15 Jun 2016 12:48:05 +1000
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <njpbcr$lsk$1@ger.gmane.org>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <njpbcr$lsk$1@ger.gmane.org>
Message-ID: <20160615024805.GA27919@ando.pearwood.info>

On Tue, Jun 14, 2016 at 05:29:12PM +0100, Mark Lawrence via Python-Dev wrote:

> As I've the time to play detective I'd suggest 
> https://mail.python.org/pipermail/python-3000/2007-July/008975.html

Thanks Mark, that's great!

-- 
Steve

From stephen at xemacs.org  Tue Jun 14 22:58:04 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 15 Jun 2016 11:58:04 +0900
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <57609869.4060304@canterbury.ac.nz>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <20160614180556.9A1C0B1401C@webabinitio.net>
 <57609869.4060304@canterbury.ac.nz>
Message-ID: <22368.50236.788324.306156@turnbull.sk.tsukuba.ac.jp>

Greg Ewing writes:

 > The RFC does *not* do that. It describes the output in terms of
 > characters, and does not specify any bit patterns for the
 > output.

The RFC is unclear on this point, but I read it as specifying the
ASCII coded character set, not the ASCII repertoire of (abstract)
characters.  Therefore, it specifies an invertible mapping from a
particular set of integers to characters.

 > The intention is clearly to represent binary data as *text*.

It's more subtle than that.  *RFCs do not deal with text.*  Text is
an internal concept of (some) programming environments.  RFCs may
deal with *encoded text*, and RFC 4648 indeed specifically mentions
"encoded characters" as the output of the BASE64 algorithm.[1]

The intention then is to represent binary data with *binary data that
may be conveniently interpreted as text* (ie, without reencoding), eg,
by a terminal or a printer.[2]  It is also desirable that it be likely
to pass unscathed through channels that are not necessarily even 7-bit
clean (file system directories and JIS X 0201, for example) which
*inadvertently* treat it as text.  Both requirements are conveniently
fulfilled by using appropriate ASCII subsets, and encoding on the wire
using the usual bit patterns.  However, I suppose you could also use
EBCDIC or UTF-16, as long as you have agreed with the receiver to do
so.

So I would say that Python can do what it wants with the type that
base64.b64encode returns as far as the RFC is concerned; that's an
internal aspect of Python.  It's purely a matter of our convenience
(as programmers *in* Python) whether we return str or bytes.

My own experience is biased toward email and web (not to be confused
with SMTP and HTTP), and so my experience is that most composers
(1) automatically handle text encodings for the users, and then the
content transfer encoding as necessary for the underlying protocol,
and (2) handle attachments by placing a reference in the composed
content, which is replaced by the object just before transmission (and
any desired content transfer encoding is applied at that time, at the
option of the composing agent, which rarely needs to bother the user
with such trivia).  Bytes seem more convenient to me, and give an on-
the-wire representation consistent with that of Python 2 str.

Footnotes: 
[1]  Admittedly, RFC 3986 (URIs) does stretch the notion of "encoded
text" to the breaking point by including marks on paper.

[2]  Thus, BASE64-encoding resources provides a more efficient,
alternative datagram protocol for the physical links used by RFC 1149
networks.

From random832 at fastmail.com  Wed Jun 15 00:29:06 2016
From: random832 at fastmail.com (Random832)
Date: Wed, 15 Jun 2016 00:29:06 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <22368.50236.788324.306156@turnbull.sk.tsukuba.ac.jp>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <20160614180556.9A1C0B1401C@webabinitio.net>
 <57609869.4060304@canterbury.ac.nz>
 <22368.50236.788324.306156@turnbull.sk.tsukuba.ac.jp>
Message-ID: <1465964946.2281576.638049745.3C35B600@webmail.messagingengine.com>

On Tue, Jun 14, 2016, at 22:58, Stephen J. Turnbull wrote:
> The RFC is unclear on this point, but I read it as specifying the
> ASCII coded character set, not the ASCII repertoire of (abstract)
> characters.  Therefore, it specifies an invertible mapping from a
> particular set of integers to characters.

There are multiple descriptions of base 64 that specifically mention
using it with EBCDIC and with local character sets of unspecified
nature.

>  > The intention is clearly to represent binary data as *text*.
> 
> It's more subtle than that.  *RFCs do not deal with text.*  Text is
> an internal concept of (some) programming environments.

It's also a human concept. Plenty of RFCs deal with human concepts rather
than purely programming topics.

From greg.ewing at canterbury.ac.nz  Wed Jun 15 01:40:26 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 15 Jun 2016 17:40:26 +1200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <22368.20639.247590.870541@turnbull.sk.tsukuba.ac.jp>
References: <20160614151935.GY27919@ando.pearwood.info>
 <22368.20639.247590.870541@turnbull.sk.tsukuba.ac.jp>
Message-ID: <5760EA4A.4060404@canterbury.ac.nz>

Stephen J. Turnbull wrote:
> it does refer to *encoded* characters as the output of
> the encoding process:
> 
>  >     The encoding process represents 24-bit groups of input bits 
>  >     as output strings of 4 encoded characters.

The "encoding" being referred to there is the encoding
from input bytes to output characters, not an encoding
of the output characters as bytes.

Nowhere in RFC 4648 does it refer to the output as
being made up of "bytes" or "octets". It's always
described in terms of "characters".

> As I understand it, the intention of the standard
> in using "character" to denote the code unit is similar to that of RFC
> 3986: BASE encodings are intended to be printable and recognizable to
> humans.

Hmmm... so why then does it say, in section 4:

    The Base 64 encoding is designed to represent arbitrary sequences of
    octets in a form that ... need not be human readable.

> If you're using a non-ASCII-superset encoding such as EBCDIC
> for text I/O, then you should translate from ASCII to that encoding
> for display,

What about the channel you're sending the encoded data over?

Suppose I'm on Windows and I'm embedding the base64 encoded
data in a text message that I'm sending through a mail client
that accepts text in utf-16.

I hope you would agree that, in that situation, encoding the
base64 output in ASCII and giving those bytes directly to
the mail client would be very much the wrong thing to do?
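
A sketch of that situation in Python 3.  The message text and the choice
of utf-16-le as the channel encoding are illustrative assumptions.

```python
import base64

encoded = base64.b64encode(b"\x00\xff").decode("ascii")   # 'AP8='
message = "attachment: " + encoded

# The channel wants UTF-16, so the whole message is encoded that way.
wire = message.encode("utf-16-le")

# The ASCII octets of the Base64 output do not appear in the UTF-16 stream;
# every character now occupies two bytes.
print(encoded.encode("ascii") in wire)       # False
print(wire.decode("utf-16-le") == message)   # True
```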

-- 
Greg

From larry at hastings.org  Wed Jun 15 01:41:32 2016
From: larry at hastings.org (Larry Hastings)
Date: Tue, 14 Jun 2016 22:41:32 -0700
Subject: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom()
 using getrandom() on Linux
In-Reply-To: <20160614150713.GX27919@ando.pearwood.info>
References: <20160614143358.10086.1428.9B00D7BE@psf.io>
 <20160614150713.GX27919@ando.pearwood.info>
Message-ID: <5760EA8C.4070507@hastings.org>

On 06/14/2016 08:07 AM, Steven D'Aprano wrote:
> Is this right? I thought we had decided that os.urandom should *not*
> fall back on getrandom on Linux?

We decided that os.urandom() should not *block* on Linux.  Which it 
doesn't; we now strictly call getrandom(GRND_NONBLOCK), which will never 
block.  getrandom() is better because it's a system call, instead of 
reading from a file.  So it's much less messy.

If getrandom() wanted to block, instead it'll return EAGAIN, and we'll 
fail over to reading from /dev/urandom directly, just like we did in 3.4 
and before.
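
A rough Python-level sketch of that fallback order.  The real logic lives
in CPython's C code; the function name urandom_sketch is hypothetical, and
os.getrandom() is only exposed on Linux in Python 3.6 and later.

```python
import os

def urandom_sketch(n):
    """Illustrative only: try a non-blocking getrandom(), else read the device."""
    if hasattr(os, "getrandom"):
        try:
            # For small n this returns all n bytes or raises; it never blocks.
            return os.getrandom(n, os.GRND_NONBLOCK)
        except OSError:
            # EAGAIN surfaces as BlockingIOError, a subclass of OSError.
            # This sketch also falls back on other errors such as ENOSYS.
            pass
    with open("/dev/urandom", "rb") as f:
        return f.read(n)

token = urandom_sketch(16)
```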

It's all working as intended,

//arry/

From hodgestar+pythondev at gmail.com  Wed Jun 15 02:22:28 2016
From: hodgestar+pythondev at gmail.com (Simon Cross)
Date: Wed, 15 Jun 2016 08:22:28 +0200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <njpj72$n9i$1@ger.gmane.org>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <njpbcr$lsk$1@ger.gmane.org> <njpj72$n9i$1@ger.gmane.org>
Message-ID: <CAD5NRCFQB5eZQfBFA0KdqMiN_uqrfCJtVtyVJxPK-_n4jqbpSQ@mail.gmail.com>

On Tue, Jun 14, 2016 at 8:42 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> Thank you for finding that.  I reread it and still believe that bytes was
> the right choice.  Base64 is a generic edge encoding for binary data.  It
> fits in with the standard paradigm as an edge encoding.

I'd like to me-too Terry's sentiment, but also expand on it a bit.

Base64 encoding is used to convert bytes into a limited set of symbols
for inclusion in a stream of data. Whether bytes or unicode characters
are appropriate depends on whether the stream being constructed is a
byte stream or a unicode character stream.

Many people do deal with byte streams in Python and we have large
sub-communities for whom this use case is important (e.g. Twisted,
Asyncio, anyone using the socket module).

It is also no longer 1980 though, and there are many protocols layered
on top of unicode character streams rather than bytes.

Ideally I'd like us to support both options (like we've been
increasingly doing for reading from other external sources such as
file systems or environment variables).

If we only support one, I would prefer it to be bytes since (bytes ->
bytes -> unicode) seems like less overhead and slightly conceptually
clearer than (bytes -> unicode -> bytes), but I consider this a
personal preference rather than any sort of one-true-way.

Schiavo
Simon

From greg.ewing at canterbury.ac.nz  Wed Jun 15 03:02:57 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 15 Jun 2016 19:02:57 +1200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <22368.50236.788324.306156@turnbull.sk.tsukuba.ac.jp>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <20160614180556.9A1C0B1401C@webabinitio.net>
 <57609869.4060304@canterbury.ac.nz>
 <22368.50236.788324.306156@turnbull.sk.tsukuba.ac.jp>
Message-ID: <5760FDA1.4000803@canterbury.ac.nz>

Stephen J. Turnbull wrote:

> The RFC is unclear on this point, but I read it as specifying the
> ASCII coded character set, not the ASCII repertoire of (abstract)
> characters.

Well, I think you've misread it. Or at least there is a
more general reading possible that is entirely consistent
with the stated purpose and doesn't assume any particular
output encoding.

> It's more subtle than that.  *RFCs do not deal with text.*

That may be true of most RFCs, but I think this particular
one really *is* talking about text, even if the authors
didn't realise it at the time.

> It is also desirable that it be likely to pass unscathed through channels
> that ... *inadvertently* treat it as text.  Both requirements are
> conveniently fulfilled by using appropriate ASCII subsets, and encoding on
> the wire using the usual bit patterns.

But only if the part that is (deliberately or inadvertently)
treating it as text is using ASCII as its encoding. So, by
your reading of the RFC, base64 is *only* intended for
channels that use ASCII encoding.

Whereas if you drop the assumption of ASCII and use whatever
encoding the channel uses for text, then it works for all
channels.

RFC 4648 doesn't mention it, but an earlier RFC on base64
explicitly said that characters were chosen that also exist
in EBCDIC, so it seems they were intending that base64
should work on EBCDIC-based systems as well as ASCII-based
ones.

> It's purely a matter of our convenience
> (as programmers *in* Python) whether we return str or bytes.

Yes, and it seems to me the decision has been made by people
with their noses stuck in low-level protocol implementations.
Whenever *I've* needed to base64 encode something, I've wanted
the output as text, because that's what I needed to feed into
the next stage of the process.

Maybe there should be two versions of the base64 codec, one
producing bytes and one producing text?

-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed Jun 15 03:07:26 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 15 Jun 2016 19:07:26 +1200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAD5NRCFQB5eZQfBFA0KdqMiN_uqrfCJtVtyVJxPK-_n4jqbpSQ@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <njpbcr$lsk$1@ger.gmane.org> <njpj72$n9i$1@ger.gmane.org>
 <CAD5NRCFQB5eZQfBFA0KdqMiN_uqrfCJtVtyVJxPK-_n4jqbpSQ@mail.gmail.com>
Message-ID: <5760FEAE.7000004@canterbury.ac.nz>

Simon Cross wrote:
> If we only support one, I would prefer it to be bytes since (bytes ->
> bytes -> unicode) seems like less overhead and slightly conceptually
> clearer than (bytes -> unicode -> bytes),

Whereas bytes -> unicode, followed if needed by unicode -> bytes,
seems conceptually clearer to me. IOW, base64 is conceptually a
bytes-to-text transformation, and the usual way to represent
text in Python 3 is unicode.

-- 
Greg

From steve at pearwood.info  Wed Jun 15 08:34:01 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 15 Jun 2016 22:34:01 +1000
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
Message-ID: <20160615123401.GB27919@ando.pearwood.info>

On Tue, Jun 14, 2016 at 09:40:51PM -0700, Guido van Rossum wrote:
> I'm officially on vacation, but I was surprised that people now assume
> RFCs, which specify internet protocols, would have a bearing on programming
> languages. (With perhaps an exception for RFCs that specifically specify
> how programming languages or their libraries should treat certain specific
> issues -- but I found no evidence that this RFC is doing that.)

Sorry to disturb your vacation!

I hoped that there might have been a nice simple answer, like "the 
main use-case for Base64 is the email module, which needs bytes, and 
thus it was decided". Or even "because backwards compatibility".

Thanks to everyone for their constructive comments, and especially Mark 
for digging up the original discussion on the Python-3000 list. I'm 
satisfied that the choice made by Python is the right choice, and that 
it meets the spirit (if, arguably, not the letter) of the RFC.

-- 
Steve

From dholth at gmail.com  Wed Jun 15 08:53:15 2016
From: dholth at gmail.com (Daniel Holth)
Date: Wed, 15 Jun 2016 12:53:15 +0000
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160615123401.GB27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
 <20160615123401.GB27919@ando.pearwood.info>
Message-ID: <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>

In that case could we just add a base64_text() method somewhere? Who would
like to measure whether it would be a win?

On Wed, Jun 15, 2016 at 8:34 AM Steven D'Aprano <steve at pearwood.info> wrote:

> On Tue, Jun 14, 2016 at 09:40:51PM -0700, Guido van Rossum wrote:
> > I'm officially on vacation, but I was surprised that people now assume
> > RFCs, which specify internet protocols, would have a bearing on programming
> > languages. (With perhaps an exception for RFCs that specifically specify
> > how programming languages or their libraries should treat certain specific
> > issues -- but I found no evidence that this RFC is doing that.)
>
> Sorry to disturb your vacation!
>
> I hoped that there might have been a nice simple answer, like "the
> main use-case for Base64 is the email module, which needs bytes, and
> thus it was decided". Or even "because backwards compatibility".
>
> Thanks to everyone for their constructive comments, and especially Mark
> for digging up the original discussion on the Python-3000 list. I'm
> satisfied that the choice made by Python is the right choice, and that
> it meets the spirit (if, arguably, not the letter) of the RFC.
>
>
> --
> Steve

From p.f.moore at gmail.com  Wed Jun 15 09:17:40 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 15 Jun 2016 14:17:40 +0100
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
 <20160615123401.GB27919@ando.pearwood.info>
 <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>
Message-ID: <CACac1F-Nto7ok74=VidqAk75-LrNmvm84dk8_F2zWwdwTpBuzQ@mail.gmail.com>

On 15 June 2016 at 13:53, Daniel Holth <dholth at gmail.com> wrote:
> In that case could we just add a base64_text() method somewhere? Who would
> like to measure whether it would be a win?

"Just adding" a method in the stdlib, means we'd have to support it
long term (backward compatibility). So by the time such an experiment
determined whether it was worth it, it'd be too late.

Finding out whether users/projects typically write such a helper
function for themselves would be a better way of getting this
information. Personally, I suspect they don't, but facts beat
speculation.

Of course, "not every one liner needs to be a stdlib function" applies here too.

Paul

From steve at pearwood.info  Wed Jun 15 11:07:50 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 16 Jun 2016 01:07:50 +1000
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
 <20160615123401.GB27919@ando.pearwood.info>
 <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>
Message-ID: <20160615150750.GC27919@ando.pearwood.info>

On Wed, Jun 15, 2016 at 12:53:15PM +0000, Daniel Holth wrote:
> In that case could we just add a base64_text() method somewhere? Who would
> like to measure whether it would be a win?

Just call .decode('ascii') on the output of base64.b64encode. Not every 
one-liner needs to be a standard function.
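
A minimal sketch of that one-liner, assuming Python 3; the payload bytes
are made up for illustration:

```python
import base64

payload = b"hello world"  # illustrative input, not taken from the thread

# b64encode() returns bytes. The base64 alphabet is pure ASCII,
# so decoding the result as ASCII cannot fail.
encoded_bytes = base64.b64encode(payload)
encoded_text = encoded_bytes.decode("ascii")

print(type(encoded_bytes).__name__, type(encoded_text).__name__)
print(encoded_text)
```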

-- 
Steve

From ijmorlan at uwaterloo.ca  Wed Jun 15 06:21:25 2016
From: ijmorlan at uwaterloo.ca (Isaac Morland)
Date: Wed, 15 Jun 2016 06:21:25 -0400 (EDT)
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <5760FEAE.7000004@canterbury.ac.nz>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <njpbcr$lsk$1@ger.gmane.org> <njpj72$n9i$1@ger.gmane.org>
 <CAD5NRCFQB5eZQfBFA0KdqMiN_uqrfCJtVtyVJxPK-_n4jqbpSQ@mail.gmail.com>
 <5760FEAE.7000004@canterbury.ac.nz>
Message-ID: <alpine.DEB.2.10.1606150618570.31860@ubuntu1404-104.cs.uwaterloo.ca>

On Wed, 15 Jun 2016, Greg Ewing wrote:

> Simon Cross wrote:
>> If we only support one, I would prefer it to be bytes since (bytes ->
>> bytes -> unicode) seems like less overhead and slightly conceptually
>> clearer than (bytes -> unicode -> bytes),
>
> Whereas bytes -> unicode, followed if needed by unicode -> bytes,
> seems conceptually clearer to me. IOW, base64 is conceptually a
> bytes-to-text transformation, and the usual way to represent
> text in Python 3 is unicode.

And in CPython, do I understand correctly that the output text would be 
represented using one byte per character?  If so, would there be a way of 
encoding that into UTF-8 that re-used the raw memory that backs the 
Unicode object?  And, therefore, avoids almost all the inefficiency of 
going via Unicode?  If so, this would be a win - proper use of Unicode to 
represent a text string, combined with instantaneous conversion into a 
bytes object for the purpose of writing to the OS.

Isaac Morland           CSCF Web Guru
DC 2619, x36650         WWW Software Specialist

From ninosm12 at gmail.com  Wed Jun 15 02:40:06 2016
From: ninosm12 at gmail.com (ninostephen mathew)
Date: Wed, 15 Jun 2016 12:10:06 +0530
Subject: [Python-Dev] Bug in the DELETE statement in sqlite3 module
Message-ID: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>

Respected Developer(s),
while writing a database module for one of my applications in python I
encountered something interesting. I had a username and password field in
my table and only one entry which was  "Admin" and "password". While
debugging I purposefully deleted that record. Then I ran the same statement
again. To my surprise, it got executed. Then I ran the statement to delete
the user "admin" (lowercase 'a') which does not exist in the table.
Surprisingly, it again got executed even though the table was empty. What I
expected was an error popping up. But nothing happened.  I hope this error
gets fixed soon. The code snippet is given below.

self.cursor.execute('''DELETE FROM Users WHERE username = ?''', (self.username,))

From dholth at gmail.com  Wed Jun 15 11:11:31 2016
From: dholth at gmail.com (Daniel Holth)
Date: Wed, 15 Jun 2016 15:11:31 +0000
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <alpine.DEB.2.10.1606150618570.31860@ubuntu1404-104.cs.uwaterloo.ca>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CACac1F9wkEGGtn5+pWb9fDO3_HkP6mr=5g49H8J+vZPMsJ+7zA@mail.gmail.com>
 <njpbcr$lsk$1@ger.gmane.org> <njpj72$n9i$1@ger.gmane.org>
 <CAD5NRCFQB5eZQfBFA0KdqMiN_uqrfCJtVtyVJxPK-_n4jqbpSQ@mail.gmail.com>
 <5760FEAE.7000004@canterbury.ac.nz>
 <alpine.DEB.2.10.1606150618570.31860@ubuntu1404-104.cs.uwaterloo.ca>
Message-ID: <CAG8k2+5mY60engCqDSG3rcA2x2LVrsBR0o3BiR0MZ+dTKHFD3g@mail.gmail.com>

It would be a codec. base64_text in the codecs module. Probably 1 line
different than the existing codec. Very easy to use and maintain. Less
surprising and less error prone for everyone who thinks base64 should
convert from bytes to text. Sounds like an obvious win to me.
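
For comparison, a rough sketch of what such a codec would wrap. The
bytes-to-bytes "base64" codec used below already exists; the name
base64_text_encode is hypothetical and only stands in for the proposed
text-returning variant:

```python
import codecs


def base64_text_encode(data: bytes) -> str:
    """Hypothetical helper: base64-encode bytes and return str."""
    # The existing codec is bytes-to-bytes and appends a trailing newline,
    # so the only extra step is an ASCII decode.
    return codecs.encode(data, "base64").decode("ascii")


existing = codecs.encode(b"hello", "base64")
as_text = base64_text_encode(b"hello")
print(repr(existing))
print(repr(as_text))
```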

On Wed, Jun 15, 2016 at 11:08 AM Isaac Morland <ijmorlan at uwaterloo.ca>
wrote:

> On Wed, 15 Jun 2016, Greg Ewing wrote:
>
> > Simon Cross wrote:
> >> If we only support one, I would prefer it to be bytes since (bytes ->
> >> bytes -> unicode) seems like less overhead and slightly conceptually
> >> clearer than (bytes -> unicode -> bytes),
> >
> > Whereas bytes -> unicode, followed if needed by unicode -> bytes,
> > seems conceptually clearer to me. IOW, base64 is conceptually a
> > bytes-to-text transformation, and the usual way to represent
> > text in Python 3 is unicode.
>
> And in CPython, do I understand correctly that the output text would be
> represented using one byte per character?  If so, would there be a way of
> encoding that into UTF-8 that re-used the raw memory that backs the
> Unicode object?  And, therefore, avoids almost all the inefficiency of
> going via Unicode?  If so, this would be a win - proper use of Unicode to
> represent a text string, combined with instantaneous conversion into a
> bytes object for the purpose of writing to the OS.
>
> Isaac Morland           CSCF Web Guru
> DC 2619, x36650         WWW Software Specialist

From duda.piotr at gmail.com  Wed Jun 15 11:25:53 2016
From: duda.piotr at gmail.com (Piotr Duda)
Date: Wed, 15 Jun 2016 17:25:53 +0200
Subject: [Python-Dev] Bug in the DELETE statement in sqlite3 module
In-Reply-To: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
References: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
Message-ID: <CAJ1Wxn0c5=F5=D0Z07rJP6gruG_-M+KBHsqkON086YoiVikJ=Q@mail.gmail.com>

This is not a bug, this is correct behavior of any sql database.

2016-06-15 8:40 GMT+02:00 ninostephen mathew <ninosm12 at gmail.com>:
> Respected Developer(s),
> while writing a database module for one of my applications in python I
> encountered something interesting. I had a username and password field in my
> table and only one entry which was  "Admin" and "password". While debugging
> I purposefully deleted that record. Then I ran the same statement again. To
> my surprise, it got executed. Then I ran the statement to delete the user
> "admin" (lowercase 'a') which does not exist in the table. Surprisingly,
> it again got executed even though the table was empty. What I expected was
> an error popping up. But nothing happened.  I hope this error gets fixed
> soon. The code snippet is given below.
>
> self.cursor.execute(''' DELETE FROM Users WHERE username =
> ?''',(self.username,))

From python-dev at masklinn.net  Wed Jun 15 11:34:31 2016
From: python-dev at masklinn.net (Xavier Morel)
Date: Wed, 15 Jun 2016 17:34:31 +0200
Subject: [Python-Dev] Bug in the DELETE statement in sqlite3 module
In-Reply-To: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
References: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
Message-ID: <0BBBD362-B7A2-4437-9F5A-0843DB686055@masklinn.net>

> On 2016-06-15, at 08:40 , ninostephen mathew <ninosm12 at gmail.com> wrote:
> 
> Respected Developer(s),
> while writing a database module for one of my applications in python I encountered something interesting. I had a username and password field in my table and only one entry which was  "Admin" and "password". While debugging I purposefully deleted that record. Then I ran the same statement again. To my surprise, it got executed. Then I ran the statement to delete the user "admin" (lowercase 'a') which does not exist in the table. Surprisingly, it again got executed even though the table was empty. What I expected was an error popping up. But nothing happened.  I hope this error gets fixed soon. The code snippet is given below.
> 
> self.cursor.execute(''' DELETE FROM Users WHERE username = ?''',(self.username,))

Despite Python bundling sqlite, the Python mailing list is not
responsible for developing SQLite (only for the SQLite bindings
themselves) so this is the wrong mailing list.

That being said, the DELETE statement deletes whichever records in the
table match the provided predicate. If no record matches the predicate,
it will simply delete no record. That is not an error; it is the exact
expected and documented behaviour for the statement in SQL in general
and SQLite in particular.

See https://www.sqlite.org/lang_delete.html for the documentation of the
DELETE statement in SQLite.

While you should feel free to report your expectations to the SQLite
project or to the JTC1/SC32 technical committee (which is responsible
for SQL itself) I fear that's what you will get told there, and that you
are about 30 years too late to try to influence such a core statement of
the language.

Not that it would have worked I'd think, I'm reasonably sure the
behaviour of the DELETE statement is a natural consequence of SQL's set-
theoretic foundations: DELETE applies to a set of records, regardless of
the set's cardinality.
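
For anyone who wants to detect the "nothing matched" case from Python,
here is a small sketch using an in-memory database. The table layout is
assumed from the original report, and delete_user is a made-up helper
name; cursor.rowcount is the standard DB-API attribute:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Users (username TEXT, password TEXT)")
cur.execute("INSERT INTO Users VALUES (?, ?)", ("Admin", "password"))


def delete_user(username):
    """Run the DELETE and return how many rows it removed."""
    cur.execute("DELETE FROM Users WHERE username = ?", (username,))
    return cur.rowcount


first = delete_user("Admin")   # matches the single row
second = delete_user("Admin")  # the table is empty now, so nothing matches
third = delete_user("admin")   # different case, and still nothing to match
print(first, second, third)
conn.close()
```

A caller that wants an error for "no such user" can raise one itself
when the returned count is zero.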

From p.f.moore at gmail.com  Wed Jun 15 11:29:43 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 15 Jun 2016 16:29:43 +0100
Subject: [Python-Dev] Bug in the DELETE statement in sqlite3 module
In-Reply-To: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
References: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
Message-ID: <CACac1F_TTCwk1ThsOE-XbFj08Sy92u0MuwN3Zv9or6ZO4iDLaw@mail.gmail.com>

On 15 June 2016 at 07:40, ninostephen mathew <ninosm12 at gmail.com> wrote:
> Respected Developer(s),
> while writing a database module for one of my applications in python I
> encountered something interesting. I had a username and password field in my
> table and only one entry which was  "Admin" and "password". While debugging
> I purposefully deleted that record. Then I ran the same statement again. To
> my surprise, it got executed. Then I ran the statement to delete the user
> "admin" (lowercase 'a') which does not exist in the table. Surprisingly,
> it again got executed even though the table was empty. What I expected was
> an error popping up. But nothing happened.  I hope this error gets fixed
> soon. The code snippet is given below.
>
> self.cursor.execute(''' DELETE FROM Users WHERE username =
> ?''',(self.username,))

First of all, this list is for the discussions about the development
of Python itself, not for developing applications with Python. You
should probably be posting to python-list instead.

Having said that, this is how SQL works - a DELETE statement selects
all records matching the WHERE clause and deletes them. If the WHERE
clause doesn't match anything, nothing gets deleted. So your code is
working exactly as I would expect.

Paul

From ethan at stoneleaf.us  Wed Jun 15 12:12:07 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 15 Jun 2016 09:12:07 -0700
Subject: [Python-Dev] proposed os.fspath() change
Message-ID: <57617E57.40808@stoneleaf.us>

I would like to make a change to os.fspath().

Specifically, os.fspath() currently raises an exception if something
besides str, bytes, or os.PathLike is passed in, but makes no checks
if an os.PathLike object returns something besides a str or bytes.

I would like to change that to the opposite: if a non-os.PathLike is
passed in, return it unchanged (so no change for str and bytes); if
an os.PathLike object returns something that is not a str nor bytes,
raise.

An example of the difference in the lzma file:

Current code (has not been upgraded to use os.fspath() yet)
-----------------------------------------------------------

     if isinstance(filename, (str, bytes)):
         if "b" not in mode:
             mode += "b"
         self._fp = builtins.open(filename, mode)
         self._closefp = True
         self._mode = mode_code
     elif hasattr(filename, "read") or hasattr(filename, "write"):
         self._fp = filename
         self._mode = mode_code
     else:
         raise TypeError(
              "filename must be a str or bytes object, or a file"
               )

Code change if using upgraded os.fspath() (placed before above stanza):

     filename = os.fspath(filename)

Code change with current os.fspath() (ditto):

     if isinstance(filename, os.PathLike):
         filename = os.fspath(filename)

My intention with the os.fspath() function was to minimize boiler-plate
code and make PathLike objects easy and painless to support; having to
discover if any given parameter is PathLike before calling os.fspath()
on it is, IMHO, just the opposite.
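
To make the proposed semantics concrete, here is a pure-Python sketch.
It is only an illustration under the description above, not the actual
os.fspath() implementation; proposed_fspath and BadPath are made-up
names, and it assumes a Python where pathlib paths define __fspath__:

```python
import io
import pathlib


def proposed_fspath(path):
    """Hypothetical sketch of the proposed behaviour, not the real os.fspath()."""
    if isinstance(path, (str, bytes)):
        return path
    # Look the method up on the type, as is usual for special methods.
    method = getattr(type(path), "__fspath__", None)
    if method is None:
        # Not path-like: hand the object back unchanged.
        return path
    result = method(path)
    if not isinstance(result, (str, bytes)):
        raise TypeError(
            "__fspath__ returned non-str/bytes (type %s)" % type(result).__name__
        )
    return result


class BadPath:
    def __fspath__(self):
        return 42  # deliberately the wrong return type


buffer = io.BytesIO()
print(proposed_fspath(pathlib.PurePosixPath("a/b.xz")))  # path-like becomes str
print(proposed_fspath(buffer) is buffer)                 # file object passes through
```

With this behaviour the lzma stanza above can keep its existing
isinstance/hasattr branches untouched after the single fspath call.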

There is also precedent for having a __dunder__ check the return type:

     --> class Huh:
     ...   def __int__(self):
     ...     return 'string'
     ...   def __index__(self):
     ...     return b'bytestring'
     ...   def __bool__(self):
     ...     return 'true-ish'
     ...
     --> h = Huh()

     --> int(h)
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: __int__ returned non-int (type str)

     --> ''[h]
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: __index__ returned non-int (type bytes)

     --> bool(h)
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: __bool__ should return bool, returned str

Arguments in favor or against?

--
~Ethan~

From guido at python.org  Wed Jun 15 12:33:47 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Jun 2016 09:33:47 -0700
Subject: [Python-Dev] Bug in the DELETE statement in sqlite3 module
In-Reply-To: <CACac1F_TTCwk1ThsOE-XbFj08Sy92u0MuwN3Zv9or6ZO4iDLaw@mail.gmail.com>
References: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
 <CACac1F_TTCwk1ThsOE-XbFj08Sy92u0MuwN3Zv9or6ZO4iDLaw@mail.gmail.com>
Message-ID: <CAP7+vJK-mOG_b6Lj+t7bEfaost9EfPTgUHnLkNP--pq=LxEZSQ@mail.gmail.com>

A point of order: it's not necessary to post three separate "this is the
wrong list" replies. In fact the optimal number is probably close to zero
-- I understand we all want to be helpful, and we don't want to send
duplicate replies, but someone who posts an inappropriate question is
likely to try another venue when they receive no replies, and three replies
to the list implies that some folks are a little too eager to appear
helpful (while reading the list with considerable delay). When the OP pings
the thread maybe one person, preferably someone who reads the list directly
via email from the list server, could post a standard "wrong list" response.

On Wed, Jun 15, 2016 at 8:29 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 15 June 2016 at 07:40, ninostephen mathew <ninosm12 at gmail.com> wrote:
> > Respected Developer(s),
> > while writing a database module for one of my applications in python I
> > encountered something interesting. I had a username and password field in my
> > table and only one entry which was  "Admin" and "password". While debugging
> > I purposefully deleted that record. Then I ran the same statement again. To
> > my surprise, it got executed. Then I ran the statement to delete the user
> > "admin" (lowercase 'a') which does not exist in the table. Surprisingly,
> > it again got executed even though the table was empty. What I expected was
> > an error popping up. But nothing happened.  I hope this error gets fixed
> > soon. The code snippet is given below.
> >
> > self.cursor.execute(''' DELETE FROM Users WHERE username =
> > ?''',(self.username,))
>
> First of all, this list is for the discussions about the development
> of Python itself, not for developing applications with Python. You
> should probably be posting to python-list instead.
>
> Having said that, this is how SQL works - a DELETE statement selects
> all records matching the WHERE clause and deletes them. If the WHERE
> clause doesn't match anything, nothing gets deleted. So your code is
> working exactly as I would expect.
>
> Paul

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Jun 15 12:46:46 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Jun 2016 09:46:46 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <57617E57.40808@stoneleaf.us>
References: <57617E57.40808@stoneleaf.us>
Message-ID: <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>

These are really two separate proposals.

I'm okay with checking the return value of calling obj.__fspath__; that's
an error in the object anyways, and it doesn't matter much whether we do
this or not (though when approving the PEP I considered this and decided
not to insert a check for this). But it doesn't affect your example, does
it? I guess it's easier to raise now and change the API in the future to
avoid raising in this case (if we find that raising is undesirable) than
the other way around, so I'm +0 on this.

The other proposal (passing anything that's not understood right through)
is more interesting and your use case is somewhat compelling. Catching the
exception coming out of os.fspath() would certainly be much messier. The
question remaining is whether, when this behavior is not desired (e.g. when
the caller of os.fspath() just wants a string that it can pass to open()),
the condition of passing something that neither is a string nor supports __fspath__
still produces an understandable error. I'm not sure that that's the case.
E.g. open() accepts file descriptors in addition to paths, but I'm not sure
that accepting an integer is a good idea in most cases -- it either gives a
mystery "Bad file descriptor" error or starts reading/writing some random
system file, which it then closes once the stream is closed.
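
A small sketch of the two behaviours in question, assuming a Python
version that already ships os.fspath(). It uses a pipe so that the
integer handed to open() is a descriptor we own rather than some random
file:

```python
import os

# An int is neither str, bytes, nor path-like, so os.fspath() rejects it.
try:
    os.fspath(3)
except TypeError as exc:
    fspath_error = exc
    print("os.fspath(3) raised:", exc)

# open() accepts an int as an already-open file descriptor...
read_fd, write_fd = os.pipe()
with open(write_fd, "w") as writer:
    writer.write("hi")
with open(read_fd) as reader:
    data = reader.read()
print(data)

# ...and closes that descriptor together with the stream.
try:
    os.fstat(write_fd)
    write_fd_was_closed = False
except OSError:
    write_fd_was_closed = True
print(write_fd_was_closed)
```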

On Wed, Jun 15, 2016 at 9:12 AM, Ethan Furman <ethan at stoneleaf.us> wrote:

> I would like to make a change to os.fspath().
>
> Specifically, os.fspath() currently raises an exception if something
> besides str, bytes, or os.PathLike is passed in, but makes no checks
> if an os.PathLike object returns something besides a str or bytes.
>
> I would like to change that to the opposite: if a non-os.PathLike is
> passed in, return it unchanged (so no change for str and bytes); if
> an os.PathLike object returns something that is not a str nor bytes,
> raise.
>
> An example of the difference in the lzma file:
>
> Current code (has not been upgraded to use os.fspath() yet)
> -----------------------------------------------------------
>
>     if isinstance(filename, (str, bytes)):
>         if "b" not in mode:
>             mode += "b"
>         self._fp = builtins.open(filename, mode)
>         self._closefp = True
>         self._mode = mode_code
>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>         self._fp = filename
>         self._mode = mode_code
>     else:
>         raise TypeError(
>              "filename must be a str or bytes object, or a file"
>               )
>
> Code change if using upgraded os.fspath() (placed before above stanza):
>
>     filename = os.fspath(filename)
>
> Code change with current os.fspath() (ditto):
>
>     if isinstance(filename, os.PathLike):
>         filename = os.fspath(filename)
>
> My intention with the os.fspath() function was to minimize boiler-plate
> code and make PathLike objects easy and painless to support; having to
> discover if any given parameter is PathLike before calling os.fspath()
> on it is, IMHO, just the opposite.
>
> There is also precedent for having a __dunder__ check the return type:
>
>     --> class Huh:
>     ...   def __int__(self):
>     ...     return 'string'
>     ...   def __index__(self):
>     ...     return b'bytestring'
>     ...   def __bool__(self):
>     ...     return 'true-ish'
>     ...
>     --> h = Huh()
>
>     --> int(h)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: __int__ returned non-int (type str)
>
>     --> ''[h]
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: __index__ returned non-int (type bytes)
>
>     --> bool(h)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: __bool__ should return bool, returned str
>
> Arguments in favor or against?
>
> --
> ~Ethan~

-- 
--Guido van Rossum (python.org/~guido)

From tseaver at palladion.com  Wed Jun 15 12:50:18 2016
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 15 Jun 2016 12:50:18 -0400
Subject: [Python-Dev] Bug in the DELETE statement in sqlite3 module
In-Reply-To: <CAP7+vJK-mOG_b6Lj+t7bEfaost9EfPTgUHnLkNP--pq=LxEZSQ@mail.gmail.com>
References: <CAHwT9V9E1Kfw+cAuqVe0OuC=f9_oNDM63fbFFm9dZeCJ5CA+aw@mail.gmail.com>
 <CACac1F_TTCwk1ThsOE-XbFj08Sy92u0MuwN3Zv9or6ZO4iDLaw@mail.gmail.com>
 <CAP7+vJK-mOG_b6Lj+t7bEfaost9EfPTgUHnLkNP--pq=LxEZSQ@mail.gmail.com>
Message-ID: <njs10a$hn0$1@ger.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 06/15/2016 12:33 PM, Guido van Rossum wrote:

> A point of order: it's not necessary to post three separate "this is
> the wrong list" replies. In fact the optimal number is probably close
> to zero -- I understand we all want to be helpful, and we don't want
> to send duplicate replies, but someone who posts an inappropriate
> question is likely to try another venue when they receive no replies,
> and three replies to the list implies that some folks are a little too
> eager to appear helpful (while reading the list with considerable
> delay). When the OP pings the thread maybe one person, preferably
> someone who reads the list directly via email from the list server,
> could post a standard "wrong list" response.

In addition, please don't undermine the "this is the wrong list" message
by responding substantively to the OP's query.

Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJXYYc/AAoJEPKpaDSJE9HYlSgP/1v+FpEvildmH4fEpZXG+j18
jCt3Q48ffSW22oPhx4lyfZv1Sh3EOsEuHHd3oU7jG9kUtTPyluQQYJiygfCBpSev
CP8LonjJxxkFsVwK5SRGcp7JdjiFbLyqUXbtkFM6s2OE7mpXwtbn4suCRJx7MYaO
CUkN2h0vAandftV4xu+lp/r7n0l8HLTTOsrUFuPZRbT4dVzKwRcM+ER1W4tCnkgZ
bFRXM8YjrUcX/Um2blSi4yZT75TvHjyi44ujbQPsR3OHCPN8GAfAzIVSkbiECP2K
xAqT2/h0E6VkGdEymELCMRHvhCI2wFrAoA6nWYCdyR2Ekg7VB/tnr6AGi+SNvP06
BETMf0BRxpd4sXOvS4+ydhBQQpydW4hiw61RHs8xFiy0W7pqp5Zh4ZHHcZBR2KRT
TXfoxrwQIBIWKlyBdgv9d0maOWg3uq3I3MqO2vnGj/XRPsjs/BWCX9BYZqpnEATB
MasQItCMPoOfmVxlS+cS7rIXXVFdwulm2s5GRZR9PwEuMS8Vmi9A5UyEpshlDYZM
ZMPT3CScFOyczVgC3N+LyO7rYaJMlcNQD/HxxQDvpXoYinxQAFo4eVE2+490XN8j
Od8n3UIo72+rFyyFJ8A7iBORYF9UD44VrFHQRHROTEvv7dV1OTYSVZcdqBb4Ik6S
8Wl+qMIEm8VcuFKI4b/T
=4IaO
-----END PGP SIGNATURE-----

From brett at python.org  Wed Jun 15 13:59:58 2016
From: brett at python.org (Brett Cannon)
Date: Wed, 15 Jun 2016 17:59:58 +0000
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
Message-ID: <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>

On Wed, 15 Jun 2016 at 09:48 Guido van Rossum <guido at python.org> wrote:

> These are really two separate proposals.
>
> I'm okay with checking the return value of calling obj.__fspath__; that's
> an error in the object anyways, and it doesn't matter much whether we do
> this or not (though when approving the PEP I considered this and decided
> not to insert a check for this). But it doesn't affect your example, does
> it? I guess it's easier to raise now and change the API in the future to
> avoid raising in this case (if we find that raising is undesirable) than
> the other way around, so I'm +0 on this.
>

+0 from me as well. I know that some already-ported stdlib code, which
explicitly checked for str/bytes before path support was added, could drop
its own checking with this change (obviously not a motivating factor as it's
pretty minor).

>
> The other proposal (passing anything that's not understood right through)
> is more interesting and your use case is somewhat compelling. Catching the
> exception coming out of os.fspath() would certainly be much messier. The
> question remaining is whether, when this behavior is not desired (e.g. when
> the caller of os.fspath() just wants a string that it can pass to open()),
> the condition of passing something that neither is a string nor supports __fspath__
> still produces an understandable error. I'm not sure that that's the case.
> E.g. open() accepts file descriptors in addition to paths, but I'm not sure
> that accepting an integer is a good idea in most cases -- it either gives a
> mystery "Bad file descriptor" error or starts reading/writing some random
> system file, which it then closes once the stream is closed.
>

The FD issue of magically passing through an int was also a concern when
Ethan brought this up in an issue on the tracker. My argument is that FDs
are not file paths and so shouldn't magically pass through if we're going
to type-check anything or claim os.fspath() only works with paths (FDs are
already open file objects). So in my view either we go ahead and
type-check the return value of __fspath__() and thus restrict everything
coming out of os.fspath() to Union[str, bytes], or we don't type-check
anything and stay consistent that all os.fspath() does is call
__fspath__() if present.

And just  because I'm thinking about it, I would special-case the FDs, not
os.PathLike (clearer why you care and faster as it skips the override of
__subclasshook__):

# Can be a single-line ternary operator if preferred.
if not isinstance(filename, int):
    filename = os.fspath(filename)
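
Wrapped up as a function, the special case might look like the sketch
below. normalize_filename is a made-up name for illustration, and the
example assumes pathlib paths define __fspath__:

```python
import os
import pathlib


def normalize_filename(filename):
    """Hypothetical wrapper around the special case shown above."""
    # File descriptors are ints, not paths, so leave them alone.
    if isinstance(filename, int):
        return filename
    # Everything else must be str, bytes, or path-like.
    return os.fspath(filename)


print(normalize_filename(3))
print(normalize_filename(pathlib.PurePosixPath("data/archive.xz")))
```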

> On Wed, Jun 15, 2016 at 9:12 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
>
>> I would like to make a change to os.fspath().
>>
>> Specifically, os.fspath() currently raises an exception if something
>> besides str, bytes, or os.PathLike is passed in, but makes no checks
>> if an os.PathLike object returns something besides a str or bytes.
>>
>> I would like to change that to the opposite: if a non-os.PathLike is
>> passed in, return it unchanged (so no change for str and bytes); if
>> an os.PathLike object returns something that is not a str nor bytes,
>> raise.
>>
>> An example of the difference in the lzma file:
>>
>> Current code (has not been upgraded to use os.fspath() yet)
>> -----------------------------------------------------------
>>
>>     if isinstance(filename, (str, bytes)):
>>         if "b" not in mode:
>>             mode += "b"
>>         self._fp = builtins.open(filename, mode)
>>         self._closefp = True
>>         self._mode = mode_code
>>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>>         self._fp = filename
>>         self._mode = mode_code
>>     else:
>>         raise TypeError(
>>              "filename must be a str or bytes object, or a file"
>>               )
>>
>> Code change if using upgraded os.fspath() (placed before above stanza):
>>
>>     filename = os.fspath(filename)
>>
>> Code change with current os.fspath() (ditto):
>>
>>     if isinstance(filename, os.PathLike):
>>         filename = os.fspath(filename)
>>
>> My intention with the os.fspath() function was to minimize boiler-plate
>> code and make PathLike objects easy and painless to support; having to
>> discover if any given parameter is PathLike before calling os.fspath()
>> on it is, IMHO, just the opposite.
>>
>> There is also precedent for having a __dunder__ check the return type:
>>
>>     --> class Huh:
>>     ...   def __int__(self):
>>     ...     return 'string'
>>     ...   def __index__(self):
>>     ...     return b'bytestring'
>>     ...   def __bool__(self):
>>     ...     return 'true-ish'
>>     ...
>>     --> h = Huh()
>>
>>     --> int(h)
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     TypeError: __int__ returned non-int (type str)
>>
>>     --> ''[h]
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     TypeError: __index__ returned non-int (type bytes)
>>
>>     --> bool(h)
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     TypeError: __bool__ should return bool, returned str
>>
>> Arguments in favor or against?
>>
>> --
>> ~Ethan~

From k7hoven at gmail.com  Wed Jun 15 13:39:03 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 15 Jun 2016 20:39:03 +0300
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
Message-ID: <CAMiohojX45zmNcQYTYH9WX8Ru5ZAKHmnvfufrOosLGyKH-E9XA@mail.gmail.com>

My proposal at the point of the first PEP draft solved both of these issues.

That version of the fspath function passed anything right through that
was an instance of the keyword-only `type_constraint`. If not, it
would ask __fspath__, and before returning the result, it would check
that __fspath__ returned an instance of `type_constraint` and
otherwise raise a TypeError. `type_constraint=object` would then have
given the behavior you want. I always wanted fspath to spare the
caller from all the instance checking (most of which it does even
now).
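
As a reminder of what that looked like, here is a reconstruction from
the description above. It is a sketch, not the text of the draft:
fspath_constrained is a made-up name, and raising TypeError when the
object has no __fspath__ at all is my assumption here:

```python
import pathlib


def fspath_constrained(path, *, type_constraint=(str, bytes)):
    """Hypothetical reconstruction of the draft fspath with a type constraint."""
    # Anything already satisfying the constraint passes straight through.
    if isinstance(path, type_constraint):
        return path
    method = getattr(type(path), "__fspath__", None)
    if method is None:
        # Assumption: no __fspath__ and not an allowed type is an error.
        raise TypeError(
            "expected an allowed type or a path-like object, not %s"
            % type(path).__name__
        )
    result = method(path)
    if not isinstance(result, type_constraint):
        raise TypeError(
            "__fspath__ returned disallowed type %s" % type(result).__name__
        )
    return result


print(fspath_constrained(pathlib.PurePosixPath("a/b")))  # asks __fspath__
print(fspath_constrained(3, type_constraint=object))     # passes through
```

With the default constraint an int is rejected, while
type_constraint=object lets every object through unchanged.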

The main problem with setting type_constraint to something broader
than (str, bytes) is that then that parameter would affect the return
type of the function, which would at least complicate the type hinting
issue. Mypy might now support things like

@overload
def fspath(path: T, type_constraint: Type[T] = (str, bytes)) -> T: ...

but then again, isinstance and Union are not compatible (for a
reason?), and PEP 484 for a reason does not allow tuples like (str,
bytes) in place of Unions.

Anyway, if we were to go back to this behavior, we would need to
decide whether to officially allow a wider type constraint or whether
to leave that to Stack Overflow, so to speak.

-- Koos

On Wed, Jun 15, 2016 at 7:46 PM, Guido van Rossum <guido at python.org> wrote:
> These are really two separate proposals.
>
> I'm okay with checking the return value of calling obj.__fspath__; that's an
> error in the object anyways, and it doesn't matter much whether we do this
> or not (though when approving the PEP I considered this and decided not to
> insert a check for this). But it doesn't affect your example, does it? I
> guess it's easier to raise now and change the API in the future to avoid
> raising in this case (if we find that raising is undesirable) than the other
> way around, so I'm +0 on this.
>
> The other proposal (passing anything that's not understood right through) is
> more interesting and your use case is somewhat compelling. Catching the
> exception coming out of os.fspath() would certainly be much messier. The
> question remaining is whether, when this behavior is not desired (e.g. when
> the caller of os.fspath() just wants a string that it can pass to open()),
> the condition of passing something that's neither a string nor supports __fspath__
> still produces an understandable error. I'm not sure that that's the case.
> E.g. open() accepts file descriptors in addition to paths, but I'm not sure
> that accepting an integer is a good idea in most cases -- it either gives a
> mystery "Bad file descriptor" error or starts reading/writing some random
> system file, which it then closes once the stream is closed.
>
> On Wed, Jun 15, 2016 at 9:12 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
>>
>> I would like to make a change to os.fspath().
>>
>> Specifically, os.fspath() currently raises an exception if something
>> besides str, bytes, or os.PathLike is passed in, but makes no checks
>> if an os.PathLike object returns something besides a str or bytes.
>>
>> I would like to change that to the opposite: if a non-os.PathLike is
>> passed in, return it unchanged (so no change for str and bytes); if
>> an os.PathLike object returns something that is not a str nor bytes,
>> raise.
>>
>> An example of the difference in the lzma file:
>>
>> Current code (has not been upgraded to use os.fspath() yet)
>> -----------------------------------------------------------
>>
>>     if isinstance(filename, (str, bytes)):
>>         if "b" not in mode:
>>             mode += "b"
>>         self._fp = builtins.open(filename, mode)
>>         self._closefp = True
>>         self._mode = mode_code
>>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>>         self._fp = filename
>>         self._mode = mode_code
>>     else:
>>         raise TypeError(
>>              "filename must be a str or bytes object, or a file"
>>               )
>>
>> Code change if using upgraded os.fspath() (placed before above stanza):
>>
>>     filename = os.fspath(filename)
>>
>> Code change with current os.fspath() (ditto):
>>
>>     if isinstance(filename, os.PathLike):
>>         filename = os.fspath(filename)
>>
>> My intention with the os.fspath() function was to minimize boiler-plate
>> code and make PathLike objects easy and painless to support; having to
>> discover if any given parameter is PathLike before calling os.fspath()
>> on it is, IMHO, just the opposite.
>>
>> There is also precedent for having a __dunder__ check the return type:
>>
>>     --> class Huh:
>>     ...   def __int__(self):
>>     ...     return 'string'
>>     ...   def __index__(self):
>>     ...     return b'bytestring'
>>     ...   def __bool__(self):
>>     ...     return 'true-ish'
>>     ...
>>     --> h = Huh()
>>
>>     --> int(h)
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     TypeError: __int__ returned non-int (type str)
>>
>>     --> ''[h]
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     TypeError: __index__ returned non-int (type bytes)
>>
>>     --> bool(h)
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     TypeError: __bool__ should return bool, returned str
>>
>> Arguments in favor or against?
>>
>> --
>> ~Ethan~
>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
>

-- 
--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From ncoghlan at gmail.com  Wed Jun 15 14:29:52 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 15 Jun 2016 11:29:52 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
Message-ID: <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>

On 15 June 2016 at 10:59, Brett Cannon <brett at python.org> wrote:
>
>
> On Wed, 15 Jun 2016 at 09:48 Guido van Rossum <guido at python.org> wrote:
>>
>> These are really two separate proposals.
>>
>> I'm okay with checking the return value of calling obj.__fspath__; that's
>> an error in the object anyways, and it doesn't matter much whether we do
>> this or not (though when approving the PEP I considered this and decided not
>> to insert a check for this). But it doesn't affect your example, does it? I
>> guess it's easier to raise now and change the API in the future to avoid
>> raising in this case (if we find that raising is undesirable) than the other
>> way around, so I'm +0 on this.
>
> +0 from me as well. I know in some code in the stdlib that has been ported
> which prior to adding support was explicitly checking for str/bytes this
> will eliminate its own checking (obviously not a motivating factor as it's
> pretty minor).

I'd like a strong assertion that the return value of os.fspath() is a
plausible filesystem path representation (so either bytes or str), and
*not* some other kind of object that can also be used for accessing
the filesystem (like a file descriptor or an IO stream).

>> The other proposal (passing anything that's not understood right through)
>> is more interesting and your use case is somewhat compelling. Catching the
>> exception coming out of os.fspath() would certainly be much messier. The
>> question remaining is whether, when this behavior is not desired (e.g. when
>> the caller of os.fspath() just wants a string that it can pass to open()),
>> the condition of passing something that's neither a string nor supports __fspath__
>> still produces an understandable error. I'm not sure that that's the case.
>> E.g. open() accepts file descriptors in addition to paths, but I'm not sure
>> that accepting an integer is a good idea in most cases -- it either gives a
>> mystery "Bad file descriptor" error or starts reading/writing some random
>> system file, which it then closes once the stream is closed.
>
> The FD issue of magically passing through an int was also a concern when
> Ethan brought this up in an issue on the tracker. My argument is that FDs
> are not file paths and so shouldn't magically pass through if we're going to
> type-check anything or claim os.fspath() only works with paths (FDs are
> already open file objects). So in my view  either we go ahead and type-check
> the return value of __fspath__() and thus restrict everything coming out of
> os.fspath() to Union[str, bytes] or we don't type check anything and are
> consistent that all os.fspath() does is call __fspath__() if present.
>
> And just  because I'm thinking about it, I would special-case the FDs, not
> os.PathLike (clearer why you care and faster as it skips the override of
> __subclasshook__):
>
> # Can be a single-line ternary operator if preferred.
> if not isinstance(filename, int):
>     filename = os.fspath(filename)

Note that the LZMA case Ethan cites is one where the code accepts
either an already opened file-like object *or* a path-like object, and
does different things based on which it receives.

In that scenario, rather than introducing an unconditional "filename =
os.fspath(filename)" before the current logic, it makes more sense to
me to change the current logic to use the new protocol check rather
than a strict typecheck on str/bytes:

    if isinstance(filename, os.PathLike): # Changed line
        filename = os.fspath(filename)    # New line
        if "b" not in mode:
            mode += "b"
        self._fp = builtins.open(filename, mode)
        self._closefp = True
        self._mode = mode_code
    elif hasattr(filename, "read") or hasattr(filename, "write"):
        self._fp = filename
        self._mode = mode_code
    else:
        raise TypeError(
             "filename must be a path-like or file-like object"
              )

I *don't* think it makes sense to weaken the guarantees on os.fspath
to let it propagate non-path-like objects.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Wed Jun 15 14:39:43 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Jun 2016 11:39:43 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
 <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
Message-ID: <CAP7+vJJvxZ6hNA8-6fq+cvzpTMZsUM0n0702cun9AsQ4Ae9Psg@mail.gmail.com>

OK, so let's add a check on the return of __fspath__() and keep the check
on path-like or string/bytes.
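
For readers following along, a rough pure-Python sketch of what that decision amounts to. The name fspath_checked and the error messages are made up for illustration; this is an assumption about the resulting semantics, not the actual os.fspath() implementation.

```python
def fspath_checked(path):
    """Hypothetical sketch: check both the argument and the __fspath__ result."""
    # str and bytes are returned unchanged.
    if isinstance(path, (str, bytes)):
        return path
    # Anything else must provide __fspath__ on its type, otherwise raise.
    fspath_method = getattr(type(path), "__fspath__", None)
    if fspath_method is None:
        raise TypeError(
            "expected str, bytes or path-like object, not "
            + type(path).__name__)
    result = fspath_method(path)
    # New check: the protocol method itself must hand back str or bytes.
    if not isinstance(result, (str, bytes)):
        raise TypeError(
            "__fspath__ returned non-str/bytes (type {})".format(
                type(result).__name__))
    return result
```

Under this sketch an int file descriptor is still rejected, so callers that also accept descriptors or file objects keep their own branch for those.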

--Guido (mobile)
On Jun 15, 2016 11:29 AM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:

> On 15 June 2016 at 10:59, Brett Cannon <brett at python.org> wrote:
> >
> >
> > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum <guido at python.org> wrote:
> >>
> >> These are really two separate proposals.
> >>
> >> I'm okay with checking the return value of calling obj.__fspath__;
> that's
> >> an error in the object anyways, and it doesn't matter much whether we do
> >> this or not (though when approving the PEP I considered this and
> decided not
> >> to insert a check for this). But it doesn't affect your example, does
> it? I
> >> guess it's easier to raise now and change the API in the future to avoid
> >> raising in this case (if we find that raising is undesirable) than the
> other
> >> way around, so I'm +0 on this.
> >
> > +0 from me as well. I know in some code in the stdlib that has been
> ported
> > which prior to adding support was explicitly checking for str/bytes this
> > will eliminate its own checking (obviously not a motivating factor as
> it's
> > pretty minor).
>
> I'd like a strong assertion that the return value of os.fspath() is a
> plausible filesystem path representation (so either bytes or str), and
> *not* some other kind of object that can also be used for accessing
> the filesystem (like a file descriptor or an IO stream)
>
> >> The other proposal (passing anything that's not understood right
> through)
> >> is more interesting and your use case is somewhat compelling. Catching
> the
> >> exception coming out of os.fspath() would certainly be much messier. The
> >> question remaining is whether, when this behavior is not desired (e.g.
> when
> >> the caller of os.fspath() just wants a string that it can pass to
> open()),
> >> the condition of passing something that's neither a string nor supports __fspath__
> >> still produces an understandable error. I'm not sure that that's the
> case.
> >> E.g. open() accepts file descriptors in addition to paths, but I'm not
> sure
> >> that accepting an integer is a good idea in most cases -- it either
> gives a
> >> mystery "Bad file descriptor" error or starts reading/writing some
> random
> >> system file, which it then closes once the stream is closed.
> >
> > The FD issue of magically passing through an int was also a concern when
> > Ethan brought this up in an issue on the tracker. My argument is that FDs
> > are not file paths and so shouldn't magically pass through if we're
> going to
> > type-check anything or claim os.fspath() only works with paths (FDs are
> > already open file objects). So in my view  either we go ahead and
> type-check
> > the return value of __fspath__() and thus restrict everything coming out
> of
> > os.fspath() to Union[str, bytes] or we don't type check anything and are
> > consistent that all os.fspath() does is call __fspath__() if present.
> >
> > And just  because I'm thinking about it, I would special-case the FDs,
> not
> > os.PathLike (clearer why you care and faster as it skips the override of
> > __subclasshook__):
> >
> > # Can be a single-line ternary operator if preferred.
> > if not isinstance(filename, int):
> >     filename = os.fspath(filename)
>
> Note that the LZMA case Ethan cites is one where the code accepts
> either an already opened file-like object *or* a path-like object, and
> does different things based on which it receives.
>
> In that scenario, rather than introducing an unconditional "filename =
> os.fspath(filename)" before the current logic, it makes more sense to
> me to change the current logic to use the new protocol check rather
> than a strict typecheck on str/bytes:
>
>     if isinstance(filename, os.PathLike): # Changed line
>         filename = os.fspath(filename)    # New line
>         if "b" not in mode:
>             mode += "b"
>         self._fp = builtins.open(filename, mode)
>         self._closefp = True
>         self._mode = mode_code
>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>         self._fp = filename
>         self._mode = mode_code
>     else:
>         raise TypeError(
>              "filename must be a path-like or file-like object"
>               )
>
> I *don't* think it makes sense to weaken the guarantees on os.fspath
> to let it propagate non-path-like objects.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>

From brett at python.org  Wed Jun 15 14:44:13 2016
From: brett at python.org (Brett Cannon)
Date: Wed, 15 Jun 2016 18:44:13 +0000
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP7+vJJvxZ6hNA8-6fq+cvzpTMZsUM0n0702cun9AsQ4Ae9Psg@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
 <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
 <CAP7+vJJvxZ6hNA8-6fq+cvzpTMZsUM0n0702cun9AsQ4Ae9Psg@mail.gmail.com>
Message-ID: <CAP1=2W5GcMhd3soyqoPnrW8dbjrL7R6SwmkyFQ51Eii+yK5XRw@mail.gmail.com>

On Wed, 15 Jun 2016 at 11:39 Guido van Rossum <guido at python.org> wrote:

> OK, so let's add a check on the return of __fspath__() and keep the check
> on path-like or string/bytes.
>

I'll update the PEP.

Ethan, do you want to leave a note on the os.fspath() issue to update the
code and go through where we've used os.fspath() to see where we can cut
out redundant type checks?

> --Guido (mobile)
> On Jun 15, 2016 11:29 AM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>
>> On 15 June 2016 at 10:59, Brett Cannon <brett at python.org> wrote:
>> >
>> >
>> > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum <guido at python.org> wrote:
>> >>
>> >> These are really two separate proposals.
>> >>
>> >> I'm okay with checking the return value of calling obj.__fspath__;
>> that's
>> >> an error in the object anyways, and it doesn't matter much whether we
>> do
>> >> this or not (though when approving the PEP I considered this and
>> decided not
>> >> to insert a check for this). But it doesn't affect your example, does
>> it? I
>> >> guess it's easier to raise now and change the API in the future to
>> avoid
>> >> raising in this case (if we find that raising is undesirable) than the
>> other
>> >> way around, so I'm +0 on this.
>> >
>> > +0 from me as well. I know in some code in the stdlib that has been
>> ported
>> > which prior to adding support was explicitly checking for str/bytes this
>> > will eliminate its own checking (obviously not a motivating factor as
>> it's
>> > pretty minor).
>>
>> I'd like a strong assertion that the return value of os.fspath() is a
>> plausible filesystem path representation (so either bytes or str), and
>> *not* some other kind of object that can also be used for accessing
>> the filesystem (like a file descriptor or an IO stream)
>>
>> >> The other proposal (passing anything that's not understood right
>> through)
>> >> is more interesting and your use case is somewhat compelling. Catching
>> the
>> >> exception coming out of os.fspath() would certainly be much messier.
>> The
>> >> question remaining is whether, when this behavior is not desired (e.g.
>> when
>> >> the caller of os.fspath() just wants a string that it can pass to
>> open()),
>> >> the condition of passing something that's neither a string nor supports
>> __fspath__
>> >> still produces an understandable error. I'm not sure that that's the
>> case.
>> >> E.g. open() accepts file descriptors in addition to paths, but I'm not
>> sure
>> >> that accepting an integer is a good idea in most cases -- it either
>> gives a
>> >> mystery "Bad file descriptor" error or starts reading/writing some
>> random
>> >> system file, which it then closes once the stream is closed.
>> >
>> > The FD issue of magically passing through an int was also a concern when
>> > Ethan brought this up in an issue on the tracker. My argument is that
>> FDs
>> > are not file paths and so shouldn't magically pass through if we're
>> going to
>> > type-check anything or claim os.fspath() only works with paths (FDs are
>> > already open file objects). So in my view  either we go ahead and
>> type-check
>> > the return value of __fspath__() and thus restrict everything coming
>> out of
>> > os.fspath() to Union[str, bytes] or we don't type check anything and are
>> > consistent that all os.fspath() does is call __fspath__() if present.
>> >
>> > And just  because I'm thinking about it, I would special-case the FDs,
>> not
>> > os.PathLike (clearer why you care and faster as it skips the override of
>> > __subclasshook__):
>> >
>> > # Can be a single-line ternary operator if preferred.
>> > if not isinstance(filename, int):
>> >     filename = os.fspath(filename)
>>
>> Note that the LZMA case Ethan cites is one where the code accepts
>> either an already opened file-like object *or* a path-like object, and
>> does different things based on which it receives.
>>
>> In that scenario, rather than introducing an unconditional "filename =
>> os.fspath(filename)" before the current logic, it makes more sense to
>> me to change the current logic to use the new protocol check rather
>> than a strict typecheck on str/bytes:
>>
>>     if isinstance(filename, os.PathLike): # Changed line
>>         filename = os.fspath(filename)    # New line
>>         if "b" not in mode:
>>             mode += "b"
>>         self._fp = builtins.open(filename, mode)
>>         self._closefp = True
>>         self._mode = mode_code
>>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>>         self._fp = filename
>>         self._mode = mode_code
>>     else:
>>         raise TypeError(
>>              "filename must be a path-like or file-like object"
>>               )
>>
>> I *don't* think it makes sense to weaken the guarantees on os.fspath
>> to let it propagate non-path-like objects.
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>>
>

From ethan at stoneleaf.us  Wed Jun 15 14:46:21 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 15 Jun 2016 11:46:21 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
Message-ID: <5761A27D.3070004@stoneleaf.us>

On 06/15/2016 10:59 AM, Brett Cannon wrote:
> On Wed, 15 Jun 2016 at 09:48 Guido van Rossum wrote:

>> These are really two separate proposals.
>>
>> I'm okay with checking the return value of calling obj.__fspath__;
>> that's an error in the object anyways, and it doesn't matter much
>> whether we do this or not (though when approving the PEP I
>> considered this and decided not to insert a check for this). But it
>> doesn't affect your example, does it? I guess it's easier to raise
>> now and change the API in the future to avoid raising in this case
>> (if we find that raising is undesirable) than the other way around,
>> so I'm +0 on this.
>
> +0 from me as well. I know in some code in the stdlib that has been
> ported which prior to adding support was explicitly checking for
> str/bytes this will eliminate its own checking (obviously not a
> motivating factor as it's pretty minor).

If we accept both parts of this proposal the checking will have to stay 
in place as the original argument may not have been bytes, str, nor 
os.PathLike.

>> The other proposal (passing anything that's not understood right
>> through) is more interesting and your use case is somewhat
>> compelling. Catching the exception coming out of os.fspath() would
>> certainly be much messier. The question remaining is whether, when
>> this behavior is not desired (e.g. when the caller of os.fspath()
>> just wants a string that it can pass to open()), the condition of
>> passing something that's neither a string nor supports __fspath__ still
>> produces an understandable error.

This is no different than before os.fspath() existed -- if the function 
wasn't checking that the "filename" was a str but just used it as-is, 
then whatever strange, possibly-hard-to-debug error they would get now 
is the same as what they would have gotten before.

>> I'm not sure that that's the case.
>> E.g. open() accepts file descriptors in addition to paths, but I'm
>> not sure that accepting an integer is a good idea in most cases --
>> it either gives a mystery "Bad file descriptor" error or starts
>> reading/writing some random system file, which it then closes once
>> the stream is closed.

My vision of os.fspath() is simply to reduce rich-path objects to their 
component str or bytes representation, and pass anything else through.

The advantage:

- if os.open accepts str/bytes/fd it can prep the argument by
   calling os.fspath() and then do its argument checking all
   in one place;

- if lzma accepts bytes/str/filelike-obj it can prep its argument
   by calling os.fspath() and then do its argument checking all in
   one place;

- if Path accepts str/os.PathLike it can prep its argument(s)
   with os.fspath() and then do its argument checking all in one
   place.
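
A minimal sketch of the pass-through idea, assuming a stand-alone helper rather than a changed os.fspath(). The names fspath_passthrough and classify_target are hypothetical, and the second function only mimics the shape of the lzma stanza quoted earlier in the thread.

```python
import io


def fspath_passthrough(obj):
    """Hypothetical: reduce path-like objects, return anything else unchanged."""
    fspath_method = getattr(type(obj), "__fspath__", None)
    if fspath_method is None:
        # Not path-like (str, bytes, fd, file object, ...): pass it through.
        return obj
    result = fspath_method(obj)
    # Still insist that __fspath__ itself returns str or bytes.
    if not isinstance(result, (str, bytes)):
        raise TypeError(
            "__fspath__ returned non-str/bytes (type {})".format(
                type(result).__name__))
    return result


def classify_target(filename):
    """Caller-side checking done in one place after the prep call."""
    filename = fspath_passthrough(filename)
    if isinstance(filename, (str, bytes)):
        return "path"
    elif hasattr(filename, "read") or hasattr(filename, "write"):
        return "file object"
    raise TypeError("filename must be a str or bytes object, or a file")


print(classify_target("archive.xz"))
print(classify_target(io.BytesIO()))
```

The design choice here is that the helper never rejects its argument; rejecting unsupported types stays with the caller, which is the part of the proposal the rest of the thread argues against.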

> The FD issue of magically passing through an int was also a concern when
> Ethan brought this up in an issue on the tracker. My argument is that
> FDs are not file paths and so shouldn't magically pass through if we're
> going to type-check anything or claim os.fspath() only works with paths
> (FDs are already open file objects). So in my view  either we go ahead
> and type-check the return value of __fspath__() and thus restrict
> everything coming out of os.fspath() to Union[str, bytes] or we don't
> type check anything and are consistent that all os.fspath() does is
> call __fspath__() if present.

This is better than what os.fspath() currently does as it has all the 
advantages listed above, but why is checking the output of __fspath__ 
incompatible with not checking anything else?

> And just  because I'm thinking about it, I would special-case the FDs,
> not os.PathLike (clearer why you care and faster as it skips the
> override of __subclasshook__):
>
> # Can be a single-line ternary operator if preferred.
> if not isinstance(filename, int):
>      filename = os.fspath(filename)

That example will not do the right thing in the lzma case.

--
~Ethan~

From k7hoven at gmail.com  Wed Jun 15 14:48:36 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 15 Jun 2016 21:48:36 +0300
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
 <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
Message-ID: <CAMiohoj4y+Tjvm2rRg55ZhoRy623b8kSM8j5Z92VbtdXtCkXJg@mail.gmail.com>

On Wed, Jun 15, 2016 at 9:29 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 15 June 2016 at 10:59, Brett Cannon <brett at python.org> wrote:
>>
>>
>> On Wed, 15 Jun 2016 at 09:48 Guido van Rossum <guido at python.org> wrote:
>>>
>>> These are really two separate proposals.
>>>
>>> I'm okay with checking the return value of calling obj.__fspath__; that's
>>> an error in the object anyways, and it doesn't matter much whether we do
>>> this or not (though when approving the PEP I considered this and decided not
>>> to insert a check for this). But it doesn't affect your example, does it? I
>>> guess it's easier to raise now and change the API in the future to avoid
>>> raising in this case (if we find that raising is undesirable) than the other
>>> way around, so I'm +0 on this.
>>
>> +0 from me as well. I know in some code in the stdlib that has been ported
>> which prior to adding support was explicitly checking for str/bytes this
>> will eliminate its own checking (obviously not a motivating factor as it's
>> pretty minor).
>
> I'd like a strong assertion that the return value of os.fspath() is a
> plausible filesystem path representation (so either bytes or str), and
> *not* some other kind of object that can also be used for accessing
> the filesystem (like a file descriptor or an IO stream)

I agree, so I'm -0.5 on passing through any object (at least by default).

>>> The other proposal (passing anything that's not understood right through)
>>> is more interesting and your use case is somewhat compelling. Catching the
>>> exception coming out of os.fspath() would certainly be much messier. The
>>> question remaining is whether, when this behavior is not desired (e.g. when
>>> the caller of os.fspath() just wants a string that it can pass to open()),
>>> the condition of passing something that's neither a string nor supports __fspath__
>>> still produces an understandable error. I'm not sure that that's the case.
>>> E.g. open() accepts file descriptors in addition to paths, but I'm not sure
>>> that accepting an integer is a good idea in most cases -- it either gives a
>>> mystery "Bad file descriptor" error or starts reading/writing some random
>>> system file, which it then closes once the stream is closed.
>>
>> The FD issue of magically passing through an int was also a concern when
>> Ethan brought this up in an issue on the tracker. My argument is that FDs
>> are not file paths and so shouldn't magically pass through if we're going to
>> type-check anything or claim os.fspath() only works with paths (FDs are
>> already open file objects). So in my view  either we go ahead and type-check
>> the return value of __fspath__() and thus restrict everything coming out of
>> os.fspath() to Union[str, bytes] or we don't type check anything and are
>> consistent that all os.fspath() does is call __fspath__() if present.
>>
>> And just  because I'm thinking about it, I would special-case the FDs, not
>> os.PathLike (clearer why you care and faster as it skips the override of
>> __subclasshook__):
>>
>> # Can be a single-line ternary operator if preferred.
>> if not isinstance(filename, int):
>>     filename = os.fspath(filename)
>
> Note that the LZMA case Ethan cites is one where the code accepts
> either an already opened file-like object *or* a path-like object, and
> does different things based on which it receives.
>
> In that scenario, rather than introducing an unconditional "filename =
> os.fspath(filename)" before the current logic, it makes more sense to
> me to change the current logic to use the new protocol check rather
> than a strict typecheck on str/bytes:
>
>     if isinstance(filename, os.PathLike): # Changed line
>         filename = os.fspath(filename)    # New line

You are making one of my earlier points here, thanks ;). The point is
that the name PathLike sounds like it would mean anything path-like,
except that os.PathLike does not include str and bytes. And I still
think the naming should be a little different.

So that would be (os.PathLike, str, bytes) instead of just os.PathLike.
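
A small check of that point, assuming Python 3.6 or later where os.PathLike and pathlib's __fspath__ support exist. The helper name is_path_argument is hypothetical.

```python
import os
import pathlib


def is_path_argument(obj):
    """Hypothetical helper: str, bytes, or anything implementing __fspath__."""
    return isinstance(obj, (str, bytes, os.PathLike))


# A plain str is a valid path but is not an os.PathLike instance.
print(isinstance("spam.txt", os.PathLike))
# A pathlib path implements __fspath__, so it is.
print(isinstance(pathlib.PurePath("spam.txt"), os.PathLike))
# The widened tuple accepts both, and still rejects a file descriptor.
print(is_path_argument("spam.txt"), is_path_argument(3))
```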

>         if "b" not in mode:
>             mode += "b"
>         self._fp = builtins.open(filename, mode)
>         self._closefp = True
>         self._mode = mode_code
>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>         self._fp = filename
>         self._mode = mode_code
>     else:
>         raise TypeError(
>              "filename must be a path-like or file-like object"
>               )
>
> I *don't* think it makes sense to weaken the guarantees on os.fspath
> to let it propagate non-path-like objects.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From ncoghlan at gmail.com  Wed Jun 15 14:55:34 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 15 Jun 2016 11:55:34 -0700
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <njfito$uhf$1@ger.gmane.org>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
 <CADiSq7f7O+fXK7Ci00FwBhWwfiwN_VOwev+Ju0R3VRy56CK4UQ@mail.gmail.com>
 <njfito$uhf$1@ger.gmane.org>
Message-ID: <CADiSq7d-4UC8GgO+NW4y5AsPH_BN_UMSySKh3RPa_7LxFWfO0w@mail.gmail.com>

On 10 June 2016 at 16:36, Neil Schemenauer <neil at python.ca> wrote:
> Nick Coghlan <ncoghlan at gmail.com> wrote:
>> It could be very interesting to add an "ascii-warn" codec to Python
>> 2.7, and then set that as the default encoding when the -3 flag is
>> set.
>
> I don't think that can work.  The library code in Python would spew
> out warnings even in the cases when nothing is wrong with the
> application code.  I think warnings have to be added to a Python
> where str and bytes have been properly separated.  Without extreme
> backporting efforts, that means 3.x.
>
> We don't want to saddle 3.x with a bunch of backwards compatibility
> cruft.  Maybe some of my runtime warning changes could be merged
> using a command line flag to enable them.  It would be nice to have
> the stepping stone version just be normal 3.x with a command line
> option.  However, for the sanity of people maintaining 3.x, I think
> perhaps we don't want to do it.

Right, my initial negative reactions were mainly to the idea of having
these kinds of capabilities in the mainline 3.x codebase (where we'd
then have to support them for everyone, not just the folks that
genuinely need them to help in migration from Python 2).

The standard porting instructions currently assume code bases that are
*mostly* bytes/unicode clean, with perhaps a few oversights where
Python 3 rejects ambiguity that Python 2 tolerates. In that context,
"run your test suite, address the test failures" should generally be
sufficient, without needing to use a custom Python build.

However, there are a couple of cases those standard instructions still
don't cover:

- if there's no test suite, exploratory discovery is problematic when
the app falls over at the first type ambiguity
- even if there is a test suite, sufficiently pervasive type ambiguity
may make it difficult to use for fault isolation
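
A minimal sketch of the kind of type ambiguity meant here, assuming
Python 3. The helper join_header is hypothetical and exists only for
illustration: Python 2 would quietly mix the bytes and text arguments,
while Python 3 refuses.

```python
def join_header(name, value):
    # Hypothetical helper: works when both arguments are text.
    return name + ": " + value


# All-text input is fine on Python 3.
clean = join_header("Content-Type", "text/plain")

# Mixing bytes and text is tolerated by Python 2 but rejected by Python 3.
try:
    join_header(b"Content-Type", "text/plain")
except TypeError:
    mixed_rejected = True
else:
    mixed_rejected = False
```

With a test suite, a failure like this points straight at the call
site; without one, the application simply stops at the first such
mix-up.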

That's where I now agree your proposal for a variant build
specifically aimed at compatibility testing is potentially
interesting:

- the tool would become an escalation path for folks that aren't in a
position to use their own test suite to isolate type ambiguity
problems under Python 3
- using Python 3 as a basis means you get a clean standard library
that shouldn't emit any false alarms
- the necessary feature set is defined by the common subset of Python
2.7 and a chosen minimum Python 3 version, not any future 3.x release,
so you should be able to maintain the changes as a stable patch set
without needing to chase CPython trunk (with the attendant risk of
merge conflicts)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ethan at stoneleaf.us  Wed Jun 15 15:09:39 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 15 Jun 2016 12:09:39 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP1=2W5GcMhd3soyqoPnrW8dbjrL7R6SwmkyFQ51Eii+yK5XRw@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
 <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
 <CAP7+vJJvxZ6hNA8-6fq+cvzpTMZsUM0n0702cun9AsQ4Ae9Psg@mail.gmail.com>
 <CAP1=2W5GcMhd3soyqoPnrW8dbjrL7R6SwmkyFQ51Eii+yK5XRw@mail.gmail.com>
Message-ID: <5761A7F3.4040401@stoneleaf.us>

On 06/15/2016 11:44 AM, Brett Cannon wrote:
> On Wed, 15 Jun 2016 at 11:39 Guido van Rossum wrote:

>> OK, so let's add a check on the return of __fspath__() and keep the
>> check on path-like or string/bytes.
>
> I'll update the PEP.
>
> Ethan, do you want to leave a note on the os.fspath() issue to update
> the code and go through where we've used os.fspath() to see where we can
> cut out redundant type checks?

Will do.

I didn't see this subthread before my last post, so unless you agree 
with those other changes feel free to ignore it.  ;)

--
~Ethan~

From k7hoven at gmail.com  Wed Jun 15 15:10:11 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 15 Jun 2016 22:10:11 +0300
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
Message-ID: <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>

>>     if isinstance(filename, os.PathLike):

By the way, regarding the line of code above, is there a convention
regarding whether implementing some protocol/interface requires
registering with (or inheriting from) the appropriate ABC for it to
work in all situations? IOW, in this case, is it sufficient to
implement __fspath__ to make your type pathlike? Is there a conscious
trend towards requiring the ABC?

-- Koos

From brett at python.org  Wed Jun 15 15:15:01 2016
From: brett at python.org (Brett Cannon)
Date: Wed, 15 Jun 2016 19:15:01 +0000
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
Message-ID: <CAP1=2W6FZYMaOUdYcG=w=YHjhvnnqkf1Ga_QgMdapqU5MSHU=g@mail.gmail.com>

On Wed, 15 Jun 2016 at 12:12 Koos Zevenhoven <k7hoven at gmail.com> wrote:

> >>     if isinstance(filename, os.PathLike):
>
> By the way, regarding the line of code above, is there a convention
> regarding whether implementing some protocol/interface requires
> registering with (or inheriting from) the appropriate ABC for it to
> work in all situations. IOW, in this case, is it sufficient to
> implement __fspath__ to make your type pathlike? Is there a conscious
> trend towards requiring the ABC?
>

ABCs like os.PathLike can override __subclasshook__ so that registration
isn't required (see
https://hg.python.org/cpython/file/default/Lib/os.py#l1136). So
registration is definitely good to do to be explicit that you're trying to
meet an ABC, but it isn't strictly required.
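
A minimal sketch of the __subclasshook__ mechanism described above.
PathLikeSketch is a hypothetical stand-in written for illustration; it
is not the actual os.PathLike source, and the class names Unregistered
and Plain are likewise made up.

```python
import abc


class PathLikeSketch(abc.ABC):
    """Hypothetical stand-in for os.PathLike, for illustration only."""

    @abc.abstractmethod
    def __fspath__(self):
        raise NotImplementedError

    @classmethod
    def __subclasshook__(cls, subclass):
        # Treat any class that defines __fspath__ as a virtual subclass,
        # so no explicit registration or inheritance is needed.
        if cls is PathLikeSketch:
            return hasattr(subclass, "__fspath__")
        return NotImplemented


class Unregistered:
    # Neither inherits from nor registers with PathLikeSketch.
    def __fspath__(self):
        return "/tmp/example"


class Plain:
    pass


is_pathlike = isinstance(Unregistered(), PathLikeSketch)
is_plain_pathlike = isinstance(Plain(), PathLikeSketch)
```

Here is_pathlike is True even though Unregistered never mentions the
ABC, while is_plain_pathlike is False.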

From ethan at stoneleaf.us  Wed Jun 15 15:16:38 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 15 Jun 2016 12:16:38 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
Message-ID: <5761A996.3010101@stoneleaf.us>

On 06/15/2016 12:10 PM, Koos Zevenhoven wrote:
>>>      if isinstance(filename, os.PathLike):
>
> By the way, regarding the line of code above, is there a convention
> regarding whether implementing some protocol/interface requires
> registering with (or inheriting from) the appropriate ABC for it to
> work in all situations. IOW, in this case, is it sufficient to
> implement __fspath__ to make your type pathlike? Is there a conscious
> trend towards requiring the ABC?

The ABC is not required, simply having the __fspath__ attribute is 
enough.  Of course, to actually work that attribute should be a function 
that returns a str or bytes object.  ;)
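
A short sketch of both halves of that statement, assuming Python 3.6
or later, where os.fspath() exists and checks the return value of
__fspath__() as agreed in this thread. The classes Good and Bad are
hypothetical.

```python
import os


class Good:
    # No ABC involved: defining __fspath__ is all that is needed.
    def __fspath__(self):
        return "/tmp/example"


class Bad:
    # Has the attribute, but returns neither str nor bytes.
    def __fspath__(self):
        return 42


good_path = os.fspath(Good())

try:
    os.fspath(Bad())
except TypeError:
    bad_rejected = True
else:
    bad_rejected = False
```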

--
~Ethan~

From k7hoven at gmail.com  Wed Jun 15 15:24:43 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 15 Jun 2016 22:24:43 +0300
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP1=2W6FZYMaOUdYcG=w=YHjhvnnqkf1Ga_QgMdapqU5MSHU=g@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
 <CAP1=2W6FZYMaOUdYcG=w=YHjhvnnqkf1Ga_QgMdapqU5MSHU=g@mail.gmail.com>
Message-ID: <CAMiohogTCrdX-jwZnRpjGupKgK_rPPiB0X1+vWj0BkEZ82B93Q@mail.gmail.com>

On Wed, Jun 15, 2016 at 10:15 PM, Brett Cannon <brett at python.org> wrote:
>
>
> On Wed, 15 Jun 2016 at 12:12 Koos Zevenhoven <k7hoven at gmail.com> wrote:
>>
>> >>     if isinstance(filename, os.PathLike):
>>
>> By the way, regarding the line of code above, is there a convention
>> regarding whether implementing some protocol/interface requires
>> registering with (or inheriting from) the appropriate ABC for it to
>> work in all situations. IOW, in this case, is it sufficient to
>> implement __fspath__ to make your type pathlike? Is there a conscious
>> trend towards requiring the ABC?
>
>
> ABCs like os.PathLike can override __subclasshook__ so that registration
> isn't required (see
> https://hg.python.org/cpython/file/default/Lib/os.py#l1136). So registration
> is definitely good to do to be explicit that you're trying to meet an ABC,
> but it isn't strictly required.

Ok I suppose that's fine, so I propose we update the ABC part in the
PEP with __subclasshook__.

And the other question could be turned into whether to make str and
bytes also PathLike in __subclasshook__.

-- Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From brett at python.org  Wed Jun 15 15:57:28 2016
From: brett at python.org (Brett Cannon)
Date: Wed, 15 Jun 2016 19:57:28 +0000
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAP1=2W5GcMhd3soyqoPnrW8dbjrL7R6SwmkyFQ51Eii+yK5XRw@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAP1=2W5hPgK9rL5b50gJjZ1VR81i3xJi8-evEbY+vGFrdwQrOw@mail.gmail.com>
 <CADiSq7dCArHLavgOEE0NegX1QvNoPsxwMk-ZfZgc7feZUubygA@mail.gmail.com>
 <CAP7+vJJvxZ6hNA8-6fq+cvzpTMZsUM0n0702cun9AsQ4Ae9Psg@mail.gmail.com>
 <CAP1=2W5GcMhd3soyqoPnrW8dbjrL7R6SwmkyFQ51Eii+yK5XRw@mail.gmail.com>
Message-ID: <CAP1=2W7L+zFffhKUeafcdqDna_=BQGLQULE5m4gpTdpzLsT_zw@mail.gmail.com>

PEP 519 updated: https://hg.python.org/peps/rev/92feff129ee4

On Wed, 15 Jun 2016 at 11:44 Brett Cannon <brett at python.org> wrote:

> On Wed, 15 Jun 2016 at 11:39 Guido van Rossum <guido at python.org> wrote:
>
>> OK, so let's add a check on the return of __fspath__() and keep the check
>> on path-like or string/bytes.
>>
>
> I'll update the PEP.
>
> Ethan, do you want to leave a note on the os.fspath() issue to update the
> code and go through where we've used os.fspath() to see where we can cut
> out redundant type checks?
>
>
>> --Guido (mobile)
>> On Jun 15, 2016 11:29 AM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>>
>>> On 15 June 2016 at 10:59, Brett Cannon <brett at python.org> wrote:
>>> >
>>> >
>>> > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum <guido at python.org>
>>> wrote:
>>> >>
>>> >> These are really two separate proposals.
>>> >>
>>> >> I'm okay with checking the return value of calling obj.__fspath__;
>>> that's
>>> >> an error in the object anyways, and it doesn't matter much whether we
>>> do
>>> >> this or not (though when approving the PEP I considered this and
>>> decided not
>>> >> to insert a check for this). But it doesn't affect your example, does
>>> it? I
>>> >> guess it's easier to raise now and change the API in the future to
>>> avoid
>>> >> raising in this case (if we find that raising is undesirable) than
>>> the other
>>> >> way around, so I'm +0 on this.
>>> >
>>> > +0 from me as well. I know in some code in the stdlib that has been
>>> ported
>>> > which prior to adding support was explicitly checking for str/bytes
>>> this
>>> > will eliminate its own checking (obviously not a motivating factor as
>>> it's
>>> > pretty minor).
>>>
>>> I'd like a strong assertion that the return value of os.fspath() is a
>>> plausible filesystem path representation (so either bytes or str), and
>>> *not* some other kind of object that can also be used for accessing
>>> the filesystem (like a file descriptor or an IO stream)
>>>
>>> >> The other proposal (passing anything that's not understood right
>>> through)
>>> >> is more interesting and your use case is somewhat compelling.
>>> Catching the
>>> >> exception coming out of os.fspath() would certainly be much messier.
>>> The
>>> >> question remaining is whether, when this behavior is not desired
>>> (e.g. when
>>> >> the caller of os.fspath() just wants a string that it can pass to
>>> open()),
>>> >> the condition of passing something that's neither a string nor supports
>>> __fspath__
>>> >> still produces an understandable error. I'm not sure that that's the
>>> case.
>>> >> E.g. open() accepts file descriptors in addition to paths, but I'm
>>> not sure
>>> >> that accepting an integer is a good idea in most cases -- it either
>>> gives a
>>> >> mystery "Bad file descriptor" error or starts reading/writing some
>>> random
>>> >> system file, which it then closes once the stream is closed.
>>> >
>>> > The FD issue of magically passing through an int was also a concern
>>> when
>>> > Ethan brought this up in an issue on the tracker. My argument is that
>>> FDs
>>> > are not file paths and so shouldn't magically pass through if we're
>>> going to
>>> > type-check anything or claim os.fspath() only works with paths (FDs are
>>> > already open file objects). So in my view  either we go ahead and
>>> type-check
>>> > the return value of __fspath__() and thus restrict everything coming
>>> out of
>>> > os.fspath() to Union[str, bytes] or we don't type check anything and be
>>> > consistent that all os.fspath() does is call __fspath__() if
>>> present.
>>> >
>>> > And just  because I'm thinking about it, I would special-case the FDs,
>>> not
>>> > os.PathLike (clearer why you care and faster as it skips the override
>>> of
>>> > __subclasshook__):
>>> >
>>> > # Can be a single-line ternary operator if preferred.
>>> > if not isinstance(filename, int):
>>> >     filename = os.fspath(filename)
>>>
>>> Note that the LZMA case Ethan cites is one where the code accepts
>>> either an already opened file-like object *or* a path-like object, and
>>> does different things based on which it receives.
>>>
>>> In that scenario, rather than introducing an unconditional "filename =
>>> os.fspath(filename)" before the current logic, it makes more sense to
>>> me to change the current logic to use the new protocol check rather
>>> than a strict typecheck on str/bytes:
>>>
>>>     if isinstance(filename, os.PathLike): # Changed line
>>>         filename = os.fspath(filename)    # New line
>>>         if "b" not in mode:
>>>             mode += "b"
>>>         self._fp = builtins.open(filename, mode)
>>>         self._closefp = True
>>>         self._mode = mode_code
>>>     elif hasattr(filename, "read") or hasattr(filename, "write"):
>>>         self._fp = filename
>>>         self._mode = mode_code
>>>     else:
>>>         raise TypeError(
>>>              "filename must be a path-like or file-like object"
>>>               )
>>>
>>> I *don't* think it makes sense to weaken the guarantees on os.fspath
>>> to let it propagate non-path-like objects.
>>>
>>> Cheers,
>>> Nick.
>>>
>>> --
>>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>>>
>>

From ethan at stoneleaf.us  Wed Jun 15 16:00:01 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 15 Jun 2016 13:00:01 -0700
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <CAMiohogTCrdX-jwZnRpjGupKgK_rPPiB0X1+vWj0BkEZ82B93Q@mail.gmail.com>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
 <CAP1=2W6FZYMaOUdYcG=w=YHjhvnnqkf1Ga_QgMdapqU5MSHU=g@mail.gmail.com>
 <CAMiohogTCrdX-jwZnRpjGupKgK_rPPiB0X1+vWj0BkEZ82B93Q@mail.gmail.com>
Message-ID: <5761B3C1.3060601@stoneleaf.us>

On 06/15/2016 12:24 PM, Koos Zevenhoven wrote:
> On Wed, Jun 15, 2016 at 10:15 PM, Brett Cannon wrote:

>> ABCs like os.PathLike can override __subclasshook__ so that registration
>> isn't required (see
>> https://hg.python.org/cpython/file/default/Lib/os.py#l1136). So registration
>> is definitely good to do to be explicit that you're trying to meet an ABC,
>> but it isn't strictly required.

> And the other question could be turned into whether to make str and
> bytes also PathLike in __subclasshook__.

No, for two reasons.

- most str's and bytes' are not paths;
- PathLike indicates a rich-path object, which str's and bytes' are not.
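
For illustration, a sketch assuming Python 3.6 or later: plain str and
bytes objects are not os.PathLike instances, so code that wants to
accept all three has to say so explicitly. The helper accepts_path is
hypothetical.

```python
import os
import pathlib

# str and bytes do not define __fspath__, so they are not PathLike.
checks = {
    "str": isinstance("spam.txt", os.PathLike),
    "bytes": isinstance(b"spam.txt", os.PathLike),
    "PurePath": isinstance(pathlib.PurePath("spam.txt"), os.PathLike),
}


def accepts_path(obj):
    # Hypothetical helper: the explicit tuple check for callers that
    # want to accept rich path objects as well as str and bytes.
    return isinstance(obj, (os.PathLike, str, bytes))
```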

--
~Ethan~

From ncoghlan at gmail.com  Wed Jun 15 16:01:27 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 15 Jun 2016 13:01:27 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160613122654.GE17328@thunk.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
Message-ID: <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>

[whew, actually read the whole thread]

On 11 June 2016 at 10:28, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/11/2016 11:34 AM, Guido van Rossum wrote:
>>
>> In terms of API design, I'd prefer a flag to os.urandom() indicating a
>> preference for
>> - blocking
>> - raising an exception
>> - weaker random bits
>
>
> +100 ;-)
>
> I proposed exactly this 2 days ago, 5 hours after Larry's initial post.

No, this is a bad idea. Asking novice developers to make security
decisions they're not yet qualified to make when it's genuinely
possible for us to do the right thing by default is the antithesis of
good security API design, and os.urandom() *is* a security API
(whether we like it or not - third party documentation written by the
cryptographic software development community has made it so, since
it's part of their guidelines for writing security sensitive code in
pure Python).

Adding *new* APIs is also a bad idea, since "os.urandom() is the right
answer on every OS except Linux, and also the best currently available
answer on Linux" has been the standard security advice for generating
cryptographic secrets in pure Python code for years now, so we should
only change that guidance if we have extraordinarily compelling
reasons to do so, and we don't. Instead, we have Ted Ts'o himself
chiming in to say: "My preference would be that os.[u]random should
block, because the odds that people would be trying to generate
long-term cryptographic secrets within seconds after boot is very
small, and if you *do* block for a second or two, it's not the end of
the world."

The *actual bug* that triggered this latest firestorm of commentary
(from experts and non-experts alike) had *nothing* to do with user
code calling os.urandom, and instead was a combination of:

- CPython startup requesting cryptographically secure randomness when
it didn't need it
- a systemd init script written in Python running before the kernel
RNG was fully initialised

That created a deadlock between CPython startup and the rest of the
Linux init process, so the latter only continued when the systemd
watchdog timed out and killed the offending script. As others have
noted, this kind of deadlock scenario is generally impossible on other
operating systems, as the operating system doesn't provide a way to
run Python code before the random number generator is ready.

The change Victor made in 3.5.2 to fall back to reading /dev/urandom
directly if the getrandom() syscall returns EAGAIN (effectively
reverting to the Python 3.4 behaviour) was the simplest possible fix
for that problem (and an approach I thoroughly endorse, both for 3.5.2
and for the life of the 3.5 series), but that doesn't make it the
right answer for 3.6+.

To repeat: the problem encountered was NOT due to user code calling
os.urandom(), but rather due to the way CPython initialises its own
internal hash algorithm at interpreter startup. However, due to the
way CPython is currently implemented, fixing the regression in that
not only changed the behaviour of CPython startup, it *also* changed
the behaviour of every call to os.urandom() in Python 3.5.2+.

For 3.6+, we can instead make it so that the only things that actually
rely on cryptographic quality randomness being available are:

- calling a secrets module API
- calling a random.SystemRandom method
- calling os.urandom directly

These are all APIs that were either created specifically for use in
security sensitive situations (secrets module), or have long been
documented (both within our own documentation, and in third party
documentation, books and Q&A sites) as being an appropriate choice for
use in security sensitive situations (os.urandom and
random.SystemRandom).

However, we don't need to make those block waiting for randomness to
be available - we can update them to raise BlockingIOError instead
(which makes it trivial for people to decide for themselves how they
want to handle that case).

Along with that change, we can make it so that starting the
interpreter will never block waiting for cryptographic randomness to
be available (since it doesn't need it), and importing the random
module won't block waiting for it either.

To the best of our knowledge, on all operating systems other than
Linux, encountering the new exception will still be impossible in
practice, as there is no known opportunity to run Python code before
the kernel random number generator is ready.

On Linux, init scripts may still run before the kernel random number
generator is ready, but will now throw an immediate BlockingIOError if
they access an API that relies on cryptographic randomness being
available, rather than potentially deadlocking the init process. Folks
encountering that situation will then need to make an explicit
decision:

- loop until the exception is no longer thrown
- switch to reading from /dev/urandom directly instead of calling os.urandom()
- switch to using a cross-platform non-cryptographic API (probably the
random module)
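
The first option could be sketched roughly as below. This only
illustrates the behaviour proposed in this message; it makes no claim
about what any released Python does. The names random_bytes_with_retry
and FakeSource are hypothetical, and the random source is passed in as
a callable so the sketch can be exercised without a real early-boot
environment.

```python
import time


def random_bytes_with_retry(read_random, n, attempts=5, delay=0.0):
    """Call read_random(n), retrying while it raises BlockingIOError."""
    for _ in range(attempts):
        try:
            return read_random(n)
        except BlockingIOError:
            # Source not ready yet: wait briefly, then try again.
            time.sleep(delay)
    raise BlockingIOError("random source still not ready")


class FakeSource:
    """Hypothetical source that is not ready for its first two calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, n):
        self.calls += 1
        if self.calls <= 2:
            raise BlockingIOError("entropy pool not initialised")
        # Placeholder zero bytes; a real source would return random data.
        return bytes(n)


source = FakeSource()
data = random_bytes_with_retry(source, 16)
```

Under the proposal, a caller would pass os.urandom as read_random and
pick attempts and delay to suit how long it is willing to wait.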

Victor has some additional technical details written up at
http://haypo-notes.readthedocs.io/pep_random.html and I'd be happy to
formalise this proposed approach as a PEP (the current reference is
http://bugs.python.org/issue27282 )

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ethan at stoneleaf.us  Wed Jun 15 16:30:33 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 15 Jun 2016 13:30:33 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
Message-ID: <5761BAE9.10604@stoneleaf.us>

On 06/15/2016 01:01 PM, Nick Coghlan wrote:

> For 3.6+, we can instead make it so that the only things that actually
> rely on cryptographic quality randomness being available are:
>
> - calling a secrets module API
> - calling a random.SystemRandom method
> - calling os.urandom directly
>

> However, we don't need to make those block waiting for randomness to
> be available - we can update them to raise BlockingIOError instead
> (which makes it trivial for people to decide for themselves how they
> want to handle that case).
>
> Along with that change, we can make it so that starting the
> interpreter will never block waiting for cryptographic randomness to
> be available (since it doesn't need it), and importing the random
> module won't block waiting for it either.

+1

--
~Ethan~

From ncoghlan at gmail.com  Wed Jun 15 16:30:19 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 15 Jun 2016 13:30:19 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CANawmycixVLySD9KAAHmxjCoPKfBRGDKWmczwvHmUo3+vuWDkA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CANawmycixVLySD9KAAHmxjCoPKfBRGDKWmczwvHmUo3+vuWDkA@mail.gmail.com>
Message-ID: <CADiSq7cGC2749iQG+nT_6fnW2Bq1yfF=vPZDuM9UD70gPXhQ5A@mail.gmail.com>

On 14 June 2016 at 02:41, Nikita Nemkin <nikita at nemkin.ru> wrote:
> Is there any rationale for rejecting alternatives like:

Good questions - Eric, it's likely worth capturing answers to these in
the PEP for the benefit of future readers.

> 1. Adding standard metaclass with ordered namespace.

Adding metaclasses to an existing class can break compatibility with
third party subclasses, so making it possible for people to avoid that
while still gaining the ability to implicitly expose attribute
ordering to class decorators and other potentially interested parties
is a recurring theme behind this PEP and also PEPs 422 and 487.

> 2. Adding `namespace` or `ordered` args to the default metaclass.

See below (as it relates to your own complexity argument)

> 3. Making compiler fill in __definition_order__ for every class
>     (just like __qualname__) without touching the runtime.
> ?

Class scopes support conditionals and loops, so we can't necessarily
be sure what names will be assigned without running the code. It's
also possible to make attribute assignments via locals() that are
entirely opaque to the compiler, but visible to the interpreter at
runtime.
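
A small sketch of both points. It relies on CPython behaviour, where
locals() inside a class body refers to the class namespace itself; the
class and attribute names are made up for illustration.

```python
import sys


class Example:
    first = 1

    # Which name gets bound depends on a runtime condition.
    if sys.maxsize > 2**32:
        wide = True
    else:
        narrow = True

    # Names created through locals() never appear as assignment
    # targets in the source, so a compiler-only pass cannot see them.
    for _i in range(2):
        locals()["generated_%d" % _i] = _i
    del _i


names = [n for n in vars(Example) if not n.startswith("_")]
```

The contents of names can only be known after the class body has
actually run.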

> To me, any of the above seems preferred to complicating
> the core part of the language forever.
>
> The vast majority of Python classes don't care about their member
> order, this is minority use case receiving majority treatment.
>
> Also, wiring OrderedDict into class creation means elevating it
> from a peripheral utility to indispensable built-in type.

Right, that's one of the key reasons this is a PEP, rather than just
an item on the issue tracker.

The rationale for "Why not make this configurable, rather than
switching it unilaterally?" is that it's actually *simpler* overall to
just make it the default - we can then change the documentation to say
"class bodies are evaluated in a collections.OrderedDict instance by
default" and record the consequences of that, rather than having to
document yet another class customisation mechanism.

It also eliminates boilerplate from class decorator usage
instructions, where people have to write "to use this class decorator,
you must also specify 'namespace=collections.OrderedDict' in your
class header"

Folks that don't need the ordering information do end up paying a
slight import time and memory cost, which is another key reason for
handling the proposal as a PEP rather than just as a tracker issue.

Aside from the boilerplate reduction when used in conjunction with a
class decorator, a further possible category of consumers would be
documentation generators like pydoc and Sphinx apidoc, which may be
able to switch to displaying methods in definition order, rather than
the current approach of always listing them in alphabetical order.
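
As a rough sketch of what such a consumer could look like, assuming an
interpreter where the class namespace preserves definition order (the
behaviour this PEP proposes as the default). The decorator record_order
and the attribute _field_order are hypothetical names, not part of any
proposal.

```python
def record_order(cls):
    """Hypothetical decorator: store public names in definition order."""
    cls._field_order = tuple(
        name for name in vars(cls) if not name.startswith("_")
    )
    return cls


@record_order
class Point:
    x = 0
    y = 0
    label = ""

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
```

No namespace argument or custom metaclass appears in the class header,
which is the boilerplate reduction described above.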

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From k7hoven at gmail.com  Wed Jun 15 16:58:17 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 15 Jun 2016 23:58:17 +0300
Subject: [Python-Dev] proposed os.fspath() change
In-Reply-To: <5761B3C1.3060601@stoneleaf.us>
References: <57617E57.40808@stoneleaf.us>
 <CAP7+vJKsZoptxJ_z=z7Oa5iJ4oqkp===cAq_6+=Bfd5UA0K1oQ@mail.gmail.com>
 <CAMiohoiL3yE2NpuoovMZPzd2hOJ+CpxoWZNR_sJuERVhe1o1BA@mail.gmail.com>
 <CAP1=2W6FZYMaOUdYcG=w=YHjhvnnqkf1Ga_QgMdapqU5MSHU=g@mail.gmail.com>
 <CAMiohogTCrdX-jwZnRpjGupKgK_rPPiB0X1+vWj0BkEZ82B93Q@mail.gmail.com>
 <5761B3C1.3060601@stoneleaf.us>
Message-ID: <CAMiohogTyj59dPP_sjmn3r5wQOn9U=_ebRk08FiuqmaQ=+5DFA@mail.gmail.com>

On Wed, Jun 15, 2016 at 11:00 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 06/15/2016 12:24 PM, Koos Zevenhoven wrote:
>>
>> And the other question could be turned into whether to make str and
>> bytes also PathLike in __subclasshook__.
>
> No, for two reasons.
>
> - most str's and bytes' are not paths;

True. Well, at least most str and bytes objects are not *meant* to be
used as paths, even if they could be.

> - PathLike indicates a rich-path object, which str's and bytes' are not.

This does not count as a reason.

If this were called pathlib.PathABC, I would definitely agree [1]. But
since this is called os.PathLike, I'm not quite as sure. Anyway,
including str and bytes is more of a type hinting issue. And since
type hints will also act as documentation, the naming of types is
becoming more important.

-- Koos

[1] No, I'm not proposing moving this to pathlib

> --
> ~Ethan~

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From andersjm at stofanet.dk  Wed Jun 15 16:42:09 2016
From: andersjm at stofanet.dk (Anders J. Munch)
Date: Wed, 15 Jun 2016 22:42:09 +0200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
 <20160615123401.GB27919@ando.pearwood.info>
 <CAG8k2+4Qn7Kj2B4RQZTJ8W8WsQOfy=tavKQvB_YFifqRhRwc6g@mail.gmail.com>
Message-ID: <af9e8d2e-7ba9-57ff-cec2-7870f5a449d6@stofanet.dk>

Paul Moore:
 > Finding out whether users/projects typically write such a helper
 > function for themselves would be a better way of getting this
 > information. Personally, I suspect they don't, but facts beat
 > speculation.

Well, I did. It was necessary to get 2to3 conversion to work(*). I turned every 
occurrence of
     E.encode('base-64')
and
    E.decode('base-64')
into helper function calls that for Python 3 did:
    b64encode(E).decode('ascii')
and
    b64decode(E.encode('ascii'))
(Or something similar, I don't have the code in front of me.)

Leaving out .decode/.encode('ascii') would simply not have worked. That would 
just be asking for TypeError's.
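
A sketch of what the Python 3 side of such helpers might look like.
The function names b64_encode_text and b64_decode_text are
hypothetical; only the b64encode/b64decode calls and the 'ascii'
conversions come from the description above.

```python
from base64 import b64decode, b64encode


def b64_encode_text(data):
    """Encode bytes and return the base64 form as str."""
    return b64encode(data).decode("ascii")


def b64_decode_text(text):
    """Decode a base64 str back into the original bytes."""
    return b64decode(text.encode("ascii"))


encoded = b64_encode_text(b"hello")
decoded = b64_decode_text(encoded)
```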

regards, Anders

(*) Yes, I use 2to3, believe it or not.  Maintaining Python 2 code and doing an 
automated conversion to Python 3 as needed.

From njs at pobox.com  Wed Jun 15 19:12:57 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 15 Jun 2016 16:12:57 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
Message-ID: <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>

On Wed, Jun 15, 2016 at 1:01 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
[...]
> For 3.6+, we can instead make it so that the only things that actually
> rely on cryptographic quality randomness being available are:
>
> - calling a secrets module API
> - calling a random.SystemRandom method
> - calling os.urandom directly
>
> These are all APIs that were either created specifically for use in
> security sensitive situations (secrets module), or have long been
> documented (both within our own documentation, and in third party
> documentation, books and Q&A sites) as being an appropriate choice for
> use in security sensitive situations (os.urandom and
> random.SystemRandom).
>
> However, we don't need to make those block waiting for randomness to
> be available - we can update them to raise BlockingIOError instead
> (which makes it trivial for people to decide for themselves how they
> want to handle that case).
>
> Along with that change, we can make it so that starting the
> interpreter will never block waiting for cryptographic randomness to
> be available (since it doesn't need it), and importing the random
> module won't block waiting for it either.

This all seems exactly right to me, to the point that I've been
dreading having to find the time to write pretty much this exact
email. So thank you :-)

> To the best of our knowledge, on all operating systems other than
> Linux, encountering the new exception will still be impossible in
> practice, as there is no known opportunity to run Python code before
> the kernel random number generator is ready.
>
> On Linux, init scripts may still run before the kernel random number
> generator is ready, but will now throw an immediate BlockingIOError if
> they access an API that relies on cryptographic randomness being
> available, rather than potentially deadlocking the init process. Folks
> encountering that situation will then need to make an explicit
> decision:
>
> - loop until the exception is no longer thrown
> - switch to reading from /dev/urandom directly instead of calling os.urandom()
> - switch to using a cross-platform non-cryptographic API (probably the
> random module)
>
> Victor has some additional technical details written up at
> http://haypo-notes.readthedocs.io/pep_random.html and I'd be happy to
> formalise this proposed approach as a PEP (the current reference is
> http://bugs.python.org/issue27282 )
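
As a rough sketch of the first option above ("loop until the exception is no
longer thrown"): this assumes the proposed behaviour in which os.urandom()
raises BlockingIOError while the kernel RNG is not ready. The helper name,
the retry count, the delay and the injectable `source` parameter are all
invented for illustration.

```python
import os
import time


def urandom_with_retry(n, attempts=50, delay=0.1, source=os.urandom):
    """Return n random bytes, retrying while the source is not ready.

    `source` defaults to os.urandom; it is a parameter only so the retry
    logic can be exercised without a real early-boot environment.
    """
    for _ in range(attempts):
        try:
            return source(n)
        except BlockingIOError:
            # Hypothetical "RNG not yet initialised" signal: wait and retry.
            time.sleep(delay)
    raise BlockingIOError("random source still not ready after retrying")
```

An init script that cannot afford to wait would pick one of the other two
options instead of calling a helper like this.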

I'd make two additional suggestions:

- one person did chime in on the thread to say that they've used
os.urandom for non-security-sensitive purposes, simply because it
provided a convenient "give me a random byte-string" API that is
missing from random. I think we should go ahead and add a .randbytes
method to random.Random that simply returns a random bytestring using
the regular RNG, to give these users a nice drop-in replacement for
os.urandom.

Rationale: I don't think the existence of these users should block
making os.urandom appropriate for generating secrets, because (1) a
glance at github shows that this is very unusual -- if you skim
through this search you get page after page of functions with names
like "generate_secret_key"

  https://github.com/search?l=python&p=2&q=urandom&ref=searchresults&type=Code&utf8=%E2%9C%93

and (2) for the minority of people who are using os.urandom for
non-security-sensitive purposes, if they find os.urandom raising an
error, then this is just a regular bug that they will notice
immediately and fix, and anyway it's basically never going to happen.
(As far as we can tell, this has never yet happened in the wild, even
once.) OTOH if os.urandom is allowed to fail silently, then people who
are using it to generate secrets will get silent catastrophic
failures, plus those users can't assume it will never happen because
they have to worry about active attackers trying to drive systems into
unusual states. So I'd much rather ask the non-security-sensitive
users to switch to using something in random, than force the
cryptographic users to switch to using secrets. But it does seem like
it would be good to give those non-security-sensitive users something
to switch to :-).
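
To make the first suggestion concrete, here is a minimal sketch of what a
.randbytes method could do, written as a subclass so it can be tried out
directly. The class name RandomWithBytes is hypothetical, and building the
result from getrandbits() is just one possible implementation, not a
statement about how it would be done in the stdlib.

```python
import random


class RandomWithBytes(random.Random):
    """random.Random plus a non-cryptographic 'give me n bytes' method."""

    def randbytes(self, n):
        if n < 0:
            raise ValueError("number of bytes must be non-negative")
        if n == 0:
            return b""
        # getrandbits(k) returns an int in [0, 2**k), which always fits
        # in exactly n bytes when k == 8 * n.
        return self.getrandbits(n * 8).to_bytes(n, "little")
```

Because this draws from the regular seeded generator, the output is
reproducible for a given seed and must not be used for secrets.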

- It's not exactly true that the Python interpreter doesn't need
cryptographic randomness to initialize SipHash -- it's more that
*some* Python invocations need unguessable randomness (to first
approximation: all those which are exposed to hostile input), and some
don't. And since the Python interpreter has no idea which case it's
in, and since it's unacceptable for it to break invocations that don't
need unguessable hashes, then it has to err on the side of continuing
without randomness. All that's fine.

But, given that the interpreter doesn't know which state it's in,
there's also the possibility that this invocation *will* be exposed to
hostile input, and the 3.5.2+ behavior gives absolutely no warning
that this is what's happening. So instead of letting this potential
error pass silently, I propose that if SipHash fails to acquire real
randomness at startup, then it should issue a warning. In practice,
this will almost never happen. But in the rare cases it does, it at
least gives the user a fighting chance to realize that their system is
in a potentially dangerous state. And by using the warnings module, we
automatically get quite a bit of flexibility. If some particular
invocation (e.g. systemd-cron) has audited their code and decided that
they don't care about this issue, they can make the message go away:

   PYTHONWARNINGS=ignore::NoEntropyAtStartupWarning

OTOH if some particular invocation knows that they do process
potentially hostile input early on (e.g. cloud-init, maybe?), then
they can explicitly promote the warning to an error:

  PYTHONWARNINGS=error::NoEntropyAtStartupWarning

(I guess the way to implement this would be for the SipHash
initialization code -- which runs very early -- to set some flag, and
then we expose that flag in sys._something, and later in the startup
sequence check for it after the warnings module is functional.
Exposing the flag at the Python level would also make it possible for
code like cloud-init to do its own explicit check and respond
appropriately.)
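
A small sketch of how the warning side of this could behave, using the
existing warnings machinery. NoEntropyAtStartupWarning does not exist today,
so it is defined here as a stand-in, and check_startup_entropy() with its
boolean argument is a made-up substitute for the sys._something flag.

```python
import warnings


class NoEntropyAtStartupWarning(RuntimeWarning):
    """Hypothetical category: hash seed was not taken from the OS RNG."""


def check_startup_entropy(hash_seed_was_random):
    """Emit the warning if the (hypothetical) startup flag says we fell back."""
    if not hash_seed_was_random:
        warnings.warn(
            "hash randomization was seeded without OS entropy",
            NoEntropyAtStartupWarning,
            stacklevel=2,
        )


# In-process equivalents of the two PYTHONWARNINGS settings above:
#   warnings.simplefilter("ignore", NoEntropyAtStartupWarning)
#   warnings.simplefilter("error", NoEntropyAtStartupWarning)
```

With the "error" filter installed, the call raises
NoEntropyAtStartupWarning as an exception; with "ignore", it is silent.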

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From ncoghlan at gmail.com  Wed Jun 15 19:26:07 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 15 Jun 2016 16:26:07 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
Message-ID: <CADiSq7cLOARKV=2=nh0ckQLO_rvOvSOhOmgX1q2hVTUK2USd=A@mail.gmail.com>

On 15 June 2016 at 16:12, Nathaniel Smith <njs at pobox.com> wrote:
> On Wed, Jun 15, 2016 at 1:01 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Victor has some additional technical details written up at
>> http://haypo-notes.readthedocs.io/pep_random.html and I'd be happy to
>> formalise this proposed approach as a PEP (the current reference is
>> http://bugs.python.org/issue27282 )
>
> I'd make two additional suggestions:
>
> - one person did chime in on the thread to say that they've used
> os.urandom for non-security-sensitive purposes, simply because it
> provided a convenient "give me a random byte-string" API that is
> missing from random. I think we should go ahead and add a .randbytes
> method to random.Random that simply returns a random bytestring using
> the regular RNG, to give these users a nice drop-in replacement for
> os.urandom.

That seems reasonable.

> - It's not exactly true that the Python interpreter doesn't need
> cryptographic randomness to initialize SipHash -- it's more that
> *some* Python invocations need unguessable randomness (to first
> approximation: all those which are exposed to hostile input), and some
> don't. And since the Python interpreter has no idea which case it's
> in, and since it's unacceptable for it to break invocations that don't
> need unguessable hashes, then it has to err on the side of continuing
> without randomness. All that's fine.
>
> But, given that the interpreter doesn't know which state it's in,
> there's also the possibility that this invocation *will* be exposed to
> hostile input, and the 3.5.2+ behavior gives absolutely no warning
> that this is what's happening. So instead of letting this potential
> error pass silently, I propose that if SipHash fails to acquire real
> randomness at startup, then it should issue a warning. In practice,
> this will almost never happen. But in the rare cases it does, it at
> least gives the user a fighting chance to realize that their system is
> in a potentially dangerous state. And by using the warnings module, we
> automatically get quite a bit of flexibility.
>
> If some particular
> invocation (e.g. systemd-cron) has audited their code and decided that
> they don't care about this issue, they can make the message go away:
>
>    PYTHONWARNINGS=ignore::NoEntropyAtStartupWarning
>
> OTOH if some particular invocation knows that they do process
> potentially hostile input early on (e.g. cloud-init, maybe?), then
> they can explicitly promote the warning to an error:
>
>   PYTHONWARNINGS=error::NoEntropyAtStartupWarning
>
> (I guess the way to implement this would be for the SipHash
> initialization code -- which runs very early -- to set some flag, and
> then we expose that flag in sys._something, and later in the startup
> sequence check for it after the warnings module is functional.
> Exposing the flag at the Python level would also make it possible for
> code like cloud-init to do its own explicit check and respond
> appropriately.)

A Python level warning/flag seems overly elaborate to me, but we can
easily emit a warning on stderr when SipHash is initialised via the
fallback rather than the operating system's RNG.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From jake at lwn.net  Wed Jun 15 18:30:44 2016
From: jake at lwn.net (Jake Edge)
Date: Wed, 15 Jun 2016 16:30:44 -0600
Subject: [Python-Dev] Final round of the Python Language Summit coverage at
 LWN
Message-ID: <20160615163044.5abe13af@redtail.lan>

Hola python-dev,

The final batch of articles from the Python Language Summit is now
ready.

The starting point is here: https://lwn.net/Articles/688969/

I have added the final six sessions (with SubscriberLinks for those
without a subscription):

Python 3 in Fedora: https://lwn.net/Articles/690676/
https://lwn.net/SubscriberLink/690676/cdf118081ac0ffd5/

The Python JITs are coming: https://lwn.net/Articles/691070/
https://lwn.net/SubscriberLink/691070/2714fd6a4934f016/

Pyjion: https://lwn.net/Articles/691152/
https://lwn.net/SubscriberLink/691152/6334fd8d5a9992c0/

Why is Python slow?: https://lwn.net/Articles/691243/
https://lwn.net/SubscriberLink/691243/669cb2bf2fe220c4/

Automated testing of CPython patches: https://lwn.net/Articles/691307/
https://lwn.net/SubscriberLink/691307/89feefecfe425f58/

The Python security response team: https://lwn.net/Articles/691308/
https://lwn.net/SubscriberLink/691308/432ff50e0f9b794f/

The articles will be freely available (without using the
SubscriberLink) to the world at large in a week ... until then, feel
free to share the SubscriberLinks.

Hopefully I have captured things reasonably well.  If there are
corrections or clarifications needed, though, I recommend posting them
as comments on the article.

With luck, I will be able to sit in on the summit again next year ...

enjoy!

jake

-- 
Jake Edge - LWN - jake at lwn.net - http://lwn.net

From tytso at mit.edu  Thu Jun 16 01:25:41 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Thu, 16 Jun 2016 01:25:41 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
Message-ID: <20160616052541.GB32689@thunk.org>

On Wed, Jun 15, 2016 at 04:12:57PM -0700, Nathaniel Smith wrote:
> - It's not exactly true that the Python interpreter doesn't need
> cryptographic randomness to initialize SipHash -- it's more that
> *some* Python invocations need unguessable randomness (to first
> approximation: all those which are exposed to hostile input), and some
> don't. And since the Python interpreter has no idea which case it's
> in, and since it's unacceptable for it to break invocations that don't
> need unguessable hashes, then it has to err on the side of continuing
> without randomness. All that's fine.

In practice, those Python invocations which are exposed to hostile input
are those that are started while the network is up.  The vast
majority of the time, they are launched by the web browser --- and if this
happens after a second or so of the system getting networking
interrupts, (a) getrandom won't block, and (b) /dev/urandom and
getrandom will be initialized.

Also, I wish people would stop saying that this is only an issue on Linux.
Again, FreeBSD's /dev/urandom will block as well if it is
uninitialized.  It's just that in practice, for both Linux and
FreeBSD, we try very hard to make sure /dev/urandom is fully
initialized by the time it matters.  So far, it's only
on Linux that there was an attempt to use Python in the early init
scripts, in a VM, and in a system where everything is modularized,
such that the deadlock became visible.

> (I guess the way to implement this would be for the SipHash
> initialization code -- which runs very early -- to set some flag, and
> then we expose that flag in sys._something, and later in the startup
> sequence check for it after the warnings module is functional.
> Exposing the flag at the Python level would also make it possible for
> code like cloud-init to do its own explicit check and respond
> appropriately.)

I really don't think it's that big of a deal in *practice*, but
if you really are concerned about the very remote possibility that a
Python invocation could start in early boot, and *then* also stick
around for the long term, and *then* be exposed to hostile input ---
what if you set the flag, and then later on, after N minutes, either
automatically or via some trigger such as cloud-init, try and see
if /dev/urandom is initialized (even a few seconds later, so long as
the init scripts are hanging, it should be initialized), and then have
Python rehash all of its dicts, or maybe just the non-system dicts (since
those are presumably the ones most likely to be exposed to hostile input).

                                        - Ted

From greg.ewing at canterbury.ac.nz  Thu Jun 16 01:59:26 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 16 Jun 2016 17:59:26 +1200
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <20160615123401.GB27919@ando.pearwood.info>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAP7+vJKqqpRK5iUx_wHzyb6UExByfFdQFt0+jgVLTjWimzTitQ@mail.gmail.com>
 <20160615123401.GB27919@ando.pearwood.info>
Message-ID: <5762403E.90701@canterbury.ac.nz>

Steven D'Aprano wrote:
> I'm 
> satisfied that the choice made by Python is the right choice, and that 
> it meets the spirit (if, arguably, not the letter) of the RFC.

IMO it meets the letter (if you read it a certain way)
but *not* the spirit.

-- 
Greg

From barry at python.org  Thu Jun 16 02:45:08 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 09:45:08 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
Message-ID: <20160616094508.3acf1de7.barry@wooz.org>

On Jun 15, 2016, at 01:01 PM, Nick Coghlan wrote:

>No, this is a bad idea. Asking novice developers to make security
>decisions they're not yet qualified to make when it's genuinely
>possible for us to do the right thing by default is the antithesis of
>good security API design, and os.urandom() *is* a security API
>(whether we like it or not - third party documentation written by the
>cryptographic software development community has made it so, since
>it's part of their guidelines for writing security sensitive code in
>pure Python).

Regardless of what third parties have said about os.urandom(), let's look at
what *we* have said about it.  Going back to pre-churn 3.4 documentation:

    os.urandom(n)
    Return a string of n random bytes suitable for cryptographic use.

    This function returns random bytes from an OS-specific randomness
    source. The returned data should be unpredictable enough for cryptographic
    applications, though its exact quality depends on the OS
    implementation. On a Unix-like system this will query /dev/urandom, and on
    Windows it will use CryptGenRandom(). If a randomness source is not found,
    NotImplementedError will be raised.

    For an easy-to-use interface to the random number generator provided by
    your platform, please see random.SystemRandom.

So we very clearly provided platform-dependent caveats on the cryptographic
quality of os.urandom().  We also made a strong claim that there's a direct
connection between os.urandom() and /dev/urandom on "Unix-like system(s)".

We broke that particular promise in 3.5 and semi-fixed it in 3.5.2.

>Adding *new* APIs is also a bad idea, since "os.urandom() is the right
>answer on every OS except Linux, and also the best currently available
>answer on Linux" has been the standard security advice for generating
>cryptographic secrets in pure Python code for years now, so we should
>only change that guidance if we have extraordinarily compelling
>reasons to do so, and we don't.

Disagree.

We have broken one long-term promise on os.urandom() ("On a Unix-like system
this will query /dev/urandom") and changed another ("should be unpredictable
enough for cryptographic applications, though its exact quality depends on OS
implementations").

We broke the experienced Linux developer's natural and long-standing link
between the API called os.urandom() and /dev/urandom.  This breaks pre-3.5
code that assumes read-from-/dev/urandom semantics for os.urandom().

We have introduced churn.  Predicting a future SO question such as "Can
os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier, yes
possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of the
3.5.x series, and yes possibly in Python 3.6 and beyond".

We have a better answer for "cryptographically appropriate" use cases in
Python 3.6 - the secrets module.  Trying to make os.urandom() "the right
answer on every OS" weakens the promotion of secrets as *the* module to use
for cryptographically appropriate use cases.

IMHO it would be better to leave os.urandom() well enough alone, except for
the documentation which should effectively say, a la 3.4:

    os.urandom(n)
    Return a string of n random bytes suitable for cryptographic use.

    This function returns random bytes from an OS-specific randomness
    source. The returned data should be unpredictable enough for cryptographic
    applications, though its exact quality depends on the OS
    implementation. On a Unix-like system this will query /dev/urandom, and on
    Windows it will use CryptGenRandom(). If a randomness source is not found,
    NotImplementedError will be raised.

    Cryptographic applications should use the secrets module for stronger
    guaranteed sources of randomness.

    For an easy-to-use interface to the random number generator provided by
    your platform, please see random.SystemRandom.

Cheers,
-Barry

From larry at hastings.org  Thu Jun 16 02:52:19 2016
From: larry at hastings.org (Larry Hastings)
Date: Wed, 15 Jun 2016 23:52:19 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616094508.3acf1de7.barry@wooz.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
Message-ID: <57624CA3.3020704@hastings.org>

On 06/15/2016 11:45 PM, Barry Warsaw wrote:
> So we very clearly provided platform-dependent caveats on the cryptographic
> quality of os.urandom().  We also made a strong claim that there's a direct
> connection between os.urandom() and /dev/urandom on "Unix-like system(s)".
>
> We broke that particular promise in 3.5 and semi-fixed it in 3.5.2.

Well, 3.5.2 hasn't happened yet.  So if you see it as still being 
broken, please speak up now.

Why do you call it only "semi-fixed"?  As far as I understand it, the 
semantics of os.urandom() in 3.5.2rc1 are indistinguishable from reading 
from /dev/urandom directly, except it may not need to use a file handle.

//arry/

From robertc at robertcollins.net  Thu Jun 16 03:26:14 2016
From: robertc at robertcollins.net (Robert Collins)
Date: Thu, 16 Jun 2016 19:26:14 +1200
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAJ3HoZ27DPHKj=3TyoXxLTWPSq185=izWZKvvcxDYtduR+6Z9A@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <57624CA3.3020704@hastings.org>
 <CAJ3HoZ3pioYu0OhGMPGyHf-q_3sqd4fgMSpnfb=PfuK6mLKfJw@mail.gmail.com>
 <CAJ3HoZ27DPHKj=3TyoXxLTWPSq185=izWZKvvcxDYtduR+6Z9A@mail.gmail.com>
Message-ID: <CAJ3HoZ1eTMoF6jCKD=9RMZ+8Z9q9ZRU4HMBSbhqsTXaD8QU93g@mail.gmail.com>

On 16 Jun 2016 6:55 PM, "Larry Hastings" <larry at hastings.org> wrote:
> Why do you call it only "semi-fixed"?  As far as I understand it, the
> semantics of os.urandom() in 3.5.2rc1 are indistinguishable from reading
> from /dev/urandom directly, except it may not need to use a file handle.

Which is a contract change. Someone testing in, e.g., a chroot could have a
different device on /dev/urandom, and now they will need to intercept
syscalls for the same effect. Personally I think this is fine, but assuming
I see Barry's point correctly, it is indeed not the same as it was.

-rob

From njs at pobox.com  Thu Jun 16 03:36:13 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 16 Jun 2016 00:36:13 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616052541.GB32689@thunk.org>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
Message-ID: <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>

On Wed, Jun 15, 2016 at 10:25 PM, Theodore Ts'o <tytso at mit.edu> wrote:
> On Wed, Jun 15, 2016 at 04:12:57PM -0700, Nathaniel Smith wrote:
>> - It's not exactly true that the Python interpreter doesn't need
>> cryptographic randomness to initialize SipHash -- it's more that
>> *some* Python invocations need unguessable randomness (to first
>> approximation: all those which are exposed to hostile input), and some
>> don't. And since the Python interpreter has no idea which case it's
>> in, and since it's unacceptable for it to break invocations that don't
>> need unguessable hashes, then it has to err on the side of continuing
>> without randomness. All that's fine.
>
> In practice, those Python invocations which are exposed to hostile input
> are those that are started while the network is up.  The vast
> majority of the time, they are launched by the web browser --- and if this
> happens after a second or so of the system getting networking
> interrupts, (a) getrandom won't block, and (b) /dev/urandom and
> getrandom will be initialized.

Not sure what you mean about the vast majority of Python invocations
being launched by the web browser? But anyway, sure, usually this
isn't an issue. This is just a discussion of what to do in the
unlikely case when it actually has become an issue, and it's hard to
be certain that this will *never* happen. E.g. it's entirely plausible
that someone will write some cloud-init plugin that exposes an HTTP
server or something. People do all kinds of weird things in VMs these
days... Basically this is a question of whether we should make an
(unlikely) error totally invisible to the user, and "errors should
never pass silently" is right there in the Zen of Python :-).

> Also, I wish people would stop saying that this is only an issue on Linux.
> Again, FreeBSD's /dev/urandom will block as well if it is
> uninitialized.  It's just that in practice, for both Linux and
> FreeBSD, we try very hard to make sure /dev/urandom is fully
> initialized by the time it matters.  So far, it's only
> on Linux that there was an attempt to use Python in the early init
> scripts, in a VM, and in a system where everything is modularized,
> such that the deadlock became visible.
>
>
>> (I guess the way to implement this would be for the SipHash
>> initialization code -- which runs very early -- to set some flag, and
>> then we expose that flag in sys._something, and later in the startup
>> sequence check for it after the warnings module is functional.
>> Exposing the flag at the Python level would also make it possible for
>> code like cloud-init to do its own explicit check and respond
>> appropriately.)
>
> I really don't think it's that big of a deal in *practice*, but
> if you really are concerned about the very remote possibility that a
> Python invocation could start in early boot, and *then* also stick
> around for the long term, and *then* be exposed to hostile input ---
> what if you set the flag, and then later on, after N minutes, either
> automatically or via some trigger such as cloud-init, try and see
> if /dev/urandom is initialized (even a few seconds later, so long as
> the init scripts are hanging, it should be initialized), and then have
> Python rehash all of its dicts, or maybe just the non-system dicts (since
> those are presumably the ones most likely to be exposed to hostile input).

I don't think this is technically doable. There's no global list of
hash tables, and Python exposes the actual hash values to user code
with some guarantee that they won't change.
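
A small illustration of the second point, as a sketch rather than anything
from CPython itself: user code can capture a hash value and keep it
wherever it likes, so the interpreter has no way to find and update every
copy if the seed were changed after startup. The CachedKey class is invented
for this example.

```python
class CachedKey:
    """Key object that stores the hash of its name once, at creation."""

    def __init__(self, name):
        self.name = name
        # Computed with whatever hash seed the interpreter started with,
        # then kept in an ordinary attribute that Python knows nothing about.
        self._hash = hash(name)

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, CachedKey) and self.name == other.name


table = {CachedKey("spam"): 1}
# Lookup only works because hash("spam") is the same now as it was when
# the first key cached it.
value = table[CachedKey("spam")]
```

If str hashes were reseeded between the two CachedKey("spam") calls, the
second key would carry a different cached hash and the lookup would fail.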

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From stefan at bytereef.org  Thu Jun 16 03:48:28 2016
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 16 Jun 2016 07:48:28 +0000 (UTC)
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
Message-ID: <loom.20160616T094601-799@post.gmane.org>

Nathaniel Smith <njs <at> pobox.com> writes:
> On Wed, Jun 15, 2016 at 10:25 PM, Theodore Ts'o <tytso <at> mit.edu> wrote:
> > In practice, those Python invocations which are exposed to hostile input
> > are those that are started while the network is up.
> 
> Not sure what you mean about the vast majority of Python invocations
> being launched by the web browser?

"Python invocations which are exposed to hostile input". ;)

Stefan Krah

From njs at pobox.com  Thu Jun 16 03:53:38 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 16 Jun 2016 00:53:38 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616094508.3acf1de7.barry@wooz.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
Message-ID: <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>

On Wed, Jun 15, 2016 at 11:45 PM, Barry Warsaw <barry at python.org> wrote:
> On Jun 15, 2016, at 01:01 PM, Nick Coghlan wrote:
>
>>No, this is a bad idea. Asking novice developers to make security
>>decisions they're not yet qualified to make when it's genuinely
>>possible for us to do the right thing by default is the antithesis of
>>good security API design, and os.urandom() *is* a security API
>>(whether we like it or not - third party documentation written by the
>>cryptographic software development community has made it so, since
>>it's part of their guidelines for writing security sensitive code in
>>pure Python).
>
> Regardless of what third parties have said about os.urandom(), let's look at
> what *we* have said about it.  Going back to pre-churn 3.4 documentation:
>
>     os.urandom(n)
>     Return a string of n random bytes suitable for cryptographic use.
>
>     This function returns random bytes from an OS-specific randomness
>     source. The returned data should be unpredictable enough for cryptographic
>     applications, though its exact quality depends on the OS
>     implementation. On a Unix-like system this will query /dev/urandom, and on
>     Windows it will use CryptGenRandom(). If a randomness source is not found,
>     NotImplementedError will be raised.
>
>     For an easy-to-use interface to the random number generator provided by
>     your platform, please see random.SystemRandom.
>
> So we very clearly provided platform-dependent caveats on the cryptographic
> quality of os.urandom().  We also made a strong claim that there's a direct
> connection between os.urandom() and /dev/urandom on "Unix-like system(s)".
>
> We broke that particular promise in 3.5 and semi-fixed it in 3.5.2.
>
>>Adding *new* APIs is also a bad idea, since "os.urandom() is the right
>>answer on every OS except Linux, and also the best currently available
>>answer on Linux" has been the standard security advice for generating
>>cryptographic secrets in pure Python code for years now, so we should
>>only change that guidance if we have extraordinarily compelling
>>reasons to do so, and we don't.
>
> Disagree.
>
> We have broken one long-term promise on os.urandom() ("On a Unix-like system
> this will query /dev/urandom") and changed another ("should be unpredictable
> enough for cryptographic applications, though its exact quality depends on OS
> implementations").
>
> We broke the experienced Linux developer's natural and long-standing link
> between the API called os.urandom() and /dev/urandom.  This breaks pre-3.5
> code that assumes read-from-/dev/urandom semantics for os.urandom().
>
> We have introduced churn.  Predicting a future SO question such as "Can
> os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier, yes
> possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of the
> 3.5.x series, and yes possibly in Python 3.6 and beyond".

It also depends on the kernel version, since it will never block on
old kernels that are missing getrandom(), but it might block on future
kernels if Linux's /dev/urandom ever becomes blocking. (Ted's said
that this is not going to happen now, but the only reason it isn't was
that he tried to make the change and it broke some distros that are
still in use -- so it seems entirely possible that it will happen a
few years from now.)

> We have a better answer for "cryptographically appropriate" use cases in
> Python 3.6 - the secrets module.  Trying to make os.urandom() "the right
> answer on every OS" weakens the promotion of secrets as *the* module to use
> for cryptographically appropriate use cases.
>
> IMHO it would be better to leave os.urandom() well enough alone, except for
> the documentation which should effectively say, a la 3.4:
>
>     os.urandom(n)
>     Return a string of n random bytes suitable for cryptographic use.
>
>     This function returns random bytes from an OS-specific randomness
>     source. The returned data should be unpredictable enough for cryptographic
>     applications, though its exact quality depends on the OS
>     implementation. On a Unix-like system this will query /dev/urandom, and on
>     Windows it will use CryptGenRandom(). If a randomness source is not found,
>     NotImplementedError will be raised.
>
>     Cryptographic applications should use the secrets module for stronger
>     guaranteed sources of randomness.
>
>     For an easy-to-use interface to the random number generator provided by
>     your platform, please see random.SystemRandom.

This is not an accurate docstring, though. The more accurate docstring
for your proposed behavior would be:

os.urandom(n)
Return a string of n bytes that will usually, but not always, be
suitable for cryptographic use.

This function returns random bytes from an OS-specific randomness
source. On non-Linux OSes, this uses the best available source of
randomness, e.g. CryptGenRandom() on Windows and /dev/urandom on OS X,
and thus will be strong enough for cryptographic use. However, on
Linux it uses a deprecated API (/dev/urandom) which in rare cases is
known to return bytes that look random, but aren't. There is no way to
know when this has happened; your code will just silently stop being
secure. In some unusual configurations, where Python is not configured
with any source of randomness, it will raise NotImplementedError.

You should never use this function. If you need unguessable random
bytes, then the 'secrets' module is always a strictly better choice --
unlike this function, it always uses the best available source of
cryptographic randomness, even on Linux. Alternatively, if you need
random bytes but it doesn't matter whether other people can guess
them, then the 'random' module is always a strictly better choice --
it will be faster, as well as providing useful features like
deterministic seeding.

---

In practice, your proposal means that ~all existing code that uses
os.urandom becomes incorrect and should be switched to either secrets
or random. This is *far* more churn for end-users than Nick's
proposal.
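
To make the "switch to either secrets or random" split concrete, here is a
minimal sketch, assuming Python 3.6+ (where the secrets module exists). The
helper names unguessable_bytes and reproducible_bytes are made up for
illustration and are not part of any proposal in this thread.

```python
import random
import secrets


def unguessable_bytes(n):
    """Bytes for keys, tokens, or nonces, drawn from the OS CSPRNG."""
    return secrets.token_bytes(n)


def reproducible_bytes(n, seed):
    """Bytes for simulations or tests; seeded, so NOT suitable for security."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


key = unguessable_bytes(32)

# The same seed always yields the same byte string (deterministic seeding).
sample_a = reproducible_bytes(16, seed=1234)
sample_b = reproducible_bytes(16, seed=1234)
```

Which helper is right depends only on whether an attacker guessing the bytes
would matter, which is the distinction the docstring above is drawing.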

...Anyway, since there's clearly going to be at least one PEP about
this, maybe we should stop rehashing bits and pieces of the argument
in these long threads that most people end up skipping and then
rehashing again later?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From barry at python.org  Thu Jun 16 04:03:29 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 11:03:29 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <57624CA3.3020704@hastings.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <57624CA3.3020704@hastings.org>
Message-ID: <20160616110329.37acd390.barry@wooz.org>

On Jun 15, 2016, at 11:52 PM, Larry Hastings wrote:

>Well, 3.5.2 hasn't happened yet.  So if you see it as still being broken,
>please speak up now.

In discussion with other Ubuntu developers, several salient points were raised.

The documentation for os.urandom() in 3.5.2rc1 doesn't make sense:

    On Linux, getrandom() syscall is used if available and the urandom entropy
    pool is initialized (getrandom() does not block). On a Unix-like system
    this will query /dev/urandom.

Perhaps better would be:

    Where available, the getrandom() syscall is used (with the GRND_NONBLOCK
    flag) if the urandom entropy pool is initialized.  When getrandom()
    returns EAGAIN because of insufficient entropy, fall back to reading from
    /dev/urandom.  When the getrandom() syscall is unavailable on other
    Unix-like systems, this will query /dev/urandom.

It's actually a rather twisty maze of code to verify these claims, and I'm
nearly certain we don't have any tests to guarantee this is what actually
happens in those cases, so there are many caveats.

This means that an experienced developer can no longer just `man urandom` to
understand the unique operational behavior of os.urandom() on their platform,
but instead would be forced to actually read our code to find out what's
actually happening when/if things break.

It is unacceptable if any new exceptions are raised when insufficient entropy
is available.  Python 3.4 essentially promises that "if only crap entropy is
available, you'll get crap, but at least it won't block and no exceptions are
raised".  Proper backward compatibility requires the same in 3.5 and beyond.
Are we sure that's still the case?

Using the system call *may* be faster in the we-have-good-entropy-case, but it
will definitely be slower in the we-don't-have-good-entropy-case (because of
the fallback logic).  Maybe that doesn't matter in practice but it's worth
noting.

>Why do you call it only "semi-fixed"?  As far as I understand it, the
>semantics of os.urandom() in 3.5.2rc1 are indistinguishable from reading from
>/dev/urandom directly, except it may not need to use a file handle.

Semi-fixed because os.urandom() will still not be strictly backward compatible
between Python 3.5.2 and 3.4.

*If* we can guarantee that os.urandom() will never block or raise an exception
when only poor entropy is available, then it may be indeed indistinguishably
backward compatible for most if not all cases.

Cheers,
-Barry

From stefan at bytereef.org  Thu Jun 16 04:19:53 2016
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 16 Jun 2016 08:19:53 +0000 (UTC)
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
Message-ID: <loom.20160616T100935-276@post.gmane.org>

Nathaniel Smith <njs <at> pobox.com> writes:
> In practice, your proposal means that ~all existing code that uses
> os.urandom becomes incorrect and should be switched to either secrets
> or random. This is *far* more churn for end-users than Nick's
> proposal.

This should only concern code that a) was specifically written for
3.5.0/3.5.1 and b) implements a serious cryptographic application
in Python.

I think b) is not a good idea anyway due to timing and side channel
attacks and the lack of secure wiping of memory. Such applications
should be written in C, where one does not have to predict the
behavior of multiple layers of abstractions.

Stefan Krah

From barry at python.org  Thu Jun 16 04:22:20 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 11:22:20 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAJ3HoZ1eTMoF6jCKD=9RMZ+8Z9q9ZRU4HMBSbhqsTXaD8QU93g@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <57624CA3.3020704@hastings.org>
 <CAJ3HoZ3pioYu0OhGMPGyHf-q_3sqd4fgMSpnfb=PfuK6mLKfJw@mail.gmail.com>
 <CAJ3HoZ27DPHKj=3TyoXxLTWPSq185=izWZKvvcxDYtduR+6Z9A@mail.gmail.com>
 <CAJ3HoZ1eTMoF6jCKD=9RMZ+8Z9q9ZRU4HMBSbhqsTXaD8QU93g@mail.gmail.com>
Message-ID: <20160616112220.681ff6e6.barry@wooz.org>

On Jun 16, 2016, at 07:26 PM, Robert Collins wrote:

>Which is a contract change. Someone testing in E.g. a chroot could have a
>different device on /dev/urandom, and now they will need to intercept
>syscalls for the same effect. Personally I think this is fine, but assuming
>I see Barry's point correctly, it is indeed not the same as it was.

It's true there could be a different device on /dev/urandom, but by my reading
of the getrandom() manpage I think that *should* be transparent since

    By default, getrandom() draws entropy from the /dev/urandom pool.  This
    behavior can be changed via the flags argument.

and we don't pass the GRND_RANDOM flag to getrandom().

Cheers,
-Barry

From barry at python.org  Thu Jun 16 04:33:51 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 11:33:51 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
Message-ID: <20160616113351.7e9f2c3b@python.org>

On Jun 16, 2016, at 12:53 AM, Nathaniel Smith wrote:

>> We have introduced churn.  Predicting a future SO question such as "Can
>> os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier,
>> yes possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of
>> the 3.5.x series, and yes possibly in Python 3.6 and beyond".
>
>It also depends on the kernel version, since it will never block on
>old kernels that are missing getrandom(), but it might block on future
>kernels if Linux's /dev/urandom ever becomes blocking. (Ted's said
>that this is not going to happen now, but the only reason it isn't was
>that he tried to make the change and it broke some distros that are
>still in use -- so it seems entirely possible that it will happen a
>few years from now.)

Right; I noticed this and had it in my copious notes for my follow up but
forgot to mention it.  Thanks!

>This is not an accurate docstring, though. The more accurate docstring
>for your proposed behavior would be:

[...]

>You should never use this function. If you need unguessable random
>bytes, then the 'secrets' module is always a strictly better choice --
>unlike this function, it always uses the best available source of
>cryptographic randomness, even on Linux. Alternatively, if you need
>random bytes but it doesn't matter whether other people can guess
>them, then the 'random' module is always a strictly better choice --
>it will be faster, as well as providing useful features like
>deterministic seeding.

Note that I was talking about 3.5.x, where we don't have the secrets module.

I'd quibble about the admonition about never using the function.  It *can* be
useful if the trade-offs are appropriate for your application (e.g. "almost
always random enough, but maybe not though at least you won't block and you'll
get back something").  Getting the words right is useful, but I agree that we
should be strongly recommending crypto applications use the secrets module in
Python 3.6.

>In practice, your proposal means that ~all existing code that uses os.urandom
>becomes incorrect and should be switched to either secrets or random. This is
>*far* more churn for end-users than Nick's proposal.

I disagree.  We have a clear upgrade path for end-users.  If you're using
os.urandom() in pre-3.5 and understand what you're getting (or not getting as
the case may be), you will continue to get or not get exactly the same bits in
3.5.x (where x >= 2).  No changes to your code are necessary.  This is also
the case in 3.6 but there you can do much better by porting your code to the
new secrets module.  Go do that!

>...Anyway, since there's clearly going to be at least one PEP about this,
>maybe we should stop rehashing bits and pieces of the argument in these long
>threads that most people end up skipping and then rehashing again later?

Sure, I'll try. ;)

Cheers,
-Barry

From larry at hastings.org  Thu Jun 16 04:40:22 2016
From: larry at hastings.org (Larry Hastings)
Date: Thu, 16 Jun 2016 01:40:22 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616110329.37acd390.barry@wooz.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org> <57624CA3.3020704@hastings.org>
 <20160616110329.37acd390.barry@wooz.org>
Message-ID: <576265F6.5020807@hastings.org>

On 06/16/2016 01:03 AM, Barry Warsaw wrote:
> *If* we can guarantee that os.urandom() will never block or raise an exception
> when only poor entropy is available, then it may be indeed indistinguishably
> backward compatible for most if not all cases.

I stepped through the code that shipped in 3.5.2rc1.  It only ever calls 
getrandom() with the GRND_NONBLOCK flag.  If getrandom() returns -1 and 
errno is EAGAIN it falls back to /dev/urandom--I actually simulated this 
condition in gdb and watched it open /dev/urandom.  I didn't see any 
code for raising an exception or blocking when only poor entropy is 
available.
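
The flow described above can be approximated in pure Python. This is only a
sketch, assuming Python 3.6+ on Linux, where os.getrandom() and
os.GRND_NONBLOCK are exposed; the function name urandom_nonblocking is made up
for illustration, and the real logic lives in CPython's C code, which this
does not reproduce line for line.

```python
import os


def urandom_nonblocking(n):
    """Try getrandom() without blocking, else read /dev/urandom (sketch only)."""
    getrandom = getattr(os, "getrandom", None)  # absent on non-Linux builds
    if getrandom is not None:
        try:
            chunks = []
            remaining = n
            while remaining > 0:
                chunk = getrandom(remaining, os.GRND_NONBLOCK)
                chunks.append(chunk)
                remaining -= len(chunk)
            return b"".join(chunks)
        except BlockingIOError:
            pass  # EAGAIN: entropy pool not initialized yet, use the device
        except OSError:
            pass  # e.g. the running kernel lacks the syscall, use the device

    # Fallback path: read from the device node, never raising for "poor" entropy.
    chunks = []
    remaining = n
    with open("/dev/urandom", "rb", buffering=0) as dev:
        while remaining > 0:
            chunk = dev.read(remaining)
            if not chunk:
                raise OSError("unexpected end of file on /dev/urandom")
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)


data = urandom_nonblocking(16)
```

Note that neither branch blocks or raises merely because the pool is
uninitialized, which is the property being asked about.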

As Robert Collins points out, this does change the behavior 
ever-so-slightly from 3.4; if urandom is initialized, and the kernel has 
the getrandom system call, getrandom() will give us the bytes we asked 
for and we won't open and read from /dev/urandom.  In this state 
os.urandom() behaves ever-so-slightly differently:

  * os.urandom() will now work in chroot environments where /dev/urandom
    doesn't exist.
  * If Python runs in a chroot environment with a fake /dev/urandom,
    we'll ignore that and use the kernel's urandom device.
  * If the sysadmin changed what the systemwide /dev/urandom points to,
    we'll ignore that and use the kernel's urandom device.

But os.urandom() is documented as calling getrandom() when available in 
3.5... though doesn't detail how it calls it or what it uses the result 
for.  Anyway, I feel these differences were minor, and covered by the 
documented change in 3.5, so I thought it was reasonable and un-broken.

If this isn't backwards-compatible enough to suit you, please speak up now!

//arry/

From barry at python.org  Thu Jun 16 04:46:10 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 11:46:10 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
Message-ID: <20160616114610.18de2d88.barry@wooz.org>

On Jun 16, 2016, at 12:36 AM, Nathaniel Smith wrote:

>Basically this is a question of whether we should make an (unlikely) error
>totally invisible to the user, and "errors should never pass silently" is
>right there in the Zen of Python :-).

I'd phrase it differently though.  To me, it comes down to hand-holding our
users who for whatever reason, don't use the appropriate APIs for what they're
trying to accomplish.

We can educate them through documentation, but I don't think it's appropriate
to retrofit existing APIs to different behavior based on those faulty
assumptions, because that has other negative effects, such as breaking the
promises we make to experienced and knowledgeable developers.

To me, the better policy is to admit our mistake in 3.5.0 and 3.5.1, restore
pre-existing behavior, accurately document the trade-offs, and provide a
clear, better upgrade path for our users.  We've done this beautifully and
effectively via the secrets module in Python 3.6.

Cheers,
-Barry

From barry at python.org  Thu Jun 16 05:06:38 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 12:06:38 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <576265F6.5020807@hastings.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <57624CA3.3020704@hastings.org>
 <20160616110329.37acd390.barry@wooz.org>
 <576265F6.5020807@hastings.org>
Message-ID: <20160616120638.424be4fe.barry@wooz.org>

On Jun 16, 2016, at 01:40 AM, Larry Hastings wrote:

>As Robert Collins points out, this does change the behavior ever-so-slightly
>from 3.4;

Ah yes, I misunderstood Robert's point.

>if urandom is initialized, and the kernel has the getrandom system call,
>getrandom() will give us the bytes we asked for and we won't open and read
>from /dev/urandom.  In this state os.urandom() behaves ever-so-slightly
>differently:
>
>  * os.urandom() will now work in chroot environments where /dev/urandom
>    doesn't exist.
>  * If Python runs in a chroot environment with a fake /dev/urandom,
>    we'll ignore that and use the kernel's urandom device.
>  * If the sysadmin changed what the systemwide /dev/urandom points to,
>    we'll ignore that and use the kernel's urandom device.
>
>But os.urandom() is documented as calling getrandom() when available in
>3.5... though doesn't detail how it calls it or what it uses the result for.
>Anyway, I feel these differences were minor, and covered by the documented
>change in 3.5, so I thought it was reasonable and un-broken.
>
>If this isn't backwards-compatible enough to suit you, please speak up now!

It does seem like a narrow corner case, which of course means *someone* will
be affected by it <wink>.

I'll leave it up to you, though it should at least be clearly documented.
Let's hope the googles will also help our hypothetical future head-scratcher.

Cheers,
-Barry

From stephen at xemacs.org  Thu Jun 16 05:32:23 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 16 Jun 2016 18:32:23 +0900
Subject: [Python-Dev] Smoothing the transition from Python 2 to 3
In-Reply-To: <CADiSq7d-4UC8GgO+NW4y5AsPH_BN_UMSySKh3RPa_7LxFWfO0w@mail.gmail.com>
References: <20160608210133.GA4318@python.ca>
 <CADiSq7cHDOgCMtY0mporvsz2ngyWvtpskpoqdeVn8Sitr+5qeg@mail.gmail.com>
 <CAP1=2W72m=Gnu19H3=2psxTw=rj1JPY6tdv_A2gkbiq+fV9LFg@mail.gmail.com>
 <20160609230807.GA8118@python.ca>
 <CAFSbXtMyTQdxSqa6Kf-FDBO2DkdjULEovW97a5QZaz6tNkWuEg@mail.gmail.com>
 <CADiSq7f7O+fXK7Ci00FwBhWwfiwN_VOwev+Ju0R3VRy56CK4UQ@mail.gmail.com>
 <njfito$uhf$1@ger.gmane.org>
 <CADiSq7d-4UC8GgO+NW4y5AsPH_BN_UMSySKh3RPa_7LxFWfO0w@mail.gmail.com>
Message-ID: <22370.29223.878866.723213@turnbull.sk.tsukuba.ac.jp>

Nick Coghlan writes:

 > - even if there is a test suite, sufficiently pervasive [str/bytes]
 > type ambiguity may make it difficult to use for fault isolation

Difficult yes, but I would argue that that difficulty is inherent[1].
Ie, if it's pervasive, the fault should be isolated to the whole
module.  Such a fault *will* regress, often in the exact same place,
but if not there, elsewhere due to the same ambiguity.  That was my
experience in both GNU Emacs and Mailman.  In GNU Emacs's case there's
a paired, much more successful (in respect of encoding problems)
experience with XEmacs to compare.[2]  We'll see how things go in
Mailman 3 (which uses a nearly completely rewritten email package),
but I'll bet the experience there is even more successful.[3]

If you're looking for a band-aid that will get you back running asap,
then you're better off bisecting the change history than going through
a slew of warnings one-by-one, as a recent error is likely due to a
recent change.

If Neil still wants to go ahead, more power to him.  I don't know
everything.  It's just that my experience in this area is sufficiently
extensive and sufficiently bad that it's worth repeating (just this
once!)

Footnotes: 
[1]  Or as Brooks would have said, "of the essence".

[2]  GNU Emacs has a multilingualization specialist in Ken Handa whose
day job is writing multilingualization libraries, so their encoding
detection, accuracy of implementation, and codec coverage are and
always were better than XEmacs's.  I'm referring here to internal bugs
in the Lisp primitives dealing with text, as well as the difficulty of
writing applications that handled both internal text and external
bytes without confusing them.

[3]  Though not strictly comparable to the XEmacs experience, due to
(1) being a second implementation, not a parallel implementation, and
(2) the Internet environment being much more standard conformant, even
in email, these days.

From donald at stufft.io  Thu Jun 16 06:04:39 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 16 Jun 2016 06:04:39 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616114610.18de2d88.barry@wooz.org>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
Message-ID: <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>

> On Jun 16, 2016, at 4:46 AM, Barry Warsaw <barry at python.org> wrote:
> 
> We can educate them through documentation, but I don't think it's appropriate
> to retrofit existing APIs to different behavior based on those faulty
> assumptions, because that has other negative effects, such as breaking the
> promises we make to experienced and knowledgeable developers.

You can't document your way out of a usability problem, in the same way that while it was true that urllib was *documented* to not verify certificates by default, that didn't matter because a large set of users used it like it did anyways.

In my opinion, this is a usability issue as well. You have a ton of third party documentation and effort around "just use urandom" for Cryptographic random which is generally the right (and best!) answer except for this one little niggle on a Linux platform where /dev/urandom *may* produce predictable bytes (but usually doesn't). That documentation typically doesn't go into telling people this small niggle because prior to getrandom(0) there wasn't much they could do about it except use /dev/random which is bad in every other situation but early boot cryptographic keys.

Regardless of what we document it as, people are going to use os.urandom for cryptographic purposes because for everyone who doesn't keep up on exactly what modules are being added to Python who has any idea about cryptography at all is going to look for a Python interface to urandom. That doesn't even begin to touch the thousands upon thousands of uses that already exist in the wild that are assuming that os.urandom will always give them cryptographic random, who now *need* to write this as:

try:
    from secrets import token_bytes
except ImportError:
    from os import urandom as token_bytes

In order to get the best cryptographic random available to them on their system, which assumes they're even going to notice at all that there's a new secrets module, and requires each and every use of os.urandom to change.
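
As a rough illustration of what every call site would then look like, here is a sketch built on the fallback import above. The helper name new_session_key and the 32-byte default are made up for the example.

```python
try:
    from secrets import token_bytes  # Python 3.6+
except ImportError:
    from os import urandom as token_bytes  # older Pythons


def new_session_key(nbytes=32):
    # Always pass an explicit size: secrets.token_bytes() has a default
    # length, but os.urandom() requires the argument.
    return token_bytes(nbytes)


key = new_session_key()
```

The explicit size matters because the two functions do not share a signature, so the shim is only a drop-in replacement when it is called with one positional argument.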

Honestly, I think that the first sentence in the documentation should most obviously be the most pertinent one, and the first sentence here is "Return a string of n random bytes suitable for cryptographic use.?. The bit about how the exact quality depends on the OS and documenting what device it uses is, to my eyes, obviously a hedge to say that ?Hey, if this gives you bad random it?s your OSs fault not ours, we can?t produce good random where your OS can?t give us some? and to give people a suggestion of where to look to determine if they?re going to get good random or not.

I do not think "uses /dev/urandom" is, or should be considered, a core part of this API; it already doesn't use /dev/urandom on Windows where it doesn't exist, nor does it use /dev/urandom in 3.5+ if it can help it. Using getrandom(0) or using getrandom(GRND_NONBLOCK) and raising an exception on EAGAIN is still accessing the urandom CSPRNG with the same general runtime characteristics of /dev/urandom outside of cases where it's not safe to actually use /dev/urandom.
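To make the two options concrete, here is a rough sketch of what "getrandom(0)" versus "getrandom(GRND_NONBLOCK) and raise on EAGAIN" could look like from Python. It assumes os.getrandom() is available (Python 3.6+ on a Linux kernel with getrandom(2)); the names secure_random_bytes and EntropyNotReady are made up for illustration and are not part of any proposal.

```python
import os


class EntropyNotReady(RuntimeError):
    """Hypothetical error: the kernel CSPRNG has not been seeded yet."""


def secure_random_bytes(n, block=True):
    """Return n bytes from the kernel CSPRNG, never from an unseeded pool.

    block=True  -> getrandom(0): wait until the pool is initialized.
    block=False -> getrandom(GRND_NONBLOCK): raise instead of waiting.
    """
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # No getrandom(2) wrapper on this platform: fall back to
        # whatever os.urandom provides there.
        return os.urandom(n)

    flags = 0 if block else os.GRND_NONBLOCK
    out = bytearray()
    while len(out) < n:
        try:
            # getrandom may return fewer bytes than requested, so loop.
            out += getrandom(n - len(out), flags)
        except BlockingIOError as exc:
            # EAGAIN: the pool is not initialized and we asked not to block.
            raise EntropyNotReady("kernel entropy pool is not initialized") from exc
    return bytes(out)


token = secure_random_bytes(32, block=False)
```

Either way the caller never silently receives bytes from an uninitialized pool; the only difference is whether the early-boot case shows up as a wait or as an exception.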

Frankly, I think it's a disservice to Python developers to leave in this footgun.

--
Donald Stufft

From barry at python.org  Thu Jun 16 07:07:56 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 14:07:56 +0300
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
Message-ID: <20160616140756.62be6e25@python.org>

On Jun 16, 2016, at 06:04 AM, Donald Stufft wrote:

>Regardless of what we document it as, people are going to use os.urandom for
>cryptographic purposes because for everyone who doesn't keep up on exactly
>what modules are being added to Python who has any idea about cryptography at
>all is going to look for a Python interface to urandom. That doesn't even
>begin to touch the thousands upon thousands of uses that already exist in the
>wild that are assuming that os.urandom will always give them cryptographic
>random, who now *need* to write this as:

[...]

>Frankly, I think it's a disservice to Python developers to leave in this
>footgun.

This really gets to the core of our responsibility to our users.  Let's start
by acknowledging that good-willed people can have different opinions on this,
and that we all want to do what's best for our users, although we may have
different definitions of "what's best".

Since this topic comes up over and over again, it's worth exploring in more
detail.  Here's my take on it in this context.

We have a responsibility to provide stable, well-documented, obvious APIs to
our users to provide functionality that is useful and appropriate to the best
of our abilities.

We have a responsibility to provide secure implementations of that
functionality wherever possible.

It's in the conflict between these two responsibilities that these heated
discussions and differences of opinions come up.  This conflict is exposed in
the os.urandom() debate because the first responsibility informs us that
backward compatibility is more important to maintain because it provides
stability and predictability.  The second responsibility urges us to favor
retrofitting increased security into APIs that for practicality purposes are
being used counter to our original intent.

It's not that you think backward compatibility is unimportant, or that I think
improving security has no value.  In the messy mudpit of the middle, we can't
seem to have both, as much as I'd argue that providing new, better APIs can
give us edible cake.

Coming down on either side has its consequences, both known and unintended,
and I think in these cases consensus can't be reached.  It's for these reasons
that we have RMs and BDFLs to break the tie.  We must lay out our arguments
and trust our Larrys, Neds, and Guidos to make the right --or at least *a*--
decision on a case-by-case basis, and if not agree then accept.

Cheers,
-Barry

From rdmurray at bitdance.com  Thu Jun 16 07:08:59 2016
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 16 Jun 2016 07:08:59 -0400
Subject: [Python-Dev] Why does base64 return bytes?
In-Reply-To: <57609869.4060304@canterbury.ac.nz>
References: <20160614151935.GY27919@ando.pearwood.info>
 <CAH0mxTQP3i_bn-Mx9a7BzPA_GnFSAvYqo79QK9FWef5DHAMJZg@mail.gmail.com>
 <CABVPEKpXPvA5fwOGiVOv_SgodegOyA7A1pzBT9K+GeqYJdER+w@mail.gmail.com>
 <CAH0mxTQ08jfr2kYHd2oyJYz2cZjL=iy5FdjnuyAFhzkCrQiLJw@mail.gmail.com>
 <20160614180556.9A1C0B1401C@webabinitio.net>
 <57609869.4060304@canterbury.ac.nz>
Message-ID: <20160616110900.653EAB14028@webabinitio.net>

On Wed, 15 Jun 2016 11:51:05 +1200, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> R. David Murray wrote:
> > The fundamental purpose of the base64 encoding is to take a series
> > of arbitrary bytes and reversibly turn them into another series of
> > bytes in which the eighth bit is not significant.
> 
> No, it's not. If that were its only purpose, it would be
> called base128, and the RFC would describe it purely in
> terms of bit patterns and not mention characters or
> character sets at all.

Sorry, you are correct.  IMO it is to encode it to a representation
that consists of a limited subset of printable (makes marks on paper or
screen) characters (which is an imprecise term); i.e., data that will not
be interpreted as having control information by most programs processing
the data stream as either human-readable or raw bytes.

The rest of the argument still applies, specifically the part about
wire encoding to seven bit bytes being the currently-most-used[*] and
backward-compatible use case.  And I say this despite the fact that the
email package currently handles everything as surrogate-escaped text
and so does in fact decode the output of base64.encode to ASCII and
only later re-encodes it.  That's a design issue in the email package
deriving from the fact that bytes and string used to be the same thing
in Python 2.  It might some day get corrected, but probably won't be, and
it is a legacy of *not* making the distinction between bytes and string.
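As a small illustration of the point (a sketch, not a statement about how the email package is implemented): base64.b64encode takes arbitrary bytes and returns bytes whose values all fit in seven bits, and a caller who wants text has to decode that result to ASCII explicitly.

```python
import base64

raw = bytes([0x00, 0xFF, 0x10, 0x80])  # arbitrary bytes, some with the high bit set

encoded = base64.b64encode(raw)        # bytes in, bytes out
seven_bit_clean = all(byte < 0x80 for byte in encoded)

as_text = encoded.decode("ascii")      # the extra step a text-based caller takes
round_tripped = base64.b64decode(as_text)
```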

--David

[*] Yes this is changing, I already said that :)

From nikita at nemkin.ru  Thu Jun 16 07:11:33 2016
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Thu, 16 Jun 2016 16:11:33 +0500
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
Message-ID: <CANawmycMpZc+SBnqqTgwfXffnAU111MC5Ax1D79-o17WVZXNUA@mail.gmail.com>

I'll reformulate my argument:

Ordered class namespaces are a minority use case that's already covered
by existing language features (custom metaclasses) and doesn't warrant
the extension of the language (i.e. making OrderedDict a builtin type).
This is about Python-the-Language, not CPython-the-runtime.

If you disagree with this premise, there's no point arguing about
the alternatives. That being said, below are the answers to your objections
to specific alternatives.

On Thu, Jun 16, 2016 at 1:30 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 14 June 2016 at 02:41, Nikita Nemkin <nikita at nemkin.ru> wrote:
>
> Adding metaclasses to an existing class can break compatibility with
> third party subclasses, so making it possible for people to avoid that
> while still gaining the ability to implicitly expose attribute
> ordering to class decorators and other potentially interested parties
> is a recurring theme behind this PEP and also PEPs 422 and 487.

The simple answer is "don't do that", i.e. don't pile an ordered metaclass
on top of another metaclass. Such a use case is hypothetical anyway.

Also, a namespace argument to the default metaclass doesn't cause conflicts.

>> 3. Making compiler fill in __definition_order__ for every class
>>     (just like __qualname__) without touching the runtime.
>> ...
>
> Class scopes support conditionals and loops, so we can't necessarily
> be sure what names will be assigned without running the code. It's
> also possible to make attribute assignments via locals() that are
> entirely opaque to the compiler, but visible to the interpreter at
> runtime.

All explicit assignments in the class body can be detected statically.
Implicit assignments via locals(), sys._getframe() etc. can't be detected,
BUT they are unlikely to have a meaningful order!
It's reasonable to exclude them from __definition_order__.

This also applies to documentation tools. If there really was a need,
they could have easily extracted static order, solving 99.9999% of
the problem.
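For what it's worth, a documentation tool could extract the static order along these lines. This is only a sketch using the stdlib ast module; static_definition_order is a made-up helper name, and it deliberately looks only at top-level statements of the class body, so names bound inside conditionals or loops, or via locals(), are not reported.

```python
import ast


def static_definition_order(source, class_name):
    """Return names bound by top-level statements of a class body, in order."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            names = []
            for stmt in node.body:
                if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                    targets = [stmt.name]
                elif isinstance(stmt, ast.Assign):
                    # Only plain names; locals()["x"] = ... is a Subscript target.
                    targets = [t.id for t in stmt.targets if isinstance(t, ast.Name)]
                elif (isinstance(stmt, ast.AnnAssign)
                      and isinstance(stmt.target, ast.Name)
                      and stmt.value is not None):
                    targets = [stmt.target.id]
                else:
                    targets = []
                for name in targets:
                    if name not in names:
                        names.append(name)
            return tuple(names)
    raise LookupError(class_name)


SAMPLE = '''
class Config:
    host = "localhost"
    port = 8080
    def connect(self):
        pass
    locals()["hidden"] = 1
'''

order = static_definition_order(SAMPLE, "Config")
```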

> The rationale for "Why not make this configurable, rather than
> switching it unilaterally?" is that it's actually *simpler* overall to
> just make it the default - we can then change the documentation to say
> "class bodies are evaluated in a collections.OrderedDict instance by
> default" and record the consequences of that, rather than having to
> document yet another class customisation mechanism.

It would have been a "simpler" default if it was the core dict that
became ordered. Instead, it brings in a 3rd party (OrderedDict).

Documenting an extra metaclass or an extra type kwarg would hardly
take more space. And it's NOT yet another mechanism. It's the good old
metaclass mechanism.
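To show what I mean by the good old metaclass mechanism, here is a minimal sketch. OrderedMeta and Point are illustrative names only, and __definition_order__ is set here as an ordinary class attribute by the metaclass, not by the interpreter.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Illustrative metaclass that records the order of class-body names."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed in this mapping, so insertion order
        # is the definition order.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Skip implicit dunder entries such as __module__ and __qualname__.
        cls.__definition_order__ = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Point(metaclass=OrderedMeta):
    x = 0
    y = 0

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
```

A class decorator can then read Point.__definition_order__ without any change to the language, at the cost of the class author having to opt in.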

> It also eliminates boilerplate from class decorator usage
> instructions, where people have to write "to use this class decorator,
> you must also specify 'namespace=collections.OrderedDict' in your
> class header"

Statically inferred __definition_order__ would work here.
Order-dependent decorators don't seem to be important enough
to worry about their usability.

From donald at stufft.io  Thu Jun 16 07:34:47 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 16 Jun 2016 07:34:47 -0400
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <20160616140756.62be6e25@python.org>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
Message-ID: <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>

> On Jun 16, 2016, at 7:07 AM, Barry Warsaw <barry at python.org> wrote:
> 
> On Jun 16, 2016, at 06:04 AM, Donald Stufft wrote:
> 
>> Regardless of what we document it as, people are going to use os.urandom for
>> cryptographic purposes because for everyone who doesn't keep up on exactly
>> what modules are being added to Python who has any idea about cryptography at
>> all is going to look for a Python interface to urandom. That doesn't even
>> begin to touch the thousands upon thousands of uses that already exist in the
>> wild that are assuming that os.urandom will always give them cryptographic
>> random, who now *need* to write this as:
> 
> [...]
> 
>> Frankly, I think it's a disservice to Python developers to leave in this
>> footgun.
> 
> This really gets to the core of our responsibility to our users.  Let's start
> by acknowledging that good-willed people can have different opinions on this,
> and that we all want to do what's best for our users, although we may have
> different definitions of "what's best".

Yes, I don't think anyone is being malicious :) that's why I qualified my
statement with "I think", because I don't believe that whether or not this
particular choice is a disservice is a fundamental property of the universe,
but rather my opinion influenced by my priorities.

> 
> Since this topic comes up over and over again, it's worth exploring in more
> detail.  Here's my take on it in this context.
> 
> We have a responsibility to provide stable, well-documented, obvious APIs to
> our users to provide functionality that is useful and appropriate to the best
> of our abilities.
> 
> We have a responsibility to provide secure implementations of that
> functionality wherever possible.
> 
> It's in the conflict between these two responsibilities that these heated
> discussions and differences of opinions come up.  This conflict is exposed in
> the os.urandom() debate because the first responsibility informs us that
> backward compatibility is more important to maintain because it provides
> stability and predictability.  The second responsibility urges us to favor
> retrofitting increased security into APIs that for practicality purposes are
> being used counter to our original intent.

Well, I don't think that for os.urandom someone using it for security is running
"counter to its original intent", given that in general urandom's purpose is for
cryptographic random. Someone *may* be using it for something other than that,
but it's pretty explicitly there for security sensitive applications.

> 
> It's not that you think backward compatibility is unimportant, or that I think
> improving security has no value.  In the messy mudpit of the middle, we can't
> seem to have both, as much as I'd argue that providing new, better APIs can
> give us edible cake.

Right. I personally often fall towards securing the *existing* APIs and adding
new, insecure APIs that are obviously so in cases where we can reasonably do
that. That's largely because given an API that's being used both in security
sensitive applications and in ones that are not, the "failure" to be properly
secure is almost always a silent failure, while the "failure" to applications
that don't need that security is almost always obvious and immediate.

Taking os.urandom as an example, the failure case here for the security side is
that you get some bytes that are, to some degree, predictable. There is nobody
alive who can look at some bytes and go "oh yep, those bytes are predictable
we're using the wrong API", thus basically anyone "incorrectly" [1] using this
API for security sensitive applications is going to have it just silently doing
the wrong thing.

On the flip side, if someone is using this API and what they care about is it
not blocking, ever, and always giving them some sort of random-ish number no
matter how predictable it is, then both of the proposed failure cases are fairly
noticeable (to varying degrees), either it blocks long enough for it to matter
for those people and they notice and dig in, or it raises an exception and they
notice and dig in. In both cases they get some indication that something is
wrong.

> 
> Coming down on either side has its consequences, both known and unintended,
> and I think in these cases consensus can't be reached.  It's for these reasons
> that we have RMs and BDFLs to break the tie.  We must lay out our arguments
> and trust our Larrys, Neds, and Guidos to make the right --or at least *a*--
> decision on a case-by-case basis, and if not agree then accept.

Right. I've personally tried not to be the one who keeps pushing
for this even after a decree, partially because it's draining to me to argue
for the security side with python-dev [2] and partially because it was ruled
on and I lost. However, if there continues to be discussion I'll continue to
advocate for what I think is right :)

[1] I don't think using os.urandom is incorrect to use for security sensitive
    applications and I think it's a losing battle for Python to try and fight
    the rest of the world that urandom is not the right answer here.

[2] python-dev tends to favor not breaking "working" code over securing existing
    APIs, even if "working" is silently doing the wrong thing in a security
    context. This is particularly frustrating when it comes to security because
    security is by its nature the act of taking code that would otherwise
    execute and making it error, ideally only in bad situations, but this
    "security's purpose is to make things break" nature clashes with python-dev's
    default of not breaking "working" code in a way that is personally draining
    to me.

--
Donald Stufft

From cory at lukasa.co.uk  Thu Jun 16 07:58:29 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 16 Jun 2016 12:58:29 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <loom.20160616T100935-276@post.gmane.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
Message-ID: <63F9DC8F-7AB2-4D1E-8DC6-771D41955DA5@lukasa.co.uk>

> On 16 Jun 2016, at 09:19, Stefan Krah <stefan at bytereef.org> wrote:
> 
> This should only concern code that a) was specifically written for
> 3.5.0/3.5.1 and b) implements a serious cryptographic application
> in Python.
> 
> I think b) is not a good idea anyway due to timing and side channel
> attacks and the lack of secure wiping of memory. Such applications
> should be written in C, where one does not have to predict the
> behavior of multiple layers of abstractions.

No, it concerns code that generates its random numbers from Python. For example, you may want to use AES GCM to encrypt a file at rest. AES GCM requires the use of a nonce, and has only one rule about this nonce: you MUST NOT, under any circumstances, re-use a nonce/key combination. If you do, AES GCM fails catastrophically (I cannot emphasise this enough, re-using a nonce/key combination in AES GCM totally destroys all the properties the algorithm provides)[0].

You can use a C implementation of all of the AES logic, including offload to your x86 CPU with its fancy AES GCM instruction set. However, you *need* to provide a nonce: AES GCM can't magically guess what it is, and it needs to be communicated in some way for the decryption[1]. In situations where you do not have an easily available nonce (you do have it for TLS, for example), you will need to provide one, and the logical and obvious thing to do is to use a random number. Your Python application needs to obtain that random number, and the safest way to do it is via os.urandom().
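A rough sketch of the Python side of that, under some assumptions: the 12-byte (96-bit) nonce length is the one commonly used with AES GCM, the helper names are invented for illustration, and the ciphertext below is just stand-in bytes for whatever a C implementation of AES GCM would return. The only parts that matter are that the nonce comes from os.urandom() and that it travels with the ciphertext so decryption can recover it.

```python
import os

NONCE_SIZE = 12  # 96 bits, the nonce length commonly used with AES GCM


def new_nonce():
    """Draw a fresh random nonce from the kernel CSPRNG."""
    return os.urandom(NONCE_SIZE)


def pack_message(nonce, ciphertext):
    """Store the nonce in front of the ciphertext so decryption can find it."""
    if len(nonce) != NONCE_SIZE:
        raise ValueError("unexpected nonce length")
    return nonce + ciphertext


def unpack_message(blob):
    """Split a stored blob back into (nonce, ciphertext)."""
    if len(blob) < NONCE_SIZE:
        raise ValueError("blob too short to contain a nonce")
    return blob[:NONCE_SIZE], blob[NONCE_SIZE:]


nonce = new_nonce()
fake_ciphertext = b"output of some AES GCM implementation"  # placeholder only
blob = pack_message(nonce, fake_ciphertext)
```

If os.urandom() hands back predictable bytes here, two files encrypted under the same key can end up with the same nonce, which is exactly the catastrophic case described above.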

This is the problem with this argument: we cannot wave our hands and say "os.urandom can be as unsafe as we want because crypto code must not be written in Python". Even if we never implement an algorithm in Python (and I agree with you that crypto primitives in general should not be implemented in Python for the exact reasons you suggest), most algorithms require the ability to be provided with good random numbers by their callers. As long as crypto algorithms require good nonces, Python needs access to a secure CSPRNG. Kernel CSPRNGs are *strongly* favoured for many reasons that I won't go into here, so os.urandom is our winner.

python-dev cannot wash its hands of the security decision here. As I've said many times, I'm pleased to see the decision makers have not done that: while I don't agree with their decision, I totally respect that it was theirs to make, and they made it with all of the facts.

Cory

[0]: Someone will *inevitably* point out that other algorithms resist nonce misuse somewhat better than this. While that's true, it's a) not relevant, because some standards require use of the non-NMR algorithms, and b) unhelpful, because even if we could switch, we'd need access to the better primitives, which we don't have.

[1]: Again, to head off some questions at the pass: the reason nonces are usually provided by the user of the algorithm is that sometimes they're generated semi-deterministically. For example, TLS generates a unique key for each session (again, requiring randomness, but that's neither here nor there), and so TLS can use deterministic *but non-repeated* nonces, which in practice it derives from record numbers. Because you have two options (re-use keys with random nonces, or random keys with deterministic nonces), a generic algorithm implementation does not constrain your choice of nonce.
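For the deterministic option, a sketch of the general idea might look like the following. It is only loosely modelled on per-record nonce construction in TLS (the exact derivation differs between TLS versions), and record_nonce and static_iv are illustrative names: a per-session secret IV is combined with a record counter, so nonces never repeat under one key as long as the counter never does.

```python
def record_nonce(static_iv, record_number):
    """Derive a per-record nonce by XORing a counter into a per-session IV."""
    if len(static_iv) < 8:
        raise ValueError("static_iv must be at least 8 bytes")
    if not 0 <= record_number < 2 ** 64:
        raise ValueError("record_number out of range")
    # Big-endian counter, left-padded with zeros to the IV length.
    padded = record_number.to_bytes(len(static_iv), "big")
    return bytes(a ^ b for a, b in zip(static_iv, padded))


session_iv = bytes(range(12))  # placeholder; a real one is derived per session
nonces = [record_nonce(session_iv, n) for n in range(1000)]
```

Note that the randomness requirement has not gone away in this scheme; it has moved into generating the session key and IV.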


From barry at python.org  Thu Jun 16 08:24:33 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 15:24:33 +0300
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
Message-ID: <20160616152433.70b2d9f7@python.org>

On Jun 16, 2016, at 07:34 AM, Donald Stufft wrote:

>Well, I don't think that for os.urandom someone using it for security is
>running "counter to its original intent", given that in general urandom's
>purpose is for cryptographic random. Someone *may* be using it for something
>other than that, but it's pretty explicitly there for security sensitive
>applications.

Except that I disagree.  I think os.urandom's original intent, as documented
in Python 3.4, is to provide a thin layer over /dev/urandom, with all that
implies, and with the documented quality caveats.  I know as a Linux developer
that if I need to know the details of that, I can `man urandom` and read the
gory details.  In Python 3.5, I can't do that any more.

>Right. I personally often fall towards securing the *existing* APIs and
>adding new, insecure APIs that are obviously so in cases where we can
>reasonably do that.

Sure, and I personally fall on the side of maintaining stable, backward
compatible APIs, adding new, better, more secure APIs to address deficiencies
in real-world use cases.  That's because when we break APIs, even with the
best of intentions, it breaks people's code in ways and places that we can't
predict, and which are very often very difficult to discover.

I guess it all comes down to who's yelling at you. ;)

Cheers,
-Barry

P.S. These discussions do not always end in despair.  Witness PEP 493.

From tytso at mit.edu  Thu Jun 16 08:44:01 2016
From: tytso at mit.edu (Theodore Ts'o)
Date: Thu, 16 Jun 2016 08:44:01 -0400
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <20160616152433.70b2d9f7@python.org>
References: <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
 <20160616152433.70b2d9f7@python.org>
Message-ID: <20160616124401.GC32689@thunk.org>

On Thu, Jun 16, 2016 at 03:24:33PM +0300, Barry Warsaw wrote:
> Except that I disagree.  I think os.urandom's original intent, as documented
> in Python 3.4, is to provide a thin layer over /dev/urandom, with all that
> implies, and with the documented quality caveats.  I know as a Linux developer
> that if I need to know the details of that, I can `man urandom` and read the
> gory details.  In Python 3.5, I can't do that any more.

If Python were to document os.urandom as providing a thin wrapper over
/dev/urandom as implemented on Linux, and also document os.getrandom
as providing a thin wrapper over getrandom(2) as implemented on Linux,
and then say that the best emulation of those two interfaces will be
provided on other operating systems, and that today the best
practice is to call getrandom with the flags set to zero (or defaulted
out), that would certainly make me very happy.

I could imagine that some people might complain that it is too
Linux-centric, or it is not adhering to Python's design principles,
but it makes a lot of sense to me as a Linux person.  :-)

Cheers,

						- Ted

From p.f.moore at gmail.com  Thu Jun 16 08:50:54 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 16 Jun 2016 13:50:54 +0100
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
Message-ID: <CACac1F9rmtK_nGgDntHOxGp-xBp_fVoC0KcNv4TJ5rqTwhYiAQ@mail.gmail.com>

On 16 June 2016 at 12:34, Donald Stufft <donald at stufft.io> wrote:
> [1] I don't think using os.urandom is incorrect to use for security sensitive
>     applications and I think it's a losing battle for Python to try and fight
>     the rest of the world that urandom is not the right answer here.
>
> [2] python-dev tends to favor not breaking "working" code over securing existing
>     APIs, even if "working" is silently doing the wrong thing in a security
>     context. This is particularly frustrating when it comes to security because
>     security is by its nature the act of taking code that would otherwise
>     execute and making it error, ideally only in bad situations, but this
>     "security's purpose is to make things break" nature clashes with python-dev's
>     default of not breaking "working" code in a way that is personally draining
>     to me.

Should I take it from these two statements that you do not believe
that providing *new* APIs that provide better security compared to a
backward compatible but flawed existing implementation is a reasonable
approach? And specifically that you don't agree with the decision to
provide the new "secrets" module as the recommended interface for
getting secure random numbers from Python?

One of the aspects of this debate that I'm unclear about is what role
the people arguing that os.urandom must change see for the new secrets
module.

Paul

From stefan at bytereef.org  Thu Jun 16 08:57:54 2016
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 16 Jun 2016 12:57:54 +0000 (UTC)
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <63F9DC8F-7AB2-4D1E-8DC6-771D41955DA5@lukasa.co.uk>
Message-ID: <loom.20160616T144202-546@post.gmane.org>

Cory Benfield <cory <at> lukasa.co.uk> writes:
> python-dev cannot wash its hands of the security decision here. As I've
> said many times, I'm pleased to see the decision makers have not done
> that: while I don't agree with their decision, I totally respect that it
> was theirs to make, and they made it with all of the facts.

I think the sysadmin's responsibility still plays a major role here.
If a Linux system crucially relies on the quality of /dev/urandom, it
should be possible to insert a small C program (call it ensure_random)
into the boot sequence that does *exactly* what Python did in the
bug report: block until entropy is available.

Well, it *was* possible with SysVinit ... :)

Python is not the only application that needs a secure /dev/urandom.

Stefan Krah

From donald at stufft.io  Thu Jun 16 09:02:26 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 16 Jun 2016 09:02:26 -0400
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <CACac1F9rmtK_nGgDntHOxGp-xBp_fVoC0KcNv4TJ5rqTwhYiAQ@mail.gmail.com>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
 <CACac1F9rmtK_nGgDntHOxGp-xBp_fVoC0KcNv4TJ5rqTwhYiAQ@mail.gmail.com>
Message-ID: <82F8D8A2-57D0-4973-9B08-05D04E0D7629@stufft.io>

> On Jun 16, 2016, at 8:50 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> On 16 June 2016 at 12:34, Donald Stufft <donald at stufft.io> wrote:
>> [1] I don't think using os.urandom is incorrect to use for security sensitive
>>    applications and I think it's a losing battle for Python to try and fight
>>    the rest of the world that urandom is not the right answer here.
>>
>> [2] python-dev tends to favor not breaking "working" code over securing existing
>>    APIs, even if "working" is silently doing the wrong thing in a security
>>    context. This is particularly frustrating when it comes to security because
>>    security is by its nature the act of taking code that would otherwise
>>    execute and making it error, ideally only in bad situations, but this
>>    "security's purpose is to make things break" nature clashes with python-dev's
>>    default of not breaking "working" code in a way that is personally draining
>>    to me.
> 
> Should I take it from these two statements that you do not believe
> that providing *new* APIs that provide better security compared to a
> backward compatible but flawed existing implementation is a reasonable
> approach? And specifically that you don't agree with the decision to
> provide the new "secrets" module as the recommended interface for
> getting secure random numbers from Python?
> 
> One of the aspects of this debate that I'm unclear about is what role
> the people arguing that os.urandom must change see for the new secrets
> module.
> 
> Paul

I think the new secrets module is great, particularly for functions
other than secrets.token_bytes. If that's all the secrets module was
then I'd argue it shouldn't exist because we already have os.urandom.
IOW I think it solves a different problem than os.urandom. If all you
need is cryptographically random bytes, I think that os.urandom is the
most obvious thing that someone will reach for given:

* Pages upon pages of documentation both inside the Python community
  and outside saying "use urandom".
* The sheer bulk of existing code that is already out there using
  os.urandom for its cryptographic properties.

I also think it's a great module for providing defaults that we can't
provide in os.urandom, like the number of bytes that are considered
"secure" [1].
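
A small sketch of the "defaults" point above, assuming Python 3.6+ where
the secrets module exists (the variable names and the 32 are only for
illustration):

```python
import os
import secrets

# With os.urandom the caller has to pick the size.
key_from_os = os.urandom(32)

# With secrets the size can be left to the module's own default.
key_from_secrets = secrets.token_bytes()

# secrets also covers needs beyond raw bytes, e.g. URL-safe text tokens.
url_token = secrets.token_urlsafe()
```

Both calls draw from the same OS-provided CSPRNG; the difference is only
in who decides how many bytes are enough.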

What I don't think is that the secrets module means that all of a sudden
os.urandom is no longer an API that is primarily used in a security
sensitive context [2] and thus we should willfully choose to use a subpar
interface to the same CSPRNG when the OS provides us a better one [3]
because one small edge case *might* break in a loud and obvious way for
the minority of people using this API in a non security sensitive context
while leaving the majority of people using this API possibly getting
silently insecure behavior from it.

[1] Of course, what is considered secure is going to be application
    dependent, but secrets can give a pretty good approximation for
    the general case.

[2] This is one of the things that really gets me about this, it's not
    like folks on my side are saying we need to break the pickle module
    because it's possible to use it insecurely. That would be silly
    because one of the primary use cases for that module is using it in
    a context that is not security sensitive. However, os.urandom is, to
    the best of my ability to determine and reason, almost always used in
    a security sensitive context, and thus should make security sensitive
    trade offs in its API.

[3] Thus it's still a small wrapper around OS provided APIs, so we're not
    asking for os.py to implement some great big functionality, we're just
    asking for it to provide a thin shim over a better interface to the
    same thing.

--
Donald Stufft

From random832 at fastmail.com  Thu Jun 16 09:51:52 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 16 Jun 2016 09:51:52 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616110329.37acd390.barry@wooz.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org> <57624CA3.3020704@hastings.org>
 <20160616110329.37acd390.barry@wooz.org>
Message-ID: <1466085112.1516432.639628113.069A20D4@webmail.messagingengine.com>

On Thu, Jun 16, 2016, at 04:03, Barry Warsaw wrote:
> *If* we can guarantee that os.urandom() will never block or raise an
> exception when only poor entropy is available, then it may be indeed
> indistinguishably backward compatible for most if not all cases.

Why can't we exclude cases when only poor entropy is available from
"most if not all cases"?

From barry at python.org  Thu Jun 16 10:04:43 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 17:04:43 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <1466085112.1516432.639628113.069A20D4@webmail.messagingengine.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <57624CA3.3020704@hastings.org>
 <20160616110329.37acd390.barry@wooz.org>
 <1466085112.1516432.639628113.069A20D4@webmail.messagingengine.com>
Message-ID: <20160616170443.4dc73f23.barry@wooz.org>

On Jun 16, 2016, at 09:51 AM, Random832 wrote:

>On Thu, Jun 16, 2016, at 04:03, Barry Warsaw wrote:
>> *If* we can guarantee that os.urandom() will never block or raise an
>> exception when only poor entropy is available, then it may be indeed
>> indistinguishably backward compatible for most if not all cases.  
>
>Why can't we exclude cases when only poor entropy is available from
>"most if not all cases"?

Because if it blocks or raises a new exception on poor entropy it's an API
break.
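
To make the "blocks or raises" distinction concrete, here is a rough
sketch, assuming Python 3.6+ on Linux where os.getrandom() and
os.GRND_NONBLOCK exist; the helper name is made up for this example:

```python
import os


def random_bytes_or_none(n):
    """Return n random bytes, or None if the kernel pool is not ready yet.

    Intended for small n; getrandom() may return short reads for large n.
    """
    if not hasattr(os, "getrandom"):
        # Platform without getrandom(): nothing to distinguish here.
        return os.urandom(n)
    try:
        # GRND_NONBLOCK turns "would block" into an exception.
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:
        # The "raises a new exception" case: the pool is not initialized.
        return None
    except OSError:
        # Assumption: syscall unavailable on this kernel, use the old path.
        return os.urandom(n)
```

Dropping the flag, i.e. os.getrandom(n), gives the blocking variant
instead. Either outcome is something a caller of the older os.urandom()
on Linux never had to handle.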

Cheers,
-Barry

From srkunze at mail.de  Thu Jun 16 10:53:35 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 16 Jun 2016 16:53:35 +0200
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <82F8D8A2-57D0-4973-9B08-05D04E0D7629@stufft.io>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
 <CACac1F9rmtK_nGgDntHOxGp-xBp_fVoC0KcNv4TJ5rqTwhYiAQ@mail.gmail.com>
 <82F8D8A2-57D0-4973-9B08-05D04E0D7629@stufft.io>
Message-ID: <5762BD6F.50809@mail.de>

> I also think it's a great module for providing defaults that we can't
> provide in os.urandom, like the number of bytes that are considered
> "secure" [1].
>
> What I don't think is that the secrets module means that all of a sudden
> os.urandom is no longer an API that is primarily used in a security
> sensitive context

Not all of a sudden. However, I guess things will change in the future.

If we want the secrets module to be the first and only place where 
crypto goes, we should work towards that goal. It needs proper 
communication, marketing etc.

Deprecation periods can be years long. This change (whatever form it 
will take) can be carried out over 3 or 4 releases when the ultimate 
goal is made clear to everybody reading the docs. OTOH I don't know 
whether long deprecation periods are necessary here at all. Other 
industries are very sensitive to fast changes.

Furthermore, next generations will be taught using the new way, so the 
Python community should not be afraid of some changes because most of 
them are for the better.

On 16.06.2016 15:02, Donald Stufft wrote:
> I think that os.urandom is the most obvious thing that someone will reach for given:
>
> * Pages upon pages of documentation both inside the Python community
>    and outside saying "use urandom".
> * The sheer bulk of existing code that is already out there using
>    os.urandom for its cryptographic properties.

That's maybe you. However, as stated before, I am not an expert in this
field. So, when I need to, I would first start researching the current
state of the art in Python.

If the docs say: use the secrets module (e.g. near os.urandom), I would
happily comply -- especially when there's a reasonable explanation.

That's from a newbie's point of view.

Best,
Sven

From random832 at fastmail.com  Thu Jun 16 11:07:04 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 16 Jun 2016 11:07:04 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616170443.4dc73f23.barry@wooz.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org> <57624CA3.3020704@hastings.org>
 <20160616110329.37acd390.barry@wooz.org>
 <1466085112.1516432.639628113.069A20D4@webmail.messagingengine.com>
 <20160616170443.4dc73f23.barry@wooz.org>
Message-ID: <1466089624.1532079.639708689.6394BFD9@webmail.messagingengine.com>

On Thu, Jun 16, 2016, at 10:04, Barry Warsaw wrote:
> On Jun 16, 2016, at 09:51 AM, Random832 wrote:
> 
> >On Thu, Jun 16, 2016, at 04:03, Barry Warsaw wrote:
> >> *If* we can guarantee that os.urandom() will never block or raise an
> >> exception when only poor entropy is available, then it may be indeed
> >> indistinguishably backward compatible for most if not all cases.  
> >
> >Why can't we exclude cases when only poor entropy is available from
> >"most if not all cases"?
>
> Because if it blocks or raises a new exception on poor entropy it's an
> API break.

Yes, but in only very rare cases. Which as I *just said* makes it
backwards compatible for "most" cases.

From contact at ionelmc.ro  Thu Jun 16 07:27:01 2016
From: contact at ionelmc.ro (=?UTF-8?Q?Ionel_Cristian_M=C4=83rie=C8=99?=)
Date: Thu, 16 Jun 2016 14:27:01 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
Message-ID: <CANkHFr9tkxUO+pJEBARF6HrscV++vAFHu-kc9+fK1mkhw=iFYw@mail.gmail.com>

On Thu, Jun 16, 2016 at 1:04 PM, Donald Stufft <donald at stufft.io> wrote:

> In my opinion, this is a usability issue as well. You have a ton of third
> party documentation and effort around "just use urandom" for Cryptographic
> random which is generally the right (and best!) answer except for this one
> little niggle on a Linux platform where /dev/urandom *may* produce
> predictable bytes (but usually doesn't).

Why not consider opt-out behavior with environment variables? E.g.: people
that don't care about crypto mumbojumbo and want fast interpreter startup
could just use a PYTHONWEAKURANDOM=y or PYTHONFASTURANDOM=y.

That way there's no need to change the API of os.urandom() and users have a
clear and easy path to get the old behavior.
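
The opt-out idea above might look roughly like the following sketch. It
assumes Python 3.6+ on Linux; PYTHONWEAKURANDOM is only the variable name
proposed in this message, not something the interpreter recognizes, and
both function names are hypothetical:

```python
import os


def _read_exactly(reader, n):
    """Call reader(k) repeatedly until n bytes have been collected."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = reader(remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def urandom_with_opt_out(n, environ=None):
    """Secure-by-default random bytes with an environment opt-out."""
    environ = os.environ if environ is None else environ
    weak_ok = environ.get("PYTHONWEAKURANDOM", "") == "y"
    if not weak_ok and hasattr(os, "getrandom"):
        # Default: wait for the kernel pool to be initialized.
        return _read_exactly(os.getrandom, n)
    if os.path.exists("/dev/urandom"):
        # Opted out (or no getrandom): read the device, which never waits.
        with open("/dev/urandom", "rb") as f:
            return f.read(n)
    # No /dev/urandom on this platform; defer to the OS-specific source.
    return os.urandom(n)
```

The environ parameter exists only so the behaviour can be exercised
without touching the real process environment.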

Thanks,
-- Ionel Cristian Mărieș, http://blog.ionelmc.ro

From ncoghlan at gmail.com  Thu Jun 16 11:39:48 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Jun 2016 08:39:48 -0700
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <CACac1F9rmtK_nGgDntHOxGp-xBp_fVoC0KcNv4TJ5rqTwhYiAQ@mail.gmail.com>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
 <CACac1F9rmtK_nGgDntHOxGp-xBp_fVoC0KcNv4TJ5rqTwhYiAQ@mail.gmail.com>
Message-ID: <CADiSq7fY3zddNin=+SSsLLsifY621p-6SBVKaqaJ9sXdAvvTVQ@mail.gmail.com>

On 16 June 2016 at 05:50, Paul Moore <p.f.moore at gmail.com> wrote:
> On 16 June 2016 at 12:34, Donald Stufft <donald at stufft.io> wrote:
>> [1] I don't think using os.urandom is incorrect to use for security sensitive
>>     applications and I think it's a losing battle for Python to try and fight
>>     the rest of the world that urandom is not the right answer here.
>>
>> [2] python-dev tends to favor not breaking "working" code over securing existing
>>     APIs, even if "working" is silently doing the wrong thing in a security
>>     context. This is particularly frustrating when it comes to security because
>>     security is by its nature the act of taking code that would otherwise
>>     execute and making it error, ideally only in bad situations, but this
>>     "security's purpose is to make things break" nature clashes with python-dev's
>>     default of not breaking "working" code in a way that is personally draining
>>     to me.
>
> Should I take it from these two statements that you do not believe
> that providing *new* APIs that provide better security compared to a
> backward compatible but flawed existing implementation is a reasonable
> approach? And specifically that you don't agree with the decision to
> provide the new "secrets" module as the recommended interface for
> getting secure random numbers from Python?
>
> One of the aspects of this debate that I'm unclear about is what role
> the people arguing that os.urandom must change see for the new secrets
> module.

The secrets module is great for new code that gets to ignore any
version of Python older than 3.6 - it's the "solve this problem for
the next generation of developers" answer. All of the complicated
"this API is safe for that purpose, this API isn't" discussions get
replaced by "do the obvious thing" (i.e. use random for simulations,
secrets for security).
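
As a minimal sketch of that split, assuming Python 3.6+ (the seed, the
dice example, and the variable names are arbitrary):

```python
import random
import secrets

# Simulation: reproducibility is a feature, so seed a private generator.
sim = random.Random(12345)
dice_rolls = [sim.randint(1, 6) for _ in range(5)]

# Security: the value must be unguessable, so ask secrets instead.
reset_token = secrets.token_hex(16)
```

Re-running with the same seed reproduces dice_rolls exactly, which is the
property that makes random unsuitable for reset_token.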

The os.urandom() debate is about taking the current obvious (because
that's what the entire security community is telling you to do) low
level way to do it and categorically eliminating any and all caveats
on its correctness. Not "it's correct if you use these new flags that
are incompatible with older Python versions". Not "it's not correct
anymore, use a different API". Just "it's correct, and the newer your
Python runtime, the more correct it is".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From njs at pobox.com  Thu Jun 16 11:58:30 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 16 Jun 2016 08:58:30 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <loom.20160616T100935-276@post.gmane.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
Message-ID: <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>

On Jun 16, 2016 1:23 AM, "Stefan Krah" <stefan at bytereef.org> wrote:
>
> Nathaniel Smith <njs <at> pobox.com> writes:
> > In practice, your proposal means that ~all existing code that uses
> > os.urandom becomes incorrect and should be switched to either secrets
> > or random. This is *far* more churn for end-users than Nick's
> > proposal.
>
> This should only concern code that a) was specifically written for
> 3.5.0/3.5.1 and b) implements a serious cryptographic application
> in Python.
>
> I think b) is not a good idea anyway due to timing and side channel
> attacks and the lack of secure wiping of memory. Such applications
> should be written in C, where one does not have to predict the
> behavior of multiple layers of abstractions.

This is completely unhelpful. Firstly because it's an argument that
os.urandom and the secrets module shouldn't exist, which doesn't tell us
much about what their behavior should be given that they do exist, and
secondly because it fundamentally misunderstands why they exist.

The word "cryptographic" here is a bit of a red herring. The guarantee that
a CSPRNG makes is that the output should be *unguessable by third parties*.
There are plenty of times when this is what you need even when you aren't
using actual cryptography. For example, when someone logs into a web app, I
may want to send back a session cookie so that I can recognize this person
later without making them reauthenticate all the time. For this to work
securely, it's extremely important that no one else be able to predict what
session cookie I sent, because if you can guess the cookie then you can
impersonate the user.
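
A minimal sketch of the session-cookie case described above, using only
os.urandom so that it also applies to pre-3.6 code. The function names
and the 32-byte size are assumptions of this example, and
hmac.compare_digest needs Python 2.7.7+ or 3.3+:

```python
import binascii
import hmac
import os


def new_session_id(nbytes=32):
    """Return an unguessable session identifier as ASCII hex text."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


def same_session(expected, presented):
    """Compare identifiers without leaking where they first differ."""
    return hmac.compare_digest(expected, presented)
```

The whole scheme rests on new_session_id() being unpredictable to anyone
but the server, which is exactly the CSPRNG guarantee.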

In python 2.3-3.5, the most correct way to write this code is to use
os.urandom. The question in this thread is whether we should break that in
3.6, so that conscientious users are forced to switch existing code over to
using the secrets module if they want to continue to get the most correct
available behavior, or whether we should preserve that in 3.6, so that code
like my hypothetical web app that was correct on 2.3-3.5 remains correct on
3.6 (with the secrets module being a more friendly wrapper that we
recommend for new code, but with no urgency about porting existing code to
it).

-n

From p.f.moore at gmail.com  Thu Jun 16 12:39:12 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 16 Jun 2016 17:39:12 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
Message-ID: <CACac1F_xcCtb7S=iKDhiMsW9QODLBhtUaoG4YAqPOKAbDoPOaw@mail.gmail.com>

On 16 June 2016 at 16:58, Nathaniel Smith <njs at pobox.com> wrote:
> The word "cryptographic" here is a bit of a red herring. The guarantee that
> a CSPRNG makes is that the output should be *unguessable by third parties*.
> There are plenty of times when this is what you need even when you aren't
> using actual cryptography. For example, when someone logs into a web app, I
> may want to send back a session cookie so that I can recognize this person
> later without making them reauthenticate all the time. For this to work
> securely, it's extremely important that no one else be able to predict what
> session cookie I sent, because if you can guess the cookie then you can
> impersonate the user.
>
> In python 2.3-3.5, the most correct way to write this code is to use
> os.urandom. The question in this thread is whether we should break that in
> 3.6, so that conscientious users are forced to switch existing code over to
> using the secrets module if they want to continue to get the most correct
> available behavior, or whether we should preserve that in 3.6, so that code
> like my hypothetical web app that was correct on 2.3-3.5 remains correct on
> 3.6 (with the secrets module being a more friendly wrapper that we recommend
> for new code, but with no urgency about porting existing code to it).

While your example is understandable and clear, it's a bit of a
red herring as well. Nobody's setting up a web session cookie during
the first moments of Linux boot (are they?), so os.urandom is
perfectly OK in all cases here. We have a new API in 3.6 that might
better express the *intent* of generating a secret token, but
(cryptographic) correctness is the same either way for this example.

As someone who isn't experienced in crypto, I genuinely don't have the
slightest idea of what sort of program we're talking about that is
written in Python, runs in the early stages of OS startup, and needs
crypto-strength random numbers. So I can't reason about whether the
proposed solutions are sensible. Would such programs be used in a
variety of environments with different Python versions? Would the
developers be non-specialists? Which of the mistakes being made that
result in a vulnerability is the easiest to solve (move the code to
run later, modify the Python code, require a fixed version of Python)?
How severe is the security hole compared to others (for example, users
with weak passwords)? What attacks are possible, and what damage could
be done? (I know that in principle, any security hole needs to be
plugged, but I work in an environment where production services with a
password of "password" exist, and applying system security patches is
treated as a "think about it when things are quiet" activity - so
forgive me if I don't immediately understand why obscure
vulnerabilities are important).

I'm willing to accept the view of the security experts that there's a
problem here. But without a clear explanation of the problem, how can
a non-specialist like myself have an opinion? (And I hope the security
POV isn't "you don't need an opinion, just do as we say").

Paul

From random832 at fastmail.com  Thu Jun 16 12:57:09 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 16 Jun 2016 12:57:09 -0400
Subject: [Python-Dev] Our responsibilities (was Re: BDFL ruling request:
 should we block forever waiting for high-quality random bits?)
In-Reply-To: <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
References: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <CAPJVwBmEjC7BSj=t5cdbF_5rz_DT7LM3iMVgS-LNDsUxCama3Q@mail.gmail.com>
 <20160616052541.GB32689@thunk.org>
 <CAPJVwB=_uz7JH4GLttcxqPofgzmMDGH3w9pxyJ_dRzq_fXS4Og@mail.gmail.com>
 <20160616114610.18de2d88.barry@wooz.org>
 <61CE1B75-AC9A-4493-9AF5-44D022629DEA@stufft.io>
 <20160616140756.62be6e25@python.org>
 <85CBF6B2-85A6-4285-891A-154B84C3A533@stufft.io>
Message-ID: <1466096229.1556382.639822985.1A5F49FA@webmail.messagingengine.com>

On Thu, Jun 16, 2016, at 07:34, Donald Stufft wrote:
>   python-dev tends to favor not breaking "working" code over securing
>   existing APIs, even if "working" is silently doing the wrong thing
>   in a security context. This is particularly frustrating when it
>   comes to security because security is by its nature the act of
>   taking code that would otherwise execute and making it error,
>   ideally only in bad situations, but this "security's purpose is to
>   make things break" nature clashes with python-dev's default of
>   not breaking "working" code in a way that is personally draining
>   to me.

I was almost about to reply with "Maybe what we need is a new zen of
python", then I checked. It turns out we already have "Errors should
never pass silently" which fits *perfectly* in this situation. So what's
needed is a change to the attitude that making a silently passing error
no longer pass silently is a backward compatibility break.

This isn't Java, where the exceptions not thrown by an API are part of
that API's contract. We're free to throw new exceptions in a new version
of Python.

From mertz at gnosis.cx  Thu Jun 16 13:01:00 2016
From: mertz at gnosis.cx (David Mertz)
Date: Thu, 16 Jun 2016 13:01:00 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
Message-ID: <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>

On Thu, Jun 16, 2016 at 11:58 AM, Nathaniel Smith <njs at pobox.com> wrote:

> [...] no one else be able to predict what session cookie I sent [...] In
> python 2.3-3.5, the most correct way to write this code is to use
> os.urandom. The question in this thread is whether we should break that in
> 3.6, so that conscientious users are forced to switch existing code over to
> using the secrets module if they want to continue to get the most correct
> available behavior, or whether we should preserve that in 3.6, so that code
> like my hypothetical web app that was correct on 2.3-3.5 remains correct on
> 3.6
>
This is kinda silly.  Unless you specifically wrote your code for Python
3.5.1, and NOT for 2.3.x through 3.4.x, your code is NO WORSE in 3.5.2 than
it has been under all those prior versions.  The cases where the behavior
in everything other than 3.5.0-3.5.1 is suboptimal are *extremely limited*,
as you understand (things that run in Python very early in the boot
process, and only on recent versions of Linux, no other OS).  This does not
even remotely describe the web-server-with-cookies example that you outline.

Python 3.6 is introducing a NEW MODULE, with new APIs.  The 'secrets'
module is the very first time that Python has ever really explicitly
addressed cryptography in the standard library.  Yes, there have been
third-party modules and libraries, but any cryptographic application of
Python prior to 'secrets' is very much roll-your-own and
know-what-you-are-doing.

Yes, there has been a history of telling people to "use os.urandom()" on
StackOverflow and places like that.  That's about the best advice that was
available prior to 3.6.  Adding a new module and API is specifically
designed to allow for a better answer, otherwise there'd be no reason to
include it.  And that advice that's been on StackOverflow and wherever has
been subject to the narrow, edge-case flaw we've discussed here for at
least a decade without anyone noticing or caring.

It seems to me that backporting 'secrets' and putting it on Warehouse would
be a lot more productive than complaining about 3.5.2 reverting to (almost)
the behavior of 2.3-3.4.
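
For code that has to run on both old and new versions, a compatibility
shim in the spirit of such a backport could be as small as the sketch
below. The fallback's default size is an assumption of this example, and
a real backport would have to cover the rest of the secrets API too:

```python
import os

try:
    from secrets import token_bytes  # Python 3.6+
except ImportError:
    def token_bytes(nbytes=None):
        """Fallback with the same call shape, built on os.urandom."""
        if nbytes is None:
            nbytes = 32  # assumed default, chosen for this sketch
        return os.urandom(nbytes)
```

Callers then use token_bytes() everywhere and pick up the standard
library version automatically once they are on 3.6.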

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From ncoghlan at gmail.com  Thu Jun 16 13:03:34 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Jun 2016 10:03:34 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CACac1F_xcCtb7S=iKDhiMsW9QODLBhtUaoG4YAqPOKAbDoPOaw@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CACac1F_xcCtb7S=iKDhiMsW9QODLBhtUaoG4YAqPOKAbDoPOaw@mail.gmail.com>
Message-ID: <CADiSq7dw__xoga9A3R8G66TpwcC0XbWs056q82TtioJCUr9mBw@mail.gmail.com>

On 16 June 2016 at 09:39, Paul Moore <p.f.moore at gmail.com> wrote:
> I'm willing to accept the view of the security experts that there's a
> problem here. But without a clear explanation of the problem, how can
> a non-specialist like myself have an opinion? (And I hope the security
> POV isn't "you don't need an opinion, just do as we say").

If you're not writing Linux (and presumably *BSD) scripts and
applications that run during system initialisation or on embedded ARM
hardware with no good sources of randomness, then there's zero chance
of any change made in relation to this affecting you (Windows and Mac
OS X are completely immune, since they don't allow Python scripts to
run early enough in the boot sequence for there to ever be a problem).

The only question at hand is what CPython should do in the case where
the operating system *does* let Python scripts run before the system
random number generator is ready, and the application calls a security
sensitive API that relies on that RNG:

- throw BlockingIOError (so the script developer knows they have a
potential problem to fix)
- block (so the script developer has a system hang to debug)
- return low quality random data (so the script developer doesn't even
know they have a potential problem)

The last option is the status quo, and has a remarkable number of
vocal defenders.

The second option is what we changed the behaviour to in 3.5 as a side
effect of switching to a syscall to save a file descriptor (and *also*
inadvertently made it a gating requirement for CPython starting at all,
without which I'd be very surprised if anyone actually noticed the
potentially blocking behaviour in os.urandom itself)

The first option is the one I'm currently writing a PEP for, since it
makes the longstanding advice to use os.urandom() as the low level
random data API for security sensitive operations unequivocally
correct (as it will either do the right thing, or throw an exception
which the developer can handle as appropriate for their particular
application)
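
A minimal sketch of what "handle as appropriate" could look like for the
first option. It assumes the proposed behaviour (os.urandom() raising
BlockingIOError while the system RNG is not yet ready), which is a proposal
in this thread and not something any released Python is confirmed to do
here. The helper name, the retry policy, and the injectable `source`
parameter are hypothetical.

```python
import os
import time


def get_token_bytes(n, retries=3, delay=0.1, source=os.urandom):
    """Return n random bytes, retrying while the source is not ready.

    Assumption: under the proposal, os.urandom() raises BlockingIOError
    when the kernel RNG has not been initialised yet. `source` is only
    injectable so the retry path can be exercised without such a system.
    """
    for _ in range(retries):
        try:
            return source(n)
        except BlockingIOError:
            # The RNG is not ready yet: wait briefly, then try again.
            time.sleep(delay)
    # Final attempt: if this also fails, the exception propagates and the
    # caller decides what is appropriate for the application.
    return source(n)
```

An early-boot script might instead catch the exception once and fall back
to a non-security-sensitive code path; which choice is right depends on the
application.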

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Jun 16 13:26:22 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Jun 2016 10:26:22 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
Message-ID: <CADiSq7eJGWMUi8x4HYsCsDWwQ3q_0VcOoWPv=6b0ffdYD4NK5Q@mail.gmail.com>

On 16 June 2016 at 10:01, David Mertz <mertz at gnosis.cx> wrote:
> It seems to me that backporting 'secrets' and putting it on Warehouse would
> be a lot more productive than complaining about 3.5.2 reverting to (almost)
> the behavior of 2.3-3.4.

"Let Flask/Django/passlib/cryptography/whatever handle the problem
rather than rolling your own" is already the higher level
meta-guidance. However, there are multiple levels of improvement being
pursued here, since developer ignorance of security concerns and
problematic defaults at the language level is a chronic problem rather
than an acute one (and one that affects all languages, not just
Python).

In that context, the main benefit of the secrets module is as a
deterrent against people reaching for the reproducible simulation
focused random module to implement security sensitive operations. By
offering both secrets and random in the standard library, we help make
it clear that secrecy and simulation are *not the same problem*, even
though they both involve random numbers. Folks that learn Python 3.6
first and then later start supporting earlier versions are likely to
be more aware of the difference, and hence go looking for "What's the
equivalent of the secrets module on earlier Python versions?" (at
which point they can just copy whichever one-liner they actually need
into their particular application - just as not every 3 line function
needs to be a builtin, not every 3 line function needs to be a module
on PyPI)
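
To illustrate why secrecy and simulation are different problems, here is a
small sketch using only the random module. The seed value and variable
names are arbitrary choices for illustration.

```python
import random

# Simulation: a seeded generator is reproducible by design, which is
# exactly what you want for repeatable experiments.
sim_a = random.Random(12345)
sim_b = random.Random(12345)
same_sequence = (
    [sim_a.random() for _ in range(3)] == [sim_b.random() for _ in range(3)]
)

# Secrecy: SystemRandom draws from os.urandom(); seed() is documented as
# having no effect, so the output cannot be replayed from a known seed.
sec_a = random.SystemRandom()
sec_b = random.SystemRandom()
sec_a.seed(12345)
sec_b.seed(12345)
token_a = sec_a.getrandbits(128)
token_b = sec_b.getrandbits(128)
```

Anyone who knows the seed can replay the first pair of generators, which is
why they are unsuitable for tokens and passwords.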

The os.urandom proposal is aimed more at removing any remaining
equivocation from the longstanding "Use os.urandom() for security
sensitive operations in Python" advice - it's for the benefit of folks
that are *already* attempting to do the right thing given the tools
they have available. The sole source of that equivocation is that in
some cases, at least on Linux, and potentially on *BSD (although we
haven't seen a confirmed reproducer there), os.urandom() may return
results that are sufficiently predictable to be inappropriate for use
in security sensitive applications.

At the moment, determining whether or not you're risking exposure to
that problem requires that you know a whole lot about Linux (and *BSD,
where even we haven't been able to determine the level of exposure on
embedded systems), and also about how ``os.urandom()`` is implemented
on different platforms.

My proposal is that we do away with the requirement for all that
assumed knowledge and instead say "Are you using os.urandom(),
random.SystemRandom(), or an API in the secrets module? Are you using
Python 3.6+? Did it raise BlockingIOError? No? Then you're fine".

The vast majority of Python developers will thus be free to remain
entirely ignorant of these platform specific idiosyncrasies, while
those that have a potential need to know will get an exception from
the interpreter that they can then feed into a search engine and get
pointed in the right direction.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From njs at pobox.com  Thu Jun 16 13:40:12 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 16 Jun 2016 10:40:12 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
Message-ID: <CAPJVwBmOUFt=JQng6op-BEcJ078cEAiO8bOjzWRAOLJSjKZi=A@mail.gmail.com>

On Jun 16, 2016 10:01 AM, "David Mertz" <mertz at gnosis.cx> wrote:
> Python 3.6 is introducing a NEW MODULE, with new APIs.  The 'secrets'
module is the very first time that Python has ever really explicitly
addressed cryptography in the standard library.

This is completely, objectively untrue. If you look up os.urandom in the
official manual for the standard library, it has always stated
explicitly, as the very first line, that os.urandom returns "a string of n
random bytes suitable for cryptographic use." This is *exactly* the same
explicit guarantee that the secrets module makes. The motivation for adding
the secrets module was to make this functionality easier to find and more
convenient to use (e.g. by providing convenience functions for getting
random strings of ASCII characters), not to suddenly start addressing
cryptographic concerns for the first time.

(Will try to address other more nuanced points later.)

-n

From ericsnowcurrently at gmail.com  Thu Jun 16 13:46:25 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 16 Jun 2016 11:46:25 -0600
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CANawmycixVLySD9KAAHmxjCoPKfBRGDKWmczwvHmUo3+vuWDkA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CANawmycixVLySD9KAAHmxjCoPKfBRGDKWmczwvHmUo3+vuWDkA@mail.gmail.com>
Message-ID: <CALFfu7A_b0rZGmwD-EM-eM6Rpk_6-a9KrRJHeSG8ti+jxNzA-w@mail.gmail.com>

Thanks for raising these good points, Nikita.  I'll make sure the PEP
reflects this discussion.  (inline responses below...)

-eric

On Tue, Jun 14, 2016 at 3:41 AM, Nikita Nemkin <nikita at nemkin.ru> wrote:
> Is there any rationale for rejecting alternatives like:
>
> 1. Adding standard metaclass with ordered namespace.
> 2. Adding `namespace` or `ordered` args to the default metaclass.

We already have a metaclass-based solution: __prepare__().
Unfortunately, this opt-in option means that the definition order
isn't preserved by default, which means folks can't rely on access to
the definition order.  This is effectively no different from the
status quo.
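
For reference, a minimal sketch of the opt-in metaclass approach. The
names `OrderedMeta` and `member_order` are hypothetical, chosen only for
this illustration.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Opt-in metaclass that records the order of class body assignments."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes in the mapping returned here.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Keep only non-dunder names, in the order they were defined.
        cls.member_order = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Point(metaclass=OrderedMeta):
    x = 0
    y = 0

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


class Plain:
    x = 0
    y = 0
```

Only classes that opt in (like `Point`) carry the information; `Plain`
does not, so generic code cannot rely on it being there.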

Furthermore, there's a practical problem with requiring the use of
metaclasses to achieve some particular capability: metaclass
conflicts.  PEPs 422 and 487 exist, in large part, as a response to
specific feedback from users about problems they've had with
metaclasses.  While the key objective of PEP 520 is preserving the
class definition order, it also helps make it less necessary to write
a metaclass.

> 3. Making compiler fill in __definition_order__ for every class
>     (just like __qualname__) without touching the runtime.

This is a great idea.  I'd support any effort to do so.  But keep in
mind that how we derive __definition_order__ isn't as important as
that it's always there.  So the use of OrderedDict for the
implementation isn't necessary; it's simply the implementation I've
chosen.  If we later switch to using the compiler to get the definition
order, then great!

> ?
>
> To me, any of the above seems preferred to complicating
> the core part of the language forever.

What specific complication are you expecting?  Like nearly all of
Python's "power tools", folks won't need to know about the changes
from this PEP in order to use the language.  Then when they need the
new functionality, it will be ready for them to use.  Furthermore, as
far as changes to the language go, this change is quite simple and
straightforward (consider other recent changes, e.g. async).  It is
arguably a natural step and fills in some of the information that
Python currently throws away.  Finally, I've gotten broad support for
the change from across the community (both on the mailing lists and in
personal correspondence), from the time I first introduced the idea
several years ago.

>
> The vast majority of Python classes don't care about their member
> order, this is minority use case receiving majority treatment.

The problem is that there isn't any other recourse available to code
that wishes to determine the definition order of an arbitrary class.
This is an obstacle to code that I personally want to write (hence my
interest).

>
> Also, wiring OrderedDict into class creation means elevating it
> from a peripheral utility to indispensable built-in type.

Note that as of 3.5 CPython's OrderedDict *is* a builtin type (though
exposed via the collections module rather than the builtins module).
However, you're right that this change would mean OrderedDict would
now be used by the interpreter in all implementations of Python 3.6+.
Some of the other implementers from which I've gotten feedback have
indicated this isn't a problem.

From ncoghlan at gmail.com  Thu Jun 16 13:53:31 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Jun 2016 10:53:31 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAPJVwBmOUFt=JQng6op-BEcJ078cEAiO8bOjzWRAOLJSjKZi=A@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
 <CAPJVwBmOUFt=JQng6op-BEcJ078cEAiO8bOjzWRAOLJSjKZi=A@mail.gmail.com>
Message-ID: <CADiSq7cKmQmU6EPdDTyWrkkmQ8bZXkasY++X24XxNdCSNqDMWA@mail.gmail.com>

On 16 June 2016 at 10:40, Nathaniel Smith <njs at pobox.com> wrote:
> On Jun 16, 2016 10:01 AM, "David Mertz" <mertz at gnosis.cx> wrote:
>> Python 3.6 is introducing a NEW MODULE, with new APIs.  The 'secrets'
>> module is the very first time that Python has ever really explicitly
>> addressed cryptography in the standard library.
>
> This is completely, objectively untrue. If you look up os.urandom in the
> official manual for the standard library, it has always stated
> explicitly, as the very first line, that os.urandom returns "a string of n
> random bytes suitable for cryptographic use." This is *exactly* the same
> explicit guarantee that the secrets module makes. The motivation for adding
> the secrets module was to make this functionality easier to find and more
> convenient to use (e.g. by providing convenience functions for getting
> random strings of ASCII characters), not to suddenly start addressing
> cryptographic concerns for the first time.

An analogy that occurred to me that may help some folks: secrets is a
higher level API around os.urandom and some other standard library
features (like base64 and binascii.hexlify) in the same way that
shutil and pathlib are higher level APIs that aggregate other os
module functions with other parts of the standard library.

The existence of those higher level APIs doesn't make the lower level
building blocks redundant.
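
As a rough sketch of that layering, helpers like the following can be
written directly on top of os.urandom, binascii and base64. This is an
approximation written for illustration, not the secrets module's actual
source, and the default of 32 bytes is an assumption.

```python
import base64
import binascii
import os


def token_bytes(nbytes=32):
    """Return nbytes of random data from the OS-level source."""
    return os.urandom(nbytes)


def token_hex(nbytes=32):
    """Return a hex string; each byte becomes two hex digits."""
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_urlsafe(nbytes=32):
    """Return URL-safe Base64 text with the '=' padding stripped."""
    encoded = base64.urlsafe_b64encode(token_bytes(nbytes))
    return encoded.rstrip(b"=").decode("ascii")
```

Code that has to run on versions without the secrets module can copy
whichever of these helpers it needs.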

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Thu Jun 16 14:15:21 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 16 Jun 2016 12:15:21 -0600
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CANawmycMpZc+SBnqqTgwfXffnAU111MC5Ax1D79-o17WVZXNUA@mail.gmail.com>
References: <CANawmycMpZc+SBnqqTgwfXffnAU111MC5Ax1D79-o17WVZXNUA@mail.gmail.com>
Message-ID: <CALFfu7Co1U1et3f1Deax6C5a5=G-32PfyvwCpgfkYvokPSAMJw@mail.gmail.com>

On Thu, Jun 16, 2016 at 5:11 AM, Nikita Nemkin <nikita at nemkin.ru> wrote:
> I'll reformulate my argument:
>
> Ordered class namespaces are a minority use case that's already covered
> by existing language features (custom metaclasses) and doesn't warrant
> the extension of the language (i.e. making OrderedDict a builtin type).
> This is about Python-the-Language, not CPython-the-runtime.

So your main objection is that OrderedDict would effectively become
part of the language definition?  Please elaborate on why this is a
problem.

> The simple answer is "don't do that", i.e. don't pile an ordered metaclass
> on top of another metaclass. Such use case is hypothetical anyway.

It isn't hypothetical.  It's a concrete problem that folks have run
into enough that it's been a point of discussion on several occasions
and the motivation for several PEPs.

> All explicit assignments in the class body can be detected statically.
> Implicit assignments via locals(), sys._getframe() etc. can't be detected,
> BUT they are unlikely to have a meaningful order!
> It's reasonable to exclude them from __definition_order__.

Yeah, it's reasonable to exclude them.  However, in cases where I've
done so I would have wanted them included in the definition order.
That said, explicitly setting __definition_order__ in the class body
would be enough to address that corner case.

>
> This also applies to documentation tools. If there really was a need,
> they could have easily extracted static order, solving 99.9999% of
> the problem.

You mean that they have the opportunity to do something like AST
traversal to extract the definition order?  I expect the definition
order isn't important enough to them to do that work.  However, if the
language provided the definition order to them for free then they'd
use it.
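
To make the "AST traversal" option concrete, here is a minimal sketch of
what a tool would have to do. The function name is hypothetical, and it
deliberately ignores tuple-unpacking targets and anything assigned
dynamically.

```python
import ast


def static_definition_order(source, class_name):
    """Return names bound in a class body, in first-appearance order."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            names = []
            for stmt in node.body:
                if isinstance(
                    stmt, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
                ):
                    found = [stmt.name]
                elif isinstance(stmt, ast.Assign):
                    # Simple "name = value" targets only.
                    found = [t.id for t in stmt.targets
                             if isinstance(t, ast.Name)]
                elif (isinstance(stmt, ast.AnnAssign)
                      and stmt.value is not None
                      and isinstance(stmt.target, ast.Name)):
                    found = [stmt.target.id]
                else:
                    found = []
                for name in found:
                    if name not in names:
                        names.append(name)
            return tuple(names)
    raise LookupError(class_name)


SOURCE = '''
class Spam:
    eggs = 1
    def ham(self):
        pass
    bacon = eggs + 1
    eggs = 3
'''

order = static_definition_order(SOURCE, "Spam")
```

This needs access to the source text and says nothing about names added
via locals() or similar, which is part of why tools rarely bother.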

>
>> The rationale for "Why not make this configurable, rather than
>> switching it unilaterally?" is that it's actually *simpler* overall to
>> just make it the default - we can then change the documentation to say
>> "class bodies are evaluated in a collections.OrderedDict instance by
>> default" and record the consequences of that, rather than having to
>> document yet another class customisation mechanism.
>
> It would have been a "simpler" default if it was the core dict that
> became ordered. Instead, it brings in a 3rd party (OrderedDict).

Obviously if dict preserved insertion order then we'd use that instead
of OrderedDict.  There have been proposals along those lines in the
past but at the end of the day someone has to do the work.  Since we
can use OrderedDict right now and there's no ordered dict in sight, it
makes the choice rather easy. :)  Ultimately the cost of defaulting to
OrderedDict is not significant, neither to the language definition nor
to run-time performance.  Furthermore, defaulting to OrderedDict (per
the PEP) makes things possible right now that aren't otherwise a
possibility.

>
>> It also eliminates boilerplate from class decorator usage
>> instructions, where people have to write "to use this class decorator,
>> you must also specify 'namespace=collections.OrderedDict' in your
>> class header"
>
> Statically inferred __definition_order__ would work here.
> Order-dependent decorators don't seem to be important enough
> to worry about their usability.

Please be careful about discounting seemingly unimportant use cases.
There's a decent chance they are important to someone.  In this case
that someone is (at least) myself. :)  My main motivation for PEP 520
is exactly writing a class decorator that would rely on access to the
definition order.  Such a decorator (which could also be used
stand-alone) cannot rely on every possible class it might encounter to
explicitly expose its definition order.
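
A minimal sketch of the kind of decorator in question. The names
`ordered_fields` and `_fields` are hypothetical. The sketch prefers an
explicit ``__definition_order__`` attribute if the class body sets one,
and otherwise assumes that iterating the class ``__dict__`` yields names
in definition order, which only holds on interpreters that preserve that
order.

```python
def ordered_fields(cls):
    """Class decorator: record non-callable attributes in definition order."""
    order = vars(cls).get("__definition_order__")
    if order is None:
        # Assumption: the class namespace preserves definition order.
        order = [
            name for name in vars(cls)
            if not (name.startswith("__") and name.endswith("__"))
        ]
    cls._fields = tuple(
        name for name in order if not callable(getattr(cls, name))
    )
    return cls


@ordered_fields
class Record:
    name = ""
    size = 0

    def describe(self):
        return "{} ({})".format(self.name, self.size)


@ordered_fields
class Explicit:
    # The class body overrides the order explicitly.
    __definition_order__ = ("b", "a")
    a = 1
    b = 2
```

Without a guaranteed order on every class, the fallback branch is exactly
the part such a decorator cannot rely on.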

-eric

From Nikolaus at rath.org  Thu Jun 16 14:29:04 2016
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Thu, 16 Jun 2016 11:29:04 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7dw__xoga9A3R8G66TpwcC0XbWs056q82TtioJCUr9mBw@mail.gmail.com>
 (Nick Coghlan's message of "Thu, 16 Jun 2016 10:03:34 -0700")
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CACac1F_xcCtb7S=iKDhiMsW9QODLBhtUaoG4YAqPOKAbDoPOaw@mail.gmail.com>
 <CADiSq7dw__xoga9A3R8G66TpwcC0XbWs056q82TtioJCUr9mBw@mail.gmail.com>
Message-ID: <87eg7x3s3z.fsf@thinkpad.rath.org>

On Jun 16 2016, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 16 June 2016 at 09:39, Paul Moore <p.f.moore at gmail.com> wrote:
>> I'm willing to accept the view of the security experts that there's a
>> problem here. But without a clear explanation of the problem, how can
>> a non-specialist like myself have an opinion? (And I hope the security
>> POV isn't "you don't need an opinion, just do as we say").
>
> If you're not writing Linux (and presumably *BSD) scripts and
> applications that run during system initialisation or on embedded ARM
> hardware with no good sources of randomness, then there's zero chance
> of any change made in relation to this affecting you (Windows and Mac
> OS X are completely immune, since they don't allow Python scripts to
> run early enough in the boot sequence for there to ever be a problem).
>
> The only question at hand is what CPython should do in the case where
> the operating system *does* let Python scripts run before the system
> random number generator is ready, and the application calls a security
> sensitive API that relies on that RNG:
>
> - throw BlockingIOError (so the script developer knows they have a
> potential problem to fix)
> - block (so the script developer has a system hang to debug)
> - return low quality random data (so the script developer doesn't even
> know they have a potential problem)
>
> The last option is the status quo, and has a remarkable number of
> vocal defenders.

*applaud*

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From amk at amk.ca  Thu Jun 16 14:38:19 2016
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 16 Jun 2016 14:38:19 -0400
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7eJGWMUi8x4HYsCsDWwQ3q_0VcOoWPv=6b0ffdYD4NK5Q@mail.gmail.com>
References: <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
 <CADiSq7eJGWMUi8x4HYsCsDWwQ3q_0VcOoWPv=6b0ffdYD4NK5Q@mail.gmail.com>
Message-ID: <20160616183819.GA47680@ratlwsb1lcredwi.cmg.int>

On Thu, Jun 16, 2016 at 10:26:22AM -0700, Nick Coghlan wrote:
> meta-guidance. However, there are multiple levels of improvement being
> pursued here, since developer ignorance of security concerns and
> problematic defaults at the language level is a chronic problem rather
> than an acute one (and one that affects all languages, not just
> Python).

For a while Christian Heimes has speculated on Twitter about writing a
Secure Programming HOWTO.  At the last language summit in Montreal, I
told him I'd be happy to do the actual writing and editing if given a
detailed outline.  (I miss having an ongoing writing project since
ceasing to write the "What's New", but have no ideas for anything to
write about.)

That offer is still open, if Christian or someone else wants to
produce an outline.

--amk

From lkb.teichmann at gmail.com  Thu Jun 16 14:56:33 2016
From: lkb.teichmann at gmail.com (Martin Teichmann)
Date: Thu, 16 Jun 2016 20:56:33 +0200
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
Message-ID: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>

Hi list,

using metaclasses in Python is a very flexible method of customizing
class creation, yet this customization comes at a cost: once you want
to combine two classes with different metaclasses, you run into
problems.

This is why I proposed PEP 487 (see
https://github.com/tecki/peps/blob/pep487/pep-0487.txt, which I also
attached here for ease of discussion), proposing a simple hook into
class creation, which one can override in subclasses such that
sub-subclasses get customized accordingly. Otherwise, the standard
Python inheritance rules apply (super() and the MRO).

I also proposed to store the order in which attributes in classes are
defined. This is exactly the same as PEP 520, discussed here recently,
just that unfortunately we chose different names, but I am open to
suggestions for better names.

After having gotten good feedback on python-ideas (see
https://mail.python.org/pipermail/python-ideas/2016-February/038305.html)
and from IPython traitlets as a potential user of the feature (see
https://mail.scipy.org/pipermail/ipython-dev/2016-February/017066.html,
and the code at https://github.com/tecki/traitlets/commits/pep487) I
implemented a pure Python version of this PEP, to be introduced into
the standard library. I also wrote a proof-of-concept for another
potential user of this feature, django forms, at
https://github.com/tecki/django/commits/no-metaclass.

The code to be introduced into the standard library can be found at
https://github.com/tecki/cpython/commits/pep487
(sorry for using github, I'll submit something using hg once I
understand that toolchain). It introduces a new metaclass types.Type
which contains the machinery, and a new base class types.Object which
uses said metaclass. The naming was chosen to clarify the intention
that eventually those classes may be implemented in C and replace type
and object. As above, I am open to better naming.

As a second step, I let abc.ABCMeta inherit from said types.Type, such
that an ABC may also use the features of my metaclass, without the
need to define a new mixing metaclass.

I am looking forward to a lot of comments on this!

Greetings

Martin

The proposed PEP for discussion:

PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 5-Feb-2016
Replaces: 422

Abstract
========

Currently, customising class creation requires the use of a custom metaclass.
This custom metaclass then persists for the entire lifecycle of the class,
creating the potential for spurious metaclass conflicts.

This PEP proposes to instead support a wide range of customisation
scenarios through a new ``__init_subclass__`` hook in the class body,
a hook to initialize attributes, and a way to keep the order in which
attributes are defined.

Those hooks should at first be defined in a metaclass in the standard
library, with the option that this metaclass eventually becomes the
default ``type`` metaclass.

The new mechanism should be easier to understand and use than
implementing a custom metaclass, and thus should provide a gentler
introduction to the full power of Python's metaclass machinery.

Background
==========

Metaclasses are a powerful tool to customize class creation. They have,
however, the problem that there is no automatic way to combine metaclasses.
If one wants to use two metaclasses for a class, a new metaclass combining
those two needs to be created, typically manually.

This need often occurs as a surprise to a user: inheriting from two base
classes coming from two different libraries suddenly raises the necessity
to manually create a combined metaclass, where typically one is not
interested in those details about the libraries at all. This becomes
even worse if one library starts to make use of a metaclass which it
has not done before. While the library itself continues to work perfectly,
suddenly every code combining those classes with classes from another library
fails.
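
A minimal sketch of the conflict described above. All class names are
hypothetical stand-ins for two unrelated libraries.

```python
class MetaA(type):
    """Metaclass used by the first (hypothetical) library."""


class MetaB(type):
    """Metaclass used by the second (hypothetical) library."""


class FromLibA(metaclass=MetaA):
    pass


class FromLibB(metaclass=MetaB):
    pass


conflict = None
try:
    # Neither metaclass is a subclass of the other, so this fails.
    class Broken(FromLibA, FromLibB):
        pass
except TypeError as exc:
    conflict = str(exc)


# The manual workaround: a metaclass that combines both.
class MetaAB(MetaA, MetaB):
    pass


class Combined(FromLibA, FromLibB, metaclass=MetaAB):
    pass
```

The user of the two libraries has to write ``MetaAB`` even though they
never intended to deal with metaclasses at all.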

Proposal
========

While there are many possible ways to use a metaclass, the vast majority
of use cases falls into just three categories: some initialization code
running after class creation, the initialization of descriptors and
keeping the order in which class attributes were defined.

Those three use cases can easily be performed by just one metaclass. If
this metaclass is put into the standard library, and all libraries that
wish to customize class creation use this very metaclass, no combination
of metaclasses is necessary anymore. Said metaclass should live in the
``types`` module under the name ``Type``. This should hint to the user that
in the future, this metaclass may become the default metaclass ``type``.

The three use cases are achieved as follows:

1. The metaclass contains an ``__init_subclass__`` hook that initializes
   all subclasses of a given class,
2. the metaclass calls a ``__set_owner__`` hook on all the attributes
   (descriptors) defined in the class, and
3. an ``__attribute_order__`` tuple is left in the class in order to inspect
   the order in which attributes were defined.

For ease of use, a base class ``types.Object`` is defined, which uses said
metaclass and contains an empty stub for the hook described for use case 1.
It will eventually become the new replacement for the standard ``object``.

As an example, the first use case looks as follows::

   >>> class SpamBase(types.Object):
   ...    # this is implicitly a @classmethod
   ...    def __init_subclass__(cls, **kwargs):
   ...        cls.class_args = kwargs
   ...        # all keyword arguments are consumed here, none are passed on
   ...        super().__init_subclass__()

   >>> class Spam(SpamBase, a=1, b="b"):
   ...    pass

   >>> Spam.class_args
   {'a': 1, 'b': 'b'}

The base class ``types.Object`` contains an empty ``__init_subclass__``
method which serves as an endpoint for cooperative multiple inheritance.
Note that this method has no keyword arguments, meaning that all
methods which are more specialized have to process all keyword
arguments.
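
The cooperative handling of keyword arguments described above can be
sketched as follows. This is only an illustration, assuming Python 3.6 or
later, where the hook is provided by plain ``object`` rather than by the
``types.Object`` base class of this draft. The class names ``Named``,
``Versioned``, ``Plugin`` and ``Rejected`` are hypothetical.

```python
class Named:
    # Consumes the "name" keyword and hands the rest to the next class.
    def __init_subclass__(cls, name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.display_name = name or cls.__name__


class Versioned:
    # Consumes the "version" keyword and hands the rest to the next class.
    def __init_subclass__(cls, version=0, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.version = version


class Plugin(Named, Versioned, name="demo", version=2):
    pass


# A keyword that nobody consumes reaches the empty endpoint, which
# accepts no keyword arguments and therefore raises TypeError.
try:
    class Rejected(Named, colour="red"):
        pass
except TypeError as exc:
    leftover_error = exc
else:
    leftover_error = None
```

Each hook takes the keywords it understands and forwards the remainder, so
the endpoint only ever sees an empty set of keyword arguments when the
class statement is valid.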

This general proposal is not a new idea (it was first suggested for
inclusion in the language definition `more than 10 years ago`_, and a
similar mechanism has long been supported by `Zope's ExtensionClass`_),
but the situation has changed sufficiently in recent years that
the idea is worth reconsidering for inclusion.

The second part of the proposal adds a ``__set_owner__``
initializer for class attributes, especially if they are descriptors.
Descriptors are defined in the body of a
class, but they do not know anything about that class; they do not
even know the name they are accessed with. They do get to know their
owner once ``__get__`` is called, but still they do not know their
name. This is unfortunate: for example, they cannot put their
associated value into their object's ``__dict__`` under their name,
since they do not know that name.  This problem has been solved many
times, and is one of the most important reasons to have a metaclass in
a library. While it would be easy to implement such a mechanism using
the first part of the proposal, it makes sense to have one solution
for this problem for everyone.

To give an example of its usage, imagine a descriptor representing weak
referenced values::

    import weakref

    class WeakAttribute:
        def __get__(self, instance, owner):
            # call the stored weak reference to obtain the referent
            return instance.__dict__[self.name]()

        def __set__(self, instance, value):
            instance.__dict__[self.name] = weakref.ref(value)

        # this is the new initializer:
        def __set_owner__(self, owner, name):
            self.name = name

While this example looks very trivial, it should be noted that until
now such an attribute cannot be defined without the use of a metaclass.
And given that such a metaclass can make life very hard, this kind of
attribute does not exist yet.
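
To see the descriptor in use, the following sketch pairs it with a tiny
stand-in metaclass that calls ``__set_owner__`` on every attribute. The
metaclass ``SetOwnerMeta`` and the classes ``Node`` and ``TreeNode`` are
hypothetical names for this illustration and are not the implementation
proposed here. In this sketch ``__get__`` calls the stored weak reference
so that the referent itself is returned.

```python
import weakref


class SetOwnerMeta(type):
    """Hypothetical stand-in that only implements the __set_owner__ hook."""

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)
        for attr_name, value in namespace.items():
            hook = getattr(type(value), "__set_owner__", None)
            if hook is not None:
                hook(value, cls, attr_name)


class WeakAttribute:
    def __get__(self, instance, owner):
        # Dereference the weak reference stored under the attribute name.
        return instance.__dict__[self.name]()

    def __set__(self, instance, value):
        instance.__dict__[self.name] = weakref.ref(value)

    def __set_owner__(self, owner, name):
        self.name = name


class Node:
    pass


class TreeNode(metaclass=SetOwnerMeta):
    parent = WeakAttribute()


root = Node()
child = TreeNode()
child.parent = root            # stored as a weak reference
same_object = child.parent is root
```

The descriptor never had to be told its own name: the metaclass supplied
``"parent"`` when ``TreeNode`` was created.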

The third part of the proposal is to leave a tuple called
``__attribute_order__`` in the class that contains the order in which
the attributes were defined. This is a very common use case: many
libraries use an ``OrderedDict`` to store this order. The tuple is a very
simple way to achieve the same goal.

Under the hood, the implementation *does* ``__prepare__`` an
``OrderedDict`` namespace; it just retains the order of the keys in
``__attribute_order__``, since ``type.__new__`` will cripple the
``OrderedDict`` into a normal ``dict``, discarding the order
information.
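
A minimal sketch of this mechanism is shown below. It is not the reference
implementation; the metaclass name ``OrderRecordingMeta`` is hypothetical,
and skipping double-underscore names such as ``__module__`` is an
assumption made here to keep the resulting tuple readable.

```python
from collections import OrderedDict


class OrderRecordingMeta(type):
    """Hypothetical sketch that only records the attribute order."""

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # The class body is executed in this ordered namespace.
        return OrderedDict()

    def __new__(mcs, name, bases, namespace, **kwargs):
        cls = super().__new__(mcs, name, bases, dict(namespace), **kwargs)
        # Assumption of this sketch: dunder names are left out.
        cls.__attribute_order__ = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Record(metaclass=OrderRecordingMeta):
    zeta = 1
    alpha = 2

    def method(self):
        return self.zeta + self.alpha


order = Record.__attribute_order__   # ('zeta', 'alpha', 'method')
```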

Key Benefits
============

Easier inheritance of definition time behaviour
-----------------------------------------------

Understanding Python's metaclasses requires a deep understanding of
the type system and the class construction process. This is legitimately
seen as challenging, due to the need to keep multiple moving parts (the code,
the metaclass hint, the actual metaclass, the class object, instances of the
class object) clearly distinct in your mind. Even when you know the rules,
it's still easy to make a mistake if you're not being extremely careful.

Understanding the proposed implicit class initialization hook only requires
ordinary method inheritance, which isn't quite as daunting a task. The new
hook provides a more gradual path towards understanding all of the phases
involved in the class definition process.

Reduced chance of metaclass conflicts
-------------------------------------

One of the big issues that makes library authors reluctant to use metaclasses
(even when they would be appropriate) is the risk of metaclass conflicts.
These occur whenever two unrelated metaclasses are used by the desired
parents of a class definition. This risk also makes it very difficult to
*add* a metaclass to a class that has previously been published without one.

By contrast, adding an ``__init_subclass__`` method to an existing type poses
a similar level of risk to adding an ``__init__`` method: technically, there
is a risk of breaking poorly implemented subclasses, but when that occurs,
it is recognised as a bug in the subclass rather than the library author
breaching backwards compatibility guarantees.
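
The difference between the two approaches can be sketched as follows. The
first half shows the conflict and the manual fix with two unrelated
metaclasses; the second half shows two bases that use the hook instead and
combine without extra work. All class names are hypothetical, and the
second half assumes Python 3.6 or later, where ``__init_subclass__`` is
available on plain classes.

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class BaseA(metaclass=MetaA):
    pass


class BaseB(metaclass=MetaB):
    pass


# Combining the two bases directly fails with a metaclass conflict.
try:
    class Broken(BaseA, BaseB):
        pass
except TypeError as exc:
    conflict_message = str(exc)
else:
    conflict_message = None


# The manual fix: the user has to write a combined metaclass.
class MetaAB(MetaA, MetaB):
    pass


class Fixed(BaseA, BaseB, metaclass=MetaAB):
    pass


# With the hook, both bases cooperate through ordinary inheritance.
class HookA:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.seen_by_a = True


class HookB:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.seen_by_b = True


class Combined(HookA, HookB):
    pass
```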

A path of introduction into Python
==================================

Most of the benefits of this PEP can already be implemented using
a simple metaclass. For the ``__init_subclass__`` hook this works
all the way down to Python 2.7, while the attribute order needs Python 3.0
to work. Such a class has been `uploaded to PyPI`_.

The only drawbacks of such a metaclass are the mentioned problems with
metaclasses and multiple inheritance. Two classes using such a
metaclass can only be combined if they use exactly the same
metaclass. This fact calls for the inclusion of such a class into the
standard library, as ``types.Type``, with a ``types.Object`` base class
using it. Once all users use this standard
library metaclass, classes from different packages can easily be
combined.

But still such classes cannot be easily combined with other classes
using other metaclasses. Authors of metaclasses should bear that in
mind and inherit from the standard metaclass if it seems useful
for users of the metaclass to add more functionality. Ultimately,
if the need for combining with other metaclasses is strong enough,
the proposed functionality may be introduced into Python's ``type``.

Those arguments strongly point to the following procedure for including
the proposed functionality in Python:

1. The metaclass implementing this proposal is put onto PyPI, so that
   it can be used and scrutinized.
2. Introduce this class into the Python 3.6 standard library.
3. Consider this as the default behavior for Python 3.7.

Steps 2 and 3 would be similar to how the ``set`` datatype was first
introduced as ``sets.Set``, and only later made a builtin type (with a
slightly different API) based on wider experiences with the ``sets``
module.

While the metaclass is still in the standard library and not in the
language, it may still clash with other metaclasses.  The most
prominent metaclass in use is probably ABCMeta.  It is also a
particularly good example of the need to combine metaclasses. For
users who want to define an ABC with subclass initialization, we should
support a ``types.ABCMeta`` class, or let ``abc.ABCMeta`` inherit from this
PEP's metaclass. As it turns out, most of the behavior of ``abc.ABCMeta``
can be achieved with our ``types.Type``, except its core behavior,
``__instancecheck__`` and ``__subclasscheck__``, which can be supplied,
as per the definition of the Python language, exclusively in a metaclass.

Extensions written in C or C++ also often define their own metaclass.
It would be very useful if those could also inherit from the metaclass
defined here, but this is probably not possible.

New Ways of Using Classes
=========================

This proposal has many use cases like the following. In the examples,
we still inherit from the ``Object`` base class (``types.Object``). This
would become unnecessary once this PEP is included in Python directly.

Subclass registration
---------------------

Especially when writing a plugin system, one likes to register new
subclasses of a plugin baseclass. This can be done as follows::

   class PluginBase(Object):
       subclasses = []

       def __init_subclass__(cls, **kwargs):
           super().__init_subclass__(**kwargs)
           cls.subclasses.append(cls)

In this example, ``PluginBase.subclasses`` will contain a plain list of all
subclasses in the entire inheritance tree.  One should note that this also
works nicely as a mixin class.
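
A short usage sketch of the registration pattern follows. It assumes
Python 3.6 or later, where the hook works on a plain class, so the
``Object`` base class is left out here. The plugin class names are
hypothetical.

```python
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # The list lives on PluginBase and is shared by the whole tree.
        cls.subclasses.append(cls)


class JsonPlugin(PluginBase):
    pass


class YamlPlugin(PluginBase):
    pass


class StrictJsonPlugin(JsonPlugin):
    # Indirect subclasses are registered as well.
    pass


registered = [plugin.__name__ for plugin in PluginBase.subclasses]
# ['JsonPlugin', 'YamlPlugin', 'StrictJsonPlugin']
```

Note that ``PluginBase`` itself is not in the list, since the hook only
runs for subclasses of the class that defines it.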

Trait descriptors
-----------------

There are many designs of Python descriptors in the wild which, for
example, check boundaries of values. Often those "traits" need some support
of a metaclass to work. This is what this would look like with this
PEP::

   class Trait:
       def __get__(self, instance, owner):
           return instance.__dict__[self.key]

       def __set__(self, instance, value):
           instance.__dict__[self.key] = value

       def __set_owner__(self, owner, name):
           self.key = name
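
As an illustration of the boundary checking mentioned above, the sketch
below adds a range-checked subclass of the trait. The metaclass
``SetOwnerMeta`` is a hypothetical stand-in that only calls the
``__set_owner__`` hook; ``BoundedTrait`` and ``Sensor`` are likewise
made-up names and not part of the proposal.

```python
class SetOwnerMeta(type):
    """Hypothetical stand-in that only implements the __set_owner__ hook."""

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)
        for attr_name, value in namespace.items():
            hook = getattr(type(value), "__set_owner__", None)
            if hook is not None:
                hook(value, cls, attr_name)


class Trait:
    def __get__(self, instance, owner):
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        instance.__dict__[self.key] = value

    def __set_owner__(self, owner, name):
        self.key = name


class BoundedTrait(Trait):
    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __set__(self, instance, value):
        # Reject values outside the closed interval [low, high].
        if not self.low <= value <= self.high:
            raise ValueError(
                f"{self.key} must be between {self.low} and {self.high}"
            )
        super().__set__(instance, value)


class Sensor(metaclass=SetOwnerMeta):
    temperature = BoundedTrait(-40, 125)


sensor = Sensor()
sensor.temperature = 20

try:
    sensor.temperature = 500
except ValueError as exc:
    rejected = str(exc)
else:
    rejected = None
```

The error message can name the attribute because the trait learned its
key through the hook.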

Rejected Design Options
=======================

Calling the hook on the class itself
------------------------------------

Adding an ``__autodecorate__`` hook that would be called on the class
itself was the proposed idea of PEP 422.  Most examples work the same
way or even better if the hook is called on the subclass. In general,
it is much easier to explicitly call the hook on the class in which it
is defined (to opt-in to such a behavior) than to opt-out, meaning
that one does not want the hook to be called on the class it is
defined in.

This becomes most evident if the class in question is designed as a
mixin: it is very unlikely that the code of the mixin is to be
executed for the mixin class itself, as it is not supposed to be a
complete class on its own.

The original proposal also made major changes in the class
initialization process, rendering it impossible to back-port the
proposal to older Python versions.

More importantly, having a pure Python implementation allows us to
take two preliminary steps before we actually change the
interpreter, giving us the chance to iron out all possible wrinkles
in the API.

Other variants of calling the hook
----------------------------------

Other names for the hook were presented, namely ``__decorate__`` or
``__autodecorate__``. This proposal opts for ``__init_subclass__`` as
it is very close to the ``__init__`` method, just for the subclass,
while it is not very close to decorators, as it does not return the
class.

Requiring an explicit decorator on ``__init_subclass__``
--------------------------------------------------------

One could require the explicit use of ``@classmethod`` on the
``__init_subclass__`` method. It was made implicit since there's no
sensible interpretation for leaving it out, and that case would need
to be detected anyway in order to give a useful error message.

This decision was reinforced after noticing that the user experience of
defining ``__prepare__`` and forgetting the ``@classmethod`` method
decorator is singularly incomprehensible (particularly since PEP 3115
documents it as an ordinary method, and the current documentation doesn't
explicitly say anything one way or the other).
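
The ``__prepare__`` pitfall mentioned above can be reproduced with a small
sketch. The metaclass names are hypothetical. Without the decorator, the
class statement calls the method as a plain function, so the class name
lands in the first parameter and the last positional argument is missing.

```python
class ForgetfulMeta(type):
    # Deliberately missing @classmethod, as in the mistake described above.
    def __prepare__(mcs, name, bases, **kwargs):
        return {}


try:
    class Broken(metaclass=ForgetfulMeta):
        pass
except TypeError as exc:
    # The message complains about a missing argument, not about
    # a missing decorator.
    error = exc
else:
    error = None


class CarefulMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return {}


class Works(metaclass=CarefulMeta):
    pass
```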

Defining arbitrary namespaces
-----------------------------

PEP 422 defined a generic way to add arbitrary namespaces for class
definitions. This approach is much more flexible than just leaving
the definition order in a tuple. The ``__prepare__`` method in a metaclass
supports exactly this behavior. But given that effectively
the only use cases that could be found out in the wild were the
``OrderedDict`` way of determining the attribute order, it seemed
reasonable to only support this special case.

The metaclass described in this PEP has been designed to be very simple
such that it could be reasonably made the default metaclass. This was
especially important when designing the attribute order functionality:
This was a highly demanded feature and has been enabled through the
``__prepare__`` method of metaclasses. This method can be abused in
very weird ways, making it hard to correctly maintain this feature in
CPython. This is why it has been proposed to deprecate this feature,
and instead use ``OrderedDict`` as the standard namespace, supporting
the most important feature while dropping most of the complexity. But
this would have meant that ``OrderedDict`` becomes a language builtin
like dict and set, and not just a standard library class. The choice
of the ``__attribute_order__`` tuple is a much simpler solution to the
problem.

A more ``__new__``-like hook
----------------------------

In PEP 422 the hook worked more like the ``__new__`` method than the
``__init__`` method, meaning that it returned a class instead of
modifying one. This allows a bit more flexibility, but at the cost
of much harder implementation and undesired side effects.

History
=======

This used to be a competing proposal to PEP 422 by Nick Coghlan and Daniel
Urban. PEP 422 intended to achieve the same goals as this PEP, but with a
different way of implementation.  In the meantime, PEP 422 has been withdrawn
in favour of this approach.

References
==========

.. _published code:
   http://mail.python.org/pipermail/python-dev/2012-June/119878.html

.. _more than 10 years ago:
   http://mail.python.org/pipermail/python-dev/2001-November/018651.html

.. _Zope's ExtensionClass:
   http://docs.zope.org/zope_secrets/extensionclass.html

.. _uploaded to PyPI:
   https://pypi.python.org/pypi/metaclass

Copyright
=========

This document has been placed in the public domain.

From kevin-lists at theolliviers.com  Thu Jun 16 15:22:12 2016
From: kevin-lists at theolliviers.com (Kevin Ollivier)
Date: Thu, 16 Jun 2016 12:22:12 -0700
Subject: [Python-Dev] Discussion overload
Message-ID: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>

Hi all,

Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic.

There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign. The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own.

For someone like me, a new joiner, seeing this makes me feel like wanting to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at some point what inevitably happens is that the project starts to lose members who feel that even just trying to follow the discussions is eating up too much of their time. It really can suck the energy right out of a community. I don't want to see that happen to Python. I had a blast at PyCon, my first, and I really came away feeling more than ever that the community you have here is really special. The one problem I felt concerned about though, was that the core dev community risked a sense of paralysis caused by having too many cooks in the kitchen and too much worry about the potential unseen ramifications of changing things. That creates a sort of paralysis and difficulty achieving consensus on anything that, eventually, causes projects to slowly decline and be disrupted by a more agile alternative.

Please consider taking a step back from this issue. Take a deep breath, and consider responding more slowly and letting people's points stew in your head for a day or two first. (Including this one pls. :) Python will not implode if you don't get that email out right away. If I understand what I've read of this torrent of messages correctly, we don't even know if there's a single real world use case where a user of os.urandom is hitting the same problem CPython did, so we don't even know if the blocking at startup issue is actually even happening in any real world Python code out there. It's clearly far from a rampant problem, in any case. Stop and think about that for a second. This is, in practice, potentially a complete non-issue. Fixing it in any number of ways may potentially change things for no one at all. You could even introduce a real problem while trying to fix a hypothetical one. There are more than enough real problems to deal with, so why push hypothetical problems to the top of your priority list?

It's too easy to get caught up in the abstract nature of problems and to lose sight of the real people and code behind them, or sometimes, the lack thereof. Be practical, be pragmatic. Before you hit that reply button, think - in a practical sense, of all the things I could be doing right now, is this discussion the place where my involvement could generate the greatest positive impact for the project? Is this the biggest and most substantial problem the project should be focusing on right now? Projects and developers who know how to manage focus go on to achieve the greatest things, in my experience.

Having been critical, I will end with a compliment. :) It is nice to see that with only a couple small exceptions, this discussion has remained very civil and respectful, which should be expected, but I know from experience that far too often these discussions start to take a nasty tone as people get frustrated. This is one of the things I really do love about the Python community, and it's one reason I want to see both the product and community grow and succeed even more. That, in fact, is why I'm choosing to write this message first rather than simply unsubscribe.

Kevin

From p.f.moore at gmail.com  Thu Jun 16 15:33:16 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 16 Jun 2016 20:33:16 +0100
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7dw__xoga9A3R8G66TpwcC0XbWs056q82TtioJCUr9mBw@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CACac1F_xcCtb7S=iKDhiMsW9QODLBhtUaoG4YAqPOKAbDoPOaw@mail.gmail.com>
 <CADiSq7dw__xoga9A3R8G66TpwcC0XbWs056q82TtioJCUr9mBw@mail.gmail.com>
Message-ID: <CACac1F_jQ=S0wAYApAG+OviNSUhFvPNssYYHUBV5pDCqEgKzvA@mail.gmail.com>

On 16 June 2016 at 18:03, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 16 June 2016 at 09:39, Paul Moore <p.f.moore at gmail.com> wrote:
>> I'm willing to accept the view of the security experts that there's a
>> problem here. But without a clear explanation of the problem, how can
>> a non-specialist like myself have an opinion? (And I hope the security
>> POV isn't "you don't need an opinion, just do as we say").
>
> If you're not writing Linux (and presumably *BSD) scripts and
> applications that run during system initialisation or on embedded ARM
> hardware with no good sources of randomness, then there's zero chance
> of any change made in relation to this affecting you (Windows and Mac
> OS X are completely immune, since they don't allow Python scripts to
> run early enough in the boot sequence for there to ever be a problem).

Understood. I could quite happily ignore this thread for all the
impact it will have on me. However, I've seen enough of these debates
(and witnessed the frustration of the security advocates) that I want
to try to understand the issues better - as much as anything so that I
don't end up adding uninformed opposition to these threads (in my day
job, unfortunately, security is generally the excuse for all sorts of
counter-productive rules, and never offers any practical benefits that
I am aware of, so I'm predisposed to rejecting arguments based on
security - that background isn't accurate in this environment and I'm
actively trying to counter it).

> The only question at hand is what CPython should do in the case where
> the operating system *does* let Python scripts run before the system
> random number generator is ready, and the application calls a security
> sensitive API that relies on that RNG:
>
> - throw BlockingIOError (so the script developer knows they have a
> potential problem to fix)
> - block (so the script developer has a system hang to debug)
> - return low quality random data (so the script developer doesn't even
> know they have a potential problem)
>
> The last option is the status quo, and has a remarkable number of
> vocal defenders.

Understood. It seems to me that there are two arguments here -
backward compatibility (which is always a pressure, but sometimes
applied too vigorously and not always consistently) and "we've always
done it that way" (aka "people will have to consider what happens when
they run under 3.4 anyway, so how will changing help?"). Judging
backward compatibility is always a matter of trade-offs, hence my
interest in the actual benefits.

> The second option is what we changed the behaviour to in 3.5 as a side
> effect of switching to a syscall to save a file descriptor (and *also*
> inadvertently made a gating requirement for CPython starting at all,
> without which I'd be very surprised if anyone actually noticed the
> potentially blocking behaviour in os.urandom itself)

OK, so (given that the issue of CPython starting at all was an
accidental, and now corrected, side effect) why is this so bad? Maybe
not in a minor release, but at least for 3.6? How come this has caused
such a fuss? I genuinely don't understand why people see blocking as
such an issue (and as far as I can tell, Ted Tso seems to agree). The
one case where this had an impact was a quickly fixed bug - so as far
as I can tell, the risk of problems caused by blocking is purely
hypothetical.

> The first option is the one I'm currently writing a PEP for, since it
> makes the longstanding advice to use os.urandom() as the low level
> random data API for security sensitive operations unequivocally
> correct (as it will either do the right thing, or throw an exception
> which the developer can handle as appropriate for their particular
> application)

In my code, I typically prefer Python to make detailed decisions for
me (e.g. requests follows redirects by default, it doesn't expect me
to do so manually). Now certainly this is a low-level interface so the
rules are different, but I don't see why blocking by default isn't
"unequivocally correct" in the same way that it is on other platforms,
rather than raising an exception and requiring the developer to do the
wait manually. (What else would they do - fall back to insecure data?
I thought the point here was that that's the wrong thing to do?)
Having a blocking default with a non-blocking version seems just as
arguable, and has the advantage that naive users (I don't even know if
we're allowing for naive users here) won't get an unexpected exception
and handle it badly because they don't know what to do (a sadly common
practice in my experience).

OK. Guido has pronounced, you're writing a PEP. None of this debate is
really constructive any more. But I still don't understand the
trade-offs, which frustrates me. Surely security isn't so hard that it
can't be explained in a way that an interested layman like myself can
follow? :-(

Paul

From barry at python.org  Thu Jun 16 16:09:40 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 16 Jun 2016 23:09:40 +0300
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
Message-ID: <20160616230940.570b8553.barry@wooz.org>

On Jun 16, 2016, at 01:01 PM, David Mertz wrote:

>It seems to me that backporting 'secrets' and putting it on Warehouse would
>be a lot more productive than complaining about 3.5.2 reverting to (almost)
>the behavior of 2.3-3.4.

Very wise suggestion indeed.  We have all kinds of stdlib modules backported
and released as third party packages.  Why not secrets too?  If such were on
PyPI, I'd happily package it up for the Debian ecosystem.  Problem solved
<wink>.

But I'm *really* going to try to disengage from this discussion until Nick's
PEP is posted.

Cheers,
-Barry

From ncoghlan at gmail.com  Thu Jun 16 16:50:55 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Jun 2016 13:50:55 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <20160616230940.570b8553.barry@wooz.org>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
 <20160616230940.570b8553.barry@wooz.org>
Message-ID: <CADiSq7dRzOR0ACmq8xoTJFu1xgNszVk_E4E3My=TyA05h=qOxw@mail.gmail.com>

On 16 June 2016 at 13:09, Barry Warsaw <barry at python.org> wrote:
> On Jun 16, 2016, at 01:01 PM, David Mertz wrote:
>
>>It seems to me that backporting 'secrets' and putting it on Warehouse would
>>be a lot more productive than complaining about 3.5.2 reverting to (almost)
>>the behavior of 2.3-3.4.
>
> Very wise suggestion indeed.  We have all kinds of stdlib modules backported
> and released as third party packages.  Why not secrets too?  If such were on
> PyPI, I'd happily package it up for the Debian ecosystem.  Problem solved
> <wink>.

The secrets module is just a collection of one liners pulling together
other stdlib components that have been around for years - the main
problem it aims to address is one of discoverability (rather than one
of code complexity), while also eliminating the "simulation is in the
standard library, secrecy requires a third party module" discrepancy
in the long term.

Once you're aware the problem exists, the easiest way to use it in a
version independent manner is to just copy the relevant snippet into
your own project's utility library - adding an entire new dependency
to your project just for those utility functions would be overkill.
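
As a rough illustration of what such copied snippets might look like,
here is a sketch built only from long-standing stdlib pieces. The helper
names mirror those in the secrets module, but these are local copies for a
project's own utility library, and the default of 32 bytes as well as the
name ``tokens_equal`` are assumptions of this sketch.

```python
import base64
import binascii
import hmac
import os

DEFAULT_ENTROPY = 32  # bytes; an assumed default for this sketch


def token_bytes(nbytes=DEFAULT_ENTROPY):
    """Return nbytes random bytes from the operating system."""
    return os.urandom(nbytes)


def token_hex(nbytes=DEFAULT_ENTROPY):
    """Return a random hex string with two digits per byte."""
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_urlsafe(nbytes=DEFAULT_ENTROPY):
    """Return a random URL-safe string without base64 padding."""
    encoded = base64.urlsafe_b64encode(token_bytes(nbytes))
    return encoded.rstrip(b"=").decode("ascii")


def tokens_equal(a, b):
    """Compare two tokens in a way that resists timing analysis."""
    return hmac.compare_digest(a, b)
```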

If you *do* add a dependency, you'd typically be better off with
something more comprehensive and tailored to the particular problem
domain you're dealing with, like passlib or cryptography or
itsdangerous.

Cheers,
Nick.

P.S. Having the secrets module available on PyPI wouldn't *hurt*, I
just don't think it would help much.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Thu Jun 16 16:24:04 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 16 Jun 2016 14:24:04 -0600
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
Message-ID: <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>

On Thu, Jun 16, 2016 at 12:56 PM, Martin Teichmann
<lkb.teichmann at gmail.com> wrote:
> I am looking forward to a lot of comments on this!

I'd be glad to give feedback on this, probably later today or
tomorrow.  In particular, I'd like to help resolve the intersection
with PEP 520. :)

-eric

From lkb.teichmann at gmail.com  Thu Jun 16 17:17:14 2016
From: lkb.teichmann at gmail.com (Martin Teichmann)
Date: Thu, 16 Jun 2016 23:17:14 +0200
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
Message-ID: <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>

Hi Eric, hi List,

> I'd be glad to give feedback on this, probably later today or
> tomorrow.  In particular, I'd like to help resolve the intersection
> with PEP 520. :)

Thanks in advance! Let me already elaborate on the differences, so
that others can follow:

You chose the name "__definition_order__", I chose
"__attribute_order__", I am fine with either, what are other people's
opinions?

The bigger difference is actually the path to inclusion into Python:
my idea is to first make it a standard library feature, with the later
option to put it into the C core, while you want to put the feature
directly into the C core. Again I'm fine with either, as long as the
feature is eventually in.

As a side note, you propose to use OrderedDict as the class definition
namespace, and this is exactly how I implemented it. Nonetheless, I
would like to keep this fact as an implementation detail, such that
other implementations of Python (PyPy comes to mind) or even CPython
at a later time may switch to a different way to implement this
feature. I am thinking especially about the option to determine the
-_order__ already at compile time. Sure, this would mean that someone
could trick us by dynamically changing the order of attribute
definition, but I would document that as an abuse of the functionality
with undocumented outcome.

Greetings

Martin

From ncoghlan at gmail.com  Thu Jun 16 17:36:36 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Jun 2016 14:36:36 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
Message-ID: <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>

On 16 June 2016 at 14:17, Martin Teichmann <lkb.teichmann at gmail.com> wrote:
> As a side note, you propose to use OrderedDict as the class definition
> namespace, and this is exactly how I implemented it. Nonetheless, I
> would like to keep this fact as an implementation detail, such that
> other implementations of Python (PyPy comes to mind) or even CPython
> at a later time may switch to a different way to implement this
> feature. I am thinking especially about the option to determine the
> __order__ already at compile time. Sure, this would mean that someone
> could trick us by dynamically changing the order of attribute
> definition, but I would document that as an abuse of the functionality
> with undocumented outcome.

I don't think that's a side note, I think it's an important point (and
relates to one of Nikita's questions as well): we have the option of
carving out certain aspects of PEP 520 as CPython implementation
details.

In particular, the language level guarantee can be that "class
statements set __definition_order__ by default, but may not do so when
using a metaclass that returns a custom namespace from __prepare__",
with the implementation detail that CPython does that by using
collections.OrderedDict for the class namespace by default.

An implementation like PyPy, with an inherently ordered standard dict
implementation, can just rely on that rather than being obliged to
switch to their full collections.OrderedDict type.

However, I don't think we should leave the compile-time vs runtime
definition order question as an implementation detail - I think we
should be explicit that the definition order attribute captures the
runtime definition order, with conditionals, loops and reassignment
being handled accordingly.
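
A rough illustration of what "runtime definition order" means in
practice. It assumes Python 3.6 or later, where the class __dict__
keeps insertion order; the helper definition_order and the names FLAG
and Example are made up for this sketch. How reassignment should be
treated is left open above; the sketch only shows what a plain ordered
namespace does.

```python
FLAG = True  # stands in for any condition evaluated at class creation time


def definition_order(cls):
    """Return the non-dunder names of cls in the order they were first bound."""
    return tuple(
        name for name in vars(cls)
        if not (name.startswith("__") and name.endswith("__"))
    )


class Example:
    first = 1
    if FLAG:
        second = 2          # bound only because FLAG is true at runtime
    else:
        alternative = 2     # never executed, so never recorded
    for index in range(3):
        last_seen = index   # bound on the first iteration, rebound afterwards
    first = 100             # rebinding keeps the original position


print(definition_order(Example))  # ('first', 'second', 'index', 'last_seen')
```

A purely compile-time analysis could not know which branch of the
conditional ran, which is the difference being discussed here.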

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Thu Jun 16 17:57:16 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 16 Jun 2016 15:57:16 -0600
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
Message-ID: <CALFfu7AgNraSSJ9CZe3B=N-uYgmuQx-8aVJT-WBn4ct-1m8P+w@mail.gmail.com>

On Thu, Jun 16, 2016 at 3:36 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I don't think that's a side note, I think it's an important point (and
> relates to one of Nikita's questions as well): we have the option of
> carving out certain aspects of PEP 520 as CPython implementation
> details.
>
> In particular, the language level guarantee can be that "class
> statements set __definition_order__ by default, but may not do so when
> using a metaclass that returns a custom namespace from __prepare__",
> with the implementation detail that CPython does that by using
> collections.OrderedDict for the class namespace by default.
>
> An implementation like PyPy, with an inherently ordered standard dict
> implementation, can just rely on that rather than being obliged to
> switch to their full collections.OrderedDict type.

Excellent point from you both. :)  I'll rework PEP 520 accordingly (to
focus on __definition_order__).  At that point I expect the definition
order part of PEP 487 could be dropped (as redundant).

>
> However, I don't think we should leave the compile-time vs runtime
> definition order question as an implementation detail - I think we
> should be explicit that the definition order attribute captures the
> runtime definition order, with conditionals, loops and reassignment
> being handled accordingly.

Yeah, I'll make that clear.

We can discuss these changes in a separate thread once I've updated
PEP 520.  So let's focus back on the rest of PEP 487! :)

-eric

From nikita at nemkin.ru  Thu Jun 16 18:24:15 2016
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Fri, 17 Jun 2016 03:24:15 +0500
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
Message-ID: <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>

On Fri, Jun 17, 2016 at 2:36 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 16 June 2016 at 14:17, Martin Teichmann <lkb.teichmann at gmail.com> wrote:

> An implementation like PyPy, with an inherently ordered standard dict
> implementation, can just rely on that rather than being obliged to
> switch to their full collections.OrderedDict type.

I didn't know that PyPy has actually implemented packed ordered dicts!
https://morepypy.blogspot.ru/2015/01/faster-more-memory-efficient-and-more.html
https://mail.python.org/pipermail/python-dev/2012-December/123028.html

This old idea by Raymond Hettinger is vastly superior to __definition_order__
duct tape (now that PyPy has validated it).
It also gives kwarg order for free, which is important in many metaprogramming
scenarios.
Not to mention memory usage reduction and dict operations speedup...
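
To show what "kwarg order for free" would look like, here is a small
sketch. It assumes an interpreter whose dict preserves insertion order
and which collects **kwargs in call order (CPython 3.6 and later behave
this way); make_record is a hypothetical helper.

```python
def make_record(**fields):
    """Return the keyword arguments as (name, value) pairs in call order."""
    return list(fields.items())


record = make_record(name="spam", size=3, colour="pink")
print(record)  # [('name', 'spam'), ('size', 3), ('colour', 'pink')]
```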

From mertz at gnosis.cx  Thu Jun 16 18:33:42 2016
From: mertz at gnosis.cx (David Mertz)
Date: Thu, 16 Jun 2016 15:33:42 -0700
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <CADiSq7dRzOR0ACmq8xoTJFu1xgNszVk_E4E3My=TyA05h=qOxw@mail.gmail.com>
References: <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
 <20160612061142.GA1986@thunk.org>
 <147ACCD6-17A5-42DE-A3C6-15758F45D289@lukasa.co.uk>
 <20160612134315.GC1986@thunk.org>
 <1A3E7FD6-4BF5-4097-BEC3-77EAB6956487@lukasa.co.uk>
 <20160612232803.GB17328@thunk.org>
 <CAPJVwBkaVe1xTck98EgHqCMG=WC32cBb3xo_84x3-BBeq8Dt3w@mail.gmail.com>
 <20160613122654.GE17328@thunk.org>
 <CADiSq7cxZ6-h+oo1eDg+7u+O0J3nSZ+v-D+6D47UG5C9XXh+xw@mail.gmail.com>
 <20160616094508.3acf1de7.barry@wooz.org>
 <CAPJVwBkWT0nX5aWE=Q5AWxRbg7fD5xfQEWKu=mCZNKVp6X=aeQ@mail.gmail.com>
 <loom.20160616T100935-276@post.gmane.org>
 <CAPJVwBnMZiDK7ZTnUbkgtSNvitEv=zTAhSZoK--1U7oicEpA5g@mail.gmail.com>
 <CAEbHw4aY0EbbxYeO10PnwQNafmGd=sAYvQijnjBrCErR0Akrog@mail.gmail.com>
 <20160616230940.570b8553.barry@wooz.org>
 <CADiSq7dRzOR0ACmq8xoTJFu1xgNszVk_E4E3My=TyA05h=qOxw@mail.gmail.com>
Message-ID: <CAEbHw4Y7OEcODcksQQg943G0GJXjMYPwpe2Khr-DRHAVXeVrAA@mail.gmail.com>

Yes, 'secrets' is a collection of one-liners. However, it might grow a few more lines around
the blocking in getrandom() on Linux. But still, not more than a few.

But the reason it should be on PyPI is so that programs can have a uniform
API across various Python versions. There's no real reason that someone
stuck on Python 2.7 or 3.3 shouldn't be able to include the future-style:

import secrets
answer = secrets.token_bytes(42)
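
Until such a backport exists, a project can get the same call with a
small shim. This is only a sketch: the fallback function and the
DEFAULT_NBYTES constant are assumptions for illustration, and on older
versions it has whatever blocking behaviour os.urandom has there.

```python
import os

try:
    # Python 3.6+: use the standard library implementation.
    from secrets import token_bytes
except ImportError:
    DEFAULT_NBYTES = 32  # assumption: intended to mirror the stdlib default

    def token_bytes(nbytes=None):
        """Return nbytes random bytes from the operating system's CSPRNG."""
        if nbytes is None:
            nbytes = DEFAULT_NBYTES
        return os.urandom(nbytes)


answer = token_bytes(42)
print(len(answer))  # 42
```
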

On Jun 16, 2016 4:53 PM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:

> On 16 June 2016 at 13:09, Barry Warsaw <barry at python.org> wrote:
> > On Jun 16, 2016, at 01:01 PM, David Mertz wrote:
> >
> >>It seems to me that backporting 'secrets' and putting it on Warehouse
> >>would be a lot more productive than complaining about 3.5.2 reverting
> >>to (almost) the behavior of 2.3-3.4.
> >
> > Very wise suggestion indeed.  We have all kinds of stdlib modules
> > backported and released as third party packages.  Why not secrets too?
> > If such were on PyPI, I'd happily package it up for the Debian
> > ecosystem.  Problem solved <wink>.
>
> The secrets module is just a collection of one liners pulling together
> other stdlib components that have been around for years - the main
> problem it aims to address is one of discoverability (rather than one
> of code complexity), while also eliminating the "simulation is in the
> standard library, secrecy requires a third party module" discrepancy
> in the long term.
>
> Once you're aware the problem exists, the easiest way to use it in a
> version independent manner is to just copy the relevant snippet into
> your own project's utility library - adding an entire new dependency
> to your project just for those utility functions would be overkill.
>
> If you *do* add a dependency, you'd typically be better off with
> something more comprehensive and tailored to the particular problem
> domain you're dealing with, like passlib or cryptography or
> itsdangerous.
>
> Cheers,
> Nick.
>
> P.S. Having the secrets module available on PyPI wouldn't *hurt*, I
> just don't think it would help much.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Thu Jun 16 20:27:51 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 Jun 2016 17:27:51 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
Message-ID: <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>

Hi Kevin,

I often feel the same way. Are you using GMail? It combines related
messages in threads and lets you mute threads. I often use this feature so
I can manage my inbox. (I presume other mailers have the same features, but
I don't know if all of them do.) There are also many people who read the
list on a website, e.g. gmane. (Though I think that sometimes the delays
incurred there add to the noise -- e.g. when a decision is reached on the
list sometimes people keep responding to earlier threads.)

--Guido (don't get me started on top-posting :-)

On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier <
kevin-lists at theolliviers.com> wrote:

> Hi all,
>
> Recent joiner here, I signed up after PyCon made me want to get more
> involved and have been lurking. I woke up this morning again to about 30
> new messages in my inbox, almost all of which revolve around the os.urandom
> blocking discussion. There are just about hourly new posts showing up on
> this topic.
>
>
>
>
> There is such a thing as too much of a good thing. Discussion of issues is
> certainly good, but so far since joining this list I am seeing too much
> discussion happening too fast, and as someone who has been involved in open
> source for approaching two decades now, frankly, that is not really a good
> sign. The discussions are somewhat overlapping as so many people write back
> so quickly, there are multiple sub-discussions happening at once, and
> really at this point I'm not sure how much new each message is really
> adding, if anything at all. It seems to me the main solutions to this
> problem have all been identified, as have the tradeoffs of each. The
> discussion is now mostly at a point where people are just repeatedly
> debating (or promoting) the merits of their preferred solution and
> tradeoff. It is even spawning more abstract sub-discussions about things
> like project compatibility policies. This discussion has really taken on a
> life of its own.
>
> For someone like me, a new joiner, seeing this makes me feel like wanting
> to simply unsubscribe. I've been on mailing lists where issues get debated
> endlessly, and at some point what inevitably happens is that the project
> starts to lose members who feel that even just trying to follow the
> discussions is eating up too much of their time. It really can suck the
> energy right out of a community. I don't want to see that happen to Python.
> I had a blast at PyCon, my first, and I really came away feeling more than
> ever that the community you have here is really special. The one problem I
> felt concerned about though, was that the core dev community risked a sense
> of paralysis caused by having too many cooks in the kitchen and too much
> worry about the potential unseen ramifications of changing things. That
> creates a sort of paralysis and difficulty achieving consensus on anything
> that, eventually, causes projects to slowly decline and be disrupted by a
> more agile alternative.
>
> Please consider taking a step back from this issue. Take a deep breath,
> and consider responding more slowly and letting people's points stew in
> your head for a day or two first. (Including this one pls. :) Python will
> not implode if you don't get that email out right away. If I understand
> what I've read of this torrent of messages correctly, we don't even know if
> there's a single real world use case where a user of os.urandom is hitting
> the same problem CPython did, so we don't even know if the blocking at
> startup issue is actually even happening in any real world Python code out
> there. It's clearly far from a rampant problem, in any case. Stop and think
> about that for a second. This is, in practice, potentially a complete
> non-issue. Fixing it in any number of ways may potentially change things
> for no one at all. You could even introduce a real problem while trying to
> fix a hypothetical one. There are more than enough real problems to deal
> with, so why push hypothetical problems to the top of your priority list?
>
> It's too easy to get caught up in the abstract nature of problems and to
> lose sight of the real people and code behind them, or sometimes, the lack
> thereof. Be practical, be pragmatic. Before you hit that reply button,
> think - in a practical sense, of all the things I could be doing right now,
> is this discussion the place where my involvement could generate the
> greatest positive impact for the project? Is this the biggest and most
> substantial problem the project should be focusing on right now? Projects
> and developers who know how to manage focus go on to achieve the greatest
> things, in my experience.
>
> Having been critical, I will end with a compliment. :) It is nice to see
> that with only a couple small exceptions, this discussion has remained very
> civil and respectful, which should be expected, but I know from experience
> that far too often these discussions start to take a nasty tone as people
> get frustrated. This is one of the things I really do love about the Python
> community, and it's one reason I want to see both the product and community
> grow and succeed even more. That, in fact, is why I'm choosing to write
> this message first rather than simply unsubscribe.
>
> Kevin
>

-- 
--Guido van Rossum (python.org/~guido)

From kevin-lists at theolliviers.com  Thu Jun 16 22:00:59 2016
From: kevin-lists at theolliviers.com (Kevin Ollivier)
Date: Thu, 16 Jun 2016 19:00:59 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
Message-ID: <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>

Hi Guido,

On Thursday, June 16, 2016 at 5:27 PM, Guido van Rossum <guido at python.org> wrote:
> Hi Kevin,
>
> I often feel the same way. Are you using GMail? It combines related
> messages in threads and lets you mute threads. I often use this feature so
> I can manage my inbox. (I presume other mailers have the same features, but
> I don't know if all of them do.) There are also many people who read the
> list on a website, e.g. gmane. (Though I think that sometimes the delays
> incurred there add to the noise -- e.g. when a decision is reached on the
> list sometimes people keep responding to earlier threads.)

I fear I did quite a poor job of making my point. :( I've been on open source mailing lists since the late 90s, so I've learned strategies for dealing with mailing list overload. I've got my mail folders, my mail rules, etc. Having been on many mailing lists over the years, I've seen many productive discussions and many unproductive ones, and over time you start to see patterns. You also see what happens to those communities over time.

On the mailing lists where discussions become these unwieldy floods with 30-40 posts a day on one topic, over time what I have seen is that the rapid fire of posts generally does not lead to better decisions being made. In fact, it is usually the opposite. Faster discussions are not usually better discussions, and the chance of a gem of knowledge getting lost in the flood of posts is much greater. The more long-term consequence is that people start hesitating to bring up ideas, sometimes even very good ones, simply because even the discussion of them gets to be so draining that it's better to just leave things be. As an example, I do have work to do :) and I know if I were the one who wanted to propose a fix for os.urandom or what have you, waking up to 30 messages I need to read to get caught up each day would be a pretty disheartening prospect, and possibly not even feasible given my work obligations. It raises the bar to participating, in a way.

Perhaps some of this is inherent in mailing list discussions, but really in my experience, just a conscious decision on the part of contributors to slow down the discussion and "think more, write less", can do quite a lot to ensure the discussion is in fact a better one.

I probably should have taken more time to write my initial message, in fact, in order to better coalesce my points into something more succinct and clearly understandable. I somehow managed to convince people I need to learn mail management strategies. :)

Anyway, that is just my $0.02 on the matter. With inflation it accounts for less every day, so make of it what you will. :P

Thanks,

Kevin


From guido at python.org  Thu Jun 16 23:25:51 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 Jun 2016 20:25:51 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
Message-ID: <CAP7+vJJDXLgX_+kKEhstVszaHsfg3fMOMeP6pmj1TneZdq1zyA@mail.gmail.com>

More likely your post was too long... :-(

On Thu, Jun 16, 2016 at 7:00 PM, Kevin Ollivier <
kevin-lists at theolliviers.com> wrote:

> Hi Guido,
>
> From: <gvanrossum at gmail.com> on behalf of Guido van Rossum <
> guido at python.org>
> Reply-To: <guido at python.org>
> Date: Thursday, June 16, 2016 at 5:27 PM
> To: Kevin Ollivier <kevin-lists at theolliviers.com>
> Cc: Python Dev <python-dev at python.org>
> Subject: Re: [Python-Dev] Discussion overload
>
> Hi Kevin,
>
> I often feel the same way. Are you using GMail? It combines related
> messages in threads and lets you mute threads. I often use this feature so
> I can manage my inbox. (I presume other mailers have the same features, but
> I don't know if all of them do.) There are also many people who read the
> list on a website, e.g. gmane. (Though I think that sometimes the delays
> incurred there add to the noise -- e.g. when a decision is reached on the
> list sometimes people keep responding to earlier threads.)
>
>
> I fear I did quite a poor job of making my point. :( I've been on open
> source mailing lists since the late 90s, so I've learned strategies for
> dealing with mailing list overload. I've got my mail folders, my mail
> rules, etc. Having been on many mailing lists over the years, I've seen
> many productive discussions and many unproductive ones, and over time you
> start to see patterns. You also see what happens to those communities over
> time.
>
> On the mailing lists where discussions become these unwieldy floods with
> 30-40 posts a day on one topic, over time what I have seen is that that
> rapid fire of posts generally does not lead to better decisions being made.
> In fact, usually it is the opposite. Faster discussions are not usually
> better discussions, and the chance of a gem of knowledge getting lost
> in the flood of posts is much greater. The more long-term consequence is
> that people start hesitating to bring up ideas, sometimes even very good
> ones, simply because even the discussion of them gets to be so draining
> that it's better to just leave things be. As an example, I do have work to
> do :) and I know if I was the one who had wanted to propose a fix for
> os.urandom or what have you, waking up to 30 messages I need to read to get
> caught up each day would be a pretty disheartening prospect, and possibly
> not even possible with my work obligations. It raises the bar to
> participating, in a way.
>
> Perhaps some of this is inherent in mailing list discussions, but really
> in my experience, just a conscious decision on the part of contributors to
> slow down the discussion and "think more, write less", can do quite a lot
> to ensure the discussion is in fact a better one.
>
> I probably should have taken more time to write my initial message, in
> fact, in order to better coalesce my points into something more succinct
> and clearly understandable. I somehow managed to convince people I need to
> learn mail management strategies. :)
>
> Anyway, that is just my $0.02 on the matter. With inflation it
> accounts for less every day, so make of it what you will. :P
>
> Thanks,
>
> Kevin
>
>
> --Guido (don't get me started on top-posting :-)
>
> On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier <
> kevin-lists at theolliviers.com> wrote:
>
>> Hi all,
>>
>> Recent joiner here, I signed up after PyCon made me want to get more
>> involved and have been lurking. I woke up this morning again to about 30
>> new messages in my inbox, almost all of which revolve around the os.urandom
>> blocking discussion. There are just about hourly new posts showing up on
>> this topic.
>>
>>
>>
>>
>> There is such a thing as too much of a good thing. Discussion of issues
>> is certainly good, but so far since joining this list I am seeing too much
>> discussion happening too fast, and as someone who has been involved in open
>> source for approaching two decades now, frankly, that is not really a good
>> sign. The discussions are somewhat overlapping as so many people write back
>> so quickly, there are multiple sub-discussions happening at once, and
>> really at this point I'm not sure how much new each message is really
>> adding, if anything at all. It seems to me the main solutions to this
>> problem have all been identified, as have the tradeoffs of each. The
>> discussion is now mostly at a point where people are just repeatedly
>> debating (or promoting) the merits of their preferred solution and
>> tradeoff. It is even spawning more abstract sub-discussions about things
>> like project compatibility policies. This discussion has really taken on a
>> life of its own.
>>
>> For someone like me, a new joiner, seeing this makes me feel like wanting
>> to simply unsubscribe. I've been on mailing lists where issues get debated
>> endlessly, and at some point what inevitably happens is that the project
>> starts to lose members who feel that even just trying to follow the
>> discussions is eating up too much of their time. It really can suck the
>> energy right out of a community. I don't want to see that happen to Python.
>> I had a blast at PyCon, my first, and I really came away feeling more than
>> ever that the community you have here is really special. The one problem I
>> felt concerned about though, was that the core dev community risked a sense
>> of paralysis caused by having too many cooks in the kitchen and too much
>> worry about the potential unseen ramifications of changing things. That
>> creates a sort of paralysis and difficulty achieving consensus on anything
>> that, eventually, causes projects to slowly decline and be disrupted by a
>> more agile alternative.
>>
>> Please consider taking a step back from this issue. Take a deep breath,
>> and consider responding more slowly and letting people's points stew in
>> your head for a day or two first. (Including this one pls. :) Python will
>> not implode if you don't get that email out right away. If I understand
>> what I've read of this torrent of messages correctly, we don't even know if
>> there's a single real world use case where a user of os.urandom is hitting
>> the same problem CPython did, so we don't even know if the blocking at
>> startup issue is actually even happening in any real world Python code out
>> there. It's clearly far from a rampant problem, in any case. Stop and think
>> about that for a second. This is, in practice, potentially a complete
>> non-issue. Fixing it in any number of ways may potentially change things
>> for no one at all. You could even introduce a real problem while trying to
>> fix a hypothetical one. There are more than enough real problems to deal
>> with, so why push hypothetical problems to the top of your priority list?
>>
>> It's too easy to get caught up in the abstract nature of problems and to
>> lose sight of the real people and code behind them, or sometimes, the lack
>> thereof. Be practical, be pragmatic. Before you hit that reply button,
>> think - in a practical sense, of all the things I could be doing right now,
>> is this discussion the place where my involvement could generate the
>> greatest positive impact for the project? Is this the biggest and most
>> substantial problem the project should be focusing on right now? Projects
>> and developers who know how to manage focus go on to achieve the greatest
>> things, in my experience.
>>
>> Having been critical, I will end with a compliment. :) It is nice to see
>> that with only a couple small exceptions, this discussion has remained very
>> civil and respectful, which should be expected, but I know from experience
>> that far too often these discussions start to take a nasty tone as people
>> get frustrated. This is one of the things I really do love about the Python
>> community, and it's one reason I want to see both the product and community
>> grow and succeed even more. That, in fact, is why I'm choosing to write
>> this message first rather than simply unsubscribe.
>>
>> Kevin
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
>

-- 
--Guido van Rossum (python.org/~guido)

From kevin-lists at theolliviers.com  Fri Jun 17 00:10:56 2016
From: kevin-lists at theolliviers.com (Kevin Ollivier)
Date: Thu, 16 Jun 2016 21:10:56 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <CAP7+vJJDXLgX_+kKEhstVszaHsfg3fMOMeP6pmj1TneZdq1zyA@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CAP7+vJJDXLgX_+kKEhstVszaHsfg3fMOMeP6pmj1TneZdq1zyA@mail.gmail.com>
Message-ID: <76676028-A883-4B08-869E-E68FB2CB0D21@theolliviers.com>

Yes, it most certainly was. :( Sorry about that!

From:  <gvanrossum at gmail.com> on behalf of Guido van Rossum <guido at python.org>
Reply-To:  <guido at python.org>
Date:  Thursday, June 16, 2016 at 8:25 PM
To:  Kevin Ollivier <kevin-lists at theolliviers.com>
Cc:  Python Dev <python-dev at python.org>
Subject:  Re: [Python-Dev] Discussion overload

More likely your post was too long... :-(

On Thu, Jun 16, 2016 at 7:00 PM, Kevin Ollivier <kevin-lists at theolliviers.com> wrote:
Hi Guido,

From:  <gvanrossum at gmail.com> on behalf of Guido van Rossum <guido at python.org>
Reply-To:  <guido at python.org>
Date:  Thursday, June 16, 2016 at 5:27 PM
To:  Kevin Ollivier <kevin-lists at theolliviers.com>
Cc:  Python Dev <python-dev at python.org>
Subject:  Re: [Python-Dev] Discussion overload

Hi Kevin,

I often feel the same way. Are you using GMail? It combines related messages in threads and lets you mute threads. I often use this feature so I can manage my inbox. (I presume other mailers have the same features, but I don't know if all of them do.) There are also many people who read the list on a website, e.g. gmane. (Though I think that sometimes the delays incurred there add to the noise -- e.g. when a decision is reached on the list sometimes people keep responding to earlier threads.)

I fear I did quite a poor job of making my point. :( I've been on open source mailing lists since the late 90s, so I've learned strategies for dealing with mailing list overload. I've got my mail folders, my mail rules, etc. Having been on many mailing lists over the years, I've seen many productive discussions and many unproductive ones, and over time you start to see patterns. You also see what happens to those communities over time.

On the mailing lists where discussions become these unwieldy floods with 30-40 posts a day on one topic, over time what I have seen is that that rapid fire of posts generally does not lead to better decisions being made. In fact, usually it is the opposite. Faster discussions are not usually better discussions, and the chance of that gem of knowledge getting lost in the flood of posts is much greater. The more long-term consequence is that people start hesitating to bring up ideas, sometimes even very good ones, simply because even the discussion of them gets to be so draining that it's better to just leave things be. As an example, I do have work to do :) and I know if I was the one who had wanted to propose a fix for os.urandom or what have you, waking up to 30 messages I need to read to get caught up each day would be a pretty disheartening prospect, and possibly not even possible with my work obligations. It raises the bar to participating, in a way.

Perhaps some of this is inherent in mailing list discussions, but really in my experience, just a conscious decision on the part of contributors to slow down the discussion and "think more, write less", can do quite a lot to ensure the discussion is in fact a better one.

I probably should have taken more time to write my initial message, in fact, in order to better coalesce my points into something more succinct and clearly understandable. I somehow managed to convince people I need to learn mail management strategies. :)

Anyway, that is just my $0.02 on the matter. With inflation it counts for less every day, so make of it what you will. :P

Thanks,

Kevin

--Guido (don't get me started on top-posting :-)

On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier <kevin-lists at theolliviers.com> wrote:
Hi all,

Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic.

There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign. The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own.

For someone like me, a new joiner, seeing this makes me feel like wanting to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at some point what inevitably happens is that the project starts to lose members who feel that even just trying to follow the discussions is eating up too much of their time. It really can suck the energy right out of a community. I don't want to see that happen to Python. I had a blast at PyCon, my first, and I really came away feeling more than ever that the community you have here is really special. The one problem I felt concerned about though, was that the core dev community risked a sense of paralysis caused by having too many cooks in the kitchen and too much worry about the potential unseen ramifications of changing things. That creates a sort of paralysis and difficulty achieving consensus on anything that, eventually, causes projects to slowly decline and be disrupted by a more agile alternative.

Please consider taking a step back from this issue. Take a deep breath, and consider responding more slowly and letting people's points stew in your head for a day or two first. (Including this one pls. :) Python will not implode if you don't get that email out right away. If I understand what I've read of this torrent of messages correctly, we don't even know if there's a single real world use case where a user of os.urandom is hitting the same problem CPython did, so we don't even know if the blocking at startup issue is actually even happening in any real world Python code out there. It's clearly far from a rampant problem, in any case. Stop and think about that for a second. This is, in practice, potentially a complete non-issue. Fixing it in any number of ways may potentially change things for no one at all. You could even introduce a real problem while trying to fix a hypothetical one. There are more than enough real problems to deal with, so why push hypothetical problems to the top of your priority list?

It's too easy to get caught up in the abstract nature of problems and to lose sight of the real people and code behind them, or sometimes, the lack thereof. Be practical, be pragmatic. Before you hit that reply button, think - in a practical sense, of all the things I could be doing right now, is this discussion the place where my involvement could generate the greatest positive impact for the project? Is this the biggest and most substantial problem the project should be focusing on right now? Projects and developers who know how to manage focus go on to achieve the greatest things, in my experience.

Having been critical, I will end with a compliment. :) It is nice to see that with only a couple small exceptions, this discussion has remained very civil and respectful, which should be expected, but I know from experience that far too often these discussions start to take a nasty tone as people get frustrated. This is one of the things I really do love about the Python community, and it's one reason I want to see both the product and community grow and succeed even more. That, in fact, is why I'm choosing to write this message first rather than simply unsubscribe.

Kevin

-- 
--Guido van Rossum (python.org/~guido)

-- 
--Guido van Rossum (python.org/~guido)


From songofacandy at gmail.com  Fri Jun 17 05:15:53 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 17 Jun 2016 18:15:53 +0900
Subject: [Python-Dev] Compact dict implementations (was:  PEP 468
Message-ID: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>

Hi, developers.

I'm trying to implement compact dict.
https://github.com/methane/cpython/pull/1

Current status: most of the tests pass.
Some tests fail because I won't update `sizeof` until the layout is fixed.
Also, I haven't yet dropped the linked list that OrderedDict has.

Before finishing the implementation, I'd like to get comments and testing
from core developers.
If you are interested, please come to the core-mentorship ML or the pull
request and try it.

Regards,
-- 
INADA Naoki  <songofacandy at gmail.com>

From status at bugs.python.org  Fri Jun 17 12:08:38 2016
From: status at bugs.python.org (Python tracker)
Date: Fri, 17 Jun 2016 18:08:38 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20160617160838.8966856A1A@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2016-06-10 - 2016-06-17)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    5544 ( -9)
  closed 33557 (+66)
  total  39101 (+57)

Open issues with patches: 2417 

Issues opened (41)
==================

#10839: email module should not allow some header field repetitions
http://bugs.python.org/issue10839  reopened by rhettinger

#26171: heap overflow in zipimporter module
http://bugs.python.org/issue26171  reopened by ned.deily

#27288: secrets should use getrandom() on Linux
http://bugs.python.org/issue27288  opened by dstufft

#27292: Warn users that os.urandom() can return insecure values
http://bugs.python.org/issue27292  opened by christian.heimes

#27293: Summarize issues related to urandom, getrandom etc in secrets 
http://bugs.python.org/issue27293  opened by steven.daprano

#27294: Better repr for Tkinter event objects
http://bugs.python.org/issue27294  opened by serhiy.storchaka

#27297: Add support for /dev/random to "secrets"
http://bugs.python.org/issue27297  opened by larry

#27298: redundant iteration over digits in _PyLong_AsUnsignedLongMask
http://bugs.python.org/issue27298  opened by Oren Milman

#27299: urllib does not splitport while putrequest realhost to HTTP he
http://bugs.python.org/issue27299  opened by gr zhang

#27300: tempfile.TemporaryFile(): missing errors=... argument
http://bugs.python.org/issue27300  opened by mmarkk

#27302: csv.Sniffer guesses wrong when unquoted fields contain quotes
http://bugs.python.org/issue27302  opened by Redoute

#27303: [argparse] Unify options in help output
http://bugs.python.org/issue27303  opened by memeplex

#27304: Create "Source Code" links in module sections, where relevant
http://bugs.python.org/issue27304  opened by Yoni Lavi

#27305: Crash with "pip list --outdated" on Windows 10 with Python 2.7
http://bugs.python.org/issue27305  opened by James.Paget

#27307: string.Formatter does not support key/attribute access on unnu
http://bugs.python.org/issue27307  opened by tbeadle

#27309: Visual Styles support
http://bugs.python.org/issue27309  opened by [HYBRID BEING]

#27312: test_setupapp (idlelib.idle_test.test_macosx.SetupTest) fails 
http://bugs.python.org/issue27312  opened by ned.deily

#27313: test case failures in test_widgets.ComboboxTest.of test_ttk_gu
http://bugs.python.org/issue27313  opened by ned.deily

#27314: Cannot install 3.5.2 with 3.6.0a1 installed
http://bugs.python.org/issue27314  opened by steve.dower

#27315: pydoc: prefer the pager command in favor of the specifc less c
http://bugs.python.org/issue27315  opened by doko

#27317: Handling data_files: too much is removed in uninstall
http://bugs.python.org/issue27317  opened by sylvain.corlay

#27318: Add support for symlinks to zipfile
http://bugs.python.org/issue27318  opened by ldoktor

#27319: Multiple item arguments for selection operations
http://bugs.python.org/issue27319  opened by serhiy.storchaka

#27320: ./setup.py --help-commands should sort extra commands
http://bugs.python.org/issue27320  opened by Antony.Lee

#27321: Email parser creates a message object that can't be flattened
http://bugs.python.org/issue27321  opened by msapiro

#27322: test_compile_path fails when python has been installed
http://bugs.python.org/issue27322  opened by xdegaye

#27323: ncurses putwin() fails in test_module_funcs
http://bugs.python.org/issue27323  opened by xdegaye

#27326: SIGSEV in test_window_funcs of test_curses
http://bugs.python.org/issue27326  opened by xdegaye

#27328: Documentation corrections for email defects
http://bugs.python.org/issue27328  opened by martin.panter

#27329: Document behavior when CDLL is called  with None as an argumen
http://bugs.python.org/issue27329  opened by Jeffrey Esquivel Sibaja

#27331: Add a policy argument to email.mime.MIMEBase
http://bugs.python.org/issue27331  opened by berker.peksag

#27332: Clinic: first parameter for module-level functions should be P
http://bugs.python.org/issue27332  opened by encukou

#27333: validate_step in rangeobject.c, incorrect code logic but right
http://bugs.python.org/issue27333  opened by xiang.zhang

#27334: pysqlite3 context manager not performing rollback when a datab
http://bugs.python.org/issue27334  opened by lciti

#27335: Clarify that writing to locals() inside a class body is suppor
http://bugs.python.org/issue27335  opened by steven.daprano

#27337: 3.6.0a2 tarball has weird paths
http://bugs.python.org/issue27337  opened by petere

#27340: bytes-like objects with socket.sendall(), SSL, and http.client
http://bugs.python.org/issue27340  opened by martin.panter

#27341: mock.patch decorator fails silently on generators
http://bugs.python.org/issue27341  opened by shoshber

#27342: Clean up some Py_XDECREFs in rangeobject.c and bltinmodule.c
http://bugs.python.org/issue27342  opened by xiang.zhang

#27343: Incorrect error message for conflicting initializers of ctypes
http://bugs.python.org/issue27343  opened by serhiy.storchaka

#27344: zipfile *does* support utf-8 filenames
http://bugs.python.org/issue27344  opened by dholth

Most recent 15 issues with no replies (15)
==========================================

#27344: zipfile *does* support utf-8 filenames
http://bugs.python.org/issue27344

#27343: Incorrect error message for conflicting initializers of ctypes
http://bugs.python.org/issue27343

#27341: mock.patch decorator fails silently on generators
http://bugs.python.org/issue27341

#27340: bytes-like objects with socket.sendall(), SSL, and http.client
http://bugs.python.org/issue27340

#27332: Clinic: first parameter for module-level functions should be P
http://bugs.python.org/issue27332

#27331: Add a policy argument to email.mime.MIMEBase
http://bugs.python.org/issue27331

#27329: Document behavior when CDLL is called  with None as an argumen
http://bugs.python.org/issue27329

#27328: Documentation corrections for email defects
http://bugs.python.org/issue27328

#27326: SIGSEV in test_window_funcs of test_curses
http://bugs.python.org/issue27326

#27323: ncurses putwin() fails in test_module_funcs
http://bugs.python.org/issue27323

#27322: test_compile_path fails when python has been installed
http://bugs.python.org/issue27322

#27317: Handling data_files: too much is removed in uninstall
http://bugs.python.org/issue27317

#27309: Visual Styles support
http://bugs.python.org/issue27309

#27307: string.Formatter does not support key/attribute access on unnu
http://bugs.python.org/issue27307

#27304: Create "Source Code" links in module sections, where relevant
http://bugs.python.org/issue27304

Most recent 15 issues waiting for review (15)
=============================================

#27343: Incorrect error message for conflicting initializers of ctypes
http://bugs.python.org/issue27343

#27342: Clean up some Py_XDECREFs in rangeobject.c and bltinmodule.c
http://bugs.python.org/issue27342

#27334: pysqlite3 context manager not performing rollback when a datab
http://bugs.python.org/issue27334

#27333: validate_step in rangeobject.c, incorrect code logic but right
http://bugs.python.org/issue27333

#27332: Clinic: first parameter for module-level functions should be P
http://bugs.python.org/issue27332

#27331: Add a policy argument to email.mime.MIMEBase
http://bugs.python.org/issue27331

#27328: Documentation corrections for email defects
http://bugs.python.org/issue27328

#27321: Email parser creates a message object that can't be flattened
http://bugs.python.org/issue27321

#27320: ./setup.py --help-commands should sort extra commands
http://bugs.python.org/issue27320

#27319: Multiple item arguments for selection operations
http://bugs.python.org/issue27319

#27318: Add support for symlinks to zipfile
http://bugs.python.org/issue27318

#27315: pydoc: prefer the pager command in favor of the specifc less c
http://bugs.python.org/issue27315

#27307: string.Formatter does not support key/attribute access on unnu
http://bugs.python.org/issue27307

#27304: Create "Source Code" links in module sections, where relevant
http://bugs.python.org/issue27304

#27298: redundant iteration over digits in _PyLong_AsUnsignedLongMask
http://bugs.python.org/issue27298

Top 10 most discussed issues (10)
=================================

#27305: Crash with "pip list --outdated" on Windows 10 with Python 2.7
http://bugs.python.org/issue27305  18 msgs

#27294: Better repr for Tkinter event objects
http://bugs.python.org/issue27294  13 msgs

#10839: email module should not allow some header field repetitions
http://bugs.python.org/issue10839  12 msgs

#25782: CPython hangs on error __context__ set to the error itself
http://bugs.python.org/issue25782  12 msgs

#27186: add os.fspath()
http://bugs.python.org/issue27186  11 msgs

#27292: Warn users that os.urandom() can return insecure values
http://bugs.python.org/issue27292  11 msgs

#27263: Tkinter sets the HOME environment variable, breaking scripts
http://bugs.python.org/issue27263   9 msgs

#25455: Some repr implementations don't check for self-referential str
http://bugs.python.org/issue25455   8 msgs

#27025: More human readable generated widget names
http://bugs.python.org/issue27025   8 msgs

#27288: secrets should use getrandom() on Linux
http://bugs.python.org/issue27288   8 msgs

Issues closed (62)
==================

#5124: IDLE - pasting text doesn't delete selection
http://bugs.python.org/issue5124  closed by terry.reedy

#8637: Add MANPAGER envvar to specify pager for pydoc
http://bugs.python.org/issue8637  closed by doko

#14209: pkgutil.iter_zipimport_modules ignores the prefix parameter fo
http://bugs.python.org/issue14209  closed by lukasz.langa

#15468: Edit docs to hide hashlib.md5()
http://bugs.python.org/issue15468  closed by gregory.p.smith

#16182: readline: Wrong tab completion scope indices in Unicode termin
http://bugs.python.org/issue16182  closed by martin.panter

#16234: Implement correct block_size and tests for HMAC-SHA3
http://bugs.python.org/issue16234  closed by christian.heimes

#16864: sqlite3.Cursor.lastrowid isn't populated when executing a SQL 
http://bugs.python.org/issue16864  closed by berker.peksag

#17500: move PC/icons/source.xar to http://www.python.org/community/lo
http://bugs.python.org/issue17500  closed by doko

#19328: Improve PBKDF2 documentation
http://bugs.python.org/issue19328  closed by christian.heimes

#20508: IndexError from ipaddress._BaseNetwork.__getitem__ has no mess
http://bugs.python.org/issue20508  closed by berker.peksag

#20699: Document that binary IO classes work with bytes-likes objects
http://bugs.python.org/issue20699  closed by martin.panter

#20900: distutils register command should print text, not bytes repr
http://bugs.python.org/issue20900  closed by berker.peksag

#21386: ipaddress.IPv4Address.is_global not implemented
http://bugs.python.org/issue21386  closed by berker.peksag

#22558: Missing doc links to source code for Python-coded modules.
http://bugs.python.org/issue22558  closed by terry.reedy

#22970: asyncio: Cancelling wait() after notification leaves Condition
http://bugs.python.org/issue22970  closed by yselivanov

#24086: Configparser interpolation is unexpected
http://bugs.python.org/issue24086  closed by lukasz.langa

#24136: document PEP 448: unpacking generalization
http://bugs.python.org/issue24136  closed by martin.panter

#24750: IDLE: Cosmetic improvements for main window
http://bugs.python.org/issue24750  closed by terry.reedy

#24887: Sqlite3 has no option to provide open flags
http://bugs.python.org/issue24887  closed by berker.peksag

#25529: Provide access to the validated certificate chain in ssl modul
http://bugs.python.org/issue25529  closed by berker.peksag

#25724: SSLv3 test failure on Ubuntu 16.04 LTS
http://bugs.python.org/issue25724  closed by martin.panter

#26282: Add support for partial keyword arguments in extension functio
http://bugs.python.org/issue26282  closed by serhiy.storchaka

#26386: tkinter - Treeview - .selection_add and selection_toggle
http://bugs.python.org/issue26386  closed by serhiy.storchaka

#26556: Update expat to 2.2.1
http://bugs.python.org/issue26556  closed by python-dev

#26862: android: SYS_getdents64 does not need to be defined on android
http://bugs.python.org/issue26862  closed by xdegaye

#27029: Remove support of deprecated mode 'U' in zipfile
http://bugs.python.org/issue27029  closed by serhiy.storchaka

#27030: Remove deprecated re features
http://bugs.python.org/issue27030  closed by serhiy.storchaka

#27095: Simplify MAKE_FUNCTION
http://bugs.python.org/issue27095  closed by serhiy.storchaka

#27122: Hang with contextlib.ExitStack and subprocess.Popen (regressio
http://bugs.python.org/issue27122  closed by gregory.p.smith

#27140: Opcode for creating dict with constant keys
http://bugs.python.org/issue27140  closed by serhiy.storchaka

#27188: sqlite3 execute* methods return value not documented
http://bugs.python.org/issue27188  closed by berker.peksag

#27190: Check sqlite3_version before allowing check_same_thread = Fals
http://bugs.python.org/issue27190  closed by berker.peksag

#27194: Tarfile superfluous truncate calls slows extraction.
http://bugs.python.org/issue27194  closed by lukasz.langa

#27221: multiprocessing documentation is outdated regarding method pic
http://bugs.python.org/issue27221  closed by berker.peksag

#27223: _read_ready and _write_ready should respect _conn_lost
http://bugs.python.org/issue27223  closed by yselivanov

#27227: argparse fails to parse [] when using choices and nargs='*'
http://bugs.python.org/issue27227  closed by berker.peksag

#27233: Missing documentation for PyOS_FSPath
http://bugs.python.org/issue27233  closed by Jelle Zijlstra

#27238: Bare except: usages in turtle.py
http://bugs.python.org/issue27238  closed by serhiy.storchaka

#27245: IDLE: Fix deletion of custom themes and key bindings
http://bugs.python.org/issue27245  closed by terry.reedy

#27262: IDLE: move Aqua context menu code to maxosx
http://bugs.python.org/issue27262  closed by terry.reedy

#27270: 'parentheses-equality' warnings when building with clang and c
http://bugs.python.org/issue27270  closed by xdegaye

#27272: random.Random should not read 2500 bytes from urandom
http://bugs.python.org/issue27272  closed by rhettinger

#27278: py_getrandom() uses an int for syscall() result
http://bugs.python.org/issue27278  closed by haypo

#27286: str object got multiple values for keyword argument
http://bugs.python.org/issue27286  closed by serhiy.storchaka

#27289: test_ftp_timeout fails with EOFError
http://bugs.python.org/issue27289  closed by berker.peksag

#27290: Turn heaps library into a more OOP data structure?
http://bugs.python.org/issue27290  closed by rhettinger

#27291: two heap corruption issues when running modified pyc code.
http://bugs.python.org/issue27291  closed by gregory.p.smith

#27295: heaps library does not have support for max heap
http://bugs.python.org/issue27295  closed by rhettinger

#27296: Urllib/Urlopen IncompleteRead with HTTP header with new line c
http://bugs.python.org/issue27296  closed by martin.panter

#27301: Incorrect return codes in compile.c
http://bugs.python.org/issue27301  closed by serhiy.storchaka

#27306: Grammatical Error in Documentation - Tarfile page
http://bugs.python.org/issue27306  closed by berker.peksag

#27308: Inconsistency in cgi.FieldStorage() causes unicode/byte TypeEr
http://bugs.python.org/issue27308  closed by berker.peksag

#27310: 3.6.0a2 IDLE.app on OS X fails to launch, use command line idl
http://bugs.python.org/issue27310  closed by ned.deily

#27311: Incorrect documentation for zipfile.writestr()
http://bugs.python.org/issue27311  closed by martin.panter

#27316: [PDB] NameError in list comprehension in PDB
http://bugs.python.org/issue27316  closed by SilentGhost

#27324: Error when building Python extension
http://bugs.python.org/issue27324  closed by zach.ware

#27325: random failure of test_builtin
http://bugs.python.org/issue27325  closed by berker.peksag

#27327: re documentation: typo "escapes consist of"
http://bugs.python.org/issue27327  closed by ned.deily

#27330: Possible leaks in ctypes
http://bugs.python.org/issue27330  closed by serhiy.storchaka

#27336: --without-threads build fails due to undeclared _PyGILState_ch
http://bugs.python.org/issue27336  closed by berker.peksag

#27338: python 2.7 platform.system reports wrong on Mac OS X El Capita
http://bugs.python.org/issue27338  closed by Audric D'Hoest (Dr. Pariolo)

#27339: Security Issue: Typosquatting
http://bugs.python.org/issue27339  closed by haypo

From ncoghlan at gmail.com  Fri Jun 17 21:12:43 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 17 Jun 2016 18:12:43 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
Message-ID: <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>

On 16 June 2016 at 19:00, Kevin Ollivier <kevin-lists at theolliviers.com> wrote:
> Hi Guido,
>
> From: <gvanrossum at gmail.com> on behalf of Guido van Rossum
> <guido at python.org>
> Reply-To: <guido at python.org>
> Date: Thursday, June 16, 2016 at 5:27 PM
> To: Kevin Ollivier <kevin-lists at theolliviers.com>
> Cc: Python Dev <python-dev at python.org>
> Subject: Re: [Python-Dev] Discussion overload
>
> Hi Kevin,
>
> I often feel the same way. Are you using GMail? It combines related messages
> in threads and lets you mute threads. I often use this feature so I can
> manage my inbox. (I presume other mailers have the same features, but I
> don't know if all of them do.) There are also many people who read the list
> on a website, e.g. gmane. (Though I think that sometimes the delays incurred
> there add to the noise -- e.g. when a decision is reached on the list
> sometimes people keep responding to earlier threads.)
>
>
> I fear I did quite a poor job of making my point. :( I've been on open
> source mailing lists since the late 90s, so I've learned strategies for
> dealing with mailing list overload. I've got my mail folders, my mail rules,
> etc. Having been on many mailing lists over the years, I've seen many
> productive discussions and many unproductive ones, and over time you start
> to see patterns. You also see what happens to those communities over time.

This is one of the major reasons we have the option of escalating
things to the PEP process (and that's currently in train for
os.urandom), as well as the SIGs for when folks really need to dig
into topics that risk incurring a relatively low signal-to-noise
ratio on python-dev. It's also why python-ideas was turned into a
separate list, since folks without the time for more speculative
discussions and brainstorming can safely ignore it, while remaining
confident that any ideas considered interesting enough for further
review will be brought to python-dev's attention.

But yes, one of the more significant design errors I've made with the
contextlib API was due to just such a draining pile-on by folks that
weren't happy the original name wasn't a 100% accurate description of
the underlying mechanics (even though it was an accurate description
of the intended use case), and "people yelling at you on project
communication channels without doing adequate research first" is the
number one reason we see otherwise happily engaged core developers
decide to find something else to do with their time.

The challenge and art in community management in that context is
balancing telling both old and new list participants "It's OK to ask
'Why is this so?', as sometimes the answer is that there isn't a good
reason and we may want to change it" and "Learn to be a good peer
manager, and avoid behaving like a micro-managing autocrat that chases
away experienced contributors".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri Jun 17 21:32:36 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 17 Jun 2016 18:32:36 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
Message-ID: <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>

On 7 June 2016 at 17:50, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> Why is __definition_order__ even necessary?
> -------------------------------------------
>
> Since the definition order is not preserved in ``__dict__``, it would be
> lost once class definition execution completes.  Classes *could*
> explicitly set the attribute as the last thing in the body.  However,
> then independent decorators could only make use of classes that had done
> so.  Instead, ``__definition_order__`` preserves this one bit of info
> from the class body so that it is universally available.

The discussion in the PEP 487 thread made me realise that I'd like to
see a discussion in PEP 520 regarding whether or not to define
__definition_order__ for builtin types initialised via PyType_Ready or
created via PyType_FromSpec in addition to defining it for types
created via the class statement or types.new_class().

For static types, PyType_Ready could potentially set it based on
tp_members, tp_methods & tp_getset (see
https://docs.python.org/3/c-api/typeobj.html).
Similarly, PyType_FromSpec could potentially set it based on the
contents of Py_tp_members, Py_tp_methods and Py_tp_getset slot
definitions.

Having definition order support in both types.new_class() and builtin
types would also make it clear why we can't rely purely on the
compiler to provide the necessary ordering information - in both of
those cases, the Python compiler isn't directly involved in the type
creation process.
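As a rough illustration of that last point (a sketch only, not part of PEP 520), definition order can be captured without the compiler's help by having a metaclass supply an ordered namespace. The names `OrderedMeta` and `definition_order` are hypothetical and say nothing about the final `__definition_order__` semantics; `__prepare__` and `types.new_class()` are existing APIs.

```python
import types
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the order names were defined in."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body (or exec_body callback) fills this ordered mapping.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Keep only non-dunder names, in the order they were added.
        cls.definition_order = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


# Case 1: a class statement, where the compiler drives namespace population.
class Point(metaclass=OrderedMeta):
    x = 0
    y = 0

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


# Case 2: types.new_class(), where no class statement is compiled at all.
def fill_body(ns):
    ns["b"] = 1
    ns["a"] = 2


Dynamic = types.new_class("Dynamic", (), {"metaclass": OrderedMeta}, fill_body)

print(Point.definition_order)
print(Dynamic.definition_order)
```

In the second case the order comes purely from what the callback does to the namespace, which is why compiler-provided ordering alone would not cover it.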

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Fri Jun 17 22:58:48 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 18 Jun 2016 02:58:48 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
Message-ID: <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>

I have taken PEP 523 for this:
https://github.com/python/peps/blob/master/pep-0523.txt .

I'm waiting until Guido gets back from vacation, at which point I'll ask
for a pronouncement or assignment of a BDFL delegate.

On Fri, 3 Jun 2016 at 14:37 Brett Cannon <brett at python.org> wrote:

> For those of you who follow python-ideas or were at the PyCon US 2016
> language summit, you have already seen/heard about this PEP. For those of
> you who don't fall into either of those categories, this PEP proposes a
> frame evaluation API for CPython. The motivating example of this work has
> been Pyjion, the experimental CPython JIT Dino Viehland and I have been
> working on in our spare time at Microsoft. The API also works for
> debugging, though, as already demonstrated by Google having added a very
> similar API internally for debugging purposes.
>
> The PEP is pasted in below and also available in rendered form at
> https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I will assign
> myself a PEP # once discussion is finished as it's easier to work in git
> for this for the rich rendering of the in-progress PEP).
>
> I should mention that the differences in the PEP since python-ideas and the
> language summit are the listed support from Google's use of a very similar
> API, as well as a clarification that the co_extra field on code objects
> doesn't change their immutability (at least from the view of the PEP).
>
> ----------
> PEP: NNN
> Title: Adding a frame evaluation API to CPython
> Version: $Revision$
> Last-Modified: $Date$
> Author: Brett Cannon <brett at python.org>,
>         Dino Viehland <dinov at microsoft.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 16-May-2016
> Post-History: 16-May-2016
>               03-Jun-2016
>
>
> Abstract
> ========
>
> This PEP proposes to expand CPython's C API [#c-api]_ to allow for
> the specification of a per-interpreter function pointer to handle the
> evaluation of frames [#pyeval_evalframeex]_. This proposal also
> suggests adding a new field to code objects [#pycodeobject]_ to store
> arbitrary data for use by the frame evaluation function.
>
>
> Rationale
> =========
>
> One place where flexibility has been lacking in Python is in the direct
> execution of Python code. While CPython's C API [#c-api]_ allows for
> constructing the data going into a frame object and then evaluating it
> via ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_, control over the
> execution of Python code comes down to individual objects instead of a
> holistic control of execution at the frame level.
>
> While wanting to have influence over frame evaluation may seem a bit
> too low-level, it does open the possibility for things such as a
> method-level JIT to be introduced into CPython without CPython itself
> having to provide one. By allowing external C code to control frame
> evaluation, a JIT can participate in the execution of Python code at
> the key point where evaluation occurs. This then allows for a JIT to
> conditionally recompile Python bytecode to machine code as desired
> while still allowing for executing regular CPython bytecode when
> running the JIT is not desired. This can be accomplished by allowing
> interpreters to specify what function to call to evaluate a frame. And
> by placing the API at the frame evaluation level it allows for a
> complete view of the execution environment of the code for the JIT.
>
> This ability to specify a frame evaluation function also allows for
> other use-cases beyond just opening CPython up to a JIT. For instance,
> it would not be difficult to implement a tracing or profiling function
> at the call level with this API. While CPython does provide the
> ability to set a tracing or profiling function at the Python level,
> this would be able to match the data collection of the profiler and
> quite possibly be faster for tracing by simply skipping per-line
> tracing support.
>
> It also opens up the possibility of debugging where the frame
> evaluation function only performs special debugging work when it
> detects it is about to execute a specific code object. In that
> instance the bytecode could be theoretically rewritten in-place to
> inject a breakpoint function call at the proper point for help in
> debugging while not having to do a heavy-handed approach as
> required by ``sys.settrace()``.
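For comparison, here is a minimal sketch of how selective debugging looks with the existing `sys.settrace()` API: the global trace function still runs on every call, and only then decides whether to enable per-line tracing for one specific code object. The functions `target` and `other` are made up for the illustration.

```python
import sys


def target(x):
    y = x + 1
    return y * 2


def other(x):
    return x - 1


events = []


def local_trace(frame, event, arg):
    # Receives 'line', 'return' and 'exception' events for a traced frame.
    events.append((frame.f_code.co_name, event))
    return local_trace


def global_trace(frame, event, arg):
    # Invoked for the 'call' event of every new Python frame.
    if frame.f_code is target.__code__:
        return local_trace  # trace only the code object of interest
    return None  # no per-line tracing for any other frame


sys.settrace(global_trace)
try:
    target(1)
    other(1)
finally:
    sys.settrace(None)

print(events)
```

Even when almost every frame is skipped, the trace function is consulted on each call, which is the overhead a frame evaluation hook could avoid.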
>
> To help facilitate these use-cases, we are also proposing the addition
> of a "scratch space" on code objects via a new field. This will allow
> per-code object data to be stored with the code object itself for easy
> retrieval by the frame evaluation function as necessary. The field
> itself will simply be a ``PyObject *`` type so that any data stored in
> the field will participate in normal object memory management.
>
>
> Proposal
> ========
>
> All proposed C API changes below will not be part of the stable ABI.
>
>
> Expanding ``PyCodeObject``
> --------------------------
>
> One field is to be added to the ``PyCodeObject`` struct
> [#pycodeobject]_::
>
>   typedef struct {
>      ...
>      PyObject *co_extra;  /* "Scratch space" for the code object. */
>   } PyCodeObject;
>
> The ``co_extra`` will be ``NULL`` by default and will not be used by
> CPython itself. Third-party code is free to use the field as desired.
> Values stored in the field are expected to not be required in order
> for the code object to function, allowing the loss of the data of the
> field to be acceptable (this keeps the code object as immutable from
> a functionality point-of-view; this is slightly contentious and so is
> listed as an open issue in `Is co_extra needed?`_). The field will be
> freed like all other fields on ``PyCodeObject`` during deallocation
> using ``Py_XDECREF()``.
>
> It is not recommended that multiple users attempt to use the
> ``co_extra`` simultaneously. While a dictionary could theoretically be
> set to the field and various users could use a key specific to the
> project, there is still the issue of key collisions as well as
> performance degradation from using a dictionary lookup on every frame
> evaluation. Users are expected to do a type check to make sure that
> the field has not been previously set by someone else.
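The type check described above can be modelled in a few lines of Python. This is only a sketch: `FakeCode` stands in for a code object (real code objects do not expose a writable `co_extra` attribute to Python code), and `MyToolData` and `claim_co_extra` are hypothetical names.

```python
class FakeCode:
    """Stand-in for a code object with the proposed scratch field."""

    def __init__(self, name):
        self.name = name
        self.co_extra = None  # models the NULL default


class MyToolData:
    """Per-code-object data owned by one hypothetical tool."""

    def __init__(self):
        self.exec_count = 0


def claim_co_extra(code):
    """Return this tool's data, or None if another user owns the field."""
    extra = code.co_extra
    if extra is None:
        extra = code.co_extra = MyToolData()
        return extra
    if isinstance(extra, MyToolData):
        return extra
    return None  # set by someone else: leave it untouched


mine = FakeCode("f")
data = claim_co_extra(mine)

foreign = FakeCode("g")
foreign.co_extra = {"owner": "another tool"}
refused = claim_co_extra(foreign)
```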
>
>
> Expanding ``PyInterpreterState``
> --------------------------------
>
> The entrypoint for the frame evaluation function is per-interpreter::
>
>   // Same type signature as PyEval_EvalFrameEx().
>   typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);
>
>   typedef struct {
>       ...
>       PyFrameEvalFunction eval_frame;
>   } PyInterpreterState;
>
> By default, the ``eval_frame`` field will be initialized to a function
> pointer that represents what ``PyEval_EvalFrameEx()`` currently is
> (called ``PyEval_EvalFrameDefault()``, discussed later in this PEP).
> Third-party code may then set their own frame evaluation function
> instead to control the execution of Python code. A pointer comparison
> can be used to detect if the field is set to
> ``PyEval_EvalFrameDefault()`` and thus has not been mutated yet.
>
>
> Changes to ``Python/ceval.c``
> -----------------------------
>
> ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it currently stands
> will be renamed to ``PyEval_EvalFrameDefault()``. The new
> ``PyEval_EvalFrameEx()`` will then become::
>
>     PyObject *
>     PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
>     {
>         PyThreadState *tstate = PyThreadState_GET();
>         return tstate->interp->eval_frame(frame, throwflag);
>     }
>
> This allows third-party code to place themselves directly in the path
> of Python code execution while being backwards-compatible with code
> already using the pre-existing C API.
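The indirection can be modelled in pure Python to show how the pieces fit together. This is a sketch under simplifying assumptions: a "frame" is just a zero-argument callable, and all names are Python stand-ins for the C-level entities in the proposal.

```python
def eval_frame_default(frame, throwflag):
    """Stand-in for PyEval_EvalFrameDefault(): simply run the frame."""
    return frame()


class InterpreterState:
    """Models the per-interpreter struct holding the function pointer."""

    def __init__(self):
        self.eval_frame = eval_frame_default


def eval_frame_ex(interp, frame, throwflag=0):
    """Models the new PyEval_EvalFrameEx(): dispatch through the hook."""
    return interp.eval_frame(frame, throwflag)


calls = []


def logging_eval_frame(frame, throwflag):
    """A third-party hook that observes the frame, then delegates."""
    calls.append(frame)
    return eval_frame_default(frame, throwflag)


interp = InterpreterState()
# An identity test plays the role of the pointer comparison.
untouched = interp.eval_frame is eval_frame_default

interp.eval_frame = logging_eval_frame
result = eval_frame_ex(interp, lambda: 6 * 7)
```

Existing callers keep calling `eval_frame_ex()` and never need to know whether a hook is installed.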
>
>
> Updating ``python-gdb.py``
> --------------------------
>
> The generated ``python-gdb.py`` file used for Python support in GDB
> makes some hard-coded assumptions about ``PyEval_EvalFrameEx()``, e.g.
> the names of local variables. It will need to be updated to work with
> the proposed changes.
>
>
> Performance impact
> ==================
>
> As this PEP is proposing an API to add pluggability, performance
> impact is considered only in the case where no third-party code has
> made any changes.
>
> Several runs of pybench [#pybench]_ consistently showed no performance
> cost from the API change alone.
>
> A run of the Python benchmark suite [#py-benchmarks]_ showed no
> measurable cost in performance.
>
> In terms of memory impact, since there are typically not many CPython
> interpreters executing in a single process that means the impact of
> ``co_extra`` being added to ``PyCodeObject`` is the only worry.
> According to [#code-object-count]_, a run of the Python test suite
> results in about 72,395 code objects being created. On a 64-bit
> CPU that would result in 579,160 bytes of extra memory being used if
> all code objects were alive at once and had nothing set in their
> ``co_extra`` fields.
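The figure follows from one extra pointer per code object; a quick check of the arithmetic, assuming 8-byte pointers as on a typical 64-bit build:

```python
CODE_OBJECTS = 72395        # count quoted in the PEP
POINTER_SIZE_BYTES = 8      # assumption: 64-bit pointers

extra_bytes = CODE_OBJECTS * POINTER_SIZE_BYTES
print(extra_bytes, "bytes, or about", round(extra_bytes / 1024), "KiB")
```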
>
>
> Example Usage
> =============
>
> A JIT for CPython
> -----------------
>
> Pyjion
> ''''''
>
> The Pyjion project [#pyjion]_ has used this proposed API to implement
> a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each code
> object has its ``co_extra`` field set to a ``PyjionJittedCode`` object
> which stores four pieces of information:
>
> 1. Execution count
> 2. A boolean representing whether a previous attempt to JIT failed
> 3. A function pointer to a trampoline (which can be type tracing or not)
> 4. A void pointer to any JIT-compiled machine code
>
> The frame evaluation function has (roughly) the following algorithm::
>
>     def eval_frame(frame, throw_flag):
>         pyjion_code = frame.code.co_extra
>         if not pyjion_code:
>             # First execution: attach fresh bookkeeping to the code object.
>             pyjion_code = frame.code.co_extra = PyjionJittedCode()
>         elif not pyjion_code.jit_failed:
>             if pyjion_code.jit_code:
>                 # Already compiled: run the machine code via the trampoline.
>                 return pyjion_code.eval(pyjion_code.jit_code, frame)
>             elif pyjion_code.exec_count > 20_000:
>                 # Hot enough: try to JIT-compile, remembering any failure.
>                 if jit_compile(frame):
>                     return pyjion_code.eval(pyjion_code.jit_code, frame)
>                 else:
>                     pyjion_code.jit_failed = True
>         pyjion_code.exec_count += 1
>         return PyEval_EvalFrameDefault(frame, throw_flag)
>
> The key point, though, is that all of this work and logic is separate
> from CPython and yet with the proposed API changes it is able to
> provide a JIT that is compliant with Python semantics (as of this
> writing, performance is almost equivalent to CPython without the new
> API). This means there's nothing technically preventing others from
> implementing their own JITs for CPython by utilizing the proposed API.
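The counting-and-threshold logic can be exercised with a small runnable model. Everything here is a stand-in: `FakeCode`, `JitState`, `jit_compile` and the tiny `HOT_THRESHOLD` are hypothetical, "compilation" just wraps the Python callable, and the hook takes the code object directly rather than a frame.

```python
HOT_THRESHOLD = 3  # the PEP's sketch uses 20_000


class JitState:
    """Models the data stored in co_extra."""

    def __init__(self):
        self.exec_count = 0
        self.jit_failed = False
        self.jit_code = None


class FakeCode:
    def __init__(self, body, jittable=True):
        self.body = body
        self.jittable = jittable
        self.co_extra = None


def default_eval(code):
    return ("interpreted", code.body())


def jit_compile(code):
    """Pretend compiler: returns a callable, or None on failure."""
    if not code.jittable:
        return None
    return lambda: ("jitted", code.body())


def eval_frame(code):
    state = code.co_extra
    if state is None:
        state = code.co_extra = JitState()
    elif not state.jit_failed:
        if state.jit_code is not None:
            return state.jit_code()
        if state.exec_count > HOT_THRESHOLD:
            compiled = jit_compile(code)
            if compiled is not None:
                state.jit_code = compiled
                return compiled()
            state.jit_failed = True
    state.exec_count += 1
    return default_eval(code)


hot = FakeCode(lambda: 1)
hot_paths = [eval_frame(hot)[0] for _ in range(6)]

cold = FakeCode(lambda: 2, jittable=False)
cold_paths = [eval_frame(cold)[0] for _ in range(6)]
```

With this threshold the first four calls are interpreted and later calls take the compiled path; a code object that fails to compile is marked once and stays on the default path.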
>
>
> Other JITs
> ''''''''''
>
> It should be mentioned that the Pyston team was consulted on an
> earlier version of this PEP that was more JIT-specific and they were
> not interested in utilizing the changes proposed because they want
> control over memory layout and had no interest in directly supporting
> CPython itself. An informal discussion with a developer on the PyPy
> team led to a similar comment.
>
> Numba [#numba]_, on the other hand, suggested that they would be
> interested in the proposed change in a post-1.0 future for
> themselves [#numba-interest]_.
>
> The experimental Coconut JIT [#coconut]_ could have benefitted from
> this PEP. In private conversations with Coconut's creator we were told
> that our API was probably superior to the one they developed for
> Coconut to add JIT support to CPython.
>
>
> Debugging
> ---------
>
> In conversations with the Python Tools for Visual Studio team (PTVS)
> [#ptvs]_, they thought they would find these API changes useful for
> implementing more performant debugging. As mentioned in the Rationale_
> section, this API would allow for switching on debugging functionality
> only in frames where it is needed. This could allow for either
> skipping information that ``sys.settrace()`` normally provides or
> even going as far as dynamically rewriting bytecode prior to execution
> to inject e.g. breakpoints in the bytecode.
>
> It also turns out that Google has provided a very similar API
> internally for years. It has been used for performant debugging
> purposes.
>
>
> Implementation
> ==============
>
> A set of patches implementing the proposed API is available through
> the Pyjion project [#pyjion]_. In its current form it has more
> changes to CPython than just this proposed API, but that is for ease
> of development instead of strict requirements to accomplish its goals.
>
>
> Open Issues
> ===========
>
> Allow ``eval_frame`` to be ``NULL``
> -----------------------------------
>
> Currently the frame evaluation function is expected to always be set.
> It could very easily simply default to ``NULL`` instead which would
> signal to use ``PyEval_EvalFrameDefault()``. The current proposal of
> not special-casing the field seemed the most straight-forward, but it
> does require that the field not accidentally be cleared, else a crash
> may occur.
>
>
> Is co_extra needed?
> -------------------
>
> While discussing this PEP at PyCon US 2016, some core developers
> expressed their worry of the ``co_extra`` field making code objects
> mutable. The thinking seemed to be that having a field that was
> mutated after the creation of the code object made the object seem
> mutable, even though no other aspect of code objects changed.
>
> The view of this PEP is that the ``co_extra`` field doesn't change the
> fact that code objects are immutable. The field is specified in this
> PEP as to not contain information required to make the code object
> usable, making it more of a caching field. It could be viewed as
> similar to the UTF-8 cache that string objects have internally;
> strings are still considered immutable even though they have a field
> that is conditionally set.
>
> The field is also not strictly necessary. While the field greatly
> simplifies attaching extra information to code objects, other options
> such as keeping a mapping of code object memory addresses to what
> would have been kept in ``co_extra`` or perhaps using a weak reference
> of the data on the code object and then iterating through the weak
> references until the attached data is found are possible. But obviously
> all of these solutions are not as simple or performant as adding the
> ``co_extra`` field.
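The first alternative, a mapping keyed by code object address, might look roughly like the sketch below. It relies on `id()` returning the object's address in CPython; the helper names are hypothetical. Each entry keeps a reference to the code object so that an address cannot be reused by a different object while its entry is still in the table, which also means entries live until removed explicitly.

```python
_side_table = {}  # id(code) -> (code, data)


def set_extra(code, data):
    """Attach data to a code object without touching the object itself."""
    _side_table[id(code)] = (code, data)


def get_extra(code):
    """Return the attached data, or None if nothing was attached."""
    entry = _side_table.get(id(code))
    if entry is not None and entry[0] is code:
        return entry[1]
    return None


def example():
    return 1


def unrelated():
    return 2


set_extra(example.__code__, {"exec_count": 0})
found = get_extra(example.__code__)
missing = get_extra(unrelated.__code__)
```

Every lookup costs a hash and a dictionary probe, compared with a single pointer load for a dedicated field.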
>
>
> Rejected Ideas
> ==============
>
> A JIT-specific C API
> --------------------
>
> Originally this PEP was going to propose a much larger API change
> which was more JIT-specific. After soliciting feedback from the Numba
> team [#numba]_, though, it became clear that the API was unnecessarily
> large. The realization was made that all that was truly needed was the
> opportunity to provide a trampoline function to handle execution of
> Python code that had been JIT-compiled and a way to attach that
> compiled machine code along with other critical data to the
> corresponding Python code object. Once it was shown that there was no
> loss in functionality or in performance while minimizing the API
> changes required, the proposal was changed to its current form.
>
>
> References
> ==========
>
> .. [#pyjion] Pyjion project
>    (https://github.com/microsoft/pyjion)
>
> .. [#c-api] CPython's C API
>    (https://docs.python.org/3/c-api/index.html)
>
> .. [#pycodeobject] ``PyCodeObject``
>    (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)
>
> .. [#coreclr] .NET Core Runtime (CoreCLR)
>    (https://github.com/dotnet/coreclr)
>
> .. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
>    (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)
>
> .. [#numba] Numba
>    (http://numba.pydata.org/)
>
> .. [#numba-interest]  numba-users mailing list:
>    "Would the C API for a JIT entrypoint being proposed by Pyjion help out Numba?"
>    (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)
>
> .. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
>    (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)
>
> .. [#py-benchmarks] Python benchmark suite
>    (https://hg.python.org/benchmarks)
>
> .. [#pyston] Pyston
>    (http://pyston.org)
>
> .. [#pypy] PyPy
>    (http://pypy.org/)
>
> .. [#ptvs] Python Tools for Visual Studio
>    (http://microsoft.github.io/PTVS/)
>
> .. [#coconut] Coconut
>    (https://github.com/davidmalcolm/coconut)
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
> ..
>    Local Variables:
>    mode: indented-text
>    indent-tabs-mode: nil
>    sentence-end-double-space: t
>    fill-column: 70
>    coding: utf-8
>    End:
>
>

From brett at python.org  Fri Jun 17 23:06:32 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 18 Jun 2016 03:06:32 +0000
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
Message-ID: <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>

On Fri, 17 Jun 2016 at 18:13 Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 16 June 2016 at 19:00, Kevin Ollivier <kevin-lists at theolliviers.com>
> wrote:
> > Hi Guido,
> >
> > From: <gvanrossum at gmail.com> on behalf of Guido van Rossum
> > <guido at python.org>
> > Reply-To: <guido at python.org>
> > Date: Thursday, June 16, 2016 at 5:27 PM
> > To: Kevin Ollivier <kevin-lists at theolliviers.com>
> > Cc: Python Dev <python-dev at python.org>
> > Subject: Re: [Python-Dev] Discussion overload
> >
> > Hi Kevin,
> >
> > I often feel the same way. Are you using GMail? It combines related messages
> > in threads and lets you mute threads. I often use this feature so I can
> > manage my inbox. (I presume other mailers have the same features, but I
> > don't know if all of them do.) There are also many people who read the list
> > on a website, e.g. gmane. (Though I think that sometimes the delays incurred
> > there add to the noise -- e.g. when a decision is reached on the list
> > sometimes people keep responding to earlier threads.)
> >
> >
> > I fear I did quite a poor job of making my point. :( I've been on open
> > source mailing lists since the late 90s, so I've learned strategies for
> > dealing with mailing list overload. I've got my mail folders, my mail rules,
> > etc. Having been on many mailing lists over the years, I've seen many
> > productive discussions and many unproductive ones, and over time you start
> > to see patterns. You also see what happens to those communities over time.
>
> This is one of the major reasons we have the option of escalating
> things to the PEP process (and that's currently in train for
> os.urandom), as well as the SIGs for when folks really need to dig
> into topics that risk incurring a relatively low signal-to-noise
> ratio on python-dev. It's also why python-ideas was turned into a
> separate list, since folks without the time for more speculative
> discussions and brainstorming can safely ignore it, while remaining
> confident that any ideas considered interesting enough for further
> review will be brought to python-dev's attention.
>

Do we need a security SIG? E.g. would people like Christian and Cory like
to have a separate place to talk about the ssl stuff brought up at the
language summit?

-Brett

From barry at python.org  Sat Jun 18 00:47:05 2016
From: barry at python.org (Barry Warsaw)
Date: Sat, 18 Jun 2016 07:47:05 +0300
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
Message-ID: <20160618074705.58bddccf.barry@wooz.org>

On Jun 18, 2016, at 03:06 AM, Brett Cannon wrote:

>Do we need a security SIG? E.g. would people like Christian and Cory like
>to have a separate place to talk about the ssl stuff brought up at the
>language summit?

The only thing I'd be worried about is people thinking that the sig is the
place to report confidential security issues.  Thesaurusly suggesting
danger-sig and not just because that sounds so much cooler.

not-a-serious-suggestion-ly y'rs,
-Barry

From songofacandy at gmail.com  Sat Jun 18 03:12:50 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Sat, 18 Jun 2016 16:12:50 +0900
Subject: [Python-Dev] Compact dict implementations (was: PEP 468
In-Reply-To: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
References: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
Message-ID: <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>

I have now fixed the failing tests (some tests relied on the underlying layout).

Before posting it to bugs.python.org, I want to confirm that it has a chance
of being merged.

The first big problem is the language spec.

If the builtin dict in both PyPy and CPython is ordered, many people
will rely on it.
It will force other Python implementations to implement it for compatibility.
In other words, it may become the de facto "Python Language", even if the
Python Language spec
says it's an implementation detail.

Is it OK?

The second problem is performance.

In a quick benchmark on my laptop (sorry, I don't have dedicated hardware
for long-running, stable
benchmarking), it reduces memory usage by 3% and increases CPU time by 3%.
I'll run a longer benchmark next week.
I think I can't avoid the penalty because the index hashtable and the (hash,
key, value) entries are not in the
same cache line.  (I hope I am wrong and there is a way to optimize more.)
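For readers unfamiliar with the layout being discussed, here is a minimal Python sketch of a compact dict: a sparse index table of small integers that points into a dense, insertion-ordered array of (hash, key, value) entries. It is only an illustration of the idea, not the patch itself; it uses linear probing and omits deletion, both simplifications chosen for brevity.

```python
class CompactDict:
    FREE = -1

    def __init__(self, size=8):
        self.indices = [self.FREE] * size  # sparse: slot -> position in entries
        self.entries = []                  # dense: (hash, key, value) in order

    def _probe(self, key, h):
        """Return (slot, entry_index); entry_index is None if key is absent."""
        mask = len(self.indices) - 1
        i = h & mask
        while True:
            idx = self.indices[i]
            if idx == self.FREE:
                return i, None
            entry_hash, entry_key, _value = self.entries[idx]
            if entry_hash == h and entry_key == key:
                return i, idx
            i = (i + 1) & mask  # linear probing (a simplification)

    def _resize(self):
        # Only the small index table is rebuilt; entries stay in place.
        self.indices = [self.FREE] * (len(self.indices) * 2)
        for n, (h, key, _value) in enumerate(self.entries):
            slot, _found = self._probe(key, h)
            self.indices[slot] = n

    def __setitem__(self, key, value):
        h = hash(key)
        slot, idx = self._probe(key, h)
        if idx is not None:
            self.entries[idx] = (h, key, value)  # overwrite keeps position
            return
        self.indices[slot] = len(self.entries)
        self.entries.append((h, key, value))
        if len(self.entries) * 3 >= len(self.indices) * 2:
            self._resize()

    def __getitem__(self, key):
        _slot, idx = self._probe(key, hash(key))
        if idx is None:
            raise KeyError(key)
        return self.entries[idx][2]

    def keys(self):
        return [key for _h, key, _v in self.entries]


d = CompactDict()
for name in ["spam", "eggs", "ham"]:
    d[name] = len(name)
print(d.keys())
```

A lookup touches the index table first and the entries array second, which is the two-step memory access the message suspects of costing CPU time. Because the index table only needs to hold positions, small tables can use narrow integers, which is where the 2-byte integer question comes from.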

pybench: https://gist.github.com/methane/cfad1427d87ceff9310350e78a214880
benchmark: https://gist.github.com/methane/5eb11fdd93863813b222e795ca0bfc1f

Is it acceptable?

I have some other minor problems (e.g. how can I use a 2-byte integer?
Is using int16_t from stdint.h
OK?).  I'll discuss them on the core-mentor ML or bugs.python.org.

Thanks
-- 
INADA Naoki  <songofacandy at gmail.com>

From stephen at xemacs.org  Sat Jun 18 06:38:34 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 18 Jun 2016 19:38:34 +0900
Subject: [Python-Dev]  security SIG? (was: Discussion overload)
In-Reply-To: <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
Message-ID: <22373.9386.96945.48324@turnbull.sk.tsukuba.ac.jp>

Brett Cannon writes:

 > Do we need a security SIG? E.g. would people like Christian and
 > Cory like to have a separate place to talk about the ssl stuff
 > brought up at the language summit?

Besides what Barry brought up about the potential for attractive
nuisance where people post security issues that should be confidential
(I don't think it's that great, though), I don't see it solving the
"clash of cultures" issue.  The people who have invested in learning a
lot of technical stuff related to security post as if they believe
that "consenting adults" cannot be applied to security issues (more on
that below), while RMs and people working on distros tend to take the
position that, of course, "consenting adults" covers security too.

A SIG does help to address Christian's "ya gotta be this tall" to
contribute to security discussions, at least in the early stages of
discussion, but eventually it's going to arrive at python-dev.[1]
ISTM that in this case sufficient behind the scenes discussion took
place that the main contributors to the ultimate decision had a pretty
good idea of where each other stood, and (I'm guessing here) Larry
said "OK, we agree to disagree.  I could say I'm RM, you lose, but to
be fair I'll ask for a BDFL ruling."  Even though there really wasn't
anything for most of us to do but wait for that ruling (really --
Guido talks to Ted Ts'o and Theo de Raadt when he wants advice, there
are very few among us who travel in those circles), it ended up that
several of the security guys say they're not sure they can participate
in Python development any more.

I see the security issue as a backyard swimming pool.  The law may say
you must put a fence around it, but even 6 year olds can climb the
fence, fall in the pool, and drown.  The hard-line security advocate
position then is "the risk is a *kid's life*, backyard pools must be
banned".  You have to sympathize with their honest and deep concern,
but the community accepts that risk in the case of swimming pools.  I
suspect the Python community at large is going to be happy with
Larry's decision and the strategy of emphasizing the secrets module
starting with 3.6.

If so, the hard-line security advocates are going to have to accept
that, or stay painfully frustrated.  That would be very unfortunate,
because their knowledge is very much needed.

Footnotes: 
[1]  Keeping the BDFL ruling within the security group isn't going to
work, either -- the news of a secret patch will become public quickly,
and it will just seriously harm the trust the community has in its
leaders.

From cory at lukasa.co.uk  Sat Jun 18 10:25:49 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sat, 18 Jun 2016 15:25:49 +0100
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <22373.9386.96945.48324@turnbull.sk.tsukuba.ac.jp>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <22373.9386.96945.48324@turnbull.sk.tsukuba.ac.jp>
Message-ID: <1BFEC08A-C620-4E89-B801-AA8072F5391A@lukasa.co.uk>

> On 18 Jun 2016, at 11:38, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> I see the security issue as a backyard swimming pool.  The law may say
> you must put a fence around it, but even 6 year olds can climb the
> fence, fall in the pool, and drown.  The hard-line security advocate
> position then is "the risk is a *kid's life*, backyard pools must be
> banned".  You have to sympathize with their honest and deep concern,
> but the community accepts that risk in the case of swimming pools.  I
> suspect the Python community at large is going to be happy with
> Larry's decision and the strategy of emphasizing the secrets module
> starting with 3.6.

I don't think that's really an accurate representation of any of the arguments put forward here. A better analogy is this:

- Right now, we have a fence around the neighbourhood swimming pool. This fence has a gate with a guard, and the guard prevents 6 year olds from getting through the gate.
- Kids can climb the fence (by using random.choice or something like it). The security community is mostly in agreement with the stdlib folks: we're happy to say that this problem is best dealt with by educating children to not climb the fence (don't use random.choice in a security context).
- In Python 3.4 and earlier, the guard on this gate will, in some circumstances, turn up to work drunk. The guard is very good at pretending to be sober, so you cannot tell just by looking that he's drunk, but in this state he will let anyone through the gate. He sobers up fast, so it only matters if a child tries to get in very shortly after you open the swimming pool.
- In Python 3.5 we included a patch that, by accident, installed a breathalyser on the gate. Now the guard can only open the gate when he's sober.
- The problem is that he cannot open the gate *for anyone* while he's drunk. All entry to the pool stops if he shows up drunk.
- The security folks want to say "yes, this breathalyser is awesome, leave it in place, it should always have been there".
- The compat folks want to say "the gate hasn't had a breathalyser for years, and it represents a genuine inconvenience to adults who want to swim, so we need to remove it".

We are not trying to take non-CSPRNGs away from you. We are not trying to remove the random module, we are not trying to say that everyone must use urandom for all cases. We totally agree with the consenting adults policy. We just believe that the number of people who have used os.urandom and actively wanted the Linux behaviour may well be zero, and is certainly a tiny fraction of the user base of os.urandom, whereas the number of people who have used os.urandom and expected it to produce safe random bytes is dramatically larger.
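The distinction between the two kinds of generator can be seen in a short sketch. The seed value and alphabet are arbitrary; `random`, `os.urandom()` and the `secrets` module (new in 3.6) are standard library.

```python
import os
import random
import secrets

ALPHABET = "abcdef"

# The random module is a deterministic PRNG: the same seed replays the
# same "random" choices, which is useful for simulations and bad for secrets.
random.seed(1234)
first = [random.choice(ALPHABET) for _ in range(8)]
random.seed(1234)
second = [random.choice(ALPHABET) for _ in range(8)]
replayable = first == second

# For security-sensitive values, draw from the operating system's CSPRNG.
key = os.urandom(16)
token = secrets.token_hex(16)
pick = secrets.choice(ALPHABET)
```

Nothing here takes `random` away; the point is only that the choice between the two is visible in the code.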

We believe that invisible risks are bad. We believe that it is difficult to meaningfully consent to behaviour you do not know about. And we believe that when your expectations are violated, it is better for the system to fail hard than to subtly degrade to a behaviour that puts you at risk, because one of these provides a trigger for action and one does not.

In the case of "consenting adults": users cannot be said to have meaningfully consented to behaviours they do not understand. Consider urllib2 and PEP 476. Prior to Python 2.7.9, urllib2 did not validate TLS certificates. It *could*, if a user was willing to configure it to do so, but by default it did not. We could defend that behaviour under "consenting adults": users *technically* consented to the behaviour by using the code. However, most of these users *passively* consented: their consent is inferred by their use of the API, but they have written no code that actively asserts their consent.

In Requests, we allow consenting adults to turn off cert verification if they want to, but they have to *actively* consent: they *have* to say verify=False. And they have to say it every time: we deliberately provide no global switch to turn off cert validation in Requests, you have to set verify=False every single time. This is very deliberate. If a user wants to shoot themselves in the foot they are welcome to do so, but we don't hand the gun to the user pointed at their foot.

We're arguing for the same here with os.urandom(). If you want the Linux default urandom behaviour, that's fine, but we think that it's surprising to people and that they should be forced to *actively ask* for that behaviour, rather than passively be given it.

The TL;DR is: consent is not the absence of saying no, it's the presence of saying yes.
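
To make the "active consent" idea concrete, here is a minimal sketch of an API that fails hard by default and only degrades when the caller explicitly asks for it. Everything in it except os.urandom() is an assumption made for illustration: the names random_bytes, EntropyNotReady, pool_is_seeded and allow_unseeded are hypothetical and are not part of the stdlib or of any proposal in this thread.

```python
import os


class EntropyNotReady(RuntimeError):
    """Hypothetical error: the system entropy pool has not been seeded yet."""


def random_bytes(n, *, pool_is_seeded=lambda: True, allow_unseeded=False):
    """Return n bytes from os.urandom(), failing hard unless the caller opts in.

    pool_is_seeded is a stand-in (an assumption) for a real platform check.
    """
    if not pool_is_seeded() and not allow_unseeded:
        # Fail loudly: an exception is a trigger for action,
        # whereas silently weaker output is not.
        raise EntropyNotReady("entropy pool not initialised")
    return os.urandom(n)
```

The degraded behaviour is still reachable, but only by writing allow_unseeded=True at the call site, in the same way that verify=False has to be written out in Requests.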

Cory


From cory at lukasa.co.uk  Sat Jun 18 10:30:31 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sat, 18 Jun 2016 15:30:31 +0100
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
Message-ID: <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>

> On 18 Jun 2016, at 04:06, Brett Cannon <brett at python.org> wrote:
> 
> Do we need a security SIG? E.g. would people like Christian and Cory like to have a separate place to talk about the ssl stuff brought up at the language summit?

Honestly, I'm not sure what we would gain.

Unless that SIG is empowered to take action, all it will be is a factory for generating arguments like this one. It will inevitably be either a toxic environment in itself, or a source of toxic threads on python-dev as the security SIG brings new threads like this one to the table.

It should be noted that of the three developers that originally stepped forward on the security side of things here (myself, Donald, and Christian), only I am left subscribed to python-dev and nosy'd on the relevant issues. Put another way: each time we do this, several people on the security side burn themselves out in the thread and walk away (it's possible that those on the other side of the threads do too, I just don't know those people so well). It's hard to get enthusiastic about signing people up for that. =)

Cory

From leewangzhong+rsm at gmail.com  Sat Jun 18 12:57:16 2016
From: leewangzhong+rsm at gmail.com (Franklin Lee)
Date: Sat, 18 Jun 2016 12:57:16 -0400
Subject: [Python-Dev] Compact dict implementations (was: PEP 468
In-Reply-To: <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>
References: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
 <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>
Message-ID: <CAB_e7ixxFrgUQL1J66jdH0K85aOZyc05kFh8Qu5_Ro0+C741qg@mail.gmail.com>

In the original discussion, I think they decided to reimplement set before
dict.

The original discussion is here, for anyone else:
https://mail.python.org/pipermail/python-dev/2012-December/123028.html

On Jun 18, 2016 3:15 AM, "INADA Naoki" <songofacandy at gmail.com> wrote:
> If the builtin dict in both PyPy and CPython is ordered, many people
> will rely on it.
> It will force other Python implementations to implement it for
> compatibility.
> In other words, it may become the de facto "Python Language", even if the
> Python Language spec says it's an implementation detail.
>
> Is it OK?

Ordered, or just initially ordered? I mean, "ordered if no deletion".

They discussed scrambling the order.

(Subdiscussion was here:
https://mail.python.org/pipermail/python-dev/2012-December/123041.html)

From songofacandy at gmail.com  Sat Jun 18 13:13:55 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Sun, 19 Jun 2016 02:13:55 +0900
Subject: [Python-Dev] Compact dict implementations (was: PEP 468
In-Reply-To: <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>
References: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
 <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>
Message-ID: <CAEfz+TxG0QL71Xf7kyEa-rx==GAtLdQSc4do0OxZ=Xy4PC27Wg@mail.gmail.com>

>
> pybench: https://gist.github.com/methane/cfad1427d87ceff9310350e78a214880
> benchmark: https://gist.github.com/methane/5eb11fdd93863813b222e795ca0bfc1f
>
> Is it acceptable?

latest result is here
https://gist.github.com/methane/22cf5d1dadb62bc87a15e9244a9d0ab8

-- 
INADA Naoki  <songofacandy at gmail.com>

From ethan at stoneleaf.us  Sat Jun 18 13:36:56 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 18 Jun 2016 10:36:56 -0700
Subject: [Python-Dev] security SIG?
In-Reply-To: <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
Message-ID: <576586B8.5090009@stoneleaf.us>

On 06/18/2016 07:30 AM, Cory Benfield wrote:
> On 18 Jun 2016, at 04:06, Brett Cannon wrote:

>> Do we need a security SIG? E.g. would people like Christian and Cory like
>> to have a separate place to talk about the ssl stuff brought up at the
>> language summit?
>
> Honestly, I'm not sure what we would gain.

We would gain a place where security enhancements/fixes can be discussed 
by those interested, where the environment is "how do we fix/improve 
such-and-such while breaking as little as possible" (those that want 
backward-compatibility at all costs need not apply ;).

Once a consensus has been reached (and possibly a PEP written, but 
hopefully that part will only rarely be necessary) then the proposal can 
be made to py-dev, complete with the "this portion is backwards 
incompatible, this is the expected impact, this is why it's important, 
here are the other far more painful alternatives".

> Unless that SIG is empowered to take action, all it will be is a factory for
> generating arguments like this one. It will inevitably be either a toxic
> environment in itself, or a source of toxic threads on python-dev as the
> security SIG brings new threads like this one to the table.

I suspect the resulting thread on py-dev will be far less painful when
the initial discussions on ways to fix/improve this-or-that have already
been done, the various options are laid out, and it's clear the new
method will be in the next major release (unless incredibly serious, of
course).

> It should be noted that of the three developers that originally stepped forward
> on the security side of things here (myself, Donald, and Christian), only I am
> left subscribed to python-dev and nosy'd on the relevant issues. Put another way:
> each time we do this, several people on the security side burn themselves out in
> the thread and walk away (it's possible that those on the other side of the
> threads do too, I just don't know those people so well). It's hard to get
> enthusiastic about signing people up for that. =)

One of the big advantages of a SIG is the much reduced pool of 
participants, and that those participants are usually interested in 
forward progress.  It would also be helpful to have a single person both 
champion and act as buffer for the proposals (not necessarily the same 
person each time).  I am reminded of the matrix-multiply PEP brought 
forward by Nathaniel a few months ago -- the proposal was researched 
outside of py-dev, presented to py-dev when ready, Nathaniel acted as 
the gateway between py-dev and those that wanted/needed the change, the 
discussion stayed (pretty much) on track, and it felt like the whole 
thing was very smooth.  (If it was somebody else, my apologies for my 
terrible memory! ;)

To sum up:  I think it would be a good idea.

--
~Ethan~

From songofacandy at gmail.com  Sat Jun 18 13:55:45 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Sun, 19 Jun 2016 02:55:45 +0900
Subject: [Python-Dev] Compact dict implementations (was: PEP 468
In-Reply-To: <CAB_e7ixxFrgUQL1J66jdH0K85aOZyc05kFh8Qu5_Ro0+C741qg@mail.gmail.com>
References: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
 <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>
 <CAB_e7ixxFrgUQL1J66jdH0K85aOZyc05kFh8Qu5_Ro0+C741qg@mail.gmail.com>
Message-ID: <CAEfz+TxfhjGF5J87HF+Fw_G--B62o2O46faRxT_x0nrKmbqZwA@mail.gmail.com>

>
> Ordered, or just initially ordered? I mean, "ordered if no deletion".
>

I implemented "ordered", because:

* "ordered" is easier to explain than "ordered if no deletion".

* I don't want to split the sparse index hash and the dense entry array.
  In the case of a very small dict, the index hash (8 bytes) and the first
  two entries (24*2=48 bytes) can be on one cache line.

* Easy to implement "split dictionary" (aka. key sharing dictionary).

You can see what I implemented here:
https://github.com/methane/cpython/pull/1/files
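
For readers who have not followed the 2012 thread, the "sparse index hash plus dense entry array" layout can be sketched in pure Python as below. This is a toy model under stated assumptions, not the code in the pull request: the class name CompactDict is hypothetical, it uses plain linear probing instead of CPython's probe sequence, it does not support deletion, and Python lists cannot model the single-allocation memory layout mentioned above.

```python
FREE = -1  # marker for an unused slot in the sparse index table


class CompactDict:
    """Toy insertion-ordered mapping: sparse index table + dense entry list."""

    def __init__(self):
        self._indices = [FREE] * 8  # sparse: slot -> position in _entries
        self._entries = []          # dense: (hash, key, value) in insertion order

    def _lookup(self, key, h):
        """Return (slot, position); position is None if the key is absent."""
        mask = len(self._indices) - 1
        slot = h & mask
        while True:
            pos = self._indices[slot]
            if pos == FREE:
                return slot, None
            entry_hash, entry_key, _ = self._entries[pos]
            if entry_hash == h and entry_key == key:
                return slot, pos
            slot = (slot + 1) & mask  # linear probing (a simplification)

    def __setitem__(self, key, value):
        h = hash(key)
        slot, pos = self._lookup(key, h)
        if pos is not None:
            self._entries[pos] = (h, key, value)  # overwrite keeps position
            return
        self._indices[slot] = len(self._entries)
        self._entries.append((h, key, value))
        if len(self._entries) * 3 >= len(self._indices) * 2:
            self._resize()

    def _resize(self):
        """Double the index table; the dense entry list is left untouched."""
        size = len(self._indices) * 2
        self._indices = [FREE] * size
        mask = size - 1
        for pos, (h, _, _) in enumerate(self._entries):
            slot = h & mask
            while self._indices[slot] != FREE:
                slot = (slot + 1) & mask
            self._indices[slot] = pos

    def __getitem__(self, key):
        _, pos = self._lookup(key, hash(key))
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][2]

    def keys(self):
        return [key for _, key, _ in self._entries]
```

Because iteration walks the dense list, insertion order falls out for free as long as nothing is deleted; how deletions are handled is what separates "ordered" from "ordered if no deletion".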

-- 
INADA Naoki  <songofacandy at gmail.com>

From brett at python.org  Sat Jun 18 14:10:29 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 18 Jun 2016 18:10:29 +0000
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
Message-ID: <CAP1=2W7tBnFeNbYxUvuqcLNrYXgtyhO+5p+7ndAeXaDF8PedXQ@mail.gmail.com>

On Sat, 18 Jun 2016 at 07:30 Cory Benfield <cory at lukasa.co.uk> wrote:

>
> > On 18 Jun 2016, at 04:06, Brett Cannon <brett at python.org> wrote:
> >
> > Do we need a security SIG? E.g. would people like Christian and Cory
> like to have a separate place to talk about the ssl stuff brought up at the
> language summit?
>
>
> Honestly, I'm not sure what we would gain.
>
> Unless that SIG is empowered to take action, all it will be is a factory
> for generating arguments like this one. It will inevitably be either a
> toxic environment in itself, or a source of toxic threads on python-dev as
> the security SIG brings new threads like this one to the table.
>
> It should be noted that of the three developers that originally stepped
> forward on the security side of things here (myself, Donald, and
> Christian), only I am left subscribed to python-dev and nosy'd on the
> relevant issues. Put another way: each time we do this, several people on
> the security side burn themselves out in the thread and walk away (it's
> possible that those on the other side of the threads do too, I just don't
> know those people so well). It's hard to get enthusiastic about signing
> people up for that. =)
>

And this is the problem I'm trying to solve. As various people have pointed
out, the conversation was pretty much cordial, but it did end up feeling
like "you're not listening to me" on both sides on top of the volume, which
is what I think burned people out on this thread.

I think Nick brought up the point that we as a group need to come up with
some guideline that we more-or-less stick with to help guide this kind of
discussion or else we are going to burn out regularly any time security
comes up; we can't keep holding security discussions like this or else
we're going to end up in a bad place when everyone burns out and stops
caring.

From raymond.hettinger at gmail.com  Sat Jun 18 15:22:48 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 18 Jun 2016 12:22:48 -0700
Subject: [Python-Dev] Compact dict implementations (was: PEP 468
In-Reply-To: <CAB_e7ixxFrgUQL1J66jdH0K85aOZyc05kFh8Qu5_Ro0+C741qg@mail.gmail.com>
References: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
 <CAEfz+TykBhKiDPWF30QRTXFRgmnng0cC6QefafniL9pwcGOPGg@mail.gmail.com>
 <CAB_e7ixxFrgUQL1J66jdH0K85aOZyc05kFh8Qu5_Ro0+C741qg@mail.gmail.com>
Message-ID: <417DB1ED-405D-417A-B868-EF82F9AEB712@gmail.com>

> On Jun 18, 2016, at 9:57 AM, Franklin Lee <leewangzhong+rsm at gmail.com> wrote:
> 
> In the original discussion, I think they decided to reimplement set before dict.

I ended up going in a different direction with sets (using linear probes to reduce the cost of collisions). Also, after the original discussion, PyPy implemented the idea for dicts and achieved some nice improvements. So, I think Inada Naoki is going in the right direction by focusing on compact dicts.

Raymond

From c4obi at yahoo.com  Sat Jun 18 17:04:10 2016
From: c4obi at yahoo.com (Obiesie ike-nwosu)
Date: Sat, 18 Jun 2016 22:04:10 +0100
Subject: [Python-Dev] JUMP_ABSOLUTE in nested if statements
Message-ID: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>

Hi, 

Could someone give me a hand explaining why we have a JUMP_ABSOLUTE followed by a JUMP_FORWARD opcode when this function is disassembled?

>>> def f1():
...     a, b = 10, 11
...     if a >= 10:
...             if b >= 11:
...                     print("hello world")
...

The disassembled function is shown below.
>>> dis(f1)
  2           0 LOAD_CONST               4 ((10, 11))
              3 UNPACK_SEQUENCE          2
              6 STORE_FAST               0 (a)
              9 STORE_FAST               1 (b)

  3          12 LOAD_FAST                0 (a)
             15 LOAD_CONST               1 (10)
             18 COMPARE_OP               5 (>=)
             21 POP_JUMP_IF_FALSE       47

  4          24 LOAD_FAST                1 (b)
             27 LOAD_CONST               2 (11)
             30 COMPARE_OP               5 (>=)
             33 POP_JUMP_IF_FALSE       47

  5          36 LOAD_CONST               3 ('hello world')
             39 PRINT_ITEM
             40 PRINT_NEWLINE
             41 JUMP_ABSOLUTE           47
             44 JUMP_FORWARD             0 (to 47)
        >>   47 LOAD_CONST               0 (None)
             50 RETURN_VALUE

From my understanding, once JUMP_ABSOLUTE is executed, JUMP_FORWARD is never reached, so it must be dead code. Why is it being generated?
Furthermore, why is JUMP_ABSOLUTE rather than JUMP_FORWARD used in this particular case of nested if statements? I have tried other types of nested if statements and it has always been JUMP_FORWARD that is generated.
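
For anyone who wants to repeat the experiment, the snippet below is one way to list the opcodes for the same function with the stdlib dis module. It is only a sketch: the listing above is from Python 2.7, opcode names and offsets differ between interpreter versions, and newer compilers may not emit the JUMP_ABSOLUTE/JUMP_FORWARD pair at all, so no expected output is given.

```python
import dis


def f1():
    a, b = 10, 11
    if a >= 10:
        if b >= 11:
            print("hello world")


# Print offset, opcode name and a readable argument for each instruction.
for instr in dis.get_instructions(f1):
    print(instr.offset, instr.opname, instr.argrepr)
```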

From victor.stinner at gmail.com  Sat Jun 18 18:18:42 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 19 Jun 2016 00:18:42 +0200
Subject: [Python-Dev] JUMP_ABSOLUTE in nested if statements
In-Reply-To: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>
References: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>
Message-ID: <CAMpsgwa0CHd5iopsLORt98eATa=99BwkYPqOYaQ2+NT23F4few@mail.gmail.com>

Python has a peephole optimizer which does not remove dead code that it
just created.

Victor
On 18 June 2016 at 23:14, "Obiesie ike-nwosu via Python-Dev" <
python-dev at python.org> wrote:

> Hi,
>
> Could some one give a hand with explaining to me why we have a
> JUMP_ABSOLUTE followed by a JUMP_FORWARD op code when this function is
> disassembled.
>
> >>> def f1():
> ...     a, b = 10, 11
> ...     if a >= 10:
> ...             if b >= 11:
> ...                     print("hello world")
> ...
>
> The disassembled function is shown below.
> >>> dis(f1)
>   2           0 LOAD_CONST               4 ((10, 11))
>               3 UNPACK_SEQUENCE          2
>               6 STORE_FAST               0 (a)
>               9 STORE_FAST               1 (b)
>
>   3        12 LOAD_FAST                0 (a)
>             15 LOAD_CONST               1 (10)
>             18 COMPARE_OP               5 (>=)
>             21 POP_JUMP_IF_FALSE       47
>
>   4        24 LOAD_FAST                1 (b)
>             27 LOAD_CONST               2 (11)
>             30 COMPARE_OP               5 (>=)
>             33 POP_JUMP_IF_FALSE       47
>
>   5        36 LOAD_CONST               3 ('hello world')
>             39 PRINT_ITEM
>             40 PRINT_NEWLINE
>             41 JUMP_ABSOLUTE           47
>             44 JUMP_FORWARD             0 (to 47)
>      >>   47 LOAD_CONST               0 (None)
>             50 RETURN_VALUE
>
> From my understanding, once JUMP_ABSOLUTE is executed, then JUMP_FORWARD
> is never gotten to so must be dead code so why is it being generated?
> Furthermore why is JUMP_ABSOLUTE rather than JUMP_FORWARD used in this
> particular case of nested if statements? I have tried other types of nested
> if statements and it has always been JUMP_FORWARD that
> is generated.

From raymond.hettinger at gmail.com  Sat Jun 18 18:10:21 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 18 Jun 2016 15:10:21 -0700
Subject: [Python-Dev] JUMP_ABSOLUTE in nested if statements
In-Reply-To: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>
References: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>
Message-ID: <3646EE2F-C372-49DC-8ADF-5360F3F60BE8@gmail.com>

> On Jun 18, 2016, at 2:04 PM, Obiesie ike-nwosu via Python-Dev <python-dev at python.org> wrote:
> 
> Hi, 
> 
> Could some one give a hand with explaining to me why we have a JUMP_ABSOLUTE followed by a JUMP_FORWARD op code when this function is disassembled.
>  < snipped>
> From my understanding, once JUMP_ABSOLUTE is executed, then JUMP_FORWARD is never gotten to so must be dead code so why is it being generated?
> Furthermore why is JUMP_ABSOLUTE rather than JUMP_FORWARD used in this particular case of nested if statements? I have tried other types of nested if statements and it has always been JUMP_FORWARD that 
> is generated.

The AST compilation step generates code with two JUMP_FORWARDs (see below).  Then, the peephole optimizer recognizes a jump-to-an-unconditional-jump and replaces the first one with a JUMP_ABSOLUTE to save an unnecessary step.

The reason that it uses JUMP_ABSOLUTE instead of JUMP_FORWARD is that the former is more general (it can jump backwards).  Using the more general form reduces the complexity of the optimizer.

The reason that the remaining jump-to-jump isn't optimized is that the peepholer is intentionally kept simplistic, making only a single pass over the opcodes.  That misses some optimizations but gets the most common cases.

FWIW, the jump opcodes are very fast, so missing the final jump-to-jump isn't much of a loss.

If you're curious, the relevant code is in Python/compile.c and Python/peephole.c. The compile.c code generates opcodes in the most straightforward way possible, and then the peephole optimizer picks some of the low-hanging fruit by making a few simple transformations.

Raymond

------------ AST generated code before peephole optimization -----------------

  5           0 LOAD_CONST               1 (10)
              3 LOAD_CONST               2 (11)
              6 BUILD_TUPLE              2
              9 UNPACK_SEQUENCE          2
             12 STORE_FAST               0 (a)
             15 STORE_FAST               1 (b)

  6          18 LOAD_FAST                0 (a)
             21 LOAD_CONST               1 (10)
             24 COMPARE_OP               5 (>=)
             27 POP_JUMP_IF_FALSE       53

  7          30 LOAD_FAST                1 (b)
             33 LOAD_CONST               2 (11)
             36 COMPARE_OP               5 (>=)
             39 POP_JUMP_IF_FALSE       50

  8          42 LOAD_CONST               3 ('hello world')
             45 PRINT_ITEM
             46 PRINT_NEWLINE
             47 JUMP_FORWARD             0 (to 50)
        >>   50 JUMP_FORWARD             0 (to 53)
        >>   53 LOAD_CONST               0 (None)
             56 RETURN_VALUE
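
The single-pass jump threading described above can be modelled with a toy instruction list. This is an illustrative sketch only, not the logic in Python/peephole.c: instructions are hypothetical (opname, target_index) pairs rather than real bytecode, and the function name thread_jumps_once is made up for the example.

```python
# Toy model: instructions are (opname, target_index) pairs, not real bytecode.
UNCONDITIONAL = {"JUMP_FORWARD", "JUMP_ABSOLUTE"}


def thread_jumps_once(code):
    """Single pass: retarget any jump whose target is an unconditional jump."""
    out = list(code)
    for i, (op, target) in enumerate(out):
        if target is None:
            continue  # not a jump
        target_op, final_target = out[target]
        if target_op in UNCONDITIONAL:
            # Unconditional jumps are rewritten to the more general form.
            new_op = "JUMP_ABSOLUTE" if op in UNCONDITIONAL else op
            out[i] = (new_op, final_target)
    return out


code = [
    ("POP_JUMP_IF_FALSE", 5),  # 0: outer test, skips straight to the end
    ("POP_JUMP_IF_FALSE", 4),  # 1: inner test, lands on a jump
    ("PRINT", None),           # 2: body
    ("JUMP_FORWARD", 4),       # 3: lands on a jump
    ("JUMP_FORWARD", 5),       # 4: the second jump
    ("RETURN", None),          # 5: end of function
]

optimized = thread_jumps_once(code)
```

After the pass, instructions 1 and 3 both point directly at index 5, and instruction 3 has become a JUMP_ABSOLUTE. Instruction 4 is still present even though nothing jumps to it any more, which mirrors the leftover JUMP_FORWARD in the original question: the pass retargets jumps but never goes back to delete code it has just made unreachable.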

From barry at python.org  Sat Jun 18 18:41:23 2016
From: barry at python.org (Barry Warsaw)
Date: Sat, 18 Jun 2016 18:41:23 -0400
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
Message-ID: <20160618184123.4ad9b93b.barry@wooz.org>

On Jun 18, 2016, at 03:30 PM, Cory Benfield wrote:

>Unless that SIG is empowered to take action

It wouldn't be, but there *is* a private security mailing list that is.
Christian was on it, and I'm sad that he got burned out.  If you are willing
and able to help out there, please contact security at python dot org.

Cheers,
-Barry

From steve.dower at python.org  Sat Jun 18 18:47:57 2016
From: steve.dower at python.org (Steve Dower)
Date: Sat, 18 Jun 2016 15:47:57 -0700
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <CAP1=2W7tBnFeNbYxUvuqcLNrYXgtyhO+5p+7ndAeXaDF8PedXQ@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
 <CAP1=2W7tBnFeNbYxUvuqcLNrYXgtyhO+5p+7ndAeXaDF8PedXQ@mail.gmail.com>
Message-ID: <E1bEP2H-0008Vd-Qh@se2-syd.hostedmail.net.au>

It's not just security discussions. The same thing happened with fspath, tzinfo, and many others that I have erased from my own memory. distutils-sig sees them often as well.

The whole thing seems like a limitation of written communication. There's no way to indicate or define whether something should be nitpicked or not, and so everything gets line-by-line analysis whether it deserves it or not, which is what leads to such huge and fragmented threads, regardless of topic.

At work, when we start seeing email or IM discussions going this way, we schedule a meeting. Perhaps we need a formal outlet for suspending discussion (and moderating incoming emails with a particular subject?) until an online call can be held and outcomes presented back to the list. Maybe we should schedule monthly online language summits and defer these discussions/decisions to that?

I know that change won't be popular with some people. Honestly, if you haven't contributed more than the people who quit python-dev over these threads, you don't get to demand status quo. We need to change something, and I don't think more email or mute buttons (sorry Guido :) ) are the answer.

Top-posted from my Windows Phone

-----Original Message-----
From: "Brett Cannon" <brett at python.org>
Sent: 6/18/2016 11:13
To: "Cory Benfield" <cory at lukasa.co.uk>
Cc: "Nick Coghlan" <ncoghlan at gmail.com>; "Python Dev" <python-dev at python.org>
Subject: Re: [Python-Dev] security SIG? (was: Discussion overload)

On Sat, 18 Jun 2016 at 07:30 Cory Benfield <cory at lukasa.co.uk> wrote:

> On 18 Jun 2016, at 04:06, Brett Cannon <brett at python.org> wrote:
>
> Do we need a security SIG? E.g. would people like Christian and Cory like to have a separate place to talk about the ssl stuff brought up at the language summit?

Honestly, I'm not sure what we would gain.

Unless that SIG is empowered to take action, all it will be is a factory for generating arguments like this one. It will inevitably be either a toxic environment in itself, or a source of toxic threads on python-dev as the security SIG brings new threads like this one to the table.

It should be noted that of the three developers that originally stepped forward on the security side of things here (myself, Donald, and Christian), only I am left subscribed to python-dev and nosy'd on the relevant issues. Put another way: each time we do this, several people on the security side burn themselves out in the thread and walk away (it's possible that those on the other side of the threads do too, I just don't know those people so well). It's hard to get enthusiastic about signing people up for that. =)

And this is the problem I'm trying to solve. As various people have pointed out, the conversation was pretty much cordial, but it did end up feeling like "you're not listening to me" on both sides on top of the volume, which is what I think burned people out on this thread.

I think Nick brought up the point that we as a group need to come up with some guideline that we more-or-less stick with to help guide this kind of discussion or else we are going to burn out regularly any time security comes up; we can't keep holding security discussions like this or else we're going to end up in a bad place when everyone burns out and stops caring. 

From c4obi at yahoo.com  Sat Jun 18 18:32:52 2016
From: c4obi at yahoo.com (Obiesie ike-nwosu)
Date: Sat, 18 Jun 2016 23:32:52 +0100
Subject: [Python-Dev] JUMP_ABSOLUTE in nested if statements
In-Reply-To: <3646EE2F-C372-49DC-8ADF-5360F3F60BE8@gmail.com>
References: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>
 <3646EE2F-C372-49DC-8ADF-5360F3F60BE8@gmail.com>
Message-ID: <FE1B57D1-7033-4189-AC84-0D83769BD52B@yahoo.com>

That is much clearer now. 
Thanks a lot Raymond for taking the time out to explain this to me.
 On a closing note, is this mailing list the right place to ask these kinds of n00b questions?

Obi.
> On 18 Jun 2016, at 23:10, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:
> 
> 
>> On Jun 18, 2016, at 2:04 PM, Obiesie ike-nwosu via Python-Dev <python-dev at python.org> wrote:
>> 
>> Hi, 
>> 
>> Could some one give a hand with explaining to me why we have a JUMP_ABSOLUTE followed by a JUMP_FORWARD op code when this function is disassembled.
>> < snipped>
>> From my understanding, once JUMP_ABSOLUTE is executed, then JUMP_FORWARD is never gotten to so must be dead code so why is it being generated?
>> Furthermore why is JUMP_ABSOLUTE rather than JUMP_FORWARD used in this particular case of nested if statements? I have tried other types of nested if statements and it has always been JUMP_FORWARD that 
>> is generated.
> 
> The AST compilation step generates code with two JUMP_FORWARDs (see below).  Then, the peephole optimizer recognizes a jump-to-an-unconditional-jump and replaces the first one with a JUMP_ABSOLUTE to save an unnecessary step.
> 
> The reason that it uses JUMP_ABSOLUTE instead of JUMP_FORWARD is that the former is more general (it can jump backwards).  Using the more general form reduces the complexity of the optimizer.
> 
> The reason that the remaining jump-to-jump isn't optimized is that the peepholer is intentionally kept simplistic, making only a single pass over the opcodes.  That misses some optimizations but gets the most common cases.
> 
> FWIW, the jump opcodes are very fast, so missing the final jump-to-jump isn't much of a loss.
> 
> If you're curious, the relevant code is in Python/compile.c and Python/peephole.c.  The compile.c code generated opcodes in the most straight-forward way possible and then the peephole optimizer gets some of the low-hanging fruit by making a few simple transformations.
> 
> 
> Raymond
> 
> 
> ------------ AST generated code before peephole optimization -----------------
> 
> 
>  5           0 LOAD_CONST               1 (10)
>              3 LOAD_CONST               2 (11)
>              6 BUILD_TUPLE              2
>              9 UNPACK_SEQUENCE          2
>             12 STORE_FAST               0 (a)
>             15 STORE_FAST               1 (b)
> 
>  6          18 LOAD_FAST                0 (a)
>             21 LOAD_CONST               1 (10)
>             24 COMPARE_OP               5 (>=)
>             27 POP_JUMP_IF_FALSE       53
> 
>  7          30 LOAD_FAST                1 (b)
>             33 LOAD_CONST               2 (11)
>             36 COMPARE_OP               5 (>=)
>             39 POP_JUMP_IF_FALSE       50
> 
>  8          42 LOAD_CONST               3 ('hello world')
>             45 PRINT_ITEM
>             46 PRINT_NEWLINE
>             47 JUMP_FORWARD             0 (to 50)
>        >>   50 JUMP_FORWARD             0 (to 53)
>        >>   53 LOAD_CONST               0 (None)
>             56 RETURN_VALUE
> 

From guido at python.org  Sat Jun 18 20:39:54 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 18 Jun 2016 17:39:54 -0700
Subject: [Python-Dev] security SIG? (was: Discussion overload)
In-Reply-To: <E1bEP2H-0008Vd-Qh@se2-syd.hostedmail.net.au>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
 <CAP1=2W7tBnFeNbYxUvuqcLNrYXgtyhO+5p+7ndAeXaDF8PedXQ@mail.gmail.com>
 <E1bEP2H-0008Vd-Qh@se2-syd.hostedmail.net.au>
Message-ID: <CAP7+vJL8fjro6NSYDtgmZcTs1mHCvfujsSQgyaQ_7cL7Av+dhA@mail.gmail.com>

Like it or not, written communication is all we have. However, I do think
we are running into some kind of limitation: the ancient concept of mailing
lists (or newsgroups). I would like to continue the discussion of this
limitation in the original thread.

PS. I think it's somewhat ironic that Steve posted his idea to deal with
discussions run amok in the forked thread that was meant specifically to
discuss the proposal for a security-sig. Ditto that Cory used this same
thread to bring up his philosophy about computer security -- that topic
itself belongs clearly in the proposed SIG or on python-dev (if we don't
create a SIG) but not (yet) in the discussion about whether we should
create a SIG.

On Sat, Jun 18, 2016 at 3:47 PM, Steve Dower <steve.dower at python.org> wrote:

> It's not just security discussions. The same thing happened with fspath,
> tzinfo, and many others that I have erased from my own memory.
> distutils-sig sees them often as well.
>
> The whole thing seems like a limitation of written communication. There's
> no way to indicate or define whether something should be nitpicked or not,
> and so everything gets line-by-line analysis whether it deserves it or not,
> which is what leads to such huge and fragmented threads, regardless of
> topic.
>
> At work, when we start seeing email or IM discussions going this way, we
> schedule a meeting. Perhaps we need a formal outlet for suspending
> discussion (and moderating incoming emails with a particular subject?)
> until an online call can be held and outcomes presented back to the list.
> Maybe we should schedule monthly online language summits and defer these
> discussions/decisions to that?
>
> I know that change won't be popular with some people. Honestly, if you
> haven't contributed more than the people who quit python-dev over these
> threads, you don't get to demand status quo. We need to change something,
> and I don't think more email or mute buttons (sorry Guido :) ) are the
> answer.
>
> Top-posted from my Windows Phone
> ------------------------------
> From: Brett Cannon <brett at python.org>
> Sent: 6/18/2016 11:13
> To: Cory Benfield <cory at lukasa.co.uk>
> Cc: Nick Coghlan <ncoghlan at gmail.com>; Python Dev <python-dev at python.org>
> Subject: Re: [Python-Dev] security SIG? (was: Discussion overload)
>
>
>
> On Sat, 18 Jun 2016 at 07:30 Cory Benfield <cory at lukasa.co.uk> wrote:
>
>>
>> > On 18 Jun 2016, at 04:06, Brett Cannon <brett at python.org> wrote:
>> >
>> > Do we need a security SIG? E.g. would people like Christian and Cory
>> > like to have a separate place to talk about the ssl stuff brought up at the
>> > language summit?
>>
>>
>> Honestly, I'm not sure what we would gain.
>>
>> Unless that SIG is empowered to take action, all it will be is a factory
>> for generating arguments like this one. It will inevitably be either a
>> toxic environment in itself, or a source of toxic threads on python-dev as
>> the security SIG brings new threads like this one to the table.
>>
>> It should be noted that of the three developers that originally stepped
>> forward on the security side of things here (myself, Donald, and
>> Christian), only I am left subscribed to python-dev and nosy'd on the
>> relevant issues. Put another way: each time we do this, several people on
>> the security side burn themselves out in the thread and walk away (it's
>> possible that those on the other side of the threads do too, I just don't
>> know those people so well). It's hard to get enthusiastic about signing
>> people up for that. =)
>>
>
> And this is the problem I'm trying to solve. As various people have
> pointed out, the conversation was pretty much cordial, but it did end up
> feeling like "you're not listening to me" on both sides on top of the
> volume, which is what I think burned people out on this thread.
>
> I think Nick brought up the point that we as a group need to come up with
> some guideline that we more-or-less stick with to help guide this kind of
> discussion or else we are going to burn out regularly any time security
> comes up; we can't keep holding security discussions like this or else
> we're going to end up in a bad place when everyone burns out and stops
> caring.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>

-- 
--Guido van Rossum (python.org/~guido)

From brett at python.org  Sat Jun 18 21:17:03 2016
From: brett at python.org (Brett Cannon)
Date: Sun, 19 Jun 2016 01:17:03 +0000
Subject: [Python-Dev] Discussion overload
In-Reply-To: <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
Message-ID: <CAP1=2W4dsmf1LsjvE3dWi2Dz2+apDieYzTGP_s=PArT2B=hWQg@mail.gmail.com>

Over on the "security SIG" thread, the point has been made that we seem to
be hitting some limits in communication (Steve Dower said written
communication, Guido said mailing lists/newsgroups). Based on the burnout
we are seeing from these centi-threads we need to try and come up with some
solution to this problem, else we are heading towards a bad place due to
communication burn-out.

For me, I don't think we can give up written communication thanks to how
worldwide we all are and thus make scheduling some monthly video chat very
difficult. What I would like to consider, though, is something like
Discourse where we at least have a chance to have tools available to us to
manage discussions better than through federated email where everyone has
different experiences in terms of delivery rate, ability to filter,
splitting discussions, locking down out-of-control discussions, etc. I
think harmonizing the experience along with better controls could help make
all of this more manageable.

On Fri, Jun 17, 2016, 18:13 Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 16 June 2016 at 19:00, Kevin Ollivier <kevin-lists at theolliviers.com>
> wrote:
> > Hi Guido,
> >
> > From: <gvanrossum at gmail.com> on behalf of Guido van Rossum
> > <guido at python.org>
> > Reply-To: <guido at python.org>
> > Date: Thursday, June 16, 2016 at 5:27 PM
> > To: Kevin Ollivier <kevin-lists at theolliviers.com>
> > Cc: Python Dev <python-dev at python.org>
> > Subject: Re: [Python-Dev] Discussion overload
> >
> > Hi Kevin,
> >
> > I often feel the same way. Are you using GMail? It combines related messages
> > in threads and lets you mute threads. I often use this feature so I can
> > manage my inbox. (I presume other mailers have the same features, but I
> > don't know if all of them do.) There are also many people who read the list
> > on a website, e.g. gmane. (Though I think that sometimes the delays incurred
> > there add to the noise -- e.g. when a decision is reached on the list
> > sometimes people keep responding to earlier threads.)
> >
> >
> > I fear I did quite a poor job of making my point. :( I've been on open
> > source mailing lists since the late 90s, so I've learned strategies for
> > dealing with mailing list overload. I've got my mail folders, my mail rules,
> > etc. Having been on many mailing lists over the years, I've seen many
> > productive discussions and many unproductive ones, and over time you start
> > to see patterns. You also see what happens to those communities over time.
>
> This is one of the major reasons we have the option of escalating
> things to the PEP process (and that's currently in train for
> os.urandom), as well as the SIGs for when folks really need to dig
> into topics that risk incurring a relatively low signal-to-noise
> ratio on python-dev. It's also why python-ideas was turned into a
> separate list, since folks without the time for more speculative
> discussions and brainstorming can safely ignore it, while remaining
> confident that any ideas considered interesting enough for further
> review will be brought to python-dev's attention.
>
> But yes, one of the more significant design errors I've made with the
> contextlib API was due to just such a draining pile-on by folks that
> weren't happy the original name wasn't a 100% accurate description of
> the underlying mechanics (even though it was an accurate description
> of the intended use case), and "people yelling at you on project
> communication channels without doing adequate research first" is the
> number one reason we see otherwise happily engaged core developers
> decide to find something else to do with their time.
>
> The challenge and art in community management in that context is
> balancing telling both old and new list participants "It's OK to ask
> 'Why is this so?', as sometimes the answer is that there isn't a good
> reason and we may want to change it" and "Learn to be a good peer
> manager, and avoid behaving like a micro-managing autocrat that chases
> away experienced contributors".
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Sat Jun 18 21:57:49 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 18 Jun 2016 18:57:49 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <CAP1=2W4dsmf1LsjvE3dWi2Dz2+apDieYzTGP_s=PArT2B=hWQg@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W4dsmf1LsjvE3dWi2Dz2+apDieYzTGP_s=PArT2B=hWQg@mail.gmail.com>
Message-ID: <CAP7+vJ+hf3ir6fmG3eQa0ENkFh1nt1b4ttbLf=rsGeMesh1w6g@mail.gmail.com>

On Sat, Jun 18, 2016 at 6:17 PM, Brett Cannon <brett at python.org> wrote:

> Over on the "security SIG" thread, the point has been made that we seem to
> be hitting some limits in communication (Steve Dower said written
> communication, Guido said mailing lists/newsgroups). Based on the burnout
> we are seeing from these centi-threads we need to try and come up with some
> solution to this problem, else we are heading towards a bad place [d]ue to
> communication burn-out.
>
> For me, I don't think we can give up written communication thanks to how
> worldwide we all are and thus make scheduling some monthly video chat very
> difficult. What I would like to consider, though, is something like
> Discourse where we at least have a chance to have tools available to us to
> manage discussions better than through federated email where everyone has
> different experiences in terms of delivery rate, ability to filter,
> splitting discussions, locking down out-of-control discussions, etc. I
> think harmonizing the experience along with better controls could help make
> all of this more manageable.
>
Agreed that any form of real-time communication is out.

First, I want to apologize to Kevin -- I only skimmed his message. I only
saw that he had carefully qualified himself as a long-time open source
contributor and list participant when I re-read his message.

I also want to keep this short, so I'm proof-reading this before posting.

Many projects on which I am currently working use one or more GitHub issue
trackers as their main communication mechanism (mypy et al. don't even have
a mailing list). I find that this works quite well to stay focused. We have
quite a few issues that track important discussions over many days, weeks
or months, and there is very little noise or cross-talk. It's easy to stay
on topic, it's much easier to refer to other topics, it's easy to mute
individual topics, and it's much less likely that a topic degenerates into
a different discussion altogether (because it's easy to create a new issue
for it). It's also easier to moderate, and you can even edit conversations
(with restraint). I also like that it's possible to do
sentence-by-sentence quotation, but the extra effort required (copy/paste)
encourages a linear thread of conversation within one issue.

I did a quick check of my inbox and I think over the past week I had about
as many mypy-related messages generated by GitHub as there were python-dev
messages. And I felt much less bad for ignoring much of the mypy traffic
while I was on vacation than I felt for ignoring python-dev, because it's
easy to catch up using GitHub's web UI. (And no, I don't want to use gmane.
I think it doesn't solve any of the other problems.)

I don't know Discourse, but if it has a similar (or even better) feature
set maybe we should give it a try. Or, now that we're going to migrate the
CPython repo to GitHub, maybe we could just give GitHub's issue tracker a
try? We could create a repo that has just a tracker (or a tracker plus a
README.md explaining its purpose -- eventually we could add more resources
and even a wiki).

I'm sure that in the venerable python-dev tradition everyone is now jumping
to give their opinion about Discourse, the GitHub tracker, their favorite
alternative, the needs for free-form discussion, the need to have a GitHub
account to participate, Slack, and the upcoming Mailman 3.0. But let's not
do that, because it would be too self-referential (and defeat the purpose).
I think we seriously need to rethink the way we have conversations here,
and that includes the conversation about conversations.

Here's my proposal: let's decide what to do about this roughly the same way
we decided what to do with Mercurial. We don't have to take as long, but
we'll use a similar process: a small committee run by a dedicated volunteer
will compare alternatives and pick a strategy. If you're interested in
serving on this committee, send me email off-list. If you want to head the
committee, ditto. If you reply-all, you're automatically disqualified. :-)

-- 
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info  Sat Jun 18 23:46:51 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 19 Jun 2016 13:46:51 +1000
Subject: [Python-Dev] JUMP_ABSOLUTE in nested if statements
In-Reply-To: <FE1B57D1-7033-4189-AC84-0D83769BD52B@yahoo.com>
References: <5509708F-76C5-431F-A1BB-7F379E86B184@yahoo.com>
 <3646EE2F-C372-49DC-8ADF-5360F3F60BE8@gmail.com>
 <FE1B57D1-7033-4189-AC84-0D83769BD52B@yahoo.com>
Message-ID: <20160619034651.GP27919@ando.pearwood.info>

On Sat, Jun 18, 2016 at 11:32:52PM +0100, Obiesie ike-nwosu via Python-Dev wrote:
> That is much clearer now. 
> Thanks a lot Raymond for taking the time out to explain this to me.
>  On a closing note, is this mailing list the right place to ask these kinds of n00b questions?

That depends what sort of n00b question.

If they are specifically related to the internals of the CPython 
interpreter, then this is certainly the right place. Code generation 
will count as an internal function of the interpreter.

If they're general questions about Python the language, then the 
python-list mailing list is better. (Also available as comp.lang.python 
on Usenet.) Beware: it tends to be a high-volume, easily distracted 
forum where people often go off on long discussions which are only 
peripherally related to Python.

-- 
Steve

From songofacandy at gmail.com  Sat Jun 18 23:48:43 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Sun, 19 Jun 2016 12:48:43 +0900
Subject: [Python-Dev] Compact dict implementations (was: PEP 468
In-Reply-To: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
References: <CAEfz+Tzk_uNWwpkzpWCibYt1q0G61ztcNEBnqei9sPUezdF7mg@mail.gmail.com>
Message-ID: <CAEfz+Tz5TNZczHgbhb_HnYT7_Oi8xb9OCZkUiKAsN6o5vy6SUw@mail.gmail.com>

I've sent my patch to the issue tracker, since I can't fix some of the
remaining TODOs by myself.

http://bugs.python.org/issue27350

On Fri, Jun 17, 2016 at 6:15 PM, INADA Naoki <songofacandy at gmail.com> wrote:
> Hi, developers.
>
> I'm trying to implement compact dict.
> https://github.com/methane/cpython/pull/1
>
> The current status is that most of the tests pass.
> Some tests are failing because I haven't updated `sizeof` yet; I will do so once the layout is fixed.
> And I haven't yet dropped the linked list that OrderedDict has.
>
> Before finishing implementation, I want to see comments and tests from core
> developers.
> Please come to the core-mentorship ML or the pull request and try it if you
> are interested.
>
> Regards,
> --
> INADA Naoki  <songofacandy at gmail.com>

-- 
INADA Naoki  <songofacandy at gmail.com>

From guido at python.org  Sun Jun 19 00:48:47 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 18 Jun 2016 21:48:47 -0700
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
Message-ID: <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>

Hi Brett,

I've got a few questions about the specific design. Probably you know the
answers, it would be nice to have them in the PEP.

First, why not have a global hook? What does a hook per interpreter give
you? Would even finer granularity buy anything?

Next, I'm a bit (but no more than a bit) concerned about the extra 8 bytes
per code object, especially since for most people this is just waste
(assuming most people won't be using Pyjion or Numba). Could it be a
compile-time feature (requiring recompilation of CPython but not
extensions)? Could you figure out some other way to store per-code-object
data? It seems you considered this but decided that the co_extra field was
simpler and faster; I'm basically pushing a little harder on this. Of
course most of the PEP would disappear without this feature; the extra
interpreter field is fine.

Finally, there are some error messages from pep2html.py:
https://www.python.org/dev/peps/pep-0523/#copyright

--Guido

On Fri, Jun 17, 2016 at 7:58 PM, Brett Cannon <brett at python.org> wrote:

> I have taken PEP 523 for this:
> https://github.com/python/peps/blob/master/pep-0523.txt .
>
> I'm waiting until Guido gets back from vacation, at which point I'll ask
> for a pronouncement or assignment of a BDFL delegate.
>
> On Fri, 3 Jun 2016 at 14:37 Brett Cannon <brett at python.org> wrote:
>
>> For those of you who follow python-ideas or were at the PyCon US 2016
>> language summit, you have already seen/heard about this PEP. For those of
>> you who don't fall into either of those categories, this PEP proposes a
>> frame evaluation API for CPython. The motivating example of this work has
>> been Pyjion, the experimental CPython JIT Dino Viehland and I have been
>> working on in our spare time at Microsoft. The API also works for
>> debugging, though, as already demonstrated by Google having added a very
>> similar API internally for debugging purposes.
>>
>> The PEP is pasted in below and also available in rendered form at
>> https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I will assign
>> myself a PEP # once discussion is finished as it's easier to work in git
>> for this for the rich rendering of the in-progress PEP).
>>
>> I should mention that the differences in the PEP since python-ideas and the
>> language summit are the mention of Google's use of a very similar API and
>> a clarification that the co_extra field on code objects doesn't change
>> their immutability (at least from the view of the PEP).
>>
>> ----------
>> PEP: NNN
>> Title: Adding a frame evaluation API to CPython
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Brett Cannon <brett at python.org>,
>>         Dino Viehland <dinov at microsoft.com>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 16-May-2016
>> Post-History: 16-May-2016
>>               03-Jun-2016
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes to expand CPython's C API [#c-api]_ to allow for
>> the specification of a per-interpreter function pointer to handle the
>> evaluation of frames [#pyeval_evalframeex]_. This proposal also
>> suggests adding a new field to code objects [#pycodeobject]_ to store
>> arbitrary data for use by the frame evaluation function.
>>
>>
>> Rationale
>> =========
>>
>> One place where flexibility has been lacking in Python is in the direct
>> execution of Python code. While CPython's C API [#c-api]_ allows for
>> constructing the data going into a frame object and then evaluating it
>> via ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_, control over the
>> execution of Python code comes down to individual objects instead of a
>> holistic control of execution at the frame level.
>>
>> While wanting to have influence over frame evaluation may seem a bit
>> too low-level, it does open the possibility for things such as a
>> method-level JIT to be introduced into CPython without CPython itself
>> having to provide one. By allowing external C code to control frame
>> evaluation, a JIT can participate in the execution of Python code at
>> the key point where evaluation occurs. This then allows for a JIT to
>> conditionally recompile Python bytecode to machine code as desired
>> while still allowing for executing regular CPython bytecode when
>> running the JIT is not desired. This can be accomplished by allowing
>> interpreters to specify what function to call to evaluate a frame. And
>> by placing the API at the frame evaluation level it allows for a
>> complete view of the execution environment of the code for the JIT.
>>
>> This ability to specify a frame evaluation function also allows for
>> other use-cases beyond just opening CPython up to a JIT. For instance,
>> it would not be difficult to implement a tracing or profiling function
>> at the call level with this API. While CPython does provide the
>> ability to set a tracing or profiling function at the Python level,
>> this would be able to match the data collection of the profiler and
>> quite possibly be faster for tracing by simply skipping per-line
>> tracing support.
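
For comparison with the paragraph above, here is a minimal sketch of the existing Python-level hook, sys.setprofile(), counting call events. The names count_calls, helper and work are made up for illustration. This shows the mechanism available today, not the proposed C API.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and count Python-level 'call' events per function name."""
    counts = Counter()

    def profiler(frame, event, arg):
        # 'call' fires on entry to a Python function; C calls use 'c_call'.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always uninstall the hook
    return result, counts


def helper(x):
    return x + 1


def work(n):
    return sum(helper(i) for i in range(n))
```

Calling count_calls(work, 3) returns the result of work(3) together with a Counter keyed by function name.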
>>
>> It also opens up the possibility of debugging where the frame
>> evaluation function only performs special debugging work when it
>> detects it is about to execute a specific code object. In that
>> instance the bytecode could be theoretically rewritten in-place to
>> inject a breakpoint function call at the proper point for help in
>> debugging while not having to do a heavy-handed approach as
>> required by ``sys.settrace()``.
>>
>> To help facilitate these use-cases, we are also proposing the adding
>> of a "scratch space" on code objects via a new field. This will allow
>> per-code object data to be stored with the code object itself for easy
>> retrieval by the frame evaluation function as necessary. The field
>> itself will simply be a ``PyObject *`` type so that any data stored in
>> the field will participate in normal object memory management.
>>
>>
>> Proposal
>> ========
>>
>> All proposed C API changes below will not be part of the stable ABI.
>>
>>
>> Expanding ``PyCodeObject``
>> --------------------------
>>
>> One field is to be added to the ``PyCodeObject`` struct
>> [#pycodeobject]_::
>>
>>   typedef struct {
>>      ...
>>      PyObject *co_extra;  /* "Scratch space" for the code object. */
>>   } PyCodeObject;
>>
>> The ``co_extra`` will be ``NULL`` by default and will not be used by
>> CPython itself. Third-party code is free to use the field as desired.
>> Values stored in the field are expected to not be required in order
>> for the code object to function, allowing the loss of the data of the
>> field to be acceptable (this keeps the code object as immutable from
>> a functionality point-of-view; this is slightly contentious and so is
>> listed as an open issue in `Is co_extra needed?`_). The field will be
>> freed like all other fields on ``PyCodeObject`` during deallocation
>> using ``Py_XDECREF()``.
>>
>> It is not recommended that multiple users attempt to use the
>> ``co_extra`` simultaneously. While a dictionary could theoretically be
>> set to the field and various users could use a key specific to the
>> project, there is still the issue of key collisions as well as
>> performance degradation from using a dictionary lookup on every frame
>> evaluation. Users are expected to do a type check to make sure that
>> the field has not been previously set by someone else.
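
The type check described above can be sketched in Python as follows. This is only a model: FakeCode stands in for a code object carrying the proposed co_extra field, and MyToolData and get_my_data are hypothetical names for one tool's private record and accessor. Real users of the field would do this in C.

```python
class FakeCode:
    """Stand-in for a code object with the proposed co_extra slot."""

    def __init__(self):
        self.co_extra = None  # models the NULL default from the proposal


class MyToolData:
    """Hypothetical per-code-object data owned by a single tool."""

    def __init__(self):
        self.exec_count = 0


def get_my_data(code):
    """Return this tool's data, claiming the slot if it is still free.

    Returns None when some other user already owns co_extra.
    """
    extra = code.co_extra
    if extra is None:
        extra = code.co_extra = MyToolData()
    elif not isinstance(extra, MyToolData):
        return None  # someone else got there first; leave their data alone
    return extra
```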
>>
>>
>> Expanding ``PyInterpreterState``
>> --------------------------------
>>
>> The entrypoint for the frame evaluation function is per-interpreter::
>>
>>   // Same type signature as PyEval_EvalFrameEx().
>>   typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);
>>
>>   typedef struct {
>>       ...
>>       PyFrameEvalFunction eval_frame;
>>   } PyInterpreterState;
>>
>> By default, the ``eval_frame`` field will be initialized to a function
>> pointer that represents what ``PyEval_EvalFrameEx()`` currently is
>> (called ``PyEval_EvalFrameDefault()``, discussed later in this PEP).
>> Third-party code may then set their own frame evaluation function
>> instead to control the execution of Python code. A pointer comparison
>> can be used to detect if the field is set to
>> ``PyEval_EvalFrameDefault()`` and thus has not been mutated yet.
>>
>>
>> Changes to ``Python/ceval.c``
>> -----------------------------
>>
>> ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it currently stands
>> will be renamed to ``PyEval_EvalFrameDefault()``. The new
>> ``PyEval_EvalFrameEx()`` will then become::
>>
>>     PyObject *
>>     PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
>>     {
>>         PyThreadState *tstate = PyThreadState_GET();
>>         return tstate->interp->eval_frame(frame, throwflag);
>>     }
>>
>> This allows third-party code to place themselves directly in the path
>> of Python code execution while being backwards-compatible with code
>> already using the pre-existing C API.
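
The dispatch described above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the C implementation: FakeInterpreterState, eval_frame_ex, is_hook_untouched and logging_eval_frame are illustrative names, and frames are represented by arbitrary objects.

```python
def eval_frame_default(frame, throwflag):
    """Stand-in for PyEval_EvalFrameDefault(): pretend to run the frame."""
    return ("default", frame)


class FakeInterpreterState:
    """Stand-in for PyInterpreterState with the proposed eval_frame field."""

    def __init__(self):
        self.eval_frame = eval_frame_default


def eval_frame_ex(interp, frame, throwflag=0):
    """Model of the new PyEval_EvalFrameEx(): dispatch through the hook."""
    return interp.eval_frame(frame, throwflag)


def is_hook_untouched(interp):
    # Models the "pointer comparison" check with an identity test.
    return interp.eval_frame is eval_frame_default


def logging_eval_frame(log):
    """Build a hypothetical third-party hook that records, then defers."""

    def hook(frame, throwflag):
        log.append(frame)
        return eval_frame_default(frame, throwflag)

    return hook
```

Existing callers keep calling eval_frame_ex() unchanged; only the function stored in the interpreter state differs.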
>>
>>
>> Updating ``python-gdb.py``
>> --------------------------
>>
>> The generated ``python-gdb.py`` file used for Python support in GDB
>> makes some hard-coded assumptions about ``PyEval_EvalFrameEx()``, e.g.
>> the names of local variables. It will need to be updated to work with
>> the proposed changes.
>>
>>
>> Performance impact
>> ==================
>>
>> As this PEP is proposing an API to add pluggability, performance
>> impact is considered only in the case where no third-party code has
>> made any changes.
>>
>> Several runs of pybench [#pybench]_ consistently showed no performance
>> cost from the API change alone.
>>
>> A run of the Python benchmark suite [#py-benchmarks]_ showed no
>> measurable cost in performance.
>>
>> In terms of memory impact, since there are typically not many CPython
>> interpreters executing in a single process that means the impact of
>> ``co_extra`` being added to ``PyCodeObject`` is the only worry.
>> According to [#code-object-count]_, a run of the Python test suite
>> results in about 72,395 code objects being created. On a 64-bit
>> CPU that would result in 579,160 bytes of extra memory being used if
>> all code objects were alive at once and had nothing set in their
>> ``co_extra`` fields.
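
The memory figure above is simply the code object count multiplied by the size of one pointer. A quick check, where the variable names are illustrative and the count is the one quoted in the text:

```python
import struct

code_objects = 72_395        # count quoted in the PEP text
pointer_size_64bit = 8       # bytes per pointer on a 64-bit CPU
extra_bytes = code_objects * pointer_size_64bit
print(extra_bytes)           # 579160

# The same estimate for whatever interpreter runs this script.
local_pointer_size = struct.calcsize("P")
print(code_objects * local_pointer_size)
```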
>>
>>
>> Example Usage
>> =============
>>
>> A JIT for CPython
>> -----------------
>>
>> Pyjion
>> ''''''
>>
>> The Pyjion project [#pyjion]_ has used this proposed API to implement
>> a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each code
>> object has its ``co_extra`` field set to a ``PyjionJittedCode`` object
>> which stores four pieces of information:
>>
>> 1. Execution count
>> 2. A boolean representing whether a previous attempt to JIT failed
>> 3. A function pointer to a trampoline (which can be type tracing or not)
>> 4. A void pointer to any JIT-compiled machine code
>>
>> The frame evaluation function has (roughly) the following algorithm::
>>
>>     def eval_frame(frame, throw_flag):
>>         pyjion_code = frame.code.co_extra
>>         if not pyjion_code:
>>             # First execution: attach a fresh record to the code object.
>>             pyjion_code = frame.code.co_extra = PyjionJittedCode()
>>         elif not pyjion_code.jit_failed:
>>             if pyjion_code.jit_code:
>>                 # Already compiled: run the machine code.
>>                 return pyjion_code.eval(pyjion_code.jit_code, frame)
>>             elif pyjion_code.exec_count > 20_000:
>>                 # Hot enough: try to compile, remember a failure.
>>                 if jit_compile(frame):
>>                     return pyjion_code.eval(pyjion_code.jit_code, frame)
>>                 else:
>>                     pyjion_code.jit_failed = True
>>         pyjion_code.exec_count += 1
>>         return PyEval_EvalFrameDefault(frame, throw_flag)
>>
>> The key point, though, is that all of this work and logic is separate
>> from CPython and yet with the proposed API changes it is able to
>> provide a JIT that is compliant with Python semantics (as of this
>> writing, performance is almost equivalent to CPython without the new
>> API). This means there's nothing technically preventing others from
>> implementing their own JITs for CPython by utilizing the proposed API.
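
A runnable toy model of the algorithm above may help when reading it. Everything here is a stand-in: FakeCode, FakeFrame, JittedCode, jit_compile and eval_default are hypothetical, the threshold is lowered from 20_000 to 3 so the effect is visible, and the sketch assumes compiled code is used as soon as it exists.

```python
JIT_THRESHOLD = 3  # the pseudocode above uses 20_000; small here for the demo


class FakeCode:
    def __init__(self, compilable=True):
        self.co_extra = None
        self.compilable = compilable  # hypothetical: can the "JIT" handle it?


class FakeFrame:
    def __init__(self, code):
        self.code = code


class JittedCode:
    """Model of the per-code-object record described above."""

    def __init__(self):
        self.exec_count = 0
        self.jit_failed = False
        self.jit_code = None


def eval_default(frame, throw_flag):
    return "interpreted"


def jit_compile(frame):
    """Pretend to compile; succeed only for 'compilable' code."""
    if frame.code.compilable:
        frame.code.co_extra.jit_code = lambda f: "jitted"
        return True
    return False


def eval_frame(frame, throw_flag=0):
    record = frame.code.co_extra
    if record is None:
        record = frame.code.co_extra = JittedCode()
    elif not record.jit_failed:
        if record.jit_code is not None:
            return record.jit_code(frame)
        if record.exec_count > JIT_THRESHOLD:
            if jit_compile(frame):
                return record.jit_code(frame)
            record.jit_failed = True
    record.exec_count += 1
    return eval_default(frame, throw_flag)
```

A frame whose code compiles is interpreted for the first few calls and then switches to the compiled path. A frame whose code fails to compile records the failure once and stays on the default evaluator.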
>>
>>
>> Other JITs
>> ''''''''''
>>
>> It should be mentioned that the Pyston team was consulted on an
>> earlier version of this PEP that was more JIT-specific and they were
>> not interested in utilizing the changes proposed because they want
>> control over memory layout and had no interest in directly supporting
>> CPython itself. An informal discussion with a developer on the PyPy
>> team led to a similar comment.
>>
>> Numba [#numba]_, on the other hand, suggested that they would be
>> interested in the proposed change in a post-1.0 future for
>> themselves [#numba-interest]_.
>>
>> The experimental Coconut JIT [#coconut]_ could have benefitted from
>> this PEP. In private conversations with Coconut's creator we were told
>> that our API was probably superior to the one they developed for
>> Coconut to add JIT support to CPython.
>>
>>
>> Debugging
>> ---------
>>
>> In conversations with the Python Tools for Visual Studio team (PTVS)
>> [#ptvs]_, they thought they would find these API changes useful for
>> implementing more performant debugging. As mentioned in the Rationale_
>> section, this API would allow for switching on debugging functionality
>> only in frames where it is needed. This could allow for either
>> skipping information that ``sys.settrace()`` normally provides or
>> even going as far as dynamically rewriting bytecode prior to execution
>> to inject e.g. breakpoints in the bytecode.
>>
>> It also turns out that Google has provided a very similar API
>> internally for years. It has been used for performant debugging
>> purposes.
>>
>>
>> Implementation
>> ==============
>>
>> A set of patches implementing the proposed API is available through
>> the Pyjion project [#pyjion]_. In its current form it has more
>> changes to CPython than just this proposed API, but that is for ease
>> of development instead of strict requirements to accomplish its goals.
>>
>>
>> Open Issues
>> ===========
>>
>> Allow ``eval_frame`` to be ``NULL``
>> -----------------------------------
>>
>> Currently the frame evaluation function is expected to always be set.
>> It could very easily simply default to ``NULL`` instead which would
>> signal to use ``PyEval_EvalFrameDefault()``. The current proposal of
>> not special-casing the field seemed the most straight-forward, but it
>> does require that the field not accidentally be cleared, else a crash
>> may occur.
>>
>>
>> Is co_extra needed?
>> -------------------
>>
>> While discussing this PEP at PyCon US 2016, some core developers
>> expressed their worry of the ``co_extra`` field making code objects
>> mutable. The thinking seemed to be that having a field that was
>> mutated after the creation of the code object made the object seem
>> mutable, even though no other aspect of code objects changed.
>>
>> The view of this PEP is that the ``co_extra`` field doesn't change the
>> fact that code objects are immutable. The field is specified in this
>> PEP as to not contain information required to make the code object
>> usable, making it more of a caching field. It could be viewed as
>> similar to the UTF-8 cache that string objects have internally;
>> strings are still considered immutable even though they have a field
>> that is conditionally set.
>>
>> The field is also not strictly necessary. While the field greatly
>> simplifies attaching extra information to code objects, other options
>> are possible, such as keeping a mapping of code object memory
>> addresses to what would have been kept in ``co_extra``, or perhaps
>> attaching the data to the code object via a weak reference and then
>> iterating through the weak references until the attached data is
>> found. But obviously none of these solutions is as simple or
>> performant as adding the ``co_extra`` field.
>>
>>
>> Rejected Ideas
>> ==============
>>
>> A JIT-specific C API
>> --------------------
>>
>> Originally this PEP was going to propose a much larger API change
>> which was more JIT-specific. After soliciting feedback from the Numba
>> team [#numba]_, though, it became clear that the API was unnecessarily
>> large. The realization was made that all that was truly needed was the
>> opportunity to provide a trampoline function to handle execution of
>> Python code that had been JIT-compiled and a way to attach that
>> compiled machine code along with other critical data to the
>> corresponding Python code object. Once it was shown that there was no
>> loss in functionality or in performance while minimizing the API
>> changes required, the proposal was changed to its current form.
>>
>>
>> References
>> ==========
>>
>> .. [#pyjion] Pyjion project
>>    (https://github.com/microsoft/pyjion)
>>
>> .. [#c-api] CPython's C API
>>    (https://docs.python.org/3/c-api/index.html)
>>
>> .. [#pycodeobject] ``PyCodeObject``
>>    (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)
>>
>> .. [#coreclr] .NET Core Runtime (CoreCLR)
>>    (https://github.com/dotnet/coreclr)
>>
>> .. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
>>    (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)
>>
>> .. [#numba] Numba
>>    (http://numba.pydata.org/)
>>
>> .. [#numba-interest]  numba-users mailing list:
>>    "Would the C API for a JIT entrypoint being proposed by Pyjion help
>>    out Numba?"
>>    (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)
>>
>> .. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
>>    (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)
>>
>> .. [#py-benchmarks] Python benchmark suite
>>    (https://hg.python.org/benchmarks)
>>
>> .. [#pyston] Pyston
>>    (http://pyston.org)
>>
>> .. [#pypy] PyPy
>>    (http://pypy.org/)
>>
>> .. [#ptvs] Python Tools for Visual Studio
>>    (http://microsoft.github.io/PTVS/)
>>
>> .. [#coconut] Coconut
>>    (https://github.com/davidmalcolm/coconut)
>>
>>
>> Copyright
>> =========
>>
>> This document has been placed in the public domain.
>>
>>
>> ..
>>    Local Variables:
>>    mode: indented-text
>>    indent-tabs-mode: nil
>>    sentence-end-double-space: t
>>    fill-column: 70
>>    coding: utf-8
>>    End:
>>
>>

-- 
--Guido van Rossum (python.org/~guido)

From kevin-lists at theolliviers.com  Sun Jun 19 12:12:36 2016
From: kevin-lists at theolliviers.com (Kevin Ollivier)
Date: Sun, 19 Jun 2016 09:12:36 -0700
Subject: [Python-Dev] Discussion overload
In-Reply-To: <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
Message-ID: <E86D466A-2D3E-428A-9DD8-5E16C4453AD3@theolliviers.com>

Hi Nick,

On 6/17/16, 6:12 PM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:

>On 16 June 2016 at 19:00, Kevin Ollivier <kevin-lists at theolliviers.com> wrote:
>> Hi Guido,
>>
>> From: <gvanrossum at gmail.com> on behalf of Guido van Rossum
>> <guido at python.org>
>> Reply-To: <guido at python.org>
>> Date: Thursday, June 16, 2016 at 5:27 PM
>> To: Kevin Ollivier <kevin-lists at theolliviers.com>
>> Cc: Python Dev <python-dev at python.org>
>> Subject: Re: [Python-Dev] Discussion overload
>>
>> Hi Kevin,
>>
>> I often feel the same way. Are you using GMail? It combines related messages
>> in threads and lets you mute threads. I often use this feature so I can
>> manage my inbox. (I presume other mailers have the same features, but I
>> don't know if all of them do.) There are also many people who read the list
>> on a website, e.g. gmane. (Though I think that sometimes the delays incurred
>> there add to the noise -- e.g. when a decision is reached on the list
>> sometimes people keep responding to earlier threads.)
>>
>>
>> I fear I did quite a poor job of making my point. :( I've been on open
>> source mailing lists since the late 90s, so I've learned strategies for
>> dealing with mailing list overload. I've got my mail folders, my mail rules,
>> etc. Having been on many mailing lists over the years, I've seen many
>> productive discussions and many unproductive ones, and over time you start
>> to see patterns. You also see what happens to those communities over time.
>
>This is one of the major reasons we have the option of escalating
>things to the PEP process (and that's currently in train for
>os.urandom), as well as the SIGs for when folks really need to dig
>into topics that risk incurring a relatively low signal-to-noise
>ratio on python-dev. It's also why python-ideas was turned into a
>separate list, since folks without the time for more speculative
>discussions and brainstorming can safely ignore it, while remaining
>confident that any ideas considered interesting enough for further
>review will be brought to python-dev's attention.
>
>But yes, one of the more significant design errors I've made with the
>contextlib API was due to just such a draining pile-on by folks that
>weren't happy the original name wasn't a 100% accurate description of
>the underlying mechanics (even though it was an accurate description
>of the intended use case), and "people yelling at you on project
>communication channels without doing adequate research first" is the
>number one reason we see otherwise happily engaged core developers
>decide to find something else to do with their time.

Yeah, the sad truth is that when you start having these problems, it's the good people that leave. The key though is not to treat this as some unsolvable problem, which honestly is what I've seen many projects do. :( My guess is that once these issues are addressed, at least some of the people who left would be willing to give it another try. 

I had written a couple paragraphs about some different tools and approaches that might help with that, but I think Guido's got the right idea by taking it off-list to determine the best way to move forward first.

Regards,

Kevin

>The challenge and art in community management in that context is
>balancing telling both old and new list participants "It's OK to ask
>'Why is this so?', as sometimes the answer is that there isn't a good
>reason and we may want to change it" and "Learn to be a good peer
>manager, and avoid behaving like a micro-managing autocrat that chases
>away experienced contributors".

>Cheers,
>Nick.
>
>-- 
>Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Jun 19 15:39:14 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 19 Jun 2016 12:39:14 -0700
Subject: [Python-Dev] security SIG?
In-Reply-To: <576586B8.5090009@stoneleaf.us>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
 <576586B8.5090009@stoneleaf.us>
Message-ID: <CADiSq7cF=C9scPHvuPHdeOmHS9vAskVjXWyFPbTNgbPYaabopg@mail.gmail.com>

On 18 June 2016 at 10:36, Ethan Furman <ethan at stoneleaf.us> wrote:
> One of the big advantages of a SIG is the much reduced pool of participants,
> and that those participants are usually interested in forward progress.  It
> would also be helpful to have a single person both champion and act as
> buffer for the proposals (not necessarily the same person each time).  I am
> reminded of the matrix-multiply PEP brought forward by Nathaniel a few
> months ago -- the proposal was researched outside of py-dev, presented to
> py-dev when ready, Nathaniel acted as the gateway between py-dev and those
> that wanted/needed the change, the discussion stayed (pretty much) on track,
> and it felt like the whole thing was very smooth.  (If it was somebody else,
> my apologies for my terrible memory! ;)
>
> To sum up:  I think it would be a good idea.

I'm coming around to this point of view as well. import-sig, for
example, is a very low traffic SIG, but I think it serves three key
useful purposes:

- it clearly indicates that import is a specialist topic with
additional considerations to take into account that may not be obvious
to developers touching the import system for the first time
- it provides a forum to collaboratively craft explanations of
proposed changes that should make sense to folks that *aren't*
specialists
- anyone that wants to become an "import system expert" can join the
SIG and learn from the intermittent discussions of proposed changes

distutils-sig is an example at the other end of the scale - while
distutils-sig and python-dev subscribers aren't a disjoint set, those
of us that fall into the intersection are a clear minority on both
lists, and can act as representatives of the interests of the other
group when needed.

As far as names go, my vote would be for "paranoia-sig" - it nicely
avoids any risk of folks submitting security bugs there instead of to
the PSRT, and "We're professionally paranoid, so you don't need to be"
is an apt description of good security sensitive API design in a
general purpose language like Python :)

Cheers,
Nick.

P.S. Hopefully we could get some of the Python Cryptographic Authority
folks to sign up, just as distutils-sig is a point of collaboration
between python-dev and PyPA. "Secure software design in Python" covers
a lot more than just the standard library, since in many cases you
really want to reach beyond the standard library and grab something
like cryptography or passlib, or delegate the problem to a domain
specific framework like Django or the relevant components of the Flask
or Pyramid ecosystems.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ethan at stoneleaf.us  Sun Jun 19 15:54:47 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 19 Jun 2016 12:54:47 -0700
Subject: [Python-Dev] security SIG?
In-Reply-To: <CADiSq7cF=C9scPHvuPHdeOmHS9vAskVjXWyFPbTNgbPYaabopg@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
 <576586B8.5090009@stoneleaf.us>
 <CADiSq7cF=C9scPHvuPHdeOmHS9vAskVjXWyFPbTNgbPYaabopg@mail.gmail.com>
Message-ID: <5766F887.6090302@stoneleaf.us>

On 06/19/2016 12:39 PM, Nick Coghlan wrote:
> On 18 June 2016 at 10:36, Ethan Furman wrote:

>> To sum up:  I think it would be a good idea.
>
> I'm coming around to this point of view as well. import-sig, for
> example, is a very low traffic SIG, but I think it serves three key
> useful purposes:
>
> - it clearly indicates that import is a specialist topic with
> additional considerations to take into account that may not be obvious
> to developers touching the import system for the first time
> - it provides a forum to collaboratively craft explanations of
> proposed changes that should make sense to folks that *aren't*
> specialists
> - anyone that wants to become an "import system expert" can join the
> SIG and learn from the intermittent discussions of proposed changes

[...]

> As far as names go, my vote would be for "paranoia-sig" - it nicely
> avoids any risk of folks submitting security bugs there instead of to
> the PSRT, and "We're professionally paranoid, so you don't need to be"
> is an apt description of good security sensitive API design in a
> general purpose language like Python :)

Heh.  I like it.  If no one comes up with any other names I'll get the 
SIG requested mid-week-ish.

--
~Ethan~

From guido at python.org  Sun Jun 19 18:51:24 2016
From: guido at python.org (Guido van Rossum)
Date: Sun, 19 Jun 2016 15:51:24 -0700
Subject: [Python-Dev] security SIG?
In-Reply-To: <5766F887.6090302@stoneleaf.us>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
 <576586B8.5090009@stoneleaf.us>
 <CADiSq7cF=C9scPHvuPHdeOmHS9vAskVjXWyFPbTNgbPYaabopg@mail.gmail.com>
 <5766F887.6090302@stoneleaf.us>
Message-ID: <CAP7+vJ+09nYjtDmZa8nMS5WmC6hAzVVFELLV5dyqRmHR87fVFg@mail.gmail.com>

I think it's fine to have this SIG. I could see it going different ways in
terms of discussions and membership, but it's definitely worth a try. I
don't like clever names, and I very much doubt that it'll be mistaken for
an address to report sensitive issues, so I think it should just be
security-sig. (The sensitive-issues people are usually paranoid enough to
check before they post; the script kiddies reporting python.org "issues"
probably will get a faster and more appropriate response from the
security-sig.)

So let's just do it.

-- 
--Guido van Rossum (python.org/~guido)

From brett at python.org  Sun Jun 19 21:29:45 2016
From: brett at python.org (Brett Cannon)
Date: Mon, 20 Jun 2016 01:29:45 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
Message-ID: <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>

On Sat, 18 Jun 2016 at 21:49 Guido van Rossum <guido at python.org> wrote:

> Hi Brett,
>
> I've got a few questions about the specific design. Probably you know the
> answers, it would be nice to have them in the PEP.
>

Once you're happy with my answers I'll update the PEP.

>
> First, why not have a global hook? What does a hook per interpreter give
> you? Would even finer granularity buy anything?
>

We initially considered a per-code object hook, but we figured it was
unnecessary to have that level of control, especially since people like
Numba have gotten away with not needing it for this long (although I
suspect that's because they are a decorator so they can just return an
object that overrides __call__()). We didn't think that a global one was
appropriate as different workloads may call for different
JITs/debuggers/etc. and there is no guarantee that you are executing every
interpreter with the same workload. Plus we figured people might simply
import their JIT of choice and as a side-effect set the hook, and since
imports are a per-interpreter thing that seemed to suggest the granularity
of interpreters.

IOW it seemed to be more in line with sys.settrace() than some global thing
for the process.
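
To make the granularity argument concrete, here is a toy Python model of the dispatch (a sketch only: the real proposal is a C function pointer on ``PyInterpreterState``, and ``InterpreterState``, ``eval_frame_default``, ``eval_frame_ex`` and ``toy_jit`` below are hypothetical stand-ins, not CPython names):

```python
def eval_frame_default(frame):
    """Stand-in for PyEval_EvalFrameDefault(): the normal bytecode loop."""
    return "interpreted:" + frame


class InterpreterState:
    """Toy model of PyInterpreterState carrying the proposed eval_frame field."""

    def __init__(self):
        # The field is always set; it starts out pointing at the default.
        self.eval_frame = eval_frame_default


def eval_frame_ex(interp, frame):
    """Model of the new PyEval_EvalFrameEx(): dispatch via the interpreter."""
    return interp.eval_frame(frame)


def toy_jit(frame):
    """Hypothetical replacement evaluator installed by a JIT."""
    return "jitted:" + frame


main_interp = InterpreterState()
sub_interp = InterpreterState()
# Only this interpreter opts in, e.g. as a side effect of importing a JIT.
sub_interp.eval_frame = toy_jit
```

In this model one interpreter can run a JIT while another in the same process keeps the default evaluator, which a single process-wide hook would not allow.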

>
> Next, I'm a bit (but no more than a bit) concerned about the extra 8 bytes
> per code object, especially since for most people this is just waste
> (assuming most people won't be using Pyjion or Numba). Could it be a
> compile-time feature (requiring recompilation of CPython but not
> extensions)?
>

Probably. It does water down potential usage thanks to needing a special
build. If the decision is "special build or not", I would simply pull out
this part of the proposal as I wouldn't want to add a flag that influences
what is or is not possible for an interpreter.

> Could you figure out some other way to store per-code-object data? It
> seems you considered this but decided that the co_extra field was simpler
> and faster; I'm basically pushing a little harder on this. Of course most
> of the PEP would disappear without this feature; the extra interpreter
> field is fine.
>

Dino and I thought of two potential alternatives, neither of which we have
taken the time to implement and benchmark. One is to simply have a hash
table of memory addresses to JIT data that is kept on the JIT side of
things. Obviously it would be nice to avoid the overhead of a hash table
lookup on every function call. This also doesn't help minimize memory when
the code object gets GC'ed.

The other potential solution we came up with was to use weakrefs. I have
not looked into the details, but we were thinking that if we registered the
JIT data object as a weakref on the code object, couldn't we iterate
through the weakrefs attached to the code object to look for the JIT data
object, and then get the reference that way? It would let us avoid a more
expensive hash table lookup if we assume most code objects won't have a
weakref on it (assuming weakrefs are stored in a list), and it gives us the
proper cleanup semantics we want by getting the weakref cleanup callback
execution to make sure we decref the JIT data object appropriately. But as
I said, I have not looked into the feasibility of this at all to know if
I'm remembering the weakref implementation details correctly.

>
> Finally, there are some error messages from pep2html.py:
> https://www.python.org/dev/peps/pep-0523/#copyright
>

All fixed in
https://github.com/python/peps/commit/6929f850a5af07e51d0163558a5fe8d6b85dccfe
 .

-Brett

>
>
> --Guido
>
> On Fri, Jun 17, 2016 at 7:58 PM, Brett Cannon <brett at python.org> wrote:
>
>> I have taken PEP 523 for this:
>> https://github.com/python/peps/blob/master/pep-0523.txt .
>>
>> I'm waiting until Guido gets back from vacation, at which point I'll ask
>> for a pronouncement or assignment of a BDFL delegate.
>>
>> On Fri, 3 Jun 2016 at 14:37 Brett Cannon <brett at python.org> wrote:
>>
>>> For those of you who follow python-ideas or were at the PyCon US 2016
>>> language summit, you have already seen/heard about this PEP. For those of
>>> you who don't fall into either of those categories, this PEP proposes a
>>> frame evaluation API for CPython. The motivating example of this work has
>>> been Pyjion, the experimental CPython JIT Dino Viehland and I have been
>>> working on in our spare time at Microsoft. The API also works for
>>> debugging, though, as already demonstrated by Google having added a very
>>> similar API internally for debugging purposes.
>>>
>>> The PEP is pasted in below and also available in rendered form at
>>> https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I will assign
>>> myself a PEP # once discussion is finished as it's easier to work in git
>>> for this for the rich rendering of the in-progress PEP).
>>>
>>> I should mention that the differences in the PEP since python-ideas and
>>> the language summit are the mention of Google's use of a very similar
>>> API, as well as a clarification that the co_extra field on code objects
>>> doesn't change their immutability (at least from the view of the PEP).
>>>
>>> ----------
>>> PEP: NNN
>>> Title: Adding a frame evaluation API to CPython
>>> Version: $Revision$
>>> Last-Modified: $Date$
>>> Author: Brett Cannon <brett at python.org>,
>>>         Dino Viehland <dinov at microsoft.com>
>>> Status: Draft
>>> Type: Standards Track
>>> Content-Type: text/x-rst
>>> Created: 16-May-2016
>>> Post-History: 16-May-2016
>>>               03-Jun-2016
>>>
>>>
>>> Abstract
>>> ========
>>>
>>> This PEP proposes to expand CPython's C API [#c-api]_ to allow for
>>> the specification of a per-interpreter function pointer to handle the
>>> evaluation of frames [#pyeval_evalframeex]_. This proposal also
>>> suggests adding a new field to code objects [#pycodeobject]_ to store
>>> arbitrary data for use by the frame evaluation function.
>>>
>>>
>>> Rationale
>>> =========
>>>
>>> One place where flexibility has been lacking in Python is in the direct
>>> execution of Python code. While CPython's C API [#c-api]_ allows for
>>> constructing the data going into a frame object and then evaluating it
>>> via ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_, control over the
>>> execution of Python code comes down to individual objects instead of a
>>> holistic control of execution at the frame level.
>>>
>>> While wanting to have influence over frame evaluation may seem a bit
>>> too low-level, it does open the possibility for things such as a
>>> method-level JIT to be introduced into CPython without CPython itself
>>> having to provide one. By allowing external C code to control frame
>>> evaluation, a JIT can participate in the execution of Python code at
>>> the key point where evaluation occurs. This then allows for a JIT to
>>> conditionally recompile Python bytecode to machine code as desired
>>> while still allowing for executing regular CPython bytecode when
>>> running the JIT is not desired. This can be accomplished by allowing
>>> interpreters to specify what function to call to evaluate a frame. And
>>> by placing the API at the frame evaluation level it allows for a
>>> complete view of the execution environment of the code for the JIT.
>>>
>>> This ability to specify a frame evaluation function also allows for
>>> other use-cases beyond just opening CPython up to a JIT. For instance,
>>> it would not be difficult to implement a tracing or profiling function
>>> at the call level with this API. While CPython does provide the
>>> ability to set a tracing or profiling function at the Python level,
>>> this would be able to match the data collection of the profiler and
>>> quite possibly be faster for tracing by simply skipping per-line
>>> tracing support.
>>>
>>> It also opens up the possibility of debugging where the frame
>>> evaluation function only performs special debugging work when it
>>> detects it is about to execute a specific code object. In that
>>> instance the bytecode could be theoretically rewritten in-place to
>>> inject a breakpoint function call at the proper point for help in
>>> debugging while not having to do a heavy-handed approach as
>>> required by ``sys.settrace()``.
>>>
>>> To help facilitate these use-cases, we are also proposing the adding
>>> of a "scratch space" on code objects via a new field. This will allow
>>> per-code object data to be stored with the code object itself for easy
>>> retrieval by the frame evaluation function as necessary. The field
>>> itself will simply be a ``PyObject *`` type so that any data stored in
>>> the field will participate in normal object memory management.
>>>
>>>
>>> Proposal
>>> ========
>>>
>>> All proposed C API changes below will not be part of the stable ABI.
>>>
>>>
>>> Expanding ``PyCodeObject``
>>> --------------------------
>>>
>>> One field is to be added to the ``PyCodeObject`` struct
>>> [#pycodeobject]_::
>>>
>>>   typedef struct {
>>>      ...
>>>      PyObject *co_extra;  /* "Scratch space" for the code object. */
>>>   } PyCodeObject;
>>>
>>> The ``co_extra`` will be ``NULL`` by default and will not be used by
>>> CPython itself. Third-party code is free to use the field as desired.
>>> Values stored in the field are expected to not be required in order
>>> for the code object to function, allowing the loss of the data of the
>>> field to be acceptable (this keeps the code object as immutable from
>>> a functionality point-of-view; this is slightly contentious and so is
>>> listed as an open issue in `Is co_extra needed?`_). The field will be
>>> freed like all other fields on ``PyCodeObject`` during deallocation
>>> using ``Py_XDECREF()``.
>>>
>>> It is not recommended that multiple users attempt to use the
>>> ``co_extra`` simultaneously. While a dictionary could theoretically be
>>> set to the field and various users could use a key specific to the
>>> project, there is still the issue of key collisions as well as
>>> performance degradation from using a dictionary lookup on every frame
>>> evaluation. Users are expected to do a type check to make sure that
>>> the field has not been previously set by someone else.
>>>
>>>
>>> Expanding ``PyInterpreterState``
>>> --------------------------------
>>>
>>> The entrypoint for the frame evaluation function is per-interpreter::
>>>
>>>   // Same type signature as PyEval_EvalFrameEx().
>>>   typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);
>>>
>>>   typedef struct {
>>>       ...
>>>       PyFrameEvalFunction eval_frame;
>>>   } PyInterpreterState;
>>>
>>> By default, the ``eval_frame`` field will be initialized to a function
>>> pointer that represents what ``PyEval_EvalFrameEx()`` currently is
>>> (called ``PyEval_EvalFrameDefault()``, discussed later in this PEP).
>>> Third-party code may then set their own frame evaluation function
>>> instead to control the execution of Python code. A pointer comparison
>>> can be used to detect if the field is set to
>>> ``PyEval_EvalFrameDefault()`` and thus has not been mutated yet.
>>>
>>>
>>> Changes to ``Python/ceval.c``
>>> -----------------------------
>>>
>>> ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it currently stands
>>> will be renamed to ``PyEval_EvalFrameDefault()``. The new
>>> ``PyEval_EvalFrameEx()`` will then become::
>>>
>>>     PyObject *
>>>     PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
>>>     {
>>>         PyThreadState *tstate = PyThreadState_GET();
>>>         return tstate->interp->eval_frame(frame, throwflag);
>>>     }
>>>
>>> This allows third-party code to place themselves directly in the path
>>> of Python code execution while being backwards-compatible with code
>>> already using the pre-existing C API.
>>>
>>>
>>> Updating ``python-gdb.py``
>>> --------------------------
>>>
>>> The generated ``python-gdb.py`` file used for Python support in GDB
>>> makes some hard-coded assumptions about ``PyEval_EvalFrameEx()``, e.g.
>>> the names of local variables. It will need to be updated to work with
>>> the proposed changes.
>>>
>>>
>>> Performance impact
>>> ==================
>>>
>>> As this PEP is proposing an API to add pluggability, performance
>>> impact is considered only in the case where no third-party code has
>>> made any changes.
>>>
>>> Several runs of pybench [#pybench]_ consistently showed no performance
>>> cost from the API change alone.
>>>
>>> A run of the Python benchmark suite [#py-benchmarks]_ showed no
>>> measurable cost in performance.
>>>
>>> In terms of memory impact, since there are typically not many CPython
>>> interpreters executing in a single process that means the impact of
>>> ``co_extra`` being added to ``PyCodeObject`` is the only worry.
>>> According to [#code-object-count]_, a run of the Python test suite
>>> results in about 72,395 code objects being created. On a 64-bit
>>> CPU that would result in 579,160 bytes of extra memory being used if
>>> all code objects were alive at once and had nothing set in their
>>> ``co_extra`` fields.
>>>
>>>
>>> Example Usage
>>> =============
>>>
>>> A JIT for CPython
>>> -----------------
>>>
>>> Pyjion
>>> ''''''
>>>
>>> The Pyjion project [#pyjion]_ has used this proposed API to implement
>>> a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each code
>>> object has its ``co_extra`` field set to a ``PyjionJittedCode`` object
>>> which stores four pieces of information:
>>>
>>> 1. Execution count
>>> 2. A boolean representing whether a previous attempt to JIT failed
>>> 3. A function pointer to a trampoline (which can be type tracing or not)
>>> 4. A void pointer to any JIT-compiled machine code
>>>
>>> The frame evaluation function has (roughly) the following algorithm::
>>>
>>>     def eval_frame(frame, throw_flag):
>>>         pyjion_code = frame.code.co_extra
>>>         if not pyjion_code:
>>>             pyjion_code = frame.code.co_extra = PyjionJittedCode()
>>>         elif not pyjion_code.jit_failed:
>>>             if pyjion_code.jit_code:
>>>                 return pyjion_code.eval(pyjion_code.jit_code, frame)
>>>             elif pyjion_code.exec_count > 20_000:
>>>                 if jit_compile(frame):
>>>                     return pyjion_code.eval(pyjion_code.jit_code, frame)
>>>                 else:
>>>                     pyjion_code.jit_failed = True
>>>         pyjion_code.exec_count += 1
>>>         return PyEval_EvalFrameDefault(frame, throw_flag)
>>>
>>> The key point, though, is that all of this work and logic is separate
>>> from CPython and yet with the proposed API changes it is able to
>>> provide a JIT that is compliant with Python semantics (as of this
>>> writing, performance is almost equivalent to CPython without the new
>>> API). This means there's nothing technically preventing others from
>>> implementing their own JITs for CPython by utilizing the proposed API.
>>>
>>>
>>> Other JITs
>>> ''''''''''
>>>
>>> It should be mentioned that the Pyston team was consulted on an
>>> earlier version of this PEP that was more JIT-specific and they were
>>> not interested in utilizing the changes proposed because they want
>>> control over memory layout and had no interest in directly supporting
>>> CPython itself. An informal discussion with a developer on the PyPy
>>> team led to a similar comment.
>>>
>>> Numba [#numba]_, on the other hand, suggested that they would be
>>> interested in the proposed change in a post-1.0 future for
>>> themselves [#numba-interest]_.
>>>
>>> The experimental Coconut JIT [#coconut]_ could have benefitted from
>>> this PEP. In private conversations with Coconut's creator we were told
>>> that our API was probably superior to the one they developed for
>>> Coconut to add JIT support to CPython.
>>>
>>>
>>> Debugging
>>> ---------
>>>
>>> In conversations with the Python Tools for Visual Studio team (PTVS)
>>> [#ptvs]_, they thought they would find these API changes useful for
>>> implementing more performant debugging. As mentioned in the Rationale_
>>> section, this API would allow for switching on debugging functionality
>>> only in frames where it is needed. This could allow for either
>>> skipping information that ``sys.settrace()`` normally provides or
>>> even going as far as dynamically rewriting bytecode prior to
>>> execution to inject e.g. breakpoints in the bytecode.
>>>
>>> It also turns out that Google has provided a very similar API
>>> internally for years. It has been used for performant debugging
>>> purposes.
>>>
>>>
>>> Implementation
>>> ==============
>>>
>>> A set of patches implementing the proposed API is available through
>>> the Pyjion project [#pyjion]_. In its current form it has more
>>> changes to CPython than just this proposed API, but that is for ease
>>> of development instead of strict requirements to accomplish its goals.
>>>
>>>
>>> Open Issues
>>> ===========
>>>
>>> Allow ``eval_frame`` to be ``NULL``
>>> -----------------------------------
>>>
>>> Currently the frame evaluation function is expected to always be set.
>>> It could just as easily default to ``NULL`` instead, which would
>>> signal to use ``PyEval_EvalFrameDefault()``. The current proposal of
>>> not special-casing the field seemed the most straightforward, but it
>>> does require that the field not accidentally be cleared, else a crash
>>> may occur.
>>>
>>>
>>> Is co_extra needed?
>>> -------------------
>>>
>>> While discussing this PEP at PyCon US 2016, some core developers
>>> expressed their worry about the ``co_extra`` field making code objects
>>> mutable. The thinking seemed to be that having a field that was
>>> mutated after the creation of the code object made the object seem
>>> mutable, even though no other aspect of code objects changed.
>>>
>>> The view of this PEP is that the ``co_extra`` field doesn't change the
>>> fact that code objects are immutable. The field is specified in this
>>> PEP so as not to contain information required to make the code object
>>> usable, making it more of a caching field. It could be viewed as
>>> similar to the UTF-8 cache that string objects have internally;
>>> strings are still considered immutable even though they have a field
>>> that is conditionally set.
>>>
>>> The field is also not strictly necessary. While the field greatly
>>> simplifies attaching extra information to code objects, other options
>>> are possible, such as keeping a mapping of code object memory
>>> addresses to what would have been kept in ``co_extra``, or perhaps
>>> using a weak reference of the data on the code object and then
>>> iterating through the weak references until the attached data is
>>> found. But obviously none of these solutions is as simple or
>>> performant as adding the ``co_extra`` field.
>>>
>>>
>>> Rejected Ideas
>>> ==============
>>>
>>> A JIT-specific C API
>>> --------------------
>>>
>>> Originally this PEP was going to propose a much larger API change
>>> which was more JIT-specific. After soliciting feedback from the Numba
>>> team [#numba]_, though, it became clear that the API was unnecessarily
>>> large. The realization was made that all that was truly needed was the
>>> opportunity to provide a trampoline function to handle execution of
>>> Python code that had been JIT-compiled and a way to attach that
>>> compiled machine code along with other critical data to the
>>> corresponding Python code object. Once it was shown that there was no
>>> loss in functionality or in performance while minimizing the API
>>> changes required, the proposal was changed to its current form.
>>>
>>>
>>> References
>>> ==========
>>>
>>> .. [#pyjion] Pyjion project
>>>    (https://github.com/microsoft/pyjion)
>>>
>>> .. [#c-api] CPython's C API
>>>    (https://docs.python.org/3/c-api/index.html)
>>>
>>> .. [#pycodeobject] ``PyCodeObject``
>>>    (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)
>>>
>>> .. [#coreclr] .NET Core Runtime (CoreCLR)
>>>    (https://github.com/dotnet/coreclr)
>>>
>>> .. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
>>>    (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)
>>>
>>> .. [#numba] Numba
>>>    (http://numba.pydata.org/)
>>>
>>> .. [#numba-interest] numba-users mailing list:
>>>    "Would the C API for a JIT entrypoint being proposed by Pyjion help out Numba?"
>>>    (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)
>>>
>>> .. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
>>>    (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)
>>>
>>> .. [#py-benchmarks] Python benchmark suite
>>>    (https://hg.python.org/benchmarks)
>>>
>>> .. [#pyston] Pyston
>>>    (http://pyston.org)
>>>
>>> .. [#pypy] PyPy
>>>    (http://pypy.org/)
>>>
>>> .. [#ptvs] Python Tools for Visual Studio
>>>    (http://microsoft.github.io/PTVS/)
>>>
>>> .. [#coconut] Coconut
>>>    (https://github.com/davidmalcolm/coconut)
>>>
>>>
>>> Copyright
>>> =========
>>>
>>> This document has been placed in the public domain.
>>>
>>>
>>> ..
>>>    Local Variables:
>>>    mode: indented-text
>>>    indent-tabs-mode: nil
>>>    sentence-end-double-space: t
>>>    fill-column: 70
>>>    coding: utf-8
>>>    End:
>>>
>>>
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>

From python at mrabarnett.plus.com  Sun Jun 19 22:14:37 2016
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 20 Jun 2016 03:14:37 +0100
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
Message-ID: <d8602405-d5e4-7781-29f6-e87ea09bdb76@mrabarnett.plus.com>

On 2016-06-20 02:29, Brett Cannon wrote:
>
> On Sat, 18 Jun 2016 at 21:49 Guido van Rossum <guido at python.org
> <mailto:guido at python.org>> wrote:
>
[snip]
>
>     Could you figure out some other way to store per-code-object data?
>     It seems you considered this but decided that the co_extra field was
>     simpler and faster; I'm basically pushing a little harder on this.
>     Of course most of the PEP would disappear without this feature; the
>     extra interpreter field is fine.
>
> Dino and I thought of two potential alternatives, neither of which we
> have taken the time to implement and benchmark. One is to simply have a
> hash table of memory addresses to JIT data that is kept on the JIT side
> of things. Obviously it would be nice to avoid the overhead of a hash
> table lookup on every function call. This also doesn't help minimize
> memory when the code object gets GC'ed.
>
[snip]
If you had a flag in co_flags that said whether it should look in the 
hash table, then that might reduce the overhead.
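
A minimal sketch of that idea, in Python rather than C for brevity. Everything
named here is hypothetical: ``FakeCode`` stands in for a code object with a
writable ``co_flags``, and ``CO_HAS_JIT_DATA`` is an invented bit, not a flag
CPython defines. The point is only that the hash table is consulted when the
flag is set, so untouched code objects pay for a bit test and nothing more.

```python
# Hypothetical flag bit; CPython defines no such co_flags value.
CO_HAS_JIT_DATA = 0x4000_0000


class FakeCode:
    """Stand-in for a code object whose co_flags can be modified."""

    def __init__(self):
        self.co_flags = 0


jit_table = {}            # id(code) -> JIT data, kept on the JIT side
stats = {"lookups": 0}    # how often the table was actually consulted


def set_jit_data(code, data):
    """Store data in the side table and mark the code object."""
    jit_table[id(code)] = data
    code.co_flags |= CO_HAS_JIT_DATA


def get_jit_data(code):
    """Return the JIT data, hashing only when the flag says it exists."""
    if not code.co_flags & CO_HAS_JIT_DATA:
        return None       # cheap bit test, no hash table lookup
    stats["lookups"] += 1
    return jit_table[id(code)]
```

The table entry would still need to be dropped when the code object goes away,
which this sketch does not attempt.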

From guido at python.org  Sun Jun 19 22:36:36 2016
From: guido at python.org (Guido van Rossum)
Date: Sun, 19 Jun 2016 19:36:36 -0700
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
Message-ID: <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>

On Sun, Jun 19, 2016 at 6:29 PM, Brett Cannon <brett at python.org> wrote:

>
>
> On Sat, 18 Jun 2016 at 21:49 Guido van Rossum <guido at python.org> wrote:
>
>> Hi Brett,
>>
>> I've got a few questions about the specific design. Probably you know the
>> answers, it would be nice to have them in the PEP.
>>
>
> Once you're happy with my answers I'll update the PEP.
>

Soon!

>
>
>>
>> First, why not have a global hook? What does a hook per interpreter give
>> you? Would even finer granularity buy anything?
>>
>
> We initially considered a per-code object hook, but we figured it was
> unnecessary to have that level of control, especially since people like
> Numba have gotten away with not needing it for this long (although I
> suspect that's because they are a decorator so they can just return an
> object that overrides __call__()).
>

So they do it at the function object level?
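
For readers unfamiliar with the pattern Brett describes, here is a rough
sketch of a decorator that returns an object overriding ``__call__()``. It is
not Numba's implementation; ``Dispatcher`` and ``jit`` are illustrative names
only, and the "compiled" path is simply the original function. It shows why
such a tool needs no per-code-object hook: the interception happens at the
function object level.

```python
import functools


class Dispatcher:
    """Callable wrapper that sits in front of the original function."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self._func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        # A real JIT would pick (or build) a compiled version here;
        # this sketch only counts the call and runs the original.
        self.calls += 1
        return self._func(*args, **kwargs)


def jit(func):
    """Hypothetical decorator: replace func with a Dispatcher."""
    return Dispatcher(func)


@jit
def add(a, b):
    return a + b
```

Calling ``add(1, 2)`` goes through ``Dispatcher.__call__()`` before the
interpreter ever evaluates a frame for the wrapped function.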

> We didn't think that a global one was appropriate as different workloads
> may call for different JITs/debuggers/etc. and there is no guarantee that
> you are executing every interpreter with the same workload. Plus we figured
> people might simply import their JIT of choice and as a side-effect set the
> hook, and since imports are a per-interpreter thing that seemed to suggest
> the granularity of interpreters.
>

I like import as the argument here.

>
> IOW it seemed to be more in line with sys.settrace() than some global
> thing for the process.
>
>
>>
>> Next, I'm a bit (but no more than a bit) concerned about the extra 8
>> bytes per code object, especially since for most people this is just waste
>> (assuming most people won't be using Pyjion or Numba). Could it be a
>> compile-time feature (requiring recompilation of CPython but not
>> extensions)?
>>
>
> Probably. It does water down potential usage thanks to needing a special
> build. If the decision is "special build or not", I would simply pull out
> this part of the proposal as I wouldn't want to add a flag that influences
> what is or is not possible for an interpreter.
>

MRAB's response made me think of a possible approach: the co_extra field
could be the very last field of the PyCodeObject struct and only present if
a certain flag is set in co_flags. This is similar to a trick used by X11
(I know, it's long ago :-).
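
A rough model of that layout using ctypes, under the assumption that the flag
is checked before anyone touches the trailing field. ``CodeBase``,
``CodeWithExtra`` and ``CO_HAS_EXTRA`` are made-up names for illustration; the
real ``PyCodeObject`` has many more fields and no such flag.

```python
import ctypes

# Hypothetical flag bit; not a real co_flags value.
CO_HAS_EXTRA = 0x4000_0000


class CodeBase(ctypes.Structure):
    """Layout every code object would share."""
    _fields_ = [("co_flags", ctypes.c_int)]


class CodeWithExtra(ctypes.Structure):
    """Same prefix, plus a trailing field present only when flagged."""
    _fields_ = [("co_flags", ctypes.c_int),
                ("co_extra", ctypes.c_void_p)]


def get_extra(base_ptr):
    """Read co_extra only if the flag says the struct is the long form."""
    if not base_ptr.contents.co_flags & CO_HAS_EXTRA:
        return None
    wide = ctypes.cast(base_ptr, ctypes.POINTER(CodeWithExtra))
    return wide.contents.co_extra
```

Objects without the flag are allocated with the shorter layout, so they do not
pay for the extra pointer.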

>
>
>> Could you figure out some other way to store per-code-object data? It
>> seems you considered this but decided that the co_extra field was simpler
>> and faster; I'm basically pushing a little harder on this. Of course most
>> of the PEP would disappear without this feature; the extra interpreter
>> field is fine.
>>
>
> Dino and I thought of two potential alternatives, neither of which we have
> taken the time to implement and benchmark. One is to simply have a hash
> table of memory addresses to JIT data that is kept on the JIT side of
> things. Obviously it would be nice to avoid the overhead of a hash table
> lookup on every function call. This also doesn't help minimize memory when
> the code object gets GC'ed.
>

I guess the prospect of the extra hash lookup per call isn't great given
that this is about perf...

>
> The other potential solution we came up with was to use weakrefs. I have
> not looked into the details, but we were thinking that if we registered the
> JIT data object as a weakref on the code object, couldn't we iterate
> through the weakrefs attached to the code object to look for the JIT data
> object, and then get the reference that way? It would let us avoid a more
> expensive hash table lookup if we assume most code objects won't have a
> weakref on it (assuming weakrefs are stored in a list), and it gives us the
> proper cleanup semantics we want by getting the weakref cleanup callback
> execution to make sure we decref the JIT data object appropriately. But as
> I said, I have not looked into the feasibility of this at all to know if
> I'm remembering the weakref implementation details correctly.
>

That would be even slower than the hash table lookup, and unbounded. So
let's not go there.

-- 
--Guido van Rossum (python.org/~guido)

From mark at hotpy.org  Mon Jun 20 00:01:16 2016
From: mark at hotpy.org (Mark Shannon)
Date: Sun, 19 Jun 2016 21:01:16 -0700
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
Message-ID: <57676A8C.8070207@hotpy.org>

On 19/06/16 18:29, Brett Cannon wrote:
>
>
> On Sat, 18 Jun 2016 at 21:49 Guido van Rossum <guido at python.org
> <mailto:guido at python.org>> wrote:
>
>     Hi Brett,
>
>     I've got a few questions about the specific design. Probably you
>     know the answers, it would be nice to have them in the PEP.
>
>
> Once you're happy with my answers I'll update the PEP.
>
>
>     First, why not have a global hook? What does a hook per interpreter
>     give you? Would even finer granularity buy anything?
>
>
> We initially considered a per-code object hook, but we figured it was
> unnecessary to have that level of control, especially since people like
> Numba have gotten away with not needing it for this long (although I
> suspect that's because they are a decorator so they can just return an
> object that overrides __call__()). We didn't think that a global one was
> appropriate as different workloads may call for different
> JITs/debuggers/etc. and there is no guarantee that you are executing
> every interpreter with the same workload. Plus we figured people might
> simply import their JIT of choice and as a side-effect set the hook, and
> since imports are a per-interpreter thing that seemed to suggest the
> granularity of interpreters.
>
> IOW it seemed to be more in line with sys.settrace() than some global
> thing for the process.
>
>
>     Next, I'm a bit (but no more than a bit) concerned about the extra 8
>     bytes per code object, especially since for most people this is just
>     waste (assuming most people won't be using Pyjion or Numba). Could
>     it be a compile-time feature (requiring recompilation of CPython but
>     not extensions)?
>
>
> Probably. It does water down potential usage thanks to needing a special
> build. If the decision is "special build or not", I would simply pull
> out this part of the proposal as I wouldn't want to add a flag that
> influences what is or is not possible for an interpreter.
>
>     Could you figure out some other way to store per-code-object data?
>     It seems you considered this but decided that the co_extra field was
>     simpler and faster; I'm basically pushing a little harder on this.
>     Of course most of the PEP would disappear without this feature; the
>     extra interpreter field is fine.
>
>
> Dino and I thought of two potential alternatives, neither of which we
> have taken the time to implement and benchmark. One is to simply have a
> hash table of memory addresses to JIT data that is kept on the JIT side
> of things. Obviously it would be nice to avoid the overhead of a hash
> table lookup on every function call. This also doesn't help minimize
> memory when the code object gets GC'ed.

Hash lookups aren't that slow. If you combine it with the custom flags 
suggested by MRAB, then you would only suffer the lookup penalty when 
actually entering the special interpreter.
You can use a weakref callback to ensure things get GC'd properly.

Also, if there is a special extra field on code-object, then everyone 
will want to use it. How do you handle clashes?

>
> The other potential solution we came up with was to use weakrefs. I have
> not looked into the details, but we were thinking that if we registered
> the JIT data object as a weakref on the code object, couldn't we iterate
> through the weakrefs attached to the code object to look for the JIT
> data object, and then get the reference that way? It would let us avoid
> a more expensive hash table lookup if we assume most code objects won't
> have a weakref on it (assuming weakrefs are stored in a list), and it
> gives us the proper cleanup semantics we want by getting the weakref
> cleanup callback execution to make sure we decref the JIT data object
> appropriately. But as I said, I have not looked into the feasibility of
> this at all to know if I'm remembering the weakref implementation
> details correctly.
>
>
>     Finally, there are some error messages from pep2html.py:
>     https://www.python.org/dev/peps/pep-0523/#copyright
>
>
> All fixed in
> https://github.com/python/peps/commit/6929f850a5af07e51d0163558a5fe8d6b85dccfe .
>
> -Brett
>
>
>
>     --Guido
>
>     On Fri, Jun 17, 2016 at 7:58 PM, Brett Cannon <brett at python.org
>     <mailto:brett at python.org>> wrote:
>
>         I have taken PEP 523 for this:
>         https://github.com/python/peps/blob/master/pep-0523.txt .
>
>         I'm waiting until Guido gets back from vacation, at which point
>         I'll ask for a pronouncement or assignment of a BDFL delegate.
>
>         On Fri, 3 Jun 2016 at 14:37 Brett Cannon <brett at python.org
>         <mailto:brett at python.org>> wrote:
>
>             For those of you who follow python-ideas or were at the
>             PyCon US 2016 language summit, you have already seen/heard
>             about this PEP. For those of you who don't fall into either
>             of those categories, this PEP proposes a frame evaluation
>             API for CPython. The motivating example of this work has
>             been Pyjion, the experimental CPython JIT Dino Viehland and
>             I have been working on in our spare time at Microsoft. The
>             API also works for debugging, though, as already
>             demonstrated by Google having added a very similar API
>             internally for debugging purposes.
>
>             The PEP is pasted in below and also available in rendered
>             form at
>             https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I
>             will assign myself a PEP # once discussion is finished as
>             it's easier to work in git for this for the rich rendering
>             of the in-progress PEP).
>
>             I should mention that the differences in the PEP from
>             python-ideas and the language summit are the listed support
>             from Google's use of a very similar API as well as clarifying
>             that the co_extra field on code objects doesn't change their
>             immutability (at least from the view of the PEP).
>
>             ----------
>             PEP: NNN
>             Title: Adding a frame evaluation API to CPython
>             Version: $Revision$
>             Last-Modified: $Date$
>             Author: Brett Cannon <brett at python.org
>             <mailto:brett at python.org>>,
>                      Dino Viehland <dinov at microsoft.com
>             <mailto:dinov at microsoft.com>>
>             Status: Draft
>             Type: Standards Track
>             Content-Type: text/x-rst
>             Created: 16-May-2016
>             Post-History: 16-May-2016
>                            03-Jun-2016
>
>
>             Abstract
>             ========
>
>             This PEP proposes to expand CPython's C API [#c-api]_ to
>             allow for the specification of a per-interpreter function
>             pointer to handle the evaluation of frames
>             [#pyeval_evalframeex]_. This proposal also suggests adding a
>             new field to code objects [#pycodeobject]_ to store arbitrary
>             data for use by the frame evaluation function.
>
>
>             Rationale
>             =========
>
>             One place where flexibility has been lacking in Python is in
>             the direct execution of Python code. While CPython's C API
>             [#c-api]_ allows for constructing the data going into a frame
>             object and then evaluating it via ``PyEval_EvalFrameEx()``
>             [#pyeval_evalframeex]_, control over the execution of Python
>             code comes down to individual objects instead of a holistic
>             control of execution at the frame level.
>
>             While wanting to have influence over frame evaluation may
>             seem a bit
>             too low-level, it does open the possibility for things such as a
>             method-level JIT to be introduced into CPython without
>             CPython itself
>             having to provide one. By allowing external C code to
>             control frame
>             evaluation, a JIT can participate in the execution of Python
>             code at
>             the key point where evaluation occurs. This then allows for
>             a JIT to
>             conditionally recompile Python bytecode to machine code as
>             desired
>             while still allowing for executing regular CPython bytecode when
>             running the JIT is not desired. This can be accomplished by
>             allowing
>             interpreters to specify what function to call to evaluate a
>             frame. And
>             by placing the API at the frame evaluation level it allows for a
>             complete view of the execution environment of the code for
>             the JIT.
>
>             This ability to specify a frame evaluation function also
>             allows for
>             other use-cases beyond just opening CPython up to a JIT. For
>             instance,
>             it would not be difficult to implement a tracing or
>             profiling function
>             at the call level with this API. While CPython does provide the
>             ability to set a tracing or profiling function at the Python
>             level,
>             this would be able to match the data collection of the
>             profiler and
>             quite possibly be faster for tracing by simply skipping per-line
>             tracing support.
>
>             It also opens up the possibility of debugging where the frame
>             evaluation function only performs special debugging work when it
>             detects it is about to execute a specific code object. In that
>             instance the bytecode could be theoretically rewritten
>             in-place to
>             inject a breakpoint function call at the proper point for
>             help in
>             debugging while not having to do a heavy-handed approach as
>             required by ``sys.settrace()``.
>
>             To help facilitate these use-cases, we are also proposing
>             the adding
>             of a "scratch space" on code objects via a new field. This
>             will allow
>             per-code object data to be stored with the code object
>             itself for easy
>             retrieval by the frame evaluation function as necessary. The
>             field
>             itself will simply be a ``PyObject *`` type so that any data
>             stored in
>             the field will participate in normal object memory management.
>
>
>             Proposal
>             ========
>
>             All proposed C API changes below will not be part of the
>             stable ABI.
>
>
>             Expanding ``PyCodeObject``
>             --------------------------
>
>             One field is to be added to the ``PyCodeObject`` struct
>             [#pycodeobject]_::
>
>                typedef struct {
>                   ...
>                   PyObject *co_extra;  /* "Scratch space" for the code object. */
>                } PyCodeObject;
>
>             The ``co_extra`` will be ``NULL`` by default and will not be
>             used by
>             CPython itself. Third-party code is free to use the field as
>             desired.
>             Values stored in the field are expected to not be required
>             in order
>             for the code object to function, allowing the loss of the
>             data of the
>             field to be acceptable (this keeps the code object as
>             immutable from
>             a functionality point-of-view; this is slightly contentious
>             and so is
>             listed as an open issue in `Is co_extra needed?`_). The
>             field will be
>             freed like all other fields on ``PyCodeObject`` during
>             deallocation
>             using ``Py_XDECREF()``.
>
>             It is not recommended that multiple users attempt to use the
>             ``co_extra`` simultaneously. While a dictionary could
>             theoretically be
>             set to the field and various users could use a key specific
>             to the
>             project, there is still the issue of key collisions as well as
>             performance degradation from using a dictionary lookup on
>             every frame
>             evaluation. Users are expected to do a type check to make
>             sure that
>             the field has not been previously set by someone else.
>
>
>             Expanding ``PyInterpreterState``
>             --------------------------------
>
>             The entrypoint for the frame evaluation function is
>             per-interpreter::
>
>                // Same type signature as PyEval_EvalFrameEx().
>                typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);
>
>                typedef struct {
>                    ...
>                    PyFrameEvalFunction eval_frame;
>                } PyInterpreterState;
>
>             By default, the ``eval_frame`` field will be initialized to
>             a function
>             pointer that represents what ``PyEval_EvalFrameEx()``
>             currently is
>             (called ``PyEval_EvalFrameDefault()``, discussed later in
>             this PEP).
>             Third-party code may then set their own frame evaluation
>             function
>             instead to control the execution of Python code. A pointer
>             comparison
>             can be used to detect if the field is set to
>             ``PyEval_EvalFrameDefault()`` and thus has not been mutated yet.
>
>
>             Changes to ``Python/ceval.c``
>             -----------------------------
>
>             ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it
>             currently stands
>             will be renamed to ``PyEval_EvalFrameDefault()``. The new
>             ``PyEval_EvalFrameEx()`` will then become::
>
>                  PyObject *
>                  PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
>                  {
>                      PyThreadState *tstate = PyThreadState_GET();
>                      return tstate->interp->eval_frame(frame, throwflag);
>                  }
>
>             This allows third-party code to place themselves directly in
>             the path
>             of Python code execution while being backwards-compatible
>             with code
>             already using the pre-existing C API.
>
>
>             Updating ``python-gdb.py``
>             --------------------------
>
>             The generated ``python-gdb.py`` file used for Python support
>             in GDB
>             makes some hard-coded assumptions about
>             ``PyEval_EvalFrameEx()``, e.g.
>             the names of local variables. It will need to be updated to
>             work with
>             the proposed changes.
>
>
>             Performance impact
>             ==================
>
>             As this PEP is proposing an API to add pluggability, performance
>             impact is considered only in the case where no third-party
>             code has
>             made any changes.
>
>             Several runs of pybench [#pybench]_ consistently showed no
>             performance
>             cost from the API change alone.
>
>             A run of the Python benchmark suite [#py-benchmarks]_ showed no
>             measurable cost in performance.
>
>             In terms of memory impact, since there are typically not
>             many CPython
>             interpreters executing in a single process that means the
>             impact of
>             ``co_extra`` being added to ``PyCodeObject`` is the only worry.
>             According to [#code-object-count]_, a run of the Python test
>             suite
>             results in about 72,395 code objects being created. On a 64-bit
>             CPU that would result in 579,160 bytes of extra memory being
>             used if
>             all code objects were alive at once and had nothing set in their
>             ``co_extra`` fields.
>
>
>             Example Usage
>             =============
>
>             A JIT for CPython
>             -----------------
>
>             Pyjion
>             ''''''
>
>             The Pyjion project [#pyjion]_ has used this proposed API to
>             implement
>             a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each code
>             object has its ``co_extra`` field set to a
>             ``PyjionJittedCode`` object
>             which stores four pieces of information:
>
>             1. Execution count
>             2. A boolean representing whether a previous attempt to JIT
>             failed
>             3. A function pointer to a trampoline (which can be type
>             tracing or not)
>             4. A void pointer to any JIT-compiled machine code
>
>             The frame evaluation function has (roughly) the following
>             algorithm::
>
>                  def eval_frame(frame, throw_flag):
>                      pyjion_code = frame.code.co_extra
>                      if not pyjion_code:
>                          pyjion_code = frame.code.co_extra = PyjionJittedCode()
>                      elif not pyjion_code.jit_failed:
>                          if pyjion_code.jit_code:
>                              return pyjion_code.eval(pyjion_code.jit_code, frame)
>                          elif pyjion_code.exec_count > 20_000:
>                              if jit_compile(frame):
>                                  return pyjion_code.eval(pyjion_code.jit_code, frame)
>                              else:
>                                  pyjion_code.jit_failed = True
>                      pyjion_code.exec_count += 1
>                      return PyEval_EvalFrameDefault(frame, throw_flag)
>
>             The key point, though, is that all of this work and logic is
>             separate
>             from CPython and yet with the proposed API changes it is able to
>             provide a JIT that is compliant with Python semantics (as of
>             this
>             writing, performance is almost equivalent to CPython without
>             the new
>             API). This means there's nothing technically preventing
>             others from
>             implementing their own JITs for CPython by utilizing the
>             proposed API.
>
>
>             Other JITs
>             ''''''''''
>
>             It should be mentioned that the Pyston team was consulted on an
>             earlier version of this PEP that was more JIT-specific and they
>             were not interested in utilizing the changes proposed: because
>             they want control over memory layout, they had no interest in
>             directly supporting CPython itself. An informal discussion with
>             a developer on the PyPy team led to a similar comment.
>
>             Numba [#numba]_, on the other hand, suggested that they would be
>             interested in the proposed change in a post-1.0 future for
>             themselves [#numba-interest]_.
>
>             The experimental Coconut JIT [#coconut]_ could have
>             benefitted from
>             this PEP. In private conversations with Coconut's creator we
>             were told
>             that our API was probably superior to the one they developed for
>             Coconut to add JIT support to CPython.
>
>
>             Debugging
>             ---------
>
>             In conversations with the Python Tools for Visual Studio
>             team (PTVS)
>             [#ptvs]_, they thought they would find these API changes
>             useful for
>             implementing more performant debugging. As mentioned in the
>             Rationale_
>             section, this API would allow for switching on debugging
>             functionality
>             only in frames where it is needed. This could allow for
>             skipping information that ``sys.settrace()`` normally provides,
>             or even for dynamically rewriting bytecode prior to execution
>             to inject e.g. breakpoints in the bytecode.
>
>             It also turns out that Google has provided a very similar API
>             internally for years. It has been used for performant debugging
>             purposes.
>
>
>             Implementation
>             ==============
>
>             A set of patches implementing the proposed API is available
>             through
>             the Pyjion project [#pyjion]_. In its current form it has more
>             changes to CPython than just this proposed API, but that is
>             for ease
>             of development instead of strict requirements to accomplish
>             its goals.
>
>
>             Open Issues
>             ===========
>
>             Allow ``eval_frame`` to be ``NULL``
>             -----------------------------------
>
>             Currently the frame evaluation function is expected to
>             always be set.
>             It could very easily simply default to ``NULL`` instead
>             which would
>             signal to use ``PyEval_EvalFrameDefault()``. The current
>             proposal of
>             not special-casing the field seemed the most
>             straight-forward, but it
>             does require that the field not accidentally be cleared,
>             else a crash
>             may occur.
>
>
>             Is co_extra needed?
>             -------------------
>
>             While discussing this PEP at PyCon US 2016, some core developers
>             expressed their worry of the ``co_extra`` field making code
>             objects
>             mutable. The thinking seemed to be that having a field that was
>             mutated after the creation of the code object made the
>             object seem
>             mutable, even though no other aspect of code objects changed.
>
>             The view of this PEP is that the ``co_extra`` field doesn't
>             change the fact that code objects are immutable. The field is
>             specified in this PEP so as not to contain information required
>             to make the code object usable, making it more of a caching
>             field. It could be viewed as
>             similar to the UTF-8 cache that string objects have internally;
>             strings are still considered immutable even though they have
>             a field
>             that is conditionally set.
>
>             The field is also not strictly necessary. While the field
>             greatly simplifies attaching extra information to code objects,
>             other options are possible, such as keeping a mapping of code
>             object memory addresses to what would have been kept in
>             ``co_extra``, or perhaps attaching the data to the code object
>             through a weak reference and then iterating through the weak
>             references until the attached data is found. But obviously
>             none of these solutions is as simple or performant as adding
>             the ``co_extra`` field.
>
>
>             Rejected Ideas
>             ==============
>
>             A JIT-specific C API
>             --------------------
>
>             Originally this PEP was going to propose a much larger API
>             change
>             which was more JIT-specific. After soliciting feedback from
>             the Numba
>             team [#numba]_, though, it became clear that the API was
>             unnecessarily
>             large. The realization was made that all that was truly
>             needed was the
>             opportunity to provide a trampoline function to handle
>             execution of
>             Python code that had been JIT-compiled and a way to attach that
>             compiled machine code along with other critical data to the
>             corresponding Python code object. Once it was shown that
>             there was no
>             loss in functionality or in performance while minimizing the API
>             changes required, the proposal was changed to its current form.
>
>
>             References
>             ==========
>
>             .. [#pyjion] Pyjion project
>                 (https://github.com/microsoft/pyjion)
>
>             .. [#c-api] CPython's C API
>                 (https://docs.python.org/3/c-api/index.html)
>
>             .. [#pycodeobject] ``PyCodeObject``
>                 (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)
>
>             .. [#coreclr] .NET Core Runtime (CoreCLR)
>                 (https://github.com/dotnet/coreclr)
>
>             .. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
>
>               (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)
>
>             .. [#numba] Numba
>                 (http://numba.pydata.org/)
>
>             .. [#numba-interest]  numba-users mailing list:
>                 "Would the C API for a JIT entrypoint being proposed by
>             Pyjion help out Numba?"
>
>               (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)
>
>             .. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
>
>               (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)
>
>             .. [#py-benchmarks] Python benchmark suite
>                 (https://hg.python.org/benchmarks)
>
>             .. [#pyston] Pyston
>                 (http://pyston.org)
>
>             .. [#pypy] PyPy
>                 (http://pypy.org/)
>
>             .. [#ptvs] Python Tools for Visual Studio
>                 (http://microsoft.github.io/PTVS/)
>
>             .. [#coconut] Coconut
>                 (https://github.com/davidmalcolm/coconut)
>
>
>             Copyright
>             =========
>
>             This document has been placed in the public domain.
>
>
>             

>             ..
>                 Local Variables:
>                 mode: indented-text
>                 indent-tabs-mode: nil
>                 sentence-end-double-space: t
>                 fill-column: 70
>                 coding: utf-8
>                 End:
>
>
>         _______________________________________________
>         Python-Dev mailing list
>         Python-Dev at python.org <mailto:Python-Dev at python.org>
>         https://mail.python.org/mailman/listinfo/python-dev
>         Unsubscribe:
>         https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>
>
>
>     --
>     --Guido van Rossum (python.org/~guido <http://python.org/~guido>)
>
>
>
>
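
For readers following the quoted pseudocode, here is a small pure-Python
sketch of that counting-and-dispatch logic. It is not Pyjion's code:
default_eval(), jit_compile() and the SimpleNamespace frame are
hypothetical stand-ins, only the 20,000 threshold and the four fields come
from the PEP text, and the jit_code test is read as "run the compiled code
when it exists".

```python
from types import SimpleNamespace

JIT_THRESHOLD = 20_000  # threshold taken from the quoted pseudocode


class PyjionJittedCode:
    """Hypothetical stand-in holding the four fields listed in the PEP."""

    def __init__(self):
        self.exec_count = 0
        self.jit_failed = False
        self.eval = None      # trampoline function
        self.jit_code = None  # compiled machine code, once available


def default_eval(frame, throw_flag):
    # Stand-in for PyEval_EvalFrameDefault().
    return "interpreted"


def jit_compile(frame):
    # Pretend compilation always succeeds; a real JIT may fail here.
    extra = frame.code.co_extra
    extra.jit_code = "machine code"
    extra.eval = lambda jit_code, frame: "jitted"
    return True


def eval_frame(frame, throw_flag=0):
    extra = frame.code.co_extra
    if extra is None:
        extra = frame.code.co_extra = PyjionJittedCode()
    elif not extra.jit_failed:
        if extra.jit_code is not None:
            return extra.eval(extra.jit_code, frame)
        if extra.exec_count > JIT_THRESHOLD:
            if jit_compile(frame):
                return extra.eval(extra.jit_code, frame)
            extra.jit_failed = True
    extra.exec_count += 1
    return default_eval(frame, throw_flag)


# A fake frame whose code object has an empty co_extra slot.
frame = SimpleNamespace(code=SimpleNamespace(co_extra=None))
```

With these stubs the first 20,001 calls on the same code object are
interpreted and the next call runs the compiled version.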

From ethan at stoneleaf.us  Mon Jun 20 02:17:53 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 19 Jun 2016 23:17:53 -0700
Subject: [Python-Dev] security SIG?
In-Reply-To: <CAP7+vJ+09nYjtDmZa8nMS5WmC6hAzVVFELLV5dyqRmHR87fVFg@mail.gmail.com>
References: <90F89145-596F-403D-B789-59E4DA866491@theolliviers.com>
 <CAP7+vJ+7SM4EPxPdgk_Tpp6JM5Sa_CssWRniCNcOuKP-Twn_5A@mail.gmail.com>
 <52FF5A38-7AD3-4C8D-9248-FE1FFFA6A6C6@theolliviers.com>
 <CADiSq7fzbJf3XdonvhtyYMzZgX20D_zmZ351Zs_5QDQ74facbQ@mail.gmail.com>
 <CAP1=2W6rEsVAUwAye8Q3SvCnYM7oY_WFVFyCtVgO_mK=4-dJ-A@mail.gmail.com>
 <5B4E973C-B09E-487E-9074-3B42DC773B99@lukasa.co.uk>
 <576586B8.5090009@stoneleaf.us>
 <CADiSq7cF=C9scPHvuPHdeOmHS9vAskVjXWyFPbTNgbPYaabopg@mail.gmail.com>
 <5766F887.6090302@stoneleaf.us>
 <CAP7+vJ+09nYjtDmZa8nMS5WmC6hAzVVFELLV5dyqRmHR87fVFg@mail.gmail.com>
Message-ID: <57678A91.9010704@stoneleaf.us>

On 06/19/2016 03:51 PM, Guido van Rossum wrote:

> I think it's fine to have this SIG. I could see it going different ways
> in terms of discussions and membership, but it's definitely worth a try.
> I don't like clever names, and I very much doubt that it'll be mistaken
> for an address to report sensitive issues, so I think it should just be
> security-sig. (The sensitive-issues people are usually paranoid enough
> to check before they post; the script kiddies reporting python.org
> <http://python.org> "issues" probably will get a faster and more
> appropriate response from the security-sig.)
>
> So let's just do it.

Started the process of creating "security-sig".

--
~Ethan~

From guido at python.org  Mon Jun 20 11:49:36 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Jun 2016 08:49:36 -0700
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
Message-ID: <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>

I agree it's better to define the order as computed at runtime. I don't
think there's much of a point to mandate that all builtin/extension types
reveal their order too -- I doubt there will be many uses for that -- but I
don't want to disallow it either. But we can allow types to define this, as
long as it's in their documentation (so users can rely on it in those
cases).

As another point of review, I don't like the exception for dunder names. I
can see that __module__, __name__ etc. are distractions, but since you're
adding methods, you should also add methods with dunder names.

The overlap with PEP 487 makes me think that this feature is clearly
desirable (I like the name you give it in PEP 520 better, and PEP 487 is
too vague about its definition).

Finally, it seems someone is working on making all dicts ordered. Does that
mean this will soon be obsolete?

On Fri, Jun 17, 2016 at 6:32 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 7 June 2016 at 17:50, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> > Why is __definition_order__ even necessary?
> > -------------------------------------------
> >
> > Since the definition order is not preserved in ``__dict__``, it would be
> > lost once class definition execution completes.  Classes *could*
> > explicitly set the attribute as the last thing in the body.  However,
> > then independent decorators could only make use of classes that had done
> > so.  Instead, ``__definition_order__`` preserves this one bit of info
> > from the class body so that it is universally available.
>
> The discussion in the PEP 487 thread made me realise that I'd like to
> see a discussion in PEP 520 regarding whether or not to define
> __definition_order__ for builtin types initialised via PyType_Ready or
> created via PyType_FromSpec in addition to defining it for types
> created via the class statement or types.new_class().
>
> For static types, PyType_Ready could potentially set it based on
> tp_members, tp_methods & tp_getset (see
> https://docs.python.org/3/c-api/typeobj.html )
> Similarly, PyType_FromSpec could potentially set it based on the
> contents of Py_tp_members, Py_tp_methods and Py_tp_getset slot
> definitions
>
> Having definition order support in both types.new_class() and builtin
> types would also make it clear why we can't rely purely on the
> compiler to provide the necessary ordering information - in both of
> those cases, the Python compiler isn't directly involved in the type
> creation process.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Jun 20 12:32:36 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Jun 2016 09:32:36 -0700
Subject: [Python-Dev] Review of PEP 520: Ordered Class Definition Namespace
Message-ID: <CAP7+vJK5SVeng92QKvY4PpKR3OX=YLqjj2tgGFjbvmavnpLxxA@mail.gmail.com>

PEP 520 review notes.

(From previous message; edited.)

- I agree it's better to define the order as computed at runtime.

- I don't think there's much of a point to mandate that all
builtin/extension types reveal their order too -- I doubt there will be
many uses for that -- but I don't want to disallow it either. We can allow
types to define this, as long as it's in their documentation (so users can
rely on it in those cases).

- I don't like the exception for dunder names. I can see that __module__,
__name__ etc. that occur in every class are distractions, but since you're
adding methods, you should also add methods with dunder names like __init__
or __getattr__. (Otherwise, what if someone really wanted to create a
Django form with a field named __dunder__?)

- The overlap with PEP 487 makes me think that this feature is clearly
desirable (I like the name you give it in PEP 520 better, and PEP 487 is
too vague about its definition).

- It seems someone is working on making all dicts ordered. Does that mean
this will soon be obsolete? It would be a shame if we ended up having to
give every class an extra attribute that is just a materialization of
C.__dict__.keys() with (some) dunder names filtered out.

(New)

- It's a shame we can't just make __dict__ (a proxy to) an OrderedDict,
then we wouldn't need an extra attribute. Hm, maybe we could customize the
proxy class so its keys(), values(), items() views return things in the
order of __definition_order__? (In the tracker discussion this was
considered a big deal, but given that a class __dict__ is already a
readonly proxy I'm not sure I agree. Or is this about C level access? How
much of that would break?)

- I don't see why it needs to be a read-only attribute. There are very few
of those -- in general we let users play around with things unless we have
a hard reason to restrict assignment (e.g. the interpreter's internal state
could be compromised). I don't see such a hard reason here.

- All in all the motivation is fairly weak -- it seems to be mostly
motivated by avoiding a custom metaclass for this purpose because combining
metaclasses is a pain. I realize it's only a small patch in a small corner
of the language, but I do worry about repercussions -- it's an API that's
going to be used for new (and useful) purposes so we will never be able to
get rid of it.
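
A rough sketch of the proxy idea from the (New) notes above, assuming the
order is available as a plain sequence of names. OrderedClassView is a
made-up name, and the order argument stands in for whatever
__definition_order__ would hold; this is not how CPython's mappingproxy is
implemented.

```python
from collections.abc import Mapping


class OrderedClassView(Mapping):
    """Read-only view of a class namespace that iterates in a given order.

    Hypothetical helper: `order` plays the role of __definition_order__.
    """

    def __init__(self, namespace, order):
        self._ns = namespace
        self._order = tuple(order)

    def __getitem__(self, key):
        return self._ns[key]

    def __iter__(self):
        listed = set(self._order)
        # Names from the recorded order come first ...
        yield from (name for name in self._order if name in self._ns)
        # ... followed by anything the order did not mention.
        yield from (name for name in self._ns if name not in listed)

    def __len__(self):
        return len(self._ns)


class C:
    b = 1
    a = 2


# The order is passed explicitly here; it is deliberately not source order
# to show that the view, not the underlying dict, decides the iteration.
view = OrderedClassView(vars(C), ("a", "b"))
```

Because the class derives from Mapping, keys(), values() and items() all
follow the same order without further code.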

Note: I'm neither accepting nor rejecting the PEP; I'm merely inviting more
contemplation.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Jun 20 12:48:34 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Jun 2016 09:48:34 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
Message-ID: <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>

On Thu, Jun 16, 2016 at 3:24 PM, Nikita Nemkin <nikita at nemkin.ru> wrote:

> On Fri, Jun 17, 2016 at 2:36 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > On 16 June 2016 at 14:17, Martin Teichmann <lkb.teichmann at gmail.com> wrote:
>
> > An implementation like PyPy, with an inherently ordered standard dict
> > implementation, can just rely on that rather than being obliged to
> > switch to their full collections.OrderedDict type.
>
> I didn't know that PyPy has actually implemented packed ordered dicts!
>
> https://morepypy.blogspot.ru/2015/01/faster-more-memory-efficient-and-more.html
> https://mail.python.org/pipermail/python-dev/2012-December/123028.html
>
> This old idea by Raymond Hettinger is vastly superior to
> __definition_order__ duct tape (now that PyPy has validated it).
> It also gives kwarg order for free, which is important in many
> metaprogramming scenarios.
> Not to mention memory usage reduction and dict operations speedup...
>

That idea is only vastly superior if we want to force all other Python
implementations to also have an order-preserving dict with the same
semantics and API.

I'd like to hear more about your metaprogramming scenarios -- often such
things end up being code the author is ashamed of. Perhaps they should stay
in the shadows? Or could we do something to make it so you won't have to be
ashamed of it?

-- 
--Guido van Rossum (python.org/~guido)

From nikita at nemkin.ru  Mon Jun 20 14:31:54 2016
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Mon, 20 Jun 2016 23:31:54 +0500
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
Message-ID: <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>

On Mon, Jun 20, 2016 at 9:48 PM, Guido van Rossum <guido at python.org> wrote:
> On Thu, Jun 16, 2016 at 3:24 PM, Nikita Nemkin <nikita at nemkin.ru> wrote:
>>
>> I didn't know that PyPy has actually implemented packed ordered dicts!
>>
>> https://morepypy.blogspot.ru/2015/01/faster-more-memory-efficient-and-more.html
>> https://mail.python.org/pipermail/python-dev/2012-December/123028.html
>>
>> This old idea by Raymond Hettinger is vastly superior to
>> __definition_order__ duct tape (now that PyPy has validated it).
>> It also gives kwarg order for free, which is important in many
>> metaprogramming scenarios.
>> Not to mention memory usage reduction and dict operations speedup...
>
>
> That idea is only vastly superior if we want to force all other Python
> implementations to also have an order-preserving dict with the same
> semantics and API.

Right. Ordered by default is a very serious implementation constraint.
It's only superior in the sense that it completely subsumes/obsoletes
PEP 520.

> I'd like to hear more about your metaprogramming scenarios -- often such
> things end up being code the author is ashamed of. Perhaps they should stay
> in the shadows? Or could we do something to make it so you won't have to be
> ashamed of it?

What I meant is embedding declarative domain-specific languages
in Python. Examples of such languages include SQL table
definitions, binary data definitions (in-memory C structs or
wire protocol), GUI definitions (look up enaml for an interesting
example), etc. etc. DSLs are a well defined field and the point
of embedding into Python is to implement in Python and to
empower DSL with Python constructs for generation and logic.

Basic blocks for a declarative language are lists and "objects" -
groups of ordered, named fields.

Representing lists is easy and elegant, commas make a tuple
and [] makes a list.

It's when trying to represent "objects" that the issues arise.
Literal dicts are "ugly" (for DSL purposes) and unordered.
Lists of 2-tuples are even uglier. Py3 gave us __prepare__ for
ordered class bodies, and this became a first valid option.
For example, SQL table:

    class MyTable(SqlTable):
        field1 = Type1(options...)
        field2 = Type2()
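
A minimal sketch of how such a table base class can capture field order
through __prepare__. Field, SqlTableMeta and _fields are hypothetical
names for illustration (Field stands in for Type1/Type2 above); nothing
here is taken from a real ORM.

```python
from collections import OrderedDict


class Field:
    """Hypothetical column type; stands in for Type1/Type2."""

    def __init__(self, **options):
        self.options = options


class SqlTableMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed into this mapping, so assignment
        # order is preserved for __new__ to inspect.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Record the declared fields in the order they were written.
        cls._fields = tuple(
            key for key, value in namespace.items() if isinstance(value, Field)
        )
        return cls


class SqlTable(metaclass=SqlTableMeta):
    pass


class MyTable(SqlTable):
    field1 = Field(primary_key=True)
    field2 = Field()
```

Here MyTable._fields comes out as ("field1", "field2"), which is the
ordering a table definition needs.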

Unfortunately, class declarations don't look good when nested,
and nesting is a common thing.

    class MainWindow:
        caption = "Window"
        class HSplit:
            label1 = Label(...)
            text1 = Text(...)

You get the idea.
Another option for expressing "objects" is function calls with kwargs:

    packet = Struct(type=uint8,
                    length=uint32,
                    body=Array(uint8, 'type'))

Looks reasonably clean, but more often than not requires kwargs
to be ordered. THIS is the scenario I was talking about.

Function attributes also have a role, but being
attached to function definitions, their scope is somewhat limited.

Of course, all of the above is largely theoretical, for two basic
reasons:
1) Python syntax/runtime is too rigid for a declarative DSL.
   (Specifically, _embedded_ DSL. The syntax alone can be re-used
   with ast.parse, but it's a different scenario.)
2) DSLs in general are grossly unpythonic, hiding loads of magic
   and unfamiliar semantics behind what looks like normal Python.
   It's not something to be ashamed of, but the benefit
   rarely justifies the (maintenance) cost.

To be clear: I'm NOT advocating for ordered kwargs. Embedding
DSLs into Python is generally a bad idea.

PS. __prepare__ enables many DSL tricks. In fact, it's difficult to
imagine a use case that's not related to some attempt at DSL.
Keyword-only args also help: ordered part of the definition can go
into *args, while attributes/options are kw-only args.
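
The *args plus keyword-only split could look roughly like this. Struct,
align and the string type names are invented for the example; the point is
only that positional arguments keep their order without any help from the
dict implementation.

```python
class Struct:
    """Hypothetical DSL node: ordered fields positionally, options by keyword."""

    def __init__(self, *fields, align=1):
        # Positional arguments keep their call order by construction,
        # so no ordered **kwargs is needed for the field list.
        self.fields = tuple(fields)
        # `align` can only be passed by keyword because it follows *fields.
        self.align = align

    def names(self):
        return [name for name, _type in self.fields]


packet = Struct(
    ("type", "uint8"),
    ("length", "uint32"),
    ("body", "uint8[]"),
    align=4,
)
```

The cost is the 2-tuple noise mentioned earlier, which is exactly the
trade-off against ordered kwargs.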

From ethan at stoneleaf.us  Mon Jun 20 15:24:22 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 20 Jun 2016 12:24:22 -0700
Subject: [Python-Dev] New security-sig mailling list
Message-ID: <576842E6.2030805@stoneleaf.us>

has been created:

   https://mail.python.org/mailman/listinfo/security-sig

The purpose of this list is to discuss security-related enhancements to 
Python while having as little impact on backwards compatibility as possible.

Once a proposal is ready it will be presented to Python Dev.

(This text is subject to change. ;)

--
~Ethan~

From brett at python.org  Mon Jun 20 15:52:23 2016
From: brett at python.org (Brett Cannon)
Date: Mon, 20 Jun 2016 19:52:23 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <57676A8C.8070207@hotpy.org>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <57676A8C.8070207@hotpy.org>
Message-ID: <CAP1=2W5_Mf+v6iXCMNf6XPudFKv2LoGMxa6ieWdrkwVXiW17dQ@mail.gmail.com>

On Sun, 19 Jun 2016 at 21:01 Mark Shannon <mark at hotpy.org> wrote:

>
>
> On 19/06/16 18:29, Brett Cannon wrote:
> >
> >
> > On Sat, 18 Jun 2016 at 21:49 Guido van Rossum <guido at python.org
> > <mailto:guido at python.org>> wrote:
> >
> >     Hi Brett,
> >
> >     I've got a few questions about the specific design. Probably you
> >     know the answers, it would be nice to have them in the PEP.
> >
> >
> > Once you're happy with my answers I'll update the PEP.
> >
> >
> >     First, why not have a global hook? What does a hook per interpreter
> >     give you? Would even finer granularity buy anything?
> >
> >
> > We initially considered a per-code object hook, but we figured it was
> > unnecessary to have that level of control, especially since people like
> > Numba have gotten away with not needing it for this long (although I
> > suspect that's because they are a decorator so they can just return an
> > object that overrides __call__()). We didn't think that a global one was
> > appropriate as different workloads may call for different
> > JITs/debuggers/etc. and there is no guarantee that you are executing
> > every interpreter with the same workload. Plus we figured people might
> > simply import their JIT of choice and as a side-effect set the hook, and
> > since imports are a per-interpreter thing that seemed to suggest the
> > granularity of interpreters.
> >
> > IOW it seemed to be more in line with sys.settrace() than some global
> > thing for the process.
> >
> >
> >     Next, I'm a bit (but no more than a bit) concerned about the extra 8
> >     bytes per code object, especially since for most people this is just
> >     waste (assuming most people won't be using Pyjion or Numba). Could
> >     it be a compile-time feature (requiring recompilation of CPython but
> >     not extensions)?
> >
> >
> > Probably. It does water down potential usage thanks to needing a special
> > build. If the decision is "special build or not", I would simply pull
> > out this part of the proposal as I wouldn't want to add a flag that
> > influences what is or is not possible for an interpreter.
> >
> >     Could you figure out some other way to store per-code-object data?
> >     It seems you considered this but decided that the co_extra field was
> >     simpler and faster; I'm basically pushing a little harder on this.
> >     Of course most of the PEP would disappear without this feature; the
> >     extra interpreter field is fine.
> >
> >
> > Dino and I thought of two potential alternatives, neither of which we
> > have taken the time to implement and benchmark. One is to simply have a
> > hash table of memory addresses to JIT data that is kept on the JIT side
> > of things. Obviously it would be nice to avoid the overhead of a hash
> > table lookup on every function call. This also doesn't help minimize
> > memory when the code object gets GC'ed.
>
> Hash lookups aren't that slow.

There's "slow" and there's "slower".

> If you combine it with the custom flags
> suggested by MRAB, then you would only suffer the lookup penalty when
> actually entering the special interpreter.
>

You actually will always need the lookup in the JIT case to increment the
execution count if you're not always immediately JIT-ing. That means MRAB's
flag won't necessarily be that useful in the JIT case (it could in the
debugging case, though, if you're really aiming for the fastest debugger
possible).

> You can use a weakref callback to ensure things get GC'd properly.
>

Yes, that was already the plan if we lost co_extra.

>
> Also, if there is a special extra field on code-object, then everyone
> will want to use it. How do you handle clashes?
>

As already explained in the PEP in
https://www.python.org/dev/peps/pep-0523/#expanding-pycodeobject, like
consenting adults. The expectation is that there will not be multiple users
of the object at the same time.

-Brett

>
> >
> > The other potential solution we came up with was to use weakrefs. I have
> > not looked into the details, but we were thinking that if we registered
> > the JIT data object as a weakref on the code object, couldn't we iterate
> > through the weakrefs attached to the code object to look for the JIT
> > data object, and then get the reference that way? It would let us avoid
> > a more expensive hash table lookup if we assume most code objects won't
> > have a weakref on it (assuming weakrefs are stored in a list), and it
> > gives us the proper cleanup semantics we want by getting the weakref
> > cleanup callback execution to make sure we decref the JIT data object
> > appropriately. But as I said, I have not looked into the feasibility of
> > this at all to know if I'm remembering the weakref implementation
> > details correctly.
> >
> >
> >     Finally, there are some error messages from pep2html.py:
> >     https://www.python.org/dev/peps/pep-0523/#copyright
> >
> >
> > All fixed in
> > https://github.com/python/peps/commit/6929f850a5af07e51d0163558a5fe8d6b85dccfe .
> >
> > -Brett
> >
> >
> >
> >     --Guido
> >
> >     On Fri, Jun 17, 2016 at 7:58 PM, Brett Cannon <brett at python.org
> >     <mailto:brett at python.org>> wrote:
> >
> >         I have taken PEP 523 for this:
> >         https://github.com/python/peps/blob/master/pep-0523.txt .
> >
> >         I'm waiting until Guido gets back from vacation, at which point
> >         I'll ask for a pronouncement or assignment of a BDFL delegate.
> >
> >         On Fri, 3 Jun 2016 at 14:37 Brett Cannon <brett at python.org
> >         <mailto:brett at python.org>> wrote:
> >
> >             For those of you who follow python-ideas or were at the
> >             PyCon US 2016 language summit, you have already seen/heard
> >             about this PEP. For those of you who don't fall into either
> >             of those categories, this PEP proposes a frame evaluation
> >             API for CPython. The motivating example of this work has
> >             been Pyjion, the experimental CPython JIT Dino Viehland and
> >             I have been working on in our spare time at Microsoft. The
> >             API also works for debugging, though, as already
> >             demonstrated by Google having added a very similar API
> >             internally for debugging purposes.
> >
> >             The PEP is pasted in below and also available in rendered
> >             form at
> >             https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I
> >             will assign myself a PEP # once discussion is finished as
> >             it's easier to work in git for this for the rich rendering
> >             of the in-progress PEP).
> >
> >             I should mention that the differences in the PEP since
> >             python-ideas and the language summit are the listed support
> >             from Google's use of a very similar API and a clarification
> >             that the co_extra field on code objects doesn't change their
> >             immutability (at least from the view of the PEP).
> >
> >             ----------
> >             PEP: NNN
> >             Title: Adding a frame evaluation API to CPython
> >             Version: $Revision$
> >             Last-Modified: $Date$
> >             Author: Brett Cannon <brett at python.org
> >             <mailto:brett at python.org>>,
> >                      Dino Viehland <dinov at microsoft.com
> >             <mailto:dinov at microsoft.com>>
> >             Status: Draft
> >             Type: Standards Track
> >             Content-Type: text/x-rst
> >             Created: 16-May-2016
> >             Post-History: 16-May-2016
> >                            03-Jun-2016
> >
> >
> >             Abstract
> >             ========
> >
> >             This PEP proposes to expand CPython's C API [#c-api]_ to allow for
> >             the specification of a per-interpreter function pointer to handle the
> >             evaluation of frames [#pyeval_evalframeex]_. This proposal also
> >             suggests adding a new field to code objects [#pycodeobject]_ to store
> >             arbitrary data for use by the frame evaluation function.
> >
> >
> >             Rationale
> >             =========
> >
> >             One place where flexibility has been lacking in Python is in the direct
> >             execution of Python code. While CPython's C API [#c-api]_ allows for
> >             constructing the data going into a frame object and then evaluating it
> >             via ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_, control over the
> >             execution of Python code comes down to individual objects instead of a
> >             holistic control of execution at the frame level.
> >
> >             While wanting to have influence over frame evaluation may
> >             seem a bit
> >             too low-level, it does open the possibility for things such
> as a
> >             method-level JIT to be introduced into CPython without
> >             CPython itself
> >             having to provide one. By allowing external C code to
> >             control frame
> >             evaluation, a JIT can participate in the execution of Python
> >             code at
> >             the key point where evaluation occurs. This then allows for
> >             a JIT to
> >             conditionally recompile Python bytecode to machine code as
> >             desired
> >             while still allowing for executing regular CPython bytecode
> when
> >             running the JIT is not desired. This can be accomplished by
> >             allowing
> >             interpreters to specify what function to call to evaluate a
> >             frame. And
> >             by placing the API at the frame evaluation level it allows
> for a
> >             complete view of the execution environment of the code for
> >             the JIT.
> >
> >             This ability to specify a frame evaluation function also
> >             allows for
> >             other use-cases beyond just opening CPython up to a JIT. For
> >             instance,
> >             it would not be difficult to implement a tracing or
> >             profiling function
> >             at the call level with this API. While CPython does provide
> the
> >             ability to set a tracing or profiling function at the Python
> >             level,
> >             this would be able to match the data collection of the
> >             profiler and
> >             quite possibly be faster for tracing by simply skipping
> per-line
> >             tracing support.
> >
> >             It also opens up the possibility of debugging where the frame
> >             evaluation function only performs special debugging work
> when it
> >             detects it is about to execute a specific code object. In
> that
> >             instance the bytecode could be theoretically rewritten
> >             in-place to
> >             inject a breakpoint function call at the proper point for
> >             help in
> >             debugging while not having to do a heavy-handed approach as
> >             required by ``sys.settrace()``.
> >
> >             To help facilitate these use-cases, we are also proposing
> >             the adding
> >             of a "scratch space" on code objects via a new field. This
> >             will allow
> >             per-code object data to be stored with the code object
> >             itself for easy
> >             retrieval by the frame evaluation function as necessary. The
> >             field
> >             itself will simply be a ``PyObject *`` type so that any data
> >             stored in
> >             the field will participate in normal object memory
> management.
> >
> >
> >             Proposal
> >             ========
> >
> >             All proposed C API changes below will not be part of the
> >             stable ABI.
> >
> >
> >             Expanding ``PyCodeObject``
> >             --------------------------
> >
> >             One field is to be added to the ``PyCodeObject`` struct
> >             [#pycodeobject]_::
> >
> >                typedef struct {
> >                   ...
> >                   PyObject *co_extra;  /* "Scratch space" for the code object. */
> >                } PyCodeObject;
> >
> >             The ``co_extra`` will be ``NULL`` by default and will not be
> >             used by
> >             CPython itself. Third-party code is free to use the field as
> >             desired.
> >             Values stored in the field are expected to not be required
> >             in order
> >             for the code object to function, allowing the loss of the
> >             data of the
> >             field to be acceptable (this keeps the code object as
> >             immutable from
> >             a functionality point-of-view; this is slightly contentious
> >             and so is
> >             listed as an open issue in `Is co_extra needed?`_). The
> >             field will be
> >             freed like all other fields on ``PyCodeObject`` during
> >             deallocation
> >             using ``Py_XDECREF()``.
> >
> >             It is not recommended that multiple users attempt to use the
> >             ``co_extra`` simultaneously. While a dictionary could
> >             theoretically be
> >             set to the field and various users could use a key specific
> >             to the
> >             project, there is still the issue of key collisions as well
> as
> >             performance degradation from using a dictionary lookup on
> >             every frame
> >             evaluation. Users are expected to do a type check to make
> >             sure that
> >             the field has not been previously set by someone else.
> >
> >
> >             Expanding ``PyInterpreterState``
> >             --------------------------------
> >
> >             The entrypoint for the frame evaluation function is
> >             per-interpreter::
> >
> >                // Same type signature as PyEval_EvalFrameEx().
> >                typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);
> >
> >                typedef struct {
> >                    ...
> >                    PyFrameEvalFunction eval_frame;
> >                } PyInterpreterState;
> >
> >             By default, the ``eval_frame`` field will be initialized to
> >             a function
> >             pointer that represents what ``PyEval_EvalFrameEx()``
> >             currently is
> >             (called ``PyEval_EvalFrameDefault()``, discussed later in
> >             this PEP).
> >             Third-party code may then set their own frame evaluation
> >             function
> >             instead to control the execution of Python code. A pointer
> >             comparison
> >             can be used to detect if the field is set to
> >             ``PyEval_EvalFrameDefault()`` and thus has not been mutated
> yet.
> >
> >
> >             Changes to ``Python/ceval.c``
> >             -----------------------------
> >
> >             ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it
> >             currently stands
> >             will be renamed to ``PyEval_EvalFrameDefault()``. The new
> >             ``PyEval_EvalFrameEx()`` will then become::
> >
> >                  PyObject *
> >                  PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
> >                  {
> >                      PyThreadState *tstate = PyThreadState_GET();
> >                      return tstate->interp->eval_frame(frame, throwflag);
> >                  }
> >
> >             This allows third-party code to place themselves directly in
> >             the path
> >             of Python code execution while being backwards-compatible
> >             with code
> >             already using the pre-existing C API.
> >
> >
> >             Updating ``python-gdb.py``
> >             --------------------------
> >
> >             The generated ``python-gdb.py`` file used for Python support
> >             in GDB
> >             makes some hard-coded assumptions about
> >             ``PyEval_EvalFrameEx()``, e.g.
> >             the names of local variables. It will need to be updated to
> >             work with
> >             the proposed changes.
> >
> >
> >             Performance impact
> >             ==================
> >
> >             As this PEP is proposing an API to add pluggability,
> performance
> >             impact is considered only in the case where no third-party
> >             code has
> >             made any changes.
> >
> >             Several runs of pybench [#pybench]_ consistently showed no
> >             performance
> >             cost from the API change alone.
> >
> >             A run of the Python benchmark suite [#py-benchmarks]_ showed
> no
> >             measurable cost in performance.
> >
> >             In terms of memory impact, since there are typically not
> >             many CPython
> >             interpreters executing in a single process that means the
> >             impact of
> >             ``co_extra`` being added to ``PyCodeObject`` is the only
> worry.
> >             According to [#code-object-count]_, a run of the Python test
> >             suite
> >             results in about 72,395 code objects being created. On a
> 64-bit
> >             CPU that would result in 579,160 bytes of extra memory being
> >             used if
> >             all code objects were alive at once and had nothing set in
> their
> >             ``co_extra`` fields.
> >
> >
> >             Example Usage
> >             =============
> >
> >             A JIT for CPython
> >             -----------------
> >
> >             Pyjion
> >             ''''''
> >
> >             The Pyjion project [#pyjion]_ has used this proposed API to
> >             implement
> >             a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each
> code
> >             object has its ``co_extra`` field set to a
> >             ``PyjionJittedCode`` object
> >             which stores four pieces of information:
> >
> >             1. Execution count
> >             2. A boolean representing whether a previous attempt to JIT
> >             failed
> >             3. A function pointer to a trampoline (which can be type
> >             tracing or not)
> >             4. A void pointer to any JIT-compiled machine code
> >
> >             The frame evaluation function has (roughly) the following
> >             algorithm::
> >
> >                  def eval_frame(frame, throw_flag):
> >                      pyjion_code = frame.code.co_extra
> >                      if not pyjion_code:
> >                          # First execution: attach fresh bookkeeping data.
> >                          pyjion_code = frame.code.co_extra = PyjionJittedCode()
> >                      elif not pyjion_code.jit_failed:
> >                          if pyjion_code.jit_code:
> >                              # Already compiled: run the machine code.
> >                              return pyjion_code.eval(pyjion_code.jit_code, frame)
> >                          elif pyjion_code.exec_count > 20_000:
> >                              if jit_compile(frame):
> >                                  return pyjion_code.eval(pyjion_code.jit_code, frame)
> >                              else:
> >                                  pyjion_code.jit_failed = True
> >                      pyjion_code.exec_count += 1
> >                      return PyEval_EvalFrameDefault(frame, throw_flag)
> >
> >             The key point, though, is that all of this work and logic is
> >             separate
> >             from CPython and yet with the proposed API changes it is
> able to
> >             provide a JIT that is compliant with Python semantics (as of
> >             this
> >             writing, performance is almost equivalent to CPython without
> >             the new
> >             API). This means there's nothing technically preventing
> >             others from
> >             implementing their own JITs for CPython by utilizing the
> >             proposed API.
> >
> >
> >             Other JITs
> >             ''''''''''
> >
> >             It should be mentioned that the Pyston team was consulted on an
> >             earlier version of this PEP that was more JIT-specific and they were
> >             not interested in utilizing the changes proposed because they want
> >             control over memory layout and had no interest in directly supporting
> >             CPython itself. An informal discussion with a developer on the PyPy
> >             team led to a similar comment.
> >
> >             Numba [#numba]_, on the other hand, suggested that they
> would be
> >             interested in the proposed change in a post-1.0 future for
> >             themselves [#numba-interest]_.
> >
> >             The experimental Coconut JIT [#coconut]_ could have
> >             benefitted from
> >             this PEP. In private conversations with Coconut's creator we
> >             were told
> >             that our API was probably superior to the one they developed
> for
> >             Coconut to add JIT support to CPython.
> >
> >
> >             Debugging
> >             ---------
> >
> >             In conversations with the Python Tools for Visual Studio
> >             team (PTVS)
> >             [#ptvs]_, they thought they would find these API changes
> >             useful for
> >             implementing more performant debugging. As mentioned in the
> >             Rationale_
> >             section, this API would allow for switching on debugging
> >             functionality
> >             only in frames where it is needed. This could allow for
> either
> >             skipping information that ``sys.settrace()`` normally
> >             provides and
> >             even go as far as to dynamically rewrite bytecode prior to
> >             execution
> >             to inject e.g. breakpoints in the bytecode.
> >
> >             It also turns out that Google has provided a very similar API
> >             internally for years. It has been used for performant
> debugging
> >             purposes.
> >
> >
> >             Implementation
> >             ==============
> >
> >             A set of patches implementing the proposed API is available
> >             through
> >             the Pyjion project [#pyjion]_. In its current form it has
> more
> >             changes to CPython than just this proposed API, but that is
> >             for ease
> >             of development instead of strict requirements to accomplish
> >             its goals.
> >
> >
> >             Open Issues
> >             ===========
> >
> >             Allow ``eval_frame`` to be ``NULL``
> >             -----------------------------------
> >
> >             Currently the frame evaluation function is expected to
> >             always be set.
> >             It could very easily simply default to ``NULL`` instead
> >             which would
> >             signal to use ``PyEval_EvalFrameDefault()``. The current
> >             proposal of
> >             not special-casing the field seemed the most
> >             straight-forward, but it
> >             does require that the field not accidentally be cleared,
> >             else a crash
> >             may occur.
> >
> >
> >             Is co_extra needed?
> >             -------------------
> >
> >             While discussing this PEP at PyCon US 2016, some core
> developers
> >             expressed their worry of the ``co_extra`` field making code
> >             objects
> >             mutable. The thinking seemed to be that having a field that
> was
> >             mutated after the creation of the code object made the
> >             object seem
> >             mutable, even though no other aspect of code objects changed.
> >
> >             The view of this PEP is that the ``co_extra`` field doesn't
> >             change the
> >             fact that code objects are immutable. The field is specified
> >             in this
> >             PEP as to not contain information required to make the code
> >             object
> >             usable, making it more of a caching field. It could be
> viewed as
> >             similar to the UTF-8 cache that string objects have
> internally;
> >             strings are still considered immutable even though they have
> >             a field
> >             that is conditionally set.
> >
> >             The field is also not strictly necessary. While the field greatly
> >             simplifies attaching extra information to code objects, other options
> >             are possible, such as keeping a mapping of code object memory addresses
> >             to what would have been kept in ``co_extra``, or using a weak reference
> >             of the data on the code object and then iterating through the weak
> >             references until the attached data is found. But obviously none of
> >             these solutions is as simple or performant as adding the ``co_extra``
> >             field.
> >
> >
> >             Rejected Ideas
> >             ==============
> >
> >             A JIT-specific C API
> >             --------------------
> >
> >             Originally this PEP was going to propose a much larger API
> >             change
> >             which was more JIT-specific. After soliciting feedback from
> >             the Numba
> >             team [#numba]_, though, it became clear that the API was
> >             unnecessarily
> >             large. The realization was made that all that was truly
> >             needed was the
> >             opportunity to provide a trampoline function to handle
> >             execution of
> >             Python code that had been JIT-compiled and a way to attach
> that
> >             compiled machine code along with other critical data to the
> >             corresponding Python code object. Once it was shown that
> >             there was no
> >             loss in functionality or in performance while minimizing the
> API
> >             changes required, the proposal was changed to its current
> form.
> >
> >
> >             References
> >             ==========
> >
> >             .. [#pyjion] Pyjion project
> >                 (https://github.com/microsoft/pyjion)
> >
> >             .. [#c-api] CPython's C API
> >                 (https://docs.python.org/3/c-api/index.html)
> >
> >             .. [#pycodeobject] ``PyCodeObject``
> >                 (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)
> >
> >             .. [#coreclr] .NET Core Runtime (CoreCLR)
> >                 (https://github.com/dotnet/coreclr)
> >
> >             .. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
> >                 (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)
> >
> >             .. [#numba] Numba
> >                 (http://numba.pydata.org/)
> >
> >             .. [#numba-interest]  numba-users mailing list:
> >                 "Would the C API for a JIT entrypoint being proposed by Pyjion help out Numba?"
> >                 (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)
> >
> >             .. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
> >                 (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)
> >
> >             .. [#py-benchmarks] Python benchmark suite
> >                 (https://hg.python.org/benchmarks)
> >
> >             .. [#pyston] Pyston
> >                 (http://pyston.org)
> >
> >             .. [#pypy] PyPy
> >                 (http://pypy.org/)
> >
> >             .. [#ptvs] Python Tools for Visual Studio
> >                 (http://microsoft.github.io/PTVS/)
> >
> >             .. [#coconut] Coconut
> >                 (https://github.com/davidmalcolm/coconut)
> >
> >
> >             Copyright
> >             =========
> >
> >             This document has been placed in the public domain.
> >
> >
> >
> >             ..
> >                 Local Variables:
> >                 mode: indented-text
> >                 indent-tabs-mode: nil
> >                 sentence-end-double-space: t
> >                 fill-column: 70
> >                 coding: utf-8
> >                 End:
> >
> >
> >         _______________________________________________
> >         Python-Dev mailing list
> >         Python-Dev at python.org <mailto:Python-Dev at python.org>
> >         https://mail.python.org/mailman/listinfo/python-dev
> >         Unsubscribe:
> >         https://mail.python.org/mailman/options/python-dev/guido%40python.org
> >
> >
> >
> >
> >     --
> >     --Guido van Rossum (python.org/~guido <http://python.org/~guido>)
> >
> >
> >
>

From brett at python.org  Mon Jun 20 15:59:45 2016
From: brett at python.org (Brett Cannon)
Date: Mon, 20 Jun 2016 19:59:45 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
Message-ID: <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>

On Sun, 19 Jun 2016 at 19:37 Guido van Rossum <guido at python.org> wrote:

> On Sun, Jun 19, 2016 at 6:29 PM, Brett Cannon <brett at python.org> wrote:
>
>>
>>
>> On Sat, 18 Jun 2016 at 21:49 Guido van Rossum <guido at python.org> wrote:
>>
>>> Hi Brett,
>>>
>>> I've got a few questions about the specific design. Probably you know
>>> the answers, it would be nice to have them in the PEP.
>>>
>>
>> Once you're happy with my answers I'll update the PEP.
>>
>
> Soon!
>
>
>>
>>
>>>
>>> First, why not have a global hook? What does a hook per interpreter give
>>> you? Would even finer granularity buy anything?
>>>
>>
>> We initially considered a per-code object hook, but we figured it was
>> unnecessary to have that level of control, especially since people like
>> Numba have gotten away with not needing it for this long (although I
>> suspect that's because they are a decorator so they can just return an
>> object that overrides __call__()).
>>
>
> So they do it at the function object level?
>

Yes. They use a decorator, allowing them to completely control what
function object gets returned.
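
A minimal sketch of that decorator approach, assuming nothing about
Numba's actual internals: the names `jit` and `CompiledFunction` are
hypothetical, and the wrapper only counts calls where a real JIT would
dispatch to compiled code.

```python
import functools


class CompiledFunction:
    """Callable wrapper standing in for a JIT-compiled function (hypothetical)."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self._func = func
        self.call_count = 0

    def __call__(self, *args, **kwargs):
        # A real JIT would dispatch to compiled machine code here; this
        # sketch only counts calls and falls back to the original function.
        self.call_count += 1
        return self._func(*args, **kwargs)


def jit(func):
    """Decorator that replaces the function with a wrapper object."""
    return CompiledFunction(func)


@jit
def add(a, b):
    return a + b
```

Because the decorator decides what object is bound to the name `add`,
no hook into frame evaluation is needed for this style of JIT.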

>
>
>> We didn't think that a global one was appropriate as different workloads
>> may call for different JITs/debuggers/etc. and there is no guarantee that
>> you are executing every interpreter with the same workload. Plus we figured
>> people might simply import their JIT of choice and as a side-effect set the
>> hook, and since imports are a per-interpreter thing that seemed to suggest
>> the granularity of interpreters.
>>
>
> I like import as the argument here.
>
>
>>
>> IOW it seemed to be more in line with sys.settrace() than some global
>> thing for the process.
>>
>>
>>>
>>> Next, I'm a bit (but no more than a bit) concerned about the extra 8
>>> bytes per code object, especially since for most people this is just waste
>>> (assuming most people won't be using Pyjion or Numba). Could it be a
>>> compile-time feature (requiring recompilation of CPython but not
>>> extensions)?
>>>
>>
>> Probably. It does water down potential usage thanks to needing a special
>> build. If the decision is "special build or not", I would simply pull out
>> this part of the proposal as I wouldn't want to add a flag that influences
>> what is or is not possible for an interpreter.
>>
>
> MRAB's response made me think of a possible approach: the co_extra field
> could be the very last field of the PyCodeObject struct and only present if
> a certain flag is set in co_flags. This is similar to a trick used by X11
> (I know, it's long ago :-).
>

But that doesn't resolve your memory worry, right? For a JIT you will have
to access the memory regardless for execution count (unless Yury's patch to
add caching goes in, in which case it will be provided by code objects
already).
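
For scale, the memory worry can be put in numbers using the figures
quoted in the PEP's performance section (one pointer per code object on
a 64-bit build, 72,395 code objects from a test-suite run). This is
only a restatement of that arithmetic, not a new measurement.

```python
POINTER_SIZE = 8        # bytes per pointer on a 64-bit build
CODE_OBJECTS = 72_395   # code objects created by a test-suite run, per the PEP

# Cost if every code object is alive at once with co_extra left unset.
extra_bytes = POINTER_SIZE * CODE_OBJECTS
extra_kib = round(extra_bytes / 1024, 1)

print(extra_bytes)  # 579160
print(extra_kib)    # 565.6
```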

>
>>
>>> Could you figure out some other way to store per-code-object data? It
>>> seems you considered this but decided that the co_extra field was simpler
>>> and faster; I'm basically pushing a little harder on this. Of course most
>>> of the PEP would disappear without this feature; the extra interpreter
>>> field is fine.
>>>
>>
>> Dino and I thought of two potential alternatives, neither of which we
>> have taken the time to implement and benchmark. One is to simply have a
>> hash table of memory addresses to JIT data that is kept on the JIT side of
>> things. Obviously it would be nice to avoid the overhead of a hash table
>> lookup on every function call. This also doesn't help minimize memory when
>> the code object gets GC'ed.
>>
>
> I guess the prospect of the extra hash lookup per call isn't great given
> that this is about perf...
>

It's not desirable, but it isn't the end of the world either. I think Dino
doesn't believe it will be that big of a deal to switch to a hash table.

>
>> The other potential solution we came up with was to use weakrefs. I have
>> not looked into the details, but we were thinking that if we registered the
>> JIT data object as a weakref on the code object, couldn't we iterate
>> through the weakrefs attached to the code object to look for the JIT data
>> object, and then get the reference that way? It would let us avoid a more
>> expensive hash table lookup if we assume most code objects won't have a
>> weakref on it (assuming weakrefs are stored in a list), and it gives us the
>> proper cleanup semantics we want by getting the weakref cleanup callback
>> execution to make sure we decref the JIT data object appropriately. But as
>> I said, I have not looked into the feasibility of this at all to know if
>> I'm remembering the weakref implementation details correctly.
>>
>
> That would be even slower than the hash table lookup, and unbounded. So
> let's not go there.
>

OK.

-Brett

From guido at python.org  Mon Jun 20 16:12:08 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Jun 2016 13:12:08 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
Message-ID: <CAP7+vJKKzM0y-9utC=wHTmbpwOpbj-iV0JatED=59G4-o6psRA@mail.gmail.com>

OK, basically you're arguing that knowing the definition order of class
attributes is often useful when (ab)using Python for things like schema or
form definitions. There are a few ways to go about it:

1. A hack using a global creation counter
<https://github.com/GoogleCloudPlatform/datastore-ndb-python/blob/master/ndb/model.py#L888>
2. Metaclass with __prepare__
<https://docs.python.org/3/reference/datamodel.html#prepare>
3. PEP 520 <https://www.python.org/dev/peps/pep-0520/>
4a. Make all dicts OrderedDicts in CPython
<http://bugs.python.org/issue27350>
4b. Ditto in the language standard
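
To make options (1) and (2) concrete, here is a small sketch of both.
The names `Field`, `OrderedMeta`, `Form` and `fields_in_order` are
made up for illustration and are not taken from NDB or any other
library.

```python
import itertools
from collections import OrderedDict

# Option 1: a global creation counter stamped on each field.
_counter = itertools.count()


class Field:
    def __init__(self):
        self.creation_order = next(_counter)


def fields_in_order(cls):
    """Return Field attribute names sorted by creation order."""
    items = [(n, v) for n, v in vars(cls).items() if isinstance(v, Field)]
    return [n for n, v in sorted(items, key=lambda item: item[1].creation_order)]


# Option 2: a metaclass whose __prepare__ supplies an ordered namespace.
class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # The namespace was filled in definition order by the class body.
        cls._field_order = [n for n, v in namespace.items() if isinstance(v, Field)]
        return cls


class Form(metaclass=OrderedMeta):
    name = Field()
    email = Field()
    age = Field()
```

Option (1) only orders objects that cooperate by recording a counter,
while option (2) sees every name in the class body but requires a
metaclass.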

If we can make the jump to (4b) soon enough I think we should skip PEP 520;
if not, I am still hemming and hawing about whether PEP 520 has enough
benefits over (2) to bother.

Sorry Eric for making this so hard. The better is so often the enemy of the
good. I am currently somewhere between -0 and +0 on PEP 520. I'm not sure
if the work on (4a) is going to bear fruit in time for the 3.6 feature
freeze <https://www.python.org/dev/peps/pep-0494/#schedule>; if it goes
well I think we should have a separate conversation (maybe even a PEP?)
about (4b). Maybe we should ask for feedback from the Jython developers?
(PyPy already has this IIUC, and IronPython
<https://github.com/IronLanguages/main> seems moribund?)

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Jun 20 16:18:26 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Jun 2016 13:18:26 -0700
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
 <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>
Message-ID: <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>

On Mon, Jun 20, 2016 at 12:59 PM, Brett Cannon <brett at python.org> wrote:

>> MRAB's response made me think of a possible approach: the co_extra field
>> could be the very last field of the PyCodeObject struct and only present if
>> a certain flag is set in co_flags. This is similar to a trick used by X11
>> (I know, it's long ago :-)
>>
>
> But that doesn't resolve your memory worry, right? For a JIT you will have
> to access the memory regardless for execution count (unless Yury's patch to
> add caching goes in, in which case it will be provided by code objects
> already).
>

If you make the code object constructor another function pointer in the
interpreter struct, you could solve this quite well IMO. An interpreter
with a JIT installed would always create code objects with the co_extra
field. But interpreters without a JIT would have code objects without
it. This would mean the people who aren't using a JIT at all don't pay for
co_extra. The flag would still be needed so the JIT can tell when you pass
it a code object that was created before the JIT was installed (or
belonging to a different interpreter).

Would that work? Or is it important to be able to import a lot of code and
then later import+install the JIT and have it benefit the code you already
imported?
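
To make the idea concrete, here is a toy model in Python rather than C. It
is only a sketch under assumptions: CO_HAS_EXTRA, Code, CodeWithExtra,
Interpreter and install_jit are hypothetical names, and a real version
would live in the C structs.

```python
CO_HAS_EXTRA = 0x1  # hypothetical flag bit, not a real co_flags value


class Code:
    """Stand-in for a code object without the extra slot."""
    __slots__ = ("name", "flags")

    def __init__(self, name):
        self.name = name
        self.flags = 0


class CodeWithExtra(Code):
    """Stand-in for a code object allocated with a trailing co_extra."""
    __slots__ = ("extra",)

    def __init__(self, name):
        super().__init__(name)
        self.flags |= CO_HAS_EXTRA
        self.extra = None


class Interpreter:
    """Stand-in for the interpreter struct holding the constructor hook."""

    def __init__(self):
        self.new_code = Code  # default: nobody pays for the extra slot


def install_jit(interp):
    # Installing the JIT swaps the constructor used from now on.
    interp.new_code = CodeWithExtra


def jit_can_use(code):
    # The flag tells the JIT whether the slot exists at all.
    return bool(code.flags & CO_HAS_EXTRA)


interp = Interpreter()
early = interp.new_code("imported_before_jit")
install_jit(interp)
late = interp.new_code("imported_after_jit")
print(jit_can_use(early), jit_can_use(late))
```

The object created before install_jit() has no extra slot and no flag,
which is the case the flag check is meant to catch.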

-- 
--Guido van Rossum (python.org/~guido)

From christian at python.org  Mon Jun 20 16:41:40 2016
From: christian at python.org (Christian Heimes)
Date: Mon, 20 Jun 2016 22:41:40 +0200
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
 <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>
 <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>
Message-ID: <nk9ke5$r9f$1@ger.gmane.org>

On 2016-06-20 22:18, Guido van Rossum wrote:
> On Mon, Jun 20, 2016 at 12:59 PM, Brett Cannon <brett at python.org> wrote:
> 
>         MRAB's response made me think of a possible approach: the
>         co_extra field could be the very last field of the PyCodeObject
>         struct and only present if a certain flag is set in co_flags.
>         This is similar to a trick used by X11 (I know, it's long ago :-)
> 
> 
>     But that doesn't resolve your memory worry, right? For a JIT you
>     will have to access the memory regardless for execution count
>     (unless Yury's patch to add caching goes in, in which case it will
>     be provided by code objects already).
> 
> 
> If you make the code object constructor another function pointer in the
> interpreter struct, you could solve this quite well IMO. An interpreter
> with a JIT installed would always create code objects with the co_extra
> field. But interpreters without a JIT would have code objects
> without it. This would mean the people who aren't using a JIT at all
> don't pay for co_extra. The flag would still be needed so the JIT can
> tell when you pass it a code object that was created before the JIT was
> installed (or belonging to a different interpreter).
> 
> Would that work? Or is it important to be able to import a lot of code
> and then later import+install the JIT and have it benefit the code you
> already imported?

Ha, I had the same idea. co_flags has some free bits. You could store
extra flags there.

Is the PyCodeObject's layout part of Python's stable ABI? I'm asking
because the PyCodeObject struct has a suboptimal layout. It's wasting a
couple of bytes because it mixes int and ptr. If we move the int
co_firstlineno member below the co_flags member, then the struct size
shrinks by 64 bits on a 64-bit system -- the exact same size as a
PyObject *co_extras member.
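
The padding effect can be checked from Python with ctypes. This is a
sketch under assumptions: the two structs below are cut-down, hypothetical
layouts that only borrow a few field names, not the real PyCodeObject.

```python
import ctypes


class Interleaved(ctypes.Structure):
    # int, pointer, int, pointer: each int is followed by padding on
    # platforms where pointers are 8 bytes and 8-byte aligned.
    _fields_ = [
        ("flags", ctypes.c_int),
        ("code", ctypes.c_void_p),
        ("firstlineno", ctypes.c_int),
        ("lnotab", ctypes.c_void_p),
    ]


class Grouped(ctypes.Structure):
    # Same members, with the two ints placed next to each other.
    _fields_ = [
        ("flags", ctypes.c_int),
        ("firstlineno", ctypes.c_int),
        ("code", ctypes.c_void_p),
        ("lnotab", ctypes.c_void_p),
    ]


saved = ctypes.sizeof(Interleaved) - ctypes.sizeof(Grouped)
print(ctypes.sizeof(Interleaved), ctypes.sizeof(Grouped), saved)
```

On a typical 64-bit build with 4-byte ints the saving should come out as 8
bytes, i.e. room for one more pointer; on 32-bit builds there is usually
nothing to gain.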

Christian

From brett at python.org  Mon Jun 20 16:50:56 2016
From: brett at python.org (Brett Cannon)
Date: Mon, 20 Jun 2016 20:50:56 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <nk9ke5$r9f$1@ger.gmane.org>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
 <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>
 <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>
 <nk9ke5$r9f$1@ger.gmane.org>
Message-ID: <CAP1=2W42AcmW-bitRAe8OyjcwMZDkhEwcUHAayUPKn8K+8jqjg@mail.gmail.com>

On Mon, 20 Jun 2016 at 13:43 Christian Heimes <christian at python.org> wrote:

> On 2016-06-20 22:18, Guido van Rossum wrote:
> > On Mon, Jun 20, 2016 at 12:59 PM, Brett Cannon <brett at python.org> wrote:
> >
> >         MRAB's response made me think of a possible approach: the
> >         co_extra field could be the very last field of the PyCodeObject
> >         struct and only present if a certain flag is set in co_flags.
> >         This is similar to a trick used by X11 (I know, it's long ago :-)
> >
> >
> >     But that doesn't resolve your memory worry, right? For a JIT you
> >     will have to access the memory regardless for execution count
> >     (unless Yury's patch to add caching goes in, in which case it will
> >     be provided by code objects already).
> >
> >
> > If you make the code object constructor another function pointer in the
> > interpreter struct, you could solve this quite well IMO. An interpreter
> > with a JIT installed would always create code objects with the co_extra
> > field. But interpreters without a JIT would have code objects
> > without it. This would mean the people who aren't using a JIT at all
> > don't pay for co_extra. The flag would still be needed so the JIT can
> > tell when you pass it a code object that was created before the JIT was
> > installed (or belonging to a different interpreter).
> >
> > Would that work? Or is it important to be able to import a lot of code
> > and then later import+install the JIT and have it benefit the code you
> > already imported?
>
> Ha, I had the same idea. co_flags has some free bits. You could store
> extra flags there.
>
> Is the PyCodeObject's layout part of Python's stable ABI?

No: https://docs.python.org/3/c-api/code.html#c.PyCodeObject

> I'm asking
> because the PyCodeObject struct has a suboptimal layout. It's wasting a
> couple of bytes because it mixes int and ptr. If we move the int
> co_firstlineno member below the co_flags member, then the struct size
> shrinks by 64 bits on 64bit system -- the exact same size a PyObject
> *co_extras member.
>

:) We should probably do that reordering regardless of the result of this
PEP.

From dinov at microsoft.com  Mon Jun 20 16:20:22 2016
From: dinov at microsoft.com (Dino Viehland)
Date: Mon, 20 Jun 2016 20:20:22 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <57676A8C.8070207@hotpy.org>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <57676A8C.8070207@hotpy.org>
Message-ID: <BN3PR03MB21952B4CB10DD940C8F48FC2BB2A0@BN3PR03MB2195.namprd03.prod.outlook.com>

Mark wrote:
> > Dino and I thought of two potential alternatives, neither of which we
> > have taken the time to implement and benchmark. One is to simply have
> > a hash table of memory addresses to JIT data that is kept on the JIT
> > side of things. Obviously it would be nice to avoid the overhead of a
> > hash table lookup on every function call. This also doesn't help
> > minimize memory when the code object gets GC'ed.
> 
> Hash lookups aren't that slow. If you combine it with the custom flags
> suggested by MRAB, then you would only suffer the lookup penalty when
> actually entering the special interpreter.
> You can use a weakref callback to ensure things get GC'd properly.
> 
> Also, if there is a special extra field on code-object, then everyone will want
> to use it. How do you handle clashes?

This is exactly what I've started prototyping and have mostly coded up; I've
just been getting randomized and haven't gotten back to it yet.

It may have some impact in the short term, but my expectation is that as the
JIT gets better this will become less and less important.  Currently we're
just JITing one method at a time and have no inlining support.  But once we
can start putting guards in place and inlining across multiple function calls
we will start reducing the transitions from JIT -> Function Call -> JIT and get
rid of those hash table lookups entirely.  And if we can't succeed at inlining then
I suspect the JIT won't end up offering the performance we'd hope for.

From sandranel at comcast.net  Mon Jun 20 17:55:38 2016
From: sandranel at comcast.net (Sandranel)
Date: Mon, 20 Jun 2016 17:55:38 -0400
Subject: [Python-Dev] Problem
Message-ID: <001701d1cb3e$7badbe90$73093bb0$@comcast.net>

Hi:

My daughter and I are trying to update to 8.1.2, but every time we try, this
happens:

From the API

To the Python window:

Please advise

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.jpg
Type: image/jpeg
Size: 35588 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160620/3ab48538/attachment-0002.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.jpg
Type: image/jpeg
Size: 20791 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160620/3ab48538/attachment-0003.jpg>

From chris.jerdonek at gmail.com  Mon Jun 20 18:02:30 2016
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Mon, 20 Jun 2016 15:02:30 -0700
Subject: [Python-Dev] New security-sig mailling list
In-Reply-To: <576842E6.2030805@stoneleaf.us>
References: <576842E6.2030805@stoneleaf.us>
Message-ID: <CAOTb1wfo4bx8SJJx_4-bJdYNp4_iLWzkQoDq=9Q8B0WhJ5dhKQ@mail.gmail.com>

On Mon, Jun 20, 2016 at 12:24 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>
> has been created:
>
>   https://mail.python.org/mailman/listinfo/security-sig
>
> The purpose of this list is to discuss security-related enhancements to Python while having as little impact on backwards compatibility as possible.

I would recommend clarifying the relationship of the SIG to the Python
Security Response Team ( https://www.python.org/news/security ), or at
least clarifying that the SIG is different from the PSRT (and that
security reports should not be sent to the SIG).

--Chris

>
> Once a proposal is ready it will be presented to Python Dev.
>
> (This text is subject to change. ;)
>
> --
> ~Ethan~

From phd at phdru.name  Mon Jun 20 18:13:14 2016
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 21 Jun 2016 00:13:14 +0200
Subject: [Python-Dev] Problem
In-Reply-To: <001701d1cb3e$7badbe90$73093bb0$@comcast.net>
References: <001701d1cb3e$7badbe90$73093bb0$@comcast.net>
Message-ID: <20160620221314.GA31892@phdru.name>

Hello.

   We are sorry but we cannot help you. This mailing list is to work on
developing Python (adding new features to Python itself and fixing bugs);
if you're having problems learning, understanding or using Python, please
find another forum. Probably python-list/comp.lang.python mailing list/news
group is the best place; there are Python developers who participate in it;
you may get a faster, and probably more complete, answer there. See
http://www.python.org/community/ for other lists/news groups/fora. Thank
you for understanding.

On Mon, Jun 20, 2016 at 05:55:38PM -0400, Sandranel <sandranel at comcast.net> wrote:
> Hi:
> 
> My daughter and  I are trying to update to 8.1.2,but every time we  try this
> happens

   As for your question: the command "python -m pip install" must be run
from the OS command line, not from within Python itself.

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From guido at python.org  Mon Jun 20 18:13:15 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Jun 2016 15:13:15 -0700
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <BN3PR03MB21952B4CB10DD940C8F48FC2BB2A0@BN3PR03MB2195.namprd03.prod.outlook.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <57676A8C.8070207@hotpy.org>
 <BN3PR03MB21952B4CB10DD940C8F48FC2BB2A0@BN3PR03MB2195.namprd03.prod.outlook.com>
Message-ID: <CAP7+vJJCJyPZCN2eDz_iDXuju2FP7iXXYTtZzMfeqOFWY8N+Cw@mail.gmail.com>

Couple uses of "it" here are ambiguous -- are you saying we don't need
co_extra after all, or that we can safely insist it's a dict, or...?

On Mon, Jun 20, 2016 at 1:20 PM, Dino Viehland via Python-Dev <
python-dev at python.org> wrote:

> Mark wrote:
> > > Dino and I thought of two potential alternatives, neither of which we
> > > have taken the time to implement and benchmark. One is to simply have
> > > a hash table of memory addresses to JIT data that is kept on the JIT
> > > side of things. Obviously it would be nice to avoid the overhead of a
> > > hash table lookup on every function call. This also doesn't help
> > > minimize memory when the code object gets GC'ed.
> >
> > Hash lookups aren't that slow. If you combine it with the custom flags
> > suggested by MRAB, then you would only suffer the lookup penalty when
> > actually entering the special interpreter.
> > You can use a weakref callback to ensure things get GC'd properly.
> >
> > Also, if there is a special extra field on code-object, then everyone
> > will want to use it. How do you handle clashes?
>
> This is exactly what I've started prototyping and have mostly coded up,
> I've just been getting randomized and haven't gotten back to it yet.
>
> It may have some impact in the short-term but my expectation is that as
> the JIT gets better that this will become less and less important.
> Currently we're just JITing one method at a time and have no inlining
> support.  But once we can start putting guards in place and inlining
> across multiple function calls we will start reducing the transitions
> from JIT -> Function Call -> JIT and get rid of those hash table lookups
> entirely.  And if we can't succeed at inlining then I suspect the JIT
> won't end up offering the performance we'd hope for.
>

-- 
--Guido van Rossum (python.org/~guido)

From jcgoble3 at gmail.com  Mon Jun 20 18:13:23 2016
From: jcgoble3 at gmail.com (Jonathan Goble)
Date: Mon, 20 Jun 2016 18:13:23 -0400
Subject: [Python-Dev] Problem
In-Reply-To: <001701d1cb3e$7badbe90$73093bb0$@comcast.net>
References: <001701d1cb3e$7badbe90$73093bb0$@comcast.net>
Message-ID: <CAK256p3L1=ZxJC2qxq3kBQu70QEpiMuqrhC=CHvNy3tHcPP6-A@mail.gmail.com>

General questions like this belong on python-list, not python-dev.

To answer your question, though, you need to run that command from the
Windows Command Prompt, not from the Python interpreter.

From dinov at microsoft.com  Mon Jun 20 16:32:54 2016
From: dinov at microsoft.com (Dino Viehland)
Date: Mon, 20 Jun 2016 20:32:54 +0000
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
 <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>
 <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>
Message-ID: <BN3PR03MB21955CE6BB2B4164A7A15C6FBB2A0@BN3PR03MB2195.namprd03.prod.outlook.com>

> On Mon, Jun 20, 2016 at 12:59 PM, Brett Cannon <brett at python.org> wrote:
> >> MRAB's response made me think of a possible approach: the co_extra
> >> field could be the very last field of the PyCodeObject struct and only
> >> present if a certain flag is set in co_flags. This is similar to a
> >> trick used by X11 (I know, it's long ago :-)
> >
> > But that doesn't resolve your memory worry, right? For a JIT you will
> > have to access the memory regardless for execution count (unless Yury's
> > patch to add caching goes in, in which case it will be provided by code
> > objects already).
>
> If you make the code object constructor another function pointer in the
> interpreter struct, you could solve this quite well IMO. An interpreter
> with a JIT installed would always create code objects with the co_extra
> field. But interpreters without a JIT would have code objects without
> it. This would mean the people who aren't using a JIT at all don't pay
> for co_extra. The flag would still be needed so the JIT can tell when you
> pass it a code object that was created before the JIT was installed (or
> belonging to a different interpreter).
>
> Would that work? Or is it important to be able to import a lot of code
> and then later import+install the JIT and have it benefit the code you
> already imported?

That's a pretty interesting idea.  We actually load the JIT DLL before we
execute any Python code so currently it wouldn't have any issues with not
having the full-sized code objects created.  But it could also let JITs
store all of the info they need right there instead of having to create yet
another place to track code data.  And it fits in nicely with having the
extra data being truly ephemeral that no one else should care about.  It
doesn't help with the issue of potentially multiple consumers of that field
that has been brought up before but I'm not sure how concerned we should be
about that scenario anyway.

I still want to check and see what the hash table overhead looks like but
if that does end up looking bad I can definitely look at giving this a shot.


From ethan at stoneleaf.us  Mon Jun 20 18:39:04 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 20 Jun 2016 15:39:04 -0700
Subject: [Python-Dev] New security-sig mailling list
In-Reply-To: <CAOTb1wfo4bx8SJJx_4-bJdYNp4_iLWzkQoDq=9Q8B0WhJ5dhKQ@mail.gmail.com>
References: <576842E6.2030805@stoneleaf.us>
 <CAOTb1wfo4bx8SJJx_4-bJdYNp4_iLWzkQoDq=9Q8B0WhJ5dhKQ@mail.gmail.com>
Message-ID: <57687088.5040801@stoneleaf.us>

On 06/20/2016 03:02 PM, Chris Jerdonek wrote:

> I would recommend clarifying the relationship of the SIG to the Python
> Security Response Team ( https://www.python.org/news/security ), or at
> least clarifying that the SIG is different from the PSRT (and that
> security reports should not be sent to the SIG).

Attempted to do so.  Let me know if it can be clearer still.

--
~Ethan~

From timothy.c.delaney at gmail.com  Mon Jun 20 20:41:23 2016
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Tue, 21 Jun 2016 10:41:23 +1000
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAP7+vJKKzM0y-9utC=wHTmbpwOpbj-iV0JatED=59G4-o6psRA@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CAP7+vJKKzM0y-9utC=wHTmbpwOpbj-iV0JatED=59G4-o6psRA@mail.gmail.com>
Message-ID: <CAN8CLgmnq0n0MnQ6P59v=dx0zpByTfeURQCwSBcL6toNB1UqKA@mail.gmail.com>

On 21 June 2016 at 06:12, Guido van Rossum <guido at python.org> wrote:

> OK, basically you're arguing that knowing the definition order of class
> attributes is often useful when (ab)using Python for things like schema or
> form definitions. There are a few ways to go about it:
>
> 1. A hack using a global creation counter
> <https://github.com/GoogleCloudPlatform/datastore-ndb-python/blob/master/ndb/model.py#L888>
> 2. Metaclass with __prepare__
> <https://docs.python.org/3/reference/datamodel.html#prepare>
> 3. PEP 520 <https://www.python.org/dev/peps/pep-0520/>
> 4a. Make all dicts OrderedDicts in CPython
> <http://bugs.python.org/issue27350>
> 4b. Ditto in the language standard
>
> If we can make the jump to (4b) soon enough I think we should skip PEP
> 520; if not, I am still hemming and hawing about whether PEP 520 has enough
> benefits over (2) to bother.
>
> Sorry Eric for making this so hard. The better is so often the enemy of
> the good. I am currently somewhere between -0 and +0 on PEP 520. I'm not
> sure if the work on (4a) is going to bear fruit in time for the 3.6
> feature freeze <https://www.python.org/dev/peps/pep-0494/#schedule>; if
> it goes well I think we should have a separate conversation (maybe even a
> PEP?) about (4b). Maybe we should ask for feedback from the Jython
> developers? (PyPy already has this IIUC, and IronPython
> <https://github.com/IronLanguages/main> seems moribund?)
>

Although not a Jython developer, I've looked into the code a few times.

The major stumbling block as I understand it will be that Jython uses a
ConcurrentHashMap as the underlying structure for a dictionary. This would
need to change to a concurrent LinkedHashMap, but there's no such thing in
the standard library. The best option would appear to be
https://github.com/ben-manes/concurrentlinkedhashmap.

There are also plenty of other places that use maps and all of them would
need to be looked at. In a lot of cases they're things like IdentityHashMap
which may also need an ordered equivalent.

There is a repo for Jython 3.5 development:
https://github.com/jython/jython3 but it doesn't seem to be very active -
only 11 commits in the last year (OTOH that's also in the last 3 months).

Tim Delaney

From ericsnowcurrently at gmail.com  Mon Jun 20 21:30:20 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 20 Jun 2016 19:30:20 -0600
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
Message-ID: <CALFfu7BwHs3ywW8WNkJrBjLJmY+aMF0v+Tjf9Q3rw8EPH2dsdw@mail.gmail.com>

On Fri, Jun 17, 2016 at 7:32 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The discussion in the PEP 487 thread made me realise that I'd like to
> see a discussion in PEP 520 regarding whether or not to define
> __definition_order__ for builtin types initialised via PyType_Ready or
> created via PyType_FromSpec in addition to defining it for types
> created via the class statement or types.new_class().
>
> For static types, PyType_Ready could potentially set it based on
> tp_members, tp_methods & tp_getset (see
> https://docs.python.org/3/c-api/typeobj.html )
> Similarly, PyType_FromSpec could potentially set it based on the
> contents of Py_tp_members, Py_tp_methods and Py_tp_getset slot
> definitions
>
> Having definition order support in both types.new_class() and builtin
> types would also make it clear why we can't rely purely on the
> compiler to provide the necessary ordering information - in both of
> those cases, the Python compiler isn't directly involved in the type
> creation process.

I'll mention this in the PEP, but I'd rather not make it a part of the proposal.
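
For context, the types.new_class() case looks roughly like this. It is a
minimal sketch: fill_namespace and Point are made-up names, and reading the
order back from the class dict assumes dicts that preserve insertion order
(guaranteed by the language only from Python 3.7).

```python
import types


def fill_namespace(ns):
    # Plays the role of the class body; no class statement is compiled.
    ns["x"] = 0
    ns["y"] = 0
    ns["norm"] = lambda self: (self.x ** 2 + self.y ** 2) ** 0.5


Point = types.new_class("Point", (), {}, fill_namespace)

# Insertion order of the namespace is all that is available here.
order = [name for name in vars(Point) if not name.startswith("__")]
print(order)
```

Since the callback fills the namespace at runtime, any ordering information
has to come from the namespace itself rather than from the compiler.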

-eric

From ericsnowcurrently at gmail.com  Mon Jun 20 21:37:44 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 20 Jun 2016 19:37:44 -0600
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
Message-ID: <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>

On Mon, Jun 20, 2016 at 9:49 AM, Guido van Rossum <guido at python.org> wrote:
> I agree it's better to define the order as computed at runtime. I don't
> think there's much of a point to mandate that all builtin/extension types
> reveal their order too -- I doubt there will be many uses for that -- but I
> don't want to disallow it either. But we can allow types to define this, as
> long as it's in their documentation (so users can rely on it in those
> cases).

Agreed.

>
> As another point of review, I don't like the exception for dunder names. I
> can see that __module__, __name__ etc. are distractions, but since you're
> adding methods, you should also add methods with dunder names.

I still think that in practice the dunder names will be clutter that
folks have to ignore.  However, it's a relatively weak point given
that it's easy to ignore dunder names.  So I don't mind including
them.

>
> The overlap with PEP 487 makes me think that this feature is clearly
> desirable (I like the name you give it in PEP 520 better, and PEP 487 is too
> vague about its definition).

Agreed.

>
> Finally, it seems someone is working on making all dicts ordered. Does that
> mean this will soon be obsolete?

Nope.  Having an ordered definition namespace by default does not give
us __definition_order__ for free.  Furthermore, the compact dict under
consideration isn't strictly order-preserving (re-orders for
deletion).

-eric

From ericsnowcurrently at gmail.com  Mon Jun 20 22:11:09 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 20 Jun 2016 20:11:09 -0600
Subject: [Python-Dev] Review of PEP 520: Ordered Class Definition
 Namespace
In-Reply-To: <CAP7+vJK5SVeng92QKvY4PpKR3OX=YLqjj2tgGFjbvmavnpLxxA@mail.gmail.com>
References: <CAP7+vJK5SVeng92QKvY4PpKR3OX=YLqjj2tgGFjbvmavnpLxxA@mail.gmail.com>
Message-ID: <CALFfu7Bqtbzn1T2mgWh-aY1KPw2iL_qgT1CYYCs7hGrJbwYGiw@mail.gmail.com>

On Mon, Jun 20, 2016 at 10:32 AM, Guido van Rossum <guido at python.org> wrote:
> - I don't like the exception for dunder names. I can see that __module__,
>  __name__ etc. that occur in every class are distractions, but since you're
> adding methods, you should also add methods with dunder names like
> __init__ or __getattr__. (Otherwise, what if someone really wanted to create
> a Django form with a field named __dunder__?)

Note that in that case they could set __definition_order__ manually in
their class body.  That said, I don't mind relaxing this if you think
the common-case clutter is worth it for the case where a dunder name
is relevant.  You have a keen sense for this sort of situation. :)

> - It's a shame we can't just make __dict__ (a proxy to) an OrderedDict, then
> we wouldn't need an extra attribute. Hm, maybe we could customize the proxy
> class so its keys(), values(), items() views return things in the order of
> __definition_order__?

I'm not sure it's worth it to mess with the proxy like that.  Plus, I
like how __definition_order__ makes it obvious what it is as well as
more discoverable.

> (In the tracker discussion this was considered a big
> deal, but given that a class __dict__ is already a readonly proxy I'm not
> sure I agree. Or is this about C level access? How much of that would
> break?)

I actually tried making the underlying class namespace (behind the
proxy at __dict__) an OrderedDict.  I ended up with a number of
problems because of the pervasive use of the concrete dict API
relative to the class dict.  That API does not play well with
subclasses.
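
A Python-level analogy of that problem, as a sketch only (RecordingDict is
a hypothetical subclass, and dict.__setitem__ stands in for C code calling
the concrete API directly):

```python
class RecordingDict(dict):
    """Hypothetical dict subclass that tracks the order of assignments."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        if key not in self:
            self.order.append(key)
        super().__setitem__(key, value)


ns = RecordingDict()
ns["a"] = 1                    # goes through the override
dict.__setitem__(ns, "b", 2)   # bypasses it, much like PyDict_SetItem() in C
print(sorted(ns), ns.order)
```

The key "b" ends up stored but never recorded, which is the kind of
inconsistency a subclass cannot defend against.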

>
> - I don't see why it needs to be a read-only attribute. There are very few
> of those -- in general we let users play around with things unless we have a
> hard reason to restrict assignment (e.g. the interpreter's internal state
> could be compromised). I don't see such a hard reason here.

I'm willing to change that.  I figured we would start off treating it
like we have other dunder attributes, some of which have become
writable while others remain read-only.  However, you are right that
there is no danger in making it writable.

>
> - All in all the motivation is fairly weak -- it seems to be mostly
> motivated on avoiding a custom metaclass for this purpose because combining
> metaclasses is a pain. I realize it's only a small patch in a small corner
> of the language, but I do worry about repercussions -- it's an API that's
> going to be used for new (and useful) purposes so we will never be able to
> get rid of it.

True.  It's certainly a very specific feature.  The point is that we
currently throw away the attribute order from class definitions.  You
can opt in to preserving the order using an appropriate metaclass.
However, everything that would make use of that information (e.g.
class decorators) would then have a prerequisite of that metaclass.
That means such a tool could only consume classes that were designed
to be used by the tool.  Then there's the whole problem of metaclass
conflicts (see PEP 487).

If, instead, we always preserved the definition order then these
problems (again, for an admittedly corner use case) go away.
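
For illustration, the opt-in approach described above could look
roughly like the sketch below.  The names OrderedMeta, _order and
field_names are made up for this example; they are not taken from the
PEP or from the patch.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the class-body definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes in this mapping, so its insertion
        # order is the definition order.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Keep only the non-dunder names, in the order they were defined.
        cls._order = tuple(
            k for k in namespace
            if not (k.startswith('__') and k.endswith('__'))
        )
        return cls


def field_names(cls):
    """Hypothetical consumer: it only works for classes that opted in."""
    try:
        return cls._order
    except AttributeError:
        raise TypeError('%s was not built with OrderedMeta'
                        % cls.__name__) from None


class Spam(metaclass=OrderedMeta):
    ham = None
    eggs = 5


class Plain:
    ham = None
```

field_names(Spam) works, but field_names(Plain) fails with TypeError
because Plain never opted in.  That is the prerequisite problem in a
nutshell.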

FWIW, regarding repercussions, I do not expect any other potential
future feature will subsume the functionality of PEP 520.  The closest
thing would be if cls.__dict__ became ordered.  However, that would
intersect with __definition_order__ only at first.  Furthermore,
cls.__dict__ would only ever be able to make vague promises about any
relationship with the definition order.  The point of
__definition_order__ is to provide the one obvious place to get a
specific bit of information about a class.

>
> Note: I'm neither accepting nor rejecting the PEP; I'm merely inviting more
> contemplation.

Thanks. :)

-eric

From songofacandy at gmail.com  Mon Jun 20 22:14:39 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 21 Jun 2016 11:14:39 +0900
Subject: [Python-Dev] PEP 520: Ordered Class Definition Namespace
In-Reply-To: <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
 <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
Message-ID: <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>

>>
>> Finally, it seems someone is working on making all dicts ordered. Does that
>> mean this will soon be obsolete?
>
> Nope.  Having an ordered definition namespace by default does not give
> us __definition_order__ for free.  Furthermore, the compact dict under
> consideration isn't strictly order-preserving (re-orders for
> deletion).
>

The compact ordered dict I proposed preserves insertion order even
when some items are deleted.
http://bugs.python.org/issue27350
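
To make the idea concrete, here is a toy pure-Python model of a compact
dict that keeps insertion order across deletions.  It is only a sketch
and not the C patch from the issue: the class name CompactDict, the
linear probing and the resize policy are all simplifying assumptions.

```python
FREE, DUMMY = -1, -2


class CompactDict:
    """Toy model of a compact ordered dict (not the actual C patch)."""

    def __init__(self, size=8):            # size must be a power of two
        self.indices = [FREE] * size       # sparse hash table of entry positions
        self.entries = []                  # dense, append-only: [hash, key, value]

    def _probe(self, key, hashvalue):
        """Return (slot, entry_index); entry_index is FREE if key is absent."""
        mask = len(self.indices) - 1
        i = hashvalue & mask
        first_dummy = None
        while True:
            ix = self.indices[i]
            if ix == FREE:
                return (i if first_dummy is None else first_dummy), FREE
            if ix == DUMMY:
                if first_dummy is None:
                    first_dummy = i        # remember a reusable slot
            else:
                h, k, _ = self.entries[ix]
                if h == hashvalue and k == key:
                    return i, ix
            i = (i + 1) & mask             # linear probing keeps the sketch short

    def __setitem__(self, key, value):
        hashvalue = hash(key)
        slot, ix = self._probe(key, hashvalue)
        if ix != FREE:
            self.entries[ix][2] = value    # an update keeps the key's position
            return
        if (len(self.entries) + 1) * 3 > len(self.indices) * 2:
            self._rebuild()
            slot, _ = self._probe(key, hashvalue)
        self.indices[slot] = len(self.entries)
        self.entries.append([hashvalue, key, value])

    def __getitem__(self, key):
        _, ix = self._probe(key, hash(key))
        if ix == FREE:
            raise KeyError(key)
        return self.entries[ix][2]

    def __delitem__(self, key):
        slot, ix = self._probe(key, hash(key))
        if ix == FREE:
            raise KeyError(key)
        self.indices[slot] = DUMMY
        self.entries[ix] = None            # tombstone: other entries do not move

    def _rebuild(self):
        """Drop tombstones and grow the index table if it is still too full."""
        live = [e for e in self.entries if e is not None]
        size = len(self.indices)
        while (len(live) + 1) * 3 > size * 2:
            size *= 2
        self.indices = [FREE] * size
        self.entries = []
        for h, k, v in live:
            slot, _ = self._probe(k, h)
            self.indices[slot] = len(self.entries)
            self.entries.append([h, k, v])

    def keys(self):
        return [e[1] for e in self.entries if e is not None]
```

Deleting a key only leaves a tombstone in the dense entries list, so
the remaining keys keep their relative order.  A key that is deleted
and inserted again is appended at the end.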

Should I post a PEP for the compact dict?  Here is my draft, but I haven't
posted it yet since my English is much worse than my C.
https://www.dropbox.com/s/s85n9b2309k03cq/pep-compact-dict.txt?dl=0

From raymond.hettinger at gmail.com  Mon Jun 20 22:17:00 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 20 Jun 2016 19:17:00 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAN8CLgmnq0n0MnQ6P59v=dx0zpByTfeURQCwSBcL6toNB1UqKA@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CAP7+vJKKzM0y-9utC=wHTmbpwOpbj-iV0JatED=59G4-o6psRA@mail.gmail.com>
 <CAN8CLgmnq0n0MnQ6P59v=dx0zpByTfeURQCwSBcL6toNB1UqKA@mail.gmail.com>
Message-ID: <38A44E02-93EA-44BC-A68E-0A6F490136F9@gmail.com>

> On Jun 20, 2016, at 5:41 PM, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> 
> Although not a Jython developer, I've looked into the code a few times.
> 
> The major stumbling block as I understand it will be that Jython uses a ConcurrentHashMap as the underlying structure for a dictionary. This would need to change to a concurrent LinkedHashMap, but there's no such thing in the standard library. The best option would appear to be https://github.com/ben-manes/concurrentlinkedhashmap.
> 
> There are also plenty of other places that use maps and all of them would need to be looked at. In a lot of cases they're things like IdentityHashMap which may also need an ordered equivalent.

If you can, check with Jim Baker.  At the language summit a few years ago, he and I sketched out a solution that he thought was doable without much effort and without much of a performance hit.   IIRC, it involved using a ConcurrentHashMap augmented by an auxiliary 2-by-n-row array of indices (one for forward links and the other for backward links).  There was also a need to add a reentrant lock around the mutating methods.
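
A rough Python model of that kind of design is sketched below.  It is a guess at the general shape only, not Jython code: the class name LinkedIndexMap is made up, a plain dict stands in for the ConcurrentHashMap, and rows freed by deletion are never reused.

```python
import threading


class LinkedIndexMap:
    """Toy ordered map: a hash map plus forward/backward link arrays."""

    def __init__(self):
        self._slot = {}             # key -> row number (stands in for the hash map)
        self._keys = [None]         # row 0 is a sentinel
        self._values = [None]
        self._next = [0]            # forward links
        self._prev = [0]            # backward links
        self._lock = threading.RLock()

    def __setitem__(self, key, value):
        with self._lock:
            row = self._slot.get(key)
            if row is not None:
                self._values[row] = value   # an existing key keeps its position
                return
            row = len(self._keys)
            last = self._prev[0]
            self._keys.append(key)
            self._values.append(value)
            self._next.append(0)            # the new row links forward to the sentinel
            self._prev.append(last)
            self._next[last] = row
            self._prev[0] = row
            self._slot[key] = row

    def __delitem__(self, key):
        with self._lock:
            row = self._slot.pop(key)       # raises KeyError if the key is absent
            before, after = self._prev[row], self._next[row]
            self._next[before] = after      # unlink the row in constant time
            self._prev[after] = before
            self._keys[row] = self._values[row] = None

    def __getitem__(self, key):
        return self._values[self._slot[key]]

    def keys(self):
        with self._lock:
            result, row = [], self._next[0]
            while row != 0:
                result.append(self._keys[row])
                row = self._next[row]
            return result
```

Lookups go straight through the hash map; only the mutating methods and iteration take the reentrant lock.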

Raymond Hettinger

From ericsnowcurrently at gmail.com  Mon Jun 20 22:17:04 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 20 Jun 2016 20:17:04 -0600
Subject: [Python-Dev] PEP 520: Preserving Class Attribute Definition Order
 (round 4)
Message-ID: <CALFfu7CJfxns1cMdH6HuLN7P+9Xg=Epgr071wbHJZorPVQBzJA@mail.gmail.com>

I've updated PEP 520 to reflect a clearer focus on the definition
order and less emphasis on OrderedDict.

-eric

=======================================

PEP: 520
Title: Preserving Class Attribute Definition Order
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 7-Jun-2016
Python-Version: 3.6
Post-History: 7-Jun-2016, 11-Jun-2016, 20-Jun-2016

Abstract
========

When a class is defined using a ``class`` statement, the class body is
executed within a namespace.  After the execution completes, that
namespace is copied into a new ``dict`` and the original definition
namespace is discarded.  The new copy is stored away as the class's
namespace and is exposed as ``__dict__`` through a read-only proxy.

This PEP preserves the order in which the attributes in the definition
namespace were added to it, before that namespace is discarded.  This
means it reflects the definition order of the class body.  That order
will now be preserved in the ``__definition_order__`` attribute of the
class.  This allows introspection of the original definition order,
e.g. by class decorators.

Additionally, this PEP changes the default class definition namespace
to ``OrderedDict``.  The long-lived class namespace (``__dict__``) will
remain a ``dict``.

Motivation
==========

Currently Python does not preserve the order in which attributes are
added to the class definition namespace. The namespace used during
execution of a class body defaults to ``dict``.  If the metaclass
defines ``__prepare__()`` then the result of calling it is used.  Thus,
before this PEP, to access your class definition namespace you must
use ``OrderedDict`` along with a metaclass. Then you must preserve the
definition order (from the ``OrderedDict``) yourself.  This has a
couple of problems.

First, it requires the use of a metaclass.  Metaclasses introduce an
extra level of complexity to code and in some cases (e.g. conflicts)
are a problem.  So reducing the need for them is worth doing when the
opportunity presents itself.  PEP 422 and PEP 487 discuss this at
length.  Given that we now have a C implementation of ``OrderedDict``
and that ``OrderedDict`` is the common use case for ``__prepare__()``,
we have such an opportunity by defaulting to ``OrderedDict``.

Second, only classes that opt in to using the ``OrderedDict``-based
metaclass will have access to the definition order. This is problematic
for cases where universal access to the definition order is important.
One of the original motivating use cases for this PEP is generic class
decorators that make use of the definition order.

Specification
=============

Part 1:

* the order in which class attributes are defined is preserved in the
  new ``__definition_order__`` attribute on each class
* "dunder" attributes (e.g. ``__init__``, ``__module__``) are ignored
* ``__definition_order__`` is a ``tuple`` (or ``None``)
* ``__definition_order__`` is a read-only attribute
* ``__definition_order__`` is always set:

  1. if ``__definition_order__`` is defined in the class body then it
     must be a ``tuple`` of identifiers or ``None``; any other value
     will result in ``TypeError``
  2. classes that do not have a class definition (e.g. builtins) have
     their ``__definition_order__`` set to ``None``
  3. classes for which ``__prepare__()`` returned something other than
     ``OrderedDict`` (or a subclass) have their ``__definition_order__``
     set to ``None`` (except where #1 applies)

Part 2:

* the default class *definition* namespace is now ``OrderedDict``

The following code demonstrates roughly equivalent semantics for the
default behavior::

   from collections import OrderedDict

   class Meta(type):
       @classmethod
       def __prepare__(cls, *args, **kwargs):
           return OrderedDict()

   class Spam(metaclass=Meta):
       ham = None
       eggs = 5
       __definition_order__ = tuple(k for k in locals()
                                    if not (k.startswith('__') and
                                            k.endswith('__')))

Note that [pep487_] proposes a similar solution, albeit as part of a
broader proposal.

Why a tuple?
------------

Use of a tuple reflects the fact that we are exposing the order in
which attributes on the class were *defined*.  Since the definition
is already complete by the time ``__definition_order__`` is set, the
content and order of the value won't be changing.  Thus we use a type
that communicates that state of immutability.

Why a read-only attribute?
--------------------------

As with the use of tuple, making ``__definition_order__`` a read-only
attribute communicates the fact that the information it represents is
complete.  Since it represents the state of a particular one-time event
(execution of the class definition body), allowing the value to be
replaced would reduce confidence that the attribute corresponds to the
original class body.

If a use case for a writable (or mutable) ``__definition_order__``
arises, the restriction may be loosened later.  Presently this seems
unlikely and furthermore it is usually best to go immutable-by-default.

Note that ``__definition_order__`` is centered on the class definition
body.  The use cases for dealing with the class namespace (``__dict__``)
post-definition are a separate matter.  ``__definition_order__`` would
be a significantly misleading name for a feature focused on more than
class definition.

See [nick_concern_] for more discussion.

Why ignore "dunder" names?
--------------------------

Names starting and ending with "__" are reserved for use by the
interpreter.  In practice they should not be relevant to the users of
``__definition_order__``.  Instead, for nearly everyone they would only
be clutter, causing the same extra work for everyone.

Why None instead of an empty tuple?
-----------------------------------

A key objective of adding ``__definition_order__`` is to preserve
information in class definitions which was lost prior to this PEP.
One consequence is that ``__definition_order__`` implies an original
class definition.  Using ``None`` allows us to clearly distinguish
classes that do not have a definition order.  An empty tuple clearly
indicates a class that came from a definition statement but did not
define any attributes there.

Why None instead of not setting the attribute?
----------------------------------------------

The absence of an attribute requires more complex handling than ``None``
does for consumers of ``__definition_order__``.

Why constrain manually set values?
----------------------------------

If ``__definition_order__`` is manually set in the class body then it
will be used.  We require it to be a tuple of identifiers (or ``None``)
so that consumers of ``__definition_order__`` may have a consistent
expectation for the value.  That helps maximize the feature's
usefulness.

We could also allow an arbitrary iterable for a manually set
``__definition_order__`` and convert it into a tuple.  However, not
all iterables imply a definition order (e.g. ``set``).  So we opt in
favor of requiring a tuple.
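
As a rough illustration only, the constraint could be expressed as the
check below.  The helper name ``check_definition_order`` is
hypothetical and is not part of the proposed implementation.

```python
def check_definition_order(value):
    """Validate a manually set __definition_order__ value.

    Accepts None or a tuple of identifier strings; anything else
    raises TypeError, mirroring the rule in the specification.
    """
    if value is None:
        return None
    if not isinstance(value, tuple):
        raise TypeError('__definition_order__ must be a tuple or None')
    for name in value:
        if not (isinstance(name, str) and name.isidentifier()):
            raise TypeError('%r is not an identifier' % (name,))
    return value
```

A ``set`` or ``list`` is rejected outright rather than converted, so
consumers never have to guess whether the order was meaningful.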

Why is __definition_order__ even necessary?
-------------------------------------------

Since the definition order is not preserved in ``__dict__``, it is
lost once class definition execution completes.  Classes *could*
explicitly set the attribute as the last thing in the body.  However,
then independent decorators could only make use of classes that had done
so.  Instead, ``__definition_order__`` preserves this one bit of info
from the class body so that it is universally available.

Support for C-API Types
=======================

Arguably, most C-defined Python types (e.g. built-in, extension modules)
have a roughly equivalent concept of a definition order. So conceivably
``__definition_order__`` could be set for such types automatically. This
PEP does not introduce any such support. However, it does not prohibit
it either.

Compatibility
=============

This PEP does not break backward compatibility, except in the case that
someone relies *strictly* on ``dict`` as the class definition namespace.
This shouldn't be a problem since ``issubclass(OrderedDict, dict)`` is
true.

Changes
=============

In addition to the class syntax, the following expose the new behavior:

* builtins.__build_class__
* types.prepare_class
* types.new_class

Other Python Implementations
============================

Pending feedback, the impact on Python implementations is expected to
be minimal.  If a Python implementation cannot support switching to
``OrderedDict``-by-default then it can always set ``__definition_order__``
to ``None``.

Implementation
==============

The implementation is found in the tracker. [impl_]

Alternatives
============

cls.__dict__ as OrderedDict
-------------------------------

Instead of storing the definition order in ``__definition_order__``,
the now-ordered definition namespace could be copied into a new
``OrderedDict``.  This would then be used as the mapping proxied as
``__dict__``.  Doing so would mostly provide the same semantics.

However, using ``OrderedDict`` for ``__dict__`` would obscure the
relationship with the definition namespace, making it less useful.
Additionally, doing this would require significant changes to the
semantics of the concrete ``dict`` C-API.

A "namespace" Keyword Arg for Class Definition
----------------------------------------------

PEP 422 introduced a new "namespace" keyword arg to class definitions
that effectively replaces the need for ``__prepare__()``. [pep422_]
However, the proposal was withdrawn in favor of the simpler PEP 487.

A stdlib Metaclass that Implements __prepare__() with OrderedDict
-----------------------------------------------------------------

This has all the same problems as writing your own metaclass.  The
only advantage is that you don't have to actually write this
metaclass.  So it doesn't offer any benefit in the context of this
PEP.

Set __definition_order__ at Compile-time
----------------------------------------

Each class's ``__qualname__`` is determined at compile-time.
This same concept could be applied to ``__definition_order__``.
The result of composing ``__definition_order__`` at compile-time
would be nearly the same as doing so at run-time.

Comparative implementation difficulty aside, the key difference
would be that at compile-time it would not be practical to
preserve definition order for attributes that are set dynamically
in the class body (e.g. ``locals()[name] = value``).  However,
they should still be reflected in the definition order.  One
possible resolution would be to require class authors to manually
set ``__definition_order__`` if they define any class attributes
dynamically.

Ultimately, the use of ``OrderedDict`` at run-time or compile-time
discovery is almost entirely an implementation detail.
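
The gap for dynamically set attributes can be sketched with the
``ast`` module standing in for compile-time discovery.  This is an
illustration under that assumption, not the proposed implementation;
``SOURCE``, ``static_order`` and ``runtime_names`` are hypothetical
names.

```python
import ast

SOURCE = '''
class Spam:
    ham = None
    locals()['dynamic'] = 1
    eggs = 5
'''


def static_order(source, class_name):
    """Names bound by plain assignments or defs in a class body, via the AST."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            names = []
            for stmt in node.body:
                if isinstance(stmt, ast.Assign):
                    # Only simple ``name = value`` targets are visible here.
                    names.extend(t.id for t in stmt.targets
                                 if isinstance(t, ast.Name))
                elif isinstance(stmt, (ast.FunctionDef, ast.ClassDef)):
                    names.append(stmt.name)
            return tuple(names)
    raise LookupError(class_name)


def runtime_names(source, class_name):
    """Non-dunder names actually present on the class after executing it."""
    namespace = {}
    exec(source, namespace)
    return {k for k in vars(namespace[class_name])
            if not (k.startswith('__') and k.endswith('__'))}
```

The static pass sees only ``ham`` and ``eggs``, while the executed
class also carries ``dynamic``.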

References
==========

.. [impl] issue #24254
   (https://bugs.python.org/issue24254)

.. [nick_concern] Nick's concerns about mutability
   (https://mail.python.org/pipermail/python-dev/2016-June/144883.html)

.. [pep422] PEP 422
   (https://www.python.org/dev/peps/pep-0422/#order-preserving-classes)

.. [pep487] PEP 487
   (https://www.python.org/dev/peps/pep-0487/#defining-arbitrary-namespaces)

.. [orig] original discussion
   (https://mail.python.org/pipermail/python-ideas/2013-February/019690.html)

.. [followup1] follow-up 1
   (https://mail.python.org/pipermail/python-dev/2013-June/127103.html)

.. [followup2] follow-up 2
   (https://mail.python.org/pipermail/python-dev/2015-May/140137.html)

Copyright
===========
This document has been placed in the public domain.

From turnbull at sk.tsukuba.ac.jp  Sun Jun 12 02:43:20 2016
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sun, 12 Jun 2016 15:43:20 +0900
Subject: [Python-Dev] BDFL ruling request: should we block forever
 waiting for high-quality random bits?
In-Reply-To: <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
References: <20160609124102.5EE4EB14024@webabinitio.net>
 <1465476616-sup-8510@lrrr.local>
 <AFE019C2-CDAD-4971-8355-461BD8164CFD@lukasa.co.uk>
 <E3AF1F24-2F4D-4777-8232-5EFBF950A245@doughellmann.com>
 <5034383A-3A95-41EE-9326-983AA0AFEDC2@stufft.io>
 <CAP7+vJJnqMEtz1oj3sQ1wB1S10kTgS7KyWPFctxp4Jf8_oHmsA@mail.gmail.com>
 <5759EC2B.8040208@hastings.org>
 <CAPJVwBm-p-e3T+bZnN98UunCntRpXiaF5K6PHWabDdp+pn9nAA@mail.gmail.com>
 <CAP7+vJLueUMoKSfW053YgWcWVF_0s9PCfaB=OCOGdauaD1Nhzw@mail.gmail.com>
 <87lh2dycuo.fsf@vostro.rath.org>
 <20160611074013.GL27919@ando.pearwood.info>
 <CAP7+vJKJb11mnEnpf5Ac0+3saq4W4aUVWfFqHfuh_6nRWBo0=A@mail.gmail.com>
 <649D18FA-5076-4A69-8433-5D8A01EE23B4@stufft.io>
 <CAP7+vJK0dweV=fSkrmuz4irKjG2YcsJuwwLa2sHfkoMnydAv-g@mail.gmail.com>
 <ED96C35D-0D49-474F-965B-649A773AACB8@stufft.io>
 <CAP7+vJKJe8HZckAJFxFgV6-5CUukFin6KwVtumeJZRmK7npToQ@mail.gmail.com>
 <9F5471E7-CA58-4B87-A6BE-297C76222BA3@stufft.io>
 <CAP7+vJK8rTtBMjMz2Onxce+rcpRSWp5bAHAaFWd+FRSzBEGgHw@mail.gmail.com>
 <9BA06FA0-62F1-4491-AB57-8A1CFBF8334A@stufft.io>
Message-ID: <m2fusi7vqv.fsf@xemacs.org>

Donald Stufft writes:

 > I guess one question would be, what does the secrets module do if
 > it's on a Linux that is too old to have getrandom(0), off the top
 > of my head I can think of:
 > 
 > * Silently fall back to reading os.urandom and hope that it's been
 >   seeded.
 > * Fall back to os.urandom and hope that it's been seeded and add a
 >   SecurityWarning or something like it to mention that it's
 >   falling back to os.urandom and it may be getting predictable
 >   random from /dev/urandom.
 > * Hard fail because it can't guarantee secure cryptographic
 >   random.

I'm going to hide behind the Linux manpage (which actually suggests
saving the data in a file to speed initialization at boot) in
mentioning this:

* if random_initialized_timestamp_pre_boot():
      r = open("/dev/random", "rb")
      u = open("/dev/urandom", "wb")
      u.write(r.read(enough_bytes))
      set_random_initialized_timestamp()
  # in theory, secrets can now use os.urandom

From phd at phdru.name  Mon Jun 20 23:17:27 2016
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 21 Jun 2016 05:17:27 +0200
Subject: [Python-Dev] PEP XXX: Compact ordered dict
In-Reply-To: <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
 <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
 <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>
Message-ID: <20160621031727.GA7518@phdru.name>

Hi!

On Tue, Jun 21, 2016 at 11:14:39AM +0900, INADA Naoki <songofacandy at gmail.com> wrote:
> Here is my draft, but I haven't
> posted it yet since
> my English is much worse than C.
> https://www.dropbox.com/s/s85n9b2309k03cq/pep-compact-dict.txt?dl=0

   It's good enough for a start (if a PEP is needed at all). If you push
it to Github I'm sure they will come with pull requests.

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From songofacandy at gmail.com  Tue Jun 21 01:02:52 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 21 Jun 2016 14:02:52 +0900
Subject: [Python-Dev] PEP XXX: Compact ordered dict
In-Reply-To: <20160621031727.GA7518@phdru.name>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
 <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
 <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>
 <20160621031727.GA7518@phdru.name>
Message-ID: <CAEfz+Tzt+sgJzd_JuMExVfkjffWzeZncpeU8LzrTMNAZX690Mg@mail.gmail.com>

On Tue, Jun 21, 2016 at 12:17 PM, Oleg Broytman <phd at phdru.name> wrote:
> Hi!
>
> On Tue, Jun 21, 2016 at 11:14:39AM +0900, INADA Naoki <songofacandy at gmail.com> wrote:
>> Here is my draft, but I haven't
>> posted it yet since
>> my English is much worse than C.
>> https://www.dropbox.com/s/s85n9b2309k03cq/pep-compact-dict.txt?dl=0
>
>    It's good enough for a start (if a PEP is needed at all). If you push
> it to Github I'm sure they will come with pull requests.
>
> Oleg.

Thank you for reading my draft.

> (if a PEP is needed at all)

I don't think so. My PEP would not change the Python language;
it just describes an implementation detail.

Python 3.5 got a new OrderedDict implemented in C without a PEP.
My patch is small compared to that, and the idea is already well known.

-- 
INADA Naoki  <songofacandy at gmail.com>

From songofacandy at gmail.com  Tue Jun 21 11:10:15 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Wed, 22 Jun 2016 00:10:15 +0900
Subject: [Python-Dev] Compact ordered dict is not ordered for split table.
 (was: PEP XXX: Compact ordered dict
Message-ID: <CAEfz+Tx8XRjq1dPUdkyV5Y8AQPrnK0myvQsRmv75dC5F8ZJZ1A@mail.gmail.com>

I'm sorry, but I hadn't realized that the compact ordered dict is
not ordered for split tables.

For example:
>>> class A:
...   ...
...
>>> a = A()
>>> b = A()
>>> a.a = 1
>>> a.b = 2
>>> b.b = 3
>>> b.a = 4
>>> a.__dict__.items()
dict_items([('a', 1), ('b', 2)])
>>> b.__dict__.items()
dict_items([('a', 4), ('b', 3)])

This doesn't affect **kwargs or the class namespace.

But if we change the language spec so that dict preserves insertion order,
this should be addressed.
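
A small check like the following can be run against any build to see
whether instance dicts follow per-instance insertion order.  The
function name is made up for this sketch.

```python
def instance_dict_follows_insertion_order():
    """Return True if each instance __dict__ keeps its own insertion order."""
    class A:
        pass

    a, b = A(), A()
    a.a = 1
    a.b = 2
    b.b = 3          # opposite attribute order on the second instance
    b.a = 4
    return list(a.__dict__) == ['a', 'b'] and list(b.__dict__) == ['b', 'a']
```

With the split-table behavior shown above the function returns False,
because b.__dict__ reports 'a' before 'b'.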

On Tue, Jun 21, 2016 at 2:02 PM, INADA Naoki <songofacandy at gmail.com> wrote:
> On Tue, Jun 21, 2016 at 12:17 PM, Oleg Broytman <phd at phdru.name> wrote:
>> Hi!
>>
>> On Tue, Jun 21, 2016 at 11:14:39AM +0900, INADA Naoki <songofacandy at gmail.com> wrote:
>>> Here is my draft, but I haven't
>>> posted it yet since
>>> my English is much worse than C.
>>> https://www.dropbox.com/s/s85n9b2309k03cq/pep-compact-dict.txt?dl=0
>>
>>    It's good enough for a start (if a PEP is needed at all). If you push
>> it to Github I'm sure they will come with pull requests.
>>
>> Oleg.
>
> Thank you for reading my draft.
>
>> (if a PEP is needed at all)
>
> I don't think so. My PEP is not for changing Python Language,
> just describe implementation detail.
>
> Python 3.5 has new OrderedDict implemented in C without PEP.
> My patch is relatively small than it.  And the idea has been well known.
>
> --
> INADA Naoki  <songofacandy at gmail.com>

-- 
INADA Naoki  <songofacandy at gmail.com>

From ericsnowcurrently at gmail.com  Tue Jun 21 13:12:26 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 21 Jun 2016 11:12:26 -0600
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
Message-ID: <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>

On Mon, Jun 20, 2016 at 12:31 PM, Nikita Nemkin <nikita at nemkin.ru> wrote:
> Right. Ordered by default is a very serious implementation constraint.
> It's only superior in a sense that it completely subsumes/obsoletes
> PEP 520.

Just to be clear, PEP 520 is more than just OrderedDict-by-default.
In fact, the key point is preserving the definition order, which the
PEP now reflects better.  Raymond's compact dict would only provide
the ordered-by-default part and does nothing to persist the definition
order like the PEP specifies.

-eric

From guido at python.org  Tue Jun 21 13:18:50 2016
From: guido at python.org (Guido van Rossum)
Date: Tue, 21 Jun 2016 10:18:50 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
Message-ID: <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>

On Tue, Jun 21, 2016 at 10:12 AM, Eric Snow <ericsnowcurrently at gmail.com>
wrote:

> On Mon, Jun 20, 2016 at 12:31 PM, Nikita Nemkin <nikita at nemkin.ru> wrote:
> > Right. Ordered by default is a very serious implementation constraint.
> > It's only superior in a sense that it completely subsumes/obsoletes
> > PEP 520.
>
> Just to be clear, PEP 520 is more than just OrderedDict-by-default.
> In fact, the key point is preserving the definition order, which the
> PEP now reflects better.  Raymond's compact dict would only provide
> the ordered-by-default part and does nothing to persist the definition
> order like the PEP specifies.
>

Judging from Inada's message there seems to be some confusion about how
well the compact dict preserves order (personally I think if it doesn't
guarantee order after deletions it's pretty useless).

Assuming it preserves order across deletions/compactions (like IIUC
OrderedDict does) isn't that good enough for any of the use cases
considered? It would require a delete+insert to change an item's order. If
we had had these semantics in the language from the start, there would have
been plenty uses of this order, and I suspect nobody would have considered
asking for __definition_order__.
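
To spell out those semantics with today's OrderedDict (a sketch of the
behavior being assumed, nothing more):

```python
from collections import OrderedDict

d = OrderedDict([('ham', 1), ('eggs', 2), ('spam', 3)])

d['ham'] = 10              # updating a value keeps the key's position
assert list(d) == ['ham', 'eggs', 'spam']

del d['eggs']              # deleting leaves the remaining order intact
assert list(d) == ['ham', 'spam']

value = d.pop('ham')       # only delete + insert moves a key to the end
d['ham'] = value
assert list(d) == ['spam', 'ham']
```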

-- 
--Guido van Rossum (python.org/~guido)

From raymond.hettinger at gmail.com  Tue Jun 21 13:50:00 2016
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 21 Jun 2016 10:50:00 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
Message-ID: <32C4B383-4DD4-4F44-A522-55E6EA96FFE1@gmail.com>

> On Jun 21, 2016, at 10:18 AM, Guido van Rossum <guido at python.org> wrote:
> 
> Judging from Inada's message there seems to be some confusion about how well the compact dict preserves order (personally I think if it doesn't guarantee order after deletions it's pretty useless).

Inada should follow PyPy's implementation of the compact dict which does preserve order after deletions (see below).

My original proof-of-concept code didn't have that feature; instead, it was aimed at saving space and speeding-up iteration.  The key ordering was just a by-product.  Additional logic was needed to preserve order for interleaved insertions and deletions.

Raymond

---(PyPy test of order preservation)-------------------------------------------------------------

'Demonstrate PyPy preserves order across repeated insertions and deletions'

from random import randrange
import string

s = list(string.letters)
d = dict.fromkeys(s)
n = len(s)
for _ in range(10000):
    i = randrange(n)
    c = s.pop(i);    s.append(c)
    d.pop(c);        d[c] = None
    assert d.keys() == s
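
---(Python 3 variant of the same test, as a sketch)-------------------------------------------

The test above is written for Python 2.  A Python 3 port might look like this; the function name preserves_order and the fixed seed are additions for this sketch, and it returns a result instead of asserting so it can be pointed at any mapping type.

```python
import string
from collections import OrderedDict
from random import Random


def preserves_order(mapping_type=dict, rounds=10000, seed=0):
    """Return True if mapping_type keeps insertion order across pop/re-insert."""
    rng = Random(seed)
    s = list(string.ascii_letters)
    d = mapping_type.fromkeys(s)
    n = len(s)
    for _ in range(rounds):
        i = rng.randrange(n)
        c = s.pop(i)
        s.append(c)            # model: move the key to the end of a list
        d.pop(c)
        d[c] = None            # same move, done as delete + insert
        if list(d) != s:
            return False
    return True


# Check both the built-in dict and OrderedDict on the running interpreter.
results = {t.__name__: preserves_order(t) for t in (dict, OrderedDict)}
```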

---(PyPy session showing order preservation)--------------------------------------------------

$ pypy
Python 2.7.10 (c09c19272c990a0611b17569a0085ad1ab00c8ff, Jun 13 2016, 03:59:08)
[PyPy 5.3.0 with GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>> d = dict(raymond='red', rachel='blue', matthew='yellow')
>>>> del d['rachel']
>>>> d['cindy'] = 'green'
>>>> d['rachel'] = 'azure'
>>>> d
{'raymond': 'red', 'matthew': 'yellow', 'cindy': 'green', 'rachel': 'azure'}

From storchaka at gmail.com  Tue Jun 21 16:48:09 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 21 Jun 2016 23:48:09 +0300
Subject: [Python-Dev] When to use EOFError?
Message-ID: <nkc96a$esv$1@ger.gmane.org>

There is a design question. If you read a file in some format or with some
protocol, and the data ends unexpectedly, when should you raise the general
EOFError exception and when a format/protocol-specific exception?

For example, when loading truncated pickle data, an unpickler can raise
EOFError, UnpicklingError, ValueError or AttributeError. It is possible
to avoid ValueError and AttributeError, but what exception should be
raised instead, EOFError or UnpicklingError? Maybe convert all EOFErrors
to UnpicklingError? Or convert every UnpicklingError caused by unexpectedly
ended input to EOFError? Or raise EOFError if the input ends after a
complete opcode, and UnpicklingError if it ends inside a truncated opcode?

http://bugs.python.org/issue25761
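
To show what callers face today, here is a small sketch that reports
which exception a given input produces.  The names TRUNCATION_ERRORS
and outcome are made up for this example, and which exception you get
for a particular truncation point may differ between the C and Python
unpicklers.

```python
import pickle

# The exceptions listed above for truncated pickle data.
TRUNCATION_ERRORS = (EOFError, pickle.UnpicklingError,
                     ValueError, AttributeError)


def outcome(data):
    """Return 'ok', or the name of the exception raised while unpickling."""
    try:
        pickle.loads(data)
    except TRUNCATION_ERRORS as exc:
        return type(exc).__name__
    return 'ok'


payload = pickle.dumps([1, 2, 3])
report = {
    'complete': outcome(payload),
    'empty': outcome(b''),
    'last byte missing': outcome(payload[:-1]),
}
```

A caller that wants to treat "input ended early" uniformly has to
catch the whole tuple.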

From ncoghlan at gmail.com  Tue Jun 21 17:01:15 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 Jun 2016 14:01:15 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
Message-ID: <CADiSq7fjaqqx3ChkxLc1TYKS=mDvWvsxrjuyS=30TDbwp2WKfg@mail.gmail.com>

On 21 June 2016 at 10:18, Guido van Rossum <guido at python.org> wrote:
> On Tue, Jun 21, 2016 at 10:12 AM, Eric Snow <ericsnowcurrently at gmail.com>
> wrote:
>>
>> On Mon, Jun 20, 2016 at 12:31 PM, Nikita Nemkin <nikita at nemkin.ru> wrote:
>> > Right. Ordered by default is a very serious implementation constraint.
>> > It's only superior in a sense that it completely subsumes/obsoletes
>> > PEP 520.
>>
>> Just to be clear, PEP 520 is more than just OrderedDict-by-default.
>> In fact, the key point is preserving the definition order, which the
>> PEP now reflects better.  Raymond's compact dict would only provide
>> the ordered-by-default part and does nothing to persist the definition
>> order like the PEP specifies.
>
>
> Judging from Inada's message there seems to be some confusion about how well
> the compact dict preserves order (personally I think if it doesn't guarantee
> order after deletions it's pretty useless).
>
> Assuming it preserves order across deletions/compactions (like IIUC
> OrderedDict does) isn't that good enough for any of the use cases
> considered? It would require a delete+insert to change an item's order. If
> we had had these semantics in the language from the start, there would have
> been plenty uses of this order, and I suspect nobody would have considered
> asking for __definition_order__.

Right, if *tp_dict itself* on type objects is guaranteed to be
order-preserving, then we don't need to do anything except perhaps
provide a helper method or descriptor on type that automatically
filters out the dunder-attributes, and spell out the type dict
population order for:

- class statements (universal)
- types.new_class (universal)
- calling type() directly (universal)
- PyType_Ready (CPython-specific)
- PyType_FromSpec (CPython-specific)
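
A pure-Python version of such a helper could look roughly like this. It is
a sketch that assumes the class __dict__ reflects definition order (which
released Python versions since 3.6 provide); the name
public_definition_order is hypothetical and is not part of PEP 520 or of
type.

```python
def public_definition_order(cls):
    """Names defined on cls itself, in namespace order, without dunder names."""
    return tuple(
        name
        for name in vars(cls)
        if not (name.startswith("__") and name.endswith("__"))
    )


class Example:
    zeta = 1

    def alpha(self):
        return self.zeta

    _private = None
    __marker__ = ()        # dunder name: filtered out by the helper


order = public_definition_order(Example)
print(order)
```

Implicit entries such as __module__, __qualname__ and __doc__ are dunder
names too, so the same filter hides them.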

Something that isn't currently defined in PEP 520, and probably should
be regardless of whether the final implementation is an order
preserving tp_dict or a new __definition_order__ attribute, is where
descriptors implicitly defined via __slots__ will appear relative to
other attributes.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Jun 21 17:10:58 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 Jun 2016 14:10:58 -0700
Subject: [Python-Dev] frame evaluation API PEP
In-Reply-To: <BN3PR03MB21955CE6BB2B4164A7A15C6FBB2A0@BN3PR03MB2195.namprd03.prod.outlook.com>
References: <CAP1=2W7K6Ny82Vq-rk3zO9cHxxJmtQGczqGy3byg83sy-N8W9Q@mail.gmail.com>
 <CAP1=2W6T8Gjo4B44Ut1t9yDYQcMymTBY1hcZ=+QagY5pyrauNg@mail.gmail.com>
 <CAP7+vJ+TQLcDku=pNUg=7uQ=_sw_NsBqAbM5Os1K7VAx=7W1nQ@mail.gmail.com>
 <CAP1=2W7135HTD_=7uFNGC0MyLfZiHfMW4jte4DWD+RZObUupbg@mail.gmail.com>
 <CAP7+vJLVE-N7qtAo8qQ_amedYAJEEzZAw6d=pnGMJ35sHFQhAg@mail.gmail.com>
 <CAP1=2W48dLL60XMjK+1y3htSBLbysva4pwXWh5pim7v8vOLucA@mail.gmail.com>
 <CAP7+vJLZhsEi=9aycU9axS9eZXHjfF7TngHJOLqnWuBr1daKHQ@mail.gmail.com>
 <BN3PR03MB21955CE6BB2B4164A7A15C6FBB2A0@BN3PR03MB2195.namprd03.prod.outlook.com>
Message-ID: <CADiSq7eV0iyKJtX9AcwdYc41xTc8pcPaXdeSSSpH1UN5ZbT6nA@mail.gmail.com>

On 20 June 2016 at 13:32, Dino Viehland via Python-Dev
<python-dev at python.org> wrote:
> It doesn't help with the issue of potentially multiple consumers of that field
> that has been brought up before but I'm not sure how concerned we should be
> about that scenario anyway.

Brett's comparison with sys.settrace seems relevant here - we don't
allow multiple trace hooks at once, which means if you want more than
one active at once, either they need to cooperate with each other, or
you need to install a meta-tracehook to manage them somehow.
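
For sys.settrace, such a meta-tracehook can be sketched in a few lines. The
names combine_tracers and make_recorder are invented for this illustration;
only sys.settrace and the documented trace-function protocol (return a
local trace function, or None to stop tracing a scope) are assumed.

```python
import sys


def combine_tracers(*tracers):
    """Build one trace function that forwards every event to several tracers."""
    def make_dispatch(active):
        def dispatch(frame, event, arg):
            still_active = []
            for tracer in active:
                local = tracer(frame, event, arg)
                if local is not None:      # a tracer opts out of a scope by returning None
                    still_active.append(local)
            return make_dispatch(still_active) if still_active else None
        return dispatch
    return make_dispatch(list(tracers))


def make_recorder(log):
    """A trivial tracer that records (event, function name) pairs."""
    def tracer(frame, event, arg):
        log.append((event, frame.f_code.co_name))
        return tracer
    return tracer


def traced_function():
    return 1 + 1


log_a, log_b = [], []
sys.settrace(combine_tracers(make_recorder(log_a), make_recorder(log_b)))
try:
    traced_function()
finally:
    sys.settrace(None)
```

Both logs end up with the same events, which is the cooperation the single
hook slot otherwise leaves to the consumers.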

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Tue Jun 21 17:17:00 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 21 Jun 2016 23:17:00 +0200
Subject: [Python-Dev] When to use EOFError?
In-Reply-To: <nkc96a$esv$1@ger.gmane.org>
References: <nkc96a$esv$1@ger.gmane.org>
Message-ID: <CAMpsgwYj5J9tfnV=h+hKK3CEf=EoDNFow=b8uSeFbZHk3Da9FQ@mail.gmail.com>

When loading truncated data with pickle, I expect a pickle error, not a
generic ValueError nor EOFError.

Victor

From ncoghlan at gmail.com  Tue Jun 21 17:21:21 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 Jun 2016 14:21:21 -0700
Subject: [Python-Dev] Review of PEP 520: Ordered Class Definition
 Namespace
In-Reply-To: <CALFfu7Bqtbzn1T2mgWh-aY1KPw2iL_qgT1CYYCs7hGrJbwYGiw@mail.gmail.com>
References: <CAP7+vJK5SVeng92QKvY4PpKR3OX=YLqjj2tgGFjbvmavnpLxxA@mail.gmail.com>
 <CALFfu7Bqtbzn1T2mgWh-aY1KPw2iL_qgT1CYYCs7hGrJbwYGiw@mail.gmail.com>
Message-ID: <CADiSq7fBmJbrkAocYFKbYashtigUqjnmb1xr6DzXWRkau-ysZA@mail.gmail.com>

On 20 June 2016 at 19:11, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> FWIW, regarding repercussions, I do not expect any other potential
> future feature will subsume the functionality of PEP 520.  The closest
> thing would be if cls.__dict__ became ordered.  However, that would
> intersect with __definition_order__ only at first.  Furthermore,
> cls.__dict__ would only ever be able to make vague promises about any
> relationship with the definition order.  The point of
> __definition_order__ is to provide the one obvious place to get a
> specific bit of information about a class.

It occurs to me that a settable __definition_order__ provides a
benefit that an ordered tp_dict doesn't: to get the "right" definition
order in something like Cython or dynamic type creation, you don't
need to carefully craft the order in which attributes are defined, you
just need to set __definition_order__ appropriately.

It also means that the "include dunder-attributes or not" decision is
easy to override, regardless of what we set as the default.

By contrast, if the *only* ordering information is
cls.__dict__.keys(), then there's no way for a type implementor to
hide implementation details.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From songofacandy at gmail.com  Tue Jun 21 19:09:09 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Wed, 22 Jun 2016 08:09:09 +0900
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <32C4B383-4DD4-4F44-A522-55E6EA96FFE1@gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
 <32C4B383-4DD4-4F44-A522-55E6EA96FFE1@gmail.com>
Message-ID: <CAEfz+TwXMujkPLj8aaDLnZUgTzEdhc_K5a=Ww2T87_Y9+n=UdQ@mail.gmail.com>

On Wed, Jun 22, 2016 at 2:50 AM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
>> On Jun 21, 2016, at 10:18 AM, Guido van Rossum <guido at python.org> wrote:
>>
>> Judging from Inada's message there seems to be some confusion about how well the compact dict preserves order (personally I think if it doesn't guarantee order after deletions it's pretty useless).
>
> Inada should follow PyPy's implementation of the compact dict which does preserve order after deletions (see below).

I do follow it, in most cases.

The case where my compact dict doesn't preserve order is when a PEP 412
key-sharing dict is used.

>>> class A:
...   ...
...
>>> a = A()
>>> b = A()   # a and b share the same keys, but each has its own values
>>> a.a = 1
>>> a.b = 2   # The order in shared key is (a, b)
>>> b.b = 3
>>> b.a = 4
>>> a.__dict__.items()
dict_items([('a', 1), ('b', 2)])
>>> b.__dict__.items()
dict_items([('a', 4), ('b', 3)])

It's possible to split the keys whenever the insertion order is not exactly
the same, but that decreases the efficiency of the key-sharing dict.

If the key-sharing dict is effective only in such very strict cases,
I feel __slots__ can be used instead.
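
The session above can be turned into a small check that reports the order
each instance dict yields. This is a sketch that reuses the names from the
example. On Python versions where dict insertion order is part of the
language (3.7 and later), each instance is expected to report its own
insertion order rather than the order of the shared keys.

```python
class A:
    pass


a = A()
b = A()      # a and b may share one keys table (PEP 412)
a.a = 1
a.b = 2      # order seen by a: a, b
b.b = 3
b.a = 4      # order seen by b: b, a

order_a = list(vars(a))
order_b = list(vars(b))
print(order_a, order_b)
```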

From ericsnowcurrently at gmail.com  Tue Jun 21 20:33:11 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 21 Jun 2016 18:33:11 -0600
Subject: [Python-Dev] Review of PEP 520: Ordered Class Definition
 Namespace
In-Reply-To: <CADiSq7fBmJbrkAocYFKbYashtigUqjnmb1xr6DzXWRkau-ysZA@mail.gmail.com>
References: <CAP7+vJK5SVeng92QKvY4PpKR3OX=YLqjj2tgGFjbvmavnpLxxA@mail.gmail.com>
 <CALFfu7Bqtbzn1T2mgWh-aY1KPw2iL_qgT1CYYCs7hGrJbwYGiw@mail.gmail.com>
 <CADiSq7fBmJbrkAocYFKbYashtigUqjnmb1xr6DzXWRkau-ysZA@mail.gmail.com>
Message-ID: <CALFfu7CVkKbjZ=1t0kbeEzYLcoavt7hQxhOmPkuNOKJH-j2X4w@mail.gmail.com>

On Tue, Jun 21, 2016 at 3:21 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> It occurs to me that a settable __definition_order__ provides a
> benefit that an ordered tp_dict doesn't: to get the "right" definition
> order in something like Cython or dynamic type creation, you don't
> need to carefully craft the order in which attributes are defined, you
> just need to set __definition_order__ appropriately.
>
> It also means that the "include dunder-attributes or not" decision is
> easy to override, regardless of what we set as the default.
>
> By contrast, if the *only* ordering information is
> cls.__dict__.keys(), then there's no way for a type implementor to
> hide implementation details.

Good point.  I'll make a note of this in the PEP.

-eric

From ericsnowcurrently at gmail.com  Tue Jun 21 20:41:53 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 21 Jun 2016 18:41:53 -0600
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
Message-ID: <CALFfu7D_2SrT+T3Zu2qLf6hcBc4ZUYbb_Eoas6HdM6Fxr+WGUg@mail.gmail.com>

On Tue, Jun 21, 2016 at 11:18 AM, Guido van Rossum <guido at python.org> wrote:
> If we had had these semantics in the language from the start, there would have
> been plenty uses of this order, and I suspect nobody would have considered
> asking for __definition_order__.

True.  The key thing that __definition_order__ provides is an explicit
relationship with the class definition.  Since we have the opportunity
to capture that now, I think we should take it, regardless of the type
of the class definition namespace or even of cls.__dict__.  For me the
strong association with the order in the class definition is worth
having.

-eric

From ericsnowcurrently at gmail.com  Tue Jun 21 20:50:19 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 21 Jun 2016 18:50:19 -0600
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CADiSq7fjaqqx3ChkxLc1TYKS=mDvWvsxrjuyS=30TDbwp2WKfg@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
 <CADiSq7fjaqqx3ChkxLc1TYKS=mDvWvsxrjuyS=30TDbwp2WKfg@mail.gmail.com>
Message-ID: <CALFfu7DA4Q4QWDfSjCRVB2_6NiCoo9jG5LOUi0D+=32YGgrXUQ@mail.gmail.com>

On Tue, Jun 21, 2016 at 3:01 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Right, if *tp_dict itself* on type objects is guaranteed to be
> order-preserving, then we don't need to do anything except perhaps
> provide a helper method or descriptor on type that automatically
> filters out the dunder-attributes, and spell out the type dict
> population order for:
>
> - class statements (universal)
> - types.new_class (universal)
> - calling type() directly (universal)
> - PyType_Ready (CPython-specific)
> - PyType_FromSpec (CPython-specific)

The problem I have with this is that it still doesn't give any strong
relationship with the class definition.  Certainly in most cases it
will amount to the same thing.  However, there is no way to know if
cls.__dict__ represents the class definition or not.  You also lose
knowing whether or not a class came from a definition (or acts as
though it did).  Finally, __definition_order__ makes the relationship
with the definition order clear, whereas cls.__dict__ does not.
Instead of being an obvious tool, with cls.__dict__ that relationship
would be tucked away where only a few folks with deep knowledge of
Python would be in a position to take advantage.

>
> Something that isn't currently defined in PEP 520, and probably should
> be regardless of whether the final implementation is an order
> preserving tp_dict or a new __definition_order__ attribute, is where
> descriptors implicitly defined via __slots__ will appear relative to
> other attributes.

I'll add that.

-eric

From songofacandy at gmail.com  Tue Jun 21 23:40:48 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Wed, 22 Jun 2016 12:40:48 +0900
Subject: [Python-Dev] Compact ordered dict is not ordered for split
 table. (was: PEP XXX: Compact ordered dict
In-Reply-To: <CAEfz+Tx8XRjq1dPUdkyV5Y8AQPrnK0myvQsRmv75dC5F8ZJZ1A@mail.gmail.com>
References: <CAEfz+Tx8XRjq1dPUdkyV5Y8AQPrnK0myvQsRmv75dC5F8ZJZ1A@mail.gmail.com>
Message-ID: <CAEfz+TwQrT3+shkgAVPEAMx+HAEtpBD=3i+DZtip4EYYOOi5MA@mail.gmail.com>

There are three options I can think of.

1) Revert the key-sharing dict (PEP 412).

pros: Removing the key-sharing dict makes the dict implementation simpler.

cons: In some applications, PEP 412 is far more compact than the compact
ordered dict.  (Note: using __slots__ may help in such situations.)

2) Don't make "keeping insertion order" part of the Python language spec.

pros: Best efficiency.

cons: Different behavior between a normal dict and instance.__dict__ may
confuse people.

3) Stricter rules for the key-sharing dict.

My idea is:
* Increasing the number of entries (inserting a new key) is possible
  only if the refcnt of the keys == 1.

* Inserting a new item (with an existing key) into a dict is allowed only when
  the insertion position == the number of items in the dict (PyDictObject.ma_used).

pros: We can have a "dict keeping insertion order".

cons: The key-sharing dict can't be used in many cases.  A small and harmless
change may cause a sudden increase in memory usage.  (__slots__ is more
predictable.)

On Wed, Jun 22, 2016 at 12:10 AM, INADA Naoki <songofacandy at gmail.com> wrote:
> I'm sorry, but I hadn't realized that the compact ordered dict is
> not ordered for split tables.
>
> For example:
>>>> class A:
> ...   ...
> ...
>>>> a = A()
>>>> b = A()
>>>> a.a = 1
>>>> a.b = 2
>>>> b.b = 3
>>>> b.a = 4
>>>> a.__dict__.items()
> dict_items([('a', 1), ('b', 2)])
>>>> b.__dict__.items()
> dict_items([('a', 4), ('b', 3)])
>
>
> This doesn't affect **kwargs or the class namespace.
>
> But if we change the language spec so that dict preserves insertion order,
> this should be addressed.
>
>
> On Tue, Jun 21, 2016 at 2:02 PM, INADA Naoki <songofacandy at gmail.com> wrote:
>> On Tue, Jun 21, 2016 at 12:17 PM, Oleg Broytman <phd at phdru.name> wrote:
>>> Hi!
>>>
>>> On Tue, Jun 21, 2016 at 11:14:39AM +0900, INADA Naoki <songofacandy at gmail.com> wrote:
>>>> Here is my draft, but I haven't
>>>> posted it yet since
>>>> my English is much worse than C.
>>>> https://www.dropbox.com/s/s85n9b2309k03cq/pep-compact-dict.txt?dl=0
>>>
>>>    It's good enough for a start (if a PEP is needed at all). If you push
>>> it to Github I'm sure they will come with pull requests.
>>>
>>> Oleg.
>>
>> Thank you for reading my draft.
>>
>>> (if a PEP is needed at all)
>>
>> I don't think so. My PEP is not for changing the Python language;
>> it just describes an implementation detail.
>>
>> Python 3.5 has a new OrderedDict implemented in C without a PEP.
>> My patch is relatively small compared to it, and the idea has been well known.
>>
>> --
>> INADA Naoki  <songofacandy at gmail.com>
>
>
>
> --
> INADA Naoki  <songofacandy at gmail.com>

-- 
INADA Naoki  <songofacandy at gmail.com>

From greg.ewing at canterbury.ac.nz  Wed Jun 22 01:34:48 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 22 Jun 2016 17:34:48 +1200
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CADiSq7fjaqqx3ChkxLc1TYKS=mDvWvsxrjuyS=30TDbwp2WKfg@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
 <CADiSq7fjaqqx3ChkxLc1TYKS=mDvWvsxrjuyS=30TDbwp2WKfg@mail.gmail.com>
Message-ID: <576A2378.9090508@canterbury.ac.nz>

Nick Coghlan wrote:
> Something that isn't currently defined in PEP 520 ... is where
> descriptors implicitly defined via __slots__ will appear relative to
> other attributes.

In the place where the __slots__ attribute appears?

-- 
Greg

From songofacandy at gmail.com  Wed Jun 22 07:48:59 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Wed, 22 Jun 2016 20:48:59 +0900
Subject: [Python-Dev] PEP XXX: Compact ordered dict
In-Reply-To: <20160621031727.GA7518@phdru.name>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
 <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
 <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>
 <20160621031727.GA7518@phdru.name>
Message-ID: <CAEfz+TyevCGhEEzo2NwWSwgQ7DnsSPtPxqqSr70f+DJQGZfzoA@mail.gmail.com>

FYI, here is the calculated size of each dict, by len(d).
https://docs.google.com/spreadsheets/d/1nN5y6IsiJGdNxD7L7KBXmhdUyXjuRAQR_WbrS8zf6mA/edit?usp=sharing

On Tue, Jun 21, 2016 at 12:17 PM, Oleg Broytman <phd at phdru.name> wrote:
> Hi!
>
> On Tue, Jun 21, 2016 at 11:14:39AM +0900, INADA Naoki <songofacandy at gmail.com> wrote:
>> Here is my draft, but I haven't
>> posted it yet since
>> my English is much worse than C.
>> https://www.dropbox.com/s/s85n9b2309k03cq/pep-compact-dict.txt?dl=0
>
>    It's good enough for a start (if a PEP is needed at all). If you push
> it to Github I'm sure they will come with pull requests.
>
> Oleg.
> --
>      Oleg Broytman            http://phdru.name/            phd at phdru.name
>            Programmers don't die, they just GOSUB without RETURN.

-- 
INADA Naoki  <songofacandy at gmail.com>

From random832 at fastmail.com  Wed Jun 22 10:17:27 2016
From: random832 at fastmail.com (Random832)
Date: Wed, 22 Jun 2016 10:17:27 -0400
Subject: [Python-Dev] Why are class dictionaries not accessible?
Message-ID: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>

The documentation states: """Objects such as modules and instances have
an updateable __dict__ attribute; however, other objects may have write
restrictions on their __dict__ attributes (for example, classes use a
dictproxy to prevent direct dictionary updates)."""

However, it's not clear from that *why* direct dictionary updates are
undesirable. This not only prevents you from getting a reference to the
real class dict (which is the apparent goal), but is also the
fundamental reason why you can't use a metaclass to put, say, an
OrderedDict in its place - because the type constructor has to copy the
dict that was used in class preparation into a new dict rather than
using the one that was actually returned by __prepare__.

[Also, the name of the type used for this is mappingproxy, not
dictproxy]

From random832 at fastmail.com  Wed Jun 22 10:31:27 2016
From: random832 at fastmail.com (Random832)
Date: Wed, 22 Jun 2016 10:31:27 -0400
Subject: [Python-Dev] When to use EOFError?
In-Reply-To: <nkc96a$esv$1@ger.gmane.org>
References: <nkc96a$esv$1@ger.gmane.org>
Message-ID: <1466605887.4117246.645287633.6E053CD9@webmail.messagingengine.com>

On Tue, Jun 21, 2016, at 16:48, Serhiy Storchaka wrote:
> There is a design question. If you read file in some format or with some 
> protocol, and the data is ended unexpectedly, when to use general 
> EOFError exception and when to use format/protocol specific exception?
> 
> For example when load truncated pickle data, an unpickler can raise 
> EOFError, UnpicklingError, ValueError or AttributeError. It is possible 
> to avoid ValueError or AttributeError, but what exception should be 
> raised instead, EOFError or UnpicklingError? Maybe convert all EOFError 
> to UnpicklingError?

I think this is the most appropriate. If the calling code needs to know
the original reason it can find it in __cause__.
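
A minimal sketch of that conversion, using exception chaining. The wrapper
name load_strict and the error message are illustrative only; this is not
what the pickle module itself does.

```python
import io
import pickle


def load_strict(f):
    """Like pickle.load, but report a premature end of input as UnpicklingError."""
    try:
        return pickle.load(f)
    except EOFError as exc:
        # "from exc" keeps the original exception reachable as __cause__
        raise pickle.UnpicklingError("pickle data ended unexpectedly") from exc


try:
    load_strict(io.BytesIO(b""))          # empty input
except pickle.UnpicklingError as exc:
    original = exc.__cause__

print(type(original).__name__)
```

Callers that only care about "bad pickle" catch one exception type, while
code that needs the underlying reason can still inspect __cause__.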

My instinct, though, (and I'm aware that others may not agree, but I
thought it was worth bringing up) is that loads should actually always
raise a ValueError, i.e. my mental model of loads is like:

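from io import BytesIO
from pickle import UnpicklingError, load

def loads(s):
    f = BytesIO(s)
    try:
        return load(f)
    except UnpicklingError as e:
        # the original error stays reachable as __cause__
        raise ValueError from e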

From guido at python.org  Wed Jun 22 11:11:19 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 22 Jun 2016 08:11:19 -0700
Subject: [Python-Dev] Why are class dictionaries not accessible?
In-Reply-To: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>
References: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>
Message-ID: <CAP7+vJLU_myxxDoQu8ovbF+ZpCU1QfPzq3sqWPV8jo+i_tJxCg@mail.gmail.com>

On Wed, Jun 22, 2016 at 7:17 AM, Random832 <random832 at fastmail.com> wrote:

> The documentation states: """Objects such as modules and instances have
> an updateable __dict__ attribute; however, other objects may have write
> restrictions on their __dict__ attributes (for example, classes use a
> dictproxy to prevent direct dictionary updates)."""
>
> However, it's not clear from that *why* direct dictionary updates are
> undesirable. This not only prevents you from getting a reference to the
> real class dict (which is the apparent goal), but is also the
> fundamental reason why you can't use a metaclass to put, say, an
> OrderedDict in its place - because the type constructor has to copy the
> dict that was used in class preparation into a new dict rather than
> using the one that was actually returned by __prepare__.
>
> [Also, the name of the type used for this is mappingproxy, not
> dictproxy]
>

This is done in order to force all mutations of the class dict to go
through attribute assignments on the class. The latter takes care of
updating the class struct, e.g. if you were to add an `__add__` method
dynamically it would update tp_as_number->nb_add. If you could modify the
dict object directly it would be more difficult to arrange for this side
effect.
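
Both halves of this can be observed from Python code. The sketch below uses
a made-up Money class: writing through the proxy fails, while assigning the
attribute on the class works and makes the + operator usable.

```python
class Money:
    def __init__(self, amount):
        self.amount = amount


namespace = vars(Money)                  # a read-only mappingproxy, not the real dict

try:
    namespace["__add__"] = lambda self, other: None
except TypeError as exc:                 # the proxy does not support item assignment
    direct_write_error = exc
else:
    direct_write_error = None


def add(self, other):
    return Money(self.amount + other.amount)


Money.__add__ = add                      # goes through type.__setattr__
total = Money(1) + Money(2)              # the operator now finds the new method
print(type(namespace).__name__, total.amount)
```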

-- 
--Guido van Rossum (python.org/~guido)

From nd at perlig.de  Wed Jun 22 12:22:49 2016
From: nd at perlig.de (=?iso-8859-1?q?Andr=E9_Malo?=)
Date: Wed, 22 Jun 2016 18:22:49 +0200
Subject: [Python-Dev] When to use EOFError?
In-Reply-To: <nkc96a$esv$1@ger.gmane.org>
References: <nkc96a$esv$1@ger.gmane.org>
Message-ID: <201606221822.49308@news.perlig.de>

* Serhiy Storchaka wrote:

> There is a design question. If you read file in some format or with some
> protocol, and the data is ended unexpectedly, when to use general
> EOFError exception and when to use format/protocol specific exception?
>
> For example when load truncated pickle data, an unpickler can raise
> EOFError, UnpicklingError, ValueError or AttributeError. It is possible
> to avoid ValueError or AttributeError, but what exception should be
> raised instead, EOFError or UnpicklingError? Maybe convert all EOFError
> to UnpicklingError? Or all UnpicklingError caused by unexpectedly ended
> input to EOFError? Or raise EOFError if the input is ended after
> completed opcode, and UnpicklingError if it contains truncated opcode?

I often concatenate multiple pickles into one file. When reading them, it 
works like this:

import pickle

def read_pickles(fp):
    # Yield objects until the stream is exhausted.
    try:
        while True:
            yield pickle.load(fp)
    except EOFError:
        pass

In this case the truncation is not really unexpected. Maybe it should 
distinguish between truncated-in-the-middle and truncated-because-empty.

(Same goes for marshal)
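
One way to make that distinction today, without changing pickle, is to
probe for a byte before each load. This is a sketch for seekable binary
files; the function name iter_pickles_checked and the error message are
made up for illustration.

```python
import io
import pickle


def iter_pickles_checked(fp):
    """Yield pickled objects from a seekable binary file until a clean end of file."""
    while True:
        position = fp.tell()
        if not fp.read(1):          # nothing left: the expected end
            return
        fp.seek(position)           # put the probed byte back
        try:
            yield pickle.load(fp)
        except EOFError as exc:     # data started but ran out mid-pickle
            raise pickle.UnpicklingError("pickle data was truncated") from exc


complete = pickle.dumps("first") + pickle.dumps([1, 2, 3])
objects = list(iter_pickles_checked(io.BytesIO(complete)))
print(objects)
```

With this shape, an empty tail ends the iteration silently, while a pickle
that is cut off part-way surfaces as an error instead of being swallowed.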

Cheers,
-- 
Real programmers confuse Christmas and Halloween because
DEC 25 = OCT 31.  -- Unknown

                                      (found in ssl_engine_mutex.c)

From ericfahlgren at gmail.com  Wed Jun 22 13:05:50 2016
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Wed, 22 Jun 2016 10:05:50 -0700
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
In-Reply-To: <CALFfu7DA4Q4QWDfSjCRVB2_6NiCoo9jG5LOUi0D+=32YGgrXUQ@mail.gmail.com>
References: <CAK9R32QH+RZ4oHo2W7P0wc_NOXo2TxzNr6OWToX1MBo+3cqSmA@mail.gmail.com>
 <CALFfu7Cxmm5Ms4_XqfpqOYr-A3uLugpRoFKjd20G+mDH6cv4eA@mail.gmail.com>
 <CAK9R32RFH34AY6xfCc4F4NRzKfdyFaF8Ynx2CpET+YKV4yK4PA@mail.gmail.com>
 <CADiSq7eGpYi7b6J7F0J8DSkL8XcHmwVNOg=tPs+tu0MwUNsyQQ@mail.gmail.com>
 <CANawmycB2ptBw-y90B2DsEe0kKqnC8fFBVdBFBZQRTv0DitG1g@mail.gmail.com>
 <CAP7+vJ+s97M737xFgMjY_4F5EsZ9XVye9VyxWwWpf_JGRrm+tw@mail.gmail.com>
 <CANawmyeMC-YRs7hhQ-ZRPDWERGBO=BJofSuKS+6sAcYe4AYGdw@mail.gmail.com>
 <CALFfu7CmbqFhCtYf2T-AW0Qh=FJSAvf=tdX=_Au116eNb=vZHg@mail.gmail.com>
 <CAP7+vJ+PN0aTBKrnZ_dpfF6quCD_1LTCMqPsdGo-f4B9fXd-gg@mail.gmail.com>
 <CADiSq7fjaqqx3ChkxLc1TYKS=mDvWvsxrjuyS=30TDbwp2WKfg@mail.gmail.com>
 <CALFfu7DA4Q4QWDfSjCRVB2_6NiCoo9jG5LOUi0D+=32YGgrXUQ@mail.gmail.com>
Message-ID: <08cf01d1cca8$54e2d960$fea88c20$@gmail.com>

On Wed 2016-06-22 Eric Snow [mailto:ericsnowcurrently at gmail.com] wrote:
> The problem I have with this is that it still doesn't give any strong relationship with the class definition.
> Certainly in most cases it will amount to the same thing.  However, there is no way to know if cls.__dict__ 
> represents the class definition or not.  You also lose knowing whether or not a class came from a definition
> (or acts as though it did).  Finally, __definition_order__ makes the relationship with the definition order clear,
>  whereas cls.__dict__ does not.
> Instead of being an obvious tool, with cls.__dict__ that relationship would be tucked away where only a
>  few folks with deep knowledge of Python would be in a position to take advantage.

I see this as being grossly/loosely analogous to traversing __bases__ vs calling mro(), so I feel the
same rationale applies to adding __definition_order__ as mro.

Eric

From songofacandy at gmail.com  Wed Jun 22 13:23:18 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 23 Jun 2016 02:23:18 +0900
Subject: [Python-Dev] Idea: more compact,
 interned string key only dict for namespace.
Message-ID: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>

As I said in my last email, the compact ordered dict can't preserve the
insertion order of a key-sharing dict (PEP 412).

I'm thinking about deprecating the key-sharing dict for now.

Instead, my new idea is to introduce a more compact dict
specialized for namespaces.

If the BDFL (or BDFL delegate) likes this idea, I'll take another
week to implement it.

Background
----------------

* Most keys of a namespace dict are strings.
* Calculating the hash of a string is cheap (one memory access, thanks to the cached hash).
* And most keys are already interned.

Design
----------

Instead of the normal PyDictKeyEntry, use a PyInternedKeyEntry like this.

typedef struct {
    // no me_hash
    PyObject *me_key, *me_value;
} PyInternedKeyEntry;

insertdict() interns the key if it's unicode; otherwise it converts the dict
to a normal compact ordered dict.

lookdict_interned() compares only pointers (doesn't call unicode_eq())
when the key being searched for is interned.

And add a new internal API to create an interned-key-only dict.

PyDictObject* _PyDict_NewForNamespace();
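
The pointer comparison relies on interning mapping equal strings to a
single object. The same property can be observed at the Python level with
sys.intern(). The toy lookup below only illustrates the idea; it is not the
proposed C code, and lookup_interned is a made-up name.

```python
import sys


def lookup_interned(entries, key):
    """Linear search that compares keys by identity only (no string comparison)."""
    for entry_key, value in entries:
        if entry_key is key:
            return value
    raise KeyError(key)


entries = [(sys.intern("spam"), 1), (sys.intern("eggs"), 2)]

# Built at run time, so it starts out as a distinct object with equal contents.
runtime_key = "".join(["eg", "gs"])
probe = sys.intern(runtime_key)          # returns the one canonical "eggs" object

found = lookup_interned(entries, probe)
print(found)
```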

Memory usage
--------------------

on amd64 arch.

key-sharing dict:

* 96 bytes for ~3 items
* 128 bytes for 4~5 items.

compact dict:

* 224 bytes for ~5 items.

(232 bytes when keep supporting key-shared dict)

interned key only dict:

* 184 bytes for ~5 items

Note
------

The interned-key-only dict is still larger than the key-sharing dict.

But it can be used for more purposes.  It can be used for interning strings,
for example.  It can be used for the kwargs dict when all keys are already interned.

If we provide _PyDict_NewForNamespace to extension modules,
the json decoder can have an option to use this, too.

-- 
INADA Naoki  <songofacandy at gmail.com>

From mark at hotpy.org  Wed Jun 22 21:30:01 2016
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 22 Jun 2016 18:30:01 -0700
Subject: [Python-Dev] Idea: more compact,
 interned string key only dict for namespace.
In-Reply-To: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>
References: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>
Message-ID: <576B3B99.6070204@hotpy.org>

Hi all,

I think we need some more data before going any further reimplementing 
dicts.

What I would like to know is, across a set of Python programs (ideally a 
representative set), what the proportion of dicts in memory at any one 
time are:

a) instance dicts
b) other namespace dicts (classes and modules)
c) data dicts with all string keys
d) other data dicts
e) keyword argument dicts (I'm guessing this is vanishingly small)

I would expect that (a) far exceeds (b) and depending on the application 
also considerably exceeds (c), but I would like some real data.
 From that we can compute the (approximate) memory costs of the 
competing designs.

As an aside, if anyone is really keen to save memory, then removing the 
cycle GC header is the thing to do.
That uses 24 bytes per object and *half* of all live objects have it.
And don't forget that any Python object is really two objects, the 
object and its dict, so that is 48 extra bytes every time you create a 
new object.

Cheers,
Mark.

On 22/06/16 10:23, INADA Naoki wrote:
> As my last email, compact ordered dict can't preserve
> insertion order of key sharing dict (PEP 412).
>
> I'm thinking about deprecating key shared dict for now.
>
> Instead, my new idea is introducing more compact dict
> specialized for namespace.
>
> If BDFL (or BDFL delegate) likes this idea, I'll take another
> one week to implement this.
>
>
> Background
> ----------------
>
> * Most keys of namespace dict are string.
> * Calculating hash of string is cheap (one memory access, thanks for cache).
> * And most keys are interned already.
>
>
> Design
> ----------
>
> Instead of normal PyDictKeyEntry, use PyInternedKeyEntry like this.
>
> typedef struct {
>      // no me_hash
>      PyObject *me_key, *me_value;
> } PyInternedKeyEntry;
>
>
> insertdict() interns key if it's unicode, otherwise it converts dict to
> normal compact ordered dict.
>
> lookdict_interned() compares only pointer (doesn't call unicode_eq())
> when searching key is interned.
>
> And add new internal API to create interned key only dict.
>
> PyDictObject* _PyDict_NewForNamespace();
>
>
> Memory usage
> --------------------
>
> on amd64 arch.
>
> key-sharing dict:
>
> * 96 bytes for ~3 items
> * 128 bytes for 4~5 items.
>
> compact dict:
>
> * 224 bytes for ~5 items.
>
> (232 bytes when keep supporting key-shared dict)
>
> interned key only dict:
>
> * 184 bytes for ~5 items
>
>
> Note
> ------
>
> Interned key only dict is still larger than key-shared dict.
>
> But it can be used for more purpose.  It can be used for interning string
> for example.  It can be used to kwargs dict when all keys are interned already.
>
> If we provide _PyDict_NewForNamespace to extension modules,
> json decoder can have option to use this, too.
>
>

From songofacandy at gmail.com  Thu Jun 23 00:08:27 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 23 Jun 2016 13:08:27 +0900
Subject: [Python-Dev] Idea: more compact,
 interned string key only dict for namespace.
In-Reply-To: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>
References: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>
Message-ID: <CAEfz+Tx8PDznkUyrRQUq-nfgkAcjSVkkbs87H2wREe8092hXmw@mail.gmail.com>

> Memory usage
> --------------------
>
> on amd64 arch.
>
> key-sharing dict:
>
> * 96 bytes for ~3 items
> * 128 bytes for 4~5 items.

Note: In addition, there is the shared keys object:

* 128 bytes for ~3 items
* 224 bytes for 4~5 items

So, letting S be the number of instances sharing the keys, the cost per
instance is:

* 96 + (128 / S) bytes for ~3 items
* 128 + (224 / S) bytes for 4~5 items
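
To make the arithmetic concrete, here is a small sketch that amortizes the
shared keys object over S instances, using the per-dict sizes (96 and 128
bytes) and the shared-keys sizes (128 and 224 bytes) listed above. The
function name key_sharing_cost is only illustrative.

```python
def key_sharing_cost(per_dict, shared_keys, sharers):
    """Average bytes per instance when `sharers` dicts share one keys object."""
    return per_dict + shared_keys / sharers


COMPACT = 224    # compact dict, up to 5 items (figure quoted in this thread)
INTERNED = 184   # proposed interned-key-only dict, up to 5 items

for s in (1, 2, 3, 4, 5, 100):
    small = key_sharing_cost(96, 128, s)    # up to 3 items
    large = key_sharing_cost(128, 224, s)   # 4 to 5 items
    print(f"S={s:>3}: ~3 items {small:6.1f} B, 4~5 items {large:6.1f} B")
```

With these figures, for 4~5 items the key-sharing dict becomes smaller than
the 224-byte compact dict from S = 3 on, and matches the 184-byte interned
key only dict at S = 4.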

>
> compact dict:
>
> * 224 bytes for ~5 items.
>
> (232 bytes when keep supporting key-shared dict)
>
> interned key only dict:
>
> * 184 bytes for ~5 items
>
>
> Note
> ------
>
> Interned key only dict is still larger than key-shared dict.
>
> But it can be used for more purpose.  It can be used for interning string
> for example.  It can be used to kwargs dict when all keys are interned already.
>
> If we provide _PyDict_NewForNamespace to extension modules,
> json decoder can have option to use this, too.
>
>
> --
> INADA Naoki  <songofacandy at gmail.com>

-- 
INADA Naoki  <songofacandy at gmail.com>

From songofacandy at gmail.com  Thu Jun 23 00:43:17 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 23 Jun 2016 13:43:17 +0900
Subject: [Python-Dev] Idea: more compact,
 interned string key only dict for namespace.
In-Reply-To: <576B3B99.6070204@hotpy.org>
References: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>
 <576B3B99.6070204@hotpy.org>
Message-ID: <CAEfz+TxED3ptgNQk=7ozFFbcR-C9zuDDExBbTQVhswVkC7+B=A@mail.gmail.com>

Hi, Mark.  Thank you for your reply.

On Thu, Jun 23, 2016 at 10:30 AM, Mark Shannon <mark at hotpy.org> wrote:
> Hi all,
>
> I think we need some more data before going any further reimplementing
> dicts.
>
> What I would like to know is, across a set of Python programs (ideally a
> representative set), what the proportion of dicts in memory at any one time
> are:
>
> a) instance dicts
> b) other namespace dicts (classes and modules)
> c) data dicts with all string keys
> d) other data dicts
> e) keyword argument dicts (I'm guessing this is vanishingly small)
>
> I would expect that (a) far exceeds (b) and depending on the application
> also considerably exceeds (c), but I would like some real data.
> From that we can compute the (approximate) memory costs of the competing
> designs.

I think you're right.
But I don't have a clear idea of how to do it.
Is there any existing effort to collect statistics about dicts?

>
> As an aside, if anyone is really keen to save memory, then removing the
> cycle GC header is the thing to do.
> That uses 24 bytes per object and *half* of all live objects have it.
> And don't forget that any Python object is really two objects, the object
> and its dict, so that is 48 extra bytes every time you create a new object.
>

It's a great idea.  But I can't do it before Python 3.6.

My main concern is not saving memory, but getting an ordered dict for
**kwargs without significant overhead.

If "ordered, except for key-sharing dicts" is acceptable, there is no problem.
A key-sharing compact dict is smaller than the current key-sharing dict of
Python 3.5 in most cases.
https://docs.google.com/spreadsheets/d/1nN5y6IsiJGdNxD7L7KBXmhdUyXjuRAQR_WbrS8zf6mA/edit#gid=0

Regards,

--
INADA Naoki  <songofacandy at gmail.com>

From songofacandy at gmail.com  Thu Jun 23 03:41:21 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 23 Jun 2016 16:41:21 +0900
Subject: [Python-Dev] Idea: more compact,
 interned string key only dict for namespace.
In-Reply-To: <CAEfz+TxED3ptgNQk=7ozFFbcR-C9zuDDExBbTQVhswVkC7+B=A@mail.gmail.com>
References: <CAEfz+Tzf6q0V7-Lo2DMfR3KvKsdFj2eahctgT+7zVUMAjOhgUg@mail.gmail.com>
 <576B3B99.6070204@hotpy.org>
 <CAEfz+TxED3ptgNQk=7ozFFbcR-C9zuDDExBbTQVhswVkC7+B=A@mail.gmail.com>
Message-ID: <CAEfz+TwbVQCQhrJMfJWUw8XRihZsXgVtJN0P+DpHmczRQ2dXkQ@mail.gmail.com>

I've measured the run time and maxrss of sphinx-build for each variant
(two runs each):

## master

$ rm -rf build/
$ /usr/bin/time ~/local/python-master/bin/sphinx-build -b html -d
build/doctrees -D latex_paper_size=  . build/html -QN

71.76user 0.27system 1:12.06elapsed 99%CPU (0avgtext+0avgdata
176248maxresident)k
80inputs+202888outputs (2major+58234minor)pagefaults 0swaps

71.86user 0.28system 1:12.16elapsed 99%CPU (0avgtext+0avgdata
176312maxresident)k
0inputs+201480outputs (0major+59897minor)pagefaults 0swaps

## compact-dict w/ shared

$ rm -rf build/
$ /usr/bin/time ~/local/python-compact/bin/sphinx-build -b html -d
build/doctrees -D latex_paper_size=  . build/html -QN

72.18user 0.27system 1:12.47elapsed 99%CPU (0avgtext+0avgdata
158104maxresident)k
728inputs+200792outputs (0major+53409minor)pagefaults 0swaps

72.79user 0.30system 1:13.11elapsed 99%CPU (0avgtext+0avgdata
157916maxresident)k
0inputs+200792outputs (0major+54072minor)pagefaults 0swaps

## compact w/o shared key

(Only shared key removed.  No interned key only dict)

$ rm -rf build/
$ /usr/bin/time ~/local/python-intern/bin/sphinx-build -b html -d
build/doctrees -D latex_paper_size=  . build/html -QN

71.79user 0.34system 1:12.16elapsed 99%CPU (0avgtext+0avgdata
165884maxresident)k
480inputs+200792outputs (0major+56947minor)pagefaults 0swaps

71.84user 0.27system 1:12.13elapsed 99%CPU (0avgtext+0avgdata
166888maxresident)k
640inputs+200792outputs (5major+56834minor)pagefaults 0swaps

-- 
INADA Naoki  <songofacandy at gmail.com>

From random832 at fastmail.com  Thu Jun 23 11:01:05 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 23 Jun 2016 11:01:05 -0400
Subject: [Python-Dev] Why are class dictionaries not accessible?
In-Reply-To: <CAP7+vJLU_myxxDoQu8ovbF+ZpCU1QfPzq3sqWPV8jo+i_tJxCg@mail.gmail.com>
References: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>
 <CAP7+vJLU_myxxDoQu8ovbF+ZpCU1QfPzq3sqWPV8jo+i_tJxCg@mail.gmail.com>
Message-ID: <1466694065.230465.646395065.5FE7CA44@webmail.messagingengine.com>

On Wed, Jun 22, 2016, at 11:11, Guido van Rossum wrote:
> This is done in order to force all mutations of the class dict to go
> through attribute assignments on the class. The latter takes care of
> updating the class struct, e.g. if you were to add an `__add__` method
> dynamically it would update tp_as_number->nb_add. If you could modify the
> dict object directly it would be more difficult to arrange for this side
> effect.

Why is this different from the fact that updating a normal object's dict
bypasses descriptors and any special logic in __setattr__? Dunder
methods are already "special" in the sense that you can't use them as
object attributes; I wouldn't be surprised by "assigning a dunder method
via the class's dict breaks things".
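
For reference, a small sketch of the two instance-level behaviours mentioned
here (the class names are made up for the example): writing to an
instance's __dict__ skips __setattr__, and a dunder stored on an instance
is ignored by implicit special method lookup.

```python
class Guarded:
    """Rejects all normal attribute assignment."""

    def __setattr__(self, name, value):
        raise AttributeError("attributes are read-only")


g = Guarded()
g.__dict__["x"] = 1          # bypasses Guarded.__setattr__ entirely
print(g.x)


class Plain:
    pass


p = Plain()
p.__len__ = lambda: 3        # stored as an ordinary instance attribute
try:
    len(p)                   # special methods are looked up on the type
except TypeError as exc:
    print("len() failed:", exc)
```

Both cases fail softly (wrong behaviour or an exception), which is the
contrast with the class-dict case being asked about.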

From ericsnowcurrently at gmail.com  Thu Jun 23 11:03:57 2016
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 23 Jun 2016 09:03:57 -0600
Subject: [Python-Dev] PEP XXX: Compact ordered dict
In-Reply-To: <CAEfz+Tzt+sgJzd_JuMExVfkjffWzeZncpeU8LzrTMNAZX690Mg@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
 <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
 <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>
 <20160621031727.GA7518@phdru.name>
 <CAEfz+Tzt+sgJzd_JuMExVfkjffWzeZncpeU8LzrTMNAZX690Mg@mail.gmail.com>
Message-ID: <CALFfu7AWo6GurNP-1bA5FvwYv0ogQvZchbdphHyfTVnjd00-0Q@mail.gmail.com>

On Mon, Jun 20, 2016 at 11:02 PM, INADA Naoki <songofacandy at gmail.com> wrote:
> On Tue, Jun 21, 2016 at 12:17 PM, Oleg Broytman <phd at phdru.name> wrote:
>> (if a PEP is needed at all)
>
> I don't think so. My PEP is not for changing Python Language,
> just describe implementation detail.
>
> Python 3.5 has new OrderedDict implemented in C without PEP.
> My patch is relatively small than it.  And the idea has been well known.

How about, for 3.6, target re-implementing OrderedDict using the
compact dict approach (and leave dict alone for now).  That way we
have an extra release cycle to iron out the kinks before switching
dict over for 3.7. :)

-eric

From guido at python.org  Thu Jun 23 11:19:44 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 23 Jun 2016 08:19:44 -0700
Subject: [Python-Dev] Why are class dictionaries not accessible?
In-Reply-To: <1466694065.230465.646395065.5FE7CA44@webmail.messagingengine.com>
References: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>
 <CAP7+vJLU_myxxDoQu8ovbF+ZpCU1QfPzq3sqWPV8jo+i_tJxCg@mail.gmail.com>
 <1466694065.230465.646395065.5FE7CA44@webmail.messagingengine.com>
Message-ID: <CAP7+vJLRMR6+KEs=LENaUPqqc1_p2H+dBaYAQsxp68O3x-21yw@mail.gmail.com>

On Thu, Jun 23, 2016 at 8:01 AM, Random832 <random832 at fastmail.com> wrote:

> On Wed, Jun 22, 2016, at 11:11, Guido van Rossum wrote:
> > This is done in order to force all mutations of the class dict to go
> > through attribute assignments on the class. The latter takes care of
> > updating the class struct, e.g. if you were to add an `__add__` method
> > dynamically it would update tp_as_number->nb_add. If you could modify the
> > dict object directly it would be more difficult to arrange for this side
> > effect.
>
> Why is this different from the fact that updating a normal object's dict
> bypasses descriptors and any special logic in __setattr__? Dunder
> methods are already "special" in the sense that you can't use them as
> object attributes; I wouldn't be surprised by "assigning a dunder method
> via the class's dict breaks things".
>

It was a long time ago that I wrote this, but IIRC the breakage could express
itself as a segfault or other C-level crash due to some internal state
invariant of the type object being violated, not just an exception. The
existence of ctypes notwithstanding, we take C-level crashes very seriously.

-- 
--Guido van Rossum (python.org/~guido)

From songofacandy at gmail.com  Thu Jun 23 11:26:28 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 24 Jun 2016 00:26:28 +0900
Subject: [Python-Dev] PEP XXX: Compact ordered dict
In-Reply-To: <CALFfu7AWo6GurNP-1bA5FvwYv0ogQvZchbdphHyfTVnjd00-0Q@mail.gmail.com>
References: <CALFfu7AYa6rp7qMM4=rY6yai97a_MHX+EfT3igYD+Kv6vk_aCg@mail.gmail.com>
 <CADiSq7dZqpH2n_JpgzC-3BgqOKcWv3OcnCvPMLe-jx8zJB2_mA@mail.gmail.com>
 <CAP7+vJLKiP8iwFS-D51SpF0dGXVrDj9hLGSyR1C0t3ntonQgQA@mail.gmail.com>
 <CALFfu7BS1fBHLqvoEVWsX4mW-bQ2SRTFnVqgT62Gc0APj+k94w@mail.gmail.com>
 <CAEfz+Tx2iUUiq4mOw-ybU5DUVXxdqfJEq4R2xqQ0OkAgT=NJqg@mail.gmail.com>
 <20160621031727.GA7518@phdru.name>
 <CAEfz+Tzt+sgJzd_JuMExVfkjffWzeZncpeU8LzrTMNAZX690Mg@mail.gmail.com>
 <CALFfu7AWo6GurNP-1bA5FvwYv0ogQvZchbdphHyfTVnjd00-0Q@mail.gmail.com>
Message-ID: <CAEfz+TwnHMdud6wXGpmoFTnkN8Et3A148rT6nsh7ZEVYXUMYXw@mail.gmail.com>

On Fri, Jun 24, 2016 at 12:03 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Mon, Jun 20, 2016 at 11:02 PM, INADA Naoki <songofacandy at gmail.com> wrote:
>> On Tue, Jun 21, 2016 at 12:17 PM, Oleg Broytman <phd at phdru.name> wrote:
>>> (if a PEP is needed at all)
>>
>> I don't think so. My PEP is not for changing Python Language,
>> just describe implementation detail.
>>
>> Python 3.5 has new OrderedDict implemented in C without PEP.
>> My patch is relatively small than it.  And the idea has been well known.
>
> How about, for 3.6, target re-implementing OrderedDict using the
> compact dict approach (and leave dict alone for now).  That way we
> have an extra release cycle to iron out the kinks before switching
> dict over for 3.7. :)
>
> -eric

I can't, since OrderedDict inherits from dict: the OrderedDict
implementation is based on the dict implementation.
Since I'm not an expert on the Python object system, I don't know how to
separate the OrderedDict implementation from dict.

-- 
INADA Naoki  <songofacandy at gmail.com>

From jeanpierreda at gmail.com  Thu Jun 23 16:03:00 2016
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Thu, 23 Jun 2016 13:03:00 -0700
Subject: [Python-Dev] Why are class dictionaries not accessible?
In-Reply-To: <CAP7+vJLRMR6+KEs=LENaUPqqc1_p2H+dBaYAQsxp68O3x-21yw@mail.gmail.com>
References: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>
 <CAP7+vJLU_myxxDoQu8ovbF+ZpCU1QfPzq3sqWPV8jo+i_tJxCg@mail.gmail.com>
 <1466694065.230465.646395065.5FE7CA44@webmail.messagingengine.com>
 <CAP7+vJLRMR6+KEs=LENaUPqqc1_p2H+dBaYAQsxp68O3x-21yw@mail.gmail.com>
Message-ID: <CABicbJJ3Vn1fPjAT25S0=PiFtVY4tGChLYWCP4_8H26-Yk+Qpw@mail.gmail.com>

On Thu, Jun 23, 2016 at 8:19 AM, Guido van Rossum <guido at python.org> wrote:
>
> It was a long time when I wrote this, but IIRC the breakage could express
> itself as a segfault or other C-level crash due to some internal state
> invariant of the type object being violated, not just an exception. The
> existence of ctypes notwithstanding, we take C-level crashes very seriously.
>

Big digression: one can still obtain the dict if they really want to, even
without using ctypes. I suppose don't actually mutate it unless you want to
segfault.

>>> import gc
>>> class A(object): pass
>>> type(A.__dict__)
<class 'mappingproxy'>
>>> type(gc.get_referents(A.__dict__)[0])
<class 'dict'>
>>> gc.get_referents(A.__dict__)[0]['abc'] = 1
>>> A.abc
1
>>>

(One can also get it right from A, but A can have other references, so
maybe that's less reliable.)

I think I wanted this at the time so I could better measure the sizes of
objects. sys.getsizeof(A.__dict__) is very different
from sys.getsizeof(gc.get_referents(A.__dict__)[0]), and also different
from sys.getsizeof(A). For example:

>>> import gc, sys
>>> class A(object): pass
>>> sys.getsizeof(A); sys.getsizeof(A.__dict__); sys.getsizeof(gc.get_referents(A.__dict__)[0])
976
48
288
>>> for i in range(10000): setattr(A, 'attr_%s' % i, i)
>>> sys.getsizeof(A); sys.getsizeof(A.__dict__); sys.getsizeof(gc.get_referents(A.__dict__)[0])
976
48
393312

(Fortunately, if you want to walk the object graph to measure memory usage
per object type, you're probably going to be using gc.get_referents already
anyway, so this is just confirmation that you're getting what you want in
one corner case.)
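
A minimal sketch of such a walk, assuming a helper called size_by_type
(the name is made up here). It follows gc.get_referents from a root object
and sums sys.getsizeof per type name, visiting each object once.

```python
import gc
import sys
from collections import Counter


def size_by_type(root):
    """Sum sys.getsizeof() per type name over everything reachable from root."""
    totals = Counter()
    seen = {id(root)}
    stack = [root]
    while stack:
        obj = stack.pop()
        totals[type(obj).__name__] += sys.getsizeof(obj)
        for ref in gc.get_referents(obj):
            if id(ref) not in seen:
                seen.add(id(ref))
                stack.append(ref)
    return totals


root = {"a": [1, 2, 3], "b": {"c": "d"}}
for name, size in sorted(size_by_type(root).items()):
    print(name, size)
```

Starting from a class instead of a plain data structure pulls in its bases,
its functions and their globals, so a real tool would need some rule for
where to stop.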

-- Devin

From guido at python.org  Thu Jun 23 16:40:51 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 23 Jun 2016 13:40:51 -0700
Subject: [Python-Dev] Why are class dictionaries not accessible?
In-Reply-To: <CABicbJJ3Vn1fPjAT25S0=PiFtVY4tGChLYWCP4_8H26-Yk+Qpw@mail.gmail.com>
References: <1466605047.4114458.645279001.61C40E1D@webmail.messagingengine.com>
 <CAP7+vJLU_myxxDoQu8ovbF+ZpCU1QfPzq3sqWPV8jo+i_tJxCg@mail.gmail.com>
 <1466694065.230465.646395065.5FE7CA44@webmail.messagingengine.com>
 <CAP7+vJLRMR6+KEs=LENaUPqqc1_p2H+dBaYAQsxp68O3x-21yw@mail.gmail.com>
 <CABicbJJ3Vn1fPjAT25S0=PiFtVY4tGChLYWCP4_8H26-Yk+Qpw@mail.gmail.com>
Message-ID: <CAP7+vJJf2tRTpCnj0twXbCHPhAV16HhthFPhzQ9O8SpZToUepQ@mail.gmail.com>

"Er, among our chief weapons are fear, surprise, ctypes, gc, and fanatical
devotion to the Pope!"

On Thu, Jun 23, 2016 at 1:03 PM, Devin Jeanpierre <jeanpierreda at gmail.com>
wrote:

> On Thu, Jun 23, 2016 at 8:19 AM, Guido van Rossum <guido at python.org>
>  wrote:
>>
>> It was a long time when I wrote this, but IIRC the breakage could express
>> itself as a segfault or other C-level crash due to some internal state
>> invariant of the type object being violated, not just an exception. The
>> existence of ctypes notwithstanding, we take C-level crashes very seriously.
>>
>
> Big digression: one can still obtain the dict if they really want to, even
> without using ctypes. I suppose don't actually mutate it unless you want to
> segfault.
>
> >>> import gc
> >>> class A(object): pass
> >>> type(A.__dict__)
> <class 'mappingproxy'>
> >>> type(gc.get_referents(A.__dict__)[0])
> <class 'dict'>
> >>> gc.get_referents(A.__dict__)[0]['abc'] = 1
> >>> A.abc
> 1
> >>>
>
> (One can also get it right from A, but A can have other references, so
> maybe that's less reliable.)
>
> I think I wanted this at the time so I could better measure the sizes of
> objects. sys.getsizeof(A.__dict__) is very different
> from sys.getsizeof(gc.get_referents(A.__dict__)[0]), and also different
> from sys.getsizeof(A). For example:
>
> >>> import gc
> >>> class A(object): pass
> >>> sys.getsizeof(A); sys.getsizeof(A.__dict__);
> sys.getsizeof(gc.get_referents(A.__dict__)[0])
> 976
> 48
> 288
> >>> for i in range(10000): setattr(A, 'attr_%s' % i, i)
> >>> sys.getsizeof(A); sys.getsizeof(A.__dict__);
> sys.getsizeof(gc.get_referents(A.__dict__)[0])
> 976
> 48
> 393312
>
> (Fortunately, if you want to walk the object graph to measure memory usage
> per object type, you're probably going to be using gc.get_referents already
> anyway, so this is just confirmation that you're getting what you want in
> one corner case.)
>
> -- Devin
>

-- 
--Guido van Rossum (python.org/~guido)

From njs at pobox.com  Thu Jun 23 21:00:18 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 23 Jun 2016 18:00:18 -0700
Subject: [Python-Dev] Compact ordered dict is not ordered for split
 table. (was: PEP XXX: Compact ordered dict
In-Reply-To: <CAEfz+TwQrT3+shkgAVPEAMx+HAEtpBD=3i+DZtip4EYYOOi5MA@mail.gmail.com>
References: <CAEfz+Tx8XRjq1dPUdkyV5Y8AQPrnK0myvQsRmv75dC5F8ZJZ1A@mail.gmail.com>
 <CAEfz+TwQrT3+shkgAVPEAMx+HAEtpBD=3i+DZtip4EYYOOi5MA@mail.gmail.com>
Message-ID: <CAPJVwB=17sUY0BbTsxxi+ZGfHKXgxuM2mTVTwXGit0yxiS9faA@mail.gmail.com>

On Tue, Jun 21, 2016 at 8:40 PM, INADA Naoki <songofacandy at gmail.com> wrote:
> There are three options I can think.
>
>
> 1) Revert key-shared dict (PEP412).
>
> pros: Removing key-shared dict makes dict implementation simple.
>
> cons: In some applications, PEP 412 is far more compact than compact
> ordered dict.  (Note: Using __slots__ may help such situation).
>
>
> 2) Don't make "keeping insertion order" is Python Language Spec.
>
> pros: Best efficiency
>
> cons: Different behavior between normal dict and instance.__dict__ may
> confuse people.
>
>
> 3) More strict rule for key sharing dict.
>
> My idea is:
> * Increasing number of entries (inserting new key) can be possible
> only if refcnt of keys == 1.
>
> * Inserting new item (with existing key) into dict is allowed only when
>   insertion position == number of items in the dict (PyDictObject.ma_used).
>
> pros: We can have "dict keeping insertion order".
>
> cons: Can't use key-sharing dict for many cases.  Small and harmless
> change may cause
> sudden memory usage increase.  (__slots__ is more predicable).

IIUC, key-sharing dicts are a best-effort optimization where if I have
a class like:

class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b

f1 = Foo(1, 2)
f2 = Foo(3, 4)

then f1.__dict__ and f2.__dict__ can share their key arrays... but if
I do f1.c = "c", then f1.__dict__ gets automatically switched to a
regular dict. The idea being that in, say, 99% of cases, different
objects of the same type all share the same set of keys, and in the
other 1%, oh well, we fall back on the regular behavior.

It seems to me that all this works fine for ordered dicts too, if we
add the restriction that key arrays can be shared if and only if the
two dicts have the same set of keys *and* initially assign those keys
in the same order. In, say, 98.9% of cases, different objects of the
same type all share the same set of keys and initially assign those
keys in the same order, and in the other 1.1%, oh well, we can
silently fall back on unshared keys, same as before. (And crucially,
the OrderedDict semantics are that only adding *new* keys changes the
order; assignments to existing keys preserve the existing order. So if
a given type always creates the same instance attributes in the same
order at startup and never adds or deletes any, then its key values
*and* key order will stay the same even if it later mutates some of
those existing attributes in-place.)
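
The ordering rule relied on here can be checked directly with
collections.OrderedDict; this is just a small illustration of the
semantics, not of the proposed implementation.

```python
from collections import OrderedDict

od = OrderedDict()
od["a"] = 1
od["b"] = 2

od["a"] = 10                 # assigning to an existing key keeps its position
print(list(od))

od["c"] = 3                  # a new key is appended at the end
print(list(od))

del od["a"]
od["a"] = 1                  # delete plus re-insert moves the key to the end
print(list(od))
```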

It's possible that there will be some weird types that mess this up, like:

class WeirdFoo:
    def __init__(self, a, b):
        if a % 2 == 0:
            self.a = a
            self.b = b
        else:
            self.b = b
            self.a = a

assert list(WeirdFoo(1, 2).__dict__.keys()) != list(WeirdFoo(2,
3).__dict__.keys())

but, who cares? It'd be good due-diligence to collect data on this to
confirm that it isn't a big issue, but intuitively, code like
WeirdFoo.__init__ is vanishingly rare, and this is just a best-effort
optimization anyway. Catching 98.9% of cases is good enough.

Is there something I'm missing here? Is this your option #3?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From songofacandy at gmail.com  Fri Jun 24 00:14:55 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 24 Jun 2016 13:14:55 +0900
Subject: [Python-Dev] Compact ordered dict is not ordered for split
 table. (was: PEP XXX: Compact ordered dict
In-Reply-To: <CAPJVwB=17sUY0BbTsxxi+ZGfHKXgxuM2mTVTwXGit0yxiS9faA@mail.gmail.com>
References: <CAEfz+Tx8XRjq1dPUdkyV5Y8AQPrnK0myvQsRmv75dC5F8ZJZ1A@mail.gmail.com>
 <CAEfz+TwQrT3+shkgAVPEAMx+HAEtpBD=3i+DZtip4EYYOOi5MA@mail.gmail.com>
 <CAPJVwB=17sUY0BbTsxxi+ZGfHKXgxuM2mTVTwXGit0yxiS9faA@mail.gmail.com>
Message-ID: <CAEfz+TxbYeBNHYesMV400TRH=R2BXMooH-Lpd5ek=8vutKmnLQ@mail.gmail.com>

> IIUC, key-sharing dicts are a best-effort optimization where if I have
> a class like:
>
> class Foo:
>     def __init__(self, a, b):
>         self.a = a
>         self.b = b
>
> f1 = Foo(1, 2)
> f2 = Foo(3, 4)
>
> then f1.__dict__ and f2.__dict__ can share their key arrays... but if
> I do f1.c = "c", then f1.__dict__ gets automatically switched to a
> regular dict. The idea being that in, say, 99% of cases, different
> objects of the same type all share the same set of keys, and in the
> other 1%, oh well, we fall back on the regular behavior.

Small correction:  Key sharing is given up only when the shared keys need
to be resized.

f1 = Foo(1, 2)  # f1 has keys [a, b].  Call this keys object k1.  Foo caches k1.
f2 = Foo(3, 4)  # The new instance uses the cached k1 keys.
f1.c = "c"   # k1 has room for three keys, so nothing happens.
f1.d = "d"   # Sharing is given up.  Foo doesn't use shared keys anymore.
f3 = Foo(5, 6)   # f3 has a normal dict.

You can see it with `sys.getsizeof(f1.__dict__), sys.getsizeof(f2.__dict__)`.
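
A runnable version of that check, as a sketch. The sizes printed depend on
the CPython version and platform, and later versions changed how instance
dicts are stored, so no particular numbers should be expected; the point is
only to compare the values before and after the extra attributes are added.

```python
import sys


class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b


f1 = Foo(1, 2)
f2 = Foo(3, 4)
before = (sys.getsizeof(f1.__dict__), sys.getsizeof(f2.__dict__))

f1.c = "c"
f1.d = "d"
f3 = Foo(5, 6)               # created after f1 grew beyond the shared keys
after = (
    sys.getsizeof(f1.__dict__),
    sys.getsizeof(f2.__dict__),
    sys.getsizeof(f3.__dict__),
)

print("before:", before)
print("after: ", after)
```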

> It seems to me that all this works fine for ordered dicts too, if we
> add the restriction that key arrays can be shared if and only if the
> two dicts have the same set of keys *and* initially assign those keys
> in the same order. In, say, 98.9% of cases, different objects of the
> same type all share the same set of keys and initially assign those
> keys in the same order, and in the other 1.1%, oh well, we can
> silently fall back on unshared keys, same as before. (And crucially,
> the OrderedDict semantics are that only adding *new* keys changes the
> order; assignments to existing keys preserve the existing order. So if
> a given type always creates the same instance attributes in the same
> order at startup and never adds or deletes any, then its key values
> *and* key order will stay the same even if it later mutates some of
> those existing attributes in-place.)
>
> It's possible that there will be some weird types that mess this up, like:
>
> class WeirdFoo:
>     def __init__(self, a, b):
>         if a % 2 == 0:
>             self.a = a
>             self.b = b
>         else:
>             self.b = b
>             self.a = a
>
> assert list(WeirdFoo(1, 2).__dict__.keys()) != list(WeirdFoo(2,
> 3).__dict__.keys())
>
> but, who cares? It'd be good due-diligence to collect data on this to
> confirm that it isn't a big issue, but intuitively, code like
> WeirdFoo.__init__ is vanishingly rare, and this is just a best-effort
> optimization anyway. Catching 98.9% of cases is good enough.
>

While I think it's less than 98.9% (see the examples below), I agree with you.

1) Not shared even with the current implementation

class A:
    n = 0

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def add(self):
        self.n += 1  # creates a new instance attribute "n" on first call

a = A(1, 2, 3)
b = A(4, 5, 6)
a.add()

2) Not shared under a strict ordering rule

class A:
    file = None
    def __init__(self, a, *, filename=None):
        if filename is not None:
            self.file = open(filename, 'w')
        self.a = a

a = A(42, filename="logfile.txt")
b = A(43)

3) Web application's model objects

class User(Model):
    id = IntColumn()
    name = StringColumn()
    age = IntColumn()

# When creating a new instance, (name, age) is initialized, and id is
# filled in after insert.
user = User(name="methane", age=32)
db.add(user)

# When instances are fetched from the DB, the ORM populates attributes in
# (id, name, age) order.
# These 100 instances don't share keys under the "strict ordering rule".

users = User.query.limit(100).all()

> Is there something I'm missing here? Is this your option #3?

Yes.

It may work well, but "one special instance disables key-sharing for
all instances created afterwards" may cause memory usage to keep growing
for a long time.
People watching a monitoring graph will think their application has a
memory leak.

My new idea may have more stable memory usage, without decreasing memory
efficiency so much.

See https://mail.python.org/pipermail/python-dev/2016-June/145391.html

Compact ordered dict is more efficient than key-sharing dict in the case of
Sphinx.  This means instance __dict__s are not dominant there.

I'll implement a POC of my new idea and compare it using Sphinx.
If you know another good *real application* that is easy to benchmark,
please tell me.

-- 
INADA Naoki  <songofacandy at gmail.com>

From lkb.teichmann at gmail.com  Fri Jun 24 03:41:36 2016
From: lkb.teichmann at gmail.com (Martin Teichmann)
Date: Fri, 24 Jun 2016 09:41:36 +0200
Subject: [Python-Dev] PEP 487: Simpler customization of class creation
Message-ID: <CAK9R32Sx1h6kDsGfPY+W_-mZEajHHNdcTS-2n3zH4GGttosjzA@mail.gmail.com>

Hi list,

just recently, I posted about the implementation of PEP 487. The
discussion quickly diverted to PEP 520, which happened to be
strongly related.

Hoping to get some comments about the rest of PEP 487, I took
out the part that is also in PEP 520. I attached the new version of
the PEP. The implementation can be found on the Python issue tracker:
http://bugs.python.org/issue27366

So PEP 487 is about simplifying the customization of class creation.
Currently, this is done via metaclasses, which are very powerful, but
often inflexible, as it is hard to combine two metaclasses. PEP 487
proposes a new metaclass which calls a method on all newly created
subclasses. This way, in order to customize the creation of subclasses,
one just needs to write a simple method.

An absolutely classic example for metaclasses is the need to tell descriptors
who they belong to. There are many large frameworks out there, e.g.
enthought's traits, IPython's traitlets, Django's forms and many more.
Their problem is: they're all fully incompatible. It's really hard to inherit
from two classes which have different metaclasses.

PEP 487 proposes to have one simple metaclass, which can do all those
frameworks need, making them all compatible. As an example, imagine
the framework has a generic descriptor called Integer, which describes,
well, an integer. Typically you use it like that:

    class MyClass(FrameworkBaseClass):
        my_value = Integer()

How does my_value know what it's called, and how it should put its data into
the object's __dict__? Well, this is what the framework's metaclass is for.
With PEP 487, a framework doesn't need to declare its own metaclass
anymore, but simply uses types.Object of PEP 487 as a base class:

    class FrameworkBaseClass(types.Object):
        def __init_subclass__(cls):
            super().__init_subclass__()
            for k, v in cls.__dict__.items():
                if isinstance(v, FrameworkDescriptorBase):
                    v.__set_owner__(cls, k)

and all the framework's descriptors know their name. And if another framework
should be used as well: no problem, they just work together easily.

Actually, the above example is so common that PEP 487 includes it directly:
a method __set_owner__ is called for every descriptor. That could make most
descriptors in frameworks work out of the box.
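
To show the intended effect with today's tools, here is a sketch that
emulates the proposed behaviour with an ordinary metaclass. The names
OwnerNotifyingMeta and Integer are invented for this example and only stand
in for the proposed types.Type / types.Object machinery; the hook name
__set_owner__ is the one from the PEP draft below.

```python
class OwnerNotifyingMeta(type):
    """Hypothetical stand-in for the metaclass proposed in PEP 487."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Tell every attribute that cares which class owns it, and under
        # which name it was defined.
        for attr_name, value in namespace.items():
            hook = getattr(type(value), "__set_owner__", None)
            if hook is not None:
                hook(value, cls, attr_name)


class Integer:
    """Toy descriptor that stores an int in the instance __dict__."""

    def __set_owner__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = int(value)


class FrameworkBaseClass(metaclass=OwnerNotifyingMeta):
    pass


class MyClass(FrameworkBaseClass):
    my_value = Integer()


obj = MyClass()
obj.my_value = "42"
print(MyClass.__dict__["my_value"].name, obj.my_value)
```

The descriptor never has to be told its name by hand, which is the part
that frameworks currently need their own metaclass for.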

So now I am hoping for comments!

Greetings

Martin

New version of the PEP follows:

PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann at gmail.com>,
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 5-Feb-2016, 24-Jun-2016
Replaces: 422

Abstract
========

Currently, customising class creation requires the use of a custom metaclass.
This custom metaclass then persists for the entire lifecycle of the class,
creating the potential for spurious metaclass conflicts.

This PEP proposes to instead support a wide range of customisation
scenarios through a new ``__init_subclass__`` hook in the class body,
and a hook to initialize attributes.

Those hooks should at first be defined in a metaclass in the standard
library, with the option that this metaclass eventually becomes the
default ``type`` metaclass.

The new mechanism should be easier to understand and use than
implementing a custom metaclass, and thus should provide a gentler
introduction to the full power of Python's metaclass machinery.

Background
==========

Metaclasses are a powerful tool to customize class creation. They have,
however, the problem that there is no automatic way to combine metaclasses.
If one wants to use two metaclasses for a class, a new metaclass combining
those two needs to be created, typically manually.

This need often comes as a surprise to a user: inheriting from two base
classes coming from two different libraries suddenly makes it necessary
to manually create a combined metaclass, even though the user is typically
not interested in those details of the libraries at all. This becomes
even worse if one library starts to use a metaclass where it did not
use one before. While the library itself continues to work perfectly,
all code combining those classes with classes from another library
suddenly fails.
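
The conflict described above can be reproduced in a few lines. This is
only a sketch: ``MetaA``, ``MetaB``, ``BaseA`` and ``BaseB`` are made-up
names standing in for two unrelated libraries, and ``MetaAB`` is the
combined metaclass a user would have to write by hand.

```python
class MetaA(type):
    """Metaclass used by a hypothetical library A."""


class MetaB(type):
    """Unrelated metaclass used by a hypothetical library B."""


class BaseA(metaclass=MetaA):
    pass


class BaseB(metaclass=MetaB):
    pass


# Inheriting from both bases fails, because neither metaclass is a
# subclass of the other.
try:
    class Combined(BaseA, BaseB):
        pass
except TypeError as exc:
    conflict = exc
else:
    conflict = None


# The manual workaround: a metaclass deriving from both.
class MetaAB(MetaA, MetaB):
    pass


class Combined(BaseA, BaseB, metaclass=MetaAB):
    pass
```

The user of the two libraries has to know about both metaclasses to write
``MetaAB``, which is exactly the kind of detail they would rather not
care about.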

Proposal
========

While there are many possible ways to use a metaclass, the vast majority
of use cases falls into just three categories: some initialization code
running after class creation, the initialization of descriptors, and
keeping the order in which class attributes were defined.

Those three use cases can easily be performed by just one metaclass. If
this metaclass is put into the standard library, and all libraries that
wish to customize class creation use this very metaclass, no combination
of metaclasses is necessary anymore. Said metaclass should live in the
``types`` module under the name ``Type``. This should hint to the user that
in the future, this metaclass may become the default metaclass ``type``.

The three use cases are achieved as follows:

1. The metaclass contains an ``__init_subclass__`` hook that initializes
   all subclasses of a given class,
2. the metaclass calls a ``__set_owner__`` hook on all the attributes
   (descriptors) defined in the class, and

For ease of use, a base class ``types.Object`` is defined, which uses said
metaclass and contains an empty stub for the hook described for use case 1.
It will eventually become the new replacement for the standard ``object``.

As an example, the first use case looks as follows::

   >>> class SpamBase(types.Object):
   ...    # this is implicitly a @classmethod
   ...    def __init_subclass__(cls, **kwargs):
   ...        cls.class_args = kwargs
   ...        # cls is passed implicitly; the base hook takes no keywords
   ...        super().__init_subclass__()

   >>> class Spam(SpamBase, a=1, b="b"):
   ...    pass

   >>> Spam.class_args
   {'a': 1, 'b': 'b'}

The base class ``types.Object`` contains an empty ``__init_subclass__``
method which serves as an endpoint for cooperative multiple inheritance.
Note that this method accepts no keyword arguments, so the more
specialized methods must between them consume every keyword argument
before the call reaches it.
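
As a rough illustration of this keyword handling, the sketch below assumes
an interpreter in which the hook is already built into ``object`` (as in
Python 3.6 and later), so plain classes stand in for ``types.Object``.
The names ``Registered``, ``Versioned`` and ``Plugin`` are invented for
the example. Each hook takes the keyword it understands and passes the
rest up the chain.

```python
class Registered:
    """Consumes the ``registry`` keyword and forwards the rest."""

    def __init_subclass__(cls, registry=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if registry is not None:
            registry.append(cls)


class Versioned:
    """Consumes the ``version`` keyword and forwards the rest."""

    def __init_subclass__(cls, version=0, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.version = version


plugins = []


class Plugin(Registered, Versioned, registry=plugins, version=2):
    pass


# A keyword nobody consumes reaches the endpoint, which rejects it.
leftover_error = None
try:
    class Broken(Registered, Versioned, colour="red"):
        pass
except TypeError as exc:
    leftover_error = exc
```

Because the endpoint rejects leftovers, a misspelled class keyword is
reported at class definition time instead of being silently ignored.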

This general proposal is not a new idea (it was first suggested for
inclusion in the language definition `more than 10 years ago`_, and a
similar mechanism has long been supported by `Zope's ExtensionClass`_),
but the situation has changed sufficiently in recent years that
the idea is worth reconsidering for inclusion.

The second part of the proposal adds a ``__set_owner__``
initializer for class attributes, especially if they are descriptors.
Descriptors are defined in the body of a
class, but they do not know anything about that class; they do not
even know the name they are accessed with. They do get to know their
owner once ``__get__`` is called, but still they do not know their
name. This is unfortunate: for example, they cannot put their
associated value into their object's ``__dict__`` under their name,
since they do not know that name.  This problem has been solved many
times, and is one of the most important reasons to have a metaclass in
a library. While it would be easy to implement such a mechanism using
the first part of the proposal, it makes sense to have one solution
for this problem for everyone.

To give an example of its usage, imagine a descriptor representing weakly
referenced values::

    import weakref

    class WeakAttribute:
        def __get__(self, instance, owner):
            # call the stored weak reference to get the referent back
            return instance.__dict__[self.name]()

        def __set__(self, instance, value):
            instance.__dict__[self.name] = weakref.ref(value)

        # this is the new initializer:
        def __set_owner__(self, owner, name):
            self.name = name

While this example looks trivial, it should be noted that so far
such an attribute could not be defined without the use of a metaclass.
And given that such a metaclass can make life very hard, this kind of
attribute does not exist yet.
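
To try out such a descriptor without the proposed machinery, the hook can
be emulated with a small metaclass. The following is only a sketch under
assumptions: ``SetOwnerMeta`` is a hypothetical stand-in for the proposed
``types.Type`` (it is not an existing API), and ``Node`` and ``Target``
are invented for the demonstration.

```python
import weakref


class SetOwnerMeta(type):
    """Hypothetical stand-in for the proposed metaclass."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Tell every attribute that defines the hook which class owns
        # it and under which name it was defined.
        for attr_name, value in namespace.items():
            hook = getattr(type(value), "__set_owner__", None)
            if hook is not None:
                hook(value, cls, attr_name)


class WeakAttribute:
    def __get__(self, instance, owner):
        # Call the stored weak reference to get the referent back.
        return instance.__dict__[self.name]()

    def __set__(self, instance, value):
        instance.__dict__[self.name] = weakref.ref(value)

    def __set_owner__(self, owner, name):
        self.name = name


class Node(metaclass=SetOwnerMeta):
    parent = WeakAttribute()


class Target:
    """Plain class whose instances can be weakly referenced."""


root = Target()
child = Node()
child.parent = root           # stored as a weak reference
same_object = child.parent    # dereferenced again on access
```

The descriptor never has to be told its name explicitly; it learns
``"parent"`` when ``Node`` is created.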

Key Benefits
============

Easier inheritance of definition time behaviour
-----------------------------------------------

Understanding Python's metaclasses requires a deep understanding of
the type system and the class construction process. This is legitimately
seen as challenging, due to the need to keep multiple moving parts (the code,
the metaclass hint, the actual metaclass, the class object, instances of the
class object) clearly distinct in your mind. Even when you know the rules,
it's still easy to make a mistake if you're not being extremely careful.

Understanding the proposed implicit class initialization hook only requires
ordinary method inheritance, which isn't quite as daunting a task. The new
hook provides a more gradual path towards understanding all of the phases
involved in the class definition process.
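
A short sketch of what "ordinary method inheritance" means here, again
assuming an interpreter where ``object`` already provides the hook
(Python 3.6 and later) instead of the proposed ``types.Object``. The
classes ``Shape``, ``Polygon`` and ``Triangle`` are invented for the
example.

```python
class Shape:
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Record every subclass under its lower-cased name.
        Shape.registry[cls.__name__.lower()] = cls


class Polygon(Shape):
    # Extending the hook works like extending any other method:
    # handle the extra argument, then delegate with super().
    def __init_subclass__(cls, sides, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.sides = sides


class Triangle(Polygon, sides=3):
    pass
```

``Polygon`` is registered by the hook it inherits from ``Shape``, while
``Triangle`` runs ``Polygon``'s extended hook first and then the
inherited one. No metaclass appears anywhere in the hierarchy.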

Reduced chance of metaclass conflicts
-------------------------------------

One of the big issues that makes library authors reluctant to use metaclasses
(even when they would be appropriate) is the risk of metaclass conflicts.
These occur whenever two unrelated metaclasses are used by the desired
parents of a class definition. This risk also makes it very difficult to
*add* a metaclass to a class that has previously been published without one.

By contrast, adding an ``__init_subclass__`` method to an existing type poses
a similar level of risk to adding an ``__init__`` method: technically, there
is a risk of breaking poorly implemented subclasses, but when that occurs,
it is recognised as a bug in the subclass rather than the library author
breaching backwards compatibility guarantees.

A path of introduction into Python
==================================