From flying-sheep at web.de  Tue Sep  1 10:00:21 2015
From: flying-sheep at web.de (Philipp A.)
Date: Tue, 01 Sep 2015 08:00:21 +0000
Subject: [Python-ideas] Add appdirs module to stdlib
Message-ID: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>

When defining a place for config files, cache files, and so on, people
usually hack around in an OS-dependent, misinformed, and therefore wrong way.

Thanks to the tempfile API we at least don't see people hardcoding /tmp/
too much.

There is a beautiful little module that does things right and is easy to
use: appdirs <https://pypi.python.org/pypi/appdirs>

I think this is a *really* good candidate for the stdlib since this
functionality is useful for everything that needs a cache or config (so not
only GUI and CLI applications, but also scripts that download and cache
stuff from the internet for faster re-running).

Probably we should build the API around pathlib, since I have not
touched os.path with a barge pole since pathlib came along.

I'll write a PEP about this soon :)
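
To make the idea concrete, here is a minimal sketch of what a pathlib-based
helper could look like. It is not the appdirs implementation: the name
sketch_cache_dir and its platform/env/home parameters are assumptions made
up for illustration, and the locations are only the commonly documented
conventions (XDG on Linux, ~/Library/Caches on OS X, %LOCALAPPDATA% on
Windows).

```python
import os
import sys
from pathlib import Path


def sketch_cache_dir(appname, platform=None, env=None, home=None):
    """Return a per-user cache directory for appname as a Path.

    Hypothetical helper for illustration only; not the appdirs API.
    """
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    if platform.startswith("win"):
        # Windows: prefer %LOCALAPPDATA%, else guess a location under home.
        base = Path(env.get("LOCALAPPDATA") or home / "AppData" / "Local")
    elif platform == "darwin":
        # OS X: caches live under ~/Library/Caches.
        base = home / "Library" / "Caches"
    else:
        # XDG Base Directory convention: $XDG_CACHE_HOME, else ~/.cache.
        base = Path(env.get("XDG_CACHE_HOME") or home / ".cache")
    return base / appname
```

Passing platform, env and home explicitly is only there to make the function
easy to try out; a real API would presumably read them from the running system.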

From ncoghlan at gmail.com  Tue Sep  1 10:56:17 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Sep 2015 18:56:17 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
Message-ID: <CADiSq7dbjWvqksH4jngX4TpCUpLN5KN5jrtCwwXvjnmM8HPy8g@mail.gmail.com>

On 1 September 2015 at 18:00, Philipp A. <flying-sheep at web.de> wrote:
> When defining a place for config files, cache files, and so on, people
> usually hack around in an OS-dependent, misinformed, and therefore wrong way.
>
> Thanks to the tempfile API we at least don't see people hardcoding /tmp/ too
> much.
>
> There is a beautiful little module that does things right and is easy to
> use: appdirs
>
> I think this is a *really* good candidate for the stdlib since this
> functionality is useful for everything that needs a cache or config (so not
> only GUI and CLI applications, but also scripts that download and cache
> stuff from the internet for faster re-running).
>
> Probably we should build the API around pathlib, since I have not
> touched os.path with a barge pole since pathlib came along.
>
> I'll write a PEP about this soon :)

This sounds like a reasonable idea to me, and we can point folks to
the original appdirs if they need a version-independent alternative.

Depending on the amount of code involved, we could potentially
consider providing this as an API *in* pathlib, rather than needing an
entire new module for the standard library version of it.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From bussonniermatthias at gmail.com  Tue Sep  1 11:09:26 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Tue, 1 Sep 2015 11:09:26 +0200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CADiSq7dbjWvqksH4jngX4TpCUpLN5KN5jrtCwwXvjnmM8HPy8g@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <CADiSq7dbjWvqksH4jngX4TpCUpLN5KN5jrtCwwXvjnmM8HPy8g@mail.gmail.com>
Message-ID: <25F8FDAA-ACEF-48BA-A8D9-DC0FCDD2F197@gmail.com>

> 
> This sounds like a reasonable idea to me, and we can point folks to
> the original appdirs if they need a version-independent alternative.
> 
> Depending on the amount of code involved, we could potentially
> consider providing this as an API *in* pathlib, rather than needing an
> entire new module for the standard library version of it.
> 
> Regards,
> Nick.

+1, 

If this gets into Python, it would be nice to have a `python -m <module> <appname>` command that shows the user the
config dirs. One of the most challenging questions we get from users is "where is my config/cache/...?",
and it's always hard to start the answer with "It depends on...". "Run this command to find out" works better.
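
A rough sketch of such a command, assuming a hypothetical module whose
main() function is wired up as its `python -m` entry point. The names
xdg_dirs and main are invented for this example, and only the XDG (Linux)
convention is covered to keep it short:

```python
import argparse
import os
from pathlib import Path


def xdg_dirs(appname, env=None, home=None):
    """Map 'config' and 'cache' to XDG-style paths (Linux convention only)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    config = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    cache = Path(env.get("XDG_CACHE_HOME") or home / ".cache")
    return {"config": config / appname, "cache": cache / appname}


def main(argv=None):
    """Print one 'kind: path' line per directory for the given app name."""
    parser = argparse.ArgumentParser(
        description="Show per-user directories for an application")
    parser.add_argument("appname")
    args = parser.parse_args(argv)
    for kind, path in sorted(xdg_dirs(args.appname).items()):
        print("{}: {}".format(kind, path))
```

Calling main(["myapp"]) prints the cache and config locations for "myapp",
which is the "run this command to find out" answer in miniature.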

-- 
M
 




From p.f.moore at gmail.com  Tue Sep  1 11:26:50 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 1 Sep 2015 10:26:50 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <25F8FDAA-ACEF-48BA-A8D9-DC0FCDD2F197@gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <CADiSq7dbjWvqksH4jngX4TpCUpLN5KN5jrtCwwXvjnmM8HPy8g@mail.gmail.com>
 <25F8FDAA-ACEF-48BA-A8D9-DC0FCDD2F197@gmail.com>
Message-ID: <CACac1F_=JOoW=taPeUxdsX1+ijTAuNLKEsZmiUc1GxKrmQWs8Q@mail.gmail.com>

On 1 September 2015 at 10:09, Matthias Bussonnier
<bussonniermatthias at gmail.com> wrote:
>> This sounds like a reasonable idea to me, and we can point folks to
>> the original appdirs if they need a version-independent alternative.
>>
>> Depending on the amount of code involved, we could potentially
>> consider providing this as an API *in* pathlib, rather than needing an
>> entire new module for the standard library version of it.
>>
>> Regards,
>> Nick.
>
> +1,
>
> If this gets into Python, it would be nice to have a `python -m <module> <appname>` command that shows the user the
> config dirs. One of the most challenging questions we get from users is "where is my config/cache/...?",
> and it's always hard to start the answer with "It depends on...". "Run this command to find out" works better.

+1 to all of the above.
Paul

From rosuav at gmail.com  Tue Sep  1 11:29:59 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 1 Sep 2015 19:29:59 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
Message-ID: <CAPTjJmp2FEqZo0=W3RvTzf9RcNK5QbbHNE9MNbv2Xakj2kUdGA@mail.gmail.com>

On Tue, Sep 1, 2015 at 6:00 PM, Philipp A. <flying-sheep at web.de> wrote:
> When defining a place for config files, cache files, and so on, people
> usually hack around in an OS-dependent, misinformed, and therefore wrong way.
>
> There is a beautiful little module that does things right and is easy to
> use: appdirs
>
> I think this is a *really* good candidate for the stdlib...

Who maintains appdirs? Is s/he willing to maintain it on the stdlib's
release schedule? If so, I'd be +1 on this; Python has a strong
precedent for papering over OS differences and providing a consistent
platform.

ChrisA

From abarnert at yahoo.com  Tue Sep  1 11:42:50 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 1 Sep 2015 02:42:50 -0700
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
Message-ID: <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>

On Sep 1, 2015, at 01:00, Philipp A. <flying-sheep at web.de> wrote:
> 
> When defining a place for config files, cache files, and so on, people usually hack around in an OS-dependent, misinformed, and therefore wrong way.
> 
> Thanks to the tempfile API we at least don't see people hardcoding /tmp/ too much.
> 
> There is a beautiful little module that does things right and is easy to use: appdirs

Is appdirs compatible with the OS X recommendations (as required by the App Store)? (Apple only gives you cache and app data directories; prefs are supposed to use the NSUserDefaults API or emulate the file names and formats properly, and you have to be sensitive to the sandbox.)

If so, definitely +1, because that's a pain to do with anything but Qt (or of course PyObjC). If not, -0.5, because making it easier to do it wrong is probably not beneficial, even if that's what many *nix apps end up writing a lot of code to get wrong on Mac...

From ian.team.python at gmail.com  Tue Sep  1 12:24:15 2015
From: ian.team.python at gmail.com (Ian)
Date: Tue, 1 Sep 2015 20:24:15 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond comments
Message-ID: <55E57CCF.9030909@gmail.com>

mypy currently inspects the comment on the line of first assignment for 
the variables to be type hinted.

It is logical that at some time python language will add support to 
allow these type hints to move from comments to the code as has happened 
for 'def' signatures.

One logical syntax would be to move from

i = 1           # Infer type int for i

to

i:int = 1       # no comment needed, but does not look attractive


The first question that arises is 'is the type inference legal for the 
additional uses?'. Having a 'second use' flagged by warning or error by 
either an external typechecker or even the language itself could pick up 
on accidental reuse of a name, but in practice accidentally creating a 
new variable through a typo can be more common. In python today the 
first use is the same as every other, so this change just does not feel 
comfortable. The other question is 'what about globals and nonlocals?'. 
Currently globals and nonlocals need a 'global' or 'nonlocal' statement 
to allow assignment, but what if these values are not assigned in scope? 
What if we allowed
global i:int

or

nonlocal i:int

and even

local i:int

Permitting a new keyword 'local' to me might bring far more symmetry 
between different cases.

It would also allow type hinting to be collected near the function 
definition and keep the type hinting clear of the main code.

Use of the 'local' keyword in the global namespace could indicate a 
value not accessible in other namespaces.

Personally I would like to go even further and allow some syntax to 
allow (or disable) flagging the use of new variables without type 
hinting as possible typos

I have a syntax in mind, but the idea is the discussion point, not the 
specific syntax.

Possibly what is here already is too much of a change of direction to 
consider for ideas already in progress?



From rosuav at gmail.com  Tue Sep  1 13:19:29 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 1 Sep 2015 21:19:29 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond
	comments
In-Reply-To: <55E57CCF.9030909@gmail.com>
References: <55E57CCF.9030909@gmail.com>
Message-ID: <CAPTjJmqF5RHvy7yvv60f-YYcp_WvstCC+2zz3WEJZsxhm8nR4g@mail.gmail.com>

On Tue, Sep 1, 2015 at 8:24 PM, Ian <ian.team.python at gmail.com> wrote:
> mypy currently inspects the comment on the line of first assignment for the
> variables to be type hinted.
>
> It is logical that at some time python language will add support to allow
> these type hints to move from comments to the code as has happened for 'def'
> signatures.

Potential problem: Function annotations are supported all the way back
to Python 3.0, but any new syntax would be 3.6+ only. That's going to
severely limit its value for quite some time. That doesn't mean new
syntax can't be added (otherwise none ever would), but the bar is that
much higher - you'll need an extremely compelling justification.

> The other question is 'what about globals and nonlocals?'.  Currently
> globals and nonlocals need a 'global' or 'nonlocal' statement to allow
> assignment, but what if these values are not assigned in scope?

Not sure what you're talking about here. If they're not assigned in
this scope, then presumably they have the same value they had from
some other scope. You shouldn't need to declare that "len" is a
function, inside every function that calls it. Any type hints should
go where it's assigned, and nowhere else.

> What if we allowed
> global i:int
>
> or
>
> nonlocal i:int
>
> and even
>
> local i:int
>
> Permitting a new keyword 'local' to me might bring far more symmetry between
> different cases.

Hey, if you want C, you know where to find it :)

> Use of the 'local' keyword in the global namespace could indicate a value
> not accessible in other namespaces.

I'm not sure what "not accessible" would mean. If someone imports your
module, s/he gains access to all your globals. Do you mean that it's
"not intended for external access" (normally notated with a single
leading underscore)? Or is this a new feature - some way of preventing
other modules from using these? That might be useful, but that's a
completely separate proposal.

> Personally I would like to go even further and allow some syntax to allow
> (or disable) flagging the use of new variables without type hinting as
> possible typos

If you're serious about wanting all your variables to be declared,
then I think you want a language other than Python. There are such
languages around (and maybe even compiling to Python byte-code, I'm
not sure), but Python isn't built that way. Type hinting is NOT
variable declaration, and never will be. (Though that's famous last
words, I know, and I'm not the BDFL or even anywhere close to that. If
someone pulls up this email in ten years and laughs in my face, so be
it. It'd not be the first time I've been utterly confidently wrong!)

ChrisA

From steve at pearwood.info  Tue Sep  1 15:03:54 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 1 Sep 2015 23:03:54 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond
	comments
In-Reply-To: <CAPTjJmqF5RHvy7yvv60f-YYcp_WvstCC+2zz3WEJZsxhm8nR4g@mail.gmail.com>
References: <55E57CCF.9030909@gmail.com>
 <CAPTjJmqF5RHvy7yvv60f-YYcp_WvstCC+2zz3WEJZsxhm8nR4g@mail.gmail.com>
Message-ID: <20150901130354.GE19373@ando.pearwood.info>

On Tue, Sep 01, 2015 at 09:19:29PM +1000, Chris Angelico wrote:

> On Tue, Sep 1, 2015 at 8:24 PM, Ian <ian.team.python at gmail.com> wrote:
> > 
> > mypy currently inspects the comment on the line of first assignment for the
> > variables to be type hinted.
> >
> > It is logical that at some time python language will add support to allow
> > these type hints to move from comments to the code as has happened for 'def'
> > signatures.
> 
> Potential problem: Function annotations are supported all the way back
> to Python 3.0, but any new syntax would be 3.6+ only. That's going to
> severely limit its value for quite some time. That doesn't mean new
> syntax can't be added (otherwise none ever would), but the bar is that
> much higher - you'll need an extremely compelling justification.

PEP 484 says:

"No first-class syntax support for explicitly marking variables as being 
of a specific type is added by this PEP. To help with type inference in 
complex cases, a comment of the following format may be used: ..."

https://www.python.org/dev/peps/pep-0484/

I recall that in the discussions prior to the PEP, I got the strong 
impression that Guido was open to the concept of annotating variables in 
principle, but didn't think it was very important (for the most part, 
the type checker should be able to infer the variable type), and he 
didn't want to delay the PEP for the sake of agreement on a variable 
declaration syntax when a simple comment will do the job.

So in principle, if we agree that type declarations for variables should 
look like (let's say) `str s = some_function(arg)` then the syntax may 
be added in the future, but it's a low priority.


> > The other question is 'what about globals and nonlocals?'.  Currently
> > globals and nonlocals need a 'global' or 'nonlocal' statement to allow
> > assignment, but what if these values are not assigned in scope?
> 
> Not sure what you're talking about here. If they're not assigned in
> this scope, then presumably they have the same value they had from
> some other scope.

But they will be assigned in the scope, otherwise there's no need to 
declare them global.

def spam(*args):
    global eggs
    eggs = len(args)
    process(something, eggs)


That's a case where the type-checker should be able to infer that eggs 
will be an int. But what if the type inference engine cannot work that 
out? The developer may choose to add a hint.

    eggs = len(args)  # type:int

will work according to PEP 484 (although, I guess that's a quality of 
implementation issue for the actual type checker). Or we could steal 
syntax from some other language and make it "official" that type 
checkers have to look at this:

    eggs:int  # (Pascal, Swift, Ada, F#, Scala)

    int eggs  # (Java, C, Perl6)

    eggs int  # (Go)

    eggs as int  # (RealBasic)


Hence, for example:

    global eggs:int

    cheese:int, ham:str = 23, "foo"

A big question would be, what runtime effect (if any) would this have? 
If the default Python compiler ignored the type hint at both 
compile-time and run-time, it would be hard to justify making it syntax.

But perhaps the current namespace could get a magic variable

__annotations__ = {name: hint}

similar to the __annotations__ attribute of functions. Again, the 
default compiler would simply record the annotation and ignore it, the 
same as for functions, leaving any actual type-checking to third-party 
tools.
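
For comparison, this is how the existing function-level __annotations__
behaves today; the interpreter records the hints and does nothing else with
them. The namespace_annotations dict at the end is purely hypothetical, a
stand-in for what a recorded per-namespace mapping might contain:

```python
def spam(count: int, label: str = "x") -> str:
    return label * count


# The hints are stored on the function object as a plain dict.
print(spam.__annotations__)

# Nothing is enforced at run time: a list is accepted where str was hinted.
print(spam(2, [1]))

# Hypothetical analogue for variables: name -> hint, recorded and ignored.
namespace_annotations = {"eggs": int, "ham": str}
```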


[...]
> > Use of the 'local' keyword in the global namespace could indicate a value
> > not accessible in other namespaces.

That won't work without a *major* change to Python's design. Currently, 
module namespaces are regular dicts, and there is no way to prevent 
others from looking up names in that dict/namespace. If you (Ian) want 
to change that, you should raise it as a completely separate PEP.
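
A small illustration of that point, using types.ModuleType as a stand-in
for an imported module (the module name and attribute names are made up):

```python
import types

mod = types.ModuleType("big_module")  # stand-in for an imported module
mod._cache = {}                       # "private" by convention only
mod.public = 1

namespace = vars(mod)                 # the module's namespace
print(type(namespace))                # a regular dict
print("_cache" in namespace)          # the underscore name is still reachable
```

The leading underscore is a hint to readers and tools; nothing in the
namespace itself stops other code from looking the name up.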



-- 
Steve

From rosuav at gmail.com  Tue Sep  1 15:13:51 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 1 Sep 2015 23:13:51 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond
	comments
In-Reply-To: <20150901130354.GE19373@ando.pearwood.info>
References: <55E57CCF.9030909@gmail.com>
 <CAPTjJmqF5RHvy7yvv60f-YYcp_WvstCC+2zz3WEJZsxhm8nR4g@mail.gmail.com>
 <20150901130354.GE19373@ando.pearwood.info>
Message-ID: <CAPTjJmr7Y-M0DbnL_=NxhorX+zQbqCXfRZLUfW2Zz9RqH3rUUw@mail.gmail.com>

On Tue, Sep 1, 2015 at 11:03 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Tue, Sep 01, 2015 at 09:19:29PM +1000, Chris Angelico wrote:
>
>> On Tue, Sep 1, 2015 at 8:24 PM, Ian <ian.team.python at gmail.com> wrote:
>> >
>> > mypy currently inspects the comment on the line of first assignment for the
>> > variables to be type hinted.
>> >
>> > It is logical that at some time python language will add support to allow
>> > these type hints to move from comments to the code as has happened for 'def'
>> > signatures.
>>
>> Potential problem: Function annotations are supported all the way back
>> to Python 3.0, but any new syntax would be 3.6+ only. That's going to
>> severely limit its value for quite some time. That doesn't mean new
>> syntax can't be added (otherwise none ever would), but the bar is that
>> much higher - you'll need an extremely compelling justification.
>
> PEP 484 says:
>
> "No first-class syntax support for explicitly marking variables as being
> of a specific type is added by this PEP. To help with type inference in
> complex cases, a comment of the following format may be used: ..."
>
> https://www.python.org/dev/peps/pep-0484/
>
> I recall that in the discussions prior to the PEP, I got the strong
> impression that Guido was open to the concept of annotating variables in
> principle, but didn't think it was very important (for the most part,
> the type checker should be able to infer the variable type), and he
> didn't want to delay the PEP for the sake of agreement on a variable
> declaration syntax when a simple comment will do the job.
>
> So in principle, if we agree that type declarations for variables should
> look like (let's say) `str s = some_function(arg)` then the syntax may
> be added in the future, but it's a low priority.

Right, it's low priority, and a non-backward-compatible one.
Backporting typing.py to any 3.x Python will make all the annotations
"succeed" (given that success, at run time, doesn't require any sort
of actual checking); it's not possible to backport a syntax change.
It's like using 'yield from' for coroutines - it instantly stops you
from running on anything older than 3.3. Maybe that'll be worthwhile,
but the complaint that "comments are ugly" isn't enough justification
IMO.

If there were some serious run-time value for these annotations, then
I could see more reason for adding them. At the moment, though, I'm
distinctly -1.

ChrisA

From ian.team.python at gmail.com  Tue Sep  1 17:01:29 2015
From: ian.team.python at gmail.com (Ian)
Date: Wed, 2 Sep 2015 01:01:29 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond
	comments
Message-ID: <55E5BDC9.7020807@gmail.com>

Chris Angelico wrote:

" > It is logical that at some time python language will add support to 
allow
 > these type hints to move from comments to the code as has happened for 'def'
 > signatures.

Potential problem: Function annotations are supported all the way back
to Python 3.0, but any new syntax would be 3.6+ only. That's going to
severely limit its value for quite some time. That doesn't mean new
syntax can't be added (otherwise none ever would), but the bar is that
much higher - you'll need an extremely compelling justification. "

My intent must not have been clear.  I am not suggesting changing 
function annotations.  I think function annotations as they are 
represent an addition that has been well thought out and is a very 
useful step which extends what is possible. PEP 484, as introduced in 
3.5, allows this to be taken further.

I am suggesting building on PEP 484 with a complementary extension for 
variables.

If extensions are made in the same manner as function annotations, then 
the actual python code simply has hints added.  Generating warnings or 
other steps is the domain of separate off-line checkers.

I feel it is clear that at some time an extension to allow type hints 
for variables, complementing current function annotations, will be added.
I am just providing food for thought on how annotations for variables 
can be added.





 > The other question is 'what about globals and nonlocals?'.  Currently
 > globals and nonlocals need a 'global' or 'nonlocal' statement to allow
 > assignment, but what if these values are not assigned in scope?

"Not sure what you're talking about here. If they're not assigned in
this scope, then presumably they have the same value they had from
some other scope. You shouldn't need to declare that "len" is a
function, inside every function that calls it. Any type hints should
go where it's assigned, and nowhere else. "

These are hints.  Not a 'need'.    The type hints may be desired in the 
code referencing the 'globals' or 'nonlocals',
but not desired in the original context.   The idea is to allow this, 
NOT to require or need a declaration.

Hope this helps clarify what I am trying to suggest.

 > What if we allowed
 > global i:int
 >
 > or
 >
 > nonlocal i:int
 >
 > and even
 >
 > local i:int
 >
 > Permitting a new keyword 'local' to me might bring far more symmetry between
 > different cases.

Hey, if you want C, you know where to find it :)

 > Use of the 'local' keyword in the global namespace could indicate a value
 > not accessible in other namespaces.

"I'm not sure what "not accessible" would mean. If someone imports your
module, s/he gains access to all your globals. Do you mean that it's
"not intended for external access" (normally notated with a single
leading underscore)? Or is this a new feature - some way of preventing
other modules from using these? That might be useful, but that's a
completely separate proposal. "

Good point, the single leading underscore already does what I was 
thinking of. I never think of using it in this specific case; I tend to 
associate it with hinting that an identifier is for internal use within 
a class, not with getting warnings when a module's globals are used from 
outside that module.

 > Personally I would like to go even further and allow some syntax to allow
 > (or disable) flagging the use of new variables without type hinting as
 > possible typos

"If you're serious about wanting all your variables to be declared,
then I think you want a language other than Python. There are such
languages around (and maybe even compiling to Python byte-code, I'm
not sure), but Python isn't built that way. Type hinting is NOT
variable declaration, and never will be. (Though that's famous last
words, I know, and I'm not the BDFL or even anywhere close to that. If
someone pulls up this email in ten years and laughs in my face, so be
it. It'd not be the first time I've been utterly confidently wrong!) "

No, I am not serious about wanting all variables to be declared under 
normal circumstances.
Again, as you say, this is about hinting and getting warnings.  I think 
there are circumstances where hinting may be sufficiently useful that 
getting a warning for a missing hint would be desirable.

This is not a suggestion to change python.  But a suggestion to allow 
for specific situations without a change of how things normally happen.

Thank you for taking the time to comment.  It is appreciated and I hope 
I have been able to use your feedback to clarify.


From gokoproject at gmail.com  Tue Sep  1 18:19:21 2015
From: gokoproject at gmail.com (John Wong)
Date: Tue, 1 Sep 2015 12:19:21 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
Message-ID: <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>

But is appdirs only useful if you are running something that's more toward
system package / desktop application? A lot of projects today create their
own directory to save data, many use $HOME/DOTCUSTOM_DIR. So the use case
of appdirs should be addressed.

On Tue, Sep 1, 2015 at 5:42 AM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> On Sep 1, 2015, at 01:00, Philipp A. <flying-sheep at web.de> wrote:
>
> When defining a place for config files, cache files, and so on, people
> usually hack around in an OS-dependent, misinformed, and therefore wrong way.
>
> Thanks to the tempfile API we at least don't see people hardcoding /tmp/
> too much.
>
> There is a beautiful little module that does things right and is easy to
> use: appdirs <https://pypi.python.org/pypi/appdirs>
>
>
> Is appdirs compatible with the OS X recommendations (as required by the
> App Store)? (Apple only gives you cache and app data directories; prefs are
> supposed to use the NSUserDefaults API or emulate the file names and formats
> properly, and you have to be sensitive to the sandbox.)
>
> If so, definitely +1, because that's a pain to do with anything but Qt (or
> of course PyObjC). If not, -0.5, because making it easier to do it wrong is
> probably not beneficial, even if that's what many *nix apps end up writing
> a lot of code to get wrong on Mac...
>

From rosuav at gmail.com  Tue Sep  1 18:58:49 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 2 Sep 2015 02:58:49 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond
	comments
In-Reply-To: <55E5BDC9.7020807@gmail.com>
References: <55E5BDC9.7020807@gmail.com>
Message-ID: <CAPTjJmrR-4HQnOd-3RbFxb+MFFjv8dp2rk5NVH_-0s2GJZTcmg@mail.gmail.com>

On Wed, Sep 2, 2015 at 1:01 AM, Ian <ian.team.python at gmail.com> wrote:
> Chris Angelico wrote:
> Potential problem: Function annotations are supported all the way back
> to Python 3.0, but any new syntax would be 3.6+ only. That's going to
> severely limit its value for quite some time. That doesn't mean new
> syntax can't be added (otherwise none ever would), but the bar is that
> much higher - you'll need an extremely compelling justification. "
>
> My intent must not have been clear.  I am not suggesting changing function
> annotations.  I think function annotations as they are represent an addition
> that has been well thought out and is a very useful step which extends what
> is possible in a very useful way. Pep 484 as introduced in 3.5 allows this
> to be taken further.

I understand that, but the difference here is that PEP 484 adds
meaning to something that's already been syntactically valid. If you
pull up a Python 3.1 and run this code, it will work:

def do_nothing() -> None:
    pass

The special names List and Optional and so on are not available by
default, but they're imported from typing.py anyway; it's easy enough
to make sure that typing.py works on older Pythons (maybe as a pypi
dependency).

In contrast, you're suggesting completely new syntax. That means that
any program that uses them will simply *fail to run* on any Python
older than their introduction (same as those using function
annotations can't run on Python 2). As a general rule, the bar for new
syntax is a lot higher than the bar for a new function, module, etc,
that can be implemented with existing syntax. It's certainly possible;
you just need to convince everyone that it's worth adding syntax for.

>> The other question is 'what about globals and nonlocals?'.  Currently
>> globals and nonlocals need a 'global' or 'nonlocal' statement to allow
>> assignment, but what if these values are not assigned in scope?
>
> "Not sure what you're talking about here. If they're not assigned in
> this scope, then presumably they have the same value they had from
> some other scope. You shouldn't need to declare that "len" is a
> function, inside every function that calls it. Any type hints should
> go where it's assigned, and nowhere else. "
>
> These are hints.  Not a 'need'.    The type hints may be desired in the code
> referencing the 'globals' or 'nonlocals',
> but not desired in the original context.   The idea is to allow this, NOT to
> require or need a declaration.

Okay, I think I understand you here. It's for cases like this:

# big_module.py
_cache = {}

# way further down

def function_with_annotations(thing: str) -> str:
    if thing not in _cache:
        _cache[thing] = frobnicate(thing)
    return _cache[thing]

Inside this brand new function, you want to tell the type hinter that
_cache is a dict, even though you don't declare it, don't assign to
it, or anything like that. That's reasonable, but it isn't all that
common a use case; generally, if you're adding code to a module
somewhere, you can edit other places in the module to add those type
hints, or else you can simply forgo the type hint for that one thing.

> Hope this helps clarify what I am trying to suggest.

Yes, thank you. I think I get what you're saying there.

>> Use of the 'local' keyword in the global namespace could indicate a value
>> not accessible in other namespaces.
>
> "I'm not sure what "not accessible" would mean. If someone imports your
> module, s/he gains access to all your globals. Do you mean that it's
> "not intended for external access" (normally notated with a single
> leading underscore)? Or is this a new feature - some way of preventing
> other modules from using these? That might be useful, but that's a
> completely separate proposal. "
>
> Good point, the single _ already does what I was thinking. I never think of
> using it in this specific case. I tend to associate it with hinting that an
> identifier is for internal use within a class.  Not to get warnings about
> use from outside the global namespace of globals.

Yeah, it comes to the same thing though. I'm not sure if any linters
would pick up on "module._identifier" usages, but code reviewers
certainly could.

>> Personally I would like to go even further and allow some syntax to allow
>> (or disable) flagging the use of new variables without type hinting as
>> possible typos
>
> "If you're serious about wanting all your variables to be declared,
> then I think you want a language other than Python. There are such
> languages around (and maybe even compiling to Python byte-code, I'm
> not sure), but Python isn't built that way. Type hinting is NOT
> variable declaration, and never will be. (Though that's famous last
> words, I know, and I'm not the BDFL or even anywhere close to that. If
> someone pulls up this email in ten years and laughs in my face, so be
> it. It'd not be the first time I've been utterly confidently wrong!) "
>
> No, I am not serious about wanting all variables to be declared under normal
> circumstances.

Even under abnormal circumstances, requiring all variables to be
declared would not be Python's way. There are plenty of ways of
handling the global vs local problem. PHP says "declare all your
globals, apart from functions and magic stuff the compiler gives you
for free"; C says "declare all your locals, anything undeclared will
be searched for in progressively larger scopes - everything has to be
declared somewhere"; Python says "declare all the globals that you
assign to, anything else assigned to is local, and anything not
assigned to is searched for at run time". I don't know of *any*
language that says "declare everything", and it certainly wouldn't be
Python. Even "declare everything you assign to" would be unnecessary
overhead.
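
To illustrate the Python rule in that list, here is a small sketch
(all names are made up for the example):

```python
counter = 0


def read_only():
    # Never assigned here, so the name is looked up at run time
    # (module globals, then builtins).
    return counter


def bump():
    global counter  # declared, because this function assigns to it
    counter += 1


def broken():
    # Assigned without a 'global' statement, so 'counter' is local to
    # this function and the read on the right-hand side fails.
    counter = counter + 1


bump()
try:
    broken()
except UnboundLocalError as exc:
    print("broken() raised", type(exc).__name__)
```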

Still -1 on this proposal.

ChrisA

From storchaka at gmail.com  Tue Sep  1 19:55:29 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 1 Sep 2015 20:55:29 +0300
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
Message-ID: <ms4oqi$mer$1@ger.gmane.org>

On 01.09.15 11:00, Philipp A. wrote:
> When defining a place for config files, cache files, and so on, people
> usually hack around in an OS-dependent, misinformed, and therefore wrong way.
>
> Thanks to the tempfile API we at least don't see people hardcoding /tmp/
> too much.
>
> There is a beautiful little module that does things right and is easy to
> use: appdirs <https://pypi.python.org/pypi/appdirs>
>
> I think this is a *really* good candidate for the stdlib since this
> functionality is useful for everything that needs a cache or config (so
> not only GUI and CLI applications, but also scripts that download and
> cache stuff from the internet for faster re-running)
>
> probably we should build the API around pathlib, since i found myself
> not touching os.path with a barge pole since pathlib exists.
>
> i'll write a PEP about this soon :)


site_data_dir() returns a string. It contains multiple paths separated 
with path delimiter if multipath=True. I think that a function that 
returns a list of paths, including user dir, would be more helpful and 
Pythonic.
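
A rough sketch of the kind of interface meant here. The function names
are hypothetical and not part of appdirs; the input is assumed to be a
delimiter-joined string like the one described above.

```python
import os


def split_multipath(value, pathsep=os.pathsep):
    """Split a delimiter-joined string of directories into a list."""
    # Empty entries (e.g. from a trailing delimiter) are dropped.
    return [p for p in value.split(pathsep) if p]


def data_dirs(user_dir, site_dirs_value, pathsep=os.pathsep):
    """Return the user dir first, followed by the site-wide dirs."""
    return [user_dir] + split_multipath(site_dirs_value, pathsep)


for path in data_dirs("/home/me/.local/share/App",
                      "/usr/local/share/App:/usr/share/App",
                      pathsep=":"):
    print(path)
```

A caller can then iterate over the result directly instead of
splitting the string itself.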

See also PyXDG (http://www.freedesktop.org/wiki/Software/pyxdg/).



From p.f.moore at gmail.com  Tue Sep  1 23:04:41 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 1 Sep 2015 22:04:41 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
Message-ID: <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>

On 1 September 2015 at 17:19, John Wong <gokoproject at gmail.com> wrote:
> But is appdirs only useful if you are running something that's more toward
> system package / desktop application? A lot of projects today create their
> own directory to save data, many use $HOME/DOTCUSTOM_DIR. So the use case of
> appdirs should be addressed.

But that is not appropriate on Windows. Appdirs gives the above on
Unix, but %APPDATA%\Appname on Windows, which conforms properly to
platform standards.

Paul

From donald at stufft.io  Tue Sep  1 23:12:28 2015
From: donald at stufft.io (Donald Stufft)
Date: Tue, 1 Sep 2015 17:12:28 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
Message-ID: <etPan.55e614bc.4a8cf4f8.24e@Draupnir.home>

On September 1, 2015 at 5:05:14 PM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 1 September 2015 at 17:19, John Wong wrote:
> > But is appdirs only useful if you are running something that's more toward
> > system package / desktop application? A lot of projects today create their
> > own directory to save data, many use $HOME/DOTCUSTOM_DIR. So the use case of
> > appdirs should be addressed.
>  
> But that is not appropriate on Windows. Appdirs gives the above on
> Unix, but %APPDATA%\Appname on Windows, which conforms properly to
> platform standards.
>  
>

I forget why, but we forked appdirs when we added it to pip because of something about how it treated Windows, I think. Appdirs is also opinionated in situations where there isn't a platform standard, so we'd want to make sure that we agree with those opinions on all platforms.

I'm +1 on it though.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From p.f.moore at gmail.com  Tue Sep  1 23:15:18 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 1 Sep 2015 22:15:18 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <etPan.55e614bc.4a8cf4f8.24e@Draupnir.home>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <etPan.55e614bc.4a8cf4f8.24e@Draupnir.home>
Message-ID: <CACac1F_uyfKcTuo3ukPy9EeyNv-CEbAxK+ry8wiLVKEqrrr_sg@mail.gmail.com>

On 1 September 2015 at 22:12, Donald Stufft <donald at stufft.io> wrote:
> I forget why, but we forked appdirs when we added it to pip because of something about how it treated Windows, I think. Appdirs is also opinionated in situations where there isn't a platform standard, so we'd want to make sure that we agree with those opinions on all platforms.

Certainly. I think the key point here is "let's have something in the
stdlib that makes deciding where your app stores its files work
correctly by default".

Paul

From njs at pobox.com  Tue Sep  1 23:22:23 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 1 Sep 2015 14:22:23 -0700
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
Message-ID: <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>

On Tue, Sep 1, 2015 at 2:42 AM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> On Sep 1, 2015, at 01:00, Philipp A. <flying-sheep at web.de> wrote:
>
> When defining a place for config files, cache files, and so on, people
> usually hack around in an OS-dependent, misinformed, and therefore wrong way.
>
> Thanks to the tempfile API we at least don't see people hardcoding /tmp/ too
> much.
>
> There is a beautiful little module that does things right and is easy to
> use: appdirs
>
>
> Is appdirs compatible with the OS X recommendations (as required by the App
> Store)? Apple only gives you cache and app data directories; prefs are
> supposed to use the NSUserDefaults API or emulate the file names and formats
> properly, and you have to be sensitive to the sandbox.

No, AFAICT it doesn't get this right -- it just hard-codes the OS X
directories. It also didn't quite implement the XDG spec correctly
(there's some fallback behavior you're supposed to do if the magic
envvars don't make sense that it skips -- very unusual that this will
matter). And Windows I'm not sure about -- the logic in appdirs looked
reasonable to me when I was reviewing this a few months ago, but there
seem to be a bunch of semi-contradictory standards and so it's hard to
know what's even "correct" in the tricky cases.

All of this is probably as much an argument *for* providing the correct
functionality as a standard thing as it is against, but any PEP here
probably needs to be thorough about citing the research to show that
it's actually getting the various platform standards correct.

What makes it particularly difficult is that if you "fix a bug" in a
library like appdirs, so that it starts suddenly returning different
results on some computer somewhere, then what it looks like to the end
user is that their data/settings/whatever have suddenly evaporated and
whatever disk space was being used for caches never gets cleaned up
and so forth. Generally when applications change how they compute
these directories, they also include tricky migration logic to check
both the old and new names, move stuff over if needed, but I'm not
sure how a low-level library like this can support that usefully...
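
For reference, the application-level migration logic described above
might look roughly like the following. This is only a sketch under
simple assumptions (one app, whole-directory moves, no partial or
concurrent migrations), and resolve_with_migration is a made-up name:

```python
import shutil
from pathlib import Path


def resolve_with_migration(new_dir, old_dirs):
    """Return new_dir, moving data over from an older location if needed."""
    new_dir = Path(new_dir)
    if new_dir.exists():
        # Already on the new scheme; nothing to do.
        return new_dir
    for old in map(Path, old_dirs):
        if old.is_dir():
            # Found data under a previous scheme: move it across.
            new_dir.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(old), str(new_dir))
            return new_dir
    # Fresh install: just create the new location.
    new_dir.mkdir(parents=True, exist_ok=True)
    return new_dir
```

Note that the list of old locations has to come from the application
(or from the library remembering its own past answers), which is
exactly the part that is hard to supply from a low-level library.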

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From yselivanov.ml at gmail.com  Tue Sep  1 23:25:26 2015
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 1 Sep 2015 17:25:26 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>
Message-ID: <55E617C6.9000708@gmail.com>



On 2015-09-01 5:22 PM, Nathaniel Smith wrote:
[..]
> All of this is probably as much an argument *for* providing the correct
> functionality as a standard thing as it is against, but any PEP here
> probably needs to be thorough about citing the research to show that
> it's actually getting the various platform standards correct.
>
> What makes it particularly difficult is that if you "fix a bug" in a
> library like appdirs, so that it starts suddenly returning different
> results on some computer somewhere, then what it looks like to the end
> user is that their data/settings/whatever have suddenly evaporated and
> whatever disk space was being used for caches never gets cleaned up
> and so forth. Generally when applications change how they compute
> these directories, they also include tricky migration logic to check
> both the old and new names, move stuff over if needed, but I'm not
> sure how a low-level library like this can support that usefully...

+1 on all points.  We really need a PEP for this kind of
functionality in the standard library.

Yury

From abarnert at yahoo.com  Tue Sep  1 23:47:22 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 1 Sep 2015 14:47:22 -0700
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
Message-ID: <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>

Responses below, but first, another issue:

Things like app data and prefs aren't a single directory. XDG has a notion of a search path rather than a single directory; Windows has a notion of separate search domains; OS X makes things as fun as possible by having both. For writing a new file, you're usually fine just writing to the first path in the default domain (as long as it exists or you can create it), but for reading files you're supposed to look in /etc or All Users or whatever if it's not found there. Most cross-platform wrappers I've used in the past didn't deal with this automatically, and a lot of them didn't even make it easy to do manually.
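
A minimal sketch of the read side of this, assuming the caller already
has the ordered search path (most specific directory first). The name
locate is hypothetical, not an existing appdirs function:

```python
from pathlib import Path


def locate(name, search_dirs):
    """Return the first existing file called name in search_dirs, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            # Earlier (more specific) directories win over later ones.
            return candidate
    return None
```

Writing would then normally target only the first entry of the same
list, which is the asymmetry described above.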

On Sep 1, 2015, at 14:19, Nathaniel Smith <njs at vorpus.org> wrote:
> 
> On Sep 1, 2015 02:45, "Andrew Barnert via Python-ideas"
> <python-ideas at python.org> wrote:
>> 
>>> On Sep 1, 2015, at 01:00, Philipp A. <flying-sheep at web.de> wrote:
>>> 
>>> When defining a place for config files, cache files, and so on, people usually hack around in an OS-dependent, misinformed, and therefore wrong way.
>>> 
>>> Thanks to the tempfile API we at least don't see people hardcoding /tmp/ too much.
>>> 
>>> There is a beautiful little module that does things right and is easy to use: appdirs
>> 
>> 
>> Is appdirs compatible with the OS X recommendations (as required by the App Store)? Apple only gives you cache and app data directories; prefs are supposed to use the NSUserDefaults API or emulate the file names and formats properly, and you have to be sensitive to the sandbox.
> 
> No, AFAICT it doesn't get this right -- it just hard-codes the OS X
> directories.

The biggest problem with most of the cross-platform libraries I've seen is that they assume there is a prefs directory, and on OS X, that really isn't true. If your app explicitly opens the exact same file that NSUserDefaults would have opened, you're breaking the rules. (Since the mandatory sandbox went into effect, this usually doesn't get you rejected from the App Store anymore, but before that it did.) And taking a quick look at appdirs, it has a user_config_dir that seems to mean exactly that. So, how can a stdlib library handle that?

Meanwhile, it looks like appdirs expects you to give it an app name and company name to construct the paths. What happens if you give it names that don't match the ones in your bundle? You're opening files that belong to another app. Which is again violating Apple's rules.

One more thing: I don't know if it's guaranteed that the right way of doing things on OS X (whether via Cocoa or CoreFoundation) won't spawn a background thread for you. After all, the APIs can talk to the sandbox service and sometimes even the iCloud service. Is that a problem for something in the stdlib?

> It also didn't quite implement the XDG spec correctly
> (there's some fallback behavior you're supposed to do if the magic
> envvars don't make sense that it skips -- very unusual that this will
> matter). And Windows I'm not sure about -- the logic in appdirs looked
> reasonable to me when I was reviewing this a few months ago, but there
> seem to be a bunch of semi-contradictory standards and so it's hard to
> know what's even "correct" in the tricky cases.
> 
> All of this is probably as much an argument *for* providing the
> functionality as a standard thing as it is against,
> but any PEP here
> probably needs to be thorough about citing the research to show that
> it's actually getting the various platform standards correct.
> 
> What makes it particularly difficult is that if you "fix a bug" in a
> library like appdirs, so that it starts suddenly returning different
> results on some computer somewhere, then what it looks like to the end
> user is that their data/settings/whatever have suddenly evaporated and
> whatever disk space was being used for caches never gets cleaned up
> and so forth. Generally when applications change how they compute
> these directories, they also include tricky migration logic to check
> both the old and new names, move stuff over if needed, but I'm not
> sure how a low-level library like this can support that usefully...

And that's especially true in the case of Apple's standards, which also include specific rules about how you're supposed to do such a migration, and doing so requires stuff that can't be done from entirely inside the code.

From p.f.moore at gmail.com  Wed Sep  2 01:05:20 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 2 Sep 2015 00:05:20 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
Message-ID: <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>

On 1 September 2015 at 22:47, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> Things like app data and prefs aren't a single directory. XDG has a notion of a search path rather than a single directory; Windows has a notion of separate search domains; OS X makes things as fun as possible by having both. For writing a new file, you're usually fine just writing to the first path in the default domain (as long as it exists or you can create it), but for reading files you're supposed to look in /etc or All Users or whatever if it's not found there. Most cross-platform wrappers I've used in the past didn't deal with this automatically, and a lot of them didn't even make it easy to do manually.

This is a fair point. But it's also worth noting that the current
state of affairs for many apps is to just bung stuff in ~/whatever.
While appdirs may not get things totally right, at least it improves
things. And if it (or something similar) were in the stdlib, it would
at least provide a level of uniformity.

So, in my view:

1. We should have something that provides the functionality of appdirs
in the stdlib.
2. It probably needs a PEP to get the corner cases right.
3. The behaviour of appdirs is a good baseline default - even if it
isn't 100% compliant with platform standards it'll be better than what
someone unfamiliar with the platform will invent.
4. We shouldn't abandon the idea just because a perfect solution is
unattainable.

There are complex cases to consider (search paths, for example, and
even worse how search paths interact with the app writing config data
rather than just reading it, or migration when a scheme changes). The
PEP should at least mention these cases, but it's not unreasonable to
simply declare them out of scope of the module (most applications
don't need anything this complex).

Paul

From rosuav at gmail.com  Wed Sep  2 02:47:43 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 2 Sep 2015 10:47:43 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
Message-ID: <CAPTjJmraS-cYcmccfFKhyFjnb+KPwSA8KCyBrJqUCiu1k4XAAw@mail.gmail.com>

On Wed, Sep 2, 2015 at 9:05 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> So, in my view:
>
> 1. We should have something that provides the functionality of appdirs
> in the stdlib.
> 2. It probably needs a PEP to get the corner cases right.
> 3. The behaviour of appdirs is a good baseline default - even if it
> isn't 100% compliant with platform standards it'll be better than what
> someone unfamiliar with the platform will invent.
> 4. We shouldn't abandon the idea just because a perfect solution is
> unattainable.

+1

> There are complex cases to consider (search paths, for example, and
> even worse how search paths interact with the app writing config data
> rather than just reading it, or migration when a scheme changes). The
> PEP should at least mention these cases, but it's not unreasonable to
> simply declare them out of scope of the module (most applications
> don't need anything this complex).

Might be worth starting with something simple: ask for one directory
(the default or most obvious place), or ask for a full list of
plausible directories to try. Then a config manager could be built on
top of that which would handle write location selection, migration,
etc, and that would be a separate proposal that makes use of the
appdata module for the cross-platform stuff.
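
Something like the following, as a very rough sketch that only covers
the XDG case (defaults of ~/.config and /etc/xdg as given in the XDG
Base Directory spec). The function names are hypothetical, and other
platforms are deliberately left out:

```python
import os
from pathlib import Path


def config_dir(appname, environ=os.environ):
    """The single most obvious place for this app's config (XDG only)."""
    # An unset or empty variable falls back to the spec's default.
    base = environ.get("XDG_CONFIG_HOME") or os.path.join(
        os.path.expanduser("~"), ".config")
    return Path(base) / appname


def config_dirs(appname, environ=os.environ):
    """All plausible config directories, most specific first (XDG only)."""
    system = environ.get("XDG_CONFIG_DIRS") or "/etc/xdg"
    bases = [p for p in system.split(":") if p]
    return [config_dir(appname, environ)] + [Path(b) / appname for b in bases]
```

The environ parameter is only there so the behaviour can be exercised
without touching the real environment; a config manager would sit on
top of these two calls.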

ChrisA

From ncoghlan at gmail.com  Wed Sep  2 06:01:25 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 2 Sep 2015 14:01:25 +1000
Subject: [Python-ideas] ideas for type hints for variable: beyond
	comments
In-Reply-To: <20150901130354.GE19373@ando.pearwood.info>
References: <55E57CCF.9030909@gmail.com>
 <CAPTjJmqF5RHvy7yvv60f-YYcp_WvstCC+2zz3WEJZsxhm8nR4g@mail.gmail.com>
 <20150901130354.GE19373@ando.pearwood.info>
Message-ID: <CADiSq7fqLCpJZ0mN=JcGL5NkukNCAurKy+uJ2PaSA8J3O+=Ccg@mail.gmail.com>

On 1 September 2015 at 23:03, Steven D'Aprano <steve at pearwood.info> wrote:
> PEP 484 says:
>
> "No first-class syntax support for explicitly marking variables as being
> of a specific type is added by this PEP. To help with type inference in
> complex cases, a comment of the following format may be used: ..."
>
> https://www.python.org/dev/peps/pep-0484/
>
> I recall that in the discussions prior to the PEP, I got the strong
> impression that Guido was open to the concept of annotating variables in
> principle, but didn't think it was very important (for the most part,
> the type checker should be able to infer the variable type), and he
> didn't want to delay the PEP for the sake of agreement on a variable
> declaration syntax when a simple comment will do the job.
>
> So in principle, if we agree that type declarations for variables should
> look like (let's say) `str s = some_function(arg)` then the syntax may
> be added in the future, but it's a low priority.

The main case where it's potentially useful is when we want to
initialise a variable to None, but constrain permitted rebindings
(from a typechecker's perspective) to a particular type. When we
initialise a variable to an actual value, then type inference can
usually handle it.

Using the typing module as it exists today, I believe this should work
for that purpose (although I haven't actually tried it with mypy or
any other typechecker):

    from typing import TypeVar, Generic, Optional

    T = TypeVar("T")

    class Var(Generic[T]):
        def __new__(cls, value:Optional[T] = None) -> Optional[T]:
            return None

    i = Var[int]()

Unless I've misunderstood the likely outcome of type inference
completely, the value of i here will be None, but its inferred type
would be Optional[int]. At runtime, you could still rebind "i" to
whatever you want, but a typechecker would complain if it was rebound
to anything other than None or an integer.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Sep  2 06:05:12 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 2 Sep 2015 14:05:12 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>
Message-ID: <CADiSq7eDozfAMXd11DgZ1c6Sfg9zH3zvLpd-+HNCn8MFs7bcaQ@mail.gmail.com>

On 2 September 2015 at 07:22, Nathaniel Smith <njs at pobox.com> wrote:
> All of this is probably as much an argument *for* providing the correct
> functionality as a standard thing as it is against, but any PEP here
> probably needs to be thorough about citing the research to show that
> it's actually getting the various platform standards correct.

We'd also want to state up front that non-compliance with the relevant
platform standards *is* considered a bug, so it may change in
maintenance releases in order to support changes in the platform
standards.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From me at the-compiler.org  Wed Sep  2 06:18:46 2015
From: me at the-compiler.org (Florian Bruhin)
Date: Wed, 2 Sep 2015 06:18:46 +0200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
Message-ID: <20150902041846.GH10941@tonks>

* Philipp A. <flying-sheep at web.de> [2015-09-01 08:00:21 +0000]:
> There is a beautiful little module that does things right and is easy to
> use: appdirs <https://pypi.python.org/pypi/appdirs>
> 
> I think this is a *really* good candidate for the stdlib since this
> functionality is useful for everything that needs a cache or config (so not
> only GUI and CLI applications, but also scripts that download and cache
> stuff from the internet for faster re-running)

+1 from me as well.

Another source of inspiration might be the QStandardPaths class from
the Qt library (which is C++, but I'm using QStandardPaths in my PyQt
application):

http://doc.qt.io/qt-5/qstandardpaths.html

They have a QStandardPaths::writableLocation which gives you exactly
one path to write to, a QStandardPaths::standardLocations which gives
you a list of paths, and a QStandardPaths::locate which locates your
config based on a name.

They also had the issue with changing standards such as Local/Roaming
appdata on Windows, and solved it by introducing more enum values to
the StandardLocation enum:

    QStandardPaths::DataLocation
        Returns the same value as AppLocalDataLocation. This
        enumeration value is deprecated. Using AppDataLocation is
        preferable since on Windows, the roaming path is recommended.

    QStandardPaths::AppDataLocation
        Returns a directory location where persistent application data
        can be stored. This is an application-specific directory. To
        obtain a path to store data to be shared with other
        applications, use QStandardPaths::GenericDataLocation. The
        returned path is never empty. On the Windows operating system,
        this returns the roaming path. This enum value was added in Qt
        5.4.

    QStandardPaths::AppLocalDataLocation
        Returns the local settings path on the Windows operating
        system.  On all other platforms, it returns the same value as
        AppDataLocation. This enum value was added in Qt 5.4.

Florian

-- 
http://www.the-compiler.org | me at the-compiler.org (Mail/XMPP)
   GPG: 916E B0C8 FD55 A072 | http://the-compiler.org/pubkey.asc
         I love long mails! | http://email.is-not-s.ms/

From tjreedy at udel.edu  Wed Sep  2 06:31:33 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 Sep 2015 00:31:33 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
Message-ID: <ms5u3j$d5p$1@ger.gmane.org>

On 9/1/2015 5:04 PM, Paul Moore wrote:
> On 1 September 2015 at 17:19, John Wong <gokoproject at gmail.com> wrote:
>> But is appdirs only useful if you are running something that's more toward
>> system package / desktop application? A lot of projects today create their
>> own directory to save data, many use $HOME/DOTCUSTOM_DIR. So the use case of
>> appdirs should be addressed.
>
> But that is not appropriate on Windows. Appdirs gives the above on
> Unix, but %APPDATA%\Appname on Windows, which conforms properly to
> platform standards.

The problem with Windows is that the standard is to put things in an 
invisible directory, which makes it difficult to tell people, especially 
non-experts, to edit a file in the directory.

Games that expect people to edit .ini files put them in the game directory.

-- 
Terry Jan Reedy


From robertc at robertcollins.net  Wed Sep  2 06:38:05 2015
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 2 Sep 2015 16:38:05 +1200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
Message-ID: <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>

On 2 September 2015 at 11:05, Paul Moore <p.f.moore at gmail.com> wrote:
> On 1 September 2015 at 22:47, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
>> Things like app data and prefs aren't a single directory. XDG has a notion of a search path rather than a single directory; Windows has a notion of separate search domains; OS X makes things as fun as possible by having both. For writing a new file, you're usually fine just writing to the first path in the default domain (as long as it exists or you can create it), but for reading files you're supposed to look in /etc or All Users or whatever if it's not found there. Most cross-platform wrappers I've used in the past didn't deal with this automatically, and a lot of them didn't even make it easy to do manually.
>
> This is a fair point. But it's also worth noting that the current
> state of affairs for many apps is to just bung stuff in ~/whatever.
> While appdirs may not get things totally right, at least it improves
> things. And if it (or something similar) were in the stdlib, it would
> at least provide a level of uniformity.

In about 5 years' time. Maybe.

The adoption curve for something that works on all Pythons is able to
be much much higher than that for something which is only in the
stdlib 6 months (or more) from now. Unless we do a rolling backport of
it.

And if we're going to do that... why? Why not just provide a
documentation link to the thing and say 'pip install this' and/or
'setuptools install_require this'.

-Rob


-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

From random832 at fastmail.us  Wed Sep  2 07:21:56 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 02 Sep 2015 01:21:56 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms5u3j$d5p$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
Message-ID: <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>

On Wed, Sep 2, 2015, at 00:31, Terry Reedy wrote:
> The problem with Windows is that the standard is to put things in an 
> invisible directory, which makes it difficult to tell people, especially 
> non-experts, to edit a file in the directory.

I'm not sure you _should_ be telling non-experts to find a file to edit.
Why doesn't your app provide a UI for it, or at least a button that pops
up the file in the text editor (Minecraft, for example, has a button to
pop up the folder you're expected to drop downloaded texture packs
into), if editing it as free form text is something that end users
_really_ should be expected to do?

Plus, it's not really any harder to find than a "Hidden" directory
beginning with a dot - in either case you have to either type the name
or enable showing hidden files, and neither platform makes this easier
than the other.

From ncoghlan at gmail.com  Wed Sep  2 07:57:18 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 2 Sep 2015 15:57:18 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
Message-ID: <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>

On 2 September 2015 at 14:38, Robert Collins <robertc at robertcollins.net> wrote:
> On 2 September 2015 at 11:05, Paul Moore <p.f.moore at gmail.com> wrote:
>> This is a fair point. But it's also worth noting that the current
>> state of affairs for many apps is to just bung stuff in ~/whatever.
>> While appdirs may not get things totally right, at least it improves
>> things. And if it (or something similar) were in the stdlib, it would
>> at least provide a level of uniformity.
>
> In about 5 years' time. Maybe.
>
> The adoption curve for something that works on all Pythons is able to
> be much much higher than that for something which is only in the
> stdlib 6 months (or more) from now. Unless we do a rolling backport of
> it.
>
> And if we're going to do that... why? Why not just provide a
> documentation link to the thing and say 'pip install this' and/or
> 'setuptools install_require this'.

My perspective on that has been shifting in recent years, to the point
where I view this kind of standard library module primarily as a tool
to help folks learn how the computing ecosystem works in practice.
Consider PEP 3144, for example, and the differences between ipaddress,
and its inspiration, ipaddr. The standard library one is significantly
stricter about correctly using networking terminology, so you can
actually study the ipaddress module as a way of learning how IP
addressing works. The original ipaddr, by contrast, is only easy to
use if you already know all the terms, and can say "Oh, OK, they're
using that term somewhat loosely, but I can see what they mean".

I think this is a case where a similar approach would make sense -
like ipaddr before it, appdirs represents an actual cross-version
production module, put together by a company (in this case ActiveState
rather than Google) for their own use, but made available to the wider
Python community through PyPI. As such, we know its feature coverage
is likely to be good, but the API design is likely to be optimised for
experienced developers that already understand the concepts, and just
want a library to handle the specific technical details.

A standard library API would shift the emphasis slightly, and take
into account the perspective of the *beginning* programmer, who may
have only first learned about the command line and files and
directories in the course of learning Python, and is now venturing
into the realm of designing full desktop (and mobile?) applications.

Regards,
Nick.

P.S. The statistics module is another example of being able to use the
Python standard library as a teaching tool to help learn something else: for
many production use cases, you'd reach for something more
sophisticated (like the NumPy stack), but if what you're aiming to do
is to learn (or teach) basic statistical concepts, it covers the
essentials.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From p.f.moore at gmail.com  Wed Sep  2 09:46:48 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 2 Sep 2015 08:46:48 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
Message-ID: <CACac1F-sepO_1MA74uh2YMg9L3s4MnYjrKw3B8NGMRxz6xFAOw@mail.gmail.com>

On 2 September 2015 at 06:57, Nick Coghlan <ncoghlan at gmail.com> wrote:
> A standard library API would shift the emphasis slightly, and take
> into account the perspective of the *beginning* programmer, who may
> have only first learned about the command line and files and
> directories in the course of learning Python, and is now venturing
> into the realm of designing full desktop (and mobile?) applications.

Agreed. In this case, the focus should be on providing "correct"
cross-platform defaults, and assisting (and encouraging) users to
understand the choices and constraints applicable to other platforms,
which may not be relevant on theirs.

This may feel like a nuisance for single-platform developers (a
Unix-only program doesn't need to care about local or roaming
preferences, because they don't have the concept of a domain account),
so it's important to get the defaults right so that people who don't
need to care, don't have to. But the options should be accessible, so
that people can learn to make the right choices (for example, pip puts
preferences in the roaming profile, but the cache in the local
profile, because you don't want to bloat the roaming profile with a
cache, but you do want the user's preferences to be available on all
the machines they use).
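
A minimal sketch of the roaming/local split described above, assuming
the conventional Windows environment variables APPDATA (roaming profile)
and LOCALAPPDATA (local profile). The function name windows_app_dirs and
the "Cache" subdirectory are made up for illustration, and reading the
environment is a simplification; a real implementation might prefer to
ask the shell API instead.

```python
from pathlib import PureWindowsPath


def windows_app_dirs(appname, env):
    """Return hypothetical config and cache locations for *appname*.

    *env* is a mapping such as os.environ; it is passed in explicitly so
    the sketch can be exercised on any platform.
    """
    roaming = PureWindowsPath(env["APPDATA"])      # follows the user between machines
    local = PureWindowsPath(env["LOCALAPPDATA"])   # stays on this machine
    return {
        "config": roaming / appname,          # small, worth roaming
        "cache": local / appname / "Cache",   # bulky, cheap to rebuild
    }
```

A single-platform developer would only ever call this with os.environ;
the point is that the two kinds of data end up under different roots
without the caller having to know why.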

Paul

From p.f.moore at gmail.com  Wed Sep  2 09:52:06 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 2 Sep 2015 08:52:06 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms5u3j$d5p$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
Message-ID: <CACac1F8B5av5QNoR9aRcT=B0G64XA9TUsVRXR8CBrzqKiihVvQ@mail.gmail.com>

On 2 September 2015 at 05:31, Terry Reedy <tjreedy at udel.edu> wrote:
> The problem with Windows is that the standard is to put things in an
> invisible directory, which makes it difficult to tell people, especially
> non-experts, to edit a file in the directory.

As you say, that's an issue with Windows, not with the library (if
indeed it *is* an issue with Windows - the users I've dealt with don't
have huge problems with things in appdata, although they would usually
expect the program to offer a config dialog rather than making them
edit the file by hand - but that's off-topic for this thread).

> Games that expect people to edit .ini files put them in the game directory.

That's a common choice on Windows, certainly (relating to historical
issues where the official standards used to be a lot more
user-hostile). It may well be that the appdirs module should offer an
"app-local" (or "portable", if you prefer) scheme in addition to the
default scheme.

Paul

From flying-sheep at web.de  Wed Sep  2 10:10:29 2015
From: flying-sheep at web.de (Philipp A.)
Date: Wed, 02 Sep 2015 08:10:29 +0000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
Message-ID: <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>

OK, original poster here.

thanks for the positive reception!

so there are some issues which i'll address

   1. *appdirs doesn't get everything right*
      in order not to have inconsistencies, we could abandon the "appdirs as
      a fallback" approach and create our own API which returns more correct
      results. this also frees us to make other changes where we see fit, e.g.
      see next point

   2. *there is a search path instead of a single directory sometimes*
      appdirs provides a multipath keyword argument which returns a
      (colon-delimited) "list" of paths. we should provide sth. similar, only
      with python lists. maybe also convenience functions for getting a list of
      all files matching some subpath, similar to
      [d / subpath for d in site_data_dir(appname, appauthor, all=True)
       if (d / subpath).exists()]

   3. *some platforms don't have some specific subset of the functionality
      (e.g. no config dir on OSX)*
      provide a warning in the stdlib docs about that and refer to another
      settings API. unfortunately i can't find a good NSDefaults library in
      python right now. i think the API should still return some directory that
      works for storing settings on OSX in case people want to use
      platform-independent config files.

   4. *it's hard to tell newbies where their files are*
      unfortunately that's how things are. the existing standards are
      confusing, but should be honored nonetheless. we can provide an api like
      python -m stddirs [--config|--data|...] <companyname> <appname>
      which prints a selection of standard directories.

did i miss anything?

and yeah: i also think that something that's a little opinionated in case
of ambiguities is vastly better than everyone hacking their own impromptu
platform-dependent alternative.

~/.appname stopped being right on linux long ago and never was right on
other platforms, which we should teach people.

best, phil


From p.f.moore at gmail.com  Wed Sep  2 10:19:41 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 2 Sep 2015 09:19:41 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
Message-ID: <CACac1F8hAvjyfZhvA3aM6sM8T26ts8ZePVTTkNF5C4JhQTB3Tg@mail.gmail.com>

On 2 September 2015 at 05:38, Robert Collins <robertc at robertcollins.net> wrote:
>> This is a fair point. But it's also worth noting that the current
>> state of affairs for many apps is to just bung stuff in ~/whatever.
>> While appdirs may not get things totally right, at least it improves
>> things. And if it (or something similar) were in the stdlib, it would
>> at least provide a level of uniformity.
>
> In about 5 years' time. Maybe.

Most of the programs I write are for Python 3.4 at the moment, and
will be for Python 3.5 as soon as it comes out. They won't be
distributed outside of my group at work, if indeed anyone but me uses
them. They won't care about older versions of Python. I don't even
care about cross-platform. But I do want to set up a cache directory,
or save some settings, without thinking *too* hard about where to put
them. I don't want to have an external dependency because I'm forever
running these things from whatever virtualenv I currently have active
(sloppy work habits, I know, but that's the point - not everything is
a nicely structured development project).

For programs that need to support older versions of Python, a backport
should be trivial - I can't see that this would need any particularly
"modern" Python features.

Paul

From cs at zip.com.au  Wed Sep  2 10:27:53 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Wed, 2 Sep 2015 18:27:53 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <55E617C6.9000708@gmail.com>
References: <55E617C6.9000708@gmail.com>
Message-ID: <20150902082753.GA10839@cskk.homeip.net>

On 01Sep2015 17:25, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>What makes it particularly difficult is that if you "fix a bug" in a
>>library like appdirs, so that it starts suddenly returning different
>>results on some computer somewhere, then what it looks like to the end
>>user is that their data/settings/whatever have suddenly evaporated [...]
>>Generally when applications change how they compute
>>these directories, they also include tricky migration logic to check
>>both the old and new names, move stuff over if needed, but I'm not
>>sure how a low-level library like this can support that usefully...

If it were me I'd want to keep a little state recording what choices were made 
for these things and whether those were the defaults. Then you can check that 
on next run: if the choice was a default and the default no longer matches, 
migrate.

Of course, if the default location for this state changes...

Not pretending I'm offering a comprehensive solution,
I remain,
Cameron Simpson <cs at zip.com.au>

"waste cycles drawing trendy 3D junk"   - Mac Eudora v3 config option

From mal at egenix.com  Wed Sep  2 11:30:33 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 02 Sep 2015 11:30:33 +0200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>	<E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>	<CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>	<796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>	<CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>	<CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>	<CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
Message-ID: <55E6C1B9.2030506@egenix.com>

On 02.09.2015 10:10, Philipp A. wrote:
> ~/.appname stopped being right on linux long ago and never was right on
> other platforms, which we should teach people.

Looking at my home dir on Linux, there doesn't seem to be
one standard, but rather a whole set of them and the good old
~/.appname is still a popular one (e.g. pip and ansible from
Python land still use it; as do many other non-Python applications
such as ncftp, emacs, svn, git, gpg, etc.).

~/.config/ does get some use, but mostly for GUI applications,
not so much for command line ones.

~/.local/lib/ only appears to be used by Python :-)
~/.local/share/ is mostly used by desktops to register application
shortcuts

~/.cache/ is being used by just a handful of tools, pip being one
of them.

appdirs seems to rely on the XDG Base Directory Specification for a lot
of things on Linux (http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html).
That's probably fine for desktop GUI apps (the standard was apparently
built for this use case), but doesn't apply at all for command line tools
or applications like daemons or services which don't interact with the
desktop, e.g. you typically won't find global config files for
command line tools under /etc/xdg/, but instead under /etc/.
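
For reference, the config lookup that the XDG spec describes can be
sketched as follows. The function name and its parameters are
illustrative only; the variable names and their defaults (~/.config and
/etc/xdg) are taken from the specification. This is not a copy of what
appdirs does.

```python
import posixpath


def xdg_config_search_path(appname, env, home):
    """List the directories to search for appname's config, most specific first."""
    # Unset or empty variables fall back to the defaults given in the spec.
    config_home = env.get("XDG_CONFIG_HOME") or posixpath.join(home, ".config")
    config_dirs = env.get("XDG_CONFIG_DIRS") or "/etc/xdg"
    bases = [config_home] + [d for d in config_dirs.split(":") if d]
    return [posixpath.join(base, appname) for base in bases]
```

A command line tool that wants its global config under /etc/ would have
to append that directory itself, since the spec never yields it.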

For Windows, the CSIDL_* values have also been replaced with
new ones under FOLDERID_* (the APIs have also evolved):

Values:
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762494%28v=vs.85%29.aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/dd378457%28v=vs.85%29.aspx

APIs:
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762181%28v=vs.85%29.aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762188%28v=vs.85%29.aspx

BTW: I wonder why the Windows functions in appdirs don't use
the environment for much easier access to e.g. APPDATA and USERPROFILE.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 02 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> mxODBC Plone/Zope Database Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2015-08-27: Released eGenix mx Base 3.2.9 ...     http://egenix.com/go83

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From donald at stufft.io  Wed Sep  2 12:58:07 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 2 Sep 2015 06:58:07 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <55E6C1B9.2030506@egenix.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
 <55E6C1B9.2030506@egenix.com>
Message-ID: <B1710F36-FE72-4EE2-BE1A-3D295D050D33@stufft.io>


> On Sep 2, 2015, at 5:30 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> 
> Looking at my home dir on Linux, there doesn't seem to be
> one standard, but rather a whole set of them and the good old
> ~/.appname is still a popular one (e.g. pip and ansible from
> Python land still use it; as do many other non-Python applications
> such as ncftp, emacs, svn, git, gpg, etc.).

Just to be clear: pip supports and prefers the XDG spec, the old locations are just still supported for backwards compatibility. Though we did deviate from XDG in that we use /etc/ instead of /etc/xdg/. 

From mal at egenix.com  Wed Sep  2 13:14:37 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 02 Sep 2015 13:14:37 +0200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <B1710F36-FE72-4EE2-BE1A-3D295D050D33@stufft.io>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>	<E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>	<CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>	<796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>	<CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>	<CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>	<CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>	<CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>	<55E6C1B9.2030506@egenix.com>
 <B1710F36-FE72-4EE2-BE1A-3D295D050D33@stufft.io>
Message-ID: <55E6DA1D.1030105@egenix.com>

I just commented on the ticket that references this discussion:

http://bugs.python.org/issue7175

In essence, Python already has an installation scheme which is
defined in sysconfig.py (see the top of the file) and has had
this ever since distutils got added to the stdlib.

It just lacks explicit entries for "config" and "cache" files,
so adding those would be more in line with existing practice than
coming up with yet another standard, e.g. for posix_user:

    'posix_user': {
        'stdlib': '{userbase}/lib/python{py_version_short}',
        'platstdlib': '{userbase}/lib/python{py_version_short}',
        'purelib': '{userbase}/lib/python{py_version_short}/site-packages',
        'platlib': '{userbase}/lib/python{py_version_short}/site-packages',
        'include': '{userbase}/include/python{py_version_short}',
        'scripts': '{userbase}/bin',
        'config': '{userbase}/etc',
        'cache': '{userbase}/var',
        'data': '{userbase}',
        },

({userbase} is set by looking at PYTHONUSERBASE and defaults to
~/.local/)
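
As a rough illustration of how this would look from user code today:
sysconfig.get_paths() and site.getuserbase() already exist, while the
"config" and "cache" keys do not, so the sketch below adds them by hand.
The function name user_scheme_paths is hypothetical, and the etc/var
layout simply mirrors the proposal above.

```python
import os
import site
import sysconfig


def user_scheme_paths():
    """Return the posix_user scheme plus the two proposed entries."""
    base = site.getuserbase()  # honours PYTHONUSERBASE, else ~/.local
    paths = dict(sysconfig.get_paths("posix_user"))
    # Only fill these in if sysconfig does not provide them itself.
    paths.setdefault("config", os.path.join(base, "etc"))
    paths.setdefault("cache", os.path.join(base, "var"))
    return paths
```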

-- 
Marc-Andre Lemburg
eGenix.com


From steve.dower at python.org  Wed Sep  2 15:19:15 2015
From: steve.dower at python.org (Steve Dower)
Date: Wed, 2 Sep 2015 06:19:15 -0700
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <55E6C1B9.2030506@egenix.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
 <55E6C1B9.2030506@egenix.com>
Message-ID: <E1ZX7wd-0007rJ-R1@se2-syd.hostedmail.net.au>

"BTW: I wonder why the Windows functions in appdirs don't use
the environment for much easier access to e.g. APPDATA and USERPROFILE."

The environment can become corrupted more easily and it's really difficult to diagnose that from a bug report that says "my config is corrupt". I assume appdirs is using ctypes now, but I'd be happy to add the call into the os module to avoid that.
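
For readers unfamiliar with the ctypes route, a sketch of asking the
shell directly rather than reading the environment is below. It uses the
older SHGetFolderPathW API with CSIDL_* constants; the function name
get_win_folder is illustrative, and whether appdirs does exactly this is
an assumption, not something confirmed in this thread.

```python
import ctypes
import sys

CSIDL_APPDATA = 0x001A        # roaming application data
CSIDL_LOCAL_APPDATA = 0x001C  # local (non-roaming) application data
MAX_PATH = 260


def get_win_folder(csidl):
    """Ask the Windows shell for a known folder; return None elsewhere."""
    if sys.platform != "win32":
        return None
    buf = ctypes.create_unicode_buffer(MAX_PATH)
    # SHGetFolderPathW(hwnd, csidl, hToken, dwFlags, pszPath); 0 means S_OK.
    result = ctypes.windll.shell32.SHGetFolderPathW(None, csidl, None, 0, buf)
    return buf.value if result == 0 else None
```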

Cheers,
Steve

Top-posted from my Windows Phone

-----Original Message-----
From: "M.-A. Lemburg" <mal at egenix.com>
Sent: ?9/?2/?2015 2:31
To: "Philipp A." <flying-sheep at web.de>; "Nick Coghlan" <ncoghlan at gmail.com>; "Robert Collins" <robertc at robertcollins.net>
Cc: "Nathaniel Smith" <njs at vorpus.org>; "python-ideas at python.org" <python-ideas at python.org>
Subject: Re: [Python-ideas] Add appdirs module to stdlib

On 02.09.2015 10:10, Philipp A. wrote:
> ~/.appname stopped being right on linux long ago and never was right on
> other platforms, which we should teach people.

Looking at the my home dir on Linux, there doesn't seem to be
one standard, but rather a whole set of them and the good old
~/.appname is still a popular one (e.g. pip and ansible from
Python land still use it; as do many other non-Python applications
such as ncftp, emacs, svn, git, gpg, etc.).

~/.config/ does get some use, but mostly for GUI applications,
not so much for command line ones.

~/.local/lib/ only appears to be used by Python :-)
~/.local/share/ is mostly used by desktops to register application
shortcuts

~/.cache/ is being used by just a handful of tools, pip being one
of them.

appdirs seems to rely on the XDG Base Directory Specification for a lot
of things on Linux (http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html).
That's probably fine for desktop GUI apps (the standard was apparently
built for this use case), but doesn't apply at all for command line tools
or applications like daemons or services which don't interact with the
desktop, e.g. you typically won't find global config files for
command line tools under /etc/xdg/, but instead under /etc/.

For Windows, the CSIDL_* values have also been replaced with
new ones under FOLDERID_* (the APIs have also evolved):

Values:
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762494%28v=vs.85%29.aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/dd378457%28v=vs.85%29.aspx

APIs:
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762181%28v=vs.85%29.aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762188%28v=vs.85%29.aspx

BTW: I wonder why the Windows functions in appdirs don't use
the environment for much easier access to e.g. APPDATA and USERPROFILE.

-- 
Marc-Andre Lemburg
eGenix.com


From random832 at fastmail.us  Wed Sep  2 15:36:27 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 02 Sep 2015 09:36:27 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
Message-ID: <1441200987.2912104.372747265.63D1E82F@webmail.messagingengine.com>



On Wed, Sep 2, 2015, at 04:10, Philipp A. wrote:
>    *there is a search path instead of a single directory sometimes*
>    appdirs provides a multipath keyword argument which returns a
>    (colon-delimited) "list" of paths.

Why isn't it a real list? Paths on any platform can contain a colon;
paths on Windows commonly do.
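
To illustrate the concern, here is a small sketch. The sample paths are invented and split_multipath is a hypothetical helper, not an appdirs function. It shows what happens when a colon-joined string has to be split back into paths:

```python
def split_multipath(joined, sep=":"):
    """Naively split a separator-joined string of paths (hypothetical helper)."""
    return joined.split(sep)


# Typical Unix-style search path: splitting on ':' works as intended.
unix_joined = "/usr/local/share/app:/usr/share/app"
print(split_multipath(unix_joined))

# Two Windows paths joined with ':' cannot be recovered, because the
# drive letters contain the separator themselves.
windows_joined = r"C:\ProgramData\app:D:\shared\app"
print(split_multipath(windows_joined))

# A real list has no such ambiguity.
as_list = [r"C:\ProgramData\app", r"D:\shared\app"]
print(as_list)
```

Returning a list from the start avoids having to pick a separator that can never occur in a path.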

From flying-sheep at web.de  Wed Sep  2 16:30:31 2015
From: flying-sheep at web.de (Philipp A.)
Date: Wed, 02 Sep 2015 14:30:31 +0000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <55E6C1B9.2030506@egenix.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
 <55E6C1B9.2030506@egenix.com>
Message-ID: <CAN8d9gmsoMFR+awLuoOeFbYgp_wU-iybHTGKSy=8tAiaAfe9dg@mail.gmail.com>

hi marc-andre,

you seem to have some misconceptions and a *very* different experience from mine:

M.-A. Lemburg <mal at egenix.com> wrote on Wed, 2 Sep 2015 at 11:30:

> Looking at my home dir on Linux, there doesn't seem to be
> one standard, but rather a whole set of them

can you please link to the document from a standards authority describing
those other standards?

> and the good old ~/.appname is still a popular one (e.g. pip and ansible
> from Python land still use it; as do many other non-Python applications
> such as ncftp, emacs, svn, git, gpg, etc.).

as hinted at by my tongue-in-cheek comment from above: that's not a
standard but an old convention.

git uses ${XDG_CONFIG_HOME-$HOME/.config}/git/config (try it!), as does pip.
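
For readers who want the lookup rule spelled out: the variable defined by the XDG Base Directory Specification is XDG_CONFIG_HOME, with $HOME/.config as the default when it is unset or empty. The following is only a sketch of that rule, not git's or pip's actual code, and the function name is invented:

```python
import os
from pathlib import Path


def xdg_config_home(environ=os.environ):
    """Resolve the user config directory following the XDG rule.

    Hypothetical helper. The spec says an unset or empty variable means
    $HOME/.config, and that relative paths should be ignored.
    """
    value = environ.get("XDG_CONFIG_HOME", "")
    if value and os.path.isabs(value):
        return Path(value)
    home = environ.get("HOME") or os.path.expanduser("~")
    return Path(home) / ".config"


# Where a tool following this rule would look for git's per-user config.
print(xdg_config_home() / "git" / "config")
```

Note that the shell form ${VAR-default} only falls back when the variable is unset, whereas the spec also treats an empty value as unset; the sketch follows the spec.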

the other ones are old as dirt so this is excusable. regarding newly
developed programs i'll only approve it for some rare exceptions like
shells, where ~/.${SHELL}rc really is the only expected place for a config
file. (and not that me approving it means something)

> ~/.config/ does get some use, but mostly for GUI applications,
> not so much for command line ones.

where do you get this figure from? the xdg standard doesn't say anything
about only a kind of application being targeted by the standard, and as my
fontconfig example shows, even libraries follow it.

> ~/.local/lib/ only appears to be used by Python :-)

yeah, on my system it doesn't even exist! but that has a reason which you
can read in the standard: only ~/.local/share is pointed at by the default
for a standard dir. ~/.local/lib is probably an invention by whoever
included that path in the default $PYTHONPATH

> ~/.local/share/ is mostly used by desktops to register application shortcuts

that's a big understatement, there's all kinds of stuff in there. i have 56
dirs and files in there.

> ~/.cache/ is being used by just a handful of tools, pip being one of them.

hah! various parts of KDE, matplotlib, gstreamer, chromium, fontconfig,
atom, shall i go on?

> --
> Marc-Andre Lemburg
> eGenix.com

i hope i could convince you that this isn't "some contender among many",
but really *the* standard for some directories on linux.

best, phil

From flying-sheep at web.de  Wed Sep  2 16:40:42 2015
From: flying-sheep at web.de (Philipp A.)
Date: Wed, 02 Sep 2015 14:40:42 +0000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <1441200987.2912104.372747265.63D1E82F@webmail.messagingengine.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
 <1441200987.2912104.372747265.63D1E82F@webmail.messagingengine.com>
Message-ID: <CAN8d9gkwdWOgXc9=yhSMFZEDS5cqXGRdAmYmeyKccC2HraijMQ@mail.gmail.com>

<random832 at fastmail.us> wrote on Wed, 2 Sep 2015 at 15:36:

> On Wed, Sep 2, 2015, at 04:10, Philipp A. wrote:
> >    *there is a search path instead of a single directory sometimes*
> >    appdirs provides a multipath keyword argument which returns a
> >    (colon-delimited) "list" of paths.
>
> Why isn't it a real list? Paths on any platform can contain a colon;
> paths on Windows commonly do.


no idea why, but once we start defining our own API, this will be the most
important change from appdirs.

maybe they use semicolons on windows, but i really don't see why we
shouldn't just use python lists.

best, philipp

From ericfahlgren at gmail.com  Wed Sep  2 20:31:04 2015
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Wed, 2 Sep 2015 11:31:04 -0700
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
Message-ID: <012901d0e5ad$87a64c70$96f2e550$@gmail.com>

> ~/.appname stopped being right on linux long ago and never was right on other platforms, which we should teach people.

Ah, yes.  I count 17 of those on my Windows machine (!) right now, including .idlerc, .ipython, .matplotlib, .ipylint.d etc., so we've got a ways to go. :)


From tjreedy at udel.edu  Wed Sep  2 20:34:08 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 Sep 2015 14:34:08 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
Message-ID: <ms7ffg$9ql$1@ger.gmane.org>

On 9/2/2015 1:21 AM, random832 at fastmail.us wrote:
> On Wed, Sep 2, 2015, at 00:31, Terry Reedy wrote:
>> The problem with Windows is that the standard is to put things in an
>> invisible directory, which makes it difficult to tell people, especially
>> non-experts, to edit a file in the directory.
>
> I'm not sure you _should_ be telling non-experts to find a file to edit.
> Why doesn't your app provide a UI for it,

I added one, mostly written by Tal Einat, a year ago, but older versions 
of Idle have not disappeared (and the user config files are global to 
all versions, for a particular user).  And there is not yet an installer 
for 3rd party extensions.

> Plus, it's not really any harder to find than a "Hidden" directory
> beginning with a dot

Quite the contrary. Files beginning with a '.' are not hidden on Windows 
Explorer (or Command Prompt dir, for that matter).  I do not know of any 
way to enable showing hidden files with Explorer. (If you know of one, 
tell me.) The secret to getting to one is to click the directory 
sequence bar to the right of the last entry to get a directory path 
string, click again to unselect it, then add the name.  In this 
particular case, enter 'AppData/Roaming' or '%APPDATA%'.  It is 
intentionally difficult for users to access these files.

-- 
Terry Jan Reedy


From ckaynor at zindagigames.com  Wed Sep  2 20:54:16 2015
From: ckaynor at zindagigames.com (Chris Kaynor)
Date: Wed, 2 Sep 2015 11:54:16 -0700
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms7ffg$9ql$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
Message-ID: <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>

On Wed, Sep 2, 2015 at 11:34 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>> Plus, it's not really any harder to find than a "Hidden" directory
>> beginning with a dot
>
> Quite the contrary. Files beginning with a '.' are not hidden on Windows
> Explorer (or Command Prompt dir, for that matter).  I do not know of any way
> to enable showing hidden files with Explorer. (If you know of one, tell me.)
> The secret to getting to one is to click the directory sequence bar to the
> right of the last entry to get a directory path string, click again to
> unselect it, then add the name.  In this particular case, enter
> 'AppData/Roaming' or '%APPDATA%'.  It is intentionally difficult for users to
> access these files.

There is an option in the GUI for it:
On Windows 7, from an explorer window: Tools->Folder Options->View Tab
then in the Advanced Settings list: "Show hidden files, folders, and
drives". On Windows 7, the menus are hidden by default, and you need
to hold Alt for them to show up. I think Vista uses the same options,
and the menus are shown by default, while XP uses a slightly different
layout, but it's in roughly the same location.

I do not know where it is on Windows 8 or 10, as I have never really
used either of those.

Chris

From srkunze at mail.de  Wed Sep  2 22:01:25 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 02 Sep 2015 22:01:25 +0200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
 <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>
Message-ID: <55E75595.1010807@mail.de>



On 02.09.2015 20:54, Chris Kaynor wrote:
> On Wed, Sep 2, 2015 at 11:34 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>>> Plus, it's not really any harder to find than a "Hidden" directory
>>> beginning with a dot
>> Quite the contrary. Files beginning with a '.' are not hidden on Windows
>> Explorer (or Command Prompt dir, for that matter).  I do not know of any way
>> to enable showing hidden files with Explorer. (If you know of one, tell me.)
>> The secret to getting to one is to click the directory sequence bar to the
>> right of the last entry to get a directory path string, click again to
>> unselect it, then add the name.  In this particular case, enter
>> 'AppData/Roaming' or '%APPDATA%'.  It is intentionally difficult for users to
>> access these files.
> There is an option in the GUI for it:
> On Windows 7, from an explorer window: Tools->Folder Options->View Tab
> then in the Advanced Settings list: "Show hidden files, folders, and
> drives". On Windows 7, the menus are hidden by default, and you need
> to hold Alt for them to show up. I think Vista uses the same options,
> and the menus are shown by default, while XP uses a slightly different
> layout, but it's in roughly the same location.
>
> I do not know where it is on Windows 8 or 10, as I have never really
> used either of those.
Same as Win 7.
> Chris


From python at mrabarnett.plus.com  Wed Sep  2 22:15:38 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 2 Sep 2015 21:15:38 +0100
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
 <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>
Message-ID: <55E758EA.8000607@mrabarnett.plus.com>

On 2015-09-02 19:54, Chris Kaynor wrote:
> On Wed, Sep 2, 2015 at 11:34 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>>> Plus, it's not really any harder to find than a "Hidden" directory
>>> beginning with a dot
>>
>> Quite the contrary. Files beginning with a '.' are not hidden on Windows
>> Explorer (or Command Prompt dir, for that matter).  I do not know of any way
>> to enable showing hidden files with Explorer. (If you know of one, tell me.)
>> The secret to getting to one is to click the directory sequence bar to the
>> right of the last entry to get a directory path string, click again to
>> unselect it, then add the name.  In this particular case, enter
>> 'AppData/Roaming' or '%APPDATA%'.  It is intentionally difficult for users to
>> access these files.
>
> There is an option in the GUI for it:
> On Windows 7, from an explorer window: Tools->Folder Options->View Tab
> then in the Advanced Settings list: "Show hidden files, folders, and
> drives". On Windows 7, the menus are hidden by default, and you need
> to hold Alt for them to show up. I think Vista uses the same options,
> and the menus are shown by default, while XP uses a slightly different
> layout, but it's in roughly the same location.
>
> I do not know where it is on Windows 8 or 10, as I have never really
> used either of those.
>
On Windows 10 it's on the File menu:

File->Change folder options and search options


From random832 at fastmail.us  Wed Sep  2 23:40:47 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 02 Sep 2015 17:40:47 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms7ffg$9ql$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
Message-ID: <1441230047.3965673.373219713.06790441@webmail.messagingengine.com>

On Wed, Sep 2, 2015, at 14:34, Terry Reedy wrote:
> I do not know of any 
> way to enable showing hidden files with Explorer. (If you know of one, 
> tell me.)

Didn't notice this on my first reply. Go into the Folder Options window,
go to the "View" tab, and you will see an option labeled "Show hidden
files, folders, and drives". This option is specific to a folder, there
is a button to apply it to all folders.

The Folder Options dialog box is available from the "Tools" menu in the
classic UI (until Windows XP or maybe Vista), and [still hidden there,
but more visible in] the "Organize" dropdown as "Folder and search
options".

My point stands that putting a dot in front of a filename shows _intent_
to have it hidden from ordinary users, since it has that effect on Unix,
and makes it no easier to find it on Unix than finding the AppData
folder is on Windows.

From random832 at fastmail.us  Wed Sep  2 23:36:23 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 02 Sep 2015 17:36:23 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms7ffg$9ql$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
Message-ID: <1441229783.3964747.373218457.1E5AF003@webmail.messagingengine.com>



On Wed, Sep 2, 2015, at 14:34, Terry Reedy wrote:

> Quite the contrary. Files beginning with a '.' are not hidden on Windows 
> Explorer

I was talking about dot files *on Unix* vs AppData on Windows. Being
hidden from the normal view is clearly desired, or people wouldn't use
the dot at all.

From tjreedy at udel.edu  Thu Sep  3 00:33:30 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 Sep 2015 18:33:30 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
 <CALvWhxsFj7-CLMM4dPO3fax81n6SXKhh3HRA_Ub3K7HXm25G1g@mail.gmail.com>
Message-ID: <ms7tg9$ce2$1@ger.gmane.org>

On 9/2/2015 2:54 PM, Chris Kaynor wrote:
> On Wed, Sep 2, 2015 at 11:34 AM, Terry Reedy <tjreedy at udel.edu> wrote:

>> I do not know of any way
>> to enable showing hidden files with Explorer. (If you know of one, tell me.)

> There is an option in the GUI for it:
> On Windows 7, from an explorer window: Tools->Folder Options->View Tab
> then in the Advanced Settings list: "Show hidden files, folders, and
> drives". On Windows 7, the menus are hidden by default, and you need
> to hold Alt for them to show up.

Thank you.  I am still using Win 7.  They show up on Alt-release.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Thu Sep  3 00:38:28 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 Sep 2015 18:38:28 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <1441230047.3965673.373219713.06790441@webmail.messagingengine.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
 <1441230047.3965673.373219713.06790441@webmail.messagingengine.com>
Message-ID: <ms7tpj$ger$1@ger.gmane.org>

On 9/2/2015 5:40 PM, random832 at fastmail.us wrote:
> On Wed, Sep 2, 2015, at 14:34, Terry Reedy wrote:
>> I do not know of any
>> way to enable showing hidden files with Explorer. (If you know of one,
>> tell me.)
>
> Didn't notice this on my first reply. Go into the Folder Options window,
> go to the "View" tab, and you will see an option labeled "Show hidden
> files, folders, and drives". This option is specific to a folder, there
> is a button to apply it to all folders.
>
> The Folder Options dialog box is available from the "Tools" menu in the
> classic UI (until Windows XP or maybe Vista), and [still hidden there,
> but more visible in] the "Organize" dropdown as "Folder and search
> options".

Thanks.  The latter might be the easiest.

> My point stands that putting a dot in front of a filename shows _intent_
> to have it hidden from ordinary users, since it has that effect on Unix,
> and makes it no easier to find it on Unix than finding the AppData
> folder is on Windows.

What I remember from 26 years ago is 'ls -a'.  Still correct in a 
console?  (No idea how to do the same on GUIs.) This was part of any 
Unix intro as one was expected to edit the shell configs to one's taste.

-- 
Terry Jan Reedy


From stephen at xemacs.org  Thu Sep  3 03:21:43 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 03 Sep 2015 10:21:43 +0900
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms7tpj$ger$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
 <1441230047.3965673.373219713.06790441@webmail.messagingengine.com>
 <ms7tpj$ger$1@ger.gmane.org>
Message-ID: <8737yw47rs.fsf@uwakimon.sk.tsukuba.ac.jp>

Terry Reedy writes:

 > What I remember from 26 years ago is 'ls -a'.  Still correct in a 
 > console?  (No idea how to do the same on GUIs.) This was part of any 
 > Unix intro as one was expected to edit the shell configs to one's taste.

Yes, "ls -a" (or "ls -A", which omits "." and "..").  This applies to
Mac OS X as well, and as of Yosemite I can't find any way to show
hidden files in the Finder.  (But there may be a way: I use Mac OS X
because it's pretty and I can use traditional *nix commands in the
terminal, not because I'm a Mac GUI wonk.)

From random832 at fastmail.us  Thu Sep  3 03:29:33 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 02 Sep 2015 21:29:33 -0400
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <ms7tpj$ger$1@ger.gmane.org>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CACCLA56_Ba3T827DvX2UnFRfySN1jzoc1O0k8W0_=SbmZEBHug@mail.gmail.com>
 <CACac1F8GXChg2iGXmiABMbK6zwRAHpna-AUG+h16bBgZghJoMg@mail.gmail.com>
 <ms5u3j$d5p$1@ger.gmane.org>
 <1441171316.956058.372400801.57FECD89@webmail.messagingengine.com>
 <ms7ffg$9ql$1@ger.gmane.org>
 <1441230047.3965673.373219713.06790441@webmail.messagingengine.com>
 <ms7tpj$ger$1@ger.gmane.org>
Message-ID: <1441243773.415824.373361338.24A125B2@webmail.messagingengine.com>

On Wed, Sep 2, 2015, at 18:38, Terry Reedy wrote:
> What I remember from 26 years ago is 'ls -a'.  Still correct in a 
> console?  (No idea how to do the same on GUIs.) This was part of any 
> Unix intro as one was expected to edit the shell configs to one's taste.

What is and should be expected of a non-technical user on a modern
desktop Unix system is very different from what it was 26 years ago.
This is so obvious it should go without saying. And, anyway, Windows has
got dir /a too. People who learned DOS 26 years ago probably likewise
know it.

Anyway, in summary: you have to either use a special option (not *hard*
to discover, but not in your face either) to enable hidden files (ls -a,
if they use the terminal *at all*, or whatever checkbox performs the
same function in gnome/kde/xfce/whatever), or type the exact filename
knowing it exists. Neither is much of a burden, but neither is any
"better" than on Windows.

From j.wielicki at sotecware.net  Thu Sep  3 10:43:59 2015
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Thu, 3 Sep 2015 10:43:59 +0200
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBnpTS-X9y8WATvACubnM8y_F3AgRCwYOwqdTC=hsemCPw@mail.gmail.com>
Message-ID: <55E8084F.5020405@sotecware.net>



On 01.09.2015 23:22, Nathaniel Smith wrote:
> What makes it particularly difficult is that if you "fix a bug" in
> a library like appdirs, so that it starts suddenly returning
> different results on some computer somewhere, then what it looks
> like to the end user is that their data/settings/whatever have
> suddenly evaporated and whatever disk space was being used for
> caches never gets cleaned up and so forth. Generally when
> applications change how they compute these directories, they also
> include tricky migration logic to check both the old and new names,
> move stuff over if needed, but I'm not sure how a low-level library
> like this can support that usefully...

A low-level library could provide an API to return "legacy"
directories. On Freedesktop-compliant systems, for example, these could
be the fallback directories which are used when the environment
variables are not present. And after bugfixes (see what Nick said),
the old, incorrect locations could be reported the same way.

An application could, after not finding its data in the non-legacy
directories, query the legacy directories and start a migration process.
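
A rough sketch of how an application might use such an API, assuming the library hands back one current directory and a list of legacy candidates. The function find_or_migrate and the ".myapp" layout are invented for this example:

```python
import shutil
import tempfile
from pathlib import Path


def find_or_migrate(current_dir, legacy_dirs):
    """Return current_dir, moving data over from the first legacy dir found.

    Hypothetical helper: current_dir and legacy_dirs stand in for whatever
    a directory-lookup library would report.
    """
    current = Path(current_dir)
    if current.exists():
        return current  # nothing to migrate
    for legacy in map(Path, legacy_dirs):
        if legacy.exists():
            current.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(legacy), str(current))
            return current
    current.mkdir(parents=True, exist_ok=True)  # fresh install
    return current


# Demo in a throwaway directory: data stored under an old ~/.myapp style
# location ends up under the new .config/myapp location.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old = root / ".myapp"
    old.mkdir()
    (old / "settings.ini").write_text("[ui]\ntheme = dark\n")
    new = find_or_migrate(root / ".config" / "myapp", [old])
    print(sorted(p.name for p in new.iterdir()))
```

A real migration would also have to deal with partially moved data and with several versions of an application sharing a directory, which this sketch ignores.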

regards,
jwi

From flying-sheep at web.de  Thu Sep  3 13:40:16 2015
From: flying-sheep at web.de (Philipp A.)
Date: Thu, 03 Sep 2015 11:40:16 +0000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <012901d0e5ad$87a64c70$96f2e550$@gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
 <CADiSq7cASe2GQihUOPVDL8+sRn=u3L0eYdSYopKGkjTS4oGiPQ@mail.gmail.com>
 <CAN8d9gmP3HZmOD4MTpP3A763i3bW1DpApZrwo9SAcDSEPiYQmA@mail.gmail.com>
 <012901d0e5ad$87a64c70$96f2e550$@gmail.com>
Message-ID: <CAN8d9gkkz6YfE5tfHR5Xrw4fkGeeeCrcVyPWNrf4E1MeBu8Hng@mail.gmail.com>

Eric Fahlgren <ericfahlgren at gmail.com> wrote on Wed, 2 Sep 2015 at
20:31:

> > ~/.appname stopped being right on linux long ago and never was right on
> other platforms, which we should teach people.
>
> Ah, yes.  I count 17 of those on my Windows machine (!) right now,
> including .idlerc, .ipython, .matplotlib, .ipylint.d etc., so we've got a
> ways to go. :)
>

...on windows? wat. oh my god this is horrible. so wrong!

if this isn't the definite argument why we need this API yesterday...

From asweigart at gmail.com  Thu Sep  3 22:53:45 2015
From: asweigart at gmail.com (Al Sweigart)
Date: Thu, 3 Sep 2015 13:53:45 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
Message-ID: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>

I've opened an issue for adding non-English names to the turtle module's
function names: https://bugs.python.org/issue24990

This would effectively take this code:

    import turtle
    t = turtle.Pen()
    t.pencolor('green')
    t.forward(100)

...and have this code in French be completely equivalent:

    import turtle
    t = turtle.Plume()
    t.couleurplume('vert')
    t.avant(100)

(Pardon my google-translate French.)

This, of course, is a terrible way for a software module to implement
internationalization, which usually does not apply to the source code names
themselves. But turtle is used as a teaching tool. While professional
developers are expected to obtain proficiency with English, the same does
not apply to school kids who are just taking a small computer programming
unit. Having the turtle module available in their native language (even if
Python keywords are not) would remove a large barrier and let them focus on
the core programming concepts that turtle provides.

The popular Scratch tool has a similar internationalized setup and also has
LOGO-style commands, so most of the translation work is already done.


Are there any design or technical issues I should be aware of before doing
this? It seems like a straightforward "Tortuga = Turtle" assignment of
names, though I would set it up so that it is easy to add languages to
the source.
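
One way such a setup could look, as a sketch only. DemoPen is a stand-in class so the example runs without a display, and add_aliases and the FRENCH table are invented here, not taken from the turtle module or the linked issue:

```python
def add_aliases(cls, new_name, translations):
    """Build a subclass of cls whose translated names alias the English methods.

    Hypothetical helper: translations maps English method names to the
    names used in the target language.
    """
    namespace = {
        alias: getattr(cls, english)
        for english, alias in translations.items()
    }
    return type(new_name, (cls,), namespace)


class DemoPen:
    """Stand-in for turtle.Pen that only records what it is told."""

    def __init__(self):
        self.x = 0
        self.color = "black"

    def forward(self, distance):
        self.x += distance

    def pencolor(self, color):
        self.color = color


# Per-language table; adding a language means adding another dict.
FRENCH = {"forward": "avant", "pencolor": "couleurplume"}

Plume = add_aliases(DemoPen, "Plume", FRENCH)

t = Plume()
t.couleurplume("vert")
t.avant(100)
print(t.x, t.color)
```

Because the translated class is a subclass, the English names keep working on it. The sketch only renames methods; translating argument values such as color names would need separate handling.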

I have a Google-translated set of translations here:
https://github.com/asweigart/idle-reimagined/wiki/Turtle-Translations

But of course, a native speaker would have to sign off on it before making
it part of the turtle module API.

-Al

From random832 at fastmail.us  Thu Sep  3 23:27:02 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Thu, 03 Sep 2015 17:27:02 -0400
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
Message-ID: <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>

On Thu, Sep 3, 2015, at 16:53, Al Sweigart wrote:
> https://github.com/asweigart/idle-reimagined/wiki/Turtle-Translations

A couple of downsides I noticed:

"The names will later be formatted to fit the code style of the turtle
module. " is a bit handwavy, to be honest. Are things like "de la"
required or optional? Should an apostrophe/space/hyphen become an
underscore or be omitted? What can be abbreviated? What
case/inflection/conjugation should be used? These are decisions that
have to be made by native speakers based on what will be understandable
to beginners who know each language.

Your names aren't very consistent - I don't know French that well, but I
doubt the module would benefit from mixing "plume", "crayon", and
"stylo". What to call a pen needs to be decided at a high level and used
globally. Also, things like "set...", "get...", "on...", need to be
translated to something consistent throughout the module.

For that matter, the naming conventions in the *English* ones are a bit
questionable and inconsistent. Maybe this is an opportunity to clean up
that API.

Also, a bug to report: I know enough Spanish to know that "pantalla
clara" means "clear screen" as a noun [screen that is clear], not a verb
[clearing the screen]. The English is ambiguous, but most other
languages are not.

For design questions: Is it important to hide the English names? Is it
important to be able to use the classes interchangeably to code that is
written to use a different language's method names? Can the same name in
one language ever translate to two different names in another language
depending on context?

From encukou at gmail.com  Fri Sep  4 00:55:10 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Fri, 4 Sep 2015 00:55:10 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
Message-ID: <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>

On Thu, Sep 3, 2015 at 11:27 PM,  <random832 at fastmail.us> wrote:
> On Thu, Sep 3, 2015, at 16:53, Al Sweigart wrote:
>> https://github.com/asweigart/idle-reimagined/wiki/Turtle-Translations
>
> A couple of downsides I noticed:
>
> "The names will later be formatted to fit the code style of the turtle
> module. " is a bit handwavy, to be honest. Are things like "de la"
> required or optional? Should an apostrophe/space/hyphen become an
> underscore or be omitted? What can be abbreviated? What
> case/inflection/conjugation should be used? These are decisions that
> have to be made by native speakers based on what will be understandable
> to beginners who know each language.
>
> Your names aren't very consistent - I don't know French that well, but I
> doubt the module would benefit from mixing "plume", "crayon", and
> "stylo". What to call a pen needs to be decided at a high level and used
> globally. Also, things like "set...", "get...", "on...", need to be
> translated to something consistent throughout the module.
>
> For that matter, the naming conventions in the *English* ones are a bit
> questionable and inconsistent. Maybe this is an opportunity to clean up
> that API.
>
> Also, a bug to report: I know enough Spanish to know that "pantalla
> clara" means "clear screen" as a noun [screen that is clear], not a verb
> [clearing the screen]. The English is ambiguous, but most other
> languages are not.
>
> For design questions: Is it important to hide the English names? Is it
> important to be able to use the classes interchangeably to code that is
> written to use a different language's method names? Can the same name in
> one language ever translate to two different names in another language
> depending on context?

I would not be surprised if, among all languages in the world, there'd
be a clash in one of turtle's attribute names.

Why put everything in the same namespace? It might be better to use
more of those: perhaps something like "from turtle.translations
import tortuga" ("tortuga" being Spanish for turtle). That is probably
too long, so why not just "import tortuga"? That "tortuga" module
could live on PyPI for a while, before it's considered for addition to
either the stdlib, or just the installers.
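
A minimal sketch of the clash concern, assuming hypothetical per-language
tables (the language labels and translated words below are made up for
illustration and are not vetted translations). It reports every translated
name that would have to stand for two different English commands if all
languages shared one namespace:

```python
from collections import defaultdict

# Hypothetical tables mapping English turtle names to translated names.
# The entries are illustrative only, not real proposed translations.
TRANSLATIONS = {
    "lang_a": {"forward": "avante", "left": "gira"},
    "lang_b": {"backward": "avante", "right": "destra"},
}


def find_clashes(tables):
    """Return {translated_name: sorted English names} for ambiguous names."""
    targets = defaultdict(set)
    for table in tables.values():
        for english, translated in table.items():
            targets[translated].add(english)
    # A translated name is ambiguous when it points at more than one command.
    return {name: sorted(eng) for name, eng in targets.items() if len(eng) > 1}


clashes = find_clashes(TRANSLATIONS)
print(clashes)
```

With one module per language, each table only has to be free of clashes
with itself, which is a much easier property to maintain.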

From asweigart at gmail.com  Fri Sep  4 01:12:51 2015
From: asweigart at gmail.com (Al Sweigart)
Date: Thu, 3 Sep 2015 16:12:51 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
Message-ID: <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>

Just to reply to both Petr and Random832 at once:

By "formatted to fit the code style of the turtle module" I meant that the
names would be pushed together without camelcase or underscores (as they
are already done in the turtle module).

But you do bring up good points. Whether things like "de la" are omitted or
how things can be abbreviated is left entirely up to the native speaker
doing the translation. I don't see any way around it.

But, as a rough guide, the Scratch tool has already done lots of
translation work and translators can piggyback off their wording choices.

The turtle module has been around long enough that I already see its API
as set in stone, for better or worse. If the English names should be
changed, I see that as a separate issue.

I don't see a reason to hide the English names. I want the names to be
available without additional setup, i.e. there is no "language" setting
that needs to be specified first. The fewer setup steps the better. All
names in all languages would be available. I don't see being able to mix,
say, English and Spanish names in the same program as a big enough problem
that we have to force a fix to prevent it.



I'd like to keep the setup as simple and as similar to existing code as
possible. There is a slight difference between "import x" and "from x
import x" and I'd want to avoid that. From a technical perspective, I don't
think the additional names will hinder the maintenance of the turtle module
(which rarely changes itself). Though I'll know for sure what the technical
issues are, if any, once I produce the patch.

The idea for putting these modules on PyPI is interesting. My only
hesitation is I don't want "but it's already on PyPI" as an excuse not to
include these changes into the standard library turtle module.

-Al


From steve at pearwood.info  Fri Sep  4 03:43:02 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 4 Sep 2015 11:43:02 +1000
Subject: [Python-ideas] Add appdirs module to stdlib
In-Reply-To: <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
References: <CAN8d9gmN3J+G+jetardW5EsyDqyFfJEopHx8Zs7d2mZhzjKH8w@mail.gmail.com>
 <E7331A45-A6C5-48C1-9E6C-3BE8DC83240A@yahoo.com>
 <CAPJVwBkfC5L-Kt+cxM9A8WxzgsoPCo4FzzW=6qtV1WbrRRcqMA@mail.gmail.com>
 <796B0953-FC26-4AC2-AEE1-4BCA5C6F26BF@yahoo.com>
 <CACac1F90KuakdVf1p5x-4mm2nvmMy=5XxQ1_0N6SzYnm52eJ-g@mail.gmail.com>
 <CAJ3HoZ0fpdivaXo6OdO5jwDqfFpoOR=zJCtMqx-RizWSMqUnrw@mail.gmail.com>
Message-ID: <20150904014301.GJ19373@ando.pearwood.info>

On Wed, Sep 02, 2015 at 04:38:05PM +1200, Robert Collins wrote:

> And if we're going to do that... why? Why not just provide a
> documentation link to the thing and say 'pip install this' and/or
> 'setuptools install_require this'.

I can think of two reasons, one minor, one major:

1.  Many people behind corporate or school firewalls cannot just "pip 
install this". By "firewall" I'm talking more figuratively than 
literally. Of course there may be an actual firewall blocking access to 
PyPI, but that's typically very easy to bypass (download the library at 
home and bring it in on a USB stick). Less easy to bypass is corporate/ 
school policy which prohibits the installation of unapproved software, 
often being a firing or expulsion offense. Getting approval may be 
difficult, slow or downright impossible.

However, what makes this a minor reason is that software written under 
those conditions probably won't be distributed outside of the 
organisation itself, so who cares whether it complies with the standard 
locations for application data?

2.  More important is the stdlib itself. As someone has pointed out,
the stdlib already drops config files in completely inappropriate 
locations on Windows, e.g. $HOME/.idlelib. It would be good if the 
stdlib itself could consistently use the standard locations for files.


-- 
Steve

From rosuav at gmail.com  Fri Sep  4 03:51:11 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 4 Sep 2015 11:51:11 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
Message-ID: <CAPTjJmrHabgyPOquvrZUVwkkbATybLcVeEbbytZPP45oC0o=Aw@mail.gmail.com>

On Fri, Sep 4, 2015 at 8:55 AM, Petr Viktorin <encukou at gmail.com> wrote:
> Why put everything in the same namespace? It might be better to use
> more of those: perhaps something like "from turtle.translations
> import tortuga" ("tortuga" being Spanish for turtle). That is probably
> too long, so why not just "import tortuga"? That "tortuga" module
> could live on PyPI for a while, before it's considered for addition to
> either the stdlib, or just the installers.

+1. These modules could simply import a boatload of stuff from
"turtle" under new names, which would make them fairly slim. Question:
Should they start with "from turtle import *" so the English names are
always available? It'd ensure that untranslated names don't get missed
out, but it might be confusing.
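
A rough sketch of what such a slim module could look like, under stated
assumptions: make_translated_module and its english_fallback flag are
hypothetical names, and a stand-in namespace replaces the real turtle
module so the example runs without Tk. The flag corresponds to the
"from turtle import *" question above.

```python
import types


def make_translated_module(source, module_name, translations,
                           english_fallback=False):
    """Build a module whose attributes alias the public names of `source`."""
    module = types.ModuleType(module_name)
    for english in dir(source):
        if english.startswith("_"):
            continue  # skip private and dunder names
        if english in translations:
            setattr(module, translations[english], getattr(source, english))
        elif english_fallback:
            # Keep untranslated names reachable under their English spelling.
            setattr(module, english, getattr(source, english))
    return module


# Stand-in for the real turtle module (assumption, for illustration only).
fake_turtle = types.SimpleNamespace(
    forward=lambda distance: ("forward", distance),
    left=lambda angle: ("left", angle),
)

tortuga = make_translated_module(fake_turtle, "tortuga",
                                 {"forward": "adelante"})
tortuga_con_ingles = make_translated_module(fake_turtle, "tortuga",
                                            {"forward": "adelante"},
                                            english_fallback=True)
print(tortuga.adelante(100))
```

Without the fallback, the untranslated left() is simply missing from the
translated module; with it, left() stays available but the translated
name still replaces forward().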

ChrisA

From stephen at xemacs.org  Fri Sep  4 04:05:51 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 04 Sep 2015 11:05:51 +0900
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
Message-ID: <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>

Al Sweigart writes:

 > The idea for putting these modules on PyPI is interesting. My only
 > hesitation is I don't want "but it's already on PyPI" as an excuse
 > not to include these changes into the standard library turtle
 > module.

Exactly backwards, as the first objection is going to be "if it could
be on PyPI but isn't, there's no evidence it's ready for the stdlib."


From stephen at xemacs.org  Fri Sep  4 04:34:41 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 04 Sep 2015 11:34:41 +0900
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPTjJmrHabgyPOquvrZUVwkkbATybLcVeEbbytZPP45oC0o=Aw@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPTjJmrHabgyPOquvrZUVwkkbATybLcVeEbbytZPP45oC0o=Aw@mail.gmail.com>
Message-ID: <87twra3oam.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Angelico writes:

 > +1. These modules could simply import a boatload of stuff from
 > "turtle" under new names, which would make them fairly slim. Question:
 > Should they start with "from turtle import *" so the English names are
 > always available? It'd ensure that untranslated names don't get missed
 > out, but it might be confusing.

That would be pretty horrible, and contrary to the point of allowing
the new user to learn algorithmic thinking in a small world using
intuitively named commands.

I would think the sensible thing to do is to invite participation
from traditional translation volunteers with something like

  import turtle
  from i18n import _  # hypothetical module providing a gettext-style _()
  _translations = { 'turtle' : _('turtle'), }  # ...and the rest of the names
  english_fallbacks_please = False  # set to True to keep untranslated names
  for name in dir(turtle):
    if name in _translations:
      # eval() cannot run an import statement, so bind the alias directly
      globals()[_translations[name]] = getattr(turtle, name)
    elif english_fallbacks_please:
      globals()[name] = getattr(turtle, name)


From rosuav at gmail.com  Fri Sep  4 04:43:38 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 4 Sep 2015 12:43:38 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <87twra3oam.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPTjJmrHabgyPOquvrZUVwkkbATybLcVeEbbytZPP45oC0o=Aw@mail.gmail.com>
 <87twra3oam.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAPTjJmrJ98fQz+xm3rv06tJMt7hrmfXkRvWk__kzLH8=2H8+Xw@mail.gmail.com>

On Fri, Sep 4, 2015 at 12:34 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Chris Angelico writes:
>
>  > +1. These modules could simply import a boatload of stuff from
>  > "turtle" under new names, which would make them fairly slim. Question:
>  > Should they start with "from turtle import *" so the English names are
>  > always available? It'd ensure that untranslated names don't get missed
>  > out, but it might be confusing.
>
> That would be pretty horrible, and contrary to the point of allowing
> the new user to learn algorithmic thinking in a small world using
> intuitively named commands.
>
> I would think the sensible thing to do is to invite participation
> from traditional translation volunteers with something like
>
>   import turtle
>   from i18n import _  # hypothetical module providing a gettext-style _()
>   _translations = { 'turtle' : _('turtle'), }  # ...and the rest of the names
>   english_fallbacks_please = False  # set to True to keep untranslated names
>   for name in dir(turtle):
>     if name in _translations:
>       # eval() cannot run an import statement, so bind the alias directly
>       globals()[_translations[name]] = getattr(turtle, name)
>     elif english_fallbacks_please:
>       globals()[name] = getattr(turtle, name)

Yeah, that'd be better than including all the original English names.
And _translations can be generated easily enough:

import turtle
print("{" + ", ".join("%r: _(%r)" % (n, n) for n in dir(turtle)) + "}")

Though the diffs would be disgusting any time anything in turtle changes.

ChrisA

From steve at pearwood.info  Fri Sep  4 04:45:53 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 4 Sep 2015 12:45:53 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20150904024552.GL19373@ando.pearwood.info>

On Fri, Sep 04, 2015 at 11:05:51AM +0900, Stephen J. Turnbull wrote:
> Al Sweigart writes:
> 
>  > The idea for putting these modules on PyPI is interesting. My only
>  > hesitation is I don't want "but it's already on PyPI" as an excuse
>  > not to include these changes into the standard library turtle
>  > module.
> 
> Exactly backwards, as the first objection is going to be "if it could
> be on PyPI but isn't, there's no evidence it's ready for the stdlib."

*cough typing cough*


The turtle module has been in Python for many, many years. This proposal 
doesn't change the functionality, it merely offers a localised API to 
the same functionality. A bunch of alternate names, nothing more.

I would argue that if you consider the user-base of turtle, putting it 
on PyPI is a waste of time:

- Beginners aren't going to know to "pip install whatever". Some of us 
here seem to think that pip is the answer to everything, but if you look 
on the python-list mailing list, you will see plenty of evidence that 
people have trouble using pip.

- Schools may have policies against the installation of unapproved 
software on their desktops, and getting approval to "pip install *" may 
be difficult, time-consuming or outright impossible. If they are using 
Python, we know they have approval to use what is in the standard 
library. Everything else is, at best, a theoretical possibility.

One argument against this proposal is that Python is not really designed 
as a kid-friendly learning language, and we should just abandon that 
space to languages that do it better, like Scratch. I'd hate to see that 
argument win, but given our limited resources perhaps we should know 
when we're beaten. Compared to what Scratch can do, turtle graphics are 
so very 1970s.

But if we think that there is still a place in the Python infrastructure 
for turtle graphics, then I'm +1 on localising the turtle module.




-- 
Steve

From asweigart at gmail.com  Fri Sep  4 05:52:30 2015
From: asweigart at gmail.com (Al Sweigart)
Date: Thu, 3 Sep 2015 20:52:30 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <20150904024552.GL19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
Message-ID: <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>

Thinking about it some more, yeah, having a separate module on PyPI would
just be a waste of time. This isn't changing functionality or experimenting
with new features, it's just adding new names to existing functions. And
installing stuff with pip is going to be an insurmountable barrier for a lot
of computer labs.

I'd say Python is very much a kid-friendly language. It's definitely much
friendlier than BASIC.

I'd advise against using the _() function in gettext. That function is for
string tables, which is set up to be easily changed and expanded. The
turtle API is pretty much set in stone, and dealing with separate .po files
and gettext in general would be more of a maintenance headache. It is also
dependent on the machine's localization settings.
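
As a hedged illustration of that dependence, here is a small sketch
using the stdlib gettext module. The "turtle_names" domain and the empty
catalog directory are assumptions made for the example. When no Spanish
catalog is installed, the lookup quietly returns the English name:

```python
import gettext
import tempfile

# An empty directory stands in for a machine with no catalogs installed.
with tempfile.TemporaryDirectory() as empty_localedir:
    catalog = gettext.translation(
        "turtle_names",          # hypothetical domain name
        localedir=empty_localedir,
        languages=["es"],
        fallback=True,           # return NullTranslations instead of raising
    )
    translated = catalog.gettext("forward")

print(translated)
```

If the languages argument is left out, gettext instead consults the
LANGUAGE, LC_ALL, LC_MESSAGES and LANG environment variables, so the
same program could expose different names on different machines.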

I believe some simple code at the end of turtle.py like this would be good
enough:

    _spanish = {'forward': 'adelante'} # ...and the rest of the translated terms
    _languages = {'spanish': _spanish} # ...and the rest of the languages

    def forward(): # this is the original turtle forward() function
        print('Blah blah blah, this is the forward() function.')

    for language in _languages:
        for englishTerm, nonEnglishTerm in _languages[language].items():
            locals()[nonEnglishTerm] = locals()[englishTerm]

Plus the diff wouldn't look too bad.

This doesn't prohibit someone from mixing both English and Non-English
names in the same program, but I don't see that as a big problem. I think
it's best to have all the languages available without having to set up
localization settings.

-Al


From abarnert at yahoo.com  Fri Sep  4 09:18:52 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 4 Sep 2015 00:18:52 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <20150904024552.GL19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
Message-ID: <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>

First, does this proposal actually come from a non-English teacher, or someone who's talked to them, or is it just a guess that they might find it nice?

Meanwhile:

On Sep 3, 2015, at 19:45, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> - Beginners aren't going to know to "pip install whatever". Some of us 
> here seem to think that pip is the answer to everything, but if you look 
> on the python-list mailing list, you will see plenty of evidence that 
> people have trouble using pip.

Of course a sizable chunk of those say "my Python didn't come with pip" and then after a bit of exploration you find that they're using Python 2.7.3 or something, so any feature added to Python 3.6 isn't likely to help them anyway.

And that seems like a good argument to add it to PyPI even if it's also added to the stdlib. Sure, for some teachers it'll be easier to just require 3.6 than to require a particular package. But I'm guessing both will be problematic in different cases. For example, if the school is issuing students Linux laptops that come with Python 3.4, then explaining apt-get, and getting permission from the IT department for it, is probably harder, not easier, than pip.


From stephen at xemacs.org  Fri Sep  4 09:23:21 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 04 Sep 2015 16:23:21 +0900
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <20150904024552.GL19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
Message-ID: <87si6u3axi.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:
 > On Fri, Sep 04, 2015 at 11:05:51AM +0900, Stephen J. Turnbull wrote:

 > > Exactly backwards, as the first objection is going to be "if it could
 > > be on PyPI but isn't, there's no evidence it's ready for the stdlib."
 > 
 > *cough typing cough*

And?  The objection will still be made.  And I doubt Guido will agree
that typing is a precedent that can be used to justify inclusion of
turtle localizations.  He might very well be in favor AFAIK, I just
doubt he would base that on the precedent of typing.

The rest of your post I don't really agree with, but I have no strong
counterarguments, either.  Here I just wanted to point out that the
way these discussions have gone in the past is that without special
support from the BDFL, the usual path for these things is through PyPI.

Especially since AFAICT we don't actually have an implementation yet.

Steve

From bussonniermatthias at gmail.com  Fri Sep  4 09:26:08 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Fri, 4 Sep 2015 09:26:08 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
Message-ID: <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>

Hi all, 

Personal opinion, based on a bit of experience:

There is one thing worse than programming in a foreign language IMHO (I'm a native French speaker):
it's programming in an environment that is half translated and/or mixes
English and the native language.

The cognitive load and the context switching that the brain is forced to do when 2 languages
are present are absolutely astronomical, and I guess translating the turtle module
will not translate the control-flow structures, the docstrings, and so on
if you do simple `Tortue = Turtle` assignments.
So while it looks nice in a 2-line example, you hit this problem pretty quickly.

Taking the given example:

import turtle   # "import" is English; it should translate to "importer" or "importez". "turtle" should be "tortue" too.
t = turtle.Plume()
t.couleurplume('vert')  # "plume" is feminine, so the colour should be 'verte'; "crayon" would be masculine, so 'vert'
t.avant(100)  # should be avance/avancer


I can perfectly imagine a menu entry "insérer une boucle `pour ...`" that inserts a `for ...` in applications,
which is confusing to explain.

I also find it much easier to attach a programming meaning to a word that
has no previous meaning for a kid (for, range, if, else, print are blank slates
for French children) than to shoehorn another concept, biased by previous
experience, into it.

This in particular makes me think of Gibiane[1], which is basically:
"Hey, Fortran is great, let's make it in French", which was a really bad idea[2].
No, it's not a joke, and yes, people do nuclear physics using this language.

While I appreciate the translation effort, in general most of the
translated side of things (MDN, Microsoft help pages, Apple's ones) is much
worse than trying to understand the English originals.


So just a warning that the best is the enemy of the good, and despite good intentions[3],
trying to translate the turtle module might not be the right thing to do.

Thanks, 
-- 

Matthias

[1]: https://fr.wikipedia.org/wiki/Gibiane
[2]: but not the worst IMHO.
[3]: http://www.bloombergview.com/articles/2015-08-18/how-a-ban-on-plastic-bags-can-go-wrong


> On Sep 4, 2015, at 05:52, Al Sweigart <asweigart at gmail.com> wrote:
> 
> Thinking about it some more, yeah, having a separate module on PyPI would just be a waste of time. This isn't changing functionality or experimenting with new features, it's just adding new names to existing functions. And installing stuff with pip is going to be an insurmountable barrier for a lot of computer labs.
> 
> I'd say Python is very much a kid-friendly language. It's definitely much friendlier than BASIC.
> 
> I'd advise against using the _() function in gettext. That function is for string tables, which is set up to be easily changed and expanded. The turtle API is pretty much set in stone, and dealing with separate .po files and gettext in general would be more of a maintenance headache. It is also dependent on the machine's localization settings.
> 
> I believe some simple code at the end of turtle.py like this would be good enough:
> 
>     _spanish = {'forward': 'adelante'} # ...and the rest of the translated terms
>     _languages = {'spanish': _spanish} # ...and the rest of the languages
> 
>     def forward(): # this is the original turtle forward() function
>         print('Blah blah blah, this is the forward() function.')
> 
>     for language in _languages:
>         for englishTerm, nonEnglishTerm in _languages[language].items():
>             locals()[nonEnglishTerm] = locals()[englishTerm]
> 
> Plus the diff wouldn't look too bad.
> 
> This doesn't prohibit someone from mixing both English and Non-English names in the same program, but I don't see that as a big problem. I think it's best to have all the languages available without having to setup localization settings.
> 
> -Al
> 
> On Thu, Sep 3, 2015 at 7:45 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Fri, Sep 04, 2015 at 11:05:51AM +0900, Stephen J. Turnbull wrote:
> > Al Sweigart writes:
> >
> >  > The idea for putting these modules on PyPI is interesting. My only
> >  > hesitation is I don't want "but it's already on PyPI" as an excuse
> >  > not to include these changes into the standard library turtle
> >  > module.
> >
> > Exactly backwards, as the first objection is going to be "if it could
> > be on PyPI but isn't, there's no evidence it's ready for the stdlib."
> 
> *cough typing cough*
> 
> 
> The turtle module has been in Python for many, many years. This proposal
> doesn't change the functionality, it merely offers a localised API to
> the same functionality. A bunch of alternate names, nothing more.
> 
> I would argue that if you consider the user-base of turtle, putting it
> on PyPI is a waste of time:
> 
> - Beginners aren't going to know to "pip install whatever". Some of us
> here seem to think that pip is the answer to everything, but if you look
> on the python-list mailing list, you will see plenty of evidence that
> people have trouble using pip.
> 
> - Schools may have policies against the installation of unapproved
> software on their desktops, and getting approval to "pip install *" may
> be difficult, time-consuming or outright impossible. If they are using
> Python, we know they have approval to use what is in the standard
> library. Everything else is, at best, a theoretical possibility.
> 
> One argument against this proposal is that Python is not really designed
> as a kid-friendly learning language, and we should just abandon that
> space to languages that do it better, like Scratch. I'd hate to see that
> argument win, but given our limited resources perhaps we should know
> when we're beaten. Compared to what Scratch can do, turtle graphics are
> so very 1970s.
> 
> But if we think that there is still a place in the Python infrastructure
> for turtle graphics, then I'm +1 on localising the turtle module.
> 
> 
> 
> 
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
> 


From encukou at gmail.com  Fri Sep  4 10:17:36 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Fri, 4 Sep 2015 10:17:36 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
Message-ID: <CA+=+wqDOqYdHiivM_1hqMQw6hzuV502tbBk2U50Css2XQ+bnhQ@mail.gmail.com>

On Fri, Sep 4, 2015 at 9:26 AM, Matthias Bussonnier
<bussonniermatthias at gmail.com> wrote:
> Hi all,
>
> Personal opinion, base by a bit of experience:
>
> There is one thing worse than programming in a foreign language IMHO (I'm
> native french)
> It's programming in an environment which is half translated and/or mix
> english and native language.
>
> The cognitive load and the context switching it forces the brain to do when
> 2 languages
> are present is absolutely astronomical, and i guess translating the Turtle
> module
> will not allow to translate the control-flow structure, the
> docstrings....etc and so on,
> and so forth if you do simple `Tortue = Turtle` assignment,
>  So while it looks nice 2 liners example you hit this problem pretty
> quickly.
>
> Taking fort given example:
>
> import turtle   # import is english, should translate to importer, or importez.
>                 # turtle should be tortue also.
> t = turtle.Plume()
> t.couleurplume('vert')  # plume is a female, couleur should be 'verte',
>                         # 'crayon' would be male, so 'vert'
> t.avant(100)  # avance/avancer
>
>
> I can perfectly imagine a menu "insérer une boucle `pour ...`", that insert
> a `for ....` in applications,
> which is confusing to explain.
>
> I also find it much easier to attach a programming meaning to a word that
> have no previous meaning for a kid (for, range, if, else, print are blank
> slate
> for French children), than shoehorn another concept biased by previous
> experience into it.
>
> This in particular make me think of Gibiane[1], which is basically:
> "Hey fortran is great let's make it in french", which was a really bad
> idea[2],
> no it's not a joke, and yes people do nuclear physics using this language.
>
> While I appreciate in general the translation effort, in general most of the
> translated side of things (MDN, microsoft help pages, Apples ones) are much
> worse than trying to understand the english originals.
>
>
> So just a warning that the best is the enemy of the good, and despite good
> intentions[3],
> trying to translate Turtle module might not be the right thing to do.
>

Another opinion based on some experience:
I use local-language names when teaching beginners. It gives a nice
distinction between names provided by Python or a library (in English)
and things that can be named arbitrarily. I haven't actually measured
if this helps learning, though; and to the turtle module it might not
apply at all.
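
A small illustration of the idea, as a sketch only; the Czech identifiers
below are invented for this example and are not taken from any course
material:

```python
# English words here come from Python itself: def, return, for, in, range, print.
# Czech words are names the learner chose (hypothetical examples):
#   obvod_ctverce = "square perimeter", strana = "side", delka = "length"

def obvod_ctverce(strana):
    # Perimeter of a square with the given side length.
    return 4 * strana

# Perimeters for sides 1, 2 and 3.
obvody = [obvod_ctverce(delka) for delka in range(1, 4)]
print(obvody)
```

Every English word in the snippet is something Python provides, and every
Czech word is something the learner was free to name differently.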

From gmludo at gmail.com  Fri Sep  4 13:34:16 2015
From: gmludo at gmail.com (Ludovic Gasc)
Date: Fri, 4 Sep 2015 13:34:16 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
Message-ID: <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>

I agree with Matthias: the IT world is mostly English-based.
It's "sad" for non-native English speakers like me, because you must learn
English before you can work in IT.
However, the positive side effect is that we can speak together in a
common language.

Ludovic Gasc (GMLudo)
http://www.gmludo.eu/
On 4 Sep 2015 09:26, "Matthias Bussonnier" <bussonniermatthias at gmail.com>
wrote:

> Hi all,
>
> Personal opinion, base by a bit of experience:
>
> There is one thing worse than programming in a foreign language IMHO (I'm
> native french)
> It's programming in an environment which is half translated and/or mix
> english and native language.
>
> The cognitive load and the context switching it forces the brain to do
> when 2 languages
> are present is absolutely astronomical, and i guess translating the Turtle
> module
> will not allow to translate the control-flow structure, the
> docstrings....etc and so on,
> and so forth if you do simple `Tortue = Turtle` assignment,
>  So while it looks nice 2 liners example you hit this problem pretty
> quickly.
>
> Taking fort given example:
>
> import turtle   # import is english, should translate to importer, or importez.
>                 # turtle should be tortue also.
> t = turtle.Plume()
> t.couleurplume('vert')  # plume is a female, couleur should be 'verte',
>                         # 'crayon' would be male, so 'vert'
> t.avant(100)  # avance/avancer
>
>
> I can perfectly imagine a menu "insérer une boucle `pour ...`", that
> insert a `for ....` in applications,
> which is confusing to explain.
>
> I also find it much easier to attach a programming meaning to a word that
> have no previous meaning for a kid (for, range, if, else, print are blank
> slate
> for French children), than shoehorn another concept biased by previous
> experience into it.
>
> This in particular make me think of Gibiane[1], which is basically:
> "Hey fortran is great let's make it in french", which was a really bad
> idea[2],
> no it's not a joke, and yes people do nuclear physics using this language.
>
> While I appreciate in general the translation effort, in general most of
> the
> translated side of things (MDN, microsoft help pages, Apples ones) are
> much
> worse than trying to understand the english originals.
>
>
> So just a warning that the best is the enemy of the good, and despite good
> intentions[3],
> trying to translate Turtle module might not be the right thing to do.
>
> Thanks,
> --
>
> Matthias
>
> [1]: https://fr.wikipedia.org/wiki/Gibiane
> [2]: but not the worse IMHO.
> [3]:
> http://www.bloombergview.com/articles/2015-08-18/how-a-ban-on-plastic-bags-can-go-wrong
>
>
> On Sep 4, 2015, at 05:52, Al Sweigart <asweigart at gmail.com> wrote:
>
> Thinking about it some more, yeah, having a separate module on PyPI would
> just be a waste of time. This isn't changing functionality or experimenting
> with new features, it's just adding new names to existing functions. And
> installing stuff with pip is going to be insurmountable barrier for a lot
> of computer labs.
>
> I'd say Python is very much a kid-friendly language. It's definitely much
> friendlier than BASIC.
>
> I'd advise against using the _() function in gettext. That function is for
> string tables, which is set up to be easily changed and expanded. The
> turtle API is pretty much set in stone, and dealing with separate .po files
> and gettext in general would be more of a maintenance headache. It is also
> dependent on the machine's localization settings.
>
> I believe some simple code at the end of turtle.py like this would be good
> enough:
>
>     _spanish = {'forward': 'adelante'} # ...and the rest of the translated terms
>     _languages = {'spanish': _spanish} # ...and the rest of the languages
>
>     def forward(): # this is the original turtle forward() function
>         print('Blah blah blah, this is the forward() function.')
>
>     for language in _languages:
>         for englishTerm, nonEnglishTerm in _languages[language].items():
>             locals()[nonEnglishTerm] = locals()[englishTerm]
>
> Plus the diff wouldn't look too bad.
>
> This doesn't prohibit someone from mixing both English and Non-English
> names in the same program, but I don't see that as a big problem. I think
> it's best to have all the languages available without having to setup
> localization settings.
>
> -Al
>
> On Thu, Sep 3, 2015 at 7:45 PM, Steven D'Aprano <steve at pearwood.info>
> wrote:
>
>> On Fri, Sep 04, 2015 at 11:05:51AM +0900, Stephen J. Turnbull wrote:
>> > Al Sweigart writes:
>> >
>> >  > The idea for putting these modules on PyPI is interesting. My only
>> >  > hesitation is I don't want "but it's already on PyPI" as an excuse
>> >  > not to include these changes into the standard library turtle
>> >  > module.
>> >
>> > Exactly backwards, as the first objection is going to be "if it could
>> > be on PyPI but isn't, there's no evidence it's ready for the stdlib."
>>
>> *cough typing cough*
>>
>>
>> The turtle module has been in Python for many, many years. This proposal
>> doesn't change the functionality, it merely offers a localised API to
>> the same functionality. A bunch of alternate names, nothing more.
>>
>> I would argue that if you consider the user-base of turtle, putting it
>> on PyPI is a waste of time:
>>
>> - Beginners aren't going to know to "pip install whatever". Some of us
>> here seem to think that pip is the answer to everything, but if you look
>> on the python-list mailing list, you will see plenty of evidence that
>> people have trouble using pip.
>>
>> - Schools may have policies against the installation of unapproved
>> software on their desktops, and getting approval to "pip install *" may
>> be difficult, time-consuming or outright impossible. If they are using
>> Python, we know they have approval to use what is in the standard
>> library. Everything else is, at best, a theoretical possibility.
>>
>> One argument against this proposal is that Python is not really designed
>> as a kid-friendly learning language, and we should just abandon that
>> space to languages that do it better, like Scratch. I'd hate to see that
>> argument win, but given our limited resources perhaps we should know
>> when we're beaten. Compared to what Scratch can do, turtle graphics are
>> so very 1970s.
>>
>> But if we think that there is still a place in the Python infrastructure
>> for turtle graphics, then I'm +1 on localising the turtle module.
>>
>>
>>
>>
>> --
>> Steve
>>
>

From steve at pearwood.info  Fri Sep  4 14:18:37 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 4 Sep 2015 22:18:37 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
Message-ID: <20150904121837.GN19373@ando.pearwood.info>

On Fri, Sep 04, 2015 at 01:34:16PM +0200, Ludovic Gasc wrote:
> I'm agree with Matthias: IT world is mostly English based.

Fortunately for the 95% of the world who speak English as a second 
language, or not at all, that is changing. For example, StackOverflow 
has a very successful Brazilian site, and they make the case for 
non-English speakers well:

https://blog.stackexchange.com/2014/02/cant-we-all-be-reasonable-and-speak-english/

Rather than just repeat what they say there, I will just ask everyone to 
read it.


-- 
Steve

From humbert at uni-wuppertal.de  Fri Sep  4 14:19:19 2015
From: humbert at uni-wuppertal.de (Prof. Dr. L. Humbert)
Date: Fri, 4 Sep 2015 14:19:19 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
Message-ID: <55E98C47.1070604@uni-wuppertal.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04.09.2015 13:34, Ludovic Gasc wrote:
> I'm agree with Matthias: IT world is mostly English based. It's "sad"
> for non native English speakers like me because you must learn 
> English before to work in IT. However, the positive side effect is
> that we can speak together in the common language.

This argument does not hold for students at K4 level, so programming4all
would not work. When constructing Ponto.py (remote control for
OpenOffice.org and LibreOffice via Python), we decided to write PontoE.py
to enable students with English to use it, but
also PontoD.py, whose classes are German-based, because of
the Bavarian textbooks for Informatik at the age of 11, which use German
identifiers for classes, attributes and methods.

You may take a look at
	http://www.ham.nw.schule.de/pub/bscw.cgi/100606
to get a glimpse of how we dealt with automating the process.
The
	README.txt
points out how the process is managed.

Our current approach, without "Internationalisierung" (internationalization), for Python 3:
	http://www.ham.nw.schule.de/pub/bscw.cgi/2131956


	TNX
	Ludger
- -- 
https://twitter.com/n770
http://ddi.uni-wuppertal.de/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAlXpjEYACgkQJQsN9FQ+jJ9gggCgiCO4V7oDF9QSFcoMkhd3GarW
1S8Ani7a5F7TlPe982q7ggWlGOTy5z0h
=MpyW
-----END PGP SIGNATURE-----

From steve at pearwood.info  Fri Sep  4 19:27:10 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 5 Sep 2015 03:27:10 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
Message-ID: <20150904172710.GO19373@ando.pearwood.info>

On Fri, Sep 04, 2015 at 12:18:52AM -0700, Andrew Barnert wrote:

> On Sep 3, 2015, at 19:45, Steven D'Aprano <steve at pearwood.info> wrote:
> > 
> > - Beginners aren't going to know to "pip install whatever". Some of us 
> > here seem to think that pip is the answer to everything, but if you look 
> > on the python-list mailing list, you will see plenty of evidence that 
> > people have trouble using pip.
> 
> Of course a sizable chunk of those say "my Python didn't come with 
> pip" and then after a bit of exploration you find that they're using 
> Python 2.7.3 or something, so any feature added to Python 3.6 isn't 
> likely to help them anyway.

You say "of course", but did you actually look at the python-list 
archives? If you do, you will see posts like these two within the last 
24 hours:

[quote]
I am running Python 3.4 on Windows 7 and is facing [Error 13] 
Permission Denied while installing Python packages...
[end quote]

and:

[quote]
Well I have certainly noted more than once that pip is contained in 
Python 3.4. But I am having the most extreme problems with simply typing 
"pip" into my command prompt and then getting back the normal 
information on pip!
[end quote]

And a random selection of other issues which I just happen to still 
have visible in my news reader:

[quote]
Python 2.7.9 and later (on the python2 series), and Python 3.4 and 
later include pip by default. But I can not find it in python2.7.10 
package. What's the matter? How can i install pip on my Embedded device?
[end quote]

[quote]
I've installed a fresh copy of Python 3.5.0b2 and - as recommended - 
upgraded pip. I don't understand the reason for the permission errors as 
I am owner and have full control for the temporary directory created.
[end quote]

[quote]
I was fed up with trying to install from pypi to Windows.  Setup.py more 
often than not wouldn't be able to find the VS compiler. So I thought 
I'd try the direct route to the excellent Christoph Gohlke site at 
http://www.lfd.uci.edu/~gohlke/pythonlibs/ which is all whl files these 
days.  However as you can see below despite my best efforts I'm still 
processing the tar.gz file, so what am I doing wrong?
[end quote]


(Some spelling errors and obvious typos corrected.)

Please don't dismiss out of hand the actual experience of real users 
with pip. At least one of those quotes above is from a long-time Python 
regular who knows his way around the command line.

This is not meant as an anti-pip screed, so please don't read it as 
such. But it is meant as a reminder that pip is not perfect, and that 
even experienced Python developers can have trouble installing packages. 
Children with no experience with the command line or Python cannot be 
expected to install packages from PyPI without assistance, and if they 
are using school computers, they simply may not be permitted to run "pip 
install" even if it worked flawlessly.



-- 
Steve

From gmludo at gmail.com  Fri Sep  4 21:56:13 2015
From: gmludo at gmail.com (Ludovic Gasc)
Date: Fri, 4 Sep 2015 21:56:13 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <20150904121837.GN19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
 <20150904121837.GN19373@ando.pearwood.info>
Message-ID: <CAON-fpHgUjxVHGvTU3SXMnkO4vpzxU5pix-YnBxvjuH2di7Ebw@mail.gmail.com>

Thank you for the link, it's interesting.

However, my remark is mainly about the source code itself. Even though I
sincerely think it's better to use English as much as you can, I see no
problem with discussing source code in your native language: I myself speak
French when I interact only with French developers in my company.

Nevertheless, for the content of the source code or a database structure,
in my opinion, you must write in English. I have already analysed source
code in Dutch; it was a lot harder to understand, and I lost a lot of time
for nothing.
The world is now global, and developer resources are scarce enough that you
should avoid locking your source code content into a local language.

See for example the big work of LibreOffice to translate German comments:
https://wiki.documentfoundation.org/Development/Easy_Hacks/Translation_Of_Comments

With a localized turtle, you would teach a bad habit from the very beginning.

--
Ludovic Gasc (GMLudo)
http://www.gmludo.eu/

2015-09-04 14:18 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:

> On Fri, Sep 04, 2015 at 01:34:16PM +0200, Ludovic Gasc wrote:
> > I'm agree with Matthias: IT world is mostly English based.
>
> Fortunately for the 95% of the world who speak English as a second
> language, or not at all, that is changing. For example, StackOverflow
> has a very successful Brazilian site, and they make the case for
> non-English speakers well:
>
>
> https://blog.stackexchange.com/2014/02/cant-we-all-be-reasonable-and-speak-english/
>
> Rather than just repeat what they say there, I will just ask everyone to
> read it.
>
>
> --
> Steve

From stephane at wirtel.be  Fri Sep  4 22:02:46 2015
From: stephane at wirtel.be (=?utf-8?q?St=C3=A9phane?= Wirtel)
Date: Fri, 04 Sep 2015 22:02:46 +0200
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAON-fpHgUjxVHGvTU3SXMnkO4vpzxU5pix-YnBxvjuH2di7Ebw@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
 <20150904121837.GN19373@ando.pearwood.info>
 <CAON-fpHgUjxVHGvTU3SXMnkO4vpzxU5pix-YnBxvjuH2di7Ebw@mail.gmail.com>
Message-ID: <FA7E7379-6FAA-411F-BFD2-FD6D4EE4A978@wirtel.be>

I do agree with Ludovic's comments about the source code: Python is 
in English, the code of an open source project is international, and in 
this case I prefer English.

In the past, I have seen some databases in French, with the accents ?, 
?, ? in the column names. That's really ugly :/ because 
if you forget the database encoding, you will have a problem.

Yesterday I read some source code in Dutch; sorry, but if you don't 
know the language, good luck changing the code.

Another example: the comments in the code of the EuroPython site are 
in Italian. OK, the project was developed for PyCon Italia, but 
now many international developers work on this code, and 
sincerely, I can speak Italian but not everybody can.

Sincerely, English for the code and the database!

On 4 Sep 2015, at 21:56, Ludovic Gasc wrote:

> Thank you for the link, it's interesting.
>
> However, my remark it's mainly for the source code: Even if I 
> sincerely
> think it's better to handle English the most possible you can, I see 
> no
> issues to discuss about source code in your native language: I speak 
> myself
> in French when I interact with French developers only in my company.
>
> Nevertheless, for the content of the source code or database 
> structure, at
> least to me, you must write in English: I've already analysed source 
> code
> in Dutch, it was a lot more complicated to understand the code, I've 
> lost a
> lot of time for nothing.
> The world is now global and dev resources are enough rare to avoid to 
> lock
> your source code content in a local language.
>
> See for example the big work of LibreOffice to translate German 
> comments:
> https://wiki.documentfoundation.org/Development/Easy_Hacks/Translation_Of_Comments
>
> With a localized turtle, you should give a bad habit at the beginning.
>
> --
> Ludovic Gasc (GMLudo)
> http://www.gmludo.eu/
>
> 2015-09-04 14:18 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:
>
>> On Fri, Sep 04, 2015 at 01:34:16PM +0200, Ludovic Gasc wrote:
>>> I'm agree with Matthias: IT world is mostly English based.
>>
>> Fortunately for the 95% of the world who speak English as a second
>> language, or not at all, that is changing. For example, StackOverflow
>> has a very successful Brazilian site, and they make the case for
>> non-English speakers well:
>>
>>
>> https://blog.stackexchange.com/2014/02/cant-we-all-be-reasonable-and-speak-english/
>>
>> Rather than just repeat what they say there, I will just ask everyone 
>> to
>> read it.
>>
>>
>> --
>> Steve


--
Stéphane Wirtel - http://wirtel.be - @matrixise

From asweigart at gmail.com  Fri Sep  4 22:31:24 2015
From: asweigart at gmail.com (Al Sweigart)
Date: Fri, 4 Sep 2015 13:31:24 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <FA7E7379-6FAA-411F-BFD2-FD6D4EE4A978@wirtel.be>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
 <20150904121837.GN19373@ando.pearwood.info>
 <CAON-fpHgUjxVHGvTU3SXMnkO4vpzxU5pix-YnBxvjuH2di7Ebw@mail.gmail.com>
 <FA7E7379-6FAA-411F-BFD2-FD6D4EE4A978@wirtel.be>
Message-ID: <CAPyZGS=3q3b9Pk9_zUCY1yzpv3SY37K=TFR7=3ZFV_DzEaeHWQ@mail.gmail.com>

I completely agree that Python and codebases already in English should
remain in English. And I want the source code for turtle.py to stay in
English as well. This is where the gray area for Turtle begins though.

Turtle is not for professional developers, where the English expectation is
there. It is used for school kids to program in, and this code will most
likely be forgotten about a week after the programming assignment is done.
And in a sense, turtle.py is not used as a module by kids so much as an app
(albeit a scriptable one) that moves the turtle around and draws shapes.

The language barrier is a very real one for non-technical instructors,
parents, and students. If we could minimize it down to less than a dozen
Python keywords & names (import turtle, for, in, range, while, if, else)
that would be a significant gain for Python's reach.

And I don't think it would be much technical debt for turtle.py. I hope to
have a complete translated set soon so I can submit a patch that shows how
light of a change this would be.
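
For illustration only, here is a minimal sketch of what such a change might
look like. It is not the actual patch: the `_ALIASES` table, the
`add_aliases` helper and the stand-in module are all hypothetical, and a
stand-in is used instead of the real turtle.py so the sketch runs without
a display.

```python
import types

# Stand-in for turtle.py (an assumption, so that no Tk display is needed).
fake_turtle = types.ModuleType("fake_turtle")


def forward(distance):
    # Placeholder for the real turtle forward() function.
    return f"moved forward {distance}"


fake_turtle.forward = forward

# Hypothetical translation table: English name -> {language: alias}.
_ALIASES = {"forward": {"spanish": "adelante", "french": "avance"}}


def add_aliases(namespace, aliases):
    # Bind every translated name to the same object as its English name.
    for english_name, translations in aliases.items():
        target = namespace[english_name]
        for alias in translations.values():
            # setdefault never overwrites a name that already exists.
            namespace.setdefault(alias, target)


add_aliases(vars(fake_turtle), _ALIASES)

print(fake_turtle.adelante(100))  # prints "moved forward 100"
```

Each alias is the very same function object as the English name, so
behaviour and docstrings stay identical and only the namespace grows.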

-Al

On Fri, Sep 4, 2015 at 1:02 PM, Stéphane Wirtel <stephane at wirtel.be> wrote:

> I do agree with the comments of Ludovic about the source code, Python is
> in English, the code for an open source project is international, and in
> this case, I prefer English.
>
> In the past, I have seen some databases in french, with the accents ?, ?,
> ? in the columns of the database, that's really ugly :/ because if you
> forgot the encoding in the database, you will have a problem.
>
> Yesterday, I have read a source code in Dutch, sorry but if you don't know
> this language, good luck if you want to change the code.
>
> And an other example, the comments in the code of the EuroPython site is
> in Italian, ok, the project has been developed for PyCon Italia, but now,
> we are a lot of international developers on this code, and sincerely, I can
> speak Italian but not everybody.
>
> Sincerely, English for the code and the database!
>
>
> On 4 Sep 2015, at 21:56, Ludovic Gasc wrote:
>
> Thank you for the link, it's interesting.
>>
>> However, my remark it's mainly for the source code: Even if I sincerely
>> think it's better to handle English the most possible you can, I see no
>> issues to discuss about source code in your native language: I speak
>> myself
>> in French when I interact with French developers only in my company.
>>
>> Nevertheless, for the content of the source code or database structure, at
>> least to me, you must write in English: I've already analysed source code
>> in Dutch, it was a lot more complicated to understand the code, I've lost
>> a
>> lot of time for nothing.
>> The world is now global and dev resources are enough rare to avoid to lock
>> your source code content in a local language.
>>
>> See for example the big work of LibreOffice to translate German comments:
>>
>> https://wiki.documentfoundation.org/Development/Easy_Hacks/Translation_Of_Comments
>>
>> With a localized turtle, you should give a bad habit at the beginning.
>>
>> --
>> Ludovic Gasc (GMLudo)
>> http://www.gmludo.eu/
>>
>> 2015-09-04 14:18 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:
>>
>> On Fri, Sep 04, 2015 at 01:34:16PM +0200, Ludovic Gasc wrote:
>>>
>>>> I'm agree with Matthias: IT world is mostly English based.
>>>>
>>>
>>> Fortunately for the 95% of the world who speak English as a second
>>> language, or not at all, that is changing. For example, StackOverflow
>>> has a very successful Brazilian site, and they make the case for
>>> non-English speakers well:
>>>
>>>
>>>
>>> https://blog.stackexchange.com/2014/02/cant-we-all-be-reasonable-and-speak-english/
>>>
>>> Rather than just repeat what they say there, I will just ask everyone to
>>> read it.
>>>
>>>
>>> --
>>> Steve
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>
>
>
> --
> Stéphane Wirtel - http://wirtel.be - @matrixise
>
>

From abarnert at yahoo.com  Fri Sep  4 23:05:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 4 Sep 2015 14:05:05 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <20150904172710.GO19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
Message-ID: <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>

I find it really annoying when people pick one sentence out of a post to argue against at length, out of context, while entirely ignoring the actual substance of the post.

Are you sincerely arguing that no children out there will have Python 3.5, 3.3, or 2.7, or that for all such students, upgrading to 3.6 will be easier and face fewer permissions problems than using pip? If not, then how does this answer my point that some people will want this on PyPI even if it's in the 3.6 stdlib?

Sent from my iPhone

> On Sep 4, 2015, at 10:27, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Fri, Sep 04, 2015 at 12:18:52AM -0700, Andrew Barnert wrote:
>> 
>>> On Sep 3, 2015, at 19:45, Steven D'Aprano <steve at pearwood.info> wrote:
>>> 
>>> - Beginners aren't going to know to "pip install whatever". Some of us 
>>> here seem to think that pip is the answer to everything, but if you look 
>>> on the python-list mailing list, you will see plenty of evidence that 
>>> people have trouble using pip.
>> 
>> Of course a sizable chunk of those say "my Python didn't come with 
>> pip" and then after a bit of exploration you find that they're using 
>> Python 2.7.3 or something, so any feature added to Python 3.6 isn't 
>> likely to help them anyway.
> 
> You say "of course", but did you actually look at the python-list 
> archives? If you do, you will see posts like these two within the last 
> 24 hours:
> 
> [quote]
> I am running Python 3.4 on Windows 7 and is facing [Error 13] 
> Permission Denied while installing Python packages...
> [end quote]
> 
> and:
> 
> [quote]
> Well I have certainly noted more than once that pip is contained in 
> Python 3.4. But I am having the most extreme problems with simply typing 
> "pip" into my command prompt and then getting back the normal 
> information on pip!
> [end quote]
> 
> And a random selection of other issues which I just happen to still 
> have visible in my news reader:
> 
> [quote]
> Python 2.7.9 and later (on the python2 series), and Python 3.4 and 
> later include pip by default. But I can not find it in python2.7.10 
> package. What's the matter? How can i install pip on my Embedded device?
> [end quote]
> 
> [quote]
> I've installed a fresh copy of Python 3.5.0b2 and - as recommended - 
> upgraded pip. I don't understand the reason for the permission errors as 
> I am owner and have full control for the temporary directory created.
> [end quote]
> 
> [quote]
> I was fed up with trying to install from pypi to Windows.  Setup.py more 
> often than not wouldn't be able to find the VS compiler. So I thought 
> I'd try the direct route to the excellent Christoph Gohlke site at 
> http://www.lfd.uci.edu/~gohlke/pythonlibs/ which is all whl files these 
> days.  However as you can see below despite my best efforts I'm still 
> processing the tar.gz file, so what am I doing wrong?
> [end quote]
> 
> 
> (Some spelling errors and obvious typos corrected.)
> 
> Please don't dismiss out of hand the actual experience of real users 
> with pip. At least one of those quotes above is from a long-time Python 
> regular who knows his way around the command line.
> 
> This is not meant as an anti-pip screed, so please don't read it as 
> such. But it is meant as a reminder that pip is not perfect, and that 
> even experienced Python developers can have trouble installing packages. 
> Children with no experience with the command line or Python can not be 
> expected to install packages from PyPI without assistance, and if they 
> are using school computers, they simply may not be permitted to run "pip 
> install" even if it worked flawlessly.
> 
> 
> 
> -- 
> Steve

From asweigart at gmail.com  Fri Sep  4 23:43:32 2015
From: asweigart at gmail.com (Al Sweigart)
Date: Fri, 4 Sep 2015 14:43:32 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>
Message-ID: <CAPyZGSkUNPDmc_MP99v6Q9HucfUw1xuaBrY5Y6TdJ5V7rHd0fA@mail.gmail.com>

I see your point. I think there are two different arguments here: It would
be good to have non-English turtle modules on PyPI for older versions of
Python. But it would also be good to have non-English names added to the
turtle module in the 3.6 stdlib.

My main concern was that if these modules were on PyPI, they would be left
out of the standard library. Then the "install from PyPI headache"
arguments would apply.

-Al

On Fri, Sep 4, 2015 at 2:05 PM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> I find it really annoying when people pick one sentence out of a post to
> argue against at length, out of context, while entirely ignoring the actual
> substance of the post.
>
> Are you sincerely arguing that no children out there will have Python 3.5,
> 3.3, or 2.7, or that for all such students, upgrading to 3.6 will be easier
> and face fewer permissions problems than using pip? If not, then how does
> this answer my point that some people will want this on PyPI even if it's
> in the 3.6 stdlib?
>

From abarnert at yahoo.com  Sat Sep  5 04:59:42 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 4 Sep 2015 19:59:42 -0700
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGSkUNPDmc_MP99v6Q9HucfUw1xuaBrY5Y6TdJ5V7rHd0fA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>
 <CAPyZGSkUNPDmc_MP99v6Q9HucfUw1xuaBrY5Y6TdJ5V7rHd0fA@mail.gmail.com>
Message-ID: <1D1D750F-C7D2-47B9-8915-255DAEB09FB3@yahoo.com>

On Sep 4, 2015, at 14:43, Al Sweigart <asweigart at gmail.com> wrote:
> 
> I see your point. I think there are two different arguments here: It would be good to have non-English turtle modules on PyPI for older versions of Python. But it would also be good to have non-English names added to the turtle module in the 3.6 stdlib.
> 
> My main concern was that if these modules were on PyPI, they would be left out of the standard library. Then the "install from PyPI headache" arguments would apply.

I understand, but I think that concern is misplaced. Having something on PyPI generally makes it easier, not harder, to get it into the stdlib. And it's also useful on its own, because not everyone has 3.6.

The problem is that it seems like the obvious way to design the PyPI version and the stdlib version would be pretty different (unless you want to explain to novices how to install backports packages and import things conditionally). But hopefully someone can come up with a good solution to that?
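
To make the "import things conditionally" burden concrete, here is a minimal sketch of what a novice would need to type or copy. The module names in the usage comment (turtle_es, turtle_i18n_backport) are hypothetical; nothing in this thread establishes that either exists in the stdlib or on PyPI.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed under this name; try the next candidate.
            continue
    raise ImportError(
        "none of these modules is installed: " + ", ".join(module_names)
    )


# Hypothetical usage: prefer a stdlib localization, fall back to a backport.
# tortuga = import_first_available("turtle_es", "turtle_i18n_backport")
```

Even hidden behind a helper like this, the fallback is one more thing an instructor has to explain before the first line of turtle code.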


From stephen at xemacs.org  Sat Sep  5 08:01:12 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 05 Sep 2015 15:01:12 +0900
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
Message-ID: <87oahh2ymv.fsf@uwakimon.sk.tsukuba.ac.jp>

Al Sweigart writes:

 > Thinking about it some more, yeah, having a separate module on PyPI would
 > just be a waste of time.

Python 2.7.  'nuff said, I hope.




From rosuav at gmail.com  Sat Sep  5 08:32:48 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 5 Sep 2015 16:32:48 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <87oahh2ymv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <87oahh2ymv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAPTjJmr5owRWmxxmOXPaNVHbj5nwfJsd7ui+EqKPpKGyV7fcUw@mail.gmail.com>

On Sat, Sep 5, 2015 at 4:01 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Al Sweigart writes:
>
>  > Thinking about it some more, yeah, having a separate module on PyPI would
>  > just be a waste of time.
>
> Python 2.7.  'nuff said, I hope.

If someone's just learning to program, surely s/he can learn on Python
3 rather than Python 2. If localized names aren't available on Py2, so
be it.

(For the record, I would be in favour of it being on PyPI. I just
don't think that "Python 2.7" is sufficient argument for that.)

ChrisA

From stephen at xemacs.org  Sat Sep  5 08:46:09 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 05 Sep 2015 15:46:09 +0900
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGS=3q3b9Pk9_zUCY1yzpv3SY37K=TFR7=3ZFV_DzEaeHWQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
 <20150904121837.GN19373@ando.pearwood.info>
 <CAON-fpHgUjxVHGvTU3SXMnkO4vpzxU5pix-YnBxvjuH2di7Ebw@mail.gmail.com>
 <FA7E7379-6FAA-411F-BFD2-FD6D4EE4A978@wirtel.be>
 <CAPyZGS=3q3b9Pk9_zUCY1yzpv3SY37K=TFR7=3ZFV_DzEaeHWQ@mail.gmail.com>
Message-ID: <87mvx12wjy.fsf@uwakimon.sk.tsukuba.ac.jp>

Al Sweigart writes:

 > If we could minimize it down to less than a dozen Python keywords &
 > names (import turtle, for, in, range, while, if, else) that would
 > be a significant gain for Python's reach.

I don't see why you would want any non-localized identifiers (modules,
functions) at all in the base feature set.  So you can (and I think
should) drop range and turtle.  I don't see any point in discussing
the keywords here -- they are what they are.  If a student decides to
use something weird like "continue" or "try ... finally" it will work.

Of course, the recommended set of syntaxes and their associated
keywords matter a lot pedagogically, but we can leave that discussion
to the pedagogues.  When you're wearing your pedagogue hat and
actually writing the style guide for teaching programming using
turtle, then those interested can talk about that.  Or maybe that
should be left to experimentation.  Some teachers may prefer to avoid
"while condition: suite" in favor of "for i in iterable: if condition: suite".
(Normally that would be nuts, of course, but here you could reduce the
base set of keywords and syntaxes by one each.)

There may be reasons why advanced users (and teachers) might want to
use "non-base" facilities.  There it's possible to do things like

>>> from builtins import range as interval, print as output
>>> for i in interval(2): output(i)
0
1
>>> 

which allows teachers to add any extensions they like conveniently,
albeit verbosely.

I still think an i18n-based architecture is the way to go, to minimize
such boilerplate (error-prone for translators) among other reasons.
Neither users nor translators need to see it, unless they want to see
"how'd they do that?"  In which case, isn't it more "educational" to show
them the way it's done in the real world?
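
As a rough sketch of the table-driven idea (not a proposal for the actual turtle.py patch), the aliases could be generated from a translation table, so that translators only ever edit the table. The SPANISH table, the localized_module() helper and the module name "tortuga" are all assumptions made up for this illustration.

```python
import types

# Hypothetical translation table: localized name -> English attribute name.
SPANISH = {"adelante": "forward", "izquierda": "left", "derecha": "right"}


def localized_module(source, table, name):
    """Build a new module whose attributes are aliases into `source`."""
    module = types.ModuleType(name)
    for local_name, english_name in table.items():
        # Each localized name refers to the very same object as the English one.
        setattr(module, local_name, getattr(source, english_name))
    return module


# With a working Tk installation this could be used as:
#   import turtle
#   tortuga = localized_module(turtle, SPANISH, "tortuga")
#   tortuga.adelante(100)
```

A real i18n-based design would presumably load the table from a message catalog rather than a dict literal; the dict just keeps the sketch self-contained.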


From stephen at xemacs.org  Sat Sep  5 09:08:24 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 05 Sep 2015 16:08:24 +0900
Subject: [Python-ideas] High time for a builtin function to manage packages
	(simply)?
In-Reply-To: <20150904172710.GO19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
Message-ID: <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

 > You say "of course", but did you actually look at the python-list 
 > archives? If you do, you will see posts like these two within the last 
 > 24 hours:

So let's fix it, already![1]  Now that we have a blessed package
management module, why not have a builtin that handles the simple
cases?  Say

    def installer(package, command='install'):
        ...

where command would also take values 'status' (in which case package
could be None, meaning list all installed packages, and 'status' might
check for available upgrades as well as stating whether the package is
known to this python instance), 'upgrade', 'install' (which might
error if the package is already installed, since I envision
installations taking place in the user's space which won't work for
upgrading stdlib packages in a system Python, at least on Windows),
and maybe 'remove'.

I'm not real happy with the name "installer", but I chose it to imply
that there is a command argument, and that it can do more than just
install new packages.

In general, I would say installer() should fail-safe (to the point of
fail-annoying<wink/>), and point to the pip (and maybe venv) docs.
It should also be verbose (eg, explaining that it only knows how to
install for the current user and things like that).
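
A minimal sketch of how such a function might wrap pip, under the assumption that it simply shells out to "python -m pip" for the running interpreter. The helper build_pip_command() and the _PIP_ARGS table are names invented for this sketch; the pip subcommands and options used (install --user, install --upgrade, uninstall --yes, show, list) are existing pip features.

```python
import subprocess
import sys

# Assumed mapping from installer() commands to pip arguments.
_PIP_ARGS = {
    "install": ["install", "--user"],
    "upgrade": ["install", "--user", "--upgrade"],
    "remove": ["uninstall", "--yes"],
}


def build_pip_command(package, command="install"):
    """Return the argv list that installer() would run."""
    pip = [sys.executable, "-m", "pip"]
    if command == "status":
        # No package means "list everything that is installed".
        return pip + (["list"] if package is None else ["show", package])
    if command not in _PIP_ARGS:
        raise ValueError("unknown command: %r" % (command,))
    if package is None:
        raise ValueError("%r needs a package name" % (command,))
    return pip + _PIP_ARGS[command] + [package]


def installer(package, command="install"):
    """Run pip for the current interpreter; return True on success."""
    argv = build_pip_command(package, command)
    print("Running:", " ".join(argv))  # be verbose about what is happening
    return subprocess.call(argv) == 0
```

This sketch leaves out the fail-safe behaviour described above: it does not refuse to install an already-installed package, and 'status' does not check for available upgrades.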


Footnotes: 
[1]  This really is not relevant to the "localized turtle" thread.  If
the current situation is acceptable in general, it's not an argument
for putting turtle localizations in the stdlib.  If it's not
acceptable, well, let's fix it.


From rustompmody at gmail.com  Sat Sep  5 09:09:08 2015
From: rustompmody at gmail.com (Rustom Mody)
Date: Sat, 5 Sep 2015 00:09:08 -0700 (PDT)
Subject: [Python-ideas] Packaging systems (was Non-English names in the
	turtle module)
In-Reply-To: <20150904172710.GO19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
Message-ID: <e1d57ed0-a447-4ab2-be24-7a57686ee88e@googlegroups.com>

On Friday, September 4, 2015 at 10:57:42 PM UTC+5:30, Steven D'Aprano wrote:
>
> On Fri, Sep 04, 2015 at 12:18:52AM -0700, Andrew Barnert wrote: 
>
> > On Sep 3, 2015, at 19:45, Steven D'Aprano <st... at pearwood.info> wrote:
> > > 
> > > - Beginners aren't going to know to "pip install whatever". Some of us 
> > > here seem to think that pip is the answer to everything, but if you 
> look 
> > > on the python-list mailing list, you will see plenty of evidence that 
> > > people have trouble using pip. 
> > 
> > Of course a sizable chunk of those say "my Python didn't come with
> > pip" and then after a bit of exploration you find that they're using
> > Python 2.7.3 or something, so any feature added to Python 3.6 isn't 
> > likely to help them anyway. 
>
> You say "of course", but did you actually look at the python-list 
> archives? If you do, you will see posts like these two within the last 
> 24 hours: 
>
> [quote] 
> I am running Python 3.4 on Windows 7 and is facing [Error 13] 
> Permission Denied while installing Python packages... 
> [end quote] 
>
> and: 
>
> [quote] 
> Well I have certainly noted more than once that pip is contained in 
> Python 3.4. But I am having the most extreme problems with simply typing 
> "pip" into my command prompt and then getting back the normal 
> information on pip! 
> [end quote] 
>
> And a random selection of other issues which I just happen to still 
> have visible in my news reader: 
>
> [quote] 
> Python 2.7.9 and later (on the python2 series), and Python 3.4 and 
> later include pip by default. But I can not find it in python2.7.10 
> package. What's the matter? How can i install pip on my Embedded device? 
> [end quote] 
>
> [quote] 
> I've installed a fresh copy of Python 3.5.0b2 and - as recommended - 
> upgraded pip. I don't understand the reason for the permission errors as 
> I am owner and have full control for the temporary directory created. 
> [end quote] 
>
> [quote] 
> I was fed up with trying to install from pypi to Windows.  Setup.py more 
> often than not wouldn't be able to find the VS compiler. So I thought 
> I'd try the direct route to the excellent Christoph Gohlke site at 
> http://www.lfd.uci.edu/~gohlke/pythonlibs/ which is all whl files these 
> days.  However as you can see below despite my best efforts I'm still 
> processing the tar.gz file, so what am I doing wrong? 
> [end quote] 
>
>
> (Some spelling errors and obvious typos corrected.) 
>
> Please don't dismiss out of hand the actual experience of real users 
> with pip. At least one of those quotes above is from a long-time Python 
> regular who knows his way around the command line. 
>
> This is not meant as an anti-pip screed, so please don't read it as 
> such. But it is meant as a reminder that pip is not perfect, and that 
> even experienced Python developers can have trouble installing packages. 
> Children with no experience with the command line or Python can not be 
> expected to install packages from PyPI without assistance, and if they
> are using school computers, they simply may not be permitted to run "pip 
> install" even if it worked flawlessly. 
>
>
Packaging systems suffer from a Law: 

The quality of the packaging-system is inversely proportional to the 
quality of the language being packaged

[Corollary to the invariable NIH syndrome that programmers suffer from]

Notice:

Haskell's Hackage: Terrible
Python's pip: Bad
Ruby's gems: Ok
Perl's CPAN: Good
Debian's apt (mishmash of perl, shell and other unspeakables): Superb

From ncoghlan at gmail.com  Sat Sep  5 09:31:24 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 5 Sep 2015 17:31:24 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <20150904024552.GL19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
Message-ID: <CADiSq7d5ca21pJbFOhxcUf6eYUXqfFz5LtMKW2if7v8knPJpJg@mail.gmail.com>

On 4 September 2015 at 12:45, Steven D'Aprano <steve at pearwood.info> wrote:
> One argument against this proposal is that Python is not really designed
> as a kid-friendly learning language, and we should just abandon that
> space to languages that do it better, like Scratch. I'd hate to see that
> argument win, but given our limited resources perhaps we should know
> when we're beaten. Compared to what Scratch can do, turtle graphics are
> so very 1970s.

Block based languages are to text based ones as picture books are to
the written word - to get the combinatorial power of language into
play, you need to be learning systems that have the capacity to be
self hosting. You can write a Python interpreter in Python, but you
can't write a Scratch environment in Scratch.

This is reflected in the way primary schools' digital environment
curricula are now being designed - initial concepts of algorithms and
flow control can be introduced without involving a computer at all
(e.g. through games like Robot Turtles), then block based programming
in environments like Scratch introduces the use of computers in a way
that doesn't require particularly fine motor control or spelling
skills.

However, a common aspect I've seen talking to teachers from Australia,
the US and the UK is that the aim is always to introduce kids to the
full combinatorial power of a text based programming environment like
Python, since that's what unlocks the ability to use computers to
manipulate real world data and interfaces, rather than just a local
constrained environment like the one in Scratch.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rustompmody at gmail.com  Sat Sep  5 08:42:54 2015
From: rustompmody at gmail.com (Rustom Mody)
Date: Fri, 4 Sep 2015 23:42:54 -0700 (PDT)
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
Message-ID: <9816c9aa-0358-4d2a-b67b-e58ace40a0ac@googlegroups.com>



On Friday, September 4, 2015 at 12:56:33 PM UTC+5:30, Matthias Bussonnier 
wrote:
>
> Hi all, 
>
> Personal opinion, based on a bit of experience:
>

And my personal experience from the (an) other side:

Some students of mine worked to add devanagari to python: 

https://github.com/rusimody/l10Python
[Re-copying something here from a post on dev list
What I would wish to add is tl;dr at bottom
]

Here's an REPL-session to demo:
[Note १२३४५६७८९० is devanagari equivalent of 1234567890]
--------------------------------------------------
Python 3.5.0b2 (default, Jul 30 2015, 19:32:42)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> ??
12
>>> 23 == ??
True
>>> ?? + ??
46
>>> ?? + 34
46
>>> "12" == "??"
False
>>> 2 ? 3
True
>>> 2 ? 3
True
>>> (λ x: x+3)(4)
7
>>> # as a result of which this doesn't work... I did say they are kids!
...
>>> λ = 3
  File "<stdin>", line 1
    λ = 3
    ^
SyntaxError: invalid syntax
>>> {1,2,3} ? {2,3,4}
{2, 3}
>>> {1,2,3} ? {2,3,4}\{2, 3}
>>> {1,2,3} ? {2,3,4}
{1, 2, 3, 4}
>>> ? True
False
>>> Σ([1,2,3,4])
10
>>> 
----------------------------------------------
The last is actually more an embarrassment than the λ breaking since
they've *changed the lexer* to read the Σ when all that was required was
Σ = sum !!

In short... Kids!

tl;dr
For me (yes, an educated Indian) English is a natural first language.
However, the idea that English is the *only* language is about as quaint as
the idea that the universe extends a mere 100 kilometers, with the
Garden of Eden at the center.
Our tiny experience with internationalizing Python showed us that the lexer
(at least) is terribly ASCII-centric.
My immediate wish-list: modularize the lexer into a pre-lexer converting
UTF-8 to Unicode code points, followed by a lexer that works purely on
32-bit Unicode code points.
My long-term wish-list (yeah, somewhat unrealistic and utopian) for Python
4000 is an increased awareness that the only reasonable international
language is mathematics.

Or
http://blog.languager.org/2014/04/unicoded-python.html
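
The two-stage lexer wished for above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the l10Python
implementation: the names prelex and lex_integers are hypothetical, and the
second stage handles nothing but integer literals and single-character
symbols. It relies on the standard library's unicodedata.decimal to fold
any Unicode decimal digit, Devanagari included, into its numeric value.

```python
import unicodedata


def prelex(source_bytes):
    """Stage 1 (hypothetical): decode UTF-8 bytes into integer code points."""
    return [ord(ch) for ch in source_bytes.decode("utf-8")]


def lex_integers(codepoints):
    """Stage 2 (hypothetical): tokenize using code points only.

    Runs of decimal digits from any script become int tokens; every other
    non-whitespace code point becomes a one-character string token.
    """
    tokens = []
    value = None  # integer being accumulated, or None when outside a number
    for cp in codepoints:
        ch = chr(cp)
        digit = unicodedata.decimal(ch, None)
        if digit is not None:
            value = (value or 0) * 10 + digit
            continue
        if value is not None:
            tokens.append(value)
            value = None
        if not ch.isspace():
            tokens.append(ch)
    if value is not None:
        tokens.append(value)
    return tokens
```

With this split the second stage never sees bytes, so it has no reason to
care whether a digit came from ASCII or from another script.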

From ncoghlan at gmail.com  Sat Sep  5 10:12:45 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 5 Sep 2015 18:12:45 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPyZGSkUNPDmc_MP99v6Q9HucfUw1xuaBrY5Y6TdJ5V7rHd0fA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>
 <CAPyZGSkUNPDmc_MP99v6Q9HucfUw1xuaBrY5Y6TdJ5V7rHd0fA@mail.gmail.com>
Message-ID: <CADiSq7erE_b98Wz16q1F-oMjwB09g=40JWqYbFKtrFZpL3rnpA@mail.gmail.com>

On 5 September 2015 at 07:43, Al Sweigart <asweigart at gmail.com> wrote:
> I see your point. I think there are two different arguments here: It would
> be good to have non-English turtle modules of PyPI for older versions of
> Python. But it would also be good to have non-English names added to the
> turtle module in the 3.6 stdlib.
>
> My main concern was that if these modules were on PyPI, they would be left
> out of the standard library. Then the "install from PyPI headache" arguments
> would apply.

The last major upgrade to turtle was the adoption of Gregor Lingl's
xturtle for 2.6 as the standard turtle implementation, and he iterated
on that as an external project for a while first. I think this is
another case where a similar approach would work well - you could
create a new "eduturtle" project as a fork of the current turtle
module to allow more rapid iteration and feedback from educators
unconstrained by the standard library's release cycle, and then
propose it for default inclusion in 3.6.

Another potentially desirable thing that could be explored within such
a "turtle upgrade" project is switching it to using an HTML5 canvas as
its drawing surface, rather than relying on Tkinter. We made a similar
change a while ago with PyDoc, and while the pages generated by the
local web service could definitely use some TLC from a front-end
designer, I think it was a good call.

That "HTML5-compatible-browser-as-GUI-framework" model is also the way
IPython Notebook went for data analysis, and it unlocks an incredibly
rich world of visualisation capabilities, that are not only useful in
full browsers, but also in HTML widgets in desktop and mobile GUI
frameworks.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Sep  5 10:22:48 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 5 Sep 2015 18:22:48 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <87mvx12wjy.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <35076B57-2F85-4632-B522-47BDE868567C@gmail.com>
 <CAON-fpFjcAaWmA6FhZ_1JYG7jpuNP2v+BHunwc=Do85-dmtFTQ@mail.gmail.com>
 <20150904121837.GN19373@ando.pearwood.info>
 <CAON-fpHgUjxVHGvTU3SXMnkO4vpzxU5pix-YnBxvjuH2di7Ebw@mail.gmail.com>
 <FA7E7379-6FAA-411F-BFD2-FD6D4EE4A978@wirtel.be>
 <CAPyZGS=3q3b9Pk9_zUCY1yzpv3SY37K=TFR7=3ZFV_DzEaeHWQ@mail.gmail.com>
 <87mvx12wjy.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CADiSq7fw5dCP6bYEojeE9VkMkLvoXGeWz9Bo9=X8D_b4RUhk6A@mail.gmail.com>

On 5 September 2015 at 16:46, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> I still think an i18n-based architecture is the way to go, to minimize
> such boilerplate (error-prone for translators) among other reasons.
> Neither users nor translators need to see it, unless they want to see
> "how'd they do that?"  In which case, isn't it more "educational" to show
> them the way it's done in the real world?

Sorting out such procedural questions *iteratively* is one of the
reasons I think it's desirable to pursue this externally first. There
are a *lot* of possible options here, including using a full
translation platform like Zanata or Pootle to manage the interaction
with translators, and also a need to allow teachers of students that
aren't native English speakers to easily try out the revised module
*before* we commit to adding it to CPython.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Sep  5 10:30:21 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 5 Sep 2015 18:30:21 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>

On 5 September 2015 at 17:08, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Steven D'Aprano writes:
>
>  > You say "of course", but did you actually look at the python-list
>  > archives? If you do, you will see posts like these two within the last
>  > 24 hours:
>
> So let's fix it, already![1]  Now that we have a blessed package
> management module, why not have a builtin that handles the simple
> cases?

Running "python -m pip" instead of "pip" already avoids many of the
issues with PATH configuration, which is one of the reasons why that's
what I recommend in the main Python docs at
https://docs.python.org/3/installing/ &
https://docs.python.org/2/installing/
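
As a rough illustration of why "python -m pip" sidesteps PATH problems,
here is a minimal sketch of the installer() signature floated earlier in
the thread. The helper build_pip_command is a hypothetical name introduced
for this example; the only real interfaces used are sys.executable and
subprocess.check_call. Because the command starts from the interpreter
that is actually running, it never depends on which "pip" script happens
to be first on PATH.

```python
import subprocess
import sys


def build_pip_command(package, command="install"):
    """Return the argv for running pip under the *current* interpreter."""
    # sys.executable is the interpreter running this code, so the package
    # is handled by that interpreter's pip, whatever PATH looks like.
    return [sys.executable, "-m", "pip", command, package]


def installer(package, command="install"):
    """Hypothetical helper: run pip in a subprocess, raising on failure."""
    subprocess.check_call(build_pip_command(package, command))
```

Running pip in a subprocess does not change what the current process has
already imported, so this is a sketch of the convenience being discussed
and not an answer to the consistency concerns.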

Unfortunately, I've yet to convince the rest of PyPA (let alone the
community at large) that telling people to call "pip" directly is *bad
advice* (as it breaks in too many cases that beginners are going to
encounter), so it would be helpful if folks helping beginners on
python-list and python-tutor could provide feedback supporting that
perspective by filing an issue against
https://github.com/pypa/python-packaging-user-guide

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Sat Sep  5 10:46:20 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 05 Sep 2015 17:46:20 +0900
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <CAPTjJmr5owRWmxxmOXPaNVHbj5nwfJsd7ui+EqKPpKGyV7fcUw@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <CAPyZGSkW=MZy=O-D9KbXWzpaZn879EZYbjU5AAEtdCRYYd2OPg@mail.gmail.com>
 <87oahh2ymv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAPTjJmr5owRWmxxmOXPaNVHbj5nwfJsd7ui+EqKPpKGyV7fcUw@mail.gmail.com>
Message-ID: <87egid2qzn.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Angelico writes:

 > If someone's just learning to program, surely s/he can learn on
 > Python 3 rather than Python 2.

In any case, I could replace "2.7" with the (not yet released!) "3.5".
'nuff said, yet?<wink />

Anyway, "learn with Python 3" is what we advocate, but the reality is
otherwise in some places, and on many platforms (eg, Mac).  None of my
students download Python 3 until I tell them to.  They thought the
system Python would be as "up to date" as the "Retina" display!  I
wouldn't be surprised if there are a lot of Mac systems in schools
where many teachers just use the Python that is already there, and are
far more likely to "pip tortuga" than they are to install a whole
parallel Python 3.


From bussonniermatthias at gmail.com  Sat Sep  5 15:22:02 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Sat, 5 Sep 2015 15:22:02 +0200
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
Message-ID: <CANJQusVp+aEsSBbCK3aY6JY-O3r7oT=1nO_jFGseBYHUdXw64w@mail.gmail.com>

I do have this package[1]

that allows you to run `pip install <....>` from within an IPython
session; it calls the pip of the current Python by importing pip rather
than spawning a subprocess.

One of the things I would like is for that to actually wrap
pip on python.org-installed Python, and conda on conda-installed Python.

So if such a proposal is integrated into Python, it would be nice to have hooks
that allow "hiding" which package manager is used under the hood.
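
A hook of that kind might look roughly like the sketch below. It is not
how pip_magic works; detect_package_manager and install_command are
hypothetical names, and the detection rule is an assumption: conda
environments normally contain a "conda-meta" directory under their
prefix. The sketch only builds the command line and leaves running it to
the caller.

```python
import os
import sys


def detect_package_manager(prefix=sys.prefix):
    """Guess which tool manages the environment rooted at prefix."""
    # Assumption: a "conda-meta" directory marks a conda-managed prefix.
    if os.path.isdir(os.path.join(prefix, "conda-meta")):
        return "conda"
    return "pip"


def install_command(package, prefix=sys.prefix):
    """Build (but do not run) the install command for this environment."""
    if detect_package_manager(prefix) == "conda":
        return ["conda", "install", "--yes", "--prefix", prefix, package]
    return [sys.executable, "-m", "pip", "install", package]
```
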
-- 
M

[1]: https://pypi.python.org/pypi/pip_magic

On Sat, Sep 5, 2015 at 10:30 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 5 September 2015 at 17:08, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> Steven D'Aprano writes:
>>
>>  > You say "of course", but did you actually look at the python-list
>>  > archives? If you do, you will see posts like these two within the last
>>  > 24 hours:
>>
>> So let's fix it, already![1]  Now that we have a blessed package
>> management module, why not have a builtin that handles the simple
>> cases?
>
> Running "python -m pip" instead of "pip" already avoids many of the
> issues with PATH configuration, which is one of the reasons why that's
> what I recommend in the main Python docs at
> https://docs.python.org/3/installing/ &
> https://docs.python.org/2/installing/
>
> Unfortunately, I've yet to convince the rest of PyPA (let alone the
> community at large) that telling people to call "pip" directly is *bad
> advice* (as it breaks in too many cases that beginners are going to
> encounter), so it would be helpful if folks helping beginners on
> python-list and python-tutor could provide feedback supporting that
> perspective by filing an issue against
> https://github.com/pypa/python-packaging-user-guide
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Sat Sep  5 17:24:02 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 6 Sep 2015 01:24:02 +1000
Subject: [Python-ideas] Non-English names in the turtle module.
In-Reply-To: <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <A84107F7-0E0C-4F44-91D1-9C19CFCC79A7@yahoo.com>
Message-ID: <20150905152402.GU19373@ando.pearwood.info>

On Fri, Sep 04, 2015 at 02:05:05PM -0700, Andrew Barnert wrote:

> I find it really annoying when people pick one sentence out of a post 
> to argue against at length, out of context, while entirely ignoring 
> the actual substance of the post.

Your post was three rather short paragraphs. I ignored the first 
paragraph because it had nothing to do with me, and I don't know the 
answer. I didn't respond to the third paragraph because I thought the 
conclusion (that getting permission to install pip would be easier than 
getting the most up-to-date version of Python installed) was unlikely at 
best, but regardless, you used enough weasel words ("seems like ... I'm 
guessing ... is probably ...") that it would be churlish to argue. Who 
knows? Yes, there could be some teachers who get permission for their 
students to install anything they like with pip but aren't allowed to 
upgrade to the latest version of Python. It's a big world and IT 
departments sometimes appear to choose their policies at random.

I focused on the second paragraph because that was the comment you made 
that I want to respond to, namely that a sizeable chunk of problems with 
pip is that pip isn't installed. To reiterate, I don't believe that is 
the case, based on what I see on the python-list mailing list. Judging 
by the comments in the "packaging" subthread, this may have hit a chord 
with at least some others.


> Are you sincerely arguing that no children out there will have Python 
> 3.5, 3.3, or 2.7, 

No.

> or that for all such students upgrading to 3.6 will 
> be easier and face fewer permissions problems than using pip? 

For "all" of them? Probably not.

> If not, 
> then how does this answer my point that some people will want this on 
> PyPI even if it's in the 3.6 stdlib?

I didn't respond to that point. If you want me to respond, I'll say that 
I consider it unlikely that putting it on PyPI will be of much practical 
utility, given the user-base for turtle, but if people want to do both, 
it's not likely to do much harm either.


-- 
Steve

From steve at pearwood.info  Sat Sep  5 17:38:01 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 6 Sep 2015 01:38:01 +1000
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20150905153801.GV19373@ando.pearwood.info>

On Sat, Sep 05, 2015 at 04:08:24PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> 
>  > You say "of course", but did you actually look at the python-list 
>  > archives? If you do, you will see posts like these two within the last 
>  > 24 hours:
> 
> So let's fix it, already![1]  Now that we have a blessed package
> management module, why not have a builtin that handles the simple
> cases?  Say
> 
>     def installer(package, command='install'):
>         ...

Python competes strongly with R in the scientific software area, and R 
supports a built-in to do just that:

https://stat.ethz.ch/R-manual/R-devel/library/utils/html/install.packages.html



-- 
Steve

From donald at stufft.io  Sat Sep  5 18:38:29 2015
From: donald at stufft.io (Donald Stufft)
Date: Sat, 5 Sep 2015 12:38:29 -0400
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <20150905153801.GV19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
Message-ID: <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>

On September 5, 2015 at 11:40:17 AM, Steven D'Aprano (steve at pearwood.info) wrote:
> On Sat, Sep 05, 2015 at 04:08:24PM +0900, Stephen J. Turnbull wrote:
> > Steven D'Aprano writes:
> >
> > > You say "of course", but did you actually look at the python-list
> > > archives? If you do, you will see posts like these two within the last
> > > 24 hours:
> >
> > So let's fix it, already![1] Now that we have a blessed package
> > management module, why not have a builtin that handles the simple
> > cases? Say
> >
> > def installer(package, command='install'):
> > ...
> 
> Python competes strongly with R in the scientific software area, and R
> supports a built-in to do just that:
> 
> https://stat.ethz.ch/R-manual/R-devel/library/utils/html/install.packages.html 
> 

I don't know anything about R, but a built-in function is a bad idea. It'll be
a pretty big footgun, I believe. For instance, if you already have requests 2.x
installed and imported, and then you run the builtin and install something
that triggers requests 1.x to be installed, you'll end up with your Python in
an inconsistent state. You might even import something from requests later
and end up with modules from two different versions of requests
in sys.modules. In addition, the standard library is not really enough to
accurately install packages from PyPI. You need a real HTML parser that can
handle malformed input safely, an implementation of PEP 440 versions and
specifiers (currently implemented in the "packaging" library on PyPI), and
some mechanism for inspecting the currently installed set of packages, so
you need something like pkg_resources available to properly support that.
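
The inconsistent-state problem can be reproduced in miniature without
touching pip or requests. The sketch below is an assumption-laden
stand-in: fakepkg_demo is a throwaway module written to a temporary
directory, and overwriting its file plays the role of an installer
replacing 2.x with 1.x on disk. The already-imported module object in
sys.modules keeps reporting the old version.

```python
import importlib
import os
import sys
import tempfile


def demo_stale_import():
    """Show that a module imported before a reinstall stays in memory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fakepkg_demo.py")
        with open(path, "w") as f:
            f.write("VERSION = '2.0'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("fakepkg_demo")
            imported = module.VERSION
            # Simulate an in-process install replacing the files on disk.
            with open(path, "w") as f:
                f.write("VERSION = '1.0'\n")
            in_memory = sys.modules["fakepkg_demo"].VERSION
            with open(path) as f:
                on_disk = f.read()
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("fakepkg_demo", None)
    return imported, in_memory, on_disk
```

Any submodule imported for the first time after the overwrite would come
from the new files, which is how code from two versions can end up side
by side in one process.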


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From brett at python.org  Sat Sep  5 19:05:21 2015
From: brett at python.org (Brett Cannon)
Date: Sat, 05 Sep 2015 17:05:21 +0000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <20150905153801.GV19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
Message-ID: <CAP1=2W5h-40F9c4Lk4WDogsupy_ymjTD-TDDpZpoWti1PxNV_g@mail.gmail.com>

On Sat, 5 Sep 2015 at 08:40 Steven D'Aprano <steve at pearwood.info> wrote:

> On Sat, Sep 05, 2015 at 04:08:24PM +0900, Stephen J. Turnbull wrote:
> > Steven D'Aprano writes:
> >
> >  > You say "of course", but did you actually look at the python-list
> >  > archives? If you do, you will see posts like these two within the last
> >  > 24 hours:
> >
> > So let's fix it, already![1]  Now that we have a blessed package
> > management module, why not have a builtin that handles the simple
> > cases?  Say
> >
> >     def installer(package, command='install'):
> >         ...
>
> Python competes strongly with R in the scientific software area, and R
> supports a built-in to do just that:
>
>
> https://stat.ethz.ch/R-manual/R-devel/library/utils/html/install.packages.html


The reason R has a built-in for this is that it's used the vast majority
of the time from a REPL to do data analytics in an exploratory manner
(think Jupyter notebook type of data exploration). Python does not have the
same typical usage style, and so I don't think we should follow R in this
instance (although I have had R users say that packaging in R is far
superior to Python's due to the ease of getting extensions installed, period,
and not because of the lack of a function, but that's another discussion).

From cannatag at gmail.com  Sat Sep  5 20:03:34 2015
From: cannatag at gmail.com (Giovanni Cannata)
Date: Sat, 05 Sep 2015 20:03:34 +0200
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
 <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>
Message-ID: <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>

I'm reading this discussion with interest. pip and PyPI are what keep the Python ecosystem alive and vital.

PyPI especially is what surprises any Python newbie. A single, freely available repository of valuable ready-to-use software, accessible from anywhere: it's hard to believe if you are, for example, a Java or .NET developer.

But are you aware of the problems that PyPI is currently suffering? For about two weeks its search engine has been faulty and fails to find many packages, even though they are available.

This is a very bad thing because PyPI is the front gate to the Python system for many people, more than the python.org site itself.

I think that PyPI deserves special attention for the sake of the whole Python community.





-------- Original message --------
From: Donald Stufft <donald at stufft.io>
Sent: Sat, 05 Sep 2015 18:38:29 +0200
To: Steven D'Aprano <steve at pearwood.info>, python-ideas at python.org
Subject: Re: [Python-ideas] High time for a builtin function to manage packages (simply)?

>On September 5, 2015 at 11:40:17 AM, Steven D'Aprano (steve at pearwood.info) wrote:
>> On Sat, Sep 05, 2015 at 04:08:24PM +0900, Stephen J. Turnbull wrote:
>> > Steven D'Aprano writes:
>> >
>> > > You say "of course", but did you actually look at the python-list
>> > > archives? If you do, you will see posts like these two within the last
>> > > 24 hours:
>> >
>> > So let's fix it, already![1] Now that we have a blessed package
>> > management module, why not have a builtin that handles the simple
>> > cases? Say
>> >
>> > def installer(package, command='install'):
>> > ...
>> 
>> Python competes strongly with R in the scientific software area, and R
>> supports a built-in to do just that:
>> 
>> https://stat.ethz.ch/R-manual/R-devel/library/utils/html/install.packages.html 
>> 
>
>I don't know anything about R, but a built in function is a bad idea. It'll be
>a pretty big footgun I believe. For instance, if you already have requests 2.x
>installed and imported, and then you run the builtin and install something
>that triggers requests 1.x to be installed you'll end up with your Python in
>an inconsistent state. You might even end up importing something from requests
>and ending up with modules from two different versions of requests ending up
>in sys.modules. In addition, the standard library is not really enough to
>accurately install packages from PyPI. You need a real HTML parser that can
>handle malformed input safely, an implementation of PEP 440 versions and
>specifiers (currently implemented in the "packaging" library on PyPI), you also
>need some mechanism for inspecting the currently installed set of packages, so
>you need something like pkg_resources available to properly support that.
>
>
>-----------------
>Donald Stufft
>PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>

From donald at stufft.io  Sat Sep  5 20:05:53 2015
From: donald at stufft.io (Donald Stufft)
Date: Sat, 5 Sep 2015 14:05:53 -0400
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
 <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>
 <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>
Message-ID: <etPan.55eb2f01.5dad6fed.19bd@Draupnir.home>

On September 5, 2015 at 2:03:46 PM, Giovanni Cannata (cannatag at gmail.com) wrote:
>  
> But are you aware of the problems that PyPI is currently suffering? It's about two weeks  
> that its searching engine is faulty and doesn't find many packages, even if they are available.  
>  

I'm aware. I'm only one person and my plate is extremely full. I've been poking at the problem to try and figure it out.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From anandkrishnakumar123 at gmail.com  Sat Sep  5 21:33:00 2015
From: anandkrishnakumar123 at gmail.com (Anand Krishnakumar)
Date: Sat, 05 Sep 2015 19:33:00 +0000
Subject: [Python-ideas] Desperate need for enhanced print function
Message-ID: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>

Hi!

This is the first time I'm sending an email to the python-ideas mailing
list. I've got an enhancement idea for the built-in print function and I
hope it is as good as I think it is.

Imagine you have a trial.py file like this:

a = 4
b = "Anand"

print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
OR
print("Hello, I am ", b, ". My favorite number is ", a, ".")

Well I've got an idea for a function named "print_easy" (The only valid
name I could come up with right now).

So print_easy will be a function which will be used like this (instead of
the current print statement in trial.py) :

print_easy("Hello, I am", b, ". My favorite number is", a, ".")

Which gives out:

Hello, I am Anand. My favorite number is 4.

The work it does is that it casts the variables and it also formats the
sentences it is provided with. It is exclusively for beginners.
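
One possible reading of that behaviour is sketched below. This is only a
guess at the semantics, since the proposal does not spell them out:
format_easy is a hypothetical helper added so the result can be inspected,
and the spacing rule (insert a space between pieces unless the next piece
starts with punctuation) is an assumption.

```python
PUNCTUATION = (".", ",", "!", "?", ";", ":")


def format_easy(*parts):
    """Convert every part with str() and join them with sensible spacing."""
    text = ""
    for part in parts:
        piece = str(part)  # the "casting" step, applied to every argument
        needs_space = (
            text
            and not text.endswith(" ")
            and not piece.startswith(PUNCTUATION)
        )
        if needs_space:
            text += " "
        text += piece
    return text


def print_easy(*parts):
    """Hypothetical print_easy: print the formatted sentence."""
    print(format_easy(*parts))
```

For comparison, the existing print already converts each argument with
str(); what it lacks is only the punctuation-aware spacing.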

I'm 14 and I came up with this idea after seeing my fellow classmates at
school struggling to do something like this with the standard print
statement.
Sure, you can use the format method but won't that be a bit too much for
beginners? (Also, casting is inevitable in every programmer's career)

Please let me know how this sounds. If it gains some traction, I'll work on
it a bit more and clearly list out the features.

Thanks,
Anand.



-- 
Anand.

From rymg19 at gmail.com  Sat Sep  5 21:39:35 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Sat, 05 Sep 2015 14:39:35 -0500
Subject: [Python-ideas] Desperate need for enhanced print function
In-Reply-To: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
Message-ID: <6961A279-D649-4F37-9D5C-85476BE6F789@gmail.com>



On September 5, 2015 2:33:00 PM CDT, Anand Krishnakumar <anandkrishnakumar123 at gmail.com> wrote:
>Hi!
>
>This is my first time I'm sending an email to the python-ideas mailing
>list. I've got an enhancement idea for the built-in print function and
>I
>hope it is as good as I think it is.
>
>Imagine you have a trial.py file like this:
>
>a = 4
>b = "Anand"
>
>print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
>OR
>print("Hello, I am ", b, ". My favorite number is ", a, ".")
>
>Well I've got an idea for a function named "print_easy" (The only valid
>name I could come up with right now).
>
>So print_easy will be a function which will be used like this (instead
>of
>the current print statement in trial.py) :
>
>print_easy("Hello, I am", b, ". My favorite number is", a, ".")

I'm sorry...but I can't see the difference. Aren't the two calls exactly the same??

>
>Which gives out:
>
>Hello, I am Anand. My favorite number is 4.
>
>The work it does is that it casts the variables and it also formats the
>sentences it is provided with. It is exclusively for beginners.
>
>I'm 14 and I came up with this idea after seeing my fellow classmates
>at
>school struggling to do something like this with the standard print
>statement.
>Sure, you can use the format method but won't that be a bit too much
>for
>beginners? (Also, casting is inevitable in every programmer's career)
>
>Please let me know how this sounds. If it gains some traction, I'll
>work on
>it a bit more and clearly list out the features.
>
>Thanks,
>Anand.

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From cannatag at gmail.com  Sat Sep  5 21:49:01 2015
From: cannatag at gmail.com (Giovanni Cannata)
Date: Sat, 05 Sep 2015 21:49:01 +0200
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
 <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>
 <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>
 <etPan.55eb2f01.5dad6fed.19bd@Draupnir.home>
Message-ID: <fdctan3cvw7khq1cjb5yrg2u.1441481574353@email.android.com>

Hi Donald, do you mean you're the only one in charge of maintaining PyPI? I'm sorry to hear this; I thought that a critical service like PyPI was supported by a team. I (and, I presume, other developers) rely heavily on it. Maybe this should be brought to the attention of the PSF.



-------- Original message --------
From: Donald Stufft <donald at stufft.io>
Sent: Sat, 05 Sep 2015 20:05:53 +0200
To: python-ideas at python.org, Steven D'Aprano <steve at pearwood.info>, Giovanni Cannata <cannatag at gmail.com>
Subject: Re: [Python-ideas] High time for a builtin function to manage packages (simply)?

>On September 5, 2015 at 2:03:46 PM, Giovanni Cannata (cannatag at gmail.com) wrote:
>>  
>> But are you aware of the problems that PyPI is currently suffering? For about two weeks
>> its search engine has been faulty and has failed to find many packages, even though they are available.
>>  
>
>I'm aware. I'm only one person and my plate is extremely full. I've been poking at the problem to try and figure it out.
>
>-----------------
>Donald Stufft
>PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>

From tjreedy at udel.edu  Sat Sep  5 22:13:28 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 5 Sep 2015 16:13:28 -0400
Subject: [Python-ideas] Desperate need for enhanced print function
In-Reply-To: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
Message-ID: <msfidr$gta$1@ger.gmane.org>

On 9/5/2015 3:33 PM, Anand Krishnakumar wrote:
> Hi!
>
> This is my first time I'm sending an email to the python-ideas mailing
> list. I've got an enhancement idea for the built-in print function and I
> hope it is as good as I think it is.
>
> Imagine you have a trial.py file like this:
>
> a = 4
> b = "Anand"
>
> print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
> OR
> print("Hello, I am ", b, ". My favorite number is ", a, ".")

This prints

Hello, I am  Anand . My favorite number is  4 .

because the sep parameter defaults to ' '.  If you want 'ease', leave 
out end spaces within quotes and don't worry about spaces before periods.

To get what you want, add ", sep=''" before the closing parenthesis.
print("Hello, I am ", b, ". My favorite number is ", a, ".", sep='')
Hello, I am Anand. My favorite number is 4.

> Well I've got an idea for a function named "print_easy" (The only valid
> name I could come up with right now).

When you want a mix of '' and ' ' separators, learn to use templates.

print("Hello, I am {}. My favorite number is {}.".format(b, a))
Hello, I am Anand. My favorite number is 4.

This ends up being easier to type and read because it does not have all 
the extraneous unprinted commas and quotes in the middle.

The formatting could also be written

print("Hello, I am {name}. My favorite number is {favnum}."
       .format(name=b, favnum=a))
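
The separator behaviour described above can be checked with a small sketch. The helper name `render` is only an illustration (it is not part of any library); it captures what print would write so the three spellings can be compared side by side.

```python
import io


def render(*args, **kwargs):
    """Return what print(*args, **kwargs) would write, as a string."""
    buffer = io.StringIO()
    print(*args, file=buffer, **kwargs)
    return buffer.getvalue()


a = 4
b = "Anand"

# Default sep=' ' puts a space between every pair of arguments,
# in addition to the spaces already inside the quoted strings.
default_sep = render("Hello, I am ", b, ". My favorite number is ", a, ".")

# sep='' joins the arguments exactly as written.
empty_sep = render("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")

# A template keeps all spacing and punctuation inside one string.
templated = render("Hello, I am {}. My favorite number is {}.".format(b, a))
```

Here `empty_sep` and `templated` hold the same text, while `default_sep` contains the doubled spaces and the space before each period.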

-- 
Terry Jan Reedy


From donald at stufft.io  Sat Sep  5 22:38:25 2015
From: donald at stufft.io (Donald Stufft)
Date: Sat, 5 Sep 2015 16:38:25 -0400
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <fdctan3cvw7khq1cjb5yrg2u.1441481574353@email.android.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
 <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>
 <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>
 <etPan.55eb2f01.5dad6fed.19bd@Draupnir.home>
 <fdctan3cvw7khq1cjb5yrg2u.1441481574353@email.android.com>
Message-ID: <etPan.55eb52c1.6cc35185.19bd@Draupnir.home>

On September 5, 2015 at 3:49:07 PM, Giovanni Cannata (cannatag at gmail.com) wrote:
> Hi Donald, you mean you're the only one in charge of maintaining PyPI? I'm sorry for this,  
> I thought that a critical service like PyPI was supported by a team. I (and presume other  
> developers) rely heavily on it. Maybe this should be brought to the attention of the PSF.  
>  

Yes and no. I'm the primary developer/administrator for it now, and it doesn't get much contribution from others. It is also supported by the Python infrastructure team, but there are only a handful of us, I'm the only person on that team who has paid time to work on it, and the Infrastructure team is also responsible for many other python.org services.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From tjreedy at udel.edu  Sat Sep  5 23:03:36 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 5 Sep 2015 17:03:36 -0400
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <msflbr$qda$1@ger.gmane.org>

On 9/5/2015 3:08 AM, Stephen J. Turnbull wrote:

> So let's fix it, already![1]  Now that we have a blessed package
> management module, why not have a builtin that handles the simple
> cases?  Say
>
>      def installer(package, command='install'):
>          ...

Because new builtins have a high threshold to reach, and this doesn't 
reach it? Installation is a specialized and rare operation.

Because pip must be installed anyway, so a function should be in the 
package and imported?
   from pip import main
(I realize that PM independence is part of the proposal. See below.)

I think a gui frontend is an even better idea. The tracker has a 
proposal to make such, once written, available from Idle.
   https://bugs.python.org/issue23551
I was thinking that the gui code should be in pip itself and not 
idlelib, so as to be available to any Python shell or IDE. If it covered 
multiple PMs, then it might go somewhere in the stdlib.

-- 
Terry Jan Reedy


From rosuav at gmail.com  Sun Sep  6 01:41:38 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 6 Sep 2015 09:41:38 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <msflbr$qda$1@ger.gmane.org>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
Message-ID: <CAPTjJmqzbV7OKnJjbWiv0RqwZQ1+keZqvRxqdsMJoxCcgLOw=A@mail.gmail.com>

On Sun, Sep 6, 2015 at 7:03 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 9/5/2015 3:08 AM, Stephen J. Turnbull wrote:
>
>> So let's fix it, already![1]  Now that we have a blessed package
>> management module, why not have a builtin that handles the simple
>> cases?  Say
>>
>>      def installer(package, command='install'):
>>          ...
>
>
> Because new builtins have a high threshold to reach, and this doesn't reach
> it? Installation is a specialized and rare operation.

If there's a simple entry-point like that in an importable module,
anyone who wants it as a builtin can simply pre-import it (this would
be used interactively anyway). All it'd take would be a known function
to import.

ChrisA

From abarnert at yahoo.com  Sun Sep  6 01:59:35 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 5 Sep 2015 16:59:35 -0700
Subject: [Python-ideas] Desperate need for enhanced print function
In-Reply-To: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
Message-ID: <0636FF1F-9677-41CC-9B17-ED0B056829C5@yahoo.com>

On Sep 5, 2015, at 12:33, Anand Krishnakumar <anandkrishnakumar123 at gmail.com> wrote:
> 
> The work it does is that it casts the variables and it also formats the sentences it is provided with. It is exclusively for beginners.

What do you mean by "casts"? The print function already calls str on each of its arguments. Do you want print_easy to do something different? If so, what? And why do you call it "casting"?

More generally, how do you want the output of print_easy to differ from the output of print, given the same arguments?

If you're hoping it can automatically figure out where to put spaces and where not to, what rule do you want it to use? Obviously it can't be some complicated DWIM AI that figures out whether you're writing English sentences, German sentences, a columnar table, or source code, but maybe there's something simple you can come up with that's still broadly useful. (If you can figure out how to turn that into code, you can put print_easy up on PyPI and let people get some experience using it and increase the chances of getting buy-in to the idea, and even if everyone rejects it as a builtin, you and other students can still use it.)
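
As one possible starting point for such a rule, here is a minimal sketch, assuming the rule is simply "separate pieces with one space, except before a piece that starts with punctuation". Both `format_easy` and `print_easy` are hypothetical names taken from this thread, not existing functions, and the rule is deliberately naive (it knows nothing about brackets, quotes, or other languages).

```python
def format_easy(*args):
    """Join str() of each argument with single spaces, except that no
    space is added before a piece that starts with punctuation."""
    out = ""
    for arg in args:
        piece = str(arg).strip()
        if not piece:
            continue  # skip arguments that are empty or only whitespace
        if out and piece[0] not in ".,;:!?":
            out += " "
        out += piece
    return out


def print_easy(*args):
    """Hypothetical beginner-friendly print built on format_easy."""
    print(format_easy(*args))


message = format_easy("Hello, I am", "Anand", ". My favorite number is", 4, ".")
```

With these inputs `message` is the sentence from the original proposal, with no stray spaces before the periods.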

From tjreedy at udel.edu  Sun Sep  6 06:04:42 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 6 Sep 2015 00:04:42 -0400
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <msflbr$qda$1@ger.gmane.org>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
Message-ID: <msge1e$a7h$1@ger.gmane.org>

On 9/5/2015 5:03 PM, Terry Reedy wrote:

> I think a gui frontend is an even better idea. The tracker has a
> proposal to make such, once written, available from Idle.
>    https://bugs.python.org/issue23551
> I was thinking that the gui code should be in pip itself and not
> idlelib, so as to be available to any Python shell or IDE. If it covered
> multiple PMs, then it might go somewhere in the stdlib.

Inspired by this thread, I did some experiments and am fairly confident 
that pip.main can be imported and used directly, bypassing paths, 
subprocesses, and pipes.


-- 
Terry Jan Reedy


From russell at keith-magee.com  Sun Sep  6 10:57:38 2015
From: russell at keith-magee.com (Russell Keith-Magee)
Date: Sun, 6 Sep 2015 16:57:38 +0800
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <msge1e$a7h$1@ger.gmane.org>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org> <msge1e$a7h$1@ger.gmane.org>
Message-ID: <CAJxq84_2_=TnkhOK0Khd=TWbAZJfe-V+A1VqVjpVHwE4s=gptA@mail.gmail.com>

On Sun, Sep 6, 2015 at 12:04 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 9/5/2015 5:03 PM, Terry Reedy wrote:
>
>> I think a gui frontend is an even better idea. The tracker has a
>> proposal to make such, once written, available from Idle.
>>    https://bugs.python.org/issue23551
>> I was thinking that the gui code should be in pip itself and not
>> idlelib, so as to be available to any Python shell or IDE. If it covered
>> multiple PMs, then it might go somewhere in the stdlib.
>
>
> Inspired by this thread, I did some experiments and am fairly confident that
> pip.main can be imported and used directly, bypassing paths, subprocesses,
> and pipes.

I can confirm that this is, indeed, possible. I use this exact
technique in my tool Briefcase to simplify the process of packaging
code as an app bundle.

https://github.com/pybee/briefcase/blob/master/briefcase/app.py#L108

Yours,
Russ Magee %-)

From trent at snakebite.org  Sun Sep  6 17:19:54 2015
From: trent at snakebite.org (Trent Nelson)
Date: Sun, 6 Sep 2015 11:19:54 -0400
Subject: [Python-ideas] PyParallel update
Message-ID: <20150906151954.GB1069@trent.me>

[CC'ing python-dev@ for those that are curious; please drop and keep
follow-up discussion to python-ideas@]

Hi folks,

I've made a lot of progress on PyParallel since the PyCon dev summit
(https://speakerdeck.com/trent/pyparallel-pycon-2015-language-summit); I
fixed the outstanding breakage with generators, exceptions and whatnot.
I got the "instantaneous Wiki search server" working[1] and implemented
the entire TechEmpower Frameworks Benchmark Suite[2], including a
PyParallel-friendly pyodbc module, allowing database connections and
querying in parallel.

[1]:
https://github.com/pyparallel/pyparallel/blob/branches/3.3-px/examples/wiki/wiki.py
[2]:
https://github.com/pyparallel/pyparallel/blob/branches/3.3-px/examples/tefb/tefb.py

I set up a landing page for the project:

    http://pyparallel.org

And there was some good discussion on reddit earlier this week:

    https://www.reddit.com/r/programming/comments/3jhv80/pyparallel_an_experimental_proofofconcept_fork_of/

I've put together some documentation on the project, its aims, and
the key parts of the solution regarding the parallelism through
simple client/server paradigms.  This documentation is available
directly on the github landing page for the project:

    https://github.com/pyparallel/pyparallel

Writing that documentation forced me to formalize (or at least commit to)
the restrictions/trade-offs that PyParallel would introduce, and I'm
pretty happy I was basically able to boil it down into a single rule:

    Don't persist parallel objects.

That keeps the mental model very simple.  You don't need to worry about
locking or ownership or races or anything like that.  Just don't persist
parallel objects, that's the only thing you have to remember.

It's actually really easy to convert existing C code or Python code into
something that is suitable for calling from within a parallel callback by
just ensuring that rule isn't violated.  It took about four hours to figure
out how NumPy allocated stuff and add in the necessary PyParallel-aware
tweaks, and not that much longer for pyodbc.  (Most stuff "just works",
though.)  (The ABI changes would mean this is a Python 4.x type of
thing; there are fancy ways we could avoid ABI changes and get this
working on Python 3.x, but, eh, I like the 4.x target.  It's realistic.)

The other thing that clicked is that asyncio and PyParallel would
actually work really well together for exploiting client-driven
parallelism (PyParallel really is only suited to server-oriented
parallelism at the moment, i.e. serving HTTP requests in parallel).

With asyncio, though, you could keep the main-thread/single-thread
client-drives-computation paradigm, but have it actually dispatch work
to parallel.server() objects behind the scenes.  For example, in order
to process all files in a directory in parallel, asyncio would request a
directory listing (i.e. issue a GET /) which the PyParallel HTTP server
would return, it would then create non-blocking client connections to
the same server and invoke whatever HTTP method is desired to do the
file processing.  You can either choose to write the new results from
within the parallel context (which could then be accessed as normal
files via HTTP), or you could have PyParallel return json/bytes, which
could then be aggregated by asyncio.
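
The client side of that idea might look roughly like the following sketch. It uses only asyncio; `process_file` is a hypothetical stand-in for a non-blocking HTTP request to a parallel server, and nothing here uses or reflects PyParallel's actual API.

```python
import asyncio


async def process_file(name):
    # Stand-in for a non-blocking request to a parallel server: it only
    # yields to the event loop once and returns a placeholder result.
    await asyncio.sleep(0)
    return {"file": name, "length": len(name)}


async def process_directory(listing):
    # The single client thread dispatches one request per file and
    # aggregates the results, in listing order, once all have completed.
    return await asyncio.gather(*(process_file(name) for name in listing))


results = asyncio.run(process_directory(["a.txt", "bb.txt"]))
```

The point of the sketch is only the shape: the client stays single-threaded and drives the computation, while the work behind each awaited call could be done elsewhere.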

Everything is within the same process, so you get all the benefits that
provides (free access to anything within scope, like large data
structures, from within parallel contexts).  You can synchronously call
back into the main thread from a parallel thread, too, if you wanted to
update a complex data structure directly.

The other interesting thing that documentation highlights is the
advantage of the split brain "main thread vs parallel thread" GC and
non-GC allocators.  I'm not sure if I've ever extolled the virtue of
such an approach on paper or in e-mail.  It's pretty neat though and
allows us to avoid a whole raft of problems that need to be solved when
you have a single GC/memory model.

Next steps: once 3.5 is tagged, I'm going to bite the bullet and rebase.
That'll require a bit of churn, so if there's enough interest from
others, I figured we'd use the opportunity to at least get it building
again on POSIX (Linux/OSX/FreeBSD).  From there people can start
implementing the missing bits for implementing the parallel machinery
behind the scenes.

The parallel interpreter thread changes I made are platform agnostic,
the implementation just happens to be on Windows at the moment; don't
let the Windows-only thing detract from what's actually being pitched: a
(working, demonstrably-performant) solution to "Python's GIL problem".

Regards,

    Trent.


From srkunze at mail.de  Sun Sep  6 19:33:29 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sun, 6 Sep 2015 19:33:29 +0200
Subject: [Python-ideas] Wheels For ...
Message-ID: <55EC78E9.1050300@mail.de>

Hi folks,

recently, I came across http://pythonwheels.com/ while researching how 
to make a proper Python distribution for PyPI. I thought it would be a 
great idea to tell other maintainers to upload their content as wheels, 
so I approached a couple of them. Some of them already provided wheels.

Happy to have been able to build my own distribution, I discussed the 
issue at hand with some people, and I would like to share my findings and 
propose some ideas:

1) documentation is oddly split up across several places and references old material
2) once up and running (setup.cfg, setup.py, etc.) it works, but 
everybody needs to do it on their own
3) there is more than one way to do it (upload, wheel, source/binary, etc.) (sigh)
4) contacting maintainers to propose wheels is easy via GitHub or email; 
otherwise it is almost impossible or very tedious
5) reactions were evenly split among "none", "yes", "when ready" and "nope"

None: well, okay
yes: that's good
when ready: well, okay
nope: what a pity for wheels; example: 
https://github.com/simplejson/simplejson/issues/122

I personally find the situation unsatisfying. Someone proposed the 
following solution in the form of a question:

Why do developers need to build their distribution themselves?

I had no real answer for him, but after pondering it for a while, I found it 
really insightful. Viewed from a different angle, packaging your 
own distribution is actually a waste of time. It is a tedious, 
error-prone task involving no creativity whatsoever. Developers, on the 
other hand, are people with very little time and a lot of 
creativity, which should be spent better. The logical conclusion 
would be that PyPI should build wheels for the developers for every 
python/platform combination necessary.


With this post, I would like to raise awareness of this among the people 
in charge of the Python infrastructure.


Best,
Sven

From tjreedy at udel.edu  Sun Sep  6 19:54:17 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 6 Sep 2015 13:54:17 -0400
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <CAJxq84_2_=TnkhOK0Khd=TWbAZJfe-V+A1VqVjpVHwE4s=gptA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <msge1e$a7h$1@ger.gmane.org>
 <CAJxq84_2_=TnkhOK0Khd=TWbAZJfe-V+A1VqVjpVHwE4s=gptA@mail.gmail.com>
Message-ID: <mshukt$v44$1@ger.gmane.org>

On 9/6/2015 4:57 AM, Russell Keith-Magee wrote:
> On Sun, Sep 6, 2015 at 12:04 PM, Terry Reedy <tjreedy at udel.edu> wrote:
>> On 9/5/2015 5:03 PM, Terry Reedy wrote:
>>
>>> I think a gui frontend is an even better idea. The tracker has a
>>> proposal to make such, once written, available from Idle.
>>>     https://bugs.python.org/issue23551
>>> I was thinking that the gui code should be in pip itself and not
>>> idlelib, so as to be available to any Python shell or IDE. If it covered
>>> multiple PMs, then it might go somewhere in the stdlib.
>>
>>
>> Inspired by this thread, I did some experiments and am fairly confident that
>> pip.main can be imported and used directly, bypassing paths, subprocesses,
>> and pipes.
>
> I can confirm that this is, indeed, possible. I use this exact
> technique in my tool Briefcase to simplify the process of packaging
> code as an app bundle.
>
> https://github.com/pybee/briefcase/blob/master/briefcase/app.py#L108

There *is*, however, a potential gotcha suggested on the issue by Donald 
Stufft and verified by me.  pip is designed for one command (main call) 
per invocation, not for repeated commands.  When started, it finds and 
caches a list of installed packages.  The install command does not 
update the cached list.  So 'show new_or_upgraded_package' will not 
work.  Your series of installs do not run into this.
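
One way to sidestep the stale-cache issue described above is to give every pip command its own process, at the cost of the in-process convenience. A minimal sketch, assuming pip is installed for the running interpreter; the helper names `pip_command` and `run_pip` are hypothetical, and "some-package" is a placeholder.

```python
import subprocess
import sys


def pip_command(*args):
    """Build the argv for running pip under the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run one pip command in a fresh process, so no state is shared
    between an install and a later query."""
    return subprocess.run(pip_command(*args), check=True)


# Example usage (not executed here):
# run_pip("install", "some-package")
# run_pip("show", "some-package")
```

Because each call starts a new interpreter, the `show` command sees whatever the preceding `install` left on disk.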

-- 
Terry Jan Reedy


From bussonniermatthias at gmail.com  Sun Sep  6 20:29:59 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Sun, 6 Sep 2015 11:29:59 -0700
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <55EC78E9.1050300@mail.de>
References: <55EC78E9.1050300@mail.de>
Message-ID: <CANJQusUMX+0s4eX+RBJoJ0VjUYxcwGLhU7CDET9J_A0S1wiFUQ@mail.gmail.com>

Hi Sven,

Just adding a few comments inline:

On Sun, Sep 6, 2015 at 7:33 PM, Sven R. Kunze <srkunze at mail.de> wrote:

> 3) there is more than one way to do it (upload, wheel, source/binary, etc.) (sigh)

And most are uploading/registering over HTTP (sigh)

> nope: what a pity for wheels; example:
> https://github.com/simplejson/simplejson/issues/122

But that's for non-pure-Python wheels;
wheels can be universal, in which case they are easy to build.
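
The distinction shows up in the wheel's file name, whose last three dash-separated fields are the Python tag, ABI tag and platform tag (PEP 427). A small sketch for telling the two kinds apart; the function names and the `example_pkg` file names are made up for illustration.

```python
def wheel_tags(filename):
    """Split a wheel file name into (python, abi, platform) tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python(filename):
    """True when the wheel is tied to neither an ABI nor a platform."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


universal = is_pure_python("example_pkg-1.0-py2.py3-none-any.whl")
compiled = is_pure_python("example_pkg-1.0-cp34-cp34m-win32.whl")
```

A universal wheel carries the `py2.py3-none-any` tags, so one file serves every interpreter and platform; a wheel with compiled extensions needs one file per Python/ABI/platform combination.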

> Why do developers need to build their distribution themselves?

Historical reasons. On GitHub, at least, it is pretty easy to make Travis-CI
build your wheels; some scientific packages (which are not the easiest to build)
have done that, so automation is possible. And these cases really need
particular environments where all aspects of the build are controlled.

>
> I had no real answer for him, but after pondering it for a while, I found it
> really insightful. Viewed from a different angle, packaging your own
> distribution is actually a waste of time. It is a tedious, error-prone task
> involving no creativity whatsoever. Developers, on the other hand, are
> people with very little time and a lot of creativity, which
> should be spent better. The logical conclusion would be that PyPI should build
> wheels for the developers for every python/platform combination necessary.

I think that some of that could be done by warehouse at some point:
https://github.com/pypa/warehouse

But you will never be able to cover everything. I'm sure people will ask PyPI
to build for windows 98 server version otherwise.

Personally, for pure Python packages I now use https://pypi.python.org/pypi/flit
which is one of the only packaging tools for which I can remember all the
steps to get a package on PyPI without reading the docs.

-- 
M

[Sven, sorry for duplicate :-) ]

From steve.dower at python.org  Sun Sep  6 21:22:19 2015
From: steve.dower at python.org (Steve Dower)
Date: Sun, 6 Sep 2015 12:22:19 -0700
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <55EC78E9.1050300@mail.de>
References: <55EC78E9.1050300@mail.de>
Message-ID: <55EC926B.5050708@python.org>

On 06Sep2015 1033, Sven R. Kunze wrote:
> The logical conclusion
> would be that PyPI should build wheels for the developers for every
> python/platform combination necessary.

This would be a wonderful situation to end up in, but the problem is 
that many wheels have difficult source dependencies to configure. It is 
much easier for the developers, who should already have working systems, 
to build the wheel themselves than it would be either for them to 
provide/configure a remote system to do it, or for the end-user to 
configure their own system. (And if it can't be tested on a particular 
system, then the developer probably shouldn't release wheels for that 
system anyway.)

What I would rather see is a way to delegate building to other people by 
explicitly allowing someone to add wheels to PyPI for existing releases 
without necessarily being able to make a new release or delete old ones. 
There is some trust involved, but it could also enable more ad-hoc 
systems of building wheels through Travis/Jenkins/VSO/etc. automation 
without needing to reveal login information through your repo (i.e. you 
give the "Jenkins-wheel" user permission to publish wheels for your 
package).

Not sure how feasible this is, but I'd guess it's easier than trying to 
run our own build servers.

Cheers,
Steve

From lac at openend.se  Sun Sep  6 21:54:51 2015
From: lac at openend.se (Laura Creighton)
Date: Sun, 06 Sep 2015 21:54:51 +0200
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <msi4ao$sq6$1@ger.gmane.org>
References: <55EC78E9.1050300@mail.de> <msi4ao$sq6$1@ger.gmane.org>
Message-ID: <201509061954.t86Jspjg011546@fido.openend.se>

In a message of Sun, 06 Sep 2015 15:31:16 -0400, Terry Reedy writes:
>On 9/6/2015 1:33 PM, Sven R. Kunze wrote:
>>
>> With this post, I would like to raise awareness of this among the people
>> in charge of the Python infrastructure.
>
>pypa is in charge of packaging. https://github.com/pypa
>I believe the google groups link is their discussion forum.

They have one -- https://groups.google.com/forum/#!forum/pypa-dev
but you can also reach them on the distutils mailing list.
https://mail.python.org/mailman/listinfo/distutils-sig

I think, rather than discussion, it is 'people willing to write code'
that they are short of ...

Laura


From ncoghlan at gmail.com  Mon Sep  7 01:48:21 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 Sep 2015 09:48:21 +1000
Subject: [Python-ideas] Desperate need for enhanced print function
In-Reply-To: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
Message-ID: <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>

On 6 September 2015 at 05:33, Anand Krishnakumar
<anandkrishnakumar123 at gmail.com> wrote:
> print("Hello, I am ", b, ". My favorite number is ", a, ".")
>
> I'm 14 and I came up with this idea after seeing my fellow classmates at
> school struggling to do something like this with the standard print
> statement.
> Sure, you can use the format method but won't that be a bit too much for
> beginners? (Also, casting is inevitable in every programmer's career)

Hi Anand,

Your feedback reflects a common point of view on the surprising
difficulty of producing nicely formatted messages from Python code. As
such, it currently appears likely that Python 3.6 will allow you and
your peers to write output messages like this:

    print(f"Hello, I am {b}. My favorite number is {a}.")

as a simpler alternative to the current options:

    print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")
    print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
    print("Hello, I am {}. My favorite number is {}.".format(b, a))
    print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
    print("Hello, I am %s. My favorite number is %s." % (b, a))
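
On an interpreter that supports the f-string syntax (3.6 or later), the equivalence of these spellings can be checked directly. This is only a sketch; it passes an explicit mapping to format_map rather than locals() so that it behaves the same at module level and inside a function.

```python
a = 4
b = "Anand"

messages = [
    f"Hello, I am {b}. My favorite number is {a}.",
    "Hello, I am " + b + ". My favorite number is " + str(a) + ".",
    "Hello, I am {}. My favorite number is {}.".format(b, a),
    "Hello, I am {b}. My favorite number is {a}.".format_map({"a": a, "b": b}),
    "Hello, I am %s. My favorite number is %s." % (b, a),
]

# Every spelling produces the same text.
all_same = len(set(messages)) == 1
```

Only the concatenation form needs an explicit str() call; the others convert the number as part of formatting.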

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Sep  7 02:20:51 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 Sep 2015 10:20:51 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <fdctan3cvw7khq1cjb5yrg2u.1441481574353@email.android.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
 <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>
 <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>
 <etPan.55eb2f01.5dad6fed.19bd@Draupnir.home>
 <fdctan3cvw7khq1cjb5yrg2u.1441481574353@email.android.com>
Message-ID: <CADiSq7feM2k53WCyB13W=Y=o1G+t7DqXhZea12dHuZgxu=XSzw@mail.gmail.com>

On 6 September 2015 at 05:49, Giovanni Cannata <cannatag at gmail.com> wrote:
> Hi Donald, you mean you're the only one in charge of maintaining PyPI? I'm
> sorry for this, I thought that a critical service like PyPI was supported by
> a team. I (and presume other developers) rely heavily on it. Maybe this
> should be brought to the attention of the PSF.

We're aware. PyPI is currently the *only* python.org service with a
dedicated full time developer (and Donald's time is actually
contributed by the OpenStack group at HP rather than being funded
directly by the PSF). There are also a number of organisations that
donate resources to operating the python.org infrastructure (e.g. the
Fastly CDN, and hosting services from Heroku, Rackspace and the Open
Source Lab at Oregon State University).

Bringing paid development in to support community driven projects,
while also ensuring financial sustainability and fiscally responsible
management of contributors' funds is an interesting challenge, and one
the PSF continues to work to get better at.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Sep  7 02:26:26 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 Sep 2015 10:26:26 +1000
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <201509061954.t86Jspjg011546@fido.openend.se>
References: <55EC78E9.1050300@mail.de> <msi4ao$sq6$1@ger.gmane.org>
 <201509061954.t86Jspjg011546@fido.openend.se>
Message-ID: <CADiSq7ewai8fv9XJsOYEvijTZ+JPsJo6+E=O7S9C5p1cZT=Wgg@mail.gmail.com>

On 7 September 2015 at 05:54, Laura Creighton <lac at openend.se> wrote:
> I think, rather than discussion, it is 'people willing to write code'
> that they are short of ...

For the build farm idea, it's not just writing the code initially,
it's operating the resulting infrastructure, and that's a much bigger
ongoing commitment. Automatically building wheels for source uploads
is definitely on the wish list, there are just a large number of other
improvements needed before it's feasible.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rymg19 at gmail.com  Mon Sep  7 03:22:27 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Sun, 06 Sep 2015 20:22:27 -0500
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <55EC78E9.1050300@mail.de>
References: <55EC78E9.1050300@mail.de>
Message-ID: <3B3CDE41-BB58-447D-BB86-9994EE9587FA@gmail.com>



On September 6, 2015 12:33:29 PM CDT, "Sven R. Kunze" <srkunze at mail.de> wrote:
>Hi folks,
>
>currently, I came across http://pythonwheels.com/ during researching
>how 
>to make a proper Python distribution for PyPI. I thought it would be 
>great idea to tell other maintainers to upload their content as wheels 
>so I approached a couple of them. Some of them already provided wheels.
>
>Happy being able to have built my own distribution, I discussed the 
>issue at hand with some people and I would like to share my findings
>and 
>propose some ideas:
>
>1) documentation is weirdly split up/distributed and references old
>material
>2) once up and running (setup.cfg, setup.py etc. etc.) it works but 
>everybody needs to do it on their own
>3) more than one way to do (upload, wheel, source/binary etc.) it
>(sigh)
>4) making contact to propose wheels on github or per email is easy 
>otherwise almost impossible or very tedious
>5) reactions went evenly split from "none", "yes", "when ready" to
>"nope"
>
>None: well, okay
>yes: that's good
>when ready: well, okay
>nope: what a pity for wheels; example: 
>https://github.com/simplejson/simplejson/issues/122
>
>I personally find the situation not satisfying. Someone proposes the 
>following solution in form of a question:
>
>Why do developers need to build their distribution themselves?
>
>I had not real answer to him, but pondering a while over it, I found it
>
>really insightful. Viewing this from a different angle, packaging your 
>own distribution is actually a waste of time. It is a tedious, 
>error-prone task involving no creativity whatsoever. Developers on the 
>other hand are actually people with very little time and a lot of 
>creativity at hand which should spend better. The logical conclusion 
>would be that PyPI should build wheels for the developers for every 
>python/platform combination necessary.
>

You can already do this with CI services. I wrote a post about doing that with AppVeyor:

http://kirbyfan64.github.io/posts/using-appveyor-to-distribute-python-wheels.html

but the idea behind it should apply easily to Travis and others. In reality, you're probably using a CI service to run your tests anyway, so it might as well build your wheels, too!

>
>With this post, I would like raise awareness of the people in charge of
>
>the Python infrastructure.
>
>
>Best,
>Sven
>_______________________________________________
>Python-ideas mailing list
>Python-ideas at python.org
>https://mail.python.org/mailman/listinfo/python-ideas
>Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From steve at pearwood.info  Mon Sep  7 03:26:45 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Sep 2015 11:26:45 +1000
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <55EC78E9.1050300@mail.de>
References: <55EC78E9.1050300@mail.de>
Message-ID: <20150907012645.GX19373@ando.pearwood.info>

On Sun, Sep 06, 2015 at 07:33:29PM +0200, Sven R. Kunze wrote:

> Why do developers need to build their distribution themselves?
> 
> I had not real answer to him, but pondering a while over it, I found it 
> really insightful. Viewing this from a different angle, packaging your 
> own distribution is actually a waste of time. It is a tedious, 
> error-prone task involving no creativity whatsoever. Developers on the 
> other hand are actually people with very little time and a lot of 
> creativity at hand which should spend better. The logical conclusion 
> would be that PyPI should build wheels for the developers for every 
> python/platform combination necessary.

Over on the python-list mailing list, Ned Batchelder asked a question. I 
haven't seen an answer there, and as far as I know he isn't subscribed 
here, so I'll take the liberty of copying his question here:

Ned says:

"As a developer of a Python package, I don't see how this would be 
better. The developer would still have to get their software into some 
kind of uniform configuration, so the central authority could package 
it.  You've moved the problem from, "everyone has to make wheels" to 
"everyone has to make a tree that's structured properly."  But if we can 
do the second thing, the first thing is really easy.

Maybe I've misunderstood?"


-- 
Steve

From stephen at xemacs.org  Mon Sep  7 03:51:23 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 07 Sep 2015 10:51:23 +0900
Subject: [Python-ideas] High time for a builtin function to
	manage	packages (simply)?
In-Reply-To: <msflbr$qda$1@ger.gmane.org>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
Message-ID: <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>

Terry Reedy writes:
 > On 9/5/2015 3:08 AM, Stephen J. Turnbull wrote:
 > 
 > > So let's fix it, already![1]  Now that we have a blessed package
 > > management module, why not have a builtin that handles the simple
 > > cases?  Say
 > >
 > >      def installer(package, command='install'):
 > >          ...
 > 
 > Because new builtins have a high threshold to reach, and this doesn't 
 > reach it?

In your opinion.  

And in mine!  Personally, I don't have a problem with remembering
python -m pip, nor do I have a problem with explaining it as
frequently as necessary to my students.  But there are only 20 of them,
rather than the thousands that folks like Steven (and you?) are
dealing with on python-list -- and there's the rub.

I'm suggesting this because of the vehemence with which Steven (among
others) objects to any suggestion that packages belong on PyPI, and
the fact that he can back that up with fairly distressing anecdotes
about the number of beginner posts asking about pip problems.  I would
really like to see that put to rest.

 > Installation is a specialized and rare operation.

help(), quit(), and quit are builtins.  I never use quit or quit()
(Ctrl-D works on all the systems I use), so I guess they are
"specialized and rare" in some sense, and I'm far more likely to use
dir() and a pydoc HTML server than help().

More to the point, the trouble packaging causes beginners and Steven
D'Aprano on python-list is apparently widespread and daily.  At some
point beginner-friendliness has enough value to make it into the
stdlib and even builtins.

 > Because pip must be installed anyway, so a function should be in the 
 > package and imported?
 >    from pip import main

I don't pronounce that "install".  Discoverability matters a lot for
the users in question (which is why I'm not happy with "installer",
but it's somewhat more memorable than "pip").

 > I think a gui frontend is an even better idea.

I think it's a great idea in itself.

But IMO it doesn't address this issue because the almost but not
really universally-available GUI is Tcl/Tk, which isn't even available
in any of the four packaged Python instances I have installed (Mac OS
X system Python 2.6 and 2.7, MacPorts Python 2.7 and 3.4, although
IIRC MacPorts offers a tk variant you can enable, but it's off by
default).

 > The tracker has a proposal to make such, once written, available
 > from Idle.
 >    https://bugs.python.org/issue23551
 > I was thinking that the gui code should be in pip itself

Obviously; it doesn't address the present issue if it's not ensured by
ensure_pip.  Which further suggests something like ensure_pyqt as
well.  (Or ensure_tk, if you think that perpetuating Tcl/Tk is
acceptable.)  I think that's a huge mess, given the size and messiness
of the dependencies.  I suppose a browser-based interface like that of
pydoc could deal with the "universality" issue, but I don't know how
fragile it is.


From steve at pearwood.info  Mon Sep  7 04:18:02 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Sep 2015 12:18:02 +1000
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <msflbr$qda$1@ger.gmane.org>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
Message-ID: <20150907021802.GY19373@ando.pearwood.info>

On Sat, Sep 05, 2015 at 05:03:36PM -0400, Terry Reedy wrote:
> On 9/5/2015 3:08 AM, Stephen J. Turnbull wrote:
> 
> >So let's fix it, already![1]  Now that we have a blessed package
> >management module, why not have a builtin that handles the simple
> >cases?  Say
> >
> >     def installer(package, command='install'):
> >         ...
> 
> Because new builtins have a high threshold to reach, and this doesn't 
> reach it? Installation is a specialized and rare operation.

You're right about the first part, but as Chris has already suggested, 
this need not be *literally* a built-in. Like help() it could be 
imported at REPL startup.

And I'm not really so sure about how rare it is. Sure, installing a 
single package only happens once... unless you're installing it to 
multiple installations. Or upgrading the package. Or installing more 
than one package. Looking at questions on various programming forums, 
including Python but other languages as well, "how do I install X?" is 
an extremely common question.

And, with the general reluctance to add new packages to the stdlib, and 
the emphasis on putting them onto PyPI first, I think that it will 
become even more common in the future.


> I think a gui frontend is an even better idea. The tracker has a 
> proposal to make such, once written, available from Idle.
>   https://bugs.python.org/issue23551
> I was thinking that the gui code should be in pip itself and not 
> idlelib, so as to be available to any Python shell or IDE. If it covered 
> multiple PMs, then it might go somewhere in the stdlib.

As I see it, there are three high-level steps to an awesome installer:

1. Have an excellent repository of software to install;
2. have a powerful interactive interface to the repo that Just Works;
3. add a GUI interface.

I think that with PyPI we certainly have #1 covered, but I don't think 
we have #2 yet, there are still too many ways that things can "Not 
Work". Number 3 is icing on the cake - it makes a great system even 
better.

I didn't specify whether the interactive interface should be a 
stand-alone application like pip, or a command in the REPL like R uses, 
or even both. I like the idea of being able to install packages directly 
from the Python prompt. It works well within R, and I don't see why it 
wouldn't work in Python either. But it isn't much of an imposition to 
run "python -m pip ..." at the shell.
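
To make the REPL option concrete: such a helper could be little more than a
thin wrapper around the command line. A minimal sketch, assuming a
hypothetical function named install() (nothing of that name exists in the
stdlib or in pip today) and a hypothetical "runner" parameter so the command
can be inspected without actually running pip:

```python
import subprocess
import sys


def build_pip_command(package, command="install"):
    """Return the argv list for running pip under the current interpreter."""
    # sys.executable ensures we use the pip that belongs to *this* Python,
    # which matters when several installations live side by side.
    return [sys.executable, "-m", "pip", command, package]


def install(package, command="install", runner=subprocess.call):
    """Hypothetical REPL helper: run pip in a child process.

    Returns whatever the runner returns (the exit code for subprocess.call).
    """
    return runner(build_pip_command(package, command))
```

At the prompt this would be used as install("spam"). Running pip in a child
process rather than importing it keeps pip's own state out of the
interactive session.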



-- 
Steve

From abarnert at yahoo.com  Mon Sep  7 05:07:44 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 6 Sep 2015 20:07:44 -0700
Subject: [Python-ideas] High time for a builtin function to
	manage	packages (simply)?
In-Reply-To: <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>

On Sep 6, 2015, at 18:51, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> But IMO it doesn't address this issue because the almost but not
> really universally-available GUI is Tcl/Tk, which isn't even available
> in any of the four packaged Python instances I have installed (Mac OS
> X system Python 2.6 and 2.7, MacPorts Python 2.7 and 3.4, although
> IIRC MacPorts offers a tk variant you can enable, but it's off by
> default).

Tcl/Tk, and Tkinter for all pre-installed Pythons but 2.3, have been included with every OS X since they started pre-installing 2.5. Here's a brand new laptop that I've done nothing to but run the 10.10.4 update and other recommended updates from the App Store app:

    $ /usr/bin/python2.6
    Python 2.6.9 (unknown, Sep  9 2014, 15:05:12)
    [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.391)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import Tkinter
    >>> root = Tkinter.Tk()
    >>>

That works, and pops up an empty root window.

And it works with all python.org installs for 10.6 or later, all Homebrew default installs, standard source builds... Just about anything besides MacPorts (which seems to want to build Tkinter against its own Tcl/Tk instead of Apple's).

Also, why do you think Qt would be less of a problem? Apple has at various times bundled Qt into OS X and/or Xcode, but not consistently, and even when it's there it's often set up in a way that you can't use it, and of course Apple has never included PyQt or PySide. So, if pip used Qt, a user would have to go to qt.io, register an account, figure out what they need to download and install, figure out how to make it install system-wide instead of per-user, and then repeat for PySide against each copy of Python they want to use. Either that, or pip would have to include its own complete copy of Qt and PySide.

From abarnert at yahoo.com  Mon Sep  7 05:17:53 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 6 Sep 2015 20:17:53 -0700
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <20150907021802.GY19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
Message-ID: <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>

On Sep 6, 2015, at 19:18, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> 
> I didn't specify whether the interactive interface should be a 
> stand-alone application like pip, or a command in the REPL like R uses, 
> or even both. I like the idea of being able to install packages directly 
> from the Python prompt. It works well within R, and I don't see why it 
> wouldn't work in Python either. But it isn't much of an imposition to 
> run "python -m pip ..." at the shell.

Personally, I've never found ^Zpython -m pip spam && fg too hard (or just using ! from IPython), but I can understand why novices might. :)

Anyway, the problem comes when you upgrade (directly or indirectly) a module that's already imported. Reloading is neither easy (especially if you need to reload a module that you only imported indirectly and upgraded indirectly) nor fool-proof. When I run into problems, I usually don't have much trouble stashing any costly intermediate objects, exiting the REPL, re-launching, and restoring, but I don't think novices would have as much fun.

Is there a way the installer could, after working out the requirements, tell you something like "This command will upgrade 'spam' from 1.3.2 to 1.4.1, and you have imported 'spam' and 'spam.eggs' from the package, so you may need to restart after the upgrade. Continue?" That might be good enough. It's not exactly an everyday problem, so as long as it's visible when it's happened and obvious how to work around it so users who run into it for the first time don't just decide Python or pip or spam is "broken" and give up, that might be sufficient.

(And a GUI installer integrated into IDLE would presumably have no additional problems, and could make the experience even nicer--especially since it's already got a "Restart Shell" option built in.)
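
The check itself looks cheap. A rough sketch, assuming the installer can
hand over a mapping of packages it is about to upgrade, and assuming (which
is not always true) that a distribution's name matches its top-level import
name; imported_from() and restart_warning() are made-up names for
illustration:

```python
import sys


def imported_from(package_names, modules=None):
    """Return sorted names of loaded modules under the given top-level packages."""
    if modules is None:
        modules = sys.modules
    wanted = set(package_names)
    # A module named 'spam.eggs' belongs to the top-level package 'spam'.
    return sorted(name for name in list(modules) if name.split(".")[0] in wanted)


def restart_warning(upgrades, modules=None):
    """Build one warning per upgraded package that is already imported.

    upgrades maps a top-level package name to (old_version, new_version).
    """
    lines = []
    for name, (old, new) in sorted(upgrades.items()):
        loaded = imported_from([name], modules)
        if loaded:
            lines.append(
                "This command will upgrade %r from %s to %s, and you have "
                "imported %s; you may need to restart after the upgrade."
                % (name, old, new, ", ".join(loaded))
            )
    return lines
```

Packages that are upgraded but were never imported produce no message, so
the common case stays quiet.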

From rosuav at gmail.com  Mon Sep  7 05:25:32 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 7 Sep 2015 13:25:32 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
 <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>
Message-ID: <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>

On Mon, Sep 7, 2015 at 1:17 PM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> Anyway, the problem comes when you upgrade (directly or indirectly) a module that's already imported. Reloading is neither easy (especially if you need to reload a module that you only imported indirectly and upgraded indirectly) nor fool-proof. When I run into problems, I usually don't have much trouble stashing any costly intermediate objects, exiting the REPL, re-launching, and restoring, but I don't think novices would have as much fun.
>
> Is there a way the installer could, after working out the requirements, tell you something like "This command will upgrade 'spam' from 1.3.2 to 1.4.1, and you have imported 'spam' and 'spam.eggs' from the package, so you may need to restart after the upgrade. Continue?" That might be good enough. It's not exactly an everyday problem, so as long as it's visible when it's happened and obvious how to work around it so users who run into it for the first time don't just decide Python or pip or spam is "broken" and give up, that might be sufficient.
>

How often does pip actually need to upgrade an already-installed
package in order to install something you've just requested? Maybe the
rule could be simpler: if there are any upgrades at all, regardless of
whether you've imported from those packages, recommend a restart. The
use-case I'd be most expecting is this:

>>> import spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'spam'
>>> install("spam")
... chuggity chug chug ...
>>> import spam
>>>

In the uncommon case where spam depends on ham v1.4.7 or newer *AND*
you already have ham <1.4.7 installed, a simple message should
suffice. (Oh, and you also have to not have any version of spam
installed already, else you won't be able to use install() anyway.)

ChrisA

From donald at stufft.io  Mon Sep  7 06:01:34 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 7 Sep 2015 00:01:34 -0400
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CADiSq7feM2k53WCyB13W=Y=o1G+t7DqXhZea12dHuZgxu=XSzw@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150905153801.GV19373@ando.pearwood.info>
 <etPan.55eb1a85.663e61d3.19bd@Draupnir.home>
 <xjl02ppe1qq2be1uksepxf03.1441472663300@email.android.com>
 <etPan.55eb2f01.5dad6fed.19bd@Draupnir.home>
 <fdctan3cvw7khq1cjb5yrg2u.1441481574353@email.android.com>
 <CADiSq7feM2k53WCyB13W=Y=o1G+t7DqXhZea12dHuZgxu=XSzw@mail.gmail.com>
Message-ID: <etPan.55ed0c1f.4a8a6a3c.31bc@Draupnir.home>

On September 6, 2015 at 8:20:54 PM, Nick Coghlan (ncoghlan at gmail.com) wrote:
> On 6 September 2015 at 05:49, Giovanni Cannata wrote:
> > Hi Donald, you mean you're the only one in charge of maintaining PyPI? I'm
> > sorry for this, I thought that a critical service like PyPI was supported by
> > a team. I (and presume other developers) rely heavily on it. Maybe this
> > should be brought to the attention of the PSF.
> 
> We're aware. PyPI is currently the *only* python.org service with a
> dedicated full time developer (and Donald's time is actually
> contributed by the OpenStack group at HP rather than being funded
> directly by the PSF). There are also a number of organisations that
> donate resources to operating the python.org infrastructure (e.g. the
> Fastly CDN, and hosting services from Heroku, Rackspace and the Open
> Source Lab at Oregon State University).
> 
> Bringing paid development in to support community driven projects,
> while also ensuring financial sustainability and fiscally responsible
> management of contributors' funds is an interesting challenge, and one
> the PSF continues to work to get better at.
> 
>

I'm not exactly full time on PyPI either, though I am full time on packaging. I
split my time between PyPI, pip, and any other related work that I need to. So
PyPI (as important as it is) really only has part of my attention.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From donald at stufft.io  Mon Sep  7 06:05:31 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 7 Sep 2015 00:05:31 -0400
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
 <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>
 <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>
Message-ID: <etPan.55ed0d0b.74cebdb1.31bc@Draupnir.home>

On September 6, 2015 at 11:26:04 PM, Chris Angelico (rosuav at gmail.com) wrote:
> > How often does pip actually need to upgrade an already-installed 
> package in order to install something you've just requested? 
> Maybe the
> rule could be simpler: if there are any upgrades at all, regardless 
> of
> whether you've imported from those packages, recommend a restart. 
> The
> use-case I'd be most expecting is this:

Due to the nature of ``pip install --upgrade``, it's fairly common. At this
time ``pip install --upgrade`` is "greedy" and will try to upgrade the named
package and all of its dependencies, even if there is already a version of the
dependency installed that satisfies the version constraints.
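
A toy model of the difference, for illustration only. This is not pip's
actual code; plan_upgrades() and its arguments are hypothetical, versions
are plain tuples, and dependencies are a flat list:

```python
def plan_upgrades(named, deps, installed, minimum, latest, greedy=True):
    """Return the set of packages that would be (re)installed.

    named     -- the package named on the command line
    deps      -- its dependencies, as a flat list of names
    installed -- dict: package -> installed version tuple (absent if missing)
    minimum   -- dict: dependency -> lowest version tuple 'named' accepts
    latest    -- dict: package -> newest available version tuple
    """
    plan = set()
    if installed.get(named) != latest[named]:
        plan.add(named)
    for dep in deps:
        have = installed.get(dep)
        if have == latest[dep]:
            continue  # already the newest version, nothing to do either way
        # Greedy: upgrade regardless.  Otherwise: only if missing or too old.
        if greedy or have is None or have < minimum.get(dep, (0,)):
            plan.add(dep)
    return plan
```

With ham 1.5.0 installed, ham 1.6.0 available, and spam requiring
ham >= 1.4.7, the greedy plan touches both spam and ham, while the
non-greedy plan touches only spam.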

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From rosuav at gmail.com  Mon Sep  7 06:09:46 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 7 Sep 2015 14:09:46 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <etPan.55ed0d0b.74cebdb1.31bc@Draupnir.home>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
 <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>
 <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>
 <etPan.55ed0d0b.74cebdb1.31bc@Draupnir.home>
Message-ID: <CAPTjJmpNwEv1qbjM3nF9fZoe_sMDhs8F+i3DrmL=8uBF_7zhAA@mail.gmail.com>

On Mon, Sep 7, 2015 at 2:05 PM, Donald Stufft <donald at stufft.io> wrote:
> On September 6, 2015 at 11:26:04 PM, Chris Angelico (rosuav at gmail.com) wrote:
>> > How often does pip actually need to upgrade an already-installed
>> package in order to install something you've just requested?
>> Maybe the
>> rule could be simpler: if there are any upgrades at all, regardless
>> of
>> whether you've imported from those packages, recommend a restart.
>> The
>> use-case I'd be most expecting is this:
>
> Due to the nature of ``pip install --upgrade``, it's fairly common. At this
> time ``pip install --upgrade`` is "greedy" and will try to upgrade the named
> package and all of its dependencies, even if there is already a version of the
> dependency installed that satisfies the version constraints.

Okay. What if "--upgrade" isn't the default when it's being called
from within an interactive session? Would that work?

ChrisA

From donald at stufft.io  Mon Sep  7 06:20:23 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 7 Sep 2015 00:20:23 -0400
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <20150907012645.GX19373@ando.pearwood.info>
References: <55EC78E9.1050300@mail.de>
 <20150907012645.GX19373@ando.pearwood.info>
Message-ID: <etPan.55ed1087.5b6eedad.31bc@Draupnir.home>

On September 6, 2015 at 9:27:32 PM, Steven D'Aprano (steve at pearwood.info) wrote:
> On Sun, Sep 06, 2015 at 07:33:29PM +0200, Sven R. Kunze wrote:
>  
> > Why do developers need to build their distribution themselves?
> >
> > I had not real answer to him, but pondering a while over it, I found it
> > really insightful. Viewing this from a different angle, packaging your
> > own distribution is actually a waste of time. It is a tedious,
> > error-prone task involving no creativity whatsoever. Developers on the
> > other hand are actually people with very little time and a lot of
> > creativity at hand which should spend better. The logical conclusion
> > would be that PyPI should build wheels for the developers for every
> > python/platform combination necessary.
>  
> Over on the python-list mailing list, Ned Batchelder asked a question. I
> haven't seen an answer there, and as far as I know he isn't subscribed
> here, so I'll take the liberty of copying his question here:
>  
> Ned says:
>  
> "As a developer of a Python package, I don't see how this would be
> better. The developer would still have to get their software into some
> kind of uniform configuration, so the central authority could package
> it. You've moved the problem from, "everyone has to make wheels" to
> "everyone has to make a tree that's structured properly." But if we can
> do the second thing, the first thing is really easy.
>  
> Maybe I've misunderstood?"
>  
>  


A PyPI build farm for authors is something I plan on getting to if someone
doesn't beat me to it. It won't be mandatory to use it, and it's not going to
cover every corner case where you need some crazy obscure library installed,
but it will ideally try to make things better.

As for why it's better, it's actually pretty simple. Let's take lxml for
example, which binds against libxml2. It needs to be built on Windows, on OSX,
and on various Linux distributions in order to cover the spread of just the
common cases. If we want to start to get into uncommon platforms we're looking
at various BSDs, Solaris, etc. That's a lot of infrastructure for each project
to maintain for the common cases, much less the uncommon ones, and we can
centralize that maintenance into one location. In addition to all of that, it
turns out you pretty much need to get most of the way to defining that
configuration in a central location anyways, since pip already needs to know how
to build your project; the only things it doesn't know are what platforms you
support and what, if any, extra libraries you require to be installed. There
are some PEPs in the works that may make that second part known ahead of time,
and for the first one, we can simply ask when a project enables the build farm.

Even if we only support things that don't require additional libraries to be
installed, that's still a pretty big win by default that will allow a wide
number of projects to be installed from a binary distribution and not require
a compiler toolchain that previously couldn't be due to authors not willing or
not able to manage the overhead of the infrastructure around building on all
of those platforms.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From njs at pobox.com  Mon Sep  7 07:07:35 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 6 Sep 2015 22:07:35 -0700
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CAPTjJmpNwEv1qbjM3nF9fZoe_sMDhs8F+i3DrmL=8uBF_7zhAA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
 <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>
 <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>
 <etPan.55ed0d0b.74cebdb1.31bc@Draupnir.home>
 <CAPTjJmpNwEv1qbjM3nF9fZoe_sMDhs8F+i3DrmL=8uBF_7zhAA@mail.gmail.com>
Message-ID: <CAPJVwBk7E74n_36a55rgQUU7XYJEyBBLHTHFXDARofJRRZeU=A@mail.gmail.com>

On Sep 6, 2015 9:09 PM, "Chris Angelico" <rosuav at gmail.com> wrote:
>
> On Mon, Sep 7, 2015 at 2:05 PM, Donald Stufft <donald at stufft.io> wrote:
> > On September 6, 2015 at 11:26:04 PM, Chris Angelico (rosuav at gmail.com) wrote:
> >> How often does pip actually need to upgrade an already-installed
> >> package in order to install something you've just requested?
> >> Maybe the rule could be simpler: if there are any upgrades at all,
> >> regardless of whether you've imported from those packages, recommend
> >> a restart. The use-case I'd be most expecting is this:
> >
> > Due to the nature of ``pip install --upgrade``, it's fairly common. At
> > this time ``pip install --upgrade`` is "greedy" and will try to upgrade
> > the named package and all of its dependencies, even if there is already
> > a version of the dependency installed that satisfies the version
> > constraints.
>
> Okay. What if "--upgrade" isn't the default when it's being called
> from within an interactive session? Would that work?

FWIW the recursive behaviour of --upgrade is perhaps the single most hated
feature of pip (almost all scientific packages find it so annoying that
they refuse to provide dependency metadata at all), and AFAIK everyone has
agreed to deprecate it in general and replace it with a non-recursive
upgrade command, just no-one has gotten around to it:
  https://github.com/pypa/pip/issues/59
So I wouldn't worry about defining special interactive semantics in
particular, someone just has to make the patch to change it in general :-)

The trickier bit is that I'm not sure there's actually any way right now to
know what python packages were affected by a given install or upgrade
command, because it can be the case that after 'pip install X' you then do
'import Y' -- the wheel and module names don't have to match, and in
practice it's not uncommon for there to be discrepancies. (For example,
after 'pip install matplotlib' you can do both 'import matplotlib' and
'import pylab'.)
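
As a rough illustration of that mismatch, here is a minimal sketch (not
something pip does) that asks the installed metadata which top-level modules
a distribution provides. It assumes a Python recent enough to ship
importlib.metadata (3.8+) and a distribution that recorded a top_level.txt
file, which not all of them do. The helper name top_level_modules is made up
for this example.

```python
from importlib import metadata


def top_level_modules(dist_name):
    """Return the importable top-level names recorded for a distribution.

    Returns an empty list if the distribution is not installed or did
    not record a top_level.txt file.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return []
    text = dist.read_text("top_level.txt")  # None when the file is absent
    if not text:
        return []
    return sorted(line.strip() for line in text.splitlines() if line.strip())


if __name__ == "__main__":
    # The output depends entirely on what is installed locally.
    print(top_level_modules("matplotlib"))
```

Because the answer comes from optional metadata, an empty list only means
"unknown", not "this distribution provides no modules".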

-n

From ncoghlan at gmail.com  Mon Sep  7 07:09:03 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 Sep 2015 15:09:03 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <20150907021802.GY19373@ando.pearwood.info>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
Message-ID: <CADiSq7eULM-855u973TGBsZ-ZkmdvPV-v628DzD=jb3mLvLYew@mail.gmail.com>

On 7 September 2015 at 12:18, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sat, Sep 05, 2015 at 05:03:36PM -0400, Terry Reedy wrote:
>> On 9/5/2015 3:08 AM, Stephen J. Turnbull wrote:
>>
>> >So let's fix it, already![1]  Now that we have a blessed package
>> >management module, why not have a builtin that handles the simple
>> >cases?  Say
>> >
>> >     def installer(package, command='install'):
>> >         ...
>>
>> Because new builtins have a high threshold to reach, and this doesn't
>> reach it? Installation is a specialized and rare operation.
>
> You're right about the first part, but as Chris has already suggested,
> this need not be *literally* a built-in. Like help() it could be
> imported at REPL startup.

Technically it's "import site" that injects those - you have to run
with "-S" to prevent them from being installed:

$ python3 -c "quit()"
$ python3 -Sc "quit()"
Traceback (most recent call last):
 File "<string>", line 1, in <module>
NameError: name 'quit' is not defined

Regardless, I agree a "site builtin" like help() or quit() is a better
option here than a true builtin, and I also think it's a useful idea.
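
For readers unfamiliar with the mechanism, a "site builtin" is just an
attribute set on the builtins module at startup, which is how site makes
quit() and help() available. A minimal sketch follows, assuming Python 3;
add_site_builtin and the placeholder install() are hypothetical names used
only for illustration.

```python
import builtins


def install(specifier):
    """Placeholder for the helper discussed in this thread."""
    raise NotImplementedError("illustration only")


def add_site_builtin(name, obj):
    # Setting an attribute on the builtins module makes the name
    # resolvable from any module without an import.
    setattr(builtins, name, obj)


add_site_builtin("install", install)
```

Running with -S would skip any such injection done by site, just as it does
for quit().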

I'd make it simpler than the proposed API though, and instead just
offer an "install(specifier)" API that was a thin shell around
pip.main:

    import sys

    try:
        import pip
    except ImportError:
        pass
    else:
        def install(specifier):
            cmd = ["install"]
            if sys.prefix == sys.base_prefix:
                cmd.append("--user") # User installs only when outside a venv
            cmd.append(specifier)
            # TODO: throw exception when there's a problem
            pip.main(cmd)

If folks want more flexibility, then they'll need to access (and
understand) the underlying installer.
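
One possible way to handle the TODO above is to run pip in a child process
and let a non-zero exit status raise. This is a sketch under my own
assumptions rather than the proposal itself: it swaps pip.main for
"python -m pip" via subprocess, and the names build_install_command and
install are hypothetical.

```python
import subprocess
import sys


def build_install_command(specifier):
    """Build the argument list for a 'python -m pip install' call."""
    # sys.executable ties pip to the interpreter that is currently running
    cmd = [sys.executable, "-m", "pip", "install"]
    if sys.prefix == sys.base_prefix:
        cmd.append("--user")  # user installs only when outside a venv
    cmd.append(specifier)
    return cmd


def install(specifier):
    """Run pip in a child process; raise CalledProcessError on failure."""
    subprocess.run(build_install_command(specifier), check=True)
```

Keeping the command construction separate from the call makes the venv
check easy to inspect without actually installing anything.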

As far as other possible objections go:

* the pkg_resources global state problem we should be able to work
around just by reloading pkg_resources (if already loaded) after
installing new packages (I've previously tried to address some aspects
of that particular problem upstream, but doing so poses significant
backwards compatibility challenges)

* I believe integration with systems like conda, PyPM, and the
Enthought installer should be addressed through a plugin model in pip,
rather than directly in the standard library

* providing a standard library API for querying the set of installed
packages independently of pip is a separate question

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Sep  7 07:18:18 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 Sep 2015 15:18:18 +1000
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <3B3CDE41-BB58-447D-BB86-9994EE9587FA@gmail.com>
References: <55EC78E9.1050300@mail.de>
 <3B3CDE41-BB58-447D-BB86-9994EE9587FA@gmail.com>
Message-ID: <CADiSq7fKVvwcG+rCYr6RK7B8hG0ta0CnqTEWUmzpvEc5O4y7qA@mail.gmail.com>

On 7 September 2015 at 11:22, Ryan Gonzalez <rymg19 at gmail.com> wrote:
> On September 6, 2015 12:33:29 PM CDT, "Sven R. Kunze" <srkunze at mail.de> wrote:
>>really insightful. Viewing this from a different angle, packaging your
>>own distribution is actually a waste of time. It is a tedious,
>>error-prone task involving no creativity whatsoever. Developers on the
>>other hand are actually people with very little time and a lot of
>>creativity at hand which should be spent better. The logical conclusion
>>would be that PyPI should build wheels for the developers for every
>>python/platform combination necessary.
>>
>
> You can already do this with CI services. I wrote a post about doing that with AppVeyor:
>
> http://kirbyfan64.github.io/posts/using-appveyor-to-distribute-python-wheels.html
>
> but the idea behind it should apply easily to Travis and others. In reality, you're probably using a CI service to run your tests anyway, so it might as well build your wheels, too!

Right, Appveyor also has the most well-defined CI instructions on
packaging.python.org:
https://packaging.python.org/en/latest/appveyor.html

It doesn't do auto-upload, as many projects only release occasionally
rather than for every commit. However, it may be desirable to go into
more detail about how to do that, if you'd be interested in sending a
PR based on your post.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From abarnert at yahoo.com  Mon Sep  7 07:12:08 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 6 Sep 2015 22:12:08 -0700
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
 <5647B0DF-5034-4736-9F02-B3E106DD8434@yahoo.com>
 <CAPTjJmq7JMWYpFTW+5J3+iu2xdVANTzs2YoNcXNzbF7iiE_PRQ@mail.gmail.com>
Message-ID: <248D6F33-6FC0-4239-A15D-98529962DF5A@yahoo.com>

On Sep 6, 2015, at 20:25, Chris Angelico <rosuav at gmail.com> wrote:
> 
> On Mon, Sep 7, 2015 at 1:17 PM, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
>> Anyway, the problem comes when you upgrade (directly or indirectly) a module that's already imported. Reloading is neither easy (especially if you need to reload a module that you only imported indirectly and upgraded indirectly) nor fool-proof. When I run into problems, I usually don't have much trouble stashing any costly intermediate objects, exiting the REPL, re-launching, and restoring, but I don't think novices would have as much fun.
>> 
>> Is there a way the installer could, after working out the requirements, tell you something like "This command will upgrade 'spam' from 1.3.2 to 1.4.1, and you have imported 'spam' and 'spam.eggs' from the package, so you may need to restart after the upgrade. Continue?" That might be good enough. It's not exactly an everyday problem, so as long as it's visible when it's happened and obvious how to work around it so users who run into it for the first time don't just decide Python or pip or spam is "broken" and give up, that might be sufficient.
> 
> How often does pip actually need to upgrade an already-installed
> package in order to install something you've just requested?

Not that often, which is why I said "It's not exactly an everyday problem"; just often enough that some novices are going to run into it once or twice, so it can't be ignored.

> Maybe the
> rule could be simpler: if there are any upgrades at all, regardless of
> whether you've imported from those packages, recommend a restart.

I suppose that's possible too. It's overzealous, but it still won't happen _that_ often, so if my suggestion is too much work, this one seems fine to me.

Especially if the message made the issue clear, something about "After an upgrade, you should restart, because any packages you already imported may be unchanged or inconsistent".
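
A minimal sketch of the check behind such a message, assuming the installer
can already tell us which top-level import names an upgrade touches (which,
as noted elsewhere in this thread, is not a given). The function names are
hypothetical.

```python
import sys


def already_imported(package_names):
    """Return names in sys.modules that belong to the given top-level packages."""
    wanted = set(package_names)
    return sorted(
        name for name in list(sys.modules)
        if name.split(".")[0] in wanted
    )


def restart_warning(package_names):
    """Return a warning string, or None if nothing relevant is imported."""
    loaded = already_imported(package_names)
    if not loaded:
        return None
    return (
        "You have already imported {}; after the upgrade you may need to "
        "restart.".format(", ".join(loaded))
    )
```

The simpler rule (warn on any upgrade at all) would skip the sys.modules
lookup and always return the message.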

> The
> use-case I'd be most expecting is this:
> 
> >>> import spam
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: No module named 'spam'
> >>> install("spam")
> ... chuggity chug chug ...
> >>> import spam
> 
> In the uncommon case where spam depends on ham v1.4.7 or newer *AND*
> you already have ham <1.4.7 installed, a simple message should
> suffice. (Oh, and you also have to not have any version of spam
> installed already, else you won't be able to use install() anyway.)
> 
> ChrisA

From abarnert at yahoo.com  Mon Sep  7 07:25:36 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 6 Sep 2015 22:25:36 -0700
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <etPan.55ed1087.5b6eedad.31bc@Draupnir.home>
References: <55EC78E9.1050300@mail.de>
 <20150907012645.GX19373@ando.pearwood.info>
 <etPan.55ed1087.5b6eedad.31bc@Draupnir.home>
Message-ID: <F6CD4556-2AC1-4AD2-BF43-7F46060024FF@yahoo.com>

On Sep 6, 2015, at 21:20, Donald Stufft <donald at stufft.io> wrote:
> 
> Let's take lxml for
> example which binds against libxml2. It needs built on Windows, it needs built
> on OSX, it needs built on various Linux distributions in order to cover the
> spread of just the common cases.

IIRC, Apple included ancient versions (even at the time) of libxml2 up to around 10.7, and at one point they even included one of the broken 2.7.x versions. So a build farm building for 10.6+ (which I think is what python.org builds still target?) is going to build against an ancient libxml2, meaning some features of lxml will be disabled, and others may even be broken. Even if I'm remembering wrong about Apple, I'm sure there are Linux distros with similar issues.

Fortunately, lxml has a built-in option (triggered by an env variable) for dealing with this, by downloading the source, building a local copy of the libs, and statically linking them into lxml, but that means you need some way for a package to specify env variables to be set on the build server. And can you expect most libraries with similar issues to do the same?

(I don't know how many packages actually have similar problems, but since you specifically mentioned lxml as your example, and I had headaches building it for a binary-distributed app supporting 10.6-10.9 a few years ago, I happened to remember this problem.)

From njs at pobox.com  Mon Sep  7 07:39:39 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 6 Sep 2015 22:39:39 -0700
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <F6CD4556-2AC1-4AD2-BF43-7F46060024FF@yahoo.com>
References: <55EC78E9.1050300@mail.de>
 <20150907012645.GX19373@ando.pearwood.info>
 <etPan.55ed1087.5b6eedad.31bc@Draupnir.home>
 <F6CD4556-2AC1-4AD2-BF43-7F46060024FF@yahoo.com>
Message-ID: <CAPJVwBmMm0tLmMHDUcAoVYVAxeovKDCyZXPcEpFOoyy_EY2cGg@mail.gmail.com>

On Sep 6, 2015 10:28 PM, "Andrew Barnert via Python-ideas" <python-ideas at python.org> wrote:
>
> On Sep 6, 2015, at 21:20, Donald Stufft <donald at stufft.io> wrote:
> >
> > Let's take lxml for
> > example which binds against libxml2. It needs built on Windows, it needs built
> > on OSX, it needs built on various Linux distributions in order to cover the
> > spread of just the common cases.
>
> IIRC, Apple included ancient versions (even at the time) of libxml2 up to
> around 10.7, and at one point they even included one of the broken 2.7.x
> versions. So a build farm building for 10.6+ (which I think is what
> python.org builds still target?) is going to build against an ancient
> libxml2, meaning some features of lxml will be disabled, and others may
> even be broken. Even if I'm remembering wrong about Apple, I'm sure there
> are Linux distros with similar issues.
>
> Fortunately, lxml has a built-in option (triggered by an env variable)
> for dealing with this, by downloading the source, building a local copy of
> the libs, and statically linking them into lxml, but that means you need
> some way for a package to specify env variables to be set on the build
> server. And can you expect most libraries with similar issues to do the
> same?

Yes, you can! :-)

I mean, not everyone will necessarily use it, but adding code like

import os

if "PYPI_BUILD_SERVER" in os.environ:
    do_static_link = True

to your setup.py is *wayyyy* easier than buying an OS X machine and
maintaining it and doing manual builds at every release. Or finding a
volunteer who has an OS X box and nagging them at every release and dealing
with trust hassles.

And there are a lot of packages out there that just have some cython files
in them for speedups with no external dependencies, or whatever. A build
farm wouldn't have to be perfect to be extremely useful.

-n

From srkunze at mail.de  Mon Sep  7 07:46:44 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 07 Sep 2015 07:46:44 +0200
Subject: [Python-ideas] One way to do format and print (was: Desperate
 need for enhanced print function)
In-Reply-To: <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
Message-ID: <55ED24C4.9000205@mail.de>

On 07.09.2015 01:48, Nick Coghlan wrote:
> As such, it currently appears likely that Python 3.6 will allow you and
> your peers to write output messages like this:
>
>      print(f"Hello, I am {b}. My favorite number is {a}.")
>
> as a simpler alternative to the current options:
>
>      print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")
>      print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
>      print("Hello, I am {}. My favorite number is {}.".format(b, a))
>      print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
>      print("Hello, I am %s. My favorite number is %s." % (b, a))

Wow, that is awesome and awkward at the same time.

Shouldn't Python 3.7 deprecate at least some of them? (Just looking at 
the Zen of Python and https://xkcd.com/927/ )

Best,
Sven

From stephen at xemacs.org  Mon Sep  7 09:26:11 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 07 Sep 2015 16:26:11 +0900
Subject: [Python-ideas] High time for a builtin function to
	manage	packages (simply)?
In-Reply-To: <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
 <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>
Message-ID: <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp>

Andrew Barnert writes:

 > Tcl/Tk, and Tkinter for all pre-installed Pythons but 2.3, have
 > been included with every OS X since they started pre-installing
 > 2.5.

My mistake, it's only MacPorts where I don't have it.  I used
MacPorts' all-lowercase spelling, which doesn't work in the system
Python.  (The capitalized spelling doesn't work in MacPorts.)

 > And it works with all python.org installs for 10.6 or later, all
 > Homebrew default installs, standard source builds... Just about
 > anything besides MacPorts (which seems to want to build Tkinter
 > against its own Tcl/Tk instead of Apple's)

I recall having problems with trying to build and run against the
system Tcl/Tk in both source and MacPorts, but that was a *long* time
ago (2.6-ish).  Trying it now, on my Mac OS X Yosemite system python
2.7.10, "root=Tkinter.Tk()" creates and displays a window, but doesn't
pop it up.  In fact, "root.tkraise()" doesn't, either.  Oops.  On this
system, IDLE has the same problem with its initial window, and
furthermore complains that Tcl/Tk 8.5.9 is unstable.

Quite possibly this window-raising issue is Just Me.  But based on my
own experience, it is not at all obvious that ensuring availability of
a GUI is possible in the same way we can ensure pip.

 > Also, why do you think Qt would be less of a problem?

I don't.  I think "ensure PyQt" would be a huge burden, much greater
than Tkinter.  Bottom line: IMO, at this point in time, if it has to
Just Work, it has to Work Without GUI.  (Modulo the possibility that
we can use an HTML server and borrow the display engine from the
platform web browser.  I think I already mentioned that, and I think
it's really the way to go.  People who *don't* have a web browser
probably can handle "python -m pip ..." without StackOverflow.)


From rosuav at gmail.com  Mon Sep  7 10:23:13 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 7 Sep 2015 18:23:13 +1000
Subject: [Python-ideas] One way to do format and print (was: Desperate
 need for enhanced print function)
In-Reply-To: <55ED24C4.9000205@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
Message-ID: <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>

On Mon, Sep 7, 2015 at 3:46 PM, Sven R. Kunze <srkunze at mail.de> wrote:
> On 07.09.2015 01:48, Nick Coghlan wrote:
>>
>> As such, it currently appears likely that Python 3.6 will allow you and
>> your peers to write output messages like this:
>>
>>      print(f"Hello, I am {b}. My favorite number is {a}.")
>>
>> as a simpler alternative to the current options:
>>
>>      print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")
>>      print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
>>      print("Hello, I am {}. My favorite number is {}.".format(b, a))
>>      print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
>>      print("Hello, I am %s. My favorite number is %s." % (b, a))
>
>
> Wow, that is awesome and awkward at the same time.
>
> Shouldn't Python 3.7 deprecate at least some of them? (Just looking at the
> Zen of Python and https://xkcd.com/927/ )

Which would you deprecate?

     print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")

The print function stringifies all its arguments and outputs them,
joined by a separator. Aside from the 2/3 compatibility requirement
for single-argument print calls, there's no particular reason to
deprecate this. In any case, this isn't "yet another way to format
strings", it's a feature of print.

     print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")

String concatenation is definitely not going away; but even without
PEP 498, I would prefer to use percent formatting or .format() above
this. Its main advantage over those is that the expressions are in the
right place, which PEP 498 also offers; if it lands, I fully expect
3.6+ code to use it rather than this. But the _functionality_ can't be
taken away.

     print("Hello, I am {}. My favorite number is {}.".format(b, a))

This one is important for non-literals. It's one of the two main ways
of formatting strings...

     print("Hello, I am %s. My favorite number is %s." % (b, a))

... and this is the other. Being available for non-literals means they
can be used with i18n, string tables, and other transformations.
Percent formatting is similar to what other C-derived languages have,
and .format() has certain flexibilities, so neither is likely to be
deprecated any time soon.
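
To make the "non-literals" point concrete, here is a small sketch in which
the template is data looked up at run time, so it cannot be written as a
literal in the source. The tables below are a made-up stand-in for a real
translation catalogue, and the function names are hypothetical.

```python
# A tiny stand-in for a translation catalogue or string table
TEMPLATES = {
    "en": "Hello, I am {name}. My favorite number is {number}.",
    "de": "Hallo, ich bin {name}. Meine Lieblingszahl ist {number}.",
}

PERCENT_TEMPLATES = {
    "en": "Hello, I am %(name)s. My favorite number is %(number)s.",
}


def greet(lang, name, number):
    # The template is chosen at run time, then filled in with .format()
    return TEMPLATES[lang].format(name=name, number=number)


def greet_percent(lang, name, number):
    # The same idea with percent formatting and a mapping
    return PERCENT_TEMPLATES[lang] % {"name": name, "number": number}
```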

     print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))

This one, though, is a bad idea for several reasons. Using locals()
for formatting is restricted - no globals, no expressions, and no
nonlocals that aren't captured in some other way. If this one, and
this one alone, can be replaced by f-string usage, it's done its job.
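
A short sketch of the restriction, assuming an interpreter with PEP 498
f-strings for the second function (3.6 or later, if the PEP lands as
expected). The names are illustrative only.

```python
GREETING = "Hello"  # a module-level (global) name


def with_format_map(name):
    # locals() here contains only 'name', so {GREETING} cannot be resolved
    return "{GREETING}, {name}!".format_map(locals())


def with_fstring(name):
    # An f-string sees globals and accepts arbitrary expressions
    return f"{GREETING}, {name.upper()}!"


def format_map_fails():
    """Return True if the locals()-based version raises KeyError."""
    try:
        with_format_map("world")
    except KeyError:
        return True
    return False
```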

ChrisA

From tjreedy at udel.edu  Mon Sep  7 10:53:00 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 7 Sep 2015 04:53:00 -0400
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
 <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>
 <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <msjja1$kg8$1@ger.gmane.org>

On 9/7/2015 3:26 AM, Stephen J. Turnbull wrote:
> Andrew Barnert writes:
>
>   > Tcl/Tk, and Tkinter for all pre-installed Pythons but 2.3, have
>   > been included with every OS X since they started pre-installing
>   > 2.5.
>
> My mistake, it's only MacPorts where I don't have it.  I used
> MacPorts' all-lowercase spelling, which doesn't work in the system
> Python.  (The capitalized spelling doesn't work in MacPorts.)
>
>   > And it works with all python.org installs for 10.6 or later, all
>   > Homebrew default installs, standard source builds... Just about
>   > anything besides MacPorts (which seems to want to build Tkinter
>   > against its own Tcl/Tk instead of Apple's)

My impression is that MacPorts builds Tkinter 8.6 instead of 8.5.

> I recall having problems with trying to build and run against the
> system Tcl/Tk in both source and MacPorts, but that was a *long* time
> ago (2.6-ish).  Trying it now, on my Mac OS X Yosemite system python
> 2.7.10, "root=Tkinter.Tk()" creates and displays a window, but doesn't
> pop it up.  In fact, "root.tkraise()" doesn't, either.  Oops.  On this
> system, IDLE has the same problem with its initial window, and
> furthermore complains that Tcl/Tk 8.5.9 is unstable.

Mac users who download the PSF Mac installer and want to use tkinter 
should read
https://www.python.org/download/mac/tcltk/
Before the redesign, there was a link to this from the download page, 
but the redesign seems to have removed it.
The page mentions that there may be a window update problem with the 
Apple Tk.

-- 
Terry Jan Reedy


From mal at egenix.com  Mon Sep  7 11:01:12 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 07 Sep 2015 11:01:12 +0200
Subject: [Python-ideas] Desperate need for enhanced print function
In-Reply-To: <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
Message-ID: <55ED5258.2020804@egenix.com>

On 07.09.2015 01:48, Nick Coghlan wrote:
> On 6 September 2015 at 05:33, Anand Krishnakumar
> <anandkrishnakumar123 at gmail.com> wrote:
>> print("Hello, I am ", b, ". My favorite number is ", a, ".")
>>
>> I'm 14 and I came up with this idea after seeing my fellow classmates at
>> school struggling to do something like this with the standard print
>> statement.
>> Sure, you can use the format method but won't that be a bit too much for
>> beginners? (Also, casting is inevitable in every programmer's career)
> 
> Hi Anand,
> 
> Your feedback reflects a common point of view on the surprising
> difficulty of producing nicely formatted messages from Python code. As
> such, it currently appears likely that Python 3.6 will allow you and
> your peers to write output messages like this:
> 
>     print(f"Hello, I am {b}. My favorite number is {a}.")
> 
> as a simpler alternative to the current options:
> 
>     print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")
>     print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
>     print("Hello, I am {}. My favorite number is {}.".format(b, a))
>     print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
>     print("Hello, I am %s. My favorite number is %s." % (b, a))

No need to wait for Python 3.6. Since print is a function, you
can easily override it using your own little helper to make
things easier for you. And this works in all Python versions starting
with Python 2.6:

"""
# For Python 2 you need to make print a function first:
from __future__ import print_function
import sys

_orig_print = print

# Use .format() as basis for print()
def fprint(template, *args, **kws):
    caller = sys._getframe(1)
    context = caller.f_locals
    _orig_print(template.format(**context), *args, **kws)

# Use C-style %-formatting as basis for print()
def printf(template, *args, **kws):
    caller = sys._getframe(1)
    context = caller.f_locals
    _orig_print(template % context, *args, **kws)

# Examples:
a = 1
fprint('a = {a}')
printf('a = %(a)s')

# Let's use fprint() as standard print() in this module:
print = fprint
b = 3
print('b = {b}')
"""
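
One caveat with helpers like these: inside a function, f_locals does not
include module-level names. A variant that also consults the caller's
globals might look like the sketch below. It relies on sys._getframe, which
is a CPython implementation detail, and render() and demo() are hypothetical
names. It returns the string rather than printing it, to keep the example
small.

```python
import sys

GREETING = "Hello"  # module-level name, not part of f_locals inside demo()


def render(template):
    """Format template using the caller's globals, overridden by its locals."""
    caller = sys._getframe(1)
    context = dict(caller.f_globals)
    context.update(caller.f_locals)  # locals win on name clashes
    return template.format(**context)


def demo():
    name = "world"
    return render("{GREETING}, {name}!")
```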

-- 
Marc-Andre Lemburg
eGenix.com


From p.f.moore at gmail.com  Mon Sep  7 12:57:10 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 7 Sep 2015 11:57:10 +0100
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
Message-ID: <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>

On 5 September 2015 at 09:30, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Unfortunately, I've yet to convince the rest of PyPA (let alone the
> community at large) that telling people to call "pip" directly is *bad
> advice* (as it breaks in too many cases that beginners are going to
> encounter), so it would be helpful if folks helping beginners on
> python-list and python-tutor could provide feedback supporting that
> perspective by filing an issue against
> https://github.com/pypa/python-packaging-user-guide

I would love to see "python -m pip" (or where the launcher is
appropriate, the shorter "py -m pip") be the canonical invocation used
in all documentation, discussion and advice on running pip. The main
problems seem to be (1) "but just typing "pip" is shorter and easier
to remember", (2) "I don't understand why pip can't just be a normal
command" and sometimes (3) "isn't this just on Windows because you
can't update pip in place on Windows" (no it isn't, but it's a common
misconception of the issue).

But I would agree with Nick, and recommend that anyone advising people
on how to use pip, *especially* when helping them with issues,
always use "python -m pip" as the canonical command. If you need to
explain why, say that this makes sure that you run pip from the
correct Python interpreter; that's the basic point here.
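
As a rough sketch of the same idea from inside Python code: building the
command from sys.executable ties pip to the interpreter that is actually
running. The helper name pip_command and the package name "requests" are
only placeholders for illustration.

```python
import sys


def pip_command(*args):
    """Build a 'python -m pip ...' command for the running interpreter."""
    # sys.executable is the path of the interpreter executing this code,
    # so pip runs against that interpreter and its site-packages.
    return [sys.executable, "-m", "pip", *args]


# Placeholder package name; the list could be handed to subprocess.check_call().
cmd = pip_command("install", "--user", "requests")
```

The list form avoids shell quoting issues if the interpreter path contains
spaces.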

Paul

From encukou at gmail.com  Mon Sep  7 13:09:03 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Mon, 7 Sep 2015 13:09:03 +0200
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CADiSq7eULM-855u973TGBsZ-ZkmdvPV-v628DzD=jb3mLvLYew@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <20150907021802.GY19373@ando.pearwood.info>
 <CADiSq7eULM-855u973TGBsZ-ZkmdvPV-v628DzD=jb3mLvLYew@mail.gmail.com>
Message-ID: <CA+=+wqAf5SaazyUs8Z--HRfvL=jTWzfP5TDVf88f1DfynkYRsw@mail.gmail.com>

> * I believe integration with systems like conda, PyPM, and the
> Enthought installer should be addressed through a plugin model in pip,
> rather than directly in the standard library

Perhaps integration with RPM/APT could use that as well. If only to
require some kind of --im-really-sure flag for "sudo pip".

From stephen at xemacs.org  Mon Sep  7 13:33:50 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 07 Sep 2015 20:33:50 +0900
Subject: [Python-ideas] High time for a builtin function to
	manage	packages (simply)?
In-Reply-To: <msjja1$kg8$1@ger.gmane.org>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
 <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>
 <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msjja1$kg8$1@ger.gmane.org>
Message-ID: <874mj631lt.fsf@uwakimon.sk.tsukuba.ac.jp>

Terry Reedy writes:

 > My impression is that MacPorts builds Tkinter 8.6 instead of 8.5.

If you mean that MacPorts' current Tcl and Tk ports install Tcl/Tk 8.6,
that is correct.

 > Mac users who download the PSF Mac installer and want to use tkinter 
 > should read
 > https://www.python.org/download/mac/tcltk/

 > The page mentions that there may be a window update problem with the 
 > apple tk.

It also mentions that various Tk versions "in common use" are
unsupported by the python.org-installed Python, and in particular not
the Cocoa Tk.  I suppose it's not hard to do that?  Or maybe chances
are that the X11 and Cocoa Tks "just work", but aren't tested for the
Mac installers?


From abarnert at yahoo.com  Mon Sep  7 14:03:18 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 7 Sep 2015 05:03:18 -0700
Subject: [Python-ideas] High time for a builtin function to
	manage	packages (simply)?
In-Reply-To: <874mj631lt.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp> <msflbr$qda$1@ger.gmane.org>
 <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
 <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>
 <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp> <msjja1$kg8$1@ger.gmane.org>
 <874mj631lt.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <509A9271-AE8C-4E66-97AF-2518C40E95D3@yahoo.com>

On Sep 7, 2015, at 04:33, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> Terry Reedy writes:
> 
>> My impression is that MacPorts builds Tkinter 8.6 instead of 8.5.
> 
> If you mean that MacPorts' current Tcl and Tk ports install Tcl/Tk 8.6,
> that is correct.
> 
>> Mac users who download the PSF Mac installer and want to use tkinter 
>> should read
>> https://www.python.org/download/mac/tcltk/
> 
>> The page mentions that there may be a window update problem with the 
>> apple tk.
> 
> It also mentions that various Tk versions "in common use" are
> unsupported by the python.org-installed Python, and in particular not
> the Cocoa Tk.  I suppose it's not hard to do that?  Or maybe chances
> are that the X11 and Cocoa Tks "just work", but aren't tested for the
> Mac installers?

It's not just a matter of "not tested"; there are actual glitches with some of the versions, including two pretty serious ones that can lead to a freeze, or to some window management commands being ignored.

But once you have some experience with it, and enough test machines of course, it's not actually that hard to build a binary-shippable GUI app that avoids all of these problems and runs against any of Apple's Tk versions from 10.6+ and against the Python.org recommended versions (which I know, because I've done it, at least with Python 2.7 and 3.3).

Making it work reliably from the REPL, or for a script that's not wrapped as a .app, is definitely a lot less fun. But people who want to install from within the REPL or the system shell probably don't want the GUI.

I don't know about making it work reliably from within IDLE. I don't see any reason IDLE couldn't just launch a .app on Mac if that's a problem, but you have to remember the extra fun bit that the app will get its environment from LaunchServices, not IDLE, so you'd need some other way to tell it to use the current venv. (Possibly this just means the app is linked into the venv?)

From steve at pearwood.info  Mon Sep  7 14:19:35 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 7 Sep 2015 22:19:35 +1000
Subject: [Python-ideas] One way to do format and print (was: Desperate
	need for enhanced print function)
In-Reply-To: <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
Message-ID: <20150907121932.GE19373@ando.pearwood.info>

On Mon, Sep 07, 2015 at 06:23:13PM +1000, Chris Angelico wrote:

>      print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
> 
> This one, though, is a bad idea for several reasons.

Such as what?


> Using locals()
> for formatting is restricted - no globals, no expressions, and no
> nonlocals that aren't captured in some other way. 

That's a feature, not a bug.

locals(), by definition, only includes locals. If you want globals, 
non-locals, built-ins, expressions, or the kitchen sink, you can't get 
them from locals(). Just because the locals() trick doesn't handle every 
possible scenario doesn't make it a bad idea for those cases it does 
handle, any more than it is a bad idea to use 2**5 just because the ** 
operator doesn't handle the 3-argument form of pow().

I probably wouldn't use the locals() form if the variable names were 
hard-coded like that, especially for just two of them:

    "Hello, I am {b}. My favorite number is {a}.".format(a=a, b=b)

Where the locals() trick comes in handy is when your template string is 
not hard-coded:

    if greet:
        template = "Hello, I am {name}, and my favourite %s is {%s}."
    else:
        template = "My favourite %s is {%s}."
    if condition:
        template = template % ("number", "x")
    else:
        template = template % ("colour", "c")
    print(template.format_map(locals()))
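
A self-contained sketch of the snippet above, wrapped in a function so
that locals() has something to pick up. The function name describe and
the values given to name, x and c are made up for illustration.

```python
def describe(greet, condition):
    # Sample values (assumptions); in real code these would come from elsewhere.
    name = "Steve"
    x = 42
    c = "blue"
    if greet:
        template = "Hello, I am {name}, and my favourite %s is {%s}."
    else:
        template = "My favourite %s is {%s}."
    # The % step fills in the noun and the *name* of the field to look up.
    if condition:
        template = template % ("number", "x")
    else:
        template = template % ("colour", "c")
    # format_map() then resolves {name}, {x} or {c} from the local variables.
    return template.format_map(locals())


with_greeting = describe(True, True)
without_greeting = describe(False, False)
```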



-- 
Steve

From rosuav at gmail.com  Mon Sep  7 14:21:47 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 7 Sep 2015 22:21:47 +1000
Subject: [Python-ideas] One way to do format and print (was: Desperate
 need for enhanced print function)
In-Reply-To: <20150907121932.GE19373@ando.pearwood.info>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <20150907121932.GE19373@ando.pearwood.info>
Message-ID: <CAPTjJmrOgpkvSjBAaDS0KUZ8uXdcsRSudc1vKRh+1nw2SUpgFw@mail.gmail.com>

On Mon, Sep 7, 2015 at 10:19 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Mon, Sep 07, 2015 at 06:23:13PM +1000, Chris Angelico wrote:
>
>>      print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
>>
>> This one, though, is a bad idea for several reasons.
>
> Such as what?
>
>
>> Using locals()
>> for formatting is restricted - no globals, no expressions, and no
>> nonlocals that aren't captured in some other way.
>
> That's a feature, not a bug.
>
> locals(), by definition, only includes locals. If you want globals,
> non-locals, built-ins, expressions, or the kitchen sink, you can't get
> them from locals(). Just because the locals() trick doesn't handle every
> possible scenario doesn't make it a bad idea for those cases it does
> handle, any more than it is a bad idea to use 2**5 just because the **
> operator doesn't handle the 3-argument form of pow().
>
> I probably wouldn't use the locals() form if the variable names were
> hard-coded like that, especially for just two of them:
>
>     "Hello, I am {b}. My favorite number is {a}.".format(a=a, b=b)
>
> Where the locals() trick comes in handy is when your template string is
> not hard-coded:
>
>     if greet:
>         template = "Hello, I am {name}, and my favourite %s is {%s}."
>     else:
>         template = "My favourite %s is {%s}."
>     if condition:
>         template = template % ("number", "x")
>     else:
>         template = template % ("colour", "c")
>     print(template.format_map(locals()))

It's still a poor equivalent for the others. In terms of "why do we
have so many different ways to do the same thing", the response is
"the good things to do with format_map(locals()) are not the things
you can do with f-strings". If what you're looking for can be done
with either, it's almost certainly not better to use locals().

ChrisA

From srkunze at mail.de  Mon Sep  7 18:22:12 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 07 Sep 2015 18:22:12 +0200
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <CADiSq7ewai8fv9XJsOYEvijTZ+JPsJo6+E=O7S9C5p1cZT=Wgg@mail.gmail.com>
References: <55EC78E9.1050300@mail.de> <msi4ao$sq6$1@ger.gmane.org>
 <201509061954.t86Jspjg011546@fido.openend.se>
 <CADiSq7ewai8fv9XJsOYEvijTZ+JPsJo6+E=O7S9C5p1cZT=Wgg@mail.gmail.com>
Message-ID: <55EDB9B4.7020909@mail.de>

On 07.09.2015 02:26, Nick Coghlan wrote:
> For the build farm idea, it's not just writing the code initially,
> it's operating the resulting infrastructure, and that's a much bigger
> ongoing commitment. Automatically building wheels for source uploads
> is definitely on the wish list, there are just a large number of other
> improvements needed before it's feasible.

Could you be more specific on these improvements, Nick?

Best,
Sven

From srkunze at mail.de  Mon Sep  7 18:27:42 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 07 Sep 2015 18:27:42 +0200
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <CAPJVwBmMm0tLmMHDUcAoVYVAxeovKDCyZXPcEpFOoyy_EY2cGg@mail.gmail.com>
References: <55EC78E9.1050300@mail.de>
 <20150907012645.GX19373@ando.pearwood.info>
 <etPan.55ed1087.5b6eedad.31bc@Draupnir.home>
 <F6CD4556-2AC1-4AD2-BF43-7F46060024FF@yahoo.com>
 <CAPJVwBmMm0tLmMHDUcAoVYVAxeovKDCyZXPcEpFOoyy_EY2cGg@mail.gmail.com>
Message-ID: <55EDBAFE.1040303@mail.de>

On 07.09.2015 07:39, Nathaniel Smith wrote:
>
> > Fortunately, lxml has a built-in option (triggered by an env 
> variable) for dealing with this, by downloading the source, building a 
> local copy of the libs, and statically linking them into lxml, but 
> that means you need some way for a package to specify env variables to 
> be set on the build server. And can you expect most libraries with 
> similar issues to do the same?
>
> Yes, you can! :-)
>
> I mean, not everyone will necessarily use it, but adding code like
>
> if "PYPI_BUILD_SERVER" in os.environ:
>     do_static_link = True
>
> to your setup.py is *wayyyy* easier than buying an OS X machine and 
> maintaining it and doing manual builds at every release. Or finding a 
> volunteer who has an OS X box and nagging them at every release and 
> dealing with trust hassles.
>

Guess what I just needed to do. Depending on somebody else's machine is 
really frustrating.
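
For what it's worth, the environment-variable check quoted above could be
sketched like this. PYPI_BUILD_SERVER is only the hypothetical variable
name from the quoted mail, not something an existing build service sets,
and want_static_link is a made-up helper name.

```python
import os


def want_static_link(environ=None):
    """Return True when the (hypothetical) build-server marker is set."""
    if environ is None:
        environ = os.environ
    # Only the presence of the variable matters in this sketch, not its value.
    return "PYPI_BUILD_SERVER" in environ


# A setup.py could consult this before deciding how to link its C libraries.
do_static_link = want_static_link()
```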

> And there are a lot of packages out there that just have some cython 
> files in them for speedups with no external dependencies, or whatever. 
> A build farm wouldn't have to be perfect to be extremely useful.
>

I agree. A build farm that is just good enough would already put 80% of all 
packages in good shape. Nick mentioned some improvements that are necessary 
before we can indulge in such a build farm (apart from the farm itself).

Best,
Sven

From srkunze at mail.de  Mon Sep  7 18:40:33 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 07 Sep 2015 18:40:33 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <20150907121932.GE19373@ando.pearwood.info>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <20150907121932.GE19373@ando.pearwood.info>
Message-ID: <55EDBE01.7030408@mail.de>

On 07.09.2015 14:19, Steven D'Aprano wrote:
> I probably wouldn't use the locals() form if the variable names were
> hard-coded like that, especially for just two of them:
>
>      "Hello, I am {b}. My favorite number is {a}.".format(a=a, b=b)
>
> Where the locals() trick comes in handy is when your template string is
> not hard-coded:
>
>      if greet:
>          template = "Hello, I am {name}, and my favourite %s is {%s}."
>      else:
>          template = "My favourite %s is {%s}."
>      if condition:
>          template = template % ("number", "x")
>      else:
>          template = template % ("colour", "c")
>      print(template.format_map(locals()))

Err? I rather think you wouldn't pass code review.

Best,
Sven


From srkunze at mail.de  Mon Sep  7 18:48:15 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 07 Sep 2015 18:48:15 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
Message-ID: <55EDBFCF.2030301@mail.de>

On 07.09.2015 10:23, Chris Angelico wrote:
> Which would you deprecate?

Hard to tell. Let me see what you got me here. Remember, I am just 
asking as I don't know better:

>       print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")
>
> The print function stringifies all its arguments and outputs them,
> joined by a separator. Aside from the 2/3 compatibility requirement
> for single-argument print calls, there's no particular reason to
> deprecate this. In any case, this isn't "yet another way to format
> strings", it's a feature of print.

Still necessary?

>       print("Hello, I am " + b + ". My favorite number is " + str(a) + ".")
>
> String concatenation is definitely not going away; but even without
> PEP 498, I would prefer to use percent formatting or .format() above
> this. Its main advantage over those is that the expressions are in the
> right place, which PEP 498 also offers; if it lands, I fully expect
> 3.6+ code to use it rather than this. But the _functionality_ can't be
> taken away.

For sure. However, in that case it shouldn't be used in the official 
documentation any more, right?

>       print("Hello, I am {}. My favorite number is {}.".format(b, a))
>
> This one is important for non-literals. It's one of the two main ways
> of formatting strings...
>
>       print("Hello, I am %s. My favorite number is %s." % (b, a))
>
> ... and this is the other. Being available for non-literals means they
> can be used with i18n, string tables, and other transformations.
> Percent formatting is similar to what other C-derived languages have,

Still necessary? Really, really necessary? Or just because we can?

> and .format() has certain flexibilities, so neither is likely to be
> deprecated any time soon.

format has its own merits as it works like f-strings but on 
non-literals. (again this one-way/one-syntax thing)

>       print("Hello, I am {b}. My favorite number is {a}.".format_map(locals()))
>
> This one, though, is a bad idea for several reasons. Using locals()
> for formatting is restricted - no globals, no expressions, and no
> nonlocals that aren't captured in some other way. If this one, and
> this one alone, can be replaced by f-string usage, it's done its job.

Well sure, I think we all agree on not using that until f-strings are released.

Best,
Sven


From ron3200 at gmail.com  Mon Sep  7 19:32:22 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 7 Sep 2015 12:32:22 -0500
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <CAPTjJmrOgpkvSjBAaDS0KUZ8uXdcsRSudc1vKRh+1nw2SUpgFw@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <20150907121932.GE19373@ando.pearwood.info>
 <CAPTjJmrOgpkvSjBAaDS0KUZ8uXdcsRSudc1vKRh+1nw2SUpgFw@mail.gmail.com>
Message-ID: <mskhn7$a40$1@ger.gmane.org>

On 09/07/2015 07:21 AM, Chris Angelico wrote:
>> Where the locals() trick comes in handy is when your template string is
>> >not hard-coded:
>> >
>> >     if greet:
>> >         template = "Hello, I am {name}, and my favourite %s is {%s}."
>> >     else:
>> >         template = "My favourite %s is {%s}."
>> >     if condition:
>> >         template = template % ("number", "x")
>> >     else:
>> >         template = template % ("colour", "c")
>> >     print(template.format_map(locals()))

> It's still a poor equivalent for the others. In terms of "why do we
> have so many different ways to do the same thing", the response is
> "the good things to do with format_map(locals()) are not the things
> you can do with f-strings". If what you're looking for can be done
> with either, it's almost certainly not better to use locals().

The ability for a format string or template to take a mapping is very 
useful.  Whether or not it's ok for that mapping to be from locals() is 
a separate issue and depends on other factors as well.  It may be 
perfectly fine in some cases, but not so in others.

The issue with + concatenation is it doesn't call str on the objects. 
That can't be changed.  A new operator (or methods on str) that does 
that could work.  It's still not as concise as f-strings which I think 
is a major motivation for having them.
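
A small illustration of the point about +, using made-up values: the
concatenation fails on a non-string operand, while an explicit str() call
or a format call does the conversion.

```python
number = 7  # sample value for illustration

try:
    # + does not call str() on its operands, so str + int raises TypeError.
    message = "My favourite number is " + number
except TypeError as exc:
    error = str(exc)

# Converting explicitly works, as does letting format() do the conversion.
message = "My favourite number is " + str(number)
formatted = "My favourite number is {}".format(number)
```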

Cheers,
    Ron



From skrah at bytereef.org  Mon Sep  7 19:39:24 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 7 Sep 2015 17:39:24 +0000 (UTC)
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de>
Message-ID: <loom.20150907T193538-253@post.gmane.org>

Sven R. Kunze <srkunze at ...> writes:
> >
> >       print("Hello, I am %s. My favorite number is %s." % (b, a))
> >
> > ... and this is the other. Being available for non-literals means they
> > can be used with i18n, string tables, and other transformations.
> > Percent formatting is similar to what other C-derived languages have,
> 
> Still necessary? Really, really necessary? Or just because we can?
> 

Absolutely. For many Python users this is the preferred form. I find
that of all variations, this one is the most readable.


Stefan Krah 



From srkunze at mail.de  Mon Sep  7 20:57:44 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 7 Sep 2015 20:57:44 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <loom.20150907T193538-253@post.gmane.org>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de> <loom.20150907T193538-253@post.gmane.org>
Message-ID: <55EDDE28.5020308@mail.de>

On 07.09.2015 19:39, Stefan Krah wrote:
> Sven R. Kunze <srkunze at ...> writes:
>>>        print("Hello, I am %s. My favorite number is %s." % (b, a))
>>>
>>> ... and this is the other. Being available for non-literals means they
>>> can be used with i18n, string tables, and other transformations.
>>> Percent formatting is similar to what other C-derived languages have,
>> Still necessary? Really, really necessary? Or just because we can?
>>
> Absolutely. For many Python users this is the preferred form. I find
> that of all variations, this one is the most readable.

Okay, convinced. ;)

No, seriously, what would you do if Python deprecated the % syntax? 
Could you switch to {} ?

Best,
Sven

From skrah at bytereef.org  Mon Sep  7 22:29:46 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 7 Sep 2015 20:29:46 +0000 (UTC)
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de> <loom.20150907T193538-253@post.gmane.org>
 <55EDDE28.5020308@mail.de>
Message-ID: <loom.20150907T220622-706@post.gmane.org>

Sven R. Kunze <srkunze at ...> writes:
> >>> Percent formatting is similar to what other C-derived languages have,
> >> Still necessary? Really, really necessary? Or just because we can?
> >>
> > Absolutely. For many Python users this is the preferred form. I find
> > that of all variations, this one is the most readable.
> 
> Okay, convinced. ;)
> 
> No, seriously, what would you do if Python deprecated the % syntax? 
> Could you switch to {} ?

There are many conservative Python users who are probably underrepresented
on this list.  All I can say is that %-formatting never went out of
fashion, see e.g.

  https://google-styleguide.googlecode.com/svn/trunk/cppguide.html#Streams ,

  https://google-styleguide.googlecode.com/svn/trunk/pyguide.html#Strings ,

  https://golang.org/pkg/fmt/


and many others.



Fortunately, there are no plans to deprecate %-formatting (latest
reference is PEP-498).


Stefan Krah




From abarnert at yahoo.com  Mon Sep  7 22:47:59 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 7 Sep 2015 13:47:59 -0700
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55EDBFCF.2030301@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de>
Message-ID: <EF7B8AD4-989A-4AEF-9EEE-F283A6F759F9@yahoo.com>

On Sep 7, 2015, at 09:48, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 07.09.2015 10:23, Chris Angelico wrote:
>> Which would you deprecate?
> 
> Hard to tell. Let me see what you got me here. Remember, I am just asking as I don't know better:
> 
>>      print("Hello, I am ", b, ". My favorite number is ", a, ".", sep="")
>> 
>> The print function stringifies all its arguments and outputs them,
>> joined by a separator. Aside from the 2/3 compatibility requirement
>> for single-argument print calls, there's no particular reason to
>> deprecate this. In any case, this isn't "yet another way to format
>> strings", it's a feature of print.
> 
> Still necessary?

Necessary? No. Useful? Yes.

For example, in a 5-line script I wrote last night, I've got print(head, *names, sep='\t'). I could have used print('\t'.join(chain([head], names))) instead--in fact, any use of multi-argument print can be replaced by print(sep.join(map(str, args)))--but that's less convenient, less readable, and less likely to occur to novices. And there are plenty of other alternatives, from print('{}\t{}'.format(head, '\t'.join(names))) to print(('%s\t'*(len(names)+1) % ((head,)+names))[:-1]) to that favorite of novices on Stack Overflow, print(str([head]+names)[1:-1].replace(', ', '\t')), but would you really want to use any of these?
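
To make the equivalence concrete, here is a small sketch with made-up
sample data. The helper printed() is just a convenience for capturing what
print() writes; it is not part of the script mentioned above.

```python
import io
from itertools import chain

head = "name"              # made-up sample data
names = ["alice", "bob"]


def printed(*args, **kwargs):
    """Return the text print() would write for the given arguments."""
    buf = io.StringIO()
    print(*args, file=buf, **kwargs)
    return buf.getvalue()


# Multi-argument print with a separator...
multi_arg = printed(head, *names, sep="\t")
# ...versus joining by hand before printing a single string.
joined = printed("\t".join(chain([head], names)))
# The generic replacement: str() each argument, then join with the separator.
generic = printed("\t".join(map(str, [head, *names])))
```

All three produce the same line; the first is simply the shortest to write.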

Of course in a "real program" that I needed to use more than once, I would have used the csv module instead of a 5-line script driven by a 3-line shell script, and there's a limit to how far you want to push the argument for quick&dirty scripting/interactive convenience... but that limit isn't "none at all".

When you start trying to mix manual adding of spaces with sep='' to get a sentence formatted exactly right, that's a good sign that you should be using format instead of multi-arg print; when you start trying to add up format strings, that's a good sign you should be using either something simpler or something more complicated. That is an extra thing novices have to get the hang of to become proficient Python programmers that doesn't exist for C or Perl. But the fact that you _can_ use Python like C, but don't have to, isn't really a downside of Python.

From abarnert at yahoo.com  Mon Sep  7 23:00:19 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 7 Sep 2015 14:00:19 -0700
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55EDDE28.5020308@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de> <loom.20150907T193538-253@post.gmane.org>
 <55EDDE28.5020308@mail.de>
Message-ID: <C743712B-775A-4E61-B9B3-6144C7EA7563@yahoo.com>

On Sep 7, 2015, at 11:57, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 07.09.2015 19:39, Stefan Krah wrote:
>> Sven R. Kunze <srkunze at ...> writes:
>>>>       print("Hello, I am %s. My favorite number is %s." % (b, a))
>>>> 
>>>> ... and this is the other. Being available for non-literals means they
>>>> can be used with i18n, string tables, and other transformations.
>>>> Percent formatting is similar to what other C-derived languages have,
>>> Still necessary? Really, really necessary? Or just because we can?
>> Absolutely. For many Python users this is the preferred form. I find
>> that of all variations, this one is the most readable.
> 
> Okay, convinced. ;)
> 
> No, seriously, what would you do if Python deprecated the % syntax? Could you switch to {} ?

There's some confusion over this because the str.format proposal originally suggested deprecating %, and there are still some bloggers and StackOverflow users and so on that claim it does (sometimes even citing the PEP, which explicitly says the opposite). But there will always be cases that % is better for, such as:

 * sharing a table of format strings with code in C or another language
 * simple formats that need to be done fast in a loop
 * formatting strings to use as str.format format strings
 * messages that you've converted from logging to real output
 * ASCII-based wire protocols or file formats
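
Two of the cases above, sketched with made-up values. The first builds a
str.format template with %, where the braces need no escaping. The second
assumes Python 3.5 or later, where bytes objects support % (PEP 461) but
still have no format() method.

```python
# Building a str.format template with %: braces pass through untouched.
field = "name"                          # made-up field name
template = "Hello, {%s}!" % field       # becomes "Hello, {name}!"
greeting = template.format(name="world")

# An ASCII wire-protocol line built from bytes (requires Python 3.5+).
header = b"Content-Length: %d\r\n" % 42
```

Doing the first step with str.format itself would mean doubling every
literal brace in the template.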

So, even if it weren't for the backward compatibility issue for millions of lines of old code (and thousands of stubborn old coders), I doubt it would ever go away. At most, the tutorial and other docs might change to de-emphasize it and make it seem more like an "expert" feature only useful for cases like the above (as is already true for string.Template, and may become true for both % and str.format after f-strings reach widespread use--but nobody can really predict that until f-strings are actually in practical use).

TOOWTDI is a guideline that has to balance against other guidelines, not a strict rule that always trumps everything else, and unless someone can come up with something new that's so much better than both format and % that it's clearly worth overcoming the inertia rather than just being a case of the old standards joke (insert XKCD reference here), there will be two ways to do this.

From random832 at fastmail.com  Mon Sep  7 22:52:52 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 07 Sep 2015 16:52:52 -0400
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
Message-ID: <m21teac5p7.fsf@fastmail.com>

Chris Angelico <rosuav at gmail.com> writes:
> ... and this is the other. Being available for non-literals means they
> can be used with i18n, string tables, and other transformations.
> Percent formatting is similar to what other C-derived languages have,
> and .format() has certain flexibilities, so neither is likely to be
> deprecated any time soon.

I've never understood why .format was invented in the first place,
rather than extending percent-formatting to have the features that it
has over it.


From rymg19 at gmail.com  Tue Sep  8 00:39:26 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 07 Sep 2015 17:39:26 -0500
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <m21teac5p7.fsf@fastmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
Message-ID: <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>



On September 7, 2015 3:52:52 PM CDT, Random832 <random832 at fastmail.com> wrote:
>Chris Angelico <rosuav at gmail.com> writes:
>> ... and this is the other. Being available for non-literals means
>they
>> can be used with i18n, string tables, and other transformations.
>> Percent formatting is similar to what other C-derived languages have,
>> and .format() has certain flexibilities, so neither is likely to be
>> deprecated any time soon.
>
>I've never understood why .format was invented in the first place,
>rather than extending percent-formatting to have the features that it
>has over it.
>

t = (1, 2, 3)
# 400 lines later...
print('%s' % t)  # oops! TypeError: not all arguments converted
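
Spelled out as a runnable sketch: % treats a tuple on the right-hand side
as the argument list, so a tuple that is meant as a single value has to be
wrapped, whereas str.format takes it as one argument.

```python
t = (1, 2, 3)

try:
    # % unpacks the tuple: three arguments for a single %s placeholder.
    "%s" % t
except TypeError as exc:
    problem = str(exc)

# Wrapping the value in a one-element tuple is the usual workaround...
wrapped = "%s" % (t,)
# ...while str.format never unpacks its argument.
braces = "{}".format(t)
```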


-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From rosuav at gmail.com  Tue Sep  8 01:07:02 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 8 Sep 2015 09:07:02 +1000
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <mskhn7$a40$1@ger.gmane.org>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <20150907121932.GE19373@ando.pearwood.info>
 <CAPTjJmrOgpkvSjBAaDS0KUZ8uXdcsRSudc1vKRh+1nw2SUpgFw@mail.gmail.com>
 <mskhn7$a40$1@ger.gmane.org>
Message-ID: <CAPTjJmoVzbEptokwLMLa16N-k+sQUU9chKJS5sMiuZb+AkGW5g@mail.gmail.com>

On Tue, Sep 8, 2015 at 3:32 AM, Ron Adam <ron3200 at gmail.com> wrote:
> The ability for a format string or template to take a mapping is very
> useful.  Whether or not it's ok for that mapping to be from locals() is a
> separate issue and depends on other factors as well.  It may be perfectly
> fine in some cases, but not so in others.

I think that's the most we're ever going to have in terms of
deprecations. None of the _functionality_ of any of the examples will
be going away, but some of them will be non-recommended ways of doing
certain things. I definitely agree that taking format values from a
mapping is useful.

ChrisA

From dan at tombstonezero.net  Tue Sep  8 01:13:49 2015
From: dan at tombstonezero.net (Dan Sommers)
Date: Mon, 7 Sep 2015 23:13:49 +0000 (UTC)
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
Message-ID: <msl5nd$pb3$1@ger.gmane.org>

On Mon, 07 Sep 2015 17:39:26 -0500, Ryan Gonzalez wrote:

> On September 7, 2015 3:52:52 PM CDT, Random832 <random832 at fastmail.com> wrote:

>> I've never understood why .format was invented in the first place,
>> rather than extending percent-formatting to have the features that it
>> has over it.
> 
> t = (1, 2, 3)
> # 400 lines later...
> print '%s' % t # oops!

t = (1, 2, 3)
# 400 lines later
t *= 4 # oops?

Why do you (Ryan Gonzalez) have names that are important enough to span
over 400 lines of source code but not important enough to call something
more interesting than "t"?

And why are we conflating the print function with string formatting with
natural language translation in the first place?


From rosuav at gmail.com  Tue Sep  8 01:15:05 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 8 Sep 2015 09:15:05 +1000
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <C743712B-775A-4E61-B9B3-6144C7EA7563@yahoo.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de>
 <loom.20150907T193538-253@post.gmane.org>
 <55EDDE28.5020308@mail.de>
 <C743712B-775A-4E61-B9B3-6144C7EA7563@yahoo.com>
Message-ID: <CAPTjJmpB76PG4wBNzUSrz4=o+boC3byV25aLV4GjCar3OVoEsQ@mail.gmail.com>

On Tue, Sep 8, 2015 at 7:00 AM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> But there will always be cases that % is better for, such as:
>
>  * sharing a table of format strings with code in C or another language
>  * simple formats that need to be done fast in a loop
>  * formatting strings to use as str.format format strings
>  * messages that you've converted from logging to real output
>  * ASCII-based wire protocols or file formats

Supporting this last one is PEP 461. There are no proposals on the
cards to add a b"...".format() method (it's not out of the question,
but there are problems to be overcome because of the extreme
generality of it), yet we have percent formatting for bytestrings. I
think that's a strong indication that percent formatting is fully
supported and will be for the future.

ChrisA

From rymg19 at gmail.com  Tue Sep  8 01:25:52 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 07 Sep 2015 18:25:52 -0500
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <msl5nd$pb3$1@ger.gmane.org>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com> <msl5nd$pb3$1@ger.gmane.org>
Message-ID: <14DCC5DA-01F8-4208-A42D-842211DADC22@gmail.com>



On September 7, 2015 6:13:49 PM CDT, Dan Sommers <dan at tombstonezero.net> wrote:
>On Mon, 07 Sep 2015 17:39:26 -0500, Ryan Gonzalez wrote:
>
>> On September 7, 2015 3:52:52 PM CDT, Random832 <random832 at fastmail.com> wrote:
>
>>> I've never understood why .format was invented in the first place,
>>> rather than extending percent-formatting to have the features that it
>>> has over it.
>> 
>> t = (1, 2, 3)
>> # 400 lines later...
>> print '%s' % t # oops!
>
>t = (1, 2, 3)
># 400 lines later
>t *= 4 # oops?
>
>Why do you (Ryan Gonzalez) have names that are important enough to span
>over 400 lines of source code but not important enough to call something
>more interesting than "t"?
>
>And why are we conflating the print function with string formatting with
>natural language translation in the first place?

You're blowing this out of proportion. I was simply showing how string formatting can be *weird* when it comes to tuples.


-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From random832 at fastmail.com  Tue Sep  8 01:28:13 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 07 Sep 2015 19:28:13 -0400
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
Message-ID: <m2fv2plshe.fsf@fastmail.com>


Ryan Gonzalez <rymg19 at gmail.com> writes:
> t = (1, 2, 3)
> # 400 lines later...
> print '%s' % t # oops!

I always use % (t,) when intending to format a single object. But
anyway, my ideal version of it would have a .format method, but using
identical format strings. My real question was what the benefit of the
{}-format for format strings is, over an extended %-format.
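
As a minimal sketch of the tuple pitfall under discussion (written in
Python 3 syntax; the variable names are illustrative only, not taken
from anyone's real code):

```python
t = (1, 2, 3)

# Passing the tuple directly makes % unpack it into three arguments,
# which does not match the single %s placeholder.
try:
    direct = '%s' % t
except TypeError as exc:
    direct = 'TypeError: {}'.format(exc)

# Wrapping the value in a one-element tuple formats it as a single object.
wrapped = '%s' % (t,)

# str.format takes its arguments positionally, so no wrapping is needed.
braced = '{}'.format(t)

print(direct)
print(wrapped)
print(braced)
```

The (t,) idiom avoids the problem, but only if the author remembers to
use it everywhere a value might turn out to be a tuple.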


From ncoghlan at gmail.com  Tue Sep  8 01:45:22 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 Sep 2015 09:45:22 +1000
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <55EDB9B4.7020909@mail.de>
References: <55EC78E9.1050300@mail.de> <msi4ao$sq6$1@ger.gmane.org>
 <201509061954.t86Jspjg011546@fido.openend.se>
 <CADiSq7ewai8fv9XJsOYEvijTZ+JPsJo6+E=O7S9C5p1cZT=Wgg@mail.gmail.com>
 <55EDB9B4.7020909@mail.de>
Message-ID: <CADiSq7f6=PMfOdXEv327mnP3nr8HnvRrM=WF30bEeARq87-bHg@mail.gmail.com>

On 8 September 2015 at 02:22, Sven R. Kunze <srkunze at mail.de> wrote:
> On 07.09.2015 02:26, Nick Coghlan wrote:
>>
>> For the build farm idea, it's not just writing the code initially,
>> it's operating the resulting infrastructure, and that's a much bigger
>> ongoing commitment. Automatically building wheels for source uploads
>> is definitely on the wish list, there are just a large number of other
>> improvements needed before it's feasible.
>
>
> Could you be more specific on these improvements, Nick?

- PyPI: migrating from the legacy Zope codebase to Warehouse
- PyPI: end-to-end content signing (PEPs 458 & 480)
- PyPI: automated analytics & dashboards
- Tooling: integration with operating systems & other platforms
- Python Software Foundation financial sustainability
- Python Software Foundation project management capacity
- Infrastructure improvements for the CPython workflow

Those aren't dependencies of automatic wheel-building per se, but
rather are issues that are higher priorities for folks like Donald (in
terms of actually getting things done), myself (in terms of
collaborating more effectively with other open source ecosystems), and
the PSF staff and Board (in terms of ensuring the python.org
infrastructure is being appropriately maintained).

Running an automated build service is expensive, not primarily in
setting it up, but in terms of the ongoing sustaining engineering
costs (including security monitoring and response), so before we
commit to doing it, we need to know how we're going to fund it.
However, most of the PSF's focus at the moment is on getting the
things we *already* do [1] on a more sustainable footing, so adding
*new* services isn't currently a priority.

Cheers,
Nick.

[1] https://wiki.python.org/moin/PythonSoftwareFoundation/Proposals/StrategicPriorities

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Sep  8 02:31:56 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 Sep 2015 10:31:56 +1000
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <m2fv2plshe.fsf@fastmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
Message-ID: <CADiSq7ekgxaEXDmrnh4bBdV2g0veHdEmeOLv-ArD9E+_7OstyA@mail.gmail.com>

On 8 September 2015 at 09:28, Random832 <random832 at fastmail.com> wrote:
>
> Ryan Gonzalez <rymg19 at gmail.com> writes:
>> t = (1, 2, 3)
>> # 400 lines later...
>> print '%s' % t # oops!
>
> I always use % (t,) when intending to format a single object. But
> anyway, my ideal version of it would have a .format method, but using
> identical format strings. My real question was what the benefit of the
> {}-format for format strings is, over an extended %-format.

It turns out PEP 3101 doesn't really go into this, so I guess it was a
case where all of us involved in the discussion knew the reasons a new
format was needed, so we never wrote them down.

As such, it's worth breaking the problem down into a few different subproblems:

1. Invocation via __mod__
2. Positional formatting
3. Name based formatting
4. Extending formatting to new types in an extensible, backwards compatible way

The problems with formatting dictionaries and tuples relate to the
"fmt % values" invocation model, rather than the substitution field
syntax. As such, we *could* have designed str.format() and
str.format_map() around %-interpolation. The reasons we chose not to
do that relate to the other problems.

For positional formatting of short strings, %-interpolation actually
works pretty well, and it has the advantage of being consistent with
printf() style APIs in C and C++. This is the use case where it has
proven most difficult to get people to switch away from
mod-formatting, and is also the approach we used to restore binary
interpolation support in Python 3.5. An illustrative example is to
compare formatting a floating point number to use 2 decimal places:

    >>> x = y = 1.0
    >>> "%.2f, %.2f" % (x, y)
    '1.00, 1.00'
    >>> "{:.2f}, {:.2f}".format(x, y)
    '1.00, 1.00'

I consider the second example there to be *less* readable than the
original mod-formatting. These kinds of cases are why we *changed our
mind* from "we'd like to deprecate mod-formatting, but we haven't
figured out a practical way to do so" to "mod-formatting and
brace-formatting are better at different things, so it's actually
useful having both of them available".

For name based formatting, by contrast, the "%(name)s" syntax is noisy
and clumsy compared to the shorter "{name}" format introduced in PEP
3101 (borrowed from C#). There the value has been clear, and so folks
have been significantly more amenable to switching away from
mod-formatting:

    >>> "%(x).2f, %(y).2f" % dict(x=x, y=y)
    '1.00, 1.00'
    >>> "{x:.2f}, {y:.2f}".format(x=x, y=y)
    '1.00, 1.00'

It's that last example which PEP 498 grants native syntax, with the
entire trailing method call being replaced by a simple leading "f":

    >>> f"{x:.2f}, {y:.2f}"
    '1.00, 1.00'

This gets us back to TOOWTDI (after a long detour away from it), since
direct interpolation will clearly be the obvious way to go when
interpolating into a literal format string - the other options will
only be needed when literal formatting isn't appropriate for some
reason.

The final reason for introducing a distinct formatting system doesn't
relate to syntax, but rather to semantics. Mod-formatting is defined
around the builtin types, with "__str__" as the catch-all fallback for
interpolating arbitrary objects. PEP 3101 introduced a new *protocol*
method (__format__) that allowed classes more control over how their
instances were formatted, with the typical example being to allow
dates and times to accept strftime formatting strings directly rather
than having to make a separate strftime call prior to formatting.
Python generally follows a philosophy of "constructs with different
semantics should use different syntax" (at least in the core language
design), which is reflected in the fact that a new formatting syntax
was introduced in conjunction with a new formatting protocol.
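
A rough sketch of what the __format__ protocol allows, assuming a
hypothetical Temperature class invented purely for illustration (it is
not part of the standard library or of PEP 3101):

```python
class Temperature:
    """Hypothetical type, used only to illustrate __format__."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # A trailing "F" asks for Fahrenheit; any other spec is passed
        # through to float formatting of the Celsius value.
        if spec.endswith("F"):
            fahrenheit = self.celsius * 9 / 5 + 32
            return format(fahrenheit, spec[:-1] + "f") + " F"
        return format(float(self.celsius), spec or ".1f") + " C"


boiling = Temperature(100)

# Brace-formatting hands the spec to the object's own __format__ method.
brace_default = "{}".format(boiling)
brace_fahrenheit = "{:.1F}".format(boiling)

# Mod-formatting with %s only calls str(), so the class gets no say.
percent_fallback = "%s" % boiling

print(brace_default)
print(brace_fahrenheit)
print(percent_fallback)
```

The brace forms produce "100.0 C" and "212.0 F", while the %s form falls
back to the default object representation because this class defines no
__str__.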

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Tue Sep  8 03:44:36 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 08 Sep 2015 10:44:36 +0900
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <CAPTjJmpB76PG4wBNzUSrz4=o+boC3byV25aLV4GjCar3OVoEsQ@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <55EDBFCF.2030301@mail.de>
 <loom.20150907T193538-253@post.gmane.org>
 <55EDDE28.5020308@mail.de>
 <C743712B-775A-4E61-B9B3-6144C7EA7563@yahoo.com>
 <CAPTjJmpB76PG4wBNzUSrz4=o+boC3byV25aLV4GjCar3OVoEsQ@mail.gmail.com>
Message-ID: <87r3m91y7v.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Angelico writes:
 > On Tue, Sep 8, 2015 at 7:00 AM, Andrew Barnert via Python-ideas
 > <python-ideas at python.org> wrote:
 > > But there will always be cases that % is better for, such as:

[...]
 > >  * ASCII-based wire protocols or file formats
 > 
 > Supporting this last one is PEP 461. There are no proposals on the
 > cards to add a b"...".format() method (it's not out of the question,
 > but there are problems to be overcome

Actually, it was proposed and pronounced (immediately on proposal :-).

There were no truly difficult technical problems, but Guido decided it
was a YAGNI, and often an attractive nuisance.  In particular, many of
the use cases for bytestring formatting are performance-critical bit-
shoveling applications, and adding a few extra method lookups and
calls to every formatting operation would be a problem.  Many others
involve porting Python 2 applications that used str to hold and format
"external" strings, and those use %-formatting, not .format.

 > because of the extreme generality of it),

Hm.  It seems to me in the PEP 498 discussion that Guido doesn't see
generality as a problem to be solved by restricting it, but rather as
a characteristic of an implementation that makes it more or less
suitable for a given feature.  I guess that Guido would insist on
having bytes.format be almost identical to str.format, except maybe
for a couple of tweaks similar to those added to bytes' % operator.

From random832 at fastmail.com  Tue Sep  8 03:46:53 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 07 Sep 2015 21:46:53 -0400
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
 <CADiSq7ekgxaEXDmrnh4bBdV2g0veHdEmeOLv-ArD9E+_7OstyA@mail.gmail.com>
Message-ID: <m2io7ladiq.fsf@fastmail.com>

Nick Coghlan <ncoghlan at gmail.com> writes:
> The final reason for introducing a distinct formatting system doesn't
> relate to syntax, but rather to semantics. Mod-formatting is defined
> around the builtin types, with "__str__" as the catch-all fallback for
> interpolating arbitrary objects. PEP 3101 introduced a new *protocol*
> method (__format__) that allowed classes more control over how their
> instances were formatted, with the typical example being to allow
> dates and times to accept strftime formatting strings directly rather
> than having to make a separate strftime call prior to formatting.
> Python generally follows a philosophy of "constructs with different
> semantics should use different syntax"

I guess my problem is that I don't consider the fact that %s forces
something to string, %f to float, etc, to be desired semantics, I
consider it to be a bug that could, and *should*, have been changed by
an alternate-universe PEP.

There's nothing *good* about the fact that '%.20f' % Decimal('0.1')
gives 0.10000000000000000555 instead of 0.10000000000000000000, and that
there are no hooks for Decimal to make it do otherwise. There's nothing
that would IMO be legitimately broken by allowing it to do so. You
could, for example, have object.__format__ fall back on the type
conversion semantics, so that it would continue to work with existing
types that do not define their own __format__.
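
The difference described above can be reproduced with a short sketch
(the variable names are illustrative only):

```python
from decimal import Decimal

tenth = Decimal('0.1')

# Mod-formatting converts the Decimal to a binary float first, so the
# float's representation error shows up in the digits.
via_percent = '%.20f' % tenth

# Brace-formatting calls Decimal.__format__, which keeps the exact value.
via_format = '{:.20f}'.format(tenth)

print(via_percent)
print(via_format)
```

The first line printed is 0.10000000000000000555 and the second is
0.10000000000000000000.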


From ncoghlan at gmail.com  Tue Sep  8 04:18:40 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 Sep 2015 12:18:40 +1000
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <m2io7ladiq.fsf@fastmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
 <CADiSq7ekgxaEXDmrnh4bBdV2g0veHdEmeOLv-ArD9E+_7OstyA@mail.gmail.com>
 <m2io7ladiq.fsf@fastmail.com>
Message-ID: <CADiSq7ee+6FPv=kHNSgUZFYO43unwbP4aDU4sSj6F=tLoi3SMA@mail.gmail.com>

On 8 September 2015 at 11:46, Random832 <random832 at fastmail.com> wrote:
> Nick Coghlan <ncoghlan at gmail.com> writes:
>> The final reason for introducing a distinct formatting system doesn't
>> relate to syntax, but rather to semantics. Mod-formatting is defined
>> around the builtin types, with "__str__" as the catch-all fallback for
>> interpolating arbitrary objects. PEP 3101 introduced a new *protocol*
>> method (__format__) that allowed classes more control over how their
>> instances were formatted, with the typical example being to allow
>> dates and times to accept strftime formatting strings directly rather
>> than having to make a separate strftime call prior to formatting.
>> Python generally follows a philosophy of "constructs with different
>> semantics should use different syntax"
>
> I guess my problem is that I don't consider the fact that %s forces
> something to string, %f to float, etc, to be desired semantics, I
> consider it to be a bug that could, and *should*, have been changed by
> an alternate-universe PEP.
>
> There's nothing *good* about the fact that '%.20f' % Decimal('0.1')
> gives 0.10000000000000000555 instead of 0.10000000000000000000, and that
> there are no hooks for Decimal to make it do otherwise.

Ah, but there *is* something good about it: the fact that
percent-formatting is restricted to a constrained set of known types
makes it fundamentally more *predictable* and more *portable* than
brace-formatting.

The flexibility of str.format is wonderful if you're only needing to
deal with Python code, and Python's type system. It's substantially
less wonderful if you're designing formatting operations that need to
span multiple languages that only have the primitive core defined by C
in common.

These characteristics are what make percent-formatting a more suitable
approach to binary interpolation than the fully flexible formatting
system. Binary interpolation is not only really hard to do right, it's
also really hard to *test* - many of the things that can go wrong are
driven by the specific data values you choose to test with, rather
than being structural errors in the data types you use.

These benefits aren't particularly obvious until you try to live
without them and figure out why you missed them, but we *have* done
that in the 7 years since 2.6 was released, and hence have a good
understanding of why brace-formatting wasn't the wholesale replacement
for percent-formatting that we originally expected it to be.

That said, there *have* been ongoing efforts to improve the numeric
formatting capabilities of printf and related operations in C/C++ that
we haven't been tracking at the Python level. In relation to decimal
support specifically, the C++ write-up at
http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3871.html also
links to the C level proposal and the proposed changes to the
interpretation of the floating point codes when working with decimal
data types.

However, as far as I am aware, there isn't anyone specifically
tracking the evolution of printf() formatting codes and reviewing them
for applicability to Python's percent-formatting support - it's done
more in an ad hoc fashion as folks developing in both Python and C/C++
start using a new formatting code on the C/C++ side of things and
request that it also be added to Python.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Tue Sep  8 05:01:44 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 08 Sep 2015 12:01:44 +0900
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <m2fv2plshe.fsf@fastmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
Message-ID: <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>

Random832 writes:

 > But anyway, my ideal version of it would have a .format method, but
 > using identical format strings.

"Identical" is impossible; even you immediately admit that an
extension is necessary:

 > My real question was what the benefit of the {}-format for format
 > strings is, over an extended %-format.

One issue is that the "%(name)s" form proves to be difficult for
end-users and translators to get right.  Eg, it's a FAQ for Mailman,
which uses such format strings for interpolatable footers, including
personalized user unsubscribe links and the like.  This is just a
fact.  I don't have any similar evidence that "{}" is better.

Introspecting, I find having the whole spec enclosed in parentheses far
more readable than the very finicky %-specs.  It feels more like a
replacement field to me.  I also find the braces to be far more
readable parenthesization than (round) parentheses (TeX influence
there, maybe?)  In particular, these two attributes of "{}" are why
I use .format by preference even in simple cases where % is both
sufficient and clearly more compact.  Obviously that's a personal
preference but I doubt I'm the only one who feels that way.

%-formatting provides no way to indicate which positional parameter
goes with which format spec.  It was hoped that such a facility might
be useful in I18N, where the syntax of a translated string often must be
rather different from English syntax.  That has turned out not to be
the case, but that is the way the C world was going at the time.  In
recent C (well, C89 or C99 ;-) there is a way to do this, but that
would require extending the Python %-spec syntax.
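
A small sketch of the positional-index feature, using made-up strings
and arguments purely for illustration:

```python
args = ("Alice", 3)

# Explicit indices let a template choose its own argument order.
english = "{0} sent {1} messages".format(*args)
reordered = "{1} messages were sent by {0}".format(*args)

# Unnamed %-specs consume the tuple strictly left to right, so the
# template cannot reorder the values by itself.
percent = "%s sent %d messages" % args

print(english)
print(reordered)
print(percent)
```

With %-formatting, the same reordering needs either a mapping with
"%(name)s" specs or a differently ordered argument tuple.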

{}-formatting allows one level of recursion.  That is

    "|{result:{width}d}|".format(result=42, width=10)

produces "|        42|".  In %-formatting, a recursive syntax would be
rather finicky, and an alternative syntax for formatting the format
spec would be shocking to people used to %-formatting, I guess.

{}-formatting actually admits an arbitrary string after the ":", to be
interpreted by the object's __format__ method, rather than by format.
The {arg : sign width.prec type} format is respected by the builtin
types, but a type like datetime can (and does!) support the full
strftime() syntax, eg,

    "It is now {0:%H:%M:%S} on {0:%Y/%m/%d}.".format(datetime.now())

produced 'It is now 11:58:53 on 2015/09/08.' for me just now.  I
don't see how equivalent flexibility could be provided with %-spec
syntax without twisting it completely out of shape.

These last three, besides presenting (more or less minor) technical
difficulties for a %-spec extension, run into Guido's allergy to
subtle context-dependent differences in syntax, as we've seen in the
discussion of whether expression syntax in f-strings should be
restricted as compared to "full" expression syntax.  That is, the more
natural the extension of %-spec syntax we used, the more confusing it
would be to users, especially new users reading old code (and
wondering why it does things the hard way).  OTOH, if it's not a
"natural" extension, you lose many of the benefits of an extension in
the first place.


From random832 at fastmail.com  Tue Sep  8 06:27:43 2015
From: random832 at fastmail.com (Random832)
Date: Tue, 08 Sep 2015 00:27:43 -0400
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
 <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <m2egi9a62o.fsf@fastmail.com>

"Stephen J. Turnbull" <stephen at xemacs.org>
writes:

> Random832 writes:
>
>  > But anyway, my ideal version of it would have a .format method, but
>  > using identical format strings.
>
> "Identical" is impossible, even you immediately admit that an
> extension is necessary:

By "Identical" I meant there would be a single mechanism, including all
new features, used for both str.format and str.__mod__ - i.e. the
extensions would also be available using the % operator. The extensions
would be implemented, but the % operator would not be left
behind. Hence, identical.


From tritium-list at sdamon.com  Tue Sep  8 11:35:57 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Tue, 08 Sep 2015 05:35:57 -0400
Subject: [Python-ideas] NuGet/Chocolatey feed for releases
Message-ID: <55EEABFD.8040600@sdamon.com>

It would be incredibly convenient, especially for users of AppVeyor's 
continuous integration service, if there were a(n official) repository 
for chocolatey containing recent releases of python.  The official 
Chocolatey gallery contains installers for the latest 2.7 and 3.4 (as of 
this post).  What I am proposing would contain the most commonly used 
pythons in testing (2.6 2.7 3.3 3.4 and future releases).

I am perfectly willing to set up a repo for my own use, but am posting 
this to see if there is community support...or psf support... for 
setting up an official repo.

From wolfgang.maier at biologie.uni-freiburg.de  Tue Sep  8 12:00:57 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Tue, 8 Sep 2015 12:00:57 +0200
Subject: [Python-ideas] new format spec for iterable types
Message-ID: <msmbko$ful$1@ger.gmane.org>

Hi,

in the parallel "format and print" thread, Andrew Barnert wrote:


 > For example, in a 5-line script I wrote last night, I've got
 > print(head, *names, sep='\t'). I could have used
 > print('\t'.join(chain([head], names))) instead--in fact, any use of
 > multi-argument print can be replaced by
 > print(sep.join(map(str, args)))--but that's less convenient, less
 > readable, and less likely to occur to novices. And there are plenty
 > of other alternatives, from
 > print('{}\t{}'.format(head, '\t'.join(names))) to
...

That last thing, '{}\t{}'.format(head, '\t'.join(names)), is something I 
find myself writing relatively often - when I do not want to print the 
result immediately, but store it - but it is ugly to read with its 
nested method calls and the separators occurring in two very different 
places.
Now Andrew's comment has prompted me to think about alternative syntax 
for this and I came up with this idea:

What if built-in iterable types, through their __format__ method,
supported a format spec string of the form "*separator" and interpreted
it as "join your elements' formatted representations using separator"?
A quick and dirty illustration in Python:

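class myList(list):
    def __format__(self, fmt=''):
        # An empty spec keeps the normal list representation.
        if fmt == '':
            return str(self)
        # "*separator" joins the formatted elements with the separator
        # (a single space if none is given).
        if fmt[0] == '*':
            sep = fmt[1:] or ' '
            return sep.join(format(e) for e in self)
        else:
            raise TypeError()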

head = 99
data = myList(range(10))
s = '{}, {:*, }'.format(head, data)
# or
s2 = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
print(s)
print(s2)
# 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Thoughts?


From abarnert at yahoo.com  Tue Sep  8 14:24:22 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 8 Sep 2015 05:24:22 -0700
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <msmbko$ful$1@ger.gmane.org>
References: <msmbko$ful$1@ger.gmane.org>
Message-ID: <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>

On Sep 8, 2015, at 03:00, Wolfgang Maier <wolfgang.maier at biologie.uni-freiburg.de> wrote:
> 
> Hi,
> 
> in the parallel "format and print" thread, Andrew Barnert wrote:
> 
> 
> > For example, in a 5-line script I wrote last night, I've got
> > print(head, *names, sep='\t'). I could have used
> > print('\t'.join(chain([head], names))) instead--in fact, any use of
> > multi-argument print can be replaced by
> > print(sep.join(map(str, args)))--but that's less convenient, less
> > readable, and less likely to occur to novices. And there are plenty
> > of other alternatives, from
> > print('{}\t{}'.format(head, '\t'.join(names))) to
> ...
> 
> That last thing, '{}\t{}'.format(head, '\t'.join(names)), is something I find myself writing relatively often - when I do not want to print the result immediately, but store it - but it is ugly to read with its nested method calls and the separators occurring in two very different places.
> Now Andrew's comment has prompted me to think about alternative syntax for this and I came up with this idea:
> 
> What if built in iterable types through their __format__ method supported a format spec string of the form "*separator" and interpreted it as join your elements' formatted representations using "separator" ?
> A quick and dirty illustration in Python:
> 
> class myList(list):
>    def __format__ (self, fmt=''):
>        if fmt == '':
>        return str(self)
>    if fmt[0] == '*':
>            sep = fmt[1:] or ' '
>            return sep.join(format(e) for e in self)
>        else:
>            raise TypeError()
> 
> head = 99
> data = myList(range(10))
> s = '{}, {:*, }'.format(head, data)
> # or
> s2 = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
> print(s)
> print(s2)
> # 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Formatting positional argument #2 with *{sep} as the format specifier makes no sense to me. Even knowing what you're trying to do, I can't understand what *(', ') is going to pass to data.__format__, or why it should do what you want. What is the * supposed to mean there? Is it akin to *args in a function call expression, so you get ',' and ' ' as separate positional arguments? If so, how does the fmt[1] do anything useful? It seems like you would be using [' '] as the separator, and I'm not sure what that would do that you'd want.

From wolfgang.maier at biologie.uni-freiburg.de  Tue Sep  8 14:55:43 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Tue, 8 Sep 2015 14:55:43 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
Message-ID: <55EEDACF.7000204@biologie.uni-freiburg.de>


On 08.09.2015 14:24, Andrew Barnert via Python-ideas wrote:
 >
 > Formatting positional argument #2 with *{sep} as the format specifier 
makes no sense to me. Even knowing what you're trying to do, I can't 
understand what *(', ') is going to pass to data.__format__, or why it 
should do what you want. What is the * supposed to mean there? Is it 
akin to *args in a function call expression, so you get ',' and ' ' as 
separate positional arguments? If so, how does the fmt[1] do anything 
useful? It seems like you would be using [' '] as the separator, and I'm 
not sure what that would do that you'd want.
 >

Not sure what happened to the indentation in the posted code. Here's 
another attempt copy pasting from working code as I thought I had done 
before (sorry for the inconvenience):

class myList(list):
     def __format__ (self, fmt=''):
         if fmt == '':
             return str(self)
         if fmt[0] == '*':
             sep = fmt[1:] or ' '
             return sep.join(format(e) for e in self)
         else:
             raise TypeError()

head = 99
data = myList(range(10))
s = '{}, {:*, }'.format(head, data)
# or
s2 = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
print(s)
print(s2)
# 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Does that make things clearer?



From oscar.j.benjamin at gmail.com  Tue Sep  8 15:41:39 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 8 Sep 2015 14:41:39 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
Message-ID: <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>

On 8 September 2015 at 13:24, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> Wolfgang wrote:
>> A quick and dirty illustration in Python:
>>
>> class myList(list):
>>    def __format__ (self, fmt=''):
>>        if fmt == '':
>>        return str(self)
>>    if fmt[0] == '*':
>>            sep = fmt[1:] or ' '
>>            return sep.join(format(e) for e in self)
>>        else:
>>            raise TypeError()
>>
>> head = 99
>> data = myList(range(10))
>> s = '{}, {:*, }'.format(head, data)
>> # or
>> s2 = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
>> print(s)
>> print(s2)
>> # 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
>
> Formatting positional argument #2 with *{sep} as the format specifier makes no sense to me. Even knowing what you're trying to do, I can't understand what *(', ') is going to pass to data.__format__, or why it should do what you want. What is the * supposed to mean there? Is it akin to *args in a function call expression, so you get ',' and ' ' as separate positional arguments? If so, how does the fmt[1] do anything useful? It seems like you would be using [' '] as the separator, and I'm not sure what that would do that you'd want.

The *{sep} surprised me until I tried

    >>> '{x:.{n}f}'.format(x=1.234567, n=2)
    '1.23'

So format uses a two-level pass over the string for nested curly
brackets (I tried a third level of nesting but it didn't work).

So following it through:

    '{}{sep}{:*{sep}}'.format(head, data, sep=', ')

    '{}, {:*, }'.format(head, data)

    '{}, {}'.format(head, format(data, '*, '))

    '{}, {}'.format(head, ', '.join(format(e) for e in data))

    '99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9'

Unfortunately there's no way to also give a format string to the inner
format call format(e) if I wanted to e.g. format those numbers in hex.
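
The expansion traced above can be checked step by step. This is a minimal
sketch, assuming a list subclass like the myList posted earlier in the
thread; the name MyList and the message passed to TypeError are only
illustrative.

```python
class MyList(list):
    """List whose format spec '*<sep>' joins the formatted elements."""

    def __format__(self, fmt=''):
        if fmt == '':
            return str(self)
        if fmt[0] == '*':
            sep = fmt[1:] or ' '
            return sep.join(format(e) for e in self)
        raise TypeError('unsupported format spec: {!r}'.format(fmt))


head = 99
data = MyList(range(10))

# Step 1: the nested {sep} field inside the format spec is substituted first.
nested = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
# Step 2: this is the string the outer pass effectively works with.
flat = '{}, {:*, }'.format(head, data)
# Step 3: the spec '*, ' is handed to data.__format__.
explicit = '{}, {}'.format(head, format(data, '*, '))

print(nested)
```

All three strings come out identical, which is the two-level behaviour
described above.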


--
Oscar

From wolfgang.maier at biologie.uni-freiburg.de  Tue Sep  8 16:20:43 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Tue, 8 Sep 2015 16:20:43 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
Message-ID: <55EEEEBB.4080203@biologie.uni-freiburg.de>

On 08.09.2015 15:41, Oscar Benjamin wrote:
>
> The *{sep} surprised me until I tried
>
>      >>> '{x:.{n}f}'.format(x=1.234567, n=2)
>      '1.23'
>
> So format uses a two-level pass over the string for nested curly
> brackets (I tried a third level of nesting but it didn't work).
>

Yes, this is documented behavior
(https://docs.python.org/3/library/string.html#format-string-syntax):

"A format_spec field can also include nested replacement fields within 
it. These nested replacement fields can contain only a field name; 
conversion flags and format specifications are not allowed. The 
replacement fields within the format_spec are substituted before the 
format_spec string is interpreted. This allows the formatting of a value 
to be dynamically specified."

> Unfortunately there's no way to also give a format string to the inner
> format call format(e) if I wanted to e.g. format those numbers in hex.

Right, that would require a much more complex format_spec definition. 
But the proposed simple version saves me from mistakenly writing:

'{}\t{}'.format(head, '\t'.join(data))

when some of the elements in data aren't strings and I should have written:

'{}\t{}'.format(head, '\t'.join(str(e) for e in data))

, a mistake that I seem to never learn to avoid :)
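
For reference, a small sketch of the mistake and its fix. The sample values
in head and data are made up for illustration.

```python
head = 'id'
data = [1, 2.5, 'x']  # not all strings

try:
    '{}\t{}'.format(head, '\t'.join(data))
except TypeError as exc:
    error = exc  # str.join refuses items that are not str
else:
    error = None

# Converting each element first avoids the problem.
line = '{}\t{}'.format(head, '\t'.join(str(e) for e in data))
print(repr(line))
```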




From rymg19 at gmail.com  Tue Sep  8 16:37:36 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Tue, 08 Sep 2015 09:37:36 -0500
Subject: [Python-ideas] NuGet/Chocolatey feed for releases
In-Reply-To: <55EEABFD.8040600@sdamon.com>
References: <55EEABFD.8040600@sdamon.com>
Message-ID: <89BE5DA9-CB1B-481E-9601-96F94057BFA0@gmail.com>

Beware: when the Chocolatey devs said "backlog", they *meant* backlog. I submitted an updated PyPy package to them months ago, and it still hasn't been updated yet.

On September 8, 2015 4:35:57 AM CDT, Alexander Walters <tritium-list at sdamon.com> wrote:
>It would be incredibly convenient, especially for users of AppVeyor's 
>continuous integration service, if there were a(n official) repository 
>for chocolatey containing recent releases of python.  The official 
>Chocolatey gallery contains installers for the latest 2.7 and 3.4 (as of 
>this post).  What I am proposing would contain the most commonly used 
>pythons in testing (2.6 2.7 3.3 3.4 and future releases).
>
>I am perfectly willing to set up a repo for my own use, but am posting 
>this to see if there is community support...or psf support... for 
>setting up an official repo.
>_______________________________________________
>Python-ideas mailing list
>Python-ideas at python.org
>https://mail.python.org/mailman/listinfo/python-ideas
>Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From stephen at xemacs.org  Tue Sep  8 18:27:27 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 09 Sep 2015 01:27:27 +0900
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55EEEEBB.4080203@biologie.uni-freiburg.de>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
Message-ID: <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>

Wolfgang Maier writes:

 > But the proposed simple version saves me from mistakenly writing:
 > 
 > '{}\t{}'.format(head, '\t'.join(data))
 > 
 > when some of the elements in data aren't strings and I should have written:
 > 
 > '{}\t{}'.format(head, '\t'.join(str(e) for e in data))
 > 
 > , a mistake that I seem to never learn to avoid :)

(Note: I don't suffer from that particular mistake, so I may be
biased.)  I think it's a nice trick but doesn't clear the bar for
adding to the standard iterables yet.

A technical comment: you don't actually need the '*' for myList
(although I guess you find it useful to get an error rather than line
noise as a separator if it isn't present?)

On the basic idea: if this can be generalized a bit so that

    head = 99 
    data = range(10)                # optimism!
    s = '{:.1f}, {:.1f*, }'.format(head, data)

produces

    s == '99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0'

then I'd be a lot more attracted to it.  I would think the simple
version is likely to produce rather ugly output if you have a bunch of
floats in data.  (BTW, that string was actually generated with

    '{:.1f}, {}'.format(99, ', '.join('{:.1f}'.format(x) for x in range(10)))

which doesn't win any beauty contests.)
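
A minimal sketch of how the generalized spec could behave, assuming a list
subclass (the name JoinList is hypothetical) rather than a change to the
built-in iterables. The spec is split at the last '*', so the element format
may itself contain '*' as a fill character, but the separator may not.

```python
class JoinList(list):
    """List accepting specs of the form '<element format>*<separator>'."""

    def __format__(self, spec=''):
        if spec == '':
            return str(self)
        # rpartition returns ('', '', spec) when there is no '*' at all.
        elem_fmt, star, sep = spec.rpartition('*')
        if not star:
            raise TypeError('format spec must contain "*"')
        return (sep or ' ').join(format(e, elem_fmt) for e in self)


head = 99
data = JoinList(range(10))
s = '{:.1f}, {:.1f*, }'.format(head, data)
print(s)
# 99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0
```

The same mechanism covers other element formats, for example
format(JoinList([255, 16]), 'x*-') for hex output.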

Bikeshedding in advance, now you pretty much need the '*' (and have to
hope that the types in the iterable don't use it themselves!), because
'{:.1f, }' really does look like line noise!  I might actually prefer
'|' (or '/') which is "heavier" and "looks like a separator" to me:

    s = '{:.1f}, {:.1f|, }'.format(head, data)

Finally, another alternative syntax would be the same in the
replacement field, but instead of iterables implementing it, the
.format method would (using your syntax and example for easier
comparison):

    s = '{}, {:*, }'.format(head, *data)

I'm afraid this won't work unless restricted to be the last
replacement field, where it just consumes all remaining positional
arguments.  I think that restriction deserves a loud "ugh", but maybe
it will give somebody a better idea.

Steve

From srkunze at mail.de  Tue Sep  8 19:49:39 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 08 Sep 2015 19:49:39 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <msmbko$ful$1@ger.gmane.org>
References: <msmbko$ful$1@ger.gmane.org>
Message-ID: <55EF1FB3.5000407@mail.de>

On 08.09.2015 12:00, Wolfgang Maier wrote:
>
> head = 99
> data = myList(range(10))
> s = '{}, {:*, }'.format(head, data)
> # or
> s2 = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
> print(s)
> print(s2)
> # 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
>
> Thoughts?

I like it and I agree this is an oft-used pattern. In my experience, 
patterns are workarounds for things a language cannot handle properly.

I cannot tell what a concrete syntax would exactly look like but I would 
love to see an easy-to-read solution.

Best,
Sven

From rosuav at gmail.com  Tue Sep  8 19:58:49 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 9 Sep 2015 03:58:49 +1000
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55EF1FB3.5000407@mail.de>
References: <msmbko$ful$1@ger.gmane.org>
	<55EF1FB3.5000407@mail.de>
Message-ID: <CAPTjJmoZDybRaPhahR4sGCLBF=7mcbZBAH4O7hW5_q-eJfYHyw@mail.gmail.com>

On Wed, Sep 9, 2015 at 3:49 AM, Sven R. Kunze <srkunze at mail.de> wrote:
> On 08.09.2015 12:00, Wolfgang Maier wrote:
>>
>>
>> head = 99
>> data = myList(range(10))
>> s = '{}, {:*, }'.format(head, data)
>> # or
>> s2 = '{}{sep}{:*{sep}}'.format(head, data, sep=', ')
>> print(s)
>> print(s2)
>> # 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
>>
>> Thoughts?
>
>
> I like it and I agree this is an oft-used pattern. In my experience,
> patterns are workarounds for things a language cannot handle properly.
>
> I cannot tell what a concrete syntax would exactly look like but I would
> love to see an easy-to-read solution.

It looks tempting, but there's a reason Python has join() as a
*string* method, not a method on any sort of iterable. For the same
reason, I think it'd be better to handle this as a special case inside
str.format(), rather than as a format string of the iterables; it
would be extremely surprising for code to be able to join a list, a
tuple, a ListIterator, or a generator, but not a custom class with
__iter__ and __next__ methods. (Even more surprising if it works with
some standard library types and not others.) Plus, it'd mean a lot of
code duplication across all those types, which is unnecessary.

It'd be rather cool if it could be done as a special format string,
though, which says "here's a separator, here's a format string, now
iterate over the argument and format them with that string, then join
them with that sep, and stick it in here". It might get a bit verbose,
though.
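
One way to try the idea without touching str.format or any iterable type is
a string.Formatter subclass. This is only a sketch: JoinFormatter and
Countdown are hypothetical names, and the leading '*' convention is borrowed
from the original proposal.

```python
import string


class JoinFormatter(string.Formatter):
    """Formatter that treats a spec '*<sep>' as 'join the items with <sep>'."""

    def format_field(self, value, format_spec):
        if format_spec.startswith('*'):
            sep = format_spec[1:] or ' '
            return sep.join(format(item) for item in value)
        return super().format_field(value, format_spec)


class Countdown:
    """Custom iterable with no __format__ support of its own."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n, 0, -1))


fmt = JoinFormatter()
print(fmt.format('{}, {:*, }', 99, [0, 1, 2]))
print(fmt.format('{:*-}', Countdown(3)))
print(fmt.format('{:*, }', (x * x for x in range(4))))
```

Because the joining happens in the formatter, lists, generators and custom
iterables all behave the same way. The cost is that an ordinary spec
beginning with '*' (such as '*>5', with '*' as the fill character) is
intercepted too.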

ChrisA

From srkunze at mail.de  Tue Sep  8 20:03:47 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 08 Sep 2015 20:03:47 +0200
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <CADiSq7f6=PMfOdXEv327mnP3nr8HnvRrM=WF30bEeARq87-bHg@mail.gmail.com>
References: <55EC78E9.1050300@mail.de>	<msi4ao$sq6$1@ger.gmane.org>	<201509061954.t86Jspjg011546@fido.openend.se>	<CADiSq7ewai8fv9XJsOYEvijTZ+JPsJo6+E=O7S9C5p1cZT=Wgg@mail.gmail.com>	<55EDB9B4.7020909@mail.de>
 <CADiSq7f6=PMfOdXEv327mnP3nr8HnvRrM=WF30bEeARq87-bHg@mail.gmail.com>
Message-ID: <55EF2303.8080901@mail.de>



On 08.09.2015 01:45, Nick Coghlan wrote:
> On 8 September 2015 at 02:22, Sven R. Kunze <srkunze at mail.de> wrote:
>> On 07.09.2015 02:26, Nick Coghlan wrote:
>>> For the build farm idea, it's not just writing the code initially,
>>> it's operating the resulting infrastructure, and that's a much bigger
>>> ongoing commitment. Automatically building wheels for source uploads
>>> is definitely on the wish list, there are just a large number of other
>>> improvements needed before it's feasible.
>>
>> Could you be more specific on these improvements, Nick?
> - PyPI: migrating from the legacy Zope codebase to Warehouse
> - PyPI: end-to-end content signing (PEPs 458 & 480)
> - PyPI: automated analytics & dashboards
> - Tooling: integration with operating systems & other platforms
> - Python Software Foundation financial sustainability
> - Python Software Foundation project management capacity
> - Infrastructure improvements for the CPython workflow

Much appreciated.

Let's see how they make progress on these.

> Those aren't dependencies of automatic wheel-building per se, but
> rather are issues that are higher priorities for folks like Donald (in
> terms of actually getting things done), myself (in terms of
> collaborating more effectively with other open source ecosystems), and
> the PSF staff and Board (in terms of ensuring the python.org
> infrastructure is being appropriately maintained).
>
> Running an automated build service is expensive, not primarily in
> setting it up, but in terms of the ongoing sustaining engineering
> costs (including security monitoring and response), so before we
> commit to doing it, we need to know how we're going to fund it.
> However, most of the PSF's focus at the moment is on getting the
> things we *already* do [1] on a more sustainable footing, so adding
> *new* services isn't currently a priority.
>
> Cheers,
> Nick.
>
> [1] https://wiki.python.org/moin/PythonSoftwareFoundation/Proposals/StrategicPriorities
>


From oscar.j.benjamin at gmail.com  Tue Sep  8 20:15:24 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 8 Sep 2015 19:15:24 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>

On 8 September 2015 at 17:27, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>
> A technical comment: you don't actually need the '*' for myList
> (although I guess you find it useful to get an error rather than line
> noise as a separator if it isn't present?)

I think Wolfgang wants it to work with any iterable rather than his
own custom type (at least that's what I'd want). For that to work it
would be better if it was handled by the format method itself rather
than every iterable's __format__ method. Then it could work with
generators, lists, tuples etc.

> On the basic idea: if this can be generalized a bit so that
>
>     head = 99
>     data = range(10)                # optimism!
>     s = '{:.1f}, {:.1f*, }'.format(head, data)
>
> produces
>
>     s == '99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0'
>
> then I'd be a lot more attracted to it.

ATM the colon separates the part of the format element that is
interpreted by the format method to find the formatted object from the
part that is passed to the __format__ method of the formatted object.
Perhaps an additional colon could be used to separate the separator
for when the formatted object is an iterable so that

     'foo {name:<fmt>:<sep>} bar'.format(name=<expr>)

could become

    'foo {_name} bar'.format(_name='<sep>'.join(format(o, '<fmt>') for o in <expr>))

The example would then be

    >>> '{:.1f}, {:.1f:, }'.format(99, range(10))
    '99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0'

--
Oscar

From srkunze at mail.de  Tue Sep  8 20:21:53 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 08 Sep 2015 20:21:53 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <CAPTjJmoZDybRaPhahR4sGCLBF=7mcbZBAH4O7hW5_q-eJfYHyw@mail.gmail.com>
References: <msmbko$ful$1@ger.gmane.org>	<55EF1FB3.5000407@mail.de>
 <CAPTjJmoZDybRaPhahR4sGCLBF=7mcbZBAH4O7hW5_q-eJfYHyw@mail.gmail.com>
Message-ID: <55EF2741.8070507@mail.de>

On 08.09.2015 19:58, Chris Angelico wrote:
> It'd be rather cool if it could be done as a special format string,
> though, which says "here's a separator, here's a format string, now
> iterate over the argument and format them with that string, then join
> them with that sep, and stick it in here". It might get a bit verbose,
> though.

Most of the time, the "format string" of yours I use is "str". So, 
defaulting to "str" would suffice at least from my point of view:

output = f'Have a look at this comma separated list: {fruits#, }.'

Substitute # by any character that you see fit.


I mean, seriously, you don't use a full-featured template engine, throw 
an iterable into it and hope that it just works and provides some 
readable output. Job done and you can move on.

What do you expect? From my point of view, the str + join suffices for 
once again 80% of the use-cases.


Best,
Sven

From srkunze at mail.de  Tue Sep  8 20:39:34 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 08 Sep 2015 20:39:34 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <m2egi9a62o.fsf@fastmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com>
Message-ID: <55EF2B66.4020509@mail.de>

On 08.09.2015 06:27, Random832 wrote:
> By "Identical" I meant there would be a single mechanism, including all
> new features, used for both str.format and str.__mod__ - i.e. the
> extensions would also be available using the % operator. The extensions
> would be implemented, but the % operator would not be left
> behind. Hence, identical.


Is it an issue when I think the % should be left behind? Just my 
personal preference.

It only increases the learning curve with no actual benefits.


Performance? Make {} faster.


Best,
Sven

From oscar.j.benjamin at gmail.com  Tue Sep  8 20:42:22 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 8 Sep 2015 19:42:22 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>
Message-ID: <CAHVvXxSqC5WW+6sQwUB9=K41YDsDsEHhft2knqWfGsjcfLF3ag@mail.gmail.com>

On 8 September 2015 at 19:15, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
> ATM the colon separates the part of the format element that is
> interpreted by the format method to find the formatted object from the
> part that is passed to the __format__ method of the formatted object.
> Perhaps an additional colon could be used to separate the separator
> for when the formatted object is an iterable so that
>
>      'foo {name:<fmt>:<sep>} bar'.format(name=<expr>)
>
> could become
>
>     'foo {_name} bar'.format(_name='<sep>'.join(format(o, '<fmt>') for o in <expr>))
>
> The example would then be
>
>     >>> '{:.1f}, {:.1f:, }'.format(99, range(10))
>     '99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0'

Except that obviously that wouldn't work because colon can be part of
the <fmt> string e.g. for datetime:

    >>> '{:%H:%M}'.format(datetime.datetime.now())
    '19:39'

So you'd need something before the colon to disambiguate. In which case perhaps

      'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)

meaning that if the * is there then everything after the second colon
is the format string.

Then it would be:

     >>> '{:.1f}, {*:, :.1f}'.format(99, range(10))
     '99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0'
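
The '<sep>:<fmt>' part can be prototyped as a plain function. A minimal
sketch, assuming the split happens at the first colon only, so that the
element format may contain colons but the separator may not. join_format is
a hypothetical helper, not an existing API.

```python
import datetime


def join_format(iterable, spec):
    """Expand '<sep>:<fmt>' into sep.join(format(item, fmt) for each item)."""
    sep, _colon, fmt = spec.partition(':')
    return sep.join(format(item, fmt) for item in iterable)


numbers = '{:.1f}, {}'.format(99, join_format(range(10), ', :.1f'))
print(numbers)
# 99.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0

# Colons in the element format survive, because only the first one splits.
times = [datetime.time(19, 39), datetime.time(8, 5)]
print(join_format(times, ' | :%H:%M'))
# 19:39 | 08:05
```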

--
Oscar

From random832 at fastmail.us  Tue Sep  8 21:38:04 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Tue, 08 Sep 2015 15:38:04 -0400
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1441741084.1614682.378110905.58940FE2@webmail.messagingengine.com>



On Tue, Sep 8, 2015, at 12:27, Stephen J. Turnbull wrote:
> I'm afraid this won't work unless restricted to be the last
> replacement field, where it just consumes all remaining positional
> arguments.  I think that restriction deserves a loud "ugh", but maybe
> it will give somebody a better idea.

So, this is the second time in as many weeks that I've suggested a new
!converter, but this seems like the place for it - have something like
"!join" which "converts" [wraps] the argument in a class whose
__format__ method knows how to join [and call __format__ on the
individual members].

So you could make a list of floating point numbers by
"List: {0:, |.2f!join}".format([1.2, 3.4, 5.6])

and it will simply call Joiner([1.2, 3.4, 5.6]).__format__(", |.2f")
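
The wrapper half of this idea already works today. A minimal sketch of such
a Joiner class, assuming the spec is '<sep>|<fmt>' split at the first '|'.
There is no "!join" conversion in current Python (only !r, !s and !a exist,
and a conversion is written before the colon), so the argument is wrapped by
hand here.

```python
class Joiner:
    """Wrap an iterable so that its format spec reads '<sep>|<fmt>'."""

    def __init__(self, iterable):
        self.iterable = iterable

    def __format__(self, spec):
        sep, _bar, fmt = spec.partition('|')
        return sep.join(format(item, fmt) for item in self.iterable)


result = "List: {0:, |.2f}".format(Joiner([1.2, 3.4, 5.6]))
print(result)
# List: 1.20, 3.40, 5.60
```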

From random832 at fastmail.us  Tue Sep  8 21:39:55 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Tue, 08 Sep 2015 15:39:55 -0400
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55EF2B66.4020509@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
Message-ID: <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>

On Tue, Sep 8, 2015, at 14:39, Sven R. Kunze wrote:
> Is it an issue when I think the % should be left behind? Just my 
> personal preference.
> 
> It only increases the learning curve with no actual benefits.

My take is: Having two format string grammars is worse than having one,
even if the %-grammar is worse than the {}-grammar.

From tritium-list at sdamon.com  Tue Sep  8 22:51:37 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Tue, 08 Sep 2015 16:51:37 -0400
Subject: [Python-ideas] NuGet/Chocolatey feed for releases
In-Reply-To: <89BE5DA9-CB1B-481E-9601-96F94057BFA0@gmail.com>
References: <55EEABFD.8040600@sdamon.com>
 <89BE5DA9-CB1B-481E-9601-96F94057BFA0@gmail.com>
Message-ID: <55EF4A59.6080605@sdamon.com>

This would be to bypass the chocolatey gallery - users of this would use 
the sources parameter to choco install.

On 9/8/2015 10:37, Ryan Gonzalez wrote:
> Beware: when the Chocolatey devs said "backlog", they *meant* backlog. 
> I submitted an updated PyPy package to them months ago, and it still 
> hasn't been updated yet.
>
> On September 8, 2015 4:35:57 AM CDT, Alexander Walters 
> <tritium-list at sdamon.com> wrote:
>
>     It would be incredibly convenient, especially for users of AppVeyor's
>     continuous integration service, if there were a(n official) repository
>     for chocolatey containing recent releases of python.  The official
>     Chocolatey gallery contains installers for the latest 2.7 and 3.4 (as of
>     this post).  What I am proposing would contain the most commonly used
>     pythons in testing (2.6 2.7 3.3 3.4 and future releases).
>
>     I am perfectly willing to set up a repo for my own use, but am posting
>     this to see if there is community support...or psf support... for
>     setting up an official repo.

From wes.turner at gmail.com  Wed Sep  9 00:18:03 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Tue, 8 Sep 2015 17:18:03 -0500
Subject: [Python-ideas] NuGet/Chocolatey feed for releases
In-Reply-To: <55EEABFD.8040600@sdamon.com>
References: <55EEABFD.8040600@sdamon.com>
Message-ID: <CACfEFw_xn4bsZvouD0f8uJORSEZCSKJ4VtkyP_-=HLWxK0j=9g@mail.gmail.com>

* Do you have a chocolatey nuget build script for [buildbot, jenkins]?
  Written in Python?
  * https://www.python.org/dev/buildbot/
    * https://github.com/conda/conda-recipes/tree/master/python-2.7.8
    * https://github.com/conda/conda-recipes/blob/master/python-3.5/meta.yaml

* A pkg repo maintainer could scrape/poll these
  * https://www.python.org/downloads/windows/
    * [ ] (schema.org RDFa/JSONLD for releases would be great)
* https://en.wikipedia.org/wiki/NuGet
* http://docs.continuum.io/anaconda/install#windows-install (2.7, 3.4)
  * http://docs.continuum.io/anaconda/pkg-docs

On Sep 8, 2015 4:37 AM, "Alexander Walters" <tritium-list at sdamon.com> wrote:

> It would be incredibly convenient, especially for users of AppVeyor's
> continuous integration service, if there were a(n official) repository for
> chocolatey containing recent releases of python.  The official Chocolatey
> gallery contains installers for the latest 2.7 and 3.4 (as of this post).
> What I am proposing would contain the most commonly used pythons in testing
> (2.6 2.7 3.3 3.4 and future releases).
>
> I am perfectly willing to set up a repo for my own use, but am posting
> this to see if there is community support...or psf support... for setting
> up an official repo.

From abarnert at yahoo.com  Wed Sep  9 02:03:22 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 8 Sep 2015 17:03:22 -0700
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <1441741084.1614682.378110905.58940FE2@webmail.messagingengine.com>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
 <1441741084.1614682.378110905.58940FE2@webmail.messagingengine.com>
Message-ID: <94C3256D-0380-497F-82CF-98B17C904222@yahoo.com>

On Sep 8, 2015, at 12:38, random832 at fastmail.us wrote:
> 
>> On Tue, Sep 8, 2015, at 12:27, Stephen J. Turnbull wrote:
>> I'm afraid this won't work unless restricted to be the last
>> replacement field, where it just consumes all remaining positional
>> arguments.  I think that restriction deserves a loud "ugh", but maybe
>> it will give somebody a better idea.
> 
> So, this is the second time in as many weeks that I've suggested a new
> !converter, but this seems like the place for it - have something like
> "!join" which "converts" [wraps] the argument in a class whose
> __format__ method knows how to join [and call __format__ on the
> individual members].
> 
> So you could make a list of floating point numbers by
> "List: {0:, |.2f!join}".format([1.2, 3.4, 5.6])
> 
> and it will simply call Joiner([1.2, 3.4, 5.6]).__format__(", |.2f")

I like this version. 

Even without the flexibility, just adding another hardcoded 'j' converter for iterables would be nice, but being able to program it would of course be better.

From abarnert at yahoo.com  Wed Sep  9 02:09:41 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 8 Sep 2015 17:09:41 -0700
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
Message-ID: <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com>

On Sep 8, 2015, at 12:39, random832 at fastmail.us wrote:
> 
>> On Tue, Sep 8, 2015, at 14:39, Sven R. Kunze wrote:
>> Is it an issue when I think the % should be left behind? Just my 
>> personal preference.
>> 
>> It only increases the learning curve with no actual benefits.
> 
> My take is: Having two format string grammars is worse than having one,
> even if the %-grammar is worse than the {}-grammar.

I think it's already been established why % formatting is not going away any time soon.

As for de-emphasizing it, I think that's already done pretty well in the current docs. The tutorial has a nice long introduction to str.format, a one-paragraph section on "old string formatting" with a single %5.3f example, and a one-sentence mention of Template. The stdtypes chapter in the library reference explains the difference between the two in a way that makes format sound more attractive for novices, and then has details on each one as appropriate. What else should be done?

From stephen at xemacs.org  Wed Sep  9 04:37:20 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 09 Sep 2015 11:37:20 +0900
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>
Message-ID: <87d1xs1fof.fsf@uwakimon.sk.tsukuba.ac.jp>

Oscar Benjamin writes:

 > ATM the colon separates the part of the format element that is
 > interpreted by the format method to find the formatted object from the
 > part that is passed to the __format__ method of the formatted object.
 > Perhaps an additional colon could be used to separate the separator
 > for when the formatted object is an iterable so that
 > 
 >      'foo {name:<fmt>:<sep>} bar'.format(name=<expr>)

I thought about a colon, but that loses if the objects are times.  I
guess that kills '/' and '-', too, since the objects might be dates.
Of course there may be a tricky way to use these that I haven't
thought of, or they could be escaped for use in <fmt>.
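
To make the problem concrete, here is a small sketch of what goes wrong. The '<fmt>:<sep>' layout is the hypothetical one from the quoted proposal, and naive_split is a made-up helper name used only for this illustration.

```python
import datetime


def naive_split(spec):
    """Split a hypothetical '<fmt>:<sep>' spec on every colon."""
    return spec.split(':')


t = datetime.time(12, 30)
fmt = '%H:%M'
print(format(t, fmt))            # 12:30

# Appending ':<sep>' to a spec that already contains a colon makes the
# boundary between <fmt> and <sep> impossible to find by splitting.
combined = fmt + ':' + ', '
print(naive_split(combined))     # ['%H', '%M', ', ']
```

Splitting from the right instead would rescue this case, but it breaks again as soon as the separator itself contains a colon.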


From stephen at xemacs.org  Wed Sep  9 05:05:44 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 09 Sep 2015 12:05:44 +0900
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
 <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
Message-ID: <87bndc1ed3.fsf@uwakimon.sk.tsukuba.ac.jp>

random832 at fastmail.us writes:

 > My take is: Having two format string grammars is worse than having
 > one, even if the %-grammar is worse than the {}-grammar.

The same has been said for having two (or more) loop grammars.

    Where have you gone, Repeat Until
    A nation turns its lonely eyes to you
    What's that you say, Mrs Robinson
    Pascal itself has turned to Modula 2 (or 3) (how 'bout 4?!)

The point is that not all experiments can be contained in a single
personal branch posted to GitHub.  (Kudos to Trent who seems to be
pulling off that trick as we controverse.  Even if it's possible it's
not necessarily easy!)  We all agree in isolation with the value you
express (more or less TOOWTDI), but it's the balance of many
desiderata that makes Python a great language.  But that balance
clearly is against you: At the time {}-formatting was introduced,
there had already been several less-than-wildly-successful experiments
(including string.Template), yet the PEP was nevertheless accepted.

In string formatting, the consensus is evidently that it's difficult
enough to measure improvement objectively that sufficiently plausible
experiments will still be admitted into the stdlib (or not, in the
case of the much-delayed PEP 461 -- way to go, Ethan! -- the decision
*not* to have backward compatible formatting for bytestrings was
itself an experiment in this sense).  And it's a difficult enough
design space that the principle of minimal sufficient change (implicit
in what you're saying) was not strictly applied to {}-formatting (or
string.Template, for that matter).

I'm not just being ornery, string formatting is near and dear to my
heart.  I'm genuinely curious why you choose a much more conservative
balance in this area than Python has.  But to my eyes your posts so
far amount to an attempted wake-up call: "TOOWTDI is more important in
string formatting than you all seem to think!" and no more.

Sincere regards,

From tritium-list at sdamon.com  Wed Sep  9 07:36:23 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Wed, 09 Sep 2015 01:36:23 -0400
Subject: [Python-ideas] NuGet/Chocolatey feed for releases
In-Reply-To: <CACfEFw_xn4bsZvouD0f8uJORSEZCSKJ4VtkyP_-=HLWxK0j=9g@mail.gmail.com>
References: <55EEABFD.8040600@sdamon.com>
 <CACfEFw_xn4bsZvouD0f8uJORSEZCSKJ4VtkyP_-=HLWxK0j=9g@mail.gmail.com>
Message-ID: <55EFC557.9090705@sdamon.com>

I do not see how a build script (to build python?) would be needed. The 
existing installers would be sufficient.  The packages themselves would 
have to be XML and powershell (that is the NuGet/Chocolatey infrastructure.)

As it stands, hosting your own NuGet/Chocolatey feed requires a Windows 
server (not ideal, but workable).  I am finding it hard to actually find 
the API specification.

On 9/8/2015 18:18, Wes Turner wrote:
>
> * Do you have a chocolatey nuget build  script for [buildbot, 
> jenkins]? Written in Python?
>   * https://www.python.org/dev/buildbot/
>     * https://github.com/conda/conda-recipes/tree/master/python-2.7.8
>     * 
> https://github.com/conda/conda-recipes/blob/master/python-3.5/meta.yaml
>
> * A pkg repo maintainer could
>    scrape/poll these
>    * https://www.python.org/downloads/windows/
>       * [ ] (schema.org <http://schema.org> RDFa/JSONLD for releases 
> would be great)
> * https://en.wikipedia.org/wiki/NuGet
> * http://docs.continuum.io/anaconda/install#windows-install (2.7, 3.4)
>   * http://docs.continuum.io/anaconda/pkg-docs
>
> On Sep 8, 2015 4:37 AM, "Alexander Walters" <tritium-list at sdamon.com 
> <mailto:tritium-list at sdamon.com>> wrote:
>
>     It would be incredibly convenient, especially for users of
>     AppVeyor's continuous integration service, if there were a(n
>     official) repository for chocolatey containing recent releases of
>     python.  The official Chocolatey gallery contains installers for
>     the latest 2.7 and 3.4 (as of this post).  What I am proposing
>     would contain the most commonly used pythons in testing (2.6 2.7
>     3.3 3.4 and future releases).
>
>     I am perfectly willing to set up a repo for my own use, but am
>     posting this to see if there is community support...or psf
>     support... for setting up an official repo.
>


From ben+python at benfinney.id.au  Wed Sep  9 09:33:57 2015
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 09 Sep 2015 17:33:57 +1000
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
Message-ID: <85h9n482sa.fsf@benfinney.id.au>

Paul Moore <p.f.moore at gmail.com> writes:

> On 5 September 2015 at 09:30, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Unfortunately, I've yet to convince the rest of PyPA (let alone the
> > community at large) that telling people to call "pip" directly is *bad
> > advice* (as it breaks in too many cases that beginners are going to
> > encounter), so it would be helpful if folks helping beginners on
> > python-list and python-tutor could provide feedback supporting that
> > perspective by filing an issue against
> > https://github.com/pypa/python-packaging-user-guide
>
> I would love to see "python -m pip" (or where the launcher is
> appropriate, the shorter "py -m pip") be the canonical invocation used
> in all documentation, discussion and advice on running pip.

Contrariwise, I would like to see "pip" become the canonical invocation
used in all documentation, discussion, and advice; and if there are any
technical barriers to that least-surprising method, to see those
barriers addressed and removed.

> The main problems seem to be (1) "but just typing "pip" is shorter and
> easier to remember",

With the concomitant benefit that it's easier to teach and learn. This
is not insignificant.

> (2) "I don't understand why pip can't just be a normal command"

This is my main objection, but rather stated as: We already have a
firmly-established naming convention for user-level commands, that works
in a huge number of languages; Python has no good reason to be an
exception, especially not in one of the first commands that new users
will need to encounter.

If something is preventing "pip" from being the command to type to run
Pip, then surely the right place to apply pressure is not on everyone
who instructs and documents and interfaces with end-users now and
indefinitely; but instead on whatever is preventing the One Obvious Way
to work. No?

-- 
 \      "Yesterday I saw a subliminal advertising executive for just a |
  `\                                        second." -- Steven Wright |
_o__)                                                                  |
Ben Finney


From p.f.moore at gmail.com  Wed Sep  9 10:56:39 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 9 Sep 2015 09:56:39 +0100
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <85h9n482sa.fsf@benfinney.id.au>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
Message-ID: <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>

On 9 September 2015 at 08:33, Ben Finney <ben+python at benfinney.id.au> wrote:
> Contrariwise, I would like to see "pip" become the canonical invocation
> used in all documentation, discussion, and advice; and if there are any
> technical barriers to that least-surprising method, to see those
> barriers addressed and removed.

There is at least one fundamental, technical, and (so far) unsolvable
issue with using "pip" as the canonical invocation.

    pip install -U pip

fails on Windows, because the exe wrapper cannot be replaced by a
process running that wrapper (the "pip" command runs pip.exe which
needs to replace pip.exe, but can't because the OS has it open as the
current running process).

There have been a number of proposals for fixing this, but none have
been viable so far. We'd need someone to provide working code (not
just suggestions on things that might work, but actual working code)
before we could recommend anything other than "python -m pip install
-U pip" as the correct way of upgrading pip. And recommending one
thing when upgrading pip, but another for "normal use" is also
confusing for beginners. (And we have evidence from the pip issue
tracker that people *do* find this confusing, and not just beginners...)

Apart from that issue, which is Windows only (and thus some people
find it less compelling) we have also had reported issues of people
running pip, and it installs things into the "wrong" Python
installation. This is typically because of PATH configuration issues,
where "pip" is being found via one PATH element, but "python" is found
via a different one. I don't have specifics to hand, so I can't
clarify *how* people have managed to construct such breakage, but I
can state that it happens, and the relevant people are usually very
confused by the results. Again, "python -m pip" avoids any confusion
here - that invocation clearly and unambiguously installs to the
Python installation you invoked.
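
For scripts that need to drive pip, the same principle can be sketched as below. The helper name pip_command is made up for illustration; the only things it relies on are sys.executable and the "-m pip" invocation described above.

```python
import sys


def pip_command(*args):
    """Build an argv list that runs pip under the current interpreter.

    Using sys.executable ties pip to the Python that is running this
    code, so a "pip" found elsewhere on PATH cannot be picked up.
    """
    return [sys.executable, "-m", "pip", *args]


# Typical use (not executed here):
#     subprocess.run(pip_command("install", "-U", "pip"), check=True)
print(pip_command("install", "-U", "pip"))
```

Because the command goes through the interpreter rather than pip.exe, this form also avoids the Windows self-upgrade problem.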

In actual fact, if it weren't for the backward compatibility issues it
would cause, I'd be tempted to argue that pip shouldn't provide any
wrapper at all, and *only* offer "python -m pip" as a means of
invoking it (precisely because it's so closely tied to the Python
interpreter used to invoke it). But that's never going to happen and I
don't intend it as a serious proposal.

Paul

From oscar.j.benjamin at gmail.com  Wed Sep  9 13:56:53 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Wed, 9 Sep 2015 12:56:53 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <87d1xs1fof.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>
 <87d1xs1fof.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAHVvXxSPouxOyx3ZXBG9tzD6k_Q+1CDVtJPzKrSeeatwiReHew@mail.gmail.com>

On 9 September 2015 at 03:37, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Oscar Benjamin writes:
>
>  > ATM the colon separates the part of the format element that is
>  > interpreted by the format method to find the formatted object from the
>  > part that is passed to the __format__ method of the formatted object.
>  > Perhaps an additional colon could be used to separate the separator
>  > for when the formatted object is an iterable so that
>  >
>  >      'foo {name:<fmt>:<sep>} bar'.format(name=<expr>)
>
> I thought about a colon, but that loses if the objects are times.  I
> guess that kills '/' and '-', too, since the objects might be dates.
> Of course there may be a tricky way to use these that I haven't
> thought of, or they could be escaped for use in <fmt>.

You can use the * at the start of the format element (before the first
colon). It can then imply that there will be two colons to separate
the three parts with any further colons part of fmt e.g.:

    '{*<expr>:<sep>:<fmt>}'.format(...)

So then you can have:

    >>> '{*numbers:, :.1f}'.format(numbers)
    '1.0, 2.0, 3.0'

    >>> '{*times:, :%H:%M}'.format(times)
    '12:30, 14:50, 22:39'
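
A rough sketch of how the three parts could be separated, assuming the '*<expr>:<sep>:<fmt>' layout above. The function names split_star_field and render are hypothetical, and values are looked up by keyword only to keep the example short.

```python
import datetime


def split_star_field(field):
    """Split '*name:sep:fmt' into (name, sep, fmt).

    Only the first two colons are treated as delimiters, so any further
    colons stay inside fmt (e.g. '%H:%M').
    """
    if not field.startswith('*'):
        raise ValueError('not a starred replacement field')
    name, sep, fmt = field[1:].split(':', 2)
    return name, sep, fmt


def render(field, **values):
    """Format each element of values[name] with fmt and join with sep."""
    name, sep, fmt = split_star_field(field)
    return sep.join(format(e, fmt) for e in values[name])


times = [datetime.time(12, 30), datetime.time(14, 50), datetime.time(22, 39)]
print(render('*numbers:, :.1f', numbers=[1, 2, 3]))   # 1.0, 2.0, 3.0
print(render('*times:, :%H:%M', times=times))         # 12:30, 14:50, 22:39
```

The price is that the separator can no longer contain a colon.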

--
Oscar

From wolfgang.maier at biologie.uni-freiburg.de  Wed Sep  9 15:41:56 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Wed, 9 Sep 2015 15:41:56 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <msmbko$ful$1@ger.gmane.org>
References: <msmbko$ful$1@ger.gmane.org>
Message-ID: <mspcv3$8mu$1@ger.gmane.org>

Thanks for all the feedback!

Just to summarize ideas and to clarify what I had in mind when proposing 
this:

1)
Yes, I would like to have this work with any (or at least most) 
iterables, not just with my own custom type that I used for illustration.
So having this handled by the format method rather than each object's 
__format__ method could make sense. It was just simple to implement it 
in Python through the __format__ method.

Why did I propose * as the first character of the new format spec string?
Because I think you really need some token to state unambiguously[1] 
that what follows is a format specification that involves going through 
the elements of the iterable instead of working on the container object 
itself. I thought that * is most intuitive to understand because of its 
use in unpacking.

[1] unfortunately, in my original proposal the leading * can still be 
ambiguous because *<, *> *= and *^ could mean element joining with <, >, 
= or ^ as separators or aligning of the container's formatted string 
representation using * as the fill character.


Ideally, the * should be the very first thing inside a replacement field 
- pretty much as suggested by Oscar - and should not be part of the 
format spec. This is not feasible through a format spec handled by the 
__format__ method, but through a modified str.format method, i.e., 
that's another argument for this approach. Examples:

'foo {*name:<sep>} bar'.format(name=<expr>)
'foo {*0:<sep>} bar {1}'.format(x, y)
'foo {*:<sep>} bar'.format(x)


2)
As for including an additional format spec to apply to the elements of 
the iterable:
I decided against including this in the original proposal to keep it 
simple and to get feedback on the general idea first.
The problem here is that any solution requires an additional token to 
indicate the boundary between the <separator> part and the element 
format spec. Since you would not want to have anyone's custom format 
spec broken by this, this boils down to disallowing one reserved 
character in the <separator> part, like in Oscar's example:

'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)

where <sep> cannot contain a colon.

So that character would have to be chosen carefully (both : and | are 
quite readable, but also relatively common element separators I guess).
In addition, the <separator> part should be non-optional (though the 
empty string should be allowed) to guarantee the presence of the 
delimiter token, which avoids accidental splitting of lonely element 
format specs into a "<sep>" and <fmt> part:

# format the elements of name using <fmt>, join them using <sep>
'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
# format the elements of name using <fmt>, join them using ''
'foo {*name::<fmt>} bar'.format(name=<expr>)
# a syntax error
'foo {*name:<fmt>} bar'.format(name=<expr>)

On the other hand, these restrictions do not look too dramatic given the 
flexibility gain in most situations.

So to sum up how this could work:
If str.format encounters a leading * in a replacement field, it splits 
the format spec (i.e. everything after the first colon) on the first 
occurrence of the <sep>|<fmt> separator (possibly ':' or '|') and does, 
essentially:

<sep>.join(format(e, <fmt>) for e in iterable)

Without the *, it just works the current way.
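
The behaviour summed up above can be prototyped without touching str.format by subclassing string.Formatter. This is only a sketch: the names StarFormatter and _Starred are invented here, '|' is assumed as the <sep>|<fmt> delimiter, and the empty-name form '{*:<sep>}' is not handled.

```python
import string


class _Starred:
    """Marker wrapper for a value that was referenced as {*name...}."""

    def __init__(self, iterable):
        self.iterable = iterable


class StarFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        # A leading '*' asks for element-wise formatting; strip it and
        # look the rest up the normal way.
        if field_name.startswith('*'):
            obj, used = super().get_field(field_name[1:], args, kwargs)
            return _Starred(obj), used
        return super().get_field(field_name, args, kwargs)

    def format_field(self, value, format_spec):
        if isinstance(value, _Starred):
            sep, delim, fmt = format_spec.partition('|')
            if not delim:
                raise ValueError("starred field needs '<sep>|<fmt>'")
            return sep.join(format(e, fmt) for e in value.iterable)
        # Without the *, formatting works the current way.
        return super().format_field(value, format_spec)


f = StarFormatter()
print(f.format('foo {*name:, |.2f} bar', name=[1, 2.5]))   # foo 1.00, 2.50 bar
print(f.format('{0} {*1:-|d}', 'x', [1, 2, 3]))            # x 1-2-3
```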


3)
Finally, the alternative idea of having the new functionality handled by 
a new !converter, like:

"List: {0!j:,}".format([1.2, 3.4, 5.6])

I considered this idea before posting the original proposal, but, in 
addition to requiring a change to str.format (which would need to 
recognize the new token), this approach would need either:

- a new special method (e.g., __join__) to be implemented for every type 
that should support it, which is worse than for my original proposal or

- the str.format method must react directly to the converter flag, which 
is then no different to the above solution just that it uses !j instead 
of *. Personally, I find the * syntax more readable, plus, the !j syntax 
would then suggest that this is a regular converter (calling a special 
method of the object) when, in fact, it is not.
Please correct me, if I misunderstood something about this alternative 
proposal.

Best,
Wolfgang


From eric at trueblade.com  Wed Sep  9 16:02:27 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 9 Sep 2015 10:02:27 -0400
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <mspcv3$8mu$1@ger.gmane.org>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
Message-ID: <55F03BF3.50106@trueblade.com>

At some point, instead of complicating how format works internally, you
should just write a function that does what you want. I realize there's
a continuum between '{}'.format(iterable) and
'{<really-really-complex-stuff}'.format(iterable). It's not clear where
to draw the line. But when the solution is to bake knowledge of
iterables into .format(), I think we've passed the point where we should
switch to a function: '{}'.format(some_function(iterable)).

In any event, if you want to play with this, I suggest you write
some_function(iterable) that does what you want, first.

Eric.


From wolfgang.maier at biologie.uni-freiburg.de  Wed Sep  9 16:32:08 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Wed, 9 Sep 2015 16:32:08 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55F03BF3.50106@trueblade.com>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <55F03BF3.50106@trueblade.com>
Message-ID: <55F042E8.10509@biologie.uni-freiburg.de>

Well, here it is:

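def unpack_format(iterable, format_spec=None):
    # Without a spec, join the elements' default string forms (an
    # assumed default; the first draft returned None in this case).
    if not format_spec:
        return ''.join(format(e) for e in iterable)
    try:
        # Text before the first '|' is the separator, the rest is the
        # format spec applied to each element.
        sep, element_fmt = format_spec.split('|', 1)
    except ValueError:
        raise TypeError('Invalid format_spec for iterable formatting')
    return sep.join(format(e, element_fmt) for e in iterable)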

usage examples:

# '0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00'
'{}'.format(unpack_format(range(10), ', |.2f'))

# '0.001.002.003.004.005.006.007.008.009.00'
'{}'.format(unpack_format(range(10), '|.2f'))

# raises TypeError: the '|' delimiter is missing
'{}'.format(unpack_format(range(10), '.2f'))

Best,
Wolfgang


On 09.09.2015 16:02, Eric V. Smith wrote:
> At some point, instead of complicating how format works internally, you
> should just write a function that does what you want. I realize there's
> a continuum between '{}'.format(iterable) and
> '{<really-really-complex-stuff}'.format(iterable). It's not clear where
> to draw the line. But when the solution is to bake knowledge of
> iterables into .format(), I think we've passed the point where we should
> switch to a function: '{}'.format(some_function(iterable)).
>
> In any event, If you want to play with this, I suggest you write
> some_function(iterable) that does what you want, first.
>
> Eric.
>
> On 9/9/2015 9:41 AM, Wolfgang Maier wrote:
>> Thanks for all the feedback!
>>
>> Just to summarize ideas and to clarify what I had in mind when proposing
>> this:
>>
>> 1)
>> Yes, I would like to have this work with any (or at least most)
>> iterables, not just with my own custom type that I used for illustration.
>> So having this handled by the format method rather than each object's
>> __format__ method could make sense. It was just simple to implement it
>> in Python through the __format__ method.
>>
>> Why did I propose * as the first character of the new format spec string?
>> Because I think you really need some token to state unambiguously[1]
>> that what follows is a format specification that involves going through
>> the elements of the iterable instead of working on the container object
>> itself. I thought that * is most intuitive to understand because of its
>> use in unpacking.
>>
>> [1] unfortunately, in my original proposal the leading * can still be
>> ambiguous because *<, *> *= and *^ could mean element joining with <, >,
>> = or ^ as separators or aligning of the container's formatted string
>> representation using * as the fill character.
>>
>>
>> Ideally, the * should be the very first thing inside a replacement field
>> - pretty much as suggested by Oscar - and should not be part of the
>> format spec. This is not feasible through a format spec handled by the
>> __format__ method, but through a modified str.format method, i.e.,
>> that's another argument for this approach. Examples:
>>
>> 'foo {*name:<sep>} bar'.format(name=<expr>)
>> 'foo {*0:<sep>} bar {1}'.format(x, y)
>> 'foo {*:<sep>} bar'.format(x)
>>
>>
>> 2)
>> As for including an additional format spec to apply to the elements of
>> the iterable:
>> I decided against including this in the original proposal to keep it
>> simple and to get feedback on the general idea first.
>> The problem here is that any solution requires an additional token to
>> indicate the boundary between the <separator> part and the element
>> format spec. Since you would not want to have anyone's custom format
>> spec broken by this, this boils down to disallowing one reserved
>> character in the <separator> part, like in Oscar's example:
>>
>> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
>>
>> where <sep> cannot contain a colon.
>>
>> So that character would have to be chosen carefully (both : and | are
>> quite readable, but also relatively common element separators I guess).
>> In addition, the <separator> part should be non-optional (though the
>> empty string should be allowed) to guarantee the presence of the
>> delimiter token, which avoids accidental splitting of lonely element
>> format specs into a "<sep>" and <fmt> part:
>>
>> # format the elements of name using <fmt>, join them using <sep>
>> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
>> # format the elements of name using <fmt>, join them using ''
>> 'foo {*name::<fmt>} bar'.format(name=<expr>)
>> # a syntax error
>> 'foo {*name:<fmt>} bar'.format(name=<expr>)
>>
>> On the other hand, these restriction do not look too dramatic given the
>> flexibility gain in most situations.
>>
>> So to sum up how this could work:
>> If str.format encounters a leading * in a replacement field, it splits
>> the format spec (i.e. everything after the first colon) on the first
>> occurrence of the <sep>|<fmt> separator (possibly ':' or '|') and does,
>> essentially:
>>
>> <sep>.join(format(e, <fmt>) for e in iterable)
>>
>> Without the *, it just works the current way.
>>
>>
>> 3)
>> Finally, the alternative idea of having the new functionality handled by
>> a new !converter, like:
>>
>> "List: {0!j:,}".format([1.2, 3.4, 5.6])
>>
>> I considered this idea before posting the original proposal, but, in
>> addition to requiring a change to str.format (which would need to
>> recognize the new token), this approach would need either:
>>
>> - a new special method (e.g., __join__) to be implemented for every type
>> that should support it, which is worse than for my original proposal or
>>
>> - the str.format method must react directly to the converter flag, which
>> is then no different to the above solution just that it uses !j instead
>> of *. Personally, I find the * syntax more readable, plus, the !j syntax
>> would then suggest that this is a regular converter (calling a special
>> method of the object) when, in fact, it is not.
>> Please correct me, if I misunderstood something about this alternative
>> proposal.
>>
>> Best,
>> Wolfgang
>>


From p.f.moore at gmail.com  Wed Sep  9 16:41:19 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 9 Sep 2015 15:41:19 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55F042E8.10509@biologie.uni-freiburg.de>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <55F03BF3.50106@trueblade.com>
 <55F042E8.10509@biologie.uni-freiburg.de>
Message-ID: <CACac1F8pzQbQj0QoY4eQw1GXdKwjucMmF957TMNosxS2rtdH2w@mail.gmail.com>

On 9 September 2015 at 15:32, Wolfgang Maier
<wolfgang.maier at biologie.uni-freiburg.de> wrote:
> Well, here it is:
>
> def unpack_format (iterable, format_spec=None):
>     if format_spec:
>         try:
>             sep, element_fmt = format_spec.split('|', 1)
>         except ValueError:
>             raise TypeError('Invalid format_spec for iterable formatting')
>         return sep.join(format(e, element_fmt) for e in iterable)
>
> usage examples:
>
> # '0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00'
> '{}'.format(unpack_format(range(10), ', |.2f'))
>
> # '0.001.002.003.004.005.006.007.008.009.00'
> '{}'.format(unpack_format(range(10), '|.2f'))
>
> # invalid syntax
> '{}'.format(unpack_format(range(10), '.2f'))

Honestly, it seems to me that

def format_iterable(it, spec, sep=', '):
    return sep.join(format(e, spec) for e in it)

# '0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00'
format_iterable(range(10), '.2f')

# '0.001.002.003.004.005.006.007.008.009.00'
format_iterable(range(10), '.2f', sep='')

is perfectly adequate. It reads more clearly to me than the "sep|fmt"
syntax does, as well.

Paul

From wolfgang.maier at biologie.uni-freiburg.de  Wed Sep  9 16:41:26 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Wed, 9 Sep 2015 16:41:26 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55F042E8.10509@biologie.uni-freiburg.de>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <55F03BF3.50106@trueblade.com> <55F042E8.10509@biologie.uni-freiburg.de>
Message-ID: <mspgem$5qk$1@ger.gmane.org>

Or with default behavior when there is no format_spec:

def unpack_format(iterable, format_spec=None):
    if format_spec:
        # Expect "<sep>|<fmt>": split on the first '|' only.
        try:
            sep, element_fmt = format_spec.split('|', 1)
        except ValueError:
            raise TypeError('Invalid format_spec for iterable formatting')
        return sep.join(format(e, element_fmt) for e in iterable)
    else:
        # No format_spec: join the default formatting of each element.
        return ' '.join(format(e) for e in iterable)


On 09.09.2015 16:32, Wolfgang Maier wrote:
> Well, here it is:
>
> def unpack_format (iterable, format_spec=None):
>      if format_spec:
>          try:
>              sep, element_fmt = format_spec.split('|', 1)
>          except ValueError:
>              raise TypeError('Invalid format_spec for iterable formatting')
>          return sep.join(format(e, element_fmt) for e in iterable)
>
> usage examples:
>
> # '0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00'
> '{}'.format(unpack_format(range(10), ', |.2f'))
>
> # '0.001.002.003.004.005.006.007.008.009.00'
> '{}'.format(unpack_format(range(10), '|.2f'))
>
> # invalid syntax
> '{}'.format(unpack_format(range(10), '.2f'))
>
> Best,
> Wolfgang
>
>
> On 09.09.2015 16:02, Eric V. Smith wrote:
>> At some point, instead of complicating how format works internally, you
>> should just write a function that does what you want. I realize there's
>> a continuum between '{}'.format(iterable) and
>> '{<really-really-complex-stuff}'.format(iterable). It's not clear where
>> to draw the line. But when the solution is to bake knowledge of
>> iterables into .format(), I think we've passed the point where we should
>> switch to a function: '{}'.format(some_function(iterable)).
>>
>> In any event, If you want to play with this, I suggest you write
>> some_function(iterable) that does what you want, first.
>>
>> Eric.
>>
>> On 9/9/2015 9:41 AM, Wolfgang Maier wrote:
>>> Thanks for all the feedback!
>>>
>>> Just to summarize ideas and to clarify what I had in mind when proposing
>>> this:
>>>
>>> 1)
>>> Yes, I would like to have this work with any (or at least most)
>>> iterables, not just with my own custom type that I used for
>>> illustration.
>>> So having this handled by the format method rather than each object's
>>> __format__ method could make sense. It was just simple to implement it
>>> in Python through the __format__ method.
>>>
>>> Why did I propose * as the first character of the new format spec
>>> string?
>>> Because I think you really need some token to state unambiguously[1]
>>> that what follows is a format specification that involves going through
>>> the elements of the iterable instead of working on the container object
>>> itself. I thought that * is most intuitive to understand because of its
>>> use in unpacking.
>>>
>>> [1] unfortunately, in my original proposal the leading * can still be
>>> ambiguous because *<, *> *= and *^ could mean element joining with <, >,
>>> = or ^ as separators or aligning of the container's formatted string
>>> representation using * as the fill character.
>>>
>>>
>>> Ideally, the * should be the very first thing inside a replacement field
>>> - pretty much as suggested by Oscar - and should not be part of the
>>> format spec. This is not feasible through a format spec handled by the
>>> __format__ method, but through a modified str.format method, i.e.,
>>> that's another argument for this approach. Examples:
>>>
>>> 'foo {*name:<sep>} bar'.format(name=<expr>)
>>> 'foo {*0:<sep>} bar {1}'.format(x, y)
>>> 'foo {*:<sep>} bar'.format(x)
>>>
>>>
>>> 2)
>>> As for including an additional format spec to apply to the elements of
>>> the iterable:
>>> I decided against including this in the original proposal to keep it
>>> simple and to get feedback on the general idea first.
>>> The problem here is that any solution requires an additional token to
>>> indicate the boundary between the <separator> part and the element
>>> format spec. Since you would not want to have anyone's custom format
>>> spec broken by this, this boils down to disallowing one reserved
>>> character in the <separator> part, like in Oscar's example:
>>>
>>> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
>>>
>>> where <sep> cannot contain a colon.
>>>
>>> So that character would have to be chosen carefully (both : and | are
>>> quite readable, but also relatively common element separators I guess).
>>> In addition, the <separator> part should be non-optional (though the
>>> empty string should be allowed) to guarantee the presence of the
>>> delimiter token, which avoids accidental splitting of lonely element
>>> format specs into a "<sep>" and <fmt> part:
>>>
>>> # format the elements of name using <fmt>, join them using <sep>
>>> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
>>> # format the elements of name using <fmt>, join them using ''
>>> 'foo {*name::<fmt>} bar'.format(name=<expr>)
>>> # a syntax error
>>> 'foo {*name:<fmt>} bar'.format(name=<expr>)
>>>
>>> On the other hand, these restrictions do not look too dramatic given the
>>> flexibility gain in most situations.
>>>
>>> So to sum up how this could work:
>>> If str.format encounters a leading * in a replacement field, it splits
>>> the format spec (i.e. everything after the first colon) on the first
>>> occurrence of the <sep>|<fmt> separator (possibly ':' or '|') and does,
>>> essentially:
>>>
>>> <sep>.join(format(e, <fmt>) for e in iterable)
>>>
>>> Without the *, it just works the current way.
>>>
>>>
>>> 3)
>>> Finally, the alternative idea of having the new functionality handled by
>>> a new !converter, like:
>>>
>>> "List: {0!j:,}".format([1.2, 3.4, 5.6])
>>>
>>> I considered this idea before posting the original proposal, but, in
>>> addition to requiring a change to str.format (which would need to
>>> recognize the new token), this approach would need either:
>>>
>>> - a new special method (e.g., __join__) to be implemented for every type
>>> that should support it, which is worse than for my original proposal or
>>>
>>> - the str.format method must react directly to the converter flag, which
>>> is then no different to the above solution just that it uses !j instead
>>> of *. Personally, I find the * syntax more readable, plus, the !j syntax
>>> would then suggest that this is a regular converter (calling a special
>>> method of the object) when, in fact, it is not.
>>> Please correct me, if I misunderstood something about this alternative
>>> proposal.
>>>
>>> Best,
>>> Wolfgang
>>>


From p.f.moore at gmail.com  Wed Sep  9 16:58:14 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 9 Sep 2015 15:58:14 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <mspgem$5qk$1@ger.gmane.org>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <55F03BF3.50106@trueblade.com>
 <55F042E8.10509@biologie.uni-freiburg.de>
 <mspgem$5qk$1@ger.gmane.org>
Message-ID: <CACac1F-y55iGND3OkET2iC78ZXJZ9q4UqSwehuo_dYthzkzBkg@mail.gmail.com>

On 9 September 2015 at 15:41, Wolfgang Maier
<wolfgang.maier at biologie.uni-freiburg.de> wrote:
> def unpack_format (iterable, format_spec=None):
>     if format_spec:
>         try:
>             sep, element_fmt = format_spec.split('|', 1)
>         except ValueError:
>             raise TypeError('Invalid format_spec for iterable formatting')
>         return sep.join(format(e, element_fmt) for e in iterable)
>     else:
>         return ' '.join(format(e) for e in iterable)

>From the docs, "The default format_spec is an empty string which
usually gives the same effect as calling str(value)"

So you can just use format_spec='' and avoid the extra conditional logic.
Paul
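
A minimal sketch of that simplification, assuming the same 'sep|fmt'
convention as the quoted function. The default value ' |' (a single space
as separator, empty element format) is an assumption made here so that the
no-argument case needs no separate branch; it is not from the thread:

```python
def unpack_format(iterable, format_spec=' |'):
    # Split "<sep>|<fmt>" on the first '|'; a spec without '|' is rejected.
    try:
        sep, element_fmt = format_spec.split('|', 1)
    except ValueError:
        raise TypeError('Invalid format_spec for iterable formatting')
    # format(e, '') usually behaves like str(e), so no else-branch is needed.
    return sep.join(format(e, element_fmt) for e in iterable)
```

With this default, unpack_format(range(3)) joins the plain string forms
with spaces, while an explicit spec such as ', |.2f' behaves as before.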

From srkunze at mail.de  Wed Sep  9 18:05:10 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 09 Sep 2015 18:05:10 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com>
Message-ID: <55F058B6.9000202@mail.de>

On 09.09.2015 02:09, Andrew Barnert via Python-ideas wrote:
> I think it's already been established why % formatting is not going away any time soon.
>
> As for de-emphasizing it, I think that's already done pretty well in the current docs. The tutorial has a nice long introduction to str.format, a one-paragraph section on "old string formatting" with a single %5.3f example, and a one-sentence mention of Template. The stdtypes chapter in the library reference explains the difference between the two in a way that makes format sound more attractive for novices, and then has details on each one as appropriate. What else should be done?

I had difficulty finding what you mean by the tutorial. But hey, being a
Python user for years and not knowing where the official tutorial resides...

Anyway, Google showed me version 2.7 of the tutorial. Its link to the
stdtypes documentation therefore leads to a page without the note that,
say, 3.5 has:

"Note: The formatting operations described here exhibit a variety of 
quirks that lead to a number of common errors (such as failing to 
display tuples and dictionaries correctly). Using the newer str.format() 
interface helps avoid these errors, and also provides a generally more 
powerful, flexible and extensible approach to formatting text."

So, adding it to the 2.7 docs would be a start.


I still don't understand what's wrong with deprecating %, but okay. I
think f-strings will push {} to widespread adoption.


Best,
Sven

From guido at python.org  Wed Sep  9 18:35:12 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Sep 2015 09:35:12 -0700
Subject: [Python-ideas] Should our default random number generator be secure?
Message-ID: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>

I've received several long emails from Theo de Raadt (OpenBSD founder)
about Python's default random number generator. This is the random module,
and it defaults to a Mersenne Twister (MT) seeded by 2500 bytes of entropy
taken from os.urandom().

Theo's worry is that while the starting seed is fine, MT is not good when
random numbers are used for crypto and other security purposes. I've
countered that it's not meant for that (you should use
random.SystemRandom() or os.urandom() for that) but he counters that people
don't necessarily know that and are using the default random.random() setup
for security purposes without realizing how wrong that is.

There is already a warning in the docs for the random module that it's not
suitable for security, but -- as the meme goes -- nobody reads the docs.

Theo then went into technicalities that went straight over my head,
concluding with a strongly worded recommendation of the OpenBSD version of
arc4random() (which IIUC is based on something called "chacha", not on
"RC4" despite that being in the name). He says it is very fast (but I don't
know what that means).

I've invited Theo to join this list but he's too busy. The two core Python
experts on the random module have given me opinions suggesting that there's
not much wrong with MT, so here I am. Who is right? What should we do? Is
there anything we need to do?

-- 
--Guido van Rossum (python.org/~guido)

From skrah at bytereef.org  Wed Sep  9 18:35:46 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 16:35:46 +0000 (UTC)
Subject: [Python-ideas] One way to do format and print
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
Message-ID: <loom.20150909T182334-673@post.gmane.org>

Sven R. Kunze <srkunze at ...> writes:
> I still don't understand what's wrong with deprecating %, but okay. I 
> think f-strings will push {} to wide-range adoption.

Then it will probably be hard to explain, so I'll be direct:

  1) Many Python users are fed up with churn and don't want to do yet
     another rewrite of their applications (just after migrating to
     3.x).

  2) Despite many years of officially preferring {}-formatting in
     the docs (and on Stackoverflow), people *still* use %-formatting.
     This should be a clue that they actually like it.

  3) %-formatting often has better performance and is often
     easier to read.

  4) Yes, in other cases {}-formatting is easier to read. So choose
     whatever is best.



Stefan Krah


From donald at stufft.io  Wed Sep  9 18:53:33 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 12:53:33 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
Message-ID: <etPan.55f0640d.16a132a6.31bc@Draupnir.home>

On September 9, 2015 at 12:36:16 PM, Guido van Rossum (guido at python.org) wrote:
>  
> I've invited Theo to join this list but he's too busy. The two core Python
> experts on the random module have given me opinions suggesting that there's
> not much wrong with MT, so here I am. Who is right? What should we do? Is
> there anything we need to do?
>  

Everyone is right :)

MT is a fine algorithm for random numbers when you don't need them to be
cryptographically safe; it is a disastrous algorithm if you do need them to be
safe. As long as you only use MT (and the default ``random`` implementation)
for things where it doesn't matter that the numbers you get aren't quite random
(i.e. they are actually predictable), and you use os.urandom/random.SystemRandom
for everything where you need actual randomness, everything is fine.

The problem boils down to whether people are going to accidentally use the
default random module when they really should use os.urandom or
random.SystemRandom. It is my opinion (and I believe Theo's) that they are
going to use the MT-backed random functions in random.py when they shouldn't.
However, I don't have a great solution for what we should do about it.

One option is to add a new random.FastInsecureRandom class and switch the
"bare" random functions in that module over to using random.SystemRandom by
default. Then if people want to opt into a faster random that isn't
cryptographically secure, they can use that class. This would
essentially invert today's relationship, where the module defaults to insecure
and you have to opt in to secure.
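
To make the idea concrete, here is a rough sketch of the inverted
relationship. Only the name FastInsecureRandom comes from the paragraph
above; the helper names random_value and randrange_value are hypothetical
and stand in for the module-level functions. This illustrates the proposal
and is not an actual change to random.py:

```python
import random

# Hypothetical alias: the existing Mersenne Twister generator, opt-in only.
FastInsecureRandom = random.Random

# Hypothetical secure default, backed by os.urandom() via SystemRandom.
_secure_default = random.SystemRandom()

def random_value():
    """Return a float in [0.0, 1.0) drawn from the OS entropy source."""
    return _secure_default.random()

def randrange_value(stop):
    """Return an int in range(stop) drawn from the OS entropy source."""
    return _secure_default.randrange(stop)

# Opting in to the fast, seedable, non-cryptographic generator:
fast = FastInsecureRandom(12345)
```

Code that needs reproducible streams (simulations, tests) would have to
switch to the opt-in class explicitly under such a scheme.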

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From ron3200 at gmail.com  Wed Sep  9 18:59:38 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Wed, 9 Sep 2015 11:59:38 -0500
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <87d1xs1fof.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <msmbko$ful$1@ger.gmane.org>
 <8635FF8B-2C17-4016-BDC0-BF5D775C9F0C@yahoo.com>
 <CAHVvXxR-STALH6-RL5pdvyra+hY8MPRQA2c6qGp1qjS74NV_eA@mail.gmail.com>
 <55EEEEBB.4080203@biologie.uni-freiburg.de>
 <87lhcg27ww.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAHVvXxThCf2G_=0qjHhJF9MPWd2Hd4g=TP4=j1dC7Y7OszK5ig@mail.gmail.com>
 <87d1xs1fof.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <mspohr$lf3$1@ger.gmane.org>

On 09/08/2015 09:37 PM, Stephen J. Turnbull wrote:
> Oscar Benjamin writes:
>
>   > ATM the colon separates the part of the format element that is
>   > interpreted by the format method to find the formatted object from the
>   > part that is passed to the __format__ method of the formatted object.
>   > Perhaps an additional colon could be used to separate the separator
>   > for when the formatted object is an iterable so that
>   >
>   >      'foo {name:<fmt>:<sep>} bar'.format(name=<expr>)
>
> I thought about a colon, but that loses if the objects are times.  I
> guess that kills '/' and '-', too, since the objects might be dates.
> Of course there may be a tricky way to use these that I haven't
> thought of, or they could be escaped for use in <fmt>.

This seems to me to need a nested format spec.  An outer one to format 
the whole list, and an inner one to format each item.

     f"foo {', '.join(f'{x:inner_spec}' for x in iter):outer_spec}"



Actually this is how I'd rather write it.

      "foo " + l.fmap(inner_spec).join(', ').fstr(outer_spec)

But sequences don't have the methods to write it that way.


 >>> l = range(10)
 >>> "foo" + format(','.join(map(lambda x: format(x, '>5'), l)), '>50')
'foo    0,    1,    2,    3,    4,    5,    6,    7,    8,    9'

It took me a few times to get that right.
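
The nested formatting above can be wrapped in a small helper so the inner
and outer specs are named explicitly. This is only a sketch; format_nested
and its parameter names are hypothetical and not part of any proposal in
this thread:

```python
def format_nested(items, inner_spec, sep=', ', outer_spec=''):
    # Inner step: format every element with inner_spec and join with sep.
    joined = sep.join(format(x, inner_spec) for x in items)
    # Outer step: format the joined string as a whole with outer_spec.
    return format(joined, outer_spec)

line = 'foo' + format_nested(range(10), '>5', sep=',', outer_spec='>50')
```

Note that in this example the joined string is already longer than 50
characters, so the outer '>50' spec adds no padding.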


Cheers,
     Ron




From guido at python.org  Wed Sep  9 19:02:28 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Sep 2015 10:02:28 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
Message-ID: <CAP7+vJL0mEpHfKaC-2BvX-iF8OeqOO=CwUVWJg4LydEdTzfXbg@mail.gmail.com>

I'm just going to forward further missives by Theo.

---------- Forwarded message ----------
From: Theo de Raadt
Date: Wed, Sep 9, 2015 at 9:59 AM
Subject: Re: getentropy, getrandom, arc4random()
To: guido at python.org


> Thanks. And one last thing: unlike Go and Swift, Python has no significant
> corporate resources behind it -- it's mostly volunteers begging their
> employers to let them work on Python for a few hours per week.

i understand because I find myself in the same situation.

however i think you overstate the difficulty involved.

high-availability random is kind of a new issue.

so final advice from me; feel free to forward as you like.

i think arc4random would be a better API to call on the back side than
getentropy/getrandom.

arc4random can seed initialize with a single getentropy/getrandom call
at startup.  that is done automatically.  you can then use
arc4random's results to initialize the MT.  in a system call trace,
this will show up as one getentropy/getrandom at process startup,
which gets both subsystems going.  really cheap.
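
In Python terms, the seed-once pattern described above might look like the
following sketch. It uses os.urandom() as a stand-in for
getentropy/arc4random (an assumption, since Python does not expose
arc4random), and the 32-byte seed size and the name make_seeded_mt are
arbitrary choices for illustration:

```python
import os
import random

def make_seeded_mt(seed_bytes=32):
    # One read from the OS entropy source at startup ...
    seed = int.from_bytes(os.urandom(seed_bytes), 'big')
    # ... then all further numbers come from the userland Mersenne Twister.
    return random.Random(seed)

rng = make_seeded_mt()
samples = [rng.random() for _ in range(3)]
```

This keeps the entropy reads to one per generator, but the output is still
MT output and therefore still not suitable for security purposes.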

in the case of longer run times, the userland arc4random PRNG folding
reduces the system calls required.  this helps older kernels with
slower entropy creation, taking pressure off their subsystem.  driving
everyone towards this one API which is so high performance is the
right goal.

chacha arc4random is really fast.

if you were to create such an API in python, maybe this is how it will
go:

say it becomes arc4random in the back end.  i am unsure what advice to
give you regarding a python API name.  in swift, they chose to use the
same prefix "arc4random" (id = arc4random(), id = arc4random_uniform(1..n));
it is a little bit different than the C API.  google has tended to choose
other prefixes.   we admit the name is a bit strange, but we can't touch
the previous attempts like drand48....

I do suggest you have the _uniform and _buf versions.  Maybe apple
chose to stick to arc4random as a name simply because search engines
tend to give above average advice for this search string?

so arc4random is natively available in freebsd, macos, solaris, and
other systems like android libc (bionic).  some systems lack it:
win32, glibc, hpux, aix, so we wrote replacements for libressl:

https://github.com/libressl-portable/openbsd/tree/master/src/lib/libcrypto/crypto
https://github.com/libressl-portable/portable/tree/master/crypto/compat

the first is the base openbsd tree where we maintain/develop this code
for other systems, the 2nd part is scaffold in libressl that makes this
available to others.

it contains arc4random for those systems, and supplies getentropy()
stubs for challenged systems.

we'll admit we haven't got solutions for every system known to man.
we are trying to handle fork issues, and systems with very bad
entropy feeding.

that's free code.  the heavy lifting is done, and we'll keep maintaining
that until the end of days.  i hope it helps.



-- 
--Guido van Rossum (python.org/~guido)

From tim.peters at gmail.com  Wed Sep  9 19:10:38 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 12:10:38 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
Message-ID: <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>

[Guido]
> ...
> The two core Python experts on the random module
> have given me opinions suggesting that there's not
> much wrong with MT, so here I am.

There is nothing _right_ about MT in a crypto context - it's entirely
unsuitable for any such purpose, and always was.  Just to be clear
about that ;-)  But it's an excellent generator for almost all other
purposes.

So the real question is:  whose use cases do you want to cater to by default?

If you answer "crypto", then realize the Python generator will have to
change every time the crypto community changes its mind about what's
_currently_ "good enough".  There's a long history of that already.

Indeed, there are already numerous "chacha" variants.  For a brief
overview, scroll down to the ChaCha20 section of this exceptionally
readable page listing pros and cons of various generators:

    http://www.pcg-random.org/other-rngs.html

There are no answers to vital pragmatic questions (like "is it
possible to supply a seed to get reproducible results?") without
specifying whose implementation of which chacha variant you're asking
about.
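
For the two generators already in the standard library, the answer to the
seeding question is easy to check. The sketch below only covers
random.Random and random.SystemRandom; it says nothing about any chacha
implementation:

```python
import random

mt_a = random.Random(42)
mt_b = random.Random(42)
# Same seed, same stream: the Mersenne Twister is reproducible.
mt_same = [mt_a.random() for _ in range(5)] == [mt_b.random() for _ in range(5)]

# SystemRandom draws from os.urandom(); its seed() method is a no-op,
# so "seeding" two instances identically does not align their output.
sys_a = random.SystemRandom()
sys_b = random.SystemRandom()
sys_a.seed(42)
sys_b.seed(42)
sys_same = [sys_a.random() for _ in range(5)] == [sys_b.random() for _ in range(5)]
```

Here mt_same is True, while sys_same is False except with negligible
probability.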

I've always thought Python should be a follower rather than a leader
in this specific area.  For example, I didn't push for the Twister
before it was well on its way to becoming a de facto standard.

Anyway, it's all moot until someone supplies a patch - and that sure
ain't gonna be me ;-)


On Wed, Sep 9, 2015 at 11:35 AM, Guido van Rossum <guido at python.org> wrote:
> I've received several long emails from Theo de Raadt (OpenBSD founder) about
> Python's default random number generator. This is the random module, and it
> defaults to a Mersenne Twister (MT) seeded by 2500 bytes of entropy taken
> from os.urandom().
>
> Theo's worry is that while the starting seed is fine, MT is not good when
> random numbers are used for crypto and other security purposes. I've
> countered that it's not meant for that (you should use random.SystemRandom()
> or os.urandom() for that) but he counters that people don't necessarily know
> that and are using the default random.random() setup for security purposes
> without realizing how wrong that is.
>
> There is already a warning in the docs for the random module that it's not
> suitable for security, but -- as the meme goes -- nobody reads the docs.
>
> Theo then went into technicalities that went straight over my head,
> concluding with a strongly worded recommendation of the OpenBSD version of
> arc4random() (which IIUC is based on something called "chacha", not on "RC4"
> despite that being in the name). He says it is very fast (but I don't know
> what that means).
>
> I've invited Theo to join this list but he's too busy. The two core Python
> experts on the random module have given me opinions suggesting that there's
> not much wrong with MT, so here I am. Who is right? What should we do? Is
> there anything we need to do?
>
> --
> --Guido van Rossum (python.org/~guido)
>

From Stephan.Sahm at gmx.de  Wed Sep  9 19:10:18 2015
From: Stephan.Sahm at gmx.de (Stephan Sahm)
Date: Wed, 9 Sep 2015 19:10:18 +0200
Subject: [Python-ideas] BUG in standard while statement
Message-ID: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>

Dear all

I found a BUG in the standard while statement, which appears both in python
2.7 and python 3.4 on my system.

It usually won't appear because I only stumbled upon it after trying to
implement a nice repeat structure. Look:
```
class repeat(object):
    def __init__(self, n):
        self.n = n

    def __bool__(self):
        self.n -= 1
        return self.n >= 0

    __nonzero__=__bool__

a = repeat(2)
```
the meaning of the above is that bool(a) returns True 2-times, and after
that always False.

Now executing
```
while a:
    print('foo')
```
will in fact print 'foo' two times. HOWEVER ;-) ....
```
while repeat(2):
    print('foo')
```
will go on and go on, printing 'foo' until I kill it.

Please comment, explain or recommend this further if you also think that
both while statements should behave identically.

hoping for responses,
best,
Stephan

From joejev at gmail.com  Wed Sep  9 19:15:04 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Wed, 9 Sep 2015 13:15:04 -0400
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
Message-ID: <CAHGq92VPZ8ELrak8wJ-Wj2P+K61Go+zJ6uc9U-aAU8nkzqnwcQ@mail.gmail.com>

This appears to be working as intended. The condition expression of the
while statement is evaluated each time the condition is checked. In the
first case, you create a single instance of repeat and then call bool on
that same instance with each iteration of the loop. In the second case, you
are constructing a _new_ repeat instance each time. Think about the
difference between:

while should_stop():
    ...

and:

a = should_stop()
while a:
    ...

One would expect should_stop to be called each time in the first case; but,
in the second case it is only called once.
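
The difference can be made visible by counting how often the condition is
evaluated. In this sketch, should_continue and the counter dictionary are
hypothetical names used only for illustration:

```python
counter = {'calls': 0}

def should_continue():
    # Record every evaluation of the loop condition.
    counter['calls'] += 1
    return counter['calls'] <= 3

while should_continue():    # expression re-evaluated before every iteration
    pass
calls_in_loop = counter['calls']    # three True results plus the final False

counter['calls'] = 0
flag = should_continue()    # evaluated exactly once, result stored in flag
calls_once = counter['calls']
```

After this runs, calls_in_loop is 4 and calls_once is 1; a loop on flag
would never call the function again.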

With all that said, I think you want to use the __iter__ and __next__
protocols to implement this in a more supported way.
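
A sketch of what such an iterator-based repeat might look like. This is a
hypothetical rewrite of the class from the original post; for real code,
`for _ in range(2)` or itertools.repeat already cover the same need:

```python
class repeat(object):
    """Iterator that yields None exactly n times."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return None

    next = __next__  # Python 2 spelling of the same method

printed = []
for _ in repeat(2):     # repeat(2) is constructed once, then iterated
    printed.append('foo')
```

Because the for statement evaluates repeat(2) only once and then calls
__next__ on that single object, the loop body runs exactly two times.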

On Wed, Sep 9, 2015 at 1:10 PM, Stephan Sahm <Stephan.Sahm at gmx.de> wrote:

> Dear all
>
> I found a BUG in the standard while statement, which appears both in
> python 2.7 and python 3.4 on my system.
>
> It usually won't appear because I only stumbled upon it after trying to
> implement a nice repeat structure. Look:
> ```
> class repeat(object):
>     def __init__(self, n):
>         self.n = n
>
>     def __bool__(self):
>         self.n -= 1
>         return self.n >= 0
>
>     __nonzero__=__bool__
>
> a = repeat(2)
> ```
> the meaning of the above is that bool(a) returns True 2-times, and after
> that always False.
>
> Now executing
> ```
> while a:
>     print('foo')
> ```
> will in fact print 'foo' two times. HOWEVER ;-) ....
> ```
> while repeat(2):
>     print('foo')
> ```
> will go on and go on, printing 'foo' until I kill it.
>
> Please comment, explain or recommend this further if you also think that
> both while statements should behave identically.
>
> hoping for responses,
> best,
> Stephan

From storchaka at gmail.com  Wed Sep  9 19:18:39 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 9 Sep 2015 20:18:39 +0300
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
Message-ID: <mspplh$8p4$1@ger.gmane.org>

On 09.09.15 19:35, Guido van Rossum wrote:
> I've invited Theo to join this list but he's too busy. The two core
> Python experts on the random module have given me opinions suggesting
> that there's not much wrong with MT, so here I am. Who is right? What
> should we do? Is there anything we need to do?

Entropy is a limited and slowly recoverable resource (especially if there
is no network activity). If you consume it too quickly (for example in a
scientific simulation or in a game), it will not have time to recover,
and that will slow down not only your program, but all consumers of entropy.
Using random.SystemRandom by default looks dangerous. It is
unlikely that all existing programs will be rewritten to use
random.FastInsecureRandom.



From donald at stufft.io  Wed Sep  9 19:20:03 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 13:20:03 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
Message-ID: <etPan.55f06a43.137d4868.31bc@Draupnir.home>

On September 9, 2015 at 1:11:22 PM, Tim Peters (tim.peters at gmail.com) wrote:
> So the real question is: whose use cases do you want to cater to
> by default?
>
> If you answer "crypto", then realize the Python generator will
> have to change every time the crypto community changes its mind
> about what's _currently_ "good enough". There's a long history of
> that already.


This is not really true in the sense that Python would need to do anything if
the blessed generator changed. We'd use /dev/urandom, one of the syscalls that
do the same thing, or the CryptGenRandom API on Windows. Python should not have its
own userland CSPRNG. Then it's up to the platform to decide which generator it
provides.
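
As a rough sketch of what relying on the platform already looks like from
Python today (the variable names are made up for illustration; os.urandom
and random.SystemRandom are the existing stdlib entry points):

```python
import os
import random

# 16 bytes straight from the operating system's CSPRNG.
token = os.urandom(16)

# The familiar random.Random API, but backed by os.urandom
# instead of the Mersenne Twister.
sysrand = random.SystemRandom()
roll = sysrand.randint(1, 6)
```

Neither call implements a generator in Python itself; which algorithm sits
behind them is decided by the platform.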

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From Stephan.Sahm at gmx.de  Wed Sep  9 19:20:53 2015
From: Stephan.Sahm at gmx.de (Stephan Sahm)
Date: Wed, 9 Sep 2015 19:20:53 +0200
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CAHGq92VPZ8ELrak8wJ-Wj2P+K61Go+zJ6uc9U-aAU8nkzqnwcQ@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <CAHGq92VPZ8ELrak8wJ-Wj2P+K61Go+zJ6uc9U-aAU8nkzqnwcQ@mail.gmail.com>
Message-ID: <CAOs8ta04kzt+bCSZf2DpHOcRYYd6b_8NA2AkJhd62Edg5fuvyQ@mail.gmail.com>

It is so true!
Thanks for pointing that out. It makes sense to do it that way.
I have probably never used while for anything other than ``while 1: pass``
until now.

I admit, I am looking for an alternative to a for structure like
 ``for _ in range(10)`` -- I don't like the ``_`` ;-)
How can I use the iterator protocol to make a nice repeat syntax?


On 9 September 2015 at 19:15, Joseph Jevnik <joejev at gmail.com> wrote:

> This appears as intended. The body of the while condition is executed each
> time the condition is checked. In the first case, you are creating a single
> instance of repeat, and then calling bool on the expression with each
> iteration of the loop. With the second case, you are constructing a _new_
> repeat instance each time. Think about the difference between:
>
> while should_stop():
>     ...
>
> and:
> a = should_stop()
> while a:
>     ...
>
> One would expect should_stop to be called each time in the first case;
> but, in the second case it is only called once.
>
> With all that said, I think you want to use the __iter__ and __next__
> protocols to implement this in a more supported way.
>
> On Wed, Sep 9, 2015 at 1:10 PM, Stephan Sahm <Stephan.Sahm at gmx.de> wrote:
>
>> Dear all
>>
>> I found a BUG in the standard while statement, which appears both in
>> python 2.7 and python 3.4 on my system.
>>
>> It usually won't appear because I only stumbled upon it after trying to
>> implement a nice repeat structure. Look:
>> ```
>> class repeat(object):
>>     def __init__(self, n):
>>         self.n = n
>>
>>     def __bool__(self):
>>         self.n -= 1
>>         return self.n >= 0
>>
>>     __nonzero__=__bool__
>>
>> a = repeat(2)
>> ```
>> the meaning of the above is that bool(a) returns True 2-times, and after
>> that always False.
>>
>> Now executing
>> ```
>> while a:
>>     print('foo')
>> ```
>> will in fact print 'foo' two times. HOWEVER ;-) ....
>> ```
>> while repeat(2):
>>     print('foo')
>> ```
>> will go on and go on, printing 'foo' until I kill it.
>>
>> Please comment, explain or recommend this further if you also think that
>> both while statements should behave identically.
>>
>> hoping for responses,
>> best,
>> Stephan

From tim.peters at gmail.com  Wed Sep  9 19:28:29 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 12:28:29 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <etPan.55f06a43.137d4868.31bc@Draupnir.home>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
Message-ID: <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>

[Tim]
>> So the real question is: whose use cases do you want to cater to
>> by default?
>>
>> If you answer "crypto", then realize the Python generator will
>> have to change every time the crypto community changes its mind
>> about what's _currently_ "good enough". There's a long history of
>> that already.

[Donald Stufft <donald at stufft.io>]
> This is not really true in the sense that Python would need to do anything if
> the blessed generator changed.

I read Guido's message as specifically asking about Theo's "strongly
worded recommendation of [Python switching to] the OpenBSD version of
arc4random()" as its default generator. In which case, yes, when that
specific implementation falls out of favor, Python would need to
change.


> We'd use /dev/urandom, one of the syscalls that
> do the same thing, or the CryptGenRandom API on Windows. Python should not have its
> own userland CSPRNG.

I read Guido's message as asking whether Python should indeed do just that.

From dwblas at gmail.com  Wed Sep  9 19:30:38 2015
From: dwblas at gmail.com (David Blaschke)
Date: Wed, 9 Sep 2015 10:30:38 -0700
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
Message-ID: <CAFx9zof_Desc9UZnsuAPTCXz0L6D6ZhyP=1mHkhDRHT_yYSg_w@mail.gmail.com>

while repeat(2): creates a new repeat instance each time through the
loop and initializes the variable as 2 each time through the loop i.e.
repeat(2) returns a new, different instance each time.
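
One way to see this is to count how many instances get created. This is
only a sketch: the created counter and the break guard are additions for
the demonstration and are not part of the original class.

```python
class repeat(object):
    created = 0  # class-level count of instances built so far

    def __init__(self, n):
        type(self).created += 1
        self.n = n

    def __bool__(self):
        self.n -= 1
        return self.n >= 0

    __nonzero__ = __bool__  # Python 2 name for the same hook


passes = 0
while repeat(2):      # a brand-new repeat(2) on every check
    passes += 1
    if passes == 5:   # guard so that the demonstration terminates
        break
fresh_instances = repeat.created  # one instance per condition check

a = repeat(2)
single_passes = 0
while a:              # the same instance is tested on every check
    single_passes += 1
```

Without the guard the first loop never ends, since every check sees a
fresh object whose n starts at 2 again.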

On 9/9/15, Stephan Sahm <Stephan.Sahm at gmx.de> wrote:
> Dear all
>
> I found a BUG in the standard while statement, which appears both in python
> 2.7 and python 3.4 on my system.
>
> It usually won't appear because I only stumbled upon it after trying to
> implement a nice repeat structure. Look:
> ```
> class repeat(object):
>     def __init__(self, n):
>         self.n = n
>
>     def __bool__(self):
>         self.n -= 1
>         return self.n >= 0
>
>     __nonzero__=__bool__
>
> a = repeat(2)
> ```
> the meaning of the above is that bool(a) returns True 2-times, and after
> that always False.
>
> Now executing
> ```
> while a:
>     print('foo')
> ```
> will in fact print 'foo' two times. HOWEVER ;-) ....
> ```
> while repeat(2):
>     print('foo')
> ```
> will go on and go on, printing 'foo' until I kill it.
>
> Please comment, explain or recommend this further if you also think that
> both while statements should behave identically.
>
> hoping for responses,
> best,
> Stephan
>


-- 
With the simplicity of true nature, there shall be no desire.
Without desire, one's original nature will be at peace.
And the world will naturally be in accord with the right Way.  Tao Te Ching

From donald at stufft.io  Wed Sep  9 19:31:35 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 13:31:35 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <mspplh$8p4$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <mspplh$8p4$1@ger.gmane.org>
Message-ID: <etPan.55f06cf7.6f6b3311.31bc@Draupnir.home>

On September 9, 2015 at 1:19:34 PM, Serhiy Storchaka (storchaka at gmail.com) wrote:
> On 09.09.15 19:35, Guido van Rossum wrote:
> > I've invited Theo to join this list but he's too busy. The two core
> > Python experts on the random module have given me opinions suggesting
> > that there's not much wrong with MT, so here I am. Who is right? What
> > should we do? Is there anything we need to do?
>  
> Entropy -- limited and slowly recoverable resource (especially if there
> is no network activity). If you consume it too quickly (for example in a
> scientific simulation or in a game), it will not have time to recover,
> that will slow down not only your program, but all consumers of entropy.
> The use of random.SystemRandom by default looks dangerous. It is
> unlikely that all existing programs will be rewritten to use
> random.FastInsecureRandom.
>  

This isn't exactly true. Hardware entropy is limited and slowly recovering, which
is why no sane implementation uses it except to periodically reseed the
CSPRNG, which is typically based on ARC4 or ChaCha. The standard CSPRNGs that
most platforms use are fast enough for most people's use cases.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From guido at python.org  Wed Sep  9 19:41:59 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Sep 2015 10:41:59 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
Message-ID: <CAP7+vJKEib1NjzUuNG4aO7symPqeBYSKhAs6qpT05ymUexDtMQ@mail.gmail.com>

---------- Forwarded message ----------
From: Theo de Raadt
Date: Wed, Sep 9, 2015 at 10:36 AM
Subject: Re: getentropy, getrandom, arc4random()
To: guido at python.org


> Yet another thing. Where do you see that Go and Swift have secure random as
> a keyword? Searching for "golang random" gives the math/rand package as the
> first hit, which has a note reminding the reader to use crypto/rand for
> security work.

yes, well, look at the other phrase it uses...

    that produces a deterministic sequence of values each time a program is run

it documents itself as being decidedly non-random.  that documentation
change happened soon after this event:

    https://lwn.net/Articles/625506/

these days, the one people are using is found using "go secure random"

    https://golang.org/pkg/crypto/rand/

that opens /dev/urandom or uses the getrandom system call depending on
system.  it also has support for the windows entropy API.  it pulls
data into a large buffer, a cache.  then each subsequent call, it
consumes some, until it runs out, and has to do a fresh read.  it
appears to not clean the buffer behind itself, probably for
performance reasons, so the memory is left active.  (forward secrecy
violated)

i don't think they are doing the best they can...  i think they should
get forward secrecy and higher performance by having an in-process
chacha.  but you can sense the trend.

here's an example of the fallout..

https://github.com/golang/go/issues/9205

> For Swift it's much the same -- there's an arc4random() in
> the Darwin package but nothing in the core language.

that is what people are led to use.



-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Wed Sep  9 19:43:53 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 13:43:53 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
Message-ID: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>

On September 9, 2015 at 1:28:46 PM, Tim Peters (tim.peters at gmail.com) wrote:
> I read Guido's message as specifically asking about Theo's "strongly
> worded recommendation of [Python switching to] the OpenBSD version of
> arc4random()" as its default generator. In which case, yes, when that
> specific implementation falls out of favor, Python would need to
> change.

arc4random changes as the underlying implementation changes too; the name is a
historical accident really. arc4random no longer uses arc4, it uses chacha, and
when/if chacha needs to be replaced, arc4random will still be the name.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From guido at python.org  Wed Sep  9 19:43:56 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Sep 2015 10:43:56 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
Message-ID: <CAP7+vJK_xk0YKC6UjGdJVyNqoDvv=wn99_cOHtFptfGBUt-ziw@mail.gmail.com>

---------- Forwarded message ----------
From: Theo de Raadt
Date: Wed, Sep 9, 2015 at 10:42 AM
Subject: Re: getentropy, getrandom, arc4random()
To: guido at python.org


been speaking to a significant go person.

confirmed.

it takes data out of that buffer, and does not zero it behind itself.
obviously for performance reasons.

same type of thing happens with MT-style engines.  in practice, they
can be wound backwards.  a proper stream cipher cannot be turned
backwards.

however, that's just an academic observation.  or maybe it indicates
that well-financed groups can get it wrong too.

by the way, chacha arc4random can create random values faster than a
memcpy -- the computation of fresh output is faster than doing
gross-cost of "read" from memory (when cache dirtying is accounted for).




-- 
--Guido van Rossum (python.org/~guido)

From random832 at fastmail.us  Wed Sep  9 19:46:09 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 13:46:09 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mspplh$8p4$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <mspplh$8p4$1@ger.gmane.org>
Message-ID: <1441820769.2850642.379075929.5DABF6B4@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 13:18, Serhiy Storchaka wrote:
> Entropy -- limited and slowly recoverable resource (especially if there 
> is no network activity). If you consume it too quickly (for example in a 
> scientific simulation or in a game), it will not have time to recover, 
> that will slow down not only your program, but all consumers of entropy. 
> The use of random.SystemRandom by default looks dangerous. It is 
> unlikely that all existing programs will be rewritten to use 
> random.FastInsecureRandom.

http://www.2uo.de/myths-about-urandom/ should be required reading.

As far as I know, no-one is actually proposing the use of a method that
blocks when there's "not enough entropy", nor does arc4random itself
appear to do so.

From random832 at fastmail.us  Wed Sep  9 19:54:14 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 13:54:14 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
Message-ID: <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 13:43, Donald Stufft wrote:
> arc4random changes as the underlying implementation changes too, the name
> is a historical accident really. arc4random no longer uses arc4 it uses
> chacha, and when/if chacha needs to be replaced, arc4random will still be
> the name.

The issue is: if the decision is made not to have Python provide its own
RNG [presumably a forked copy of OpenBSD's current arc4random], what should
Python do on systems that do not provide a function named arc4random? Use
/dev/urandom (or CryptGenRandom) every time [more expensive, performs I/O]?
rand48? random? rand?

I don't see the issue with Python providing its own implementation. If
the state of the art changes, we can have another discussion then.

From skrah at bytereef.org  Wed Sep  9 20:00:59 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 18:00:59 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
Message-ID: <loom.20150909T193457-689@post.gmane.org>

Tim Peters <tim.peters at ...> writes:
> > We'd use /dev/urandom, one of the syscalls that
> > do the same thing, or the CryptGenRandom API on Windows. Python should not
> > have its own userland CSPRNG.
> 
> I read Guido's message as asking whether Python should indeed do just that.

>From Theo's forwarded mail I also got the impression that he wanted
us to use OpenBSD code to implement our own CSPRNG, use that for
the default functions in the random module and add new functions
for reproducible random numbers that use the MT.


My intuition is that if someone just uses a random() function
without checking if it's cryptographically secure then the
application will probably have other holes as well.  I mean,
for example no one is going to use C's rand() function for crypto.



Stefan Krah

From random832 at fastmail.us  Wed Sep  9 20:08:56 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 14:08:56 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T193457-689@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <loom.20150909T193457-689@post.gmane.org>
Message-ID: <1441822136.2856767.379097753.6E5AA6EA@webmail.messagingengine.com>


On Wed, Sep 9, 2015, at 14:00, Stefan Krah wrote:
> My intuition is that if someone just uses a random() function
> without checking if it's cryptographically secure then the
> application will probably have other holes as well.  I mean,
> for example no one is going to use C's rand() function for crypto.

Let's turn the question around - what's the _benefit_ of having a random
number generator available that _isn't_ cryptographically secure? One
possible argument is performance. If that's the issue - what are our
performance targets? How can they be measured? Another argument is that
some applications really do need deterministic seeding. Is there a
reason not to require them to be explicit about it?
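
To make "explicit" concrete, here is a small sketch using classes the
random module already provides (the seed value and the variable names are
arbitrary choices for illustration):

```python
import random

# Reproducible: two generators given the same explicit seed
# produce the same sequence.
sim_a = random.Random(12345)
sim_b = random.Random(12345)
seq_a = [sim_a.random() for _ in range(3)]
seq_b = [sim_b.random() for _ in range(3)]
same_sequence = (seq_a == seq_b)

# Not reproducible: SystemRandom draws from the OS and ignores seeding.
secure = random.SystemRandom()
value = secure.random()
```

An application that needs repeatable runs would construct its own
random.Random(seed) as above, whatever the module-level default ends up
being.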

From tim.peters at gmail.com  Wed Sep  9 20:16:29 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 13:16:29 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T193457-689@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <loom.20150909T193457-689@post.gmane.org>
Message-ID: <CAExdVN=etHSM9XJOHgwkVJK9nrVj=L27-0bX_6mR_+CREKLRZA@mail.gmail.com>

[Stefan Krah <skrah at bytereef.org>]
> From Theo's forwarded mail I also got the impression that he wanted
> us to use OpenBSD code to implement our own CSPRNG, use that for
> the default functions in the random module and add new functions
> for reproducible random numbers that use the MT.

I read it the same way on all counts.


> My intuition is that if someone just uses a random() function
> without checking if it's cryptographically secure then the
> application will probably have other holes as well.  I mean,
> for example no one is going to use C's rand() function for crypto.

Yes, if they're not checking the random() docs first, they're a total
crypto moron - in which case it's insane to believe they'll do
anything else related to crypto-strength requirements right either.

It's hard to make something idiot-proof even if your target audience
is bona fide crypto experts ;-)

From skrah at bytereef.org  Wed Sep  9 20:17:27 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 18:17:27 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <loom.20150909T193457-689@post.gmane.org>
 <1441822136.2856767.379097753.6E5AA6EA@webmail.messagingengine.com>
Message-ID: <loom.20150909T201139-324@post.gmane.org>

 <random832 at ...> writes:
> On Wed, Sep 9, 2015, at 14:00, Stefan Krah wrote:
> > My intuition is that if someone just uses a random() function
> > without checking if it's cryptographically secure then the
> > application will probably have other holes as well.  I mean,
> > for example no one is going to use C's rand() function for crypto.
> 
> Let's turn the question around - what's the _benefit_ of having a random
> number generator available that _isn't_ cryptographically secure? One
> possible argument is performance. If that's the issue - what are our
> performance targets? How can they be measured? Another argument is that
> some applications really do need deterministic seeding. Is there a
> reason not to require them to be explicit about it?

As you say, performance:

  http://www.pcg-random.org/rng-performance.html
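
For a rough local comparison, something like the following could be used.
It is only a sketch: the call count N is an arbitrary choice, the results
depend heavily on platform and Python version, and no particular numbers
are implied.

```python
import random
import timeit

mt = random.Random()             # Mersenne Twister, the current default
sysrand = random.SystemRandom()  # backed by os.urandom

N = 10000  # arbitrary number of calls, chosen only for illustration
mt_seconds = timeit.timeit(mt.random, number=N)
sys_seconds = timeit.timeit(sysrand.random, number=N)

print("MT:           %.6f s for %d calls" % (mt_seconds, N))
print("SystemRandom: %.6f s for %d calls" % (sys_seconds, N))
```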


Random number generation is a very broad field. I'm not a specialist,
so I just entered "Mersenne Twister" into an academic search engine
and got many results, but none for arc4random.

It's an interesting question you ask. I'd have to do a lot of reading
first to get an overview.


Stefan Krah

From tim.peters at gmail.com  Wed Sep  9 20:31:49 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 13:31:49 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
Message-ID: <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>

[random832 at fastmail.us]
> I don't see the issue with Python providing its own implementation. If
> the state of the art changes,

It will.  Over & over again.  That's why it's called "art" ;-)

> we can have another discussion then.

Also over & over again.  If you volunteer to own responsibility for
updating all versions of Python each time it changes (in a crypto
context, an advance in the state of the art implies the prior state
becomes "a bug"), and post a performance bond sufficient to pay
someone else to do it if you vanish, then a major pragmatic objection
would go away ;-)

From skrah at bytereef.org  Wed Sep  9 20:43:05 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 18:43:05 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
Message-ID: <loom.20150909T203931-585@post.gmane.org>

Tim Peters <tim.peters at ...> writes:
> > we can have another discussion then.
> 
> Also over & over again.  If you volunteer to own responsibility for
> updating all versions of Python each time it changes (in a crypto
> context, an advance in the state of the art implies the prior state
> becomes "a bug"), and post a performance bond sufficient to pay
> someone else to do it if you vanish, then a major pragmatic objection
> would go away 

The OpenBSD devs could also publish arc4random as a library that
works everywhere (like OpenSSH). That would be a nicer solution
for everyone (except for the devs perhaps :).


Stefan Krah

From tim.peters at gmail.com  Wed Sep  9 20:47:40 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 13:47:40 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T203931-585@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <loom.20150909T203931-585@post.gmane.org>
Message-ID: <CAExdVNn2BRJBzDt1Y_7z36PKvzxDeKUOw7hw=kNS0aedfNC7Kg@mail.gmail.com>

[Stefan Krah <skrah at bytereef.org>]
> ...
> The OpenBSD devs could also publish arc4random as a library that
> works everywhere (like OpenSSH). That would be a nicer solution
> for everyone (except for the devs perhaps :).

Telling Python devs "hey, it will be as easy as dealing with OpenSSH
has been!" is indeed a good way to kill the idea at once ;-)

From srkunze at mail.de  Wed Sep  9 20:50:44 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 9 Sep 2015 20:50:44 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <loom.20150909T182334-673@post.gmane.org>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
 <loom.20150909T182334-673@post.gmane.org>
Message-ID: <55F07F84.9040707@mail.de>

Fair enough.

On 09.09.2015 18:35, Stefan Krah wrote:
> Sven R. Kunze <srkunze at ...> writes:
>> I still don't understand what's wrong with deprecating %, but okay. I
>> think f-strings will push {} to wide-range adoption.
> Then it will probably be hard to explain, so I'll be direct:
>
>    1) Many Python users are fed up with churn and don't want to do yet
>       another rewrite of their applications (just after migrating to
>       3.x).
>
>    2) Despite many years of officially preferring {}-formatting in
>       the docs (and on Stackoverflow), people *still* use %-formatting.
>       This should be a clue that they actually like it.
>
>    3) %-formatting often has better performance and is often
>       easier to read.
>
>    4) Yes, in other cases {}-formatting is easier to read. So choose
>       whatever is best.
>
>
>
> Stefan Krah


From random832 at fastmail.us  Wed Sep  9 20:55:01 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 14:55:01 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
Message-ID: <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 14:31, Tim Peters wrote:
> Also over & over again.  If you volunteer to own responsibility for
> updating all versions of Python each time it changes (in a crypto
> context, an advance in the state of the art implies the prior state
> becomes "a bug"), and post a performance bond sufficient to pay
> someone else to do it if you vanish, then a major pragmatic objection
> would go away ;-)

I don't see how "Changing Python's RNG implementation today to
arc4random as it exists now" necessarily implies "Making a commitment to
guarantee the cryptographic suitability of Python's RNG for all time".
Those are two separate things.

From random832 at fastmail.us  Wed Sep  9 20:56:13 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 14:56:13 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T201139-324@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <loom.20150909T193457-689@post.gmane.org>
 <1441822136.2856767.379097753.6E5AA6EA@webmail.messagingengine.com>
 <loom.20150909T201139-324@post.gmane.org>
Message-ID: <1441824973.2867674.379143865.0237C5FC@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 14:17, Stefan Krah wrote:
> Random number generation is a very broad field. I'm not a specialist,
> so I just entered "Mersenne Twister" into an academic search engine
> and got many results, but none for arc4random.

Try "Chacha". The "arc4random" name is a legacy of an older
implementation.

From tim.peters at gmail.com  Wed Sep  9 21:03:33 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 14:03:33 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
Message-ID: <CAExdVNn7hsNpfxc2OfmQv7UAe3NyE5jd+ppJBwsEJyPvF4BYzQ@mail.gmail.com>

[<random832 at fastmail.us>]
> I don't see how "Changing Python's RNG implementation today to
> arc4random as it exists now" necessarily implies "Making a commitment to
> guarantee the cryptographic suitability of Python's RNG for all time".
> Those are two separate things.

Disagree.  The _only_ point to switching today is "to guarantee the
cryptographic suitability of Python's RNG" today.  It misses the
intent of the switch entirely to give a "but tomorrow?  eh - that's a
different issue" dodge.

No, no rules of formal logic would be violated by separating the two -
it would be a violation of the only _sense_ in making a switch at all.
If you don't believe me, try asking Theo ;-)

From steve at pearwood.info  Wed Sep  9 21:07:57 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 10 Sep 2015 05:07:57 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
Message-ID: <20150909190757.GM19373@ando.pearwood.info>

On Wed, Sep 09, 2015 at 02:55:01PM -0400, random832 at fastmail.us wrote:
> On Wed, Sep 9, 2015, at 14:31, Tim Peters wrote:
> > Also over & over again.  If you volunteer to own responsibility for
> > updating all versions of Python each time it changes (in a crypto
> > context, an advance in the state of the art implies the prior state
> > becomes "a bug"), and post a performance bond sufficient to pay
> > someone else to do it if you vanish, then a major pragmatic objection
> > would go away ;-)
> 
> I don't see how "Changing Python's RNG implementation today to
> arc4random as it exists now" necessarily implies "Making a commitment to
> guarantee the cryptographic suitability of Python's RNG for all time".
> Those are two separate things.

Not really. Look at the subject line. It doesn't say "should we change 
from MT to arc4random?", it asks if the default random number generator 
should be secure. The only reason we are considering the change from MT 
to arc4random is to make the PRNG cryptographically secure. "Secure" is 
a moving target, what is secure today will not be secure tomorrow.

Yes, in principle, we could make the change once, then never again. But 
why bother? We don't gain anything from changing to arc4random if there 
is no promise to be secure into the future.

Question, aimed at anyone, not necessarily random832 -- one desirable 
property of PRNGs is that you can repeat a sequence of values if you 
re-seed with a known value. Does arc4random keep that property? I think 
that it is important that the default RNG be deterministic when given a 
known seed. (I'm happy for the default seed to be unpredictable.)
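
A minimal sketch of the property being asked about, using only the
existing stdlib classes (this says nothing about arc4random itself, which
the stdlib does not expose). It assumes the current behaviour: random.Random
is the seedable Mersenne Twister, and random.SystemRandom accepts a seed
argument but ignores it.

```python
import random

# Two Mersenne Twister generators seeded with the same known value
# replay the same sequence.
a = random.Random(12345)
b = random.Random(12345)
seq_a = [a.random() for _ in range(5)]
seq_b = [b.random() for _ in range(5)]
same_sequence = seq_a == seq_b

# SystemRandom reads from the OS entropy source; seed() is accepted but
# has no effect, so the "same seed" does not give the same sequence.
s1 = random.SystemRandom()
s2 = random.SystemRandom()
s1.seed(12345)
s2.seed(12345)
os_a = [s1.random() for _ in range(5)]
os_b = [s2.random() for _ in range(5)]
```

Here same_sequence is True, while os_a and os_b differ (barring an
astronomically unlikely coincidence).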


-- 
Steve

From srkunze at mail.de  Wed Sep  9 21:09:05 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 9 Sep 2015 21:09:05 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <etPan.55f0640d.16a132a6.31bc@Draupnir.home>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <etPan.55f0640d.16a132a6.31bc@Draupnir.home>
Message-ID: <55F083D1.7040401@mail.de>

On 09.09.2015 18:53, Donald Stufft wrote:
> This would
> essentially be inverting the relationship today, where it defaults to insecure
> and you have to opt in to secure.

I'm not an expert on this, but I agree with this assessment.

You can determine easily whether your program runs fast enough. If not, 
you can fix it.
You cannot determine easily whether something you made is 
cryptographically secure.

The default should be as secure as possible.


Best,
Sven

From encukou at gmail.com  Wed Sep  9 21:12:18 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 9 Sep 2015 21:12:18 +0200
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CAOs8ta04kzt+bCSZf2DpHOcRYYd6b_8NA2AkJhd62Edg5fuvyQ@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <CAHGq92VPZ8ELrak8wJ-Wj2P+K61Go+zJ6uc9U-aAU8nkzqnwcQ@mail.gmail.com>
 <CAOs8ta04kzt+bCSZf2DpHOcRYYd6b_8NA2AkJhd62Edg5fuvyQ@mail.gmail.com>
Message-ID: <CA+=+wqCnx8WkQtdJDgCuWNQ_-g_FB-SZySGbJPEDwfh48vi7mw@mail.gmail.com>

On Wed, Sep 9, 2015 at 7:20 PM, Stephan Sahm <Stephan.Sahm at gmx.de> wrote:
> It is so true!
> thanks for pointing that out. It makes sense to do it that way
> I probably never used while in different places than ``while 1: pass`` until
> now.
>
> I admit, I am looking for something alternative to a for structure like
> ``for _ in range(10)`` -- I don't like the ``_`` ;-)
> How can I use the iterator protocol to make a nice repeat syntax?

If you don't like the ``_``, you can use ``for iteration_index in range(10):``.

You always need to store the iteration number somewhere (the original
post has it in ``self.n``). Chances are you'll want to access it
later, when you debug your code. Hiding it in a class is just making
it harder to get. It's also making the whole thing less maintainable,
because other people, who are used to seeing ``for i in range(...)``,
would now need to understand your custom class and new idiom. (And as
always with maintainability, "other people" includes you in a few
years.)

Instead, just use the current syntax. Eventually it will start looking
nice to you.
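
To make the explanation quoted below concrete, here is a small sketch. The
Repeat class mirrors the one from the original post, with a `constructed`
counter added purely for illustration; the `break` exists only to stop the
demonstration, and itertools.repeat is shown as one existing way to get an
iterator-based repeat without a custom class.

```python
import itertools


class Repeat(object):
    """Truthy n times, then falsy (same idea as the class in the quoted post)."""

    constructed = 0  # illustrative counter: how many instances were created

    def __init__(self, n):
        Repeat.constructed += 1
        self.n = n

    def __bool__(self):
        self.n -= 1
        return self.n >= 0

    __nonzero__ = __bool__  # Python 2 spelling


# Case 1: one instance is tested repeatedly, so the countdown reaches zero.
a = Repeat(2)
single_instance_iterations = 0
while a:
    single_instance_iterations += 1

# Case 2: the condition expression builds a fresh Repeat(2) on every check,
# so it is always truthy; the break is only here to end the demo.
Repeat.constructed = 0
fresh_instance_iterations = 0
while Repeat(2):
    fresh_instance_iterations += 1
    if fresh_instance_iterations == 5:
        break

# An iterator-based spelling that needs no custom class.
iterator_iterations = sum(1 for _ in itertools.repeat(None, 3))
```

The first loop runs twice; the second would never stop on its own, and one
new instance is constructed per condition check.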


> On 9 September 2015 at 19:15, Joseph Jevnik <joejev at gmail.com> wrote:
>>
>> This appears as intended. The body of the while condition is executed each
>> time the condition is checked. In the first case, you are creating a single
>> instance of repeat, and then calling bool on the expression with each
>> iteration of the loop. With the second case, you are constructing a _new_
>> repeat instance each time. Think about the difference between:
>>
>> while should_stop():
>>     ...
>>
>> and:
>> a = should_stop()
>> while a:
>>     ...
>>
>> One would expect should_stop to be called each time in the first case;
>> but, in the second case it is only called once.
>>
>> With all that said, I think you want to use the __iter__ and __next__
>> protocols to implement this in a more supported way.
>>
>> On Wed, Sep 9, 2015 at 1:10 PM, Stephan Sahm <Stephan.Sahm at gmx.de> wrote:
>>>
>>> Dear all
>>>
>>> I found a BUG in the standard while statement, which appears both in
>>> python 2.7 and python 3.4 on my system.
>>>
>>> It usually won't appear because I only stumbled upon it after trying to
>>> implement a nice repeat structure. Look:
>>> ```
>>> class repeat(object):
>>>     def __init__(self, n):
>>>         self.n = n
>>>
>>>     def __bool__(self):
>>>         self.n -= 1
>>>         return self.n >= 0
>>>
>>>     __nonzero__=__bool__
>>>
>>> a = repeat(2)
>>> ```
>>> the meaning of the above is that bool(a) returns True 2-times, and after
>>> that always False.
>>>
>>> Now executing
>>> ```
>>> while a:
>>>     print('foo')
>>> ```
>>> will in fact print 'foo' two times. HOWEVER ;-) ....
>>> ```
>>> while repeat(2):
>>>     print('foo')
>>> ```
>>> will go on and go on, printing 'foo' until I kill it.
>>>
>>> Please comment, explain or recommend this further if you also think that
>>> both while statements should behave identically.
>>>
>>> hoping for responses,
>>> best,
>>> Stephan

From skrah at bytereef.org  Wed Sep  9 21:13:29 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 19:13:29 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <loom.20150909T193457-689@post.gmane.org>
 <1441822136.2856767.379097753.6E5AA6EA@webmail.messagingengine.com>
 <loom.20150909T201139-324@post.gmane.org>
 <1441824973.2867674.379143865.0237C5FC@webmail.messagingengine.com>
Message-ID: <loom.20150909T210412-428@post.gmane.org>

 <random832 at ...> writes:
> On Wed, Sep 9, 2015, at 14:17, Stefan Krah wrote:
> > Random number generation is a very broad field. I'm not a specialist,
> > so I just entered "Mersenne Twister" into an academic search engine
> > and got many results, but none for arc4random.
> 
> Try "Chacha". The "arc4random" name is a legacy of an older
> implementation.

I know chacha (and most of djb's other works).  I thought we
were talking about the suitability of cryptographically secure
RNGs for traditional scientific applications, in particular
whether there are *other* reasons apart from performance not to
use them.



Stefan Krah



From tim.peters at gmail.com  Wed Sep  9 21:20:52 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 14:20:52 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150909190757.GM19373@ando.pearwood.info>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
Message-ID: <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>

[Steven D'Aprano <steve at pearwood.info>]
> ...
> Question, aimed at anyone, not necessarily random832 -- one desirable
> property of PRNGs is that you can repeat a sequence of values if you
> re-seed with a known value. Does arc4random keep that property? I think
> that it is important that the default RNG be deterministic when given a
> known seed. (I'm happy for the default seed to be unpredictable.)

"arc4random" is ill-defined.  From what I gathered, it's the case that
"pure chacha" variants can be seeded to get a reproducible sequence
"in theory", but that not all implementations support that.

Specifically, the OpenBSD implementation being "sold" here does not and cannot:

    http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/arc4random.3

"Does not" because there is no API to either request or set a seed.

"Cannot" because:

    The subsystem is re-seeded from the kernel random number
    subsystem using getentropy(2) on a regular basis

Other variants skip that last part.

From skrah at bytereef.org  Wed Sep  9 21:33:16 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 19:33:16 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
Message-ID: <loom.20150909T213030-270@post.gmane.org>

Steven D'Aprano <steve at ...> writes:
> Question, aimed at anyone, not necessarily random832 -- one desirable 
> property of PRNGs is that you can repeat a sequence of values if you 
> re-seed with a known value. Does arc4random keep that property? I think 
> that it is important that the default RNG be deterministic when given a 
> known seed. (I'm happy for the default seed to be unpredictable.)

I think the removal of MT wasn't proposed (at least not by Theo).
So we'd still have deterministic sequences in addition to
arc4random.



Stefan Krah





From p.f.moore at gmail.com  Wed Sep  9 22:04:32 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 9 Sep 2015 21:04:32 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T213030-270@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
Message-ID: <CACac1F-AOgb-oZYK47rMxcj1RMVdJ4QuyX+KNG5WXUy=XkOe6g@mail.gmail.com>

On 9 September 2015 at 20:33, Stefan Krah <skrah at bytereef.org> wrote:
> Steven D'Aprano <steve at ...> writes:
>> Question, aimed at anyone, not necessarily random832 -- one desirable
>> property of PRNGs is that you can repeat a sequence of values if you
>> re-seed with a known value. Does arc4random keep that property? I think
>> that it is important that the default RNG be deterministic when given a
>> known seed. (I'm happy for the default seed to be unpredictable.)
>
> I think the removal of MT wasn't proposed (at least not by Theo).
> So we'd still have deterministic sequences in addition to
> arc4random.

I use a RNG quite often. Typically for simulations (games, die rolls,
card draws, that sort of thing). Sometimes for many millions of
results (Monte Carlo simulations, for example). I would always just
use the default RNG supplied by the stdlib - I view my use case as
"normal use" and wouldn't go looking for specialist answers. I'd
occasionally look for reproducibility, although it's not often a key
requirement for me (I would expect it as an option from the stdlib
RNG, though).

Anyone doing crypto who doesn't fully appreciate that it's a
specialist subject and that they should be looking for a dedicated RNG
suitable for crypto, is probably going to make a lot of *other*
mistakes as well. Leading them away from this one probably isn't going
to be enough to make their code something I'd want to use...

So as a user, I'm against making a change like this. Let the default
RNG in the stdlib be something suitable for simulations, "pick a
random question", and similar situations, and provide a crypto-capable
RNG for those who need it, but not as the default. (I am, of course,
assuming that it's not possible to have a single RNG that is the best
option for both uses - nobody on this thread seems to have suggested
that I'm wrong in this assumption).

Paul

From skrah at bytereef.org  Wed Sep  9 22:07:32 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 20:07:32 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <loom.20150909T203931-585@post.gmane.org>
Message-ID: <loom.20150909T220217-635@post.gmane.org>

Stefan Krah <skrah at ...> writes:
> The OpenBSD devs could also publish arc4random as a library that
> works everywhere (like OpenSSH). That would be a nicer solution
> for everyone (except for the devs perhaps :).

And naturally they're already doing that. I missed this in Theo's first
mail:

https://github.com/libressl-portable/openbsd/tree/master/src/lib/libcrypto/crypto
https://github.com/libressl-portable/portable/tree/master/crypto/compat


So I guess the whole thing also depends on how popular these
libraries will be.


Stefan Krah





From random832 at fastmail.us  Wed Sep  9 22:09:21 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 16:09:21 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150909190757.GM19373@ando.pearwood.info>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
Message-ID: <1441829361.2883366.379212985.164412ED@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 15:07, Steven D'Aprano wrote:
> Not really. Look at the subject line. It doesn't say "should we change 
> from MT to arc4random?", it asks if the default random number generator 
> should be secure. The only reason we are considering the change from MT 
> to arc4random is to make the PRNG cryptographically secure. "Secure" is 
> a moving target, what is secure today will not be secure tomorrow.

Right, but we are discussing making it secure today.

From guido at python.org  Wed Sep  9 22:17:14 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Sep 2015 13:17:14 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
Message-ID: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>

Jukka wrote up a proposal for structural subtyping. It's pretty good.
Please discuss.

https://github.com/ambv/typehinting/issues/11#issuecomment-138133867

-- 
--Guido van Rossum (python.org/~guido)

From random832 at fastmail.us  Wed Sep  9 22:38:02 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 16:38:02 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
Message-ID: <1441831082.2889011.379232937.1FB72573@webmail.messagingengine.com>

The commit message changing libc's random functions to use arc4random is
as follows:

> Change rand(), random(), drand48(), lrand48(), mrand48(), and srand48()
> to returning strong random by default, source from arc4random(3).
> Parameters to the seeding functions are ignored, and the subsystems remain
> in strong random mode.  If you wish the standardized deterministic mode,
> call srand_deterministic(), srandom_deterministic(), srand48_deterministic(),
> seed48_deterministic() or lcong48_deterministic() instead.
> The re-entrant functions rand_r(), erand48(), nrand48(), jrand48() are
> unaffected by this change and remain in deterministic mode (for now).
> 
> Verified as a good roadmap forward by auditing 8800 pieces of software.
> Roughly 60 pieces of software will need adaptation to request the
> deterministic mode.
> 
> Violates POSIX and C89, which violate best practice in this century.
> ok guenther tedu millert

Perhaps someone could ask them for information about that audit, and how
many / what of those pieces of software were actually using these
functions in ways which made them insecure, but whose security would be
notably improved by a better random implementation (I suspect that the
main thrust of the audit, though, was on finding which ones would be
broken by taking away the default deterministic seeding).

That could tell us how typical it is for people to ignorantly use
default random functions for security-critical code with no other flaws.

From encukou at gmail.com  Wed Sep  9 22:37:55 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 9 Sep 2015 22:37:55 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T213030-270@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
Message-ID: <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>

On Wed, Sep 9, 2015 at 9:33 PM, Stefan Krah <skrah at bytereef.org> wrote:
> Steven D'Aprano <steve at ...> writes:
>> Question, aimed at anyone, not necessarily random832 -- one desirable
>> property of PRNGs is that you can repeat a sequence of values if you
>> re-seed with a known value. Does arc4random keep that property? I think
>> that it is important that the default RNG be deterministic when given a
>> known seed. (I'm happy for the default seed to be unpredictable.)

The OpenBSD implementation does not allow any kind of reproducible results.
Reading http://www.pcg-random.org/other-rngs.html, I see that what
arc4random is not built for is statistical quality and k-dimensional
equidistribution, which are also properties you might not need for
crypto, but do want for simulations.
So there are two quite different use cases (plus a lot of grey area
where any solution is okay).

The current situation may be surprising to people who didn't read the
docs. Switching away from MT might be a disservice to users that did
read and understand them.

From njs at pobox.com  Wed Sep  9 23:02:19 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 14:02:19 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>
Message-ID: <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>

On Sep 9, 2015 12:21 PM, "Tim Peters" <tim.peters at gmail.com> wrote:
>
> [Steven D'Aprano <steve at pearwood.info>]
> > ...
> > Question, aimed at anyone, not necessarily random832 -- one desirable
> > property of PRNGs is that you can repeat a sequence of values if you
> > re-seed with a known value. Does arc4random keep that property? I think
> > that it is important that the default RNG be deterministic when given a
> > known seed. (I'm happy for the default seed to be unpredictable.)
>
> "arc4random" is ill-defined.  From what I gathered, it's the case that
> "pure chacha" variants can be seeded to get a reproducible sequence
> "in theory", but that not all implementations support that.
>
> Specifically, the OpenBSD implementation being "sold" here does not and cannot:
>
>     http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/arc4random.3
>
> "Does not" because there is no API to either request or set a seed.
>
> "Cannot" because:
>
>     The subsystem is re-seeded from the kernel random number
>     subsystem using getentropy(2) on a regular basis

Another reason why it is important *not* to provide a seeding api for a
crypto rng is that this means you can later swap out the underlying
algorithms easily as the state of the art improves. By contrast, if you
have a deterministic seeded mode, then swapping out the algorithm becomes a
compatibility break.

(You can provide a "mix this extra entropy into the pool" api, which looks
rather similar to seeding, but has fundamentally different semantics.)

The only real problem that I see with switching the random module to use a
crypto rng is exactly this backwards compatibility issue. For scientific
users, reproducibility of output streams is really important.

(Ironically, this is a variety of "important" that crypto people are very
familiar with: universally acknowledged to be the right thing by everyone
who's thought about it, a minority do religiously and rely on, and most
people ignore out of ignorance. Education is ongoing...)

OTOH python has never made strong guarantees of output stream
reproducibility -- 3.2 broke all seeds by default (you have to add
'version=1' to your seed call to get the same results on post-3.2 pythons
-- which of course gives an error on older versions). And 99% of the
methods are documented to be unstable across versions -- the only method
that's guaranteed to produce reproducible results across versions is
random.random(). In practice the other methods usually don't change so
people get away with it, but. See:

    https://docs.python.org/3/library/random.html#notes-on-reproducibility
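
A small sketch of the `version` argument mentioned above, assuming
Python 3.2 or later and a string seed (as far as I can tell, string and
bytes seeds are where the two versions diverge). The helper name
first_values is made up for this example.

```python
import random


def first_values(seed, version, count=3):
    """Return the first few random() outputs for a given seed and seed version."""
    rng = random.Random()
    rng.seed(seed, version=version)
    return [rng.random() for _ in range(count)]


# The same string seed, interpreted with the old and the new seeding scheme.
v1 = first_values("my experiment", version=1)
v2 = first_values("my experiment", version=2)

# Re-seeding with the same seed and version replays the same stream.
v2_again = first_values("my experiment", version=2)
```

The v1 and v2 streams differ, which is the compatibility break described
above; within one version the stream is repeatable.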

So in practice the stdlib random module is not super suitable for
scientific work anyway. Not that this stops anyone from using it for this
purpose... see above. (And to be fair even this limited determinism is
still enough to be somewhat useful -- not all projects require
reproducibility across years of different python versions.) Plus even a lot
of people who know about the importance of seeding don't realize that the
stdlib's support has these gotchas.

(Numpy unsurprisingly puts higher priority on these issues -- our random
module guarantees exact reproducibility of seeded outputs modulo rounding,
across versions and systems, except for bugfixes necessary for correctness.
This means that we carry around a bunch of old inefficient implementations
of the distribution methods, but so be it...)

So, all that considered: I can actually see an argument for removing the
seeding methods from the stdlib entirely, and directing those who need
reproducibility to a third party library like numpy (/ pygsl / ...). This
would be pretty annoying for some cases where someone really does have
simple needs and wants just a little determinism without a binary
extension, but on net it might even be an improvement, given how easy it is
to misread the current api as guaranteeing more than it actually promises.
OTOH this would actually break the current promise, weak as it is.

Keeping that promise in mind, an alternative would be to keep both
generators around, use the cryptographically secure one by default, and
switch to MT when someone calls

  seed(1234, generator="INSECURE LEGACY MT")

But this would justifiably get us crucified by the security community,
because the above call would flip the insecure switch for your entire
program, including possibly other modules that were depending on random to
provide secure bits.

So if we were going to do this then I think it would have to be by
switching the global RNG over unconditionally, and to fulfill the promise,
provide the MT option as a separate class that the user would have to
instantiate explicitly if they wanted it for backcompat. Document that you
should replace

import random
random.seed(12345)
if random.whatever(): ...

with

from random import MTRandom
random = MTRandom(12345)
if random.whatever(): ...

As part of this transition I would also suggest making the seed method on
non-seedable RNGs raise an error when given an explicit seed, instead of
silently doing nothing like the current SystemRandom.
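
A rough sketch of that last suggestion, under stated assumptions: the class
name StrictSystemRandom and the choice of TypeError are mine, not an
existing or proposed stdlib API. Seeding with None is still allowed because
the base class constructor calls seed(None) during initialisation.

```python
import random


class StrictSystemRandom(random.SystemRandom):
    """Hypothetical SystemRandom variant that rejects explicit seeds."""

    def seed(self, a=None, *args, **kwargs):
        # An explicit seed cannot be honoured by an OS-backed generator,
        # so fail loudly instead of silently ignoring it.
        if a is not None:
            raise TypeError("this generator cannot be seeded")
        # seed() / seed(None) stays a harmless no-op.


rng = StrictSystemRandom()
value = rng.random()  # normal use is unaffected

try:
    rng.seed(12345)
    seed_rejected = False
except TypeError:
    seed_rejected = True
```

With this variant, code that assumes a seed gives reproducible output
finds out immediately rather than getting unrepeatable results.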

-n

From random832 at fastmail.us  Wed Sep  9 23:15:39 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 17:15:39 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>
 <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>
Message-ID: <1441833339.2897870.379262849.1477A353@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 17:02, Nathaniel Smith wrote:
> Keeping that promise in mind, an alternative would be to keep both
> generators around, use the cryptographically secure one by default, and
> switch to MT when someone calls
> 
>   seed(1234, generator="INSECURE LEGACY MT")
> 
> But this would justifiably get us crucified by the security community,
> because the above call would flip the insecure switch for your entire
> program, including possibly other modules that were depending on random
> to provide secure bits.

Ideally, neither the crypto bits nor the science bits of a big program
should be using the module-level functions. A small program either
hasn't got both kinds of bits, or won't be using them at the same time.
And if you've got non-science bits doing stuff with your RNG then your
results probably aren't going to be reproducible anyway.

Which suggests a solution: How about exposing a way to switch out the
Random instance used by the module-level functions? The instance itself
exists now as random._inst, but the module just spews its bound methods
all over its namespace. (Long-term, it might make sense to deprecate the
module-level functions)
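
A rough sketch of what "switching out the instance" could look like, written as a standalone facade rather than a change to the random module. The class name SwappableRandom and the method set_instance are hypothetical, chosen only for illustration:

```python
import random


class SwappableRandom:
    """Hypothetical facade: forwards every call to a replaceable Random."""

    def __init__(self, inst=None):
        self._inst = inst if inst is not None else random.Random()

    def set_instance(self, inst):
        """Swap the generator behind all forwarded methods."""
        self._inst = inst

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for RNG methods.
        if name == "_inst":
            raise AttributeError(name)
        return getattr(self._inst, name)


rng = SwappableRandom()

rng.set_instance(random.Random(1234))    # reproducible, for the science bits
first = rng.random()
rng.set_instance(random.Random(1234))
second = rng.random()                    # same seed, so the same value again

rng.set_instance(random.SystemRandom())  # OS-backed, for the crypto bits
secure_value = rng.randrange(100)
```

The per-call lookup matters: the module-level names are bound methods of random._inst created at import time, so simply rebinding random._inst would not redirect them.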

From srkunze at mail.de  Wed Sep  9 23:16:24 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 9 Sep 2015 23:16:24 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <55F0A1A8.5010001@mail.de>

Thanks for sharing, Guido. Some random thoughts:

- "classes should need to be explicitly marked as protocols"
If so, why are they classes in the first place? Other languages have
dedicated keywords like "interface".

- "recursive types"?
Yes, please. I am very curious about how to do this, as I am working on a
similar problem. It would basically require defining the protocol first
and then populating its members, since they might use the protocol's name.
pyfu is supposed to do exactly this. But it's not going to work 100% when
metaclasses come into play.

Best,
Sven

On 09.09.2015 22:17, Guido van Rossum wrote:
> Jukka wrote up a proposal for structural subtyping. It's pretty good. 
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>
> -- 
> --Guido van Rossum (python.org/~guido <http://python.org/%7Eguido>)
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/


From abarnert at yahoo.com  Wed Sep  9 23:18:06 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 14:18:06 -0700
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
Message-ID: <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>

On Sep 9, 2015, at 01:56, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> Apart from that issue, which is Windows only (and thus some people
> find it less compelling) we have also had reported issues of people
> running pip, and it installs things into the "wrong" Python
> installation. This is typically because of PATH configuration issues,
> where "pip" is being found via one PATH element, but "python" is found
> via a different one. I don't have specifics to hand, so I can't
> clarify *how* people have managed to construct such breakage, but I
> can state that it happens, and the relevant people are usually very
> confused by the results.

If StackOverflow/SU/TD questions are any indication, a disproportionate number of these people are Mac users using Python 2.7, who have installed a second Python 2.7 (or, in some cases, two of them) alongside Apple's. Many teachers, blog posts, instructions for scientific packages, etc. recommend this, but often don't give enough information for a novice to get it right. 

Many people don't even realize they already have a Python 2.7; others are making their first foray into serious terminal usage and don't think about PATH issues; others are following old instructions written for OS X 10.5 that don't do the right thing in 10.6, much less 10.10; etc. And even experienced *nix types who aren't Mac experts may not realize the implications of LaunchServices not being launched by the shell (so anything you double-click, schedule, run as a service, etc. won't see your export PATH that you think should be solving things). Even Mac experts are thrown by the fact that Apple's pre-installed Python is in /usr but has a scripts directory in /usr/local, so if you install pip for both Apple's Python and a second one, whichever one goes second is likely to overwrite the first (but that isn't as common as just having /usr/bin ahead of /usr/local/bin on the PATH--because Apple's Python doesn't come with pip, this is enough to have your highest pip and python executables out of sync).

Whenever someone has a PATH question, I always start by asking them if they're on a Mac, and using Python 2.7, and, if so, which if any Python installs they've done, and why they can't use virtual environments and/or upgrade to Python 3.x and/or use the system Python. The vast majority say yes, yes, [python.org|Homebrew|the one linked from this blog post|I don't remember], what's a virtual environment, my [book|teacher|friend] says 2.7 is the best version, this blog post says [Apple doesn't include Python|Apple's Python is 2.7.1 and broken|etc.].

As both Python 3 and virtual environments become more common (at least as long as Apple isn't shipping Python 3 or virtualenv for their 2.7), the problem seems to be becoming less common, but it's still depressing how many people are still writing blog posts and SO answers and so on that tell people "you need to install the latest version of Python, 2.7, because your computer doesn't come with it" and then proceed to give instructions that will lead to a screwed up PATH and make no mention of virtualenv...

From donald at stufft.io  Wed Sep  9 23:25:12 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 17:25:12 -0400
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
Message-ID: <etPan.55f0a3b8.4e506bc7.31bc@Draupnir.home>

On September 9, 2015 at 5:22:57 PM, Andrew Barnert via Python-ideas (python-ideas at python.org) wrote:
> > Apple's Python doesn't come with pip

As of the latest Yosemite release, and in El Capitan, it *does* however come with Python 2.7.10 and thus ``python -m ensurepip`` works.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From skrah at bytereef.org  Wed Sep  9 23:36:06 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Wed, 9 Sep 2015 21:36:06 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
Message-ID: <loom.20150909T232749-280@post.gmane.org>


Petr Viktorin <encukou at ...> writes:
> The OpenBSD implementation does not allow any kind of reproducible results.
> Reading http://www.pcg-random.org/other-rngs.html, I see that
> arc4random is not built for statistical quality and k-dimensional
> equidistribution, which are also properties you might not need for
> crypto, but do want for simulations.
> So there are two quite different use cases (plus a lot of grey area
> where any solution is okay).

I can't find much at all when searching for "chacha20 equidistribution".
Contrast that with "mersenne twister equidistribution" and it seems that
chacha20 hasn't been studied very much in that respect (except for
the pcg-random site).


So I also think this should preclude us from replacing the current
random() functions.


Adding an arc4random module with the caveat that its quality will
be as good as the current OpenBSD libcrypto/libressl(?) would be okay.


Stefan Krah





From abarnert at yahoo.com  Wed Sep  9 23:50:16 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 14:50:16 -0700
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55F058B6.9000202@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
Message-ID: <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>

On Sep 9, 2015, at 09:05, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 09.09.2015 02:09, Andrew Barnert via Python-ideas wrote:
>> I think it's already been established why % formatting is not going away any time soon.
>> 
>> As for de-emphasizing it, I think that's already done pretty well in the current docs. The tutorial has a nice long introduction to str.format, a one-paragraph section on "old string formatting" with a single %5.3f example, and a one-sentence mention of Template. The stdtypes chapter in the library reference explains the difference between the two in a way that makes format sound more attractive for novices, and then has details on each one as appropriate. What else should be done?
> 
> I had difficulty finding what you mean by tutorial. But hey, being a Python user for years and not knowing where the official tutorial resides...

If you go to docs.python.org (directly, or by clicking the link to docs for Python 3 or Python 2 from the home page or the documentation menu), Tutorial is the second thing on the list, after What's New. And, as you found, it's the first hit for "Python tutorial" on Google.

At any rate, if you're not concerned with the tutorial, which parts of the docs are you worried about? Sure, a lot of people learn Python from various books, websites, and classes that present % instead of (or at least in equal light with) {}, but those are all outside the control of Python itself. You can't write a PEP to get the author of ThinkPython, a guy who wrote 1800 random StackOverflow answers, or the instructor for Programming 101 at Steve University to change what they teach.

And if not the docs, what else would it mean to "de-emphasize" %-formatting without deprecating or removing it?

> Anyway, Google presented me the version 2.7 of the tutorial.

That's a whole other problem. But nobody is going to retroactively change Python 2.7 just to help people who find the 2.7 docs when they should be looking for 3.5.

That might seem reasonable today, when 2.7 could heartily recommend str.format because it's nearly the same in 2.7 as in 3.5, but what about next year, when f-strings are the preferred way to do it in 3.6? If 3.6 de-emphasizes str.format (as a feature used only when you need backward compat and/or dynamic formats) and its tutorial, %-formatting docs, and str.format docs all point to f-strings, having 2.7's docs point people to str.format will be misleading at best for 3.6, but having it recommend something that doesn't exist in 2.7 will be actively wrong for 2.7.

The solution is to get people to the 3.5 or 3.6 docs in the first place, not to hack up the 2.7 docs.

> I still don't understand what's wrong with deprecating %, but okay.

Well, have you read the answers given by Nick, me, and others earlier in the thread? If so, what do you disagree with? You've only addressed one point (that % is faster than {} for simple cases--and your solution is just "make {} faster", which may not be possible given that it's inherently more hookable than % and therefore requires more function calls...). What about formatting headers for ASCII wire protocols, sharing tables of format strings between programming languages (e.g., for i18n), or any of the other reasons people have brought up?
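
For reference, a small side-by-side of the formatting styles discussed in this thread, plus the wire-protocol case. The values are made up for illustration; the f-string line needs Python 3.6 or later:

```python
import math

x = math.pi

percent_style = "%5.3f" % x           # old-style %-formatting
format_style = "{:5.3f}".format(x)    # str.format
fstring_style = f"{x:5.3f}"           # f-string (Python 3.6+)

# bytes supports % (Python 3.5+) but has no .format() method, which is one
# reason % stays useful for ASCII wire protocols.
header = b"Content-Length: %d\r\n" % 42
```

All three string variants produce the same text here, so the choice between them is about readability and compatibility rather than output.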

From abarnert at yahoo.com  Wed Sep  9 23:28:46 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 14:28:46 -0700
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <mspcv3$8mu$1@ger.gmane.org>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
Message-ID: <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>

On Sep 9, 2015, at 06:41, Wolfgang Maier <wolfgang.maier at biologie.uni-freiburg.de> wrote:
> 
> 3)
> Finally, the alternative idea of having the new functionality handled by a new !converter, like:
> 
> "List: {0!j:,}".format([1.2, 3.4, 5.6])
> 
> I considered this idea before posting the original proposal, but, in addition to requiring a change to str.format (which would need to recognize the new token), this approach would need either:
> 
> - a new special method (e.g., __join__) to be implemented for every type that should support it, which is worse than for my original proposal or
> 
> - the str.format method must react directly to the converter flag, which is then no different to the above solution just that it uses !j instead of *. Personally, I find the * syntax more readable, plus, the !j syntax would then suggest that this is a regular converter (calling a special method of the object) when, in fact, it is not.
> Please correct me, if I misunderstood something about this alternative proposal.

But the format method already _does_ react directly to the conversion flag. As the docs say, the "type coercion" (call to str, repr, or ascii) happens before formatting, and then the __format__ method is called on the result. A new !j would be a "regular converter"; it just calls a new join function (which returns something whose __format__ method then does the right thing) instead of the str, repr, or ascii functions.

And random's custom converter idea would work similarly, except that presumably his !join would specify a function registered to handle the "join" conversion in some way rather than being hardcoded to a builtin.
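
To make the mechanism concrete, here is a sketch under stated assumptions: Joined is a hypothetical wrapper standing in for whatever a "!j" conversion would return, and treating the whole format spec as the separator is only one possible reading of the proposal.

```python
class Joined:
    """Hypothetical wrapper: the format spec is used as the separator."""

    def __init__(self, iterable):
        self.items = list(iterable)

    def __format__(self, spec):
        # Each element is rendered with its default format().
        return spec.join(format(item) for item in self.items)


# Existing behaviour: the conversion runs first, then __format__ is called
# on the converted value. repr("a") is "'a'", which is then padded to width 5.
converted = "{!r:>5}".format("a")

# What a "!j" conversion could amount to, written out by hand:
joined = "List: {:,}".format(Joined([1.2, 3.4, 5.6]))
```

No new special method is needed on list itself; only the wrapper returned by the conversion has to implement __format__.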


From srkunze at mail.de  Thu Sep 10 00:02:43 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 00:02:43 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <55F0AC83.3050505@mail.de>

Not specifically about this proposal but about the effort put into 
Python typehinting in general currently:


What are the supposed benefits?


I read somewhere that right now tools are able to infer 60% of the
types. That seems pretty good to me, and a lot of effort on your side to
gain an additional 20/30%. Don't get me wrong, I like the
theoretical and abstract discussions around this topic, but I feel this
type of feature is way out of the practical realm.

I don't see the effort of adding type hints AND the effort of further
parsing (by human eyes) justified by partially better IDE support and a
single additional test within test suites of tens of thousands of tests.

Especially, when considering that correct types don't prove 
functionality in any case. But tested functionality in some way proves 
correct typing.

Just my two cents since I felt I had to say this and maybe I am missing 
something. :)

Best,
Sven



On 09.09.2015 22:17, Guido van Rossum wrote:
> Jukka wrote up a proposal for structural subtyping. It's pretty good. 
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>
> -- 
> --Guido van Rossum (python.org/~guido <http://python.org/%7Eguido>)


From wolfgang.maier at biologie.uni-freiburg.de  Thu Sep 10 00:03:05 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 10 Sep 2015 00:03:05 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
Message-ID: <55F0AC99.8030408@biologie.uni-freiburg.de>

On 09.09.2015 23:28, Andrew Barnert via Python-ideas wrote:
> On Sep 9, 2015, at 06:41, Wolfgang Maier <wolfgang.maier at biologie.uni-freiburg.de> wrote:
>>
>> 3)
>> Finally, the alternative idea of having the new functionality handled by a new !converter, like:
>>
>> "List: {0!j:,}".format([1.2, 3.4, 5.6])
>>
>> I considered this idea before posting the original proposal, but, in addition to requiring a change to str.format (which would need to recognize the new token), this approach would need either:
>>
>> - a new special method (e.g., __join__) to be implemented for every type that should support it, which is worse than for my original proposal or
>>
>> - the str.format method must react directly to the converter flag, which is then no different to the above solution just that it uses !j instead of *. Personally, I find the * syntax more readable, plus, the !j syntax would then suggest that this is a regular converter (calling a special method of the object) when, in fact, it is not.
>> Please correct me, if I misunderstood something about this alternative proposal.
>
> But the format method already _does_ react directly to the conversion flag. As the docs say, the "type coercion" (call to str, repr, or ascii) happens before formatting, and then the __format__ method is called on the result. A new !j would be a "regular converter"; it just calls a new join function (which returns something whose __format__ method then does the right thing) instead of the str, repr, or ascii functions.
>

Ah, I see! Thanks for correcting me here. Somehow, I had the mental 
picture that the format converters would call the object's __str__ and 
__repr__ methods directly (and so you'd need an additional __join__ 
method for the new converter), but that's not the case then.

> And random's custom converter idea would work similarly, except that presumably his !join would specify a function registered to handle the "join" conversion in some way rather than being hardcoded to a builtin.
>

How would such a registration work (sorry, I haven't had the time to 
search the list for his previous mention of this idea)? A new builtin 
certainly won't fly.

Thanks,
Wolfgang


From rustompmody at gmail.com  Wed Sep  9 18:16:22 2015
From: rustompmody at gmail.com (Rustom Mody)
Date: Wed, 9 Sep 2015 09:16:22 -0700 (PDT)
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
Message-ID: <0a3342d3-0b60-41cd-9cbb-5fac32c371ef@googlegroups.com>



On Wednesday, September 9, 2015 at 2:27:08 PM UTC+5:30, Paul Moore wrote:
>
> In actual fact, if it weren't for the backward compatibility issues it
> would cause, I'd be tempted to argue that pip shouldn't provide any
> wrapper at all, and *only* offer "python -m pip" as a means of
> invoking it (precisely because it's so closely tied to the Python
> interpreter used to invoke it). But that's never going to happen and I
> don't intend it as a serious proposal.
>
> Paul
>
>  
The amount of grief pip is currently causing is IMHO good reason to prefer 
incompatible changes that remove breakage to try-n-please-everyone and keep 
breaking.


From abarnert at yahoo.com  Thu Sep 10 00:08:04 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 15:08:04 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <7DC7EA44-0CD8-4F61-8462-8147B8BB8059@yahoo.com>

On Sep 9, 2015, at 13:17, Guido van Rossum <guido at python.org> wrote:
> 
> Jukka wrote up a proposal for structural subtyping. It's pretty good. Please discuss.
> 
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867

Are we going to continue to have (both implicit and explicit) ABCs in collections.abc, numbers, etc., and also have protocols that are also ABCs and are largely parallel to them (and implicit at static checking time whether they're implicit or explicit at runtime) in typing? If so, I think we've reached the point where the two parallel hierarchies are a problem.

Also, why are both the terminology and implementation so different from what we already have for ABCs? Why not just have a decorator or metaclass that can be added to ABCs that makes them implicit (rather than writing a manual __subclasshook__ for each one), which also makes them implicit at static type checking time, which means there's no need for a whole separate but similar notion?

I'm not sure why it's important to also have some types that are implicit at static type checking time but not at runtime, but if there is a good reason, that just means two different decorators/metaclasses/whatever (or a flag passed to the decorator, etc.). Compare:

Hashable is an implicit ABC, Sequence is an explicit ABC, Reversible is an implicit-static/explicit-runtime ABC. 

Hashable is an implicit ABC and also a Protocol that's an explicit ABC, Sequence is an explicit ABC and not a Protocol, Reversible is a Protocol that's an explicit ABC.

The first one is clearly simpler; is there some compelling reason that makes the second one better anyway?
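
For readers who haven't run into the implicit/explicit distinction, this is how it already behaves at runtime with the existing ABCs. Point and Deck are made-up classes for illustration:

```python
from collections.abc import Hashable, Sequence


class Point:
    def __hash__(self):
        return 0


class Deck:
    def __len__(self):
        return 0

    def __getitem__(self, index):
        raise IndexError(index)


# Hashable is implicit: its __subclasshook__ only looks for __hash__.
point_is_hashable = isinstance(Point(), Hashable)

# Sequence is explicit: having the right methods is not enough ...
deck_before = isinstance(Deck(), Sequence)

# ... until the class is registered (or inherits from Sequence).
Sequence.register(Deck)
deck_after = isinstance(Deck(), Sequence)
```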

From tim.peters at gmail.com  Thu Sep 10 00:19:44 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 17:19:44 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <loom.20150909T232749-280@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
Message-ID: <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>

[Stefan Krah <skrah at bytereef.org>]
> I can't find much at all when searching for "chacha20 equidistribution".
> Contrast that with "mersenne twister equidistribution" and it seems that
> chacha20 hasn't been studied very much in that respect (except for
> the pcg-random site).
>
> So I also think this should preclude us from replacing the current
> random() functions.

Well, most arguments about random functions rely on fantasies ;-)  For
example, yes, the Twister is provably equidistributed to 32 bits
across 623 dimensions, but ... does it make a difference to anything?
That's across the Twister's _entire period_, which couldn't actually
be generated across the age of the universe.

What may really matter to an application is whether it will see rough
equidistribution across the infinitesimally small slice (of the
Twister's full period) it actually generates.  And you'll find very
little about _that_ (last time I searched, I found nothing).  For
assurances about that, people rely on test suites developed to test
generators.

The Twister's provably perfect equidistribution across its whole
period also has its scary sides.  For example, run random.random()
often enough, and it's _guaranteed_ you'll eventually reach a state
where the output is exactly 0.0 hundreds of times in a row.  That will
happen as often as it "should happen" by chance, but that's scant
comfort if you happen to hit such a state.  Indeed, the Twister was
patched relatively early in its life to try to prevent it from
_starting_ in such miserable states.   Such states are nevertheless
still reachable from every starting state.

But few people know any of that, so they take "equidistribution" as
meaning a necessary thing rather than as an absolute guarantee of
eventual disaster ;-)

What may really matter for most simulations is that the Twister never
reaches a state where, in low dimensions, k-tuples fall on "just a
few" regularly-spaced hyperplanes forever after.  That's a systematic
problem with old flavors of linear congruential generators.  But that
problem is _so_ old that no new generator proposed over the last few
decades suffers it either.
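
A worked illustration of that hyperplane problem, using RANDU as the example. RANDU is not named above; it is brought in here only as a well-known old LCG with exactly this defect:

```python
def randu(seed, count):
    """RANDU, a classic flawed LCG: x -> 65539 * x mod 2**31 (odd seed)."""
    values = []
    x = seed
    for _ in range(count):
        x = (65539 * x) % 2**31
        values.append(x)
    return values


xs = randu(seed=1, count=1000)

# Every three consecutive outputs satisfy one fixed linear relation,
# x[k+2] = 6*x[k+1] - 9*x[k] (mod 2**31), which is why its 3-tuples
# fall on a small number of parallel planes.
on_planes = all(
    (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 == 0
    for k in range(len(xs) - 2)
)
```

The relation follows from 65539 = 2**16 + 3, so 65539**2 = 2**32 + 6 * 2**16 + 9, which is congruent to 6 * 65539 - 9 modulo 2**31.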


> Adding an arc4random module with the caveat that its quality will
> be as good as the current OpenBSD libcrypto/libressl(?) would be okay.

Anyone is welcome to supply such a module today, and distribute it via
the usual channels.

Python already supplies the platform spelling of `urandom`, and a very
capable random.SystemRandom class based on `urandom`, for those
needing crypto-strength randomness (relying on what their OS believed
that means, and supplied via their `urandom`).
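
For completeness, the two existing spellings look like this in use. The sizes and the colour list are arbitrary illustration values:

```python
import os
import random

raw = os.urandom(16)                      # 16 bytes straight from the OS

sysrand = random.SystemRandom()           # the Random API on top of urandom
pin = sysrand.randrange(10**6)            # e.g. a number below one million
colour = sysrand.choice(["red", "green", "blue"])
```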

Good enough for me.  But, no, I'm not a security wonk at heart.

From njs at pobox.com  Thu Sep 10 00:40:02 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 15:40:02 -0700
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
Message-ID: <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>

On Wed, Sep 9, 2015 at 1:56 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 9 September 2015 at 08:33, Ben Finney <ben+python at benfinney.id.au> wrote:
>> Contrariwise, I would like to see "pip" become the canonical invocation
>> used in all documentation, discussion, and advice; and if there are any
>> technical barriers to that least-surprising method, to see those
>> barriers addressed and removed.
>
> There is at least one fundamental, technical, and (so far) unsolvable
> issue with using "pip" as the canonical invocation.
>
>     pip install -U pip
>
> fails on Windows, because the exe wrapper cannot be replaced by a
> process running that wrapper (the "pip" command runs pip.exe which
> needs to replace pip.exe, but can't because the OS has it open as the
> current running process).
>
> There have been a number of proposals for fixing this, but none have
> been viable so far. We'd need someone to provide working code (not
> just suggestions on things that might work, but actual working code)
> before we could recommend anything other than "python -m pip install
> -U pip" as the correct way of upgrading pip. And recommending one
> thing when upgrading pip, but another for "normal use" is also
> confusing for beginners. (And we have evidence from the pip issue
> tracker people *do* find this confusing, and not just beginners...)

At the very least, surely this could be "fixed" by detecting this case
and exiting with a message "Sorry, Windows is annoying and this isn't
going to work, to upgrade pip please type 'python -m pip ...'
instead"? That seems more productive in the short run than trying to
get everyone to stop typing "pip" :-). (Though I do agree that having
pip as a separate command from python is a big mess -- another case
where this comes up is the need for pip versus pip3.)
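
A rough sketch of what such a check could look like. The function name, the message text, and the crude argument matching are all made up for illustration; pip's real entry point and option parsing are not modelled here.

```python
import ntpath
import sys

# Hypothetical message; the wording is only a suggestion.
MESSAGE = "To upgrade pip on Windows, please run: python -m pip install -U pip"


def blocks_self_upgrade(argv, platform=sys.platform):
    """Return True if argv looks like `pip install -U pip` run via pip.exe on Windows."""
    # ntpath understands both "\\" and "/" separators on any host OS.
    launcher = ntpath.basename(argv[0]).lower()
    via_wrapper = (
        platform == "win32"
        and launcher.startswith("pip")
        and launcher.endswith(".exe")
    )
    args = argv[1:]
    upgrading = "install" in args and ("-U" in args or "--upgrade" in args)
    # Very crude: a real check would parse the requirement list properly.
    return via_wrapper and upgrading and "pip" in args


if __name__ == "__main__":
    if blocks_self_upgrade(sys.argv):
        sys.exit(MESSAGE)
```

The point is only that the failing case is cheap to recognise before any files are touched.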

> Apart from that issue, which is Windows only (and thus some people
> find it less compelling) we have also had reported issues of people
> running pip, and it installs things into the "wrong" Python
> installation. This is typically because of PATH configuration issues,
> where "pip" is being found via one PATH element, but "python" is found
> via a different one. I don't have specifics to hand, so I can't
> clarify *how* people have managed to construct such breakage, but I
> can state that it happens, and the relevant people are usually very
> confused by the results. Again, "python -m pip" avoids any confusion
> here - that invocation clearly and unambiguously installs to the
> Python installation you invoked.

It sounds like this is another place where in the short term, it would
help a lot if pip at startup took a peek at $PATH and issued some
warnings or errors if it detected the most common types of
misconfiguration? (E.g. the first python/python3 in $PATH does not
match the one being used to run pip.)
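
Something along these lines, perhaps. This is a minimal sketch using only shutil.which and os.path.realpath; the function name and the warning text are invented, and it deliberately ignores things like the Windows py launcher.

```python
import os
import shutil
import sys


def check_path(running=sys.executable, names=("python", "python3"), path=None):
    """Return a warning string if no interpreter found on PATH matches `running`.

    Returns None when nothing is found on PATH or when at least one of the
    names resolves to the interpreter that is running us.
    """
    found = [shutil.which(name, path=path) for name in names]
    hits = [p for p in found if p]
    if not hits:
        return None
    target = os.path.realpath(running)
    if any(os.path.realpath(p) == target for p in hits):
        return None
    return "pip is running under %s, but PATH finds %s" % (running, ", ".join(hits))


if __name__ == "__main__":
    warning = check_path()
    if warning:
        print("WARNING:", warning, file=sys.stderr)
```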

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From abarnert at yahoo.com  Thu Sep 10 00:39:45 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 15:39:45 -0700
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55F0AC99.8030408@biologie.uni-freiburg.de>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
 <55F0AC99.8030408@biologie.uni-freiburg.de>
Message-ID: <C2DA14A1-DC2F-4F66-B4D6-EB3D63824A63@yahoo.com>

On Sep 9, 2015, at 15:03, Wolfgang Maier <wolfgang.maier at biologie.uni-freiburg.de> wrote:
> 
>> On 09.09.2015 23:28, Andrew Barnert via Python-ideas wrote:
>>> On Sep 9, 2015, at 06:41, Wolfgang Maier <wolfgang.maier at biologie.uni-freiburg.de> wrote:
>>> 
>>> 3)
>>> Finally, the alternative idea of having the new functionality handled by a new !converter, like:
>>> 
>>> "List: {0!j:,}".format([1.2, 3.4, 5.6])
>>> 
>>> I considered this idea before posting the original proposal, but, in addition to requiring a change to str.format (which would need to recognize the new token), this approach would need either:
>>> 
>>> - a new special method (e.g., __join__) to be implemented for every type that should support it, which is worse than for my original proposal or
>>> 
>>> - the str.format method must react directly to the converter flag, which is then no different to the above solution just that it uses !j instead of *. Personally, I find the * syntax more readable, plus, the !j syntax would then suggest that this is a regular converter (calling a special method of the object) when, in fact, it is not.
>>> Please correct me, if I misunderstood something about this alternative proposal.
>> 
>> But the format method already _does_ react directly to the conversion flag. As the docs say, the "type coercion" (call to str, repr, or ascii) happens before formatting, and then the __format__ method is called on the result. A new !j would be a "regular converter"; it just calls a new join function (which returns something whose __format__ method then does the right thing) instead of the str, repr, or ascii functions.
> 
> Ah, I see! Thanks for correcting me here. Somehow, I had the mental picture that the format converters would call the object's __str__ and __repr__ methods directly (and so you'd need an additional __join__ method for the new converter), but that's not the case then.
> 
>> And random's custom converter idea would work similarly, except that presumably his !join would specify a function registered to handle the "join" conversion in some way rather than being hardcoded to a builtin.
> 
> How would such a registration work (sorry, I haven't had the time to search the list for his previous mention of this idea)? A new builtin certainly won't fly.

I believe he posted a more detailed version of the idea on one of the other spinoff threads from the f-string thread, but I don't have a link. But there are lots of possibilities, and if you want to start bikeshedding, it doesn't matter that much what his original color was. For example, here's a complete proposal:

    class MyJoiner:
        def __init__(self, value):
            self.value = value
        def __format__(self, spec):
            return spec.join(map(str, self.value))
    string.register_converter('join', MyJoiner)

That last line adds it to some global table (maybe string._converters, or maybe it's not exposed at the Python level at all; whatever).

In str.format, instead of reading a single character after a !, it reads until colon or end of field; if that's more than a single character, it looks it up in the global table and calls the registered callable. So, in this case, "{spam!join:-}"
would call MyJoiner(spam).__format__('-').
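
To make the lookup concrete, here is a sketch that emulates it in pure Python on top of string.Formatter. The ConverterFormatter class, the _converters table, and the regex are my own stand-ins, not part of the proposal; the regex handles neither "{{" escapes nor nested fields, and the real change would live inside str.format itself.

```python
import re
import string

_converters = {}  # stand-in for the hypothetical global table


def register_converter(name, factory):
    _converters[name] = factory


class MyJoiner:
    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        return spec.join(map(str, self.value))


register_converter("join", MyJoiner)

# {name!conv:spec}, where conv may be longer than one character.
_FIELD = re.compile(
    r"\{(?P<name>[^{}!:]*)(?:!(?P<conv>[^{}:]+))?(?::(?P<spec>[^{}]*))?\}"
)


class ConverterFormatter(string.Formatter):
    def parse(self, format_string):
        # Yield (literal, field_name, format_spec, conversion) tuples.
        pos = 0
        for m in _FIELD.finditer(format_string):
            yield format_string[pos:m.start()], m["name"], m["spec"] or "", m["conv"]
            pos = m.end()
        if pos < len(format_string):
            yield format_string[pos:], None, None, None

    def convert_field(self, value, conversion):
        # Multi-character names go to the table; !s, !r and !a keep working.
        if conversion is not None and len(conversion) > 1:
            return _converters[conversion](value)
        return super().convert_field(value, conversion)


print(ConverterFormatter().format("List: {spam!join:-}", spam=[1, 2, 3]))
```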

Any more complexity can be added to MyJoiner pretty easily, so this small extension to str.format seems sufficient for anything you might want. For example, if you want a three-part format spec that includes the join string, a format spec to pass to each element, and a format spec to apply to the whole thing:

    def __format__(self, spec):
        joinstr, _, spec = spec.partition(':')
        espec, _, jspec = spec.partition(':')
        bits = (format(e, espec) for e in self.value)
        joined = joinstr.join(bits)
        return format(joined, jspec)
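
Dropped into a complete class, that method can be tried without touching str.format at all, by calling format() on the wrapper directly. The class name and the sample spec below are only for illustration, and note that with this scheme the join string itself cannot contain a colon.

```python
class MyJoiner:
    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        # spec looks like "<join string>:<element spec>:<overall spec>"
        joinstr, _, spec = spec.partition(':')
        espec, _, jspec = spec.partition(':')
        bits = (format(e, espec) for e in self.value)
        joined = joinstr.join(bits)
        return format(joined, jspec)


# Join with ", ", format each element as .2f, right-align the result in 12 columns.
result = format(MyJoiner([1.0, 2.5]), ', :.2f:>12')
print(repr(result))
```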

Or maybe it would be better to have a standard way to do multi-part format specs--maybe even passing arguments to a converter rather than cramming them in the spec--but this seems simple and flexible enough.

It might also be worth having multiple converters called in a chain, but I can't think of a use case for that, so I'll ignore it.

Most converters will be classes that just store the constructor argument and use it in __format__, so it seems tedious to repeat that boilerplate for 90% of them, but that's easy to fix with a decorator:

    def simple_converter(func):
        class Converter:
            def __init__(self, value):
                self.value = value
            def __format__(self, spec):
                return func(self.value, spec)
        return Converter

Meanwhile, maybe you want the register function to be a decorator:

    def register_converter(name):
        def decorator(func):
            _global_converter_table[name] = func
            return func
        return decorator

So now, the original example becomes:

    @string.register_converter('join')
    @string.simple_converter
    def my_joiner(values, joinstr):
        return joinstr.join(map(str, values))


From cannatag at gmail.com  Thu Sep 10 00:35:04 2015
From: cannatag at gmail.com (Giovanni Cannata)
Date: Thu, 10 Sep 2015 00:35:04 +0200
Subject: [Python-ideas] PyPI search still broken
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
 <etPan.55f0a3b8.4e506bc7.31bc@Draupnir.home>
Message-ID: <httdkl4b7jlougelcxey1srm.1441837002659@email.android.com>

Hi, sorry to bother you again, but the search problem on PyPI is still present after several weeks and it's very annoying. I've just released a new version of my ldap3 project and it doesn't show up when searching for it by name. For my project (and, I suppose, for other emerging projects, especially those related to Python 3) it's vital to be easily found by other developers, who use pip and PyPI as THE only repository for Python packages and use the number of downloads as a ranking of a project's popularity.

If search can't be fixed there should be at least a warning on the PyPI homepage to let users know that search is broken and that using Google for searching could help to find more packages.

Bye,
Giovanni

From srkunze at mail.de  Thu Sep 10 00:45:11 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 00:45:11 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
 <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>
Message-ID: <55F0B677.3090500@mail.de>

On 09.09.2015 23:50, Andrew Barnert wrote:
> And if not the docs, what else would it mean to "de-emphasize" %-formatting without deprecating or removing it?

The docs are most important. Sorry, if that didn't come across clearly.

>
>> Anyway, Google presented me the version 2.7 of the tutorial.
> That's a whole other problem. But nobody is going to retroactively change Python 2.7 just to help people who find the 2.7 docs when they should be looking for 3.5.

The Python docs are not Python. So, what's in the way of adding this 
note to Python 2.7 docs? The pride of the Python core devs? I anticipate 
better of you.

> That might seem reasonable today, when 2.7 could heartily recommend str.format because it's nearly the same in 2.7 as in 3.5, but what about next year, when f-strings are the preferred way to do it in 3.6? If 3.6 de-emphasizes str.format (as a feature used only when you need backward compat and/or dynamic formats) and its tutorial, %-formatting docs, and str.format docs all point to f-strings, having 2.7's docs point people to str.format will be misleading at best for 3.6, but having it recommend something that doesn't exist in 2.7 will be actively wrong for 2.7.

str.format teaches people how to use {}. That should be encouraged.
Switching from str.format to f-strings is going to work like a charm. So, 
it's the syntax I am concerned with, not how to execute the magic behind.
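
A small sketch of what I mean, assuming f-strings end up working as described in PEP 498 (the variable names are just for illustration). The replacement field, including its format spec, is written the same way in both {} styles; only the % version needs a different mini-language.

```python
name, count = "spam", 3

# str.format: replacement fields in {}
with_format = "{name}: {count:>4d}".format(name=name, count=count)

# f-string: the same field syntax, evaluated in place
with_fstring = f"{name}: {count:>4d}"

# %-formatting: a different syntax for the same result
with_percent = "%s: %4d" % (name, count)

print(with_format, with_fstring, with_percent, sep="\n")
```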

> The solution is to get people to the 3.5 or 3.6 docs in the first place, not to hack up the 2.7 docs.

You have absolutely no idea why people use 2.7 over 3.5, right? I 
promise you that is going to take time.

And what could you do in the meantime? Call it hacking; to me it's 
improving.

>
>> I still don't understand what's wrong with deprecating %, but okay.
> Well, have you read the answers given by Nick, me, and others earlier in the thread? If so, what do you disagree with?

All "blockers" I read so far classify as a) personal preference of % 
over {} or b) fixable. Both classes do not qualify as real blockers; 
they can be overcome.

> You've only addressed one point (that % is faster than {} for simple cases--and your solution is just "make {} faster", which may not be possible given that it's inherently more hookable than % and therefore requires more function calls...).

Try harder. (If {} is too slow for you.)

I've read Python 3 is significantly slower than Python 2. So what? I can 
live with that when we make the transition. If we notice the 
performance penalty, rest assured I will come back here to seek your advice, 
but until then it's no reason not to switch to Python 3. Same goes for 
string formatting.

> What about formatting headers for ASCII wire protocols, sharing tables of format strings between programming languages (e.g., for i18n), or any of the other reasons people have brought up?
Both fixable in some way or another, the rest classifies as described above.

Best,
Sven

From abarnert at yahoo.com  Thu Sep 10 00:48:39 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 15:48:39 -0700
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <etPan.55f0a3b8.4e506bc7.31bc@Draupnir.home>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
 <etPan.55f0a3b8.4e506bc7.31bc@Draupnir.home>
Message-ID: <039C0DAE-2F13-4A0A-B923-4812CBD4FAFC@yahoo.com>

On Sep 9, 2015, at 14:25, Donald Stufft <donald at stufft.io> wrote:
> 
> On September 9, 2015 at 5:22:57 PM, Andrew Barnert via Python-ideas (python-ideas at python.org) wrote:
>>> Apple's Python doesn't come with pip
> 
> As of the latest Yosemite release, and in El Capitan, it *does* however come with Python 2.7.10 and thus ``python -m ensurepip`` works.

Sure, and all the way back to 10.5, you could just "easy_install pip". That never solved the problem of people who've been told to install a second Python 2.7 without any explanation of why or any consideration of what problems that might lead to; in fact, it just means they're even more likely to end up installing two colliding pips.

I don't think there's much chance that anything Apple or the Python community does will get people to stop writing blog posts/class notes/installation web pages/SO answers/etc. to do this. There is a chance that proselytizing for virtualenv and/or Python 3 will make the problem irrelevant (and it already seems to be having that effect, just not as fast as would be ideal).

Of course there will still be some people who really do need two Python 2.7 installations and aren't expert enough to manage it, and some people who manage to make a mess of things even with separate 2 and 3 or with venvs or with only Apple's Python. But, based on my (admittedly anecdotal) experience and my educated-guess-but-still-a-guess, I think it'll become not much more common than the equivalent problems for Fedora or Ubuntu or FreeBSD, which is a huge improvement.


From dw+python-ideas at hmmz.org  Thu Sep 10 01:01:30 2015
From: dw+python-ideas at hmmz.org (David Wilson)
Date: Wed, 9 Sep 2015 23:01:30 +0000
Subject: [Python-ideas] PyPI search still broken
In-Reply-To: <httdkl4b7jlougelcxey1srm.1441837002659@email.android.com>
References: <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
 <etPan.55f0a3b8.4e506bc7.31bc@Draupnir.home>
 <httdkl4b7jlougelcxey1srm.1441837002659@email.android.com>
Message-ID: <20150909230130.GA14415@k3>

Hi there,

My 2.5 year old offer to retrofit the old codebase with a new search
system still stands[1]. :)  There is no reason for this to be a complex
affair, the prototype built back then took only a few hours to complete.

No doubt the long term answer is probably "Warehouse fixes this", but
Warehouse seems no nearer a reality than it did in 2013.


David

[1] https://groups.google.com/forum/#!search/%22david$20wilson%22$20search$20pypi/pypa-dev/ZjUNkczsKos/2et8926YOQYJ

On Thu, Sep 10, 2015 at 12:35:04AM +0200, Giovanni Cannata wrote:
> Hi, sorry to bother you again, but the search problem on PyPI is still present
> after different weeks and it's very annoying. I've just released a new version
> of my ldap3 project and it doesn't show up when searching with its name. For
> mine (and I suppose for other emerging project, especially related to Python 3)
> it's vital to be easily found by other developers that use pip and PyPI as THE
> only repository for python packages and using the number of download as a
> ranking of popularity of a project.
> 
> If search can't be fixed there should be at least a warning on the PyPI
> homepage to let users know that search is broken and that using Google for
> searching could help to find more packages.
> 
> Bye,
> Giovanni

> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/


From tritium-list at sdamon.com  Thu Sep 10 01:13:19 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Wed, 09 Sep 2015 19:13:19 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
Message-ID: <55F0BD0F.10508@sdamon.com>

In a word - No.

There is zero reason for people doing crypto to use the random module, 
therefore we should not change the random module to be cryptographically 
secure.

Don't break things and slow my code down by default for dubious reasons, 
please.

On 9/9/2015 12:35, Guido van Rossum wrote:
> I've received several long emails from Theo de Raadt (OpenBSD founder) 
> about Python's default random number generator. This is the random 
> module, and it defaults to a Mersenne Twister (MT) seeded by 2500 
> bytes of entropy taken from os.urandom().
>
> Theo's worry is that while the starting seed is fine, MT is not good 
> when random numbers are used for crypto and other security purposes. 
> I've countered that it's not meant for that (you should use 
> random.SystemRandom() or os.urandom() for that) but he counters that 
> people don't necessarily know that and are using the default 
> random.random() setup for security purposes without realizing how 
> wrong that is.
>
> There is already a warning in the docs for the random module that it's 
> not suitable for security, but -- as the meme goes -- nobody reads the 
> docs.
>
> Theo then went into technicalities that went straight over my head, 
> concluding with a strongly worded recommendation of the OpenBSD 
> version of arc4random() (which IIUC is based on something called 
> "chacha", not on "RC4" despite that being in the name). He says it is 
> very fast (but I don't know what that means).
>
> I've invited Theo to join this list but he's too busy. The two core 
> Python experts on the random module have given me opinions suggesting 
> that there's not much wrong with MT, so here I am. Who is right? What 
> should we do? Is there anything we need to do?
>
> -- 
> --Guido van Rossum (python.org/~guido <http://python.org/%7Eguido>)
>
>


From abarnert at yahoo.com  Thu Sep 10 01:14:17 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 16:14:17 -0700
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55F0B677.3090500@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
 <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com> <55F0B677.3090500@mail.de>
Message-ID: <2D5621A7-0676-489D-886E-76E7D953870D@yahoo.com>

On Sep 9, 2015, at 15:45, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 09.09.2015 23:50, Andrew Barnert wrote:
>> And if not the docs, what else would it mean to "de-emphasize" %-formatting without deprecating or removing it?
> 
> The docs are most important. Sorry, if that didn't come across clearly.

No problem.

>>> Anyway, Google presented me the version 2.7 of the tutorial.
>> That's a whole other problem. But nobody is going to retroactively change Python 2.7 just to help people who find the 2.7 docs when they should be looking for 3.5.
> 
> The Python docs are not Python. So, what's in the way of adding this note to Python 2.7 docs? The pride of the Python core devs? I anticipate better of you.

First, the Python docs are part of Python. They're owned by the same foundation, managed by the same team, and updated with a similar process.

Second, I'm not a core dev, and since the Python docs are maintained by the core devs, that stands in the way of me personally making that change. :)

Of course I can easily file a docs bug, with a patch, and possibly start a thread on the relevant list to get wider discussion. But you can do that as easily as I can, and I don't know why you should anticipate better of me than you do of yourself. (If you don't feel capable of writing the change, because you're not a native speaker or your tech writing skills aren't as good as your coding skills or whatever, I won't argue, though your English seems good enough to me; just write a "draft" patch and then ask for people to improve it. There are docs changes that have been done this way in the past, and I think there are more than enough people who'd be happy to help.)

>> The solution is to get people to the 3.5 or 3.6 docs in the first place, not to hack up the 2.7 docs.
> 
> You have absolutely no idea why people use 2.7 over 3.5, right? I promise you that is going to take time.

Of course I don't know why _every_ person still using 2.7 is doing so. For myself personally, off the top of my head, recent reasons have included: maintaining an existing, working app that wouldn't gain any benefit from upgrading it; writing a simple script to be deployed on servers that have 2.7 pre-installed; and writing portable libraries to share on PyPI that work on both 2.7 and 3.3+ to make them useful to as many devs as possible. I know other people do it for similarly good reasons, or different ones (e.g., depending on some library that hasn't been ported yet), and also for bad reasons (related to outdated teaching materials or FUD or depending on some library that has been ported but they checked a 6-year-old blog instead of current information). I know that we're still a few years away from the end of the initial transition period, so none of this surprises me much.

But how is any of that, or any additional factors I don't know about, relevant to the fact that using the 2.7 docs (and especially the tutorial) when coding for 3.5 is a bad idea, and a problem to be fixed? How does any of it mean that making the 2.7 docs apply better to 3.5 but worse to 2.7 is a solution?

>>> I still don't understand what's wrong with deprecating %, but okay.
>> Well, have you read the answers given by Nick, me, and others earlier in the thread? If so, what do you disagree with?
> 
> All "blockers" I read so far classify as a) personal preference of % over {} or b) fixable. Both classes do not qualify as real blockers; they can be overcome.
> 
>> You've only addressed one point (that % is faster than {} for simple cases--and your solution is just "make {} faster", which may not be possible given that it's inherently more hookable than % and therefore requires more function calls...).
> 
> Try harder. (If {} is too slow for you.)
> 
> I've read Python 3 is significantly slower than Python 2. So what? I can live with that, when we will make the transition. If we recognize the performance penalty, rest assured I come back here to seek your advice but until that it's no reason not to switch to Python 3. Same goes for string formatting.
> 
>> What about formatting headers for ASCII wire protocols, sharing tables of format strings between programming languages (e.g., for i18n), or any of the other reasons people have brought up?
> Both fixable in some way or another, the rest classifies as described above.

Just saying "I want % deprecated, and I declare that all of the apparent blocking problems are solvable, and therefore I demand that someone else solve them and then deprecate %" is not very useful.

If you think that's the way Python should go, come up with solutions for all of the ones that need to be fixed (and file bugs and ideally patches), and good arguments to dismiss the ones you don't think need to be fixed. Then you can argue that the only remaining reason not to deprecate % is backward compatibility, which isn't compelling enough, and that may well convince everyone.


From njs at pobox.com  Thu Sep 10 01:15:31 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 16:15:31 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
Message-ID: <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>

On Wed, Sep 9, 2015 at 3:19 PM, Tim Peters <tim.peters at gmail.com> wrote:
> Well, most arguments about random functions rely on fantasies ;-)  For
> example, yes, the Twister is provably equidistributed to 32 bits
> across 623 dimensions, but ... does it make a difference to anything?
> That's across the Twister's _entire period_, which couldn't actually
> be generated across the age of the universe.
>
> What may really matter to an application is whether it will see rough
> equidistribution across the infinitesimally small slice (of the
> Twister's full period) it actually generates.  And you'll find very
> little about _that_ (last time I searched, I found nothing).  For
> assurances about that, people rely on test suites developed to test
> generators.

Yeah, equidistribution is not a guarantee of anything on its own. For
example, an integer counter modulo 2**(623*32) is equidistributed to
32 bits across 623 dimensions, just like the Mersenne Twister. Mostly
people talk about equidistribution because for a deterministic RNG,
(a) being non-equidistributed would be bad, (b) equidistribution is
something you can reasonably hope to prove for simple
non-cryptographic generators, and mathematicians like writing proofs.
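
A toy version of the counter example, scaled down to 2-bit words in 3 dimensions instead of 32 bits in 623 dimensions. The sizes are mine, and the sketch assumes each counter value is emitted as consecutive words with tuples taken on those boundaries. Every possible tuple shows up exactly once over the full period, even though the output is obviously not random.

```python
from collections import Counter

W, D = 2, 3  # toy sizes standing in for 32 bits and 623 dimensions


def counter_outputs(state):
    """Split a W*D-bit counter value into D words of W bits each."""
    mask = (1 << W) - 1
    return tuple((state >> (W * i)) & mask for i in range(D))


period = 2 ** (W * D)
tuples = Counter(counter_outputs(s) for s in range(period))

# All (2**W)**D possible tuples occur, each exactly once.
print(len(tuples), set(tuples.values()))
```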

OTOH equidistribution is not even well-defined for the OpenBSD
"arc4random" generator, because it is genuinely non-deterministic --
it regularly mixes new entropy into its state as it goes -- and
equidistribution by definition requires determinism. So it "fails"
this test of "randomness" because it is too random...

In practice, the chances that your Monte Carlo simulation is going to
give bad results because of systematic biases in arc4random are much,
much lower than the chances that it will give bad results because of
subtle hardware failures that corrupt your simulation. And hey, if
arc4random *does* mess up your simulation, then congratulations, your
simulation is publishable as a cryptographic attack and will probably
get written up in the NYTimes :-).

The real reasons to prefer non-cryptographic RNGs are the auxiliary
features like determinism, speed, jumpahead, multi-thread
friendliness, etc. But the stdlib random module doesn't really provide
any of these (except determinism in strictly limited cases), so I'm
not sure it matters much.
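
For reference, the determinism in question is what you get by seeding, sketched below with an arbitrary seed. Two generators seeded alike replay the same stream within one run, and getstate/setstate let you rewind; random.SystemRandom, by contrast, draws from the OS and cannot be replayed.

```python
import random

a = random.Random(12345)
b = random.Random(12345)
seq_a = [a.random() for _ in range(5)]
seq_b = [b.random() for _ in range(5)]

# Save the state, draw, rewind, and draw again.
state = a.getstate()
first = a.random()
a.setstate(state)
again = a.random()

print(seq_a == seq_b, first == again)
```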

> The Twister's provably perfect equidistribution across its whole
> period also has its scary sides.  For example, run random.random()
> often enough, and it's _guaranteed_ you'll eventually reach a state
> where the output is exactly 0.0 hundreds of times in a row.  That will
> happen as often as it "should happen" by chance, but that's scant
> comfort if you happen to hit such a state.  Indeed, the Twister was
> patched relatively early in its life to try to prevent it from
> _starting_ in such miserable states.   Such states are nevertheless
> still reachable from every starting state.

This criticism seems a bit unfair though -- even a true stream of
random bits (e.g. from a true unbiased quantum source) has this
property, and trying to avoid this happening would introduce bias that
really could cause problems in practice. A good probabilistic program
is one that has a high probability of returning some useful result,
but they always have some low probability of returning something
weird. So this is just saying that most people don't understand
probability. Which is true, but there isn't much that the random
module can do about it :-)

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From random832 at fastmail.us  Thu Sep 10 01:24:55 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 09 Sep 2015 19:24:55 -0400
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <C2DA14A1-DC2F-4F66-B4D6-EB3D63824A63@yahoo.com>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
 <55F0AC99.8030408@biologie.uni-freiburg.de>
 <C2DA14A1-DC2F-4F66-B4D6-EB3D63824A63@yahoo.com>
Message-ID: <1441841095.2587236.379354345.340FCF95@webmail.messagingengine.com>



On Wed, Sep 9, 2015, at 18:39, Andrew Barnert via Python-ideas wrote:
> I believe he posted a more detailed version of the idea on one of the
> other spinoff threads from the f-string thread, but I don't have a link.
> But there are lots of possibilities, and if you want to start
> bikeshedding, it doesn't matter that much what his original color was.
> For example, here's a complete proposal:
> 
>     class MyJoiner:
>         def __init__(self, value):
>             self.value = value
>         def __format__(self, spec):
>             return spec.join(map(str, self.value))
>     string.register_converter('join', MyJoiner)

Er, I wanted it to be something more like

def __format__(self, spec):
   sep, fmt = # 'somehow' break up spec into two parts
   return sep.join(map(lambda x: x.__format__(fmt), self.value))

And I wasn't the one who actually proposed user-registered converters;
I'm not sure who did. At one point in the f-string thread I suggested
using a _different_ special !word for stuff like a string that can be
inserted into HTML without quoting. I'm also not 100% sure how good an
idea it is (since it means either using global state or moving
formatting to a class instead of str).

The Joiner class wouldn't have to exist as a builtin, it could be
private to the format function.
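
A minimal sketch of the two-part spec idea above. The class name Joiner and
the convention of splitting the spec at the first ':' are assumptions made
for illustration only; nothing in the thread settles how the spec would
actually be broken up.

```python
class Joiner:
    """Wrap an iterable so that format() joins its formatted items."""

    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        # Assumed convention: "<separator>:<per-item format spec>".
        # A spec without ':' is treated as a bare separator.
        sep, _, fmt = spec.partition(':')
        return sep.join(format(item, fmt) for item in self.value)
```

With this convention, format(Joiner([1, 2, 3]), ', :02d') gives
'01, 02, 03'. One obvious limitation is that the separator itself cannot
contain a ':'.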

From rob.cliffe at btinternet.com  Thu Sep 10 01:32:35 2015
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Thu, 10 Sep 2015 00:32:35 +0100
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55F058B6.9000202@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
Message-ID: <55F0C193.6000606@btinternet.com>

I use %-formatting.
Not because I think it's so wonderful and solves all problems (although 
it's pretty good), but because it appeared to be the recommended method 
at the time I learned Python in earnest.  If I were only learning Python 
now, I would probably learn str.format or whatever it is.
I *could* learn to use something else *and* change all my working code, 
but do you really want to force me to do that?
I would guess that there are quite a lot of Python users in the same 
position.
Rob Cliffe

On 09/09/2015 17:05, Sven R. Kunze wrote:
> On 09.09.2015 02:09, Andrew Barnert via Python-ideas wrote:
>> I think it's already been established why % formatting is not going 
>> away any time soon.
>>
>> As for de-emphasizing it, I think that's already done pretty well in 
>> the current docs. The tutorial has a nice long introduction to 
>> str.format, a one-paragraph section on "old string formatting" with a 
>> single %5.3f example, and a one-sentence mention of Template. The 
>> stdtypes chapter in the library reference explains the difference 
>> between the two in a way that makes format sound more attractive for 
>> novices, and then has details on each one as appropriate. What else 
>> should be done?
>
> I had difficulties to find what you mean by tutorial. But hey, being a 
> Python user for years and not knowing where the official tutorial 
> resides...
>
> Anyway, Google presented me the version 2.7 of the tutorial. Thus, the 
> link to the stdtypes documentation does not exhibit the note of, say, 
> 3.5:
>
> "Note: The formatting operations described here exhibit a variety of 
> quirks that lead to a number of common errors (such as failing to 
> display tuples and dictionaries correctly). Using the newer 
> str.format() interface helps avoid these errors, and also provides a 
> generally more powerful, flexible and extensible approach to 
> formatting text."
>
> So, adding it to the 2.7 docs would be a start.
>
>
> I still don't understand what's wrong with deprecating %, but okay. I 
> think f-strings will push {} to wide-range adoption.
>
>
> Best,
> Sven


From donald at stufft.io  Thu Sep 10 02:01:16 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 20:01:16 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
Message-ID: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>

Ok, I reached out to Theo de Raadt to talk to him about what he was suggesting
without Guido having to play messenger and forward fragments of the email
conversation. I'm starting a new thread because this email is rather long, and
I'm hoping to divorce it a bit from the back and forth about a proposal that
wasn't exactly what Theo was suggesting that is being discussed in the other
thread.

Essentially, there are three basic types of uses of random (the concept, not
the module). Those are:

1. People/usecases who absolutely need deterministic output given a seed and
   for whom security properties don't matter.
2. People/usecases who absolutely need a cryptographically random output and
   for whom having a deterministic output is a downside.
3. People/usecases that fall somewhere in between where it may or may not be
   security sensitive or it may not be known if it's security sensitive.

The people in group #1 are currently, in the Python standard library, best
served using the MT random source as it provides exactly the kind of determinism
they need. The people in group #2 are currently, in the Python standard
library, best served using os.urandom (either directly or via
random.SystemRandom).

However, the third case is the one that Theo's suggestion is attempting to
solve. In the current landscape, the security minded folks will tell these
people to use os.urandom/random.SystemRandom and the performance or otherwise
less security minded folks will likely tell them to just use random.py, leaving
these people with a random that is not cryptographically safe.

The question then is, does it matter if #3 are using a cryptographically safe
source of randomness? The answer is obviously that we don't know, and it's
possible that the user doesn't know. In these cases it's typically best if we
default to the more secure option and expect people to opt in to insecurity.

In the case of randomness, a lot of languages (Python included) don't do that
and instead they opt to pick the more performant option first, often with the
argument (as seen in the other thread) that if people need a cryptographically
secure source of random, they'll know how to look for it and if they don't
know how to look for it, then it's likely they'll have some other security
problem. I think (and I believe Theo thinks) this sort of thinking is short
sighted. Take the example of a web application: it's going to need session
identifiers to put into a cookie, you'll want these to be random, and it's not
obvious on the tin for a non-expert that you can't just use the module level
functions in the random module to do this. Other examples are generating API
keys or a password.

Looking on google, the first result for "python random password" is
StackOverflow which suggests:

    ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

However, it was later edited to also include:

    ''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

So it wasn't obvious to the person who answered that question that the random
module's module scoped functions were not appropriate for this use. It appears
that the original answer lasted for roughly 4 years before it was corrected,
so who knows how many people used that in those 4 years.

The second result has someone asking if there is a better way to generate a
random password in Python than:

    import os, random, string

    length = 13
    chars = string.ascii_letters + string.digits + '!@#$%^&*()'
    random.seed = (os.urandom(1024))

    print ''.join(random.choice(chars) for i in range(length))

This person obviously knew that os.urandom existed and that he should use it,
but failed to correctly identify that the random module's module scoped
functions were not what he wanted to use here.
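
For comparison, here is a minimal sketch of a password generator that draws
from the OS CSPRNG through random.SystemRandom rather than the module-level
functions. The function name make_password and its defaults are illustrative
assumptions, not taken from any of the search results. Note also that the
assignment to random.seed in the snippet above only rebinds the name; it never
seeds the generator.

```python
import random
import string

# SystemRandom draws from os.urandom; it cannot be seeded and keeps no
# reproducible state.
_sysrand = random.SystemRandom()


def make_password(length=13, chars=string.ascii_letters + string.digits):
    """Return `length` characters chosen independently from `chars`."""
    return ''.join(_sysrand.choice(chars) for _ in range(length))
```

Calling make_password() returns a 13 character string; pass a different
length or alphabet as needed.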

The third result has this code:

    import string
    import random

    def randompassword():
        chars=string.ascii_uppercase + string.ascii_lowercase + string.digits
        size=8
        return ''.join(random.choice(chars) for x in range(size,12))

I'm not going to keep pasting snippets, but going through the results it is
clear that in the bulk of cases, this search turns up code snippets that
suggest there is likely to be a lot of code out there that is unknowingly using
the random module in a very insecure way. I think this is a failing of the
random.py module to provide an API that guides users to be safe, which was
attempted to be papered over by adding a warning to the documentation; however,
as has been said before, you can't solve a UX problem with documentation.

Then we come to why might we want to not provide a safe random by default for
the folks in the #3 group. As we've seen in the other thread, this basically
boils down to the fact that for a lot of users they don't care about the
security properties and they just want a fast random-esque value. This
particular case is made stronger by the fact that there is a lot of code out
there using Python's random module in a completely safe way that would regress
in a meaningful way if the random module slowed down.

The fact that speed is the primary reason not to give people in #3 a
cryptographically secure source of random by default is where we come back to
the meat of Theo's suggestion. His claim is that invoking os.urandom through
any of the interfaces imposes a performance penalty because it has to round
trip through the kernel crypto sub system for every request. His suggestion is
essentially that we provide an interface to a modern, good, userland
cryptographically secure source of random that is running within the same
process as Python itself. One such example of this is the arc4random function
(which doesn't actually provide ARC4 on OpenBSD, it provides ChaCha, it's not
tied to one specific algorithm) which comes from libc on many platforms.
According to Theo, modern userland CSPRNGs can create random bytes faster than
memcpy which eliminates the argument of speed for why a CSPRNG shouldn't be
the "default" source of randomness.
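
To get a rough feel for the per-call cost being discussed, something like the
following sketch can be run locally. It only compares the stdlib's Mersenne
Twister against random.SystemRandom; it does not measure arc4random or any
other userland CSPRNG, since the stdlib exposes none. The function name
time_sources is made up for this example, and the numbers will vary by
platform and Python version.

```python
import random
import timeit


def time_sources(calls=10000):
    """Return seconds taken for `calls` draws from each source."""
    mt = random.Random()            # Mersenne Twister, runs in-process
    system = random.SystemRandom()  # backed by os.urandom
    return {
        'mt': timeit.timeit(mt.random, number=calls),
        'system': timeit.timeit(system.random, number=calls),
    }


if __name__ == '__main__':
    for name, seconds in sorted(time_sources().items()):
        print('%-6s %.4f s' % (name, seconds))
```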

Thus the proposal is essentially:

* Provide an API to access a modern userland CSPRNG.
* Provide an implementation of random.SomeKindOfRandom that utilizes this.
* Move the MT based implementation of the random module to
  random.DeterministicRandom.
* Deprecate the module scoped functions, instructing people to use the new
  random.SomeKindOfRandom unless they need deterministic random, in which case
  use random.DeterministicRandom.

This can of course be tweaked one way or the other, but that's the general idea
translated into something actionable for Python. I'm not sure exactly how I
feel about it, but I certainly do think that the current situation is confusing
to end users and leaving them in an insecure state, and that at a minimum we
should move MT to something like random.DeterministicRandom and deprecate the
module scoped functions because it seems obvious to me that the idea of a
"default" random function that isn't safe is a footgun for users.

As an additional consideration, there are security experts who believe that
userland CSPRNGs should not be used at all. One of those is Thomas Ptacek who
wrote a blog post [1] on the subject. In this, Thomas makes the case that a
userland CSPRNG pretty much always depends on the cryptographic security of
the system random, but that it itself may be broken which means you're adding
a second, single point of failure where a mistake can cause you to get
non-random data out of the system. I had asked Theo about this, and he stated
that he disagreed with Thomas about never using a userland CSPRNG and in his
opinion that blog post was mostly warning people away from using something like
MT in the userland and away from /dev/random (which is often the cause of
people reaching for MT because /dev/random blocks which makes programs even
slower).

It seems to boil down to, do we want to try to protect users by default or at
least make it more obvious in the API which one they want to use (I think yes),
and if so do we think that /dev/urandom is "fast enough" for most people in
group #3 and if not, do we agree with Theo that a modern userland CSPRNG is
safe enough to use, or do we agree with Thomas that it's not and if we think
that it is, do we use arc4random and what do we do on systems that don't have
a modern userland CSPRNG in their libc.

[1] http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From abarnert at yahoo.com  Thu Sep 10 02:03:17 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 17:03:17 -0700
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <1441841095.2587236.379354345.340FCF95@webmail.messagingengine.com>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
 <55F0AC99.8030408@biologie.uni-freiburg.de>
 <C2DA14A1-DC2F-4F66-B4D6-EB3D63824A63@yahoo.com>
 <1441841095.2587236.379354345.340FCF95@webmail.messagingengine.com>
Message-ID: <461D4C7C-6C32-480D-B065-295A623E11D7@yahoo.com>

On Sep 9, 2015, at 16:24, random832 at fastmail.us wrote:
> 
>> On Wed, Sep 9, 2015, at 18:39, Andrew Barnert via Python-ideas wrote:
>> I believe he posted a more detailed version of the idea on one of the
>> other spinoff threads from the f-string thread, but I don't have a link.
>> But there are lots of possibilities, and if you want to start
>> bikeshedding, it doesn't matter that much what his original color was.
>> For example, here's a complete proposal:
>> 
>>    class MyJoiner:
>>        def __init__(self, value):
>>            self.value = value
>>        def __format__(self, spec):
>>            return spec.join(map(str, self.value))
>>    string.register_converter('join', MyJoiner)
> 
> Er, I wanted it to be something more like
> 
> def __format__(self, spec):
>   sep, fmt = # 'somehow' break up spec into two parts

I covered later in the same message how this simple version could be extended to a smarter version that does that, or even more, without requiring any further changes to str.format. I just wanted to show the simplest version first, and then show that designing for that doesn't lose any flexibility.

> And I wasn't the one who actually proposed user-registered converters;
> I'm not sure who did.

Well, that does make it a bit harder to search for... But anyway, I think the idea is obvious enough once someone's mentioned it that it only matters if everyone decides we should do it, when we want to figure out who to give the credit to.

> At one point in the f-string thread I suggested
> using a _different_ special !word for stuff like a string that can be
> inserted into HTML without quoting. I'm also not 100% sure how good an
> idea it is (since it means either using global state or moving
> formatting to a class instead of str).

I don't see why global state is more of a problem here than for any other global registry (sys.modules, pickle/copy, ABCs,  registries, etc.). In fact, it seems less likely that, e.g., a multithreaded app container would run into problems with this than with most of those other things, not more likely. And the same ideas for solving those problems (subinterpreters, better IPC so multithreaded app containers aren't necessary, easier switchable contexts, whatever) seem like they'd solve this one just as easily.

And meanwhile, the alternative seems to be having something similar, but not exposing it publicly, and just baking in a handful of hardcoded converters for join, html, re-escape, etc., and I don't see why str should know about all of those things, or why extending that set when we realize that we forgot about shlex should require a patch to str and a new Python version.

> The Joiner class wouldn't have to exist as a builtin, it could be
> private to the format function.

If it's custom-registerable, it can be on PyPI, or in the middle of your app, although of course there could be some converters, maybe including your Joiner, somewhere in the stdlib, or even private to format, as well.


From abarnert at yahoo.com  Thu Sep 10 02:19:20 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 17:19:20 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
Message-ID: <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>

Deprecating the module-level functions has one problem for backward compatibility: if you're using random across multiple modules, changing them all from this:

    import random

... to this:

    from random import DeterministicRandom
    random = DeterministicRandom()

... gives a separate MT for each module. You can work around that by, e.g., providing your own myrandom.py that does that and then using "from myrandom import random" everywhere, or by stashing a random_inst inside the random module or builtins or something and only creating it if it doesn't exist, etc., but all of these are things that people will rightly complain about.
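
A rough sketch of the first workaround, assuming a hypothetical project-local
module named myrandom.py. random.Random stands in for the proposed
DeterministicRandom, which does not exist in the stdlib.

```python
# myrandom.py (hypothetical module name)
import random as _stdlib_random

# One Mersenne Twister instance shared by every importer.  random.Random
# stands in here for the proposed DeterministicRandom class.
random = _stdlib_random.Random()
```

Every module that does "from myrandom import random" then shares the same
generator state, so a single random.seed(...) call affects all of them, as
with today's module-level functions.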

One possible solution is to make DeterministicRandom a module instead of a class, and move all the module-level functions there, so people can just change their import to "from random import DeterministicRandom as random". (Or, alternatively, give it classmethods that create a singleton just like the module global.)

For people who decide they want to switch to SystemRandom, I don't think it's as much of a problem, as they probably won't care that they have a separate instance in each module. (And I don't think there's any security problem with using multiple instances, but I haven't thought it through...) So, the change is probably only needed in DeterministicRandom.

There are hopefully better solutions than that. But I think some solution is needed. People who have existing code (or textbooks, etc.) that do things the "wrong" way and get a DeprecationWarning should be able to easily figure out how to make their code correct.


From random832 at fastmail.com  Thu Sep 10 03:25:34 2015
From: random832 at fastmail.com (Random832)
Date: Wed, 09 Sep 2015 21:25:34 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
Message-ID: <m2mvwvoyk1.fsf@fastmail.com>

Andrew Barnert via Python-ideas
<python-ideas at python.org> writes:

> You can work around that by,
> e.g., providing your own myrandom.py that does that and then using
> "from myrandom import random" everywhere, or by stashing a random_inst
> inside the random module or builtins or something and only creating it
> if it doesn't exist, etc., but all of these are things that people
> will rightly complain about.

Of course, this brings to mind the fact that there's *already* an
instance stashed inside the random module.

At that point, you might as well just keep the module-level functions,
and rewrite them to be able to pick up on it if you replace _inst
(perhaps suitably renamed as it would be a public variable) with an
instance of a different class.

Proof-of-concept implementation:

class _method:
    def __init__(self, name):
        self.__name__ = name
    def __call__(self, *args, **kwargs):
        return getattr(_inst, self.__name__)(*args, **kwargs)
    def __repr__(self):
        return "<random method wrapper " + repr(self.__name__) + ">"

_inst = Random()
seed = _method('seed')
random = _method('random')
...etc...


From steve at pearwood.info  Thu Sep 10 03:27:07 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 10 Sep 2015 11:27:07 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <1441829361.2883366.379212985.164412ED@webmail.messagingengine.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <1441829361.2883366.379212985.164412ED@webmail.messagingengine.com>
Message-ID: <20150910012707.GN19373@ando.pearwood.info>

On Wed, Sep 09, 2015 at 04:09:21PM -0400, random832 at fastmail.us wrote:
> On Wed, Sep 9, 2015, at 15:07, Steven D'Aprano wrote:
> > Not really. Look at the subject line. It doesn't say "should we change 
> > from MT to arc4random?", it asks if the default random number generator 
> > should be secure. The only reason we are considering the change from MT 
> > to arc4random is to make the PRNG cryptographically secure. "Secure" is 
> > a moving target, what is secure today will not be secure tomorrow.
> 
> Right, but we are discussing making it secure today.

No, *you* are discussing making it secure today. The rest of us are
discussing making it secure for all time.


-- 
Steve

From donald at stufft.io  Thu Sep 10 03:30:16 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 9 Sep 2015 21:30:16 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
Message-ID: <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>

On September 9, 2015 at 8:01:17 PM, Donald Stufft (donald at stufft.io) wrote:
>  
> It seems to boil down to, do we want to try to protect users by default or at
> least make it more obvious in the API which one they want to use (I think yes),
> and if so do we think that /dev/urandom is "fast enough" for most people in
> group #3 and if not, do we agree with Theo that a modern userland CSPRNG is
> safe enough to use, or do we agree with Thomas that it's not and if we think
> that it is, do we use arc4random and what do we do on systems that don't have
> a modern userland CSPRNG in their libc.
>  
>

Ok, I've talked to an honest to god cryptographer as well as some other smart
folks!

Here's the general gist:

Using a userland CSPRNG like arc4random is not advisable for things that you
absolutely need cryptographic security for (this is group #2 from my original
email). These people should use os.urandom or random.SystemRandom as they
should be doing now. In addition os.urandom or random.SystemRandom is
probably fast enough for most use cases of the random.py module, however it is
true that using os.urandom/random.SystemRandom would be slower than MT. It is
reasonable to use a userland CSPRNG as a "default" source of randomness or in
cases where people care about speed but maybe not about security and don't
need determinism.
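
As a rough illustration of that split (my own sketch, not something the
cryptographer spelled out), security-sensitive values can be drawn from
random.SystemRandom while a seeded random.Random covers the repeatable
case. The helper names make_token and simulate are made up for this example.

```python
import random

# SystemRandom is backed by os.urandom and cannot be seeded.
_sysrand = random.SystemRandom()

def make_token(nbytes=16):
    """Hypothetical helper: hex token drawn from the OS CSPRNG."""
    return "".join("%02x" % _sysrand.randrange(256) for _ in range(nbytes))

def simulate(seed, n=5):
    """Hypothetical helper: repeatable pseudo-random floats from MT."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```

Calling simulate() twice with the same seed gives the same list, while
make_token() should never repeat in practice.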

However, they've said that the primary benefit in using a userland CSPRNG for
a faster cryptographically secure source of randomness is if we can make it the
default source of randomness for a "probably safe depending on your app" safety
net for people who didn't read or understand the documentation. This would make
most uses of random.random and friends secure but not deterministic.

If we're unwilling to change the default, but we are willing to deprecate the
module scoped functions and force users to make a choice between
random.SystemRandom and random.DeterministicRandom then there is unlikely to
be much benefit to also adding a userland CSPRNG into the mix since there's no
class of people who are using an ambiguous "random" that we don't know if they
need it to be secure or deterministic/fast.

So I guess my suggestion would be, let's deprecate the module scope functions
and rename random.Random to random.DeterministicRandom. This absolves us of
needing to change the behavior of people's existing code (besides deprecating
it) and we don't need to decide if a userland CSPRNG is safe or not while still
moving us to a situation that is far more likely to have users doing the right
thing.
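
To make the suggestion concrete, here is a minimal sketch of what a
deprecated module-scope function could look like. It assumes the proposed
name DeterministicRandom (which does not exist today; it is just an alias
for random.Random here), and deprecated_random is a stand-in, not a real
stdlib function.

```python
import random
import warnings

# Hypothetical name from the proposal; today this class is random.Random.
DeterministicRandom = random.Random

_default = DeterministicRandom()

def deprecated_random():
    """Stand-in for a deprecated module-level random.random()."""
    warnings.warn(
        "module-level random() is deprecated; use SystemRandom() or "
        "DeterministicRandom() explicitly",
        DeprecationWarning,
        stacklevel=2,
    )
    return _default.random()
```

Existing callers keep working, but each call reports a DeprecationWarning
that points at the two explicit choices.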

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From brenbarn at brenbarn.net  Thu Sep 10 03:50:42 2015
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 09 Sep 2015 18:50:42 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <55F0E1F2.6040709@brenbarn.net>

On 2015-09-09 13:17, Guido van Rossum wrote:
> Jukka wrote up a proposal for structural subtyping. It's pretty good.
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867

	I'm not totally hip to all the latest typing developments, but I'm not 
sure I fully understand the benefit of this protocol concept.  At the 
beginning it says that classes have to be explicitly marked to support 
these protocols.  But why is that?  Doesn't the existing 
__subclasshook__ already allow an ABC to use any criteria it likes to 
determine if a given class is considered a subclass?  So couldn't ABCs 
like the ones we already have inspect the type annotations and decide a 
class "counts" as an iterable (or whatever) if it defines the right 
methods with the right type hints?
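
For reference, this is roughly the kind of __subclasshook__ check I have in
mind. It is only a sketch: SupportsClose and Resource are made-up names, and
the check looks only at method presence, not at type hints.

```python
from abc import ABC, abstractmethod

class SupportsClose(ABC):
    """Hypothetical protocol-like ABC: anything with a close() method."""

    @abstractmethod
    def close(self):
        ...

    @classmethod
    def __subclasshook__(cls, C):
        if cls is SupportsClose:
            # Structural check: look for close() anywhere in the MRO.
            if any("close" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented

class Resource:  # note: does not inherit from SupportsClose
    def close(self):
        pass
```

With this, issubclass(Resource, SupportsClose) is true at runtime even
though Resource was never explicitly marked or registered.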

-- 
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."
    --author unknown

From abarnert at yahoo.com  Thu Sep 10 03:50:43 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 18:50:43 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <m2mvwvoyk1.fsf@fastmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
Message-ID: <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>

On Sep 9, 2015, at 18:25, Random832 <random832 at fastmail.com> wrote:
> 
> Andrew Barnert via Python-ideas
> <python-ideas at python.org> writes:
> 
>> You can work around that by,
>> e.g., providing your own myrandom.py that does that and then using
>> "from myrandom import random" everywhere, or by stashing a random_inst
>> inside the random module or builtins or something and only creating it
>> if it doesn't exist, etc., but all of these are things that people
>> will rightly complain about.
> 
> Of course, this brings to mind the fact that there's *already* an
> instance stashed inside the random module.
> 
> At that point, you might as well just keep the module-level functions,
> and rewrite them to be able to pick up on it if you replace _inst
> (perhaps suitably renamed as it would be a public variable) with an
> instance of a different class.

The whole point is to make people using the top-level functions see a DeprecationWarning that leads them to make a choice between SystemRandom and DeterministicRandom. Just making inst public (and dynamically switchable) doesn't do that, so it doesn't solve anything.

However, it seems like there's a way to extend it to do that:

First, rename Random to DeterministicRandom. Then, add a subclass called Random that raises a DeprecationWarning whenever its methods are called. Then preinitialize inst to Random(), just as we already do. Existing code will work, but with a warning. And the text of that warning or the help it leads to or the obvious google result or whatever can just suggest "add random.inst = random.DeterministicRandom() or random.inst = random.SystemRandom() at the start of your program". That has most of the benefit of deprecating the top-level functions, without the cost of the solution being non-obvious (and the most obvious solution being wrong for some use cases).

Of course it adds the cost of making the module slower, and also more complex. Maybe a better solution would be to add a random.set_default_instance function that replaced all of the top-level functions with bound methods of the instance (just like what's already done at startup in random.py)? That's simple, and doesn't slow down anything, and it seems like it makes it more clear what you're doing than setting random.inst.
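
A minimal sketch of that idea, assuming a function named
set_default_instance (hypothetical; nothing like it exists in random.py
today) and rebinding only a handful of the top-level names for brevity:

```python
import random

# Names rebound here are a subset chosen for illustration.
_NAMES = ("random", "randrange", "randint", "choice", "shuffle",
          "sample", "uniform", "seed", "getrandbits")

def set_default_instance(inst):
    """Hypothetical: point the module-level functions at methods of inst."""
    for name in _NAMES:
        setattr(random, name, getattr(inst, name))

# Example: make the module-level functions deterministic and seeded.
set_default_instance(random.Random(42))
first = random.random()
```

After the call, random.random() and friends are plain bound methods of the
chosen instance, so there is no extra indirection per call.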

From steve at pearwood.info  Thu Sep 10 03:55:05 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 10 Sep 2015 11:55:05 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
Message-ID: <20150910015505.GO19373@ando.pearwood.info>

On Wed, Sep 09, 2015 at 04:15:31PM -0700, Nathaniel Smith wrote:

> The real reasons to prefer non-cryptographic RNGs are the auxiliary
> features like determinism, speed, jumpahead, multi-thread
> friendliness, etc. But the stdlib random module doesn't really provide
> any of these (except determinism in strictly limited cases), so I'm
> not sure it matters much.

The default MT is certainly deterministic, and although only the output 
of random() itself is guaranteed to be reproducible, the other methods 
are *usually* stable in practice.

There's a jumpahead method too, and for use with multiple threads, 
you can (and should) create your own instances that don't share 
state. I call that "multi-thread friendliness" :-)
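
Something along these lines is what I mean by separate instances; a sketch
only, with arbitrary seeds and made-up names (worker, results):

```python
import random
import threading

results = {}

def worker(name, seed):
    # Each thread gets a private generator, so no state is shared.
    rng = random.Random(seed)
    results[name] = [rng.random() for _ in range(3)]

threads = [threading.Thread(target=worker, args=("t%d" % i, 1000 + i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread's values depend only on its own seed, regardless of how the
threads are scheduled.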

I think Paul Moore's position is a good one. Anyone writing crypto code 
without reading the docs and understanding what they are doing is 
surely making more mistakes than just using the wrong PRNG. There may be 
a good argument for adding arc4random support to the stdlib, but making 
it the default (with the disadvantages discussed, breaking backwards 
compatibility, surprising non-crypto users, etc.) won't fix the broken 
crypto code. It will just give people a false sense of security and 
encourage them to ignore the docs and write broken crypto code.



-- 
Steve

From tim.peters at gmail.com  Thu Sep 10 03:55:06 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 20:55:06 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F0BD0F.10508@sdamon.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <55F0BD0F.10508@sdamon.com>
Message-ID: <CAExdVNmWWWt_iPNPG2e52pKX-AUdeAusCbCFiK0LrqibSgL_xA@mail.gmail.com>

[Alexander Walters <tritium-list at sdamon.com>]
> In a word - No.
>
> There is zero reason for people doing crypto to use the random module,
> therefore we should not change the random module to be cryptographically
> secure.
>
> Don't break things and slow my code down by default for dubious reasons,
> please.

Would your answer change if a crypto generator were _faster_ than MT?
MT isn't speedy by modern standards, and is cache-hostile (about 2500
bytes of mutable state).

Not claiming a crypto hash _would_ be faster.  But it is possible.
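
For anyone who wants to check the state-size figure, here is a quick
sketch. It relies on the layout CPython's getstate() returns today (a
version number, a tuple of 32-bit words plus a position index, and the
cached Gaussian value), which is an implementation detail.

```python
import random

version, words, gauss_next = random.Random(0).getstate()

n_words = len(words) - 1      # last entry is the position index
state_bytes = n_words * 4     # each word holds 32 bits
```

That comes to 624 words, i.e. 2496 bytes, which is the "about 2500" above.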

From brenbarn at brenbarn.net  Thu Sep 10 04:07:05 2015
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 09 Sep 2015 19:07:05 -0700
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
 <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>
Message-ID: <55F0E5C9.6030509@brenbarn.net>

On 2015-09-09 14:50, Andrew Barnert via Python-ideas wrote:
> Well, have you read the answers given by Nick, me, and others earlier
> in the thread? If so, what do you disagree with? You've only
> addressed one point (that % is faster than {} for simple cases--and
> your solution is just "make {} faster", which may not be possible
> given that it's inherently more hookable than % and therefore
> requires more function calls...). What about formatting headers for
> ASCII wire protocols, sharing tables of format strings between
> programming languages (e.g., for i18n), or any of the other reasons
> people have brought up?

	This is getting off on a tangent, but I don't see most of those as super 
compelling.  Any programming language can use whatever formatting scheme 
it likes.  Keeping %-substitutions around helps in sharing format 
strings only with other languages that use exactly the same formatting 
style.  So it's not like % has any intrinsic gain; it just happens to 
interoperate with some other particular stuff.  That's nice, but I don't 
think it makes sense to keep things in Python just so it can 
interoperate in specific ways with specific other languages that use 
less-readable syntax.

	To me the main advantage of {} is it's more readable.  Readability is 
relevant in any application.  The other things you're mentioning seem to 
be basically about making certain particular applications easier, and I 
see that as less important.  In other words, if to write a wire protocol 
or share format strings you have to write your own functions to do stuff 
in a more roundabout way instead of using a (or the!) built-in 
formatting mechanism, I'm fine with that if it streamlines the built-in 
formatting mechanism(s).

	(The main DISadvantage of {} so far is that its readability is limited 
because you have to pass in all that stuff with the format call at the 
end.  I think if one of these string-interpolation PEPs settles down and 
we get something like "I like {this} and {that}" --- where the names are 
drawn directly from the enclosing scope without having to pass them in 
--- that will be a huge win over both the existing {} formatting and the 
% formatting.)
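
To show what I mean about readability, here is the same string built three
ways with today's tools (the variable names are just for illustration):

```python
this, that = "spam", "eggs"

# %-style: positional placeholders, arguments after the operator.
percent_style = "I like %s and %s" % (this, that)

# {}-style: positional placeholders, arguments passed to format().
format_style = "I like {} and {}".format(this, that)

# {}-style with names: readable template, but the names must be repeated.
named_style = "I like {this} and {that}".format(this=this, that=that)
```

The named version reads best inside the string, and the repetition in the
format() call is exactly the part an interpolation PEP would remove.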

-- 
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no
path, and leave a trail."
    --author unknown

From steve at pearwood.info  Thu Sep 10 04:11:08 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 10 Sep 2015 12:11:08 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmWWWt_iPNPG2e52pKX-AUdeAusCbCFiK0LrqibSgL_xA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <55F0BD0F.10508@sdamon.com>
 <CAExdVNmWWWt_iPNPG2e52pKX-AUdeAusCbCFiK0LrqibSgL_xA@mail.gmail.com>
Message-ID: <20150910021108.GP19373@ando.pearwood.info>

On Wed, Sep 09, 2015 at 08:55:06PM -0500, Tim Peters wrote:
> [Alexander Walters <tritium-list at sdamon.com>]
> > In a word - No.
> >
> > There is zero reason for people doing crypto to use the random module,
> > therefore we should not change the random module to be cryptographically
> > secure.
> >
> > Don't break things and slow my code down by default for dubious reasons,
> > please.
> 
> Would your answer change if a crypto generator were _faster_ than MT?
> MT isn't speedy by modern standards, and is cache-hostile (about 2500
> bytes of mutable state).
> 
> Not claiming a crypto hash _would_ be faster.  But it is possible.

If the crypto PRNG were comparable in speed to what we have now (not 
significantly slower), or faster, *and* gave reproducible results with 
the same seed, *and* had no known/detectable statistical biases, and we 
could promise that those properties would continue to hold even when the 
state of the art changed and we got a new default crypto PRNG, then I'd 
still be -0.5 on the change due to the "false sense of security" factor.

As I've already mentioned in another comment, I'm with Paul Moore -- I 
think anyone foolish/ignorant/lazy/malicious enough to use the default 
PRNG for crypto is surely making more than one mistake, and fixing that 
one thing for them will just give people a false sense of security.



-- 
Steve

From tim.peters at gmail.com  Thu Sep 10 04:23:23 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 21:23:23 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150910015505.GO19373@ando.pearwood.info>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
Message-ID: <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>

[Nathaniel Smith]
>> The real reasons to prefer non-cryptographic RNGs are the auxiliary
>> features like determinism, speed, jumpahead, multi-thread
>> friendliness, etc. But the stdlib random module doesn't really provide
>> any of these (except determinism in strictly limited cases), so I'm
>> not sure it matters much.

[Steven D'Aprano]
> The default MT is certainly deterministic, and although only the output
> of random() itself is guaranteed to be reproducible, the other methods
> are *usually* stable in practice.
>
> There's a jumpahead method too,

Not in Python.  There was for the ancient Wichmann-Hill generator, but
not for MT.  A detailed sketch of ways to implement efficient
jumpahead for MT is given here:

    A Fast Jump Ahead Algorithm for Linear Recurrences
        in a Polynomial Space
    http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/jump-seta-lfsr.pdf

But because MT isn't a _simple_ recurrence, they're all depressingly
complex :-(  For Wichmann-Hill it was just a few integer modular
exponentiations.


> and for use with multiple threads, you can (and should) create your
> own instances that don't share state. I call that "multi-thread friendliness" :-)

That's what people do, but MT's creators don't recommend it anymore
(IIRC, their FAQ did recommend it some years ago).  Then they switched
to recommending using jumpahead with a large value (to guarantee
different instances' states would never overlap).  Now (well, last I
saw) they recommend a parameterized scheme creating a distinct variant
of MT per thread (not just different state, but a different (albeit
related) algorithm):

    http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf

So I'd say it's clear as mud ;-)

> ...

From tim.peters at gmail.com  Thu Sep 10 05:23:22 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 22:23:22 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
Message-ID: <CAExdVNkXYAVG2JdzRXyYyFQ0av3j_xm0==cvV6MoHaRok8PqXg@mail.gmail.com>

[Nathaniel Smith <njs at pobox.com>]
> Yeah, equidistribution is not a guarantee of anything on its own. For
> example, an integer counter modulo 2**(623*32) is equidistributed to
> 32 bits across 623 dimensions, just like the Mersenne Twister.

The analogy is almost exact.  If you view MT's state as a single
19937-bit integer (=623*32 + 1), then MT's state "simply" cycles
through a specific permutation of

    range(1, 2**19937)

(with a different orbit consisting solely of 0).  That was the "hard
part" to prove.  Everything about equidistribution was more of an
observation following from that.  Doesn't say anything about
distribution "in the small" (across small slices), alas.


> ... And hey, if
> arc4random *does* mess up your simulation, then congratulations, your
> simulation is publishable as a cryptographic attack and will probably
> get written up in the NYTimes :-).

Heh.  In the NYT or a security wonk's blog, maybe.  But why would a
reputable journal believe me?  By design, the results of using the
OpenBSD arc4random can't be reproduced ;-)


> The real reasons to prefer non-cryptographic RNGs are the auxiliary
> features like determinism, speed, jumpahead, multi-thread
> friendliness, etc. But the stdlib random module doesn't really provide
> any of these (except determinism in strictly limited cases), so I'm
> not sure it matters much.

Python's implementation of MT has never changed anything about the
sequence produced from a given seed state, and indeed gives the same
sequence from the same seed state as every other correct
implementation of the same flavor of MT.  That is "strictly limited",
to perfection ;-)  At a higher level, depends on the app.  People are
quite creative at defeating efforts to be helpful ;-)
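
A minimal sketch of the determinism in question, using an arbitrary seed:
two instances seeded alike produce the same sequence, and a saved state can
be replayed.

```python
import random

a = random.Random(2015)
b = random.Random(2015)
same = [a.random() for _ in range(5)] == [b.random() for _ in range(5)]

saved = a.getstate()
first = a.random()
a.setstate(saved)        # rewind to the saved state
replayed = a.random()    # repeats the value drawn after saving
```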


>> The Twister's provably perfect equidistribution across its whole
>> period also has its scary sides.  For example, run random.random()
>> often enough, and it's _guaranteed_ you'll eventually reach a state
>> where the output is exactly 0.0 hundreds of times in a row.  That will
>> happen as often as it "should happen" by chance, but that's scant
>> comfort if you happen to hit such a state.  Indeed, the Twister was
>> patched relatively early in its life to try to prevent it from
>> _starting_ in such miserable states.   Such states are nevertheless
>> still reachable from every starting state.

> This criticism seems a bit unfair though

Those are facts, not criticisms.  I like the Twister very much.  But
those who have no fear of it are dreaming - while those who have
significant fear of it are also dreaming.  It's my job to ensure
nobody is either frightened or complacent ;-)


> -- even a true stream of random bits (e.g. from a true unbiased
> quantum source) has this property,

But good generators with astronomically smaller periods do not.  In a
sense, MT made it possible to get results "far more random" than any
widely used deterministic generator before it.  The patch I mentioned
above was to fix real problems in real life, where people using simple
seeding schemes systematically ended up with results so transparently
ludicrous nobody could possibly accept them for any purpose.

"The fix" consisted of using scrambling functions to spray the bits in
the user-*supplied* seed all over the place, in a pseudo-random way,
to probabilistically ensure "the real" state wouldn't end up with "too
many" zero bits.  "A lot of zero bits" tends to persist across MT
state transitions for a long time.


> and trying to avoid this happening would introduce bias that really
> could cause problems in practice.

For example, nobody _needs_ a generator capable of producing hundreds
of 0.0 in a row.  Use a variant even of MT with a much smaller period,
and that problem goes away, with no bad effects on any app.

Push that more, and what many Monte Carlo applications _really_ want
is "low discrepancy":  some randomness is essential, but becomes a
waste of cycles and assaults "common sense" if it gets too far from
covering the domain more-or-less uniformly.  So there are many ways
known of generating "quasi-random" sequences instead, melding a notion
of randomness with guarantees of relatively low discrepancy (low
deviation from uniformity).  Nothing works best for all purposes -
except Python.
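
One of the simplest such quasi-random constructions is the van der Corput
sequence, which fills [0, 1) ever more evenly. This sketch is mine, picked
as a small example of the idea; it is not something the random module
provides.

```python
def van_der_corput(n, base=2):
    """Return the n-th term (n >= 1) of the van der Corput sequence."""
    q, denom = 0.0, 1.0
    while n:
        # Peel off the digits of n, least significant first, and mirror
        # them around the radix point.
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q

points = [van_der_corput(i) for i in range(1, 9)]
# points == [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625]
```

No run of outputs ever clumps the way a truly random stream occasionally
will, which is the property those Monte Carlo applications are after.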


> A good probabilistic program is one that has a high probability
> of returning some useful result, but they always have some
> low probability of returning something weird. So this is just
> saying that most people don't understand probability.

Or nobody does.  Probability really is insufferably subtle.  Guido
should ban it.

> Which is true, but there isn't much that the random module can do
> about it :-)

Python should just remove it.  It's an "attractive nuisance" to everyone ;-)

From stephen at xemacs.org  Thu Sep 10 05:25:29 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 10 Sep 2015 12:25:29 +0900
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
Message-ID: <87613j0xcm.fsf@uwakimon.sk.tsukuba.ac.jp>

Nathaniel Smith writes:

 > That seems more productive in the short run than trying to
 > get everyone to stop typing "pip" :-).

FWIW, I did as soon as I realized python_i_want_to_install -m pip
worked; it's obvious that it DTRTs, and I felt like I'd just dropped
the hammer I'd been whacking my head with.

 > (Though I do agree that having pip as a separate command from
 > python is a big mess -- another case where this comes up is the
 > need for pip versus pip3.)

Ah, that's the name of my hammer, although it's come up in 3.2 vs 3.3
as well.

 > It sounds like this is another place where in the short term, it would
 > help a lot if pip at startup took a peek at $PATH and issued some
 > warnings or errors if it detected the most common types of
 > misconfiguration? (E.g. the first python/python3 in $PATH does not
 > match the one being used to run pip.)

I don't understand the logic for trying to save the pip command by
making its environment checking more complex than the app itself.
"python -m pip" suffers from no problems that pip itself doesn't
suffer from, and is far more reliable, without blaming the user.
Sure, people used to using a pip command shouldn't be deprived of it,
but I'll never miss it, and I don't see why anybody who isn't already
using it would miss it.

The only problem with "python -m pip" is discoverability/memorability,
and the fact that interactive use of "from pip import main" is not
properly supported IIUC (not to mention clumsy).  Thus the proposal
for a builtin named "install" or similar.
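
Until something like that exists, the reliable spelling from inside Python
is to build the "python -m pip" command for the running interpreter. A
sketch, where pip_command is a made-up helper and "requests" is only an
example package name:

```python
import sys

def pip_command(*args):
    """Hypothetical helper: argv that runs pip under this interpreter."""
    return [sys.executable, "-m", "pip"] + list(args)

cmd = pip_command("install", "--user", "requests")
# To actually run it: subprocess.check_call(cmd)
```

Because sys.executable is the interpreter currently running, the package
lands in the Python you are using, whatever "pip" on $PATH points at.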


From stephen at xemacs.org  Thu Sep 10 05:32:23 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 10 Sep 2015 12:32:23 +0900
Subject: [Python-ideas] High time for a builtin function to
	manage	packages (simply)?
In-Reply-To: <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
Message-ID: <874mj30x14.fsf@uwakimon.sk.tsukuba.ac.jp>

Andrew Barnert via Python-ideas writes:

 > If StackOverflow/SU/TD questions are any indication, a
 > disproportionate number of these people are Mac users using Python
 > 2.7, who have installed a second Python 2.7 (or, in some cases, two
 > of them) alongside Apple's.

Often enough it's the other way around: the distro catches up to the
user as they upgrade.  I didn't even realize "10.10 Yosemite" had 2.7,
this box has been upgraded from "10.7 Lion" or so, and I just use
MacPorts 2.7 all the time.  I haven't worried about what Apple
supplies as /usr/bin/python in 6 or 7 years.

I don't know if this matters to the effect on pip, but I thought it
should be mentioned.

From njs at pobox.com  Thu Sep 10 05:32:45 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 20:32:45 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
Message-ID: <CAPJVwBmUFpkmmptAmOJOcpp2i5t_SdhMh0WFbprhpwa+gfrrXQ@mail.gmail.com>

[Sorry to Tim and Steven if they get multiple copies of this... Gmail
recently broke their Android app's handling of from addresses, so
resending, sigh]

On Sep 9, 2015 7:24 PM, "Tim Peters" <tim.peters at gmail.com> wrote:
[...]
> [Steven D'Aprano]
[...]
> > and for use with multiple threads, you can (and should) create your
> > own instances that don't share state. I call that "multi-thread
> > friendliness" :-)
>
> That's what people do, but MT's creators don't recommend it anymore
> (IIRC, their FAQ did recommend it some years ago).  Then they switched
> to recommending using jumpahead with a large value (to guarantee
> different instances' states would never overlap).  Now (well, last I
> saw) they recommend a parameterized scheme creating a distinct variant
> of MT per thread (not just different state, but a different (albeit
> related) algorithm):
>
>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf
>
> So I'd say it's clear as mud ;-)

Yeah, the independent-seed-for-each-thread approach is an option with any
RNG, but just like people feel better if they have a 100% certified
guarantee that the RNG output in a single thread will pass through every
combination of possible values (if you wait some cosmological time), they
also feel better if there is some 100% certified guarantee that the RNG
values in two threads will also be uncorrelated with each other.

With something like MT, if two threads did end up with nearby seeds, then
that would be bad: each thread individually would see values that looked
like high quality randomness, but if you compared across the two threads,
they would be identical modulo some lag. So all the nice theoretical
analysis of the single threaded stream falls apart.

However, for two independently seeded threads to end up anywhere near each
other in the MT state space requires that you have picked two numbers
between 0 and 2**19937 and gotten values that were "close". Assuming your
seeding procedure is functional at all, then this is not a thing that will
ever actually happen in this universe. So AFAICT the rise of explicitly
multi-threaded RNG designs is one of those fake problems that exists only
so people can write papers about solving it. (Maybe this is uncharitable.)
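
A minimal sketch of the independent-seed-per-thread approach, using only
the standard library. The helper names (new_thread_rng, thread_rng) are
made up for illustration, and drawing 32 bytes from os.urandom is just one
assumption about what a "functional" seeding procedure might look like:

```python
import os
import random
import threading


def new_thread_rng():
    """Return a fresh Mersenne Twister instance seeded from OS entropy."""
    # 32 bytes of OS entropy; two calls picking "close" seeds is not a
    # realistic event.
    seed = int.from_bytes(os.urandom(32), "big")
    return random.Random(seed)


_local = threading.local()


def thread_rng():
    """Return the calling thread's private generator, creating it on first use."""
    rng = getattr(_local, "rng", None)
    if rng is None:
        rng = _local.rng = new_thread_rng()
    return rng
```

Each thread then calls thread_rng().random() (or any other Random method)
and never touches the shared module-level generator.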

So there exist RNG designs that handle multi-threading explicitly, and it
shows up on feature comparison checklists. I don't think it should really
affect Python's decisions at all though.

-n

From tim.peters at gmail.com  Thu Sep 10 05:35:55 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 22:35:55 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
Message-ID: <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>

[Nathaniel Smith <njs at vorpus.org>]
> Yeah, the independent-seed-for-each-thread approach works for any RNG, but
> just like people feel better if they have a 100% certified guarantee that
> the RNG output in a single thread will pass through every combination of
> possible values (if you wait some cosmological time), they also feel better
> if there is some 100% certified guarantee that the RNG values in two threads
> will also be uncorrelated with each other.
>
> With something like MT, if two threads did end up with nearby seeds, then
> that would be bad: each thread individually would see values that looked
> like high quality randomness, but if you compared across the two threads,
> they would be identical modulo some lag. So all the nice theoretical
> analysis of the single threaded stream falls apart.
>
> However, for two independently seeded threads to end up anywhere near each
> other in the MT state space requires that you have picked two numbers
> between 0 and 2**19937 and gotten values that were "close". Assuming your
> seeding procedure is functional at all, then this is not a thing that will
> ever actually happen in this universe.

I think it's worse than that.  MT is based on a linear recurrence.
Take two streams "far apart" in MT space, and their sum also satisfies
the recurrence.  So a possible worry about a third stream isn't _just_
about correlation or overlap with the first two streams, but,
depending on the app, also about correlation/overlap with the sum of
the first two streams.  Move to N streams, and there are O(N**2)
direct sums to worry about, and then sums of sums, and ...
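
To make the linearity point concrete, here is a toy sketch. The recurrence
below is NOT Mersenne Twister; it is a made-up 32-bit generator built only
from XORs and shifts, which is the property that matters here. "Sum" is
taken over GF(2), i.e. bitwise XOR:

```python
import random

MASK = 0xFFFFFFFF  # work with 32-bit words


def linear_stream(seed_words, n):
    """Toy generator: each new word is an XOR of shifted earlier words."""
    state = list(seed_words)  # needs at least 7 words of seed
    out = []
    for _ in range(n):
        new = state[-7] ^ (state[-3] >> 1) ^ ((state[-1] << 3) & MASK)
        state.append(new)
        out.append(new)
    return out


rng = random.Random(0)
seed_a = [rng.getrandbits(32) for _ in range(7)]
seed_b = [rng.getrandbits(32) for _ in range(7)]
seed_sum = [a ^ b for a, b in zip(seed_a, seed_b)]

stream_a = linear_stream(seed_a, 20)
stream_b = linear_stream(seed_b, 20)
stream_sum = linear_stream(seed_sum, 20)

# The XOR of two valid streams is exactly the stream started from the
# XOR of their seeds, so it satisfies the same recurrence.
assert stream_sum == [a ^ b for a, b in zip(stream_a, stream_b)]
```

So a third stream that happened to start from seed_a XOR seed_b would be
fully determined by the first two, even though all three seeds look
unrelated.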

Still won't cause a problem in _my_ statistical life expectancy, but I
only have 4 cores ;-)


> So AFAICT the rise of explicitly multi-threaded RNG designs is one of
> those fake problems that exists only so people can write papers about
> solving it. (Maybe this is uncharitable.)

Uncharitable, but fair :-)


> So there exist RNG designs that handle multi-threading explicitly, and it
> shows up on feature comparison checklists. I don't think it should really
> affect Python's decisions at all though.

There are some clean and easy approaches to this based on
crypto-inspired schemes, but giving up crypto strength for speed.  If
you haven't read it, this paper is delightful:

    http://www.thesalmons.org/john/random123/papers/random123sc11.pdf

From ben+python at benfinney.id.au  Thu Sep 10 05:37:55 2015
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 10 Sep 2015 13:37:55 +1000
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
Message-ID: <858u8f7xm4.fsf@benfinney.id.au>

Nathaniel Smith <njs at pobox.com> writes:

> [...] in the short term, it would help a lot if pip at startup took a
> peek at $PATH and issued some warnings or errors if it detected the
> most common types of misconfiguration? (E.g. the first python/python3
> in $PATH does not match the one being used to run pip.)

Isn't that something that would be better in the Python executable
itself? Many commands would be better with a (overridable) default
behaviour as you describe.

-- 
 \        "Considering the current sad state of our computer programs, |
  `\     software development is clearly still a black art, and cannot |
_o__)         yet be called an engineering discipline." --Bill Clinton |
Ben Finney


From jlehtosalo at gmail.com  Thu Sep 10 05:40:47 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Wed, 9 Sep 2015 20:40:47 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F0A1A8.5010001@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0A1A8.5010001@mail.de>
Message-ID: <CAA_f+LxA+SVnK5S8q+kG+7g_47MFYGgu1Hb1thCM8ws7O9+hww@mail.gmail.com>

On Wed, Sep 9, 2015 at 2:16 PM, Sven R. Kunze <srkunze at mail.de> wrote:

> Thanks for sharing, Guido. Some random thoughts:
>
> - "classes should need to be explicitly marked as protocols"
> If so, why are they classes in the first place? Other languages have
> dedicated keywords like "interface".
>

I want to preserve compatibility with earlier Python versions (down to
3.2), and this makes it impossible to add any new syntax. Also, there is no
need to add a keyword as there are other existing mechanisms which are good
enough, including base classes (as in the proposal) and class decorators. I
don't think that this will become a very commonly used language feature,
and thus adding special syntax for this doesn't seem very important. My
expectation is that structural subtyping would be primarily useful for
libraries and frameworks.

Jukka

From njs at pobox.com  Thu Sep 10 05:44:05 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 20:44:05 -0700
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <858u8f7xm4.fsf@benfinney.id.au>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
 <858u8f7xm4.fsf@benfinney.id.au>
Message-ID: <CAPJVwB=gwCRP5pga5Hd0ZHD_x1Y9BTr5b3JkdfqQpAJZTsdO_Q@mail.gmail.com>

On Wed, Sep 9, 2015 at 8:37 PM, Ben Finney <ben+python at benfinney.id.au> wrote:
> Nathaniel Smith <njs at pobox.com> writes:
>
>> [...] in the short term, it would help a lot if pip at startup took a
>> peek at $PATH and issued some warnings or errors if it detected the
>> most common types of misconfiguration? (E.g. the first python/python3
>> in $PATH does not match the one being used to run pip.)
>
> Isn't that something that would be better in the Python executable
> itself? Many commands would be better with a (overridable) default
> behaviour as you describe.

While that's debatable, any plan that only benefits users of python
3.6+ is a non-starter, given that the goal here is short-term harm
reduction.
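
A rough sketch of the kind of $PATH check being discussed, written as a
standalone function. This is not pip's code; the function name and the
choice to compare normalized paths without resolving symlinks are
assumptions made for the example:

```python
import os
import shutil
import sys


def _norm(path):
    # Normalize without resolving symlinks, so a virtualenv's python is not
    # mistaken for the base interpreter it links to.
    return os.path.normcase(os.path.abspath(path))


def path_mismatches(names=("python", "python3"), running=None):
    """Return (name, found_path) pairs where the first match on PATH is not
    the interpreter currently running."""
    running = _norm(running or sys.executable)
    problems = []
    for name in names:
        found = shutil.which(name)  # first match on PATH, or None
        if found is not None and _norm(found) != running:
            problems.append((name, found))
    return problems
```

A caller could print a warning for each returned pair. Because symlinks
are not resolved, this flags /usr/bin/python when running /usr/bin/python3
even if one links to the other; a real check would need a policy for that.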

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From steve at pearwood.info  Thu Sep 10 05:46:08 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 10 Sep 2015 13:46:08 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
Message-ID: <20150910034608.GQ19373@ando.pearwood.info>

On Wed, Sep 09, 2015 at 08:01:16PM -0400, Donald Stufft wrote:
[...]
> Looking on google, the first result for "python random password" is
> StackOverflow which suggests:
> 
>     ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
> 
> However, it was later edited to, after that, include:
> 
>     ''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

You're worried about attacks on the random number generator that 
produces the characters in the password? I think I'm going to have to 
see an attack before I believe that this is meaningful.

Excluding PRNGs that are hopelessly biased ("nine, nine, nine, nine...") 
or predictable, how does knowing the PRNG help in an attack? Here's a 
password I just generated using your "corrected" version using 
SystemRandom:

    06XW0X0X

(Honest, that's exactly what I got on my first try.)

Here's one I generated using the "bad" code snippet:

    V6CFKCF2

How can you tell them apart, or attack one but not the other based on 
the PRNG?


> So it wasn't obvious to the person who answered that question that the random
> module's module scoped functions were not appropriate for this use. It appears
> that the original answer lasted for roughly 4 years before it was corrected,

Shouldn't it be using a single instance of SystemRandom rather than a 
new instance for each call?
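
For comparison, a sketch of the single-instance variant. The names
ALPHABET, _sysrand and make_password are invented for this example; the
character set is the one from the quoted snippet:

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits
_sysrand = random.SystemRandom()  # one instance, reused for every call


def make_password(n):
    """Return n characters drawn from ALPHABET using the OS entropy source."""
    return ''.join(_sysrand.choice(ALPHABET) for _ in range(n))
```

SystemRandom keeps no generator state of its own, so sharing one instance
changes only the per-call object creation, not where the randomness comes
from.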


[...]
> According to Theo, modern userland CSPRNGs can create random bytes faster than
> memcpy 

That is an astonishing claim, and I'd want to see evidence for it before 
accepting it.



-- 
Steve

From tritium-list at sdamon.com  Thu Sep 10 05:51:42 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Wed, 09 Sep 2015 23:51:42 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <20150910021108.GP19373@ando.pearwood.info>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <55F0BD0F.10508@sdamon.com>
 <CAExdVNmWWWt_iPNPG2e52pKX-AUdeAusCbCFiK0LrqibSgL_xA@mail.gmail.com>
 <20150910021108.GP19373@ando.pearwood.info>
Message-ID: <55F0FE4E.5040802@sdamon.com>



On 9/9/2015 22:11, Steven D'Aprano wrote:
> If the crypto PRNG were comparable in speed to what we have now (not
> significantly slower), or faster, *and* gave reproducible results with
> the same seed, *and* had no known/detectable statistical biases, and we
> could promise that those properties would continue to hold even when the
> state of the art changed and we got a new default crypto PRNG, then I'd
> still be -0.5 on the change due to the "false sense of security" factor.
+1 Exactly this.  If you can give me the same functionality (including 
seeding), make it faster *and* more secure, I have zero objections.  I 
*still* do not think we should go out of our way to make random a good 
source of cryptographic data, since...

Let's be frank about this, Guido is not a security expert.  I am not a
security expert.  Tim, I suspect you are not a security expert. Let's
leave actually attempting to be at the cutting edge of cryptographic 
randomness to modules by security experts.  I have far too much use for 
randomness outside of a cryptographic context to sacrifice the API and 
feature set we have for, in my opinion, a myopic focus on one, already 
discouraged, use of the random module.

From njs at pobox.com  Thu Sep 10 05:56:56 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 20:56:56 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
 <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
Message-ID: <CAPJVwBn+tWOtPPt+UqwGwYaRqozAZtU2xTdVhuZUaRvJnePGXQ@mail.gmail.com>

On Wed, Sep 9, 2015 at 8:35 PM, Tim Peters <tim.peters at gmail.com> wrote:
> There are some clean and easy approaches to this based on
> crypto-inspired schemes, but giving up crypto strength for speed.  If
> you haven't read it, this paper is delightful:
>
>     http://www.thesalmons.org/john/random123/papers/random123sc11.pdf

It really is! As AES acceleration instructions become more common
(they're now standard IIUC on x86, x86-64, and even recent ARM?), even
just using AES in CTR mode becomes pretty compelling -- it's fast,
deterministic, provably equidistributed, *and* cryptographically
secure enough for many purposes.
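
A toy sketch of the counter-based idea. The standard library has no AES,
so SHA-256 stands in for the block cipher here; this is therefore not
AES-CTR and not a vetted generator, and the class and method names are
invented for the example. It only shows the shape: block i is a keyed
function of the counter value i.

```python
import hashlib
import struct


class CounterRNG:
    """Toy counter-based generator: block i = SHA-256(key || i)."""

    def __init__(self, key, counter=0):
        self.key = key          # bytes; a different key gives a different stream
        self.counter = counter  # index of the next block to produce

    def next_block(self):
        """Return the next 32 bytes of output and advance the counter."""
        data = self.key + struct.pack(">Q", self.counter)
        self.counter += 1
        return hashlib.sha256(data).digest()

    def jump_ahead(self, n):
        """Skip n blocks in constant time by bumping the counter."""
        self.counter += n
```

Determinism comes from the key, and parallel streams come either from
distinct keys or from disjoint counter ranges under one key.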

(Compared to a true state-of-the-art CPRNG the naive version fails due
to lack of incremental mixing, and the use of a reversible transition
function. But even these are mostly only important to protect against
attackers who have access to your memory -- which is not trivial as
heartbleed shows, but still, it's *waaay* ahead of something like MT
on basically every important axis.)

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From random832 at fastmail.com  Thu Sep 10 05:59:22 2015
From: random832 at fastmail.com (Random832)
Date: Wed, 09 Sep 2015 23:59:22 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <20150910034608.GQ19373@ando.pearwood.info>
Message-ID: <m2twr2ncv9.fsf@fastmail.com>

Steven D'Aprano <steve at pearwood.info> writes:

> On Wed, Sep 09, 2015 at 08:01:16PM -0400, Donald Stufft wrote:
> [...]
>
> You're worried about attacks on the random number generator that 
> produces the characters in the password? I think I'm going to have to 
> see an attack before I believe that this is meaningful.

Isn't the only difference between generating a password and generating a
key the length (and base) of the string? Where is the line?

> That is an astonishing claim, and I'd want to see evidence for it before 
> accepting it.

I assume it's comparing a CSPRNG all of whose state is in cache (or
registers, if a large block of random bytes is requested from the CSPRNG
in one go) with memcpy of data which must be retrieved from main
memory.


From jlehtosalo at gmail.com  Thu Sep 10 06:12:24 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Wed, 9 Sep 2015 21:12:24 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F0AC83.3050505@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
Message-ID: <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>

On Wed, Sep 9, 2015 at 3:02 PM, Sven R. Kunze <srkunze at mail.de> wrote:

> Not specifically about this proposal but about the effort put into Python
> typehinting in general currently:
>
>
> What are the supposed benefits?
>

This has been discussed almost to death before, but here are some of
the main benefits as I see them:

- Code becomes more readable. This is especially true for code that doesn't
have very detailed docstrings. This may go against the intuition of some
people, but my experience strongly suggests this, and many others who've
used optional typing have shared the sentiment. It probably takes a couple
of days before you get used to the type annotations, after which they
likely won't distract you any more but will actually improve code
understanding by providing important contextual information that is often
difficult to infer otherwise.
- Tools can automatically find most (simple) bugs of certain common kinds
in statically typed code. A lot of production code has way below 100% test
coverage, so this can save many manual testing iterations and help avoid
breaking stuff in production due to stupid mistakes (that humans are bad at
spotting).
- Refactoring becomes way less scary, especially if you don't have close to
100% test coverage. A type checker can find many mistakes that are commonly
introduced when refactoring code.
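
A small illustration of the second and third points, with made-up function
names. The code runs as ordinary Python; the comment marks the kind of
mistake a PEP 484 checker is expected to report before any test runs:

```python
from typing import List, Optional


def index_of(names: List[str], target: str) -> Optional[int]:
    """Return the position of target in names, or None if it is absent."""
    for i, name in enumerate(names):
        if name == target:
            return i
    return None


def next_name(names: List[str], target: str) -> str:
    """Return the name following target, wrapping around at the end."""
    pos = index_of(names, target)
    if pos is None:
        # Without this guard, `pos + 1` would be an Optional[int] + int
        # error that a type checker can flag statically.
        raise KeyError(target)
    return names[(pos + 1) % len(names)]
```

If index_of is later refactored to return something else, every caller
that still assumes an int becomes a reported error rather than a runtime
surprise.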

You'll get the biggest benefits if you are working on a large code base
mostly written by other people with limited test coverage and little
comments or documentation. You get extra credit if your tests are slow to
run and flaky, as this slows down your iteration speed, whereas type
checking can be quick (with the right tools, which might not exist as of
now ;-). If you have a small (say, less than 10k lines) code base you've
mostly written yourself and have meticulously documented everything and have
95% test coverage and your full test suite runs in 10 seconds, you'll
probably get less out of it. Context matters.


>
> I somewhere read that right now tools are able to infer 60% of the types.
> That seems pretty good to me and a lot of effort on your side to make some
> additional 20/30 %. Don't get me wrong, I like the theoretical and
> abstract discussions around this topic but I feel this type of feature way
> out of the practical realm.
>

Such a tool can't infer 40% of the types. This probably includes most of
the tricky parts of the program that I'd actually like to statically check.
A type checker that uses annotations might understand 95% of the types,
i.e. it would miss 5% of the types. This seems like a reasonable figure for
code that has been written with some thought about type checkability. I
consider that difference pretty significant. I wouldn't want to increase
the fraction of unchecked parts of my annotated code by a factor of 8, and
I want to have control over which parts can be type checked.

Jukka


>
> I don't see the effort for adding type hints AND the effort for further
> parsing (by human eyes) justified by partially better IDE support and 1
> single additional test within test suites of about 10,000s of tests.
>
> Especially, when considering that correct types don't prove functionality
> in any case. But tested functionality in some way proves correct typing.
>
> Just my two cents since I felt I had to say this and maybe I am missing
> something. :)
>
> Best,
> Sven
>

From rosuav at gmail.com  Thu Sep 10 06:32:39 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 10 Sep 2015 14:32:39 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <87613j0xcm.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
 <87613j0xcm.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAPTjJmrD+1k1EDF_buvx8qzdaf4C1oF5QnEh0u5mf0HcLmNy6A@mail.gmail.com>

On Thu, Sep 10, 2015 at 1:25 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Nathaniel Smith writes:
>
>  > That seems more productive in the short run than trying to
>  > get everyone to stop typing "pip" :-).
>
> FWIW, I did as soon as I realized python_i_want_to_install -m pip
> worked; it's obvious that it DTRTs, and I felt like I'd just dropped
> the hammer I'd been whacking my head with.

If the problem with this is the verbosity of it ("python -m pip
install packagename" - five words), would there be benefit in blessing
pip with some core interpreter functionality, allowing either:

$ python install packagename

or

$ python -p packagename

to do the one most common operation, installation? (And since it's new
syntax, it could default to --upgrade, which would match the behaviour
of other package managers like apt-get.)

Since the base command is "python", it automatically uses the same
interpreter and environment as you otherwise would. It's less verbose
than bouncing through -m. It gives Python the feeling of having an
integrated package manager, which IMO wouldn't be a bad thing.
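
The "same interpreter and environment" property is what "python -m pip"
already gives you. A sketch of doing the same thing from inside Python,
with hypothetical helper names:

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip"] + list(args)


def pip_install(package, upgrade=False):
    """Run 'pip install' for package and return pip's exit status."""
    args = ["install"]
    if upgrade:
        args.append("--upgrade")
    args.append(package)
    return subprocess.call(pip_command(*args))
```

Because the command starts from sys.executable rather than whatever "pip"
is first on $PATH, the package lands in the environment of the running
interpreter.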

Of course, that wouldn't help with the 2.7 people, but it might allow
the deprecation of the 'pip' wrapper. Would it actually help?

ChrisA

From jlehtosalo at gmail.com  Thu Sep 10 06:34:47 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Wed, 9 Sep 2015 21:34:47 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <7DC7EA44-0CD8-4F61-8462-8147B8BB8059@yahoo.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <7DC7EA44-0CD8-4F61-8462-8147B8BB8059@yahoo.com>
Message-ID: <CAA_f+LwkgQLMk3BNKbWKQ-FxQJ1kgv5JEZZTU=1B=_Qi7RVgew@mail.gmail.com>

On Wed, Sep 9, 2015 at 3:08 PM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> On Sep 9, 2015, at 13:17, Guido van Rossum <guido at python.org> wrote:
>
> Jukka wrote up a proposal for structural subtyping. It's pretty good.
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>
>
> Are we going to continue to have (both implicit and explicit) ABCs in
> collections.abc, numbers, etc., and also have protocols that are also ABCs
> and are largely parallel to them (and implicit at static checking time
> whether they're implicit or explicit at runtime) in typing? If so, I think
> we've reached the point where the two parallel hierarchies are a problem.
>

I'm not proposing creating protocols for numbers or most collection types.
I'd change some of the existing ABCs (mentioned in the proposal, including
things like Sized) in typing into equivalent protocols, but they'd still
support isinstance as before and would be functionally almost identical to
the existing ABCs. I clarified the latter fact in the github issue.


>
> Also, why are both the terminology and implementation so different from
> what we already have for ABCs? Why not just have a decorator or metaclass
> that can be added to ABCs that makes them implicit (rather than writing a
> manual __subclasshook__ for each one), which also makes them implicit at
> static type checking time, which means there's no need for a whole separate
> but similar notion?
>

Protocol would use a metaclass that is derived from the ABC metaclass, and
it would be similar to the Generic class that we already have. The reason
why the proposal doesn't use an explicit metaclass or a class decorator is
consistency. It's possible to define generic protocols by having
Protocol[t, ...] as a base class, which is consistent with how Generic[...]
works. The latter is already part of typing, and introducing a similar
concept with a different syntax seems inelegant to me.

Consider a generic class:

class Bucket(Generic[T]): ...

Now we can have a generic protocol using a very similar syntax:

class BucketProtocol(Protocol[T]): ...

I wonder how we'd use a metaclass or a class decorator to represent generic
protocols. Maybe something like this:

@protocol[T]
class BucketProtocol: ...

However, this looks quite different from the Generic[...] case and thus I'd
rather not use it. I guess if we'd have picked this syntax for generic
classes it would make more sense:

@generic[T]
class Bucket: ...
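
To show what the base-class spelling buys, here is a runnable sketch. It
uses typing.Protocol as it later shipped in Python 3.8, which may differ
in detail from the proposal under discussion; the put/take methods and the
ListBucket and refill names are invented for the example:

```python
from typing import Generic, List, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class BucketProtocol(Protocol[T]):
    """Anything with matching put() and take() methods counts as a bucket."""

    def put(self, item: T) -> None: ...

    def take(self) -> T: ...


class ListBucket(Generic[T]):
    """Never mentions BucketProtocol, yet satisfies it structurally."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def put(self, item: T) -> None:
        self._items.append(item)

    def take(self) -> T:
        return self._items.pop()


def refill(bucket: BucketProtocol[int]) -> None:
    """Accept any object that is structurally a bucket of ints."""
    bucket.put(1)
```

The generic parameter is written the same way for the protocol and for the
ordinary generic class, which is the consistency argument made above.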


> I'm not sure why it's important to also have some types that are implicit
> at static type checking time but not at runtime, but if there is a good
> reason, that just means two different decorators/metaclasses/whatever (or a
> flag passed to the decorator, etc.). Compare:
>
> Hashable is an implicit ABC, Sequence is an explicit ABC, Reversible is an
> implicit-static/explicit-runtime ABC.
>
> Hashable is an implicit ABC and also a Protocol that's an explicit ABC,
> Sequence is an explicit ABC and not a Protocol, Reversible is a Protocol
> that's an explicit ABC.
>
> The first one is clearly simpler; is there some compelling reason that
> makes the second one better anyway?
>

I'm not sure if I fully understand what you mean by implicit vs. explicit
ABCs (and the static/runtime distinction). Could you define these terms and
maybe give some examples of each? Note that in my proposal a protocol is
just a kind of ABC, as GenericMeta is a subclass of ABCMeta and protocol
would have a similar metaclass (or maybe even the same one), even though
I'm not sure if I explicitly mentioned that. Every protocol is also an ABC.

Jukka

From greg.ewing at canterbury.ac.nz  Thu Sep 10 01:23:13 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 Sep 2015 11:23:13 +1200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <20150909190757.GM19373@ando.pearwood.info>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
Message-ID: <55F0BF61.6050205@canterbury.ac.nz>

Steven D'Aprano wrote:
> one desirable 
> property of PRNGs is that you can repeat a sequence of values if you 
> re-seed with a known value. Does arc4random keep that property?

Another property that's important for some applications is
to be able to efficiently "jump ahead" some number of steps
in the sequence, to produce multiple independent streams of
numbers. It would be good to know if that is possible with
arc4random.
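
For reference, both properties can be sketched against Python's current
random.Random (Mersenne Twister), which says nothing about arc4random. The
stream helper is made up for the example, and the "jump" here is the naive
one that generates and discards values:

```python
import random


def stream(seed, skip, n):
    """Return n values from a generator seeded with seed, after skipping skip."""
    rng = random.Random(seed)
    for _ in range(skip):
        rng.random()  # naive jump ahead: O(skip) work
    return [rng.random() for _ in range(n)]


full = stream(12345, 0, 10)

# Re-seeding with a known value repeats the sequence exactly.
assert stream(12345, 0, 10) == full

# Skipping ahead lands at the same point in the same sequence.
assert stream(12345, 4, 6) == full[4:]
```

An efficient jump-ahead would reach the same position without paying for
every skipped step, which is the part that depends on the algorithm.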

-- 
Greg

From tim.peters at gmail.com  Thu Sep 10 06:47:53 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 23:47:53 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBmFFNJo5jR_mgUX8kV3ZH_xWcLWPFJiTUSe4ua0FWC05w@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
 <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
 <CAPJVwBmFFNJo5jR_mgUX8kV3ZH_xWcLWPFJiTUSe4ua0FWC05w@mail.gmail.com>
Message-ID: <CAExdVNmnJCAM2ju_n+FWr7Lt4pBz7ESSAKZ36oAN2tnSOqEOpg@mail.gmail.com>

[Tim, on parallel PRNGs]
>> There are some clean and easy approaches to this based on
>> crypto-inspired schemes, but giving up crypto strength for speed.  If
>> you haven't read it, this paper is delightful:
>>
>>     http://www.thesalmons.org/john/random123/papers/random123sc11.pdf

[Nathaniel Smith]
> It really is! As AES acceleration instructions become more common
> (they're now standard IIUC on x86, x86-64, and even recent ARM?), even
> just using AES in CTR mode becomes pretty compelling -- it's fast,
> deterministic, provably equidistributed, *and* cryptographically
> secure enough for many purposes.

Excellent - we're going to have a hard time finding something real to
disagree about :-)


> (Compared to a true state-of-the-art CPRNG the naive version fails due
> to lack of incremental mixing, and the use of a reversible transition
> function. But even these are mostly only important to protect against
> attackers who have access to your memory -- which is not trivial as
> heartbleed shows, but still, it's *waaay* ahead of something like MT
> on basically every important axis.)

Except for wide adoption.  Most people I bump into never even heard of
this kind of approach.  Nobody ever got fired for buying IBM, and
nobody ever got fired for recommending MT - it's darned near a
checklist item when shopping for a language.  I may have to sneak the
code in while you distract Guido with a provocative rant about the
inherent perfidy of the Dutch character ;-)

From rosuav at gmail.com  Thu Sep 10 06:57:29 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 10 Sep 2015 14:57:29 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F0BF61.6050205@canterbury.ac.nz>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
Message-ID: <CAPTjJmoPkeNy7Dir2jh2Qk2Jsa_7kRdVQuEnZYM1ityCwGX7uA@mail.gmail.com>

On Thu, Sep 10, 2015 at 9:23 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Steven D'Aprano wrote:
>>
>> one desirable property of PRNGs is that you can repeat a sequence of
>> values if you re-seed with a known value. Does arc4random keep that
>> property?
>
>
> Another property that's important for some applications is
> to be able to efficiently "jump ahead" some number of steps
> in the sequence, to produce multiple independent streams of
> numbers. It would be good to know if that is possible with
> arc4random.

If arc4random reseeds with entropy periodically, then jumping ahead
past such a reseed is simply a matter of performing a reseed, isn't
it?

ChrisA

From tim.peters at gmail.com  Thu Sep 10 06:58:33 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 9 Sep 2015 23:58:33 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F0BF61.6050205@canterbury.ac.nz>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
Message-ID: <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>

[Steven D'Aprano]
>> one desirable property of PRNGs is that you can repeat a sequence of
>> values if you re-seed with a known value. Does arc4random keep that
>> property?

[Greg Ewing]
> Another property that's important for some applications is
> to be able to efficiently "jump ahead" some number of steps
> in the sequence, to produce multiple independent streams of
> numbers. It would be good to know if that is possible with
> arc4random.

No for "arc4random" based on RC4, yes for "arc4random" based on
ChaCha20, "mostly yes" for "arc4random" in the OpenBSD implementation,
wholly unknown for whatever functions may be _called_
"arc4random" in the future.

The fly in the ointment for the OpenBSD version is that it
periodically fiddles its internal state with "entropy" obtained from
the kernel.  It's completely unreproducible for that reason.  However,
you can still jump ahead in the state.  It's just impossible to say
that it's the same state you would have arrived at had you invoked the
function that many times instead (the kernel could change the state in
unpredictable ways any number of times while you were doing that).

From njs at pobox.com  Thu Sep 10 06:59:08 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 21:59:08 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F0BF61.6050205@canterbury.ac.nz>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
Message-ID: <CAPJVwBnOAEh2ZWSzvrgUuJk83+R4M7jQS_i8MHmc8wf_3jP+-g@mail.gmail.com>

On Wed, Sep 9, 2015 at 4:23 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Steven D'Aprano wrote:
>>
>> one desirable property of PRNGs is that you can repeat a sequence of
>> values if you re-seed with a known value. Does arc4random keep that
>> property?
>
> Another property that's important for some applications is
> to be able to efficiently "jump ahead" some number of steps
> in the sequence, to produce multiple independent streams of
> numbers. It would be good to know if that is possible with
> arc4random.

The answer to both of these questions is no. For modern cryptographic
PRNGs, full determinism is considered a flaw, and determinism is a
necessary precondition to supporting jumpahead.

The reason is that even if an attacker learns your secret RNG state at
time t, then you want this to have a limited impact -- they'll
obviously be able to predict your RNG output for a while, but you
don't want them to be able to predict it from now until the end of
time. So determinism is considered bad, and high-quality CPRNGs
automatically reseed themselves with new entropy according to some
carefully designed schedule. And OpenBSD's "arc4random" generator is a
high-quality CPRNG in this sense.
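
Python's own random module shows the two behaviours side by side. A
minimal sketch, using only documented stdlib calls: random.Random can be
re-seeded and have its state saved and restored, while
random.SystemRandom (which reads from the OS, and is used here only as
an analogy for arc4random, not as a model of it) ignores seed() and has
no state to hand out.

```python
import random

# A seeded Mersenne Twister instance repeats its sequence exactly.
mt_a = random.Random(12345)
mt_b = random.Random(12345)
same_sequence = (
    [mt_a.random() for _ in range(3)] == [mt_b.random() for _ in range(3)]
)

# Its state can also be captured and restored to replay a value.
state = mt_a.getstate()
expected = mt_a.random()
mt_a.setstate(state)
replayed = mt_a.random() == expected

# SystemRandom accepts seed() but ignores it, and exposes no state.
sys_rng = random.SystemRandom()
sys_rng.seed(12345)
try:
    sys_rng.getstate()
except NotImplementedError:
    state_available = False
else:
    state_available = True
```

So code that relies on seed() or getstate() for reproducibility cannot
simply be pointed at an OS-backed generator.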

-- 
Nathaniel J. Smith -- http://vorpus.org

From tim.peters at gmail.com  Thu Sep 10 07:00:45 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 10 Sep 2015 00:00:45 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPTjJmoPkeNy7Dir2jh2Qk2Jsa_7kRdVQuEnZYM1ityCwGX7uA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAPTjJmoPkeNy7Dir2jh2Qk2Jsa_7kRdVQuEnZYM1ityCwGX7uA@mail.gmail.com>
Message-ID: <CAExdVN=hcZ6syaANuvY_vndS0_8HjqA6aufxNr-BnKCfRVu9dw@mail.gmail.com>

[Chris Angelico]
> If arc4random reseeds with entropy periodically, then jumping ahead
> past such a reseed is simply a matter of performing a reseed, isn't
> it?

The OpenBSD version supplies no functionality related to seeds (you
can't set one, and you can't ask for one).

From tim.peters at gmail.com  Thu Sep 10 07:10:10 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 10 Sep 2015 00:10:10 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
Message-ID: <CAExdVNk6RkUFpLOPv2178+PBNvL+st2GX=1kMKAE6_7+H3uGjQ@mail.gmail.com>

[Tim]
> ...
> The fly in the ointment for the OpenBSD version is that it
> periodically fiddles its internal state with "entropy" obtained from
> the kernel.  It's completely unreproducible for that reason.  However,
> you can still jump ahead in the state.

I should add:  but only if they supplied a jumpahead function.  Which
they don't.

From abarnert at yahoo.com  Thu Sep 10 07:15:16 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 9 Sep 2015 22:15:16 -0700
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <874mj30x14.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
 <874mj30x14.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <7A4752B7-90B8-49CE-9EF6-FF182CE0411D@yahoo.com>

On Sep 9, 2015, at 20:32, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> Andrew Barnert via Python-ideas writes:
> 
>> If StackOverflow/SU/TD questions are any indication, a
>> disproportionate number of these people are Mac users using Python
>> 2.7, who have installed a second Python 2.7 (or, in some cases, two
>> of them) alongside Apple's.
> 
> Often enough it's the other way around: the distro catches up to the
> user as they upgrade.  I didn't even realize "10.10 Yosemite" had 2.7,
> this box has been upgraded from "10.7 Lion" or so,

No, that's not the problem. Lion came with 2.7.1, so you already had it before upgrading it, and it's hard to imagine Apple upgrading your system 2.7.1 to 2.7.6 or 2.7.10 broke anything. More likely, Apple screwed up your PATH, or broke your MacPorts so you had to reinstall or repair it?

> and I just use
> MacPorts 2.7 all the time.  I haven't worried about what Apple
> supplies as /usr/bin/python in 6 or 7 years.

I'd assume most people on this list know what they're doing with their PATH. If you don't, then you just got lucky for a few years.

Well, not just lucky--MacPorts does go out of its way to make things easier for you in various ways (hammering home keeping /opt/local/bin at the start of your PATH, trying to adjust the PATH system-wide and for LaunchServices as well as shells, providing many packages as ports so you don't need pip, offering a python-select tool that autodetects Apple and PSF Pythons and tries to make them play nice with MacPorts Pythons, etc.). So MacPorts users don't see such problems nearly as often as users of Homebrew (or Fink or Gentoo Prefix, if anyone still uses those), PSF installers, and third-party extra-batteries installers. But they still can come up--and if you didn't even realize you were running multiple Python 2.7 versions in parallel, that just means you never tried anything that MacPorts didn't anticipate. And, of course, most people with two Python 2.7s on Mac are not using MacPorts anyway.

From njs at pobox.com  Thu Sep 10 07:55:39 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 9 Sep 2015 22:55:39 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmnJCAM2ju_n+FWr7Lt4pBz7ESSAKZ36oAN2tnSOqEOpg@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
 <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
 <CAPJVwBmFFNJo5jR_mgUX8kV3ZH_xWcLWPFJiTUSe4ua0FWC05w@mail.gmail.com>
 <CAExdVNmnJCAM2ju_n+FWr7Lt4pBz7ESSAKZ36oAN2tnSOqEOpg@mail.gmail.com>
Message-ID: <CAPJVwBnUpTToxcUM3AnO8TVPO+2=k_MpmETZW2e4C4UH=M-MeA@mail.gmail.com>

On Wed, Sep 9, 2015 at 9:47 PM, Tim Peters <tim.peters at gmail.com> wrote:
> [Tim, on parallel PRNGs]
>>> There are some clean and easy approaches to this based on
>>> crypto-inspired schemes, but giving up crypto strength for speed.  If
>>> you haven't read it, this paper is delightful:
>>>
>>>     http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
>
> [Nathaniel Smith]
>> It really is! As AES acceleration instructions become more common
>> (they're now standard IIUC on x86, x86-64, and even recent ARM?), even
>> just using AES in CTR mode becomes pretty compelling -- it's fast,
>> deterministic, provably equidistributed, *and* cryptographically
>> secure enough for many purposes.
>
> Excellent - we're going to have a hard time finding something real to
> disagree about :-)
>
>
>> (Compared to a true state-of-the-art CPRNG the naive version fails due
>> to lack of incremental mixing, and the use of a reversible transition
>> function. But even these are mostly only important to protect against
>> attackers who have access to your memory -- which is not trivial as
>> heartbleed shows, but still, it's *waaay* ahead of something like MT
>> on basically every important axis.)
>
> Except for wide adoption.  Most people I bump into never even heard of
> this kind of approach.  Nobody ever got fired for buying IBM, and
> nobody ever got fired for recommending MT - it's darned near a
> checklist item when shopping for a language.  I may have to sneak the
> code in while you distract Guido with a provocative rant about the
> inherent perfidy of the Dutch character ;-)

:-)

Srsly though, we've talked about switching to some kind of CTR-mode
RNG as the default in NumPy (where speed differences are pretty
visible, b/c we support generating big blocks of random numbers at
once), and would probably accept a patch. (Just in case Guido is
undisturbed by scurrilous allegations.)
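
A minimal sketch of the counter-mode idea, for illustration only: each
output is a keyed function of a counter, so the stream is deterministic
and jumping ahead is just arithmetic on the counter. The stdlib has no
AES, so keyed BLAKE2b from hashlib stands in for the block cipher here.
The class name CounterRNG and its methods are made up for this example;
nothing below is the NumPy or Random123 implementation, and it makes no
claim about speed or cryptographic strength.

```python
import hashlib
import struct


class CounterRNG:
    """Toy counter-mode generator: output i is F(key, i).

    Keyed BLAKE2b is a stand-in for AES or Philox (an assumption made
    for this sketch, not what the thread proposes to ship).
    """

    def __init__(self, key: bytes, counter: int = 0):
        self.key = key          # at most 64 bytes for BLAKE2b
        self.counter = counter

    def next_u64(self) -> int:
        # Hash the 64-bit counter under the key and read back 8 bytes.
        block = hashlib.blake2b(
            struct.pack("<Q", self.counter), key=self.key, digest_size=8
        ).digest()
        self.counter = (self.counter + 1) % (1 << 64)
        return struct.unpack("<Q", block)[0]

    def random(self) -> float:
        # Keep the top 53 bits so the result is a float in [0.0, 1.0).
        return (self.next_u64() >> 11) / (1 << 53)

    def jumpahead(self, n: int) -> None:
        # Skipping n outputs is integer addition on the counter.
        self.counter = (self.counter + n) % (1 << 64)


# Two generators with the same key agree; one of them skips three outputs.
a = CounterRNG(b"seed")
first_five = [a.next_u64() for _ in range(5)]
b = CounterRNG(b"seed")
b.jumpahead(3)
fourth = b.next_u64()
```

Independent parallel streams then fall out naturally: give each worker
its own key, or its own disjoint range of counter values.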

-- 
Nathaniel J. Smith -- http://vorpus.org

From rosuav at gmail.com  Thu Sep 10 08:08:09 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 10 Sep 2015 16:08:09 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
 <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
Message-ID: <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>

On Thu, Sep 10, 2015 at 11:50 AM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> Of course it adds the cost of making the module slower, and also more complex. Maybe a better solution would be to add a random.set_default_instance function that replaced all of the top-level functions with bound methods of the instance (just like what's already done at startup in random.py)? That's simple, and doesn't slow down anything, and it seems like it makes it more clear what you're doing than setting random.inst.

+1. A single function call that replaces all the methods adds a
minuscule constant to code size, run time, etc, and it's no less
readable than assignment to a module attribute. (If anything, it makes
it more clearly a supported operation - I've seen novices not realize
that "module.xyz = foo" is valid, but nobody would misunderstand the
validity of a function call.)
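
A rough sketch of what such a function could look like.
set_default_instance is the name proposed in this thread, not an
existing stdlib function; the list of rebound names and the stand-in
module fake_random are assumptions made for the example, which
deliberately leaves the real random module alone.

```python
import random
import types

# Illustrative subset of the names random.py exposes at module level.
_EXPORTED = ("random", "randint", "randrange", "choice", "shuffle", "seed")


def set_default_instance(module, instance):
    """Rebind module-level functions to bound methods of `instance`."""
    for name in _EXPORTED:
        setattr(module, name, getattr(instance, name))


# Demonstrate on a throwaway module object rather than on `random` itself.
fake_random = types.ModuleType("fake_random")

set_default_instance(fake_random, random.Random(42))
reproducible = fake_random.randint(1, 6)

set_default_instance(fake_random, random.SystemRandom())
unpredictable = fake_random.randint(1, 6)
```

After the call, lookups such as fake_random.randint cost the same as
before, since they are still plain attribute accesses on the module.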

ChrisA

From tjreedy at udel.edu  Thu Sep 10 09:07:11 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 10 Sep 2015 03:07:11 -0400
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
Message-ID: <msra7n$h9t$1@ger.gmane.org>

On 9/9/2015 1:10 PM, Stephan Sahm wrote:

> I found a BUG in the standard while statement, which appears both in
> python 2.7 and python 3.4 on my system.

No you did not, but aside from that: python-ideas is for ideas about 
future versions of python, not for bug reports, valid or otherwise.  You 
should have sent this to python-list, which is a place to report 
possible bugs.

-- 
Terry Jan Reedy


From encukou at gmail.com  Thu Sep 10 09:35:11 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Thu, 10 Sep 2015 09:35:11 +0200
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>
Message-ID: <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>

On Thu, Sep 10, 2015 at 3:30 AM, Donald Stufft <donald at stufft.io> wrote:
[...]
>
> So I guess my suggestion would be, let's deprecate the module scope functions
> and rename random.Random to random.DeterministicRandom. This absolves us of
> needing to change the behavior of people's existing code (besides deprecating
> it) and we don't need to decide if a userland CSPRNG is safe or not while still
> moving us to a situation that is far more likely to have users doing the right
> thing.

There is one use case that would be hit by that: the kid writing their
first rock-paper-scissors game.
A beginner who just learned the `if` statement isn't ready for a
discussion of cryptography vs. reproducible results, and
random.SystemRandom.random() would just become a magic incantation to
learn. It would feel like requiring sys.stdout.write() instead of
print().

Functions like paretovariate(), getstate(), or seed(), which require
some understanding of (pseudo)randomness, can be moved to a specific
class, but I don't think deprecating random(), randint(), randrange(),
choice(), and shuffle() would be a good idea. Switching them to a
cryptographically safe RNG is OK from this perspective, though.
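
To make the comparison concrete, here is a small sketch of the two
spellings a beginner might meet. Only existing stdlib calls are used;
the MOVES list and variable names are made up for the example. Note
that SystemRandom has to be instantiated before its methods can be
called.

```python
import random

MOVES = ["rock", "paper", "scissors"]

# What a beginner writes today: one import, one call.
computer_move = random.choice(MOVES)

# The explicit spelling: create a generator object, then call a method.
secure_rng = random.SystemRandom()
computer_move_secure = secure_rng.choice(MOVES)
```

Either result is fine for a game; the difference is how much has to be
explained before the first line works.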

From cory at lukasa.co.uk  Thu Sep 10 09:51:58 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 10 Sep 2015 08:51:58 +0100
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <CAH_hAJETf++CJmcdf1WYQZaHdymXR_2_R+A5tN1NthbrADSeAQ@mail.gmail.com>

On 9 September 2015 at 21:17, Guido van Rossum <guido at python.org> wrote:
> Jukka wrote up a proposal for structural subtyping. It's pretty good. Please
> discuss.

Some good feedback has been provided in this thread already, but I
want to provide an enthusiastic +1 for this change. I'm one of the
people who has been extremely lukewarm towards the Python type hints
proposal, but I believe this addresses one of my major areas of
concern. Overall the proposal seems like a graceful solution to many
of the duck typing problems.

It does not address all of them, particularly around classes that may
dynamically (but deterministically) modify themselves to satisfy the
constraints of the Protocol (e.g. by generating methods for themselves
at instantiation-time), but that's a pretty hairy use-case and there's
not much that a static type checker could do about it anyway.
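
A small sketch of that hairy case, written with typing.Protocol as it
exists in later Python versions (3.8 and up), which may differ in
detail from the proposal under discussion here. The class names are
made up for the example. The runtime check accepts both classes, but a
static checker generally cannot see methods attached through setattr().

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource:
    # close() is declared in the class body, so a static checker sees it.
    def close(self) -> None:
        pass


class DynamicResource:
    # Methods are attached per instance at instantiation time, by name.
    def __init__(self, method_names):
        for name in method_names:
            setattr(self, name, lambda: None)


static_ok = isinstance(Resource(), SupportsClose)
dynamic_ok = isinstance(DynamicResource(["close"]), SupportsClose)
missing_ok = isinstance(DynamicResource([]), SupportsClose)
```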

Altogether this looks great (modulo a couple of small concerns raised
by others), and it's enough for me to consider using static type hints
on basically all my projects with the ongoing exception of Requests
(which has duck typing problems that this cannot solve, I think).
Great work Jukka!

From p.f.moore at gmail.com  Thu Sep 10 10:01:58 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 09:01:58 +0100
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
Message-ID: <CACac1F946J5ckRXFy6g8zzu+_1xpqu_-FUGn5pE3WzeLKUnUwQ@mail.gmail.com>

On 9 September 2015 at 23:40, Nathaniel Smith <njs at pobox.com> wrote:
> At the very least, surely this could be "fixed" by detecting this case
> and exiting with a message "Sorry, Windows is annoying and this isn't
> going to work, to upgrade pip please type 'python -m pip ...'
> instead"? That seems more productive in the short run than trying to
> get everyone to stop typing "pip" :-). (Though I do agree that having
> pip as a separate command from python is a big mess -- another case
> where this comes up is the need for pip versus pip3.)

That's already done (without the unnecessary passive-aggressive
sniping at Windows) and we still get users raising bugs because they
didn't read the message, or because they misinterpreted something.

As I said, we've tried lots of solutions. What we haven't had yet is
anyone come up with an actual working PR that fixes the issue (in the
sense of addressing the bug reports we get) better than the current
code (if we had, we'd have applied the PR).

Paul

From p.f.moore at gmail.com  Thu Sep 10 10:06:30 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 09:06:30 +0100
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
Message-ID: <CACac1F_6y1k6DffL9c5XYN0CovJ4LjQBk=8sR_fMirU5uPZaug@mail.gmail.com>

On 9 September 2015 at 23:40, Nathaniel Smith <njs at pobox.com> wrote:
> It sounds like this is another place where in the short term, it would
> help a lot if pip at startup took a peek at $PATH and issued some
> warnings or errors if it detected the most common types of
> misconfiguration? (E.g. the first python/python3 in $PATH does not
> match the one being used to run pip.)

People (including the pip devs) have talked about this type of thing
before. To my knowledge no-one has actually implemented it. Care to
provide a PR for this?
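
For reference, a minimal sketch of the kind of check being described.
This is not pip code; the function names and the message wording are
invented for the example, and only shutil.which, os.path.realpath and
sys.executable are real.

```python
import os
import shutil
import sys


def first_python_on_path(names=("python", "python3")):
    """Return the first interpreter found on PATH under any of `names`."""
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None


def path_mismatch_warning():
    """Return a warning string if PATH's python is not the running one."""
    found = first_python_on_path()
    if found is None or not sys.executable:
        return None  # nothing to compare against
    if os.path.realpath(found) != os.path.realpath(sys.executable):
        return (
            f"warning: {found!r} is first on PATH, but this process is "
            f"running under {sys.executable!r}"
        )
    return None


message = path_mismatch_warning()
if message:
    print(message, file=sys.stderr)
```

Comparing resolved paths is only a heuristic: symlinked virtualenv
interpreters, launcher scripts and versioned names such as python3.4
would all need extra handling in a real implementation.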

Paul

From abarnert at yahoo.com  Thu Sep 10 10:17:11 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 01:17:11 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
 <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
 <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
Message-ID: <8A294D36-C40F-405F-BB2E-94CD379B8165@yahoo.com>

On Sep 9, 2015, at 23:08, Chris Angelico <rosuav at gmail.com> wrote:
> 
> On Thu, Sep 10, 2015 at 11:50 AM, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
>> Of course it adds the cost of making the module slower, and also more complex. Maybe a better solution would be to add a random.set_default_instance function that replaced all of the top-level functions with bound methods of the instance (just like what's already done at startup in random.py)? That's simple, and doesn't slow down anything, and it seems like it makes it more clear what you're doing than setting random.inst.
> 
> +1. A single function call that replaces all the methods adds a
> minuscule constant to code size, run time, etc, and it's no less
> readable than assignment to a module attribute. (If anything, it makes
> it more clearly a supported operation - I've seen novices not realize
> that "module.xyz = foo" is valid, but nobody would misunderstand the
> validity of a function call.)

I was only half-serious about this, but now I think I like it: it provides exactly the fix people are hoping to get by deprecating the top-level functions, but with less risk, less user code churn, a smaller patch, and a much easier fix for novice users. (And it's much better than my earlier suggestion, too.)

See https://gist.github.com/abarnert/e0fced7569e7d77f7464 for the patch, and a patched copy of random.py. The source comments in the patch should be enough to understand everything that's changed.

A couple things:

I'm not sure the normal deprecation path makes sense here. For a couple versions, everything continues to work (because most novices, the people we're trying to help, don't see DeprecationWarnings), and then suddenly their code breaks. Maybe making it a UserWarning makes more sense here?

I made Random a synonym for UnsafeRandom (the class that warns and then passes through to DeterministicRandom). But is that really necessary? Someone who's explicitly using an instance of class Random rather than the top-level functions probably isn't someone who needs this warning, right?

Also, if this is the way we'd want to go, the docs change would be a lot more substantial than the code change. I think the docs should be organized around choosing a random generator and using its methods, and only then mention set_default_instance as being useful for porting old code (and for making it easy for multiple modules to share a single generator, but that shouldn't be a common need for novices).
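
For readers who don't want to open the gist, here is a rough sketch of
the warn-then-pass-through idea. It is not the code from the patch: the
helper warn_once, the message text, and the module-level name choice
below are all invented for illustration.

```python
import random
import warnings


def warn_once(func, message, category=UserWarning):
    """Wrap `func` so that only its first call emits `message`."""
    warned = False

    def wrapper(*args, **kwargs):
        nonlocal warned
        if not warned:
            warned = True
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(message, category, stacklevel=2)
        return func(*args, **kwargs)

    return wrapper


# A stand-in for a top-level function bound to the default generator.
_default = random.Random()
choice = warn_once(
    _default.choice,
    "choice() is using the default deterministic generator; "
    "choose a generator explicitly",  # hypothetical wording
)

# Record warnings to show that two calls produce a single warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    choice(["heads", "tails"])
    choice(["heads", "tails"])
```

UserWarning is the default category here because, unlike
DeprecationWarning, it is not hidden by the default warning filters.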

From abarnert at yahoo.com  Thu Sep 10 10:20:15 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 01:20:15 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>
 <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>
Message-ID: <9DC28BE8-4444-4D92-A72A-7AC945C90005@yahoo.com>

On Sep 10, 2015, at 00:35, Petr Viktorin <encukou at gmail.com> wrote:
> 
>> On Thu, Sep 10, 2015 at 3:30 AM, Donald Stufft <donald at stufft.io> wrote:
>> [...]
>> 
>> So I guess my suggestion would be, let's deprecate the module scope functions
>> and rename random.Random to random.DeterministicRandom. This absolves us of
>> needing to change the behavior of people's existing code (besides deprecating
>> it) and we don't need to decide if a userland CSPRNG is safe or not while still
>> moving us to a situation that is far more likely to have users doing the right
>> thing.
> 
> There is one use case that would be hit by that: the kid writing their
> first rock-paper-scissors game.
> A beginner who just learned the `if` statement isn't ready for a
> discussion of cryptography vs. reproducible results, and
> random.SystemRandom.random() would just become a magic incantation to
> learn. It would feel like requiring sys.stdout.write() instead of
> print().
> 
> Functions like paretovariate(), getstate(), or seed(), which require
> some understanding of (pseudo)randomness, can be moved to a specific
> class, but I don't think deprecating random(), randint(), randrange(),
> choice(), and shuffle() would be a good idea. Switching them to a
> cryptographically safe RNG is OK from this perspective, though.

Silently switching them could break a lot of code.

I don't think there's any way around making them warn the user that they need to do something. I think the patch I just sent is a good way of doing that: the minimum thing they need to do is a one-liner, which is explained in the warning, and it also gives them enough information to check the docs or google the message and get some understanding of the choice if they're at all inclined to do so. (And if they aren't, well, either one works for the use case you're talking about, so let them flip a coin, or call random.choice.;))

From mal at egenix.com  Thu Sep 10 10:26:23 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 10 Sep 2015 10:26:23 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
Message-ID: <55F13EAF.5040500@egenix.com>

Reading this thread is fun, but it doesn't seem to be getting
anywhere - perhaps that's part of the fun ;-)

Realistically, I see two options:

 1. Someone goes and implements the OpenBSD random function in C
    and puts a package up on PyPI, updating it whenever OpenBSD
    thinks that a new algorithm is needed or a security issue
    has to be fixed (from my experience with other crypto software
    like OpenSSL, this should be on the order of every 2-6 months ;-))

 2. Ditto, but we put the module in the stdlib and then run around
    issuing patch level security releases every 2-6 months.

Replacing our deterministic default PRNG with a non-deterministic
one doesn't really fly, since we'd break an important feature
of random.random(). You may remember that we already ran a similar
stunt with the string hash function, with very mixed results.

Calling the result of such a switch-over "secure" is even
worse, since it's a promise we cannot keep (probably not even
fully define). Better leave the promise at "insecure" - that's
something we can promise forever and don't have to define :-)

Regardless of what we end up with, I think Python land can do
better than name it "arc4random". We're great at bike shedding,
so how about we start the fun with "randomYMMV" :-)

Overall, I think having more options for good PRNGs is great.
Whether this "arc4random" is any good remains to be seen, but
given that OpenBSD developed it, chances are higher than
usual.

-- 
Marc-Andre Lemburg
eGenix.com


From abarnert at yahoo.com  Thu Sep 10 10:26:20 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 01:26:20 -0700
Subject: [Python-ideas] High time for a builtin function to manage
	packages (simply)?
In-Reply-To: <CAPTjJmrD+1k1EDF_buvx8qzdaf4C1oF5QnEh0u5mf0HcLmNy6A@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <CAPJVwBmziZufQL0YVYSFmEV7GmvYZXz2eim-vgoSH=H_nPC1jQ@mail.gmail.com>
 <87613j0xcm.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAPTjJmrD+1k1EDF_buvx8qzdaf4C1oF5QnEh0u5mf0HcLmNy6A@mail.gmail.com>
Message-ID: <DB79DA1C-5378-4FA2-8CC0-E16851BEFF02@yahoo.com>

On Sep 9, 2015, at 21:32, Chris Angelico <rosuav at gmail.com> wrote:
> 
>> On Thu, Sep 10, 2015 at 1:25 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> Nathaniel Smith writes:
>> 
>>> That seems more productive in the short run than trying to
>>> get everyone to stop typing "pip" :-).
>> 
>> FWIW, I did as soon as I realized python_i_want_to_install -m pip
>> worked; it's obvious that it DTRTs, and I felt like I'd just dropped
>> the hammer I'd been whacking my head with.
> 
> If the problem with this is the verbosity of it ("python -m pip
> install packagename" - five words), would there be benefit in blessing
> pip with some core interpreter functionality, allowing either:
> 
> $ python install packagename
> 
> or
> 
> $ python -p packagename
> 
> to do the one most common operation, installation? (And since it's new
> syntax, it could default to --upgrade, which would match the behaviour
> of other package managers like apt-get.)
> 
> Since the base command is "python", it automatically uses the same
> interpreter and environment as you otherwise would. It's less verbose
> than bouncing through -m. It gives Python the feeling of having an
> integrated package manager, which IMO wouldn't be a bad thing.
> 
> Of course, that wouldn't help with the 2.7 people, but it might allow
> the deprecation of the 'pip' wrapper. Would it actually help?
> 

What about leaving the pip wrapper, but having it display a banner telling people to use python -m pip (and maybe suggesting they add an alias to their profile, if not on Windows) and then do its thing as it currently does? (Maybe with some way to suppress the message if people want to say "I know what I'm doing; if my PATH is screwy I'll fix it".)

If we also add the python -p, it can instead suggest that if version >= (3, 6).

That seems like an easier way to get the message out there than trying to convince everyone to spread the word everywhere they teach anyone, or deprecating it and leaving people wondering what they're supposed to do instead.

From storchaka at gmail.com  Thu Sep 10 10:32:21 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 10 Sep 2015 11:32:21 +0300
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <8A294D36-C40F-405F-BB2E-94CD379B8165@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
 <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
 <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
 <8A294D36-C40F-405F-BB2E-94CD379B8165@yahoo.com>
Message-ID: <msrf6l$710$1@ger.gmane.org>

On 10.09.15 11:17, Andrew Barnert via Python-ideas wrote:
> On Sep 9, 2015, at 23:08, Chris Angelico <rosuav at gmail.com> wrote:
>> On Thu, Sep 10, 2015 at 11:50 AM, Andrew Barnert via Python-ideas
>> <python-ideas at python.org> wrote:
>>> Of course it adds the cost of making the module slower, and also more complex. Maybe a better solution would be to add a random.set_default_instance function that replaced all of the top-level functions with bound methods of the instance (just like what's already done at startup in random.py)? That's simple, and doesn't slow down anything, and it seems like it makes it more clear what you're doing than setting random.inst.
>>
>> +1. A single function call that replaces all the methods adds a
>> minuscule constant to code size, run time, etc, and it's no less
>> readable than assignment to a module attribute. (If anything, it makes
>> it more clearly a supported operation - I've seen novices not realize
>> that "module.xyz = foo" is valid, but nobody would misunderstand the
>> validity of a function call.)
>
> I was only half-serious about this, but now I think I like it: it provides exactly the fix people are hoping to fix by deprecating the top-level functions, but with less risk, less user code churn, a smaller patch, and a much easier fix for novice users. (And it's much better than my earlier suggestion, too.)
>
> See https://gist.github.com/abarnert/e0fced7569e7d77f7464 for the patch, and a patched copy of random.py. The source comments in the patch should be enough to understand everything that's changed.

This doesn't work with the idiom "from random import random".



From p.f.moore at gmail.com  Thu Sep 10 10:41:54 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 09:41:54 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
Message-ID: <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>

On 10 September 2015 at 01:01, Donald Stufft <donald at stufft.io> wrote:
> Essentially, there are three basic types of uses of random (the concept, not
> the module). Those are:
>
> 1. People/usecases who absolutely need deterministic output given a seed and
>    for whom security properties don't matter.
> 2. People/usecases who absolutely need a cryptographically random output and
>    for whom having a deterministic output is a downside.
> 3. People/usecases that fall somewhere in between where it may or may not be
>    security sensitive or it may not be known if it's security sensitive.

Wrong.

There is a fourth basic type. People (like me!) whose code absolutely
doesn't have any security issues, but want a simple, convenient, fast
RNG. Determinism is not an absolute requirement, but is very useful
(for writing tests, maybe, or for offering a deterministic rerun
option to the program). Simulation-style games often provide a way to
find the "map seed", which allows users to share interesting maps -
this is non-essential but a big quality-of-life benefit in such games.

IMO, the current module perfectly serves this fourth group.
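
A minimal sketch of the "map seed" use case described above, assuming a private random.Random instance is acceptable. The function name make_map and the tile characters are made up for illustration; only random.Random and its choice method come from the stdlib.

```python
import random


def make_map(seed, width=8, height=4):
    """Build a small tile map whose layout depends only on the seed."""
    rng = random.Random(seed)  # private generator, global state untouched
    tiles = ".~^"  # hypothetical tiles: plain, water, mountain
    return ["".join(rng.choice(tiles) for _ in range(width))
            for _ in range(height)]


first = make_map(2015)
second = make_map(2015)  # same seed, so the same map can be shared

print("\n".join(first))
```

Sharing the integer seed is enough for another user to rebuild the same map, which is the quality-of-life benefit mentioned above.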

While I accept your point that far too many people are using insecure
RNGs in "generate a random password" scripts, they are *not* the core
target audience of the default module-level functions in the random
module (did you find any examples of insecure use that *weren't*
password generators?). We should educate people that this is bad
practice, not change the module. Also, while it may be imperfect, it's
still better than what many people *actually* do, which is to use
"password" as a password on sensitive systems :-(

Maybe what Python *actually* needs is a good-quality "random password
generator" module in the stdlib? (Semi-serious suggestion...)

Paul

From wolfgang.maier at biologie.uni-freiburg.de  Thu Sep 10 10:58:04 2015
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 10 Sep 2015 10:58:04 +0200
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <461D4C7C-6C32-480D-B065-295A623E11D7@yahoo.com>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
 <55F0AC99.8030408@biologie.uni-freiburg.de>
 <C2DA14A1-DC2F-4F66-B4D6-EB3D63824A63@yahoo.com>
 <1441841095.2587236.379354345.340FCF95@webmail.messagingengine.com>
 <461D4C7C-6C32-480D-B065-295A623E11D7@yahoo.com>
Message-ID: <55F1461C.70607@biologie.uni-freiburg.de>

On 10.09.2015 02:03, Andrew Barnert via Python-ideas wrote:
> On Sep 9, 2015, at 16:24, random832 at fastmail.us wrote:
>>
>>> On Wed, Sep 9, 2015, at 18:39, Andrew Barnert via Python-ideas wrote:
>>> I believe he posted a more detailed version of the idea on one of the
>>> other spinoff threads from the f-string thread, but I don't have a link.
>>> But there are lots of possibilities, and if you want to start
>>> bikeshedding, it doesn't matter that much what his original color was.
>>> For example, here's a complete proposal:
>>>
>>>     class MyJoiner:
>>>         def __init__(self, value):
>>>             self.value = value
>>>         def __format__(self, spec):
>>>             return spec.join(map(str, self.value))
>>>     string.register_converter('join', MyJoiner)
>>
>> Er, I wanted it to be something more like
>>
>> def __format__(self, spec):
>>    sep, fmt = # 'somehow' break up spec into two parts
>
> I covered later in the same message how this simple version could be extended to a smarter version that does that, or even more, without requiring any further changes to str.format. I just wanted to show the simplest version first, and then show that designing for that doesn't lose any flexibility.
>

Ok, I think I got the idea.
One question though: how would you prevent this from getting completely
out of hand?

> And meanwhile, the alternative seems to be having something similar, but not exposing it publicly, and just baking in a handful of hardcoded converters for join, html, re-escape, etc., and I don't see why str should know about all of those things, or why extending that set when we realize that we forgot about shlex should require a patch to str and a new Python version.
>
>> The Joiner class wouldn't have to exist as a builtin, it could be
>> private to the format function.
>
> If it's custom-registerable, it can be on PyPI, or in the middle of your app, although of course there could be some converters, maybe including your Joiner, somewhere in the stdlib, or even private to format, as well.
>

The strength of this idea - flexibility - could also be called its 
biggest weakness and that is scaring me. Essentially, such converters 
would be completely free to do anything they want: change their input at 
will, return something completely unrelated, have side-effects. All of 
that hidden behind a simple !token in a replacement field.
While the idea is really cool and certainly powerful if used 
responsibly, it could also create completely unreadable code.

Just adding one single hardcoded converter for joining iterables looks 
like a much more reasonable and realistic idea and now that I understand 
the concept I have to say I really like it.

Just paraphrasing once more to see if I understood things correctly this
time:
The !j converter converts the iterable to an instance of a Joiner class 
just like !s, !r and !a convert to a str instance. After that conversion 
the __format__ method of the new object gets called with the format_spec 
string (which specifies the separator and the inner format spec) as 
argument and that method produces the joined string.

So everything follows the existing logic of a converter and no really 
new replacement field syntax is required. Great and +1!
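
The paraphrase above can be sketched roughly as follows. No !j converter exists in str.format, so the wrapping is done by hand here, and the "|" used to split the separator from the inner format spec is purely an assumption for illustration; the thread leaves that syntax open.

```python
class Joiner:
    """Wrap an iterable so that __format__ joins its formatted items."""

    def __init__(self, iterable):
        self.iterable = iterable

    def __format__(self, spec):
        # Assumed convention: "<separator>|<inner format spec>".
        # Without a "|" the whole spec is the separator.
        sep, _, inner = spec.partition("|")
        return sep.join(format(item, inner) for item in self.iterable)


# What a hypothetical "{!j:, |.2f}" would do, spelled out explicitly:
line = "{:, |.2f}".format(Joiner([1, 2.5, 3]))
print(line)
```

The wrapper follows the same two-step logic as !s or !r: convert first, then hand the format_spec to the __format__ method of the converted object.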


From tritium-list at sdamon.com  Thu Sep 10 11:20:48 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Thu, 10 Sep 2015 05:20:48 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>
 <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>
Message-ID: <55F14B70.2080901@sdamon.com>

Can I just ask what is the actual problem we are trying to solve here?

Python has third party cryptography modules, that bring their own 
sources of randomness (or cryptography libraries that do the same).

Python has a good random library for everything other than cryptography.

Why in the heck are we trying to make the random module do something 
that it is already documented as being a poor choice for, when there are 
already third party modules that do just this?

Who needs cryptographic randomness in the standard library anyways (even 
though one line of code gives you access to it)?  Have we identified even 
ONE person who does cryptography in python who is kicking themselves 
that they can't use the random module as implemented?

Is this just indulging a paranoid developer?

From stephen at xemacs.org  Thu Sep 10 11:41:28 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 10 Sep 2015 18:41:28 +0900
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <7A4752B7-90B8-49CE-9EF6-FF182CE0411D@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
 <874mj30x14.fsf@uwakimon.sk.tsukuba.ac.jp>
 <7A4752B7-90B8-49CE-9EF6-FF182CE0411D@yahoo.com>
Message-ID: <871te61uif.fsf@uwakimon.sk.tsukuba.ac.jp>

Andrew Barnert writes:

 > No, that's not the problem. Lion came with 2.7.1, so you already
 > had it before upgrading it, and it's hard to imagine Apple
 > upgrading your system 2.7.1 to 2.7.6 or 2.7.10 broke anything. More
 > likely, Apple screwed up your PATH, or broke your MacPorts so you
 > had to reinstall or repair it?

I've had no problems with PATH, personally.

I'm just saying that learning that pip was actually version-specific,
and then getting the right pip for the current Python of interest, has
been an annoyance for me over the years, and I was very happy to
switch to "python -m pip" because it Just Works.
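
For scripts, the same trick can be sketched like this, assuming sys.executable points at the interpreter of interest (it can be empty in some embedded setups). The helper name pip_command is made up; the command is only built here, not run.

```python
import sys


def pip_command(*args):
    """Build a pip invocation tied to the interpreter running this code."""
    return [sys.executable, "-m", "pip", *args]


command = pip_command("install", "--upgrade", "packagename")
print(" ".join(command))
```

The resulting list can be handed to subprocess.check_call, so whichever "pip" happens to be first on PATH never enters the picture.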

As far as the question of order of installation, I just wanted to
point out that system upgrades do sometimes catch up to the user,
resulting in duplicate installations, rather than the user following
some blog to the letter and installing a version they don't need.

 > I'd assume most people on this list know what they're doing with
 > their PATH. If you don't, then you just got lucky for a few years.

For me, PATH is easy.  <python> -m pip is easy.  <pip> is hard. :-/

 > and if you didn't even realize you were running multiple Python 2.7
 > versions in parallel, that just means you never tried anything that
 > MacPorts didn't anticipate.

No, it just means that since forever my personal PATH has been set up
to give precedence to /usr/local/bin and /opt/local/bin, and since the
days of Python 2 I avoid the system Python at all costs.
Specifically, I never invoke Python without a full 2-digit version
number (except in venvs), and my shebangs specify it too (ditto).  (It
works out to the same semantics as "not surprising MacPorts", of
course.)

From luciano at ramalho.org  Thu Sep 10 12:01:35 2015
From: luciano at ramalho.org (Luciano Ramalho)
Date: Thu, 10 Sep 2015 07:01:35 -0300
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
Message-ID: <CALxg4FVf_g8_v9XxRWAS2Z-hgA0z23zQ3-VxZE2kXmoQ1F1RQA@mail.gmail.com>

Jukka, thank you very much for working on such a hard topic and being
patient enough to respond to issues that I am sure were exhaustively
discussed before (but I was not following the discussions then since I
was in the final sprint for my book, Fluent Python, at the time).

I have two questions which were probably already asked before, so feel
free to point me to relevant past messages:

1) Why is a whole new hierarchy of types being created in the typing
module, instead of continuing the hierarchy in the collections module
while enhancing the ABCs already there? For example, why weren't the
List and Dict types created under the existing MutableSequence and
MutableMapping types in collections.abc?

2) Similarly, I note that PEP-484 shuns existing ABCs like those in
the numbers module, and the ByteString ABC. The reasons given are
pragmatic, so that users don't need to import the numbers module, and
would not "have to write typing.ByteString everywhere." as the PEP
says... I do not understand these arguments because:

a) as you just wrote in another message, the users will be primarily
the authors of libraries and frameworks, who will always be forced to
import typing anyhow, so it does not seem such a burden to have them
import other modules to get the benefits of type hinting;
b) alternatively, there could be aliases of the relevant ABCs in the
typing module for convenience

So the second question is: what's wrong with points (a) and (b), and
why did PEP-484 keep such a distance from existing ABCs in general?

I understand pragmatic choices, but as a teacher and writer I know
such choices are often obstacles to learning because they seem
arbitrary to anyone who is not privy to the reasons behind them. So
I'd like to better understand the reasoning, and I think PEP-484 is
not very persuasive when it comes to the issues I mentioned.

Thanks!

Best,

Luciano

-- 
Luciano Ramalho
|  Author of Fluent Python (O'Reilly, 2015)
|     http://shop.oreilly.com/product/0636920032519.do
|  Professor em: http://python.pro.br
|  Twitter: @ramalhoorg

From abarnert at yahoo.com  Thu Sep 10 12:27:23 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 03:27:23 -0700
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55F1461C.70607@biologie.uni-freiburg.de>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <5B23496D-6DBD-49B3-91D7-E093309A84C7@yahoo.com>
 <55F0AC99.8030408@biologie.uni-freiburg.de>
 <C2DA14A1-DC2F-4F66-B4D6-EB3D63824A63@yahoo.com>
 <1441841095.2587236.379354345.340FCF95@webmail.messagingengine.com>
 <461D4C7C-6C32-480D-B065-295A623E11D7@yahoo.com>
 <55F1461C.70607@biologie.uni-freiburg.de>
Message-ID: <79AE4A57-B698-45A3-84F5-DC65E03C25CA@yahoo.com>

On Sep 10, 2015, at 01:58, Wolfgang Maier <wolfgang.maier at biologie.uni-freiburg.de> wrote:
> 
>> On 10.09.2015 02:03, Andrew Barnert via Python-ideas wrote:
>>> On Sep 9, 2015, at 16:24, random832 at fastmail.us wrote:
>>> 
>>>> On Wed, Sep 9, 2015, at 18:39, Andrew Barnert via Python-ideas wrote:
>>>> I believe he posted a more detailed version of the idea on one of the
>>>> other spinoff threads from the f-string thread, but I don't have a link.
>>>> But there are lots of possibilities, and if you want to start
>>>> bikeshedding, it doesn't matter that much what his original color was.
>>>> For example, here's a complete proposal:
>>>> 
>>>>    class MyJoiner:
>>>>        def __init__(self, value):
>>>>            self.value = value
>>>>        def __format__(self, spec):
>>>>            return spec.join(map(str, self.value))
>>>>    string.register_converter('join', MyJoiner)
>>> 
>>> Er, I wanted it to be something more like
>>> 
>>> def __format__(self, spec):
>>>   sep, fmt = # 'somehow' break up spec into two parts
>> 
>> I covered later in the same message how this simple version could be extended to a smarter version that does that, or even more, without requiring any further changes to str.format. I just wanted to show the simplest version first, and then show that designing for that doesn't lose any flexibility.
> 
> Ok, I think I got the idea.
> One question though: how would you prevent this from getting completely out of hand?

Same way we keep types with weird __format__ methods, nested or multi-clause comprehensions, import hooks, operator overloads like using __ror__ to partial functions, metaclasses, subclass hooks, multiple inheritance, dynamic method lookup, descriptors, etc. from getting completely out of hand: trust users to have some taste, and don't write bad documentation that would convince them to abuse it. :)

>> And meanwhile, the alternative seems to be having something similar, but not exposing it publicly, and just baking in a handful of hardcoded converters for join, html, re-escape, etc., and I don't see why str should know about all of those things, or why extending that set when we realize that we forgot about shlex should require a patch to str and a new Python version.
>> 
>>> The Joiner class wouldn't have to exist as a builtin, it could be
>>> private to the format function.
>> 
>> If it's custom-registerable, it can be on PyPI, or in the middle of your app, although of course there could be some converters, maybe including your Joiner, somewhere in the stdlib, or even private to format, as well.
> 
> The strength of this idea - flexibility - could also be called its biggest weakness and that is scaring me. Essentially, such converters would be completely free to do anything they want: change their input at will, return something completely unrelated, have side-effects. All of that hidden behind a simple !token in a replacement field.
> While the idea is really cool and certainly powerful if used responsibly, it could also create completely unreadable code.

There aren't any obvious reasons for anyone to write such unreadable code, so I don't see it being a real attractive nuisance.

> Just adding one single hardcoded converter for joining iterables looks like a much more reasonable and realistic idea and now that I understand the concept I have to say I really like it.
> 
> Just paraphrasing once more to see if I understood things correctly this time:
> The !j converter converts the iterable to an instance of a Joiner class just like !s, !r and !a convert to a str instance. After that conversion the __format__ method of the new object gets called with the format_spec string (which specifies the separator and the inner format spec) as argument and that method produces the joined string.
> 
> So everything follows the existing logic of a converter and no really new replacement field syntax is required. Great and +1!

Yep, and I'm +1 on it as well.

But I'm also at least +0.5 on the custom converter idea, because joining is the fourth idea people have come up with for converters in the past few weeks, and I'd bet there are another few widely-usable ideas, plus some good uses for specific applications (different web frameworks, scientific computing, etc.). When I get a chance, I'll hack something up to play with it and see if it's as useful as I'm expecting.

From abarnert at yahoo.com  Thu Sep 10 12:33:08 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 03:33:08 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <msrf6l$710$1@ger.gmane.org>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
 <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
 <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
 <8A294D36-C40F-405F-BB2E-94CD379B8165@yahoo.com> <msrf6l$710$1@ger.gmane.org>
Message-ID: <178619C8-5587-4069-AC6F-D7AC8A65C6CD@yahoo.com>

On Sep 10, 2015, at 01:32, Serhiy Storchaka <storchaka at gmail.com> wrote:
> 
>> On 10.09.15 11:17, Andrew Barnert via Python-ideas wrote:
>>> On Sep 9, 2015, at 23:08, Chris Angelico <rosuav at gmail.com> wrote:
>>> On Thu, Sep 10, 2015 at 11:50 AM, Andrew Barnert via Python-ideas
>>> <python-ideas at python.org> wrote:
>>>> Of course it adds the cost of making the module slower, and also more complex. Maybe a better solution would be to add a random.set_default_instance function that replaced all of the top-level functions with bound methods of the instance (just like what's already done at startup in random.py)? That's simple, and doesn't slow down anything, and it seems like it makes it more clear what you're doing than setting random.inst.
>>> 
>>> +1. A single function call that replaces all the methods adds a
>>> minuscule constant to code size, run time, etc, and it's no less
>>> readable than assignment to a module attribute. (If anything, it makes
>>> it more clearly a supported operation - I've seen novices not realize
>>> that "module.xyz = foo" is valid, but nobody would misunderstand the
>>> validity of a function call.)
>> 
>> I was only half-serious about this, but now I think I like it: it provides exactly the fix people are hoping to fix by deprecating the top-level functions, but with less risk, less user code churn, a smaller patch, and a much easier fix for novice users. (And it's much better than my earlier suggestion, too.)
>> 
>> See https://gist.github.com/abarnert/e0fced7569e7d77f7464 for the patch, and a patched copy of random.py. The source comments in the patch should be enough to understand everything that's changed.
> 
> This doesn't work with the idiom "from random import random".

Well, the goal of the deprecation idea was to eventually get people to explicitly use instances, so the fact that it doesn't work out of the box is a good thing, not a problem.

But for people just trying to retrofit existing code, all they have to do is call random.set_default_instance at the top of the main module, and all their other modules can just import what they need this way. Which is why it's better than straightforward deprecation.
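
A rough sketch of why the order matters. This set_default_instance is a stand-in written for illustration, not the code from the gist; it rebinds only a few names and assumes the idea is simply "replace the top-level functions with bound methods of the instance". The module is restored at the end so the demonstration leaves nothing behind.

```python
import random

_NAMES = ("random", "randint", "choice", "shuffle")  # subset, for brevity


def set_default_instance(inst):
    """Hypothetical helper: rebind random's top-level functions to inst."""
    for name in _NAMES:
        setattr(random, name, getattr(inst, name))


saved = {name: getattr(random, name) for name in _NAMES}

from random import random as early_random   # bound before the switch
set_default_instance(random.SystemRandom())
from random import random as late_random    # bound after the switch

# A from-import copies whatever the module attribute was at import time.
early_is_secure = isinstance(early_random.__self__, random.SystemRandom)
late_is_secure = isinstance(late_random.__self__, random.SystemRandom)
print(early_is_secure, late_is_secure)

# Put the module back the way it was.
for name, func in saved.items():
    setattr(random, name, func)
```

So "from random import random" keeps working as long as the call happens in the main module before the other modules are imported; a name imported earlier stays bound to the old generator.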

From donald at stufft.io  Thu Sep 10 13:26:56 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 07:26:56 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
Message-ID: <etPan.55f16900.37960d0.31bc@Draupnir.home>

On September 10, 2015 at 4:41:56 AM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 10 September 2015 at 01:01, Donald Stufft wrote:
> > Essentially, there are three basic types of uses of random (the concept, not
> > the module). Those are:
> >
> > 1. People/usecases who absolutely need deterministic output given a seed and
> > for whom security properties don't matter.
> > 2. People/usecases who absolutely need a cryptographically random output and
> > for whom having a deterministic output is a downside.
> > 3. People/usecases that fall somewhere in between where it may or may not be
> > security sensitive or it may not be known if it's security sensitive.
>  
> Wrong.
>  
> There is a fourth basic type. People (like me!) whose code absolutely
> doesn't have any security issues, but want a simple, convenient, fast
> RNG. Determinism is not an absolute requirement, but is very useful
> (for writing tests, maybe, or for offering a deterministic rerun
> option to the program). Simulation-style games often provide a way to
> find the "map seed", which allows users to share interesting maps -
> this is non-essential but a big quality-of-life benefit in such games.

This group is the same as #3 except for the map seed thing which is
group #1. In particular, it wouldn't hurt you if the random you were
using was cryptographically secure as long as it was fast, and if you
needed determinism, it wouldn't hurt you to say so. Which is the point
that Theo was making.

>  
> IMO, the current module perfectly serves this fourth group.

Making the user pick between Deterministic and Secure random would serve
this purpose too, especially in a language where "In the face of ambiguity,
refuse the temptation to guess" is one of the core tenets. The largest
downside would be typing a few extra characters, and Python is not a
language that attempts to do things in the fewest number of characters.

>  
> While I accept your point that far too many people are using insecure
> RNGs in "generate a random password" scripts, they are *not* the core
> target audience of the default module-level functions in the random
> module (did you find any examples of insecure use that *weren't*
> password generators?). We should educate people that this is bad
> practice, not change the module. Also, while it may be imperfect, it's
> still better than what many people *actually* do, which is to use
> "password" as a password on sensitive systems :-(

You cannot document your way out of a UX problem.

The problem isn't people doing this once on the command line to generate
a password, the problem is people doing it in applications where they
generate an API key, a session identifier, or a random password which they
then give to their users. If you give an attacker a way to get the output
of the MT base random enough times, it can be used to determine what every
random it generated was and will be.
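
To make that concrete, here is a sketch of the well-known state-recovery trick against CPython's Mersenne Twister, assuming the attacker sees 624 consecutive full 32-bit outputs. The helper names are made up; the shift and mask constants are the standard MT19937 tempering parameters. Only future outputs are predicted here; running the recurrence backwards is left out.

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor_mask(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    """Recover the raw state word behind one 32-bit MT19937 output."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor_mask(y, 15, 0xEFC60000)
    y = undo_left_shift_xor_mask(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


def clone_generator(outputs):
    """Build a random.Random that continues where the observed one left off."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    state = tuple(untemper(y) for y in outputs)
    clone = random.Random()
    # Version 3 state: 624 words plus the index; 624 forces a regeneration.
    clone.setstate((3, state + (624,), None))
    return clone


victim = random.Random(20150910)
observed = [victim.getrandbits(32) for _ in range(624)]

clone = clone_generator(observed)
predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predicted == actual)
```

Real applications rarely hand out raw 32-bit words, so an actual attack needs more bookkeeping, but the generator offers no protection once enough output has leaked.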

Here's a game a friend of mine created where the purpose of the game is
to essentially unrandomize some random data, which is only possible
because it's (purposely) using MT:
https://github.com/reaperhulk/dsa-ctf. This is not an ivory tower paranoia
case, it's a real concern, and the change will absolutely fix some insecure
software out there instead of telling its authors "welp typing a little bit
extra once an import is too much of a burden for me and really it's your
own fault anyways".

>
> Maybe what Python *actually* needs is a good-quality "random password
> generator" module in the stdlib? (Semi-serious suggestion...)
>  
> Paul
>  

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From donald at stufft.io  Thu Sep 10 13:40:41 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 07:40:41 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <55F14B70.2080901@sdamon.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>
 <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>
 <55F14B70.2080901@sdamon.com>
Message-ID: <etPan.55f16c39.8659361.31bc@Draupnir.home>

On September 10, 2015 at 5:21:29 AM, Alexander Walters (tritium-list at sdamon.com) wrote:
> Why in the heck are we trying to make the random module do something
> that it is already documented as being a poor choice, where there is
> already third party modules that do just this?
>
> Who needs cryptographic randomness in the standard library anyways (even
> though one line of code give you access to it)? Have we identified even
> ONE person who does cryptography in python who is kicking themselves
> that they cant use the random module as implemented?

Because there are situations where you need securely generated randomness
where you are *NOT* "doing cryptography". Blaming people for the fact that the
random module has a bad UX that naturally leads them to use it when it isn't
appropriate is a shitty thing to do.

What harm is there in making people explicitly choose between deterministic
randomness and secure randomness? Is your use case so much better than theirs
that you think you deserve to type a few characters less to the detriment of
people who don't know any better?

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From p.f.moore at gmail.com  Thu Sep 10 14:29:13 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 13:29:13 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f16900.37960d0.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
Message-ID: <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>

On 10 September 2015 at 12:26, Donald Stufft <donald at stufft.io> wrote:
>> There is a fourth basic type. People (like me!) whose code absolutely
>> doesn't have any security issues, but want a simple, convenient, fast
>> RNG. Determinism is not an absolute requirement, but is very useful
>> (for writing tests, maybe, or for offering a deterministic rerun
>> option to the program). Simulation-style games often provide a way to
>> find the "map seed", which allows users to share interesting maps -
>> this is non-essential but a big quality-of-life benefit in such games.
>
> This group is the same as #3 except for the map seed thing which is
> group #1. In particular, it wouldn't hurt you if the random you were
> using was cryptographically secure as long as it was fast and if you
> needed determinism, it would hurt you to say so. Which is the point
> that Theo was making.

I don't understand the phrase "if you needed determinism, it would
hurt you to say so". Could you clarify?

>>
>> IMO, the current module perfectly serves this fourth group.
>
> Making the user pick between Deterministic and Secure random would serve
> this purpose too, especially in a language where "In the face of ambiguity,
> refuse the temptation to guess" is one of the core tenets of the language. The
> largest downside would be typing a few extra characters, which Python is not
> a language that attempts to do things in the fewest number of characters.

And yet I know that I would routinely, and (this is the problem)
without thinking, choose Deterministic, because I know that my use
cases all get a (small) benefit from being able to capture the seed,
but I also know I'm not doing security-related stuff.

No amount of making me choose is going to help me spot security
implications that I've missed.

And also, calling the non-crypto choice "Deterministic" is unhelpful,
because I *don't* want something deterministic, I want something
random (I understand PRNGs aren't truly random, but "good enough for
my purposes" is what I want, and "deterministic" reads to me as saying
it's *not* good enough...)

>> While I accept your point that far too many people are using insecure
>> RNGs in "generate a random password" scripts, they are *not* the core
>> target audience of the default module-level functions in the random
>> module (did you find any examples of insecure use that *weren't*
>> password generators?). We should educate people that this is bad
>> practice, not change the module. Also, while it may be imperfect, it's
>> still better than what many people *actually* do, which is to use
>> "password" as a password on sensitive systems :-(
>
> You cannot document your way out of a UX problem.

What I'm trying to say is that this is an education problem more than
a UX problem.

Personally, I think I know enough about security for my (not a
security specialist) purposes. To that extent, if I'm working on
something with security implications, I'm looking for things that say
"Crypto" in the name. The rest of the time, I just use non-specialist
stuff. It's a similar situation to that of the "statistics" module. If
I'm doing "proper" maths, I'd go for numpy/scipy. If I just want some
averages and I'm not bothered about numerical stability, rounding
behaviour, etc, I'd go for the stdlib statistics package.

> The problem isn't people doing this once on the command line to generate
> a password, the problem is people doing it in applications where they
> generate an API key, a session identifier, a random password which they
> then give to their users. If you give a way to get the output of the MT
> base random enough times, it can be used to determine what every random
> it generated was and will be.

To me, that's crypto and I'd look to the cryptography module, or to
something in the stdlib that explicitly said it was suitable for
crypto.

Saying people write bad code isn't enough - how does the current
module *encourage* them to write bad code? How much API change must we
allow to cater for people who won't read the statement in the docs (in
a big red box) "Warning: The pseudo-random generators of this module
should not be used for security purposes." (Specifically people
writing security related code who won't read the docs).

> Here's a game a friend of mine created where the purpose of the game is
> to essentially unrandomize some random data, which is only possible
> because it's (purposely) using MT to make it possible
> https://github.com/reaperhulk/dsa-ctf. This is not an ivory tower paranoia
> case, it's a real concern that will absolutely fix some insecure software
> out there instead of telling them "welp typing a little bit extra once
> an import is too much of a burden for me and really it's your own fault
> anyways".

I don't understand how that game (which is an interesting way of
showing people how attacks on crypto work, sure, but that's just
education, which you dismissed above) relates to the issue here.

And I hope you don't really think that your quote is even remotely
what I'm trying to say (I'm not that selfish) - my point is that not
everything is security related. Not every application people write,
and not every API in the stdlib. You're claiming that the random
module is security related. I'm claiming it's not, it's documented as
not being, and that's clear to the people who use it for its intended
purpose. Telling those people that you want to make a module designed
for their use harder to use because people for whom it's not intended
can't read the documentation which explicitly states that it's not
suitable for them, is doing a disservice to those people who are
already using the module correctly for its stated purpose.

By the same argument, we should remove the statistics module because
it can be used by people with numerically unstable problems. (I doubt
you'll find StackOverflow questions along these lines yet, but that's
only because (a) the module's pretty new, and (b) it actually works
pretty hard to handle the hard corner cases, but I bet they'll start
turning up in due course, if only from the people who don't understand
floating point...)

Paul

From contact at ionelmc.ro  Thu Sep 10 14:07:14 2015
From: contact at ionelmc.ro (=?UTF-8?Q?Ionel_Cristian_M=C4=83rie=C8=99?=)
Date: Thu, 10 Sep 2015 15:07:14 +0300
Subject: [Python-ideas] PyPI search still broken
In-Reply-To: <20150909230130.GA14415@k3>
References: <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
 <etPan.55f0a3b8.4e506bc7.31bc@Draupnir.home>
 <httdkl4b7jlougelcxey1srm.1441837002659@email.android.com>
 <20150909230130.GA14415@k3>
Message-ID: <CANkHFr9_3xwri2+69TAA1fCFRwK+iRN7Fe52XiKpO5AJAxLJcQ@mail.gmail.com>

Wouldn't it be better if you'd just build an external search service?
Getting a list of packages and descriptions should be possible no? (just
asking, not 100% sure)

I doubt the maintainers are just going to come out and say "ok, this guy
has waited long enough, let's take his contribution in". If they didn't care
about the search 2.5 years ago why would they care now?

Sorry for being snide here but my impression is that Warehouse could have
been shipped a while ago instead of getting rewritten several times. I'm not
saying that's bad, it's just that there's a mismatch in goals here.


Thanks,
-- Ionel Cristian Mărieș

On Thu, Sep 10, 2015 at 2:01 AM, David Wilson <dw+python-ideas at hmmz.org>
wrote:

> Hi there,
>
> My 2.5 year old offer to retrofit the old codebase with a new search
> system still stands[1]. :)  There is no reason for this to be a complex
> affair, the prototype built back then took only a few hours to complete.
>
> No doubt the long term answer is probably "Warehouse fixes this", but
> Warehouse seems no nearer a reality than it did in 2013.
>
>
> David
>
> [1]
> https://groups.google.com/forum/#!search/%22david$20wilson%22$20search$20pypi/pypa-dev/ZjUNkczsKos/2et8926YOQYJ
>
> On Thu, Sep 10, 2015 at 12:35:04AM +0200, Giovanni Cannata wrote:
> > Hi, sorry to bother you again, but the search problem on PyPI is still
> > present after different weeks and it's very annoying. I've just released
> > a new version of my ldap3 project and it doesn't show up when searching
> > with its name. For mine (and I suppose for other emerging project,
> > especially related to Python 3) it's vital to be easily found by other
> > developers that use pip and PyPI as THE only repository for python
> > packages and using the number of download as a ranking of popularity of
> > a project.
> >
> > If search can't be fixed there should be at least a warning on the PyPI
> > homepage to let users know that search is broken and that using Google
> > for searching could help to find more packages.
> >
> > Bye,
> > Giovanni
>
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
>

From random832 at fastmail.us  Thu Sep 10 15:02:43 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Thu, 10 Sep 2015 09:02:43 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F0BF61.6050205@canterbury.ac.nz>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
Message-ID: <1441890163.3120507.379846857.49842A96@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 19:23, Greg Ewing wrote:
> Another property that's important for some applications is
> to be able to efficiently "jump ahead" some number of steps
> in the sequence, to produce multiple independent streams of
> numbers. It would be good to know if that is possible with
> arc4random.

Being able to produce multiple independent streams of numbers is the
important feature. Doing it by "jumping ahead" seems less so. And the
need for doing it "efficiently" isn't as clear either - how many streams
do you need?
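
One simple way to get several reproducible streams without any jump-ahead
support is to seed a separate generator per stream. This is only a sketch:
make_streams is a hypothetical helper, and separately seeded Mersenne Twister
instances come with no formal guarantee that their sequences never overlap,
so treat it as a convenience and not as a proof of independence.

```python
import random

def make_streams(master_seed, count):
    """Return `count` seeded generators derived from one master seed.

    Each stream gets its own label. CPython's str seeding (version 2)
    mixes a SHA-512 digest of the string into the seed, so labels that
    differ by one character still give unrelated starting states.
    """
    return [random.Random(f"{master_seed}/stream-{i}") for i in range(count)]

streams = make_streams("experiment-42", 3)
first_draws = [rng.random() for rng in streams]

# Rebuilding from the same master seed reproduces every stream.
replay = [rng.random() for rng in make_streams("experiment-42", 3)]
print(first_draws == replay)
```

With a generator backed by the operating system (random.SystemRandom) the
question does not arise in the same way, since there is no seed or sequence
position to manage, but the streams are then not reproducible either.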

From donald at stufft.io  Thu Sep 10 15:10:09 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 09:10:09 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
Message-ID: <etPan.55f18131.392f7558.31bc@Draupnir.home>

On September 10, 2015 at 8:29:16 AM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 10 September 2015 at 12:26, Donald Stufft wrote:
> >> There is a fourth basic type. People (like me!) whose code absolutely
> >> doesn't have any security issues, but want a simple, convenient, fast
> >> RNG. Determinism is not an absolute requirement, but is very useful
> >> (for writing tests, maybe, or for offering a deterministic rerun
> >> option to the program). Simulation-style games often provide a way to
> >> find the "map seed", which allows users to share interesting maps -
> >> this is non-essential but a big quality-of-life benefit in such games.
> >
> > This group is the same as #3 except for the map seed thing which is
> > group #1. In particular, it wouldn't hurt you if the random you were
> > using was cryptographically secure as long as it was fast and if you
> > needed determinism, it would hurt you to say so. Which is the point
> > that Theo was making.
>  
> I don't understand the phrase "if you needed determinism, it would
> hurt you to say so". Could you clarify?

I transposed some words, fixed:

"If you needed determinism, would it hurt you to say so?"

Essentially, other than typing a little bit more, why is:

    import random
    print(random.choice(["a", "b", "c"]))

better than

    import random
    print(random.DeterministicRandom().choice(["a", "b", "c"]))

As far as I can tell, you've made your code and what properties it has much
clearer to someone reading it at the cost of 22 characters. If you're going to
reuse the DeterministicRandom class you can assign it to a variable and
actually end up saving characters if the variable you save it to can be
accessed at less than 6 characters.
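
As a rough sketch of what the explicit spelling could look like:
DeterministicRandom and SecureRandom below are hypothetical names taken from
this discussion, not existing stdlib classes. They are defined here as thin
subclasses of random.Random and random.SystemRandom, which do exist today.

```python
import random

# Hypothetical spellings from this thread; neither name exists in the
# stdlib. Both are thin wrappers over classes that do exist today.
class DeterministicRandom(random.Random):
    """Seedable, replayable generator (Mersenne Twister in CPython)."""

class SecureRandom(random.SystemRandom):
    """Generator backed by os.urandom(); cannot be seeded or replayed."""

rng = DeterministicRandom(2015)     # assigned once, reused afterwards
print(rng.choice(["a", "b", "c"]))

srng = SecureRandom()
print(srng.choice(["a", "b", "c"]))
```

Either way the reader of the calling code can see which property was asked
for without knowing what the module-level default happens to be.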

>  
> >>
> >> IMO, the current module perfectly serves this fourth group.
> >
> > Making the user pick between Deterministic and Secure random would serve
> > this purpose too, especially in a language where "In the face of ambiguity,
> > refuse the temptation to guess" is one of the core tenets of the language. The
> > largest downside would be typing a few extra characters, which Python is not
> > a language that attempts to do things in the fewest number of characters.
>  
> And yet I know that I would routinely, and (this is the problem)
> without thinking, choose Deterministic, because I know that my use
> cases all get a (small) benefit from being able to capture the seed,
> but I also know I'm not doing security-related stuff.
>  
> No amount of making me choose is going to help me spot security
> implications that I've missed.

You're allowed to pick DeterministicRandom, you're even allowed to do it
without thinking. This isn't about making it impossible to ever insecurely use
random numbers, that's obviously a boil the ocean level of problem, this is
about trying to make it more likely that someone won't be hit by a fairly easy
to hit footgun if it does matter for them, even if they don't know it. It's
also about making code that is easier to understand on the surface, for example
without using the prior knowledge that it's using MT, tell me how you'd know
if this was safe or not:

    import random
    import string
    password = "".join(random.choice(string.ascii_letters) for _ in range(9))
    print("Your random password is", password)
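
One way to make the snippet above safe regardless of what backs the
module-level functions is to name the generator explicitly. A minimal sketch
using random.SystemRandom, which already exists in the stdlib and draws from
os.urandom(); the variable name secure_rng is just illustrative:

```python
import random
import string

# SystemRandom exposes the same methods as the module-level functions,
# but every call reads from os.urandom().
secure_rng = random.SystemRandom()
password = "".join(secure_rng.choice(string.ascii_letters) for _ in range(9))
print("Your random password is", password)
```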


>  
> And also, calling the non-crypto choice "Deterministic" is unhelpful,
> because I *don't* want something deterministic, I want something
> random (I understand PRNGs aren't truly random, but "good enough for
> my purposes" is what I want, and "deterministic" reads to me as saying
> it's *not* good enough...)

But you *DO* want something deterministic, the *ONLY* way you can get this
small benefit of capturing the seed is if you can put that seed back into the
system and get a deterministic result. If the seed didn't exactly determine the
output of the randomness then you wouldn't be able to do that. If you don't
need to be able to capture the seed and essentially "replay" the PRNG in a
deterministic way then there are exactly zero downsides to using a CSPRNG other
than speed, which is why Theo suggested using a very fast, modern CSPRNG to
solve the speed issues.
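
The replay property can be sketched with the classes that exist in the stdlib
today (the seed value is arbitrary and the variable names are illustrative):

```python
import random

seed = 20150910            # e.g. a "map seed" a user could share
rng = random.Random(seed)
first_run = [rng.randint(1, 6) for _ in range(5)]

rng = random.Random(seed)  # same seed, same sequence
second_run = [rng.randint(1, 6) for _ in range(5)]
print(first_run == second_run)

# SystemRandom has no replay: seed() is a documented no-op and
# getstate()/setstate() raise NotImplementedError.
secure = random.SystemRandom()
secure.seed(seed)
try:
    secure.getstate()
    can_replay = True
except NotImplementedError:
    can_replay = False
print(can_replay)
```

Capturing the seed is exactly the point at which the deterministic generator
is required; without that need the two are interchangeable apart from speed.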

Can you point out one use case where cryptographically safe random numbers,
assuming we could generate them as quickly as you asked for them, would hurt
you unless you needed/wanted to be able to save the seed and thus require or
want deterministic results?

>  
> >> While I accept your point that far too many people are using insecure
> >> RNGs in "generate a random password" scripts, they are *not* the core
> >> target audience of the default module-level functions in the random
> >> module (did you find any examples of insecure use that *weren't*
> >> password generators?). We should educate people that this is bad
> >> practice, not change the module. Also, while it may be imperfect, it's
> >> still better than what many people *actually* do, which is to use
> >> "password" as a password on sensitive systems :-(
> >
> > You cannot document your way out of a UX problem.
>  
> What I'm trying to say is that this is an education problem more than
> a UX problem.
>  
> Personally, I think I know enough about security for my (not a
> security specialist) purposes. To that extent, if I'm working on
> something with security implications, I'm looking for things that say
> "Crypto" in the name. The rest of the time, I just use non-specialist
> stuff. It's a similar situation to that of the "statistics" module. If
> I'm doing "proper" maths, I'd go for numpy/scipy. If I just want some
> averages and I'm not bothered about numerical stability, rounding
> behaviour, etc, I'd go for the stdlib statistics package.
>  
> > The problem isn't people doing this once on the command line to generate
> > a password, the problem is people doing it in applications where they
> > generate an API key, a session identifier, a random password which they
> > then give to their users. If you give a way to get the output of the MT
> > base random enough times, it can be used to determine what every random
> > it generated was and will be.
>  
> To me, that's crypto and I'd look to the cryptography module, or to
> something in the stdlib that explicitly said it was suitable for
> crypto.
>  
> Saying people write bad code isn't enough - how does the current
> module *encourage* them to write bad code? How much API change must we
> allow to cater for people who won't read the statement in the docs (in
> a big red box) "Warning: The pseudo-random generators of this module
> should not be used for security purposes." (Specifically people
> writing security related code who won't read the docs).

Reminder that this warning does not show up (in any color, much less red)
if you're using ``help(random)`` or ``dir(random)`` to explore the random
module. It also does not show up in code review when you see someone doing
random.random.

It encourages you to write bad code, because it has a baked in assumption that
there is a sane default for a random number generator and expects people to
understand a fairly difficult concept, which is that not all "random" is equal.

For instance, you've already made the mistake of saying you wanted "random" not
deterministic, but the two are not mutually exclusive and deterministic is a
property that a source of random can have, and one that you need for one of the
features you say you like.

>  
> > Here's a game a friend of mine created where the purpose of the game is
> > to essentially unrandomize some random data, which is only possible
> > because it's (purposely) using MT to make it possible
> > https://github.com/reaperhulk/dsa-ctf. This is not an ivory tower paranoia
> > case, it's a real concern that will absolutely fix some insecure software
> > out there instead of telling them "welp typing a little bit extra once
> > an import is too much of a burden for me and really it's your own fault
> > anyways".
>  
> I don't understand how that game (which is an interesting way of
> showing people how attacks on crypto work, sure, but that's just
> education, which you dismissed above) relates to the issue here.
>  
> And I hope you don't really think that your quote is even remotely
> what I'm trying to say (I'm not that selfish) - my point is that not
> everything is security related. Not every application people write,
> and not every API in the stdlib. You're claiming that the random
> module is security related. I'm claiming it's not, it's documented as
> not being, and that's clear to the people who use it for its intended
> purpose. Telling those people that you want to make a module designed
> for their use harder to use because people for whom it's not intended
> can't read the documentation which explicitly states that it's not
> suitable for them, is doing a disservice to those people who are
> already using the module correctly for its stated purpose.

I'm claiming that the term random is ambiguously both security related and
not security related and we should either get rid of the default and expect
people to pick whether or not their use case is security related, or we should
assume that it is unless otherwise instructed. I don't particularly care what
the exact spelling of this looks like, random.(System|Secure)Random and
random.DeterministicRandom is just one option. Another option is to look at
something closer to what Go did and deprecate the "random" module and move the
MT based thing to ``math.random`` and the CSPRNG can be moved to something like
crypto.random.

>  
> By the same argument, we should remove the statistics module because
> it can be used by people with numerically unstable problems. (I doubt
> you'll find StackOverflow questions along these lines yet, but that's
> only because (a) the module's pretty new, and (b) it actually works
> pretty hard to handle the hard corner cases, but I bet they'll start
> turning up in due course, if only from the people who don't understand
> floating point...)
>

No, by this argument we shouldn't have a function called statistics in the
statistics module because there is no globally "right" answer for what the
default should be. Should it be mean? mode? median? Why is *your* use case the
"right" use case for the default option, particularly in a situation where
picking the wrong option can be disastrous?

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From random832 at fastmail.us  Thu Sep 10 15:13:39 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Thu, 10 Sep 2015 09:13:39 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
Message-ID: <1441890819.3122699.379856193.3A628B5B@webmail.messagingengine.com>

On Thu, Sep 10, 2015, at 08:29, Paul Moore wrote:
> And also, calling the non-crypto choice "Deterministic" is unhelpful,
> because I *don't* want something deterministic, I want something
> random (I understand PRNGs aren't truly random, but "good enough for
> my purposes" is what I want, and "deterministic" reads to me as saying
> it's *not* good enough...)

I don't understand why. What other word would you use to describe a
generator that can be given a specific set of inputs to generate the
same exact sequence of numbers every single time?

If you want that feature, then you're not going to think "deterministic"
means "not good enough". And if you don't want it, you, well, don't want
it, so there's really no harm in the fact that you don't choose it.

Personally, though, I don't see why we're not talking about calling it
MersenneTwister.

From skrah at bytereef.org  Thu Sep 10 15:39:31 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Thu, 10 Sep 2015 13:39:31 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
Message-ID: <loom.20150910T153125-298@post.gmane.org>

M.-A. Lemburg <mal at ...> writes:
> Reading this thread is fun, but it doesn't seem to be getting
> anywhere - perhaps that's part of the fun 
> 
> Realistically, I see two options:
> 
>  1. Someone goes and implements the OpenBSD random function in C
>     and put a package up on PyPI, updating it whenever OpenBSD
>     thinks that a new algorithm is needed or a security issue
>     has to be fixed (from my experience with other crypto software
>     like OpenSSL, this should be on the order of every 2-6 months )

The sane option would be to use the OpenBSD libcrypto, which seems to
be part of their OpenSSL fork (libressl), just like libcrypto is part
of OpenSSL.

Then the crypto maintenance would be delegated to the distributions.

I would even be interested in writing such a package, but it would
be external and non-redistributable for well-known reasons. :)


Stefan Krah




From p.f.moore at gmail.com  Thu Sep 10 15:44:11 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 14:44:11 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f18131.392f7558.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
Message-ID: <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>

On 10 September 2015 at 14:10, Donald Stufft <donald at stufft.io> wrote:
>> I don't understand the phrase "if you needed determinism, it would
>> hurt you to say so". Could you clarify?
>
> I transposed some words, fixed:
>
> "If you needed determinism, would it hurt you to say so?"

Thanks.

In one sense, no it wouldn't. Nor would it matter to me if "the
default random number generator" was fast and cryptographically
secure. What matters is just that I get a load of random (enough)
numbers.

What hurts somewhat (not enormously, I'll admit) is up front having to
think about whether I need to be able to capture a seed and replay it.
That's nearly always something I'd think of way down the line, as a
"wouldn't it be nice if I could get the user to send me a reproducible
test case" or something like that. And of course it's just a matter of
switching the underlying RNG at that point.
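
That later switch stays cheap if code accepts the generator as a parameter.
A sketch, with roll_dice and its rng parameter being illustrative names only:

```python
import random

def roll_dice(count, rng=random):
    """Roll `count` six-sided dice using any object with randint()."""
    return [rng.randint(1, 6) for _ in range(count)]

casual = roll_dice(3)                               # module-level default
replayable = roll_dice(3, random.Random(1234))      # seed captured later
unpredictable = roll_dice(3, random.SystemRandom())
print(casual, replayable, unpredictable)
```

The caller decides which generator is appropriate, and the function itself
never needs to change.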

None of this is hard. But once again, I'm currently using the module
correctly, as documented.

I've omitted most of the rest of your response largely because we're
probably just going to have to agree to differ. I'm probably too worn
out being annoyed at the way that everything ends up needing to be
security related, and the needs of people who won't read the docs
determines API design, to respond clearly and rationally :-(

Paul

From graffatcolmingov at gmail.com  Thu Sep 10 15:44:26 2015
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Thu, 10 Sep 2015 08:44:26 -0500
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <1441890819.3122699.379856193.3A628B5B@webmail.messagingengine.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <1441890819.3122699.379856193.3A628B5B@webmail.messagingengine.com>
Message-ID: <CAN-Kwu3hL49PTvidunjK+ab104v1vxHPyRcZxaOxRZuRC6r0Kg@mail.gmail.com>

On Thu, Sep 10, 2015 at 8:13 AM,  <random832 at fastmail.us> wrote:
> On Thu, Sep 10, 2015, at 08:29, Paul Moore wrote:
>> And also, calling the non-crypto choice "Deterministic" is unhelpful,
>> because I *don't* want something deterministic, I want something
>> random (I understand PRNGs aren't truly random, but "good enough for
>> my purposes" is what I want, and "deterministic" reads to me as saying
>> it's *not* good enough...)
>
> I don't understand why. What other word would you use to describe a
> generator that can be given a specific set of inputs to generate the
> same exact sequence of numbers every single time?
>
> If you want that feature, then you're not going to think "deterministic"
> means "not good enough". And if you don't want it, you, well, don't want
> it, so there's really no harm in the fact that you don't choose it.
>
> Personally, though, I don't see why we're not talking about calling it
> MersenneTwister.

Because while we want to reduce foot guns, we don't want to reduce
usability. DeterministicRandom is fairly easy for anyone to
understand. I would venture a guess that most people looking for that
wouldn't know (or care) what the backing algorithm is. Further, if we
stop using Mersenne Twister in the future, we would have to remove
that class name. DeterministicRandom can be agnostic of the underlying
algorithm and is friendlier to people who don't need to know or care
about what algorithm is generating the numbers, they only need to
understand the properties of that generator.

From random832 at fastmail.us  Thu Sep 10 15:55:03 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Thu, 10 Sep 2015 09:55:03 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAN-Kwu3hL49PTvidunjK+ab104v1vxHPyRcZxaOxRZuRC6r0Kg@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <1441890819.3122699.379856193.3A628B5B@webmail.messagingengine.com>
 <CAN-Kwu3hL49PTvidunjK+ab104v1vxHPyRcZxaOxRZuRC6r0Kg@mail.gmail.com>
Message-ID: <1441893303.3132414.379896217.7C80B332@webmail.messagingengine.com>

On Thu, Sep 10, 2015, at 09:44, Ian Cordasco wrote:
> Because while we want to reduce foot guns, we don't want to reduce
> usability. DeterministicRandom is fairly easy for anyone to
> understand. I would venture a guess that most people looking for that
> wouldn't know (or care) what the backing algorithm is. Further, if we
> stop using mersenne twister in the future, we would have to remove
> that class name.

If we're serious about being deterministic, then we should keep that
class under that name and provide a new class for the new algorithm.
What's the point of having a deterministic algorithm if you can't
reproduce your results in the new version because the algorithm was
deleted?
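
A minimal sketch of the reproducibility in question, using the existing
random.Random class as a stand-in for the proposed deterministic generator
(the helper name roll_dice is hypothetical and only for illustration):

```python
import random

def roll_dice(seed, count=5):
    """Return `count` die rolls from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private instance; module-level state is untouched
    return [rng.randint(1, 6) for _ in range(count)]

# The same seed reproduces the same sequence on the same Python version.
first = roll_dice(12345)
second = roll_dice(12345)
print(first == second)

# getstate()/setstate() let a run be captured and replayed mid-stream.
rng = random.Random(12345)
state = rng.getstate()
a = [rng.random() for _ in range(3)]
rng.setstate(state)
b = [rng.random() for _ in range(3)]
print(a == b)
```

Replaying a sequence on a later Python version works only as long as the
underlying algorithm is still shipped, which is the concern raised here.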

From graffatcolmingov at gmail.com  Thu Sep 10 15:56:05 2015
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Thu, 10 Sep 2015 08:56:05 -0500
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
Message-ID: <CAN-Kwu3gScsJbRJxea4TZksgQ08ZBbjscZZ4=xaV8FqVMyCjvw@mail.gmail.com>

On Thu, Sep 10, 2015 at 8:44 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 10 September 2015 at 14:10, Donald Stufft <donald at stufft.io> wrote:
>>> I don't understand the phrase "if you needed determinism, it would
>>> hurt you to say so". Could you clarify?
>>
>> I transposed some words, fixed:
>>
>> "If you needed determinism, would it hurt you to say so?""
>
> Thanks.
>
> In one sense, no it wouldn't. Nor would it matter to me if "the
> default random number generator" was fast and cryptographically
> secure. What matters is just that I get a load of random (enough)
> numbers.
>
> What hurts somewhat (not enormously, I'll admit) is up front having to
> think about whether I need to be able to capture a seed and replay it.
> That's nearly always something I'd think of way down the line, as a
> "wouldn't it be nice if I could get the user to send me a reproducible
> test case" or something like that. And of course it's just a matter of
> switching the underlying RNG at that point.
>
> None of this is hard. But once again, I'm currently using the module
> correctly, as documented.

No one in this thread is accusing everyone of using the module
incorrectly. The fact that you do use it correctly is a testament to
the fact that you read the docs carefully and have some level of
experience with the module to know that you're using it correctly.

> I've omitted most of the rest of your response largely because we're
> probably just going to have to agree to differ. I'm probably too worn
> out being annoyed at the way that everything ends up needing to be
> security related, and the needs of people who won't read the docs
> determines API design, to respond clearly and rationally :-(

I think the people Theo, Donald, and others (including myself) are
worried about are the people who have used some book or online
tutorial to write games in Python and have seen random.random() or
random.choice() used. Later on they start working on something else
(including but not limited to the examples of what Donald has
otherwise pointed out). They also have enough experience with the
random module to know it produced randomness (what kind, they don't
know... in fact they probably don't know there are different kinds
yet) and they use what they know because Python has batteries included
and they're awesome and easy to use. The reality is that past
experiences bias current decisions. If that person went and read the
docs, they probably won't know if what they're doing warrants using a
CSPRNG instead of the default Python one. If they're not willing to
learn, or read enough (and I stress enough) (or just really don't have
the time because this is a side project) about the topic before making
a decision, they'll say "Well the module level functions seemed random
enough to me, so I'll just use those". That could end up being rather
awful for them.

The reality is that your past experiences (and other people's past
experiences, especially those who refuse to do some research and are
demanding others prove that these are insecure with examples) are
biasing this discussion because they fail to empathize with new users
whose past experiences are coloring their decisions.

People choose Python for a variety of reasons, and one of those
reasons is that in their past experience it was "fast enough" to be an
acceptable choice. This is how most people behave. Being angry at
people for not reading a two-sentence warning in the middle of the
docs isn't helping anyone or arguing the validity of this discussion.

From graffatcolmingov at gmail.com  Thu Sep 10 15:57:40 2015
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Thu, 10 Sep 2015 08:57:40 -0500
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <1441893303.3132414.379896217.7C80B332@webmail.messagingengine.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <1441890819.3122699.379856193.3A628B5B@webmail.messagingengine.com>
 <CAN-Kwu3hL49PTvidunjK+ab104v1vxHPyRcZxaOxRZuRC6r0Kg@mail.gmail.com>
 <1441893303.3132414.379896217.7C80B332@webmail.messagingengine.com>
Message-ID: <CAN-Kwu3ywN5jfFC26_uac2+7KyvLckC3Y0YznoUnw8Ldot9Onw@mail.gmail.com>

On Thu, Sep 10, 2015 at 8:55 AM,  <random832 at fastmail.us> wrote:
> On Thu, Sep 10, 2015, at 09:44, Ian Cordasco wrote:
>> Because while we want to reduce foot guns, we don't want to reduce
>> usability. DeterministicRandom is fairly easy for anyone to
>> understand. I would venture a guess that most people looking for that
>> wouldn't know (or care) what the backing algorithm is. Further, if we
>> stop using mersenne twister in the future, we would have to remove
>> that class name.
>
> If we're serious about being deterministic, then we should keep that
> class under that name and provide a new class for the new algorithm.
> What's the point of having a deterministic algorithm if you can't
> reproduce your results in the new version because the algorithm was
> deleted?

This is totally off topic. That said as a counter-point: What's the
point of carrying around code you don't want people to use if they're
just going to use it anyway?

From donald at stufft.io  Thu Sep 10 16:21:11 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 10:21:11 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
Message-ID: <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>

On September 10, 2015 at 9:44:13 AM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 10 September 2015 at 14:10, Donald Stufft wrote:
> >> I don't understand the phrase "if you needed determinism, it would
> >> hurt you to say so". Could you clarify?
> >
> > I transposed some words, fixed:
> >
> > "If you needed determinism, would it hurt you to say so?""
>  
> Thanks.
>  
> In one sense, no it wouldn't. Nor would it matter to me if "the
> default random number generator" was fast and cryptographically
> secure. What matters is just that I get a load of random (enough)
> numbers.
>  
> What hurts somewhat (not enormously, I'll admit) is up front having to
> think about whether I need to be able to capture a seed and replay it.
> That's nearly always something I'd think of way down the line, as a
> "wouldn't it be nice if I could get the user to send me a reproducible
> test case" or something like that. And of course it's just a matter of
> switching the underlying RNG at that point.
>
> None of this is hard. But once again, I'm currently using the module
> correctly, as documented.

This is actually exactly why Theo suggested using a modern, userland CSPRNG
because it can generate random numbers faster than /dev/urandom can and, unless
you need deterministic results, there's little downside to doing so.

There are really two possible ideas here, depending on what sort of balance
we'd want to strike. We can make a default "I don't want to think about it"
implementation of random that is both *generally* secure and fast, however it
won't be deterministic and you won't be able to explicitly seed it. This would
be a backwards compatible change [1] for people who are simply calling these
functions [2]:

    random.getrandbits
    random.randrange
    random.randint
    random.choice
    random.shuffle
    random.sample
    random.random
    random.uniform
    random.triangular
    random.betavariate
    random.expovariate
    random.gammavariate
    random.gauss
    random.lognormvariate
    random.normalvariate
    random.vonmisesvariate
    random.paretovariate
    random.weibullvariate

If this were all that the top level functions in random.py provided we could
simply replace the default and people wouldn't notice, they'd just
automatically get safer randomness whether that's actually useful for their
use case or not.

However, random.py also has these functions:

    random.seed
    random.getstate
    random.setstate
    random.jumpahead

and these functions are where the problem comes. These functions only really
make sense for deterministic sources of random which are not "safe" for use
in security sensitive applications. So pretending for a moment that we've
already decided to do "something" about this, the question boils down to what
do we do about these 4 functions. Either we can change the default to a secure
CSPRNG and break these functions (and the people using them) which is however
easily fixed by changing ``import random`` to
``import random; random = random.DeterministicRandom()`` or we can deprecate
the top level functions and try to guide people to choose up front what kind
of random they need. Either of these solutions will end up with people being
safer and, if we pretend we've agreed to do "something", it comes down to
whether we'd prefer breaking compatibility for some people while keeping a
default random generator that is probably good enough for most people, or if
we'd prefer to not break compatibility and try to push people to always
decide what kind of random they want.
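
To make the trade-off concrete, here is a rough sketch using the two classes
the random module already ships. random.Random (Mersenne Twister) stands in
for the proposed DeterministicRandom, and random.SystemRandom (backed by
os.urandom) stands in for a secure default. The proposed class name and a
ChaCha-based generator are assumptions from this thread, not stdlib APIs;
the sketch only shows which calls keep working on each kind of generator.

```python
import random

deterministic = random.Random(42)   # stand-in for the proposed DeterministicRandom
secure = random.SystemRandom()      # stand-in for a secure default (uses os.urandom)

# The "don't want to think about it" functions work on both generators.
for rng in (deterministic, secure):
    assert 1 <= rng.randint(1, 6) <= 6
    assert rng.choice("abc") in "abc"
    assert 0.0 <= rng.random() < 1.0

# State-related functions only make sense for the deterministic generator.
state = deterministic.getstate()
x = deterministic.random()
deterministic.setstate(state)
assert deterministic.random() == x

# The OS-backed generator has no state to capture.
try:
    secure.getstate()
except NotImplementedError:
    secure_has_state = False
else:
    secure_has_state = True

print(secure_has_state)
```

Note that SystemRandom.seed() is accepted but ignored, so code that seeds a
secure generator would keep running while silently losing reproducibility.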

Of course, we still haven't decided that we should do "something". I think
we should, because secure by default (or at least, not insecure by default)
is a good situation to be in. The history of computing has shown time and
time again that trying to document or educate users is error prone and
doesn't scale. It works better to design APIs so that the "right" thing is
the obvious default, and to require opting in to the specialist [3] cases
that need some particular property.


[1] Assuming Theo's claim about the speed of the ChaCha based arc4random
    function is accurate, which I haven't tested but I assume he's smart
    enough to know what he's talking about WRT the speed of it.

[2] I believe, anyway; I don't think that any of these rely on the properties
    of MT or a deterministic source of random, just a source of random.

[3] In this case, there are two specialist use cases: those that require
    deterministic results and those that require specific security properties
    that are not satisfied by a userland CSPRNG, because a userland CSPRNG is
    not as secure as /dev/urandom but is able to be much faster.

>  
> I've omitted most of the rest of your response largely because we're
> probably just going to have to agree to differ. I'm probably too worn
> out being annoyed at the way that everything ends up needing to be
> security related, and the needs of people who won't read the docs
> determines API design, to respond clearly and rationally :-(
>  
> Paul
>  

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From p.f.moore at gmail.com  Thu Sep 10 17:02:00 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 16:02:00 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
Message-ID: <CACac1F_ddf4Zba310=4efoFy3VPN2FkHLixU6AGTySmSnwZOug@mail.gmail.com>

On 10 September 2015 at 15:21, Donald Stufft <donald at stufft.io> wrote:
> which is however
> easily fixed by changing ``import random`` to
> ``import random; random = random.DeterministicRandom()`` or we can deprecate

Switching (somewhat hypocritically :-)) from an "I'm a naive user"
stance, to talking about deeper issues as if I knew what I was talking
about, this change results in each module getting a separate instance
of the generator. That has implications on the risks of correlated
results. It's unlikely to cause issues in real life, conceded.
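
A small sketch of the separate-instance effect, assuming random.Random as the
stand-in for the proposed DeterministicRandom and a deliberately identical
seed in both places. That is the worst case; instances created without a
seed initialise themselves from OS entropy where available and would not be
expected to collide like this.

```python
import random

# Two "modules" that each rebind `random` to a private generator,
# as in ``random = random.Random(seed)``.
module_a_rng = random.Random(2015)
module_b_rng = random.Random(2015)

# Identical seeds give identical streams, i.e. fully correlated results.
stream_a = [module_a_rng.random() for _ in range(4)]
stream_b = [module_b_rng.random() for _ in range(4)]
print(stream_a == stream_b)

# Drawing from one instance does not advance the other.
state_b_before = module_b_rng.getstate()
module_a_rng.random()
print(module_b_rng.getstate() == state_b_before)
```

With the module-level functions, every caller shares one hidden instance, so
each draw advances a single common stream instead.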

Paul

From ncoghlan at gmail.com  Thu Sep 10 17:36:39 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 01:36:39 +1000
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55F0E5C9.6030509@brenbarn.net>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
 <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com>
 <55F058B6.9000202@mail.de>
 <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>
 <55F0E5C9.6030509@brenbarn.net>
Message-ID: <CADiSq7emmO81fwShS_rVyH867UXwVPDo0Sx=eUoJMLdE=-DQVQ@mail.gmail.com>

On 10 September 2015 at 12:07, Brendan Barnwell <brenbarn at brenbarn.net> wrote:
> On 2015-09-09 14:50, Andrew Barnert via Python-ideas wrote:
>>
>> Well, have you read the answers given by Nick, me, and others earlier
>> in the thread? If so, what do you disagree with? You've only
>> addressed one point (that % is faster than {} for simple cases--and
>> your solution is just "make {} faster", which may not be possible
>> given that it's inherently more hookable than % and therefore
>> requires more function calls...). What about formatting headers for
>> ASCII wire protocols, sharing tables of format strings between
>> programming languages (e.g., for i18n), or any of the other reasons
>> people have brought up?
>
>
>         This is getting off on a tangent, but I don't see most of those as
> super compelling.  Any programming language can use whatever formatting
> scheme it likes.  Keeping %-substitutions around helps in sharing format
> strings only with other languages that use exactly the same formatting
> style.  So it's not like % has any intrinsic gain; it just happens to
> interoperate with some other particular stuff.  That's nice, but I don't
> think it makes sense to keep things in Python just so it can interoperate in
> specific ways with specific other languages that use less-readable syntax.

This perspective doesn't grant enough credit to the significance of C
in general, and the C ABI in particular, in the overall computing
landscape. While a lot of folks have put a lot of work into making it
possible to write software without needing to learn the details of
what's happening at the machine level, it's still the case that the
*one* language binding interface that *every* language runtime ends up
including is being able to load and run C libraries.

It's also the case that for any new CPU architecture, one of the first
things people will do is bootstrap a C compiler for it, as that then
lets them bootstrap a whole host of other things (including Python).

For anyone that wants to make the transition from high level
programming to low level programming, or vice-versa, C is also the
common language understood by both software developers and computer
systems engineers.

There *are* some worthy contenders out there that may eventually
topple C's permissive low level memory access model from its position
of dominance (I personally have high hopes for Rust), but that's not
going to be a quick process.

Regards,
Nick.

P.S. It's also worth remembering that many Pythonistas, including
members of the core development team, happily switch between
programming languages according to the task at hand. Python can still
be our *preferred* language without becoming the *only* language we
use :)

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Thu Sep 10 17:46:28 2015
From: brett at python.org (Brett Cannon)
Date: Thu, 10 Sep 2015 15:46:28 +0000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F13EAF.5040500@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
Message-ID: <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>

On Thu, 10 Sep 2015 at 01:26 M.-A. Lemburg <mal at egenix.com> wrote:

> Reading this thread is fun, but it doesn't seem to be getting
> anywhere - perhaps that's part of the fun ;-)
>
> Realistically, I see two options:
>
>  1. Someone goes and implements the OpenBSD random function in C
>     and put a package up on PyPI, updating it whenever OpenBSD
>     thinks that a new algorithm is needed or a security issue
>     has to be fixed (from my experience with other crypto software
>     like OpenSSL, this should be on the order of every 2-6 months ;-))
>
>  2. Ditto, but we put the module in the stdlib and then run around
>     issuing patch level security releases every 2-6 months.
>

I see a third: rename random.random() to be something that gets the
point across it is not crypto secure and then stop at that. I don't think
the stdlib should get into the game of trying to provide a RNG that we
claim is cryptographically secure as that will change suddenly when a
weakness is discovered (this is one of the key reasons we chose not to
consider adding requests to the stdlib, for instance).

Theo's key issue is misuse of random.random(), not the lack of a
crypto-appropriate RNG in the stdlib (that just happens to be his solution
because he has an RNG that he is partially in charge of). So that means
either we take a "consenting adults" approach and say we can't prevent
people from using code without reading the docs or we try to rename the
function. But then again that won't help with all of the other functions in
the random module that implicitly use random.random() (if that even
matters; not sure if the helper functions in the module have any crypto use
that would lead to their misuse).

Oh, and there is always the nuclear 4th option and we just deprecate the
random module. ;)

-Brett


>
> Replacing our deterministic default PRNG with a non-deterministic
> one doesn't really fly, since we'd break an important feature
> of random.random(). You may remember that we already ran a similar
> stunt with the string hash function, with very mixed results.
>
> Calling the result of such a switch-over "secure" is even
> worse, since it's a promise we cannot keep (probably not even
> fully define). Better leave the promise at "insecure" - that's
> something we can promise forever and don't have to define :-)
>
> Regardless of what we end up with, I think Python land can do
> better than name it "arc4random". We're great at bike shedding,
> so how about we start the fun with "randomYMMV" :-)
>
> Overall, I think having more options for good PRNGs is great.
> Whether this "arc4random" is any good remains to be seen, but
> given that OpenBSD developed it, chances are higher than
> usual.
>
> --
> Marc-Andre Lemburg
> eGenix.com
>

From ncoghlan at gmail.com  Thu Sep 10 17:55:07 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 01:55:07 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <85h9n482sa.fsf@benfinney.id.au>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
Message-ID: <CADiSq7eV30Z4VWXF-T0eZVfGdmq76SaE-QE_BTxhrkAkB4joMA@mail.gmail.com>

On 9 September 2015 at 17:33, Ben Finney <ben+python at benfinney.id.au> wrote:
> Paul Moore <p.f.moore at gmail.com> writes:
>
>> On 5 September 2015 at 09:30, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> > Unfortunately, I've yet to convince the rest of PyPA (let alone the
>> > community at large) that telling people to call "pip" directly is *bad
>> > advice* (as it breaks in too many cases that beginners are going to
>> > encounter), so it would be helpful if folks helping beginners on
>> > python-list and python-tutor could provide feedback supporting that
>> > perspective by filing an issue against
>> > https://github.com/pypa/python-packaging-user-guide
>>
>> I would love to see "python -m pip" (or where the launcher is
>> appropriate, the shorter "py -m pip") be the canonical invocation used
>> in all documentation, discussion and advice on running pip.
>
> Contrariwise, I would like to see "pip" become the canonical invocation
> used in all documentation, discussion, and advice; and if there are any
> technical barriers to that least-surprising method, to see those
> barriers addressed and removed.

We're doing that too, but it's a "teaching people to use the command
line for the first time is hard" problem and a "managing multiple
copies of a language runtime and ensuring independently named commands
are working against the right target environment" issue, moreso than a
language level one.

A potentially more fruitful path is likely to be making it so that
folks don't need to use the system shell at all, and can just work
entirely from the Python REPL.

The two main things folks can't do from the REPL at the moment are:

* install packages
* manage virtual environments

The idea of an "install()" command injected into the builtins from
site.py would cover the first.
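
As a rough sketch of what such a helper might do, assuming it simply shells
out to "python -m pip" under the running interpreter (the names install and
build_pip_command are hypothetical; nothing like this is in the builtins):

```python
import subprocess
import sys

def build_pip_command(*packages):
    """Build an argument list that runs pip under the current interpreter."""
    if not packages:
        raise ValueError("at least one package name is required")
    # sys.executable pins pip to the interpreter that is running this REPL.
    return [sys.executable, "-m", "pip", "install"] + list(packages)

def install(*packages):
    """Hypothetical REPL helper: install packages into the running environment."""
    return subprocess.check_call(build_pip_command(*packages))

# Show the command without running it.
print(build_pip_command("requests"))
```

Going through sys.executable avoids the "which pip is on my PATH" problem,
since the install always targets the environment the REPL was started from.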

The second couldn't be handled the way virtualenv does things, but it
*could* be handled through a tool like vex which creates new subshells
and runs commands in those rather than altering the current shell:

$ python3
Python 3.4.2 (default, Jul  9 2015, 17:24:30)
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> subprocess.call(["vex", "nikola", "python"])
Python 2.7.10 (default, Jul  5 2015, 14:15:43)
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello virtual environment!")
Hello virtual environment!
>>>

The "vex nikola python" call there:

1. Starts a new bash subshell
2. Activates my "nikola" virtual environment in that subshell
3. Launches Python within that venv (hence the jump over to a Python
2.7 process, since I keep forgetting to recreate it as Python 3).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mal at egenix.com  Thu Sep 10 17:59:10 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 10 Sep 2015 17:59:10 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <loom.20150910T153125-298@post.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>
 <loom.20150910T153125-298@post.gmane.org>
Message-ID: <55F1A8CE.3000900@egenix.com>

On 10.09.2015 15:39, Stefan Krah wrote:
> M.-A. Lemburg <mal at ...> writes:
>> Reading this thread is fun, but it doesn't seem to be getting
>> anywhere - perhaps that's part of the fun 
>>
>> Realistically, I see two options:
>>
>>  1. Someone goes and implements the OpenBSD random function in C
>>     and put a package up on PyPI, updating it whenever OpenBSD
>>     thinks that a new algorithm is needed or a security issue
>>     has to be fixed (from my experience with other crypto software
>>     like OpenSSL, this should be on the order of every 2-6 months )
> 
> The sane option would be to use the OpenBSD libcrypto, which seems to
> be part of their OpenSSL fork (libressl), just like libcrypto is part
> of OpenSSL.

Well, we already link to OpenSSL for SSL and hashes. I guess exposing
the OpenSSL RAND interface in a module would be the easiest way
to go about this.

pyOpenSSL already does this:

http://www.egenix.com/products/python/pyOpenSSL/doc/pyopenssl.html/#document-api/rand

More pointers:
https://wiki.openssl.org/index.php/Random_Numbers
https://www.openssl.org/docs/manmaster/crypto/rand.html

What's nice about the API is that you can add entropy as you
find it.
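
The stdlib ssl module already exposes a small part of the OpenSSL RAND
interface. A minimal sketch, assuming a Python 3 build linked against
OpenSSL (this is not the pyOpenSSL API linked above):

```python
import os
import ssl

# Mix extra entropy into OpenSSL's pool; the second argument is a
# lower-bound estimate of the entropy contained in the data.
ssl.RAND_add(os.urandom(32), 32.0)

# RAND_status() reports whether the generator has been seeded well enough.
seeded = ssl.RAND_status()

# RAND_bytes() returns cryptographically strong pseudo-random bytes.
token = ssl.RAND_bytes(16)
print(seeded, len(token))
```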

> Then the crypto maintenance would be delegated to the distributions.
> 
> I would even be interested in writing such a package, but it would
> be external and non-redistributable for well-known reasons. :)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 10 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> mxODBC Plone/Zope Database Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2015-09-18: PyCon UK 2015 ...                               8 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From brett at python.org  Thu Sep 10 18:05:56 2015
From: brett at python.org (Brett Cannon)
Date: Thu, 10 Sep 2015 16:05:56 +0000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
Message-ID: <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>

On Thu, 10 Sep 2015 at 07:22 Donald Stufft <donald at stufft.io> wrote:

> On September 10, 2015 at 9:44:13 AM, Paul Moore (p.f.moore at gmail.com)
> wrote:
> > On 10 September 2015 at 14:10, Donald Stufft wrote:
> > >> I don't understand the phrase "if you needed determinism, it would
> > >> hurt you to say so". Could you clarify?
> > >
> > > I transposed some words, fixed:
> > >
> > > "If you needed determinism, would it hurt you to say so?"
> >
> > Thanks.
> >
> > In one sense, no it wouldn't. Nor would it matter to me if "the
> > default random number generator" was fast and cryptographically
> > secure. What matters is just that I get a load of random (enough)
> > numbers.
> >
> > What hurts somewhat (not enormously, I'll admit) is up front having to
> > think about whether I need to be able to capture a seed and replay it.
> > That's nearly always something I'd think of way down the line, as a
> > "wouldn't it be nice if I could get the user to send me a reproducible
> > test case" or something like that. And of course it's just a matter of
> > switching the underlying RNG at that point.
> >
> > None of this is hard. But once again, I'm currently using the module
> > correctly, as documented.
>
> This is actually exactly why Theo suggested using a modern, userland CSPRNG
> because it can generate random numbers faster than /dev/urandom can and,
> unless you need deterministic results, there's little downside to doing so.
>
> There are really two possible ideas here, depending on what sort of balance
> we'd want to strike. We can make a default "I don't want to think about it"
> implementation of random that is both *generally* secure and fast; however,
> it won't be deterministic and you won't be able to explicitly seed it. This
> would be a backwards compatible change [1] for people who are simply calling
> these functions [2]:
>
>     random.getrandbits
>     random.randrange
>     random.randint
>     random.choice
>     random.shuffle
>     random.sample
>     random.random
>     random.uniform
>     random.triangular
>     random.betavariate
>     random.expovariate
>     random.gammavariate
>     random.gauss
>     random.lognormvariate
>     random.normalvariate
>     random.vonmisesvariate
>     random.paretovariate
>     random.weibullvariate
>
> If this were all that the top level functions in random.py provided, we
> could simply replace the default and people wouldn't notice; they'd just
> automatically get safer randomness whether that's actually useful for their
> use case or not.
>
> However, random.py also has these functions:
>
>     random.seed
>     random.getstate
>     random.setstate
>     random.jumpahead
>
> and these functions are where the problem comes in. They only really make
> sense for deterministic sources of random, which are not "safe" for use
> in security sensitive applications. So pretending for a moment that we've
> already decided to do "something" about this, the question boils down to
> what we do about these 4 functions. Either we can change the default to a
> secure CSPRNG and break these functions (and the people using them), which
> is however easily fixed by changing ``import random`` to
> ``import random; random = random.DeterministicRandom()``, or we can
> deprecate the top level functions and try to guide people to choose up
> front what kind of random they need. Either of these solutions will end up
> with people being safer and, if we pretend we've agreed to do "something",
> it comes down to whether we'd prefer breaking compatibility for some people
> while keeping a default random generator that is probably good enough for
> most people, or whether we'd prefer to not break compatibility and try to
> push people to always decide what kind of random they want.
>

+1 for deprecating module-level functions and putting everything into
classes to force a choice
+0 for deprecating the seed-related functions and saying "the stdlib uses
what it uses as a RNG and you have to live with it if you don't make your
own choice" and switching to a crypto-secure RNG.
-0 for leaving it as-is

-Brett


>
> Of course, we still haven't decided that we should do "something". I think
> that we should, because I think that secure by default (or at least, not
> insecure by default) is a good situation to be in. Over the history of
> computing it's been shown time and time again that trying to document or
> educate users is error prone and doesn't scale, but you can design APIs that
> make the "right" thing obvious and opt-out, and that require opting in to
> specialist [3] cases which require some particular property.
>
>
> [1] Assuming Theo's claim about the speed of the ChaCha based arc4random
>     function is accurate, which I haven't tested, but I assume he's smart
>     enough to know what he's talking about WRT the speed of it.
>
> [2] I believe so anyway; I don't think that any of these rely on the
>     properties of MT or a deterministic source of random, just a source of
>     random.
>
> [3] In this case, there are two specialist use cases: those that require
>     deterministic results and those that require specific security
>     properties that are not satisfied by a userland CSPRNG, because a
>     userland CSPRNG is not as secure as /dev/urandom but is able to be much
>     faster.
>
> >
> > I've omitted most of the rest of your response largely because we're
> > probably just going to have to agree to differ. I'm probably too worn
> > out being annoyed at the way that everything ends up needing to be
> > security related, and the needs of people who won't read the docs
> > determines API design, to respond clearly and rationally :-(
> >
> > Paul
> >
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From tim.peters at gmail.com  Thu Sep 10 18:10:26 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 10 Sep 2015 11:10:26 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
Message-ID: <CAExdVN=78Q6fhRrn3Y66GWWX_pkdfb+FY=D4m_3SY6w4Ss3xLQ@mail.gmail.com>

[Brett Cannon <brett at python.org>]
> ...
> I see a third: rename random.random() to be something that gets the point
> across it is not crypto secure and then stop at that,
> ...
> Theo's key issue is misuse of random.random(), ...
> ...
> But then again that won't help with all of the other functions in
> the random module that implicitly use random.random() (if that even matters;
> not sure if the helper functions in the module have any crypto use that
> would lead to their misuse).

The most likely "misuses" in idiomatic Python (not mindlessly
translated low-level C) involve some spelling of getting or using
random integers, like .choice(), .randrange(), .randint(), or even
.sample() and .shuffle().  At least in Python 3, those don't normally
ever invoke .random() (neither directly nor indirectly) - they
normally use the (didn't always exist) "primitive" .getrandbits()
instead (indirectly via the private ._randbelow()).

So if something here does need to change, it's all or nothing.
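
One way to check this claim on a given interpreter is to subclass
random.Random so that .random() raises and .getrandbits() counts its calls.
This is only a sketch: the class name CountingRandom is made up for
illustration, and the result depends on CPython implementation details (the
private ._randbelow() dispatch) that may differ between versions.

```python
import random


class CountingRandom(random.Random):
    """Hypothetical helper: counts getrandbits() calls, forbids random()."""

    bit_requests = 0  # becomes a per-instance counter on first increment

    def getrandbits(self, k):
        self.bit_requests += 1
        return super().getrandbits(k)

    def random(self):
        raise AssertionError("random() was called")


rng = CountingRandom(12345)
deck = list(range(52))

# The integer-flavoured helpers named above; none should touch random().
rng.randrange(10)
rng.randint(1, 6)
rng.choice(deck)
rng.shuffle(deck)
rng.sample(deck, 5)

print("getrandbits() calls:", rng.bit_requests)
```

If any of those helpers went through .random(), the AssertionError would
surface immediately; float-returning functions such as .uniform() would
trigger it, which is the other half of the "all or nothing" point.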


> Oh, and there is always the nuclear 4th option and we just deprecate the
> random module. ;)

I already removed it from the repository.  Deprecating it would be a
security risk, since it would give hackers information about our
future actions ;-)

From xavier.combelle at gmail.com  Thu Sep 10 18:18:37 2015
From: xavier.combelle at gmail.com (Xavier Combelle)
Date: Thu, 10 Sep 2015 18:18:37 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
Message-ID: <CAEQcUJR27+Um12rdz5gTR=Uiv-LtJ4_oZ0-X3E1nxDYHzxLKPQ@mail.gmail.com>

My belief is that doing the safe thing by default is a major plus of
Python. From that point of view, random.random() should use a
cryptographically secure PRNG if possible.

That would not do much to stop people from creating insecure software
through lack of knowledge (me first among them), but it could help a little.

> I see a third: rename random.random() to be something that gets the
> point across it is not crypto secure and then stop at that. I don't think
> the stdlib should get into the game of trying to provide a RNG that we
> claim is cryptographically secure as that will change suddenly when a
> weakness is discovered (this is one of the key reasons we chose not to
> consider adding requests to the stdlib, for instance).
>
>
In my opinion this would not be a good idea. Having a safe default is a
major plus of Python; that is very different from having no default at all
just because one thinks the default could eventually become insecure. And
comparing a cryptographically secure PRNG with OpenSSL in terms of how often
security releases are needed is not fair, because the two pieces of software
clearly differ in complexity.

From srkunze at mail.de  Thu Sep 10 18:20:35 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 18:20:35 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAA_f+LxA+SVnK5S8q+kG+7g_47MFYGgu1Hb1thCM8ws7O9+hww@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>	<55F0A1A8.5010001@mail.de>
 <CAA_f+LxA+SVnK5S8q+kG+7g_47MFYGgu1Hb1thCM8ws7O9+hww@mail.gmail.com>
Message-ID: <55F1ADD3.9060903@mail.de>

Got it. Thanks.

On 10.09.2015 05:40, Jukka Lehtosalo wrote:
> On Wed, Sep 9, 2015 at 2:16 PM, Sven R. Kunze <srkunze at mail.de> wrote:
>
>     Thanks for sharing, Guido. Some random thoughts:
>
>     - "classes should need to be explicitly marked as protocols"
>     If so, why are they classes in the first place? Other languages
>     have dedicated keywords like "interface".
>
>
> I want to preserve compatibility with earlier Python versions (down to 
> 3.2), and this makes it impossible to add any new syntax. Also, there 
> is no need to add a keyword as there are other existing mechanisms 
> which are good enough, including base classes (as in the proposal) and 
> class decorators. I don't think that this will become a very commonly 
> used language feature, and thus adding special syntax for this doesn't 
> seem very important. My expectation is that structural subtyping would 
> be primarily useful for libraries and frameworks.
>
> Jukka


From rosuav at gmail.com  Thu Sep 10 18:30:22 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 11 Sep 2015 02:30:22 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CADiSq7eV30Z4VWXF-T0eZVfGdmq76SaE-QE_BTxhrkAkB4joMA@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CADiSq7eV30Z4VWXF-T0eZVfGdmq76SaE-QE_BTxhrkAkB4joMA@mail.gmail.com>
Message-ID: <CAPTjJmrP-mgTDP9t2AeUQcB+=svmka2pvNiF+TVk+mGpFt74bw@mail.gmail.com>

On Fri, Sep 11, 2015 at 1:55 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The second couldn't be handled the way virtualenv does things, but it
> *could* be handled through a tool like vex which creates new subshells
> and runs commands in those rather than altering the current shell:
>
> $ python3
> Python 3.4.2 (default, Jul  9 2015, 17:24:30)
> [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import subprocess
> >>> subprocess.call(["vex", "nikola", "python"])
> Python 2.7.10 (default, Jul  5 2015, 14:15:43)
> [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> print("Hello virtual environment!")
> Hello virtual environment!
> >>>

Hmm. This looks like something that could confuse people no end. I
already see a lot of people use Ctrl-Z to get out of a program (often
because they've come from Windows, I think), and this would be yet
another way to get lost as to which of various Python environments
you're in. Is it safe to have Python exec to another process? That
way, there's no "outer" Python to be left behind, and it'd feel like a
transition rather than a nesting. ("Please note: Selecting a virtual
environment restarts Python.")
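
For illustration, "exec to another process" could look roughly like the
sketch below. The helper names venv_python() and switch_to() are made up
here, and the bin/python versus Scripts\python.exe layout is the usual
virtual environment convention rather than anything vex-specific. On POSIX
systems os.execv() replaces the running process, so no outer Python is left
behind; switch_to() is defined but deliberately not called.

```python
import os
import sys


def venv_python(venv_dir):
    """Return the interpreter path inside a virtual environment directory."""
    if os.name == "nt":
        return os.path.join(venv_dir, "Scripts", "python.exe")
    return os.path.join(venv_dir, "bin", "python")


def switch_to(venv_dir, *args):
    """Replace the current process with the environment's interpreter.

    os.execv() does not return on success, so anything the current
    interpreter still holds in memory (unsaved state, open sessions) is lost.
    """
    python = venv_python(venv_dir)
    sys.stdout.flush()  # buffered output would otherwise vanish with the process
    os.execv(python, [python] + list(args))


print(venv_python("nikola-env"))
```

Whether losing the interpreter state like this is acceptable is exactly the
usability trade-off raised above.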

(Incidentally, what _would_ happen if you pressed Ctrl-Z while in that
'inner' Python? Would both Pythons get suspended?)

ChrisA

From skrah at bytereef.org  Thu Sep 10 18:32:13 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Thu, 10 Sep 2015 16:32:13 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>
 <loom.20150910T153125-298@post.gmane.org> <55F1A8CE.3000900@egenix.com>
Message-ID: <loom.20150910T182316-602@post.gmane.org>

M.-A. Lemburg <mal at ...> writes:
> On 10.09.2015 15:39, Stefan Krah wrote:
> > M.-A. Lemburg <mal <at> ...> writes:
> >>  1. Someone goes and implements the OpenBSD random function in C
> >>     and put a package up on PyPI, updating it whenever OpenBSD
> >>     thinks that a new algorithm is needed or a security issue
> >>     has to be fixed (from my experience with other crypto software
> >>     like OpenSSL, this should be on the order of every 2-6 months )
> > 
> > The sane option would be to use the OpenBSD libcrypto, which seems to
> > be part of their OpenSSL fork (libressl), just like libcrypto is part
> > of OpenSSL.
> 
> Well, we already link to OpenSSL for SSL and hashes. I guess exposing
> the OpenSSL RAND interface in a module would be the easiest way
> to go about this.

Yes, my suggestion was based on the premise that OpenBSD's libcrypto
(which should include the portable arc4(chacha20)random) is more
secure, faster, etc.

That's a big 'if', their PRNG had a couple of bugs on Linux last year,
but OpenSSL also regularly has issues.


Stefan Krah



From mal at egenix.com  Thu Sep 10 18:38:49 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 10 Sep 2015 18:38:49 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
Message-ID: <55F1B219.1000502@egenix.com>

On 10.09.2015 17:46, Brett Cannon wrote:
> On Thu, 10 Sep 2015 at 01:26 M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> Reading this thread is fun, but it doesn't seem to be getting
>> anywhere - perhaps that's part of the fun ;-)
>>
>> Realistically, I see two options:
>>
>>  1. Someone goes and implements the OpenBSD random function in C
>>     and put a package up on PyPI, updating it whenever OpenBSD
>>     thinks that a new algorithm is needed or a security issue
>>     has to be fixed (from my experience with other crypto software
>>     like OpenSSL, this should be on the order of every 2-6 months ;-))
>>
>>  2. Ditto, but we put the module in the stdlib and then run around
>>     issuing patch level security releases every 2-6 months.
>>
> 
> I see a third: rename random.random() to be something that gets the
> point across it is not crypto secure and then stop at that.

I think this is the major misunderstanding here:

The random module never suggested that it generates pseudo-random data
of crypto quality.

I'm pretty sure people doing crypto will know and most others
simply don't care :-)

Evidence: We used a Wichmann-Hill PRNG as default in random
for a decade and people still got their work done. The Mersenne
Twister was added in Python 2.3 and bumped the period from
6,953,607,871,644 (13 digits) to 2**19937-1 (6002 digits).
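
The two period figures can be checked with a few lines of arithmetic. The
sketch assumes the three moduli of the published Wichmann-Hill generator
(30269, 30307 and 30323), whose combined period is the product of
(modulus - 1) divided by 4. The digit count of the larger number is taken
via math.log10() because recent Python versions limit int-to-str conversion
of very large integers by default.

```python
import math

# Moduli of the three linear congruential generators combined by Wichmann-Hill.
moduli = (30269, 30307, 30323)

wh_period = (moduli[0] - 1) * (moduli[1] - 1) * (moduli[2] - 1) // 4
mt_period = 2**19937 - 1

wh_digits = len(str(wh_period))
mt_digits = math.floor(math.log10(mt_period)) + 1

print(wh_period)   # 6953607871644
print(wh_digits)   # 13
print(mt_digits)   # 6002
```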

> I don't think
> the stdlib should get into the game of trying to provide a RNG that we
> claim is cryptographically secure as that will change suddenly when a
> weakness is discovered (this is one of the key reasons we chose not to
> consider adding requests to the stdlib, for instance).
> 
> Theo's key issue is misuse of random.random(), not the lack of a
> crypto-appropriate RNG in the stdlib (that just happens to be his solution
> because he has an RNG that he is partially in charge of). So that means
> either we take a "consenting adults" approach and say we can't prevent
> people from using code without reading the docs or we try to rename the
> function. But then again that won't help with all of the other functions in
> the random module that implicitly use random.random() (if that even
> matters; not sure if the helper functions in the module have any crypto use
> that would lead to their misuse).
> 
> Oh, and there is always the nuclear 4th option and we just deprecate the
> random module. ;)

Why not add ssl.random() et al. (as interface to the OpenSSL
rand APIs) ?

By putting the crypto random stuff into the ssl module, even
people who don't know about the difference, will recognize
that the ssl version must be doing something more related to
crypto than the regular random module one, which never promised
this.
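
For reference, part of this already exists: since Python 3.3 the ssl module
exposes ssl.RAND_bytes(), a thin wrapper around OpenSSL's random byte
generator. A minimal sketch follows; the 16-byte length is just an example,
and a higher-level ssl.random() as suggested above is hypothetical.

```python
import ssl

# 16 bytes from OpenSSL's random number generator.
token = ssl.RAND_bytes(16)

print(token.hex())
```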

Some background on why I think deterministic RNGs are more
useful to have as default than non-deterministic ones:

A common use case for me is to write test data generators
for large database systems. For such generators, I don't keep
the many GBs data around, but instead make the generator take a
few parameters which then seed the RNGs, the time module and
a few other modules via monkey-patching.

This allows me to create reproducible test sets in a very
efficient way. The feature to be able to reproduce a set is
typically only needed when tracking down a bug in the
system, but the whole setup avoids having to keep the whole
test sets around on disk.
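
A minimal sketch of the idea, under simplifying assumptions: instead of
seeding the module-level generator and monkey-patching other modules, it
uses a private random.Random instance, and the row layout and the name
generate_rows() are invented for the example.

```python
import random


def generate_rows(seed, count):
    """Yield `count` fake customer rows, fully determined by `seed`."""
    rng = random.Random(seed)  # private generator; global state is untouched
    for row_id in range(count):
        yield (
            row_id,
            "customer-%06d" % rng.randrange(10**6),
            round(rng.uniform(0, 1000), 2),
        )


# Only the parameters need to be stored; the data can be regenerated at will.
first = list(generate_rows(seed=2015, count=3))
again = list(generate_rows(seed=2015, count=3))
other = list(generate_rows(seed=2016, count=3))

print(first == again)
```

Regenerating with the same seed reproduces the same rows, which is the
property that a non-deterministic default generator cannot offer.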

-- 
Marc-Andre Lemburg
eGenix.com


From srkunze at mail.de  Thu Sep 10 18:42:46 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 18:42:46 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>	<55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
Message-ID: <55F1B306.5070705@mail.de>

On 10.09.2015 06:12, Jukka Lehtosalo wrote:
> This has been discussed almost to the death before,

I am sorry. :)

> but these are some of the main benefits as I see them:
> - Code becomes more readable. This is especially true for code that 
> doesn't have very detailed docstrings.

If I have code without docstrings, I better write docstrings then. ;)

I mean when I am really going to touch that file to improve 
documentation (which annotations are a piece of), I am going to add more 
information for the reader of my API and that mostly will be describing 
the behavior of the API.

If my variables have such crappy names that I need to add type hints to
them, well, then I'd rather fix the names first.

> This may go against the intuition of some people, but my experience 
> strongly suggests this, and many others who've used optional typing 
> have shared the sentiment. It probably takes a couple of days before 
> you get used to the type annotations, after which they likely won't 
> distract you any more but will actually improve code understanding by 
> providing important contextual information that is often difficult to 
> infer otherwise.
> - Tools can automatically find most (simple) bugs of certain common 
> kinds in statically typed code. A lot of production code has way below 
> 100% test coverage, so this can save many manual testing iterations 
> and help avoid breaking stuff in production due to stupid mistakes 
> (that humans are bad at spotting).
> - Refactoring becomes way less scary, especially if you don't have 
> close to 100% test coverage. A type checker can find many mistakes 
> that are commonly introduced when refactoring code.
>
> You'll get the biggest benefits if you are working on a large code 
> base mostly written by other people with limited test coverage and 
> little comments or documentation.

If I had a large untested and undocumented code base (well, I actually 
have), then static type checking would be ONE tool for finding issues.

Once they are found, I write tests like hell. Tests, tests, tests. I would 
not add type annotations. I need tested functionality, not proper typing.

> You get extra credit if your tests are slow to run and flaky,

We are problem solvers. So, I would tell my team: "make them faster and 
more reliable".

> I consider that difference pretty significant. I wouldn't want to 
> increase the fraction of unchecked parts of my annotated code by a 
> factor of 8, and I want to have control over which parts can be type 
> checked.

Granted. But you still don't know if your code runs correctly. You are 
better off with tests. And I agree type checking is 1 test to perform 
(out of 10K).

But:

>
>     I don't see the effort for adding type hints AND the effort for
>     further parsing (by human eyes) justified by partially better IDE
>     support and 1 single additional test within test suites of about
>     10,000s of tests.
>
>     Especially, when considering that correct types don't prove
>     functionality in any case. But tested functionality in some way
>     proves correct typing.
>

I didn't see you respond to that. But you probably know that. :)

Thanks for responding anyway. It is helpful to see your intentions, 
though I don't agree with it 100%.

Moreover, I think it is about time to talk about this. If it were not 
you, somebody else would finally have added type hints to Python. Keep 
up the good work. +1

Best,
Sven

From brett at python.org  Thu Sep 10 18:53:59 2015
From: brett at python.org (Brett Cannon)
Date: Thu, 10 Sep 2015 16:53:59 +0000
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <CAP1=2W6HiFOppWnf=+cxS6JrYMsOe8P_FYX+TR1pm6PfpbP6VA@mail.gmail.com>

I like the idea enough that I'm +1 on moving forward with a PEP.

On Wed, 9 Sep 2015 at 13:19 Guido van Rossum <guido at python.org> wrote:

> Jukka wrote up a proposal for structural subtyping. It's pretty good.
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>
> --
> --Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Sep 10 19:00:22 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 03:00:22 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
Message-ID: <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>

On 11 September 2015 at 02:05, Brett Cannon <brett at python.org> wrote:
> +1 for deprecating module-level functions and putting everything into
> classes to force a choice

-1000, as this would be a *huge* regression in Python's usability for
educational use cases. (Think 7-8 year olds that are still learning to
read, not teenagers or adults with more fully developed vocabularies)

A reasonable "Hello world!" equivalent for introducing randomness to
students is rolling a 6-sided die, as that relates to a real world
object they'll often be familiar with. At the moment that reads as
follows:

>>> from random import randint
>>> randint(1, 6)
6
>>> randint(1, 6)
3
>>> randint(1, 6)
1
>>> randint(1, 6)
4

Another popular educational exercise is the "Guess a number" game,
where the program chooses a random number from 1-100, and the person
playing the game has to guess what it is. Again, randint() works fine
here.

Shuffling decks of cards, flipping coins, these are all things used to
introduce learners to modelling random events in the real world in
software, and we absolutely do *not* want to invalidate the extensive
body of educational material that assumes the current module level API
for the random module.

> +0 for deprecating the seed-related functions and saying "the stdlib uses
> what it uses as a RNG and you have to live with it if you don't make your own
> choice" and switching to a crypto-secure RNG.

However, this I'm +1 on. People *do* use the module level APIs
inappropriately, and we can get them to a much safer place, while
nudging folks that genuinely need deterministic randomness towards an
alternative API.

The key for me is that folks that actually *need* deterministic
randomness *will* be calling the stateful module level APIs. This
means we can put the deprecation warnings on *those* methods, and
leave them out for the others.

In terms of practical suggestions, rather than DeterministicRandom and
NonDeterministicRandom, I'd actually go with the simpler terms
SeededRandom and SeedlessRandom (there's a case to be made that those
are misnomers, but I'll go into that more below):

SeededRandom: Mersenne Twister
SeedlessRandom: new CSPRNG
SystemRandom: os.urandom()

Phase one of transition:

* add SeedlessRandom
* rename Random to SeededRandom
* Random becomes a subclass of SeededRandom that deprecates all
methods not shared with SeedlessRandom
* this will also effectively deprecate the corresponding module level functions
* any SystemRandom methods that are no-ops (like seed()) are deprecated

Phase two of transition:

* Random becomes an alias for SeedlessRandom
* deprecated methods are removed from SystemRandom
* deprecated module level functions are removed

As far as the proposed Seeded/Seedless naming goes, that deliberately
glosses over the fact that "seed" gets used to refer to two different
things - seeding a PRNG with entropy, and seeding a deterministic PRNG
with a particular seed value. The key is that "SeedlessRandom" won't
have a "seed()" *method*, and that's the single most salient fact
about it from a user experience perspective: you can't get the same
output by providing the same seed value, because we wouldn't let you
provide a seed value at all.
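
As a rough sketch of how the end state might feel to a user: SeededRandom
below is simply today's random.Random, and SeedlessRandom is a stand-in that
delegates to random.SystemRandom (so os.urandom()) rather than to the new
userland CSPRNG being proposed. Both class bodies are assumptions made for
illustration, not the proposed implementation.

```python
import random


class SeededRandom(random.Random):
    """Deterministic generator: today's Mersenne Twister based Random."""


class SeedlessRandom:
    """Stand-in for the proposed seedless generator.

    Assumption: delegates to random.SystemRandom, not to a userland CSPRNG.
    """

    _STATEFUL = frozenset({"seed", "getstate", "setstate"})

    def __init__(self):
        self._rng = random.SystemRandom()

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for delegated names.
        if name.startswith("_") or name in self._STATEFUL:
            raise AttributeError(name)
        return getattr(self._rng, name)


dice = SeedlessRandom()
roll = dice.randint(1, 6)          # the classroom die roll still works
has_seed = hasattr(dice, "seed")   # False: no seed() method at all

replay_a = SeededRandom(42).random()
replay_b = SeededRandom(42).random()

print(roll, has_seed, replay_a == replay_b)
```

The point of the sketch is the user-facing difference: the seedless object
has no seed() method at all, while the seeded one replays the same sequence
for the same seed value.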

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From srkunze at mail.de  Thu Sep 10 19:01:34 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 19:01:34 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F0E1F2.6040709@brenbarn.net>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0E1F2.6040709@brenbarn.net>
Message-ID: <55F1B76E.2030602@mail.de>



On 10.09.2015 03:50, Brendan Barnwell wrote:
> On 2015-09-09 13:17, Guido van Rossum wrote:
>> Jukka wrote up a proposal for structural subtyping. It's pretty good.
>> Please discuss.
>>
>> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>
>     I'm not totally hip to all the latest typing developments,

You bet what I am.

> but I'm not sure I fully understand the benefit of this protocol 
> concept.  At the beginning it says that classes have to be explicitly 
> marked to support these protocols.  But why is that? Doesn't the 
> existing __subclasshook__ already allow an ABC to use any criteria it 
> likes to determine if a given class is considered a subclass?  So 
> couldn't ABCs like the ones we already have inspect the type 
> annotations and decide a class "counts" as an iterable (or whatever) 
> if it defines the right methods with the right type hints?
>

The benefit from what I understand is actually really, really nice. It's 
basically adding the ability to shorten the following 'capability' check:

if (hasattr(obj, 'important') and hasattr(obj, 'relevant')
        and hasattr(obj, 'necessary')):
     # do

to

if implements(obj, protocol):
     # do


As usual with type hints, functionality is not guaranteed. But it 
simplifies sanity checks OR decision making:


if implements(obj, protocol1):
     # do this
elif implements(obj, (protocol2, protocol3)):
     # do that


The ability to extract all protocols of a type would provide a more 
flexible way of decision making and processing such as:

if my_protocol in obj.__protocols__:
     # iterate over the protocols and do something
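
A minimal sketch of such a helper, under stated assumptions: implements() is hypothetical, a protocol is modelled as a plain frozenset of attribute names, and a tuple of protocols matches if any one of them matches (mirroring isinstance). Only attribute names are checked, not signatures or type hints.

```python
def implements(obj, protocols):
    """Return True if obj has every attribute named by one of the protocols.

    Hypothetical helper: a protocol is a frozenset of attribute names, and
    a tuple of protocols matches when any single protocol matches.
    """
    if isinstance(protocols, frozenset):
        protocols = (protocols,)
    return any(
        all(hasattr(obj, name) for name in protocol)
        for protocol in protocols
    )


# Example protocols, invented for this sketch.
Readable = frozenset({"read", "close"})
Appendable = frozenset({"append"})

print(implements([], Appendable))              # True
print(implements([], Readable))                # False
print(implements([], (Readable, Appendable)))  # True
```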


@Jukka
I haven't found the abilities described above. Would it make sense to 
add them (unless they're already there)?


Best,
Sven

From donald at stufft.io  Thu Sep 10 19:02:12 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 13:02:12 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
Message-ID: <etPan.55f1b794.3533b399.31bc@Draupnir.home>

On September 10, 2015 at 10:21:11 AM, Donald Stufft (donald at stufft.io) wrote:
> Assuming Theo's claim of the speed of the ChaCha based arc4random function
> is accurate, which I haven't tested but I assume he's smart enough to know
> what he's talking about WRT the speed of it.

I wanted to try and test this. These are not super scientific since I just ran
them on a single computer once (but 10 million iterations each) but I think it
can probably give us an indication of the differences?

I put the code up at https://github.com/dstufft/randtest but it's a pretty
simple module. I'm not sure if (double)arc4random() / UINT_MAX is a reasonable
way to get a double out of arc4random (which returns a uint) that is between
0.0 and 1.0, but I assume it's fine at least for this test.

Here's the results from running the test on my personal computer which is
running the OSX El Capitan public Beta:

    $ python test.py
    Number of Calls:  10000000
    +---------------+--------------------+
    | method        | usecs per call     |
    +---------------+--------------------+
    | deterministic | 0.0586802460020408 |
    | system        | 1.6681434757076203 |
    | userland      | 0.1534261149005033 |
    +---------------+--------------------+


I'll try it against OpenBSD later to see if their implementation of arc4random
is faster than OSX.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From ncoghlan at gmail.com  Thu Sep 10 19:03:18 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 03:03:18 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CAPTjJmrP-mgTDP9t2AeUQcB+=svmka2pvNiF+TVk+mGpFt74bw@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CADiSq7eV30Z4VWXF-T0eZVfGdmq76SaE-QE_BTxhrkAkB4joMA@mail.gmail.com>
 <CAPTjJmrP-mgTDP9t2AeUQcB+=svmka2pvNiF+TVk+mGpFt74bw@mail.gmail.com>
Message-ID: <CADiSq7f8Vwb=sYQmUnOGCPy0Hajq90iUHySOdEtU0hmxvxJW1w@mail.gmail.com>

On 11 September 2015 at 02:30, Chris Angelico <rosuav at gmail.com> wrote:
> On Fri, Sep 11, 2015 at 1:55 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> The second couldn't be handled the way virtualenv does things, but it
>> *could* be handled through a tool like vex which creates new subshells
>> and runs commands in those rather than altering the current shell:
>>
>> $ python3
>> Python 3.4.2 (default, Jul  9 2015, 17:24:30)
>> [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> import subprocess
>>>>> subprocess.call(["vex", "nikola", "python"])
>> Python 2.7.10 (default, Jul  5 2015, 14:15:43)
>> [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> print("Hello virtual environment!")
>> Hello virtual environment!
>>>>>
>
> Hmm. This looks like something that could confuse people no end. I
> already see a lot of people use Ctrl-Z to get out of a program (often
> because they've come from Windows, I think), and this would be yet
> another way to get lost as to which of various Python environments
> you're in. Is it safe to have Python exec to another process? That
> way, there's no "outer" Python to be left behind, and it'd feel like a
> transition rather than a nesting. ("Please note: Selecting a virtual
> environment restarts Python.")

Using subprocess.call() to invoke vex was something I could do without
writing a single line of code outside the REPL. An actual PEP would
presumably propose something with a much nicer UX :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From srkunze at mail.de  Thu Sep 10 19:07:43 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 19:07:43 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <55F1B8DF.5060001@mail.de>

On 09.09.2015 22:17, Guido van Rossum wrote:
> Jukka wrote up a proposal for structural subtyping. It's pretty good. 
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867

15) How would Protocol be implemented?
"Implement metaclass functionality to detect whether a class is a 
protocol or not. Maybe add a class attribute such as __protocol__ = True 
if that's the case"

If you consider the __protocols__ attribute I mentioned in an earlier 
post, I would like to see __protocol__ renamed to __is_protocol__. I 
think that would make it more readable in the long run.

Best,
Sven

From rosuav at gmail.com  Thu Sep 10 19:11:19 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 11 Sep 2015 03:11:19 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
Message-ID: <CAPTjJmqkX6x+JuEuTsfmGrWjBbg=Mcnzg6UzQz73-hCZVbhF1w@mail.gmail.com>

On Fri, Sep 11, 2015 at 3:00 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> As far as the proposed Seeded/Seedless naming goes, that deliberately
> glosses over the fact that "seed" gets used to refer to two different
> things - seeding a PRNG with entropy, and seeding a deterministic PRNG
> with a particular seed value. The key is that "SeedlessRandom" won't
> have a "seed()" *method*, and that's the single most salient fact
> about it from a user experience perspective: you can't get the same
> output by providing the same seed value, because we wouldn't let you
> provide a seed value at all.

Aside from sounding like varieties of grapes in a grocery, those names
seem just fine. From the POV of someone with a bit of comprehension of
crypto (as in, "use /dev/urandom rather than a PRNG", but not enough
knowledge to actually build or verify these things), the distinction
is precise: with SeededRandom, I can give it a seed and get back a
predictable sequence of numbers, but with SeedlessRandom, I can't. I'm
not sure what the difference is between "seeding a PRNG with entropy"
and "seeding a deterministic PRNG with a particular seed value",
though; aside from the fact that one of them uses a known value and
the other doesn't, of course. Back in my BASIC programming days, we
used to use "RANDOMIZE TIMER" to seed the RNG with time-of-day, or
"RANDOMIZE 12345" (or other value) to seed with a particular value;
they're the same operation, but one's considered random and the
other's considered predictable. (Of course, bytes from /dev/urandom
will be a lot more entropic than "number of centiseconds since
midnight", but for a single-player game that wants to provide a
different starting layout every time you play, the latter is
sufficient.)
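
In Python terms the two BASIC idioms look roughly like this. It is a sketch only: the centisecond value is derived from the UTC clock rather than local midnight, and the "layout" is just a list of small integers invented for the example.

```python
import random
import time

# "RANDOMIZE 12345": a fixed seed replays the same sequence every run.
fixed_a = random.Random(12345)
fixed_b = random.Random(12345)

# "RANDOMIZE TIMER": same operation, but the seed comes from the clock
# (roughly centiseconds within the current UTC day).
centiseconds = int(time.time() * 100) % 8640000
clock = random.Random(centiseconds)

layout_a = [fixed_a.randrange(10) for _ in range(5)]
layout_b = [fixed_b.randrange(10) for _ in range(5)]
clock_layout = [clock.randrange(10) for _ in range(5)]

print(layout_a == layout_b)  # True
```

The clock-seeded layout differs between plays but is guessable by anyone who knows roughly when the program started, which is fine for a single-player game and not for anything security-sensitive.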

ChrisA

From rosuav at gmail.com  Thu Sep 10 19:17:32 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 11 Sep 2015 03:17:32 +1000
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <CADiSq7f8Vwb=sYQmUnOGCPy0Hajq90iUHySOdEtU0hmxvxJW1w@mail.gmail.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CADiSq7eV30Z4VWXF-T0eZVfGdmq76SaE-QE_BTxhrkAkB4joMA@mail.gmail.com>
 <CAPTjJmrP-mgTDP9t2AeUQcB+=svmka2pvNiF+TVk+mGpFt74bw@mail.gmail.com>
 <CADiSq7f8Vwb=sYQmUnOGCPy0Hajq90iUHySOdEtU0hmxvxJW1w@mail.gmail.com>
Message-ID: <CAPTjJmrTrM400s-_dWOyyEQyLs0Qo7ZCsZm1TJsZp16rDbu9_g@mail.gmail.com>

On Fri, Sep 11, 2015 at 3:03 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Hmm. This looks like something that could confuse people no end. I
>> already see a lot of people use Ctrl-Z to get out of a program (often
>> because they've come from Windows, I think), and this would be yet
>> another way to get lost as to which of various Python environments
>> you're in. Is it safe to have Python exec to another process? That
>> way, there's no "outer" Python to be left behind, and it'd feel like a
>> transition rather than a nesting. ("Please note: Selecting a virtual
>> environment restarts Python.")
>
> Using subprocess.call() to invoke vex was something I could do without
> writing a single line of code outside the REPL. An actual PEP would
> presumably propose something with a much nicer UX :)

Heh, fair enough! Mainly, though, I'm wondering whether there'd be any
risks to using os.exec* from the REPL - anything that would make it a
bad idea to even consider that approach.
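
One documented caveat is that os.exec* replaces the process immediately: buffered output is not flushed and no further Python code in the old interpreter runs. The sketch below illustrates that on a POSIX system by running the exec inside a throwaway child interpreter, so the calling session survives. The CHILD program and its messages are made up for the example.

```python
import subprocess
import sys

# Child program: prints, then replaces itself with a fresh interpreter.
CHILD = r"""
import os, sys
print("before exec", flush=True)  # flush by hand: exec will not do it
os.execv(sys.executable, [sys.executable, "-c", "print('after exec')"])
print("unreachable")              # never runs if the exec succeeds
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD], capture_output=True, text=True
)
print(result.stdout)
```

Anything the REPL still holds at that point (unsaved history, open files with buffered writes, atexit handlers) would need to be dealt with before the exec call.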

ChrisA

From tim.peters at gmail.com  Thu Sep 10 19:23:48 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 10 Sep 2015 12:23:48 -0500
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f1b794.3533b399.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <etPan.55f1b794.3533b399.31bc@Draupnir.home>
Message-ID: <CAExdVNkZpi8gKNK=A-r+rsiera7gvin75ytTHNZ0AJGbXUUjFQ@mail.gmail.com>

[Donald Stufft <donald at stufft.io>, on arc4random speed]
> I wanted to try and test this. These are not super scientific since I just ran
> them on a single computer once (but 10 million iterations each) but I think it
> can probably give us an indication of the differences?
>
> I put the code up at https://github.com/dstufft/randtest but it's a pretty
> simple module. I'm not sure if (double)arc4random() / UINT_MAX is a reasonable
> way to get a double out of arc4random (which returns a uint) that is between
> 0.0 and 1.0, but I assume it's fine at least for this test.

arc4random() specifically returns uint32_t, which is 21 bits shy of
what's needed to generate a reasonable random double.  Our MT wrapping
internally generates two 32-bit uint32_t thingies, and pastes them
together like so (Python's C code here):

"""
/* random_random is the function named genrand_res53 in the original code;
 * generates a random number on [0,1) with 53-bit resolution; note that
 * 9007199254740992 == 2**53; I assume they're spelling "/2**53" as
 * multiply-by-reciprocal in the (likely vain) hope that the compiler will
 * optimize the division away at compile-time.  67108864 is 2**26.  In
 * effect, a contains 27 random bits shifted left 26, and b fills in the
 * lower 26 bits of the 53-bit numerator.
 * The original code credited Isaku Wada for this algorithm, 2002/01/09.
 */
static PyObject *
random_random(RandomObject *self)
{
    PY_UINT32_T a=genrand_int32(self)>>5, b=genrand_int32(self)>>6;
    return PyFloat_FromDouble((a*67108864.0+b)*(1.0/9007199254740992.0));
}
"""

So now you know how to make it more directly comparable.  The
high-order bit is that it requires 2 calls to the 32-bit uint integer
primitive to get a double, and that can indeed be significant.
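
The same bit-pasting, transcribed into Python as a sketch. The function names are invented here, and os.urandom is used only as a convenient source of two 32-bit words; it is not arc4random.

```python
import os
import struct


def double_from_u32(hi, lo):
    """Combine two 32-bit unsigned ints into a float in [0, 1) with 53 bits."""
    a = hi >> 5  # keep the top 27 bits
    b = lo >> 6  # keep the top 26 bits
    # 67108864 == 2**26 and 9007199254740992 == 2**53
    return (a * 67108864.0 + b) * (1.0 / 9007199254740992.0)


def urandom_double():
    # Two 32-bit words per double, as in the C code above.
    hi, lo = struct.unpack("=II", os.urandom(8))
    return double_from_u32(hi, lo)


print(double_from_u32(0, 0))  # 0.0
print(double_from_u32(0xFFFFFFFF, 0xFFFFFFFF) < 1.0)  # True
```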


> Here's the results from running the test on my personal computer which is
> running the OSX El Capitan public Beta:
>
>     $ python test.py
>     Number of Calls:  10000000
>     +---------------+--------------------+
>     | method        | usecs per call     |
>     +---------------+--------------------+
>     | deterministic | 0.0586802460020408 |
>     | system        | 1.6681434757076203 |
>     | userland      | 0.1534261149005033 |
>     +---------------+--------------------+
>
>
> I'll try it against OpenBSD later to see if their implementation of arc4random
> is faster than OSX.

Just noting that most people timing the OpenBSD version seem to
comment out the "get stuff from the kernel periodically" part first,
in order to time the algorithm instead of the kernel ;-)  In real
life, though, they both count, so I like what you're doing better.

From ncoghlan at gmail.com  Thu Sep 10 19:27:50 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 03:27:50 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPTjJmqkX6x+JuEuTsfmGrWjBbg=Mcnzg6UzQz73-hCZVbhF1w@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <CAPTjJmqkX6x+JuEuTsfmGrWjBbg=Mcnzg6UzQz73-hCZVbhF1w@mail.gmail.com>
Message-ID: <CADiSq7dLaxDM3VKRgzJOGveRdYH9wZ2J-+xbfoJqGVB3C6xBLQ@mail.gmail.com>

On 11 September 2015 at 03:11, Chris Angelico <rosuav at gmail.com> wrote:
> On Fri, Sep 11, 2015 at 3:00 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> As far as the proposed Seeded/Seedless naming goes, that deliberately
>> glosses over the fact that "seed" gets used to refer to two different
>> things - seeding a PRNG with entropy, and seeding a deterministic PRNG
>> with a particular seed value. The key is that "SeedlessRandom" won't
>> have a "seed()" *method*, and that's the single most salient fact
>> about it from a user experience perspective: you can't get the same
>> output by providing the same seed value, because we wouldn't let you
>> provide a seed value at all.
>
> Aside from sounding like varieties of grapes in a grocery, those names
> seem just fine. From the POV of someone with a bit of comprehension of
> crypto (as in, "use /dev/urandom rather than a PRNG", but not enough
> knowledge to actually build or verify these things), the distinction
> is precise: with SeededRandom, I can give it a seed and get back a
> predictable sequence of numbers, but with SeedlessRandom, I can't. I'm
> not sure what the difference is between "seeding a PRNG with entropy"
> and "seeding a deterministic PRNG with a particular seed value",
> though; aside from the fact that one of them uses a known value and
> the other doesn't, of course.

Actually, that was just a mistake on my part - they're really the same
thing, and the only distinction is the one you mention: setting the
seed to a known value. Thus the main seed-related difference between
something like arc4random and other random APIs is the same one I'm
proposing to make here: it's seedless at the API level because it
takes care of collecting its own initial entropy from the operating
system's random number API.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From robert.kern at gmail.com  Thu Sep 10 19:29:22 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 10 Sep 2015 18:29:22 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
Message-ID: <msseli$a2l$1@ger.gmane.org>

On 2015-09-10 00:15, Nathaniel Smith wrote:
> On Wed, Sep 9, 2015 at 3:19 PM, Tim Peters <tim.peters at gmail.com> wrote:

>> The Twister's provably perfect equidistribution across its whole
>> period also has its scary sides.  For example, run random.random()
>> often enough, and it's _guaranteed_ you'll eventually reach a state
>> where the output is exactly 0.0 hundreds of times in a row.  That will
>> happen as often as it "should happen" by chance, but that's scant
>> comfort if you happen to hit such a state.  Indeed, the Twister was
>> patched relatively early in its life to try to prevent it from
>> _starting_ in such miserable states.   Such states are nevertheless
>> still reachable from every starting state.
>
> This criticism seems a bit unfair though -- even a true stream of
> random bits (e.g. from a true unbiased quantum source) has this
> property, and trying to avoid this happening would introduce bias that
> really could cause problems in practice. A good probabilistic program
> is one that has a high probability of returning some useful result,
> but they always have some low probability of returning something
> weird. So this is just saying that most people don't understand
> probability. Which is true, but there isn't much that the random
> module can do about it :-)

The MT actually does have a problem unique to it (or at least to its family of 
Generalized Feedback Shift Registers) where a state with a high proportion of 0 
bits will get stuck in a region of successive states with high proportions of 0 
bits. Other 623-dimensional equidistributed PRNGs will indeed come across the 
same states with high 0-bit sequences with the frequency that you expect from 
probability, but they will be surrounded by states with dissimilar 0-bit 
proportions. This problem isn't *due* to equidistribution per se, but I think 
Tim's point is that you are inevitably due to hit one such patch if you sample 
long enough.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From srkunze at mail.de  Thu Sep 10 19:35:56 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 19:35:56 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <CADiSq7emmO81fwShS_rVyH867UXwVPDo0Sx=eUoJMLdE=-DQVQ@mail.gmail.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>	<CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>	<55ED24C4.9000205@mail.de>	<CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>	<m21teac5p7.fsf@fastmail.com>	<B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>	<m2fv2plshe.fsf@fastmail.com>	<87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>	<m2egi9a62o.fsf@fastmail.com>	<55EF2B66.4020509@mail.de>	<1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>	<6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com>	<55F058B6.9000202@mail.de>	<1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>	<55F0E5C9.6030509@brenbarn.net>
 <CADiSq7emmO81fwShS_rVyH867UXwVPDo0Sx=eUoJMLdE=-DQVQ@mail.gmail.com>
Message-ID: <55F1BF7C.9060205@mail.de>

On 10.09.2015 17:36, Nick Coghlan wrote:
> This perspective doesn't grant enough credit to the significance of C
> in general, and the C ABI in particular, in the overall computing
> landscape. While a lot of folks have put a lot of work into making it
> possible to write software without needing to learn the details of
> what's happening at the machine level, it's still the case that the
> *one* language binding interface that *every* language runtime ends up
> including is being able to load and run C libraries.

Ah, now I understand. We need to add {} to C. That'll make it, right? ;)

Seriously, there are also other significant influences that fit better 
here: template engines. I know a couple of them using {} in some sense 
or another. C format strings are just one of them, so I wouldn't stress 
the significance of C that hard *in that particular instance*. There are 
other areas where C has its strengths.

> P.S. It's also worth remembering that many Pythonistas, including
> members of the core development team, happily switch between
> programming languages according to the task at hand. Python can still
> be our *preferred* language without becoming the *only* language we
> use :)

I hope everybody on this list knows that.:)

Best,
Sven

From srkunze at mail.de  Thu Sep 10 19:39:16 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 19:39:16 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <2D5621A7-0676-489D-886E-76E7D953870D@yahoo.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
 <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com> <55F0B677.3090500@mail.de>
 <2D5621A7-0676-489D-886E-76E7D953870D@yahoo.com>
Message-ID: <55F1C044.3040904@mail.de>

On 10.09.2015 01:14, Andrew Barnert wrote:
> Of course I can easily file a docs bug, with a patch, and possibly start a discussion on the relevant list to get wider discussion. But you can do that as easily as I can, and I don't know why you should anticipate better of me than you do of yourself. (If you don't feel capable of writing the change, because you're not a native speaker or your tech writing skills aren't as good as your coding skills or whatever, I won't argue that your English seems good enough to me; just write a "draft" patch and then ask for people to improve it. There are docs changes that have been done this way in the past, and I think there are more than enough people who'd be happy to help.)

I didn't know that. The Python development and discussion process is 
still somewhat opaque to me.

Btw. you asked for what could be improved and I responded. :)

Best,
Sven

From robert.kern at gmail.com  Thu Sep 10 19:41:39 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 10 Sep 2015 18:41:39 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBn+tWOtPPt+UqwGwYaRqozAZtU2xTdVhuZUaRvJnePGXQ@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
 <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
 <CAPJVwBn+tWOtPPt+UqwGwYaRqozAZtU2xTdVhuZUaRvJnePGXQ@mail.gmail.com>
Message-ID: <mssfcj$les$1@ger.gmane.org>

On 2015-09-10 04:56, Nathaniel Smith wrote:
> On Wed, Sep 9, 2015 at 8:35 PM, Tim Peters <tim.peters at gmail.com> wrote:
>> There are some clean and easy approaches to this based on
>> crypto-inspired schemes, but giving up crypto strength for speed.  If
>> you haven't read it, this paper is delightful:
>>
>>      http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
>
> It really is! As AES acceleration instructions become more common
> (they're now standard IIUC on x86, x86-64, and even recent ARM?), even
> just using AES in CTR mode becomes pretty compelling -- it's fast,
> deterministic, provably equidistributed, *and* cryptographically
> secure enough for many purposes.

I'll also recommend the PCG paper (and algorithm), as the author's cross-PRNG 
comparisons have been bandied about in this thread already. The paper lays out a 
lot of the relevant issues and balances the various qualities that are 
important: statistical quality, speed, and security (of various flavors).

   http://www.pcg-random.org/paper.html

I'm actually not that impressed with Random123. The core idea is nice and clean, 
but the implementation is hideously complex.
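
For readers who have not seen the paper, the core idea is a counter-based generator: output number i is a keyed function of i, with no other state to carry around. A minimal sketch follows, with loud assumptions: SHA-256 stands in for the AES or Philox-style rounds used in the paper (the stdlib has no AES), counter_random is a made-up name, and this says nothing about the speed or quality of the real library.

```python
import hashlib
import struct


def counter_random(key: bytes, counter: int) -> float:
    """Return the counter-th float in [0, 1) of the stream named by key."""
    block = hashlib.sha256(key + struct.pack(">Q", counter)).digest()
    (word,) = struct.unpack(">Q", block[:8])
    # Keep the top 53 bits so the result is an exact double below 1.0.
    return (word >> 11) / 9007199254740992.0


# Any element can be computed directly, in any order, on any worker.
first = counter_random(b"stream-1", 0)
millionth = counter_random(b"stream-1", 1_000_000)
print(first == counter_random(b"stream-1", 0))  # True
```

Because there is no sequential state, independent streams are just different keys and skipping ahead is free, which is what makes the approach attractive for parallel work.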

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From srkunze at mail.de  Thu Sep 10 19:43:40 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 19:43:40 +0200
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55F0C193.6000606@btinternet.com>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com> <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com> <55F058B6.9000202@mail.de>
 <55F0C193.6000606@btinternet.com>
Message-ID: <55F1C14C.9060608@mail.de>

Of course I would not want to force you, Rob.

I believe in progress, and progress is achieved through change. The 
best method for change I know of is deprecation: changes then come one 
at a time, with time to prepare, rather than bundled together.

To me, it's just a minor deficiency in Python's own vision.


Best,
Sven


On 10.09.2015 01:32, Rob Cliffe wrote:
> I use %-formatting.
> Not because I think it's so wonderful and solves all problems 
> (although it's pretty good), but because it appeared to be the 
> recommended method at the time I learned Python in earnest.  If I were 
> only learning Python now, I would probably learn str.format or 
> whatever it is.
> I *could* learn to use something else *and* change all my working 
> code, but do you really want to force me to do that?
> I would guess that there are quite a lot of Python users in the same 
> position.
> Rob Cliffe
>
> On 09/09/2015 17:05, Sven R. Kunze wrote:
>> On 09.09.2015 02:09, Andrew Barnert via Python-ideas wrote:
>>> I think it's already been established why % formatting is not going 
>>> away any time soon.
>>>
>>> As for de-emphasizing it, I think that's already done pretty well in 
>>> the current docs. The tutorial has a nice long introduction to 
>>> str.format, a one-paragraph section on "old string formatting" with 
>>> a single %5.3f example, and a one-sentence mention of Template. The 
>>> stdtypes chapter in the library reference explains the difference 
>>> between the two in a way that makes format sound more attractive for 
>>> novices, and then has details on each one as appropriate. What else 
>>> should be done?
>>
>> I had difficulties to find what you mean by tutorial. But hey, being 
>> a Python user for years and not knowing where the official tutorial 
>> resides...
>>
>> Anyway, Google presented me the version 2.7 of the tutorial. Thus, 
>> the link to the stdtypes documentation does not exhibit the note of, 
>> say, 3.5:
>>
>> "Note: The formatting operations described here exhibit a variety of 
>> quirks that lead to a number of common errors (such as failing to 
>> display tuples and dictionaries correctly). Using the newer 
>> str.format() interface helps avoid these errors, and also provides a 
>> generally more powerful, flexible and extensible approach to 
>> formatting text."
>>
>> So, adding it to the 2.7 docs would be a start.
>>
>>
>> I still don't understand what's wrong with deprecating %, but okay. I 
>> think f-strings will push {} to wide-range adoption.
>>
>>
>> Best,
>> Sven
>
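
The tuple quirk mentioned in the quoted docs note can be sketched as
follows. This is a minimal illustration, assuming Python 3; the variable
names are made up for the example.

```python
point = (3, 4)

# %-formatting treats a tuple on the right-hand side as the argument list,
# so a lone "%s" with a two-item tuple raises TypeError.
try:
    percent_result = "point: %s" % point
except TypeError as exc:
    percent_result = "TypeError: {}".format(exc)

# Wrapping the value in a one-item tuple is the usual workaround.
wrapped_result = "point: %s" % (point,)

# str.format() takes the tuple as a single argument, with no special casing.
format_result = "point: {}".format(point)

print(percent_result)
print(wrapped_result)
print(format_result)
```

Existing %-formatted code keeps working either way; the sketch only shows
the kind of error the docs note warns about.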


From srkunze at mail.de  Thu Sep 10 19:46:11 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 10 Sep 2015 19:46:11 +0200
Subject: [Python-ideas] Wheels For ...
In-Reply-To: <CAPJVwBmMm0tLmMHDUcAoVYVAxeovKDCyZXPcEpFOoyy_EY2cGg@mail.gmail.com>
References: <55EC78E9.1050300@mail.de>
 <20150907012645.GX19373@ando.pearwood.info>
 <etPan.55ed1087.5b6eedad.31bc@Draupnir.home>
 <F6CD4556-2AC1-4AD2-BF43-7F46060024FF@yahoo.com>
 <CAPJVwBmMm0tLmMHDUcAoVYVAxeovKDCyZXPcEpFOoyy_EY2cGg@mail.gmail.com>
Message-ID: <55F1C1E3.90202@mail.de>

Another example for the sake of documentation:

https://github.com/tornadoweb/tornado/issues/1383#issuecomment-84098055



On 07.09.2015 07:39, Nathaniel Smith wrote:
>
> On Sep 6, 2015 10:28 PM, "Andrew Barnert via Python-ideas" 
> <python-ideas at python.org <mailto:python-ideas at python.org>> wrote:
> >
> > On Sep 6, 2015, at 21:20, Donald Stufft <donald at stufft.io 
> <mailto:donald at stufft.io>> wrote:
> > >
> > > Let's take lxml for
> > > example which binds against libxml2. It needs built on Windows, it 
> needs built
> > > on OSX, it needs built on various Linux distributions in order to 
> cover the
> > > spread of just the common cases.
> >
> > IIRC, Apple included ancient versions (even at the time) of libxml2 
> up to around 10.7, and at one point they even included one of the 
> broken 2.7.x versions. So a build farm building for 10.6+ (which I 
> think is what python.org <http://python.org> builds still target?) is 
> going to build against an ancient libxml2, meaning some features of 
> lxml will be disabled, and others may even be broken. Even if I'm 
> remembering wrong about Apple, I'm sure there are linux distros with 
> similar issues.
> >
> > Fortunately, lxml has a built-in option (triggered by an env 
> variable) for dealing with this, by downloading the source, building a 
> local copy of the libs, and statically linking them into lxml, but 
> that means you need some way for a package to specify env variables to 
> be set on the build server. And can you expect most libraries with 
> similar issues to do the same?
>
> Yes, you can! :-)
>
> I mean, not everyone will necessarily use it, but adding code like
>
> if "PYPI_BUILD_SERVER" in os.environ:
>     do_static_link = True
>
> to your setup.py is *wayyyy* easier than buying an OS X machine and 
> maintaining it and doing manual builds at every release. Or finding a 
> volunteer who has an OS X box and nagging them at every release and 
> dealing with trust hassles.
>
> And there are a lot of packages out there that just have some cython 
> files in them for speedups with no external dependencies, or whatever. 
> A build farm wouldn't have to be perfect to be extremely useful.
>
> -n

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150910/e7677580/attachment-0001.html>

From tim.peters at gmail.com  Thu Sep 10 19:48:12 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 10 Sep 2015 12:48:12 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <msseli$a2l$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <msseli$a2l$1@ger.gmane.org>
Message-ID: <CAExdVNneknSC7=sfmwnb=_RSgjTYS_kURwj_PcBDP92dKiDcVA@mail.gmail.com>

[Robert Kern <robert.kern at gmail.com>]
> The MT actually does have a problem unique to it (or at least to its family
> of Generalized Feedback Shift Registers) where a state with a high
> proportion of 0 bits will get stuck in a region of successive states with
> high proportions of 0 bits. Other 623-dimensional equidistributed PRNGs will
> indeed come across the same states with high 0-bit sequences with the
> frequency that you expect from probability, but they will be surrounded by
> states with dissimilar 0-bit proportions. This problem isn't *due* to
> equidistribution per se, but I think Tim's point is that you are inevitably
> due to hit one such patch if you sample long enough.

Thank you for explaining it better than I did.  I implied MT's "stuck
in zero-land" problems were _due_ to perfect equidistribution, but of
course they're not.  It's just that MT's specific permutation of the
equidistributed-regardless-of-order range(1, 2**19937) systematically
puts integers with "lots of 0 bits" next to each other.

And there are many such patches.  But 2**19937 is so large you need to
contrive the state "by hand" to get there at once.  For example,

>>> random.setstate((2, (0,)*600 + (1,)*24 + (624,), None))
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0
>>> random.random()
0.0

That's "impossible" ;-)  (1 chance in 2**(53*12) of seeing 0.0
twelve times in a row)
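
The same experiment can be run without disturbing the shared module-level
generator by using a private random.Random instance. This is a minimal
sketch, assuming CPython's Mersenne Twister state layout (624 words plus a
position index) and that version-2 state tuples are still accepted by
setstate(); the names rng and draws are illustrative only.

```python
import random

rng = random.Random()

# Same hand-built state as in the session above:
# 600 zero words, 24 one words, then the position index 624.
rng.setstate((2, (0,) * 600 + (1,) * 24 + (624,), None))

# Draw twelve floats from the contrived state.
draws = [rng.random() for _ in range(12)]
print(draws)
```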

From 4kir4.1i at gmail.com  Thu Sep 10 20:06:28 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Thu, 10 Sep 2015 21:06:28 +0300
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
Message-ID: <87y4ge6tej.fsf@gmail.com>

Donald Stufft <donald at stufft.io> writes:

...
> Essentially, there are three basic types of uses of random (the concept, not
> the module). Those are:
>
> 1. People/usecases who absolutely need deterministic output given a seed and
>    for whom security properties don't matter.
> 2. People/usecases who absolutely need a cryptographically random output and
>    for whom having a deterministic output is a downside.
> 3. People/usecases that fall somewhere in between where it may or may not be
>    security sensitive or it may not be known if it's security sensitive.
>
> The people in group #1 are currently, in the Python standard library, best
> served using the MT random source as it provides exactly the kind of determinism
> they need. The people in group #2 are currently, in the Python standard
> library, best served using os.urandom (either directly or via
> random.SystemRandom).
>
> However, the third case is the one that Theo's suggestion is attempting to
> solve. In the current landscape, the security minded folks will tell these
> people to use os.urandom/random.SystemRandom and the performance or otherwise
> less security minded folks will likely tell them to just use random.py. Leaving
> these people with a random that is not cryptographically safe.
...

"security minded folks" [1] recommend "always use os.urandom()" and
advise against the *random* module [2,3] despite being aware of
random.SystemRandom() [4].

I.e., if they are right, then the *random* module probably only needs to
care about group #1 and should avoid creating a false sense of security
in group #3.

[1] https://github.com/pyca/cryptography/blob/92d8bd12609586bfa53cf8c7a691e37474aeccd1/AUTHORS.rst
[2] https://cryptography.io/en/latest/random-numbers/
[3]
https://github.com/pyca/cryptography/blob/92d8bd12609586bfa53cf8c7a691e37474aeccd1/docs/random-numbers.rst
[4] https://github.com/pyca/cryptography/issues/2278
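
For readers unfamiliar with the three options being compared, here is a
minimal sketch using only standard-library calls. The variable names and
the seed value are arbitrary, and nothing here is taken from the
pyca/cryptography docs.

```python
import os
import random

# Raw bytes straight from the operating system's CSPRNG.
token_bytes = os.urandom(16)

# SystemRandom offers the familiar random.Random API on top of os.urandom();
# it ignores seeding and cannot replay a sequence.
sysrand = random.SystemRandom()
pin = sysrand.randrange(10**6)

# A seeded Mersenne Twister instance is fully reproducible (group #1),
# which is exactly why it is unsuitable for secrets.
mt = random.Random(12345)
replay_a = [mt.randrange(10**6) for _ in range(3)]
mt.seed(12345)
replay_b = [mt.randrange(10**6) for _ in range(3)]

print(len(token_bytes), replay_a == replay_b)
```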


From donald at stufft.io  Thu Sep 10 20:19:20 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 14:19:20 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <87y4ge6tej.fsf@gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <87y4ge6tej.fsf@gmail.com>
Message-ID: <etPan.55f1c9a8.60ad8f7b.31bc@Draupnir.home>

On September 10, 2015 at 2:08:46 PM, Akira Li (4kir4.1i at gmail.com) wrote:
>  
> "security minded folks" [1] recommend "always use os.urandom()" and
> advise against *random* module [2,3] despite being aware of
> random.SystemRandom() [4]
>  
> i.e., if they are right then *random* module probably only need to care
> about group #1 and avoid creating the false sense of security in group #3.
>  

Maybe you didn't notice you're talking to the third name in the list of authors
that you linked to, but that documentation is there primarily because the
random module's API is problematic and it's easier to recommend people to not
use it than to try and explain how to use it safely.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From brenbarn at brenbarn.net  Thu Sep 10 20:24:19 2015
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Thu, 10 Sep 2015 11:24:19 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F1B76E.2030602@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0E1F2.6040709@brenbarn.net> <55F1B76E.2030602@mail.de>
Message-ID: <55F1CAD3.7050602@brenbarn.net>

On 2015-09-10 10:01, Sven R. Kunze wrote:
>
>
> On 10.09.2015 03:50, Brendan Barnwell wrote:
>> On 2015-09-09 13:17, Guido van Rossum wrote:
>>> Jukka wrote up a proposal for structural subtyping. It's pretty good.
>>> Please discuss.
>>>
>>> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>>
>>     I'm not totally hip to all the latest typing developments,
>
> You bet what I am.
>
>> but I'm not sure I fully understand the benefit of this protocol
>> concept.  At the beginning it says that classes have to be explicitly
>> marked to support these protocols.  But why is that? Doesn't the
>> existing __subclasshook__ already allow an ABC to use any criteria it
>> likes to determine if a given class is considered a subclass?  So
>> couldn't ABCs like the ones we already have inspect the type
>> annotations and decide a class "counts" as an iterable (or whatever)
>> if it defines the right methods with the right type hints?
>>
>
> The benefit from what I understand is actually really, really nice. It's
> basically adding the ability to shorten the following 'capability' check:
>
> if (hasattr(obj, 'important') and hasattr(obj, 'relevant')
>         and hasattr(obj, 'necessary')):
>       # do
>
> to
>
> if implements(obj, protocol):
>       # do

	Right, but can't you already do that with ABCs, as in the example in 
the docs (https://docs.python.org/2/library/abc.html)?  You can write an 
ABC whose __subclasshook__ does whatever hasattr checks you want (and, 
if you want, checks the type annotations too), and then you can use 
isinstance/issubclass to check if a given instance/class "provides the 
protocol" described by that ABC.
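
A minimal sketch of that approach, modelled on the __subclasshook__
example in the abc docs but written in Python 3 syntax. The class names
Capable, Good and Bad are hypothetical; the attribute names come from the
hasattr example quoted above.

```python
from abc import ABCMeta


class Capable(metaclass=ABCMeta):
    """Hypothetical ABC: any class defining all three names counts."""

    _required = ("important", "relevant", "necessary")

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Capable:
            # Look each name up along the MRO, as the abc docs example does.
            if all(any(name in B.__dict__ for B in C.__mro__)
                   for name in cls._required):
                return True
        return NotImplemented


class Good:
    def important(self): pass
    def relevant(self): pass
    def necessary(self): pass


class Bad:
    def important(self): pass


print(isinstance(Good(), Capable), isinstance(Bad(), Capable))
```

Note that this check looks at classes, so attributes assigned on an
instance in __init__ are not seen, unlike hasattr(obj, ...).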

-- 
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."
    --author unknown

From ericsnowcurrently at gmail.com  Thu Sep 10 20:32:11 2015
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 10 Sep 2015 12:32:11 -0600
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
Message-ID: <CALFfu7BuxjuWVPjdOK0XeXT+YaSUsLhU=uDwBVuF-Pjbd9O08Q@mail.gmail.com>

On Thu, Sep 10, 2015 at 9:46 AM, Brett Cannon <brett at python.org> wrote:
> Oh, and there is always the nuclear 4th option and we just deprecate the
> random module. ;)

Or move it under the math module (a la Go).

-eric

From 4kir4.1i at gmail.com  Thu Sep 10 20:40:32 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Thu, 10 Sep 2015 21:40:32 +0300
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f1c9a8.60ad8f7b.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <87y4ge6tej.fsf@gmail.com>
 <etPan.55f1c9a8.60ad8f7b.31bc@Draupnir.home>
Message-ID: <CAMZNu4d6dZaZi+BkDV275-XOSvrth3p1zMCUbk98Jq45Qa+gLQ@mail.gmail.com>

On Thu, Sep 10, 2015 at 9:19 PM, Donald Stufft <donald at stufft.io> wrote:

> On September 10, 2015 at 2:08:46 PM, Akira Li (4kir4.1i at gmail.com) wrote:
> >
> > "security minded folks" [1] recommend "always use os.urandom()" and
> > advise against *random* module [2,3] despite being aware of
> > random.SystemRandom() [4]
> >
> > i.e., if they are right then *random* module probably only need to care
> > about group #1 and avoid creating the false sense of security in group
> #3.
> >
>
> Maybe you didn't notice you're talking to the third name in the list of
> authors
> that you linked to,


Obviously, I've noticed it but I didn't want to call you out.

> but that documentation is there primarily because the
> random module's API is problematic and it's easier to recommend people to
> not
> use it than to try and explain how to use it safely.
>
>
"it's easier to recommend people to not use it than to try and explain how
to use it safely." That is exactly the point: if random.SystemRandom() is
not safe to use while being based on the "secure" os.urandom(), then
providing the same API based on the (possibly less secure) arc4random()
won't be any safer.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150910/db226ca4/attachment-0001.html>

From donald at stufft.io  Thu Sep 10 20:50:35 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 14:50:35 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAExdVNkZpi8gKNK=A-r+rsiera7gvin75ytTHNZ0AJGbXUUjFQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <etPan.55f1b794.3533b399.31bc@Draupnir.home>
 <CAExdVNkZpi8gKNK=A-r+rsiera7gvin75ytTHNZ0AJGbXUUjFQ@mail.gmail.com>
Message-ID: <etPan.55f1d0fc.790bfdb5.31bc@Draupnir.home>

On September 10, 2015 at 1:24:05 PM, Tim Peters (tim.peters at gmail.com) wrote:
>  
> So now you know how to make it more directly comparable. The
> high-order bit is that it requires 2 calls to the 32-bit uint integer
> primitive to get a double, and that can indeed be significant.

It didn't change the results really though:

My OSX El Capitan machine:

Number of Calls: 10000000
+---------------+---------------------+
| method        | usecs per call      |
+---------------+---------------------+
| deterministic | 0.05792283279588446 |
| system        | 1.7192466521984897  |
| userland      | 0.17901834140066059 |
+---------------+---------------------+


An OpenBSD 5.7 VM:

Number of Calls: 10000000
+---------------+---------------------+
| method        | usecs per call      |
+---------------+---------------------+
| deterministic | 0.06555143180000868 |
| system        | 0.8929547749999983  |
| userland      | 0.16291017429998647 |
+---------------+---------------------+
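
A measurement of this kind could be sketched roughly as below. This is
not the script that produced the tables above: the helper usecs_per_call
and the call count are assumptions, and the "userland" (arc4random-style)
generator is left out because the standard library has no such generator.

```python
import random
import timeit

calls = 100000  # far fewer than the 10000000 above, to keep the run short

deterministic = random.Random(0)   # seeded Mersenne Twister
system = random.SystemRandom()     # backed by os.urandom()


def usecs_per_call(func, number):
    """Average wall-clock microseconds per call of func()."""
    return timeit.timeit(func, number=number) / number * 1e6


results = {
    "deterministic": usecs_per_call(deterministic.random, calls),
    "system": usecs_per_call(system.random, calls),
}

for name, usecs in results.items():
    print("{:<13} {:.4f} usecs per call".format(name, usecs))
```

Absolute numbers will differ by machine and Python version; only the
relative cost of the two sources is of interest.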



-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From kramm at google.com  Thu Sep 10 20:57:21 2015
From: kramm at google.com (Matthias Kramm)
Date: Thu, 10 Sep 2015 11:57:21 -0700 (PDT)
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <9683c40d-b662-4b77-947e-62c418be8468@googlegroups.com>

On Wednesday, September 9, 2015 at 1:19:12 PM UTC-7, Guido van Rossum wrote:
>
> Jukka wrote up a proposal for structural subtyping. It's pretty good. 
> Please discuss.
>
> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>

I like this proposal; given Python's flat nominal type hierarchy, it will 
be useful to have a parallel subtyping mechanism to give things finer 
granularity without having to resort to ABCs.

Are the return types of methods invariant or variant under this proposal?

I.e. if I have

  class A(Protocol):
    def f(self) -> int: ...

does

  class B:
    def f(self) -> bool:
      return True

implicitly implement the protocol A?

Also, marking Protocols using subclassing seems confusing and error-prone.
In your examples above, one would think that you could define a new 
protocol using

class SizedAndClosable(Sized):
    pass

instead of

class SizedAndClosable(Sized, Protocol):
    pass

because Sized is already a protocol.

Maybe the below would be a more intuitive syntax:

  @protocol
  class SizedAndClosable(Sized):
      pass

Furthermore, I strongly agree with #7. Typed, but optional, attributes are 
a bad idea.
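
For later readers: the return-type question can be tried out with
typing.Protocol, which was added in Python 3.8 via PEP 544 and did not
exist when this thread was written. The sketch below assumes that later
API rather than the draft under discussion; the function use() is
hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class A(Protocol):
    def f(self) -> int: ...


class B:
    # bool is a subclass of int, so a static checker that treats return
    # types covariantly accepts B wherever an A is expected.
    def f(self) -> bool:
        return True


def use(x: A) -> int:
    return int(x.f())


b = B()
# The runtime check only verifies that a method named f exists;
# it does not compare the annotations.
print(isinstance(b, A), use(b))
```

PEP 544 kept the explicit form for derived protocols: a subclass of a
protocol is itself a protocol only if Protocol is listed among its bases
again.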

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150910/2a9face3/attachment.html>

From chris.barker at noaa.gov  Thu Sep 10 21:04:23 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 10 Sep 2015 12:04:23 -0700
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <msra7n$h9t$1@ger.gmane.org>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <msra7n$h9t$1@ger.gmane.org>
Message-ID: <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>

however, he did bring up a python-idea worthy general topic:

Sometimes you want to iterate without doing anything with the results of
the iteration.

So the obvious is his example -- iterate N times:

for i in range(N):
    do_something

but you may need to look into the code (probably more than one line) to see
if i is used for anything.

I know there was talk way back about making integers iterable, so you could
do:

for i in 32:
   do something.

which would be slightly cleaner, but still has an extra i in there, and
this was soundly rejected anyway (for good  reason). IN fact, Python's
"for" is not really about iterating N times, it's about iteraton over a
sequence of objects. Ans I for one find:

for _ in range(N):

To be just fine -- really very little noise or performance overhead or
anything else.

However, I've found myself wanting a "make nothing comprehension". For some
reason, I find myself frequently following a pattern where I want to call
the same method on all the objects in a sequence:

for obj in a_sequence:
    obj.a_method()

but I like the compactness of comprehensions, so I do:

[obj.a_method() for obj in a_sequence]

but then this creates a list of the result from that method call. Which I
don't want, so I don't assign the results to anything, and it just goes
away.

But somehow it bugs me that I'm creating this (maybe big) ol' list full of
junk, just to have it deleted.

Anyone else think this is a use-case worth supporting better? Or should I
just get over it -- it's really not that expensive to create a list, after
all.
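
One existing way to get the compact form without building the throwaway
list is to feed a generator expression to a zero-length deque, as in the
"consume" recipe from the itertools docs. A minimal sketch; the Widget
class is hypothetical and exists only to count the calls.

```python
from collections import deque


class Widget:
    """Hypothetical object whose method is called only for its side effect."""

    calls = 0

    def a_method(self):
        Widget.calls += 1
        return "junk result"


a_sequence = [Widget() for _ in range(5)]

# The generator expression builds no list; a deque with maxlen=0 pulls
# every item through it and discards each result immediately.
deque((obj.a_method() for obj in a_sequence), maxlen=0)

print(Widget.calls)
```

Whether this reads better than the plain two-line for loop is a matter
of taste.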

-Chris








On Thu, Sep 10, 2015 at 12:07 AM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 9/9/2015 1:10 PM, Stephan Sahm wrote:
>
> I found a BUG in the standard while statement, which appears both in
>> python 2.7 and python 3.4 on my system.
>>
>
> No you did not, but aside from that: python-ideas is for ideas about
> future versions of python, not for bug reports, valid or otherwise.  You
> should have sent this to python-list, which is a place to report possible
> bugs.
>
> --
> Terry Jan Reedy



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150910/0cf9fd58/attachment-0001.html>

From donald at stufft.io  Thu Sep 10 21:24:21 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 10 Sep 2015 15:24:21 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAMZNu4d6dZaZi+BkDV275-XOSvrth3p1zMCUbk98Jq45Qa+gLQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <87y4ge6tej.fsf@gmail.com> <etPan.55f1c9a8.60ad8f7b.31bc@Draupnir.home>
 <CAMZNu4d6dZaZi+BkDV275-XOSvrth3p1zMCUbk98Jq45Qa+gLQ@mail.gmail.com>
Message-ID: <etPan.55f1d8e5.256bad52.31bc@Draupnir.home>

On September 10, 2015 at 2:40:54 PM, Akira Li (4kir4.1i at gmail.com) wrote:

> "it's easier to recommend people to not use it than to try and explain how
> to use it safely." that is exactly the point
> if random.SystemRandom() is not safe to use while being based on "secure"
> os.urandom() then providing the same API based on (possibly less secure)
> arc4random() won't be any safer.
>

"If the mountain won't come to Muhammad then Muhammad must go to the mountain."

In other words, we can write all the documentation in the world we want, and it
doesn't change the simple fact that by choosing a default, there are going to be
some people who will use it when it's inappropriate due to the fact that it is
the default. The practical effect of changing the default will be that some
cases are broken, but in a way that is obvious and trivial to fix, some cases
won't have any practical effect at all, and finally, for some people it's going
to take code that was previously completely insecure and make it either secure
or harder to exploit for people who are incorrectly using the API.

I wouldn't expect the documentation in pyca/cryptography to change, it'd still
recommend people to use os.urandom directly and we'd still recommend that
people should use SystemRandom/os.urandom in the random.py docs for things that
need to be cryptographically secure, this is just a safety net for people who
don't know or didn't listen.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From storchaka at gmail.com  Thu Sep 10 22:01:20 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 10 Sep 2015 23:01:20 +0300
Subject: [Python-ideas] Round division
Message-ID: <mssnir$uol$1@ger.gmane.org>

In Python there is an operation for floor division: a // b.

Ceiling division can easily be expressed via floor division: -((-a) // b).

But round division is more complicated. This operation is needed in 
Fraction.__round__ and in a number of methods in the datetime module (see 
_divide_and_round). Due to the complexity of a correct Python 
implementation, it is slower than plain division.

I propose to add a special function to the math module. This will not only 
speed up the Python implementations of the datetime and fractions 
modules, but will also encourage users to use the correct algorithm instead 
of the obvious but incorrect round(a/b).
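
To make the three operations concrete, here is a minimal pure-Python
sketch. The names ceil_div and round_div are hypothetical, and rounding
ties to the even integer is an assumption, chosen to match the built-in
round(); the logic follows the same idea as datetime's
_divide_and_round but is not a copy of it.

```python
def ceil_div(a, b):
    """Ceiling of a / b using only floor division."""
    return -((-a) // b)


def round_div(a, b):
    """Nearest integer to a / b, with ties going to the even integer."""
    q, r = divmod(a, b)
    # r carries the sign of b, so compare 2*r with b in that direction.
    r *= 2
    more_than_half = r > b if b > 0 else r < b
    if more_than_half or (r == b and q % 2 == 1):
        q += 1
    return q


# round(a/b) goes through a float first and loses precision for big ints.
big = 2**53 + 1
print(round_div(7, 2), round_div(5, 2), round_div(-7, 2))
print(round_div(big, 1) == big, round(big / 1) == big)
```

The last line shows where round(a/b) goes wrong: the float quotient
cannot represent 2**53 + 1, so the rounded result is off by one.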


From njs at pobox.com  Thu Sep 10 22:33:05 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 10 Sep 2015 13:33:05 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
Message-ID: <CAPJVwBk4Edcddcg4iS5V7cMH-g4G2SHU1YQ-wqN0dVKY=WQ-6A@mail.gmail.com>

On Sep 10, 2015 5:29 AM, "Paul Moore" <p.f.moore at gmail.com> wrote:
[...]
> You're claiming that the random
> module is security related. I'm claiming it's not, it's documented as
> not being, and that's clear to the people who use it for its intended
> purpose. Telling those people that you want to make a module designed
> for their use harder to use because people for whom it's not intended
> can't read the documentation which explicitly states that it's not
> suitable for them, is doing a disservice to those people who are
> already using the module correctly for its stated purpose.

Regarding the "harder to use" point (which is obviously just one of many
considerations in this whole debate):

I trained myself a few years ago to stop using the global random functions
and instead always pass around an explicit RNG object, and my experience is
that once I got into the habit it gave me a strict improvement in code
quality. Suddenly many more of my functions are deterministic ... well ...
functions ... of their inputs, and suddenly it's clearly marked in the
source which ones have randomness in their semantics, and suddenly it's
much easier to do things like refactor the code while preserving the output
for a given seed. (This is tricky because just changing the order in which
you do things can break your code. I wince in sympathy at people who have
to maintain code like your map-generation-from-a-seed example and *aren't*
using RNG objects explicitly.) The implicit global RNG is a piece of global
state, like global variables, and causes similar unpleasantness. Now that I
don't use it, I look back and it's like "huh, why did I always used to hit
myself in the face like that? That wasn't very pleasant." So this is what I
teach my collaborators and students now. Most of them just use the global
state by default because they don't even know about the OO option.

YMMV but that's my experience FWIW.
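
A minimal sketch of the explicit-RNG style described above. The function
scatter_points and the seed values are made up for illustration.

```python
import random


def scatter_points(n, rng):
    """Return n pseudo-random (x, y) pairs drawn from the supplied generator."""
    return [(rng.random(), rng.random()) for _ in range(n)]


# The caller owns the generator, so the output is a function of the seed
# and the arguments, not of hidden module-level state.
first = scatter_points(3, random.Random(2015))
second = scatter_points(3, random.Random(2015))
other = scatter_points(3, random.Random(42))

print(first == second, first == other)
```

Because nothing touches the global generator, unrelated code calling
random.random() in between cannot change what scatter_points returns.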

-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150910/074d981c/attachment.html>

From marky1991 at gmail.com  Thu Sep 10 23:13:18 2015
From: marky1991 at gmail.com (Mark Young)
Date: Thu, 10 Sep 2015 17:13:18 -0400
Subject: [Python-ideas] Round division
In-Reply-To: <mssnir$uol$1@ger.gmane.org>
References: <mssnir$uol$1@ger.gmane.org>
Message-ID: <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>

Pardon my ignorance, but what is the definition of round division? (if it
isn't "round(a/b)")
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150910/ca3b47ec/attachment.html>

From p.f.moore at gmail.com  Thu Sep 10 23:48:42 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Sep 2015 22:48:42 +0100
Subject: [Python-ideas] Round division
In-Reply-To: <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
Message-ID: <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>

On 10 September 2015 at 22:13, Mark Young <marky1991 at gmail.com> wrote:
> Pardon my ignorance, but what is the definition of round division? (if it
> isn't "round(a/b)")

I assumed it would be "what round(a/b) would give if it weren't
subject to weird floating point rounding issues". To put it another
way, if a / b is d remainder r, then I'd assume "round division" would
be d if r < b/2, d+1 if r > b/2, and (which of d, d+1?) if r == b/2.
(a, b, d and r are all integers).

If not, then I also would like to know what it means...

Either way, if it is introduced then it should be documented
(particularly as regards what happens when one or both of a, b are
negative) clearly, as it's not 100% obvious.

Also, is the math module the right place? All of the operations in the
math module (apart from factorial, for some reason...) are floating
point.
Paul

From python at mrabarnett.plus.com  Fri Sep 11 00:12:39 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Thu, 10 Sep 2015 23:12:39 +0100
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <msra7n$h9t$1@ger.gmane.org>
 <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>
Message-ID: <55F20057.2070103@mrabarnett.plus.com>

On 2015-09-10 20:04, Chris Barker wrote:
> however, he did bring up a python-idea worthy general topic:
>
> Sometimes you want to iterate without doing anything with the results of
> the iteration.
>
> So the obvious is his example -- iterate N times:
>
> for i in range(N):
>      do_something
>
> but you may need to look into the code (probably more than one line) to
> see if i is used for anything.
>
> I know there was talk way back about making integers iterable, so you
> could do:
>
> for i in 32:
>     do something.
>
> which would be slightly cleaner, but still has an extra i in there, and
> this was soundly rejected anyway (for good reason). In fact, Python's
> "for" is not really about iterating N times, it's about iteration over a
> sequence of objects. And I, for one, find:
>
> for _ in range(N):
>
> To be just fine -- really very little noise or performance overhead or
> anything else.
>
> However, I've found myself wanting a "make nothing comprehension". For
> some reason, I find myself frequently following a pattern where I want
> to call the same method on all the objects in a sequence:
>
> for obj in a_sequence:
>      obj.a_method()
>
> but I like the compactness of comprehensions, so I do:
>
> [obj.a_method() for obj in a_sequence]
>
> but then this creates a list of the result from that method call. Which
> I don't want, so I don't assign the results to anything, and it just
> goes away.
>
> But somehow it bugs me that I'm creating this (maybe big) ol' list full
> of junk, just to have it deleted.
>
> Anyone else think this is a use-case worth supporting better? Or should
> I just get over it -- it's really not that expensive to create a list,
> after all.
>
You could use a generator expression with a function that discards the 
results:

def every(iterable):
     for _ in iterable:
         pass

every(obj.a_method() for obj in a_sequence)

>
> On Thu, Sep 10, 2015 at 12:07 AM, Terry Reedy <tjreedy at udel.edu
> <mailto:tjreedy at udel.edu>> wrote:
>
>     On 9/9/2015 1:10 PM, Stephan Sahm wrote:
>
>         I found a BUG in the standard while statement, which appears both in
>         python 2.7 and python 3.4 on my system.
>
>
>     No you did not, but aside from that: python-ideas is for ideas about
>     future versions of python, not for bug reports, valid or otherwise.
>     You should have sent this to python-list, which is a place to report
>     possible bugs.
>


From abarnert at yahoo.com  Fri Sep 11 00:22:41 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 15:22:41 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F1B306.5070705@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
Message-ID: <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>

On Sep 10, 2015, at 09:42, Sven R. Kunze <srkunze at mail.de> wrote:
> 
> I mean when I am really going to touch that file to improve documentation (which annotations are a piece of), I am going to add more information for the reader of my API and that mostly will be describing the behavior of the API.

As a bit of useless anecdotal evidence:

After starting to play with MyPy when Guido first announced the idea, I haven't actually started using static type checking seriously, but I have started writing annotations for some of my functions. It feels like a concise and natural way to say "this function wants two integers", and it reads as well as it writes. Of course there's no reason I couldn't have been doing this since 3.0, but I wasn't, and now I am.

Try playing around with it and see if you get the same feeling. Since everyone is thinking about the random module right now, and it makes a great example of what I'm talking about, specify which functions take/return int vs. float, which need a real int vs. anything Integral, etc., and how much more easily you absorb the information than if it's in the middle of a sentence in the docstring.
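
A small sketch of what that can look like. The functions roll, pick_index
and jitter are hypothetical wrappers written for this example, not part of
the random module:

```python
import random
from numbers import Integral

def roll(sides: int) -> int:
    """Needs a real int and returns an int in 1..sides."""
    return random.randrange(1, sides + 1)

def pick_index(n: Integral) -> int:
    """Accepts anything Integral (e.g. a numpy integer), returns a plain int."""
    return random.randrange(int(n))

def jitter(value: float, spread: float) -> float:
    """Takes floats and returns a float near `value`."""
    return value + random.uniform(-spread, spread)
```

The signatures alone carry the int vs. float vs. Integral information that
would otherwise sit in the middle of a docstring sentence.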

Anyway, I don't actually annotate every function (or every function except the ones that are so simple that any checker or reader that couldn't infer the types is useless, the way I would in Haskell), just the ones where the types seem like an important part of the semantics. So I haven't missed the more complex features the way I expected to. But I've still got no problem with them being added as we go along, of course. :)

From abarnert at yahoo.com  Fri Sep 11 00:27:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 15:27:05 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <9683c40d-b662-4b77-947e-62c418be8468@googlegroups.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <9683c40d-b662-4b77-947e-62c418be8468@googlegroups.com>
Message-ID: <959BCAB1-9A4E-4147-80C8-BD113E4A5319@yahoo.com>

On Sep 10, 2015, at 11:57, Matthias Kramm via Python-ideas <python-ideas at python.org> wrote:
> 
>> On Wednesday, September 9, 2015 at 1:19:12 PM UTC-7, Guido van Rossum wrote:
>> Jukka wrote up a proposal for structural subtyping. It's pretty good. Please discuss.
>> 
>> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
> 
> I like this proposal; given Python's flat nominal type hierarchy, it will be useful to have a parallel subtyping mechanism to give things finer granularity without having to resort to ABCs.

I don't understand this, given that resorting to protocols is basically the same thing as resorting to ABCs.

Clearly there's some perceived difficulty or complexity of ABCs within the Python community that makes people not realize how simple and useful they are. But I don't see how adding something that's nearly equivalent but different and maintaining the two in parallel is a good solution to that problem.

There are some cases where the fact that ABCs rely on a metaclass makes them problematic where Protocols aren't (basically, where you need another metaclass), but I doubt that's the case you're worried about.
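
To illustrate how close ABCs already are to structural checks, here is a
minimal sketch using two existing ABCs from collections.abc. The Bag class
is made up for the example; it neither inherits from nor registers with
any ABC:

```python
from collections.abc import Iterable, Sized

class Bag:
    """Plain class: no ABC base class and no register() call."""
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

bag = Bag("abc")
# Sized and Iterable define __subclasshook__, so having the right
# method is enough for isinstance() to succeed.
is_sized = isinstance(bag, Sized)
is_iterable = isinstance(bag, Iterable)
```

This only works for ABCs that define such a hook; it is not a general
property of every ABC.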

From abarnert at yahoo.com  Fri Sep 11 00:34:29 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 15:34:29 -0700
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <msra7n$h9t$1@ger.gmane.org>
 <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>
Message-ID: <C089096A-C13F-4925-9887-0F143613F2B4@yahoo.com>

On Sep 10, 2015, at 12:04, Chris Barker <chris.barker at noaa.gov> wrote:
> 
> However, I've found myself wanting a "make nothing comprehension". For some reason, I find myself frequently following a pattern where I want to call the same method on all the objects in a sequence:
> 
> for obj in a_sequence:
>     obj.a_method()
> 
> but I like the compactness of comprehensions, so I do:
> 
> [obj.a_method() for obj in a_sequence]

I think this is an anti-pattern. The point of a comprehension is that it's an expression, which gathers up results. You're trying to hide side effects inside an expression, which is a bad thing to do, and lamenting the fact that you get a useless value back, which of course you do because expressions have values, so that should be a sign that you don't actually want an expression here.

Also, compare the actual brevity here:

    [obj.a_method() for obj in a_sequence]
    for obj in a_sequence: obj.a_method()

You've replaced a colon with a pair of brackets, so it's actually less concise.

If you really want to do this anyway, you can use the consume recipe from the itertools docs or the more-itertools library or write your own one-liner:

    consume = partial(deque, maxlen=0)
    consume(obj.a_method() for obj in a_sequence)

At least this makes it explicit that you're creating and ignoring a bunch of values. But I still think it's much clearer to just use a for statement.
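
For completeness, a self-contained version of the one-liner above with the
imports it needs. The Widget class and its refresh method are placeholders
invented for the example:

```python
from collections import deque
from functools import partial

# A deque with maxlen=0 pulls every item from the iterable and keeps none.
consume = partial(deque, maxlen=0)

class Widget:
    def __init__(self):
        self.refreshed = False

    def refresh(self):
        self.refreshed = True

widgets = [Widget() for _ in range(3)]
consume(w.refresh() for w in widgets)   # runs refresh() on each, stores nothing
```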


From storchaka at gmail.com  Fri Sep 11 00:39:59 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 11 Sep 2015 01:39:59 +0300
Subject: [Python-ideas] Round division
In-Reply-To: <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
Message-ID: <mst0rv$2r1$1@ger.gmane.org>

On 11.09.15 00:48, Paul Moore wrote:
> On 10 September 2015 at 22:13, Mark Young <marky1991 at gmail.com> wrote:
>> Pardon my ignorance, but what is the definition of round division? (if it
>> isn't "round(a/b)")
>
> I assumed it would be "what round(a/b) would give if it weren't
> subject to weird floating point rounding issues". To put it another
> way, if a / b is d remainder r, then I'd assume "round division" would
> be d if r < b/2, d+1 if r > b/2, and (which of d, d+1?) if r == b/2.
> (a, b, d and r are all integers).
>
> If not, then I also would like to know what it means...

Yes, it is what you have described. If r == b/2, the result is even 
(i.e. (d+1)//2*2).
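
A possible pure-Python sketch of that definition, assuming integer
arguments; round_div is a placeholder name, not a settled API. It compares
2*r against b rather than r against b/2 so that everything stays in
integers:

```python
def round_div(a, b):
    """Integer division rounded to nearest, ties going to the even result."""
    d, r = divmod(a, b)          # a == d*b + r, and r has the sign of b
    twice_r, abs_b = 2 * abs(r), abs(b)
    if twice_r < abs_b:          # fractional part below one half
        return d
    if twice_r > abs_b:          # fractional part above one half
        return d + 1
    return (d + 1) // 2 * 2      # exact tie: pick the even one of d, d+1
```

Because divmod floors, the fractional part r/b is never negative, so the
same three cases cover negative a or b as well.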

> Either way, if it is introduced then it should be documented
> (particularly as regards what happens when one or both of a, b are
> negative) clearly, as it's not 100% obvious.
>
> Also, is the math module the right place? All of the operations in the
> math module (apart from factorial, for some reason...) are floating
> point.

It is the best place in the stdlib. Apart from floating point functions, 
the math module contains integer functions (factorial and gcd) and 
general number functions (floor, ceil, trunc and isclose). gcd and 
isclose are new in 3.5.



From tritium-list at sdamon.com  Fri Sep 11 00:47:56 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Thu, 10 Sep 2015 18:47:56 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f16c39.8659361.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <etPan.55f0dd28.7b8c5e71.31bc@Draupnir.home>
 <CA+=+wqAbx1NRDSSkoXofC2R+2PN_x=TPzraH_smWUzm_uQPLYw@mail.gmail.com>
 <55F14B70.2080901@sdamon.com> <etPan.55f16c39.8659361.31bc@Draupnir.home>
Message-ID: <55F2089C.4020909@sdamon.com>



On 9/10/2015 07:40, Donald Stufft wrote:

> What harm is there in making people explicitly choose between deterministic
> randomness and secure randomness? Is your use case so much better than theirs
> that you think you deserve to type a few characters less to the detriment of
> people who don't know any better?
>
>
API Breakage.  This is not worth the break in backwards compatibility.  
My use case is using the API that has been available for... 20 years?  
And for what benefit?  None, and it can be argued that it would do the 
opposite of what is intended (false sense of security and all).

From abarnert at yahoo.com  Fri Sep 11 00:46:36 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 15:46:36 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
Message-ID: <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>

On Sep 10, 2015, at 07:21, Donald Stufft <donald at stufft.io> wrote:
> 
> Either we can change the default to a secure
> CSPRNG and break these functions (and the people using them) which is however
> easily fixed by changing ``import random`` to
> ``import random; random = random.DeterministicRandom()``

But that isn't a fix, unless all your code is in a single module. If I call random.seed in game.py and then call random.choice in aiplayer.py, I'll get different results after your fix than I did before.

What I'd need to do instead is create a separate myrandom.py that does this and then exports all of the bound methods of random as top-level functions, and then make game.py, aiplayer.py, etc. all import myrandom as random. Which is, while not exactly hard, certainly harder, and much less obvious, than the incorrect fix that you've suggested, and it may not be immediately obvious that it's wrong until someone files a bug three versions later claiming that when he reloads a game the AI cheats and you have to track through the problem.

That's why I suggested the set_default_instance function, which makes this problem trivial to solve in a correct way instead of in an incorrect way.
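
A sketch of the difference, using random.Random as a stand-in for the
proposed DeterministicRandom (which does not exist today). The variable
names mimic the game.py and aiplayer.py modules mentioned above:

```python
import random

# Today: every module shares the hidden module-level generator.
random.seed(42)                                  # what game.py does
before = [random.random() for _ in range(3)]     # what aiplayer.py sees

# Per-module fix: each module rebinds `random` to its own instance.
game_random = random.Random()    # stand-in for game.py's private instance
ai_random = random.Random()      # stand-in for aiplayer.py's private instance
game_random.seed(42)             # seeds game.py's generator only
per_module = [ai_random.random() for _ in range(3)]

# Shared fix: one instance that both modules import (the myrandom.py idea).
shared = random.Random()
shared.seed(42)
shared_draws = [shared.random() for _ in range(3)]
```

Only the shared instance reproduces the old sequence; the per-module
version silently draws from a generator that was never seeded with 42.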

From abarnert at yahoo.com  Fri Sep 11 00:54:36 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 15:54:36 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
Message-ID: <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>

On Sep 10, 2015, at 15:46, Andrew Barnert via Python-ideas <python-ideas at python.org> wrote:
> 
>> On Sep 10, 2015, at 07:21, Donald Stufft <donald at stufft.io> wrote:
>> 
>> Either we can change the default to a secure
>> CSPRNG and break these functions (and the people using them) which is however
>> easily fixed by changing ``import random`` to
>> ``import random; random = random.DeterministicRandom()``
> 
> But that isn't a fix, unless all your code is in a single module. If I call random.seed in game.py and then call random.choice in aiplayer.py, I'll get different results after your fix than I did before.
> 
> What I'd need to do instead is create a separate myrandom.py that does this and then exports all of the bound methods of random as top-level functions, and then make game.py, aiplayer.py, etc. all import myrandom as random. Which is, while not exactly hard, certainly harder, and much less obvious, than the incorrect fix that you've suggested, and it may not be immediately obvious that it's wrong until someone files a bug three versions later claiming that when he reloads a game the AI cheats and you have to track through the problem.
> 
> That's why I suggested the set_default_instance function, which makes this problem trivial to solve in a correct way instead of in an incorrect way.

Actually, I just thought of an even simpler solution:

Add a deterministic_singleton member to random (which is just initialized to DeterministicRandom() at startup). Now, the user fix is just to change "import random" to "from random import deterministic_singleton as random".


From chris.barker at noaa.gov  Fri Sep 11 01:05:43 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 10 Sep 2015 16:05:43 -0700
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <55F20057.2070103@mrabarnett.plus.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <msra7n$h9t$1@ger.gmane.org>
 <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>
 <55F20057.2070103@mrabarnett.plus.com>
Message-ID: <CALGmxEK9s2+2ULjknGPsUFFWdt9Jo1O3j5ccTsgKZ1_D3ZZ7Xw@mail.gmail.com>

On Thu, Sep 10, 2015 at 3:12 PM, MRAB <python at mrabarnett.plus.com> wrote:

> You could use a generator expression with a function that discards the
> results:
>
> def every(iterable):
>     for _ in iterable:
>         pass
>
> every(obj.a_method() for obj in a_sequence)
>

sure -- though this adds a new function that people reading my code need to
grok.

Andrew Barnert wrote:

> > [obj.a_method() for obj in a_sequence]


> I think this is an anti-pattern. The point of a comprehension is that it's
> an expression, which gathers up results. You're trying to hide side effects
> inside an expression, which is a bad thing to do, and lamenting the fact
> that you get a useless value back, which of course you do because
> expressions have values, so that should be a sign that you don't actually
> want an expression here.


Exactly -- I don't want a comprehension, I don't want an expression, I want
a concise way to spell : do this thing to all of these things....

> Also, compare the actual brevity here:
>     [obj.a_method() for obj in a_sequence]
>     for obj in a_sequence: obj.a_method()
> You've replaced a colon with a pair of brackets, so it's actually less
> concise.


Fair enough -- removing a newline does make that pretty simple looking!

I guess I got all comprehension-happy there -- back when it was the shiny
new toy, and then I got stuck on it.

-CHB



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Fri Sep 11 01:08:59 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 10 Sep 2015 16:08:59 -0700
Subject: [Python-ideas] Round division
In-Reply-To: <mst0rv$2r1$1@ger.gmane.org>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org>
Message-ID: <CALGmxEJx7FTRb1XhnGY90BjB5BKW80kte65WYC205z0rJw8unA@mail.gmail.com>

On Thu, Sep 10, 2015 at 3:39 PM, Serhiy Storchaka <storchaka at gmail.com>
wrote:

> On 11.09.15 00:48, Paul Moore wrote:
>
>> On 10 September 2015 at 22:13, Mark Young <marky1991 at gmail.com> wrote:
>>
>>> Pardon my ignorance, but what is the definition of round division? (if it
>>> isn't "round(a/b)")
>>>
>>
>> I assumed it would be "what round(a/b) would give if it weren't
>> subject to weird floating point rounding issues". To put it another
>> way, if a / b is d remainder r, then I'd assume "round division" would
>> be d if r < b/2, d+1 if r > b/2, and (which of d, d+1?) if r == b/2.
>> (a, b, d and r are all integers).
>>
>> If not, then I also would like to know what it means...
>>
>
> Yes, it is what you have described. If r == b/2, the result is even (i.e.
> (d+1)//2*2).
>
> Either way, if it is introduced then it should be documented
>> (particularly as regards what happens when one or both of a, b are
>> negative) clearly, as it's not 100% obvious.
>>
>> Also, is the math module the right place? All of the operations in the
>> math module (apart from factorial, for some reason...) are floating
>> point.
>>
>
> It is the best place in the stdlib. Apart from floating point functions,
> the math module contains integer functions (factorial and gcd) and general
> number functions (floor, ceil, trunc and isclose). gcd and isclose are new
> in 3.5.


well, floor, ceil, and isclose are all about floats...

Nevertheless, yes the math module is the place for it.

-CHB





-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From stephen at xemacs.org  Fri Sep 11 04:07:22 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 11 Sep 2015 11:07:22 +0900
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <55F1B219.1000502@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
Message-ID: <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>

Executive summary:

The question is, "what value is there in changing the default to be
crypto strong to protect future security-sensitive applications from
naive implementers vs. the costs to current users who need to rewrite
their applications to explicitly invoke the current default?"

M.-A. Lemburg writes:

 > I'm pretty sure people doing crypto will know and most others
 > simply don't care :-)

Which is why botnets have millions of nodes.  People who do web
security evidently believe that inappropriate RNGs have something to
do with widespread security issues.  (That doesn't mean they're right,
but it gives me pause for thought -- evidently, Guido thought so too!)

 > Evidence: We used a Wichmann-Hill PRNG as default in random
 > for a decade and people still got their work done.

The question is not whether people get their work done.  People work
(unless they're seriously dysfunctional), that's what people do.
Especially programmers (cf. GNU Manifesto).  The question is whether
the work of the *crackers* is made significantly easier by security
holes that are opened by inappropriate use of random.random.

I tend to agree with Steven d'A. (and others) that the answer is no:
it doesn't matter if the kind of person who leaves a key under the
third flowerpot from the left also habitually leaves the door unlocked
(especially if "I'm only gonna be gone for 5 minutes"), and I think
that's likely.  IOW, installing crypto strong RNGs as default is *not*
analogous to the changes to SSL support that were so important that
they were backported to 2.7 in a late patch release.

OTOH, why default to crypto weak if crypto strong is easily available?
You might save a few million Debian users from having to regenerate
all their SSH keys.[1]

But the people who are "just getting work done" in new programs *won't
notice*.  I don't think that they care what's under the hood of
random.random, as long as (1) the API stays the same, and (2) the
documentation clearly indicates where to find PRNGs that support
determinism, jumpahead, replicability, and all those other good
things, for the needs they don't have now but know they probably
will have some day.  The rub is, as usual, existing applications that
would have to be changed for no reason that is relevant to them.

Note that arc4random is much simpler to use than random.random.  No
knobs to tweak or seeds to store for future reference.  Seems
perfectly suited to "just getting work done" to me.  OTOH, if you have
an application where you need replicability, jumpahead, etc, you're
going to need to read the docs enough to find the APIs for seeding and
so on.  At design time, I don't see why it would hurt to select an
RNG algorithm explicitly as well.
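
For example, with what the stdlib offers today, choosing explicitly could
look roughly like this. The names token and sample_rows are invented for
the sketch; random.SystemRandom and random.Random are the existing classes:

```python
import random

# Security-sensitive use: OS-provided randomness, no seed, no replay.
secure = random.SystemRandom()
token = "".join(secure.choice("0123456789abcdef") for _ in range(32))

# Replicable use: a seeded Mersenne Twister instance, chosen on purpose.
def sample_rows(seed, n=5):
    rng = random.Random(seed)
    return [rng.randrange(1000) for _ in range(n)]
```

The second form can be replayed from the seed; the first deliberately
cannot.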

 > Why not add ssl.random() et al. (as interface to the OpenSSL
 > rand APIs) ?

I like that naming proposal.  I'm sure changing the nature of
random.random would annoy the heck out of *many* users.

An alternative would be to add random.crypto.

 > Some background on why I think deterministic RNGs are more
 > useful to have as default than non-deterministic ones:
 > 
 > A common use case for me is to write test data generators
 > for large database systems. For such generators, I don't keep
 > the many GBs data around, but instead make the generator take a
 > few parameters which then seed the RNGs, the time module and
 > a few other modules via monkey-patching.

If you've gone to that much effort, you evidently have read the docs
and it wouldn't have been a huge amount of trouble to use a
non-default module with a specified PRNG -- if you were doing it now.
But you have every right to be very peeved if you have a bunch of old
test runs you want to replicate with a new version of Python, and
we've changed the random.random RNG on you.



Footnotes: 
[1]  I hasten to add that a programmer who isn't as smart as he thinks
he is who "improves" a crypto algorithm is far more likely than that
the implementer of a crypto suite would choose an RNG that is
inappropriate by design.  Still, it's a theoretical possibility, and
security is about eliminating every theoretical possibility you can
think of.


From steve at pearwood.info  Fri Sep 11 04:39:23 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 11 Sep 2015 12:39:23 +1000
Subject: [Python-ideas] BUG in standard while statement
In-Reply-To: <55F20057.2070103@mrabarnett.plus.com>
References: <CAOs8ta2PC=6p8Eo_+F0m1WrVbgk+Y=8V8HQHotC0ObxGPvS6=g@mail.gmail.com>
 <msra7n$h9t$1@ger.gmane.org>
 <CALGmxEJjEJyi03KzmqVXUztT9bqwAoE_Dyzj3BXzrK4B6VHwog@mail.gmail.com>
 <55F20057.2070103@mrabarnett.plus.com>
Message-ID: <20150911023923.GS19373@ando.pearwood.info>

On Thu, Sep 10, 2015 at 11:12:39PM +0100, MRAB wrote:
> On 2015-09-10 20:04, Chris Barker wrote:
> >however, he did bring up a python-idea worthy general topic:
> >
> >Sometimes you want to iterate without doing anything with the results of
> >the iteration.
> >
> >So the obvious is his example -- iterate N times:
> >
> >for i in range(N):
> >     do_something
> >
> >but you may need to look into the code (probably more than one line) to
> >see if i is used for anything.

Solution is obvious:

for throw_away_variable_not_used_for_anything in range(N): ...

*wink*

Just use one of the usual conventions for throw-away variables: call it 
_ or whocares. But, why do you care so much about whether i is 
being used for something?

Today, you have:

    for whocares in range(10):
        print(message)

Next week, you decide you need to number them, now you do care 
about the loop variable:

    for whocares in range(10):
        print(whocares, message)

Having a loop variable that may remain unused is not exactly a big deal.


[Chris]
> >However, I've found myself wanting a "make nothing comprehension". For
> >some reason, I find myself frequently following a pattern where I want
> >to call the same method on all the objects in a sequence:
> >
> >for obj in a_sequence:
> >     obj.a_method()
> >
> >but I like the compactness of comprehensions, so I do:
> >
> >[obj.a_method() for obj in a_sequence]

Ew, ew, ew, ew. You're calling the method for its side-effects, not its 
return result (which is probably None, but might not be). Turning it 
into a list comp is abuse of comprehensions: you're collecting the 
return results, potentially creating an enormous list, which you 
don't actually want and immediately throw away.

Just write it as a one-liner for-loop, which is *more* compact (by 
exactly one character) than the list comp:

[obj.a_method() for obj in a_sequence]
for obj in a_sequence: obj.a_method()


[MRAB]
> You could use a generator expression with a function that discards the 
> results:
> 
> def every(iterable):
>     for _ in iterable:
>         pass
> 
> every(obj.a_method() for obj in a_sequence)

If you're going to do such a horrid thing, at least name it accurately. 
"every" sounds like a synonym for the built-in "all". A more accurate 
name would be "consume", as in consuming the iterator, and I seem to 
recall that there's a recipe in the itertools docs to do that as fast as 
possible.

But, whether you call it "every" or "consume", the code still looks 
misleading:

every(obj.a_method() for obj in a_sequence)

looks like you care about the return results of a_method, but you don't. 
List comps and generator expressions are for cases where you care about 
the expression's result.


-- 
Steve

From ncoghlan at gmail.com  Fri Sep 11 04:38:19 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 12:38:19 +1000
Subject: [Python-ideas] One way to do format and print
In-Reply-To: <55F1BF7C.9060205@mail.de>
References: <CAP1MnbvD4KwbEV77qMQPUq7pE1fTh-pkvmguwQn1DiCetoSKnw@mail.gmail.com>
 <CADiSq7fs_b+c6-iwuB+fibi=+5kNdkVQzQSZU7BbS-Loy6swGA@mail.gmail.com>
 <55ED24C4.9000205@mail.de>
 <CAPTjJmqxjZ=N7NHVE22nk3+Zyjy102AZXUCwsX4LAgy7Te2-rQ@mail.gmail.com>
 <m21teac5p7.fsf@fastmail.com>
 <B631FEA1-4665-4BC9-8D7F-C156714B3AA7@gmail.com>
 <m2fv2plshe.fsf@fastmail.com>
 <87pp1t1unb.fsf@uwakimon.sk.tsukuba.ac.jp>
 <m2egi9a62o.fsf@fastmail.com> <55EF2B66.4020509@mail.de>
 <1441741195.1614886.378114729.37307E0E@webmail.messagingengine.com>
 <6DDBD724-714E-40E1-88DF-9BC8484FF240@yahoo.com>
 <55F058B6.9000202@mail.de>
 <1DCC81C0-DE7A-460A-AD7F-E1533BB14911@yahoo.com>
 <55F0E5C9.6030509@brenbarn.net>
 <CADiSq7emmO81fwShS_rVyH867UXwVPDo0Sx=eUoJMLdE=-DQVQ@mail.gmail.com>
 <55F1BF7C.9060205@mail.de>
Message-ID: <CADiSq7cg+3n90wgPB6datEws2B_QCN3Wafxhk0dnkncw2e27Zw@mail.gmail.com>

On 11 September 2015 at 03:35, Sven R. Kunze <srkunze at mail.de> wrote:
> On 10.09.2015 17:36, Nick Coghlan wrote:
>
> This perspective doesn't grant enough credit to the significance of C
> in general, and the C ABI in particular, in the overall computing
> landscape. While a lot of folks have put a lot of work into making it
> possible to write software without needing to learn the details of
> what's happening at the machine level, it's still the case that the
> *one* language binding interface that *every* language runtime ends up
> including is being able to load and run C libraries.
>
>
> Ah, now I understand. We need to add {} to C. That'll make it, right? ;)
>
> Seriously, there are also other significant influences that fit better here:
> template engines. I know a couple of them using {} in some sense or another.
> C format strings are just one of them, so I wouldn't stress the significance
> of C that hard in that particular instance. There are other areas where C
> has its strengths.

You're tilting at windmills, Sven. Python has 3 substitution variable
syntaxes (two with builtin support), and we no longer have any plans
for getting rid of any of them. We *did* aim to deprecate
percent-substitution as part of the Python 3 migration, and after
trying for ~5 years *decided that was a bad idea*, and reversed the
original decision to classify it as deprecated. We subsequently
switched the relevant section of the docs from describing
percent-formatting as "old string formatting" to "printf-style string
formatting" in a larger revamp of the builtin sequence type
documentation a few years back:
https://hg.python.org/cpython/rev/463f52d20314

PEP 461 has now further entrenched the notion that "percent-formatting
is recommended for binary data, brace-formatting is recommended for
text data" by bringing back the former for bytes and bytearray in 3.5,
while leaving str.format as text only:
https://www.python.org/dev/peps/pep-0461/

PEP 498 then blesses brace-formatting as the "one obvious way" for
text formatting by elevating it from "builtin method" to "syntax" in
3.6.
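
As a rough side-by-side of the syntaxes discussed above (the variable
names and values are made up for illustration, and the last two lines
assume Python 3.5 and 3.6 respectively):

```python
from string import Template

name, count = "spam", 3

# 1. printf-style (percent) formatting, builtin support
percent = "%s x %d" % (name, count)

# 2. brace formatting via str.format, builtin support
brace = "{} x {:d}".format(name, count)

# 3. $-substitution via string.Template, library support only
dollar = Template("$name x $count").substitute(name=name, count=count)

# PEP 461: percent-formatting for bytes (Python 3.5+)
binary = b"%s x %d" % (b"spam", 3)

# PEP 498: brace formatting as syntax (Python 3.6+)
fstring = f"{name} x {count:d}"

print(percent, brace, dollar, binary, fstring, sep="\n")
```

All four text variants produce the same string, "spam x 3"; the bytes
variant produces b"spam x 3".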

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri Sep 11 04:48:07 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 12:48:07 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
Message-ID: <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>

On 11 September 2015 at 08:54, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> Actually, I just thought of an even simpler solution:
>
> Add a deterministic_singleton member to random (which is just initialized to DeterministicRandom() at startup). Now, the user fix is just to change "import random" to "from random import deterministic_singleton as random".

Change the spelling to "import random.seeded_random as random" and the
user fix is even shorter.

I do agree with the idea of continuing to provide a process global
instance of the current PRNG for ease of migration - changing a single
import is a good way to be able to address a deprecation, and looking
for the use of seeded_random in a security sensitive context would
still be fairly straightforward.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Fri Sep 11 05:13:04 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 11 Sep 2015 13:13:04 +1000
Subject: [Python-ideas] Round division
In-Reply-To: <mst0rv$2r1$1@ger.gmane.org>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org>
Message-ID: <20150911031304.GT19373@ando.pearwood.info>

On Fri, Sep 11, 2015 at 01:39:59AM +0300, Serhiy Storchaka wrote:
> On 11.09.15 00:48, Paul Moore wrote:
> >On 10 September 2015 at 22:13, Mark Young <marky1991 at gmail.com> wrote:
> >>Pardon my ignorance, but what is the definition of round division? (if it
> >>isn't "round(a/b)")
> >
> >I assumed it would be "what round(a/b) would give if it weren't
> >subject to weird floating point rounding issues". To put it another
> >way, if a / b is d remainder r, then I'd assume "round division" would
> >be d if r < b/2, d+1 if r > b/2, and (which of d, d+1?) if r == b/2.
> >(a, b, d and r are all integers).
> >
> >If not, then I also would like to know what it means...
> 
> Yes, it is what you have described. If r == b/2, the result is even 
> (i.e. (d+1)//2*2).

How does this differ from round(a/b)? round() also rounds to even.

Perhaps a more general solution would be a round-to-direction, or 
divide-and-round-to-direction. Now that we have Enums, we could define 
enums for round-to-zero, round-to-nearest, round-to-infinity, 
round-to-even, and have a function divide(a, b, dir=ROUNDEVEN), say.
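
A minimal sketch of what such a function might look like, using only 
integer arithmetic. The enum member names and the parameter name 
"direction" are placeholders chosen for this example, not a proposed 
API, and only four of the possible rounding modes are shown.

```python
from enum import Enum


class Rounding(Enum):
    """Hypothetical rounding directions for integer division."""
    TO_ZERO = "zero"
    TO_NEG_INFINITY = "floor"
    TO_POS_INFINITY = "ceiling"
    TO_NEAREST_EVEN = "even"


def divide(a, b, direction=Rounding.TO_NEAREST_EVEN):
    """Divide integer a by integer b, rounding in the given direction."""
    # d is the floor of a/b; r has the sign of b (or is zero).
    d, r = divmod(a, b)
    if r == 0 or direction is Rounding.TO_NEG_INFINITY:
        return d
    if direction is Rounding.TO_POS_INFINITY:
        return d + 1
    if direction is Rounding.TO_ZERO:
        # For a negative quotient, truncation is one above the floor.
        return d + 1 if (a < 0) != (b < 0) else d
    # Nearest, ties to even: compare 2*|r| with |b|, never r with b/2.
    twice = 2 * abs(r)
    if twice > abs(b) or (twice == abs(b) and d % 2 == 1):
        return d + 1
    return d


print(divide(7, 2))                            # 4 (3.5 rounds to even)
print(divide(5, 2))                            # 2 (2.5 rounds to even)
print(divide(-7, 2, Rounding.TO_ZERO))         # -3
print(divide(-7, 2, Rounding.TO_NEG_INFINITY)) # -4
```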


-- 
Steve

From abarnert at yahoo.com  Fri Sep 11 05:18:45 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 20:18:45 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
Message-ID: <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>

On Sep 10, 2015, at 19:48, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> On 11 September 2015 at 08:54, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
>> Actually, I just thought of an even simpler solution:
>> 
>> Add a deterministic_singleton member to random (which is just initialized to DeterministicRandom() at startup). Now, the user fix is just to change "import random" to "from random import deterministic_singleton as random".
> 
> Change the spelling to "import random.seeded_random as random" and the
> user fix is even shorter.

OK, sure; I don't care much about the spelling. I think neither name will be unduly confusing to novices, and anyone who actually wants to understand what the choice means will use help or the docs or a Google search and find out in a few seconds.

> I do agree with the idea of continuing to provide a process global
> instance of the current PRNG for ease of migration - changing a single
> import is a good way to be able to address a deprecation, and looking
> for the use of seeded_random in a security sensitive context would
> still be fairly straightforward.

Personally, I think we're done with that change.  Deprecation of the names random.Random, random.random(), etc. is sufficient to prevent people from making mistakes without realizing it. Having a good workaround to prevent code churn for the thousands of affected apps means the cost doesn't outweigh the benefits. So, the problem Theo raised is solved.[1] Which means the more radical solution he offered is unnecessary. Unless we're seriously worried that some people who aren't sure if they need Seeded or System may incorrectly choose Seeded just because of performance, there's no need to add a Chacha choice alongside them. Put it on PyPI, maybe with a link from the SystemRandom docs, and see how things go from there.

[1] Well, it's not quite solved, because someone has to figure out how to organize things in the docs, which obviously need to change. Do we tell people how to choose between creating a SeededRandom or SystemRandom instance, then describe their interface, and then include a brief note "... but for porting old code, or when you explicitly need a globally shared Seeded instance, use seeded_random"? Or do we present all three as equally valid choices, and try to explain why you might want the singleton seeded_random vs. constructing and managing an instance or instances?

From abarnert at yahoo.com  Fri Sep 11 05:25:52 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 10 Sep 2015 20:25:52 -0700
Subject: [Python-ideas] Round division
In-Reply-To: <20150911031304.GT19373@ando.pearwood.info>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org> <20150911031304.GT19373@ando.pearwood.info>
Message-ID: <999FEFC7-47CF-4651-9613-8A6B94C24A8C@yahoo.com>

On Sep 10, 2015, at 20:13, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Fri, Sep 11, 2015 at 01:39:59AM +0300, Serhiy Storchaka wrote:
>>> On 11.09.15 00:48, Paul Moore wrote:
>>>> On 10 September 2015 at 22:13, Mark Young <marky1991 at gmail.com> wrote:
>>>> Pardon my ignorance, but what is the definition of round division? (if it
>>>> isn't "round(a/b)")
>>> 
>>> I assumed it would be "what round(a/b) would give if it weren't
>>> subject to weird floating point rounding issues". To put it another
>>> way, if a / b is d remainder r, then I'd assume "round division" would
>>> be d if r < b/2, d+1 if r > b/2, and (which of d, d+1?) if r == b/2.
>>> (a, b, d and r are all integers).
>>> 
>>> If not, then I also would like to know what it means...
>> 
>> Yes, it is what you have described. If r == b/2, the result is even 
>> (i.e. (d+1)//2*2).
> 
> How does this differ from round(a/b)? round() also rounds to even.

His rounds based on the exact integer remainder; yours rounds based on the inexact float fractional part. So, if b is large enough, using round division is guaranteed to do the right thing,[1] but rounding float division may have rounding, overflow, or underflow errors.

[1] Except I'm pretty sure he wants to compare r*2 to b, not r to b/2. Otherwise he's reintroduced the problem he's trying to solve.
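
To make the difference concrete, here is a small sketch (the name round_div is made up, and b is assumed to be positive) that compares 2*r with b in integer arithmetic, next to a case where going through a float loses the answer:

```python
def round_div(a, b):
    """Round a/b to the nearest integer, ties to even (assumes b > 0)."""
    d, r = divmod(a, b)
    # Compare 2*r with b rather than r with b/2, so no float is created.
    if 2 * r > b or (2 * r == b and d % 2 == 1):
        return d + 1
    return d


a, b = 2**53 + 1, 1
exact = round_div(a, b)     # integer arithmetic keeps every digit
via_float = round(a / b)    # a / b is a float, which cannot hold 2**53 + 1

print(exact)       # 9007199254740993
print(via_float)   # 9007199254740992
```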

From ncoghlan at gmail.com  Fri Sep 11 05:33:59 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 13:33:59 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CADiSq7f2RV4X6Sh5imKg5RkBsfmsBUDBHdwCeG6KjpJMge27Cw@mail.gmail.com>

On 11 September 2015 at 12:07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Executive summary:
>
> The question is, "what value is there in changing the default to be
> crypto strong to protect future security-sensitive applications from
> naive implementers vs. the costs to current users who need to rewrite
> their applications to explicitly invoke the current default?"
>
> M.-A. Lemburg writes:
>
>  > I'm pretty sure people doing crypto will know and most others
>  > simply don't care :-)
>
> Which is why botnets have millions of nodes.  People who do web
> security evidently believe that inappropriate RNGs have something to
> do with widespread security issues.  (That doesn't mean they're right,
> but it gives me pause for thought -- evidently, Guido thought so too!)

They're right. I used to be sanguine about this kind of thing because
I spent a long time working in the defence sector, and assumed
everyone else was as professionally paranoid as we were. I've been out
of that world long enough now to realise that that assumption was
deeply, and problematically, wrong*.

In that world, you work on the following assumptions: 1) you're an
interesting target; 2) the attackers' compute capacity is nigh
infinite; 3) any weakness will be found; 4) any weakness will be
exploited; 5) "other weaknesses exist" isn't a reason to avoid
addressing the weaknesses you know about.

As useful background, there's a recent Ars Technica article on the
technical details of cracking the passwords in the Ashley Madison data
dump, where security researchers found a NINE order of magnitude
speedup due to a vulnerability in another part of the system which let
them drastically reduce the search space for passwords:
http://arstechnica.com/security/2015/09/once-seen-as-bulletproof-11-million-ashley-madison-passwords-already-cracked/

That kind of reduction in search requirements means that searches that
*should* have taken almost 3000 years (in the absence of the
vulnerability) can instead be completed within a day.

Weak random number generators have a similar effect of reducing the
search space for attackers - if you know a weakly random source was
used, rather than a cryptographically secure one, then you can use
what you know about the random number generator to favour inputs it is
*likely* to have produced, rather than having to assume equal
probability for the entire search space. And if the target was using a
deterministic RNG and you're able to figure out the seed that was
used? You no longer need to search at all - you can recreate the exact
series of numbers the target was using.
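
A small illustration of that last point, using the current random
module (the seed value is made up and stands in for one an attacker has
guessed or recovered):

```python
import random

seed = 12345  # hypothetical seed known to both sides

# The target generates a "secret" sequence from a seeded PRNG.
target = random.Random(seed)
token = [target.randrange(10) for _ in range(8)]

# Anyone with the same seed and algorithm can recreate it exactly.
attacker = random.Random(seed)
replay = [attacker.randrange(10) for _ in range(8)]

print(token == replay)  # True

# SystemRandom draws from the OS instead and has no seed to recover.
secure = random.SystemRandom()
secure_token = [secure.randrange(10) for _ in range(8)]
```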

Moving the default random source to a CSPRNG, and allowing folks to
move to a faster deterministic PRNG for known non-security related use
cases, or to the system random number generator for known
security-related ones is likely to prove a good way to provide safer
defaults without reducing flexibility or raising barriers to entry too
much.

Regards,
Nick.

P.S. * As a case in point, it was only a couple of years ago that I
realised most developers *haven't* read docs like the NIST crypto
usage guidelines or the IEEE 802.11i WPA2 spec, and don't make a habit
of even casually following the progress of block cipher and secure
hash function design competitions. It's been an interesting exercise
for me in learning the true meaning of "expertise is relative" :)


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri Sep 11 05:38:06 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Sep 2015 13:38:06 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
Message-ID: <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>

On 11 September 2015 at 13:18, Andrew Barnert <abarnert at yahoo.com> wrote:
> Personally, I think we're done with that change.  Deprecation of the names random.Random, random.random(), etc. is sufficient to prevent people from making mistakes without realizing it.

Implementing dice rolling or number guessing for a game as "from
random import randint" is *not* a mistake, and I'm adamantly opposed
to any proposal that makes it one - the cost imposed on educational
use cases would be far too high.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Fri Sep 11 06:44:30 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 11 Sep 2015 13:44:30 +0900
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
Message-ID: <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:

 > Implementing dice rolling or number guessing for a game as "from
 > random import randint" is *not* a mistake,

Turning the number guessing game into a text CAPTCHA might be one,
though.  That randint may as well be crypto strong, modulo the problem
that people who use an explicit seed get punished for knowing what
they're doing.

I suppose it would be too magic to have the seed method substitute the
traditional PRNG for the default, while an implicitly seeded RNG
defaults to a crypto strong algorithm?

Steve

From kramm at google.com  Thu Sep 10 20:20:38 2015
From: kramm at google.com (Matthias Kramm)
Date: Thu, 10 Sep 2015 11:20:38 -0700 (PDT)
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
Message-ID: <f7d442f2-110c-45c1-ab22-aca29e83adde@googlegroups.com>

I like this proposal; given Python's flat nominal type hierarchy, it will 
be useful to have a parallel subtyping mechanism to give things finer 
granularity without having to resort to ABCs.

Are the return types of methods invariant or variant under this proposal?

I.e. if I have

  class A(Protocol):
    def f(self) -> int: ...

does

  class B:
    def f(self) -> bool:
      return True

implicitly implement the protocol?

Also, marking Protocols using subclassing seems confusing and error-prone.
In your examples above, one would think that you could define a new 
protocol using

class SizedAndClosable(Sized):
    pass

instead of

class SizedAndClosable(Sized, Protocol):
    pass

because Sized is already a protocol.

Maybe the below would be a more intuitive syntax:

  @protocol
  class SizedAndClosable(Sized):
      pass

Furthermore, I strongly agree with #7. Typed, but optional, attributes are 
a bad idea.


From rosuav at gmail.com  Fri Sep 11 06:54:30 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 11 Sep 2015 14:54:30 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>

On Fri, Sep 11, 2015 at 2:44 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> I suppose it would be too magic to have the seed method substitute the
> traditional PRNG for the default, while an implicitly seeded RNG
> defaults to a crypto strong algorithm?

Ooh. Actually, I rather like that idea. If you don't seed the RNG, its
output will be unpredictable; it doesn't matter whether it's a PRNG
seeded by an unknown number, a PRNG seeded by /dev/urandom, a CSRNG,
or just reading from /dev/urandom every time. Until you explicitly
request determinism, you don't have it. If Python changes its RNG
algorithm and you haven't been seeding it, would you even know? Could
it ever matter to you?

It would require a bit of an internals change; is it possible that
code depends on random.seed and random.randint being bound methods of
the same object? To implement what you describe, they'd probably have
to not be.
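
A very rough sketch of the shape this could take, purely as an
assumption about one possible design (the class name SwitchingRandom is
invented here, and SystemRandom stands in for "a crypto strong
algorithm"):

```python
import random


class SwitchingRandom:
    """Hypothetical facade: OS randomness until seed() is given a value."""

    def __init__(self):
        self._impl = random.SystemRandom()

    def seed(self, a=None):
        # An explicit seed is a request for determinism, so swap in the
        # Mersenne Twister; seed() with no argument returns to the OS source.
        self._impl = random.SystemRandom() if a is None else random.Random(a)

    def randint(self, lo, hi):
        # Delegation happens on every call, so a swap takes effect at once.
        return self._impl.randint(lo, hi)


rng = SwitchingRandom()
unseeded = rng.randint(1, 6)   # unpredictable, never seeded

rng.seed(42)
first = [rng.randint(1, 6) for _ in range(5)]
rng.seed(42)
second = [rng.randint(1, 6) for _ in range(5)]

print(first == second)  # True: determinism only after asking for it
```

Here seed and randint are methods of the facade rather than bound
methods of a single generator instance, which is the internals change in
question.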

ChrisA

From jlehtosalo at gmail.com  Fri Sep 11 07:01:38 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Thu, 10 Sep 2015 22:01:38 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <AAC7EB24-B032-42DB-949B-DE88676ACA50@yahoo.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <7DC7EA44-0CD8-4F61-8462-8147B8BB8059@yahoo.com>
 <CAA_f+LwkgQLMk3BNKbWKQ-FxQJ1kgv5JEZZTU=1B=_Qi7RVgew@mail.gmail.com>
 <AAC7EB24-B032-42DB-949B-DE88676ACA50@yahoo.com>
Message-ID: <CAA_f+LxU3hSoBiW4c5KyEoNUsLFt=kUDzw2srReWq-LLKb1gGw@mail.gmail.com>

On Wed, Sep 9, 2015 at 10:48 PM, Andrew Barnert <abarnert at yahoo.com> wrote:

> On Sep 9, 2015, at 21:34, Jukka Lehtosalo <jlehtosalo at gmail.com> wrote:
>
> I'm not sure if I fully understand what you mean by implicit vs. explicit
> ABCs (and the static/runtime distinction). Could you define these terms and
> maybe give some examples of each?
>
>
> I just gave examples just one paragraph above.
>
> A (runtime) implicit ABC is something that uses a __subclasshook__
> (usually implementing a structural check). So, for instance, any type that
> implements __iter__ is-a Iterable, e.g., according to isinstance or
> issubclass or @singledispatch, because that's what
> Iterable.__subclasshook__ checks for.
>
> A (runtime) explicit ABC is something that isn't implicit, like Sequence:
> no hook, so nothing is-a Sequence unless it either inherits the ABC or
> registers with it.
>
> You're proposing a parallel but separate distinction at static typing
> time. Any ABC that's a Protocol is checked based on a structural check;
> otherwise, it's checked based on inheritance.
>

In my proposal I actually suggest that protocols shouldn't support
isinstance or issubclass (these operations should raise an exception) by
default. A protocol is free to override the default exception-raising
__subclasshook__ to implement a structural check, and a static type checker
would allow isinstance and issubclass for protocols that do this. I'll need
to explain this idea in more detail, as clearly the current explanation is
too easy to misunderstand.

Here's a concrete example:

class X(Protocol):
    def f(self): ...

class A:
    def f(self): print('f')

# Raises an exception, because there is no __subclasshook__ override in X
if isinstance(A(), X): ...

Previously I toyed with the idea of having a default implementation of
__subclasshook__ that actually does a structural check, but I'm no longer
sure if that would be desirable, as it's difficult to come up with an
implementation that does the right thing in all reasonable cases. For
example, consider a structural type like this that people might want to use
to work around the current limitations of Callable (it doesn't support
keyword arguments, for example):

class MyCallable(Protocol):
    def __call__(self, x, y): ...

(This example has some other potential issues that I'm hand-waving away for
now.)

Now how would the default isinstance work? Preferably it should only accept
callables that are compatible with the signature, but doing that check is
pretty difficult for arbitrary functions and should probably be out of
scope for the typing module. Just checking whether __call__ exists would be
too general, as the programmer probably expects that he's able to call the
method with the specific arguments the type suggests. Also, sometimes
checking the argument names would be a good thing to do, but sometimes any
names (as long as the number of arguments is compatible) would be fine.
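
To illustrate why no single default seems right, here is a sketch of one
possible runtime check built on inspect.signature. The helper name
accepts_call and the sample functions are invented for this example; it
is not part of the proposal.

```python
import inspect


def accepts_call(obj, *args, **kwargs):
    """Return True if obj is callable and can bind the given arguments."""
    if not callable(obj):
        return False
    try:
        sig = inspect.signature(obj)
    except (TypeError, ValueError):
        # No introspectable signature (e.g. some builtins); a real check
        # would have to decide what to do here. This sketch rejects.
        return False
    try:
        sig.bind(*args, **kwargs)
    except TypeError:
        return False
    return True


def two(x, y): ...
def renamed(a, b): ...
def one(x): ...


print(accepts_call(two, 1, 2))          # True
print(accepts_call(renamed, 1, 2))      # True: names ignored positionally
print(accepts_call(renamed, x=1, y=2))  # False: names matter for keywords
print(accepts_call(one, 1, 2))          # False: wrong number of arguments
```

Whether "renamed" should count as compatible depends on how the caller
intends to pass the arguments, which the check cannot know in general.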


> This means it's now possible to create supertypes that are implicit at
> runtime but explicit at static typing time (which might occasionally be
> useful), or vice-versa (which I can't imagine why you'd ever want).
>

As I showed above, you wouldn't get the latter unless you really try very
hard (consenting adults and all).


>
> Besides the obvious negatives in having two not-quite-compatible and
> very-different-looking ways of expressing the same concept, this is going
> to lead to people wanting to know why their type checker is complaining
> about perfectly good code ("I tested that constant with isinstance, and it
> really is-a Spammable, and the type checker is inferring its type properly,
> and yet I get an error passing it to a function that wants a Spammable") or
> allowing blatantly invalid code ("I annotated my function to only take
> Spammable arguments, but someone is passing something that calls the
> fallback implementation of my singledispatch function instead of the
> Spammable overload").
>

I agree that having the default nominal/explicit isinstance semantics for a
protocol type would be a very bad idea.


>
> Maybe the solution is to expand your proposal a little: make Protocol
> automatically create a __subclasshook__ (which you listed as an optional
> idea in the proposal), and also change all of the existing stdlib implicit
> ABCs to Protocols and scrap their manual hooks, and also update the
> relevant documentation (e.g., the abc module and the data model section on
> __subclasshook__) to recommend using Protocol instead of implementing a
> manual hook if the only thing you want is structural subtyping. Of course
> the backward compatibility isn't perfect (unless you want to manually munge
> up collections.abc when typing is imported), and people using legacy
> third-party code might need to add stubs (although that seems necessary
> anyway). But for most people, everything should just work as people expect.
> A type is either structurally typed or explicitly (via inheritance or
> registration) typed, both at static typing time and at runtime, and that's
> always expressed by the name Protocol. (But for the rare cases when you
> really need a type check that's looser at runtime, you can still write a
> manual hook to handle that.)
>
>
Yeah, this would be nice, but as I argued above, implementing a generic
__subclasshook__ is actually quite tricky.

Jukka

From tim.peters at gmail.com  Fri Sep 11 07:24:23 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 11 Sep 2015 00:24:23 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>

[M.-A. Lemburg]
>> I'm pretty sure people doing crypto will know and most others
>> simply don't care :-)

[Stephen J. Turnbull <stephen at xemacs.org>]
> Which is why botnets have millions of nodes.

I'm not a security wonk, but I'll bet a life's salary ;-) we'd have
botnets just as pervasive if every non-crypto RNG in the world were
banned - or had never existed.

To start a botnet, the key skill is social engineering:  tricking
ordinary users into installing malicious software.  So long as end
users are allowed to run programs, that problem will never go away.
Hell, I get offers to install malware each day on Facebook alone,
although they're *spelled* like "Install Flash update to see this
shocking video!".

Those never end for the same reason I still routinely get Nigerian 419
spam:  there are plenty of people gullible enough to fall for them
outright.  Technical wizardry isn't needed to get in the door on
millions of machines.

So if RNGs have something to do with security, it's not with botnets;
let's not oversell this.


> People who do web security evidently believe that inappropriate RNGs
> have something to do with widespread security issues.

Do they really?  I casually follow news of the latest exploits, and I
really don't recall any of them pinned on an RNG (as opposed to highly
predictable default RNG _seeding_ from several years back).  Mostly
out-of-bounds crap in C, or exploiting holes in security models, or
bugs in the implementations of those models (whether Microsoft's,
Java's, Adobe Flash's ...).


> (That doesn't mean they're right, but it gives me pause for thought -- evidently,
> Guido thought so too!)

Or it's that Theo can be very insistent, and Guido is only brusque
with the non-Dutch ;-)

Not saying switching is bad.  Am saying I've seen no compelling
justification for causing users (& book & course authors & ....) such
pain.  If this were Python 0.9.1 at issue, sure - but random.py's
basic API really hasn't changed since then.

From greg.ewing at canterbury.ac.nz  Fri Sep 11 07:42:34 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 11 Sep 2015 17:42:34 +1200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <1441890163.3120507.379846857.49842A96@webmail.messagingengine.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <1441890163.3120507.379846857.49842A96@webmail.messagingengine.com>
Message-ID: <55F269CA.9050708@canterbury.ac.nz>

random832 at fastmail.us wrote:
> Being able to produce multiple independent streams of numbers is the
> important feature. Doing it by "jumping ahead" seems less so.

Doing it by jumping ahead isn't strictly necessary; the
important thing is to have some way of generating
*provably* non-overlapping and independent sequences.
Jumping ahead is one obvious way to achieve that.
Simply setting the seed of each generator randomly
and hoping for the best is not really good enough.

> And the
> need for doing it "efficiently" isn't as clear either

I say that because you can obviously jump ahead N
steps in any generator just by running it for N
cycles, but that's likely to be unacceptably slow.
A more direct way of getting there is desirable.
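
To make the "more direct way" concrete, here is a minimal sketch for a
simple linear congruential generator (not the Mersenne Twister that
random.py uses). The constants and the function names are illustrative
assumptions, not anything proposed in this thread. Because one step is
the affine map x -> A*x + C, N steps can be composed by repeated
squaring in O(log N) operations:

```python
# Illustrative 64-bit LCG constants; an assumption chosen for the sketch.
A = 6364136223846793005
C = 1442695040888963407
M = 2 ** 64


def step(x):
    """Advance the generator state by one step."""
    return (A * x + C) % M


def jump_slow(x, n):
    """Jump ahead n steps by actually running the generator n times."""
    for _ in range(n):
        x = step(x)
    return x


def jump_fast(x, n):
    """Jump ahead n steps in O(log n) by composing the affine map."""
    acc_a, acc_c = 1, 0      # accumulated map: x -> acc_a * x + acc_c
    cur_a, cur_c = A, C      # current map: 2**k steps of the generator
    while n > 0:
        if n & 1:
            # Apply the current map after the accumulated one.
            acc_a, acc_c = (cur_a * acc_a) % M, (cur_a * acc_c + cur_c) % M
        # Square the current map (apply it to itself).
        cur_a, cur_c = (cur_a * cur_a) % M, (cur_a * cur_c + cur_c) % M
        n >>= 1
    return (acc_a * x + acc_c) % M
```

Both functions land on the same state; only the cost differs, which is
the distinction drawn above.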

-- 
Greg

From jlehtosalo at gmail.com  Fri Sep 11 08:00:30 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Thu, 10 Sep 2015 23:00:30 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CALxg4FVf_g8_v9XxRWAS2Z-hgA0z23zQ3-VxZE2kXmoQ1F1RQA@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <CALxg4FVf_g8_v9XxRWAS2Z-hgA0z23zQ3-VxZE2kXmoQ1F1RQA@mail.gmail.com>
Message-ID: <CAA_f+LxrGZSO7Aaf=ZTci9wSPK8ELuNai-A6h+Z3Ww82pd=TQg@mail.gmail.com>

On Thu, Sep 10, 2015 at 3:01 AM, Luciano Ramalho <luciano at ramalho.org>
wrote:

> Jukka, thank you very much for working on such a hard topic and being
> patient enough to respond to issues that I am sure were exhaustively
> discussed before (but I was not following the discussions then since I
> was in the final sprint for my book, Fluent Python, at the time).
>
> I have two questions which were probably already asked before, so feel
> free to point me to relevant past messages:
>
> 1) Why is a whole new hierarchy of types being created in the typing
> module, instead of continuing the hierarchy in the collections module
> while enhancing the ABCs already there? For example, why weren't the
> List and Dict type created under the existing MutableSequence and
> MutableMapping types in collections.abc?
>

There are two main reasons. First, we wanted typing to be backward
compatible down to Python 3.2, and so all the new features had to work
without any changes to other standard library modules. Second, the module
is provisional and it would be awkward to have non-provisional standard
library modules depend on or closely interact with a provisional module.

Also, List and Dict are actually type aliases for regular classes (list and
dict, respectively) and so they actually represent subclasses of
MutableSequence and MutableMapping as defined in collections.abc. They
aren't proper classes so they don't directly play a role at runtime outside
annotations.
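
A small sketch of what that means in practice, assuming the typing
module as it shipped in Python 3.5 and later. The function name
word_lengths is hypothetical:

```python
from collections.abc import MutableMapping, MutableSequence
from typing import Dict, List


def word_lengths(words: List[str]) -> Dict[str, int]:
    """List and Dict appear only in the annotations."""
    return {word: len(word) for word in words}


# At runtime the values are plain list and dict objects, and those
# classes already count as subclasses of the collections.abc ABCs.
assert issubclass(list, MutableSequence)
assert issubclass(dict, MutableMapping)
```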


>
> 2) Similarly, I note that PEP-484 shuns existing ABCs like those in
> the numbers module, and the ByteString ABC. The reasons given are
> pragmatic, so that users don't need to import the numbers module, and
> would not "have to write typing.ByteString everywhere." as the PEP
> says... I do not understand these arguments because:
>
> a) as you just wrote in another message, the users will be primarily
> the authors of libraries and frameworks, who will always be forced to
> import typing anyhow, so it does not seem such a burden to have them
> import other modules to get the benefits of type hinting;
>

I meant that protocols will likely be often *defined* in libraries or
frameworks (or their stubs). Almost any code can *use* protocols in
annotations, but user code might be less likely to define additional
protocols. That's just a guess and I could be easily proven wrong, though.

> b) alternatively, there could be aliases of the relevant ABCs in the
> typing module for convenience
>

There are other reasons for not using ABCs for things like numbers. For
example, a lot of standard library functions expect concrete numeric types
and won't accept arbitrary subclasses of the ABCs. For example, you
couldn't pass a value with the numbers.Integral type to math.sin, because
it expects an int or a float. Using ABCs instead of int, float or str
wouldn't really work well (or at all) for type checking.
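
A hedged illustration of the gap between the ABCs and the concrete
types. At runtime math.sin accepts any object that defines __float__
or __index__, so the failure below relies on a deliberately minimal,
hypothetical class (Tally) that is only registered as Integral:

```python
import math
import numbers


class Tally:
    """Hypothetical class; it claims to be Integral but implements nothing."""


# Registration makes isinstance() succeed without adding any methods.
numbers.Integral.register(Tally)

value = Tally()
is_integral = isinstance(value, numbers.Integral)

try:
    math.sin(value)
    accepted = True
except TypeError:
    # math.sin needs something it can convert to a C double.
    accepted = False
```

So an annotation of numbers.Integral would promise less than what such
a function actually requires.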


>
> So the second question is: what's wrong with points (a) and (b), and
> why did PEP-484 keep such a distance from existing ABCs in general?
>

See above. There are more reasons but those that I mentioned are some of
the more important ones. If you are still unconvinced, ask for more details
and maybe I'll dig through the archives. :-)


>
> I understand pragmatic choices, but as a teacher and writer I know
> such choices are often obstacles to learning because they seem
> arbitrary to anyone who is not privy to the reasons behind them. So
> I'd like to better understand the reasoning, and I think PEP-484 is
> not very persuasive when it comes to the issues I mentioned.
>

Yeah, PEP 484 doesn't go through the rationale and subtleties in much
detail. Maybe there should be a separate rationale PEP and we could just
link to it when we get asked some of these (quite reasonable, mind you!)
questions again. ;-)

Jukka

From greg.ewing at canterbury.ac.nz  Fri Sep 11 08:19:13 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 11 Sep 2015 18:19:13 +1200
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPTjJmqkX6x+JuEuTsfmGrWjBbg=Mcnzg6UzQz73-hCZVbhF1w@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <CAPTjJmqkX6x+JuEuTsfmGrWjBbg=Mcnzg6UzQz73-hCZVbhF1w@mail.gmail.com>
Message-ID: <55F27261.1060901@canterbury.ac.nz>

Chris Angelico wrote:
> I'm
> not sure what the difference is between "seeding a PRNG with entropy"
> and "seeding a deterministic PRNG with a particular seed value",
> though; aside from the fact that one of them uses a known value and
> the other doesn't, of course. Back in my BASIC programming days, we
> used to use "RANDOMIZE TIMER" to seed the RNG with time-of-day, or
> "RANDOMIZE 12345" (or other value) to seed with a particular value;

I think the only other difference is that the Linux kernel
is continually re-seeding its generator whenever more
unpredictable bits become available. It's not something
you need to explicitly do yourself, as in your BASIC
example.
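
In random.py terms the two kinds of seeding look roughly like this (a
sketch only; the seed value 12345 is taken from the BASIC example
quoted above):

```python
import random

# Seeding with a particular value: the sequence is reproducible.
random.seed(12345)
first = [random.randint(0, 99) for _ in range(5)]
random.seed(12345)
second = [random.randint(0, 99) for _ in range(5)]
assert first == second

# Seeding with no argument: the seed comes from OS entropy when the
# platform provides it, otherwise from the system time. This happens
# once per call; it is not the continual re-seeding the kernel does.
random.seed()
third = [random.randint(0, 99) for _ in range(5)]
```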

-- 
Greg

From jlehtosalo at gmail.com  Fri Sep 11 08:24:36 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Thu, 10 Sep 2015 23:24:36 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F1B306.5070705@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
Message-ID: <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>

On Thu, Sep 10, 2015 at 9:42 AM, Sven R. Kunze <srkunze at mail.de> wrote:

> On 10.09.2015 06:12, Jukka Lehtosalo wrote:
>
> but there are some of the main benefits as I see them:
>
> - Code becomes more readable. This is especially true for code that
> doesn't have very detailed docstrings.
>
>
> If I have code without docstrings, I better write docstrings then. ;)
>
> I mean when I am really going to touch that file to improve documentation
> (which annotations are a piece of), I am going to add more information for
> the reader of my API and that mostly will be describing the behavior of the
> API.
>
> If my variables have crappy names, so I need to add type hints to them,
> well, then, I rather fix them first.
>

Even good variable names can leave the type ambiguous. And besides, if you
assume that all code is perfect or can be made perfect I think that you've
already lost the discussion. Reality disagrees with you. ;-)

You can't just wave a magic wand and get every programmer to document
their code and write unit tests. However, we know quite well that
programmers are perfectly capable of writing type annotations, and tools
can even enforce that they are present (witness all the Java code in
existence). Tools can't verify that you have good variable names or useful
docstrings, and people are too inconsistent or lazy to be relied on.


>
> You'll get the biggest benefits if you are working on a large code base
> mostly written by other people with limited test coverage and little
> comments or documentation.
>
>
> If I had a large untested and undocumented code base (well I actually have),
> then static type checking would be ONE tool to find out issues.
>

Sure, it doesn't solve everything.


>
> Once found out, I write tests as hell. Tests, tests, tests. I would not
> add type annotations. I need tested functionality not proper typing.
>

Most programmers only have limited time for improving existing code. Adding
type annotations is usually easier than writing tests. In a cost/benefit
analysis it may be optimal to spend half the available time on annotating
parts of the code base to get some (but necessarily limited) static
checking coverage and spend the remaining half on writing tests for
selected parts of the code base, for example. It's not all or nothing.


>
>
> You get extra credit if your tests are slow to run and flaky,
>
>
> We are problem solvers. So, I would tell my team: "make them faster and
> more reliable".
>

But you'd probably also ask them to implement new features (or *your*
manager might be unhappy), and they have to find the right balance, as they
only have 40 hours a week (or maybe 80 hours if you work at an early-stage
startup :-). Having more tools gives you more options for spending your
time efficiently.


>
>
> I consider that difference pretty significant. I wouldn't want to increase
> the fraction of unchecked parts of my annotated code by a factor of 8, and
> I want to have control over which parts can be type checked.
>
>
> Granted. But you still don't know if your code runs correctly. You are
> better off with tests. And I agree type checking is 1 test to perform (out
> of 10K).
>

Actually a type checker can verify multiple properties of a typical line of
code. So for 10k lines of code, complete type checking coverage would give
you the equivalent of maybe 30,000 (simple) tests. :-P

And I'm sure it would take much less time to annotate your code than to
manually write the 30,000 test cases.


>
> But:
>
>
>> I don't see the effort for adding type hints AND the effort for further
>> parsing (by human eyes) justified by partially better IDE support and 1
>> single additional test within test suites of about 10,000s of tests.
>>
>> Especially, when considering that correct types don't prove functionality
>> in any case. But tested functionality in some way proves correct typing.
>>
>
> I didn't see you respond to that. But you probably know that. :)
>

This is a variation of an old argument, which goes along the lines of "if
you have tests and comments (and everybody should, of course!) type
checking doesn't buy you anything". But if the premise can't be met, the
argument doesn't actually say anything about the usefulness of type
checking. :-)

It's often not cost effective to have good test coverage (and even 100%
line coverage doesn't give you full coverage of all interactions). Testing
can't prove that your code doesn't have defects -- it just proves that for
a tiny subset of possible inputs your code works as expected. A type checker
may be able to prove that for *all* possible inputs your code doesn't do
certain bad things, but it can't prove that it does the good things.
Neither subsumes the other, and both of these approaches are useful and
complementary (but incomplete). I think that there was a good talk
basically about this at PyCon this year, by the way, but I can't remember
the title.
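
A minimal sketch of how the two approaches complement each other. The
function names are hypothetical, and the claim about what a checker
reports assumes a tool that understands Optional (mypy is one such
tool):

```python
from typing import Dict, Optional


def lookup_email(directory: Dict[str, str], name: str) -> Optional[str]:
    """Return the address for name, or None when the name is unknown."""
    return directory.get(name)


def email_domain(directory: Dict[str, str], name: str) -> str:
    email = lookup_email(directory, name)
    # A checker that understands Optional can flag the next line for
    # every caller, because email may be None here.
    return email.split("@")[1]


# The one test we happened to write passes, so the defect stays hidden.
assert email_domain({"ann": "ann@example.org"}, "ann") == "example.org"
```

The test shows that one input works; the checker can show that the
None case is never handled, but not that the returned domain is the
right one.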

Jukka

From storchaka at gmail.com  Fri Sep 11 08:27:15 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 11 Sep 2015 09:27:15 +0300
Subject: [Python-ideas] Round division
In-Reply-To: <20150911031304.GT19373@ando.pearwood.info>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org> <20150911031304.GT19373@ando.pearwood.info>
Message-ID: <msts84$esq$1@ger.gmane.org>

On 11.09.15 06:13, Steven D'Aprano wrote:
> How does this differ from round(a/b)? round() also rounds to even.

 >>> round(5000000000000000/9999999999999999)
0
 >>> round(14999999999999999/10000000000000000)
2

But as exact fractions, 5000000000000000/9999999999999999 > 1/2 and
14999999999999999/10000000000000000 < 3/2, so both quotients should
round to 1.
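
A sketch of an exact alternative that never goes through float. The
name round_div is hypothetical; the tie rule (round half to even)
matches what round() does:

```python
def round_div(a, b):
    """Divide integers a by b, rounding exactly to nearest, ties to even."""
    if b < 0:
        # Normalise the sign so the remainder lies in [0, b).
        a, b = -a, -b
    q, r = divmod(a, b)
    # Compare the remainder with half the divisor using integers only.
    if 2 * r > b or (2 * r == b and q % 2 == 1):
        q += 1
    return q
```

With the operands above, round_div(5000000000000000, 9999999999999999)
and round_div(14999999999999999, 10000000000000000) both give 1, where
round(a/b) gave 0 and 2.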



From xavier.combelle at gmail.com  Fri Sep 11 08:34:43 2015
From: xavier.combelle at gmail.com (Xavier Combelle)
Date: Fri, 11 Sep 2015 08:34:43 +0200
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
Message-ID: <CAEQcUJRQi5-JeqN2n-DuuL_y7WA3dABAqgM58=ywKw_LCF-A6A@mail.gmail.com>

2015-09-11 6:54 GMT+02:00 Chris Angelico <rosuav at gmail.com>:

> On Fri, Sep 11, 2015 at 2:44 PM, Stephen J. Turnbull <stephen at xemacs.org>
> wrote:
> > I suppose it would be too magic to have the seed method substitute the
> > traditional PRNG for the default, while an implicitly seeded RNG
> > defaults to a crypto strong algorithm?
>
> Ooh. Actually, I rather like that idea. If you don't seed the RNG, its
> output will be unpredictable; it doesn't matter whether it's a PRNG
> seeded by an unknown number, a PRNG seeded by /dev/urandom, a CSRNG,
> or just reading from /dev/urandom every time. Until you explicitly
> request determinism, you don't have it. If Python changes its RNG
> algorithm and you haven't been seeding it, would you even know? Could
> it ever matter to you?
>
> It would require a bit of an internals change; is it possible that
> code depends on random.seed and random.randint being bound methods of
> the same object? To implement what you describe, they'd probably have
> to not be.
>
> ChrisA

I have thought of this idea and was quite seduced by it. However in this
case on a non seeded generator, getstate/setstate would be meaningless. I
also wonder what pickling generators does.
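
For reference, a short sketch of how these operations behave with the
random module as it exists today: random.Random supports getstate,
setstate and pickling, while random.SystemRandom, which has no state
of its own, raises NotImplementedError from getstate.

```python
import pickle
import random

rng = random.Random(12345)
rng.random()                      # advance past the freshly seeded state

saved = rng.getstate()
run_a = [rng.random() for _ in range(3)]
rng.setstate(saved)
run_b = [rng.random() for _ in range(3)]
assert run_a == run_b             # the saved state replays the sequence

# Pickling a Random instance carries its state along.
clone = pickle.loads(pickle.dumps(rng))
clone_matches = clone.random() == rng.random()

# An OS-backed generator has nothing to save.
try:
    random.SystemRandom().getstate()
    has_state = True
except NotImplementedError:
    has_state = False
```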

From jlehtosalo at gmail.com  Fri Sep 11 08:38:24 2015
From: jlehtosalo at gmail.com (Jukka Lehtosalo)
Date: Thu, 10 Sep 2015 23:38:24 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <9683c40d-b662-4b77-947e-62c418be8468@googlegroups.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <9683c40d-b662-4b77-947e-62c418be8468@googlegroups.com>
Message-ID: <CAA_f+LyB2Vt2s4_8HkGoeP0-M3LX=QfOjpDGAJod8zpZ0ACE0w@mail.gmail.com>

On Thu, Sep 10, 2015 at 11:57 AM, Matthias Kramm via Python-ideas <
python-ideas at python.org> wrote:

> On Wednesday, September 9, 2015 at 1:19:12 PM UTC-7, Guido van Rossum
> wrote:
>>
>> Jukka wrote up a proposal for structural subtyping. It's pretty good.
>> Please discuss.
>>
>> https://github.com/ambv/typehinting/issues/11#issuecomment-138133867
>>
>
> I like this proposal; given Python's flat nominal type hierarchy, it will
> be useful to have a parallel subtyping mechanism to give things finer
> granularity without having to resort to ABCs.
>
> Are the return types of methods invariant or variant under this proposal?
>
> I.e. if I have
>
>   class A(Protocol):
>     def f(self) -> int: ...
>
> does
>
>   class B:
>     def f(self) -> bool:
>       return True
>
> implicitly implement the protocol A?
>

The proposal doesn't spell out the rules for subtyping, but we should
follow the ordinary rules for subtyping for functions, and return types
would behave covariantly. So the answer is yes.
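
A runnable sketch of that answer. It assumes typing.Protocol as it
later landed in Python 3.8, which postdates this thread, so treat it
as an illustration rather than as the proposal's exact spelling. The
function use is hypothetical:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class A(Protocol):
    def f(self) -> int: ...


class B:                          # no explicit base class
    def f(self) -> bool:          # bool is a subtype of int
        return True


def use(a: A) -> int:
    """Accept anything that structurally matches protocol A."""
    return a.f() + 1


result = use(B())
```

Note that the runtime isinstance check only looks for the method name;
the covariance of the return type is something a static checker
verifies.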


> Also, marking Protocols using subclassing seems confusing and error-prone.
> In your examples above, one would think that you could define a new
> protocol using
>
> class SizedAndClosable(Sized):
>     pass
>
> instead of
>
> class SizedAndClosable(Sized, Protocol):
>     pass
>
> because Sized is already a protocol.
>

The proposal also lets you define the protocols implemented by your class
explicitly, and without having the explicit Protocol base class or some
other marker these would be impossible to distinguish in general. Example:

class MyList(Sized):   #  I want this to be a normal class, not a protocol.
    def __len__(self) -> int:
        return self.num_items

class DerivedProtocol(Sized):  # This should actually be a protocol.
    def foo(self) -> int: ...


> Maybe the below would be a more intuitive syntax:
>
>   @protocol
>   class SizedAndClosable(Sized):
>       pass
>
>
We could use that. The tradeoff is that then we'd have some inconsistency
depending on whether a protocol is generic or not:

@protocol
class A(metaclass=ProtocolMeta):   # Non-generic protocol
    ...

@protocol
class B(Generic[T]):   # Generic protocol. But this has a different
                       # metaclass than the above?
    ...

I'm not sure if we can use ABCMeta for protocols as protocols may need some
additional metaclass functionality. Anyway, any proposal should consider
all these possible ways of defining protocols:

1. Basic protocol, no protocol inheritance
2. Generic protocol, no protocol inheritance
3. Basic protocol that inherits one or more protocols
4. Generic protocol that inherits one or more protocols

My approach seems to deal with all of these reasonably well in my opinion
(but I haven't implemented it yet!), but the tradeoff is that the Protocol
base class needs to be present for all protocols.
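
The four cases can be sketched as follows. This assumes the
typing.Protocol spelling that later shipped in Python 3.8, not an
implementation that existed when this was written. Closable, Source,
ClosableSource, Queue and drain are hypothetical names:

```python
from collections.abc import Sized
from typing import Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Closable(Protocol):                    # 1. basic, no inheritance
    def close(self) -> None: ...


class Source(Protocol[T_co]):                # 2. generic, no inheritance
    def get(self) -> T_co: ...


class SizedAndClosable(Sized, Closable, Protocol):    # 3. basic, inherits
    pass


class ClosableSource(Source[T_co], Closable, Protocol[T_co]):  # 4. generic, inherits
    pass


class Queue:
    """Ordinary class; it matches the protocols purely by its methods."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def get(self):
        return self._items.pop(0)

    def close(self):
        self._items.clear()


def drain(src: ClosableSource[int]) -> int:
    """Take one value from the source, then close it."""
    value = src.get()
    src.close()
    return value
```

In every case the Protocol base is what marks the class as a protocol
rather than as a normal class.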

Jukka

From stephen at xemacs.org  Fri Sep 11 08:39:11 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 11 Sep 2015 15:39:11 +0900
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <55F269CA.9050708@canterbury.ac.nz>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <1441890163.3120507.379846857.49842A96@webmail.messagingengine.com>
 <55F269CA.9050708@canterbury.ac.nz>
Message-ID: <87r3m5zchc.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:
 > random832 at fastmail.us wrote:
 > > Being able to produce multiple independent streams of numbers is the
 > > important feature. Doing it by "jumping ahead" seems less so.
 > 
 > Doing it by jumping ahead isn't strictly necessary; the
 > important thing is to have some way of generating
 > *provably* non-overlapping and independent sequences.

By definition you don't have (stochastic) independence if you're using
a PRNG and deterministically jumping ahead.  Proving non-overlapping
is easy, but I don't even have a definition of "independence" of fixed
sequences: equidistribution of pairs?  That might make sense if you
have a sequence long enough to contain all pairs, but even then you
really just have a single sequence with larger support, and I don't
see how you can prove that it's a "good" sequence for using in a
simulation.

 > Jumping ahead is one obvious way to achieve that.
 > Simply setting the seed of each generator randomly
 > and hoping for the best is not really good enough.

It is not at all obvious to me that jumping ahead is better than
randomly seeding separate generators.  The latter actually gives
stochastic independence (at least if you randomize over all possible
seeds).
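
A minimal sketch of the "randomly seed separate generators" approach,
assuming OS entropy is an acceptable source of seeds. The helper name
make_streams and the 128-bit seed size are illustrative choices:

```python
import random


def make_streams(count):
    """Create count Mersenne Twister generators, each seeded from the OS."""
    entropy = random.SystemRandom()
    return [random.Random(entropy.getrandbits(128)) for _ in range(count)]


streams = make_streams(4)
samples = [[rng.random() for _ in range(3)] for rng in streams]
```

Nothing here proves that the streams do not overlap; it only makes an
overlap depend on the randomly chosen seeds.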


From p.f.moore at gmail.com  Fri Sep 11 10:02:47 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 11 Sep 2015 09:02:47 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>

On 11 September 2015 at 05:44, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> I suppose it would be too magic to have the seed method substitute the
> traditional PRNG for the default, while an implicitly seeded RNG
> defaults to a crypto strong algorithm?

One issue with that - often, programs simply use a RNG for their own
purposes, but offer a means of getting the seed after the fact for
reproducibility reasons (the "map seed" case, for example).

Pseudo-code:

    if <user supplied a "seed">:
        state = <user-supplied value>
        random.setstate(state)
    else:
        state = random.getstate()
    ... do the program's main job, never calling seed/setstate
    if <user requests the "seed">:
        print state

So getstate (and setstate) would also need to switch to a PRNG.
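
A runnable version of the pseudo-code, as a sketch. The function name
play and the dice rolls standing in for "the program's main job" are
assumptions:

```python
import random


def play(saved_state=None):
    """Run the 'main job' and return its results plus the state used."""
    if saved_state is not None:
        state = saved_state
        random.setstate(state)        # replay a previous run
    else:
        state = random.getstate()     # remember how this run started
    # The main job never calls seed() or setstate() itself.
    rolls = [random.randint(1, 6) for _ in range(5)]
    return rolls, state


first_rolls, state = play()
replayed_rolls, _ = play(state)
assert first_rolls == replayed_rolls
```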

There's actually very few cases I can think of where I'd need seed()
(as opposed to setstate()). Maybe if I let the user *choose* a seed.
Some games do this.

Paul

From encukou at gmail.com  Fri Sep 11 10:08:38 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Fri, 11 Sep 2015 10:08:38 +0200
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
Message-ID: <CA+=+wqABN=xw_8VWTS-u6oRyBcopOATafGnGFV=NVMaZmBu04g@mail.gmail.com>

On Fri, Sep 11, 2015 at 6:54 AM, Chris Angelico <rosuav at gmail.com> wrote:
> On Fri, Sep 11, 2015 at 2:44 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> I suppose it would be too magic to have the seed method substitute the
>> traditional PRNG for the default, while an implicitly seeded RNG
>> defaults to a crypto strong algorithm?
>
> Ooh. Actually, I rather like that idea. If you don't seed the RNG, its
> output will be unpredictable; it doesn't matter whether it's a PRNG
> seeded by an unknown number, a PRNG seeded by /dev/urandom, a CSRNG,
> or just reading from /dev/urandom every time. Until you explicitly
> request determinism, you don't have it. If Python changes its RNG
> algorithm and you haven't been seeding it, would you even know? Could
> it ever matter to you?
>
> It would require a bit of an internals change; is it possible that
> code depends on random.seed and random.randint being bound methods of
> the same object? To implement what you describe, they'd probably have
> to not be.

I've also thought about this idea. The problem with it is that seed()
and friends affect a global instance of Random.
If, after this change, there was a library that used random.random()
for crypto, calling seed() in the main program (or any other library)
would make it insecure. So we'd still be in a situation where nobody
should use random() for crypto.
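
The effect is easy to reproduce with today's module. In this sketch
make_token is a hypothetical library helper that trusts the
module-level functions:

```python
import random


def make_token():
    # Hypothetical library code using the shared, module-level generator.
    return "%032x" % random.getrandbits(128)


# Somewhere else, the main program seeds the same global instance...
random.seed(2015)
first = make_token()

# ...and anyone who knows or guesses that seed gets the same "secret".
random.seed(2015)
second = make_token()

assert first == second
```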

From p.f.moore at gmail.com  Fri Sep 11 10:11:37 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 11 Sep 2015 09:11:37 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
Message-ID: <CACac1F8Pc_qnms1gDD_w7MiVpzghSuQxuzn6kwjpLmE=Rz=zqA@mail.gmail.com>

On 10 September 2015 at 23:46, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Sep 10, 2015, at 07:21, Donald Stufft <donald at stufft.io> wrote:
>>
>> Either we can change the default to a secure
>> CSPRNG and break these functions (and the people using them) which is however
>> easily fixed by changing ``import random`` to
>> ``import random; random = random.DeterministicRandom()``
>
> But that isn't a fix, unless all your code is in a single module. If I
> call random.seed in game.py and then call random.choice in aiplayer.py,
> I'll get different results after your fix than I did before.

Note that this is another case of wanting "correct by default".
Requiring the user to pass around a RNG object makes it easy to do the
wrong thing - because (as above) people can too easily create multiple
independent RNGs by mistake, which means your numbers don't
necessarily satisfy the randomness criteria any more.
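
A small sketch of the mistake and of the alternative. The variable
names echo the game.py and aiplayer.py modules from the quoted message;
the function play is hypothetical:

```python
import random

# Two modules that each create their own generator by mistake.
game_rng = random.Random()
ai_rng = random.Random()

before = ai_rng.getstate()
game_rng.seed(1234)
# Seeding one instance leaves the other untouched, so the AI's choices
# are not reproducible from the game's seed.
assert ai_rng.getstate() == before


def play(rng):
    """Use whichever generator the caller passes in."""
    return [rng.choice(["rock", "paper", "scissors"]) for _ in range(5)]


# Passing one shared instance around restores reproducibility, but only
# if every caller remembers to do it.
assert play(random.Random(1234)) == play(random.Random(1234))
```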

"Secure by default" isn't (and shouldn't be) the only example of
"correct by default" that matters here. Whether "secure" is more
important than "gives the right results" is a matter of opinion, and
application dependent. Password generators have more need to be secure
than to be mathematically random, Monte Carlo simulations (and to a
lesser extent games) the other way around. Many things care about
neither.

If we can't manage "correct and secure by default", someone (and it
won't be me) has to decide which end of the scale gets preference.

Paul.

From rosuav at gmail.com  Fri Sep 11 10:57:32 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 11 Sep 2015 18:57:32 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CA+=+wqABN=xw_8VWTS-u6oRyBcopOATafGnGFV=NVMaZmBu04g@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
 <CA+=+wqABN=xw_8VWTS-u6oRyBcopOATafGnGFV=NVMaZmBu04g@mail.gmail.com>
Message-ID: <CAPTjJmpZBE3ROmfBAzUft6XCwwgqUCURotJBMqj73t7r1Odbsw@mail.gmail.com>

On Fri, Sep 11, 2015 at 6:08 PM, Petr Viktorin <encukou at gmail.com> wrote:
> I've also thought about this idea. The problem with it is that seed()
> and friends affect a global instance of Random.
> If, after this change, there was a library that used random.random()
> for crypto, calling seed() in the main program (or any other library)
> would make it insecure. So we'd still be in a situation where nobody
> should use random() for crypto.

So library functions shouldn't use random.random() for anything they
know needs security. If you write a function generate_password(), the
responsibility is yours to ensure that it's entropic rather than
deterministic. That's no different from the current situation (seeding
the RNG makes it deterministic) except that the unseeded RNG is not
just harder to predict, it's actually entropic.

In some cases, having the 99% by default is a barrier to people who
need the 100%. (Conflating UCS-2 with Unicode deceives people into
thinking their program works just fine, and then it fails on astral
characters.) But in this case, there's no perfect-by-default solution,
so IMO the best two solutions are: Be great, but vulnerable to an
external seed(), until someone chooses; or have no random number
generation until someone chooses. We know that the latter is a
terrible option for learning, so vulnerability to someone else calling
random.seed() is a small price to pay.
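
For the generate_password() case, a minimal sketch of taking that
responsibility with what the module already offers. The alphabet and
the default length are arbitrary choices for illustration:

```python
import random
import string

# A private, OS-backed generator; random.seed() elsewhere cannot affect it.
_sysrand = random.SystemRandom()
ALPHABET = string.ascii_letters + string.digits


def generate_password(length=16):
    """Build a password from OS entropy rather than the seeded global RNG."""
    return "".join(_sysrand.choice(ALPHABET) for _ in range(length))


random.seed(0)                    # someone else seeds the global instance
password = generate_password()    # still drawn from the OS
```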

ChrisA

From njs at pobox.com  Fri Sep 11 11:52:41 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 11 Sep 2015 02:52:41 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>
Message-ID: <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>

On Fri, Sep 11, 2015 at 1:02 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 11 September 2015 at 05:44, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> I suppose it would be too magic to have the seed method substitute the
>> traditional PRNG for the default, while an implicitly seeded RNG
>> defaults to a crypto strong algorithm?
>
> One issue with that - often, programs simply use a RNG for their own
> purposes, but offer a means of getting the seed after the fact for
> reproducibility reasons (the "map seed" case, for example).
>
> Pseudo-code:
>
>     if <user supplied a "seed">:
>         state = <user-supplied value>
>         random.setstate(state)
>     else:
>         state = random.getstate()
>     ... do the program's main job, never calling seed/setstate
>     if <user requests the "seed">:
>         print state
>
> So getstate (and setstate) would also need to switch to a PRNG.
>
> There's actually very few cases I can think of where I'd need seed()
> (as opposed to setstate()). Maybe if I let the user *choose* a seed.
> Some games do this.

You don't really want to use the full 4992 byte state for a "map seed"
application anyway (type 'random.getstate()' in a REPL and watch your
terminal scroll down multiple pages...). No game actually uses map
seeds that look anything like that. I'm 99% sure that real
applications in this category are actually using logic like:

if <user supplied a "seed">:
    seed = user_seed()
else:
    # use some RNG that was seeded with real entropy
    seed = random_short_printable_string()
r = random.Random(seed)
# now use 'r' to generate the map
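
For concreteness, the logic above might look something like the
following in runnable form. This is only a sketch: the helper names
(random_short_printable_string, make_map_rng) and the 8-character seed
length are made up for illustration, and it assumes Python 3, where a
str seed is turned into an integer via SHA-512 rather than hash().

```python
import random
import string


def random_short_printable_string(length=8):
    # Hypothetical helper: draw a short seed from the OS entropy source
    # (SystemRandom reads os.urandom), so unseeded maps are unpredictable.
    sysrand = random.SystemRandom()
    alphabet = string.ascii_lowercase + string.digits
    return "".join(sysrand.choice(alphabet) for _ in range(length))


def make_map_rng(user_seed=None):
    # Use the user's seed if given, otherwise invent a short printable one.
    seed = user_seed if user_seed is not None else random_short_printable_string()
    # A private Random instance: nothing else in the program can reseed it.
    return seed, random.Random(seed)


# Generate a "map", then regenerate it from the published seed.
seed, r = make_map_rng()
terrain = [r.randint(0, 9) for _ in range(5)]

_, r2 = make_map_rng(seed)
replay = [r2.randint(0, 9) for _ in range(5)]
```

The seed that gets shown to the user is the short string, not the
generator's internal state, which is what keeps it easy to publish.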

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From p.f.moore at gmail.com  Fri Sep 11 12:03:42 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 11 Sep 2015 11:03:42 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>
 <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>
Message-ID: <CACac1F8YnP5sWFy7n04f-CzgN0J2+sHwvb1HtLbtMKHXv82qGg@mail.gmail.com>

On 11 September 2015 at 10:52, Nathaniel Smith <njs at pobox.com> wrote:
> You don't really want to use the full 4992 byte state for a "map seed"
> application anyway (type 'random.getstate()' in a REPL and watch your
> terminal scroll down multiple pages...). No game actually uses map
> seeds that look anything like that. I'm 99% sure that real
> applications in this category are actually using logic like:
>
> if <user supplied a "seed">:
>     seed = user_seed()
> else:
>     # use some RNG that was seeded with real entropy
>     seed = random_short_printable_string()
> r = random.Random(seed)
> # now use 'r' to generate the map

Yeah, good point. As I say, I don't actually *use* this in the example
program I'm thinking of, I just know it's a feature I need to add in
due course. So when I do, I'll have to look into how to best implement
it. (And I'll probably nick the approach you show above, thanks ;-))

Paul

From abarnert at yahoo.com  Fri Sep 11 12:07:27 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 11 Sep 2015 03:07:27 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>
 <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>
Message-ID: <4C44F738-05E4-4557-8C24-B1B9B7A38E0D@yahoo.com>

On Sep 11, 2015, at 02:52, Nathaniel Smith <njs at pobox.com> wrote:
> 
>> On Fri, Sep 11, 2015 at 1:02 AM, Paul Moore <p.f.moore at gmail.com> wrote:
>>> On 11 September 2015 at 05:44, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>>> I suppose it would be too magic to have the seed method substitute the
>>> traditional PRNG for the default, while an implicitly seeded RNG
>>> defaults to a crypto strong algorithm?
>> 
>> One issue with that - often, programs simply use a RNG for their own
>> purposes, but offer a means of getting the seed after the fact for
>> reproducibility reasons (the "map seed" case, for example).
>> 
>> Pseudo-code:
>> 
>>    if <user supplied a "seed">:
>>        state = <user-supplied value>
>>        random.setstate(state)
>>    else:
>>        state = random.getstate()
>>    ... do the program's main job, never calling seed/setstate
>>    if <user requests the "seed">:
>>        print state
>> 
>> So getstate (and setstate) would also need to switch to a PRNG.
>> 
>> There's actually very few cases I can think of where I'd need seed()
>> (as opposed to setstate()). Maybe if I let the user *choose* a seed.
>> Some games do this.
> 
> You don't really want to use the full 4992 byte state for a "map seed"
> application anyway (type 'random.getstate()' in a REPL and watch your
> terminal scroll down multiple pages...). No game actually uses map
> seeds that look anything like that.

But games do store the entire map state with saved games if they want repeatable saves (e.g., to prevent players from defeating the RNG by save scumming).

From p.f.moore at gmail.com  Fri Sep 11 12:10:56 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 11 Sep 2015 11:10:56 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <4C44F738-05E4-4557-8C24-B1B9B7A38E0D@yahoo.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>
 <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>
 <4C44F738-05E4-4557-8C24-B1B9B7A38E0D@yahoo.com>
Message-ID: <CACac1F-quv6SQqHu8jCM7FNjFOg-rk2VdKMn_DJ_Xwum8-nocw@mail.gmail.com>

On 11 September 2015 at 11:07, Andrew Barnert <abarnert at yahoo.com> wrote:
> But games do store the entire map state with saved games if they want repeatable saves (e.g., to prevent players from defeating the RNG by save scumming).

So far off-topic it's not true, but a number of games I know of (e.g.,
Factorio, Minecraft) include a means to get a map seed (a simple text
string) which you can publish, that allows other users to (in effect)
play on the same map as you. That's different from saves.

Paul

From njs at pobox.com  Fri Sep 11 12:26:07 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 11 Sep 2015 03:26:07 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F8Pc_qnms1gDD_w7MiVpzghSuQxuzn6kwjpLmE=Rz=zqA@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <CACac1F8Pc_qnms1gDD_w7MiVpzghSuQxuzn6kwjpLmE=Rz=zqA@mail.gmail.com>
Message-ID: <CAPJVwBkP5mpCS--0S5o4Li5D-av-L3Pg6xyrM9jfrcPmRT598w@mail.gmail.com>

On Fri, Sep 11, 2015 at 1:11 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 10 September 2015 at 23:46, Andrew Barnert <abarnert at yahoo.com> wrote:
>> On Sep 10, 2015, at 07:21, Donald Stufft <donald at stufft.io> wrote:
>>>
>>> Either we can change the default to a secure
>>> CSPRNG and break these functions (and the people using them) which is however
>>> easily fixed by changing ``import random`` to
>>> ``import random; random = random.DeterministicRandom()``
>>
>> But that isn't a fix, unless all your code is in a single module. If I call random.seed in game.py and then call random.choice in aiplayer.py, I'll get different results after your fix than I did before.
>
> Note that this is another case of wanting "correct by default".
> Requiring the user to pass around a RNG object makes it easy to do the
> wrong thing - because (as above) people can too easily create multiple
> independent RNGs by mistake, which means your numbers don't
> necessarily satisfy the randomness criteria any more.

Accidentally creating multiple independent RNGs is not going to cause
any problems with respect to randomness. It only creates a problem
with respect to determinism/reproducibility.

Beyond that I just find your message a bit baffling. I guess I believe
you that you find that passing around RNG objects makes it easy to do
the wrong thing, but it's exactly the opposite of my experience: when
writing code that cares about determinism/reproducibility, then for
me, passing around RNG objects makes it way *easier* to get things
right. It makes it much more obvious what kinds of refactoring will
break reproducibility, and it enables all kinds of useful tricks.

E.g., keeping to the example of games and "aiplayer.py", a common
thing game designers want to do is to record playthroughs so they can
be replayed again as demos or whatever. And a common way to do that is
to (1) record the player's inputs, (2) make sure that the way the game
state evolves through time is deterministic given the player's inputs.

(This isn't necessarily the *best* strategy, but it is a common one.)

Now suppose we're writing a game like this, and we have a bunch of
"enemies", each of whose behavior is partially random. So on each
"tick" we have to iterate through each enemy and update its state.

If we are using a single global RNG, then for correctness it becomes
crucial that we always iterate over all enemies in exactly the same
order. Which is a mess.

A better strategy is, keep one global RNG for the level, but then when
each new enemy is spawned, assign it its own RNG that will be used to
determine its actions, and seed this RNG using a value sampled from
the global RNG (!). Now the overall pattern of the game will be just
as random, still be deterministic, and -- crucially -- it no longer
matters what order we iterate over the enemies in.
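
A rough sketch of that per-enemy scheme, to make it concrete. The
Enemy class, the enemy names and run_level are invented for the
example; the only real API used is random.Random and getrandbits().
Note that it still relies on enemies being *spawned* in a fixed order,
since that is when the child seeds are drawn from the level RNG.

```python
import random


class Enemy:
    def __init__(self, name, rng):
        self.name = name
        self.rng = rng  # this enemy's private RNG
        self.x = 0

    def tick(self):
        # Random walk: step left or right using only our own RNG.
        self.x += self.rng.choice((-1, 1))


def run_level(level_seed, update_order, ticks=10):
    level_rng = random.Random(level_seed)
    enemies = {}
    # Spawn order is fixed, so each enemy always gets the same child seed.
    for name in ("bat", "orc", "slime"):
        child_seed = level_rng.getrandbits(64)
        enemies[name] = Enemy(name, random.Random(child_seed))
    # The update order within a tick may vary freely.
    for _ in range(ticks):
        for name in update_order:
            enemies[name].tick()
    return {name: enemy.x for name, enemy in enemies.items()}


a = run_level(42, ["bat", "orc", "slime"])
b = run_level(42, ["slime", "bat", "orc"])
```

Here a and b come out identical even though the enemies were updated
in a different order on every tick.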

I particularly would not want to use the global RNG in any program
that was complicated enough to involve multiple modules. Passing state
between inter-module calls using a global variable is pretty much
always a bad plan, and that's exactly what you're talking about here.

Non-deterministic global RNGs are fine, b/c they're semantically
stateless; it's exactly the cases where you care about the determinism
of the RNG state that you want to *stop* using the global RNG.

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From rosuav at gmail.com  Fri Sep 11 12:30:40 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 11 Sep 2015 20:30:40 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPJVwBkP5mpCS--0S5o4Li5D-av-L3Pg6xyrM9jfrcPmRT598w@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <CACac1F8Pc_qnms1gDD_w7MiVpzghSuQxuzn6kwjpLmE=Rz=zqA@mail.gmail.com>
 <CAPJVwBkP5mpCS--0S5o4Li5D-av-L3Pg6xyrM9jfrcPmRT598w@mail.gmail.com>
Message-ID: <CAPTjJmoWQe3g_HvhU2Mh3iwN+aNqkqJRCd9V2b37WTYE6gWReA@mail.gmail.com>

On Fri, Sep 11, 2015 at 8:26 PM, Nathaniel Smith <njs at pobox.com> wrote:
> A better strategy is, keep one global RNG for the level, but then when
> each new enemy is spawned, assign it its own RNG that will be used to
> determine its actions, and seed this RNG using a value sampled from
> the global RNG (!). Now the overall pattern of the game will be just
> as random, still be deterministic, and -- crucially -- it no longer
> matters what order we iterate over the enemies in.

As long as the order you seed their RNGs is deterministic. And if you
can do that, can't you iterate over them in a deterministic order too?

ChrisA

From skrah at bytereef.org  Fri Sep 11 13:07:38 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Fri, 11 Sep 2015 11:07:38 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
 secure?
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
Message-ID: <loom.20150911T125750-125@post.gmane.org>

Tim Peters <tim.peters at ...> writes:
> Not saying switching is bad.  Am saying I've seen no compelling
> justification for causing users (& book & course authors & ....) such
> pain.  If this were Python 0.9.1 at issue, sure - but random.py's
> basic API really hasn't changed since then.

Agreed, and just recording my -1 for changing the API.  Also, I'm
noting that in *this* thread most people were at least moderately
against the change.


Stefan Krah



From random832 at fastmail.us  Fri Sep 11 14:42:55 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 11 Sep 2015 08:42:55 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAPTjJmrn=Gzp3hZdEYQLT1105P6u+OSKPanxYJMEfK=EESvkFw@mail.gmail.com>
Message-ID: <1441975375.3458375.380808489.2E341B77@webmail.messagingengine.com>

On Fri, Sep 11, 2015, at 00:54, Chris Angelico wrote:
> It would require a bit of an internals change; is it possible that
> code depends on random.seed and random.randint being bound methods of
> the same object? 

That's a ridiculous thing to depend on.

> To implement what you describe, they'd probably have
> to not be.

You could implement one class that calls either a SystemRandom instance
or an instance of another class depending on which mode it is in.
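
Something along these lines, as a minimal sketch. SwitchingRandom is a
made-up name, not a proposal for the actual stdlib spelling, and it
only covers seed(); setstate() and friends would need the same
treatment.

```python
import random


class SwitchingRandom:
    """Delegates to SystemRandom until seed() is called, then to Random."""

    def __init__(self):
        # Unseeded mode: OS entropy, not reproducible.
        self._impl = random.SystemRandom()

    def seed(self, a=None):
        # An explicit seed() call switches to the deterministic
        # Mersenne Twister generator (even for a=None, which Random
        # itself seeds from OS entropy).
        self._impl = random.Random(a)

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. the
        # generator methods (random, randint, choice, ...).
        if name == "_impl":
            raise AttributeError(name)
        return getattr(self._impl, name)


rng = SwitchingRandom()
unseeded_value = rng.random()   # comes from SystemRandom

rng.seed(123)
first = rng.random()
rng.seed(123)
second = rng.random()           # same as first: deterministic mode
```

With this shape, seed and randint are still bound methods of one
object as far as callers can tell; the switch happens behind it.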

From mal at egenix.com  Fri Sep 11 14:56:11 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 11 Sep 2015 14:56:11 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
Message-ID: <55F2CF6B.40301@egenix.com>

On 10.09.2015 19:04, Xavier Combelle wrote:
>> I think this is the major misunderstanding here:
>>
>> The random module never suggested that it generates pseudo-random data
>> of crypto quality.
>>
>> I'm pretty sure people doing crypto will know and most others
>> simply don't care :-)
>>
>> Evidence: We used a Wichmann-Hill PRNG as default in random
>> for a decade and people still got their work done. Mersenne
>> was added in Python 2.3 and bumped the period from
>> 6,953,607,871,644 (13 digits) to 2**19937-1 (6002 digits).
>
> That is not evidence; I have evidence of the opposite:
> some people can and do use random.random() for generating session keys or
> CSRF tokens, and it's an insecure default.

It all depends on what you consider "secure" or "secure enough"
and points directly to another misunderstanding: that "secure"
is a well-defined term :-)

The random module seeds its global Random instance using urandom
(if available on the system), so while the generator itself is
deterministic, the seed used to kick off the pseudo-random series
is not. For many purposes, this is secure enough.

It's also easy to make the output of the random instance more
secure by passing it through a crypto hash function.


But back to the original question: What is "secure" ?

In crypto terms, "secure" usually refers to "computationally
infeasible to calculate before the sun goes dark" (to take one
variant).

More realistically, it can be defined as: Based on the public
knowledge known today, it's impossible to run a program which
allows converting the output of a crypto function back to its
inputs within a reasonable time span. And this property will
- based on today's knowledge - hold for at least the next
5-10 years.

You may notice the many parameters in these definition attempts.
It all depends on who you ask.

With the advent of new technologies like quantum computers,
it's not at all clear that any of those definitions will still
hold in a couple of years. It's well possible that only quantum
computers will be able to implement the necessary programs
and it'll take a while for mobile phones to catch up and come
with chips implementing those ;-)


Now, leaving aside this bright future, what's reasonable today ?

If you look at tools like untwister:

    https://github.com/bishopfox/untwister

you can get a feeling for how long it takes to deduce the
seed from an output sequence. Bear in mind that in order
to be reasonably sure that the seed is correct, the available
output sequence has to be long enough.

That's a known-plaintext attack, so you need access to lots
of session keys to begin with.

The tool is still running on an example set of 1000 32-bit
numbers and it says it'll be done in 1.5 hours, i.e. before
the sun goes down in my timezone. I'll leave it running to
see whether it can find my secret key.

Untwister is only slightly smarter than bruteforce. Given
that MT has a seed size of 32 bits, it's not surprising that
a tool can find the seed within a day.

Perhaps it's time to switch to a better version of MT, e.g.
a 64-bit version (with 64-bit internal state):

    http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt64.html

or an even faster SIMD variant with better properties and
128 bit internal state:

    http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html

Esp. the latter will help make brute force attacks practically
impossible.

Tim ?

BTW: Looking at the sources of the _random module, I found that
the seed function uses the hash of non-integers (e.g. strings)
passed to it as seeds. Given the hash randomization
for strings this will create non-deterministic results, so it's
probably wise to only use 32-bit integers as seed values for
portability, if you need to rely on seeding the global Python
RNG.

-- 
Marc-Andre Lemburg
eGenix.com

From random832 at fastmail.us  Fri Sep 11 14:58:30 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 11 Sep 2015 08:58:30 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F-quv6SQqHu8jCM7FNjFOg-rk2VdKMn_DJ_Xwum8-nocw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CACac1F_0Yrc0p8YrhAdKXHJ0PwTNyPWu3ZG9qr=b-61Vp6kwAw@mail.gmail.com>
 <CAPJVwBk7eqkKvASFUzaAyrLX7CZkMGr9O+vULcZxE8PmFG2rTw@mail.gmail.com>
 <4C44F738-05E4-4557-8C24-B1B9B7A38E0D@yahoo.com>
 <CACac1F-quv6SQqHu8jCM7FNjFOg-rk2VdKMn_DJ_Xwum8-nocw@mail.gmail.com>
Message-ID: <1441976310.3463176.380840969.265C8003@webmail.messagingengine.com>

On Fri, Sep 11, 2015, at 06:10, Paul Moore wrote:
> On 11 September 2015 at 11:07, Andrew Barnert <abarnert at yahoo.com> wrote:
> > But games do store the entire map state with saved games if they want repeatable saves (e.g., to prevent players from defeating the RNG by save scumming).
> 
> So far off-topic it's not true, but a number of games I know of (e.g.,
> Factorio, Minecraft) include a means to get a map seed (a simple text
> string) which you can publish, that allows other users to (in effect)
> play on the same map as you. That's different from saves.

Of course, Minecraft doesn't actually use the seed in such a simple way
as seeding a single-sequence random number generator. If it did, the map
would depend on what order you visited regions in. (This is less of an
issue for games with finite worlds)
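
To illustrate the general idea (this is *not* Minecraft's actual
algorithm, just one assumed way to get order-independence): derive a
separate generator for each region from the world seed plus the
region's coordinates. The function names and the use of SHA-256 here
are choices made for the example.

```python
import hashlib
import random


def region_rng(world_seed, rx, rz):
    # Mix the world seed with the region coordinates, so each region
    # gets its own reproducible stream regardless of visiting order.
    key = "{}:{}:{}".format(world_seed, rx, rz).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return random.Random(int.from_bytes(digest, "big"))


def generate_region(world_seed, rx, rz, size=4):
    rng = region_rng(world_seed, rx, rz)
    return [rng.randint(0, 255) for _ in range(size)]


# Visit two regions in one order...
first_visit = [generate_region("myseed", 0, 0), generate_region("myseed", 5, -3)]
# ...then in the opposite order.
far_region = generate_region("myseed", 5, -3)
home_region = generate_region("myseed", 0, 0)
```

Both visiting orders produce the same contents for each region, which
a single shared sequence could not guarantee.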

From steve at pearwood.info  Fri Sep 11 15:36:13 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 11 Sep 2015 23:36:13 +1000
Subject: [Python-ideas] DRAFT Re: Python's Source of Randomness and the
	random.py module Redux
In-Reply-To: <etPan.55f18131.392f7558.31bc@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
Message-ID: <20150911133613.GW19373@ando.pearwood.info>

On Thu, Sep 10, 2015 at 09:10:09AM -0400, Donald Stufft wrote:

> Essentially, other than typing a little bit more, why is:
> 
>     import random
>     print(random.choice(["a", "b", "c"]))
> 
> better than
> 
>     import random;
>     print(random.DetereministicRandom().choice(["a", "b", "C"]))

Ironically, the spelling mistake in your example is a good example of 
how this is worse.

Another reason why it's worse is that if you create a new instance every 
single time you need a random number, as you do above, performance is 
definitely going to suffer. By my timings, creating a new SystemRandom 
instance each time is around two times slower; creating a new 
DeterministicRandom (i.e. the current MT default) instance each time is 
over 100 times slower.
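
A timing comparison of this kind can be sketched with timeit as below.
The exact ratios will vary by machine and Python version, so no
numbers are claimed here. Since DeterministicRandom is only a proposed
name, the sketch uses random.Random, which is the current MT class.

```python
import random
import timeit

items = ["a", "b", "c"]


def shared_instance():
    # Module-level function: reuses the hidden global Random instance.
    return random.choice(items)


def fresh_mt_instance():
    # New Mersenne Twister instance per call (has to be seeded each time).
    return random.Random().choice(items)


def fresh_system_instance():
    # New SystemRandom instance per call (no state to seed).
    return random.SystemRandom().choice(items)


timings = {}
for func in (shared_instance, fresh_mt_instance, fresh_system_instance):
    timings[func.__name__] = timeit.timeit(func, number=2000)

baseline = timings["shared_instance"]
for name, seconds in timings.items():
    print("{:24s} {:.4f}s  ({:.1f}x shared)".format(name, seconds, seconds / baseline))
```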

Hypothetically, it may even hurt your randomness: it may be that some 
future (or current) (C)PRNG's quality will be "less random" (biased, 
predictable, or correlated) because you keep using a fresh instance 
rather than the same one.

TL;DR:

Yes, calling `random.choice` is *significantly better* than calling 
`random.SomethingRandom().choice`. It's better for beginners, it's even 
better for expert users whose random needs are small, and those whose 
needs are greater shouldn't be using the latter anyway.


> You're allowed to pick DeterministicRandom, you're even allowed to do it
> without thinking. This isn't about making it impossible to ever insecurely use
> random numbers, that's obviously a boil the ocean level of problem, this is
> about trying to make it more likely that someone won't be hit by a fairly easy
> to hit footgun if it does matter for them, even if they don't know it. It's
> also about making code that is easier to understand on the surface, for example
> without using the prior knowledge that it's using MT, tell me how you'd know
> if this was safe or not:
> 
>     import random
>     import string
>     password = "".join(random.choice(string.ascii_letters) for _ in range(9))
>     print("Your random password is",)

Is this a trick question? In the absence of a keylogger and screen 
reader monitoring my system while I run that code snippet, of course it 
is safe.

In the absence of any credible attack on the password based on how it 
was generated, of course it is safe.


> Can you point out one use case where cryptographically safe random numbers,
> assuming we could generate them as quickly as you asked for them, would hurt
> you unless you needed/wanted to be able to save the seed and thus require or
> want deterministic results?

Nobody is saying that 


To put that question another way: "If you exclude the case where crypto 
would


> Reminder that this warning does not show up (in any color, much less red)
> if you're using ``help(random)`` or ``dir(random)`` to explore the random
> module. It also does not show up in code review when you see someone doing
> random.random.
> 
> It encourages you to write bad code, because it has a baked in assumption that
> there is a sane default for a random number generator and expects people to
> understand a fairly difficult concept, which is that not all "random" is equal.
> 
> For instance, you've already made the mistake of saying you wanted "random" not
> deterministic, but the two are not mutually exclusive and deterministic is a
> property that a source of random can have, and one that you need for one of the
> features you say you like.
> 
> >  
> > > Here's a game a friend of mine created where the purpose of the game is
> > > to essentially unrandomize some random data, which is only possible
> > > because it's (purposely) using MT to make it possible
> > > https://github.com/reaperhulk/dsa-ctf. This is not an ivory tower paranoia
> > > case, it's a real concern that will absolutely fix some insecure software
> > > out there instead of telling them "welp typing a little bit extra once
> > > an import is too much of a burden for me and really it's your own fault
> > > anyways".
> >  
> > I don't understand how that game (which is an interesting way of
> > showing people how attacks on crypto work, sure, but that's just
> > education, which you dismissed above) relates to the issue here.
> >  
> > And I hope you don't really think that your quote is even remotely
> > what I'm trying to say (I'm not that selfish) - my point is that not
> > everything is security related. Not every application people write,
> > and not every API in the stdlib. You're claiming that the random
> > module is security related. I'm claiming it's not, it's documented as
> > not being, and that's clear to the people who use it for its intended
> > purpose. Telling those people that you want to make a module designed
> > for their use harder to use because people for whom it's not intended
> > can't read the documentation which explicitly states that it's not
> > suitable for them, is doing a disservice to those people who are
> > already using the module correctly for its stated purpose.
> 
> I'm claiming that the term random is ambiguously both security related and
> people to pick whether or not their use case is security related, or we should
> assume that it is unless otherwise instructed. I don't particularly care what
> the exact spelling of this looks like, random.(System|Secure)Random and
> random.DeterministicRandom is just one option. 

> Another option is to look at
> something closer to what Go did and deprecate the "random" module and move the
> MT based thing to ``math.random`` and the CSPRNG can be moved to something like
> crypto.random.

This might be acceptable, although I wouldn't necessarily deprecate the 
random module. 



> 
> >  
> > By the same argument, we should remove the statistics module because
> > it can be used by people with numerically unstable problems. (I doubt
> > you'll find StackOverflow questions along these lines yet, but that's
> > only because (a) the module's pretty new, and (b) it actually works
> > pretty hard to handle the hard corner cases, but I bet they'll start
> > turning up in due course, if only from the people who don't understand
> > floating point...)
> >
> 
> No, by this argument we shouldn't have a function called statistics in the
> statistics module because there is no globally "right" answer for what the
> default should be. Should it be mean? mode? median? Why is *your* use case the
> "right" use case for the default option, particularly in a situation where
> picking the wrong option can be disastrous.
> 
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> 
> 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From steve at pearwood.info  Fri Sep 11 15:49:48 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 11 Sep 2015 23:49:48 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
 <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
 <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
Message-ID: <20150911134948.GX19373@ando.pearwood.info>

On Thu, Sep 10, 2015 at 04:08:09PM +1000, Chris Angelico wrote:
> On Thu, Sep 10, 2015 at 11:50 AM, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
> > Of course it adds the cost of making the module slower, and also 
> > more complex. Maybe a better solution would be to add a 
> > random.set_default_instance function that replaced all of the 
> > top-level functions with bound methods of the instance (just like 
> > what's already done at startup in random.py)? That's simple, and 
> > doesn't slow down anything, and it seems like it makes it more clear 
> > what you're doing than setting random.inst.
> 
> +1. A single function call that replaces all the methods adds a
> minuscule constant to code size, run time, etc, and it's no less
> readable than assignment to a module attribute.

Making monkey-patching the official, recommended way to choose a PRNG is 
a risky solution, to put it mildly. That means that at any time, some 
other module that is directly or indirectly imported might change the 
random number generators you are using without your knowledge. You want 
a crypto PRNG, but some module replaces it with MT. Or vice versa.

Technically, it is true that (this being Python) they can do this now, 
just by assigning to the random module:

    random.random = lambda: 9

but that is clearly abusive, and if you write code to do that, you're 
asking for whatever trouble you get. There's no official API to screw 
over other callers of the random module behind their back. You're 
suggesting that we add one.


> (If anything, it makes
> it more clearly a supported operation 

Which is exactly why this is a terrible idea. You're making
monkey-patching not only officially supported, but encouraged. That will
not end well.



-- 
Steve


From graffatcolmingov at gmail.com  Fri Sep 11 15:50:01 2015
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Fri, 11 Sep 2015 08:50:01 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150909190757.GM19373@ando.pearwood.info>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
Message-ID: <CAN-Kwu15f+849qFz6C+qkxmvxJYpJws3UehyZdtgLOLzF1iFeA@mail.gmail.com>

On Wed, Sep 9, 2015 at 2:07 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Wed, Sep 09, 2015 at 02:55:01PM -0400, random832 at fastmail.us wrote:
>> On Wed, Sep 9, 2015, at 14:31, Tim Peters wrote:
>> > Also over & over again.  If you volunteer to own responsibility for
>> > updating all versions of Python each time it changes (in a crypto
>> > context, an advance in the state of the art implies the prior state
>> > becomes "a bug"), and post a performance bond sufficient to pay
>> > someone else to do it if you vanish, then a major pragmatic objection
>> > would go away ;-)
>>
>> I don't see how "Changing Python's RNG implementation today to
>> arc4random as it exists now" necessarily implies "Making a commitment to
>> guarantee the cryptographic suitability of Python's RNG for all time".
>> Those are two separate things.
>
> Not really. Look at the subject line. It doesn't say "should we change
> from MT to arc4random?", it asks if the default random number generator
> should be secure. The only reason we are considering the change from MT
> to arc4random is to make the PRNG cryptographically secure. "Secure" is
> a moving target, what is secure today will not be secure tomorrow.
>
> Yes, in principle, we could make the change once, then never again. But
> why bother? We don't gain anything from changing to arc4random if there
> is no promise to be secure into the future.

This is a good point. Let's remove the ssl library from Python too.
Until recently, the most widely used versions of Python were all
woefully behind the times and anyone wanting anything relatively
up-to-date had to use a third party library. Even so, if you count the
distributions of RHEL and other "Long Term Support" operating systems
that are running Python 2.7 (pre 2.7.9) and below, most people are
operating with barely secure versions of OpenSSL on versions of Python
that don't have constants for modern secure communications standards.
Clearly, deciding to add the ssl module was a huge mistake because it
wasn't forwards compatible with future security standards.

From random832 at fastmail.us  Fri Sep 11 16:23:52 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 11 Sep 2015 10:23:52 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
Message-ID: <1441981432.3482349.380920585.5A4249C0@webmail.messagingengine.com>

On Fri, Sep 11, 2015, at 09:36, Steven D'Aprano wrote:
> Yes, calling `random.choice` is *significantly better* than calling 
> `random.SomethingRandom().choice`. It's better for beginners, it's even 
> better for expert users whose random needs are small, and those whose 
> needs are greater shouldn't be using the latter anyway.

Why is it that people who need deterministic/seed based random aren't
considered to be "those whose needs are greater"?

From cory at lukasa.co.uk  Fri Sep 11 16:28:35 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 11 Sep 2015 15:28:35 +0100
Subject: [Python-ideas] DRAFT Re: Python's Source of Randomness and the
 random.py module Redux
In-Reply-To: <20150911133613.GW19373@ando.pearwood.info>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <20150911133613.GW19373@ando.pearwood.info>
Message-ID: <CAH_hAJEi5A8LLK94N8E5C-dSLcRbVvMHK2tPxmuPRSkXU-AGiQ@mail.gmail.com>

On 11 September 2015 at 14:36, Steven D'Aprano <steve at pearwood.info> wrote:
> Is this a trick question?
>
> In the absence of any credible attack on the password based on how it
> was generated, of course it is safe.

I feel like I must have misunderstood you Steven. Didn't you just
exclude the attack vector that we're discussing here?

What we are saying is that a deterministic PRNG definitionally allows
attacks on the password based on how it was generated. The very nature
of a deterministic PRNG is that it is possible to predict subsequent
outputs based on previous ones, or at least to dramatically constrain
the search space. This is not a hypothetical attack, and it's not even
a very complicated one.
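
As a rough illustration of how little is involved, the sketch below
clones a Mersenne Twister from 624 consecutive 32-bit outputs by
inverting the MT19937 tempering step. It assumes CPython's random.Random,
that the attacker sees raw getrandbits(32) values, and that the state
tuple has the layout CPython currently uses; the helper names are mine.
Real targets usually leak less per output, so treat this as the
best case for the attacker rather than a complete attack.

```python
import random


def _undo_right_shift_xor(value, shift):
    """Invert `y ^= y >> shift` for a 32-bit word."""
    result = value
    for _ in range(32 // shift + 1):
        result = value ^ (result >> shift)
    return result


def _undo_left_shift_xor(value, shift, mask):
    """Invert `y ^= (y << shift) & mask` for a 32-bit word."""
    result = value
    for _ in range(32 // shift + 1):
        result = value ^ ((result << shift) & mask)
    return result & 0xFFFFFFFF


def untemper(output):
    """Recover the raw MT19937 state word behind one 32-bit output."""
    y = _undo_right_shift_xor(output, 18)
    y = _undo_left_shift_xor(y, 15, 0xEFC60000)
    y = _undo_left_shift_xor(y, 7, 0x9D2C5680)
    return _undo_right_shift_xor(y, 11)


def clone_from_outputs(outputs):
    """Return a random.Random that continues the observed stream."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    words = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # CPython layout: (version, 624 state words + position, gauss_next).
    # Position 624 forces a state regeneration on the next draw.
    clone.setstate((3, words + (624,), None))
    return clone


victim = random.Random()  # seeded from the OS; seed unknown to attacker
observed = [victim.getrandbits(32) for _ in range(624)]

attacker = clone_from_outputs(observed)
predicted = [attacker.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
```

The attacker never learns the seed; the observed outputs alone are
enough to reproduce everything the victim generates afterwards.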

Now, it's possible that the way the system is constructed precludes
this attack, but let me tell you that vastly more engineers think that
about their systems than are actually right about it. Generally, if
the word 'password' appears anywhere near something, you want to keep
a Mersenne Twister as far away from it as possible.

The concern being highlighted in this thread is that users who don't
know what I just said (the vast majority) are at risk of writing
deeply insecure code. We think the default should be changed.

From rosuav at gmail.com  Fri Sep 11 16:33:29 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 12 Sep 2015 00:33:29 +1000
Subject: [Python-ideas] DRAFT Re: Python's Source of Randomness and the
 random.py module Redux
In-Reply-To: <CAH_hAJEi5A8LLK94N8E5C-dSLcRbVvMHK2tPxmuPRSkXU-AGiQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <20150911133613.GW19373@ando.pearwood.info>
 <CAH_hAJEi5A8LLK94N8E5C-dSLcRbVvMHK2tPxmuPRSkXU-AGiQ@mail.gmail.com>
Message-ID: <CAPTjJmrwe3AwsnE-XDSEisMjWND9ro=NKA6SwfudpqRm-aaSRw@mail.gmail.com>

On Sat, Sep 12, 2015 at 12:28 AM, Cory Benfield <cory at lukasa.co.uk> wrote:
> On 11 September 2015 at 14:36, Steven D'Aprano <steve at pearwood.info> wrote:
>> Is this a trick question?
>>
>> In the absence of any credible attack on the password based on how it
>> was generated, of course it is safe.
>
> I feel like I must have misunderstood you Steven. Didn't you just
> exclude the attack vector that we're discussing here?
>
> What we are saying is that a deterministic PRNG definitionally allows
> attacks on the password based on how it was generated.

Only if an attacker can access many passwords generated from the same
MT stream, right? If the entire program is as was posted (importing
random and using random.choice(), then terminating), then an attack
would have to be based on the seeding of the RNG, not on the RNG
itself. There simply isn't enough content being generated for you to
be able to learn the internal state, and even if you did, the next run
of the program will be freshly seeded anyway.

ChrisA

From cory at lukasa.co.uk  Fri Sep 11 16:34:55 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 11 Sep 2015 15:34:55 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F2CF6B.40301@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
Message-ID: <CAH_hAJFTGhn7goLuDuBwODGuW=62jyaKvckS7U6+FVxt=j+E3A@mail.gmail.com>

On 11 September 2015 at 13:56, M.-A. Lemburg <mal at egenix.com> wrote:
> The random module seeds its global Random instance using urandom
> (if available on the system), so while the generator itself is
> deterministic, the seed used to kick off the pseudo-random series
> is not. For many purposes, this is secure enough.

Secure enough for what purposes? Certainly not generating a password,
or anything that is 'password equivalent' (e.g. session cookies).

As you acknowledge in the latter portion of your email, one can
predict the future output of a Mersenne Twister by observing lots of
previous values. If I get to see the output of your RNG, I can
dramatically constrain the search space of other things it generated.
It is not hard to see how you can mount a pretty trivial attack
against web software using this,


> It's also easy to make the output of the random instance more
> secure by passing it through a crypto hash function.

Or...just use a CSPRNG and save yourself the computation overhead of
the hash? Besides, anyone who knows enough to hash their random
numbers surely knows enough to use a CSPRNG, so who does this help?
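
For reference, a minimal sketch of what "just use a CSPRNG" looks like
with what the stdlib already ships: random.SystemRandom draws from
os.urandom and exposes the same methods as the module-level functions.
The make_token name, the alphabet, and the default length are
assumptions for the example.

```python
import random
import string

# Alphabet and length are arbitrary choices for this sketch.
ALPHABET = string.ascii_letters + string.digits


def make_token(length=20, rng=None):
    """Build a token from `rng`, defaulting to the OS-backed generator."""
    if rng is None:
        rng = random.SystemRandom()  # backed by os.urandom; cannot be seeded
    return "".join(rng.choice(ALPHABET) for _ in range(length))


token = make_token()
```

The only difference from the MT version is which object choice() is
called on, and no hashing step is needed.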


> Perhaps it's time to switch to a better version of MT, e.g.
> a 64-bit version (with 64-bit internal state):
>
>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt64.html
>
> or an even faster SIMD variant with better properties and
> 128 bit internal state:
>
>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
>
> Esp. the latter will help make brute force attacks practically
> impossible.

Or, we can move to a CSPRNG and stop trying to move the goalposts on
MT? Or, do both? Using a better Mersenne Twister does not mean we
shouldn't switch the default.

From cory at lukasa.co.uk  Fri Sep 11 16:38:12 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 11 Sep 2015 15:38:12 +0100
Subject: [Python-ideas] DRAFT Re: Python's Source of Randomness and the
 random.py module Redux
In-Reply-To: <CAPTjJmrwe3AwsnE-XDSEisMjWND9ro=NKA6SwfudpqRm-aaSRw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <20150911133613.GW19373@ando.pearwood.info>
 <CAH_hAJEi5A8LLK94N8E5C-dSLcRbVvMHK2tPxmuPRSkXU-AGiQ@mail.gmail.com>
 <CAPTjJmrwe3AwsnE-XDSEisMjWND9ro=NKA6SwfudpqRm-aaSRw@mail.gmail.com>
Message-ID: <CAH_hAJE_0um6L63LmKyXSy6iQYMm0B1bLh9_MeNvi3Y+SKUQbw@mail.gmail.com>

On 11 September 2015 at 15:33, Chris Angelico <rosuav at gmail.com> wrote:
> Only if an attacker can access many passwords generated from the same
> MT stream, right? If the entire program is as was posted (importing
> random and using random.choice(), then terminating), then an attack
> would have to be based on the seeding of the RNG, not on the RNG
> itself. There simply isn't enough content being generated for you to
> be able to learn the internal state, and even if you did, the next run
> of the program will be freshly seeded anyway.

Sure, if the entire program is as posted, but we should probably
assume it isn't. Some programs definitely are, but I'm not worried
about them: I'm worried about the ones that aren't.

From donald at stufft.io  Fri Sep 11 16:38:24 2015
From: donald at stufft.io (Donald Stufft)
Date: Fri, 11 Sep 2015 10:38:24 -0400
Subject: [Python-ideas] DRAFT Re: Python's Source of Randomness and the
 random.py module Redux
In-Reply-To: <CAPTjJmrwe3AwsnE-XDSEisMjWND9ro=NKA6SwfudpqRm-aaSRw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <20150911133613.GW19373@ando.pearwood.info>
 <CAH_hAJEi5A8LLK94N8E5C-dSLcRbVvMHK2tPxmuPRSkXU-AGiQ@mail.gmail.com>
 <CAPTjJmrwe3AwsnE-XDSEisMjWND9ro=NKA6SwfudpqRm-aaSRw@mail.gmail.com>
Message-ID: <etPan.55f2e760.7ea8daac.31bc@Draupnir.home>


On September 11, 2015 at 10:33:55 AM, Chris Angelico (rosuav at gmail.com) wrote:
> On Sat, Sep 12, 2015 at 12:28 AM, Cory Benfield wrote:
> > On 11 September 2015 at 14:36, Steven D'Aprano wrote:
> >> Is this a trick question?
> >>
> >> In the absence of any credible attack on the password based on how it
> >> was generated, of course it is safe.
> >
> > I feel like I must have misunderstood you Steven. Didn't you just
> > exclude the attack vector that we're discussing here?
> >
> > What we are saying is that a deterministic PRNG definitionally allows
> > attacks on the password based on how it was generated.
>  
> Only if an attacker can access many passwords generated from the same
> MT stream, right? If the entire program is as was posted (importing
> random and using random.choice(), then terminating), then an attack
> would have to be based on the seeding of the RNG, not on the RNG
> itself. There simply isn't enough content being generated for you to
> be able to learn the internal state, and even if you did, the next run
> of the program will be freshly seeded anyway.


This is silly. Take that code, stick it in a web application, and have it
generating API keys or session identifiers instead of passwords, or hell, even
passwords or random tokens to reset passwords or any other such thing.

Suddenly you have a case where you have a persistent process, so there isn't a
new seed, and the attacker can more or less request an unlimited number of
outputs. This isn't some mind-bogglingly uncommon case.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From steve at pearwood.info  Fri Sep 11 16:44:47 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 12 Sep 2015 00:44:47 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <ED448F53-3F4D-4F0F-B343-601A4524A097@yahoo.com>
 <5DBE2F72-DAB1-43D3-97F5-318D480E91FE@yahoo.com>
 <CADiSq7faaOJbFRJpSpsaiUWZR7adS1yPCF-P=sdM+1p=8b=OPw@mail.gmail.com>
 <AF5EBC52-F420-4051-AE62-1EFBB8999A44@yahoo.com>
 <CADiSq7eDP6A+37c36gCnGW72C6vQSsU+pf9iUBMpzd+7pLOG+A@mail.gmail.com>
 <87si6lzhsh.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20150911144447.GZ19373@ando.pearwood.info>

On Fri, Sep 11, 2015 at 01:44:30PM +0900, Stephen J. Turnbull wrote:

> I suppose it would be too magic to have the seed method substitute the
> traditional PRNG for the default, while an implicitly seeded RNG
> defaults to a crypto strong algorithm?

Yes, too much magic.


-- 
Steve

From steve at pearwood.info  Fri Sep 11 16:53:26 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 12 Sep 2015 00:53:26 +1000
Subject: [Python-ideas] DRAFT Re: Python's Source of Randomness and the
	random.py module Redux
In-Reply-To: <20150911133613.GW19373@ando.pearwood.info>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <20150911133613.GW19373@ando.pearwood.info>
Message-ID: <20150911145326.GA19373@ando.pearwood.info>

Ah crap. 

Sorry folks, this post was *not supposed to go to the list* in this 
state. I'm having some trouble with my mail client (mutt) not saving 
drafts, so I intended to email it to myself for later editing, and 
didn't notice that the list was CCed.

On Fri, Sep 11, 2015 at 11:36:13PM +1000, Steven D'Aprano wrote:
[...]



-- 
Steve

From steve at pearwood.info  Fri Sep 11 17:41:25 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 12 Sep 2015 01:41:25 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <1441981432.3482349.380920585.5A4249C0@webmail.messagingengine.com>
References: <1441981432.3482349.380920585.5A4249C0@webmail.messagingengine.com>
Message-ID: <20150911154125.GC19373@ando.pearwood.info>

Random832,

You appear to have edited the subject line to remove the word "DRAFT". 
As I explained in an earlier post, that message was a draft and not 
intended to go to the list.

Nevertheless, I will respond to your question below.


On Fri, Sep 11, 2015 at 10:23:52AM -0400, random832 at fastmail.us wrote:
> On Fri, Sep 11, 2015, at 09:36, Steven D'Aprano wrote:
> > Yes, calling `random.choice` is *significantly better* than calling 
> > `random.SomethingRandom().choice`. It's better for beginners, it's even 
> > better for expert users whose random needs are small, and those whose 
> > needs are greater shouldn't be using the latter anyway.
> 
> Why is it that people who need deterministic/seed based random aren't
> considered to be "those whose needs are greater"?

I didn't say that. Read again: I give three groups of people:

- Beginners, who are best served by calling `random.choice` rather than 
  `random.SomethingRandom().choice`.

- Those who are experts *and also* have "small" needs. I didn't define 
  "small needs" because (1) I thought it was obvious in context and (2) 
  the post was a draft and still in progress. What I mean by small needs 
  is that they don't care about reproducibility, security, or having 
  multiple independent PRNGs.

- Those who *do* have "greater" needs, whether expert or not. Again, I 
  thought in context it would be clear that greater needs includes such 
  things as reproducibility, security or multiple independent PRNGs.

In no case that I know of is it a good thing to be creating a brand-new 
instance for each and every call to the PRNG. At best, it is harmless, 
and only a little inefficient. At worst, it is a lot inefficient, and 
potentially may affect the reproducibility, security or statistical 
properties of the random numbers you generate.
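
A small sketch of the worst case, assuming the caller passes a fixed
seed to each new instance (the seed value and function names here are
made up for the example): every call restarts the stream, so every
"random" roll is the same number. Reusing one instance avoids that.

```python
import random

SEED = 2015  # arbitrary fixed seed, chosen only to make the effect visible


def roll_fresh_instance():
    """Create a brand-new, identically seeded generator on every call."""
    return random.Random(SEED).randint(1, 6)


_rng = random.Random(SEED)  # created once, reused for every call


def roll_shared_instance():
    """Draw from the single long-lived generator."""
    return _rng.randint(1, 6)


fresh = [roll_fresh_instance() for _ in range(20)]
shared = [roll_shared_instance() for _ in range(20)]
```

Both lists start with the same value, but `fresh` then repeats it twenty
times while `shared` advances through the stream as intended.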



-- 
Steve

From random832 at fastmail.us  Fri Sep 11 17:55:38 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 11 Sep 2015 11:55:38 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <20150911154125.GC19373@ando.pearwood.info>
References: <1441981432.3482349.380920585.5A4249C0@webmail.messagingengine.com>
 <20150911154125.GC19373@ando.pearwood.info>
Message-ID: <1441986938.3505841.381040833.33561D29@webmail.messagingengine.com>

On Fri, Sep 11, 2015, at 11:41, Steven D'Aprano wrote:
> Random832,
> 
> You appear to have edited the subject line to remove the word "DRAFT". 
> As I explained in an earlier post, that message was a draft and not 
> intended to go to the list.

Sorry about that... I didn't see it until I went to send it, and I'd had
some issues on my client causing me to have to fish my reply out of my
own drafts folder; I assumed the presence of the word "DRAFT" was
related to that and didn't realize it was on your original message.

From wes.turner at gmail.com  Fri Sep 11 18:05:03 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Fri, 11 Sep 2015 11:05:03 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CACfEFw-Ljwoa2OohFt0L0ZKbE7Qj_EtnK=Dj4zpMcLcdnYFaKw@mail.gmail.com>

On Thu, Sep 10, 2015 at 9:07 PM, Stephen J. Turnbull <stephen at xemacs.org>
wrote:

> Executive summary:
>
> The question is, "what value is there in changing the default to be
> crypto strong to protect future security-sensitive applications from
> naive implementers vs. the costs to current users who need to rewrite
> their applications to explicitly invoke the current default?"
>

* [ ] DOC: note regarding the 'pseudo' part of pseudorandom and MT
  * https://docs.python.org/2/library/random.html
  * https://docs.python.org/3/library/random.html

* [ ] DOC: upgrade cryptography docs in re: random numbers
  * https://cryptography.io/en/latest/random-numbers/


* [ ] ENH: random_(algo=) (~IRandomSource)
* [ ] ENH: Add arc4random
* [ ] ENH: Add chacha
* [ ] ENH: Add OpenBSD's

* [ ] BUG,SEC: new secure default named random.random
  * must also be stateful / **reproducible** (must support .seed)
  * justified as BUG,SEC because: [secure by default is the answer]
    * https://en.wikipedia.org/wiki/Session_fixation
      * https://cwe.mitre.org/data/definitions/384.html
        * The docs did not say "you should know better."
      * see also: hash collisions: https://bugs.python.org/issue13703

  * [ ] REF: random.random -> random.random_old




>
> M.-A. Lemburg writes:
>
>  > I'm pretty sure people doing crypto will know and most others
>  > simply don't care :-)
>
> Which is why botnets have millions of nodes.  People who do web
> security evidently believe that inappropriate RNGs have something to
> do with widespread security issues.  (That doesn't mean they're right,
> but it gives me pause for thought -- evidently, Guido thought so too!)
>
>  > Evidence: We used a Wichmann-Hill PRNG as default in random
>  > for a decade and people still got their work done.
>
> The question is not whether people get their work done.  People work
> (unless they're seriously dysfunctional), that's what people do.
> Especially programmers (cf. GNU Manifesto).  The question is whether
> the work of the *crackers* is made significantly easier by security
> holes that are opened by inappropriate use of random.random.
>
> I tend to agree with Steven d'A. (and others) that the answer is no:
> it doesn't matter if the kind of person who leaves a key under the
> third flowerpot from the left also habitually leaves the door unlocked
> (especially if "I'm only gonna be gone for 5 minutes"), and I think
> that's likely.  IOW, installing crypto strong RNGs as default is *not*
> analogous to the changes to SSL support that were so important that
> they were backported to 2.7 in a late patch release.
>
> OTOH, why default to crypto weak if crypto strong is easily available?
> You might save a few million Debian users from having to regenerate
> all their SSH keys.[1]
>
> But the people who are "just getting work done" in new programs *won't
> notice*.  I don't think that they care what's under the hood of
> random.random, as long as (1) the API stays the same, and (2) the
> documentation clearly indicates where to find PRNGs that support
> determinism, jumpahead, replicability, and all those other good
> things, for the needs they don't have now but know they probably
> will have some day.  The rub is, as usual, existing applications that
> would have to be changed for no reason that is relevant to them.
>
> Note that arc4random is much simpler to use than random.random.  No
> knobs to tweak or seeds to store for future reference.  Seems
> perfectly suited to "just getting work" done to me.  OTOH, if you have
> an application where you need replicability, jumpahead, etc, you're
> going to need to read the docs enough to find the APIs for seeding and
> so on.  At design time, I don't see why it would hurt to select an
> RNG algorithm explicitly as well.
>
>  > Why not add ssl.random() et al. (as interface to the OpenSSL
>  > rand APIs) ?
>
> I like that naming proposal.  I'm sure changing the nature of
> random.random would annoy the heck out of *many* users.
>
> An alternative would be to add random.crypto.
>
>  > Some background on why I think deterministic RNGs are more
>  > useful to have as default than non-deterministic ones:
>  >
>  > A common use case for me is to write test data generators
>  > for large database systems. For such generators, I don't keep
>  > the many GBs data around, but instead make the generator take a
>  > few parameters which then seed the RNGs, the time module and
>  > a few other modules via monkey-patching.
>
> If you've gone to that much effort, you evidently have read the docs
> and it wouldn't have been a huge amount of trouble to use a
> non-default module with a specified PRNG -- if you were doing it now.
> But you have every right to be very peeved if you have a bunch of old
> test runs you want to replicate with a new version of Python, and
> we've changed the random.random RNG on you.
>
>
>
> Footnotes:
> [1]  I hasten to add that a programmer who isn't as smart as he thinks
> he is who "improves" a crypto algorithm is far more likely than that
> the implementer of a crypto suite would choose an RNG that is
> inappropriate by design.  Still, it's a theoretical possibility, and
> security is about eliminating every theoretical possibility you can
> think of.
>

From steve at pearwood.info  Fri Sep 11 18:08:09 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 12 Sep 2015 02:08:09 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
References: <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
Message-ID: <20150911160809.GD19373@ando.pearwood.info>

On Wed, Sep 09, 2015 at 09:23:23PM -0500, Tim Peters wrote:

> [Steven D'Aprano]
> > The default MT is certainly deterministic, and although only the output
> > of random() itself is guaranteed to be reproducible, the other methods
> > are *usually* stable in practice.
> >
> > There's a jumpahead method too,
> 
> Not in Python.  

It is there, up to Python 2.7. I hadn't noticed it was gone in Python 3.


-- 
Steve

From mal at egenix.com  Fri Sep 11 18:14:39 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 11 Sep 2015 18:14:39 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAH_hAJFTGhn7goLuDuBwODGuW=62jyaKvckS7U6+FVxt=j+E3A@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>	<55F2CF6B.40301@egenix.com>
 <CAH_hAJFTGhn7goLuDuBwODGuW=62jyaKvckS7U6+FVxt=j+E3A@mail.gmail.com>
Message-ID: <55F2FDEF.9010000@egenix.com>

On 11.09.2015 16:34, Cory Benfield wrote:
> On 11 September 2015 at 13:56, M.-A. Lemburg <mal at egenix.com> wrote:
>> The random module seeds its global Random instance using urandom
>> (if available on the system), so while the generator itself is
>> deterministic, the seed used to kick off the pseudo-random series
>> is not. For many purposes, this is secure enough.
> 
> Secure enough for what purposes? Certainly not generating a password,
> or anything that is 'password equivalent' (e.g. session cookies).
> 
> As you acknowledge in the latter portion of your email, one can
> predict the future output of a Mersenne Twister by observing lots of
> previous values. If I get to see the output of your RNG, I can
> dramatically constrain the search space of other things it generated.
> It is not hard to see how you can mount a pretty trivial attack
> against web software using this,

In theory, yes; in practice it's not all that easy. I suggest
giving untwister a try... it started by telling me it needs
about 1.5 hours, then flipped to more than a year, and now it's
back to 6 hours. I'll leave it running for a while to see whether
it finishes today :-)

>> It's also easy to make the output of the random instance more
>> secure by passing it through a crypto hash function.
> 
> Or...just use a CSPRNG and save yourself the computation overhead of
> the hash? Besides, anyone who knows enough to hash their random
> numbers surely knows enough to use a CSPRNG, so who does this help?

There's a difference between taking a pseudo-random
number generator and applying a hash to it vs. using
a CSPRNG:

A CSPRNG will add entropy to its state at regular intervals, so
there's no such thing as a seeded sequence.

An RNG + hash still has the nice property that the sequence
can be reproduced given the seed, but makes it
much harder to determine the seed (brute force can be
made arbitrarily hard via the hash function).

The entropy in the output of the second variant is constant
(only defined by the initial seed and the hash parameters),
while it constantly increases in the CSPRNG.
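
A minimal sketch of the "RNG + hash" variant, assuming SHA-256 as the
hash and 256-bit draws from a seeded random.Random. The name
hashed_stream is made up for illustration. It only shows that the
sequence stays reproducible from the seed; it is not a vetted
construction and does not by itself turn MT into a CSPRNG.

```python
import hashlib
import random


def hashed_stream(seed, count):
    """Return `count` hex digests derived from a seeded Mersenne Twister."""
    rng = random.Random(seed)  # deterministic for a given seed
    digests = []
    for _ in range(count):
        # Draw 256 raw bits, then hide them behind a one-way hash.
        raw = rng.getrandbits(256).to_bytes(32, "big")
        digests.append(hashlib.sha256(raw).hexdigest())
    return digests


# Same seed, same sequence; a different seed gives a different one.
print(hashed_stream(42, 3) == hashed_stream(42, 3))
print(hashed_stream(42, 3) == hashed_stream(43, 3))
```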

Some more background on this:
https://en.wikipedia.org/wiki/Entropy_%28information_theory%29

>> Perhaps it's time to switch to a better version of MT, e.g.
>> a 64-bit version (with 64-bit internal state):
>>
>>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt64.html
>>
>> or an even faster SIMD variant with better properties and
>> 128 bit internal state:
>>
>>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
>>
>> Esp. the latter will help make brute force attacks practically
>> impossible.
> 
> Or, we can move to a CSPRNG and stop trying to move the goalposts on
> MT? Or, do both? Using a better Mersenne Twister does not mean we
> shouldn't switch the default.

I think it's worthwhile exposing the CSPRNG from OpenSSL via
the ssl module (see one of my earlier posts in this thread).

People who need something as secure as their SSL implementation,
can then get secure random numbers, while kids implementing coin
flipping games can continue to use the well established API
of the random module.

Switching to a CSPRNG in random would break the API, since some of
the functions in the API would no longer be available (e.g.
random.seed(), random.getstate(), random.setstate()).
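
The existing random.SystemRandom class (backed by os.urandom) already
shows what that API break looks like. A short sketch, assuming a
platform where os.urandom is available:

```python
import random

sys_rng = random.SystemRandom()

# seed() is accepted for API compatibility but has no effect.
sys_rng.seed(12345)
first = sys_rng.random()
sys_rng.seed(12345)
second = sys_rng.random()
print(first, second)  # two unrelated values despite the identical "seed"

# getstate()/setstate() are not supported at all.
try:
    sys_rng.getstate()
    state_supported = True
except NotImplementedError:
    state_supported = False
print(state_supported)
```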

PS: Apart from the API issue, the default RNG in random would
also have to be equidistributed and uniform; otherwise the
derived distributions available in the module would no longer
satisfy their expected distribution qualities. This is not needed
when all you're interested in is getting some non-predictable random
number for use in a session key :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 11 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> mxODBC Plone/Zope Database Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2015-09-18: PyCon UK 2015 ...                               7 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From kramm at google.com  Fri Sep 11 18:18:32 2015
From: kramm at google.com (Matthias Kramm)
Date: Fri, 11 Sep 2015 09:18:32 -0700 (PDT)
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAA_f+LyB2Vt2s4_8HkGoeP0-M3LX=QfOjpDGAJod8zpZ0ACE0w@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <9683c40d-b662-4b77-947e-62c418be8468@googlegroups.com>
 <CAA_f+LyB2Vt2s4_8HkGoeP0-M3LX=QfOjpDGAJod8zpZ0ACE0w@mail.gmail.com>
Message-ID: <afb12daa-900b-4676-8e02-d3ccfedf2cba@googlegroups.com>

On Thursday, September 10, 2015 at 11:38:48 PM UTC-7, Jukka Lehtosalo wrote:
>
> The proposal doesn't spell out the rules for subtyping, but we should 
> follow the ordinary rules for subtyping for functions, and return types 
> would behave covariantly. So the answer is yes.
>

Ok. Note that this introduces some weird corner cases when trying to decide 
whether a class implements a protocol.

Consider

class P(Protocol):
  def f(self) -> 'P': ...

class A:
  def f(self) -> 'A': ...

It would be just as valid to say that A *does not* implement P (because the 
return value of f is incompatible with P) as to say that A 
*does* implement it (because once it does, the return value of f becomes 
compatible with P).

For a more quirky example, consider

class A(Protocol):
    def f(self) -> 'B': ...
    def g(self) -> str: ...

class B(Protocol):
    def f(self) -> 'A': ...
    def g(self) -> float: ...

class C:
  def f(self) -> 'D': return self.x
  def g(self): return self.y

class D:
  def f(self) -> C: return self.x
  def g(self): return self.y

Short of introducing intersection types, the protocols A and B are 
incompatible (because the return types of g() are mutually exclusive). 
Hence, C and D can, respectively, conform to either A or B, but not both.
So the possible assignments are:
C -> A
D -> B
*or*
C -> B
D -> A
.
It seems undecidable which of the two is the right one.
(The structural type converter in pytype solves this by dropping the 
"mutually exclusive" constraint to the floor and making A and B both a C 
*and* a D, which you can do if all you want is a name for an anonymous 
structural type. But here you're using your structural types in type 
declarations, so that solution doesn't apply)
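
For illustration only, the self-referential case can be written down
with typing.Protocol, which was added to the standard library later
(Python 3.8) and did not exist at the time of this thread. The class
NoF is made up for the example. A runtime isinstance() check through
runtime_checkable only looks for the presence of a method named f and
ignores the return annotations, so it sidesteps the circularity rather
than resolving it:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class P(Protocol):
    def f(self) -> "P": ...


class A:
    def f(self) -> "A":
        return self


class NoF:
    """Hypothetical class without an f() method."""


# Only the existence of `f` is checked at runtime, not its signature.
print(isinstance(A(), P))
print(isinstance(NoF(), P))
```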

Matthias


From emile at fenx.com  Fri Sep 11 18:18:52 2015
From: emile at fenx.com (Emile van Sebille)
Date: Fri, 11 Sep 2015 09:18:52 -0700
Subject: [Python-ideas] Round division
In-Reply-To: <msts84$esq$1@ger.gmane.org>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org> <20150911031304.GT19373@ando.pearwood.info>
 <msts84$esq$1@ger.gmane.org>
Message-ID: <msuute$cmr$1@ger.gmane.org>

On 9/10/2015 11:27 PM, Serhiy Storchaka wrote:
> On 11.09.15 06:13, Steven D'Aprano wrote:
>> How does this differ from round(a/b)? round() also rounds to even.
>
>  >>> round(5000000000000000/9999999999999999)
> 0
>  >>> round(14999999999999999/10000000000000000)
> 2
>
> But fractions 5000000000000000/9999999999999999 > 1/2 and
> 14999999999999999/10000000000000000 < 3/2.
>

Wow -- I'm glad I work predominately in business environments and keep 
amounts in pennies.  The only time I need to round anything is to the 
nearest cent.
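
The surprise in the quoted session comes from a/b being rounded to the
nearest float before round() ever sees it. A small sketch comparing
that with exact rational arithmetic via fractions.Fraction (this is
not the operator proposed in this thread, just a way to see the exact
answers):

```python
from fractions import Fraction

cases = [
    (5000000000000000, 9999999999999999),
    (14999999999999999, 10000000000000000),
]

for a, b in cases:
    via_float = round(a / b)       # a / b is rounded to a float first
    exact = round(Fraction(a, b))  # exact rational, rounded half to even
    print(a, b, via_float, exact)
```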

Emile



From guido at python.org  Fri Sep 11 18:30:35 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 11 Sep 2015 09:30:35 -0700
Subject: [Python-ideas] Round division
In-Reply-To: <msuute$cmr$1@ger.gmane.org>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org> <20150911031304.GT19373@ando.pearwood.info>
 <msts84$esq$1@ger.gmane.org> <msuute$cmr$1@ger.gmane.org>
Message-ID: <CAP7+vJK5Vqi_+SXQtzi0GeMFw3-XMT4eG-7OjWyqnt21LUL6rA@mail.gmail.com>

On Fri, Sep 11, 2015 at 9:18 AM, Emile van Sebille <emile at fenx.com> wrote:

> On 9/10/2015 11:27 PM, Serhiy Storchaka wrote:
>
>> On 11.09.15 06:13, Steven D'Aprano wrote:
>>
>>> How does this differ from round(a/b)? round() also rounds to even.
>>>
>>
>>  >>> round(5000000000000000/9999999999999999)
>> 0
>>  >>> round(14999999999999999/10000000000000000)
>> 2
>>
>> But fractions 5000000000000000/9999999999999999 > 1/2 and
>> 14999999999999999/10000000000000000 < 3/2.
>>
>>
> Wow -- I'm glad I work predominately in business environments and keep
> amounts in pennies.  The only time I need to round anything is to the
> nearest cent.
>

I thought any programmer worth their salt would round down (i.e. trunc())
and transfer the fractional penny to their own account? :-)

-- 
--Guido van Rossum (python.org/~guido)

From tim.peters at gmail.com  Fri Sep 11 18:36:55 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 11 Sep 2015 11:36:55 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F2CF6B.40301@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
Message-ID: <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>

[M.-A. Lemburg <mal at egenix.com>]
> ...
> Now, leaving aside this bright future, what's reasonable today ?
>
> If you look at tools like untwister:
>
>     https://github.com/bishopfox/untwister
>
> you can get a feeling for how long it takes to deduce the
> seed from an output sequence. Bear in mind that, in order
> to be reasonably sure that the seed is correct, the available
> output sequence has to be long enough.
>
> That's a known plain text attack, so you need access to lots
> of session keys to begin with.
>
> The tool is still running on an example set of 1000 32-bit
> numbers and it says it'll be done in 1.5 hours, i.e. before
> the sun goes down in my timezone. I'll leave it running to
> see whether it can find my secret key.

I'm only going to talk about current Python 3, because _any_ backward
incompatible change is off limits for a bugfix release.

So:

1. untwister appears _mostly_ to be probing for poor seeding schemes.
Python 3's default "by magic" seeding is unimpeachable ;-)  It's
computationally infeasible to attack it.

2. If they knew they were targeting MT, and had 624 consecutive 32-bit
outputs, they could compute MT's full internal state essentially
instantly.

#2 is hard to get, though.  These "pick a password" examples are only
using a relative handful of bits from each 32-bit MT output.  Attacks
with such spotty info are "exponentially harder".
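
A sketch of point 2, assuming the attacker really does see 624
consecutive full 32-bit outputs (here via getrandbits(32)). The helper
names are made up; the tempering constants are the standard MT19937
ones. Each output is "untempered" back into a raw state word, and the
words are loaded into a second generator with setstate():

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    """Undo MT19937's output tempering, recovering one raw state word."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


victim = random.Random()  # default seeding, unknown to the "attacker"
observed = [victim.getrandbits(32) for _ in range(624)]

# State layout (version 3): 624 words plus the index of the next word.
clone = random.Random()
clone.setstate((3, tuple(untemper(y) for y in observed) + (624,), None))

predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]
print(predicted == actual)
```

With only a few bits of each output visible, this direct inversion no
longer applies, which is the "exponentially harder" case.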


> Untwister is only slightly smarter than bruteforce. Given
> that MT has a seed size of 32 bits, it's not surprising that
> a tool can find the seed within a day.

No no no.  MT's state is 19937 bits, and current .seed()
implementations use every bit you pass to .seed().

By default, current Python seeds the state with 2500 bytes (20000
bits) from the system .urandom() (if available).  That's why it's
computationally infeasible for "poor seeding" searches to attack the
default seeding:  they have a space of 2**19937-1 (the all-0 state
can't occur) to search through, each of which is equally likely
(assuming the system .urandom() is doing _its_ job).

Of course the user can screw that up by using their _own_ seed.  But,
by default, current Pythons already do the best possible seeding job.
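
A quick sketch of both claims, assuming CPython 3: integer seeds are
not truncated to 32 bits, and the default seeding can be imitated by
hand with os.urandom (the 2500-byte figure is the one given above):

```python
import os
import random

# Two integer seeds that agree in their low 32 bits still give
# different streams, because every bit of the seed is used.
low = random.Random(1)
high = random.Random(1 + 2**32)
differs = low.getrandbits(64) != high.getrandbits(64)
print(differs)

# Doing explicitly what the default seeding is described as doing.
rng = random.Random(int.from_bytes(os.urandom(2500), "big"))
sample = rng.random()
print(sample)
```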


> Perhaps it's time to switch to a better version of MT, e.g.
> a 64-bit version (with 64-bit internal state):
>
>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt64.html
>
> or an even faster SIMD variant with better properties and
> 128 bit internal state:
>
>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
>
> Esp. the latter will help make brute force attacks practically
> impossible.
>
> Tim ?

We already have a 19937-bit internal state, and current seeding
schemes don't hide that.

I would like to move to a different generator entirely someday, but
not before some specific better-than-MT alternative gains significant
traction outside the Python world ("better a follower than a leader"
in this area).


> BTW: Looking at the sources of the _random module, I found that
> the seed function uses the hash of non-integers such as e.g.
> strings passed to it as seeds. Given the hash randomization
> for strings this will create non-deterministic results, so it's
> probably wise to only use 32-bit integers as seed values for
> portability, if you need to rely on seeding the global Python
> RNG.

None of that applies to Python 3.  `seed()` string inputs go through
this path now:

            if isinstance(a, (str, bytes, bytearray)):
                if isinstance(a, str):
                    a = a.encode()
                a += _sha512(a).digest()
                a = int.from_bytes(a, 'big')
        super().seed(a)

IOW, a crypto hash is _appended_ to the string, but no info from the
original string is lost (but, if you ask me, this particular step is
useless - it adds no "new entropy").  The whole mess is converted to a
giant integer, again with no loss of input information.  And every bit
of the giant integer affects what `super().seed(a)` does.
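
A sketch that redoes that conversion by hand and checks it against
seeding with the string directly, assuming the default (version 2)
seeding of CPython 3. The seed text is arbitrary:

```python
import hashlib
import random

text = "python-ideas"

# Seeding with the string directly.
by_string = random.Random(text)

# Redoing the quoted conversion by hand, then seeding with the integer.
raw = text.encode()
raw += hashlib.sha512(raw).digest()
by_int = random.Random(int.from_bytes(raw, "big"))

same_stream = (
    [by_string.random() for _ in range(5)]
    == [by_int.random() for _ in range(5)]
)
print(same_stream)
```

Since no string hash() is involved, hash randomization does not affect
the result.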

From wes.turner at gmail.com  Fri Sep 11 18:53:46 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Fri, 11 Sep 2015 11:53:46 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CACfEFw-Ljwoa2OohFt0L0ZKbE7Qj_EtnK=Dj4zpMcLcdnYFaKw@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CACfEFw-Ljwoa2OohFt0L0ZKbE7Qj_EtnK=Dj4zpMcLcdnYFaKw@mail.gmail.com>
Message-ID: <CACfEFw9jqT4gqCDX8yqV2fB_vPrF28asNUfYsu7XALXMhw6Kyw@mail.gmail.com>

https://docs.python.org/devguide/documenting.html#security-considerations-and-other-concerns

On Fri, Sep 11, 2015 at 11:05 AM, Wes Turner <wes.turner at gmail.com> wrote:

>
>
> On Thu, Sep 10, 2015 at 9:07 PM, Stephen J. Turnbull <stephen at xemacs.org>
> wrote:
>
>> Executive summary:
>>
>> The question is, "what value is there in changing the default to be
>> crypto strong to protect future security-sensitive applications from
>> naive implementers vs. the costs to current users who need to rewrite
>> their applications to explicitly invoke the current default?"
>>
>
> * [ ] DOC: note regarding the 'pseudo' part of pseudorandom and MT
>   * https://docs.python.org/2/library/random.html
>   * https://docs.python.org/3/library/random.html
>
> * [ ] DOC: upgrade cryptography docs in re: random numbers
>   * https://cryptography.io/en/latest/random-numbers/
>
>
> * [ ] ENH: random_(algo=) (~IRandomSource)
> * [ ] ENH: Add arc4random
> * [ ] ENH: Add chacha
> * [ ] ENH: Add OpenBSD's
>
> * [ ] BUG,SEC: new secure default named random.random
>   * must also be stateful / **reproducible** (must support .seed)
>   * justified as BUG,SEC because: [secure by default is the answer]
>     * https://en.wikipedia.org/wiki/Session_fixation
>       * https://cwe.mitre.org/data/definitions/384.html
>         * The docs did not say "you should know better."
>       * see also: hash collisions: https://bugs.python.org/issue13703
>
>   * [ ] REF: random.random -> random.random_old
>
>
>
>
>>
>> M.-A. Lemburg writes:
>>
>>  > I'm pretty sure people doing crypto will know and most others
>>  > simply don't care :-)
>>
>> Which is why botnets have millions of nodes.  People who do web
>> security evidently believe that inappropriate RNGs have something to
>> do with widespread security issues.  (That doesn't mean they're right,
>> but it gives me pause for thought -- evidently, Guido thought so too!)
>>
>>  > Evidence: We used a Wichmann-Hill PRNG as default in random
>>  > for a decade and people still got their work done.
>>
>> The question is not whether people get their work done.  People work
>> (unless they're seriously dysfunctional), that's what people do.
>> Especially programmers (cf. GNU Manifesto).  The question is whether
>> the work of the *crackers* is made significantly easier by security
>> holes that are opened by inappropriate use of random.random.
>>
>> I tend to agree with Steven d'A. (and others) that the answer is no:
>> it doesn't matter if the kind of person who leaves a key under the
>> third flowerpot from the left also habitually leaves the door unlocked
>> (especially if "I'm only gonna be gone for 5 minutes"), and I think
>> that's likely.  IOW, installing crypto strong RNGs as default is *not*
>> analogous to the changes to SSL support that were so important that
>> they were backported to 2.7 in a late patch release.
>>
>> OTOH, why default to crypto weak if crypto strong is easily available?
>> You might save a few million Debian users from having to regenerate
>> all their SSH keys.[1]
>>
>> But the people who are "just getting work done" in new programs *won't
>> notice*.  I don't think that they care what's under the hood of
>> random.random, as long as (1) the API stays the same, and (2) the
>> documentation clearly indicates where to find PRNGs that support
>> determinism, jumpahead, replicability, and all those other good
>> things, for the needs they don't have now but know they probably
>> will have some day.  The rub is, as usual, existing applications that
>> would have to be changed for no reason that is relevant to them.
>>
>> Note that arc4random is much simpler to use than random.random.  No
>> knobs to tweak or seeds to store for future reference.  Seems
>> perfectly suited to "just getting work" done to me.  OTOH, if you have
>> an application where you need replicability, jumpahead, etc, you're
>> going to need to read the docs enough to find the APIs for seeding and
>> so on.  At design time, I don't see why it would hurt to select an
>> RNG algorithm explicitly as well.
>>
>>  > Why not add ssl.random() et al. (as interface to the OpenSSL
>>  > rand APIs) ?
>>
>> I like that naming proposal.  I'm sure changing the nature of
>> random.random would annoy the heck out of *many* users.
>>
>> An alternative would be to add random.crypto.
>>
>>  > Some background on why I think deterministic RNGs are more
>>  > useful to have as default than non-deterministic ones:
>>  >
>>  > A common use case for me is to write test data generators
>>  > for large database systems. For such generators, I don't keep
>>  > the many GBs data around, but instead make the generator take a
>>  > few parameters which then seed the RNGs, the time module and
>>  > a few other modules via monkey-patching.
>>
>> If you've gone to that much effort, you evidently have read the docs
>> and it wouldn't have been a huge amount of trouble to use a
>> non-default module with a specified PRNG -- if you were doing it now.
>> But you have every right to be very peeved if you have a bunch of old
>> test runs you want to replicate with a new version of Python, and
>> we've changed the random.random RNG on you.
>>
>>
>>
>> Footnotes:
>> [1]  I hasten to add that a programmer who isn't as smart as he thinks
>> he is who "improves" a crypto algorithm is far more likely than that
>> the implementer of a crypto suite would choose an RNG that is
>> inappropriate by design.  Still, it's a theoretical possibility, and
>> security is about eliminating every theoretical possibility you can
>> think of.
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>

From tim.peters at gmail.com  Fri Sep 11 19:16:12 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 11 Sep 2015 12:16:12 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150911160809.GD19373@ando.pearwood.info>
References: <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <20150911160809.GD19373@ando.pearwood.info>
Message-ID: <CAExdVNnGwHuAMDG=5Jye7vrCcC06Jtq2RF-5zpgFRu90Z52LLQ@mail.gmail.com>

[Steven D'Aprano]
>>> The default MT is certainly deterministic, and although only the output
>>> of random() itself is guaranteed to be reproducible, the other methods
>>> are *usually* stable in practice.
>>>
>>> There's a jumpahead method too,

[Tim]
>> Not in Python.

[Steve]
> It is there, up to Python 2.7. I hadn't noticed it was gone in Python 3.

Yes, there's something _called_ `.jumpahead()`, for backward
compatibility with the old Wichmann-Hill generator.  But what it does
for MT is "eh - no idea what to do, so let's just make stuff up":

    def jumpahead(self, n):
        """Change the internal state to one that is likely far away
        from the current state.  This method will not be in Py3.x,
        so it is better to simply reseed.
        """
        # The super.jumpahead() method uses shuffling to change state,
        # so it needs a large and "interesting" n to work with.  Here,
        # we use hashing to create a large n for the shuffle.
        s = repr(n) + repr(self.getstate())
        n = int(_hashlib.new('sha512', s).hexdigest(), 16)
        super(Random, self).jumpahead(n)

I doubt there's anything that can be proved about the result of doing
that - except that it's almost certain it won't bear any relationship
to what calling the generator `n` times instead would have done ;-)

From mal at egenix.com  Fri Sep 11 19:19:00 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 11 Sep 2015 19:19:00 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>	<55F2CF6B.40301@egenix.com>
 <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
Message-ID: <55F30D04.4010001@egenix.com>

On 11.09.2015 18:36, Tim Peters wrote:
> [M.-A. Lemburg <mal at egenix.com>]
>> ...
>> Now, leaving aside this bright future, what's reasonable today ?
>>
>> If you look at tools like untwister:
>>
>>     https://github.com/bishopfox/untwister
>>
>> you can get a feeling for how long it takes to deduce the
>> seed from an output sequence. Bear in mind that, in order
>> to be reasonably sure that the seed is correct, the available
>> output sequence has to be long enough.
>>
>> That's a known plain text attack, so you need access to lots
>> of session keys to begin with.
>>
>> The tool is still running on an example set of 1000 32-bit
>> numbers and it says it'll be done in 1.5 hours, i.e. before
>> the sun goes down in my timezone. I'll leave it running to
>> see whether it can find my secret key.
> 
> I'm only going to talk about current Python 3, because _any_ backward
> incompatible change is off limits for a bugfix release.
> 
> So:
> 
> 1. untwister appears _mostly_ to be probing for poor seeding schemes.
> Python 3's default "by magic" seeding is unimpeachable ;-)  It's
> computationally infeasible to attack it.
> 
> 2. If they knew they were targeting MT, and had 624 consecutive 32-bit
> outputs, they could compute MT's full internal state essentially
> instantly.

How would they do that ? MT's period is too large for
things like rainbow tables.

> #2 is hard to get, though.  These "pick a password" examples are only
> using a relative handful of bits from each 32-bit MT output.  Attacks
> with such spotty info are "exponentially harder".
> 
>> Untwister is only slightly smarter than bruteforce. Given
>> that MT has a seed size of 32 bits, it's not surprising that
>> a tool can find the seed within a day.
> 
> No no no.  MT's state is 19937 bits, and current .seed()
> implementations use every bit you pass to .seed().

Ah, right. I was looking at init_genrand() in the C implementation
which only takes a single 32-bit unsigned int as value.

The init_by_array() function does take seeds which use all
available bits.

I guess untwister indeed only tries the 32-bit unsigned int seeding
approach, as it keeps listing things like:

Progress: 50.99%  [2190032137 / 4294967295]  ~128296.63/sec  4 hours 33 minutes 30 [-]
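
To put that progress line in perspective, here is a rough
back-of-the-envelope sketch. The figures are copied from the line
above; the variable names are made up for illustration, and the
comparison with the default seeding space simply restates the
2**19937-1 figure quoted below.

```python
# Figures copied from the untwister progress line.
done = 2190032137
total = 4294967295          # 2**32 - 1, i.e. every 32-bit seed
rate = 128296.63            # seeds tried per second

remaining_hours = (total - done) / rate / 3600
full_sweep_hours = total / rate / 3600

print(f"remaining: {remaining_hours:.2f} h")
print(f"full 32-bit sweep: {full_sweep_hours:.2f} h")

# Default seeding draws from 2**19937 - 1 possible states instead.
state_space_bits = (2**19937 - 1).bit_length()
print(f"default seeding space: about 2**{state_space_bits} states")
```

So a 32-bit seed space is swept in hours at this rate, while the same
rate makes no dent in the space the default seeding draws from.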

> By default, current Python seeds the state with 2500 bytes (20000
> bits) from the system .urandom() (if available).  That's why it's
> computationally infeasible for "poor seeding" searches to attack the
> default seeding:  they have a space of 2**19937-1 (the all-0 state
> can't occur) to search through, each of which is equally likely
> (assuming the system .urandom() is doing _its_ job).
> 
> Of course the user can screw that up by using their _own_ seed.  But,
> by default, current Pythons already do the best possible seeding job.
> 
> 
>> Perhaps it's time to switch to a better version of MT, e.g.
>> a 64-bit version (with 64-bit internal state):
>>
>>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt64.html
>>
>> or an even faster SIMD variant with better properties and
>> 128 bit internal state:
>>
>>     http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
>>
>> Esp. the latter will help make brute force attacks practically
>> impossible.
>>
>> Tim ?
> 
> We already have a 19937-bit internal state, and current seeding
> schemes don't hide that.

Ouch. I confused internal state with the output size. Sorry.

So we're more than fine already and it's only the cracking
tools that are apparently broken :-)

> I would like to move to a different generator entirely someday, but
> not before some specific better-than-MT alternative gains significant
> traction outside the Python world ("better a follower than a leader"
> in this area).

Another candidate is the new WELL family:

http://www.iro.umontreal.ca/~panneton/WELLRNG.html

This has some nicer properties w/r to booting out of zeroland
(as they call it: too many 0 bits in the seed).

>> BTW: Looking at the sources of the _random module, I found that
>> the seed function uses the hash of non-integers such as e.g.
>> strings passed to it as seeds. Given the hash randomization
>> for strings this will create non-deterministic results, so it's
>> probably wise to only use 32-bit integers as seed values for
>> portability, if you need to rely on seeding the global Python
>> RNG.
> 
> None of that applies to Python 3. 

Well, it still does for the .seed() C implementation in _random.c,
but since that's overridden in Python 3's Random class, you can't
access it anymore :-)

> `seed()` string inputs go through
> this path now:
> 
>             if isinstance(a, (str, bytes, bytearray)):
>                 if isinstance(a, str):
>                     a = a.encode()
>                 a += _sha512(a).digest()
>                 a = int.from_bytes(a, 'big')
>         super().seed(a)
> 
> IOW, a crypto hash is _appended_ to the string, but no info from the
> original string is lost (but, if you ask me, this particular step is
> useless - it adds no "new entropy").  The whole mess is converted to a
> giant integer, again with no loss of input information.  And every bit
> of the giant integer affects what `super().seed(a)` does.

As far as I'm concerned this maps to case closed.
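
A minimal sketch of the string path quoted above, for anyone who wants
to check it locally. The helper name string_seed_to_int is made up for
illustration; the body just replays the quoted steps with hashlib. It
assumes the running interpreter still follows that path, in which case
seeding with the string and seeding with the derived integer give the
same stream, independent of hash randomization.

```python
import hashlib
import random

def string_seed_to_int(text):
    # Replays the quoted snippet: keep the original bytes, append their
    # SHA-512 digest, and read the result as one big-endian integer.
    data = text.encode()
    data += hashlib.sha512(data).digest()
    return int.from_bytes(data, 'big')

seed_int = string_seed_to_int("python-ideas")

from_string = random.Random("python-ideas")
from_int = random.Random(seed_int)

# If the quoted path is what seed() does, these two lists match.
print([from_string.getrandbits(32) for _ in range(3)])
print([from_int.getrandbits(32) for _ in range(3)])
```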

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 11 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> mxODBC Plone/Zope Database Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2015-09-18: PyCon UK 2015 ...                               7 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From tim.peters at gmail.com  Fri Sep 11 20:52:08 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 11 Sep 2015 13:52:08 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F30D04.4010001@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
 <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
 <55F30D04.4010001@egenix.com>
Message-ID: <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>

[Tim]
>> ...
>> 2. If they knew they were targeting MT, and had 624 consecutive 32-bit
>> outputs, they could compute MT's full internal state essentially
>> instantly.

[Marc-Andre]
> How would they do that ? MT's period is too large for
> things like rainbow tables.

It's not trivial to figure out how to do this, but once you do, it
works ;-)  No search, or tables, of any kind are required.  It's just
simple (albeit non-obvious!) bit-fiddling to invert MT's
state-to-output transformations to get the state back.  Here's a very
nice writeup:

https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html
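
A minimal sketch of that bit-fiddling for a single 32-bit output. The
function names are made up for illustration, the constants are the
standard MT19937 tempering masks, and the final check relies on the
layout CPython's getstate() happens to use (624 state words followed by
the index), which is an implementation detail.

```python
import random

MASK32 = 0xFFFFFFFF

def temper(y):
    # MT19937 output transformation: state word -> 32-bit output.
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & MASK32

def undo_right(y, shift):
    # Solve x ^ (x >> shift) == y.  The top `shift` bits of y are already
    # correct; each round recovers `shift` more bits below them, and
    # extra rounds are harmless.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def undo_left(y, shift, mask):
    # Solve x ^ ((x << shift) & mask) == y, working up from the low bits.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & MASK32

def untemper(y):
    # Undo the four tempering steps in reverse order.
    y = undo_right(y, 18)
    y = undo_left(y, 15, 0xEFC60000)
    y = undo_left(y, 7, 0x9D2C5680)
    y = undo_right(y, 11)
    return y

rng = random.Random(2015)
out = rng.getrandbits(32)
words = rng.getstate()[1]          # 624 state words, then the index
print(untemper(out) == words[words[-1] - 1])
```

Doing this for 624 consecutive outputs yields 624 state words, which is
the whole point of item 2.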

From abarnert at yahoo.com  Fri Sep 11 22:27:02 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 11 Sep 2015 13:27:02 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <20150911134948.GX19373@ando.pearwood.info>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <72597E4F-4E74-412D-8ED3-442E832232EF@yahoo.com>
 <m2mvwvoyk1.fsf@fastmail.com>
 <A45551D4-1E5C-423F-9ACB-F2CB386B6BEE@yahoo.com>
 <CAPTjJmpqzLm6v05w-FLYrZCPta6o5j0dmv6Y9tTW=_+ayrxSCw@mail.gmail.com>
 <20150911134948.GX19373@ando.pearwood.info>
Message-ID: <9A57E7BB-4314-4929-B7F5-51764779F5D2@yahoo.com>

On Sep 11, 2015, at 06:49, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Thu, Sep 10, 2015 at 04:08:09PM +1000, Chris Angelico wrote:
>> On Thu, Sep 10, 2015 at 11:50 AM, Andrew Barnert via Python-ideas
>> <python-ideas at python.org> wrote:
>>> Of course it adds the cost of making the module slower, and also 
>>> more complex. Maybe a better solution would be to add a 
>>> random.set_default_instance function that replaced all of the 
>>> top-level functions with bound methods of the instance (just like 
>>> what's already done at startup in random.py)? That's simple, and 
>>> doesn't slow down anything, and it seems like it makes it more clear 
>>> what you're doing than setting random.inst.
>> 
>> +1. A single function call that replaces all the methods adds a
>> minuscule constant to code size, run time, etc, and it's no less
>> readable than assignment to a module attribute.
> 
> Making monkey-patching the official, recommended way to choose a PRNG is 
> a risky solution, to put it mildly.

But that's not the proposal. The proposal is to make explicitly passing around an instance the official, recommended way to choose a PRNG; monkey-patching is only the official, recommended way to quickly get legacy code working: once you see the warning about the potential problem and decide that the problem doesn't affect you, you write one standard line of code at the top of your main script instead of rewriting all of your modules and patching or updating every third-party module you use.

As I said later, I think my later suggestion of just having a singleton DeterministicRandom instance (or even a submodule with the same interface) that you can explicitly import in place of random serves the same needs well enough, and is even simpler, and is more flexible (in particular, it can also be used for novices' "my first game" programs), so I'm no longer suggesting this. But that doesn't mean there's any benefit to mischaracterizing the suggestion (especially if Chris or anyone else still supports it even though I don't).
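
For reference, the rebinding mechanism being discussed could look
roughly like the sketch below. This is only an illustration: the
signature of set_default_instance and the EXPORTED list are
assumptions, and the function is applied to a stand-in namespace rather
than to the real random module.

```python
import random
import types

# Hypothetical subset of the names random.py exposes at module level.
EXPORTED = ("seed", "random", "randint", "choice", "getrandbits",
            "getstate", "setstate")

def set_default_instance(module, inst):
    # Rebind each top-level name to the matching bound method of `inst`.
    for name in EXPORTED:
        setattr(module, name, getattr(inst, name))

# Stand-in for the random module, so the real one is left untouched.
fake_random = types.SimpleNamespace()
set_default_instance(fake_random, random.Random(42))

reference = random.Random(42)
print(fake_random.random() == reference.random())
```

Because the names are rebound once, later calls cost the same as they
do today; code that did "from random import random" before the call
would keep the old binding, though.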


From mal at egenix.com  Fri Sep 11 22:44:46 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 11 Sep 2015 22:44:46 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<etPan.55f06a43.137d4868.31bc@Draupnir.home>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>	<55F2CF6B.40301@egenix.com>	<CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>	<55F30D04.4010001@egenix.com>
 <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
Message-ID: <55F33D3E.7000904@egenix.com>

On 11.09.2015 20:52, Tim Peters wrote:
> [Tim]
>>> ...
>>> 2. If they knew they were targeting MT, and had 624 consecutive 32-bit
>>> outputs, they could compute MT's full internal state essentially
>>> instantly.
> 
> [Marc-Andre]
>> How would they do that ? MT's period is too large for
>> things like rainbow tables.
> 
> It's not trivial to figure out how to do this, but once you do, it
> works ;-)  No search, or tables, of any kind are required.  It's just
> simple (albeit non-obvious!) bit-fiddling to invert MT's
> state-to-output transformations to get the state back.  Here's a very
> nice writeup:
> 
> https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html

Indeed very nice. Thanks for the pointer.

I wonder why untwister doesn't use this. I gave it 1000 32-bit
integers, so it should have enough information to recover the
seed in a short while, but it's still trying to find the seed.
Oh, and it now shows: 5 days 21 hours left. I stopped it there.

Anyone up for a random.recover_seed() function ? ;-)

-- 
Marc-Andre Lemburg
eGenix.com

From random832 at fastmail.us  Fri Sep 11 23:12:00 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 11 Sep 2015 17:12:00 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>
 <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>
Message-ID: <1442005920.3575903.381296913.53B6421C@webmail.messagingengine.com>

On Wed, Sep 9, 2015, at 17:02, Nathaniel Smith wrote:
> Keeping that promise in mind, an alternative would be to keep both
> generators around, use the cryptographically secure one by default, and
> switch to MT when someone calls
> 
>   seed(1234, generator="INSECURE LEGACY MT")
> 
> But this would justifiably get us crucified by the security community,
> because the above call would flip the insecure switch for your entire
> program, including possibly other modules that were depending on random
> to
> provide secure bits.

I just realized, OpenBSD has precisely this functionality, for the
rand/random/rand48 functions, in the "_deterministic" versions of their
respective seed functions. So that's probably not a terrible path to go
down:

Make a Random class that uses a CSPRNG and/or os.urandom until/unless it
is explicitly seeded. Use that class for the global instance. We could
probably skip the "make a separate function name to show you really mean
it" because unlike C, Python has never encouraged explicitly seeding
with the {time, pid, four bytes from /dev/random} when one doesn't
actually want determinism. (The default seed in C for rand/random is
*1*; for rand48 it is an implementation-defined, but specified to be
constant, value).

For completeness, have getstate return a tuple of a boolean (for which
mode it is in) and whatever state Random returns. setstate can accept
either this tuple, or for compatibility whatever Random uses.
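
A rough sketch of what that could look like, under stated assumptions:
the class name SwitchingRandom is made up, random.SystemRandom stands
in for "a CSPRNG and/or os.urandom", and only random() is wired up to
keep it short.

```python
import random

class SwitchingRandom:
    """Hypothetical: OS entropy until explicitly seeded, MT after."""

    def __init__(self):
        self._secure = random.SystemRandom()
        self._mt = None                     # None means "secure mode"

    def seed(self, a):
        # An explicit seed is taken as a request for determinism.
        self._mt = random.Random(a)

    def random(self):
        source = self._secure if self._mt is None else self._mt
        return source.random()

    def getstate(self):
        # (deterministic?, underlying Random state or None)
        if self._mt is None:
            return (False, None)
        return (True, self._mt.getstate())

    def setstate(self, state):
        if len(state) == 2 and isinstance(state[0], bool):
            deterministic, inner = state
            if not deterministic:
                self._mt = None
                return
        else:
            inner = state                   # plain Random.getstate() tuple
        if self._mt is None:
            self._mt = random.Random()
        self._mt.setstate(inner)

rng = SwitchingRandom()
print(rng.getstate())                       # still in secure mode
rng.seed(1234)
saved = rng.getstate()
first = rng.random()
rng.setstate(saved)
print(rng.random() == first)
```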

From encukou at gmail.com  Fri Sep 11 23:48:54 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Fri, 11 Sep 2015 23:48:54 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <1442005920.3575903.381296913.53B6421C@webmail.messagingengine.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>
 <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>
 <1442005920.3575903.381296913.53B6421C@webmail.messagingengine.com>
Message-ID: <CA+=+wqCX1q7jVCYFLWtWz_0=Q_jSfX+=JrGY7NawsoxLUgtAvQ@mail.gmail.com>

On Fri, Sep 11, 2015 at 11:12 PM,  <random832 at fastmail.us> wrote:
> On Wed, Sep 9, 2015, at 17:02, Nathaniel Smith wrote:
>> Keeping that promise in mind, an alternative would be to keep both
>> generators around, use the cryptographically secure one by default, and
>> switch to MT when someone calls
>>
>>   seed(1234, generator="INSECURE LEGACY MT")
>>
>> But this would justifiably get us crucified by the security community,
>> because the above call would flip the insecure switch for your entire
>> program, including possibly other modules that were depending on random
>> to
>> provide secure bits.
>
> I just realized, OpenBSD has precisely this functionality, for the
> rand/random/rand48 functions, in the "_deterministic" versions of their
> respective seed functions. So that's probably not a terrible path to go
> down:
>
> Make a Random class that uses a CSPRNG and/or os.urandom until/unless it
> is explicitly seeded. Use that class for the global instance. We could
> probably skip the "make a separate function name to show you really mean
> it" because unlike C, Python has never encouraged explicitly seeding
> with the {time, pid, four bytes from /dev/random} when one doesn't
> actually want determinism. (The default seed in C for rand/random is
> *1*; for rand48 it is an implementation-defined, but specified to be
> constant, value).
>
> For completeness, have getstate return a tuple of a boolean (for which
> mode it is in) and whatever state Random returns. setstate can accept
> either this tuple, or for compatibility whatever Random uses.

Calling getstate() means you want to call setstate() at some point in
the future, and have deterministic results. Getting the CSRNG state is
dangerous (since it would allow replaying), and it's not even useful
(since system entropy gets mixed in occasionally).
Instead, in this scheme, getstate() should activate the deterministic
RNG (seeding it if it's the first use), and return its state.
setstate() would then also switch to the Twister, and seed it.
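
A minimal sketch of this variant, with the same caveats as any
illustration: the class name is hypothetical, random.SystemRandom
stands in for the CSPRNG, and only random() is wired up.

```python
import random

class LazyDeterministicRandom:
    """Hypothetical: getstate()/setstate() switch to the Twister."""

    def __init__(self):
        self._secure = random.SystemRandom()
        self._mt = None                     # created on first use

    def _activate_mt(self):
        if self._mt is None:
            self._mt = random.Random()      # default (urandom) seeding
        return self._mt

    def random(self):
        source = self._secure if self._mt is None else self._mt
        return source.random()

    def getstate(self):
        # Asking for state implies wanting replay, so leave secure mode.
        return self._activate_mt().getstate()

    def setstate(self, state):
        self._activate_mt().setstate(state)

rng = LazyDeterministicRandom()
saved = rng.getstate()                      # switches modes as a side effect
run1 = [rng.random() for _ in range(3)]
rng.setstate(saved)
print(run1 == [rng.random() for _ in range(3)])
```

The CSPRNG state is never exposed this way, at the price of getstate()
having a side effect.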

From mal at egenix.com  Sat Sep 12 00:59:01 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 12 Sep 2015 00:59:01 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <55F33D3E.7000904@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>	<55F2CF6B.40301@egenix.com>	<CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>	<55F30D04.4010001@egenix.com>	<CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
 <55F33D3E.7000904@egenix.com>
Message-ID: <55F35CB5.6000701@egenix.com>

On 11.09.2015 22:44, M.-A. Lemburg wrote:
> On 11.09.2015 20:52, Tim Peters wrote:
>> [Tim]
>>>> ...
>>>> 2. If they knew they were targeting MT, and had 624 consecutive 32-bit
>>>> outputs, they could compute MT's full internal state essentially
>>>> instantly.
>>
>> [Marc-Andre]
>>> How would they do that ? MT's period is too large for
>>> things like rainbow tables.
>>
>> It's not trivial to figure out how to do this, but once you do, it
>> works ;-)  No search, or tables, of any kind are required.  It's just
>> simple (albeit non-obvious!) bit-fiddling to invert MT's
>> state-to-output transformations to get the state back.  Here's a very
>> nice writeup:
>>
>> https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html
> 
> Indeed very nice. Thanks for the pointer.
> 
> I wonder why untwister doesn't use this. I gave it 1000 32-bit
> integers, so it should have enough information to recover the
> seed in a short while, but it's still trying to find the seed.
> Oh, and it now shows: 5 days 21 hours left. I stopped it there.
> 
> Anyone up for a random.recover_seed() function ? ;-)

Turns out this will have to be named random.recover_state().

Getting at the seed is too difficult, esp. for strings in Python 3,
and not really worth the effort anyway.

While implementing this, I found that there's a bit more trickery
involved due to the fact that the MT RNG in Python writes the
624 words internal state in batches - once every 624 times
the .getrandbits() function is called.

So you may need up to 624*2 - 1 output values to determine a
correct array of internal state values.

-- 
Marc-Andre Lemburg
eGenix.com

From random832 at fastmail.us  Sat Sep 12 01:12:38 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 11 Sep 2015 19:12:38 -0400
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CA+=+wqCX1q7jVCYFLWtWz_0=Q_jSfX+=JrGY7NawsoxLUgtAvQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <CAExdVN=Cx+hc6oSN0LhwVO5zjVuzGKAfeiePRppNqy26M-ZgMg@mail.gmail.com>
 <CAPJVwB=tnYhYNFAs3w5PYpQWaEY8XGZpMnA0_GrZw5=c0f7JsQ@mail.gmail.com>
 <1442005920.3575903.381296913.53B6421C@webmail.messagingengine.com>
 <CA+=+wqCX1q7jVCYFLWtWz_0=Q_jSfX+=JrGY7NawsoxLUgtAvQ@mail.gmail.com>
Message-ID: <1442013158.85026.381453473.0F3F82A3@webmail.messagingengine.com>

On Fri, Sep 11, 2015, at 17:48, Petr Viktorin wrote:
> Calling getstate() means you want to call setstate() at some point in
> the future, and have deterministic results. Getting the CSRNG state is
> dangerous (since it would allow replaying), and it's not even useful
> (since system entropy gets mixed in occasionally).
> Instead, in this scheme, getstate() should activate the deterministic
> RNG (seeding it if it's the first use), and return its state.
> setstate() would then also switch to the Twister, and seed it.

My thinking was that "CSRNG is enabled" should be regarded as a single
state of the "magic switching RNG". The alternative would be that
calling getstate on a magic switching RNG that is not already in
deterministic mode is an error.

From tim.peters at gmail.com  Sat Sep 12 03:19:19 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 11 Sep 2015 20:19:19 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F35CB5.6000701@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
 <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
 <55F30D04.4010001@egenix.com>
 <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
 <55F33D3E.7000904@egenix.com> <55F35CB5.6000701@egenix.com>
Message-ID: <CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>

[Tim, on recovering MT state from outputs]
>>> https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html

[Marc-Andre]
>> Indeed very nice. Thanks for the pointer.
>>
>> I wonder why untwister doesn't use this. I gave it 1000 32-bit
>> integers, so it should have enough information to recover the
>> seed in a short while, but it's still trying to find the seed.
>> Oh, and it now shows: 5 days 21 hours left. I stopped it there.

As you went on to discover, while the writeup gives enough to convince
you it's possible, there are always details ;-)


> Turns out this will have to be named random.recover_state().
>
> Getting at the seed is too difficult, esp. for strings in Python 3,
> and not really worth the effort anyway.

It's flatly impossible to ever know what the seed was, unless you
_also_ know exactly how many times MT was invoked before the first
output you captured.  Think about that a bit, and I'm sure you'll see
that's obvious.  Even if you did know how many times, it would still
be impossible without more assumptions, since seed arguments can
contain any number of bits.


> While implementing this, I found that there's a bit more trickery
> involved due to the fact that the MT RNG in Python writes the
> 624 words internal state in batches - once every 624 times
> the .getrandbits() function is called.
>
> So you may need up to 624*2 - 1 output values to determine a
> correct array of internal state values.

Don't be too sure about that.  From an information-theoretic view,
"it's obvious" that 624 32-bit outputs is enough - indeed, that's 31
more bits than the internal state actually has. You don't need to
reproduce Python's current internal MT state exactly, you only need to
create _a_ MT state that will produce the same values forever after.
Specifically, the index of the "current" vector element is an artifact
of the implementation, and doesn't need to be reproduced.  You're free
to set that index to anything you like in _your_ MT state - the real
goal is to get the same results.
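
The bit counting behind that remark, spelled out as a tiny sketch (the
constant names are made up for illustration):

```python
N_WORDS = 624           # 32-bit words in MT19937's state vector
WORD_BITS = 32
STATE_BITS = 19937      # the Mersenne exponent; period is 2**19937 - 1

observed_bits = N_WORDS * WORD_BITS
print(observed_bits)                  # 19968
print(observed_bits - STATE_BITS)     # 31

# The true state is one bit plus 623 full words.
print(STATE_BITS == 623 * WORD_BITS + 1)
```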

From tim.peters at gmail.com  Sat Sep 12 05:23:42 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 11 Sep 2015 22:23:42 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
 <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
 <55F30D04.4010001@egenix.com>
 <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
 <55F33D3E.7000904@egenix.com> <55F35CB5.6000701@egenix.com>
 <CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>
Message-ID: <CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>

[Marc-Andre]
...
>> While implementing this, I found that there's a bit more trickery
>> involved due to the fact that the MT RNG in Python writes the
>> 624 words internal state in batches - once every 624 times
>> the .getrandbits() function is called.
>>
>> So you may need up to 624*2 - 1 output values to determine a
>> correct array of internal state values.

[Tim]
> Don't be too sure about that.  From an information-theoretic view,
> "it's obvious" that 624 32-bit outputs is enough - indeed, that's 31
> more bits than the internal state actually has. You don't need to
> reproduce Python's current internal MT state exactly, you only need to
> create _a_ MT state that will produce the same values forever after.
> Specifically, the index of the "current" vector element is an artifact
> of the implementation, and doesn't need to be reproduced.  You're free
> to set that index to anything you like in _your_ MT state - the real
> goal is to get the same results.

Concrete proof of concept.  First code to reconstruct state from 624
consecutive 32-bit outputs:

    def invert(transform, output, n=100):
        guess = output
        for i in range(n):
            newguess = transform(guess)
            if newguess == output:
                return guess
            guess = newguess
        raise ValueError("%r not invertible in %s tries" %
                         (output, n))

    t1 = lambda y: y ^ (y >> 11)
    t2 = lambda y: y ^ ((y << 7) & 0x9d2c5680)
    t3 = lambda y: y ^ ((y << 15) & 0xefc60000)
    t4 = lambda y: y ^ (y >> 18)

    def invert_mt(y):
        y = invert(t4, y)
        y = invert(t3, y)
        y = invert(t2, y)
        y = invert(t1, y)
        return y

    def guess_state(vec):
        assert len(vec) == 624
        return (3,
                tuple(map(invert_mt, vec)) + (624,),
                None)

Now we can try it:

    import random
    for i in range(129):
        random.random()

That loop was just to move MT into "the middle" of its internal
vector.  Now grab values:

    vec = [random.getrandbits(32) for i in range(624)]

Note that the `guess_state()` function above _always_ sets the index
to 624.  When it becomes obvious _why_ it does so, all mysteries will
vanish ;-)

Now create a distinct generator and force its state to the deduced state:

    newrand = random.Random()
    newrand.setstate(guess_state(vec))

And some quick sanity checks:

    for i in range(1000000):
        assert random.random() == newrand.random()
    for i in range(1000000):
        assert random.getrandbits(32) == newrand.getrandbits(32)

The internal states are _not_ byte-for-byte identical.  But they don't
need to be.  The artificial `index` bookkeeping variable allows
hundreds of distinct spellings of _semantically_ identical states.
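
For anyone who wants to see that last point directly, here is the same
proof of concept gathered into one self-contained sketch. It reuses the
functions above, but works on two private Random instances (the names
victim and clone are made up) so the global generator is left alone,
and it compares the stored states as well as the outputs. The state
tuple layout is whatever CPython's getstate() returns, as above.

```python
import random

def invert(transform, output, n=100):
    # Each tempering step is a permutation of small order, so walking
    # forward from `output` eventually reaches its preimage.
    guess = output
    for _ in range(n):
        newguess = transform(guess)
        if newguess == output:
            return guess
        guess = newguess
    raise ValueError("%r not invertible in %s tries" % (output, n))

STEPS = (
    lambda y: y ^ (y >> 11),
    lambda y: y ^ ((y << 7) & 0x9d2c5680),
    lambda y: y ^ ((y << 15) & 0xefc60000),
    lambda y: y ^ (y >> 18),
)

def invert_mt(y):
    # Undo the tempering steps in reverse order.
    for step in reversed(STEPS):
        y = invert(step, y)
    return y

def guess_state(vec):
    assert len(vec) == 624
    return (3, tuple(map(invert_mt, vec)) + (624,), None)

victim = random.Random()            # private instance, default seeding
for _ in range(129):
    victim.random()                 # move into "the middle" of the vector
vec = [victim.getrandbits(32) for _ in range(624)]

clone = random.Random()
clone.setstate(guess_state(vec))

print(victim.getstate() == clone.getstate())     # stored states
print(all(victim.getrandbits(32) == clone.getrandbits(32)
          for _ in range(2000)))                 # future outputs
```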

From tritium-list at sdamon.com  Sat Sep 12 05:29:15 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Fri, 11 Sep 2015 23:29:15 -0400
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
 <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
 <55F30D04.4010001@egenix.com>
 <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
 <55F33D3E.7000904@egenix.com> <55F35CB5.6000701@egenix.com>
 <CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>
 <CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>
Message-ID: <55F39C0B.9090600@sdamon.com>

My final thoughts on this entire topic are these:

The suggestions made here, and in the other thread, are pointless
API-breaking changes that do not affect the stated target audience (people
who actually need secure random numbers but are not getting them
correctly - they will still find a way to do it wrong; changing the API
won't fix that).  The net effect is a longer support burden on 2.x - this
proposes another porting headache for NO reason.

From mal at egenix.com  Sat Sep 12 13:31:48 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 12 Sep 2015 13:31:48 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>	<55F2CF6B.40301@egenix.com>	<CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>	<55F30D04.4010001@egenix.com>	<CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>	<55F33D3E.7000904@egenix.com>
 <55F35CB5.6000701@egenix.com>	<CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>
 <CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>
Message-ID: <55F40D24.8080008@egenix.com>

On 12.09.2015 05:23, Tim Peters wrote:
> [Marc-Andre]
> ...
>>> While implementing this, I found that there's a bit more trickery
>>> involved due to the fact that the MT RNG in Python writes the
>>> 624 words internal state in batches - once every 624 times
>>> the .getrandbits() function is called.
>>>
>>> So you may need up to 624*2 - 1 output values to determine a
>>> correct array of internal state values.
> 
> [Tim]
>> Don't be too sure about that.  From an information-theoretic view,
>> "it's obvious" that 624 32-bit outputs is enough - indeed, that's 31
>> more bits than the internal state actually has. You don't need to
>> reproduce Python's current internal MT state exactly, you only need to
>> create _a_ MT state that will produce the same values forever after.
>> Specifically, the index of the "current" vector element is an artifact
>> of the implementation, and doesn't need to be reproduced.  You're free
>> to set that index to anything you like in _your_ MT state - the real
>> goal is to get the same results.
> 
> Concrete proof of concept.  First code to reconstruct state from 624
> consecutive 32-bit outputs:
> 
>     def invert(transform, output, n=100):
>         guess = output
>         for i in range(n):
>             newguess = transform(guess)
>             if newguess == output:
>                 return guess
>             guess = newguess
>         raise ValueError("%r not invertible in %s tries" %
>                          (output, n))
> 
>     t1 = lambda y: y ^ (y >> 11)
>     t2 = lambda y: y ^ ((y << 7) & 0x9d2c5680)
>     t3 = lambda y: y ^ ((y << 15) & 0xefc60000)
>     t4 = lambda y: y ^ (y >> 18)
> 
>     def invert_mt(y):
>         y = invert(t4, y)
>         y = invert(t3, y)
>         y = invert(t2, y)
>         y = invert(t1, y)
>         return y
> 
>     def guess_state(vec):
>         assert len(vec) == 624
>         return (3,
>                 tuple(map(invert_mt, vec)) + (624,),
>                 None)
> 
> Now we can try it:
> 
>     import random
>     for i in range(129):
>         random.random()
> 
> That loop was just to move MT into "the middle" of its internal
> vector.  Now grab values:
> 
>     vec = [random.getrandbits(32) for i in range(624)]
> 
> Note that the `guess_state()` function above _always_ sets the index
> to 624.  When it becomes obvious _why_ it does so, all mysteries will
> vanish ;-)
> 
> Now create a distinct generator and force its state to the deduced state:
> 
>     newrand = random.Random()
>     newrand.setstate(guess_state(vec))
> 
> And some quick sanity checks:
> 
>     for i in range(1000000):
>         assert random.random() == newrand.random()
>     for i in range(1000000):
>         assert random.getrandbits(32) == newrand.getrandbits(32)
> 
> The internal states are _not_ byte-for-byte identical.  But they don't
> need to be.  The artificial `index` bookkeeping variable allows
> hundreds of distinct spellings of _semantically_ identical states.

It's a rolling index, yes, but when creating the vector of output
values, the complete internal state array will have undergone
a recalc at one of the iterations.

The guess_state(vec) function will thus return an internal
state vector that is half state of the previous recalc run,
half new recalc run.  It is not obvious to me why you would
still be able to get away with not synchronizing to the next
recalc in order to have a complete state from the current recalc.

Let's see...

The values in the state array are each based on

a) previous state[i]
b) state[(i + 1) % 624]
c) state[(i + 397) % 624]

Since the calculation is forward looking, your trick will only
work if you can make sure that i + 397 doesn't wrap around
into the previous state before you trigger the recalc in
newrand.

Which is easy, of course, since you can control the current
index of newrand and force it to do a recalc with the next
call to .getrandbits() by setting it to 624.

Clever indeed :-)
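
For reference, the recalc step and its three dependencies can be
sketched in pure Python and checked against the random module. This is
only a sketch: twist() and temper() are names made up here, the
constants are the standard MT19937 ones, and it assumes (as CPython
does) that a freshly seeded generator has its index at 624, so the very
next call triggers a recalc.

```python
import random

N, M = 624, 397

def twist(mt):
    """Regenerate all N state words in place (the "recalc")."""
    for i in range(N):
        # Top bit of the current word, low 31 bits of the next word.
        y = (mt[i] & 0x80000000) | (mt[(i + 1) % N] & 0x7fffffff)
        # Combine with the word 397 positions ahead (modulo N).
        mt[i] = mt[(i + M) % N] ^ (y >> 1) ^ (0x9908b0df if y & 1 else 0)

def temper(y):
    """Output transformation applied to a state word before it is returned."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9d2c5680
    y ^= (y << 15) & 0xefc60000
    y ^= y >> 18
    return y

gen = random.Random(12345)
internal = gen.getstate()[1]      # 624 state words followed by the index
assert internal[N] == N           # assumption: index is 624 right after seeding
words = list(internal[:N])

twist(words)
predicted = [temper(w) for w in words]
actual = [gen.getrandbits(32) for _ in range(N)]
assert predicted == actual
```

Note that only the top bit of the previous state[i] enters the
calculation; the rest comes from state[i + 1] and state[i + 397].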

Here's a better way to do the inversion without guess work:

# 32-bits all set
ALL_BITS_SET = 0xffffffffL

def undo_bitshift_right_xor(value, shift, mask=ALL_BITS_SET):

    # Set shift high order bits; there's probably a better way to
    # do this, but this does the trick for now
    decoding_mask = (ALL_BITS_SET << (32 - shift)) & ALL_BITS_SET
    decoded_part = 0
    result = 0
    while decoding_mask > 0:
        decoded_part = (value ^ (decoded_part & mask)) & decoding_mask
        result |= decoded_part
        decoded_part >>= shift
        decoding_mask >>= shift
    return result

def undo_bitshift_left_xor(value, shift, mask=ALL_BITS_SET):

    # Set shift low order bits
    decoding_mask = ALL_BITS_SET >> (32 - shift)
    decoded_part = 0
    result = 0
    while decoding_mask > 0:
        decoded_part = (value ^ (decoded_part & mask)) & decoding_mask
        result |= decoded_part
        decoded_part = (decoded_part << shift) & ALL_BITS_SET
        decoding_mask = (decoding_mask << shift) & ALL_BITS_SET
    return result

def recover_single_state_value(value):

    value = undo_bitshift_right_xor(value, 18)
    value = undo_bitshift_left_xor(value, 15, 0xefc60000L)
    value = undo_bitshift_left_xor(value, 7, 0x9d2c5680L)
    value = undo_bitshift_right_xor(value, 11)
    return value

def guess_state(data):

    if len(data) < 624:
        raise TypeError('not enough data to recover state')

    # Only work with the 624 last entries
    data = data[-624:]
    state = [recover_single_state_value(x)
             for x in data]
    return (3,
            tuple(state) + (624,),
            None)

This is inspired by the work of James Roper, but uses a slightly
faster approach for the undo functions. Not that it matters much.
It was fun, that's what matters :-)

Oh, in Python 3, you need to remove the 'L' after the constants.
Too bad that it doesn't recognize those old annotations anymore.
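
A Python 3 spelling of the undo functions, with a quick round-trip
check, might look like the following. The logic is unchanged from the
code above; only the 'L' suffixes are gone. The temper() helper and the
sample values are additions for the check and are not part of the
original code.

```python
import random

ALL_BITS_SET = 0xffffffff  # 32 bits all set; no 'L' suffix in Python 3

def undo_bitshift_right_xor(value, shift, mask=ALL_BITS_SET):
    # Start with the `shift` high-order bits, which pass through unchanged.
    decoding_mask = (ALL_BITS_SET << (32 - shift)) & ALL_BITS_SET
    decoded_part = 0
    result = 0
    while decoding_mask > 0:
        decoded_part = (value ^ (decoded_part & mask)) & decoding_mask
        result |= decoded_part
        decoded_part >>= shift
        decoding_mask >>= shift
    return result

def undo_bitshift_left_xor(value, shift, mask=ALL_BITS_SET):
    # Start with the `shift` low-order bits, which pass through unchanged.
    decoding_mask = ALL_BITS_SET >> (32 - shift)
    decoded_part = 0
    result = 0
    while decoding_mask > 0:
        decoded_part = (value ^ (decoded_part & mask)) & decoding_mask
        result |= decoded_part
        decoded_part = (decoded_part << shift) & ALL_BITS_SET
        decoding_mask = (decoding_mask << shift) & ALL_BITS_SET
    return result

def recover_single_state_value(value):
    value = undo_bitshift_right_xor(value, 18)
    value = undo_bitshift_left_xor(value, 15, 0xefc60000)
    value = undo_bitshift_left_xor(value, 7, 0x9d2c5680)
    value = undo_bitshift_right_xor(value, 11)
    return value

def temper(y):
    # The forward MT19937 output transformation (helper for the check only).
    y ^= y >> 11
    y ^= (y << 7) & 0x9d2c5680
    y ^= (y << 15) & 0xefc60000
    y ^= y >> 18
    return y

rng = random.Random(0)
samples = [0, 1, ALL_BITS_SET] + [rng.getrandbits(32) for _ in range(10000)]
assert all(recover_single_state_value(temper(x)) == x for x in samples)
```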

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 12 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> mxODBC Plone/Zope Database Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2015-09-18: PyCon UK 2015 ...                               6 days to go
2015-10-21: Python Meeting Duesseldorf ...                 39 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Sat Sep 12 13:35:48 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 12 Sep 2015 13:35:48 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <55F40D24.8080008@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>	<55F2CF6B.40301@egenix.com>	<CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>	<55F30D04.4010001@egenix.com>	<CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>	<55F33D3E.7000904@egenix.com>	<55F35CB5.6000701@egenix.com>	<CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>	<CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>
 <55F40D24.8080008@egenix.com>
Message-ID: <55F40E14.6000907@egenix.com>

On 12.09.2015 13:31, M.-A. Lemburg wrote:
> On 12.09.2015 05:23, Tim Peters wrote:
>> [Marc-Andre]
>> ...
>>>> While implementing this, I found that there's a bit more trickery
>>>> involved due to the fact that the MT RNG in Python writes the
>>>> 624 words internal state in batches - once every 624 times
>>>> the .getrandbits() function is called.
>>>>
>>>> So you may need up to 624*2 - 1 output values to determine a
>>>> correct array of internal state values.
>>
>> [Tim]
>>> Don't be too sure about that.  From an information-theoretic view,
>>> "it's obvious" that 624 32-bit outputs is enough - indeed, that's 31
>>> more bits than the internal state actually has. You don't need to
>>> reproduce Python's current internal MT state exactly, you only need to
>>> create _a_ MT state that will produce the same values forever after.
>>> Specifically, the index of the "current" vector element is an artifact
>>> of the implementation, and doesn't need to be reproduced.  You're free
>>> to set that index to anything you like in _your_ MT state - the real
>>> goal is to get the same results.
>>
>> Concrete proof of concept.  First code to reconstruct state from 624
>> consecutive 32-bit outputs:
>>
>>     def invert(transform, output, n=100):
>>         guess = output
>>         for i in range(n):
>>             newguess = transform(guess)
>>             if newguess == output:
>>                 return guess
>>             guess = newguess
>>         raise ValueError("%r not invertible in %s tries" %
>>                          (output, n))
>>
>>     t1 = lambda y: y ^ (y >> 11)
>>     t2 = lambda y: y ^ ((y << 7) & 0x9d2c5680)
>>     t3 = lambda y: y ^ ((y << 15) & 0xefc60000)
>>     t4 = lambda y: y ^ (y >> 18)
>>
>>     def invert_mt(y):
>>         y = invert(t4, y)
>>         y = invert(t3, y)
>>         y = invert(t2, y)
>>         y = invert(t1, y)
>>         return y
>>
>>     def guess_state(vec):
>>         assert len(vec) == 624
>>         return (3,
>>                 tuple(map(invert_mt, vec)) + (624,),
>>                 None)
>>
>> Now we can try it:
>>
>>     import random
>>     for i in range(129):
>>         random.random()
>>
>> That loop was just to move MT into "the middle" of its internal
>> vector.  Now grab values:
>>
>>     vec = [random.getrandbits(32) for i in range(624)]
>>
>> Note that the `guess_state()` function above _always_ sets the index
>> to 624.  When it becomes obvious _why_ it does so, all mysteries will
>> vanish ;-)
>>
>> Now create a distinct generator and force its state to the deduced state:
>>
>>     newrand = random.Random()
>>     newrand.setstate(guess_state(vec))
>>
>> And some quick sanity checks:
>>
>>     for i in range(1000000):
>>         assert random.random() == newrand.random()
>>     for i in range(1000000):
>>         assert random.getrandbits(32) == newrand.getrandbits(32)
>>
>> The internal states are _not_ byte-for-byte identical.  But they don't
>> need to be.  The artificial `index` bookkeeping variable allows
>> hundreds of distinct spellings of _semantically_ identical states.
> 
> It's a rolling index, yes, but when creating the vector of output
> values, the complete internal state array will have undergone
> a recalc at one of the iterations.
> 
> The guess_state(vec) function will thus return an internal
> state vector that is half state of the previous recalc run,
> half new recalc run, it is not obvious to me why you would
> still be able to get away with not synchronizing to the next
> recalc in order to have a complete state from the current recalc.
> 
> Let's see...
> 
> The values in the state array are each based on
> 
> a) previous state[i]
> b) state[(i + 1) % 624]
> c) state[(i + 397) % 624]
> 
> Since the calculation is forward looking, your trick will only
> work if you can make sure that i + 397 doesn't wrap around
> into the previous state before you trigger the recalc in
> newrand.
> 
> Which is easy, of course, since you can control the current
> index of newrand and force it to do a recalc with the next
> call to .getrandbits() by setting it to 624.
> 
> Clever indeed :-)
> 
> Here's a better way to do the inversion without guess work:
> 
> # 32-bits all set
> ALL_BITS_SET = 0xffffffffL
> 
> def undo_bitshift_right_xor(value, shift, mask=ALL_BITS_SET):
> 
>     # Set shift high order bits; there's probably a better way to
>     # do this, but this does the trick for now
>     decoding_mask = (ALL_BITS_SET << (32 - shift)) & ALL_BITS_SET
>     decoded_part = 0
>     result = 0
>     while decoding_mask > 0:
>         decoded_part = (value ^ (decoded_part & mask)) & decoding_mask
>         result |= decoded_part
>         decoded_part >>= shift
>         decoding_mask >>= shift
>     return result
> 
> def undo_bitshift_left_xor(value, shift, mask=ALL_BITS_SET):
> 
>     # Set shift low order bits
>     decoding_mask = ALL_BITS_SET >> (32 - shift)
>     decoded_part = 0
>     result = 0
>     while decoding_mask > 0:
>         decoded_part = (value ^ (decoded_part & mask)) & decoding_mask
>         result |= decoded_part
>         decoded_part = (decoded_part << shift) & ALL_BITS_SET
>         decoding_mask = (decoding_mask << shift) & ALL_BITS_SET
>     return result
> 
> def recover_single_state_value(value):
> 
>     value = undo_bitshift_right_xor(value, 18)
>     value = undo_bitshift_left_xor(value, 15, 0xefc60000L)
>     value = undo_bitshift_left_xor(value, 7, 0x9d2c5680L)
>     value = undo_bitshift_right_xor(value, 11)
>     return value
> 
> def guess_state(data):

Hmm, the name doesn't fit anymore, better call it:

  def recover_state(data):

>     if len(data) < 624:
>         raise TypeError('not enough data to recover state')
> 
>     # Only work with the 624 last entries
>     data = data[-624:]
>     state = [recover_single_state_value(x)
>              for x in data]
>     return (3,
>             tuple(state) + (624,),
>             None)
> 
> This is inspired by the work of James Roper, but uses a slightly
> faster approach for the undo functions. Not that it matters much.
> It was fun, that's what matters :-)
> 
> Oh, in Python 3, you need to remove the 'L' after the constants.
> Too bad that it doesn't recognize those old annotations anymore. 

-- 
Marc-Andre Lemburg
eGenix.com


From tim.peters at gmail.com  Sun Sep 13 03:00:17 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 12 Sep 2015 20:00:17 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F40D24.8080008@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <CAEQcUJQ5ZKjvxwZR=rPy8t1oJxd4WxZHcugh-JxgRsZEvYs5fA@mail.gmail.com>
 <55F2CF6B.40301@egenix.com>
 <CAExdVNnxHPi6Y+2YijLQtnCeJRjZJ8cWGTQYDD63Y4vaRJftrg@mail.gmail.com>
 <55F30D04.4010001@egenix.com>
 <CAExdVNnooHgKdQc+KFsHLj=MJM52eHsp+k1OJNPjJVzC=O3S_Q@mail.gmail.com>
 <55F33D3E.7000904@egenix.com> <55F35CB5.6000701@egenix.com>
 <CAExdVNk+rPfgMVqTjJMeAMc7hRy42EQ8z52=N8xxpb_Sa9JJJA@mail.gmail.com>
 <CAExdVNmhJ4J_b3x4cV7u+Yrttb60jaN7-VThqk_4Mk31eQ6yCQ@mail.gmail.com>
 <55F40D24.8080008@egenix.com>
Message-ID: <CAExdVNnA4AfXj47ZyanJAzk_qA_wR5gFAFWj2pAC7dTcpwVbjA@mail.gmail.com>

[Marc-Andre, puzzling over Tim's MT state-recovering hack]
> ...
> Since the calculation is forward looking, your trick will only
> work if you can make sure that i + 397 doesn't wrap around
> into the previous state before you trigger the recalc in
> newrand.
>
> Which is easy, of course, since you can control the current
> index of newrand and force it to do a recalc with the next
> call to .getrandbits() by setting it to 624.
>
> Clever indeed :-)

I'll suggest a different way to look at it:  suppose you wanted to
reproduce the state at _the start_ of the 624 values captured instead.
Well, we'd do exactly the same thing, except set the index to 0.  Then
it's utterly obvious that your MT instance would spit out exactly the
same 624 outputs as the ones captured.  That's all the internals do
when the index starts at 0:  march through the state vector one word
at a time, spitting out the tempered version of whichever 32-bit word
is current.  And increment the index each time (the only _mutation_ of
any part of the MT internals).

At the end of that, the only change to the internals is that the index
would be left at 624.  Which is exactly what "my code" sets it to.  It
acts exactly the same as if we had just finished generating the 624
captured outputs.

Since we (in our heads) just reproduced enough bits to cover the
entire internal state, it must be the case that we'll continue to
reproduce all future outputs too.  The "wrap around" is a red herring
;-)
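
That argument can be run as code. A small sketch, reusing the generic
invert() from earlier in the thread; the seed 2015 and the names
target, captured and replay are arbitrary choices made here:

```python
import random

def invert(transform, output, n=100):
    guess = output
    for i in range(n):
        newguess = transform(guess)
        if newguess == output:
            return guess
        guess = newguess
    raise ValueError("%r not invertible in %s tries" % (output, n))

t1 = lambda y: y ^ (y >> 11)
t2 = lambda y: y ^ ((y << 7) & 0x9d2c5680)
t3 = lambda y: y ^ ((y << 15) & 0xefc60000)
t4 = lambda y: y ^ (y >> 18)

def invert_mt(y):
    for t in (t4, t3, t2, t1):
        y = invert(t, y)
    return y

# The generator being observed, moved into "the middle" of its vector.
target = random.Random(2015)
for _ in range(129):
    target.random()
captured = [target.getrandbits(32) for _ in range(624)]
words = tuple(invert_mt(v) for v in captured)

# Index 0: the new generator replays the captured outputs themselves.
replay = random.Random()
replay.setstate((3, words + (0,), None))
assert [replay.getrandbits(32) for _ in range(624)] == captured

# The only thing that changed is the index, which now sits at 624.
assert replay.getstate()[1] == words + (624,)

# From here on the two generators agree.
for _ in range(10000):
    assert replay.getrandbits(32) == target.getrandbits(32)
```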


> Here's a better way to do the inversion without guess work:

"Better" depends.  Despite the variable named "guess" in the code,
it's not guessing about anything ;-)  It's a single function that
doesn't care (and can't even be told) whether a left or right shift is
being used, what the shift count is, whether a mask is in use, or even
what the word size is.

In those senses it's "better":  it can be used without change for
"anything like this", including, e.g., the 64-bit variant of MT.  Just
paste the C tempering lines into the lambdas.  Nothing about the
inversion function needs to change.

But why it works efficiently is far from obvious.  It _can_ take as
many (but not more) iterations as there are bits in a word, but that's
almost never needed.  IIRC, it can never require more than 8
iterations to invert any of the tempering functions in the 32-bit MT,
and, e.g., always inverts the very weak "lambda y: y ^ (y >> 18)"
32-bit MT transform on the first try.

Nevertheless, you can - as you did - be more efficient by writing
distinct inversion functions for "left shift" and "right shift" cases,
and wiring in the word size.  But the expense of deducing the state is
just plain trivial here either way.  We're not consuming days or hours
here, we're not even consuming an appreciable fraction of a second at
Python speed :-)
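
For anyone who wants to see why so few iterations are needed: each
tempering step is a permutation of the 32-bit words with very short
cycles, and invert() just walks a cycle until it closes. A quick
empirical sketch (cycle_length() and the sample size are choices made
here, not anything from the MT sources):

```python
import random

t1 = lambda y: y ^ (y >> 11)
t2 = lambda y: y ^ ((y << 7) & 0x9d2c5680)
t3 = lambda y: y ^ ((y << 15) & 0xefc60000)
t4 = lambda y: y ^ (y >> 18)

def cycle_length(transform, y):
    """Number of applications of transform needed to get back to y."""
    count, z = 1, transform(y)
    while z != y:
        z = transform(z)
        count += 1
    return count

rng = random.Random(1)
samples = [rng.getrandbits(32) for _ in range(20000)]

longest = {}
for name, t in [("t1", t1), ("t2", t2), ("t3", t3), ("t4", t4)]:
    longest[name] = max(cycle_length(t, y) for y in samples)

# No cycle is longer than 8, so the inverse is never far away.
assert max(longest.values()) <= 8
# Shifting right by 18 twice clears a 32-bit word, so t4 undoes itself.
assert all(t4(t4(y)) == y for y in samples)
print(longest)
```

If a cycle has length k, then applying the transform k-1 times to an
output yields the input that produced it, which is what invert()
returns.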

From stephen at xemacs.org  Sun Sep 13 03:53:05 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 13 Sep 2015 10:53:05 +0900
Subject: [Python-ideas] Round division
In-Reply-To: <CAP7+vJK5Vqi_+SXQtzi0GeMFw3-XMT4eG-7OjWyqnt21LUL6rA@mail.gmail.com>
References: <mssnir$uol$1@ger.gmane.org>
 <CAG3cHaY80SdXfWAmFb6xgcYCSesefA0dUyO-eR=qSO2+8R4ERw@mail.gmail.com>
 <CACac1F_sGOS2uXmyjtkfP93g98A_Mg=u2k_xw-vdWSjPx5wkDw@mail.gmail.com>
 <mst0rv$2r1$1@ger.gmane.org>
 <20150911031304.GT19373@ando.pearwood.info>
 <msts84$esq$1@ger.gmane.org> <msuute$cmr$1@ger.gmane.org>
 <CAP7+vJK5Vqi_+SXQtzi0GeMFw3-XMT4eG-7OjWyqnt21LUL6rA@mail.gmail.com>
Message-ID: <87mvwrytj2.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:
 > On Fri, Sep 11, 2015 at 9:18 AM, Emile van Sebille <emile at fenx.com> wrote:

 > > Wow -- I'm glad I work predominately in business environments and keep
 > > amounts in pennies.  The only time I need to round anything is to the
 > > nearest cent.
 > >
 > 
 > I thought any programmer worth their salt would round down (i.e. trunc())
 > and transfer the fractional penny to their own account? :-)

Hate to tell you, but the accountants and even the SEC caught on to
that one four decades ago.

From stephen at xemacs.org  Sun Sep 13 17:47:31 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 14 Sep 2015 00:47:31 +0900
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
Message-ID: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>

Tim Peters writes:
 > [M.-A. Lemburg]
 > >> I'm pretty sure people doing crypto will know and most others
 > >> simply don't care :-)
 > 
 > [Stephen J. Turnbull <stephen at xemacs.org>]
 > > Which is why botnets have millions of nodes.
 > 
 > I'm not a security wonk, but I'll bet a life's salary ;-) we'd have
 > botnets just as pervasive if every non-crypto RNG in the world were
 > banned - or had never existed.

I am in violent agreement with you on that point and most others.[1]
However, the analogy was not intended to be so direct as to imply that
"insecure" RNGs are responsible for botnets, merely that not caring
is.  I agree I twisted MAL's words a bit -- he meant most people have
no technical need for crypto, and so don't care, I suppose.  But then,
"doing crypto" (== security) is like "speaking prose": a lot of folks
doing it don't realize that's what they're doing -- and they don't
care, either.

 > So long as end users are allowed to run programs, that problem will
 > never go away.

s/users/programmers/ and s/run/write/, and we get a different analogy
that is literally correct -- but fails in an important dimension.  One
user's mistake adds one node to the botnet, and that's about as bad as
one user's mistake gets in terms of harm to third parties.  But one
programmer's (or system administrator's) mistake can put many, perhaps
millions, at risk.

Personally I doubt that justifies an API break here, even if we can
come up with attacks where breaking the PRNG would be cost-effective
compared to "social engineering" or theft of physical media.  I think
it does justify putting quite a bit of thought into ways to make it
easier for naive programmers to do the "safe" thing even if they
technically don't need it.

I will say that IMO the now-traditional API was a very unfortunate
choice.  If you have a CSPRNG that just generates "uniform random
numbers" and has no user-visible APIs for getting or setting state,
it's immediately obvious to the people who know they need access to
state what they need to do -- change "RNG" implementation.  The most
it might cost them is rerunning an expensive base case simulation with
a more appropriate implementation that provides the needed APIs.

On the other hand, if you have something like the MT that "shouldn't
be allowed anywhere near a password", it's easy to ignore the state
access APIs, and call it the same way that you would call a CSPRNG.
In fact that's what's documented as correct usage, as Paul Moore
points out.  Thus, programmers who are using a PRNG whose parameters
can be inferred from its output, and should not be doing so, generally
won't know it until the (potentially widespread) harm is done.  It
would be nice if it wasn't so easy for them to use the MT.


Footnotes: 
[1]  I think "agree with Tim" is a pretty safe default.<wink/>


From njs at pobox.com  Mon Sep 14 04:54:51 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 13 Sep 2015 19:54:51 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mssfcj$les$1@ger.gmane.org>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
 <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
 <CAPJVwBn+tWOtPPt+UqwGwYaRqozAZtU2xTdVhuZUaRvJnePGXQ@mail.gmail.com>
 <mssfcj$les$1@ger.gmane.org>
Message-ID: <CAPJVwB=UZYj3Puj4AUcfJ+x_ktgBsx4Vb4HowuWk2-S4LSigkA@mail.gmail.com>

[This is getting fairly off-topic for python-ideas (since AFAICT there
is no particular reason right now to add a new deterministic generator
to the stdlib), so CC'ing numpy-discussion and I'd suggest followups
be directed to there alone.]

On Thu, Sep 10, 2015 at 10:41 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On 2015-09-10 04:56, Nathaniel Smith wrote:
>>
>> On Wed, Sep 9, 2015 at 8:35 PM, Tim Peters <tim.peters at gmail.com> wrote:
>>>
>>> There are some clean and easy approaches to this based on
>>> crypto-inspired schemes, but giving up crypto strength for speed.  If
>>> you haven't read it, this paper is delightful:
>>>
>>>      http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
>>
>>
>> It really is! As AES acceleration instructions become more common
>> (they're now standard IIUC on x86, x86-64, and even recent ARM?), even
>> just using AES in CTR mode becomes pretty compelling -- it's fast,
>> deterministic, provably equidistributed, *and* cryptographically
>> secure enough for many purposes.
>
>
> I'll also recommend the PCG paper (and algorithm) as the author's
> cross-PRNGs comparisons have been bandied about in this thread already. The
> paper lays out a lot of the relevant issues and balances the various
> qualities that are important: statistical quality, speed, and security (of
> various flavors).
>
>   http://www.pcg-random.org/paper.html
>
> I'm actually not that impressed with Random123. The core idea is nice and
> clean, but the implementation is hideously complex.

I'm curious what you mean by this? In terms of the code, or...?

I haven't looked at the code, but the simplest generator they
recommend in the paper is literally just

def rng_stream(seed):
    counter = 0
    while True:
        # AES128 takes a 128 bit key and 128 bits of data and returns
        # 128 bits of encrypted data
        val = AES128(key=seed, data=counter)
        yield low_64_bits(val)
        yield high_64_bits(val)
        counter += 1

which gives a 64-bit generator with a period of 2^129. They benchmark
it as faster than the Mersenne Twister on modern CPUs (<2
cycles-per-byte on recent x86, x86-64, ARMv8), it requires less
scratch space, is incredibly simple to work with -- you can
parallelize it, get independent random streams, etc., in a totally
trivial way -- and has a *way* stronger guarantee of
random-looking-ness than merely passing TestU01.
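
To make the "totally trivial" part concrete, here is a runnable sketch
of the same counter-mode shape. Big caveat: the stdlib has no AES, so
block128() below substitutes a keyed BLAKE2b hash truncated to 128 bits
for AES128. That is purely a stand-in chosen here; it is not what the
paper uses, and unlike AES it is not a permutation of the counter, so
the equidistribution guarantee does not carry over. The names
block128, MASK64 and the start parameter are also inventions for this
example.

```python
import hashlib
from itertools import islice

MASK64 = (1 << 64) - 1

def block128(key, counter):
    """Stand-in for AES128(key=..., data=counter): keyed 128-bit function."""
    data = counter.to_bytes(16, "little")
    digest = hashlib.blake2b(data, digest_size=16, key=key).digest()
    return int.from_bytes(digest, "little")

def rng_stream(key, start=0):
    counter = start
    while True:
        val = block128(key, counter)
        yield val & MASK64      # low 64 bits
        yield val >> 64         # high 64 bits
        counter = (counter + 1) % (1 << 128)

key = b"example seed"
first = list(islice(rng_stream(key), 10))

# Jumping ahead is just a matter of choosing the starting counter,
# which is what makes parallel use so easy.
later = list(islice(rng_stream(key, start=3), 4))
assert later == first[6:10]

# A different key gives an unrelated stream.
other = list(islice(rng_stream(b"another seed"), 10))
assert other != first
```

The whole generator state is (key, counter), so handing each worker its
own key or its own counter range gives separate streams without any
coordination.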

The downsides are that it does still require 176 bytes of read-only
scratch storage (used to cache an expanded version of the "key"), the
need for a modern CPU to get that speed, and that it doesn't play well
with GPUs. So they also provide a set of three more ad hoc generators
designed to solve these problems. I'm not as convinced about these,
but hey, they pass TestU01...

The PCG paper does a much better job of all the other stuff *around*
making a good RNG -- it has the nice website, clear comparisons, nice
code, etc. -- which is definitely important. But to me the increase in
speed and reduction in memory use doesn't seem worth it given how fast
these generators are to start with + the nice properties of counter
mode + the cryptographic guarantees of randomness that you get from
AES, for code that's generally targeting non-embedded non-GPU systems.

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From tim.peters at gmail.com  Mon Sep 14 05:34:58 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 13 Sep 2015 22:34:58 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwB=UZYj3Puj4AUcfJ+x_ktgBsx4Vb4HowuWk2-S4LSigkA@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
 <CAPJVwBnOqY3XcAtuRS7en956qfZHL1_fin-e7Pb+2CWQk2dftg@mail.gmail.com>
 <CAExdVN=tEtoh6Dx+7XCQ-nwv1f7O+ALAvSLvLpT4NQnzyK0Z+A@mail.gmail.com>
 <CAPJVwBn+tWOtPPt+UqwGwYaRqozAZtU2xTdVhuZUaRvJnePGXQ@mail.gmail.com>
 <mssfcj$les$1@ger.gmane.org>
 <CAPJVwB=UZYj3Puj4AUcfJ+x_ktgBsx4Vb4HowuWk2-S4LSigkA@mail.gmail.com>
Message-ID: <CAExdVNkL6d5ocRMBvj0c8re0Y5FWp__UFr5G-LybqU57Wmx-xg@mail.gmail.com>

[Robert Kern <robert.kern at gmail.com>]
>> ...
>> I'll also recommend the PCG paper (and algorithm) as the author's
>> cross-PRNGs comparisons have been bandied about in this thread already. The
>> paper lays out a lot of the relevant issues and balances the various
>> qualities that are important: statistical quality, speed, and security (of
>> various flavors).
>>
>>   http://www.pcg-random.org/paper.html
>>
>> I'm actually not that impressed with Random123. The core idea is nice and
>> clean, but the implementation is hideously complex.

[Nathaniel Smith <njs at pobox.com>]
> I'm curious what you mean by this? In terms of the code, or...?
>
> I haven't looked at the code, but the simplest generator they
> recommend in the paper is literally just
>
> def rng_stream(seed):
>     counter = 0
>     while True:
>         # AES128 takes a 128 bit key and 128 bits of data and returns
>         # 128 bits of encrypted data
>         val = AES128(key=seed, data=counter)
>         yield low_64_bits(val)
>         yield high_64_bits(val)
>         counter += 1

I assume it's because if you expand what's required to _implement_
AES128() in C, it does indeed look pretty hideously complex.  On HW
implementing AES primitives, of course the code can be much simpler.

But to be fair, if integer multiplication and/or addition weren't
implemented in HW, and we had to write C code emulating them via
bit-level fiddling, the code for any of the PCG algorithms would look
hideously complex too.

But being fair isn't much fun ;-)

From tim.peters at gmail.com  Mon Sep 14 07:29:52 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 14 Sep 2015 00:29:52 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>

...

[Tim]
>> I'm not a security wonk, but I'll bet a life's salary ;-) we'd have
>> botnets just as pervasive if every non-crypto RNG in the world were
>> banned - or had never existed.

[Stephen J. Turnbull <stephen at xemacs.org>]
> I am in violent agreement with you on that point and most others.[1]
> However, the analogy was not intended to be so direct as to imply that
> "insecure" RNGs are responsible for botnets, merely that not caring
> is.  I agree I twisted MAL's words a bit -- he meant most people have
> no technical need for crypto, and so don't care, I suppose.  But then,
> "doing crypto" (== security) is like "speaking prose": a lot of folks
> doing it don't realize that's what they're doing -- and they don't
> care, either.

I don't know that it's true, though.  Crypto wonks are like lawyers
that way, always worrying about the worst possible case "in theory".
In my personal life, I've had to tell lawyers "enough already - I'm
not paying another N thousand dollars to insert another page about
what happens in case of nuclear war".  Crypto wonks have no limit on
the costs they'd like to impose either - a Security State never does.

>> So long as end users are allowed to run programs, that problem will
>> never go away.

> s/users/programmers/ and s/run/write/, and we get a different analogy
> that is literally correct -- but fails in an important dimension.  One
> user's mistake adds one node to the botnet, and that's about as bad as
> one user's mistake gets in terms of harm to third parties.

Not really.  The best social engineering is for a bot to rummage
through your email address book and send copies of itself to people
you know, appearing to be a thoroughly legitimate email from you.  Add
a teaser to invite the recipient to click on the attachment, and
response rate can be terrific.

And my ISP (like many others) happens to provide a free
industrial-strength virus/malware scanner/cleaner program.  I doubt
that's because they actually care about me ;-)  Seems more likely they
don't want to pay the costs of hosting millions of active bots.

> But one programmer's (or system administrator's) mistake can put many,
> perhaps millions, at risk.

What I question is whether this has anything _plausible_ to do with
Python's PRNG.


> Personally I doubt that justifies an API break here, even if we can
> come up with attacks where breaking the PRNG would be cost-effective
> compared to "social engineering" or theft of physical media.  I think
> it does justify putting quite a bit of thought into ways to make it
> easier for naive programmers to do the "safe" thing even if they
> technically don't need it.

I remain unclear on what "the danger" is thought to be, such that
replacing with a CSPRNG could plausibly prevent it.  For example, I
know how to deduce the MT's internal state from outputs (and posted
working code for doing so, requiring a small fraction of a second
given 624 consecutive 32-bit outputs).  But it's not an easy problem
_unless_ you have 624 consecutive 32-bit outputs.  It's far beyond the
ken of script kiddies.  If it's the NSA, they can demand you turn over
everything anyway, or just plain steal it ;-)
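
For readers who have not seen this kind of state recovery, here is a minimal sketch. It is not necessarily the code referred to above; it assumes CPython's `random` module, where `getrandbits(32)` returns one whole MT output and `setstate()` accepts a `(3, <625-tuple>, None)` state. The helper names are made up for the example.

```python
import random

def undo_right(y, shift):
    """Invert x ^= x >> shift on a 32-bit value."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def undo_left(y, shift, mask):
    """Invert x ^= (x << shift) & mask on a 32-bit value."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x

def untemper(y):
    """Map one 32-bit MT output back to the state word that produced it."""
    y = undo_right(y, 18)
    y = undo_left(y, 15, 0xEFC60000)
    y = undo_left(y, 7, 0x9D2C5680)
    y = undo_right(y, 11)
    return y

def clone_mt(outputs):
    """Build a Random that continues the stream after 624 consecutive outputs."""
    assert len(outputs) == 624
    # Index 624 tells the generator to regenerate its block on the next call.
    state = tuple(untemper(y) for y in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone

victim = random.Random()            # default seeding from the OS
seen = [victim.getrandbits(32) for _ in range(624)]
clone = clone_mt(seen)
print(clone.getrandbits(32) == victim.getrandbits(32))
```

The whole attack depends on seeing complete, consecutive 32-bit outputs; with truncated or non-consecutive outputs this simple inversion no longer applies.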

Consider one of these "password" examples:

    import string, random
    alphabet = string.ascii_letters + string.digits
    print(random.choice(alphabet))

Suppose that prints 'c'.  What have you learned?  Surprisingly,
perhaps, very little.  You learned that one 32-bit output of MT had
0b000010 as its first 6 bits.  You know nothing about its other 26
bits.  And you don't know _which_ MT 32-bit output:  internally,
.choice() consumes as many 32-bit outputs as it needs until it finds
one whose first six bits are less than 62 (len(alphabet)).  So all
you've learned about MT is that, at the time .choice() was called:

   - 0 or more 32-bit outputs x were such that (x >> 26) >= 62.
   - Then one 32-bit output x had (x >> 26) == 2.

This isn't much to go on.  To deduce the whole state easily, you need
to know 19,968 consecutive output bits (624*32).  Get more and more
letters from the password generator, and you learn more and more about
the first 6 bits of an unknowable number of MT outputs consumed, but
learn nothing whatsoever about any of the lower 26 bits of any of
them, and learn nothing but a range constraint on the first 6 bits of
an unknowable number of outputs that were skipped over.
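
The consumption pattern described above can be modelled in a few lines. This is a sketch that relies on CPython implementation details (`choice()` drawing `getrandbits(k)` with `k = len(seq).bit_length()`, and `getrandbits(k)` being the top `k` bits of one 32-bit MT output); `choice_with_trace` is a hypothetical helper, not part of the `random` module.

```python
import random
import string

alphabet = string.ascii_letters + string.digits   # 62 symbols

def choice_with_trace(rng, seq):
    """Mimic choice() and record the outputs it silently discarded."""
    k = len(seq).bit_length()       # 6 for a 62-symbol alphabet
    skipped = []
    while True:
        x = rng.getrandbits(32)     # one full MT output
        top = x >> (32 - k)         # the only bits choice() looks at
        if top < len(seq):
            return seq[top], skipped
        skipped.append(x)           # invisible to an observer

seed = 2015
letter, skipped = choice_with_trace(random.Random(seed), alphabet)
same = random.Random(seed).choice(alphabet)
print(letter == same, len(skipped))
```

An observer who sees only `letter` learns the top 6 bits of one output and nothing about `len(skipped)` or the remaining 26 bits.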

Sure, every clue reveals _something_.  In theory ;-)  Note that, as
explained earlier in these messages, Python's default _seeding_ of MT
is already computationally infeasible to attack (urandom() is already
used to set the entire massive internal state).  _That_ I did worry
about in the past.  So in the above I'm not worried at all that an
attacker exploited poor default seeding to know there were only a few
billion possible MT states _before_ `c` was generated.  All MT states
are possible, and MT's state space is large beyond comprehension (let
alone calculation).


Would the user _really_ be better off using .urandom()?  I don't know.
Since a crypto wonk will rarely recommend doing anything _other_ than
using urandom() directly, I bet they'd discourage using .choice() at
all, even if it is built on urandom().  Then the user will write their
own implementation of .choice(), something like:

    u = int.from_bytes(urandom(n), "big") # for some n
    letter = alphabet[int(u / 2.0**(8*n) * len(alphabet))]

If they manage to get that much right, _now_ they've introduced a
statistical bias unless len(alphabet) is a power of 2.  If we're
assuming they're crypto morons, chances are good they're not rock
stars at this kind of thing either ;-)
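
To make the bias concrete, take n = 1 and the 62-symbol alphabet from the earlier example. One byte has 256 values and 256 = 4*62 + 8, so 8 letters are reachable from 5 byte values while the other 54 are reachable from only 4. The sketch below enumerates every byte value rather than sampling, treating `u` as the integer value of the byte:

```python
from collections import Counter
import string

alphabet = string.ascii_letters + string.digits   # 62 symbols
n = 1                                             # one byte from urandom()

# Enumerate every possible value of that byte instead of sampling.
counts = Counter(
    alphabet[int(u / 2.0 ** (8 * n) * len(alphabet))]
    for u in range(2 ** (8 * n))
)

# Map "number of byte values that reach a letter" -> "how many such letters".
hits = Counter(counts.values())
print(hits)
```

Larger n shrinks the relative bias but never removes it unless len(alphabet) divides 2**(8*n).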


> I will say that IMO the now-traditional API was a very unfortunate
> choice.

Ah, but I remember 1990 ;-)  Python's `random` API was far richer than
"the norm" in languages like C and FORTRAN at the time.  It was a
delight!  Judging it by standards that didn't become trendy until much
later is only fair now ;-)


> If you have a CSPRNG that just generates "uniform random
> numbers" and has no user-visible APIs for getting or setting state,
> it's immediately obvious to the people who know they need access to
> state what they need to do -- change "RNG" implementation.

I don't recall any language _at the time_ that did so.


> The most it might cost them is rerunning an expensive base
> case simulation with a more appropriate implementation that
> provides the needed APIs.

As above, I'm not sure a real crypto wonk would endorse a module that
provided any more than a bare-bones API, forcing the user to build
everything else out of one or two core primitives.  Look at, e.g., how
tiny the OpenBSD arc4random API is.  I do applaud it for offering

     arc4random_uniform(uint32_t upper_bound)

That's exactly what's needed to, e.g., build a bias-free .choice()
(provided you have fewer than 2**32-1 elements to choose from).
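
A sketch of the same idea in Python, assuming `os.urandom()` as the source of raw bits. The names `uniform_below` and `secure_choice` are hypothetical; the rejection step (discard the low `2**32 % upper_bound` values, then reduce) is in the spirit of `arc4random_uniform`, not a copy of the OpenBSD code.

```python
import os
import string

def uniform_below(upper_bound):
    """Return an unbiased integer in range(upper_bound), 0 < upper_bound <= 2**32."""
    # Reject the low (2**32 % upper_bound) values so the remaining
    # range is an exact multiple of upper_bound.
    cutoff = 2**32 % upper_bound
    while True:
        r = int.from_bytes(os.urandom(4), "big")
        if r >= cutoff:
            return r % upper_bound

def secure_choice(seq):
    """Bias-free choice() built on uniform_below()."""
    return seq[uniform_below(len(seq))]

alphabet = string.ascii_letters + string.digits
password = "".join(secure_choice(alphabet) for _ in range(12))
```

In practice `random.SystemRandom().choice(seq)` already provides an urandom-backed `choice()`, so hand-rolling this is mainly useful for seeing where the bias would otherwise come from.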


> On the other hand, if you have something like the MT that "shouldn't
> be allowed anywhere near a password",

As above, I think that's a weak claim.  The details matter.  As an
example of a strong claim: you should never, ever use MT to produce
the keystream for a stream cipher.  But only a crypto wonk would be
trying to generate a keystream to begin with.  Or a user who did use
MT for that purpose is probably so clueless they'd forget the xor and
just send the plaintext ;-)


> it's easy to ignore the state access APIs, and call it the same way
> that you would call a CSPRNG. In fact that's what's documented as
> correct usage, as Paul Moore points out.  Thus, programmers who
> are using a PRNG whose parameters can be inferred from its output,
> and should not be doing so, generally won't know it until the
> (potentially widespread) harm is done.  It would be nice if it wasn't
> so easy for them to use the MT.

And yet nobody so far has produced a single example of any harm done
in any of the near-countless languages that supply non-crypto RNGs.  I
know, my lawyer gets annoyed too when I point out there hasn't been a
nuclear war ;-)

Anyway, if people want to pursue this, I suggest adding a _new_ module
doing exactly whatever it is certified crypto experts say is
necessary.  We can even give it a name shorter than "random" to
encourage its use.  That's all most users really care about anyway ;-)


> Footnotes:
> [1]  I think "agree with Tim" is a pretty safe default.<wink/>

It's not always optimal, but, yes, you could do worse ;-)

From njs at pobox.com  Mon Sep 14 08:38:25 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 13 Sep 2015 23:38:25 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
Message-ID: <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>

On Sun, Sep 13, 2015 at 10:29 PM, Tim Peters <tim.peters at gmail.com> wrote:
> [Stephen J. Turnbull <stephen at xemacs.org>]
>> it's easy to ignore the state access APIs, and call it the same way
>> that you would call a CSPRNG. In fact that's what's documented as
>> correct usage, as Paul Moore points out.  Thus, programmers who
>> are using a PRNG whose parameters can be inferred from its output,
>> and should not be doing so, generally won't know it until the
>> (potentially widespread) harm is done.  It would be nice if it wasn't
>> so easy for them to use the MT.
>
> And yet nobody so far has produced a single example of any harm done
> in any of the near-countless languages that supply non-crypto RNGs.  I
> know, my lawyer gets annoyed too when I point out there hasn't been a
> nuclear war ;-)

Here you go:
  https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

They present real-world attacks on PHP applications that use something
like the "password generation" code we've been talking about as a way
to generate cookies and password reset nonces, including in particular
the case of applications that use a strongly-seeded Mersenne Twister
as their RNG:

"We develop a suite of new techniques and tools to mount attacks
against all PRNGs of the PHP core system even when it is hardened with
the Suhosin patch [which adds strong seeding] and apply our techniques
to create practical exploits for a number of the most popular PHP
applications (including Mediawiki, Gallery, osCommerce and Joomla)
focusing on the password reset functionality. Our exploits allow an
attacker to completely take over arbitrary user accounts."

"Section 5.3: ... In this section we give a description of the
Mersenne Twister generator and present an algorithm that allows the
recovery of the internal state of the generator even when the output
is truncated. Our algorithm also works in the presence of non
consecutive outputs ..."

Out of curiosity, I tried searching github for "random cookie
language:python". The 5th hit (out of ~100k) was a web project that
appears to use this insecure method to generate session cookies:
  https://github.com/bytasv/bbapi/blob/34e294becb22bae6e685f2e742b7ffdb53a83bcb/bbapi/utils/cookie.py
  https://github.com/bytasv/bbapi/blob/34e294becb22bae6e685f2e742b7ffdb53a83bcb/bbapi/api/router.py#L56-L66
(Fortunately this project doesn't appear to actually have any login or
permissions functionality, so I don't think this is an actual
CVE-worthy bug, but that's just luck -- I'm sure there are plenty of
projects that start out looking like this one and then add security
features without revisiting how they generate session ids.)
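
For comparison, here is a sketch of the pattern in question next to two urandom-backed alternatives. The function names are invented for this example and are not taken from the linked project.

```python
import binascii
import os
import random
import string

alphabet = string.ascii_letters + string.digits

def weak_session_id(length=32):
    """Driven by the global Mersenne Twister; not meant to be unguessable."""
    return "".join(random.choice(alphabet) for _ in range(length))

def strong_session_id(length=32):
    """Same interface, but every draw comes from os.urandom()."""
    sysrand = random.SystemRandom()
    return "".join(sysrand.choice(alphabet) for _ in range(length))

def strong_token_hex(nbytes=16):
    """Alternative: hex-encode raw bytes from the OS."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")

print(weak_session_id(), strong_session_id(), strong_token_hex())
```

The weak and strong versions look identical to a caller, which is exactly why the difference tends to go unnoticed until it matters.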

There's a reason security people are so Manichean about these kinds of
things. If something is not intended to be secure or used in
security-sensitive ways, then fine, no worries. But if it is, then
there's no point in trying to mess around with "probably mostly
secure" -- either solve the problem right or don't bother. (See also:
the time Python wasted trying to solve hash randomization without
actually solving hash randomization [1].) If Tim Peters can get fooled
into thinking something like using MT to generate session ids is
"probably mostly secure", then what chance do the rest of us have?
<wink>

NB this isn't an argument for *whether* we should make random
cryptographically strong by default; it's just an argument against
wasting time debating whether it's already "secure enough". It's not
secure. Maybe that's okay, maybe it's not.

For the record though I do tend to agree with the idea that it's not
okay, because it's an increasingly hostile world out there, and
secure-random-by-default makes whole classes of these issues just
disappear. It's not often that you get to fix thousands of bugs with
one commit, including at least some with severity level "all your
users' private data just got uploaded to bittorrent".

I like Nick's proposal here:
    https://code.activestate.com/lists/python-ideas/35842/
as probably the most solid strategy for implementing that idea -- the
only projects that would be negatively affected are those that are
using the seeding functionality of the global random API, which is a
tiny fraction, and the effect on those projects is that they get
nudged into using the superior object-oriented API.
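
Whatever the details of that proposal, the object-oriented API mentioned here is the existing `random.Random` class. A small sketch of a reproducible simulation that never touches the global generator (the `simulate` function is hypothetical):

```python
import random

def simulate(seed, steps=5):
    """Run a toy simulation on a private, explicitly seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(steps)]

run1 = simulate(42)
random.random()                     # unrelated use of the global generator
run2 = simulate(42)
print(run1 == run2)
```

Code written this way keeps its reproducibility regardless of what the module-level functions are backed by.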

-n

[1] https://lwn.net/Articles/574761/

-- 
Nathaniel J. Smith -- http://vorpus.org

From stephen at xemacs.org  Mon Sep 14 10:30:47 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 14 Sep 2015 17:30:47 +0900
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
Message-ID: <878u89z9l4.fsf@uwakimon.sk.tsukuba.ac.jp>

Tim Peters writes:

 > > "doing crypto" (== security) is like "speaking prose": a lot of folks
 > > doing it don't realize that's what they're doing -- and they don't
 > > care, either.
 > 
 > I don't know that it's true, though.  Crypto wonks are like lawyers
 > that way, always worrying about the worst possible case "in
 > theory".

Well, my worst possible case "in theory" was that a documented MTA
parameter would simply not be implemented and not error when I
configured it to a non-default value -- but that's how yours truly
ended up running an open relay (Smail 3.1.100 I think it was, and I
got it from Debian so it wasn't like I was using alpha code).  That's
what taught me to do functional tests. :-)

So yes, I do think there are a lot of folks out there working with
software without realizing that there are any risks involved.  Life
being life, I'd bet on some of them being programmers working with RNGs.

 > In my personal life, I've had to tell lawyers "enough already - I'm
 > not paying another N thousand dollars to insert another page about
 > what happens in case of nuclear war".

But see, that's my main point.  Analogies to *anybody's* personal life
are irrelevant when we're talking about a bug that could be fixed
*once* and save *millions* of users from being exploited.  If the
wonks are right, it's a big deal, big enough to balance the low
probability of them being right. ;-)

 > The best social engineering is for a bot to rummage through your
 > email address book and send copies of itself to people you know,
 > appearing to be a thoroughly legitimate email from you.  Add a
 > teaser to invite the recipient to click on the attachment, and
 > response rate can be terrific.

Sure, but that's not what happened at AOL and Yahoo! AFAIK (of course
they're pretty cagey about details).  It seems that a single leak or a
small number of leaks at each company exposed millions of address
books.  (I hasten to add that I doubt the Mersenne Twister had
anything to do with the leaks.)

 > What I question is whether this has anything _plausible_ to do with
 > Python's PRNG.

Me too.  People who claim some expertise think so, though.

 > Would the user _really_ be better off using .urandom()?  I don't know.
 > Since a crypto wonk will rarely recommend doing anything _other_ than
 > using urandom() directly, I bet they'd discourage using .choice() at
 > all,

That's not unfair, but if they did, I'd go find myself another crypto
wonk.  But who cares about me?  What matters is that Guido would, too.

 > Judging [the random module] by standards that didn't become trendy
 > until much later is only fair now ;-)

You're not the only one who, when offered a choice between fair and
fun, chooses the latter. ;-)

 > We can even give it a name shorter than "random" to encourage its
 > use.  That's all most users really care about anyway ;-)

That's beyond "unfair"!


From mal at egenix.com  Mon Sep 14 12:37:52 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Sep 2015 12:37:52 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>	<etPan.55f06fd9.71794aea.31bc@Draupnir.home>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>	<87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
Message-ID: <55F6A380.4070609@egenix.com>

On 14.09.2015 08:38, Nathaniel Smith wrote:
> If Tim Peters can get fooled
> into thinking something like using MT to generate session ids is
> "probably mostly secure", then what chance do the rest of us have?
> <wink>

I don't think that Tim can get fooled into believing he is a
crypto wonk ;-)

The thread reveals another misunderstanding:

 Broken code doesn't get any better when you change the context
 in which it is run.

By fixing the RNG used in such broken code and making it
harder to run attacks, you are only changing the context in which
the code is run. The code itself still remains broken.

Code which uses the output from an RNG as session id without adding
any additional security measures is broken, regardless of what kind
of RNG you are using. I bet such code will also take any session id
it receives as cookie and trust it without applying extra checks
on it.

Rather than trying to fix up the default RNG in Python by replacing
it with a crypto RNG, it's better to open bug reports to get the
broken software fixed.

Replacing the default Python RNG with a new unstudied crypto one,
will likely introduce problems into working code which rightly
assumes the proven statistical properties of the MT.

Just think of the consequences of adding unwanted bias to simulations.
This is far more likely to go unnoticed than a session hijack due
to a broken system and can easily cost millions (or earn you
millions - it's all probability after all :-)).

Now, pointing people who write broken code to a new module which
provides a crypto RNG probably isn't much better either. They'd feel
instantly secure because it says "crypto" on the box and forget
about redesigning their insecure protocol as well. Nothing much you
can do about that, I'm afraid.

Too easy sometimes is too easy indeed ;-)

-- 
Marc-Andre Lemburg
eGenix.com


From njs at pobox.com  Mon Sep 14 14:26:50 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 14 Sep 2015 05:26:50 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F6A380.4070609@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
Message-ID: <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>

On Mon, Sep 14, 2015 at 3:37 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 14.09.2015 08:38, Nathaniel Smith wrote:
>> If Tim Peters can get fooled
>> into thinking something like using MT to generate session ids is
>> "probably mostly secure", then what chance do the rest of us have?
>> <wink>
>
> I don't think that Tim can get fooled into believing he is a
> crypto wonk ;-)
>
> The thread reveals another misunderstanding:
>
>  Broken code doesn't get any better when you change the context
>  in which it is run.

As an aphorism this sounds nice, but logically it makes no sense. If
the broken thing about your code is that it assumes that the output of
the RNG is unguessable, and you change the context by making the
output of the RNG unguessable, then now the code isn't broken.

The code would indeed remain broken when run under e.g. older
interpreters, but this is not an argument that we should make sure
that it stays broken in the future.

> By fixing the RNG used in such broken code and making it
> harder to run attacks, you are only changing the context in which
> the code is run. The code itself still remains broken.
>
> Code which uses the output from an RNG as session id without adding
> any additional security measures is broken, regardless of what kind
> of RNG you are using. I bet such code will also take any session id
> it receives as cookie and trust it without applying extra checks
> on it.

Yes, that's... generally the thing you do with session cookies?
They're shared secret strings that you use as keys into some sort of
server-side session database? What extra checks need to be applied?

> Rather than trying to fix up the default RNG in Python by replacing
> it with a crypto RNG, it's better to open bug reports to get the
> broken software fixed.
>
> Replacing the default Python RNG with a new unstudied crypto one,
> will likely introduce problems into working code which rightly
> assumes the proven statistical properties of the MT.
>
> Just think of the consequences of adding unwanted bias to simulations.
> This is far more likely to go unnoticed than a session hijack due
> to a broken system and can easily cost millions (or earn you
> millions - it's all probability after all :-)).

I'm afraid you just don't understand what you're talking about here.

When it comes to adding bias to simulations, all crypto RNGs have
*better* statistical properties than MT. A crypto RNG which was merely
as statistically-well-behaved as MT would be considered totally
broken, because MT doesn't even pass black-box tests of randomness
like TestU01.

> Now, pointing people who write broken code to a new module which
> provides a crypto RNG probably isn't much better either. They'd feel
> instantly secure because it says "crypto" on the box and forget
> about redesigning their insecure protocol as well. Nothing much you
> can do about that, I'm afraid.

Yes, improving the RNG only helps with some problems, not others; it
might merely make a system harder to attack, rather than impossible to
attack. But giving people unguessable random numbers by default does
solve real problems.

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From jbvsmo at gmail.com  Mon Sep 14 14:26:33 2015
From: jbvsmo at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Bernardo?=)
Date: Mon, 14 Sep 2015 09:26:33 -0300
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F6A380.4070609@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
Message-ID: <CAOyAWghtn_-1_0kUQU56DQbnZAUvBGKPPp-tFjDghh+0j5wrWA@mail.gmail.com>

Quick fix!
The problem with MT would be someone having all 624 32-bit numbers from
the state. So, every now and then, the random generator should run twice
and discard one of the outputs.
Do this about 20 times for each 624 calls and no brute force can find the
state. Thanks for your attention ;)


João Bernardo

From cory at lukasa.co.uk  Mon Sep 14 14:31:24 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 14 Sep 2015 13:31:24 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAOyAWghtn_-1_0kUQU56DQbnZAUvBGKPPp-tFjDghh+0j5wrWA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAOyAWghtn_-1_0kUQU56DQbnZAUvBGKPPp-tFjDghh+0j5wrWA@mail.gmail.com>
Message-ID: <CAH_hAJHLhD=iQuvmNa2yJ+9vLsADdPAidV3vhX2vZTqEPBK50A@mail.gmail.com>

On 14 September 2015 at 13:26, João Bernardo <jbvsmo at gmail.com> wrote:
> Quick fix!
> The problem with MT would be someone having all 624 32-bit numbers from the
> state. So, every now and then, the random generator should run twice and
> discard one of the outputs.
> Do this about 20 times for each 624 calls and no brute force can find the
> state. Thanks for your attention ;)

'Every now and then': what's that? Is it a deterministic interval or a
random one? If a random one, where does the random number come from:
MT? If deterministic, it's trivial to include the effect in your
calculations.

More generally, what you're doing here is gaining *information* about
the state. You don't have to know it perfectly, just to reduce the
space of possible states down. Even if you threw 95% of the results of
MT away, each time I watch I can reduce the space of possible states
the MT is in.

This is not a fix.

From antoine at python.org  Mon Sep 14 14:59:00 2015
From: antoine at python.org (Antoine Pitrou)
Date: Mon, 14 Sep 2015 12:59:00 +0000 (UTC)
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
Message-ID: <loom.20150914T145109-192@post.gmane.org>

Nick Coghlan <ncoghlan at ...> writes:
> 
> On 11 September 2015 at 02:05, Brett Cannon <brett at ...> wrote:
> > +1 for deprecating module-level functions and putting everything into
> > classes to force a choice
> 
> -1000, as this would be a *huge* regression in Python's usability for
> educational use cases. (Think 7-8 year olds that are still learning to
> read, not teenagers or adults with more fully developed vocabularies)

Fully agreed with Nick. That this is being seriously considered
shows a massive disregard for usability. Python is not C++, it places
convenience first.

Besides, a deterministic RNG is a feature: you can reproduce exactly
a random sequence by re-using the same seed, which helps fix rare
input-dependent failures (we actually have a good example of that in
CPython development with `regrtest -r`). Good luck debugging such
issues when using a RNG which reseeds itself in a random (!) way.
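
A minimal sketch of that reproducibility, for illustration only: the
function name and the test names below are made up, and this is not how
regrtest itself is implemented.

```python
import random

def shuffled_test_order(tests, seed):
    """Return the tests in a pseudo-random order fully determined by seed."""
    rng = random.Random(seed)   # private, explicitly seeded MT instance
    order = list(tests)
    rng.shuffle(order)
    return order

tests = ["test_a", "test_b", "test_c", "test_d", "test_e"]
first_run = shuffled_test_order(tests, seed=12345)
# Re-running with the same seed replays exactly the same order, so a
# failure that depends on test ordering can be reproduced on demand.
replay = shuffled_test_order(tests, seed=12345)
```

Logging the seed alongside a failing run is all that is needed to get the
same sequence back later.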

Finally, the premise of this discussion is idealistic in the first place.
If someone doesn't realize their code is security-sensitive, there
are other mistakes they will make than simply choosing the wrong
RNG.  If you want to help people generate secure passwords, best would
be perhaps to write a password-generating (or more generally
secret-generating, for different kinds of secrets: passwords, session
ids, etc.) library.

Regards

Antoine.



From cory at lukasa.co.uk  Mon Sep 14 15:29:11 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 14 Sep 2015 14:29:11 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <loom.20150914T145109-192@post.gmane.org>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
Message-ID: <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>

On 14 September 2015 at 13:59, Antoine Pitrou <antoine at python.org> wrote:
> Finally, the premise of this discussion is idealistic in the first place.
> If someone doesn't realize their code is security-sensitive, there
> are other mistakes they will make than simply choosing the wrong
> RNG.  If you want to help people generate secure passwords, best would
> be perhaps to write a password-generating (or more generally
> secret-generating, for different kinds of secrets: passwords, session
> ids, etc.) library.

Is your argument that there are lots of ways to get security wrong,
and for that reason we shouldn't try to fix any of them? After all, I
could have made this argument against PEP 466, or against the
deprecation of SHA1 in TLS certificates, or against any security
improvement ever made that simply changed defaults. The fact that
there are secure options available is not a good excuse for leaving
the insecure ones as the defaults.

And let's be clear, this is not a theoretical error that people don't
hit in real life. Investigating your last comment, Antoine, I googled
"python password generator". The results:

- The first one is a StackOverflow question which incorrectly uses
random.choice (though seeded from os.urandom, which is an
improvement). The answer to that says to just use os.urandom
everywhere, but does not provide sample code. Only the third answer
gets so far as to provide sample code, and it's way overkill.
- The second option, entitled "A Better Password Generator",
incorrectly uses random.randrange. This code is *aimed at beginners*,
and is kindly handing them a gun to point at their own foot.
- The third one uses urandom, which is fine
- The fourth, an XKCD-based password generator, uses SystemRandom *if
available* but then falls back to the MT approach, which is an
unexpected decision, but there we go.
- The fifth, from "pythonforbeginners.com", incorrectly uses random.choice
- The sixth goes into an intensive discussion about 'password
strength', including a discussion about the 'bit strength' of the
password, despite the fact that they use random.randint which means
that the analysis about bit strength is totally flawed.
- For the seventh we get a security.stackexchange question with the
first answer saying not to use Random, though the questioner does use
it and no sample code is provided.
- The eighth is a library that "generates randomized strings of
characters". It attempts to use SystemRandom but falls back silently
if it's unavailable.

At this point I gave up. Of that list of 8 responses, three are
completely wrong, two provide sample code that is wrong with no
correct sample code to be found on the page, two attempt to do the
right thing but will fall into a silent failure mode if they can't,
and only one is unambiguously correct.
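
For contrast, a short sketch of what a correct version of those snippets
could look like using only what the standard library offers today. The
function name, default length and alphabet are arbitrary choices for
illustration, not a recommendation for any particular password policy.

```python
import random
import string

# SystemRandom draws from os.urandom() instead of the Mersenne Twister,
# so its outputs cannot be predicted from earlier ones.
_sysrand = random.SystemRandom()

def make_password(length=16, alphabet=string.ascii_letters + string.digits):
    """Return a password of `length` characters chosen from `alphabet`."""
    return "".join(_sysrand.choice(alphabet) for _ in range(length))

password = make_password()
```

The only difference from the broken examples is which object choice() is
called on, which is exactly why the mistake is so easy to make.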

Similarly, a quick search of GitHub for Python repositories that
contain random.choice and the string 'password' returns 40,000
results.[0] Even if 95% of them are safe, that leaves 2000 people who
wrote wrong code and uploaded it to GitHub.

It is disingenuous to say that only people who know enough write
security-critical code. They don't. The reason for this is that most
people don't know they don't know enough. And for those people,
Python's default approach screws them over, and then they write blog
posts which screw over more people.

If the Python standard library would like to keep the insecure default
of random.random that's totally fine, but we shouldn't pretend that
the resulting security failures aren't our fault: they absolutely are.

 [0]: https://github.com/search?l=python&q=random.choice+password&ref=searchresults&type=Code&utf8=%E2%9C%93

From ncoghlan at gmail.com  Mon Sep 14 15:32:17 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Sep 2015 23:32:17 +1000
Subject: [Python-ideas] Globally configurable random number generation
Message-ID: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>

This is an expansion of the random module enhancement idea I
previously posted to Donald's thread:
https://mail.python.org/pipermail/python-ideas/2015-September/035969.html

I'll write it up as a full PEP later, but I think it's just as useful
in this form for now.

= Defining the problem =

We're moving into an era where the easiest way to publish software is
as a web application, with "deployment" to client systems done at
runtime via a web browser. It's regularly the case that "learn to
program" classes (especially those aimed at adults picking up
programming for the first time) will introduce folks to both a web
development framework and how to deploy web applications on a
developer focused service with a free hosting tier, like Heroku or
OpenShift.

It's also the case that we live in an era where there's a lot of
well-intentioned-but-actually-bad advice on the internet when it comes
to generating security sensitive tokens, and the folks receiving that
advice through forums like Stack Overflow aren't necessarily ever
going to see the "don't do that" guidance in the standard library's
random module documentation, or the docs for the cryptography library,
or the docs for a web framework like Flask, Django or Pyramid.

One of the ways we know many of the folks doing web development often
don't take admonitions in documentation seriously is because one of
the most popular web servers for Python on these kinds of services is
Django's "runserver", even though Django's docs specifically say only
to use that for local development. It isn't OK to say "the developers
deserve the consequences that come to them" as in many cases, it isn't
the developers that suffer the consequences, but the users of their
applications.

One reason we know weak RNGs can be a problem in practice is because
the same kind of concern exists in PHP web applications, and
https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf
shows how the relative predictability of password reset tokens can be
used to compromise administrator accounts.

Rather than playing whackamole with individual web applications (many
of which will be written by inexperienced developers), or attempting
to demonstrate that a deterministic PRNG is "secure enough" for these
use cases (when the research on PHP and deterministic PRNGs in general
indicates that it isn't), it is proposed to migrate Python to a
default random implementation that *is* known to be secure enough for
these kinds of use cases.

At the same time, deterministic random number generation is still
desirable in many situations, and we also don't want to require
folks learning Python in the future to take a crash course
in web application security theory first. Thus, it is also proposed
that the abstraction used to present these differences to end users
minimise the references to the underlying security concepts.

A key outcome of this proposal is that it will retroactively upgrade a
lot of existing instructions on the internet for generating default
passwords and other sensitive tokens in Python from "actively harmful"
to "not necessarily ideal, but at least not wrong if you're using
Python 3.6+".

This *is* a compatibility break for the sake of correcting default
behaviours that are fine when developing applications for local use,
but problematic from a network service security perspective, just as
happened with the introduction of hash randomisation. Unlike the hash
randomisation change, this one is readily addressed in old versions on
a case by case basis, so it is only proposed to make the change in a
future feature release of Python, not in any current maintenance
releases.

= Core abstraction =

The core concept of this proposal involves classifying random number
generators in Python as follows:

* seedable
* seedless
* system

These terms are chosen to make sense to folks that have *no idea*
about the way different kinds of random number generator work and how
that affects their security properties, but do know whether or not
they need to be able to pass in a particular fixed seed in order to
regenerate the same series of outputs.

The guidance to Python users is then:

* we use the seedless RNG by default as it provides the best balance
of speed and security
* if you need to be able to exactly reproduce output sequences, use
the seedable RNG
* if you know you're doing security sensitive work, use the system RNG
directly to eliminate Python's seedless RNG as a potential source of
vulnerabilities

Importantly, there are relatively simple answers to the following two
questions (which could be added to the Design FAQ):

Q: Why isn't the seedable RNG the default random implementation (any more)?
A: The same properties that make it possible to provide an explicit
seed to the seedable RNG and get a predictable series of outputs make
it inappropriate for tasks like generating session IDs and password
reset tokens in web applications. Since folks continued to use the
default RNG for those cases, even after years of the core development
team, web framework developers and security engineers saying "Don't do
that, use the system RNG instead", we eventually changed the default
behaviour to just make those cases OK.

Q: Why isn't the system RNG the default implementation?
A: Due to the way operating systems work, calling into the kernel to
get a random number is always going to be slower than generating one
within the Python runtime. The default seedless generator provides
most of the same benefits as using the system RNG directly, but is an
order of magnitude faster as it doesn't need to call into the kernel
as often.
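
As a rough way to see the cost difference for yourself, the sketch below
times the two generators that exist today. The proposed seedless generator
does not exist yet, so it is not measured, and the call count is an
arbitrary choice; actual numbers will vary by platform.

```python
import random
import timeit

mt = random.Random()            # userspace Mersenne Twister
system = random.SystemRandom()  # asks the operating system on every call

calls = 20000
mt_seconds = timeit.timeit(mt.random, number=calls)
system_seconds = timeit.timeit(system.random, number=calls)

print("MT (userspace):    %.4f s for %d calls" % (mt_seconds, calls))
print("SystemRandom (OS): %.4f s for %d calls" % (system_seconds, calls))
```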

= Proposed change for Python 3.6 =

* add a random.SeedlessRandom API that omits the seed(), getstate()
and setstate() methods and uses a cryptographically secure PRNG
internally (such as the ChaCha20 algorithm implemented by OpenBSD)
* rename random.Random to random.SeedableRandom
* make random.Random a subclass of SeedableRandom that deprecates
seed(), getstate() and setstate()
* deprecate the seed(), getstate() and setstate() methods on SystemRandom
* expose the global SeedableRandom instance as random.seedable_random
* expose a global SeedlessRandom instance as random.seedless_random
* expose a global SystemRandom instance as random.system_random
* provide a random.set_default_instance() API that makes it possible
to specify the instance used by the module level methods
* the module level seed(), getstate(), and setstate() functions will
throw RuntimeError if the corresponding method is missing from the
default instance

In 3.6, "random.set_default_instance(random.seedless_random)" will opt
in to the CSPRNG when using the module level functions process wide,
while "from random import seedless_random as random" will do so on a
module by module basis.

"from random import system_random as random" also becomes available as
a simple upgrade path for security sensitive modules.
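
To make the shape of that API concrete, here is a rough pure-Python sketch
of the dispatch logic. Everything in it is hypothetical: none of these
names exist in the random module today, only choice() and seed() are
wired up, and SystemRandom is used purely as a placeholder for the
userspace CSPRNG.

```python
import random as _random

class SeedlessRandom:
    """Hypothetical stand-in: has no seed(), getstate() or setstate()."""

    def __init__(self):
        # Placeholder only; the proposal calls for a userspace CSPRNG here.
        self._rng = _random.SystemRandom()

    def random(self):
        return self._rng.random()

    def choice(self, seq):
        return self._rng.choice(seq)

seedable_random = _random.Random()
seedless_random = SeedlessRandom()
_default_instance = seedable_random   # the proposed 3.6 default

def set_default_instance(instance):
    """Choose the instance used by the module level functions."""
    global _default_instance
    _default_instance = instance

def choice(seq):
    return _default_instance.choice(seq)

def seed(value=None):
    # Module level seed() fails loudly if the default instance lacks it.
    method = getattr(_default_instance, "seed", None)
    if method is None:
        raise RuntimeError("the default random instance cannot be seeded")
    method(value)

seed(25)                                # fine: the default is seedable
set_default_instance(seedless_random)   # opt in to the seedless generator
try:
    seed(25)
except RuntimeError as exc:
    seed_error = str(exc)
else:
    seed_error = None
```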

Appropriate helpers would be added to the six and future projects to
allow single source Python 2/3 projects to easily cope with the change
in behaviour when using the seeded RNG for its intended purposes. For
many projects, compatibility code will consist of the following lines
in a compatibility module:

    try:
        from random import seedable_random as random
    except ImportError:
        import random

It would also be desirable for the seedless random number generator to
be made available as a PyPI package for use on older Python versions.

= Proposed change for Python 3.7 =

* random.Random becomes an alias for random.SeedlessRandom
* the default instance changes to be random.seedless_random

In 3.7, "random.set_default_instance(random.seedable_random)" will opt
back in to the deterministic PRNG when using the module level
functions process wide, while "from random import seedable_random as
random" will do so on a module by module basis.

= Seedable random number generation =

This is what we have today. The MT random implementation supports
explicit seeding, state retrieval, and state restoration. It doesn't
automatically mix in additional system entropy as it operates.

This is the right choice for use cases like computer games, map
generation, and randomising the order of test execution, as in these
situations, it's desirable to be able to reproduce a past sequence
exactly.

= Seedless random number generators =

This is the key proposed new addition: a cryptographically secure,
non-deterministic, userspace PRNG. It's faster than the system RNG as
it avoids the need to make a system API call.

The "seedless" name comes from the fact that the inability to feed in
a fixed seed is the most obvious API difference relative to
deterministic RNGs, and hence provides a mental hook for people to
remember which is which, without needing to know the relevant
background security theory (which is arcane enough to be opaque even
to developers with decades of experience and hence isn't something we
want to be inflicting on folks in the process of learning to program).

= System random number generator =

The only proposed change here is providing a default instance to
enable the "from random import system_random as random" pattern.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Mon Sep 14 15:32:33 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 14 Sep 2015 23:32:33 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAOyAWghtn_-1_0kUQU56DQbnZAUvBGKPPp-tFjDghh+0j5wrWA@mail.gmail.com>
References: <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAOyAWghtn_-1_0kUQU56DQbnZAUvBGKPPp-tFjDghh+0j5wrWA@mail.gmail.com>
Message-ID: <20150914133232.GA31152@ando.pearwood.info>

On Mon, Sep 14, 2015 at 09:26:33AM -0300, João Bernardo wrote:
> Quick fix!
> The problem with MT would be someone having all 624 32-bit numbers from
> the state. So, every now and then, the random generator should run twice
> and discard one of the outputs.

No, that's not good enough. You can skip a few outputs, and still 
recover the internal state. The more outputs are skipped, the harder it 
becomes, but still possible.


> Do this about 20 times for each 624 calls and no brute force can find the
> state. Thanks for your attention ;)

This is not brute force. The recovery attack does not try generating 
every possible internal state. The MT is a big, complicated equation 
(technically, it is called a recurrence relation), but being an 
equation, it is completely deterministic. Given enough values, we can 
build another equation which can be solved to give the internal state of 
the MT equation.
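
For the simplest case (no outputs skipped, and the attacker sees full
32-bit outputs) the "equation solving" is just undoing MT's output
tempering, as in the sketch below. The helper names are my own, the
constants are the standard MT19937 tempering parameters, and the state
tuple layout passed to setstate() is a CPython implementation detail.
Recovering the state when outputs are skipped or truncated takes more
algebra than is shown here.

```python
import random

def undo_right_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def undo_left_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x

def untemper(y):
    """Map one 32-bit MT output back to the state word that produced it."""
    y = undo_right_xor(y, 18)
    y = undo_left_xor(y, 15, 0xEFC60000)
    y = undo_left_xor(y, 7, 0x9D2C5680)
    y = undo_right_xor(y, 11)
    return y

victim = random.Random()   # seeded from the OS; the seed is never seen
observed = [victim.getrandbits(32) for _ in range(624)]

# Rebuild the 624-word state and load it into a second generator.
state = tuple(untemper(value) for value in observed)
clone = random.Random()
clone.setstate((3, state + (624,), None))

predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]
```

After the 624 observations, predicted and actual agree, and the clone
tracks every later call on the victim as well.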

Are you suggesting that every time you call random.random(), it 
should secretly generate 20 random numbers and throw away all but the 
last?

(1) That would break backwards compatibility for those who need it. The 
output of random() is stable across versions:

[steve at ando ~]$ for vers in 2.4 2.5 2.6 2.7 3.1 3.2 3.3 3.4; do
> python$vers -c "from random import *; seed(25); print(random())";
> done
0.37696230239
0.37696230239
0.37696230239
0.37696230239
0.37696230239
0.376962302390386
0.376962302390386
0.376962302390386

(There's a change in the printable output starting in 3.2, but the 
numbers themselves are the same.)

(2) it would make the random number generator twenty times slower than 
it is now, and MT is already not very fast;

(3) most importantly, I don't think that would even solve the problem. I 
personally don't know how, but I would predict that somebody with more 
maths skills than me would be able to still recover the internal state. 
They would just have to collect more values.


-- 
Steve

From rosuav at gmail.com  Mon Sep 14 15:32:55 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 14 Sep 2015 23:32:55 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
Message-ID: <CAPTjJmpiBz7cb-=xc6Hkyi=S+U+Ft6VjROX5qtM8sFckEZe0LQ@mail.gmail.com>

On Mon, Sep 14, 2015 at 10:26 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> Code which uses the output from an RNG as session id without adding
>> any additional security measures is broken, regardless of what kind
>> of RNG you are using. I bet such code will also take any session id
>> it receives as cookie and trust it without applying extra checks
>> on it.
>
> Yes, that's... generally the thing you do with session cookies?
> They're shared secret string that you use as keys into some sort of
> server-side session database? What extra checks need to be applied?

Some systems check to see if the session was created by the same IP
address. That can help, but it also annoys legitimate users who change
their IP addresses.

ChrisA

From ncoghlan at gmail.com  Mon Sep 14 15:35:00 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Sep 2015 23:35:00 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
Message-ID: <CADiSq7fmV2BDveYmr8XWkk5kKiaZHq=q9CNdRpxGrAkYR8QWfQ@mail.gmail.com>

On 14 September 2015 at 16:38, Nathaniel Smith <njs at pobox.com> wrote:
> I like Nick's proposal here:
>     https://code.activestate.com/lists/python-ideas/35842/
> as probably the most solid strategy for implementing that idea -- the
> only projects that would be negatively affected are those that are
> using the seeding functionality of the global random API, which is a
> tiny fraction, and the effect on those projects is that they get
> nudged into using the superior object-oriented API.

I started a new thread breaking that out into more of a proto-PEP
(including your reference to the PHP research - thanks for that!).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From skrah at bytereef.org  Mon Sep 14 15:43:58 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 14 Sep 2015 13:43:58 +0000 (UTC)
Subject: [Python-ideas] Globally configurable random number generation
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
Message-ID: <loom.20150914T153737-237@post.gmane.org>

Nick Coghlan <ncoghlan at ...> writes:
> = Core abstraction =
> 
> The core concept of this proposal involves classifying random number
> generators in Python as follows:
> 
> * seedable
> * seedless
> * system
> 
> These terms are chosen to make sense to folks that have *no idea*
> about the way different kinds of random number generator work and how
> that affects their security properties, but do know whether or not
> they need to be able to pass in a particular fixed seed in order to
> regenerate the same series of outputs.
> 
> The guidance to Python users is then:
> 
> * we use the seedless RNG by default as it provides the best balance
> of speed and security
> * if you need to be able to exactly reproduce output sequences, use
> the seedable RNG
> * if you know you're doing security sensitive work, use the system RNG
> directly to eliminate Python's seedless RNG as a potential source of
> vulnerabilities

Sorry, -1 on this. Theo proposed a simple API like:

  arc4random()
  arc4random_uniform()


Go has:

  https://golang.org/pkg/math/rand/
  https://golang.org/pkg/crypto/rand/


These are sane, unambiguously named APIs. I wish Python had more
of those.  If people must have their CSPRNG, please let's leave
the random module alone and introduce a crypto module like Go.
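
For reference, a rough Python sketch of what an arc4random_uniform()-style
helper does: return an integer below a bound with no modulo bias. The
function name is made up, it reads from os.urandom() rather than a
userspace generator, and it uses a simple mask-and-retry rejection loop,
which is not the exact method OpenBSD uses.

```python
import os

def urandom_uniform(upper_bound):
    """Return an int in [0, upper_bound) with every value equally likely."""
    if upper_bound <= 0:
        raise ValueError("upper_bound must be positive")
    nbits = (upper_bound - 1).bit_length()
    nbytes = (nbits + 7) // 8
    while True:
        # Keep exactly nbits random bits, then retry if the value is too
        # large; taking "value % upper_bound" instead would favour small
        # results whenever upper_bound is not a power of two.
        raw = int.from_bytes(os.urandom(nbytes), "big")
        value = raw >> (nbytes * 8 - nbits)
        if value < upper_bound:
            return value

die_rolls = [urandom_uniform(6) + 1 for _ in range(10)]
```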


Stefan Krah

From donald at stufft.io  Mon Sep 14 15:51:29 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 09:51:29 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
Message-ID: <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>

On September 14, 2015 at 9:33:27 AM, Nick Coghlan (ncoghlan at gmail.com) wrote:
>  
> * seedable
> * seedless
> * system
>  
> These terms are chosen to make sense to folks that have *no idea*
> about the way different kinds of random number generator work and how
> that affects their security properties, but do know whether or not
> they need to be able to pass in a particular fixed seed in order to
> regenerate the same series of outputs.

I don't love the "seedable" and "seedless" names here, but I don't have a
better suggestion for the userspace CSPRNG one because its security properties
are a bit nuanced. People doing security sensitive things like generating keys
for cryptography should still use something based on os.urandom, so it's mostly
about providing a safety net that will "probably" [1] be safe. Probably
something like random.ProbablySecureRandom is a bad name :)

> * provide a random.set_default_instance() API that makes it possible
> to specify the instance used by the module level methods

I think this particular bit is a bad idea, it makes an official API that makes
it really hard for an auditor to come into a code base and determine if the use
of random is correct or not. Given that going back to the MT based algorithm is
fairly trivial (and could even be mechanical) what's the long-term benefit here?


[1] The safety of userspace CSPRNGs is debated among security experts;
    however, I think any of them would be hard pressed to call it a bad idea
    to have a userspace CSPRNG as a safety net for folks who, for whatever
    reason, didn't know to use os.urandom/random.SystemRandom, since it
    makes them more likely to be safe or, at the very least, harder to attack.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From random832 at fastmail.com  Mon Sep 14 16:16:00 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 14 Sep 2015 10:16:00 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
Message-ID: <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>


On Mon, Sep 14, 2015, at 09:51, Donald Stufft wrote:
> I think this particular bit is a bad idea, it makes an official API
> that makes it really hard for an auditor to come into a code base and
> determine if the use of random is correct or not.

It's no worse than what OpenBSD itself has done with the C api for
rand/random/rand48. At some point you've got to balance it with the
realities of making backwards compatibility easy to achieve for the
applications that really do need it with either a few lines change or
none at all. And anyway, the auditor would *know* that if they see a
module-level function called they need to do the extra work to find
out what mode the module-level RNG is in (i.e. yes/no is there anywhere
at all in the codebase that changes it from the secure default?)

It's not an "official API", it's an escape hatch for allowing a minimal
change to existing code that needs the old behavior.

> Given that going back to the MT based algorithm is fairly trivial (and
> could even be mechanical) what's the long term benefit here?

I don't see how it's trivial/mechanical, *without* the exact feature
being discussed.

From sturla.molden at gmail.com  Mon Sep 14 16:21:11 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 16:21:11 +0200
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <loom.20150914T153737-237@post.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <loom.20150914T153737-237@post.gmane.org>
Message-ID: <mt6l4o$dgi$1@ger.gmane.org>

On 14/09/15 15:43, Stefan Krah wrote:

> These are sane, unambiguously named APIs. I wish Python had more
> of those.  If people must have their CSPRNG, please let's leave
> the random module alone and introduce a crypto module like Go.

In a perfect world, every programmer would know the difference between 
PRNGs for numerical simulation and entropy sources for cryptography.

Those that do will still use os.urandom or just read from /dev/urandom 
or /dev/random for cryptography.

Those that do know the need for mathematical precision when simulating 
samples from a given distribution. Those that do know the need for a 
fixed seed because a Monte Carlo simulation should be exactly 
reproducible in a scientific context.

The problem is users who have no idea that the Mersenne Twister is 
constructed for producing random deviates that are great for numerical 
simulation -- and that the Mersenne Twister is very weak for cryptography.

Using os.urandom as default entropy source has the opposite effect. It 
is not constructed for being mathematically precise, it is slow, and it 
does not allow for a fixed seed and exact reproducibility.
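To make the reproducibility point concrete, here is a minimal sketch using only what the random module already ships. The seed value and variable names are arbitrary choices for illustration. It shows that a seeded Mersenne Twister instance replays the same stream, while random.SystemRandom (which reads from os.urandom) ignores seed() and has no state to save:

```python
import random

# Seeded Mersenne Twister: the same seed always yields the same stream.
run_a = random.Random(12345)
run_b = random.Random(12345)
draws_a = [run_a.random() for _ in range(5)]
draws_b = [run_b.random() for _ in range(5)]
reproducible = draws_a == draws_b

# SystemRandom draws from os.urandom; seed() is accepted but has no effect,
# so there is no way to replay a run.
sys_rng = random.SystemRandom()
sys_rng.seed(12345)
sys_draws = [sys_rng.random() for _ in range(5)]

# It also has no internal state that could be saved and restored.
try:
    sys_rng.getstate()
    has_state = True
except NotImplementedError:
    has_state = False
```

A simulation that has to be rerun exactly needs the first kind of generator; the second kind cannot offer that by design.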

Whatever we do, there is someone who is going to shoot their leg off.

A crypto module would perhaps be great, but it does not solve anything. 
Someone who uses random.random instead of os.urandom is likely to use 
random.random instead of a PRNG in a crypto module as well. Mostly this 
is about propagating knowledge of random number generators to new 
developers and science students.


Sturla



From skrah at bytereef.org  Mon Sep 14 16:24:53 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 14 Sep 2015 14:24:53 +0000 (UTC)
Subject: [Python-ideas] Globally configurable random number generation
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
Message-ID: <loom.20150914T162116-889@post.gmane.org>

Random832 <random832 at ...> writes:
> It's no worse than what OpenBSD itself has done with the C api for
> rand/random/rand48. 

These functions aren't used widely in scientific computing.


> It's not an "official API", it's an escape hatch for allowing a minimal
> change to existing code that needs the old behavior.

It's yet another case split to keep in the back of one's mind.


Stefan Krah


From donald at stufft.io  Mon Sep 14 16:40:50 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 10:40:50 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
Message-ID: <etPan.55f6dc72.7a3b8900.b1e0@Draupnir.home>

On September 14, 2015 at 10:16:49 AM, Random832 (random832 at fastmail.com) wrote:
>  
> On Mon, Sep 14, 2015, at 09:51, Donald Stufft wrote:
> > I think this particular bit is a bad idea, it makes an official API
> > that makes it really hard for an auditor to come into a code base and
> > determine if the use of random is correct or not.
>  
> It's no worse than what OpenBSD itself has done with the C api for
> rand/random/rand48. At some point you've got to balance it with the
> realities of making backwards compatibility easy to achieve for the
> applications that really do need it with either a few lines change or
> none at all. And anyway, the auditor would *know* that if they see a
> module-level function called they need to do the extra work to find
> out what mode the module-level RNG is in (i.e. yes/no is there anywhere
> at all in the codebase that changes it from the secure default?)
>  
> It's not an "official API", it's an escape hatch for allowing a minimal
> change to existing code that needs the old behavior.
>  
> > Given that going back to the MT based algorithm is fairly trivial (and
> > could even be mechanical) what's the long term benefit here?
>  
> I don't see how it's trivial/mechanical, *without* the exact feature
> being discussed.

Easily, you change your:

    import random

to

    from random import seeded_random as random

And then all of your code that used random.foo works without any further
modification. If you were importing the individual functions, you can either
change your code to use random.foo or you can do:

    from random import seeded_random as _random
    random = _random.random
    randint = _random.randint

If you want to do this in code that also has to run on older Python versions,
then you can combine this with a try/except block like:

    try:
        from random import seeded_random as random
    except ImportError:
        import random

Either way, trivial and mechanical. It doesn't require much thought, it just
requires some pretty simple changes.
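As a rough sketch of how that fallback behaves on a Python that does not have the proposed name: seeded_random is only the hypothetical name from this thread and does not exist in any released version, so the ImportError branch is the one that runs today. Here a private random.Random() instance is used as a stand-in, on the assumption that it exposes the same methods as the module-level functions:

```python
try:
    # Hypothetical name from this thread; not part of any released Python.
    from random import seeded_random as random
except ImportError:
    # Stand-in: a private Mersenne Twister instance offers the same methods
    # (seed, random, randint, ...) as the module-level functions.
    import random as _random_module
    random = _random_module.Random()

# Code written against "random.foo" keeps working unchanged.
random.seed(2015)
first = [random.randint(1, 6) for _ in range(3)]
random.seed(2015)
second = [random.randint(1, 6) for _ in range(3)]
```

Using a private instance also means that seeding here does not disturb the module-level generator that other code may be sharing.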

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From random832 at fastmail.com  Mon Sep 14 16:45:05 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 14 Sep 2015 10:45:05 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <loom.20150914T162116-889@post.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
Message-ID: <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>

On Mon, Sep 14, 2015, at 10:24, Stefan Krah wrote:
> Random832 <random832 at ...> writes:
> > It's no worse than what OpenBSD itself has done with the C api for
> > rand/random/rand48. 
> 
> These functions aren't used widely in scientific computing.

I don't see how that's relevant, when what I'm talking about is
"providing an API that switches them from secure mode to
insecure/deterministic mode"

From skrah at bytereef.org  Mon Sep 14 16:50:13 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 14 Sep 2015 14:50:13 +0000 (UTC)
Subject: [Python-ideas] Globally configurable random number generation
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <loom.20150914T153737-237@post.gmane.org> <mt6l4o$dgi$1@ger.gmane.org>
Message-ID: <loom.20150914T163810-407@post.gmane.org>

Sturla Molden <sturla.molden at ...> writes:
> On 14/09/15 15:43, Stefan Krah wrote:
> 
> > These are sane, unambiguously named APIs. I wish Python had more
> > of those.  If people must have their CSPRNG, please let's leave
> > the random module alone and introduce a crypto module like Go.
 
> A crypto module would perhaps be great, but it does not solve anything. 
> Someone who uses random.random instead of os.urandom is likely to use 
> random.random instead of a PRNG in a crypto module as well. Mostly this 
> is about propagating knowledge of random number generators to new 
> developers and science students.

The sentiments in the original thread (which has now been renamed two
times), seem to have been lost:

Theo:
=====

"chacha arc4random is really fast.

if you were to create such an API in python, maybe this is how it will
go:

say it becomes arc4random in the back end.  i am unsure what advice to
give you regarding a python API name.  in swift, they chose to use the
same prefix "arc4random" (id = arc4random(), id = arc4random_uniform(1..n)";
it is a little bit different than the C API.  google has tended to choose
other prefixes.   we admit the name is a bit strange, but we can't touch
the previous attempts like drand48....

I do suggest you have the _uniform and _buf versions.  Maybe apple
chose to stick to arc4random as a name simply because search engines
tend to give above average advice for this search string?"


Theo:
=====

"that opens /dev/urandom or uses the getrandom system call depending on
system.  it also has support for the windows entropy API.  it pulls
data into a large buffer, a cache.  then each subsequent call, it
consumes some, until it runs out, and has to do a fresh read.  it
appears to not clean the buffer behind itself, probably for
performance reasons, so the memory is left active.  (forward secrecy
violated)

i don't think they are doing the best they can...  i think they should
get forward secrecy and higher performance by having an in-process
chacha.  but you can sense the trend."


So the original thread is about:
================================

  - Implementing a possibly faster (and allegedly more secure)
    chacha20-random.

  - Possibly using the naming scheme of Swift.

  - Being careful with os.urandom(), as there are some pitfalls that
    the OpenBSD libcrypto (allegedly) solves.


I see nothing about magically repurposing random.random() functions.



Stefan Krah


From steve at pearwood.info  Mon Sep 14 16:50:46 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 15 Sep 2015 00:50:46 +1000
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
Message-ID: <20150914145046.GC31152@ando.pearwood.info>

On Mon, Sep 14, 2015 at 10:16:00AM -0400, Random832 wrote:
> 
> On Mon, Sep 14, 2015, at 09:51, Donald Stufft wrote:
> > I think this particular bit is a bad idea, it makes an official API
> > that makes it really hard for an auditor to come into a code base and
> > determine if the use of random is correct or not.
> 
> It's no worse than what OpenBSD itself has done with the C api for
> rand/random/rand48. At some point you've got to balance it with the
> realities of making backwards compatibility easy to achieve for the
> applications that really do need it with either a few lines change or
> none at all. And anyway, the auditor would *know* that if they see a
> module-level function called they need to do the extra work to find
> out what mode the module-level RNG is in (i.e. yes/no is there anywhere
> at all in the codebase that changes it from the secure default?)
> 
> It's not an "official API", it's an escape hatch for allowing a minimal
> change to existing code that needs the old behavior.

Of course it is an official API. It's a documented public function (or 
rather, it will be if Nick's suggestion is accepted) in the standard 
library. That makes it an official API. The *whole purpose of it* is to 
give a standard API for what Python can already do: monkey-patch the 
random module. E.g. we can do this now:

import random
random.random = lambda: 9
random.uniform = lambda a, b: 9


but if you do that, you know you're on thin ice.

I don't entirely agree with everything Donald has said, but I agree that 
providing this API would be harmful. It would mean that any arbitrary 
module you import (directly or indirectly) could swap out the secure 
CSPRNG you're relying on for an insecure PRNG, and you would never know.

(Yes, they could do that now, this is Python. But they won't, because 
there's no official API for swapping out the default PRNG.)


-- 
Steve

From donald at stufft.io  Mon Sep 14 16:57:47 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 10:57:47 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <loom.20150914T163810-407@post.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <loom.20150914T153737-237@post.gmane.org> <mt6l4o$dgi$1@ger.gmane.org>
 <loom.20150914T163810-407@post.gmane.org>
Message-ID: <etPan.55f6e06b.3339a7d1.b1e0@Draupnir.home>

On September 14, 2015 at 10:50:46 AM, Stefan Krah (skrah at bytereef.org) wrote:
> The sentiments in the original thread (which has now been renamed two
> times), seem to have been lost:

I've actually talked to Theo and I believe he's read my summary of his proposal
and he didn't mention anything amiss. He did mention that he wasn't aware of
the number of APIs that we had in random.py that built on top of the RNG.

As far as I can tell from talking to him, he focused on that particular thing
because he became aware of the issue via the recent issue with getentropy on
Solaris, and I believe he assumed that our APIs were similar to C in what we
provided.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From p.f.moore at gmail.com  Mon Sep 14 17:01:02 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 Sep 2015 16:01:02 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
Message-ID: <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>

On 14 September 2015 at 14:29, Cory Benfield <cory at lukasa.co.uk> wrote:
> Is your argument that there are lots of ways to get security wrong,
> and for that reason we shouldn't try to fix any of them?

This debate seems to repeatedly degenerate into this type of accusation.

Why is backward compatibility not being taken into account here? To be
clear, the proposed change *breaks backward compatibility* and while
that's allowed in 3.6, just because it is allowed, doesn't mean we
have free rein to break compatibility - any change needs a good
justification. The arguments presented here are valid up to a point,
but every time anyone tries to suggest a weak area in the argument,
the "we should fix security issues" trump card gets pulled out.

For example, as this is a compatibility break, it'll only be allowed
into 3.6+ (I've not seen anyone suggest that this is sufficiently
serious to warrant breaking compatibility on older versions). Almost
all of those SO questions, and google hits, are probably going to be
referenced by people who are using 2.7, or maybe some version of 3.x
earlier than 3.6 (at what stage do we allow for the possibility of 3.x
users who are *not* on the latest release?) So is a solution which
won't impact most of the people making the mistake, worth it?

I fully expect the response to this to be "just because it'll take
time, doesn't mean we should do nothing". Or "even if it just fixes it
for one or two people, it's still worth it". But *that's* the argument
I don't find compelling - not that a fix won't help some situations,
but that because it's security, (a) all the usual trade-off
calculations are irrelevant, and (b) other proposed solutions (such as
education, adding specialised modules like a "shared secret" library,
etc) are off the table.

Honestly, this type of debate doesn't do the security community much
good - there's too little willingness to compromise, and as a result
the more neutral participants (which, frankly, is pretty much anyone
who doesn't have a security agenda to promote) end up pushed into a
"reject everything" stance simply as a reaction to the black and white
argument style.

Paul

From storchaka at gmail.com  Mon Sep 14 17:04:35 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 14 Sep 2015 18:04:35 +0300
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
Message-ID: <mt6nm4$q8i$1@ger.gmane.org>

On 14.09.15 16:32, Nick Coghlan wrote:
> * make random.Random a subclass of SeedableRandom that deprecates
> seed(), getstate() and setstate()

I would make seed() and setstate() to switch to seedable algorithm. If 
you don't use seed() or setstate(), it is not important that the 
algorithm is changed. If you use seed() or setstate(), you expect 
reproducible behavior.
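A minimal sketch of that switching idea, under my own assumptions: the class name SwitchingRandom is made up for illustration, and it uses random.SystemRandom as a stand-in for whatever secure generator would actually be chosen. It serves values from the secure source until seed() or setstate() is called, then hands over to a Mersenne Twister instance:

```python
import random

class SwitchingRandom:
    """Hypothetical sketch: secure source until seed()/setstate() is called."""

    def __init__(self):
        self._secure = random.SystemRandom()
        self._seeded = None  # created on the first seed()/setstate() call

    def seed(self, a=None):
        # Asking for a seed signals that reproducibility is wanted.
        self._seeded = random.Random(a)

    def setstate(self, state):
        if self._seeded is None:
            self._seeded = random.Random()
        self._seeded.setstate(state)

    def __getattr__(self, name):
        # Only reached for names not defined above, e.g. random(), randint().
        active = self._secure if self._seeded is None else self._seeded
        return getattr(active, name)

rng = SwitchingRandom()
unseeded = rng.random()          # served by SystemRandom

rng.seed(42)                     # switches to the Mersenne Twister
a = [rng.randint(0, 99) for _ in range(4)]
rng.seed(42)
b = [rng.randint(0, 99) for _ in range(4)]
```

After the switch the output matches a plain random.Random(42), so existing seeded code would see the same numbers as before.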

> * random.Random becomes an alias for random.SeedlessRandom

This breaks compatibility with the data pickled in older Python.

> In 3.7, "random.set_default_instance(random.seedable_random)" will opt
> back in to the deterministic PRNG when using the module level
> functions process wide, while "from random import seedable_random as
> random" will do so on a module by module basis.

What to do with "from random import random" deep in third-party module? 
It caches random.random in the module dictionary.
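The caching problem can be shown with today's module. In this sketch the direct assignment to random.random is only a stand-in for the proposed set_default_instance(), which does not exist; the names cached_random, via_module and via_cached are mine:

```python
import random
from random import random as cached_random  # binds the function object now

original = random.random
try:
    # Stand-in for a hypothetical set_default_instance(): rebind the
    # module attribute after other code has already imported from it.
    random.random = lambda: 0.5
    via_module = random.random()   # attribute lookup sees the replacement
    via_cached = cached_random()   # earlier binding still calls the original
finally:
    random.random = original       # undo the monkey-patch
```

Any module that did the from-import before the swap keeps the old generator, so a process-wide switch would only be partial unless every such module is reloaded or changed.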



From skrah at bytereef.org  Mon Sep 14 17:16:04 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 14 Sep 2015 15:16:04 +0000 (UTC)
Subject: [Python-ideas] Globally configurable random number generation
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <loom.20150914T153737-237@post.gmane.org> <mt6l4o$dgi$1@ger.gmane.org>
 <loom.20150914T163810-407@post.gmane.org>
 <etPan.55f6e06b.3339a7d1.b1e0@Draupnir.home>
Message-ID: <loom.20150914T171308-963@post.gmane.org>

Donald Stufft <donald at ...> writes:

> 
> On September 14, 2015 at 10:50:46 AM, Stefan Krah (skrah at ...) wrote:
> > The sentiments in the original thread (which has now been renamed two
> > times), seem to have been lost:
> 
> I've actually talked to Theo and I believe he's read my summary of his
> proposal and he didn't mention anything amiss. He did mention that he wasn't
> aware of the number of APIs that we had in random.py that built on top of
> the RNG.
> 
> As far as I can tell from talking to him, he focused on that particular thing
> because he became aware of the issue via the recent issue with getentropy on
> Solaris, and I believe he assumed that our APIs were similar to C in what we
> provided.

That addresses pretty little of what I wrote, and I'd prefer to
hear anything directly from him.  Your summaries have a tendency
to be highly biased.


Stefan Krah


From donald at stufft.io  Mon Sep 14 17:17:34 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 11:17:34 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
Message-ID: <etPan.55f6e50e.2f0a18c5.24af@Draupnir.home>

On September 14, 2015 at 11:01:36 AM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 14 September 2015 at 14:29, Cory Benfield wrote:
> > Is your argument that there are lots of ways to get security wrong,
> > and for that reason we shouldn't try to fix any of them?
> 
> This debate seems to repeatedly degenerate into this type of accusation.
> 
> Why is backward compatibility not being taken into account here? To be
> clear, the proposed change *breaks backward compatibility* and while
> that's allowed in 3.6, just because it is allowed, doesn't mean we
> have free rein to break compatibility - any change needs a good
> justification. The arguments presented here are valid up to a point,
> but every time anyone tries to suggest a weak area in the argument,
> the "we should fix security issues" trump card gets pulled out.

How has it not been taken into account? The current proposal (best summed up
by Nick in the other thread) will not break compatibility for anyone except
those calling the functions that are specifically about setting a seed or
getting/setting the current state. In looking around I don't see a lot of
people using those particular functions, so most people likely won't notice the
change at all, and for those who do, there is a very trivial change they can
make to their code to cope with it.

> 
> For example, as this is a compatibility break, it'll only be allowed
> into 3.6+ (I've not seen anyone suggest that this is sufficiently
> serious to warrant breaking compatibility on older versions). Almost
> all of those SO questions, and google hits, are probably going to be
> referenced by people who are using 2.7, or maybe some version of 3.x
> earlier than 3.6 (at what stage do we allow for the possibility of 3.x
> users who are *not* on the latest release?) So is a solution which
> won't impact most of the people making the mistake, worth it?
> 
> I fully expect the response to this to be "just because it'll take
> time, doesn't mean we should do nothing". Or "even if it just fixes it
> for one or two people, it's still worth it". But *that's* the argument
> I don't find compelling - not that a fix won't help some situations,
> but that because it's security, (a) all the usual trade-off
> calculations are irrelevant, and (b) other proposed solutions (such as
> education, adding specialised modules like a "shared secret" library,
> etc) are off the table.

We can't go back in time and fix those versions, that is true. However, one of
the biggest groups of people who are most likely to be helped by this change is
new and inexperienced developers who don't fully grasp the security sensitive
nature of whatever they are doing with random. That group of people are also
more likely to be using Python 3.x than experienced programmers.

> 
> Honestly, this type of debate doesn't do the security community much
> good - there's too little willingness to compromise, and as a result
> the more neutral participants (which, frankly, is pretty much anyone
> who doesn't have a security agenda to promote) end up pushed into a
> "reject everything" stance simply as a reaction to the black and white
> argument style.
> 

If I/we were not willing to compromise, I'd be pushing for it to use
SystemRandom everywhere because that removes all of the possibly problematic
parts of using a user-space CSPRNG like is being proposed. However, I/we
are willing to compromise by sacrificing possible security in order to not
regress things where we can, in particular a user-space CSPRNG is being
proposed over SystemRandom because it will provide you with random numbers
almost as fast as MT will.

However, when proposing this possible compromise, we are met with people
refusing to meet us in the middle. There are some folks who are trying to
propose other middle grounds, and there will undoubtedly be some discussion
around which ones are the best. We've gone from suggesting replacing the
default random with SystemRandom (a lot slower than MT) to removing the default
altogether, to deprecating the default and replacing it with a fast user-space
CSPRNG.

However, folks who don't want to see it change at all have thus far been
unwilling to compromise at all. I'm confused how you're saying that the
security minded folks have been unwilling to compromise when we've done that
repeatedly in this thread, whereas the backwards compat minded folks have
consistently said "No, it would break compatibility" or "We don't need to
change" or "They are probably insecure anyways". Can you explain what
compromise you're willing to accept here? If it doesn't involve breaking at
least a little compatibility then it's not a compromise, it's you demanding that
your opinion is the correct one (which isn't wrong, we're also asserting that
our opinion is the correct one, we've just been willing to move the goal posts
to try and limit the damage while still getting most of the benefit).

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From donald at stufft.io  Mon Sep 14 17:22:51 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 11:22:51 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <loom.20150914T171308-963@post.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <loom.20150914T153737-237@post.gmane.org> <mt6l4o$dgi$1@ger.gmane.org>
 <loom.20150914T163810-407@post.gmane.org>
 <etPan.55f6e06b.3339a7d1.b1e0@Draupnir.home>
 <loom.20150914T171308-963@post.gmane.org>
Message-ID: <etPan.55f6e64b.3535a079.24af@Draupnir.home>

On September 14, 2015 at 11:16:48 AM, Stefan Krah (skrah at bytereef.org) wrote:
> 
> That addresses pretty little of what I wrote, and I'd prefer to
> hear anything directly from him. Your summaries have a tendency
> to be highly biased.
> 

Well, he's expressed that he's unlikely to participate in this discussion
because he doesn't use Python and thus doesn't have any skin in the game. He
just saw an opportunity to try and improve the "ambient" security of
applications written in Python and thought he'd reach out to see if there was
any interest in it on our end.

I'd ask him personally, but given that I'm "biased" you'll have to manage to
ask him on your own.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From graffatcolmingov at gmail.com  Mon Sep 14 17:32:48 2015
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Mon, 14 Sep 2015 10:32:48 -0500
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
Message-ID: <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>

On Mon, Sep 14, 2015 at 10:01 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 14 September 2015 at 14:29, Cory Benfield <cory at lukasa.co.uk> wrote:
>> Is your argument that there are lots of ways to get security wrong,
>> and for that reason we shouldn't try to fix any of them?
>
> This debate seems to repeatedly degenerate into this type of accusation.
>
> Why is backward compatibility not being taken into account here? To be
> clear, the proposed change *breaks backward compatibility* and while
> that's allowed in 3.6, just because it is allowed, doesn't mean we
> have free rein to break compatibility - any change needs a good
> justification. The arguments presented here are valid up to a point,
> but every time anyone tries to suggest a weak area in the argument,
> the "we should fix security issues" trump card gets pulled out.
>
> For example, as this is a compatibility break, it'll only be allowed
> into 3.6+ (I've not seen anyone suggest that this is sufficiently
> serious to warrant breaking compatibility on older versions). Almost
> all of those SO questions, and google hits, are probably going to be
> referenced by people who are using 2.7, or maybe some version of 3.x
> earlier than 3.6 (at what stage do we allow for the possibility of 3.x
> users who are *not* on the latest release?) So is a solution which
> won't impact most of the people making the mistake, worth it?

So people who are arguing that the defaults shouldn't be fixed on
Python 2.7 are likely the same people who also argued that PEP 466 was
a terrible, awful, end-of-the-world type change. Yes it broke things
(like eventlet) but the net benefit for users who can get onto Python
2.7.9 (and later) is immense.

Now I'm not arguing that we should do the same to the random module,
but a backport (that is part of the stdlib) would probably be a good
idea under the same idea of allowing users to opt into security early.

> I fully expect the response to this to be "just because it'll take
> time, doesn't mean we should do nothing". Or "even if it just fixes it
> for one or two people, it's still worth it". But *that's* the argument
> I don't find compelling - not that a fix won't help some situations,
> but that because it's security, (a) all the usual trade-off
> calculations are irrelevant, and (b) other proposed solutions (such as
> education, adding specialised modules like a "shared secret" library,
> etc) are off the table.

They're not irrelevant. I personally think they're of a lower impact
to the discussion, but the reality is that the people who are
educating others are few and far between. If there are public domain
works, free tutorials, etc. that all advocate using a module in the
standard library and no one can update those, they still exist and are
still recommendations. People prefer free to correct when possible
because there's nothing free to correct them (until they get hacked or
worse). Do we have a team in the Python community that goes out to
educate people for free on security-related best practices? I haven't
seen them. The best we have is a few people on crufty mailing lists
like this one trying to make an impact because education is a much
larger and harder to solve problem than making something secure by
default.

Perhaps instead of bickering like fools on a mailing list, we could
all be spending our time better educating others. That said, I can't
make that decision for you just like you can't make that for me.

> Honestly, this type of debate doesn't do the security community much
> good - there's too little willingness to compromise, and as a result
> the more neutral participants (which, frankly, is pretty much anyone
> who doesn't have a security agenda to promote) end up pushed into a
> "reject everything" stance simply as a reaction to the black and white
> argument style.

Except you seem to have missed many of the compromises being discussed
and conceded by the security minded folks. Personally, names that
describe the outputs of the algorithms make much more sense to me than
"Seedless" and "Seeded" but no one has really bothered to shave that
yak further out of a desire to compromise and make things better as a
whole. Much of the lack of gradation has come from the opponents to
this change who seem to think of security as a step function where a
subjective measurement of "good enough for me" counts as secure.

From skrah at bytereef.org  Mon Sep 14 17:35:19 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Mon, 14 Sep 2015 15:35:19 +0000 (UTC)
Subject: [Python-ideas] Globally configurable random number generation
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <loom.20150914T153737-237@post.gmane.org> <mt6l4o$dgi$1@ger.gmane.org>
 <loom.20150914T163810-407@post.gmane.org>
 <etPan.55f6e06b.3339a7d1.b1e0@Draupnir.home>
 <loom.20150914T171308-963@post.gmane.org>
 <etPan.55f6e64b.3535a079.24af@Draupnir.home>
Message-ID: <loom.20150914T173142-991@post.gmane.org>

Donald Stufft <donald at ...> writes:
> On September 14, 2015 at 11:16:48 AM, Stefan Krah (skrah at ...) wrote:
> > 
> > That addresses pretty little of what I wrote, and I'd prefer to
> > hear anything directly from him. Your summaries have a tendency
> > to be highly biased.
> > 
> 
> Well, he's expressed that he's unlikely to participate in this discussion
> because he doesn't use Python and thus doesn't have any skin in the game. He
> just saw an opportunity to try and improve the "ambient" security of
> applications written in Python and thought he'd reach out to see if there was
> any interest in it on our end.
> 
> I'd ask him personally, but given that I'm "biased" you'll have to manage to
> ask him on your own.

No one has asked you to do anything. Ironically, this is another
example of how you manage to put a spin on basically anything you
respond to.


Stefan Krah


From sturla.molden at gmail.com  Mon Sep 14 17:39:42 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 17:39:42 +0200
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
Message-ID: <mt6pnv$1m4$1@ger.gmane.org>

On 14/09/15 16:45, Random832 wrote:

>> These functions aren't used widely in scientific computing.
>
> I don't see how that's relevant, when what I'm talking about is
> "providing an API that switches them from secure mode to
> insecure/deterministic mode"

It is not just a matter of security versus determinism. It is also a 
matter of numerical accuracy. The distribution of the output sequence 
must be proven and be as close as possible to the distribution of interest.

MT19937 is loved by scientists because it emulates sampling from the 
uniform distribution so well. Faster alternatives exist, more secure 
alternatives too. But when we simulate a stochastic process we also care 
about numerical accuracy. MT19937 is considered state of the art for 
this purpose.

It does not seem that the issue of numerical accuracy is appreciated in 
this debate. Cryptographers just want random bits that cannot be 
predicted. Numerical accuracy is not their primary concern. If you 
replace MT19937 with "something more secure" you likely also lose its
usefulness for scientific computing.


Sturla




From donald at stufft.io  Mon Sep 14 17:41:58 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 11:41:58 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <mt6pnv$1m4$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org>
Message-ID: <etPan.55f6eac6.73d4ee34.24af@Draupnir.home>

On September 14, 2015 at 11:40:53 AM, Sturla Molden (sturla.molden at gmail.com) wrote:
> On 14/09/15 16:45, Random832 wrote:
> 
> >> These functions aren't used widely in scientific computing.
> >
> > I don't see how that's relevant, when what I'm talking about is
> > "providing an API that switches them from secure mode to
> > insecure/deterministic mode"
> 
> It is not just a matter of security versus determinism. It is also a
> matter of numerical accuracy. The distribution of the output sequence
> must be proven and be as close as possible to the distribution of interest.
> 
> MT19937 is loved by scientists because it emulates sampling from the
> uniform distribution so well. Faster alternatives exist, more secure
> alternatives too. But when we simulate a stochastic process we also care
> about numerical accuracy. MT19937 is considered state of the art for
> this purpose.
> 
> It does not seem that the issue of numerical accuracy is appreciated in
> this debate. Cryptographers just want random bits that cannot be
> predicted. Numerical accuracy is not their primary concern. If you
> replace MT19937 with "something more secure" you likely also lose its
> usefulness for scientific computing.
> 

Nobody is suggesting removing MT, just making it so that you have to
explicitly opt in to it.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From robert.kern at gmail.com  Mon Sep 14 17:50:15 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 14 Sep 2015 16:50:15 +0100
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <mt6pnv$1m4$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org>
Message-ID: <mt6qbn$c4r$1@ger.gmane.org>

On 2015-09-14 16:39, Sturla Molden wrote:
> On 14/09/15 16:45, Random832 wrote:
>
>>> These functions aren't used widely in scientific computing.
>>
>> I don't see how that's relevant, when what I'm talking about is
>> "providing an API that switches them from secure mode to
>> insecure/deterministic mode"
>
> It is not just a matter of security versus determinism. It is also a matter of
> numerical accuracy. The distribution of the output sequence must be proven and
> be as close as possible to the distribution of interest.
>
> MT19937 is loved by scientists because it emulates sampling from the uniform
> distribution so well. Faster alternatives exist, more secure alternatives too.
> But when we simulate a stochastic process we also care about numerical accuracy.
> MT19937 is considered state of the art for this purpose.

Actually, it's well behind the state of the art as it fails BigCrush. The 
proposed alternative does better in this regard.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From random832 at fastmail.com  Mon Sep 14 17:52:10 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 14 Sep 2015 11:52:10 -0400
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <mt6pnv$1m4$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org>
Message-ID: <1442245930.209341.383250457.5F839815@webmail.messagingengine.com>

On Mon, Sep 14, 2015, at 11:39, Sturla Molden wrote:
> It does not seem that the issue of numerical accuracy is appreciated in 
> this debate. Cryptographers just want random bits that cannot be 
> predicted. Numerical accuracy is not their primary concern. If you 
> replace MT19937 with "something more secure" you likely also lose its
> usefulness for scientific computing.

Who is doing scientific computing but not using the seeding functions?

From sturla.molden at gmail.com  Mon Sep 14 17:53:53 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 17:53:53 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmWWWt_iPNPG2e52pKX-AUdeAusCbCFiK0LrqibSgL_xA@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <55F0BD0F.10508@sdamon.com>
 <CAExdVNmWWWt_iPNPG2e52pKX-AUdeAusCbCFiK0LrqibSgL_xA@mail.gmail.com>
Message-ID: <mt6qii$fr1$1@ger.gmane.org>

On 10/09/15 03:55, Tim Peters wrote:

> Would your answer change if a crypto generator were _faster_ than MT?
> MT isn't speedy by modern standards, and is cache-hostile (about 2500
> bytes of mutable state).
>
> Not claiming a crypto hash _would_ be faster.  But it is possible.

Speed is not the main matter of concern. MT19937 is not very fast, it is 
very accurate. It is used in scientific computing when we want to 
simulate sampling from a given distribution as accurately as possible. 
Its strength is in the distribution of numbers it generates, not in its
security or speed. MT19937 allows us to produce a very precise 
simulation of a stochastic process. The alternatives cannot compare in 
numerical quality, though they might be faster or more secure, or both.

When we use MT19937 in scientific computing we deliberately sacrifice 
speed for accuracy. A crypto hash might be faster, but will it be more
accurate? Accuracy means how well the generated sequence emulates 
sampling from a perfect uniform distribution. MT19937 does not have any 
real competition in this game.


Sturla






From antoine at python.org  Mon Sep 14 17:55:30 2015
From: antoine at python.org (Antoine Pitrou)
Date: Mon, 14 Sep 2015 15:55:30 +0000 (UTC)
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <etPan.55f6e50e.2f0a18c5.24af@Draupnir.home>
Message-ID: <loom.20150914T172642-385@post.gmane.org>

Donald Stufft <donald at ...> writes:
> 
> How has it not been taken into account? The current proposal (best summed up
> by Nick in the other thread) will not break compatibility for anyone except
> those calling the functions that are specifically about setting a seed or
> getting/setting the current state.

That's a pretty big "except". Paul's and my concern is about compatibility
breakage, saying "it doesn't break compatibility except..." sounds like a
lot of empty rhetoric.

> In looking around I don't see a lot of
> people using those particular functions

Given that when you "look around" you only end up looking around amongst the Web
developer crowd, I am not surprised.

You know, when I "look around" I don't see a lot of people using the random
module to generate passwords. Your anecdote would be more valuable than other
people's?

> However, one of
> the biggest groups of people who are most likely to be helped by this
> change is new and inexperienced developers who don't fully grasp the
> security sensitive nature of whatever they are doing with random.

Yes, because generating passwords is a common and reasonable task for new
and inexperienced developers? Really?

Again, why don't you propose a dedicated API for that? That's what we did
for constant-time comparisons. That's what people did for password hashing.
That's what other people did for cryptography. I haven't seen a reasonable
rebuttal to this. Why would generating passwords be any different from all
those use cases? After all, if you provide a convenient API people should
flock to it, instead of cumbersomely reinventing the wheel... That's what
libraries are for.
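
As a rough illustration of what such a dedicated API could look like, here
is a minimal sketch built only on pieces that already exist in the stdlib
(random.SystemRandom and hmac.compare_digest). The names generate_password
and tokens_match, and their parameters, are hypothetical; they are not an
existing or proposed API.

```python
import hmac
import string
from random import SystemRandom

# SystemRandom draws from os.urandom(), so it cannot be seeded or replayed.
_sysrand = SystemRandom()


def generate_password(length=16, alphabet=string.ascii_letters + string.digits):
    """Hypothetical helper: pick `length` characters using the OS entropy source."""
    if length <= 0:
        raise ValueError("length must be positive")
    return "".join(_sysrand.choice(alphabet) for _ in range(length))


def tokens_match(expected, supplied):
    """Hypothetical wrapper around the existing constant-time comparison."""
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))


password = generate_password()
print(len(password))
print(tokens_match(password, password))
```

The point of the sketch is only that a purpose-built helper leaves the
top-level random functions untouched; whether people would actually find
and use it is the open question in this thread.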

> However, I/we
> are willing to compromise by sacrificing possible security in order to not
> regress things where we can, in particular a user-space CSPRNG is being
> proposed over SystemRandom because it will provide you with random numbers
> almost as fast as MT will.

Really, it's not so much a performance issue as a compatibility issue.
The random module provides, by default, a *deterministic* stream of random
numbers. That os.urandom() may be a tad slower isn't very important when you're
generating one number at a time and processing it with a slow interpreter
(besides, MT itself is hardly the fastest PRNG out there). That os.urandom()
doesn't give you a way to seed it once and get predictable results is
a big *regression* if made the default RNG in the random module.

And the same can be said for a user-space CSPRNG, as far as I understand
the explanations here.
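
The difference can be seen with the classes the random module already
provides. A minimal sketch, using only existing stdlib names (random.Random
and random.SystemRandom); the variable names and the seed value are
illustrative:

```python
import random

# A seeded Mersenne Twister instance replays the same stream every time.
first = random.Random(12345)
second = random.Random(12345)
seeded_a = [first.random() for _ in range(5)]
seeded_b = [second.random() for _ in range(5)]
print(seeded_a == seeded_b)

# The state can also be captured and restored to replay part of a stream.
state = first.getstate()
next_values = [first.random() for _ in range(3)]
first.setstate(state)
replayed = [first.random() for _ in range(3)]
print(next_values == replayed)

# SystemRandom is backed by os.urandom(): seed() is accepted but ignored,
# and the state APIs raise NotImplementedError.
sysrand = random.SystemRandom()
sysrand.seed(12345)
try:
    sysrand.getstate()
    state_supported = True
except NotImplementedError:
    state_supported = False
print(state_supported)
```

Any code relying on the first two behaviours through the module-level
functions is what would be affected if the default generator changed.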

> However, when proposing this possible compromise, we are met with people
> refusing to meet us in the middle.

See, people are fed up with the incompatibilities arising "in the name of
the public good" in each new feature release of Python. When the "middle"
doesn't sound much more desirable than the "extreme", I don't see why I
should call it a "compromise".

Some people have to support code in 4 different Python versions and
further gratuitous breakage in the stdlib doesn't help. Yes, they can
change their code. Yes, they can use the "six" module, the "future" module
or whatever new bandaid exists on PyPI. Still they must change their code
one way or another because it was deemed "necessary" to break compatibility
to solve a concern that doesn't seem grounded in any reasonable analysis.

Python 3 was there to break compatibility. Not Python 3.4. Not Python 3.5.
Not Python 3.6.

(in case you're wondering, trying to make all published code on the Internet
secure by appropriately changing the interpreter's "behaviour" to match
erroneous expectations - even *documented* as erroneous - is *not* reasonable
- no matter how hard you try, there will always be occurrences of broken code
that people copy and paste around)

> Can you explain what
> compromise you're willing to accept here?

Let's rephrase this: are *you* willing to accept an admittedly "insecure
by default" compromise?

No you aren't, evidently. There's no evidence that you would agree to
leave the top-level random functions intact, even if a new UserSpaceSecureRandom
class was added to the module, right?

So why would we accept a compatibility-breaking compromise? Because we are
more "reasonable" than you?

(which in this context really reads: more willing to quit the discussion
because of boredom, exhaustion, lack of time or any other quite human
reason; which, btw, sums up a significant part of what the dynamics of
python-ideas have become: "victory of the most obstinate")

Yeah, that's always what you are betting on, because it's not like *you*
will ever be reasonable except if it's the last resort for getting something
accepted. And that's why every discussion about security with security-minded
(read: "obsessed") people is a massive annoyance, even if at the end it
succeeds in reaching a "compromise", after 500+ excruciating back-and-forths
on a mailing-list.

Regards

Antoine.



From p.f.moore at gmail.com  Mon Sep 14 17:57:01 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 Sep 2015 16:57:01 +0100
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
Message-ID: <CACac1F_7r25cTkucx+zsntsc9bG972j+Ja5MC9ciz2VUFf35CQ@mail.gmail.com>

On 14 September 2015 at 14:32, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I'll write it up as a full PEP later, but I think it's just as useful
> in this form for now.

Please provide costs and benefits. At the moment, the proposal takes
an implied stance that fixing security issues warrants disruption to
users (and in particular to users with *no* security requirements). I
appreciate that there's the usual 2-release long deprecation process,
and that the only visible disruption is to the state/seed APIs. But
I'd like to see that expanded on a little more, precisely to convince
those people who *aren't* automatically convinced by "there's a
security issue" arguments, that the trade-offs have been properly
analyzed.

For example, in terms of costs:

1. The module API is more complex and harder to teach.
2. The new API deliberately introduces a global state setting API.
3. People using "from random import choice" can't use the "simple
upgrade" recommendation "from random import system_random as random".
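
Point 3 can be sketched with what the module offers today. This assumes an
instance of the existing random.SystemRandom class as a stand-in for the
proposed "system_random" object, which does not exist in the current
module; the other variable names are illustrative.

```python
import random
from random import choice  # bound method of the module's hidden Random instance

# Stand-in for the proposed (not existing) "system_random" object: an
# instance of the existing SystemRandom class.
system_random = random.SystemRandom()

# Code written as "random.choice(...)" can be redirected by rebinding one name...
rng = system_random
picked = rng.choice(["a", "b", "c"])

# ...but a name imported with "from random import choice" still points at the
# default Mersenne Twister instance and must be rebound separately.
uses_default_generator = isinstance(choice.__self__, random.Random) and not isinstance(
    choice.__self__, random.SystemRandom
)
secure_choice = system_random.choice
print(uses_default_generator)
print(isinstance(secure_choice.__self__, random.SystemRandom))
```

So every "from random import ..." line needs its own edit, rather than a
single changed import at the top of the file.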

The benefits seem to be solely:

1. Users of code written based on bad advice will be protected from
the consequences (as long as the code runs on a sufficiently new
version of Python).

(I'm serious - that's how the benefit statement reads to me. Although
I agree it'd be nice if I worded it a bit more unemotionally, I
genuinely don't know how to without either overstating it or making it
a paragraph long...)

I'm not trying to say that the cost/benefit analysis doesn't justify
the change (I'm currently unconvinced, and trying to remain open in
spite of the over-abundance of security rhetoric in the thread), just
that it's a key point of the debate here, and it's not captured in
your summary/pre-PEP.

Paul

From cory at lukasa.co.uk  Mon Sep 14 18:06:03 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 14 Sep 2015 17:06:03 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <loom.20150914T172642-385@post.gmane.org>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <etPan.55f6e50e.2f0a18c5.24af@Draupnir.home>
 <loom.20150914T172642-385@post.gmane.org>
Message-ID: <CAH_hAJEv5NmWfr5m0qPBwWxdZhw4Hyd_x4ZiL4qTp5igUKUZtQ@mail.gmail.com>

On 14 September 2015 at 16:55, Antoine Pitrou <antoine at python.org> wrote:
> Python 3 was there to break compatibility. Not Python 3.4. Not Python 3.5.
> Not Python 3.6.

To clarify: your position is that we cannot break backward
compatibility in Python 3.6?

From cory at lukasa.co.uk  Mon Sep 14 18:00:45 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 14 Sep 2015 17:00:45 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
Message-ID: <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>

On 14 September 2015 at 16:01, Paul Moore <p.f.moore at gmail.com> wrote:
> Why is backward compatibility not being taken into account here? To be
> clear, the proposed change *breaks backward compatibility* and while
> that's allowed in 3.6, just because it is allowed, doesn't mean we
> have free rein to break compatibility - any change needs a good
> justification. The arguments presented here are valid up to a point,
> but every time anyone tries to suggest a weak area in the argument,
> the "we should fix security issues" trump card gets pulled out.

What makes you think that I didn't take it into account? I did: and
then rejected it. On a personal level, I believe that defaulting to
more secure is worth backward compatibility breaks. I believe that a
major reason for the overwhelming prevalence of security
vulnerabilities in modern software is because we are overly attached
to making people's lives *easy* at the expense of making them *safe*.
I believe that software communities in general are too concerned about
keeping the stuff that people used around for far too long, and not
concerned enough about pushing users to make good choices.

The best example of this is OpenSSL. When compiled from source naively
(e.g. ./config && make && make install), OpenSSL includes support for
SSLv3, SSL Compression, and SSLv2, all of which are known-broken
options. To clarify, SSLv2 has been deprecated for security reasons
since 1996, but a version of OpenSSL 1.0.2d you build today will
happily enable *and use* it. Hell, OpenSSL's own build instructions
include this note[0]:

> OpenSSL has been around a long time, and it carries around a lot of
> cruft. For example, from above, SSLv2 is enabled by default. SSLv2 is
> completely broken, and you should disable it during configuration.

Why is it that users who do not read the wiki (most of them) get an
insecure build? Backwards compatibility is why.

This is necessarily a reductio ad absurdum type of argument, because
I'm trying to make a rhetorical point: I believe that sacrificing
security on the altar of backwards compatibility is a bad idea in the
long term, and I want to discourage it as best I can.

I appreciate your desire to maintain backward compatibility, Paul, I
really do. And I think it is probably for the best that people like
you work on projects like CPython, while people like me work outside
the standard library. However, that won't stop me trying to drag the
stdlib towards more secure defaults: it just might make it futile.

From antoine at python.org  Mon Sep 14 18:15:55 2015
From: antoine at python.org (Antoine Pitrou)
Date: Mon, 14 Sep 2015 16:15:55 +0000 (UTC)
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <etPan.55f6e50e.2f0a18c5.24af@Draupnir.home>
 <loom.20150914T172642-385@post.gmane.org>
 <CAH_hAJEv5NmWfr5m0qPBwWxdZhw4Hyd_x4ZiL4qTp5igUKUZtQ@mail.gmail.com>
Message-ID: <loom.20150914T180912-590@post.gmane.org>

Cory Benfield <cory at ...> writes:
> 
> On 14 September 2015 at 16:55, Antoine Pitrou <antoine at ...> wrote:
> > Python 3 was there to break compatibility. Not Python 3.4. Not Python 3.5.
> > Not Python 3.6.
> 
> To clarify: your position is that we cannot break backward
> compatibility in Python 3.6?

It is. Not breaking backward compatibility in feature releases
(except 3.0, which was a deliberate special case) is a very long
standing policy, and it is so because users have a much better
time with such a policy, especially when people have to maintain
code that's compatible across multiple versions (again, the 2->3
transition is a special case, which justifies the existence of
tools such as "six", and has incidentally created a lot of turmoil
in the community that has only recently begun to recede).

Of course, fixing a bug is not necessarily breaking compatibility
(although sometimes we may even refuse to fix a bug because the
impact on working code would be too large). But changing or removing
a documented behaviour that people rely on definitely is.

We do break feature compatibility, from time to time, in exceptional
and generally discussed-at-length cases, but there is a sad pressure
recently to push for more compatibility breakage - and, strangely,
always in the name of "security".

(also note that some library modules such as asyncio are or were
temporarily exempted from the compatibility requirements, because they
are in very active development; the random module evidently isn't part
of them)

Regards

Antoine.



From bussonniermatthias at gmail.com  Mon Sep 14 18:21:08 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Mon, 14 Sep 2015 09:21:08 -0700
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
Message-ID: <D30A55DB-3A27-4E6A-BB95-9D60AC98B072@gmail.com>


> On Sep 14, 2015, at 06:51, Donald Stufft <donald at stufft.io> wrote:
> 
> I don't love the "seedable" and "seedless" names here, but I don't have a
> better suggestion for the userspace CSPRNG one because its security properties
> are a bit nuanced. People doing security sensitive things like generating keys
> for cryptography should still use something based on os.urandom, so it's mostly
> about providing a safety net that will "probably" [1] be safe.

> Probably
> something like random.ProbablySecureRandom is a bad name :)

Yes but unsecureRandom for the unsecure one (which obviously is insecure)
is not unreasonable. (unsafe can be shorter) 

-- 
M

Also seedless does not mean secure: https://xkcd.com/221/ :-)



From graffatcolmingov at gmail.com  Mon Sep 14 18:25:09 2015
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Mon, 14 Sep 2015 11:25:09 -0500
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
Message-ID: <etPan.55f6f4e6.2c19e42a.156ba@prometheus.local>




On September 14, 2015 at 11:08:39 AM, Cory Benfield (cory at lukasa.co.uk) wrote:
> On 14 September 2015 at 16:01, Paul Moore wrote:
> > Why is backward compatibility not being taken into account here? To be
> > clear, the proposed change *breaks backward compatibility* and while
> > that's allowed in 3.6, just because it is allowed, doesn't mean we
> > have free rein to break compatibility - any change needs a good
> > justification. The arguments presented here are valid up to a point,
> > but every time anyone tries to suggest a weak area in the argument,
> > the "we should fix security issues" trump card gets pulled out.
>  
> What makes you think that I didn't take it into account? I did: and
> then rejected it. On a personal level, I believe that defaulting to
> more secure is worth backward compatibility breaks. I believe that a
> major reason for the overwhelming prevalence of security
> vulnerabilities in modern software is because we are overly attached
> to making people's lives *easy* at the expense of making them *safe*.
> I believe that software communities in general are too concerned about
> keeping the stuff that people used around for far too long, and not
> concerned enough about pushing users to make good choices.
>  
> The best example of this is OpenSSL. When compiled from source naively
> (e.g. ./config && make && make install), OpenSSL includes support for
> SSLv3, SSL Compression, and SSLv2, all of which are known-broken
> options. To clarify, SSLv2 has been deprecated for security reasons
> since 1996, but a version of OpenSSL 1.0.2d you build today will
> happily enable *and use* it. Hell, OpenSSL's own build instructions
> include this note[0]:
>  
> > OpenSSL has been around a long time, and it carries around a lot of
> > cruft. For example, from above, SSLv2 is enabled by default. SSLv2 is
> > completely broken, and you should disable it during configuration.
>  
> Why is it that users who do not read the wiki (most of them) get an
> insecure build? Backwards compatibility is why.

So I will counter this with what I am fully expecting to be the response:

People use distributions that compile and configure OpenSSL for them, e.g., `apt-get install openssl` (obviously not an example that works everywhere, but you get the idea). That said, last year, Debian, Ubuntu, Fedora, and other distributions all started compiling OpenSSL without SSLv3 as an available symbol, which broke backwards compatibility and TONS of Python projects (eventlet, urllib3, requests, etc.). Why did they break backwards compatibility? Because they knew that they were responsible for the security of their users, and expecting users to recompile OpenSSL themselves with the correct flags was unrealistic. Their users come from a wide range of people:

- System administrators
- Desktop users (if you believe anyone actually uses linux on the desktop ;))
- Researchers
- Developers
- etc.

> This is necessarily a reductio ad absurdum type of argument, because
> I'm trying to make a rhetorical point: I believe that sacrificing
> security on the altar of backwards compatibility is a bad idea in the
> long term, and I want to discourage it as best I can.
>  
> I appreciate your desire to maintain backward compatibility, Paul, I
> really do. And I think it is probably for the best that people like
> you work on projects like CPython, while people like me work outside
> the standard library. However, that won't stop me trying to drag the
> stdlib towards more secure defaults: it just might make it futile.


That said, I'd also like to combat the idea that security experts won't use random. Currently Helios, which is a piece of voting software (that anyone can deploy), uses the random module (https://github.com/benadida/helios-server/blob/b07c43dee5f51ce489b6fcb7b719457255c3a8b8/helios/utils.py). They use it to generate passwords: https://github.com/benadida/helios-server/blob/b07c43dee5f51ce489b6fcb7b719457255c3a8b8/helios/models.py#L944 https://github.com/benadida/helios-server/blob/b07c43dee5f51ce489b6fcb7b719457255c3a8b8/helios/management/commands/load_voter_files.py#L55

Ben Adida is a security professional who has written papers on creating secure voting systems but even he uses the random module arguably incorrectly in what should be secure software.

The argument that anyone who knows they need secure random functions will use them is clearly invalidated. Not everyone who knows they should be generating securely random things is aware that the random module is insufficient for their needs.

Perhaps that code was written before the big red box was added to the documentation and so it was ineffective. Perhaps Ben googled and found that everyone else was using random for passwords (as people have shown is easy to find in this discussion several times).

That said, your arguments are easily reduced to "No language should protect its users from themselves", which is equivalent to Python's "We're all consenting adults" philosophy. In that case, we're absolutely safe from any blame for the horrible problems that users inflict on themselves.

Anyone who used urllib2/httplib/etc. from the standard library to talk to a site over HTTPS (prior to PEP 466) is to blame because they didn't read the source and learn that their sensitive information could easily be intercepted by anyone on their network. Clearly, that's their fault. This makes core language development so much easier, doesn't it? Place all the blame on the users for the sake of X (where in this discussion X is the holy grail of backwards compatibility).

From cody.piersall at gmail.com  Mon Sep 14 18:28:49 2015
From: cody.piersall at gmail.com (Cody Piersall)
Date: Mon, 14 Sep 2015 11:28:49 -0500
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
Message-ID: <CAFSbXtNY_VpOzVHf8JEHGBg_cPL6xFmyyf1qnO=+OpoqG=9ZAw@mail.gmail.com>

On Mon, Sep 14, 2015 at 8:32 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
> This is an expansion of the random module enhancement idea I
> previously posted to Donald's thread:
> https://mail.python.org/pipermail/python-ideas/2015-September/035969.html
>
> I'll write it up as a full PEP later, but I think it's just as useful
> in this form for now.
>
> [snip]
>
> * expose a global SystemRandom instance as random.system_random
> * provide a random.set_default_instance() API that makes it possible
> to specify the instance used by the module level methods
> * the module level seed(), getstate(), and setstate() functions will
> throw RuntimeError if the corresponding method is missing from the
> default instance

One problem that people (I can't remember who) have pointed out about
random.set_default_instance() is that any imported module in the same
process can change the random from secure -> insecure at a distance.
One way to solve this is to ensure that set_default_instance() can be
called only once; if it is called more than once, a RuntimeError could
be raised.  I think the logging module does something like this for
setting the logging level?

I think the only way that this really would make sense would be to make
set_default_instance() be called before any of the module level functions.
The first time a module level function is called, you could default to
selecting the CSRNG.  If you call one of the seeded API functions
(getstate, setstate, seed) before the other module-level functions the
instance could default to the deterministic RNG, but that might be
confusing to debug.  I could imagine people getting really confused
if this program worked:

    import random
    random.seed(1234)
    random.random()

but this program failed:

    import random
    random.random()
    random.seed(1234) # would raise a RuntimeError
    random.random() # would not be reached

I'm not crazy about the idea of changing the default instance based on the
first module level function called; that might be a terrible idea.  But I
_do_ think it's a good idea not to let the default instance change
throughout the life of the program.
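
The behaviour of the two programs above could be sketched roughly as
follows. LazyDefault is a hypothetical stand-in for the module-level
functions, not a proposed API; it only shows the "first call decides,
then the instance is frozen" rule.

```python
import random


class LazyDefault:
    """Hypothetical stand-in for the module-level functions of random."""

    def __init__(self):
        self._instance = None  # chosen on first use, then never changed

    def _get(self, seeded):
        if self._instance is None:
            # First call decides: seeded API -> deterministic, else CSPRNG.
            self._instance = random.Random() if seeded else random.SystemRandom()
        return self._instance

    def random(self):
        return self._get(seeded=False).random()

    def seed(self, value):
        inst = self._get(seeded=True)
        if isinstance(inst, random.SystemRandom):
            raise RuntimeError("default instance does not support seed()")
        inst.seed(value)


works = LazyDefault()
works.seed(1234)          # seeded API first: deterministic RNG is selected
first = works.random()

fails = LazyDefault()
fails.random()            # unseeded call first: CSPRNG is selected
try:
    fails.seed(1234)
    raised = False
except RuntimeError:
    raised = True
```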

From cory at lukasa.co.uk  Mon Sep 14 18:36:43 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 14 Sep 2015 17:36:43 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <loom.20150914T180912-590@post.gmane.org>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <etPan.55f6e50e.2f0a18c5.24af@Draupnir.home>
 <loom.20150914T172642-385@post.gmane.org>
 <CAH_hAJEv5NmWfr5m0qPBwWxdZhw4Hyd_x4ZiL4qTp5igUKUZtQ@mail.gmail.com>
 <loom.20150914T180912-590@post.gmane.org>
Message-ID: <CAH_hAJFgX+y2V1EM557ioRbjqC98Fkv0aWn+KKmArS41zj=j=g@mail.gmail.com>

On 14 September 2015 at 17:15, Antoine Pitrou <antoine at python.org> wrote:
> Cory Benfield <cory at ...> writes:
>>
>> On 14 September 2015 at 16:55, Antoine Pitrou <antoine at ...> wrote:
>> > Python 3 was there to break compatibility. Not Python 3.4. Not Python 3.5.
>> > Not Python 3.6.
>>
>> To clarify: your position is that we cannot break backward
>> compatibility in Python 3.6?
>
> It is. Not breaking backward compatibility in feature releases
> (except 3.0, which was a deliberate special case) is a very long
> standing policy, and it is so because users have a much better
> time with such a policy, especially when people have to maintain
> code that's compatible across multiple versions (again, the 2->3
> transition is a special case, which justifies the existence of
> tools such as "six", and has incidentally created a lot of turmoil
> in the community that has only recently begun to recede).

This neatly resolves the problem. I have no further input to the discussion.

From sturla.molden at gmail.com  Mon Sep 14 18:56:15 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 18:56:15 +0200
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <mt6qbn$c4r$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org> <mt6qbn$c4r$1@ger.gmane.org>
Message-ID: <mt6u7d$d1c$1@ger.gmane.org>

On 14/09/15 17:50, Robert Kern wrote:

> Actually, it's well behind the state of the art as it fails BigCrush.
> The proposed alternative does better in this regard.

Is that one of the PCGs? Or Arc4Random, ChaCha20 or XorShift64/32?

The latter three fail on k-dimensional equidistribution; MT does not.
Some of the PCGs fail too, but some should be as good as MT. Not sure if
that is worse or better than failing some parts of BigCrush.

Which PCG would you recommend, by the way?



Sturla




From g.brandl at gmx.net  Mon Sep 14 18:59:50 2015
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 14 Sep 2015 18:59:50 +0200
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
Message-ID: <mt6ue6$dvt$1@ger.gmane.org>

On 09/10/2015 07:00 PM, Nick Coghlan wrote:

>> +0 for deprecating the seed-related functions and saying "the stdlib uses
>> what it uses as an RNG and you have to live with it if you don't make your own
>> choice" and switching to a crypto-secure RNG.
> 
> However, this I'm +1 on. People *do* use the module level APIs
> inappropriately, and we can get them to a much safer place, while
> nudging folks that genuinely need deterministic randomness towards an
> alternative API.

I agree.  Deprecating (and eventually removing) the 4 seed-related functions
seems like the least intrusive, but still effective, solution to this issue.

Georg


From mal at egenix.com  Mon Sep 14 19:15:48 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Sep 2015 19:15:48 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>	<87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
Message-ID: <55F700C4.4030900@egenix.com>

[This is getting off-topic, so I'll stop after this reply]

On 14.09.2015 14:26, Nathaniel Smith wrote:
> On Mon, Sep 14, 2015 at 3:37 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> On 14.09.2015 08:38, Nathaniel Smith wrote:
>>> If Tim Peters can get fooled
>>> into thinking something like using MT to generate session ids is
>>> "probably mostly secure", then what chance do the rest of us have?
>>> <wink>
>>
>> I don't think that Tim can get fooled into believing he is a
>> crypto wonk ;-)
>>
>> The thread reveals another misunderstanding:
>>
>>  Broken code doesn't get any better when you change the context
>>  in which it is run.
> 
> As an aphorism this sounds nice, but logically it makes no sense. If
> the broken thing about your code is that it assumes that the output of
> the RNG is unguessable, and you change the context by making the
> output of the RNG unguessable, then the code is no longer broken.

It's still broken, because it makes wrong assumptions about the
documented context, and the fact that it did so in the first place
suggests that this is not its only defect (pure speculation, but
experience shows that bugs usually run around in groups ;-)).

> The code would indeed remain broken when run under e.g. older
> interpreters, but this is not an argument that we should make sure
> that it stays broken in the future.
> 
>> By fixing the RNG used in such broken code and making it
>> harder to run attacks, you are only changing the context in which
>> the code is run. The code itself still remains broken.
>>
>> Code which uses the output from an RNG as session id without adding
>> any additional security measures is broken, regardless of what kind
>> of RNG you are using. I bet such code will also take any session id
>> it receives as cookie and trust it without applying extra checks
>> on it.
> 
> Yes, that's... generally the thing you do with session cookies?
> They're shared secret strings that you use as keys into some sort of
> server-side session database? What extra checks need to be applied?

You will at least want to add checks that the session id string was
indeed generated by the server and not some bot trying to
find valid session ids, e.g. by signing the session id and
checking the sig on incoming requests.
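
A minimal sketch of the signing idea, assuming an HMAC-SHA256 keyed
with a server-side secret. SERVER_KEY, the "id.signature" token layout
and both function names are made up for the illustration; a real
deployment would also need key storage and rotation.

```python
import hashlib
import hmac
import os

SERVER_KEY = os.urandom(32)  # hypothetical server-side secret, never sent out


def _sign(session_id):
    return hmac.new(SERVER_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def new_session_token():
    """Return 'id.signature' with both parts hex-encoded."""
    session_id = os.urandom(16).hex()  # 128 bits from the OS CSPRNG
    return session_id + "." + _sign(session_id)


def verify_session_token(token):
    """Return the session id if the signature checks out, else None."""
    session_id, sep, sig = token.partition(".")
    if not sep:
        return None
    expected = _sign(session_id)
    # compare_digest avoids leaking the match position through timing
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return session_id
    return None
```

A token that fails verification can be rejected before the session
store is consulted at all.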

Other things you can do: fold timeouts into the id, add IP addresses,
browser sigs, request sequence numbers.

You also need to make sure that the session ids are taken from
a large enough set to make it highly unlikely that someone
can guess a valid id simply because the number of active
sessions is significant compared to the universe
of possible ids, e.g. 32-bit ids are great for database indexes,
but a pretty bad idea if you have millions of active sessions.
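
To put rough numbers on that, here is a small back-of-the-envelope
calculation. The figure of two million active sessions is an arbitrary
assumption, and the model assumes the attacker guesses ids uniformly
at random.

```python
def guess_stats(active_sessions, id_bits):
    """Chance that one uniformly random guess hits an active session,
    and the expected number of guesses until the first hit."""
    p_hit = active_sessions / 2 ** id_bits
    return p_hit, 1 / p_hit


# 32-bit ids: on average a hit after a couple of thousand guesses.
p32, n32 = guess_stats(2000000, 32)

# 128-bit ids: the expected number of guesses is astronomically large.
p128, n128 = guess_stats(2000000, 128)
```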

>> Rather than trying to fix up the default RNG in Python by replacing
>> it with a crypto RNG, it's better to open bug reports to get the
>> broken software fixed.
>>
>> Replacing the default Python RNG with a new unstudied crypto one,
>> will likely introduce problems into working code which rightly
>> assumes the proven statistical properties of the MT.
>>
>> Just think of the consequences of adding unwanted bias to simulations.
>> This is far more likely to go unnoticed than a session hijack due
>> to a broken system and can easily cost millions (or earn you
>> millions - it's all probability after all :-)).
> 
> I'm afraid you just don't understand what you're talking about here.
> 
> When it comes to adding bias to simulations, all crypto RNGs have
> *better* statistical properties than MT. A crypto RNG which was merely
> as statistically-well-behaved as MT would be considered totally
> broken, because MT doesn't even pass black-box tests of randomness
> like TestU01.

I am well aware that MT doesn't satisfy all empirical tests
and also that it is not a CSPRNG (see the code Tim and I discussed
in this thread showing how easy it is to synchronize to an existing
MT RNG if you can gain knowledge of 624 output values).
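
For readers who have not seen it, the 624-output synchronization can
be sketched as below. This is not the code from that exchange, just
the well-known approach of inverting MT19937's output tempering; the
helper names are made up, and the (3, state, None) tuple relies on
CPython's current getstate()/setstate() layout.

```python
import random


def _undo_right(y, shift):
    """Invert x ^ (x >> shift) on a 32-bit word."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def _undo_left(y, shift, mask):
    """Invert x ^ ((x << shift) & mask) on a 32-bit word."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    """Recover the raw MT19937 state word from one tempered output."""
    y = _undo_right(y, 18)
    y = _undo_left(y, 15, 0xEFC60000)
    y = _undo_left(y, 7, 0x9D2C5680)
    return _undo_right(y, 11)


def clone_mt(outputs):
    """Build a random.Random that continues the stream which produced
    624 consecutive getrandbits(32) outputs."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    state = tuple(untemper(o) for o in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))  # index 624 forces a fresh twist
    return clone


victim = random.Random(12345)
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_mt(observed)
predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
```

After the 624 observed values, predicted and actual agree, which is
exactly why MT output must not be used where unguessability matters.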

However, it has been extensively studied and it is proven to be
equidistributed, which is a key property needed for it to be used as
the basis for other derived probability distributions (as is done by the
random module).
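
As an illustration of what "derived distribution" means here, the
sketch below turns uniform draws into exponential variates by
inverse-transform sampling. The function name and the seed are made
up for the example; the approach is the textbook one and depends on
the uniform source being well distributed.

```python
import math
import random


def exponential(rng, lambd):
    """Inverse-transform sampling: map a uniform draw in [0, 1) to an
    exponential variate with rate lambd."""
    u = rng.random()
    return -math.log(1.0 - u) / lambd


rng = random.Random(2015)  # seed chosen arbitrarily for the illustration
samples = [exponential(rng, 2.0) for _ in range(100000)]
mean = sum(samples) / len(samples)  # should be close to 1 / lambd = 0.5
```

Any bias in the underlying uniform generator carries straight through
the transform into the derived samples.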

For CSPRNGs you can empirically test properties but, due to their
nature, not prove them, e.g. that they are equidistributed - even though they
usually will pass standard frequency tests. For real-life purposes,
you're probably right about them not being biased. I'm a mathematician,
though, so I like provable more than empirical :-)

The main purpose of CSPRNGs is producing output which you cannot guess,
not to produce output which has provable distribution qualities.
They do this in a more efficient way than having to wait for enough
entropy to be collected - basically making true random number
generators practically usable.

There's a new field which appears to be popular these days:
"Chaotic Pseudo Random Number Generators" (CPRNGs). These are based
on chaotic systems and are great for making better use of available
entropy.

I'm sure we'll have something similar to the MT for these
chaotic systems come out of this research in a while and then Python
should follow this by implementing it in a new module.

Until then, I think it's worthwhile using the existing rand
code in OpenSSL and exposing this through the ssl module:

https://www.openssl.org/docs/man1.0.1/crypto/rand.html

It interfaces to platform hardware true RNGs where available, falls
back to an SHA-1 based 1k pool based generator where needed. It's
being used for SSL session keys, key generation, etc.,
trusted by millions of people and passes the NIST tests.

This paper explains the algorithm in more detail:

http://webpages.uncc.edu/yonwang/papers/lilesorics.pdf

The downside of the OpenSSL implementation is that it can
fail if there isn't enough entropy available.
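
For what it's worth, the ssl module has provided ssl.RAND_bytes() and
ssl.RAND_status() since Python 3.3, so a sketch of this usage is
short. The wrapper name and the fallback to os.urandom() are
assumptions made for the illustration, not a recommendation from this
thread.

```python
import os
import ssl


def openssl_random_bytes(n):
    """Ask OpenSSL's RAND machinery for n bytes; fall back to os.urandom
    if OpenSSL reports that its generator is not sufficiently seeded."""
    try:
        return ssl.RAND_bytes(n)
    except ssl.SSLError:
        return os.urandom(n)


token = openssl_random_bytes(16)
```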

Here's a slightly better algorithm, but it's just one of many
which you can find when searching for CPRNGs:

https://eprint.iacr.org/2012/471.pdf

>> Now, pointing people who write broken code to a new module which
>> provides a crypto RNG probably isn't much better either. They'd feel
>> instantly secure because it says "crypto" on the box and forget
>> about redesigning their insecure protocol as well. Nothing much you
>> can do about that, I'm afraid.
> 
> Yes, improving the RNG only helps with some problems, not others; it
> might merely make a system harder to attack, rather than impossible to
> attack. But giving people unguessable random numbers by default does
> solve real problems.

Drop the "by default" and I agree, as will probably everyone else
in this thread :-)

-- 
Marc-Andre Lemburg
eGenix.com


From Nikolaus at rath.org  Mon Sep 14 20:32:26 2015
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Mon, 14 Sep 2015 11:32:26 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F700C4.4030900@egenix.com> (M.'s message of "Mon, 14 Sep 2015
 19:15:48 +0200")
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com>
Message-ID: <87oah4rgw5.fsf@thinkpad.rath.org>

On Sep 14 2015, "M.-A. Lemburg" <mal-SVD0I98eSHvQT0dZR+AlfA at public.gmane.org> wrote:
>>> Code which uses the output from an RNG as session id without adding
>>> any additional security measures is broken, regardless of what kind
>>> of RNG you are using. I bet such code will also take any session id
>>> it receives as cookie and trust it without applying extra checks
>>> on it.
>> 
>> Yes, that's... generally the thing you do with session cookies?
>> They're shared secret strings that you use as keys into some sort of
>> server-side session database? What extra checks need to be applied?
>
> You will at least want to add checks that the session id string was
> indeed generated by the server and not some bot trying to
> find valid session ids, e.g. by signing the session id and
> checking the sig on incoming requests.

The chance of a bot hitting a valid (randomly generated) session key by
chance should be just as high as the bot generating a correctly signed
session key by chance, if I'm not mistaken. 

(Assuming, of course, that the completely random key has the same number
of bits as the other key + signature).


Best,
-Nikolaus
-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From sturla.molden at gmail.com  Mon Sep 14 21:07:40 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 21:07:40 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F700C4.4030900@egenix.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>	<87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQu
 EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com>
Message-ID: <mt75tq$bn1$1@ger.gmane.org>

On 14/09/15 19:15, M.-A. Lemburg wrote:

> I am well aware that MT doesn't satisfy all empirical tests
> and also that it is not a CSPRNG

> However, it has been extensively studied and it is proven to be
> equidistributed which is a key property needed for it to be used as
> basis for other derived probability distributions (as is done by the
> random module).

And with this criterion, only MT and certain PCG generators are 
acceptable. Those are (to my knowledge) the only ones with proven 
equidistribution.


Sturla


From p.f.moore at gmail.com  Mon Sep 14 21:14:05 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 Sep 2015 20:14:05 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
Message-ID: <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>

On 14 September 2015 at 16:32, Ian Cordasco <graffatcolmingov at gmail.com> wrote:
>> I fully expect the response to this to be "just because it'll take
>> time, doesn't mean we should do nothing". Or "even if it just fixes it
>> for one or two people, it's still worth it". But *that's* the argument
>> I don't find compelling - not that a fix won't help some situations,
>> but that because it's security, (a) all the usual trade-off
>> calculations are irrelevant, and (b) other proposed solutions (such as
>> education, adding specialised modules like a "shared secret" library,
>> etc) are off the table.
>
> They're not irrelevant. I personally think they're of a lower impact
> to the discussion, but the reality is that the people who are
> educating others are few and far between. If there are public domain
> works, free tutorials, etc. that all advocate using a module in the
> standard library and no one can update those, they still exist and are
> still recommendations. People prefer free to correct when possible
> because there's nothing free to correct them (until they get hacked or
> worse). Do we have a team in the Python community that goes out to
> educate people for free on security-related best practices? I haven't
> seen them. The best we have is a few people on crufty mailing lists
> like this one trying to make an impact because education is a much
> larger and harder to solve problem than making something secure by
> default.
>
> Perhaps instead of bickering like fools on a mailing list, we could
> all be spending our time better educating others.

You may well be right. Personally, I'm pretty sick of the way all of
these debates degenerate into content-free reiteration of the same old
points, and unwillingness to hear other people's views.

Here's a point - it seems likely that the people arguing for this
change are of the opinion that I'm not appreciating their position.
(For the record, I'm not being deliberately obstructive in case anyone
thought otherwise. In my view at least, I don't understand the
security guys' position). Assuming that's the case, then I'm probably
one of the people who needs educating. But I don't feel like anyone's
trying to educate me, just that I'm being browbeaten until I give in.

Education != indoctrination.

> That said, I can't
> make that decision for you just like you can't make that for me.

Indeed. Personally, I spend quite a lot of time in my day job (closed
source corporate environment) trying to educate people in sane
security practices, usually ones I have learned from people in
communities like this one. One of the biggest challenges I have is
stopping people from viewing security as "an annoying set of rules
that get in the way of what I'm trying to do". But you would not
believe the sorts of things I see routinely - I'm not willing to give
examples or even outlines on a public mailing list because I can't
assess whether such information could be turned into an exploit. I can
say, though, that crypto-safe RNGs is *not* a relevant factor :-)

At its best, good security practice should *help* people write
reliable, easy to use systems. Or at a minimum, not get in the way.
But the PR message needs always to be "I understand the constraints
you're dealing with", not "you must do this for your own good".
Otherwise the "follow the rules until the auditors go away" attitude
just gets reinforced. Hence my focus on seeing proof that breakages
are justified *in the context of the target audience I am responsible
for*.

Conversely, you're right that I can't force anyone else to try to
educate people in good security practices, however much better than me
at it I might think they are. In actual fact, though, I think a lot of
people do a lot of good work educating others - as I say, most of what
I've learned has been from lists like these.

>> Honestly, this type of debate doesn't do the security community much
>> good - there's too little willingness to compromise, and as a result
>> the more neutral participants (which, frankly, is pretty much anyone
>> who doesn't have a security agenda to promote) end up pushed into a
>> "reject everything" stance simply as a reaction to the black and white
>> argument style.
>
> Except you seem to have missed many of the compromises being discussed
> and conceded by the security minded folks.

OK, you have a point - there have been changes to the proposals. But
there are fundamental points that have (as far as I can see) never
been acknowledged. As a result, the changes feel less like compromises
based on understanding each other's viewpoints, and more like repeated
attempts to push something through, even if it's not what was
originally proposed. (I *know* this is an emotional position - please
understand I'm fed up and not always managing to word things
objectively).

Specifically, I have been told that I can't argue my "convenience"
over the weight of all the other people who could fall into security
traps with the current API. Let's review that, shall we?

* My argument is that breaking backward compatibility needs to be
justified. People have different priorities. "Security risks should be
fixed" isn't (IMO) a free pass. Why should it be? "Windows
compatibility issues should be fixed" isn't a free pass. "PyPy/Jython
compatibility issues should be fixed" isn't a free pass. Forcing me to
adjust my priorities so that I care about security when I don't want
(or IMO need) to isn't acceptable.
* The security arguments seem to be largely in the context of web
application development (cookies, passwords, shared secrets, ...)
That's not the only context that matters.
* As I said above, in my experience, a compatibility break "to make
things more secure" is seen as equating security with inconvenience,
and can actually harm attempts to educate users in better security
practices.
* In many environments, reproducibility of random streams is
important. I'm not an expert on those fields, although I've hit some
situations where seeding is a requirement. As far as I am aware, most
of those situations have no security implications. So for them, the
PEP is all cost, no benefit. Sure the cost is small, but it's
non-zero.

How come the web application development community is the only one
whose voice gets heard? Is it because the fact that they *are*
public-facing, and frequently open-source, means that data is
available? So "back it up with facts or we won't believe you" becomes
a debating stance? I'm not arguing that everyone should be allowed to
climb up on their soapbox and rant - but I would like to think that
bringing a different perspective to the table could be treated with
respect and genuine attempts to understand. And "in my experience" is
viewed as an offer of information, not as an attempt to bluff on a
worthless hand.

Just to be clear, I think the current proposal (Nick's pre-PEP) is
relatively unobtrusive, and unlikely to cause serious compatibility
issues. I'm uncomfortable with the fact that it feels like yet another
"imposition in the name of security", and while I'm only one person I
feel that I'm not alone. I'm concerned that the people pushing
security seem unable to recognise that people becoming sick of such
changes is a PR problem they need to address, but that's their issue
not mine. So I'm unlikely to vote against the proposal, but I'll feel
sad if it's accepted without a more balanced discussion than we've
currently had.

On the meta-issue of how debates like this are conducted, I think
people probably need to listen more than they talk. I'm as guilty as
anyone else here. But in particular, when multiple people all end up
responding to rebut *every* counter-argument, essentially with the
same response, maybe it's time to think "we're in the majority here,
let's stop talking so much and see if we're missing anything from what
the people with other views are saying". He who shouts loudest isn't
always right. Not necessarily wrong, either, but sometimes it's bloody
hard to tell one way or the other, if they won't shut up long enough
to analyze the objections.

> Personally, names that
> describe the outputs of the algorithms make much more sense to me than
> "Seedless" and "Seeded" but no one has really bothered to shave that
> yak further out of a desire to compromise and make things better as a
> whole.

I'm frankly long past caring. I think we'll end up with whatever was
on the table when people got too tired to argue any more.

> Much of the lack of gradation has come from the opponents to
> this change who seem to think of security as a step function where a
> subjective measurement of "good enough for me" counts as secure.

Wait, what? It's *me* that's claiming that security is a yes/no
thing??? When all I'm hearing is "education isn't sufficient",
"dedicated libraries aren't sufficient", "keeping a deterministic RNG
as default isn't an option"? And when I'm suggesting that fixing the
PRNG use in code that misuses a PRNG may not be the only security
issue with that code? I knew the two sides weren't communicating, but
this statement staggers me. We have clearly misunderstood each other
even more fundamentally than I had thought possible :-(

Thinking hard about the implications of what you said there, I start
to see why you might have misinterpreted my stance as the black and
white one. But I have absolutely no idea how to explain to you that I
find your stance equally (and before I took the time to think through
what your statement implied, even more) so.

There's little more I can say. I'm going to take my own advice now,
and stop talking. I'll keep listening, in the hope that either this
post or something else will somehow break the logjam, but right now
I'm not sure I have much hope of that.

Paul

From robert.kern at gmail.com  Mon Sep 14 21:25:14 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 14 Sep 2015 20:25:14 +0100
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <mt6u7d$d1c$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org> <mt6qbn$c4r$1@ger.gmane.org>
 <mt6u7d$d1c$1@ger.gmane.org>
Message-ID: <mt76uq$t8s$1@ger.gmane.org>

On 2015-09-14 17:56, Sturla Molden wrote:
> On 14/09/15 17:50, Robert Kern wrote:
>
>> Actually, it's well behind the state of the art as it fails BigCrush.
>> The proposed alternative does better in this regard.
>
> Is that one of the PCGs? Or Arc4Random, ChaCha20 or XorShift64/32?

The alternative proposed in this thread is ChaCha20.

> The three latter fails on k-dimensional equi-distribution, MT does not. Some of
> the PCGs do too, but some should be as good as MT. Not sure if that is worse or
> better than failing some parts of BigCrush.

There is a reason that exact k-dimensional equidistribution for such a large k 
is not tested even in BigCrush. It's a nifty feature useful in a few 
applications, but not for simulations. It is important that the PRNG is 
*well*-distributed, but exact equidistribution is mostly neither here nor there. 
It can be trivially implemented by statistically bad PRNGs, like a simple 
counter. Obtaining it requires implementing an astronomically long period (and 
consequent growth in the state size) that adds significant costs without any 
realizable improvement to the statistics. If I'm drawing millions of numbers, 
k=623 is not much better than k=1, provided that the generator is otherwise good.

> Which PCG would you recommend, by the way?

Probably pcg64 (128-bit state, 64-bit output). Having the 64-bit output is nice 
so you only have to draw one value to make a uniform(0,1) double, and a period 
of 2**128 is nice and roomy without being excessively large.
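
As a rough sketch of why a 64-bit output is convenient: a double has a 53-bit
significand, so a single 64-bit draw is enough to build a uniform value in
[0, 1). The helper name below is hypothetical, and random.getrandbits(64) only
stands in for one 64-bit generator output (it is the Mersenne Twister, not
pcg64).

```python
import random


def u64_to_unit_double(x):
    """Map a 64-bit unsigned integer to a float in [0, 1).

    Keeps the top 53 bits (the precision of a double's significand)
    and scales them by 2**-53.
    """
    return (x >> 11) * (1.0 / (1 << 53))


# Stand-in for a single 64-bit generator output (an assumption made
# for illustration; this is not pcg64).
draw = random.getrandbits(64)
value = u64_to_unit_double(draw)
```

With a 32-bit output, two draws would have to be combined to fill the same 53
bits.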

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From random832 at fastmail.com  Mon Sep 14 21:25:52 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 14 Sep 2015 15:25:52 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
 <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>
Message-ID: <1442258752.255504.383449553.23795E2B@webmail.messagingengine.com>

On Mon, Sep 14, 2015, at 15:14, Paul Moore wrote:
> * My argument is that breaking backward compatibility needs to be
> justified.

I don't think it does. I think that there needs to be a long roadmap of
deprecation and provided workarounds for *almost any*
backwards-compatibility-breaking change, but that special justification
beyond "is this a good feature" is only needed for ignoring that
roadmap, not for deprecating/replacing a feature in line with it.

No-one, as far as I have seen in this thread to date, has actually put a
timeline on this change. No-one's talking about getting rid of the
global functions in 3.5.1, or in 3.6, or in 3.7. So with that in mind I
can only conclude that the people against making the change are against
*ever* making it *at all* - and certainly a lot of the arguments they're
making have to do with nebulous educational use-cases (class instances
are hard, let's use mutable global state) rather than backwards
compatibility. Would you likewise have been against every single thing
that Python 3 did?

From p.f.moore at gmail.com  Mon Sep 14 21:26:49 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 Sep 2015 20:26:49 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
Message-ID: <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>

On 14 September 2015 at 17:00, Cory Benfield <cory at lukasa.co.uk> wrote:
> What makes you think that I didn't take it into account? I did: and
> then rejected it. On a personal level, I believe that defaulting to
> more secure is worth backward compatibility breaks. I believe that a
> major reason for the overwhelming prevalence of security
> vulnerabilities in modern software is because we are overly attached
> to making people's lives *easy* at the expense of making them *safe*.
> I believe that software communities in general are too concerned about
> keeping the stuff that people used around for far too long, and not
> concerned enough about pushing users to make good choice.

OK. In *my* experience, systems with appallingly bad security
practices run for many years with no sign of an exploit. The
vulnerabilities described in this thread pale into insignificance
compared to many I have seen. On the other hand, I regularly see
systems not being upgraded because the cost of confirming that there
are no regressions (much less the cost of making fixes for deliberate
incompatibilities) is deemed too high.

I'm not trying to justify those things, nor am I trying to say that my
experience is in any way "worth more" than yours. These aren't all
Python systems. But the culture where such things occur is real, and I
have no reason to believe that I'm the only person in this position.
(But as it's in-house closed-source, it's essentially impossible to
get any good view of how common it is).

Paul

From donald at stufft.io  Mon Sep 14 22:23:17 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 16:23:17 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
 <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>
Message-ID: <etPan.55f72cb5.7b968f9d.24af@Draupnir.home>

On September 14, 2015 at 3:14:45 PM, Paul Moore (p.f.moore at gmail.com) wrote:
>  
> Here's a point - it seems likely that the people arguing for this
> change are of the opinion that I'm not appreciating their position.
> (For the record, I'm not being deliberately obstructive in case anyone
> thought otherwise. In my view at least, I don't understand the
> security guys' position). Assuming that's the case, then I'm probably
> one of the people who needs educating. But I don't feel like anyone's
> trying to educate me, just that I'm being browbeaten until I give in.
>  
> Education != indoctrination.

For the record, I'm not sure what part you don't understand. I'm happy to try
and explain it, but I think I'm misunderstanding what you're not understanding
or something because I personally feel like I did explain what I think you're
misunderstanding.

Part of the problem (probably) here is that there isn't an exact person we're
trying to protect here. The general gist is that if you use the deterministic
APIs in a security sensitive situation, then you may be vulnerable depending
on exactly what you're doing. We think that in particular, the API of the
random module will lead inexperienced or un(der)informed developers to use the
API in situations where it's not appropriate and, from that, have an insecure
piece of software they wrote. We're people who think that the defaults of the
software should be "generally" secure (as much so as is reasonable) and that
if you want to do something that isn't safe then you should explicitly opt in
to that (the flipside is, things shouldn't be so locked down as to be unusable
without having to turn off all of the security knobs, this is where the
"generally" in generally secure comes into play).

A particularly nasty side effect of this is that it's almost never the people
who wrote this software who are harmed by it being broken and it's almost
always their users who didn't have anything to do with it.

So essentially the goal is to try and make it harder for people to accidentally
misuse the random module. If that doesn't answer your confusion, if you can
try to reword it to get it through my thick skull better, I'm happy to continue
to try and answer it (on or off list).

>  
> At its best, good security practice should *help* people write
> reliable, easy to use systems. Or at a minimum, not get in the way.
> But the PR message needs always to be "I understand the constraints
> you're dealing with", not "you must do this for your own good".
> Otherwise the "follow the rules until the auditors go away" attitude
> just gets reinforced. Hence my focus on seeing proof that breakages
> are justified *in the context of the target audience I am responsible
> for*.

Right, and this is actually trying to do that. By removing a possibly dangerous
default and making the default safer. Defaults matter a lot in security (and
sadly, a lot of software doesn't have safe defaults) because a lot of software
will never use anything but the defaults.

>  
> Conversely, you're right that I can't force anyone else to try to
> educate people in good security practices, however much better than me
> at it I might think they are. In actual fact, though, I think a lot of
> people do a lot of good work educating others - as I say, most of what
> I've learned has been from lists like these.
>  
> >> Honestly, this type of debate doesn't do the security community much
> >> good - there's too little willingness to compromise, and as a result
> >> the more neutral participants (which, frankly, is pretty much anyone
> >> who doesn't have a security agenda to promote) end up pushed into a
> >> "reject everything" stance simply as a reaction to the black and white
> >> argument style.
> >
> > Except you seem to have missed much of the compromises being discussed
> > and conceded by the security minded folks.
>  
> OK, you have a point - there have been changes to the proposals. But
> there are fundamental points that have (as far as I can see) never
> been acknowledged. As a result, the changes feel less like compromises
> based on understanding each other's viewpoints, and more like repeated
> attempts to push something through, even if it's not what was
> originally proposed. (I *know* this is an emotional position - please
> understand I'm fed up and not always managing to word things
> objectively).

I think part of this is that a lot of the folks proposing these changes are
also sensitive to the backwards compatibility needs and have already baked that
into their thoughts. We don't generally come into these with "scorched earth"
suggestions of fixing some situation where security could be improved but
instead try and figure out a decent balance of security and not breaking things
to try and cover most of the ground with as little cost as possible.

My very first email in this particular thread (that started this thread) was
the first one I had with a fully solid proposal in it. The last paragraph in
that proposal asked the question "Do we want to protect users by default?" My
next email presented two possible options depending on which we considered to be
"less" breaking, either deprecating the module scoped functions completely or
changing their defaults to something secure, and mentioned that if we can't change
the default, the user-land CSPRNG probably isn't a useful addition because its
benefit is primarily in being able to make it the default option.

I don't see anyone who is talking about making a change not also talking about
what areas of backwards compatibility it would actually break.

I think part of this too is that security is a bit weird, it's not a boolean
property but there are particular bars you need to pass before it's an actual
solution to the problem. So for a lot of us, we'll figure out that bar and draw
a line in the sand and say "If this proposal crosses this line, then doing
nothing is better than doing something" because it'd just be churn for churn's
sake at that point. That's why you'll see particular points that we essentially
won't give up, because if they are given up we might as well do nothing. In
this particular instance, the point is that the API of the random module leads
people to use it incorrectly, so unless we address that, we might as well just
leave it alone.

>  
> Specifically, I have been told that I can't argue my "convenience"
> over the weight of all the other people who could fall into security
> traps with the current API. Let's review that, shall we?

I think I was the one who said that to you, and I'd like to explain why I said
it (beyond the fact I was riled up). Essentially I had in my mind something
like what Nick has proposed, which you've said later on you think is relatively
unobtrusive, and unlikely to cause serious compatibility issues, which I agree with.
Then I saw you arguing against what I felt was a pretty mundane API break that
was fairly trivial to work around, and it signaled to me that you were saying
that having to type a few extra letters was a bridge too far. This reads to me
like someone saying "Well I know how to use it correctly, it's their own fault
if others don't". I'm not saying that's what you actually think but that's how
it read to me.

>  
> * My argument is that breaking backward compatibility needs to be
> justified. People have different priorities. "Security risks should be
> fixed" isn't (IMO) a free pass. Why should it be? "Windows
> compatibility issues should be fixed" isn't a free pass. "PyPy/Jython
> compatibility issues should be fixed" isn't a free pass. Forcing me to
> adjust my priorities so that I care about security when I don't want
> (or IMO need) to isn't acceptable.

The justification is essentially that it will protect some people with minimal
impact to others. The main impact will be people who actually needed a
deterministic RNG will need to use something like ``random.seeded_random``
instead of just ``random`` and importantly, this will break in a fairly obvious
manner instead of the silently wrong situation for people who are currently
using the top level API incorrectly.
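
For illustration only, here is roughly what making that choice explicit looks
like with classes that already exist in the random module today.
``random.seeded_random`` is just a name from the proposal and is not used
here; the seed value and the token alphabet are arbitrary assumptions.

```python
import random

# Explicitly seeded, deterministic generator: same seed, same stream.
sim_a = random.Random(12345)
sim_b = random.Random(12345)
stream_a = [sim_a.random() for _ in range(5)]
stream_b = [sim_b.random() for _ in range(5)]
reproducible = (stream_a == stream_b)

# OS-backed generator for security-sensitive values; it cannot be
# seeded or replayed.
secure = random.SystemRandom()
token = "".join(secure.choice("0123456789abcdef") for _ in range(32))
```

Code that needs reproducibility notices immediately if it is handed the second
kind of generator, because its runs stop repeating. Code that needs
unpredictability gets no such signal if it is handed the first kind.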

As a bit of a divergence, the "silently wrong" part is why defaults tend to
matter a lot in security. Unless you're well versed in it, most people don't
think about it and since it "works" they don't inquire further. Something that
is security sensitive that always "works" (as in, doesn't raise an error) is
broken which is the inverse of how most people think about software. To put it
another way, it's the job of security sensitive APIs to break things, ideally
only in cases where it's important to break, but unless you're actually testing
that it breaks in those attack scenarios, secure and insecure looks exactly the
same.

> * The security arguments seem to be largely in the context of web
> application development (cookies, passwords, shared secrets, ...)
> That's not the only context that matters.

You're right it's not the only context that matters, however it's often brought
up for a few reasons:

* Security largely doesn't matter for software that doesn't accept or send
  input from some untrusted source which narrows security down to be mostly
  network based applications.

* The HTTP protocol is "eating the world" and we're seeing more and more things
  using it as their communication protocol (even for things that are not
  traditional browser based applications).

* Traditional Web Applications/Sites are a pretty large target audience for
  Python and in particular a lot of the security folks come from that world
  because the web is a hostile place.


But you can replace web application with anything that an untrusted user can
interact with over any protocol and the argument is basically the same.

> * As I said above, in my experience, a compatibility break "to make
> things more secure" is seen as equating security with inconvenience,
> and can actually harm attempts to educate users in better security
> practices.

Sadly, I don't think this is fully resolvable :(

It is the nature of security that its purpose is to take something that
otherwise "works" and make it no longer work because it doesn't satisfy the
constraints of the security system.

> * In many environments, reproducibility of random streams is
> important. I'm not an expert on those fields, although I've hit some
> situations where seeding is a requirement. As far as I am aware, most
> of those situations have no security implications. So for them, the
> PEP is all cost, no benefit. Sure the cost is small, but it's
> non-zero.

Right, and I don't think anyone is saying this isn't an important use case,
just that if you need a deterministic RNG and you don't get one, that is a
fairly obvious problem but if you need a CSPRNG and you don't get one, that is
not obvious.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From donald at stufft.io  Mon Sep 14 22:36:16 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 16:36:16 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
Message-ID: <etPan.55f72fc0.3eda087c.24af@Draupnir.home>

On September 14, 2015 at 3:27:22 PM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 14 September 2015 at 17:00, Cory Benfield wrote:
> > What makes you think that I didn't take it into account? I did: and
> > then rejected it. On a personal level, I believe that defaulting to
> > more secure is worth backward compatibility breaks. I believe that a
> > major reason for the overwhelming prevalence of security
> > vulnerabilities in modern software is because we are overly attached
> > to making people's lives *easy* at the expense of making them *safe*.
> > I believe that software communities in general are too concerned about
> > keeping the stuff that people used around for far too long, and not
> > concerned enough about pushing users to make good choice.
>  
> OK. In *my* experience, systems with appallingly bad security
> practices run for many years with no sign of an exploit. The
> vulnerabilities described in this thread pale into insignificance
> compared to many I have seen.

What does "no sign of an exploit" mean? Does it mean that if there was an
exploit that the attackers didn't put metaphorical giant signs up to say that
"Zero Cool" was here? Or is there an active security team running IDS software
to ensure that there wasn't a breach?

I ask because in my experience, "no sign of an exploit" is often synonymous
with "we've never really looked to see if we were exploited, but we haven't
noticed anything". This is a dangerous way to look at it, because a lot of
exploitation is being done by organized crime where they don't want you to
notice that you were exploited because they want to make you part of a botnet
or to silently steal data or whatever you have. For these, if they get detected
that is a bad thing because they lose that node in their botnet (or whatever).

It's a very rare exploit that gets publicly exposed like the Ashley Madison
hacks; they are just the ones that get the most attention because they are
bombastic and public.

> On the other hand, I regularly see
> systems not being upgraded because the cost of confirming that there
> are no regressions (much less the cost of making fixes for deliberate
> incompatibilities) is deemed too high.

Absolutely!

However, I think these systems largely don't upgrade *at all* and are still on
whatever version of $LANG they originally wrote the software for. These systems
tend to be so regression averse that they don't even risk bug fixes because
that might cause a regression. For these people, it doesn't really matter what
we do because they aren't going to upgrade anyways, and they keep Red Hat in
business by paying them for Python 2.4 until the heat death of the universe.

I think the more likely case for concern is people who do upgrade and are
willing to tolerate some regression in order to stay somewhat current. These
people will push back against *massive* breakage (as seen with the Python 3.x
migration taking forever) but are often perfectly fine dealing with small
breakages. As someone who does write software that supports a lot of versions
(currently, 6-7 versions of CPython alone is my standard depending if you
count pre-releases or not) having to tweak import statements doesn't even
really register in my "give a damn" meter, nor did it for the folks I know who
are in similar situations (though this is admittedly a biased and small
sample).

>  
> I'm not trying to justify those things, nor am I trying to say that my
> experience is in any way "worth more" than yours. These aren't all
> Python systems. But the culture where such things occur is real, and I
> have no reason to believe that I'm the only person in this position.
> (But as it's in-house closed-source, it's essentially impossible to
> get any good view of how common it is).
>  

I think maybe a problem here is a difference in how we look at the data. It
seems that you might focus on the probability of you personally (or the things
you work on) getting attacked and thus benefiting from these changes, whereas
I, and I suspect the others like me, think about the probability of *anyone*
being attacked.


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From sturla.molden at gmail.com  Mon Sep 14 22:39:42 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 22:39:42 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt75tq$bn1$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>	<87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQu
 EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
Message-ID: <mt7bac$3qo$1@ger.gmane.org>

On 14/09/15 21:07, Sturla Molden wrote:

> And with this criterion, only MT and certain PCG generators are
> acceptable. Those are (to my knowledge) the only ones with proven
> equidistribution.

Just to explain, for those who do not know...

Equidistribution means that the numbers are "uniformly distributed", or 
specifically that "the proportion of the sequence that fall within an 
interval is proportional to the length of the interval".

With one-dimensional equidistribution, the deviates are uniformly 
distributed on a line. With two-dimensional equidistribution, the 
deviates are uniformly distributed in a square. With three-dimensional 
equidistribution, the deviates are uniformly distributed in a cube. 
k-dimensional equi-distribution generalizes this up to a k-dimensional 
space.

Let us say you want to simulate a shooter firing a gun at a target. 
Every bullet is aimed at the target and hits in a slightly different 
place. The shooter is unbiased, but there will be some random jitter. 
The probability of hitting the target should be proportional to its 
size, right? Perhaps!

Mersenne Twister MT19937 (used in Python) is proven to have 623 
dimensional equidistribution. Certain PCG generators are proven to have 
equidistribution of arbitrary dimensionality. Your simulation of the 
shooter will agree with common sense if you pick one of these.
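
A minimal sketch of the shooter experiment, under simplifying assumptions that 
are not part of the argument above: shots land uniformly on the unit square, 
the target is an axis-aligned rectangle, and the function name, seed and 
shot counts are made up for illustration. Two successive draws form one 2-D 
point, so the hit fraction should come out close to the target's area.

```python
import random


def hit_fraction(rng, n_shots, x0, y0, width, height):
    """Fraction of uniform shots on the unit square that land in a
    rectangular target with lower-left corner (x0, y0)."""
    hits = 0
    for _ in range(n_shots):
        # Two successive draws form one 2-D point.
        x, y = rng.random(), rng.random()
        if x0 <= x < x0 + width and y0 <= y < y0 + height:
            hits += 1
    return hits / float(n_shots)


rng = random.Random(2015)  # Mersenne Twister, seed chosen arbitrarily
small = hit_fraction(rng, 200000, 0.1, 0.1, 0.2, 0.2)  # target area 0.04
large = hit_fraction(rng, 200000, 0.5, 0.3, 0.4, 0.5)  # target area 0.20
```

This is only an empirical check on a finite sample; it says nothing about 
whether the property is proven for the generator.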

With other generators, there is no such k-dimensional equidistribution. 
Your simulation of the shooter will disagree with common sense. Which is 
correct? Common sense.

 From a mathematical point of view, this is so important that anything 
other than Mersenne Twister or PCG is not worth considering in a Monte 
Carlo simulation.


Sturla








From robert.kern at gmail.com  Mon Sep 14 22:45:25 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 14 Sep 2015 21:45:25 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt75tq$bn1$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQu EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
Message-ID: <mt7bl5$7hk$1@ger.gmane.org>

On 2015-09-14 20:07, Sturla Molden wrote:
> On 14/09/15 19:15, M.-A. Lemburg wrote:
>
>> I am well aware that MT doesn't satisfy all empirical tests
>> and also that it is not a CSPRNG
>
>> However, it has been extensively studied and it is proven to be
>> equidistributed which is a key property needed for it to be used as
>> basis for other derived probability distributions (as it done by the
>> random module).
>
> And with this criterion, only MT and certain PCG generators are acceptable.
> Those are (to my knowledge) the only ones with proven equidistribution.

Do not confuse k-dimensional equidistribution with "equidistribution". The 
latter property (how uniformly a single draw is distributed) is the one that the 
derived probability distributions rely upon, not the former. Funny story: MT is 
provably *not* strictly equidistributed; it produces exactly 624 fewer 0s than 
it does any other uint32 if you run it over its entire period. Not that it 
really matters, practically speaking.

FWIW, lots of PRNGs can prove either property. To Nate's point, I think he is 
primarily thinking of counter-mode block ciphers when he talks of CSPRNGs, and 
they are trivially proved to be equidistributed. The counter is obviously 
equidistributed, and the symmetric encryption function is a bijective function 
from counter to output.
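
A minimal sketch of that argument, assuming a toy 16-bit bijection in
place of a real block cipher (toy_permutation and its constants are
made up for illustration and are NOT secure). Because the function is
a bijection, running the counter over its full range yields every
output exactly once:

```python
from collections import Counter

BITS = 16
MASK = (1 << BITS) - 1

def toy_permutation(counter):
    """Toy bijection on 16-bit integers (hypothetical, not a cipher)."""
    # An affine map with an odd multiplier is invertible modulo 2**16.
    x = (counter * 0x9E37 + 0x1234) & MASK
    # A right xorshift is invertible as well.
    x ^= x >> 7
    # A second odd multiplier keeps the whole map a bijection.
    x = (x * 0x2545) & MASK
    return x

# Run the counter over its entire period and tally the outputs.
counts = Counter(toy_permutation(i) for i in range(1 << BITS))

# Every 16-bit value appears exactly once: 1-dimensional equidistribution.
print(len(counts), set(counts.values()))
```

This says nothing about statistical quality or security; it only shows
that equidistribution follows from the bijection.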

However, not all CSPRNGs are constructed alike. In particular, ChaCha20 is a 
stream cipher rather than a block cipher, and I think Marc-Andre is right that 
it would be difficult to prove equidistribution. Proving substantial 
*non*-equidistribution could eventually happen though, as it did to ARC4, which 
prompted its replacement with ChaCha20 in OpenBSD, IIRC.

And all that said, provable equidistribution (much less provable k-dimensional 
equidistribution) doesn't make a good PRNG. A simple counter satisfies 
equidistribution, but it is a terrible PRNG. The empirical tests are more 
important IMO.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From sturla.molden at gmail.com  Mon Sep 14 22:49:14 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 22:49:14 +0200
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <mt76uq$t8s$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org> <mt6qbn$c4r$1@ger.gmane.org>
 <mt6u7d$d1c$1@ger.gmane.org> <mt76uq$t8s$1@ger.gmane.org>
Message-ID: <mt7bs8$bmp$1@ger.gmane.org>

On 14/09/15 21:25, Robert Kern wrote:

>> Which PCG would you recommend, by the way?
>
> Probably pcg64 (128-bit state, 64-bit output). Having the 64-bit output
> is nice so you only have to draw one value to make a uniform(0,1)
> double, and a period of 2**128 is nice and roomy without being
> excessively large.
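
A minimal sketch (not part of the quoted message) of why a 64-bit
output is convenient: one draw carries enough bits to fill the 53-bit
mantissa of a double. The helper name u64_to_unit_double is
hypothetical, and random.getrandbits(64) merely stands in for a pcg64
draw, which the stdlib does not provide.

```python
import random

def u64_to_unit_double(u64):
    """Map a 64-bit unsigned integer to a float in [0.0, 1.0)."""
    # Keep the top 53 bits and scale by 2**-53; the result is exact.
    return (u64 >> 11) * 2.0 ** -53

# Stand-in for a single 64-bit generator output (assumption: any
# source of 64 uniform bits would do here).
sample = u64_to_unit_double(random.getrandbits(64))
print(sample)
```

With a 32-bit generator the same conversion would need two draws.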

Thanks :)


Sturla



From sturla.molden at gmail.com  Mon Sep 14 22:56:05 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 22:56:05 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt7bl5$7hk$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQu EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org>
Message-ID: <mt7c93$gns$1@ger.gmane.org>

On 14/09/15 22:45, Robert Kern wrote:
> On 2015-09-14 20:07, Sturla Molden wrote:
>> On 14/09/15 19:15, M.-A. Lemburg wrote:
>>
>>> I am well aware that MT doesn't satisfy all empirical tests
>>> and also that it is not a CSPRNG
>>
>>> However, it has been extensively studied and it is proven to be
>>> equidistributed which is a key property needed for it to be used as
>>> basis for other derived probability distributions (as it done by the
>>> random module).
>>
>> And with this criterion, only MT and certain PCG generators are
>> acceptable.
>> Those are (to my knowledge) the only ones with proven equidistribution.
>
> Do not confuse k-dimensional equidistribution with "equidistribution".
> The latter property (how uniformly a single draw is distributed) is the
> one that the derived probability distributions rely upon, not the
> former.


Yes, there was something fishy about this. k-dimensional 
equidistribution matters if we simulate a k-dimensional tuple, as I 
understand it.

Sturla



From sturla.molden at gmail.com  Mon Sep 14 22:59:02 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 14 Sep 2015 22:59:02 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt7bl5$7hk$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQu EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org>
Message-ID: <mt7cek$gns$2@ger.gmane.org>

On 14/09/15 22:45, Robert Kern wrote:

> Funny story: MT is provably *not* strictly equidistributed; it
> produces a exactly 624 fewer 0s than it does any other uint32 if you run
> it over its entire period. Not that it really matters, practically
> speaking.

I probably would not live long enough to see it ;)


Sturla



From robert.kern at gmail.com  Mon Sep 14 23:19:09 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 14 Sep 2015 22:19:09 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt7c93$gns$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQu EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
Message-ID: <mt7dkd$87i$1@ger.gmane.org>

On 2015-09-14 21:56, Sturla Molden wrote:
> On 14/09/15 22:45, Robert Kern wrote:
>> On 2015-09-14 20:07, Sturla Molden wrote:
>>> On 14/09/15 19:15, M.-A. Lemburg wrote:
>>>
>>>> I am well aware that MT doesn't satisfy all empirical tests
>>>> and also that it is not a CSPRNG
>>>
>>>> However, it has been extensively studied and it is proven to be
>>>> equidistributed which is a key property needed for it to be used as
>>>> basis for other derived probability distributions (as it done by the
>>>> random module).
>>>
>>> And with this criterion, only MT and certain PCG generators are
>>> acceptable.
>>> Those are (to my knowledge) the only ones with proven equidistribution.
>>
>> Do not confuse k-dimensional equidistribution with "equidistribution".
>> The latter property (how uniformly a single draw is distributed) is the
>> one that the derived probability distributions rely upon, not the
>> former.
>
> Yes, there was something fishy about this. k-dimensional equidistribution
> matters if we simulate a k-dimensional tuple, as I understand it.

Yeah, but we do that every time we draw k numbers in a simulation at all. And we 
usually draw millions. In that case, perfect k=623-dimensional equidistribution 
is not really any better than k=1, provided that the PRNG is otherwise good.

The requirement for a good PRNG for simulation work is that it be *well* 
distributed in reasonable dimensions, not that it be *exactly* equidistributed 
for some k. And well-distributedness is exactly what is tested in TestU01. It is 
essentially a collection of simulations designed to expose known statistical 
flaws in PRNGs. So to your earlier question as to which is more damning, failing 
TestU01 or not being perfectly 623-dim equidistributed, failing TestU01 is.
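
To illustrate the kind of check meant here, below is a minimal sketch
of a two-dimensional serial test. It is NOT TestU01, only a toy in the
same spirit; serial_chi2, the grid size, and the seed are assumptions
made for this example. It compares the stdlib Mersenne Twister with a
plain counter, which is equidistributed in one dimension but badly
distributed in two.

```python
import random

def serial_chi2(draws, cells=16):
    """Chi-square statistic for non-overlapping pairs on a cells x cells grid."""
    pairs = len(draws) // 2
    counts = [0] * (cells * cells)
    for i in range(pairs):
        a = int(draws[2 * i] * cells)
        b = int(draws[2 * i + 1] * cells)
        counts[a * cells + b] += 1
    expected = pairs / (cells * cells)
    return sum((c - expected) ** 2 / expected for c in counts)

N = 200000

# Mersenne Twister with a fixed seed, so the run is reproducible.
mt = random.Random(12345)
mt_draws = [mt.random() for _ in range(N)]

# A counter scaled into [0, 1): uniform per value, but consecutive
# values always land in the same diagonal cell of the grid.
counter_draws = [(i % 256) / 256 for i in range(N)]

chi2_mt = serial_chi2(mt_draws)
chi2_counter = serial_chi2(counter_draws)
print(chi2_mt, chi2_counter)
```

For a well-distributed generator the statistic should be near the
number of degrees of freedom (255 here); the counter's is larger by
several orders of magnitude.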

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From mal at egenix.com  Tue Sep 15 00:09:13 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Sep 2015 00:09:13 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <mt7dkd$87i$1@ger.gmane.org>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>	<1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>	<20150909190757.GM19373@ando.pearwood.info>	<55F0BF61.6050205@canterbury.ac.nz>	<CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>	<55F13EAF.5040500@egenix.com>	<CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>	<55F1B219.1000502@egenix.com>	<87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>	<87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQu
 EnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org> <mt7dkd$87i$1@ger.gmane.org>
Message-ID: <55F74589.4030805@egenix.com>

On 14.09.2015 23:19, Robert Kern wrote:
> On 2015-09-14 21:56, Sturla Molden wrote:
>> On 14/09/15 22:45, Robert Kern wrote:
>>> On 2015-09-14 20:07, Sturla Molden wrote:
>>>> On 14/09/15 19:15, M.-A. Lemburg wrote:
>>>>
>>>>> I am well aware that MT doesn't satisfy all empirical tests
>>>>> and also that it is not a CSPRNG
>>>>
>>>>> However, it has been extensively studied and it is proven to be
>>>>> equidistributed which is a key property needed for it to be used as
>>>>> basis for other derived probability distributions (as it done by the
>>>>> random module).
>>>>
>>>> And with this criterion, only MT and certain PCG generators are
>>>> acceptable.
>>>> Those are (to my knowledge) the only ones with proven equidistribution.
>>>
>>> Do not confuse k-dimensional equidistribution with "equidistribution".
>>> The latter property (how uniformly a single draw is distributed) is the
>>> one that the derived probability distributions rely upon, not the
>>> former.
>>
>> Yes, there was something fishy about this. k-dimensional equidistribution
>> matters if we simulate a k-dimensional tuple, as I understand it.
> 
> Yeah, but we do that every time we draw k numbers in a simulation at all. And we usually draw
> millions. In that case, perfect k=623-dimensional equidistribution is not really any better than
> k=1, provided that the PRNG is otherwise good.

Depends on your use case, but the fact that you can prove it is
what really matters - well, at least for me :-)

> The requirement for a good PRNG for simulation work is that it be *well* distributed in reasonable
> dimensions, not that it be *exactly* equidistributed for some k. And well-distributedness is exactly
> what is tested in TestU01. It is essentially a collection of simulations designed to expose known
> statistical flaws in PRNGs. So to your earlier question as to which is more damning, failing TestU01
> or not being perfectly 623-dim equidistributed, failing TestU01 is.

TestU01 includes tests which PRNGs of the MT type have trouble passing, since
they are linear. This makes them poor choices for crypto applications,
but does not have much effect on simulations using only a tiny part of the
available period.

For MT there's an enhanced version called SFMT which performs better
in this respect:

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/M062821.pdf

(the paper also discusses the linear dependencies)

See http://www.jstatsoft.org/v50/c01/paper for a discussion of MT vs.
SFMT.

You can also trick TestU01 to have all tests pass by applying a non-linear
transformation (though I don't really see the point in doing this).

The WELL family of generators is a newer development, which provides
even better characteristics:

http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf

Also note that by seeding the MT in Python with truly random
data, the shortcomings of MT w/r to having problems escaping
"zeroland" (many 0 bits in the seed) are mostly avoided.

Anyway, it's been an interesting discussion, but I think it's time
to let go :-)

Here's a first cut at an implementation of the idea to use OpenSSL's
rand API as basis for an RNG. It even supports monkey patching
the random module, though I don't think that's good design.

""" RNG based on OpenSSL's rand API.

    Marc-Andre lemburg, 2015. License: MIT

"""
# Needs OpenSSL installed: pip install egenix-pyopenssl
from OpenSSL import rand
import random, struct, binascii

# Number of mantissa bits in an IEEE 754 double
BITS_IN_FLOAT = 53

# Scale to apply to RNG output to make uniform
UNIFORM_SCALING = 2 ** -BITS_IN_FLOAT

### Helpers

# Unpacker
def str2long(value):

    value_len = len(value)

    if value_len <= 4:
        if value_len < 4:
            value = '\0' * (4 - value_len) + value
        return struct.unpack('>L', value)[0]

    elif value_len <= 8:
        if value_len < 8:
            value = '\0' * (8 - value_len) + value
        return struct.unpack('>Q', value)[0]

    return long(binascii.hexlify(value), 16)

###

class OpenSSLRandom(random.Random):

    """ RNG using the OpenSSL rand API, which provides a cross-platform
        cryptographically secure RNG.

    """
    def random(self):

        """ Return a random float from [0.0, 1.0).

        """
        return (str2long(rand.bytes(7)) >> 3) * UNIFORM_SCALING

    def getrandbits(self, bits):

        """ Return an integer with the given number of random bits.

        """
        if bits <= 0:
            raise ValueError('bits must be >0')

        if bits != int(bits):
            raise TypeError('bits must be an integer')

        # Get enough bytes for the requested number of bits
        numbytes = (bits + 7) // 8
        x = str2long(rand.bytes(numbytes))

        # Truncate bits, if needed
        return x >> (numbytes * 8 - bits)

    def seed(self, value=None):

        """ Feed entropy to the RNG.

            value may be None, an integer or a string.

            If None, 2.5k bytes data are read from /dev/urandom and
            fed into the RNG.

        """
        if value is None:
            try:
                value = random._urandom(2500)
                entropy = 2500
            except NotImplementedError:
                return
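        if isinstance(value, (int, long)):
            # Render the integer as a hex string; binascii.hexlify()
            # only accepts strings and was not imported by name
            value = '%x' % value
            entropy = len(value)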
        else:
            value = str(value)
            entropy = len(value)

        # Let's be conservative regarding the available entropy in
        # value
        rand.add(value, entropy / 2)

    def _notimplemented(self, *args, **kwds):

        raise NotImplementedError(
            'OpenSSL RNG does not implement this method')

    getstate =  _notimplemented
    setstate = _notimplemented

### Testing

def install_as_default_rng():

    """ Monkey patch the random module

    """
    _inst = OpenSSLRandom()
    random._inst = _inst
    for attr in ('seed',
                 'random',
                 'uniform',
                 'triangular',
                 'randint',
                 'choice',
                 'randrange',
                 'sample',
                 'shuffle',
                 'normalvariate',
                 'lognormvariate',
                 'expovariate',
                 'vonmisesvariate',
                 'gammavariate',
                 'gauss',
                 'betavariate',
                 'paretovariate',
                 'weibullvariate',
                 'getstate',
                 'setstate',
                 'jumpahead',
                 'getrandbits',
                 ):
        setattr(random, attr, getattr(_inst, attr))

def _test():

    # Install
    install_as_default_rng()

    # Now run the random module tests
    random._test()

###

if __name__ == '__main__':
    _test()
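
For readers on Python 3, here is a minimal sketch of the same structure.
It is an assumption-laden stand-in, not the code above: the class name
UrandomRandom is hypothetical, and os.urandom() replaces OpenSSL's rand
API so that the example has no third-party dependency. Unlike the code
above, getrandbits(0) returns 0 rather than raising.

```python
import os
import random

# Number of mantissa bits in an IEEE 754 double
BITS_IN_DOUBLE = 53
UNIFORM_SCALING = 2.0 ** -BITS_IN_DOUBLE

class UrandomRandom(random.Random):
    """Hypothetical RNG drawing from os.urandom() instead of OpenSSL."""

    def random(self):
        """Return a random float from [0.0, 1.0)."""
        # 7 bytes give 56 bits; drop 3 to keep 53.
        return (int.from_bytes(os.urandom(7), 'big') >> 3) * UNIFORM_SCALING

    def getrandbits(self, bits):
        """Return an integer with the given number of random bits."""
        if bits < 0:
            raise ValueError('bits must be >= 0')
        # Get enough bytes for the requested number of bits
        numbytes = (bits + 7) // 8
        x = int.from_bytes(os.urandom(numbytes), 'big')
        # Truncate surplus bits, if any
        return x >> (numbytes * 8 - bits)

    def seed(self, *args, **kwds):
        """The OS entropy source cannot be seeded; ignore the request."""
        return None

    def _notimplemented(self, *args, **kwds):
        raise NotImplementedError(
            'OS entropy source has no reproducible state')

    getstate = setstate = _notimplemented

rng = UrandomRandom()
print(rng.random(), rng.getrandbits(10), rng.randrange(10))
```

The stdlib's random.SystemRandom already provides this behaviour; the
sketch only mirrors the layout of the OpenSSL-based class.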

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 14 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               4 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         12 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From tjreedy at udel.edu  Tue Sep 15 00:31:24 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 14 Sep 2015 18:31:24 -0400
Subject: [Python-ideas] Globally configurable random number generation)
In-Reply-To: <mt6nm4$q8i$1@ger.gmane.org>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <mt6nm4$q8i$1@ger.gmane.org>
Message-ID: <mt7hss$9im$1@ger.gmane.org>

On 9/14/2015 11:04 AM, Serhiy Storchaka wrote:
> On 14.09.15 16:32, Nick Coghlan wrote:
>> * make random.Random a subclass of SeedableRandom that deprecates
>> seed(), getstate() and setstate()

An alternate proposal is to initialize the module so that random uses 
something more 'secure' than MT.  Then...

> I would make seed() and setstate() to switch to seedable algorithm.

In particular, to MT. Also switch on a getstate() call.

> If you don't use seed() or setstate(), it is not important that the
> algorithm is changed. If you use seed() or setstate(), you expect
> reproducible behavior.

There is more than one possible internal implementation.  But for any of 
them, the change should be invisible to callers.  (Representations and 
introspection results would be a different matter.)

I understand that the docs currently say that random uses MT.  But I 
wonder if any version of the above could be used in current versions, so 
as to immediately "upgrade a lot of existing instructions on the 
internet" and code that follows such instructions.
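
One possible internal implementation, as a minimal sketch: the class
name SwitchingRandom and its delegation scheme are assumptions made for
illustration, not a proposed stdlib design. It starts on the OS
entropy source and falls back to MT as soon as seed(), getstate() or
setstate() is called.

```python
import random

class SwitchingRandom:
    """Hypothetical wrapper: system RNG until reproducibility is requested."""

    def __init__(self):
        self._impl = random.SystemRandom()

    def _switch_to_seedable(self):
        # Replace the OS-backed generator with MT on first use of
        # seed(), getstate() or setstate().
        if isinstance(self._impl, random.SystemRandom):
            self._impl = random.Random()

    def seed(self, value=None):
        self._switch_to_seedable()
        self._impl.seed(value)

    def getstate(self):
        self._switch_to_seedable()
        return self._impl.getstate()

    def setstate(self, state):
        self._switch_to_seedable()
        self._impl.setstate(state)

    def __getattr__(self, name):
        # random(), choice(), shuffle(), ... go to the current generator.
        return getattr(self._impl, name)

rng = SwitchingRandom()
unseeded = rng.random()          # drawn from the OS entropy source
rng.seed(42)                     # switches to MT from here on
seeded = rng.random()
print(seeded == random.Random(42).random())
```

Callers that never seed see no difference in the API; callers that do
seed get the reproducible MT stream they expect.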

-- 
Terry Jan Reedy


From p.f.moore at gmail.com  Tue Sep 15 00:39:25 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 Sep 2015 23:39:25 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
Message-ID: <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>

(The rest of your emails, I'm going to read fully and digest before
responding. Might take a day or so.)

On 14 September 2015 at 21:36, Donald Stufft <donald at stufft.io> wrote:
> I think maybe a problem here is a difference in how we look at the data. It
> seems that you might focus on the probability of you personally (or the things
> you work on) getting attacked and thus benefiting from these changes, whereas
> I, and I suspect the others like me, think about the probability of *anyone*
> being attacked.

This may be true, in some sense. But I'm not willing to accept that
you are thinking about everyone, but I'm somehow selfishly only
thinking of myself. If that's what you were implying, then frankly
it's a pretty offensive way of disregarding my viewpoint. Knowing you,
I'm sure that's *not* how you meant it - but do you see how easy it is
for the way you word something to make it nearly impossible for me to
see past your wording to get to the actual meaning of what you're
trying to say? I didn't even consciously notice the implication
myself, at first. I simply started writing a pretty argumentative
rebuttal, because I felt that somehow I needed to correct what you
said, but I couldn't quite say why.

Looking at the reality of what I focus on, I'd say it's more like
this. I mistrust arguments that work on the basis that "someone,
somewhere, might do X bad thing, therefore we must all pay cost Y".
The reasons are complex (and I don't know that I fully understand all
of my thought processes here) but some aspects that immediately strike
me are:

* The probability of X isn't really quantified. I may win the lottery,
but I don't quit my job - the probability is low. The probability of X
matters.
* My experience of the probability of X happening varies wildly from
that of whoever's making the point. Who is right? Why must one of us
"win" and be right? Can't it simply be that my data implies that over
the full data set, the actual probability of X is lower than you
thought?
* The people paying cost Y are not the cause of, nor are they impacted
by, X (except in an abstract "we all suffer if bad things happen"
sense). I believe in the general principle of "you pay for what you
use", so to me you're arguing for the wrong people to be made to pay.

Hopefully, those are relatively objective measures. More subjectively,

* It's way too easy to say "if X happens once, we have a problem". If
you take the stance that we have to prevent X from *ever* happening,
you allow yourself the freedom to argue with vague phrases like
"might", while leaving the burden of absolute proofs on me. (In the
context of RNG proposals, this is where arguments like "let's
implement a secure secret library" get dismissed - they still leave
open the possibility of *someone* using an inappropriate RNG, so "they
don't solve the issue" - even if they reduce the chance of that
happening by a certain amount - and neither you nor I can put a figure
on how much, so let's not try).
* There's little evidence that I can see of preventative security
measures having improved things. Maybe this is because it's an "arms
race" situation, and keeping up is all we can hope for. Maybe it's
because it's hard to demonstrate a lack of evidence, so the demand for
evidence is unreasonable. I don't know.
* For many years I ran my PC with no anti-virus software. I never got
a virus. Does that prove anything? Probably not. The anti-virus
software on my work PC is the source of *far* more issues than I have
ever seen caused by a virus. Does *that* prove anything? Again,
probably not. But my experience with at least *that* class of pressure
to implement security is that the cure is worse than the disease.
Where does that leave the burden of proof? Again, I don't know, but my
experience should at least be considered as relevant data.
* Everyone I have ever encountered in a work context (as opposed to in
open-source communities) seems to me to be in a similar situation to
mine. I believe I'm speaking for them, but because it's a
closed-source in house environment, I've got no public data to back my
comments.

And totally subjective,

* I'm extremely tired of the relentless pressure of "we need to do X,
because security". While the various examples of X may all have ended
up being essentially of no disadvantage to me, feeling obliged to
read, understand, and comment on the arguments presented every time,
gets pretty wearing.
* I can't think of a single occasion where we *don't* do X. That may
well be confirmation bias, but again subjectively, it feels like
nobody's listening to the objections. I get that the original
proposals get modified, but if never once has the result been "you're
right, the cost is too high, we'll not do X" then that puts
security-related proposals in a pretty unique position.

Finally, in relation to that last point, and one thing I think is a
key difference in our thinking. I do *not* believe that security
proposals (as opposed to security bug fixes) are different from any
other type of proposal. I believe that they should be subject to all
the same criteria for acceptance that anything else is. I suspect that
you don't agree with that stance, and believe that security proposals
should be held to different standards (e.g., a demonstrated
*probability* of benefit is sufficient, rather than evidence of actual
benefit being needed). But please speak for yourself on this - I'm not
trying to put words into your mouth, it's just my impression.

All of which is completely unrelated to either the default RNG for the
Python stdlib, or whether I understand and/or accept the security
arguments presented here (for clarity, I believe I understand them, I
just don't accept them).

Paul

From emile at fenx.com  Tue Sep 15 00:50:25 2015
From: emile at fenx.com (Emile van Sebille)
Date: Mon, 14 Sep 2015 15:50:25 -0700
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
In-Reply-To: <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
Message-ID: <mt7ivn$opn$1@ger.gmane.org>

On 9/14/2015 3:39 PM, Paul Moore wrote:
> * Everyone I have ever encountered in a work context (as opposed to in
> open-source communities) seems to me to be in a similar situation to
> mine. I believe I'm speaking for them, but because it's a
> closed-source in house environment, I've got no public data to back my
> comments.

You can certainly speak for me.  It's much easier to guard the gates 
than everything inside the walls.

Emile


From abarnert at yahoo.com  Tue Sep 15 01:10:19 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 14 Sep 2015 16:10:19 -0700
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
Message-ID: <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>

On Sep 14, 2015, at 06:32, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> This is an expansion of the random module enhancement idea I
> previously posted to Donald's thread:
> https://mail.python.org/pipermail/python-ideas/2015-September/035969.html

Since I suggested the set_default_instance and the singleton instances that can be imported in place of the module, I'm obviously happy with those parts.

However, I think you still haven't solved the problem with my proposal that you set out to solve.

The main difference is that I wanted to deprecate (and eventually make it an error) to use the top-level functions without calling set_default_instance, while you want to allow them and gradually shift the semantics from using the seeded to the seedless PRNG.

As I understand it, the reason for this is that you want to make it possible for someone to write "from random import choice", and not get a warning or error telling them they need to call set_default_instance or import one of the singletons instead.

But then you're encouraging people to write code that's broken in 3.6 and earlier--and that's also potentially broken in 3.7 if used together with any code that calls set_default_instance (because that can't retroactively fix anything from-imported before the call). So, it takes 18 more months to provide any benefit, and it adds an extra cost.

Maybe the suggestion of not allowing set_default_instance to be called more than once and/or after any other functions is sufficient, but I'm not sure that it is.

What about this change: replace the three singleton instances with three modules, so we can tell people (and 2to3 and similar mechanical tools) to replace "from random import choice" with "from random.seedless_random import choice"? Would that be acceptable for novices? (And, if so, would that mean we no longer need the set_default_instance and can just flat-out deprecate the top-level functions in random?)

If that's not sufficient because the name is too long/too nested, could we just flatten the names out, so it's "from seedless_random import choice", and then the deprecation process is just making random an alias for seeded_random and then switching it to seedless_random later? (I don't think there's any official cross-platform way to alias modules like that, and having to do some ugly sys.modules munging to force them to be the same instance, or using a special module finder just for this case, etc. is obviously ugly, but it may be worth doing anyway.) One nice advantage of this is that it's dead-easy to backport; if I need seeded_random, I can write code that works for 2.6+/3.3+ by just depending on seeded_random from PyPI...
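
A minimal sketch of the sys.modules munging mentioned above, assuming
the hypothetical module name seeded_random from this thread (no such
stdlib module exists). Registering the existing random module under a
second name makes both names resolve to the same module object:

```python
import random
import sys

# Register the stdlib module under a hypothetical alias. The import
# system consults sys.modules first, so no file named seeded_random
# needs to exist.
sys.modules['seeded_random'] = random

import seeded_random
from seeded_random import choice

# Both names refer to one module instance, so seeding through either
# affects the same generator.
print(seeded_random is random, choice is random.choice)
```

The alias only exists once this code has run, which is the ordering
problem a dedicated module finder would avoid.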

From ncoghlan at gmail.com  Tue Sep 15 02:04:19 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 10:04:19 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
Message-ID: <CADiSq7ckKtVWs_2Lf6yggnTq-r4XLKea1VtQnJbuET2omndgXw@mail.gmail.com>

On 15 September 2015 at 01:32, Ian Cordasco <graffatcolmingov at gmail.com> wrote:
> On Mon, Sep 14, 2015 at 10:01 AM, Paul Moore <p.f.moore at gmail.com> wrote:
>> On 14 September 2015 at 14:29, Cory Benfield <cory at lukasa.co.uk> wrote:
>>> Is your argument that there are lots of ways to get security wrong,
>>> and for that reason we shouldn't try to fix any of them?
>>
>> This debate seems to repeatedly degenerate into this type of accusation.
>>
>> Why is backward compatibility not being taken into account here? To be
>> clear, the proposed change *breaks backward compatibility* and while
>> that's allowed in 3.6, just because it is allowed, doesn't mean we
>> have free rein to break compatibility - any change needs a good
>> justification. The arguments presented here are valid up to a point,
>> but every time anyone tries to suggest a weak area in the argument,
>> the "we should fix security issues" trump card gets pulled out.
>>
>> For example, as this is a compatibility break, it'll only be allowed
>> into 3.6+ (I've not seen anyone suggest that this is sufficiently
>> serious to warrant breaking compatibility on older versions). Almost
>> all of those SO questions, and google hits, are probably going to be
>> referenced by people who are using 2.7, or maybe some version of 3.x
>> earlier than 3.6 (at what stage do we allow for the possibility of 3.x
>> users who are *not* on the latest release?) So is a solution which
>> won't impact most of the people making the mistake, worth it?
>
> So people who are arguing that the defaults shouldn't be fixed on
> Python 2.7 are likely the same people who also argued that PEP 466 was
> a terrible, awful, end-of-the-world type change. Yes it broke things
> (like eventlet) but the net benefit for users who can get onto Python
> 2.7.9 (and later) is immense.

They don't even have to get onto 2.7.9 per se - the RHEL 7.2 beta just
shipped with Robert Kuska's backport of those changes (minus the
eventlet breaking internal API change), so it will also filter out
through the RHEL/CentOS ecosystem via 7.x and SCLs. (We also looked at
a Python 2.6 backport, but decided it was too much work for not enough
benefit - folks really need to just upgrade to RHEL/CentOS 7 already,
or at least switch to using Software Collections for their Python
runtime needs).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From donald at stufft.io  Tue Sep 15 02:14:32 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 20:14:32 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
Message-ID: <etPan.55f762e9.4ad8f068.5a78@Draupnir.home>

On September 14, 2015 at 6:39:28 PM, Paul Moore (p.f.moore at gmail.com) wrote:
> (The rest of your emails, I'm going to read fully and digest before
> responding. Might take a day or so.)
>  
> On 14 September 2015 at 21:36, Donald Stufft wrote:
> > I think maybe a problem here is a difference in how we look at the data. It
> > seems that you might focus on the probability of you personally (or the things
> > you work on) getting attacked and thus benefiting from these changes, whereas
> > I, and I suspect the others like me, think about the probability of *anyone*
> > being attacked.
>  
> This may be true, in some sense. But I'm not willing to accept that
> you are thinking about everyone, but I'm somehow selfishly only
> thinking of myself. If that's what you were implying, then frankly
> it's a pretty offensive way of disregarding my viewpoint. Knowing you,
> I'm sure that's *not* how you meant it - but do you see how easy it is
> for the way you word something to make it nearly impossible for me to
> see past your wording to get to the actual meaning of what you're
> trying to say? I didn't even consciously notice the implication
> myself, at first. I simply started writing a pretty argumentative
> rebuttal, because I felt that somehow I needed to correct what you
> said, but I couldn't quite say why.

No, I don't mean it in the way of you being selfish. I'm not quite sure of the
right wording here; essentially it's the probability of an event happening to a
particular individual vs the probability of an event occurring at all. To use
your lottery example, I *think*, and perhaps I'm wrong, that you're looking at
it in terms of: the chance of any particular person participating in the lottery
winning the lottery is low, so why should each of these people, as an
individual, make plans for how to get the money when they win the lottery, when
as individuals they are unlikely to win? Whereas I flip it around and think that
someone, somewhere is likely going to win the lottery, so the lottery system
should make plans for how to get them the money when they win.

I'm not sure of the right "name" for each type, and I don't want to continue to
try and hamfist it, because I don't mean it in an offensive or an "I'm better
than you" way and I fear putting my foot in my mouth again :(

>  
> Looking at the reality of what I focus on, I'd say it's more like
> this. I mistrust arguments that work on the basis that "someone,
> somewhere, might do X bad thing, therefore we must all pay cost Y".
> The reasons are complex (and I don't know that I fully understand all
> of my thought processes here) but some aspects that immediately strike
> me are:
>  
> * The probability of X isn't really quantified. I may win the lottery,
> but I don't quit my job - the probability is low. The probability of X
> matters.
> * My experience of the probability of X happening varies wildly from
> that of whoever's making the point. Who is right? Why must one of us
> "win" and be right? Can't it simply be that my data implies that over
> the full data set, the actual probability of X is lower than you
> thought?
> * The people paying cost Y are not the cause of, nor are they impacted
> by, X (except in an abstract "we all suffer if bad things happen"
> sense). I believe in the general principle of "you pay for what you
> use", so to me you're arguing for the wrong people to be made to pay.
>  
> Hopefully, those are relatively objective measures. More subjectively,
>  
> * It's way too easy to say "if X happens once, we have a problem". If
> you take the stance that we have to prevent X from *ever* happening,
> you allow yourself the freedom to argue with vague phrases like
> "might", while leaving the burden of absolute proofs on me. (In the
> context of RNG proposals, this is where arguments like "let's
> implement a secure secret library" get dismissed - they still leave
> open the possibility of *someone* using an inappropriate RNG, so "they
> don't solve the issue" - even if they reduce the chance of that
> happening by a certain amount - and neither you nor I can put a figure
> on how much, so let's not try).

Just to be clear, I don't think that "If X happens once, it's a problem" is a
reasonable belief and I don't personally have that belief. It's a sliding scale
where we need to figure out where the right solution for Python is for each
particular problem. I certainly wouldn't want to use a language that took the
approach that if X can ever happen, we need to prevent X. I have seen a number
of users incorrectly use the random.py module, to the point where I think that
the danger is "real". I also think that, if this were a brand new module, it
would be a no-brainer (but perhaps I'm wrong) for the default, module-level API
to be safe by default. Going off that assumption, I think the question is really
just "Is it worth it?" not "does this make more sense than the current?".

> * There's little evidence that I can see of preventative security
> measures having improved things. Maybe this is because it's an "arms
> race" situation, and keeping up is all we can hope for. Maybe it's
> because it's hard to demonstrate a lack of evidence, so the demand for
> evidence is unreasonable. I don't know.

By preventive security measures, do you mean things like PEP 466? I don't quite
know how to accurately state it, but I'm certain that PEP 466 directly improved
the security of the entire internet (and continues to do so as it propagates).

> * For many years I ran my PC with no anti-virus software. I never got
> a virus. Does that prove anything? Probably not. The anti-virus
> software on my work PC is the source of *far* more issues than I have
> ever seen caused by a virus. Does *that* prove anything? Again,
> probably not. But my experience with at least *that* class of pressure
> to implement security is that the cure is worse than the disease.
> Where does that leave the burden of proof? Again, I don't know, but my
> experience should at least be considered as relevant data.

Antivirus is a particularly bad example of security software :/ It's a massive
failing of the security industry that they exist in the state they do. There's
a certain bias here though, because it is the job of security sensitive code
to "break" things (as in, take otherwise valid input and make it not work). In
an ideal world, security software just sits there doing "nothing" from the
POV of someone who isn't a security engineer and then will, often through no
fault of their own, pop up and make things go kabloom because it detected
something insecure happening. This means that for most people, the only
interaction they have with something designed to protect them, is when it steps
in to make things stop working.

It is relevant data, but I think it goes back to the different way of looking
at things (what is the individual chance of an event happening, vs the chance
of an event happening across the entire population). This might also be why
you'll see the backwards compat folks focus more on experience-driven data and
security folks focus more on hypotheticals about what could happen.

> * Everyone I have ever encountered in a work context (as opposed to in
> open-source communities) seems to me to be in a similar situation to
> mine. I believe I'm speaking for them, but because it's a
> closed-source in house environment, I've got no public data to back my
> comments.
>  
> And totally subjective,
>  
> * I'm extremely tired of the relentless pressure of "we need to do X,
> because security". While the various examples of X may all have ended
> up being essentially of no disadvantage to me, feeling obliged to
> read, understand, and comment on the arguments presented every time,
> gets pretty wearing.

I'm not sure what to do about this :(

On one side, you're not obligated to read, understand, and comment on
everything that's raised, but I totally understand why you do, because I do
too. I'm not sure how to help this without saying that people who care about
security shouldn't bring it up either?

> * I can't think of a single occasion where we *don't* do X. That may
> well be confirmation bias, but again subjectively, it feels like
> nobody's listening to the objections. I get that the original
> proposals get modified, but if never once has the result been "you're
> right, the cost is too high, we'll not do X" then that puts
> security-related proposals in a pretty unique position.

Off the top of my head I remember the on-by-default hash randomization for
Python 2.x (or the actually secure hash randomization, since 2.x still has the
one where it is trivial to recover the original seed).

I don't actually remember that many cases where python-dev chose to break
backwards compatibility for security. The only ones I can think of are:

* The hash randomization on Python 3.x (sort of? Only if you depended on dict
  ordering, which wasn't a guarantee anyways).
* The HTTPS improvements where we switched Python to default to
  verifying certificates.
* The backports of several security features to 2.7 (backport of 3.4's ssl
  module, hmac.compare_digest, os.urandom's persistent FD, hashlib.pbkdf2_hmac,
  hashlib.algorithms_guaranteed, hashlib.algorithms_available).

There are probably things that I'm not thinking of, but the hash randomization
only broke things if you were depending on dict/set having ordering which isn't
a promised property of dict/set. The backports of security features were done in
a pretty minimally invasive way where it would (ideally) only break things if
you relied on those names *not* existing on Python 2.7 (which was a nonzero
but small set). The HTTPS verification is the main thing I can think of where
python-dev actually broke backwards compatibility in an obvious way for people
relying on something that was documented to work a particular way.

Are there examples I'm not remembering (probably!)? Two sort-of backwards
incompatible changes and one backwards incompatible change in the lifetime of
Python doesn't really feel like that much to me.

Is there some crossover with distutils-sig maybe? I've focused a lot more
on pushing security on that side of things both because it personally affects
me more and because I think insecure defaults there are a lot worse than
insecure defaults in any particular module in the Python standard library.

>  
> Finally, in relation to that last point, and one thing I think is a
> key difference in our thinking. I do *not* believe that security
> proposals (as opposed to security bug fixes) are different from any
> other type of proposal. I believe that they should be subject to all
> the same criteria for acceptance that anything else is. I suspect that
> you don't agree with that stance, and believe that security proposals
> should be held to different standards (e.g., a demonstrated
> *probability* of benefit is sufficient, rather than evidence of actual
> benefit being needed). But please speak for yourself on this - I'm not
> trying to put words into your mouth, it's just my impression.

Well, I think that all proposals are based on the probability that they're
going to help some particular percentage of people, and whether they're going
to help enough people to be worth the cost.

What I think is special about security is the cost of *not* doing something.
Security "fails open" in that if someone does something insecure, it's not
going to raise an exception or give different results or something like that.
It's going to appear to "work" (in that you get the results you expect) while
the user is silently insecure. Compare this to, well let's pretend that there
was never a deterministic RNG in the standard library. If a scientist or a
game designer inappropriately used random.py they'd pretty quickly learn that
they couldn't give the RNG a seed, and that even if it was a CSPRNG that had
an "add_seed" method that might confuse them it'd be pretty obvious on the
second execution of their program that it's giving them a different result.

I think that the bar *should* be lower for something that just silently or
subtly does the "wrong" thing vs something that obviously and loudly does
the wrong thing. Particularly when the downside of doing the "wrong" thing
is as potentially disastrous as it is with security.
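
The asymmetry between silent and loud failure can be seen with the existing random module. This is a small sketch rather than a model of any proposed API: it uses random.Random (the seeded Mersenne Twister) and random.SystemRandom (backed by os.urandom), both of which exist today.

```python
import random

# Seeded Mersenne Twister: the same seed replays the same "random" values.
# Nothing warns you if these values end up in a password or a token.
mt = random.Random(12345)
first_run = [mt.randint(0, 99) for _ in range(5)]
mt.seed(12345)
second_run = [mt.randint(0, 99) for _ in range(5)]
print(first_run == second_run)   # True

# SystemRandom draws from os.urandom; seed() is accepted but ignored, so a
# simulation expecting reproducible runs notices on its second execution.
sysrand = random.SystemRandom()
sysrand.seed(12345)
a = [sysrand.getrandbits(64) for _ in range(4)]
sysrand.seed(12345)
b = [sysrand.getrandbits(64) for _ in range(4)]
print(a == b)   # False, barring a 1-in-2**256 coincidence

# Asking for the generator state is where the mismatch becomes an error.
try:
    sysrand.getstate()
except NotImplementedError:
    print("SystemRandom has no state to save")
```

Misusing the deterministic generator for secrets produces no visible symptom, while misusing the non-deterministic one for reproducible work shows up as soon as results are compared.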

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From donald at stufft.io  Tue Sep 15 02:18:50 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 14 Sep 2015 20:18:50 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f762e9.4ad8f068.5a78@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
 <etPan.55f762e9.4ad8f068.5a78@Draupnir.home>
Message-ID: <etPan.55f763ea.1113610c.5a78@Draupnir.home>

On September 14, 2015 at 8:14:33 PM, Donald Stufft (donald at stufft.io) wrote:
> Security "fails open" in that if someone does something insecure, it's not
> going to raise an exception or give different results or something like that.

This should read:

Security "fails open" in that if someone uses an API that allows something
insecure to happen (like not validating HTTPS) it's not going to raise an
exception or give different results or something like that.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From ncoghlan at gmail.com  Tue Sep 15 02:36:52 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 10:36:52 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
Message-ID: <CADiSq7dFTujAst21d--MisGnKsENtg77BeL7gXPBoxKMTC+o3Q@mail.gmail.com>

On 15 September 2015 at 01:01, Paul Moore <p.f.moore at gmail.com> wrote:
> I fully expect the response to this to be "just because it'll take
> time, doesn't mean we should do nothing". Or "even if it just fixes it
> for one or two people, it's still worth it".

This may be at the core of the disagreement, as we're not talking "one
or two people", we're talking tens of millions. While wearing my "PSF
Director" hat, I spend a lot of time talking to professional
educators, and recently organised the first "Python in Education"
miniconf at PyCon Australia. If you look at the inroads we're making
across primary, secondary and tertiary education, as well as through
workshops like Software Carpentry and DjangoGirls, a *lot* of people
around the world are going to be introduced to text based programming
over the coming decades by way of Python.

That level of success brings with it a commensurate level of
responsibility: if we're setting those students up for future security
failures, that's *on us* as language designers, not on them for
failing to learn to avoid traps we've accidentally laid for them
(because *we* previously didn't know any better).

Switching back to my "security wonk" hat, the historical approach to
computer security has been "secure settings are opt in, so only
qualified experts should be allowed to write security sensitive
software". What we've learned as an industry (the hard way) is that
this approach *doesn't work*.

The main reason it doesn't work is the one that was part of the
rationale for the HTTPS changes in PEP 476: when security failures are
silent by default, you generally don't find out that you forgot to
flip the "I need this to be secure" switch until *after* the system
you're responsible for has been compromised (with whatever
consequences that may have for your users).
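
The difference between an opt-in and an opt-out security switch can be sketched with the ssl module's public API. This assumes ssl.create_default_context() behaves as documented (certificate and host name checks enabled on the returned context); it makes no network connection and is only an illustration of the default, not of the PEP 476 implementation itself.

```python
import ssl

# Secure by default: the caller gets verification without asking for it.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)      # True
print(ctx.check_hostname)                        # True

# Opting out has to be spelled out, so it is visible in code review.
unverified = ssl.create_default_context()
unverified.check_hostname = False    # must be cleared before verify_mode
unverified.verify_mode = ssl.CERT_NONE
print(unverified.verify_mode == ssl.CERT_NONE)   # True
```

With this arrangement, forgetting the switch leaves you with the safe behaviour, and the unsafe behaviour requires two explicit lines that an auditor can search for.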

The law of large numbers then tells us that even if (for example) only
1 in 1000 people forget to flip the "be secure" switch when they
needed it (or don't even know that the switch *exists*), it's a
practical certainty that when you have millions of programmers using
your language (and you don't climb to near the top of the IEEE
rankings without that), you're going to be hitting that failure mode
regularly as a collective group.
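
As a rough worked example of that arithmetic: the 1-in-1000 rate comes from the paragraph above, while the population of exactly one million programmers is an assumption chosen for illustration (the text only says "millions"), and programmers are treated as independent.

```python
import math

p_forget = 1.0 / 1000     # chance one programmer misses the switch
programmers = 1000000     # assumed population size, for illustration only

expected_failures = p_forget * programmers

# P(nobody forgets) = (1 - p) ** n; log1p keeps precision for small p.
p_nobody = math.exp(programmers * math.log1p(-p_forget))
p_at_least_one = 1.0 - p_nobody

print(expected_failures)   # 1000.0
print(p_at_least_one)      # 1.0
```

The chance that nobody forgets is around e**-1000, which is too small to represent as a float, so the probability of at least one failure prints as exactly 1.0.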

We have the power to mitigate that harm permanently *just by changing
the default behaviour of the random module*. However, that has a cost:
it causes problems for some current users for the sake of better
serving future users. That's what transition strategy design is about,
and I'll take that up in the other thread.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Sep 15 03:07:46 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 11:07:46 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <mt7ivn$opn$1@ger.gmane.org>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
 <mt7ivn$opn$1@ger.gmane.org>
Message-ID: <CADiSq7fy6JC85obsVtnz1Exyq96hs7yCe4njGiNeNB4obE6N4A@mail.gmail.com>

On 15 September 2015 at 08:50, Emile van Sebille <emile at fenx.com> wrote:
> On 9/14/2015 3:39 PM, Paul Moore wrote:
>>
>> * Everyone I have ever encountered in a work context (as opposed to in
>> open-source communities) seems to me to be in a similar situation to
>> mine. I believe I'm speaking for them, but because it's a
>> closed-source in house environment, I've got no public data to back my
>> comments.
>
> You can certainly speak for me.  It's much easier to guard the gates than
> everything inside the walls.

Historically, yes, but relying solely on perimeter defence is becoming
less and less viable as the workforce decentralises, and we see more
people using personal devices and untrusted networks to connect to
work systems (whether that's their home network or the local coffee
shop), as well as relying on public web services rather than internal
applications.

Enterprise IT is simply *wrong* in the way we currently go about a lot
of things, and the public web service sector is showing us all how to
do it right. Facilitating that transition is a key part of my day job
in Red Hat's Developer Experience team (it's getting a bit off topic,
but for a high level company perspective on that:
http://www.redhat-cloudstrategy.com/towards-a-frictionless-it-whether-you-like-it-or-not/).

And for folks tempted to think "this is just about the web", for a
non-web related example of what we as an industry have unleashed
through our historical "security is optional" mindset:
http://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/

That's an article on remotely hacking the UConnect system in a Jeep
Cherokee to control all sorts of systems that had no business being
connected to the internet in the first place.

The number of SCADA industrial control systems accessible through the
internet is frankly terrifying - one of the reasons we can comfortably
assume most humans are either nice or lazy is because we *don't* see
most of the vulnerabilities that are lying around being exploited.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Sep 15 04:07:38 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 12:07:38 +1000
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
Message-ID: <CADiSq7fRzPZ+MotTRqbmTcvjUryzWZx66jkR_4LdOM3eW2SeOQ@mail.gmail.com>

On 15 September 2015 at 09:10, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Sep 14, 2015, at 06:32, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> This is an expansion of the random module enhancement idea I
>> previously posted to Donald's thread:
>> https://mail.python.org/pipermail/python-ideas/2015-September/035969.html
>
> Since I suggested the set_default_instance and the singleton instances that can be imported in place of the module, I'm obviously happy with those parts.
>
> However, I think you still haven't solved the problem with my proposal that you set out to solve.
>
> The main difference is that I wanted to deprecate (and eventually make it an error) to use the top-level functions without calling set_default_instance, while you want to allow them and gradually shift the semantics from using the seeded to the seedless PRNG.
>
> As I understand it, the reason for this is that you want to make it possible for someone to write "from random import choice", and not get a warning or error telling them they need to call set_default_instance or import one of the singletons instead.
>
> But then you're encouraging people to write code that's broken in 3.6 and earlier--and that's also potentially broken in 3.7 if used together with any code that calls set_default_instance (because that can't retroactively fix anything from-imported before the call). So, it takes 18 more months to provide any benefit, and it adds an extra cost.

This entire problem is one that I put in the "fix it eventually"
category, rather than "fix it urgently" - folks really are better off
learning to use things like cryptography.io for security sensitive
software, so this change is just about harm mitigation given that it's
inevitable that a non-trivial proportion of the millions of current
and future Python developers won't do that.

Since there's really only one transition I want to enable (seedable ->
seedless as the default RNG), I now think the "switch implicitly as
needed" approach is a better idea than a permanent support API for switching
the default instance - I'd just add a deprecation warning to that
behaviour, with the intent of removing it some time after 2.7 goes
EOL.

I also realised based on Paul's comments that we really do want
"random.seedable" and "random.seedless" submodules, since that allows
proper interaction with the import system in constructs like "from
random.seedable import randint"

That would make the proposed change for Python 3.6:

* add a random.SeedlessRandom API that omits the seed(), getstate()
and setstate() methods and uses a cryptographically secure PRNG
internally (such as the ChaCha20 algorithm implemented by OpenBSD)
* deprecate the seed(), getstate() and setstate() methods on SystemRandom
* convert random to a pseudo-module with "seedless", "seedable" and
"system" submodules (keeping most code in __init__ for easy pickle
compatibility)
* these would each work like the current top-level random module - a
default instance, with bound methods as module level callables
* random._inst would be an alias for random.seedless._inst by default
* the top level random functions would change to be functions lazily
looking up methods on random._inst, rather than bound methods
* if you call the module level seed(), getstate(), or setstate()
methods, and random._inst is set to random.seedless._inst, it will
issue a deprecation warning recommending the direct use of
"random.seedable" and switch random._inst to refer to
random.seedable._inst instead

Compared to my original proposal, the seedable MT RNG retains the
random.Random name, so any code already using explicit instances is
entirely unaffected by the proposed change. This means the only code
that will receive a deprecation warning is code calling the module
level seed(), getstate() and setstate() functions, and that warning
will just recommend importing "random.seedable" rather than importing
"random".

The API used to replace the default instance at runtime for backwards
compatibility purposes becomes private rather than public, so we only
need to support our specific reasons for doing that, rather than
supporting it as a general feature.

Future security audits would focus on the use of the module-level "seed()",
"getstate()" and "setstate()" functions (since they'd trigger the
deterministic RNG process-wide), and it would also still be encouraged
to use random.SystemRandom() or os.urandom() for security sensitive
use cases (since that's both version independent, and immune to other
modules modifying the active default RNG).
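
For the security sensitive case, a minimal sketch of the version-independent approach, using only os.urandom() and random.SystemRandom(). The helper names make_token and make_password are hypothetical, and the lengths and alphabet are arbitrary choices for illustration.

```python
import binascii
import os
import random
import string

_sysrand = random.SystemRandom()   # backed by os.urandom, ignores seed()


def make_token(nbytes=16):
    """Return a hex token built from nbytes of OS-provided randomness."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


def make_password(length=12, alphabet=string.ascii_letters + string.digits):
    """Pick each character with SystemRandom, not the module-level default."""
    return "".join(_sysrand.choice(alphabet) for _ in range(length))
```

Because neither helper touches the module-level functions, the result does not depend on which generator is the process-wide default or on whether another module has called seed().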

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From random832 at fastmail.com  Tue Sep 15 04:30:28 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 14 Sep 2015 22:30:28 -0400
Subject: [Python-ideas] Globally configurable random number generation
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
 <CADiSq7fRzPZ+MotTRqbmTcvjUryzWZx66jkR_4LdOM3eW2SeOQ@mail.gmail.com>
Message-ID: <m2vbbcfm7v.fsf@fastmail.com>

Nick Coghlan <ncoghlan at gmail.com> writes:
> Compared to my original proposal, the seedable MT RNG retains the
> random.Random name, so any code already using explicit instances is
> entirely unaffected by the proposed change.

So, if you use random.Random() without seeding, you still get "MT seeded
from os.urandom"?


From ncoghlan at gmail.com  Tue Sep 15 04:39:38 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 12:39:38 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
Message-ID: <CADiSq7foDWJMRPGVnr-=0b-ZrVBXeiMhvk2hOqqEX8a10esZbg@mail.gmail.com>

On 15 September 2015 at 08:39, Paul Moore <p.f.moore at gmail.com> wrote:
> * I can't think of a single occasion where we *don't* do X. That may
> well be confirmation bias, but again subjectively, it feels like
> nobody's listening to the objections. I get that the original
> proposals get modified, but if never once has the result been "you're
> right, the cost is too high, we'll not do X" then that puts
> security-related proposals in a pretty unique position.

Most of the time, when the cost of change is clearly too high, we
simply *don't ask*. hmac.compare_digest() is an example of that, where
having a time-constant comparison operation readily available in the
standard library is important from a security perspective, but having
standard equality comparisons be as fast as possible is obviously more
important from a language design perspective.
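
As a small illustration of that split, here is a sketch of checking a
message signature with hmac.compare_digest() rather than ==. The
sign() and verify() helpers and the choice of SHA-256 are assumptions
made for the example, not anything from the standard library:

```python
import hashlib
import hmac


def sign(key, message):
    """Return a hex HMAC-SHA256 signature for message (both bytes)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, message, signature):
    """Check a signature without leaking where the first mismatch is."""
    expected = sign(key, message)
    # compare_digest takes time independent of the position of the first
    # differing character, unlike the short-circuiting == operator.
    return hmac.compare_digest(expected, signature)
```

Ordinary equality stays fast everywhere else; only the code that
handles secrets has to opt in to the slower comparison.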

Historically, it was taken for granted that backwards compatibility
concerns would always take precedence over improving security
defaults, but the never-ending cascade of data breaches involving
personally identifiable information is proving that we, as a
collective industry, are *doing something wrong*:
http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/

A lot of the problems we need to address are operational ones as we
upgrade the industry from a "perimeter defence" mindset to a "defence
in depth" mindset, and hence we have things like continuous
integration, continuous deployment, application and service
sandboxing, containerisation, infrastructure-as-code, immutable
infrastructure, etc, etc, etc. That side of things is mostly being
driven by infrastructure software vendors (whether established ones or
startups), where we have the fortunate situation that the security
benefits are tied in together with a range of operational efficiency
and capability benefits [1].

However, there's also increasing recognition that some of the problems
are due to the default behaviours of the programming languages we use
to *create* applications, and in particular the fact that many
security issues involve silent failure modes. Sometimes the right
answer to those is to turn the silent failure into a noisy failure (as
with certificate verification in PEP 476), other times it is about
turning the silent failure into a silent success (as is being proposed
for the random module API), and yet other times it is simply about
lowering the barriers to someone doing the right thing once they're
alerted to the problem (as with the introduction of
hmac.compare_digest() and ssl.create_default_context(), and their
backports to the Python 2.7 series).

At a lower level, languages like Go and Rust are challenging some of
the assumptions in the still dominant C-based memory management model
for systems programming. Rust in particular is interesting in that it
has a much richer compile time enforced concept of memory ownership
than C does, while still aiming to keep the necessary runtime support
very light.

Regards,
Nick.

[1] For folks wanting more background on some of the factors behind this
shift, I highly recommend Google's "BeyondCorp" research paper:
http://research.google.com/pubs/pub43231.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Sep 15 05:22:27 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 13:22:27 +1000
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <m2vbbcfm7v.fsf@fastmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
 <CADiSq7fRzPZ+MotTRqbmTcvjUryzWZx66jkR_4LdOM3eW2SeOQ@mail.gmail.com>
 <m2vbbcfm7v.fsf@fastmail.com>
Message-ID: <CADiSq7dLUpxd6rfX6FKbO7fGRSV7ur-AcQP_ekbi_XNt8_0Lng@mail.gmail.com>

On 15 September 2015 at 12:30, Random832 <random832 at fastmail.com> wrote:
> Nick Coghlan <ncoghlan at gmail.com> writes:
>> Compared to my original proposal, the seedable MT RNG retains the
>> random.Random name, so any code already using explicit instances is
>> entirely unaffected by the proposed change.
>
> So, if you use random.Random() without seeding, you still get "MT seeded
> from os.urandom"?

Yes, with the revised proposal, only the module level functions would
change their behaviour to use a CSPRNG by default. If you trawl the
various cryptographically unsound password generation recipes, they're
almost all using the module level functions, so changing the meaning
of random.Random itself would add a lot of additional pain for next to
no gain.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Tue Sep 15 05:53:36 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 15 Sep 2015 13:53:36 +1000
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt7dkd$87i$1@ger.gmane.org>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
Message-ID: <20150915035334.GF31152@ando.pearwood.info>

On Mon, Sep 14, 2015 at 10:19:09PM +0100, Robert Kern wrote:

> The requirement for a good PRNG for simulation work is that it be *well* 
> distributed in reasonable dimensions, not that it be *exactly* 
> equidistributed for some k. And well-distributedness is exactly what is 
> tested in TestU01. It is essentially a collection of simulations designed 
> to expose known statistical flaws in PRNGs. So to your earlier question as 
> to which is more damning, failing TestU01 or not being perfectly 623-dim 
> equidistributed, failing TestU01 is.

I'm confused here. Isn't "well-distributed" a less-strict test than 
"exactly equidistributed"? MT is (almost) exactly k-equidistributed up 
to k = 623, correct? So how does it fail the "well-distributed" test?


-- 
Steve

From abarnert at yahoo.com  Tue Sep 15 06:03:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 14 Sep 2015 21:03:05 -0700
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7dLUpxd6rfX6FKbO7fGRSV7ur-AcQP_ekbi_XNt8_0Lng@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
 <CADiSq7fRzPZ+MotTRqbmTcvjUryzWZx66jkR_4LdOM3eW2SeOQ@mail.gmail.com>
 <m2vbbcfm7v.fsf@fastmail.com>
 <CADiSq7dLUpxd6rfX6FKbO7fGRSV7ur-AcQP_ekbi_XNt8_0Lng@mail.gmail.com>
Message-ID: <305B13C9-BA39-4133-8BDC-794E82EBF254@yahoo.com>

On Sep 14, 2015, at 20:22, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>> On 15 September 2015 at 12:30, Random832 <random832 at fastmail.com> wrote:
>> Nick Coghlan <ncoghlan at gmail.com> writes:
>>> Compared to my original proposal, the seedable MT RNG retains the
>>> random.Random name, so any code already using explicit instances is
>>> entirely unaffected by the proposed change.
>> 
>> So, if you use random.Random() without seeding, you still get "MT seeded
>> from os.urandom"?
> 
> Yes, with the revised proposal, only the module level functions would
> change their behaviour to use a CSPRNG by default. If you trawl the
> various cryptographically unsound password generation recipes, they're
> almost all using the module level functions, so changing the meaning
> of random.Random itself would add a lot of additional pain for next to
> no gain.

That definitely makes things simpler. The only part of my set_default_instance patch that was at all tricky was making sure Random instances worked the same as the top-level functions while still providing a way to explicitly select one (hence renaming the base class to DeterministicRandom, making a new subclass UnsafeRandom that adds the warning, and making both Random and the top-level functions point at that). If we don't need that, then your simpler solution makes more sense.

Also, while I'm not 100% sold on the auto-switching and the delegate-at-call-time wrappers, I'll play with them and see, and if they do work, then you're definitely right that your second version does solve your problem with my proposal, so it doesn't matter whether your first version did anymore.

First, on delegating top-level functions: have you tested the performance? Is MT so slow that an extra lookup and function call don't matter?

One quick thought on auto-switching vs. explicitly setting the instance before any functions have been called: if I get you to install a plugin that calls random.seed(), I've now changed your app to use seeded random numbers. And it might even still pass security tests, because it doesn't switch until someone hits some API that activates the plugin. Is that a realistic danger for any realistic apps? If so, doesn't that potentially make 3.6 more dangerous than 3.5?

For another: I still think we should be getting people to explicitly use seeded_random or system_random (or seedless_random, if they need speed as well as "probably secure") or explicit class instances (which are a bigger change, but more backward compatible once you've made it) as often as possible, even if random does eventually turn into seedless_random. The fact that many apps will never actually issue a deprecation warning or any other signal that anything is changing may be leaning over too far toward convenience. I realize the benefit of having books and course materials written for 3.4 continue to work in 3.8, but really, if those books are giving people bad ideas, removing any incentive for anyone to change the next edition may not be a good idea.

And finally: it _seems like_ people who want MT for simulation/game/science stuff will have a pretty easy time finding the migration path, but I'm having a really hard time coming up with a convincing argument. Does anyone have a handful of science guys they can hack up a system for and test them empirically? Because if you can establish that fact, I think the naysayers have very little reason left to say nay, and a consensus would surely be better than having that horribly contentious thread end with "too bad, overruled, the PEP has been accepted".

From stephen at xemacs.org  Tue Sep 15 06:34:44 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 15 Sep 2015 13:34:44 +0900
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <1442245930.209341.383250457.5F839815@webmail.messagingengine.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <etPan.55f6d0e1.7bf9094.b1e0@Draupnir.home>
 <1442240160.186432.383146609.471F1B7D@webmail.messagingengine.com>
 <loom.20150914T162116-889@post.gmane.org>
 <1442241905.192420.383177025.382A3D7E@webmail.messagingengine.com>
 <mt6pnv$1m4$1@ger.gmane.org>
 <1442245930.209341.383250457.5F839815@webmail.messagingengine.com>
Message-ID: <871te0z4ez.fsf@uwakimon.sk.tsukuba.ac.jp>

Random832 writes:

 > Who is doing scientific computing but not using the seeding functions?

People whose papers should be rejected.


From ncoghlan at gmail.com  Tue Sep 15 06:53:48 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 14:53:48 +1000
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <305B13C9-BA39-4133-8BDC-794E82EBF254@yahoo.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
 <CADiSq7fRzPZ+MotTRqbmTcvjUryzWZx66jkR_4LdOM3eW2SeOQ@mail.gmail.com>
 <m2vbbcfm7v.fsf@fastmail.com>
 <CADiSq7dLUpxd6rfX6FKbO7fGRSV7ur-AcQP_ekbi_XNt8_0Lng@mail.gmail.com>
 <305B13C9-BA39-4133-8BDC-794E82EBF254@yahoo.com>
Message-ID: <CADiSq7dn7NdH+1QhHsQHoqBBA75MogoLuVAVbw87t-tE=MEJRA@mail.gmail.com>

On 15 September 2015 at 14:03, Andrew Barnert <abarnert at yahoo.com> wrote:
> Also, while I'm not 100% sold on the auto-switching and the delegate-at-call-time wrappers, I'll play with them and see, and if they do work, then you're definitely right that your second version does solve your problem with my proposal, so it doesn't matter whether your first version did anymore.
>
> First, on delegating top-level function: have you tested the performance? Is MT so slow that an extra lookup and function call don't matter?

If folks are in a situation where the performance impact of the
additional layer of indirection is a problem, they can switch to using
random.Random explicitly, or import from random.seedable rather than
the top level random module.

> One quick thought on auto-switching vs. explicitly setting the instance before any functions have been called: if I get you to install a plugin that calls random.seed(), I've now changed your app to use seeded random numbers. And it might even still pass security tests, because it doesn't switch until someone hits some API that activates the plugin. Is that a realistic danger for any realistic apps? If so, doesn't that potentially make 3.6 more dangerous than 3.5?

This isn't an applicable concern, as we already provide zero runtime
protections against hostile monkeypatching of other modules (by design
choice). You can subvert even os.urandom in a hostile plugin:

    def not_random(num_bytes):
        return b'A' * num_bytes
    import os
    os.urandom = not_random

Once "potentially hostile code running in the current process" is part
of your threat model, CPython is out of the running, and even PyPy's
sandboxing capabilities rely on running the potentially hostile code
in a separate process. IronPython and Jython can rely on CLR/JVM
sandboxing, but that's still a case of delegating the problem to the
host platform, rather than trying to solve it at the Python level.

> For another: I still think we should be getting people to explicitly use seeded_random or system_random (or seedless_random, if they need speed as well as "probably secure") or explicit class instances (which are a bigger change, but more backward compatible once you've made it) as often as possible, even if random does eventually turn into seedless_random. The fact that many apps will never actually issue a deprecation warning or any other signal that anything is changing may be leaning over too far toward convenience. I realize the benefit of having books and course materials written for 3.4 continue to work in 3.8, but really, if those books are giving people bad ideas, removing any incentive for anyone to change the next edition may not be a good idea.

Forcing people to make choices they're ill-equipped to make just
because we think they "should" know enough to make a wise decision is
one of the leading causes of user hostile software (consider the
respective user experiences of an HTTP site and an HTTPS site with a
self-signed certificate).

People are busy, and life is full of decisions that need to be made
where there's no good default, so when we're able to deliver a good
default that fails *noisily* when it's the wrong answer, that's what
we should be aiming for.

> And finally: it _seems like_ people who want MT for simulation/game/science stuff will have a pretty easy time finding the migration path, but I'm having a really hard time coming up with a convincing argument. Does anyone have a handful of science guys they can hack up a system for and test them empirically? Because if you can establish that fact, I think the naysayers have very little reason left to say nay, and a consensus would surely be better than having that horribly contentious thread end with "too bad, overruled, the PEP has been accepted".

Given the general lack of investment in sustaining engineering for
scientific software, I think the naysayers are right on that front,
which is why I switched my proposal to give them a transparent upgrade
path - I was originally thinking primarily of the educational and
gaming use cases, and hadn't considered randomised simulations in the
scientific realm.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tim.peters at gmail.com  Tue Sep 15 07:49:15 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 00:49:15 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
Message-ID: <CAExdVNnYHBfkHjYpeC+rZA1oBqq7qEE82hsoGfaA_N9_aKu97Q@mail.gmail.com>

[Tim]
>> And yet nobody so far has a produced a single example of any harm done
>> in any of the near-countless languages that supply non-crypto RNGs.  I
>> know, my lawyer gets annoyed too when I point out there hasn't been a
>> nuclear war ;-)

[Nathaniel Smith <njs at pobox.com>]
> Here you go:
>   https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

The most important question I have about that is from its Appendix D,
where they try to give a secure "all-purpose" token generator.  If
openssl_random_pseudo_bytes is available, they use just that and call
it done.  Otherwise they go on to all sorts of stuff.

My question is, even if /dev/urandom is available, they're _not_
content to use that alone.  They continue to mix it up with all other
kinds of silly stuff.  So why do they trust urandom less than
OpenSSL's gimmick?  It's important to me because, far as I'm
concerned, os.urandom() is already Python's way to spell "crypto
random" (yes, it would be better to have one guaranteed to be
available on all Python platforms).


> They present real-world attacks on PHP applications that use something
> like the "password generation" code we've been talking about as a way
> to generate cookies and password reset nonces, including in particular
> the case of applications that use a strongly-seeded Mersenne Twister
> as their RNG:

I couldn't help but note ;-) that at least 3 of the apps had already
attempted to repair bug reports filed against their insecure
password-reset schemes at least 5 years ago.  You can lead a PHP'er to
water, but ... ;-)

> ...
> "Section 5.3: ... In this section we give a description of the
> Mersenne Twister generator and present an algorithm that allows the
> recovery of the internal state of the generator even when the output
> is truncated. Our algorithm also works in the presence of non
> consecutive outputs ..."

It's cute, but I doubt anyone but the authors had the patience - or
knowledge - to write a solver dealing with tens of thousands of picky
equations over about 20000 variables.  They don't explain enough about
the details for a script kiddie to do it.  Even very bright hackers
attack what's easiest to topple, like poor seeding - that's just easy
brute force.

It remained unclear to me _exactly_ what "in the presence of non
consecutive outputs" is supposed to mean.  In the only examples, they
knew exactly how many times MT was called.  "Non consecutive" in all
those contexts appeared to mean "but we couldn't observe _any_ output
bits in some cases - the ones we could know something about were
sometimes non-consecutive".  So in the MT output sequence, they had no
knowledge of _some_ of the outputs, but they nevertheless knew exactly
_which_ of the outputs they were wholly ignorant about.

That's no problem for the equation solver.  They just skip adding any
equations for the affected bits, keep collecting more outputs and
potentially "wrap around", probably leading to an overdetermined
system in the end.

But Python doesn't work the way PHP does here.  As explained in
another message, in Python you can have _no idea_ how many MT outputs
are consumed by a single .choice() call.  In the PHP equivalent, you
always consume exactly one MT output.  PHP's method suffers
statistical bias, but under the covers Python uses an accept/reject
method to avoid that.  Any number of MT outputs may be (invisibly!)
consumed before "accept" is reached, although typically only one or
two.  You can deduce some of the leading MT output bits from the
.choice() result, but _only_ for the single MT output .choice()
reveals anything about.  About the other MT outputs it may consume,
you can't even know that some _were_ skipped over, let alone how many.
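
A sketch of that accept/reject idea, written as a standalone helper
that also reports how many draws were consumed.  The function name is
made up for the example, and this is only in the spirit of what the
random module does internally (draw as many bits as n needs, retry
until the value is below n), not a copy of it:

```python
import random


def randbelow_counting(rng, n):
    """Return (value, draws): an unbiased value in range(n), plus the
    number of getrandbits() calls it took to accept one."""
    k = n.bit_length()          # enough bits to represent n
    draws = 1
    r = rng.getrandbits(k)
    while r >= n:               # reject out-of-range values and retry
        r = rng.getrandbits(k)
        draws += 1
    return r, draws


rng = random.Random(12345)
samples = [randbelow_counting(rng, 33) for _ in range(1000)]
# The caller normally sees only the values; the draw counts are
# invisible from the outside.
hidden_draws = sum(draws for _, draws in samples)
```

For n = 33 the helper asks for 6 bits, so almost half of the draws
are rejected; an observer who sees only the accepted values cannot
tell which MT outputs they came from.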

Best I can tell, that makes a huge difference to whether their solver
is even applicable to cracking idiomatic "password generators" in
Python.  You can't know which variables correspond to the bits you can
deduce.  You could split the solver into multiple instances to cover
all feasible possibilities (for how many MT outputs may have been
invisibly consumed), but the number of solver instances needed then
grows exponentially with the number of outputs you do see something
about.  In the worst case (31 bits are truncated), they need over
19000 outputs to deduce the state.  Even a wildly optimistic "well,
let's guess no more than 1 MT output was invisibly rejected each time"
leads to over 2**19000 solver clones then.

Sure, there's doubtless a far cleverer way to approach that.  But
unless another group of PhDs looking to level up in Security World
burns their grant money to tackle it, that's yet another attack that
will never be seen in the real world ;-)


> Out of curiosity, I tried searching github for "random cookie
> language:python". The 5th hit (out of ~100k) was a web project that
> appears to use this insecure method to generate session cookies:
>   https://github.com/bytasv/bbapi/blob/34e294becb22bae6e685f2e742b7ffdb53a83bcb/bbapi/utils/cookie.py
>   https://github.com/bytasv/bbapi/blob/34e294becb22bae6e685f2e742b7ffdb53a83bcb/bbapi/api/router.py#L56-L66

And they all use .choice(), which is overwhelmingly the most natural
way to do this kind of thing in Python.

> ...
> There's a reason security people are so Manichean about these kinds of
> things. If something is not intended to be secure or used in
> security-sensitive ways, then fine, no worries. But if it is, then
> there's no point in trying to mess around with "probably mostly
> secure" -- either solve the problem right or don't bother. (See also:
> the time Python wasted trying to solve hash randomization without
> actually solving hash randomization [1].) If Tim Peters can get fooled
> into thinking something like using MT to generate session ids is
> "probably mostly secure", then what chance do the rest of us have?
> <wink>

As above, I'm still not much worried about .choice().  Even if I were,
I'd be happy to leave it at "use .choice() from a random.SystemRandom
instance instead".  Unless there's some non-obvious (to me) reason
these authors appear to be unhappy with urandom.
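
A minimal sketch of that suggestion.  The make_token() helper, the
alphabet and the default length are assumptions for illustration; the
only real point is which object .choice() is called on:

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits


def make_token(length=32, rng=None):
    """Build a token with .choice(), drawing from os.urandom() by default."""
    if rng is None:
        # SystemRandom ignores seed() and cannot be replayed.
        rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))
```

Passing an explicit random.Random instance gives back the
reproducible behaviour, which makes the difference easy to see in a
test.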


> NB this isn't an argument for *whether* we should make random
> cryptographically strong by default; it's just an argument against
> wasting time debating whether it's already "secure enough". It's not
> secure. Maybe that's okay, maybe it's not.

_I_ would use SystemRandom.  But, as you can tell, I'm extremely paranoid ;-)


> For the record though I do tend to agree with the idea that it's not
> okay, because it's an increasingly hostile world out there, and
> secure-random-by-default makes whole classes of these issues just
> disappear. It's not often that you get to fix thousands of bugs with
> one commit, including at least some with severity level "all your
> users' private data just got uploaded to bittorrent".
>
> I like Nick's proposal here:
>     https://code.activestate.com/lists/python-ideas/35842/
> as probably the most solid strategy for implementing that idea -- the
> only projects that would be negatively affected are those that are
> using the seeding functionality of the global random API, which is a
> tiny fraction, and the effect on those projects is that they get
> nudged into using the superior object-oriented API.

Have you released software used by millions of people?  If not, you
have no idea how ticked off users get by needing to change anything.
But Guido does ;-)

Why not add a new "saferandom" module and leave it at that?  Encourage
newbies to use it.  Nobody's old code ever breaks.  But nobody's old
code is saved from problems it likely didn't have anyway ;-)

> -n
>
> [1] https://lwn.net/Articles/574761/

From njs at pobox.com  Tue Sep 15 09:36:09 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 15 Sep 2015 00:36:09 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150915035334.GF31152@ando.pearwood.info>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
Message-ID: <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>

On Mon, Sep 14, 2015 at 8:53 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Mon, Sep 14, 2015 at 10:19:09PM +0100, Robert Kern wrote:
>
>> The requirement for a good PRNG for simulation work is that it be *well*
>> distributed in reasonable dimensions, not that it be *exactly*
>> equidistributed for some k. And well-distributedness is exactly what is
>> tested in TestU01. It is essentially a collection of simulations designed
>> to expose known statistical flaws in PRNGs. So to your earlier question as
>> to which is more damning, failing TestU01 or not being perfectly 623-dim
>> equidistributed, failing TestU01 is.
>
> I'm confused here. Isn't "well-distributed" a less-strict test than
> "exactly equidistributed"? MT is (almost) exactly k-equidistributed up
> to k = 623, correct? So how does it fail the "well-distributed" test?

No, "well-distributed" means "distributed like a true random stream",
which is a strong and somewhat fuzzy requirement. "k-equidistributed"
is one particular way of operationalizing this requirement, but it's
not a very good one. The idea of k-equidistribution is historically
important because it spurred the development of generators that
avoided the really terrible flaws of early designs like Unix rand, but
it's not terribly relevant to modern RNG design.

Here's how it works.

Formally speaking, a randomized algorithm or Monte Carlo simulation or
similar can be understood as a function mapping an infinite bitstring
to some output value, F : R -> O.

If we sample infinite bitstrings R uniformly at random (using a
theoretical "true" random number generator), and then apply F to each
bitstring, then this produces some probability distribution over
output values O.

Now given some *pseudo* random number generator, we consider our
generator to be successful if repeatedly running F(sample from
this RNG) gives us the same distribution over output values as if we
had repeatedly run F(true random sample).

So for example, you could have a function F that counts up how many
times it sees a zero or a one in the first 20,000 entries in the
bitstring, and you expect those numbers to come up at ~10,000 each
with some distribution around that. If your RNG is such that you
reproduce that distribution, then you pass this function. Note that
this is a little counterintuitive: if your RNG is such that over the
first 20,000 entries it always produces *exactly* 10,000 zeros and
10,000 ones, then it fails this test. The Mersenne Twister will pass
this test.
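
That counting test can be sketched as follows.  The helper names and
the number of trials are made up for the example; for 20,000 fair
bits the count of ones should average 10,000 with a standard
deviation of sqrt(20000)/2, roughly 71:

```python
import random

N_BITS = 20000


def count_ones(rng, n_bits=N_BITS):
    """F: count the one bits in the first n_bits of the stream."""
    return bin(rng.getrandbits(n_bits)).count("1")


def summarize(rng, trials=200):
    """Run F repeatedly; return the sample mean and standard deviation."""
    counts = [count_ones(rng) for _ in range(trials)]
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
    return mean, var ** 0.5


mt_mean, mt_sd = summarize(random.Random(2015))
sys_mean, sys_sd = summarize(random.SystemRandom())
```

A generator that always returned exactly 10,000 ones would give the
right mean but a standard deviation of zero, which is how it would
fail.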

Or you could have a function F that takes the first 19937 bits from
the random stream, uses it to construct a model of the internal state
of a Mersenne Twister, predicts the next 1000 bits, and returns True
if they match and False if they don't match. On a real random stream
this function returns True with probability 2^-1000; on a MT random
stream it returns True with probability 1. So the MT fails this test.
OTOH this is obviously a pretty artificial example.
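
For the curious, a sketch of such an F against Python's own MT.  It
assumes the observer sees 624 complete, consecutive 32-bit outputs
(slightly more than 19937 bits) and checks the next 32 words rather
than exactly 1000 bits; the helper names are invented for the
example:

```python
import random

N = 624  # number of 32-bit words in the MT19937 state


def _undo_right(y, shift):
    # Solve x ^ (x >> shift) == y by fixed-point iteration.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def _undo_left(y, shift, mask):
    # Solve x ^ ((x << shift) & mask) == y by fixed-point iteration.
    x = y
    for _ in range(32 // shift + 1):
        x = (y ^ ((x << shift) & mask)) & 0xFFFFFFFF
    return x


def untemper(y):
    """Invert MT19937's output tempering for one 32-bit word."""
    y = _undo_right(y, 18)
    y = _undo_left(y, 15, 0xEFC60000)
    y = _undo_left(y, 7, 0x9D2C5680)
    return _undo_right(y, 11)


def clone(observed):
    """Build a Random whose future output matches the observed stream."""
    state = tuple(untemper(word) for word in observed)
    twin = random.Random()
    # Index N tells the generator to regenerate before the next output.
    twin.setstate((3, state + (N,), None))
    return twin


def F(rng, n_predict=32):
    """True if the stream's next outputs are predictable from the first N."""
    twin = clone([rng.getrandbits(32) for _ in range(N)])
    return all(rng.getrandbits(32) == twin.getrandbits(32)
               for _ in range(n_predict))
```

F(random.Random()) comes out True however the generator was seeded,
while F(random.SystemRandom()) comes out False barring a coincidence
of astronomically small probability.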

The only case the scientists actually care about is the one where F is
"this simulation right here that I'm trying to run before the
conference deadline". But since scientists don't really want to design
a new RNG for every simulation, we instead try to design our RNGs such
that for all "likely" or "reasonable" functions F, they'll probably
work ok. In practice this means that we write down a bunch of explicit
test functions F inside a test battery like TestU01, run the functions
on the RNG stream, and if their output is indistinguishable in
distribution from what it would be for a true random stream then we
say they pass. And we hope that this will probably generalize to the
simulation we actually care about.

Cryptographers are worried about the exact same issue -- they want
RNGs that have the property that for all functions F, the behavior is
indistinguishable from true randomness. But unlike the scientists,
they're not content to say "eh, I checked a few functions and it
seemed to work on those, probably the ones I actually care about are
okay too". The cryptographers consider it a failure if an adversary
with arbitrary computing power, full knowledge of the RNG algorithm,
plus other magic powers like the ability to influence the RNG seeding,
can invent *any* function F that acts differently (produces a
different distribution over outputs) when fed the input from the RNG
as compared to a true random stream. The only rule for them is that
the function has to be one that you can actually implement on a
computer that masses, like, less than Jupiter, and only has 1000 years
to run. And as far as we can tell, modern crypto RNGs succeed at this
very well.

Obviously the thing the scientists worry about is a *strict* subset of
what the cryptographers are worried about. This is why it is silly to
worry that a crypto RNG will cause problems for a scientific
simulation. The cryptographers take the scientists' real goal -- the
correctness of arbitrary programs like e.g. a monte carlo simulation
-- *much* more seriously than the scientists themselves do. (This is
because scientists need RNGs to do their real work, whereas for
cryptographers RNGs are their real work.)

Compared to this, k-dimensional equidistribution is a red herring: it
requires that you have a RNG that repeats itself after a while, and
within each repeat it produces a uniform distribution over bitstrings
of some particular length. By contrast, a true random bitstring does
not repeat itself, and it gives a uniform distribution over bitstrings
of *arbitrary* length. In this regard, crypto RNGs are like true
random bitstrings, not like k-equidistributed RNGs. This is a good
thing. k-equidistribution doesn't really hurt (theoretically it
introduces flaws, but for realistic designs they don't really matter),
but if randomness is what you want then crypto RNGs are better.

I-should-really-get-a-blog-shouldn't-I'ly-yrs,
-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From njs at pobox.com  Tue Sep 15 10:42:26 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 15 Sep 2015 01:42:26 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNnYHBfkHjYpeC+rZA1oBqq7qEE82hsoGfaA_N9_aKu97Q@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com>
 <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <CAExdVNnYHBfkHjYpeC+rZA1oBqq7qEE82hsoGfaA_N9_aKu97Q@mail.gmail.com>
Message-ID: <CAPJVwBmNHGdocHpXkoSPD3MZLXqTrfcK656afP0QpVf9sHWuqQ@mail.gmail.com>

On Mon, Sep 14, 2015 at 10:49 PM, Tim Peters <tim.peters at gmail.com> wrote:
> [Tim]
>>> And yet nobody so far has produced a single example of any harm done
>>> in any of the near-countless languages that supply non-crypto RNGs.  I
>>> know, my lawyer gets annoyed too when I point out there hasn't been a
>>> nuclear war ;-)
>
> [Nathaniel Smith <njs at pobox.com>]
>> Here you go:
>>   https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf
>
> The most important question I have about that is from its Appendix D,
> where they try to give a secure "all-purpose" token generator.  If
> openssl_random_pseudo_bytes is available, they use just that and call
> it done.  Otherwise they go on to all sorts of stuff.
>
> My question is, even if /dev/urandom is available, they're _not_
> content to use that alone.  They continue to mix it up with all other
> kinds of silly stuff.  So why do they trust urandom less than
> OpenSSL's gimmick?  It's important to me because, far as I'm
> concerned, os.urandom() is already Python's way to spell "crypto
> random" (yes, it would be better to have one guaranteed to be
> available on all Python platforms).

Who knows why they wrote the code in that exact way. /dev/urandom is fine.

>> They present real-world attacks on PHP applications that use something
>> like the "password generation" code we've been talking about as a way
>> to generate cookies and password reset nonces, including in particular
>> the case of applications that use a strongly-seeded Mersenne Twister
>> as their RNG:
>
> I couldn't help but note ;-) that at least 3 of the apps had already
> attempted to repair bug reports filed against their insecure
> password-reset schemes at least 5 years ago.  You can lead a PHP'er to
> water, but ... ;-)
>
>> ...
>> "Section 5.3: ... In this section we give a description of the
>> Mersenne Twister generator and present an algorithm that allows the
>> recovery of the internal state of the generator even when the output
>> is truncated. Our algorithm also works in the presence of non
>> consecutive outputs ..."
>
> It's cute, but I doubt anyone but the authors had the patience - or
> knowledge - to write a solver dealing with tens of thousands of picky
> equations over about 20000 variables.  They don't explain enough about
> the details for a script kiddie to do it.  Even very bright hackers
> attack what's easiest to topple, like poor seeding - that's just easy
> brute force.
>
> It remained unclear to me _exactly_ what "in the presence of non
> consecutive outputs" is supposed to mean.  In the only examples, they
> knew exactly how many times MT was called.  "Non consecutive" in all
> those contexts appeared to mean "but we couldn't observe _any_ output
> bits in some cases - the ones we could know something about were
> sometimes non-consecutive".  So in the MT output sequence, they had no
> knowledge of _some_ of the outputs, but they nevertheless knew exactly
> _which_ of the outputs they were wholly ignorant about.
>
> That's no problem for the equation solver.  They just skip adding any
> equations for the affected bits, keep collecting more outputs and
> potentially "wrap around", probably leading to an overdetermined
> system in the end.
>
> But Python doesn't work the way PHP does here.  As explained in
> another message, in Python you can have _no idea_ how many MT outputs
> are consumed by a single .choice() call.  In the PHP equivalent, you
> always consume exactly one MT output.  PHP's method suffers
> statistical bias, but under the covers Python uses an accept/reject
> method to avoid that.  Any number of MT outputs may be (invisibly!)
> consumed before "accept" is reached, although typically only one or
> two.  You can deduce some of the leading MT output bits from the
> .choice() result, but _only_ for the single MT output .choice()
> reveals anything about.  About the other MT outputs it may consume,
> you can't even know that some _were_ skipped over, let alone how many.

This led me to look at the implementation of Python's choice(), and
it's interesting; I hadn't realized that it was using such an
inefficient method. (To make a random selection between, say, 36
items, it rounds up to 64 = 2**6, draws a 32-bit sample from MT,
discards 26 of the bits (!) to get a number between 0-63, and then
repeats until this number happens to fall in the 0-35 range, so it
rejects with probability ~0.45. A more efficient algorithm is the one
that it uses if getrandbits is not available, where it uses all 32
bits and only rejects with probability (2**32 % 36) / (2**32) =
~1e-9.) I guess this does add a bit of obfuscation.
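
The arithmetic in that parenthetical can be checked with a few lines. This
models the two procedures as I described them above (round up to a power
of two, versus use the whole 32-bit word); the function names are invented
for the example, and the real code in random.py may differ in details such
as exactly how many bits it asks for:

```python
from fractions import Fraction

def reject_prob_power_of_two(n_items):
    """Chance that one draw is rejected when drawing from the smallest
    power of two that is >= n_items."""
    k = (n_items - 1).bit_length()  # bits needed to cover 0 .. n_items-1
    return Fraction(2**k - n_items, 2**k)

def reject_prob_full_word(n_items, word_bits=32):
    """Chance that one draw is rejected when the whole word is used and
    only the uneven remainder at the top of the range is thrown away."""
    return Fraction(2**word_bits % n_items, 2**word_bits)

for n in (36, 62):
    print(n,
          float(reject_prob_power_of_two(n)),
          float(reject_prob_full_word(n)))
```

For 36 items this gives 28/64 for the first method and 4/2**32 for the
second.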

OTOH the amount of obfuscation is very sensitive to the size of the
password alphabet. If I use uppercase + lowercase + digits, that gives
me 62 options, so I only reject with probability 1/32, and I can
expect that any given 40-character session key will contain zero skips
with probability ~0.28, and that reveals 240 bits of seed.
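
Those numbers come from a calculation along these lines, again assuming
the round-up-to-a-power-of-two model and treating each character as an
independent draw (p_no_skips is a name made up for this sketch):

```python
def p_no_skips(alphabet_size, length):
    """Probability that none of `length` independent draws is rejected,
    when each draw comes from the next power of two >= alphabet_size."""
    k = (alphabet_size - 1).bit_length()
    p_accept = alphabet_size / 2**k
    return p_accept ** length

p = p_no_skips(62, 40)     # (31/32) ** 40
bits_revealed = 40 * 6     # 6 observed bits per character
print(round(p, 2), bits_revealed)
```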

I don't have time right now to go look up the MT equations to see how
easy it is to make use of such partial information, but there
certainly are lots of real-world weaponized exploits that begin with
something like "first, gather 10**8 session keys...". I certainly
wouldn't trust it.

Also, if I use the base64 or hex alphabets, then the probability of
rejection is 0, and I can deterministically read off bits from the
underlying MT state. (Alternatively, if someone in the future makes
the obvious optimization to choice(), then it will basically stop
rejecting in practice, and again it becomes trivial to read off all
the bits from the underlying MT state.)

The point of "secure by default" is that you don't have to spend all
these paragraphs doing the math to try and guess whether some RNG
usage might maybe be secure; it just is secure.

> Best I can tell, that makes a huge difference to whether their solver
> is even applicable to cracking idiomatic "password generators" in
> Python.  You can't know which variables correspond to the bits you can
> deduce.  You could split the solver into multiple instances to cover
> all feasible possibilities (for how many MT outputs may have been
> invisibly consumed), but the number of solver instances needed then
> grows exponentially with the number of outputs you do see something
> about.  In the worst case (31 bits are truncated), they need over
> 19000 outputs to deduce the state.  Even a wildly optimistic "well,
> let's guess no more than 1 MT output was invisibly rejected each time"
> leads to over 2**19000 solver clones then.

Your "wildly optimistic" estimate is wildly conservative under
realistic conditions. How confident are you that the rest of your
analysis is totally free of similar errors? Would you be willing to bet,
say, the public revelation of every website you've visited in the last
5 years on it?

> Sure, there's doubtless a far cleverer way to approach that.  But
> unless another group of PhDs looking to level up in Security World
> burns their grant money to tackle it, that's yet another attack that
> will never be seen in the real world ;-)

Grant money is a drop in the bucket of security research funding these
days. Criminals and governments have very deep pockets, and it's well
documented that there are quite a few people with PhDs who make their
living by coming up with exploits and then auctioning them on the
black market.

BTW, it looks like that PHP paper was an undergraduate project. You
don't need a PhD to solve linear equations :-).

>> Out of curiosity, I tried searching github for "random cookie
>> language:python". The 5th hit (out of ~100k) was a web project that
>> appears to use this insecure method to generate session cookies:
>>   https://github.com/bytasv/bbapi/blob/34e294becb22bae6e685f2e742b7ffdb53a83bcb/bbapi/utils/cookie.py
>>   https://github.com/bytasv/bbapi/blob/34e294becb22bae6e685f2e742b7ffdb53a83bcb/bbapi/api/router.py#L56-L66
>
> And they all use .choice(), which is overwhelmingly the most natural
> way to do this kind of thing in Python.
>
>> ...
>> There's a reason security people are so Manichean about these kinds of
>> things. If something is not intended to be secure or used in
>> security-sensitive ways, then fine, no worries. But if it is, then
>> there's no point in trying to mess around with "probably mostly
>> secure" -- either solve the problem right or don't bother. (See also:
>> the time Python wasted trying to solve hash randomization without
>> actually solving hash randomization [1].) If Tim Peters can get fooled
>> into thinking something like using MT to generate session ids is
>> "probably mostly secure", then what chance do the rest of us have?
>> <wink>
>
> As above, I'm still not much worried about .choice().  Even if I were,
> I'd be happy to leave it at "use .choice() from a random.SystemRandom
> instance instead".  Unless there's some non-obvious (to me) reason
> these authors appear to be unhappy with urandom.

No, SystemRandom.choice is certainly fine. But people clearly don't
use it, so its fine-ness doesn't matter that much in practice...
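
For reference, the SystemRandom spelling of the session-key code we've
been discussing is only a couple of lines longer. This is a sketch; the
alphabet, the default length and the name make_token are arbitrary
choices for the example:

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols

# SystemRandom draws from the OS source (os.urandom) rather than from MT.
_sysrand = random.SystemRandom()

def make_token(length=40):
    """Return a random token suitable for a session key or reset nonce."""
    return "".join(_sysrand.choice(ALPHABET) for _ in range(length))

print(make_token())
```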

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From mal at egenix.com  Tue Sep 15 10:45:34 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Sep 2015 10:45:34 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
Message-ID: <55F7DAAE.5010401@egenix.com>

On 15.09.2015 09:36, Nathaniel Smith wrote:
>
> [Using empirical tests to check RNGs]
>
> Obviously the thing the scientists worry about is a *strict* subset of
> what the cryptographers are worried about. 

I think this explains why we cannot make ends meet:

A scientist wants to be able to *repeat* a simulation in exactly the
same way without having to store GBs of data (or send them to colleagues
to have them verify the results).

Crypto RNGs cannot provide this feature by design.

What people designing PRNGs are after is to improve the statistical
properties of these PRNGs while still maintaining the repeatability
of the output.

> This is why it is silly to
> worry that a crypto RNG will cause problems for a scientific
> simulation. The cryptographers take the scientists' real goal -- the
> correctness of arbitrary programs like e.g. a monte carlo simulation
> -- *much* more seriously than the scientists themselves do. (This is
> because scientists need RNGs to do their real work, whereas for
> cryptographers RNGs are their real work.)

Yes, cryptographers are the better folks, understood. These arguments
are not really helpful. They are not even arguments.

It's really simple: If you don't care about being able to reproduce
your simulation results, you can use a crypto RNG, otherwise
you can't.
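
To illustrate with the two generators Python ships today (a small sketch;
simulate is a stand-in for a real simulation and the seed is arbitrary):

```python
import random

def simulate(rng, n=5):
    """Stand-in for a simulation: just draws n floats from rng."""
    return [rng.random() for _ in range(n)]

# Seeded Mersenne Twister: the same seed replays the same run.
run_a = simulate(random.Random(20150915))
run_b = simulate(random.Random(20150915))
print(run_a == run_b)

# SystemRandom: seed() is accepted but ignored, so a run cannot be replayed.
sys_rng = random.SystemRandom()
sys_rng.seed(20150915)
sys_a = simulate(sys_rng)
sys_rng.seed(20150915)
sys_b = simulate(sys_rng)
print(sys_a == sys_b)
```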

> Compared to this, k-dimensional equidistribution is a red herring: it
> requires that you have a RNG that repeats itself after a while, and
> within each repeat it produces a uniform distribution over bitstrings
> of some particular length.

k-dim equidistribution is a way to measure how well your
PRNG behaves, because it describes in analytical terms how
far you can get with increasing the linear complexity of your
RNG output. The goal is not to design a PRNG with specific
k, but to increase k as far as possible, given the RNG's
deterministic limitations.

It's also not something you require of a PRNG, it's simply
a form of analytical measurement, just like the tests in TestU01
or the NIST test set are statistical measurements for various
aspects of RNGs.

Those statistical tests are good at detecting flaws of certain kinds,
but they are not perfect. If you know the tests, you can work around
them and have your RNG appear to pass them, e.g. you can trick a
statistical test for linear dependencies by applying a non-linear
transform. That doesn't make the RNGs better, but it apparently is
a way to convince some people of the quality of your RNG.

> By contrast, a true random bitstring does
> not repeat itself, and it gives a uniform distribution over bitstrings
> of *arbitrary* length. In this regard, crypto RNGs are like true
> random bitstrings, not like k-equidistributed RNGs. This is a good
> thing. k-equidistribution doesn't really hurt (theoretically it
> introduces flaws, but for realistic designs they don't really matter),
> but if randomness is what you want then crypto RNGs are better.

If you can come up with a crypto RNG that allows repeating the
results, I think you'd have us all convinced, otherwise it
doesn't really make sense to compare apples and oranges,
and insist that orange juice is better for you than
apple juice ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 15 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               3 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         11 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From rosuav at gmail.com  Tue Sep 15 10:58:01 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 15 Sep 2015 18:58:01 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <etPan.55f72cb5.7b968f9d.24af@Draupnir.home>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
 <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>
 <etPan.55f72cb5.7b968f9d.24af@Draupnir.home>
Message-ID: <CAPTjJmpiuwdwoZXoF4xr70P49d+vLGFOFpb=Z+GW7x7ngoCmoQ@mail.gmail.com>

On Tue, Sep 15, 2015 at 6:23 AM, Donald Stufft <donald at stufft.io> wrote:
>> * The security arguments seem to be largely in the context of web
>> application development (cookies, passwords, shared secrets, ...)
>> That's not the only context that matters.
>
> You're right it's not the only context that matters, however it's often brought
> up for a few reasons:
>
> * Security largely doesn't matter for software that doesn't accept or send
>  input from some untrusted source which narrows security down to be mostly
>  network based applications.
>
> * The HTTP protocol is "eating the world" and we're seeing more and more things
>   using it as their communication protocol (even for things that are not
>   traditional browser based applications).
>
> * Traditional Web Applications/Sites are a pretty large target audience for
>   Python and in particular a lot of the security folks come from that world
>   because the web is a hostile place.

To add to that: Web application development is a *huge* area (every
man and his dog wants a web site, and more than half of them want
logins and users and so on), which means that the number of
non-experts writing security-sensitive code is higher there than in a
lot of places. The only other area I can think of that would be
comparably popular would be mobile app development - and a lot of the
security concerns there are going to be in a web context anyway.

Is it fundamentally insecure to receive passwords over an encrypted
HTTP connection and use those to verify user identities? I don't think
so (although I'm no expert) - it's what you do with them afterward
that matters (improperly hashing - or, worse, using a reversible
transformation). Why are so many people advised not to do user
authentication at all, but to tie in with one of the auth APIs like
Google's or Facebook's? Because it's way easier to explain how to get
that right than to explain how to get security/encryption right.

How bad is it, really, to tell everyone "use random.SystemRandom for
anything sensitive", and leave it at that?

ChrisA

From ncoghlan at gmail.com  Tue Sep 15 12:14:35 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Sep 2015 20:14:35 +1000
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CAPTjJmpiuwdwoZXoF4xr70P49d+vLGFOFpb=Z+GW7x7ngoCmoQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAN-Kwu3S3O8ft7eXcPYhV6ns_kHSsXVUw1fSsPuFjM7EwdLL8g@mail.gmail.com>
 <CACac1F9Y4n_dd+ybfCA2Ps1gN0Q5GbWJxeErqj0KfFNzYGbQjw@mail.gmail.com>
 <etPan.55f72cb5.7b968f9d.24af@Draupnir.home>
 <CAPTjJmpiuwdwoZXoF4xr70P49d+vLGFOFpb=Z+GW7x7ngoCmoQ@mail.gmail.com>
Message-ID: <CADiSq7eKBdeG3yEWrPexZvBwM9em-AhH6bLrLrnG3vyi90Ey8w@mail.gmail.com>

On 15 September 2015 at 18:58, Chris Angelico <rosuav at gmail.com> wrote:
> How bad is it, really, to tell everyone "use random.SystemRandom for
> anything sensitive", and leave it at that?

That's the status quo, and has been for a long time. If it was ever
going to work in terms of discouraging folks from using the module level
functions for security sensitive tasks, it would have worked by now.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From p.f.moore at gmail.com  Tue Sep 15 13:04:49 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 15 Sep 2015 12:04:49 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
Message-ID: <CACac1F9BUiZhBGh4rTcUbw7Yk1TiWSA4Ar8WdO_gNH6x7wbA_Q@mail.gmail.com>

On 14 September 2015 at 23:39, Paul Moore <p.f.moore at gmail.com> wrote:
> (The rest of your emails, I'm going to read fully and digest before
> responding. Might take a day or so.)

Point by point responses exhaust and frustrate me, and don't really serve much
purpose other than to perpetuate the debate. So I'm going to make some final
points, and then stop. This is based on having read the various emails
responding to my earlier comments. If it looks like I haven't read something,
please assume I have but either you didn't get your point across, or maybe I
simply don't agree with you.

Why now?
--------

First of all, the big question for me is why now? The random module has been
around in its current form for many, many years. Security issues are not new,
maybe they are slowly increasing, but there's been no step change. The only
thing that seems to have changed is that someone (Theo) has drawn attention to
the random module.

So I feel that the onus is on the people proposing change to address that.
Show me the evidence that we've had an actual problem for many years, and
demonstrate that it's a good job we spotted it at last, and now have a chance
to fix it. Explain to me what has been going wrong all these years that I'd
never even noticed. Arguments that people are misusing the module aren't
sufficient in themselves - they've (presumably) been doing that for years. In
all that time, who was hacked? Who lost data? As a result of random.random
being a PRNG rather than being crypto-secure?

I'm not asking for an unassailable argument, just acknowledgement that it's
*your* job to address that question, and not mine to persuade you that "we've
been alright so far" is a compelling reason to reject your proposal.

Incorrect code on SO etc
------------------------

As regards people picking up insecure code snippets from the internet and
using them, there's no news there. I can look round and find hundreds of bits
of incorrect code in any area you want. People copy/paste garbage code all the
time. To my embarrassment, I've done it myself in the past :-(

But I'm reminded of https://xkcd.com/386/ - "somebody is wrong on the
internet!"

This proposal, and in particular the suggestion that we need to
retrospectively make the code snippets quoted here secure, strikes me as a
huge exercise in trying to correct all the people who are wrong on the
internet. There's certainly value in "safe by default" APIs, I don't disagree
with that, but I honestly fail to see how quoting incorrect code off the
internet is a compelling argument for anything.

Millions of users are affected
------------------------------

The numbers game is also a frustrating exercise here. We keep hearing that
"millions of users are affected by bad code", that scans of Google almost
immediately find sites with vulnerabilities.

But I don't see anyone pointing at a single documented case of an actual
exploit caused by Python's random module. There's no bug report. There's no
security alert notification.

How are those millions of users affected? Their level of risk is increased?
Who can tell that? Are any of the sites identified holding personal data? Not
all websites on the internet are *worth* hacking.

And I feel that expressing that view is somehow frowned on. That "it doesn't
matter" is an unacceptable view to hold. And so, the responses to my questions
feel personal, they feel like criticisms of me personally, that I'm being
unprofessional. I don't want to make this a big deal, but the code of conduct
says "we're tactful when approaching differing views", and it really doesn't
feel like that.

I understand that the whole security thing is a numbers game. And that it's
about assessing risk. But what risk is enough to trigger a response? A 10%
increased chance of any given website being hacked? 5%? 1%? Again, I'm not
asking to use the information to veto a change. I'm asking to *understand
your position*. To better assess your arguments, so that I can be open to
persuasion, and to *agree* with you, if your arguments are sound.

Furthermore, should we not take into account other languages and approaches at
this point? Isn't PHP a well-known "soft target"? Aren't phishing and social
engineering the best approaches to hacking these days, rather than cracking
RNGs? I don't know, and I look to security experts for advice here. So please
explain for me, how are you assessing the risks, and why do you judge this
specific risk high enough to warrant a response?

The impression I get is that the security view is that *any* risk, no matter
how small, once identified, warrants a response. "Do nothing" is never an
option. If that's your position, then I'm sorry, but I simply don't agree with
you. I don't want to live in a world that paranoid, and I'm unsure how to get
past this point to have a meaningful dialog.

History, and security's "bad rep"
---------------------------------

Donald asked if I was experiencing some level of spill-over from
distutils-sig, where there has *also* been a lot of security churn (far more
than here). Yes, I am. No doubt about that. On distutils-sig, and pip in
particular, it's clear to see a lot of frustration from users with the
long-running series of security changes. The tone of bug reports is frustrated
and annoyed. Users want a break from being forced to make changes.

Outside of Python, and speaking purely from my own experience in the corporate
world, security is pretty uniformly seen as an annoying overhead, and a block
on actually getting the job done. You can dismiss that as misguided, but it's
a fact. "We need to do this for security" is a direct challenge to people to
dismiss it as unnecessary, and often to immediately start looking for ways to
bypass the requirement "so that it doesn't get in the way". I try not to take
that attitude in this sort of debate, but at the same time, I do try to
*represent* that view and ask for help in addressing it.

The level of change in core Python is far less than on distutils-sig, and has
been relatively isolated from "non-web" areas. People understand (and are
grateful for) increases in "secure by default" behaviour in code like urllib
and ssl. They know that these are places where security is important, where
getting it right is harder than you'd think, and where trusting experts to do
the hard thinking for you is important.

But things like hash randomisation and the random module are less obviously
security related. The feedback from hash randomisation focused on "why did you
break my code?". It wasn't a big deal, people were relying on undocumented
behaviour and accepted that, but they did see it as a breakage from a security
fix. I expect the same to be true with the random module, but with the added
dimension that we're proposing changing documented behaviour this time.

As a result of similar arguments applying to every security change, and those
arguments never *really* seeming to satisfy people, there's a lot of
reiterated debate. And that's driving interested but non-expert people away
from contributing to the discussion. So we end up with a lack of checks and
balances because people without a vested interest in tightening security "tune
out" of the debates. I see that as a problem. But ultimately, if we can't find
a better way of running these discussions, I don't know how we fix it. I
certainly can't continue being devil's advocate every time.

Anyway, that's me done on this thread. I hope I've added more benefit than
cost to the discussion. Thanks to everyone for responding to my questions -
even if we all felt like we were both just repeating the same thing, it's a
lot of effort doing so and I appreciate your time.

Paul

From robert.kern at gmail.com  Tue Sep 15 13:21:57 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 15 Sep 2015 12:21:57 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <20150915035334.GF31152@ando.pearwood.info>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
Message-ID: <mt8v0l$2mk$1@ger.gmane.org>

On 2015-09-15 04:53, Steven D'Aprano wrote:
> On Mon, Sep 14, 2015 at 10:19:09PM +0100, Robert Kern wrote:
>
>> The requirement for a good PRNG for simulation work is that it be *well*
>> distributed in reasonable dimensions, not that it be *exactly*
>> equidistributed for some k. And well-distributedness is exactly what is
>> tested in TestU01. It is essentially a collection of simulations designed
>> to expose known statistical flaws in PRNGs. So to your earlier question as
>> to which is more damning, failing TestU01 or not being perfectly 623-dim
>> equidistributed, failing TestU01 is.
>
> I'm confused here. Isn't "well-distributed" a less-strict test than
> "exactly equidistributed"? MT is (almost) exactly k-equidistributed up
> to k = 623, correct? So how does it fail the "well-distributed" test?

k=623 is a tiny number of dimensions for testing "well-distributedness". You 
should be able to draw millions of values without detecting significant 
correlations.

Perfect k-dim equidistribution is not a particularly useful metric on its own 
(at least for simulation work). You can't just say "PRNG A has a bigger k than 
PRNG B therefore PRNG A is better". You need a minimum period to even possibly 
reach a certain k, and that period goes up exponentially with k. Given two PRNGs 
that have the same period, but one has a much smaller k than the other, *then* 
you can start making inferences about relative quality (again for simulation 
work; ChaCha20 has a long period but no guarantees of k that I am aware of, but 
its claim to fame is security, not simulation work). Astronomical periods have 
costs, so you only want to pay for what is actually worth it, so it's certainly 
a good thing that the MT has a k near its upper bound. PRNGs with shorter, but 
still roomy periods like 2**128 are not worse because they have necessarily 
smaller ks.
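
To put numbers on the period-versus-k point, here is a rough sketch of the
counting bound. It assumes w-bit outputs and uses the fact that a generator
can only be k-dimensionally equidistributed if its period is long enough to
visit every k-tuple of w-bit values, so roughly k <= log2(period) / w. The
helper name is made up for illustration.

```python
def max_equidistribution_dim(period_bits, word_bits=32):
    """Largest k permitted by the counting bound for a period of about 2**period_bits.

    Every k-tuple of word_bits-wide outputs must be reachable, which needs
    about 2**(word_bits * k) states, hence k <= period_bits // word_bits.
    """
    return period_bits // word_bits

# Mersenne Twister: period 2**19937 - 1, 32-bit outputs.
mt_k = max_equidistribution_dim(19937)

# A generator with a "roomy" period of about 2**128 and 32-bit outputs.
small_k = max_equidistribution_dim(128)

print(mt_k, small_k)
```

The MT sits at its upper bound, while the shorter-period generator cannot
exceed a single-digit k whatever its quality, which is why comparing k across
generators with different periods says little on its own.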

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From njs at pobox.com  Tue Sep 15 13:41:30 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 15 Sep 2015 04:41:30 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F7DAAE.5010401@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
Message-ID: <CAPJVwBm-8kSxj_gSg0Q34vO1P65_vPWKQ0QggpRtaB=7TRvOWA@mail.gmail.com>

On Tue, Sep 15, 2015 at 1:45 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 15.09.2015 09:36, Nathaniel Smith wrote:
>>
>> [Using empirical tests to check RNGs]
>>
>> Obviously the thing the scientists worry about is a *strict* subset of
>> what the cryptographers are worried about.
>
> I think this explains why we cannot make ends meet:
>
> A scientist wants to be able to *repeat* a simulation in exactly the
> same way without having to store GBs of data (or send them to colleagues
> to have them verify the results).
>
> Crypto RNGs cannot provide this feature per design.
>
> What people designing PRNGs are after is to improve the statistical
> properties of these PRNGs while still maintaining the repeatability
> of the output.
>
>> This is why it is silly to
>> worry that a crypto RNG will cause problems for a scientific
>> simulation. The cryptographers take the scientists' real goal -- the
>> correctness of arbitrary programs like e.g. a monte carlo simulation
>> -- *much* more seriously than the scientists themselves do. (This is
>> because scientists need RNGs to do their real work, whereas for
>> cryptographers RNGs are their real work.)
>
> Yes, cryptographers are the better folks, understood. These arguments
> are not really helpful. They are not even arguments.

Err... I think we're arguing past each other. (Hint: I'm a scientist,
not a cryptographer ;-).)

My email was *only* trying to clear up the argument that keeps popping
up about whether or not a cryptographic RNG could introduce bias in
simulations etc., as compared to the allegedly-better-behaved Mersenne
Twister. (As in e.g. your comment upthread that "[MT] is proven to be
equidistributed which is a key property needed for it to be used as
basis for other derived probability distributions".) This argument is
incorrect -- equidistribution is not a guarantee that an RNG will
produce good results when deriving other probability distributions,
and in general cryptographic RNGs will produce as-or-better results
than MT in terms of correctness of output. On this particular axis,
using a cryptographic RNG is not at all dangerous.

Obviously this is only one of the considerations in choosing an RNG;
the quality of the randomness is totally orthogonal to considerations
like determinism.

(Cryptographers also have deterministic RNGs -- they call them "stream
ciphers" -- and these will also meet or beat MT in any practically
relevant test of correctness for the same reasons I outlined, despite
not being provably equidistributed. Of course there are then yet other
trade-offs like speed. But that's not really relevant to this thread,
because no-one is proposing replacing MT as the standard deterministic
RNG in Python; I'm just trying to be clear about how one judges the
quality of randomness that an RNG produces.)
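
As a rough illustration of "deterministic but built from cryptographic
primitives", here is a toy generator that hashes a key and a counter with
SHA-256. This is only a stand-in I am assuming for the sake of the example
(the stdlib has no ChaCha20), it is not a vetted stream cipher, and the class
name is made up. The point is just that the same seed replays the same
stream.

```python
import hashlib
import struct

class HashCounterRNG:
    """Toy deterministic generator: SHA-256 over (key, counter).

    Illustrative only; not a reviewed stream cipher.
    """

    def __init__(self, seed):
        # Derive a fixed-size key from the seed's repr.
        self._key = hashlib.sha256(repr(seed).encode("utf-8")).digest()
        self._counter = 0

    def random(self):
        """Return a float in [0.0, 1.0) with 53 bits of precision."""
        block = hashlib.sha256(
            self._key + struct.pack(">Q", self._counter)
        ).digest()
        self._counter += 1
        # Keep the top 53 bits of the first 8 bytes.
        return (int.from_bytes(block[:8], "big") >> 11) / (1 << 53)

a = HashCounterRNG(12345)
b = HashCounterRNG(12345)
run_a = [a.random() for _ in range(5)]
run_b = [b.random() for _ in range(5)]
print(run_a == run_b)
```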

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From sturla.molden at gmail.com  Tue Sep 15 13:54:45 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 13:54:45 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
Message-ID: <mt90uk$268$1@ger.gmane.org>

On 15/09/15 09:36, Nathaniel Smith wrote:

> Obviously the thing the scientists worry about is a *strict* subset of
> what the cryptographers are worried about. This is why it is silly to
> worry that a crypto RNG will cause problems for a scientific
> simulation. The cryptographers take the scientists' real goal -- the
> correctness of arbitrary programs like e.g. a monte carlo simulation
> -- *much* more seriously than the scientists themselves do.

No. Cryptographers care about predictability, not the exact 
distribution. Any distribution can be considered randomness with a given 
entropy, but not any distribution is uniform. Only the uniform 
distribution is uniform. That is where our needs fail to meet. 
Cryptographers damn any RNG that allows the internal state to be
reconstructed. Scientists damn any RNG that does not produce the
distribution of interest.


Sturla




From skrah at bytereef.org  Tue Sep 15 14:08:53 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Tue, 15 Sep 2015 12:08:53 +0000 (UTC)
Subject: [Python-ideas] Python's Source of Randomness and the random.py
	module Redux
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
 <CACac1F9BUiZhBGh4rTcUbw7Yk1TiWSA4Ar8WdO_gNH6x7wbA_Q@mail.gmail.com>
Message-ID: <loom.20150915T131841-528@post.gmane.org>

Paul Moore <p.f.moore at ...> writes:
[snip well-reasoned paragraphs]

I want to add that the dichotomy between "security-minded" and
"non-security-minded" that has been used for rhetoric purposes
has no basis in reality.

Several "non-security-minded" devs (of the kind who have *actually*
contributed a lot of code to CPython) have a pretty good grasp of
cryptography and just don't like security theater.



Stefan Krah






From sturla.molden at gmail.com  Tue Sep 15 14:09:07 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 14:09:07 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F7DAAE.5010401@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
Message-ID: <mt91p2$ffl$1@ger.gmane.org>

On 15/09/15 10:45, M.-A. Lemburg wrote:

> k-dim equidistribution is a way to measure how well your
> PRNG behaves, because it describes in analytical terms how
> far you can get with increasing the linear complexity of your
> RNG output.

Yes and no. Conceptually it means that k subsequent samples will have 
exactly zero correlation. But any PRNG that produces detectable 
correlation between samples 623 steps apart is junk anyway. The MT has
proven equidistribution for k=623, but many have measured
equidistribution for far longer periods than that. Numerical
computations are subject to rounding error and truncation error whatever 
you do. The question is whether the deviation from k-dim 
equidistribution will show up in your simulation result or drown in the 
error terms.
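
For what it's worth, a crude way to look at this empirically is to estimate
the correlation between samples a fixed lag apart. The sketch below is only
an assumption about how one might do that (the helper name and sample size
are arbitrary), and a sample correlation near zero is of course far weaker
evidence than a proper test battery.

```python
import random

def lag_correlation(values, lag):
    """Pearson correlation between values[i] and values[i + lag]."""
    n = len(values) - lag
    xs = values[:n]
    ys = values[lag:]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2015)          # Mersenne Twister with a fixed seed
draws = [rng.random() for _ in range(200000)]
r_623 = lag_correlation(draws, 623)
print(r_623)
```

With this many draws the estimate itself has noise of order 1/sqrt(n), so
anything within a few thousandths of zero is indistinguishable from "no
correlation" at this sample size.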

Sturla


From p.f.moore at gmail.com  Tue Sep 15 14:12:30 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 15 Sep 2015 13:12:30 +0100
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <loom.20150915T131841-528@post.gmane.org>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
 <CACac1F9BUiZhBGh4rTcUbw7Yk1TiWSA4Ar8WdO_gNH6x7wbA_Q@mail.gmail.com>
 <loom.20150915T131841-528@post.gmane.org>
Message-ID: <CACac1F8eEGSvn3UCFL8Zfmk8EG01iQ5XMKMibVEVqbYVECK-Nw@mail.gmail.com>

On 15 September 2015 at 13:08, Stefan Krah <skrah at bytereef.org> wrote:
> I want to add that the dichotomy between "security-minded" and
> "non-security-minded" that has been used for rhetoric purposes
> has no basis in reality.
>
> Several "non-security-minded" devs (of the kind who have *actually*
> contributed a lot of code to CPython) have a pretty good grasp of
> cryptography and just don't like security theater.

Agreed, and every time I ended up looking for words for the two
"sides", I ended up feeling uncomfortable. There are no "sides" here,
just a variety of people with a variety of experiences, who want to
feel assured that their voices are being heard.

Paul

From donald at stufft.io  Tue Sep 15 14:27:44 2015
From: donald at stufft.io (Donald Stufft)
Date: Tue, 15 Sep 2015 08:27:44 -0400
Subject: [Python-ideas] Python's Source of Randomness and the random.py
 module Redux
In-Reply-To: <CACac1F9BUiZhBGh4rTcUbw7Yk1TiWSA4Ar8WdO_gNH6x7wbA_Q@mail.gmail.com>
References: <etPan.55f0c84c.3fdb05c1.31bc@Draupnir.home>
 <CACac1F-EYHfU0cD+ZUxy3oi8NXOb7e4NV9GFv3ZP9_7edrrgPA@mail.gmail.com>
 <etPan.55f16900.37960d0.31bc@Draupnir.home>
 <CACac1F9zXGAxwG-eFrZnXwqV__CMUQLzNogTp5tvd2ci1vT37Q@mail.gmail.com>
 <etPan.55f18131.392f7558.31bc@Draupnir.home>
 <CACac1F_fxkfo7e2=zTuYyTejGsx=UsfxvbMmu-tFZJq-kx7xeA@mail.gmail.com>
 <etPan.55f191d7.5ad7181d.31bc@Draupnir.home>
 <CAP1=2W4T-K2g9Ws=UhbOcOYWMJtS7iRPB+OTDXzKh8Z7L91fgg@mail.gmail.com>
 <CADiSq7dcoiumBtfvrHttzCuxG=W5=F+e9RHrRcPc12zt4-b_iQ@mail.gmail.com>
 <loom.20150914T145109-192@post.gmane.org>
 <CAH_hAJEQ7KvcLN4G++709k7ztz3nBUjAkfSH-BnPc1HpFevKag@mail.gmail.com>
 <CACac1F-LtcxnDwML6395eowUoDpoaL_d3jrqiOm2TUypNwO1Hw@mail.gmail.com>
 <CAH_hAJGi9UEdBzt-GC=CqvGTq7aWoM7Yq2ZtsiYn4TvWV=Z06A@mail.gmail.com>
 <CACac1F_OZLAC2KLjnx2Z1LFq0WDzEt20WmCeHqhTgoyW_SFehg@mail.gmail.com>
 <etPan.55f72fc0.3eda087c.24af@Draupnir.home>
 <CACac1F9rCe_n-3gVPGqMp48WthLmt-4ZDu0aGrshCO5oevRjfQ@mail.gmail.com>
 <CACac1F9BUiZhBGh4rTcUbw7Yk1TiWSA4Ar8WdO_gNH6x7wbA_Q@mail.gmail.com>
Message-ID: <etPan.55f80ec0.773881a.5a78@Draupnir.home>

On September 15, 2015 at 7:04:52 AM, Paul Moore (p.f.moore at gmail.com) wrote:
> On 14 September 2015 at 23:39, Paul Moore wrote:
> > (The rest of your emails, I'm going to read fully and digest before
> > responding. Might take a day or so.)
>  
> Point by point responses exhaust and frustrate me, and don't really serve much
> purpose other than to perpetuate the debate. So I'm going to make some final
> points, and then stop. This is based on having read the various emails
> responding to my earlier comments. If it looks like I haven't read something,
> please assume I have but either you didn't get your point across, or maybe I
> simply don't agree with you.
>  
> Why now?
> --------
>  
> First of all, the big question for me is why now? The random module has been
> around in its current form for many, many years. Security issues are not new,
> maybe they are slowly increasing, but there's been no step change. The only
> thing that seems to have changed is that someone (Theo) has drawn attention to
> the random module.
>  
> So I feel that the onus is on the people proposing change to address that.
> Show me the evidence that we've had an actual problem for many years, and
> demonstrate that it's a good job we spotted it at last, and now have a chance
> to fix it. Explain to me what has been going wrong all these years that I'd
> never even noticed. Arguments that people are misusing the module aren't
> sufficient in themselves - they've (presumably) been doing that for years. In
> all that time, who was hacked? Who lost data? As a result of random.random
> being a PRNG rather than being crypto-secure?
>  
> I'm not asking for an unassailable argument, just acknowledgement that it's
> *your* job to address that question, and not mine to persuade you that "we've
> been alright so far" is a compelling reason to reject your proposal.

The answer to "Why now?" is basically because someone brought it up. I realize
that's a pretty arbitrary thing but I'm not sure what answer would even be
acceptable here. When is an OK time to do it in your eyes? Is it only after
there is a public, known attack against the RNG? Is it only when the module is
first being added?

The sad state of affairs is that it's only been relatively recently that our
industry as a whole has really taken security seriously, so there are a lot of
things out there that are not well designed from a security POV. We can't go
back in time and change the original mistake, but we can repair it going into
the future.


>  
> Incorrect code on SO etc
> ------------------------
>  
> As regards people picking up insecure code snippets from the internet and
> using them, there's no news there. I can look round and find hundreds of bits
> of incorrect code in any area you want. People copy/paste garbage code all the
> time. To my embarrassment, I've done it myself in the past :-(
>  
> But I'm reminded of https://xkcd.com/386/ - "somebody is wrong on the
> internet!"
>  
> This proposal, and in particular the suggestion that we need to
> retrospectively make the code snippets quoted here secure, strikes me as a
> huge exercise in trying to correct all the people who are wrong on the
> internet. There's certainly value in "safe by default" APIs, I don't disagree
> with that, but I honestly fail to see how quoting incorrect code off the
> internet is a compelling argument for anything.


The argument is basically that security is an important part of API design, and
that if you look at what people are doing in practice, it gives you an idea of
how people think they should use the API. It's kind of like looking at a
situation like this: https://i.imgur.com/0gnb7Us.jpg and concluding that maybe
we should pave that worn down footpath, because people are going to use it
anyways.

>  
> Millions of users are affected
> ------------------------------
>  
> The numbers game is also a frustrating exercise here. We keep hearing that
> "millions of users are affected by bad code", that scans of Google almost
> immediately find sites with vulnerabilities.
>  
> But I don't see anyone pointing at a single documented case of an actual
> exploit caused by Python's random module. There's no bug report. There's no
> security alert notification.

So a big part of this is certainly preventative. It's a relatively recent
development that hacking went from individuals or small teams going after
big targets to a business of its own. There are literally giant office
complexes in places like Russia and China filled with employees in cubicles,
but they aren't writing software like at a normal company, they are just
trawling around the internet, looking for targets, trying to expand botnets,
looking for anything and everything they can get their hands on.

It's also true that there isn't going to be a big fanfare for *most* actual
hacked computers/sites. Most of the time the people running the site simply
won't ever know, they'll just be silently hosting malware or having their
users' passwords fed into other sites. Very few exploits actually get
noticed, and those that are noticed are unlikely to get public attention.

I'd also suggest that for changes like these, if someone was exploited by this
they'd probably look at the documentation for random.py and see that they were
accidentally using the module wrong, and then blame themselves and not ever
bother to file a bug report. It is my opinion that it's not really their fault
that the API led them to believe that what they were doing was right.

>  
> How are those millions of users affected? Their level of risk is increased?
> Who can tell that? Are any of the sites identified holding personal data? Not
> all websites on the internet are *worth* hacking.

Actually, all sites on the internet *are* worth hacking, depending on what you
call hacking. Malware is constantly being hosted on tiny sites that most
wouldn't call "worth" hacking, but malware authors were able to hack in some
way and then they uploaded their malware there. If there are user logins it's
likely that people reused usernames and passwords, so if you can get the
passwords from one smaller site, it's possible you can use that as a door into
a larger, more important site. Plus, there's also the desire for botnets to
add more and more nodes into their swarm, they don't care what site you're
hosting, they just want the machine.

One key problem to the security of the internet as a whole is that there are a
lot of small sites without dedicated security teams, or anyone who really knows
security at all. These are easy targets for people and most languages and
libraries make it far too easy for people to do the wrong thing.

>  
> And I feel that expressing that view is somehow frowned on. That "it doesn't
> matter" is an unacceptable view to hold. And so, the responses to my questions
> feel personal, they feel like criticisms of me personally, that I'm being
> unprofessional. I don't want to make this a big deal, but the code of conduct
> says "we're tactful when approaching differing views", and it really doesn't
> feel like that.
>  
> I understand that the whole security thing is a numbers game. And that it's
> about assessing risk. But what risk is enough to trigger a response? A 10%
> increased chance of any given website being hacked? 5%? 1%? Again, I'm not
> asking to use the information to veto a change. I'm asking to *understand
> your position*. To better assess your arguments, so that I can be open to
> persuasion, and to *agree* with you, if your arguments are sound.

It's basically a gut feeling since we can't get any hard data here. Being able
to look online and, within minutes, find code in the wild that does this wrong
gives us an idea of how likely it is. So does reasoning about what people will
do when they don't know the difference between ``random.random()`` and
``random.SystemRandom().random()``, along with a little bit of guessing
based on experience with similar situations.
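
To make the difference concrete, here is a small sketch. The token helper and
the alphabet are just assumptions for the example, not code from any project
mentioned in this thread. The same function is predictable when driven by a
seeded ``random.Random`` and unpredictable when driven by
``random.SystemRandom``, which draws from the OS entropy source.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits

def make_token(rng, length=32):
    """Build a token by drawing characters from the given generator."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

# Mersenne Twister: anyone who knows (or recovers) the state can replay it.
seeded = random.Random(42)
replay = random.Random(42)
predictable_a = make_token(seeded)
predictable_b = make_token(replay)
print(predictable_a == predictable_b)

# SystemRandom: backed by os.urandom(), cannot be seeded or replayed.
secure_token = make_token(random.SystemRandom())
print(len(secure_token))
```

The calling code is identical in both cases, which is part of why the mistake
is so easy to make.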

Another input into this equation is how likely it is that this change would
break someone and, once broken, how easy it will be to fix things.

I sadly can't give anything more specific than that here, because it's a bit of
an artform crossed with personal biases :(

>  
> Furthermore, should we not take into account other languages and approaches at
> this point? Isn't PHP a well-known "soft target"? Isn't phishing and social
> engineering the best approach to hacking these days, rather than cracking
> RNGs? I don't know, and I look to security experts for advice here. So please
> explain for me, how are you assessing the risks, and why do you judge this
> specific risk high enough to warrant a response?
>  
> The impression I get is that the security view is that *any* risk, no matter
> how small, once identified, warrants a response. "Do nothing" is never an
> option. If that's your position, then I'm sorry, but I simply don't agree with
> you. I don't want to live in a world that paranoid, and I'm unsure how to get
> past this point to have a meaningful dialog.

Do nothing is absolutely an option, but most security focused folks don't take
a scorched earth view of security so we often times don't bother to even
mention a possible change unless we think that doing nothing is the wrong
answer. An example going back to PEP 476 where we enabled TLS verification by
default on HTTPS, we limited it to *only* HTTPS even though TLS is used by
many other protocols because it was our opinion that doing nothing for those
protocols was the right call. Those protocols are still insecure by
default, but doing something about that by default would break too much for us
to be willing to even suggest it.
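
For anyone who hasn't looked at what "verification by default" means in
practice, this sketch just inspects the settings that a default context from
the ssl module carries. It makes no network connection, and it illustrates
the secure defaults rather than describing the exact code path the PEP
changed.

```python
import ssl

# A context built with the recommended defaults for a client.
context = ssl.create_default_context()

# Certificates must validate against the trust store...
verifies_certificates = context.verify_mode == ssl.CERT_REQUIRED
# ...and the certificate must match the host name being contacted.
checks_hostname = context.check_hostname

print(verifies_certificates, checks_hostname)
```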

On top of that, we tend to want to prioritize the things we do try to have
happen, so we focus on things with the smallest fallout or the biggest upsides
and we ignore other things until later.

This is probably why there's some bias that makes it look like doing nothing is
never an option, because we already self select what we choose to push forward
because we *do* care about backwards compatibility too.

>  
> History, and security's "bad rep"
> ---------------------------------
>  
> Donald asked if I was experiencing some level of spill-over from
> distutils-sig, where there has *also* been a lot of security churn (far more
> than here). Yes, I am. No doubt about that. On distutils-sig, and pip in
> particular, it's clear to see a lot of frustration from users with the
> long-running series of security changes. The tone of bug reports is frustrated
> and annoyed. Users want a break from being forced to make changes.

I think a lot of these changes are paying down technical debt of two decades of
(industry standard) lack of focus on security. It sucks, but when we come out
the other side (because hopefully, new APIs and modules will be better designed
with security in mind given our new landscape) we should hopefully be in a much
better situation.

In the distutils-sig side, I think that PEP 470 was the last breaking change
that I can think of that we'll need to do in the name of security, we've paid
down that particular bit of technical debt, and once that lands we'll have a
pretty decent story. We still have other kinds of technical debt to pay down
though :(

>  
> Outside of Python, and speaking purely from my own experience in the corporate
> world, security is pretty uniformly seen as an annoying overhead, and a block
> on actually getting the job done. You can dismiss that as misguided, but it's
> a fact. "We need to do this for security" is a direct challenge to people to
> dismiss it as unnecessary, and often to immediately start looking for ways to
> bypass the requirement "so that it doesn't get in the way". I try not to take
> that attitude in this sort of debate, but at the same time, I do try to
> *represent* that view and ask for help in addressing it.
>  
> The level of change in core Python is far less than on distutils-sig, and has
> been relatively isolated from "non-web" areas. People understand (and are
> grateful for) increases in "secure by default" behaviour in code like urllib
> and ssl. They know that these are places where security is important, where
> getting it right is harder than you'd think, and where trusting experts to do
> the hard thinking for you is important.
>  
> But things like hash randomisation and the random module are less obviously
> security related. The feedback from hash randomisation focused on "why did you
> break my code?". It wasn't a big deal, people were relying on undocumented
> behaviour and accepted that, but they did see it as a breakage from a security
> fix. I expect the same to be true with the random module, but with the added
> dimension that we're proposing changing documented behaviour this time.
>  
> As a result of similar arguments applying to every security change, and those
> arguments never *really* seeming to satisfy people, there's a lot of
> reiterated debate. And that's driving interested but non-expert people away
> from contributing to the discussion. So we end up with a lack of checks and
> balances because people without a vested interest in tightening security "tune
> out" of the debates. I see that as a problem. But ultimately, if we can't find
> a better way of running these discussions, I don't know how we fix it. I
> certainly can't continue being devil's advocate every time.

Things don't really satisfy people because they often fundamentally
don't care about security. That is perfectly reasonable, so don't think that I
expect everyone to care about security, but they simply don't. However, in my
opinion we have a moral obligation to try and do what we reasonably can to
protect people. It's a bit like social safety nets: one person might ask why
they are being asked to pay taxes, after all they never needed government
assistance, but by asking every citizen to pay in, they can try and help people
from falling through the cracks. This isn't a social safety net, it's a
security safety net.

>  
> Anyway, that's me done on this thread. I hope I've added more benefit than
> cost to the discussion. Thanks to everyone for responding to my questions -
> even if we all felt like we were both just repeating the same thing, it's a
> lot of effort doing so and I appreciate your time.
>  
> Paul
>  

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From njs at pobox.com  Tue Sep 15 14:34:36 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 15 Sep 2015 05:34:36 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <mt90uk$268$1@ger.gmane.org>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <mt90uk$268$1@ger.gmane.org>
Message-ID: <CAPJVwBkg88m149hDuDH==KFdQkiD2OswcBPSwRnYyNpQR45rEQ@mail.gmail.com>

On Sep 15, 2015 4:57 AM, "Sturla Molden" <sturla.molden at gmail.com> wrote:
>
> On 15/09/15 09:36, Nathaniel Smith wrote:
>
>> Obviously the thing the scientists worry about is a *strict* subset of
>> what the cryptographers are worried about. This is why it is silly to
>> worry that a crypto RNG will cause problems for a scientific
>> simulation. The cryptographers take the scientists' real goal -- the
>> correctness of arbitrary programs like e.g. a monte carlo simulation
>> -- *much* more seriously than the scientists themselves do.
>
>
> No. Cryptographers care about predictability, not the exact distribution.
> Any distribution can be considered randomness with a given entropy, but not
> any distribution is uniform. Only the uniform distribution is uniform. That
> is where our needs fail to meet. Cryptographers damn any RNG that allows the
> internal state to be reconstructed. Scientists damn any RNG that does not
> produce the distribution of interest.

No, this is simply wrong. I promise! ("Oh, sorry, this is
contradictions...") For the output of a cryptographic RNG, any deviation
from the uniform distribution is considered a flaw. (And as you know, given
uniform variates you can construct any distribution of interest.) If I know
that you're using a coin that usually comes up heads to generate your
passwords, then this gives me a head start in guessing your passwords, and
that's considered unacceptable.
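
To make the parenthetical point concrete (uniform variates can be turned
into any distribution of interest), here is a minimal sketch of
inverse-transform sampling for an exponential distribution. The function
name and the choice of distribution are illustrative assumptions, and
random.SystemRandom() merely stands in for whichever uniform source is used.

```python
import math
import random


def exponential_from_uniform(u, rate):
    """Map a uniform variate u in [0, 1) to an exponential variate.

    Inverse-transform sampling: solve F(x) = 1 - exp(-rate * x) = u for x.
    """
    return -math.log(1.0 - u) / rate


# Any uniform source works here, including the OS-backed generator.
source = random.SystemRandom()
samples = [exponential_from_uniform(source.random(), rate=2.0) for _ in range(5)]
```

The transformation only sees the uniform variates, so its correctness
depends on their uniformity and not on which generator produced them.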

Or for further evidence, consider: "Scott Fluhrer and David McGrew also
showed such attacks which distinguished the keystream of the RC4 from a
random stream given a gigabyte of output." --
https://en.m.wikipedia.org/wiki/RC4#Biased_outputs_of_the_RC4

This result is listed on wikipedia because the existence of a program that
can detect a deviation from perfect uniformity given a gigabyte of samples
and an arbitrarily complicated test statistic is considered a publishable
security flaw (and RC4 is generally deprecated because of this and related
issues -- this is why openbsd's "arc4random" function no longer uses
(A)RC4).

-n

From skrah at bytereef.org  Tue Sep 15 14:36:04 2015
From: skrah at bytereef.org (Stefan Krah)
Date: Tue, 15 Sep 2015 12:36:04 +0000 (UTC)
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
Message-ID: <loom.20150915T143159-93@post.gmane.org>

Nathaniel Smith <njs at ...> writes:
> Obviously the thing the scientists worry about is a *strict* subset of
> what the cryptographers are worried about. This is why it is silly to
> worry that a crypto RNG will cause problems for a scientific
> simulation.

Do you have links to papers analyzing chacha20 w.r.t statistical
properties?  The only information that I found is

  http://www.pcg-random.org/other-rngs.html#id11

"Fewer rounds result in poor statistical performance; ChaCha2 fails
statistical tests badly, and ChaCha4 passes TestU01 but sophisticated
mathematical analysis has shown it to exhibit some bias. ChaCha8 (and
higher) are believed to be good. Nevertheless, ChaCha needs to go to more
work to achieve satisfactory statistical quality than many other generators.
ChaCha20, being newer, has received less scrutiny from the cryptographic
community than Arc4."



Stefan Krah





From sturla.molden at gmail.com  Tue Sep 15 14:51:28 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 14:51:28 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
References: <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <loom.20150909T213030-270@post.gmane.org>
 <CA+=+wqA-c80eyKf25k0+0HNCb=awARByB0C=jwtE_KzFwp+QAA@mail.gmail.com>
 <loom.20150909T232749-280@post.gmane.org>
 <CAExdVNm4S89WXOcOLrL_tE0SL6Gc9tw20BwDtg8q_M0Qc1qmJQ@mail.gmail.com>
 <CAPJVwBmxA2qGiZ9QWGNdB0krook-_NZkuur_HhtGcErsCeTOvQ@mail.gmail.com>
 <20150910015505.GO19373@ando.pearwood.info>
 <CAExdVN=tO3jPWoz0t6ckspAuWB-7t61GzbKLrx2L2UtKRyELbA@mail.gmail.com>
Message-ID: <mt948g$o3v$1@ger.gmane.org>

On 10/09/15 04:23, Tim Peters wrote:

>  Now (well, last I
> saw) they recommend a parameterized scheme creating a distinct variant
> of MT per thread (not just different state, but a different (albeit
> related) algorithm)

The DCMT uses the same algorithm (Mersenne Twister) but with different
polynomials. The choice of polynomial is more or less arbitrary. You can
search for a set of N polynomials that are (almost) relatively prime to
each other, and thus end up with totally independent sequences. Searching
for such a set can take some time, so you need to do that in advance and
save the result. But once you have a set, each one of them is just as
valid as the vanilla MT.

PCG also provides independent streams.


Sturla


From jeremy at jeremysanders.net  Tue Sep 15 16:02:43 2015
From: jeremy at jeremysanders.net (Jeremy Sanders)
Date: Tue, 15 Sep 2015 16:02:43 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
Message-ID: <mt98dj$nu$1@ger.gmane.org>

M.-A. Lemburg wrote:


> If you can come up with a crypto RNG that allows repeating the
> results, I think you'd have us all convinced, otherwise it
> doesn't really make sense to compare apples and oranges,
> and insisting that orange juice is better for you than
> apple juice ;-)

According to
http://www.pcg-random.org/other-rngs.html

This chacha20 implementation is seedable and should be reproducible:
https://gist.github.com/orlp/32f5d1b631ab092608b1

...though I am concerned about the k-dimensional equidistribution as a 
scientist, and also that if the random number generator is changed without 
the interface changing, then it may screw up tests and existing codes which 
rely on a particular sequence of random numbers.

J



From mal at egenix.com  Tue Sep 15 16:20:46 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Sep 2015 16:20:46 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <mt91p2$ffl$1@ger.gmane.org>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>	<mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>	<mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>
 <mt91p2$ffl$1@ger.gmane.org>
Message-ID: <55F8293E.2070307@egenix.com>

On 15.09.2015 14:09, Sturla Molden wrote:
> On 15/09/15 10:45, M.-A. Lemburg wrote:
> 
>> k-dim equidistribution is a way to measure how well your
>> PRNG behaves, because it describes in analytical terms how
>> far you can get with increasing the linear complexity of your
>> RNG output.
> 
> Yes and no. Conceptually it means that k subsequent samples will have exactly zero correlation. But
> any PRNG that produces detectable correlation between samples 623 steps apart is junk anyway. The MT
> have proven equidistribution for k=623, but many have measured equidistribution for far longer
> periods than that. Numerical computations are subject to rounding error and truncation error
> whatever you do. The question is whether the deviation from k-dim equidistribution will show up in
> your simulation result or drown in the error terms.

I guess the answer is: it depends :-)

According to the SFMT paper:

"""
...it requires 10**28 samples to detect
an F2-linear relation with 15 (or more) terms among 521 bits,
by a standard statistical test. If the number of bits is
increased, the necessary sample size is increased rapidly. Thus, it seems that
k(v) of SFMT19937 is sufficiently large, far beyond the level of the observable
bias. On the other hand, the speed of the generator is observable.
"""
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/M062821.pdf
(which again refers to this paper:
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/HONGKONG/hong-fin4.pdf)

10**28 is already a lot of data, but YMMV, of course.

Here's a quote for the WELL family of PRNGs:

"""
The WELL generators mentioned in Table IV successfully passed all the
statistical tests included ... TestU01 ..., except those that look for linear
dependencies in a long sequence of bits, such as the matrix-rank test
... for very large binary matrices and the linear complexity tests ...
This is in fact a limitation of all F2-linear generators, including
the Mersenne twister, the TT800, etc. Because of their linear nature,
the sequences produced by these generators just cannot have
the linear complexity of a truly random sequence. This is definitely
unacceptable in cryptology, for example, but is quite acceptable for the
vast majority of simulation applications if the linear dependencies are
of long range and high order.
"""
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 15 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               3 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         11 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From sturla.molden at gmail.com  Tue Sep 15 16:27:37 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 16:27:37 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBkg88m149hDuDH==KFdQkiD2OswcBPSwRnYyNpQR45rEQ@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <mt90uk$268$1@ger.gmane.org>
 <CAPJVwBkg88m149hDuDH==KFdQkiD2OswcBPSwRnYyNpQR45rEQ@mail.gmail.com>
Message-ID: <mt99sn$pt5$1@ger.gmane.org>

On 15/09/15 14:34, Nathaniel Smith wrote:

> No, this is simply wrong. I promise! ("Oh, sorry, this is
> contradictions...") For the output of a cryptographic RNG, any deviation
> from the uniform distribution is considered a flaw. (And as you know,
> given uniform variates you can construct any distribution of interest.)
> If I know that you're using a coin that usually comes up heads to
> generate your passwords, then this gives me a head start in guessing
> your passwords, and that's considered unacceptable.

The uniform distribution has the highest entropy, yes, but it does not 
mean that other distributions are unacceptable. The sequence just has to 
be incredibly hard to predict. A non-uniform distribution will give an 
adversary a head start, that is true, but if the adversary still cannot 
complete the brute-force attack before the end of the universe there is 
little help in knowing this.
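
As a rough illustration of how large that head start is, the sketch below
computes the entropy of a biased coin and the number of flips needed to
reach a target guessing difficulty. The function names are hypothetical,
and min-entropy is used for the flip count because it describes the
adversary's best single guess.

```python
import math


def shannon_entropy(p_heads):
    """Average information per flip, in bits."""
    if p_heads in (0.0, 1.0):
        return 0.0
    q = 1.0 - p_heads
    return -(p_heads * math.log2(p_heads) + q * math.log2(q))


def min_entropy(p_heads):
    """Bits per flip as seen by an adversary who guesses the likelier face."""
    return -math.log2(max(p_heads, 1.0 - p_heads))


def flips_needed(target_bits, p_heads):
    """Flips required to accumulate target_bits of min-entropy.

    Undefined (division by zero) for a coin that always lands the same way.
    """
    return math.ceil(target_bits / min_entropy(p_heads))
```

A fair coin needs 128 flips for 128 bits, while a coin that lands heads
90% of the time needs several times as many, so the bias is paid for with
longer output rather than being fatal by itself.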

In scientific computing we do not care about adversaries. We care about
the correctness of our numerical result. That means we should be fussy
about the distribution, not about the predictability or "randomness" of
a sequence, nor about adversaries looking to recover the internal state.
MT is proven to be uniform (equidistributed) up to 623 dimensions, but
it is incredibly easy to recover the internal state. The latter we do
not care about. In fact, we can often do even better with "quasi-random"
sequences, e.g. Sobol sequences, which are not constructed to produce
"uncorrelated" points, but constructed to produce correlated points that
are deliberately more uniform than uncorrelated points.
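
A minimal sketch of the quasi-random idea, using the van der Corput
sequence as a simpler stand-in for a Sobol sequence (an assumption made
for brevity; a real Sobol generator needs direction numbers). Each new
point falls into the largest remaining gap, so the points are strongly
correlated yet cover [0, 1) more evenly than independent draws would.

```python
def van_der_corput(n, base=2):
    """Return the n-th element (n >= 1) of the van der Corput sequence.

    The digits of n in the given base are mirrored around the radix point.
    """
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value


# First seven points in base 2: 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8
points = [van_der_corput(i) for i in range(1, 8)]
```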

Sturla


From ncoghlan at gmail.com  Tue Sep 15 16:47:34 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 00:47:34 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
Message-ID: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>

Hi folks,

Based on the feedback in the recent threads, I've written a draft PEP
that dispenses with the userspace CSPRNG idea, and instead proposes:

* defaulting to using the system RNG for the module level random API
in Python 3.6+
* implicitly switching to the deterministic PRNG if you call
random.seed(), random.getstate() or random.setstate() (this implicit
fallback would trigger a silent-by-default deprecation warning in 3.6,
and a visible-by-default runtime warning after 2.7 goes EOL)
* providing random.system and random.seedable submodules so you can
explicitly opt in to using the one you want without having to manage
your own RNG instances
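
A rough sketch of how the implicit fallback in the second bullet could
behave, written as a standalone module rather than as a patch to random.
All names starting with an underscore, the warning text and the stacklevel
are assumptions for illustration only, not the PEP's implementation.

```python
import random as _random
import warnings

_system = _random.SystemRandom()   # OS-backed, cannot be seeded
_seedable = _random.Random()       # Mersenne Twister, deterministic
_current = _system                 # default for the module-level API


def _fall_back_to_seedable():
    """Switch the module-level API to the deterministic PRNG, warning once."""
    global _current
    if _current is not _seedable:
        warnings.warn(
            "implicitly switching to the deterministic PRNG",
            DeprecationWarning,
            stacklevel=3,
        )
        _current = _seedable


def seed(a=None):
    _fall_back_to_seedable()
    _seedable.seed(a)


def getstate():
    _fall_back_to_seedable()
    return _seedable.getstate()


def setstate(state):
    _fall_back_to_seedable()
    _seedable.setstate(state)


def random():
    # Delegates at call time, so a later switch affects every caller.
    return _current.random()
```

Code that never calls seed(), getstate() or setstate() keeps drawing from
the system RNG; the first call to any of them switches the whole module
over and emits the warning a single time.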

That approach would provide a definite security improvement over the
status quo, while restricting the compatibility break to a performance
regression in applications that use the module level API without
calling seed(), getstate() or setstate(). It would also allow the
current security warning in the random module documentation to be
moved towards the end of the module, in a section dedicated to
determinism and reproducibility.

The full PEP should be up shortly at
https://www.python.org/dev/peps/pep-0504/, but caching is still a
problem when uploading new PEPs, so if that 404s, try
http://legacy.python.org/dev/peps/pep-0504/

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From sturla.molden at gmail.com  Tue Sep 15 16:54:13 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 16:54:13 +0200
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F8293E.2070307@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>	<mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>	<mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>
 <mt91p2$ffl$1@ger.gmane.org> <55F8293E.2070307@egenix.com>
Message-ID: <mt9bek$lgo$1@ger.gmane.org>

On 15/09/15 16:20, M.-A. Lemburg wrote:

> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/M062821.pdf
> (which again refers to this paper:
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/HONGKONG/hong-fin4.pdf)

You seem to be confusing the DCMT with the SFMT which is a fast SIMD 
friendly Mersenne Twister.

The DCMT is intended for using the Mersenne Twister in parallel 
computing (i.e. one Mersenne Twister per processor). It is not a 
Mersenne Twister accelerated with parallel hardware. That would be the 
SFMT.

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf

The periods they report for the DC Mersenne Twisters are long enough,
e.g. 2^127-1 or 2^521-1, but much shorter than the period of MT19937
(2^19937-1). This does not matter because the period of MT19937 is
excessive. In scientific computing, a period is long enough for most
practical purposes if it exceeds 2^64. 2^127-1 is more than
enough, and this is the shortest DCMT period reported in the paper. So
do we care? Probably not.


Sturla


From sturla.molden at gmail.com  Tue Sep 15 17:40:57 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 17:40:57 +0200
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
Message-ID: <mt9e68$3q6$1@ger.gmane.org>

On 15/09/15 16:47, Nick Coghlan wrote:

> * providing random.system and random.seedable submodules so you can
> explicitly opt in to using the one you want without having to manage
> your own RNG instances

I do not think these names are helpful. The purpose was to increase 
security, not confuse the user even more. What does "seedable" mean? 
Secure as in ChaCha20? Insecure as in MT19937? Something else? A name 
like "seedable" does not convey any useful information about the 
security to an un(der)informed web developer. A name like 
"random.system" does not convey any information about numerical 
applicability to an un(der)informed researcher.

The module names should rather indicate how the generators are intended 
to be used. I suggest:

random.crypto.*    (os.urandom, ChaCha20, Arc4Random)
random.numeric.*   (Mersenne Twister, PCG, XorShift)

Deprecate random.random et al. with a visible warning. That should 
convey the message.

Sturla


From tim.peters at gmail.com  Tue Sep 15 17:46:04 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 10:46:04 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F7DAAE.5010401@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
Message-ID: <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>

[M.-A. Lemburg <mal at egenix.com>]
> ...
> If you can come up with a crypto RNG that allows repeating the
> results, I think you'd have us all convinced, otherwise it
> doesn't really make sense to compare apples and oranges,
> and insisting that orange juice is better for you than
> apple juice ;-)

For example, run AES in CTR mode.  Remember that we did something
related on whatever mailing list it was ;-) discussing the PSF's
voting system, to break ties in a reproducible-by-anyone way using
some public info ("news") that couldn't be known until after the
election ended.
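
To sketch the counter-mode idea with only the standard library, the
example below hashes a key together with a block counter. SHA-256 is used
here as a stand-in because the stdlib has no AES; the class name and the
construction are illustrative assumptions and have not been vetted as a
CSPRNG.

```python
import hashlib


class CounterModeRandom:
    """Deterministic byte stream: block i = SHA-256(key || i)."""

    def __init__(self, seed: bytes):
        self._key = hashlib.sha256(seed).digest()
        self._counter = 0

    def _block(self, index):
        return hashlib.sha256(self._key + index.to_bytes(16, "big")).digest()

    def random_bytes(self, n):
        """Return n bytes; unused bytes of the last block are discarded."""
        out = bytearray()
        while len(out) < n:
            out += self._block(self._counter)
            self._counter += 1
        return bytes(out[:n])

    def jumpahead(self, blocks):
        """Skip ahead by whole 32-byte blocks in constant time."""
        self._counter += blocks
```

The same seed always reproduces the same stream, and because each block
depends only on the key and the counter, skipping ahead costs nothing.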

My understanding is that ChaCha20 (underlying currently-trendy
implementations of arc4random) is not only deterministic, it even
_could_ support an efficient jumpahead(n) operation.  The specific
OpenBSD implementation of arc4random goes beyond just using ChaCha20
by periodically scrambling the state with kernel-obtained "entropy"
too, and that makes it impossible to reproduce its sequence.  But it
would remain a crypto-strength generator without that extra scrambling
step.

Note that these _can_ be very simple to program.  The "Blum Blum Shub"
crypto generator from 30 years ago just iteratively squares a "big
integer" modulo a (carefully chosen) constant.  Not only
deterministic, given any integer `i` it's efficient to directly
compute the i'th output.  It's an expensive generator, though
(typically only 1 output bit is derived from each modular squaring
operation).
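
A toy sketch of Blum Blum Shub, showing both the iterated squaring and the
direct computation of the i'th state. The primes below are far too small
for any real use and, like the function names, are assumptions chosen only
so the example runs instantly.

```python
from math import gcd

P, Q = 11, 23          # toy primes, both congruent to 3 mod 4
M = P * Q              # the public modulus


def bbs_bits(seed, count):
    """Return count output bits: the low bit of x_1, x_2, ..."""
    assert gcd(seed, M) == 1, "seed must be coprime to the modulus"
    x = seed * seed % M            # x_1 = x_0 ** 2 mod M
    bits = []
    for _ in range(count):
        bits.append(x & 1)
        x = x * x % M              # one modular squaring per output bit
    return bits


def bbs_state(seed, i):
    """Compute x_i = seed ** (2 ** i) mod M directly.

    The exponent is reduced modulo lcm(P - 1, Q - 1), which requires
    knowing the factors of M.
    """
    lam = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)
    return pow(seed, pow(2, i, lam), M)
```

Here bbs_state(seed, k + 1) & 1 equals the k'th element of
bbs_bits(seed, n), which is the "efficient to directly compute the i'th
output" property.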

From sturla.molden at gmail.com  Tue Sep 15 17:45:09 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 15 Sep 2015 17:45:09 +0200
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <mt9e68$3q6$1@ger.gmane.org>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <mt9e68$3q6$1@ger.gmane.org>
Message-ID: <mt9ee3$3q6$2@ger.gmane.org>

On 15/09/15 17:40, Sturla Molden wrote:

> random.crypto.*    (os.urandom, ChaCha20, Arc4Random)
> random.numeric.*   (Mersenne Twister, PCG, XorShift)

Or even

random.security.*

The name hierarchy should convey a very clear message.


Sturla



From oscar.j.benjamin at gmail.com  Tue Sep 15 19:20:18 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 15 Sep 2015 18:20:18 +0100
Subject: [Python-ideas] Globally configurable random number generation
In-Reply-To: <CADiSq7dn7NdH+1QhHsQHoqBBA75MogoLuVAVbw87t-tE=MEJRA@mail.gmail.com>
References: <CADiSq7cQGVg3_zdHU=vaSqX-NhUrqP1Kw-9o_qrvU8r0PnuMtA@mail.gmail.com>
 <EEEEDDFE-CCB0-4BDE-8D28-383D11B852AC@yahoo.com>
 <CADiSq7fRzPZ+MotTRqbmTcvjUryzWZx66jkR_4LdOM3eW2SeOQ@mail.gmail.com>
 <m2vbbcfm7v.fsf@fastmail.com>
 <CADiSq7dLUpxd6rfX6FKbO7fGRSV7ur-AcQP_ekbi_XNt8_0Lng@mail.gmail.com>
 <305B13C9-BA39-4133-8BDC-794E82EBF254@yahoo.com>
 <CADiSq7dn7NdH+1QhHsQHoqBBA75MogoLuVAVbw87t-tE=MEJRA@mail.gmail.com>
Message-ID: <CAHVvXxS1WSZNvBgb6E_spnHN8yTPdA0COSc5ZiVO_k-ChmP6RA@mail.gmail.com>

On 15 September 2015 at 05:53, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 15 September 2015 at 14:03, Andrew Barnert <abarnert at yahoo.com> wrote:
>> Also, while I'm not 100% sold on the auto-switching and the delegate-at-call-time wrappers, I'll play with them and see, and if they do work, then you're definitely right that your second version does solve your problem with my proposal, so it doesn't matter whether your first version did anymore.
>>
>> First, on delegating top-level function: have you tested the performance? Is MT so slow that an extra lookup and function call don't matter?
>
> If folks are in a situation where the performance impact of the
> additional layer of indirection is a problem, they can switch to using
> random.Random explicitly, or import from random.seedable rather than
> the top level random module.
>
>> One quick thought on auto-switching vs. explicitly setting the instance before any functions have been called: if I get you to install a plugin that calls random.seed(), I've now changed your app to use seeded random numbers. And it might even still pass security tests, because it doesn't switch until someone hits some API that activates the plugin. Is that a realistic danger for any realistic apps? If so, doesn't that potentially make 3.6 more dangerous than 3.5?

The same problem can occur the other way round. Suppose that I want my
whole app to be seedable but I have many modules that use "from random
import choice" etc. Then in my top-level script I call random.seed and
get an error under Python 3.6. So I switch that to use random.seedable
but potentially end up with a mix of modules using
random.seedable.choice and random.choice. It may seem under certain
conditions that my app is properly seeded while not under others
depending on which particular functions get called.

The docs explicitly state that I will always be able to globally seed
the module so that my entire non-threaded application is reproducible
when using the top-level functions (even across different Python
versions for random.random). So it's entirely reasonable to expect
that people are using this behaviour and will want a way to revert to
it which in the general case would need something like
set_default_instance so that every module (including those I don't
write myself) uses the same generator.

> This isn't an applicable concern, as we already provide zero runtime
> protections against hostile monkeypatching of other modules (by design
> choice). You can subvert even os.urandom in a hostile plugin:
>
>     def not_random(num_bytes):
>         return b'A' * num_bytes
>     import os
>     os.urandom = not_random

It might not be a case of "hostile monkeypatching". Someone might just
be trying to fix their code that was broken by the
backwards-incompatible change proposed in this discussion.

>> For another: I still think we should be getting people to explicitly use seeded_random or system_random (or seedless_random, if they need speed as well as "probably secure") or explicit class instances (which are a bigger change, but more backward compatible once you've made it) as often as possible, even if random does eventually turn into seedless_random.

That's fine but seeded_random won't exist in earlier Python versions
so it creates another cross-version compatibility problem. Also
switching to using your own random instance can be a non-trivial
change if more than one module/project is involved. The random module
has deliberately provided a convenient place to store that global
state which would need to be replaced somehow.

>> And finally: it _seems like_ people who want MT for simulation/game/science stuff will have a pretty easy time finding the migration path, but I'm having a really hard time coming up with a convincing argument. Does anyone have a handful of science guys they can hack up a system for and test them empirically? Because if you can establish that fact, I think the naysayers have very little reason left to say nay, and a consensus would surely be better than having that horribly contentious thread end with "too bad, overruled, the PEP has been accepted".
>
> Given the general lack of investment in sustaining engineering for
> scientific software, I think the naysayers are right on that front,
> which is why I switched my proposal to give them a transparent upgrade
> path - I was originally thinking primarily of the educational and
> gaming use cases, and hadn't considered randomised simulations in the
> scientific realm.

TBH when I need to burn thousands of CPU-hours on RNG heavy code I
would rather use numpy's random module. It also uses Mersenne Twister
but it's a lot faster if you need loads of random numbers.

--
Oscar

From guido at python.org  Tue Sep 15 19:33:47 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 Sep 2015 10:33:47 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
Message-ID: <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>

I had to check out of the mega-threads, but I really don't like the outcome
(unless this PEP is just the first of several competing proposals).

The random module provides a useful interface: a random() function and a
large variety of derived functionality useful for statistics programming
(e.g. uniform(), choice(), betavariate(), etc.). Many of these have
significant mathematical finesse in their implementation. They are all
accessing shared state that is kept in a global variable in the module, and
that is a desirable feature (nobody wants to have to pass an extra variable
just so you can share the state of the random number generator with some
other code).

I don't want to change this API and I don't want to introduce deprecation
warnings; the API is fine, and the warnings will be as ineffective as the
warnings in the documentation.

I am fine with adding more secure ways of generating random numbers. But we
already have random.SystemRandom(), so there doesn't seem to be a hurry?

How about we make one small change instead: a way to change the default
instance used by the top-level functions in the random module. Say,

  random.set_random_generator(<instance>)

This would require the global functions to use an extra level of
indirection, e.g. instead of

  random = _inst.random

we'd change that code to say

  def random():
      return _inst.random()

(and similar for all related functions). I am not worried about the cost of
the indirection (and if it turns out too expensive we can reimplement the
module in C).

Then we could implement

  def set_random_generator(instance):
      global _inst
      _inst = instance

We could also have a function random.use_secure_random() that calls
set_random_generator() with an instance of a secure random number generator
(maybe just SystemRandom()). We could rig things so that once
use_secure_random() has been called, set_random_generator() will
throw an exception (to avoid situations where a library module attempts to
make the shared random generator insecure in a program that has declared
that it wants secure random). It would also be fine for SystemRandom (or at
least whatever is used by use_secure_random(), if SystemRandom cannot
change for backward compatibility reasons) to raise an exception when
seed(), setstate() or getstate() are called.

Of course modules are still free to use their own instances of the Random
class. But I don't see a reason to mess with the existing interface.

-- 
--Guido van Rossum (python.org/~guido)

From mal at egenix.com  Tue Sep 15 19:42:38 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Sep 2015 19:42:38 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
Message-ID: <55F8588E.7010106@egenix.com>

On 15.09.2015 17:46, Tim Peters wrote:
> [M.-A. Lemburg <mal at egenix.com>]
>> ...
>> If you can come up with a crypto RNG that allows repeating the
>> results, I think you'd have us all convinced, otherwise it
>> doesn't really make sense to compare apples and oranges,
>> and insisting that orange juice is better for you than
>> apple juice ;-)
> 
> For example, run AES in CTR mode.  Remember that we did something
> related on whatever mailing list it was ;-) discussing the PSF's
> voting system, to break ties in a reproducible-by-anyone way using
> some public info ("news") that couldn't be known until after the
> election ended.

Ah, now we're getting somewhere :-)

If we accept that non-guessable, but deterministic is a good
compromise, then adding a cipher behind MT sounds like a reasonable
way forward, even as default.

For full crypto strength, people would still have to rely on
solutions like /dev/urandom or the OpenSSL one (or reseed the
default RNG every now and then). All others get the benefit of
non-guessable, but keep the ability to seed the default RNG in
Python.

Is there some research on this (MT + cipher or hash) ?
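
For illustration, the counter-mode idea quoted above can be sketched with
the standard library alone. This is a minimal sketch and not a vetted
construction: SHA-256 over (key, counter) stands in for AES in CTR mode,
and the class name HashCounterRNG and its methods are hypothetical.

```python
import hashlib


class HashCounterRNG:
    """Hypothetical sketch: SHA-256 in counter mode as a seedable generator.

    SHA-256 stands in for a block cipher such as AES; that substitution is
    an assumption made for illustration, not a reviewed design.
    """

    def __init__(self, seed: bytes) -> None:
        # Derive a fixed-size key from the caller's seed.
        self._key = hashlib.sha256(seed).digest()
        self._counter = 0

    def next_block(self) -> bytes:
        """Return the next 32 output bytes and advance the counter."""
        counter_bytes = self._counter.to_bytes(16, "big")
        self._counter += 1
        return hashlib.sha256(self._key + counter_bytes).digest()

    def jump_to(self, index: int) -> None:
        """Move directly to block `index` without computing earlier blocks."""
        self._counter = index


# Same seed, same sequence; jumping ahead lands on the same block.
a = HashCounterRNG(b"public seed")
b = HashCounterRNG(b"public seed")
first_six = [a.next_block() for _ in range(6)]
b.jump_to(5)
sixth_again = b.next_block()
```

Because the only state is a key and a counter, the sequence is repeatable
from the seed, and skipping ahead costs nothing beyond setting the counter.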

> My understanding is that ChaCha20 (underlying currently-trendy
> implementations of arc4random) is not only deterministic, it even
> _could_ support an efficient jumpahead(n) operation.  The specific
> OpenBSD implementation of arc4random goes beyond just using ChaCha20
> by periodically scrambling the state with kernel-obtained "entropy"
> too, and that makes it impossible to reproduce its sequence.  But it
> would remain a crypto-strength generator without that extra scrambling
> step.
> 
> Note that these _can_ be very simple to program.  The "Blum Blum Shub"
> crypto generator from 30 years ago just iteratively squares a "big
> integer" modulo a (carefully chosen) constant.  Not only
> deterministic, given any integer `i` it's efficient to directly
> compute the i'th output.  It's an expensive generator, though
> (typically only 1 output bit is derived from each modular squaring
> operation).
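
A toy sketch of the Blum Blum Shub scheme described in the quoted text
follows. The primes here are far too small for real use and were chosen
only so the example runs instantly; the function names are hypothetical.
The direct computation relies on the Carmichael function of the modulus
and assumes the seed is coprime to it.

```python
from math import gcd


def bbs_bits(p: int, q: int, seed: int, count: int) -> list:
    """Generate `count` bits by repeated modular squaring (one bit each)."""
    m = p * q
    x = (seed * seed) % m          # x_0
    bits = []
    for _ in range(count):
        x = (x * x) % m            # x_i = x_(i-1) ** 2 mod m
        bits.append(x & 1)         # keep only the lowest bit
    return bits


def bbs_state_at(p: int, q: int, seed: int, i: int) -> int:
    """Compute x_i directly, without stepping through x_1 .. x_(i-1)."""
    m = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    x0 = (seed * seed) % m
    # x_i = x_0 ** (2 ** i) mod m, with the exponent reduced modulo lam.
    return pow(x0, pow(2, i, lam), m)


# Toy parameters (assumption): both primes are congruent to 3 mod 4.
P, Q, SEED = 11, 23, 3
stream = bbs_bits(P, Q, SEED, 20)
direct = [bbs_state_at(P, Q, SEED, i) & 1 for i in range(1, 21)]
```

Each output bit costs one modular squaring, which is why the generator is
expensive compared with MT.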

IMO, that's a different discussion and we should rely on existing
well tested full entropy mixers (urandom or OpenSSL) until the researchers
have come up with something like MT for chaotic PRNGs.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 15 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               3 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         11 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From donald at stufft.io  Tue Sep 15 19:50:12 2015
From: donald at stufft.io (Donald Stufft)
Date: Tue, 15 Sep 2015 13:50:12 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
Message-ID: <etPan.55f85a54.432cb095.6557@Draupnir.home>

On September 15, 2015 at 1:34:56 PM, Guido van Rossum (guido at python.org) wrote:
> > I am fine with adding more secure ways of generating random numbers.  
> But we already have random.SystemRandom(), so there doesn't
> seem to be a hurry?

The problem isn't so much that there isn't a way of securely generating random
numbers, but that the module, as it is right now, guides you towards using an
insecure source of random numbers rather than a secure one. This means that
unless you're familiar with the random module or reading the online
documentation you don't really have any idea that ``random.random()`` isn't
secure. This is an attractive nuisance for anyone who *doesn't* need
deterministic output from their random numbers and leads to situations where
people are incorrectly using MT when they should be using SystemRandom because
they don't know any better.
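
As an illustration of how similar the two choices look at the call site,
here is a minimal sketch. The helper make_token and the 40-character
length are hypothetical, chosen only for the example.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits   # 62 symbols


def make_token(rng, length=40):
    """Build a token by picking `length` symbols with the given generator."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


# Mersenne Twister: fine for simulations, not intended for secrets.
mt_token = make_token(random.Random())

# OS-provided randomness: the variant meant for security-sensitive tokens.
secure_token = make_token(random.SystemRandom())
```

Nothing in the output distinguishes the two tokens, which is the point:
the difference lies entirely in which generator was passed in.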


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From mal at egenix.com  Tue Sep 15 19:56:14 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Sep 2015 19:56:14 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <mt9bek$lgo$1@ger.gmane.org>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>	<mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>	<mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>	<mt91p2$ffl$1@ger.gmane.org>
 <55F8293E.2070307@egenix.com> <mt9bek$lgo$1@ger.gmane.org>
Message-ID: <55F85BBE.2040404@egenix.com>

On 15.09.2015 16:54, Sturla Molden wrote:
> On 15/09/15 16:20, M.-A. Lemburg wrote:
> 
>> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/M062821.pdf
>> (which again refers to this paper:
>> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/HONGKONG/hong-fin4.pdf)
> 
> You seem to be confusing the DCMT with the SFMT which is a fast SIMD friendly Mersenne Twister.

I was talking about the SFMT, which is a variant of the MT for
processors with SIMD instruction sets (most CPUs have these nowadays)
and which has 32-, 64-bit or floating point output:

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html

But what I really wanted to reference was the discussion in the
SFMT paper about the practical effects of the 623-dim equidistribution
(see the end of the first paper I quoted; the discussion references
the second paper).

> The DCMT is intended for using the Mersenne Twister in parallel computing (i.e. one Mersenne Twister
> per processor). It is not a Mersenne Twister accelerated with parallel hardware. That would be the
> SFMT.
> 
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf
> 
> The period for the DC Mersenne Twisters they report are long enough, e.g. 2^127-1 or 2^521-1, but
> much shorter than the period of MT19937 (2^19937-1). This does not matter because the period of
> MT19937 is excessive. In scientific computing, the sequence is long enough for most practical
> purposes if it is larger than 2^64. 2^127-1 is more than enough, and this is the shortest period
> DCMT reported in the paper. So do we care? Probably not.

Thanks for the pointers. I wasn't aware of a special MT variant
for parallel computing.

-- 
Marc-Andre Lemburg
eGenix.com


From tim.peters at gmail.com  Tue Sep 15 20:05:44 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 13:05:44 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAPJVwBmNHGdocHpXkoSPD3MZLXqTrfcK656afP0QpVf9sHWuqQ@mail.gmail.com>
References: <CAP7+vJ+B=umEubBs9YQ4LVPcsLk0bjXFS=K7AtBrEeHNL0Ww2w@mail.gmail.com>
 <CAExdVNnHmD6cCdat5xcy_0AAk9wQ_3ZYoCsyCA=Dcw-1p7+gDg@mail.gmail.com>
 <etPan.55f06a43.137d4868.31bc@Draupnir.home>
 <CAExdVN=qSqxdDimtuOf1hwuiRE9Bwx8k_hjWdkksiOVwNVRuQA@mail.gmail.com>
 <etPan.55f06fd9.71794aea.31bc@Draupnir.home>
 <1441821254.2853664.379081313.43B3886D@webmail.messagingengine.com>
 <CAExdVNmFRZHPYFSLEfSpdQi1uQCOyGqMY2-W7Pc0BW7NcphK8Q@mail.gmail.com>
 <1441824901.2867280.379140585.4F25667D@webmail.messagingengine.com>
 <20150909190757.GM19373@ando.pearwood.info>
 <55F0BF61.6050205@canterbury.ac.nz>
 <CAExdVNnQMOoWhc_Tr+aYCdYoMxf4MB9uq-4qvaYd21dbfy8ZiQ@mail.gmail.com>
 <55F13EAF.5040500@egenix.com>
 <CAP1=2W4sMUT6xay4tfUyU1fNMMi5HiboBUwHWqvX7pEu=Tsjew@mail.gmail.com>
 <55F1B219.1000502@egenix.com> <87y4gdzp2d.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=-6m73Ex5wKF=qicn2pzpPf_RD+sq6zzkN=Y1XDhSgRg@mail.gmail.com>
 <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <CAExdVNnYHBfkHjYpeC+rZA1oBqq7qEE82hsoGfaA_N9_aKu97Q@mail.gmail.com>
 <CAPJVwBmNHGdocHpXkoSPD3MZLXqTrfcK656afP0QpVf9sHWuqQ@mail.gmail.com>
Message-ID: <CAExdVNnPjypY6jP3DJKc2ZLcXCOhVrjx2HB+kn28RdfJv6Yc_A@mail.gmail.com>

...

[Nathaniel Smith <njs at pobox.com>]
>>> Here you go:
>>>   https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

[Tim, on appendix D]
>> ...
>> My question is, even if /dev/urandom is available, they're _not_
>> content to use that alone.  They continue to mix it up with all other
>> kinds of silly stuff.  So why do they trust urandom less than
>> OpenSSL's gimmick?

[Nathaniel]
> Who knows why they wrote the code in that exact way. /dev/urandom is fine.

Presumably _they_ know, yes?  When they're presenting a "gold
standard" all-purpose token generator, it would be nice if they had
taken care to make every step clear to other crypto wonks.  Otherwise
the rest of us are left wondering if _anyone_ really knows what
they're talking about ;-)

[on the MT state solver]
>> ...
>> It remained unclear to me _exactly_ what "in the presence of non
>> consecutive outputs" is supposed to mean.  In the only examples, they
>> knew exactly how many times MT was called.  "Non consecutive" in all
>> those contexts appeared to mean "but we couldn't observe _any_ output
>> bits in some cases - the ones we could know something about were
>> sometimes non-consecutive".  So in the MT output sequence, they had no
>> knowledge of _some_ of the outputs, but they nevertheless knew exactly
>> _which_ of the outputs they were wholly ignorant about.
>>
>> That's no problem for the equation solver.  They just skip adding any
>> equations for the affected bits, keep collecting more outputs and
>> potentially "wrap around", probably leading to an overdetermined
>> system in the end.
>>
>> But Python doesn't work the way PHP does here.  As explained in
>> another message, in Python you can have _no idea_ how many MT outputs
>> are consumed by a single .choice() call.  In the PHP equivalent, you
>> always consume exactly one MT output.  PHP's method suffers
>> statistical bias, but under the covers Python uses an accept/reject
>> method to avoid that.  Any number of MT outputs may be (invisibly!)
>> consumed before "accept" is reached, although typically only one or
>> two.  You can deduce some of the leading MT output bits from the
>> .choice() result, but _only_ for the single MT output .choice()
>> reveals anything about.  About the other MT outputs it may consume,
>> you can't even know that some _were_ skipped over, let alone how many.


> This led me to look at the implementation of Python's choice(), and
> it's interesting; I hadn't realized that it was using such an
> inefficient method.

Speed is irrelevant here, but do note that .choice() isn't restricted
to picking from less than 2**32 possibilities.

>>> random.choice(range(2**62))
2693408174642551707

Special-casing is done at Python speed, and adding a Python-level
branch to ask "is it less than 2**32?" is typically more expensive
than calling MT again.

> (To make a random selection between, say, 36
> items, it rounds up to 64 = 2**6, draws a 32-bit sample from MT,
> discards 26 of the bits (!) to get a number between 0-63, and then
> repeats until this number happens to fall in the 0-35 range, so it
> rejects with probability ~0.45. A more efficient algorithm is the one
> that it uses if getrandbits is not available, where it uses all 32
> bits and only rejects with probability (2**32 % 36) / (2**32) =
> ~1e-9.) I guess this does add a bit of obfuscation.

And note that the branch you like better _also_ needs another
Python-speed test to raise an exception if the range is "too big".
The branch actually used doesn't need that.
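
The accept/reject scheme described in the quoted text can be sketched as
follows. This is a simplified illustration, not a copy of the standard
library's source; the function names are hypothetical, and the draw
counter exists only to make the hidden rejections visible.

```python
import random


def randbelow_counting(rng, n):
    """Return (value, draws).

    value is uniform in range(n); draws is how many getrandbits() calls
    were consumed before one candidate was accepted.
    """
    k = n.bit_length()           # 36 -> 6 bits, so candidates are 0..63
    draws = 1
    r = rng.getrandbits(k)
    while r >= n:                # candidate out of range: reject, redraw
        r = rng.getrandbits(k)
        draws += 1
    return r, draws


def rejection_probability(n):
    """Chance that a single k-bit candidate falls outside range(n)."""
    k = n.bit_length()
    return (2 ** k - n) / 2 ** k


rng = random.Random(12345)
results = [randbelow_counting(rng, 36) for _ in range(1000)]
extra_draws = sum(draws - 1 for _, draws in results)
```

An observer sees only the accepted values; the number of rejected draws
between them (extra_draws here) is not recoverable from the output.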


> OTOH the amount of obfuscation is very sensitive to the size of the
> password alphabet. If I use uppercase + lowercase + digits, that gives
> me 62 options, so I only reject with probability 1/32, and I can
> expect that any given 40-character session key will contain zero skips
> with probability ~0.28, and that reveals 240 bits of seed.

To be quite clear, nothing can ever be learned about "the seed".  All
that can be observed is bits from the outputs, and the goal is to
deduce "the state".  There is an invertible permutation mapping a
state word to an output word, but from partial knowledge of N < 32
bits from a single output word you typically can't deduce N bits of
the state word from which the output was derived (for example, if you
only know the first bit of an output, you can't deduce from that alone
the value of any bit in the corresponding state word).  That's why
they need such a hairy framework to begin with (in the example, you
_can_ add equations tying the known value of the first output bit to
linear functions of all bits of the state related to the first output
bit).

About the "240 bits", they need about 80 times more than that to
deduce the state.  0.28 ** 80 is even less than a tenth ;-)
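
The arithmetic behind these figures can be checked directly. The sketch
below assumes, as in the quoted text, a 62-symbol alphabet, 6 bits per
draw, 40-character keys, and takes the MT state as 19937 bits.

```python
ALPHABET_SIZE = 62
BITS_PER_DRAW = 6
KEY_LENGTH = 40
MT_STATE_BITS = 19937

# Chance that one 6-bit candidate is rejected: 2 of 64 values, i.e. 1/32.
reject = (2 ** BITS_PER_DRAW - ALPHABET_SIZE) / 2 ** BITS_PER_DRAW

# Chance that a whole 40-character key was produced with no rejections.
p_no_skips = (1 - reject) ** KEY_LENGTH

# Output bits revealed by one such key, and how many keys' worth are needed.
bits_per_key = KEY_LENGTH * BITS_PER_DRAW
keys_needed = MT_STATE_BITS / bits_per_key

# Chance that 80 keys in a row all had zero rejections.
p_all_clean = p_no_skips ** 80
```

Here p_no_skips comes out near 0.28 and keys_needed a little above 80,
while p_all_clean is vanishingly small.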


> I don't have time right now to go look up the MT equations to see how
> easy it is to make use of such partial information,

It's robust against only knowing a subset of an output's bits
(including none).  It's not robust against not knowing _which_ output
you're staring at.

They label the state bits with variables x_0 through x_19936, and
generate equations relating specific state bits to the output bits
they can deduce.  If they don't know how many times MT was invoked
between outputs they know were generated, they can't know _which_
state-bit variables to plug into their equations.  Is this output
related to (e.g.) at least x_0 through x_31, or is it x_32 through
x_63?  x_64 through x_95?
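
For a sense of scale, the state those variables describe can be inspected
from Python. This is a sketch that relies on the layout CPython returns
from getstate(), which is an implementation detail rather than a
documented contract.

```python
import random

# getstate() returns (version, internal_state, gauss_next) in CPython.
version, internal, gauss_next = random.Random(0).getstate()

words = internal[:-1]      # the 32-bit state words
position = internal[-1]    # index of the next word to be used

total_bits = len(words) * 32
```

The 624 words hold 19968 bits, of which the generator uses 19937; the
solver's variables x_0 through x_19936 correspond to those.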

Take an absurd extreme to illustrate an obvious futility in general:
suppose .choice() remembered the size of the last range it was asked
to pick from.  Then if the next call to .choice() is for the
same-sized range, call MT 2**19937-2 times ignoring the outputs, and
call it once more.  It will get the same result then.  "The solver"
will deduce exactly the same output bits every time, and will never
learn more than that.  Eventually, if they're doing enough sanity
checking, the best their solver could do is notice that the derived
equations (regardless of how they construct them) are inconsistent.
The worst it could do is "deduce" a state that's pure fantasy.  They
can't know that they are in fact seeing an output derived from exactly
the same state every time.  Unless they read the source code for
.choice() and see it's playing this trick.  In that case, they would
never add to their collection of equations after the first output was
dealt with.  _Then_ the equations would faithfully reflect the truth:
that they learned a tiny bit at first, but never learn more than just
that.


> but there certainly are lots of real-world weaponized exploits that begin with
> something like "first, gather 10**8 session keys...". I certainly
> wouldn't trust it.

And I'm not asking you to.  I wouldn't either.  I'm expressing
skepticism that the solver in this paper is a slam-dunk proof
that all existing idiomatic Python password generators are about to
cause the world to end ;-)


> Also, if I use the base64 or hex alphabets, then the probability of
> rejection is 0, and I can deterministically read off bits from the
> underlying MT state.

I readily agree that if .choice(x) is used whenever len(x) is a power
of two, then their solver applies directly to such cases.  It's in all
and only such cases they can know exactly how many MT outputs were
consumed, and so know also which specific state-bit variables to use.


> (Alternatively, if someone in the future makes
> the obvious optimization

As above, if what you like better were obviously faster in reality, it
would have been written that way already ;-)  Honest, this looks like
Raymond Hettinger's code, and he typically obsesses over speed.


> to choice(), then it will basically stop rejecting in practice, and again
> it becomes trivial to read off all the bits from the underlying MT state.)

Not trivial.  You still need this hairy solver framework, and best I
can tell its code wasn't made available.  Note that the URL given in
the paper gives a 404 error now.  It isn't trivial to code it either.


> The point of "secure by default" is that you don't have to spend all
> these paragraphs doing the math to try and guess whether some RNG
> usage might maybe be secure; it just is secure.

No argument there!  I'm just questioning how worried people "should
be" over what actual Python code actually does now.  I confess I
remain free of outright panic ;-)


>> Best I can tell, that makes a huge difference to whether their solver
>> is even applicable to cracking idiomatic "password generators" in
>> Python.  You can't know which variables correspond to the bits you can
>> deduce.  You could split the solver into multiple instances to cover
>> all feasible possibilities (for how many MT outputs may have been
>> invisibly consumed), but the number of solver instances needed then
>> grows exponentially with the number of outputs you do see something
>> about.  In the worst case (31 bits are truncated), they need over
>> 19000 outputs to deduce the state.  Even a wildly optimistic "well,
>> let's guess no more than 1 MT output was invisibly rejected each time"
>> leads to over 2**19000 solver clones then.

> Your "wildly optimistic" estimate is wildly conservative under
> realistic conditions.

Eh.

> How confident are you that the rest of your
> analysis is totally free of similar errors? Would you be willing to bet,
> say, the public revelation of every website you've visited in the last
> 5 years on it?

I couldn't care less if that were revealed.  In fact, I'd enjoy the
trip down memory lane ;-)


> ...
> Grant money is a drop in the bucket of security research funding these
> days. Criminals and governments have very deep pockets, and it's well
> documented that there are quite a few people with PhDs who make their
> living by coming up with exploits and then auctioning them on the
> black market.

Excellent points!  Snideness doesn't always pay off for me ;-)


> BTW, it looks like that PHP paper was an undergraduate project. You
> don't need a PhD to solve linear equations :-).

So give 'em their doctorates!  I've seen doctoral theses a hundred
times less substantial ;-)


> ...
> No, SystemRandom.choice is certainly fine. But people clearly don't
> use it, so its fine-ness doesn't matter that much in practice...

It's just waiting for a real exploit.  People writing security papers
love to "name & shame".  "Gotcha!  Gotcha!"  Once people see there
_is_ "a real problem" (if there is), they'll scramble to avoid being
the target of the next name-&-shame campaign.  Before then, they're
much too busy trying to erase all traces of every website they've
visited in the last 5 years ;-)

From mal at egenix.com  Tue Sep 15 20:19:05 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Sep 2015 20:19:05 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAPJVwBm-8kSxj_gSg0Q34vO1P65_vPWKQ0QggpRtaB=7TRvOWA@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>
 <CAPJVwBm-8kSxj_gSg0Q34vO1P65_vPWKQ0QggpRtaB=7TRvOWA@mail.gmail.com>
Message-ID: <55F86119.6000000@egenix.com>

On 15.09.2015 13:41, Nathaniel Smith wrote:
> On Tue, Sep 15, 2015 at 1:45 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> On 15.09.2015 09:36, Nathaniel Smith wrote:
>>>
>>> [Using empirical tests to check RNGs]
>>>
>>> Obviously the thing the scientists worry about is a *strict* subset of
>>> what the cryptographers are worried about.
>>
>> I think this explains why we cannot make ends meet:
>>
>> A scientist wants to be able to *repeat* a simulation in exactly the
>> same way without having to store GBs of data (or send them to colleagues
>> to have them verify the results).
>>
>> Crypto RNGs cannot provide this feature per design.
>>
>> What people designing PRNGs are after is to improve the statistical
>> properties of these PRNGs while still maintaining the repeatability
>> of the output.
>>
>>> This is why it is silly to
>>> worry that a crypto RNG will cause problems for a scientific
>>> simulation. The cryptographers take the scientists' real goal -- the
>>> correctness of arbitrary programs like e.g. a monte carlo simulation
>>> -- *much* more seriously than the scientists themselves do. (This is
>>> because scientists need RNGs to do their real work, whereas for
>>> cryptographers RNGs are their real work.)
>>
>> Yes, cryptographers are the better folks, understood. These arguments
>> are not really helpful. They are not even arguments.
> 
> Err... I think we're arguing past each other. (Hint: I'm a scientist,
> not a cryptographer ;-).)
> 
> My email was *only* trying to clear up the argument that keeps popping
> up about whether or not a cryptographic RNG could introduce bias in
> simulations etc., as compared to the allegedly-better-behaved Mersenne
> Twister. (As in e.g. your comment upthread that "[MT] is proven to be
> equidistributed which is a key property needed for it to be used as
> basis for other derived probability distributions".)

Ok, thanks for the clarification.

> This argument is
> incorrect -- equidistribution is not a guarantee that an RNG will
> produce good results when deriving other probability distributions,
> and in general cryptographic RNGs will produce as-or-better results
> than MT in terms of correctness of output. On this particular axis,
> using a cryptographic RNG is not at all dangerous.

You won't get me to agree on "statistical tests are better
than mathematical proofs", so let's call it a day :-)

> Obviously this is only one of the considerations in choosing an RNG;
> the quality of the randomness is totally orthogonal to considerations
> like determinism.
> 
> (Cryptographers also have deterministic RNGs -- they call them "stream
> ciphers" -- and these will also meet or beat MT in any practically
> relevant test of correctness for the same reasons I outlined, despite
> not being provably equidistributed. Of course there are then yet other
> trade-offs like speed. But that's not really relevant to this thread,
> because no-one is proposing replacing MT as the standard deterministic
> RNG in Python; I'm just trying to be clear about how one judges the
> quality of randomness that an RNG produces.)

-- 
Marc-Andre Lemburg
eGenix.com


From guido at python.org  Tue Sep 15 20:21:20 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 Sep 2015 11:21:20 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <etPan.55f85a54.432cb095.6557@Draupnir.home>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
Message-ID: <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>

On Tue, Sep 15, 2015 at 10:50 AM, Donald Stufft <donald at stufft.io> wrote:

> On September 15, 2015 at 1:34:56 PM, Guido van Rossum (guido at python.org)
> wrote:
> > > I am fine with adding more secure ways of generating random numbers.
> > But we already have random.SystemRandom(), so there doesn't
> > seem to be a hurry?
>
> The problem isn't so much that there isn't a way of securely generating
> random
> numbers, but that the module, as it is right now, guides you towards using
> an
> insecure source of random numbers rather than a secure one. This means that
> unless you're familiar with the random module or reading the online
> documentation you don't really have any idea that ``random.random()`` isn't
> secure. This is an attractive nuisance for anyone who *doesn't* need
> deterministic output from their random numbers and leads to situations
> where
> people are incorrectly using MT when they should be using SystemRandom
> because
> they don't know any better.
>

That feels condescending, as does the assumption that (almost) every naive
use of randomness is somehow a security vulnerability. The concept of
secure vs. insecure sources of randomness isn't *that* hard to grasp.

-- 
--Guido van Rossum (python.org/~guido)

From random832 at fastmail.com  Tue Sep 15 20:25:39 2015
From: random832 at fastmail.com (Random832)
Date: Tue, 15 Sep 2015 14:25:39 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
Message-ID: <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>

On Tue, Sep 15, 2015, at 13:33, Guido van Rossum wrote:
> I don't want to change this API and I don't want to introduce deprecation
> warnings -- the API is fine, and the warnings will be as ineffective as
> the
> warnings in the documentation.

The output of random.random today when it's not seeded / seeded with
None isn't _really_ deterministic - you can't reproduce it, after all,
without modifying the code (though in principle you could do
seed(None)/getstate the first time and then setstate on subsequent
executions - it may be worth supporting this use case?) - so changing it
isn't likely to affect anyone - anyone needing MT is likely to also be
using the seed functions.
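
The getstate/setstate approach mentioned in the parenthesis can be
sketched like this. It uses a private random.Random instance rather than
the module-level functions, and pickle is just one assumed way of keeping
the state between executions.

```python
import pickle
import random

# First execution: let the generator seed itself from OS entropy,
# then capture the state before drawing anything.
rng = random.Random()
saved = pickle.dumps(rng.getstate())
first_run = [rng.random() for _ in range(5)]

# Later execution: restore the saved state to replay the same values.
replay = random.Random()
replay.setstate(pickle.loads(saved))
second_run = [replay.random() for _ in range(5)]
```

This gives a reproducible run without ever choosing a seed by hand.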

>   random.set_random_generator(<instance>)

What do you think of having calls to seed/setstate(/getstate?)
implicitly switch (by whatever mechanism) to MT? This could be done
without a deprecation warning, and would allow existing code that relies
on reproducible values to continue working without modification?
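
One possible shape for such an implicit switch is sketched below. The
class SwitchingRandom and its attribute _inst are hypothetical; this is
not how the random module works, only an illustration of the proposal.

```python
import random


class SwitchingRandom:
    """Hypothetical: start on the system RNG, fall back to MT when the
    caller asks for anything that requires reproducible state."""

    def __init__(self):
        self._inst = random.SystemRandom()

    def _use_mt(self):
        # SystemRandom subclasses Random, so test for the subclass.
        if isinstance(self._inst, random.SystemRandom):
            self._inst = random.Random()

    def seed(self, a=None):
        self._use_mt()
        self._inst.seed(a)

    def getstate(self):
        self._use_mt()
        return self._inst.getstate()

    def setstate(self, state):
        self._use_mt()
        self._inst.setstate(state)

    def random(self):
        return self._inst.random()

    def choice(self, seq):
        return self._inst.choice(seq)


gen = SwitchingRandom()
started_secure = isinstance(gen._inst, random.SystemRandom)
gen.seed(42)
first = gen.random()
gen.seed(42)
second = gen.random()
```

Code that never seeds would stay on the system RNG, while code that calls
seed() keeps getting repeatable MT output.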

[indirection in global functions]...
> (and similar for all related functions).

global getstate/setstate should also save/replace the _inst or its type;
at least if it's a different type than it was at the time the state was
saved. For backwards compatibility in case these are pickled it could
use the existing format when _inst is the current MT implementation, and
accept these in setstate.

> It would also be fine for SystemRandom (or
> at
> least whatever is used by use_secure_random(), if SystemRandom cannot
> change for backward compatibility reasons) to raise an exception when
> seed(), setstate() or getstate() are called.

SystemRandom already raises an exception when getstate and setstate are
called.
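
A quick check of that behaviour, as a sketch against CPython's random
module (the dictionary of outcomes is only there to record what happened):

```python
import random

sysrand = random.SystemRandom()
outcomes = {}

for name, args in (("getstate", ()), ("setstate", (None,))):
    try:
        getattr(sysrand, name)(*args)
        outcomes[name] = "ok"
    except NotImplementedError:
        outcomes[name] = "NotImplementedError"

# seed() is different: it is accepted and silently ignored.
seed_result = sysrand.seed(1234)
```

So getstate() and setstate() already fail loudly, while seed() currently
does not raise.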


From random832 at fastmail.com  Tue Sep 15 20:34:04 2015
From: random832 at fastmail.com (Random832)
Date: Tue, 15 Sep 2015 14:34:04 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
Message-ID: <1442342044.575954.384484633.18794172@webmail.messagingengine.com>

On Tue, Sep 15, 2015, at 14:25, Random832 wrote:
>

I made an editing mistake that may have made it hard to follow my post.
This paragraph:

> The output of random.random today when it's not seeded / seeded with
> None isn't _really_ deterministic - you can't reproduce it, after all,
> without modifying the code (though in principle you could do
> seed(None)/getstate the first time and then setstate on subsequent
> executions - it may be worth supporting this use case?) - so changing it
> isn't likely to affect anyone - anyone needing MT is likely to also be
> using the seed functions.

Should have been _after_ this one:

> What do you think of having calls to seed/setstate(/getstate?)
> implicitly switch (by whatever mechanism) to MT? This could be done
> without a deprecation warning, and would allow existing code that relies
> on reproducible values to continue working without modification?

From guido at python.org  Tue Sep 15 20:36:10 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 Sep 2015 11:36:10 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
Message-ID: <CAP7+vJKD=1XsJB=BmckLnUYTf=a7zO10K_czCtcJj=ARDrxLXw@mail.gmail.com>

On Tue, Sep 15, 2015 at 11:25 AM, Random832 <random832 at fastmail.com> wrote:

> On Tue, Sep 15, 2015, at 13:33, Guido van Rossum wrote:
> > I don't want to change this API and I don't want to introduce deprecation
> > warnings -- the API is fine, and the warnings will be as ineffective as
> > the
> > warnings in the documentation.
>
> The output of random.random today when it's not seeded / seeded with
> None isn't _really_ deterministic - you can't reproduce it, after all,
> without modifying the code (though in principle you could do
> seed(None)/getstate the first time and then setstate on subsequent
> executions - it may be worth supporting this use case?)


Yes, that's how I would do it (better than using a weak seed).


> - so changing it
> isn't likely to affect anyone - anyone needing MT is likely to also be
> using the seed functions.
>

Or they could just make a lot of random() calls and find their performance
down the drain (like what happened in the tracker issue that started all
this: http://bugs.python.org/issue25003).
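
A rough way to see the performance gap on a given machine is sketched
below. The call count is arbitrary and the resulting numbers depend on
the platform, so no particular ratio is claimed here.

```python
import random
import timeit

mt = random.Random()
sysrand = random.SystemRandom()
CALLS = 10000   # arbitrary; large enough to give a measurable time

mt_seconds = timeit.timeit(mt.random, number=CALLS)
sys_seconds = timeit.timeit(sysrand.random, number=CALLS)

# How many times longer the system RNG took for the same number of calls.
ratio = sys_seconds / mt_seconds
```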


> >   random.set_random_generator(<instance>)
>
> What do you think of having calls to seed/setstate(/getstate?)
> implicitly switch (by whatever mechanism) to MT? This could be done
> without a deprecation warning, and would allow existing code that relies
> on reproducible values to continue working without modification?
>

I happen to believe that MT's performance is a feature of the (default)
API, and this would still be considered breakage (again, as in that issue).

[indirection in global functions]...
> > (and similar for all related functions).
>
> global getstate/setstate should also save/replace the _inst or its type;
> at least if it's a different type than it was at the time the state was
> saved. For backwards compatibility in case these are pickled it could
> use the existing format when _inst is the current MT implementation, and
> accept these in setstate.
>
> > It would also be fine for SystemRandom (or at least whatever is used
> > by use_secure_random(), if SystemRandom cannot change for backward
> > compatibility reasons) to raise an exception when seed(), setstate()
> > or getstate() are called.
>
> SystemRandom already raises an exception when getstate and setstate are
> called.
>

Great!
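
For reference, a small check of that behaviour as I understand it in
CPython (the helper name is made up; seed() is a different story, since it
is accepted and ignored rather than rejected):

```python
import random

def state_methods_raising(rng):
    """Return the names of the state methods that raise NotImplementedError."""
    raising = []
    for name, args in (("getstate", ()), ("setstate", (None,))):
        try:
            getattr(rng, name)(*args)
        except NotImplementedError:
            raising.append(name)
    return raising

rng = random.SystemRandom()
print(state_methods_raising(rng))
rng.seed(12345)  # no exception: the argument is simply ignored
```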

-- 
--Guido van Rossum (python.org/~guido)

From mertz at gnosis.cx  Tue Sep 15 21:43:30 2015
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 15 Sep 2015 12:43:30 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
Message-ID: <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>

I commonly use random.some_distribution() as a quick source of "randomness"
knowing full well that it's not cryptographic. Moreover, I usually do so
initially without setting a seed.

The first question I want to answer is "does this random process behave
roughly as I expect?" But in the back of my mind is always the thought,
"If/when I want to reuse this I'll add a seed for reproducibility". It
would never occur to me to reach for the random module if I want to do
cryptography.
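
That workflow looks roughly like this (the distribution, the sample size
and the seed value are arbitrary choices for illustration):

```python
import random

def draw(n):
    """Draw n samples from a standard normal via the module-level functions."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]

first_look = draw(5)   # unseeded: differs from run to run

random.seed(2015)      # seed added later, once reproducibility matters
a = draw(5)
random.seed(2015)
b = draw(5)
print(a == b)          # the two seeded runs agree
```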

It's a good and well established API that currently exists. Sure, add a
submodule random.crypto (or whatever name), but I'm -1 on changing anything
whatsoever on the module functions that are well known.

From guido at python.org  Tue Sep 15 22:18:44 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 Sep 2015 13:18:44 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
Message-ID: <CAP7+vJ+=1RWpp9E=F9SSH31L4PDZohtpFRp1bCj2hc0cJdsnwQ@mail.gmail.com>

How about the following. We add a fast secure random generator to the
stdlib as an option, and when it has proven its worth a few releases from
now we consider again whether the default random() can be made secure
without breaking anything.

On Tue, Sep 15, 2015 at 12:43 PM, David Mertz <mertz at gnosis.cx> wrote:

> I commonly use random.some_distribution() as a quick source of
> "randomness" knowing full well that it's not cryptographic. Moreover, I
> usually do so initially without setting a seed.
>
> The first question I want to answer is "does this random process behave
> roughly as I expect?" But in the back of my mind is always the thought,
> "If/when I want to reuse this I'll add a seed for reproducibility". It
> would never occur to me to reach for the random module if I want to do
> cryptography.
>
> It's a good and well established API that currently exists. Sure, add a
> submodule random.crypto (or whatever name), but I'm -1 on changing anything
> whatsoever on the module functions that are well known.



-- 
--Guido van Rossum (python.org/~guido)

From tim.peters at gmail.com  Wed Sep 16 02:43:36 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 19:43:36 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F8588E.7010106@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
 <55F8588E.7010106@egenix.com>
Message-ID: <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>

...

[Marc-Andre]
> Ah, now we're getting somewhere :-)
>
> If we accept that non-guessable, but deterministic is a good
> compromise, then adding a cipher behind MT sounds like a reasonable
> way forward, even as default.
>
> For full crypto strength, people would still have to rely on
> solutions like /dev/urandom or the OpenSSL one (or reseed the
> default RNG every now and then). All others get the benefit of
> non-guessable, but keep the ability to seed the default RNG in
> Python.

I expect the only real reason "new entropy" is periodically mixed in
to OpenBSD's arc4random() is to guard against the possibility that a
weakness in ChaCha20 may be discovered later.  If there were any known
computationally feasible way whatsoever to distinguish bare-bones
ChaCha20's output from a "truly random" sequence, it wouldn't be
called "crypto" to begin with.

But reseeding MT every now and again is definitely not suitable for
crypto purposes.  You would need to reseed at least every 624 outputs,
and from a crypto-strength seed source.  In which case, why bother
with MT at all?  You could just as well use the crypto source
directly.
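
The 624 figure is the number of 32-bit words in MT's state.  A quick way
to see it, assuming CPython's layout for the tuple returned by getstate()
(an implementation detail, not a documented guarantee):

```python
import random

version, internal_state, gauss_next = random.getstate()
words = internal_state[:-1]  # the 32-bit state words
index = internal_state[-1]   # position within the current block of outputs

print(len(words))
```

Once an attacker has seen that many consecutive raw outputs, the whole
state is known, which is why reseeding any less often buys nothing.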


> Is there some research on this (MT + cipher or hash) ?

Oh, sure.  MT's creators noted from the start that it would suffice to
run MT's outputs through a crypto hash (like your favorite flavor of
SHA).  That's just as vulnerable to "poor seeding" attacks as plain
MT, but it's computationally infeasible to deduce the state from any
number of hashed outputs (unless your crypto hash is at least partly
invertible, in which case it's not really a crypto hash ;-) ).

For other approaches, search for CryptMT.  MT's creators suggested a
number of other schemes over the years.  The simplest throws away the
"tempering" part of MT (the 4 lines that map the raw state word into a
mildly scrambled output word - not because it needs to be thrown away,
but because they think it would no longer be needed given what
follows).  Then one byte is obtained via grabbing the next MT 32-bit
output, folding it into a persistent accumulator via multiplication,
and just revealing the top byte:

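    import random

    def cryptmt_bytes(accum):  # accum: any odd integer below 2**32
        while True:
            # fold the next 32-bit MT output, forced odd, into the accumulator
            accum = (accum * (random.getrandbits(32) | 1)) & 0xFFFFFFFF
            yield accum >> 24  # reveal only the top byte of the 32-bit word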

I did see one paper suggesting it was possible to distinguish the
output of that from a truly random sequence given 2**50 consecutive
outputs (but that's all - still no way to deduce the state).

From tim.peters at gmail.com  Wed Sep 16 02:49:33 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 19:49:33 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
 <55F8588E.7010106@egenix.com>
 <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
Message-ID: <CAExdVN=9F43vUzhvX_77dXAr-_Y5LvSPmmz2CDQUQp1vQc9hPw@mail.gmail.com>

[Tim, on CryptMT]
> I did see one paper suggesting it was possible to distinguish the
> output of that from a truly random sequence given 2**50 consecutive
> outputs (but that's all - still no way to deduce the state).

Sorry:  not 2**50 consecutive outputs (which are bytes), but 2**50
consecutive output bits, so only 2**47 outputs.

From tim.peters at gmail.com  Wed Sep 16 02:55:00 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 19:55:00 -0500
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
 <55F8588E.7010106@egenix.com>
 <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
Message-ID: <CAExdVNmvrn4Az7RPV_jmbUP9Au7NtddGp2KevsryFpZyaqfFiw@mail.gmail.com>

[Tim]
> ...
> Oh, sure.  MT's creators noted from the start that it would suffice to
> run MT's outputs through a crypto hash (like your favorite flavor of
> SHA).  That's just as vulnerable to "poor seeding" attacks as plain
> MT, but it's computationally infeasible to deduce the state from any
> number of hashed outputs

Although what's "computationally feasible" may well have changed since
then!  These days I expect even a modestly endowed attacker could
afford to store an exhaustive table of the 2**32 possible outputs and
their corresponding hashes.  Then the hashes are 100% invertible via
simple lookup, so are no better than not hashing at all.
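
A scaled-down sketch of that table attack.  It uses 16-bit values instead
of 32-bit ones so the table builds instantly, and SHA-256 and the 4-byte
big-endian encoding are just assumptions for the demo:

```python
import hashlib

BITS = 16  # scaled down from 32 so the table fits comfortably in memory

def digest(value):
    """Hash one output word, encoded as 4 big-endian bytes."""
    return hashlib.sha256(value.to_bytes(4, "big")).digest()

# Exhaustive table: hash -> the output word that produced it.
table = {digest(v): v for v in range(2 ** BITS)}

def invert(hashed):
    """Recover the original word from its hash by plain lookup."""
    return table[hashed]

print(invert(digest(12345)))
```

The same construction at 32 bits needs only storage, not cleverness.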

From stephen at xemacs.org  Wed Sep 16 03:16:59 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 Sep 2015 10:16:59 +0900
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
Message-ID: <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:

 > > This is an attractive nuisance for anyone who *doesn't* need
 > > deterministic output from their random numbers and leads to
 > > situations where people are incorrectly using MT when they should
 > > be using SystemRandom because they don't know any better.
 > 
 > That feels condescending,

It is, but it's also accurate: there's plenty of anecdotal evidence
that this actually happens, specifically that most of the recipes for
password generation on SO silently fall back to a deterministic PRNG
if SystemRandom is unavailable, and the rest happily start with
random.random.  Not only are people apparently doing a wrong thing
here, they are eagerly teaching others to do the same.  (There's also
the possibility that the bad guys are seeding SO with backdoors in
this way, I guess.)
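
For contrast, a recipe without the silent fallback might look like the
sketch below (the alphabet and default length are arbitrary choices, and
the function name is made up):

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits

def make_password(length=16):
    """Build a password from the OS entropy source only."""
    rng = random.SystemRandom()  # deliberately no fallback to random.random()
    return "".join(rng.choice(ALPHABET) for _ in range(length))

print(make_password())
```

If the system source is unavailable this fails loudly instead of quietly
handing out predictable passwords.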

 > as does the assumption that (almost) every naive use of randomness
 > is somehow a security vulnerability.

This is a strawman.  None of the advocates of this change makes that
assumption.  The advocates proceed from the (basically unimpeachable)
assumptions that (1) the attacker only has to win once, and (2) they
are out there knocking on a lot of doors.  Then the questionable
assumption is that (3) the attackers are knocking on *this* door.

RC4 was at one time one of the best crypto algorithms available, but
it also induced the WEP fiasco, and a scramble for a new standard.
The question is whether we wait for a "Python security fiasco" to do
something about this situation.  Waiting *is* an option; the arguments
that RNGs won't be a "Python security fiasco" before Python 4 is
released are very plausible[1], and the overhead of a compatibility
break is not negligible (though Paul Moore himself admits it's
probably not huge, either).  But the risk of a security fiasco
(probably in a scenario not mentioned in this thread) is real.  The
arguments of the opponents of the change amount to "I have confirmed
that the probability it will happen to me is very small, therefore the
probability it will happen to anyone is small", which is, of course, a
fallacy.

 > The concept of secure vs. insecure sources of randomness isn't
 > *that* hard to grasp.

Once one *tries*.  Read some of Paul Moore's posts, and you will
discover that the very mention of some practice "improving security"
immediately induces a non-trivial subset of his colleagues to start
thinking about how to avoid doing it.  I am almost not kidding;
according to his descriptions, the situation in the trenches is very
nearly that bad.  Security is evidently hated almost as much as spam.

If random.random were to default to an unseedable nondeterministic
RNG, the scientific users would very quickly discover that (if not on
their own, when their papers get rejected).  On the other hand,
inappropriate uses are nowhere near so lucky.  In the current
situation, the programs Just Work Fine (they produce passwords that no
human would choose for themselves, for example), and no one is the
wiser unless they deliberately seek the information.

It seems to me that, given the "in your face" level of discoverability
that removing the state-access methods would provide, backward
compatibility with existing programs is the only real reason not to
move to "secure" randomness by default.  In fact "secure" randomness
is *higher*-quality for any purpose, including science.

Footnotes: 
[1]  Cf. Tim Peters' posts especially, they're few and where the
information content is low the humor content is high. ;-)


From stephen at xemacs.org  Wed Sep 16 03:30:21 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 Sep 2015 10:30:21 +0900
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <87oah3xiaa.fsf@uwakimon.sk.tsukuba.ac.jp>

Sorry for the self-followup; premature send.

Stephen J. Turnbull writes:

 > In fact "secure" randomness is *higher*-quality for any purpose,
 > including science.

It does need to be acknowledged that scientists need replicability for
unscientific reasons: (1) some "scientists" lie (cf. the STAP cell
controversy), and (2) as a regression test for their simulation
software.  But an exact replication of an "honest" simulation is
scientifically useless!


From stephen at xemacs.org  Wed Sep 16 04:22:58 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 Sep 2015 11:22:58 +0900
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
Message-ID: <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>

A pseudo-randomly selected recent quote:

 > It would never occur to me to reach for the random module if I want
 > to do cryptography.

It's sad that so many of the opponents of this change make this kind
of comment sooner or later.  Security is (rarely) about *any* of *us*!
Most of *us* don't need it (if we do, our physical or corporate
security has already been compromised), most of *us* understand it, a
somewhat smaller fraction of *us* behave in habitually secure ways (at
the level we practice oral hygiene, say).

That doesn't mean that security has to be #1 always and everywhere in
designing Python, but I find it pretty distressing that apparently a
lot of people either don't understand or don't care about what's at
stake in these kinds of decisions *for the rest of the world*.
The reality is that security that is not on by default is not
secure.  Any break in a dike can flood a whole town.

The flip side is that security has costs, specifically the
compatibility break, and since security needs to be on by default, the
aggregate burden should be *presumed large* (even a small burden is
spread over many users).  Nevertheless, I think that the arguments to
justify this change are pretty good:

(1) The cost of adapting per program seems small, and seems to be restricted
    to a class of users (software engineers doing regression testing
    and scientists doing simulations) who probably can easily make the
    change locally.  Nick's proto-PEP is specifically designed so that
    there will be no cost to naive users (kids writing games etc) who
    don't need access to state.

    Caveat: there may be a performance hit for some naive users.  That
    can probably be avoided with an appropriate choice of secure RNG,
    but that hasn't actually been benchmarked AFAIK.

(2) ISTM there are no likely attack vectors due to choice of default
    RNG in random.random, based on Tim's analysis, but AFAICS he's
    unwilling to say it's implausible that they exist.  (Sorry for the
    double negative!)  I take this to mean that there may be real risk.

(3) The anecdotal evidence that the module's current default is
    frequently misused is strong (the StackOverflow recipes for
    password generation).

Two out of three ain't bad.  YMMV, of course.

From tim.peters at gmail.com  Wed Sep 16 05:14:12 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 15 Sep 2015 22:14:12 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>

[Stephen J. Turnbull <stephen at xemacs.org>]
> ...
> (2) ISTM there are no likely attack vectors due to choice of default
>     RNG in random.random, based on Tim's analysis, but AFAICS he's
>     unwilling to say it's implausible that they exist.  (Sorry for the
>     double negative!)  I take this to mean that there may be real risk.

Oh, _many_ attacks are possible.  Many are even plausible.  For
example, while Python's _default_ seeding is based on urandom()
setting MT's entire massive state (no more secure way exists), a user
making up their own seed is quite likely to do so in a way vulnerable
to a "poor seeding" attack.

"Password generators" should be the least of our worries.  Best I can
tell, the PHP paper's highly technical MT attack against those has
scant chance of working in Python except when random.choice(x) is
known to have len(x) a power of 2.  Then it's a very powerful attack.
But in PHP's idiomatic way of spelling random.choice(x) ("by hand",
spelled out in the paper), it's _always_ a very powerful attack.

In general, the more technical the attack, the more details matter.
It's just no _fun_ to drone on about simple universally applicable
brute-force attacks, so I'll continue to drone on about the PHP
paper's sophisticated MT state-deducer ;-)

From ncoghlan at gmail.com  Wed Sep 16 05:40:46 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 13:40:46 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
Message-ID: <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>

On 16 September 2015 at 03:33, Guido van Rossum <guido at python.org> wrote:
> I had to check out of the mega-threads, but I really don't like the outcome
> (unless this PEP is just the first of several competing proposals).
>
> The random module provides a useful interface - a random() function and a
> large variety of derived functionality useful for statistics programming
> (e.g. uniform(), choice(), bivariate(), etc.). Many of these have
> significant mathematical finesse in their implementation. They are all
> accessing shared state that is kept in a global variable in the module, and
> that is a desirable feature (nobody wants to have to pass an extra variable
> just so you can share the state of the random number generator with some
> other code).
>
> I don't want to change this API and I don't want to introduce deprecation
> warnings - the API is fine, and the warnings will be as ineffective as the
> warnings in the documentation.

The proposed runtime warnings are just an additional, harder-to-avoid
nudge for folks that don't read the documentation, so I'd be OK with
dropping them from the proposal. However, it also occurs to me there
may be a better solution to eliminating them than getting people to
change their imports: add a "random.ensure_seedable()" API that flips
the default instance to the deterministic RNG without triggering the
warning.

For applications that genuinely want the determinism, warnings-free
3.6+ compatibility would then look like:

    import random

    if hasattr(random, "ensure_seedable"):  # only exists if this proposal lands
        random.ensure_seedable()

> I am fine with adding more secure ways of generating random numbers. But we
> already have random.SystemRandom(), so there doesn't seem to be a hurry?
>
> How about we make one small change instead: a way to change the default
> instance used by the top-level functions in the random module. Say,
>
>   random.set_random_generator(<instance>)

That was my previous proposal. The problem with it is that it's much
harder to test and support, as you have to allow for the global
instance changing multiple times, and in multiple different
directions.

With the proposal in the PEP, there's only a single idempotent change
that's possible: from the system RNG (used by default to eliminate the
silent security failure) to the seedable RNG (needed for
reproducibility).
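
To make the "single idempotent change" concrete, here is a rough sketch.
The class is a hypothetical stand-in for the module-level default
instance, not the code proposed in the PEP, and it leaves out the
warnings entirely:

```python
import random

class DefaultRNG:
    """Hypothetical stand-in for the random module's default instance."""

    def __init__(self):
        self._inst = random.SystemRandom()  # secure by default

    def ensure_seedable(self):
        # One-way and idempotent: repeated calls change nothing further.
        if isinstance(self._inst, random.SystemRandom):
            self._inst = random.Random()

    def seed(self, a=None):
        self.ensure_seedable()
        self._inst.seed(a)

    def random(self):
        return self._inst.random()
```

Because the only transition is from the system RNG to the seedable one,
there are just two states to test, rather than an arbitrary sequence of
generator swaps.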

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Sep 16 06:00:22 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 14:00:22 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>

On 16 September 2015 at 11:16, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Guido van Rossum writes:
>  > The concept of secure vs. insecure sources of randomness isn't
>  > *that* hard to grasp.
>
> Once one *tries*.  Read some of Paul Moore's posts, and you will
> discover that the very mention of some practice "improving security"
> immediately induces a non-trivial subset of his colleagues to start
> thinking about how to avoid doing it.  I am almost not kidding;
> according to his descriptions, the situation in the trenches is very
> nearly that bad.  Security is evidently hated almost as much as spam.

Yep, hence things like http://stopdisablingselinux.com/

SELinux in enforcing mode operates on a very simple principle: we
should know what system resources we expect our applications to
access, and we should write that down in a form the computer
understands so it can protect us against attackers trying to use that
application to do something unintended (like steal user information).

However, what we've realised as an industry is that effective security
systems have to be *transparent* and they have to be *natural*. So in
a containerised world, SELinux isolates containers from each other,
but if you're writing code that runs *in* the container, you don't
need to worry about it - from inside the container, it looks like
SELinux isn't running.

The traditional security engineering approach of telling people
"You're doing it wrong" just encourages them to avoid talking to
security people [1], rather than encouraging them to improve their
practices [2].

Hence the proposal in PEP 504 - my goal is to make the default
behaviour of the random module cryptographically secure, *without*
unduly affecting the use cases that need reproducibility rather than
cryptographic security, while still providing at least a nudge in the
direction of promoting security awareness. Changing the default
matters more to me than the nudge, so I'd be prepared to drop that
part.

Regards,
Nick.

[1] http://sobersecurity.blogspot.com.au/2015/09/everyone-is-afraid-of-us.html
[2] http://sobersecurity.blogspot.com.au/2015/09/being-nice-security-person.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From njs at pobox.com  Wed Sep 16 06:09:49 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 15 Sep 2015 21:09:49 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJ+=1RWpp9E=F9SSH31L4PDZohtpFRp1bCj2hc0cJdsnwQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <CAP7+vJ+=1RWpp9E=F9SSH31L4PDZohtpFRp1bCj2hc0cJdsnwQ@mail.gmail.com>
Message-ID: <CAPJVwBkqjda36q9i7FWi2w48E8ACwTaxuTQwU+gUOvudOQvuEQ@mail.gmail.com>

On Sep 15, 2015 1:19 PM, "Guido van Rossum" <guido at python.org> wrote:
>
> How about the following. We add a fast secure random generator to the
> stdlib as an option, and when it has proven its worth a few releases from
> now we consider again whether the default random() can be made secure
> without breaking anything.

If we have a fast secure RNG, then the standard Random object might as well
at least use it by default until someone actually sets or reads the state
(and then switch to MT at that point). Until one of these events happens,
the two RNGs are indistinguishable, and this would be a 100% backwards
compatible change. (It might even make sense to backport to 2.7.)
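
A minimal sketch of that switching behaviour, assuming a hypothetical
wrapper class (LazySwitchRandom is an illustrative name, not an existing
or proposed stdlib API) and using SystemRandom as a stand-in for the
"fast secure RNG":

```python
import random


class LazySwitchRandom:
    """Hypothetical wrapper: system RNG until state is touched, then MT."""

    def __init__(self):
        # Stand-in for the fast secure generator discussed above.
        self._impl = random.SystemRandom()

    def _switch_to_mt(self):
        # One-way switch: once on Mersenne Twister, stay there.
        if isinstance(self._impl, random.SystemRandom):
            self._impl = random.Random()

    def seed(self, a=None):
        self._switch_to_mt()
        self._impl.seed(a)

    def getstate(self):
        self._switch_to_mt()
        return self._impl.getstate()

    def setstate(self, state):
        self._switch_to_mt()
        self._impl.setstate(state)

    def __getattr__(self, name):
        # random(), choice(), uniform(), ... go to the current backend.
        return getattr(self._impl, name)


rng = LazySwitchRandom()
unseeded = rng.random()      # served by the system RNG
rng.seed(12345)              # first state access: switch to MT
first = rng.random()
rng.seed(12345)
second = rng.random()        # same value as `first`
```

Code that never calls seed(), getstate() or setstate() only ever sees
the secure backend; code that does gets ordinary reproducible MT output.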

The limitation is that if library A uses the global random object without
seeding in a security sensitive context, and library B uses seeding, then a
program that just uses library A will be secure, but if it then starts
using library B it will become insecure. But this is still better than the
current situation where library A is always insecure.

The only case where this would actually have a downside compared to status
quo (assuming arc4random lives up to its reputation for speed etc) is if
people start assuming that the default random object is in fact secure and
intentionally choosing to use it in security sensitive situations. But
hopefully people who know enough to realize that this is a decision they
need to make will also read the docs where it clearly states that this is
only a best-effort kind of hardening mechanism and that using
random.Random/the global methods for cryptographic purposes is still a bug.

-n

From guido at python.org  Wed Sep 16 06:12:44 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 Sep 2015 21:12:44 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
Message-ID: <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>

On Tue, Sep 15, 2015 at 8:40 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 16 September 2015 at 03:33, Guido van Rossum <guido at python.org> wrote:
> > I had to check out of the mega-threads, but I really don't like the
> > outcome (unless this PEP is just the first of several competing
> > proposals).
> >
> > The random module provides a useful interface - a random() function and a
> > large variety of derived functionality useful for statistics programming
> > (e.g. uniform(), choice(), bivariate(), etc.). Many of these have
> > significant mathematical finesse in their implementation. They are all
> > accessing shared state that is kept in a global variable in the module,
> > and that is a desirable feature (nobody wants to have to pass an extra
> > variable just so you can share the state of the random number generator
> > with some other code).
> >
> > I don't want to change this API and I don't want to introduce deprecation
> > warnings - the API is fine, and the warnings will be as ineffective as
> > the warnings in the documentation.
>
> The proposed runtime warnings are just an additional harder to avoid
> nudge for folks that don't read the documentation, so I'd be OK with
> dropping them from the proposal.


Good, because I really don't want the warnings, nor the hack based on
whether you call any of the seed/state-related methods.


> However, it also occurs to me there
> may be a better solution to eliminating them than getting people to
> change their imports: add a "random.ensure_seedable()" API that flips
> the default instance to the deterministic RNG without triggering the
> warning.
>
> For applications that genuinely want the determinism, warnings free
> 3.6+ compatibility would then look like:
>
>     if hasattr(random, "ensure_seedable"):
>         random.ensure_seedable()
>

I don't believe that seedability is the only thing that matters. MT is also
over an order of magnitude faster than os.urandom() or SystemRandom.
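
A rough way to check that speed gap on a given machine is sketched
below. The helper name time_rng and the call count are illustrative
assumptions, and the measured ratio will vary by platform and Python
version:

```python
import random
import timeit


def time_rng(rng, number=100_000):
    """Return the seconds taken by `number` calls to rng.random()."""
    return timeit.timeit(rng.random, number=number)


mt_seconds = time_rng(random.Random())          # Mersenne Twister
sys_seconds = time_rng(random.SystemRandom())   # os.urandom() backed

print(f"MT: {mt_seconds:.4f}s  SystemRandom: {sys_seconds:.4f}s  "
      f"ratio: {sys_seconds / mt_seconds:.1f}x")
```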


> > I am fine with adding more secure ways of generating random numbers. But
> > we already have random.SystemRandom(), so there doesn't seem to be a
> > hurry?
> >
> > How about we make one small change instead: a way to change the default
> > instance used by the top-level functions in the random module. Say,
> >
> >   random.set_random_generator(<instance>)
>
> That was my previous proposal. The problem with it is that it's much
> harder to test and support, as you have to allow for the global
> instance changing multiple times, and in multiple different
> directions.
>

Actually part of my proposal was a use_secure_random() that was also a
one-way flag flip, just in the opposite direction. :-)

> With the proposal in the PEP, there's only a single idempotent change
> that's possible: from the system RNG (used by default to eliminate the
> silent security failure) to the seedable RNG (needed for
> reproducibility).
>

I'd be much more comfortable if in 3.6 we only introduced a new way to
generate secure random numbers that was as fast as MT. Once that has been
in use for a few releases we may have a discussion about whether it's time
to make it the default.

Security isn't served well by panicky over-reaction.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Sep 16 06:15:22 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 Sep 2015 21:15:22 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAPJVwBkqjda36q9i7FWi2w48E8ACwTaxuTQwU+gUOvudOQvuEQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <CAP7+vJ+=1RWpp9E=F9SSH31L4PDZohtpFRp1bCj2hc0cJdsnwQ@mail.gmail.com>
 <CAPJVwBkqjda36q9i7FWi2w48E8ACwTaxuTQwU+gUOvudOQvuEQ@mail.gmail.com>
Message-ID: <CAP7+vJLVegO+5W=D7UnoUr_cJvVTxr9rXsJYVxkAPSAK18nn+Q@mail.gmail.com>

Clearly I need to mute this thread too. :-(

On Tue, Sep 15, 2015 at 9:09 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Sep 15, 2015 1:19 PM, "Guido van Rossum" <guido at python.org> wrote:
> >
> > How about the following. We add a fast secure random generator to the
> stdlib as an option, and when it has proven its worth a few releases from
> now we consider again whether the default random() can be made secure
> without breaking anything.
>
> If we have a fast secure RNG, then the standard Random object might as
> well at least use it by default until someone actually sets or reads the
> state (and then switch to MT at that point). Until one of these events
> happens, the two RNGs are indistinguishable, and this would be a 100%
> backwards compatible change. (It might even make sense to backport to 2.7.)
>
> The limitation is that if library A uses the global random object without
> seeding in a security sensitive context, and library B uses seeding, then a
> program that just uses library A will be secure, but if it then starts
> using library B it will become insecure. But this is still better than the
> current situation where library A is always insecure.
>
> The only case where this would actually have a downside compared to status
> quo (assuming arc4random lives up to its reputation for speed etc) is if
> people start assuming that the default random object is in fact secure and
> intentionally choosing to use it in security sensitive situations. But
> hopefully people who know enough to realize that this is a decision they
> need to make will also read the docs where it clearly states that this is
> only a best-effort kind of hardening mechanism and that using
> random.Random/the global methods for cryptographic purposes is still a bug.
>
> -n
>



-- 
--Guido van Rossum (python.org/~guido)

From mertz at gnosis.cx  Wed Sep 16 06:16:37 2015
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 15 Sep 2015 21:16:37 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJ+=1RWpp9E=F9SSH31L4PDZohtpFRp1bCj2hc0cJdsnwQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <CAP7+vJ+=1RWpp9E=F9SSH31L4PDZohtpFRp1bCj2hc0cJdsnwQ@mail.gmail.com>
Message-ID: <CAEbHw4Yno9ESz7cNf+0VODxdn2_o-tAdOk_bkxH1eW4oNaxNNg@mail.gmail.com>

Sounds good to me!
On Sep 15, 2015 1:19 PM, "Guido van Rossum" <guido at python.org> wrote:

> How about the following. We add a fast secure random generator to the
> stdlib as an option, and when it has proven its worth a few releases from
> now we consider again whether the default random() can be made secure
> without breaking anything.
>
> On Tue, Sep 15, 2015 at 12:43 PM, David Mertz <mertz at gnosis.cx> wrote:
>
>> I commonly use random.some_distribution() as a quick source of
>> "randomness" knowing full well that it's not cryptographic. Moreover, I
>> usually do so initially without setting a seed.
>>
>> The first question I want to answer is "does this random process behave
>> roughly as I expect?" But in the back of my mind is always the thought,
>> "If/when I want to reuse this I'll add a seed for reproducibility". It
>> would never occur to me to reach for the random module if I want to do
>> cryptography.
>>
>> It's a good and well established API that currently exists. Sure, add a
>> submodule random.crypto (or whatever name), but I'm -1 on changing anything
>> whatsoever on the module functions that are well known.
>> On Sep 15, 2015 11:26 AM, "Random832" <random832 at fastmail.com> wrote:
>>
>>> On Tue, Sep 15, 2015, at 13:33, Guido van Rossum wrote:
>>> > I don't want to change this API and I don't want to introduce
>>> > deprecation warnings - the API is fine, and the warnings will be as
>>> > ineffective as the warnings in the documentation.
>>>
>>> The output of random.random today when it's not seeded / seeded with
>>> None isn't _really_ deterministic - you can't reproduce it, after all,
>>> without modifying the code (though in principle you could do
>>> seed(None)/getstate the first time and then setstate on subsequent
>>> executions - it may be worth supporting this use case?) - so changing it
>>> isn't likely to affect anyone - anyone needing MT is likely to also be
>>> using the seed functions.
>>>
>>> >   random.set_random_generator(<instance>)
>>>
>>> What do you think of having calls to seed/setstate(/getstate?)
>>> implicitly switch (by whatever mechanism) to MT? This could be done
>>> without a deprecation warning, and would allow existing code that relies
>>> on reproducible values to continue working without modification?
>>>
>>> [indirection in global functions]...
>>> > (and similar for all related functions).
>>>
>>> global getstate/setstate should also save/replace the _inst or its type;
>>> at least if it's a different type than it was at the time the state was
>>> saved. For backwards compatibility in case these are pickled it could
>>> use the existing format when _inst is the current MT implementation, and
>>> accept these in setstate.
>>>
>>> > It would also be fine for SystemRandom (or at least whatever is used
>>> > by use_secure_random(), if SystemRandom cannot change for backward
>>> > compatibility reasons) to raise an exception when seed(), setstate()
>>> > or getstate() are called.
>>>
>>> SystemRandom already raises an exception when getstate and setstate are
>>> called.
>
> --
> --Guido van Rossum (python.org/~guido)
>

From mertz at gnosis.cx  Wed Sep 16 06:19:01 2015
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 15 Sep 2015 21:19:01 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
Message-ID: <CAEbHw4aSq9MPu88Nph7=RAO5dSKX-yYuWz-OgeB_h6jDq2+xGA@mail.gmail.com>

The below said, I confess I never really liked random.random() as a name.
Calling it random.uniform() 20 years ago would have been better. But that's
ancient history, and no big deal.
On Sep 15, 2015 12:43 PM, "David Mertz" <mertz at gnosis.cx> wrote:

> I commonly use random.some_distribution() as a quick source of
> "randomness" knowing full well that it's not cryptographic. Moreover, I
> usually do so initially without setting a seed.
>
> The first question I want to answer is "does this random process behave
> roughly as I expect?" But in the back of my mind is always the thought,
> "If/when I want to reuse this I'll add a seed for reproducibility". It
> would never occur to me to reach for the random module if I want to do
> cryptography.
>
> It's a good and well established API that currently exists. Sure, add a
> submodule random.crypto (or whatever name), but I'm -1 on changing anything
> whatsoever on the module functions that are well known.
> On Sep 15, 2015 11:26 AM, "Random832" <random832 at fastmail.com> wrote:
>
>> On Tue, Sep 15, 2015, at 13:33, Guido van Rossum wrote:
>> > I don't want to change this API and I don't want to introduce
>> > deprecation warnings - the API is fine, and the warnings will be as
>> > ineffective as the warnings in the documentation.
>>
>> The output of random.random today when it's not seeded / seeded with
>> None isn't _really_ deterministic - you can't reproduce it, after all,
>> without modifying the code (though in principle you could do
>> seed(None)/getstate the first time and then setstate on subsequent
>> executions - it may be worth supporting this use case?) - so changing it
>> isn't likely to affect anyone - anyone needing MT is likely to also be
>> using the seed functions.
>>
>> >   random.set_random_generator(<instance>)
>>
>> What do you think of having calls to seed/setstate(/getstate?)
>> implicitly switch (by whatever mechanism) to MT? This could be done
>> without a deprecation warning, and would allow existing code that relies
>> on reproducible values to continue working without modification?
>>
>> [indirection in global functions]...
>> > (and similar for all related functions).
>>
>> global getstate/setstate should also save/replace the _inst or its type;
>> at least if it's a different type than it was at the time the state was
>> saved. For backwards compatibility in case these are pickled it could
>> use the existing format when _inst is the current MT implementation, and
>> accept these in setstate.
>>
>> > It would also be fine for SystemRandom (or at least whatever is used
>> > by use_secure_random(), if SystemRandom cannot change for backward
>> > compatibility reasons) to raise an exception when seed(), setstate()
>> > or getstate() are called.
>>
>> SystemRandom already raises an exception when getstate and setstate are
>> called.

From mertz at gnosis.cx  Wed Sep 16 06:27:41 2015
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 15 Sep 2015 21:27:41 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAEbHw4Z_OPtn2cjSMW351Mddw-vbhbVeNFKzEuD8P0MX2Dt8wA@mail.gmail.com>

On Sep 15, 2015 7:23 PM, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
>
> A pseudo-randomly selected recent quote:
>
>  > It would never occur to me to reach for the random module if I want
>  > to do cryptography.

> That doesn't mean that security has to be #1 always and everywhere in
> designing Python, but I find it pretty distressing that apparently a
> lot of people either don't understand or don't care about what's at
> stake in these kinds of decisions *for the rest of the world*.
> The reality is that security that is not on by default is not
> secure.  Any break in a dike can flood a whole town.

This feels somewhere between disingenuous and dishonest. Just like I don't
use the random module for cryptography, I also don't use the socket module
or the threading module for cryptography.

Could a program dealing with sockets have security issues?! Very likely!
Could a multithreaded one expose vulnerabilities? Certainly!

Should we try to "secure" these modules for users who don't need to, or
don't know to, think about security? Absolutely not!

From stephen at xemacs.org  Wed Sep 16 06:47:25 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 Sep 2015 13:47:25 +0900
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
Message-ID: <87lhc7x95u.fsf@uwakimon.sk.tsukuba.ac.jp>

Tim Peters writes:
 > [Stephen J. Turnbull <stephen at xemacs.org>]
 > > ...
 > > (2) ISTM there are no likely attack vectors due to choice of default
 > >     RNG in random.random, based on Tim's analysis, but AFAICS he's
 > >     unwilling to say it's implausible that they exist.  (Sorry for the
 > >     double negative!)  I take this to mean that there may be real risk.
 > 
 > Oh, _many_ attacks are possible.  Many are even plausible.  For
 > example, while Python's _default_ seeding is based on urandom()
 > setting MT's entire massive state (no more secure way exists), a user
 > making up their own seed is quite likely to do so in a way vulnerable
 > to a "poor seeding" attack.

I'm not sure what you mean to say, but I don't count that as "due to
choice of default RNG".  That's foot-shooting of the kind we can't do
anything about anyway, and if *that* is what Nick is worried about,
I'm worried about Nick. ;-)

*I* am more worried about attacks we don't know about yet (or at least
haven't been mentioned in this thread), and maybe even haven't been
invented yet.  I presume Nick is, too.

 > "Password generators" should be the least of our worries.  Best I can
 > tell, the PHP paper's highly technical MT attack against those has
 > scant chance of working in Python except when random.choice(x) is
 > known to have len(x) a power of 2.

That's genuinely comforting to read (even though it's the second or
third time I've read it ;-).  But I'm still nervous about the unknown.

From ncoghlan at gmail.com  Wed Sep 16 07:38:36 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 15:38:36 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAEbHw4Z_OPtn2cjSMW351Mddw-vbhbVeNFKzEuD8P0MX2Dt8wA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAEbHw4Z_OPtn2cjSMW351Mddw-vbhbVeNFKzEuD8P0MX2Dt8wA@mail.gmail.com>
Message-ID: <CADiSq7c_R=iMVtSHvs1S_2_Hwk3c2rhvF4RSCWxtuLW-BSXDtw@mail.gmail.com>

On 16 September 2015 at 14:27, David Mertz <mertz at gnosis.cx> wrote:
>
> On Sep 15, 2015 7:23 PM, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
>>
>> A pseudo-randomly selected recent quote:
>>
>>  > It would never occur to me to reach for the random module if I want
>>  > to do cryptography.
>
>> That doesn't mean that security has to be #1 always and everywhere in
>> designing Python, but I find it pretty distressing that apparently a
>> lot of people either don't understand or don't care about what's at
>> stake in these kinds of decisions *for the rest of the world*.
>> The reality is that security that is not on by default is not
>> secure.  Any break in a dike can flood a whole town.
>
> This feels somewhere between disingenuous and dishonest. Just like I don't
> use the random module for cryptography, I also don't use the socket module
> or the threading module for cryptography.

That's great that you already know not to use the random module for
cryptography. Unfortunately, this is a lesson that needs to be taught
developer by developer: "don't use the random module for security
sensitive tasks". When they ask "Why not?", they get hit with a wall
of confusing arcana about brute force search spaces, and
cryptographically secure random number generators, and get left with a
feeling of dissatisfaction with the explanation because cryptography
is one of the areas of computing where our intuitions break down so it
takes years to retrain our brains to adopt the relevant mindset.
Beginners don't even get that far, as they have to ask "What's a
security sensitive task?" while they're still at a stage where they're
trying to grasp the basic concept of computer generated random numbers
(this is a concrete problem with the current situation, as a warning
that says "Don't use this for <X>" is equivalent to "Don't use this"
if you don't yet know how to identify "<X>").

It's instinctive for humans to avoid additional work when it provides
no immediate benefit to us personally. This is a sensible time
management strategy, but it's proved to be a serious problem in the
context of computer security. An analogy that came up in one of the
earlier threads is this:

* as an individual lottery ticket holder, assuming you're going to win
is a bad assumption
* as a lottery operator, assuming someone, somewhere, is going to win
is a good assumption

Infrastructure security engineers are lottery operators - with
millions of software development projects, millions of businesses
demanding online web presences, and tens of millions of developers
worldwide (with many, many more on the way as computing becomes a
compulsory part of schooling), any potential mistake is going to be
made and exploited eventually, we just have no way of predicting when
or where. Unlike lottery operators (who get to set their prize
levels), we also have no way of predicting the severity of the
consequences.

The *problem* we have is that individual developers are lottery ticket
holders - the probability of *our* particular component being the one
that gets compromised is vanishingly small, so the incentive to
inflict additional work on ourselves to mitigate security concerns is
similarly small (although some folks do it anyway out of sheer
interest, and some have professional incentives to do so).

So let's assume any given component has a 1 in 10000 chance of being
compromised (0.01%). We only have to get to 100k components before the
aggregate chance of at least one component being compromised rises to
almost 100% (around 99.995%). It's at this point the sheer scale of the
internet starts working against us - while it's estimated that there
are currently only around 30 million developers (both
professionals and hobbyists) worldwide, it's further estimated that
there are 3 *billion* people with access to the internet. Neither of
those numbers is going to suddenly start getting smaller, so we start
getting interested in security risks with a lower and lower
probability of being exploited.
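
The aggregate figure can be checked with a short calculation. This is a
sketch that assumes each component is compromised independently with
the same probability (a simplification; the function name is
illustrative):

```python
def chance_of_any_compromise(p_single, n_components):
    """P(at least one of n independent components is compromised)."""
    return 1.0 - (1.0 - p_single) ** n_components


# A 1 in 10000 chance per component, at increasing scale.
for n in (1_000, 10_000, 100_000):
    print(n, f"{chance_of_any_compromise(1 / 10_000, n):.4%}")
```

Under that assumption the 100k case comes out at roughly 99.995%, and
the result keeps approaching 100% as the component count grows.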

Accordingly, we want "don't worry about security" to be the *right
answer* in as many cases as possible - there's always going to be
plenty of unavoidable security risks in any software development
project, so eliminating the avoidable ones by default makes it easier
to focus attention on other areas of potential concern.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Sep 16 07:59:04 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 15:59:04 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
Message-ID: <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>

On 16 September 2015 at 14:12, Guido van Rossum <guido at python.org> wrote:
> Security isn't served well by panicky over-reaction.

Proposing a change in 2015 that wouldn't be released to the public
until early 2017 or so isn't exactly panicking. (And the thing that
changed for me that prompted me to write the PEP was finally figuring
out a remotely plausible migration plan to address the backwards
compatibility concerns, rather than anything on the security side)

As I wrote in the PEP, this kind of problem is a chronic one, not an
acute one, where security engineers currently waste a *lot* of their
(and other people's) time on remedial firefighting - a security audit
(or a breach investigation) detects a vulnerability, high priority
issues get filed with affected projects, nobody goes home happy.

Accordingly, my proposal is aimed as much at eliminating the perennial
"But *why* can't I use the random module for security sensitive
tasks?" argument as it is at anything else. I'd like the answer to
that question to eventually be "Sure, you can use the random module
for security sensitive tasks, so let's talk about something more
important, like why you're collecting and storing all this sensitive
personally identifiable information in the first place".

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tim.peters at gmail.com  Wed Sep 16 09:23:57 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 16 Sep 2015 02:23:57 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <87lhc7x95u.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
 <87lhc7x95u.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAExdVNkmxoqB2CYt2L_vQHzYWCrq8GS5e=GwsrpvmNryt1jr+A@mail.gmail.com>

[Stephen J. Turnbull <stephen at xemacs.org>]
>>> (2) ISTM there are no likely attack vectors due to choice of default
>>>     RNG in random.random, based on Tim's analysis, but AFAICS he's
>>>     unwilling to say it's implausible that they exist.  (Sorry for the
>>>     double negative!)  I take this to mean that there may be real risk.

[Tim]
>> Oh, _many_ attacks are possible.  Many are even plausible.  For
>> example, while Python's _default_ seeding is based on urandom()
>> setting MT's entire massive state (no more secure way exists), a user
>> making up their own seed is quite likely to do so in a way vulnerable
>> to a "poor seeding" attack.

[Stephen]
> I'm not sure what you mean to say,

That the most obvious and easiest of RNG attacks remain possible
regardless of anything that may be done, short of refusing to provide
a seedable generator.


> but I don't count that as "due to choice of default RNG".  That's
> foot-shooting of the kind we can't do anything about anyway, and if
> *that* is what Nick is worried about, I'm worried about Nick. ;-)

Oh no, _nobody_ is worried enough to "do something" about it.  Not really.

Note that in the PHP paper, 10 of the 16 apps scored "full attack" via
pure brute force against poor seeding (figure 13, column 4.3).  That's
probably mostly because the versions of PHP tested inflicted poor
_default_ seeding on users.  I hope so.  But there's no accounting of
which apps did and didn't set their own seeds.  They did note that
"Joomla" attempted to repair a security bug by _removing_ its own
seeding, in 2010.  Which left it open to PHP's poor default seeding
instead - which was nevertheless an improvement.
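
To make the "poor seeding" point concrete, here is a minimal sketch. It
assumes a user seeds MT from a 16-bit value (say, a PID or a coarse
timestamp) and that an observer sees a few raw 32-bit outputs; the
function names and the seed value are illustrative, not taken from the
PHP paper:

```python
import random


def first_outputs(seed, count=4):
    """Return the first `count` 32-bit outputs of MT seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


def brute_force_seed(observed, seed_space):
    """Try every candidate seed until one reproduces the observed outputs."""
    for candidate in seed_space:
        if first_outputs(candidate, len(observed)) == observed:
            return candidate
    return None


secret_seed = 48_213                      # hypothetical weak seed
observed = first_outputs(secret_seed)     # what an attacker gets to see
recovered = brute_force_seed(observed, range(2 ** 16))
```

With a seed space this small the search finishes almost immediately,
and no change of default generator protects a user who seeds this way.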


> *I* am more worried about attacks we don't know about yet (or at least
> haven't been mentioned in this thread), and maybe even haven't been
> invented yet.  I presume Nick is, too.

Fundamentally, I just don't see the sense in saying that someone who
does their own seeding deserves whatever they get, while someone who
uses an inappropriate generator in a security context should be saved
from themself.  I know, I read all the posts about why I'm wrong.  I
just don't buy it.  There's no real substitute for understanding what
you're doing, regardless of field.  Yes, incompetence can cause great
damage.  But I'm not sure it does the world a real favor to possibly
help a programmer incompetent to do a task keep working in the field a
little longer.  This isn't the only damage they can cause, and the
longer they keep working in an area they don't understand the more
damage they can do.  The alternative?  Learn how to use frickin'
SystemRandom.  It's not hard.  Or get work for which they are
competent.
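
For the record, a minimal sketch of what "use SystemRandom" looks like
in practice.  The ALPHABET constant and the make_password helper are
made up for illustration; random.SystemRandom itself is the existing
stdlib class, which draws from the operating system's entropy source
and ignores seeding.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits

def make_password(length=16, rng=None):
    """Return a password drawn from ALPHABET using the OS entropy source."""
    if rng is None:
        rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))

password = make_password()
```

The API is the same as the module-level functions; the only change is
which object choice() is called on.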


>> "Password generators" should be the least of our worries.  Best I can
>> tell, the PHP paper's highly technical MT attack against those has
>> scant chance of working in Python except when random.choice(x) is
>> known to have len(x) a power of 2.

> That's genuinely comforting to read (even though it's the second or
> third time I've read it ;-)

If you read everything I ever wrote, it's the second.

Although you may have _inferred_ it before I ever wrote it, from
Nathaniel's "if I use the base64 or hex alphabets", instinctively
leaping from "hmm ... 2**6 and ... 2**4" to "power of 2".  In which
case it could feel like the third time.

And I used the phrase "power of 2" in a reply to you before, but in a
context wholly unrelated to the PHP paper.  That may even make it feel
like the fourth time.

Always happy to clarify ;-)


> But I'm still nervous about the unknown.

Huh!  I've heard humans are prone to that.  In which case, there will
always be something to be nervous about :-)

From mertz at gnosis.cx  Wed Sep 16 09:43:59 2015
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 16 Sep 2015 00:43:59 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
Message-ID: <CAEbHw4Y4LEbs5cc9ERMcUqxg3P8bN6HY1KGdP-6UUj5_JSstRw@mail.gmail.com>

On Sep 15, 2015 11:00 PM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
> "But *why* can't I use the random module for security sensitive
> tasks?" argument as it is at anything else. I'd like the answer to
> that question to eventually be "Sure, you can use the random module
> for security sensitive tasks, so let's talk about something more
> important, like why you're collecting and storing all this sensitive
> personally identifiable information in the first place".

I believe this attitude makes overall security WORSE, not better. Giving a
false assurance that simply using a certain cryptographic building block
makes your application secure makes it less likely that applications will
undergo genuine security analysis.

Hence I affirmatively PREFER a random module that explicitly proclaims that
it is non-cryptographic.  Someone who figures out enough to use
random.SystemRandom, or a future crypto.random, or the like is more likely
to think about why they are doing so, and what doing so does and does NOT
assure them of.

From robert.kern at gmail.com  Wed Sep 16 10:11:12 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 16 Sep 2015 09:11:12 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
 <55F8588E.7010106@egenix.com>
 <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
Message-ID: <mtb870$qo3$1@ger.gmane.org>

On 2015-09-16 01:43, Tim Peters wrote:
> ...
>
> [Marc-Andre]
>> Ah, now we're getting somewhere :-)
>>
>> If we accept that non-guessable, but deterministic is a good
>> compromise, then adding a cipher behind MT sounds like a reasonable
>> way forward, even as default.
>>
>> For full crypto strength, people would still have to rely on
>> solutions like /dev/urandom or the OpenSSL one (or reseed the
>> default RNG every now and then). All others get the benefit of
>> non-guessable, but keep the ability to seed the default RNG in
>> Python.
>
> I expect the only real reason "new entropy" is periodically mixed in
> to OpenBSD's arc4random() is to guard against that a weakness in
> ChaCha20 may be discovered later.  If there were any known
> computationally feasible way whatsoever to distinguish bare-bones
> ChaCha20's output from a "truly random" sequence, it wouldn't be
> called "crypto" to begin with.

Periodic reseeding also serves to guard against other leaks of information about 
the underlying state that don't come from breaking through the cipher. If an 
attacker manages to deduce the state through side channels, timing attacks on 
the machine, brief physical access, whatever, then reseeding with new entropy 
will limit the damage rather than blithely continuing on with a compromised 
state forever.

It's an important feature of a CSPRNG. Using a crypto output function in your 
PRNG is a necessary but not sufficient condition for security.
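
A toy sketch of the reseeding idea, under loud assumptions: the
ReseedingGenerator class, its reseed_interval parameter and the
SHA-256-in-counter-mode construction are all made up for illustration
and are not a vetted design (real designs such as Fortuna do
considerably more).  It only shows where fresh entropy gets folded
into the state.

```python
import hashlib
import os

class ReseedingGenerator:
    """Toy hash-based generator that mixes in fresh OS entropy periodically."""

    def __init__(self, reseed_interval=1024, entropy_source=os.urandom):
        self._entropy_source = entropy_source
        self._reseed_interval = reseed_interval
        self._key = entropy_source(32)
        self._counter = 0
        self.reseed_count = 0

    def _reseed(self):
        # Fold new entropy into the old key, so a state that leaked earlier
        # stops predicting future output.
        fresh = self._entropy_source(32)
        self._key = hashlib.sha256(self._key + fresh).digest()
        self.reseed_count += 1

    def next_block(self):
        """Return 32 pseudo-random bytes."""
        if self._counter and self._counter % self._reseed_interval == 0:
            self._reseed()
        counter_bytes = self._counter.to_bytes(16, "big")
        self._counter += 1
        return hashlib.sha256(self._key + counter_bytes).digest()
```

An attacker who learns the key at some point can predict output only
until the next reseed, which is the damage-limiting property described
above.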

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From mal at egenix.com  Wed Sep 16 10:21:23 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 16 Sep 2015 10:21:23 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>	<CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>	<55F8588E.7010106@egenix.com>
 <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
Message-ID: <55F92683.5040205@egenix.com>



On 16.09.2015 02:43, Tim Peters wrote:
> ...
> 
> [Marc-Andre]
>> Ah, now we're getting somewhere :-)
>>
>> If we accept that non-guessable, but deterministic is a good
>> compromise, then adding a cipher behind MT sounds like a reasonable
>> way forward, even as default.
>>
>> For full crypto strength, people would still have to rely on
>> solutions like /dev/urandom or the OpenSSL one (or reseed the
>> default RNG every now and then). All others get the benefit of
>> non-guessable, but keep the ability to seed the default RNG in
>> Python.
> 
> I expect the only real reason "new entropy" is periodically mixed in
> to OpenBSD's arc4random() is to guard against that a weakness in
> ChaCha20 may be discovered later.  If there were any known
> computationally feasible way whatsoever to distinguish bare-bones
> ChaCha20's output from a "truly random" sequence, it wouldn't be
> called "crypto" to begin with.
> 
> But reseeding MT every now and again is definitely not suitable for
> crypto purposes.  You would need to reseed at least every 624 outputs,
> and from a crypto-strength seed source.  In which case, why bother
> with MT at all?  You could just as well use the crypto source
> directly.
> 
> 
>> Is there some research on this (MT + cipher or hash) ?
> 
> Oh, sure.  MT's creators noted from the start that it would suffice to
> run MT's outputs through a crypto hash (like your favorite flavor of
> SHA).  That's just as vulnerable to "poor seeding" attacks as plain
> MT, but it's computationally infeasible to deduce the state from any
> number of hashed outputs (unless your crypto hash is at least partly
> invertible, in which case it's not really a crypto hash ;-) ).
> 
> For other approaches, search for CryptMT.  MT's creators suggested a
> number of other schemes over the years.  The simplest throws away the
> "tempering" part of MT (the 4 lines that map the raw state word into a
> mildly scrambled output word - not because it needs to be thrown away,
> but because they think it would no longer be needed given what
> follows).  Then one byte is obtained via grabbing the next MT 32-bit
> output, folding it into a persistent accumulator via multiplication,
> and just revealing the top byte:
> 
>     accum = some_odd_integer
>     while True:
>         accum *= random.getrandbits(32) | 1
>         yield accum >> 24
> 
> I did see one paper suggesting it was possible to distinguish the
> output of that from a truly random sequence given 2**50 consecutive
> outputs (but that's all - still no way to deduce the state).

> [Tim, on CryptMT]
>> I did see one paper suggesting it was possible to distinguish the
>> output of that from a truly random sequence given 2**50 consecutive
>> outputs (but that's all - still no way to deduce the state).
>
> Sorry:  not 2**50 consecutive outputs (which are bytes), but 2**50
> consecutive output bits, so only 2**47 outputs.
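
The accumulator loop quoted above can be made runnable roughly as
follows.  This is only a sketch: the function name accumulator_bytes
and the starting value are made up, the 32-bit mask is an assumption
about the intended word size (Python integers would otherwise grow
without bound), and random.getrandbits(32) still returns MT's
*tempered* output, whereas the scheme as described drops the tempering
step.

```python
import itertools
import random

def accumulator_bytes(rng, accum=1):
    """Yield bytes from the multiply-accumulate filter sketched in the thread."""
    if accum % 2 == 0:
        raise ValueError("accum must be odd")
    while True:
        # Force the multiplier odd, then keep the product to 32 bits.
        accum = (accum * (rng.getrandbits(32) | 1)) & 0xFFFFFFFF
        yield accum >> 24  # reveal only the top byte

stream = accumulator_bytes(random.Random(2015), accum=0x12345679)
first_bytes = list(itertools.islice(stream, 8))
```

Because both factors are odd, the accumulator stays odd and can never
collapse to zero.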

Thanks for the "CryptMT" pointers. I'll do some research after PyCon UK
on this.

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/CRYPTMT/index.html

A quick glimpse at

http://www.ecrypt.eu.org/stream/p3ciphers/cryptmt/cryptmt_p3.pdf

suggests that this is a completely new stream cipher, though it
uses the typical elements (key + non-linear filter + feedback loop).

The approach is interesting, though: they propose a PRNG which
can then get used as a stream cipher by XOR'ing the PRNG output with
the data stream. So the PRNG implies the cipher, not the other way
around, as in many other approaches to CSPRNGs.

That's probably also one of its perceived weaknesses: it's different
than the common approach.

On 16.09.2015 02:55, Tim Peters wrote:
> [Tim]
>> ...
>> Oh, sure.  MT's creators noted from the start that it would suffice to
>> run MT's outputs through a crypto hash (like your favorite flavor of
>> SHA).  That's just as vulnerable to "poor seeding" attacks as plain
>> MT, but it's computationally infeasible to deduce the state from any
>> number of hashed outputs
>
> Although what's "computationally feasible" may well have changed since
> then!  These days I expect even a modestly endowed attacker could
> afford to store an exhaustive table of the 2**32 possible outputs and
> their corresponding hashes.  Then the hashes are 100% invertible via
> simple lookup, so are no better than not hashing at all.

Simply adding a hash doesn't sound like a good idea.
My initial thought was using a (well studied) stream cipher on
the output, not just a hash on the output.
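
The table-lookup argument is easy to demonstrate on a small scale.
The sketch below assumes a 16-bit output space so that it runs
instantly; hash_output and invert are names made up for illustration.
A table for the full 2**32 space would hold 2**32 entries, and at 36
bytes per entry (a 32-byte SHA-256 digest plus a 4-byte value) that is
roughly 155 GB before any truncation or compression.

```python
import hashlib

def hash_output(value, width_bytes=2):
    """SHA-256 of a fixed-width big-endian integer."""
    return hashlib.sha256(value.to_bytes(width_bytes, "big")).digest()

# Precompute digest -> value for every possible 16-bit output.
table = {hash_output(v): v for v in range(2**16)}

def invert(digest):
    """Recover the hashed value by lookup; None if it is not in the table."""
    return table.get(digest)

recovered = invert(hash_output(0xBEEF))
```

The hash itself is not broken here; the input space is simply small
enough to enumerate.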

-- 
Marc-Andre Lemburg
eGenix.com


From cory at lukasa.co.uk  Wed Sep 16 10:28:26 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Wed, 16 Sep 2015 09:28:26 +0100
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNkmxoqB2CYt2L_vQHzYWCrq8GS5e=GwsrpvmNryt1jr+A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
 <87lhc7x95u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNkmxoqB2CYt2L_vQHzYWCrq8GS5e=GwsrpvmNryt1jr+A@mail.gmail.com>
Message-ID: <CAH_hAJGyi4bnvmKCsSJECJPUBudq93S+CNowqO9KaHVN3yR5Pw@mail.gmail.com>

On 16 September 2015 at 08:23, Tim Peters <tim.peters at gmail.com> wrote:
> Fundamentally, I just don't see the sense in saying that someone who
> does their own seeding deserves whatever they get, while someone who
> uses an inappropriate generator in a security context should be saved
> from themself.  I know, I read all the posts about why I'm wrong.  I
> just don't buy it.  There's no real substitute for understanding what
> you're doing, regardless of field.  Yes, incompetence can cause great
> damage.  But I'm not sure it does the world a real favor to possibly
> help a programmer incompetent to do a task keep working in the field a
> little longer.  This isn't the only damage they can cause, and the
> longer they keep working in an area they don't understand the more
> damage they can do.  The alternative?  Learn how to use frickin'
> SystemRandom.  It's not hard.  Or get work for which they are
> competent.

Because that's never how these things go. You usually don't write a
password generator that uses a non-CS PRNG in a security context, get
discovered in the short term, and fired/reprimanded/whatever. Instead,
one of the following things happens:

- you get code review from a reviewer who knows the problem space and
spots the problem. It gets fixed, you get educated, you're better
prepared for the field.
- you get code review from a reviewer who knows the problem space but
*doesn't* spot the problem because Python isn't their first language.
It doesn't get fixed and no-one notices for ten years until the
problem is exploited, but you left the company 8 years ago and are now
Head of Security Engineering at CoolStartupInc.
- you don't get code review, or your reviewer is no better informed on
this topic than you are. The problem doesn't get fixed and no-one
notices ever because your program isn't exploited, or is only
exploited in ways you never find out about because the rest of your
security process sucked too.

This is the ongoing problem with incompetence when it comes to
security: the feedback loop is long and the negative event fires
rarely, so most programmers never experience it. Most engineers have
*never* experienced a security vulnerability in their own project, let
alone had one exploited. Thus, most engineers never get the negative
feedback loop that tells them that they don't know enough to do the
work they're doing.

Look at all the people who get this wrong. Consider haveibeenpwned.com
for a minute. They list a fraction of the website databases that have
been exposed due to security errors. At last count, that list includes
(I removed more than half for the sake of length):

- Adobe
- Ashley Madison
- Snapchat
- Gawker
- NextGenUpdate
- Yandex
- Forbes
- Stratfor
- Domino's
- Yahoo
- Telecom Regulatory Authority of India
- Vodafone
- Sony
- HackingTeam
- Bell
- Minecraft Forum
- UN Internet Governance Forum
- Tesco

Are you telling me that every engineer responsible for these is not
working in the industry any more? I doubt it. In fact, I think most of
these places can't even say which engineer is responsible, and
if they can, odds are good that engineer left long before the problem
was exploited.

So you're right, there is no real substitute for knowing what you're
doing. But we cannot prevent programmers who don't know this stuff
from writing the code that does it. We don't get to set the bar. We
cannot throw GoReadABookOrTwo exceptions when inexperienced
programmers type random.random, much as we would like to.

With that said, we *can* construct an environment where a programmer
has to have actually tried to hurt themselves. They have to have taken
the gun off the desk, loaded it, disabled the safety, pointed it at
their foot, and pulled the trigger. At that point we can say that we
took all reasonable precautions to stop you doing what you did and you
did it anyway: that's entirely on you.

If you disable the safety settings, then frankly you are taking on the
mantle of an expert: you are claiming you knew more than the person
who developed the system, and if you don't then the consequences are
on you. But if you use the defaults then you're just doing the most
obvious thing, and from my perspective that should not be a punishable
offence.

From cory at lukasa.co.uk  Wed Sep 16 10:29:53 2015
From: cory at lukasa.co.uk (Cory Benfield)
Date: Wed, 16 Sep 2015 09:29:53 +0100
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAEbHw4Y4LEbs5cc9ERMcUqxg3P8bN6HY1KGdP-6UUj5_JSstRw@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <CAEbHw4Y4LEbs5cc9ERMcUqxg3P8bN6HY1KGdP-6UUj5_JSstRw@mail.gmail.com>
Message-ID: <CAH_hAJECibCUr0sq0RobDGvyUBsMMji-5n-LA8O0sMd6xbZo9g@mail.gmail.com>

On 16 September 2015 at 08:43, David Mertz <mertz at gnosis.cx> wrote:
> Hence I affirmatively PREFER a random module that explicitly proclaims that
> it is non-cryptographic.  Someone who figures out enough to use
> random.SystemRandom, or a future crypto.random, or the like is more likely
> to think about why they are doing so, and what doing so does and does NOT
> assure them off.

And what about those that don't? Is our position here "screw 'em, and
also screw their users"?

From ncoghlan at gmail.com  Wed Sep 16 10:34:35 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 18:34:35 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAEbHw4Y4LEbs5cc9ERMcUqxg3P8bN6HY1KGdP-6UUj5_JSstRw@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <CAEbHw4Y4LEbs5cc9ERMcUqxg3P8bN6HY1KGdP-6UUj5_JSstRw@mail.gmail.com>
Message-ID: <CADiSq7ezg0ZoCwSNomqDJc3qbsex58OtLt3QV9gT=F+GdgTWnw@mail.gmail.com>

On 16 September 2015 at 17:43, David Mertz <mertz at gnosis.cx> wrote:
>
> On Sep 15, 2015 11:00 PM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>> "But *why* can't I use the random module for security sensitive
>> tasks?" argument as it is at anything else. I'd like the answer to
>> that question to eventually be "Sure, you can use the random module
>> for security sensitive tasks, so let's talk about something more
>> important, like why you're collecting and storing all this sensitive
>> personally identifiable information in the first place".
>
> I believe this attitude makes overall security WORSE, not better. Giving a
> false assurance that simply using a certain cryptographic building block
> makes your application secure makes it less likely applications will fail to
> undergo genuine security analysis.
>
> Hence I affirmatively PREFER a random module that explicitly proclaims that
> it is non-cryptographic.  Someone who figures out enough to use
> random.SystemRandom, or a future crypto.random, or the like is more likely
> to think about why they are doing so, and what doing so does and does NOT
> assure them off.

You're *describing the status quo*. This isn't a new concept, as it's
the way our industry has worked since forever:

1. All the security features are off by default
2. The onus is on individual developers to "just know" when the work
they're doing is security sensitive
3. Once they realise what they're doing is security sensitive
(probably because a security engineer pointed it out), the onus is
*still* on them to educate themselves as to what to do about it

Meanwhile, their manager is pointing at the project schedule demanding
to know why the new feature hasn't shipped yet, and they're in turn
pointing fingers at the information security team blaming them for
blocking the release until the security vulnerabilities have been
addressed.

And that's the *good* scenario, since the only people it upsets are
the people working on the project. In the more typical cases where the
security team doesn't exist, gets overruled, or simply has too many
fires to try to put out, we get
http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/
and http://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/

On the community project side, we take the manager, the product
schedule and the information security team out of the picture, so
folks never even get to find out that there are any problems with the
approach they're taking - they just ship and deploy software, and are
mostly protected by the lack of money involved (companies and
governments are far more interesting as targets than online
communities, so open source projects mainly need to worry about
protecting the software distribution infrastructure that provides an
indirect attack vector on more profitable targets).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From njs at pobox.com  Wed Sep 16 11:02:36 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 16 Sep 2015 02:02:36 -0700
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F92683.5040205@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org>
 <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
 <55F8588E.7010106@egenix.com>
 <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
 <55F92683.5040205@egenix.com>
Message-ID: <CAPJVwBne-ETX+UsdYC4bmbAgGQrTk964H=-jrJGi0NG-qCQ1hw@mail.gmail.com>

On Wed, Sep 16, 2015 at 1:21 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>
>
> On 16.09.2015 02:43, Tim Peters wrote:
>> [Tim, on CryptMT]
>>> I did see one paper suggesting it was possible to distinguish the
>>> output of that from a truly random sequence given 2**50 consecutive
>>> outputs (but that's all - still no way to deduce the state).
>>
>> Sorry:  not 2**50 consecutive outputs (which are bytes), but 2**50
>> consecutive output bits, so only 2**47 outputs.
>
> Thanks for the "CryptMT" pointers. I'll do some research after PyCon UK
> on this.
>
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/CRYPTMT/index.html
>
> A quick glimpse at
>
> http://www.ecrypt.eu.org/stream/p3ciphers/cryptmt/cryptmt_p3.pdf
>
> suggests that this is a completely new stream cipher, though it
> uses the typical elements (key + non-linear filter + feedback loop).

NB that that paper also says that it's patented and requires a license
for commercial use.

> The approach is interesting, though: they propose an PRNG which
> can then get used as stream cipher by XOR'ing the PRNG output with
> the data stream. So the PRNG implies the cipher, not the other way
> around as many other approaches to CSPRNGs.
>
> That's probably also one of its perceived weaknesses: it's different
> than the common approach.

I think you just described the standard definition of a stream cipher?
"Stream cipher" is just the crypto term for a deterministic RNG, that
you XOR with data. (However, it's not a CSPRNG, because those require
seeding schedules and things like that -- check out e.g. Fortuna.)
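
As a structural illustration only: the sketch below XORs data with the
output of a seeded deterministic generator.  It uses random.Random
(Mersenne Twister), which is NOT a secure keystream source; the
xor_with_keystream name and the integer key are made up for the
example.

```python
import random

def xor_with_keystream(data, key):
    """XOR data with a keystream from a seeded generator (MT: NOT secure)."""
    rng = random.Random(key)
    return bytes(b ^ rng.getrandbits(8) for b in data)

ciphertext = xor_with_keystream(b"attack at dawn", key=12345)
plaintext = xor_with_keystream(ciphertext, key=12345)
```

Encryption and decryption are the same operation, since XOR-ing twice
with the same keystream gives back the input.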

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

From p.f.moore at gmail.com  Wed Sep 16 11:42:33 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 16 Sep 2015 10:42:33 +0100
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
Message-ID: <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>

On 16 September 2015 at 05:00, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 16 September 2015 at 11:16, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> Guido van Rossum writes:
>>  > The concept of secure vs. insecure sources of randomness isn't
>>  > *that* hard to grasp.
>>
>> Once one *tries*.  Read some of Paul Moore's posts, and you will
>> discover that the very mention of some practice "improving security"
>> immediately induces a non-trivial subset of his colleagues to start
>> thinking about how to avoid doing it.  I am almost not kidding;
>> according to his descriptions, the situation in the trenches is very
>> nearly that bad.  Security is evidently hated almost as much as spam.
>
> Yep, hence things like http://stopdisablingselinux.com/
>
> SELinux in enforcing mode operates on a very simple principle: we
> should know what system resources we expect our applications to
> access, and we should write that down in a form the computer
> understands so it can protect us against attackers trying to use that
> application to do something unintended (like steal user information).

I don't know if it's still true, but most Oracle database installation
instructions state "disable SELinux" as a basic pre-requisite. This is
a special case of a more general issue, which is that the "assign only
those privileges that you need" principle is impossible to implement
when you are working with proprietary software that contains no
documentation on what privileges it needs, other than "admin rights".
(Actually, I just checked - it looks like the official Oracle docs no
longer suggest disabling SELinux. But I bet they still don't give you
all the information you need to implement a tight security policy
without a fair amount of "try it and see what breaks"...)

Even in open source, people routinely run "sudo pip install". Not
"make the Python site-packages read/write" (which is still wrong, but
at least adheres to the principle of least privilege), but "give me
root access".

How many people get an app for their phone, see "this app needs <long
list of permissions>" and have any option other than to click "yes" or
discard the app? Who does anything with UAC on Windows other than
blindly click "yes" or disable it altogether? Not because they don't
understand the issues (certainly, many don't, but some do) but rather
because there's really no other option?

In these contexts, "security" is the name for "those things I have to
work around to do what I'm trying to do" - by disabling it, or blindly
clicking "yes", or insisting I need admin rights.

Or another example. Due to a password expiry policy combined with a
lack of viable single sign on, I have to change upwards of 50
passwords at least once every 4 weeks in order to be able to do my
job. And the time to do so is considered "overhead" and therefore
challenged regularly. So I spend a lot of time looking to see if I can
automate password changes (which is *definitely* not good practice).
I'm sure others do things like using weak passwords or reusing
passwords. Because the best practice simply isn't practical in that
context.

Nobody in the open source or security good practices communities even
has an avenue to communicate with the groups involved in this sort of
thing. At least as far as I know. I do what I can to raise awareness,
but it's a "grass roots" exercise that typically doesn't reach the
people with the means to actually change anything.

Of course, nobody in this environment uses Python to build
internet-facing web applications, either. So I'm not trying to argue
that this should drive the question of the RNG used in Python. But at
the same time, I am trying to sell Python as a good tool for
automating business processes, writing administrative scripts and
internal applications, etc. So there is a certain link...

Sorry - but it's nice to vent sometimes :-)

Paul

From mal at egenix.com  Wed Sep 16 13:38:05 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 16 Sep 2015 13:38:05 +0200
Subject: [Python-ideas] Should our default random number generator be
 secure?
In-Reply-To: <CAPJVwBne-ETX+UsdYC4bmbAgGQrTk964H=-jrJGi0NG-qCQ1hw@mail.gmail.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>	<CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>	<CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>	<55F6A380.4070609@egenix.com>	<CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>	<55F700C4.4030900@egenix.com>
 <mt75tq$bn1$1@ger.gmane.org>	<mt7bl5$7hk$1@ger.gmane.org>
 <mt7c93$gns$1@ger.gmane.org>	<mt7dkd$87i$1@ger.gmane.org>	<20150915035334.GF31152@ando.pearwood.info>	<CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>	<55F7DAAE.5010401@egenix.com>	<CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>	<55F8588E.7010106@egenix.com>	<CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>	<55F92683.5040205@egenix.com>
 <CAPJVwBne-ETX+UsdYC4bmbAgGQrTk964H=-jrJGi0NG-qCQ1hw@mail.gmail.com>
Message-ID: <55F9549D.4020006@egenix.com>

On 16.09.2015 11:02, Nathaniel Smith wrote:
> On Wed, Sep 16, 2015 at 1:21 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>>
>>
>> On 16.09.2015 02:43, Tim Peters wrote:
>>> [Tim, on CryptMT]
>>>> I did see one paper suggesting it was possible to distinguish the
>>>> output of that from a truly random sequence given 2**50 consecutive
>>>> outputs (but that's all - still no way to deduce the state).
>>>
>>> Sorry:  not 2**50 consecutive outputs (which are bytes), but 2**50
>>> consecutive output bits, so only 2**47 outputs.
>>
>> Thanks for the "CryptMT" pointers. I'll do some research after PyCon UK
>> on this.
>>
>> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/CRYPTMT/index.html
>>
>> A quick glimpse at
>>
>> http://www.ecrypt.eu.org/stream/p3ciphers/cryptmt/cryptmt_p3.pdf
>>
>> suggests that this is a completely new stream cipher, though it
>> uses the typical elements (key + non-linear filter + feedback loop).
> 
> NB that that paper also says that it's patented and requires a license
> for commercial use.

Hmm, you're right:

"""
If CryptMT is selected as one of the recommendable stream ciphers by
eSTREAM, then it is free even for commercial use.
"""

Hasn't happened yet, but perhaps either eSTREAM or the authors
will change their minds.

Too bad they haven't yet, since it's a pretty fast stream cipher :-(

Anyway, here's a paper on CryptMT:

http://cryptography.gmu.edu/~jkaps/download.php?docid=1083

>> The approach is interesting, though: they propose a PRNG which
>> can then get used as stream cipher by XOR'ing the PRNG output with
>> the data stream. So the PRNG implies the cipher, not the other way
>> around as many other approaches to CSPRNGs.
>>
>> That's probably also one of its perceived weaknesses: it's different
>> than the common approach.
> 
> I think you just described the standard definition of a stream cipher?
> "Stream cipher" is just the crypto term for a deterministic RNG, that
> you XOR with data. (However it's not a CSPRNG, because those require
> seeding schedules and things like that -- check out e.g. Fortuna.)

The standard definition I know reads like this:

"""
Stream ciphers are an important class of encryption algorithms. They encrypt individual
characters (usually binary digits) of a plaintext message one at a time, using an encryption
transformation which varies with time.
"""
(taken from Chap 6.1 Introduction of "Handbook of Applied Cryptography";
http://cacr.uwaterloo.ca/hac/about/chap6.pdf)

That's a bit more general than what you describe, since the keystream
can pretty much be generated in arbitrary ways.

What I wanted to emphasize is that a common way of coming up
with a stream cipher is to use an existing block cipher which you
then transform into a stream cipher. See e.g.

https://www.emsec.rub.de/media/crypto/attachments/files/2011/03/hudde.pdf

E.g. take AES run in CTR (counter) mode: it applies AES repeatedly
to the values of a simple counter as "RNG".

Running MT + AES would result in a similar setup, except that the
source would have somewhat better qualities and would be based
on standard well studied technology, albeit slower than going
straight for a native stream cipher.
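
To make the counter-mode construction concrete, here is a minimal
sketch. The standard library has no AES, so HMAC-SHA256 stands in for
the block cipher; that substitution and the names keystream_block and
xor_stream are assumptions for illustration only, not a vetted cipher.

```python
import hashlib
import hmac

BLOCK_SIZE = 32  # bytes produced per counter value by HMAC-SHA256


def keystream_block(key: bytes, counter: int) -> bytes:
    """Return one keystream block for the given counter value."""
    # Stand-in for AES(key, counter): a keyed function applied to a counter.
    return hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()


def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = keystream_block(key, offset // BLOCK_SIZE)
        chunk = data[offset:offset + BLOCK_SIZE]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


message = b"counter mode turns a block function into a stream cipher"
ciphertext = xor_stream(b"example key", message)
```

Applying xor_stream a second time with the same key returns the
original message, which is the "PRNG output XOR data" idea in its
simplest form.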

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 16 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               2 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         10 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From stephen at xemacs.org  Wed Sep 16 14:43:23 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 Sep 2015 21:43:23 +0900
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNkmxoqB2CYt2L_vQHzYWCrq8GS5e=GwsrpvmNryt1jr+A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
 <87lhc7x95u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNkmxoqB2CYt2L_vQHzYWCrq8GS5e=GwsrpvmNryt1jr+A@mail.gmail.com>
Message-ID: <8737yey1p0.fsf@uwakimon.sk.tsukuba.ac.jp>

Tim Peters writes:

 > Fundamentally, I just don't see the sense in saying that someone
 > who does their own seeding deserves whatever they get, while
 > someone who uses an inappropriate generator in a security context
 > should be saved from themself.

Strawman, or imprecise quotation if you prefer.  Nobody said they
*deserve* it AFAICR; I said we can't stop them.  Strictly speaking,
yes, we could.  We could (and *I* think we *should*) make it much less
obvious how to do it by removing the seed method and the seed argument
to __init__.  The problem there is backward compatibility.  I don't
see that Guido would stand for it.  Dis here homeboy not a-gonna stick
mah neck out heeya, suh.

I suspect we might also want to provide helper functions to construct
a state from a seed as used by some other simulation package, such as
Python 3.4. ;-)  Name them and document them as for use in replicating
simulations done from those seeds.  Nice self-documenting names like
"construct_rng_internal_state_from_python_3_4_compatible_seed".  There
should be one for each version of Python, too (sssh! don't confuse the
users with abstractions like "identical implementation").

 > There's no real substitute for understanding what you're doing,
 > regardless of field.  Yes, incompetence can cause great damage.
 > But I'm not sure it does the world a real favor to possibly help a
 > programmer incompetent to do a task keep working in the field a
 > little longer.

"Think of it as evolution in action."  Yeah, I sympathize.  But
realistically, Darwinian selection will take geological time, no?
That is, in almost all cases where disaster strikes, the culprit has
long since moved on[1].  Whoever gets the sack is unlikely to be him
or her.  More likely it will be whoever has been telling the shop that
their product is an accident waiting to happen. :-(

The way I think about it, though, is a variation on a theme by Nick.
Specifically, the more attractive nuisances we can eliminate, the
fewer things the uninitiated need to learn.


Footnotes: 
[1]  That's especially true in Japan, where I live.  "Whodunnit" also
gets fuzzed up by the tendency to group work and group think, and a
value system that promotes "getting along with others" more than
expertise.  Child-proof caps are a GoodThang[tm]. ;-)


From ncoghlan at gmail.com  Wed Sep 16 14:53:53 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Sep 2015 22:53:53 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
Message-ID: <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>

On 16 September 2015 at 19:42, Paul Moore <p.f.moore at gmail.com> wrote:
> Nobody in the open source or security good practices communities even
> has an avenue to communicate with the groups involved in this sort of
> thing.

Fortunately, that's no longer the case. Open source based development
models are going mainstream, and while there's still a lot of work to
do, cases like the US Federal government requiring the creation of
open source prototypes as part of a bidding process are incredibly
heartening (https://18f.gsa.gov/2015/08/28/announcing-the-agile-BPA-awards/).

On the security side, folks are realising that the "You can't do that,
it's a security risk" model is a bad one, and hence favoring switching
to a model more like "We can help you to minimise your risk exposure
while still enabling you to do what you want to do".

So while it's going to take time for practices like those described in
https://playbook.cio.gov/ to become a description of "the way the IT
industry typically works", the benefits are so remarkable that it's a
question of "when" rather than "if".

> Of course, nobody in this environment uses Python to build
> internet-facing web applications, either. So I'm not trying to argue
> that this should drive the question of the RNG used in Python. But at
> the same time, I am trying to sell Python as a good tool for
> automating business processes, writing administrative scripts and
> internal applications, etc. So there is a certain link...

Right, helping Red Hat's Python maintenance team to maintain that kind
of balance is one aspect of my day job, hence my interest in
https://www.python.org/dev/peps/pep-0493/ as a nicer migration path
when backporting the change to verify HTTPS certificates by default.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From robert.kern at gmail.com  Wed Sep 16 15:25:22 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 16 Sep 2015 14:25:22 +0100
Subject: [Python-ideas] Should our default random number generator be
	secure?
In-Reply-To: <55F9549D.4020006@egenix.com>
References: <87lhcaz5gs.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVNmue_3obYf5Yhh70vHidRCvQv1N7WT-WF0qhmqLQ73CBw@mail.gmail.com>
 <CAPJVwBkHQuEnEHB4A3CC4wTkuNO6Zj+wmwwnsjoWGDGhF+mC5w@mail.gmail.com>
 <55F6A380.4070609@egenix.com>
 <CAPJVwBmf6ttfc1=xLq4M3vRc65=4k2OPCatGgU3mgzOzAwctZQ@mail.gmail.com>
 <55F700C4.4030900@egenix.com> <mt75tq$bn1$1@ger.gmane.org>
 <mt7bl5$7hk$1@ger.gmane.org> <mt7c93$gns$1@ger.gmane.org>
 <mt7dkd$87i$1@ger.gmane.org> <20150915035334.GF31152@ando.pearwood.info>
 <CAPJVwBnhLVj26pZuTXE1Acwge-YLWXubCi-91b-bK1LSsu1j8g@mail.gmail.com>
 <55F7DAAE.5010401@egenix.com>
 <CAExdVNmWgCZbTSGTC+ZuMGDz33Cs+fQMEX0is5bLh2ukeLCakQ@mail.gmail.com>
 <55F8588E.7010106@egenix.com>
 <CAExdVNmfZrR6j1yCTraeFEngYXOJDKBHe2xDfCFicgy1m-s2Rw@mail.gmail.com>
 <55F92683.5040205@egenix.com>
 <CAPJVwBne-ETX+UsdYC4bmbAgGQrTk964H=-jrJGi0NG-qCQ1hw@mail.gmail.com>
 <55F9549D.4020006@egenix.com>
Message-ID: <mtbqk2$c61$1@ger.gmane.org>

On 2015-09-16 12:38, M.-A. Lemburg wrote:

> What I wanted to emphasize is that a common way of coming up
> with a stream cipher is to use an existing block cipher which you
> then transform into a stream cipher. See e.g.
>
> https://www.emsec.rub.de/media/crypto/attachments/files/2011/03/hudde.pdf
>
> E.g. take AES run in CTR (counter) mode: it applies AES repeatedly
> to the values of a simple counter as "RNG".

Indeed. DE Shaw has done the analysis for you:

https://www.deshawresearch.com/resources_random123.html

> Running MT + AES would result in a similar setup, except that the
> source would have somewhat better qualities and would be based
> on standard well studied technology, albeit slower than going
> straight for a native stream cipher.

Why do you think it would have better qualities? You'll have to redo the 
analysis that makes MT and AES each so well-studied, and I'm not sure that all 
of the desirable properties of either will survive the combination.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From guido at python.org  Wed Sep 16 16:26:05 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 Sep 2015 07:26:05 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
Message-ID: <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>

There's still way too much chatter, and a lot that seems just rhetoric.
This is not the republican primaries.

Yes lots of companies got hacked. What's the evidence that a language's
default RNG was involved? IIUC the best practice for password encryption
(to make cracking using a large word list harder) is something called
bcrypt; maybe next year something else will become popular, but the default
RNG seems an unlikely candidate. I know that in the past the randomness of
certain protocols was compromised because the seeding used a timestamp that
an attacker could influence or guess. But random.py seeds MT from
os.urandom(2500). So what's the class of vulnerabilities where the default
RNG is implicated?

Tim's proposal is simple: create a new module, e.g. saferandom, with the
same API as random (less seed/state). That's it. Then it's a simple import
change away to do the right thing, and we have years to seed StackOverflow
with better information before that code even hits the road. (But a
backport to Python 2.7 could be on PyPI tomorrow!)
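
A rough sketch of what such a module could look like, assuming it
simply re-exports the methods of a SystemRandom instance. The module
contents below are an assumption for illustration, not an agreed
design; the only fixed point is that seed(), getstate() and setstate()
are left out.

```python
import random as _random

# One shared instance backed by the operating system's RNG.
_inst = _random.SystemRandom()

# Same names as the random module, minus seed/getstate/setstate.
random = _inst.random
uniform = _inst.uniform
randrange = _inst.randrange
randint = _inst.randint
choice = _inst.choice
sample = _inst.sample
shuffle = _inst.shuffle
```

Code that only draws values would then change one import line and
nothing else.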

-- 
--Guido van Rossum (python.org/~guido)

From tim.peters at gmail.com  Wed Sep 16 17:47:30 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 16 Sep 2015 10:47:30 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
Message-ID: <CAExdVNny9tZqzc_NZ3UORn+c5Nt=OC0CEz3eOvEg9YxbvmGp6A@mail.gmail.com>

[Guido]
> There's still way too much chatter, and a lot that seems just rhetoric. This
> is not the republican primaries.

Which is a shame, since the chatter here is of much higher quality
than in the actual primaries ;-)


> Yes lots of companies got hacked. What's the evidence that a language's
> default RNG was involved?

Nobody cares whether there's evidence of actual harm.  Just that there
_might_ be, and even if none identifiable now, then maybe in the
future.

There is evidence of actual harm from RNGs doing poor _seeding_ by
default, but Python already fixed that (I know, you already know that
;-) ).

And this paper, from a few years ago, studying RNG vulnerabilities in
PHP apps, is really good:

     https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

An interesting thing is that several of the apps already had a history
of trying to fix security-related holes related to RNG (largely due to
PHP's poor default seeding), but remained easily cracked.

The primary recommendation there wasn't to make PHP's various PRNGs
"crypto by magic", but for core PHP to supply "a standard" crypto RNG
for people to use instead.  As above, some of the app developers
already knew darned well they had a history of RNG-related holes, but
simply had no standard way to address it, and didn't have the _major_
expertise needed to roll their own.


> IIUC the best practice for password encryption (to
> make cracking using a large word list harder) is something called bcrypt;
> maybe next year something else will become popular, but the default RNG
> seems an unlikely candidate. I know that in the past the randomness of
> certain protocols was compromised because the seeding used a timestamp that
> an attacker could influence or guess. But random.py seeds MT from
> os.urandom(2500). So what's the class of vulnerabilities where the default
> RNG is implicated?

1. Users doing their own poor seeding.

2. A hypothetical MT state-deducer (seemingly needing to be
   considerably more sophisticated than the already mondo
   sophisticated one in the paper above) to be of general use
   against Python.

3. "Prove there can't be any in the future.  Ha!  You can't." ;-)
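
Point 1 is easy to demonstrate. The sketch below assumes an
application that seeds its own generator with a Unix timestamp; the
timestamp value and the names make_token and recover_seed are made up
for illustration.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def make_token(seed: int, length: int = 16) -> str:
    """Generate a token from a Mersenne Twister seeded with `seed`."""
    rng = random.Random(seed)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def recover_seed(token: str, candidates: range):
    """Return the first candidate seed that reproduces `token`, if any."""
    for candidate in candidates:
        if make_token(candidate, len(token)) == token:
            return candidate
    return None


issued_at = 1442418450  # made-up Unix timestamp used as the seed
token = make_token(issued_at)

# The attacker only knows the token was issued within about an hour.
guess = recover_seed(token, range(issued_at - 1800, issued_at + 1800))
```

The search covers only 3600 candidate seeds, so the quality of the
generator behind the seed never comes into play.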


> Tim's proposal is simple: create a new module, e.g. saferandom, with the
> same API as random (less seed/state). That's it. Then it's a simple import
> change away to do the right thing, and we have years to seed StackOverflow
> with better information before that code even hits the road. (But a backport
> to Python 2.7 could be on PyPI tomorrow!)

Which would obviously be fine by me:  make the distinction obvious at
import time, make "the safe way" dead easy and convenient to use, give
it a new name engineered to nudge newbies away from the "unsafe" (by
contrast) `random`, and a new name easily discoverable by web search.

There's something else here:  some of these messages gave pointers to
web pages where "security wonks" conceded that specific uses of
SystemRandom were fine, but they couldn't recommend it anyway because
it's too hard to explain what is or isn't "safe".  "Therefore" users
should only use urandom() directly.  Which is insane, if for no other
reason than that users would then invent their own algorithms to
convert urandom() results into floats and ints, etc.  Then they'll
screw up _that_ part.
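
One classic way to screw up that part is modulo bias. The sketch
below counts, for every possible byte value, which die face a naive
"byte % 6" conversion would produce. The die example is an assumption
chosen for illustration.

```python
from collections import Counter
import random

# Map every possible byte value to a die face with a naive modulo.
naive_counts = Counter(byte % 6 for byte in range(256))

# 256 is not a multiple of 6, so some faces get one more byte value
# than others and come up slightly more often.
bias = max(naive_counts.values()) - min(naive_counts.values())

# Using the existing higher level method avoids hand-rolling this.
fair_roll = random.SystemRandom().randrange(6)
```

Faces 0 to 3 are each hit by 43 byte values and faces 4 and 5 by only
42, so the hand-rolled conversion is measurably non-uniform even
though the underlying bytes are fine.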

But if "saferandom" were its own module, then over time it could
implement its own "security wonk certified" higher level (than raw
bytes) methods.  I suspect it would never need to change anything from
what the SystemRandom class does, but I'm not a security wonk, so I
know nothing.  Regardless, _whatever_ changes certified wonks deemed
necessary in the future could be confined to the new module, where
incompatibilities would only annoy apps using that module.  Ditto
whatever doc changes were needed.  Also gone would be the inherent
confusion from needing to draw distinctions between "safe" and
"unsafe" in a single module's docs (which any by-magic scheme would
only make worse).

However, supplying a powerful and dead-simple-to-use new module would
indeed do nothing to help old code entirely by magic.  That's a
non-goal to me, but appears to be the _only_ deal-breaker goal for the
advocates.

Which is why none of us is the BDFL ;-)

From ncoghlan at gmail.com  Wed Sep 16 17:54:24 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Sep 2015 01:54:24 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
Message-ID: <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>

On 17 September 2015 at 00:26, Guido van Rossum <guido at python.org> wrote:
> There's still way too much chatter, and a lot that seems just rhetoric. This
> is not the republican primaries.

There was still a fair bit of useful feedback in there, so I pushed a
new version of the PEP that addresses it:

* the submodule idea is gone
* the module level API still delegates to random._inst at call time
rather than import time
* random._inst is a SystemRandom() instance by default
* there's a new ensure_repeatable() API to switch it back to random.Random()
* seed(), getstate() and setstate() all implicitly call ensure_repeatable()
* the latter issue a warning recommending calling ensure_repeatable() explicitly

The key user experience difference from the status quo is that this
allows the "not suitable for security purposes" warning to be moved to
a section specifically covering ensure_repeatable(), seed(),
getstate() and setstate() rather than automatically applying to the
entire random module.

The reason it becomes reasonable to move the warning is that it
changes the failure mode from "any use of the module API for security
sensitive purposes is a problem" to "any use of the module API for
security sensitive purposes is a problem if the application also calls
random.ensure_repeatable()".

> Yes lots of companies got hacked. What's the evidence that a language's
> default RNG was involved? IIUC the best practice for password encryption (to
> make cracking using a large word list harder) is something called bcrypt;
> maybe next year something else will become popular, but the default RNG
> seems an unlikely candidate. I know that in the past the randomness of
> certain protocols was compromised because the seeding used a timestamp that
> an attacker could influence or guess. But random.py seeds MT from
> os.urandom(2500). So what's the class of vulnerabilities where the default
> RNG is implicated?

Reducing the search space for brute force attacks on things like:

* randomly generated default passwords
* password reset tokens
* session IDs

The PHP paper covered an attack on password reset tokens.

Python's seeding is indeed much better, and Tim's mathematical skills
are infinitely better than mine so I'm never personally going to win a
war of equations with him. If you considered a conclusive proof of a
break specifically targeting *CPython's* PRNG essential before
considering changing the default behaviour (even given the almost
entirely backwards compatible approach I'm now proposing), I'd defer
the PEP with a note suggesting that practical attacks on security
tokens generated with CPython's PRNG may be a topic of potential
interest to the security community.

The PEP would then stay deferred until someone actually did the
research and demonstrated a practical attack.

> Tim's proposal is simple: create a new module, e.g. saferandom, with the
> same API as random (less seed/state). That's it. Then it's a simple import
> change away to do the right thing, and we have years to seed StackOverflow
> with better information before that code even hits the road. (But a backport
> to Python 2.7 could be on PyPI tomorrow!)

If folks are reaching for a third party library anyway, we'd be better
off pointing them at one of the higher level ones like passlib or
cryptography.

There's also the aspect that something I'd now like to achieve is to
eliminate the security warning that is one of the first things people
currently see when they open up the random module documentation:
https://docs.python.org/3/library/random.html

While I think that warning is valuable given the current default
behaviour, it's also inherently user hostile for beginners that
actually *do* read the docs, as it raises questions they don't know
how to answer: "The pseudo-random generators of this module should not
be used for security purposes. Use os.urandom() or SystemRandom if you
require a cryptographically secure pseudo-random number generator."

Switching the default means that the question to be asked is instead
"Do you need repeatability?", which is a *much* easier question, and we
only need to ask it in the documentation for ensure_repeatable() and
the related functions that call that implicitly.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From donald at stufft.io  Wed Sep 16 17:54:30 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 16 Sep 2015 11:54:30 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNny9tZqzc_NZ3UORn+c5Nt=OC0CEz3eOvEg9YxbvmGp6A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CAExdVNny9tZqzc_NZ3UORn+c5Nt=OC0CEz3eOvEg9YxbvmGp6A@mail.gmail.com>
Message-ID: <etPan.55f990b6.35ccbfa8.6557@Draupnir.home>

On September 16, 2015 at 11:48:12 AM, Tim Peters (tim.peters at gmail.com) wrote:
> There's something else here: some of these messages gave pointers to
> web pages where "security wonks" conceded that specific uses of
> SystemRandom were fine, but they couldn't recommend it anyway because
> it's too hard to explain what is or isn't "safe". "Therefore" users
> should only use urandom() directly. Which is insane, if for no other
> reason than that users would then invent their own algorithms to
> convert urandom() results into floats and ints, etc. Then they'll
> screw up _that_ part.

That was the documentation for PyCA's cryptography module, where the only uses
of random we needed were for an IV (for which you can use the output of
os.urandom directly) and for an integer, for which you can just use
int.from_bytes on the output of os.urandom (i.e.
int.from_bytes(os.urandom(20), byteorder="big")).

It wasn't so much a general recommendation against random.SystemRandom. It's
just that for our particular use case os.urandom is fine either by itself or
with a tiny bit of code on top of it, and it's easier to explain that and warn
against the random module entirely than to try to explain how to use the
random module safely.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From steve at pearwood.info  Wed Sep 16 17:54:13 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 17 Sep 2015 01:54:13 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
Message-ID: <20150916155412.GJ31152@ando.pearwood.info>

On Wed, Sep 16, 2015 at 03:59:04PM +1000, Nick Coghlan wrote:

[...]
> Accordingly, my proposal is aimed as much at eliminating the perennial
> "But *why* can't I use the random module for security sensitive
> tasks?" argument as it is at anything else. I'd like the answer to
> that question to eventually be "Sure, you can use the random module
> for security sensitive tasks, so let's talk about something more
> important, like why you're collecting and storing all this sensitive
> personally identifiable information in the first place".

The answer to that question is *already* "sure you can use the random 
module". You just have to use it correctly.

[Aside: do you think that, having given companies and people a "secure 
by default" solution that will hopefully prevent data breaches, that 
they will be *more* or *less* open to the idea that they shouldn't be 
collecting this sensitive information?]

We've spent a long time talking about random() with regard to security, 
but nobody exposes the output of random directly. They use it as a 
building block to generate tokens and passwords, and *that's* where the 
breach is occurring. We shouldn't care so much about the building blocks 
and care more about the high-level tools: the batteries included.

Look at the example given by Nathaniel:

https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

What was remarkable about this is how many individual factors were 
involved in the attacks. It wasn't merely an attack on Mersenne Twister, 
and it is quite possible that had any of the other factors been changed, 
the attacks would have failed.

E.g. the applications used MD5 hashes. What if they had used SHA-1? They 
leaked sensitive information such as PIDs and exposed the time that the 
random numbers were generated. They allowed the attackers to get as 
many connections as they wanted.

Someone might argue that none of those other problems would matter if 
the PRNG was more secure. That's true, up to a point: you never know 
when somebody will come up with an attack on the CSPRNG. Previous 
generations of CSPRNGs, including RC4, have been "retired", and we must 
expect that the current generation will be too. It is a good habit to 
avoid leaking this sort of information (times, PIDs etc) even if you 
don't have a concrete attack in place, because you don't know when a 
concrete attack will be discovered. Today's CSPRNG is tomorrow's 
hopelessly insecure PRNG, but defence in depth is always useful.

I propose that instead of focusing on changing the building 
blocks that people will use by default, we provide them with ready made 
batteries for the most common tasks, and provide a clear example of 
acceptable practices for making their own batteries. (As usual, the 
standard lib will provide batteries, and third-party frameworks or 
libraries can provide heavy-duty nuclear reactors.)

I propose:

- The random module's API is left as-is, including the default PRNG. 
  Backwards compatibility is important, code-churn is bad, and there are 
  good use-cases for a non-CSPRNG.

- We add at least one CSPRNG. I leave it to the crypto-wonks to decide 
  which.

- We add a new module, which I'm calling "secrets" (for lack of a better 
  name) to hold best-practice security-related functions. To start with,
  it would have at least these three functions: one battery, and two 
  building blocks:

  + secrets.token to create password recovery tokens or similar;

  + secrets.random calls the CSPRNG; it just returns a random number 
    (integer?). There is no API for getting or setting the state, 
    setting the seed, or returning values from non-uniform 
    distributions;

  + secrets.choice similarly uses the CSPRNG.
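
To give a feel for the size of the thing, here is a minimal sketch of
those three functions written as plain functions on top of os.urandom
and SystemRandom. The signatures, defaults and the name random_int
(standing in for secrets.random) are my assumptions; none of this is
a settled API.

```python
import binascii
import os
import random as _random

# Shared generator backed by the operating system's RNG.
_sysrand = _random.SystemRandom()


def token(nbytes: int = 32) -> str:
    """Return a hex token built from `nbytes` bytes of OS randomness."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


def random_int(bits: int = 64) -> int:
    """Return a random non-negative integer with at most `bits` bits."""
    return _sysrand.getrandbits(bits)


def choice(seq):
    """Return one element of a non-empty sequence, chosen via the OS RNG."""
    return _sysrand.choice(seq)


reset_token = token(16)
```

Note there is deliberately no seed or state API anywhere in the
sketch.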


Developers will still have to make a choice: "do I use secrets, or 
random?" If they're looking for a random token (or password?), the 
answer is obvious: use secrets, because the battery is already there. 
For reasons that I will go into below, I don't think that requiring this 
choice is a bad thing. I think it is a *good* thing.

secrets becomes the go-to module for things you want to keep secret. 
random remains the module you use for games and simulations.

If there is interest in this proposed secrets module, I'll write up a 
proto-PEP over the weekend, and start a new thread for the benefit of 
those who have muted this one.

You can stop reading now. The rest is motivational rather than part of 
the concrete proposal.


Still here? Okay.


I think that it is a good thing to have developers explicitly make a 
choice between random and secrets. I think it is important that we 
continue to involve developers in security thinking. I don't believe 
that "secure by default" is possible at the application level, and 
that's what really matters. It doesn't matter if the developer uses a 
"secure by default" CSPRNG if the application leaks information some 
other way. We cannot possibly hope to solve application security from 
the bottom-up (although providing good secure tools is part of the 
solution).

I believe that computer security is to the IT world what occupational 
health and safety is to the farming, building and manufacturing 
industries (and others). The thing about security is that, like safety, 
it is not a product. There is no function you can call to turn security 
on, no secure=True setting. It is a process and a mind-set, and everyone 
involved needs to think about it, at least a little bit.

It took a long time for the blue collar industries to accept that OH&S 
was something that *everyone* has to be involved in, from the government 
setting standards to individual workers who have to keep their eyes open 
while on the job. Like the IT industry today, management's attitude was 
that safety was a cost that just ate into profits and made projects late 
(sound familiar?), and the workers' attitude was all too often "it won't 
happen to us".

It takes experience and training and education to recognise dangerous 
situations on the job, and people die when they don't get that training. 
It is part of every person's job to think about what they are doing.

I don't believe that it is possible to have "zero-thought security" any 
more than it is possible to have "zero-thought safety". The security 
professionals can help by providing ready-to-use tools, but the 
individual developers still have to use those tools correctly, and 
cultivate a properly careful mindset:

"If I wanted to break into this application, what information would I 
look for? How can I stop providing it? Am I using the right tool for 
this job? How can I check? Where's the security rep?"

Until the IT industry treats security as the building industry treats 
OH&S, attempts to bail out the Titanic with a teacup with bottom-up 
"safe by default" functions will just encourage a false sense of 
security.



-- 
Steve

From ncoghlan at gmail.com  Wed Sep 16 18:05:53 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Sep 2015 02:05:53 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <20150916155412.GJ31152@ando.pearwood.info>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <20150916155412.GJ31152@ando.pearwood.info>
Message-ID: <CADiSq7f5Q4hyWoeGE8CVfH3O7okjU47VTqS7eq-6waqUdY00Mg@mail.gmail.com>

On 17 September 2015 at 01:54, Steven D'Aprano <steve at pearwood.info> wrote:
> I propose:
>
> - The random module's API is left as-is, including the default PRNG.
>   Backwards compatibility is important, code-churn is bad, and there are
>   good use-cases for a non-CSPRNG.
>
> - We add at least one CSPRNG. I leave it to the crypto-wonks to decide
>   which.
>
> - We add a new module, which I'm calling "secrets" (for lack of a better
>   name) to hold best-practice security-related functions. To start with,
>   it would have at least these three functions: one battery, and two
>   building blocks:
>
>   + secrets.token to create password recovery tokens or similar;
>
>   + secrets.random calls the CSPRNG; it just returns a random number
>     (integer?). There is no API for getting or setting the state,
>     setting the seed, or returning values from non-uniform
>     distributions;
>
>   + secrets.choice similarly uses the CSPRNG.
>
> Developers will still have to make a choice: "do I use secrets, or
> random?" If they're looking for a random token (or password?), the
> answer is obvious: use secrets, because the battery is already there.
> For reasons that I will go into below, I don't think that requiring this
> choice is a bad thing. I think it is a *good* thing.
>
> secrets becomes the go-to module for things you want to keep secret.
> random remains the module you use for games and simulations.
>
> If there is interest in this proposed secrets module, I'll write up a
> proto-PEP over the weekend, and start a new thread for the benefit of
> those who have muted this one.

Oh, *this* I like (minus the idea of introducing a CSPRNG -
random.SystemRandom will be a good choice for this task).

"Is it an important secret?" is a question anyone can answer, so
simply changing the proposed name addresses all my concerns regarding
having to ask people to learn how to answer a difficult question that
isn't directly related to what they're trying to do.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Wed Sep 16 18:06:36 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 Sep 2015 09:06:36 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNny9tZqzc_NZ3UORn+c5Nt=OC0CEz3eOvEg9YxbvmGp6A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CAExdVNny9tZqzc_NZ3UORn+c5Nt=OC0CEz3eOvEg9YxbvmGp6A@mail.gmail.com>
Message-ID: <CAP7+vJ+NjpkpDGDS5FrKzVUXp_0B+rtGq0fHgtsUx8MZJZds+w@mail.gmail.com>

On Wed, Sep 16, 2015 at 8:47 AM, Tim Peters <tim.peters at gmail.com> wrote:

> [Guido]
> > There's still way too much chatter, and a lot that seems just rhetoric.
> > This is not the republican primaries.
>
> Which is a shame, since the chatter here is of much higher quality
> than in the actual primaries ;-)
>
>
> > Yes lots of companies got hacked. What's the evidence that a language's
> > default RNG was involved?
>
> Nobody cares whether there's evidence of actual harm.  Just that there
> _might_ be, and even if none identifiable now, then maybe in the
> future.
>
> There is evidence of actual harm from RNGs doing poor _seeding_ by
> default, but Python already fixed that (I know, you already know that
> ;-) ).
>
> And this paper, from a few years ago, studying RNG vulnerabilities in
> PHP apps, is really good:
>
>
> https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf
>
> An interesting thing is that several of the apps already had a history
> of trying to fix security-related holes related to RNG (largely due to
> PHP's poor default seeding), but remained easily cracked.
>
> The primary recommendation there wasn't to make PHP's various PRNGs
> "crypto by magic", but for core PHP to supply "a standard" crypto RNG
> for people to use instead.  As above, some of the app developers
> already knew darned well they had a history of RNG-related holes, but
> simply had no standard way to address it, and didn't have the _major_
> expertise needed to roll their own.
>
>
> > IIUC the best practice for password encryption (to
> > make cracking using a large word list harder) is something called bcrypt;
> > maybe next year something else will become popular, but the default RNG
> > seems an unlikely candidate. I know that in the past the randomness of
> > certain protocols was compromised because the seeding used a timestamp
> > that an attacker could influence or guess. But random.py seeds MT from
> > os.urandom(2500). So what's the class of vulnerabilities where the
> > default RNG is implicated?
>
> 1. Users doing their own poor seeding.
>
> 2. A hypothetical MT state-deducer (seemingly needing to be
>    considerably more sophisticated than the already mondo
>    sophisticated one in the paper above) to be of general use
>    against Python.
>
> 3. "Prove there can't be any in the future.  Ha!  You can't." ;-)
>
>
> > Tim's proposal is simple: create a new module, e.g. saferandom, with the
> > same API as random (less seed/state). That's it. Then it's a simple
> > import change away to do the right thing, and we have years to seed
> > StackOverflow with better information before that code even hits the
> > road. (But a backport to Python 2.7 could be on PyPI tomorrow!)
>
> Which would obviously be fine by me:  make the distinction obvious at
> import time, make "the safe way" dead easy and convenient to use, give
> it a new name engineered to nudge newbies away from the "unsafe" (by
> contrast) `random`, and a new name easily discoverable by web search.
>
> There's something else here:  some of these messages gave pointers to
> web pages where "security wonks" conceded that specific uses of
> SystemRandom were fine, but they couldn't recommend it anyway because
> it's too hard to explain what is or isn't "safe".  "Therefore" users
> should only use urandom() directly.  Which is insane, if for no other
> reason than that users would then invent their own algorithms to
> convert urandom() results into floats and ints, etc.  Then they'll
> screw up _that_ part.
>
> But if "saferandom" were its own module, then over time it could
> implement its own "security wonk certified" higher level (than raw
> bytes) methods.  I suspect it would never need to change anything from
> what the SystemRandom class does, but I'm not a security wonk, so I
> know nothing.  Regardless, _whatever_ changes certified wonks deemed
> necessary in the future could be confined to the new module, where
> incompatibilities would only annoy apps using that module.  Ditto
> whatever doc changes were needed.  Also gone would be the inherent
> confusion from needing to draw distinctions between "safe" and
> "unsafe" in a single module's docs (which any by-magic scheme would
> only make worse).
>
> However, supplying a powerful and dead-simple-to-use new module would
> indeed do nothing to help old code entirely by magic.  That's a
> non-goal to me, but appears to be the _only_ deal-breaker goal for the
> advocates.
>
> Which is why none of us is the BDFL ;-)
>

So if you or someone else (Chris?) wrote that up in PEP form I'd accept it.

I'd even accept adding a warning on calling seed() (but not setstate()).
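
For concreteness, the module could look something like the sketch below.
This is only an assumption about its shape: it exposes a handful of
random.SystemRandom methods and leaves out seed(), getstate() and
setstate(). A SimpleNamespace stands in for the real module so the example
fits in one file, and the list of exported names is illustrative.

```python
import random
import types

# One shared generator backed by the operating system's entropy source.
_inst = random.SystemRandom()

# Illustrative subset of the random API; seed/getstate/setstate are omitted.
_EXPORTED = (
    "random", "uniform", "randint", "randrange",
    "choice", "sample", "shuffle", "getrandbits",
)

# Stand-in for a hypothetical "saferandom" module.
saferandom = types.SimpleNamespace(
    **{name: getattr(_inst, name) for name in _EXPORTED}
)

print(saferandom.randint(1, 6))
print(saferandom.choice("abcdef"))
print(hasattr(saferandom, "seed"))  # False: there is no seeding API
```

With a real module of that shape, switching would be a matter of changing
"import random" to "import saferandom as random" in code that never seeds.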

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Wed Sep 16 18:09:15 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 16 Sep 2015 12:09:15 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <20150916155412.GJ31152@ando.pearwood.info>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <20150916155412.GJ31152@ando.pearwood.info>
Message-ID: <etPan.55f9942c.7b68e578.6557@Draupnir.home>

On September 16, 2015 at 11:55:48 AM, Steven D'Aprano (steve at pearwood.info) wrote:
>
> - We add at least one CSPRNG. I leave it to the crypto-wonks to decide
> which.

We already have a CSPRNG via os.urandom, and importantly we don't have to
decide which implementation it is, because the OS provides it and is
responsible for it. I am against adding a userspace CSPRNG as anything but a
possible implementation detail of making a CSPRNG the default for random.py. If
we're not going to change the default, then I think adding a userspace CSPRNG
is just adding a different footgun. That's OK though, because os.urandom is a
pretty great CSPRNG.
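
A small sketch of what relying on os.urandom looks like in practice. The
modulo-bias part is an added illustration of why hand-rolled conversions
from raw bytes to numbers are easy to get wrong; the helper name
naive_die_counts is hypothetical.

```python
import os
import random

raw = os.urandom(16)  # 16 bytes straight from the OS CSPRNG
print(len(raw))       # 16


def naive_die_counts():
    """Count how often each residue appears when a byte is reduced mod 6."""
    counts = [0] * 6
    for byte_value in range(256):
        counts[byte_value % 6] += 1
    return counts


# 256 is not a multiple of 6, so "byte % 6" slightly favours low residues.
print(naive_die_counts())  # [43, 43, 43, 43, 42, 42]

# SystemRandom draws from os.urandom and does the conversion itself.
print(random.SystemRandom().randrange(6))
```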

>  
> Developers will still have to make a choice: "do I use secrets, or
> random?" If they're looking for a random token (or password?), the
> answer is obvious: use secrets, because the battery is already there.
> For reasons that I will go into below, I don't think that requiring this
> choice is a bad thing. I think it is a *good* thing.

Forcing the user to make a choice isn't a bad option from a security point of
view. Most people will prefer to use the secure one by default even if they
don't know better. The problem right now is that there is a "default", and that
default is unsafe, so people aren't forced to make a choice; they are given a
choice with the option to go and make a choice later.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From tim.peters at gmail.com  Wed Sep 16 18:09:55 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 16 Sep 2015 11:09:55 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
Message-ID: <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>

[Guido]
>> ...
>> Tim's proposal is simple: create a new module, e.g. saferandom, with the
>> same API as random (less seed/state). That's it. Then it's a simple import
>> change away to do the right thing, and we have years to seed StackOverflow
>> with better information before that code even hits the road. (But a backport
>> to Python 2.7 could be on PyPI tomorrow!)

[Nick Coghlan <ncoghlan at gmail.com>]
> If folks are reaching for a third party library anyway, we'd be better
> off pointing them at one of the higher level ones like passlib or
> cryptography.

Note that, in context, "saferandom" _would_ be a standard module in a
future Python 3 feature release.  But it _could_ be used literally
tomorrow by anyone who wanted a head start, whether in a current
Python 2 or Python 3.

And if pieces of `passlib` and/or `cryptography` are thought to be
essential for best practice, cool, then `saferandom` could also become
a natural home for workalikes.  Would you really want to _ever_ put
such functions in the catch-all "random" module?  The docs would
become an incomprehensible mess.

From ncoghlan at gmail.com  Wed Sep 16 18:21:09 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Sep 2015 02:21:09 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
Message-ID: <CADiSq7dokrTNk7nb6iavOXvKajL2Ywf8_bdS-bD3KyeSgL2ADw@mail.gmail.com>

On 17 September 2015 at 02:09, Tim Peters <tim.peters at gmail.com> wrote:
> [Guido]
>>> ...
>>> Tim's proposal is simple: create a new module, e.g. saferandom, with the
>>> same API as random (less seed/state). That's it. Then it's a simple import
>>> change away to do the right thing, and we have years to seed StackOverflow
>>> with better information before that code even hits the road. (But a backport
>>> to Python 2.7 could be on PyPI tomorrow!)
>
> [Nick Coghlan <ncoghlan at gmail.com>]
>> If folks are reaching for a third party library anyway, we'd be better
>> off pointing them at one of the higher level ones like passlib or
>> cryptography.
>
> Note that, in context, "saferandom" _would_ be a standard module in a
> future Python 3 feature release.  But it _could_ be used literally
> tomorrow by anyone who wanted a head start, whether in a current
> Python 2 or Python 3.
>
> And if pieces of `passlib` and/or `cryptography` are thought to be
> essential for best practice, cool, then `saferandom` could also become
> a natural home for workalikes.  Would you really want to _ever_ put
> such functions in the catch-all "random" module?  The docs would
> become an incomprehensible mess.

My main objection here was the name, so Steven's suggestion of calling
such a module "secrets" with a suitably crafted higher level API
rather than replicating the entire random module API made a big
difference. We may even be able to finally give hmac.compare_digest a
more obvious home as something like "secrets.equal".
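
For reference, a minimal sketch of how hmac.compare_digest is used today.
The helper name tokens_match is hypothetical, and "secrets.equal" above is
only a suggested name, not an existing API.

```python
import hmac


def tokens_match(supplied, expected):
    """Compare two text tokens with hmac.compare_digest.

    compare_digest is designed so that its running time does not reveal
    how many leading characters matched, unlike a plain == comparison.
    """
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected.encode("utf-8")
    )


print(tokens_match("s3cr3t-token", "s3cr3t-token"))  # True
print(tokens_match("s3cr3t-token", "s3cr3t-tokem"))  # False
```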

I'll leave PEP 504 as Draft for now, but I currently expect I'll end
up withdrawing it in favour of Steven's idea.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From jsbfox at gmail.com  Wed Sep 16 18:36:58 2015
From: jsbfox at gmail.com (Un Do)
Date: Wed, 16 Sep 2015 17:36:58 +0100
Subject: [Python-ideas] Add inspect.getenclosed to return/yield source code
 for nested classes and functions
Message-ID: <CAMPw9HQaQxUHEeGpx2FDhBTUrFLHTB6v8HNPoTo-H6OJ0ELZ+g@mail.gmail.com>

I propose adding a function into inspect module that will retrieve
definitions of classes and functions (standard and lambdas) located
inside another function/method.

In my opinion this would be a small but nice and useful addition to the
standard library. It can be implemented using a couple of undocumented
functions from that module (findsource and getblock) without any
performance drawbacks.

Example:

    In [9]: print(getsource(function))
    def function():
        class inner_class():
            def __init__(self):
                return

        # Some code
        # Some more code
        # Even more code

        l = lambda x: 42

        # Ugh code again

        def inner_function(with_argument):
            pass


    In [10]: for c in function.__code__.co_consts:
       ....:     if not iscode(c):
       ....:         continue
       ....:     name, starts_line = c.co_name, c.co_firstlineno
       ....:     if not name.startswith('<') or name == '<lambda>':
       ....:         lines, _ = findsource(c)
       ....:         source = ''.join(getblock(lines[starts_line-1:]))
       ....:         print(dedent(source), end='-' * 30 + '\n')
       ....:
    class inner_class():
        def __init__(self):
            return
    ------------------------------
    l = lambda x: 42
    ------------------------------
    def inner_function(with_argument):
        pass
    ------------------------------


What do you think?
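
The session above could be wrapped up as a function roughly like the sketch
below. The name getenclosed comes from the subject line; the signature and
the choice to return a generator of dedented source strings are assumptions.
It relies on the same undocumented helpers (findsource and getblock), so it
only works when the source file is available on disk.

```python
import inspect
import textwrap


def getenclosed(func):
    """Yield the dedented source of each class, function and lambda
    defined directly inside func."""
    for const in func.__code__.co_consts:
        if not inspect.iscode(const):
            continue
        name = const.co_name
        # Skip synthetic code objects such as comprehensions; keep lambdas.
        if name.startswith('<') and name != '<lambda>':
            continue
        lines, _ = inspect.findsource(const)
        # co_firstlineno is 1-based, the list of lines is 0-based.
        block = inspect.getblock(lines[const.co_firstlineno - 1:])
        yield textwrap.dedent(''.join(block))
```

Calling "for source in getenclosed(function): print(source)" on a function
imported from a file would print each nested definition in turn.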

From mertz at gnosis.cx  Wed Sep 16 18:39:53 2015
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 16 Sep 2015 09:39:53 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAH_hAJECibCUr0sq0RobDGvyUBsMMji-5n-LA8O0sMd6xbZo9g@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <CAEbHw4Y4LEbs5cc9ERMcUqxg3P8bN6HY1KGdP-6UUj5_JSstRw@mail.gmail.com>
 <CAH_hAJECibCUr0sq0RobDGvyUBsMMji-5n-LA8O0sMd6xbZo9g@mail.gmail.com>
Message-ID: <CAEbHw4YBYAz+6+HsLVcBDbPBHtHcLbinFeGDZnYGMjZFWVnKZA@mail.gmail.com>

The point here is that the closest we can come to PROTECTING users is to
avoid making false promises to them.

All this talk of "maybe, possibly, secure RNGs" (until they've been
analyzed longer) is just building a house on sand. Maybe ChaCha20 is
completely free of all exploits... It's new-ish, and no one has found any.

The API we really owe users is to create a class
random.BelievedSecureIn2015, and let users utilize that if they like. All
the rest of the proposals are just invitations to create more security
breaches... The specific thing that random.random and MT DOES NOT do.
On Sep 16, 2015 1:29 AM, "Cory Benfield" <cory at lukasa.co.uk> wrote:

> On 16 September 2015 at 08:43, David Mertz <mertz at gnosis.cx> wrote:
> > Hence I affirmatively PREFER a random module that explicitly proclaims
> > that it is non-cryptographic.  Someone who figures out enough to use
> > random.SystemRandom, or a future crypto.random, or the like is more
> > likely to think about why they are doing so, and what doing so does and
> > does NOT assure them of.
>
> And what about those that don't? Is our position here "screw 'em, and
> also screw their users"?
>

From brett at python.org  Wed Sep 16 18:53:38 2015
From: brett at python.org (Brett Cannon)
Date: Wed, 16 Sep 2015 16:53:38 +0000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
Message-ID: <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>

On Wed, 16 Sep 2015 at 09:10 Tim Peters <tim.peters at gmail.com> wrote:

> [Guido]
> >> ...
> >> Tim's proposal is simple: create a new module, e.g. saferandom, with the
> >> same API as random (less seed/state). That's it. Then it's a simple
> >> import change away to do the right thing, and we have years to seed
> >> StackOverflow with better information before that code even hits the
> >> road. (But a backport to Python 2.7 could be on PyPI tomorrow!)
>
> [Nick Coghlan <ncoghlan at gmail.com>]
> > If folks are reaching for a third party library anyway, we'd be better
> > off pointing them at one of the higher level ones like passlib or
> > cryptography.
>
> Note that, in context, "saferandom" _would_ be a standard module in a
> future Python 3 feature release.  But it _could_ be used literally
> tomorrow by anyone who wanted a head start, whether in a current
> Python 2 or Python 3.
>

+1 on the overall idea, although I would rather the module be named
random.safe in the stdlib ("namespaces are one honking great idea" and it
helps keep the "safer" version of random near the "unsafe" version in the
module index which makes discovery easier). And as long as the version on
PyPI stays Python 2/3 compatible people can just rely on the saferandom
name until they drop Python 2 support and then just update their imports.


>
> And if pieces of `passlib` and/or `cryptography` are thought to be
> essential for best practice, cool, then `saferandom` could also become
> a natural home for workalikes.  Would you really want to _ever_ put
> such functions in the catch-all "random" module?  The docs would
> become an incomprehensible mess.
>

So, a PEP for this to propose which random algorithm to use (I have at
least heard chacha/ch4random and some AES thing bandied about as being
fast)? And if yes to a PEP, who's writing it? And then who is writing the
implementation in the end?

From mertz at gnosis.cx  Wed Sep 16 19:06:19 2015
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 16 Sep 2015 10:06:19 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
Message-ID: <CAEbHw4bwy9sf5HV_0MOvw3xdgyxSUSX-daYebjsWer3vUZBXpg@mail.gmail.com>

On Sep 16, 2015 9:54 AM, "Brett Cannon" <brett at python.org> wrote:
> +1 on the overall idea, although I would rather the module be named
> random.safe in the stdlib ("namespaces are one honking great idea" and it
> helps keep the "safer" version of random near the "unsafe" version in the
> module index which makes discovery easier). And as long as the version on
> PyPI stays Python 2/3 compatible people can just rely on the saferandom
> name until they drop Python 2 support and then just update their imports.

Without repeating my somewhat satirical long name, I think "safe" is a
terrible name because it makes a false promise.  However, the name
"secrets" is a great name.

I think a top-level module is better than "random.secrets" because not
everything related to secrets is related to randomness. But that detail is
minor. Letting the documentation of "secrets" discuss the current state of
cryptanalysis on the algorithms and protocols contained therein is the
right place for it. With prominent dates attached to those discussions.

From random832 at fastmail.com  Wed Sep 16 19:09:43 2015
From: random832 at fastmail.com (Random832)
Date: Wed, 16 Sep 2015 13:09:43 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
Message-ID: <1442423383.1806663.385468561.043B31B7@webmail.messagingengine.com>

On Wed, Sep 16, 2015, at 11:54, Nick Coghlan wrote:
> * random._inst is a SystemRandom() instance by default

He has a point on the performance issue. The difference between Random
and SystemRandom on my machine is significantly more than an order of
magnitude. (Calling libc's arc4random with ctypes was roughly in the
middle, though I *suspect* a lot of that was due to ctypes overhead).
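
A minimal sketch of how such a comparison can be made with timeit. The
helper name time_generator and the call count are assumptions, and the
numbers depend heavily on the machine and OS, so no expected figures are
shown.

```python
import random
import timeit


def time_generator(rng, number=20000):
    """Return the seconds taken for `number` calls to rng.random()."""
    return timeit.timeit(rng.random, number=number)


mt_seconds = time_generator(random.Random())         # Mersenne Twister
sys_seconds = time_generator(random.SystemRandom())  # os.urandom-backed

print("Random:       %.4f s" % mt_seconds)
print("SystemRandom: %.4f s" % sys_seconds)
print("ratio:        %.1fx" % (sys_seconds / mt_seconds))
```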

From donald at stufft.io  Wed Sep 16 19:17:22 2015
From: donald at stufft.io (Donald Stufft)
Date: Wed, 16 Sep 2015 13:17:22 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <1442423383.1806663.385468561.043B31B7@webmail.messagingengine.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <1442423383.1806663.385468561.043B31B7@webmail.messagingengine.com>
Message-ID: <etPan.55f9a422.6c9bf171.6557@Draupnir.home>

On September 16, 2015 at 1:10:09 PM, Random832 (random832 at fastmail.com) wrote:
> On Wed, Sep 16, 2015, at 11:54, Nick Coghlan wrote:
> > * random._inst is a SystemRandom() instance by default
>  
> He has a point on the performance issue. The difference between Random
> and SystemRandom on my machine is significantly more than an order of
> magnitude. (Calling libc's arc4random with ctypes was roughly in the
> middle, though I *suspect* a lot of that was due to ctypes overhead).

I did the benchmark already: https://bpaste.net/show/79cc134a12b1

Using this code: https://github.com/dstufft/randtest

However, using anything except for urandom is warned against by most security
experts I've talked to. Most of them are only OK with it if it means we're
using a CSPRNG by default in the random.py module, but not if it's not the
default. Even then, one of them thought that using a userspace CSPRNG instead
of urandom was a bad idea (the rest thought it was better than the status
quo). They all agreed that splitting the namespace was a good idea.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From chris.barker at noaa.gov  Wed Sep 16 19:25:34 2015
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 16 Sep 2015 10:25:34 -0700
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7fDze4MK5DDBg-EihT=L-ePqL9HoUfCNne_PNhVUVR8Ww@mail.gmail.com>
 <CACac1F_DtcfEQ6rGUieE9WuQshSt_LQZEDmvd6Yq0kt=gHSp-g@mail.gmail.com>
 <85h9n482sa.fsf@benfinney.id.au>
 <CACac1F_9NSVYzpyEKEfJnL-jMKSrBD030ciMhvx5_1vq9UMHgQ@mail.gmail.com>
 <C5A4CEBA-048A-4762-B650-869625BF77ED@yahoo.com>
Message-ID: <5517176680814632774@unknownmsgid>

 >but it's still depressing how many people
>are still writing blog posts and SO answers
>and so on that tell people "you need to
>install the latest version of Python, 2.7,

I teach an intro to python class, and have been advocating
python/supporting users of python on OS-X for years. AND I am one
of those folks that advocates starting out by installing the latest
Python2.7 (unless you're going with 3). And I don't think I'm going to
stop.

>because your computer doesn't come with
>it"

But never for that reason, but because I don't think users SHOULD rely
on the system Python on OS-X (and probably other systems). You can
google the reasons why -- you'll probably find a fair number of posts
with my name on it ;-). Or, if that debate really is relevant to this
discussion I could repeat it all here...

>and then proceed to give instructions
>that will lead to a screwed up PATH

Well, I hope I don't do that ;-) -- in fact, the python.org installer
has done a pretty nice job with its defaults for years -- the people
that get messed up are the ones that try to "fix" it by hand, when they
don't know what they are doing (and very few people DO know what they
are doing with PATH on OS-X)

>and make no mention of virtualenv...

OK, I do that ..... But quite deliberately. Virtualenv solves some
problems for sure, but NOT the "I can't import something I swear I
just installed" problem. In fact, it creates even MORE different
"python" and "pip" commands, and a greater need to understand PATH,
and what terminals are configured how, etc.

So no, I don't introduce virtualenv to beginners.

But I'll probably start teaching:

python -m pip install ......
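
A small sketch of why that spelling helps: "python -m pip" runs the pip
that belongs to the interpreter named on the command line, so there is
no separate "pip" on PATH to get wrong. The helper name pip_command and
the package name "somepackage" are made up for illustration; the code
only builds the command and does not run it.

```python
import sys


def pip_command(*args):
    """Build a pip command line bound to the running interpreter.

    Hypothetical helper, for illustration only.
    """
    # sys.executable is the path of the Python running this code, so the
    # command cannot pick up a pip installed for some other Python.
    return [sys.executable, "-m", "pip"] + list(args)


# "somepackage" is a placeholder, not a real recommendation.
cmd = pip_command("install", "somepackage")
print(" ".join(cmd))
```

Passing a list like this to subprocess.call() would then install into
the same Python that is running the script.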

-Chris

From p.f.moore at gmail.com  Wed Sep 16 20:08:20 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 16 Sep 2015 19:08:20 +0100
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <20150916155412.GJ31152@ando.pearwood.info>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <20150916155412.GJ31152@ando.pearwood.info>
Message-ID: <CACac1F9WB5EC+dVtJyXGHJosakP2nBbjZ9yYH94+MsaBgNwjBw@mail.gmail.com>

On 16 September 2015 at 16:54, Steven D'Aprano <steve at pearwood.info> wrote:
> If there is interest in this proposed secrets module, I'll write up a
> proto-PEP over the weekend, and start a new thread for the benefit of
> those who have muted this one.

I love this idea. The name is perfect, and your motivational
discussion fits exactly how I think we should be approaching security.

Would it also be worth having secrets.password(alphabet, length) -
generate a random password of length "length" from alphabet
"alphabet". It's not going to cover every use case, but it immediately
becomes the obvious answer to all those "how do I generate a password"
SO questions people keep pointing at.
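
Roughly what I have in mind, as a sketch only: the function name and
signature are just the ones suggested above, nothing is decided, and
using random.SystemRandom underneath is an assumption on my part.

```python
import random
import string

# SystemRandom draws from os.urandom() rather than the Mersenne Twister.
_sysrand = random.SystemRandom()


def password(alphabet, length):
    """Return a random password of `length` characters from `alphabet`."""
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    if length < 0:
        raise ValueError("length must be non-negative")
    # Each character is chosen independently and uniformly.
    return "".join(_sysrand.choice(alphabet) for _ in range(length))


print(password(string.ascii_letters + string.digits, 12))
```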

Also, a backport version could be made available via PyPI.

I don't see why the module couldn't use random.SystemRandom as its
CSPRNG (and as a result be pure Python) but that can be an
implementation detail the security specialists can argue over if they
want. No need to expose it here (although if it's useful, republishing
(some more of) its API without exposing the implementation, just like
the proposed secrets.choice, would be fine).

Paul.

From wes.turner at gmail.com  Wed Sep 16 20:42:01 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Wed, 16 Sep 2015 13:42:01 -0500
Subject: [Python-ideas] High time for a builtin function to manage
 packages (simply)?
In-Reply-To: <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAPyZGSmTbwAkRr0TG8hfvzZGjaMehROof0JYWuw-5VQhoiKG2g@mail.gmail.com>
 <1441315622.90616.374217921.66BE1B36@webmail.messagingengine.com>
 <CA+=+wqAJVwa6d16mHD=aPUZtFOpVPoX4ERGw+a_L+Vj11oORtA@mail.gmail.com>
 <CAPyZGSmAMJUUdY8p_AqXX8zzBMsABrY--AWxcqKW077H7QXuow@mail.gmail.com>
 <87vbbr2b28.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150904024552.GL19373@ando.pearwood.info>
 <204080FC-D5B0-44A9-9D9D-582B1491B413@yahoo.com>
 <20150904172710.GO19373@ando.pearwood.info>
 <87lhcl2viv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <msflbr$qda$1@ger.gmane.org>
 <87d1xv2e04.fsf@uwakimon.sk.tsukuba.ac.jp>
 <AF4B2951-316E-428C-8C01-04783591B205@yahoo.com>
 <878u8i3d2k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CACfEFw9TAATQhc696ujumR_AbBu+GPi-jH_wG-eqbW0BDqRQMQ@mail.gmail.com>

I just found this HTML web-based pip interface: Stallion "easy-to-use
Python Package Manager interface with command-line tool"

* http://perone.github.io/stallion/
* https://pypi.python.org/pypi/Stallion

Qt would not be a reasonable requirement; Tk shouldn't be a requirement.

I tend to try to work with repeatable CLI commands rather than package
manager GUIs.



On Sep 7, 2015 2:26 AM, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:

> Andrew Barnert writes:
>
>  > Tcl/Tk, and Tkinter for all pre-installed Pythons but 2.3, have
>  > been included with every OS X since they started pre-installing
>  > 2.5.
>
> My mistake, it's only MacPorts where I don't have it.  I used
> MacPorts' all-lowercase spelling, which doesn't work in the system
> Python.  (The capitalized spelling doesn't work in MacPorts.)
>
>  > And it works with all python.org installs for 10.6 or later, all
>  > Homebrew default installs, standard source builds... Just about
>  > anything besides MacPorts (which seems to want to build Tkinter
>  > against its own Tcl/Tk instead of Apple's)
>
> I recall having problems with trying to build and run against the
> system Tcl/Tk in both source and MacPorts, but that was a *long* time
> ago (2.6-ish).  Trying it now, on my Mac OS X Yosemite system python
> 2.7.10, "root=Tkinter.Tk()" creates and displays a window, but doesn't
> pop it up.  In fact, "root.tkraise()" doesn't, either.  Oops.  On this
> system, IDLE has the same problem with its initial window, and
> furthermore complains that Tcl/Tk 8.5.9 is unstable.
>
> Quite possibly this window-raising issue is Just Me.  But based on my
> own experience, it is not at all obvious that ensuring availability of
> a GUI is possible in the same way we can ensure pip.
>
>  > Also, why do you think Qt would be less of a problem?
>
> I don't.  I think "ensure PyQt" would be a huge burden, much greater
> than Tkinter.  Bottom line: IMO, at this point in time, if it has to
> Just Work, it has to Work Without GUI.  (Modulo the possibility that
> we can use an HTML server and borrow the display engine from the
> platform web browser.  I think I already mentioned that, and I think
> it's really the way to go.  People who *don't* have a web browser
> probably can handle "python -m pip ..." without StackOverflow.)

From tim.peters at gmail.com  Wed Sep 16 20:45:13 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 16 Sep 2015 13:45:13 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP7+vJ+NjpkpDGDS5FrKzVUXp_0B+rtGq0fHgtsUx8MZJZds+w@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CAExdVNny9tZqzc_NZ3UORn+c5Nt=OC0CEz3eOvEg9YxbvmGp6A@mail.gmail.com>
 <CAP7+vJ+NjpkpDGDS5FrKzVUXp_0B+rtGq0fHgtsUx8MZJZds+w@mail.gmail.com>
Message-ID: <CAExdVNnhaJrJvpKMVMDmpQbztcC7Pc87okyPJ5bkvE7et9z4hw@mail.gmail.com>

[Guido, on "saferandom"]
> So if you or someone else (Chris?) wrote that up in PEP form I'd accept it.

I like Steven D'Aprano's "secrets" idea better, so it won't be me ;-)
Indeed, if PHP had a secure "token generator" function built in, the
paper in question would have found almost nothing of practical
interest to write about.


> I'd even accept adding a warning on calling seed() (but not setstate()).

Yield the width of an electron, and the universe itself will be too
small to contain the eventual consequences ;-)

From tim.peters at gmail.com  Wed Sep 16 20:55:20 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 16 Sep 2015 13:55:20 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
Message-ID: <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>

[Tim]
>> ....
>> Note that, in context, "saferandom" _would_ be a standard module in a
>> future Python 3 feature release.  But it _could_ be used literally
>> tomorrow by anyone who wanted a head start, whether in a current
>> Python 2 or Python 3.


[Brett Cannon <brett at python.org>]
> +1 on the overall idea, although I would rather the module be named
> random.safe in the stdlib ("namespaces are one honking great idea"

Ah, grasshopper, there's a reason that one is last in PEP 20.  "Flat
is better than nested" is the one - and only one - that _obviously_
applies here ;-)


> and it helps keep the "safer" version of random near the "unsafe" version
> in the module index which makes discovery easier). And as long as the
> version on PyPI stays Python 2/3 compatible people can just rely on the
> saferandom name until they drop Python 2 support and then just update
>  their imports.

I'd much rather see Steven D'Aprano's "secrets" idea pursued:  solve
"the problems" on their own terms directly.

> ...
> So, a PEP for this to propose which random algorithm to use (I have at least
> heard chacha/ch4random and some AES thing bandied about as being fast)?

os.urandom() is the obvious thing to build on, and it's already there.
If alternatives are desired (which they may well be - .urandom() is
sloooooooow on many systems), that can be addressed later.  Before
then, speed probably doesn't matter for most plausibly appropriate
uses.
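
For instance, a token generator built directly on os.urandom() is only
a few lines. This is a sketch: the name "token" is borrowed from
Steven's proposal, but the signature and the hex encoding are
assumptions made here for illustration.

```python
import binascii
import os


def token(nbytes=32):
    """Return a hex string carrying `nbytes` bytes from os.urandom()."""
    # hexlify() doubles the length: each byte becomes two hex digits.
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


print(token(16))
```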


> And if yes to a PEP, who's writing it? And then who is writing the
> implementation in the end?

Did you just volunteer?  Great!  Thanks ;-)  OK, Steven already
volunteered to write a PEP for his proposal.

From mal at egenix.com  Wed Sep 16 21:09:27 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 16 Sep 2015 21:09:27 +0200
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <20150916155412.GJ31152@ando.pearwood.info>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>	<CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>	<CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>	<CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>	<CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <20150916155412.GJ31152@ando.pearwood.info>
Message-ID: <55F9BE67.50209@egenix.com>

On 16.09.2015 17:54, Steven D'Aprano wrote:
> I propose:
> 
> - The random module's API is left as-is, including the default PRNG. 
>   Backwards compatibility is important, code-churn is bad, and there are 
>   good use-cases for a non-CSPRNG.
> 
> - We add at least one CSPRNG. I leave it to the crypto-wonks to decide 
>   which.
> 
> - We add a new module, which I'm calling "secrets" (for lack of a better 
>   name) to hold best-practice security-related functions. To start with,
>   it would have at least these three functions: one battery, and two 
>   building blocks:
> 
>   + secrets.token to create password recovery tokens or similar;
> 
>   + secrets.random calls the CSPRNG; it just returns a random number 
>     (integer?). There is no API for getting or setting the state, 
>     setting the seed, or returning values from non-uniform 
>     distributions;
> 
>   + secrets.choice similarly uses the CSPRNG.

+1 on the idea (not sure about the name, though :-))

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 16 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               2 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         10 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From tim.peters at gmail.com  Wed Sep 16 21:13:27 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 16 Sep 2015 14:13:27 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <20150916155412.GJ31152@ando.pearwood.info>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <20150916155412.GJ31152@ando.pearwood.info>
Message-ID: <CAExdVNkc_w+0z1Ko=gYtv=kqBG3Hx5ft=fG=qGUwdOpp4kedKg@mail.gmail.com>

[Steven D'Aprano <steve at pearwood.info>, on "secrets"]

+1 on everything.  Glad _that's_ finally over ;-)

One tech point:

> ...
>   + secrets.random calls the CSPRNG; it just returns a random number
>     (integer?). There is no API for getting or setting the state,
>     setting the seed, or returning values from non-uniform
>     distributions;

The OpenBSD arc4random() has a very sparse API, but gets this part
exactly right:

    uint32_t arc4random_uniform(uint32_t upper_bound);

    arc4random_uniform() will return a single 32-bit value, uniformly
    distributed but less than upper_bound. This is recommended
    over constructions like "arc4random() % upper_bound" as it
    avoids "modulo bias" when the upper bound is not a power
    of two. In the worst case, this function may consume multiple
    iterations to ensure uniformity; see the source code to understand
    the problem and solution.

In Python, there's no point to the uint32_t restrictions, and the
function is already implemented for arbitrary bigints via the current
(but private) Random._randbelow() method, whose implementation could
be simplified for this specific use.

That in turn relies on the .getrandbits(number_of_bits) method, which
SystemRandom overrides.

So getrandbits() is the fundamental primitive, and SystemRandom
already implements that based on .urandom() results.

An OpenBSD-ish random_uniform(upper_bound) would be a "nice to have",
but not essential.


> + secrets.choice similarly uses the CSPRNG.

Apart from error checking, that's just:

# as a method, with random_uniform() being the OpenBSD-ish helper above
def choice(self, seq):
    return seq[self.random_uniform(len(seq))]

random.Random already does that (and SystemRandom inherits it),
although spelled with _randbelow().
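
Spelled out as a sketch, assuming module-level functions on top of
SystemRandom (the name random_uniform is just the hypothetical
OpenBSD-ish one used above, and this is not the actual _randbelow()
code): draw just enough bits to cover upper_bound and throw away
anything too large, which is what avoids the modulo bias.

```python
import random

_sysrand = random.SystemRandom()


def random_uniform(upper_bound):
    """Return an int uniformly distributed over range(upper_bound)."""
    if upper_bound <= 0:
        raise ValueError("upper_bound must be positive")
    # Number of bits needed to represent upper_bound itself.
    k = upper_bound.bit_length()
    r = _sysrand.getrandbits(k)
    # Rejection sampling: retry instead of reducing modulo upper_bound.
    while r >= upper_bound:
        r = _sysrand.getrandbits(k)
    return r


def choice(seq):
    """Return a uniformly chosen element of a non-empty sequence."""
    if not seq:
        raise IndexError("cannot choose from an empty sequence")
    return seq[random_uniform(len(seq))]


print(random_uniform(10), choice("abcdef"))
```

More than half of the k-bit values fall below upper_bound, so the loop
accepts on each try with probability above 1/2.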

From srkunze at mail.de  Wed Sep 16 21:45:57 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 16 Sep 2015 21:45:57 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F1CAD3.7050602@brenbarn.net>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0E1F2.6040709@brenbarn.net> <55F1B76E.2030602@mail.de>
 <55F1CAD3.7050602@brenbarn.net>
Message-ID: <55F9C6F5.50903@mail.de>

On 10.09.2015 20:24, Brendan Barnwell wrote:
> Right, but can't you already do that with ABCs, as in the example in 
> the docs (https://docs.python.org/2/library/abc.html)?  You can write 
> an ABC whose __subclasshook__ does whatever hasattr checks you want 
> (and, if you want, checks the type annotations too), and then you can 
> use isinstance/issubclass to check if a given instance/class "provides 
> the protocol" described by that ABC.

You are probably right. Maybe the point is that this kind of "does 
whatever hasattr checks you want" gets standardized via the protocol 
base class.

Pondering this idea further, current Python actually gives enough 
means to do that at runtime. If I rely on method A being present on 
object b, Python will simply give me an AttributeError and that'll suffice.
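
To make the comparison concrete, here is a sketch of both routes,
modelled on the __subclasshook__ example in the abc docs. The names
SupportsClose, Resource and Plain are made up for illustration.

```python
from abc import ABC


class SupportsClose(ABC):
    """Hypothetical protocol: any class that defines close()."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is SupportsClose:
            # Structural check: look for close() anywhere in the MRO.
            if any("close" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


class Resource:
    def close(self):
        return "closed"


class Plain:
    pass


# Route 1: ask up front via the ABC.
print(isinstance(Resource(), SupportsClose))
print(isinstance(Plain(), SupportsClose))

# Route 2: just call it and let Python complain at runtime.
try:
    Plain().close()
except AttributeError as exc:
    print("AttributeError:", exc)
```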


So, it's only for the static typechecker again.


Best,
Sven

From srkunze at mail.de  Wed Sep 16 22:42:20 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 16 Sep 2015 22:42:20 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
Message-ID: <55F9D42C.1080208@mail.de>

On 11.09.2015 08:24, Jukka Lehtosalo wrote:
> On Thu, Sep 10, 2015 at 9:42 AM, Sven R. Kunze <srkunze at mail.de 
> <mailto:srkunze at mail.de>> wrote:
>
>     If my variables have crappy names, so I need to add type hints to
>     them, well, then, I rather fix them first.
>
>
> Even good variable names can leave the type ambiguous.

Try harder then.

> And besides, if you assume that all code is perfect or can be made 
> perfect I think that you've already lost the discussion. Reality 
> disagrees with you. ;-)

Not sure where I said this.

> You can't just wave a magic wand and get every programmer to 
> document their code and write unit tests. However, we know quite well 
> that programmers are perfectly capable of writing type annotations, 
> and tools can even enforce that they are present (witness all the Java 
> code in existence).

You can't just wave a magic wand and get every programmer to add type 
annotations to their code. However, we know quite well that programmers 
are perfectly capable of writing unit tests, and tools can even enforce 
that they are present (witness coverage tools and hooks in SCM systems 
preventing it from dropping).

[ Interesting, that it was that easy to exchange the parts you've given 
me ;) ]

Btw. have you heard of code review?

> Tools can't verify that you have good variable names or useful 
> docstrings, and people are too inconsistent or lazy to be relied on.

Same can be said for type annotations.

> In a cost/benefit analysis it may be optimal to spend half the 
> available time on annotating parts of the code base to get some (but 
> necessarily limited) static checking coverage and spend the remaining 
> half on writing tests for selected parts of the code base, for 
> example. It's not all or nothing.

I would like to peer-review that cost/benefit analysis you've made to 
see whether your numbers are sane.

>
>>     You get extra credit if your tests are slow to run and flaky,
>
>     We are problem solvers. So, I would tell my team: "make them
>     faster and more reliable".
>
>
> But you'd probably also ask them to implement new features (or *your* 
> manager might be unhappy), and they have to find the right balance, as 
> they only have 40 hours a week (or maybe 80 hours if you work at an 
> early-stage startup :-). Having more tools gives you more options for 
> spending your time efficiently.

Yes, I am going to tell him: "Hey, it doesn't work but we got all/most 
of the types right."

>
>     Granted. But you still don't know if your code runs correctly. You
>     are better off with tests. And I agree type checking is 1 test to
>     perform (out of 10K).
>
>
> Actually a type checker can verify multiple properties of a typical 
> line of code. So for 10k lines of code, complete type checking 
> coverage would give you the equivalent of maybe 30,000 (simple) tests. 
> :-P

I think you should be more specific on this.

Using hypothesis, e.g., you can easily increase the number of simple 
tests as well.

What I can tell is that most of the time, a variable carries the same 
type. It is really convenient that it doesn't have to but most of the 
time it does. Thus, one test run can probably reveal a dangerous type 
mistake. I've seen code where that is not the case indeed and one 
variable is either re-used or accidentally has different types. But, 
well, you better stay away from it anyway because most of the time it's 
very old code.

Moreover, in order to add *reasonable* type annotations you would 
probably invest as much time as you would invest in writing some 
tests for it. The majority of time is about *understanding* the code. 
And there, better variable names help a lot.

> It's often not cost effective to have good test coverage (and even 
> 100% line coverage doesn't give you full coverage of all 
> interactions). Testing can't prove that your code doesn't have defects 
> -- it just proves that for a tiny subset of possible inputs your code 
> works as expected. A type checker may be able to prove that for *all* 
> possible inputs your code doesn't do certain bad things, but it can't 
> prove that it does the good things. Neither subsumes the other, and 
> both of these approaches are useful and complementary (but 
> incomplete).

I fully agree on this. Yet I don't need type annotations. ;) A simple 
test running a typechecker working at 40%-60% (depending on whom you 
ask) efficiency suffices at least for me.

I would love to see better typecheckers rather than cluttering our code 
with some questionable annotations which, by the way, I am not sure are 
necessary at all.

Don't be fooled by the possibility of dynamic typing in Python. Just 
because it's possible doesn't necessarily mean it's the usual thing.

> I think that there was a good talk basically about this at PyCon this 
> year, by the way, but I can't remember the title.

It'll be great to have it. :)

Best,
Sven

From srkunze at mail.de  Wed Sep 16 22:57:29 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 16 Sep 2015 22:57:29 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
Message-ID: <55F9D7B9.8060109@mail.de>

On 11.09.2015 00:22, Andrew Barnert wrote:
> On Sep 10, 2015, at 09:42, Sven R. Kunze <srkunze at mail.de> wrote:
>> I mean when I am really going to touch that file to improve documentation (which annotations are a piece of), I am going to add more information for the reader of my API and that mostly will be describing the behavior of the API.
> As a bit of useless anecdotal evidence:
>
> After starting to play with MyPy when Guido first announced the idea, I haven't actually started using static type checking seriously, but I have started writing annotations for some of my functions. It feels like a concise and natural way to say "this function wants two integers", and it reads as well as it writes. Of course there's no reason I couldn't have been doing this since 3.0, but I wasn't, and now I am.
>
> Try playing around with it and see if you get the same feeling. Since everyone is thinking about the random module right now, and it makes a great example of what I'm talking about, specify which functions take/return int vs. float, which need a real int vs. anything Integral, etc., and how much more easily you absorb the information than if it's in the middle of a sentence in the docstring.
>
> Anyway, I don't actually annotate every function (or every function except the ones that are so simple that any checker or reader that couldn't infer the types is useless, the way I would in Haskell), just the ones where the types seem like an important part of the semantics. So I haven't missed the more complex features the way I expected to. But I've still got no problem with them being added as we go along, of course. :)
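
A sketch of the kind of signatures described in the anecdote above,
using made-up helper functions around the random module (none of these
names come from the thread or the stdlib; they only show how int,
float, Integral and a generic sequence read as annotations):

```python
import random
from numbers import Integral
from typing import Sequence, TypeVar

T = TypeVar("T")


def roll(sides: int, count: int) -> int:
    """Sum of `count` rolls of a `sides`-sided die: wants two integers."""
    return sum(random.randint(1, sides) for _ in range(count))


def scaled(factor: float) -> float:
    """A random float in [0, 1) multiplied by `factor`."""
    return random.random() * factor


def pick(seq: Sequence[T]) -> T:
    """Return one element; the result type follows the element type."""
    return random.choice(seq)


def is_even(n: Integral) -> bool:
    """Accepts anything Integral, not only the built-in int."""
    return n % 2 == 0


print(roll(6, 3), scaled(2.0), pick("abc"), is_even(4))
```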

Thanks for the anecdote. It's good to hear you don't do it for every 
function and I am glad it helps you a lot. :)

Do you know what makes me sad? If you do that for this function but 
don't do it for another, what is the guideline then? Python Zen tells us 
to have one obvious way to do sth. At least for me, it's not obvious 
anymore when to annotate and when not to annotate. Just a random guess 
depending on the moon phase? :(

Sometimes this and sometimes that. That can't be right for something as 
basic as types. Couldn't these problems be solved by further research on 
typecheckers?

Btw. I can tell the same anecdote when switching from C/C++/C#/Java to 
Python. It was like a liberation---no explicit type declarations 
anymore. I was baffled and frightened the first week using it. But I 
love it now and I don't want to give that freedom up. Maybe, that's why 
I am reluctant to use it in production.

But as said, I like the theoretical discussion around it. :)

Best,
Sven

From asweigart at gmail.com  Thu Sep 17 00:28:11 2015
From: asweigart at gmail.com (Al Sweigart)
Date: Wed, 16 Sep 2015 15:28:11 -0700
Subject: [Python-ideas] Non-English names in the turtle module (with Spanish
	example)
Message-ID: <CAPyZGSn5woL=jde7abXh4oJmhZpgy2bRux6gdVojKM5QqtRPtg@mail.gmail.com>

I've created a prototype for how we could add foreign language names to the
turtle.py module and erase the language barrier for non-English schoolkids.

The Tortuga module has the same functionality as

You can test it out by running "pip install tortuga"

https://pypi.python.org/pypi/Tortuga

Since Python 2 doesn't have simpledialog, I used the PyMsgBox pure-python
module for the input boxes. This code is small enough that it could be
added into turtle.py. (It's just used for simple tkinter dialog boxes.)

Check out the diff between Tortuga and turtle.py here:
https://www.diffchecker.com/2xmbrkhk

This file can be easily adapted to support multiple programming languages.

Thoughts? Suggestions?

-Al

From stephen at xemacs.org  Thu Sep 17 03:04:53 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 17 Sep 2015 10:04:53 +0900
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F9D7B9.8060109@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de>
Message-ID: <87wpvpx3d6.fsf@uwakimon.sk.tsukuba.ac.jp>

Sven R. Kunze writes:

 > Do you know what makes me sad? If you do that for this function but 
 > don't do it for another what is the guideline then? Python Zen tells us 
 > to have one obvious way to do sth. At least for me, it's not obvious 
 > anymore when to annotate and when not to annotate. Just a random guess 
 > depending on the moon phase? :(

No.  There's a simple rule: if it's obvious to you that type
annotation is useful, do it.  If it's not obvious you want it, you
don't, and you don't do it.  You obviously are unlikely to do it for
some time, if ever.  Me too.

But some shops want to use automated tools to analyze these things,
and I don't see why there's a problem in providing a feature that
makes it easier for them to do that.

 > Btw. I can tell the same anecdote when switching from C/C++/C#/Java to 
 > Python. It was like a liberation---no explicit type declarations 
 > anymore. I was baffled and frightened the first week using it. But I 
 > love it now and I don't want to give that freedom up. Maybe, that's why 
 > I am reluctant to use it in production.

So don't, nothing else in the language depends on type annotation or
on running a type checker for that matter.  What's your point?  That
you'll have to read them in the stdlib?  Nope; the stdlib will use
stubfiles where it uses type annotations at all for the foreseeable
future.  That your employer might make you use them?  That's the
nature of employment.  And if you can't convince your boss that
annotations have no useful role in a program written in good style,
why would you expect to convince us?


From stephen at xemacs.org  Thu Sep 17 03:36:37 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 17 Sep 2015 10:36:37 +0900
Subject: [Python-ideas] Non-English names in the turtle module (with
	Spanish	example)
In-Reply-To: <CAPyZGSn5woL=jde7abXh4oJmhZpgy2bRux6gdVojKM5QqtRPtg@mail.gmail.com>
References: <CAPyZGSn5woL=jde7abXh4oJmhZpgy2bRux6gdVojKM5QqtRPtg@mail.gmail.com>
Message-ID: <87vbb9x1wa.fsf@uwakimon.sk.tsukuba.ac.jp>

Al Sweigart writes:

 > I've created a prototype for how we could add foreign language
 > names to the turtle.py module and erase the language barrier for
 > non-English schoolkids.

I noticed that "ayudarme" ("help me") isn't implemented.  Whether
that's a barrier or not, I don't know.  I suspect it needs testing,
because I can imagine that kids might not use help if it were
available, preferring to ask the kid next to them.

The error messages aren't translated.  That definitely needs to be
done.

 > This file can be easily adapted to support multiple programming
 > languages.

Maybe it's just me, but putting multiple languages in a single file
might be confusing to the user if you just add all the language
bindings to locals().  I really would like to see (but won't do it
myself) a structure where "tortuga" imports "turtle", defines some
variables which contain the dictionaries, and calls an "install
language" utility (new in turtle) that does what's necessary to hook
up the dictionaries to the turtle commands.
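
A rough sketch of what such a utility might look like follows. Nothing
here exists in turtle today: install_language, the stand-in fake_turtle
object, and the Spanish names in the dictionary are all assumptions made
up for illustration (the prototype's actual names may differ).

```python
import types


def install_language(source, translations, target):
    """Bind each translated name on `target` to the `source` attribute it aliases."""
    for alias, original in translations.items():
        setattr(target, alias, getattr(source, original))
    return target


# Stand-in for the real turtle module, so the sketch runs without a display.
fake_turtle = types.SimpleNamespace(
    forward=lambda distance: ("forward", distance),
    left=lambda angle: ("left", angle),
)

# Illustrative dictionary only; not taken from the prototype.
SPANISH = {"adelante": "forward", "izquierda": "left"}

# "tortuga" gets only the Spanish bindings, not the English ones.
tortuga = install_language(fake_turtle, SPANISH, types.ModuleType("tortuga"))
print(tortuga.adelante(100))
```

Keeping each language in its own namespace this way avoids dumping every
binding into one module's locals().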

From steve at pearwood.info  Thu Sep 17 05:59:15 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 17 Sep 2015 13:59:15 +1000
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F9D7B9.8060109@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de>
Message-ID: <20150917035915.GL31152@ando.pearwood.info>

On Wed, Sep 16, 2015 at 10:57:29PM +0200, Sven R. Kunze wrote:

> Do you know what makes me sad? If you do that for this function but 
> don't do it for another what is the guideline then? Python Zen tells us 
> to have one obvious way to do sth. At least for me, it's not obvious 
> anymore when to annotate and when not to annotate. Just a random guess 
> depending on the moon phase? :(

This is no different from when to document and when to write tests.

In a perfect world, every function is fully documented and fully tested. 
But in reality we have only a limited amount of time to spend writing 
code, and only a portion of that is spent writing documentation and 
tests, so we have to prioritise. Some functions are less than fully 
documented and less than fully tested. How do you decide which ones get 
your attention?

People will use the same sort of heuristic for deciding which functions 
get annotated:

- does the function need annotations/documentation/tests?
- do I have time to write annotations/documentation/tests?
- is my manager telling me to add annotations/documentation/tests?
- if I don't, will bad things happen?
- is it easy or interesting to add them?
- or difficult and boring?


Don't expect to hold annotations up to a higher standard than we already 
hold other aspects of programming.


-- 
Steve

From gokoproject at gmail.com  Thu Sep 17 06:07:50 2015
From: gokoproject at gmail.com (John Wong)
Date: Thu, 17 Sep 2015 00:07:50 -0400
Subject: [Python-ideas] Bring line continuation to multi-level dictionary
	lookup
Message-ID: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>

Hi everyone.

I work with APIs whose responses are deeply nested dictionaries.
Imagine a simplified case:

foo = {1: {2: {3: {4: {5: 6}}}}}

Now imagine I need to get to 6:

foo[1][2][3][4][5]

This looks manageable, but if the key names are long, I will certainly
end up breaking the lookup apart to respect my style guide. To make it
concrete, let's use something realistic, a response from an AWS API call:

response = {'DescribeDBSnapshotsResponse': {'ResponseMetadata':
{'RequestId': '123456'}, 'DescribeDBSnapshotsResult': {'Marker': None,
'DBSnapshots': [{'Engine': 'postgres'}]}}}

If I had to get to the Engine I'd do this:

detail_response = response["DescribeDBSnapshotsResponse"]
result = detail_response["DescribeDBSnapshotsResult"]

This is only a few levels deep, but imagine something slightly longer (I
stripped a lot out of this response). Obviously I am picking a real
example with really long key names to sell my request.

Can we do it differently? How about
print(response.get(
    "DescribeDBSnapshotsResponse").get(
    "DescribeDBSnapshotsResult").get(
    "DBSnapshots")[0].get(
    "Engine"))

Okay. Not bad, almost like writing JavaScript, except that Python doesn't
allow a line continuation before the dot, so you are stuck breaking
after the (.

But the problem with the alternative is that
if DescribeDBSnapshotsResult is a non-existent key, you will just get None,
because that's the beauty of the .get method for a dictionary object. So
while this lets me write the lookup in a slightly different way, it costs
me the KeyError: the failure is silent and I wouldn't know which key was
missing. Whereas with [key1][key2] I know that if key1 doesn't exist, the
exception will tell me that key1 does not exist.
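
A small sketch of the difference, using a trimmed copy of the response
above and a made-up key "NoSuchKey" (an assumption for illustration). It
suggests the .get chain is only silent when the *last* key is missing; a
missing key in the middle fails on the next .get call, without naming
the key.

```python
response = {
    "DescribeDBSnapshotsResponse": {
        "DescribeDBSnapshotsResult": {"DBSnapshots": [{"Engine": "postgres"}]},
    }
}

# Subscripting names the missing key in the exception.
try:
    response["DescribeDBSnapshotsResponse"]["NoSuchKey"]["DBSnapshots"]
except KeyError as exc:
    subscript_error = exc

# A missing key in the middle of a .get chain yields None, and the next
# .get call then fails on None without saying which key was absent.
try:
    response.get("DescribeDBSnapshotsResponse").get("NoSuchKey").get("DBSnapshots")
except AttributeError as exc:
    chain_error = exc

# Only a missing final key comes back as a silent None.
last = response.get("DescribeDBSnapshotsResponse").get("NoSuchKey")

print(repr(subscript_error), repr(chain_error), last)
```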

So here I am, thinking, what if we can do this?

response(
    ["DescribeDBSnapshotsResponse"]
    ["DescribeDBSnapshotsResult"]
)

You get the point. This looks kinda ugly, but it doesn't require so many
assignments. I think this is doable; after all, [ ] is still a method call
with the key name passed in. I am not familiar with the grammar, so I don't
know how hard this would be or how much the implementation would have to
change to adopt it.

Let me know if this is a +1 or -10000000 bad crazy idea.

Thanks.

John

From 4kir4.1i at gmail.com  Thu Sep 17 06:16:02 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Thu, 17 Sep 2015 07:16:02 +0300
Subject: [Python-ideas] Bring line continuation to multi-level
	dictionary lookup
References: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
Message-ID: <87eghx4r5p.fsf@gmail.com>

John Wong <gokoproject at gmail.com> writes:

> Hi everyone.
>
> I work with APIs whose responses are deeply nested dictionaries.
> Imagine a simplified case:
>
> foo = {1: {2: {3: {4: {5: 6}}}}}
>
> Now imagine I need to get to 6:
>
> foo[1][2][3][4][5]
>
> This looks manageable, but if the key names are long, I will certainly
> end up breaking the lookup apart to respect my style guide. To make it
> concrete, let's use something realistic, a response from an AWS API call:
>
> response = {'DescribeDBSnapshotsResponse': {'ResponseMetadata':
> {'RequestId': '123456'}, 'DescribeDBSnapshotsResult': {'Marker': None,
> 'DBSnapshots': [{'Engine': 'postgres'}]}}}
>
> If I had to get to the Engine I'd do this:
>
> detail_response = response["DescribeDBSnapshotsResponse"]
> result = detail_response["DescribeDBSnapshotsResult"]
>
> This is only a few levels deep, but imagine something slightly longer (I
> stripped a lot out of this response). Obviously I am picking a real
> example with really long key names to sell my request.
>
> Can we do it differently? How about
> print(response.get(
>     "DescribeDBSnapshotsResponse").get(
>     "DescribeDBSnapshotsResult").get(
>     "DBSnapshots")[0].get(
>     "Engine"))
>
> Okay. Not bad, almost like writing JavaScript, except that Python doesn't
> allow a line continuation before the dot, so you are stuck breaking
> after the (.
>
> But the problem with the alternative is that
> if DescribeDBSnapshotsResult is a non-existent key, you will just get None,
> because that's the beauty of the .get method for a dictionary object. So
> while this lets me write the lookup in a slightly different way, it costs
> me the KeyError: the failure is silent and I wouldn't know which key was
> missing. Whereas with [key1][key2] I know that if key1 doesn't exist, the
> exception will tell me that key1 does not exist.

  import functools
  import operator
  
  functools.reduce(operator.getitem, [
      "DescribeDBSnapshotsResponse",
      "DescribeDBSnapshotsResult",
      "DBSnapshots",
      0,
      "Engine"], response)
  

> So here I am, thinking, what if we can do this?
>
> response(
>     ["DescribeDBSnapshotsResponse"]
>     ["DescribeDBSnapshotsResult"]
> )
>
> You get the point. This looks kinda ugly, but it doesn't require so many
> assignments. I think this is doable; after all, [ ] is still a method call
> with the key name passed in. I am not familiar with the grammar, so I don't
> know how hard this would be or how much the implementation would have to
> change to adopt it.
>
> Let me know if this is a +1 or -10000000 bad crazy idea.
>
> Thanks.
>
> John


From zachary.ware+pyideas at gmail.com  Thu Sep 17 06:24:03 2015
From: zachary.ware+pyideas at gmail.com (Zachary Ware)
Date: Wed, 16 Sep 2015 23:24:03 -0500
Subject: [Python-ideas] Bring line continuation to multi-level
	dictionary lookup
In-Reply-To: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
References: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
Message-ID: <CAKJDb-O5+U32Gn69U_qW9Ys2w3=TUKyRO1BM9Oz2=hrzobzPTw@mail.gmail.com>

On Wed, Sep 16, 2015 at 11:07 PM, John Wong <gokoproject at gmail.com> wrote:
> So here I am, thinking, what if we can do this?
>
> response(
>     ["DescribeDBSnapshotsResponse"]
>     ["DescribeDBSnapshotsResult"]
> )
>
> You get the point. This looks kinda ugly, but it doesn't require so many
> assignments. I think this is doable; after all, [ ] is still a method call
> with the key name passed in. I am not familiar with the grammar, so I don't
> know how hard this would be or how much the implementation would have to
> change to adopt it.
>
> Let me know if this is a +1 or -10000000 bad crazy idea.

I think a much better idea is to create a utility function:

   def dig(container, *path):
       obj = container
       for p in path:
           obj = obj[p]
       return obj

Then you can do your long lookup like so:

   engine = dig(response,
                'DescribeDBSnapshotsResponse',
                'DescribeDBSnapshotsResult',
                'DBSnapshots',
                0,
                'Engine')

This came up on python-list a month or two ago; perhaps there's some
merit in finding a place to stick this utility function.
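
Since the original complaint was about knowing which key failed, here is
one possible variant of the function above that reports how far the
lookup got. This is only a sketch: the choice of LookupError and the
wording of the message are assumptions, not an agreed design.

```python
def dig(container, *path):
    """Follow `path` through nested containers, reporting where a lookup fails."""
    obj = container
    for depth, key in enumerate(path):
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError) as exc:
            walked = path[:depth]  # the keys that were resolved successfully
            raise LookupError(f"lookup of {key!r} failed after {walked!r}") from exc
    return obj


response = {
    "DescribeDBSnapshotsResponse": {
        "DescribeDBSnapshotsResult": {"DBSnapshots": [{"Engine": "postgres"}]},
    }
}

engine = dig(response,
             "DescribeDBSnapshotsResponse",
             "DescribeDBSnapshotsResult",
             "DBSnapshots",
             0,
             "Engine")
print(engine)
```

The original exception stays attached as __cause__, so nothing is lost by
wrapping it.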

-- 
Zach

From guido at python.org  Thu Sep 17 06:45:14 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 Sep 2015 21:45:14 -0700
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNkc_w+0z1Ko=gYtv=kqBG3Hx5ft=fG=qGUwdOpp4kedKg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <CADiSq7enZRFkFpzVhDnDs_8Ji12K5Pn8TQGagz8dnSaQ33-DrA@mail.gmail.com>
 <CAP7+vJJwRSF+ShR06YR1-gEf0U4UwGuUKrS_5e0c7SVGe0MCwQ@mail.gmail.com>
 <CADiSq7eXHiquJyF=09niBJDhwS4Vw3GNXjqDepfes06BF2bdLA@mail.gmail.com>
 <20150916155412.GJ31152@ando.pearwood.info>
 <CAExdVNkc_w+0z1Ko=gYtv=kqBG3Hx5ft=fG=qGUwdOpp4kedKg@mail.gmail.com>
Message-ID: <CAP7+vJJDWEMXa2NtUeG9dnfzntiEQ9Zn4BCAUgf76=EyNUYAtA@mail.gmail.com>

On Wed, Sep 16, 2015 at 12:13 PM, Tim Peters <tim.peters at gmail.com> wrote:

> [Steven D'Aprano <steve at pearwood.info>, on "secrets"]
>
> +1 on everything.  Glad _that's_ finally over ;-)
>

Yes. Thanks all! I'm looking forward to the new PEP.

-- 
--Guido van Rossum (python.org/~guido)

From brenbarn at brenbarn.net  Thu Sep 17 06:47:16 2015
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 16 Sep 2015 21:47:16 -0700
Subject: [Python-ideas] Bring line continuation to multi-level
 dictionary lookup
In-Reply-To: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
References: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
Message-ID: <55FA45D4.6060607@brenbarn.net>

On 2015-09-16 21:07, John Wong wrote:
> Hi everyone.
>
> I work with APIs whose responses are deeply nested dictionaries.
> Imagine a simplified case:
>
> foo = {1: {2: {3: {4: {5: 6}}}}}
>
> Now imagine I need to get to 6:
>
> foo[1][2][3][4][5]

	You can just use the existing parentheses-based line continuation for this:

(foo[1]
     [2]
     [3]
     [4]
     [5]
)

-- 
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."
    --author unknown

From abarnert at yahoo.com  Thu Sep 17 07:56:42 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 16 Sep 2015 22:56:42 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F9D7B9.8060109@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de>
Message-ID: <058F4ABF-CCBC-455B-9A37-41742AA35C5E@yahoo.com>

On Sep 16, 2015, at 13:57, Sven R. Kunze <srkunze at mail.de> wrote:

> Sometimes this and sometimes that. That can't be right for something so basic as types.

Types aren't as basic as you think, and assuming they are leads you to design languages like Java, that restrict you to working within unnecessary constraints. For an obvious example, what's the return type of json.loads (or, worse, eval)?

Haskell, Dependent ML, and other languages have made great strides in working out how to get most of the power of a language like Python (and some things Python can't do, too) in a type-driven paradigm, but there's still plenty of research to go. And, even if that were a solved problem, nobody wants to rewrite Python as an ML dialect, and nobody would use it if you did.

Python solves the json.loads problem by saying its runtime type is defined lazily and implicitly by the data. And there's no way any static type checker can possibly infer that type. A good statically-typed language can make it a lot easier to handle than a bad one like Java, but it will be very different from Python.
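
A quick sketch of the point, using only the stdlib json module; the
sample documents are made up for illustration. The same call returns a
different runtime type for each input, and nothing in the call itself
says which one.

```python
import json

documents = ['{"engine": "postgres"}', "[1, 2, 3]", '"text"', "42", "3.5", "true", "null"]

for text in documents:
    value = json.loads(text)
    # The result type is decided by the data, not by the call site.
    print(f"{text!r:28} -> {type(value).__name__}")
```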

> Couldn't these problems not be solved by further research on typecheckers?

I'm not sure which problems you want solved.

If you want every type to be inferable, for a language with a sufficiently powerful type system, that's provably equivalent to the halting problem, so it's not going to happen.

More importantly, we already have languages with a powerful static type system and a great inference engine, and experience with those languages shows that it's often useful to annotate some types for readability that the inference engine could have figured out. If a particular function is more understandable to the reader when it declares its parameter types, I can't imagine what research anyone would do that would cause me to stop wanting to declare those types.

Also, even when you want to rely on inference, you still want the types to have meaningful names that you can read, and could have figured out how to construct on your own, for things like error messages, debuggers, and reflective code. So, the work that Jukka is proposing would still be worth doing even if we had perfect inference.

> Btw. I can tell the same anecdote when switching from C/C++/C#/Java to Python. It was like a liberation---no explicit type declarations anymore. I was baffled and frightened the first week using it. But I love it now and I don't want to give that freedom up. Maybe, that's why I am reluctant to use it in production.

The problem here is that you're coming from C++/C#/Java, which are terrible examples of static typing. Disliking static typing because of Java is like disliking dynamic typing because of Tcl. I won't get into details of why they're so bad, but: if you don't have the time to learn you a Haskell for great good, you can probably at least pick up Boo in an hour or so, to at least see what static typing is like with inference by default and annotations only when needed in a very pythonesque language, and that will give you half the answer.

From abarnert at yahoo.com  Thu Sep 17 08:06:23 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 16 Sep 2015 23:06:23 -0700
Subject: [Python-ideas] Bring line continuation to multi-level
	dictionary lookup
In-Reply-To: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
References: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
Message-ID: <FFA5F32F-65E9-4E55-82B6-1ECC26E6B30D@yahoo.com>

On Sep 16, 2015, at 21:07, John Wong <gokoproject at gmail.com> wrote:
> 
> So here I am, thinking, what if we can do this?
> 
> response(
>     ["DescribeDBSnapshotsResponse"]
>     ["DescribeDBSnapshotsResult"]
> )

This already has a perfectly valid meaning: you have a list of one string, you're indexing it with another string, and passing the result to a function. If this isn't obvious, try this example:

    frobulate(['a', 'e', 'i', 'o', 'u'][vowel])

So, giving it a second meaning would be ambiguous.
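
If it helps, the existing meaning can be checked with the stdlib ast
module. This is just a sketch that inspects how the parser already reads
the proposed spelling; nothing is executed.

```python
import ast

source = 'response(["DescribeDBSnapshotsResponse"]["DescribeDBSnapshotsResult"])'
call = ast.parse(source, mode="eval").body

# The expression is a call whose single argument is a subscript
# applied to a one-element list literal.
argument = call.args[0]
print(type(call).__name__, type(argument).__name__, type(argument.value).__name__)
```

Actually running that expression would fail with a TypeError, because a
list cannot be indexed with a string.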

Also, there's already a perfectly good way to write what you want. (Actually two, because square brackets continue the exact same way parens do, but I wouldn't recommend that here.)

    (response
     ["DescribeDBSnapshotsResponse"]
     ["DescribeDBSnapshotsResult"]
    )

That looks no uglier than your suggestion, and a lot less ugly when buried inside a larger expression.

(I think it might look nicer to indent the bracketed keys, but I think that technically violates PEP 8.)


From ncoghlan at gmail.com  Thu Sep 17 14:35:18 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Sep 2015 22:35:18 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
 <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
Message-ID: <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>

On 17 September 2015 at 04:55, Tim Peters <tim.peters at gmail.com> wrote:
> [Brett Cannon <brett at python.org>]
>> And if yes to a PEP, who's writing it? And then who is writing the
>> implementation in the end?
>
> Did you just volunteer?  Great!  Thanks ;-)  OK, Steven already
> volunteered to write a PEP for his proposal.

As far as implementation goes, based on a separate discussion at
https://github.com/pyca/cryptography/issues/2347, I believe the
essential cases can all be covered by:

    def random_bits(bits):
        return os.urandom(bits//8)

    def random_int(bits):
        return int.from_bytes(random_bits(bits), byteorder="big")

    def random_token(bits):
        return base64.urlsafe_b64encode(random_bits(bits)).decode("ascii")

    def random_hex_digits(bits):
        return binascii.hexlify(random_bits(bits)).decode("ascii")

So if you want a 128 bit (16 bytes) IV, you can just write
"secrets.random_bits(128)". Examples of all four in action:

>>> random_bits(256)
b'\xacc\xa6I[\x9c\xca\x86=B$\xd0\xbc\xee\x8a\xe3i\xe9\xb2\xf4w\xd4@\xc2{U\xb5\xb0\xac\x82\x8a='
>>> random_int(bits=256)
44147786895503064021838366541869866305141442570318401936078951782072369110412
>>> random_token(bits=256)
'-woFuniDCsApOFMtRP5vtjfPfFkmvVhdaPoh9eqAuSs='
>>> random_hex_digits(bits=256)
'e5b09c74bda516ca8464f38dc45428004b6bd81d4e4031fdf9f164e567fbed82'

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tim.peters at gmail.com  Thu Sep 17 17:11:44 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 17 Sep 2015 10:11:44 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
 <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
 <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>
Message-ID: <CAExdVNkY6Yvj=pvMnTPo8CjsVdoab7RD5YCBRALDKAxbF_=fiA@mail.gmail.com>

[Nick Coghlan <ncoghlan at gmail.com>]
> As far as implementation goes, based on a separate discussion at
> https://github.com/pyca/cryptography/issues/2347, I believe the
> essential cases can all be covered by:
>
>     def random_bits(bits):
>         return os.urandom(bits//8)
>
>     def random_int(bits):
>         return int.from_bytes(random_bits(bits), byteorder="big")
>
>     def random_token(bits):
>         return base64.urlsafe_b64encode(random_bits(bits)).decode("ascii")
>
>     def random_hex_digits(bits):
>         return binascii.hexlify(random_bits(bits)).decode("ascii")
>
> So if you want a 128 bit (16 bytes) IV, you can just write
> "secrets.random_bits(128)". Examples of all four in action:
>
> ...

Probably better to wait until Steven starts a new thread about his PEP
(nobody is ever gonna look at _this_ thread again ;-) ).

Just two things to note:

1. Whatever task-appropriate higher-level functions people want, as
you've shown "secure" implementations are easy to write for someone
who knows what's available to build on.  It will take 10000 times
longer for people to bikeshed what "secrets" should offer than to
implement it ;-)

2. I'd personally be surprised if a function taking a "number of bits"
argument silently replaced argument `bits` with `bits - bits % 8`.  If
the app-level programmers at issue can't think in terms of bytes
instead (and use functions with a `bytes` argument), then, e.g.,
better to raise an exception if `bits % 8 != 0` to begin with.  Or to
round up, taking "bits" as meaning "a number of bytes covering _at
least_ the number of bits asked for".
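
The two alternatives might be sketched as follows. The function names
are placeholders invented here, not a proposed API, and rejecting
negative values is an added assumption.

```python
import os


def random_bits_strict(bits):
    """Return bits // 8 random bytes, rejecting requests that are not whole bytes."""
    if bits < 0 or bits % 8 != 0:
        raise ValueError(f"bits must be a non-negative multiple of 8, got {bits}")
    return os.urandom(bits // 8)


def random_bits_at_least(bits):
    """Return the fewest random bytes that cover at least `bits` bits."""
    if bits < 0:
        raise ValueError(f"bits must be non-negative, got {bits}")
    return os.urandom((bits + 7) // 8)


print(len(random_bits_strict(128)), len(random_bits_at_least(130)))
```

Either way the caller is never silently handed fewer bits than requested.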

From ncoghlan at gmail.com  Thu Sep 17 18:36:15 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 18 Sep 2015 02:36:15 +1000
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVNkY6Yvj=pvMnTPo8CjsVdoab7RD5YCBRALDKAxbF_=fiA@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
 <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
 <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>
 <CAExdVNkY6Yvj=pvMnTPo8CjsVdoab7RD5YCBRALDKAxbF_=fiA@mail.gmail.com>
Message-ID: <CADiSq7cEjNF02UJ7TBVtKWmBG8PrdvQkjyw7uangKzmToJ-6Ww@mail.gmail.com>

On 18 September 2015 at 01:11, Tim Peters <tim.peters at gmail.com> wrote:
> Just two things to note:
>
> 1. Whatever task-appropriate higher-level functions people want, as
> you've shown "secure" implementations are easy to write for someone
> who knows what's available to build on.  It will take 10000 times
> longer for people to bikeshed what "secrets" should offer than to
> implement it ;-)

Agreed, although the 4 I listed are fairly well-credentialed - the
implementations of the first two (raw bytes and integers) are the
patterns cryptography.io uses, the token generator is comparable to
the Django one (with a couple of extra punctuation characters in the
alphabet), and the hex digit generator is the Pyramid one.

You can get more exotic with full arbitrary alphabet password and
passphrase generators, but I think we're getting beyond stdlib level
functionality at that point - it's getting into the realm of password
managers and attack software.

> 2. I'd personally be surprised if a function taking a "number of bits"
> argument silently replaced argument `bits` with `bits - bits % 8`.  If
> the app-level programmers at issue can't think in terms of bytes
> instead (and use functions with a `bytes` argument), then, e.g.,
> better to raise an exception if `bits % 8 != 0` to begin with.  Or to
> round up, taking "bits" as meaning "a number of bytes covering _at
> least_ the number of bits asked for".

Yeah, I took a shortcut to keep them all as pretty one liners. A
proper rand_bits with that API would look something like:

    def rand_bits(bits):
        num_bytes, add_byte = divmod(bits, 8)
        if add_byte:
            num_bytes += 1
        return os.urandom(bits)

Compared to the os.urandom() call itself, the bits -> bytes
calculation should disappear into the noise from a speed perspective
(and a JIT compiled runtime like PyPy could likely optimise it away
entirely).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From random832 at fastmail.com  Thu Sep 17 19:08:37 2015
From: random832 at fastmail.com (Random832)
Date: Thu, 17 Sep 2015 13:08:37 -0400
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7cEjNF02UJ7TBVtKWmBG8PrdvQkjyw7uangKzmToJ-6Ww@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
 <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
 <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>
 <CAExdVNkY6Yvj=pvMnTPo8CjsVdoab7RD5YCBRALDKAxbF_=fiA@mail.gmail.com>
 <CADiSq7cEjNF02UJ7TBVtKWmBG8PrdvQkjyw7uangKzmToJ-6Ww@mail.gmail.com>
Message-ID: <1442509717.2145449.386496081.2D5AF5B7@webmail.messagingengine.com>

On Thu, Sep 17, 2015, at 12:36, Nick Coghlan wrote:
> You can get more exotic with full arbitrary alphabet password and
> passphrase generators, but I think we're getting beyond stdlib level
> functionality at that point - it's getting into the realm of password
> managers and attack software.

I think it's important to at least have a way to get a random number in
a range that isn't a power of two, since that's so easy to get wrong.
Even the libc arc4random API has that in arc4random_uniform.

At that point people can build their own arbitrary alphabet password
generators as one-liners.
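
For illustration, both pieces can already be spelled with
random.SystemRandom, which draws from os.urandom(). The helper names
random_below and random_password are made up for this sketch, as is the
default alphabet.

```python
import random
import string

_sysrand = random.SystemRandom()  # backed by os.urandom()


def random_below(n):
    """Return a uniformly distributed integer in range(n), for any positive n."""
    return _sysrand.randrange(n)


def random_password(length, alphabet=string.ascii_letters + string.digits):
    """Build a password by picking each character independently from `alphabet`."""
    return "".join(_sysrand.choice(alphabet) for _ in range(length))


print(random_below(10), random_password(12))
```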

From tim.peters at gmail.com  Thu Sep 17 20:07:28 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 17 Sep 2015 13:07:28 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CADiSq7cEjNF02UJ7TBVtKWmBG8PrdvQkjyw7uangKzmToJ-6Ww@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
 <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
 <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>
 <CAExdVNkY6Yvj=pvMnTPo8CjsVdoab7RD5YCBRALDKAxbF_=fiA@mail.gmail.com>
 <CADiSq7cEjNF02UJ7TBVtKWmBG8PrdvQkjyw7uangKzmToJ-6Ww@mail.gmail.com>
Message-ID: <CAExdVN=85s4gMcnorRpZ0Dy0YSEZyd1ZuAYDax-C44iD+FyAtQ@mail.gmail.com>

[Tim]
>> Just two things to note:
>>
>> 1. Whatever task-appropriate higher-level functions people want, as
>> you've shown "secure" implementations are easy to write for someone
>> who knows what's available to build on.  It will take 10000 times
>> longer for people to bikeshed what "secrets" should offer than to
>> implement it ;-)

[Nick Coghlan <ncoghlan at gmail.com>]
> Agreed, although the 4 I listed are fairly well-credentialed - the
> implementations of the first two (raw bytes and integers) are the
> patterns cryptography.io uses, the token generator is comparable to
> the Django one (with a couple of extra punctuation characters in the
> alphabet), and the hex digit generator is the Pyramid one.

I will immodestly claim that nobody needs to be a crypto-wonk to see
that these implementations are exactly as secure (or insecure) as the
platform urandom():  in each case, it's trivial to invert the output
to recover the exact bytes urandom() returned.  So if there's any
attack against the outputs, that's also an attack against what
urandom() returned.  The outputs just spell what urandom returned
using a different alphabet.

For the same reason, e.g., it would be fine to replace each 0 bit in
urandom's result with the string "egg", and each 1 bit with the string
"turtle".  An attack on the output of that is exactly as hard (or
easy) as an attack on the output of urandom.  Obvious, right?  It's
only a little harder to see that the same is true of even the fanciest
of your 4 functions.
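
To spell the argument out, here is a sketch that round-trips each
encoding back to the bytes urandom() returned. It inlines the bodies of
the four functions rather than calling them, and the variable names are
only illustrative.

```python
import base64
import binascii
import os

raw = os.urandom(32)  # what the "raw bits" function would return

as_int = int.from_bytes(raw, byteorder="big")
as_token = base64.urlsafe_b64encode(raw).decode("ascii")
as_hex = binascii.hexlify(raw).decode("ascii")

# Each output can be decoded back to the original bytes.  The integer
# form needs the byte count, since leading zero bytes are not visible.
recovered_from_int = as_int.to_bytes(len(raw), byteorder="big")
recovered_from_token = base64.urlsafe_b64decode(as_token)
recovered_from_hex = binascii.unhexlify(as_hex)

print(recovered_from_int == recovered_from_token == recovered_from_hex == raw)
```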

Where you _may_ get in trouble is creating a non-invertible output.  Like:

    def secure_int(nbytes):
        n = int.from_bytes(os.urandom(nbytes), "big")
        return n - n

That's not likely to be useful ;-)


> You can get more exotic with full arbitrary alphabet password and
> passphrase generators, but I think we're getting beyond stdlib level
> functionality at that point - it's getting into the realm of password
> managers and attack software.

I'll leave that for the discussion of Steven's PEP.  I think he was on
the right track to, e.g., suggest a secure choice() as one of his few
base building blocks.  It _does_ take some expertise to implement a
secure choice() correctly, but not so much from the crypto view as
from the free-from-statistical-bias view.  SystemRandom.choice()
already gets both right.
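
A sketch of where the bias comes from, and one standard way around it
(rejection sampling). This is not how SystemRandom is necessarily
implemented, and secure_below / secure_choice are names invented for the
example; in real code SystemRandom.choice() is the thing to use.

```python
import os
from collections import Counter

# Reducing one random byte modulo 6 is biased: 256 is not a multiple of 6,
# so residues 0-3 each have 43 preimages while 4 and 5 have only 42.
modulo_counts = Counter(byte % 6 for byte in range(256))


def secure_below(n):
    """Return a uniform integer in range(n), discarding out-of-range draws."""
    if n <= 0:
        raise ValueError("n must be positive")
    nbits = (n - 1).bit_length()
    nbytes = (nbits + 7) // 8
    while True:
        # Keep only the top nbits bits so every candidate is equally likely.
        candidate = int.from_bytes(os.urandom(nbytes), "big") >> (nbytes * 8 - nbits)
        if candidate < n:
            return candidate


def secure_choice(seq):
    """Pick one element of a non-empty sequence without modulo bias."""
    return seq[secure_below(len(seq))]


print(modulo_counts[0], modulo_counts[5], secure_choice("abcdef"))
```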


>> 2. I'd personally be surprised if a function taking a "number of bits"
>> argument silently replaced argument `bits` with `bits - bits % 8`.  If
>> the app-level programmers at issue can't think in terms of bytes
>> instead (and use functions with a `bytes` argument), then, e.g.,
>> better to raise an exception if `bits % 8 != 0` to begin with.  Or to
>> round up, taking "bits" as meaning "a number of bytes covering _at
>> least_ the number of bits asked for".

> Yeah, I took a shortcut to keep them all as pretty one liners. A
> proper rand_bits with that API would look something like:
>
>     def rand_bits(bits):
>         num_bytes, add_byte = divmod(bits, 8)
>         if add_byte:
>             num_bytes += 1
>         return os.urandom(bits)

You should really be calling that with "num_bytes" now ;-)
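
For concreteness, a sketch of the quoted helper with that fix applied, assuming the intended split is into whole bytes plus leftover bits:

```python
import os

def rand_bits(bits):
    """Return enough random bytes to cover at least `bits` bits."""
    num_bytes, extra_bits = divmod(bits, 8)
    if extra_bits:
        num_bytes += 1  # round up to a whole byte
    return os.urandom(num_bytes)
```

So rand_bits(9) hands back 2 bytes, i.e. "at least" the 9 bits asked for.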


> Compared to the os.urandom() call itself, the bits -> bytes
> calculation should disappear into the noise from a speed perspective
> (and a JIT compiled runtime like PyPy could likely optimise it away
> entirely).

Goodness - "premature optimization" already?! ;-)  Fastest in pure
Python is likely

    num_bytes = (bits + 7) >> 3

But if I were bikeshedding I'd question why the function weren't:

    def rand_bytes(nbytes):
        return os.urandom(nbytes)

instead.  A rand_bits(nbits) that meant what it said would likely also
be useful:

From srkunze at mail.de  Thu Sep 17 23:13:48 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 17 Sep 2015 23:13:48 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <87wpvpx3d6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>	<55F0AC83.3050505@mail.de>	<CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>	<55F1B306.5070705@mail.de>	<05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>	<55F9D7B9.8060109@mail.de>
 <87wpvpx3d6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <55FB2D0C.2020008@mail.de>

On 17.09.2015 03:04, Stephen J. Turnbull wrote:
> Sven R. Kunze writes:
>
>   > Do you know what makes me sad? If you do that for this function but
>   > don't do it for another what is the guideline then? Python Zen tells us
>   > to have one obvious way to do sth. At least for me, it's not obvious
>   > anymore when to annotate and when not to annotate. Just a random guess
>   > depending on the moon phase? :(
>
> No.  There's a simple rule: if it's obvious to you that type
> annotation is useful, do it.  If it's not obvious you want it, you
> don't, and you don't do it.  You obviously are unlikely to do it for
> some time, if ever.  Me too.

I was talking about specific examples (functions and methods). You were 
talking about the concept as a whole if I am not completely mistaken.

From rymg19 at gmail.com  Thu Sep 17 23:19:30 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Thu, 17 Sep 2015 16:19:30 -0500
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55F9D42C.1080208@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
 <55F9D42C.1080208@mail.de>
Message-ID: <881AE78F-DCDB-4085-A298-28400E880A50@gmail.com>



On September 16, 2015 3:42:20 PM CDT, "Sven R. Kunze" <srkunze at mail.de> wrote:
>On 11.09.2015 08:24, Jukka Lehtosalo wrote:
>> On Thu, Sep 10, 2015 at 9:42 AM, Sven R. Kunze <srkunze at mail.de 
>> <mailto:srkunze at mail.de>> wrote:
>>
>>     If my variables have crappy names, so I need to add type hints to
>>     them, well, then, I rather fix them first.
>>
>>
>> Even good variable names can leave the type ambiguous.
>
>Try harder then.

def process_integer_coordinate_tuples(integer_tuple_1, integer_tuple_2, is_fast): ...

vs

def process_coords(t1: Tuple[int, int], t2: Tuple[int, int], fast: bool): ...

Java's fatal mistake.

>
>> And besides, if you assume that all code is perfect or can be made 
>> perfect I think that you've already lost the discussion. Reality 
>> disagrees with you. ;-)
>
>Not sure where I said this.
>
>> You can't just wave a magic wand and get every programmer to
>> document their code and write unit tests. However, we know quite well
>> that programmers are perfectly capable of writing type annotations,
>> and tools can even enforce that they are present (witness all the
>> Java code in existence).
>
>You can't just wave a magic wand and get every programmer to add type
>annotations to their code. However, we know quite well that programmers
>are perfectly capable of writing unit tests, and tools can even enforce
>that they are present (witness coverage tools and hooks in SCM systems
>preventing it from dropping).
>
>[ Interesting, that it was that easy to exchange the parts you've given
>me ;) ]
>
>Btw. have you heard of code review?
>
>> Tools can't verify that you have good variable names or useful 
>> docstrings, and people are too inconsistent or lazy to be relied on.
>
>Same can be said for type annotations.
>
>> In a cost/benefit analysis it may be optimal to spend half the
>> available time on annotating parts of the code base to get some (but
>> necessarily limited) static checking coverage and spend the remaining
>> half on writing tests for selected parts of the code base, for
>> example. It's not all or nothing.
>
>I would like to peer-review that cost/benefit analysis you've made to 
>see whether your numbers are sane.
>
>>
>>>     You get extra credit if your tests are slow to run and flaky,
>>
>>     We are problem solvers. So, I would tell my team: "make them
>>     faster and more reliable".
>>
>>
>> But you'd probably also ask them to implement new features (or *your*
>> manager might be unhappy), and they have to find the right balance, as
>> they only have 40 hours a week (or maybe 80 hours if you work at an
>> early-stage startup :-). Having more tools gives you more options for
>> spending your time efficiently.
>
>Yes, I am going to tell him: "Hey, it doesn't work but we got all/most 
>of the types right."
>
>>
>>     Granted. But you still don't know if your code runs correctly. You
>>     are better off with tests. And I agree type checking is 1 test to
>>     perform (out of 10K).
>>
>>
>> Actually a type checker can verify multiple properties of a typical
>> line of code. So for 10k lines of code, complete type checking
>> coverage would give you the equivalent of maybe 30,000 (simple)
>> tests. :-P
>
>I think you should be more specific on this.
>
>Using hypothesis, e.g., you can easily increase the number of simple 
>tests as well.
>
>What I can tell is that most of the time, a variable carries the same 
>type. It is really convenient that it doesn't have to but most of the 
>time it does. Thus, one test run can probably reveal a dangerous type 
>mistake. I've seen code where that is indeed not the case and one
>variable is either re-used or accidentally has different types. But,
>well, you'd better stay away from it anyway because most of the time it's
>very old code.
>
>Moreover, in order to add *reasonable* type annotations you would
>probably invest the same amount of time that you would invest to write
>some tests for it. The majority of time is about *understanding* the code.
>And there, better variable names help a lot.
>
>> It's often not cost effective to have good test coverage (and even
>> 100% line coverage doesn't give you full coverage of all
>> interactions). Testing can't prove that your code doesn't have defects
>> -- it just proves that for a tiny subset of possible inputs your code
>> works as expected. A type checker may be able to prove that for *all*
>> possible inputs your code doesn't do certain bad things, but it can't
>> prove that it does the good things. Neither subsumes the other, and
>> both of these approaches are useful and complementary (but
>> incomplete).
>
>I fully agree on this. Yet I don't need type annotations. ;) A simple 
>test running a typechecker working at 40%-60% (depending on whom you 
>ask) efficiency suffices at least for me.
>
>I would love to see better typecheckers rather than cluttering our code
>with some questionable annotations; btw., I don't know whether they are
>necessary at all.
>
>Don't be fooled by the possibility of dynamic typing in Python. Just 
>because it's possible doesn't necessarily mean it's the usual thing.
>
>> I think that there was a good talk basically about this at PyCon this
>> year, by the way, but I can't remember the title.
>
>It'll be great to have it. :)
>
>Best,
>Sven
>
>

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From srkunze at mail.de  Thu Sep 17 23:24:53 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 17 Sep 2015 23:24:53 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <20150917035915.GL31152@ando.pearwood.info>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de> <20150917035915.GL31152@ando.pearwood.info>
Message-ID: <55FB2FA5.6050501@mail.de>

On 17.09.2015 05:59, Steven D'Aprano wrote:
> People will use the same sort of heuristic for deciding which functions
> get annotated:
>
> - does the function need annotations/documentation/tests?
> - do I have time to write annotations/documentation/tests?
> - is my manager telling me to add annotations/documentation/tests?
> - if I don't, will bad things happen?
> - is it easy or interesting to add them?
> - or difficult and boring?

I fear I am not convinced of that analogy.


Tests and documentation are all or nothing. Either you have them or you 
don't and one is not worthier than another.

Type annotations (as far as I understand them) are basically completing 
a picture of 40%-of-already-inferred types. So, I have difficulties to 
infer which parameters actually would benefit from annotating. I am 
either doing redundant work (because the typechecker is already very 
well aware of the type) or I actually insert explicit knowledge (which 
might become redundant in case typecheckers actually become better).

From srkunze at mail.de  Thu Sep 17 23:42:52 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 17 Sep 2015 23:42:52 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <058F4ABF-CCBC-455B-9A37-41742AA35C5E@yahoo.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de> <058F4ABF-CCBC-455B-9A37-41742AA35C5E@yahoo.com>
Message-ID: <55FB33DC.6080203@mail.de>

On 17.09.2015 07:56, Andrew Barnert wrote:
> I'm not sure which problems you want solved.
>
> If you want every type to be inferable, for a language with a sufficiently powerful type system, that's provably equivalent to the halting problem, so it's not going to happen.

Nobody said it must be perfect. It just needs to be good enough.

> More importantly, we already have languages with a powerful static type system and a great inference engine, and experience with those languages shows that it's often useful to annotate some types for readability that the inference engine could have figured out. If a particular function is more understandable to the reader when it declares its parameter types, I can't imagine what research anyone would do that would cause me to stop wanting to declare those types.

Because it's more code, redundant, needs to be maintained and so on and 
so forth.

> Also, even when you want to rely on inference, you still want the types to have meaningful names that you can read, and could have figured out how to construct on your own, for things like error messages, debuggers, and reflective code. So, the work that Jukka is proposing would still be worth doing even if we had perfect inference.

I totally agree (and I said this before).

Speaking of meaningful names, which name(s) are debuggers supposed to 
show when there is a multitude of protocols that would fit?

>
>> Btw. I can tell the same anecdote when switching from C/C++/C#/Java to Python. It was like a liberation---no explicit type declarations anymore. I was baffled and frightened the first week using it. But I love it now and I don't want to give that freedom up. Maybe, that's why I am reluctant to use it in production.
> The problem here is that you're coming from C++/C#/Java, which are terrible examples of static typing. Disliking static typing because of Java is like disliking dynamic typing because of Tcl. I won't get into details of why they're so bad, but: if you don't have the time to learn you a Haskell for great good, you can probably at least pick up Boo in an hour or so, to at least see what static typing is like with inference by default and annotations only when needed in a very pythonesque language, and that will give you half the answer.

I came across Haskell quite some time ago and I have to admit it feels 
not natural but for other reasons than its typing system and inference.

From srkunze at mail.de  Thu Sep 17 23:45:59 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 17 Sep 2015 23:45:59 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <881AE78F-DCDB-4085-A298-28400E880A50@gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
 <55F9D42C.1080208@mail.de> <881AE78F-DCDB-4085-A298-28400E880A50@gmail.com>
Message-ID: <55FB3497.2040103@mail.de>

On 17.09.2015 23:19, Ryan Gonzalez wrote:
> def process_integer_coordinate_tuples(integer_tuple_1, integer_tuple_2, is_fast): ...
>
> vs
>
> def process_coords(t1: Tuple[int, int], t2: Tuple[int, int], fast: bool): ...
>
> Java's fatal mistake.

Care to elaborate?

From rymg19 at gmail.com  Thu Sep 17 23:56:33 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Thu, 17 Sep 2015 16:56:33 -0500
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55FB3497.2040103@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
 <55F9D42C.1080208@mail.de> <881AE78F-DCDB-4085-A298-28400E880A50@gmail.com>
 <55FB3497.2040103@mail.de>
Message-ID: <CAO41-mPuBr7tgkft_g26XD12xTjtUCPsS_ZLLb3yNXJxDbKVNQ@mail.gmail.com>

Embedding type names in arguments and method names.

On Thu, Sep 17, 2015 at 4:45 PM, Sven R. Kunze <srkunze at mail.de> wrote:

> On 17.09.2015 23:19, Ryan Gonzalez wrote:
>
>> def process_integer_coordinate_tuples(integer_tuple_1, integer_tuple_2,
>> is_fast): ...
>>
>> vs
>>
>> def process_coords(t1: Tuple[int, int], t2: Tuple[int, int], fast: bool):
>> ...
>>
>> Java's fatal mistake.
>>
>
> Care to elaborate?
>

You said:

> Even good variable names can leave the type ambiguous.

These are names that don't leave anything ambiguous! :D

Really, though: relying on naming to make types explicit fails badly
whenever you start refactoring and makes hell for the users of the API you
made.

-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From srkunze at mail.de  Fri Sep 18 00:21:46 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Fri, 18 Sep 2015 00:21:46 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAO41-mPuBr7tgkft_g26XD12xTjtUCPsS_ZLLb3yNXJxDbKVNQ@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
 <55F9D42C.1080208@mail.de> <881AE78F-DCDB-4085-A298-28400E880A50@gmail.com>
 <55FB3497.2040103@mail.de>
 <CAO41-mPuBr7tgkft_g26XD12xTjtUCPsS_ZLLb3yNXJxDbKVNQ@mail.gmail.com>
Message-ID: <55FB3CFA.40707@mail.de>

On 17.09.2015 23:56, Ryan Gonzalez wrote:
> Embedding type names in arguments and method names.
>
> On Thu, Sep 17, 2015 at 4:45 PM, Sven R. Kunze <srkunze at mail.de 
> <mailto:srkunze at mail.de>> wrote:
>
>     On 17.09.2015 23:19, Ryan Gonzalez wrote:
>
>         def process_integer_coordinate_tuples(integer_tuple_1,
>         integer_tuple_2, is_fast): ...
>
>         vs
>
>         def process_coords(t1: Tuple[int, int], t2: Tuple[int, int],
>         fast: bool): ...
>
>         Java's fatal mistake.
>
>
>     Care to elaborate?
>
>

I was actually confused by 'Java' in your reply.

> You said:
>
> > Even good variable names can leave the type ambiguous.
>
> These are names that don't leave anything ambiguous! :D
>

They just do. Because they don't tell me why I would want to call that 
function and with what.

If any of these versions is supposed to represent good style, you still 
need to learn a lot.

> Really, though: relying on naming to make types explicit fails badly 
> whenever you start refactoring and makes hell for the users of the API 
> you made.

Professional refactoring would not change venerable APIs. It would 
provide another version of it and slowly deprecate the old one.

Not sure where you are heading here, but are you saying t1 and t2 are good names? 
Not sure how big the applications you work with are but those I know of 
are very large. So, I am glad when 2000 lines and 10 files later a 
variable somehow tells me something about it. And no "*Tuple[int, int]*" 
doesn't tell me anything (even when an IDE could tell that).

Most of the time when discussing typecheckers and so forth, I get the 
feeling people think most applications are using data structures like 
*tuples of tuples of tuples of ints*. That is definitely not the case 
(anymore). Most of the time the data types are instances, lists of 
instances and dicts of instances.

That's one reason I somehow like Jukka's structural proposal because I 
actually can see some real-world benefit which goes beyond the tuples of 
tuples and that is: *inferring proper names*.

Best,
Sven

From steve at pearwood.info  Fri Sep 18 04:13:39 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 18 Sep 2015 12:13:39 +1000
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <CAO41-mPuBr7tgkft_g26XD12xTjtUCPsS_ZLLb3yNXJxDbKVNQ@mail.gmail.com>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <CAA_f+LzcuThaoipmLPw0iOHX0EwHQLuQPwO1RrATNpEskp3FXg@mail.gmail.com>
 <55F9D42C.1080208@mail.de> <881AE78F-DCDB-4085-A298-28400E880A50@gmail.com>
 <55FB3497.2040103@mail.de>
 <CAO41-mPuBr7tgkft_g26XD12xTjtUCPsS_ZLLb3yNXJxDbKVNQ@mail.gmail.com>
Message-ID: <20150918021339.GM31152@ando.pearwood.info>

On Thu, Sep 17, 2015 at 04:56:33PM -0500, Ryan Gonzalez wrote:

> Embedding type names in arguments and method names.

supposedly being "Java's fatal mistake".

I'm not sure that Java developers commonly make a practice of doing 
that. It would be strange, since Java requires type declarations. I'm 
not really a Java guy, but I think this would be more like what you 
would expect:

public class Example{
   public void processCoords(Point t1, Point t2, boolean fast){
      ...
   }
}

where Point is equivalent to an (int, int) tuple.

You seem to be describing a verbose version of "Apps Hungarian 
Notation". I don't think Hungarian Notation was ever standard practice 
in the Java world, although I did find at least one tutorial (from 1999) 
recommending it:

http://www.developer.com/java/ent/article.php/615891/Applying-Hungarian-Notation-to-Java-programs-Part-1.htm

In any case, I *think* that your intended lesson is that type 
annotations can increase the quality of code even without a type 
checker, as they act as type documentation to the reader.

I agree with that.


-- 
Steve

From steve at pearwood.info  Fri Sep 18 05:00:26 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 18 Sep 2015 13:00:26 +1000
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55FB2FA5.6050501@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de> <20150917035915.GL31152@ando.pearwood.info>
 <55FB2FA5.6050501@mail.de>
Message-ID: <20150918030024.GN31152@ando.pearwood.info>

On Thu, Sep 17, 2015 at 11:24:53PM +0200, Sven R. Kunze wrote:
> On 17.09.2015 05:59, Steven D'Aprano wrote:
> >People will use the same sort of heuristic for deciding which functions
> >get annotated:
> >
> >- does the function need annotations/documentation/tests?
> >- do I have time to write annotations/documentation/tests?
> >- is my manager telling me to add annotations/documentation/tests?
> >- if I don't, will bad things happen?
> >- is it easy or interesting to add them?
> >- or difficult and boring?
> 
> I fear I am not convinced of that analogy.
> 
> 
> Tests and documentation are all or nothing. Either you have them or you 
> don't and one is not worthier than another.

I don't think they are all or nothing. I think it is possible to have 
incomplete documentation and partial test coverage -- it isn't like you 
go from "no documentation at all and zero tests" to "fully documented 
and 100% test coverage" in a single step. Unless you are religiously 
following something like Test Driven Development, where code is always 
written to follow a failed test, there will be times where you have to 
decide between writing new code or improving test coverage.

Other choices may include:

- improve documentation;
- fix bugs;
- run a linter and fix the warnings it generates;

Adding "fix type errors found by the type checker" doesn't fundamentally 
change the nature of the work. You are still deciding what your 
priorities are, according to the needs of the project, your own personal 
preferences, and the instructions of your project manager (if you have 
one).


> Type annotations (as far as I understand them) are basically completing 
> a picture of 40%-of-already-inferred types. 

That's one use-case for them. Another use-case is as documentation:

def agm(x:float, y:float)->float:
    """Return the arithmetic-geometric mean of x and y."""

versus

def agm(x, y):
    """Return the arithmetic-geometric mean of x and y.

    Args:
        x (float): A number.
        y (float): A number.

    Returns:
        float: The agm of the two numbers.
    """



> So, I have difficulties to 
> infer which parameters actually would benefit from annotating. 

The simplest process may be something like this:

- run the type-checker in a mode where it warns about variables 
  with unknown types;
- add just enough annotations so that the warnings go away.

This is, in part, a matter of the quality of your tools. A good type 
checker should be able to tell you where it can, or can't, infer a type.


> I am 
> either doing redundant work (because the typechecker is already very 
> well aware of the type) or I actually insert explicit knowledge (which 
> might become redundant in case typecheckers actually become better).

You make it sound like, alone out of everything else in Python 
programming, once a type annotation is added to a function it is carved 
in stone forever, never to be removed or changed :-)

If you add redundant type annotations, no harm is done. For example:

def spam(n=3):
    return "spam"*n

A decent type-checker should be able to infer that n is an int. What if 
you add a type annotation?

def spam(n:int=3):
    return "spam"*n

Is that really such a big problem that you need to worry about this? I 
don't think so. The choice whether to rigorously stamp out all redundant 
type annotations, or leave them in, is a decision for your project. 
There is no universal right or wrong answer.
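
To make the "no harm is done" point concrete, here is a small sketch. The names spam_plain and spam_annotated are just illustrative renamings of the two versions above, so both can live in one module:

```python
def spam_plain(n=3):
    return "spam" * n

def spam_annotated(n: int = 3):
    return "spam" * n

# The annotation changes nothing at run time; it is only recorded
# on the function object, where tools and readers can look it up.
plain_hints = spam_plain.__annotations__
annotated_hints = spam_annotated.__annotations__
```

Both functions return the same string for the same argument; the only difference is that annotated_hints maps "n" to int while plain_hints is empty.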


-- 
Steve

From stephen at xemacs.org  Fri Sep 18 10:15:11 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 18 Sep 2015 17:15:11 +0900
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55FB2D0C.2020008@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de>
 <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de>
 <87wpvpx3d6.fsf@uwakimon.sk.tsukuba.ac.jp>
 <55FB2D0C.2020008@mail.de>
Message-ID: <87fv2cw3cg.fsf@uwakimon.sk.tsukuba.ac.jp>

Sven R. Kunze writes:
 > On 17.09.2015 03:04, Stephen J. Turnbull wrote:
 > > Sven R. Kunze writes:

 > >   > At least for me, it's not obvious anymore when to annotate
 > >   > and when not to annotate. Just a random guess depending on the
 > >   > moon phase? :(
 > >
 > > No.  There's a simple rule: if it's obvious to you that type
 > > annotation is useful, do it.  If it's not obvious you want it, you
 > > don't, and you don't do it.  You obviously are unlikely to do it for
 > > some time, if ever.  Me too.
 > 
 > I was talking about specific examples (functions and methods). You were 
 > talking about the concept as a whole if I am not completely mistaken.

Nope.  I was talking about each time you write a function.

From gokoproject at gmail.com  Fri Sep 18 16:44:27 2015
From: gokoproject at gmail.com (John Wong)
Date: Fri, 18 Sep 2015 10:44:27 -0400
Subject: [Python-ideas] Bring line continuation to multi-level
	dictionary lookup
In-Reply-To: <FFA5F32F-65E9-4E55-82B6-1ECC26E6B30D@yahoo.com>
References: <CACCLA55V0gcsT60DTM1OOc4UFC_eQcLYAq3or6CKX3mfyQoU6A@mail.gmail.com>
 <FFA5F32F-65E9-4E55-82B6-1ECC26E6B30D@yahoo.com>
Message-ID: <CACCLA55oNCyYQf94z03_4s-9HN9LYon-b8b81eNCfcq-9i-pKQ@mail.gmail.com>

On Thu, Sep 17, 2015 at 2:06 AM, Andrew Barnert <abarnert at yahoo.com> wrote:

> On Sep 16, 2015, at 21:07, John Wong <gokoproject at gmail.com> wrote:
>
> So here I am, thinking, what if we can do this?
>
> response(
>     ["DescribeDBSnapshotsResponse"]
>     ["DescribeDBSnapshotsResult"]
> )
>
>
> This already has a perfectly valid meaning: you have a list of one string,
> you're indexing it with another string, and passing the result to a
> function. If this isn't obvious, try this example:
>
>     frobulate(['a', 'e', 'i', 'o', 'u'][vowel])
>
> So, giving it a second meaning would be ambiguous.
>

Great catch... I did not even consider this. You are right.



> Also, there's already a perfectly good way to write what you want.
> (Actually two, because square brackets continue the exact same way parens
> do, but I wouldn't recommend that here.)
>
>     (response
>      ["DescribeDBSnapshotsResponse"]
>      ["DescribeDBSnapshotsResult"]
>     )
>

Thank you all. I think I should have noticed (response[..]) would work...
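
For the record, a small self-contained sketch of both readings. The nested dict and its "DBSnapshots" key are hypothetical stand-ins for the real response:

```python
# Hypothetical nested response, shaped like the keys quoted above.
response = {
    "DescribeDBSnapshotsResponse": {
        "DescribeDBSnapshotsResult": {"DBSnapshots": []},
    },
}

# Wrapping the whole expression in parentheses lets each subscript
# sit on its own line without backslashes.
result = (response
          ["DescribeDBSnapshotsResponse"]
          ["DescribeDBSnapshotsResult"]
          )

# By contrast, a bracketed list followed by a subscript is a list
# literal being indexed, which is why call-style syntax is ambiguous.
vowel = 2
picked = ['a', 'e', 'i', 'o', 'u'][vowel]
```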


>

From antoine at python.org  Fri Sep 18 17:50:42 2015
From: antoine at python.org (Antoine Pitrou)
Date: Fri, 18 Sep 2015 15:50:42 +0000 (UTC)
Subject: [Python-ideas] PEP 504: Using the system RNG by default
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <etPan.55f85a54.432cb095.6557@Draupnir.home>
 <CAP7+vJK3qVGozGbzVRBh05DqLdwgFJV1mFGts48VyqDso+WuVQ@mail.gmail.com>
 <87pp1jxiwk.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7dwTuAM7FGWqXO4CRJQjOtqsxm5XmPA4eGa_EpMKuUY-A@mail.gmail.com>
 <CACac1F8e5d_b=6bSvX-VvS9Er9-B+RcduOaa9aaJBsbATUBhUQ@mail.gmail.com>
 <CADiSq7eh5+nreJqwbqmjPWvD_FtVGVAjS6zfpg9Eyng2Gp2u=Q@mail.gmail.com>
 <CAP7+vJLHkPXJy6tv0reDB-gR8gpm2YJhMsz0uSGkPKpFjyFPvg@mail.gmail.com>
 <CADiSq7esW+pmYjwvmFiS9qiZKTa4M2qnXTrhYi1Q9bDE4pPhBw@mail.gmail.com>
 <CAExdVNnU18ZC46FRJyQOmfj62kD897R_tf-u1hBfdur0G_m3Fg@mail.gmail.com>
 <CAP1=2W4tjG8Ne1FgpzHrfM2-hKkyAf3fgbLrTkWQB-onyW03AQ@mail.gmail.com>
 <CAExdVN=bjqDCsa3n-qfcqRvaZJG1F4wr8a-Gervy0uKzq2sM=A@mail.gmail.com>
 <CADiSq7eL2fQyovqAam=m1TEK3EWUDGanrtxhLEUDbUjaWcO4bw@mail.gmail.com>
Message-ID: <loom.20150918T173744-626@post.gmane.org>

Nick Coghlan <ncoghlan at ...> writes:
> 
> On 17 September 2015 at 04:55, Tim Peters <tim.peters at ...> wrote:
> > [Brett Cannon <brett at ...>]
> >> And if yes to a PEP, who's writing it? And then who is writing the
> >> implementation in the end?
> >
> > Did you just volunteer?  Great!  Thanks   OK, Steven already
> > volunteered to write a PEP for his proposal.
> 
> As far as implementation goes, based on a separate discussion at
> https://github.com/pyca/cryptography/issues/2347, I believe the
> essential cases can all be covered by:
> 
>     def random_bits(bits):
>         return os.urandom(bits//8)
> 
>     def random_int(bits):
>         return int.from_bytes(random_bits(bits), byteorder="big")
> 
>     def random_token(bits):
>         return base64.urlsafe_b64encode(random_bits(bits)).decode("ascii")
> 
>     def random_hex_digits(bits):
>         return binascii.hexlify(random_bits(bits)).decode("ascii")

I think you want a little bit more flexibility than that, because the
allowed characters may depend on the specific protocol (of course,
people can use the hex digits version, but the output is longer).
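
A quick sketch of the length difference, assuming 128 bits of randomness and the same two encodings as in the quoted helpers:

```python
import base64
import binascii
import os

raw = os.urandom(16)  # 128 bits of randomness

hex_form = binascii.hexlify(raw).decode("ascii")
b64_form = base64.urlsafe_b64encode(raw).decode("ascii")

# Hex spends one character per 4 bits; base64 spends one per 6 bits
# (plus "=" padding), so the hex spelling is the longer one.
print(len(hex_form), len(b64_form))  # prints "32 24"
```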

(quite a good idea, that "secrets" library - I wonder why nobody proposed
it before ;-))

Regards

Antoine.



From srkunze at mail.de  Fri Sep 18 19:35:17 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Fri, 18 Sep 2015 19:35:17 +0200
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <20150918030024.GN31152@ando.pearwood.info>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de> <20150917035915.GL31152@ando.pearwood.info>
 <55FB2FA5.6050501@mail.de> <20150918030024.GN31152@ando.pearwood.info>
Message-ID: <55FC4B55.1050908@mail.de>

On 18.09.2015 05:00, Steven D'Aprano wrote:
> I don't think they are all or nothing. I think it is possible to have
> incomplete documentation and partial test coverage -- it isn't like you
> go from "no documentation at all and zero tests" to "fully documented
> and 100% test coverage" in a single step.

This was a misunderstanding. The "all or nothing" wasn't about "test 
everything or don't do it at all". It was about the robustness of future 
benefits you gain from it. Either you have a test or you don't.

With type annotations you have 40% or 60% *depending* on the quality of 
the tool you use. It's fuzzy. I don't like to build stuff on jello. Just 
my personal feeling here.

> That's one use-case for them. Another use-case is as documentation:
>
> def agm(x:float, y:float)->float:
>      """Return the arithmetic-geometric mean of x and y."""
>
> versus
>
> def agm(x, y):
>      """Return the arithmetic-geometric mean of x and y.
>
>      Args:
>          x (float): A number.
>          y (float): A number.
>
>      Returns:
>          float: The agm of the two numbers.
>      """

The type annotation explains nothing. The short doc-string 
"arithmetic-geometric mean" explains everything (or prepares you to 
google it). So, I would prefer this one:

def agm(x, y):
     """Return the arithmetic-geometric mean of x and y."""


>> So, I have difficulties to
>> infer which parameters actually would benefit from annotating.
> The simplest process may be something like this:
>
> - run the type-checker in a mode where it warns about variables
>    with unknown types;
> - add just enough annotations so that the warnings go away.
>
> This is, in part, a matter of the quality of your tools. A good type
> checker should be able to tell you where it can, or can't, infer a type.

You see? Depending on who runs which tools, type annotations need to be 
added which are redundant for one tool and not for another and vice 
versa. (Yes, we allow that because we grant the liberty to our devs to 
use the tools they perform best with.)

Coverage, on the other hand, is strict. Either you traverse that line of 
code or you don't (assuming no bugs in the coverage tools).

>> I am
>> either doing redundant work (because the typechecker is already very
>> well aware of the type) or I actually insert explicit knowledge (which
>> might become redundant in case typecheckers actually become better).
> You make it sound like, alone out of everything else in Python
> programming, once a type annotation is added to a function it is carved
> in stone forever, never to be removed or changed :-)

Let me reformulate my point: it's not about setting things in stone. 
It's about having more to read/process mentally. You might think, 'nah, 
he's exaggerating; it's just one tiny little ": int" more here and 
there', but these things build up slowly over time, due to missing clear 
guidelines (see the fuzziness I described above). Devs will simply add 
them just everywhere just to make sure OR ignore the whole concept 
completely.

It's simply not good enough. :(


Nevertheless, I like the protocol idea more as it introduces actual 
names to be exposed by IDEs without any work from the devs. That's great!


You might further think, 'you're so lazy, Sven. First, you don't want to 
help the type checker but you still want to use it?' Yes, I am lazy! And 
I already benefit from it when using PyCharm. It might not be perfect 
but it still amazes me again and again what it can infer without any 
type annotations present.

> def spam(n=3):
>      return "spam"*n
>
> A decent type-checker should be able to infer that n is an int. What if
> you add a type annotation?
>
> def spam(n:int=3):
>      return "spam"*n

There's nothing seriously wrong with it (except what I described above). 
However, these examples (this one in particular) are not, and should not 
be, real-world code. The function name is not helpful, the parameter name 
is not helpful, the functionality is a toy.

My observation so far:

1) Type checking illustrates its point well when using academic 
examples, such as the tuples-of-tuples-of-tuples-of-ints I described 
somewhere else on this thread or unreasonably short toy examples.

(This might be domain specific; I can witness it for business 
applications and web applications none of which actually need to solve 
hard problems admittedly.)

2) Just using constant and sane types like a class, lists of 
single-class instances and dicts of single-class instances for a single 
variable enables you to assign a proper name to it and forces you to 
design a reasonable architecture of your functionality by keeping the 
level of nesting at 0 or 1 and split out pieces into separate code blocks.

Best,
Sven

From mehaase at gmail.com  Fri Sep 18 19:42:59 2015
From: mehaase at gmail.com (Mark Haase)
Date: Fri, 18 Sep 2015 10:42:59 -0700 (PDT)
Subject: [Python-ideas] Null coalescing operators
Message-ID: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>

StackOverflow has many questions 
<http://stackoverflow.com/search?q=%5Bpython%5D+null+coalesce> on the topic 
of null coalescing operators in Python, but I can't find any discussions of 
them on this list or in any of the PEPs. Has the addition of null 
coalescing operators into Python ever been discussed publicly?

Python has an "or" operator that can be used to coalesce false-y values, 
but it does not have an operator to coalesce "None" exclusively.
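
To make the difference concrete, here is a small sketch. The names
`DEFAULT_TIMEOUT`, `with_or` and `with_none_check` are made up for
illustration only. It shows that "or" replaces every false-y value, while an
explicit "is None" test replaces None and nothing else:

```python
DEFAULT_TIMEOUT = 30  # hypothetical default, for illustration only


def with_or(timeout):
    # "or" falls back for any false-y value: None, 0, "", [], ...
    return timeout or DEFAULT_TIMEOUT


def with_none_check(timeout):
    # Only None triggers the fallback; 0 is kept as a legitimate value.
    return DEFAULT_TIMEOUT if timeout is None else timeout


print(with_or(0), with_none_check(0))        # 30 0
print(with_or(None), with_none_check(None))  # 30 30
```

A "??" operator would behave like `with_none_check`, not like `with_or`.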

C# has nice operators for handling null: "??" (null coalesce), "?." 
(null-aware member access), and "?[]" (null-aware index access). They are 
concise and easy to reason about. I think these would be a great addition 
to Python.

As a motivating example: when writing web services, I often want to change 
the representation of a non-None value but also need to handle None 
gracefully. I write code like this frequently: 

    response = json.dumps({
        'created': created.isoformat() if created is not None else None,
        'updated': updated.isoformat() if updated is not None else None,
        ...
    })

With a null-aware member access operator, I could write this instead:

    response = json.dumps({
        'created': created?.isoformat(),
        'updated': updated?.isoformat(),
        ...
    })

I can implement this behavior myself in pure Python, but it would be (a) 
nice to have it in the standard library, and (b) even nicer to have an 
operator in the language, since terseness is the goal.
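
A pure-Python version might look roughly like the following. This is only a
sketch: `maybe_call` is a hypothetical helper name, not an existing
standard-library function, and the sample values are invented.

```python
from datetime import datetime


def maybe_call(obj, method_name, *args, **kwargs):
    """Call obj.method_name(*args, **kwargs), or return None if obj is None."""
    if obj is None:
        return None
    return getattr(obj, method_name)(*args, **kwargs)


created = datetime(2015, 9, 18, 10, 42, 59)
updated = None

payload = {
    'created': maybe_call(created, 'isoformat'),
    'updated': maybe_call(updated, 'isoformat'),
}
print(payload)  # {'created': '2015-09-18T10:42:59', 'updated': None}
```

Passing the method name as a string is noticeably clumsier than
`created?.isoformat()`, which is the motivation for asking about an operator.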

I assume that this has never been brought up in the past because it's so 
heinously un-Pythonic that you'd have to be a fool to risk the public 
mockery and shunning associated with asking this question. Well, I guess 
I'm that fool: flame away...

Thanks,
Mark

From trent at snakebite.org  Fri Sep 18 20:21:39 2015
From: trent at snakebite.org (Trent Nelson)
Date: Fri, 18 Sep 2015 14:21:39 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
Message-ID: <20150918182138.GA64237@trent.me>

On Fri, Sep 18, 2015 at 10:42:59AM -0700, Mark Haase wrote:
> StackOverflow has many questions
> <http://stackoverflow.com/search?q=%5Bpython%5D+null+coalesce> on the
> topic of null coalescing operators in Python, but I can't find any
> discussions of them on this list or in any of the PEPs. Has the
> addition of null coalescing operators into Python ever been discussed
> publicly?
> 
> Python has an "or" operator that can be used to coalesce false-y
> values, but it does not have an operator to coalesce "None"
> exclusively.

Hmmm, I use this NullObject class when I want to do stuff similar to what
you've described:

class NullObject(object):
    """
    This is a helper class that does its best to pretend to be forgivingly
    null-like.
    >>> n = NullObject()
    >>> n
    None
    >>> n.foo
    None
    >>> n.foo.bar.moo
    None
    >>> n.foo().bar.moo(True).cat().hello(False, abc=123)
    None
    >>> n.hornet(afterburner=True).shotdown(by=n().tomcat)
    None
    >>> n or 1
    1
    >>> str(n)
    ''
    >>> int(n)
    0
    >>> len(n)
    0
    """
    def __getattr__(self, name):
        return self

    def __getitem__(self, item):
        return self

    def __call__(self, *args, **kwds):
        return self

    def __nonzero__(self):
        return False

    def __repr__(self):
        return repr(None)

    def __str__(self):
        return ''

    def __int__(self):
        return 0

    def __len__(self):
        return 0

Source: https://github.com/tpn/tpn/blob/master/lib/tpn/util.py#L1031

Sample use: https://github.com/enversion/enversion/blob/master/lib/evn/change.py#L1300

    class ChangeSet(AbstractChangeSet):
        @property
        def top(self):
            """
            Iff one child change is present, return it.
            Otherwise, return an instance of a NullObject.
            """
            if self.child_count != 1:
                return NullObject()
            else:
                top = None
                for child in self:
                    top = child
                    break
                return top

        @property
        def is_tag_create(self):
            return self.top.is_tag_create

        @property
        def is_tag_remove(self):
            return self.top.is_tag_remove

        @property
        def is_branch_create(self):
            return self.top.is_branch_create

        @property
        def is_branch_remove(self):
            return self.top.is_branch_remove

Having self.top potentially return a NullObject simplifies the code for
the four following properties.

> I can implement this behavior myself in pure Python, but it would be
> (a) nice to have it in the standard library, and (b) even nicer to
> have an operator in the language, since terseness is the goal.
> 

> As a motivating example: when writing web services, I often want to
> change the representation of a non-None value but also need to handle
> None gracefully. I write code like this frequently: 
> 
>     response = json.dumps({ 'created': created.isoformat() if created
>     is not None else None, 'updated': updated.isoformat() if updated
>     is not None else None, ...  })
> 
> With a null-aware member access operator, I could write this instead:
> 
>     response = json.dumps({ 'created': created?.isoformat(),
>     'updated': updated?.isoformat(), ...  })

If you can alter the part that creates `created` or `updated` to return
a NullObject() instead of None when applicable, you could call
`created.isoformat()` without the additional clause.

> Thanks, Mark

    Trent.

From abarnert at yahoo.com  Fri Sep 18 20:57:24 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 11:57:24 -0700
Subject: [Python-ideas] Structural type checking for PEP 484
In-Reply-To: <55FC4B55.1050908@mail.de>
References: <CAP7+vJLLmQ8Ws4fBvzf97WLujxvY9DF+ngUAU68NF6pvTb2ttg@mail.gmail.com>
 <55F0AC83.3050505@mail.de>
 <CAA_f+LyMKuJLHobK_of+Pt2Qpd5AhvvX839RekRfFdv35TJ-tg@mail.gmail.com>
 <55F1B306.5070705@mail.de> <05084F79-C505-4A27-9F08-DA98D4B19963@yahoo.com>
 <55F9D7B9.8060109@mail.de> <20150917035915.GL31152@ando.pearwood.info>
 <55FB2FA5.6050501@mail.de> <20150918030024.GN31152@ando.pearwood.info>
 <55FC4B55.1050908@mail.de>
Message-ID: <E2FFB303-4427-4019-96C1-27D4EB784FC2@yahoo.com>

On Sep 18, 2015, at 10:35, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 18.09.2015 05:00, Steven D'Aprano wrote:
>> I don't think they are all or nothing. I think it is possible to have 
>> incomplete documentation and partial test coverage -- it isn't like you 
>> go from "no documentation at all and zero tests" to "fully documented 
>> and 100% test coverage" in a single step.
> 
> This was a misunderstanding. The "all or nothing" wasn't about "test everything or don't do it at all". It was about the robustness of future benefits you gain from it. Either you have a test or you don't.
> 
> With type annotations you have 40% or 60% depending on the quality of the tool you use. It's fuzzy. I don't like to build stuff on jello. Just my personal feeling here.

Surely gaining 40% or gaining 60% is better than gaining 0%?

At any rate, if you're really concerned with this, there is research you might be interested in. The first static typer that I'm aware of that used a "fallback to any" rule like MyPy was for an ML language, and it used unsafety marking: any time it falls back to any, it marks the code unsafe, and that propagates in the obvious way. At the end of the typer run, it can tell you which parts of your program are type safe and which aren't. (It can also refactor the type safe parts into separate modules, which are then reusable in other programs, with well-defined type-safe APIs.) This sounds really nifty, and is fun to play with, but I don't think people found it useful in practice. (This is not the same as the explicit Unsafe type found in most SML descendants, where it's used explicitly, to mark FFIs and access to internal structures, which definitely is useful--although of course it's not completely unrelated.)

I think someone could pretty easily write something similar around PEP 484, and then display the results in a way similar to a code coverage map. If people found it useful, that would become a quality of implementation issue for static typers, IDEs, etc. to compete on, and might be worth adding as a required feature to some future update to the standard; if not, it would just be a checklist on some typer's feature list that would eventually stop being worth maintaining.
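
As a very rough illustration of what such a "coverage map" could start from,
here is a sketch that only counts explicit parameter annotations with the
stdlib `ast` module. It is a much cruder measure than the unsafety
propagation described above, since it knows nothing about what a checker can
infer; `annotation_report` and `SAMPLE` are hypothetical names invented for
this example.

```python
import ast


def annotation_report(source):
    """Yield (function name, annotated params, total params) per function."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional-or-keyword and keyword-only parameters.
            params = node.args.args + node.args.kwonlyargs
            annotated = sum(1 for p in params if p.annotation is not None)
            yield node.name, annotated, len(params)


SAMPLE = '''
def spam(n: int = 3):
    return "spam" * n

def agm(x, y):
    return (x + y) / 2
'''

for name, annotated, total in annotation_report(SAMPLE):
    print(f"{name}: {annotated}/{total} parameters annotated")
```

A real tool would take its data from the type checker itself (which
expressions fell back to Any) rather than from the presence of annotations.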

Would that solve your "40% problem" to your satisfaction?

>> That's one use-case for them. Another use-case is as documentation:
>> 
>> def agm(x:float, y:float)->float:
>>     """Return the arithmetic-geometric mean of x and y."""
>> 
>> versus
>> 
>> def agm(x, y):
>>     """Return the arithmetic-geometric mean of x and y.
>> 
>>     Args:
>>         x (float): A number.
>>         y (float): A number.
>> 
>>     Returns:
>>         float: The agm of the two numbers.
>>     """
> 
> The type annotation explains nothing. The short doc-string "arithmetic-geometric mean" explains everything (or prepares you to google it). So, I would prefer this one:
> 
> def agm(x, y):
>     """Return the arithmetic-geometric mean of x and y."""

What happens if I call your version with complex numbers? High-precision Decimal objects? NumPy arrays of floats?

I know that Steven's version wasn't expecting any of those, and will probably do the wrong thing with them (including something bad like silently throwing away Decimal precision or improperly extending to the complex plane). With yours, I don't know that. I may not even notice that there's a problem and just call it and get a bug months later. Even if I do notice the question, I have to read through your implementation and/or your test suite to find out if you'd considered the case, or write my own tests to find out empirically.
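
To show what "silently throwing away Decimal precision" can look like, here
is a sketch. Neither post gives a body for agm, so the implementation below
(including the up-front float() coercion and the iteration cap) is purely an
assumption made for illustration.

```python
import math
from decimal import Decimal


def agm(x: float, y: float) -> float:
    """Return the arithmetic-geometric mean of x and y."""
    a, g = float(x), float(y)   # assumed: coerce to float up front
    for _ in range(100):        # the iteration converges in a handful of steps
        if math.isclose(a, g, rel_tol=1e-14):
            break
        a, g = (a + g) / 2, math.sqrt(a * g)
    return a


# Decimal inputs are accepted without complaint, but the result is a plain
# float, so any extra precision the caller had is gone.
result = agm(Decimal('1'), Decimal('2'))
print(type(result).__name__, result)
```

With the annotations, a reader at least knows that only floats were
considered; without them, the Decimal case looks just as supported as any
other.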

And that's exactly what I meant earlier by annotations sometimes being useful for human readers whether or not they're useful to the checker.

>>> So, I have difficulties to 
>>> infer which parameters actually would benefit from annotating. 
>> The simplest process may be something like this:
>> 
>> - run the type-checker in a mode where it warns about variables 
>>   with unknown types;
>> - add just enough annotations so that the warnings go away.
>> 
>> This is, in part, a matter of the quality of your tools. A good type 
>> checker should be able to tell you where it can, or can't, infer a type.
> 
> You see? Depending on who runs which tools, type annotations need to be added which are redundant for one tool and not for another and vice versa. (Yes, we allow that because we grant the liberty to our devs to use the tools they perform best with.)

The only way to avoid that is to define the type system completely and then define the inference engine as part of the language spec. The static type system is inherently an approximation of the much more powerful partly-implicit dynamic type system; not allowing it to act as an approximation would mean severely weakening Python's dynamic type system, which would mean severely weakening what you can write in Python. That's a terrible idea. Something like PEP 484 and an ecosystem of competing checkers is the only possibly useful thing that could be added to Python. If you disagree, nothing that could be feasibly added to Python will ever be useful to you, so you should resign yourself to never using static type checking (which you're allowed to do, of course).

> Coverage, on the other hand, is strict. Either you traverse that line of code or you don't (assuming no bugs in the coverage tools).
> 
>>> I am 
>>> either doing redundant work (because the typechecker is already very 
>>> well aware of the type) or I actually insert explicit knowledge (which 
>>> might become redundant in case typecheckers actually become better).
>> You make it sound like, alone out of everything else in Python 
>> programming, once a type annotation is added to a function it is carved 
>> in stone forever, never to be removed or changed :-)
> 
> Let me reformulate my point: it's not about setting things in stone. It's about having more to read/process mentally. You might think, 'nah, he's exaggerating; it's just one tiny little ": int" more here and there', but these things build up slowly over time, due to missing clear guidelines (see the fuzziness I described above). Devs will simply add them just everywhere just to make sure OR ignore the whole concept completely.
> 
> It's simply not good enough. :(
> 
> 
> Nevertheless, I like the protocol idea more as it introduces actual names to be exposed by IDEs without any work from the devs. That's great!
> 
> 
> You might further think, 'you're so lazy, Sven. First, you don't want to help the type checker but you still want to use it?' Yes, I am lazy! And I already benefit from it when using PyCharm. It might not be perfect but it still amazes me again and again what it can infer without any type annotations present.
> 
>> def spam(n=3):
>>     return "spam"*n
>> 
>> A decent type-checker should be able to infer that n is an int. What if 
>> you add a type annotation?
>> 
>> def spam(n:int=3):
>>     return "spam"*n
> 
> There's nothing seriously wrong with it (except what I described above). However, these examples (this one in particular) are not, and should not be, real-world code. The function name is not helpful, the parameter name is not helpful, the functionality is a toy.
> 
> My observation so far:
> 
> 1) Type checking illustrates its point well when using academic examples, such as the tuples-of-tuples-of-tuples-of-ints I described somewhere else on this thread or unreasonably short toy examples.
> 
> (This might be domain specific; I can witness it for business applications and web applications none of which actually need to solve hard problems admittedly.)
> 
> 2) Just using constant and sane types like a class, lists of single-class instances and dicts of single-class instances for a single variable enables you to assign a proper name to it and forces you to design a reasonable architecture of your functionality by keeping the level of nesting at 0 or 1 and split out pieces into separate code blocks.

What you're essentially arguing is that if nobody ever used dynamic types (e.g., types with __getattr__, types constructed at runtime by PyObjC or similar bridges, etc.), or dynamically-typed values (like the result of json.loads), or static types that are hard to express manually (like ADTs or dependent types), we could easily build a static type checker that worked near-perfectly, and then we could define exactly where you do and don't need to annotate types.

That's true, but it effectively means restricting yourself to the Java type system. Which sucks. There are many things that are easy to write readably in Python (or in Haskell) that require ugliness in Java simply because its type system is too weak. Restricting Python (or even idiomatic Python) to the things that could be Java-typed would seriously weaken the language, to the point where I'd rather go find a language that got duck typing right than stick with it.

You could argue that Swift actually does a pretty good job of making 90% of your code just work and making it as non-ugly as possible to force the remaining 10% through escapes in the type system (at least for many kinds of programs). But this actually required a more complicated type system than the one you're suggesting--and, more importantly, it involved explicitly designing the language and the stdlib around that goal. Even the first few public betas didn't work for real programs without a lot of ugliness, requiring drastic changes to the language and stdlib to make it usable. Imagine how much would have to change about a language that was designed for duck typing and grew organically over two and a half decades. Also, there are many corners of Swift that have inconsistently ad-hoc rules that make it much harder to fit the entire language into your brain than Python, despite the language being about the same size. A language that you developed out of performing a similar process on Python might be a good language, maybe even better than Swift, but it would not be Python, and would not be useful for the same kinds of projects where a language-agnostic programmer would choose Python over other alternatives.

From abarnert at yahoo.com  Fri Sep 18 21:28:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 12:28:05 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150918182138.GA64237@trent.me>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150918182138.GA64237@trent.me>
Message-ID: <DF29B6FB-AAAF-43C8-BB31-C41C8AAD9CC9@yahoo.com>

On Sep 18, 2015, at 11:21, Trent Nelson <trent at snakebite.org> wrote:
> 
>> On Fri, Sep 18, 2015 at 10:42:59AM -0700, Mark Haase wrote:
>> StackOverflow has many questions
>> <http://stackoverflow.com/search?q=%5Bpython%5D+null+coalesce> on the
>> topic of null coalescing operators in Python, but I can't find any
>> discussions of them on this list or in any of the PEPs. Has the
>> addition of null coalescing operators into Python ever been discussed
>> publicly?

I believe it was raised as a side issue during other discussions (conditional expressions, exception-handling expressions, one of the pattern-matching discussions), but I personally can't remember anyone ever writing a serious proposal. I think Armin from PyPy also has a blog post mentioning the idea somewhere, as a spinoff of his arguments against PEP 484 (which turned into a more general "what's wrong with Python's type system and what could be done to fix it"). One last place to look, although it'll be harder to search for, is every time people discuss whether things like dict.get are a wart on the language (because there should be a fully general way to do the equivalent) or a feature (because it's actually only useful in a handful of cases, and it's better to mark them explicitly than to try to generalize).

But my guess is that the discussion hasn't actually been had in sufficient depth to avoid having it here. (Although even if I'm right, that doesn't mean more searching isn't worth doing--to find arguments and counter arguments you may have missed, draw parallels to successes and failures in other languages, etc.) And, even if Guido hates the idea out of hand, or someone comes up with a slam-dunk argument against it, this could turn into one of those cases where it's worth someone gathering all the info and shepherding the discussion just to write a PEP for Guido to reject explicitly.

Personally, for whatever my opinion is worth (not that much), I don't have a good opinion on how it would work in Python without seeing lots of serious examples or trying it out. But I think this would be relatively easy to hack in at the tokenizer level with a quick&dirty import hook. I'll attempt it some time this weekend, in hopes that people can play with the feature. Also, it might be possible to do it less hackily with MacroPy (or it might already be part of MacroPy--often Haoyi's time machine is as good as Guido's).

>> Python has an "or" operator that can be used to coalesce false-y
>> values, but it does not have an operator to coalesce "None"
>> exclusively.
> 
> Hmmm, I use this NullObject class when I want to do stuff similar to what
> you've described:

This is a very Smalltalk-y solution, which isn't a bad thing. I think having a singleton instance of NullObject (like None is a singleton instance of NoneType) so you can use is-tests, etc. might make it better, but that's arguable.
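
A minimal sketch of the singleton variant, written for Python 3. This is an
illustration of the suggestion, not Trent's actual class; the module-level
name `Null` and the repr are assumptions made for the example.

```python
class NullObject:
    """Forgiving null-like object with a single shared instance."""

    _instance = None

    def __new__(cls):
        # Always hand back the same instance, so "x is Null" works.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __getattr__(self, name):
        return self

    def __call__(self, *args, **kwargs):
        return self

    def __bool__(self):  # spelled __nonzero__ in Python 2
        return False

    def __repr__(self):
        return 'Null'


Null = NullObject()

print(NullObject() is Null)           # True
print(Null.foo.bar(1, 2).baz is Null)  # True
```

With a single instance, callers can write `if value is Null:` the same way
they write `if value is None:` today.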

The biggest problem is that you have to write (or wrap) every API to return NullObjects instead of None, and likewise to take NullObjects. (And, if you use a PEP 484 checker, it won't understand that an optional int can hold a NullObject.)

Also, there's no way for NullObject to ensure that spam(NullObject) returns NullObject for any function spam (or, more realistically, for any function except special cases, where it's hard to define what counts as a special case but easy to understand intuitively).

And finally, there's no obvious way to make NullObject raise when you want it to raise. With syntax for nil coalescing, this is easy: ?. returns None for None, while . raises AttributeError. With separate types instead, you're putting the distinction at the point (possibly far away) where the value is produced, rather than the point where it's used.
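
The contrast can be sketched as follows. `none_aware_getattr` is a
hypothetical stand-in for the proposed `?.` operator, and the stripped-down
`NullObject` here only keeps the one method needed for the demonstration.

```python
class NullObject:
    """Stripped-down null object: every attribute access returns itself."""

    def __getattr__(self, name):
        return self


def none_aware_getattr(obj, name):
    """Stand-in for the proposed obj?.name: None in, None out."""
    return None if obj is None else getattr(obj, name)


produced_far_away = NullObject()
# Every attribute access succeeds silently, even a misspelled one, because
# the producer of the value already chose "never raise".
assert produced_far_away.isoformta is produced_far_away

value = None
# With plain None, the consumer chooses per access.
assert none_aware_getattr(value, 'isoformat') is None   # like value?.isoformat
try:
    value.isoformat                                     # plain "." raises
except AttributeError:
    print('plain attribute access on None raises AttributeError')
```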

As a side note, my experience in both Smalltalk and C# is that at some point in a large program, I'm going to end up hackily using a distinction between [nil] and nil somewhere because I needed to distinguish between an optional optional spam that "failed" at the top level vs. one that did so at the bottom level. I like the fact that in Haskell or Swift I can actually distinguish "just nil" from "nil" when I need to but usually don't have to (and the code is briefer when I don't have to), but I don't know whether that's actually essential (the [nil] hack almost always works, and isn't that hard to read if it's used sparingly, which it almost always is).


From guido at python.org  Fri Sep 18 21:45:24 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 Sep 2015 12:45:24 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <DF29B6FB-AAAF-43C8-BB31-C41C8AAD9CC9@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150918182138.GA64237@trent.me>
 <DF29B6FB-AAAF-43C8-BB31-C41C8AAD9CC9@yahoo.com>
Message-ID: <CAP7+vJKwXfL+NKQ=M=dYKjgF=zDF00f3xYvD4+LqPcWF2ujwjw@mail.gmail.com>

FWIW, I generally hate odd punctuation like this (@ notwithstanding) but
I'm not against the idea itself -- maybe a different syntax can be
invented, or maybe I could be persuaded that it's okay.

On Fri, Sep 18, 2015 at 12:28 PM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> On Sep 18, 2015, at 11:21, Trent Nelson <trent at snakebite.org> wrote:
> >
> >> On Fri, Sep 18, 2015 at 10:42:59AM -0700, Mark Haase wrote:
> >> StackOverflow has many questions
> >> <http://stackoverflow.com/search?q=%5Bpython%5D+null+coalesce> on the
> >> topic of null coalescing operators in Python, but I can't find any
> >> discussions of them on this list or in any of the PEPs. Has the
> >> addition of null coalescing operators into Python ever been discussed
> >> publicly?
>
> I believe it was raised as a side issue during other discussions
> (conditional expressions, exception-handling expressions, one of the
> pattern-matching discussions), but I personally can't remember anyone ever
> writing a serious proposal. I think Armin from PyPy also has a blog post
> mentioning the idea somewhere, as a spinoff of his arguments against PEP
> 484 (which turned into a more general "what's wrong with Python's type
> system and what could be done to fix it"). One last place to look, although
> it'll be harder to search for, is every time people discuss whether things
> like dict.get are a wart on the language (because there should be a fully
> general way to do the equivalent) or a feature (because it's actually only
> useful in a handful of cases, and it's better to mark them explicitly than
> to try to generalize).
>
> But my guess is that the discussion hasn't actually been had in sufficient
> depth to avoid having it here. (Although even if I'm right, that doesn't
> mean more searching isn't worth doing--to find arguments and counter
> arguments you may have missed, draw parallels to successes and failures in
> other languages, etc.) And, even if Guido hates the idea out of hand, or
> someone comes up with a slam-dunk argument against it, this could turn into
> one of those cases where it's worth someone gathering all the info and
> shepherding the discussion just to write a PEP for Guido to reject
> explicitly.
>
> Personally, for whatever my opinion is worth (not that much), I don't have
> a good opinion on how it would work in Python without seeing lots of
> serious examples or trying it out. But I think this would be relatively
> easy to hack in at the tokenizer level with a quick&dirty import hook. I'll
> attempt it some time this weekend, in hopes that people can play with the
> feature. Also, it might be possible to do it less hackily with MacroPy (or
> it might already be part of MacroPy--often Haoyi's time machine is as good
> as Guido's).
>
> >> Python has an "or" operator that can be used to coalesce false-y
> >> values, but it does not have an operator to coalesce "None"
> >> exclusively.
> >
> > Hmmm, I use this NullObject class when I want to do stuff similar to what
> > you've described:
>
> This is a very Smalltalk-y solution, which isn't a bad thing. I think
> having a singleton instance of NullObject (like None is a singleton
> instance of NoneType) so you can use is-tests, etc. might make it better,
> but that's arguable.
>
> The biggest problem is that you have to write (or wrap) every API to
> return NullObjects instead of None, and likewise to take NullObjects. (And,
> if you use a PEP 484 checker, it won't understand that an optional int can
> hold a NullObject.)
>
> Also, there's no way for NullObject to ensure that spam(NullObject)
> returns NullObject for any function spam (or, more realistically, for any
> function except special cases, where it's hard to define what counts as a
> special case but easy to understand intuitively).
>
> And finally, there's no obvious way to make NullObject raise when you want
> it to raise. With syntax for nil coalescing, this is easy: ?. returns None
> for None, while . raises AttributeError. With separate types instead,
> you're putting the distinction at the point (possibly far away) where the
> value is produced, rather than the point where it's used.
>
> As a side note, my experience in both Smalltalk and C# is that at some
> point in a large program, I'm going to end up hackily using a distinction
> between [nil] and nil somewhere because I needed to distinguish between an
> optional optional spam that "failed" at the top level vs. one that did so
> at the bottom level. I like the fact that in Haskell or Swift I can
> actually distinguish "just nil" from "nil" when I need to but usually don't
> have to (and the code is briefer when I don't have to), but I don't know
> whether that's actually essential (the [nil] hack almost always works, and
> isn't that hard to read if it's used sparsely, which it almost always is).
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
--Guido van Rossum (python.org/~guido)

From abarnert at yahoo.com  Fri Sep 18 22:59:24 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 13:59:24 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <DF29B6FB-AAAF-43C8-BB31-C41C8AAD9CC9@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150918182138.GA64237@trent.me>
 <DF29B6FB-AAAF-43C8-BB31-C41C8AAD9CC9@yahoo.com>
Message-ID: <DA9D15AA-99C1-4804-84D5-5B16715EB9E4@yahoo.com>

On Sep 18, 2015, at 12:28, Andrew Barnert via Python-ideas <python-ideas at python.org> wrote:
> 
> Personally, for whatever my opinion is worth (not that much), I don't have a good opinion on how it would work in Python without seeing lots of serious examples or trying it out. But I think this would be relatively easy to hack in at the tokenizer level with a quick&dirty import hook. I'll attempt it some time this weekend, in hopes that people can play with the feature. Also, it might be possible to do it less hackily with MacroPy (or it might already be part of MacroPy--often Haoyi's time machine is as good as Guido's).

You can download a quick&dirty hack at https://github.com/abarnert/nonehack

This only handles the simple case of identifier?.attribute; using an arbitrary target on the left side of the . doesn't work, and there are no other none-coalescing forms like ?(...) or ?[...]. (The latter would be easy to add; the former, I don't think so.) But that's enough to handle the examples in the initial email.
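
To give a feel for the kind of source rewrite involved, here is a minimal sketch. It is not the nonehack code: it uses a regular expression rather than the tokenizer, the name rewrite_none_dots is made up for illustration, and it only covers identifier?.attribute (a trailing call is not handled, and text inside string literals would be rewritten too).

```python
import datetime
import re

# Matches `identifier?.attribute`; anything more complex is left alone.
_NONE_DOT = re.compile(r"\b([A-Za-z_]\w*)\?\.([A-Za-z_]\w*)")


def rewrite_none_dots(source):
    """Rewrite each `name?.attr` into an explicit `is None` test."""
    return _NONE_DOT.sub(r"(None if \1 is None else \1.\2)", source)


expr = rewrite_none_dots("created?.year")
print(expr)  # (None if created is None else created.year)

# Evaluate the rewritten expression with and without a value.
print(eval(expr, {"created": datetime.date(2015, 9, 18)}))  # 2015
print(eval(expr, {"created": None}))                        # None
```

A real import hook would apply a rewrite like this to a module's source before compiling it.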

So, feel free to experiment with it, and show off code that proves the usefulness of the feature.

Also, if you can think of a better syntax that will make Guido less sad, but don't know how to implement it as a hack, let me know and I'll try to do it for you.

From mehaase at gmail.com  Sat Sep 19 00:28:33 2015
From: mehaase at gmail.com (Mark E. Haase)
Date: Fri, 18 Sep 2015 18:28:33 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJKwXfL+NKQ=M=dYKjgF=zDF00f3xYvD4+LqPcWF2ujwjw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150918182138.GA64237@trent.me>
 <DF29B6FB-AAAF-43C8-BB31-C41C8AAD9CC9@yahoo.com>
 <CAP7+vJKwXfL+NKQ=M=dYKjgF=zDF00f3xYvD4+LqPcWF2ujwjw@mail.gmail.com>
Message-ID: <CALb0Rk48OGCBeXdQGZWx=+2mBy+yWucMBebWrNvVfW-dj4TVSw@mail.gmail.com>

Andrew, thanks for putting together that hack. I will check it out.

Guido, you lost me at "hate odd punctuation ... maybe a different syntax
can be invented". Do you mean introducing a new keyword or implementing
this as a function? Or do you mean that some other punctuation might be
less odd?

I'm willing to write a PEP, even if its only purpose is to get shot down.


On Fri, Sep 18, 2015 at 3:45 PM, Guido van Rossum <guido at python.org> wrote:

> FWIW, I generally hate odd punctuation like this (@ notwithstanding) but
> I'm not against the idea itself -- maybe a different syntax can be
> invented, or maybe I could be persuaded that it's okay.
>
> On Fri, Sep 18, 2015 at 12:28 PM, Andrew Barnert via Python-ideas <
> python-ideas at python.org> wrote:
>
>> On Sep 18, 2015, at 11:21, Trent Nelson <trent at snakebite.org> wrote:
>> >
>> >> On Fri, Sep 18, 2015 at 10:42:59AM -0700, Mark Haase wrote:
>> >> StackOverflow has many questions
>> >> <http://stackoverflow.com/search?q=%5Bpython%5D+null+coalesce> on the
>> >> topic of null coalescing operators in Python, but I can't find any
>> >> discussions of them on this list or in any of the PEPs. Has the
>> >> addition of null coalescing operators into Python ever been discussed
>> >> publicly?
>>
>> I believe it was raised as a side issue during other discussions
>> (conditional expressions, exception-handling expressions, one of the
>> pattern-matching discussions), but I personally can't remember anyone ever
>> writing a serious proposal. I think Armin from PyPy also has a blog post
>> mentioning the idea somewhere, as a spinoff of his arguments against PEP
>> 484 (which turned into a more general "what's wrong with Python's type
>> system and what could be done to fix it"). One last place to look, although
>> it'll be harder to search for, is every time people discuss whether things
>> like dict.get are a wart on the language (because there should be a fully
>> general way to do the equivalent) or a feature (because it's actually only
>> useful in a handful of cases, and it's better to mark them explicitly than
>> to try to generalize).
>>
>> But my guess is that the discussion hasn't actually been had in
>> sufficient depth to avoid having it here. (Although even if I'm right, that
>> doesn't mean more searching isn't worth doing--to find arguments and
>> counter arguments you may have missed, draw parallels to successes and
>> failures in other languages, etc.) And, even if Guido hates the idea out of
>> hand, or someone comes up with a slam-dunk argument against it, this could
>> turn into one of those cases where it's worth someone gathering all the
>> info and shepherding the discussion just to write a PEP for Guido to reject
>> explicitly.
>>
>> Personally, for whatever my opinion is worth (not that much), I don't
>> have a good opinion on how it would work in Python without seeing lots of
>> serious examples or trying it out. But I think this would be relatively
>> easy to hack in at the tokenizer level with a quick&dirty import hook. I'll
>> attempt it some time this weekend, in hopes that people can play with the
>> feature. Also, it might be possible to do it less hackily with MacroPy (or
>> it might already be part of MacroPy--often Haoyi's time machine is as good
>> as Guido's).
>>
>> >> Python has an "or" operator that can be used to coalesce false-y
>> >> values, but it does not have an operator to coalesce "None"
>> >> exclusively.
>> >
>> > Hmmm, I use this NullObject class when I want to do stuff similar to
>> what
>> > you've described:
>>
>> This is a very Smalltalk-y solution, which isn't a bad thing. I think
>> having a singleton instance of NullObject (like None is a singleton
>> instance of NoneType) so you can use is-tests, etc. might make it better,
>> but that's arguable.
>>
>> The biggest problem is that you have to write (or wrap) every API to
>> return NullObjects instead of None, and likewise to take NullObjects. (And,
>> if you use a PEP 484 checker, it won't understand that an optional int can
>> hold a NullObject.)
>>
>> Also, there's no way for NullObject to ensure that spam(NullObject)
>> returns NullObject for any function spam (or, more realistically, for any
>> function except special cases, where it's hard to define what counts as a
>> special case but easy to understand intuitively).
>>
>> And finally, there's no obvious way to make NullObject raise when you
>> want it to raise. With syntax for nil coalescing, this is easy: ?. returns
>> None for None, while . raises AttributeError. With separate types instead,
>> you're putting the distinction at the point (possibly far away) where the
>> value is produced, rather than the point where it's used.
>>
>> As a side note, my experience in both Smalltalk and C# is that at some
>> point in a large program, I'm going to end up hackily using a distinction
>> between [nil] and nil somewhere because I needed to distinguish between an
>> optional optional spam that "failed" at the top level vs. one that did so
>> at the bottom level. I like the fact that in Haskell or Swift I can
>> actually distinguish "just nil" from "nil" when I need to but usually don't
>> have to (and the code is briefer when I don't have to), but I don't know
>> whether that's actually essential (the [nil] hack almost always works, and
>> isn't that hard to read if it's used sparsely, which it almost always is).
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>



-- 
Mark E. Haase
202-815-0201

From rosuav at gmail.com  Sat Sep 19 00:37:17 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 19 Sep 2015 08:37:17 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
Message-ID: <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>

On Sat, Sep 19, 2015 at 3:42 AM, Mark Haase <mehaase at gmail.com> wrote:
> StackOverflow has many questions on the topic of null coalescing operators
> in Python, but I can't find any discussions of them on this list or in any
> of the PEPs. Has the addition of null coalescing operators into Python ever
> been discussed publicly?
>
> Python has an "or" operator that can be used to coalesce false-y values, but
> it does not have an operator to coalesce "None" exclusively.

Python generally doesn't special-case None, so having a bit of magic
that works only on that one object seems a little odd. For comparison
purposes, Pike has something very similar to what you're describing,
but Pike *does* treat the integer 0 as special, so it makes good sense
there. Pike code that wants to return "a thing or NULL" will return an
object or the integer 0, where Python code will usually return an
object or None. I can't think of any situation in Python where the
language itself gives special support to None, other than it being a
keyword. You're breaking new ground.

But in my opinion, the practicality is worth it. The use of None to
represent the SQL NULL value [1], the absence of useful return value,
or other "non-values", is pretty standard. I would define the operator
pretty much the way you did above, with one exception. You say:

created?.isoformat() # is equivalent to
created.isoformat() if created is not None else None

but this means there needs to be some magic, because it should be
equally possible to write:

created?.year # equivalent to
created.year if created is not None else None

which means that sometimes it has to return None, and sometimes
(lambda *a,**ka: None). Three possible solutions:

1) Make None callable. None.__call__(*a, **ka) always returns None.
2) Special-case the immediate call in the syntax, so the equivalencies
are a bit different.
3) Add another case: func?(args) evaluates func, and if it's None,
evaluates to None without calling anything.

Option 1 would potentially mask bugs in a lot of unrelated code. I
don't think it's a good idea, but maybe others disagree.

Option 2 adds a grammatical distinction that currently doesn't exist.
When you see a nullable attribute lookup, you have to check to see if
it's a method call, and if it is, do things differently. That means
there's a difference between these:

func = obj?.attr; func()
obj?.attr()

Option 3 requires a bit more protection, but is completely explicit.
It would also have use in other situations. Personally, I support that
option; it maintains all the identities, is explicit that calling None
will yield None, and doesn't need any magic special cases. It does add
another marker, though:

created?.isoformat?() # is equivalent to
created.isoformat() if created is not None and created.isoformat is
not None else None
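
The option 3 equivalences can be sketched in today's Python with two
helper functions. The names maybe_attr and maybe_call are hypothetical
stand-ins for the proposed ?. and ?() forms; nothing like them exists
in the language or the stdlib.

```python
import datetime


def maybe_attr(obj, name):
    """Stand-in for obj?.name: None propagates, anything else is looked up."""
    return None if obj is None else getattr(obj, name)


def maybe_call(func, *args, **kwargs):
    """Stand-in for func?(args): None propagates without calling anything."""
    return None if func is None else func(*args, **kwargs)


created = datetime.datetime(2015, 9, 19, 8, 37, 17)
missing = None

# created?.year
print(maybe_attr(created, "year"))  # 2015
print(maybe_attr(missing, "year"))  # None

# created?.isoformat?()
print(maybe_call(maybe_attr(created, "isoformat")))  # 2015-09-19T08:37:17
print(maybe_call(maybe_attr(missing, "isoformat")))  # None
```

Because the two steps are separate, "func = obj?.attr; func?()" and
"obj?.attr?()" behave identically, which is the identity option 2
would lose.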

As to the syntax... IMO this needs to be compact, so ?. has my
support. With subscripting, should it be "obj?[idx]" or "obj[?idx]" ?
FWIW Pike uses the latter, but if C# uses the former, there's no one
obvious choice.

ChrisA

[1] Or non-value, depending on context

From python at lucidity.plus.com  Sat Sep 19 00:56:40 2015
From: python at lucidity.plus.com (Erik)
Date: Fri, 18 Sep 2015 23:56:40 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
Message-ID: <55FC96A8.605@lucidity.plus.com>

On 18/09/15 23:37, Chris Angelico wrote:
> Python generally doesn't special-case None, so having a bit of magic
> that works only on that one object seems a little odd.

So the answer here is to introduce a "magic" hook that None can make use 
of (but also other classes). I can't think of an appropriate word, so 
I'll use "foo" to keep it suitably abstract.

If the foo operator uses the magic method "__foo__" to mean "return an 
object to be used in place of the operand should it be considered ... 
false? [or some other definition - I'm not sure]" then any class can 
implement that method to return an appropriate proxy object.

If that was a postfix operator which has a high precedence, then:

bar = foo?
bar.isoformat()

and the original syntax suggestion:

bar = foo?.isoformat()

... are equivalent. "?." is not a new operator. "?" is. This is 
essentially a slight refinement of Chris's case 3 -

> 3) Add another case: func?(args) evaluates func, and if it's None,
> evaluates to None without calling anything.
[...]
> Option 3 requires a bit more protection, but is completely explicit.
> It would also have use in other situations. Personally, I support that
> option; it maintains all the identities, is explicit that calling None
> will yield None, and doesn't need any magic special cases. It does add
> another marker, though:

E.


From python at mrabarnett.plus.com  Sat Sep 19 01:02:42 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 19 Sep 2015 00:02:42 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
Message-ID: <55FC9812.2090503@mrabarnett.plus.com>

On 2015-09-18 23:37, Chris Angelico wrote:
> On Sat, Sep 19, 2015 at 3:42 AM, Mark Haase <mehaase at gmail.com> wrote:
>> StackOverflow has many questions on the topic of null coalescing operators
>> in Python, but I can't find any discussions of them on this list or in any
>> of the PEPs. Has the addition of null coalescing operators into Python ever
>> been discussed publicly?
>>
>> Python has an "or" operator that can be used to coalesce false-y values, but
>> it does not have an operator to coalesce "None" exclusively.
>
[snip]
>
> created?.isoformat() # is equivalent to
> created.isoformat() if created is not None else None
>
> but this means there needs to be some magic, because it should be
> equally possible to write:
>
> created?.year # equivalent to
> created.year if created is not None else None
>
> which means that sometimes it has to return None, and sometimes
> (lambda *a,**ka: None). Three possible solutions:
>
> 1) Make None callable. None.__call__(*a, **ka) always returns None.
> 2) Special-case the immediate call in the syntax, so the equivalencies
> are a bit different.
> 3) Add another case: func?(args) evaluates func, and if it's None,
> evaluates to None without calling anything.
>
> Option 1 would potentially mask bugs in a lot of unrelated code. I
> don't think it's a good idea, but maybe others disagree.
>
> Option 2 adds a grammatical distinction that currently doesn't exist.
> When you see a nullable attribute lookup, you have to check to see if
> it's a method call, and if it is, do things differently. That means
> there's a difference between these:
>
> func = obj?.attr; func()
> obj?.attr()
>
> Option 3 requires a bit more protection, but is completely explicit.
> It would also have use in other situations. Personally, I support that
> option; it maintains all the identities, is explicit that calling None
> will yield None, and doesn't need any magic special cases. It does add
> another marker, though:
>
> created?.isoformat?() # is equivalent to
> created.isoformat() if created is not None and created.isoformat is
> not None else None
>
> As to the syntax... IMO this needs to be compact, so ?. has my
> support. With subscripting, should it be "obj?[idx]" or "obj[?idx]" ?
> FWIW Pike uses the latter, but if C# uses the former, there's no one
> obvious choice.
>
To me, the choice _is_ obvious: "obj?[idx]". After all, that's more in
keeping with "obj?.attr" and "func?()".

If you had "obj?[idx]", then shouldn't it also be "obj.?attr" and
"func(?)"?


From python at lucidity.plus.com  Sat Sep 19 01:18:38 2015
From: python at lucidity.plus.com (Erik)
Date: Sat, 19 Sep 2015 00:18:38 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FC96A8.605@lucidity.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
Message-ID: <55FC9BCE.7020703@lucidity.plus.com>

Apologies for the self-reply. I just wanted to clarify a couple of things.

On 18/09/15 23:56, Erik wrote:
> If the foo operator uses the magic method "__foo__" to mean "return an
> object to be used in place of the operand should it be considered ...
> false? [or some other definition - I'm not sure]"

Not "false", I think. The "foo" operator is meant to mean "I will go on 
to use the resulting object in any way imaginable and it must cope with 
that and return a value from any attempts to use it that will generally 
mean 'no'" (*).

> If that was a postfix operator which has a high precedence, then:
>
> bar = foo?
> bar.isoformat()
>
> and the original syntax suggestion:
>
> bar = foo?.isoformat()

Which is clearly wrong - the first part should be:

baz = foo?
bar = baz.isoformat()
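
As a rough sketch of one reading of this, with the postfix operator
spelled as an ordinary function: foo_op, Shrug and the __foo__ hook
are all hypothetical names, and having the proxy return itself from
every use is only an assumption about what "generally mean 'no'"
could look like.

```python
import datetime


class Shrug:
    """Hypothetical proxy: copes with any use and always means 'no'."""

    def __getattr__(self, name):
        return self  # any attribute is just another shrug

    def __call__(self, *args, **kwargs):
        return self  # calling it shrugs again

    def __bool__(self):
        return False  # tests as "no"

    def __repr__(self):
        return "Shrug()"


SHRUG = Shrug()


def foo_op(obj):
    """Stand-in for the postfix `obj?` operator."""
    if obj is None:
        return SHRUG  # None cannot be given a __foo__ method today
    hook = getattr(type(obj), "__foo__", None)
    return obj if hook is None else hook(obj)


foo = None
baz = foo_op(foo)
bar = baz.isoformat()
print(bar, bool(bar))  # Shrug() False

foo = datetime.date(2015, 9, 19)
print(foo_op(foo).isoformat())  # 2015-09-19
```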

E.

(*) Should we call the operator "shrug"?


From srkunze at mail.de  Sat Sep 19 01:19:23 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 19 Sep 2015 01:19:23 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FC9812.2090503@mrabarnett.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
Message-ID: <55FC9BFB.2080006@mail.de>

On 19.09.2015 01:02, MRAB wrote:
> To me, the choice _is_ obvious: "obj?[idx]". After all, that's more in
> keeping with "obj?.attr" and "func?()".
>
> If you had "obj?[idx]", then shouldn't it also be "obj.?attr" and
> "func(?)"?

I agree with that.

From srkunze at mail.de  Sat Sep 19 01:44:31 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 19 Sep 2015 01:44:31 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FC9BCE.7020703@lucidity.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
Message-ID: <55FCA1DF.2030004@mail.de>



On 19.09.2015 01:18, Erik wrote:
> Apologies for the self-reply. I just wanted to clarify a couple of 
> things.
>
> On 18/09/15 23:56, Erik wrote:
>> If the foo operator uses the magic method "__foo__" to mean "return an
>> object to be used in place of the operand should it be considered ...
>> false? [or some other definition - I'm not sure]"
>
> Not "false", I think. The "foo" operator is meant to mean "I will go 
> on to use the resulting object in any way imaginable and it must cope 
> with that and return a value from any attempts to use it that will 
> generally mean 'no'" (*).
>
>> If that was a postfix operator which has a high precedence, then:
>>
>> bar = foo?
>> bar.isoformat()
>>
>> and the original syntax suggestion:
>>
>> bar = foo?.isoformat()
>
> Which is clearly wrong - the first part should be:
>
> baz = foo?
> bar = baz.isoformat()
>
> E.
>
> (*) Should we call the operator "shrug"?

Maybe monad?

From rymg19 at gmail.com  Sat Sep 19 01:47:31 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 18 Sep 2015 18:47:31 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCA1DF.2030004@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
 <55FCA1DF.2030004@mail.de>
Message-ID: <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>

What about "apply"? It's the closest thing to "fmap" I can think of that won't confuse people...

On September 18, 2015 6:44:31 PM CDT, "Sven R. Kunze" <srkunze at mail.de> wrote:
>
>
>On 19.09.2015 01:18, Erik wrote:
>> Apologies for the self-reply. I just wanted to clarify a couple of 
>> things.
>>
>> On 18/09/15 23:56, Erik wrote:
>>> If the foo operator uses the magic method "__foo__" to mean "return
>an
>>> object to be used in place of the operand should it be considered
>...
>>> false? [or some other definition - I'm not sure]"
>>
>> Not "false", I think. The "foo" operator is meant to mean "I will go 
>> on to use the resulting object in any way imaginable and it must cope
>
>> with that and return a value from any attempts to use it that will 
>> generally mean 'no'" (*).
>>
>>> If that was a postfix operator which has a high precedence, then:
>>>
>>> bar = foo?
>>> bar.isoformat()
>>>
>>> and the original syntax suggestion:
>>>
>>> bar = foo?.isoformat()
>>
>> Which is clearly wrong - the first part should be:
>>
>> baz = foo?
>> bar = baz.isoformat()
>>
>> E.
>>
>> (*) Should we call the operator "shrug"?
>
>Maybe monad?

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From srkunze at mail.de  Sat Sep 19 01:58:23 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 19 Sep 2015 01:58:23 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
 <55FCA1DF.2030004@mail.de> <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>
Message-ID: <55FCA51F.5060608@mail.de>

On 19.09.2015 01:47, Ryan Gonzalez wrote:
> What about "apply"? It's the closest thing to "fmap" I can think of 
> that won't confuse people...

Are you sure? I think "maybe" better reflects the purpose of "?".


Nevertheless, I would love to see support for the maybe monad in Python.

Best,
Sven

From joejev at gmail.com  Sat Sep 19 02:00:40 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Fri, 18 Sep 2015 20:00:40 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCA51F.5060608@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <55FC9BCE.7020703@lucidity.plus.com> <55FCA1DF.2030004@mail.de>
 <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>
 <55FCA51F.5060608@mail.de>
Message-ID: <CAHGq92UqkdomAv0S7OR5mFXRosiyBENeCt8EvgsgXCH9nH6_xw@mail.gmail.com>

Is there a reason that this needs explicit support? It is trivial to
implement Maybe in pure Python.
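
For illustration, a minimal sketch of what such a wrapper might look
like. The class and method names are invented here, None plays the
role of Nothing, and only an fmap-like map() is provided, so this is
not a full monad.

```python
import datetime


class Maybe:
    """Tiny Maybe wrapper; None plays the role of Nothing."""

    def __init__(self, value=None):
        self.value = value

    def map(self, func):
        """Apply func to the wrapped value unless it is None (fmap-like)."""
        if self.value is None:
            return self
        return Maybe(func(self.value))

    def get(self, default=None):
        """Unwrap, substituting default for None."""
        return default if self.value is None else self.value


created = Maybe(datetime.date(2015, 9, 18))
print(created.map(lambda d: d.isoformat()).get())           # 2015-09-18
print(Maybe(None).map(lambda d: d.isoformat()).get("n/a"))  # n/a
```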

On Fri, Sep 18, 2015 at 7:58 PM, Sven R. Kunze <srkunze at mail.de> wrote:

> On 19.09.2015 01:47, Ryan Gonzalez wrote:
>
>> What about "apply"? It's the closest thing to "fmap" I can think of that
>> won't confuse people...
>>
>
> Are you sure? I think "maybe" better reflects the purpose of "?".
>
>
> Nevertheless, I would love to see support for the maybe monad in Python.
>
> Best,
> Sven

From python at mrabarnett.plus.com  Sat Sep 19 02:01:50 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 19 Sep 2015 01:01:50 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCA1DF.2030004@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
 <55FCA1DF.2030004@mail.de>
Message-ID: <55FCA5EE.5050607@mrabarnett.plus.com>

On 2015-09-19 00:44, Sven R. Kunze wrote:
>
>
> On 19.09.2015 01:18, Erik wrote:
>> Apologies for the self-reply. I just wanted to clarify a couple of
>> things.
>>
>> On 18/09/15 23:56, Erik wrote:
>>> If the foo operator uses the magic method "__foo__" to mean "return an
>>> object to be used in place of the operand should it be considered ...
>>> false? [or some other definition - I'm not sure]"
>>
>> Not "false", I think. The "foo" operator is meant to mean "I will go
>> on to use the resulting object in any way imaginable and it must cope
>> with that and return a value from any attempts to use it that will
>> generally mean 'no'" (*).
>>
>>> If that was a postfix operator which has a high precedence, then:
>>>
>>> bar = foo?
>>> bar.isoformat()
>>>
>>> and the original syntax suggestion:
>>>
>>> bar = foo?.isoformat()
>>
>> Which is clearly wrong - the first part should be:
>>
>> baz = foo?
>> bar = baz.isoformat()
>>
>> E.
>>
>> (*) Should we call the operator "shrug"?
>
> Maybe monad?
>
Too fancy.

How about "ni"? :-)


From rymg19 at gmail.com  Sat Sep 19 02:07:56 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 18 Sep 2015 19:07:56 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCA51F.5060608@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
 <55FCA1DF.2030004@mail.de> <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>
 <55FCA51F.5060608@mail.de>
Message-ID: <0D0C3730-4FC7-4EA9-9262-E997644C1677@gmail.com>



On September 18, 2015 6:58:23 PM CDT, "Sven R. Kunze" <srkunze at mail.de> wrote:
>On 19.09.2015 01:47, Ryan Gonzalez wrote:
>> What about "apply"? It's the closest thing to "fmap" I can think of 
>> that won't confuse people...
>
>Are you sure? I think "maybe" better reflects the purpose of "?".
>

That's better. Or "optional".

>
>Nevertheless, I would love to see support for the maybe monad in
>Python.
>
>Best,
>Sven

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From 4kir4.1i at gmail.com  Sat Sep 19 02:08:29 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Sat, 19 Sep 2015 03:08:29 +0300
Subject: [Python-ideas] Null coalescing operators
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
 <55FCA1DF.2030004@mail.de>
 <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>
Message-ID: <871tdv46f6.fsf@gmail.com>

Ryan Gonzalez <rymg19 at gmail.com> writes:

>>On 19.09.2015 01:18, Erik wrote:
...
>>>
>>> baz = foo?
>>> bar = baz.isoformat()
>>>
>>> E.
>>>
>>> (*) Should we call the operator "shrug"?
>>
>>Maybe monad?

  http://stackoverflow.com/questions/8507200/maybe-kind-of-monad-in-python



From abarnert at yahoo.com  Sat Sep 19 02:49:36 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 17:49:36 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FC96A8.605@lucidity.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
Message-ID: <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>

On Sep 18, 2015, at 15:56, Erik <python at lucidity.plus.com> wrote:
> 
>> On 18/09/15 23:37, Chris Angelico wrote:
>> Python generally doesn't special-case None, so having a bit of magic
>> that works only on that one object seems a little odd.
> 
> So the answer here is to introduce a "magic" hook that None can make use of (but also other classes). I can't think of an appropriate word, so I'll use "foo" to keep it suitably abstract.
> 
> If the foo operator uses the magic method "__foo__" to mean "return an object to be used in place of the operand should it be considered ... false? [or some other definition - I'm not sure]" then any class can implement that method to return an appropriate proxy object.
> 
> If that was a postfix operator which has a high precedence, then:
> 
> bar = foo?
> bar.isoformat()
> 
> and the original syntax suggestion:
> 
> bar = foo?.isoformat()
> 
> ... are equivalent. "?." is not a new operator. "?" is. This is essentially a slight refinement of Chris's case 3 -

I like this (modulo the corrections later in the thread). It's simpler and more flexible than the other options, and also comes closer to resolving the "spam?.eggs" vs. "spam?.cheese()" issue, by requiring "spam?.cheese?()".

Obviously "spam?" returns something with a __getattr__ method that just passes through to spam.__getattr__, except that on NoneType it returns something with a __getattr__ that always returns None. That solves the eggs case.

Next, "spam?.cheese?" returns something with a __call__ method that just passes through to spam?.cheese.__call__, except that on NoneType it returns something with a __call__ that always returns None. That solves the cheese case.

If you make None? return something whose other dunder methods also return None (except for special cases like __repr__), this also gives you "spam ?+ 3". (I'm not sure if that's a good thing or a bad thing...) Of course there's no way to do "spam ?= 3" (but I'm pretty sure that's a good thing).
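
A minimal sketch of that behaviour, with the postfix "?" spelled as a hypothetical function q() and a made-up _NoneProxy class; only __getattr__, __call__ and __add__ are covered, and for anything other than None the object is simply returned as-is rather than wrapped.

```python
import datetime


class _NoneProxy:
    """Hypothetical result of `None?`: every use evaluates to None."""

    def __getattr__(self, name):
        return None  # None?.eggs -> None

    def __call__(self, *args, **kwargs):
        return None  # None?() -> None

    def __add__(self, other):
        return None  # None? + 3 -> None

    def __repr__(self):
        return "<None?>"


_NONE_PROXY = _NoneProxy()


def q(obj):
    """Stand-in for postfix `obj?`; non-None objects pass straight through."""
    return _NONE_PROXY if obj is None else obj


spam = datetime.date(2015, 9, 18)
print(q(spam).year)            # 2015
print(q(q(spam).isoformat)())  # 2015-09-18

spam = None
print(q(spam).year)            # None
print(q(q(spam).isoformat)())  # None
print(q(spam) + 3)             # None
```

In this sketch the same proxy serves both cases: "spam?.cheese" is plain None, and applying "?" to it again yields the proxy, which is what gets called.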

So, do we need a dunder method for the "?" operator? What else would you use it for besides None?


From random832 at fastmail.com  Sat Sep 19 02:58:50 2015
From: random832 at fastmail.com (Random832)
Date: Fri, 18 Sep 2015 20:58:50 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
Message-ID: <1442624330.2155926.387780601.0BDF9956@webmail.messagingengine.com>

On Fri, Sep 18, 2015, at 18:37, Chris Angelico wrote:
> created?.isoformat?() # is equivalent to
> created.isoformat() if created is not None and created.isoformat is
> not None else None

More or less - it'd only look up the attribute once.
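
A small sketch of the difference, using a hypothetical Record class
whose isoformat property counts how often it is read:

```python
class Record:
    """Counts how often its (hypothetical) isoformat attribute is read."""

    def __init__(self):
        self.lookups = 0

    @property
    def isoformat(self):
        self.lookups += 1
        return lambda: "2015-09-18"


def literal_expansion(created):
    # The expansion as written above: the attribute is read twice.
    return (created.isoformat() if created is not None
            and created.isoformat is not None else None)


def single_lookup(created):
    # What an operator would presumably do: read the attribute once.
    if created is None:
        return None
    method = created.isoformat
    return None if method is None else method()


a, b = Record(), Record()
literal_expansion(a)
single_lookup(b)
print(a.lookups, b.lookups)  # 2 1
```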

> As to the syntax... IMO this needs to be compact, so ?. has my
> support. With subscripting, should it be "obj?[idx]" or "obj[?idx]" ?
> FWIW Pike uses the latter, but if C# uses the former, there's no one
> obvious choice.

?[ has the benefit of being consistent with ?. - and ?(, for that
matter. It actually suggests a whole range of null-coalescing operators.
?* for multiply? A lot of these things are done already by the normal
operators for statically-typed nullable operands in C#.

That could get hairy fast - I just thought of a radical alternative
that I'm not even sure if I support: ?(expr) as a lexical context that
changes the meaning of all operators.
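
For reference, the C# behaviour mentioned above (an arithmetic operator on
nullable operands yields null when either operand is null) can be emulated
in today's Python with a small helper. This is only a sketch: lift, mul_q
and add_q are hypothetical names, not part of any proposal in this thread.

```python
import operator


def lift(op):
    """Wrap a binary operator so it returns None if either operand is None."""
    def lifted(a, b):
        if a is None or b is None:
            return None
        return op(a, b)
    return lifted


# Hypothetical stand-ins for the "?*" and "?+" spellings floated above.
mul_q = lift(operator.mul)
add_q = lift(operator.add)

print(mul_q(6, 7))     # 42
print(mul_q(None, 7))  # None
print(add_q(1, None))  # None
```

Note that a helper function like this always evaluates both operands, so it
says nothing about whether a real operator should short-circuit.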

From rosuav at gmail.com  Sat Sep 19 03:00:53 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 19 Sep 2015 11:00:53 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
Message-ID: <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>

On Sat, Sep 19, 2015 at 10:49 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
> Obviously "spam?" returns something with a __getattr__ method that just passes through to spam.__getattr__, except that on NoneType it returns something with a __getattr__ that always returns None. That solves the eggs case.
>
> Next, "spam?.cheese?" returns something with a __call__ method that just passed through to spam?.cheese.__call__, except that on NoneType it returns something with a __call__ that always returns None. That solves the cheese case.
>

Hang on, how do you do this? How does the operator know the difference
between "spam?", which for None has to have __getattr__ return None,
and "spam?.cheese?" that returns (lambda: None)?

ChrisA

From abarnert at yahoo.com  Sat Sep 19 03:03:15 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 18:03:15 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCA51F.5060608@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com> <55FC9BCE.7020703@lucidity.plus.com>
 <55FCA1DF.2030004@mail.de> <550D45BB-1103-4B70-9D1A-E17AEE519DBF@gmail.com>
 <55FCA51F.5060608@mail.de>
Message-ID: <2D28867B-6048-490A-9E60-4BC680FFF557@yahoo.com>

On Sep 18, 2015, at 16:58, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 19.09.2015 01:47, Ryan Gonzalez wrote:
>> What about "apply"? It's the closest thing to "fmap" I can think of that won't confuse people...
> 
> Are you sure? I think "maybe" better reflects the purpose of "?".
> 
> 
> Nevertheless, I would love to see support for the maybe monad in Python.

I think this, and the whole discussion of maybe and fmap, is off the mark here.

It's trivial to create a maybe type in Python.

What's missing are the two things that make it useful: (1) pattern matching, and (2) a calling syntax and a general focus on HOFs that make fmap natural. Without at least one of those, maybe isn't useful. And adding either of those to Python is a huge proposal, much larger than null coalescing, and a lot less likely to gain support.

Also, the monadic style of failure propagation directly competes with the exception-raising style, and they're both contagious. A well-designed language and library can have both side by side if it, e.g., rigorously restricts exceptions to only truly exceptional cases, but the boat for that sailed decades ago in Python. So just having them side by side would lead to the exact same problems as C++ code that mixes exception-based and status-code-based APIs, or JavaScript code that mixes exceptions and errbacks or promise.fail handlers. 

Personally, whenever I think to myself "I could really use maybe here" in some Python code, that's a sign that I'm not thinking Pythonically, and either need to switch gears in my brain or switch languages. Just like when I start thinking about how I could get rid of that with statement with an RAII class, and maybe an implicit conversion operator....

From abarnert at yahoo.com  Sat Sep 19 03:10:17 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 18:10:17 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
Message-ID: <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>

On Sep 18, 2015, at 18:00, Chris Angelico <rosuav at gmail.com> wrote:
> 
>> On Sat, Sep 19, 2015 at 10:49 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>> Obviously "spam?" returns something with a __getattr__ method that just passes through to spam.__getattr__, except that on NoneType it returns something with a __getattr__ that always returns None. That solves the eggs case.
>> 
>> Next, "spam?.cheese?" returns something with a __call__ method that just passed through to spam?.cheese.__call__, except that on NoneType it returns something with a __call__ that always returns None. That solves the cheese case.
> 
> Hang on, how do you do this? How does the operator know the difference
> between "spam?", which for None has to have __getattr__ return None,
> and "spam?.cheese?" that returns (lambda: None)?

>>> spam
None
>>> spam?
NoneQuestion
>>> spam?.cheese
None
>>> spam?.cheese?
NoneQuestion
>>> spam?.cheese?()
None

All you need to make this work is:

* "spam?" returns NoneQuestion if spam is None else spam
* NoneQuestion.__getattr__(self, *args, **kw) returns None.
* NoneQuestion.__call__(self, *args, **kw) returns None.

Optionally, you can add more None-returning methods to NoneQuestion. Also, whether NoneQuestion is a singleton, has an accessible name, etc. are all bikesheddable.

I think it's obvious what happens if "spam" is not None and "spam.cheese" is, or if both are None, but if not, I can work them through as well.
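
The three rules above can be sketched in current Python. Since a postfix "?"
is not valid syntax, this sketch assumes a hypothetical helper function q(obj)
standing in for "obj?"; the class name _NoneQuestion and the choice to make
NoneQuestion a singleton are likewise assumptions made for illustration.

```python
class _NoneQuestion:
    """Hypothetical object that "None?" would evaluate to."""
    __slots__ = ()

    def __getattr__(self, name):
        # Only called for attributes that are not found normally,
        # so every ordinary attribute lookup yields None.
        return None

    def __call__(self, *args, **kw):
        # Calling it also yields None, whatever the arguments.
        return None

    def __repr__(self):
        return "NoneQuestion"


NoneQuestion = _NoneQuestion()


def q(obj):
    """Emulate postfix "obj?": NoneQuestion if obj is None, else obj."""
    return NoneQuestion if obj is None else obj


spam = None
print(q(spam))              # NoneQuestion
print(q(spam).cheese)       # None
print(q(q(spam).cheese)())  # None
print(q("abc").upper())     # ABC
```

With this definition q(q(x)) gives the same result as q(x), because
NoneQuestion itself is not None and is therefore returned unchanged.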


> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From python at mrabarnett.plus.com  Sat Sep 19 03:39:22 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 19 Sep 2015 02:39:22 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
Message-ID: <55FCBCCA.5000001@mrabarnett.plus.com>

On 2015-09-19 02:10, Andrew Barnert via Python-ideas wrote:
> On Sep 18, 2015, at 18:00, Chris Angelico <rosuav at gmail.com> wrote:
>>
>>> On Sat, Sep 19, 2015 at 10:49 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>>> Obviously "spam?" returns something with a __getattr__ method that just passes through to spam.__getattr__, except that on NoneType it returns something with a __getattr__ that always returns None. That solves the eggs case.
>>>
>>> Next, "spam?.cheese?" returns something with a __call__ method that just passed through to spam?.cheese.__call__, except that on NoneType it returns something with a __call__ that always returns None. That solves the cheese case.
>>
>> Hang on, how do you do this? How does the operator know the difference
>> between "spam?", which for None has to have __getattr__ return None,
>> and "spam?.cheese?" that returns (lambda: None)?
>
> >>> spam
> None
> >>> spam?
> NoneQuestion
> >>> spam?.cheese
> None
> >>> spam?.cheese?
> NoneQuestion
> >>> spam?.cheese?()
> None
>
> All you need to make this work is:
>
> * "spam?" returns NoneQuestion if spam is None else spam
> * NoneQuestion.__getattr__(self, *args, **kw) returns None.
> * NoneQuestion.__call__(self, *args, **kw) returns None.
>
> Optionally, you can add more None-returning methods to NoneQuestion. Also, whether NoneQuestion is a singleton, has an accessible name, etc. are all bikesheddable.
>
> I think it's obvious what happens is "spam" is not None and "spam.cheese" is, or of both are None, but if not, I can work them through as well.
>
I see it as "spam?" doing "Maybe(spam)" and then attribute access
checking returning None if the wrapped object is None and getting the
attribute from it if not.

I think that the optimiser could probably avoid the use of Maybe in
cases like "spam?.cheese".


From python at mrabarnett.plus.com  Sat Sep 19 03:52:08 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 19 Sep 2015 02:52:08 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCBCCA.5000001@mrabarnett.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
 <55FCBCCA.5000001@mrabarnett.plus.com>
Message-ID: <55FCBFC8.5060305@mrabarnett.plus.com>

On 2015-09-19 02:39, MRAB wrote:
> On 2015-09-19 02:10, Andrew Barnert via Python-ideas wrote:
>> On Sep 18, 2015, at 18:00, Chris Angelico <rosuav at gmail.com> wrote:
>>>
>>>> On Sat, Sep 19, 2015 at 10:49 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>>>> Obviously "spam?" returns something with a __getattr__ method that just passes through to spam.__getattr__, except that on NoneType it returns something with a __getattr__ that always returns None. That solves the eggs case.
>>>>
>>>> Next, "spam?.cheese?" returns something with a __call__ method that just passed through to spam?.cheese.__call__, except that on NoneType it returns something with a __call__ that always returns None. That solves the cheese case.
>>>
>>> Hang on, how do you do this? How does the operator know the difference
>>> between "spam?", which for None has to have __getattr__ return None,
>>> and "spam?.cheese?" that returns (lambda: None)?
>>
>> >>> spam
>> None
>> >>> spam?
>> NoneQuestion
>> >>> spam?.cheese
>> None
>> >>> spam?.cheese?
>> NoneQuestion
>> >>> spam?.cheese?()
>> None
>>
>> All you need to make this work is:
>>
>> * "spam?" returns NoneQuestion if spam is None else spam
>> * NoneQuestion.__getattr__(self, *args, **kw) returns None.
>> * NoneQuestion.__call__(self, *args, **kw) returns None.
>>
>> Optionally, you can add more None-returning methods to NoneQuestion. Also, whether NoneQuestion is a singleton, has an accessible name, etc. are all bikesheddable.
>>
>> I think it's obvious what happens is "spam" is not None and "spam.cheese" is, or of both are None, but if not, I can work them through as well.
>>
> I see it as "spam? doing "Maybe(spam)" and then attribute access
> checking returning None if the wrapped object is None and getting the
> attribute from it if not.
>
> I think that the optimiser could probably avoid the use of Maybe in
> cases like "spam?.cheese".
>
I've thought of another issue:

If you write "spam?(sing_lumberjack_song())", won't it still call
sing_lumberjack_song even if spam is None? After all, Python evaluates
the arguments before looking up the call, so it won't know that "spam"
is None until it tries to call "spam?".

That isn't a problem with "spam(sing_lumberjack_song()) if spam is not
None else None" or if it's optimised to that, but "m = spam?;
m(sing_lumberjack_song())" is a different matter.

Perhaps a "Maybe" object should also support "?" so you could write "m
= spam?; m?(sing_lumberjack_song())". "Maybe" could be idempotent, so
"Maybe(Maybe(x))" returns the same result as "Maybe(x)".

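The evaluation-order issue can be demonstrated with a wrapper class in
current Python. This is a rough sketch only: the Maybe class below is a
hypothetical stand-in for whatever "spam?" would produce, and the list named
sung exists purely to record whether the argument expression ran.

```python
class Maybe:
    """Hypothetical wrapper; Maybe(Maybe(x)) is the same object as Maybe(x)."""

    def __new__(cls, obj):
        if isinstance(obj, Maybe):
            return obj  # idempotent: never wrap twice
        self = super().__new__(cls)
        self._obj = obj
        return self

    def __call__(self, *args, **kw):
        # By the time we get here the arguments have already been evaluated.
        return None if self._obj is None else self._obj(*args, **kw)


sung = []


def sing_lumberjack_song():
    sung.append(True)  # side effect that shows the call happened
    return "I'm a lumberjack"


spam = None

m = Maybe(spam)
result = m(sing_lumberjack_song())  # the song is sung, then discarded
print(result, len(sung))

# The explicit conditional never evaluates the argument when spam is None.
guarded = spam(sing_lumberjack_song()) if spam is not None else None
print(guarded, len(sung))
```

The first call returns None but still runs sing_lumberjack_song() once; the
conditional expression leaves the counter unchanged.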

From mehaase at gmail.com  Sat Sep 19 04:06:30 2015
From: mehaase at gmail.com (Mark E. Haase)
Date: Fri, 18 Sep 2015 22:06:30 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
Message-ID: <CALb0Rk7guKYiWvnydRm2kuT_1rhwgR+5zQd4BHTx+17-ZA5nCA@mail.gmail.com>

Andrew, I really like that idea. Turning back to the null coalescing
operator (spelled ?? in other languages), how do you think that fits in?

Consider this syntax:

>>> None? or 1
1

This works if NoneQuestion overrides __bool__ (__nonzero__ in Python 2) to return False.

>>> 0? or 1
0

This doesn't work, because 0? returns 0, and "0 or 1" is 1.

We could try this instead, if NoneQuestion overrides __or__:

>>> 0? | 1
0
>>> 0 ?| 1
0

This looks a little ugly, and it would be nice (as MRAB pointed out) if
null coalescing short circuited.

>>> None? or None?

This also doesn't work quite right. If both operands are None, we want the
expression to evaluate to None, not NoneQuestion. *Should null coalescing
be a separate operator? And if so, are "?" and "??" too similar?*

Can anybody think of realistic use cases for overriding a magic method for
the "?" operator? I would like to include such use cases in a PEP. One
possible use case: being able to coalesce empty strings.

>>> s1 = MyString('')
>>> s2 = MyString('foobar')
>>> s1? or s2
MyString('foobar')
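
For comparison, the difference between "or" and a None-only coalesce can be
shown with a plain function in current Python. This is a sketch under the
assumption that coalescing should test "is not None" only; the name coalesce
is hypothetical.

```python
def coalesce(*values):
    """Return the first value that is not None, or None if there is none."""
    for value in values:
        if value is not None:
            return value
    return None


print(0 or 1)                # 1: "or" treats every falsey value as missing
print(coalesce(0, 1))        # 0: only None counts as missing
print(coalesce(None, 1))     # 1
print(coalesce(None, None))  # None
print(coalesce("", "foobar") == "")  # True: the empty string is kept
```

Unlike an operator, a function call evaluates all of its arguments up
front, so this form cannot short-circuit.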






-- 
Mark E. Haase
202-815-0201

From rosuav at gmail.com  Sat Sep 19 04:26:11 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 19 Sep 2015 12:26:11 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CALb0Rk7guKYiWvnydRm2kuT_1rhwgR+5zQd4BHTx+17-ZA5nCA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
 <CALb0Rk7guKYiWvnydRm2kuT_1rhwgR+5zQd4BHTx+17-ZA5nCA@mail.gmail.com>
Message-ID: <CAPTjJmrdULyMnefEzXkn1Urewsa_x3OtBpNnCoWbNx6A0k918g@mail.gmail.com>

On Sat, Sep 19, 2015 at 12:06 PM, Mark E. Haase <mehaase at gmail.com> wrote:
> Can anybody think of realistic use cases for overriding a magic method for
> the "?" operator? I would like to include such use cases in a PEP. One
> possible use case: being able to coalesce empty strings.
>
> >>> s1 = MyString('')
> >>> s2 = MyString('foobar')
> >>> s1? or s2
> MyString('foobar')

Frankly, I think this is a bad idea. You're potentially coalescing
multiple things with the same expression, and we already have a way of
spelling that: the "or" operator. If you don't want a generic "if it's
false, use this", and don't want a super-specific "if it's None, use
this", then how are you going to define what it is? And more
importantly, how do you reason about the expression "s1? or s2"
without knowing exactly what types coalesce to what? Let's keep the
rules simple. Make this a special feature of the None singleton, and
all other objects simply return themselves - for the same reason that
a class isn't allowed to override the "is" operator.

ChrisA

From abarnert at yahoo.com  Sat Sep 19 05:20:49 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 20:20:49 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FCBFC8.5060305@mrabarnett.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
 <55FCBCCA.5000001@mrabarnett.plus.com> <55FCBFC8.5060305@mrabarnett.plus.com>
Message-ID: <059B044A-173C-488C-9AAD-EF32CF03D6AC@yahoo.com>

On Sep 18, 2015, at 18:52, MRAB <python at mrabarnett.plus.com> wrote:
> 
>> On 2015-09-19 02:39, MRAB wrote:
>>> On 2015-09-19 02:10, Andrew Barnert via Python-ideas wrote:
>>>> On Sep 18, 2015, at 18:00, Chris Angelico <rosuav at gmail.com> wrote:
>>>> 
>>>>> On Sat, Sep 19, 2015 at 10:49 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>>>>> Obviously "spam?" returns something with a __getattr__ method that just passes through to spam.__getattr__, except that on NoneType it returns something with a __getattr__ that always returns None. That solves the eggs case.
>>>>> 
>>>>> Next, "spam?.cheese?" returns something with a __call__ method that just passed through to spam?.cheese.__call__, except that on NoneType it returns something with a __call__ that always returns None. That solves the cheese case.
>>>> 
>>>> Hang on, how do you do this? How does the operator know the difference
>>>> between "spam?", which for None has to have __getattr__ return None,
>>>> and "spam?.cheese?" that returns (lambda: None)?
>>> 
>>> >>> spam
>>> None
>>> >>> spam?
>>> NoneQuestion
>>> >>> spam?.cheese
>>> None
>>> >>> spam?.cheese?
>>> NoneQuestion
>>> >>> spam?.cheese?()
>>> None
>>> 
>>> All you need to make this work is:
>>> 
>>> * "spam?" returns NoneQuestion if spam is None else spam
>>> * NoneQuestion.__getattr__(self, *args, **kw) returns None.
>>> * NoneQuestion.__call__(self, *args, **kw) returns None.
>>> 
>>> Optionally, you can add more None-returning methods to NoneQuestion. Also, whether NoneQuestion is a singleton, has an accessible name, etc. are all bikesheddable.
>>> 
>>> I think it's obvious what happens is "spam" is not None and "spam.cheese" is, or of both are None, but if not, I can work them through as well.
>> I see it as "spam? doing "Maybe(spam)" and then attribute access
>> checking returning None if the wrapped object is None and getting the
>> attribute from it if not.
>> 
>> I think that the optimiser could probably avoid the use of Maybe in
>> cases like "spam?.cheese".
> I've thought of another issue:
> 
> If you write "spam?(sing_lumberjack_song())", won't it still call
> sing_lumberjack_song even if spam is None?

You're right; I didn't think about that. But I don't think that's a problem. 

I believe C#, Swift, etc. all evaluate the arguments in their equivalent. And languages like ObjC that do automatic nil coalescing for all method calls definitely evaluate them. If you really want to switch on spam and not call sing_lumberjack_song, you can always do that manually, right?

> perhaps a "Maybe" object should also support "?" so you could write "m
> = spam?; m?(sing_lumberjack_song())". "Maybe" could be idempotent, so
> "Maybe(Maybe(x))" returns the same result as "Maybe(x)".

That actually makes sense just for its own reasons.

Actually, now that I think about it, the way I defined it above already gives you this: since spam? is spam whenever spam is anything but None, spam?? is always spam?, right?


From abarnert at yahoo.com  Sat Sep 19 05:30:48 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 18 Sep 2015 20:30:48 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CALb0Rk7guKYiWvnydRm2kuT_1rhwgR+5zQd4BHTx+17-ZA5nCA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
 <CALb0Rk7guKYiWvnydRm2kuT_1rhwgR+5zQd4BHTx+17-ZA5nCA@mail.gmail.com>
Message-ID: <8EA8F9B6-5A00-43C6-A4AA-12DDB69125C6@yahoo.com>

On Sep 18, 2015, at 19:06, Mark E. Haase <mehaase at gmail.com> wrote:
> 
> Andrew, I really like that idea. Turning back to the null coalescing operator (spelled ?? in other languages), how do you think that fits in? 
> 
> Consider this syntax:
> 
> >>> None? or 1

I don't think there's any easy way to make "spam? or 1" work any better than "spam or 1" already does, partly for the reasons you give below, but also because it doesn't seem to fit the design in any obvious way.

I guess that means postfix ? doesn't quite magically solve everything...

> This also doesn't work quite right. If both operands are None, we want the expression to evaluate to None, not NoneQuestion. Should null coalescing be a separate operator? And if so, are "?" and "??" too similar?

As MRAB pointed out, there seem to be good reasons to let spam?? mean the same thing as spam? (and that follows automatically from the simplest possible definition, the one I gave above). So I think "spam ?? eggs" is ambiguous between the postfix operator and the infix operator without lookahead, at least to a human, and possibly to the compiler as well.

I suppose ?: as in ColdFusion might work, but (a) ewwww, (b) it regularly confuses novices to CF, and (c) it's impossible to search for, because ?: no matter how you quote it gets you the C ternary operator....

> Can anybody think of realistic use cases for overriding a magic method for the "?" operator? I would like to include such use cases in a PEP. One possible use case: being able to coalesce empty strings.
> 
> >>> s1 = MyString('')
> >>> s2 = MyString('foobar')
> >>> s1? or s2
> MyString('foobar')

This seems like a bad idea. Empty strings are already falsey. If you want this behavior, why not just use "s1 or s2", which already works, and for obvious reasons?


From steve at pearwood.info  Sat Sep 19 05:41:12 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 19 Sep 2015 13:41:12 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FC9812.2090503@mrabarnett.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
Message-ID: <20150919034112.GQ31152@ando.pearwood.info>

On Sat, Sep 19, 2015 at 12:02:42AM +0100, MRAB wrote:

> To me, the choice _is_ obvious: "obj?[idx]". After all, that's more in
> keeping with "obj?.attr" and "func?()".
> 
> If you had "obj[?idx]", then shouldn't it also be "obj.?attr" and
> "func(?)"?

No.

If I understand the idea, obj?.attr returns None if obj is None, 
otherwise returns obj.attr. The question mark (shrug operator?) applies 
to `obj` *before* the attribute lookup, so it should appear *before* the 
dot (since we read from left-to-right).

The heuristic for remembering the order is that the "shrug" (question 
mark) operator applies to obj, so it is attached to obj, before any 
subsequent operation.

For the sake of brevity, using @ as a placeholder for one of attribute 
access, item/key lookup, or function call, then we have:

    obj?@

as syntactic sugar for:

    None if obj is None else obj@

Furthermore, we should be able to chain a sequence of such @s:

paperboy.receive(customer?.trousers.backpocket.wallet.extract(2.99))


being equivalent to:

paperboy.receive(None if customer is None else 
                 customer.trousers.backpocket.wallet.extract(2.99) 
                 )


Let's just assume we have a good reason for chaining lookups that isn't 
an egregious violation of the Law of Demeter, and not get into a debate 
over OOP best practices, okay? :-)

Suppose that wallet itself may also be None. Then we can easily deal 
with that situation too:

paperboy.receive(customer?.trousers.backpocket.wallet?.extract(2.99))


which I think is a big win over either of these two alternatives:

# 1
paperboy.receive(None if customer is None else 
                 None if customer.trousers.backpocket.wallet is None 
                 else customer.trousers.backpocket.wallet.extract(2.99) 
                 )

# 2
if customer is not None:
    wallet = customer.trousers.backpocket.wallet
    if wallet is not None:
        paperboy.receive(wallet.extract(2.99))


It's a funny thing, I'm usually not a huge fan of symbols outside of 
maths operators, and I strongly dislike the C ? ternary operator, but 
this one feels really natural to me. I didn't have even the most 
momentary "if you want Perl, you know where to find it" thought.
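
The expansion "None if obj is None else obj@" can be emulated today with a
small helper that takes the rest of the chain as a function. This is a
sketch only: maybe_chain and payment are hypothetical names, and the
SimpleNamespace objects are stand-ins for the customer example above.

```python
from types import SimpleNamespace


def maybe_chain(obj, rest):
    """Emulate "obj?@...": None if obj is None, else rest(obj)."""
    return None if obj is None else rest(obj)


def payment(customer):
    # Emulates: customer?.trousers.backpocket.wallet?.extract(2.99)
    wallet = maybe_chain(customer, lambda c: c.trousers.backpocket.wallet)
    return maybe_chain(wallet, lambda w: w.extract(2.99))


def make_customer(wallet):
    # Build the nested attribute chain used in the example.
    pocket = SimpleNamespace(wallet=wallet)
    return SimpleNamespace(trousers=SimpleNamespace(backpocket=pocket))


rich = make_customer(SimpleNamespace(extract=lambda amount: amount))
broke = make_customer(None)

print(payment(rich))   # 2.99
print(payment(broke))  # None
print(payment(None))   # None
```

As in the proposed syntax, only the two marked positions are None-checked;
a None in trousers or backpocket would still raise AttributeError.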



-- 
Steve

From rymg19 at gmail.com  Sat Sep 19 05:43:16 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 18 Sep 2015 22:43:16 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <8EA8F9B6-5A00-43C6-A4AA-12DDB69125C6@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <CAPTjJmqc2-RuAvoY-NR65VLYftyHwJJ51yBueYrWrpugEQn_UA@mail.gmail.com>
 <DE93DCDF-F280-402A-974A-BA32280619F9@yahoo.com>
 <CALb0Rk7guKYiWvnydRm2kuT_1rhwgR+5zQd4BHTx+17-ZA5nCA@mail.gmail.com>
 <8EA8F9B6-5A00-43C6-A4AA-12DDB69125C6@yahoo.com>
Message-ID: <15397B59-288D-478B-8DC1-00F5CB16B30D@gmail.com>

This is likely going to get shot down quickly...

I know CoffeeScript is not regarded too well in this community (well, at least based on Guido's remarks on parsing it), but what if x? was shorthand for x is not None? In CS, it's called the existential operator.

On September 18, 2015 10:30:48 PM CDT, Andrew Barnert via Python-ideas <python-ideas at python.org> wrote:
>On Sep 18, 2015, at 19:06, Mark E. Haase <mehaase at gmail.com> wrote:
>> 
>> Andrew, I really like that idea. Turning back to the null coalescing
>operator (spelled ?? in other languages), how do you think that fits
>in? 
>> 
>> Consider this syntax:
>> 
>> >>> None? or 1
>
>I don't think there's any easy way to make "spam? or 1" work any better
>than "spam or 1" already does, partly for the reasons you give below,
>but also because it doesn't seem to fit the design in any obvious way.
>
>I guess that means postix ? doesn't quite magically solve everything...
>
>> This also doesn't work quite right. If both operands are None, we
>want the expression to evaluate to None, not NoneQuestion. Should null
>coalescing be a separate operator? And if so, are "?" and "??" too
>similar?
>
>As MRAB pointed out, there seem to be good reasons to let spam?? mean
>the same thing as spam? (and that follows automatically from the
>simplest possible definition, the one I gave above). So I think "spam
>?? eggs" is ambiguous between the postfix operator and the infix
>operator without lookahead, at least to a human, and possibly to the
>compiler as well.
>
>I suppose ?: as in ColdFusion might work, but (a) ewwww, (b) it
>regularly confuses novices to CF, and (c) it's impossible to search
>for, because ?: no matter how you quote it gets you the C ternary
>operator....
>
>> Can anybody think of realistic use cases for overriding a magic
>method for the "?" operator? I would like to include such use cases in
>a PEP. One possible use case: being able to coalesce empty strings.
>> 
>> >>> s1 = MyString('')
>> >>> s2 = MyString('foobar')
>> >>> s1? or s2
>> MyString('foobar')
>
>This seems like a bad idea. Empty strings are already falsey. If you
>want this behavior, why not just use "s1 or s2", which already works,
>and for obvious reasons?
>
>>> On Fri, Sep 18, 2015 at 9:10 PM, Andrew Barnert via Python-ideas
><python-ideas at python.org> wrote:
>>> On Sep 18, 2015, at 18:00, Chris Angelico <rosuav at gmail.com> wrote:
>>> >
>>> >> On Sat, Sep 19, 2015 at 10:49 AM, Andrew Barnert
><abarnert at yahoo.com> wrote:
>>> >> Obviously "spam?" returns something with a __getattr__ method
>that just passes through to spam.__getattr__, except that on NoneType
>it returns something with a __getattr__ that always returns None. That
>solves the eggs case.
>>> >>
>>> >> Next, "spam?.cheese?" returns something with a __call__ method
>that just passes through to spam?.cheese.__call__, except that on
>NoneType it returns something with a __call__ that always returns None.
>That solves the cheese case.
>>> >
>>> > Hang on, how do you do this? How does the operator know the
>difference
>>> > between "spam?", which for None has to have __getattr__ return
>None,
>>> > and "spam?.cheese?" that returns (lambda: None)?
>>> 
>>> >>> spam
>>> None
>>> >>> spam?
>>> NoneQuestion
>>> >>> spam?.cheese
>>> None
>>> >>> spam?.cheese?
>>> NoneQuestion
>>> >>> spam?.cheese?()
>>> None
>>> 
>>> All you need to make this work is:
>>> 
>>> * "spam?" returns NoneQuestion if spam is None else spam
>>> * NoneQuestion.__getattr__(self, *args, **kw) returns None.
>>> * NoneQuestion.__call__(self, *args, **kw) returns None.
>>> 
>>> Optionally, you can add more None-returning methods to NoneQuestion.
>Also, whether NoneQuestion is a singleton, has an accessible name, etc.
>are all bikesheddable.
>>> 
>>> I think it's obvious what happens if "spam" is not None and
>"spam.cheese" is, or if both are None, but if not, I can work them
>through as well.
>>> 
>>> 
>>> > ChrisA
>> 
>> 
>> 
>> -- 
>> Mark E. Haase
>> 202-815-0201
>
>

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From guido at python.org  Sat Sep 19 06:21:56 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 Sep 2015 21:21:56 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150919034112.GQ31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
Message-ID: <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>

On Fri, Sep 18, 2015 at 8:41 PM, Steven D'Aprano <steve at pearwood.info>
wrote:

> It's a funny thing, I'm usually not a huge fan of symbols outside of
> maths operators, and I strongly dislike the C ? ternary operator, but
> this one feels really natural to me. I didn't have even the most
> momentary "if you want Perl, you know where to find it" thought.
>

I do, but at least the '?' is part of an operator, not part of the name (as
it is in Ruby?).

I really, really, really don't like how it looks, but here's one thing: the
discussion can be cut short and focus almost entirely on whether this is
worth making Python uglier (and whether it's even ugly :-). The semantics
are crystal clear and it's obvious that the way it should work is by making
"?.", "?(" and "?[" new operators or operator pairs -- the "?" should not be
a unary postfix operator but a symbol that combines with certain other
symbols.

Let me propose a (hyper?)generalization: it could be combined with any
binary operation, e.g. "a?+b" would mean "None if a is None else a+b".
Sadly (as hypergeneralizations tend to do?) this also leads to a negative
observation: what if I wanted to write "None if b is None else a+b"? (And
don't be funny and say I should swap a and b -- they could be strings.)
Similar for what if you wanted to do this with a unary operator, e.g. None
if x is None else -x. Maybe we could write "a+?b" and "-?x"? But I don't
think the use cases warrant these much.

Finally, let's give it a proper name -- let's call it the uptalk operator.

-- 
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info  Sat Sep 19 07:06:48 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 19 Sep 2015 15:06:48 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
Message-ID: <20150919050647.GR31152@ando.pearwood.info>

On Fri, Sep 18, 2015 at 05:49:36PM -0700, Andrew Barnert via Python-ideas wrote:

> Obviously "spam?" returns something with a __getattr__ method that 
> just passes through to spam.__getattr__, except that on NoneType it 
> returns something with a __getattr__ that always returns None. That 
> solves the eggs case.

Ah, and now my enthusiasm for the whole idea is gone...

In my previous response, I imagined spam?.attr to be syntactic sugar for 
`None if spam is None else spam.attr`. But having ? be an ordinary 
operator that returns a special Null object feels too "Design Pattern-y" 
to me. I think the Null design pattern is actually harmful, and I would 
not like to see this proposal implemented this way.

(In another email, Andrew called the special object something like 
NoneMaybe or NoneQuestion, I forget which. I'm going to call the object 
Null, since that's less typing.)

The Null object pattern sounds like a great idea at first, but I find it 
to be a code smell at best and outright harmful at worst. If you are 
passing around an object which is conceptually None, but unlike None 
reliably does nothing without raising an exception no matter what you do 
with it, that suggests to me that something about your code is not 
right.

If your functions already accept None, then you should just use None. If 
they don't accept None, then why are you trying to smuggle None into 
them using a quasi-None that unconditionally hides errors?

Here are some problems with the Null pattern as I see it:

(1) Suppose that spam? returns a special Null object, and Null.attr 
itself returns Null. (As do Null[item] and Null(arg), of course.) This 
matches the classic Null object design pattern, and gives us chaining 
for free:

    value = obj?.spam.eggs.cheese

But now `value` is Null, which may not be what we expect and may in fact 
be a problem if we're expecting it to be "an actual value, or None" 
rather than our quasi-None Null object.

Because `value` is now a Null, every time we pass it to a function, we 
risk getting new Nulls in places that shouldn't get them. If a function 
isn't expecting None, we should get an exception, but Null is designed 
to not raise exceptions no matter what you do with it. So we risk 
contaminating our data with Nulls in unexpected places.

Eventually, of course, there comes a time where we need to deal with the 
actual value. With the Null pattern in place, we have to deal with two 
special cases, not one:

    # I assume Null is a singleton, otherwise use isinstance
    if filename is not None and filename is not Null:
        os.unlink(filename)

A small nuisance, to be sure, but part of the reason why I really don't 
think much of the Null object pattern. It sounds good on paper, but I 
think it's actually more dangerous and inconvenient than the problem it 
tries to solve.
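
To make the contamination problem in (1) concrete, here is a minimal
sketch of a classic Null object. The names `_Null`, `Null` and
`question` are hypothetical stand-ins invented for this illustration
(nothing in the thread defines them); `question(obj)` only emulates
what a postfix `obj?` would return under this design.

```python
class _Null:
    """Hypothetical classic Null object: every operation returns Null."""

    _instance = None

    def __new__(cls):
        # Keep a single instance so that `is Null` checks work.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self

    def __call__(self, *args, **kwargs):
        return self

    def __repr__(self):
        return "Null"


Null = _Null()


def question(obj):
    """Assumed stand-in for the proposed postfix ``obj?``."""
    return Null if obj is None else obj


# Emulates: value = obj?.spam.eggs.cheese  with obj = None
value = question(None).spam.eggs.cheese

# The result is Null rather than None, so a plain `is None` check
# no longer catches it and both special cases have to be tested.
needs_two_checks = value is not None and value is Null
```

Under these assumptions `value` silently passes any `is not None`
guard, which is the "two special cases, not one" nuisance described
above.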


(2) We can avoid the worst of the Null design (anti-)pattern by having 
Null.attr return None instead of Null. Unfortunately, that means we've 
lost automatic chaining. If you have an object that might be None, we 
have to explicitly use the ? operator after each lookup except the last:

    value = obj?.spam?.eggs?.cheese

which is (a) messy, (b) potentially inefficient, and (c) potentially 
hides subtle bugs.

Here is a scenario where it hides bugs. Suppose obj may be None, but if 
it is not, then obj.spam *must* be an object with an eggs attribute. If 
obj.spam is None, that's a bug that needs fixing. Suppose we start off 
by writing the obvious thing:

    obj?.spam.eggs

but that fails because obj=None raises an exception:

    obj? returns Null
    Null.spam returns None
    None.eggs raises

So to protect against that, we might write:

    obj?.spam?.eggs

but that protects against too much, and hides the fact that obj.spam 
exists but is None.
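
The bug-hiding scenario can be reproduced with a small sketch. It
assumes the variant from (2), where attribute lookup on the special
object gives plain None; `_NoneQuestion`, `NoneQuestion` and `q` are
hypothetical names used only for this illustration, and `q(obj)`
emulates a postfix `obj?`.

```python
from types import SimpleNamespace


class _NoneQuestion:
    """Hypothetical result of ``None?``: lookups and calls give None."""

    def __getattr__(self, name):
        return None

    def __call__(self, *args, **kwargs):
        return None


NoneQuestion = _NoneQuestion()


def q(obj):
    """Assumed stand-in for the proposed postfix ``obj?``."""
    return NoneQuestion if obj is None else obj


good = SimpleNamespace(spam=SimpleNamespace(eggs="fried"))
buggy = SimpleNamespace(spam=None)  # obj.spam is wrongly None

# obj?.spam?.eggs, spelled with the helper:
ok = q(q(good).spam).eggs        # the real attribute value
missing = q(q(None).spam).eggs   # None, which is what we wanted
hidden = q(q(buggy).spam).eggs   # also None: the bug goes unnoticed

# obj?.spam.eggs with obj = None fails on the final lookup:
try:
    q(None).spam.eggs
    raised = False
except AttributeError:
    raised = True
```

In this sketch `missing` and `hidden` are indistinguishable, even
though only the first is a legitimate "obj was None" case.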

As far as I am concerned, any use of a Null object has serious 
downsides. If people want to explicitly use it in their own code, well, 
good luck with that. I don't think Python should be making it a 
built-in.

I think the first case, the classic Null design pattern, is actually 
*better* because the downsides are anything but subtle, and people will 
soon learn not to touch it with a 10ft pole *wink*, while the second 
case, the "Null.attr gives None" case, is actually worse because it 
isn't *obviously* wrong and can subtly hide bugs.


How does my earlier idea of ? as syntactic sugar compare with those?

In that case, there is no special Null object, there's only None. So we 
avoid the risk of Null infection, and avoid needing to check specially 
for Null. It also avoids the bug-hiding scenario:

    obj?.spam.eggs

is equivalent to:

    None if obj is None else obj.spam.eggs


If obj is None, we get None, as we expect. If it is not None, we get 
obj.spam.eggs as we expect. If obj.spam is wrongly None, then we get an 
exception, as we should.
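
For comparison, the pure syntactic-sugar reading can be emulated
with an ordinary function. This is only a sketch: the helper name
`none_aware_getattr` is made up for the illustration, and the real
proposal would be syntax, not a function call.

```python
from functools import reduce
from types import SimpleNamespace


def none_aware_getattr(obj, *names):
    """Emulate ``obj?.a.b`` as ``None if obj is None else obj.a.b``."""
    if obj is None:
        return None
    # Only the first object is tested; later lookups are ordinary
    # attribute access and raise as usual.
    return reduce(getattr, names, obj)


good = SimpleNamespace(spam=SimpleNamespace(eggs=42))
buggy = SimpleNamespace(spam=None)  # obj.spam is wrongly None

found = none_aware_getattr(good, "spam", "eggs")
absent = none_aware_getattr(None, "spam", "eggs")

try:
    none_aware_getattr(buggy, "spam", "eggs")
    bug_reported = False
except AttributeError:
    bug_reported = True
```

No special object is involved, so the result is always either a real
value or None, and the wrongly-None `obj.spam` still raises.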



-- 
Steve

From stephen at xemacs.org  Sat Sep 19 07:14:40 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 19 Sep 2015 14:14:40 +0900
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
Message-ID: <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>

Andrew Barnert via Python-ideas writes:

 > So, do we need a dunder method for the "?" operator? What else
 > would you use it for besides None?

NaNs in a pure-Python implementation of float or Decimal.  (This is
not a practical suggestion.)

A true SQL NULL type.  It's always bothered me that most ORMs map NULL
to None but there are plenty of other ways to inject None into a
Python computation.  (This probably isn't a practical suggestion
either unless Random832's suggestion of ?() establishing a lexical
context were adopted.)

The point is that Maybe behavior is at least theoretically useful in
subcategories, with special objects other than None.

Sven's suggestion of calling this the "monad" operator triggers a
worry in me, however.  In Haskell, the Monad type doesn't enforce the
monad laws, only the property of being an endofunctor.  That
apparently turns out to be enough in practice to make the Monad type
very useful.  However, in Python we have no way to enforce that
property.  I don't have the imagination to come up with a truly
attractive nuisance here, and this operator doesn't enable general
functorial behavior, so maybe it's not a problem.


From random832 at fastmail.com  Sat Sep 19 08:55:13 2015
From: random832 at fastmail.com (Random832)
Date: Sat, 19 Sep 2015 02:55:13 -0400
Subject: [Python-ideas] Null coalescing operators
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
Message-ID: <m2zj0ikiem.fsf@fastmail.com>

Guido van Rossum <guido at python.org> writes:
> Let me propose a (hyper?)generalization: it could be combined with any
> binary operation, e.g. "a?+b" would mean "None if a is None else a+b".

I'd have read it as "None if a is None or b is None else a+b". If you
want to only do it for one of the operands you should be explicit.
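
The two readings of "a?+b" differ only when b is None. A minimal
sketch, using hypothetical helper names that are not part of any
proposal:

```python
def add_left_checked(a, b):
    """Guido's reading: None if a is None else a + b."""
    return None if a is None else a + b


def add_both_checked(a, b):
    """Alternative reading: None if either operand is None."""
    return None if a is None or b is None else a + b


same = add_left_checked(None, 1) is add_both_checked(None, 1) is None

# With a non-None left operand and a None right operand, the
# left-only reading still attempts the addition and fails.
try:
    add_left_checked(1, None)
    left_failed = False
except TypeError:
    left_failed = True

both_result = add_both_checked(1, None)
```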

I'm not sure if I have a coherent argument for why this shouldn't apply
to ?[, though.


From anthony at xtfx.me  Sat Sep 19 10:17:07 2015
From: anthony at xtfx.me (C Anthony Risinger)
Date: Sat, 19 Sep 2015 03:17:07 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
Message-ID: <CAGAVQTFG8PpdmDqr1FRBrvmB_hJ-ZqJrZT4qLyLx63+bNm-mpg@mail.gmail.com>

On Fri, Sep 18, 2015 at 11:21 PM, Guido van Rossum <guido at python.org> wrote:

> On Fri, Sep 18, 2015 at 8:41 PM, Steven D'Aprano <steve at pearwood.info>
> wrote:
>
>> It's a funny thing, I'm usually not a huge fan of symbols outside of
>> maths operators, and I strongly dislike the C ? ternary operator, but
>> this one feels really natural to me. I didn't have even the most
>> momentary "if you want Perl, you know where to find it" thought.
>>
>
> I do, but at least the '?' is part of an operator, not part of the name
> (as it is in Ruby?).
>
> I really, really, really don't like how it looks, but here's one thing:
> the discussion can be cut short and focus almost entirely on whether this
> is worth making Python uglier (and whether it's even ugly :-). The
> semantics are crystal clear and it's obvious that the way it should work is
> by making "?.", "?(" and "?[" new operators or operator pairs -- the "?"
> should not be a unary postfix operator but a symbol that combines with
> certain other symbols.
>

I really liked this whole thread, and I largely still do -- I think -- but
I'm not sure I like how `?` suddenly prevents whole blocks of code from
being evaluated. Anything within the (...) or [...] is now skipped (IIUC)
just because a `?` was added, which seems like it could have side effects
on the surrounding state, especially since I expect people will use it for
squashing/silencing or as a convenient trick after the fact, possibly in
code they did not originally write.

If the original example included a `?` like so:

    response = json.dumps?({
        'created': created?.isoformat(),
        'updated': updated?.isoformat(),
        ...
    })

should "dumps" be None, the additional `?` (although you can barely
see it) prevents *everything else* from executing. This may cause confusion
about what is being executed, and when, especially once nesting (to any
degree really) and/or chaining comes into play!

Usually when I want to use this pattern, I find I just need to write things
out more. The concept itself vaguely reminds me of PHP's use of `@` for
squashing errors. In my opinion, it has some utility but has too much
potential impact on program flow without being very noticeable. If I saw
more than 1 per line, or a couple within a few lines, I think my ability to
quickly identify -> analyze -> comprehend possible routes in program
control flow decreases. I feel like I'll fault more, double back, and/or
make sure I forevermore look harder for sneaky `?`s.

I probably need to research more examples of how such a thing is used in
real code, today. This will help me get a feel for how people might want to
integrate the new `?` capability into their libraries and apis, maybe that
will ease my readability reservations.

Thanks,

-- 

C Anthony

From greg.ewing at canterbury.ac.nz  Sat Sep 19 09:03:27 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 19 Sep 2015 19:03:27 +1200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
Message-ID: <55FD08BF.9090800@canterbury.ac.nz>

Guido van Rossum wrote:

> Finally, let's give it a proper name -- let's call it the uptalk operator.

Um... why? Is this a Monty Python reference I'm missing?

-- 
Greg

From srkunze at mail.de  Sat Sep 19 11:48:01 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 19 Sep 2015 11:48:01 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <55FD2F51.6020303@mail.de>

On 19.09.2015 07:14, Stephen J. Turnbull wrote:
> A true SQL NULL type.  It's always bothered me that most ORMs map NULL
> to None but there are plenty of other ways to inject None into a
> Python computation.  (This probably isn't a practical suggestion
> either unless Random832's suggestion of ?() establishing a lexical
> context were adopted.)

I definitely agree here. Internally, we have a guideline telling us to 
avoid None or NULL whenever possible. Andrew's remark about 'code smell' 
is definitely appropriate.


There was a great discussion some years ago on one of the RDF semantics 
mailing list about the semantics of NULL (in RDF). It turned out to have 
6 or 7 semantics WITHOUT any domain-specific focus (don't know, doesn't 
exist, is missing, etc. -- can't remember all of them). I feel that is 
one reason why Python programs should avoid None: we don't guess.

> The point is that Maybe behavior is at least theoretically useful in
> subcategories, with special objects other than None.
>
> Sven's suggestion of calling this the "monad" operator triggers a
> worry in me, however.  In Haskell, the Monad type doesn't enforce the
> monad laws, only the property of being an endofunctor.  That
> apparently turns out to be enough in practice to make the Monad type
> very useful.  However, in Python we have no way to enforce that
> property.  I don't have the imagination to come up with a truly
> attractive nuisance here, and this operator doesn't enable general
> functorial behavior, so maybe it's not a problem.

Having slept on it for a night, I now tend to change my mind regarding 
this. Maybe it's *better to DEAL with None, as in remove them* from the 
code, from the database, from the YAML files and so forth, *instead of* 
making it easier to work with them. Restricting oneself would 
eventually lead to more predictable designs.

Does this make sense somehow?


Issue is, None is so convenient to work with. You only find out the code 
smell when you discover a "NoneType object does not have attribute X" 
exception some months later and start looking where the heck the None 
could come from. What can we do here?

From steve at pearwood.info  Sat Sep 19 14:06:24 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 19 Sep 2015 22:06:24 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAGAVQTFG8PpdmDqr1FRBrvmB_hJ-ZqJrZT4qLyLx63+bNm-mpg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <CAGAVQTFG8PpdmDqr1FRBrvmB_hJ-ZqJrZT4qLyLx63+bNm-mpg@mail.gmail.com>
Message-ID: <20150919120624.GS31152@ando.pearwood.info>

On Sat, Sep 19, 2015 at 03:17:07AM -0500, C Anthony Risinger wrote:

> I really liked this whole thread, and I largely still do -- I think -- but
> I'm not sure I like how `?` suddenly prevents whole blocks of code from
> being evaluated. Anything within the (...) or [...] is now skipped (IIUC)
> just because a `?` was added, which seems like it could have side effects
> on the surrounding state, especially since I expect people will use it for
> squashing/silencing or as a convenient trick after the fact, possibly in
> code they did not originally write.

I don't think this is any different from other short-circuiting 
operators, particularly `and` and the ternary `if` operator:

result = obj and obj.method(expression)

result = obj.method(expression) if obj else default

In both cases, `expression` is not evaluated if obj is falsey. That's 
the whole point.

 
> If the original example included a `?` like so:
> 
>     response = json.dumps?({
>         'created': created?.isoformat(),
>         'updated': updated?.isoformat(),
>         ...
>     })
> 
> should "dumps" be None, the additional `?` (although you can barely
> see it) prevents *everything else* from executing.

We're still discussing the syntax and semantics of this, so I could be 
wrong, but my understanding of this is that the *first* question mark 
prevents the expressions in the parens from being executed:

json.dumps?( ... )

evaluates as None if json.dumps is None, otherwise it evaluates the 
arguments and calls the dumps object. In other words, rather like this:

_temp = json.dumps  # temporary value
if _temp is None:
    response = None
else:
    response = _temp({
        'created': None if created is None else created.isoformat(),
        'updated': None if updated is None else updated.isoformat(),
        ...
        })
del _temp


except the _temp name isn't actually used. The whole point is to avoid 
evaluating an expression (attribute lookup, index/key lookup, function 
call) which will fail if the object is None, and if you're not going to 
call the function, why evaluate the arguments to the function?


> This may cause confusion
> about what is being executed, and when, especially once nesting (to any
> degree really) and/or chaining comes into play!

Well, yes, people can abuse most any syntax. 


> Usually when I want to use this pattern, I find I just need to write things
> out more. The concept itself vaguely reminds me of PHP's use of `@` for
> squashing errors.

I had to look up PHP's @ and I must say I'm rather horrified. According 
to the docs, all it does is suppress the error reporting, it does 
nothing to prevent or recover from errors. There's not really an 
equivalent in Python, but I suppose this is the closest:

# similar to PHP's $result = @(expression);
try:
    result = expression
except:
    result = None


This is nothing like this proposal. It doesn't suppress arbitrary 
errors. It's more like a conditional:

# result = obj?(expression)
if obj is None:
    result = None
else:
    result = obj(expression)


If `expression` raises an exception, it will still be raised, but only 
if it is actually evaluated, just like anything else protected by an 
if...else or short-circuit operator.
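
The short-circuit behaviour can be checked today by passing the
argument as a thunk. This is only an emulation under assumed names
(`none_aware_call`, `expensive` and `boom` are invented for the
example); real syntax would not need the thunk.

```python
evaluated = []


def expensive():
    """Argument expression with a visible side effect."""
    evaluated.append("expensive")
    return 21


def boom():
    """Argument expression that fails when evaluated."""
    raise ValueError("argument was evaluated")


def none_aware_call(func, arg_thunk):
    """Emulate ``func?(expression)``; arg_thunk delays the expression."""
    if func is None:
        return None  # the argument expression is never evaluated
    return func(arg_thunk())


skipped = none_aware_call(None, expensive)
count_after_skip = len(evaluated)

called = none_aware_call(lambda n: n * 2, expensive)
count_after_call = len(evaluated)

# A failing argument is harmless when the callable is None ...
harmless = none_aware_call(None, boom)

# ... but still raises when the call actually happens.
try:
    none_aware_call(str, boom)
    propagated = False
except ValueError:
    propagated = True
```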



-- 
Steve

From stephen at xemacs.org  Sat Sep 19 14:48:52 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 19 Sep 2015 21:48:52 +0900
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FD2F51.6020303@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <55FD2F51.6020303@mail.de>
Message-ID: <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>

Sven R. Kunze writes:

 > Issue is, None is so convenient to work with. You only find out the
 > code smell when you discover a "NoneType object does not have
 > attribute X"

That's exactly what should happen (analogous to a "signalling NaN").
The problem is if you are using None as a proxy for a NULL in another
subsystem that has "NULL contagion" (I prefer that to "coalescing").

At this point the thread ends for me because I'm not going to try to tell
the many libraries that have chosen to translate NULL to None and vice
versa that they are wrong.


From guido at python.org  Sat Sep 19 18:21:04 2015
From: guido at python.org (Guido van Rossum)
Date: Sat, 19 Sep 2015 09:21:04 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>

"Uptalk" is an interesting speech pattern where every sentence sounds like
a question. Google it, there's some interesting research.

The "null pattern" is terrible. Uptalk should not be considered a unary
operator that returns a magical value. It's a modifier on other operators
(somewhat similar to the way "+=" and friends are formed).

In case someone missed it, uptalk should test for None, not for a falsey
value.

I forgot to think about the scope of the uptalk operator (i.e. what is
skipped when it finds a None). There are some clear cases (the actual
implementation should avoid double evaluation of the tested expression, of
course):

  a.b?.c.d[x, y](p, q) === None if a.b is None else a.b.c.d[x, y](p, q)
  a.b?[x, y].c.d(p, q) === None if a.b is None else a.b[x, y].c.d(p, q)
  a.b?(p, q).c.d[x, y] === None if a.b is None else a.b(p, q).c.d[x, y]

But what about its effect on other operators in the same expression? I
think this is reasonable:

  a?.b + c.d === None if a is None else a.b + c.d

OTOH I don't think it should affect shortcut boolean operators (and, or):

  a?.b or x === (None if a is None else a.b) or x

It also shouldn't escape out of comma-separated lists, argument lists, etc.:

  (a?.b, x) === ((None if a is None else a.b), x)
  f(a?.b) === f((None if a is None else a.b))

Should it escape from plain parentheses? Which of these is better?

  # Fails unless c overloads None+c:
  (a?.b) + c === (None if a is None else a.b) + c
  # Could be surprising if ? is deeply nested:
  (a?.b) + c === None if a is None else (a.b) + c

Here are some more edge cases / hypergeneralizations:

  # ?: skips the entry if the key is None
  {k1?: v1, k2: v2} === {k2: v2} if k1 is None else {k1: v1, k2: v2}
  # But what to do to skip None values?

Could we give ?= a meaning in assignment, e.g. x ?= y could mean:

  if y is not None:
      x = y

More fun: x ?+= y could mean:

  if x is None:
      x = y
  elif y is not None:
      x += y

You see where this is going. Downhill fast. :-)
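
The two assignment forms floated above can be written as plain
functions that return the new value of x. This is a sketch, not a
proposal: the function names are hypothetical, and it assumes the
last branch of "x ?+= y" is meant to update x.

```python
def uptalk_assign(x, y):
    """Emulate ``x ?= y``: keep x unless y is not None."""
    return x if y is None else y


def uptalk_iadd(x, y):
    """Emulate ``x ?+= y`` under the assumption stated above."""
    if x is None:
        return y
    if y is not None:
        x += y
    return x


kept = uptalk_assign(1, None)
replaced = uptalk_assign(1, 2)
seeded = uptalk_iadd(None, 3)
unchanged = uptalk_iadd(2, None)
summed = uptalk_iadd(2, 3)
```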

-- 
--Guido van Rossum (python.org/~guido)

From rosuav at gmail.com  Sat Sep 19 18:27:09 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 02:27:09 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>
Message-ID: <CAPTjJmr8sp=Lq0y4W3zEtu_jO3YGWbbNQ+3xS6pse8H9c+Qc6w@mail.gmail.com>

On Sun, Sep 20, 2015 at 2:21 AM, Guido van Rossum <guido at python.org> wrote:
> Should it escape from plain parentheses? Which of these is better?
>
>   (a?.b) + c === (None if a is None else a.b) + c    # Fails unless c overloads None+c
>   (a?.b) + c === None if a is None else (a.b) + c    # Could be surprising if ? is deeply nested

My recommendation: It should _not_ escape. That way, you get control
over how far out the Noneness goes - you can bracket it in as tight as
you like.

ChrisA

From chris.barker at noaa.gov  Sat Sep 19 19:50:04 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Sat, 19 Sep 2015 10:50:04 -0700
Subject: [Python-ideas] add a single __future__ for py3?
Message-ID: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>

Hi all,

the common advice, these days, if you want to write py2/3 compatible code,
is to do:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

https://docs.python.org/2/howto/pyporting.html#prevent-compatibility-regressions

I'm trying to do this in my code, and teaching my students to do it too.

but that's actually a lot of code to write.

It would be nice to have a:

from __future__ import py3

or something like that, that would do all of those in one swipe.

IIUC, I can't make a little module that does that, because the __future__
imports only affect the module in which they are imported.
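
The per-module scope of __future__ imports can be seen by compiling
two sources separately. The four imports listed above are no-ops on
Python 3, so this sketch uses `annotations` (a future feature from
Python 3.7 onward) purely as a stand-in; the source strings and
namespace names are made up for the illustration.

```python
# Two independent "modules": only the first has the future import.
src_with = (
    "from __future__ import annotations\n"
    "def f(x: int): pass\n"
)
src_without = "def f(x: int): pass\n"

ns_with, ns_without = {}, {}

# dont_inherit=True keeps the caller's own future flags out of it.
exec(compile(src_with, "<with>", "exec", dont_inherit=True), ns_with)
exec(compile(src_without, "<without>", "exec", dont_inherit=True),
     ns_without)

# The future import changed how annotations are stored, but only in
# the compilation unit that contained it.
stringified = ns_with["f"].__annotations__
evaluated = ns_without["f"].__annotations__
```

The same rule is why a helper module that performs the imports
cannot switch the behaviour on for its importers.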

Sure, it's not a huge deal, but it would make it easier for folks wanting
to keep up this best practice.

Of course, this wouldn't happen until 2.7.11, if and when there even is one,
but it would be nice to get it on the list....

-Chris




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From steve at pearwood.info  Sat Sep 19 20:16:12 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 20 Sep 2015 04:16:12 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
Message-ID: <20150919181612.GT31152@ando.pearwood.info>

Following on to the discussions about changing the default random number 
generator, I would like to propose an alternative: adding a secrets 
module to the standard library.

Attached is a draft PEP. Feedback is requested.

(I'm going to only be intermittently at the keyboard for the next day or 
so, so my responses may be rather slow.)


-- 
Steve

-------------- next part --------------
PEP: xxx
Title: Adding A Secrets Module To The Standard Library
Version: $Revision$
Last-Modified: $Date$
Author: Steven D'Aprano <steve at pearwood.info>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 19-Sep-2015
Python-Version: 3.6
Post-History:


Abstract

    This PEP proposes the addition of a module for common security-related
    functions such as generating tokens to the Python standard library.


Definitions

    Some common abbreviations used in this proposal:

        PRNG:
            Pseudo Random Number Generator.  A deterministic algorithm used
            to produce random-looking numbers with certain desirable
            statistical properties.

        CSPRNG:
            Cryptographically Strong Pseudo Random Number Generator.  An
            algorithm used to produce random-looking numbers which are
            resistant to prediction.

        MT:
            Mersenne Twister.  An extensively studied PRNG which is currently
            used by the ``random`` module as the default.


Rationale

    This proposal is motivated by concerns that Python's standard library
    makes it too easy for developers to inadvertently make serious security
    errors.  Theo de Raadt, the founder of OpenBSD, contacted Guido van Rossum
    and expressed some concern[1] about the use of MT for generating sensitive
    information such as passwords, secure tokens, session keys and similar.

    Although the documentation for the random module explicitly states that
    the default is not suitable for security purposes[2], it is strongly
    believed that this warning may be missed, ignored or misunderstood by
    many Python developers.  In particular:

    - developers may not have read the documentation and consequently
      not seen the warning;

    - they may not realise that their specific use of it has security
      implications; or

    - not realising that there could be a problem, they have copied code
      (or learned techniques) from websites which don't offer best
      practices.

    The first[3] hit when searching for "python how to generate passwords" on
    Google is a tutorial that uses the default functions from the ``random``
    module[4].  Although it is not intended for use in web applications, it is
    likely that similar techniques find themselves used in that situation.
    The second hit is to a StackOverflow question about generating
    passwords[5].  Most of the answers given, including the accepted one, use
    the default functions.  When one user warned that the default could be
    easily compromised, they were told "I think you worry too much."[6]

    This strongly suggests that the existing ``random`` module is an attractive
    nuisance when it comes to generating (for example) passwords or secure
    tokens.
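
    The pattern described above, and the change that avoids it, can be
    sketched as follows.  This is an illustration only; ``ALPHABET``,
    ``weak_password`` and ``strong_password`` are hypothetical names, not
    code from the cited tutorial or answers.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits


def weak_password(length=12):
    # Uses the module-level functions, i.e. the default Mersenne Twister.
    return "".join(random.choice(ALPHABET) for _ in range(length))


def strong_password(length=12):
    # SystemRandom draws from the operating system's CSPRNG (os.urandom).
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))


# The default generator is deterministic: the same seed reproduces the
# same "password".  This is useful for simulations and fatal for secrets.
random.seed(2015)
first = weak_password()
random.seed(2015)
second = weak_password()
```

    ``SystemRandom`` ignores seeding, so there is no equivalent way to
    replay the output of ``strong_password``.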

    Additional motivation (of a more philosophical bent) can be found in the
    post which first proposed this idea[7].

Proposal

    Alternative proposals have focused on the default PRNG in the ``random``
    module, with the aim of providing "secure by default" cryptographically
    strong primitives that developers can build upon without thinking about
    security.  (See Alternatives below.)  This proposes a different approach:

    * The standard library already provides cryptographically strong
      primitives, but many users don't know they exist or when to use them.

    * Instead of requiring crypto-naive users to write secure code, the
      standard library should include a set of ready-to-use "batteries" for
      the most common needs, such as generating secure tokens.  This code
      will both directly satisfy a need ("How do I generate a password reset
      token?"), and act as an example of acceptable practices which
      developers can learn from[8].

    To do this, this PEP proposes that we add a new module to the standard
    library, with the suggested name ``secrets``.  This module will contain a
    set of ready-to-use functions for common activities with security
    implications, together with some lower-level primitives.

    The suggestion is that ``secrets`` becomes the go-to module for dealing
    with anything which should remain secret (passwords, tokens, etc.)
    while the ``random`` module remains backward-compatible.


API and Implementation

    The contents of the ``secrets`` module are expected to evolve over time, and
    likely will evolve between the time of writing this PEP and actual release
    in the standard library[9].  At the time of writing, the following functions
    have been suggested:

    * A high-level function for generating secure tokens suitable for use
      in (e.g.) password recovery, as session keys, etc.

    * A limited interface to the system CSPRNG, using either ``os.urandom``
      directly or ``random.SystemRandom``.  Unlike the ``random`` module, this
      does not need to provide methods for seeding, getting or setting the
      state, or any non-uniform distributions.  It should provide the
      following:

      - A function for choosing items from a sequence, ``secrets.choice``.
      - A function for generating an integer within some range, such as
        ``secrets.randrange`` or ``secrets.randint``.
      - A function for generating a given number of random bits and/or bytes
        as an integer.
      - A similar function which returns the value as a hex digit string.

    * ``hmac.compare_digest`` under the name ``equal``.

    The consensus appears to be that there is no need to add a new CSPRNG to
    the ``random`` module to support these uses; ``SystemRandom`` will be
    sufficient.
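
    A rough sketch of how the functions listed above could be built on
    existing primitives.  Only ``choice``, ``randrange``, ``randint`` and
    ``equal`` are names taken from this proposal; ``randbits``,
    ``token_hex`` and the default size of 32 bytes are assumptions made
    for illustration and should not be read as the proposed API.

```python
import binascii
import hmac
import os
import random

_sysrand = random.SystemRandom()

# Thin wrappers around the system CSPRNG: no seeding, no state access.
choice = _sysrand.choice
randrange = _sysrand.randrange
randint = _sysrand.randint


def randbits(k):
    """Return an int with k random bits (hypothetical name)."""
    return _sysrand.getrandbits(k)


def token_hex(nbytes=32):
    """Return a random token as a hex digit string (hypothetical name)."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


def equal(a, b):
    """Compare two digests without short-circuiting on the first mismatch."""
    return hmac.compare_digest(a, b)
```

    A password-reset token would then be a single call such as
    ``token_hex()``, with no decisions about generators left to the caller.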

    Some illustrative implementations have been given by Nick Coghlan[10].
    This idea has also been discussed on the issue tracker for the
    "cryptography" module[11].

    The ``secrets`` module itself will be pure Python, and other Python
    implementations can easily make use of it unchanged, or adapt it as
    necessary.


Alternatives

    One alternative is to change the default PRNG provided by the ``random``
    module[12].  This received considerable scepticism and outright opposition:

    * There is fear that a CSPRNG may be slower than the current PRNG (which
      in the case of MT is already quite slow).

    * Some applications (such as scientific simulations, and replaying
      gameplay) require the ability to seed the PRNG into a known state,
      which a CSPRNG lacks by design.

    * Another major use of the ``random`` module is for simple "guess a number"
      games written by beginners, and many people are loath to make any
      change to the ``random`` module which may make that harder.

    * Although there is no proposal to remove MT from the ``random`` module,
      there was considerable hostility to the idea of having to opt in to
      a non-CSPRNG or any backwards-incompatible changes.

    * Demonstrated attacks against MT are typically against PHP applications.
      It is believed that PHP's version of MT is a significantly softer target
      than Python's version, due to a poor seeding technique[13].  Consequently,
      without a proven attack against Python applications, many people object
      to a backwards-incompatible change.

    Nick Coghlan made an earlier suggestion for a globally configurable PRNG
    which uses the system CSPRNG by default[14], but has since hinted that he
    may withdraw it in favour of this proposal[15].


Comparison To Other Languages

    PHP

        PHP includes a function ``uniqid``[16] which by default returns a
        thirteen character string based on the current time in microseconds.
        Translated into Python syntax, it has the following signature:

            def uniqid(prefix='', more_entropy=False)->str


        The PHP documentation warns that this function is not suitable for
        security purposes.  Nevertheless, various mature, well-known PHP
        applications use it for that purpose (citation needed).

        PHP 5.3 and later also include a function ``openssl_random_pseudo_bytes``[17].
        Translated into Python syntax, it has roughly the following signature:

            def openssl_random_pseudo_bytes(length:int)->Tuple[str, bool]

        This function returns a pseudo-random string of bytes of the given
        length, and a boolean flag giving whether the string is considered
        cryptographically strong.  The PHP manual suggests that returning
        anything but True should be rare except for old or broken platforms.

    Javascript

        Based on a rather cursory search[18], there don't appear to be any
        well-known standard functions for producing strong random values in
        Javascript, although there may be good quality third-party libraries.
        Standard Javascript doesn't seem to include an interface to the
        system CSPRNG either, and people have extensively written about the
        weaknesses of Javascript's Math.random[19].

    Ruby

        The Ruby standard library includes a module ``SecureRandom``[20]
        which includes the following methods:

        * base64 - returns a Base64 encoded random string.

        * hex - returns a random hexadecimal string.

        * random_bytes - returns a random byte string.

        * random_number - depending on the argument, returns either a random
          integer in the range(0, n), or a random float between 0.0 and 1.0.

        * urlsafe_base64 - returns a random URL-safe Base64 encoded string.

        * uuid - returns a version 4 random Universally Unique IDentifier.
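
        For comparison, each of the methods above can be approximated
        with the existing Python standard library.  This is a sketch under
        assumptions: the function names and default sizes are
        hypothetical, and stripping the Base64 padding in the URL-safe
        variant is a choice made here.

```python
import base64
import binascii
import os
import random
import uuid

_sysrand = random.SystemRandom()


def random_bytes(n=16):
    return os.urandom(n)


def hex_token(n=16):
    return binascii.hexlify(os.urandom(n)).decode("ascii")


def base64_token(n=16):
    return base64.b64encode(os.urandom(n)).decode("ascii")


def urlsafe_base64_token(n=16):
    # Uses "-" and "_" instead of "+" and "/"; padding is removed here.
    return base64.urlsafe_b64encode(os.urandom(n)).rstrip(b"=").decode("ascii")


def random_number(n=0):
    # An integer in range(0, n) when n is positive, otherwise a float
    # in the half-open interval [0.0, 1.0).
    return _sysrand.randrange(n) if n > 0 else _sysrand.random()


def uuid4_string():
    return str(uuid.uuid4())
```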


What Should Be The Name Of The Module?

    There was a proposal to add a "random.safe" submodule, quoting the Zen
    of Python "Namespaces are one honking great idea" koan.  However, the
    author of the Zen, Tim Peters, has come out against this idea[21], and
    recommends a top-level module.

    In discussion on the python-ideas mailing list so far, the name "secrets"
    has received some approval, and no strong opposition.


Frequently Asked Questions

    Q: Is this a real problem? Surely MT is random enough that nobody can
       predict its output.

    A: The consensus among security professionals is that MT is not safe
       in security contexts.  It is not difficult to reconstruct the internal
       state of MT[22][23] and so predict all past and future values.  There
       are a number of known, practical attacks on systems using MT for
       randomness[24].

       While there are currently no known direct attacks on applications
       written in Python due to the use of MT, there is widespread agreement
       that such usage is unsafe.
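
       The state reconstruction mentioned above can be sketched in a few
       lines.  This is an illustration, not one of the cited attacks: it
       assumes the attacker sees 624 consecutive, complete 32-bit outputs,
       it relies on the tuple layout used by CPython's ``getstate()``
       (an implementation detail), and it predicts future values only.
       All function names are hypothetical.

```python
import random


def _undo_right_shift_xor(value, shift):
    # Invert y ^= y >> shift; each pass recovers `shift` more high bits.
    result = value
    for _ in range(32 // shift + 1):
        result = value ^ (result >> shift)
    return result


def _undo_left_shift_xor(value, shift, mask):
    # Invert y ^= (y << shift) & mask; each pass recovers more low bits.
    result = value
    for _ in range(32 // shift + 1):
        result = value ^ ((result << shift) & mask)
    return result


def untemper(output):
    """Map one 32-bit MT19937 output back to its internal state word."""
    y = _undo_right_shift_xor(output, 18)
    y = _undo_left_shift_xor(y, 15, 0xEFC60000)
    y = _undo_left_shift_xor(y, 7, 0x9D2C5680)
    return _undo_right_shift_xor(y, 11)


def clone_from_outputs(outputs):
    """Build a generator that continues where the observed one left off."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    words = tuple(untemper(value) for value in outputs)
    clone = random.Random()
    # State layout: (version, 624 state words + position index, gauss_next).
    clone.setstate((3, words + (624,), None))
    return clone


victim = random.Random(2015)
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_from_outputs(observed)

predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
```

       Real applications rarely expose raw 32-bit words, which makes the
       attack harder in practice but does not change the conclusion.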

    Q: Is this an alternative to specialised cryptographic software such as SSL?

    A: No. This is a "batteries included" solution, not a full-featured
       "nuclear reactor".  It is intended to mitigate some basic
       security errors, not be a solution to all security-related issues. To
       quote Nick Coghlan referring to his earlier proposal:

            "...folks really are better off learning to use things like
            cryptography.io for security sensitive software, so this change
            is just about harm mitigation given that it's inevitable that a
            non-trivial proportion of the millions of current and future
            Python developers won't do that."[25]


References

    [1] https://mail.python.org/pipermail/python-ideas/2015-September/035820.html

    [2] https://docs.python.org/3/library/random.html

    [3] As of the date of writing. Also, as Google search terms may be automatically customised for the user without their knowledge, some readers may see different results.

    [4] http://interactivepython.org/runestone/static/everyday/2013/01/3_password.html

    [5] http://stackoverflow.com/questions/3854692/generate-password-in-python

    [6] http://stackoverflow.com/questions/3854692/generate-password-in-python/3854766#3854766

    [7] https://mail.python.org/pipermail/python-ideas/2015-September/036238.html

    [8] At least those who are motivated to read the source code and documentation.

    [9] Tim Peters suggests that bike-shedding the contents of the module will be 10000 times more time consuming than actually implementing the module.  Words do not begin to express how much I am looking forward to this.

    [10] https://mail.python.org/pipermail/python-ideas/2015-September/036271.html

    [11] https://github.com/pyca/cryptography/issues/2347

    [12] Link needed.

    [13] By default PHP seeds the MT PRNG with the time (citation needed), which is exploitable by attackers, while Python seeds the PRNG with output from the system CSPRNG, which is believed to be much harder to exploit.

    [14] http://legacy.python.org/dev/peps/pep-0504/

    [15] https://mail.python.org/pipermail/python-ideas/2015-September/036243.html

    [16] http://php.net/manual/en/function.uniqid.php

    [17] http://php.net/manual/en/function.openssl-random-pseudo-bytes.php

    [18] Volunteers and patches are welcome.

    [19] http://ifsec.blogspot.fr/2012/05/cross-domain-mathrandom-prediction.html

    [20] http://ruby-doc.org/stdlib-2.1.2/libdoc/securerandom/rdoc/SecureRandom.html

    [21] https://mail.python.org/pipermail/python-ideas/2015-September/036254.html

    [22] https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html

    [23] https://mail.python.org/pipermail/python-ideas/2015-September/036077.html

    [24] https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

    [25] https://mail.python.org/pipermail/python-ideas/2015-September/036157.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

From brett at python.org  Sat Sep 19 20:21:48 2015
From: brett at python.org (Brett Cannon)
Date: Sat, 19 Sep 2015 18:21:48 +0000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
Message-ID: <CAP1=2W5_UfGmoyKKRmwLtt0mb489GDZGr4k0hQ2p_T7=L-7fWw@mail.gmail.com>

On Sat, 19 Sep 2015 at 10:51 Chris Barker <chris.barker at noaa.gov> wrote:

> Hi all,
>
> The common advice these days, if you want to write py2/3 compatible code,
> is to do:
>
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals
>
>
> https://docs.python.org/2/howto/pyporting.html#prevent-compatibility-regressions
>
> I'm trying to do this in my code, and teaching my students to do it too.
>
> But that's actually a lot of code to write.
>
> It would be nice to have a:
>
> from __future__ import py3
>
> or something like that, which would do all of those in one swoop.
>
> IIUC, I can't make a little module that does that, because the __future__
> imports only affect the module in which they are imported.
>
> Sure, it's not a huge deal, but it would make it easier for folks wanting
> to keep up this best practice.
>
> Of course, this wouldn't happen until 2.7.11, if and when there even is
> one, but it would be nice to get it on the list....
>
>
While in hindsight having a python3 __future__ statement that just turned
on everything would be handy, this runs the risk of breaking code by
introducing something that only works in a bugfix release. We went down
that route with booleans in 2.2.1 and came to regret it.

-Brett



> -Chris
>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov

From srkunze at mail.de  Sat Sep 19 20:41:56 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 19 Sep 2015 20:41:56 +0200
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
Message-ID: <55FDAC74.7050001@mail.de>

I totally agree here.

On 19.09.2015 19:50, Chris Barker wrote:
> Hi all,
>
> The common advice these days, if you want to write py2/3 compatible
> code, is to do:
>
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals
>
> https://docs.python.org/2/howto/pyporting.html#prevent-compatibility-regressions
>
> I'm trying to do this in my code, and teaching my students to do it too.
>
> But that's actually a lot of code to write.
>
> It would be nice to have a:
>
> from __future__ import py3
>
> or something like that, which would do all of those in one swoop.
>
> IIUC, I can't make a little module that does that, because the
> __future__ imports only affect the module in which they are imported.
>
> Sure, it's not a huge deal, but it would make it easier for folks
> wanting to keep up this best practice.
>
> Of course, this wouldn't happen until 2.7.11, if and when there even is
> one, but it would be nice to get it on the list....
>
> -Chris
>
>
>
>
> -- 
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov <mailto:Chris.Barker at noaa.gov>
>
>

From python at mrabarnett.plus.com  Sat Sep 19 20:45:59 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 19 Sep 2015 19:45:59 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>
Message-ID: <55FDAD67.40303@mrabarnett.plus.com>

On 2015-09-19 17:21, Guido van Rossum wrote:
> "Uptalk" is an interesting speech pattern where every sentence sounds
> like a question. Google it, there's some interesting research.
>
> The "null pattern" is terrible. Uptalk should not be considered a unary
> operator that returns a magical value. It's a modifier on other
> operators (somewhat similar to the way "+=" and friends are formed).
>
> In case someone missed it, uptalk should test for None, not for a falsey
> value.
>
> I forgot to think about the scope of the uptalk operator (i.e. what is
> skipped when it finds a None). There are some clear cases (the actual
> implementation should avoid double evaluation of the tested expression,
> of course):
>
>    a.b?.c.d[x, y](p, q) === None if a.b is None else a.b.c.d[x, y](p, q)
>    a.b?[x, y].c.d(p, q) === None if a.b is None else a.b[x, y].c.d(p, q)
>    a.b?(p, q).c.d[x, y] === None if a.b is None else a.b(p, q).c.d[x, y]
>
> But what about its effect on other operators in the same expression? I
> think this is reasonable:
>
>    a?.b + c.d === None if a is None else a.b + c.d
>
> OTOH I don't think it should affect shortcut boolean operators (and, or):
>
>    a?.b or x === (None if a is None else a.b) or x
>
> It also shouldn't escape out of comma-separated lists, argument lists, etc.:
>
>    (a?.b, x) === ((None if a is None else a.b), x)
>    f(a?.b) === f((None if a is None else a.b))
>
> Should it escape from plain parentheses? Which of these is better?
>
>    # Fails unless c overloads None+c:
>    (a?.b) + c === (None if a is None else a.b) + c
>    # Could be surprising if ? is deeply nested:
>    (a?.b) + c === None if a is None else (a.b) + c
>
It shouldn't escape beyond anything having a lower precedence.
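
The basic expansion quoted above can be emulated today with a small
helper. This is only a sketch: ``maybe_attr`` is a hypothetical name, and
unlike the proposed operator it cannot skip the rest of an attribute
chain, only the single lookup it wraps.

```python
from types import SimpleNamespace


def maybe_attr(obj, name):
    """Emulate the proposed ``obj?.name``: skip only when obj is None."""
    return None if obj is None else getattr(obj, name)


a = SimpleNamespace(b=SimpleNamespace(c="value"))
missing = SimpleNamespace(b=None)

# a.b?.c  ===  None if a.b is None else a.b.c
found = maybe_attr(a.b, "c")
skipped = maybe_attr(missing.b, "c")

# a?.b or x  ===  (None if a is None else a.b) or x
fallback = maybe_attr(None, "b") or "x"

# The test is "is None", not falsiness: 0 is falsey but still has .real
not_skipped = maybe_attr(0, "real")
```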

> Here are some more edge cases / hypergeneralizations:
>
>    # ?: skips if key is None
>    {k1?: v1, k2: v2} === {k2: v2} if k1 is None else {k1: v1, k2: v2}
>    # But what to do to skip None values?
>
> Could we give ?= a meaning in assignment, e.g. x ?= y could mean:
>
>    if y is not None:
>        x = y
>
Shouldn't that be:

     if x is not None:
         x = y

? It's the value before the '?' that's tested.

> More fun: x ?+= y could mean:
>
>    if x is None:
>        x = y
>    elif y is not None:
>        x += y
>
Or:

     if x is None:
         pass
     else:
         x += y

> You see where this is going. Downhill fast. :-)
>
Could it be used postfix:

     a +? b === None if b is None else a + b

     -?a === None if a is None else -a

or both prefix and postfix:

     a ?+? b === None if a is None or b is None else a + b

?
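
To make the two readings concrete, here is a sketch of the postfix and
two-sided forms as ordinary functions. The names ``none_aware``,
``add_right`` and ``add_both`` are hypothetical; nothing here is proposed
syntax.

```python
import operator


def none_aware(op, check_left=True, check_right=True):
    """Wrap a binary operator so that a None operand yields None."""
    def apply(a, b):
        if (check_left and a is None) or (check_right and b is None):
            return None
        return op(a, b)
    return apply


# a +? b   ===  None if b is None else a + b
add_right = none_aware(operator.add, check_left=False)

# a ?+? b  ===  None if a is None or b is None else a + b
add_both = none_aware(operator.add)
```

Note that ``add_right(None, 1)`` still raises TypeError, because only the
operand next to the ``?`` is tested.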


From srkunze at mail.de  Sat Sep 19 21:09:48 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 19 Sep 2015 21:09:48 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>	<CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>	<55FC96A8.605@lucidity.plus.com>	<D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>	<87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>	<55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <55FDB2FC.3080104@mail.de>

On 19.09.2015 14:48, Stephen J. Turnbull wrote:
> Sven R. Kunze writes:
>
>   > Issue is, None is so convenient to work with. You only find out the
>   > code smell when you discover a "NoneType object does not have
>   > attribute X"
>
> That's exactly what should happen (analogous to a "signalling NaN").

Not my point, Stephen. My point is, you better avoid None (despite its 
convenience) because you are going to have a hard time finding its 
origin later in the control flow.

Question still stands: is None really necessary to justify the 
introduction of convenience operators like "?." etc.?

> The problem is if you are using None as a proxy for a NULL in another
> subsystem that has "NULL contagion" (I prefer that to "coalescing").

How would you solve it instead?


Best,
Sven

From rymg19 at gmail.com  Sat Sep 19 21:24:19 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Sat, 19 Sep 2015 14:24:19 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FDB2FC.3080104@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FDB2FC.3080104@mail.de>
Message-ID: <5FE190D3-B6E1-4CBE-9082-366A672D9D0A@gmail.com>

I think the core issue is that, whether or not it should be used, APIs already return None values, so a convenience operator might as well be added.

On September 19, 2015 2:09:48 PM CDT, "Sven R. Kunze" <srkunze at mail.de> wrote:
>On 19.09.2015 14:48, Stephen J. Turnbull wrote:
>> Sven R. Kunze writes:
>>
>>   > Issue is, None is so convenient to work with. You only find out the
>>   > code smell when you discover a "NoneType object does not have
>>   > attribute X"
>>
>> That's exactly what should happen (analogous to a "signalling NaN").
>
>Not my point, Stephen. My point is, you better avoid None (despite its 
>convenience) because you are going to have a hard time finding its 
>origin later in the control flow.
>
>Question still stands: is None really necessary to justify the 
>introduction of convenience operators like "?." etc.?
>
>> The problem is if you are using None as a proxy for a NULL in another
>> subsystem that has "NULL contagion" (I prefer that to "coalescing").
>
>How would you solve it instead?
>
>
>Best,
>Sven

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From xavier.combelle at gmail.com  Sat Sep 19 22:03:10 2015
From: xavier.combelle at gmail.com (Xavier Combelle)
Date: Sat, 19 Sep 2015 22:03:10 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <5FE190D3-B6E1-4CBE-9082-366A672D9D0A@gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <55FDB2FC.3080104@mail.de>
 <5FE190D3-B6E1-4CBE-9082-366A672D9D0A@gmail.com>
Message-ID: <CAEQcUJQcw8CpsH-AVZ0aaqBP90yBbGA2WHU6w0bmr=bNAKpKog@mail.gmail.com>

2015-09-19 21:24 GMT+02:00 Ryan Gonzalez <rymg19 at gmail.com>:

> I think the core issue is that, whether or not it should be used, APIs
> already return None values, so a convenience operator might as well be
> added.
>
>
I'm curious which APIs return None; a major bonus of using Python is
that I have pretty much never stumbled upon the equivalent of a
NullPointerException. Moreover, I wonder whether this convenience operator
will do anything more than hide bugs.

From guido at python.org  Sat Sep 19 22:57:29 2015
From: guido at python.org (Guido van Rossum)
Date: Sat, 19 Sep 2015 13:57:29 -0700
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150919181612.GT31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
Message-ID: <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>

Thanks! I'd accept this (and I'd reject 504 at the same time). I like the
secrets name. I wonder though, should the PEP propose a specific set of
functions? (With the understanding that we might add more later.) Hopefully
someone on the peps team can commit your PEP in the repo. It's probably
going to be PEP 506.

On Sat, Sep 19, 2015 at 11:16 AM, Steven D'Aprano <steve at pearwood.info>
wrote:

> Following on to the discussions about changing the default random number
> generator, I would like to propose an alternative: adding a secrets
> module to the standard library.
>
> Attached is a draft PEP. Feedback is requested.
>
> (I'm going to only be intermittently at the keyboard for the next day or
> so, so my responses may be rather slow.)
>
>
> --
> Steve
>
>



-- 
--Guido van Rossum (python.org/~guido)

From random832 at fastmail.com  Sat Sep 19 23:39:41 2015
From: random832 at fastmail.com (Random832)
Date: Sat, 19 Sep 2015 17:39:41 -0400
Subject: [Python-ideas] Null coalescing operators
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FDB2FC.3080104@mail.de>
 <5FE190D3-B6E1-4CBE-9082-366A672D9D0A@gmail.com>
 <CAEQcUJQcw8CpsH-AVZ0aaqBP90yBbGA2WHU6w0bmr=bNAKpKog@mail.gmail.com>
Message-ID: <m2h9mqum02.fsf@fastmail.com>

Xavier Combelle
<xavier.combelle at gmail.com> writes:

> I'm curious which APIs return None; a major bonus of using Python
> is that I have pretty much never stumbled upon the equivalent of a
> NullPointerException.

It doesn't strictly have one; None is an object and you get the usual
TypeError, AttributeError, etc, upon using it in a place it's not expected.
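
A short illustration, using ``re.match`` as one example of a standard
library call that returns None (here, because the pattern does not match):

```python
import re

match = re.match(r"\d+", "abc")   # no leading digits, so this is None

try:
    match.group()                 # attribute access on None
except AttributeError as exc:
    attribute_error = str(exc)

try:
    match + 1                     # arithmetic on None
except TypeError as exc:
    type_error = str(exc)
```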


From guido at python.org  Sat Sep 19 23:47:52 2015
From: guido at python.org (Guido van Rossum)
Date: Sat, 19 Sep 2015 14:47:52 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <m2h9mqum02.fsf@fastmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FDB2FC.3080104@mail.de>
 <5FE190D3-B6E1-4CBE-9082-366A672D9D0A@gmail.com>
 <CAEQcUJQcw8CpsH-AVZ0aaqBP90yBbGA2WHU6w0bmr=bNAKpKog@mail.gmail.com>
 <m2h9mqum02.fsf@fastmail.com>
Message-ID: <CAP7+vJJz2wpg5a9JtdVxuW5qf0w4K3vOddfXu=psYaNsLEh2ag@mail.gmail.com>

On Sat, Sep 19, 2015 at 2:39 PM, Random832 <random832 at fastmail.com> wrote:

> Xavier Combelle
> <xavier.combelle at gmail.com> writes:
>
> > I'm curious which APIs return None; a major bonus of using Python
> > is that I have pretty much never stumbled upon the equivalent of a
> > NullPointerException.
>
> It doesn't strictly have one; None is an object and you get the usual
> TypeError, AttributeError, etc, upon using it in a place it's not expected.
>

Most often AttributeError. It's pretty common in large Python systems.

-- 
--Guido van Rossum (python.org/~guido)

From breamoreboy at yahoo.co.uk  Sun Sep 20 00:00:47 2015
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Sat, 19 Sep 2015 23:00:47 +0100
Subject: [Python-ideas] new format spec for iterable types
In-Reply-To: <55F03BF3.50106@trueblade.com>
References: <msmbko$ful$1@ger.gmane.org> <mspcv3$8mu$1@ger.gmane.org>
 <55F03BF3.50106@trueblade.com>
Message-ID: <mtklus$j76$1@ger.gmane.org>

On 09/09/2015 15:02, Eric V. Smith wrote:
> At some point, instead of complicating how format works internally, you
> should just write a function that does what you want. I realize there's
> a continuum between '{}'.format(iterable) and
> '{<really-really-complex-stuff}'.format(iterable). It's not clear where
> to draw the line. But when the solution is to bake knowledge of
> iterables into .format(), I think we've passed the point where we should
> switch to a function: '{}'.format(some_function(iterable)).
>
> In any event, If you want to play with this, I suggest you write
> some_function(iterable) that does what you want, first.
>
> Eric.
>

Something like this recipe from Nick Coghlan?
https://code.activestate.com/recipes/577845-format_iter-easy-formatting-of-arbitrary-iterables

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


From storchaka at gmail.com  Sun Sep 20 00:59:01 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 20 Sep 2015 01:59:01 +0300
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150919181612.GT31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
Message-ID: <mtkpbm$1gn$1@ger.gmane.org>

On 19.09.15 21:16, Steven D'Aprano wrote:
> Following on to the discussions about changing the default random number
> generator, I would like to propose an alternative: adding a secrets
> module to the standard library.

Python already has three secret modules: this and antigravity.


From rosuav at gmail.com  Sun Sep 20 01:00:25 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 09:00:25 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
Message-ID: <CAPTjJmrb6a8VdBVgowBfzP+0GGnYFyxKRimfhKqxx141mrHnOA@mail.gmail.com>

On Sun, Sep 20, 2015 at 6:57 AM, Guido van Rossum <guido at python.org> wrote:
> [in response to Steven D'Aprano's proto-PEP]
> Hopefully someone on the peps team can commit your PEP in the repo. It's
> probably going to be PEP 506.

I think that's my cue!

PEP 506 created and pushed. I've manually converted the original text
to RST, but if that was a bad idea, I can revert to text.

ChrisA

From rosuav at gmail.com  Sun Sep 20 01:02:26 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 09:02:26 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <mtkpbm$1gn$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
Message-ID: <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>

On Sun, Sep 20, 2015 at 8:59 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 19.09.15 21:16, Steven D'Aprano wrote:
>>
>> Following on to the discussions about changing the default random number
>> generator, I would like to propose an alternative: adding a secrets
>> module to the standard library.
>
>
> Python already has three secret modules: this and antigravity.

*scratches head* Is this Pythonesque counting, or is there some way to
"import and" that has escaped me?

ChrisA

From phd at phdru.name  Sun Sep 20 01:07:01 2015
From: phd at phdru.name (Oleg Broytman)
Date: Sun, 20 Sep 2015 01:07:01 +0200
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
 Library
In-Reply-To: <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
 <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
Message-ID: <20150919230701.GA19380@phdru.name>

On Sun, Sep 20, 2015 at 09:02:26AM +1000, Chris Angelico <rosuav at gmail.com> wrote:
> On Sun, Sep 20, 2015 at 8:59 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> > On 19.09.15 21:16, Steven D'Aprano wrote:
> >>
> >> Following on to the discussions about changing the default random number
> >> generator, I would like to propose an alternative: adding a secrets
> >> module to the standard library.
> >
> > Python already has three secret modules: this and antigravity.
> 
> *scratches head* Is this Pythonesque counting, or is there some way to
> "import and" that has escaped me?

   Or, BTW, I always wanted "import this or that"!

> ChrisA

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From storchaka at gmail.com  Sun Sep 20 01:07:05 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 20 Sep 2015 02:07:05 +0300
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
 <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
Message-ID: <mtkpqp$1gn$2@ger.gmane.org>

On 20.09.15 02:02, Chris Angelico wrote:
> On Sun, Sep 20, 2015 at 8:59 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> On 19.09.15 21:16, Steven D'Aprano wrote:
>>> Following on to the discussions about changing the default random number
>>> generator, I would like to propose an alternative: adding a secrets
>>> module to the standard library.
>> Python already has three secret modules: this and antigravity.
>
> *scratches head* Is this Pythonesque counting, or is there some way to
> "import and" that has escaped me?

The name of the third module is secret.



From steve at pearwood.info  Sun Sep 20 01:13:09 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 20 Sep 2015 09:13:09 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
 <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
Message-ID: <20150919231309.GU31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 09:02:26AM +1000, Chris Angelico wrote:
> On Sun, Sep 20, 2015 at 8:59 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> > On 19.09.15 21:16, Steven D'Aprano wrote:
> >>
> >> Following on to the discussions about changing the default random number
> >> generator, I would like to propose an alternative: adding a secrets
> >> module to the standard library.
> >
> >
> > Python already has three secret modules: this and antigravity.
> 
> *scratches head* Is this Pythonesque counting, or is there some way to
> "import and" that has escaped me?

We could tell you what the third secret module is, but then we'd have to 
kill you.


-- 
Steve

From tim.peters at gmail.com  Sun Sep 20 01:40:32 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 19 Sep 2015 18:40:32 -0500
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
Message-ID: <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>

[Guido]
> Thanks! I'd accept this (and I'd reject 504 at the same time). I like the
> secrets name. I wonder though, should the PEP propose a specific set of
> functions? (With the understanding that we might add more later.)

The bikeshedding on that will be far more tedious than the
implementation.  I'll get it started :-)

No attempt to be minimal here.  More-than-less "obvious" is more important:

Bound methods of a SystemRandom instance
    .randrange()
    .randint()
    .randbits()
        renamed from .getrandbits()
    .randbelow(exclusive_upper_bound)
        renamed from private ._randbelow()
    .choice()

 Token functions
    .token_bytes(nbytes)
        another name for os.urandom()
    .token_hex(nbytes)
        same, but return string of ASCII hex digits
    .token_url(nbytes)
        same, but return URL-safe base64-encoded ASCII
    .token_alpha(alphabet, nchars)
        string of `nchars` characters drawn uniformly
        from `alphabet`
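
To make the token half of that list concrete, here is a minimal sketch
of how those functions could be layered on os.urandom() and
random.SystemRandom. The names follow the list above; they are
illustrative only and not necessarily what a final module would expose,
and stripping the base64 padding is an assumption of this sketch.

```python
import base64
import binascii
import os
import random

_sysrand = random.SystemRandom()  # draws from the OS entropy source


def token_bytes(nbytes):
    """Return nbytes random bytes (another name for os.urandom())."""
    return os.urandom(nbytes)


def token_hex(nbytes):
    """Return a str of 2*nbytes ASCII hex digits."""
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_url(nbytes):
    """Return URL-safe base64 text encoding nbytes random bytes.

    Stripping the '=' padding is an assumption of this sketch.
    """
    encoded = base64.urlsafe_b64encode(token_bytes(nbytes))
    return encoded.rstrip(b"=").decode("ascii")


def token_alpha(alphabet, nchars):
    """Return nchars characters drawn uniformly from alphabet (a str)."""
    return "".join(_sysrand.choice(alphabet) for _ in range(nchars))
```

In this sketch every function except token_bytes returns text, and
token_hex(4) yields eight hex digits, i.e. four bytes of entropy.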

From gokoproject at gmail.com  Sun Sep 20 01:50:55 2015
From: gokoproject at gmail.com (John Wong)
Date: Sat, 19 Sep 2015 19:50:55 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <mtkpqp$1gn$2@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
 <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
 <mtkpqp$1gn$2@ger.gmane.org>
Message-ID: <CACCLA55JtEEfsUBhWJ5f7deY1XeuT1Lsz9LvXS9f4-DW3P0Tmw@mail.gmail.com>

On Sat, Sep 19, 2015 at 7:07 PM, Serhiy Storchaka <storchaka at gmail.com>
wrote:

> On 20.09.15 02:02, Chris Angelico wrote:
>
>> On Sun, Sep 20, 2015 at 8:59 AM, Serhiy Storchaka <storchaka at gmail.com>
>> wrote:
>>
>>> On 19.09.15 21:16, Steven D'Aprano wrote:
>>>
>>>> Following on to the discussions about changing the default random number
>>>> generator, I would like to propose an alternative: adding a secrets
>>>> module to the standard library.
>>>>
>>> Python already has three secret modules: this and antigravity.
>>>
>>
>> *scratches head* Is this Pythonesque counting, or is there some way to
>> "import and" that has escaped me?
>>
>
> The name of the third module is secret.
>
>
Is it "secret", or is it a secret? Why not "secure"?

From abarnert at yahoo.com  Sun Sep 20 02:11:31 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 19 Sep 2015 17:11:31 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJJz2wpg5a9JtdVxuW5qf0w4K3vOddfXu=psYaNsLEh2ag@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FDB2FC.3080104@mail.de>
 <5FE190D3-B6E1-4CBE-9082-366A672D9D0A@gmail.com>
 <CAEQcUJQcw8CpsH-AVZ0aaqBP90yBbGA2WHU6w0bmr=bNAKpKog@mail.gmail.com>
 <m2h9mqum02.fsf@fastmail.com>
 <CAP7+vJJz2wpg5a9JtdVxuW5qf0w4K3vOddfXu=psYaNsLEh2ag@mail.gmail.com>
Message-ID: <6F3E8475-5D26-4D9A-9AD9-7F463E91CBFF@yahoo.com>

On Sep 19, 2015, at 14:47, Guido van Rossum <guido at python.org> wrote:
> 
>> On Sat, Sep 19, 2015 at 2:39 PM, Random832 <random832 at fastmail.com> wrote:
>> Xavier Combelle
>> <xavier.combelle at gmail.com> writes:
>> 
>> > I'm curious which APIs return None; a major bonus of using Python
>> > is that I have pretty much never stumbled upon the equivalent of
>> > NullPointerException.
>> 
>> It doesn't strictly have one; None is an object and you get the usual
>> TypeError, AttributeError, etc, upon using it in a place it's not expected.
> 
> Most often AttributeError. It's pretty common in large Python systems.

The TypeErrors usually come from novices. There are many StackOverflow questions asking why they can't add spam.get_text() + "\n" where they don't show you the implementation of get_text, or the exception they got, but you just know they forgot a return statement at the end and the exception was a TypeError about adding NoneType and str.
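
A minimal reproduction of that pattern, as a sketch: the Spam class and its get_text body are hypothetical stand-ins for the code such questions leave out. It also shows the AttributeError flavour mentioned upthread.

```python
class Spam:
    def __init__(self, text):
        self.text = text

    def get_text(self):
        # Bug: the value is computed but never returned, so callers get None.
        self.text.strip()


spam = Spam("  eggs  ")

try:
    line = spam.get_text() + "\n"      # None + str
except TypeError as exc:
    print("TypeError:", exc)

try:
    spam.get_text().upper()            # attribute lookup on None
except AttributeError as exc:
    print("AttributeError:", exc)
```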


From random832 at fastmail.com  Sun Sep 20 02:13:08 2015
From: random832 at fastmail.com (Random832)
Date: Sat, 19 Sep 2015 20:13:08 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <mtkpbm$1gn$1@ger.gmane.org> (Serhiy Storchaka's message of "Sun, 
 20 Sep 2015 01:59:01 +0300")
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
Message-ID: <m24miq6j8r.fsf@fastmail.com>

Serhiy Storchaka <storchaka at gmail.com> writes:
> Python already has three secret modules: this and antigravity.

Even its secrets have secrets. Show of hands, who here knew about
antigravity.geohash?

From rosuav at gmail.com  Sun Sep 20 02:14:02 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 10:14:02 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
Message-ID: <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>

On Sun, Sep 20, 2015 at 9:40 AM, Tim Peters <tim.peters at gmail.com> wrote:
>  Token functions
>     .token_bytes(nbytes)
>         another name for os.urandom()
>     .token_hex(nbytes)
>         same, but return string of ASCII hex digits
>     .token_url(nbytes)
>         same, but return URL-safe base64-encoded ASCII
>     .token_alpha(alphabet, nchars)
>         string of `nchars` characters drawn uniformly
>         from `alphabet`

token_bytes "obviously" should return a bytes, and token_alpha equally
obviously should be returning a str. (Or maybe it should return the
same type as alphabet, which could be either?) What about the other
two? Also, if you ask for 4 bytes from token_hex, do you get 4 hex
digits or 8 (four bytes of entropy)?

ChrisA

From rosuav at gmail.com  Sun Sep 20 02:15:09 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 10:15:09 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <m24miq6j8r.fsf@fastmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org> <m24miq6j8r.fsf@fastmail.com>
Message-ID: <CAPTjJmre3kTSnz1pNN8W4Ar-UOAsMoVxxviQ70dvGxnv=Lm96g@mail.gmail.com>

On Sun, Sep 20, 2015 at 10:13 AM, Random832 <random832 at fastmail.com> wrote:
> Serhiy Storchaka <storchaka at gmail.com> writes:
>> Python already has three secret modules: this and antigravity.
>
> Even its secrets have secrets. Show of hands, who here knew about
> antigravity.geohash?

I did, but I'm an XKCD wonk.

ChrisA

From bussonniermatthias at gmail.com  Sun Sep 20 02:16:39 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Sat, 19 Sep 2015 17:16:39 -0700
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CACCLA55JtEEfsUBhWJ5f7deY1XeuT1Lsz9LvXS9f4-DW3P0Tmw@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <mtkpbm$1gn$1@ger.gmane.org>
 <CAPTjJmrg0v22ELemncDE83YEmAXutT=-8gzDDxm8Ov6rO0rUkg@mail.gmail.com>
 <mtkpqp$1gn$2@ger.gmane.org>
 <CACCLA55JtEEfsUBhWJ5f7deY1XeuT1Lsz9LvXS9f4-DW3P0Tmw@mail.gmail.com>
Message-ID: <CANJQusX48-2e0Y54OGh5ZNxVjYG0M90my7nxsHUcJebTQFzstA@mail.gmail.com>

You forgot:

from __future__ import braces

-- 
M

On Sat, Sep 19, 2015 at 4:50 PM, John Wong <gokoproject at gmail.com> wrote:
>
>
> On Sat, Sep 19, 2015 at 7:07 PM, Serhiy Storchaka <storchaka at gmail.com>
> wrote:
>>
>> On 20.09.15 02:02, Chris Angelico wrote:
>>>
>>> On Sun, Sep 20, 2015 at 8:59 AM, Serhiy Storchaka <storchaka at gmail.com>
>>> wrote:
>>>>
>>>> On 19.09.15 21:16, Steven D'Aprano wrote:
>>>>>
>>>>> Following on to the discussions about changing the default random
>>>>> number
>>>>> generator, I would like to propose an alternative: adding a secrets
>>>>> module to the standard library.
>>>>
>>>> Python already has three secret modules: this and antigravity.
>>>
>>>
>>> *scratches head* Is this Pythonesque counting, or is there some way to
>>> "import and" that has escaped me?
>>
>>
>> The name of the third module is secret.
>>
>
> is "secret" or is a secret? Why not "secure"?
>

From tim.peters at gmail.com  Sun Sep 20 02:19:27 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 19 Sep 2015 19:19:27 -0500
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
Message-ID: <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>

[Tim Peters]
>>  Token functions
>>     .token_bytes(nbytes)
>>         another name for os.urandom()
>>     .token_hex(nbytes)
>>         same, but return string of ASCII hex digits
>>     .token_url(nbytes)
>>         same, but return URL-safe base64-encoded ASCII
>>     .token_alpha(alphabet, nchars)
>>         string of `nchars` characters drawn uniformly
>>         from `alphabet`

[Chris Angelico <rosuav at gmail.com>]
> token_bytes "obviously" should return a bytes,

Which os.urandom() does in Python 3.  I'm not writing docs, just
suggesting the functions.

> and token_alpha equally obviously should be returning a str.

Which part of "string" doesn't suggest "str"?

> (Or maybe it should return the same type as alphabet, which
> could be either?)
>
> What about the other two?

Which part of "ASCII" is ambiguous?

> Also, if you ask for 4 bytes from token_hex, do you get 4 hex
> digits or 8 (four bytes of entropy)?

And which part of "same"?  ;-)

Bikeshed away.  I'm outta this now ;-)

From rosuav at gmail.com  Sun Sep 20 02:27:42 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 10:27:42 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
Message-ID: <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>

On Sun, Sep 20, 2015 at 10:19 AM, Tim Peters <tim.peters at gmail.com> wrote:
> [Chris Angelico <rosuav at gmail.com>]
>> token_bytes "obviously" should return a bytes,
>
> Which os.urandom() does in Python 3.  I'm not writing docs, just
> suggesting the functions.
>
>> and token_alpha equally obviously should be returning a str.
>
> Which part of "string" doesn't suggest "str"?
>
>> (Or maybe it should return the same type as alphabet, which
>> could be either?)
>>
>> What about the other two?
>
> Which part of "ASCII" is ambiguous?
>
>> Also, if you ask for 4 bytes from token_hex, do you get 4 hex
>> digits or 8 (four bytes of entropy)?
>
> And which part of "same"?  ;-)
>
> Bikeshed away.  I'm outta this now ;-)

Heh :)

My personal preference for shed colour: token_bytes returns a
bytestring, its length being the number provided. All the others
return Unicode strings, their lengths again being the number provided.
So they're all text bar the one that explicitly says it's in bytes.

But I'm aware others may disagree, and while "ASCII" might not be
ambiguous, Py3 does still distinguish between b"asdf" and u"asdf" :)
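
To illustrate that distinction with calls that exist today (this is
only a demonstration of current stdlib behaviour, not a proposed
implementation of any of the token functions):

```python
import binascii
import os

raw = os.urandom(4)                  # 4 bytes of entropy, type bytes
as_bytes = binascii.hexlify(raw)     # 8 ASCII hex digits, still type bytes
as_text = as_bytes.decode("ascii")   # the same 8 digits as a str

print(len(raw), len(as_bytes), len(as_text))   # 4 8 8
print(as_bytes == as_text)                     # False: bytes never equal str in Py3
```

So "ASCII hex digits" still leaves open whether the caller gets bytes
or str, and whether the argument counts bytes of entropy or characters
of output.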

ChrisA

From stephen at xemacs.org  Sun Sep 20 06:45:36 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 20 Sep 2015 13:45:36 +0900
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The
	Standard	Library
In-Reply-To: <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
Message-ID: <871tdtvgun.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Angelico writes:

 > My personal preference for shed colour: token_bytes returns a
 > bytestring, its length being the number provided. All the others
 > return Unicode strings, their lengths again being the number provided.
 > So they're all text bar the one that explicitly says it's in bytes.

I think that token_url may need a bytes mode, for the same reasons
that bytes needs __mod__: such tokens will often be created and parsed
by programs that never leave the "ASCII-compatible bytes" world.


From storchaka at gmail.com  Sun Sep 20 08:00:08 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 20 Sep 2015 09:00:08 +0300
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
Message-ID: <mtli19$u3d$1@ger.gmane.org>

On 20.09.15 02:40, Tim Peters wrote:
> No attempt to be minimal here.  More-than-less "obvious" is more important:
>
> Bound methods of a SystemRandom instance
>      .randrange()
>      .randint()
>      .randbits()
>          renamed from .getrandbits()
>      .randbelow(exclusive_upper_bound)
>          renamed from private ._randbelow()
>      .choice()

randbelow() is just an alias for randrange() with a single argument.
randint(a, b) == randrange(a, b+1).

These functions are redundant and they have non-zero cost.
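
A quick way to see the second equivalence on CPython, as a sketch: it
uses seeded random.Random instances only so that the two streams can
be compared, which is of course not how secrets should be generated.

```python
import random

# Two generators with the same seed produce the same underlying stream.
rng_a = random.Random(12345)
rng_b = random.Random(12345)

via_randint = [rng_a.randint(1, 6) for _ in range(20)]
via_randrange = [rng_b.randrange(1, 7) for _ in range(20)]

print(via_randint == via_randrange)   # True
```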

Wouldn't renaming getrandbits be confusing?

>   Token functions
>      .token_bytes(nbytes)
>          another name for os.urandom()
>      .token_hex(nbytes)
>          same, but return string of ASCII hex digits
>      .token_url(nbytes)
>          same, but return URL-safe base64-encoded ASCII
>      .token_alpha(alphabet, nchars)
>          string of `nchars` characters drawn uniformly
>          from `alphabet`

token_hex(nbytes) == token_alpha('0123456789abcdef', nchars) ?
token_url(nbytes) == token_alpha(
     'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
      nchars) ?
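
One caveat on the second equivalence, as a sketch using only base64
from the stdlib. The hex case matches with nchars == 2*nbytes, but
base64 output is uniform over the 64-character alphabet only when
nbytes is a multiple of 3; otherwise the final character carries fewer
than 6 random bits. The helper name url_token_from is hypothetical,
and stripping the '=' padding is an assumption.

```python
import base64


def url_token_from(raw):
    """URL-safe base64 of raw bytes with '=' padding stripped (an assumption)."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Enumerate every possible single byte instead of sampling randomly.
last_chars = {url_token_from(bytes([value]))[-1] for value in range(256)}
print(sorted(last_chars))   # ['A', 'Q', 'g', 'w']
```

With one byte of input the second character can take only 4 of the 64
values, so the two spellings agree in distribution only for suitable
nbytes.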



From storchaka at gmail.com  Sun Sep 20 08:10:32 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 20 Sep 2015 09:10:32 +0300
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
Message-ID: <mtliko$632$1@ger.gmane.org>

On 19.09.15 07:21, Guido van Rossum wrote:
> I do, but at least the '?' is part of an operator, not part of the name
> (as it is in Ruby?).

What to do with the "in" operator?



From random832 at fastmail.com  Sun Sep 20 08:45:31 2015
From: random832 at fastmail.com (Random832)
Date: Sun, 20 Sep 2015 02:45:31 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The
	Standard	Library
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
 <871tdtvgun.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <m2bncx4mic.fsf@fastmail.com>

"Stephen J. Turnbull" <stephen at xemacs.org>
writes:

> Chris Angelico writes:
>
>  > My personal preference for shed colour: token_bytes returns a
>  > bytestring, its length being the number provided. All the others
>  > return Unicode strings, their lengths again being the number provided.
>  > So they're all text bar the one that explicitly says it's in bytes.
>
> I think that token_url may need a bytes mode, for the same reasons
> that bytes needs __mod__: such tokens will often be created and parsed
> by programs that never leave the "ASCII-compatible bytes" world.

For token_alpha the obvious answer is to return the same type as
alphabet, which there's no reason not to allow to be either.


From random832 at fastmail.com  Sun Sep 20 08:46:33 2015
From: random832 at fastmail.com (Random832)
Date: Sun, 20 Sep 2015 02:46:33 -0400
Subject: [Python-ideas] Null coalescing operators
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
Message-ID: <m27fnl4mgm.fsf@fastmail.com>

Serhiy Storchaka <storchaka at gmail.com>
writes:

> On 19.09.15 07:21, Guido van Rossum wrote:
>> I do, but at least the '?' is part of an operator, not part of the name
>> (as it is in Ruby?).
>
> What to do with the "in" operator?

This is one of those things where we've got to decide which side it
applies to. None can be in a list, but not a string. And nothing can be
in None.


From storchaka at gmail.com  Sun Sep 20 09:28:03 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 20 Sep 2015 10:28:03 +0300
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <m27fnl4mgm.fsf@fastmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <m27fnl4mgm.fsf@fastmail.com>
Message-ID: <mtln63$v89$1@ger.gmane.org>

On 20.09.15 09:46, Random832 wrote:
> Serhiy Storchaka <storchaka at gmail.com>
> writes:
>> On 19.09.15 07:21, Guido van Rossum wrote:
>>> I do, but at least the '?' is part of an operator, not part of the name
>>> (as it is in Ruby?).
>> What to do with the "in" operator?
> This is one of those things where we've got to decide which side it
> applies to. None can be in a list, but not a string. And nothing can be
> in None.

All operators are either identifiers ("in", "is", "not") or 
nonalphabetic, never a mix. There is no "in=" operator, and I guess 
there shouldn't be a "?in" either.



From steve at pearwood.info  Sun Sep 20 09:31:58 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 20 Sep 2015 17:31:58 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtliko$632$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
Message-ID: <20150920073157.GV31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 09:10:32AM +0300, Serhiy Storchaka wrote:
> On 19.09.15 07:21, Guido van Rossum wrote:
> >I do, but at least the '?' is part of an operator, not part of the name
> >(as it is in Ruby?).
> 
> What to do with the "in" operator?

Absolutely nothing.

I'm not convinced that we should generalise this beyond the three 
original examples of attribute access, item lookup and function call. I 
think that applying ? to arbitrary operators is a case of "YAGNI". Or 
perhaps, "You Shouldn't Need It".

Mark's original motivating use-case strikes me as both common and 
unexceptional. We might write:

# spam may be None, or some object
result = spam and spam.attr
# better, as it doesn't misbehave when spam is falsey
result = None if spam is None else spam.attr


and it seems reasonable to me to want a short-cut for that use-case. But 
the generalisations to arbitrary operators suggested by Guido strike me 
as too much, too far. As he says, going downhill, and quickly.
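
To spell out the difference between the short-circuit idiom and the 
explicit None test, here is a small sketch. The Spam class is 
hypothetical; it exists only to provide an object that is falsey 
without being None.

```python
class Spam:
    """Hypothetical container that is falsey when empty."""

    def __init__(self, items):
        self.items = items
        self.attr = "some value"

    def __len__(self):
        return len(self.items)


def short_circuit(spam):
    return spam and spam.attr


def explicit(spam):
    return None if spam is None else spam.attr


empty = Spam([])
print(short_circuit(None), explicit(None))             # None None
print(short_circuit(empty) is empty, explicit(empty))  # True some value
```

The short-circuit form hands back the empty Spam itself rather than 
its attribute, which is the misbehaviour the explicit test avoids.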

Consider these two hypotheticals:

spam ?+ eggs
# None if spam is None or eggs is None else spam + eggs

needle ?in haystack
# None if needle is None or haystack is None else needle in haystack

Briefer (more concise) is not necessarily better. At the point you have 
*two* objects in the one term that both need to be checked for None, 
that is in my opinion a code smell and we shouldn't provide a short-cut 
disguising that.

Technically, x.y x[y] and x(y) aren't operators, but for the sake of 
convenience I'll call them such. Even though these are binary operators, 
the ? only shortcuts according to the x, not the y. So we can call 
these ?. ?[] ?() operators "pseudo-unary" operators rather than binary 
operators.

Are there any actual unary operators we might want to apply this 
uptalk/shrug operator to? There are (if I remember correctly) only three 
unary operators: + - and ~. I don't think there are any reasonable 
use-cases for writing (say):

    value = ?-x

that justifies making this short-cut available.

So as far as I am concerned, the following conditions should apply:

- the uptalk/shrug ? "operator" should not apply to actual binary 
  operators where both operands need to be checked for None-ness 
  (e.g. arithmetic operators, comparison operators)

- it should not apply to arithmetic unary operators + - and ~

- it might apply to pseudo-operators where only the lefthand 
  argument is checked for None-ness, that is, x.y x[y] and 
  x(y), written as x?.y x?[y] and x?(y).
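
For comparison, the three pseudo-operators in that last condition can 
be approximated today with plain functions. This is only a sketch and 
the helper names are made up; note that, unlike a real short-circuit 
operator, a function call evaluates its arguments even when the 
lefthand object turns out to be None.

```python
def maybe_attr(obj, name):
    """Rough stand-in for obj?.name (name passed as a string)."""
    return None if obj is None else getattr(obj, name)


def maybe_item(obj, key):
    """Rough stand-in for obj?[key]."""
    return None if obj is None else obj[key]


def maybe_call(func, *args, **kwargs):
    """Rough stand-in for func?(*args, **kwargs)."""
    return None if func is None else func(*args, **kwargs)


print(maybe_attr(None, "real"), maybe_attr(3 + 4j, "real"))    # None 3.0
print(maybe_item(None, "a"), maybe_item({"a": 1}, "a"))        # None 1
print(maybe_call(None, "abc"), maybe_call(len, "abc"))         # None 3
```

In each case only the lefthand object is tested for None-ness; the 
name, key or arguments are passed through untouched.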


If I had to choose between generalising this to all operators, or not 
having it at all, I'd rather not have it at all. A little bit of uptalk 
goes a long way; once we have ? appearing all over the place in all 
sorts of expressions, I think it's too much.

-- 
Steve

From steve at pearwood.info  Sun Sep 20 09:34:23 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 20 Sep 2015 17:34:23 +1000
Subject: [Python-ideas] [Python-Dev] Make stacklevel=2 by default in
	warnings.warn()
In-Reply-To: <CAPJVwBmftOk7Tu14a=XuBX638V-X7BFheRMx9EY1yGaJvpY4wQ@mail.gmail.com>
References: <mtlkl6$ur9$1@ger.gmane.org>
 <CAPJVwBmftOk7Tu14a=XuBX638V-X7BFheRMx9EY1yGaJvpY4wQ@mail.gmail.com>
Message-ID: <20150920073423.GW31152@ando.pearwood.info>

On Sat, Sep 19, 2015 at 11:55:44PM -0700, Nathaniel Smith wrote:

> I don't have enough fingers to count how many times I've had to
> explain how stacklevel= works to maintainers of widely-used packages
> -- they had no idea that this was even a thing they were getting
> wrong.

Count me in that. I had no idea it was even a thing.
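
For anyone else in the same boat, a minimal sketch of what the 
parameter does, using only the documented warnings API. The function 
names old_api and caller are hypothetical.

```python
import warnings


def old_api():
    # stacklevel=2 attributes the warning to whoever called old_api(),
    # rather than to this line inside the library.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)


def caller():
    old_api()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    caller()

print(caught[0].category.__name__, caught[0].lineno)
```

The recorded line number is that of the old_api() call inside 
caller(); with the default stacklevel=1 it would point at the 
warnings.warn() line instead.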


-- 
Steve

From bruce at leban.us  Sun Sep 20 09:50:09 2015
From: bruce at leban.us (Bruce Leban)
Date: Sun, 20 Sep 2015 00:50:09 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC96A8.605@lucidity.plus.com>
 <D01DA4B6-9C67-4292-A98D-B18C98441177@yahoo.com>
 <87d1xfyoqn.fsf@uwakimon.sk.tsukuba.ac.jp> <55FD2F51.6020303@mail.de>
 <8761368thn.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAP7+vJLJrnBeXmctXWZ-L-KoxWNn00c2ajmdgGJXmXDqqievcw@mail.gmail.com>
Message-ID: <CAGu0Anss0a63ra0BoEGRFgYD8_6AuNCQBFVKoy+FZkpRWYK8_Q@mail.gmail.com>

On Sat, Sep 19, 2015 at 9:21 AM, Guido van Rossum <guido at python.org> wrote:

> I forgot to think about the scope of the uptalk operator (i.e. what is
> skipped when it finds a None). There are some clear cases (the actual
> implementation should avoid double evaluation of the tested expression, of
> course):
>
>   a.b?.c.d[x, y](p, q) === None if a.b is None else a.b.c.d[x, y](p, q)
>   a.b?[x, y].c.d(p, q) === None if a.b is None else a.b[x, y].c.d(p, q)
>   a.b?(p, q).c.d[x, y] === None if a.b is None else a.b(p, q).c.d[x, y]
>
> This makes sense to me.


> But what about its effect on other operators in the same expression? I
> think this is reasonable:
>
>   a?.b + c.d === None if a is None else a.b + c.d
>

This is a bit weird to me. Essentially ?. takes precedence over a following
+. But would you also expect it to take precedence over a preceding one as
well? That's inconsistent.

c.d + a?.b === None if a is None else c.d + a.b
or
c.d + a?.b === c.d + None if a is None else c.d + a.b

I think that ?. ?[] and ?() should affect other operators at the same
precedence level only, i.e.,  each other and . [] and (). This seems the
most logical to me. And I just looked up the C# documentation on MSDN and
it does the same thing:
https://msdn.microsoft.com/en-us/library/dn986595.aspx


> It also shouldn't escape out of comma-separated lists, argument lists,
> etc.:
>
>   (a?.b, x) === ((None if a is None else a.b), x)
>   f(a?.b) === f((None if a is None else a.b))
>

Agree. It also should not escape grouping parentheses even though that
might not be useful. It would be very weird if a parenthesized expression
did something other than evaluate the expression inside it, period.

(a?.b).c === None.c if a is None else (a.b).c === temp = a?.b; temp.c
(x or a?.b).c === (x or (None if a is None else a.b)).c

Yes, None.c is going to raise an exception. That's better than just getting
None IMHO.
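
The two readings of "c.d + a?.b" can be compared in current Python
with a rough sketch. The helper safe_attr and the SimpleNamespace
stand-ins are illustrative names only (nothing here is proposed
syntax or an existing API); safe_attr(a, 'b') plays the role of a?.b.

```python
from types import SimpleNamespace


def safe_attr(obj, name):
    """Stand-in for obj?.name: None if obj is None, else the attribute."""
    return None if obj is None else getattr(obj, name)


c = SimpleNamespace(d=1)


def wide_scope(a):
    # Reading 1: the None check swallows the whole expression.
    return None if a is None else c.d + a.b


def narrow_scope(a):
    # Reading 2: ?. only interacts with ".", "[]" and "()",
    # so the "+" still sees whatever a?.b produced.
    return c.d + safe_attr(a, 'b')


print(wide_scope(SimpleNamespace(b=2)))    # 3
print(narrow_scope(SimpleNamespace(b=2)))  # 3
print(wide_scope(None))                    # None
try:
    narrow_scope(None)
except TypeError as exc:
    # 1 + None is an error, just as None.c would be.
    print("TypeError:", exc)
```

Under the narrow reading the failure is loud, which matches the
preference for an exception over a silently propagated None.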

--- Bruce
Check out my new puzzle book: http://J.mp/ingToConclusions
Get it free here: http://J.mp/ingToConclusionsFree (available on iOS)

From abarnert at yahoo.com  Sun Sep 20 11:28:53 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 20 Sep 2015 02:28:53 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150920073157.GV31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
Message-ID: <6CD00D68-2B91-4318-9301-740800A6EC0B@yahoo.com>

On Sep 20, 2015, at 00:31, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Sun, Sep 20, 2015 at 09:10:32AM +0300, Serhiy Storchaka wrote:
>>> On 19.09.15 07:21, Guido van Rossum wrote:
>>> I do, but at least the '?' is part of an operator, not part of the name
>>> (as it is in Ruby?).
>> 
>> What to do with the "in" operator?
> 
> Absolutely nothing.
> 
> I'm not convinced that we should generalise this beyond the three 
> original examples of attribute access, item lookup and function call. I 
> think that applying ? to arbitrary operators is a case of "YAGNI". Or 
> perhaps, "You Shouldn't Need It".

I agree. Seeing how far you can generalize something and whether you can come up with a simple rule that makes all of your use cases follow naturally can be fun, but it isn't necessarily the best design.

Also, by not trying to generalize uptalk-combined operators (or uptalk as a postfix unary operator of its own, which I was earlier arguing for...), the question of how we deal with ?? or ?= (if we want them) can be "the same way every other language does", rather than seeing what follows from the general rule and then convincing ourselves that's what we wanted.

Also, I think trying to generalize to all operators is a false generalization, since the things we're generalizing from aren't actually operators (and not just syntactically--e.g., stylistically, they're never surrounded by spaces--which makes a pretty big difference in the readability impact of a character as heavy as "?") in the first place.

Personally, I think ?? is the second most obviously useful after ?. (there's a reason it's the one with the oldest and widest pedigree); we need ?() because Python, unlike C# and friends, unifies member and method access; ?[] doesn't seem as necessary but it's such an obvious parallel to ?() that I think people will expect it; ?= is potentially as confusing as it is helpful. So, my suggestion would be just the first four. And keeping them simple, and consistent with other languages, no trying to extend the protection to other operators/accesses, no extra short-circuiting, nothing. So:

    spam ?? eggs === spam if spam is not None else eggs
    spam?.eggs === spam.eggs if spam is not None else None
    spam?(eggs) === spam(eggs) if spam is not None else None
    spam?[eggs] === spam[eggs] if spam is not None else None

That's easy to define, easy to learn and remember, and pretty consistent with other languages. The one big difference is that what you write as "spam?.eggs(cheese)" in C# has to be "spam?.eggs?(cheese)" in Python, but I don't think that's a big problem. After all, in Python, spam.eggs is a first-class object, and one that's commonly passed or stored, so the obvious way to look at "spam.eggs(cheese)" is as explicitly chaining two separate things together (a __getattr__ with a descriptor __get__, and a __call__), so why shouldn't uptalking both operations be explicit?
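
As a rough illustration of those four definitions, here they are as plain functions in today's Python. The names coalesce, maybe_attr, maybe_call and maybe_item are made up for this sketch; the one semantic difference is that a function evaluates all of its arguments eagerly, where a real operator would not.

```python
def coalesce(spam, eggs):
    """spam ?? eggs (eggs is evaluated eagerly here, unlike an operator)."""
    return spam if spam is not None else eggs


def maybe_attr(spam, name):
    """spam?.name"""
    return getattr(spam, name) if spam is not None else None


def maybe_call(spam, *args, **kwargs):
    """spam?(args)"""
    return spam(*args, **kwargs) if spam is not None else None


def maybe_item(spam, key):
    """spam?[key]"""
    return spam[key] if spam is not None else None


# spam?.eggs?(cheese): the lookup and the call are guarded separately.
text = "cheese"
print(maybe_call(maybe_attr(text, "upper")))  # CHEESE
print(maybe_call(maybe_attr(None, "upper")))  # None
print(coalesce(None, "default"))              # default
print(coalesce(0, "default"))                 # 0 (only None is replaced)
print(maybe_item(None, 3))                    # None
```

The nested maybe_call(maybe_attr(...)) line is the function-call spelling of "spam?.eggs?(cheese)": each step is guarded on its own, with no extra short-circuiting across the chain.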


From rosuav at gmail.com  Sun Sep 20 11:38:18 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 20 Sep 2015 19:38:18 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150920073157.GV31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
Message-ID: <CAPTjJmq5C7WxJBwUiQwT4_ZDpaV7n924O5iXG7=Y+reBrGS_kA@mail.gmail.com>

On Sun, Sep 20, 2015 at 5:31 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> Technically, x.y x[y] and x(y) aren't operators, but for the sake of
> convenience I'll call them such. Even though these are binary operators,
> the ? only shortcuts according to the x, not the y. So we can call
> these ?. ?[] ?() operators "pseudo-unary" operators rather than binary
> operators.

That's how all Python's short-circuiting works - based on the value of
what's on the left, decide whether or not to evaluate what's on the
right. (Well, nearly all - if/else evaluates the middle first, but
same difference.) This is another form of short-circuiting; "x?[y]"
evaluates x, then if that's None, doesn't bother evaluating y because
it can't affect the result.
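
A small sketch of that, using made-up helpers (noisy, maybe_item) to
record which operands actually get evaluated. The lambda stands in
for the laziness a real operator would have; it is not proposed
syntax.

```python
def noisy(label, value, log):
    """Record that an operand was evaluated, then return its value."""
    log.append(label)
    return value


# Existing short-circuiting: "and" never evaluates the right operand
# when the left one is falsy.
log = []
result = noisy("left", None, log) and noisy("right", 42, log)
print(result, log)   # None ['left']


def maybe_item(x, make_key):
    """Sketch of x?[y]: make_key() is only called when x is not None."""
    return None if x is None else x[make_key()]


log2 = []
print(maybe_item(None, lambda: noisy("key", 0, log2)), log2)      # None []

log3 = []
print(maybe_item([10, 20], lambda: noisy("key", 1, log3)), log3)  # 20 ['key']
```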

ChrisA

From p.f.moore at gmail.com  Sun Sep 20 12:56:06 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 20 Sep 2015 11:56:06 +0100
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
Message-ID: <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>

On 20 September 2015 at 00:40, Tim Peters <tim.peters at gmail.com> wrote:
> [Guido]
>> Thanks! I'd accept this (and I'd reject 504 at the same time). I like the
>> secrets name. I wonder though, should the PEP propose a specific set of
>> functions? (With the understanding that we might add more later.)
>
> The bikeshedding on that will be far more tedious than the
> implementation.  I'll get it started :-)
>
> No attempt to be minimal here.  More-than-less "obvious" is more important:
>
> Bound methods of a SystemRandom instance
>     .randrange()
>     .randint()
>     .randbits()
>         renamed from .getrandbits()
>     .randbelow(exclusive_upper_bound)
>         renamed from private ._randbelow()
>     .choice()
>
>  Token functions
>     .token_bytes(nbytes)
>         another name for os.urandom()
>     .token_hex(nbytes)
>         same, but return string of ASCII hex digits
>     .token_url(nbytes)
>         same, but return URL-safe base64-encoded ASCII
>     .token_alpha(alphabet, nchars)
>         string of `nchars` characters drawn uniformly
>         from `alphabet`

Given where this started, I'd suggest renaming token_alpha as
"password". Beginners wouldn't necessarily associate the term "token"
with the problem "I want to generate a random password" [1]. Maybe add
a short recipe showing how to meet constraints like "at least 2
digits" by simply generating repeatedly until a valid password is
found.

For a bit of extra bikeshedding, I'd make alphabet the second,
optional, parameter and default it to
string.ascii_letters+string.digits+string.punctuation, as that's often
what password constraints require.
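
A minimal sketch of that recipe, assuming random.SystemRandom as the
source of randomness (the secrets module is still only a proposal).
The names password and password_with_digits, and the min_digits
parameter, are invented for illustration.

```python
import random
import string

_sysrand = random.SystemRandom()
DEFAULT_ALPHABET = string.ascii_letters + string.digits + string.punctuation


def password(nchars, alphabet=DEFAULT_ALPHABET):
    """Return nchars characters drawn uniformly from alphabet."""
    return ''.join(_sysrand.choice(alphabet) for _ in range(nchars))


def password_with_digits(nchars, min_digits=2, alphabet=DEFAULT_ALPHABET):
    """Generate repeatedly until the result has at least min_digits digits."""
    has_digit = any(ch.isdigit() for ch in alphabet)
    if min_digits > nchars or (min_digits > 0 and not has_digit):
        raise ValueError("constraint can never be satisfied")
    while True:
        candidate = password(nchars, alphabet)
        if sum(ch.isdigit() for ch in candidate) >= min_digits:
            return candidate


print(password_with_digits(12))
```

Rejecting and regenerating whole candidates keeps every valid
password equally likely, which patching digits into fixed positions
would not.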

Or at the very least, document how to use the module functions for the
common tasks we see people getting wrong. But I thought the idea here
was to make doing things the right way obvious, for people who don't
read documentation, so I'd prefer to see the functions exposed by the
module named based on the problems they solve, not on the features
they provide. (Even if that involves a little duplication, and/or a
split between "high level" and "low level" APIs).

Paul.

[1] I'd written a spec for password() before I spotted that it was the
same as token_alpha :-(

From p.f.moore at gmail.com  Sun Sep 20 13:05:52 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 20 Sep 2015 12:05:52 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150920073157.GV31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
Message-ID: <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>

On 20 September 2015 at 08:31, Steven D'Aprano <steve at pearwood.info> wrote:
> I'm not convinced that we should generalise this beyond the three
> original examples of attribute access, item lookup and function call. I
> think that applying ? to arbitrary operators is a case of "YAGNI". Or
> perhaps, "You Shouldn't Need It".

Agreed.

Does this need to be an operator? How about the following:

    class Maybe:
        def __getattr__(self, attr): return None
        def __getitem__(self, idx): return None
        def __call__(self, *args, **kw): return None

    def maybe(obj):
        return Maybe() if obj is None else obj

    attr = maybe(obj).spam
    elt = maybe(obj)[n]
    result = maybe(callback)(args)

The Maybe class could be hidden, and the Maybe() object a singleton
(making my poor naming a non-issue :-)) and if it's felt sufficiently
useful, the maybe() function could be a builtin.

Usage of the result of maybe() outside of the above 3 contexts should
simply be "not supported" - don't worry about trying to stop people
doing weird things, just make it clear that the intent is only to
support the 3 given idiomatic usages.
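
A sketch of the hidden-class, singleton variant, in case it helps the
discussion. The names _Maybe and _MAYBE are placeholders, not a
proposed API.

```python
class _Maybe:
    """Private null object: every supported operation returns None."""
    __slots__ = ()

    def __getattr__(self, attr):
        return None

    def __getitem__(self, idx):
        return None

    def __call__(self, *args, **kw):
        return None


_MAYBE = _Maybe()   # the single shared instance


def maybe(obj):
    """Return obj unchanged unless it is None."""
    return _MAYBE if obj is None else obj


obj = None
print(maybe(obj).spam)           # None
print(maybe(obj)[0])             # None
print(maybe(obj)(1, 2))          # None
print(maybe("abc").upper())      # ABC
print(maybe(None) is maybe(None))  # True
```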

Paul.

From phd at phdru.name  Sun Sep 20 13:29:26 2015
From: phd at phdru.name (Oleg Broytman)
Date: Sun, 20 Sep 2015 13:29:26 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
Message-ID: <20150920112926.GA5178@phdru.name>

On Sun, Sep 20, 2015 at 12:05:52PM +0100, Paul Moore <p.f.moore at gmail.com> wrote:
> On 20 September 2015 at 08:31, Steven D'Aprano <steve at pearwood.info> wrote:
> > I'm not convinced that we should generalise this beyond the three
> > original examples of attribute access, item lookup and function call. I
> > think that applying ? to arbitrary operators is a case of "YAGNI". Or
> > perhaps, "You Shouldn't Need It".
> 
> Agreed.
> 
> Does this need to be an operator? How about the following:
> 
>     class Maybe:
>         def __getattr__(self, attr): return None
>         def __getitem__(self, idx): return None
>         def __call__(self, *args, **kw): return None
> 
>     def maybe(obj):
>         return Maybe() if obj is None else obj
> 
>     attr = maybe(obj).spam
>     elt = maybe(obj)[n]
>     result = maybe(callback)(args)
> 
> The Maybe class could be hidden, and the Maybe() object a singleton
> (making my poor naming a non-issue :-)) and if it's felt sufficiently
> useful, the maybe() function could be a builtin.
> 
> Usage of the result of maybe() outside of the above 3 contexts should
> simply be "not supported" - don't worry about trying to stop people
> doing weird things, just make it clear that the intent is only to
> support the 3 given idiomatic usages.

   PyMaybe - a Python implementation of the Maybe pattern. Seems to be
quite elaborate.

https://github.com/ekampf/pymaybe

> Paul.

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From ncoghlan at gmail.com  Sun Sep 20 13:54:48 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Sep 2015 21:54:48 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAPTjJmrb6a8VdBVgowBfzP+0GGnYFyxKRimfhKqxx141mrHnOA@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAPTjJmrb6a8VdBVgowBfzP+0GGnYFyxKRimfhKqxx141mrHnOA@mail.gmail.com>
Message-ID: <CADiSq7eLvKFCjR5YYQ2-x3UHFFT9w7p2qB+HoMVPUnikf6xk_Q@mail.gmail.com>

On 20 September 2015 at 09:00, Chris Angelico <rosuav at gmail.com> wrote:
> On Sun, Sep 20, 2015 at 6:57 AM, Guido van Rossum <guido at python.org> wrote:
>> [in response to Steven D'Aprano's proto-PEP]
>> Hopefully someone on the peps team can commit your PEP in the repo. It's
>> probably going to be PEP 506.
>
> I think that's my cue!
>
> PEP 506 created and pushed. I've manually converted the original text
> to RST, but if that was a bad idea, I can revert to text.

And I've now withdrawn PEP 504 in favour of Steven's approach in this PEP.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Sep 20 14:26:42 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Sep 2015 22:26:42 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
Message-ID: <CADiSq7fgocTcThn75rVfq+j=xRO+yr130Of5rN_a9oD-Wgue-w@mail.gmail.com>

On 20 September 2015 at 20:56, Paul Moore <p.f.moore at gmail.com> wrote:
> Given where this started, I'd suggest renaming token_alpha as
> "password". Beginners wouldn't necessarily associate the term "token"
> with the problem "I want to generate a random password" [1]. Maybe add
> a short recipe showing how to meet constraints like "at least 2
> digits" by simply generating repeatedly until a valid password is
> found.
>
> For a bit of extra bikeshedding, I'd make alphabet the second,
> optional, parameter and default it to
> string.ascii_letters+string.digits+string.punctuation, as that's often
> what password constraints require.
>
> Or at the very least, document how to use the module functions for the
> common tasks we see people getting wrong. But I thought the idea here
> was to make doing things the right way obvious, for people who don't
> read documentation, so I'd prefer to see the functions exposed by the
> module named based on the problems they solve, not on the features
> they provide. (Even if that involves a little duplication, and/or a
> split between "high level" and "low level" APIs).

Right, I'd suggest the following breakdown.

* Arbitrary password generation (also covers passphrase generation
from a word list):

    secrets.password(
        result_len: int,
        alphabet: T = string.ascii_letters+string.digits+string.punctuation,
    ) -> T

* Binary token generation ("num_random_bytes" is the arg to
os.urandom, not the length of result):

    secrets.token(num_random_bytes: int) -> bytes
    secrets.token_hex(num_random_bytes: int) -> bytes
    secrets.token_urlsafe_base64(num_random_bytes: int) -> bytes

* Serial number generation ("num_random_bytes" is the arg to
os.urandom, not the length of result):

    secrets.serial_number(num_random_bytes: int) -> int

* Constant time secret comparison (aka hmac.compare_digest):

    secrets.equal(a: T, b: T) -> bool

* Lower level building blocks:

    secrets.choice(container)
    # Hold off on other SystemRandom methods?

(I don't have a strong opinion on that last point, as it's the higher
level APIs that I think are the important aspect of this proposal)
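
For concreteness, a rough sketch of how the token, serial number and
comparison helpers could sit on top of what the stdlib already has
(os.urandom, binascii, base64, hmac.compare_digest). The function
names follow the breakdown above; the bodies are only one possible
reading of it, and int.from_bytes makes this Python 3 only.

```python
import base64
import binascii
import hmac
import os


def token(num_random_bytes):
    """Raw random bytes from the OS."""
    return os.urandom(num_random_bytes)


def token_hex(num_random_bytes):
    """Same entropy as ASCII hex digits (result is twice as long)."""
    return binascii.hexlify(os.urandom(num_random_bytes))


def token_urlsafe_base64(num_random_bytes):
    """Same entropy as URL-safe base64 (padding is kept here)."""
    return base64.urlsafe_b64encode(os.urandom(num_random_bytes))


def serial_number(num_random_bytes):
    """Random non-negative integer below 256 ** num_random_bytes."""
    return int.from_bytes(os.urandom(num_random_bytes), 'big')


def equal(a, b):
    """Constant-time comparison of two secrets."""
    return hmac.compare_digest(a, b)


print(len(token(16)), len(token_hex(16)), len(token_urlsafe_base64(16)))
# 16 32 24
```

The length comment shows why "num_random_bytes" and the length of
the result need to be kept apart in the docs.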

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From luciano at ramalho.org  Sun Sep 20 15:59:54 2015
From: luciano at ramalho.org (Luciano Ramalho)
Date: Sun, 20 Sep 2015 10:59:54 -0300
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <55FDAC74.7050001@mail.de>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FDAC74.7050001@mail.de>
Message-ID: <CALxg4FU=gJKHrq1p+ni5uiXhF==B_SxXk6eU2cW=49KfO1_3Vg@mail.gmail.com>

Chris,

I don't think students should be worrying about writing code that is
Python 2 and Python 3 compatible.

That's a concern only for people who write libraries, tools and
frameworks for others to use, and I do not think these are the kinds
of programs students usually do. Even if they are doing something
along those lines, they should be focusing on other more important
features of the programs rather than whether they run on Python 2 and
on Python 3.

Having said that, I'd also like to add that I don't think ``from
__future__ import unicode_literals`` is a great idea for making code
2/3 compatible nowadays. It was necessary before the u'' prefix was
reinstated in Python 3.3, but since u'' is back it's much better to be
explicit in your literals rather than dealing with runtime errors
because of the blanket effect of the unicode_literals import.

Anyone who cares about 2/3 compatibility should mark every single
literal with a u'' or a b'' prefix.
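
A tiny example of what explicit marking looks like. It should run
unchanged on Python 2.7 and on 3.3 or later; the variable names are
arbitrary.

```python
text = u'caf\u00e9'     # always text: unicode on 2.x, str on 3.3+
data = b'caf\xc3\xa9'   # always bytes: str on 2.x, bytes on 3.x

# The conversion between the two is explicit, so nothing depends on
# which interpreter runs the module or on a __future__ import.
assert text.encode('utf-8') == data
assert data.decode('utf-8') == text
assert len(text) == 4 and len(data) == 5
```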

But students should not be distracted by this. They should be using
Python 3 only ;-).

Cheers,

Luciano




On Sat, Sep 19, 2015 at 3:41 PM, Sven R. Kunze <srkunze at mail.de> wrote:
> I totally agree here.
>
>
> On 19.09.2015 19:50, Chris Barker wrote:
>
> Hi all,
>
> the common advice, these days, if you want to write py2/3 compatible code,
> is to do:
>
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals
>
> https://docs.python.org/2/howto/pyporting.html#prevent-compatibility-regressions
>
> I'm trying to do this in my code, and teaching my students to do it too.
>
> but that's actually a lot of code to write.
>
> It would be nice to have a:
>
> from __future__ import py3
>
> or something like that, that would do all of those in one swipe.
>
> IIUC, I can't make a little module that does that, because the __future__
> imports only affect the module in which they are imported
>
> Sure, it's not a huge deal, but it would make it easier for folks wanting to
> keep up this best practice.
>
> Of course, this wouldn't happen until 2.7.11, if and when there even is one,
> but it would be nice to get it on the list....
>
> -Chris
>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
Luciano Ramalho
|  Author of Fluent Python (O'Reilly, 2015)
|     http://shop.oreilly.com/product/0636920032519.do
|  Teacher at: http://python.pro.br
|  Twitter: @ramalhoorg

From abarnert at yahoo.com  Sun Sep 20 23:34:34 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 20 Sep 2015 14:34:34 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
Message-ID: <5DE97370-B0FD-49B1-A22F-08407951E68B@yahoo.com>

On Sep 20, 2015, at 04:05, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> Does this need to be an operator? How about the following:
> 
>    class Maybe:
>        def __getattr__(self, attr): return None
>        def __getitem__(self, idx): return None
>        def __call__(self, *args, **kw): return None
> 
>    def maybe(obj):
>        return Maybe() if obj is None else obj
> 
>    attr = maybe(obj).spam
>    elt = maybe(obj)[n]
>    result = maybe(callback)(args)

But try this for calling a method on a possibly-null object:

    result = maybe(maybe(spam).eggs)(cheese)
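
To spell out why the double wrapping is needed, here is a small runnable sketch using the Maybe class exactly as quoted above; spam and cheese are just placeholder values.

```python
class Maybe:
    def __getattr__(self, attr): return None
    def __getitem__(self, idx): return None
    def __call__(self, *args, **kw): return None


def maybe(obj):
    return Maybe() if obj is None else obj


spam = None
cheese = "cheddar"

# A single maybe() guards the attribute lookup, but the lookup returns a
# plain None, so the call that follows is not guarded.
try:
    maybe(spam).eggs(cheese)
except TypeError as exc:
    print("unguarded call failed:", exc)

# Wrapping the intermediate result guards the call as well.
result = maybe(maybe(spam).eggs)(cheese)
print(result)   # None
```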



From mertz at gnosis.cx  Sun Sep 20 23:47:16 2015
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 20 Sep 2015 14:47:16 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
Message-ID: <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>

Paul Moore's idea is WAAYY better than the ugly ? pseudo-operator.
 `maybe()` reads just like a regular function (because it is), and we don't
need to go looking for Perl (nor Haskell) in some weird extra syntax that
will confuse beginners.

On Sun, Sep 20, 2015 at 4:05 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 20 September 2015 at 08:31, Steven D'Aprano <steve at pearwood.info>
> wrote:
> > I'm not convinced that we should generalise this beyond the three
> > original examples of attribute access, item lookup and function call. I
> > think that applying ? to arbitrary operators is a case of "YAGNI". Or
> > perhaps, "You Shouldn't Need It".
>
> Agreed.
>
> Does this need to be an operator? How about the following:
>
>     class Maybe:
>         def __getattr__(self, attr): return None
>         def __getitem__(self, idx): return None
>         def __call__(self, *args, **kw): return None
>
>     def maybe(obj):
>         return Maybe() if obj is None else obj
>
>     attr = maybe(obj).spam
>     elt = maybe(obj)[n]
>     result = maybe(callback)(args)
>
> The Maybe class could be hidden, and the Maybe() object a singleton
> (making my poor naming a non-issue :-)) and if it's felt sufficiently
> useful, the maybe() function could be a builtin.
>
> Usage of the result of maybe() outside of the above 3 contexts should
> simply be "not supported" - don't worry about trying to stop people
> doing weird things, just make it clear that the intent is only to
> support the 3 given idiomatic usages.
>
> Paul.



-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From abarnert at yahoo.com  Mon Sep 21 00:07:43 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 20 Sep 2015 15:07:43 -0700
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CADiSq7fgocTcThn75rVfq+j=xRO+yr130Of5rN_a9oD-Wgue-w@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <CADiSq7fgocTcThn75rVfq+j=xRO+yr130Of5rN_a9oD-Wgue-w@mail.gmail.com>
Message-ID: <EB7EC940-4CB4-45BC-8DF2-807562016C13@yahoo.com>

On Sep 20, 2015, at 05:26, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>> On 20 September 2015 at 20:56, Paul Moore <p.f.moore at gmail.com> wrote:
>> Given where this started, I'd suggest renaming token_alpha as
>> "password". Beginners wouldn't necessarily associate the term "token"
>> with the problem "I want to generate a random password" [1]. Maybe add
>> a short recipe showing how to meet constraints like "at least 2
>> digits" by simply generating repeatedly until a valid password is
>> found.
>> 
>> For a bit of extra bikeshedding, I'd make alphabet the second,
>> optional, parameter and default it to
>> string.ascii_letters+string.digits+string.punctuation, as that's often
>> what password constraints require.
>> 
>> Or at the very least, document how to use the module functions for the
>> common tasks we see people getting wrong. But I thought the idea here
>> was to make doing things the right way obvious, for people who don't
>> read documentation, so I'd prefer to see the functions exposed by the
>> module named based on the problems they solve, not on the features
>> they provide. (Even if that involves a little duplication, and/or a
>> split between "high level" and "low level" APIs).
> 
> Right, I'd suggest the following breakdown.
> 
> * Arbitrary password generation (also covers passphrase generation
> from a word list):
> 
>    secrets.password(result_len: int,
> alphabet=string.ascii_letters+string.digits+string.punctuation: T) ->
> T

If T is a word list--that is, an Iterable of str or bytes--you want to return a str or a bytes, not a T.

Also, making it work that generically will make the code much more complicated, to the point where it no longer serves as useful sample code to rank novices. You have to extract the first element of T, then do your choosing off chain([first], T) instead of off T, then type(first).join; all of that is more complicated than the actual logic, and will obscure the important part we want novices to learn if they read the source.

Also, for word lists, I think you'd want a way to specify actual passphrases vs. the xkcd 936 idea of using passphrases as passwords even for sites that don't accept spaces, like "correcthorsebatterystaple". Maybe via a sep=' ' parameter? That would be very confusing if it's ignored when T is string-like but used when T is a non-string-like iterable of string-likes.

I think it's better to require T to be string-like than to try to generalize it, and maybe add a separate passphrase function that takes (words: Sequence[T], sep: T) -> T. (Although I'm not sure how to default to ' ' vs b' ' based on the type of T... But maybe this doesn't need to handle bytes, so Sequence[str] is fine?)
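
A minimal sketch of what such a separate function might look like, restricted to str. The nwords parameter and the toy word list are assumptions added for the example; only words and sep come from the signature above.

```python
import random

_sysrand = random.SystemRandom()


def passphrase(words, nwords, sep=' '):
    """Join nwords words chosen uniformly (with replacement) from words."""
    words = list(words)   # accept any iterable, but index a real list
    if not words:
        raise ValueError("word list is empty")
    return sep.join(_sysrand.choice(words) for _ in range(nwords))


WORDS = ["correct", "horse", "battery", "staple"]   # toy list for the demo

print(passphrase(WORDS, 4))           # four words separated by spaces
print(passphrase(WORDS, 4, sep=''))   # xkcd 936 style, no separator
```

Keeping this apart from the character-based password function means sep never has to be silently ignored.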

> * Binary token generation ("num_random_bytes" is the arg to
> os.urandom, not the length of result):
> 
>    secrets.token(num_random_bytes: int) -> bytes
>    secrets.token_hex(num_random_bytes: int) -> bytes
>    secrets.token_urlsafe_base64(num_random_bytes: int) -> bytes
> 
> * Serial number generation ("num_random_bytes" is the arg to
> os.urandom, not the length of result):
> 
>    secrets.serial_number(num_random_bytes: int) -> int
> 
> * Constant time secret comparison (aka hmac.compare_digest):
> 
>    secrets.equal(a: T, b: T) -> bool
> 
> * Lower level building blocks:
> 
>    secrets.choice(container)
>    # Hold off on other SystemRandom methods?
> 
> (I don't have a strong opinion on that last point, as it's the higher
> level APIs that I think are the important aspect of this proposal)

I think randrange is definitely worth having. Even the OpenSSL and arc4random APIs provide something equivalent. If you're a novice, and following a blog post that says to use your language's equivalent of randbelow(1000000), are you going to think of choice(range(1000000))? And, if you do, are you going to convince yourself that this is reasonable and not going to create a slew of million-element lists?
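
For comparison, a short sketch of the two spellings using random.SystemRandom, which is what the proposed module would presumably wrap. Nothing here is part of the proposal itself.

```python
import random

_sysrand = random.SystemRandom()

# Direct: uniform over 0..999999, no sequence is ever built.
n = _sysrand.randrange(1000000)

# Roundabout: works on Python 3 because range() is lazy and indexable,
# but on Python 2 range() would build the million-element list first.
m = _sysrand.choice(range(1000000))

print(0 <= n < 1000000, 0 <= m < 1000000)   # True True
```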


From guido at python.org  Mon Sep 21 00:50:10 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 20 Sep 2015 15:50:10 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
Message-ID: <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>

Actually if anything reminds me of Haskell it's a 'Maybe' type. :-(

But I do side with those who find '?' too ugly to consider.

On Sun, Sep 20, 2015 at 2:47 PM, David Mertz <mertz at gnosis.cx> wrote:

> Paul Moore's idea is WAAYY better than the ugly ? pseudo-operator.
>  `maybe()` reads just like a regular function (because it is), and we don't
> need to go looking for Perl (nor Haskell) in some weird extra syntax that
> will confuse beginners.
>
> On Sun, Sep 20, 2015 at 4:05 AM, Paul Moore <p.f.moore at gmail.com> wrote:
>
>> On 20 September 2015 at 08:31, Steven D'Aprano <steve at pearwood.info>
>> wrote:
>> > I'm not convinced that we should generalise this beyond the three
>> > original examples of attribute access, item lookup and function call. I
>> > think that applying ? to arbitrary operators is a case of "YAGNI". Or
>> > perhaps, "You Shouldn't Need It".
>>
>> Agreed.
>>
>> Does this need to be an operator? How about the following:
>>
>>     class Maybe:
>>         def __getattr__(self, attr): return None
>>         def __getitem__(self, idx): return None
>>         def __call__(self, *args, **kw): return None
>>
>>     def maybe(obj):
>>         return Maybe() if obj is None else obj
>>
>>     attr = maybe(obj).spam
>>     elt = maybe(obj)[n]
>>     result = maybe(callback)(args)
>>
>> The Maybe class could be hidden, and the Maybe() object a singleton
>> (making my poor naming a non-issue :-)) and if it's felt sufficiently
>> useful, the maybe() function could be a builtin.
>>
>> Usage of the result of maybe() outside of the above 3 contexts should
>> simply be "not supported" - don't worry about trying to stop people
>> doing weird things, just make it clear that the intent is only to
>> support the 3 given idiomatic usages.
>>
>> Paul.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.
>



-- 
--Guido van Rossum (python.org/~guido)

From mehaase at gmail.com  Mon Sep 21 04:35:03 2015
From: mehaase at gmail.com (Mark E. Haase)
Date: Sun, 20 Sep 2015 22:35:03 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
Message-ID: <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>

On the day I started this thread, I wrote a Python module that does what
maybe() does. I hadn't seen PyMaybe yet, and I couldn't think of any good
names for my module's functions, so my module was disappointingly ugly.

PyMaybe is exactly what I *wish* I had written that day. For comparison,
here's the code from my first post in this thread and its maybe-ized
version.

    response = json.dumps({
        'created': created?.isoformat(),
        'updated': updated?.isoformat(),
        ...
    })

    response = json.dumps({
        'created': maybe(created).isoformat(),
        'updated': maybe(updated).isoformat(),
        ...
    })

Pros:
1. No modification to Python grammar.
2. More readable: it's easy to overlook ? when skimming quickly, but
"maybe()" is easy to spot.
3. More intuitive: the name "maybe" gives a hint at what it might do,
whereas if you've never seen "?." you would need to google it. (Googling
punctuation is obnoxious.)

Cons:
1. Doesn't short circuit: "maybe(family_name).upper().strip()" will fail if
family_name is None.[1] You might try
"maybe(maybe(family_name).upper()).strip()", but that is tricky to read and
still isn't quite right: if family_name is not None, then it *should* be an
error if "upper" is not an attribute of it. The 2-maybes form covers up
that error.

I'm sure there will be differing opinions on whether this type of operation
should short circuit. Some will say that we shouldn't be writing code that
way: if you need to chain calls, then use some other syntax. But I think
the example of upper case & strip is a good example of a perfectly
reasonable thing to do. These kinds of operations are pretty common when
you're interfacing with some external system or external data that has a
concept of null (databases, JSON, YAML, argparse, any thin wrapper around a C
library, etc.).

This conversation has really focused on the null aware attribute access,
but the easier and more defensible use case is the null coalesce operator,
spelled "??" in C# and Dart. It's easy to find popular packages that use
something like "retries = default if default is not None else cls.DEFAULT"
to supply default instances.[2] Other packages do something like "retries =
default or cls.DEFAULT"[3], which is worse because it is easy to overlook the
implicit coalescing of the left operand. In fact, the top hit for "python
null coalesce" is StackOverflow, and the top-voted answer says to use
"or".[4] (The answer goes on to explain the nuance of using "or" to
coalesce, but how many developers read that far?)
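
A small sketch of the nuance, using a hypothetical DEFAULT_RETRIES constant and made-up function names rather than code from either package:

```python
DEFAULT_RETRIES = 3  # hypothetical stand-in for cls.DEFAULT


def retries_with_or(default=None):
    # "or" falls through on any falsy value, including 0.
    return default or DEFAULT_RETRIES


def retries_with_none_check(default=None):
    # Only None triggers the fallback; an explicit 0 is kept.
    return default if default is not None else DEFAULT_RETRIES


print(retries_with_or(0))          # 3
print(retries_with_none_check(0))  # 0
```

A caller asking for zero retries silently gets three with the "or" spelling.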

*In the interest of finding some common ground, I'd like to get some
feedback on the coalesce operator.* Maybe that conversation will yield some
insight into the other "None aware" operators.

A) Is coalesce a useful feature? (And what are the use cases?)
B) If it is useful, is it important that it short circuits? (Put another
way, could a function suffice?)
C) If it should be an operator, is "??" an ugly spelling?

    >>> retries = default ?? cls.DEFAULT

D) If it should be an operator, are any keywords more aesthetically
pleasing? (I expect zero support for adding a new keyword.)

    >>> retries = default else cls.DEFAULT
    >>> retries = try default or cls.DEFAULT
    >>> retries = try default else cls.DEFAULT
    >>> retries = try default, cls.DEFAULT
    >>> retries = from default or cls.DEFAULT
    >>> retries = from default else cls.DEFAULT
    >>> retries = from default, cls.DEFAULT


My answers:

A) It's useful: supplying default instances for optional values is an
obvious and common use case.
B) It should short circuit, because the patterns it replaces (using ternary
operator or "or") also do.
C) It's too restrictive to cobble a new operator out of existing keywords;
"??" isn't hard to read when it is separated by whitespace, as Pythonistas
typically do between a binary operator and its operands.
D) I don't find any of these easier to read or write than "??".




[1] I say "should", but actually PyMaybe does something underhanded so that
this expression does not fail: "maybe(foo).upper()" returns a "Nothing"
instance, not "None". But Nothing has "def __repr__(self): return
repr(None)". So if you try to print it out, you'll think you have a None
instance, but it won't behave like one. If you try to JSON serialize it,
you get a hideously confusing error: "TypeError: None is not JSON
serializable". For those not familiar: the JSON encoder can definitely
serialize None: it becomes a JSON "null". A standard implementation of
maybe() should _not_ work this way.
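
To illustrate the footnote, here is a sketch with a hypothetical Nothing class that only copies the __repr__ trick described above; it is not PyMaybe's actual code, and the exact TypeError message depends on the Python version:

```python
import json


class Nothing:
    # Hypothetical stand-in: prints like None but is a different object.
    def __repr__(self):
        return repr(None)


value = Nothing()
print(repr(value))    # None
print(value is None)  # False

try:
    json.dumps({"created": value})
except TypeError as exc:
    # Message text varies between Python versions.
    print("TypeError:", exc)

# A real None serializes without trouble.
print(json.dumps({"created": None}))  # {"created": null}
```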

[2] https://github.com/shazow/urllib3/blob/master/urllib3/util/retry.py#L148

[3]
https://github.com/kennethreitz/requests/blob/46ff1a9a543cc4d33541aa64c94f50f0a698736e/requests/hooks.py#L25

[4] http://stackoverflow.com/a/4978745/122763

On Sun, Sep 20, 2015 at 6:50 PM, Guido van Rossum <guido at python.org> wrote:

> Actually if anything reminds me of Haskell it's a 'Maybe' type. :-(
>
> But I do side with those who find '?' too ugly to consider.
>
> On Sun, Sep 20, 2015 at 2:47 PM, David Mertz <mertz at gnosis.cx> wrote:
>
>> Paul Moore's idea is WAAYY better than the ugly ? pseudo-operator.
>>  `maybe()` reads just like a regular function (because it is), and we don't
>> need to go looking for Perl (nor Haskell) in some weird extra syntax that
>> will confuse beginners.
>>
>> On Sun, Sep 20, 2015 at 4:05 AM, Paul Moore <p.f.moore at gmail.com> wrote:
>>
>>> On 20 September 2015 at 08:31, Steven D'Aprano <steve at pearwood.info>
>>> wrote:
>>> > I'm not convinced that we should generalise this beyond the three
>>> > original examples of attribute access, item lookup and function call. I
>>> > think that applying ? to arbitrary operators is a case of "YAGNI". Or
>>> > perhaps, "You Shouldn't Need It".
>>>
>>> Agreed.
>>>
>>> Does this need to be an operator? How about the following:
>>>
>>>     class Maybe:
>>>         def __getattr__(self, attr): return None
>>>         def __getitem__(self, idx): return None
>>>         def __call__(self, *args, **kw): return None
>>>
>>>     def maybe(obj):
>>>         return Maybe() if obj is None else obj
>>>
>>>     attr = maybe(obj).spam
>>>     elt = maybe(obj)[n]
>>>     result = maybe(callback)(args)
>>>
>>> The Maybe class could be hidden, and the Maybe() object a singleton
>>> (making my poor naming a non-issue :-)) and if it's felt sufficiently
>>> useful, the maybe() function could be a builtin.
>>>
>>> Usage of the result of maybe() outside of the above 3 contexts should
>>> simply be "not supported" - don't worry about trying to stop people
>>> doing weird things, just make it clear that the intent is only to
>>> support the 3 given idiomatic usages.
>>>
>>> Paul.
>>
>>
>>
>> --
>> Keeping medicines from the bloodstreams of the sick; food
>> from the bellies of the hungry; books from the hands of the
>> uneducated; technology from the underdeveloped; and putting
>> advocates of freedom in prisons.  Intellectual property is
>> to the 21st century what the slave trade was to the 16th.
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>



-- 
Mark E. Haase
202-815-0201

From steve at pearwood.info  Mon Sep 21 05:50:16 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 21 Sep 2015 13:50:16 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmq5C7WxJBwUiQwT4_ZDpaV7n924O5iXG7=Y+reBrGS_kA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CAPTjJmq5C7WxJBwUiQwT4_ZDpaV7n924O5iXG7=Y+reBrGS_kA@mail.gmail.com>
Message-ID: <20150921035016.GX31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 07:38:18PM +1000, Chris Angelico wrote:
> On Sun, Sep 20, 2015 at 5:31 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> > Technically, x.y x[y] and x(y) aren't operators, but for the sake of
> > convenience I'll call them such. Even though these are binary operators,
> > the ? only shortcuts according to the x, not the y. So we can call
> > these ?. ?[] ?() operators "pseudo-unary" operators rather than binary
> > operators.
> 
> That's how all Python's short-circuiting works - based on the value of
> what's on the left, decide whether or not to evaluate what's on the
> right. (Well, nearly all - if/else evaluates the middle first, but
> same difference.) This is another form of short-circuiting; "x[y]"
> evaluates x, then if that's None, doesn't bother evaluating y because
> it can't affect the result.

I think you are mistaken about x[y]:

py> None[print("side effect")]
side effect
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not subscriptable

That's why x?[y] is a proposal.


-- 
Steve

From rosuav at gmail.com  Mon Sep 21 06:05:15 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 21 Sep 2015 14:05:15 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150921035016.GX31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CAPTjJmq5C7WxJBwUiQwT4_ZDpaV7n924O5iXG7=Y+reBrGS_kA@mail.gmail.com>
 <20150921035016.GX31152@ando.pearwood.info>
Message-ID: <CAPTjJmoNt=qUv1joT_bNutPTriZjvw1Jxyy0OCPzGELRS8eRBA@mail.gmail.com>

On Mon, Sep 21, 2015 at 1:50 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, Sep 20, 2015 at 07:38:18PM +1000, Chris Angelico wrote:
>> On Sun, Sep 20, 2015 at 5:31 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> > Technically, x.y x[y] and x(y) aren't operators, but for the sake of
>> > convenience I'll call them such. Even though these are binary operators,
>> > the ? only shortcuts according to the x, not the y. So we can call
>> > these ?. ?[] ?() operators "pseudo-unary" operators rather than binary
>> > operators.
>>
>> That's how all Python's short-circuiting works - based on the value of
>> what's on the left, decide whether or not to evaluate what's on the
>> right. (Well, nearly all - if/else evaluates the middle first, but
>> same difference.) This is another form of short-circuiting; "x[y]"
>> evaluates x, then if that's None, doesn't bother evaluating y because
>> it can't affect the result.
>
> I think you are mistaken about x[y]:
>
> py> None[print("side effect")]
> side effect
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'NoneType' object is not subscriptable
>
> That's why x?[y] is a proposal.

Oops, that was a typo in my statement. I meant "x?[y]" should behave
that way - once it's discovered that x is None, the evaluation of y
can't affect the result, and so it doesn't get evaluated (as per the
normal short-circuiting rules). Yes, x[y] has to evaluate both x and y
(after all, the value of y is passed to __getitem__). Sorry for the
confusion.
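
A rough emulation of that intended behaviour in today's Python, using a hypothetical helper (none_aware_getitem is a made-up name); the index has to be wrapped in a callable because a plain function call evaluates its arguments eagerly:

```python
def none_aware_getitem(x, make_index):
    # make_index is a zero-argument callable, so the index expression is
    # evaluated only when x is not None.
    if x is None:
        return None
    return x[make_index()]


calls = []


def index():
    calls.append("evaluated")  # record the side effect
    return 0


none_aware_getitem(None, index)   # returns None; index() is never called
none_aware_getitem(["a"], index)  # returns "a"; index() is called once
```

The need for the callable wrapper is exactly what a function-based spelling cannot avoid and an operator could.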

ChrisA

From steve at pearwood.info  Mon Sep 21 06:06:19 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 21 Sep 2015 14:06:19 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
Message-ID: <20150921040619.GY31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 12:05:52PM +0100, Paul Moore wrote:
> On 20 September 2015 at 08:31, Steven D'Aprano <steve at pearwood.info> wrote:
> > I'm not convinced that we should generalise this beyond the three
> > original examples of attribute access, item lookup and function call. I
> > think that applying ? to arbitrary operators is a case of "YAGNI". Or
> > perhaps, "You Shouldn't Need It".
> 
> Agreed.
> 
> Does this need to be an operator? How about the following:

Sadly, I think it does.

Guido has (I think) ruled out the Null object design pattern, which 
makes me glad because I think it is horrid. But your Maybe class below 
is a watered down, weak version that (in my opinion) isn't worth 
bothering with. See below.


class Maybe:
    def __getattr__(self, attr): return None
    def __getitem__(self, idx): return None
    def __call__(self, *args, **kw): return None

def maybe(obj):
    return Maybe() if obj is None else obj


And in action:

py> maybe("spam").upper()  # Works fine.
'SPAM'
py> maybe(None).upper()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable

It also fails for chained lookups:

maybe(obj).spam['id'].ham

will fail for the same reason. You could write this:

maybe(maybe(obj).upper)()
maybe(maybe(maybe(obj).spam)['id']).ham

but that's simply awful. Avoiding that problem is why the Null object 
returns itself, but we've rightly ruled that out.

This is why I think that if this is worth doing, it has to be some sort 
of short-circuiting operator or pseudo-operator:

expression ? .spam.eggs.cheese

can short-circuit the entire chain .spam.eggs.cheese, not just the first 
component. Otherwise, I don't think it's worth doing.
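
For comparison, a sketch of how the whole-chain behaviour can be approximated without new syntax. The helper name if_not_none and the sample object are assumptions made for illustration:

```python
from types import SimpleNamespace


def if_not_none(obj, chain):
    # chain is applied to obj only when obj is not None, so the entire
    # lookup chain is skipped for None.
    return None if obj is None else chain(obj)


obj = SimpleNamespace(spam=SimpleNamespace(eggs=SimpleNamespace(cheese="brie")))

found = if_not_none(obj, lambda o: o.spam.eggs.cheese)     # "brie"
skipped = if_not_none(None, lambda o: o.spam.eggs.cheese)  # None

# A non-None object that lacks an attribute still raises AttributeError,
# because the chain runs as ordinary attribute access.
```

The lambda is the price of doing this with a function; the proposed operator would make the deferral implicit.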



-- 
Steve

From anthony at xtfx.me  Mon Sep 21 06:17:00 2015
From: anthony at xtfx.me (C Anthony Risinger)
Date: Sun, 20 Sep 2015 23:17:00 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150919120624.GS31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <CAGAVQTFG8PpdmDqr1FRBrvmB_hJ-ZqJrZT4qLyLx63+bNm-mpg@mail.gmail.com>
 <20150919120624.GS31152@ando.pearwood.info>
Message-ID: <CAGAVQTGGX679Mx+F3kFDgWSYrJjRGQ_AXXo_qxFBwCtQnUP0Rg@mail.gmail.com>

On Sat, Sep 19, 2015 at 7:06 AM, Steven D'Aprano <steve at pearwood.info>
wrote:

> On Sat, Sep 19, 2015 at 03:17:07AM -0500, C Anthony Risinger wrote:
>
> > I really liked this whole thread, and I largely still do -- I think --
> but
> > I'm not sure I like how `?` suddenly prevents whole blocks of code from
> > being evaluated. Anything within the (...) or [...] is now skipped (IIUC)
> > just because a `?` was added, which seems like it could have side effects
> > on the surrounding state, especially since I expect people will use it
> for
> > squashing/silencing or as a convenient trick after the fact, possibly in
> > code they did not originally write.
>
> I don't think this is any different from other short-circuiting
> operators, particularly `and` and the ternary `if` operator:
>
> result = obj and obj.method(expression)
>
> result = obj.method(expression) if obj else default
>
> In both cases, `expression` is not evaluated if obj is falsey. That's
> the whole point.


Sure, but those all have white space and I can read what's happening. The
`?` could appear anywhere without break. I don't like that, but, opinion.


> > If the original example included a `?` like so:
> >
> >     response = json.dumps?({
> >         'created': created?.isoformat(),
> >         'updated': updated?.isoformat(),
> >         ...
> >     })
> >
> > should "dumps" be None, the additional `?` (although you can barely
> > see it) prevents *everything else* from executing.
>
> We're still discussing the syntax and semantics of this, so I could be
> wrong, but my understanding of this is that the *first* question mark
> prevents the expressions in the parens from being executed:
>
> json.dumps?( ... )
>
> evaluates as None if json.dumps is None, otherwise it evaluates the
> arguments and calls the dumps object. In other words, rather like this:
>
> _temp = json.dumps  # temporary value
> if _temp is None:
>     response = None
> else:
>     response = _temp({
>         'created': None if created is None else created.isoformat(),
>         'updated': None if updated is None else updated.isoformat(),
>         ...
>         })
> del _temp
>
>
> except the _temp name isn't actually used. The whole point is to avoid
> evaluating an expression (attribute lookup, index/key lookup, function
> call) which will fail if the object is None, and if you're not going to
> call the function, why evaluate the arguments to the function?
>

Yes that is how I understand it as well. I'm suggesting it's hard to see. I
understand the concept as "None cancellation", because if the left is None,
the right is cancelled. This led me here:

* This is great, want to use all the time!
* First-level language support, shouldn't I use it? Does feel useful/natural
* How can I make my APIs cancellation-friendly?
* I can write None-centric APIs, that often collapse to None
* Now maybe user code does stuff like `patient.visits?.september?.records`
to get all records in September (if any, else None)
* Since both `?` points would *prefer* None, if the result is None, I now
have to jump around looking for who done it
* If I don't have debugger ATM, I'm breaking it up a lot for good 'ol
print(...), only way
* I don't think I like this any more :(

I especially don't like the idea of seeing it multiple times quickly, and
the potential impact to debugging. The truth is I want to like this but I
feel like it opens a can of worms (as seen by all the wild operators this
proposal "naturally" suggests).


> > Usually when I want to use this pattern, I find I just need to write
> things
> > out more. The concept itself vaguely reminds me of PHP's use of `@` for
> > squashing errors.
>
> I had to look up PHP's @ and I must say I'm rather horrified. According
> to the docs, all it does is suppress the error reporting, it does
> nothing to prevent or recover from errors. There's not really an
> equivalent in Python, but I suppose this is the closest:
>
> # similar to PHP's $result = @(expression);
> try:
>     result = expression
> except:
>     result = None
>
>
> This is nothing like this proposal. It doesn't suppress arbitrary
> errors. It's more like a conditional:
>
> # result = obj?(expression)
> if obj is None:
>     result = None
> else:
>     result = obj(expression)
>
>
> If `expression` raises an exception, it will still be raised, but only
> if it is actually evaluated, just like anything else protected by an
> if...else or short-circuit operator.


I did say vaguely :) but it is extremely hideous, I agree. The part that
made me think of this is the would-be desire for things to become None (so,
for example, wanting to avoid throwing typed/informative exceptions if
possible) so they'd then be more useful with `?`.

-- 

C Anthony

From srkunze at mail.de  Mon Sep 21 06:20:53 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 21 Sep 2015 06:20:53 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAGAVQTGGX679Mx+F3kFDgWSYrJjRGQ_AXXo_qxFBwCtQnUP0Rg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <CAGAVQTFG8PpdmDqr1FRBrvmB_hJ-ZqJrZT4qLyLx63+bNm-mpg@mail.gmail.com>
 <20150919120624.GS31152@ando.pearwood.info>
 <CAGAVQTGGX679Mx+F3kFDgWSYrJjRGQ_AXXo_qxFBwCtQnUP0Rg@mail.gmail.com>
Message-ID: <55FF85A5.8000909@mail.de>

On 21.09.2015 06:17, C Anthony Risinger wrote:
> Yes that is how I understand it as well. I'm suggesting it's hard to 
> see. I understand the concept as "None cancellation", because if the 
> left is None, the right is cancelled. This led me here:
>
> * This is great, want to use all the time!
> * First-level language support, shouldn't I use it? Does feel useful/natural
> * How can I make my APIs cancellation-friendly?
> * I can write None-centric APIs, that often collapse to None
> * Now maybe user code does stuff like 
> `patient.visits?.september?.records` to get all records in September 
> (if any, else None)
> * Since both `?` points would *prefer* None, if the result is None, I 
> now have to jump around looking for who done it
> * If I don't have debugger ATM, I'm breaking it up a lot for good 'ol 
> print(...), only way
> * I don't think I like this any more :(
>
> I especially don't like the idea of seeing it multiple times quickly, 
> and the potential impact to debugging. The truth is I want to like 
> this but I feel like it opens a can of worms (as seen by all the wild 
> operators this proposal "naturally" suggests).
>

It's interesting to see that everybody who ponders it for more than a minute
quickly comes to the same conclusion.

Best,
Sven

From stephen at xemacs.org  Mon Sep 21 06:22:24 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Sep 2015 13:22:24 +0900
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CALxg4FU=gJKHrq1p+ni5uiXhF==B_SxXk6eU2cW=49KfO1_3Vg@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FDAC74.7050001@mail.de>
 <CALxg4FU=gJKHrq1p+ni5uiXhF==B_SxXk6eU2cW=49KfO1_3Vg@mail.gmail.com>
Message-ID: <878u80xuyn.fsf@uwakimon.sk.tsukuba.ac.jp>

Luciano Ramalho writes:

 > I don't think students should be worrying about writing code that is
 > Python 2 and Python 3 compatible.

I suppose Chris's students, as with many of those who post RFEs to aid
in teaching Python programming (vs. using Python to teach
programming), are professional programmers, not full-time students.  I
suspect it's their job to write such code.<wink/>

One thing that I've learned in over a decade on this list is that the
"consenting adults" attitude is very practical in focusing discussions
here.  If someone posts "I have this use case <explanation>, that I'm
addressing with this code: <code>", it's perfectly reasonable and
often useful to reply, "Don't use that code: in Python the TOOWTDI is
<more code>."

But most of the time "that use case is invalid" isn't any help.  The
use case may even be "stupid", but mandated by employer or by contract
with client, or by existing code that nobody knows how to maintain.
YMMV, but I've been embarrassed every time I've written something to
the effect of "you should make your use case go away."  The OP usually
cannot make it go away.  The most that usually should be said is "it's
very difficult to serve that use case elegantly in Python, and here's
why."


From greg.ewing at canterbury.ac.nz  Mon Sep 21 06:28:08 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 21 Sep 2015 16:28:08 +1200
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
Message-ID: <55FF8758.70406@canterbury.ac.nz>

Chris Barker wrote:
> It would be nice to have a:
> 
> from __future__ import py3

Or maybe

   from __future__ import *

should work?

-- 
Greg

From alexander.belopolsky at gmail.com  Mon Sep 21 06:32:14 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 21 Sep 2015 00:32:14 -0400
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <55FF8758.70406@canterbury.ac.nz>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
Message-ID: <CAP7h-xZmBm=GrZ4LYFjamkhLJpVnDKoNXqp3pTTF0HDgDEUWBw@mail.gmail.com>

On Mon, Sep 21, 2015 at 12:28 AM, Greg Ewing <greg.ewing at canterbury.ac.nz>
wrote:

> Chris Barker wrote:
>
>> It would be nice to have a:
>>
>> from __future__ import py3
>>
>
> Or maybe
>
>   from __future__ import *
>
> should work?


+1 (with all the admonitions against the "from whatever import *" construct
being still applicable)

From rosuav at gmail.com  Mon Sep 21 06:35:37 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 21 Sep 2015 14:35:37 +1000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <55FF8758.70406@canterbury.ac.nz>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
Message-ID: <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>

On Mon, Sep 21, 2015 at 2:28 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Chris Barker wrote:
>>
>> It would be nice to have a:
>>
>> from __future__ import py3
>
>
> Or maybe
>
>   from __future__ import *
>
> should work?

Hah!

Even if it were made to work, though, it'd mean you suddenly and
unexpectedly get backward-incompatible changes when you run your code
on a new version. Effectively, that directive would say "hey, you know
that __future__ feature, well, I'd rather just not bother - get the
breakage right away". Kinda defeats the purpose :)

ChrisA

From srkunze at mail.de  Mon Sep 21 06:52:13 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 21 Sep 2015 06:52:13 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
Message-ID: <55FF8CFD.9070906@mail.de>

On 21.09.2015 04:35, Mark E. Haase wrote:
> A) Is coalesce a useful feature? (And what are the use cases?)

I limit myself to materializing default arguments as in:

def a(b=None):
    b = b or {}
    ...

Because it's a well-known theme (and issue): the mutability of default
arguments in Python.
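
For reference, a sketch of where the two spellings of this idiom agree and where they differ (a_or and a_none_check are made-up names). They behave the same when the argument is omitted and differ only when the caller passes a falsy container:

```python
def a_or(b=None):
    b = b or {}    # replaces any falsy argument, including an empty dict
    b["seen"] = True
    return b


def a_none_check(b=None):
    if b is None:  # replaces only a missing argument
        b = {}
    b["seen"] = True
    return b


shared = {}
a_or(shared)
print(shared)      # {} : the caller's empty dict was swapped for a new one

a_none_check(shared)
print(shared)      # {'seen': True}
```

Whether that difference matters depends on whether callers are expected to pass in a container to be filled.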

> B) If it is useful, is it important that it short circuits? (Put 
> another way, could a function suffice?)
> C) If it should be an operator, is "??" an ugly spelling?
>
>     >>> retries = default ?? cls.DEFAULT
>

The only difference between "or" and "??" is that "??" is None only, 
right? At least to me, the given use case above does not justify the 
introduction of "??".

> D) If it should be an operator, are any keywords more aesthetically 
> pleasing? (I expect zero support for adding a new keyword.)
>
>     >>> retries = default else cls.DEFAULT
>     >>> retries = try default or cls.DEFAULT
>     >>> retries = try default else cls.DEFAULT
>     >>> retries = try default, cls.DEFAULT
>     >>> retries = from default or cls.DEFAULT
>     >>> retries = from default else cls.DEFAULT
>     >>> retries = from default, cls.DEFAULT
>
>
> My answers:
>
> A) It's useful: supplying default instances for optional values is an 
> obvious and common use case.

Yes, "or" suffices in that case.

> B) It should short circuit, because the patterns it replaces (using 
> ternary operator or "or") also do.

They look ugly and unpleasant because they remind you to reduce the 
usage of None; not to make dealing with it more pleasant.

> C) It's too restrictive to cobble a new operator out of existing 
> keywords; "??" isn't hard to read when it is separated by whitespace, 
> as Pythonistas typically do between a binary operator and its operands.
> D) I don't find any of these easier to read or write than "??".

"or" is easier to type (no special characters), I don't need to explain 
it to new staff, and it's more pleasant to the eye.

I remember my missis telling me, after I showed her some C# code, that 
programmers tend to like weird special characters. Well, that might 
certainly be true. Special characters increase the visual noise and the 
mental strain when reading. They make the lines they are in special. I 
don't see anything special with "or" and with the single use case I have 
for it. :)

Best,
Sven

From rosuav at gmail.com  Mon Sep 21 06:57:14 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 21 Sep 2015 14:57:14 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <55FF8CFD.9070906@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
Message-ID: <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>

On Mon, Sep 21, 2015 at 2:52 PM, Sven R. Kunze <srkunze at mail.de> wrote:
> I limit myself to materializing default arguments as in:
>
> def a(b=None):
>     b = b or {}
>     ...

As long as you never need to pass in a specific empty dictionary,
that's fine. That's the trouble with using 'or' - it's not checking
for None, it's checking for falsiness.
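
To make the difference concrete, here is a minimal sketch (the function
names are made up for illustration) contrasting the 'or' idiom with an
explicit None check when the caller passes in their own empty dict:

```python
def with_or(b=None):
    # 'or' replaces *any* falsy argument, including a caller's empty dict
    b = b or {}
    b["seen"] = True
    return b

def with_none_check(b=None):
    # only a missing argument (None) is replaced
    if b is None:
        b = {}
    b["seen"] = True
    return b

shared = {}
with_or(shared)
or_result = dict(shared)      # still empty: the caller's dict was swapped out
with_none_check(shared)
none_result = dict(shared)    # the caller's dict was actually updated
```

With 'or', the write lands in a fresh dict the caller never sees.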

ChrisA

From srkunze at mail.de  Mon Sep 21 07:11:05 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 21 Sep 2015 07:11:05 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
Message-ID: <55FF9169.1080905@mail.de>

On 21.09.2015 06:57, Chris Angelico wrote:
> On Mon, Sep 21, 2015 at 2:52 PM, Sven R. Kunze <srkunze at mail.de> wrote:
>> I limit myself to materializing default arguments as in:
>>
>> def a(b=None):
>>      b = b or {}
>>      ...
> As long as you never need to pass in a specific empty dictionary,
> that's fine. That's the trouble with using 'or' - it's not checking
> for None, it's checking for falsiness.

True. Although I rarely pass a dynamic value to parameters with default 
arguments. But you are right, so what does this mean for "??" ?

From random832 at fastmail.com  Mon Sep 21 08:21:21 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 21 Sep 2015 02:21:21 -0400
Subject: [Python-ideas] add a single __future__ for py3?
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
Message-ID: <m2vbb4xpge.fsf@fastmail.com>

Chris Angelico <rosuav at gmail.com> writes:
> Even if it were made to work, though, it'd mean you suddenly and
> unexpectedly get backward-incompatible changes when you run your code
> on a new version. Effectively, that directive would say "hey, you know
> that __future__ feature, well, I'd rather just not bother - get the
> breakage right away". Kinda defeats the purpose :)

Yeah, well, that won't be a problem for this use case until Python 2.8
comes out. Or do we expect *new* __future__ features to be added to
maintenance releases of Python 2.7?


From rosuav at gmail.com  Mon Sep 21 10:02:34 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 21 Sep 2015 18:02:34 +1000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <m2vbb4xpge.fsf@fastmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <m2vbb4xpge.fsf@fastmail.com>
Message-ID: <CAPTjJmqFgwu_CVAkEQvg9vbQcZq=iT9vXojCs2y4ogy=cjqbeA@mail.gmail.com>

On Mon, Sep 21, 2015 at 4:21 PM, Random832 <random832 at fastmail.com> wrote:
> Chris Angelico <rosuav at gmail.com> writes:
>> Even if it were made to work, though, it'd mean you suddenly and
>> unexpectedly get backward-incompatible changes when you run your code
>> on a new version. Effectively, that directive would say "hey, you know
>> that __future__ feature, well, I'd rather just not bother - get the
>> breakage right away". Kinda defeats the purpose :)
>
> Yeah, well, that won't be a problem for this use case until Python 2.8
> comes out. Or do we expect *new* __future__ features to be added to
> maintenance releases of Python 2.7?

The whole point of this is to be compatible also with Python 3, and
new future directives can be added there. So your from __future__
import * would trigger generator_stop on 3.5, for instance.

ChrisA

From p.f.moore at gmail.com  Mon Sep 21 10:10:03 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Sep 2015 09:10:03 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
Message-ID: <CACac1F_-JnyZ=_ckzb=H8gtLnezoPxWqLo_DMBa3AXOmnkia2g@mail.gmail.com>

On 20 September 2015 at 23:50, Guido van Rossum <guido at python.org> wrote:
> Actually if anything reminds me of Haskell it's a 'Maybe' type. :-(

I warned you my choice of names was poor :-)
Paul

From greg.ewing at canterbury.ac.nz  Mon Sep 21 07:15:53 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 21 Sep 2015 17:15:53 +1200
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
Message-ID: <55FF9289.6050202@canterbury.ac.nz>

Chris Angelico wrote:
> On Mon, Sep 21, 2015 at 2:28 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
>>  from __future__ import *
>
> Even if it were made to work, though, it'd mean you suddenly and
> unexpectedly get backward-incompatible changes when you run your code
> on a new version.

Properly implemented, it would use the time
machine module to find every feature that will
ever be implemented in Python. So once you had
updated your code to be compatible with all of
them, it would *never* break again!

The neat thing is that it would take just one
use of the time machine to backport this feature,
and it would then bootstrap itself into existence.

-- 
Greg

From stephen at xemacs.org  Mon Sep 21 10:48:28 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Sep 2015 17:48:28 +0900
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
Message-ID: <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>

Mark E. Haase writes:

 > This conversation has really focused on the null aware attribute access,
 > but the easier and more defensible use case is the null coalesce operator,
 > spelled "??" in C# and Dart. It's easy to find popular packages that use
 > something like "retries = default if default is not None else cls.DEFAULT"

To me, it's less defensible.  Eg, currently TOOWTDI for "??" is the
idiom quoted above.  I sorta like the attribute access, attribute
fetch, and function call versions, though I probably won't use them.

Also some functions need to accept None as an actual argument, and the
module defines a module-specific sentinel.  The inability to handle
such sentinels is a lack of generality that the "x if x is not
sentinel else y" idiom doesn't suffer from, so "??" itself can't
become TOOWTDI.
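
A minimal sketch of the sentinel case, assuming a hypothetical function
where None is itself a meaningful argument (the names _MISSING and fetch
are illustrative only):

```python
_MISSING = object()  # module-private sentinel; the name is illustrative

def fetch(timeout=_MISSING):
    # None is a legitimate argument here (say, "wait forever"),
    # so it cannot double as the "not supplied" marker
    timeout = timeout if timeout is not _MISSING else 30
    return timeout
```

A None-only "??" could not express this, since fetch(None) must stay None.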

I don't write "def foo(default):" (ever that I can recall), so using
"default" in

    retries = default if default is not None else cls.DEFAULT

confuses me.  Realistically, I would be writing

    retries = retries if retries is not None else cls.RETRIES

(or perhaps the RHS would be "self.retries").  That doesn't look that
bad to me (perhaps from frequent repetition).  It's verbose, but I
don't see a need to chain it, unlike "?.".  For "?.", some Pythonistas
would say "just don't", but I agree that often it's natural to chain.

 > to supply default instances.[2] Other packages do something like
 > "retries = default or cls.DEFAULT"[3], which is worse because it is
 > easy to overlook the implicit coalescing of the left operand.

Worse?  It's true that it's more risky because it's all falsies, not
just the None sentinel, but I think "consenting adults" applies here.

I don't know about the packages you looked at, but I often use
"x = s or y" where I really want to trap the falsey value of the
expected type, perhaps as well as None, and I use the "x if s is not
sentinel else y" idiom to substitute default values.  I also use "or"
in scripty applications and unit test setup functions where I want
compact expression and I don't expect long-lived objects to be passed
so I can easily figure out where the non-None falsey came from anyway.

 > A) Is coalesce a useful feature? (And what are the use cases?)

Yes, for the whole group of operators.  Several use cases for the
other operators have already been proposed, but I wouldn't use them
myself in current or past projects, and don't really foresee that
changing.  -0 for the group on the IAGNI principle.

But for "??" specifically, it's just more compact AFAICS.  I don't see
where I would use x ?? y ?? z, so the compactness doesn't seem like
that great a benefit.  In practice, I think the use cases for "??"
would be a strict subset of the use cases for the ternary operator, so
you have to argue that "this special case *is* special enough" to have
its own way to do it.  I don't think it is.  -1

 > C) If it should be an operator, is "??" an ugly spelling?
 > 
 >     >>> retries = default ?? cls.DEFAULT

Looks like metasyntax from pseudo-code that didn't get fixed to me.
That would probably change if other ?x operators were added though.

I have no comment on short-circuiting (no relevant experience), or
keyword vs. punctuation spellings.  On second thought:

 > D) If it should be an operator, are any keywords more aesthetically
 > pleasing? (I expect zero support for adding a new keyword.)
 > 
 >     >>> retries = default else cls.DEFAULT

I kinda like this if-less else syntax for the symmetry with else-less
if.  But on second thought I think it would persistently confuse me
when reading, because it would be extremely natural to expect it to be
another way of spelling "default or cls.DEFAULT".  "try ... else ..."
also has its attraction, but I suppose that would fail for the same
reasons that the ternary operator is spelled "x if y else z" rather
than "if y then x else z".


From abarnert at yahoo.com  Mon Sep 21 11:05:33 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 21 Sep 2015 02:05:33 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>

On Sep 21, 2015, at 01:48, Stephen J. Turnbull <stephen at xemacs.org> wrote:

>>>>> retries = default else cls.DEFAULT
> 
> I kinda like this if-less else syntax for the symmetry with else-less
> if.  

How do you parse this:

    a if b else c else d

Feel free to answer either as a human reader or as CPython's LL(1) parser.
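
As a rough illustration of why the grouping matters, here is a sketch
that uses a hypothetical coalesce() helper as a stand-in for the proposed
"x else y" and spells out both readings by hand:

```python
def coalesce(x, y):
    # stand-in for the proposed "x else y": use y only when x is None
    return y if x is None else x

a, b, c, d = None, True, "c", "d"

left_grouping = coalesce(a if b else c, d)    # (a if b else c) else d
right_grouping = a if b else coalesce(c, d)   # a if b else (c else d)
```

With these values the two readings disagree: one yields "d", the other None.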


From ncoghlan at gmail.com  Mon Sep 21 11:25:00 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Sep 2015 19:25:00 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <EB7EC940-4CB4-45BC-8DF2-807562016C13@yahoo.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <CADiSq7fgocTcThn75rVfq+j=xRO+yr130Of5rN_a9oD-Wgue-w@mail.gmail.com>
 <EB7EC940-4CB4-45BC-8DF2-807562016C13@yahoo.com>
Message-ID: <CADiSq7ceGfSWMBW183JvyfAC=ac2jqwGJqdqX5RLPb+n+5grJg@mail.gmail.com>

On 21 September 2015 at 08:07, Andrew Barnert <abarnert at yahoo.com> wrote:
> If T is a word list--that is, an Iterable of str or bytes--you want to return a str or a bytes, not a T.
>
> Also, making it work that generically will make the code much more complicated, to the point where it no longer serves as useful sample code to rank novices. You have to extract the first element of T, then do your choosing off chain([first], T) instead of off T, then type(first).join; all of that is more complicated than the actual logic, and will obscure the important part we want novices to learn if they read the source.
>
> Also, for word lists, I think you'd want a way to specify actual passphrases vs. the xkcd 936 idea of using passphrases as passwords even for sites that don't accept spaces, like "correcthorsebatterystaple". Maybe via a sep=' ' parameter? That would be very confusing if it's ignored when T is string-like but used when T is a non-string-like iterable of string-likes.
>
> I think it's better to require T to be string-like than to try to generalize it, and maybe add a separate passphrase function that takes (words: Sequence[T], sep: T) -> T. (Although I'm not sure how to default to ' ' vs b' ' based on the type of T... But maybe this does need to handle bytes, so Sequence[str] is fine?)

Simpler is better here, so I'll revise the text based suggestions to:

    secrets.password(result_len: int,
                     alphabet: str = string.ascii_letters + string.digits + string.punctuation) -> str
    secrets.passphrase(result_len: int, words: Sequence[str], sep=' ') -> str
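
For the sake of discussion, a minimal sketch of how those two draft
signatures could be written on top of random.SystemRandom, which exists
today. These are illustrations of the proposal only, not a settled API:

```python
import random
import string

_rng = random.SystemRandom()  # draws from the OS randomness source

def password(result_len,
             alphabet=string.ascii_letters + string.digits + string.punctuation):
    # pick result_len characters independently from the alphabet
    return "".join(_rng.choice(alphabet) for _ in range(result_len))

def passphrase(result_len, words, sep=" "):
    # pick result_len words independently and join them with sep
    return sep.join(_rng.choice(words) for _ in range(result_len))
```

Passing sep="" gives the "correcthorsebatterystaple" style without any
special casing.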

>> * Lower level building blocks:
>>
>>    secrets.choice(container)
>>    # Hold off on other SystemRandom methods?
>>
>> (I don't have a strong opinion on that last point, as it's the higher
>> level APIs that I think are the important aspect of this proposal)
>
> I think randrange is definitely worth having. Even the OpenSSL and arc4random APIs provide something equivalent. If you're a novice, and following a blog post that says to use your language's equivalent of randbelow(1000000), are you going to think of choice(range(1000000))? And, if you do, are you going to convince yourself that this is reasonable and not going to create a slew of million-element lists?

Sure, that makes sense, while still keeping the secrets module focused
on integers.

getrandbits() is an interesting one, as it opens up the option of
"secrets.getrandbits(128).to_bytes()" as a pointlessly slower
alternative to "secrets.token(128 // 8)", while
"secrets.getrandbits(128)" itself would be directly equivalent to the
proposed "secrets.serial_number(128 // 8)"

So perhaps it makes sense to just drop the serial_number() idea and
have getrandbits() instead.
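
A small sketch of that equivalence using what the stdlib already offers
(random.SystemRandom and os.urandom standing in for the proposed secrets
functions, whose names are still under discussion):

```python
import os
import random

rng = random.SystemRandom()

n = rng.getrandbits(128)            # random integer below 2**128
as_bytes = n.to_bytes(16, "big")    # to_bytes is told the length (16 bytes) and byte order
direct = os.urandom(16)             # 16 random bytes without the integer detour
below = rng.randrange(1000000)      # no million-element sequence is built
```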

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Sep 21 11:29:24 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Sep 2015 19:29:24 +1000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
Message-ID: <CADiSq7czJ+c8Jnbh=GE0=-cUCoiCHYgjKr+0-Ch1HiVtdgEP8w@mail.gmail.com>

On 20 September 2015 at 03:50, Chris Barker <chris.barker at noaa.gov> wrote:
> Hi all,
>
> the common advice, these days, if you want to write py2/3 compatible code,
> is to do:
>
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals
>
> https://docs.python.org/2/howto/pyporting.html#prevent-compatibility-regressions
>
> I'm trying to do this in my code, and teaching my students to do it too.
>
> but that's actually a lot of code to write.

For folks using IPython Notebook, I've been suggesting to various
folks that a "Python 2/3 compatible" kernel that enables these
features by default may be desirable. Ed Schofield of
python-future.org was the last person I suggested that to, and he was
interested in taking a look at the idea, but wasn't sure when he'd be
able to find the time.

So, if anyone's interested in exploring the creation of new Project
Jupyter kernels... :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From abarnert at yahoo.com  Mon Sep 21 11:56:56 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 21 Sep 2015 02:56:56 -0700
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <55FF9289.6050202@canterbury.ac.nz>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <55FF9289.6050202@canterbury.ac.nz>
Message-ID: <FC49F8C7-D6BB-4D76-A6FD-4369B51640D9@yahoo.com>

On Sep 20, 2015, at 22:15, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> Chris Angelico wrote:
>>> On Mon, Sep 21, 2015 at 2:28 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>> from __future__ import *
>> 
>> Even if it were made to work, though, it'd mean you suddenly and
>> unexpectedly get backward-incompatible changes when you run your code
>> on a new version.
> 
> Properly implemented, it would use the time
> machine module to find every feature that will
> ever be implemented in Python. So once you had
> updated your code to be compatible with all of
> them, it would *never* break again!
> 
> The neat thing is that it would take just one
> use of the time machine to backport this feature,
> and it would then bootstrap itself into existence.

Well, I just tested it with 2.7.0, and it doesn't give me any future flags at all. Which proves that Guido is going to reject the feature (because otherwise he will would have useding the time machine, and he hasn't doinged), so there's no point discussing it any further.

I thought maybe many-worlds could help, so I tried "from __alternate_timeline__ import *" first, but then I got "parse error on input\nFailed, modules loaded: none", and then my kernel panicked with a "type error (needs more monads)".

From p.f.moore at gmail.com  Mon Sep 21 12:55:55 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Sep 2015 11:55:55 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
Message-ID: <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>

On 21 September 2015 at 03:35, Mark E. Haase <mehaase at gmail.com> wrote:
> A) Is coalesce a useful feature? (And what are the use cases?)

There seem to be a few main use cases:

1. Dealing with functions that return a useful value or None to signal
"no value". I suspect the right answer here is actually to rewrite the
function to not do that in the first place. "Useful value or None"
seems like a reasonable example of an anti-pattern in Python.

2. People forgetting a return at the end of the function. In that
case, the error, while obscure, is reasonable, and should be fixed by
fixing the function, not by working around it in the caller.

3. Using a library (or other function outside your control) that uses
the "useful value or None" idiom. You have to make the best of a bad
job here, but writing an adapter function that hides the complexity
doesn't seem completely unreasonable. Nor does just putting the test
inline and accepting that you're dealing with a less than ideal API.

4. Any others? I can't think of anything.

Overall, I don't think coalesce is *that* useful, given that it seems
like it'd mainly be used in situations where I'd recommend a more
strategic fix to the code.

> B) If it is useful, is it important that it short circuits? (Put another
> way, could a function suffice?)

Short circuiting is important, but to me that simply implies that the
"useful value or None" approach is flawed *because* it needs
short-circuiting to manage. In lazy languages like Haskell, the Maybe
type is reasonable because short-circuiting is a natural consequence
of laziness, and so not a problem. In languages like C#, the use of
null as a sentinel probably goes back to C usage of NULL (i.e., it may
not be a good approach there either, but history and common practice
make it common enough that a fix is needed).
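
To illustrate the short-circuiting point, here is a sketch with a
hypothetical coalesce() function next to the existing ternary idiom. The
function cannot avoid evaluating its fallback argument:

```python
calls = []

def expensive_default():
    calls.append("called")   # record that the fallback was evaluated
    return 3

def coalesce(value, default):
    # plain function: 'default' has already been evaluated by the caller
    return default if value is None else value

retries = 5

via_function = coalesce(retries, expensive_default())
calls_after_function = len(calls)   # fallback ran even though it was not needed

via_ternary = retries if retries is not None else expensive_default()
calls_after_ternary = len(calls)    # ternary short-circuits: no further call
```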

> C) If it should be an operator, is "??" an ugly spelling?
>
>     >>> retries = default ?? cls.DEFAULT

Arbitrary punctuation as operators is not natural in Python, something
like this should be a keyword IMO.

> D) If it should be an operator, are any keywords more aesthetically
> pleasing? (I expect zero support for adding a new keyword.)
>
>     >>> retries = default else cls.DEFAULT
>     >>> retries = try default or cls.DEFAULT
>     >>> retries = try default else cls.DEFAULT
>     >>> retries = try default, cls.DEFAULT
>     >>> retries = from default or cls.DEFAULT
>     >>> retries = from default else cls.DEFAULT
>     >>> retries = from default, cls.DEFAULT

Reusing existing keywords (specifically, all of the above) looks
clumsy and forced to me. I agree that proposals to add a new keyword
will probably never get off the ground, but none of the above
suggestions look reasonable to me, and I can't think of anything else
that does (particularly if you add "must be parseable" as a
restriction!)

Overall, I'm -0.5 on a "coalesce" operator. I can't see it having
sufficient value, and I can't think of a syntax I'd consider
justifying it. But if someone were to come up with a Guido-like
blindingly obvious way to spell the operation, I would be fine with
that (and may even start using it more often than I think).

Paul

From rosuav at gmail.com  Mon Sep 21 15:02:27 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 21 Sep 2015 23:02:27 +1000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <FC49F8C7-D6BB-4D76-A6FD-4369B51640D9@yahoo.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <55FF9289.6050202@canterbury.ac.nz>
 <FC49F8C7-D6BB-4D76-A6FD-4369B51640D9@yahoo.com>
Message-ID: <CAPTjJmpm_aYEbWwyXeU96LC07-87FWtJO=9JzTdocF1P3REv9A@mail.gmail.com>

On Mon, Sep 21, 2015 at 7:56 PM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> I thought maybe many-worlds could help, so I tried "from __alternate_timeline__ import *" first, but then I got "parse error on input\nFailed, modules loaded: none", and then my kernel panicked with a "type error (needs more monads)".
>

Your kernel isn't multitimeline compliant. Try recompiling it with the
--with-polyads option, and make sure you don't use an ad-blocker.

Dragging this thread back to some semblance of serious discussion...
An alias like "py3" could be well-defined, but I'd still rather not - and
definitely not the star-import. Even adding an alias would be a
problem for compatibility, because there would be Python versions that
suddenly fail. Currently, future features monotonically increase as
Python versions increase, so if "from __future__ import
barry_as_FLUFL" works on 3.3.6, I would expect it to work on 3.4.1 as
well. Adding an alias to 2.7.11 would mean adding it also to bugfix
releases in the 3.x line, so "from __future__ import py3" would break
on certain bugfix releases of all versions of Python until 3.6, at
which point it would be available. Do you really want your code to run
fine on 3.5.1 and 3.4.3, but not on 3.5.0? That would be a nightmare
to deal with, unless you're writing code for 2.7.11+/3.6+.

ChrisA

From steve at pearwood.info  Mon Sep 21 15:08:15 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 21 Sep 2015 23:08:15 +1000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <m2vbb4xpge.fsf@fastmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <m2vbb4xpge.fsf@fastmail.com>
Message-ID: <20150921130813.GZ31152@ando.pearwood.info>

On Mon, Sep 21, 2015 at 02:21:21AM -0400, Random832 wrote:
> Chris Angelico <rosuav at gmail.com> writes:
> > Even if it were made to work, though, it'd mean you suddenly and
> > unexpectedly get backward-incompatible changes when you run your code
> > on a new version. Effectively, that directive would say "hey, you know
> > that __future__ feature, well, I'd rather just not bother - get the
> > breakage right away". Kinda defeats the purpose :)
> 
> Yeah, well, that won't be a problem for this use case until Python 2.8
> comes out.

There will not be an official Python 2.8.

https://www.python.org/dev/peps/pep-0404/


> Or do we expect *new* __future__ features to be added to
> maintenance releases of Python 2.7?

No.


-- 
Steve

From rosuav at gmail.com  Mon Sep 21 15:27:24 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 21 Sep 2015 23:27:24 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
Message-ID: <CAPTjJmp-e7oZ-w0nL_11kiw2jRGhhM2M7T2oeAt1V0dXMbZfyg@mail.gmail.com>

On Mon, Sep 21, 2015 at 8:55 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> There seem to be a few main use cases:
>
> 1. Dealing with functions that return a useful value or None to signal
> "no value". I suspect the right answer here is actually to rewrite the
> function to not do that in the first place. "Useful value or None"
> seems like a reasonable example of an anti-pattern in Python.

The alternative being to raise an exception? It's generally easier,
when you can know in advance what kind of object you're expecting, to
have a None return when there isn't one. For example, SQLAlchemy has
.get(id) to return the object for a given primary key value, and it
returns None if there's no such row in the database table - having to
wrap that with try/except would be a pain. This isn't an error
condition, and it's not like the special case of iteration (since an
iterator could yield any value, it's critical to have a non-value way
of signalling "end of iteration"). I don't want to see everything
forced to "return or raise" just because someone calls this an
anti-pattern.
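
A rough sketch of the two styles side by side, with a plain dict standing
in for the database table (this is not SQLAlchemy code, just the shape of
the caller):

```python
rows = {1: "alice"}   # stand-in for a table keyed by primary key

# "value or None" style: one lookup, then a None check
user = rows.get(2)
name_a = user if user is not None else "anonymous"

# "return or raise" style: the same lookup needs try/except
try:
    name_b = rows[2]
except KeyError:
    name_b = "anonymous"
```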

ChrisA

From random832 at fastmail.com  Mon Sep 21 15:45:38 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 21 Sep 2015 09:45:38 -0400
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <20150921130813.GZ31152@ando.pearwood.info>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <m2vbb4xpge.fsf@fastmail.com>
 <20150921130813.GZ31152@ando.pearwood.info>
Message-ID: <1442843138.3321719.389368849.52DEB79B@webmail.messagingengine.com>

On Mon, Sep 21, 2015, at 09:08, Steven D'Aprano wrote:
> On Mon, Sep 21, 2015 at 02:21:21AM -0400, Random832 wrote:
> > Yeah, well, that won't be a problem for this use case until Python 2.8
> > comes out.
> 
> There will not be an official Python 2.8.

Well, yes, "until Python 2.8 comes out" was meant to be a synonym for
"never". But Chris Angelico has since pointed out that it would be a
problem for Python 3, since it's intended to be used for 2/3-compatible
scripts.

I think I'd originally read the use case as "make Python 2 as similar to
Python 3 as possible so that people learning on Python 2 won't learn as
many bad habits", without anything about explicitly running the same
scripts on Python 3.

From p.f.moore at gmail.com  Mon Sep 21 16:27:01 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Sep 2015 15:27:01 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmp-e7oZ-w0nL_11kiw2jRGhhM2M7T2oeAt1V0dXMbZfyg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
 <CAPTjJmp-e7oZ-w0nL_11kiw2jRGhhM2M7T2oeAt1V0dXMbZfyg@mail.gmail.com>
Message-ID: <CACac1F_htUWZuhc+7r2DXeOKVEdQ6PdRvGmWHQN_XynOp-prwg@mail.gmail.com>

On 21 September 2015 at 14:27, Chris Angelico <rosuav at gmail.com> wrote:
> On Mon, Sep 21, 2015 at 8:55 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>> There seem to be a few main use cases:
>>
>> 1. Dealing with functions that return a useful value or None to signal
>> "no value". I suspect the right answer here is actually to rewrite the
>> function to not do that in the first place. "Useful value or None"
>> seems like a reasonable example of an anti-pattern in Python.
>
> The alternative being to raise an exception? It's generally easier,
> when you can know in advance what kind of object you're expecting, to
> have a None return when there isn't one. For example, SQLAlchemy has
> .get(id) to return the object for a given primary key value, and it
> returns None if there's no such row in the database table - having to
> wrap that with try/except would be a pain. This isn't an error
> condition, and it's not like the special case of iteration (since an
> iterator could yield any value, it's critical to have a non-value way
> of signalling "end of iteration"). I don't want to see everything
> forced to "return or raise" just because someone calls this an
> anti-pattern.

Agreed, that's not what should happen.

It's hard to give examples without going into specific cases, but as
an example, look at dict.get. The user can supply a "what to return if
the key doesn't exist" argument. OK, many people leave it returning
the default None, but they don't *have* to - dict.get itself doesn't
do "useful value or None", it does "useful value or user-supplied
default".
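
A minimal sketch of the dict.get behaviour described above. The
`settings` dict and its keys are made up purely for illustration:

```python
# Hypothetical settings mapping, used only for illustration.
settings = {"retries": 3}

# No default supplied: a missing key yields None.
timeout_or_none = settings.get("timeout")

# Caller-supplied default: a missing key yields a usable value instead.
timeout = settings.get("timeout", 30)

# The default is ignored when the key exists.
retries = settings.get("retries", 5)
```

The caller, not dict.get, decides what "no value" should look like.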

All I'm saying is that people should look at *why* their functions
return None instead of a useful result, and see if they can do better.
My contention is that (given free rein) many times they can. Of course
not all code has free rein, not all developers have the time to look
for perfect APIs, etc. But in that case, returning a placeholder None
(and accepting a little ugliness at the call site) isn't an impossible
price to pay.

Nothing more than "I don't think the benefit justifies adding a new
operator to Python".

Paul

From steve at pearwood.info  Mon Sep 21 16:41:31 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 00:41:31 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
References: <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
Message-ID: <20150921144130.GB31152@ando.pearwood.info>

On Mon, Sep 21, 2015 at 11:55:55AM +0100, Paul Moore wrote:
> On 21 September 2015 at 03:35, Mark E. Haase <mehaase at gmail.com> wrote:
> > A) Is coalesce a useful feature? (And what are the use cases?)
> 
> There seem to be a few main use cases:
> 
> 1. Dealing with functions that return a useful value or None to signal
> "no value". I suspect the right answer here is actually to rewrite the
> function to not do that in the first place. "Useful value or None"
> seems like a reasonable example of an anti-pattern in Python.

I think that's a bit strong. Or perhaps much too strong.

There are times where you can avoid the "None or value" pattern, since 
there is a perfectly decent empty value you can use instead of None. 
E.g. if somebody doesn't have a name, you can use "" instead of None, 
and avoid special treatment.

But that doesn't always work. Suppose you want an optional (say) Dog 
object. There isn't such a thing as an empty Dog, so you have to use 
some other value to represent the lack of Dog. One could, I suppose, 
subclass Dog and build a (singleton? borg?) NoDog object, but that's 
overkill and besides it doesn't scale well if you have multiple types 
that need the same treatment.

So I don't think it is correct, or helpful, to say that we should avoid 
the "None versus value" pattern. Sometimes we can naturally avoid it, 
but it also has perfectly reasonable uses.


> Overall, I don't think coalesce is *that* useful, given that it seems
> like it'd mainly be used in situations where I'd recommend a more
> strategic fix to the code.

Go back to the original use-case given, which, paraphrasing, looks 
something like this:

result = None if value is None else value['id'].method()

I don't think we can reject code like the above out of hand as 
un-Pythonic or an anti-pattern. It's also very common, and a little 
verbose. It's not bad when the value is a name, but sometimes it's an 
expression, in which case it's both verbose and inefficient:

result = None if spam.eggs(cheese) is None else spam.eggs(cheese)['id'].method()

Contrast:

result = spam.eggs(cheese)?['id'].method()

which only calculates the expression to the left of the ?[ once.
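
A rough sketch of the double evaluation, assuming a hypothetical
`eggs()` function standing in for spam.eggs() that records each call.
Since ?[ is only proposed syntax, a temporary name is used to show the
evaluate-once behaviour:

```python
calls = []

def eggs(cheese):
    """Hypothetical stand-in for spam.eggs(); records each call."""
    calls.append(cheese)
    return {"id": "abc"}

# Verbose one-liner: eggs() runs twice when the result is not None.
result = None if eggs("brie") is None else eggs("brie")["id"].upper()
calls_verbose = len(calls)

# Temporary name: eggs() runs once, as the proposed ?[ form would.
calls.clear()
tmp = eggs("brie")
result_tmp = None if tmp is None else tmp["id"].upper()
calls_tmp = len(calls)
```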

An actual real-life example where we work around this by using a 
temporary name that otherwise isn't actually used for anything:

mo = re.match(needle, haystack)
if mo:
    substr = mo.group()
else:
    substr = None


I think it is perfectly reasonable to ask for syntactic sugar to avoid 
having to write code like the above:

substr = re.match(needle, haystack)?.group()


That's not to say we necessarily should add sugar for this, since 
there is no doubt there are disadvantages as well (mostly that many 
people dislike the ? syntax), but in principle at least it would 
certainly be nice to have and useful.


> > B) If it is useful, is it important that it short circuits? (Put another
> > way, could a function suffice?)
> 
> Short circuiting is important, but to me that simply implies that the
> "useful value or None" approach is flawed *because* it needs
> short-circuiting to manage.

Nothing needs short-circuiting, at least in a language with imperative 
assignment statements. You can always avoid the need for short-circuits 
with temporary variables, and sometimes that's the right answer: not 
everything needs to be a one-liner, or an expression.

But sometimes it is better if it could be.
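
On the "could a function suffice?" part of the question, here is a
sketch of what a plain function cannot do. `coalesce` and
`expensive_default` are hypothetical names, not an existing API:

```python
evaluated = []

def expensive_default():
    """Hypothetical costly fallback; records that it ran."""
    evaluated.append("ran")
    return 10

def coalesce(value, default):
    """Return value unless it is None, else default."""
    return default if value is None else value

# Function call: the fallback is computed even though it is not needed,
# because arguments are evaluated before the call.
a = coalesce(5, expensive_default())
ran_with_function = len(evaluated)

# Conditional expression: the fallback is skipped when value is not None.
evaluated.clear()
value = 5
b = expensive_default() if value is None else value
ran_with_expression = len(evaluated)
```

Both give the same answer; only the expression form avoids the
unnecessary work.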


> > C) If it should be an operator, is "??" an ugly spelling?
> >
> >     >>> retries = default ?? cls.DEFAULT

I assume the ?? operator is meant as sugar for:

retries = cls.DEFAULT if default is None else default

I prefer to skip the "default" variable and use the standard idiom:

if retries is None:
     retries = cls.DEFAULT

I also worry about confusion caused by the asymmetry between ?? and the 
other three ? cases:

# if the left side is None, return None, else evaluate the right side
spam?.attr
spam?['id']
spam?(arg)

# if the left side is None, return the right side, else return the left
spam ?? eggs

but perhaps I'm worried over nothing.


-- 
Steve

From p.f.moore at gmail.com  Mon Sep 21 16:56:44 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Sep 2015 15:56:44 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150921144130.GB31152@ando.pearwood.info>
References: <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
 <20150921144130.GB31152@ando.pearwood.info>
Message-ID: <CACac1F-dVvpTA5YhuSWpFMe+oNsXDas9eE0NfgS=m44_v-theA@mail.gmail.com>

On 21 September 2015 at 15:41, Steven D'Aprano <steve at pearwood.info> wrote:
> An actual real-life example where we work around this by using a
> temporary name that otherwise isn't actually used for anything:
>
> mo = re.match(needle, haystack)
> if mo:
>     substr = mo.group()
> else:
>     substr = None
>
>
> I think it is perfectly reasonable to ask for syntactic sugar to avoid
> having to write code like the above:
>
> substr = re.match(needle, haystack)?.group()

Well, (1) Mark had focused on the "coalesce" operator ??, not the ?.
variant, and that is less obviously useful here, and (2) I find the
former version more readable. YMMV on readability of course - which is
why I added the proviso that if someone comes up with an "obviously
right" syntax, I may well change my mind. But the options suggested so
far are all far less readable than a simple multi-line if (maybe with
a temporary variable) to me, at least.

By the way, in your example you're passing on the "none or useful"
property by making substr be either the matched value or None. In real
life, I'd probably do something more like

mo = re.match(needle, haystack)
if mo:
    process(mo.group())
else:
    no_needle()

possibly with inline code if process or no_needle were simple. But it
is of course easy to pick apart examples - real code isn't always that
tractable. For example we get (what seems to me like) a *lot* of bug
reports about "None has no attribute foo" style errors in pip. My
comments here are based on my inclinations about how I would fix them
in pip - I'd always go back to *why* we got a None, and try to avoid
getting the None in the first place. But that's not always easy to do.
Again, of course, YMMV.

Paul

From jsbueno at python.org.br  Mon Sep 21 17:10:58 2015
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Mon, 21 Sep 2015 12:10:58 -0300
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CALxg4FU=gJKHrq1p+ni5uiXhF==B_SxXk6eU2cW=49KfO1_3Vg@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FDAC74.7050001@mail.de>
 <CALxg4FU=gJKHrq1p+ni5uiXhF==B_SxXk6eU2cW=49KfO1_3Vg@mail.gmail.com>
Message-ID: <CAH0mxTRy4YCJOCDpWawD=1Z7wLGvLJr6bJ37kUrVVdP=wOs=vw@mail.gmail.com>

On 20 September 2015 at 10:59, Luciano Ramalho <luciano at ramalho.org> wrote:
> Chris,
>
> I don't think students should be worrying about writing code that is
> Python 2 and Python 3 compatible.
>
> That's a concern only for people who write libraries, tools and
> frameworks for others to use, and I do not think these are the kinds
> of programs students usually do. Even if they are doing something
> along those lines, they should be focusing on other more important
> features of the programs rather than whether they run on Python 2 and
> on Python 3.
>
> Having said that, I'd also like to add that I don't think ``from
> __future__ import unicode_literals`` is a great idea for making code
> 2/3 compatible nowadays. It was necessary before the u'' prefix was
> reinstated in Python 3.3, but since u'' is back it's much better to be
> explicit in your literals rather than dealing with runtime errors
> because of the blanket effect of the unicode_literals import.
>
> Anyone who cares about 2/3 compatibility should mark every single
> literal with a u'' or a b'' prefix.
>
> But students should not be distracted by this. They should be using
> Python 3 only ;-).

I beg to differ. Anyone writing maintainable, future-proof code that
should still run on Python 2 should add "from __future__ import
unicode_literals" and write the code fully aware that each string is
text, not bytes.

The "u" prefix is nice for quickly porting projects, without rethinking the flow
of every single string found inside.
>
> Cheers,
>
> Luciano
>
>
>
>
> On Sat, Sep 19, 2015 at 3:41 PM, Sven R. Kunze <srkunze at mail.de> wrote:
>> I totally agree here.
>>
>>
>> On 19.09.2015 19:50, Chris Barker wrote:
>>
>> Hi all,
>>
>> the common advice, these days, if you want to write py2/3 compatible code,
>> is to do:
>>
>> from __future__ import absolute_import
>> from __future__ import division
>> from __future__ import print_function
>> from __future__ import unicode_literals
>>
>> https://docs.python.org/2/howto/pyporting.html#prevent-compatibility-regressions
>>
>> I'm trying to do this in my code, and teaching my students to do it too.
>>
>> but that's actually a lot of code to write.
>>
>> It would be nice to have a:
>>
>> from __future__ import py3
>>
>> or something like that, that would do all of those in one swipe.
>>
>> IIUC, I can't make a little module that does that, because the __future__
>> imports only affect the module in which they are imported
>>
>> Sure, it's not a huge deal, but it would make it easier for folks wanting to
>> keep up this best practice.
>>
>> Of course, this wouldn't happen until 2.7.11, if and when there even is one,
>> but it would be nice to get it on the list....
>>
>> -Chris
>>
>>
>>
>>
>> --
>>
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R            (206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115       (206) 526-6317   main reception
>>
>> Chris.Barker at noaa.gov
>>
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>>
>>
>
>
>
> --
> Luciano Ramalho
> |  Author of Fluent Python (O'Reilly, 2015)
> |     http://shop.oreilly.com/product/0636920032519.do
> |  Professor em: http://python.pro.br
> |  Twitter: @ramalhoorg

From guido at python.org  Mon Sep 21 17:18:13 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Sep 2015 08:18:13 -0700
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <1442843138.3321719.389368849.52DEB79B@webmail.messagingengine.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <m2vbb4xpge.fsf@fastmail.com> <20150921130813.GZ31152@ando.pearwood.info>
 <1442843138.3321719.389368849.52DEB79B@webmail.messagingengine.com>
Message-ID: <CAP7+vJ+VCBsSJ1QheRzaoKViYiFGnJd_JLLhb0kW2r+GZg6Xdg@mail.gmail.com>

It's just about these four imports, right?

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

I think the case is overblown.

- absolute_import is rarely an issue; the only thing it does (despite the
name) is give an error message when you attempt a relative import without
using a "." in the import. A linter can find this easily for you, and a
little discipline plus the right example can do a lot of good here.

- division is important.

- print_function is important.

- unicode_literals is useless IMO. It breaks some things (yes there are
still APIs that don't take unicode in 2.7) and it doesn't do nearly as much as
what would be useful -- e.g. repr() and <stream>.readline() still return
8-bit strings. I recommend just using u-literals and abandoning Python 3.2.
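
A small sketch of the explicit-prefix approach recommended in the last
bullet. The sample strings are made up; the point is only that both
prefixes mean the same thing on Python 2.7 and Python 3.3+:

```python
# Explicit prefixes behave identically on Python 2.7 and Python 3.3+,
# with or without unicode_literals.
text = u"caf\u00e9"      # text: unicode on 2.7, str on 3.3+
data = b"caf\xc3\xa9"    # bytes: str on 2.7, bytes on 3.3+

text_len = len(text)     # counts characters
data_len = len(data)     # counts bytes of the UTF-8 encoding

# Crossing between the two types is always an explicit decode/encode.
round_trip = data.decode("utf-8") == text
```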

--
--Guido van Rossum (python.org/~guido)

From ron3200 at gmail.com  Mon Sep 21 17:28:22 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 21 Sep 2015 10:28:22 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
Message-ID: <mtp7mo$rgh$1@ger.gmane.org>

On 09/20/2015 11:57 PM, Chris Angelico wrote:
> On Mon, Sep 21, 2015 at 2:52 PM, Sven R. Kunze<srkunze at mail.de>  wrote:
>> >I limit myself to materializing default arguments as in:
>> >
>> >def a(b=None):
>> >     b = b or {}
>> >     ...
> As long as you never need to pass in a specific empty dictionary,
> that's fine. That's the trouble with using 'or' - it's not checking
> for None, it's checking for falseness.
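
The difference between the two checks can be sketched as follows. The
function names `a_or` and `a_is_none` are made up for illustration:

```python
def a_or(b=None):
    b = b or {}          # replaces ANY falsy argument, including {}
    b["seen"] = True
    return b

def a_is_none(b=None):
    if b is None:        # replaces only a missing or None argument
        b = {}
    b["seen"] = True
    return b

# The caller passes in a specific (empty) dictionary each time.
shared = {}
a_or(shared)
or_updated_caller_dict = "seen" in shared

shared2 = {}
a_is_none(shared2)
is_none_updated_caller_dict = "seen" in shared2
```

With `or`, the caller's empty dict is silently swapped for a fresh one,
so the caller never sees the update.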

 From reading these, I think the lowest-level/purest change would be to 
accommodate testing for "not None".  Something I've always thought 
Python should be able to do in a nicer more direct way.

We could add "not None"-specific boolean operators just by appending ! 
to the existing ones.

     while! x:    <-->   while x != None:
     if! x:       <-->   if x != None:

     a or! b      <-->   a if a != None else b
     a and! b     <-->   b if a != None else a
     not! x       <-->   x if x != None else None

Those expressions on the right are very common and are needed because 
None, False, and 0 are all false values.

It would make for much simpler expressions and statements where they are 
used and be more efficient as these are likely to be in loops going over 
*many* objects.  So it may also result in a fairly nice speed 
improvement for many routines.

While the consistency argument says "if!" should be equivalent to "if 
not", I feel the practicality argument leans towards it being specific 
to "if obj != None".

I believe testing for "not None" is a lot more common than testing for 
"None".  Usually the only difference is how the code is arranged.

I like how it simplifies/clarifies the common cases above.  It would be 
especially nice in comprehensions.

Cheers,
    Ron



From guido at python.org  Mon Sep 21 17:40:17 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Sep 2015 08:40:17 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtp7mo$rgh$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
Message-ID: <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>

Just to cut this thread short, I'm going to reject PEP 505, because ? is
just too ugly to add to Python IMO. Sorry.

I commend Mark for his clean write-up, without being distracted, giving
some good use cases. I also like that he focused on a minimal addition to
the language and didn't get distracted by hyper-generalizations.

I also like that he left out f?(...) -- the use case is much weaker;
usually it's the object whose method you're calling that might be None, as
in title?.upper().

Some nits for the PEP:

- I don't think it ever gives the priority for the ?? operator. What would
"a ?? b or c" mean?
- You don't explain why it's x ?? y but x ?= y. I would have expected
either x ? y or x ??= y.
- You don't explain or show how far ?. reaches; I assume x?.y.z is
equivalent to None if x is None else x.y.z, so you don't have to write
x?.y?.z just to handle x.y.z if x is None.
- The specification section is empty.

-- 
--Guido van Rossum (python.org/~guido)

From mehaase at gmail.com  Mon Sep 21 17:58:22 2015
From: mehaase at gmail.com (Mark E. Haase)
Date: Mon, 21 Sep 2015 11:58:22 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
Message-ID: <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>

PEP-505 isn't anywhere close to being finished. I only submitted the draft
because somebody off list asked me to send a draft so I could get a PEP
number assigned. So I literally sent him what I had open in my text editor,
which was just a few minutes of brain dumping and had several mistakes
(grammatical and technical).

If there's absolutely no point in continuing to work on it, I'll drop it.
But from the outset, I thought the plan was to present this in its best
light (and similar to the ternary operator PEP, offer several alternatives)
if for no other reason than to have a good record of the reasoning for
rejecting it.

I'm sorry if I misunderstood the PEP process; I would have kept it to
myself longer if I knew the first submission was going to be reviewed
critically. I thought this e-mail chain was more of an open discussion on
the general idea, not specifically a referendum on the PEP itself.

On Mon, Sep 21, 2015 at 11:40 AM, Guido van Rossum <guido at python.org> wrote:

> Just to cut this thread short, I'm going to reject PEP 505, because ? is
> just too ugly to add to Python IMO. Sorry.
>
> I commend Mark for his clean write-up, without being distracted, giving
> some good use cases. I also like that he focused on a minimal addition to
> the language and didn't get distracted by hyper-generalizations.
>
> I also like that he left out f?(...) -- the use case is much weaker;
> usually it's the object whose method you're calling that might be None, as
> in title?.upper().
>
> Some nits for the PEP:
>
> - I don't think it ever gives the priority for the ?? operator. What would
> "a ?? b or c" mean?
> - You don't explain why it's x ?? y but x ?= y. I would have expected
> either x ? y or x ??= y.
> - You don't explain or show how far ?. reaches; I assume x?.y.z is
> equivalent to None if x is None else x.y.z, so you don't have to write
> x?.y?.z just to handle x.y.z if x is None.
> - The specification section is empty.
>
> --
> --Guido van Rossum (python.org/~guido)
>



-- 
Mark E. Haase
202-815-0201

From steve at pearwood.info  Mon Sep 21 18:10:59 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 02:10:59 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
Message-ID: <20150921161059.GC31152@ando.pearwood.info>

On Sat, Sep 19, 2015 at 06:40:32PM -0500, Tim Peters wrote:
> [Guido]
> > Thanks! I'd accept this (and I'd reject 504 at the same time). I like the
> > secrets name. I wonder though, should the PEP propose a specific set of
> > functions? (With the understanding that we might add more later.)
> 
> The bikeshedding on that will be far more tedious than the
> implementation.  I'll get it started :-)
> 
> No attempt to be minimal here.  More-than-less "obvious" is more important:
> 
> Bound methods of a SystemRandom instance
>     .randrange()
>     .randint()
>     .randbits()
>         renamed from .getrandbits()
>     .randbelow(exclusive_upper_bound)
>         renamed from private ._randbelow()
>     .choice()

While we're bike-shedding, I don't know that I like the name randbits, 
since that always makes me expect a sequence of 0, 1 bits. But that's a 
minor point.

When would somebody use randbelow(n) rather than randrange(n)?

Apart from the possible redundancy between rand[below|range], all the 
above seem reasonable to me.

Are there use-cases for a strong random float between 0 and 1? If 
so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize, 
or should we offer secrets.random() and/or secrets.uniform(a, b)?
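
For comparison, a sketch of both spellings. It assumes the secrets
module as it later shipped in Python 3.6, where randbelow() exists
under that name; random() and uniform() are shown on a SystemRandom
instance, since whether secrets should offer them is the open question:

```python
import random
import secrets   # standard library since Python 3.6
import sys

# Integer-ratio approach from the paragraph above: a value between 0 and 1.
ratio = secrets.randbelow(sys.maxsize) / sys.maxsize

# SystemRandom draws from os.urandom() and already has float helpers.
_sysrand = random.SystemRandom()
unit = _sysrand.random()               # float in [0.0, 1.0)
scaled = _sysrand.uniform(2.0, 5.0)    # float between 2.0 and 5.0
```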


>  Token functions
>     .token_bytes(nbytes)
>         another name for os.urandom()
>     .token_hex(nbytes)
>         same, but return string of ASCII hex digits
>     .token_url(nbytes)
>         same, but return URL-safe base64-encoded ASCII

I suggest adding a default length, say nbytes=32, with a note that the 
default length is expected to increase in the future. Otherwise, how 
will the naive user know what counts as a good, hard-to-attack length?

All of the above look good to me.
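
A sketch of how the lengths work out, assuming the secrets module as it
later shipped in Python 3.6. There, calling a token function with no
argument uses a module-chosen default size, in the spirit of the
suggestion above:

```python
import secrets  # standard library since Python 3.6

raw = secrets.token_bytes(32)   # 32 random bytes
hexed = secrets.token_hex(32)   # 64 hex digits (two per byte)

# With no argument, the module picks its own default size.
default_raw = secrets.token_bytes()
```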


>     .token_alpha(alphabet, nchars)
>         string of `nchars` characters drawn uniformly
>         from `alphabet`

What is the intention for this function? To use as passwords? Other than 
that, it's not obvious to me what that would be used for.



-- 
Steve

From steve at pearwood.info  Mon Sep 21 18:16:45 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 02:16:45 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The
	Standard	Library
In-Reply-To: <871tdtvgun.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
 <871tdtvgun.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20150921161645.GD31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 01:45:36PM +0900, Stephen J. Turnbull wrote:
> Chris Angelico writes:
> 
>  > My personal preference for shed colour: token_bytes returns a
>  > bytestring, its length being the number provided. All the others
>  > return Unicode strings, their lengths again being the number provided.
>  > So they're all text bar the one that explicitly says it's in bytes.
> 
> I think that token_url may need a bytes mode, for the same reasons
> that bytes needs __mod__: such tokens will often be created and parsed
> by programs that never leave the "ASCII-compatible bytes" world.

I expect that token_url would return a string (Unicode), but since
it's pure ASCII (being base64 encoded), if you want bytes, you can just
call token_url().encode('ascii').

Or maybe it should return bytes, and if you want a string, you just say 
token_url().decode('ascii').

Out of the two, I'm very slightly leaning towards the first (Unicode by 
default, encode to ASCII if you want bytes) than the second.
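
A sketch of the first option. The function is called token_url in this
thread; the sketch assumes token_urlsafe, the name under which it later
shipped in Python 3.6, which returns text:

```python
import secrets  # standard library since Python 3.6

token = secrets.token_urlsafe(16)          # text (str), URL-safe base64
token_as_bytes = token.encode("ascii")     # bytes, for ASCII-only pipelines
```

Because the token is pure ASCII, the encode step cannot fail and the
round trip is lossless.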

I'm very much not in favour of a "return_bytes=True" argument.


-- 
Steve

From srkunze at mail.de  Mon Sep 21 18:21:28 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 21 Sep 2015 18:21:28 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>
 <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>
Message-ID: <56002E88.80903@mail.de>

On 21.09.2015 11:05, Andrew Barnert via Python-ideas wrote:
> On Sep 21, 2015, at 01:48, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>
>>>>>> retries = default else cls.DEFAULT
>> I kinda like this if-less else syntax for the symmetry with else-less
>> if.

That's cool. It reads nicely (at least for a non-native speaker). Also, 
chaining else reads nicely:

final_value = users_value else apps_value else systems_value

> How do you parse this:
>
>      a if b else c else d
>
> Feel free to answer either as a human reader or as CPython's LL(1) parser.

Use parentheses if you mix up if-else and else. ;)

Btw. the same applies for: a + b * c + d
If you don't know from your education that b*c is evaluated 
first, then it's not obvious either.

Best,
Sven

From steve at pearwood.info  Mon Sep 21 18:22:26 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 02:22:26 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <mtli19$u3d$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org>
Message-ID: <20150921162226.GE31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:
> On 20.09.15 02:40, Tim Peters wrote:
> >No attempt to be minimal here.  More-than-less "obvious" is more important:
> >
> >Bound methods of a SystemRandom instance
> >     .randrange()
> >     .randint()
> >     .randbits()
> >         renamed from .getrandbits()
> >     .randbelow(exclusive_upper_bound)
> >         renamed from private ._randbelow()
> >     .choice()
> 
> randbelow() is just an alias for randrange() with single argument.
> randint(a, b) == randrange(a, b+1).
> 
> These functions are redundant and they have non-zero cost.

But they already exist in the random module, so adding them to secrets 
doesn't cost anything extra. It's just a reference to the bound method 
of the private SystemRandom() instance:

# suggested implementation
import random
_systemrandom = random.SystemRandom()

randint = _systemrandom.randint
randrange = _systemrandom.randrange

etc.


> Would not renaming getrandbits be confused?
> 
> >  Token functions
> >     .token_bytes(nbytes)
> >         another name for os.urandom()
> >     .token_hex(nbytes)
> >         same, but return string of ASCII hex digits
> >     .token_url(nbytes)
> >         same, but return URL-safe base64-encoded ASCII
> >     .token_alpha(alphabet, nchars)
> >         string of `nchars` characters drawn uniformly
> >         from `alphabet`
> 
> token_hex(nbytes) == token_alpha('0123456789abcdef', nchars) ?
> token_url(nbytes) == token_alpha(
>     'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
>      nchars) ?

They may be reasonable implementations for the functions, but simple as 
they are, I think we still want to provide them as named functions 
rather than expect the user to write things like the above. If they're 
doing it more than once, they'll want to write a helper function, so we 
might as well provide that for them.


-- 
Steve

From srkunze at mail.de  Mon Sep 21 18:27:48 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 21 Sep 2015 18:27:48 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmp-e7oZ-w0nL_11kiw2jRGhhM2M7T2oeAt1V0dXMbZfyg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
 <CAPTjJmp-e7oZ-w0nL_11kiw2jRGhhM2M7T2oeAt1V0dXMbZfyg@mail.gmail.com>
Message-ID: <56003004.1070807@mail.de>

On 21.09.2015 15:27, Chris Angelico wrote:
> On Mon, Sep 21, 2015 at 8:55 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>> There seem to be a few main use cases:
>>
>> 1. Dealing with functions that return a useful value or None to signal
>> "no value". I suspect the right answer here is actually to rewrite the
>> function to not do that in the first place. "Useful value or None"
>> seems like a reasonable example of an anti-pattern in Python.
> The alternative being to raise an exception? It's generally easier,
> when you can know in advance what kind of object you're expecting, to
> have a None return when there isn't one. For example, SQLAlchemy has
> .get(id) to return the object for a given primary key value, and it
> returns None if there's no such row in the database table - having to
> wrap that with try/except would be a pain. This isn't an error
> condition, and it's not like the special case of iteration (since an
> iterator could yield any value, it's critical to have a non-value way
> of signalling "end of iteration"). I don't want to see everything
> forced to "return or raise" just because someone calls this an
> anti-pattern.

I don't think the two approaches are mutually exclusive. Both can exist 
and provide the right thing whenever I need it.

Depending on the use-case, one needs to decide:

If I know, the value definitely needs to be a dictionary, I use dict[...].
If I know, the value is definitely optional and I can't do anything 
about it, I use dict.get('key'[, default]).
If I definitely don't know, I use dict[...] to get my hands on a real 
example without that key if that ever happens, and don't waste time on 
special-handling a possible None return value.
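
To make the three cases concrete, here is a minimal sketch; the `settings` 
dictionary and its keys are made up for illustration:

```python
settings = {"retries": 3}  # hypothetical example data

# Case 1: the key must be present, so index directly and fail loudly if not.
retries = settings["retries"]

# Case 2: the key is optional, so .get() returns the default instead of raising.
timeout = settings.get("timeout", 30)

# Case 3: unknown, so index directly; a missing key surfaces as a KeyError
# that names a real example of the missing data.
try:
    settings["timeout"]
except KeyError as exc:
    missing = exc.args[0]

print(retries, timeout, missing)
```

With dict.get('key') and no default, the result for a missing key is None, 
which is exactly the value the caller then has to special-case.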

Best,
Sven

From robert.kern at gmail.com  Mon Sep 21 18:29:08 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 21 Sep 2015 17:29:08 +0100
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150921162226.GE31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
Message-ID: <mtpb8l$omo$1@ger.gmane.org>

On 2015-09-21 17:22, Steven D'Aprano wrote:
> On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:
>> On 20.09.15 02:40, Tim Peters wrote:

>>>   Token functions
>>>      .token_bytes(nbytes)
>>>          another name for os.urandom()
>>>      .token_hex(nbytes)
>>>          same, but return string of ASCII hex digits
>>>      .token_url(nbytes)
>>>          same, but return URL-safe base64-encoded ASCII
>>>      .token_alpha(alphabet, nchars)
>>>          string of `nchars` characters drawn uniformly
>>>          from `alphabet`
>>
>> token_hex(nbytes) == token_alpha('0123456789abcdef', nchars) ?
>> token_url(nbytes) == token_alpha(
>>      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
>>       nchars) ?
>
> They may be reasonable implementations for the functions, but simple as
> they are, I think we still want to provide them as named functions
> rather than expect the user to write things like the above. If they're
> doing it more than once, they'll want to write a helper function, we
> might as well provide that for them.

Actually, I don't think those are the semantics that Tim intended. Rather, 
token_hex(nbytes) would return a string twice as long as nbytes. The idea is 
that you want to get nbytes-worth of random bits, just encoded in a common 
"safe" format. Similarly, token_url(nbytes) would get nbytes of random bits then 
base64-encode it, not just pick nbytes characters from a URL-safe list of 
characters. This makes it easier to reason about how much entropy you are 
actually using.
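
A rough sketch of those semantics, assuming the functions are thin wrappers 
around os.urandom(); the bodies below are my guess at an implementation, not 
anything Tim posted, and the base64 padding is left in place for simplicity:

```python
import base64
import binascii
import os

def token_hex(nbytes):
    """Sketch: nbytes of OS randomness, returned as 2*nbytes hex digits (str)."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")

def token_url(nbytes):
    """Sketch: nbytes of OS randomness, URL-safe base64 encoded (str)."""
    return base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii")

print(len(token_hex(16)))   # 32 characters for 16 bytes of entropy
print(len(token_url(16)))   # 24 characters (including "=" padding)
```

Either way the caller reasons in bytes of entropy, and the string length 
follows from the encoding rather than the other way round.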

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


From stephen at xemacs.org  Mon Sep 21 18:29:43 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Sep 2015 01:29:43 +0900
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>
 <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>
Message-ID: <87k2rjzqfc.fsf@uwakimon.sk.tsukuba.ac.jp>

Andrew Barnert writes:
 > On Sep 21, 2015, at 01:48, Stephen J. Turnbull <stephen at xemacs.org> wrote:
 > 
 > >>>>> retries = default else cls.DEFAULT
 > > 
 > > I kinda like this if-less else syntax for the symmetry with else-less
 > > if.  
 > 
 > How do you parse this:
 > 
 >     a if b else c else d
 > 
 > Feel free to answer either as a human reader or as CPython's LL(1)
 > parser.

I don't know what an LL(1) parser could do offhand.  As a human, I
would parse that greedily as (a if b else c) else d.

But the point's actually moot, as I'm -1 on the "??" operator in any
form in favor of the explicit "a if a is not None else b" existing
syntax.  And to be honest, the fact that a truly symmetric "if-less
else" would have "or" semantics, not "??" semantics, bothers me more
than the technical issue of whether anybody could actually parse it.
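
The difference between "or" semantics and "??" semantics shows up as soon 
as the left operand is falsy but not None. A small sketch using only 
existing syntax (the `coalesce` helper name is made up):

```python
def coalesce(value, default):
    """None-aware choice: only None falls through to the default."""
    return value if value is not None else default

retries = 0  # a legitimate, falsy setting

print(retries or 5)          # 5: "or" discards every falsy value
print(coalesce(retries, 5))  # 0: only None is replaced
print(coalesce(None, 5))     # 5
```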



From srkunze at mail.de  Mon Sep 21 18:35:04 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 21 Sep 2015 18:35:04 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <56003004.1070807@mail.de>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
 <CAPTjJmp-e7oZ-w0nL_11kiw2jRGhhM2M7T2oeAt1V0dXMbZfyg@mail.gmail.com>
 <56003004.1070807@mail.de>
Message-ID: <560031B8.5020903@mail.de>

On 21.09.2015 18:27, Sven R. Kunze wrote:
> If I know, the value definitely needs to be *IN* the dictionary, I use 
> dict[...].

typo

From rosuav at gmail.com  Mon Sep 21 18:50:56 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 22 Sep 2015 02:50:56 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150921161059.GC31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
Message-ID: <CAPTjJmpfyNVfUHjGgHs-EpRgYOE8KU_0D7dLqVBu69AxT_xrCQ@mail.gmail.com>

On Tue, Sep 22, 2015 at 2:10 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> Are there use-cases for a strong random float between 0 and 1? If
> so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize,
> or should we offer secrets.random() and/or secrets.uniform(a, b)?

I would be leery of such a function, because it'd be hard to define it
perfectly. Tell me, crypto wonks: If I have a function randfloat()
that returns 0.0 <= x < 1.0, is it safe to use it like this:

# Generate an integer 0 <= x < 12345, uniformly distributed
uniform = int(randfloat() * 12345)
# Ditto but on a logarithmic distribution
log = math.exp(randfloat() * math.log(12345))
# Double-logarithmic
loglog = math.exp(math.exp(randfloat() * math.log(math.log(12345))))

If it's producing a random *real number* 0 <= x < 1, then these should
be valid. But given the differences between floats and reals, I would
be worried that this kind of usage would introduce an unexpected bias.
Obviously the first example is much better spelled randbelow or
randrange, but for more complicated examples, grabbing a random float
would look like the best way to do it. Will it? Always?
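
To show the kind of bias I mean, here is a toy model with a 3-bit "float" 
instead of the real 53 bits (the precision and range are picked only to 
make the effect visible):

```python
from collections import Counter

BITS = 3   # toy precision; real floats carry 53 bits
N = 3      # target range 0 <= x < N

# Every value a BITS-bit random float in [0, 1) could take, equally likely.
grid = [k / 2**BITS for k in range(2**BITS)]

# Scale to an integer the same way as int(randfloat() * N).
counts = Counter(int(x * N) for x in grid)
print(sorted(counts.items()))  # [(0, 3), (1, 3), (2, 2)]
```

Eight equally likely inputs cannot be split evenly over three outputs, so 
one result is under-represented. With 53 bits the same effect is far 
smaller, but it does not vanish unless N divides the number of inputs.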

Not being a crypto wonk myself, I can't know what's safe and what
isn't. If Python is going to offer a new module with the (implicit or
explicit) recommendation "use this for all your cryptographic
entropy", it needs to be 100% reliable.

ChrisA

From random832 at fastmail.com  Mon Sep 21 18:57:09 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 21 Sep 2015 12:57:09 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <87k2rjzqfc.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>
 <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>
 <87k2rjzqfc.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1442854629.3367342.389575353.4BDB0154@webmail.messagingengine.com>

On Mon, Sep 21, 2015, at 12:29, Stephen J. Turnbull wrote:
> I don't know what an LL(1) parser could do offhand.  As a human, I
> would parse that greedily as (a if b else c) else d.

That's not greedy. The greedy parsing is (a if b else (c else d)).
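
For comparison, this is how the existing grammar groups a chain of ordinary 
conditional expressions. The if-less else itself is hypothetical and would 
be a SyntaxError today, so the sketch below only uses current syntax:

```python
import ast

tree = ast.parse("a if b else c if d else e", mode="eval")
outer = tree.body

# The else-branch of the outer conditional is itself a conditional,
# i.e. the chain groups as: a if b else (c if d else e).
print(type(outer).__name__)          # IfExp
print(type(outer.orelse).__name__)   # IfExp
print(outer.body.id, outer.test.id)  # a b
```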

From guido at python.org  Mon Sep 21 19:07:16 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Sep 2015 10:07:16 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
Message-ID: <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>

On Mon, Sep 21, 2015 at 8:58 AM, Mark E. Haase <mehaase at gmail.com> wrote:

> PEP-505 isn't anywhere close to being finished. I only submitted the draft
> because somebody off list asked me to send a draft so I could get a PEP
> number assigned. So I literally sent him what I had open in my text editor,
> which was just a few minutes of brain dumping and had several mistakes
> (grammatical and technical).
>
> If there's absolutely no point in continuing to work on it, I'll drop it.
> But from the outset, I thought the plan was to present this in its best
> light (and similar to the ternary operator PEP, offer several alternatives)
> if for no other reason than to have a good record of the reasoning for
> rejecting it.
>
> I'm sorry if I misunderstood the PEP process; I would have kept it to
> myself longer if I knew the first submission was going to be reviewed
> critically. I thought this e-mail chain was more of an open discussion on
> the general idea, not specifically a referendum on the PEP itself.
>

I apologize for having misunderstood the status of your PEP. I think it
would be great if you finished the PEP. As you know the ? operator has its
share of fans as well as detractors, and I will happily wait until more of
a consensus appears. I hope you can also add a discussion to the PEP of
ideas (like some of the hyper-generalizations) that were considered and
rejected -- summarizing a discussion is often a very important goal of a
PEP. I think you have made a great start already!

--Guido


> On Mon, Sep 21, 2015 at 11:40 AM, Guido van Rossum <guido at python.org>
> wrote:
>
>> Just to cut this thread short, I'm going to reject PEP 505, because ? is
>> just too ugly to add to Python IMO. Sorry.
>>
>> I commend Mark for his clean write-up, without being distracted, giving
>> some good use cases. I also like that he focused on a minimal addition to
>> the language and didn't get distracted by hyper-generalizations.
>>
>> I also like that he left out f?(...) -- the use case is much weaker;
>> usually it's the object whose method you're calling that might be None, as
>> in title?.upper().
>>
>> Some nits for the PEP:
>>
>> - I don't think it ever gives the priority for the ?? operator. What
>> would "a ?? b or c" mean?
>> - You don't explain why it's x ?? y but x ?= y. I would have expected
>> either x ? y or x ??= y.
>> - You don't explain or show how far ?. reaches; I assume x?.y.z is
>> equivalent to None if x is None else x.y.z, so you don't have to write
>> x?.y?.z just to handle x.y.z if x is None.
>> - The specification section is empty.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>>
>
>
>
> --
> Mark E. Haase
> 202-815-0201
>



-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Mon Sep 21 19:09:04 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Sep 2015 03:09:04 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAPTjJmpfyNVfUHjGgHs-EpRgYOE8KU_0D7dLqVBu69AxT_xrCQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
 <CAPTjJmpfyNVfUHjGgHs-EpRgYOE8KU_0D7dLqVBu69AxT_xrCQ@mail.gmail.com>
Message-ID: <CADiSq7ci_b4piGDBGMht-EteDxO-Y3G90149jo1+EyoFshNQVw@mail.gmail.com>

On 22 September 2015 at 02:50, Chris Angelico <rosuav at gmail.com> wrote:
> On Tue, Sep 22, 2015 at 2:10 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>> Are there use-cases for a strong random float between 0 and 1? If
>> so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize,
>> or should we offer secrets.random() and/or secrets.uniform(a, b)?
>
> I would be leery of such a function, because it'd be hard to define it
> perfectly. Tell me, crypto wonks: If I have a function randfloat()
> that returns 0.0 <= x < 1.0, is it safe to use it like this:

Floating point numbers and crypto don't go together - crypto is all
about integers, bits, bytes, and text. Folks dealing with floating
point numbers are presumably handling modelling and simulation tasks,
and will want the random module, not secrets.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rymg19 at gmail.com  Mon Sep 21 19:10:04 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 21 Sep 2015 12:10:04 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
Message-ID: <CAO41-mP+-aEogmC1xEQchfJ9jjoTvHRy8CYgRt=1+0BdE16cjw@mail.gmail.com>

What about re-using try? Crystal does this (
http://play.crystal-lang.org/#/r/gf5):

v = "ABC"
puts nil == v.try &.downcase # prints false
v = nil
puts nil == v.try &.downcase # prints true

Python could use something like:

v = 'ABC'
print(v try.downcase is None) # prints False
v = None
print(v try.downcase is None) # prints True

(Of course, the syntax would be a little less...weird!)
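
For reference, the same behaviour can be spelled in today's Python with a 
small helper. The `try_attr` name is made up, and note that the Python 
method corresponding to Crystal's downcase is str.lower():

```python
def try_attr(obj, name):
    """Hypothetical helper: call obj.<name>() unless obj is None."""
    return None if obj is None else getattr(obj, name)()

v = "ABC"
print(try_attr(v, "lower") is None)  # False
v = None
print(try_attr(v, "lower") is None)  # True
```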


On Mon, Sep 21, 2015 at 10:40 AM, Guido van Rossum <guido at python.org> wrote:

> Just to cut this thread short, I'm going to reject PEP 505, because ? is
> just too ugly to add to Python IMO. Sorry.
>
> I commend Mark for his clean write-up, without being distracted, giving
> some good use cases. I also like that he focused on a minimal addition to
> the language and didn't get distracted by hyper-generalizations.
>
> I also like that he left out f?(...) -- the use case is much weaker;
> usually it's the object whose method you're calling that might be None, as
> in title?.upper().
>
> Some nits for the PEP:
>
> - I don't think it ever gives the priority for the ?? operator. What would
> "a ?? b or c" mean?
> - You don't explain why it's x ?? y but x ?= y. I would have expected
> either x ? y or x ??= y.
> - You don't explain or show how far ?. reaches; I assume x?.y.z is
> equivalent to None if x is None else x.y.z, so you don't have to write
> x?.y?.z just to handle x.y.z if x is None.
> - The specification section is empty.
>
> --
> --Guido van Rossum (python.org/~guido)
>



-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From python at mrabarnett.plus.com  Mon Sep 21 19:13:19 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 21 Sep 2015 18:13:19 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <87k2rjzqfc.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <877fnkxin7.fsf@uwakimon.sk.tsukuba.ac.jp>
 <6E2CAA1D-F35C-4BD5-B314-B2E0291AE019@yahoo.com>
 <87k2rjzqfc.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <56003AAF.6030100@mrabarnett.plus.com>

On 2015-09-21 17:29, Stephen J. Turnbull wrote:
> Andrew Barnert writes:
>   > On Sep 21, 2015, at 01:48, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>   >
>   > >>>>> retries = default else cls.DEFAULT
>   > >
>   > > I kinda like this if-less else syntax for the symmetry with else-less
>   > > if.
>   >
>   > How do you parse this:
>   >
>   >     a if b else c else d
>   >
>   > Feel free to answer either as a human reader or as CPython's LL(1)
>   > parser.
>
> I don't know what an LL(1) parser could do offhand.  As a human, I
> would parse that greedily as (a if b else c) else d.
>
'else' is being used like 'or', except when it belongs to 'if'.

I can't see a way of handling that.

It would result in a syntax error.

> But the point's actually moot, as I'm -1 on the "??" operator in any
> form in favor of the explicit "a if a is not None else b" existing
> syntax.  And to be honest, the fact that a truly symmetric "if-less
> else" would have "or" semantics, not "??" semantics, bothers me more
> than the technical issue of whether anybody could actually parse it.
>


From steve at pearwood.info  Mon Sep 21 19:47:58 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 03:47:58 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
Message-ID: <20150921174758.GF31152@ando.pearwood.info>

On Sun, Sep 20, 2015 at 11:56:06AM +0100, Paul Moore wrote:
> On 20 September 2015 at 00:40, Tim Peters <tim.peters at gmail.com> wrote:

> >     .token_alpha(alphabet, nchars)
> >         string of `nchars` characters drawn uniformly
> >         from `alphabet`
> 
> Given where this started, I'd suggest renaming token_alpha as
> "password". Beginners wouldn't necessarily associate the term "token"
> with the problem "I want to generate a random password" [1]. Maybe add
> a short recipe showing how to meet constraints like "at least 2
> digits" by simply generating repeatedly until a valid password is
> found.

I'm not entirely sure about including password generators, since there 
are so many password schemes around:

http://thedailywtf.com/articles/Security-by-PostIt

 
> For a bit of extra bikeshedding, I'd make alphabet the second,
> optional, parameter and default it to
> string.ascii_letters+string.digits+string.punctuation, as that's often
> what password constraints require.

If we're going to offer a simple, no-brainer password generator, my 
vote goes for:

def password(nchars=10, alphabet=string.ascii_letters+string.digits):

I wouldn't include punctuation by default, as too many places still 
prohibit some, or all, punctuation characters.
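
A possible body for that signature, as a sketch only: it assumes the 
private SystemRandom() instance suggested earlier in the thread, and the 
`password_with_digits` helper is a made-up name for Paul's "generate 
repeatedly until valid" recipe.

```python
import random
import string

_sysrand = random.SystemRandom()  # draws from os.urandom()

def password(nchars=10, alphabet=string.ascii_letters + string.digits):
    """Sketch: nchars characters drawn uniformly from alphabet."""
    return "".join(_sysrand.choice(alphabet) for _ in range(nchars))

def password_with_digits(nchars=10, min_digits=2):
    """Sketch: regenerate until the password has at least min_digits digits."""
    while True:
        candidate = password(nchars)
        if sum(c.isdigit() for c in candidate) >= min_digits:
            return candidate

print(password())
print(password_with_digits())
```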

If both my understanding and calculations are correct, using 
ascii_letters+digits+punctuation gives us log(94, 2) = 6.6 bits of 
(Shannon) entropy per character, while just using letters+digits gives 
us log(62, 2) = 6.0 bits per character. For short-ish passwords, up to 
10 characters, the extra entropy from including punctuation is less than 
the extra from adding an extra character:

password length of 8, without punctuation: 47.6 bits
password length of 8, including punctuation: 52.4 bits
password length of 9, without punctuation: 53.6 bits


> Or at the very least, document how to use the module functions for the
> common tasks we see people getting wrong. But I thought the idea here
> was to make doing things the right way obvious, for people who don't
> read documentation, so I'd prefer to see the functions exposed by the
> module named based on the problems they solve, not on the features
> they provide. (Even if that involves a little duplication, and/or a
> split between "high level" and "low level" APIs).

I agree that secrets should be providing ready-to-use functions, even if 
they don't solve all use-cases, not just primitive building blocks.



-- 
Steve

From tim.peters at gmail.com  Mon Sep 21 19:51:13 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 21 Sep 2015 12:51:13 -0500
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150921161059.GC31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
Message-ID: <CAExdVNmNsp=mR_Y4A-e2zHFsAFS=NdaKZKmn_HMVpaoVve-8Hw@mail.gmail.com>

[Tim]
>> ...
>> No attempt to be minimal here.  More-than-less "obvious" is more important:
>>
>> Bound methods of a SystemRandom instance
>>     .randrange()
>>     .randint()
>>     .randbits()
>>         renamed from .getrandbits()
>>     .randbelow(exclusive_upper_bound)
>>         renamed from private ._randbelow()
>>     .choice()

[Steven D'Aprano <steve at pearwood.info>]
> While we're bike-shedding,

I refuse to bikeshed on this.  I posted a concrete proposal just to
enrage others into it ;-)  So I'll just sketch my thinking:


> I don't know that I like the name randbits, since that always
> makes me expect a sequence of 0, 1 bits. But that's a minor
> point.

Had in mind multiple audiences, including those who know a lot about
Python, and those who know little.  The _lack_ of randbits() would
surprise the former.


> When would somebody use randbelow(n) rather than randrange(n)?

For the same reason they'd use randbits(n) instead of randrange(1 <<
n) ;-)  That is, familiarity and obviousness.  randrange() has a
complicated signature, with 1 to 3 arguments, and endlessly surprises
newbies who _expect_, e.g., randrange(3) to return 3 at times.  That's
why randint() was created.  "randbelow(n)" has a dirt-simple
signature, and its name makes it hard to mistakenly believe `n` is a
possible return value.  It's exactly what's needed most often to avoid
_statistical_ bias (as opposed to security weaknesses) in higher-level
functions - that's why _randbelow() is a fundamental primitive in
Random.

So, yes, it's redundant, but I don't care.  randrange(n) itself is
just a needlessly expensive way to call _randbelow(n) today.
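
A quick sketch of the bounds in question. Here `randbelow` is only a 
hypothetical public spelling built on the public randrange(), not the 
private primitive itself:

```python
import random

_sysrand = random.SystemRandom()

def randbelow(n):
    """Hypothetical public spelling: uniform integer in 0 <= x < n."""
    return _sysrand.randrange(n)

samples = [randbelow(3) for _ in range(1000)]
print(set(samples) <= {0, 1, 2})  # True: 3 itself is never returned

# randint() includes both endpoints, so 3 is a possible result here.
closed = [_sysrand.randint(0, 3) for _ in range(1000)]
print(all(0 <= x <= 3 for x in closed))  # True
```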


> Apart from the possible redundancy between rand[below|range], all the
> above seem reasonable to me.

If people want minimal, just expose os.urandom() under a friendlier
name, and call it done ;-)


> Are there use-cases for a strong random float between 0 and 1? If
> so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize,
> or should we offer secrets.random() and/or secrets.uniform(a, b)?

I don't know of any "security use" for random floats.  But if you want
to add a recipe to the docs, point them to SystemRandom.random
instead.  That gets it right.  `sys.maxsize` doesn't really have
anything to do with floats, and the snippet you gave would produce
poor-quality floats on a 32-bit box (wouldn't get anywhere near
randomizing all 53 bits of float precision).  On a 64-bit box, it
could, e.g., return 1.0 (which random() should never return).
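
The 1.0 case can be checked without any randomness.  The sketch below
assumes a 64-bit build (so sys.maxsize == 2**63 - 1, hard-coded here as
MAXSIZE_64 so the result does not depend on the machine running it) and
looks at the largest value randbelow(sys.maxsize) could return.

```python
import random

# Assumption: a 64-bit build, where sys.maxsize == 2**63 - 1.
MAXSIZE_64 = 2**63 - 1

# The largest value randbelow(MAXSIZE_64) could return is MAXSIZE_64 - 1.
largest_draw = MAXSIZE_64 - 1

# The exact quotient is 1 - 1/(2**63 - 1), far closer to 1.0 than to the
# largest double below 1.0 (1 - 2**-53), so the division rounds up to 1.0.
worst_case = largest_draw / MAXSIZE_64
print(worst_case == 1.0)  # True

# SystemRandom.random() is documented to return a float in [0.0, 1.0).
value = random.SystemRandom().random()
print(0.0 <= value < 1.0)  # True
```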


>>  Token functions
>>     .token_bytes(nbytes)
>>         another name for os.urandom()
>>     .token_hex(nbytes)
>>         same, but return string of ASCII hex digits
>>     .token_url(nbytes)
>>         same, but return URL-safe base64-encoded ASCII
>
> I suggest adding a default length, say nbytes=32, with a note that the
> default length is expected to increase in the future. Otherwise, how
> will the naive user know what counts as a good, hard-to-attack length?

Fine by me!


> All of the above look good to me.
>
>
>>     .token_alpha(alphabet, nchars)
>>         string of `nchars` characters drawn uniformly
>>         from `alphabet`
>
> What is the intention for this function? To use as passwords? Other than
> that, it's not obvious to me what that would be used for.

I just noted that several of the examples in the PHP paper appeared to
want to use their own alphabet.  But, since that paper was about
exposing security holes in PHP apps, perhaps that wasn't such a good
idea to begin with ;-)  Fine by me if it's dropped.

From steve at pearwood.info  Mon Sep 21 19:55:40 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 03:55:40 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAPTjJmpfyNVfUHjGgHs-EpRgYOE8KU_0D7dLqVBu69AxT_xrCQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
 <CAPTjJmpfyNVfUHjGgHs-EpRgYOE8KU_0D7dLqVBu69AxT_xrCQ@mail.gmail.com>
Message-ID: <20150921175540.GG31152@ando.pearwood.info>

On Tue, Sep 22, 2015 at 02:50:56AM +1000, Chris Angelico wrote:
> On Tue, Sep 22, 2015 at 2:10 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> > Are there use-cases for a strong random float between 0 and 1? If
> > so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize,
> > or should we offer secrets.random() and/or secrets.uniform(a, b)?
> 
> I would be leery of such a function, because it'd be hard to define it
> perfectly. Tell me, crypto wonks: If I have a function randfloat()
> that returns 0.0 <= x < 1.0, is it safe to use it like this:
> 
> # Generate an integer 0 <= x < 12345, uniformly distributed
> uniform = int(randfloat() * 12345)
> # Ditto but on a logarithmic distribution
> log = math.exp(randfloat() * math.log(12345))
> # Double-logarithmic
> loglog = math.exp(math.exp(randfloat() * math.log(math.log(12345))))

I'm satisfied by Nick's response to you, which also implies an answer to 
my question: there is no good use-case for a strong random float and no 
need for secrets.random().

The main reason I asked is because Ruby's SecureRandom.random_number() 
optionally returns a float between 0 and 1.



-- 
Steve

From steve at pearwood.info  Mon Sep 21 20:05:08 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 04:05:08 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNmNsp=mR_Y4A-e2zHFsAFS=NdaKZKmn_HMVpaoVve-8Hw@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
 <CAExdVNmNsp=mR_Y4A-e2zHFsAFS=NdaKZKmn_HMVpaoVve-8Hw@mail.gmail.com>
Message-ID: <20150921180508.GH31152@ando.pearwood.info>

On Mon, Sep 21, 2015 at 12:51:13PM -0500, Tim Peters wrote:
> [Tim]
> >> ...
> >> No attempt to be minimal here.  More-than-less "obvious" is more important:
> >>
> >> Bound methods of a SystemRandom instance
> >>     .randrange()
> >>     .randint()
> >>     .randbits()
> >>         renamed from .getrandbits()
> >>     .randbelow(exclusive_upper_bound)
> >>         renamed from private ._randbelow()
> >>     .choice()
> 
> [Steven D'Aprano <steve at pearwood.info>]
> > While we're bike-shedding,
> 
> I refuse to bikeshed on this.  I posted a concrete proposal just to
> enrage others into it ;-)  So I'll just sketch my thinking:

Consider me enraged. Hulk smash puny humans!


[...]
> > When would somebody use randbelow(n) rather than randrange(n)?
> 
> For the same reason they'd use randbits(n) instead of randrange(1 <<
> n) ;-)  That is, familiarity and obviousness.

Okay, that makes sense.


> > Are there use-cases for a strong random float between 0 and 1? If
> > so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize,
> > or should we offer secrets.random() and/or secrets.uniform(a, b)?
> 
> I don't know of any "security use" for random floats.  But if you want
> to add a recipe to the docs, point them to SystemRandom.random
> instead.  That gets it right.

Good enough for me.



-- 
Steve

From greg at krypto.org  Mon Sep 21 19:59:54 2015
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 21 Sep 2015 17:59:54 +0000
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CAP7+vJ+VCBsSJ1QheRzaoKViYiFGnJd_JLLhb0kW2r+GZg6Xdg@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <m2vbb4xpge.fsf@fastmail.com> <20150921130813.GZ31152@ando.pearwood.info>
 <1442843138.3321719.389368849.52DEB79B@webmail.messagingengine.com>
 <CAP7+vJ+VCBsSJ1QheRzaoKViYiFGnJd_JLLhb0kW2r+GZg6Xdg@mail.gmail.com>
Message-ID: <CAGE7PNLuuyP+wmvKFMWmwng5LSQXV8qvViKKkLc4-ahA9FBAYQ@mail.gmail.com>

I think people should stick with *from __future__ import
absolute_import* regardless
of what code they are writing. They will eventually create a file
innocuously called something like calendar.py (the same name as a standard
library module) in the same directory as their main binary and their
debugging of the mysterious failures they just started getting from the
tarfile module will suddenly require leveling up to be able to figure it
out. ;)

-gps

On Mon, Sep 21, 2015 at 8:18 AM Guido van Rossum <guido at python.org> wrote:

> It's just about these four imports, right?
>
>
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals
>
> I think the case is overblown.
>
> - absolute_import is rarely an issue; the only thing it does (despite the
> name) is give an error message when you attempt a relative import without
> using a "." in the import. A linter can find this easily for you, and a
> little discipline plus the right example can do a lot of good here.
>
> - division is important.
>
> - print_function is important.
>
> - unicode_literals is useless IMO. It breaks some things (yes there are
> still APIs that don't take unicode in 2.7) and it doesn't do nearly as much as
> what would be useful -- e.g. repr() and <stream>.readline() still return
> 8-bit strings. I recommend just using u-literals and abandoning Python 3.2.
>
> --
> --Guido van Rossum (python.org/~guido)

From abarnert at yahoo.com  Mon Sep 21 21:21:49 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 21 Sep 2015 12:21:49 -0700
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNmNsp=mR_Y4A-e2zHFsAFS=NdaKZKmn_HMVpaoVve-8Hw@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
 <CAExdVNmNsp=mR_Y4A-e2zHFsAFS=NdaKZKmn_HMVpaoVve-8Hw@mail.gmail.com>
Message-ID: <59D14571-7711-46DF-8ADF-817DA543E6CE@yahoo.com>

On Sep 21, 2015, at 10:51, Tim Peters <tim.peters at gmail.com> wrote:

>> When would somebody use randbelow(n) rather than randrange(n)?
> 
> For the same reason they'd use randbits(n) instead of randrange(1 <<
> n) ;-)  That is, familiarity and obviousness.  randrange() has a
> complicated signature, with 1 to 3 arguments, and endlessly surprises
> newbies who _expect_, e.g., randrange(3) to return 3 at times.  That's
> why randint() was created.

Anyone who gets confused by randrange(3) also gets confused by range(3), and they have to learn pretty quickly.

Also, randint wasn't created to allow people to put off learning that fact. It was created before randrange, because Adrian Baddeley didn't realize that Python consistently used half-open ranges, and Guido didn't notice. After 1.5 was out and someone complained that choice(range(...)) was inefficient, Guido added randrange. See the commit comment (61464037da53) which says "This addresses the problem that randint() was accidentally defined as taking an inclusive range (how unpythonic)". Also, some guy named Tim Peters convinced Guido that randint(0, 2.5) was surprisingly broken, so if he wasn't going to remove it he should reimplement it as randrange(a, b+1), which would give a clear error message. Later still (3.0), there was another discussion on removing randint, but the decision was to keep it as a "legacy alias", and change the docs to reflect that.

I suppose randbelow could be implemented as an alias to randrange(a), or it could copy and paste the same type checks as randrange, but honestly, I don't think anyone needs it.

From storchaka at gmail.com  Mon Sep 21 22:12:28 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 21 Sep 2015 23:12:28 +0300
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150921162226.GE31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
Message-ID: <mtpobd$i5s$1@ger.gmane.org>

On 21.09.15 19:22, Steven D'Aprano wrote:
> On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:
>> On 20.09.15 02:40, Tim Peters wrote:
>>> No attempt to be minimal here.  More-than-less "obvious" is more important:
>>>
>>> Bound methods of a SystemRandom instance
>>>      .randrange()
>>>      .randint()
>>>      .randbits()
>>>          renamed from .getrandbits()
>>>      .randbelow(exclusive_upper_bound)
>>>          renamed from private ._randbelow()
>>>      .choice()
>>
>> randbelow() is just an alias for randrange() with single argument.
>> randint(a, b) == randrange(a, b+1).
>>
>> These functions are redundant and they have non-zero cost.
>
> But they already exist in the random module, so adding them to secrets
> doesn't cost anything extra.

The main cost is the cost of learning and memorising. The fewer words you 
need to learn and keep in memory, the better.

>> Wouldn't renaming getrandbits cause confusion?
>>
>>>   Token functions
>>>      .token_bytes(nbytes)
>>>          another name for os.urandom()
>>>      .token_hex(nbytes)
>>>          same, but return string of ASCII hex digits
>>>      .token_url(nbytes)
>>>          same, but return URL-safe base64-encoded ASCII
>>>      .token_alpha(alphabet, nchars)
>>>          string of `nchars` characters drawn uniformly
>>>          from `alphabet`
>>
>> token_hex(nbytes) == token_alpha('0123456789abcdef', nchars) ?
>> token_url(nbytes) == token_alpha(
>>      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
>>       nchars) ?
>
> They may be reasonable implementations for the functions, but simple as
> they are, I think we still want to provide them as named functions
> rather than expect the user to write things like the above. If they're
> doing it more than once, they'll want to write a helper function, we
> might as well provide that for them.

But why are these particular alphabets special? I expect that every 
application will use the alphabet that matches its needs. One needs 
decimal digits ('0123456789'), another needs English letters 
('ABCDEFGHIJKLMNOPQRSTUVWXYZ'), or letters and digits and underscore, or 
letters, digits and punctuation, or all safe ASCII characters, or all 
visually well-distinguished characters. Why token_hex and token_url, 
but not token_digits, token_letters, token_identifier, token_base32, 
token_base85, token_html_safe, etc?
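
For reference, the token_alpha(alphabet, nchars) idea from the quoted
proposal could be sketched as below.  This is only an illustration: the
entropy_bits() helper is hypothetical, and the sketch assumes the
alphabet contains no repeated characters.

```python
import math
import random
import string

# Assumption: draws come from a module-private SystemRandom instance.
_sysrand = random.SystemRandom()


def token_alpha(alphabet, nchars):
    """Return nchars characters drawn uniformly from alphabet."""
    return "".join(_sysrand.choice(alphabet) for _ in range(nchars))


def entropy_bits(alphabet, nchars):
    """Bits of entropy in such a token (hypothetical helper).

    Assumes every character in alphabet is distinct.
    """
    return nchars * math.log2(len(alphabet))


pin = token_alpha(string.digits, 6)                    # six decimal digits
ident = token_alpha(string.ascii_letters + "_", 12)    # letters and underscore
print(entropy_bits("0123456789abcdef", 32))            # 128.0
```

With a caller-supplied alphabet, the entropy depends on both arguments,
which is harder to reason about than a plain byte count.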


From storchaka at gmail.com  Mon Sep 21 22:16:52 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 21 Sep 2015 23:16:52 +0300
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <mtpb8l$omo$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
 <mtpb8l$omo$1@ger.gmane.org>
Message-ID: <mtpojl$lu2$1@ger.gmane.org>

On 21.09.15 19:29, Robert Kern wrote:
> On 2015-09-21 17:22, Steven D'Aprano wrote:
>> On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:
>>> On 20.09.15 02:40, Tim Peters wrote:
>
>>>>   Token functions
>>>>      .token_bytes(nbytes)
>>>>          another name for os.urandom()
>>>>      .token_hex(nbytes)
>>>>          same, but return string of ASCII hex digits
>>>>      .token_url(nbytes)
>>>>          same, but return URL-safe base64-encoded ASCII
>>>>      .token_alpha(alphabet, nchars)
>>>>          string of `nchars` characters drawn uniformly
>>>>          from `alphabet`
>>>
>>> token_hex(nbytes) == token_alpha('0123456789abcdef', nchars) ?
>>> token_url(nbytes) == token_alpha(
>>>      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
>>>       nchars) ?
>>
>> They may be reasonable implementations for the functions, but simple as
>> they are, I think we still want to provide them as named functions
>> rather than expect the user to write things like the above. If they're
>> doing it more than once, they'll want to write a helper function, we
>> might as well provide that for them.
>
> Actually, I don't think those are the semantics that Tim intended.
> Rather, token_hex(nbytes) would return a string twice as long as nbytes.
> The idea is that you want to get nbytes-worth of random bits, just
> encoded in a common "safe" format. Similarly, token_url(nbytes) would
> get nbytes of random bits then base64-encode it, not just pick nbytes
> characters from a URL-safe list of characters. This makes it easier to
> reason about how much entropy you are actually using.

It looks like the semantics of these functions are not so obvious.

Maybe add a generic function that encodes a sequence of bytes with a 
specified alphabet?


From tjreedy at udel.edu  Mon Sep 21 22:23:03 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Sep 2015 16:23:03 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CACac1F-dVvpTA5YhuSWpFMe+oNsXDas9eE0NfgS=m44_v-theA@mail.gmail.com>
References: <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
 <20150921144130.GB31152@ando.pearwood.info>
 <CACac1F-dVvpTA5YhuSWpFMe+oNsXDas9eE0NfgS=m44_v-theA@mail.gmail.com>
Message-ID: <mtpova$rmn$1@ger.gmane.org>

On 9/21/2015 10:56 AM, Paul Moore wrote:

> By the way, in your example you're passing on the "none or useful"
> property by making substr be either the matched value or None.

I agree that dealing with None immediately is better.

> In real
> life, I'd probably do something more like
>
> mo = re.match(needle, haystack)
> if mo:
>      process(mo.group())
> else:
>      no_needle()

try:
     process(re.match(needle, haystack).group())
except AttributeError:  # no match
     no_needle()

is equivalent unless process can also raise AttributeError.
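
The caveat matters in practice.  The sketch below wraps both spellings
in functions (process and no_needle are passed in as hypothetical
callables, and return values are added purely for demonstration) and
shows what happens when process itself raises AttributeError.

```python
import re


def find_with_if(needle, haystack, process, no_needle):
    # Explicit test of the match object.
    mo = re.match(needle, haystack)
    if mo:
        return process(mo.group())
    return no_needle()


def find_with_try(needle, haystack, process, no_needle):
    # Relies on None.group raising AttributeError when there is no match.
    try:
        return process(re.match(needle, haystack).group())
    except AttributeError:  # no match, or a bug inside process
        return no_needle()


def buggy_process(text):
    # Hypothetical bug: str has no such method, so this raises AttributeError.
    return text.no_such_method()


# The two forms agree when process behaves.
assert find_with_if("a+", "aab", str.upper, lambda: None) == "AA"
assert find_with_try("a+", "aab", str.upper, lambda: None) == "AA"

# With a buggy process, the try form reports "no match" although the
# pattern did match; the if form lets the AttributeError propagate.
print(find_with_try("a+", "aab", buggy_process, lambda: "no match"))
```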


-- 
Terry Jan Reedy


From random832 at fastmail.com  Mon Sep 21 22:33:10 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 21 Sep 2015 16:33:10 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
 Library
In-Reply-To: <mtpobd$i5s$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
 <mtpobd$i5s$1@ger.gmane.org>
Message-ID: <1442867590.3414435.389778129.2688BF5A@webmail.messagingengine.com>

On Mon, Sep 21, 2015, at 16:12, Serhiy Storchaka wrote:
> But why are these particular alphabets special? I expect that every 
> application will use the alphabet that matches its needs. One needs 
> decimal digits ('0123456789'), another needs English letters 
> ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'), or letters and digits and underscore, or 
> letters, digits and punctuation, or all safe ASCII characters, or all 
> visually well-distinguished characters. Why token_hex and token_url, 
> but not token_digits, token_letters, token_identifier, token_base32, 
> token_base85, token_html_safe, etc?

Well, for one thing, they're trivial encodings of random bits, which is
why passing in nbytes (number of random bytes) makes sense. Someone else
pointed out that this makes it easier to reason about the amount of
entropy involved. Token_base64 could actually, in principle, return a
string with padding at the end according to base64 rules, if you ask for
a number of bytes that is not a multiple of three. Base85 could likewise,
for that matter, but base85 is a less common encoding.
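
A minimal sketch of the nbytes-based semantics, assuming the three
token functions are thin wrappers around os.urandom() as described in
the quoted proposal (whether padding should be kept or stripped is left
open here):

```python
import base64
import binascii
import os


def token_bytes(nbytes):
    """Return nbytes random bytes from the OS."""
    return os.urandom(nbytes)


def token_hex(nbytes):
    """Same entropy as token_bytes(nbytes), as ASCII hex digits."""
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_url(nbytes):
    """Same entropy, as URL-safe base64 text (padding kept in this sketch)."""
    return base64.urlsafe_b64encode(token_bytes(nbytes)).decode("ascii")


print(len(token_hex(16)))   # 32: two hex digits per random byte
print(len(token_url(15)))   # 20: no padding needed
print(token_url(16)[-2:])   # '==': padding appears at the end
```

In every case the caller asks for an amount of randomness, and the
length of the resulting text follows from the encoding.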

From tjreedy at udel.edu  Mon Sep 21 22:44:18 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Sep 2015 16:44:18 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtp7mo$rgh$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
Message-ID: <mtpq75$f73$1@ger.gmane.org>

On 9/21/2015 11:28 AM, Ron Adam wrote:

> We could add a "not None" specific boolean operators just by appending !
> to them.
>
>      while! x:    <-->   while x != None:
>      if! x:       <-->   if x != None:
>
>      a or! b      <-->   b if a != None else a
>      a and! b     <-->   a if a != None else b
>      not! x       <-->   x if x != None else None

'!= None' should be 'is not None' in all examples. Since 'is not None' 
is a property of the object, I think any abbreviation should be 
applied to the object, not the operator: "while x!", etcetera.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Mon Sep 21 23:23:42 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Sep 2015 17:23:42 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
Message-ID: <mtpsh2$j5m$1@ger.gmane.org>

On 9/21/2015 1:07 PM, Guido van Rossum wrote:

> I apologize for having misunderstood the status of your PEP. I think it
> would be great if you finished the PEP. As you know the ? operator has
> its share of fans as well as detractors, and I will happily wait until
> more of a consensus appears.

Add me to the detractors of what I have read so far ;-).

In arithmetic, 1/0 and 0/0 both stop the calculation.  My hand 
calculator literally freezes until I hit 'on' or 'all clear'.  Early 
computers also stopped, maybe with an instruction address and core dump. 
  Three orthogonal solutions are: test y before x/y, so one can do 
something else; introduce catchable exceptions, so one can do something 
else; introduce contagious special objects ('inf' and 'nan'), which at 
some point can be tested for, so one can do something else.  Python 
introduced 'inf' and 'nan' but did not use them to replace 
ZeroDivisionError.

Some languages lacking exceptions introduce a contagious null object. 
Call it Bottom.  Any operation on Bottom yields Bottom.  Python is not 
such a language. None is anti-contagious; most operations raise an 
exception.

I agree with Paul Moore that propagating None is generally a bad idea. 
It merely avoids the inevitable exception.  Or is it inevitable? Trying 
to avoid exceptions naturally leads to the hypergeneralization of 
allowing '?' everywhere.

Instead of trying to turn None into Bottom, I think a better solution 
would be a new, contagious, singleton Bottom object with every possible 
special method, all returning Bottom. Anyone could write such for their 
own use.  Someone could put it on PyPI to see how useful it 
would be.

I agree with Ron Adam that the narrow issue is that bool(x) is False is 
sometimes too broad and people dislike spelling out 'x is not None'. 
So abbreviate that with a unary operator; 'is not None' is a property 
of objects, not operators. I think 'x!' or 'x?', either meaning 'x is 
not None', might be better than a new binary operator. The former, x!, 
re-uses ! in something close to its normal meaning: x really exists.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Mon Sep 21 23:28:54 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Sep 2015 17:28:54 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150921162226.GE31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
Message-ID: <mtpsqp$j5m$2@ger.gmane.org>

On 9/21/2015 12:22 PM, Steven D'Aprano wrote:
> On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:
>> On 20.09.15 02:40, Tim Peters wrote:
>>> No attempt to be minimal here.  More-than-less "obvious" is more important:
>>>
>>> Bound methods of a SystemRandom instance
>>>      .randrange()
>>>      .randint()
>>>      .randbits()
>>>          renamed from .getrandbits()
>>>      .randbelow(exclusive_upper_bound)
>>>          renamed from private ._randbelow()
>>>      .choice()
>>
>> randbelow() is just an alias for randrange() with single argument.
>> randint(a, b) == randrange(a, b+1).
>>
>> These functions are redundant and they have non-zero cost.
>
> But they already exist in the random module, so adding them to secrets
> doesn't cost anything extra. It's just a reference to the bound method
> of the private SystemRandom() instance:
>
> # suggested implementation
> import random
> _systemrandom = random.SystemRandom()
>
> randint = _systemrandom.randint
> randrange = _systemrandom.randrange
>
> etc.
>
>
>> Wouldn't renaming getrandbits cause confusion?
>>
>>>   Token functions
>>>      .token_bytes(nbytes)
>>>          another name for os.urandom()
>>>      .token_hex(nbytes)
>>>          same, but return string of ASCII hex digits
>>>      .token_url(nbytes)
>>>          same, but return URL-safe base64-encoded ASCII
>>>      .token_alpha(alphabet, nchars)
>>>          string of `nchars` characters drawn uniformly
>>>          from `alphabet`
>>
>> token_hex(nbytes) == token_alpha('0123456789abcdef', nchars) ?
>> token_url(nbytes) == token_alpha(
>>      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
>>       nchars) ?
>
> They may be reasonable implementations for the functions, but simple as
> they are, I think we still want to provide them as named functions
> rather than expect the user to write things like the above. If they're
> doing it more than once, they'll want to write a helper function, we
> might as well provide that for them.
>
>


-- 
Terry Jan Reedy


From tjreedy at udel.edu  Mon Sep 21 23:32:44 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Sep 2015 17:32:44 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150921162226.GE31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
Message-ID: <mtpt1v$rdc$1@ger.gmane.org>

On 9/21/2015 12:22 PM, Steven D'Aprano wrote:
> On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:

>> randbelow() is just an alias for randrange() with single argument.
>> randint(a, b) == randrange(a, b+1).
>>
>> These functions are redundant and they have non-zero cost.
>
> But they already exist in the random module, so adding them to secrets
> doesn't cost anything extra.

I think the redundancy in random is a mistake.  The cost is confusion 
and extra memory load, and the need to refer to the manual more often, 
for essentially zero gain.  When I read two names, I expect them to do 
two different things.  The question is whether to propagate the mistake 
to a new module.

-- 
Terry Jan Reedy


From guido at python.org  Mon Sep 21 23:48:38 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Sep 2015 14:48:38 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtpsh2$j5m$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
Message-ID: <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>

On Mon, Sep 21, 2015 at 2:23 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> Add me to the detractors of what I have read so far ;-).
>
> In arithmetic, 1/0 and 0/0 both stop the calculation.  My hand calculator
> literally freezes until I hit 'on' or 'all clear'.  Early computers also
> stopped, maybe with an instruction address and core dump.  Three orthogonal
> solutions are: test y before x/y, so one can do something else; introduce
> catchable exceptions, so one can do something else; introduce contagious
> special objects ('inf' and 'nan'), which at some point can be tested for,
> so one can do something else.  Python introduced 'inf' and 'nan' but did
> not use them to replace ZeroDivisionError.
>
> Some languages lacking exceptions introduce a contagious null object. Call
> it Bottom.  Any operation on Bottom yields Bottom.  Python is not such a
> language. None is anti-contagious; most operations raise an exception.
>
> I agree with Paul Moore that propagating None is generally a bad idea. It
> merely avoids the inevitable exception.  Or is it inevitable? Trying to
> avoid exceptions naturally leads to the hypergeneralization of allowing '?'
> everywhere.
>
> Instead of trying to turn None into Bottom, I think a better solution
> would be a new, contagious, singleton Bottom object with every possible
> special method, all returning Bottom. Anyone could write such for their own
> use.  Someone could put it on PyPI to see how useful it would be.
>

I think this is the PyMaybe solution. What I don't like about it is that it
is dynamic -- when used incorrectly (or even correctly?) Bottom could end
up being passed into code that doesn't expect it. That's bad -- "if x is
None" returns False when x is Bottom, so code that isn't prepared for
Bottom may well misbehave. In contrast, PEP 505 only affects code that is
lexically near the ? operator.

(You may see a trend here. PEP 498 is also carefully designed to be
locally-scoped.)


> I agree with Ron Adam that the narrow issue is that bool(x) is False is
> sometimes too broad and people dislike spelling out 'x is not None'. So
> abbreviate that with a unary operator; 'is not None' is a property of
> objects, not operators. I think 'x!' or 'x?', either meaning 'x is not
> None', might be better than a new binary operator. The former, x!, re-uses
> ! in something close to its normal meaning: x really exists.


I don't think the big issue is bool(x) being too broad. That's what the
binary ?? operator is trying to fix, but to me the more useful operators
are x?.y and x?[y], both of which would still require repetition of the
part on the left when spelled using ??.

This is important when x is a more complex expression that is either
expensive or has a side-effect. E.g. d.get(key)?.upper() would currently
have to be spelled as (some variant of) "None if d.get(key) is None else
d.get(key).upper()" and the ?? operator doesn't really help for the
repetition -- it would still be "d.get(key) ?? d.get(key).upper()".
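
The repetition can be made visible with a small sketch.  CountingDict
and the two helper functions are hypothetical names used only for this
illustration; the first helper is the spelling quoted above, the second
is the local-variable workaround.

```python
class CountingDict(dict):
    """dict that counts get() calls (illustrative helper)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.get_calls = 0

    def get(self, key, default=None):
        self.get_calls += 1
        return super().get(key, default)


def upper_repeated(d, key):
    # Today's one-line spelling: d.get(key) is evaluated twice on a hit.
    return None if d.get(key) is None else d.get(key).upper()


def upper_local(d, key):
    # Avoids the repetition at the price of an extra local variable.
    value = d.get(key)
    return None if value is None else value.upper()


d = CountingDict(name="guido")
print(upper_repeated(d, "name"), d.get_calls)   # GUIDO 2

d.get_calls = 0
print(upper_local(d, "name"), d.get_calls)      # GUIDO 1
```

If the left-hand expression were expensive or had side effects, the
double evaluation in the first form would be more than a style problem.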

In general to avoid this repetition you have to introduce a local variable,
but that's often awkward and interrupts the programmer's "flow". The ?
solves that nicely. The key issue with this proposal to me is how it
affects readability of code that uses it, given that there isn't much
uniformity across languages in what ? means -- it could be part of a method
name indicating a Boolean return value (Ruby) or a conditional operator (C
and most of its descendants) or some kind of shortcut.

So this is the issue I have to deal with (and thought I had dealt with by
prematurely rejecting the PEP, but I've had a change of heart and am now
waiting for the PEP to be finished).

-- 
--Guido van Rossum (python.org/~guido)

From chris.barker at noaa.gov  Tue Sep 22 00:03:41 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 21 Sep 2015 15:03:41 -0700
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CAP1=2W5_UfGmoyKKRmwLtt0mb489GDZGr4k0hQ2p_T7=L-7fWw@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <CAP1=2W5_UfGmoyKKRmwLtt0mb489GDZGr4k0hQ2p_T7=L-7fWw@mail.gmail.com>
Message-ID: <CALGmxEK1VZy0Umt53wPJD9Rro_gJAPwxt7Oyn6zs4xK23Nj+jg@mail.gmail.com>

On Sat, Sep 19, 2015 at 11:21 AM, Brett Cannon <brett at python.org> wrote:


> It would be nice to have a:
>>
>> from __future__ import py3
>>
>>

> While in hindsight having a python3 __future__ statement that just turned
>> on everything would be handy, this runs the risk of breaking code by
>> introducing something that only works in a bugfix release and we went down
>> that route with booleans in 2.2.1 and came to regret it.
>>
>
That may well kill the idea then, yes.

Guido wrote:

> It's just about these four imports, right?
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals


yup, but that is enough to have to remember and type...

and will there be more? probably not, but .....

But you are right, if we can reduce that to a couple, maybe it's a smaller deal.


> I think the case is overblown.
> - absolute_import is rarely an issue; the only thing it does (despite the
> name) is give an error message when you attempt a relative import without
> using a "." in the import. A linter can find this easily for you, and a
> little discipline plus the right example can do a lot of good here.
>

sure -- but this one is more for the learners -- these things are confusing
-- and getting something that works in one version of python but not
another will be more confusing, still.

And much as I wish everyone would use a good linter....


> - division is important.
> - print_function is important.
>

so maybe those two are enough....


> - unicode_literals is useless IMO. It breaks some things (yes there are
> still APIs that don't take unicode in 2.7) and it doesn't do nearly as much as
> what would be useful -- e.g. repr() and <stream>.readline() still return
> 8-bit strings. I recommend just using u-literals and abandoning Python 3.2.


hmm -- I find myself doing an unholy mess of u"" and  "". And I tried
teaching an intro class where I used u"" everywhere -- the students were
pretty confused about why they were typing all those u-s, particularly
since it didn't seem to make any difference.

Sure there is breakage, but there is breakage on some of these between py2
and py3 anyway -- APIs that return py2 strings on py2... So unicode_literals is
still useful to me.

But Brett was probably right -- something minor but useful that only works
on the very latest bug-fix release is probably an attractive nuisance more
than anything else.

-Chris





-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From tim.peters at gmail.com  Tue Sep 22 00:08:42 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 21 Sep 2015 17:08:42 -0500
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <59D14571-7711-46DF-8ADF-817DA543E6CE@yahoo.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <20150921161059.GC31152@ando.pearwood.info>
 <CAExdVNmNsp=mR_Y4A-e2zHFsAFS=NdaKZKmn_HMVpaoVve-8Hw@mail.gmail.com>
 <59D14571-7711-46DF-8ADF-817DA543E6CE@yahoo.com>
Message-ID: <CAExdVNn5T=OXXd1Sp2-xdGzHd0BTGomHr43U-9Gkd16RVSCMqg@mail.gmail.com>

[Steven]
>>> When would somebody use randbelow(n) rather than randrange(n)?

[Tim]
>> For the same reason they'd use randbits(n) instead of randrange(1 <<
>> n) ;-)  That is, familiarity and obviousness.  randrange() has a
>> complicated signature, with 1 to 3 arguments, and endlessly surprises
>> newbies who _expect_, e.g., randrange(3) to return 3 at times.  That's
>> why randint() was created.

[Andrew Barnert <abarnert at yahoo.com>]
> Anyone who gets confused by randrange(3) also gets
> confused by range(3),

True!

> and they have to learn pretty quickly.

And they do.  And then, in a rush, they slip up.


> Also, randint wasn't created to allow people to put off
> learning that fact. It was created before randrange,
> because Adrian Baddeley didn't realize that Python consistently
> used half-open ranges, and Guido didn't notice. After 1.5 was
> out and someone complained that choice(range(...)) was
> inefficient, Guido added randrange. See the commit comment
> (61464037da53) which says "This addresses the problem that
> randint() was accidentally defined as taking an inclusive range
> (how unpythonic)". Also, some guy named Tim Peters
> convinced Guido that randint(0, 2.5) was surprisingly broken,
> so if he wasn't going to remove it he should reimplement it as
> randrange(a, b+1), which would give a clear error message.

Goodness - you seem to believe there's virtue in remembering things in
the order they actually happened.  Hmm.  I'll try that sometime, but
I'm dubious ;-)


> ...
> I suppose randbelow could be implemented as an alias to
> randrange(a), or it could copy and paste the same type
> checks as randrange,

randbelow() is already implemented, in current Pythons, although as a
class-private method (Random._randbelow()).  It's randrange() that's
implemented by calling ._randbelow() now.  To expose it on its own, it
should grow a check that its argument is an integer > 0 (as a private
method, it currently assumes it won't be called with an insane
argument).
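
A rough sketch of what such a public function might look like. This is
an illustration only: it wraps random.SystemRandom().randrange() rather
than the private Random._randbelow(), and the exact checks and error
messages are assumptions:

```python
import random

# OS-backed generator; chosen here only because the thread is about secrets.
_sysrand = random.SystemRandom()


def randbelow(n):
    """Return a random int in range(n), i.e. 0 <= result < n."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError("n must be an integer")
    if n <= 0:
        raise ValueError("n must be greater than 0")
    return _sysrand.randrange(n)
```

With these checks, randbelow(3) can only return 0, 1 or 2, while
randbelow(0) and randbelow(2.5) fail immediately with a clear error.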


> but honestly, I don't think anyone needs it.

Of the four {randbelow, randint, randrange, randbits}, any can be
implemented via any of the other three.  You chopped what I considered
to be "the real" point:

>> "randbelow(n)" has a dirt-simple signature, and its name makes
>> it hard to mistakenly believe `n` is a possible return value.

That's what gives it value.  Indeed, if minimality crusaders are
determined to root out redundancy, randbelow is the only one of the
four I'd keep.

From tjreedy at udel.edu  Tue Sep 22 00:45:04 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Sep 2015 18:45:04 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
Message-ID: <mtq19j$reo$1@ger.gmane.org>

On 9/21/2015 5:48 PM, Guido van Rossum wrote:
> On Mon, Sep 21, 2015 at 2:23 PM, Terry Reedy
> <tjreedy at udel.edu
> <mailto:tjreedy at udel.edu>> wrote:

>     I agree with Paul Moore that propagating None is generally a bad
>     idea. It merely avoids the inevitable exception.

To me, this is the key idea in opposition to proposals that make 
propagating None easier.

> I don't think the big issue is bool(x) being too broad. That's what the
> binary ?? operator is trying to fix, but to me the more useful operators
> are x?.y and x?[y], both of which would still require repetition of the
> part on the left when spelled using ??.
>
> This is important when x is a more complex expression that is either
> expensive or has a side-effect. E.g. d.get(key)?.upper() would currently
> have to be spelled as (some variant of)
> "None if d.get(key) is None else d.get(key).upper()"
> and the ?? operator doesn't really help for the
> repetition -- it would still be "d.get(key) ?? d.get(key).upper()".
>
> In general to avoid this repetition you have to introduce a local
> variable, but that's often awkward and interrupts the programmer's
> "flow".

try:
     x = d.get(key).upper()
except AttributeError:
     x = None

is also a no-repeat equivalent when d.values are all strings.  I agree 
that "x = d.get(key)?.upper()" is a plausible abbreviation.  But I am 
much more likely to want "x = ''" or another exception as the 
alternative.  I guess some other pythonistas like keeping None around 
more than I do ;-).

-- 
Terry Jan Reedy


From guido at python.org  Tue Sep 22 00:54:32 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Sep 2015 15:54:32 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtq19j$reo$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <mtq19j$reo$1@ger.gmane.org>
Message-ID: <CAP7+vJ+PmV=vGAccXBubtUuSiCyZrnwnqYjWZKm91u0zpU1p2g@mail.gmail.com>

On Mon, Sep 21, 2015 at 3:45 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 9/21/2015 5:48 PM, Guido van Rossum wrote:
>
>> On Mon, Sep 21, 2015 at 2:23 PM, Terry Reedy
>> <tjreedy at udel.edu
>> <mailto:tjreedy at udel.edu>> wrote:
>>
>
>     I agree with Paul Moore that propagating None is generally a bad
>>     idea. It merely avoids the inevitable exception.
>>
>
> To me, this is the key idea in opposition to proposals that make
> propagating None easier.
>

(I didn't write that, you [Terry] did. It looks like our mailers don't
understand each other's quoting conventions. :-( )


> I don't think the big issue is bool(x) being too broad. That's what the
>> binary ?? operator is trying to fix, but to me the more useful operators
>> are x?.y and x?[y], both of which would still require repetition of the
>> part on the left when spelled using ??.
>>
>> This is important when x is a more complex expression that is either
>> expensive or has a side-effect. E.g. d.get(key)?.upper() would currently
>> have to be spelled as (some variant of)
>>
>      > "None if d.get(key) is None else d.get(key).upper()"
>      > and the ?? operator doesn't really help for the
>
>> repetition -- it would still be "d.get(key) ?? d.get(key).upper()".
>>
>> In general to avoid this repetition you have to introduce a local
>> variable, but that's often awkward and interrupts the programmer's
>> "flow".
>>
>
> try:
>     x = d.get(key).upper()
> except AttributeError:
>     x = None
>
> is also a no-repeat equivalent when d.values are all strings.  I agree
> that "x = d.get(key)?.upper()" is a plausible abbreviation.  But I am much
> more likely to want "x = ''" or another exception as the alternative.  I
> guess some other pythonistas like keeping None around more than I do ;-).
>

Eew. That try/except is not only very distracting, interrupting the flow
of both the writer and the reader, it may also catch unrelated errors, e.g.
what if the method being called raises an AttributeError of its own (not a
problem with upper(), but definitely with user-defined methods).
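
A small sketch of that failure mode. Profile, display_name() and the two
helper functions are hypothetical names made up for this illustration:

```python
class Profile:
    """Hypothetical user-defined class with a bug inside a method."""

    def display_name(self):
        # Bug: 'nickname' is never set, so this raises AttributeError.
        return self.nickname.title()


def name_with_try(d, key):
    # The try/except spelling: also swallows the bug inside display_name().
    try:
        return d.get(key).display_name()
    except AttributeError:
        return None


def name_with_check(d, key):
    # The explicit None check: only a missing key yields None.
    obj = d.get(key)
    return None if obj is None else obj.display_name()


profiles = {"alice": Profile()}

hidden = name_with_try(profiles, "alice")   # bug silently turned into None
missing = name_with_try(profiles, "bob")    # genuinely missing key
```

Both calls return None, so the caller cannot tell a missing key from a
broken method; name_with_check(profiles, "alice") lets the
AttributeError surface instead.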

-- 
--Guido van Rossum (python.org/~guido)

From carl at oddbird.net  Tue Sep 22 00:56:03 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 21 Sep 2015 16:56:03 -0600
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtq19j$reo$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <mtq19j$reo$1@ger.gmane.org>
Message-ID: <56008B03.6020704@oddbird.net>

On 09/21/2015 04:45 PM, Terry Reedy wrote:
> On 9/21/2015 5:48 PM, Guido van Rossum wrote:
>> On Mon, Sep 21, 2015 at 2:23 PM, Terry Reedy
>> <tjreedy at udel.edu
>> <mailto:tjreedy at udel.edu>> wrote:
> 
>>     I agree with Paul Moore that propagating None is generally a bad
>>     idea. It merely avoids the inevitable exception.
> 
> To me, this is the key idea in opposition to proposals that make
> propagating None easier.
[...]
> I guess some other pythonistas like keeping None around
> more than I do ;-).

I think it's one of those things that depends on what you're doing. From
a web-development perspective, you rarely keep _anything_ around for
very long, so there's rarely an issue of `None` sneaking in somewhere
unexpectedly and then causing a surprise exception way down the line.
Typical use cases are things like: "If this database query returns a
User, I want to get their name and return that in the JSON dict from my
API, otherwise I want None, which will be serialized to a JSON null,
clearly indicating that there is no user here."
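
That use case might look roughly like this. User, USERS, find_user() and
user_payload() are hypothetical stand-ins for a real database layer and
API view:

```python
import json


class User:
    """Hypothetical model object."""

    def __init__(self, name):
        self.name = name


# Stand-in for a database table.
USERS = {1: User("carl")}


def find_user(user_id):
    # Returns a User or None, the "useful value or None" idiom.
    return USERS.get(user_id)


def user_payload(user_id):
    user = find_user(user_id)
    # Today's spelling of what user?.name would express.
    name = None if user is None else user.name
    # json.dumps serializes None as JSON null.
    return json.dumps({"name": name})
```

user_payload(1) gives '{"name": "carl"}' and user_payload(2) gives
'{"name": null}'.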

My jaw dropped a bit when I saw it asserted in this thread that
functions returning "useful value or None" is an anti-pattern. I write
functions like that all the time, and I consider it a useful and
necessary Python idiom. I would hate to rewrite all that code to either
deal with exceptions or add default-value-argument boilerplate to all of
them; when "no result" is an expected and normal possibility from a
function, letting the calling code deal with None however it chooses is
much nicer than either of those options.

I don't love the ? syntax, but I would certainly use the feature
discussed here happily and frequently.

Carl


From chris.barker at noaa.gov  Mon Sep 21 23:55:24 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 21 Sep 2015 14:55:24 -0700
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <CADiSq7czJ+c8Jnbh=GE0=-cUCoiCHYgjKr+0-Ch1HiVtdgEP8w@mail.gmail.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <CADiSq7czJ+c8Jnbh=GE0=-cUCoiCHYgjKr+0-Ch1HiVtdgEP8w@mail.gmail.com>
Message-ID: <CALGmxEJU_JZbpmaPP430xLmEXu_fuHRRgq_G-DVvzLp5=+=w+A@mail.gmail.com>

On Mon, Sep 21, 2015 at 2:29 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> For folks using IPython Notebook, I've been suggesting to various
> folks that a "Python 2/3 compatible" kernel that enables these
> features by default may be desirable.


That would be nice, yes. And not hard to do.

But in a way, I struggle with getting new-to-programming scientists to make
the transition from interactive code in a notebook to a re-usable module --
one more thing that would break when they did that would be too bad.

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From random832 at fastmail.com  Tue Sep 22 01:47:13 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 21 Sep 2015 19:47:13 -0400
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
Message-ID: <1442879233.2295250.389924553.305406F9@webmail.messagingengine.com>

On Mon, Sep 21, 2015, at 17:48, Guido van Rossum wrote:
> This is important when x is a more complex expression that is either
> expensive or has a side-effect. E.g. d.get(key)?.upper() would currently
> have to be spelled as (some variant of) "None if d.get(key) is None else
> d.get(key).upper()" and the ?? operator doesn't really help for the
> repetition -- it would still be "d.get(key) ?? d.get(key).upper()".

?? is meant to use the right if the left *is* null, as I understand it.
So this isn't a problem it solves at all.

From guido at python.org  Tue Sep 22 01:51:50 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Sep 2015 16:51:50 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <1442879233.2295250.389924553.305406F9@webmail.messagingengine.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <1442879233.2295250.389924553.305406F9@webmail.messagingengine.com>
Message-ID: <CAP7+vJJGzUdbGkckM6B7_-vqQDRoPNSkU=2=PKRP4W4OvwGQMQ@mail.gmail.com>

On Mon, Sep 21, 2015 at 4:47 PM, Random832 <random832 at fastmail.com> wrote:

> On Mon, Sep 21, 2015, at 17:48, Guido van Rossum wrote:
> > This is important when x is a more complex expression that is either
> > expensive or has a side-effect. E.g. d.get(key)?.upper() would currently
> > have to be spelled as (some variant of) "None if d.get(key) is None else
> > d.get(key).upper()" and the ?? operator doesn't really help for the
> > repetition -- it would still be "d.get(key) ?? d.get(key).upper()".
>
> ?? is meant to use the right if the left *is* null, as I understand it.
> So this isn't a problem it solves at all.
>

Sorry, my bad. Indeed, x ?? y tries to fix the issue that "x or y" uses y
if x is falsey. Still this seems a lesser problem to me than the problem
solved by x?.a and x?[y]. Most of the time in my code it is actually fine
to use the default if the LHS is an empty string.
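
The difference can be sketched with a hypothetical helper, coalesce(),
that mimics what "x ?? y" would do. Unlike a real operator it evaluates
its default eagerly, so it is an illustration only:

```python
def coalesce(x, default):
    """Use default only when x is None (the proposed 'x ?? default')."""
    return default if x is None else x


# "or" falls back to the default for every falsey value...
or_empty = "" or "n/a"
or_zero = 0 or 10

# ...while a None-only test keeps falsey values that are not None.
co_empty = coalesce("", "n/a")
co_zero = coalesce(0, 10)
co_none = coalesce(None, 10)
```

With "or" both the empty string and 0 are replaced by the default; with
the None-only test they are kept, and only None is replaced.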

-- 
--Guido van Rossum (python.org/~guido)

From stephen at xemacs.org  Tue Sep 22 01:56:24 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Sep 2015 08:56:24 +0900
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The
	Standard	Library
In-Reply-To: <20150921174758.GF31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
Message-ID: <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

 > I wouldn't include punctuation [in the password alphabet] by
 > default, as too many places still prohibit some, or all,
 > punctuation characters.

Do you really expect users to choose their own random passwords using
this function?  I would expect that this function would be used for
initial system-generated passwords (or system-enforced random
passwords), and the system would have control over the admissible set.
But users who have to conform to somebody else's rules much prefer
obfuscated passwords that pass strength tests to random passwords in
my experience.

BTW, the last time I had to set a password that didn't allow the full
set of 94 printable ASCII characters, uppercase letters were forbidden
(silently -- it was documented in the help but not on the password
change form, I had no idea why my first three suggestions were
rejected).  Go figure.
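
For illustration, a sketch of a generator where the system controls the
admissible set. random_password() and its parameters are hypothetical,
not the API proposed in the pre-PEP:

```python
import string
from random import SystemRandom

# The 94 printable, non-space ASCII characters: 52 letters, 10 digits,
# 32 punctuation marks.
FULL_ALPHABET = string.ascii_letters + string.digits + string.punctuation

_rng = SystemRandom()


def random_password(length=12, alphabet=FULL_ALPHABET):
    """Return a random password drawn only from the given alphabet."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    return "".join(_rng.choice(alphabet) for _ in range(length))


# A site that silently forbids uppercase letters would pass its own set.
NO_UPPERCASE = string.ascii_lowercase + string.digits + string.punctuation
restricted = random_password(16, NO_UPPERCASE)
```

The caller, not the library default, decides which characters a given
site accepts.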


From bruce at leban.us  Tue Sep 22 02:16:47 2015
From: bruce at leban.us (Bruce Leban)
Date: Mon, 21 Sep 2015 17:16:47 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <56008B03.6020704@oddbird.net>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <mtq19j$reo$1@ger.gmane.org> <56008B03.6020704@oddbird.net>
Message-ID: <CAGu0Ansh=1hOdMCRG=RTW-eC089WMXHMT1okQzk++s2h9UTUzA@mail.gmail.com>

On Mon, Sep 21, 2015 at 3:56 PM, Carl Meyer <carl at oddbird.net> wrote:

>
> My jaw dropped a bit when I saw it asserted in this thread that
> functions returning "useful value or None" is an anti-pattern. I write
> functions like that all the time, and I consider it a useful and
> necessary Python idiom. I would hate to rewrite all that code to either
> deal with exceptions or add default-value-argument boilerplate to all of
> them; when "no result" is an expected and normal possibility from a
> function, letting the calling code deal with None however it chooses is
> much nicer than either of those options.
>

+1

Some language features are "prescriptive," designed to encourage particular
ways of writing things. Others are "respective," recognizing the variety of
ways people write things and respecting that variety. Python has None and
generally respects use of it. To say that using None is an anti-pattern is
something I would strongly disagree with. Yes, NPE errors are a problem,
but eliminating null/None does not eliminate those errors. It merely
replaces one common error with an assortment of other errors.

I like the feature. I have been asking for features like this for years and
the number of times I have written the longer forms is too many to count.

I like the ?.  ?[]  ?()  ?? syntax. I think:

(1) it's strongly related to the . [] () syntax;
(2) any syntax that uses a keyword is either not syntactically related to .
[] () or mixes a keyword and punctuation, both of which I dislike;

(3) it's the same syntax as used in other languages (yes, Python is not C#
or Dart but there's a good reason Python uses ^ for xor, ** for power, +=
for add to, etc.)


--- Bruce

From bussonniermatthias at gmail.com  Tue Sep 22 02:34:43 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Mon, 21 Sep 2015 17:34:43 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
Message-ID: <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>

On Mon, Sep 21, 2015 at 2:48 PM, Guido van Rossum <guido at python.org> wrote:

>
> In general to avoid this repetition you have to introduce a local variable,
> but that's often awkward and interrupts the programmer's "flow". The ?
> solves that nicely. The key issue with this proposal to me is how it affects
> readability of code that uses it, given that there isn't much uniformity
> across languages in what ? means -- it could be part of a method name
> indicating a Boolean return value (Ruby) or a conditional operator (C and
> most of its descendants) or some kind of shortcut.
>
> So this is the issue I have to deal with (and thought I had dealt with by
> prematurely rejecting the PEP, but I've had a change of heart and am now
> waiting for the PEP to be finished).
>

As we are in the process of writing a PEP and surveying uses of ?/?? in other
languages, why not also speak about the current usage of `?` / `??` in the
Python community?

(I'll try to state only facts, excuse any passage that might seem like
a personal opinion)

Can the PEP include the fact that `?` and `??` have been in use in the Scientific
Python community for 10 to 14 years now, and that any Scientific Python user who
has touched IPython will tell you that ? and ?? are for getting help?
(?? tries to pull the source, while ? does not, but let's not get into details.)

This includes the fact that any IDE (like Spyder) which uses IPython under
the hood has this feature.

The usage of `?` is even visible on the Python.org main page [3] (imgur screenshot),
which invites the user to launch an interactive console saying:

> object? -> Details about 'object', use 'object??' for extra details.
> In [1]:

leading the casual user to think that this is a Python feature.

This fact is even included in books and introductions to Python, sometimes
without mentioning that the feature is IPython-specific and does not work in
the Python REPL or in scripts.

For example, Cyrille Rossant's "Learning IPython for Interactive
Computing and Data Visualization"[1]
introduces Python with the second code/REPL example being about `?`.

Book extract :

> Some of these commands let you get some help or information about any
> Python function or object. For instance, have you ever had a doubt about how
> to use the super function to access parent methods in a derived class? Just type
> `super?` and you'll find out. Appending `?` to any command or variable gives you all
> the information you need about it.

> In [1]: super?
> Type: type
> String Form:<type 'super'>
> Namespace: Python builtin
> ...

A Google search also turns up, for example, the
Python for Beginners online tutorial[2], which does much the same:

Tutorial snippet:
> The "?" is very useful. If you type in `?` after a `len?`, you will see the
> documentation about the function len.
> Typing `?` after a name will give you information about the object attached to that name.

It does even worse, as it replaces the IPython prompt `In [x]:` with `>>>`,
literally showing that `>>> len?` works, which implies that it should work
in a plain Python REPL.

As someone who has to regularly teach Python and interact with new Python users,
I will find it hard to explain that `?` and `??` have different meanings
depending on the context, and that most books on Scientific Python are
wrong/inaccurate.

From the current state of the PEP/proposal, I'm guessing we should be able to
distinguish null coalescing operations (or whatever name you want to give them)
in Python 3.6+ from actual help requests, and that this will allow us to keep
backward compatibility with 10+ years of code/user habits, but the result will
most likely be confusing.

It will be even harder if we have to remove the usage of `?`/`??`[4].

I also want to note that the use of `?`/`??` is not limited to the beginning or end
of identifiers, as it can also be used to search for names:

> In [1]: *int*?
> FloatingPointError
> int
> print

But this usage is not as widespread as extracting help about objects,
and seems less relevant, though I'm not sure:

> In [10]: ?Float*Error
> FloatingPointError
>
> In [12]: Uni*Error?
> UnicodeDecodeError
> UnicodeEncodeError
> UnicodeError
> UnicodeTranslateError
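
A comparable wildcard search can be approximated in plain Python. This is only
a sketch: it assumes shell-style matching over the names in `builtins`, uses a
hypothetical helper called `search_names`, and makes no claim about how
IPython performs its own search.

```python
import builtins
import fnmatch


def search_names(pattern, namespace=None):
    """Return sorted names matching a shell-style wildcard pattern.

    By default the names of the builtins module are searched.
    """
    names = dir(builtins) if namespace is None else list(namespace)
    # fnmatchcase is case-sensitive on every platform.
    return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))


print(search_names("Uni*Error"))
print(search_names("Float*Error"))
```

The two calls list the same names as the `In [12]` and `In [10]` examples above.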

Please take these facts into consideration when making a
decision/writing the PEP.

Thanks,
-- 
M

[1]: That's one of the only books for which I (legally) have the
sources, and that I bothered to grep through.
[2]: http://www.pythonforbeginners.com/basics/ipython-a-short-introduction
[3]: http://imgur.com/d0Vs7Xr
[4]: I'll have to hire a bodyguard to prevent people from pursuing me to
the end of the earth with a chainsaw. I'm sure you know that feeling.

From rosuav at gmail.com  Tue Sep 22 02:44:13 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 22 Sep 2015 10:44:13 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
Message-ID: <CAPTjJmr0kPk7TcgTEef+OHfHK1Uuj6PkBrGpzLjZ3tbZdKoTVg@mail.gmail.com>

On Tue, Sep 22, 2015 at 10:34 AM, Matthias Bussonnier
<bussonniermatthias at gmail.com> wrote:
> On Mon, Sep 21, 2015 at 2:48 PM, Guido van Rossum <guido at python.org> wrote:
> As we are in the process of writing a PEP and discussing uses of ?/?? in other languages,
> why not speak about the current usage of `?` / `??` in the Python community?
>
> Can the PEP include the fact that `?` and `??` have been in use in the Scientific
> Python community for 10 to 14 years now, and that any Scientific Python user who
> has touched IPython will tell you that ? and ?? are for getting help?
> (?? tries to pull the source, while ? does not, but let's not get into details.)
>
> This includes the fact that any IDE (like Spyder) which uses IPython under
> the hood has this feature.
>
> I also want to note that the use of `?`/`??` is not limited to the beginning or end
> of identifiers, as it can also be used to search for names:
>
>> In [1]: *int*?
>> FloatingPointError
>> int
>> print
>
> But this usage is not as widespread as extracting help about objects,
> and seems less relevant, though I'm not sure:
>
>> In [10]: ?Float*Error
>> FloatingPointError
>>
>> In [12]: Uni*Error?
>> UnicodeDecodeError
>> UnicodeEncodeError
>> UnicodeError
>> UnicodeTranslateError
>
> Please take these facts into consideration when making a
> decision/writing the PEP.

Are there any uses like this that would involve a question mark
followed by some other punctuation? The main proposal would be for
x?.y and similar; if "x ?? y" can't be used because of a conflict with
ipython, I'm sure it could be changed ("x ?! y" would be cute).

ChrisA

From bussonniermatthias at gmail.com  Tue Sep 22 04:00:26 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Mon, 21 Sep 2015 19:00:26 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmr0kPk7TcgTEef+OHfHK1Uuj6PkBrGpzLjZ3tbZdKoTVg@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <CAPTjJmr0kPk7TcgTEef+OHfHK1Uuj6PkBrGpzLjZ3tbZdKoTVg@mail.gmail.com>
Message-ID: <CANJQusVSHcUo5cYirn53iJiii0FAx1g7PQO_eZip8wBo5AW5Vg@mail.gmail.com>

Hi Chris,

On Mon, Sep 21, 2015 at 5:44 PM, Chris Angelico <rosuav at gmail.com> wrote:

>
> Are there any uses like this that would involve a question mark
> followed by some other punctuation? The main proposal would be for
> x?.y and similar; if "x ?? y" can't be used because of a conflict with
> ipython,


As far as I can tell, no: `?`/`??` behave in IPython like a unary
operator[1] and don't conflict.
We could distinguish them from any use case mentioned so far in this discussion
(as far as I can tell). I just hope that the PEP will not slide in the
direction of `foo?` alone being valid and equivalent to `maybe(foo)`, or
returning an object.

I'm concerned about teaching/semantics, as for me `x?.y` reads like "the
attribute y of the help of x", so roughly `help(x).y`. The `x ?? y` is less
ambiguous (imho), as it is clearly an operator with a space on each side.


> I'm sure it could be changed ("x ?! y" would be cute).

Is that the WTF operator ? [2]

Joking aside, `?!` should not conflict either, but `!` and `!!` also have
their own meaning in IPython.
For example, the following is valid (it is a useless piece of code, but it
shows the principle), if you are interested in the Python syntax extensions we have:


my_files = ! ls ~/*.txt
for i,file in enumerate(my_files):
    raw = !cat $file
    !cat  $raw > {'%s-%s'%(i,file.upper())}

Thanks,

-- 
M

[1]: but it is not really an operator; I'm not sure what `a=print?`, `a =?print`
or `?a=print` would do.
[2]: http://stackoverflow.com/questions/7825055/what-does-the-c-operator-do

From steve at pearwood.info  Tue Sep 22 04:15:46 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 12:15:46 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtpova$rmn$1@ger.gmane.org>
References: <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <CACac1F_O6V5QCoPO1-8Ki4F389O0k3wYyvCEx5mC+kk_8XopGg@mail.gmail.com>
 <20150921144130.GB31152@ando.pearwood.info>
 <CACac1F-dVvpTA5YhuSWpFMe+oNsXDas9eE0NfgS=m44_v-theA@mail.gmail.com>
 <mtpova$rmn$1@ger.gmane.org>
Message-ID: <20150922021546.GK31152@ando.pearwood.info>

On Mon, Sep 21, 2015 at 04:23:03PM -0400, Terry Reedy wrote:

> try:
>     process(re.match(needle, haystack).group())
> except AttributeError:  # no match
>     no_needle()
> 
> is equivalent unless process can also raise AttributeError.

It is difficult to guarantee what exceptions a function will, or won't, 
raise. Even if process() is documented as "only raising X", I wouldn't 
be confident that the above might not disguise a bug in process as "no 
needle".

This is why the standard idiom for using re.match is to capture the 
result first, then test for truthiness (a MatchObject), or None-ness, 
before processing it.
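
A minimal sketch of that idiom, next to the try/except variant, might look like
the following. The function names and the `\w+` pattern are made up for
illustration, and `buggy_process` stands in for a `process()` that has a bug.

```python
import re


def first_word(text):
    """Return the leading run of word characters, or None if there is none."""
    match = re.match(r"\w+", text)
    if match is None:          # no match: handle it explicitly
        return None
    return match.group()       # safe: match is a real match object here


def first_word_risky(text, process):
    """The try/except variant: an AttributeError raised inside `process`
    is indistinguishable from "no match"."""
    try:
        return process(re.match(r"\w+", text).group())
    except AttributeError:
        return None


def buggy_process(s):
    return s.no_such_method()  # a bug: raises AttributeError


print(first_word("needle in a haystack"))          # needle
print(first_word("   "))                           # None
print(first_word_risky("needle", buggy_process))   # None, bug silently hidden
```

In the last call there *is* a match, yet the caller sees the same result as
for "no needle".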


-- 
Steve

From ron3200 at gmail.com  Tue Sep 22 04:18:33 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 21 Sep 2015 21:18:33 -0500
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtpq75$f73$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org> <mtpq75$f73$1@ger.gmane.org>
Message-ID: <mtqdpq$c91$1@ger.gmane.org>

On 09/21/2015 03:44 PM, Terry Reedy wrote:
> On 9/21/2015 11:28 AM, Ron Adam wrote:
>
>> We could add a "not None" specific boolean operators just by appending !
>> to them.
>>
>>      while! x:    <-->   while x != None:
>>      if! x:       <-->   if x != None:
>>
>>      a or! b      <-->   b if a != None else a
>>      a and! b     <-->   a if a != None else b
>>      not! x       <-->   x if x != None else None
>
> '!= None' should be 'is not None' in all examples.

Yes

> Since  'is not None'
> is a property of the object, so I think any abbreviation should be
> applied to the object, not the operator.  "while x!", etcetera

My observation is that because None is the default return value from 
functions (and methods), it is already a special case.  While that isn't 
directly related to None values in general, I think it does lend weight 
to treating it specially.

Having None specific bool-type operators is both cleaner and more 
efficient and avoids issues with false values. The byte code might look 
like this...

 >>> def value_or(x, y):
 ...     return x or! y
 ...
 >>> dis(value_or)
   2           0 LOAD_FAST                0 (x)
               3 JUMP_IF_NONE_OR_POP      9
               6 LOAD_FAST                1 (y)
         >>    9 RETURN_VALUE

It would not be sensitive to False values.
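
To make the "false values" issue concrete in today's Python, here is a small
sketch. It uses the usual "keep the left operand unless it is None" reading,
which is an assumption and may differ from the exact `or!` semantics sketched
above; `coalesce` is a hypothetical helper name.

```python
def coalesce(value, default):
    """Return `value` unless it is None, in which case return `default`."""
    return value if value is not None else default


# Plain `or` replaces every false value; the None check replaces only None.
for x in (None, 0, "", [], False, 42):
    print(repr(x), "->", repr(x or "default"), "vs", repr(coalesce(x, "default")))
```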

Applying the op to the object wouldn't quite work as expected.

What would x! return?  True, or the object if not None?

And if it returns the object, what does this do when the value is 0, 
False, or an empty container?

     result = x! or y      # not the same as   result = x or! y


The maybe(x) function would work the same as x! in this case.


I also think a trailing unary operator is kind of weird.


But I get the feeling from Guido's response about over-generalizations 
that it may be too big of a change, and I agree it is a new concept I 
haven't seen anywhere else.  Maybe one to think about over time, and not 
to rush into.

Cheers,
    Ron



From stephen at xemacs.org  Tue Sep 22 04:34:45 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Sep 2015 11:34:45 +0900
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
Message-ID: <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>

Matthias Bussonnier writes:

 > Can the PEP include the fact that `?` and `??` have been in use in
 > the Scientific Python community for 10 to 14 years now, and that
 > any Scientific Python user who has touched IPython will tell you
 > that ? and ?? are for getting help.

But the syntax is extremely restrictive.  Both "None? 1 + 2" and
"None ?" are SyntaxErrors, as are "a?.attr" and even "*int*? ".
Prefixing the help operator also gives help, and in that case
whitespace may separate the operator from the word (apparently defined
as non-whitespace, and any trailing detritus is ignored).  Perhaps the
prefix form (a little less natural for people coming directly from
natural languages, I guess) should be emphasized -- there are no
proposals for unary prefix use of "?" or "??".

So, this is the kind of DWIM that I doubt will confuse many users, at
least not for very long.  Do you envision a problem keeping IPython
facilities separate from Python language syntax?  Technically, I think
the current rules that divide valid IPython requests for help from
Python syntax (valid or invalid) should continue to work.  Whether it
would confuse users, I doubt, but there are possible surprises for
some (many?) users, I suppose.

Definitely, it should be mentioned in the PEP, but Python syntax is
something that Python defines; shells and language variants have to be
prepared to deal with Python syntax changes.

 > leading the casual user to think that this is a Python feature.

Casual users actually expect software to DWIM in my experience.  The
rule "leading help or stuck-right-on-the-end help works, elsewhere it
means something else" (including no meaning == SyntaxError) is
intuitively understood by them already, I'm sure.  Also, I rather
doubt that "casual users" will encounter "?." or "??" until they're
not so casual anymore.

 > As someone who has to regularly teach Python, and interact with
 > new Python users, it will be hard to explain that `?` and `??` have
 > different meanings depending on the context,

I've never had a question about the context-sensitivity of "%" in
IPython.  Have you?

 > and that most books on Scientific Python are wrong/inaccurate.

I'm afraid that's Scientific Python's cross to bear, not Python's.

 > It will be even harder if we have to remove the usage of
 > `?`/`??`[4].

Not to worry about that.  IPython can define its own syntax for
parsing out help requests vs. Python syntax.  I doubt you'll have to
modify the current rules in any way.

 > I also want to note that the use of `?`/`??` is not limited to the
 > beginning or end of identifiers, as it can also be used to search
 > for names:
 > 
 > > In [1]: *int*?

But again "*int* ?" is a SyntaxError.  This is the kind of thing most
casual users can easily work with.  (At least speakers of American
English.  In email text, my Indian students love to separate trailing
punctuation from the preceding word for some reason, but they would
certainly learn quickly that you can't do that in IPython.)

Again, I agree it would be useful to mention this in the PEP, but as
far as I can see there really isn't a conflict.  The main thing I'd
want to know to convince me there's a risk would be if a lot of users
are confused by "%quickref" (an IPython command) vs. "3 % 2" (a Python
expression).


From steve at pearwood.info  Tue Sep 22 05:15:08 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 13:15:08 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtpsh2$j5m$1@ger.gmane.org>
References: <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
Message-ID: <20150922031507.GL31152@ando.pearwood.info>

On Mon, Sep 21, 2015 at 05:23:42PM -0400, Terry Reedy wrote:

> I agree with Paul Moore that propagating None is generally a bad idea. 

As I understand it, you and Paul are describing a basic, simple idiom 
which is ubiquitous across Python code: using None to stand in for "no 
such value" when the data type normally used doesn't otherwise have 
something suitable. Consequently I really don't understand what you and 
Paul have against it.


> It merely avoids the inevitable exception.

I think you meant to say it merely *postpones* the inevitable exception. 
But that's wrong, there's nothing inevitable about an exception here.

It's not *hard* to deal with "value-or-None". It's just tedious, which 
is why a bit of syntactic sugar may appeal.


[...]
> Instead of trying to turn None into Bottom, I think a better solution 
> would be a new, contagious, singleton Bottom object with every possible 
> special method, all returning Bottom. Anyone could write such for their 
> own use.  Someone could put it on PyPI to see how useful it 
> would be.

In one of my earlier posts, I discussed this Null object design pattern. 
I think it is an anti-pattern. If people want to add one to their own 
code, it's their foot, but I certainly don't want to see it as a 
built-in. Thank goodness Guido has already ruled that out :-)


> I agree with Ron Adam that the narrow issue is that 'bool(x) is False' is 
> sometimes too broad and people dislike spelling out 'x is not None'. 

I don't think that is the motivation of the original proposal, nor is it 
one I particularly care about.

I think that there is a level of inconvenience below which it's not 
worth adding yet more syntax just to save a few characters. That 
inconvenience is not necessarily just to do with the typing, it may be 
conceptual, e.g. we have "x != y" rather than "not x == y". I think that

    x is not None

fails to reach that minimum level of inconvenience to justify syntactic 
sugar, but 

    x.method() if x is not None else None

does exceed the level. So I am mildly interested in null-coalescing 
versions of attribute and item/key lookup, but not at all interested in 
making the "x is not None" part *alone* shorter.


> So abbreviate that with a unary operator; 'is not None', is a property 
> of objects, not operators. I think 'x!' or 'x?', either meaning 'x is 
> not None', might be better than a new binary operator. The former, x!, 
> re-uses ! in something close to its normal meaning: x really exists.

Bring it back to the original post's motivating use-case. Slightly 
paraphrased, it was something like:

    value = obj.method() if obj is not None else None

Having x! as a short-cut for "x is not None" makes this a bit shorter to 
write:

    value = obj.method() if obj! else None

but it is still very boilerplatey and verbose compared to the suggested:

    value = obj?.method()
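
Until such syntax exists, the closest spelling in current Python is a small
helper. The sketch below is only an illustration: `call_if_not_none` is a
hypothetical name, and it covers just the method-call case, not attribute or
item lookup.

```python
def call_if_not_none(obj, method_name, *args, **kwargs):
    """Hypothetical helper: call obj.method_name(...) unless obj is None."""
    if obj is None:
        return None
    return getattr(obj, method_name)(*args, **kwargs)


for obj in ("spam", None):
    long_form = obj.upper() if obj is not None else None
    short_form = call_if_not_none(obj, "upper")
    print(repr(long_form), repr(short_form))
```

Both forms give the same result for each input, but the helper loses the
readability of a real attribute access, which is part of the case for
dedicated syntax.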



-- 
Steve

From bussonniermatthias at gmail.com  Tue Sep 22 05:21:49 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Mon, 21 Sep 2015 20:21:49 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <D840D45A-360A-4162-8C7C-9AAF1544EF71@gmail.com>

Hi Stephen, 

Thanks for the response and the time you took to investigate, 

> On Sep 21, 2015, at 19:34, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> But the syntax is extremely restrictive.  Both "None? 1 + 2" and
> "None ?" are SyntaxErrors, as are "a?.attr" and even "*int*? ".
> Prefixing the help operator also gives help, and in that case
> whitespace may separate the operator from the word (apparently defined
> as non-whitespace, and any trailing detritus is ignored).  Perhaps the
> prefix form (a little less natural for people coming directly from
> natural languages, I guess) should be emphasized -- there are no
> proposals for unary prefix use of "?" or "??".

Yes, I'm not worried for the time being; the current syntax proposal does not conflict,
and I just wanted to describe a few usages and references for consideration in the PEP.
I prefer to give the PEP authors all the cards so that they can work out a proposal
that fits the majority of people.

I don't want to let people write a PEP, go through iterations, and, once they are
happy with it, complain that it does not fit my needs.


> So, this is the kind of DWIM that I doubt will confuse many users, at
> least not for very long.

I do not think we are in contact with the same users.
Yes, I see users confused by % syntax. And just last Thursday,
my neighbor reported to me that IPython was printing only the element of a tuple,
but was working for a list:

In [1]: print((1))
1
In [2]: print([1])
[1]
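
What trips people up here is plain Python rather than IPython: parentheses
alone do not make a tuple. A short sketch (the variable names are only
illustrative):

```python
not_a_tuple = (1)     # parentheses only group; this is the int 1
one_tuple = (1,)      # the trailing comma is what makes a tuple
also_tuple = 1,       # the parentheses are optional

print(type(not_a_tuple).__name__, not_a_tuple)   # int 1
print(type(one_tuple).__name__, one_tuple)       # tuple (1,)
print(type(also_tuple).__name__, also_tuple)     # tuple (1,)
```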

Yes, people do get confused by %magics vs. modulo vs. %-formatting,
though less so because modulo is for numbers and % formatting is for strings.
But it gets better with time.


>  Do you envision a problem keeping IPython
> facilities separate from Python language syntax?  Technically, I think
> the current rules that divide valid IPython requests for help from
> Python syntax (valid or invalid) should continue to work.

For now, with the current proposal, no, there is no problem keeping them separate.

>  Whether it
> would confuse users, I doubt, but there are possible surprises for
> some (many?) users, I suppose.

I can see the Python `??`/`?` vs. IPython `?`/`??` distinction being one explicit
point in our docs/teaching/tutorials. I cannot say how much confusion this will
cause in our users' heads; I think the greater confusion will come from the double
meaning plus the fact that it's a Python 3.6+ only feature. So I doubt it will
be taught for a few years. Though it is still another small difficulty.

I guess we will start to get this kind of experience with 3.5, now that
@ is there both for decorators and __matmul__ (thanks for 3.5 in general, BTW).


> Definitely, it should be mentioned in the PEP,

Thanks,

> but Python syntax is
> something that Python defines; shells and language variants have to be
> prepared to deal with Python syntax changes.

Yes, we are prepared, but our users don't always understand :-)

I'm wondering if there might be a way for interpreters
to actually help Python beta-test some syntax changes,
at least at the REPL level, like websites do user testing.


> 
>> leading the casual user to think that this is a Python feature.
> 
> Casual users actually expect software to DWIM in my experience.  The
> rule "leading help or stuck-right-on-the-end help works, elsewhere it
> means something else" (including no meaning == SyntaxError) is
> intuitively understood by them already, I'm sure.  Also, I rather
> doubt that "casual users" will encounter "?." or "??" until they're
> not so casual anymore.

In my domain people are confronted with advanced syntax really rapidly.
One piece of feedback that I get about such unusual syntax, especially from
people who are new to Python, is that you don't even know how to Google for this
kind of thing (especially for non-native English speakers).

Trying to put a name on *args and **kwargs is hard; Google is starting to get better,
but still ignores characters like ?, +, -.

The Google searches for `Python` and `Python ??` seem to be identical.


>> As someone who has to regularly teach Python, and interact with
>> new Python users, it will be hard to explain that `?` and `??` have
>> different meanings depending on the context,
> 
> I've never had a question about the context-sensitivity of "%" in
> IPython.  Have you?

Cf. above: yes, but more from the side where people don't
get what modulo means. Fair enough, they were non-native English speakers,
and they were confused by indent/implement/increment being roughly
the same word with 3 completely different meanings.

Though the number of people I see using modulo is low;
they then use numpy.mod on arrays once we get to numerics.
And in string formatting we push for .format(*args, **kwargs).

But I'll try to gather some statistics.

> 
>> and that most books on Scientific Python are wrong/inaccurate.
> 
> I'm afraid that's Scientific Python's cross to bear, not Python's.
> 
>> It will be even harder if we have to remove the usage of
>> `?`/`??`[4].
> 
> Not to worry about that.  IPython can define its own syntax for
> parsing out help requests vs. Python syntax.  I doubt you'll have to
> modify the current rules in any way.

I hope that won't change in the final PEP, and I'm not too worried;
worst case, we use more regexes and shift the problem elsewhere :-)

> 
>> I also want to note that the use of `?`/`??` is not limited to the
>> beginning or end of identifiers, as it can also be used to search
>> for names:
>> 
>>> In [1]: *int*?
> 
> But again "*int* ?" is a SyntaxError.  This is the kind of thing most
> casual users can easily work with.  (At least speakers of American
> English.  In email text, my Indian students love to separate trailing
> punctuation from the preceding word for some reason, but they would
> certainly learn quickly that you can't do that in IPython.)

French speakers also separate punctuation; I'm still torn on that.
But if we have to slightly change the rules, so be it.

> 
> Again, I agree it would be useful to mention this in the PEP, but as
> far as I can see there really isn't a conflict.

Happy you agree on point 1, and that you confirm point 2.

> The main thing I'd
> want to know to convince me there's a risk would be if a lot of users
> are confused by "%quickref" (an IPython command) vs. "3 % 2" (a Python
> expression).


Will try to get more qualitative info. 

Thanks, 
-- 
M



From steve at pearwood.info  Tue Sep 22 05:23:02 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 13:23:02 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <mtpt1v$rdc$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
 <mtpt1v$rdc$1@ger.gmane.org>
Message-ID: <20150922032302.GM31152@ando.pearwood.info>

On Mon, Sep 21, 2015 at 05:32:44PM -0400, Terry Reedy wrote:
> On 9/21/2015 12:22 PM, Steven D'Aprano wrote:
> >On Sun, Sep 20, 2015 at 09:00:08AM +0300, Serhiy Storchaka wrote:
> 
> >>randbelow() is just an alias for randrange() with single argument.
> >>randint(a, b) == randrange(a, b+1).
> >>
> >>These functions are redundant and they have non-zero cost.
> >
> >But they already exist in the random module, so adding them to secrets
> >doesn't cost anything extra.
> 
> I think the redundancy in random is a mistake.  The cost is confusion 
> and extra memory load, and the need to more often refer to the manual, 
> for essentially zero gain. 

Sorry, I don't understand what you mean.

Do you mean that it is a mistake for the random module to have randint 
and randrange? Or that it is a mistake for the secrets module to include 
functions that the random module includes?


> When I read two names, I expect them to do 
> two different things.  The question is whether to propagate the mistake 
> to a new module.

If you are referring to randint versus randrange, they do do different 
things. Look at their signatures.

randint(a, b) follows the ubiquitous API of "generate a random integer 
from the closed range a through b inclusive". 

randrange([start,] end [, step]) follows the Python practice of 
specifying a half-open interval, and has a more complex signature.

Even though randrange is more Pythonic, I've never actually used it. 
randint is always what I've wanted. E.g.

def die():
    # Roll a die.
    return randint(1, 6)

is far more natural than randrange(1, 7), Pythonic half-open intervals 
or not.
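
To illustrate the difference in conventions, here is a small sketch using the
existing random module (not the proposed secrets API). The seed is fixed only
so that the run is repeatable; the variable names are just for illustration.

```python
import random

rng = random.Random(12345)  # fixed seed, only so the sketch is repeatable

# Closed interval: both endpoints 1 and 6 can be returned.
closed = {rng.randint(1, 6) for _ in range(1000)}

# Half-open interval: 7 itself is never returned.
half_open = {rng.randrange(1, 7) for _ in range(1000)}

# The optional step, which randint has no counterpart for.
stepped = {rng.randrange(0, 10, 2) for _ in range(1000)}

print(sorted(closed))
print(sorted(half_open))
print(sorted(stepped))
```

The first two sets only ever contain values from 1 through 6, and the third
only the even numbers below 10.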

But I'm satisfied that others may think differently, and by Tim's 
argument that excluding one or the other will be more confusing than 
including them both.


-- 
Steve

From steve at pearwood.info  Tue Sep 22 05:34:36 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 13:34:36 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150919181612.GT31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
Message-ID: <20150922033436.GN31152@ando.pearwood.info>

I have discovered that there is already a "secrets" module on PyPI:

https://pypi.python.org/pypi/secrets


(Thanks to Robert Collins who has brought this to my attention.)

Personally, I don't think we should necessarily rule out re-using the 
name in the standard library. Does anyone have strong feelings either 
way?

-- 
Steve

From steve at pearwood.info  Tue Sep 22 05:40:44 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Sep 2015 13:40:44 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20150922034044.GO31152@ando.pearwood.info>

On Tue, Sep 22, 2015 at 08:56:24AM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> 
>  > I wouldn't include punctuation [in the password alphabet] by
>  > default, as too many places still prohibit some, or all,
>  > punctuation characters.
> 
> Do you really expect users to choose their own random passwords using
> this function? 

I don't know. Perhaps they will. I'm not entirely sure what the use-case 
of this password generator is, since I'm pretty sure that "real" 
password generators have to deal with far more complicated rules.


> I would expect that this function would be used for
> initial system-generated passwords (or system-enforced random
> passwords), and the system would have control over the admissible set.

Perhaps so. But then how does the application get the password to the 
user? Via unencrypted email, like mailman does?

I expect that the only use-case for an application generating a password 
for the user would be "low security" applications where the password has 
low value.

But maybe others disagree. I don't really have a strong opinion one way 
or another.



-- 
Steve

From brenbarn at brenbarn.net  Tue Sep 22 05:40:59 2015
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Mon, 21 Sep 2015 20:40:59 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <56008B03.6020704@oddbird.net>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <mtq19j$reo$1@ger.gmane.org> <56008B03.6020704@oddbird.net>
Message-ID: <5600CDCB.1010707@brenbarn.net>

On 2015-09-21 15:56, Carl Meyer wrote:
> My jaw dropped a bit when I saw it asserted in this thread that
> functions returning "useful value or None" is an anti-pattern. I write
> functions like that all the time, and I consider it a useful and
> necessary Python idiom. I would hate to rewrite all that code to either
> deal with exceptions or add default-value-argument boilerplate to all of
> them; when "no result" is an expected and normal possibility from a
> function, letting the calling code deal with None however it chooses is
> much nicer than either of those options.

	I agree that it's a fine thing.  The thing is, it's an API choice.  If 
your API is "return such-and-such or None", then anyone who calls your 
function knows they have to check for None and do the right thing.  I 
think this is fine if None really does indicate something like "no 
result".  (The re module uses None return values this way.)

	It seems to me that a lot of the "problem" that these null-coalescing 
proposals are trying to solve is dealing with APIs that return None when 
they really ought to be raising an exception or returning some kind of 
context-appropriate empty value.  If you're doing result = 
someFunction() and then result.attr.upper() and it's failing because 
result.attr is None, to me that's often a sign that the API is fragile, 
and the result object that someFunction returns should have its attr set 
to an empty string, not None.

	In other words, if you really want "a null result that I can call all 
kinds of string methods on and treat it like a string", you should be 
returning an empty string.  If you want "a null result I can subscript 
and get an integer", you should be returning some kind of 
defaultdict-like object that has a default zero value.  Or whatever. 
There isn't really such a thing as "an object to which I want to be able 
to do absolutely anything and have it work", because there's no 
type-general notion of what "work" means.  From a duck-typing 
perspective, if you expect users to try to do anything with a value you 
return, what they might reasonably want to do should be a clue as to 
what kind of value you should return.
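
	A minimal sketch of that idea, assuming a hypothetical Result class and 
some_function (neither comes from a real library): the "no result" case 
carries an empty string, so string methods keep working on it.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical result type: attr defaults to "" rather than None,
    # so callers can use string methods without checking first.
    attr: str = ""


def some_function(found=True):
    # Hypothetical API: returns a populated Result, or an "empty" one.
    return Result("value") if found else Result()


print(some_function().attr.upper())
print(repr(some_function(found=False).attr.upper()))
```

	Whether an empty string is the right "empty" value depends on the API; 
the point is only that the choice is made by the function's author.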

	That still leaves the use-case where you're trying to interoperate with 
some external system that may have missing values, but I don't see that 
as super compelling.  Getting an exception when you do 
some['big']['json']['object']['value'] and one of the intermediate ones 
isn't there is a feature; the bug is the JavaScripty mentality of just 
silently passing around "undefined".  To my mind, Python APIs that wrap 
such external data sources should ideally take the opportunity to 
improve on them and make them more Pythonic, by providing sensible, 
context-relevant defaults instead of propagating a generic "null" value 
willy-nilly.
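
	As a rough sketch of such a wrapper (load_user and the field names are 
hypothetical, chosen only for illustration): required keys still raise 
KeyError, while optional ones get defaults suited to their type.

```python
import json


def load_user(raw):
    data = json.loads(raw)
    # A missing "user" or "name" is treated as a real error: KeyError propagates.
    user = data["user"]
    return {
        "name": user["name"],
        # Optional fields get context-relevant defaults instead of None.
        "nickname": user.get("nickname", ""),
        "tags": list(user.get("tags", [])),
    }


print(load_user('{"user": {"name": "alice"}}'))
```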

-- 
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."
    --author unknown

From abarnert at yahoo.com  Tue Sep 22 05:59:07 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 21 Sep 2015 20:59:07 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org> <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com>

> On Sep 21, 2015, at 19:34, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> 
> Definitely, it should be mentioned in the PEP, but Python syntax is
> something that Python defines; shells and language variants have to be
> prepared to deal with Python syntax changes.

IIRC, back when the ternary conditional was suggested, and the C-style ?: was proposed, Guido declared that no way was his language ever going to use ? as an operator. So if IPython took that as a promise that they can use ? without fear of ambiguity, you can't blame them too much....

I'm not saying they have a right to expect/demand that Guido never change his mind about anything anywhere ever, just that maybe they get a little extra consideration on backward compatibility with their use of ? than with their use of ! or % (which have been in use as operators or parts of operators for decades).

From p.f.moore at gmail.com  Tue Sep 22 10:01:23 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 22 Sep 2015 09:01:23 +0100
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150922033436.GN31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150922033436.GN31152@ando.pearwood.info>
Message-ID: <CACac1F_M-gAWPPeC5a4xdUA76+ONHu0qkbm9HrFYHe9yCE27Zg@mail.gmail.com>

On 22 September 2015 at 04:34, Steven D'Aprano <steve at pearwood.info> wrote:
> I have discovered that there is already a "secrets" module on PyPI:
>
> https://pypi.python.org/pypi/secrets
>
>
> (Thanks to Robert Collins who has brought this to my attention.)
>
> Personally, I don't think we should necessarily rule out re-using the
> name in the standard library. Does anyone have strong feelings either
> way?

The package appears to have no releases, and the home page gave me a
404. I would say it's OK to reuse the name.

Paul

From p.f.moore at gmail.com  Tue Sep 22 10:12:00 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 22 Sep 2015 09:12:00 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CAPTjJmpwm2zKjHwqoPVNTHrhbF_7qM6sO8jARf84+O54usbRGA@mail.gmail.com>
 <55FC9812.2090503@mrabarnett.plus.com>
 <20150919034112.GQ31152@ando.pearwood.info>
 <CAP7+vJL0CunS5ddKZefuTnFUK7jaROs59+YZPvTmvGXcD18=oQ@mail.gmail.com>
 <mtliko$632$1@ger.gmane.org>
 <20150920073157.GV31152@ando.pearwood.info>
 <CACac1F_UKTNgLc0G0Mn05Arj+QL3gWt-Affn41UWo7U45nVVGQ@mail.gmail.com>
 <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <CAP7+vJ+kSSFxdbcPjopVGRVSJRG-pN9NuU8KD2xQGFFs46-0nw@mail.gmail.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com>
Message-ID: <CACac1F9FLNdUU=95oYyMxXcJGAnPZjnTm6Ra+Hk6CG1VT_jjiQ@mail.gmail.com>

On 21 September 2015 at 23:56, Carl Meyer <carl at oddbird.net> wrote:
> My jaw dropped a bit when I saw it asserted in this thread that
> functions returning "useful value or None" is an anti-pattern. I write
> functions like that all the time, and I consider it a useful and
> necessary Python idiom. I would hate to rewrite all that code to either
> deal with exceptions or add default-value-argument boilerplate to all of
> them; when "no result" is an expected and normal possibility from a
> function, letting the calling code deal with None however it chooses is
> much nicer than either of those options.

Maybe my use of the phrase "anti-pattern" was too strong (I thought it
implied a relatively mild "this causes problems"). Having the caller
deal with problems isn't bad, but in my experience, too often the
caller *doesn't* deal with the possibility of a None return. It feels
rather like C's practice of returning error codes which never get
checked.

But as I said, YMMV, and my experience is clearly different from yours.

> I don't love the ? syntax, but I would certainly use the feature
> discussed here happily and frequently.

If we're back to discussing indexing and attribute access rather than
??, maybe -> would work?

obj->attr meaning None if obj is None else obj.attr
obj->[n] meaning None if obj is None else obj[n]
obj->(args) meaning None if obj is None else obj(args)
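
For comparison, a sketch of how the three expansions above can be spelled
in current Python with helper functions. The names safe_attr, safe_item and
safe_call are hypothetical, not part of any proposal in this thread.

```python
def safe_attr(obj, name):
    # Stand-in for the proposed obj->attr
    return None if obj is None else getattr(obj, name)


def safe_item(obj, key):
    # Stand-in for the proposed obj->[n]
    return None if obj is None else obj[key]


def safe_call(obj, *args, **kwargs):
    # Stand-in for the proposed obj->(args)
    return None if obj is None else obj(*args, **kwargs)
```

Only obj itself is tested against None; any exception raised by the
attribute lookup, subscript or call still propagates as usual.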

I think Matthias Bussonnier's point that ? and ?? are heavily used in
IPython is a good one. Python traditionally doesn't introduce new
punctuation (@ for decorators was AFAIK the last one). I thought that
was precisely to leave the space of unused characters available for
3rd party tools.

Paul

From j.wielicki at sotecware.net  Tue Sep 22 10:26:13 2015
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Tue, 22 Sep 2015 10:26:13 +0200
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
 Library
In-Reply-To: <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
Message-ID: <560110A5.8010503@sotecware.net>



On 20.09.2015 02:27, Chris Angelico wrote:
> On Sun, Sep 20, 2015 at 10:19 AM, Tim Peters <tim.peters at gmail.com> wrote:
>> [Chris Angelico <rosuav at gmail.com>]
>>> token_bytes "obviously" should return a bytes,
>>
>> Which os.urandom() does in Python 3.  I'm not writing docs, just
>> suggesting the functions.
>>
>>> and token_alpha equally obviously should be returning a str.
>>
>> Which part of "string" doesn't suggest "str"?
>>
>>> (Or maybe it should return the same type as alphabet, which
>>> could be either?)
>>>
>>> : What about the other two?
>>
>> Which part of "ASCII" is ambiguous?
>>
>>> Also, if you ask for 4 bytes from token_hex, do you get 4 hex
>>> digits or 8 (four bytes of entropy)?
>>
>> And which part of "same"?  ;-)
>>
>> Bikeshed away.  I'm outta this now ;-)
> 
> Heh :)
> 
> My personal preference for shed colour: token_bytes returns a
> bytestring, its length being the number provided. All the others
> return Unicode strings, their lengths again being the number provided.
> So they're all text bar the one that explicitly says it's in bytes.

My personal preference would be for the number of bytes to rather
reflect the entropy in the result. This would be a safer use when
migrating from using e.g. token_url to token_alpha with the base32
alphabet [1], for example because you want to have better readable tokens.

Speaking of which, a token_base32 would probably make sense, too.
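
A rough sketch of what such a function might look like, assuming the
argument counts bytes of entropy. Note that it uses the standard RFC 4648
alphabet from the base64 module, not the human-oriented alphabet of [1],
and the signature is only a guess.

```python
import base64
import os


def token_base32(nbytes):
    # nbytes is the amount of entropy requested, not the output length.
    encoded = base64.b32encode(os.urandom(nbytes)).decode("ascii")
    # Drop the '=' padding and lowercase for readability.
    return encoded.rstrip("=").lower()


print(token_base32(10))
```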

regards,
jwi

   [1]: https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt

From ncoghlan at gmail.com  Tue Sep 22 13:56:02 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Sep 2015 21:56:02 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>

On 22 September 2015 at 09:56, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Steven D'Aprano writes:
>
>  > I wouldn't include punctuation [in the password alphabet] by
>  > default, as too many places still prohibit some, or all,
>  > punctuation characters.
>
> Do you really expect users to choose their own random passwords using
> this function?  I would expect that this function would be used for
> initial system-generated passwords (or system-enforced random
> passwords), and the system would have control over the admissible set.
> But users who have to conform to somebody else's rules much prefer
> obfuscated passwords that pass strength tests to random passwords in
> my experience.

Right, the primary use case here is "web developer creating a default
password for an automatically created admin account" (for example),
not "end user creating a password for an arbitrary service".

We don't want to overgeneralise the canned recipes - keep them dirt
simple, and if folks want something slightly different, we can go the
itertools path and have recipes in the documentation.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Sep 22 14:03:06 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Sep 2015 22:03:06 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <560110A5.8010503@sotecware.net>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
 <560110A5.8010503@sotecware.net>
Message-ID: <CADiSq7cc5-icsJ0KjNXUHt8oJuDEqVPqqhtLUUMRJJD+-yNmGQ@mail.gmail.com>

On 22 September 2015 at 18:26, Jonas Wielicki <j.wielicki at sotecware.net> wrote:
> On 20.09.2015 02:27, Chris Angelico wrote:
>> My personal preference for shed colour: token_bytes returns a
>> bytestring, its length being the number provided. All the others
>> return Unicode strings, their lengths again being the number provided.
>> So they're all text bar the one that explicitly says it's in bytes.
>
> My personal preference would be for the number of bytes to rather
> reflect the entropy in the result. This would be a safer use when
> migrating from using e.g. token_url to token_alpha with the base32
> alphabet [1], for example because you want to have better readable tokens.

This isn't something to decide by personal preference, it's something
to be decided by considering the consequences of someone
misunderstanding the API and not noticing that the result isn't what
they expected.

Scenario 1: API specifies bytes of entropy

Consequence of misunderstanding: result is twice as long as expected,
with more entropy than expected

Scenario 2: API specifies length of result

Consequence of misunderstanding: result is half as long as expected,
with less entropy than expected

Scenario 1 fails safe, scenario 2 doesn't, so for the APIs that are
just reversible data transforms around os.urandom, it makes the most
sense to specify the number of bytes of entropy you want.
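
A minimal sketch of scenario 1, assuming a token_hex that takes the number
of bytes of entropy (the exact signature is not settled in this thread):

```python
import binascii
import os


def token_hex(nbytes):
    # nbytes counts bytes of entropy; the text result is twice as long,
    # because each byte becomes two hex digits.
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


token = token_hex(16)
print(len(token))
```

A caller who expected a 16-character string gets 32 characters, which is
the "fails safe" direction: more entropy than expected, not less.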

Building a password from an alphabet is different, as that involves
repeated applications of secrets.choice() to the given alphabet, so
you need to specify the result length directly.
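
A sketch of the alphabet-based case. Since the secrets module does not
exist yet, random.SystemRandom().choice stands in for secrets.choice(), and
the name password_from_alphabet and its default alphabet are hypothetical.

```python
import random
import string

# SystemRandom draws from os.urandom rather than the Mersenne Twister.
_sysrand = random.SystemRandom()


def password_from_alphabet(length, alphabet=string.ascii_letters + string.digits):
    # One independent choice per character, so length is the result length.
    return "".join(_sysrand.choice(alphabet) for _ in range(length))


print(password_from_alphabet(12))
```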

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Tue Sep 22 14:07:59 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Sep 2015 21:07:59 +0900
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The
	Standard	Library
In-Reply-To: <20150922034044.GO31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20150922034044.GO31152@ando.pearwood.info>
Message-ID: <87lhbylkrk.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:
 > On Tue, Sep 22, 2015 at 08:56:24AM +0900, Stephen J. Turnbull wrote:

 > I don't know. Perhaps they will. I'm not entirely sure what the
 > use-case of this password generator is, since I'm pretty sure that
 > "real" password generators have to deal with far more complicated
 > rules.

Actually, I think they'll do what randrange does: take a seed from
urandom() and values from a (CS)PRNG based on that seed, and throw
away an out-of-range subset.  Ie, they'll just generate passwords
based on a simple rule about the alphabet and keep trying until they
get one that passes the strength tester.
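
A sketch of that generate-and-retry loop. The names generate_until_ok and
has_letter_and_digit are hypothetical, the strength test is a toy one, and
SystemRandom is used directly instead of a separately seeded CSPRNG.

```python
import random
import string

_sysrand = random.SystemRandom()


def generate_until_ok(length, alphabet, is_acceptable, max_tries=1000):
    # Draw candidates from the alphabet and throw away the ones
    # that fail the caller-supplied test.
    for _ in range(max_tries):
        candidate = "".join(_sysrand.choice(alphabet) for _ in range(length))
        if is_acceptable(candidate):
            return candidate
    raise RuntimeError("no acceptable candidate found")


def has_letter_and_digit(pw):
    # Toy stand-in for a real strength tester.
    return any(c.isalpha() for c in pw) and any(c.isdigit() for c in pw)


print(generate_until_ok(12, string.ascii_letters + string.digits,
                        has_letter_and_digit))
```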

 > > I would expect that this function would be used for
 > > initial system-generated passwords (or system-enforced random
 > > passwords), and the system would have control over the admissible set.
 >
 > Perhaps so. But then how does the application get the password to the 
 > user? Via unencrypted email, like mailman does?

Well, I hand them out to my students in class on business cards.  But
an HTTPS connection could also work.

 > I expect that the only use-case for an application generating a
 > password for the user would be "low security" applications where
 > the password has low value.

That could very well be true.


From random832 at fastmail.com  Tue Sep 22 14:51:10 2015
From: random832 at fastmail.com (Random832)
Date: Tue, 22 Sep 2015 08:51:10 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
 Library
In-Reply-To: <CADiSq7cc5-icsJ0KjNXUHt8oJuDEqVPqqhtLUUMRJJD+-yNmGQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
 <560110A5.8010503@sotecware.net>
 <CADiSq7cc5-icsJ0KjNXUHt8oJuDEqVPqqhtLUUMRJJD+-yNmGQ@mail.gmail.com>
Message-ID: <1442926270.3642748.390398377.68DC4B29@webmail.messagingengine.com>

On Tue, Sep 22, 2015, at 08:03, Nick Coghlan wrote:
> Building a password from an alphabet is different, as that involves
> repeated applications of secrets.choice() to the given alphabet, so
> you need to specify the result length directly.

Well, in principle, the length could be calculated from the number of
bytes of entropy desired by using
ceil(nbytes*log(256)/log(len(alphabet))), if all that matters is to
"fail safe" [i.e. longer] rather than to not be surprising. Being
calculated by repeated application of choice rather than some other
algorithm is an implementation detail.
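
As a sketch, the calculation could look like this. It uses log2 instead of
the log(256)/log(n) ratio above, which is the same quantity but avoids
floating-point rounding for power-of-two alphabets. The name
length_for_entropy is hypothetical.

```python
import math


def length_for_entropy(nbytes, alphabet):
    # Each symbol drawn uniformly from the alphabet carries
    # log2(len(alphabet)) bits; round up so the result never
    # holds less entropy than requested.
    if len(set(alphabet)) < 2:
        raise ValueError("alphabet needs at least two distinct symbols")
    bits_wanted = 8 * nbytes
    bits_per_symbol = math.log2(len(alphabet))
    return math.ceil(bits_wanted / bits_per_symbol)


print(length_for_entropy(16, "0123456789abcdef"))
```

This assumes the alphabet has no repeated symbols; duplicates would lower
the real entropy per character.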

From eric at trueblade.com  Tue Sep 22 17:04:55 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 22 Sep 2015 11:04:55 -0400
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
 Library
In-Reply-To: <mtpobd$i5s$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <mtli19$u3d$1@ger.gmane.org> <20150921162226.GE31152@ando.pearwood.info>
 <mtpobd$i5s$1@ger.gmane.org>
Message-ID: <56016E17.4000509@trueblade.com>

Sorry to jump in with replying to a random message, but I can't find the
message where this originally showed up:

>>>> Bound methods of a SystemRandom instance
>>>>      .randrange()
>>>>      .randint()
>>>>      .randbits()
>>>>          renamed from .getrandbits()
>>>>      .randbelow(exclusive_upper_bound)
>>>>          renamed from private ._randbelow()
>>>>      .choice()

While we're bikeshedding, can we pick better names than randXXX? How
about random_range(), etc.? I'd rather have clarity than save a few
chars. I think it's more approachable for new users to a new module.

Eric.



From brett at python.org  Tue Sep 22 18:01:51 2015
From: brett at python.org (Brett Cannon)
Date: Tue, 22 Sep 2015 16:01:51 +0000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>
Message-ID: <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>

On Tue, 22 Sep 2015 at 04:56 Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 22 September 2015 at 09:56, Stephen J. Turnbull <stephen at xemacs.org>
> wrote:
> > Steven D'Aprano writes:
> >
> >  > I wouldn't include punctuation [in the password alphabet] by
> >  > default, as too many places still prohibit some, or all,
> >  > punctuation characters.
> >
> > Do you really expect users to choose their own random passwords using
> > this function?  I would expect that this function would be used for
> > initial system-generated passwords (or system-enforced random
> > passwords), and the system would have control over the admissible set.
> > But users who have to conform to somebody else's rules much prefer
> > obfuscated passwords that pass strength tests to random passwords in
> > my experience.
>
> Right, the primary use case here is "web developer creating a default
> password for an automatically created admin account" (for example),
> not "end user creating a password for an arbitrary service".
>
> We don't want to overgeneralise the canned recipes - keep them dirt
> simple, and if folks want something slightly different, we can go the
> itertools path and have recipes in the documentation.
>

Out of this whole proposal, this password function is the one I'm most
worried about. As someone who has a project whose entire job is to generate
consistent passwords, I can tell you it's a messy business that will just
lead to never-ending complaints about "why didn't you include this as part
of password alphabet" or "why did you choose that length". It just isn't
worth the hassle when it isn't going to impact a majority of Python users.
This can be something that web frameworks and other folks worry about.

From mal at egenix.com  Tue Sep 22 18:25:46 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Sep 2015 18:25:46 +0200
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
 Library
In-Reply-To: <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>
 <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>
Message-ID: <5601810A.3070302@egenix.com>

On 22.09.2015 18:01, Brett Cannon wrote:
> On Tue, 22 Sep 2015 at 04:56 Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>> On 22 September 2015 at 09:56, Stephen J. Turnbull <stephen at xemacs.org>
>> wrote:
>>> Steven D'Aprano writes:
>>>
>>>  > I wouldn't include punctuation [in the password alphabet] by
>>>  > default, as too many places still prohibit some, or all,
>>>  > punctuation characters.
>>>
>>> Do you really expect users to choose their own random passwords using
>>> this function?  I would expect that this function would be used for
>>> initial system-generated passwords (or system-enforced random
>>> passwords), and the system would have control over the admissible set.
>>> But users who have to conform to somebody else's rules much prefer
>>> obfuscated passwords that pass strength tests to random passwords in
>>> my experience.
>>
>> Right, the primary use case here is "web developer creating a default
>> password for an automatically created admin account" (for example),
>> not "end user creating a password for an arbitrary service".
>>
>> We don't want to overgeneralise the canned recipes - keep them dirt
>> simple, and if folks want something slightly different, we can go the
>> itertools path and have recipes in the documentation.
>>
> 
> Out of this whole proposal, this password function is the one I'm most
> worried about. As someone who has a project whose entire job is to generate
> consistent passwords, I can tell you it's a messy business that will just
> lead to never-ending complaints about "why didn't you include this as part
> of password alphabet" or "why did you choose that length". It just isn't
> worth the hassle when it isn't going to impact a majority of Python users.
> This can be something that web frameworks and other folks worry about.

Agreed. There are too many policies and regulations for
passwords out there. The stdlib is not the right place for this.

But the general purpose functionality of having a function
which returns a string of given length and characters from a
given set is useful for building routines which implement
such policies.

Just don't call it a password function :-)

How about: randstr(length, alphabet)

-- 
Marc-Andre Lemburg
eGenix.com


From steve at pearwood.info  Tue Sep 22 19:05:34 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Sep 2015 03:05:34 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>
 <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>
Message-ID: <20150922170534.GR31152@ando.pearwood.info>

On Tue, Sep 22, 2015 at 04:01:51PM +0000, Brett Cannon wrote:

> Out of this whole proposal, this password function is the one I'm most
> worried about. As someone who has a project whose entire job is to generate
> consistent passwords, I can tell you it's a messy business that will just
> lead to never-ending complaints about "why didn't you include this as part
> of password alphabet" or "why did you choose that length". It just isn't
> worth the hassle when it isn't going to impact a majority of Python users.
> This can be something that web frameworks and other folks worry about.

I too feel a quiet unease about password(), although I don't have 
anything concrete to pin it on. I'm happy to be guided by people with 
more experience in this realm.

What if we called it simple_password() and made it clear that it wasn't 
intended as an all-singing, all-dancing password generator?

-- 
Steve

From tim.peters at gmail.com  Tue Sep 22 19:41:44 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 22 Sep 2015 12:41:44 -0500
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150922170534.GR31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>
 <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>
 <20150922170534.GR31152@ando.pearwood.info>
Message-ID: <CAExdVNmG2tDne4s0j4D0zAZwq95kKhNxX-rcypRC5R5DJ6n4Wg@mail.gmail.com>

[Steven D'Aprano <steve at pearwood.info>]
> I too feel a quiet unease about password(), although I don't have
> anything concrete to pin it on. I'm happy to be guided by people with
> more experience in this realm.
>
> What if we called it simple_password() and made it clear that it wasn't
> intended as an all-singing, all-dancing password generator?

Just drop it.  Nobody I recall has said anything in favor of it ;-)

It would be easy to give it as an example in the docs instead,
building directly on choice().  That would steer people who need
fancier stuff in the right direction.
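
Such a docs recipe might look roughly like the sketch below. It assumes the
module's choice() behaves like random.SystemRandom().choice, which stands in
for it here; the function name, the default length and the "one lowercase,
one uppercase, one digit" policy are all illustrative assumptions:

```python
import random
import string

# Stand-in for the proposed module-level choice(); an assumption for this sketch.
choice = random.SystemRandom().choice


def simple_password(length=10):
    """Return an alphanumeric password satisfying a small example policy."""
    if length < 3:
        raise ValueError("length must be at least 3 to satisfy the policy")
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = "".join(choice(alphabet) for _ in range(length))
        # Example policy: at least one lowercase, one uppercase, one digit.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate


password = simple_password(10)
```

Anyone with a different policy only has to change the alphabet and the test
inside the loop.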

From steve at pearwood.info  Tue Sep 22 19:47:55 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Sep 2015 03:47:55 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <560110A5.8010503@sotecware.net>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
 <560110A5.8010503@sotecware.net>
Message-ID: <20150922174755.GS31152@ando.pearwood.info>

On Tue, Sep 22, 2015 at 10:26:13AM +0200, Jonas Wielicki wrote:
> 
> On 20.09.2015 02:27, Chris Angelico wrote:
> >>> Also, if you ask for 4 bytes from token_hex, do you get 4 hex
> >>> digits or 8 (four bytes of entropy)?

I think the answer there has to be 8. I interpret Tim's reference to 
"same" as that the intent of token_hex is to call os.urandom(nbytes), 
then convert it to a hex string. So the implementation might be as 
simple as:

import binascii
import os

def token_hex(nbytes):
    return binascii.hexlify(os.urandom(nbytes))

modulo a call to .decode('ascii') if we want it to return a string.
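
Spelled out with the imports and the decode step, a sketch of that reading
(nbytes counts random source bytes, so the result has twice as many hex
digits) could be:

```python
import binascii
import os


def token_hex(nbytes):
    """Return a text string of 2 * nbytes hex digits from nbytes random bytes."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


tok = token_hex(4)  # 4 bytes of entropy, 8 hex digits
```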

One obvious question is, how many bytes is enough? Perhaps we should set 
a default value for nbytes, with the understanding that the default 
value will increase in the future.


> > My personal preference for shed colour: token_bytes returns a
> > bytestring, its length being the number provided. All the others
> > return Unicode strings, their lengths again being the number provided.
> > So they're all text bar the one that explicitly says it's in bytes.
> 
> My personal preference would be for the number of bytes to rather
> reflect the entropy in the result. This would be a safer use when
> migrating from using e.g. token_url to token_alpha with the base32
> alphabet [1], for example because you want to have better readable tokens.
> 
> Speaking of which, a token_base32 would probably make sense, too.

Oh oh, scope creep already! And so it begins... *wink*

What you are referring to isn't the standard base32, which already 
exists in the stdlib (in base64.py, together with base16). It is 
referred to by its creators as z-base-32, and the reasoning they give 
seems sound. It's not intended as a replacement for RFC 3548 base32, but 
as an alternative.

If the std lib already included a z-base-32 implementation, I would be 
happy to include token_zbase32 in the same spirit as token_base64. But 
it doesn't. So first you would have to convince somebody to add zbase32 
to the standard library.

>    [1]: https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt


-- 
Steve

From tim.peters at gmail.com  Tue Sep 22 20:05:43 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 22 Sep 2015 13:05:43 -0500
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <20150922174755.GS31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CAPTjJmq0OQ8EUJH4km95-2NcYDUsmVe+khmmrGw-fNYsxJ0YAw@mail.gmail.com>
 <CAExdVNnTjYFWKUA0pVL1-4LDb8ecvRAtLJbFXnFci1jQ2MUXNg@mail.gmail.com>
 <CAPTjJmqKCbpwpu7x1_PtXc8-HUmgtp8Tw-6CMvy9UOjcTSDY5w@mail.gmail.com>
 <560110A5.8010503@sotecware.net> <20150922174755.GS31152@ando.pearwood.info>
Message-ID: <CAExdVNnLddL6oi8=xCumKhpmDZzG=qz6U5uMq3Aq52OpuBBcRA@mail.gmail.com>

>>>>> Also, if you ask for 4 bytes from token_hex, do you get 4 hex
>>>>> digits or 8 (four bytes of entropy)?


[Steven D'Aprano]
> I think the answer there has to be 8. I interpret Tim's reference to
> "same" as that the intent of token_hex is to call os.urandom(nbytes),
> then convert it to a hex string.

Absolutely.  If we're trying to "fail safe", it's the number of
unpredictable source bytes that's important, not the length of the
string produced.  And, e.g., in the case of a URL-safe base64
encoding, passing "number of characters in the string" would be plain
idiotic ;-)


> So the implementation might be as simple as:
>
> def token_hex(nbytes):
>     return binascii.hexlify(os.urandom(nbytes))
> modulo a call to .decode('ascii') if we want it to return a string.

Nick Coghlan already posted implementations of these things, before
this thread started.  They're all easy, _provided that_ you know which
obscure functions to call; e.g.,

    def token_url(nbytes):
        return base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii")
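
To illustrate why the argument has to be the number of source bytes and not
the string length, here is the same function as a self-contained sketch,
together with the output lengths it produces (padded base64 yields four
characters for every three bytes, rounded up):

```python
import base64
import os


def token_url(nbytes):
    """Return a URL-safe base64 text token built from nbytes random bytes."""
    return base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii")


# Map each requested byte count to the length of the resulting string.
lengths = {nbytes: len(token_url(nbytes)) for nbytes in (4, 16, 32)}
```

The string length is a by-product of the encoding; only nbytes says how much
unpredictable input went in.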

From srkunze at mail.de  Tue Sep 22 20:22:42 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 22 Sep 2015 20:22:42 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <20150922031507.GL31152@ando.pearwood.info>
References: <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org> <20150922031507.GL31152@ando.pearwood.info>
Message-ID: <56019C72.9000806@mail.de>

On 22.09.2015 05:15, Steven D'Aprano wrote:
> On Mon, Sep 21, 2015 at 05:23:42PM -0400, Terry Reedy wrote:
>
>> I agree with Paul Moore that propagating None is generally a bad idea.
> As I understand it, you and Paul are describing a basic, simple idiom
> which is ubiquitous across Python code: using None to stand in for "no
> such value"

There is not a single "no such value". As I mentioned before, when 
discussing NULL values on the RDF mailing list, we discovered 6 or 7 
domain-agnostic meanings.

> when the data type normally used doesn't otherwise have
> something suitable. Consequently I really don't understand what you and
> Paul have against it.

I can tell from what I've seen that people use None for all kinds of 
interesting semantics, depending on the variable, on the supposed 
type and on the function, such as:

- +infinity for datetimes but only if it signifies the end of a timespan
- current datetime
- mixing both
- default item in a list like [1, 2, None, 4, 9] (putting in 5 would 
have done the trick)
- ...

Really?

Just imagine a world where Python and other systems would have never 
invented None, NULLs or anything like that.

> I think you meant to say it merely *postpones* the inevitable 
> exception. But that's wrong, there's nothing inevitable about an 
> exception here. It's not *hard* to deal with "value-or-None". It's 
> just tedious, which is why a bit of syntactic sugar may appeal. 

It's a sign of bad design. So, syntactic sugar does not help when doing 
toilet paper programming (hope that translation works for English).


Best,
Sven

From rosuav at gmail.com  Wed Sep 23 00:53:00 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 23 Sep 2015 08:53:00 +1000
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <56019C72.9000806@mail.de>
References: <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org>
 <20150922031507.GL31152@ando.pearwood.info>
 <56019C72.9000806@mail.de>
Message-ID: <CAPTjJmqBTOnOHHQ+=M+7FwLVvM0FLAXiwvZHT-gZ_D1pKY9=GQ@mail.gmail.com>

On Wed, Sep 23, 2015 at 4:22 AM, Sven R. Kunze <srkunze at mail.de> wrote:
> I can tell from what I've seen that people use None for: all kinds of
> various interesting semantics depending on the variable, on the supposed
> type and on the function such as:
>
> - +infinity for datetimes but only if it signifies the end of a timespan

What this means is that your boundaries can be a datetime or None,
where None means "no boundary at this end".
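
A small sketch of that convention, with a hypothetical helper name and
signature chosen only for illustration:

```python
from datetime import datetime


def in_timespan(moment, start=None, end=None):
    """Return True if moment lies within [start, end].

    None for start or end means "no boundary at this end".
    """
    if start is not None and moment < start:
        return False
    if end is not None and moment > end:
        return False
    return True


# Open-ended span: everything from 2015-01-01 onwards.
ongoing = in_timespan(datetime(2015, 9, 23), start=datetime(2015, 1, 1))
```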

> - current datetime
> - mixing both

I don't know of a situation where None means "now"; can you give an example?

> - default item in a list like [1, 2, None, 4, 9] (putting in 5 would have
> done the trick)

What does this mean? Is this where you're taking an average or
somesuch, and pretending that the None doesn't exist? That seems
fairly consistent with SQL.

Mostly, this does still represent "no such value".

ChrisA

From fperez.net at gmail.com  Wed Sep 23 03:21:11 2015
From: fperez.net at gmail.com (Fernando Perez)
Date: Tue, 22 Sep 2015 18:21:11 -0700
Subject: [Python-ideas] Null coalescing operators
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com>
Message-ID: <mtsuq7$7aa$1@ger.gmane.org>

On 2015-09-22 03:59:07 +0000, Andrew Barnert via Python-ideas said:

> I'm not saying they have a right to expect/demand that Guido never 
> change his mind about anything anywhere ever, just that maybe they get 
> a little extra consideration on backward compatibility with their use 
> of ? than with their use of ! or % (which have been in use as operators 
> or parts of operators for decades).

I just wanted to quickly comment on what my original stance was 
regarding IPython's extensions to the base Python language.  This was 
where I stood as I made decisions when the project was basically just 
me, and over time we've mostly adopted this as project policy.

We fully acknowledge that IPython has to be a strict superset of the 
Python language, and we are most emphatically *not* a fork of the 
language intended to be incompatible. We've added some extensions by 
hijacking a few characters that are invalid in the base language for 
things we deemed to be useful while working interactively, but we 
always accept that, if the language moves in our direction, it's our 
job to pack up and move again to a new location.

In fact, that already happened once: before Python 2.4, our prefix for 
"magic functions" was the @ character, and when that was introduced as 
the decorator prefix, we had to scramble.  We carefully decided to pick 
%, knowing that an existing binary operator would be unlikely to be 
added also as a new unary prefix.

Now, accepting that as our reality doesn't mean that at least we don't 
want to *inform* you folks of what our uses are, so that at least you 
can consider them in your decision-making process.  Since in some 
cases, that means there's an established ~ 15 years of a community with 
a habit of using a particular syntax for something, that may be 
confused if things change.  So at least, we want to let you know.

Due precisely to these recent conversations (I had a very similar 
thread a few days ago with Nick about the ! operator, which we also use 
in all kinds of nasty ways), we have started documenting more precisely 
all these differences, so the question "where exactly does IPython go 
beyond Python" can be answered in one place.  You can see the progress 
here:

https://github.com/ipython/ipython/pull/8821

We hope this will be merged soon into our docs, and it should help you 
folks have a quick reference for these questions.

Finally, I want to emphasize that these things aren't really changing 
much anymore, this is all fairly stable.  All these choices have by now 
stabilized, we only introduced the @ -> % transition when Python 2.4 
forced us, and more recently we introduced the notion of having a 
double-%% marker for "cell magics", but that was ~ 4 years ago, and it 
didn't require a new character, only allowing it to be doubled.  

Best,

f

From python at mrabarnett.plus.com  Wed Sep 23 04:43:50 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 23 Sep 2015 03:43:50 +0100
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <mtsuq7$7aa$1@ger.gmane.org>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com> <mtsuq7$7aa$1@ger.gmane.org>
Message-ID: <560211E6.1000907@mrabarnett.plus.com>

On 2015-09-23 02:21, Fernando Perez wrote:
> On 2015-09-22 03:59:07 +0000, Andrew Barnert via Python-ideas said:
>
>
> I'm not saying they have a right to expect/demand that Guido never
> change his mind about anything anywhere ever, just that maybe they get a
> little extra consideration on backward compatibility with their use of ?
> than with their use of ! or % (which have been in use as operators or
> parts of operators for decades).
>
>
> I just wanted to quickly comment on what my original stance was
> regarding IPython's extensions to the base Python language. This was
> where I stood as I made decisions when the project was basically just
> me, and over time we've mostly adopted this as project policy.
>
>
> We fully acknowledge that IPython has to be a strict superset of the
> Python language, and we are most emphatically *not* a fork of the
> language intended to be incompatible. We've added some extensions by
> hijacking a few characters that are invalid in the base language for
> things we deemed to be useful while working interactively, but we always
> accept that, if the language moves in our direction, it's our job to
> pack up and move again to a new location.
>
>
> In fact, that already happened once: before Python 2.4, our prefix for
> "magic functions" was the @ character, and when that was introduced as
> the decorator prefix, we had to scramble. We carefully decided to pick %,
> knowing that an existing binary operator would be unlikely to be added
> also as a new unary prefix.
>
>
> Now, accepting that as our reality doesn't mean that at least we don't
> want to *inform* you folks of what our uses are, so that at least you
> can consider them in your decision-making process. Since in some cases,
> that means there's an established ~ 15 years of a community with a habit
> of using a particular syntax for something, that may be confused if
> things change. So at least, we want to let you know.
>
>
> Due precisely to these recent conversations (I had a very similar thread
> a few days ago with Nick about the ! operator, which we also use in all
> kinds of nasty ways), we have started documenting more precisely all
> these differences, so the question "where exactly does IPython go beyond
> Python" can be answered in one place. You can see the progress here:
>
>
> https://github.com/ipython/ipython/pull/8821
>
>
> We hope this will be merged soon into our docs, and it should help you
> folks have a quick reference for these questions.
>
>
> Finally, I want to emphasize that these things aren't really changing
> much anymore, this is all fairly stable. All these choices have by now
> stabilized, we only introduced the @ -> % transition when Python 2.4
> forced us, and more recently we introduced the notion of having a
> double-%% marker for "cell magics", but that was ~ 4 years ago, and it
> didn't require a new character, only allowing it to be doubled.
>
 From the examples I've seen, the "?" and "??" occur at the end of the line.

The proposed 'operators' "?.", "?[", "?(" and "??" wouldn't occur at
the end of the line (or, if they did, they'd be inside parentheses,
brackets, or braces).

So is there really a conflict, in practice?


From bussonniermatthias at gmail.com  Wed Sep 23 06:56:54 2015
From: bussonniermatthias at gmail.com (Matthias Bussonnier)
Date: Tue, 22 Sep 2015 21:56:54 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <560211E6.1000907@mrabarnett.plus.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com> <mtsuq7$7aa$1@ger.gmane.org>
 <560211E6.1000907@mrabarnett.plus.com>
Message-ID: <67F64367-9413-4AD6-9B16-66CA1547E44D@gmail.com>


> On Sep 22, 2015, at 19:43, MRAB <python at mrabarnett.plus.com> wrote:
> 
> On 2015-09-23 02:21, Fernando Perez wrote:
>> On 2015-09-22 03:59:07 +0000, Andrew Barnert via Python-ideas said:
>> 
>> ...
>> 
>> Finally, I want to emphasize that these things aren't really changing
>> much anymore, this is all fairly stable. All these choices have by now
>> stabilized, we only introduced the @ -> % transition when Python 2.4
>> forced us, and more recently we introduced the notion of having a
>> double-%% marker for "cell magics", but that was ~ 4 years ago, and it
>> didn't require a new character, only allowing it to be doubled.
>> 
> From the examples I've seen, the "?" and "??" occur at the end of the line.

It can happen at the beginning of a line too.

?print 
is equivalent to 
print?


> The proposed 'operators' "?.", "?[", "?(" and "??" wouldn't occur at
> the end of the line (or, if they did, they'd be inside parentheses,
> brackets, or braces).
> 
> So is there really a conflict, in practice?

As stated in previous mails, with the current state of the proposal
there is no conflict; we should be able to distinguish the two cases.
We are just informing the PEP authors and contributors of
the syntax hijack that we did and currently have in IPython.

-- 
M




From chris.barker at noaa.gov  Wed Sep 23 08:09:36 2015
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 22 Sep 2015 23:09:36 -0700
Subject: [Python-ideas] add a single __future__ for py3?
In-Reply-To: <030BCA1C-5B19-4DD5-BBE1-E8C344BB4716@yahoo.com>
References: <CALGmxEJnLk5yvO+hN9SJ3Cuq5Wecp525dA3bGV-6OgD-QqK-KQ@mail.gmail.com>
 <55FF8758.70406@canterbury.ac.nz>
 <CAPTjJmqxGgCywQRBNXmyQ5p1ChZMoVbLWgQC1Wt5SG3UQfHGTA@mail.gmail.com>
 <m2vbb4xpge.fsf@fastmail.com> <20150921130813.GZ31152@ando.pearwood.info>
 <1442843138.3321719.389368849.52DEB79B@webmail.messagingengine.com>
 <CAP7+vJ+VCBsSJ1QheRzaoKViYiFGnJd_JLLhb0kW2r+GZg6Xdg@mail.gmail.com>
 <CAGE7PNLuuyP+wmvKFMWmwng5LSQXV8qvViKKkLc4-ahA9FBAYQ@mail.gmail.com>
 <030BCA1C-5B19-4DD5-BBE1-E8C344BB4716@yahoo.com>
Message-ID: <1716796771240402588@unknownmsgid>

On Sep 22, 2015, at 6:43 PM, Andrew Barnert <abarnert at yahoo.com> wrote:

On Sep 21, 2015, at 10:59, Gregory P. Smith <greg at krypto.org> wrote:

I think people should stick with *from __future__ import absolute_import*
regardless of what code they are writing.


If the py3 way of handling absolute vs. relative imports isn't better, why
is it in py3?

Anyway, the point of this is to get your py2 code working as similarly as
possible on py3. So better or worse, or not all that different, you still
want that behavior.

But again, it looks like this ship has sailed...

Thanks for indulging me.

-Chris





They will eventually create a file innocuously called something like
calendar.py (the same name as a standard library module) in the same
directory as their main binary and their debugging of the mysterious
failures they just started getting from the tarfile module will suddenly
require leveling up to be able to figure it out. ;)


But they'll get the same problems either way. If calendar.py isn't on
sys.path, it won't interfere with tarfile. And if it is on sys.path, so it
does interfere with tarfile, then it's already an absolute import, so
enabling absolute_import doesn't help.

I suppose if they've done something extra stupid, like putting a package
directory on sys.path as well as putting something called calendar.py in
that package and importing it with an unqualified import, then maybe it'll
be easier for someone to explain all the details of everything they did
wrong (including why they shouldn't have put the package on sys.path) if
they're using absolute_imports, but beyond that, I don't see how it helps
this case.


-gps

On Mon, Sep 21, 2015 at 8:18 AM Guido van Rossum <guido at python.org> wrote:

> It's just about these four imports, right?
>
>
> from __future__ import absolute_import
> from __future__ import division
> from __future__ import print_function
> from __future__ import unicode_literals
>
> I think the case is overblown.
>
> - absolute_import is rarely an issue; the only thing it does (despite the
> name) is give an error message when you attempt a relative import without
> using a "." in the import. A linter can find this easily for you, and a
> little discipline plus the right example can do a lot of good here.
>
> - division is important.
>
> - print_function is important.
>
> - unicode_literals is useless IMO. It breaks some things (yes there are
> still APIs that don't take unicode in 2.7) and it doesn't nearly as much as
> what would be useful -- e.g. repr() and <stream>.readline() still return
> 8-bit strings. I recommend just using u-literals and abandoning Python 3.2.
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/


From chris.barker at noaa.gov  Wed Sep 23 08:19:37 2015
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 22 Sep 2015 23:19:37 -0700
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <1394693929.57358.1442988294597.JavaMail.mobile-sync@iogg1>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <CANJQusXFni+DHT6z_jDqAooNW1+iQvwcs89BfE9FNFFFGmrENA@mail.gmail.com>
 <87vbb3mbay.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BD4AEE43-9F0E-4FDB-A5E0-3C4E3D960F29@yahoo.com>
 <1394693929.57358.1442988294597.JavaMail.mobile-sync@iogg1>
Message-ID: <1578039969583045795@unknownmsgid>


On Sep 22, 2015, at 6:21 PM, Fernando Perez <fperez.net at gmail.com> wrote:

> In fact, that already happened once: before Python 2.4, our prefix for
> "magic functions" was the @ character, and when that was introduced as the
> decorator prefix, we had to scramble.

Note that the userbase of IPython was orders of magnitude smaller then --
changes like that would be a much bigger deal now.

And while IPython was born in the scientific software community, and sees
most of its use there, it is by no means specific to that use case.

In fact, if you use the standard REPL at all -- I encourage you to give it
a try -- you will be very glad you did.

-Chris

From ncoghlan at gmail.com  Wed Sep 23 09:46:18 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Sep 2015 17:46:18 +1000
Subject: [Python-ideas] Pre-PEP Adding A Secrets Module To The Standard
	Library
In-Reply-To: <CAExdVNmG2tDne4s0j4D0zAZwq95kKhNxX-rcypRC5R5DJ6n4Wg@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <CAP7+vJKwcDBQScNbQ=KRpF8cy-p8WDhFq5p0SkJUW54WVgxz=A@mail.gmail.com>
 <CAExdVNkQmbaQoQQ7XQ6Nx8EVxFFKfPo94R=72oLxG4+izyh8aQ@mail.gmail.com>
 <CACac1F8CNxjf3ZZ3_MkuoURxvQtRMpQ1hm3vREmFO5aFQ=5W3w@mail.gmail.com>
 <20150921174758.GF31152@ando.pearwood.info>
 <87zj0fmimv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CADiSq7d9G61wHfcsQ7TPMqgOauevxKg5c-Q+_wCv0wM2Dx-XUQ@mail.gmail.com>
 <CAP1=2W4jjp+f2oc+=BNgx=2PJ6T7QM5REi2ZJwUcbZ3KtSMPAA@mail.gmail.com>
 <20150922170534.GR31152@ando.pearwood.info>
 <CAExdVNmG2tDne4s0j4D0zAZwq95kKhNxX-rcypRC5R5DJ6n4Wg@mail.gmail.com>
Message-ID: <CADiSq7fh4hnam12Yn8poH9Yqk+xJK=7o7w5cGyV+KHTG1Ag8Mg@mail.gmail.com>

On 23 September 2015 at 03:41, Tim Peters <tim.peters at gmail.com> wrote:
> [Steven D'Aprano <steve at pearwood.info>]
>> I too feel a quiet unease about password(), although I don't have
>> anything concrete to pin it on. I'm happy to be guided by people with
>> more experience in this realm.
>>
>> What if we called it simple_password() and made it clear that it wasn't
>> intended as an all-singing, all-dancing password generator?
>
> Just drop it.  Nobody I recall has said anything in favor of it ;-)

I think I may have been the one to suggest it originally, since one of
the things we're trying to address is the plethora of bad advice found
when Googling for "python password generator", but I'm OK with
dropping it from the initial version of the module, just on the
general principle that adding things later is relatively easy, while
taking them away is hard.

> It would be easy to give it as an example in the docs instead,
> building directly on choice().  That would steer people who need
> fancier stuff in the right direction.

Yeah, addressing the default password generation problem should work
just as well as a recipe in the secrets module documentation - I see
the core goal here as being to help guide folks towards using the
right random number generator for security sensitive tasks, and "use
the RNG in the secrets module for random secrets, and the RNG in the
random module for modelling and simulation" is a much easier story to
tell than explaining the technical differences between random.Random
and random.SystemRandom.
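
The technical difference being glossed over can be shown in a few lines.
This is only a sketch of the two generators' behaviour, with illustrative
variable names:

```python
import random

# Modelling and simulation: a seeded Mersenne Twister is reproducible,
# which is exactly what makes it unsuitable for secrets.
sim_a = random.Random(12345)
sim_b = random.Random(12345)
reproducible = ([sim_a.random() for _ in range(3)]
                == [sim_b.random() for _ in range(3)])

# Security-sensitive use: SystemRandom draws from os.urandom() and
# ignores any attempt to seed it.
secure = random.SystemRandom()
secure.seed(12345)  # a no-op stub on SystemRandom
session_id = "".join(secure.choice("0123456789abcdef") for _ in range(32))
```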

Raymond Hettinger's philosophy with itertools is likely a good guiding
principle here: provide a small set of useful primitives, and
otherwise favour recipes in the documentation. If we end up with a
"more-secrets" module on PyPI akin to "more-itertools", I think that's
fine (and also provides an easy way of backporting future secrets
module additions to earlier Python versions)

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Sep 23 11:00:41 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Sep 2015 19:00:41 +1000
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
Message-ID: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>

This may just be my C programmer brain talking, but reading the
examples in PEP 505 makes me think of the existing use of "|" as the
bitwise-or operator in both Python and C, and "||" as the logical-or
operator in C.

Using || for None-coalescence would still introduce a third "or"
variant into Python as PEP 505 proposes (for good reasons), but
without introducing a new symbolic character that relates to "OR"
operations:

    x | y: bitwise OR (doesn't short circuit)
    x or y: logical OR (short circuits based on bool(x))
    x || y: logical OR (short circuits based on "x is not None")

(An analogy with C pointers works fairly well here, as "x || y" in C
is a short-circuiting operator that switches on "x != NULL" in the
pointer case)
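
The difference between the existing "or" and the proposed "||" can be
emulated in today's Python. In this sketch coalesce() is a hypothetical
helper, and the right-hand side is passed as a callable so that it is only
evaluated when needed, mirroring the short-circuit behaviour:

```python
def coalesce(lhs, make_rhs):
    """Emulate the proposed `lhs || rhs`: switch on `lhs is not None`."""
    return lhs if lhs is not None else make_rhs()


data = []  # falsey, but not None

via_or = data or ["default"]                         # bool(data) is False
via_coalesce = coalesce(data, lambda: ["default"])   # data is not None

missing = coalesce(None, lambda: ["default"])        # None falls through
```

With "or" the empty list is silently replaced; with the None-based test it
is kept, and only an actual None triggers the fallback.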

Taking some key examples from the PEP:

    data = data ?? []
    headers = headers ?? {}
    data ?= []
    headers ?= {}

When written using a doubled pipe instead:

    data = data || []
    headers = headers || {}
    data ||= []
    headers ||= {}

Translations would be the same as proposed in PEP 505 (for simplicity,
this shows evaluating the LHS multiple times; in practice that
wouldn't happen):

    data = data if data is not None else []
    headers = headers if headers is not None else {}
    data = data if data is not None else []
    headers = headers if headers is not None else {}

One additional wrinkle is that a single "|" would conflict with the
bitwise-or notation in the case of None-aware index access, so the
proposal for both that and attribute access would be to make the
notation "!|", borrowing the logical negation "!" from "!=".

In this approach, where "||" would be the new short-circuiting binary
operator standing for "LHS if LHS is not None else RHS", in "!|" the
logical negations cancel out to give "LHS if LHS is None else
LHS<OP>".

PEP 505 notation:

    title?.upper()
    person?['name']

Using the "is not not None" pipe-based notation:

    title!|.upper()
    person!|['name']

And the gist of the translation:

    title if title is None else title.upper()
    person if person is None else person['name']
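
The same translation can be tried out with a small helper. The name
none_aware() is hypothetical, and the operation after the marker is passed
as a callable because there is no way to spell the operator itself today:

```python
def none_aware(obj, op):
    """Emulate `obj!|<OP>`: return None untouched, otherwise apply op."""
    return obj if obj is None else op(obj)


title = None
person = {"name": "Nick"}

upper_title = none_aware(title, lambda t: t.upper())        # stays None
upper_other = none_aware("pep 505", lambda t: t.upper())    # op is applied
name = none_aware(person, lambda p: p["name"])              # index access
```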

If this particular syntax were to be chosen, I also came up with the
following possible mnemonics that may be useful as an explanatory
tool:

    "||" is a barrier to prevent None passing through an expression
    "!|" explicitly allows None to pass without error

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From abarnert at yahoo.com  Wed Sep 23 11:21:44 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 23 Sep 2015 02:21:44 -0700
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
Message-ID: <E2E9859A-936E-45C8-B431-D555B99A1E00@yahoo.com>

On Sep 23, 2015, at 02:00, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> This may just be my C programmer brain talking, but reading the
> examples in PEP 505 makes me think of the existing use of "|" as the
> bitwise-or operator in both Python and C, and "||" as the logical-or
> operator in C.

The connection with || as a falsey-coalescing operator in C--and C#, Swift, etc., which have a separate null-coalescing operator that's spelled ??--seems like it could be misleading. Otherwise, I like it, but that's a pretty big otherwise.

> One additional wrinkle is that a single "|" would conflict with the
> bitwise-or notation in the case of None-aware index access, so the
> proposal for both that and attribute access would be to make the
> notation "!|", borrowing the logical negation "!" from "!=".

Maybe you should have given the examples first, because written on its own like this it looks unspeakably ugly, but in context below it's a lot nicer...

>    title!|.upper()
>    person!|['name']

This actually makes me think of the ! from Swift and other languages ("I know this optionally-null object is not null even if the type checker can't prove it, so let me use it that way"), more than negation. Which makes the whole thing make sense, but in a maybe-unpythonically out-of-order way: the bang-or means "either title is not None so I get title.upper(), or it is so I get None".

I'm not sure whether other people will read it that way--or, if they do, whether it will be helpful or harmful mnemonically.

> If this particular syntax were to be chosen, I also came up with the
> following possible mnemonics that may be useful as an explanatory
> tool:
> 
>    "||" is a barrier to prevent None passing through an expression
>    "!|" explicitly allows None to pass without error

That's definitely easy to understand and remember. But since Python doesn't exist in isolation, and null coalescing and null conditional operators exist in other languages and are being added to many new ones, it might be useful to use similar terms to other languages. (See https://msdn.microsoft.com/en-us/library/ms173224.aspx and https://msdn.microsoft.com/en-us/library/dn986595.aspx for how C# describes them.)

From ncoghlan at gmail.com  Wed Sep 23 12:53:00 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Sep 2015 20:53:00 +1000
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <E2E9859A-936E-45C8-B431-D555B99A1E00@yahoo.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <E2E9859A-936E-45C8-B431-D555B99A1E00@yahoo.com>
Message-ID: <CADiSq7ck_PZgcHg2UPvLhxSiCC=W87oBYN_pMiRaYLd0fZQfmQ@mail.gmail.com>

On 23 September 2015 at 19:21, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Sep 23, 2015, at 02:00, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> This may just be my C programmer brain talking, but reading the
>> examples in PEP 505 makes me think of the existing use of "|" as the
>> bitwise-or operator in both Python and C, and "||" as the logical-or
>> operator in C.
>
> The connection with || as a falsey-coalescing operator in C--and C#, Swift, etc., which have a separate null-coalescing operator that's spelled ??--seems like it could be misleading. Otherwise, I like it, but that's a pretty big otherwise.

One of the problems I occasionally see with folks migrating to Python
from other languages is with our relatively expansive definition of
"false" values. In particular, C/C++ developers expect all strings and
containers (i.e. non-NULL pointers) to be truthy, with only primitive
types (i.e. pointers and numbers) able to be false in a boolean
context.

Accordingly, the difference between C's || and a null-coalescing || in
Python would be adequately covered by "Python has no primitive types,
everything's an object or a reference to an object, so || in Python is
like || with pointers in C/C++, where a reference to None is Python's
closest equivalent to NULL".

For example, a C/C++ dev might be tempted to write code like this:

    def example(env=None):
        env = env || {}
        ...

With || as a null coalescing operator, that code's actually correct,
while the same code with "or" would be incorrect:

    def example(env=None):
        env = env or {} # Also replaces a passed in empty dict
        ...
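
The difference can be checked in today's Python with a small sketch.
coalesce() below is a hypothetical helper standing in for the proposed
operator; it is not part of PEP 505 or the standard library, and unlike
a real operator it evaluates its default eagerly.

```python
def coalesce(value, default):
    """Return value unless it is None (stand-in for the proposed operator)."""
    return value if value is not None else default


def example_or(env=None):
    env = env or {}  # replaces any falsey argument, including an empty dict
    return env


def example_coalesce(env=None):
    env = coalesce(env, {})  # replaces only None
    return env


shared = {}
print(example_or(shared) is shared)        # False: the caller's empty dict was swapped out
print(example_coalesce(shared) is shared)  # True: the caller's dict is preserved
print(example_coalesce(None))              # {}
```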

>> One additional wrinkle is that a single "|" would conflict with the
>> bitwise-or notation in the case of None-aware index access, so the
>> proposal for both that and attribute access would be to make the
>> notation "!|", borrowing the logical negation "!" from "!=".
>
> Maybe you should have given the examples first, because written on its own like this it looks unspeakably ugly, but in context below it's a lot nicer...
>
>>    title!|.upper()
>>    person!|['name']
>
> This actually makes me think of the ! from Swift and other languages ("I know this optionally-null object is not null even if the type checker can't prove it, so let me use it that way"), more than negation. Which makes the whole thing make sense, but in a maybe-unpythonically out-of-order way: the bang-or means "either title is not None so I get title.upper(), or it is so I get None".

It could also just be a "!" on its own, as the pipe isn't really
adding much here:

    title!.upper()
    person!['name']

Then the "!" is saying "I know this may not exist, if it doesn't just
bail out of this whole subexpression and produce None".

That said, it's mainly the doubled "??" operator that I'm not fond of,
I'm more OK with the "gracefully tolerate this being None" aspect of
the proposal:

    title?.upper()
    person?['name']

> I'm not sure whether other people will read it that way--or, if they do, whether it will be helpful or harmful mnemonically.
>
>> If this particular syntax were to be chosen, I also came up with the
>> following possible mnemonics that may be useful as an explanatory
>> tool:
>>
>>    "||" is a barrier to prevent None passing through an expression
>>    "!|" explicitly allows None to pass without error
>
> That's definitely easy to understand and remember. But since Python doesn't exist in isolation, and null coalescing and null conditional operators exist in other languages and are being added to many new ones, it might be useful to use similar terms to other languages. (See https://msdn.microsoft.com/en-us/library/ms173224.aspx and https://msdn.microsoft.com/en-us/library/dn986595.aspx for how C# describes them.)

Those mnemonics are the "How would I try to explain this to a 10 year
old?" version, rather than the "How would I try to explain this to a
computer science student?" version. Assuming a null coalescing
operator is added, I'd expect to see more formal language than that
used in the language reference, regardless of the spelling chosen.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rymg19 at gmail.com  Wed Sep 23 16:37:37 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Wed, 23 Sep 2015 09:37:37 -0500
Subject: [Python-ideas] Using "||" (doubled pipe) as the null
	coalescing	operator?
In-Reply-To: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
Message-ID: <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>

*cough* Ruby and Perl *cough*

Ruby has two 'or' operators. One is used normally:

myval = a == 1 || a == 2
# same as
myval = (a == 1 || a == 2)

The other one is a bit different:

myval = a == 1 or a == 2
# same as
(myval = a == 1) or (a == 2)

It's used for simple nil and false elision, since Ruby has a stricter concept of falseness than Python.

But it's a bug magnet!! That's what I hated about Ruby. Type the wrong operator and get a hidden error.

Sometimes, when I code in C++ a lot and then do something in Python, I'll do:

if a || b:

Then I realize my mistake and fix it.

BUT, with this change, it wouldn't be a mistake. It would just do something entirely different.
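
For reference, a quick sketch of how the slip is caught today. parses()
is a helper name invented for this illustration; it relies only on
ast.parse() raising SyntaxError for invalid source.

```python
import ast


def parses(source):
    """Return True if source is syntactically valid for the running Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(parses("if a or b:\n    pass"))  # True
print(parses("if a || b:\n    pass"))  # False on current Python, where "||" is not an operator
```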

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From ncoghlan at gmail.com  Wed Sep 23 17:59:56 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Sep 2015 01:59:56 +1000
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
Message-ID: <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>

On 24 September 2015 at 00:37, Ryan Gonzalez <rymg19 at gmail.com> wrote:
> *cough* Ruby and Perl *cough*
>
> Ruby has two 'or' operators. One is used normally:
>
> myval = a == 1 || a == 2
> # same as
> myval = (a == 1 || a == 2)
>
> The other one is a bit different:
>
> myval = a == 1 or a == 2
> # same as
> (myval = a == 1) or (a == 2)
>
> It's used for simple nil and false elision, since Ruby has a stricter
> concept of falseness than Python.

The Perl, Ruby and PHP situation is a bit different from the one
proposed here - "or" and "||" are semantically identical in those
languages aside from operator precedence.

That said, it does still count as a point in favour of "??" as the
binary operator spelling - experienced developers are unlikely to
assume they already know what that means, while the "||" spelling
means they're more likely to think "oh, that's just a higher
precedence spelling of 'or'".

The only other potential spelling of the coalescence case that comes
to mind is to make "?" available in conditional expressions as a
reference to the LHS:

    data = data if ? is not None else []
    headers = headers if ? is not None else {}
    title = (user_title if ? is not None
             else local_default_title if ? is not None
             else global_default_title)
    title?.upper()
    person?['name']

The expansions of the latter two would then be:

    title if ? is None else ?.upper()
    person if ? is None else ?['name']

Augmented assignment would still be a shorthand for the first two examples:

    data ?= []
    headers ?= {}

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Wed Sep 23 18:47:00 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 24 Sep 2015 02:47:00 +1000
Subject: [Python-ideas] PEP 505 [was Re: Null coalescing operators]
In-Reply-To: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
Message-ID: <20150923164651.GU31152@ando.pearwood.info>

I've now read PEP 505, and I would like to comment.

Executive summary:

- I have very little interest in the ?? and ?= operators, but don't 
  object to them: vote +0

- I have a little more interest in ?. and ?[ provided that the
  precedence allows coalescing multiple look-ups from a single
  question mark: vote +1

- if it uses the (apparent?) Dart semantics, I am opposed: vote -1

- if the syntax chosen uses || or !| as per Nick's suggestion, 
  I feel the cryptic and ugly syntax is worse than the benefit: 
  vote -1


In more detail:

I'm sympathetic to the idea of reducing typing, but I think it is 
critical to recognise that reducing typing is not always a good thing. 
If it were, we would always name our variables "a", "b" etc, the type of 
[] would be "ls", and we would say "frm col impt ODt". And if you have 
no idea what that last one means, that's exactly my point.

Reducing typing is a good thing *up to a point*, at which time it 
becomes excessively terse and cryptic. One of the things I like about 
Python is that it is not Perl: it doesn't have an excess of punctuation 
and short-cuts. Too much syntactic sugar is a bad thing. 

The PEP suggests a handful of new operators:

(1) Null Coalescing Operator

    spam ?? eggs

equivalent to a short-circuiting:

    spam if spam is not None else eggs

I'm ambivalent about this. I don't object to it, but nor does it excite 
me in the least. I don't think the abbreviated syntax gains us enough in 
expressiveness to make up for the increase in terseness. In its favour, 
it can reduce code duplication, and also act as a more correct 
alternative to `spam or eggs`. (See the PEP for details.)

So I'm a very luke-warm +0 on this part of the PEP.



(2) None coalescing assignment

    spam ?= eggs

being equivalent to:

    if spam is None:
        spam = eggs

For the same reasons as above, I'm luke-warm on this: +0.



(3) Null-Aware Member Access Operator

    spam?.attr

being equivalent to

    spam.attr if spam is not None else None

To me, this passes the test "does it add more than it costs in cryptic 
punctuation?", so I'm a little more positive about this.

If my reading is correct, the PEP underspecifies the behaviour of this 
when there is a chain of attribute accesses. Consider:

    spam?.eggs.cheese

This can be interpreted two ways:

    (a)  (spam.eggs.cheese) if spam is not None else None

    (b)  (spam.eggs if spam is not None else None).cheese

but the PEP doesn't make it clear which behaviour they have in mind. 
Dart appears to interpret it as (b), as the reference given in the 
PEP shows this example:

    [quote]
    You can chain ?. calls, for example:
    obj?.child?.child?.getter
    [/quote]

http://blog.sethladd.com/2015/07/null-aware-operators-in-dart.html

That would seem to imply that obj?.child.child.getter would end up 
trying to evaluate null.child if the first ?. operator returned null.

I don't think the Dart semantics is useful, indeed it is actively 
harmful in that it can hide bugs:

Suppose we have an object which may be None, but if not, it must 
have an attribute spam which in turn must have an attribute eggs. This 
implies that spam must not be None. We want:

    obj.spam.eggs if obj is not None else None

Using the Dart semantics, we chain ?. operators and get this:

    obj?.spam?.eggs

If obj is None, the expression correctly returns None. If obj is not 
None, and obj.spam is not None, the expression correctly returns eggs. 
But it is over-eager, and hides a bug: if obj.spam is None, you want to 
get an AttributeError, but instead the error is silenced and you get 
None.

So I'm -1 with the Dart semantics, and +1 otherwise.
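
The two behaviours can be emulated in current Python. This is only a
sketch: short_circuit() and per_step() are hypothetical helpers written
for this comparison, not anything proposed in the PEP.

```python
from types import SimpleNamespace


def short_circuit(obj, *names):
    """Interpretation (a): one None check on obj guards the whole chain."""
    if obj is None:
        return None
    for name in names:
        obj = getattr(obj, name)  # a None in the middle raises AttributeError
    return obj


def per_step(obj, *names):
    """Chained obj?.a?.b style: every step tolerates None."""
    for name in names:
        if obj is None:
            return None
        obj = getattr(obj, name)
    return obj


good = SimpleNamespace(spam=SimpleNamespace(eggs="eggs"))
buggy = SimpleNamespace(spam=None)  # violates the "spam must not be None" invariant

print(short_circuit(None, "spam", "eggs"))  # None
print(short_circuit(good, "spam", "eggs"))  # eggs
print(per_step(buggy, "spam", "eggs"))      # None: the bug is silently hidden
try:
    short_circuit(buggy, "spam", "eggs")
except AttributeError as exc:
    print("AttributeError:", exc)           # the bug surfaces
```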



(4) Null-Aware Index Access Operator

    spam?[item]

being similar to spam?.attr. Same reasoning applies to this as for 
attribute access.



Nick has suggested using || instead of ??, and similar for the other 
operators. I don't think this is attractive at all, but the deciding 
factor which makes Nick's syntax a -1 for me is that it is inconsistent 
and confusing. He has to introduce a !| variation, so the user has to 
remember when to use two |s and when to use a ! instead, whether the ! 
goes before or after the | and that !! is never used.



-- 
Steve

From mehaase at gmail.com  Wed Sep 23 19:22:11 2015
From: mehaase at gmail.com (Mark E. Haase)
Date: Wed, 23 Sep 2015 13:22:11 -0400
Subject: [Python-ideas] PEP 505 [was Re: Null coalescing operators]
In-Reply-To: <20150923164651.GU31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150923164651.GU31152@ando.pearwood.info>
Message-ID: <CALb0Rk7oKhihaZ4YOf2ARrKoRkJLvw6yZyHzjEhh3obJ1OAbgQ@mail.gmail.com>

Steven, thanks for the reply. Just to clarify: the current PEP draft was
not meant to be read -- it was just a placeholder to get a PEP # assigned.
I didn't realize that new PEPs are published in an RSS feed!

I do appreciate your detailed feedback, though. Your interpretation of
Dart's semantics is correct, and I agree that's absolutely the wrong way to
do it. C# does have the short-circuit semantics that you're looking for.

To you, and to everybody else in this thread: I am reading every single
message, and I'm working on a draft worthy of your time and attention that
incorporates all of these viewpoints and offers several, competing
alternatives. I will announce the draft on this list when I'm further along.

Proposing a new operator is tremendously difficult in Python, because this
community doesn't like complex or ugly punctuation. (And adding a new
keyword won't happen any time soon.) A similar debate surrounded the
ternary operator PEP[1]. That PEP's author eventually held a vote on the
competing alternatives, including an option for "don't do anything". I'm
hoping to hold a similar referendum on this PEP once I've had time to work
on it a bit more.

[1] https://www.python.org/dev/peps/pep-0308/

-- 
Mark E. Haase
202-815-0201

From srkunze at mail.de  Wed Sep 23 19:30:12 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 23 Sep 2015 19:30:12 +0200
Subject: [Python-ideas] Null coalescing operators
In-Reply-To: <CAPTjJmqBTOnOHHQ+=M+7FwLVvM0FLAXiwvZHT-gZ_D1pKY9=GQ@mail.gmail.com>
References: <CAEbHw4aG1-Q8f7C1ciWw_9FjGsR+xSCvzTjeyZ2ZvGADfpO_Vw@mail.gmail.com>
 <CAP7+vJKVUmkhkuMKOCOjxJmmvQStdM6HDH+YRWZF8cVHgoDXyw@mail.gmail.com>
 <CALb0Rk6PSi+PqJkBG-g7bfyv2cXBS4ydKS54VW_UmS-kGx5hjg@mail.gmail.com>
 <55FF8CFD.9070906@mail.de>
 <CAPTjJmrnG=nvpc8x9csBqmH9b=hSiVmwRaj=1XQQvM5tq-3OCA@mail.gmail.com>
 <mtp7mo$rgh$1@ger.gmane.org>
 <CAP7+vJ+wxDtu01QUc+Gtp0eV4nBZW2wMs7fokd1FUBkme+GjXQ@mail.gmail.com>
 <CALb0Rk65+S-_AvoYC=45PfR+y-zbOjOpah60HDqk-UEsfyXY8w@mail.gmail.com>
 <CAP7+vJJ_AsCNFSZ0kqKty_tbPFij+GM0_xg=eWahiwVtHn7Ajw@mail.gmail.com>
 <mtpsh2$j5m$1@ger.gmane.org> <20150922031507.GL31152@ando.pearwood.info>
 <56019C72.9000806@mail.de>
 <CAPTjJmqBTOnOHHQ+=M+7FwLVvM0FLAXiwvZHT-gZ_D1pKY9=GQ@mail.gmail.com>
Message-ID: <5602E1A4.6040807@mail.de>

On 23.09.2015 00:53, Chris Angelico wrote:
> On Wed, Sep 23, 2015 at 4:22 AM, Sven R. Kunze <srkunze at mail.de> wrote:
>> I can tell from what I've seen that people use None for: all kinds of
>> various interesting semantics depending on the variable, on the supposed
>> type and on the function such as:
>>
>> - +infinity for datetimes but only if it signifies the end of a timespan
> What this means is that your boundaries can be a datetime or None,
> where None means "no boundary at this end".

Yes.

>
>> - current datetime
>> - mixing both
> I don't know of a situation where None means "now"; can you give an example?

range_start = <proper datetime>
range_end = <proper datetime or one of the above>

So, if you need something that ranges from 2015-01-01 to now, you 
basically say the range is expanding. Depending on the function/method, 
it either means until forever, or now.

>
>> - default item in a list like [1, 2, None, 4, 9] (putting in 5 would have
>> done the trick)
> What does this mean? Is this where you're taking an average or
> somesuch, and pretending that the None doesn't exist? That seems
> fairly consistent with SQL.

Imagine you render (as in HTML and the like) 1, 2, 4, 9 and instead of 
the None you render a 3. Now, the rendering engine needs to 
special-check None to put in a pre-defined value. Furthermore, in all 
places where you need that list [1, 2, None, 4, 9], you basically need to 
special-check None and act appropriately.

(Of course it was not that simple but you get the idea. The numbers 
stand for fairly complex objects drawn from the database.)

>
> Mostly, this does still represent "no such value".

Point was "no such value" sucks. It can be a blend of every other value 
and semantics depending on the function, type and so forth.

It's too convenient for people not to use it. As the example with 
the list shows, the 3 could easily have been put into the database, as 
it behaves exactly the same as the other objects.


The same goes for the special datetime objects. A lack of thought and of 
appropriate default objects leads to the usage of None. People tend to 
use None for everything that is special, and you end up with something 
really nasty to debug.
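
As a rough sketch of what an "appropriate default object" could look
like for the open-ended range case: OPEN_END and contains() are
hypothetical names, and datetime.max is just one possible choice of
explicit value, assumed here for illustration.

```python
from datetime import datetime

# Hypothetical explicit object meaning "this range has no upper bound".
OPEN_END = datetime.max


def contains(range_start, range_end, moment):
    """Plain comparison; no special-casing of None is needed."""
    return range_start <= moment <= range_end


start = datetime(2015, 1, 1)
print(contains(start, OPEN_END, datetime(2015, 9, 23)))              # True
print(contains(start, datetime(2015, 6, 1), datetime(2015, 9, 23)))  # False
```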


I'm not sure why people don't find it problematic when I say "we found 6/7 
domain-agnostic semantics for NULL". The "no such value" can be any of 
them OR a blend of them. That, I don't want to see in the code; that's all.

Btw., given the third issue above, I could add another 
domain-agnostic meaning for None: "too lazy to create a pre-defined 
object but instead using None".


> ChrisA


From rymg19 at gmail.com  Wed Sep 23 19:34:14 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Wed, 23 Sep 2015 12:34:14 -0500
Subject: [Python-ideas] PEP 505 [was Re: Null coalescing operators]
In-Reply-To: <20150923164651.GU31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150923164651.GU31152@ando.pearwood.info>
Message-ID: <CAO41-mM5p2szDB9P8AuGLbRbx+zUef8+voiZgBF1FfTJTNin9A@mail.gmail.com>

On Wed, Sep 23, 2015 at 11:47 AM, Steven D'Aprano <steve at pearwood.info>
wrote:

> I've now read PEP 505, and I would like to comment.
>
> Executive summary:
>
> - I have very little interest in the ?? and ?= operators, but don't
>   object to them: vote +0
>
> - I have a little more interest in ?. and ?[ provided that the
>   precedence allows coalescing multiple look-ups from a single
>   question mark: vote +1
>
> - if it uses the (apparent?) Dart semantics, I am opposed: vote -1
>
> - if the syntax chosen uses || or !| as per Nick's suggestion,
>   I feel the cryptic and ugly syntax is worse than the benefit:
>   vote -1
>
>
> In more detail:
>
> I'm sympathetic to the idea of reducing typing, but I think it is
> critical to recognise that reducing typing is not always a good thing.
> If it were, we would always name our variables "a", "b" etc, the type of
> [] would be "ls", and we would say "frm col impt ODt". And if you have
> no idea what that last one means, that's exactly my point.
>

from collections import OrderedDict


>
> Reducing typing is a good thing *up to a point*, at which time it
> becomes excessively terse and cryptic. One of the things I like about
> Python is that it is not Perl: it doesn't have an excess of punctuation
> and short-cuts. Too much syntactic sugar is a bad thing.
>
> The PEP suggests a handful of new operators:
>
> (1) Null Coalescing Operator
>
>     spam ?? eggs
>
> equivalent to a short-circuiting:
>
>     spam if spam is not None else eggs
>
> I'm ambivalent about this. I don't object to it, but nor does it excite
> me in the least. I don't think the abbreviated syntax gains us enough in
> expressiveness to make up for the increase in terseness. In its favour,
> it can reduce code duplication, and also act as a more correct
> alternative to `spam or eggs`. (See the PEP for details.)
>
> So I'm a very luke-warm +0 on this part of the PEP.
>
>
>
> (2) None coalescing assignment
>
>     spam ?= eggs
>
> being equivalent to:
>
>     if spam is None:
>         spam = eggs
>
> For the same reasons as above, I'm luke-warm on this: +0.
>
>
>
> (3) Null-Aware Member Access Operator
>
>     spam?.attr
>
> being equivalent to
>
>     spam.attr if spam is not None else None
>
> To me, this passes the test "does it add more than it costs in cryptic
> punctuation?", so I'm a little more positive about this.
>
> If my reading is correct, the PEP underspecifies the behaviour of this
> when there is a chain of attribute accesses. Consider:
>
>     spam?.eggs.cheese
>
> This can be interpreted two ways:
>
>     (a)  (spam.eggs.cheese) if spam is not None else None
>
>     (b)  (spam.eggs if spam is not None else None).cheese
>
> but the PEP doesn't make it clear which behaviour they have in mind.
> Dart appears to interpret it as (b), as the reference given in the
> PEP shows this example:
>
>     [quote]
>     You can chain ?. calls, for example:
>     obj?.child?.child?.getter
>     [quote]
>
> http://blog.sethladd.com/2015/07/null-aware-operators-in-dart.html
>
> That would seem to imply that obj?.child.child.getter would end up
> trying to evaluate null.child if the first ?. operator returned null.
>
> I don't think the Dart semantics is useful, indeed it is actively
> harmful in that it can hide bugs:
>
> Suppose we have an object which may be None, but if not, it must
> have an attribute spam which in turn must have an attribute eggs. This
> implies that spam must not be None. We want:
>
>     obj.spam.eggs if obj is not None else None
>
> Using the Dart semantics, we chain ?. operators and get this:
>
>     obj?.spam?.eggs
>
> If obj is None, the expression correctly returns None. If obj is not
> None, and obj.spam is not None, the expression correctly returns eggs.
> But it is over-eager, and hides a bug: if obj.spam is None, you want to
> get an AttributeError, but instead the error is silenced and you get
> None.
>
> So I'm -1 with the Dart semantics, and +1 otherwise.
>
>
I have to kind of agree here. In reality, I don't see any issues like this
with approach (a).
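
To make the difference concrete, here is a minimal Python sketch of the two
readings, applied to the obj/spam/eggs example above. The helper names
chain_a and chain_dart are made up for illustration; they are not part of
PEP 505, and the ?. operator itself does not exist in Python.

```python
from types import SimpleNamespace


def chain_a(obj):
    """Reading (a): one None check guards the whole chain obj.spam.eggs."""
    return obj.spam.eggs if obj is not None else None


def chain_dart(obj):
    """Dart-style obj?.spam?.eggs: every link is checked separately."""
    if obj is None:
        return None
    spam = obj.spam
    return spam.eggs if spam is not None else None


good = SimpleNamespace(spam=SimpleNamespace(eggs="eggs"))
buggy = SimpleNamespace(spam=None)  # violates "spam must not be None"

print(chain_a(good), chain_dart(good))   # both return the eggs attribute
print(chain_a(None), chain_dart(None))   # both return None
print(chain_dart(buggy))                 # None: the broken invariant is hidden
try:
    chain_a(buggy)
except AttributeError as exc:
    print("reading (a) surfaces the bug:", exc)
```

With reading (a) the broken object fails loudly, while the chained form
quietly turns it into None.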


>
>
> (4) Null-Aware Index Access Operator
>
>     spam?[item]
>
> being similar to spam?.attr. Same reasoning applies to this as for
> attribute access.
>
>
>
> Nick has suggested using || instead of ??, and similar for the other
> operators. I don't think this is attractive at all, but the deciding
> factor which makes Nick's syntax a -1 for me is that it is inconsistent
> and confusing. He has to introduce a !| variation, so the user has to
> remember when to use two |s and when to use a ! instead, whether the !
> goes before or after the | and that !! is never used.
>
>
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From p.f.moore at gmail.com  Wed Sep 23 20:46:15 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 23 Sep 2015 19:46:15 +0100
Subject: [Python-ideas] PEP 505 [was Re: Null coalescing operators]
In-Reply-To: <20150923164651.GU31152@ando.pearwood.info>
References: <747c20ca-960e-4d0e-83c7-f17a61e3d42d@googlegroups.com>
 <20150923164651.GU31152@ando.pearwood.info>
Message-ID: <CACac1F-vMWAHvkW6bRHTo=jwKGkKNAMqSG3gfqT-3fYOtNumQw@mail.gmail.com>

On 23 September 2015 at 17:47, Steven D'Aprano <steve at pearwood.info> wrote:
> I've now read PEP 505, and I would like to comment.

Having read the various messages in this thread, and then your summary
(which was interesting, because it put a lot of the various options
next to each other) I have to say:

1. The "expanded" versions using if..else are definitely pretty
unreadable and ugly (for all the variations). But in practice, I'd be
very unlikely to use if expressions in this case - I'd be more likely
to expand the whole construct, probably involving an if *statement*.
Comparing a multi-line statement to an operator is much harder to do
in a general manner. So I guess I can see the benefits, but I suspect
the operators won't be used in practice as much as people are implying
(in much the same way that the use of the if expression is pretty rare
in real Python code, as opposed to examples).

2. All of the punctuation-based suggestions remain ugly to my eyes. ?
is too visually striking, and has too many other associations for me
("help" in IPython, and as a suffix for variable names from Lisp).
Nick's || version looked plausible, but the inconsistent !| variations
bother me.

3. People keep referring to "obj ?? default" in comparison to "obj or
default". The comparison is fine - as is the notion that we are
talking about a version that simply replaces a truth test with an "is
None" test. But to me it also says that we should be looking for a
keyword, not a punctuation operator - the "or" version reads nicely,
and the punctuation version looks very cryptic in comparison. I can't
think of a good keyword, or a viable way to use a keyword for the ?.
?[ and ?( variations, but I wish I could.

Summary - I don't mind the addition of the functionality, although I
don't think it's crucial. But I really dislike the punctuation. The
benefits don't justify the cost for me.

Paul

From ncoghlan at gmail.com  Thu Sep 24 05:00:39 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Sep 2015 13:00:39 +1000
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
Message-ID: <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>

On 24 Sep 2015 01:59, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>
> The only other potential spelling of the coalescence case that comes
> to mind is to make "?" available in conditional expressions as a
> reference to the LHS:
>
>     data = data if ? is not None else []
>     headers = headers if ? is not None else {}
>     title = user_title if ? is not None else local_default_title if ?
> is not None else global_default_title

One advantage of this more explicit spelling is that it permits sentinels
other than None in the expanded form:

    data = data if ? is not sentinel else default()

Only the shorthand cases (augmented assignment, attribute access, subscript
lookup) would be restricted to checking specifically against None.
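
For comparison, the same "first value that is not the sentinel" behaviour
can be sketched in today's Python with a plain helper. The names first_not
and MISSING are hypothetical, and unlike a conditional expression the helper
evaluates all of its arguments eagerly, so it cannot express the lazy
default() call above.

```python
def first_not(sentinel, *candidates):
    """Return the first candidate that is not `sentinel` (identity test)."""
    for candidate in candidates:
        if candidate is not sentinel:
            return candidate
    return sentinel


MISSING = object()  # hypothetical sentinel, distinct from None

user_title = None
local_default_title = ""
global_default_title = "Untitled"

# The empty string is not None, so the local default wins here.
title = first_not(None, user_title, local_default_title, global_default_title)
print(repr(title))

# With a dedicated sentinel, None remains a legitimate value.
data = first_not(MISSING, None, [])
print(data)
```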

Cheers,
Nick.

From python at lucidity.plus.com  Fri Sep 25 00:28:45 2015
From: python at lucidity.plus.com (Erik)
Date: Thu, 24 Sep 2015 23:28:45 +0100
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
 operator?
In-Reply-To: <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
Message-ID: <5604791D.3000204@lucidity.plus.com>

On 24/09/15 04:00, Nick Coghlan wrote:
>  data = data if ? is not sentinel else default()

This reads OK in a short example like this and when using word-based 
operators such as "is not". However, it's a bit clumsy looking when 
using operators spelled with punctuation:

data = data if ? != None else default()
data = data if foo <= ? <= bar else default()

> title = user_title if ? is not None else local_default_title if ?
> is not None else global_default_title

I don't think I like the way '?' changes its target during the line in 
this example.

For example, the equivalent of the admittedly-contrived expression:

foo = bar if foo is None else baz if baz is not None else foo.frobnicate()

is:

foo = bar if ? is None else baz if ? is not None else foo.frobnicate()

... so you still have to spell 'foo' repeatedly (and only due to the 
subtle switch of the '?' target, which might go away (or be added) 
during code maintenance or refactoring).


Also, if '?' is sort of a short-cut way of referencing the LHS, then one 
might naively expect to be able to write this:

[(x, y) for x in range(5) if ? < 3 for y in range(5) if ? > 2]

Regs, E.

From guido at python.org  Fri Sep 25 00:30:45 2015
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 Sep 2015 15:30:45 -0700
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <5604791D.3000204@lucidity.plus.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
 <5604791D.3000204@lucidity.plus.com>
Message-ID: <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>

Using "?" as a (pro)noun is even worse than using it as an
operator/modifier.

On Thu, Sep 24, 2015 at 3:28 PM, Erik <python at lucidity.plus.com> wrote:

> On 24/09/15 04:00, Nick Coghlan wrote:
>
>>  data = data if ? is not sentinel else default()
>>
>
> This reads OK in a short example like this and when using word-based
> operators such as "is not". However, it's a bit clumsy looking when using
> operators spelled with punctuation:
>
> data = data if ? != None else default()
> data = data if foo <= ? <= bar else default()
>
> > title = user_title if ? is not None else local_default_title if ? is not
> None else global_default_title
>
> I don't think I like the way '?' changes its target during the line in
> this example.
>
> For example, the equivalent of the admittedly-contrived expression:
>
> foo = bar if foo is None else baz if baz is not None else foo.frobnicate()
>
> is:
>
> foo = bar if ? is None else baz if ? is not None else foo.frobnicate()
>
> ... so you still have to spell 'foo' repeatedly (and only due to the
> subtle switch of the '?' target, which might go away (or be added) during
> code maintenance or refactoring).
>
>
> Also, if '?' is sort of a short-cut way of referencing the LHS, then one
> might naively expect to be able to write this:
>
> [(x, y) for x in range(5) if ? < 3 for y in range(5) if ? > 2]
>
> Regs, E.
>
>



-- 
--Guido van Rossum (python.org/~guido)

From python at lucidity.plus.com  Fri Sep 25 00:40:51 2015
From: python at lucidity.plus.com (Erik)
Date: Thu, 24 Sep 2015 23:40:51 +0100
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
 operator?
In-Reply-To: <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
 <5604791D.3000204@lucidity.plus.com>
 <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>
Message-ID: <56047BF3.6090308@lucidity.plus.com>

On 24/09/15 23:30, Guido van Rossum wrote:
> Using "?" as a (pro)noun is even worse than using it as an
> operator/modifier.

That was what I was trying to say, but you did it more correctly and 
using far fewer characters. How very Pythonic of you ... ;)

E.

From youtux at gmail.com  Fri Sep 25 01:07:53 2015
From: youtux at gmail.com (Alessio Bogon)
Date: Fri, 25 Sep 2015 01:07:53 +0200
Subject: [Python-ideas] Using `or?` as the null coalescing operator
Message-ID: <6C2E5579-42A0-423F-AB8C-01B49FA59D67@gmail.com>

I really like PEP 0505. The only thing that does not convince me is the `??` operator. I would like to know what you think of an alternative like `or?`:

a_list = some_list or? []
a_dict = some_dict or? {}

The rationale behind it is to let `or` do its job with "truthy" values, while `or?` would require non-None values.
The rest of the PEP looks good to me.
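
A small sketch of the difference, using a plain function as a stand-in for
the proposed operator. The name or_none is made up for this example, and a
real operator would short-circuit, whereas this function always evaluates
its default.

```python
def or_none(value, default):
    """Hypothetical stand-in for `value or? default`."""
    return value if value is not None else default


some_list = []                       # empty, but deliberately supplied

via_or = some_list or []             # `or` discards the caller's empty list
via_or_none = or_none(some_list, [])

print(via_or is some_list)           # False
print(via_or_none is some_list)      # True
print(0 or 10, or_none(0, 10))       # 10 0
```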

I apologise in advance if this was already proposed and I missed it.

Regards,
Alessio

From gokoproject at gmail.com  Fri Sep 25 01:13:29 2015
From: gokoproject at gmail.com (John Wong)
Date: Thu, 24 Sep 2015 19:13:29 -0400
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <56047BF3.6090308@lucidity.plus.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
 <5604791D.3000204@lucidity.plus.com>
 <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>
 <56047BF3.6090308@lucidity.plus.com>
Message-ID: <CACCLA57PeCt5t5yf1M5dbhR84X3sHm21Y6o3qHRwQqtRKjwB9Q@mail.gmail.com>

I just read the PEP. As a user, I would prefer || until...

[Nick]

>
> If this particular syntax were to be chosen, I also came up with the
> following possible mnemonics that may be useful as an explanatory
> tool:
>     "||" is a barrier to prevent None passing through an expression
>     "!|" explicitly allows None to pass without error


I am eating my own words: !| is pretty hard to read, especially during
code review. The two symbols look too similar. Do we really need one
operator that doesn't raise an exception and one that does?

Next, the example title?.upper() in the PEP: this is also kind of ugly, and
its purpose is unclear to me. I do appreciate the idea of short-circuiting,
but I don't feel the syntax is right. To me this is the debate between
defaultdict and primitive dict (but in that debate you don't have the
option to raise or not raise an exception, whereas Nick's proposal does).

> data = [] if data is None else data
Looks like a valid case for a short-cut operator. The argument about the
"undesirable effect of putting the operands in an unintuitive order" is not
so bad. Once you have seen it once it should make sense. I will probably
invoke Star Wars and say "python awesome, it is" and our brains will adapt.
At least that line is still readable.

> data = data ?? []
I would prefer || again simply because of no new syntax. Actually, the
price computing example in PEP 505 is not too convincing from a contract
standpoint. The proposal is shorter than writing if requested_quantity is
not None, but if you have to think about using the null coalescing operator,
then aren't you already spotting a case you need to handle? The example
shows how the bug can be prevented, so maybe requested_quantity should
really default to 0 from the beginning, not None. None shouldn't appear, and
if it appears it should be a bug, and using null coalescing in this very
example is actually a bug from my view. You are just avoiding ever having
to think about taking care of such a case in your code. But then you have
negative numbers to avoid too... so that still requires a sanity check
somewhere.

Just my four cents.

On Thu, Sep 24, 2015 at 6:40 PM, Erik <python at lucidity.plus.com> wrote:

> On 24/09/15 23:30, Guido van Rossum wrote:
>
>> Using "?" as a (pro)noun is even worse than using it as an
>> operator/modifier.
>>
>
> That was what I was trying to say, but you did it more correctly and using
> far fewer characters. How very Pythonic of you ... ;)
>
>
> E.
>

From python at lucidity.plus.com  Fri Sep 25 02:30:54 2015
From: python at lucidity.plus.com (Erik)
Date: Fri, 25 Sep 2015 01:30:54 +0100
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
 operator?
In-Reply-To: <CACCLA57PeCt5t5yf1M5dbhR84X3sHm21Y6o3qHRwQqtRKjwB9Q@mail.gmail.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
 <5604791D.3000204@lucidity.plus.com>
 <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>
 <56047BF3.6090308@lucidity.plus.com>
 <CACCLA57PeCt5t5yf1M5dbhR84X3sHm21Y6o3qHRwQqtRKjwB9Q@mail.gmail.com>
Message-ID: <560495BE.3070600@lucidity.plus.com>

Throwing this one out there in case it inspires someone to come up with 
a better variation (or to get it explicitly rejected):

object.(<accessor> if <condition> else <expr>)

... where 'accessor' is anything normally allowed after 'object' ([], 
(), attr) and 'condition' can omit the LHS of any conditional expression 
(which is taken to be the associated object) or not (i.e., can be a complete
condition independent of the associated object):

foo = bar.((param0, param1) if not None else default())
foo = bar.([idx] if != sentinel else default())


And the perhaps more off-the-wall (as 'bar' is not involved in the 
condition):


foo = bar.(attr if secrets.randint(0, 1023) & 1 else default())

E.

From steve at pearwood.info  Fri Sep 25 03:35:04 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 25 Sep 2015 11:35:04 +1000
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
	operator?
In-Reply-To: <560495BE.3070600@lucidity.plus.com>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
 <5604791D.3000204@lucidity.plus.com>
 <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>
 <56047BF3.6090308@lucidity.plus.com>
 <CACCLA57PeCt5t5yf1M5dbhR84X3sHm21Y6o3qHRwQqtRKjwB9Q@mail.gmail.com>
 <560495BE.3070600@lucidity.plus.com>
Message-ID: <20150925013500.GD23642@ando.pearwood.info>

On Fri, Sep 25, 2015 at 01:30:54AM +0100, Erik wrote:

> Throwing this one out there in case it inspires someone to come up with 
> a better variation (or to get it explicitly rejected):
> 
> object.(<accessor> if <condition> else <expr>)
> 
> ... where 'accessor' is anything normally allowed after 'object' ([], 
> (), attr) and 'condition' can omit the LHS of any conditional expression 
> (which is taken to be the associated object) or not (i.e., can be a complete
> condition independent of the associated object):
> 
> foo = bar.((param0, param1) if not None else default())

I think that your intention is for that to be equivalent to:

if bar not None:  # missing "is" operator
    foo = bar(param0, param1)
else:
    foo = default()


I had to read your description three times before I got to the point 
where I could understand it. Some problems:

I thought `bar.(<accessor> ...)` meant attribute access, so I initially 
expected the true branch to evaluate to:

    foo = bar.(param0, param1)

which of course is a syntax error. Presumably you would write 
`bar.(attr if ...)` for attribute access and not `bar.(.attr if ...)`.

I'm still confused about the missing `is`. Maybe you meant:

if not None:  # evaluates to True
    ...


which is a problem with your suggestion that the left hand side of the 
condition is optional -- it makes it harder to catch errors in typing. 
Worse, it's actually ambiguous in some cases:

    spam = eggs.(cheese if - x else "aardvark")

can be read as:

if eggs - x:  # implied bool(eggs - x)
    spam = eggs.cheese
else:
    spam = "aardvark"


or as this:

if -x:  # implied bool(-x)
    spam = eggs.cheese
else:
    spam = "aardvark"


> foo = bar.([idx] if != sentinel else default())

I **really** hate this syntax. It almost makes me look more fondly at 
the || / !| syntax. Looking at this, I really want to interpret the
last part as 

    foo = bar.default()

so I can see this being a really common error. "Why isn't my method 
being called?"

-1 on this.


-- 
Steve

From python at lucidity.plus.com  Fri Sep 25 04:01:12 2015
From: python at lucidity.plus.com (Erik)
Date: Fri, 25 Sep 2015 03:01:12 +0100
Subject: [Python-ideas] Using "||" (doubled pipe) as the null coalescing
 operator?
In-Reply-To: <20150925013500.GD23642@ando.pearwood.info>
References: <CADiSq7cGr7yC3H62dVgi_mVDoMfrHiT3YaAxCbXHNXtZf+1jow@mail.gmail.com>
 <DA4D1C39-B9F7-4E39-BD2F-0B384CB9E456@gmail.com>
 <CADiSq7dh6-C+ULY=GuF0k442+Rwuoi=jXDpjk4ZoOxC9972h8Q@mail.gmail.com>
 <CADiSq7f7HPwVL+_ZkGhO8oZusjdSOo1Y+tWZ+VRww7Xq2QmxDA@mail.gmail.com>
 <5604791D.3000204@lucidity.plus.com>
 <CAP7+vJLN7RCDZoBNd_ougs65Jb8bFS=5HOCaVe7fwFxZUU=x0g@mail.gmail.com>
 <56047BF3.6090308@lucidity.plus.com>
 <CACCLA57PeCt5t5yf1M5dbhR84X3sHm21Y6o3qHRwQqtRKjwB9Q@mail.gmail.com>
 <560495BE.3070600@lucidity.plus.com>
 <20150925013500.GD23642@ando.pearwood.info>
Message-ID: <5604AAE8.50808@lucidity.plus.com>

Hi Steven,

On 25/09/15 02:35, Steven D'Aprano wrote:
> I think that your intention is for that to be equivalent to:
>
> if bar not None:  # missing "is" operator
>      foo = bar(param0, param1)
> else:
>      foo = default()

Yes, you are correct. I omitted the 'is'.

> I thought `bar.(<accessor> ...)` meant attribute access, so I initially
> expected the true branch to evaluate to:
>
>      foo = bar.(param0, param1)
>
> which of course is a syntax error. Presumably you would write
> `bar.(attr if ...)` for attribute access and not `bar.(.attr if ...)`.

I chose ".()" on purpose because it was a syntax error. Not including 
the "." meant it looks like a function call, so that wasn't workable.

".()" was supposed to read "I'm doing something with this object, but 
what I'm doing is conditional, so read on".

> I'm still confused about the missing `is`. Maybe you meant:

No, I meant to write 'is'.

> Worse, it's actually ambiguous in some cases:

Hmmm. Yes, OK, I see the problem here.

>> foo = bar.([idx] if != sentinel else default())
>
> I **really** hate this syntax.

"hate" is a very strong word. You've prefixed it with "really" (and 
emphasised that with several asterisks) - are you trying to tell me 
something? ;)

> I really want to interpret the
> last part as
>
>      foo = bar.default()

Yes, I can see that's a reasonable interpretation.


I never expected my suggestion to be embraced as-is, but perhaps it will 
inspire someone else to come up with a more enlightened suggestion - I 
did say that at the top of the post ;)

E.

From tim.peters at gmail.com  Fri Sep 25 06:02:30 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 24 Sep 2015 23:02:30 -0500
Subject: [Python-ideas] PEP 504: Using the system RNG by default
In-Reply-To: <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
References: <CADiSq7fYpacQAYbscyGnMGU6fBaC-0gwFdUJaUWHQ7Xpxh_D_A@mail.gmail.com>
 <CAP7+vJLQ=SjU6HiCfwyeAEkYC0MBwLP2rBFEfETjXCYubFi7pA@mail.gmail.com>
 <1442341539.574404.384456273.435775D6@webmail.messagingengine.com>
 <CAEbHw4Zq_AT-8E6iDh_CB3LfDha8KTY0=cBZ9tNwCjp7L4VWrQ@mail.gmail.com>
 <87mvwnxful.fsf@uwakimon.sk.tsukuba.ac.jp>
 <CAExdVN=Xo_FqFvvQbgiV9s-WfZv3R=HfX0=cSsHteYw9EWw6Fg@mail.gmail.com>
Message-ID: <CAExdVNmyHBfdE7fdGPjPK7cZ=GDpz+YaV1qhSVUGGwFQ09qOMw@mail.gmail.com>

[Tim]
> ...
> "Password generators" should be the least of our worries.  Best I can
> tell, the PHP paper's highly technical MT attack against those has
> scant chance of working in Python except when random.choice(x) is
> known to have len(x) a power of 2.  Then it's a very powerful attack.

Ha!  That's actually its worst case, although everyone missed that.

I wrote a solver, and bumped into this while testing it.  The rub is
this line in _randbelow():

            k = n.bit_length()  # don't use (n-1) here because n can be 1

If n == 2**i, k is i+1 then, and ._randbelow() goes on to throw away
half of all 32-bit MT outputs.  Everyone before assumed it wouldn't
throw any away.

The best case for this kind of solver is when .choice(x) has len(x)
one less than a power of 2, say 2**i - 1.  Then k = i, and
._randbelow() throws away 1 of each of 2**i MT outputs (on average).

For small i (say, len(x) == 63), every time I tried then the solver
(which can only record bits from MT outputs it _knows_ were produced)
found itself stuck with inconsistent equations.

If len(x) = 2**20 - 1, _then_ it has a great chance of succeeding.
There's about a chance in a million then that a single .choice() call
will consume 2 32-bit MT outputs.

It takes around 1,250 consecutive observations (of .choice() results)
to deduce the starting state then, assuming .choice() never skips an
MT output.  The chance that no output was in fact skipped is about:

>>> (1 - 1./2**20) ** 1250
0.9988086167972104

So that attack is very likely to succeed.

So, until the "secrets" module is released, and you're too dense to
use os.urandom(), don't pick passwords from a million-character
alphabet ;-)
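
The arithmetic above can be checked with a short sketch. The randbelow
function here is a simplified mirror of the retry loop as described in this
message, not a copy of the CPython source, and rejection_fraction is a
helper name invented for the example. It assumes, as the message does, that
each getrandbits(k) call with k <= 32 consumes one 32-bit MT output.

```python
import random
from fractions import Fraction


def randbelow(n, getrandbits=random.getrandbits):
    """Draw k-bit values until one falls below n (rejection sampling)."""
    k = n.bit_length()          # for n == 2**i this is i + 1, not i
    r = getrandbits(k)
    while r >= n:               # every retry throws an MT output away
        r = getrandbits(k)
    return r


def rejection_fraction(n):
    """Fraction of k-bit draws that the loop above discards."""
    k = n.bit_length()
    return Fraction(2**k - n, 2**k)


for n in (64, 63, 2**20 - 1):
    print(n, rejection_fraction(n))

# Chance that 1,250 consecutive draws with n == 2**20 - 1 skip nothing;
# this should agree with the interactive result quoted above.
p_no_skip = (1 - 1 / 2**20) ** 1250
print(p_no_skip)
```

A power of two discards half of all draws, while one less than a power of
two discards only 1 in 2**i.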

From steve at pearwood.info  Sat Sep 26 15:07:15 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 26 Sep 2015 23:07:15 +1000
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <20150919181612.GT31152@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
Message-ID: <20150926130715.GG23642@ando.pearwood.info>

I'm looking for guidance and/or consensus on two issues regarding token* 
functions in secrets: output type, and default values.

The idea is that the module will include a few functions for generating 
tokens, suitable for (say) password recovery, with the 
following signatures:

def token_bytes(nbytes:int) -> bytes:
    """Return nbytes random bytes."""

def token_hex(nbytes:int) -> ???? :
    """Return nbytes random bytes, encoded to hex"""

def token_url(nbytes:int) -> ???? :
    """Return nbytes random bytes, URL-safe base64 encoded."""


Question one:

- token_bytes obviously should return bytes. What should the others 
  return, bytes or str?

Question two:

- Many people will have no idea how many bytes should be used to be 
  confident that it will be hard for an attacker to guess. Earlier, I
  suggested that the three functions include default values for nbytes, 
  and there were no objections. Do we have consensus on this, and if so, 
  what default value should we use?

Question three:

- If we have default values, do we need some sort of documented 
  exception to the general backwards-compatibility requirement?

E.g. suppose we release the module in 3.6.0 with defaults of 32 bytes, 
and in 3.6.2 we discover that's too small and we should have used 64 
bytes. Can we change the default in 3.6.3 without notice?



-- 
Steve

From storchaka at gmail.com  Sat Sep 26 15:56:09 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 26 Sep 2015 16:56:09 +0300
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <20150926130715.GG23642@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150926130715.GG23642@ando.pearwood.info>
Message-ID: <mu685p$k4t$1@ger.gmane.org>

On 26.09.15 16:07, Steven D'Aprano wrote:
> Question one:
>
> - token_bytes obviously should return bytes. What should the others
>    return, bytes or str?

Why not leave conversion to the user? You can provide simple recipes
in the documentation.

def token_hex(nbytes):
     return token_bytes(nbytes).hex()

def token_url(nbytes):
     return base64.urlsafe_b64encode(token_bytes(nbytes)).rstrip(b'=')

We don't know what functions are needed by users. After the secrets 
module is widely used, we could gather the statistics of most popular 
patterns and add some of them in the stdlib.
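
For anyone who wants to try the recipes before the module exists, here is a
self-contained sketch. The token_bytes stand-in built on os.urandom is an
assumption made for this example, not the proposed implementation, and
bytes.hex() needs Python 3.5 or later.

```python
import base64
import os


def token_bytes(nbytes):
    """Stand-in for the proposed secrets.token_bytes (assumption)."""
    return os.urandom(nbytes)


def token_hex(nbytes):
    # bytes.hex() returns str
    return token_bytes(nbytes).hex()


def token_url(nbytes):
    # urlsafe_b64encode() returns bytes; the '=' padding is stripped
    return base64.urlsafe_b64encode(token_bytes(nbytes)).rstrip(b'=')


print(type(token_hex(16)).__name__, len(token_hex(16)))
print(type(token_url(16)).__name__, len(token_url(16)))
```

Note that, written this way, the two recipes answer question one
differently: the hex recipe yields str and the URL recipe yields bytes.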

> Question two:
>
> - Many people will have no idea how many bytes should be used to be
>    confident that it will be hard for an attacker to guess. Earlier, I
>    suggested that the three functions include default values for nbytes,
>    and there were no objections. Do we have consensus on this, and if so,
>    what default value should we use?

I would make the nbytes argument mandatory, and expose recommended
values in examples.

 >>> secrets.token_bytes(32)
b'\xf8\x80Ejh\x1ck\xfbL\xc3l\xd3ev\x1bT\xbe\x983\x072\xbbP\xe2\xee\xf8\xdc\xaf\xe4\xddJ#'



From rosuav at gmail.com  Sat Sep 26 16:04:49 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 27 Sep 2015 00:04:49 +1000
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <20150926130715.GG23642@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150926130715.GG23642@ando.pearwood.info>
Message-ID: <CAPTjJmpsM_fsREibnSrCyzsrkGXZYZ5JYcJC817Tq-Co-_WryA@mail.gmail.com>

On Sat, Sep 26, 2015 at 11:07 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> Question one:
>
> - token_bytes obviously should return bytes. What should the others
>   return, bytes or str?

str. The point of encoding them is to turn the entropy into some form
of text, so IMO it makes sense to treat this as text.

> Question two:
>
> - Many people will have no idea how many bytes should be used to be
>   confident that it will be hard for an attacker to guess. Earlier, I
>   suggested that the three functions include default values for nbytes,
>   and there were no objections. Do we have consensus on this, and if so,
>   what default value should we use?
>
> Question three:
>
> - If we have default values, do we need some sort of documented
>   exception to the general backwards-compatibility requirement?
>
> E.g. suppose we release the module in 3.6.0 with defaults of 32 bytes,
> and in 3.6.2 we discover that's too small and we should have used 64
> bytes. Can we change the default in 3.6.3 without notice?

So as I understand you, there are three options:

1) No default. Whenever you want entropy, you say how much. Simple.

2) Fixed default, covered by backward compatibility guarantees.

3) Variable default with an implication that using the default entropy
is "secure enough" for most purposes.

Can you adequately define "secure enough" across all purposes? If so,
I would support that. The precise number would never be documented
specifically (if you want to know what your version does, try it
interactively), and then it can indeed be changed in 3.6.3 - or even
without a version number bump at all (in ten years' time, Red Hat
might choose to continue shipping CPython 3.6.1, but change the
default entropy value).

Otherwise, I would be inclined toward not having a default at all.
Having one that can be changed only in 3.7 seems like the worst of
both worlds - programs can't depend on the value being constant, but a
security enhancement can't be done on an already-released version.
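
As a rough sketch of option 3, assuming a module-level constant that callers
are told not to rely on. The name DEFAULT_ENTROPY, the value 32, and the use
of os.urandom are all placeholders for illustration, not a proposal for the
actual implementation.

```python
import binascii
import os

# Hypothetical default; deliberately not a documented promise, so it could
# be raised in a bugfix release or by a distributor.
DEFAULT_ENTROPY = 32


def token_bytes(nbytes=None):
    """Return nbytes random bytes, or the current default if omitted."""
    if nbytes is None:
        nbytes = DEFAULT_ENTROPY
    return os.urandom(nbytes)


def token_hex(nbytes=None):
    """Return the token as text (str), two hex digits per byte."""
    return binascii.hexlify(token_bytes(nbytes)).decode('ascii')


print(len(token_bytes()))   # whatever the default currently is
print(token_hex(8))         # an explicit size still works
```

Using None as the signature default keeps the number out of help() output,
so the only way to learn the current value is to try it interactively.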

ChrisA

From vxgmichel at gmail.com  Sat Sep 26 16:29:12 2015
From: vxgmichel at gmail.com (Vincent Michel)
Date: Sat, 26 Sep 2015 16:29:12 +0200
Subject: [Python-ideas] Submitting a job to an asyncio event loop
Message-ID: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>

Hi,

I noticed there is currently no standard solution to submit a job from a
thread to an asyncio event loop.

Here's what the asyncio documentation says about concurrency and
multithreading:

> To schedule a callback from a different thread, the
> BaseEventLoop.call_soon_threadsafe() method should be used.
> Example to schedule a coroutine from a different thread:
>     loop.call_soon_threadsafe(asyncio.async, coro_func())

The issue with this method is the loss of the coroutine result.

One way to deal with this issue is to connect the asyncio.Future returned
by async (or ensure_future) to a concurrent.futures.Future. It is then
possible to use a subclass of concurrent.futures.Executor to submit a
callback to an asyncio event loop. Such an executor can also be used to set
up communication between two event loops using run_in_executor.

I posted an implementation called LoopExecutor on GitHub:
https://github.com/vxgmichel/asyncio-loopexecutor
The repo contains the loopexecutor module along with tests for several use
cases. The README describes the whole thing (context, examples, issues,
implementation).

It is interesting to note that this executor is a bit different from
ThreadPoolExecutor and ProcessPoolExecutor since it can also submit a
coroutine function. Example:

with LoopExecutor(loop) as executor:
    future = executor.submit(operator.add, 1, 2)
    assert future.result() == 3
    future = executor.submit(asyncio.sleep, 0.1, result=3)
    assert future.result() == 3

This works in both cases because submit always casts the given function to a
coroutine. That means it would also work with a function that returns a
Future.

Here are a few topics related to the current implementation that might be
interesting to discuss:

- possible drawback of casting the callback to a coroutine
- possible drawback of concurrent.futures.Future using
asyncio.Future._copy_state
- does LoopExecutor need to implement the shutdown method?
- removing the limitation in run_in_executor (can't submit a coroutine
function)
- adding a generic Future connection function in asyncio
- reimplementing wrap_future with the generic connection
- adding LoopExecutor to asyncio (or concurrent.futures)

At the moment, the interaction between asyncio and concurrent.futures only
goes one way. It would be nice to have a standard solution (LoopExecutor or
something else) to make it bidirectional.

Thanks,

Vincent

From abarnert at yahoo.com  Sat Sep 26 23:00:01 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 26 Sep 2015 14:00:01 -0700
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <20150926130715.GG23642@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150926130715.GG23642@ando.pearwood.info>
Message-ID: <E7FC1741-FF6C-4726-950A-C25049D492C6@yahoo.com>

On Sep 26, 2015, at 06:07, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> Question three:
> 
> - If we have default values, do we need some sort of documented 
>  exception to the general backwards-compatibility requirement?

Why not just use a default value of None, and document that None picks an appropriate value? Then, if it changes to a different appropriate value in 3.7 or 3.6.3 or some custom build of CPython, it hasn't broken backward compatibility.

> 
> E.g. suppose we release the module in 3.6.0 with defaults of 32 bytes, 
> and in 3.6.2 we discover that's too small and we should have used 64 
> bytes. Can we change the default in 3.6.3 without notice?

From guido at python.org  Sun Sep 27 04:52:15 2015
From: guido at python.org (Guido van Rossum)
Date: Sat, 26 Sep 2015 19:52:15 -0700
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
Message-ID: <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>

Hi Vincent,

I've read your write-up with interest. You're right that it's a bit awkward
to make calls from the threaded world into the asyncio world.
Interestingly, there's much better support for passing work off from the
asyncio event loop to a thread (run_in_executor()). Perhaps that's because
the use case there was obvious from the start: some things that may block
for I/O just don't have an async interface yet, so in order to use them
from an asyncio task they must be off-loaded to a separate thread or else
the entire event loop is blocked. (This is used for calling getaddrinfo(),
for example.)
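
For reference, the well-supported direction (event loop to thread) can be
sketched as below. This is only an illustration, assuming Python 3.7 or later
for asyncio.run() and asyncio.get_running_loop(); the resolve() helper and the
numeric address are made up for the example.

```python
import asyncio
import socket


async def resolve(host, port):
    # Hypothetical helper: socket.getaddrinfo() blocks, so it is handed to
    # the loop's default thread pool instead of running in the loop thread.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, socket.getaddrinfo, host, port)


def demo():
    # A numeric address keeps the example independent of DNS.
    return asyncio.run(resolve("127.0.0.1", 80))
```

While the lookup runs in the worker thread, the event loop stays free to
service other tasks.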

I'm curious where you have encountered the opposite use case?

I think if I had to do this myself I would go for a more minimalist
interface: something like your submit() method but without the call to
asyncio.coroutine(fn). Having the caller pass in the already-called
coroutine object might simplify the signature even further. I'm not sure I
see the advantage of trying to make this an executor -- but perhaps I'm
missing something?

--Guido



-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Sun Sep 27 15:28:13 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Sep 2015 23:28:13 +1000
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <20150926130715.GG23642@ando.pearwood.info>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150926130715.GG23642@ando.pearwood.info>
Message-ID: <CADiSq7dqKoMZKmwn0bV=WpcZ0J1v6EMn_yTcHQ1U_A+5NZ7_sw@mail.gmail.com>

On 26 September 2015 at 23:07, Steven D'Aprano <steve at pearwood.info> wrote:
> I'm looking for guidance and/or consensus on two issues regarding token*
> functions in secrets: output type, and default values.
>
> The idea is that the module will include a few functions for generating
> tokens, suitable for (say) password recovery, with the
> following signatures:
>
> def token_bytes(nbytes:int) -> bytes:
>     """Return nbytes random bytes."""
> def token_hex(nbytes:int) -> ???? :
>     """Return nbytes random bytes, encoded to hex"""
>
> def token_url(nbytes:int) -> ???? :
>     """Return nbytes random bytes, URL-safe base64 encoded."""
>
>
> Question one:
>
> - token_bytes obviously should return bytes. What should the others
>   return, bytes or str?

token_hex and token_url are inspired by Pyramid's and Django's token
generators (albeit with a different implementation technique in the
latter case), so I'd look at what type those return.

The Django token generator is django.utils.crypto.get_random_string,
and returns text.
The Pyramid CSRF token generator in
sessions.BaseCookieSessionFactory.CookieSession.new_csrf_token also
returns text.

However, I'm starting to think we should just pick one of the two
algorithms and call it "token_str" (with the shorter output from the
URL-safe base64 with any trailing "=" removed being my preference).
For folks that want or need to use a different token generation
algorithm, we can offer the Pyramid and Django generation algorithms
as recipes in the documentation.
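
A minimal sketch of what such a "token_str" could look like, assuming
os.urandom as the entropy source and a 32-byte default (both are assumptions
made for illustration, not a settled design):

```python
import base64
import os


def token_str(nbytes=32):
    """Hypothetical token_str: random bytes as unpadded URL-safe base64 text."""
    raw = os.urandom(nbytes)
    # urlsafe_b64encode returns bytes; strip the "=" padding and decode so
    # callers get a str they can drop straight into a URL.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
```

With 32 bytes of input the result is 43 characters long, compared with 64
characters for the hex encoding of the same bytes.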

> Question two:
>
> - Many people will have no idea how many bytes should be used to be
>   confident that it will be hard for an attacker to guess. Earlier, I
>   suggested that the three functions include default values for nbytes,
>   and there were no objections. Do we have consensus on this, and if so,
>   what default value should we use?

32 bytes (256 bits of entropy) seems like a reasonable default to me.

> Question three:
>
> - If we have default values, do we need some sort of documented
>   exception to the general backwards-compatibility requirement?
>
> E.g. suppose we release the module in 3.6.0 with defaults of 32 bytes,
> and in 3.6.2 we discover that's too small and we should have used 64
> bytes. Can we change the default in 3.6.3 without notice?

I like Andrew's suggestion of making the default None, and saying that
passing None means we'll choose an appropriate length, which will be
32 bytes for now, but may change in maintenance releases to increase
the length if we decide 256 bits of entropy isn't enough. Changes in
the default length could be indicated through "versionchanged" notes
in the "token_bytes" documentation.
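
The None-as-default idea can be sketched as follows. The constant name
DEFAULT_ENTROPY and the use of os.urandom are assumptions made for the sketch;
only the 32-byte figure comes from the discussion above.

```python
import os

# Assumed module-level constant; the point is that it may be raised later
# without changing the function signature.
DEFAULT_ENTROPY = 32  # bytes


def token_bytes(nbytes=None):
    """Return nbytes random bytes; None means "a reasonable default"."""
    if nbytes is None:
        nbytes = DEFAULT_ENTROPY
    return os.urandom(nbytes)
```

Callers that rely on the default never name a number, so changing the
constant in a maintenance release does not break them.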

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Sep 27 15:30:04 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Sep 2015 23:30:04 +1000
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <mu685p$k4t$1@ger.gmane.org>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150926130715.GG23642@ando.pearwood.info>
 <mu685p$k4t$1@ger.gmane.org>
Message-ID: <CADiSq7fU0pQmbqH_3pQ=FA6Yje5EgNcySmvqO9LFOQR1UAiDrg@mail.gmail.com>

On 26 September 2015 at 23:56, Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 26.09.15 16:07, Steven D'Aprano wrote:
>>
>> Question one:
>>
>> - token_bytes obviously should return bytes. What should the others
>>    return, bytes or str?
>
>
> Why not leave the conversion to the user? You can provide simple recipes in
> the documentation.
>
> def token_hex(nbytes):
>     return token_bytes(nbytes).hex()
>
> def token_url(nbytes):
>     return base64.urlsafe_b64encode(token_bytes(nbytes)).rstrip(b'=')
>
> We don't know what functions are needed by users. After the secrets module
> is widely used, we could gather the statistics of most popular patterns and
> add some of them in the stdlib.

We already have those patterns based on what web frameworks use - the
hex token generator pattern is taken from Pyramid's token generator,
while the base64 one is inspired by Django's (the latter actually uses
the "choosing from an alphabet" implementation style, but the proposed
base64 approach makes the same general trade-off of encoding more bits
of entropy per character to make the overall output shorter).
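
To make the trade-off concrete, here is a rough sketch of the "choosing from
an alphabet" style next to the bits carried per output character. The
alphabet, the function name and the default length are illustrative
assumptions, not Django's actual code.

```python
import math
import random
import string

# Assumed alphabet: 62 alphanumeric characters.
ALPHABET = string.ascii_letters + string.digits


def alphabet_token(length=43):
    """Build a token by picking each character with the OS random source."""
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))


# Entropy carried by one output character in each encoding.
bits_per_char = {
    "hex": math.log2(16),
    "base64": math.log2(64),
    "alphanumeric": math.log2(len(ALPHABET)),
}
```

Hex carries 4 bits per character and base64 carries 6; the 62-character
alphabet comes in just under 6, which is why the two non-hex styles produce
tokens of similar length for the same entropy.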

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Sep 27 15:35:33 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Sep 2015 23:35:33 +1000
Subject: [Python-ideas] PEP 506 (secrets module) and token functions
In-Reply-To: <CAPTjJmpsM_fsREibnSrCyzsrkGXZYZ5JYcJC817Tq-Co-_WryA@mail.gmail.com>
References: <20150919181612.GT31152@ando.pearwood.info>
 <20150926130715.GG23642@ando.pearwood.info>
 <CAPTjJmpsM_fsREibnSrCyzsrkGXZYZ5JYcJC817Tq-Co-_WryA@mail.gmail.com>
Message-ID: <CADiSq7dD7THmQX3zrU6LejyhkfL58Zqx5Ro2eXpoxEBzL-T0rQ@mail.gmail.com>

On 27 September 2015 at 00:04, Chris Angelico <rosuav at gmail.com> wrote:
> Can you adequately define "secure enough" across all purposes? If so,
> I would support that. The precise number would never be documented
> specifically (if you want to know what your version does, try it
> interactively), and then it can indeed be changed in 3.6.3 - or even
> without a version number bump at all (in ten years' time, Red Hat
> might choose to continue shipping CPython 3.6.1, but change the
> default entropy value).

We backported PEP 466 with its "the default SSL context settings may
change in maintenance releases" behaviour to the Python 2.7.5 based
system Python in RHEL 7.2, so I expect we'd be OK with backporting
changes to default entropy settings in the secrets module.

The default settings in the system provided OpenSSL have also long
been subject to change (that's one of the reasons CPython defaults to
dynamically linking to OpenSSL on *nix systems).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From vxgmichel at gmail.com  Sun Sep 27 15:36:05 2015
From: vxgmichel at gmail.com (Vincent Michel)
Date: Sun, 27 Sep 2015 15:36:05 +0200
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
 <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
Message-ID: <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>

Hi Guido,

Thanks for your interest,

I work for a synchrotron and we use the distributed control system
TANGO. The main implementation is in C++, but we use a python binding
called PyTango. The current server implementation (on the C++ side)
does not feature an event loop but instead creates a different thread
for each client.

TANGO: http://www.tango-controls.org/
PyTango: http://www.esrf.eu/computing/cs/tango/tango_doc/kernel_doc/pytango/latest/index.html

I wanted to add asyncio support to the library, so that we can benefit
from single-threaded asynchronous programming. The problem is that
client callbacks run in different threads and there is not much we can
do about it until a pure python implementation is developed (and it's
a lot of work). Instead, it is possible to use an asyncio event loop,
run the server through run_in_executor (just like you mentioned in
your mail), and redirect all the client callbacks to the event loop.
That's the part where job submission from a different thread comes in
handy.

A very similar solution has been developed using gevent, but I like
explicit coroutines better :p

Another use case is the communication between two event loops. From
what I've seen, the current context (get/set event loop) is only
related to the current thread. It makes it easy to run different event
loops in different threads. Even though I'm not sure what the use case
is, I suppose it's been done intentionally. Then the executor
interface is useful to run things like:

executor = LoopExecutor(other_loop)
result = await my_loop.run_in_executor(executor, coro_func, *args)

There is a working example in the test directory:
https://github.com/vxgmichel/asyncio-loopexecutor/blob/master/test/test_multi_loop.py

***

The coroutine(fn) cast only makes sense if a subclass of Executor is
used, in order to be consistent with the Executor.submit signature.
Otherwise, passing an already-called coroutine is perfectly fine. I
think it is a good idea to define a simple submit function like you
recommended:

def submit_to_loop(loop, coro):
    future = concurrent.futures.Future()
    callback = partial(schedule, coro, destination=future)
    loop.call_soon_threadsafe(callback)
    return future
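
The schedule helper is not shown in this message. One possible shape for it
is sketched below; the body of schedule(), the add() coroutine and the demo()
driver are assumptions for illustration and are not taken from the
repository. The sketch does not handle cancellation requested from the thread
side.

```python
import asyncio
import concurrent.futures
import threading
from functools import partial


def schedule(coro, destination):
    # Assumed helper. Runs in the loop thread: wrap the coroutine in a task
    # and mirror its outcome into the concurrent future `destination`.
    task = asyncio.ensure_future(coro)

    def copy_outcome(task):
        if task.cancelled():
            destination.cancel()
        elif task.exception() is not None:
            destination.set_exception(task.exception())
        else:
            destination.set_result(task.result())

    task.add_done_callback(copy_outcome)


def submit_to_loop(loop, coro):
    future = concurrent.futures.Future()
    callback = partial(schedule, coro, destination=future)
    loop.call_soon_threadsafe(callback)
    return future


async def add(a, b):
    await asyncio.sleep(0.01)
    return a + b


def demo():
    # Run the loop in a background thread and submit from this one.
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        return submit_to_loop(loop, add(1, 2)).result(timeout=5)
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
```

The calling thread blocks in result() while the coroutine runs in the loop
thread, which is the callback-thread pattern described earlier.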

And then use the executor interface if we realize it is actually
useful. It's really not a lot of code anyway:

class LoopExecutor(concurrent.futures.Executor):

    def __init__(self, loop=None):
        self.loop = loop or asyncio.get_event_loop()

    def submit(self, fn, *args, **kwargs):
        coro = asyncio.coroutine(fn)(*args, **kwargs)
        return submit_to_loop(self.loop, coro)
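
asyncio.coroutine was later removed (in Python 3.11), so the class above does
not run as written on current interpreters. A comparable sketch for newer
versions is given below. It assumes asyncio.run_coroutine_threadsafe (added
to the standard library after this discussion, in 3.4.4 and 3.5.1) as the
bridge, and it replaces the coroutine(fn) cast with an explicit awaitable
check; both choices are substitutions made for illustration, not the
repository's implementation.

```python
import asyncio
import concurrent.futures
import inspect
import operator
import threading


class LoopExecutor(concurrent.futures.Executor):

    def __init__(self, loop):
        self.loop = loop

    def submit(self, fn, *args, **kwargs):
        async def call():
            # Call fn inside the loop; await the result only if fn turned
            # out to be a coroutine function or returned a future.
            result = fn(*args, **kwargs)
            if inspect.isawaitable(result):
                result = await result
            return result

        return asyncio.run_coroutine_threadsafe(call(), self.loop)


def demo():
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        with LoopExecutor(loop) as executor:
            plain = executor.submit(operator.add, 1, 2).result(timeout=5)
            coro = executor.submit(asyncio.sleep, 0.01, result=3).result(timeout=5)
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
    return plain, coro
```

Both submissions go through the same path, so the caller does not need to
know whether fn is a plain function or a coroutine function.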

I'll update the repository.

Cheers,

Vincent


From guido at python.org  Sun Sep 27 18:42:46 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 Sep 2015 09:42:46 -0700
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
 <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
 <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>
Message-ID: <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>

OK, I think I understand your primary use case -- the C++ library calls
callbacks in their own threads but you want the callback code to run in
your event loop, where presumably it is structured as a coroutine and may
use `yield from` or `await` to wait for other coroutines, tasks or futures.
Then when that coroutine is done it returns a value which your machinery
passes back as the result of a concurrent.futures.Future on which the
callback thread is waiting.

I don't think the use case involving multiple event loops in different
threads is as clear. I am still waiting for someone who is actually trying
to use this. It might be useful on a system where there is a system event
loop that must be used for UI events (assuming this event loop can somehow
be wrapped in a custom asyncio loop) and where an app might want to have a
standard asyncio event loop for network I/O. Come to think of it, the
ProactorEventLoop on Windows has both advantages and disadvantages, and
some app might need to use both that and SelectorEventLoop. But this is a
real pain (because you can't share any mutable state between event loops).



-- 
--Guido van Rossum (python.org/~guido)

From vxgmichel at gmail.com  Sun Sep 27 22:29:05 2015
From: vxgmichel at gmail.com (Vincent Michel)
Date: Sun, 27 Sep 2015 22:29:05 +0200
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
 <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
 <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>
 <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>
Message-ID: <CAFvThkAoR8HrbDO3i7ZOFuiKNGkj=oC3EMbFsKBkzcMEMMwGjw@mail.gmail.com>

Yes that's exactly it. No problem for the multiple event loops, it was
a fun thing to play with. Then there's probably no reason to have a
loop executor either.

I think the important part is really the interface between asyncio
futures and concurrent futures, since it is not trivial to write and
maintain. In particular, getting exceptions and cancellation to work
safely can be a bit tricky.
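
As an illustration of those two cases, the sketch below shows how an
exception and a cancellation look from the submitting thread. It uses
asyncio.run_coroutine_threadsafe, which was added to the standard library
after this discussion (3.4.4 and 3.5.1); the fail() coroutine and the demo()
driver are made up for the example.

```python
import asyncio
import concurrent.futures
import threading


async def fail():
    raise ValueError("boom")


def demo():
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    outcomes = []
    try:
        # An exception raised in the coroutine resurfaces in the caller.
        failing = asyncio.run_coroutine_threadsafe(fail(), loop)
        try:
            failing.result(timeout=5)
        except ValueError as exc:
            outcomes.append(str(exc))

        # Cancelling from the thread side is reported as CancelledError.
        sleeping = asyncio.run_coroutine_threadsafe(asyncio.sleep(60), loop)
        outcomes.append(sleeping.cancel())
        try:
            sleeping.result(timeout=5)
        except concurrent.futures.CancelledError:
            outcomes.append("cancelled")

        # Give the loop a chance to finish cancelling the sleeping task.
        asyncio.run_coroutine_threadsafe(asyncio.sleep(0), loop).result(timeout=5)
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
    return outcomes
```

The cancellation has to travel back into the loop thread before the task
actually stops, which is part of what makes a hand-written bridge easy to
get wrong.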


2015-09-27 18:42 GMT+02:00 Guido van Rossum <guido at python.org>:
> OK, I think I understand your primary use case -- the C++ library calls
> callbacks in their own threads but you want the callback code to run in your
> event loop, where presumably it is structured as a coroutine and may use
> `yield from` or `await` to wait for other coroutines, tasks or futures. Then
> when that coroutine is done it returns a value which your machinery passes
> back as the result of a concurrent.futures.Future on which the callback
> thread is waiting.
>
> I don't think the use case involving multiple event loops in different
> threads is as clear. I am still waiting for someone who is actually trying
> to use this. It might be useful on a system where there is a system event
> loop that must be used for UI events (assuming this event loop can somehow
> be wrapped in a custom asyncio loop) and where an app might want to have a
> standard asyncio event loop for network I/O. Come to think of it, the
> ProactorEventLoop on Windows has both advantages and disadvantages, and some
> app might need to use both that and SelectorEventLoop. But this is a real
> pain (because you can't share any mutable state between event loops).
>
>
>
> On Sun, Sep 27, 2015 at 6:36 AM, Vincent Michel <vxgmichel at gmail.com> wrote:
>>
>> Hi Guido,
>>
>> Thanks for your interest,
>>
>> I work for a synchrotron and we use the distributed control system
>> TANGO. The main implementation is in C++, but we use a python binding
>> called PyTango. The current server implementation (on the C++ side)
>> does not feature an event loop but instead creates a different thread
>> for each client.
>>
>> TANGO: http://www.tango-controls.org/
>> PyTango:
>> http://www.esrf.eu/computing/cs/tango/tango_doc/kernel_doc/pytango/latest/index.html
>>
>> I wanted to add asyncio support to the library, so that we can benefit
>> from single-threaded asynchronous programming. The problem is that
>> client callbacks run in different threads and there is not much we can
>> do about it until a pure python implementation is developed (and it's
>> a lot of work). Instead, it is possible to use an asyncio event loop,
>> run the server through run_in_executor (just like you mentioned in
>> your mail), and redirect all the client callbacks to the event loop.
>> That's the part where job submission from a different thread comes in
>> handy.
>>
>> A very similar solution has been developed using gevent, but I like
>> explicit coroutines better :p
>>
>> Another use case is the communication between two event loops. From
>> what I've seen, the current context (get/set event loop) is only
>> related to the current thread. It makes it easy to run different event
>> loops in different threads. Even though I'm not sure what the use case
>> is, I suppose it's been done intentionally. Then the executor
>> interface is useful to run things like:
>>
>> executor = LoopExecutor(other_loop)
>> result = await my_loop.run_in_executor(executor, coro_func, *args)
>>
>> There is a working example in the test directory:
>>
>> https://github.com/vxgmichel/asyncio-loopexecutor/blob/master/test/test_multi_loop.py
>>
>> ***
>>
>> The coroutine(fn) cast only makes sense if a subclass of Executor is
>> used, in order to be consistent with the Executor.submit signature.
>> Otherwise, passing an already-called coroutine is perfectly fine. I
>> think it is a good idea to define a simple submit function like you
>> recommended:
>>
>> def submit_to_loop(loop, coro):
>>     future = concurrent.futures.Future()
>>     callback = partial(schedule, coro, destination=future)
>>     loop.call_soon_threadsafe(callback)
>>     return future
>>
>> And then use the executor interface if we realize it is actually
>> useful. It's really not a lot of code anyway:
>>
>> class LoopExecutor(concurrent.futures.Executor):
>>
>>     def __init__(self, loop=None):
>>         self.loop = loop or asyncio.get_event_loop()
>>
>>     def submit(self, fn, *args, **kwargs):
>>         coro = asyncio.coroutine(fn)(*args, **kwargs)
>>         return submit_to_loop(self.loop, coro)
>>
>> I'll update the repository.
>>
>> Cheers,
>>
>> Vincent
>>
>> 2015-09-27 4:52 GMT+02:00 Guido van Rossum <guido at python.org>:
>> >
>> > Hi Vincent,
>> >
>> > I've read your write-up with interest. You're right that it's a bit
>> > awkward to make calls from the threaded world into the asyncio world.
>> > Interestingly, there's much better support for passing work off from the
>> > asyncio event loop to a thread (run_in_executor()). Perhaps that's because
>> > the use case there was obvious from the start: some things that may block
>> > for I/O just don't have an async interface yet, so in order to use them from
>> > an asyncio task they must be off-loaded to a separate thread or else the
>> > entire event loop is blocked. (This is used for calling getaddrinfo(), for
>> > example.)
>> >
>> > I'm curious where you have encountered the opposite use case?
>> >
>> > I think if I had to do this myself I would go for a more minimalist
>> > interface: something like your submit() method but without the call to
>> > asyncio.coroutine(fn). Having the caller pass in the already-called
>> > coroutine object might simplify the signature even further. I'm not sure I
>> > see the advantage of trying to make this an executor -- but perhaps I'm
>> > missing something?
>> >
>> > --Guido
>> >
>> >
>> >
>> > On Sat, Sep 26, 2015 at 7:29 AM, Vincent Michel <vxgmichel at gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I noticed there is currently no standard solution to submit a job from
>> >> a thread to an asyncio event loop.
>> >>
>> >> Here's what the asyncio documentation says about concurrency and
>> >> multithreading:
>> >>
>> >> > To schedule a callback from a different thread, the
>> >> > BaseEventLoop.call_soon_threadsafe() method should be used.
>> >> > Example to schedule a coroutine from a different thread:
>> >> >     loop.call_soon_threadsafe(asyncio.async, coro_func())
>> >>
>> >> The issue with this method is the loss of the coroutine result.
>> >>
>> >> One way to deal with this issue is to connect the asyncio.Future
>> >> returned by async (or ensure_future) to a concurrent.futures.Future. It is
>> >> then possible to use a subclass of concurrent.futures.Executor to submit a
>> >> callback to an asyncio event loop. Such an executor can also be used to set
>> >> up communication between two event loops using run_in_executor.
>> >>
>> >> I posted an implementation called LoopExecutor on GitHub:
>> >> https://github.com/vxgmichel/asyncio-loopexecutor
>> >> The repo contains the loopexecutor module along with tests for several
>> >> use cases. The README describes the whole thing (context, examples, issues,
>> >> implementation).
>> >>
>> >> It is interesting to note that this executor is a bit different than
>> >> ThreadPoolExecutor and ProcessPoolExecutor since it can also submit a
>> >> coroutine function. Example:
>> >>
>> >> with LoopExecutor(loop) as executor:
>> >>     future = executor.submit(operator.add, 1, 2)
>> >>     assert future.result() == 3
>> >>     future = executor.submit(asyncio.sleep, 0.1, result=3)
>> >>     assert future.result() == 3
>> >>
>> >> This works in both cases because submit always casts the given function
>> >> to a coroutine. That means it would also work with a function that returns a
>> >> Future.
>> >>
>> >> Here are a few topics related to the current implementation that might be
>> >> interesting to discuss:
>> >>
>> >> - possible drawback of casting the callback to a coroutine
>> >> - possible drawback of concurrent.futures.Future using
>> >> asyncio.Future._copy_state
>> >> - does LoopExecutor need to implement the shutdown method?
>> >> - removing the limitation in run_in_executor (can't submit a coroutine
>> >> function)
>> >> - adding a generic Future connection function in asyncio
>> >> - reimplementing wrap_future with the generic connection
>> >> - adding LoopExecutor to asyncio (or concurrent.futures)
>> >>
>> >> At the moment, the interaction between asyncio and concurrent.futures
>> >> only goes one way. It would be nice to have a standard solution
>> >> (LoopExecutor or something else) to make it bidirectional.
>> >>
>> >> Thanks,
>> >>
>> >> Vincent
>> >>
>> >>
>> >
>> >
>> >
>> >
>> > --
>> > --Guido van Rossum (python.org/~guido)
>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)

From guido at python.org  Sun Sep 27 22:39:12 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 Sep 2015 13:39:12 -0700
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAFvThkAoR8HrbDO3i7ZOFuiKNGkj=oC3EMbFsKBkzcMEMMwGjw@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
 <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
 <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>
 <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>
 <CAFvThkAoR8HrbDO3i7ZOFuiKNGkj=oC3EMbFsKBkzcMEMMwGjw@mail.gmail.com>
Message-ID: <CAP7+vJKrJENvUt6vmrFdR-9eqSayLuzUF27pAsm5JwJoj818Xw@mail.gmail.com>

Do you want to propose a minimal patch to asyncio? A PR for
https://github.com/python/asyncio would be the best thing to do. I'd leave
the LoopExecutor out of it for now. The code could probably live at the
bottom of futures.py.

On Sun, Sep 27, 2015 at 1:29 PM, Vincent Michel <vxgmichel at gmail.com> wrote:

> Yes that's exactly it. No problem for the multiple event loops, it was
> a fun thing to play with. Then there's probably no reason to have a
> loop executor either.
>
> I think the important part is really the interface between asyncio
> futures and concurrent futures, since it is not trivial to write and
> maintain. In particular, getting exceptions and cancellation to work
> safely can be a bit tricky.



-- 
--Guido van Rossum (python.org/~guido)

From vxgmichel at gmail.com  Sun Sep 27 23:16:42 2015
From: vxgmichel at gmail.com (Vincent Michel)
Date: Sun, 27 Sep 2015 23:16:42 +0200
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAP7+vJKrJENvUt6vmrFdR-9eqSayLuzUF27pAsm5JwJoj818Xw@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
 <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
 <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>
 <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>
 <CAFvThkAoR8HrbDO3i7ZOFuiKNGkj=oC3EMbFsKBkzcMEMMwGjw@mail.gmail.com>
 <CAP7+vJKrJENvUt6vmrFdR-9eqSayLuzUF27pAsm5JwJoj818Xw@mail.gmail.com>
Message-ID: <CAFvThkD1G2=U-N2Kwbii1EZO2hxuOYyKrB-dKg2W3BrfKoGGOQ@mail.gmail.com>

Great, I'll do that!

2015-09-27 22:39 GMT+02:00 Guido van Rossum <guido at python.org>:
> Do you want to propose a minimal patch to asyncio? A PR for
> https://github.com/python/asyncio would be the best thing to do. I'd leave
> the LoopExecutor out of it for now. The code could probably live at the
> bottom of futures.py.

From eric at trueblade.com  Mon Sep 28 03:23:30 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Sun, 27 Sep 2015 21:23:30 -0400
Subject: [Python-ideas] Binary f-strings
Message-ID: <56089692.1080303@trueblade.com>

Now that f-strings are in the 3.6 branch, I'd like to turn my attention
to binary f-strings (fb'' or bf'').

The idea is that:

>>> bf'datestamp:{datetime.datetime.now():%Y%m%d}\r\n'

Might be translated as:

>>> (b'datestamp:' +
...  bytes(format(datetime.datetime.now(),
...               str(b'%Y%m%d', 'ascii')),
...        'ascii') +
...  b'\r\n')


Which would result in:
b'datestamp:20150927\r\n'
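
For reference, the expansion above can be run today as ordinary Python by
substituting a fixed date. The helper name expand_datestamp is made up for
this sketch:

```python
import datetime


def expand_datestamp(when):
    # Hand-written equivalent of the proposed bf-string for a datestamp,
    # using the 'ascii' rule for both the format spec and the result.
    spec = str(b'%Y%m%d', 'ascii')             # format spec is decoded to str
    body = bytes(format(when, spec), 'ascii')  # __format__ result is encoded
    return b'datestamp:' + body + b'\r\n'


print(expand_datestamp(datetime.datetime(2015, 9, 27)))
# b'datestamp:20150927\r\n'
```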

The only real question is: what encoding to use for the second parameter
to bytes()? Since an object must return unicode from __format__(), I
need to convert that to bytes in order to join everything together. But how?

Here I suggest 'ascii'. Unfortunately, this would give an error if
__format__ returned anything with a char greater than 127. I think we've
learned that an API that only raises an exception with certain specific
inputs is fragile.
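
The value-dependent failure described above can be seen with a small
stand-in for the proposed expansion. The function ascii_field is a made-up
name for this sketch, not part of any proposal:

```python
def ascii_field(value, spec=''):
    # Stand-in for one replacement field: format to str, then encode as ASCII.
    return bytes(format(value, spec), 'ascii')


print(ascii_field(42, '05d'))   # b'00042'

try:
    ascii_field('caf\u00e9')    # contains a character above 127
except UnicodeEncodeError as exc:
    print('encoding failed:', exc.reason)
```

The same call succeeds or raises depending only on the value passed in,
which is the fragility in question.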

Guido has suggested using 'utf-8' as the encoding. That has some appeal,
but if we're designing this for wire protocols, not all protocols will
be using utf-8.

Another idea would be to extend the "conversion char" from just 's',
'r', or 'a', which don't make much sense for bytes, to instead be a
string that specifies the encoding. The default could be ascii, and if
you want to specify something else:
bf'datestamp:{datetime.datetime.now()!utf-8:%Y%m%d}\r\n'

That would work for any encoding that doesn't have ':', '{', or '}' in
the encoding name. Which seems like a reasonable restriction.

And I might be over-generalizing here, but you'd presumably want to make
the encoding a non-constant:
bf'datestamp:{datetime.datetime.now()!{encoding}:%Y%m%d}\r\n'

I think my initial proposal will be to use 'ascii', and not support any
conversion characters at all for fb-strings, not even 's', 'r', and 'a'.
In the future, if we want to support encodings other than 'ascii', we
could then add !conversions mapping to encodings.

My reasoning for using 'ascii' is that 'utf-8' could easily be an error
for non-utf-8 protocols. And by using 'ascii', at least we'd give a
runtime error and not put possibly bogus data into the resulting binary
string. Granted, the tradeoff is that we now have a case where whether
or not the code raises an exception is dependent upon the values being
formatted. If 'ascii' is the default, we could later switch to 'utf-8',
but we couldn't go the other way.

The only place this is likely to be a problem is when formatting unicode
string values. No other built-in type is going to have a non-ascii
compatible character in its __format__, unless you do tricky things with
datetime format_specs. Of course user-defined types can return any
unicode chars from __format__.

Once we make a decision, I can apply the same logic to b''.format(), if
that's desirable.

I'm open to suggestions on this.

Thanks for reading.

-- 
Eric.

From steve at pearwood.info  Mon Sep 28 04:09:58 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 28 Sep 2015 12:09:58 +1000
Subject: [Python-ideas] Binary f-strings
In-Reply-To: <56089692.1080303@trueblade.com>
References: <56089692.1080303@trueblade.com>
Message-ID: <20150928020957.GK23642@ando.pearwood.info>

On Sun, Sep 27, 2015 at 09:23:30PM -0400, Eric V. Smith wrote:
> Now that f-strings are in the 3.6 branch, I'd like to turn my attention
> to binary f-strings (fb'' or bf'').
> 
> The idea is that:
> 
> >>> bf'datestamp:{datetime.datetime.now():%Y%m%d}\r\n'
> 
> Might be translated as:
> 
> >>> (b'datestamp:' +
> ...  bytes(format(datetime.datetime.now(),
> ...               str(b'%Y%m%d', 'ascii')),
> ...        'ascii') +
> ...  b'\r\n')

What's wrong with this?

f'datestamp:{datetime.datetime.now():%Y%m%d}\r\n'.encode('ascii')

This eliminates all your questions about which encoding we should guess 
is more useful (ascii? utf-8? something else?), allows the caller 
to set an error handler without inventing yet more cryptic format codes, 
and is nicely explicit.

If people are worried about the length of ".encode(...)", a helper 
function works great:

def b(s): return bytes(s, 'utf-8')  
# or whatever encoding makes sense for them

b(f'datestamp:{datetime.datetime.now():%Y%m%d}\r\n')
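
The point about error handlers can be sketched as follows. This is only an illustration, assuming Python 3.6+ f-strings, a fixed date, and a made-up non-ASCII value; the variable names are not from the thread.

```python
import datetime

stamp = datetime.datetime(2015, 9, 27)
text = f'datestamp:{stamp:%Y%m%d} caf\u00e9\r\n'

# Strict encoding raises as soon as a non-ASCII character shows up.
try:
    text.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The caller, not the language, picks the error handler or the codec.
replaced = text.encode('ascii', errors='replace')
escaped = text.encode('ascii', errors='backslashreplace')
as_utf8 = text.encode('utf-8')

print(strict_failed, replaced, escaped, as_utf8, sep='\n')
```

With an explicit .encode() call, both the codec and the error policy stay visible at the call site, which a fixed default inside bf'' could not offer.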


> Which would result in:
> b'datestamp:20150927\r\n'
> 
> The only real question is: what encoding to use for the second parameter
> to bytes()? Since an object must return unicode from __format__(), I
> need to convert that to bytes in order to join everything together. But how?
> 
> Here I suggest 'ascii'. Unfortunately, this would give an error if
> __format__ returned anything with a char greater than 127. I think we've
> learned that an API that only raises an exception with certain specific
> inputs is fragile.
> 
> Guido has suggested using 'utf-8' as the encoding. That has some appeal,
> but if we're designing this for wire protocols, not all protocols will
> be using utf-8.

Using UTF-8 is not sufficient, since there are strings that can't be 
encoded into UTF-8 because they contain surrogates:

py> '\uDA11'.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\uda11' in 
position 0: surrogates not allowed


but we surely don't want to suppress such errors by default. Sometimes 
they will be an error that needs fixing.



-- 
Steve

From njs at pobox.com  Mon Sep 28 04:41:50 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 27 Sep 2015 19:41:50 -0700
Subject: [Python-ideas] Binary f-strings
In-Reply-To: <56089692.1080303@trueblade.com>
References: <56089692.1080303@trueblade.com>
Message-ID: <CAPJVwBkTthOOJrzJZE08nE5FUCsNdUMUSDdrKuMW=UEVQ54xLA@mail.gmail.com>

Naively, I'd expect that since f-strings and .format share the same
infrastructure, fb-strings should work the same way as bytes.format --
and in particular, either both should be supported or neither. Since
bytes.format apparently got rejected during the PEP 460/PEP 461
discussions:
    https://bugs.python.org/issue3982#msg224023
I guess you'd need to dig up those earlier discussions and see what
the issues were?

-n

On Sun, Sep 27, 2015 at 6:23 PM, Eric V. Smith <eric at trueblade.com> wrote:
> Now that f-strings are in the 3.6 branch, I'd like to turn my attention
> to binary f-strings (fb'' or bf'').
>
> The idea is that:
>
>>>> bf'datestamp:{datetime.datetime.now():%Y%m%d}\r\n'
>
> Might be translated as:
>
>>>> (b'datestamp:' +
> ...  bytes(format(datetime.datetime.now(),
> ...               str(b'%Y%m%d', 'ascii')),
> ...        'ascii') +
> ...  b'\r\n')
>
>
> Which would result in:
> b'datestamp:20150927\r\n'
>
> The only real question is: what encoding to use for the second parameter
> to bytes()? Since an object must return unicode from __format__(), I
> need to convert that to bytes in order to join everything together. But how?
>
> Here I suggest 'ascii'. Unfortunately, this would give an error if
> __format__ returned anything with a char greater than 127. I think we've
> learned that an API that only raises an exception with certain specific
> inputs is fragile.
>
> Guido has suggested using 'utf-8' as the encoding. That has some appeal,
> but if we're designing this for wire protocols, not all protocols will
> be using utf-8.
>
> Another idea would be to extend the "conversion char" from just 's',
> 'r', or 'a', which don't make much sense for bytes, to instead be a
> string that specifies the encoding. The default could be ascii, and if
> you want to specify something else:
> bf'datestamp:{datetime.datetime.now()!utf-8:%Y%m%d}\r\n'
>
> That would work for any encoding that doesn't have ':', '{', or '}' in
> the encoding name. Which seems like a reasonable restriction.
>
> And I might be over-generalizing here, but you'd presumably want to make
> the encoding a non-constant:
> bf'datestamp:{datetime.datetime.now()!{encoding}:%Y%m%d}\r\n'
>
> I think my initial proposal will be to use 'ascii', and not support any
> conversion characters at all for fb-strings, not even 's', 'r', and 'a'.
> In the future, if we want to support encodings other than 'ascii', we
> could then add !conversions mapping to encodings.
>
> My reasoning for using 'ascii' is that 'utf-8' could easily be an error
> for non-utf-8 protocols. And by using 'ascii', at least we'd give a
> runtime error and not put possibly bogus data into the resulting binary
> string. Granted, the tradeoff is that we now have a case where whether
> or not the code raises an exception is dependent upon the values being
> formatted. If 'ascii' is the default, we could later switch to 'utf-8',
> but we couldn't go the other way.
>
> The only place this is likely to be a problem is when formatting unicode
> string values. No other built-in type is going to have a non-ascii
> compatible character in its __format__, unless you do tricky things with
> datetime format_specs. Of course user-defined types can return any
> unicode chars from __format__.
>
> Once we make a decision, I can apply the same logic to b''.format(), if
> that's desirable.
>
> I'm open to suggestions on this.
>
> Thanks for reading.
>
> --
> Eric.



-- 
Nathaniel J. Smith -- http://vorpus.org

From rosuav at gmail.com  Mon Sep 28 05:03:32 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 28 Sep 2015 13:03:32 +1000
Subject: [Python-ideas] Binary f-strings
In-Reply-To: <CAPJVwBkTthOOJrzJZE08nE5FUCsNdUMUSDdrKuMW=UEVQ54xLA@mail.gmail.com>
References: <56089692.1080303@trueblade.com>
 <CAPJVwBkTthOOJrzJZE08nE5FUCsNdUMUSDdrKuMW=UEVQ54xLA@mail.gmail.com>
Message-ID: <CAPTjJmrSxqKeSACoDHSP59R9Qz=L8SyV2w9Kv1E77Z4yLsb7OA@mail.gmail.com>

On Mon, Sep 28, 2015 at 12:41 PM, Nathaniel Smith <njs at pobox.com> wrote:
> Naively, I'd expect that since f-strings and .format share the same
> infrastructure, fb-strings should work the same way as bytes.format --
> and in particular, either both should be supported or neither. Since
> bytes.format apparently got rejected during the PEP 460/PEP 461
> discussions:
>     https://bugs.python.org/issue3982#msg224023
> I guess you'd need to dig up those earlier discussions and see what
> the issues were?

The biggest issues are summarized into PEP 461:

https://www.python.org/dev/peps/pep-0461/#proposed-variations

Since the __format__ machinery is all based around text strings,
there'll need to be some (explicit or implicit) encode step. Hence
this thread.

How bad would it be to simply say "there are no bf strings"? As Steven
says, you can simply use a normal f''.encode() operation, with no
confusion. Otherwise, there'll be these "format-like" operations that
can do things that format() can't do... and then there'd be edge
cases, too, like a string with a b-prefix that contains non-ASCII
characters in it:

>>> восток = 1961
>>> apollo = 1969
>>> print(f"It took {apollo-восток} years to get from orbit to the moon.")
It took 8 years to get from orbit to the moon.
>>> print(b"It took {apollo-восток} years to get from orbit to the moon.")
  File "<stdin>", line 1
SyntaxError: bytes can only contain ASCII literal characters.

If that were a binary f-string, those Cyrillic characters should still
be legal (as they define an identifier, rather than ending up in the
code). Would it confuse (a) humans, or (b) tools, to have these "texty
bits" inside a byte string?

In any case, bf strings can be added later, but once they're added,
their semantics would be locked in. I'd be inclined to leave them out
for 3.6 and see what people say. A bit of real-world usage of
f-strings might show a clear front-runner in terms of expectations
(UTF-8, ASCII, or something else).

ChrisA

From steve at pearwood.info  Mon Sep 28 05:28:45 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 28 Sep 2015 13:28:45 +1000
Subject: [Python-ideas] Binary f-strings
In-Reply-To: <CAPTjJmrSxqKeSACoDHSP59R9Qz=L8SyV2w9Kv1E77Z4yLsb7OA@mail.gmail.com>
References: <56089692.1080303@trueblade.com>
 <CAPJVwBkTthOOJrzJZE08nE5FUCsNdUMUSDdrKuMW=UEVQ54xLA@mail.gmail.com>
 <CAPTjJmrSxqKeSACoDHSP59R9Qz=L8SyV2w9Kv1E77Z4yLsb7OA@mail.gmail.com>
Message-ID: <20150928032845.GM23642@ando.pearwood.info>

On Mon, Sep 28, 2015 at 01:03:32PM +1000, Chris Angelico wrote:
[...]
> >>> восток = 1961
> >>> apollo = 1969
> >>> print(f"It took {apollo-восток} years to get from orbit to the moon.")
> It took 8 years to get from orbit to the moon.
> >>> print(b"It took {apollo-восток} years to get from orbit to the moon.")
>   File "<stdin>", line 1
> SyntaxError: bytes can only contain ASCII literal characters.
> 
> If that were a binary f-string, those Cyrillic characters should still
> be legal (as they define an identifier, rather than ending up in the
> code). Would it confuse (a) humans, or (b) tools, to have these "texty
> bits" inside a byte string?

It would confuse the heck out of me. I leave it to the reader to decide 
whether I am a human or a tool.


-- 
Steve

From abarnert at yahoo.com  Mon Sep 28 05:48:02 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 27 Sep 2015 20:48:02 -0700
Subject: [Python-ideas] Binary f-strings
In-Reply-To: <56089692.1080303@trueblade.com>
References: <56089692.1080303@trueblade.com>
Message-ID: <FF2EF7A4-8134-49EB-9EFC-F763893A68C4@yahoo.com>

On Sep 27, 2015, at 18:23, Eric V. Smith <eric at trueblade.com> wrote:
> 
> The only place this is likely to be a problem is when formatting unicode
> string values. No other built-in type is going to have a non-ascii
> compatible character in its __format__, unless you do tricky things with
> datetime format_specs. Of course user-defined types can return any
> unicode chars from __format__.

The fact that it can't handle bytes and bytes-like types makes this much less useful than %.

Beyond that, the fact that it only works reliably for the same types as %, minus bytes, plus a few others including datetime means the benefit isn't nearly as large as for f-strings and str.format, which work reliably for every type in the world, and extensibly so for many types. And meanwhile, the cost is much higher, from code that seems to work if you don't test it well to even higher performance costs (and usually in code that needs performance more).

Of course you could create a __bformat__(*args, encoding, errors, **kw) protocol (where object.__bformat__ just returns self.__format__(*args, **kw).encode(encoding, errors)), which has the same effect as your proposal except that types that need to know they're being bytes-formatted to do something reasonable, or that just want to know so they can optimize, can do so. And this of course lets you add __bformat__ to bytes, etc.--although it doesn't seem to help for types that support the buffer protocol, so it's still not as good as %b. But I don't think anyone will want that.
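
A rough sketch of how such a protocol might look. Everything here is hypothetical: neither __bformat__ nor the bformat() helper exists in Python, and the Payload class is invented purely to show a type that opts in.

```python
def bformat(value, format_spec='', *, encoding='ascii', errors='strict'):
    """Hypothetical bytes counterpart of the built-in format()."""
    hook = getattr(type(value), '__bformat__', None)
    if hook is not None:
        # The type knows it is being bytes-formatted and can act on that.
        return hook(value, format_spec, encoding=encoding, errors=errors)
    # Default described above: format as text, then encode.
    return format(value, format_spec).encode(encoding, errors)


class Payload:
    """Invented example type that hands back its raw bytes unchanged."""

    def __init__(self, raw):
        self.raw = raw

    def __bformat__(self, format_spec, *, encoding, errors):
        return self.raw


as_hex = bformat(255, '04x')
raw = bformat(Payload(b'\xff\x00'))
print(as_hex, raw)
```

Types without the hook fall through to the text-then-encode path, so they still fail on non-ASCII output under the 'ascii' default.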

From ncoghlan at gmail.com  Mon Sep 28 09:13:06 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 28 Sep 2015 17:13:06 +1000
Subject: [Python-ideas] Using `or?` as the null coalescing operator
In-Reply-To: <6C2E5579-42A0-423F-AB8C-01B49FA59D67@gmail.com>
References: <6C2E5579-42A0-423F-AB8C-01B49FA59D67@gmail.com>
Message-ID: <CADiSq7eR=G1m=KH2CGa7LgVx4jCcYxfnRPC6Ta=f0crFFEmDAQ@mail.gmail.com>

On 25 September 2015 at 09:07, Alessio Bogon <youtux at gmail.com> wrote:
> I really like PEP 0505. The only thing that does not convince me is the `??` operator. I would like to know what you think of an alternative like `or?`:
>
> a_list = some_list or? []
> a_dict = some_dict or? {}
>
> The rationale behind is to let `or` do its job with "truthy" values, while `or?` would require non-None values.
> The rest of the PEP looks good to me.
>
> I apologise in advance if this was already proposed and I missed it.

It hasn't been suggested that I recall, and yes, I also prefer it to
the doubled ?? spelling. One concrete advantage is that it helps
convey that this is a short-circuiting control flow operator like
'and' and 'or' rather than a normal binary operator that always
evaluates both operands.
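
To make the difference from plain 'or' concrete, here is a small sketch in current Python. The coalesce() helper is a hypothetical stand-in for the proposed operator, and it takes a callable so that the default is only built when needed, which imitates the short-circuiting.

```python
def coalesce(value, make_default):
    """Hypothetical stand-in for `value or? make_default()`."""
    # The default is built only when needed, mirroring short-circuiting.
    return value if value is not None else make_default()


calls = []

def default_list():
    calls.append('built')
    return ['default']


empty = []
with_or = empty or default_list()               # falsy list is replaced
with_none_test = coalesce(empty, default_list)  # empty list is kept
from_none = coalesce(None, default_list)        # only None triggers the default

print(with_or, with_none_test, from_none, len(calls))
```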

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From k7hoven at gmail.com  Mon Sep 28 12:50:15 2015
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 28 Sep 2015 13:50:15 +0300
Subject: [Python-ideas] Using `or?` as the null coalescing operator
In-Reply-To: <CADiSq7eR=G1m=KH2CGa7LgVx4jCcYxfnRPC6Ta=f0crFFEmDAQ@mail.gmail.com>
References: <6C2E5579-42A0-423F-AB8C-01B49FA59D67@gmail.com>
 <CADiSq7eR=G1m=KH2CGa7LgVx4jCcYxfnRPC6Ta=f0crFFEmDAQ@mail.gmail.com>
Message-ID: <CAMiohog5gNdkjhF_54reSyA8mDjqiqqKGd9SreSpzJGRvyKRYg@mail.gmail.com>

On Mon, Sep 28, 2015 at 10:13 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 25 September 2015 at 09:07, Alessio Bogon <youtux at gmail.com> wrote:
>> I really like PEP 0505. The only thing that does not convince me is the `??` operator. I would like to know what you think of an alternative like `or?`:
>>
>> a_list = some_list or? []
>> a_dict = some_dict or? {}
>>

And have the following syntax options been considered?

a_list = some_list else []

a_list = some_list or [] if None

-- Koos

From toddrjen at gmail.com  Mon Sep 28 13:46:36 2015
From: toddrjen at gmail.com (Todd)
Date: Mon, 28 Sep 2015 13:46:36 +0200
Subject: [Python-ideas] maxsplit in os.path.split
Message-ID: <CAFpSVpJygCeBcUfDQ41NV_yyZW=ny6kYsF0_cX7uhAHe+u=QWw@mail.gmail.com>

The "str.split" and "str.rsplit" methods have a useful "maxsplit" option,
which lets you set the number of times to split, defaulting to -1 (which is
"unlimited").  The corresponding "os.path.split", however, has no
"maxsplit" option.  It can only split once, which splits the last path
segment (the "basename") from the rest (equivalent of "str.rsplit" with
"maxsplit=1").

I think it would be useful if "os.path.split" also had a "maxsplit"
option.  This would default to "1" (the current behavior), but could be
set to any value allowed by "str.split".  Using this option would follow
the behavior of "str.rsplit" for that value of "maxsplit".
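
One way to picture the requested behavior, as a sketch only: rsplit_path() is a hypothetical helper built on pathlib, not a proposed implementation of "os.path.split", and it uses POSIX path rules so that it behaves the same on every platform.

```python
from pathlib import PurePosixPath


def rsplit_path(path, maxsplit=1):
    """Hypothetical helper: split off up to maxsplit trailing components."""
    parts = PurePosixPath(path).parts
    if maxsplit < 0 or maxsplit >= len(parts):
        # Like str.rsplit with maxsplit=-1: split everything.
        return list(parts)
    cut = len(parts) - maxsplit
    head = PurePosixPath(*parts[:cut])
    return [str(head), *parts[cut:]]


print(rsplit_path('/usr/local/lib/python', 1))
print(rsplit_path('/usr/local/lib/python', 2))
print(rsplit_path('/usr/local/lib/python', -1))
```

With maxsplit=1 the result matches the head/tail pair from "os.path.split", returned as a list.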

From p.f.moore at gmail.com  Mon Sep 28 13:58:41 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 28 Sep 2015 12:58:41 +0100
Subject: [Python-ideas] maxsplit in os.path.split
In-Reply-To: <CAFpSVpJygCeBcUfDQ41NV_yyZW=ny6kYsF0_cX7uhAHe+u=QWw@mail.gmail.com>
References: <CAFpSVpJygCeBcUfDQ41NV_yyZW=ny6kYsF0_cX7uhAHe+u=QWw@mail.gmail.com>
Message-ID: <CACac1F-Jzk6BYxAREJbMUFjOF5+5_++_0-4PM8iQBWJbMx0bbA@mail.gmail.com>

On 28 September 2015 at 12:46, Todd <toddrjen at gmail.com> wrote:
> I think it would be useful if "os.path.split" also had a "maxsplit" option.
> This would default to "1" (the current behavior), but could be set to any
> value allowed by "str.split".  Using this option would follow the behavior
> of "str.rsplit" for that value of "maxsplit".

In Python 3.6+ (which is the only place a change like this is likely
to happen) you're probably better using pathlib. There, you can use
path.parts, which returns a tuple of the path elements, so you can do
things like

    >>> Path('C:\\what\\ever\\you\\like.txt').parts[-3:]
    ('ever', 'you', 'like.txt')

That's usable now in Python 3.4+, and a backport is available at
https://pypi.python.org/pypi/pathlib/
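
The example above assumes a Windows machine, where Path understands backslashes. A variant that should give the same result on any platform, using PureWindowsPath (the variable names are only for illustration):

```python
from pathlib import PureWindowsPath

path = PureWindowsPath('C:\\what\\ever\\you\\like.txt')

all_parts = path.parts       # the drive and root form the first element
last_three = path.parts[-3:]
head, tail = path.parent, path.name

print(all_parts)
print(last_three)
print(head, tail)
```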

Paul

From guido at python.org  Mon Sep 28 16:12:09 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 07:12:09 -0700
Subject: [Python-ideas] maxsplit in os.path.split
In-Reply-To: <CACac1F-Jzk6BYxAREJbMUFjOF5+5_++_0-4PM8iQBWJbMx0bbA@mail.gmail.com>
References: <CAFpSVpJygCeBcUfDQ41NV_yyZW=ny6kYsF0_cX7uhAHe+u=QWw@mail.gmail.com>
 <CACac1F-Jzk6BYxAREJbMUFjOF5+5_++_0-4PM8iQBWJbMx0bbA@mail.gmail.com>
Message-ID: <CAP7+vJK3GoMi07tokCc0nbEa8OQtDyx__L9axTeQdaYT0+T+WA@mail.gmail.com>

Also, the similarity between str.*split() and os.path.split() is not close
enough to draw conclusions about one from the other.

On Mon, Sep 28, 2015 at 4:58 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 28 September 2015 at 12:46, Todd <toddrjen at gmail.com> wrote:
> > I think it would be useful if "os.path.split" also had a "maxsplit"
> option.
> > This would default to "1" (the current behavior), but could be set to
> any
> > value allowed by "str.split".  Using this option would follow the
> behavior
> > of "str.rsplit" for that value of "maxsplit".
>
> In Python 3.6+ (which is the only place a change like this is likely
> to happen) you're probably better using pathlib. There, you can use
> path.parts, which returns a tuple of the path elements, so you can do
> things like
>
>     >>> Path('C:\\what\\ever\\you\\like.txt').parts[-3:]
>     ('ever', 'you', 'like.txt')
>
> That's usable now in Python 3.4+, and a backport is available at
> https://pypi.python.org/pypi/pathlib/
>
> Paul



-- 
--Guido van Rossum (python.org/~guido)

From rymg19 at gmail.com  Mon Sep 28 16:33:03 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 28 Sep 2015 09:33:03 -0500
Subject: [Python-ideas] Using `or?` as the null coalescing operator
In-Reply-To: <CAMiohog5gNdkjhF_54reSyA8mDjqiqqKGd9SreSpzJGRvyKRYg@mail.gmail.com>
References: <6C2E5579-42A0-423F-AB8C-01B49FA59D67@gmail.com>
 <CADiSq7eR=G1m=KH2CGa7LgVx4jCcYxfnRPC6Ta=f0crFFEmDAQ@mail.gmail.com>
 <CAMiohog5gNdkjhF_54reSyA8mDjqiqqKGd9SreSpzJGRvyKRYg@mail.gmail.com>
Message-ID: <078DE967-1A38-4886-9DDB-567E6021F19F@gmail.com>



On September 28, 2015 5:50:15 AM CDT, Koos Zevenhoven <k7hoven at gmail.com> wrote:
>On Mon, Sep 28, 2015 at 10:13 AM, Nick Coghlan <ncoghlan at gmail.com>
>wrote:
>> On 25 September 2015 at 09:07, Alessio Bogon <youtux at gmail.com>
>wrote:
>>> I really like PEP 0505. The only thing that does not convince me is
>the `??` operator. I would like to know what you think of an
>alternative like `or?`:
>>>
>>> a_list = some_list or? []
>>> a_dict = some_dict or? {}
>>>
>
>And have the following syntax options been considered?
>
>a_list = some_list else []
>

This one's ambiguous. How would you parse:

x if a else b else c

As:

x if (a else b) else c

Or:

x if a else (b else c)

>a_list = some_list or [] if None
>
>-- Koos

-- 
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From jdhardy at gmail.com  Mon Sep 28 18:02:54 2015
From: jdhardy at gmail.com (Jeff Hardy)
Date: Mon, 28 Sep 2015 09:02:54 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
Message-ID: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>

TL;DR:
+1 for the idea
-1 on the propagating member-access or index operators
+1 on spelling it "or?"

C# has had null-coalescing since about 2005, and it's one feature I miss in
every other language that I use. I view null/None as a necessary evil, so
getting rid of them as soon as possible is a good thing in my book. Nearly
every bit of Python I've ever written would have benefitted from it, if
just to get rid of the "x if x is not None else []" mess.

That said, I think the other (propagating) operators are a mistake, and I
think they were a mistake in C# as well. I'm not sure I've ever had a situation
where I wished they existed, in any language. Better to get rid of the
Nones as soon as possible than bring them along. It's worth reading the C#
design team's notes and subsequent discussion on the associativity of "?."
[1] since it goes around and around with no really good answer and no
particularly intuitive behaviour.

Rather than worry about that, I'd prefer to see just the basic
None-coalescing added. I like Alessio's suggestion of "or?" (which seems
like it should be read in a calm but threatening tone, a la Liam Neeson).
It just seems more Pythonic; ?? is fine in C# but seems punctuation-heavy
for Python. It does mean the ?= and ?. and ?[] are probably out, and I'm OK
with that.

- Jeff

[1] https://roslyn.codeplex.com/discussions/543895

From srkunze at mail.de  Mon Sep 28 18:11:45 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 28 Sep 2015 18:11:45 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
Message-ID: <560966C1.1040704@mail.de>

On 28.09.2015 18:02, Jeff Hardy wrote:
> TL;DR:
> +1 for the idea
> -1 on the propagating member-access or index operators
> +1 on spelling it "or?"
>
> C# has had null-coalescing since about 2005, and it's one feature I 
> miss in every other language that I use. I view null/None as a 
> necessary evil, so getting rid of them as soon as possible is a good 
> thing in my book. Nearly every bit of Python I've ever written would 
> have benefitted from it, if just to get rid of the "x if x is not None 
> else []" mess.
>
> That said, I think the other (propagating) operators are a mistake, 
> and I think they were a mistake in C# as well. I'm not sure I've ever had a 
> situation where I wished they existed, in any language. Better to get 
> rid of the Nones as soon as possible than bring them along. It's worth 
> reading the C# design team's notes and subsequent discussion on the 
> associativity of "?." [1] since it goes around and around with no 
> really good answer and no particularly intuitive behaviour.
>
> Rather than worry about that, I'd prefer to see just the basic 
> None-coalescing added. I like Alessio's suggestion of "or?" (which 
> seems like it should be read in a calm but threatening tone, a la Liam 
> Neeson). It just seems more Pythonic; ?? is fine in C# but seems 
> punctuation-heavy for Python. It does mean the ?= and ?. and ?[] are 
> probably out, and I'm OK with that.
>
> - Jeff
>
> [1] https://roslyn.codeplex.com/discussions/543895
>

That sums it all up for me as well, though I would rather use "else" 
instead of "or?" (see punctuation-heavy).

Best,
Sven

From steve at pearwood.info  Mon Sep 28 18:37:33 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 29 Sep 2015 02:37:33 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560966C1.1040704@mail.de>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de>
Message-ID: <20150928163733.GN23642@ando.pearwood.info>

On Mon, Sep 28, 2015 at 06:11:45PM +0200, Sven R. Kunze wrote:

> That sums it all up for me as well, though I would rather use "else" 
> instead of "or?" (see punctuation-heavy).

`else` is ambiguous. Consider:

    result = spam if eggs else cheese else aardvark

could be interpreted three ways:

    result = (spam if eggs else cheese) else aardvark
    result = spam if (eggs else cheese) else aardvark
    result = spam if eggs else (cheese else aardvark)

Whichever precedence you pick, some people will get it wrong and it will 
silently do the wrong thing and lead to hard-to-diagnose bugs. Using 
"else" for this will be a bug-magnet.
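
For comparison, the existing conditional expression already nests to the right when chained, which can be checked with the ast module. This is only a sketch of current behavior; it says nothing about how a new binary `else` would be parsed.

```python
import ast

# Two chained conditional expressions, valid in current Python.
source = 'spam if eggs else cheese if ham else aardvark'
outer = ast.parse(source, mode='eval').body

# The second conditional ends up inside the first one's else branch.
outer_names = (outer.body.id, outer.test.id)
inner = outer.orelse
inner_names = (inner.body.id, inner.test.id, inner.orelse.id)

print(type(inner).__name__, outer_names, inner_names)
```

A binary `else` would have to coexist with this grouping, which is where the three readings above come from.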



-- 
Steve

From donald at stufft.io  Mon Sep 28 18:40:51 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 28 Sep 2015 12:40:51 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <20150928163733.GN23642@ando.pearwood.info>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
Message-ID: <etPan.56096d93.7014668f.1859b@Draupnir.home>

could use two words!

result = spam or else eggs

On September 28, 2015 at 12:38:21 PM, Steven D'Aprano (steve at pearwood.info) wrote:
> On Mon, Sep 28, 2015 at 06:11:45PM +0200, Sven R. Kunze wrote:
> 
> > That sums it all up for me as well, though I would rather use "else"
> > instead of "or?" (see punctuation-heavy).
> 
> `else` is ambiguous. Consider:
> 
> result = spam if eggs else cheese else aardvark
> 
> could be interpreted three ways:
> 
> result = (spam if eggs else cheese) else aardvark
> result = spam if (eggs else cheese) else aardvark
> result = spam if eggs else (cheese else aardvark)
> 
> Whichever precedence you pick, some people will get it wrong and it will
> silently do the wrong thing and lead to hard-to-diagnose bugs. Using
> "else" for this will be a bug-magnet.
> 
> 
> 
> --
> Steve

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From abarnert at yahoo.com  Mon Sep 28 18:41:36 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 28 Sep 2015 09:41:36 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
Message-ID: <1E0CDC0C-CF95-4780-8DFB-BC4D6B5CC192@yahoo.com>

On Sep 28, 2015, at 09:02, Jeff Hardy <jdhardy at gmail.com> wrote:
> 
> It's worth reading the C# design team's notes and subsequent discussion on the associativity of "?." [1] since it goes around and around with no really good answer and no particularly intuitive behaviour.

Many of the problems raised there are irrelevant to Python: the fact that C# has "value types" that aren't referenced and can't be null, the fact that its ASTs are often processed by type-driven programming, the fact that it's not considered normal to raise and catch an exception in cases that aren't truly exceptional, the fact that there's (human-reader) ambiguity with ?:, the fact that . and [] are actually operators rather than a different kind of syntax, etc. There may be parallel problems to some of those issues in Python, but just assuming there will be because there are in C# doesn't establish that.

The one argument that does carry over is that the "right-associative" version is harder to implement. That's worth discussing in Python terms, but it would be much more useful for someone to write an implementation to prove that the  grammar, AST, and compiler code actually aren't that complicated, than to argue that they wouldn't necessarily be so.

(The fact that it's harder to see where the exception comes from in something like spam(a?.b.c) is also the same in both languages, but that's already been discussed here, and I don't think that's a real problem in the first place--after all, a.b.c already raises an AttributeError that makes it just as hard to see whether it comes from None.b or None.c.)

From cmeyer1969 at gmail.com  Mon Sep 28 18:46:56 2015
From: cmeyer1969 at gmail.com (Chris Meyer)
Date: Mon, 28 Sep 2015 09:46:56 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <etPan.56096d93.7014668f.1859b@Draupnir.home>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <etPan.56096d93.7014668f.1859b@Draupnir.home>
Message-ID: <08F886E8-EF4F-40AB-AAA6-35432C42DB5C@gmail.com>

> On Sep 28, 2015, at 9:40 AM, Donald Stufft <donald at stufft.io> wrote:
> 
> could use two words!
> 
> result = spam or else eggs

Could use otherwise:

result = spam otherwise eggs

> On September 28, 2015 at 12:38:21 PM, Steven D'Aprano (steve at pearwood.info) wrote:
>> On Mon, Sep 28, 2015 at 06:11:45PM +0200, Sven R. Kunze wrote:
>> 
>>> That sums it all up for me as well, though I would rather use "else"
>>> instead of "or?" (see punctuation-heavy).
>> 
>> `else` is ambiguous. Consider:
>> 
>> result = spam if eggs else cheese else aardvark
>> 
>> could be interpreted three ways:
>> 
>> result = (spam if eggs else cheese) else aardvark
>> result = spam if (eggs else cheese) else aardvark
>> result = spam if eggs else (cheese else aardvark)
>> 
>> Whichever precedence you pick, some people will get it wrong and it will
>> silently do the wrong thing and lead to hard-to-diagnose bugs. Using
>> "else" for this will be a bug-magnet.
>> 
>> 
>> 
>> --
>> Steve
> 
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> 
> 


From abarnert at yahoo.com  Mon Sep 28 18:46:15 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 28 Sep 2015 09:46:15 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <etPan.56096d93.7014668f.1859b@Draupnir.home>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <etPan.56096d93.7014668f.1859b@Draupnir.home>
Message-ID: <ABA25A9A-D0DD-41E6-AF16-7B1C2584BE0F@yahoo.com>

On Sep 28, 2015, at 09:40, Donald Stufft <donald at stufft.io> wrote:
> 
> could use two words!
> 
> result = spam or else eggs

Unless you change the tokenizer to understand "or else" as a special case, or add another level of lookahead to the parser, how do you handle "spam if eggs or else cheese else aardvark" and vice-versa? It does make the meaning less confusing to a human, but it makes it harder for a human to understand how the compiler parses that meaning.
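
As a rough illustration of the tokenizer point, here is a minimal sketch using the standard tokenize module. It only shows that today's tokenizer hands "or" and "else" to the parser as two separate NAME tokens; how a real "or else" operator would be implemented is not something this sketch claims to show.

```python
import io
import tokenize

# Tokenizing does not require the line to be valid Python syntax,
# so the proposed spelling can be fed to the tokenizer as-is.
line = "result = spam or else eggs\n"
tokens = list(tokenize.generate_tokens(io.StringIO(line).readline))

# Keywords are emitted as ordinary NAME tokens, one per word.
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]
print(names)
```

Since "or" and "else" arrive separately, the parser would have to look past "or" to decide which operator it is seeing, or the tokenizer would need a special case that fuses the two words.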

>> On September 28, 2015 at 12:38:21 PM, Steven D'Aprano (steve at pearwood.info) wrote:
>>> On Mon, Sep 28, 2015 at 06:11:45PM +0200, Sven R. Kunze wrote:
>>> 
>>> That sums it all up for me as well, though I would rather use "else"
>>> instead of "or?" (see punctuation-heavy).
>> 
>> `else` is ambiguous. Consider:
>> 
>> result = spam if eggs else cheese else aardvark
>> 
>> could be interpreted three ways:
>> 
>> result = (spam if eggs else cheese) else aardvark
>> result = spam if (eggs else cheese) else aardvark
>> result = spam if eggs else (cheese else aardvark)
>> 
>> Whichever precedence you pick, some people will get it wrong and it will
>> silently do the wrong thing and lead to hard-to-diagnose bugs. Using
>> "else" for this will be a bug-magnet.
>> 
>> 
>> 
>> --
>> Steve
> 
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> 
> 

From srkunze at mail.de  Mon Sep 28 18:47:05 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 28 Sep 2015 18:47:05 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <20150928163733.GN23642@ando.pearwood.info>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
Message-ID: <56096F09.40804@mail.de>

On 28.09.2015 18:37, Steven D'Aprano wrote:
> On Mon, Sep 28, 2015 at 06:11:45PM +0200, Sven R. Kunze wrote:
>
>> That sums it all up for me as well, though I would rather use "else"
>> instead of "or?" (see punctuation-heavy).
> `else` is ambiguous. Consider:
>
>      result = spam if eggs else cheese else aardvark
>
> could be interpreted three ways:
>
>      result = (spam if eggs else cheese) else aardvark
>      result = spam if (eggs else cheese) else aardvark
>      result = spam if eggs else (cheese else aardvark)
>
> Whichever precedence you pick, some people will get it wrong and it will
> silently do the wrong thing and lead to hard-to-diagnose bugs. Using
> "else" for this will be a bug-magnet.
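
The three readings quoted above can be compared in current Python with a small sketch. The coalesce() helper is hypothetical: it stands in for a binary "else" that yields its left operand unless that operand is None. Unlike a real operator, it evaluates both arguments eagerly, which does not matter for these constant inputs.

```python
def coalesce(left, right):
    """Hypothetical stand-in for a binary `else`: left unless it is None."""
    return left if left is not None else right


def readings(spam, eggs, cheese, aardvark):
    # (spam if eggs else cheese) else aardvark
    first = coalesce(spam if eggs else cheese, aardvark)
    # spam if (eggs else cheese) else aardvark
    second = spam if coalesce(eggs, cheese) else aardvark
    # spam if eggs else (cheese else aardvark)
    third = spam if eggs else coalesce(cheese, aardvark)
    return first, second, third


# spam is None: the first reading differs from the other two.
case_a = readings(None, True, "cheese", "aardvark")
# eggs is None: the second reading differs from the other two.
case_b = readings("spam", None, "cheese", "aardvark")
print(case_a)
print(case_b)
```

No single set of inputs separates all three readings at once, but each pair of readings disagrees for some input, and none of them raises an error when it does.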

I wouldn't make a mountain out of a molehill. Other existing operators 
have the same issue.


Best,
Sven

From srkunze at mail.de  Mon Sep 28 19:00:46 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 28 Sep 2015 19:00:46 +0200
Subject: [Python-ideas] Binary f-strings
In-Reply-To: <CAPTjJmrSxqKeSACoDHSP59R9Qz=L8SyV2w9Kv1E77Z4yLsb7OA@mail.gmail.com>
References: <56089692.1080303@trueblade.com>
 <CAPJVwBkTthOOJrzJZE08nE5FUCsNdUMUSDdrKuMW=UEVQ54xLA@mail.gmail.com>
 <CAPTjJmrSxqKeSACoDHSP59R9Qz=L8SyV2w9Kv1E77Z4yLsb7OA@mail.gmail.com>
Message-ID: <5609723E.7000606@mail.de>

On 28.09.2015 05:03, Chris Angelico wrote:
>
>>>> восток = 1961
>>>> apollo = 1969
>>>> print(f"It took {apollo-восток} years to get from orbit to the moon.")
> It took 8 years to get from orbit to the moon.
>>>> print(b"It took {apollo-восток} years to get from orbit to the moon.")
>    File "<stdin>", line 1
> SyntaxError: bytes can only contain ASCII literal characters.
>
> If that were a binary f-string, those Cyrillic characters should still
> be legal (as they define an identifier, rather than ending up in the
> code). Would it confuse (a) humans, or (b) tools, to have these "texty
> bits" inside a byte string?

I don't think so. "{...}" indicates the injection of whatever "..." 
stands for, thus is not part of the resulting string. So, no issue here 
for me.
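
For reference, the two behaviours being contrasted can be reproduced with a short sketch (Python 3.6 or later for the f-string). The Cyrillic identifier used here is an assumption made for illustration; any non-ASCII identifier would do.

```python
# An f-string may refer to a non-ASCII identifier inside its braces.
f_source = (
    "восток = 1961\n"
    "apollo = 1969\n"
    'message = f"It took {apollo-восток} years to get from orbit to the moon."\n'
)
namespace = {}
exec(compile(f_source, "<f-string example>", "exec"), namespace)
print(namespace["message"])

# A bytes literal containing the same characters is rejected at compile time,
# because today the braces have no special meaning inside b"...".
bytes_error = None
try:
    compile('b"It took {apollo-восток} years"', "<bytes example>", "eval")
except SyntaxError as exc:
    bytes_error = exc
print(type(bytes_error).__name__)
```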

(The only thing that would confuse me, is that "восток" is an allowed 
identifier in the first place. But that seems to be a different matter.)

> In any case, bf strings can be added later, but once they're added,
> their semantics would be locked in. I'd be inclined to leave them out
> for 3.6 and see what people say. A bit of real-world usage of
> f-strings might show a clear front-runner in terms of expectations
> (UTF-8, ASCII, or something else).
>

I tend to agree here.

Best,
Sven

From abarnert at yahoo.com  Mon Sep 28 19:24:43 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 28 Sep 2015 10:24:43 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <56096F09.40804@mail.de>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <56096F09.40804@mail.de>
Message-ID: <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>

On Sep 28, 2015, at 09:47, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 28.09.2015 18:37, Steven D'Aprano wrote:
>>> On Mon, Sep 28, 2015 at 06:11:45PM +0200, Sven R. Kunze wrote:
>>> 
>>> That sums it all up for me as well, though I would rather use "else"
>>> instead of "or?" (see punctuation-heavy).
>> `else` is ambiguous. Consider:
>> 
>>     result = spam if eggs else cheese else aardvark
>> 
>> could be interpreted three ways:
>> 
>>     result = (spam if eggs else cheese) else aardvark
>>     result = spam if (eggs else cheese) else aardvark
>>     result = spam if eggs else (cheese else aardvark)
>> 
>> Whichever precedence you pick, some people will get it wrong and it will
>> silently do the wrong thing and lead to hard-to-diagnose bugs. Using
>> "else" for this will be a bug-magnet.
> 
> I wouldn't make a mountain out of a molehill. Other existing operators have the same issue.

Which other keywords or symbols may be either a binary operator or part of a ternary operator depending on context?
> 
> 
> Best,
> Sven

From guido at python.org  Mon Sep 28 19:29:02 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 10:29:02 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
Message-ID: <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>

On Mon, Sep 28, 2015 at 9:02 AM, Jeff Hardy <jdhardy at gmail.com> wrote:

> -1 on the propagating member-access or index operators
>

Can someone explain with examples what this refers to?

-- 
--Guido van Rossum (python.org/~guido)

From carl at oddbird.net  Mon Sep 28 19:38:03 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 28 Sep 2015 11:38:03 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
Message-ID: <56097AFB.1040906@oddbird.net>

On 09/28/2015 11:29 AM, Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 9:02 AM, Jeff Hardy <jdhardy at gmail.com
> <mailto:jdhardy at gmail.com>> wrote:
> 
>     -1 on the propagating member-access or index operators
> 
> 
> Can someone explain with examples what this refers to?

"Member-access or index operators" refers to the proposed ?. or ?[
operators.

"Propagating" refers to the proposed behavior where use of ?. or ?[
"propagates" through the following chain of operations. For example:

    x = foo?.bar.spam.eggs

Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
(propagating None rather than raising AttributeError), simply because a
`?.` had occurred earlier in the chain. So the above behaves differently
from:

    temp = foo?.bar
    x = temp.spam.eggs

Which raises questions about whether the propagation escapes
parentheses, too:

    x = (foo?.bar).spam.eggs

Carl


From guido at python.org  Mon Sep 28 20:38:38 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 11:38:38 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <56097AFB.1040906@oddbird.net>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
Message-ID: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>

On Mon, Sep 28, 2015 at 10:38 AM, Carl Meyer <carl at oddbird.net> wrote:

> On 09/28/2015 11:29 AM, Guido van Rossum wrote:
> > On Mon, Sep 28, 2015 at 9:02 AM, Jeff Hardy <jdhardy at gmail.com
> > <mailto:jdhardy at gmail.com>> wrote:
> >
> >     -1 on the propagating member-access or index operators
> >
> >
> > Can someone explain with examples what this refers to?
>
> "Member-access or index operators" refers to the proposed ?. or ?[
> operators.
>

Got that. :-)


> "Propagating" refers to the proposed behavior where use of ?. or ?[
> "propagates" through the following chain of operations. For example:
>
>     x = foo?.bar.spam.eggs
>
> Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
> (propagating None rather than raising AttributeError), simply because a
> `?.` had occurred earlier in the chain. So the above behaves differently
> from:
>
>     temp = foo?.bar
>     x = temp.spam.eggs
>
> Which raises questions about whether the propagation escapes
> parentheses, too:
>
>     x = (foo?.bar).spam.eggs
>

Oh, I see. That's evil.

The correct behavior here is that "foo?.bar.spam.eggs" should mean the same
as

    (None if foo is None else foo.bar.spam.eggs)

(Stop until you understand that is *not* the same as either of the
alternatives you describe.)

I can see the confusion that led to the idea of "propagation" -- it
probably comes from an attempt to define "foo?.bar" without reference to
the context (in this case the relevant context is that it's followed by
".spam.eggs").

It should not escape parentheses.
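
A minimal sketch in current Python of the three readings of "foo?.bar.spam.eggs" discussed above. The function names and the outcome() helper are hypothetical, and "?." itself is spelled out by hand because it is not valid syntax.

```python
from types import SimpleNamespace


def short_circuit(foo):
    # The reading given above: None if foo is None else foo.bar.spam.eggs
    return None if foo is None else foo.bar.spam.eggs


def propagating(foo):
    # The "propagating" reading: every later access also behaves like ?.
    value = foo
    for name in ("bar", "spam", "eggs"):
        if value is None:
            return None
        value = getattr(value, name)
    return value


def split_with_temp(foo):
    temp = None if foo is None else foo.bar  # temp = foo?.bar
    return temp.spam.eggs                    # x = temp.spam.eggs


def outcome(func, foo):
    """Return the result, or the string 'AttributeError' if that is raised."""
    try:
        return func(foo)
    except AttributeError:
        return "AttributeError"


full = SimpleNamespace(bar=SimpleNamespace(spam=SimpleNamespace(eggs=42)))
missing_bar = SimpleNamespace(bar=None)

for label, foo in (("full", full), ("foo is None", None), ("bar is None", missing_bar)):
    print(label, [outcome(f, foo) for f in (short_circuit, propagating, split_with_temp)])
```

With foo set to None only the split version raises; with foo.bar set to None only the propagating version stays silent. That is why the short-circuit reading is distinct from both alternatives.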

-- 
--Guido van Rossum (python.org/~guido)

From emile at fenx.com  Mon Sep 28 21:38:04 2015
From: emile at fenx.com (Emile van Sebille)
Date: Mon, 28 Sep 2015 12:38:04 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <56096F09.40804@mail.de> <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
Message-ID: <muc4v2$q29$1@ger.gmane.org>

On 9/28/2015 10:24 AM, Andrew Barnert via Python-ideas wrote:
> On Sep 28, 2015, at 09:47, Sven R. Kunze <srkunze at mail.de> wrote:
<snip>

>> I wouldn't make a mountain out of a molehill. Other existing operators have the same issue.
>
> Which other keywords or symbols may be either a binary operator or part of a ternary operator depending on context?

These come to mind:

a = b = c
a < b < c
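
A quick sketch of how these two forms behave today; the values are arbitrary and chosen so that the chained and the parenthesized comparison disagree.

```python
a, b, c = -3, -2, -1

chained = a < b < c             # evaluated as (a < b) and (b < c)
expanded = (a < b) and (b < c)
grouped = (a < b) < c           # compares the bool True with -1

x = y = []                      # one statement binding both names to one list
same_object = x is y

print(chained, expanded, grouped, same_object)
```

Here too, adding parentheses changes the result rather than merely documenting the grouping.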

Emile



From carl at oddbird.net  Mon Sep 28 21:43:24 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 28 Sep 2015 13:43:24 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
Message-ID: <5609985C.40603@oddbird.net>

On 09/28/2015 12:38 PM, Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 10:38 AM, Carl Meyer <carl at oddbird.net

>     "Propagating" refers to the proposed behavior where use of ?. or ?[
>     "propagates" through the following chain of operations. For example:
> 
>         x = foo?.bar.spam.eggs
> 
>     Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
>     (propagating None rather than raising AttributeError), simply because a
>     `?.` had occurred earlier in the chain. So the above behaves differently
>     from:
> 
>         temp = foo?.bar
>         x = temp.spam.eggs
> 
>     Which raises questions about whether the propagation escapes
>     parentheses, too:
> 
>         x = (foo?.bar).spam.eggs
> 
> Oh, I see. That's evil.
> 
> The correct behavior here is that "foo?.bar.spam.eggs" should mean the
> same as
> 
>     (None if foo is None else foo.bar.spam.eggs)
> 
> (Stop until you understand that is *not* the same as either of the
> alternatives you describe.)

I see that. The distinction is "short-circuit" vs "propagate."
Short-circuit is definitely more comprehensible and palatable.

[snip]
> It should not escape parentheses.

Good. I assume that the short-circuiting would follow the precedence
order; that is, nothing with looser precedence than member and index
access would be short-circuited. So, for example,

    foo?.bar['baz'].spam

would short-circuit the indexing and the final member access, translating to

    foo.bar['baz'].spam if foo is not None else None

but

    foo?.bar or 'baz'

would mean

    (foo.bar if foo is not None else None) or 'baz'

and would never evaluate to None. Similarly for any operator that binds
less tightly than member/index access (which is basically all Python
operators).
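
The two translations above can be exercised directly in current Python. This is only a sketch of the spelled-out forms; the function names and sample objects are hypothetical.

```python
from types import SimpleNamespace


def bar_or_baz(foo):
    # Spelled-out form of: foo?.bar or 'baz'
    return (foo.bar if foo is not None else None) or 'baz'


def chained_lookup(foo):
    # Spelled-out form of: foo?.bar['baz'].spam
    return foo.bar['baz'].spam if foo is not None else None


print(bar_or_baz(None))                         # foo is None
print(bar_or_baz(SimpleNamespace(bar=None)))    # foo.bar is None
print(bar_or_baz(SimpleNamespace(bar='x')))     # foo.bar is truthy
print(chained_lookup(None))
print(chained_lookup(SimpleNamespace(bar={'baz': SimpleNamespace(spam=1)})))
```

In the first function the "or" sits outside the conditional, so a None coming from either foo or foo.bar is replaced by 'baz'.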

AFAICS, under your proposed semantics what I said above is still true, that

    x = foo?.bar.baz

would necessarily have a different meaning than

    temp = foo?.bar
    x = temp.baz

Or put differently, that whereas these two are trivially equivalent (the
definition of left-to-right binding within a precedence class):

    foo.bar.baz
    (foo.bar).baz

these two are not equivalent:

   foo?.bar.baz
   (foo?.bar).baz
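
The first half of that comparison can be checked with the standard ast module: for plain attribute access the parentheses leave no trace in the parse tree. This sketch covers only the existing syntax, since "?." does not parse today.

```python
import ast

# Parse both spellings as expressions and compare the resulting trees.
plain = ast.dump(ast.parse("foo.bar.baz", mode="eval"))
grouped = ast.dump(ast.parse("(foo.bar).baz", mode="eval"))

same_tree = plain == grouped
print(same_tree)
```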

I'm having trouble coming up with a parallel example where the existing
short-circuit operators break "extractibility" of a sub-expression like
that.

I guess this is because the proposed short-circuiting still "breaks out
of the precedence order" in a way that the existing short-circuiting
operators don't. Both member access and indexing are within the same
left-to-right binding precedence class, but the new operators would have
a short-circuit effect that swallows operations beyond where normal
left-to-right binding would suggest their effect should reach.

Are there existing examples of behavior like this in Python that I'm
missing?

Carl


From srkunze at mail.de  Mon Sep 28 21:47:10 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 28 Sep 2015 21:47:10 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <56096F09.40804@mail.de> <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
Message-ID: <5609993E.9010103@mail.de>

On 28.09.2015 19:24, Andrew Barnert wrote:
> On Sep 28, 2015, at 09:47, Sven R. Kunze <srkunze at mail.de> wrote:
>>
>>>      result = (spam if eggs else cheese) else aardvark
>>>      result = spam if (eggs else cheese) else aardvark
>>>      result = spam if eggs else (cheese else aardvark)
>>>
>>> Whichever precedence you pick, some people will get it wrong and it will
>>> silently do the wrong thing and lead to hard-to-diagnose bugs. Using
>>> "else" for this will be a bug-magnet.
>> I wouldn't make a mountain out of a molehill. Other existing operators have the same issue.
> Which other keywords or symbols may be either a binary operator or part of a ternary operator depending on context?

It has nothing to do with either of those.

I've seen young students struggle with the operator precedence of AND and 
OR, and I've seen experienced coworkers add superfluous pairs of 
parentheses just to make sure, or because they still don't know better.

Best,
Sven

From carl at oddbird.net  Mon Sep 28 21:53:05 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 28 Sep 2015 13:53:05 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <5609985C.40603@oddbird.net>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
Message-ID: <56099AA1.1010609@oddbird.net>

On 09/28/2015 01:43 PM, Carl Meyer wrote:
[snip]
> I assume that the short-circuiting would follow the precedence
> order; that is, nothing with looser precedence than member and index
> access would be short-circuited. So, for example,
> 
>     foo?.bar['baz'].spam
> 
> would short-circuit the indexing and the final member access, translating to
> 
>     foo.bar['baz'].spam if foo is not None else None
> 
> but
> 
>     foo?.bar or 'baz'
> 
> would mean
> 
>     (foo.bar if foo is not None else None) or 'baz'
> 
> and would never evaluate to None. Similarly for any operator that binds
> less tightly than member/index access (which is basically all Python
> operators).

For a possibly less-intuitive example of this principle (arbitrarily
picking the operator that binds next-most-tightly), what should

    foo?.bar**3

mean?

Carl


From guido at python.org  Mon Sep 28 21:53:43 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 12:53:43 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <5609985C.40603@oddbird.net>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
Message-ID: <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>

On Mon, Sep 28, 2015 at 12:43 PM, Carl Meyer <carl at oddbird.net> wrote:

> On 09/28/2015 12:38 PM, Guido van Rossum wrote:
> > On Mon, Sep 28, 2015 at 10:38 AM, Carl Meyer <carl at oddbird.net
>
> >     "Propagating" refers to the proposed behavior where use of ?. or ?[
> >     "propagates" through the following chain of operations. For example:
> >
> >         x = foo?.bar.spam.eggs
> >
> >     Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
> >     (propagating None rather than raising AttributeError), simply because a
> >     `?.` had occurred earlier in the chain. So the above behaves differently
> >     from:
> >
> >         temp = foo?.bar
> >         x = temp.spam.eggs
> >
> >     Which raises questions about whether the propagation escapes
> >     parentheses, too:
> >
> >         x = (foo?.bar).spam.eggs
> >
> > Oh, I see. That's evil.
> >
> > The correct behavior here is that "foo?.bar.spam.eggs" should mean the
> > same as
> >
> >     (None if foo is None else foo.bar.spam.eggs)
> >
> > (Stop until you understand that is *not* the same as either of the
> > alternatives you describe.)
>
> I see that. The distinction is "short-circuit" vs "propagate."
> Short-circuit is definitely more comprehensible and palatable.
>

Right.


> [snip]
> > It should not escape parentheses.
>
> Good. I assume that the short-circuiting would follow the precedence
> order; that is, nothing with looser precedence than member and index
> access would be short-circuited. So, for example,
>
>     foo?.bar['baz'].spam
>
> would short-circuit the indexing and the final member access, translating to
>
>     foo.bar['baz'].spam if foo is not None else None
>
> but
>
>     foo?.bar or 'baz'
>
> would mean
>
>     (foo.bar if foo is not None else None) or 'baz'
>
> and would never evaluate to None. Similarly for any operator that binds
> less tightly than member/index access (which is basically all Python
> operators).
>

Correct. The scope of ? would be all following .foo, .[stuff], or .(args)
-- but stopping at any other operator (including parens).


> AFAICS, under your proposed semantics what I said above is still true, that
>
>     x = foo?.bar.baz
>
> would necessarily have a different meaning than
>
>     temp = foo?.bar
>     x = temp.baz
>
> Or put differently, that whereas these two are trivially equivalent (the
> definition of left-to-right binding within a precedence class):
>
>     foo.bar.baz
>     (foo.bar).baz
>
> these two are not equivalent:
>
>    foo?.bar.baz
>    (foo?.bar).baz
>

Right.


> I'm having trouble coming up with a parallel example where the existing
> short-circuit operators break "extractibility" of a sub-expression like
> that.
>

Why is that an interesting property?


> I guess this is because the proposed short-circuiting still "breaks out
> of the precedence order" in a way that the existing short-circuiting
> operators don't. Both member access and indexing are within the same
> left-to-right binding precedence class, but the new operators would have
> a short-circuit effect that swallows operations beyond where normal
> left-to-right binding would suggest their effect should reach.
>
> Are there existing examples of behavior like this in Python that I'm
> missing?


I don't know, but I think you shouldn't worry about this.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Sep 28 21:57:26 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 12:57:26 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <56099AA1.1010609@oddbird.net>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net> <56099AA1.1010609@oddbird.net>
Message-ID: <CAP7+vJJVH6tpGyOPoDZvwTbCKmqVzbzGFG9vBozwt8NVQLyt_A@mail.gmail.com>

On Mon, Sep 28, 2015 at 12:53 PM, Carl Meyer <carl at oddbird.net> wrote:

> On 09/28/2015 01:43 PM, Carl Meyer wrote:
> [snip]
> > I assume that the short-circuiting would follow the precedence
> > order; that is, nothing with looser precedence than member and index
> > access would be short-circuited. So, for example,
> >
> >     foo?.bar['baz'].spam
> >
> > would short-circuit the indexing and the final member access, translating to
> >
> >     foo.bar['baz'].spam if foo is not None else None
> >
> > but
> >
> >     foo?.bar or 'baz'
> >
> > would mean
> >
> >     (foo.bar if foo is not None else None) or 'baz'
> >
> > and would never evaluate to None. Similarly for any operator that binds
> > less tightly than member/index access (which is basically all Python
> > operators).
>
> For a possibly less-intuitive example of this principle (arbitrarily
> picking the operator that binds next-most-tightly), what should
>
>     foo?.bar**3
>
> mean?
>

It's nonsense -- it means (foo?.bar)**3 but since foo?.bar can return None
and None**3 is an error you shouldn't do that. But don't try to then come
up with syntax that rejects foo?.bar**something statically, because
something might be an object that implements __rpow__.
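
A small sketch of the __rpow__ point, using a hypothetical Exponent class: None raised to an int fails, but None raised to an object that defines __rpow__ is handled by that object.

```python
class Exponent:
    """Hypothetical right-hand operand that implements __rpow__."""

    def __rpow__(self, base):
        # Called for `base ** self` when base cannot handle the operation.
        return ("rpow called with", base)


result = None ** Exponent()

plain_error = None
try:
    None ** 3
except TypeError as exc:
    plain_error = exc

print(result)
print(type(plain_error).__name__)
```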

And I still don't see why this "principle" would be important.

-- 
--Guido van Rossum (python.org/~guido)

From carl at oddbird.net  Mon Sep 28 22:00:47 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 28 Sep 2015 14:00:47 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
Message-ID: <56099C6F.90700@oddbird.net>

On 09/28/2015 01:53 PM, Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 12:43 PM, Carl Meyer <carl at oddbird.net
>     Or put differently, that whereas these two are trivially equivalent (the
>     definition of left-to-right binding within a precedence class):
> 
>         foo.bar.baz
>         (foo.bar).baz
> 
>     these two are not equivalent:
> 
>        foo?.bar.baz
>        (foo?.bar).baz
> 
> 
> Right.
>  
> 
>     I'm having trouble coming up with a parallel example where the existing
>     short-circuit operators break "extractibility" of a sub-expression like
>     that.
> 
> 
> Why is that an interesting property?

Because breaking up an overly-complex expression into smaller
expressions by means of extracting sub-expressions into temporary
variables is a common programming task (in my experience anyway --
especially when trying to decipher some long-gone programmer's
overly-complex code), and it's usually one that can be handled pretty
mechanically according to precedence rules, without having to consider
that some operators might have action-at-a-distance beyond their precedence.

>     I guess this is because the proposed short-circuiting still "breaks out
>     of the precedence order" in a way that the existing short-circuiting
>     operators don't. Both member access and indexing are within the same
>     left-to-right binding precedence class, but the new operators would have
>     a short-circuit effect that swallows operations beyond where normal
>     left-to-right binding would suggest their effect should reach.
> 
>     Are there existing examples of behavior like this in Python that I'm
>     missing?
> 
> 
> I don't know, but I think you shouldn't worry about this.

I think it's kind of odd, but if nobody else is worried about it, I
won't worry about it either :-)

Carl


From carl at oddbird.net  Mon Sep 28 22:05:15 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 28 Sep 2015 14:05:15 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJJVH6tpGyOPoDZvwTbCKmqVzbzGFG9vBozwt8NVQLyt_A@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net> <56099AA1.1010609@oddbird.net>
 <CAP7+vJJVH6tpGyOPoDZvwTbCKmqVzbzGFG9vBozwt8NVQLyt_A@mail.gmail.com>
Message-ID: <56099D7B.4020900@oddbird.net>

On 09/28/2015 01:57 PM, Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 12:53 PM, Carl Meyer <carl at oddbird.net
> <mailto:carl at oddbird.net>> wrote:
> 
>     On 09/28/2015 01:43 PM, Carl Meyer wrote:
>     [snip]
>     > I assume that the short-circuiting would follow the precedence
>     > order; that is, nothing with looser precedence than member and index
>     > access would be short-circuited. So, for example,
>     >
>     >     foo?.bar['baz'].spam
>     >
>     > would short-circuit the indexing and the final member access, translating to
>     >
>     >     foo.bar['baz'].spam if foo is not None else None
>     >
>     > but
>     >
>     >     foo?.bar or 'baz'
>     >
>     > would mean
>     >
>     >     (foo.bar if foo is not None else None) or 'baz'
>     >
>     > and would never evaluate to None. Similarly for any operator that binds
>     > less tightly than member/index access (which is basically all Python
>     > operators).
> 
>     For a possibly less-intuitive example of this principle (arbitrarily
>     picking the operator that binds next-most-tightly), what should
> 
>         foo?.bar**3
> 
>     mean?
> 
> 
> It's nonsense -- it means (foo?.bar)**3 but since foo?.bar can return
> None and None**3 is an error you shouldn't do that. But don't try to
> then come up with syntax that rejects foo?.bar**something statically,
> because something might be an object that implements __rpow__.
> 
> And I still don't see why this "principle" would be important.

The only "principle" in question here is "nothing with looser precedence
than member and index access would be short-circuited," and you seem to
agree with it. I was just making sure that

    foo?.bar**3

couldn't possibly mean

    (foo.bar**3 if foo is not None else None)

and I'm glad it couldn't.

Carl


From guido at python.org  Mon Sep 28 22:06:18 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 13:06:18 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <56099C6F.90700@oddbird.net>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
Message-ID: <CAP7+vJ+c_v8WsNMK9N3A5ZRf0a26xkbkBVD_YWqW-w9-0E63Xg@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:00 PM, Carl Meyer <carl at oddbird.net> wrote:

> On 09/28/2015 01:53 PM, Guido van Rossum wrote:
> > On Mon, Sep 28, 2015 at 12:43 PM, Carl Meyer <carl at oddbird.net
> >     Or put differently, that whereas these two are trivially equivalent (the
> >     definition of left-to-right binding within a precedence class):
> >
> >         foo.bar.baz
> >         (foo.bar).baz
> >
> >     these two are not equivalent:
> >
> >        foo?.bar.baz
> >        (foo?.bar).baz
> >
> >
> > Right.
> >
> >
> >     I'm having trouble coming up with a parallel example where the existing
> >     short-circuit operators break "extractibility" of a sub-expression like
> >     that.
> >
> >
> > Why is that an interesting property?
>
> Because breaking up an overly-complex expression into smaller
> expressions by means of extracting sub-expressions into temporary
> variables is a common programming task (in my experience anyway --
> especially when trying to decipher some long-gone programmer's
> overly-complex code), and it's usually one that can be handled pretty
> mechanically according to precedence rules, without having to consider
> that some operators might have action-at-a-distance beyond their
> precedence.
>

Well, if just the foo?.bar.baz part is already too complex you probably
need to reconsider your career. :-)

Seriously, when breaking things into smaller parts you *have* to understand
the shortcut properties. You can't break "foo() or bar()" into

  a = foo()
  b = bar()
  return a or b

either.
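
To make that concrete, here is a small sketch (the functions and the
calls list are made up for illustration): both forms return the same
value, but the extracted form also runs bar(), which the short-circuit
form skips.

```python
calls = []


def foo():
    calls.append("foo")
    return "truthy"


def bar():
    calls.append("bar")
    return "fallback"


def short_circuit():
    # bar() is only called when foo() returns something falsy.
    return foo() or bar()


def extracted():
    # Naive extraction into temporaries: bar() now always runs.
    a = foo()
    b = bar()
    return a or b


calls.clear()
result_short = short_circuit()
calls_short = list(calls)

calls.clear()
result_extracted = extracted()
calls_extracted = list(calls)
```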


> >     I guess this is because the proposed short-circuiting still "breaks out
> >     of the precedence order" in a way that the existing short-circuiting
> >     operators don't. Both member access and indexing are within the same
> >     left-to-right binding precedence class, but the new operators would have
> >     a short-circuit effect that swallows operations beyond where normal
> >     left-to-right binding would suggest their effect should reach.
> >
> >     Are there existing examples of behavior like this in Python that I'm
> >     missing?
> >
> >
> > I don't know, but I think you shouldn't worry about this.
>
> I think it's kind of odd, but if nobody else is worried about it, I
> won't worry about it either :-)
>

Good idea.

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Mon Sep 28 22:15:19 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 28 Sep 2015 16:15:19 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <56099C6F.90700@oddbird.net>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
Message-ID: <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>

The ? modifying additional attribute accesses beyond just the immediate one bothers me too and feels more ruby than python to me.

> On Sep 28, 2015, at 4:00 PM, Carl Meyer <carl at oddbird.net> wrote:
> 
> I think it's kind of odd, but if nobody else is worried about it, I
> won't worry about it either :-)

From guido at python.org  Mon Sep 28 22:24:50 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 13:24:50 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
Message-ID: <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:15 PM, Donald Stufft <donald at stufft.io> wrote:

> The ? modifying additional attribute accesses beyond just the immediate
> one bothers me too and feels more ruby than python to me.
>

Really? Have you thought about it?

Suppose I have an object post which may be None or something with a tag
attribute which should be a string. And suppose I want to get the
lowercased tag, if the object exists, else None.

This seems a perfect use case for writing post?.tag.lower() -- this
signifies that post may be None but if it exists, post.tag is not expected
to be None. So basically I want the equivalent of (post.tag.lower() if post
is not None else None).

But if post?.tag.lower() were interpreted strictly as (post?.tag).lower(),
then I would have to write post?.tag?.lower?(), which is an abomination.
OTOH if post?.tag.lower() automatically meant post?.tag?.lower?() then I
would silently get no error when post exists but post.tag is None (which in
this example is an error).
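
The three readings can be spelled out in current Python as a rough
sketch. The function names, the outcome() helper and the sample objects
are assumptions for illustration only; the ?. syntax itself is only a
proposal.

```python
from types import SimpleNamespace


def proposed(post):
    # post?.tag.lower(): one None check on post, the rest runs normally.
    return post.tag.lower() if post is not None else None


def strict(post):
    # (post?.tag).lower(): the check covers only the .tag access.
    tag = post.tag if post is not None else None
    return tag.lower()


def propagating(post):
    # post?.tag?.lower?(): every step is None-checked.
    tag = post.tag if post is not None else None
    lower = tag.lower if tag is not None else None
    return lower() if lower is not None else None


def outcome(reading, post):
    """Return the reading's result, or the name of the exception raised."""
    try:
        return reading(post)
    except AttributeError as exc:
        return type(exc).__name__


full = SimpleNamespace(tag="News")
no_tag = SimpleNamespace(tag=None)

for post in (full, None, no_tag):
    print([outcome(r, post) for r in (proposed, strict, propagating)])
```

With post set to None, only the strict reading fails; with post.tag set
to None, the propagating reading is the only one that hides the error.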

-- 
--Guido van Rossum (python.org/~guido)

From ben+python at benfinney.id.au  Mon Sep 28 22:27:04 2015
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 29 Sep 2015 06:27:04 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
Message-ID: <85si5y2stj.fsf@benfinney.id.au>

Carl Meyer <carl at oddbird.net> writes:

> On 09/28/2015 01:53 PM, Guido van Rossum wrote:
> > On Mon, Sep 28, 2015 at 12:43 PM, Carl Meyer:
> > > I'm having trouble coming up with a parallel example where the
> > > existing short-circuit operators break "extractibility" of a
> > > sub-expression like that.
> > 
> > Why is that an interesting property?
>
> Because breaking up an overly-complex expression into smaller
> expressions by means of extracting sub-expressions into temporary
> variables is a common programming task

+1, this is a hugely important tool in the mental toolkit. Making that
more difficult is a high cost, thank you for expressing it so explicitly.

> it's usually one that can be handled pretty mechanically according to
> precedence rules, without having to consider that some operators might
> have action-at-a-distance beyond their precedence.
>
> > I don't know, but I think you shouldn't worry about this.
>
> I think it's kind of odd, but if nobody else is worried about it, I
> won't worry about it either :-)

I share the concerns Carl is expressing; action-at-a-distance is
something I'm glad Python doesn't have much of, and I would be loath to
see that increase.

-- 
 \      “A lie can be told in a few words. Debunking that lie can take |
  `\  pages. That is why my book… is five hundred pages long.” --Chris |
_o__)                                                Rodda, 2011-05-05 |
Ben Finney


From guido at python.org  Mon Sep 28 22:32:06 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 13:32:06 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <85si5y2stj.fsf@benfinney.id.au>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <85si5y2stj.fsf@benfinney.id.au>
Message-ID: <CAP7+vJJ6GQsMSkKnd_XebZB-OMfKQofAcjF0-jgyWhu5R9Nhdw@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:27 PM, Ben Finney <ben+python at benfinney.id.au>
wrote:

> Carl Meyer <carl at oddbird.net> writes:
>
> > On 09/28/2015 01:53 PM, Guido van Rossum wrote:
> > > On Mon, Sep 28, 2015 at 12:43 PM, Carl Meyer:
> > > > I'm having trouble coming up with a parallel example where the
> > > > existing short-circuit operators break "extractibility" of a
> > > > sub-expression like that.
> > >
> > > Why is that an interesting property?
> >
> > Because breaking up an overly-complex expression into smaller
> > expressions by means of extracting sub-expressions into temporary
> > variables is a common programming task
>
> +1, this is a hugely important tool in the mental toolkit. Making that
> more difficult is a high cost, thank you for expressing it so explicitly.
>
> > it's usually one that can be handled pretty mechanically according to
> > precedence rules, without having to consider that some operators might
> > have action-at-a-distance beyond their precedence.
> >
> > > I don't know, but I think you shouldn't worry about this.
> >
> > I think it's kind of odd, but if nobody else is worried about it, I
> > won't worry about it either :-)
>
> I share the concerns Carl is expressing; action-at-a-distance is
> something I'm glad Python doesn't have much of, and I would be loath to
> see that increase.
>

Really? You would consider a syntactic feature whose scope is limited to
things to its immediate right with the most tightly binding
pseudo-operators "action-at-a-distance"? The rhetoric around this issue is
beginning to sound ridiculous.

-- 
--Guido van Rossum (python.org/~guido)

From python at mrabarnett.plus.com  Mon Sep 28 22:38:38 2015
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 28 Sep 2015 21:38:38 +0100
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJ+c_v8WsNMK9N3A5ZRf0a26xkbkBVD_YWqW-w9-0E63Xg@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <CAP7+vJ+c_v8WsNMK9N3A5ZRf0a26xkbkBVD_YWqW-w9-0E63Xg@mail.gmail.com>
Message-ID: <5609A54E.8050806@mrabarnett.plus.com>

On 2015-09-28 21:06, Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 1:00 PM, Carl Meyer <carl at oddbird.net> wrote:
>
>     On 09/28/2015 01:53 PM, Guido van Rossum wrote:
>     > On Mon, Sep 28, 2015 at 12:43 PM, Carl Meyer <carl at oddbird.net>
>     >     Or put differently, that whereas these two are trivially equivalent (the
>     >     definition of left-to-right binding within a precedence class):
>     >
>     >         foo.bar.baz
>     >         (foo.bar).baz
>     >
>     >     these two are not equivalent:
>     >
>     >        foo?.bar.baz
>     >        (foo?.bar).baz
>     >
>     >
>     > Right.
>     >
>     >
>     >     I'm having trouble coming up with a parallel example where the existing
>     >     short-circuit operators break "extractibility" of a sub-expression like
>     >     that.
>     >
>     >
>     > Why is that an interesting property?
>
>     Because breaking up an overly-complex expression into smaller
>     expressions by means of extracting sub-expressions into temporary
>     variables is a common programming task (in my experience anyway --
>     especially when trying to decipher some long-gone programmer's
>     overly-complex code), and it's usually one that can be handled pretty
>     mechanically according to precedence rules, without having to consider
>     that some operators might have action-at-a-distance beyond their
>     precedence.
>
>
> Well, if just the foo?.bar.baz part is already too complex you probably
> need to reconsider your career. :-)
>
> Seriously, when breaking things into smaller parts you *have* to
> understand the shortcut properties. You can't break "foo() or bar()" into
>
>    a = foo()
>    b = bar()
>    return a or b
>
> either.
>
Exactly.

Can you break:

     result = do_this() if test() else do_that()

into parts without changing its meaning/behaviour?

     condition = test()
     true_result = do_this()
     false_result = do_that()
     result = true_result if condition else false_result

The ? 'operators' are syntactic sugar.
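
A quick sketch of why the answer is "no" (all names besides those in
the snippet above are invented for the demonstration): the eager split
calls both branches, while rewriting the expression as an if statement
keeps the original call pattern.

```python
calls = []


def test():
    calls.append("test")
    return True


def do_this():
    calls.append("do_this")
    return "this"


def do_that():
    calls.append("do_that")
    return "that"


def original():
    return do_this() if test() else do_that()


def eager_split():
    # The split shown above: both branches run before the choice is made.
    condition = test()
    true_result = do_this()
    false_result = do_that()
    return true_result if condition else false_result


def faithful_split():
    # Keeps the laziness by turning the expression into a statement.
    condition = test()
    if condition:
        result = do_this()
    else:
        result = do_that()
    return result


def trace(fn):
    """Run fn and return (result, names of the functions it called)."""
    calls.clear()
    return fn(), list(calls)
```

Comparing trace(original), trace(eager_split) and trace(faithful_split)
shows that only the eager split calls do_that().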

>     >     I guess this is because the proposed short-circuiting still "breaks out
>     >     of the precedence order" in a way that the existing short-circuiting
>     >     operators don't. Both member access and indexing are within the same
>     >     left-to-right binding precedence class, but the new operators would have
>     >     a short-circuit effect that swallows operations beyond where normal
>     >     left-to-right binding would suggest their effect should reach.
>     >
>     >     Are there existing examples of behavior like this in Python that I'm
>     >     missing?
>     >
>     >
>     > I don't know, but I think you shouldn't worry about this.
>
>     I think it's kind of odd, but if nobody else is worried about it, I
>     won't worry about it either :-)
>
>
> Good idea.
>


From ben+python at benfinney.id.au  Mon Sep 28 22:38:37 2015
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 29 Sep 2015 06:38:37 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
Message-ID: <85oagm2saa.fsf@benfinney.id.au>

Guido van Rossum <guido at python.org> writes:

> This seems a perfect use case for writing post?.tag.lower() -- this
> signifies that post may be None but if it exists, post.tag is not
> expected to be None. So basically I want the equivalent of
> (post.tag.lower() if post is not None else None).

You're deliberately choosing straightforward examples. That's fine for
showing the intended use case, but it does mean dismissing the concerns
about ambiguity in complex cases.

It also means the use cases are so simple they are easily expressed
succinctly with existing syntax, with the advantage of being more
explicit in their effect; so they don't argue strongly for the need to
add the new syntax.

So, the corner case examples in this thread, which mix up precedence,
are useful because they show how confusion is increased by making the
precedence and binding rules more complicated.

-- 
 \           “People are very open-minded about new things, as long as |
  `\        they're exactly like the old ones.” --Charles F. Kettering |
_o__)                                                                  |
Ben Finney


From donald at stufft.io  Mon Sep 28 22:41:52 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 28 Sep 2015 16:41:52 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
Message-ID: <etPan.5609a610.2f0c7c53.76f@Draupnir.home>

On September 28, 2015 at 4:25:12 PM, Guido van Rossum (guido at python.org) wrote:
> On Mon, Sep 28, 2015 at 1:15 PM, Donald Stufft wrote:
>  
> > The ? modifying additional attribute accesses beyond just the immediate
> > one bothers me too and feels more ruby than python to me.
> >
>  
> Really? Have you thought about it?

Not extensively, mostly this is a gut feeling.

>  
> Suppose I have an object post which may be None or something with a tag
> attribute which should be a string. And suppose I want to get the
> lowercased tag, if the object exists, else None.
>  
> This seems a perfect use case for writing post?.tag.lower() -- this
> signifies that post may be None but if it exists, post.tag is not expected
> to be None. So basically I want the equivalent of (post.tag.lower() if post
> is not None else None).
>  
> But if post?.tag.lower() were interpreted strictly as (post?.tag).lower(),
> then I would have to write post?.tag?.lower?(), which is an abomination.
> OTOH if post?.tag.lower() automatically meant post?.tag?.lower?() then I
> would silently get no error when post exists but post.tag is None (which in
> this example is an error).
>  

Does ? propagate past a non-None value? If it were post?.tag.name.lower() and post was not None, but tag was None, would that be an error or would the ? propagate to the tag as well?


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From guido at python.org  Mon Sep 28 22:54:09 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 13:54:09 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <85oagm2saa.fsf@benfinney.id.au>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
Message-ID: <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:38 PM, Ben Finney <ben+python at benfinney.id.au>
wrote:

> Guido van Rossum <guido at python.org> writes:
>
> > This seems a perfect use case for writing post?.tag.lower() -- this
> > signifies that post may be None but if it exists, post.tag is not
> > expected to be None. So basically I want the equivalent of
> > (post.tag.lower() if post is not None else None).
>
> You're deliberately choosing straightforward examples. That's fine for
> showing the intended use case, but it does mean dismissing the concerns
> about ambiguity in complex cases.
>
> It also means the use cases are so simple they are easily expressed
> succinctly with existing syntax, with the advantage of being more
> explicit in their effect; so they don't argue strongly for the need to
> add the new syntax.
>
> So, the corner case examples in this thread, which mix up precedence,
> are useful because they show how confusion is increased by making the
> precedence and binding rules more complicated.
>

But your argument seems to boil down to "it is possible to write obfuscated
code using this feature".

If you want to dumb down the feature so that foo?.bar.baz means just
(foo?.bar).baz then it's useless and I should just reject the PEP.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Sep 28 22:56:22 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 13:56:22 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
Message-ID: <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:41 PM, Donald Stufft <donald at stufft.io> wrote:

> On September 28, 2015 at 4:25:12 PM, Guido van Rossum (guido at python.org)
> wrote:
> > On Mon, Sep 28, 2015 at 1:15 PM, Donald Stufft wrote:
> >
> > > The ? modifying additional attribute accesses beyond just the immediate
> > > one bothers me too and feels more ruby than python to me.
> > >
> >
> > Really? Have you thought about it?
>
> Not extensively, mostly this is a gut feeling.
>
> >
> > Suppose I have an object post which may be None or something with a tag
> > attribute which should be a string. And suppose I want to get the
> > lowercased tag, if the object exists, else None.
> >
> > This seems a perfect use case for writing post?.tag.lower() -- this
> > signifies that post may be None but if it exists, post.tag is not expected
> > to be None. So basically I want the equivalent of (post.tag.lower() if post
> > is not None else None).
> >
> > But if post?.tag.lower() were interpreted strictly as (post?.tag).lower(),
> > then I would have to write post?.tag?.lower?(), which is an abomination.
> > OTOH if post?.tag.lower() automatically meant post?.tag?.lower?() then I
> > would silently get no error when post exists but post.tag is None (which in
> > this example is an error).
> >
>
> Does ? propagate past a non None value? If it were post?.tag.name.lower()
> and post was not None, but tag was None would that be an error or would the
> ? propagate to the tag as well?
>

I was trying to clarify that by saying that foo?.bar.baz means (foo.bar.baz
if foo is not None else None). IOW if tag was None that would be an error.

The rule then is quite simple: each ? does exactly one None check and
divides the expression into exactly two branches -- one for the case where
the thing preceding ? is None and one for the case where it isn't.
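
As a rough emulation of that rule in current Python, assuming a
hypothetical helper if_not_none() that plays the role of a single ?
(it takes the value to check and a callable for "the rest of the
expression"):

```python
from types import SimpleNamespace


def if_not_none(value, rest):
    """Hypothetical emulation of one `?`: one None check, two branches."""
    return None if value is None else rest(value)


def one_check(post):
    # post?.tag.name.lower(): only post is checked.
    return if_not_none(post, lambda p: p.tag.name.lower())


def two_checks(post):
    # post?.tag?.name.lower(): post and post.tag are each checked once.
    return if_not_none(
        post,
        lambda p: if_not_none(p.tag, lambda t: t.name.lower()),
    )


complete = SimpleNamespace(tag=SimpleNamespace(name="News"))
no_tag = SimpleNamespace(tag=None)

try:
    one_check(no_tag)
    tag_none_is_error = False
except AttributeError:
    # With a single ?, a None tag is still an error, as described above.
    tag_none_is_error = True
```

Each extra None check has to be written out as its own ?, which is what
the nesting in two_checks() mirrors.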

-- 
--Guido van Rossum (python.org/~guido)

From carl at oddbird.net  Mon Sep 28 23:04:34 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 28 Sep 2015 15:04:34 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
Message-ID: <5609AB62.5040503@oddbird.net>

On 09/28/2015 02:54 PM, Guido van Rossum wrote:
> If you want to dumb down the feature so that foo?.bar.baz means just
> (foo?.bar).baz then it's useless and I should just reject the PEP.

I think you're right that in practice ?. and ?[ would probably be just
fine, because the scope of their action is still quite limited.

But even if they are rejected, I think a simple `??` or `or?` (or
however it's spelled) operator to reduce the repetition of "x if x is
not None else y" is worth consideration on its own merits. This operator
is entirely unambiguous, and I think would be useful and frequently
used, whether or not ?. and ?[ are added along with it.
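
For illustration, a function-based sketch of what such an operator
would compute. The name coalesce() is made up here, and unlike a real
operator it evaluates its second argument eagerly. The point is the
difference from plain `or`, which also replaces falsy values such as 0
or "".

```python
def coalesce(value, default):
    """Hypothetical function form of `value ?? default`.

    Unlike a real operator, `default` is evaluated eagerly here.
    """
    return value if value is not None else default


# `or` discards a falsy-but-valid 0; coalesce() keeps it.
timeout_with_or = 0 or 30
timeout_coalesced = coalesce(0, 30)

# Only None triggers the default.
timeout_missing = coalesce(None, 30)
```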

Carl





From donald at stufft.io  Mon Sep 28 23:06:32 2015
From: donald at stufft.io (Donald Stufft)
Date: Mon, 28 Sep 2015 17:06:32 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
Message-ID: <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>


On September 28, 2015 at 4:56:44 PM, Guido van Rossum (guido at python.org) wrote:
> On Mon, Sep 28, 2015 at 1:41 PM, Donald Stufft wrote:
>  
> > On September 28, 2015 at 4:25:12 PM, Guido van Rossum (guido at python.org)
> > wrote:
> > > On Mon, Sep 28, 2015 at 1:15 PM, Donald Stufft wrote:
> > >
> > > > The ? modifying additional attribute accesses beyond just the immediate
> > > > one bothers me too and feels more ruby than python to me.
> > > >
> > >
> > > Really? Have you thought about it?
> >
> > Not extensively, mostly this is a gut feeling.
> >
> > >
> > > Suppose I have an object post which may be None or something with a tag
> > > attribute which should be a string. And suppose I want to get the
> > > lowercased tag, if the object exists, else None.
> > >
> > > This seems a perfect use case for writing post?.tag.lower() -- this
> > > signifies that post may be None but if it exists, post.tag is not expected
> > > to be None. So basically I want the equivalent of (post.tag.lower() if post
> > > is not None else None).
> > >
> > > But if post?.tag.lower() were interpreted strictly as (post?.tag).lower(),
> > > then I would have to write post?.tag?.lower?(), which is an abomination.
> > > OTOH if post?.tag.lower() automatically meant post?.tag?.lower?() then I
> > > would silently get no error when post exists but post.tag is None (which in
> > > this example is an error).
> > >
> >
> > Does ? propagate past a non None value? If it were post?.tag.name.lower()
> > and post was not None, but tag was None would that be an error or would the
> > ? propagate to the tag as well?
> >
>  
> I was trying to clarify that by saying that foo?.bar.baz means (foo.bar.baz
> if foo is not None else None). IOW if tag was None that would be an error.
>  
> The rule then is quite simple: each ? does exactly one None check and
> divides the expression into exactly two branches -- one for the case where
> the thing preceding ? is None and one for the case where it isn't.
>  

OK, that makes me feel better. My initial impression was that ? was going to modify all following things so that they were all implicitly ?. Just splitting it into two different branches seems OK.

I'm not a big fan of the punctuation though. It took me a minute to realize that post?.tag.lower() was saying if post is None, not if post.tag is None, and I feel like it's easy to miss the ?, especially when combined with other punctuation.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



From bruce at leban.us  Mon Sep 28 23:05:44 2015
From: bruce at leban.us (Bruce Leban)
Date: Mon, 28 Sep 2015 14:05:44 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
Message-ID: <CAGu0AnuooXPiNMdzzBAygHmTi+cQhSzwN_kum7Q_ELiWf4R4oQ@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:56 PM, Guido van Rossum <guido at python.org> wrote:

> The rule then is quite simple: each ? does exactly one None check and
> divides the expression into exactly two branches -- one for the case where
> the thing preceding ? is None and one for the case where it isn't.
>

I think this is exactly the right rule (when combined with the previously
stated rule that ?. ?() ?[] have the same precedence as the standard
versions of those operators).

--- Bruce
Check out my new puzzle book: http://J.mp/ingToConclusions
Get it free here: http://J.mp/ingToConclusionsFree (available on iOS)

From abarnert at yahoo.com  Mon Sep 28 21:47:13 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 28 Sep 2015 19:47:13 +0000 (UTC)
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
Message-ID: <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>

On Monday, September 28, 2015 12:05 PM, Guido van Rossum <guido at python.org> wrote:
>On Mon, Sep 28, 2015 at 10:38 AM, Carl Meyer <carl at oddbird.net> wrote:
>>"Propagating" refers to the proposed behavior where use of ?. or ?[
>>"propagates" through the following chain of operations. For example:
>>
>>    x = foo?.bar.spam.eggs
>>
>>Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
>>(propagating None rather than raising AttributeError), simply because a
>>`?.` had occurred earlier in the chain. So the above behaves differently
>>from:
>>
>>    temp = foo?.bar
>>    x = temp.spam.eggs
>>
>>Which raises questions about whether the propagation escapes
>>parentheses, too:
>>
>>    x = (foo?.bar).spam.eggs
>>
>
>Oh, I see. That's evil.
>
>The correct behavior here is that "foo?.bar.spam.eggs" should mean the same as
>
>    (None if foo is None else foo.bar.spam.eggs)
>
>(Stop until you understand that is *not* the same as either of the alternatives you describe.)
>
>I can see the confusion that led to the idea of "propagation" -- it probably comes from an attempt to define "foo?.bar" without reference to the context (in this case the relevant context is that it's followed by ".spam.eggs").


It would really help to have a complete spec, or at least a quick workthrough of how an expression gets parsed and compiled.

I assume it's something like this:

spam?.eggs.cheese becomes this pseudo-AST (I've skipped the loads and maybe some other stuff):

    Expr(
        value=Attribute(
            value=Attribute(
                value=Name(id='spam'), attr='eggs', uptalk=True),
            attr='cheese', uptalk=False))


... which is then compiled as this pseudo-bytecode:

    LOAD_NAME 'spam'
    DUP_TOP
    POP_JUMP_IF_NONE :label
    LOAD_ATTR 'eggs'
    LOAD_ATTR 'cheese'
    :label


I've invented a new opcode POP_JUMP_IF_NONE, but it should be clear what it does. I think it's clear how replacing spam with any other expression works, and how subscripting works. So the only question is whether understanding how .eggs.cheese becomes a pair of LOAD_ATTRs is sufficient to understand how ?.eggs.cheese becomes a POP_JUMP_IF_NONE followed by the same pair of LOAD_ATTRs through the same two steps.

I suppose the reference documentation wording is also important here, to explain that an uptalked attributeref or subscription short-circuits the whole primary.
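
To make the three readings of foo?.bar.spam.eggs discussed above concrete, here is a minimal sketch in today's Python. The function names (short_circuit, split_in_two, propagating, outcome) and the sample objects are made up for illustration; nothing here comes from the PEP itself.

```python
from types import SimpleNamespace


def short_circuit(foo):
    # One None check on foo; everything after it is ordinary attribute access.
    return None if foo is None else foo.bar.spam.eggs


def split_in_two(foo):
    # temp = foo?.bar, followed by a plain temp.spam.eggs
    temp = None if foo is None else foo.bar
    return temp.spam.eggs


def propagating(foo):
    # Every access in the chain behaves as if it had been written with ?.
    value = foo
    for name in ("bar", "spam", "eggs"):
        if value is None:
            return None
        value = getattr(value, name)
    return value


def outcome(func, foo):
    # Report either the result or the name of the exception raised.
    try:
        return func(foo)
    except AttributeError:
        return "AttributeError"


full = SimpleNamespace(bar=SimpleNamespace(spam=SimpleNamespace(eggs=42)))
no_bar = SimpleNamespace(bar=None)

for foo in (full, no_bar, None):
    print([outcome(f, foo) for f in (short_circuit, split_in_two, propagating)])
```

With foo set to None the short-circuit and propagating readings both give None while the two-statement version raises; with foo.bar set to None only the propagating reading avoids the AttributeError. So the short-circuit reading matches neither alternative in every case.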

From guido at python.org  Mon Sep 28 23:48:23 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 14:48:23 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>

On Mon, Sep 28, 2015 at 12:47 PM, Andrew Barnert <abarnert at yahoo.com> wrote:

> On Monday, September 28, 2015 12:05 PM, Guido van Rossum <guido at python.org>
> wrote:
> >On Mon, Sep 28, 2015 at 10:38 AM, Carl Meyer <carl at oddbird.net> wrote:
> >>"Propagating" refers to the proposed behavior where use of ?. or ?[
> >>"propagates" through the following chain of operations. For example:
> >>
> >>    x = foo?.bar.spam.eggs
> >>
> >>Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
> >>(propagating None rather than raising AttributeError), simply because a
> >>`?.` had occurred earlier in the chain. So the above behaves differently
> >>from:
> >>
> >>    temp = foo?.bar
> >>    x = temp.spam.eggs
> >>
> >>Which raises questions about whether the propagation escapes
> >>parentheses, too:
> >>
> >>    x = (foo?.bar).spam.eggs
> >>
> >
> >Oh, I see. That's evil.
> >
> >The correct behavior here is that "foo?.bar.spam.eggs" should mean the
> same as
> >
> >    (None if foo is None else foo.bar.spam.eggs)
> >
> >(Stop until you understand that is *not* the same as either of the
> alternatives you describe.)
> >
> >I can see the confusion that led to the idea of "propagation" -- it
> probably comes from an attempt to define "foo?.bar" without reference to
> the context (in this case the relevant context is that it's followed by
> ".spam.eggs").
>
>
> It would really help to have a complete spec, or at least a quick
> workthrough of how an expression gets parsed and compiled.
>

Isn't the PEP author still planning to do that? But it hasn't happened yet.
:-(


> I assume it's something like this:
>
> spam?.eggs.cheese becomes this pseudo-AST (I've skipped the loads and
> maybe some other stuff):
>
>     Expr(
>         value=Attribute(
>             value=Attribute(
>                 value=Name(id='spam'), attr='eggs', uptalk=True),
>             attr='cheese', uptalk=False))
>

Hm, I think the problem is that this way of representing the tree
encourages thinking that each attribute (with or without ?) can be treated
on its own.

> ... which is then compiled as this pseudo-bytecode:
>
>     LOAD_NAME 'spam'
>     DUP_TOP
>     POP_JUMP_IF_NONE :label
>     LOAD_ATTR 'eggs'
>     LOAD_ATTR 'cheese'
>     :label
>
>
> I've invented a new opcode POP_JUMP_IF_NONE, but it should be clear what
> it does. I think it's clear how replacing spam with any other expression
> works, and how subscripting works. So the only question is whether
> understanding how .eggs.cheese becomes a pair of LOAD_ATTRs is sufficient
> to understand how ?.eggs.cheese becomes a JUMP_IF_NONE followed by the same
> pair of LOAD_ATTRs through the same two steps.
>

To most people of course that's indecipherable mumbo-jumbo. :-)


> I suppose the reference documentation wording is also important here, to
> explain that an uptalked attributeref or subscription short-circuits the
> whole primary.
>

Apparently clarifying that is the entire point of this thread. :-)

-- 
--Guido van Rossum (python.org/~guido)

From luciano at ramalho.org  Mon Sep 28 23:48:49 2015
From: luciano at ramalho.org (Luciano Ramalho)
Date: Mon, 28 Sep 2015 18:48:49 -0300
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>

Glyph tweeted yesterday that everyone should watch the "Nothing is
Something" 35' talk by Sandi Metz at RailsConf 2015. It's great and,
in a way, relevant to this discussion.

https://www.youtube.com/watch?v=29MAL8pJImQ

BTW, so far, `or?` is the least horrible token suggested, IMHO. I like
the basic semantics, though.

Cheers,

Luciano


On Mon, Sep 28, 2015 at 4:47 PM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> On Monday, September 28, 2015 12:05 PM, Guido van Rossum <guido at python.org> wrote:
>>On Mon, Sep 28, 2015 at 10:38 AM, Carl Meyer <carl at oddbird.net> wrote:
>>>"Propagating" refers to the proposed behavior where use of ?. or ?[
>>>"propagates" through the following chain of operations. For example:
>>>
>>>    x = foo?.bar.spam.eggs
>>>
>>>Where both `.spam` and `.eggs` would behave like `?.spam` and `?.eggs`
>>>(propagating None rather than raising AttributeError), simply because a
>>>`?.` had occurred earlier in the chain. So the above behaves differently
>>>from:
>>>
>>>    temp = foo?.bar
>>>    x = temp.spam.eggs
>>>
>>>Which raises questions about whether the propagation escapes
>>>parentheses, too:
>>>
>>>    x = (foo?.bar).spam.eggs
>>>
>>
>>Oh, I see. That's evil.
>>
>>The correct behavior here is that "foo?.bar.spam.eggs" should mean the same as
>>
>>    (None if foo is None else foo.bar.spam.eggs)
>>
>>(Stop until you understand that is *not* the same as either of the alternatives you describe.)
>>
>>I can see the confusion that led to the idea of "propagation" -- it probably comes from an attempt to define "foo?.bar" without reference to the context (in this case the relevant context is that it's followed by ".spam.eggs").
>
>
> It would really help to have a complete spec, or at least a quick workthrough of how an expression gets parsed and compiled.
>
> I assume it's something like this:
>
> spam?.eggs.cheese becomes this pseudo-AST (I've skipped the loads and maybe some other stuff):
>
>     Expr(
>         value=Attribute(
>             value=Attribute(
>                 value=Name(id='spam'), attr='eggs', uptalk=True),
>             attr='cheese', uptalk=False))
>
>
> ... which is then compiled as this pseudo-bytecode:
>
>     LOAD_NAME 'spam'
>     DUP_TOP
>     POP_JUMP_IF_NONE :label
>     LOAD_ATTR 'eggs'
>     LOAD_ATTR 'cheese'
>     :label
>
>
> I've invented a new opcode POP_JUMP_IF_NONE, but it should be clear what it does. I think it's clear how replacing spam with any other expression works, and how subscripting works. So the only question is whether understanding how .eggs.cheese becomes a pair of LOAD_ATTRs is sufficient to understand how ?.eggs.cheese becomes a JUMP_IF_NONE followed by the same pair of LOAD_ATTRs through the same two steps.
>
> I suppose the reference documentation wording is also important here, to explain that an uptalked attributeref or subscription short-circuits the whole primary.



-- 
Luciano Ramalho
|  Author of Fluent Python (O'Reilly, 2015)
|     http://shop.oreilly.com/product/0636920032519.do
|  Professor em: http://python.pro.br
|  Twitter: @ramalhoorg

From guido at python.org  Mon Sep 28 23:49:27 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 14:49:27 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
Message-ID: <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>

On Mon, Sep 28, 2015 at 2:06 PM, Donald Stufft <donald at stufft.io> wrote:

> I'm not a big fan of the punctuation though. It took me a minute to
> realize that post?.tag.lower() was saying if post is None, not if post.tag
> is None and I feel like it's easy to miss the ?, especially when combined
> with other punctuation.
>

But that's a different point (for the record I'm not a big fan of the ?
either).

-- 
--Guido van Rossum (python.org/~guido)

From niilos at gmx.com  Tue Sep 29 00:10:34 2015
From: niilos at gmx.com (Niilos)
Date: Tue, 29 Sep 2015 00:10:34 +0200
Subject: [Python-ideas] list as parameter for the split function
Message-ID: <5609BADA.8060801@gmx.com>

Hello everyone,

I was wondering how to split a string with multiple separators.
For instance, if I edit some subtitle file and I want the string 
'00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', 
'00', '02', '37', '927'] I have to call split too many times and I didn't 
find a "clean" way to do it.
I imagined the split function with an iterator as parameter. The string 
would be split each time its substring is in the iterator.

Here is the syntax I considered for this :

 >>> '00:02:34,452 --> 00:02:37,927'.split([ ':', ' --> ', ',' ])
['00', '02', '34', '452', '00', '02', '37', '927']

Is it a relevant idea ? What do you think about it ?

Regards,
Niilos.

From rymg19 at gmail.com  Tue Sep 29 00:23:27 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 28 Sep 2015 17:23:27 -0500
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <5609BADA.8060801@gmx.com>
References: <5609BADA.8060801@gmx.com>
Message-ID: <CAO41-mO-_r3Uru9SNFzH4-yu8gR=0jigZ1wJNPCjrkJi=wUEmw@mail.gmail.com>

import re
parts = re.split(r':| --> |,', '00:02:34...')



On Mon, Sep 28, 2015 at 5:10 PM, Niilos <niilos at gmx.com> wrote:

> Hello everyone,
>
> I was wondering how to split a string with multiple separators.
> For instance, if I edit some subtitle file and I want the string
> '00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', '00',
> '02', '37', '927'] I have to use split too much time and I didn't find a
> "clean" way to do it.
> I imagined the split function with an iterator as parameter. The string
> would be split each time its substring is in the iterator.
>
> Here is the syntax I considered for this :
>
> >>> '00:02:34,452 --> 00:02:37,927'.split([ ':', ' --> ', ',' ])
> ['00', '02', '34', '452', '00', '02', '37', '927']
>
> Is it a relevant idea ? What do you think about it ?
>
> Regards,
> Niilos.



-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From emile at fenx.com  Tue Sep 29 00:23:47 2015
From: emile at fenx.com (Emile van Sebille)
Date: Mon, 28 Sep 2015 15:23:47 -0700
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <5609BADA.8060801@gmx.com>
References: <5609BADA.8060801@gmx.com>
Message-ID: <mucelo$o8m$1@ger.gmane.org>

On 9/28/2015 3:10 PM, Niilos wrote:
> '00:02:34,452 --> 00:02:37,927'.split([ ':', ' --> ', ',' ])

'00:02:34,452 --> 00:02:37,927'.replace(",",":").replace(" --> ",":").split(":")

Emile


From p.f.moore at gmail.com  Tue Sep 29 00:24:26 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 28 Sep 2015 23:24:26 +0100
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <5609BADA.8060801@gmx.com>
References: <5609BADA.8060801@gmx.com>
Message-ID: <CACac1F8BqAHbuLfGT5zhXYRsV0pJq-qDaaSUZhNh1Jsyk2bssg@mail.gmail.com>

On 28 September 2015 at 23:10, Niilos <niilos at gmx.com> wrote:
> I was wondering how to split a string with multiple separators.
> For instance, if I edit some subtitle file and I want the string
> '00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', '00',
> '02', '37', '927'] I have to use split too much time and I didn't find a
> "clean" way to do it.

You can use re.split:

>>> re.split(r':|,| --> ', '00:02:34,452 --> 00:02:37,927')
['00', '02', '34', '452', '00', '02', '37', '927']

Paul

From rymg19 at gmail.com  Tue Sep 29 00:26:03 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 28 Sep 2015 17:26:03 -0500
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <CAO41-mO-_r3Uru9SNFzH4-yu8gR=0jigZ1wJNPCjrkJi=wUEmw@mail.gmail.com>
References: <5609BADA.8060801@gmx.com>
 <CAO41-mO-_r3Uru9SNFzH4-yu8gR=0jigZ1wJNPCjrkJi=wUEmw@mail.gmail.com>
Message-ID: <CAO41-mN13dS9Kb=6deR1sV6x6oFO2rU5925MQaf5=cnya95c+g@mail.gmail.com>

Really, you could also just use re.match:


import re

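pat = re.compile(
    r'(\d\d):(\d\d):(\d\d),(\d{3}) --> (\d\d):(\d\d):(\d\d),(\d{3})')

def parse(string):
    # Return the eight captured fields as a list, or None when the
    # input does not have the expected shape.
    match = pat.match(string)
    return list(match.groups()) if match else None

...

print(parse('00:02:34,452 --> 00:02:37,927'))
# prints ['00', '02', '34', '452', '00', '02', '37', '927']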


That way, if the input is invalid, `None` will be returned, so you have
free error checking (sort of).


On Mon, Sep 28, 2015 at 5:23 PM, Ryan Gonzalez <rymg19 at gmail.com> wrote:

> import re
> parts = re.split(':|(-->)|,', '00:02:34...')
>
>
>
> On Mon, Sep 28, 2015 at 5:10 PM, Niilos <niilos at gmx.com> wrote:
>
>> Hello everyone,
>>
>> I was wondering how to split a string with multiple separators.
>> For instance, if I edit some subtitle file and I want the string
>> '00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', '00',
>> '02', '37', '927'] I have to use split too much time and I didn't find a
>> "clean" way to do it.
>> I imagined the split function with an iterator as parameter. The string
>> would be split each time its substring is in the iterator.
>>
>> Here is the syntax I considered for this :
>>
>> >>> '00:02:34,452 --> 00:02:37,927'.split([ ':', ' --> ', ',' ])
>> ['00', '02', '34', '452', '00', '02', '37', '927']
>>
>> Is it a relevant idea ? What do you think about it ?
>>
>> Regards,
>> Niilos.
>
>
>
> --
> Ryan
> [ERROR]: Your autotools build scripts are 200 lines longer than your
> program. Something's wrong.
> http://kirbyfan64.github.io/
>
>



-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From rosuav at gmail.com  Tue Sep 29 00:27:23 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 29 Sep 2015 08:27:23 +1000
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <5609BADA.8060801@gmx.com>
References: <5609BADA.8060801@gmx.com>
Message-ID: <CAPTjJmpgUh=+kkjtQfTgLaLuKn_EQPT3BJK_aPFLa_WOQV8zYQ@mail.gmail.com>

On Tue, Sep 29, 2015 at 8:10 AM, Niilos <niilos at gmx.com> wrote:
> I was wondering how to split a string with multiple separators.
> For instance, if I edit some subtitle file and I want the string
> '00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', '00',
> '02', '37', '927'] I have to use split too much time and I didn't find a
> "clean" way to do it.
> I imagined the split function with an iterator as parameter. The string
> would be split each time its substring is in the iterator.
>
> Here is the syntax I considered for this :
>
>>>> '00:02:34,452 --> 00:02:37,927'.split([ ':', ' --> ', ',' ])
> ['00', '02', '34', '452', '00', '02', '37', '927']
>
> Is it a relevant idea ? What do you think about it ?

Two possibilities:

1) Replace all separators with the same one.
'00:02:34,452 --> 00:02:37,927'.replace(",",":").replace(" --> ",":").split(":")

2) Use a regular expression.
re.split(":|,| --> ",'00:02:34,452 --> 00:02:37,927')
# or working the other way: find all the digit strings
re.findall("[0-9]+",'00:02:34,452 --> 00:02:37,927')

You could also consider a fuller parser; presumably splitting into
strings is just the first step. I don't have anything handy in Python,
but there would be ways of doing the whole thing in fewer steps.
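
As one possible shape for such a fuller parser, here is a minimal sketch that goes straight from a subtitle timing line to a pair of timedelta objects. The names TIMESTAMP, CUE and parse_cue are invented for this example, and the two-digit hours/minutes/seconds layout is an assumption based on the sample string in this thread.

```python
import re
from datetime import timedelta

# Assumed layout: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMESTAMP = r"(\d\d):(\d\d):(\d\d),(\d{3})"
CUE = re.compile(TIMESTAMP + " --> " + TIMESTAMP)


def parse_cue(line):
    """Return (start, end) as timedeltas, or raise ValueError on bad input."""
    match = CUE.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a timing line: %r" % (line,))
    h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
    start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
    end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
    return start, end


start, end = parse_cue("00:02:34,452 --> 00:02:37,927")
print(end - start)
```

Working with timedelta values means that shifting or comparing cues is ordinary arithmetic rather than more string handling.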

ChrisA

From tjreedy at udel.edu  Tue Sep 29 00:40:41 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 28 Sep 2015 18:40:41 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <muc4v2$q29$1@ger.gmane.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <56096F09.40804@mail.de> <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
 <muc4v2$q29$1@ger.gmane.org>
Message-ID: <mucflc$6l1$1@ger.gmane.org>

On 9/28/2015 3:38 PM, Emile van Sebille wrote:
> On 9/28/2015 10:24 AM, Andrew Barnert via Python-ideas wrote:
>> On Sep 28, 2015, at 09:47, Sven R. Kunze
>> <srkunze at mail.de> wrote:
> <snip>
>
>>> I wouldn't make a mountain out of a molehill. Other existing
>>> operators have the same issue.
>>
>> Which other keywords or symbols may be either a binary operator or
>> part of a ternary operator depending on context?
>
> These come to mind:
>
> a = b = c
> a < b < c

These are chained comparisons, which get separated, not ternary 
operators. a < b == c < e > f in g is also syntactically valid, and I 
don't think anything is gained by calling it a pentanary operator.

-- 
Terry Jan Reedy


From abarnert at yahoo.com  Tue Sep 29 00:45:11 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 28 Sep 2015 15:45:11 -0700
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <5609BADA.8060801@gmx.com>
References: <5609BADA.8060801@gmx.com>
Message-ID: <35EC0367-E741-4640-96FD-4B709F81550C@yahoo.com>

On Sep 28, 2015, at 15:10, Niilos <niilos at gmx.com> wrote:
> 
> Hello everyone,
> 
> I was wondering how to split a string with multiple separators.
> For instance, if I edit some subtitle file and I want the string '00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', '00', '02', '37', '927'] I have to use split too much time and I didn't find a "clean" way to do it.
> I imagined the split function with an iterator as parameter. The string would be split each time its substring is in the iterator.

As a side note, a list is not an Iterator. It's an iterable, but an Iterator is a special kind of iterable that only allows one pass, which is definitely not what you want here. In fact, what you probably want is a sequence (or maybe just a container, since the only thing you want to do is test "in"). 

Also, the way you've defined this ("each time its substring is in the iterator") is either ambiguous, or inherently expensive, depending on how you read it. And once you work out what you actually mean, it's hard to express it better than as a regular expression, which is why half a dozen people jumped to that answer.
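
The one-pass point can be seen with a short sketch. The variable names are made up for illustration; the separators are the ones from the original post.

```python
separators = [":", " --> ", ","]  # a list: an iterable that can be scanned repeatedly
one_pass = iter(separators)       # an iterator: consumed as it is scanned

# Membership tests on the list can be repeated freely.
list_results = [":" in separators, ":" in separators]

# The same test on the iterator consumes items up to and including the
# match, so the second test only sees what is left and finds no ":".
iter_results = [":" in one_pass, ":" in one_pass]

print(list_results)
print(iter_results)
```

The list answers True both times, while the iterator answers True once and False afterwards, which is why a split that tests "in" repeatedly needs a sequence or container rather than an iterator.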

From srkunze at mail.de  Tue Sep 29 00:52:40 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 29 Sep 2015 00:52:40 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
Message-ID: <5609C4B8.5080501@mail.de>

On 28.09.2015 23:48, Luciano Ramalho wrote:
> Glyph tweeted yesterday that everyone should watch the "Nothing is
> Something" 35' talk by Sandi Metz at RailsConf 2015. It's great and,
> in a way, relevant to this discussion.
>
> https://www.youtube.com/watch?v=29MAL8pJImQ

Nice watch.

It's completely in line with our internal guidelines. Great to see that 
people with practical experience come to the same conclusion.


Best,
Sven

From guido at python.org  Tue Sep 29 01:03:22 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 16:03:22 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <mucflc$6l1$1@ger.gmane.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <56096F09.40804@mail.de> <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
 <muc4v2$q29$1@ger.gmane.org> <mucflc$6l1$1@ger.gmane.org>
Message-ID: <CAP7+vJJe25qzczmBgu05oEHL-GSceP__k9m=gNwKxVkkmwQvsw@mail.gmail.com>

On Mon, Sep 28, 2015 at 3:40 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 9/28/2015 3:38 PM, Emile van Sebille wrote:
>
>> On 9/28/2015 10:24 AM, Andrew Barnert via Python-ideas wrote:
>>
>>> On Sep 28, 2015, at 09:47, Sven R. Kunze
>>> <srkunze at mail.de> wrote:
>>>
>> <snip>
>>
>> I wouldn't make a mountain out of a molehill. Other existing
>>>> operators have the same issue.
>>>>
>>>
>>> Which other keywords or symbols may be either a binary operator or
>>> part of a ternary operator depending on context?
>>>
>>
>> These come to mind:
>>
>> a = b = c
>> a < b < c
>>
>
> These are chained comparisons, which get separated, not ternary operators.
> a < b = c < e > f in g is also syntactically valid, and I don't think
> anything is gained by calling it a pentanary operator.
>

But a < b < c is an excellent example of something that cannot be
mindlessly refactored into (a < b) < c.
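
A small sketch of why the parentheses change the meaning; the values are arbitrary and chosen only so that the two forms disagree.

```python
a, b, c = 3, 2, 1

# Chained form: equivalent to (a < b) and (b < c), with b evaluated once.
chained = a < b < c

# Parenthesized form: compares the bool produced by a < b with c.
# Here a < b is False, and False < 1 is True because False == 0.
grouped = (a < b) < c

print(chained, grouped)
```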

-- 
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Tue Sep 29 02:38:39 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 28 Sep 2015 20:38:39 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
Message-ID: <mucmih$6tg$1@ger.gmane.org>

On 9/28/2015 5:48 PM, Luciano Ramalho wrote:
> Glyph tweeted yesterday that everyone should watch the "Nothing is
> Something" 35' talk by Sandi Metz at RailsConf 2015. It's great and,
> in a way, relevant to this discussion.
>
> https://www.youtube.com/watch?v=29MAL8pJImQ

I understood Metz as advocating avoiding the nil (None) problem by giving 
every class an 'active nothing' that has the methods of the class.  We 
do that for most builtin classes -- 0, (), {}, etc.  She also used the 
identity function with a particular signature in various roles.
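
As a rough illustration of the 'active nothing' idea using a builtin, the sketch below lets an empty string stand in for a missing tag. The Post class and its tag attribute are hypothetical, borrowed from the post?.tag.lower() example elsewhere in this thread.

```python
class Post:
    def __init__(self, tag=""):
        # An empty string stands in for "no tag" instead of None, so
        # string methods can be called without a None check.
        self.tag = tag


posts = [Post("Python"), Post()]
lowered = [post.tag.lower() for post in posts]
print(lowered)
```

This only covers the case where the attribute is missing, not the case where the Post itself is None, so it is a partial answer to the problem the PEP addresses.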

-- 
Terry Jan Reedy



From random832 at fastmail.com  Tue Sep 29 04:22:09 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 28 Sep 2015 22:22:09 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
Message-ID: <1443493329.2298568.396067929.05FAD0A3@webmail.messagingengine.com>

On Mon, Sep 28, 2015, at 17:48, Guido van Rossum wrote:
> >     Expr(
> >         value=Attribute(
> >             value=Attribute(
> >                 value=Name(id='spam'), attr='eggs', uptalk=True),
> >             attr='cheese', uptalk=False))
> >
> 
> Hm, I think the problem is that this way of representing the tree
> encourages thinking that each attribute (with or without ?) can be
> treated
> on its own.

How else would you represent it? Maybe some sort of expression that
represents a _list_ of attribute/item/call "operators" that are each
applied, and if one of them results in none and has uptalk=True it can
yield early.

Something like...

AtomExpr(atom=Name('spam'), trailers=[Attribute('eggs', uptalk=True),
Attribute('cheese', uptalk=False)])

For a more complex example:

a?.b.c?[12](34).f(56)?(78)

AtomExpr(Name('a'), [
	Attribute('b', True),
	Attribute('c', False),
	Subscript(12, True),
	Call([34], False),
	Attribute('f', False),
	Call([56], False),
	Call([78], True)])

I almost sent this with it called "Thing", but I checked the grammar and
found an element this thing actually maps to.
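
A minimal sketch of how such a trailer list could be evaluated, written as ordinary Python. It assumes one particular reading of the idea: an uptalked trailer checks the value it is about to be applied to, and a None at that point ends the whole expression. The classes and the evaluate function are illustrative only; they are not part of the ast module or of any proposal.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any


@dataclass
class Attribute:
    name: str
    uptalk: bool = False

    def apply(self, value):
        return getattr(value, self.name)


@dataclass
class Subscript:
    key: Any
    uptalk: bool = False

    def apply(self, value):
        return value[self.key]


@dataclass
class Call:
    args: list
    uptalk: bool = False

    def apply(self, value):
        return value(*self.args)


def evaluate(atom, trailers):
    # Apply each trailer in turn. An uptalked trailer that meets None
    # short-circuits the rest of the chain, not just its own step.
    value = atom
    for trailer in trailers:
        if trailer.uptalk and value is None:
            return None
        value = trailer.apply(value)
    return value


# spam?.eggs.cheese
trailers = [Attribute("eggs", uptalk=True), Attribute("cheese")]
spam = SimpleNamespace(eggs=SimpleNamespace(cheese="brie"))
print(evaluate(spam, trailers))
print(evaluate(None, trailers))
```

Under this reading, a None stored in spam.eggs still raises AttributeError on .cheese, because only the trailer marked with uptalk performs a check.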

From guido at python.org  Tue Sep 29 05:11:00 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 20:11:00 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <1443493329.2298568.396067929.05FAD0A3@webmail.messagingengine.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
 <1443493329.2298568.396067929.05FAD0A3@webmail.messagingengine.com>
Message-ID: <CAP7+vJ+VQYN6ekN22Q_KKc2OVSWOd=cP11x22OP-UfzVd+JWEg@mail.gmail.com>

I would at least define different classes for the uptalk versions.

But my main complaint is using the parse tree as a spec at all -- it has
way too much noise for a clear description. We don't describe a < b < c by
first translating it to Comparison(a, Comparison(b, c, chained=False),
chained=True) either: the reference manual uses a postfix * (i.e.
repetition) operator to describe chained comparisons -- while for other
operators it favors a recursive definition.

On Mon, Sep 28, 2015 at 7:22 PM, Random832 <random832 at fastmail.com> wrote:

> On Mon, Sep 28, 2015, at 17:48, Guido van Rossum wrote:
> > >     Expr(
> > >         value=Attribute(
> > >             value=Attribute(
> > >                 value=Name(id='spam'), attr='eggs', uptalk=True),
> > >             attr='cheese', uptalk=False))
> > >
> >
> > Hm, I think the problem is that this way of representing the tree
> > encourages thinking that each attribute (with or without ?) can be
> > treated
> > on its own.
>
> How else would you represent it? Maybe some sort of expression that
> represents a _list_ of attribute/item/call "operators" that are each
> applied, and if one of them results in none and has uptalk=True it can
> yield early.
>
> Something like...
>
> AtomExpr(atom=Name('spam'), trailers=[Attribute('eggs', uptalk=True),
> Attribute('cheese', uptalk=False)])
>
> For a more complex example:
>
> a?.b.c?[12](34).f(56)?(78)
>
> AtomExpr(Name('a'), [
>         Attribute('b', True),
>         Attribute('c', False),
>         Subscript(12, True),
>         Call([34], False),
>         Attribute('f', False),
>         Call([56], False),
>         Call([78], True)])
>
> I almost sent this with it called "Thing", but I checked the grammar and
> found an element this thing actually maps to.



-- 
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info  Tue Sep 29 05:43:18 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 29 Sep 2015 13:43:18 +1000
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <5609BADA.8060801@gmx.com>
References: <5609BADA.8060801@gmx.com>
Message-ID: <20150929034316.GO23642@ando.pearwood.info>

On Tue, Sep 29, 2015 at 12:10:34AM +0200, Niilos wrote:
> Hello everyone,
> 
> I was wondering how to split a string with multiple separators.
> For instance, if I edit some subtitle file and I want the string 
> '00:02:34,452 --> 00:02:37,927' to become ['00', '02', '34', '452', 
> '00', '02', '37', '927'] I have to use split too much time and I didn't 
> find a "clean" way to do it.
> I imagined the split function with an iterator as parameter. The string 
> would be split each time its substring is in the iterator.
> 
> Here is the syntax I considered for this :
> 
> >>> '00:02:34,452 --> 00:02:37,927'.split([ ':', ' --> ', ',' ])
> ['00', '02', '34', '452', '00', '02', '37', '927']
> 
> Is it a relevant idea ? What do you think about it ?


Some string methods, such as startswith and endswith, accept a tuple of
alternatives, e.g.:

py> "spam".startswith(("a", "+", "sp"))
True

and I've often wished that split would be one of them. The substring 
argument could accept a string (as it does now) or a tuple of strings.


There are other solutions, but they have issues:

(1) Writing your own custom string mini-parser and getting it right is 
harder than it sounds. Certainly it's not simple enough to reinvent this 
particular tool each time you want it.


(2) Using replace to change all the substrings to one:

py> text = "aaa,bbb ccc;ddd,eee fff"
py> text.replace(",", " ").replace(";", " ").split()
['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff']

works well enough for simple cases, but if you have a lot of text, 
having to call replace multiple times can be expensive.
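
When every separator is a single character, one way to avoid the repeated
passes is str.translate, which maps all of them to a space in one go. This is
only a sketch under that assumption; it does not help with multi-character
separators such as ' --> ', and the variable names are illustrative.

```python
text = "aaa,bbb ccc;ddd,eee fff"

# Map each separator character to a space in a single pass,
# then split on runs of whitespace.
table = str.maketrans(",;", "  ")
parts = text.translate(table).split()
```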


(3) Using a regular expression is probably the "right" answer, at least 
from a comp sci theoretical perspective. This is precisely the sort of 
thing that regexes are designed for. Unfortunately, regex syntax is 
itself a programming language[1], and a particularly cryptic and 
unforgiving one, so even quite experienced coders can have trouble.

At first it seems easy:

py> re.split(r";|-|~", "aaa~bbb-ccc;ddd;eee")
['aaa', 'bbb', 'ccc', 'ddd', 'eee']

but then seemingly minor changes make it misbehave:

py> re.split(r";|-|^", "aaa^bbb-ccc^ddd;eee")
['aaa^bbb', 'ccc^ddd', 'eee']

py> re.split(r";|-|.", "aaa.bbb-ccc;ddd;eee")
['', '', '', '', '', '', '', '', '', '', '', '', '', '', 
'', '', '', '', '', '']

The solution is to escape the metacharacters, but people who aren't 
familiar with regexes won't necessarily know which they are.
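
For reference, re.escape does the escaping without the caller needing to know
which characters are special. A small sketch, reusing the two misbehaving
inputs from above (the helper name escaped_pattern is made up for this
example):

```python
import re

def escaped_pattern(separators):
    # re.escape neutralises metacharacters such as '^' and '.'
    return "|".join(re.escape(s) for s in separators)

caret = re.split(escaped_pattern([";", "-", "^"]), "aaa^bbb-ccc^ddd;eee")
dot = re.split(escaped_pattern([";", "-", "."]), "aaa.bbb-ccc;ddd;eee")
```

Both calls now split on the literal characters rather than treating them as
regex syntax.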

So really, in my opinion, there is no good built-in solution to the 
*general* problem of splitting a string on multiple arbitrary 
substrings. Perhaps str.split can act as an interface to the re module, 
automatically escaping the substrings:


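# Pseudo-implementation
import re

def split(self, substrings, maxsplit=-1):
    if isinstance(substrings, str):
        # use the current implementation
        return str.split(self, substrings, maxsplit)
    elif isinstance(substrings, tuple):
        if maxsplit == 0:
            return [self]
        regex = '|'.join(re.escape(s) for s in substrings)
        # re.split uses 0 (not -1) to mean "no limit"
        return re.split(regex, self, maxsplit=max(maxsplit, 0))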
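
As a standalone illustration of the same idea, here is a sketch of a helper
applied to the subtitle example from the original question. The name
multi_split is hypothetical, and sorting the separators longest-first is an
extra assumption added here so that a separator like '-->' is not pre-empted
by a shorter one like '-'.

```python
import re

def multi_split(text, separators, maxsplit=-1):
    """Split text on any of the given separator strings (hypothetical helper)."""
    if maxsplit == 0:
        return [text]
    # Try longer separators first so that e.g. "-->" wins over "-".
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(s) for s in ordered)
    # re.split uses 0, rather than -1, to mean "no limit".
    return re.split(pattern, text, maxsplit=max(maxsplit, 0))

fields = multi_split("00:02:34,452 --> 00:02:37,927", (":", " --> ", ","))
```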




[1] Albeit not a Turing Complete one, at least not Python's version.

-- 
Steve

From jdhardy at gmail.com  Tue Sep 29 06:04:54 2015
From: jdhardy at gmail.com (Jeff Hardy)
Date: Mon, 28 Sep 2015 21:04:54 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
Message-ID: <CAF7AXFFst2AQ9s8xk-vsGDY-Kon_hfC86jztU1i34HXDci-wwg@mail.gmail.com>

On Mon, Sep 28, 2015 at 1:56 PM, Guido van Rossum <guido at python.org> wrote:

> On Mon, Sep 28, 2015 at 1:41 PM, Donald Stufft <donald at stufft.io> wrote:
>
>> On September 28, 2015 at 4:25:12 PM, Guido van Rossum (guido at python.org)
>> wrote:
>> > On Mon, Sep 28, 2015 at 1:15 PM, Donald Stufft wrote:
>> >
>> > > The ? Modifying additional attribute accesses beyond just the immediate
>> > > one bothers me too and feels more ruby than python to me.
>> > >
>> >
>> > Really? Have you thought about it?
>>
>> Not extensively, mostly this is a gut feeling.
>>
>> >
>> > Suppose I have an object post which may be None or something with a tag
>> > attribute which should be a string. And suppose I want to get the
>> > lowercased tag, if the object exists, else None.
>> >
>> > This seems a perfect use case for writing post?.tag.lower() -- this
>> > signifies that post may be None but if it exists, post.tag is not
>> > expected to be None. So basically I want the equivalent of
>> > (post.tag.lower() if post is not None else None).
>> >
>> > But if post?.tag.lower() were interpreted strictly as
>> > (post?.tag).lower(), then I would have to write post?.tag?.lower?(),
>> > which is an abomination. OTOH if post?.tag.lower() automatically meant
>> > post?.tag?.lower?() then I would silently get no error when post exists
>> > but post.tag is None (which in this example is an error).
>> >
>>
>> Does ? propagate past a non None value? If it were post?.tag.name.lower()
>> and post was not None, but tag was None would that be an error or would the
>> ? propagate to the tag as well?
>>
>
> I was trying to clarify that by saying that foo?.bar.baz means
> (foo.bar.baz if foo is not None else None). IOW if tag was None that would
> be an error.
>
> The rule then is quite simple: each ? does exactly one None check and
> divides the expression into exactly two branches -- one for the case where
> the thing preceding ? is None and one for the case where it isn't.
>
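
The rule quoted above can be emulated in today's Python. The following is
only a sketch: the Post class and both function names are made up for the
example, and the ?. syntax itself is still just a proposal. The first
function mirrors post?.tag.lower() with a single None check; the second
mirrors the "check everything" reading, post?.tag?.lower?().

```python
class Post:
    def __init__(self, tag):
        self.tag = tag

def lowered_tag(post):
    # Emulates post?.tag.lower(): one None check, on post only.
    return None if post is None else post.tag.lower()

def lowered_tag_lenient(post):
    # Emulates post?.tag?.lower?(): every step is checked, so a None tag
    # is silently turned into None instead of raising.
    if post is None or post.tag is None or post.tag.lower is None:
        return None
    return post.tag.lower()
```

With the first form, a post whose tag is None still raises AttributeError,
which is the behaviour described as desirable in the quoted text.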

This whole line of discussion is why I'd prefer the PEP be split to have ??
in one and ?., ?[, etc. in another (the thread I linked isn't even the
longest one discussing the associativity - there were many that preceded
it). I agree that the short circuit behaviour is the only one that makes
any sense, but I also don't want to see the very useful ?? operator lost
because of discussions over or implementation difficulties of ?. or ?[.

And if it's going to be done anyway, I'd like to see ?( as well.

- Jeff

From random832 at fastmail.com  Tue Sep 29 06:22:38 2015
From: random832 at fastmail.com (Random832)
Date: Tue, 29 Sep 2015 00:22:38 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
 <1443493329.2298568.396067929.05FAD0A3@webmail.messagingengine.com>
 <CAP7+vJ+VQYN6ekN22Q_KKc2OVSWOd=cP11x22OP-UfzVd+JWEg@mail.gmail.com>
Message-ID: <m21tdham7l.fsf@fastmail.com>

Guido van Rossum <guido at python.org> writes:
> I would at least define different classes for the uptalk versions.
>
> But my main complaint is using the parse tree as a spec at all

Like I said, I actually came up with that structure *before* seeing that
it mirrored a grammar element - it honestly seems like the most natural
way to embody the fact that evaluating it requires the whole context as
a unit and can short-circuit halfway through the list, depending on if
the 'operator' at that position is an uptalk version.

The evaluation given this structure can be described in pseudocode:

def evaluate(expr):
    value = expr.atom.evaluate()
    for trailer in expr.trailers:
        if trailer.uptalk and value is None:
            return None
        value = trailer.evaluate_step(value)
    return value

The code generation could work the same way, iterating over this and
generating whatever instructions each trailer implies. In CPython, the
difference between the uptalk and non-uptalk version would be that
immediately after the left-hand value is on the stack, insert opcodes:
DUP_TOP LOAD_CONST(None) COMPARE_OP(is) POP_JUMP_IF_TRUE(end), with the
jump being to the location where the final value of the expression is
expected on the stack.

Assuming I'm understanding the meaning of each opcode correctly, this
sequence would basically be equivalent to a hypothetical JUMP_IF_NONE
opcode.

I don't think a recursive definition for the structure would work,
because evaluating / code-generating an uptalk operator needs to have
the top-level expression in order to escape from it to yield None.
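
To make the flat-list evaluation concrete, here is a runnable sketch. The
Trailer class is a hypothetical model that only handles attribute access; it
is not part of any real AST or compiler API.

```python
from types import SimpleNamespace

class Trailer:
    """One step after the atom, e.g. `.eggs` or `?.eggs` (hypothetical model)."""
    def __init__(self, attr, uptalk=False):
        self.attr = attr
        self.uptalk = uptalk

    def evaluate_step(self, value):
        return getattr(value, self.attr)

def evaluate(atom_value, trailers):
    value = atom_value
    for trailer in trailers:
        if trailer.uptalk and value is None:
            return None  # escape from the whole expression
        value = trailer.evaluate_step(value)
    return value

# Models spam?.eggs.cheese
chain = [Trailer("eggs", uptalk=True), Trailer("cheese")]
spam = SimpleNamespace(eggs=SimpleNamespace(cheese="brie"))
```

Here evaluate(None, chain) short-circuits to None, while a spam whose eggs
attribute is None still raises AttributeError at the non-uptalk step.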


From guido at python.org  Tue Sep 29 06:30:34 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Sep 2015 21:30:34 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <m21tdham7l.fsf@fastmail.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
 <1443493329.2298568.396067929.05FAD0A3@webmail.messagingengine.com>
 <CAP7+vJ+VQYN6ekN22Q_KKc2OVSWOd=cP11x22OP-UfzVd+JWEg@mail.gmail.com>
 <m21tdham7l.fsf@fastmail.com>
Message-ID: <CAP7+vJKZ5oOc_3hoGhAfznC3LKv2oHy7a2T-UvHeL5adqsYNDA@mail.gmail.com>

Sounds like we're in violent agreement. :-)



-- 
--Guido van Rossum (on iPad)

From 375956667 at qq.com  Tue Sep 29 08:31:38 2015
From: 375956667 at qq.com (=?gb18030?B?1cXJ8sX0?=)
Date: Tue, 29 Sep 2015 14:31:38 +0800
Subject: [Python-ideas] Maybe python should support arrow syntax for easier
	use async call ?
Message-ID: <tencent_0D98F3817AC6133354ACE638@qq.com>

# The example works with the tornado dev version & Python 3.5


import tornado
from tornado.httpclient import AsyncHTTPClient
from tornado.concurrent import Future 
from tornado.gen import convert_yielded
from functools import wraps
    
Future.__or__ = Future.add_done_callback

def future(func):
    @wraps(func)
    def _(*args, **kwds):
        return convert_yielded(func(*args, **kwds))
    return _



##############

@future
async def ping(url):
    httpclient = AsyncHTTPClient()
    r = await httpclient.fetch(url)
    return r.body.decode('utf-8')


ping("http://baidu.com") | (
    lambda r:print(r.result())
)

"""

Maybe Python should support an arrow syntax to make async calls easier to use?

Currently a lambda can only hold a single expression and has to be wrapped in parentheses ...

FOR EXAMPLE

ping("http://baidu.com") | r ->
    print(r.result())
    print("something else")


I saw some discussion at https://wiki.python.org/moin/AlternateLambdaSyntax
"""


tornado.ioloop.IOLoop.instance().start()

From abarnert at yahoo.com  Tue Sep 29 08:40:12 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 28 Sep 2015 23:40:12 -0700
Subject: [Python-ideas] Maybe python should support arrow syntax for
	easier use async call ?
In-Reply-To: <tencent_0D98F3817AC6133354ACE638@qq.com>
References: <tencent_0D98F3817AC6133354ACE638@qq.com>
Message-ID: <D7534302-5CBA-4591-A620-DF90F2D122F4@yahoo.com>

From a quick glance, it looks like you're converting from coroutines back to callbacks just so you can partially hide the callbacks. Why not just stick with coroutines? Compare:

    ping("http://baidu.com") | r ->
       print(r.result())
       print("something else")

    r = await ping("http://baidu.com")
    print(r.result())
    print("something else")

And this doesn't require a new operator, or multiline lambdas, or a new operator that does its thing and also introduces a multiline lambda, or anything else.
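
The comparison can be tried with nothing but the standard library. This is a
sketch, not the tornado code from the original post: ping here is a
hypothetical stand-in that makes no network request, and asyncio.run assumes
Python 3.7 or later.

```python
import asyncio

async def ping(url):
    # Hypothetical stand-in for an HTTP fetch; no network access is made.
    await asyncio.sleep(0)
    return "response from " + url

async def with_callback(url):
    # Callback style: attach a function to the future.
    seen = []
    task = asyncio.ensure_future(ping(url))
    task.add_done_callback(lambda fut: seen.append(fut.result()))
    await task
    await asyncio.sleep(0)  # give the loop a chance to run the callback
    return seen

async def with_await(url):
    # Coroutine style: the "callback body" is just the following lines.
    seen = []
    seen.append(await ping(url))
    return seen

callback_result = asyncio.run(with_callback("http://example.com"))
await_result = asyncio.run(with_await("http://example.com"))
```

Both variants collect the same result; the second needs no lambda at all.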



From greg.ewing at canterbury.ac.nz  Tue Sep 29 07:55:44 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 Sep 2015 18:55:44 +1300
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
Message-ID: <560A27E0.7080800@canterbury.ac.nz>

Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 12:47 PM, Andrew Barnert <abarnert at yahoo.com 
> <mailto:abarnert at yahoo.com>> wrote:
> 
>         Expr(
>             value=Attribute(
>                 value=Attribute(
>                     value=Name(id='spam'), attr='eggs', uptalk=True),
>                 attr='cheese', uptalk=False))
> 
> Hm, I think the problem is that this way of representing the tree 
> encourages thinking that each attribute (with or without ?) can be 
> treated on its own.

It's hard to think of any other way of representing this in
an AST that makes the short-circuiting behaviour any clearer.

I suspect that displaying an AST isn't really going to be
helpful as a way of documenting the semantics. Because the
semantics aren't really in the AST itself, they're in the
compiler code that interprets the AST.
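
For comparison, this is how the existing ast module nests an ordinary
attribute chain. The sketch uses only today's syntax, so there is no uptalk
field. The outermost node is the last attribute, which is part of why the
tree alone does not show where a short-circuit would escape to.

```python
import ast

node = ast.parse("spam.eggs.cheese", mode="eval").body

outer_attr = node.attr              # 'cheese'
inner_attr = node.value.attr        # 'eggs'
root_name = node.value.value.id     # 'spam'
```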

-- 
Greg

From ncoghlan at gmail.com  Tue Sep 29 10:09:50 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 29 Sep 2015 18:09:50 +1000
Subject: [Python-ideas] Using `or?` as the null coalescing operator
In-Reply-To: <CAMiohog5gNdkjhF_54reSyA8mDjqiqqKGd9SreSpzJGRvyKRYg@mail.gmail.com>
References: <6C2E5579-42A0-423F-AB8C-01B49FA59D67@gmail.com>
 <CADiSq7eR=G1m=KH2CGa7LgVx4jCcYxfnRPC6Ta=f0crFFEmDAQ@mail.gmail.com>
 <CAMiohog5gNdkjhF_54reSyA8mDjqiqqKGd9SreSpzJGRvyKRYg@mail.gmail.com>
Message-ID: <CADiSq7dsemUBXG=FBJqP0KddAQzOOj6cTzEtdLu7f19Sf3bToA@mail.gmail.com>

On 28 September 2015 at 20:50, Koos Zevenhoven <k7hoven at gmail.com> wrote:
> On Mon, Sep 28, 2015 at 10:13 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On 25 September 2015 at 09:07, Alessio Bogon <youtux at gmail.com> wrote:
>>> I really like PEP 0505. The only thing that does not convince me is the `??` operator. I would like to know what you think of an alternative like `or?`:
>>>
>>> a_list = some_list or? []
>>> a_dict = some_dict or? {}
>>>
>
> And have the following syntax options been considered?
>
> a_list = some_list else []

In addition to the syntactic ambiguity Ryan notes, there's no hint
here that we're using "some_list is not None" as the condition rather
than "bool(some_list)"

> a_list = some_list or [] if None

This one isn't supportable at the language grammar level - by the time
we get to the "if" token, the parser will have already interpreted the
first part as "some_list or []".
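
The difference between the two conditions mentioned above (truthiness versus
an explicit None check) shows up as soon as the caller passes an empty list.
A small sketch, with function names invented for the example:

```python
def or_default(some_list):
    # Uses bool(some_list): an empty list is replaced too.
    return some_list or []

def none_default(some_list):
    # Uses "some_list is not None": only None is replaced.
    return some_list if some_list is not None else []

caller_list = []
```

or_default(caller_list) returns a new list, so later appends are invisible to
the caller, whereas none_default(caller_list) returns the caller's own object.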

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mal at egenix.com  Tue Sep 29 11:49:17 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 29 Sep 2015 11:49:17 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>	<CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>	<56097AFB.1040906@oddbird.net>	<CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>	<5609985C.40603@oddbird.net>	<CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>	<56099C6F.90700@oddbird.net>	<36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>	<CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>	<etPan.5609a610.2f0c7c53.76f@Draupnir.home>	<CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>	<etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
Message-ID: <560A5E9D.3070808@egenix.com>

On 28.09.2015 23:49, Guido van Rossum wrote:
> On Mon, Sep 28, 2015 at 2:06 PM, Donald Stufft <donald at stufft.io> wrote:
> 
>> I'm not a big fan of the punctuation though. It took me a minute to
>> realize that post?.tag.lower() was saying if post is None, not if post.tag
>> is None and I feel like it's easy to miss the ?, especially when combined
>> with other punctuation.
>>
> 
> But that's a different point (for the record I'm not a big fan of the ?
> either).

Me neither.

The proposal simply doesn't have the right balance between usefulness
and complexity added to the language (esp. for new Python programmers
to learn in order to be able to read a Python program).

In practice, you can very often write "x or y" instead of
having to use "x if x is not None else y", simply because you're
not only interested in catching the x is None case, but also
want to override an empty string or sequence value with
a default. If you really need to specifically check for None,
"x if x is not None else y" is way more expressive than "x ?? y".

For default parameters with mutable types as values,
I usually write:

def func(x=None):
    if x is None:
        x = []
    ...

IMO, that's better than any of the above, but perhaps that's
just because I don't believe in the "write everything
in a single line" pattern as something we should strive
for in Python.
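
As background on why the None-default idiom exists at all: a mutable default
value is created once, when the function is defined, and is then shared
between calls. A short sketch (function names are illustrative):

```python
def append_shared(item, x=[]):
    # The default list is created once, at definition time.
    x.append(item)
    return x

def append_fresh(item, x=None):
    # A new list is created on every call that omits x.
    if x is None:
        x = []
    x.append(item)
    return x

first_shared = list(append_shared(1))
second_shared = list(append_shared(2))
first_fresh = append_fresh(1)
second_fresh = append_fresh(2)
```

The second call to append_shared sees the item left behind by the first,
while each append_fresh call starts from an empty list.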

The other variants (member and index access) look like
typos to me ;-)

-- 
Marc-Andre Lemburg
eGenix.com


From abarnert at yahoo.com  Tue Sep 29 12:11:30 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 29 Sep 2015 03:11:30 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560A27E0.7080800@canterbury.ac.nz>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CAP7+vJLAqa95CJuZ-kse7cqXVUdkpyKBECDBH7Qc+ieLQBt2yw@mail.gmail.com>
 <560A27E0.7080800@canterbury.ac.nz>
Message-ID: <AF900CDD-C528-4924-9733-B627E232C7F3@yahoo.com>

On Sep 28, 2015, at 22:55, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> Guido van Rossum wrote:
>> On Mon, Sep 28, 2015 at 12:47 PM, Andrew Barnert <abarnert at yahoo.com <mailto:abarnert at yahoo.com>> wrote:
>>        Expr(
>>            value=Attribute(
>>                value=Attribute(
>>                    value=Name(id='spam'), attr='eggs', uptalk=True),
>>                attr='cheese', uptalk=False))
>> Hm, I think the problem is that this way of representing the tree encourages thinking that each attribute (with or without ?) can be treated on its own.
> 
> It's hard to think of any other way of representing this in
> an AST that makes the short-circuiting behaviour any clearer.
> 
> I suspect that displaying an AST isn't really going to be
> helpful as a way of documenting the semantics. Because the
> semantics aren't really in the AST itself, they're in the
> compiler code that interprets the AST.

That's why I gave both an AST and the bytecode (and how it differs from the AST and bytecode with non-uptalked attribution). I think that makes it obvious and unambiguous what the semantics are, to anyone who knows how the compiler handles attribution ASTs, and understands the resulting bytecode.

Of course, as Guido points out, that "anyone who..." is a pretty restricted set, so maybe this wasn't as useful as I intended, and we have to wait for someone to write up the details in a way that's still unambiguous, but also human-friendly.

From steve at pearwood.info  Tue Sep 29 14:43:39 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 29 Sep 2015 22:43:39 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
References: <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
Message-ID: <20150929124339.GS23642@ando.pearwood.info>

On Mon, Sep 28, 2015 at 01:54:09PM -0700, Guido van Rossum wrote:

> If you want to dumb down the feature so that foo?.bar.baz means just
> (foo?.bar).baz then it's useless and I should just reject the PEP.

In case anyone missed it, according to the PEP author Mark Haase, that's 
the behaviour of Dart, and it is useless:

"Your interpretation of Dart's semantics is correct, and I agree that's 
absolutely the wrong way to do it. C# does have the short-circuit 
semantics that you're looking for."

https://mail.python.org/pipermail/python-ideas/2015-September/036495.html



-- 
Steve

From ericsnowcurrently at gmail.com  Tue Sep 29 15:40:21 2015
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 29 Sep 2015 07:40:21 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560A5E9D.3070808@egenix.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
Message-ID: <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>

On Tue, Sep 29, 2015 at 3:49 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 28.09.2015 23:49, Guido van Rossum wrote:
>> But that's a different point (for the record I'm not a big fan of the ?
>> either).
>
> Me neither.

Same here.

>
> The proposal simply doesn't have the right balance between usefulness
> and complexity added to the language (esp. for new Python programmers
> to learn in order to be able to read a Python program).

+1

>
> In practice, you can very often write "x or y" instead of
> having to use "x if x is not None else y", simply because you're
> not only interested in catching the x is None case, but also
> want to override an empty string or sequence value with
> a default. If you really need to specifically check for None,
> "x if x is not None else y" is way more expressive than "x ?? y".
>
> For default parameters with mutable types as values,
> I usually write:
>
> def func(x=None):
>     if x is None:
>         x = []
>     ...

I do the same.  It has the right amount of explicitness and makes the
default-case branch more obvious (subjectively, of course) than the
proposed alternative:

def func(x=None):
    x = x ?? []
    ...

>
> IMO, that's better than any of the above, but perhaps that's
> just because I don't believe in the "write everything
> in a single line" pattern as something we should strive
> for in Python.

Yeah, the language has been pretty successful at striking the right
balance here.  IMO, the proposed syntax doesn't pay off.

-eric

From p.f.moore at gmail.com  Tue Sep 29 16:56:23 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 29 Sep 2015 15:56:23 +0100
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
Message-ID: <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>

On 29 September 2015 at 14:40, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> For default parameters with mutable types as values,
>> I usually write:
>>
>> def func(x=None):
>>     if x is None:
>>         x = []
>>     ...
>
> I do the same.  It has the right amount of explicitness and makes the
> default-case branch more obvious (subjectively, of course) than the
> proposed alternative:
>
> def func(x=None):
>     x = x ?? []

Looking at those two cases in close proximity like that, I have to say
that the explicit if statement wins hands down.

But it's not quite as obvious with multiple arguments where the target
isn't the same as the parameter (for example with a constructor):

def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
    if vertices is None:
        self.vertices = []
    else:
        self.vertices = vertices
    if edges is None:
        self.edges = []
    else:
        self.edges = edges
    if weights is None:
        self.weights = {}
    else:
        self.weights = weights
    if source_nodes is None:
        self.source_nodes = []
    else:
        self.source_nodes = source_nodes
vs

def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
    self.vertices = vertices or? []
    self.edges = edges or? []
    self.weights = weights or? {}
    self.source_nodes = source_nodes or? []

Having said all of that, short circuiting is not important here, so

def default(var, dflt):
    if var is None:
        return dflt
    return var

def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
    self.vertices = default(vertices, [])
    self.edges = default(edges, [])
    self.weights = default(weights, {})
    self.source_nodes = default(source_nodes, [])

is also an option.

In this case, my preference is probably (1) a default() function, (2)
or?, (3) multi-line if. The default() function approach can be used
for cases where the condition is something *other* than "is None" so
that one edges ahead of or? because it's more flexible... (although
(1) wouldn't be an option if short-circuiting really mattered...)
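
The short-circuiting caveat can be seen with a default that is costly or has
side effects. In this sketch, expensive and lazy_default are hypothetical
names; lazy_default is one possible workaround that takes a callable instead
of a value.

```python
def default(var, dflt):
    if var is None:
        return dflt
    return var

calls = []

def expensive():
    # Records each invocation so the evaluation order is visible.
    calls.append("called")
    return []

# The argument is evaluated before default() runs, even though it is unused.
value = default([1, 2], expensive())

def lazy_default(var, make_dflt):
    # Hypothetical variant: pass a callable so the default is built on demand.
    return make_dflt() if var is None else var

lazy_value = lazy_default([1, 2], expensive)  # expensive() is not called here
```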

In practice, of course, I never write a default() function at the
moment, I just use multi-line ifs. Whether that means I'd use an or?
operator, I don't know. Probably - but I'd likely consider it a bit of
a "too many ways of doing the same thing" wart at the same time...

Paul

From chris.barker at noaa.gov  Tue Sep 29 17:30:45 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 29 Sep 2015 08:30:45 -0700
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <20150929034316.GO23642@ando.pearwood.info>
References: <5609BADA.8060801@gmx.com>
 <20150929034316.GO23642@ando.pearwood.info>
Message-ID: <CALGmxELtWEupZw947r_8n_MhVcK84xcr7=1gaZ1SRRLPWM7H+A@mail.gmail.com>

On Mon, Sep 28, 2015 at 8:43 PM, Steven D'Aprano <steve at pearwood.info>
wrote:

> (3) Using a regular expression is probably the "right" answer, at least
> from a comp sci theoretical perspective. This is precisely the sort of
> thing that regexes are designed for. Unfortunately, regex syntax is
> itself a programming language[1], and a particularly cryptic and
> unforgiving one, so even quite experienced coders can have trouble.
>

indeed -- we all know the old maxim:

"I had a problem, and thought "I know, I'll use regular expressions" -- now
I have two problems."

And the Python "obvious way to do it" for simple string manipulation has
always been to see if what you need is in a string method before you bring
out the big guns of REs.

After all, if "use REs" was the answer to simple string manipulation
problems, the string object would have a lot fewer methods.

So: I've frequently had this use-case, too -- it would be a nice
enhancement that would add substantial utility to strings. Whether it uses
an RE under the hood or not should be an implementation detail.

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Tue Sep 29 17:36:27 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 29 Sep 2015 08:36:27 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <5609993E.9010103@mail.de>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <560966C1.1040704@mail.de> <20150928163733.GN23642@ando.pearwood.info>
 <56096F09.40804@mail.de> <3819F6B1-3221-41E2-9103-62B71CBA7708@yahoo.com>
 <5609993E.9010103@mail.de>
Message-ID: <CALGmxEKaH=R7rhMoiG71OMraYY_uGBfeFYtmaqZbKwdzKZnx7w@mail.gmail.com>

On Mon, Sep 28, 2015 at 12:47 PM, Sven R. Kunze <srkunze at mail.de> wrote:

> I've seen experienced coworkers rather adding superfluous pairs of
> parentheses just to make sure or because they still don't know better.


Nothing wrong with superfluous parentheses -- they make the intent clear
and are more robust in the face of refactoring.

In fact, my answer to "precedence could be confusing here" is "use an extra
pair of parentheses and don't worry about it."

-Chris


From rob.cliffe at btinternet.com  Tue Sep 29 18:20:45 2015
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Tue, 29 Sep 2015 17:20:45 +0100
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
Message-ID: <560ABA5D.1070602@btinternet.com>



On 29/09/2015 15:56, Paul Moore wrote:
> On 29 September 2015 at 14:40, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>>> For default parameters with mutable types as values,
>>> I usually write:
>>>
>>> def func(x=None):
>>>      if x is None:
>>>          x = []
>>>      ...
>> I do the same.  It has the right amount of explicitness and makes the
>> default-case branch more obvious (subjectively, of course) than the
>> proposed alternative:
>>
>> def func(x=None):
>>      x = x ?? []
> Looking at those two cases in close proximity like that, I have to say
> that the explicit if statement wins hands down.
>
> But it's not quite as obvious with multiple arguments where the target
> isn't the same as the parameter (for example with a constructor):
>
> def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
>      if vertices is None:
>          self.vertices = []
>      else:
>          self.vertices = vertices
>      if edges is None:
>          self.edges = []
>      else:
>          self.edges = edges
>      if weights is None:
>          self.weights = {}
>      else:
>          self.weights = weights
>      if source_nodes is None:
>          self.source_nodes = []
>      else:
>          self.source_nodes = source_nodes
> vs
>
> def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
>      self.vertices = vertices or? []
>      self.edges = edges or? []
>      self.weights = weights or? {}
>      self.source_nodes = source_nodes or? []
>
> Having said all of that, short circuiting is not important here, so
>
> def default(var, dflt):
>      if var is None:
>          return dflt
>      return var
>
> def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
>      self.vertices = default(vertices, [])
>      self.edges = default(edges, [])
>      self.weights = default(weights, {})
>      self.source_nodes = default(source_nodes, [])
>
> is also an option.
>
>
Why not

def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
     self.vertices     = vertices     if vertices     is not None else []
     self.edges        = edges        if edges        is not None else []
     self.weights      = weights      if weights      is not None else {}
     self.source_nodes = source_nodes if source_nodes is not None else []

Completely explicit.
Self-contained (you don't need to look up a helper function).
Reasonably compact (at least vertically).
Easy to make a change to one of the lines if the logic of that line changes.
Doesn't need a language change.
And if you align the lines (as I have attempted to, although different 
proportional fonts may make it look ragged),
it highlights the common structure of the lines *and* their differences 
(you can see that one line has "{}" instead of "[]" because it stands out).

Rob Cliffe

From srkunze at mail.de  Tue Sep 29 18:35:16 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 29 Sep 2015 18:35:16 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <mucmih$6tg$1@ger.gmane.org>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
 <mucmih$6tg$1@ger.gmane.org>
Message-ID: <560ABDC4.7000308@mail.de>

On 29.09.2015 02:38, Terry Reedy wrote:
> On 9/28/2015 5:48 PM, Luciano Ramalho wrote:
>> Glyph tweeted yesterday that everyone should watch the "Nothing is
>> Something" 35' talk by Sandi Metz at RailsConf 2015. It's great and,
>> in a way, relevant to this discussion.
>>
>> https://www.youtube.com/watch?v=29MAL8pJImQ
>
> I understood Metz as advocating avoiding the nil (None) problem by 
> giving every class an 'active nothing' that has the methods of the 
> class.  We do that for most builtin classes -- 0, (), {}, etc. She 
> also used the identity function with a particular signature in various 
> roles.
>

I might stress here that nobody said there's a single "active nothing". 
There are far more "special case objects" (as Robert C. Martin calls them) 
than 0, (), {}, etc. I fear, however, the stdlib cannot account for 
every special case object possible. Without None available in the first 
place, users would be forced to create their domain-specific special 
case objects.

Since None is available, though, people need to be taught to avoid it, 
which, by the way, she did a really good job of.

Best,
Sven

From g.brandl at gmx.net  Tue Sep 29 18:37:46 2015
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 29 Sep 2015 18:37:46 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
Message-ID: <mueeor$ijg$1@ger.gmane.org>

On 09/29/2015 03:40 PM, Eric Snow wrote:
> On Tue, Sep 29, 2015 at 3:49 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> On 28.09.2015 23:49, Guido van Rossum wrote:
>>> But that's a different point (for the record I'm not a big fan of the ?
>>> either).
>>
>> Me neither.
> 
> Same here.
> 
>>
>> The proposal simply doesn't have the right balance between usefulness
>> and complexity added to the language (esp. for new Python programmers
>> to learn in order to be able to read a Python program).
> 
> +1

I agree as well.

>> In practice, you can very often write "x or y" instead of
>> having to use "x if x is None else y", simply because you're
>> not only interested in catching the x is None case, but also
>> want to override an empty string or sequence value with
>> a default. If you really need to specifically check for None,
>> "x if x is None else y" is way more expressive than "x ?? y".
>>
>> For default parameters with mutable types as values,
>> I usually write:
>>
>> def func(x=None):
>>     if x is None:
>>         x = []
>>     ...
> 
> I do the same.  It has the right amount of explicitness and makes the
> default-case branch more obvious (subjectively, of course) than the
> proposed alternative:
> 
> def func(x=None):
>     x = x ?? []

Looking at this, I think people might call ?? the "WTF operator".  Not a
good sign :)

Georg


From emile at fenx.com  Tue Sep 29 18:58:40 2015
From: emile at fenx.com (Emile van Sebille)
Date: Tue, 29 Sep 2015 09:58:40 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560ABA5D.1070602@btinternet.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
 <560ABA5D.1070602@btinternet.com>
Message-ID: <mueg07$7kt$1@ger.gmane.org>

On 9/29/2015 9:20 AM, Rob Cliffe wrote:
> Why not
>
> def __init__(self, vertices=None, edges=None, weights=None,
> source_nodes=None):
>      self.vertices     = vertices     if vertices     is not None else []
>      self.edges        = edges        if edges        is not None else []
>      self.weights      = weights      if weights      is not None else {}
>      self.source_nodes = source_nodes if source_nodes is not None else []

I don't understand why not:

       self.vertices     = vertices     or []
       self.edges        = edges        or []
       self.weights      = weights      or {}
       self.source_nodes = source_nodes or []


Emile




From ericsnowcurrently at gmail.com  Tue Sep 29 19:15:36 2015
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 29 Sep 2015 11:15:36 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
Message-ID: <CALFfu7AM6UwFnA-FHsZ7D4KQ7Qb7XtP=KPnYneyfzbgyaNQH6w@mail.gmail.com>

On Tue, Sep 29, 2015 at 8:56 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> But it's not quite as obvious with multiple arguments where the target
> isn't the same as the parameter (for example with a constructor):
>
> def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
>     if vertices is None:
>         self.vertices = []
>     else:
>         self.vertices = vertices
>     if edges is None:
>         self.edges = []
>     else:
>         self.edges = edges
>     if weights is None:
>         self.weights = {}
>     else:
>         self.weights = weights
>     if source_nodes is None:
>         self.source_nodes = []
>     else:
>         self.source_nodes = source_nodes

Personally I usually keep the defaults handling separate, like so:

def __init__(self, vertices=None, edges=None, weights=None, source_nodes=None):
    if vertices is None:
        vertices = []
    if edges is None:
        edges = []
    if weights is None:
        weights = {}
    if source_nodes is None:
        source_nodes = []

    self.vertices = vertices
    self.edges = edges
    self.weights = weights
    self.source_nodes = source_nodes

...and given the alternatives presented here, I'd likely continue
doing so.  To me the others are less distinct about how defaults are
set and invite more churn if you have to do anything extra down the
road when composing a default.  Regardless, YMMV.

[snip]
> In practice, of course, I never write a default() function at the
> moment, I just use multi-line ifs. Whether that means I'd use an or?
> operator, I don't know. Probably - but I'd likely consider it a bit of
> a "too many ways of doing the same thing" wart at the same time...

Right.  And it doesn't really pay for itself when measured against that cost.

-eric

From srkunze at mail.de  Tue Sep 29 19:30:21 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 29 Sep 2015 19:30:21 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <mueg07$7kt$1@ger.gmane.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
 <560ABA5D.1070602@btinternet.com> <mueg07$7kt$1@ger.gmane.org>
Message-ID: <560ACAAD.4020705@mail.de>

On 29.09.2015 18:58, Emile van Sebille wrote:
> On 9/29/2015 9:20 AM, Rob Cliffe wrote:
>> Why not
>>
>> def __init__(self, vertices=None, edges=None, weights=None,
>> source_nodes=None):
>>      self.vertices     = vertices     if vertices     is not None else []
>>      self.edges        = edges        if edges        is not None else []
>>      self.weights      = weights      if weights      is not None else {}
>>      self.source_nodes = source_nodes if source_nodes is not None else []
>
> I don't understand why not:
>
>       self.vertices     = vertices     or []
>       self.edges        = edges        or []
>       self.weights      = weights      or {}
>       self.source_nodes = source_nodes or []

People fear that when you pass a special object that behaves like False 
into the constructor, that object is silently replaced by [] or {}.

I for one don't think it's a real issue. However, it has been said that 
people got bitten by this in the past. I don't know what the heck they 
did, but I presume they tried something really really nasty.

Best,
Sven

From barry at python.org  Tue Sep 29 19:35:42 2015
From: barry at python.org (Barry Warsaw)
Date: Tue, 29 Sep 2015 13:35:42 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
Message-ID: <20150929133542.4d04f6dd@anarchist.wooz.org>

On Sep 28, 2015, at 03:04 PM, Carl Meyer wrote:

>But even if they are rejected, I think a simple `??` or `or?` (or
>however it's spelled) operator to reduce the repetition of "x if x is
>not None else y" is worth consideration on its own merits. This operator
>is entirely unambiguous, and I think would be useful and frequently
>used, whether or not ?. and ?[ are added along with it.

But why is it an improvement?  The ternary operator is entirely obvious and
readable, and at least in my experience, is rare enough that the repetition
doesn't hurt my fingers that much.  It seems like such a radical, ugly new
syntax unjustified by the frequency of use and readability improvement.

Cheers,
-Barry

From carl at oddbird.net  Tue Sep 29 19:57:24 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 29 Sep 2015 11:57:24 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <20150929133542.4d04f6dd@anarchist.wooz.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
Message-ID: <560AD104.8030007@oddbird.net>

Hi Barry,

On 09/29/2015 11:35 AM, Barry Warsaw wrote:
> On Sep 28, 2015, at 03:04 PM, Carl Meyer wrote:
> 
>> But even if they are rejected, I think a simple `??` or `or?` (or
>> however it's spelled) operator to reduce the repetition of "x if x is
>> not None else y" is worth consideration on its own merits. This operator
>> is entirely unambiguous, and I think would be useful and frequently
>> used, whether or not ?. and ?[ are added along with it.
> 
> But why is it an improvement?  The ternary operator is entirely obvious and
> readable, and at least in my experience, is rare enough that the repetition
> doesn't hurt my fingers that much.  It seems like such a radical, ugly new
> syntax unjustified by the frequency of use and readability improvement.

I find the repetition irritating enough that I'm tempted to use 'or'
instead, even when I know it's not technically the semantics I want. (In
most cases, the difference probably doesn't matter, and when it actually
does, I probably know that and write out the full ternary.) And I find
plenty of other code using `or` when it ought to be using a ternary with
`is None` (but again, most of the time in practice it's fine.) Most of
this code is in defaults-handling; there've been plenty of examples in
the thread. I find the explicit if-block painful if there's more than
one argument with a None-default to be handled; YMMV.

I agree that I don't love any of the syntax suggestions so far, and
without a less ugly syntax, it's probably dead.

If it was in the language, I'd use it, but I don't feel strongly about it.

Carl


From twangist at gmail.com  Tue Sep 29 20:04:38 2015
From: twangist at gmail.com (Brian O'Neill)
Date: Tue, 29 Sep 2015 14:04:38 -0400
Subject: [Python-ideas]  PEP 505 (None coalescing operators) 	thoughts
Message-ID: <0D5667BE-EF2D-4469-9C61-FB982CE8AE01@gmail.com>

> On 9/29/2015 9:20 AM, Rob Cliffe wrote:
> > Why not
> >
> > def __init__(self, vertices=None, edges=None, weights=None,
> > source_nodes=None):
> >      self.vertices     = vertices     if vertices     is not None else []
> >      self.edges        = edges        if edges        is not None else []
> >      self.weights      = weights      if weights      is not None else {}
> >      self.source_nodes = source_nodes if source_nodes is not None else []
> 
> I don't understand why not:
> 
>        self.vertices     = vertices     or []
>        self.edges        = edges        or []
>        self.weights      = weights      or {}
>        self.source_nodes = source_nodes or []
> 
> 
> Emile

A further virtue of 
    self.vertices = vertices or []
and the like is that they coerce falsy parameters of the wrong type to the falsy object of the correct type.
E.g. if vertices is '' or 0, self.vertices will be set to [], whereas the ternary expression only tests 
for not-None so self.vertices will be set to a probably crazy value.
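
A small sketch of the difference described above. The function names `with_or` and `with_ternary` are made up for illustration; each one isolates a single spelling of the default.

```python
def with_or(vertices=None):
    # Any falsy argument (None, '', 0, (), ...) is replaced by a new list.
    return vertices or []

def with_ternary(vertices=None):
    # Only None is replaced; every other value passes through unchanged.
    return vertices if vertices is not None else []

or_results = [with_or(v) for v in (None, "", 0)]
ternary_results = [with_ternary(v) for v in (None, "", 0)]
```

Here `or_results` comes out as three empty lists, while `ternary_results` keeps the '' and the 0 as they were passed in.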


From carl at oddbird.net  Tue Sep 29 20:07:10 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 29 Sep 2015 12:07:10 -0600
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <0D5667BE-EF2D-4469-9C61-FB982CE8AE01@gmail.com>
References: <0D5667BE-EF2D-4469-9C61-FB982CE8AE01@gmail.com>
Message-ID: <560AD34E.8010905@oddbird.net>

On 09/29/2015 12:04 PM, Brian O'Neill wrote:
> A further virtue of 
> 
>     self.vertices = vertices or []
> 
> and the like is that they coerce falsy parameters of the wrong type to the falsy object of the correct type.
> 
> E.g. if vertices is '' or 0, self.vertices will be set to [], whereas the ternary expression only tests 
> 
> for not-None so self.vertices will be set to a probably crazy value.

Doesn't seem like a virtue to me, seems like it's probably hiding a bug
in the calling code, which may have other ramifications. Better to have
the "crazy value" visible and fail faster, so you can go fix that bug.

Carl


From twangist at gmail.com  Tue Sep 29 20:33:36 2015
From: twangist at gmail.com (Brian O'Neill)
Date: Tue, 29 Sep 2015 14:33:36 -0400
Subject: [Python-ideas]  PEP 505 (None coalescing operators) thoughts
Message-ID: <293BDA57-0C74-4241-AEC7-4AE1E7DA66A1@gmail.com>

> > A further virtue of 
> > 
> >     self.vertices = vertices or []
> > 
> > and the like is that they coerce falsy parameters of the wrong type to the falsy object of the correct type.
> > E.g. if vertices is '' or 0, self.vertices will be set to [], whereas the ternary expression only tests 
> > 
> > for not-None so self.vertices will be set to a probably crazy value.
> 
> Doesn't seem like a virtue to me, seems like it's probably hiding a bug
> in the calling code, which may have other ramifications. Better to have
> the "crazy value" visible and fail faster, so you can go fix that bug.
> 
> Carl

I have to agree. It isn't a "virtue", and it's best not to mask such
mistakes. But it *is* a... property of the shorter construct that it's
more forgiving, and doesn't have the same semantics.

PS -- My first post, and I lost the "Re:" in the Subject, hence this
orphan thread which I'm content to see go no further.


From p.f.moore at gmail.com  Tue Sep 29 20:48:05 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 29 Sep 2015 19:48:05 +0100
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CALFfu7AM6UwFnA-FHsZ7D4KQ7Qb7XtP=KPnYneyfzbgyaNQH6w@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
 <CALFfu7AM6UwFnA-FHsZ7D4KQ7Qb7XtP=KPnYneyfzbgyaNQH6w@mail.gmail.com>
Message-ID: <CACac1F8Z0umc3+jA0JbJn3WMwwh2Wo+mzPFHpLH7S0TY9hMmxA@mail.gmail.com>

On 29 September 2015 at 18:15, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> Personally I usually keep the defaults handling separate, like so:
[...]
> ...and given the alternatives presented here, I'd likely continue
> doing so.  To me the others are less distinct about how defaults are
> set and invite more churn if you have to do anything extra down the
> road when composing a default.  Regardless, YMMV.

Agreed, there are many ways, and the new operator doesn't really add a
huge amount (other than yet another way of doing things).

> [snip]
>> In practice, of course, I never write a default() function at the
>> moment, I just use multi-line ifs. Whether that means I'd use an or?
>> operator, I don't know. Probably - but I'd likely consider it a bit of
>> a "too many ways of doing the same thing" wart at the same time...
>
> Right.  And it doesn't really pay for itself when measured against that cost.

Precisely.
Paul

From abarnert at yahoo.com  Tue Sep 29 22:33:08 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 29 Sep 2015 13:33:08 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <mueg07$7kt$1@ger.gmane.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
 <560ABA5D.1070602@btinternet.com> <mueg07$7kt$1@ger.gmane.org>
Message-ID: <87FB8972-4E5B-4E8E-8967-466E6B95FBB6@yahoo.com>

On Sep 29, 2015, at 09:58, Emile van Sebille <emile at fenx.com> wrote:
> 
>> On 9/29/2015 9:20 AM, Rob Cliffe wrote:
>> Why not
>> 
>> def __init__(self, vertices=None, edges=None, weights=None,
>> source_nodes=None):
>>     self.vertices     = vertices     if vertices     is not None else []
>>     self.edges        = edges        if edges        is not None else []
>>     self.weights      = weights      if weights      is not None else {}
>>     self.source_nodes = source_nodes if source_nodes is not None else []
> 
> I don't understand why not:
> 
>      self.vertices     = vertices     or []
>      self.edges        = edges        or []
>      self.weights      = weights      or {}
>      self.source_nodes = source_nodes or []

Because empty containers are just as falsey as None.

So, if I pass in a shared list, your "vertices or []" will replace it with a new, unshared list; if I pass a tuple because I need an immutable graph, you'll replace it with a mutable list; if I pass in a blist.sortedlist, you'll replace it with a plain list. Worse, this will only happen if the argument I pass happens to be empty, which I may not have thought to test for.

This is the same reason you don't use "if spam:" when you meant "if spam is not None:", which is explained in PEP 8.

Also, I believe the PEP for the ternary if-else (PEP 308) explains why this is an "attractive nuisance" misuse of or, as one of the major arguments for why a ternary expression should be added.
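
A minimal sketch of the shared-list and tuple cases. The class names `GraphOr` and `GraphNone` are hypothetical and reduce the constructor from the thread to a single argument.

```python
class GraphOr:
    def __init__(self, vertices=None):
        # Replaces *any* falsy argument, including an empty container.
        self.vertices = vertices or []

class GraphNone:
    def __init__(self, vertices=None):
        # Replaces only None; an empty container is kept as passed.
        self.vertices = vertices if vertices is not None else []

shared = []                  # empty now, to be filled in later by the caller
g_or = GraphOr(shared)
g_none = GraphNone(shared)
shared.append("v1")          # the caller expects both graphs to see this

frozen_or = GraphOr(())      # caller wanted an immutable tuple
frozen_none = GraphNone(())
```

After the append, `g_none.vertices` is still the caller's list and contains "v1", while `g_or.vertices` is a separate, still-empty list. Likewise `frozen_or.vertices` has become a list, whereas `frozen_none.vertices` is still the tuple. Had `shared` been non-empty at construction time, both classes would have behaved identically, which is what makes the problem easy to miss in testing.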

From jdhardy at gmail.com  Tue Sep 29 22:43:58 2015
From: jdhardy at gmail.com (Jeff Hardy)
Date: Tue, 29 Sep 2015 13:43:58 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <20150929133542.4d04f6dd@anarchist.wooz.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
Message-ID: <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>

On Tue, Sep 29, 2015 at 10:35 AM, Barry Warsaw <barry at python.org> wrote:

> On Sep 28, 2015, at 03:04 PM, Carl Meyer wrote:
>
> >But even if they are rejected, I think a simple `??` or `or?` (or
> >however it's spelled) operator to reduce the repetition of "x if x is
> >not None else y" is worth consideration on its own merits. This operator
> >is entirely unambiguous, and I think would be useful and frequently
> >used, whether or not ?. and ?[ are added along with it.
>
> But why is it an improvement?  The ternary operator is entirely obvious and
> readable, and at least in my experience, is rare enough that the repetition
> doesn't hurt my fingers that much.  It seems like such a radical, ugly new
> syntax unjustified by the frequency of use and readability improvement.
>

I use it all over the place in C# code, where it makes null checks much
cleaner, and the punctuation choice makes sense:

    var x = foo != null ? foo : "";
    var y = foo ?? "";

 (it also has other uses in C# relating to nullable types that aren't
relevant in Python.)

I'd argue the same is true in Python, if a decent way to spell it can be
found:

    x = foo if foo is not None else ""
    y = foo or? ""

It's pure syntactic sugar, but it *is* pretty sweet.

(It would also make get-with-default unnecessary, but since it already
exists that's not a useful argument.)

- Jeff
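
Until some spelling is agreed on, roughly the same effect is available as a plain function. A minimal sketch, assuming a hypothetical helper named coalesce (not an existing builtin); unlike an operator, it always evaluates its second argument:

```python
def coalesce(value, default):
    """Return value unless it is None; otherwise return default."""
    return value if value is not None else default


print(coalesce(None, ""))    # None is replaced by the default
print(coalesce("", "n/a"))   # an empty string is kept, unlike `"" or "n/a"`
print(coalesce(0, 10))       # zero is kept, unlike `0 or 10`
```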

From emile at fenx.com  Tue Sep 29 22:50:43 2015
From: emile at fenx.com (Emile van Sebille)
Date: Tue, 29 Sep 2015 13:50:43 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <87FB8972-4E5B-4E8E-8967-466E6B95FBB6@yahoo.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <CACac1F8OOxGsUwMXQWx_ppKJ8_C5BoRO-5NmPGdSuYiKtbAqfA@mail.gmail.com>
 <560ABA5D.1070602@btinternet.com> <mueg07$7kt$1@ger.gmane.org>
 <87FB8972-4E5B-4E8E-8967-466E6B95FBB6@yahoo.com>
Message-ID: <muetj9$s9h$1@ger.gmane.org>

Thanks -- I think I've got a better handle now on the why of this 
discussion.

Emile


On 9/29/2015 1:33 PM, Andrew Barnert via Python-ideas wrote:
> On Sep 29, 2015, at 09:58, Emile van Sebille <emile at fenx.com> wrote:
>>
>>> On 9/29/2015 9:20 AM, Rob Cliffe wrote:
>>> Why not
>>>
>>> def __init__(self, vertices=None, edges=None, weights=None,
>>> source_nodes=None):
>>>      self.vertices     = vertices     if vertices     is not None else []
>>>      self.edges        = edges        if edges        is not None else []
>>>      self.weights      = weights      if weights      is not None else {}
>>>      self.source_nodes = source_nodes if source_nodes is not None else []
>>
>> I don't understand why not:
>>
>>       self.vertices     = vertices     or []
>>       self.edges        = edges        or []
>>       self.weights      = weights      or {}
>>       self.source_nodes = source_nodes or []
>
> Because empty containers are just as falsey as None.
>
> So, if I pass in a shared list, your "vertices or []" will replace it with a new, unshared list; if I pass a tuple because I need an immutable graph, you'll replace it with a mutable list; if I pass in a blist.sortedlist, you'll replace it with a plain list. Worse, this will only happen if the argument I pass happens to be empty, which I may not have thought to test for.
>
> This is the same reason you don't use "if spam:" when you meant "if spam is not None:", which is explained in PEP 8.
>
> Also, I believe the PEP for ternary if-else explains why this is an "attractive nuisance" misuse of or, as one of the major arguments for why a ternary expression should be added.



From emile at fenx.com  Tue Sep 29 22:55:04 2015
From: emile at fenx.com (Emile van Sebille)
Date: Tue, 29 Sep 2015 13:55:04 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
Message-ID: <muetrf$lh$1@ger.gmane.org>

On 9/29/2015 1:43 PM, Jeff Hardy wrote:

> I'd argue the same is true in Python, if a decent way to spell it can be
> found:
>
>      x = foo if foo is not None else ""
>      y = foo or? ""
>
> It's pure syntactic sugar, but it *is* pretty sweet.

as to or? variants -- I'd rather see nor:

x = foo nor 'foo was None'

Just to add my two cents worth.

Emile




From tjreedy at udel.edu  Tue Sep 29 23:48:22 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 29 Sep 2015 17:48:22 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560ABDC4.7000308@mail.de>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
 <mucmih$6tg$1@ger.gmane.org> <560ABDC4.7000308@mail.de>
Message-ID: <muf0va$grf$1@ger.gmane.org>

On 9/29/2015 12:35 PM, Sven R. Kunze wrote:
> On 29.09.2015 02:38, Terry Reedy wrote:
>> On 9/28/2015 5:48 PM, Luciano Ramalho wrote:
>>> Glyph tweeted yesterday that everyone should watch the "Nothing is
>>> Something" 35' talk by Sandi Metz at RailsConf 2015. It's great and,
>>> in a way, relevant to this discussion.
>>>
>>> https://www.youtube.com/watch?v=29MAL8pJImQ
>>
>> I understood Metz as advocating avoiding the nil (None) problem by
>> giving every class an 'active nothing' that has the methods of the
>> class.  We do that for most builtin classes -- 0, (), {}, etc. She
>> also used the identity function with a particular signature in various
>> roles.
>>
>
> I might stress here that nobody said there's a single "active nothing".

Ruby's nil and Python's None are passive nothings.  Any operation other
than those inherited from Object raises an exception.

> There are far more "special case objects" (as Robert C. Martin calls it)
> than 0, (), {}, etc.

Metz's point is that there is potentially one for most classes that one
might write.

Some people have wondered why Python does not come with a builtin 
identity function.  The answer has been that one is not needed much and 
it is easy to create one.  Metz's answer is that they are very
useful for generalizing classes.  But she also at least implied that 
they should be specific to each situation.  Certainly in Python, if code 
were to check signature, and even type annotation, then a matching id 
function would be needed.
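
For reference, the "easy to create" identity function is a one-liner. A minimal sketch; apply_all is a hypothetical caller, included only to show identity serving as a do-nothing default:

```python
def identity(x):
    """Return the argument unchanged."""
    return x


def apply_all(items, transform=identity):
    """Apply transform to every item; by default, change nothing."""
    return [transform(item) for item in items]


print(apply_all([1, 2, 3]))        # items pass through unchanged
print(apply_all([1, 2, 3], str))   # a real transform replaces the default
```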

> I fear, however, the stdlib cannot account for
 > every special case object possible.

Right.  It is not possible to create a null instance of a class that 
does not yet exist.

> Without None available in the first place,

The problem of a general object is that it is general.  It should either 
be a ghost that does nothing, as with None, or a borg that does
everything, as with the Bottom of some languages.

 >  users would be forced to create their domain-specific special
 > case objects.

Metz recommends doing this voluntarily ;-)
perhaps after an initial prototype.

 > None being available though, people need to be taught to avoid it,
 > which btw. she did a really good job of.

I think None works really well as the always-returned value for 
functions that are really procedures.  The problem comes with returning 
something or None, versus something or raise, or something or null of 
the class of something.

-- 
Terry Jan Reedy


From python-ideas at mgmiller.net  Wed Sep 30 00:08:24 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Tue, 29 Sep 2015 15:08:24 -0700
Subject: [Python-ideas] list as parameter for the split function
In-Reply-To: <CALGmxELtWEupZw947r_8n_MhVcK84xcr7=1gaZ1SRRLPWM7H+A@mail.gmail.com>
References: <5609BADA.8060801@gmx.com>
 <20150929034316.GO23642@ando.pearwood.info>
 <CALGmxELtWEupZw947r_8n_MhVcK84xcr7=1gaZ1SRRLPWM7H+A@mail.gmail.com>
Message-ID: <560B0BD8.1090203@mgmiller.net>

+1 for the feature, given as a tuple.  Agreed with original, parent, and 
grandparent posts, have encountered this numerous times over the years.

Reaching for the re module and docs.python and/or stackoverflow to split a 
string (with two delimiters) feels like swatting a fly with a sledge-hammer. ;)

Conversely, I've not encountered the need as often with .startswith, which does 
support it.

-Mike
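
For comparison, this is what the two cases look like today. The sample string and delimiters are made up for illustration:

```python
import re

text = "alpha,beta;gamma"

# str.split takes a single separator, so splitting on either of two
# delimiters currently means reaching for re.split and a character class.
parts = re.split(r"[,;]", text)
print(parts)

# str.startswith, by contrast, already accepts a tuple of prefixes.
print(text.startswith(("alpha", "omega")))
```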

From greg.ewing at canterbury.ac.nz  Wed Sep 30 01:27:05 2015
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Sep 2015 12:27:05 +1300
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <muetrf$lh$1@ger.gmane.org>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org>
Message-ID: <560B1E49.7050102@canterbury.ac.nz>

Emile van Sebille wrote:

> x = foo nor 'foo was None'

Cute, but unfortunately it conflicts with established
usage of the word 'nor', which would suggest that
a nor b == not (a or b).

-- 
Greg
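
The clash can be made concrete with two small functions. Both names are hypothetical: one implements the established logical meaning of "nor", the other the meaning proposed for the keyword.

```python
def logical_nor(a, b):
    """The established meaning: true only when neither operand is truthy."""
    return not (a or b)


def none_fallback(a, b):
    """The meaning proposed for the keyword: a unless it is None, else b."""
    return a if a is not None else b


foo = None
print(logical_nor(foo, "foo was None"))     # a bool
print(none_fallback(foo, "foo was None"))   # the fallback string
```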

From abarnert at yahoo.com  Wed Sep 30 01:53:58 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 29 Sep 2015 16:53:58 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <muf0va$grf$1@ger.gmane.org>
References: <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <1082717728.2280152.1443469633455.JavaMail.yahoo@mail.yahoo.com>
 <CALxg4FU7dd7Yi=-fiBL_uM1s4tfLHk7ko56GQu2MEJJojx-n8w@mail.gmail.com>
 <mucmih$6tg$1@ger.gmane.org> <560ABDC4.7000308@mail.de>
 <muf0va$grf$1@ger.gmane.org>
Message-ID: <A1222533-A085-4443-B76F-48911ECA2DD4@yahoo.com>

On Sep 29, 2015, at 14:48, Terry Reedy <tjreedy at udel.edu> wrote:
> 
>> On 9/29/2015 12:35 PM, Sven R. Kunze wrote:
>>> On 29.09.2015 02:38, Terry Reedy wrote:
>>>> On 9/28/2015 5:48 PM, Luciano Ramalho wrote:
>>>> Glyph tweeted yesterday that everyone should watch the "Nothing is
>>>> Something" 35' talk by Sandi Metz at RailsConf 2015. It's great and,
>>>> in a way, relevant to this discussion.
>>>> 
>>>> https://www.youtube.com/watch?v=29MAL8pJImQ
>>> 
>>> I understood Metz as advocating avoiding the nil (None) problem by
>>> giving every class an 'active nothing' that has the methods of the
>>> class.  We do that for most builtin classes -- 0, (), {}, etc. She
>>> also used the identity function with a particular signature in various
>>> roles.
>> 
>> I might stress here that nobody said there's a single "active nothing".
> 
> Ruby's nil and Python's None are passive nothings.  Any operation other than those inherited from Object raises an exception.
> 
>> There are far more "special case objects" (as Robert C. Martin calls it)
>> than 0, (), {}, etc.
> 
> Metz's point is that there is potentially one for most classes that one might write.

I don't think this is true.

First, "int" and "float" are such general-use/low-semantics types that "0" or "0.0" doesn't always mean "nothing". If you're talking about counts, or distances from some preferred origin, then yes, 0 is nothing; if you're talking about Unix timestamps, or ratings from 0 to 5 stars, then it's not. That's exactly why there's so much code in C and such languages that passes around -1 for nothing (but of course that only works when your real data is unsigned but small enough to waste a bit using a signed int), and the fact that Python idiomatically uses None instead of -1 is a strength, not a weakness. Likewise, sometimes "" makes a perfectly good null string, but sometimes it doesn't; it's often worth distinguishing between "" (has no middle name) and None (we haven't asked for the middle name yet), for example.

Also, list, set, dict, and most user-defined types are mutable. This means that using [] or Spam() as a type-specific nothing means your nothings are distinct, mutable objects. Sometimes that's OK, sometimes it's even explicitly a good thing, but sometimes it definitely isn't.

In a language that encouraged use of more finely-grained types (so you never use "int", you use "Rating", which is constrained to 0-5), the idea that each type that's nullable should have its own null makes some sense, and even more so for a pure-immutable language and idioms around type-driven programming. But that's not even close to Python.
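
The middle-name case can be sketched like this. Both function names are hypothetical; the point is only that a truthiness test merges two states that the None test keeps apart:

```python
def describe_middle_name(middle_name):
    """Distinguish 'not asked yet' (None) from 'has none' ("")."""
    if middle_name is None:
        return "not asked yet"
    if middle_name == "":
        return "has no middle name"
    return "middle name is " + middle_name


def describe_by_truthiness(middle_name):
    """A truthiness test cannot tell None and "" apart."""
    if middle_name:
        return "middle name is " + middle_name
    return "unknown or absent"


for value in (None, "", "Jan"):
    print(repr(value), "->", describe_middle_name(value),
          "/", describe_by_truthiness(value))
```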

> Some people have wondered why Python does not come with a builtin identity function.  The answer has been that one is not needed much and it is easy to create one.  Metz's answer is that they are very useful for generalizing classes.  But she also at least implied that they should be specific to each situation.  Certainly in Python, if code were to check signature, and even type annotation, then a matching id function would be needed.

Having to create an identity function for each type seems like a horrible idea. Even more so in a language that encourages granular typing.

Fortunately, any such language that anyone would actually use probably has parametric genericity, so you could just write a single id function from any type A to the same A, and let the compiler deal with specializing it for each type instead of the programmer. (Or you could make type class definitions provide an id by default, or something else equivalent.)

>> I fear, however, the stdlib cannot account for
> > every special case object possible.
> 
> Right.  It is not possible to create a null instance of a class that does not yet exist.
> 
>> Without None available in the first place,
> 
> The problem of a general object is that it is general.  It should either be a ghost that does nothing, as with None, or a borg that does everything, as with the Bottom of some languages.

It might be worth having both. But maybe not; personally, while I've occasionally created a Python-like ghost in Smalltalk, I've never wanted a Smalltalk-like borg in Python. What I have wanted, quite often, is to write code that locally, explicitly, treats the ghost as a borg. And that's exactly what "is not None" tests are for, and we've all used them. This proposal isn't adding the concept to the language or idiom, just providing syntactic sugar to make an already widely-used feature easier to use.

> >  users would be forced to create their domain-specific special
> > case objects.
> 
> Metz recommends doing this voluntarily ;-)
> perhaps after an initial prototype.
> 
> > None being available though, people need to be taught to avoid it,
> > which btw. she did a really good job of.
> 
> I think None works really well as the always-returned value for functions that are really procedures.  The problem comes with returning something or None, versus something or raise, or something or null of the class of something.

I think the problem comes with assuming that there is a universal answer here. Sometimes something vs. None is appropriate. Sometimes, raising is appropriate. Sometimes, returning a special value is appropriate. Occasionally, even building a Maybe type and using that (with or without collapsing) is appropriate (even without syntactic support for pattern matching and fmapping, although it's much nicer with...). Python idiomatically uses all of the first three extensively, differently in different situations, but rarely the fourth. Some languages use a different subset.

Suggesting that one of these must always be the answer doesn't seem to be motivated by any real concerns. Does Python code really have a lot more problems with this than C or Swift or some other language that idiomatically uses only one or two different answers? Even if it does (which I doubt), would eliminating one of the three but changing as little else as possible about the language and ecosystem actually help? And would it be even remotely feasible to do so?
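
The first three answers all show up in the builtin types themselves. A short sketch using only existing dict and str methods; the settings dict is made up for illustration:

```python
settings = {"colour": "red"}

# 1. Something or None: dict.get returns None for a missing key.
print(settings.get("size"))

# 2. Something or raise: subscripting a missing key raises KeyError.
try:
    settings["size"]
except KeyError as exc:
    print("KeyError:", exc)

# 3. Something or a special value: str.find returns -1 when the
#    substring is absent, while str.index raises ValueError instead.
print("spam".find("z"))
```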


From abarnert at yahoo.com  Wed Sep 30 01:57:39 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 29 Sep 2015 16:57:39 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560B1E49.7050102@canterbury.ac.nz>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
Message-ID: <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>

On Sep 29, 2015, at 16:27, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> Emile van Sebille wrote:
> 
>> x = foo nor 'foo was None'
> 
> Cute, but unfortunately it conflicts with established
> usage of the word 'nor', which would suggest that
> a nor b == not (a or b).

Agreed. If this is going to be a keyword rather than a symbol, it really has to read like English, or at least like abbreviated English, with the right meaning--something like "foo, falling back to 'foo was None' if needed".  Something that reads like English with a completely different meaning is a bad idea.


From alexander.belopolsky at gmail.com  Wed Sep 30 02:24:50 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 29 Sep 2015 20:24:50 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
Message-ID: <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>

On Tue, Sep 29, 2015 at 7:57 PM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> If this is going to be a keyword rather than a symbol, it really has to
> read like English, or at least like abbreviated English,


I am -1 on this whole idea, but the keyword that comes to mind is "def":

x def []

may be read as x DEFaulting to [].

If this was a Python 4 idea, I would suggest repurposing the rarely used
xor operator: ^ and make x ^ y return the non-None of x and y or None if
both are None.

From alexander.belopolsky at gmail.com  Wed Sep 30 02:27:22 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 29 Sep 2015 20:27:22 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
Message-ID: <CAP7h-xa=BifmuAN7OstWbTZ3n=438edpcSphSHHKLZevQo2x+Q@mail.gmail.com>

On Tue, Sep 29, 2015 at 8:24 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

> ... suggest repurposing the rarely used xor operator: ^ and make x ^ y
> return the non-None of x and y or None if both are None.


.. and x if both are not None to allow x ^= y to work as expected.
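
Spelled out as a function, the suggested rules look like the sketch below. The name proposed_xor is hypothetical; it only stands in for the suggested meaning of x ^ y. The loop checks that, taken together, the three rules agree with "x if x is not None else y" on a few sample values:

```python
def proposed_xor(x, y):
    """Hypothetical stand-in for the suggested meaning of `x ^ y`."""
    if x is None and y is None:
        return None          # both are None
    if x is None:
        return y             # only y is not None
    if y is None:
        return x             # only x is not None
    return x                 # both are not None, so `x ^= y` leaves x alone


samples = [None, 0, "", [], "spam"]
for x in samples:
    for y in samples:
        assert proposed_xor(x, y) is (x if x is not None else y)
```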

From rymg19 at gmail.com  Wed Sep 30 02:39:11 2015
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Tue, 29 Sep 2015 19:39:11 -0500
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
Message-ID: <D91E3BCA-A368-4F9E-8C5A-11ABE744F391@gmail.com>

What about 'otherwise'?

x = a otherwise b


On September 29, 2015 6:57:39 PM CDT, Andrew Barnert via Python-ideas <python-ideas at python.org> wrote:
>On Sep 29, 2015, at 16:27, Greg Ewing <greg.ewing at canterbury.ac.nz>
>wrote:
>> 
>> Emile van Sebille wrote:
>> 
>>> x = foo nor 'foo was None'
>> 
>> Cute, but unfortunately it conflicts with established
>> usage of the word 'nor', which would suggest that
>> a nor b == not (a or b).
>
>Agreed. If this is going to be a keyword rather than a symbol, it
>really has to read like English, or at least like abbreviated English,
>with the right meaning--something like "foo, falling back to 'foo was
>None' if needed".  Something that reads like English with a completely
>different meaning is a bad idea.
>


From neatnate at gmail.com  Wed Sep 30 02:46:55 2015
From: neatnate at gmail.com (Nathan Schneider)
Date: Wed, 30 Sep 2015 01:46:55 +0100
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
Message-ID: <CADQLQrXYpooW9dOThmV1jP2u4r49=_rpJYzKSnrpf==_yPvcDw@mail.gmail.com>

On Wed, Sep 30, 2015 at 1:24 AM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> If this was a Python 4 idea, I would suggest repurposing the rarely used
> xor operator: ^ and make x ^ y return the non-None of x and y or None if
> both are None.
>

I find ^ quite useful for sets, and would rather not see it repurposed in
this way.

Another possibility: There could be a binary version of the tilde operator,
which is currently only unary: x ~ y to mean "x if x is not None else y".

But I am also -1 on the whole idea, as I rarely encounter situations that
would benefit from this construct.

Nathan

From steve at pearwood.info  Wed Sep 30 03:02:15 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 Sep 2015 11:02:15 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <mueeor$ijg$1@ger.gmane.org>
References: <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <mueeor$ijg$1@ger.gmane.org>
Message-ID: <20150930010215.GU23642@ando.pearwood.info>

On Tue, Sep 29, 2015 at 06:37:46PM +0200, Georg Brandl wrote:

> >     x = x ?? []
> 
> Looking at this, I think people might call ?? the "WTF operator".  Not a
> good sign :)

I see your smiley, but C# has this operator. What do C# programmers call 
it? ("Null coalescing operator" is the formal name, but that's way too 
long for everyday use.)


-- 
Steve

From steve at pearwood.info  Wed Sep 30 03:32:36 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 Sep 2015 11:32:36 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560AD34E.8010905@oddbird.net>
References: <0D5667BE-EF2D-4469-9C61-FB982CE8AE01@gmail.com>
 <560AD34E.8010905@oddbird.net>
Message-ID: <20150930013236.GV23642@ando.pearwood.info>

On Tue, Sep 29, 2015 at 12:07:10PM -0600, Carl Meyer wrote:
> On 09/29/2015 12:04 PM, Brian O'Neill wrote:
> > A further virtue of 
> > 
> >     self.vertices = vertices or []
> > 
> > and the like is that they coerce falsy parameters of the wrong type to the falsy object of the correct type.
> > 
> > E.g. if vertices is '' or 0, self.vertices will be set to [], whereas the ternary expression only tests 
> > 
> > for not-None so self.vertices will be set to a probably crazy value.
> 
> Doesn't seem like a virtue to me, seems like it's probably hiding a bug
> in the calling code, which may have other ramifications. Better to have
> the "crazy value" visible and fail faster, so you can go fix that bug.

Agreed.

Assuming the function is intended to, and documented as, using the 
passed in "vertices", using `or` is simply wrong, in two ways:

- if vertices is a falsey value of the wrong type, say, 0.0, it will be 
  silently replaced by [] instead of triggering an exception (usually a 
  TypeError or AttributeError);

- if vertices is a falsey value of the right type, say 
  collections.deque(), it will be silently replaced by [].

In the first case, the code is hiding a bug in the caller. In the second 
case, it's a bug in the called code.

I am very sad to see how many people still use the error-prone `x or y` 
idiom inappropriately, a full decade after PEP 308 was approved. 
(Depending on where you are in the world, it was ten years ago today, or 
yesterday.) `x or y` still has its uses, but testing for None is not one 
of them.



-- 
Steve
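
Both failure modes listed above can be reproduced in a few lines. A minimal sketch, with hypothetical function names standing in for the two ways of handling the "vertices" argument:

```python
from collections import deque


def with_or(vertices=None):
    """The `or` idiom: any falsey argument is replaced by []."""
    return vertices or []


def with_none_test(vertices=None):
    """The None test: only None is replaced by []."""
    return vertices if vertices is not None else []


# Case 1: a falsey value of the wrong type.
hidden = with_or(0.0)            # silently becomes [], hiding the caller's bug
exposed = with_none_test(0.0)    # stays 0.0, so the first real use fails
try:
    exposed.append("v1")
except AttributeError as exc:
    print("caller's bug surfaces:", exc)

# Case 2: a falsey value of the right type.
d = deque()
print(type(with_or(d)).__name__)          # the empty deque was swapped for a list
print(type(with_none_test(d)).__name__)   # the caller's deque is kept
```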

From rob.cliffe at btinternet.com  Wed Sep 30 13:00:19 2015
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Wed, 30 Sep 2015 12:00:19 +0100
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <D91E3BCA-A368-4F9E-8C5A-11ABE744F391@gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <D91E3BCA-A368-4F9E-8C5A-11ABE744F391@gmail.com>
Message-ID: <560BC0C3.4040502@btinternet.com>

Or:
     x = a orelse b        # Visual Basic has a short-circuiting OrElse operator for boolean operands
     x = a orifNone b


On 30/09/2015 01:39, Ryan Gonzalez wrote:
> What about 'otherwise'?
>
> x = a otherwise b
>
>
> On September 29, 2015 6:57:39 PM CDT, Andrew Barnert via Python-ideas 
> <python-ideas at python.org> wrote:
>
>     On Sep 29, 2015, at 16:27, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
>         Emile van Sebille wrote:
>
>             x = foo nor 'foo was None' 
>
>         Cute, but unfortunately it conflicts with established usage of
>         the word 'nor', which would suggest that a nor b == not (a or b). 
>
>
>     Agreed. If this is going to be a keyword rather than a symbol, it really has to read like English, or at least like abbreviated English, with the right meaning--something like "foo, falling back to 'foo was None' if needed".  Something that reads like English with a completely different meaning is a bad idea.
>
>     ------------------------------------------------------------------------
>
>     Python-ideas mailing list
>     Python-ideas at python.org
>     https://mail.python.org/mailman/listinfo/python-ideas
>     Code of Conduct:http://python.org/psf/codeofconduct/
>
> -- Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. 
> CURRENTLY LISTENING TO: Vermilion Fire (Final Fantasy Type-0)
>

From mistersheik at gmail.com  Wed Sep 30 17:28:10 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 08:28:10 -0700 (PDT)
Subject: [Python-ideas] Consider making enumerate a sequence if its argument
	is a sequence
Message-ID: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>

What are the pros and cons of making enumerate a sequence if its argument 
is a sequence?

I found myself writing:

                for vertex, height in zip(
                        self.cache.height_to_vertex[height_slice],
                        range(height_slice.start, height_slice.stop)):

I would have preferred:

                for height, vertex in enumerate(
                        self.cache.height_to_vertex)[height_slice]:



From jdhardy at gmail.com  Wed Sep 30 18:34:06 2015
From: jdhardy at gmail.com (Jeff Hardy)
Date: Wed, 30 Sep 2015 09:34:06 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <20150930010215.GU23642@ando.pearwood.info>
References: <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <etPan.5609a610.2f0c7c53.76f@Draupnir.home>
 <CAP7+vJK-V_Qyca3EiUuJvqAT0Nr0axeOucHpr4XZ6T1SbMSR1A@mail.gmail.com>
 <etPan.5609abd8.f9ec9f8.76f@Draupnir.home>
 <CAP7+vJ+=wJtTszjHLyV4O4Y2qfFxMBL2CCVk=je_yBNdriLw=w@mail.gmail.com>
 <560A5E9D.3070808@egenix.com>
 <CALFfu7CMzZTKmZsGc4QX5sf+z_9X2HZWM-dYBvfGwUNkXz0CAw@mail.gmail.com>
 <mueeor$ijg$1@ger.gmane.org>
 <20150930010215.GU23642@ando.pearwood.info>
Message-ID: <CAF7AXFHsiv1nDiO8V4ewrj2L4YG92gSAHg-GZtQT_XFmgfh-FQ@mail.gmail.com>

On Tue, Sep 29, 2015 at 6:02 PM, Steven D'Aprano <steve at pearwood.info>
wrote:

> On Tue, Sep 29, 2015 at 06:37:46PM +0200, Georg Brandl wrote:
>
> > >     x = x ?? []
> >
> > Looking at this, I think people might call ?? the "WTF operator".  Not a
> > good sign :)
>
> I see your smiley, but C# has this operator. What do C# programmers call
> it? ("Null coalescing operator" is the formal name, but that's way too
> long for everyday use.)
>

I've never seen it referred to as anything other than "the null coalescing
operator" (or occasionally the "double question mark operator"). C# devs
aren't necessarily the most creative bunch... :)

- Jeff

From storchaka at gmail.com  Wed Sep 30 18:36:43 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 30 Sep 2015 19:36:43 +0300
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
Message-ID: <muh32s$8vh$1@ger.gmane.org>

On 30.09.15 18:28, Neil Girdhar wrote:
> What are the pros and cons of making enumerate a sequence if its
> argument is a sequence?
>
> I found myself writing:
>
>                  for vertex, height in zip(
>                          self.cache.height_to_vertex[height_slice],
>                          range(height_slice.start, height_slice.stop)):
>
> I would have preferred:
>
>                  for height, vertex in enumerate(
>                          self.cache.height_to_vertex)[height_slice]:

You can write:

                  for height, vertex in enumerate(
                          self.cache.height_to_vertex[height_slice],
                          height_slice.start):
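
As a quick check that the two spellings agree, here is a small sketch
with made-up data. It assumes `height_slice.start` is a non-negative
integer (not None) and that the slice has the default step of 1; the
list contents are purely illustrative:

```python
height_to_vertex = ["a", "b", "c", "d", "e"]
height_slice = slice(1, 4)

# Original spelling: zip the sliced sequence with a matching range.
via_zip = [
    (height, vertex)
    for vertex, height in zip(
        height_to_vertex[height_slice],
        range(height_slice.start, height_slice.stop))
]

# Suggested spelling: enumerate the slice, starting the count at slice.start.
via_enumerate = list(enumerate(height_to_vertex[height_slice],
                               height_slice.start))

assert via_zip == via_enumerate == [(1, "b"), (2, "c"), (3, "d")]
```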



From jdhardy at gmail.com  Wed Sep 30 18:41:53 2015
From: jdhardy at gmail.com (Jeff Hardy)
Date: Wed, 30 Sep 2015 09:41:53 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
Message-ID: <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>

On Tue, Sep 29, 2015 at 5:24 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> On Tue, Sep 29, 2015 at 7:57 PM, Andrew Barnert via Python-ideas <
> python-ideas at python.org> wrote:
>
>> If this is going to be a keyword rather than a symbol, it really has to
>> read like English, or at least like abbreviated English,
>
>
> I am -1 on this whole idea, but the keyword that comes to mind is "def":
>
> x def []
>
> may be read as x DEFaulting to [].
>

'def' is currently short for 'define', which would be too confusing.
Spelling out 'default' isn't so bad, though:

    self.x = x default []

And if it's going to be that long anyway, we might as well just put a
`default` function in the builtins:

    self.x = default(x, [])

I actually really like 'otherwise', but it's certainly not brief:

    self.x = x if x is not None else []
    self.x = x otherwise []

That said, it's not used *that* often either, so maybe it's acceptable.

- Jeff

From brett at python.org  Wed Sep 30 18:43:12 2015
From: brett at python.org (Brett Cannon)
Date: Wed, 30 Sep 2015 16:43:12 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
Message-ID: <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>

On Wed, 30 Sep 2015 at 09:38 Neil Girdhar <mistersheik at gmail.com> wrote:

> In fairness, one is a superset of the other.  You always get an Iterable.
> You sometimes get a Sequence.  It's a bit like multiplication: with
> integers you get integers, with floats, you get floats.
>

No, it's not like multiplication. =) I hate saying this since I think it's
tossed around too much, but int/float substitution doesn't lead to a Liskov
substitution violation like substituting out a sequence for an iterator
(which is what will happen if the type of the argument to `enumerate`
changes). And since you can just call `list` or `tuple` on enumerate and
get exactly what you're after without potential bugs cropping up if you
don't realize from afar you're affecting an assumption someone made, I'm
-1000 on this idea.
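
A short sketch of the kind of behavioural difference at stake (the
variable names are illustrative only): an iterator is consumed by use,
while a sequence can be traversed and indexed repeatedly.

```python
items = ["a", "b", "c"]

# What enumerate() returns today: a one-shot iterator.
pairs_iter = enumerate(items)
assert list(pairs_iter) == [(0, "a"), (1, "b"), (2, "c")]
assert list(pairs_iter) == []          # already exhausted

# Explicitly materialised: a real sequence with stable contents.
pairs_seq = tuple(enumerate(items))
assert pairs_seq[1] == (1, "b")
assert list(pairs_seq) == list(pairs_seq)

# iter() on an iterator returns the same object; on a sequence, a fresh one.
assert iter(pairs_iter) is pairs_iter
assert iter(pairs_seq) is not pairs_seq
```

Code that relies on one of these behaviours can break if the other
kind of object shows up because an argument changed type elsewhere.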

-Brett


>
> On Wed, Sep 30, 2015 at 12:35 PM Brett Cannon <brett at python.org> wrote:
>
>> On Wed, 30 Sep 2015 at 08:28 Neil Girdhar <mistersheik at gmail.com> wrote:
>>
>>> What are the pros and cons of making enumerate a sequence if its
>>> argument is a sequence?
>>>
>>> I found myself writing:
>>>
>>>                 for vertex, height in zip(
>>>                         self.cache.height_to_vertex[height_slice],
>>>                         range(height_slice.start, height_slice.stop)):
>>>
>>> I would have preferred:
>>>
>>>                 for height, vertex in enumerate(
>>>                         self.cache.height_to_vertex)[height_slice]:
>>>
>>
>> Because you now suddenly have different types and semantics of what
>> enumerate() returns based on its argument which is easy to mess up if
>> self.cache.height_to_vertex became an iterator object itself instead of a
>> sequence object. It's also not hard to simply do `tuple(enumerate(...))` to
>> get the exact semantics you want: TOOWTDI.
>>
>> IOW all I see are cons. =)
>>
>

From rosuav at gmail.com  Wed Sep 30 18:47:44 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 1 Oct 2015 02:47:44 +1000
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
Message-ID: <CAPTjJmpiRpkCkUV=+r3HgX+DF7t-DP3iDztkvpVwueb4cj060A@mail.gmail.com>

On Thu, Oct 1, 2015 at 2:41 AM, Jeff Hardy <jdhardy at gmail.com> wrote:
> 'def' is currently short for 'define', which would be too confusing.
> Spelling out 'default' isn't so bad, though:
>
>     self.x = x default []
>
> And if it's going to be that long anyway, we might as well just put a
> `default` function in the builtins:
>
>     self.x = default(x, [])

I'd prefer it to have language support rather than a builtin, so it
can shortcircuit. It won't often be important, but it would be nice to
be able to put a function call in there or something.
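
A minimal sketch of the difference, assuming a hypothetical `default`
helper written as an ordinary function (no such builtin exists):

```python
calls = []


def expensive_default():
    # Stand-in for a costly fallback; records each time it runs.
    calls.append("called")
    return []


def default(value, fallback):
    # Hypothetical helper: by the time we get here, `fallback` has
    # already been evaluated by the caller.
    return value if value is not None else fallback


x = [1, 2]

# Function form: the fallback expression runs even though x is not None.
result_func = default(x, expensive_default())
assert result_func is x
assert calls == ["called"]

calls.clear()

# Expression form: the fallback is only evaluated when it is needed.
result_expr = x if x is not None else expensive_default()
assert result_expr is x
assert calls == []
```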

ChrisA

From eric at trueblade.com  Wed Sep 30 18:49:49 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 30 Sep 2015 12:49:49 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
Message-ID: <560C12AD.90305@trueblade.com>

On 09/30/2015 12:41 PM, Jeff Hardy wrote:
> 'def' is currently short for 'define', which would be too confusing.
> Spelling out 'default' isn't so bad, though:
> 
>     self.x = x default []
> 
> And if it's going to be that long anyway, we might as well just put a
> `default` function in the builtins:
> 
>     self.x = default(x, [])

You lose the short circuiting.

> I actually really like 'otherwise', but it's certainly not brief:
>     
>     self.x = x if x is not None else []
>     self.x = x otherwise []

I'm -1 on needing syntax for this, but if we're going to do it, this is
my favorite version I've seen so far. The usual caveats about adding a
keyword apply.

Eric.



From mistersheik at gmail.com  Wed Sep 30 18:53:47 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 16:53:47 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
Message-ID: <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>

Can you help me understand how this is a Liskov substitution violation?  A
Sequence is an Iterator.  Getting the sequence back should never hurt.  The
current interface doesn't promise that the returned object won't have
additional methods or implement additional interfaces, does it?

On Wed, Sep 30, 2015 at 12:43 PM Brett Cannon <brett at python.org> wrote:

> On Wed, 30 Sep 2015 at 09:38 Neil Girdhar <mistersheik at gmail.com> wrote:
>
>> In fairness, one is a superset of the other.  You always get an
>> Iterable.  You sometimes get a Sequence.  It's a bit like multiplication:
>> with integers you get integers, with floats, you get floats.
>>
>
> No, it's not like multiplication. =) I hate saying this since I think it's
> tossed around too much, but int/float substitution doesn't lead to a Liskov
> substitution violation like substituting out a sequence for an iterator
> (which is what will happen if the type of the argument to `enumerate`
> changes). And since you can just call `list` or `tuple` on enumerate and
> get exactly what you're after without potential bugs cropping up if you
> don't realize from afar you're affecting an assumption someone made, I'm
> -1000 on this idea.
>
> -Brett
>
>
>>
>> On Wed, Sep 30, 2015 at 12:35 PM Brett Cannon <brett at python.org> wrote:
>>
>>> On Wed, 30 Sep 2015 at 08:28 Neil Girdhar <mistersheik at gmail.com> wrote:
>>>
>>>> What are the pros and cons of making enumerate a sequence if its
>>>> argument is a sequence?
>>>>
>>>> I found myself writing:
>>>>
>>>>                 for vertex, height in zip(
>>>>                         self.cache.height_to_vertex[height_slice],
>>>>                         range(height_slice.start, height_slice.stop)):
>>>>
>>>> I would have preferred:
>>>>
>>>>                 for height, vertex in enumerate(
>>>>                         self.cache.height_to_vertex)[height_slice]:
>>>>
>>>
>>> Because you now suddenly have different types and semantics of what
>>> enumerate() returns based on its argument which is easy to mess up if
>>> self.cache.height_to_vertex became an iterator object itself instead of a
>>> sequence object. It's also not hard to simply do `tuple(enumerate(...))` to
>>> get the exact semantics you want: TOOWTDI.
>>>
>>> IOW all I see are cons. =)
>>>
>>

From chris.barker at noaa.gov  Wed Sep 30 19:03:04 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 30 Sep 2015 10:03:04 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
Message-ID: <CALGmxEKCbAhjPKUmtQen=28y513ZDD7S3_OxnhE2UFHvObajFQ@mail.gmail.com>

On Wed, Sep 30, 2015 at 9:53 AM, Neil Girdhar <mistersheik at gmail.com> wrote:

> Can you help me understand how this is a Liskov substitution violation?  A
> Sequence is an Iterator.  Getting the sequence back should never hurt.
>

no but getting a non-sequence iterator back when you expect a sequence sure
can hurt.

which is why I said that if you want a sequence back from enumerate, it
should always return a sequence. which could (should) be lazy-evaluated.

I think Neil's point is that calling list() or tuple() on it requires that
the entire sequence be evaluated and stored -- if you really only want one
item (and especially not one at the end), that could be a pretty big
performance hit.

Which makes me wonder why ALL iterators couldn't support indexing? It might
work like crap in some cases, but wouldn't it always be as good or better
than wrapping it in a tuple? And then some cases (like enumerate) could do
an index operation efficiently when they are working with "real" sequences.

Maybe a generic lazy_sequence object that could be wrapped around an
iterator to create a lazy-evaluating sequence??
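
A rough sketch of what such a wrapper might look like. The name
`LazySequence` and its behaviour are assumptions made for illustration;
nothing like it exists in the stdlib. Non-negative integer indexes pull
only as many items as needed, while slices, negative indexes, and len()
fall back to consuming the whole iterator:

```python
from itertools import islice


class LazySequence:
    """Hypothetical wrapper that caches items from an iterator on demand."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def _fill_to(self, index):
        # Pull just enough items so that self._cache[index] exists.
        missing = index + 1 - len(self._cache)
        if missing > 0:
            self._cache.extend(islice(self._source, missing))

    def __getitem__(self, index):
        if isinstance(index, slice) or index < 0:
            # These need the full length, so consume everything.
            self._cache.extend(self._source)
            return self._cache[index]
        self._fill_to(index)
        return self._cache[index]  # raises IndexError if too short

    def __len__(self):
        self._cache.extend(self._source)
        return len(self._cache)


consumed = []


def numbers():
    # Generator that records how far it has been advanced.
    for n in range(1000):
        consumed.append(n)
        yield n


lazy = LazySequence(numbers())
assert lazy[2] == 2
assert consumed == [0, 1, 2]   # only three items were ever produced
assert lazy[0] == 0            # served from the cache
```

The cost is memory: everything up to the highest index requested stays
cached, which is the same trade-off as tuple() in the worst case.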

-CHB



> On Wed, Sep 30, 2015 at 12:43 PM Brett Cannon <brett at python.org> wrote:
>
>> On Wed, 30 Sep 2015 at 09:38 Neil Girdhar <mistersheik at gmail.com> wrote:
>>
>>> In fairness, one is a superset of the other.  You always get an
>>> Iterable.  You sometimes get a Sequence.  It's a bit like multiplication:
>>> with integers you get integers, with floats, you get floats.
>>>
>>
>> No, it's not like multiplication. =) I hate saying this since I think
>> it's tossed around too much, but int/float substitution doesn't lead to a
>> Liskov substitution violation like substituting out a sequence for an
>> iterator (which is what will happen if the type of the argument to
>> `enumerate` changes). And since you can just call `list` or `tuple` on
>> enumerate and get exactly what you're after without potential bugs cropping
>> up if you don't realize from afar you're affecting an assumption someone
>> made, I'm -1000 on this idea.
>>
>> -Brett
>>
>>
>>>
>>> On Wed, Sep 30, 2015 at 12:35 PM Brett Cannon <brett at python.org> wrote:
>>>
>>>> On Wed, 30 Sep 2015 at 08:28 Neil Girdhar <mistersheik at gmail.com>
>>>> wrote:
>>>>
>>>>> What are the pros and cons of making enumerate a sequence if its
>>>>> argument is a sequence?
>>>>>
>>>>> I found myself writing:
>>>>>
>>>>>                 for vertex, height in zip(
>>>>>                         self.cache.height_to_vertex[height_slice],
>>>>>                         range(height_slice.start, height_slice.stop)):
>>>>>
>>>>> I would have preferred:
>>>>>
>>>>>                 for height, vertex in enumerate(
>>>>>                         self.cache.height_to_vertex)[height_slice]:
>>>>>
>>>>
>>>> Because you now suddenly have different types and semantics of what
>>>> enumerate() returns based on its argument which is easy to mess up if
>>>> self.cache.height_to_vertex became an iterator object itself instead of a
>>>> sequence object. It's also not hard to simply do `tuple(enumerate(...))` to
>>>> get the exact semantics you want: TOOWTDI.
>>>>
>>>> IOW all I see are cons. =)
>>>>
>>>



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From alexander.belopolsky at gmail.com  Wed Sep 30 19:15:46 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 30 Sep 2015 13:15:46 -0400
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
Message-ID: <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>

On Wed, Sep 30, 2015 at 12:53 PM, Neil Girdhar <mistersheik at gmail.com>
wrote:

> A Sequence is an Iterator.


No, a Sequence is an Iterable, not an Iterator:

>>> issubclass(collections.Sequence, collections.Iterator)
False
>>> issubclass(collections.Sequence, collections.Iterable)
True
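
The same check in a form that also runs on current Python: the ABC
aliases in `collections` were deprecated in 3.3 and removed in 3.10, so
the sketch below imports from `collections.abc` instead. The extra
isinstance checks are added for illustration:

```python
from collections.abc import Iterable, Iterator, Sequence

# A Sequence is an Iterable, but it is not an Iterator.
assert issubclass(Sequence, Iterable)
assert not issubclass(Sequence, Iterator)

# A list is a Sequence; calling iter() on it gives an Iterator.
assert isinstance([1, 2, 3], Sequence)
assert not isinstance([1, 2, 3], Iterator)
assert isinstance(iter([1, 2, 3]), Iterator)

# enumerate objects are Iterators, not Sequences.
assert isinstance(enumerate([1, 2, 3]), Iterator)
assert not isinstance(enumerate([1, 2, 3]), Sequence)
```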

From mistersheik at gmail.com  Wed Sep 30 19:18:44 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 17:18:44 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
Message-ID: <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>

Ah good point.  Well, in the case of a sequence argument, an enumerate
object could be both a sequence and an iterator.

On Wed, Sep 30, 2015 at 1:15 PM Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> On Wed, Sep 30, 2015 at 12:53 PM, Neil Girdhar <mistersheik at gmail.com>
> wrote:
>
>> A Sequence is an Iterator.
>
>
> No, a Sequence is an Iterable, not an Iterator:
>
> >>> issubclass(collections.Sequence, collections.Iterator)
> False
> >>> issubclass(collections.Sequence, collections.Iterable)
> True
>

From mistersheik at gmail.com  Wed Sep 30 19:19:53 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 17:19:53 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
Message-ID: <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>

I guess, I'm just asking for enumerate to go through the same change that
range went through.  Why wasn't it a problem for range?

On Wed, Sep 30, 2015 at 1:18 PM Neil Girdhar <mistersheik at gmail.com> wrote:

> Ah good point.  Well, in the case of a sequence argument, an enumerate
> object could be both a sequence and an iterator.
>
> On Wed, Sep 30, 2015 at 1:15 PM Alexander Belopolsky <
> alexander.belopolsky at gmail.com> wrote:
>
>>
>> On Wed, Sep 30, 2015 at 12:53 PM, Neil Girdhar <mistersheik at gmail.com>
>> wrote:
>>
>>> A Sequence is an Iterator.
>>
>>
>> No, a Sequence is an Iterable, not an Iterator:
>>
>> >>> issubclass(collections.Sequence, collections.Iterator)
>> False
>> >>> issubclass(collections.Sequence, collections.Iterable)
>> True
>>
>

From gokoproject at gmail.com  Wed Sep 30 19:27:15 2015
From: gokoproject at gmail.com (John Wong)
Date: Wed, 30 Sep 2015 13:27:15 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560C12AD.90305@trueblade.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
 <560C12AD.90305@trueblade.com>
Message-ID: <CACCLA57ARYHz66EvrxHGikVL2gxm0q+cMzXRk-Hh4bFEh2kpSA@mail.gmail.com>

On Wed, Sep 30, 2015 at 12:49 PM, Eric V. Smith <eric at trueblade.com> wrote:

>
> >
> >     self.x = x if x is not None else []
> >     self.x = x otherwise []
>
> I'm -1 on needing syntax for this, but if we're going to do it, this is
> my favorite version I've seen so far. The usual caveats about adding a
> keyword apply.
>
>
Also feel this is the most intuitive... every other syntax seems really
hard to read. However, the caveat I am thinking of here is that else and
otherwise are synonyms. We really shouldn't go any more complex. Those ?
and null with/without exception variants seem awfully complex to reason
about. I'd spell out 10 lines if I had to.

In fact, is it bad if we make else work for such brevity?

BTW, this syntax just defeats the example in the PEP:

[PEP 0505 - https://www.python.org/dev/peps/pep-0505/]

> This particular formulation has the undesirable effect of putting the
> operands in an unintuitive order: the brain thinks, "use data if possible
> and use [] as a fallback," but the code puts the fallback * before * the
> preferred value.



John

From chris.barker at noaa.gov  Wed Sep 30 19:28:48 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 30 Sep 2015 10:28:48 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
Message-ID: <CALGmxEJk94+4CgKK3tyOUxD3kmAWZzNnXsX2qsjDzj8eoffdpw@mail.gmail.com>

On Wed, Sep 30, 2015 at 10:19 AM, Neil Girdhar <mistersheik at gmail.com>
wrote:

> I guess, I'm just asking for enumerate to go through the same change that
> range went through.  Why wasn't it a problem for range?
>

well, range is simpler -- you don't pass arbitrary iterables into it. It
always has to compute integer values according to start, stop, step -- easy
to implement as either iteration or indexing.
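
For reference, a small sketch of what range offers in Python 3: every
operation below is computed from start, stop, and step without storing
the values. The specific numbers are illustrative only:

```python
from collections.abc import Sequence

r = range(10, 20, 2)           # 10, 12, 14, 16, 18

assert isinstance(r, Sequence)
assert len(r) == 5
assert r[1] == 12
assert r[-1] == 18
assert r[1:3] == range(12, 16, 2)   # slicing yields another range
assert 14 in r

# Unlike an iterator, a range can be traversed more than once.
assert list(r) == list(r) == [10, 12, 14, 16, 18]
```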

enumerate, on the other hand, takes an arbitrary iterable -- so it can't
just index into that iterable if asked for an index.

You are right, of course, that it COULD do that if it was passed a sequence
in the first place, but then you have an interface whereby you get a
different kind of object depending on how you created it, which is pretty
ugly.

But again, we could add indexing to enumerate, and have it do the ugly
inefficient thing when it's using an underlying non-indexable iterator, and
do the efficient thing when it has a sequence to work with, thereby
providing the same API regardless.
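
As a rough illustration of that last idea, here is a minimal sketch. The
class name IndexableEnumerate is hypothetical (it is not an existing or
proposed API), and negative indexes are left out to keep it short:

```python
from collections.abc import Sequence
from itertools import islice


class IndexableEnumerate:
    """Hypothetical enumerate-like wrapper that also supports indexing."""

    def __init__(self, iterable, start=0):
        self._iterable = iterable
        self._start = start

    def __iter__(self):
        # Plain iteration behaves like the builtin enumerate.
        return enumerate(self._iterable, self._start)

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indexes are not supported in this sketch")
        if isinstance(self._iterable, Sequence):
            # Efficient path: direct lookup in the underlying sequence.
            return (self._start + index, self._iterable[index])
        # Inefficient path: walk the iterable up to the requested position.
        # For a one-shot iterator this consumes items as a side effect.
        for item in islice(self._iterable, index, index + 1):
            return (self._start + index, item)
        raise IndexError("index out of range")


print(IndexableEnumerate("abcd")[2])              # sequence path
print(IndexableEnumerate(c for c in "abcd")[2])   # iterator path
```

Both calls return the same pair, but the second one has silently advanced
the generator, which is the "ugly" part of offering one API for both cases.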

-CHB





> On Wed, Sep 30, 2015 at 1:18 PM Neil Girdhar <mistersheik at gmail.com>
> wrote:
>
>> Ah good point.  Well, in the case of a sequence argument, an enumerate
>> object could be both a sequence and an iterator.
>>
>> On Wed, Sep 30, 2015 at 1:15 PM Alexander Belopolsky <
>> alexander.belopolsky at gmail.com> wrote:
>>
>>>
>>> On Wed, Sep 30, 2015 at 12:53 PM, Neil Girdhar <mistersheik at gmail.com>
>>> wrote:
>>>
>>>> A Sequence is an Iterator.
>>>
>>>
>>> No, a Sequence is an Iterable, not an Iterator:
>>>
>>> >>> issubclass(collections.Sequence, collections.Iterator)
>>> False
>>> >>> issubclass(collections.Sequence, collections.Iterable)
>>> True
>>>
>>



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From storchaka at gmail.com  Wed Sep 30 19:33:05 2015
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 30 Sep 2015 20:33:05 +0300
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
Message-ID: <muh6ci$8o1$1@ger.gmane.org>

On 30.09.15 20:18, Neil Girdhar wrote:
> Ah good point.  Well, in the case of a sequence argument, an enumerate
> object could be both a sequence and an iterator.

It can't be.

For sequence:

 >>> x = 'abcd'
 >>> list(zip(x, x))
[('a', 'a'), ('b', 'b'), ('c', 'c'), ('d', 'd')]

For iterator:

 >>> x = iter('abcd')
 >>> list(zip(x, x))
[('a', 'b'), ('c', 'd')]



From anthony at xtfx.me  Wed Sep 30 19:35:49 2015
From: anthony at xtfx.me (C Anthony Risinger)
Date: Wed, 30 Sep 2015 12:35:49 -0500
Subject: [Python-ideas] Submitting a job to an asyncio event loop
In-Reply-To: <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>
References: <CAFvThkBK5RjLLXgxZ8ePe0xfQ5O2xKKpT_0oULRAz0PB0X--zQ@mail.gmail.com>
 <CAP7+vJLCGfenQS-p4fwhOzbBogxcev-4Nz9iROuxtB2z-skp9A@mail.gmail.com>
 <CAFvThkCSNiCNLhTJz5Ta_1-B7U4End-XOHE5v1eUcSuttCy8sw@mail.gmail.com>
 <CAP7+vJ+Pc=+sWz1oq+K0u=kjVyyHjYFAT4RxSqtpA5mwRB9PJQ@mail.gmail.com>
Message-ID: <CAGAVQTGZJY5JOArDFx7K40+5bMGDV1ybz7fbS_F1rYcjxPKkFg@mail.gmail.com>

On Sun, Sep 27, 2015 at 11:42 AM, Guido van Rossum <guido at python.org> wrote:

> [...]
>
> I don't think the use case involving multiple event loops in different
> threads is as clear. I am still waiting for someone who is actually trying
> to use this. It might be useful on a system where there is a system event
> loop that must be used for UI events (assuming this event loop can somehow
> be wrapped in a custom asyncio loop) and where an app might want to have a
> standard asyncio event loop for network I/O. Come to think of it, the
> ProactorEventLoop on Windows has both advantages and disadvantages, and
> some app might need to use both that and SelectorEventLoop. But this is a
> real pain (because you can't share any mutable state between event loops).
>

I'm not currently solving the problem this way, but I wanted to do
something like this recently for a custom Mesos framework. The framework
uses a pure-python library called "pesos" that in turn uses a pure-python
libprocess library called "compactor". compactor runs user code in a
private event loop (Mesos registration, etc). I also wanted to run my own
private loop in another thread that interacts with Redis. This loop is
expected to process some incoming updates as commands that must influence
the compactor loop (start reconciliation or some other Mesos-related thing)
and the most straightforward thing to me sounded exactly like this thread:
submitting jobs from one loop to another.

I haven't really delved into making the Redis part an async loop (it's just
threaded right now) as I'm less experienced with writing such code, so
maybe I am overlooking and/or conflating things, but seems reasonable.
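
For illustration only, a minimal sketch of submitting a job from one thread
to an event loop running in another, using asyncio.run_coroutine_threadsafe
from current Python versions. The names handle_command and
start_loop_in_thread are hypothetical, and the coroutine is a stand-in for
the real Mesos or Redis work, which is not shown:

```python
import asyncio
import threading


async def handle_command(name):
    # Stand-in for work that must run on the other loop.
    await asyncio.sleep(0)
    return "handled " + name


def start_loop_in_thread():
    """Start a new event loop in a daemon thread and return (loop, thread)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


worker_loop, worker_thread = start_loop_in_thread()

# Called from a different thread (here, the main thread): schedule the
# coroutine on worker_loop and block until its result is available.
future = asyncio.run_coroutine_threadsafe(
    handle_command("reconcile"), worker_loop)
result = future.result(timeout=5)
print(result)

# Shut the worker loop down cleanly.
worker_loop.call_soon_threadsafe(worker_loop.stop)
worker_thread.join(timeout=5)
worker_loop.close()
```

The returned object is a concurrent.futures.Future, so the submitting side
does not need an event loop of its own to wait for the result.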

-- 

C Anthony

From chris.barker at noaa.gov  Wed Sep 30 19:47:29 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 30 Sep 2015 10:47:29 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <muh6ci$8o1$1@ger.gmane.org>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <muh6ci$8o1$1@ger.gmane.org>
Message-ID: <CALGmxE+BsVzZzj7NNEOLjKyr5=-L8ad0PGr=u+OQJQZ7QimJRQ@mail.gmail.com>

On Wed, Sep 30, 2015 at 10:33 AM, Serhiy Storchaka <storchaka at gmail.com>
wrote:

> On 30.09.15 20:18, Neil Girdhar wrote:
>
>> Ah good point.  Well, in the case of a sequence argument, an enumerate
>> object could be both a sequence and an iterator.
>>
>
> It can't be.
>
> For sequence:
>
> >>> x = 'abcd'
> >>> list(zip(x, x))
> [('a', 'a'), ('b', 'b'), ('c', 'c'), ('d', 'd')]
>
> For iterator:
>
> >>> x = iter('abcd')
> >>> list(zip(x, x))
> [('a', 'b'), ('c', 'd')]


well, that's because zip is using the same iterator in two places. would
that ever be the case with enumerate?

-CHB









-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jimjjewett at gmail.com  Wed Sep 30 19:55:49 2015
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Wed, 30 Sep 2015 13:55:49 -0400
Subject: [Python-ideas] secrets module -- secret.keeper?
Message-ID: <CA+OGgf6bbMcfP=UmkKTN93Bkg+2UyTuLKz==GZeZZ0yxowyV4g@mail.gmail.com>

Will the secrets module offer any building blocks to actually protect a secret?

e.g.,

an easy way to encrypt a file with a given password?
an encrypted datastore?
a getpass that works even in IDLE?

-jJ

From mistersheik at gmail.com  Wed Sep 30 20:10:33 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 18:10:33 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <muh6ci$8o1$1@ger.gmane.org>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <muh6ci$8o1$1@ger.gmane.org>
Message-ID: <CAA68w_khsZ_WsQrC_daKZefi1NtzaF1S6MjvzCyC2tp0kz6kkg@mail.gmail.com>

Ah, that's a great point.  Thanks.  I guess range was never an Iterator,
which is a key difference.

On Wed, Sep 30, 2015 at 1:33 PM Serhiy Storchaka <storchaka at gmail.com>
wrote:

> On 30.09.15 20:18, Neil Girdhar wrote:
> > Ah good point.  Well, in the case of a sequence argument, an enumerate
> > object could be both a sequence and an iterator.
>
> It can't be.
>
> For sequence:
>
>  >>> x = 'abcd'
>  >>> list(zip(x, x))
> [('a', 'a'), ('b', 'b'), ('c', 'c'), ('d', 'd')]
>
> For iterator:
>
>  >>> x = iter('abcd')
>  >>> list(zip(x, x))
> [('a', 'b'), ('c', 'd')]
>
>

From mal at egenix.com  Wed Sep 30 20:11:00 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 30 Sep 2015 20:11:00 +0200
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>	<CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>	<CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>	<CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>	<CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>	<CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>	<CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
Message-ID: <560C25B4.5050000@egenix.com>

On 30.09.2015 19:19, Neil Girdhar wrote:
> I guess, I'm just asking for enumerate to go through the same change that
> range went through.  Why wasn't it a problem for range?

range() returns a list in Python 2 and a generator in Python 3.

enumerate() has never returned a sequence. It was one
of the first builtin APIs in Python to return a generator:

https://www.python.org/dev/peps/pep-0279/

after iterators and generators were introduced to the language:

https://www.python.org/dev/peps/pep-0234/
https://www.python.org/dev/peps/pep-0255/

The main purpose of enumerate() is to allow enumeration of
objects in a sequence or other iterable. If you need a sequence,
simply wrap it with list(), e.g. list(enumerate(sequence)).
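
A small sketch of that suggestion (the variable names are illustrative):

```python
letters = ["a", "b", "c"]

pairs = enumerate(letters)
print(hasattr(pairs, "__next__"))      # enumerate objects are iterators
print(hasattr(pairs, "__getitem__"))   # and do not support indexing

indexed = list(enumerate(letters))     # materialize when a sequence is needed
print(indexed[1])
```

The cost is that list() builds every pair up front, which is exactly what
a lazy sequence would avoid.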

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 30 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-25: Started a Python blog ... ...          http://malemburg.com/
2015-10-21: Python Meeting Duesseldorf ...                 21 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From python-ideas at mgmiller.net  Wed Sep 30 20:13:14 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Wed, 30 Sep 2015 11:13:14 -0700
Subject: [Python-ideas] secrets module -- secret.keeper?
In-Reply-To: <CA+OGgf6bbMcfP=UmkKTN93Bkg+2UyTuLKz==GZeZZ0yxowyV4g@mail.gmail.com>
References: <CA+OGgf6bbMcfP=UmkKTN93Bkg+2UyTuLKz==GZeZZ0yxowyV4g@mail.gmail.com>
Message-ID: <560C263A.1010600@mgmiller.net>

Somewhat related, there is a keyring module, the functionality of which I've 
sometimes wished were part of the stdlib:

     https://pypi.python.org/pypi/keyring

It supports the big three OSs.

-Mike


On 2015-09-30 10:55, Jim J. Jewett wrote:
> Will the secrets module offer any building blocks to actually protect a secret?
>
> e.g.,
>
> an easy way to encrypt a file with a given password?
> an encrypted datastore?
> a getpass that works even in IDLE?
>
> -jJ

From abarnert at yahoo.com  Wed Sep 30 20:16:33 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 11:16:33 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
	argument is a sequence
In-Reply-To: <CALGmxE+BsVzZzj7NNEOLjKyr5=-L8ad0PGr=u+OQJQZ7QimJRQ@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <muh6ci$8o1$1@ger.gmane.org>
 <CALGmxE+BsVzZzj7NNEOLjKyr5=-L8ad0PGr=u+OQJQZ7QimJRQ@mail.gmail.com>
Message-ID: <054DB797-6E9A-46E4-BF5D-05717C5B1060@yahoo.com>

On Sep 30, 2015, at 10:47, Chris Barker <chris.barker at noaa.gov> wrote:
> 
>> On Wed, Sep 30, 2015 at 10:33 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>>> On 30.09.15 20:18, Neil Girdhar wrote:
>>> Ah good point.  Well, in the case of a sequence argument, an enumerate
>>> object could be both a sequence and an iterator.
>> 
>> It can't be.
>> 
>> For sequence:
>> 
>> >>> x = 'abcd'
>> >>> list(zip(x, x))
>> [('a', 'a'), ('b', 'b'), ('c', 'c'), ('d', 'd')]
>> 
>> For iterator:
>> 
>> >>> x = iter('abcd')
>> >>> list(zip(x, x))
>> [('a', 'b'), ('c', 'd')]
> 
> well, that's because zip is using the same iterator in two places. would that ever be the case with enumerate?

The point is that _nothing_ can be an iterator and a sequence at the same time. (And therefore, an enumerate object can't be both at the same time.)

The zip function is just a handy way of demonstrating the problem; it's not the actual problem. You could also demonstrate it by, e.g., calling len(x), next(x), list(x): If x is an iterator, next(x) will use up the 'a' so list will only give you ['b', 'c', 'd'], even though len gave you 4.

Conceptually: iterators are inherently one-shot iterables; sequences are inherently reusable iterables. While there's no explicit rule that __iter__ can't return self for a sequence, there's no reasonable way to make a sequence that does so. Which means no sequence can be an iterator.
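
A minimal sketch of the difference, using only builtins (the variable names
are illustrative):

```python
seq = "abcd"
it = iter("abcd")

# A sequence hands out a fresh iterator each time, so it can be reused.
assert iter(seq) is not seq
assert list(seq) == list(seq) == ["a", "b", "c", "d"]

# An iterator returns itself from iter(), so every consumer shares one
# position in the stream.
assert iter(it) is it
first = next(it)
rest = list(it)
print(first, rest)
print(list(it))   # nothing left: the iterator is exhausted
```

Any object whose __iter__ returns self has to behave like `it` here, which
is what rules out also behaving like `seq`.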

From alexander.belopolsky at gmail.com  Wed Sep 30 20:22:29 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 30 Sep 2015 14:22:29 -0400
Subject: [Python-ideas] Fwd: Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAP7h-xbNh84rimxBzGwytYaX8H-deT=X_4-qkdje48aj8Fzs3Q@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP7h-xbNh84rimxBzGwytYaX8H-deT=X_4-qkdje48aj8Fzs3Q@mail.gmail.com>
Message-ID: <CAP7h-xaYr-fvROXGjJURd2Cwi872QDX5ZAu3RCbiZUHjsmUSJg@mail.gmail.com>

On Wed, Sep 30, 2015 at 11:28 AM, Neil Girdhar <mistersheik at gmail.com>
wrote:

> I found myself writing:
>
>                 for vertex, height in zip(
>                         self.cache.height_to_vertex[height_slice],
>                         range(height_slice.start, height_slice.stop)):
>
> I would have preferred:
>
>                 for height, vertex in enumerate(
>                         self.cache.height_to_vertex)[height_slice]:
>

This does not seem to be a big improvement over

                for height, vertex in enumerate(
                        self.cache.height_to_vertex[height_slice],
                        height_slice.start):
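
A quick check that the zip form and the enumerate-with-start form produce
the same pairs. The list `heights` is a hypothetical stand-in for
self.cache.height_to_vertex:

```python
heights = ["v0", "v1", "v2", "v3", "v4", "v5"]   # hypothetical data
height_slice = slice(2, 5)

via_zip = [
    (height, vertex)
    for vertex, height in zip(
        heights[height_slice],
        range(height_slice.start, height_slice.stop))
]
via_enumerate = list(enumerate(heights[height_slice], height_slice.start))

print(via_enumerate)
print(via_zip == via_enumerate)
```

This only holds when the slice has an explicit start and a step of 1; a
slice such as slice(None, 5) or slice(0, 6, 2) would need extra handling.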

From abarnert at yahoo.com  Wed Sep 30 20:26:03 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 11:26:03 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
	argument is a sequence
In-Reply-To: <560C25B4.5050000@egenix.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <560C25B4.5050000@egenix.com>
Message-ID: <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>

On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> On 30.09.2015 19:19, Neil Girdhar wrote:
>> I guess, I'm just asking for enumerate to go through the same change that
>> range went through.  Why wasn't it a problem for range?
> 
> range() returns a list in Python 2 and a generator in Python 3.

No it doesn't. It returns a (lazy) sequence. Not a generator, or any other kind of iterator.

I don't know why so many people seem to believe it returns a generator. (And, when you point out what it returns, most of them say, "Why was that changed from 2.x xrange, which returned a generator?" but xrange never returned a generator either--it returned a lazy almost-a-sequence from the start.)

There's no conceptual reason that Python couldn't have more lazy sequences, and tools to build your own lazy sequences more easily.
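
This is easy to check in Python 3 with the abstract base classes in
collections.abc:

```python
from collections.abc import Iterator, Sequence

r = range(0, 10, 3)   # 0, 3, 6, 9

print(isinstance(r, Sequence))   # True
print(isinstance(r, Iterator))   # False

# Sequence behaviour: length, indexing, slicing, membership, reuse.
print(len(r), r[-1], list(r[1:3]), 6 in r)
print(list(r) == list(r))
```

None of these operations builds the full list of values; range computes
each answer from start, stop, and step.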

However, things do get messy once you get into the details. For example, zip can return a lazy sequence if given only sequences, but what if it's given iterators, or other iterables that aren't sequences? And filter can return something that's sort of like a sequence in that it can be repeatedly iterated, but it can't be randomly accessed. You really need a broader concept that integrates iteration and indexing, as in the C++ standard library. Swift provides the perfect example of how you could do something like that without losing the natural features of Python indexing and iteration. But it turns out to be complicated to explain, and to work with, and you end up writing multiple implementations for each iterable-processing function. I don't think the benefit is worth the cost.

Another alternative is just to wrap any iterable in a caching LazyList type. This runs into complications because there are different choices that make sense for different uses (obviously you have to handle negative indexing, and obviously you have to handle infinite lists, so... Oops!), so it makes more sense to leave that up to the application to supply whatever lazy list type it needs and use it explicitly.
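
To make the trade-off concrete, here is one possible sketch of such a
wrapper. The class name LazyList and its behaviour are assumptions made for
illustration; this version chooses to support infinite sources and
therefore refuses negative indexes:

```python
from itertools import count, islice


class LazyList:
    """Hypothetical caching wrapper: indexable view over any iterable."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def _fill_to(self, index):
        # Pull items from the source until the cache covers `index`.
        needed = index + 1 - len(self._cache)
        if needed > 0:
            self._cache.extend(islice(self._source, needed))

    def __getitem__(self, index):
        if index < 0:
            # A negative index needs the length, which would never be known
            # for an infinite source, so this sketch refuses it.
            raise IndexError("negative indexes are not supported")
        self._fill_to(index)
        return self._cache[index]   # raises IndexError if the source ran out

    def __iter__(self):
        index = 0
        while True:
            try:
                yield self[index]
            except IndexError:
                return
            index += 1


squares = LazyList(n * n for n in count())   # infinite source
print(squares[5], squares[2])
print(list(islice(squares, 4)))
```

An application that only ever wraps finite iterables could make the
opposite choice and support negative indexes by exhausting the source.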

From Nikolaus at rath.org  Wed Sep 30 20:27:18 2015
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 30 Sep 2015 11:27:18 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560B1E49.7050102@canterbury.ac.nz> (Greg Ewing's message of
 "Wed, 30 Sep 2015 12:27:05 +1300")
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
Message-ID: <87r3lf9309.fsf@thinkpad.rath.org>

On Sep 30 2015, Greg Ewing <greg.ewing-F+z8Qja7x9Xokq/tPzqvJg at public.gmane.org> wrote:
> Emile van Sebille wrote:
>
>> x = foo nor 'foo was None'
>
> Cute, but unfortunately it conflicts with established
> usage of the word 'nor', which would suggest that
> a nor b == not (a or b).

The idea of using a named operator instead of some symbol has merit
though. What about "a orn b" or "a orin b" (*or* *i*f *n*one)? The
latter might be especially appealing to Tolkien readers :-).


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From abarnert at yahoo.com  Wed Sep 30 20:32:52 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 18:32:52 +0000 (UTC)
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
References: <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
Message-ID: <1541280537.3561058.1443637972521.JavaMail.yahoo@mail.yahoo.com>

I just remembered that the last few times related things came up, I wrote some blog posts going into details that I didn't want to have to dump on the list:

* http://stupidpythonideas.blogspot.com/2013/08/lazy-restartable-iteration.html
* http://stupidpythonideas.blogspot.com/2014/07/swift-style-map-and-filter-views.html
* http://stupidpythonideas.blogspot.com/2014/07/lazy-cons-lists.html
* http://stupidpythonideas.blogspot.com/2014/07/lazy-python-lists.html
* http://stupidpythonideas.blogspot.com/2015/07/creating-new-sequence-type-is-easy.html

The one about Swift-style map and filter views is, I think, the most interesting here. The tl;dr is that views (lazy sequences) are nifty, and there's nothing actually stopping Python from using them in more places, but they do add complexity, and the benefits probably don't outweigh the costs.
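
As a rough idea of what a "view" means here, a minimal sketch of a lazy map
over a sequence. The class name MapView is hypothetical and this is not the
code from the blog posts:

```python
from collections.abc import Sequence


class MapView(Sequence):
    """Hypothetical lazy view: applies `func` to items of a sequence on access."""

    def __init__(self, func, seq):
        self._func = func
        self._seq = seq

    def __len__(self):
        return len(self._seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing yields another view; nothing is computed yet.
            return MapView(self._func, self._seq[index])
        return self._func(self._seq[index])


doubled = MapView(lambda n: n * 2, range(1_000_000))
print(len(doubled), doubled[10], doubled[-1])
print(list(doubled[5:8]))
```

This works only because the argument is itself a sequence. A filter view
cannot do the same, since the position of the n-th surviving item is not
known without scanning, which is where the added complexity comes from.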



> On Wednesday, September 30, 2015 11:26 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
> > On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> 
>>>  On 30.09.2015 19:19, Neil Girdhar wrote:
>>>  I guess, I'm just asking for enumerate to go through the same 
> change that
>>>  range went through.  Why wasn't it a problem for range?
>> 
>>  range() returns a list in Python 2 and a generator in Python 3.
> 
> No it doesn't. It returns a (lazy) sequence. Not a generator, or any other 
> kind of iterator.
> 
> I don't know why so many people seem to believe it returns a generator. 
> (And, when you point out what it returns, most of them say, "Why was that 
> changed from 2.x xrange, which returned a generator?" but xrange never 
> returned a generator either--it returned a lazy almost-a-sequence from the 
> start.)
> 
> There's no conceptual reason that Python couldn't have more lazy 
> sequences, and tools to build your own lazy sequences more easily.
> 
> However, things do get messy once you get into the details. For example, zip can 
> return a lazy sequence if given only sequences, but what if it's given 
> iterators, or other iterables that aren't sequences; filter can return 
> something that's sort of like a sequence in that it can be repeatedly 
> iterated but it can't be randomly-accessed. You really need a broader 
> concept that integrates iteration and indexing, as in the C++ standard library. 
> Swift provides the perfect example of how you could do something like that 
> without losing the natural features of Python indexing and iteration. But it 
> turns out to be complicated to explain, and to work with, and you end up writing 
> multiple implementations for each iterable-processing function. I don't 
> think the benefit is worth the cost.
> 
> Another alternative is just to wrap any iterable in a caching LazyList type. 
> This runs into complications because there are different choices that make sense 
> for different uses (obviously you have to handle negative indexing, and 
> obviously you have to handle infinite lists, so... Oops!), so it makes more 
> sense to leave that up to the application to supply whatever lazy list type it 
> needs and use it explicitly.
> 

From mistersheik at gmail.com  Wed Sep 30 20:37:24 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 18:37:24 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <1541280537.3561058.1443637972521.JavaMail.yahoo@mail.yahoo.com>
References: <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
 <1541280537.3561058.1443637972521.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <CAA68w_mxDJj15wjcevf8NQj7XaY3KeKnX+a-r_C9Yi=t0hOczA@mail.gmail.com>

Yup, the Swift-style map post is a great blog entry, Andrew, and exactly what I
was proposing for enumerate.  I 100% agree that "views (lazy sequences) are
nifty, and there's nothing actually stopping Python from using them in more
places, but they do add complexity, and the benefits probably don't
outweigh the costs."

However, I wonder what Python will look like 5 years from now.   Maybe it
will be time for more sequences.

On Wed, Sep 30, 2015 at 2:32 PM Andrew Barnert <abarnert at yahoo.com> wrote:

> I just remembered that the last few times related things came up, I wrote
> some blog posts going into details that I didn't want to have to dump on
> the list:
>
> * http://stupidpythonideas.blogspot.com/2013/08/lazy-restartable-iteration.html
> * http://stupidpythonideas.blogspot.com/2014/07/swift-style-map-and-filter-views.html
> * http://stupidpythonideas.blogspot.com/2014/07/lazy-cons-lists.html
> * http://stupidpythonideas.blogspot.com/2014/07/lazy-python-lists.html
> * http://stupidpythonideas.blogspot.com/2015/07/creating-new-sequence-type-is-easy.html
>
> The one about Swift-style map and filter views is, I think, the most
> interesting here. The tl;dr is that views (lazy sequences) are nifty, and
> there's nothing actually stopping Python for using them in more places, but
> they do add complexity, and the benefits probably don't outweigh the costs.
>
>
>
> > On Wednesday, September 30, 2015 11:26 AM, Andrew Barnert <
> abarnert at yahoo.com> wrote:
> > > On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
> >
> >>
> >>>  On 30.09.2015 19:19, Neil Girdhar wrote:
> >>>  I guess, I'm just asking for enumerate to go through the same
> > change that
> >>>  range went through.  Why wasn't it a problem for range?
> >>
> >>  range() returns a list in Python 2 and a generator in Python 3.
> >
> > No it doesn't. It returns a (lazy) sequence. Not a generator, or any
> other
> > kind of iterator.
> >
> > I don't know why so many people seem to believe it returns a generator.
> > (And, when you point out what it returns, most of them say, "Why was that
> > changed from 2.x xrange, which returned a generator?" but xrange never
> > returned a generator either--it returned a lazy almost-a-sequence from
> the
> > start.)
> >
> > There's no conceptual reason that Python couldn't have more lazy
> > sequences, and tools to build your own lazy sequences more easily.
> >
> > However, things do get messy once you get into the details. For example,
> zip can
> > return a lazy sequence if given only sequences, but what if it's given
> > iterators, or other iterables that aren't sequences; filter can return
> > something that's sort of like a sequence in that it can be repeatedly
> > iterated but it can't be randomly-accessed. You really need a broader
> > concept that integrates iteration and indexing, as in the C++ standard
> library.
> > Swift provides the perfect example of how you could do something like
> that
> > without losing the natural features of Python indexing and iteration.
> But it
> > turns out to be complicated to explain, and to work with, and you end up
> writing
> > multiple implementations for each iterable-processing function. I don't
> > think the benefit is worth the cost.
> >
> > Another alternative is just to wrap any iterable in a caching LazyList
> type.
> > This runs into complications because there are different choices that
> make sense
> > for different uses (obviously you have to handle negative indexing, and
> > obviously you have to handle infinite lists, so... Oops!), so it makes
> more
> > sense to leave that up to the application to supply whatever lazy list
> type it
> > needs and use it explicitly.
> >
>

From abarnert at yahoo.com  Wed Sep 30 20:39:04 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 11:39:04 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
Message-ID: <1B6D6E1C-6B5A-45C7-B59F-D671B857BADC@yahoo.com>

On Sep 30, 2015, at 09:41, Jeff Hardy <jdhardy at gmail.com> wrote:
> 
> I actually really like 'otherwise', but it's certainly not brief:
>     
>     self.x = x if x is not None else []
>     self.x = x otherwise []
> 
> That said, it's not used *that* often either, so maybe it's acceptable.

The big problem with "otherwise", "orelse", or anything else that's a synonym of "or" is that it's a synonym of "or". There is nothing to tell the novice, or the sometime Python user who's just come back from 3 months with Ruby or Objective C or whatever, which one means a falsey check and which one means a None check. They both read the same in English, and the difference would be unique to Python rather than a general programming thing.

So you'd end up with hundreds of articles and blog posts explaining, often poorly, the difference and when to use each--just as you see for == vs. ===, eq vs. eql, etc. in other languages.
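
To make the falsey-check versus None-check distinction concrete, here is a minimal sketch in today's syntax. The two helper names are hypothetical and exist only for illustration; the proposed "otherwise" spelling is not valid Python.

```python
def init_with_or(x=None):
    # Falsey check: replaces None, but also 0, "" and empty containers.
    return x or []


def init_with_none_check(x=None):
    # None check: replaces only None and keeps every other value.
    return x if x is not None else []


shared = []
print(init_with_or(shared) is shared)          # False: the empty list was swapped out
print(init_with_none_check(shared) is shared)  # True: the caller's list is kept
print(init_with_or(0), init_with_none_check(0))  # [] 0
```

The two spellings only disagree for falsey values other than None, which is exactly the case a reader cannot infer from the English words.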

From mal at egenix.com  Wed Sep 30 20:43:30 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 30 Sep 2015 20:43:30 +0200
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>	<CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>	<CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>	<CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>	<CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>	<CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>	<CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>	<CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>	<560C25B4.5050000@egenix.com>
 <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
Message-ID: <560C2D52.3080809@egenix.com>



On 30.09.2015 20:26, Andrew Barnert via Python-ideas wrote:
> On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
>>
>>> On 30.09.2015 19:19, Neil Girdhar wrote:
>>> I guess, I'm just asking for enumerate to go through the same change that
>>> range went through.  Why wasn't it a problem for range?
>>
>> range() returns a list in Python 2 and a generator in Python 3.
> 
> No it doesn't. It returns a (lazy) sequence. Not a generator, or any other kind of iterator.

You are right that it's not of a generator type
and more like a lazy sequence. To be exact, it returns
a range object and does implement the iter protocol via
a range_iterator object.

In Python 2 we have the xrange object which has similar
properties, but not the same, e.g. you can't slice it.

> I don't know why so many people seem to believe it returns a generator. (And, when you point out what it returns, most of them say, "Why was that changed from 2.x xrange, which returned a generator?" but xrange never returned a generator either--it returned a lazy almost-a-sequence from the start.)

Perhaps because it behaves like one ? :-)

Unlike an iterator, it doesn't iterate over a sequence, but instead
generates the values on the fly.

FWIW: I don't think many people use the lazy sequence features
of range(), e.g. the slicing or index support. By far most
uses are in for-loops.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 30 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-25: Started a Python blog ... ...          http://malemburg.com/
2015-10-21: Python Meeting Duesseldorf ...                 21 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mistersheik at gmail.com  Wed Sep 30 20:46:04 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 30 Sep 2015 18:46:04 +0000
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <560C2D52.3080809@egenix.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <560C25B4.5050000@egenix.com> <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
 <560C2D52.3080809@egenix.com>
Message-ID: <CAA68w_n-Xrn9PVV4cQj=fvkv3N6DqmDKrX3NCivO=yf_KTu=jQ@mail.gmail.com>

It doesn't behave like a generator because it doesn't implement send,
throw, or close.   It's a sequence because it implements:  __getitem__,
__len__, __contains__, __iter__, __reversed__, index, and count.
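
A short session-style sketch of the methods listed above, assuming Python 3.2 or later; the variable name r is only illustrative.

```python
r = range(0, 20, 2)

# Sequence behaviour: sized, indexable, sliceable, searchable, reversible.
print(len(r))                   # 10
print(r[3], r[-1])              # 6 18
print(r[2:5])                   # range(4, 10, 2)
print(8 in r, 7 in r)           # True False
print(list(reversed(r))[:3])    # [18, 16, 14]
print(r.index(8), r.count(8))   # 4 1

# It can be iterated any number of times, unlike an iterator.
print(list(r) == list(r))       # True

# The generator-only methods are absent.
print([hasattr(r, name) for name in ("send", "throw", "close")])  # [False, False, False]
```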

On Wed, Sep 30, 2015 at 2:43 PM M.-A. Lemburg <mal at egenix.com> wrote:

>
>
> On 30.09.2015 20:26, Andrew Barnert via Python-ideas wrote:
> > On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
> >>
> >>> On 30.09.2015 19:19, Neil Girdhar wrote:
> >>> I guess, I'm just asking for enumerate to go through the same change
> that
> >>> range went through.  Why wasn't it a problem for range?
> >>
> >> range() returns a list in Python 2 and a generator in Python 3.
> >
> > No it doesn't. It returns a (lazy) sequence. Not a generator, or any
> other kind of iterator.
>
> You are right that it's not of a generator type
> and more like a lazy sequence. To be exact, it returns
> a range object and does implement the iter protocol via
> a range_iterator object.
>
> In Python 2 we have the xrange object which has similar
> properties, but not the same, e.g. you can't slice it.
>
> > I don't know why so many people seem to believe it returns a generator.
> (And, when you point out what it returns, most of them say, "Why was that
> changed from 2.x xrange, which returned a generator?" but xrange never
> returned a generator either--it returned a lazy almost-a-sequence from the
> start.)
>
> Perhaps because it behaves like one ? :-)
>
> Unlike an iterator, it doesn't iterate over a sequence, but instead
> generates the values on the fly.
>
> FWIW: I don't think many people use the lazy sequence features
> of range(), e.g. the slicing or index support. By far most
> uses are in for-loops.
>

From alexander.belopolsky at gmail.com  Wed Sep 30 20:49:53 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 30 Sep 2015 14:49:53 -0400
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_n-Xrn9PVV4cQj=fvkv3N6DqmDKrX3NCivO=yf_KTu=jQ@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <560C25B4.5050000@egenix.com>
 <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
 <560C2D52.3080809@egenix.com>
 <CAA68w_n-Xrn9PVV4cQj=fvkv3N6DqmDKrX3NCivO=yf_KTu=jQ@mail.gmail.com>
Message-ID: <CAP7h-xarDsG77sK4o14CMxWwb_8wifBLfv3OzO5dr=9jjwgu3g@mail.gmail.com>

On Wed, Sep 30, 2015 at 2:46 PM, Neil Girdhar <mistersheik at gmail.com> wrote:

> It doesn't behave like a generator because it doesn't implement send,
> throw, or close.


It is not a generator because Python says it is not:

>>> isinstance(range(0), collections.Generator)
False


>   It's a sequence because it implements:  __getitem__, __len__,
> __contains__, __iter__, __reversed__, index, and count.


Ditto

>>> isinstance(range(0), collections.Sequence)
True
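
For readers on newer interpreters: the ABC aliases in the collections namespace were removed in Python 3.10, so the same checks are sketched here against collections.abc. The generator expression is only an illustrative counter-example.

```python
from collections.abc import Generator, Iterator, Sequence

r = range(0)
print(isinstance(r, Generator))  # False
print(isinstance(r, Iterator))   # False
print(isinstance(r, Sequence))   # True

# A real generator, for contrast.
gen = (i for i in range(3))
print(isinstance(gen, Generator), isinstance(gen, Iterator))  # True True
print(isinstance(gen, Sequence))                               # False
```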

From abarnert at yahoo.com  Wed Sep 30 21:19:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 12:19:05 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
	argument is a sequence
In-Reply-To: <560C2D52.3080809@egenix.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <560C25B4.5050000@egenix.com>
 <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
 <560C2D52.3080809@egenix.com>
Message-ID: <08E63279-5DFF-4F7A-9B7B-B927D34BC4FA@yahoo.com>

On Sep 30, 2015, at 11:43, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> On 30.09.2015 20:26, Andrew Barnert via Python-ideas wrote:
>>> On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
>>> 
>>>> On 30.09.2015 19:19, Neil Girdhar wrote:
>>>> I guess, I'm just asking for enumerate to go through the same change that
>>>> range went through.  Why wasn't it a problem for range?
>>> 
>>> range() returns a list in Python 2 and a generator in Python 3.
>> 
>> No it doesn't. It returns a (lazy) sequence. Not a generator, or any other kind of iterator.
> 
> You are right that it's not of a generator type
> and more like a lazy sequence. To be exact, it returns
> a range object and does implement the iter protocol via
> a range_iterator object.

To be exact, it returns an object which returns True for isinstance(r, Sequence), which offers correct implementations of the entire sequence protocol. In other words, it's not "more like a lazy sequence", it's _exactly_ a lazy sequence.

In 2.3-2.5, xrange was a lazy "sequence-like object", and the docs explained how it didn't have all the methods of a sequence but otherwise was like one. When the collections ABCs were added, xrange (2.x)/range (3.x) started claiming to be a sequence, but the implementation was incomplete, so it was defective. This was fixed in 3.2 (which also made all of the sequence methods efficient--e.g., a range that fits into C longs can test an int for __contains__ in constant time).
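
A small sketch of what the efficient sequence methods allow, assuming CPython 3.2 or later and plain int arguments; the name big is only illustrative. None of these lines walks the range element by element.

```python
big = range(0, 10**18, 5)

# Membership for ints is computed arithmetically, not by iteration.
print(10**15 in big)        # True
print(10**15 + 1 in big)    # False

# Indexing and index() are arithmetic as well.
print(big[10**17])          # 500000000000000000
print(big.index(10**15))    # 200000000000000
```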

>> I don't know why so many people seem to believe it returns a generator. (And, when you point out what it returns, most of them say, "Why was that changed from 2.x xrange, which returned a generator?" but xrange never returned a generator either--it returned a lazy almost-a-sequence from the start.)
> 
> Perhaps because it behaves like one ? :-)
> 
> Unlike an iterator, it doesn't iterate over a sequence, but instead
> generates the values on the fly.

You're confusing things even worse here.

A generator is an iterator. It's a perfect subtype relationship.

A range does not behave like a generator, or like any other kind of iterator. It behaves like a sequence.

Laziness is orthogonal to the iterator-vs.-sequenceness. Dictionary views are also lazy but not iterators, for example. And there's nothing stopping you from writing a generator with "yield from f.readlines()" (except that it would be stupid), which would be an iterator despite being not lazy in any useful sense.
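
A minimal sketch of both halves of that claim. The helper eager_lines is hypothetical, and io.StringIO stands in for a real file.

```python
import io
from collections.abc import Iterator

# Lazy but not an iterator: a dict view.
d = {"a": 1}
keys = d.keys()
d["b"] = 2                         # the view sees later changes
print(list(keys))                  # ['a', 'b']
print(list(keys))                  # ['a', 'b'] again: a view is reusable
print(isinstance(keys, Iterator))  # False


# An iterator but not usefully lazy: the whole file is read
# before the first line is yielded.
def eager_lines(f):
    yield from f.readlines()


g = eager_lines(io.StringIO("x\ny\n"))
print(isinstance(g, Iterator))     # True
print(list(g))                     # ['x\n', 'y\n']
print(list(g))                     # []: an iterator is used up after one pass
```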

Maybe the problem is that we don't have enough words. I've tried to use "view" to refer to a lazy non-iterator iterable (dict views, range, NumPy slices), which seems to help within the context of a single long explanation for a single user's problem, but I'm not sure that's something we'd want enshrined in the glossary, since it's a general English word that probably has wider usefulness.

> FWIW: I don't think many people use the lazy sequence features
> of range(), e.g. the slicing or index support. By far most
> uses are in for-loops.

I've used range as a sequence (or at least a reusable iterable, a sized object, and a container). I've answered questions from people on StackOverflow who are doing so, and seen the highest-rep Python answerer on SO suggest such uses to other people.

I don't think I'd ever use the index method (although I did see one SO user who was doing so, to wrap up some arithmetic in a way that avoids a possibly off-by-one error, and wanted to know why it was so slow in 3.1 but worked fine in 3.2...), but there's no reason range should be a defective "not-quite-sequence" instead of a sequence. What would be the point of that?

From random832 at fastmail.com  Wed Sep 30 21:25:54 2015
From: random832 at fastmail.com (Random832)
Date: Wed, 30 Sep 2015 15:25:54 -0400
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
Message-ID: <1443641154.129845.397904033.0BBCF99E@webmail.messagingengine.com>

On Wed, Sep 30, 2015, at 13:19, Neil Girdhar wrote:
> I guess, I'm just asking for enumerate to go through the same change that
> range went through.  Why wasn't it a problem for range?

Range has always returned a sequence.

Anyway, why stop there? Why not have map return a sequence? Zip?
Anything that is a 1:1 mapping (or 1+1:1 in zip's case) could in
principle be changed to return a sequence when given one. Who decides
what does and doesn't benefit from random access?

Or sliceability. It wouldn't be hard, in principle, to write a
general-purpose function for slicing an iterator (i.e. returning an
iterator that yields the elements that slicing a list of the same length
would have given), particularly if it's limited to positive values.

From abarnert at yahoo.com  Wed Sep 30 21:31:39 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 12:31:39 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
	argument is a sequence
In-Reply-To: <08E63279-5DFF-4F7A-9B7B-B927D34BC4FA@yahoo.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <560C25B4.5050000@egenix.com>
 <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
 <560C2D52.3080809@egenix.com>
 <08E63279-5DFF-4F7A-9B7B-B927D34BC4FA@yahoo.com>
Message-ID: <0BF836B6-7451-46E8-8DCB-25270348254E@yahoo.com>

On Sep 30, 2015, at 12:19, Andrew Barnert via Python-ideas <python-ideas at python.org> wrote:
> 
> Maybe the problem is that we don't have enough words. I've tried to use "view" to refer to a lazy non-iterator iterable (dict views, range, NumPy slices), which seems to help within the context of a single long explanation for a single user's problem, but I'm not sure that's something we'd want enshrined in the glossary, since it's a general English word that probably has wider usefulness.

I've just remembered that I said the exact same thing last time this discussion came up (less than 4 months ago), and someone pointed out to me that the docs already define the word "view" in the glossary specifically for dict/mapping views, and use the term "lazy sequence" in that definition, and use the term "virtual sequence" elsewhere.

It's worth noting that dict views are not actually sequences, so defining view in terms of lazy sequence is probably not a good idea...
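
The point about dict views can be checked directly against the ABCs; this is only a sketch, using collections.abc names from the standard library.

```python
from collections.abc import KeysView, Sequence, Set

keys = {"a": 1, "b": 2}.keys()

print(isinstance(keys, KeysView))  # True
print(isinstance(keys, Set))       # True: a keys view is set-like
print(isinstance(keys, Sequence))  # False

# No positional access, so it cannot be a sequence.
try:
    keys[0]
except TypeError as exc:
    print("TypeError:", exc)
```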

Anyway, we probably don't need to invent any new terms; maybe we just need to pick some wording, define it clearly, and use it consistently throughout the docs.

From abarnert at yahoo.com  Wed Sep 30 21:42:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 12:42:05 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
	argument is a sequence
In-Reply-To: <1443641154.129845.397904033.0BBCF99E@webmail.messagingengine.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <1443641154.129845.397904033.0BBCF99E@webmail.messagingengine.com>
Message-ID: <2432916E-9398-45F9-BBB9-A49696837282@yahoo.com>

On Sep 30, 2015, at 12:25, Random832 <random832 at fastmail.com> wrote:
> 
>> On Wed, Sep 30, 2015, at 13:19, Neil Girdhar wrote:
>> I guess, I'm just asking for enumerate to go through the same change that
>> range went through.  Why wasn't it a problem for range?
> 
> Range has always returned a sequence.
> 
> Anyway, why stop there? Why not have map return a sequence?

Even when it's called with a set, or an iterator? Yes, you _could_ do that by lazily adding values to a list as needed, but that could lead to some confusing behavior. For example, len(m) or m[-1] has to evaluate the rest of the input, which could take infinite time (well, it'll run out of memory first?).

> Zip?
> Anything that is a 1:1 mapping (or 1+1:1 in zip's case) could in
> principle be changed to return a sequence when given one. Who decides
> what does and doesn't benefit from random access?

The end user, of course. Some applications will never pass an infinite, or even very long, iterable into map, so they'd want random access and size and reversibility. Others won't ever want those features, but would want to pass in infinite iterators. That's why I think the best answer is to let people write (or install from PyPI) LazyList classes that fit their use cases, instead of trying to come up with one that tries to do everything and is misleading as often as it's useful.
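
As a rough sketch of the kind of application-supplied class meant here (LazyList is a hypothetical name, not an existing stdlib or PyPI API), this version caches items on demand and shows where len() and negative indexes force the whole input to be read:

```python
from collections.abc import Sequence
from itertools import count, islice


class LazyList(Sequence):
    """Hypothetical lazy list: caches items from an iterable on demand."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = []

    def _fill(self, n=None):
        # Pull items until the cache holds n items (everything if n is None).
        if n is None:
            self._cache.extend(self._it)
        else:
            self._cache.extend(islice(self._it, max(0, n - len(self._cache))))

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is left out of this sketch")
        if index < 0:
            self._fill()           # a negative index needs the whole input
        else:
            self._fill(index + 1)
        return self._cache[index]

    def __len__(self):
        self._fill()               # never returns for an infinite input
        return len(self._cache)


squares = LazyList(map(lambda n: n * n, count()))
print(squares[5])                  # 25, after evaluating only six items

finite = LazyList(map(str.upper, "abc"))
print(finite[-1], len(finite))     # C 3
```

Calling len(squares) or squares[-1] here would loop forever, which is the confusing behavior described above.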

It's not actually impossible to design something that does a lot more without being inconsistent or confusing, but it's a bigger change than it appears at first glance, and would add a lot more complexity to the language than I think is worth it for the benefits. Again, see http://stupidpythonideas.blogspot.com/2014/07/swift-style-map-and-filter-views.html for details.

> Or sliceability. It wouldn't be hard, in principle, to write a
> general-purpose function for slicing an iterator (i.e. returning an
> iterator that yields the elements that slicing a list of the same length
> would have given), particularly if it's limited to positive values.

You mean itertools.islice?
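
For reference, a minimal sketch of itertools.islice doing this; it accepts only non-negative start, stop and step values.

```python
from itertools import count, islice

it = iter("abcdefghij")
print(list(islice(it, 2, 8, 3)))      # ['c', 'f'], same as "abcdefghij"[2:8:3]

# Works on unbounded iterators too, since nothing is materialized.
print(list(islice(count(), 5, 8)))    # [5, 6, 7]

# Negative values are rejected rather than emulated.
try:
    islice(iter("abc"), -1)
except ValueError as exc:
    print("ValueError:", exc)
```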


From mal at egenix.com  Wed Sep 30 21:47:17 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 30 Sep 2015 21:47:17 +0200
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <08E63279-5DFF-4F7A-9B7B-B927D34BC4FA@yahoo.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>	<CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>	<CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>	<CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>	<CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>	<CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>	<CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>	<CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>	<560C25B4.5050000@egenix.com>	<147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>	<560C2D52.3080809@egenix.com>
 <08E63279-5DFF-4F7A-9B7B-B927D34BC4FA@yahoo.com>
Message-ID: <560C3C45.1070107@egenix.com>

On 30.09.2015 21:19, Andrew Barnert wrote:
> On Sep 30, 2015, at 11:43, M.-A. Lemburg <mal at egenix.com> wrote:
>>
>>> On 30.09.2015 20:26, Andrew Barnert via Python-ideas wrote:
>>>> On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
>>>>
>>>>> On 30.09.2015 19:19, Neil Girdhar wrote:
>>>>> I guess, I'm just asking for enumerate to go through the same change that
>>>>> range went through.  Why wasn't it a problem for range?
>>>>
>>>> range() returns a list in Python 2 and a generator in Python 3.
>>>
>>> No it doesn't. It returns a (lazy) sequence. Not a generator, or any other kind of iterator.
>>
>> You are right that it's not of a generator type
>> and more like a lazy sequence. To be exact, it returns
>> a range object and does implement the iter protocol via
>> a range_iterator object.
> 
> To be exact, it returns an object which returns True for isinstance(r, Sequence), which offers correct implementations of the entire sequence protocol. In other words, it's not "more like a lazy sequence", it's _exactly_ a lazy sequence.
> 
> In 2.3-2.5, xrange was a lazy "sequence-like object", and the docs explained how it didn't have all the methods of a sequence but otherwise was like one. When the collections ABCs were added, xrange (2.x)/range (3.x) started claiming to be a sequence, but the implementation was incomplete, so it was defective. This was fixed in 3.2 (which also made all of the sequence methods efficient--e.g., a range that fits into C longs can test an int for __contains__ in constant time).
> 
>>> I don't know why so many people seem to believe it returns a generator. (And, when you point out what it returns, most of them say, "Why was that changed from 2.x xrange, which returned a generator?" but xrange never returned a generator either--it returned a lazy almost-a-sequence from the start.)
>>
>> Perhaps because it behaves like one ? :-)
>>
>> Unlike an iterator, it doesn't iterate over a sequence, but instead
>> generates the values on the fly.
> 
> You're confusing things even worse here.

I guess I used the wrong level of detail. I was trying
to explain things in terms of concepts, not object types,
isinstance() and ABCs.

The reason was that the subject line makes a suggestion
which simply doesn't fit the main concept behind enumerate:
that of generating values on the fly instead of allocating
them as a sequence.

We just got sidetracked with range(), since Neil brought
this up as an example of why changing enumerate() should be
possible.

Back on the topic:

>>> arg = range(10)
>>> e = enumerate(arg)
>>> e
<enumerate object at 0x7fcbbc57bd80>
>>> import collections
>>> isinstance(e, collections.Sequence)
False
>>> isinstance(e, collections.Iterator)
True

The way I understand the proposal is that Neil wants the
above to return:

>>> isinstance(e, collections.Sequence)
True
>>> isinstance(e, collections.Iterator)
False

iff isinstance(arg, collections.Sequence)

and because this only makes sense iff e doesn't actually
create a list, enumerate(arg) would have to return a
lazy/virtual/whatever-term-you-use-for-generated-on-the-fly
sequence :-)

Regardless of this breaking backwards compatibility,
what's the benefit of such a change ?

Just like range(), enumerate() is most commonly used
in for-loops, so the added sequence-ishness doesn't buy
you anything much (except the need for more words in
the glossary :-)).
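
For concreteness, a rough sketch of what such an on-the-fly sequence could look like in pure Python. SeqEnumerate and seq_enumerate are hypothetical names, not part of the proposal or of any library, and slicing is left out.

```python
from collections.abc import Iterator, Sequence


class SeqEnumerate(Sequence):
    """Hypothetical lazy sequence of (index, item) pairs over a sequence."""

    def __init__(self, seq, start=0):
        self._seq = seq
        self._start = start

    def __len__(self):
        return len(self._seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is left out of this sketch")
        if index < 0:
            index += len(self._seq)
        if not 0 <= index < len(self._seq):
            raise IndexError("index out of range")
        # The pair is built on access; no list of pairs is ever stored.
        return (self._start + index, self._seq[index])


def seq_enumerate(iterable, start=0):
    # Lazy sequence for sequences, plain enumerate for everything else.
    if isinstance(iterable, Sequence):
        return SeqEnumerate(iterable, start)
    return enumerate(iterable, start)


e = seq_enumerate("abc")
print(isinstance(e, Sequence), isinstance(e, Iterator))  # True False
print(len(e), e[1], e[-1])                               # 3 (1, 'b') (2, 'c')
print(list(e))                                           # [(0, 'a'), (1, 'b'), (2, 'c')]
print(isinstance(seq_enumerate(iter("abc")), Iterator))  # True
```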

-- 
Marc-Andre Lemburg
eGenix.com


From tjreedy at udel.edu  Wed Sep 30 21:57:03 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 30 Sep 2015 15:57:03 -0400
Subject: [Python-ideas] Consider making enumerate a sequence if its
 argument is a sequence
In-Reply-To: <CALGmxEJk94+4CgKK3tyOUxD3kmAWZzNnXsX2qsjDzj8eoffdpw@mail.gmail.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <CALGmxEJk94+4CgKK3tyOUxD3kmAWZzNnXsX2qsjDzj8eoffdpw@mail.gmail.com>
Message-ID: <muheqk$ute$1@ger.gmane.org>

On 9/30/2015 1:28 PM, Chris Barker wrote:

> But again, we could add indexing to enumerate, and have it do the ugly
> inefficient thing when it's using an underlying non-indexable iterator,

If the ugly inefficient thing is to call list(iterable), then that does 
not work with unbounded iterables.  Or the input iterable might produce 
inputs at various times in the future.
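
A small sketch of the unbounded case, with itertools.count standing in for any endless input: today's lazy enumerate handles it, while anything that first calls list() on the input would never return.

```python
from itertools import count, islice

pairs = enumerate(count(10))        # unbounded input, but nothing is evaluated yet
print(list(islice(pairs, 3)))       # [(0, 10), (1, 11), (2, 12)]
print(next(pairs))                  # (3, 13)

# list(count(10)) would run until memory is exhausted, so an indexable
# enumerate built on list(iterable) could not accept this input at all.
```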

-- 
Terry Jan Reedy


From duda.piotr at gmail.com  Wed Sep 30 22:58:33 2015
From: duda.piotr at gmail.com (Piotr Duda)
Date: Wed, 30 Sep 2015 22:58:33 +0200
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <560C12AD.90305@trueblade.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
 <560C12AD.90305@trueblade.com>
Message-ID: <CAJ1Wxn1h7GR4V0_D06dtfoSGwEqPMm2usLXyUSgwBr7Vvo80qw@mail.gmail.com>

What about something like:
z = x if is not None else []



From gokoproject at gmail.com  Wed Sep 30 23:15:11 2015
From: gokoproject at gmail.com (John Wong)
Date: Wed, 30 Sep 2015 17:15:11 -0400
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAJ1Wxn1h7GR4V0_D06dtfoSGwEqPMm2usLXyUSgwBr7Vvo80qw@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net>
 <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net>
 <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
 <560C12AD.90305@trueblade.com>
 <CAJ1Wxn1h7GR4V0_D06dtfoSGwEqPMm2usLXyUSgwBr7Vvo80qw@mail.gmail.com>
Message-ID: <CACCLA576JgZF1kAOD3vyhAS=cxhQ_vyzuC0fvfPHb1OpYA+y3A@mail.gmail.com>

On Wed, Sep 30, 2015 at 4:58 PM, Piotr Duda <duda.piotr at gmail.com> wrote:

> What about something like:
> z = x if is not None else []
>
>

Pretty hard to read. z and x are short, but in a lot of real code that
statement would have more characters, and you are actually better off
with today's syntax anyway.

From abarnert at yahoo.com  Wed Sep 30 23:33:44 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 14:33:44 -0700
Subject: [Python-ideas] Consider making enumerate a sequence if its
	argument is a sequence
In-Reply-To: <560C3C45.1070107@egenix.com>
References: <bc0a8c1e-50fa-43fd-baf8-c80464be958f@googlegroups.com>
 <CAP1=2W7SmOrJVey6ywCRJ+ifU1FxXh--q8uT5CoP-XscVeR__g@mail.gmail.com>
 <CAA68w_k42bEES3xNfjX3bU6Gq2QXPZG4r0+n657T-mvrB_jy2A@mail.gmail.com>
 <CAP1=2W5rCK-yu6V=JTWF7dRBQ7W9mP3ZsbJ5zLPwb-TKnJo+eg@mail.gmail.com>
 <CAA68w_kymM3yPz6tedKrDfxyKgEig4_gkWNPAxqMSii8dBd8Tg@mail.gmail.com>
 <CAP7h-xZDLtm3319jafO_kpojgV5=TCbpTeiHCfsDur3wyi0iDg@mail.gmail.com>
 <CAA68w_k5+z=_eC+eFH8c+BeBmfd-euRjEWT6KdkgrXN4YvrQhQ@mail.gmail.com>
 <CAA68w_mLWADmnX-H9JMwt7S3d4TCpwz4buFgbpP0MxG6a8z8KA@mail.gmail.com>
 <560C25B4.5050000@egenix.com>
 <147B2DEC-310B-498A-B468-03F0053F55B7@yahoo.com>
 <560C2D52.3080809@egenix.com>
 <08E63279-5DFF-4F7A-9B7B-B927D34BC4FA@yahoo.com>
 <560C3C45.1070107@egenix.com>
Message-ID: <9E2C1054-551B-41B1-A0D1-2E54A1FA8BF3@yahoo.com>

On Sep 30, 2015, at 12:47, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> On 30.09.2015 21:19, Andrew Barnert wrote:
>>> On Sep 30, 2015, at 11:43, M.-A. Lemburg <mal at egenix.com> wrote:
>>> 
>>>>> On 30.09.2015 20:26, Andrew Barnert via Python-ideas wrote:
>>>>>> On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
>>>>>> 
>>>>>> On 30.09.2015 19:19, Neil Girdhar wrote:
>>>>>> I guess, I'm just asking for enumerate to go through the same change that
>>>>>> range went through.  Why wasn't it a problem for range?
>>>>> 
>>>>> range() returns a list in Python 2 and a generator in Python 3.
>>>> 
>>>> No it doesn't. It returns a (lazy) sequence. Not a generator, or any other kind of iterator.
>>> 
>>> You are right that it's not of a generator type
>>> and more like a lazy sequence. To be exact, it returns
>>> a range object and does implement the iter protocol via
>>> a range_iterator object.
>> 
>> To be exact, it returns an object which returns True for isinstance(r, Sequence), which offers correct implementations of the entire sequence protocol. In other words, it's not "more like a lazy sequence", it's _exactly_ a lazy sequence.
>> 
>> In 2.3-2.5, xrange was a lazy "sequence-like object", and the docs explained how it didn't have all the methods of a sequence but otherwise was like one. When the collections ABCs were added, xrange (2.x)/range (3.x) started claiming to be a sequence, but the implementation was incomplete, so it was defective. This was fixed in 3.2 (which also made all of the sequence methods efficient; e.g., a range that fits into C longs can test an int for __contains__ in constant time).
>> 
>>>> I don't know why so many people seem to believe it returns a generator. (And, when you point out what it returns, most of them say, "Why was that changed from 2.x xrange, which returned a generator?" but xrange never returned a generator either--it returned a lazy almost-a-sequence from the start.)
>>> 
>>> Perhaps because it behaves like one ? :-)
>>> 
>>> Unlike an iterator, it doesn't iterate over a sequence, but instead
>>> generates the values on the fly.
>> 
>> You're confusing things even worse here.
> 
> I guess I used the wrong level of detail. I was trying
> to explain things in terms of concepts, not object types,
> isinstance() and ABCs.

But you're conflating the concept of "lazy" with the concept of "iterator". While generators, and iterators in general, are always technically lazy and nearly-always practically lazy, lazy things are not always iterators. Range, dict views, memoryview/buffer objects, NumPy slices, third-party lazy-list types, etc. are not generators, nor are they like generators in any way, except for being lazy. They're lazy sequences (well, except for the ones that aren't sequences, but they're still lazy containers, or lazy non-iterator iterables if you want to stick to terms in the glossary).
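
To make the distinction concrete, here is a small sketch using only the stdlib ABCs (spelled collections.abc here; the names r, keys, small and it are just illustrative). It checks that range and a dict view are lazy without being iterators:

```python
from collections.abc import Iterable, Iterator, Sequence

r = range(10**9)                  # lazy: no billion-element list is built
keys = {"a": 1, "b": 2}.keys()    # a dict view

# range is a genuine sequence, not an iterator
assert isinstance(r, Sequence)
assert not isinstance(r, Iterator)
assert r[10**8] == 10**8 and len(r) == 10**9

# a dict view is a lazy non-iterator iterable, but not a sequence
assert isinstance(keys, Iterable)
assert not isinstance(keys, Iterator)
assert not isinstance(keys, Sequence)

# a non-iterator iterable can be walked repeatedly; an iterator is one-shot
small = range(3)
assert list(small) == list(small) == [0, 1, 2]
it = iter(small)
assert list(it) == [0, 1, 2] and list(it) == []
```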

And I think experienced developers conflating the two orthogonal concepts is part of what leads to novices getting confused. They think that if they want laziness, they need a generator. That leaves them unable to even form the notion of a view/lazy container/virtual container, even when that is exactly what they want.

And it makes it hard to discuss issues like this thread clearly.

(The fact that we don't have a term for "non-iterator iterable", and that experienced users and even the documentation sometimes use the term "sequence" for that, only makes things worse. For example, a dict_keys is not a sequence in any useful sense, but the glossary says it is, because there is no word for what it wants to say.)

> Back on the topic:
> 
> The way I understand the proposal is that Neil wants the
> above to return:
> 
> >>> isinstance(e, collections.Sequence)
> True
> >>> isinstance(e, collections.Iterator)
> False
> 
> iff isinstance(arg, collections.Sequence)

That's one way to give him what he wants.
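
A rough sketch of that first option, to show what it would involve. SeqEnumerate and enumerate_like are hypothetical names made up for illustration, not anything in the stdlib, and slicing is left out to keep it short:

```python
from collections.abc import Iterator, Sequence

class SeqEnumerate(Sequence):
    """Hypothetical lazy, reusable view pairing indices with items."""

    def __init__(self, seq, start=0):
        self._seq = seq
        self._start = start

    def __len__(self):
        return len(self._seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is not supported in this sketch")
        if index < 0:
            index += len(self._seq)
        if not 0 <= index < len(self._seq):
            raise IndexError(index)
        # The pair is built on demand; nothing is stored.
        return (self._start + index, self._seq[index])

def enumerate_like(arg, start=0):
    """Return a sequence view for sequences, a plain iterator otherwise."""
    if isinstance(arg, Sequence):
        return SeqEnumerate(arg, start)
    return enumerate(arg, start)

e = enumerate_like("abc")
assert isinstance(e, Sequence) and not isinstance(e, Iterator)
assert e[1] == (1, "b") and e[-1] == (2, "c")
assert list(e) == list(e) == [(0, "a"), (1, "b"), (2, "c")]

# With a non-sequence argument the result is still a one-shot iterator.
assert isinstance(enumerate_like(iter("abc")), Iterator)
```

Note that the return type now depends on the argument type, which is exactly the property under discussion.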

But another option would be to always return a lazy sequence--the same kind you'd get if you picked one of the LazyList classes off PyPI (which provide a sequence interface by iterating and caching an iterable), and just wrote "e = LazyList(enumerate(arg))". This is still only creating the values on demand, and only consuming the iterator (if that's what it's given) as needed. (Of course it does mean you can now demand multiple values at once from that iterator, e.g., by calling e[10] or len(e) when arg was an iterator.)
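
For anyone who hasn't used one, a minimal sketch of the caching idea follows. This LazyList is a hypothetical stand-in written for this message, not the API of any particular PyPI package, and it omits slicing:

```python
from collections.abc import Sequence

class LazyList(Sequence):
    """Hypothetical sequence interface over an iterable, caching as it goes."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = []

    def _fill(self, n=None):
        """Pull items until the cache holds n of them (everything if n is None)."""
        while n is None or len(self._cache) < n:
            try:
                self._cache.append(next(self._it))
            except StopIteration:
                break

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is not supported in this sketch")
        # A negative index needs the full length, so it forces everything.
        self._fill(None if index < 0 else index + 1)
        return self._cache[index]

    def __len__(self):
        self._fill()          # len() has to consume the whole iterator
        return len(self._cache)

pulled = []                   # records what has been taken from the source

def source():
    for ch in "abcde":
        pulled.append(ch)
        yield ch

e = LazyList(enumerate(source()))
assert e[1] == (1, "b")
assert pulled == ["a", "b"]   # only as much as was demanded
assert len(e) == 5            # demands the rest at once
assert list(e) == list(e)     # reusable, unlike the underlying iterator
```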

Or you could be even cleverer: enumerate always returns a lazy sequence, which uses random access if given a sequence, cached iteration if given any other iterable. That gives you the best of both worlds, right?

Either of these avoids the problem that the type of enumerate depends on the type of its input, and the more serious problem that you can't tell from inspection whether what it returns is reusable or one-shot, but of course they introduce other problems.

I don't think any of the three is worth doing. The three most consistent ways of doing this, if you were designing a language from scratch, seem to be:

1. Python: Always return an iterator; if people want sequence behavior (with whatever variety of laziness they desire), they can wrap it.

2. Haskell: Make everything in the language as lazy as possible, so you can just always return a list, and it will automatically be as lazy as possible.

3. Swift: Merge indexing and iteration, and bake in views as a fundamental concept, so you can always return a view, but whether its indices are random-access or not depends on whether its input's indices are.

I'm not sure that #1 is the best of the three, but it is exactly what Python already has, and the other two would be very hard to get to from here, so I think #1 is the best for Python 3.6 (or 4.0).

(The blog post I referenced earlier in the thread explores whether we could get to #3, or get part-way there, from here; if you don't agree that it would be harder than is worth doing, please read it and point out where I went wrong. Because that could be pretty cool.)

From abarnert at yahoo.com  Wed Sep 30 23:40:15 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 30 Sep 2015 14:40:15 -0700
Subject: [Python-ideas] PEP 505 (None coalescing operators) thoughts
In-Reply-To: <CAJ1Wxn1h7GR4V0_D06dtfoSGwEqPMm2usLXyUSgwBr7Vvo80qw@mail.gmail.com>
References: <CAF7AXFE4M=1Gv_RONgH1Y9qoR0nw4JkSgan9GCboMZu-CfcW9g@mail.gmail.com>
 <CAP7+vJLF3FzwpAvVXRYudNmCMyvjh_8nQrJQrwB8zYWQ_Npf3A@mail.gmail.com>
 <56097AFB.1040906@oddbird.net>
 <CAP7+vJJdvUy-qaLFi2m3==XCDaZEGtqOGgCP7RVegMbk-nsH_w@mail.gmail.com>
 <5609985C.40603@oddbird.net>
 <CAP7+vJKKBS4NudBDQ7Z34Dv4x33m09y3T6R3HL5J3O98i+qrMw@mail.gmail.com>
 <56099C6F.90700@oddbird.net> <36AB4531-96BD-4D22-A957-B2199BA7912E@stufft.io>
 <CAP7+vJJFciihZwtFoEpvpfFEHGUoo0oL_nLMGJepor972VFRvg@mail.gmail.com>
 <85oagm2saa.fsf@benfinney.id.au>
 <CAP7+vJKiyen75dmUEqRa29EfkZkWmwy-P3qYDcB_UdmuWG30Jw@mail.gmail.com>
 <5609AB62.5040503@oddbird.net> <20150929133542.4d04f6dd@anarchist.wooz.org>
 <CAF7AXFGPB1CXdHLce5T-1OOPieJduVgOKPk4t9HMBvyA1QgAKg@mail.gmail.com>
 <muetrf$lh$1@ger.gmane.org> <560B1E49.7050102@canterbury.ac.nz>
 <CAA0BC24-7591-4CCF-ADA4-E6D547E1715F@yahoo.com>
 <CAP7h-xbA7vQX9D_roXpWTYfH-2gXjeADEvpiSVuV3apk2UXabA@mail.gmail.com>
 <CAF7AXFEX6aYzjSvjTFWhe4_VLc43To1iyFQaDbcSw6U3gcveZg@mail.gmail.com>
 <560C12AD.90305@trueblade.com>
 <CAJ1Wxn1h7GR4V0_D06dtfoSGwEqPMm2usLXyUSgwBr7Vvo80qw@mail.gmail.com>
Message-ID: <653DC9A1-3C9B-4E66-AA97-4CA24F21F9E9@yahoo.com>

On Sep 30, 2015, at 13:58, Piotr Duda <duda.piotr at gmail.com> wrote:
> 
> What about something like:
> z = x if is not None else []

For something as simple as "x", this doesn't seem much better than what we can already do:

    z = x if x is not None else []

For a complex expression that might be incorrect/expensive/dangerous to call multiple times, it might be useful, but I think it would read a lot better with an explicit pronoun:

    z = dangerous_thing(arg) if it is not None else []

In natural languages, "it" is already complex enough; adding in subject elision makes parsing even harder. I think the same would be true here.
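
For comparison, the single-evaluation behavior is already available today with an explicit temporary name. In this sketch, dangerous_thing is a made-up stand-in that records its calls, and "it" is just an ordinary variable:

```python
calls = []

def dangerous_thing(arg):
    """Hypothetical stand-in for an expensive or side-effecting call."""
    calls.append(arg)
    return None if arg < 0 else [arg]

it = dangerous_thing(-1)            # evaluated exactly once
z = it if it is not None else []

assert z == []
assert calls == [-1]
```

It costs an extra line, but there is no question about what "it" refers to.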

Also, explicit "it" would be usable in other situations:

    z = dangerous_thing(arg) if it.value() > 3 else DummyValue(3)

And it gives you something to look up in the docs: help(it) can tell me how to figure out what "it" refers to, but how would I find that out with your version?

Anyway, I still don't like it even with the explicit pronoun, but maybe that's just AppleScript flashbacks. :)