Python-ideas
January 2017
- 85 participants
- 44 discussions
Hmm, Thanks Chris. I thought I was posting this to the correct place.
I've never seen that "for line in open ..." after googling it many
times! Why is this question so often asked then?
Re:Indentation making end block markers not needed; well yes they aren't
/needed/. However, they are useful for readability purposes. Perhaps if
I use it some more I'll see that they aren't but I doubt it.
Re:PEP249 & SQL, I thought I was proposing something like that but it
can't be tacked on later I don't think - needs to be an innate part of
Python to work as cleanly as 4gl languages. Re: your named tuple
suggestion, wouldn't that mean that the naming is divorced from the
result column names - that is part of what shouldn't be.
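
On the named tuple point: one way to keep the field names tied to the result columns is to build the tuple type from the cursor's own column metadata. A minimal sketch, assuming a PEP 249 driver whose cursor.description carries the column names. sqlite3 is used only so the example is self-contained; fetch_named and the sample table contents are made up for illustration.

```python
import sqlite3
from collections import namedtuple


def fetch_named(cursor):
    """Yield rows as namedtuples whose fields come from cursor.description."""
    # Each description entry is a 7-item sequence; the first item is the column name.
    Row = namedtuple("Row", [col[0] for col in cursor.description])
    for row in cursor:
        yield Row(*row)


# Hypothetical data, in-memory only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table identities (id integer, forename text, surname text, dob text)"
)
conn.execute("insert into identities values (1, 'John', 'JONES', '1970-01-01')")

cur = conn.execute(
    "select id, forename, surname, dob from identities where surname = ?",
    ("JONES",),
)
people = list(fetch_named(cur))
for person in people:
    print(person.forename, person.dob)
```

Renaming a column in the select (for example with "as") would rename the attribute too, so the names are not maintained separately in Python code.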
Re:Everything being true or false. I don't see the value of that. Only
boolean data should be valid in boolean contexts. I don't really see how
that can be argued.
On 09/01/17 21:31, python-ideas-request(a)python.org wrote:
> Today's Topics:
>
> 1. Re: PEP 540: Add a new UTF-8 mode (INADA Naoki)
> 2. Python Reviewed (Simon Lovell)
> 3. Re: Python Reviewed (Chris Angelico)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 9 Jan 2017 11:21:41 +0900
> From: INADA Naoki <songofacandy(a)gmail.com>
> To: "Stephen J. Turnbull" <turnbull.stephen.fw(a)u.tsukuba.ac.jp>
> Cc: Victor Stinner <victor.stinner(a)gmail.com>, python-ideas
> <python-ideas(a)python.org>
> Subject: Re: [Python-ideas] PEP 540: Add a new UTF-8 mode
> Message-ID:
> <CAEfz+TwaVHaKnyquXuUqBMk=AyUcwDMgyA8efuhU9=oMhaGZnQ(a)mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Sun, Jan 8, 2017 at 1:47 AM, Stephen J. Turnbull
> <turnbull.stephen.fw(a)u.tsukuba.ac.jp> wrote:
>> INADA Naoki writes:
>>
>> > I want UTF-8 mode is enabled by default (opt-out option) even if
>> > locale is not POSIX,
>> > like `PYTHONLEGACYWINDOWSFSENCODING`.
>> >
>> > Users depends on locale know what locale is and how to configure it.
>> > They can understand difference between locale mode and UTF-8 mode
>> > and they can opt-out UTF-8 mode.
>> > But many people lives in "UTF-8 everywhere" world, and don't know
>> > about locale.
>>
>> I find all this very strange from someone with what looks like a
>> Japanese name. I see mojibake and non-Unicode encodings around me all
>> the time. Caveat: I teach at a University that prides itself on being
>> the most international of Japanese national universities, so in my
>> daily work I see Japanese in 4 different encodings (5 if you count the
>> UTF-16 used internally by MS Office), Chinese in 3 different (claimed)
>> encodings, and occasionally Russian in at least two encodings, ...,
>> uh, I could go on but won't. In any case, the biggest problems are
>> legacy email programs and busted websites in Japanese, plus email that
>> is labeled "GB2312" but actually conforms to GBK (and this is a reply
>> in Japanese to a Chinese applicant writing in Japanese encoded as GBK).
> Since I work on tech company, and use Linux for most only "server-side" program,
> I don't live such a situation.
>
> But when I see non UTF-8 text, I don't change locale to read such text.
> (Actually speaking, locale doesn't solve mojibake because it doesn't change
> my terminal emulator's encoding).
> And I don't change my terminal emulator setting only for read such a text.
> What I do is convert it to UTF-8 through command like `view
> text-from-windows.txt ++enc=cp932`
>
> So there are no problem when Python always use UTF-8 for fsencoding
> and stdio encoding.
>
>> I agree that people around me mostly know only two encodings: "works
>> for me" and "mojibake", but they also use locales configured for them
>> by technical staff. On top of that, international students (the most
>> likely victims of "UTF-8 by default" because students are the biggest
>> Python users) typically have non-Japanese locales set on their
>> imported computers.
> Hmm, Which OS do they use? There are no problem in macOS and Windows.
> Do they use Linux with locale with encoding other than UTF-8, and
> their terminal emulator
> uses non-UTF-8 encoding?
>
> As my feeling, UTF-8 start dominating from about 10 years ago, and
> ja_JP.EUC_JP (it was most common locale for Japanese before UTF-8) is
> complete legacy.
>
> There is only one machine (which is in LAN, lives from 10+ years ago,
> /usr/bin/python is Python 1.5!),
> I can ssh which has ja_JP.eucjp locale.
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 9 Jan 2017 19:25:45 +0800
> From: Simon Lovell <simon58500(a)bigpond.com>
> To: python-ideas(a)python.org
> Subject: [Python-ideas] Python Reviewed
> Message-ID: <69e3c5d4-d64b-063e-758e-2b0ac1720daa(a)bigpond.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Python Reviewed
>
> Having used a lot of languages a little bit and not finding satisfactory
> answers to these in some cases often asked questions, I thought I'd join
> this group to make a post on the virtues and otherwise of python.
>
> The Good:
> Syntactically significant new lines
> Syntactically significant indenting
> Different types of array like structures for different situations
> Mostly simple and clear structures
> Avoiding implicit structures like C++ references which add only
> negative value
> Avoiding overly complicated chaining expressions like
> "while(*d++=*s++);"
> Single syntax for block statements (well, sort of. I'm ignoring
> lines like "if a=b: c=d")
> Lack of a with statement which only obscures the code
>
>
> The Bad:
> Colons at the end of if/while/for blocks. Most of the arguments in
> favour of this decision boil down to PEP 20.2 "Explicit is better than
> implicit". Well, no. if/while/for blocks are already explicit. Adding
> the colon makes it doubly explicit and therefore redundant. There is no
> reason I can see why this colon can't be made optional except for
> possibly PEP20.13 "There should be one-- and preferably only one
> --obvious way to do it". I don't agree that point is sufficient to
> require colons.
>
>
> No end required for if/while/for blocks. This is particularly a
> problem when placing code into text without fixed width fonts. It also
> is a potential problem with tab expansion tricking the programmer. This
> could be done similarly to requiring declarations in Fortran, which if
> "implicit none" was added to the top of the program, declarations are
> required. So add a "Block Close Mandatory" (or similar) keyword to
> enforce this. In practice there is usually a blank line placed at the
> end of blocks to try to signal this to someone reading the code. Makes
> the code less readable and I would refer to PEP20.7 "Readability counts"
>
>
> This code block doesn't compile, even given that function "process"
> takes one string parameter:
> f=open(file)
> endwhile=""
> while (line=f.readline())!=None:
> process(line)
> endwhile
>
> I note that many solutions have been proposed to this. In C, it
> is the ability to write "while(line=fgets(f))" instead of
> "while((line=fgets(f))!=NULL)" which causes the confusion. No solutions
> have been accepted to the current method which is tacky:
> f=open(file)
> endwhile=""
> endif=""
> while True:
> line=f.readline
> if line = None:
> break
> endif
> process(line)
> endwhile
>
>
> Inadequacy of PEP249 - Python Database Specification. This only
> supports dynamic SQL but SQL and particularly select statements should
> be easier to work with in the normal cases where you don't need such
> statements. e.g:
> endselect=""
> idList = select from identities where surname = 'JONES':
> idVar = id
> forenameVar = forename
> surnameVar = surname
> dobVar = dob
> endselect
>
> endfor=""
> for id in idList:
> print id.forenameVar, id.dobVar
> endfor
>
> as opposed to what is presently required in the select case
> which is:
> curs = connection.cursor()
> curs.execute("select id, forename, surname, dob from
> identities where surname = 'JONES'")
> idList=curs.fetchall()
>
> endfor=""
> for id in idList:
> print id[1], id[3]
> endfor
>
> I think the improvement in readability for the first option
> should be plain to all even in the extremely simple case I've shown.
>
> This is the sort of thing which should be possible in any
> language which works with a database but somehow the IT industry has
> lost it in the 1990s/2000s. Similarly an upgraded syntax for the
> insert/values statement which the SQL standard has mis-specified to make
> the value being inserted too far away from the column name. Should be
> more like:
> endinsert=""
> Insert into identities:
> id = 1
> forename = 'John'
> surname = 'Smith'
> dob = '01-Jan-1970'
> endinsert
>
> One of the major problems with the status quo is the lack of
> named result columns. The other is that the programmer is required to
> convert the where clause into a string. The functionality of dynamic
> where/from clauses can still be provided without needing to rely on
> numbered result columns like so:
> endselect=""
> idList = select from identities where :where_clause:
> id = id
> forename = forename
> surname = surname
> dob = dob
> endselect
>
> Ideally, the bit after the equals sign would support all
> syntaxes allowed by the host database server which probably means it
> needs to be free text passed to the server. Where a string variable
> should be passed, the :variable syntax could be supported but this is
> not often required
>
>
> Variables never set to anything do not error until they are used,
> at least in implementations of Python 2 I have tried. e.g.
> UnlikelyCondition = False
> endif=""
> if UnlikelyCondition:
> print x
> endif
>
> The above code runs fine until UnlikelyCondition is set to True
>
>
> No do-while construct
>
>
> else keyword at the end of while loops is not obvious to those not
> familiar with it. Something more like whenFalse would be clearer
>
>
> Changing print from a statement to a function in Python 3 adds no
> positive value that I can see
>
>
> Upper delimiters being exclusive while lower delimiters are
> inclusive. This is very counter intuitive. e.g. range(1,4) returns
> [1,2,3]. Better to have the default base as one rather than zero IMO. Of
> course, the programmer should always be able to define the lower bound.
> This cannot be changed, of course.
>
>
> Lack of a single character in a method to refer to an attribute
> instead of a local variable, similar to C's "*" for dereferencing a pointer
>
>
> Inability to make simple chained assignments e.g. "a = b = 0"
>
>
> Conditional expression (<true-value> if <condition> else
> <false-value>) in Python is less intuitive than in C (<condition> ?
> <true-value> : <false-value>). Ref PEP308. Why BDFL chose the syntax he
> did is not at all clear.
>
>
> The Ugly:
> Persisting with the crapulence from C where a non zero integer is
> true and zero is false - only ever done because C lacked a boolean data
> type. This is a flagrant violation of PEP 20.2 "Explicit is better than
> implicit" and should be removed without providing backwards compatibility.
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 10 Jan 2017 00:31:42 +1100
> From: Chris Angelico <rosuav(a)gmail.com>
> To: python-ideas <python-ideas(a)python.org>
> Subject: Re: [Python-ideas] Python Reviewed
> Message-ID:
> <CAPTjJmrTfYup+o7BYwnYB=yCbLof3Csq0CcNU36ke8J5ynb5QA(a)mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Mon, Jan 9, 2017 at 10:25 PM, Simon Lovell <simon58500(a)bigpond.com> wrote:
>> Python Reviewed
>>
>> Having used a lot of languages a little bit and not finding satisfactory
>> answers to these in some cases often asked questions, I thought I'd join
>> this group to make a post on the virtues and otherwise of python.
> I think this thread belongs on python-list(a)python.org, where you'll
> find plenty of people happy to discuss why Python is and/or shouldn't
> be the way it is.
>
> A couple of responses to just a couple of your points.
>
>> The Good:
>> Syntactically significant new lines
>> Syntactically significant indenting
>> The Bad:
>> No end required for if/while/for blocks. This is particularly a problem
>> when placing code into text without fixed width fonts. It also is a
>> potential problem with tab expansion tricking the programmer.
> If indentation and line endings are significant, you shouldn't need
> end markers. They don't buy you anything. In any case, I've never
> missed them; in fact, Python code follows the "header and lines"
> concept that I've worked with in many, MANY data files for decades
> (think of the sectioned config file format, for example).
>
>> This code block doesn't compile, even given that function "process"
>> takes one string parameter:
>> f=open(file)
>> endwhile=""
>> while (line=f.readline())!=None:
>> process(line)
>> endwhile
>>
>> I note that many solutions have been proposed to this. In C, it is
>> the ability to write "while(line=fgets(f))" instead of
>> "while((line=fgets(f))!=NULL)" which causes the confusion. No solutions have
>> been accepted to the current method which is tacky:
>> f=open(file)
>> endwhile=""
>> endif=""
>> while True:
>> line=f.readline
>> if line = None:
>> break
>> endif
>> process(line)
>> endwhile
> Here's a better way:
>
> for line in open(file):
> process(line)
>
> If you translate C code to Python, sure, it'll sometimes come out even
> uglier than the C original. But there's often a Pythonic way to write
> things.
>
>> Inadequacy of PEP249 - Python Database Specification. This only supports
>> dynamic SQL but SQL and particularly select statements should be easier to
>> work with in the normal cases where you don't need such statements. e.g:
>> endselect=""
>> idList = select from identities where surname = 'JONES':
>> idVar = id
>> forenameVar = forename
>> surnameVar = surname
>> dobVar = dob
>> endselect
>>
>> endfor=""
>> for id in idList:
>> print id.forenameVar, id.dobVar
>> endfor
> You're welcome to propose something like this. I suspect you could
> build an SQL engine that uses a class to create those bindings -
> something like:
>
> class people(identities):
> id, forename, surname, dob
> where="surname = 'JONES'"
>
> for person in people:
> print(person.forename, person.dob)
>
> Side point: "forename" and "surname" are inadvisable fields.
> http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-na…
>
>> One of the major problems with the status quo is the lack of named
>> result columns. The other is that the programmer is required to convert the
>> where clause into a string. The functionality of dynamic where/from clauses
>> can still be provided without needing to rely on numbered result columns
>> like so:
>> endselect=""
>> idList = select from identities where :where_clause:
>> id = id
>> forename = forename
>> surname = surname
>> dob = dob
>> endselect
> That's easy enough to do with a namedtuple.
>
>> Variables never set to anything do not error until they are used, at
>> least in implementations of Python 2 I have tried. e.g.
>> UnlikelyCondition = False
>> endif=""
>> if UnlikelyCondition:
>> print x
>> endif
>>
>> The above code runs fine until UnlikelyCondition is set to True
> That's because globals and builtins could be created dynamically. It's
> a consequence of not having variable declarations. You'll find a lot
> of editors/linters will flag this, though.
>
>> Changing print from a statement to a function in Python 3 adds no
>> positive value that I can see
> Adds heaps of positive value to a lot of people. You simply haven't
> met the situations where it's better. It's sufficiently better that I
> often use __future__ to pull it in even in 2.7-only projects.
>
>> Lack of a single character in a method to refer to an attribute instead
>> of a local variable, similar to C's "*" for dereferencing a pointer
> Ehh. "self." isn't that long. Python isn't AWK.
>
>> Inability to make simple chained assignments e.g. "a = b = 0"
> Really? Works fine. You can chain assignment like that.
>
>> Conditional expression (<true-value> if <condition> else <false-value>)
>> in Python is less intuitive than in C (<condition> ? <true-value> :
>> <false-value>). Ref PEP308. Why BDFL chose the syntax he did is not at all
>> clear.
> I agree with you on this one - specifically, because the order of
> evaluation is "middle then outside", instead of left-to-right.
>
>> The Ugly:
>> Persisting with the crapulence from C where a non zero integer is true
>> and zero is false - only ever done because C lacked a boolean data type.
>> This is a flagrant violation of PEP 20.2 "Explicit is better than implicit"
>> and should be removed without providing backwards compatibility.
> In Python, *everything* is either true or false. Anything that
> represents "something" is true, and anything that represents "nothing"
> is false. An empty list is false, but a list with items in it is true.
> This is incredibly helpful and most definitely not ugly; Python is not
> REXX.
>
> ChrisA
>
>
> ------------------------------
>
> End of Python-ideas Digest, Vol 122, Issue 20
> *********************************************
Python Reviewed
Having used a lot of languages a little bit and not finding satisfactory
answers to these in some cases often asked questions, I thought I'd join
this group to make a post on the virtues and otherwise of python.
The Good:
Syntactically significant new lines
Syntactically significant indenting
Different types of array like structures for different situations
Mostly simple and clear structures
Avoiding implicit structures like C++ references which add only
negative value
Avoiding overly complicated chaining expressions like
"while(*d++=*s++);"
Single syntax for block statements (well, sort of. I'm ignoring
lines like "if a=b: c=d")
Lack of a with statement which only obscures the code
The Bad:
Colons at the end of if/while/for blocks. Most of the arguments in
favour of this decision boil down to PEP 20.2 "Explicit is better than
implicit". Well, no. if/while/for blocks are already explicit. Adding
the colon makes it doubly explicit and therefore redundant. There is no
reason I can see why this colon can't be made optional except for
possibly PEP20.13 "There should be one-- and preferably only one
--obvious way to do it". I don't agree that point is sufficient to
require colons.
No end required for if/while/for blocks. This is particularly a
problem when placing code into text without fixed width fonts. It also
is a potential problem with tab expansion tricking the programmer. This
could be done similarly to requiring declarations in Fortran, which if
"implicit none" was added to the top of the program, declarations are
required. So add a "Block Close Mandatory" (or similar) keyword to
enforce this. In practice there is usually a blank line placed at the
end of blocks to try to signal this to someone reading the code. Makes
the code less readable and I would refer to PEP20.7 "Readability counts"
This code block doesn't compile, even given that function "process"
takes one string parameter:
    f=open(file)
    endwhile=""
    while (line=f.readline())!=None:
        process(line)
    endwhile
I note that many solutions have been proposed to this. In C, it
is the ability to write "while(line=fgets(f))" instead of
"while((line=fgets(f))!=NULL)" which causes the confusion. No solutions
have been accepted to the current method which is tacky:
    f=open(file)
    endwhile=""
    endif=""
    while True:
        line=f.readline
        if line = None:
            break
        endif
        process(line)
    endwhile
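
For reference, a runnable form of the loop above next to two shorter spellings. This is a sketch and not part of the original post: io.StringIO stands in for a real file, the function names are invented for the example, and the := form assumes Python 3.8 or later.

```python
import io


def read_with_while(f):
    """Explicit loop: readline() must be called, and it returns "" at EOF, not None."""
    lines = []
    while True:
        line = f.readline()
        if line == "":
            break
        lines.append(line)
    return lines


def read_with_for(f):
    """File objects are iterable line by line, so a plain for loop is enough."""
    lines = []
    for line in f:
        lines.append(line)
    return lines


def read_with_walrus(f):
    """Assignment expression in the loop condition (Python 3.8+)."""
    lines = []
    while (line := f.readline()) != "":
        lines.append(line)
    return lines


text = "first\nsecond\n"
result_while = read_with_while(io.StringIO(text))
result_for = read_with_for(io.StringIO(text))
result_walrus = read_with_walrus(io.StringIO(text))
```

All three collect the same lines; the for loop is the form usually suggested for this case.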
Inadequacy of PEP249 - Python Database Specification. This only
supports dynamic SQL but SQL and particularly select statements should
be easier to work with in the normal cases where you don't need such
statements. e.g:
    endselect=""
    idList = select from identities where surname = 'JONES':
        idVar = id
        forenameVar = forename
        surnameVar = surname
        dobVar = dob
    endselect
    endfor=""
    for id in idList:
        print id.forenameVar, id.dobVar
    endfor
as opposed to what is presently required in the select case
which is:
    curs = connection.cursor()
    curs.execute("select id, forename, surname, dob from identities where surname = 'JONES'")
    idList=curs.fetchall()
    endfor=""
    for id in idList:
        print id[1], id[3]
    endfor
I think the improvement in readability for the first option
should be plain to all even in the extremely simple case I've shown.
This is the sort of thing which should be possible in any
language which works with a database but somehow the IT industry has
lost it in the 1990s/2000s. Similarly an upgraded syntax for the
insert/values statement which the SQL standard has mis-specified to make
the value being inserted too far away from the column name. Should be
more like:
    endinsert=""
    Insert into identities:
        id = 1
        forename = 'John'
        surname = 'Smith'
        dob = '01-Jan-1970'
    endinsert
One of the major problems with the status quo is the lack of
named result columns. The other is that the programmer is required to
convert the where clause into a string. The functionality of dynamic
where/from clauses can still be provided without needing to rely on
numbered result columns like so:
    endselect=""
    idList = select from identities where :where_clause:
        id = id
        forename = forename
        surname = surname
        dob = dob
    endselect
Ideally, the bit after the equals sign would support all
syntaxes allowed by the host database server which probably means it
needs to be free text passed to the server. Where a string variable
should be passed, the :variable syntax could be supported but this is
not often required
Variables never set to anything do not error until they are used,
at least in implementations of Python 2 I have tried. e.g.
    UnlikelyCondition = False
    endif=""
    if UnlikelyCondition:
        print x
    endif
The above code runs fine until UnlikelyCondition is set to True
No do-while construct
else keyword at the end of while loops is not obvious to those not
familiar with it. Something more like whenFalse would be clearer
Changing print from a statement to a function in Python 3 adds no
positive value that I can see
Upper delimiters being exclusive while lower delimiters are
inclusive. This is very counter intuitive. e.g. range(1,4) returns
[1,2,3]. Better to have the default base as one rather than zero IMO. Of
course, the programmer should always be able to define the lower bound.
This cannot be changed, of course.
Lack of a single character in a method to refer to an attribute
instead of a local variable, similar to C's "*" for dereferencing a pointer
Inability to make simple chained assignments e.g. "a = b = 0"
Conditional expression (<true-value> if <condition> else
<false-value>) in Python is less intuitive than in C (<condition> ?
<true-value> : <false-value>). Ref PEP308. Why BDFL chose the syntax he
did is not at all clear.
The Ugly:
Persisting with the crapulence from C where a non zero integer is
true and zero is false - only ever done because C lacked a boolean data
type. This is a flagrant violation of PEP 20.2 "Explicit is better than
implicit" and should be removed without providing backwards compatibility.

Jan. 7, 2017
I have read the discussion and I'm convinced that we should use a structure,
Py_tss_t, instead of a platform-specific data type. Just as Steve said, Py_tss_t
should be treated as a genuinely opaque type, so the key state check
should be provided as a macro or inline function with a name like
PyThread_tss_is_created. Well, I'd like to pin down the specification a bit more :)
If PyThread_tss_create is called with an already created key, it is a no-op, but
should the function succeed or fail? In my opinion, it is better to
return a failure, because calling PyThread_tss_create multiple times for
one key most likely means the code is incorrect.
Under this proposal, PyThread_tss_is_created should return the following values:
(A) False from the point the key is defined with Py_tss_NEEDS_INIT until
PyThread_tss_create is called
(B) True after a call to PyThread_tss_create has succeeded
(C) Unchanged across a failed call to PyThread_tss_create
(D) False after calling PyThread_tss_delete, regardless of timing
(E) For other functions, the return value of PyThread_tss_is_created does
not change before and after the call
I think it would be good to write a test for the state of Py_tss_t.
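
Rules (A) to (E) read like a small state machine, which makes them easy to test. A toy model in Python, purely illustrative: it is not the C API, ToyTssKey and the platform_ok flag are invented here, and failing on a second create reflects the opinion above rather than any settled behaviour.

```python
class ToyTssKey:
    """Toy stand-in for Py_tss_t; starts in the 'needs init' state."""

    def __init__(self):
        self.created = False  # rule (A): not created until create succeeds


def tss_is_created(key):
    return key.created


def tss_create(key, platform_ok=True):
    """Return True on success, False on failure.

    platform_ok is a hypothetical switch that simulates the underlying
    platform call failing.
    """
    if key.created:
        return False  # proposed: creating an existing key fails, state untouched
    if not platform_ok:
        return False  # rule (C): a failed create leaves the state unchanged
    key.created = True  # rule (B)
    return True


def tss_delete(key):
    key.created = False  # rule (D)


key = ToyTssKey()
states = [tss_is_created(key)]                  # (A)
failed = tss_create(key, platform_ok=False)
states.append(tss_is_created(key))              # (C)
first = tss_create(key)
states.append(tss_is_created(key))              # (B)
second = tss_create(key)                        # second create on the same key
states.append(tss_is_created(key))
tss_delete(key)
states.append(tss_is_created(key))              # (D)
```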
Kind regards,
Masayuki
2016-12-31 2:38 GMT+09:00 Erik Bray <erik.m.bray(a)gmail.com>:
> On Fri, Dec 30, 2016 at 5:05 PM, Nick Coghlan <ncoghlan(a)gmail.com> wrote:
> > On 29 December 2016 at 22:12, Erik Bray <erik.m.bray(a)gmail.com> wrote:
> >>
> >> 1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the
> >> implementation--that the keys are integers starting from zero)
> >> 2) pthreads: Does not define an uninitialized default value for
> >> keys, for reasons described at [1] under "Non-Idempotent Data Key
> >> Creation". I understand their reasoning, though I can't claim to know
> >> specifically what they mean when they say that some implementations
> >> would require the mutual-exclusion to be performed on
> >> pthread_getspecific() as well. I don't know that it applies here.
> >
> >
> > That section is a little weird, as they describe two requests (one for a
> > known-NULL default value, the other for implicit synchronisation of key
> > creation to prevent race conditions), and only provide the justification
> for
> > rejecting one of them (the second one).
>
> Right, that is confusing to me as well. I'm guessing the reason for
> rejecting the first is in part a way to force us to recognize the
> second issue.
>
> > If I've understood correctly, the situation they're worried about there
> is
> > that pthread_key_create() has to be called at least once-per-process, but
> > must be called before *any* call to pthread_getspecific or
> > pthread_setspecific for a given key. If you do "implicit init" rather
> than
> > requiring the use of an explicit mechanism like pthread_once (or our own
> > Py_Initialize and module import locks), then you may take a small
> > performance hit as either *every* thread then has to call
> > pthread_key_create() to ensure the key exists before using it, or else
> > pthread_getspecific() and pthread_setspecific() have to become
> potentially
> > blocking calls. Neither of those is desirable, so it makes sense to leave
> > that part of the problem to the API client.
> >
> > In our case, we don't want the implicit synchronisation, we just want the
> > known-NULL default value so the "Is it already set?" check can be moved
> > inside the library function.
>
> Okay, we're on the same page here then. I just wanted to make sure
> there wasn't anything else I was missing in Python's case.
>
> >> 3) windows: The return value of TlsAlloc() is a DWORD (unsigned int)
> >> and [2] states that its value should be opaque.
> >>
> >> So in principle we can cover all cases with an opaque struct that
> >> contains, as its first member, an is_initialized flag. The tricky
> >> part is how to initialize the rest of the struct (containing the
> >> underlying implementation-specific key). For 1) and 3) it doesn't
> >> matter--it can just be zero. For 2) it's trickier because there's no
> >> defined constant value to initialize a pthread_key_t to.
> >>
> >> Per Nick's suggestion this can be worked around by relying on C99's
> >> initialization semantics. Per [3] section 6.7.8, clause 21:
> >>
> >> """
> >> If there are fewer initializers in a brace-enclosed list than there
> >> are elements or members of an aggregate, or fewer characters in a
> >> string literal used to initialize an array of known size than there
> >> are elements in the array, the remainder of the aggregate shall be
> >> initialized implicitly the same as objects that have static storage
> >> duration.
> >> """
> >>
> >> How objects with static storage are initialized is described in the
> >> previous page under clause 10, but in practice it boils down to what
> >> you would expect: Everything is initialized to zero, including nested
> >> structs and arrays.
> >>
> >> So as long as we can use this feature of C99 then I think that's the
> >> best approach.
> >
> >
> >
> > I checked PEP 7 to see exactly which features we've added to the
> approved C
> > dialect, and designated initialisers are already on the list:
> > https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
> >
> > So I believe that would allow the initializer to be declared as something
> > like:
> >
> > #define Py_tss_NEEDS_INIT {.is_initialized = false}
>
> Great! One could argue about whether or not the designated
> initializer syntax also incorporates omitted fields, but it would seem
> strange to insist that it doesn't.
>
> Have a happy new year,
>
> Erik
>
Suppose you have implemented an immutable Position type to represent
the state of a game played on an MxN board, where the board size can
grow quite large.
Or suppose you have implemented an immutable, ordered collection type.
For example, the collections-extended package provides a
frozensetlist[1]. One of my own packages provides a frozen, ordered
bidirectional mapping type.[2]
These types should be hashable so that they can be inserted into sets
and mappings. Because their contents are order-sensitive, they cannot
use the collections.abc.Set._hash() helper in their __hash__
implementations: it ignores ordering, so it would needlessly cause
hash collisions between objects that compare unequal only because they
hold the same items in a different order.
According to https://docs.python.org/3/reference/datamodel.html#object.__hash__:
"""
it is advised to mix together the hash values of the components of the
object that also play a part in comparison of objects by packing them
into a tuple and hashing the tuple. Example:
    def __hash__(self):
        return hash((self.name, self.nick, self.color))
"""
Applying this advice to the use cases above would require creating an
arbitrarily large tuple in memory before passing it to hash(), which
is then just thrown away. It would be preferable if there were a way
to pass multiple values to hash() in a streaming fashion, such that
the overall hash were computed incrementally, without building up a
large object in memory first.
Should there be better support for this use case? Perhaps hash() could
support an alternative signature, allowing it to accept a stream of
values whose combined hash would be computed incrementally in
*constant* space and linear time, e.g. "hash(items=iter(self))".
In the meantime, what is the best way to incrementally compute a good
hash value for such objects using built-in Python routines? As a
library author, I would prefer a routine with explicit support for
computing a hash incrementally, rather than having to worry about how
to correctly combine results from multiple calls to
hash(contained_item) in library code. (Simply XORing such results
together would not be order-sensitive, and so wouldn't work.) Such a
routine would allow libraries to focus on doing one thing well.[3,4,5]
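To make the XOR problem concrete, here is a small sketch. The helper
name xor_hash is made up for this example. It shows that a plain XOR
of the item hashes cannot tell orderings apart, and that repeated
items cancel out entirely.

```python
from functools import reduce
from operator import xor


def xor_hash(items):
    """Naive combiner: XOR the hashes of all items together."""
    return reduce(xor, (hash(i) for i in items), 0)


# XOR is commutative, so reordering the items changes nothing.
order_blind = xor_hash([1, 2, 3]) == xor_hash([3, 2, 1])  # True
# x ^ x == 0, so a pair of equal items hashes like an empty collection.
duplicates_cancel = xor_hash([7, 7]) == xor_hash([])  # True
```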
I know that hashlib provides algorithms that support incremental
hashing, but those use at least 128 bits. Since hash() throws out
anything beyond sys.hash_info.hash_bits (e.g. 64) bits, anything in
hashlib seems like overkill. Am I right in thinking that's the wrong
tool for the job?
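For comparison, here is a sketch of what the hashlib route could look
like, assuming Python 3.6 or later, where hashlib.blake2b accepts a
digest_size argument and so can emit exactly as many bytes as hash()
keeps. The function name blake2_hash is made up for this example, and
it is only an illustration of the approach, not a claim that it is the
right tool or fast enough in practice.

```python
import hashlib
import sys

# Number of bytes in a Python hash value (8 on a typical 64-bit build).
_NBYTES = sys.hash_info.width // 8


def blake2_hash(items, person=b""):
    """Hypothetical helper: feed each item's hash into BLAKE2b incrementally.

    `person` is BLAKE2's personalization string (at most 16 bytes for
    blake2b); here it stands in for a class-specific priming value.
    """
    h = hashlib.blake2b(digest_size=_NBYTES, person=person)
    for item in items:
        # hash() results are signed and fit in _NBYTES bytes.
        h.update(hash(item).to_bytes(_NBYTES, sys.byteorder, signed=True))
    return int.from_bytes(h.digest(), sys.byteorder, signed=True)
```

The digest is order-sensitive and is computed in constant space, but
every item costs a hash() call plus a bytes conversion, so it may well
be slower than hashing a tuple.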
On the other hand, would binascii.crc32 be suitable, at least for
32-bit systems? (And is there some 64-bit incremental hash algorithm
available for 64-bit systems? It seems Python has no support for crc64
built in.) For example:
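import binascii, struct

class FrozenOrderedCollection:
    def __hash__(self):
        # Computed lazily, then cached. A single leading underscore avoids
        # name mangling, so hasattr() looks up the same attribute name.
        if hasattr(self, '_hashval'):
            return self._hashval
        # Prime the hash with a class-specific value.
        hv = binascii.crc32(b'FrozenOrderedCollection')
        for i in self:
            # '@n' packs a native ssize_t, which covers the range of hash().
            hv = binascii.crc32(struct.pack('@n', hash(i)), hv)
        hv &= 0xffffffff
        self._hashval = hv
        return hv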
Note that this example illustrates two other common requirements of
these use cases:
(i) lazily computing the hash value on first use, and then caching it
for future use
(ii) priming the overall hash value with some class-specific initial
value, so that our hash differs from that of an instance of a
different collection type that contains the same items but compares
unequal. (On that note, should the documentation in
https://docs.python.org/3/reference/datamodel.html#object.__hash__
quoted above be updated to add this advice? The current advice to
"return hash((self.name, self.nick, self.color))" would cause a hash
collision with a tuple of the same values, even though the tuple
should presumably compare unequal with this object.)
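The collision described in (ii) is easy to reproduce. In the sketch
below, the Person and PrimedPerson classes and the sample field values
are hypothetical. Including type(self) in the hashed tuple is one
possible way to prime the hash with a class-specific value, not
something the documentation currently recommends.

```python
class Person:
    def __init__(self, name, nick, color):
        self.name, self.nick, self.color = name, nick, color

    def __hash__(self):
        # The pattern from the documentation.
        return hash((self.name, self.nick, self.color))


class PrimedPerson(Person):
    def __hash__(self):
        # Same components, primed with the class itself.
        return hash((type(self), self.name, self.nick, self.color))


fields = ("Ada", "al", "red")
# The unprimed hash is, by construction, the hash of the plain tuple.
collides = hash(Person(*fields)) == hash(fields)  # True
primed_hash = hash(PrimedPerson(*fields))
```

Such a collision does not break correctness, since sets and dicts
still fall back on equality, but it does cost extra comparisons
whenever both kinds of object end up in the same container.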
To summarize these questions:
1. Should hash() add support for incremental hashing?
2. In the meantime, what is the best way to compute a hash of a
combination of many values incrementally (in constant space and linear
time), using only what's available in the standard library? Ideally
there is some routine available that uses exactly hash_info.hash_bits
number of bits, and that does the combining of incremental results for
you.
3. Should the https://docs.python.org/3/reference/datamodel.html#object.__hash__
documentation be updated to include suitable advice for these use
cases, in particular, that the overall hash value should be computed
lazily, incrementally, and should be primed with a class-unique value?
Thanks in advance for a helpful discussion, and best wishes.
Josh
References:
[1] http://collections-extended.lenzm.net/api.html#collections_extended.frozens…
[2] https://bidict.readthedocs.io/en/dev/api.html#bidict.frozenorderedbidict
[3] http://stackoverflow.com/questions/2909106/python-whats-a-correct-and-good-…
[4] http://stackoverflow.com/a/2909572/161642
[5] http://stackoverflow.com/a/27952689/161642